PowerPC V10 Patches for -mcpu=future

* PowerPC V10 Patches for -mcpu=future
@ 2019-12-12  0:07 Michael Meissner
  2019-12-12  0:12 ` [PATCH] V10 patch #1, Use PLI to load up large DImode constants if -mcpu=future Michael Meissner
                   ` (11 more replies)
  0 siblings, 12 replies; 23+ messages in thread
From: Michael Meissner @ 2019-12-12  0:07 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, David Edelsohn, Michael Meissner

This set of patches is an attempt to address the issues raised in the previous
sets of patches:

    The V7 patches were for important functionality
    The V8 patches were for tests
    The V9 patches were for the PCREL_OPT support

As I write this there are 12 patches.  There will be more patches later to
address the remaining test suite patches.  I need to look at the comments for
PCREL_OPT in detail to see what the strategy should be for those patches.

Patches V10 #1-3 are the remaining issues from V7 #1-3 to add PADDI and PLI
support for large constants.  In theory, now that the reformatting done
previously has been checked in, these should be simple.

Patches V10 #4-7 break up patch V7 #6 (vector extract) into 4 separate patches.

Patch V10 #8 is patch V7 #7 (turn on -mpcrel by default on 64-bit Linux targets
for -mcpu=future), changing the names of the enabling macros.

Patch V10 #9 is patch V7 #5 that was redone.  This patch adds new effective
target options for PowerPC.  I have changed this patch to look at the code
generated by the compiler to see if prefixed addressing or PC-relative
addressing is used for -mcpu=future.  This patch needs patch V10 #8 installed
to enable the prefixed addressing and PC-relative tests.

In patch V10 #9, I did not modify the existing test
(check_effective_target_powerpc_future_ok).  As we discussed, this test should
really test whether a non-prefixed instruction is generated to allow for
targets that might support -mcpu=future but not enable prefixed addressing.
However, at present the only instructions being submitted are prefixed
instructions.  So this will have to wait until we get further down the road
with 'future' instructions.

Patch V10 #10 is a modification of patch V8 #1.  I renamed the files from
paddi-?.c to prefixed-*.c so that there isn't a false match due to the .ident
directive.

Patch V10 #11 is a slight reworking of patch V8 #2 (testing whether we generate
a prefixed instruction when the offset would be invalid for DS and DQ
instruction formats).
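
As a rough illustration of what that test exercises (this is not code from the
patch), the sketch below models the choice between a non-prefixed and a
prefixed encoding.  It assumes the usual rule that DS-form offsets must be a
multiple of 4 and DQ-form offsets a multiple of 16; the enum and function
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the traditional instruction formats.  */
enum mem_form { FORM_D, FORM_DS, FORM_DQ };

/* Hypothetical result of the selection.  */
enum encoding { ENC_NON_PREFIXED, ENC_PREFIXED, ENC_NEEDS_REGISTER };

/* Pick an encoding for a base register plus constant OFFSET.  */
enum encoding
choose_encoding (int64_t offset, enum mem_form form, bool have_prefixed)
{
  /* DS-form needs the low 2 bits clear, DQ-form the low 4 bits.  */
  int64_t align_mask = form == FORM_DS ? 3 : form == FORM_DQ ? 15 : 0;
  bool fits16 = offset >= -32768 && offset <= 32767;
  const int64_t limit34 = INT64_C (1) << 33;
  bool fits34 = offset >= -limit34 && offset < limit34;

  if (fits16 && (offset & align_mask) == 0)
    return ENC_NON_PREFIXED;

  /* A small but misaligned offset still needs the prefixed form.  */
  if (have_prefixed && fits34)
    return ENC_PREFIXED;

  /* Otherwise the offset has to be loaded into a register.  */
  return ENC_NEEDS_REGISTER;
}

/* Hypothetical self-check for the cases discussed above.  */
void
check_choose_encoding (void)
{
  assert (choose_encoding (4, FORM_DS, true) == ENC_NON_PREFIXED);
  assert (choose_encoding (2, FORM_DS, true) == ENC_PREFIXED);
  assert (choose_encoding (8, FORM_DQ, true) == ENC_PREFIXED);
  assert (choose_encoding (2, FORM_DS, false) == ENC_NEEDS_REGISTER);
}
```

The interesting case is an offset such as 2 for a DS-form instruction: it fits
in 16 bits but is misaligned, so only the prefixed form can encode it directly.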

Patch V10 #12 is a slight reworking of patch V8 #3 (making sure we don't try to
generate the nonexistent PLWZU and PSTWU pre-modify instructions).

There are 3 other patches from V8 that I will address at a later date.  Patch
V8 #4 contains the tests for using prefixed instructions for each of the types
when a large numeric offset is used.  Patch V8 #5 contains the tests for using
PC-relative load/store instructions for each of the types to reference static
values.  Patch V8 #6 is the test to make sure the -fstack-protector support
works when the stack frame is large and -mcpu=future is used.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V10 patch #1, Use PLI to load up large DImode constants if -mcpu=future
  2019-12-12  0:07 PowerPC V10 Patches for -mcpu=future Michael Meissner
@ 2019-12-12  0:12 ` Michael Meissner
  2019-12-17 15:58   ` Segher Boessenkool
  2019-12-12  0:15 ` [PATCH] V10 patch #2, use PLI to load up large SImode " Michael Meissner
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Michael Meissner @ 2019-12-12  0:12 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This patch adds an alternative to use PLI to load up large DImode constants if
-mcpu=future is used.
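
To make the ranges concrete, here is a minimal sketch (not the GCC code) of
which constants can be loaded with a single instruction.  The function names
are hypothetical; fits_signed_34bit only stands in for what
SIGNED_34BIT_OFFSET_P is assumed to test, and have_prefixed stands in for
TARGET_PREFIXED_ADDR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for SIGNED_34BIT_OFFSET_P: true if VALUE fits in
   the 34-bit signed immediate of a prefixed instruction.  */
bool
fits_signed_34bit (int64_t value)
{
  const int64_t limit = INT64_C (1) << 33;
  return value >= -limit && value < limit;
}

/* Simplified model of the single-instruction cases: LI (16-bit signed),
   LIS (32-bit signed with the low 16 bits clear) and, when HAVE_PREFIXED
   is set, PLI (34-bit signed).  */
bool
loads_in_one_insn (int64_t value, bool have_prefixed)
{
  if (value >= -32768 && value <= 32767)
    return true;				/* li */

  if ((value & 0xffff) == 0
      && value >= -(INT64_C (1) << 31) && value < (INT64_C (1) << 31))
    return true;				/* lis */

  return have_prefixed && fits_signed_34bit (value);	/* pli */
}

/* Hypothetical self-check exercising the three cases.  */
void
check_constant_loads (void)
{
  assert (loads_in_one_insn (100, false));
  assert (loads_in_one_insn (0x12340000, false));
  assert (!loads_in_one_insn (0x12345678, false));
  assert (loads_in_one_insn (0x12345678, true));
}
```

A constant such as 0x12345678 is the case the new alternative is aimed at: it
matches neither LI nor LIS, but fits comfortably in 34 bits.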

It is a slight reworking of patch V7 #1 after reformatting the movdi_internal64
insn.  I have done bootstraps and make check on a power8 little endian system
and there were no regressions.  Can I check this patch in?

Patch V7 #1:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01301.html

2019-12-09  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (num_insns_constant_gpr): Return 1 if the
	constant can be loaded with PLI if -mcpu=future.
	* config/rs6000/rs6000.md (movdi_internal64): Add alternative to
	use PLI to load up 34-bit constants if -mcpu=future.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 279141)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -5541,6 +5541,10 @@ num_insns_constant_gpr (HOST_WIDE_INT va
 	   && (value >> 31 == -1 || value >> 31 == 0))
     return 1;
 
+  /* PADDI can support up to 34 bit signed integers.  */
+  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
+    return 1;
+
   else if (TARGET_POWERPC64)
     {
       HOST_WIDE_INT low  = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 279141)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -8828,7 +8828,7 @@ (define_split
 })
 
 ;;	   GPR store   GPR load    GPR move
-;;	   GPR li      GPR lis     GPR #
+;;	   GPR li      GPR lis     GPR pli     GPR #
 ;;	   FPR store   FPR load    FPR move
 ;;	   AVX store   AVX store   AVX load    AVX load    VSX move
 ;;	   P9 0        P9 -1       AVX 0/-1    VSX 0       VSX -1
@@ -8838,7 +8838,7 @@ (define_split
 (define_insn "*movdi_internal64"
   [(set (match_operand:DI 0 "nonimmediate_operand"
 	  "=YZ,        r,          r,
-	   r,          r,          r,
+	   r,          r,          r,          r,
 	   m,          ^d,         ^d,
 	   wY,         Z,          $v,         $v,         ^wa,
 	   wa,         wa,         v,          wa,         wa,
@@ -8847,7 +8847,7 @@ (define_insn "*movdi_internal64"
 	   ?r,         ?wa")
 	(match_operand:DI 1 "input_operand"
 	  "r,          YZ,         r,
-	   I,          L,          nF,
+	   I,          L,          eI,         nF,
 	   ^d,         m,          ^d,
 	   ^v,         $v,         wY,         Z,          ^wa,
 	   Oj,         wM,         OjwM,       Oj,         wM,
@@ -8863,6 +8863,7 @@ (define_insn "*movdi_internal64"
    mr %0,%1
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
@@ -8886,7 +8887,7 @@ (define_insn "*movdi_internal64"
    mtvsrd %x0,%1"
   [(set_attr "type"
 	  "store,      load,       *,
-	   *,          *,          *,
+	   *,          *,          *,          *,
 	   fpstore,    fpload,     fpsimple,
 	   fpstore,    fpstore,    fpload,     fpload,     veclogical,
 	   vecsimple,  vecsimple,  vecsimple,  veclogical, veclogical,
@@ -8896,7 +8897,7 @@ (define_insn "*movdi_internal64"
    (set_attr "size" "64")
    (set_attr "length"
 	  "*,          *,          *,
-	   *,          *,          20,
+	   *,          *,          *,          20,
 	   *,          *,          *,
 	   *,          *,          *,          *,          *,
 	   *,          *,          *,          *,          *,
@@ -8905,7 +8906,7 @@ (define_insn "*movdi_internal64"
 	   *,          *")
    (set_attr "isa"
 	  "*,          *,          *,
-	   *,          *,          *,
+	   *,          *,          fut,        *,
 	   *,          *,          *,
 	   p9v,        p7v,        p9v,        p7v,        *,
 	   p9v,        p9v,        p7v,        *,          *,



* [PATCH] V10 patch #2, use PLI to load up large SImode constants if -mcpu=future
  2019-12-12  0:07 PowerPC V10 Patches for -mcpu=future Michael Meissner
  2019-12-12  0:12 ` [PATCH] V10 patch #1, Use PLI to load up large DImode constants if -mcpu=future Michael Meissner
@ 2019-12-12  0:15 ` Michael Meissner
  2019-12-17 16:03   ` Segher Boessenkool
  2019-12-12  0:17 ` [PATCH] V10 patch #3, Use PADDI to add large constants if -mcpu=future is used Michael Meissner
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Michael Meissner @ 2019-12-12  0:15 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This patch adds an alternative to use PLI to load up large SImode constants if
-mcpu=future is used.

It is a slight reworking of patch V7 #2 after reformatting the movsi_internal1
insn.  I have done bootstraps and make check on a power8 little endian system
and there were no regressions.  Can I check this patch in once patch V10 #1 is
checked in?

Patch V7 #2:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01302.html

2019-12-09  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.md (movsi_internal1): Add alternative to
	use PLI to load up 34-bit constants if -mcpu=future.

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 279143)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -6892,7 +6892,7 @@ (define_split
 ;;	   MR          LA
 ;;	   LWZ         LFIWZX      LXSIWZX
 ;;	   STW         STFIWX      STXSIWX
-;;	   LI          LIS         #
+;;	   LI          LIS         PLI         #
 ;;	   XXLOR       XXSPLTIB 0  XXSPLTIB -1 VSPLTISW
 ;;	   XXLXOR 0    XXLORC -1   P9 const
 ;;	   MTVSRWZ     MFVSRWZ
@@ -6903,7 +6903,7 @@ (define_insn "*movsi_internal1"
 	  "=r,         r,
 	   r,          d,          v,
 	   m,          Z,          Z,
-	   r,          r,          r,
+	   r,          r,          r,          r,
 	   wa,         wa,         wa,         v,
 	   wa,         v,          v,
 	   wa,         r,
@@ -6912,7 +6912,7 @@ (define_insn "*movsi_internal1"
 	  "r,          U,
 	   m,          Z,          Z,
 	   r,          d,          v,
-	   I,          L,          n,
+	   I,          L,          eI,         n,
 	   wa,         O,          wM,         wB,
 	   O,          wM,         wS,
 	   r,          wa,
@@ -6930,6 +6930,7 @@ (define_insn "*movsi_internal1"
    stxsiwx %x1,%y0
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    xxlor %x0,%x1,%x1
    xxspltib %x0,0
@@ -6947,7 +6948,7 @@ (define_insn "*movsi_internal1"
 	  "*,          *,
 	   load,       fpload,     fpload,
 	   store,      fpstore,    fpstore,
-	   *,          *,          *,
+	   *,          *,          *,          *,
 	   veclogical, vecsimple,  vecsimple,  vecsimple,
 	   veclogical, veclogical, vecsimple,
 	   mffgpr,     mftgpr,
@@ -6956,7 +6957,7 @@ (define_insn "*movsi_internal1"
 	  "*,          *,
 	   *,          *,          *,
 	   *,          *,          *,
-	   *,          *,          8,
+	   *,          *,          *,          8,
 	   *,          *,          *,          *,
 	   *,          *,          8,
 	   *,          *,
@@ -6965,7 +6966,7 @@ (define_insn "*movsi_internal1"
 	  "*,          *,
 	   *,          p8v,        p8v,
 	   *,          p8v,        p8v,
-	   *,          *,          *,
+	   *,          *,          fut,        *,
 	   p8v,        p9v,        p9v,        p8v,
 	   p9v,        p8v,        p9v,
 	   p8v,        p8v,
@@ -7120,8 +7121,7 @@ (define_insn "*movsi_from_df"
 (define_split
   [(set (match_operand:SI 0 "gpc_reg_operand")
 	(match_operand:SI 1 "const_int_operand"))]
-  "(unsigned HOST_WIDE_INT) (INTVAL (operands[1]) + 0x8000) >= 0x10000
-   && (INTVAL (operands[1]) & 0xffff) != 0"
+  "num_insns_constant (operands[1], SImode) > 1"
   [(set (match_dup 0)
 	(match_dup 2))
    (set (match_dup 0)



* [PATCH] V10 patch #3, Use PADDI to add large constants if -mcpu=future is used
  2019-12-12  0:07 PowerPC V10 Patches for -mcpu=future Michael Meissner
  2019-12-12  0:12 ` [PATCH] V10 patch #1, Use PLI to load up large DImode constants if -mcpu=future Michael Meissner
  2019-12-12  0:15 ` [PATCH] V10 patch #2, use PLI to load up large SImode " Michael Meissner
@ 2019-12-12  0:17 ` Michael Meissner
  2019-12-17 16:27   ` Segher Boessenkool
  2019-12-12  0:29 ` [PATCH] V10 patch #4, Add new prefixed/non-prefixed memory constraints Michael Meissner
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Michael Meissner @ 2019-12-12  0:17 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This patch adds an alternative to use PADDI to add large SImode and DImode
constants if -mcpu=future is used.
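
For comparison, a constant that matches neither the I nor the L constraint is
conventionally added with an ADDIS/ADDI pair.  The sketch below shows the
arithmetic of that split only; it is not the code in rs6000.md, and the
function name is hypothetical.  With the eI alternative, a single PADDI can
take the place of the pair.

```c
#include <assert.h>
#include <stdint.h>

/* Split VALUE into a part for ADDIS (*HIGH, a multiple of 0x10000) and a
   sign-extended 16-bit part for ADDI (*LOW), so that *HIGH + *LOW equals
   VALUE.  Whether *HIGH itself is encodable by ADDIS is not checked.  */
void
split_add_constant (int64_t value, int64_t *high, int64_t *low)
{
  /* Sign-extend the low 16 bits.  */
  *low = ((value & 0xffff) ^ 0x8000) - 0x8000;

  /* The remainder absorbs the borrow when *LOW is negative.  */
  *high = value - *low;

  assert (*low >= -32768 && *low <= 32767);
  assert ((*high & 0xffff) == 0);
}
```

For example, 0x1234ffff splits into a high part of 0x12350000 and a low part
of -1, because the low half is sign-extended by ADDI.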

It is a slight reworking of patch V7 #3.  I have done bootstraps and make check
on a power8 little endian system and there were no regressions.  Can I check
this patch in?

2019-12-09  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (add_operand): Allow eI constants.
	* config/rs6000/rs6000.md (add<mode>3): Add alternative to
	generate PADDI for 34-bit constants if -mcpu=future.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 279141)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -839,7 +839,8 @@ (define_special_predicate "indexed_addre
 (define_predicate "add_operand"
   (if_then_else (match_code "const_int")
     (match_test "satisfies_constraint_I (op)
-		 || satisfies_constraint_L (op)")
+		 || satisfies_constraint_L (op)
+		 || satisfies_constraint_eI (op)")
     (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 279144)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -1761,15 +1761,17 @@ (define_expand "add<mode>3"
 })
 
 (define_insn "*add<mode>3"
-  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
-	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
-		  (match_operand:GPR 2 "add_operand" "r,I,L")))]
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r,r")
+	(plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b,b")
+		  (match_operand:GPR 2 "add_operand" "r,I,L,eI")))]
   ""
   "@
    add %0,%1,%2
    addi %0,%1,%2
-   addis %0,%1,%v2"
-  [(set_attr "type" "add")])
+   addis %0,%1,%v2
+   addi %0,%1,%2"
+  [(set_attr "type" "add")
+   (set_attr "isa" "*,*,*,fut")])
 
 (define_insn "*addsi3_high"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=b")



* [PATCH] V10 patch #4, Add new prefixed/non-prefixed memory constraints
  2019-12-12  0:07 PowerPC V10 Patches for -mcpu=future Michael Meissner
                   ` (2 preceding siblings ...)
  2019-12-12  0:17 ` [PATCH] V10 patch #3, Use PADDI to add large constants if -mcpu=future is used Michael Meissner
@ 2019-12-12  0:29 ` Michael Meissner
  2019-12-17 17:27   ` Segher Boessenkool
  2019-12-12  0:48 ` [PATCH] V10 patch #5, Fix codegen bug with vector extracts using a variable offset & PC-relative address Michael Meissner
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Michael Meissner @ 2019-12-12  0:29 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

Add new constraints to match whether a memory operand is not prefixed (em
constraint) or prefixed (ep constraint).  This is one of 4 parts aimed at
reworking the vector extract code in patch V7 #6.

This patch just adds the new constraints, but these constraints will not be
used until the next patch.  Originally I had just one constraint (em) that
matched non-prefixed memory operands.  But in order to use it, I needed to make
sure the combiner did not combine vector extracts with a variable offset with a
PC-relative memory location.

I.e.:

	#include <altivec.h>

	static vector double vd;

	double get (unsigned int n)
	{
	  return vec_extract (vd, n);
	}

In addition, as I contemplate the bigger issue about the insn length attribute,
I suspect we may need to have an ep attribute as well as em.
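
One point that is easy to miss is that em and ep need not be exact complements.
The sketch below models that with a hypothetical enum.  The four forms rejected
by is_non_prefixed mirror the new non_prefixed_memory predicate; the two
non-prefixed forms are placeholders, and the set accepted by is_prefixed is an
assumption about prefixed_memory, which this patch does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the address classification.  */
enum addr_form
{
  FORM_BAD,			/* not a valid address */
  FORM_OFFSET,			/* placeholder: register + 16-bit offset */
  FORM_INDEXED,			/* placeholder: register + register */
  FORM_PREFIXED_NUMERIC,	/* register + 34-bit offset */
  FORM_PCREL_LOCAL,		/* PC-relative, local symbol */
  FORM_PCREL_EXTERNAL		/* PC-relative, external symbol */
};

/* Mirrors the new predicate: valid, and none of the prefixed forms.  */
bool
is_non_prefixed (enum addr_form form)
{
  return (form != FORM_BAD
	  && form != FORM_PREFIXED_NUMERIC
	  && form != FORM_PCREL_LOCAL
	  && form != FORM_PCREL_EXTERNAL);
}

/* Assumption: only these two forms are usable directly as prefixed
   memory operands.  */
bool
is_prefixed (enum addr_form form)
{
  return form == FORM_PREFIXED_NUMERIC || form == FORM_PCREL_LOCAL;
}

/* Hypothetical self-check: an invalid address matches neither side.  */
void
check_addr_forms (void)
{
  assert (is_non_prefixed (FORM_OFFSET) && !is_prefixed (FORM_OFFSET));
  assert (!is_non_prefixed (FORM_PCREL_LOCAL) && is_prefixed (FORM_PCREL_LOCAL));
  assert (!is_non_prefixed (FORM_BAD) && !is_prefixed (FORM_BAD));
}
```

Under that assumption, an insn that wants to accept every valid memory operand
has to list both constraints in separate alternatives.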

Patch V7 #6:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01306.html

I have bootstrapped the compiler on a little endian power8 system and ran make
check and there were no regressions.  Can I check this patch in?

2019-12-10  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/constraints.md (em constraint): New constraint for
	non-prefixed memory operands.
	(ep constraint): New constraint for prefixed memory operands.
	* config/rs6000/predicates.md (non_prefixed_memory): New predicate
	for non-prefixed memory operands.
	* doc/md.texi (PowerPC constraints): Document em and ep constraints.

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 279182)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -202,6 +202,16 @@ (define_constraint "H"
 
 ;; Memory constraints
 
+(define_memory_constraint "em"
+  "A memory operand that does not contain a prefixed address."
+  (and (match_code "mem")
+       (match_operand 0 "non_prefixed_memory")))
+
+(define_memory_constraint "ep"
+  "A memory operand that contains a prefixed address."
+  (and (match_code "mem")
+       (match_operand 0 "prefixed_memory")))
+
 (define_memory_constraint "es"
   "A ``stable'' memory operand; that is, one which does not include any
 automodification of the base register.  Unlike @samp{m}, this constraint
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 279151)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -1846,3 +1846,17 @@ (define_predicate "prefixed_memory"
 {
   return address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
 })
+
+;; Return true if the operand is a valid memory address that does not use a
+;; prefixed address.
+(define_predicate "non_prefixed_memory"
+  (match_code "mem")
+{
+  enum insn_form iform
+    = address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
+
+  return (iform != INSN_FORM_BAD
+          && iform != INSN_FORM_PREFIXED_NUMERIC
+	  && iform != INSN_FORM_PCREL_LOCAL
+	  && iform != INSN_FORM_PCREL_EXTERNAL);
+})
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 279182)
+++ gcc/doc/md.texi	(working copy)
@@ -3373,6 +3373,12 @@ asm ("st %1,%0" : "=m<>" (mem) : "r" (va
 
 is not.
 
+@item em
+A memory operand that does not contain a prefixed address.
+
+@item ep
+A memory operand that contains a prefixed address.
+
 @item es
 A ``stable'' memory operand; that is, one which does not include any
 automodification of the base register.  This used to be useful when



* [PATCH] V10 patch #5, Fix codegen bug with vector extracts using a variable offset & PC-relative address
  2019-12-12  0:07 PowerPC V10 Patches for -mcpu=future Michael Meissner
                   ` (3 preceding siblings ...)
  2019-12-12  0:29 ` [PATCH] V10 patch #4, Add new prefixed/non-prefixed memory constraints Michael Meissner
@ 2019-12-12  0:48 ` Michael Meissner
  2019-12-17 18:02   ` Segher Boessenkool
  2019-12-12  0:54 ` [PATCH] V10 patch #6, Use prefixed load/stores for vector extract with large offsets Michael Meissner
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Michael Meissner @ 2019-12-12  0:48 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This patch fixes a bug with vector extracts using a PC-relative address and a
variable offset when using -mcpu=future.

Consider the code:

	#include <altivec.h>

	static vector double vd;
	vector double *p = &vd;

	double get (unsigned int n)
	{
	  return vec_extract (vd, n);
	}

If you compile this code with -O2 -mcpu=future -mpcrel you get:

	get:
	        pla 9,.LANCHOR0@pcrel
	        lfdx 1,9,9
	        blr

This is because there is only one base register temporary, and the current code
tries to first create the offset and then use the same temporary to hold the
address of the PC-relative value.

After combine the insn is:

(insn 14 9 15 2 (parallel [
            (set (reg/i:DF 33 1)
                (unspec:DF [
                        (mem/c:V2DF (symbol_ref:DI ("*.LANCHOR0") [flags 0x182]) [1 vd+0 S16 A128])
                        (reg:DI 123 [ n ])
                    ] UNSPEC_VSX_EXTRACT))
            (clobber (scratch:DI))
            (clobber (scratch:V2DI))
        ]) "foo.c":9:1 1314 {vsx_extract_v2df_var}


Split2 changes this to:

(insn 20 8 21 2 (set (reg:DI 3 3 [orig:123 n ] [123])
        (and:DI (reg:DI 3 3 [orig:123 n ] [123])
            (const_int 1 [0x1]))) "foo.c":9:1 193 {anddi3_mask}
     (nil))
(insn 21 20 22 2 (set (reg:DI 9 9 [126])
        (ashift:DI (reg:DI 3 3 [orig:123 n ] [123])
            (const_int 3 [0x3]))) "foo.c":9:1 256 {ashldi3}
     (nil))
(insn 22 21 23 2 (set (reg:DI 9 9 [126])
        (symbol_ref:DI ("*.LANCHOR0") [flags 0x182])) "foo.c":9:1 680 {*pcrel_local_addr}
     (nil))
(insn 23 22 15 2 (set (reg/i:DF 33 1)
        (mem/c:DF (plus:DI (reg:DI 9 9 [126])
                (reg:DI 9 9 [126])) [1  S8 A8])) "foo.c":9:1 512 {*movdf_hardfloat64}
     (nil))

I.e. GPR r9 is first set to the offset << 3, and then the offset is wiped out
when r9 is overwritten with the address of the PC-relative structure.
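
The effect can be modelled in a few lines of plain C.  This is only a toy model
of the register usage, with hypothetical function names, where each local
variable plays the role of one base register temporary for a V2DF extract.

```c
#include <assert.h>
#include <stdint.h>

/* One temporary: the scaled offset is lost when the same register is
   reused for the address, so the "load" uses address + address.  */
uint64_t
effective_address_one_tmp (uint64_t vector_address, uint64_t n)
{
  uint64_t tmp;

  tmp = (n & 1) << 3;		/* element offset in bytes */
  tmp = vector_address;		/* overwrites the offset */
  return tmp + tmp;
}

/* Two temporaries: the offset and the address stay in separate
   registers, so the sum is the address of the requested element.  */
uint64_t
effective_address_two_tmps (uint64_t vector_address, uint64_t n)
{
  uint64_t tmp = (n & 1) << 3;
  uint64_t tmp_prefixed = vector_address;

  return tmp_prefixed + tmp;
}

/* Hypothetical self-check with a vector at address 0x1000.  */
void
check_effective_address (void)
{
  assert (effective_address_one_tmp (0x1000, 1) == 0x2000);	/* wrong */
  assert (effective_address_two_tmps (0x1000, 1) == 0x1008);	/* right */
}
```

The first function corresponds to the lfdx 1,9,9 shown above: both operands of
the indexed load name the same register.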

This patch changes all of the variable extract insns and the function in
rs6000.c that processes them to have a second base register temporary only if
we have prefixed addresses.  The code generated then becomes:

	get:
		extsw 3,3
	        pla 10,.LANCHOR0@pcrel
	        rldicl 3,3,0,63
	        sldi 9,3,3
	        lfdx 1,10,9

I use the em and ep constraints to keep the alternatives separate.  Using em
prevents the register allocator from skipping the alternative with ep in it
because it has an extra scratch register.

I have bootstrapped the compiler on a little endian power8 system and ran make
check without regression.  Can I check this in once patch V10 #4 is checked in?

2019-12-10  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000-protos.h (rs6000_split_vec_extract_var):
	Update calling signature.
	* config/rs6000/rs6000.c (rs6000_split_vec_extract_var): Add
	additional tmp base register argument.  If the memory is prefixed,
	put the address into the new tmp base register.
	* config/rs6000/vsx.md (vsx_extract_<mode>_var, VSX_D iterator):
	Add new temporary for loading up the address of prefixed memory
	operands.
	(vsx_extract_v4sf_var): Add new temporary for loading up the
	address of prefixed memory operands.
	(vsx_extract_<mode>_var, VSX_EXTRACT_I iterator): Add new
	temporary for loading up the address of prefixed memory operands.
	(vsx_extract_<mode>_<VS_scalar>mode_var): Add new temporary for
	loading up the address of prefixed memory operands.

Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 279182)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -59,7 +59,7 @@ extern void rs6000_expand_float128_conve
 extern void rs6000_expand_vector_init (rtx, rtx);
 extern void rs6000_expand_vector_set (rtx, rtx, int);
 extern void rs6000_expand_vector_extract (rtx, rtx, rtx);
-extern void rs6000_split_vec_extract_var (rtx, rtx, rtx, rtx, rtx);
+extern void rs6000_split_vec_extract_var (rtx, rtx, rtx, rtx, rtx, rtx);
 extern rtx rs6000_adjust_vec_address (rtx, rtx, rtx, rtx, machine_mode);
 extern void altivec_expand_vec_perm_le (rtx op[4]);
 extern void rs6000_expand_extract_even (rtx, rtx, rtx);
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 279182)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6861,7 +6861,7 @@ rs6000_adjust_vec_address (rtx scalar_re
 
 void
 rs6000_split_vec_extract_var (rtx dest, rtx src, rtx element, rtx tmp_gpr,
-			      rtx tmp_altivec)
+			      rtx tmp_altivec, rtx tmp_prefixed)
 {
   machine_mode mode = GET_MODE (src);
   machine_mode scalar_mode = GET_MODE_INNER (GET_MODE (src));
@@ -6878,6 +6878,16 @@ rs6000_split_vec_extract_var (rtx dest,
       int num_elements = GET_MODE_NUNITS (mode);
       rtx num_ele_m1 = GEN_INT (num_elements - 1);
 
+      /* If we have a prefixed address, we need to load the address into a
+	 separate register and then add the variable offset to that
+	 address.  */
+      if (prefixed_memory (src, mode))
+	{
+	  gcc_assert (REG_P (tmp_prefixed));
+	  rs6000_emit_move (tmp_prefixed, XEXP (src, 0), Pmode);
+	  src = change_address (src, mode, tmp_prefixed);
+	}
+
       emit_insn (gen_anddi3 (element, element, num_ele_m1));
       gcc_assert (REG_P (tmp_gpr));
       emit_move_insn (dest, rs6000_adjust_vec_address (dest, src, element,
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 279182)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -3247,21 +3247,24 @@ (define_insn "vsx_vslo_<mode>"
 
 ;; Variable V2DI/V2DF extract
 (define_insn_and_split "vsx_extract_<mode>_var"
-  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r")
-	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,m,m")
-			     (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
-			    UNSPEC_VSX_EXTRACT))
-   (clobber (match_scratch:DI 3 "=r,&b,&b"))
-   (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
+  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r,wa,r")
+	(unspec:<VS_scalar>
+	 [(match_operand:VSX_D 1 "input_operand" "v,em,em,ep,ep")
+	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r,r,r")]
+	 UNSPEC_VSX_EXTRACT))
+   (clobber (match_scratch:DI 3 "=r,&b,&b,&b,&b"))
+   (clobber (match_scratch:V2DI 4 "=&v,X,X,X,X"))
+   (clobber (match_scratch:DI 5 "=X,X,X,&b,&b"))]
   "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_DIRECT_MOVE_64BIT"
   "#"
   "&& reload_completed"
   [(const_int 0)]
 {
   rs6000_split_vec_extract_var (operands[0], operands[1], operands[2],
-				operands[3], operands[4]);
+				operands[3], operands[4], operands[5]);
   DONE;
-})
+}
+  [(set_attr "isa" "*,*,*,fut,fut")])
 
 ;; Extract a SF element from V4SF
 (define_insn_and_split "vsx_extract_v4sf"
@@ -3317,21 +3320,23 @@ (define_insn_and_split "*vsx_extract_v4s
 
 ;; Variable V4SF extract
 (define_insn_and_split "vsx_extract_v4sf_var"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=wa,wa,?r")
-	(unspec:SF [(match_operand:V4SF 1 "input_operand" "v,m,m")
-		    (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
+  [(set (match_operand:SF 0 "gpc_reg_operand" "=wa,wa,?r,wa,?r")
+	(unspec:SF [(match_operand:V4SF 1 "input_operand" "v,em,em,ep,ep")
+		    (match_operand:DI 2 "gpc_reg_operand" "r,r,r,r,r")]
 		   UNSPEC_VSX_EXTRACT))
-   (clobber (match_scratch:DI 3 "=r,&b,&b"))
-   (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
+   (clobber (match_scratch:DI 3 "=r,&b,&b,&b,&b"))
+   (clobber (match_scratch:V2DI 4 "=&v,X,X,X,X"))
+   (clobber (match_scratch:DI 5 "=X,X,X,&b,&b"))]
   "VECTOR_MEM_VSX_P (V4SFmode) && TARGET_DIRECT_MOVE_64BIT"
   "#"
   "&& reload_completed"
   [(const_int 0)]
 {
   rs6000_split_vec_extract_var (operands[0], operands[1], operands[2],
-				operands[3], operands[4]);
+				operands[3], operands[4], operands[5]);
   DONE;
-})
+}
+  [(set_attr "isa" "*,*,*,fut,fut")])
 
 ;; Expand the builtin form of xxpermdi to canonical rtl.
 (define_expand "vsx_xxpermdi_<mode>"
@@ -3679,33 +3684,35 @@ (define_insn_and_split "*vsx_extract_<mo
 
 ;; Variable V16QI/V8HI/V4SI extract
 (define_insn_and_split "vsx_extract_<mode>_var"
-  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
+  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r,r")
 	(unspec:<VS_scalar>
-	 [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
-	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
+	 [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,em,ep")
+	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r,r")]
 	 UNSPEC_VSX_EXTRACT))
-   (clobber (match_scratch:DI 3 "=r,r,&b"))
-   (clobber (match_scratch:V2DI 4 "=X,&v,X"))]
+   (clobber (match_scratch:DI 3 "=r,r,&b,&b"))
+   (clobber (match_scratch:V2DI 4 "=X,&v,X,X"))
+   (clobber (match_scratch:DI 5 "=X,X,X,&b"))]
   "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_DIRECT_MOVE_64BIT"
   "#"
   "&& reload_completed"
   [(const_int 0)]
 {
   rs6000_split_vec_extract_var (operands[0], operands[1], operands[2],
-				operands[3], operands[4]);
+				operands[3], operands[4], operands[5]);
   DONE;
 }
-  [(set_attr "isa" "p9v,*,*")])
+  [(set_attr "isa" "p9v,*,*,fut")])
 
 (define_insn_and_split "*vsx_extract_<mode>_<VS_scalar>mode_var"
-  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
+  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r,r")
 	(zero_extend:<VS_scalar>
 	 (unspec:<VSX_EXTRACT_I:VS_scalar>
-	  [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
-	   (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
+	  [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,em,ep")
+	   (match_operand:DI 2 "gpc_reg_operand" "r,r,r,r")]
 	  UNSPEC_VSX_EXTRACT)))
-   (clobber (match_scratch:DI 3 "=r,r,&b"))
-   (clobber (match_scratch:V2DI 4 "=X,&v,X"))]
+   (clobber (match_scratch:DI 3 "=r,r,&b,&b"))
+   (clobber (match_scratch:V2DI 4 "=X,&v,X,X"))
+   (clobber (match_scratch:DI 5 "=X,X,X,&b"))]
   "VECTOR_MEM_VSX_P (<VSX_EXTRACT_I:MODE>mode) && TARGET_DIRECT_MOVE_64BIT"
   "#"
   "&& reload_completed"
@@ -3714,10 +3721,11 @@ (define_insn_and_split "*vsx_extract_<mo
   machine_mode smode = <VS_scalar>mode;
   rs6000_split_vec_extract_var (gen_rtx_REG (smode, REGNO (operands[0])),
 				operands[1], operands[2],
-				operands[3], operands[4]);
+				operands[3], operands[4],
+				operands[5]);
   DONE;
 }
-  [(set_attr "isa" "p9v,*,*")])
+  [(set_attr "isa" "p9v,*,*,fut")])
 
 ;; VSX_EXTRACT optimizations
 ;; Optimize double d = (double) vec_extract (vi, <n>)




* [PATCH] V10 patch #6, Use prefixed load/stores for vector extract with large offsets
  2019-12-12  0:07 PowerPC V10 Patches for -mcpu=future Michael Meissner
                   ` (4 preceding siblings ...)
  2019-12-12  0:48 ` [PATCH] V10 patch #5, Fix codegen bug with vector extracts using a variable offset & PC-relative address Michael Meissner
@ 2019-12-12  0:54 ` Michael Meissner
  2019-12-12  0:58 ` [PATCH] V10 patch #7, Improve vector_extract code of a PC-relative address with a constant offset for -mcpu=future Michael Meissner
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Michael Meissner @ 2019-12-12  0:54 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This patch optimizes vector extracts where the vector is addressed with an
offset larger than 16 bits, so that the add is folded into the final address.

I.e.

	#include <altivec.h>

	double get (vector double *p, unsigned int h)
	{
	  return vec_extract (p[50000], 1);
	}
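
As a rough illustration (not part of the patch), the choice between the three
addressing cases can be modelled in plain C.  The helper names below are
hypothetical stand-ins for SIGNED_16BIT_OFFSET_P, SIGNED_34BIT_OFFSET_P and
TARGET_PREFIXED_ADDR, and the 34-bit range is assumed to be the usual signed
range [-2^33, 2^33 - 1].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the range checks used by the patch.  */
bool fits_signed_16bit(int64_t v) { return v >= -32768 && v <= 32767; }

bool fits_signed_34bit(int64_t v)
{
  return v >= -(INT64_C(1) << 33) && v <= (INT64_C(1) << 33) - 1;
}

enum addr_kind { ADDR_D_FORM, ADDR_PREFIXED, ADDR_INDEXED };

/* Byte offset of element ELT (ELT_SIZE bytes wide) within the INDEX-th
   16-byte vector of an array of vectors.  */
int64_t element_offset(int64_t index, int elt, int elt_size)
{
  assert(elt_size > 0 && elt >= 0 && elt * elt_size < 16);
  return index * 16 + (int64_t) elt * elt_size;
}

/* Mirror the order of the tests in the patch: a 16-bit offset first (with
   the alignment restriction for 8-byte scalars), then a 34-bit prefixed
   offset, and otherwise fall back to indexed (X-form) addressing.  */
enum addr_kind classify_offset(int64_t offset, int scalar_size,
                               bool prefixed_ok)
{
  if (fits_signed_16bit(offset)
      && (scalar_size < 8 || (offset & 0x3) == 0))
    return ADDR_D_FORM;

  if (prefixed_ok && fits_signed_34bit(offset))
    return ADDR_PREFIXED;

  return ADDR_INDEXED;
}
```

For the example above, element 1 of p[50000] sits at byte offset
50000 * 16 + 8 = 800008, which is too large for a 16-bit offset but fits
comfortably in 34 bits.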

I have bootstrapped this patch on a little endian power8 system and ran make
check with no regressions.  Can I check this patch in?

2019-12-10  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_adjust_vec_address): Add support
	for the offset being 34-bits when -mcpu=future is used.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 279199)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6766,9 +6766,17 @@ rs6000_adjust_vec_address (rtx scalar_re
 	  HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset);
 	  rtx offset_rtx = GEN_INT (offset);
 
-	  if (IN_RANGE (offset, -32768, 32767)
+	  /* 16-bit offset.  */
+	  if (SIGNED_16BIT_OFFSET_P (offset)
 	      && (scalar_size < 8 || (offset & 0x3) == 0))
 	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
+	  /* 34-bit offset if we have prefixed addresses.  */
+	  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (offset))
+	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
+	  /* Offset overflowed, move offset to the temporary (which will likely
+	     be split), and do X-FORM addressing.  */
 	  else
 	    {
 	      emit_move_insn (base_tmp, offset_rtx);
@@ -6799,6 +6807,12 @@ rs6000_adjust_vec_address (rtx scalar_re
 	      emit_insn (insn);
 	    }
 
+	  /* Make sure we don't overwrite the temporary if the element being
+	     extracted is variable, and we've put the offset into base_tmp
+	     previously.  */
+	  else if (rtx_equal_p (base_tmp, element_offset))
+	    emit_insn (gen_add2_insn (base_tmp, op1));
+
 	  else
 	    {
 	      emit_move_insn (base_tmp, op1);

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V10 patch #7, Improve vector_extract code of a PC-relative address with a constant offset for -mcpu=future
  2019-12-12  0:07 PowerPC V10 Patches for -mcpu=future Michael Meissner
                   ` (5 preceding siblings ...)
  2019-12-12  0:54 ` [PATCH] V10 patch #6, Use prefixed load/stores for vector extract with large offsets Michael Meissner
@ 2019-12-12  0:58 ` Michael Meissner
  2019-12-12  1:06 ` [PATCH] V10 patch #8, Enable -mpcrel and -mprefixed-addr for -mcpu=future on 64-bit little endian Linux systems Michael Meissner
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Michael Meissner @ 2019-12-12  0:58 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This patch improves the code of vector_extract when the vector is addressed
with a PC-relative address, and the element number is constant.

I.e.

	#include <altivec.h>

	static vector double vd[10];
	vector double *p = &vd[0];

	double get (void)
	{
	  return vec_extract (vd[4], 1);
	}
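
As a rough sketch of the folding (not the patch itself), a PC-relative address
can be modelled as a local symbol plus an addend.  The struct and function
names below are hypothetical, and the 34-bit limit is assumed to be the signed
range [-2^33, 2^33 - 1].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a PC-relative address: a local symbol plus a
   constant addend.  */
struct pcrel_addr {
  const char *symbol;
  int64_t addend;
};

/* Try to fold ELEMENT_OFFSET into ADDR.  On success the addend is updated
   and true is returned.  If the sum does not fit in a signed 34-bit field,
   ADDR is left unchanged and false is returned; in that case the caller
   would have to load the address into a base register instead.  */
bool fold_element_offset(struct pcrel_addr *addr, int64_t element_offset)
{
  assert(addr != NULL && addr->symbol != NULL);

  int64_t sum = addr->addend + element_offset;
  if (sum < -(INT64_C(1) << 33) || sum > (INT64_C(1) << 33) - 1)
    return false;

  addr->addend = sum;
  return true;
}
```

For the example above, vd[4] starts at vd + 64 and element 1 of a vector
double is 8 bytes further on, so the folded address is vd + 72.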

I have bootstrapped this code on a little endian power8 and ran make check and
there were no regressions.  Can I check this into the trunk?

2019-12-10  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_reg_to_addr_mask): New helper
	function.
	(rs6000_adjust_vec_address): Add support for folding a constant
	offset of a vector extract of a vector accessed with PC-relative
	addressing into the offset of the load.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 279200)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6698,6 +6698,30 @@ rs6000_expand_vector_extract (rtx target
     }
 }
 
+/* Helper function to return an address mask based on a physical register.  */
+
+static addr_mask_type
+rs6000_reg_to_addr_mask (rtx reg, machine_mode mode)
+{
+  unsigned int r = reg_or_subregno (reg);
+  addr_mask_type addr_mask;
+
+  gcc_assert (HARD_REGISTER_NUM_P (r));
+  if (INT_REGNO_P (r))
+    addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_GPR];
+
+  else if (FP_REGNO_P (r))
+    addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_FPR];
+
+  else if (ALTIVEC_REGNO_P (r))
+    addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_VMX];
+
+  else
+    gcc_unreachable ();
+
+  return addr_mask;
+}
+
 /* Adjust a memory address (MEM) of a vector type to point to a scalar field
    within the vector (ELEMENT) with a mode (SCALAR_MODE).  Use a base register
    temporary (BASE_TMP) to fixup the address.  Return the new memory address
@@ -6823,8 +6847,57 @@ rs6000_adjust_vec_address (rtx scalar_re
 	}
     }
 
+  /* For references to local static variables, try to fold a constant offset
+     into the address.  */
+  else if (pcrel_local_address (addr, Pmode) && CONST_INT_P (element_offset))
+    {
+      if (GET_CODE (addr) == CONST)
+	addr = XEXP (addr, 0);
+
+      if (GET_CODE (addr) == PLUS)
+	{
+	  rtx op0 = XEXP (addr, 0);
+	  rtx op1 = XEXP (addr, 1);
+	  if (CONST_INT_P (op1))
+	    {
+	      HOST_WIDE_INT offset
+		= INTVAL (XEXP (addr, 1)) + INTVAL (element_offset);
+
+	      if (offset == 0)
+		new_addr = op0;
+
+	      else if (SIGNED_34BIT_OFFSET_P (offset))
+		{
+		  rtx plus = gen_rtx_PLUS (Pmode, op0, GEN_INT (offset));
+		  new_addr = gen_rtx_CONST (Pmode, plus);
+		}
+
+	      else
+		{
+		  emit_move_insn (base_tmp, addr);
+		  new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
+		}
+	    }
+	  else
+	    {
+	      emit_move_insn (base_tmp, addr);
+	      new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
+	    }
+	}
+
+      else
+	{
+	  rtx plus = gen_rtx_PLUS (Pmode, addr, element_offset);
+	  new_addr = gen_rtx_CONST (Pmode, plus);
+	}
+    }
+
   else
     {
+      /* Make sure we don't overwrite the temporary if the vector extract
+	 offset was variable.  */
+      gcc_assert (!rtx_equal_p (base_tmp, element_offset));
+
       emit_move_insn (base_tmp, addr);
       new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
     }
@@ -6834,21 +6907,8 @@ rs6000_adjust_vec_address (rtx scalar_re
   if (GET_CODE (new_addr) == PLUS)
     {
       rtx op1 = XEXP (new_addr, 1);
-      addr_mask_type addr_mask;
-      unsigned int scalar_regno = reg_or_subregno (scalar_reg);
-
-      gcc_assert (HARD_REGISTER_NUM_P (scalar_regno));
-      if (INT_REGNO_P (scalar_regno))
-	addr_mask = reg_addr[scalar_mode].addr_mask[RELOAD_REG_GPR];
-
-      else if (FP_REGNO_P (scalar_regno))
-	addr_mask = reg_addr[scalar_mode].addr_mask[RELOAD_REG_FPR];
-
-      else if (ALTIVEC_REGNO_P (scalar_regno))
-	addr_mask = reg_addr[scalar_mode].addr_mask[RELOAD_REG_VMX];
-
-      else
-	gcc_unreachable ();
+      addr_mask_type addr_mask
+	= rs6000_reg_to_addr_mask (scalar_reg, scalar_mode);
 
       if (REG_P (op1) || SUBREG_P (op1))
 	valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
@@ -6856,9 +6916,21 @@ rs6000_adjust_vec_address (rtx scalar_re
 	valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
     }
 
+  /* An address that is a single register is always valid for either indexed or
+     offsettable loads.  */
   else if (REG_P (new_addr) || SUBREG_P (new_addr))
     valid_addr_p = true;
 
+  /* If we have a PC-relative address, check if offsetable loads are
+     allowed.  */
+  else if (pcrel_local_address (new_addr, Pmode))
+    {
+      addr_mask_type addr_mask
+	= rs6000_reg_to_addr_mask (scalar_reg, scalar_mode);
+
+      valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+    }
+
   else
     valid_addr_p = false;
 
-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V10 patch #8, Enable -mpcrel and -mprefixed-addr for -mcpu=future on 64-bit little endian Linux systems
  2019-12-12  0:07 PowerPC V10 Patches for -mcpu=future Michael Meissner
                   ` (6 preceding siblings ...)
  2019-12-12  0:58 ` [PATCH] V10 patch #7, Improve vector_extract code of a PC-relative address with a constant offset for -mcpu=future Michael Meissner
@ 2019-12-12  1:06 ` Michael Meissner
  2019-12-12  1:12 ` [PATCH] V10 patch #9, Add new effective targets for the testsuite Michael Meissner
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Michael Meissner @ 2019-12-12  1:06 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This patch enables -mpcrel and -mprefixed-addr when -mcpu=future is used on a
64-bit little endian Linux system, but it does not enable those options on
other systems.  It is a slight reworking of patch V7 #7 taking into account the
comments you made.

In particular, I changed the macros used by the target tm.h file to be:
	PREFIXED_ADDR_SUPPORTED_BY_OS
	PCREL_SUPPORTED_BY_OS

Patch V7 #7:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01307.html

I have bootstrapped the compiler on a little endian power8 system, and ran make
check with no regressions.  I also tested the code by not turning on -mpcrel or
-mprefixed-addr for Linux 64-bit little endian and inspected the code and saw
the appropriate code was generated.

In terms of your comment:

| ... and I don't understand this code.  If you use -mpcrel but you do not
| have the medium model, you _do_ get prefixed but you do _not_ get pcrel?
| And this all quietly?

You do not get this quietly.  You will get an error if you use -mpcrel and
-mcmodel=large options together.
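
To make the intended behaviour concrete, here is a simplified stand-alone model
of the option resolution.  It is only a sketch: the struct, its field names and
the error counter are hypothetical, and the real code works on
rs6000_isa_flags and calls error ().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of the state the option override looks at.  */
struct addr_opts {
  bool cpu_future, is_64bit, is_elfv2, cmodel_medium;
  bool os_prefixed, os_pcrel;             /* the *_SUPPORTED_BY_OS macros */
  bool explicit_prefixed, explicit_pcrel; /* user gave the option */
  bool prefixed, pcrel;                   /* current flag values */
  int errors;                             /* diagnostics issued */
};

void resolve_addressing(struct addr_opts *o)
{
  assert(o != NULL);

  if (o->cpu_future)
    {
      /* Prefixed addressing needs 64-bit registers and ELFv2.  */
      if (!o->is_64bit || !o->is_elfv2)
        {
          if ((o->pcrel && o->explicit_pcrel)
              || (o->prefixed && o->explicit_prefixed))
            o->errors++;
          o->prefixed = o->pcrel = false;
        }
      else
        {
          /* Enable the OS defaults unless the user chose explicitly.  */
          if (!o->explicit_prefixed
              && (o->os_prefixed || o->pcrel || o->os_pcrel))
            o->prefixed = true;
          if (!o->explicit_pcrel && o->os_pcrel
              && o->prefixed && o->cmodel_medium)
            o->pcrel = true;
        }

      /* PC-relative requires the medium code model.  */
      if (o->pcrel && !o->cmodel_medium)
        {
          if (o->explicit_pcrel)
            o->errors++;
          o->pcrel = false;
        }
    }

  /* Prefixed addressing (and hence PC-relative) requires -mcpu=future.  */
  if (o->prefixed && !o->cpu_future)
    {
      if (o->explicit_pcrel || o->explicit_prefixed)
        o->errors++;
      o->prefixed = o->pcrel = false;
    }

  /* PC-relative requires prefixed addressing.  */
  if (o->pcrel && !o->prefixed)
    {
      if (o->explicit_pcrel)
        o->errors++;
      o->pcrel = false;
    }
}
```

In this model, an explicit -mpcrel with a non-medium code model on 64-bit
ELFv2 Linux leaves prefixed addressing on, turns PC-relative off, and counts
one error, which is the non-quiet behaviour described above.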

2019-12-10  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/linux64.h (PREFIXED_ADDR_SUPPORTED_BY_OS): Set to
	1 to enable prefixed addressing if -mcpu=future.
	(PCREL_SUPPORTED_BY_OS): Set to 1 to enable PC-relative addressing
	if -mcpu=future.
	* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Do not
	enable -mprefixed-addr or -mpcrel by default.
	(ADDRESSING_FUTURE_MASKS): New macro.
	(OTHER_FUTURE_MASKS): Use ADDRESSING_FUTURE_MASKS.
	* config/rs6000/rs6000.c (PREFIXED_ADDR_SUPPORTED_BY_OS): Disable
	prefixed addressing unless the target OS tm.h says we should
	enable it.
	(PCREL_SUPPORTED_BY_OS): Disable PC-relative addressing unless the
	target OS tm.h says we should enable it.
	(rs6000_debug_reg_global): Print whether prefixed addressing and
	PC-relative addressing are enabled by default if -mcpu=future.
	(rs6000_option_override_internal): Move setting prefixed
	addressing and PC-relative addressing after the sub-target option
	handling is done.  Only enable prefixed addressing or PC-relative
	addressing on a -mcpu=future system if the target OS says to
	enable it.  Disallow prefixed addressing on 32-bit systems or if
	the target object file is not ELF v2.

Index: gcc/config/rs6000/linux64.h
===================================================================
--- gcc/config/rs6000/linux64.h	(revision 279141)
+++ gcc/config/rs6000/linux64.h	(working copy)
@@ -640,3 +640,11 @@ extern int dot_symbols;
    enabling the __float128 keyword.  */
 #undef	TARGET_FLOAT128_ENABLE_TYPE
 #define TARGET_FLOAT128_ENABLE_TYPE 1
+
+/* Enable support for pc-relative and numeric prefixed addressing on the
+   'future' system.  */
+#undef  PREFIXED_ADDR_SUPPORTED_BY_OS
+#define PREFIXED_ADDR_SUPPORTED_BY_OS	1
+
+#undef  PCREL_SUPPORTED_BY_OS
+#define PCREL_SUPPORTED_BY_OS		1
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(revision 279141)
+++ gcc/config/rs6000/rs6000-cpus.def	(working copy)
@@ -75,15 +75,22 @@
 				 | OPTION_MASK_P8_VECTOR		\
 				 | OPTION_MASK_P9_VECTOR)
 
-/* Support for a future processor's features.  Do not enable -mpcrel until it
-   is fully functional.  */
+/* Support for a future processor's features.  The prefixed and pc-relative
+   addressing bits are not added here.  Instead, they are added if the target
+   OS tm.h says that it supports the addressing modes by default when
+   -mcpu=future is used.  */
 #define ISA_FUTURE_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
-				 | OPTION_MASK_FUTURE			\
+				 | OPTION_MASK_FUTURE)
+
+/* Addressing related flags on a future processor.  These are options that need
+   to be cleared if the target OS is not capable of supporting prefixed
+   addressing at all (such as 32-bit mode or if the object file format is not
+   ELF v2).  */
+#define ADDRESSING_FUTURE_MASKS	(OPTION_MASK_PCREL			\
 				 | OPTION_MASK_PREFIXED_ADDR)
 
 /* Flags that need to be turned off if -mno-future.  */
-#define OTHER_FUTURE_MASKS	(OPTION_MASK_PCREL			\
-				 | OPTION_MASK_PREFIXED_ADDR)
+#define OTHER_FUTURE_MASKS	ADDRESSING_FUTURE_MASKS
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS	(OPTION_MASK_FLOAT128_HW		\
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 279202)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -98,6 +98,16 @@
 #endif
 #endif
 
+/* Set up the defaults for whether prefixed addressing is used, and if it is
+   used, whether we want to turn on pc-relative support by default.  */
+#ifndef PREFIXED_ADDR_SUPPORTED_BY_OS
+#define PREFIXED_ADDR_SUPPORTED_BY_OS	0
+#endif
+
+#ifndef PCREL_SUPPORTED_BY_OS
+#define PCREL_SUPPORTED_BY_OS		0
+#endif
+
 /* Support targetm.vectorize.builtin_mask_for_load.  */
 GTY(()) tree altivec_builtin_mask_for_load;
 
@@ -2535,6 +2545,14 @@ rs6000_debug_reg_global (void)
   if (TARGET_DIRECT_MOVE_128)
     fprintf (stderr, DEBUG_FMT_D, "VSX easy 64-bit mfvsrld element",
 	     (int)VECTOR_ELEMENT_MFVSRLD_64BIT);
+
+  if (TARGET_FUTURE)
+    {
+      fprintf (stderr, DEBUG_FMT_D, "PREFIXED_ADDR_SUPPORTED_BY_OS",
+	       PREFIXED_ADDR_SUPPORTED_BY_OS);
+      fprintf (stderr, DEBUG_FMT_D, "PCREL_SUPPORTED_BY_OS",
+	       PCREL_SUPPORTED_BY_OS);
+    }
 }
 
 \f
@@ -4015,26 +4033,6 @@ rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
     }
 
-  /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
-  if (TARGET_PREFIXED_ADDR && !TARGET_FUTURE)
-    {
-      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
-      else if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
-	error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
-
-      rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
-    }
-
-  /* -mpcrel requires prefixed load/store addressing.  */
-  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
-    {
-      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
-
-      rs6000_isa_flags &= ~OPTION_MASK_PCREL;
-    }
-
   /* Print the options after updating the defaults.  */
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
     rs6000_print_isa_options (stderr, 0, "after defaults", rs6000_isa_flags);
@@ -4166,12 +4164,91 @@ rs6000_option_override_internal (bool gl
   SUB3TARGET_OVERRIDE_OPTIONS;
 #endif
 
-  /* -mpcrel requires -mcmodel=medium, but we can't check TARGET_CMODEL until
-      after the subtarget override options are done.  */
-  if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
+  /* Enable prefixed addressing and PC-relative addressing if the target OS
+     tm.h file says that it is supported and the user did not explicitly use
+     -mprefixed-addr or -mpcrel.  At the present time, only 64-bit Linux
+     enables this.
+
+     PC-relative support also requires the medium code model.
+
+     We can't check for ELFv2 or -mcmodel=medium until after the subtarget
+     macros are run.
+
+     If prefixed addressing is disabled by default, and the user does -mpcrel,
+     don't force them to also specify -mprefixed-addr.  */
+  if (TARGET_FUTURE)
+    {
+      bool explicit_prefixed = ((rs6000_isa_flags_explicit
+				 & OPTION_MASK_PREFIXED_ADDR) != 0);
+      bool explicit_pcrel = ((rs6000_isa_flags_explicit
+			      & OPTION_MASK_PCREL) != 0);
+
+      /* Prefixed addressing requires 64-bit registers.  */
+      if (!TARGET_POWERPC64)
+	{
+	  if (TARGET_PCREL && explicit_pcrel)
+	    error ("%qs requires %qs", "-mpcrel", "-m64");
+
+	  else if (TARGET_PREFIXED_ADDR && explicit_prefixed)
+	    error ("%qs requires %qs", "-mprefixed-addr", "-m64");
+
+	  rs6000_isa_flags &= ~ADDRESSING_FUTURE_MASKS;
+	}
+
+      /* Only ELFv2 currently supports prefixed/pcrel addressing.  */
+      else if (rs6000_current_abi != ABI_ELFv2)
+	{
+	  if (TARGET_PCREL && explicit_pcrel)
+	    error ("%qs requires %qs", "-mpcrel", "-mabi=elfv2");
+
+	  else if (TARGET_PREFIXED_ADDR && explicit_prefixed)
+	    error ("%qs requires %qs", "-mprefixed-addr", "-mabi=elfv2");
+
+	  rs6000_isa_flags &= ~ADDRESSING_FUTURE_MASKS;
+	}
+
+      /* Enable defaults if desired.  */
+      else
+	{
+	  if (!explicit_prefixed
+	      && (PREFIXED_ADDR_SUPPORTED_BY_OS
+		  || TARGET_PCREL
+		  || PCREL_SUPPORTED_BY_OS))
+	    rs6000_isa_flags |= OPTION_MASK_PREFIXED_ADDR;
+
+	  if (!explicit_pcrel && PCREL_SUPPORTED_BY_OS
+	      && TARGET_PREFIXED_ADDR
+	      && TARGET_CMODEL == CMODEL_MEDIUM)
+	    rs6000_isa_flags |= OPTION_MASK_PCREL;
+	}
+
+      /* PC-relative requires the medium code model.  */
+      if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
+	{
+	  if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+	    error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
+
+	  rs6000_isa_flags &= ~OPTION_MASK_PCREL;
+	}
+
+    }
+
+  /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
+  if (TARGET_PREFIXED_ADDR && !TARGET_FUTURE)
     {
       if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
+	error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
+      else if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
+	error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
+
+      rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
+    }
+
+  /* -mpcrel requires prefixed load/store addressing.  */
+  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
+    {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+	error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
 
       rs6000_isa_flags &= ~OPTION_MASK_PCREL;
     }

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V10 patch #9, Add new effective targets for the testsuite
  2019-12-12  0:07 PowerPC V10 Patches for -mcpu=future Michael Meissner
                   ` (7 preceding siblings ...)
  2019-12-12  1:06 ` [PATCH] V10 patch #8, Enable -mpcrel and -mprefixed-addr for -mcpu=future on 64-bit little endian Linux systems Michael Meissner
@ 2019-12-12  1:12 ` Michael Meissner
  2019-12-12  1:13 ` [PATCH] V10 patch #10, Add PADDI/PLI tests for -mcpu=future Michael Meissner
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Michael Meissner @ 2019-12-12  1:12 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

Patch V10 #9 is patch V7 #5 that was redone.  This patch adds new effective
target options for PowerPC.  I have changed this patch to look at the code
generated by the compiler to see if prefixed addressing or PC-relative
addressing is used for -mcpu=future.  This patch needs patch V10 #8 installed
to enable the prefixed addressing and PC-relative tests.

In patch V10 #9, I did not modify the existing test
(check_effective_target_powerpc_future_ok).  As we discussed, this test should
really test whether a non-prefixed instruction is generated to allow for
targets that might support -mcpu=future but not enable prefixed addressing.
However, at present the only instructions being submitted are prefixed
instructions.  So this will have to wait until we get further down the road
with 'future' instructions.

I have bootstrapped a little endian power8 compiler and ran make check with no
regressions.  In addition with this patch installed, the new tests now run as
expected with these changes.  Can I check this in?  (This needs patch V10 #8 to
be installed to enable the tests.)

2019-12-11  Michael Meissner  <meissner@linux.ibm.com>

	* lib/target-supports.exp (check_effective_target_powerpc_pcrel):
	New target for PowerPC -mcpu=future support.
	(check_effective_target_powerpc_prefixed_addr): New target for
	PowerPC -mcpu=future support.

Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	(revision 279141)
+++ gcc/testsuite/lib/target-supports.exp	(working copy)
@@ -2161,6 +2161,23 @@ proc check_p9modulo_hw_available { } {
     }]
 }

+# Return 1 if the target generates PC-relative instructions automatically
+proc check_effective_target_powerpc_pcrel { } {
+    return [check_no_messages_and_pattern powerpc_pcrel \
+	{\mpld\M.*[@]pcrel} assembly {
+	    static long s;
+	    long *p = &s;
+	    long foo (void) { return s; }
+	} {-O2 -mcpu=future}]
+}
+
+# Return 1 if the target generates prefixed instructions automatically
+proc check_effective_target_powerpc_prefixed_addr { } {
+    return [check_no_messages_and_pattern powerpc_prefixed_addr \
+	{\mpld\M} assembly {
+	    long foo (long *p) { return p[0x12345]; }
+	} {-O2 -mcpu=future}]
+}

 # Return 1 if the target supports executing FUTURE instructions, 0 otherwise.
 # Cache the result.  It is assumed that if a simulator does not support the

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V10 patch #10, Add PADDI/PLI tests for -mcpu=future
  2019-12-12  0:07 PowerPC V10 Patches for -mcpu=future Michael Meissner
                   ` (8 preceding siblings ...)
  2019-12-12  1:12 ` [PATCH] V10 patch #9, Add new effective targets for the testsuite Michael Meissner
@ 2019-12-12  1:13 ` Michael Meissner
  2019-12-12  1:16 ` [PATCH] V10 patch #11, Add test for generating prefixed load/store when the offset is not valid for DS/DQ instructions Michael Meissner
  2019-12-12  1:18 ` [PATCH] V10 patch #12, Test to make sure we don't generate prefixed pre-modify load/stores for -mcpu=future Michael Meissner
  11 siblings, 0 replies; 23+ messages in thread
From: Michael Meissner @ 2019-12-12  1:13 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

Patch V10 #10 is a modification of patch V8 #1.  I renamed the files from
paddi-?.c to prefix-*.c so that there isn't a false match due to the .ident
directive.

This test passes when I do a make check.  Once patch V10 #9 is checked in, can
I commit this patch?

2019-12-11  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/prefix-add.c: New test for -mcpu=future
	generating PADDI for large constant adds.
	* gcc.target/powerpc/prefix-di-constant.c: New test for
	-mcpu=future generating PLI to load up large DImode constants.
	* gcc.target/powerpc/prefix-si-constant.c: New test for
	-mcpu=future generating PLI to load up large SImode constants.

Index: gcc/testsuite/gcc.target/powerpc/prefix-add.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-add.c	(revision 279252)
+++ gcc/testsuite/gcc.target/powerpc/prefix-add.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PADDI is generated to add a large constant.  */
+unsigned long
+add (unsigned long a)
+{
+  return a + 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpaddi\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c	(revision 279252)
+++ gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant.  */
+unsigned long
+large (void)
+{
+  return 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c	(revision 279252)
+++ gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant for SImode.  */
+void
+large_si (unsigned int *p)
+{
+  *p = 0x12345U;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V10 patch #11, Add test for generating prefixed load/store when the offset is not valid for DS/DQ instructions
  2019-12-12  0:07 PowerPC V10 Patches for -mcpu=future Michael Meissner
                   ` (9 preceding siblings ...)
  2019-12-12  1:13 ` [PATCH] V10 patch #10, Add PADDI/PLI tests for -mcpu=future Michael Meissner
@ 2019-12-12  1:16 ` Michael Meissner
  2019-12-12  1:18 ` [PATCH] V10 patch #12, Test to make sure we don't generate prefixed pre-modify load/stores for -mcpu=future Michael Meissner
  11 siblings, 0 replies; 23+ messages in thread
From: Michael Meissner @ 2019-12-12  1:16 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

Patch V10 #11 is a slight reworking of patch V8 #2 (testing whether we generate
a prefixed instruction when the offset would be invalid for DS and DQ
instruction formats).

This test passes when I run make check.  Can I check this in when patch V10 #9
is checked in?
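
As a rough sketch of what the test is checking, the offset rules can be written
down in plain C.  The enum and function names are hypothetical.  The rules
assumed here are the architectural ones: D-form accepts any signed 16-bit
displacement, DS-form (e.g. LD, STD, LWA) needs the bottom 2 bits to be zero,
and DQ-form (e.g. LXV, STXV) needs the bottom 4 bits to be zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the three traditional displacement formats.  */
enum insn_form { FORM_D, FORM_DS, FORM_DQ };

/* Return true if OFFSET can be encoded directly in a traditional
   (non-prefixed) instruction of the given FORM.  */
bool offset_valid_for_form(int64_t offset, enum insn_form form)
{
  if (offset < -32768 || offset > 32767)
    return false;

  switch (form)
    {
    case FORM_D:
      return true;
    case FORM_DS:
      return (offset & 3) == 0;
    case FORM_DQ:
      return (offset & 15) == 0;
    }

  assert(!"unknown instruction form");
  return false;
}

/* With prefixed addressing available, an offset that the traditional form
   cannot encode is handled by the prefixed version of the instruction.  */
bool needs_prefixed_insn(int64_t offset, enum insn_form form)
{
  return !offset_valid_for_form(offset, form);
}
```

Every function in the test uses an offset of 1, so the D-form accesses (LBZ,
LHZ, LWZ, LFS, LFD and the matching stores) stay traditional, while the DS-form
and DQ-form accesses become PLWA, PLD, PSTD, PLXV and PSTXV.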

2019-12-11  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/prefix-ds-dq.c: New test to verify that we
	generate the prefix load/store instructions for traditional
	instructions with an offset that doesn't match DS/DQ
	requirements.

Index: gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c	(revision 279256)
+++ gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c	(working copy)
@@ -0,0 +1,156 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests whether we generate a prefixed load/store operation for addresses that
+   don't meet DS/DQ offset constraints.  */
+
+unsigned long
+load_uc_offset1 (unsigned char *p)
+{
+  return p[1];				/* should generate LBZ.  */
+}
+
+long
+load_sc_offset1 (signed char *p)
+{
+  return p[1];				/* should generate LBZ + EXTSB.  */
+}
+
+unsigned long
+load_us_offset1 (unsigned char *p)
+{
+  return *(unsigned short *)(p + 1);	/* should generate LHZ.  */
+}
+
+long
+load_ss_offset1 (unsigned char *p)
+{
+  return *(short *)(p + 1);		/* should generate LHA.  */
+}
+
+unsigned long
+load_ui_offset1 (unsigned char *p)
+{
+  return *(unsigned int *)(p + 1);	/* should generate LWZ.  */
+}
+
+long
+load_si_offset1 (unsigned char *p)
+{
+  return *(int *)(p + 1);		/* should generate PLWA.  */
+}
+
+unsigned long
+load_ul_offset1 (unsigned char *p)
+{
+  return *(unsigned long *)(p + 1);	/* should generate PLD.  */
+}
+
+long
+load_sl_offset1 (unsigned char *p)
+{
+  return *(long *)(p + 1);		/* should generate PLD.  */
+}
+
+float
+load_float_offset1 (unsigned char *p)
+{
+  return *(float *)(p + 1);		/* should generate LFS.  */
+}
+
+double
+load_double_offset1 (unsigned char *p)
+{
+  return *(double *)(p + 1);		/* should generate LFD.  */
+}
+
+__float128
+load_float128_offset1 (unsigned char *p)
+{
+  return *(__float128 *)(p + 1);	/* should generate PLXV.  */
+}
+
+void
+store_uc_offset1 (unsigned char uc, unsigned char *p)
+{
+  p[1] = uc;				/* should generate STB.  */
+}
+
+void
+store_sc_offset1 (signed char sc, signed char *p)
+{
+  p[1] = sc;				/* should generate STB.  */
+}
+
+void
+store_us_offset1 (unsigned short us, unsigned char *p)
+{
+  *(unsigned short *)(p + 1) = us;	/* should generate STH.  */
+}
+
+void
+store_ss_offset1 (signed short ss, unsigned char *p)
+{
+  *(signed short *)(p + 1) = ss;	/* should generate STH.  */
+}
+
+void
+store_ui_offset1 (unsigned int ui, unsigned char *p)
+{
+  *(unsigned int *)(p + 1) = ui;	/* should generate STW.  */
+}
+
+void
+store_si_offset1 (signed int si, unsigned char *p)
+{
+  *(signed int *)(p + 1) = si;		/* should generate STW.  */
+}
+
+void
+store_ul_offset1 (unsigned long ul, unsigned char *p)
+{
+  *(unsigned long *)(p + 1) = ul;	/* should generate PSTD.  */
+}
+
+void
+store_sl_offset1 (signed long sl, unsigned char *p)
+{
+  *(signed long *)(p + 1) = sl;		/* should generate PSTD.  */
+}
+
+void
+store_float_offset1 (float f, unsigned char *p)
+{
+  *(float *)(p + 1) = f;		/* should generate STFS.  */
+}
+
+void
+store_double_offset1 (double d, unsigned char *p)
+{
+  *(double *)(p + 1) = d;		/* should generate STFD.  */
+}
+
+void
+store_float128_offset1 (__float128 f128, unsigned char *p)
+{
+  *(__float128 *)(p + 1) = f128;	/* should generate PSTXV.  */
+}
+
+/* { dg-final { scan-assembler-times {\mextsb\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mlbz\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mlfd\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlfs\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlha\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlhz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlwz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mpld\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mplwa\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mplxv\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstb\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstfd\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mstfs\M}  1 } } */
+/* { dg-final { scan-assembler-times {\msth\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstw\M}   2 } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V10 patch #12, Test to make sure we don't generate prefixed pre-modify load/stores for -mcpu=future
  2019-12-12  0:07 PowerPC V10 Patches for -mcpu=future Michael Meissner
                   ` (10 preceding siblings ...)
  2019-12-12  1:16 ` [PATCH] V10 patch #11, Add test for generating prefixed load/store when the offset is not valid for DS/DQ instructions Michael Meissner
@ 2019-12-12  1:18 ` Michael Meissner
  11 siblings, 0 replies; 23+ messages in thread
From: Michael Meissner @ 2019-12-12  1:18 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

Patch V10 #12 is a slight reworking of patch V8 #3 (making sure we don't try to
generate the non-existent PLWZU and PSTWU pre-modify instructions).

This test passes when I run make check.  Can I check this in when patch V10 #9
is installed?

2019-12-11  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/prefix-no-premodify.c: Make sure we do not
	generate the non-existent PLWZU instruction if -mcpu=future.

Index: gcc/testsuite/gcc.target/powerpc/prefix-no-premodify.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-no-premodify.c	(revision 279259)
+++ gcc/testsuite/gcc.target/powerpc/prefix-no-premodify.c	(working copy)
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Make sure that we don't generate a prefixed form of the load and store with
+   update instructions (i.e. instead of generating LWZU we have to generate
+   PLWZ plus a PADDI).  */
+
+#ifndef SIZE
+#define SIZE 50000
+#endif
+
+struct foo {
+  unsigned int field;
+  char pad[SIZE];
+};
+
+struct foo *inc_load (struct foo *p, unsigned int *q)
+{
+  *q = (++p)->field;	/* PLWZ, PADDI, STW.  */
+  return p;
+}
+
+struct foo *dec_load (struct foo *p, unsigned int *q)
+{
+  *q = (--p)->field;	/* PLWZ, PADDI, STW.  */
+  return p;
+}
+
+struct foo *inc_store (struct foo *p, unsigned int *q)
+{
+  (++p)->field = *q;	/* LWZ, PADDI, PSTW.  */
+  return p;
+}
+
+struct foo *dec_store (struct foo *p, unsigned int *q)
+{
+  (--p)->field = *q;	/* LWZ, PADDI, PSTW.  */
+  return p;
+}
+
+/* { dg-final { scan-assembler-times {\mlwz\M}    2 } } */
+/* { dg-final { scan-assembler-times {\mstw\M}    2 } } */
+/* { dg-final { scan-assembler-times {\mpaddi\M}  4 } } */
+/* { dg-final { scan-assembler-times {\mplwz\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}   2 } } */
+/* { dg-final { scan-assembler-not   {\mplwzu\M}    } } */
+/* { dg-final { scan-assembler-not   {\mpstwu\M}    } } */
+/* { dg-final { scan-assembler-not   {\maddis\M}    } } */
+/* { dg-final { scan-assembler-not   {\maddi\M}     } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH] V10 patch #1, Use PLI to load up large DImode constants if -mcpu=future
  2019-12-12  0:12 ` [PATCH] V10 patch #1, Use PLI to load up large DImode constants if -mcpu=future Michael Meissner
@ 2019-12-17 15:58   ` Segher Boessenkool
  0 siblings, 0 replies; 23+ messages in thread
From: Segher Boessenkool @ 2019-12-17 15:58 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn

Hi!

On Wed, Dec 11, 2019 at 07:12:23PM -0500, Michael Meissner wrote:
> --- gcc/config/rs6000/rs6000.c	(revision 279141)
> +++ gcc/config/rs6000/rs6000.c	(working copy)
> @@ -5541,6 +5541,10 @@ num_insns_constant_gpr (HOST_WIDE_INT va
>  	   && (value >> 31 == -1 || value >> 31 == 0))
>      return 1;
>  
> +  /* PADDI can support up to 34 bit signed integers.  */
> +  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
> +    return 1;

Please follow up with a patch to not call random numbers "OFFSET".
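
For reference, the range in question can be written down as a plain value
check with no notion of "offset" at all.  This is only a standalone sketch:
the name fits_signed_34bit is hypothetical and is not the macro used in
rs6000.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not the GCC macro: return true if VALUE fits in a
   34-bit signed immediate, i.e. -2^33 <= VALUE <= 2^33 - 1.  */
static bool
fits_signed_34bit (int64_t value)
{
  const int64_t limit = INT64_C (1) << 33;

  return value >= -limit && value < limit;
}
```

Under this reading, a value that passes the check can be materialized with a
single prefixed instruction, which is why the patch returns 1 for it.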

Okay for trunk.  Thanks!


Segher

* Re: [PATCH] V10 patch #2, use PLI to load up large SImode constants if -mcpu=future
  2019-12-12  0:15 ` [PATCH] V10 patch #2, use PLI to load up large SImode " Michael Meissner
@ 2019-12-17 16:03   ` Segher Boessenkool
  0 siblings, 0 replies; 23+ messages in thread
From: Segher Boessenkool @ 2019-12-17 16:03 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn

Hi!

On Wed, Dec 11, 2019 at 07:15:15PM -0500, Michael Meissner wrote:
> This patch adds an alternative to use PLI to load up large SImode constants if
> -mcpu=future is used.

> 
> 	* config/rs6000/rs6000.md (movsi_internal1): Add alternative to
> 	use PLI to load up 34-bit constants if -mcpu=future.

This is okay for trunk.  Thanks!


Segher

* Re: [PATCH] V10 patch #3, Use PADDI to add large constants if -mcpu=future is used
  2019-12-12  0:17 ` [PATCH] V10 patch #3, Use PADDI to add large constants if -mcpu=future is used Michael Meissner
@ 2019-12-17 16:27   ` Segher Boessenkool
  0 siblings, 0 replies; 23+ messages in thread
From: Segher Boessenkool @ 2019-12-17 16:27 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn

On Wed, Dec 11, 2019 at 07:17:02PM -0500, Michael Meissner wrote:
> This patch adds an alternative to use PADDI to add large SImode and DImode
> constants if -mcpu=future is used.

> 2019-12-09  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/predicates.md (add_operand): Allow eI constants.
> 	* config/rs6000/rs6000.md (add<mode>3): Add alternative to
> 	generate PADDI for 34-bit constants if -mcpu=future.

This is fine.  Okay for trunk.  Thanks!


Segher

* Re: [PATCH] V10 patch #4, Add new prefixed/non-prefixed memory constraints
  2019-12-12  0:29 ` [PATCH] V10 patch #4, Add new prefixed/non-prefixed memory constraints Michael Meissner
@ 2019-12-17 17:27   ` Segher Boessenkool
  2019-12-17 22:34     ` Michael Meissner
  0 siblings, 1 reply; 23+ messages in thread
From: Segher Boessenkool @ 2019-12-17 17:27 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn

Hi!

On Wed, Dec 11, 2019 at 07:29:05PM -0500, Michael Meissner wrote:
> +(define_memory_constraint "em"
> +  "A memory operand that does not contain a prefixed address."
> +  (and (match_code "mem")
> +       (match_operand 0 "non_prefixed_memory")))
> +
> +(define_memory_constraint "ep"
> +  "A memory operand that does contains a prefixed address."
> +  (and (match_code "mem")
> +       (match_operand 0 "prefixed_memory")))

"does contain".  Or maybe just say "with a non-prefixed address" and
"with a prefixed address"?

> +;; Return true if the operand is a valid memory address that does not use a
> +;; prefixed address.
> +(define_predicate "non_prefixed_memory"
> +  (match_code "mem")
> +{
> +  enum insn_form iform
> +    = address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
> +
> +  return (iform != INSN_FORM_BAD
> +          && iform != INSN_FORM_PREFIXED_NUMERIC
> +	  && iform != INSN_FORM_PCREL_LOCAL
> +	  && iform != INSN_FORM_PCREL_EXTERNAL);
> +})

Why can this not use just !address_is_prefixed?  Why is an
INSN_FORM_PCREL_EXTERNAL address neither prefixed nor non-prefixed?  What
does "BAD" mean, really?  Should that ever happen, should that not ICE?

It is very confusing if any valid memory is neither "prefixed_memory" nor
"non_prefixed_memory"!

> --- gcc/doc/md.texi	(revision 279182)
> +++ gcc/doc/md.texi	(working copy)
> @@ -3373,6 +3373,12 @@ asm ("st %1,%0" : "=m<>" (mem) : "r" (va
>  
>  is not.
>  
> +@item em
> +A memory operand that does not contain a prefixed address.
> +
> +@item ep
> +A memory operand that does contains a prefixed address.

Same comments as above.


Segher

* Re: [PATCH] V10 patch #5, Fix codegen bug with vector extracts using a variable offset & PC-relative address
  2019-12-12  0:48 ` [PATCH] V10 patch #5, Fix codegen bug with vector extracts using a variable offset & PC-relative address Michael Meissner
@ 2019-12-17 18:02   ` Segher Boessenkool
  2019-12-18 22:15     ` Michael Meissner
  0 siblings, 1 reply; 23+ messages in thread
From: Segher Boessenkool @ 2019-12-17 18:02 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn

Hi!

On Wed, Dec 11, 2019 at 07:48:39PM -0500, Michael Meissner wrote:
> This patch fixes a bug with vector extracts using a PC-relative address and a
> variable offset when using -mcpu=future.
> 
> Consider the code:
> 
> 	#include <altivec.h>
> 
> 	static vector double vd;
> 	vector double *p = &vd;
> 
> 	double get (unsigned int n)
> 	{
> 	  return vec_extract (vd, n);
> 	}
> 
> If you compile this code with -O2 -mcpu=future -mpcrel you get:
> 
> 	get:
> 	        pla 9,.LANCHOR0@pcrel
> 	        lfdx 1,9,9
> 	        blr
> 
> This is because there is only one base register temporary, and the current code
> tries to first create the offset and then use the same temporary to hold the
> address of the PC-relative value.
> 
> After combine the insn is:
> 
> (insn 14 9 15 2 (parallel [
>             (set (reg/i:DF 33 1)
>                 (unspec:DF [
>                         (mem/c:V2DF (symbol_ref:DI ("*.LANCHOR0") [flags 0x182]) [1 vd+0 S16 A128])
>                         (reg:DI 123 [ n ])
>                     ] UNSPEC_VSX_EXTRACT))
>             (clobber (scratch:DI))
>             (clobber (scratch:V2DI))
>         ]) "foo.c":9:1 1314 {vsx_extract_v2df_var}

(After postreload as well, more to the point -- well, it has hard regs
there, of course).

> Split2 changes this to:

The vsx_extract_<mode>_var splitter does, yeah.

> (insn 20 8 21 2 (set (reg:DI 3 3 [orig:123 n ] [123])
>         (and:DI (reg:DI 3 3 [orig:123 n ] [123])
>             (const_int 1 [0x1]))) "foo.c":9:1 193 {anddi3_mask}
>      (nil))
> (insn 21 20 22 2 (set (reg:DI 9 9 [126])
>         (ashift:DI (reg:DI 3 3 [orig:123 n ] [123])
>             (const_int 3 [0x3]))) "foo.c":9:1 256 {ashldi3}
>      (nil))

These two are just  rlwinm 3,3,3,8  together, btw.  A good example why
splitters after reload are not great.
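
The arithmetic behind that remark can be checked in isolation: masking the
element number to one bit and then scaling by 8 gives the same result as
shifting first and keeping only the bit with value 8.  A small sketch, with
illustrative function names that do not appear in the patch:

```c
#include <stdint.h>

/* Two-step form, mirroring insns 20 and 21: mask the element number to
   0 or 1, then scale it to a byte offset of 0 or 8.  */
static uint64_t
offset_two_step (uint64_t n)
{
  return (n & 1) << 3;
}

/* Single-step form: shift left by 3 and keep only the bit with value 8,
   which is what one rotate-and-mask instruction can do.  */
static uint64_t
offset_one_step (uint64_t n)
{
  return (n << 3) & 8;
}
```

Both functions return 0 for even element numbers and 8 for odd ones.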

>  ;; Variable V2DI/V2DF extract
>  (define_insn_and_split "vsx_extract_<mode>_var"
> -  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r")
> -	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,m,m")
> -			     (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
> -			    UNSPEC_VSX_EXTRACT))
> -   (clobber (match_scratch:DI 3 "=r,&b,&b"))
> -   (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
> +  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r,wa,r")
> +	(unspec:<VS_scalar>
> +	 [(match_operand:VSX_D 1 "input_operand" "v,em,em,ep,ep")
> +	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r,r,r")]
> +	 UNSPEC_VSX_EXTRACT))
> +   (clobber (match_scratch:DI 3 "=r,&b,&b,&b,&b"))
> +   (clobber (match_scratch:V2DI 4 "=&v,X,X,X,X"))
> +   (clobber (match_scratch:DI 5 "=X,X,X,&b,&b"))]
>    "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_DIRECT_MOVE_64BIT"
>    "#"
>    "&& reload_completed"
>    [(const_int 0)]
>  {
>    rs6000_split_vec_extract_var (operands[0], operands[1], operands[2],
> -				operands[3], operands[4]);
> +				operands[3], operands[4], operands[5]);

This writes to operands[2], which does not match its constraint.

Same in the other splitters.


Segher

* Re: [PATCH] V10 patch #4, Add new prefixed/non-prefixed memory constraints
  2019-12-17 17:27   ` Segher Boessenkool
@ 2019-12-17 22:34     ` Michael Meissner
  2019-12-17 23:55       ` Segher Boessenkool
  0 siblings, 1 reply; 23+ messages in thread
From: Michael Meissner @ 2019-12-17 22:34 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, David Edelsohn

On Tue, Dec 17, 2019 at 11:15:29AM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Dec 11, 2019 at 07:29:05PM -0500, Michael Meissner wrote:
> > +(define_memory_constraint "em"
> > +  "A memory operand that does not contain a prefixed address."
> > +  (and (match_code "mem")
> > +       (match_operand 0 "non_prefixed_memory")))
> > +
> > +(define_memory_constraint "ep"
> > +  "A memory operand that does contains a prefixed address."
> > +  (and (match_code "mem")
> > +       (match_operand 0 "prefixed_memory")))
> 
> "does contain".  Or maybe just say "with a non-prefixed address" and
> "with a prefixed address"?

Ok.

> > +;; Return true if the operand is a valid memory address that does not use a
> > +;; prefixed address.
> > +(define_predicate "non_prefixed_memory"
> > +  (match_code "mem")
> > +{
> > +  enum insn_form iform
> > +    = address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
> > +
> > +  return (iform != INSN_FORM_BAD
> > +          && iform != INSN_FORM_PREFIXED_NUMERIC
> > +	  && iform != INSN_FORM_PCREL_LOCAL
> > +	  && iform != INSN_FORM_PCREL_EXTERNAL);
> > +})
> 
> Why can this not use just !address_is_prefixed?  Why is an
> INSN_FORM_PCREL_EXTERNAL address neither prefixed nor non-prefixed?  What
> does "BAD" mean, really?  Should that ever happen, should that not ICE?

You can't just invert !address_is_prefixed, because it would allow things that
may not be valid memory addresses.

So we could just do:

{
  /* If the operand is not a valid memory operand even if it is not prefixed,
     do not return true.  */
  if (!memory_operand (op, mode))
    return false;

  return !address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
}

It is important that the predicate not return true if the operand is NOT a
valid memory address.  If you allow non-valid memory addresses, the register
allocator will create things like:

	(mem:MODE (plus:DI (reg:DI x)
			   (plus:DI (reg:DI y)
				    (const_int z))))

Or some such -- I forget the exact sequence it created.  A later pass would
then choke with bad insn.

INSN_FORM_BAD just means that the operand is not valid as a memory address.

> It is very confusing if any valid memory is neither "prefixed_memory" nor
> "non_prefixed_memory"!

The point was to make sure the memory is valid.  Once it is a valid memory
address, then just a simple !address_is_prefixed will work.

> > --- gcc/doc/md.texi	(revision 279182)
> > +++ gcc/doc/md.texi	(working copy)
> > @@ -3373,6 +3373,12 @@ asm ("st %1,%0" : "=m<>" (mem) : "r" (va
> >  
> >  is not.
> >  
> > +@item em
> > +A memory operand that does not contain a prefixed address.
> > +
> > +@item ep
> > +A memory operand that does contains a prefixed address.
> 
> Same comments as above.

Ok.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH] V10 patch #4, Add new prefixed/non-prefixed memory constraints
  2019-12-17 22:34     ` Michael Meissner
@ 2019-12-17 23:55       ` Segher Boessenkool
  2019-12-18  1:06         ` Michael Meissner
  0 siblings, 1 reply; 23+ messages in thread
From: Segher Boessenkool @ 2019-12-17 23:55 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn

On Tue, Dec 17, 2019 at 05:29:44PM -0500, Michael Meissner wrote:
> On Tue, Dec 17, 2019 at 11:15:29AM -0600, Segher Boessenkool wrote:
> > > +;; Return true if the operand is a valid memory address that does not use a
> > > +;; prefixed address.
> > > +(define_predicate "non_prefixed_memory"
> > > +  (match_code "mem")
> > > +{
> > > +  enum insn_form iform
> > > +    = address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
> > > +
> > > +  return (iform != INSN_FORM_BAD
> > > +          && iform != INSN_FORM_PREFIXED_NUMERIC
> > > +	  && iform != INSN_FORM_PCREL_LOCAL
> > > +	  && iform != INSN_FORM_PCREL_EXTERNAL);
> > > +})
> > 
> > Why can this not use just !address_is_prefixed?  Why is an
> > INSN_FORM_PCREL_EXTERNAL address neither prefixed nor non-prefixed?  What
> > does "BAD" mean, really?  Should that ever happen, should that not ICE?
> 
> You can't just invert !address_is_prefixed, because it would allow things that
> may not be valid memory addresses.

Yes, so test that *explicitly*, in the "prefixed_memory" predicate as
well please.  Make the two predicates as much the same as possible.

And what is with the INSN_FORM_PCREL_EXTERNAL?


Segher

* Re: [PATCH] V10 patch #4, Add new prefixed/non-prefixed memory constraints
  2019-12-17 23:55       ` Segher Boessenkool
@ 2019-12-18  1:06         ` Michael Meissner
  2019-12-18 13:59           ` Segher Boessenkool
  0 siblings, 1 reply; 23+ messages in thread
From: Michael Meissner @ 2019-12-18  1:06 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, David Edelsohn

On Tue, Dec 17, 2019 at 05:35:24PM -0600, Segher Boessenkool wrote:
> On Tue, Dec 17, 2019 at 05:29:44PM -0500, Michael Meissner wrote:
> > On Tue, Dec 17, 2019 at 11:15:29AM -0600, Segher Boessenkool wrote:
> > > > +;; Return true if the operand is a valid memory address that does not use a
> > > > +;; prefixed address.
> > > > +(define_predicate "non_prefixed_memory"
> > > > +  (match_code "mem")
> > > > +{
> > > > +  enum insn_form iform
> > > > +    = address_to_insn_form (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
> > > > +
> > > > +  return (iform != INSN_FORM_BAD
> > > > +          && iform != INSN_FORM_PREFIXED_NUMERIC
> > > > +	  && iform != INSN_FORM_PCREL_LOCAL
> > > > +	  && iform != INSN_FORM_PCREL_EXTERNAL);
> > > > +})
> > > 
> > > Why can this not use just !address_is_prefixed?  Why is an
> > > INSN_FORM_PCREL_EXTERNAL address neither prefixed nor non-prefixed?  What
> > > does "BAD" mean, really?  Should that ever happen, should that not ICE?
> > 
> > You can't just invert !address_is_prefixed, because it would allow things that
> > may not be valid memory addresses.
> 
> Yes, so test that *explicitly*, in the "prefixed_memory" predicate as
> well please.  Make the two predicates as much the same as possible.
> 
> And what is with the INSN_FORM_PCREL_EXTERNAL?

INSN_FORM_PCREL_EXTERNAL says that the operand is a reference to an external
symbol.  It cannot appear in an actual memory insn in normal usage, but it
needs to be handled in several places:

1) pcrel_extern_addr needs to be able to load an external address into a GPR
register.

2) The prefixed insn attribute (and prefixed_paddi_p which it calls) needs to
recognize pcrel_extern_addr and note that it is prefixed.

3) The PCREL_OPT support will need to support it.  If you do the PCREL_OPT
support via combine and flow control passes, you will need to be able to handle
external references as addresses.

The function address_is_prefixed specifically does not return true for
external symbols, because you can't use them in a normal context.

In the context of the patch (vector extract), it needs to decide whether the
address is prefixed or not, in order to decide whether it needs a second base
register temporary.
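
As a rough illustration of why the second temporary matters, here is a toy
model of the effective-address computation.  It is plain C, not compiler
code, and the function names are made up for this sketch.

```c
#include <stdint.h>

/* Toy model of the failing sequence: a single temporary is asked to hold
   both the scaled element offset and the PC-relative address.  */
static uint64_t
ea_one_temp (uint64_t addr, uint64_t offset)
{
  uint64_t tmp = offset;	/* Scaled element offset.  */

  tmp = addr;			/* Loading the address overwrites the offset.  */
  return tmp + tmp;		/* Like "lfdx 1,9,9": base == index.  */
}

/* Toy model of the fixed sequence: the address gets its own temporary.  */
static uint64_t
ea_two_temps (uint64_t addr, uint64_t offset)
{
  uint64_t tmp = offset;	/* Scaled element offset.  */
  uint64_t tmp2 = addr;		/* Separate base register temporary.  */

  return tmp2 + tmp;
}
```

With addr = 0x1000 and offset = 8, the one-temporary model yields 0x2000
instead of the intended 0x1008.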

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH] V10 patch #4, Add new prefixed/non-prefixed memory constraints
  2019-12-18  1:06         ` Michael Meissner
@ 2019-12-18 13:59           ` Segher Boessenkool
  0 siblings, 0 replies; 23+ messages in thread
From: Segher Boessenkool @ 2019-12-18 13:59 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn

Hi!

On Tue, Dec 17, 2019 at 07:38:51PM -0500, Michael Meissner wrote:
> On Tue, Dec 17, 2019 at 05:35:24PM -0600, Segher Boessenkool wrote:
> > And what is with the INSN_FORM_PCREL_EXTERNAL?
> 
> INSN_FORM_PCREL_EXTERNAL says that the operand is a reference to an external
> symbol.  It cannot appear in an actual memory insn in normal usage, but it
> needs to be handled in several places:

Sure.  Both prefixed_memory and non_prefixed_memory should test something
like memory_operand, not just whether it is a MEM.

But *both* of them, that's the point, and using some more generic hook.
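
One way to picture the symmetric shape being asked for is a small standalone
model in which both predicates share the same validity test and differ only
in the prefixed check.  Everything here is hypothetical: the enum, the
function names, and in particular the assumption that an external
PC-relative reference is treated as not being a valid memory operand.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the address classifications in the thread.  */
enum addr_form
{
  FORM_BAD,
  FORM_NON_PREFIXED,
  FORM_PREFIXED_NUMERIC,
  FORM_PCREL_LOCAL,
  FORM_PCREL_EXTERNAL
};

/* Assumption for this sketch: external references cannot be used directly
   as memory operands, so they are rejected along with bad addresses.  */
static bool
valid_memory_p (enum addr_form form)
{
  return form != FORM_BAD && form != FORM_PCREL_EXTERNAL;
}

/* Models a check that is true only for addresses needing a prefixed insn.  */
static bool
address_prefixed_p (enum addr_form form)
{
  return form == FORM_PREFIXED_NUMERIC || form == FORM_PCREL_LOCAL;
}

/* Both predicates test validity first, then differ only in polarity.  */
static bool
prefixed_memory_p (enum addr_form form)
{
  return valid_memory_p (form) && address_prefixed_p (form);
}

static bool
non_prefixed_memory_p (enum addr_form form)
{
  return valid_memory_p (form) && !address_prefixed_p (form);
}
```

In this model every valid memory operand satisfies exactly one of the two
predicates, and an invalid one satisfies neither.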


Segher

* Re: [PATCH] V10 patch #5, Fix codegen bug with vector extracts using a variable offset & PC-relative address
  2019-12-17 18:02   ` Segher Boessenkool
@ 2019-12-18 22:15     ` Michael Meissner
  0 siblings, 0 replies; 23+ messages in thread
From: Michael Meissner @ 2019-12-18 22:15 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, David Edelsohn

On Tue, Dec 17, 2019 at 12:02:46PM -0600, Segher Boessenkool wrote:
> >  ;; Variable V2DI/V2DF extract
> >  (define_insn_and_split "vsx_extract_<mode>_var"
> > -  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r")
> > -	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,m,m")
> > -			     (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
> > -			    UNSPEC_VSX_EXTRACT))
> > -   (clobber (match_scratch:DI 3 "=r,&b,&b"))
> > -   (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
> > +  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r,wa,r")
> > +	(unspec:<VS_scalar>
> > +	 [(match_operand:VSX_D 1 "input_operand" "v,em,em,ep,ep")
> > +	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r,r,r")]
> > +	 UNSPEC_VSX_EXTRACT))
> > +   (clobber (match_scratch:DI 3 "=r,&b,&b,&b,&b"))
> > +   (clobber (match_scratch:V2DI 4 "=&v,X,X,X,X"))
> > +   (clobber (match_scratch:DI 5 "=X,X,X,&b,&b"))]
> >    "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_DIRECT_MOVE_64BIT"
> >    "#"
> >    "&& reload_completed"
> >    [(const_int 0)]
> >  {
> >    rs6000_split_vec_extract_var (operands[0], operands[1], operands[2],
> > -				operands[3], operands[4]);
> > +				operands[3], operands[4], operands[5]);
> 
> This writes to operands[2], which does not match its constraint.
> 
> Same in the other splitters.

Right.  Good catch.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

end of thread, other threads:[~2019-12-18 21:47 UTC | newest]

Thread overview: 23+ messages
2019-12-12  0:07 PowerPC V10 Patches for -mcpu=future Michael Meissner
2019-12-12  0:12 ` [PATCH] V10 patch #1, Use PLI to load up large DImode constants if -mcpu=future Michael Meissner
2019-12-17 15:58   ` Segher Boessenkool
2019-12-12  0:15 ` [PATCH] V10 patch #2, use PLI to load up large SImode " Michael Meissner
2019-12-17 16:03   ` Segher Boessenkool
2019-12-12  0:17 ` [PATCH] V10 patch #3, Use PADDI to add large constants if -mcpu=future is used Michael Meissner
2019-12-17 16:27   ` Segher Boessenkool
2019-12-12  0:29 ` [PATCH] V10 patch #4, Add new prefixed/non-prefixed memory constraints Michael Meissner
2019-12-17 17:27   ` Segher Boessenkool
2019-12-17 22:34     ` Michael Meissner
2019-12-17 23:55       ` Segher Boessenkool
2019-12-18  1:06         ` Michael Meissner
2019-12-18 13:59           ` Segher Boessenkool
2019-12-12  0:48 ` [PATCH] V10 patch #5, Fix codegen bug with vector extracts using a variable offset & PC-relative address Michael Meissner
2019-12-17 18:02   ` Segher Boessenkool
2019-12-18 22:15     ` Michael Meissner
2019-12-12  0:54 ` [PATCH] V10 patch #6, Use prefixed load/stores for vector extract with large offsets Michael Meissner
2019-12-12  0:58 ` [PATCH] V10 patch #7, Improve vector_extract code of a PC-relative address with a constant offset for -mcpu=future Michael Meissner
2019-12-12  1:06 ` [PATCH] V10 patch #8, Enable -mpcrel and -mprefixed-addr for -mcpu=future on 64-bit little endian Linux systems Michael Meissner
2019-12-12  1:12 ` [PATCH] V10 patch #9, Add new effective targets for the testsuite Michael Meissner
2019-12-12  1:13 ` [PATCH] V10 patch #10, Add PADDI/PLI tests for -mcpu=future Michael Meissner
2019-12-12  1:16 ` [PATCH] V10 patch #11, Add test for generating prefixed load/store when the offset is not valid for DS/DQ instructions Michael Meissner
2019-12-12  1:18 ` [PATCH] V10 patch #12, Test to make sure we don't generate prefixed pre-modify load/stores for -mcpu=future Michael Meissner