* [gcc(refs/users/meissner/heads/work160-test)] PR target/112886, Add %S<n> to print_operand for vector pair support. Power10: Add options to disab
@ 2024-02-27  7:13 Michael Meissner
From: Michael Meissner @ 2024-02-27  7:13 UTC
  To: gcc-cvs

https://gcc.gnu.org/g:0f24781ff05c7ea7f779fd007274c4b45d4f7c05

commit 0f24781ff05c7ea7f779fd007274c4b45d4f7c05
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Tue Feb 27 02:12:10 2024 -0500

    PR target/112886, Add %S<n> to print_operand for vector pair support.  Power10: Add options to disable load and store vector pair.
    
    In looking at support for load vector pair and store vector pair for the
    PowerPC in GCC, I noticed that we were missing a print_operand output modifier
    if you are dealing with vector pairs to print the 2nd register in the vector
    pair.
    
    If the instruction inside of the asm used the Altivec encoding, then we could
    use the %L<n> modifier:
    
            __vector_pair *p, *q, *r;
            // ...
            __asm__ ("vaddudm %0,%1,%2\n\tvaddudm %L0,%L1,%L2"
                     : "=v" (*p)
                     : "v" (*q), "v" (*r));
    
    Likewise, if we know the value to be in a traditional FPR register, %L<n> will
    work for instructions that use the VSX encoding:
    
            __vector_pair *p, *q, *r;
            // ...
            __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %L0,%L1,%L2"
                     : "=f" (*p)
                     : "f" (*q), "f" (*r));
    
    But if we have a value that is in a traditional Altivec register, and the
    instruction uses the VSX encoding, %L<n> will give a value between 0 and 31,
    when it should give a value between 32 and 63.
    
    This patch adds %S<n> that acts like %x<n>, except that it adds 1 to the
    register number.
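    
    The register arithmetic described above can be sketched as a small
    stand-alone C model.  This is only an illustration, not the rs6000.cc code:
    the constants FIRST_FPR, FIRST_ALTIVEC and LAST_ALTIVEC and the helper names
    are hypothetical, and they assume an internal numbering with the FPRs at
    32..63 and the Altivec registers at 64..95.

```c
#include <assert.h>

/* Hypothetical constants: assumed internal hard register numbering with
   FPRs at 32..63 and Altivec registers at 64..95.  */
enum { FIRST_FPR = 32, FIRST_ALTIVEC = 64, LAST_ALTIVEC = 95 };

static int
is_fpr (int regno)
{
  return regno >= FIRST_FPR && regno < FIRST_ALTIVEC;
}

/* Model of %x<n> (second == 0) and %S<n> (second != 0): return the VSX
   register number 0..63, optionally for the 2nd register of a pair.  */
static int
vsx_number (int regno, int second)
{
  int reg = regno + (second ? 1 : 0);
  assert (reg >= FIRST_FPR && reg <= LAST_ALTIVEC);
  return is_fpr (reg) ? reg - FIRST_FPR : reg - FIRST_ALTIVEC + 32;
}

/* Model of %L<n> on a register: return the number of the next register
   within its own register file, which is always 0..31.  */
static int
next_in_file (int regno)
{
  int reg = regno + 1;
  assert (reg >= FIRST_FPR && reg <= LAST_ALTIVEC);
  return is_fpr (reg) ? reg - FIRST_FPR : reg - FIRST_ALTIVEC;
}
```

    Under these assumptions, a pair living in vs42 (Altivec v10, modelled as
    register 74) gives 42 for %x, 43 for %S, and only 11 for %L, which is the
    mismatch the patch addresses.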
    
    This is version 2 of the patch.  The only difference is I made the test case
    simpler to read.
    
    I have tested this on power10 and power9 little endian systems and on a power9
    big endian system.  There were no regressions in the patch.  Can I apply it to
    the trunk?
    
    It would be nice if I could apply it to the open branches.  Can I backport it
    after a burn-in period?
    
    This is version 2 of the patch to add -mno-load-vector-pair and
    -mno-store-vector-pair undocumented tuning switches.
    
    The difference between the first version of the patch and this version is that
    I added explicit RTL isa attributes for when the compiler can generate the load
    vector pair and store vector pair instructions.  By having these attributes, the
    movoo insn has separate alternatives for when we generate the instruction and
    when we want to split the instruction into 2 separate vector loads or stores.
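    
    As a rough illustration of when a vector pair move is split, the sketch
    below restates the split condition from the movoo pattern as plain C.  The
    function name and its boolean parameters are hypothetical stand-ins for
    MEM_P on each operand and for the TARGET_LOAD_VECTOR_PAIR and
    TARGET_STORE_VECTOR_PAIR tests; it is a model, not compiler code.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true when a 256-bit move would be split into two 128-bit moves
   after reload.  At least one operand must be a register, as in the insn
   condition, so a memory-to-memory move is rejected up front.  */
static bool
movoo_needs_split (bool dest_is_mem, bool src_is_mem,
		   bool load_pair_enabled, bool store_pair_enabled)
{
  assert (!(dest_is_mem && src_is_mem));

  return ((dest_is_mem && !store_pair_enabled)	/* store, stxvp disabled  */
	  || (src_is_mem && !load_pair_enabled)	/* load, lxvp disabled  */
	  || (!dest_is_mem && !src_is_mem));	/* register to register  */
}
```

    In this model a load is kept as a single instruction only while load vector
    pair is enabled, a store only while store vector pair is enabled, and a
    register to register move is always split.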
    
    In the first version of the patch, I had previously provided built-in functions
    that would always generate load vector pair and store vector pair instructions
    even if these instructions are normally disabled.  I found these built-ins
    weren't specified like the other vector pair built-ins, and I didn't include
    documentation for the built-in functions.  If we want such built-in functions,
    we can add them as a separate patch later.
    
    In addition, since both versions of the patch add #pragma target and attribute
    support to change the results for individual functions, we can select on a
    function by function basis what the defaults for load/store vector pair are.
    
    The original text for the patch is:
    
    In working on some future patches that involve utilizing vector pair
    instructions, I wanted to be able to tune my program to enable or disable using
    the vector pair load or store operations while still keeping the other
    operations on the vector pair.
    
    This patch adds two undocumented tuning options.  The -mno-load-vector-pair
    option would tell GCC to generate two load vector instructions instead of a
    single load vector pair.  The -mno-store-vector-pair option would tell GCC to
    generate two store vector instructions instead of a single store vector pair.
    
    If either -mno-load-vector-pair or -mno-store-vector-pair is used, GCC will
    not generate either of the indexed lxvpx or stxvpx instructions (indexed
    addressing needs both options enabled).  The reason for this is to enable
    splitting the {,p}lxvp or {,p}stxvp instructions after reload without needing a
    scratch GPR register.
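    
    To see why offset addressing makes the split cheap, here is a minimal C
    model of the address arithmetic.  The struct and function names are
    hypothetical, and the model ignores which half goes to which register
    (that depends on endianness).  With a base register plus constant offset,
    the second 16-byte access is just offset + 16; with base plus index, the
    sum would first have to be formed in a spare GPR.

```c
#include <assert.h>
#include <stdbool.h>

/* Address of a 32-byte vector pair access: either base + offset or,
   when INDEXED is true, base + index (OFFSET is then unused).  */
struct pair_addr
{
  bool indexed;
  int base;	/* base GPR number  */
  int index;	/* index GPR number, only meaningful if INDEXED  */
  long offset;	/* constant displacement, only meaningful if !INDEXED  */
};

/* Address of one 16-byte vector access in base + offset form.  */
struct vec_addr
{
  int base;
  long offset;
};

/* Split A into two 16-byte accesses.  Returns false for an indexed
   address, since forming (base + index) + 16 would need a scratch GPR.  */
static bool
split_pair_addr (const struct pair_addr *a, struct vec_addr out[2])
{
  assert (a != 0 && out != 0);

  if (a->indexed)
    return false;

  out[0].base = a->base;
  out[0].offset = a->offset;
  out[1].base = a->base;
  out[1].offset = a->offset + 16;
  return true;
}
```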
    
    The default for -mcpu=power10 is that both load vector pair and store vector
    pair are enabled.
    
    I added code so that the user code can modify these settings using either a
    '#pragma GCC target' directive or __attribute__((__target__(...))) on the
    function declaration.
    
    I added tests for the switches, #pragma, and attribute options.
    
    I have built this on both little endian power10 systems and big endian power9
    systems doing the normal bootstrap and test.  There were no regressions in any
    of the tests, and the new tests passed.  Can I check this patch into the master
    branch?
    
    2024-02-27  Michael Meissner  <meissner@linux.ibm.com>
    
    gcc/
    
            PR target/112886
            * config/rs6000/rs6000.cc (print_operand): Add %S<n> output modifier.
            * doc/md.texi (Modifiers): Mention %S can be used like %x.
    
    gcc/testsuite/
    
            PR target/112886
            * gcc.target/powerpc/pr112886.c: New test.
    
    2024-02-27  Michael Meissner  <meissner@linux.ibm.com>
    
    gcc/
    
            * config/rs6000/mma.md (movoo): Add support for -mno-load-vector-pair and
            -mno-store-vector-pair.
            * config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add support for
            -mload-vector-pair and -mstore-vector-pair.
            (POWERPC_MASKS): Likewise.
            * config/rs6000/rs6000.cc (rs6000_setup_reg_addr_masks): Only allow
            indexed mode for OOmode if we are generating both load vector pair and
            store vector pair instructions.
            (rs6000_option_override_internal): Add support for -mno-load-vector-pair
            and -mno-store-vector-pair.
            (rs6000_opt_masks): Likewise.
            * config/rs6000/rs6000.md (isa attribute): Add lxvp and stxvp
            attributes.
            (enabled attribute): Likewise.
            * config/rs6000/rs6000.opt (-mload-vector-pair): New option.
            (-mstore-vector-pair): Likewise.
    
    gcc/testsuite/
    
            * gcc.target/powerpc/vector-pair-attribute.c: New test.
            * gcc.target/powerpc/vector-pair-pragma.c: New test.
            * gcc.target/powerpc/vector-pair-switch1.c: New test.
            * gcc.target/powerpc/vector-pair-switch2.c: New test.
            * gcc.target/powerpc/vector-pair-switch3.c: New test.
            * gcc.target/powerpc/vector-pair-switch4.c: New test.

Diff:
---
 gcc/ChangeLog.test                                 | 143 +++++++++++++++++++++
 gcc/config/rs6000/mma.md                           |  19 ++-
 gcc/config/rs6000/rs6000-cpus.def                  |   8 +-
 gcc/config/rs6000/rs6000.cc                        |  40 +++++-
 gcc/config/rs6000/rs6000.md                        |  10 +-
 gcc/config/rs6000/rs6000.opt                       |   8 ++
 gcc/doc/md.texi                                    |   5 +-
 gcc/testsuite/gcc.target/powerpc/pr112886.c        |  29 +++++
 .../gcc.target/powerpc/vector-pair-attribute.c     |  39 ++++++
 .../gcc.target/powerpc/vector-pair-pragma.c        |  55 ++++++++
 .../gcc.target/powerpc/vector-pair-switch1.c       |  16 +++
 .../gcc.target/powerpc/vector-pair-switch2.c       |  17 +++
 .../gcc.target/powerpc/vector-pair-switch3.c       |  17 +++
 .../gcc.target/powerpc/vector-pair-switch4.c       |  17 +++
 14 files changed, 407 insertions(+), 16 deletions(-)

diff --git a/gcc/ChangeLog.test b/gcc/ChangeLog.test
index 6e72d4a246d..c0fd494a53a 100644
--- a/gcc/ChangeLog.test
+++ b/gcc/ChangeLog.test
@@ -1,3 +1,146 @@
+==================== Branch work160-test, patch #5 from work160 branch ====================
+
+PR target/112886, Add %S<n> to print_operand for vector pair support.
+
+In looking at support for load vector pair and store vector pair for the
+PowerPC in GCC, I noticed that we were missing a print_operand output modifier
+if you are dealing with vector pairs to print the 2nd register in the vector
+pair.
+
+If the instruction inside of the asm used the Altivec encoding, then we could
+use the %L<n> modifier:
+
+	__vector_pair *p, *q, *r;
+	// ...
+	__asm__ ("vaddudm %0,%1,%2\n\tvaddudm %L0,%L1,%L2"
+		 : "=v" (*p)
+		 : "v" (*q), "v" (*r));
+
+Likewise, if we know the value to be in a traditional FPR register, %L<n> will
+work for instructions that use the VSX encoding:
+
+	__vector_pair *p, *q, *r;
+	// ...
+	__asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %L0,%L1,%L2"
+		 : "=f" (*p)
+		 : "f" (*q), "f" (*r));
+
+But if we have a value that is in a traditional Altivec register, and the
+instruction uses the VSX encoding, %L<n> will give a value between 0 and 31,
+when it should give a value between 32 and 63.
+
+This patch adds %S<n> that acts like %x<n>, except that it adds 1 to the
+register number.
+
+This is version 2 of the patch.  The only difference is I made the test case
+simpler to read.
+
+I have tested this on power10 and power9 little endian systems and on a power9
+big endian system.  There were no regressions in the patch.  Can I apply it to
+the trunk?
+
+It would be nice if I could apply it to the open branches.  Can I backport it
+after a burn-in period?
+
+2024-02-27  Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	PR target/112886
+	* config/rs6000/rs6000.cc (print_operand): Add %S<n> output modifier.
+	* doc/md.texi (Modifiers): Mention %S can be used like %x.
+
+gcc/testsuite/
+
+	PR target/112886
+	* gcc.target/powerpc/pr112886.c: New test.
+
+==================== Branch work160-test, patch #4 from work160 branch ====================
+
+Power10: Add options to disable load and store vector pair.
+
+This is version 2 of the patch to add -mno-load-vector-pair and
+-mno-store-vector-pair undocumented tuning switches.
+
+The difference between the first version of the patch and this version is that
+I added explicit RTL isa attributes for when the compiler can generate the load
+vector pair and store vector pair instructions.  By having these attributes, the
+movoo insn has separate alternatives for when we generate the instruction and
+when we want to split the instruction into 2 separate vector loads or stores.
+
+In the first version of the patch, I had previously provided built-in functions
+that would always generate load vector pair and store vector pair instructions
+even if these instructions are normally disabled.  I found these built-ins
+weren't specified like the other vector pair built-ins, and I didn't include
+documentation for the built-in functions.  If we want such built-in functions,
+we can add them as a separate patch later.
+
+In addition, since both versions of the patch add #pragma target and attribute
+support to change the results for individual functions, we can select on a
+function by function basis what the defaults for load/store vector pair are.
+
+The original text for the patch is:
+
+In working on some future patches that involve utilizing vector pair
+instructions, I wanted to be able to tune my program to enable or disable using
+the vector pair load or store operations while still keeping the other
+operations on the vector pair.
+
+This patch adds two undocumented tuning options.  The -mno-load-vector-pair
+option would tell GCC to generate two load vector instructions instead of a
+single load vector pair.  The -mno-store-vector-pair option would tell GCC to
+generate two store vector instructions instead of a single store vector pair.
+
+If either -mno-load-vector-pair or -mno-store-vector-pair is used, GCC will
+not generate either of the indexed lxvpx or stxvpx instructions (indexed
+addressing needs both options enabled).  The reason for this is to enable
+splitting the {,p}lxvp or {,p}stxvp instructions after reload without needing a
+scratch GPR register.
+
+The default for -mcpu=power10 is that both load vector pair and store vector
+pair are enabled.
+
+I added code so that the user code can modify these settings using either a
+'#pragma GCC target' directive or __attribute__((__target__(...))) on the
+function declaration.
+
+I added tests for the switches, #pragma, and attribute options.
+
+I have built this on both little endian power10 systems and big endian power9
+systems doing the normal bootstrap and test.  There were no regressions in any
+of the tests, and the new tests passed.  Can I check this patch into the master
+branch?
+
+2024-02-27  Michael Meissner  <meissner@linux.ibm.com>
+
+gcc/
+
+	* config/rs6000/mma.md (movoo): Add support for -mno-load-vector-pair and
+	-mno-store-vector-pair.
+	* config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add support for
+	-mload-vector-pair and -mstore-vector-pair.
+	(POWERPC_MASKS): Likewise.
+	* config/rs6000/rs6000.cc (rs6000_setup_reg_addr_masks): Only allow
+	indexed mode for OOmode if we are generating both load vector pair and
+	store vector pair instructions.
+	(rs6000_option_override_internal): Add support for -mno-load-vector-pair
+	and -mno-store-vector-pair.
+	(rs6000_opt_masks): Likewise.
+	* config/rs6000/rs6000.md (isa attribute): Add lxvp and stxvp
+	attributes.
+	(enabled attribute): Likewise.
+	* config/rs6000/rs6000.opt (-mload-vector-pair): New option.
+	(-mstore-vector-pair): Likewise.
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/vector-pair-attribute.c: New test.
+	* gcc.target/powerpc/vector-pair-pragma.c: New test.
+	* gcc.target/powerpc/vector-pair-switch1.c: New test.
+	* gcc.target/powerpc/vector-pair-switch2.c: New test.
+	* gcc.target/powerpc/vector-pair-switch3.c: New test.
+	* gcc.target/powerpc/vector-pair-switch4.c: New test.
+
 ==================== Branch work160-test, patch #3 from work160 branch ====================
 
 Use vector pair load/store for memcpy with -mcpu=future
diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 04e2d0066df..6a7d8a836db 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -292,27 +292,34 @@
     gcc_assert (false);
 })
 
+;; If the user used -mno-store-vector-pair or -mno-load-vector-pair, use an
+;; alternative that does not allow indexed addresses so we can split the load
+;; or store.
 (define_insn_and_split "*movoo"
-  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,ZwO,wa")
-	(match_operand:OO 1 "input_operand" "ZwO,wa,wa"))]
+  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,wa,ZwO,QwO,wa")
+	(match_operand:OO 1 "input_operand" "ZwO,QwO,wa,wa,wa"))]
   "TARGET_MMA
    && (gpc_reg_operand (operands[0], OOmode)
        || gpc_reg_operand (operands[1], OOmode))"
   "@
    lxvp%X1 %x0,%1
+   #
    stxvp%X0 %x1,%0
+   #
    #"
   "&& reload_completed
-   && (!MEM_P (operands[0]) && !MEM_P (operands[1]))"
+   && ((MEM_P (operands[0]) && !TARGET_STORE_VECTOR_PAIR)
+       || (MEM_P (operands[1]) && !TARGET_LOAD_VECTOR_PAIR)
+       || (!MEM_P (operands[0]) && !MEM_P (operands[1])))"
   [(const_int 0)]
 {
   rs6000_split_multireg_move (operands[0], operands[1]);
   DONE;
 }
-  [(set_attr "type" "vecload,vecstore,veclogical")
+  [(set_attr "type" "vecload,vecload,vecstore,vecstore,veclogical")
    (set_attr "size" "256")
-   (set_attr "length" "*,*,8")])
-
+   (set_attr "length" "*,8,*,8,8")
+   (set_attr "isa" "lxvp,*,stxvp,*,*")])
 \f
 ;; Vector quad support.  XOmode can only live in FPRs.
 (define_expand "movxo"
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index 8da1d560e49..7740206a3f7 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -77,10 +77,12 @@
 /* Flags that need to be turned off if -mno-power10.  */
 /* We comment out PCREL_OPT here to disable it by default because SPEC2017
    performance was degraded by it.  */
-#define OTHER_POWER10_MASKS	(OPTION_MASK_MMA			\
+#define OTHER_POWER10_MASKS	(OPTION_MASK_LOAD_VECTOR_PAIR		\
+				 | OPTION_MASK_MMA			\
 				 | OPTION_MASK_PCREL			\
 				 /* | OPTION_MASK_PCREL_OPT */		\
-				 | OPTION_MASK_PREFIXED)
+				 | OPTION_MASK_PREFIXED			\
+				 | OPTION_MASK_STORE_VECTOR_PAIR)
 
 #define ISA_3_1_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
 				 | OPTION_MASK_POWER10			\
@@ -134,6 +136,7 @@
 				 | OPTION_MASK_FLOAT128_HW		\
 				 | OPTION_MASK_FLOAT128_KEYWORD		\
 				 | OPTION_MASK_FPRND			\
+				 | OPTION_MASK_LOAD_VECTOR_PAIR		\
 				 | OPTION_MASK_POWER10			\
 				 | OPTION_MASK_POWER11			\
 				 | OPTION_MASK_P10_FUSION		\
@@ -162,6 +165,7 @@
 				 | OPTION_MASK_QUAD_MEMORY_ATOMIC	\
 				 | OPTION_MASK_RECIP_PRECISION		\
 				 | OPTION_MASK_SOFT_FLOAT		\
+				 | OPTION_MASK_STORE_VECTOR_PAIR	\
 				 | OPTION_MASK_STRICT_ALIGN_OPTIONAL	\
 				 | OPTION_MASK_VSX)
 
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index e3ca4e799e4..a600169a4c5 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -2722,7 +2722,9 @@ rs6000_setup_reg_addr_masks (void)
 	  /* Vector pairs can do both indexed and offset loads if the
 	     instructions are enabled, otherwise they can only do offset loads
 	     since it will be broken into two vector moves.  Vector quads can
-	     only do offset loads.  */
+	     only do offset loads.  If the user restricted generation of either
+	     of the LXVP or STXVP instructions, do not allow indexed mode so
+	     that we can split the load/store.  */
 	  else if ((addr_mask != 0) && TARGET_MMA
 		   && (m2 == OOmode || m2 == XOmode))
 	    {
@@ -2730,7 +2732,9 @@ rs6000_setup_reg_addr_masks (void)
 	      if (rc == RELOAD_REG_FPR || rc == RELOAD_REG_VMX)
 		{
 		  addr_mask |= RELOAD_REG_QUAD_OFFSET;
-		  if (m2 == OOmode)
+		  if (m2 == OOmode
+		      && TARGET_LOAD_VECTOR_PAIR
+		      && TARGET_STORE_VECTOR_PAIR)
 		    addr_mask |= RELOAD_REG_INDEXED;
 		}
 	    }
@@ -4375,6 +4379,26 @@ rs6000_option_override_internal (bool global_init_p)
       rs6000_isa_flags &= ~OPTION_MASK_MMA;
     }
 
+  /* Warn if -mload-vector-pair or -mstore-vector-pair are used and MMA is
+     not set.  */
+  if (!TARGET_MMA && TARGET_LOAD_VECTOR_PAIR)
+    {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_LOAD_VECTOR_PAIR) != 0)
+	warning (0, "%qs should not be used unless you use %qs",
+		 "-mload-vector-pair", "-mmma");
+
+      rs6000_isa_flags &= ~OPTION_MASK_LOAD_VECTOR_PAIR;
+    }
+
+  if (!TARGET_MMA && TARGET_STORE_VECTOR_PAIR)
+    {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_STORE_VECTOR_PAIR) != 0)
+	warning (0, "%qs should not be used unless you use %qs",
+		 "-mstore-vector-pair", "-mmma");
+
+      rs6000_isa_flags &= ~OPTION_MASK_STORE_VECTOR_PAIR;
+    }
+
   /* Enable power10 fusion if we are tuning for power10, even if we aren't
      generating power10 instructions.  */
   if (!(rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION))
@@ -14444,13 +14468,17 @@ print_operand (FILE *file, rtx x, int code)
 	print_operand (file, x, 0);
       return;
 
+    case 'S':
     case 'x':
-      /* X is a FPR or Altivec register used in a VSX context.  */
+      /* X is a FPR or Altivec register used in a VSX context.  %x<n> prints
+	 the VSX register number, %S<n> prints the 2nd register number for
+	 vector pair, decimal 128-bit floating and IBM 128-bit binary floating
+	 values.  */
       if (!REG_P (x) || !VSX_REGNO_P (REGNO (x)))
-	output_operand_lossage ("invalid %%x value");
+	output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : 'x'));
       else
 	{
-	  int reg = REGNO (x);
+	  int reg = REGNO (x) + (code == 'S' ? 1 : 0);
 	  int vsx_reg = (FP_REGNO_P (reg)
 			 ? reg - 32
 			 : reg - FIRST_ALTIVEC_REGNO + 32);
@@ -24445,6 +24473,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
   { "hard-dfp",			OPTION_MASK_DFP,		false, true  },
   { "htm",			OPTION_MASK_HTM,		false, true  },
   { "isel",			OPTION_MASK_ISEL,		false, true  },
+  { "load-vector-pair",		OPTION_MASK_LOAD_VECTOR_PAIR,	false, true  },
   { "mfcrf",			OPTION_MASK_MFCRF,		false, true  },
   { "mfpgpr",			0,				false, true  },
   { "mma",			OPTION_MASK_MMA,		false, true  },
@@ -24470,6 +24499,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
   { "quad-memory-atomic",	OPTION_MASK_QUAD_MEMORY_ATOMIC,	false, true  },
   { "recip-precision",		OPTION_MASK_RECIP_PRECISION,	false, true  },
   { "save-toc-indirect",	OPTION_MASK_SAVE_TOC_INDIRECT,	false, true  },
+  { "store-vector-pair",	OPTION_MASK_STORE_VECTOR_PAIR,	false, true  },
   { "string",			0,				false, true  },
   { "update",			OPTION_MASK_NO_UPDATE,		true , true  },
   { "vsx",			OPTION_MASK_VSX,		false, true  },
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index bc8bc6ab060..4acb4031ae0 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -355,7 +355,7 @@
   (const (symbol_ref "(enum attr_cpu) rs6000_tune")))
 
 ;; The ISA we implement.
-(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10"
+(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10,lxvp,stxvp"
   (const_string "any"))
 
 ;; Is this alternative enabled for the current CPU/ISA/etc.?
@@ -403,6 +403,14 @@
      (and (eq_attr "isa" "p10")
 	  (match_test "TARGET_POWER10"))
      (const_int 1)
+
+     (and (eq_attr "isa" "lxvp")
+	  (match_test "TARGET_LOAD_VECTOR_PAIR"))
+     (const_int 1)
+
+     (and (eq_attr "isa" "stxvp")
+	  (match_test "TARGET_STORE_VECTOR_PAIR"))
+     (const_int 1)
     ] (const_int 0)))
 
 ;; If this instruction is microcoded on the CELL processor
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 8e46b8fbabb..b89f30c87af 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -597,6 +597,14 @@ mmma
 Target Mask(MMA) Var(rs6000_isa_flags)
 Generate (do not generate) MMA instructions.
 
+mload-vector-pair
+Target Undocumented Mask(LOAD_VECTOR_PAIR) Var(rs6000_isa_flags)
+Generate (do not generate) load vector pair instructions.
+
+mstore-vector-pair
+Target Undocumented Mask(STORE_VECTOR_PAIR) Var(rs6000_isa_flags)
+Generate (do not generate) store vector pair instructions.
+
 mrelative-jumptables
 Target Undocumented Var(rs6000_relative_jumptables) Init(1) Save
 
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 5730bda80dc..7b7e6507754 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3386,8 +3386,9 @@ A VSX register (VSR), @code{vs0}@dots{}@code{vs63}.  This is either an
 FPR (@code{vs0}@dots{}@code{vs31} are @code{f0}@dots{}@code{f31}) or a VR
 (@code{vs32}@dots{}@code{vs63} are @code{v0}@dots{}@code{v31}).
 
-When using @code{wa}, you should use the @code{%x} output modifier, so that
-the correct register number is printed.  For example:
+When using @code{wa}, you should use either the @code{%x} or @code{%S}
+output modifier, so that the correct register number is printed.  For
+example:
 
 @smallexample
 asm ("xvadddp %x0,%x1,%x2"
diff --git a/gcc/testsuite/gcc.target/powerpc/pr112886.c b/gcc/testsuite/gcc.target/powerpc/pr112886.c
new file mode 100644
index 00000000000..4e59dcda6ea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr112886.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* PR target/112886: Test that print_operand %S<n> gives the correct register
+   number for VSX registers (i.e. if the register is an Altivec register, the
+   register number is 32..63 instead of 0..31).  */
+
+void
+test (__vector_pair *ptr1, __vector_pair *ptr2, __vector_pair *ptr3)
+{
+  register __vector_pair p asm ("vs10");
+  register __vector_pair q asm ("vs42");
+  register __vector_pair r asm ("vs44");
+
+  q = *ptr2;
+  r = *ptr3;
+
+  __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %S0,%S1,%S2"
+	   : "=wa" (p)
+	   : "wa"  (q), "wa" (r));
+
+  *ptr1 = p;
+}
+
+/* { dg-final { scan-assembler-times {\mxvadddp 10,42,44\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvadddp 11,43,45\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mlxvpx?\M}           2 } } */
+/* { dg-final { scan-assembler-times {\mstxvpx?\M}          1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-attribute.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-attribute.c
new file mode 100644
index 00000000000..985a44aca85
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-attribute.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test if we can control generating load and store vector pair via the target
+   attribute.  */
+
+__attribute__((__target__("load-vector-pair,store-vector-pair")))
+void
+test_load_store (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;	/* 1 lxvp, 1 stxvp.  */
+}
+
+__attribute__((__target__("load-vector-pair,no-store-vector-pair")))
+void
+test_load_no_store (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;	/* 1 lxvp, 2 stxv.  */
+}
+
+__attribute__((__target__("no-load-vector-pair,store-vector-pair")))
+void
+test_store_no_load (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;	/* 2 lxv, 1 stxvp.  */
+}
+
+__attribute__((__target__("no-load-vector-pair,no-store-vector-pair")))
+void
+test_no_load_or_store (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;	/* 2 lxv, 2 stxv.  */
+}
+
+/* { dg-final { scan-assembler-times {\mp?lxvpx?\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mp?stxvpx?\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mp?lxvx?\M}   4 } } */
+/* { dg-final { scan-assembler-times {\mp?stxvx?\M}  4 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-pragma.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-pragma.c
new file mode 100644
index 00000000000..74c6baf8185
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-pragma.c
@@ -0,0 +1,55 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test if we can control generating load and store vector pair via the #pragma
+   directive.  */
+
+#pragma GCC push_options
+#pragma GCC target("load-vector-pair,store-vector-pair")
+
+void
+test_load_store (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;	/* 1 lxvp, 1 stxvp.  */
+}
+
+#pragma GCC pop_options
+
+#pragma GCC push_options
+#pragma GCC target("load-vector-pair,no-store-vector-pair")
+
+void
+test_load_no_store (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;	/* 1 lxvp, 2 stxv.  */
+}
+
+#pragma GCC pop_options
+
+#pragma GCC push_options
+#pragma GCC target("no-load-vector-pair,store-vector-pair")
+
+void
+test_store_no_load (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;	/* 2 lxv, 1 stxvp.  */
+}
+
+#pragma GCC pop_options
+
+#pragma GCC push_options
+#pragma GCC target("no-load-vector-pair,no-store-vector-pair")
+
+void
+test_no_load_or_store (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;	/* 2 lxv, 2 stxv.  */
+}
+
+#pragma GCC pop_options
+
+/* { dg-final { scan-assembler-times {\mp?lxvpx?\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mp?stxvpx?\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mp?lxvx?\M}   4 } } */
+/* { dg-final { scan-assembler-times {\mp?stxvx?\M}  4 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-switch1.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch1.c
new file mode 100644
index 00000000000..48e433b378e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* Test if we generate load and store vector pair by default on power 10.  */
+
+void
+test (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;	/* 1 lxvp, 1 stxvp.  */
+}
+
+/* { dg-final { scan-assembler-times {\mp?lxvpx?\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mp?stxvpx?\M} 1 } } */
+/* { dg-final { scan-assembler-not   {\mp?lxvx?\M}     } } */
+/* { dg-final { scan-assembler-not   {\mp?stxvx?\M}    } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-switch2.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch2.c
new file mode 100644
index 00000000000..2a38c2f2aae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -mno-store-vector-pair" } */
+
+/* Test if we generate load vector pair but not store vector pair if
+   -mno-store-vector-pair is used on power10.  */
+
+void
+test (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;	/* 1 lxvp, 2 stxv.  */
+}
+
+/* { dg-final { scan-assembler-times {\mp?lxvpx?\M}  1 } } */
+/* { dg-final { scan-assembler-not   {\mp?stxvpx?\M}   } } */
+/* { dg-final { scan-assembler-not   {\mp?lxvx?\M}     } } */
+/* { dg-final { scan-assembler-times {\mp?stxvx?\M}  2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-switch3.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch3.c
new file mode 100644
index 00000000000..fd273056b8f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -mno-load-vector-pair" } */
+
+/* Test if we do not generate load vector pair but generate store vector pair
+   if -mno-load-vector-pair is used on power10.  */
+
+void
+test (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;	/* 2 lxv, 1 stxvp.  */
+}
+
+/* { dg-final { scan-assembler-not   {\mp?lxvpx?\M}    } } */
+/* { dg-final { scan-assembler-times {\mp?stxvpx?\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mp?lxvx?\M}   2 } } */
+/* { dg-final { scan-assembler-not   {\mp?stxvx?\M}    } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-pair-switch4.c b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch4.c
new file mode 100644
index 00000000000..01686e073fe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-pair-switch4.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -mno-load-vector-pair -mno-store-vector-pair" } */
+
+/* Test if we do not generate load and store vector pair if directed to on
+   power 10.  */
+
+void
+test (__vector_pair *p, __vector_pair *q)
+{
+  *p = *q;	/* 2 lxv, 2 stxv.  */
+}
+
+/* { dg-final { scan-assembler-not   {\mp?lxvpx?\M}    } } */
+/* { dg-final { scan-assembler-not   {\mp?stxvpx?\M}   } } */
+/* { dg-final { scan-assembler-times {\mp?lxvx?\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mp?stxvx?\M}  2 } } */
