public inbox for gcc-patches@gcc.gnu.org
* PowerPC -mcpu=future patches, V11
@ 2019-12-20 23:15 Michael Meissner
  2019-12-20 23:28 ` [PATCH] V11 patch #1 of 15, Fix bug in vec_extract Michael Meissner
                   ` (14 more replies)
  0 siblings, 15 replies; 27+ messages in thread
From: Michael Meissner @ 2019-12-20 23:15 UTC
  To: gcc-patches, Segher Boessenkool, David Edelsohn, Michael Meissner

This set of patches reworks the vector extract issues in the V10 patches.

If you recall, in V10, you pointed out that for vector extract, the existing
code overwrote an input argument, and that is fixed in these patches.

In V10, I added two new constraints (ep and em) to categorize whether a memory
operand is prefixed or not prefixed, and we had some discussion about how to
write the predicates.

However, yesterday I realized that for the case that motivated the new
constraints (vector extract with a variable element number, where the vector is
in memory, and we are optimizing the load to just load up the element being
extracted), what we want is just the address of the vector in a base register.

This is because in order to access the element where the element number is
variable, we eventually will need to do an X-FORM load, with the vector address
in one register, and the byte offset in another.

Instead of adding new alternatives and new scratch registers, I could just
simplify the code and use the 'Q' constraint that says use a single register as
the address.  The register allocator will do the necessary work to load up the
address during register allocation.
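
As an illustration only (this is not code from the patches), the address
arithmetic behind the X-FORM load can be sketched in plain C.  The function name
extract_u32_var and the fixed layout of four 32-bit elements are assumptions
made for the example; in the compiler the equivalent steps are emitted as RTL by
rs6000_split_vec_extract_var and rs6000_adjust_vec_address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a variable vector extract from memory.
   VEC_BASE plays the role of the single base register that the 'Q'
   constraint provides; the masked, scaled element number plays the role
   of the index register of an X-FORM (register + register) load.  */
static uint32_t
extract_u32_var (const void *vec_base, uint64_t element)
{
  const uint64_t num_elements = 4;	/* 16-byte vector, 4-byte elements.  */
  const unsigned byte_shift = 2;	/* log2 (sizeof (uint32_t)).  */
  uint32_t result;

  assert (vec_base != NULL);

  /* Keep the element number in bounds with an AND against
     (num_elements - 1).  */
  uint64_t index = element & (num_elements - 1);

  /* Convert the element number into a byte offset.  */
  uint64_t byte_offset = index << byte_shift;

  /* Effective address = base register + index register.  */
  memcpy (&result, (const unsigned char *) vec_base + byte_offset,
	  sizeof (result));
  return result;
}
```

The sketch indexes elements in memory order and ignores register-side endian
adjustments, which is enough to show why a single base register is all the
memory operand needs to supply.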

I did notice that the documentation for 'Q' was wrong, so one of the patches
updates the documentation.

In addition, after committing the first 3 patches from V10 that added PADDI and
PLI support for -mcpu=future, Segher asked me to do a patch to rename two of
the macros.  That patch is now checked in, and some of these patches include
changes due to the macro renaming.

After the vector extract patch rework, I included the remaining patch to the
compiler (make -mpcrel default on Linux 64-bit for -mcpu=future).  I included
the tests after doing the -mpcrel default changes.  In addition to the tests in
V10, I added some new tests for the vector extract code.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V11 patch #1 of 15, Fix bug in vec_extract
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
@ 2019-12-20 23:28 ` Michael Meissner
  2019-12-22 14:06   ` Segher Boessenkool
  2019-12-20 23:47 ` [PATCH] V11 patch #2 of 15, Use prefixed load for vector extract with large offset Michael Meissner
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 27+ messages in thread
From: Michael Meissner @ 2019-12-20 23:28 UTC
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This patch fixes the bug pointed out in the V10 patch review that the code
modified an input argument to vector extract with a variable element number.

I also added two gcc_asserts to the vector extract address code to signal an
internal error if the temporary base register was used for two different
purposes.  This shows up if you have a vector whose address is a PC-relative
address and the element number was variable.
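
To make the failure mode concrete, here is a small stand-alone C model (an
illustration, not code from the patch).  Each "register" is modeled as a
uint64_t, and two operands alias when they point at the same object.  The names
form_address and form_address_checked are hypothetical; the assert stands in for
the gcc_assert (!reg_mentioned_p (base_tmp, element_offset)) added by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the two-instruction sequence:
     base_tmp = addr;
     base_tmp += element_offset;
   If BASE_TMP and ELEMENT_OFFSET are the same register, the first move
   destroys the offset and the add doubles the address.  */
static uint64_t
form_address (uint64_t *base_tmp, uint64_t addr,
	      const uint64_t *element_offset)
{
  *base_tmp = addr;
  *base_tmp += *element_offset;
  return *base_tmp;
}

/* Same sequence, but refuse to run when the temporary is already in use
   as the offset.  */
static uint64_t
form_address_checked (uint64_t *base_tmp, uint64_t addr,
		      const uint64_t *element_offset)
{
  assert (base_tmp != element_offset);
  return form_address (base_tmp, addr, element_offset);
}
```

With distinct registers, an address of 0x1000 and an offset of 8 give 0x1008.
When the same register is passed for both operands, the unchecked version
returns 0x2000: the offset is lost and the address is doubled, which is the
situation the new assertions are meant to catch.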

Later patches will fix the case that I know of that generates the bad code, but
it is still important to make sure the same case doesn't happen in the future.

With this patch applied, the compiler will signal an internal error instead of
generating bad code.  FWIW, I did build all of Spec 2017 and Spec 2006 with
this patch applied, but not the others, and we did not get an assertion
failure.

I have bootstrapped the compiler and there were no regression test failures on
a little endian Power8 system.

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_adjust_vec_address): Add
	assertion to make sure that we don't load an address into a
	temporary that is already used.
	(rs6000_split_vec_extract_var): Do not overwrite the element when
	masking it.  Use the base register temporary instead.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 279549)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6757,6 +6757,8 @@ rs6000_adjust_vec_address (rtx scalar_re
 
       else
 	{
+	  /* If we are called from rs6000_split_vec_extract_var, base_tmp may
+	     be the same as element.  */
 	  if (TARGET_POWERPC64)
 	    emit_insn (gen_ashldi3 (base_tmp, element, GEN_INT (byte_shift)));
 	  else
@@ -6825,6 +6827,11 @@ rs6000_adjust_vec_address (rtx scalar_re
 
 	  else
 	    {
+	      /* Make sure base_tmp is not the same as element_offset.  This
+		 can happen if the element number is variable and the address
+		 is not a simple address.  Otherwise we lose the offset, and
+		 double the address.  */
+	      gcc_assert (!reg_mentioned_p (base_tmp, element_offset));
 	      emit_move_insn (base_tmp, op1);
 	      emit_insn (gen_add2_insn (base_tmp, element_offset));
 	    }
@@ -6835,6 +6842,10 @@ rs6000_adjust_vec_address (rtx scalar_re
 
   else
     {
+      /* Make sure base_tmp is not the same as element_offset.  This can happen
+	 if the element number is variable and the address is not a simple
+	 address.  Otherwise we lose the offset, and double the address.  */
+      gcc_assert (!reg_mentioned_p (base_tmp, element_offset));
       emit_move_insn (base_tmp, addr);
       new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
     }
@@ -6902,9 +6913,10 @@ rs6000_split_vec_extract_var (rtx dest,
       int num_elements = GET_MODE_NUNITS (mode);
       rtx num_ele_m1 = GEN_INT (num_elements - 1);
 
-      emit_insn (gen_anddi3 (element, element, num_ele_m1));
+      /* Make sure the element number is in bounds.  */
       gcc_assert (REG_P (tmp_gpr));
-      emit_move_insn (dest, rs6000_adjust_vec_address (dest, src, element,
+      emit_insn (gen_anddi3 (tmp_gpr, element, num_ele_m1));
+      emit_move_insn (dest, rs6000_adjust_vec_address (dest, src, tmp_gpr,
 						       tmp_gpr, scalar_mode));
       return;
     }

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V11 patch #2 of 15, Use prefixed load for vector extract with large offset
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
  2019-12-20 23:28 ` [PATCH] V11 patch #1 of 15, Fix bug in vec_extract Michael Meissner
@ 2019-12-20 23:47 ` Michael Meissner
  2019-12-22 17:24   ` Segher Boessenkool
  2019-12-20 23:49 ` [PATCH] V11 patch #3 of 15, Use 'Q' constraint for variable vector extract from memory Michael Meissner
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 27+ messages in thread
From: Michael Meissner @ 2019-12-20 23:47 UTC
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This patch incorporates large offsets for -mcpu=future when we optimize a
vector extract from memory and the memory address previously had been a
prefixed address with a large offset.

The current code would load the constant into a temporary and then do an
indexed load.  Successive passes would eventually optimize that back into the
form we want (having the base register plus a large offset), but it is better
to generate the optimal code sooner.
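
The selection order can be summarized with a short C sketch.  This is an
illustration under stated assumptions, not the patch itself: the enum, the
helper fits_signed, and the function choose_addr_form are hypothetical names,
and the prefixed_ok parameter stands in for TARGET_PREFIXED_ADDR.  The
conditions mirror the ones in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum addr_form
{
  ADDR_D_FORM_16BIT,	/* base register + 16-bit signed offset.  */
  ADDR_PREFIXED_34BIT,	/* base register + 34-bit signed offset.  */
  ADDR_X_FORM		/* base register + index register.  */
};

/* Return true if VALUE fits in a signed field of BITS bits.  */
static bool
fits_signed (int64_t value, unsigned bits)
{
  assert (bits >= 1 && bits < 64);
  int64_t limit = (int64_t) 1 << (bits - 1);
  return value >= -limit && value < limit;
}

/* Pick the addressing form for a combined constant OFFSET when loading a
   scalar of SCALAR_SIZE bytes.  */
static enum addr_form
choose_addr_form (int64_t offset, unsigned scalar_size, bool prefixed_ok)
{
  /* 16-bit offset; 8-byte scalars also need the low two bits clear.  */
  if (fits_signed (offset, 16) && (scalar_size < 8 || (offset & 0x3) == 0))
    return ADDR_D_FORM_16BIT;

  /* 34-bit offset if prefixed addressing is available.  */
  if (prefixed_ok && fits_signed (offset, 34))
    return ADDR_PREFIXED_34BIT;

  /* Otherwise the offset goes into a temporary and an indexed load is
     used.  */
  return ADDR_X_FORM;
}
```

For example, an offset of 32768 with a 4-byte scalar selects the prefixed form
when prefixed addressing is available, and falls back to the indexed form when
it is not.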

I have bootstrapped this change on a little endian power8 system and there were
no regressions.  Can I check this into the trunk?

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_adjust_vec_address): Add support
	for the offset being 34-bits when -mcpu=future is used.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 279553)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6792,9 +6792,17 @@ rs6000_adjust_vec_address (rtx scalar_re
 	  HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset);
 	  rtx offset_rtx = GEN_INT (offset);
 
-	  if (IN_RANGE (offset, -32768, 32767)
+	  /* 16-bit offset.  */
+	  if (SIGNED_INTEGER_16BIT_P (offset)
 	      && (scalar_size < 8 || (offset & 0x3) == 0))
 	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
+	  /* 34-bit offset if we have prefixed addresses.  */
+	  else if (TARGET_PREFIXED_ADDR && SIGNED_INTEGER_34BIT_P (offset))
+	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
+	  /* Offset overflowed, move offset to the temporary (which will likely
+	     be split), and do X-FORM addressing.  */
 	  else
 	    {
 	      emit_move_insn (base_tmp, offset_rtx);
@@ -6825,6 +6833,12 @@ rs6000_adjust_vec_address (rtx scalar_re
 	      emit_insn (insn);
 	    }
 
+	  /* Make sure we don't overwrite the temporary if the element being
+	     extracted is variable, and we've put the offset into base_tmp
+	     previously.  */
+	  else if (rtx_equal_p (base_tmp, element_offset))
+	    emit_insn (gen_add2_insn (base_tmp, op1));
+
 	  else
 	    {
 	      /* Make sure base_tmp is not the same as element_offset.  This

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V11 patch #3 of 15, Use 'Q' constraint for variable vector extract from memory
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
  2019-12-20 23:28 ` [PATCH] V11 patch #1 of 15, Fix bug in vec_extract Michael Meissner
  2019-12-20 23:47 ` [PATCH] V11 patch #2 of 15, Use prefixed load for vector extract with large offset Michael Meissner
@ 2019-12-20 23:49 ` Michael Meissner
  2019-12-22 17:49   ` Segher Boessenkool
  2019-12-20 23:56 ` [PATCH] V11 patch #4 of 15, Update 'Q' constraint documentation Michael Meissner
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 27+ messages in thread
From: Michael Meissner @ 2019-12-20 23:49 UTC
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

As I mentioned in the intro, for the case where we are optimizing the extract
of a variable element from a vector in memory, the current code takes a regular
address, and the temporary that holds the byte offset, and tries to generate a
new address.  In particular, it failed when the vector's address was
PC-relative, because it didn't have enough temporary registers, and it reused
the temporary that holds the byte offset to also hold the address.

Initially in doing these patches, I reworked the constraints for prefixed and
non-prefixed memory so we could identify when we needed a second temporary.
Then I realized that eventually we will want to generate an X-FORM (register +
register) address, and it was just simpler to use the 'Q' constraint, and have
the register allocator put the address into a register.

I have verified that the bug is indeed fixed (patch #15 will include the new
tests for this).  I have also bootstrapped the compiler on a little endian
power8 machine and there were no regressions in the test suite.  Can I check
this patch into the trunk?

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/vsx.md (vsx_extract_<mode>_var, VSX_D iterator):
	Use 'Q' for memory constraints because we need to do an X-FORM
	load with the variable index.
	(vsx_extract_v4sf_var): Use 'Q' for memory constraints because we
	need to do an X-FORM load with the variable index.
	(vsx_extract_<mode>_var, VSX_EXTRACT_I iterator): Use 'Q' for
	memory constraints because we need to do an X-FORM load with the
	variable index.
	(vsx_extract_<mode>_<VS_scalar>mode_var): Use 'Q' for memory
	constraints because we need to do an X-FORM load with the variable
	index.

Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 279597)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -3245,10 +3245,11 @@ (define_insn "vsx_vslo_<mode>"
   "vslo %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
-;; Variable V2DI/V2DF extract
+;; Variable V2DI/V2DF extract.  Use 'Q' for the memory because we will
+;; ultimately have to convert the address into base + index.
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r")
-	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,m,m")
+	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,Q,Q")
 			     (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 			    UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
@@ -3318,7 +3319,7 @@ (define_insn_and_split "*vsx_extract_v4s
 ;; Variable V4SF extract
 (define_insn_and_split "vsx_extract_v4sf_var"
   [(set (match_operand:SF 0 "gpc_reg_operand" "=wa,wa,?r")
-	(unspec:SF [(match_operand:V4SF 1 "input_operand" "v,m,m")
+	(unspec:SF [(match_operand:V4SF 1 "input_operand" "v,Q,Q")
 		    (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 		   UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
@@ -3681,7 +3682,7 @@ (define_insn_and_split "*vsx_extract_<mo
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(unspec:<VS_scalar>
-	 [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
+	 [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,Q")
 	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,r,&b"))
@@ -3701,7 +3702,7 @@ (define_insn_and_split "*vsx_extract_<mo
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(zero_extend:<VS_scalar>
 	 (unspec:<VSX_EXTRACT_I:VS_scalar>
-	  [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
+	  [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,Q")
 	   (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	  UNSPEC_VSX_EXTRACT)))
    (clobber (match_scratch:DI 3 "=r,r,&b"))

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V11 patch #4 of 15, Update 'Q' constraint documentation.
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
                   ` (2 preceding siblings ...)
  2019-12-20 23:49 ` [PATCH] V11 patch #3 of 15, Use 'Q' constraint for variable vector extract from memory Michael Meissner
@ 2019-12-20 23:56 ` Michael Meissner
  2019-12-22 20:02   ` Segher Boessenkool
  2019-12-21  0:00 ` [PATCH] V11 patch #5 of 15, Optimize vec_extract of a vector in memory with a PC-relative address Michael Meissner
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 27+ messages in thread
From: Michael Meissner @ 2019-12-20 23:56 UTC
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

In doing V11 patch #3, I noticed that the documentation for the 'Q' constraint
was misleading.  This patch updates the documentation.  Can I check this patch
into the trunk?

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/constraints.md (Q constraint): Update
	documentation.
	* doc/md.texi (PowerPC constraints): Update 'Q' constraint
	documentation.

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 279547)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -211,8 +211,7 @@ several times, or that might not access
        (match_test "GET_RTX_CLASS (GET_CODE (XEXP (op, 0))) != RTX_AUTOINC")))
 
 (define_memory_constraint "Q"
-  "Memory operand that is an offset from a register (it is usually better
-to use @samp{m} or @samp{es} in @code{asm} statements)"
+  "A memory operand whose address uses a single register with no offset."
   (and (match_code "mem")
        (match_test "REG_P (XEXP (op, 0))")))
 
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 279547)
+++ gcc/doc/md.texi	(working copy)
@@ -3381,8 +3381,7 @@ allowed when @samp{<} or @samp{>} is use
 as @samp{m} without @samp{<} and @samp{>}.
 
 @item Q
-Memory operand that is an offset from a register (it is usually better
-to use @samp{m} or @samp{es} in @code{asm} statements)
+A memory operand whose address uses a single register with no offset.
 
 @item Z
 Memory operand that is an indexed or indirect from a register (it is

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V11 patch #5 of 15, Optimize vec_extract of a vector in memory with a PC-relative address
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
                   ` (3 preceding siblings ...)
  2019-12-20 23:56 ` [PATCH] V11 patch #4 of 15, Update 'Q' constraint documentation Michael Meissner
@ 2019-12-21  0:00 ` Michael Meissner
  2019-12-25  6:41   ` Segher Boessenkool
  2019-12-21  0:03 ` [PATCH] V11 patch #6 of 15, Make -mpcrel the default for -mcpu=future on Linux 64-bit Michael Meissner
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 27+ messages in thread
From: Michael Meissner @ 2019-12-21  0:00 UTC
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This patch handles the optimization of a vector extract with a constant element
number when the vector is in memory and the vector's address is PC-relative.
In that case we fold the element offset into the address and use a PC-relative
load directly, instead of loading the address into a temporary register and
then doing an indirect load.
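
As a rough sketch of the folding step (an illustration, not code from the
patch), a PC-relative address can be modeled as a symbol plus a constant addend.
The struct pcrel_addr and the function fold_element_offset are hypothetical
names; the 34-bit range check corresponds to the SIGNED_INTEGER_34BIT_P test in
the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a PC-relative address: SYMBOL + ADDEND.  */
struct pcrel_addr
{
  const char *symbol;
  int64_t addend;
};

/* Try to fold a constant ELEMENT_OFFSET (in bytes) into ADDR.  Return true
   and update ADDR if the combined addend still fits in a signed 34-bit
   field; otherwise leave ADDR alone and return false, in which case the
   caller would have to load the address into a temporary instead.  */
static bool
fold_element_offset (struct pcrel_addr *addr, int64_t element_offset)
{
  const int64_t limit = (int64_t) 1 << 33;

  assert (addr != NULL);

  int64_t offset = addr->addend + element_offset;
  if (offset < -limit || offset >= limit)
    return false;

  addr->addend = offset;
  return true;
}
```

Extracting a constant element then becomes a single PC-relative load of
symbol + addend, with no temporary register involved.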

I have bootstrapped a compiler on a little endian power8 machine and ran the
testsuite with no regressions.  Can I check this into the trunk?

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_reg_to_addr_mask): New helper
	function to identify the address mask of a hard register.
	(rs6000_adjust_vec_address): If we have a PC-relative address and
	a constant vector element number, fold the element number into the
	PC-relative address.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 279597)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6722,6 +6722,30 @@ rs6000_expand_vector_extract (rtx target
     }
 }
 
+/* Helper function to return an address mask based on a physical register.  */
+
+static addr_mask_type
+rs6000_reg_to_addr_mask (rtx reg, machine_mode mode)
+{
+  unsigned int r = reg_or_subregno (reg);
+  addr_mask_type addr_mask;
+
+  gcc_assert (HARD_REGISTER_NUM_P (r));
+  if (INT_REGNO_P (r))
+    addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_GPR];
+
+  else if (FP_REGNO_P (r))
+    addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_FPR];
+
+  else if (ALTIVEC_REGNO_P (r))
+    addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_VMX];
+
+  else
+    gcc_unreachable ();
+
+  return addr_mask;
+}
+
 /* Adjust a memory address (MEM) of a vector type to point to a scalar field
    within the vector (ELEMENT) with a mode (SCALAR_MODE).  Use a base register
    temporary (BASE_TMP) to fixup the address.  Return the new memory address
@@ -6854,6 +6878,51 @@ rs6000_adjust_vec_address (rtx scalar_re
 	}
     }
 
+  /* For references to local static variables, try to fold a constant offset
+     into the address.  */
+  else if (pcrel_local_address (addr, Pmode) && CONST_INT_P (element_offset))
+    {
+      if (GET_CODE (addr) == CONST)
+	addr = XEXP (addr, 0);
+
+      if (GET_CODE (addr) == PLUS)
+	{
+	  rtx op0 = XEXP (addr, 0);
+	  rtx op1 = XEXP (addr, 1);
+	  if (CONST_INT_P (op1))
+	    {
+	      HOST_WIDE_INT offset
+		= INTVAL (XEXP (addr, 1)) + INTVAL (element_offset);
+
+	      if (offset == 0)
+		new_addr = op0;
+
+	      else if (SIGNED_INTEGER_34BIT_P (offset))
+		{
+		  rtx plus = gen_rtx_PLUS (Pmode, op0, GEN_INT (offset));
+		  new_addr = gen_rtx_CONST (Pmode, plus);
+		}
+
+	      else
+		{
+		  emit_move_insn (base_tmp, addr);
+		  new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
+		}
+	    }
+	  else
+	    {
+	      emit_move_insn (base_tmp, addr);
+	      new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
+	    }
+	}
+
+      else
+	{
+	  rtx plus = gen_rtx_PLUS (Pmode, addr, element_offset);
+	  new_addr = gen_rtx_CONST (Pmode, plus);
+	}
+    }
+
   else
     {
       /* Make sure base_tmp is not the same as element_offset.  This can happen
@@ -6869,21 +6938,8 @@ rs6000_adjust_vec_address (rtx scalar_re
   if (GET_CODE (new_addr) == PLUS)
     {
       rtx op1 = XEXP (new_addr, 1);
-      addr_mask_type addr_mask;
-      unsigned int scalar_regno = reg_or_subregno (scalar_reg);
-
-      gcc_assert (HARD_REGISTER_NUM_P (scalar_regno));
-      if (INT_REGNO_P (scalar_regno))
-	addr_mask = reg_addr[scalar_mode].addr_mask[RELOAD_REG_GPR];
-
-      else if (FP_REGNO_P (scalar_regno))
-	addr_mask = reg_addr[scalar_mode].addr_mask[RELOAD_REG_FPR];
-
-      else if (ALTIVEC_REGNO_P (scalar_regno))
-	addr_mask = reg_addr[scalar_mode].addr_mask[RELOAD_REG_VMX];
-
-      else
-	gcc_unreachable ();
+      addr_mask_type addr_mask
+	= rs6000_reg_to_addr_mask (scalar_reg, scalar_mode);
 
       if (REG_P (op1) || SUBREG_P (op1))
 	valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
@@ -6891,9 +6947,21 @@ rs6000_adjust_vec_address (rtx scalar_re
 	valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
     }
 
+  /* An address that is a single register is always valid for either indexed or
+     offsettable loads.  */
   else if (REG_P (new_addr) || SUBREG_P (new_addr))
     valid_addr_p = true;
 
+  /* If we have a PC-relative address, check if offsetable loads are
+     allowed.  */
+  else if (pcrel_local_address (new_addr, Pmode))
+    {
+      addr_mask_type addr_mask
+	= rs6000_reg_to_addr_mask (scalar_reg, scalar_mode);
+
+      valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+    }
+
   else
     valid_addr_p = false;
 

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V11 patch #6 of 15, Make -mpcrel the default for -mcpu=future on Linux 64-bit
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
                   ` (4 preceding siblings ...)
  2019-12-21  0:00 ` [PATCH] V11 patch #5 of 15, Optimize vec_extract of a vector in memory with a PC-relative address Michael Meissner
@ 2019-12-21  0:03 ` Michael Meissner
  2019-12-21  0:06 ` [PATCH] V11 patch #7 of 15, Add new target_supports cases for -mcpu=future tests Michael Meissner
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2019-12-21  0:03 UTC
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This is the same as V10 patch #8.  Once the vector extract patches are
committed, this patch flips the default to use PC-relative addressing on 64-bit
Linux systems when the user specifies -mcpu=future.
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00841.html

I have bootstrapped the compiler on a little endian power8 system and ran the
testsuite with no regressions.  Once the preceding V11 patches have been
checked in, can I check these patches into the trunk?
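
The decision logic this patch adds to rs6000_option_override_internal can be
summarized with the following stand-alone C sketch.  It is a simplified model
for illustration only: struct addr_opts and resolve_addressing are hypothetical
names, each field stands in for the compiler macro or flag named in its comment,
and the error messages issued for explicit but unsupported options are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the inputs the option override code looks at.  */
struct addr_opts
{
  bool future;			/* -mcpu=future.  */
  bool is_64bit;		/* TARGET_POWERPC64.  */
  bool elfv2;			/* rs6000_current_abi == ABI_ELFv2.  */
  bool cmodel_medium;		/* TARGET_CMODEL == CMODEL_MEDIUM.  */
  bool os_prefixed;		/* PREFIXED_ADDR_SUPPORTED_BY_OS.  */
  bool os_pcrel;		/* PCREL_SUPPORTED_BY_OS.  */
  bool explicit_prefixed;	/* User gave -m[no-]prefixed-addr.  */
  bool explicit_pcrel;		/* User gave -m[no-]pcrel.  */
  bool prefixed;		/* In/out: OPTION_MASK_PREFIXED_ADDR.  */
  bool pcrel;			/* In/out: OPTION_MASK_PCREL.  */
};

static void
resolve_addressing (struct addr_opts *o)
{
  assert (o != NULL);

  if (o->future)
    {
      /* Prefixed addressing needs 64-bit registers and ELFv2.  */
      if (!o->is_64bit || !o->elfv2)
	o->prefixed = o->pcrel = false;

      /* Otherwise enable the OS defaults unless the user chose explicitly.
	 An explicit -mpcrel also turns on prefixed addressing.  */
      else
	{
	  if (!o->explicit_prefixed
	      && (o->os_prefixed || o->pcrel || o->os_pcrel))
	    o->prefixed = true;

	  if (!o->explicit_pcrel && o->os_pcrel
	      && o->prefixed && o->cmodel_medium)
	    o->pcrel = true;
	}

      /* PC-relative addressing requires the medium code model.  */
      if (o->pcrel && !o->cmodel_medium)
	o->pcrel = false;
    }

  /* Prefixed addressing (and hence PC-relative) requires -mcpu=future.  */
  if (o->prefixed && !o->future)
    o->prefixed = o->pcrel = false;

  /* PC-relative addressing requires prefixed addressing.  */
  if (o->pcrel && !o->prefixed)
    o->pcrel = false;
}
```

With the 64-bit Linux settings (both OS macros set to 1, ELFv2, medium code
model) and no explicit options, the model ends with both prefixed and
PC-relative addressing enabled.  Clearing is_64bit, or dropping -mcpu=future,
turns both off again.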

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/linux64.h (PREFIXED_ADDR_SUPPORTED_BY_OS): Set to
	1 to enable prefixed addressing if -mcpu=future.
	(PCREL_SUPPORTED_BY_OS): Set to 1 to enable PC-relative addressing
	if -mcpu=future.
	* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Do not
	enable -mprefixed-addr or -mpcrel by default.
	(ADDRESSING_FUTURE_MASKS): New macro.
	(OTHER_FUTURE_MASKS): Use ADDRESSING_FUTURE_MASKS.
	* config/rs6000/rs6000.c (PREFIXED_ADDR_SUPPORTED_BY_OS): Disable
	prefixed addressing unless the target OS tm.h says we should
	enable it.
	(PCREL_SUPPORTED_BY_OS): Disable PC-relative addressing unless the
	target OS tm.h says we should enable it.
	(rs6000_debug_reg_global): Print whether prefixed addressing and
	PC-relative addressing are enabled by default if -mcpu=future.
	(rs6000_option_override_internal): Move setting prefixed
	addressing and PC-relative addressing after the sub-target option
	handling is done.  Only enable prefixed addressing or PC-relative
	addressing on -mcpu=future systems if the target OS says to enable
	it.  Disallow prefixed addressing on 32-bit systems or if the
	target object file is not ELF v2.

Index: gcc/config/rs6000/linux64.h
===================================================================
--- gcc/config/rs6000/linux64.h	(revision 279141)
+++ gcc/config/rs6000/linux64.h	(working copy)
@@ -640,3 +640,11 @@ extern int dot_symbols;
    enabling the __float128 keyword.  */
 #undef	TARGET_FLOAT128_ENABLE_TYPE
 #define TARGET_FLOAT128_ENABLE_TYPE 1
+
+/* Enable support for pc-relative and numeric prefixed addressing on the
+   'future' system.  */
+#undef  PREFIXED_ADDR_SUPPORTED_BY_OS
+#define PREFIXED_ADDR_SUPPORTED_BY_OS	1
+
+#undef  PCREL_SUPPORTED_BY_OS
+#define PCREL_SUPPORTED_BY_OS		1
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(revision 279141)
+++ gcc/config/rs6000/rs6000-cpus.def	(working copy)
@@ -75,15 +75,22 @@
 				 | OPTION_MASK_P8_VECTOR		\
 				 | OPTION_MASK_P9_VECTOR)
 
-/* Support for a future processor's features.  Do not enable -mpcrel until it
-   is fully functional.  */
+/* Support for a future processor's features.  The prefixed and pc-relative
+   addressing bits are not added here.  Instead, they are added if the target
+   OS tm.h says that it supports the addressing modes by default when
+   -mcpu=future is used.  */
 #define ISA_FUTURE_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
-				 | OPTION_MASK_FUTURE			\
+				 | OPTION_MASK_FUTURE)
+
+/* Addressing related flags on a future processor.  These are options that need
+   to be cleared if the target OS is not capable of supporting prefixed
+   addressing at all (such as 32-bit mode or if the object file format is not
+   ELF v2).  */
+#define ADDRESSING_FUTURE_MASKS	(OPTION_MASK_PCREL			\
 				 | OPTION_MASK_PREFIXED_ADDR)
 
 /* Flags that need to be turned off if -mno-future.  */
-#define OTHER_FUTURE_MASKS	(OPTION_MASK_PCREL			\
-				 | OPTION_MASK_PREFIXED_ADDR)
+#define OTHER_FUTURE_MASKS	ADDRESSING_FUTURE_MASKS
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS	(OPTION_MASK_FLOAT128_HW		\
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 279202)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -98,6 +98,16 @@
 #endif
 #endif
 
+/* Set up the defaults for whether prefixed addressing is used, and if it is
+   used, whether we want to turn on pc-relative support by default.  */
+#ifndef PREFIXED_ADDR_SUPPORTED_BY_OS
+#define PREFIXED_ADDR_SUPPORTED_BY_OS	0
+#endif
+
+#ifndef PCREL_SUPPORTED_BY_OS
+#define PCREL_SUPPORTED_BY_OS		0
+#endif
+
 /* Support targetm.vectorize.builtin_mask_for_load.  */
 GTY(()) tree altivec_builtin_mask_for_load;
 
@@ -2535,6 +2545,14 @@ rs6000_debug_reg_global (void)
   if (TARGET_DIRECT_MOVE_128)
     fprintf (stderr, DEBUG_FMT_D, "VSX easy 64-bit mfvsrld element",
 	     (int)VECTOR_ELEMENT_MFVSRLD_64BIT);
+
+  if (TARGET_FUTURE)
+    {
+      fprintf (stderr, DEBUG_FMT_D, "PREFIXED_ADDR_SUPPORTED_BY_OS",
+	       PREFIXED_ADDR_SUPPORTED_BY_OS);
+      fprintf (stderr, DEBUG_FMT_D, "PCREL_SUPPORTED_BY_OS",
+	       PCREL_SUPPORTED_BY_OS);
+    }
 }
 
 \f
@@ -4015,26 +4033,6 @@ rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
     }
 
-  /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
-  if (TARGET_PREFIXED_ADDR && !TARGET_FUTURE)
-    {
-      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
-      else if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
-	error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
-
-      rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
-    }
-
-  /* -mpcrel requires prefixed load/store addressing.  */
-  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
-    {
-      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
-
-      rs6000_isa_flags &= ~OPTION_MASK_PCREL;
-    }
-
   /* Print the options after updating the defaults.  */
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
     rs6000_print_isa_options (stderr, 0, "after defaults", rs6000_isa_flags);
@@ -4166,12 +4164,91 @@ rs6000_option_override_internal (bool gl
   SUB3TARGET_OVERRIDE_OPTIONS;
 #endif
 
-  /* -mpcrel requires -mcmodel=medium, but we can't check TARGET_CMODEL until
-      after the subtarget override options are done.  */
-  if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
+  /* Enable prefixed addressing and PC-relative addressing if the target OS
+     tm.h file says that it is supported and the user did not explicitly use
+     -mprefixed-addr or -mpcrel.  At the present time, only 64-bit Linux
+     enables this.
+
+     PC-relative support also requires the medium code model.
+
+     We can't check for ELFv2 or -mcmodel=medium until after the subtarget
+     macros are run.
+
+     If prefixed addressing is disabled by default, and the user does -mpcrel,
+     don't force them to also specify -mprefixed-addr.  */
+  if (TARGET_FUTURE)
+    {
+      bool explicit_prefixed = ((rs6000_isa_flags_explicit
+				 & OPTION_MASK_PREFIXED_ADDR) != 0);
+      bool explicit_pcrel = ((rs6000_isa_flags_explicit
+			      & OPTION_MASK_PCREL) != 0);
+
+      /* Prefixed addressing requires 64-bit registers.  */
+      if (!TARGET_POWERPC64)
+	{
+	  if (TARGET_PCREL && explicit_pcrel)
+	    error ("%qs requires %qs", "-mpcrel", "-m64");
+
+	  else if (TARGET_PREFIXED_ADDR && explicit_prefixed)
+	    error ("%qs requires %qs", "-mprefixed-addr", "-m64");
+
+	  rs6000_isa_flags &= ~ADDRESSING_FUTURE_MASKS;
+	}
+
+      /* Only ELFv2 currently supports prefixed/pcrel addressing.  */
+      else if (rs6000_current_abi != ABI_ELFv2)
+	{
+	  if (TARGET_PCREL && explicit_pcrel)
+	    error ("%qs requires %qs", "-mpcrel", "-mabi=elfv2");
+
+	  else if (TARGET_PREFIXED_ADDR && explicit_prefixed)
+	    error ("%qs requires %qs", "-mprefixed-addr", "-mabi=elfv2");
+
+	  rs6000_isa_flags &= ~ADDRESSING_FUTURE_MASKS;
+	}
+
+      /* Enable defaults if desired.  */
+      else
+	{
+	  if (!explicit_prefixed
+	      && (PREFIXED_ADDR_SUPPORTED_BY_OS
+		  || TARGET_PCREL
+		  || PCREL_SUPPORTED_BY_OS))
+	    rs6000_isa_flags |= OPTION_MASK_PREFIXED_ADDR;
+
+	  if (!explicit_pcrel && PCREL_SUPPORTED_BY_OS
+	      && TARGET_PREFIXED_ADDR
+	      && TARGET_CMODEL == CMODEL_MEDIUM)
+	    rs6000_isa_flags |= OPTION_MASK_PCREL;
+	}
+
+      /* PC-relative requires the medium code model.  */
+      if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
+	{
+	  if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+	    error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
+
+	  rs6000_isa_flags &= ~OPTION_MASK_PCREL;
+	}
+
+    }
+
+  /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
+  if (TARGET_PREFIXED_ADDR && !TARGET_FUTURE)
     {
       if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
-	error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
+	error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
+      else if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
+	error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
+
+      rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
+    }
+
+  /* -mpcrel requires prefixed load/store addressing.  */
+  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
+    {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+	error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
 
       rs6000_isa_flags &= ~OPTION_MASK_PCREL;
     }
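
For readers following the checks above, the cascade can be modeled as a small
standalone function.  This is only a sketch under stated assumptions: the
struct, the mask values, and the function name are hypothetical stand-ins for
the rs6000 globals and macros, ADDRESSING_FUTURE_MASKS is assumed to cover both
the prefixed and PC-relative bits, and the error() calls are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for OPTION_MASK_PREFIXED_ADDR and OPTION_MASK_PCREL;
   the values are illustrative only.  */
#define MASK_PREFIXED 0x1u
#define MASK_PCREL    0x2u

/* Hypothetical snapshot of the target state the patch consults.  */
struct target_model
{
  bool cpu_future;	/* TARGET_FUTURE */
  bool is_64bit;	/* TARGET_POWERPC64 */
  bool abi_elfv2;	/* rs6000_current_abi == ABI_ELFv2 */
  bool cmodel_medium;	/* TARGET_CMODEL == CMODEL_MEDIUM */
  bool os_prefixed;	/* PREFIXED_ADDR_SUPPORTED_BY_OS */
  bool os_pcrel;	/* PCREL_SUPPORTED_BY_OS */
};

/* Example configuration: 64-bit ELFv2 with both OS defaults enabled.  */
static const struct target_model linux64_model = {
  .cpu_future = true, .is_64bit = true, .abi_elfv2 = true,
  .cmodel_medium = true, .os_prefixed = true, .os_pcrel = true
};

/* Return the final flag set.  FLAGS holds the current option bits and
   EXPLICIT_FLAGS the bits the user set on the command line.  */
static unsigned
resolve_addressing (const struct target_model *t, unsigned flags,
		    unsigned explicit_flags)
{
  assert (t != 0);

  if (t->cpu_future)
    {
      if (!t->is_64bit || !t->abi_elfv2)
	/* Assumed meaning of ADDRESSING_FUTURE_MASKS: drop both bits.  */
	flags &= ~(MASK_PREFIXED | MASK_PCREL);
      else
	{
	  /* Enable defaults unless the user chose explicitly.  */
	  if (!(explicit_flags & MASK_PREFIXED)
	      && (t->os_prefixed || (flags & MASK_PCREL) || t->os_pcrel))
	    flags |= MASK_PREFIXED;

	  if (!(explicit_flags & MASK_PCREL) && t->os_pcrel
	      && (flags & MASK_PREFIXED) && t->cmodel_medium)
	    flags |= MASK_PCREL;
	}

      /* PC-relative requires the medium code model.  */
      if ((flags & MASK_PCREL) && !t->cmodel_medium)
	flags &= ~MASK_PCREL;
    }

  /* Prefixed addressing (and hence PC-relative) requires -mcpu=future.  */
  if ((flags & MASK_PREFIXED) && !t->cpu_future)
    flags &= ~(MASK_PCREL | MASK_PREFIXED);

  /* PC-relative requires prefixed addressing.  */
  if ((flags & MASK_PCREL) && !(flags & MASK_PREFIXED))
    flags &= ~MASK_PCREL;

  return flags;
}
```

With linux64_model and no explicit options, both bits end up set; clearing
is_64bit or abi_elfv2 leaves neither set, even if -mpcrel was given explicitly.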

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


* [PATCH] V11 patch #7 of 15, Add new target_supports cases for -mcpu=future tests.
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
                   ` (5 preceding siblings ...)
  2019-12-21  0:03 ` [PATCH] V11 patch #6 of 15, Make -mpcrel the default for -mcpu=future on Linux 64-bit Michael Meissner
@ 2019-12-21  0:06 ` Michael Meissner
  2019-12-21  0:11 ` [PATCH] V11 patch #8 of 15, Add new tests for using PADDI and PLI with -mcpu=future Michael Meissner
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2019-12-21  0:06 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This is V10 patch #9.  It adds new target_supports tests for the new patches:
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00842.html

All of the new tests work with these target supports.  Can I check it into the
trunk?

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* lib/target-supports.exp (check_effective_target_powerpc_pcrel):
	New target for PowerPC -mcpu=future support.
	(check_effective_target_powerpc_prefixed_addr): New target for
	PowerPC -mcpu=future support.

Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	(revision 279547)
+++ gcc/testsuite/lib/target-supports.exp	(working copy)
@@ -2161,6 +2161,23 @@ proc check_p9modulo_hw_available { } {
     }]
 }
 
+# Return 1 if the target generates PC-relative instructions automatically
+proc check_effective_target_powerpc_pcrel { } {
+    return [check_no_messages_and_pattern powerpc_pcrel \
+	{\mpld\M.*[@]pcrel} assembly {
+	    static long s;
+	    long *p = &s;
+	    long foo (void) { return s; }
+	} {-O2 -mcpu=future}]
+}
+
+# Return 1 if the target generates prefixed instructions automatically
+proc check_effective_target_powerpc_prefixed_addr { } {
+    return [check_no_messages_and_pattern powerpc_prefixed_addr \
+	{\mpld\M} assembly {
+	    long foo (long *p) { return p[0x12345]; }
+	} {-O2 -mcpu=future}]
+}
 
 # Return 1 if the target supports executing FUTURE instructions, 0 otherwise.
 # Cache the result.  It is assumed that if a simulator does not support the



* [PATCH] V11 patch #8 of 15, Add new tests for using PADDI and PLI with -mcpu=future
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
                   ` (6 preceding siblings ...)
  2019-12-21  0:06 ` [PATCH] V11 patch #7 of 15, Add new target_supports cases for -mcpu=future tests Michael Meissner
@ 2019-12-21  0:11 ` Michael Meissner
  2019-12-21  0:12 ` [PATCH] V11 patch #9 of 15, Add test to validate generating prefixed memory when the offset is invalid for DS/DQ insns Michael Meissner
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2019-12-21  0:11 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This is V10 patch #10. It adds 3 new tests to verify that we generate PADDI/PLI
for large constants when -mcpu=future is used.
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00843.html

These tests pass when the preceding patches are applied.  Can I check this in?
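
As background on why the constants in these tests need PADDI/PLI: ADDI and LI
carry a 16-bit signed immediate, while the prefixed forms carry a 34-bit signed
immediate.  The following is a minimal sketch of that range check; the helper
names are hypothetical and are not taken from the rs6000 sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if VALUE fits in a signed immediate field that is BITS wide.  */
static bool
fits_signed (int64_t value, unsigned bits)
{
  assert (bits >= 1 && bits <= 63);
  int64_t limit = (int64_t) 1 << (bits - 1);
  return value >= -limit && value < limit;
}

/* ADDI/LI: 16-bit signed immediate.  */
static bool
fits_addi (int64_t value)
{
  return fits_signed (value, 16);
}

/* PADDI/PLI: 34-bit signed immediate.  */
static bool
fits_paddi (int64_t value)
{
  return fits_signed (value, 34);
}
```

Both 0x12345678 and 0x12345, the constants used in the tests, fail fits_addi
but pass fits_paddi, so a single prefixed instruction can replace the usual
two-instruction sequence.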

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/prefix-add.c: New test for -mcpu=future
	generating PADDI for large constant adds.
	* gcc.target/powerpc/prefix-di-constant.c: New test for
	-mcpu=future generating PLI to load up large DImode constants.
	* gcc.target/powerpc/prefix-si-constant.c: New test for
	-mcpu=future generating PLI to load up large SImode constants.

Index: gcc/testsuite/gcc.target/powerpc/prefix-add.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-add.c	(revision 279252)
+++ gcc/testsuite/gcc.target/powerpc/prefix-add.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PADDI is generated to add a large constant.  */
+unsigned long
+add (unsigned long a)
+{
+  return a + 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpaddi\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c	(revision 279252)
+++ gcc/testsuite/gcc.target/powerpc/prefix-di-constant.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant.  */
+unsigned long
+large (void)
+{
+  return 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c	(revision 279252)
+++ gcc/testsuite/gcc.target/powerpc/prefix-si-constant.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant for SImode.  */
+void
+large_si (unsigned int *p)
+{
+  *p = 0x12345U;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */



* [PATCH] V11 patch #9 of 15, Add test to validate generating prefixed memory when the offset is invalid for DS/DQ insns
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
                   ` (7 preceding siblings ...)
  2019-12-21  0:11 ` [PATCH] V11 patch #8 of 15, Add new tests for using PADDI and PLI with -mcpu=future Michael Meissner
@ 2019-12-21  0:12 ` Michael Meissner
  2019-12-21  0:22 ` [PATCH] V11 patch #10 of 15, Make sure we don't generate pre-modify prefixed insns with -mcpu=future Michael Meissner
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2019-12-21  0:12 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This is V10 patch #11.  This adds a new test to validate that for -mcpu=future,
we generate a prefixed load/store if the offset would have been illegal for a
non-prefixed DS or DQ instruction.
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00845.html

This test passes when I run the testsuite.  Can I check it in?
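
To make the DS/DQ restriction concrete: D-form instructions (LBZ, LHZ, LWZ,
LFS, LFD and the matching stores) accept any 16-bit signed offset, DS-form
instructions (LD, STD, LWA) need the bottom 2 bits of the offset to be zero,
and DQ-form instructions (LXV, STXV) need the bottom 4 bits to be zero.  A
minimal sketch of that rule follows; the enum and function names are
hypothetical and do not come from the rs6000 back end.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of the non-prefixed offset encodings.  */
enum insn_form { FORM_D, FORM_DS, FORM_DQ };

/* True if OFFSET can be encoded by a non-prefixed instruction of FORM.
   When this is false, a prefixed load/store (or an indexed form) is
   needed instead.  */
static bool
offset_ok_without_prefix (enum insn_form form, long offset)
{
  /* All three forms are limited to a 16-bit signed displacement.  */
  if (offset < -32768 || offset > 32767)
    return false;

  switch (form)
    {
    case FORM_D:
      return true;
    case FORM_DS:
      return (offset & 3) == 0;		/* multiple of 4 */
    case FORM_DQ:
      return (offset & 15) == 0;	/* multiple of 16 */
    }

  assert (!"unknown instruction form");
  return false;
}
```

With an offset of 1, only the DS-form and DQ-form cases are rejected, which is
why the test expects PLWA, PLD, PSTD, PLXV and PSTXV but plain LBZ, LHZ, LWZ,
LFS and LFD.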

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/prefix-ds-dq.c: New test to verify that we
	generate the prefix load/store instructions for traditional
	instructions with an offset that doesn't match DS/DQ
	requirements.

Index: gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c	(revision 279256)
+++ gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c	(working copy)
@@ -0,0 +1,156 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests whether we generate a prefixed load/store operation for addresses that
+   don't meet DS/DQ offset constraints.  */
+
+unsigned long
+load_uc_offset1 (unsigned char *p)
+{
+  return p[1];				/* should generate LBZ.  */
+}
+
+long
+load_sc_offset1 (signed char *p)
+{
+  return p[1];				/* should generate LBZ + EXTSB.  */
+}
+
+unsigned long
+load_us_offset1 (unsigned char *p)
+{
+  return *(unsigned short *)(p + 1);	/* should generate LHZ.  */
+}
+
+long
+load_ss_offset1 (unsigned char *p)
+{
+  return *(short *)(p + 1);		/* should generate LHA.  */
+}
+
+unsigned long
+load_ui_offset1 (unsigned char *p)
+{
+  return *(unsigned int *)(p + 1);	/* should generate LWZ.  */
+}
+
+long
+load_si_offset1 (unsigned char *p)
+{
+  return *(int *)(p + 1);		/* should generate PLWA.  */
+}
+
+unsigned long
+load_ul_offset1 (unsigned char *p)
+{
+  return *(unsigned long *)(p + 1);	/* should generate PLD.  */
+}
+
+long
+load_sl_offset1 (unsigned char *p)
+{
+  return *(long *)(p + 1);		/* should generate PLD.  */
+}
+
+float
+load_float_offset1 (unsigned char *p)
+{
+  return *(float *)(p + 1);		/* should generate LFS.  */
+}
+
+double
+load_double_offset1 (unsigned char *p)
+{
+  return *(double *)(p + 1);		/* should generate LFD.  */
+}
+
+__float128
+load_float128_offset1 (unsigned char *p)
+{
+  return *(__float128 *)(p + 1);	/* should generate PLXV.  */
+}
+
+void
+store_uc_offset1 (unsigned char uc, unsigned char *p)
+{
+  p[1] = uc;				/* should generate STB.  */
+}
+
+void
+store_sc_offset1 (signed char sc, signed char *p)
+{
+  p[1] = sc;				/* should generate STB.  */
+}
+
+void
+store_us_offset1 (unsigned short us, unsigned char *p)
+{
+  *(unsigned short *)(p + 1) = us;	/* should generate STH.  */
+}
+
+void
+store_ss_offset1 (signed short ss, unsigned char *p)
+{
+  *(signed short *)(p + 1) = ss;	/* should generate STH.  */
+}
+
+void
+store_ui_offset1 (unsigned int ui, unsigned char *p)
+{
+  *(unsigned int *)(p + 1) = ui;	/* should generate STW.  */
+}
+
+void
+store_si_offset1 (signed int si, unsigned char *p)
+{
+  *(signed int *)(p + 1) = si;		/* should generate STW.  */
+}
+
+void
+store_ul_offset1 (unsigned long ul, unsigned char *p)
+{
+  *(unsigned long *)(p + 1) = ul;	/* should generate PSTD.  */
+}
+
+void
+store_sl_offset1 (signed long sl, unsigned char *p)
+{
+  *(signed long *)(p + 1) = sl;		/* should generate PSTD.  */
+}
+
+void
+store_float_offset1 (float f, unsigned char *p)
+{
+  *(float *)(p + 1) = f;		/* should generate STFS.  */
+}
+
+void
+store_double_offset1 (double d, unsigned char *p)
+{
+  *(double *)(p + 1) = d;		/* should generate STFD.  */
+}
+
+void
+store_float128_offset1 (__float128 f128, unsigned char *p)
+{
+  *(__float128 *)(p + 1) = f128;	/* should generate PSTXV.  */
+}
+
+/* { dg-final { scan-assembler-times {\mextsb\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mlbz\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mlfd\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlfs\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlha\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlhz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlwz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mpld\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mplwa\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mplxv\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstb\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstfd\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mstfs\M}  1 } } */
+/* { dg-final { scan-assembler-times {\msth\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstw\M}   2 } } */



* [PATCH] V11 patch #10 of 15, Make sure we don't generate pre-modify prefixed insns with -mcpu=future
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
                   ` (8 preceding siblings ...)
  2019-12-21  0:12 ` [PATCH] V11 patch #9 of 15, Add test to validate generating prefixed memory when the offset is invalid for DS/DQ insns Michael Meissner
@ 2019-12-21  0:22 ` Michael Meissner
  2019-12-21  0:25 ` [PATCH] V11 patch #11 of 15, Add new tests for generating prefixed loads/stores on -mcpu=future with large offsets Michael Meissner
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2019-12-21  0:22 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This is V10 patch #12.  It adds a test to make sure we don't generate a
prefixed instruction with PRE_INC, PRE_DEC, or PRE_MODIFY.
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00846.html

This test passes when I run it.  Can I check this into the trunk?
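
The reason the test pads the structure is that the pointer step must be too
large for the 16-bit displacement of the non-prefixed update forms such as
LWZU.  A small sketch of that size argument follows; it reuses the structure
from the test, while MAX_D_FORM_OFFSET and stride_needs_prefix are hypothetical
names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SIZE 50000

struct foo {
  unsigned int field;
  char pad[SIZE];
};

/* Largest positive displacement a non-prefixed D-form instruction such as
   LWZU can encode (16-bit signed field).  */
#define MAX_D_FORM_OFFSET 32767

/* The step between consecutive elements is already out of range.  */
static_assert (sizeof (struct foo) > MAX_D_FORM_OFFSET,
	       "stride must not fit in a 16-bit displacement");

/* Return nonzero if stepping the pointer by one element cannot be folded
   into a non-prefixed load or store with update.  */
static int
stride_needs_prefix (void)
{
  return sizeof (struct foo) > MAX_D_FORM_OFFSET;
}
```

Since there is no prefixed load or store with update, the pointer adjustment
has to be emitted separately, which is the PADDI the test counts.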

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/prefix-no-premodify.c: Make sure we do not
	generate the non-existent PLWZU instruction if -mcpu=future.

Index: gcc/testsuite/gcc.target/powerpc/prefix-no-premodify.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-no-premodify.c	(revision 279259)
+++ gcc/testsuite/gcc.target/powerpc/prefix-no-premodify.c	(working copy)
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Make sure that we don't generate a prefixed form of the load and store with
+   update instructions (i.e. instead of generating LWZU we have to generate
+   PLWZ plus a PADDI).  */
+
+#ifndef SIZE
+#define SIZE 50000
+#endif
+
+struct foo {
+  unsigned int field;
+  char pad[SIZE];
+};
+
+struct foo *inc_load (struct foo *p, unsigned int *q)
+{
+  *q = (++p)->field;	/* PLWZ, PADDI, STW.  */
+  return p;
+}
+
+struct foo *dec_load (struct foo *p, unsigned int *q)
+{
+  *q = (--p)->field;	/* PLWZ, PADDI, STW.  */
+  return p;
+}
+
+struct foo *inc_store (struct foo *p, unsigned int *q)
+{
+  (++p)->field = *q;	/* LWZ, PADDI, PSTW.  */
+  return p;
+}
+
+struct foo *dec_store (struct foo *p, unsigned int *q)
+{
+  (--p)->field = *q;	/* LWZ, PADDI, PSTW.  */
+  return p;
+}
+
+/* { dg-final { scan-assembler-times {\mlwz\M}    2 } } */
+/* { dg-final { scan-assembler-times {\mstw\M}    2 } } */
+/* { dg-final { scan-assembler-times {\mpaddi\M}  4 } } */
+/* { dg-final { scan-assembler-times {\mplwz\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}   2 } } */
+/* { dg-final { scan-assembler-not   {\mplwzu\M}    } } */
+/* { dg-final { scan-assembler-not   {\mpstwu\M}    } } */
+/* { dg-final { scan-assembler-not   {\maddis\M}    } } */
+/* { dg-final { scan-assembler-not   {\maddi\M}     } } */



* [PATCH] V11 patch #12 of 15, Add new PC-relative tests for -mcpu=future
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
                   ` (10 preceding siblings ...)
  2019-12-21  0:25 ` [PATCH] V11 patch #11 of 15, Add new tests for generating prefixed loads/stores on -mcpu=future with large offsets Michael Meissner
@ 2019-12-21  0:25 ` Michael Meissner
  2019-12-21  0:33 ` [PATCH] V11 patch #13 of 15, Add test for -mcpu=future -fstack-protect-strong with large stacks Michael Meissner
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2019-12-21  0:25 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This is a reworking of patch V8 #5.  It adds a bunch of PC-relative tests for
the -mcpu=future target.
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00085.html

These tests pass when I run them.  Can I check them in?
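
To show where the expected instruction counts come from, here is roughly what
the shared header expands to when TYPE is long and all three functions are
enabled.  This is a simplified sketch: the typedef replaces the TYPE, ITYPE and
OTYPE macros, and self_check is a hypothetical helper that is not part of the
header.

```c
#include <assert.h>

typedef long TYPE;		/* stands in for "#define TYPE long" */

static TYPE a;
TYPE *p = &a;			/* as in the header */

void
add (TYPE b)
{
  a += b;			/* one load and one store of a */
}

TYPE
value (void)
{
  return a;			/* one load of a */
}

void
set (TYPE b)
{
  a = b;			/* one store of a */
}

/* Hypothetical host-side sanity check of the three functions.  */
static void
self_check (void)
{
  set (5);
  add (3);
  assert (value () == 8);
}
```

That gives two loads and two stores of a, hence four @pcrel references, which
matches the scan-assembler-times counts used for most of the types.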

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/prefix-pcrel.h: New set of tests to test
	prefixed addressing on 'future' system with PC-relative addresses
	for various types.
	* gcc.target/powerpc/prefix-pcrel-dd.c: New test for prefixed
	loads/stores with PC-relative addresses for the _Decimal64 type.
	* gcc.target/powerpc/prefix-pcrel-df.c: New test for prefixed
	loads/stores with PC-relative addresses for the double type.
	* gcc.target/powerpc/prefix-pcrel-di.c: New test for prefixed
	loads/stores with PC-relative addresses for the long type.
	* gcc.target/powerpc/prefix-pcrel-hi.c: New test for prefixed
	loads/stores with PC-relative addresses for the short type.
	* gcc.target/powerpc/prefix-pcrel-kf.c: New test for prefixed
	loads/stores with PC-relative addresses for the __float128 type.
	* gcc.target/powerpc/prefix-pcrel-qi.c: New test for prefixed
	loads/stores with PC-relative addresses for the signed char type.
	* gcc.target/powerpc/prefix-pcrel-sd.c: New test for prefixed
	loads/stores with PC-relative addresses for the _Decimal32 type.
	* gcc.target/powerpc/prefix-pcrel-sf.c: New test for prefixed
	loads/stores with PC-relative addresses for the float type.
	* gcc.target/powerpc/prefix-pcrel-si.c: New test for prefixed
	loads/stores with PC-relative addresses for the int type.
	* gcc.target/powerpc/prefix-pcrel-udi.c: New test for prefixed
	loads/stores with PC-relative addresses for the unsigned long
	type.
	* gcc.target/powerpc/prefix-pcrel-uhi.c: New test for prefixed
	loads/stores with PC-relative addresses for the unsigned short
	type.
	* gcc.target/powerpc/prefix-pcrel-uqi.c: New test for prefixed
	loads/stores with PC-relative addresses for the unsigned char
	type.
	* gcc.target/powerpc/prefix-pcrel-usi.c: New test for prefixed
	loads/stores with PC-relative addresses for the unsigned int
	type.
	* gcc.target/powerpc/prefix-pcrel-v2df.c: New test for prefixed
	loads/stores with PC-relative addresses for the vector double
	type.

Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h	(working copy)
@@ -0,0 +1,58 @@
+/* Common tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for each type.  */
+
+typedef signed char	schar;
+typedef unsigned char	uchar;
+typedef unsigned short	ushort;
+typedef unsigned int	uint;
+typedef unsigned long	ulong;
+typedef long double	ldouble;
+typedef vector double	v2df;
+typedef vector long	v2di;
+typedef vector float	v4sf;
+typedef vector int	v4si;
+
+#ifndef TYPE
+#define TYPE ulong
+#endif
+
+#ifndef ITYPE
+#define ITYPE TYPE
+#endif
+
+#ifndef OTYPE
+#define OTYPE TYPE
+#endif
+
+static TYPE a;
+TYPE *p = &a;
+
+#if !defined(DO_ADD) && !defined(DO_VALUE) && !defined(DO_SET)
+#define DO_ADD		1
+#define DO_VALUE	1
+#define DO_SET		1
+#endif
+
+#if DO_ADD
+void
+add (TYPE b)
+{
+  a += b;
+}
+#endif
+
+#if DO_VALUE
+OTYPE
+value (void)
+{
+  return (OTYPE)a;
+}
+#endif
+
+#if DO_SET
+void
+set (ITYPE b)
+{
+  a = (TYPE)b;
+}
+#endif
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the _Decimal64 type.  */
+
+#define TYPE _Decimal64
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel}  4 } } */
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the double type.  */
+
+#define TYPE double
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel}  4 } } */
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the long type.  */
+
+#define TYPE long
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel} 4 } } */
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the short type.  */
+
+#define TYPE short
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel}     4 } } */
+/* { dg-final { scan-assembler-times {\mplh[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the __float128 type.  */
+
+#define TYPE __float128
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel}  4 } } */
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the signed char type.  */
+
+#define TYPE signed char
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel}  4 } } */
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c	(working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the _Decimal32 type.  Note, the _Decimal32
+   type will not generate any prefixed load or stores, because there is no
+   prefixed load/store instruction to load up a vector register as a zero
+   extended 32-bit integer.  So we count the number of load addresses that are
+   generated.  */
+
+#define TYPE _Decimal32
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel} 3 } } */
+/* { dg-final { scan-assembler-times {\mpla\M}  3 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the float type.  */
+
+#define TYPE float
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel}  4 } } */
+/* { dg-final { scan-assembler-times {\mplfs\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfs\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the int type.  */
+
+#define TYPE int
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel}     4 } } */
+/* { dg-final { scan-assembler-times {\mplw[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the unsigned long type.  */
+
+#define TYPE unsigned long
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel} 4 } } */
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the unsigned short type.  */
+
+#define TYPE unsigned short
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel}  4 } } */
+/* { dg-final { scan-assembler-times {\mplhz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the unsigned char type.  */
+
+#define TYPE unsigned char
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel}  4 } } */
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the unsigned int type.  */
+
+#define TYPE unsigned int
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel}  4 } } */
+/* { dg-final { scan-assembler-times {\mplwz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c	(revision 279322)
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for the vector double type.  */
+
+#define TYPE vector double
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {[@]pcrel}  4 } } */
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */



* [PATCH] V11 patch #11 of 15, Add new tests for generating prefixed loads/stores on -mcpu=future with large offsets
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
                   ` (9 preceding siblings ...)
  2019-12-21  0:22 ` [PATCH] V11 patch #10 of 15, Make sure we don't generate pre-modify prefixed insns with -mcpu=future Michael Meissner
@ 2019-12-21  0:25 ` Michael Meissner
  2019-12-21  0:25 ` [PATCH] V11 patch #12 of 15, Add new PC-relative tests for -mcpu=future Michael Meissner
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2019-12-21  0:25 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This is a reworking of the tests I submitted previously in V8 #4.  It generates
a bunch of loads and stores for various types using large addresses, and
verifies that the number of prefixed loads and stores is correct.
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00084.html
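
For illustration only (this is not part of the patch): a small sketch of the
arithmetic behind "large addresses".  It takes the CONSTANT index from
prefix-large.h and computes the byte offset of p[CONSTANT] for a given element
size.  The helper names byte_offset, exceeds_16_bits and fits_34_bits are made
up for this sketch.

```c
#include <stdint.h>

/* Same index as CONSTANT in prefix-large.h.  */
#define CONSTANT 0x123450L

/* Byte offset of p[CONSTANT] when each element is ELEM_SIZE bytes.  */
static int64_t
byte_offset (int64_t elem_size)
{
  return CONSTANT * elem_size;
}

/* Nonzero if OFFSET is outside the signed 16-bit range.  */
static int
exceeds_16_bits (int64_t offset)
{
  return offset < INT16_MIN || offset > INT16_MAX;
}

/* Nonzero if OFFSET fits in a signed 34-bit field.  */
static int
fits_34_bits (int64_t offset)
{
  int64_t limit = (int64_t) 1 << 33;
  return offset >= -limit && offset < limit;
}
```

Even for a 1-byte element the offset is 0x123450, which is outside the signed
16-bit range, and for a 16-byte element it is 0x1234500, which still fits in
34 bits.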

This patch works when I run the testsuite.  Can I check it in?

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/prefix-large.h: New set of tests to test
	prefixed addressing on 'future' system with large numeric offsets
	for various types.
	* gcc.target/powerpc/prefix-large-dd.c: New test for prefixed
	loads/stores with large offsets for the _Decimal64 type.
	* gcc.target/powerpc/prefix-large-df.c: New test for prefixed
	loads/stores with large offsets for the double type.
	* gcc.target/powerpc/prefix-large-di.c: New test for prefixed
	loads/stores with large offsets for the long type.
	* gcc.target/powerpc/prefix-large-hi.c: New test for prefixed
	loads/stores with large offsets for the short type.
	* gcc.target/powerpc/prefix-large-kf.c: New test for prefixed
	loads/stores with large offsets for the __float128 type.
	* gcc.target/powerpc/prefix-large-qi.c: New test for prefixed
	loads/stores with large offsets for the signed char type.
	* gcc.target/powerpc/prefix-large-sd.c: New test for prefixed
	loads/stores with large offsets for the _Decimal32 type.
	* gcc.target/powerpc/prefix-large-sf.c: New test for prefixed
	loads/stores with large offsets for the float type.
	* gcc.target/powerpc/prefix-large-si.c: New test for prefixed
	loads/stores with large offsets for the int type.
	* gcc.target/powerpc/prefix-large-udi.c: New test for prefixed
	loads/stores with large offsets for the unsigned long type.
	* gcc.target/powerpc/prefix-large-uhi.c: New test for prefixed
	loads/stores with large offsets for the unsigned short type.
	* gcc.target/powerpc/prefix-large-uqi.c: New test for prefixed
	loads/stores with large offsets for the unsigned char type.
	* gcc.target/powerpc/prefix-large-usi.c: New test for prefixed
	loads/stores with large offsets for the unsigned int type.
	* gcc.target/powerpc/prefix-large-v2df.c: New test for prefixed
	loads/stores with large offsets for the vector double type.

Index: gcc/testsuite/gcc.target/powerpc/prefix-large.h
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large.h	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large.h	(working copy)
@@ -0,0 +1,59 @@
+/* Common tests for prefixed instructions testing whether we can generate a
+   34-bit offset using 1 instruction.  */
+
+typedef signed char	schar;
+typedef unsigned char	uchar;
+typedef unsigned short	ushort;
+typedef unsigned int	uint;
+typedef unsigned long	ulong;
+typedef long double	ldouble;
+typedef vector double	v2df;
+typedef vector long	v2di;
+typedef vector float	v4sf;
+typedef vector int	v4si;
+
+#ifndef TYPE
+#define TYPE ulong
+#endif
+
+#ifndef ITYPE
+#define ITYPE TYPE
+#endif
+
+#ifndef OTYPE
+#define OTYPE TYPE
+#endif
+
+#if !defined(DO_ADD) && !defined(DO_VALUE) && !defined(DO_SET)
+#define DO_ADD		1
+#define DO_VALUE	1
+#define DO_SET		1
+#endif
+
+#ifndef CONSTANT
+#define CONSTANT	0x123450UL
+#endif
+
+#if DO_ADD
+void
+add (TYPE *p, TYPE a)
+{
+  p[CONSTANT] += a;
+}
+#endif
+
+#if DO_VALUE
+OTYPE
+value (TYPE *p)
+{
+  return p[CONSTANT];
+}
+#endif
+
+#if DO_SET
+void
+set (TYPE *p, ITYPE a)
+{
+  p[CONSTANT] = a;
+}
+#endif
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for _Decimal64 objects.  */
+
+#define TYPE _Decimal64
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-df.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for double objects.  */
+
+#define TYPE double
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-di.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-di.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-di.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for long objects.  */
+
+#define TYPE long
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for short objects.  */
+
+#define TYPE short
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplh[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for __float128 objects.  */
+
+#define TYPE __float128
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for signed char objects.  */
+
+#define TYPE signed char
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c	(working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for _Decimal32 objects.  */
+
+#define TYPE _Decimal32
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mpaddi\M|\mpli\M|\mpla\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mlfiwzx\M}                2 } } */
+/* { dg-final { scan-assembler-times {\mstfiwx\M}                2 } } */
+
+
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for float objects.  */
+
+#define TYPE float
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfs\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfs\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-si.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-si.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-si.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for int objects.  */
+
+#define TYPE int
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplw[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}     2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for unsigned long
+   objects.  */
+
+#define TYPE unsigned long
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for unsigned short
+   objects.  */
+
+#define TYPE unsigned short
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplhz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for unsigned char
+   objects.  */
+
+#define TYPE unsigned char
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplbz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstb\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for unsigned int
+   objects.  */
+
+#define TYPE unsigned int
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplwz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c	(revision 279319)
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset for vector objects.  */
+
+#define TYPE vector double
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V11 patch #13 of 15, Add test for -mcpu=future -fstack-protect-strong with large stacks
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
                   ` (11 preceding siblings ...)
  2019-12-21  0:25 ` [PATCH] V11 patch #12 of 15, Add new PC-relative tests for -mcpu=future Michael Meissner
@ 2019-12-21  0:33 ` Michael Meissner
  2019-12-21  1:23 ` [PATCH] V11 patch #14 of 15, Add tests for vec_extract from memory with PC-relative addrss Michael Meissner
  2019-12-21  1:25 ` [PATCH] V11 patch #15 of 15, Add tests for -mcpu=future vec_extract from memory with a large offset Michael Meissner
  14 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2019-12-21  0:33 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

This is patch V8 #6.  It makes sure the stack protect insns work when
-mcpu=future and -fstack-protector-strong are used together.  We discovered
this failure when we attempted to build GLIBC using -mcpu=future.
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00089.html

This test now passes when I run it as part of the test suite.  Can I check it
in to the trunk?

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/prefix-stack-protect.c: New test to make sure
	-fstack-protector-strong works with prefixed addressing.

Index: gcc/testsuite/gcc.target/powerpc/prefix-stack-protect.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/prefix-stack-protect.c	(revision 279324)
+++ gcc/testsuite/gcc.target/powerpc/prefix-stack-protect.c	(working copy)
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future -fstack-protector-strong" } */
+
+/* Test that we can handle large stack frames with -fstack-protector-strong and
+   prefixed addressing.  This was originally discovered in trying to build
+   glibc with -mcpu=future, and vfwprintf.c failed because it used
+   -fstack-protector-strong.  */
+
+extern long foo (char *);
+
+long
+bar (void)
+{
+  char buffer[0x20000];
+  return foo (buffer) + 1;
+}
+
+/* { dg-final { scan-assembler {\mpld\M}  } } */
+/* { dg-final { scan-assembler {\mpstd\M} } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V11 patch #14 of 15, Add tests for vec_extract from memory with PC-relative addrss
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
                   ` (12 preceding siblings ...)
  2019-12-21  0:33 ` [PATCH] V11 patch #13 of 15, Add test for -mcpu=future -fstack-protect-strong with large stacks Michael Meissner
@ 2019-12-21  1:23 ` Michael Meissner
  2019-12-21  1:25 ` [PATCH] V11 patch #15 of 15, Add tests for -mcpu=future vec_extract from memory with a large offset Michael Meissner
  14 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2019-12-21  1:23 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

These tests are new.  These tests check that the vector extract from a vector
in memory works correctly for both constant and variable element numbers.
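
As a rough portable model of the variable element case (an illustration, not
the code GCC generates): the element number is kept in bounds, scaled to a byte
offset, and the scalar is loaded from the vector's base address plus that
offset, which is the base register plus index register shape of an X-FORM load.
The helper name extract_float_var is hypothetical, and the masking with
num_elements - 1 assumes the element count is a power of 2.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not GCC code: fetch element N of a vector of
   NUM_ELEMENTS floats stored at BASE.  NUM_ELEMENTS must be a power of 2.  */
static float
extract_float_var (const void *base, unsigned long n, unsigned num_elements)
{
  /* Keep the element number in bounds by masking it.  */
  unsigned long elem = n & (num_elements - 1);

  /* Scale the element number to a byte offset.  */
  size_t byte_off = elem * sizeof (float);

  /* Load the scalar from base address + byte offset.  */
  float result;
  memcpy (&result, (const char *) base + byte_off, sizeof result);
  return result;
}
```

With a constant element number the byte offset is known at compile time, so it
can be folded into the address instead of being kept in a second register.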

These tests pass with all of the previous patches applied.  Can I check these
patches into the trunk?

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/vec-extract-pcrel-si.c: New test for
	vec_extract from a PC-relative address.
	* gcc.target/powerpc/vec-extract-pcrel-di.c: New test for
	vec_extract from a PC-relative address.
	* gcc.target/powerpc/vec-extract-pcrel-sf.c: New test for
	vec_extract from a PC-relative address.
	* gcc.target/powerpc/vec-extract-pcrel-df.c: New test for
	vec_extract from a PC-relative address.

Index: gcc/testsuite/gcc.target/powerpc/vec-extract-pcrel-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-pcrel-df.c	(revision 279615)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-pcrel-df.c	(working copy)
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test if we can support vec_extract on V2DF vectors with a PC-relative
+   address.  */
+
+#include <altivec.h>
+
+#ifndef TYPE
+#define TYPE double
+#endif
+
+static vector TYPE v;
+vector TYPE *p = &v;
+
+TYPE
+get0 (void)
+{
+  return vec_extract (v, 0);
+}
+
+TYPE
+get1 (void)
+{
+  return vec_extract (v, 1);
+}
+
+TYPE
+getn (unsigned long n)
+{
+  return vec_extract (v, n);
+}
+
+/* { dg-final { scan-assembler-times {[@]pcrel}  3 } } */
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpla\M}   1 } } */
Index: gcc/testsuite/gcc.target/powerpc/vec-extract-pcrel-di.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-pcrel-di.c	(revision 279615)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-pcrel-di.c	(working copy)
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test if we can support vec_extract on V2DI vectors with a PC-relative
+   address.  */
+
+#include <altivec.h>
+
+#ifndef TYPE
+#define TYPE unsigned long
+#endif
+
+static vector TYPE v;
+vector TYPE *p = &v;
+
+TYPE
+get0 (void)
+{
+  return vec_extract (v, 0);
+}
+
+TYPE
+get1 (void)
+{
+  return vec_extract (v, 1);
+}
+
+TYPE
+getn (unsigned long n)
+{
+  return vec_extract (v, n);
+}
+
+/* { dg-final { scan-assembler-times {[@]pcrel}  3 } } */
+/* { dg-final { scan-assembler-times {\mpld\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mpla\M}   1 } } */
Index: gcc/testsuite/gcc.target/powerpc/vec-extract-pcrel-sf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-pcrel-sf.c	(revision 279615)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-pcrel-sf.c	(working copy)
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test if we can support vec_extract on V4SF vectors with a PC-relative
+   address.  */
+
+#include <altivec.h>
+
+#ifndef TYPE
+#define TYPE float
+#endif
+
+static vector TYPE v;
+vector TYPE *p = &v;
+
+TYPE
+get0 (void)
+{
+  return vec_extract (v, 0);
+}
+
+TYPE
+get1 (void)
+{
+  return vec_extract (v, 1);
+}
+
+TYPE
+getn (unsigned long n)
+{
+  return vec_extract (v, n);
+}
+
+/* { dg-final { scan-assembler-times {[@]pcrel}  3 } } */
+/* { dg-final { scan-assembler-times {\mplfs\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpla\M}   1 } } */
Index: gcc/testsuite/gcc.target/powerpc/vec-extract-pcrel-si.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-pcrel-si.c	(revision 279615)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-pcrel-si.c	(working copy)
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test if we can support vec_extract on V4SI vectors with a PC-relative
+   address.  */
+
+#include <altivec.h>
+
+#ifndef TYPE
+#define TYPE unsigned int
+#endif
+
+static vector TYPE v;
+vector TYPE *p = &v;
+
+TYPE
+get0 (void)
+{
+  return vec_extract (v, 0);
+}
+
+TYPE
+get1 (void)
+{
+  return vec_extract (v, 1);
+}
+
+TYPE
+getn (unsigned long n)
+{
+  return vec_extract (v, n);
+}
+
+/* { dg-final { scan-assembler-times {[@]pcrel}  3 } } */
+/* { dg-final { scan-assembler-times {\mplwz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpla\M}   1 } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* [PATCH] V11 patch #15 of 15, Add tests for -mcpu=future vec_extract from memory with a large offset
  2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
                   ` (13 preceding siblings ...)
  2019-12-21  1:23 ` [PATCH] V11 patch #14 of 15, Add tests for vec_extract from memory with PC-relative addrss Michael Meissner
@ 2019-12-21  1:25 ` Michael Meissner
  14 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2019-12-21  1:25 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, David Edelsohn

These are new tests.  They verify that when you do a vec_extract of a vector in
memory, where the vector's address contains a large offset and the element
number is constant, the compiler generates a prefixed load instruction with
-mcpu=future.

Once all of the other V11 patches are checked in, can I check this patch into
the trunk?

2019-12-20  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/vec-extract-large-si.c: New test for
	vec_extract from a vector unsigned int in memory with a large
	offset.
	* gcc.target/powerpc/vec-extract-large-di.c: New test for
	vec_extract from a vector long in memory with a large offset.
	* gcc.target/powerpc/vec-extract-large-sf.c: New test for
	vec_extract from a vector float in memory with a large offset.
	* gcc.target/powerpc/vec-extract-large-df.c: New test for
	vec_extract from a vector double in memory with a large offset.

Index: gcc/testsuite/gcc.target/powerpc/vec-extract-large-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-large-df.c	(revision 279691)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-large-df.c	(working copy)
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test if we generate prefixed loads for vec_extract of a vector double in
+   memory, and the memory address has a large offset.  */
+
+#include <altivec.h>
+
+#ifndef TYPE
+#define TYPE double
+#endif
+
+#ifndef LARGE
+#define LARGE 0x50000
+#endif
+
+TYPE
+get0 (vector TYPE *p)
+{
+  return vec_extract (p[LARGE], 0);		/* PLFD.  */
+}
+
+TYPE
+get1 (vector TYPE *p)
+{
+  return vec_extract (p[LARGE], 1);		/* PLFD.  */
+}
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/vec-extract-large-di.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-large-di.c	(revision 279691)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-large-di.c	(working copy)
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test if we generate prefixed loads for vec_extract of a vector unsigned long
+   in memory, and the memory address has a large offset.  */
+
+#include <altivec.h>
+
+#ifndef TYPE
+#define TYPE unsigned long
+#endif
+
+#ifndef LARGE
+#define LARGE 0x50000
+#endif
+
+TYPE
+get0 (vector TYPE *p)
+{
+  return vec_extract (p[LARGE], 0);		/* PLD.  */
+}
+
+TYPE
+get1 (vector TYPE *p)
+{
+  return vec_extract (p[LARGE], 1);		/* PLD.  */
+}
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/vec-extract-large-sf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-large-sf.c	(revision 279691)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-large-sf.c	(working copy)
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test if we generate prefixed loads for vec_extract of a vector float in
+   memory, and the memory address has a large offset.  */
+
+#include <altivec.h>
+
+#ifndef TYPE
+#define TYPE float
+#endif
+
+#ifndef LARGE
+#define LARGE 0x50000
+#endif
+
+TYPE
+get0 (vector TYPE *p)
+{
+  return vec_extract (p[LARGE], 0);		/* PLFS.  */
+}
+
+TYPE
+get1 (vector TYPE *p)
+{
+  return vec_extract (p[LARGE], 1);		/* PLFS.  */
+}
+
+/* { dg-final { scan-assembler-times {\mplfs\M}  2 } } */
Index: gcc/testsuite/gcc.target/powerpc/vec-extract-large-si.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-large-si.c	(revision 279691)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-large-si.c	(working copy)
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test if we generate prefixed loads for vec_extract of a vector unsigned int
+   in memory, and the memory address has a large offset.  */
+
+#include <altivec.h>
+
+#ifndef TYPE
+#define TYPE unsigned int
+#endif
+
+#ifndef LARGE
+#define LARGE 0x50000
+#endif
+
+TYPE
+get0 (vector TYPE *p)
+{
+  return vec_extract (p[LARGE], 0);		/* PLWZ.  */
+}
+
+TYPE
+get1 (vector TYPE *p)
+{
+  return vec_extract (p[LARGE], 1);		/* PLWZ.  */
+}
+
+/* { dg-final { scan-assembler-times {\mplwz\M}  2 } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH] V11 patch #1 of 15, Fix bug in vec_extract
  2019-12-20 23:28 ` [PATCH] V11 patch #1 of 15, Fix bug in vec_extract Michael Meissner
@ 2019-12-22 14:06   ` Segher Boessenkool
  0 siblings, 0 replies; 27+ messages in thread
From: Segher Boessenkool @ 2019-12-22 14:06 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn

Hi!

On Fri, Dec 20, 2019 at 06:24:57PM -0500, Michael Meissner wrote:
> This patch fixes the bug pointed out in the V10 patch review that the code
> modified an input argument to vector extract with a variable element number.

Great, thanks.

> With this patch applied, the compiler will signal an error.  FWIW, I did build
> all of Spec 2017 and Spec 2006 with this patch applied, but not the others, and
> we did not get an assertion failure.

Please document (at the start of rs6000_adjust_vec_address, maybe in the
function comment even) which arguments can be the same register and which
not, that kind of thing?  That makes it much simpler to check that all
callers are okay (and do that as well), but even more importantly it makes
it more likely that it won't come back to bite us later.

> --- gcc/config/rs6000/rs6000.c	(revision 279549)
> +++ gcc/config/rs6000/rs6000.c	(working copy)
> @@ -6757,6 +6757,8 @@ rs6000_adjust_vec_address (rtx scalar_re
>  
>        else
>  	{
> +	  /* If we are called from rs6000_split_vec_extract_var, base_tmp may
> +	     be the same as element.  */

This comment isn't very useful here (it is confusing even, I'd say).
Move this comment up?  Make it part of what I propose above?

> @@ -6825,6 +6827,11 @@ rs6000_adjust_vec_address (rtx scalar_re
>  
>  	  else
>  	    {
> +	      /* Make sure base_tmp is not the same as element_offset.  This
> +		 can happen if the element number is variable and the address
> +		 is not a simple address.  Otherwise we lose the offset, and
> +		 double the address.  */
> +	      gcc_assert (!reg_mentioned_p (base_tmp, element_offset));

Otherwise we ICE, certainly after adding the assert ;-)

Asserts often do not need documentation at all.  If they do, usually
something unexpected is going on.  Rewriting things a bit can help.

The comment isn't very exact, btw...  "is not the same as"...  the assert
actually tests if the base_tmp is used in element_offset at all; the latter
can be a more complicated construct.
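
To make the distinction concrete, here is a toy sketch (it uses a made-up
expression type, not GCC's rtx, and the toy_* names are hypothetical).  One
predicate answers "is this expression exactly that register", the other
answers "does that register appear anywhere inside this expression", which is
the kind of question the assert is asking.

```c
#include <stddef.h>

/* Toy expression node; a stand-in for an rtx, not the real type.  */
enum toy_code { TOY_REG, TOY_CONST, TOY_PLUS };

struct toy_expr
{
  enum toy_code code;
  int regno;			/* Used when code == TOY_REG.  */
  long value;			/* Used when code == TOY_CONST.  */
  const struct toy_expr *op0;	/* Used when code == TOY_PLUS.  */
  const struct toy_expr *op1;	/* Used when code == TOY_PLUS.  */
};

/* "Is the same as": X is exactly the register REG.  */
static int
toy_same_reg (const struct toy_expr *reg, const struct toy_expr *x)
{
  return x->code == TOY_REG && x->regno == reg->regno;
}

/* "Is mentioned in": REG appears anywhere inside X.  */
static int
toy_reg_mentioned (const struct toy_expr *reg, const struct toy_expr *x)
{
  if (x == NULL)
    return 0;

  switch (x->code)
    {
    case TOY_REG:
      return x->regno == reg->regno;
    case TOY_PLUS:
      return (toy_reg_mentioned (reg, x->op0)
	      || toy_reg_mentioned (reg, x->op1));
    default:
      return 0;
    }
}
```

For an expression like (plus (reg 3) (const 8)), the first predicate says
register 3 is not "the same as" the expression, while the second says it is
mentioned in it.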

> +      /* Make sure base_tmp is not the same as element_offset.  This can happen
> +	 if the element number is variable and the address is not a simple
> +	 address.  Otherwise we lose the offset, and double the address.  */
> +      gcc_assert (!reg_mentioned_p (base_tmp, element_offset));

Same here.

> @@ -6902,9 +6913,10 @@ rs6000_split_vec_extract_var (rtx dest,
>        int num_elements = GET_MODE_NUNITS (mode);
>        rtx num_ele_m1 = GEN_INT (num_elements - 1);
>  
> -      emit_insn (gen_anddi3 (element, element, num_ele_m1));
> +      /* Make sure the element number is in bounds.  */
>        gcc_assert (REG_P (tmp_gpr));

How does that make sure the number is in bounds?


In general, do asserts as early as practical?


Segher

* Re: [PATCH] V11 patch #2 of 15, Use prefixed load for vector extract with large offset
  2019-12-20 23:47 ` [PATCH] V11 patch #2 of 15, Use prefixed load for vector extract with large offset Michael Meissner
@ 2019-12-22 17:24   ` Segher Boessenkool
  2020-01-07  1:41     ` [PATCH, committed] " Michael Meissner
  0 siblings, 1 reply; 27+ messages in thread
From: Segher Boessenkool @ 2019-12-22 17:24 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn

Hi!

On Fri, Dec 20, 2019 at 06:38:32PM -0500, Michael Meissner wrote:
> --- gcc/config/rs6000/rs6000.c	(revision 279553)
> +++ gcc/config/rs6000/rs6000.c	(working copy)
> @@ -6792,9 +6792,17 @@ rs6000_adjust_vec_address (rtx scalar_re
>  	  HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset);
>  	  rtx offset_rtx = GEN_INT (offset);
>  
> -	  if (IN_RANGE (offset, -32768, 32767)
> +	  /* 16-bit offset.  */
> +	  if (SIGNED_INTEGER_16BIT_P (offset)
>  	      && (scalar_size < 8 || (offset & 0x3) == 0))
>  	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);

We probably should have a macro for this, hrm.  The
reg_or_aligned_short_operand predicate is the closest we have right now.

> +	  /* 34-bit offset if we have prefixed addresses.  */
> +	  else if (TARGET_PREFIXED_ADDR && SIGNED_INTEGER_34BIT_P (offset))
> +	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);

This is cint34_operand.

And maybe we want both in one, for convenience?

(Something for the future of course, not this patch).
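
As a rough illustration of the "both in one" idea, the three-way choice in
the quoted hunk can be sketched as a single classification helper.  This is
a standalone toy, not GCC code: fits_signed_16bit, fits_signed_34bit,
classify_offset and the enum names are all hypothetical, and the
have_prefixed flag merely stands in for TARGET_PREFIXED_ADDR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the 16-bit and 34-bit signed range tests.  */
static bool
fits_signed_16bit (int64_t v)
{
  return v >= -32768 && v <= 32767;
}

static bool
fits_signed_34bit (int64_t v)
{
  return v >= -((int64_t) 1 << 33) && v <= ((int64_t) 1 << 33) - 1;
}

/* Which addressing form the offset ends up needing (names are made up).  */
enum offset_form { OFFSET_D_FORM, OFFSET_PREFIXED, OFFSET_X_FORM };

/* Mirror the order of the tests in the quoted hunk: a short, suitably
   aligned offset first, then a 34-bit offset if prefixed addressing is
   available, otherwise fall back to register + register.  */
static enum offset_form
classify_offset (int64_t offset, unsigned scalar_size, bool have_prefixed)
{
  if (fits_signed_16bit (offset)
      && (scalar_size < 8 || (offset & 0x3) == 0))
    return OFFSET_D_FORM;

  if (have_prefixed && fits_signed_34bit (offset))
    return OFFSET_PREFIXED;

  return OFFSET_X_FORM;
}
```

With this sketch, an 8-byte scalar at offset 18 is not accepted as a short
offset because of the alignment test, so it takes the prefixed form when
that is available and the register + register form otherwise.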

> +	  /* Offset overflowed, move offset to the temporary (which will likely
> +	     be split), and do X-FORM addressing.  */
>  	  else
>  	    {

The comment should go here, instead (after the {).

> +	  /* Make sure we don't overwrite the temporary if the element being
> +	     extracted is variable, and we've put the offset into base_tmp
> +	     previously.  */
> +	  else if (rtx_equal_p (base_tmp, element_offset))
> +	    emit_insn (gen_add2_insn (base_tmp, op1));

Register equality (in the same mode, as we have here) is just "==".  Is
that what we need here, or should it be reg_mentioned_p?
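
The difference between the two tests can be shown on a toy expression tree.
This is only a sketch under stated assumptions: struct expr, same_reg_p and
reg_used_in_p are invented here and are not GCC's rtx, rtx_equal_p or
reg_mentioned_p; comparing register numbers stands in for the "=="
comparison mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression kinds; a real RTL expression has many more.  */
enum expr_kind { EXPR_REG, EXPR_PLUS, EXPR_SHIFT };

/* Toy expression node: either a register or a binary operation.  */
struct expr
{
  enum expr_kind kind;
  int regno;                /* Valid when kind == EXPR_REG.  */
  const struct expr *op0;   /* Operands of a binary operation.  */
  const struct expr *op1;
};

/* True only if X is exactly the register REG.  */
static bool
same_reg_p (const struct expr *reg, const struct expr *x)
{
  return x->kind == EXPR_REG && x->regno == reg->regno;
}

/* True if REG appears anywhere inside X, however deeply nested.  */
static bool
reg_used_in_p (const struct expr *reg, const struct expr *x)
{
  if (x == NULL)
    return false;
  if (x->kind == EXPR_REG)
    return x->regno == reg->regno;
  return reg_used_in_p (reg, x->op0) || reg_used_in_p (reg, x->op1);
}
```

If the offset expression is, say, a shift whose operand is the temporary,
the equality test says "no" while the containment test says "yes"; the
second answer is the one that protects against overwriting the temporary.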


This whole function is too complex (and it writes TARGET_POWERPC64 where
it needs TARGET_64BIT, for example).


The patch is okay for trunk (with the comment moved, and the rtx_equal_p
fixed).  Thanks!


Segher

* Re: [PATCH] V11 patch #3 of 15, Use 'Q' constraint for variable vector extract from memory
  2019-12-20 23:49 ` [PATCH] V11 patch #3 of 15, Use 'Q' constraint for variable vector extract from memory Michael Meissner
@ 2019-12-22 17:49   ` Segher Boessenkool
  2020-01-07  1:43     ` [PATCH, committed] " Michael Meissner
  0 siblings, 1 reply; 27+ messages in thread
From: Segher Boessenkool @ 2019-12-22 17:49 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn

Hi!

On Fri, Dec 20, 2019 at 06:47:28PM -0500, Michael Meissner wrote:
> Then I realized that eventually we will want to generate an X-FORM (register +
> register) address, and it was just simpler to use the 'Q' constraint, and have
> the register allocator put the address into a register.

Yep, good call.

> 	* config/rs6000/vsx.md (vsx_extract_<mode>_var, VSX_D iterator):
> 	Use 'Q' for memory constraints because we need to do an X-FORM
> 	load with the variable index.
> 	(vsx_extract_v4sf_var): Use 'Q' for memory constraints because we
> 	need to do an X-FORM load with the variable index.

This comment is a headscratcher -- but you shouldn't say "why" in
changelogs at all, so that is an easy fix ;-)

> 	(vsx_extract_<mode>_var, VSX_EXTRACT_I iterator):Use 'Q' for

(missing space)

> 	memory constraints because we need to do an X-FORM load with the
> 	variable index.
> 	(vsx_extract_<mode>_<VS_scalar>mode_var): Use 'Q' for memory
> 	constraints because we need to do an X-FORM load with the variable
> 	index.

(and more)

> -;; Variable V2DI/V2DF extract
> +;; Variable V2DI/V2DF extract.  Use 'Q' for the memory because we will
> +;; ultimately have to convert the address into base + index.

Maybe just don't write anything at all, since it is hard to explain in a
few words?  It is clear that "Q" is not a usual constraint, anyway :-)

Okay for trunk like that.  Thanks!


Segher

* Re: [PATCH] V11 patch #4 of 15, Update 'Q' constraint documentation.
  2019-12-20 23:56 ` [PATCH] V11 patch #4 of 15, Update 'Q' constraint documentation Michael Meissner
@ 2019-12-22 20:02   ` Segher Boessenkool
  2020-01-07  1:45     ` [PATCH, committed] " Michael Meissner
  0 siblings, 1 reply; 27+ messages in thread
From: Segher Boessenkool @ 2019-12-22 20:02 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn

On Fri, Dec 20, 2019 at 06:49:30PM -0500, Michael Meissner wrote:
> In doing V11 patch #3, I noticed that the documentation for the 'Q' was
> misleading.

It originally was used just for lswi/stswi, which can access up to the
first 32 bytes of storage pointed to by the register.  But yes, the
current comment is confusing.

> 	* config/rs6000/constraints.md (Q constraint): Update
> 	documentation.
> 	* doc/md.tet (PowerPC constraints): Update 'Q' constraint
> 	documentation.

"md.tet"?  That's an interesting typo :-)

>  (define_memory_constraint "Q"
> -  "Memory operand that is an offset from a register (it is usually better
> -to use @samp{m} or @samp{es} in @code{asm} statements)"
> +  "A memory operand whose address which uses a single register with no offset."

Arm has

(define_memory_constraint "Q"
 "@internal
  An address that is a single base register."
 (and (match_code "mem")
      (match_test "REG_P (XEXP (op, 0))")))

which is more correct for us (the register cannot be r0!)

But it is not an address.

Maybe "A memory operand addressed by just a base register." ?

Okay for trunk like that.  Thanks!


Segher

* Re: [PATCH] V11 patch #5 of 15, Optimize vec_extract of a vector in memory with a PC-relative address
  2019-12-21  0:00 ` [PATCH] V11 patch #5 of 15, Optimize vec_extract of a vector in memory with a PC-relative address Michael Meissner
@ 2019-12-25  6:41   ` Segher Boessenkool
  2020-01-06 20:55     ` Michael Meissner
                       ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Segher Boessenkool @ 2019-12-25  6:41 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn

Hi!

On Fri, Dec 20, 2019 at 06:55:53PM -0500, Michael Meissner wrote:
> 	* config/rs6000/rs6000.c (rs6000_reg_to_addr_mask): New helper
> 	function to identify the address mask of a hard register.

Do this as a separate patch please.  That refactoring is pre-approved.
Please explain in the function comment what an "address mask" is.  Or
better yet, don't call it a "mask", it isn't a mask?

Also various of the names here still have "reload" in it, which doesn't
really make much sense.

rs6000_mode_to_addressing_flags?  And a reg_to for this new one?
Something like that.

> +  /* For references to local static variables, try to fold a constant offset
> +     into the address.  */
> +  else if (pcrel_local_address (addr, Pmode) && CONST_INT_P (element_offset))
> +    {
> +      if (GET_CODE (addr) == CONST)
> +	addr = XEXP (addr, 0);
> +
> +      if (GET_CODE (addr) == PLUS)
> +	{
> +	  rtx op0 = XEXP (addr, 0);
> +	  rtx op1 = XEXP (addr, 1);
> +	  if (CONST_INT_P (op1))
> +	    {
> +	      HOST_WIDE_INT offset
> +		= INTVAL (XEXP (addr, 1)) + INTVAL (element_offset);
> +
> +	      if (offset == 0)
> +		new_addr = op0;
> +
> +	      else if (SIGNED_INTEGER_34BIT_P (offset))
> +		{
> +		  rtx plus = gen_rtx_PLUS (Pmode, op0, GEN_INT (offset));
> +		  new_addr = gen_rtx_CONST (Pmode, plus);
> +		}
> +
> +	      else
> +		{
> +		  emit_move_insn (base_tmp, addr);
> +		  new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
> +		}
> +	    }
> +	  else
> +	    {
> +	      emit_move_insn (base_tmp, addr);
> +	      new_addr = gen_rtx_PLUS (Pmode, base_tmp, element_offset);
> +	    }
> +	}
> +
> +      else
> +	{
> +	  rtx plus = gen_rtx_PLUS (Pmode, addr, element_offset);
> +	  new_addr = gen_rtx_CONST (Pmode, plus);
> +	}
> +    }

This adds four new if's and four new else's, indented three deep.  Please
write it as a separate function?  Something like "pcrel_adjust_address",
adding an extra offset?

Or not pcrel perhaps, you can do other addresses in the same routine?

Either way, adjusting the address is useful more often than for extracts,
and is a much more general thing to do.  So please split that out from
the existing code, as a separate patch again, and then add to that?
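
To make the suggestion concrete, here is a minimal sketch of what a
standalone "add an extra offset to an address" helper could look like.  It
is a toy model, not the requested GCC function: struct sym_addr,
fold_extra_offset and the result names are hypothetical, the address is
modelled as symbol + addend, and both values are assumed small enough that
their sum does not overflow int64_t.

```c
#include <stdint.h>

/* Toy model of a symbolic address: SYMBOL + ADDEND.  */
struct sym_addr
{
  const char *symbol;
  int64_t addend;
};

/* Either the offset was folded in, or the caller has to load the address
   into a base register and add the offset separately.  */
enum fold_result { FOLD_OK, FOLD_NEEDS_BASE_REG };

/* Try to fold EXTRA into ADDR.  The fold succeeds only if the combined
   addend still fits in a signed 34-bit field; otherwise ADDR is left
   untouched.  Assumes addend + extra does not overflow int64_t.  */
static enum fold_result
fold_extra_offset (struct sym_addr *addr, int64_t extra)
{
  int64_t total = addr->addend + extra;

  if (total < -((int64_t) 1 << 33) || total > ((int64_t) 1 << 33) - 1)
    return FOLD_NEEDS_BASE_REG;

  addr->addend = total;
  return FOLD_OK;
}
```

Keeping the range check and the fallback decision in one small routine is
what lets the caller shrink to a single call plus one "else" branch.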

> +  /* If we have a PC-relative address, check if offsetable loads are
> +     allowed.  */
> +  else if (pcrel_local_address (new_addr, Pmode))
> +    {
> +      addr_mask_type addr_mask
> +	= rs6000_reg_to_addr_mask (scalar_reg, scalar_mode);
> +
> +      valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
> +    }

That comment could be better, too?  (And two letters "t" in offsettable).

Thanks,


Segher

* Re: [PATCH] V11 patch #5 of 15, Optimize vec_extract of a vector in memory with a PC-relative address
  2019-12-25  6:41   ` Segher Boessenkool
@ 2020-01-06 20:55     ` Michael Meissner
  2020-01-06 21:01     ` Michael Meissner
  2020-01-07  1:48     ` [PATCH, committed] " Michael Meissner
  2 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2020-01-06 20:55 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, David Edelsohn

On Tue, Dec 24, 2019 at 10:24:55AM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Dec 20, 2019 at 06:55:53PM -0500, Michael Meissner wrote:
> > 	* config/rs6000/rs6000.c (rs6000_reg_to_addr_mask): New helper
> > 	function to identify the address mask of a hard register.
> 
> Do this as a separate patch please.  That refactoring is pre-approved.
> Please explain in the function comment what an "address mask" is.  Or
> better yet, don't call it a "mask", it isn't a mask?

It is called mask because everywhere else in rs6000.c uses 'addr_mask' or just
mask.  It is a mask of valid bits.

> Also various of the names here still have "reload" in it, which doesn't
> really make much sense.

When these functions were written, it was in the context of supporting the
secondary reload functions, and so reload was in the name.

I will make a refactoring patch that uses the current names.  If we want to
change all of the uses we can in a future patch.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH] V11 patch #5 of 15, Optimize vec_extract of a vector in memory with a PC-relative address
  2019-12-25  6:41   ` Segher Boessenkool
  2020-01-06 20:55     ` Michael Meissner
@ 2020-01-06 21:01     ` Michael Meissner
  2020-01-07  1:48     ` [PATCH, committed] " Michael Meissner
  2 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2020-01-06 21:01 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, David Edelsohn

On Tue, Dec 24, 2019 at 10:24:55AM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Dec 20, 2019 at 06:55:53PM -0500, Michael Meissner wrote:
> > 	* config/rs6000/rs6000.c (rs6000_reg_to_addr_mask): New helper
> > 	function to identify the address mask of a hard register.
> 
> Do this as a separate patch please.  That refactoring is pre-approved.
> Please explain in the function comment what an "address mask" is.  Or
> better yet, don't call it a "mask", it isn't a mask?
> 
> Also various of the names here still have "reload" in it, which doesn't
> really make much sense.
> 
> rs6000_mode_to_addressing_flags?  And a reg_to for this new one?
> Something like that.

Note, rs6000_mode_to_addressing_flags also does not fit the usage.  The key is
to return the address mask of the valid addressing options that needs both a
hard register and a mode.  Mode by itself is not useful, since loading up
SImode to vector registers requires X_FORM, while the same mode in GPR
registers can of course do D_FORM and X_FORM addressing.
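
A small sketch of why the lookup needs two keys.  Everything here is a toy:
the enum names, the TOY_ADDR_* bits and the table are hypothetical and far
smaller than the real reg_addr[] data; the only entries filled in are the
two SImode cases described above.

```c
#include <stdint.h>

/* Toy modes and register classes.  */
enum toy_mode { TOY_SImode, TOY_NUM_MODES };
enum toy_reg_class { TOY_GPR, TOY_VMX, TOY_NUM_CLASSES };

/* One bit per addressing form that is valid.  */
#define TOY_ADDR_OFFSET  0x1u   /* register + constant (D_FORM) */
#define TOY_ADDR_INDEXED 0x2u   /* register + register (X_FORM) */

/* Valid addressing forms, indexed by mode and then by register class.  */
static const uint8_t toy_addr_mask[TOY_NUM_MODES][TOY_NUM_CLASSES] = {
  [TOY_SImode] = {
    [TOY_GPR] = TOY_ADDR_OFFSET | TOY_ADDR_INDEXED,
    [TOY_VMX] = TOY_ADDR_INDEXED,
  },
};

/* Return the mask of valid addressing forms for MODE in RCLASS.  */
static uint8_t
toy_valid_addressing (enum toy_mode mode, enum toy_reg_class rclass)
{
  return toy_addr_mask[mode][rclass];
}
```

Testing a result against TOY_ADDR_OFFSET gives different answers for the
same mode depending on the register class, which is the point being made
about a mode-only interface.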

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH, committed] V11 patch #2 of 15, Use prefixed load for vector extract with large offset
  2019-12-22 17:24   ` Segher Boessenkool
@ 2020-01-07  1:41     ` Michael Meissner
  0 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2020-01-07  1:41 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, David Edelsohn

On Sun, Dec 22, 2019 at 11:10:09AM -0600, Segher Boessenkool wrote:
> The patch is okay for trunk (with the comment moved, and the rtx_equal_p
> fixed).  Thanks!

Here is the patch I committed (subversion id 279937):

2020-01-06  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_adjust_vec_address): Add support
	for the offset being 34-bits when -mcpu=future is used.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 279910)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6797,11 +6797,19 @@ rs6000_adjust_vec_address (rtx scalar_re
 	  HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset);
 	  rtx offset_rtx = GEN_INT (offset);
 
-	  if (IN_RANGE (offset, -32768, 32767)
+	  /* 16-bit offset.  */
+	  if (SIGNED_INTEGER_16BIT_P (offset)
 	      && (scalar_size < 8 || (offset & 0x3) == 0))
 	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
+	  /* 34-bit offset if we have prefixed addresses.  */
+	  else if (TARGET_PREFIXED_ADDR && SIGNED_INTEGER_34BIT_P (offset))
+	    new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
 	  else
 	    {
+	      /* Offset overflowed, move offset to the temporary (which will
+		 likely be split), and do X-FORM addressing.  */
 	      emit_move_insn (base_tmp, offset_rtx);
 	      new_addr = gen_rtx_PLUS (Pmode, op0, base_tmp);
 	    }
@@ -6830,6 +6838,12 @@ rs6000_adjust_vec_address (rtx scalar_re
 	      emit_insn (insn);
 	    }
 
+	  /* Make sure we don't overwrite the temporary if the element being
+	     extracted is variable, and we've put the offset into base_tmp
+	     previously.  */
+	  else if (reg_mentioned_p (base_tmp, element_offset))
+	    emit_insn (gen_add2_insn (base_tmp, op1));
+
 	  else
 	    {
 	      emit_move_insn (base_tmp, op1);

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH, committed] V11 patch #3 of 15, Use 'Q' constraint for variable vector extract from memory
  2019-12-22 17:49   ` Segher Boessenkool
@ 2020-01-07  1:43     ` Michael Meissner
  0 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2020-01-07  1:43 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, David Edelsohn

On Sun, Dec 22, 2019 at 11:24:51AM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Dec 20, 2019 at 06:47:28PM -0500, Michael Meissner wrote:
> > Then I realized that eventually we will want to generate an X-FORM (register +
> > register) address, and it was just simpler to use the 'Q' constraint, and have
> > the register allocator put the address into a register.
> 
> Yep, good call.
> 
> > 	* config/rs6000/vsx.md (vsx_extract_<mode>_var, VSX_D iterator):
> > 	Use 'Q' for memory constraints because we need to do an X-FORM
> > 	load with the variable index.
> > 	(vsx_extract_v4sf_var): Use 'Q' for memory constraints because we
> > 	need to do an X-FORM load with the variable index.
> 
> This comment is a headscratcher -- but you shouldn't say "why" in
> changelogs at all, so that is an easy fix ;-)
> 
> > 	(vsx_extract_<mode>_var, VSX_EXTRACT_I iterator):Use 'Q' for
> 
> (missing space)
> 
> > 	memory constraints because we need to do an X-FORM load with the
> > 	variable index.
> > 	(vsx_extract_<mode>_<VS_scalar>mode_var): Use 'Q' for memory
> > 	constraints because we need to do an X-FORM load with the variable
> > 	index.
> 
> (and more)
> 
> > -;; Variable V2DI/V2DF extract
> > +;; Variable V2DI/V2DF extract.  Use 'Q' for the memory because we will
> > +;; ultimately have to convert the address into base + index.
> 
> Maybe just don't write anything at all, since it is hard to explain in a
> few words?  It is clear that "Q" is not a usual constraint, anyway :-)
> 
> Okay for trunk like that.  Thanks!

This is the patch I committed (subversion id 279938):

2020-01-06  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/vsx.md (vsx_extract_<mode>_var, VSX_D iterator):
	Use 'Q' for doing vector extract from memory.
	(vsx_extract_v4sf_var): Use 'Q' for doing vector extract from
	memory.
	(vsx_extract_<mode>_var, VSX_EXTRACT_I iterator): Use 'Q' for
	doing vector extract from memory.
	(vsx_extract_<mode>_<VS_scalar>mode_var): Use 'Q' for doing vector
	extract from memory.

Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 279910)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -3248,7 +3248,7 @@ (define_insn "vsx_vslo_<mode>"
 ;; Variable V2DI/V2DF extract
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,wa,r")
-	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,m,m")
+	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,Q,Q")
 			     (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 			    UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
@@ -3318,7 +3318,7 @@ (define_insn_and_split "*vsx_extract_v4s
 ;; Variable V4SF extract
 (define_insn_and_split "vsx_extract_v4sf_var"
   [(set (match_operand:SF 0 "gpc_reg_operand" "=wa,wa,?r")
-	(unspec:SF [(match_operand:V4SF 1 "input_operand" "v,m,m")
+	(unspec:SF [(match_operand:V4SF 1 "input_operand" "v,Q,Q")
 		    (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 		   UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
@@ -3681,7 +3681,7 @@ (define_insn_and_split "*vsx_extract_<mo
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(unspec:<VS_scalar>
-	 [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
+	 [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,Q")
 	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,r,&b"))
@@ -3701,7 +3701,7 @@ (define_insn_and_split "*vsx_extract_<mo
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(zero_extend:<VS_scalar>
 	 (unspec:<VSX_EXTRACT_I:VS_scalar>
-	  [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,m")
+	  [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,Q")
 	   (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	  UNSPEC_VSX_EXTRACT)))
    (clobber (match_scratch:DI 3 "=r,r,&b"))

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH, committed] V11 patch #4 of 15, Update 'Q' constraint documentation.
  2019-12-22 20:02   ` Segher Boessenkool
@ 2020-01-07  1:45     ` Michael Meissner
  0 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2020-01-07  1:45 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, David Edelsohn

On Sun, Dec 22, 2019 at 11:49:19AM -0600, Segher Boessenkool wrote:
> On Fri, Dec 20, 2019 at 06:49:30PM -0500, Michael Meissner wrote:
> > In doing V11 patch #3, I noticed that the documentation for the 'Q' was
> > misleading.
> 
> It originally was used just for lswi/stswi, which can access up to the
> first 32 bytes of storage pointed to by the register.  But yes, the
> current comment is confusing.
> 
> > 	* config/rs6000/constraints.md (Q constraint): Update
> > 	documentation.
> > 	* doc/md.tet (PowerPC constraints): Update 'Q' constraint
> > 	documentation.
> 
> "md.tet"?  That's an interesting typo :-)
> 
> >  (define_memory_constraint "Q"
> > -  "Memory operand that is an offset from a register (it is usually better
> > -to use @samp{m} or @samp{es} in @code{asm} statements)"
> > +  "A memory operand whose address which uses a single register with no offset."
> 
> Arm has
> 
> (define_memory_constraint "Q"
>  "@internal
>   An address that is a single base register."
>  (and (match_code "mem")
>       (match_test "REG_P (XEXP (op, 0))")))
> 
> which is more correct for us (the register cannot be r0!)
> 
> But it is not an address.
> 
> Maybe "A memory operand addressed by just a base register." ?
> 
> Okay for trunk like that.  Thanks!

This is the patch I committed (subversion ids 279939 and 279940).

2020-01-06  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/constraints.md (Q constraint): Update
	documentation.
	* doc/md.texi (RS/6000 constraints): Update 'Q' constraint
	documentation.

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 279910)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -211,8 +211,7 @@ several times, or that might not access
        (match_test "GET_RTX_CLASS (GET_CODE (XEXP (op, 0))) != RTX_AUTOINC")))
 
 (define_memory_constraint "Q"
-  "Memory operand that is an offset from a register (it is usually better
-to use @samp{m} or @samp{es} in @code{asm} statements)"
+  "A memory operand addressed by just a base register."
   (and (match_code "mem")
        (match_test "REG_P (XEXP (op, 0))")))
 
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 279910)
+++ gcc/doc/md.texi	(working copy)
@@ -3381,8 +3381,7 @@ allowed when @samp{<} or @samp{>} is use
 as @samp{m} without @samp{<} and @samp{>}.
 
 @item Q
-Memory operand that is an offset from a register (it is usually better
-to use @samp{m} or @samp{es} in @code{asm} statements)
+A memory operand addressed by just a base register.
 
 @item Z
 Memory operand that is an indexed or indirect from a register (it is

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

* Re: [PATCH, committed] V11 patch #5 of 15, Optimize vec_extract of a vector in memory with a PC-relative address
  2019-12-25  6:41   ` Segher Boessenkool
  2020-01-06 20:55     ` Michael Meissner
  2020-01-06 21:01     ` Michael Meissner
@ 2020-01-07  1:48     ` Michael Meissner
  2 siblings, 0 replies; 27+ messages in thread
From: Michael Meissner @ 2020-01-07  1:48 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Michael Meissner, gcc-patches, David Edelsohn

On Tue, Dec 24, 2019 at 10:24:55AM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Dec 20, 2019 at 06:55:53PM -0500, Michael Meissner wrote:
> > 	* config/rs6000/rs6000.c (rs6000_reg_to_addr_mask): New helper
> > 	function to identify the address mask of a hard register.
> 
> Do this as a separate patch please.  That refactoring is pre-approved.
> Please explain in the function comment what an "address mask" is.  Or
> better yet, don't call it a "mask", it isn't a mask?

I committed this patch for the refactoring (subversion id 279941).  I will
submit the other pieces later.

2020-01-06  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (hard_reg_and_mode_to_addr_mask): New
	helper function to return the valid addressing formats for a given
	hard register and mode.
	(rs6000_adjust_vec_address): Call hard_reg_and_mode_to_addr_mask.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 279912)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6729,6 +6729,30 @@ rs6000_expand_vector_extract (rtx target
     }
 }
 
+/* Helper function to return an address mask based on a physical register.  */
+
+static addr_mask_type
+hard_reg_and_mode_to_addr_mask (rtx reg, machine_mode mode)
+{
+  unsigned int r = reg_or_subregno (reg);
+  addr_mask_type addr_mask;
+
+  gcc_assert (HARD_REGISTER_NUM_P (r));
+  if (INT_REGNO_P (r))
+    addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_GPR];
+
+  else if (FP_REGNO_P (r))
+    addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_FPR];
+
+  else if (ALTIVEC_REGNO_P (r))
+    addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_VMX];
+
+  else
+    gcc_unreachable ();
+
+  return addr_mask;
+}
+
 /* Adjust a memory address (MEM) of a vector type to point to a scalar field
    within the vector (ELEMENT) with a mode (SCALAR_MODE).  Use a base register
    temporary (BASE_TMP) to fixup the address.  Return the new memory address
@@ -6865,21 +6889,8 @@ rs6000_adjust_vec_address (rtx scalar_re
   if (GET_CODE (new_addr) == PLUS)
     {
       rtx op1 = XEXP (new_addr, 1);
-      addr_mask_type addr_mask;
-      unsigned int scalar_regno = reg_or_subregno (scalar_reg);
-
-      gcc_assert (HARD_REGISTER_NUM_P (scalar_regno));
-      if (INT_REGNO_P (scalar_regno))
-	addr_mask = reg_addr[scalar_mode].addr_mask[RELOAD_REG_GPR];
-
-      else if (FP_REGNO_P (scalar_regno))
-	addr_mask = reg_addr[scalar_mode].addr_mask[RELOAD_REG_FPR];
-
-      else if (ALTIVEC_REGNO_P (scalar_regno))
-	addr_mask = reg_addr[scalar_mode].addr_mask[RELOAD_REG_VMX];
-
-      else
-	gcc_unreachable ();
+      addr_mask_type addr_mask
+	= hard_reg_and_mode_to_addr_mask (scalar_reg, scalar_mode);
 
       if (REG_P (op1) || SUBREG_P (op1))
 	valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

end of thread, other threads:[~2020-01-07  1:48 UTC | newest]

Thread overview: 27+ messages
2019-12-20 23:15 PowerPC -mcpu=future patches, V11 Michael Meissner
2019-12-20 23:28 ` [PATCH] V11 patch #1 of 15, Fix bug in vec_extract Michael Meissner
2019-12-22 14:06   ` Segher Boessenkool
2019-12-20 23:47 ` [PATCH] V11 patch #2 of 15, Use prefixed load for vector extract with large offset Michael Meissner
2019-12-22 17:24   ` Segher Boessenkool
2020-01-07  1:41     ` [PATCH, committed] " Michael Meissner
2019-12-20 23:49 ` [PATCH] V11 patch #3 of 15, Use 'Q' constraint for variable vector extract from memory Michael Meissner
2019-12-22 17:49   ` Segher Boessenkool
2020-01-07  1:43     ` [PATCH, committed] " Michael Meissner
2019-12-20 23:56 ` [PATCH] V11 patch #4 of 15, Update 'Q' constraint documentation Michael Meissner
2019-12-22 20:02   ` Segher Boessenkool
2020-01-07  1:45     ` [PATCH, committed] " Michael Meissner
2019-12-21  0:00 ` [PATCH] V11 patch #5 of 15, Optimize vec_extract of a vector in memory with a PC-relative address Michael Meissner
2019-12-25  6:41   ` Segher Boessenkool
2020-01-06 20:55     ` Michael Meissner
2020-01-06 21:01     ` Michael Meissner
2020-01-07  1:48     ` [PATCH, committed] " Michael Meissner
2019-12-21  0:03 ` [PATCH] V11 patch #6 of 15, Make -mpcrel the default for -mcpu=future on Linux 64-bit Michael Meissner
2019-12-21  0:06 ` [PATCH] V11 patch #7 of 15, Add new target_supports cases for -mcpu=future tests Michael Meissner
2019-12-21  0:11 ` [PATCH] V11 patch #8 of 15, Add new tests for using PADDI and PLI with -mcpu=future Michael Meissner
2019-12-21  0:12 ` [PATCH] V11 patch #9 of 15, Add test to validate generating prefixed memory when the offset is invalid for DS/DQ insns Michael Meissner
2019-12-21  0:22 ` [PATCH] V11 patch #10 of 15, Make sure we don't generate pre-modify prefixed insns with -mcpu=future Michael Meissner
2019-12-21  0:25 ` [PATCH] V11 patch #11 of 15, Add new tests for generating prefixed loads/stores on -mcpu=future with large offsets Michael Meissner
2019-12-21  0:25 ` [PATCH] V11 patch #12 of 15, Add new PC-relative tests for -mcpu=future Michael Meissner
2019-12-21  0:33 ` [PATCH] V11 patch #13 of 15, Add test for -mcpu=future -fstack-protect-strong with large stacks Michael Meissner
2019-12-21  1:23 ` [PATCH] V11 patch #14 of 15, Add tests for vec_extract from memory with PC-relative addrss Michael Meissner
2019-12-21  1:25 ` [PATCH] V11 patch #15 of 15, Add tests for -mcpu=future vec_extract from memory with a large offset Michael Meissner
