[PATCH], PowerPC: Allow DImode in Altivec registers

* [PATCH], PowerPC: Allow DImode in Altivec registers
@ 2016-06-13 17:28 Michael Meissner
  2016-06-13 18:30 ` Michael Meissner
  0 siblings, 1 reply; 6+ messages in thread
From: Michael Meissner @ 2016-06-13 17:28 UTC
  To: gcc-patches, Segher Boessenkool, David Edelsohn, Bill Schmidt

This patch goes through the PowerPC compiler and adds support to allow DImode
(64-bit integers) into Altivec registers for VSX systems.  It also adds some
support to allow loading some DImode constants via either ISA 2.07 or ISA 3.0
instructions.
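
For illustration only (this sketch is not part of the patch or its tests, and the function names are made up): the 64-bit integer temporaries in code like the following are DImode values. Assuming a VSX target with the new -mupper-regs-di option in effect, the register allocator may keep such values in an Altivec register rather than restricting them to the traditional floating point registers. Whether it actually does so depends on the target flags and on register pressure.

```c
#include <stdint.h>

/* Hypothetical example.  The int64_t result of the conversion is a DImode
   value; on PowerPC the conversion is typically done in a floating point
   or vector register before the value is moved to where it is needed.  */
int64_t
truncate_to_di (double x)
{
  return (int64_t) x;
}

/* Hypothetical example.  The intermediate integer is converted back to
   double, so it is a candidate for staying in a vector register for its
   whole lifetime.  The compiler is free to do the addition in a GPR.  */
double
add_small_constant (double x)
{
  int64_t i = (int64_t) x + 5;
  return (double) i;
}
```

Nothing in the source needs to change to benefit from the patch; the difference shows up only in the generated assembly.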

I have bootstrapped this with no regressions on both a big endian power7 system
and a little endian power8 system.

I have run the SPEC 2006 INT tests with these changes, and the run times were
comparable between the original compiler and the compiler with the changes.

Are these changes ok to install in the trunk?  Assuming they go in the trunk,
can I install them in the 6.2 branch if they cause no regression?

Note, I will be away from the office, starting Thursday afternoon (June 16th,
2016) and I will return on Monday (June 20th, 2016).  I will not have easy
access to email during this time.

[gcc]
2016-06-13  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/vsx.md (VSINT_84): Add DImode to enable loading
	DImode constants with XXSPLTIB in vector registers.
	(vsx_extract_<mode>, V2DImode/V2DFmode): Combine both
	vsx_extract_<mode>_internal{1,2} into a single insn that handles
	direct move (both ISA 2.07 and ISA 3.0 versions), and optimizes
	extraction of the element at the top of the register as a scalar
	value.
	(vsx_extract_<mode>_internal1): Likewise.
	(vsx_extract_<mode>_internal2): Likewise.
	* config/rs6000/constraints.md (wi constraint): Remove a comment
	about DImode not being allowed in Altivec registers.
	(wB constraint): New constraint for constants that can be
	generated in Altivec registers with VSPLTISW/VUPKHSW.
	* config/rs6000/predicates.md (xxspltib_constant_split): Update
	comments.
	(xxspltib_constant_nosplit): Likewise.
	* config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Add
	support for -mupper-regs-di to enable DImode to go into Altivec
	registers.
	(POWERPC_MASKS): Likewise.
	(power7 cpu): Likewise.
	* config/rs6000/rs6000.opt (-mupper-regs-di): Likewise.
	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Add support
	for DImode being allowed in Altivec registers.  Update wi/wj
	constraints.  Set scalar_in_vmx_p flag.
	(rs6000_option_override_internal): Add checks for -mupper-regs-di.
	(xxspltib_constant_p): Allow CONST_INT's with VOIDmode.  Don't
	return true if we could use VSPLTISW/VUPKHSW instead of XXSPLTIB.
	(rs6000_opt_masks): Add -mupper-regs-di.
	* config/rs6000/rs6000.md (lfiwax): Update clobbers that don't use
	direct move to use wi and not wj.
	(lfiwzx): Likewise.
	(floatsi<mode>2_lfiwax_mem): Combine alternatives into a single
	alternative.
	(floatunssi<mode>2_lfiwzx_mem): Likewise.
	(fix_trunc<mode>di2_fctidz): Change second alternative to allow
	any VSX register, instead of just Altivec registers, to allow
	either operand to be an Altivec register or both.
	(fixuns_trunc<mode>di2_fctiduz): Likewise.
	(movdi_internal32): Add support for -mupper-regs-di.  Add support
	to load constants via XXSPLTIB or VSPLTISW.  Add spacing to allow
	the alternatives and attributes to be lined up to be easier to
	read.
	(movdi_internal64): Likewise.
	(64-bit DImode splitters): Change predicates so that only loads of
	constants into GPRs are split.  Add splits for using XXSPLTIB or
	VSPLTISW to load constants in ISA 3.0 or ISA 2.07 respectively.
	* doc/invoke.texi (RS/6000 and PowerPC Options): Document
	-mupper-regs-di.  Update -mupper-regs-df and -mupper-regs-sf to
	mention -mcpu=power9 sets these options.
	* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
	wB constraint.
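
The constant-loading entries above can be summarized with a simplified model. This is a sketch under stated assumptions, not the code in the patch: the enum, the function names, and the two boolean parameters are hypothetical, VSX is assumed to be enabled, and the real checks in xxspltib_constant_p and the wB constraint also look at the mode and at which register classes are valid. The -16..15 range matches the signed 5-bit test (EASY_VECTOR_15) that the patch uses to prefer VSPLTISW over XXSPLTIB.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical model of how a DImode constant headed for a
   vector register might be materialized.  */
enum di_const_load
{
  DI_CONST_ZERO_OR_MINUS1,	/* 0 or -1: single-instruction cases.  */
  DI_CONST_VSPLTISW,		/* -16..15 on ISA 2.07: VSPLTISW/VUPKHSW.  */
  DI_CONST_XXSPLTIB,		/* -128..127 on ISA 3.0: XXSPLTIB-based.  */
  DI_CONST_OTHER		/* Not covered by these alternatives.  */
};

/* Signed 5-bit range, the same range as EASY_VECTOR_15.  */
static bool
fits_signed_5bit (int64_t value)
{
  return value >= -16 && value <= 15;
}

enum di_const_load
classify_di_constant (int64_t value, bool isa_2_07, bool isa_3_0)
{
  if (value == 0 || value == -1)
    return DI_CONST_ZERO_OR_MINUS1;

  /* Prefer the VSPLTISW form when the constant fits in 5 bits.  */
  if (isa_2_07 && fits_signed_5bit (value))
    return DI_CONST_VSPLTISW;

  /* Otherwise an 8-bit signed constant can use XXSPLTIB on ISA 3.0.  */
  if (isa_3_0 && value >= -128 && value <= 127)
    return DI_CONST_XXSPLTIB;

  return DI_CONST_OTHER;
}
```

For example, under this model the constant 5 is classified as a VSPLTISW case whenever ISA 2.07 is available, while 100 needs ISA 3.0.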

[gcc/testsuite]
2016-06-13  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/p9-dimode1.c: New test.
	* gcc.target/powerpc/p9-dimode2.c: Likewise.
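
As a side note on the vsx_extract_<mode> entry in the [gcc] ChangeLog: when the element cannot be extracted with a plain move or a direct move, the combined insn falls back to XXPERMDI, and the immediate depends on the element number and on endianness. The following stand-alone sketch models that computation; the function name and the boolean parameter are hypothetical stand-ins for the insn's operand and for BYTES_BIG_ENDIAN.

```c
#include <stdbool.h>

/* Hypothetical model of the XXPERMDI immediate used when extracting a
   64-bit element from a V2DI/V2DF register.  ELEMENT is the vector
   element number and must be 0 or 1.  */
int
xxpermdi_dm_for_extract (int element, bool bytes_big_endian)
{
  int dm = element << 1;

  /* Element numbering is reversed on little endian systems.  */
  if (!bytes_big_endian)
    dm = 3 - dm;

  return dm;
}
```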

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797


* Re: [PATCH], PowerPC: Allow DImode in Altivec registers
  2016-06-13 17:28 [PATCH], PowerPC: Allow DImode in Altivec registers Michael Meissner
@ 2016-06-13 18:30 ` Michael Meissner
  2016-06-14 22:54   ` Segher Boessenkool
  0 siblings, 1 reply; 6+ messages in thread
From: Michael Meissner @ 2016-06-13 18:30 UTC
  To: gcc-patches, Segher Boessenkool, David Edelsohn, Bill Schmidt

[-- Attachment #1: Type: text/plain, Size: 3990 bytes --]

It would help if I included the patch.

On Mon, Jun 13, 2016 at 01:28:16PM -0400, Michael Meissner wrote:
> This patch goes through the PowerPC compiler and adds support to allow DImode
> (64-bit integers) into Altivec registers for VSX systems.  It also adds some
> support to allow loading some DImode constants via either ISA 2.07 or ISA 3.0
> instructions.
> 
> I have bootstrapped this with no regressions on both a big endian power7 system
> and a little endian power8 system.
> 
> I have run the SPEC 2006 INT tests with these changes, and the run times were
> comparable between the original compiler and the compiler with the changes.
> 
> Are these changes ok to install in the trunk?  Assuming they go in the trunk,
> can I install them in the 6.2 branch if they cause no regression?
> 
> Note, I will be away from the office, starting Thursday afternoon (June 16th,
> 2016) and I will return on Monday (June 20th, 2016).  I will not have easy
> access to email during this time.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[-- Attachment #2: gcc-stage7.dimode003b --]
[-- Type: text/plain, Size: 32000 bytes --]

Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/vsx.md	(.../gcc/config/rs6000)	(working copy)
@@ -260,7 +260,7 @@ (define_mode_attr VS_64reg [(V2DF	"ws")
 			    (V2DI	"wi")])
 
 ;; Iterators for loading constants with xxspltib
-(define_mode_iterator VSINT_84  [V4SI V2DI])
+(define_mode_iterator VSINT_84  [V4SI V2DI DI])
 (define_mode_iterator VSINT_842 [V8HI V4SI V2DI])
 
 ;; Constants for creating unspecs
@@ -2095,77 +2095,69 @@ (define_insn "vsx_set_<mode>"
   [(set_attr "type" "vecperm")])
 
 ;; Extract a DF/DI element from V2DF/V2DI
-(define_expand "vsx_extract_<mode>"
-  [(set (match_operand:<VS_scalar> 0 "register_operand" "")
-	(vec_select:<VS_scalar> (match_operand:VSX_D 1 "register_operand" "")
-		       (parallel
-			[(match_operand:QI 2 "u5bit_cint_operand" "")])))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "")
-
 ;; Optimize cases were we can do a simple or direct move.
 ;; Or see if we can avoid doing the move at all
-(define_insn "*vsx_extract_<mode>_internal1"
-  [(set (match_operand:<VS_scalar> 0 "register_operand" "=d,<VS_64reg>,r,r")
+
+;; There are some unresolved problems with reload that show up if an Altivec
+;; register was picked.  Limit the scalar value to FPRs for now.
+
+(define_insn "vsx_extract_<mode>"
+  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand"
+            "=d,     wm,      wo,    d")
+
 	(vec_select:<VS_scalar>
-	 (match_operand:VSX_D 1 "register_operand" "d,<VS_64reg>,<VS_64dm>,<VS_64dm>")
+	 (match_operand:VSX_D 1 "gpc_reg_operand"
+            "<VSa>, <VSa>,  <VSa>,  <VSa>")
+
 	 (parallel
-	  [(match_operand:QI 2 "vsx_scalar_64bit" "wD,wD,wD,wL")])))]
-  "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
+	  [(match_operand:QI 2 "const_0_to_1_operand"
+            "wD,    wD,     wL,     n")])))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)"
 {
+  int element = INTVAL (operands[2]);
   int op0_regno = REGNO (operands[0]);
   int op1_regno = REGNO (operands[1]);
+  int fldDM;
 
-  if (op0_regno == op1_regno)
-    return "nop";
-
-  if (INT_REGNO_P (op0_regno))
-    return ((INTVAL (operands[2]) == VECTOR_ELEMENT_MFVSRLD_64BIT)
-	    ? "mfvsrdl %0,%x1"
-	    : "mfvsrd %0,%x1");
+  gcc_assert (IN_RANGE (element, 0, 1));
+  gcc_assert (VSX_REGNO_P (op1_regno));
 
-  if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno))
-    return "fmr %0,%1";
+  if (element == VECTOR_ELEMENT_SCALAR_64BIT)
+    {
+      if (op0_regno == op1_regno)
+	return ASM_COMMENT_START " vec_extract to same register";
 
-  return "xxlor %x0,%x1,%x1";
-}
-  [(set_attr "type" "fp,vecsimple,mftgpr,mftgpr")
-   (set_attr "length" "4")])
+      else if (INT_REGNO_P (op0_regno) && TARGET_DIRECT_MOVE
+	       && TARGET_POWERPC64)
+	return "mfvsrd %0,%x1";
 
-(define_insn "*vsx_extract_<mode>_internal2"
-  [(set (match_operand:<VS_scalar> 0 "vsx_register_operand" "=d,<VS_64reg>,<VS_64reg>")
-	(vec_select:<VS_scalar>
-	 (match_operand:VSX_D 1 "vsx_register_operand" "d,wd,wd")
-	 (parallel [(match_operand:QI 2 "u5bit_cint_operand" "wD,wD,i")])))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)
-   && (!TARGET_POWERPC64 || !TARGET_DIRECT_MOVE
-       || INTVAL (operands[2]) != VECTOR_ELEMENT_SCALAR_64BIT)"
-{
-  int fldDM;
-  gcc_assert (UINTVAL (operands[2]) <= 1);
+      else if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno))
+	return "fmr %0,%1";
 
-  if (INTVAL (operands[2]) == VECTOR_ELEMENT_SCALAR_64BIT)
-    {
-      int op0_regno = REGNO (operands[0]);
-      int op1_regno = REGNO (operands[1]);
+      else if (VSX_REGNO_P (op0_regno))
+	return "xxlor %x0,%x1,%x1";
 
-      if (op0_regno == op1_regno)
-	return "nop";
+      else
+	gcc_unreachable ();
+    }
 
-      if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno))
-	return "fmr %0,%1";
+  else if (element == VECTOR_ELEMENT_MFVSRLD_64BIT && INT_REGNO_P (op0_regno)
+	   && TARGET_P9_VECTOR && TARGET_POWERPC64 && TARGET_DIRECT_MOVE)
+    return "mfvsrdl %0,%x1";
 
-      return "xxlor %x0,%x1,%x1";
+  else if (VSX_REGNO_P (op0_regno))
+    {
+      fldDM = element << 1;
+      if (!BYTES_BIG_ENDIAN)
+	fldDM = 3 - fldDM;
+      operands[3] = GEN_INT (fldDM);
+      return "xxpermdi %x0,%x1,%x1,%3";
     }
 
-  fldDM = INTVAL (operands[2]) << 1;
-  if (!BYTES_BIG_ENDIAN)
-    fldDM = 3 - fldDM;
-  operands[3] = GEN_INT (fldDM);
-  return "xxpermdi %x0,%x1,%x1,%3";
+  else
+    gcc_unreachable ();
 }
-  [(set_attr "type" "fp,vecsimple,vecperm")
-   (set_attr "length" "4")])
+  [(set_attr "type" "vecsimple,mftgpr,mftgpr,vecperm")])
 
 ;; Optimize extracting a single scalar element from memory if the scalar is in
 ;; the correct location to use a single load.
Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/constraints.md	(.../gcc/config/rs6000)	(working copy)
@@ -77,8 +77,6 @@ (define_register_constraint "wg" "rs6000
 (define_register_constraint "wh" "rs6000_constraints[RS6000_CONSTRAINT_wh]"
   "Floating point register if direct moves are available, or NO_REGS.")
 
-;; At present, DImode is not allowed in the Altivec registers.  If in the
-;; future it is allowed, wi/wj can be set to VSX_REGS instead of FLOAT_REGS.
 (define_register_constraint "wi" "rs6000_constraints[RS6000_CONSTRAINT_wi]"
   "FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS.")
 
@@ -135,6 +133,13 @@ (define_register_constraint "wy" "rs6000
 (define_register_constraint "wz" "rs6000_constraints[RS6000_CONSTRAINT_wz]"
   "Floating point register if the LFIWZX instruction is enabled or NO_REGS.")
 
+;; wB needs ISA 2.07 VUPKHSW
+(define_constraint "wB"
+  "Signed 5-bit constant integer that can be loaded into an altivec register."
+  (and (match_code "const_int")
+       (and (match_test "TARGET_P8_VECTOR")
+	    (match_operand 0 "s5bit_cint_operand"))))
+
 (define_constraint "wD"
   "Int constant that is the element number of the 64-bit scalar in a vector."
   (and (match_code "const_int")
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/predicates.md	(.../gcc/config/rs6000)	(working copy)
@@ -565,9 +565,8 @@ (define_predicate "easy_fp_constant"
     }
 })
 
-;; Return 1 if the operand is a CONST_VECTOR or VEC_DUPLICATE of a constant
-;; that can loaded with a XXSPLTIB instruction and then a VUPKHSB, VECSB2W or
-;; VECSB2D instruction.
+;; Return 1 if the operand is a constant that can be loaded with a XXSPLTIB
+;; instruction and then a VUPKHSB, VECSB2W or VECSB2D instruction.
 
 (define_predicate "xxspltib_constant_split"
   (match_code "const_vector,vec_duplicate,const_int")
@@ -582,8 +581,8 @@ (define_predicate "xxspltib_constant_spl
 })
 
 
-;; Return 1 if the operand is a CONST_VECTOR that can loaded directly with a
-;; XXSPLTIB instruction.
+;; Return 1 if the operand is a constant that can be loaded directly with a
+;; XXSPLTIB instruction.
 
 (define_predicate "xxspltib_constant_nosplit"
   (match_code "const_vector,vec_duplicate,const_int")
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/rs6000-cpus.def	(.../gcc/config/rs6000)	(working copy)
@@ -45,6 +45,7 @@
 				 | OPTION_MASK_POPCNTD			\
 				 | OPTION_MASK_ALTIVEC			\
 				 | OPTION_MASK_VSX			\
+				 | OPTION_MASK_UPPER_REGS_DI		\
 				 | OPTION_MASK_UPPER_REGS_DF)
 
 /* For now, don't provide an embedded version of ISA 2.07.  */
@@ -119,6 +120,7 @@
 				 | OPTION_MASK_SOFT_FLOAT		\
 				 | OPTION_MASK_STRICT_ALIGN_OPTIONAL	\
 				 | OPTION_MASK_TOC_FUSION		\
+				 | OPTION_MASK_UPPER_REGS_DI		\
 				 | OPTION_MASK_UPPER_REGS_DF		\
 				 | OPTION_MASK_UPPER_REGS_SF		\
 				 | OPTION_MASK_VSX			\
@@ -211,7 +213,8 @@ RS6000_CPU ("power6x", PROCESSOR_POWER6,
 RS6000_CPU ("power7", PROCESSOR_POWER7,   /* Don't add MASK_ISEL by default */
 	    POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
 	    | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
-	    | MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF)
+	    | MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF
+	    | OPTION_MASK_UPPER_REGS_DI)
 RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER)
 RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER)
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/rs6000.opt	(.../gcc/config/rs6000)	(working copy)
@@ -597,6 +597,10 @@ mupper-regs
 Target Report Var(TARGET_UPPER_REGS) Init(-1) Save
 Allow float/double variables in upper registers if cpu allows it.
 
+mupper-regs-di
+Target Report Mask(UPPER_REGS_DI) Var(rs6000_isa_flags)
+Allow 64-bit integer variables in upper registers with -mcpu=power7 or -mvsx.
+
 moptimize-swaps
 Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save
 Analyze and remove doubleword swaps from VSX computations.
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/rs6000.c	(.../gcc/config/rs6000)	(working copy)
@@ -1938,7 +1938,8 @@ rs6000_hard_regno_mode_ok (int regno, ma
 	  || FLOAT128_VECTOR_P (mode)
 	  || reg_addr[mode].scalar_in_vmx_p
 	  || (TARGET_VSX_TIMODE && mode == TImode)
-	  || (TARGET_VADDUQM && mode == V1TImode)))
+	  || (TARGET_VADDUQM && mode == V1TImode)
+	  || (TARGET_UPPER_REGS_DI && mode == DImode)))
     {
       if (FP_REGNO_P (regno))
 	return FP_REGNO_P (last_regno);
@@ -3082,7 +3083,6 @@ rs6000_init_hard_regno_mode_ok (bool glo
       rs6000_constraints[RS6000_CONSTRAINT_wa] = VSX_REGS;
       rs6000_constraints[RS6000_CONSTRAINT_wd] = VSX_REGS;	/* V2DFmode  */
       rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS;	/* V4SFmode  */
-      rs6000_constraints[RS6000_CONSTRAINT_wi] = FLOAT_REGS;	/* DImode  */
 
       if (TARGET_VSX_TIMODE)
 	rs6000_constraints[RS6000_CONSTRAINT_wt] = VSX_REGS;	/* TImode  */
@@ -3094,6 +3094,11 @@ rs6000_init_hard_regno_mode_ok (bool glo
 	}
       else
 	rs6000_constraints[RS6000_CONSTRAINT_ws] = FLOAT_REGS;
+
+      if (TARGET_UPPER_REGS_DI)					/* DImode  */
+	rs6000_constraints[RS6000_CONSTRAINT_wi] = VSX_REGS;
+      else
+	rs6000_constraints[RS6000_CONSTRAINT_wi] = FLOAT_REGS;
     }
 
   /* Add conditional constraints based on various options, to allow us to
@@ -3306,6 +3311,9 @@ rs6000_init_hard_regno_mode_ok (bool glo
       if (TARGET_UPPER_REGS_DF)
 	reg_addr[DFmode].scalar_in_vmx_p = true;
 
+      if (TARGET_UPPER_REGS_DI)
+	reg_addr[DImode].scalar_in_vmx_p = true;
+
       if (TARGET_UPPER_REGS_SF)
 	reg_addr[SFmode].scalar_in_vmx_p = true;
     }
@@ -4085,9 +4093,9 @@ rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_DFP;
     }
 
-  /* Allow an explicit -mupper-regs to set both -mupper-regs-df and
-     -mupper-regs-sf, depending on the cpu, unless the user explicitly also set
-     the individual option.  */
+  /* Allow an explicit -mupper-regs to set -mupper-regs-df, -mupper-regs-di,
+     and -mupper-regs-sf, depending on the cpu, unless the user explicitly also
+     set the individual option.  */
   if (TARGET_UPPER_REGS > 0)
     {
       if (TARGET_VSX
@@ -4096,6 +4104,12 @@ rs6000_option_override_internal (bool gl
 	  rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DF;
 	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
 	}
+      if (TARGET_VSX
+	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DI))
+	{
+	  rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DI;
+	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DI;
+	}
       if (TARGET_P8_VECTOR
 	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
 	{
@@ -4111,6 +4125,12 @@ rs6000_option_override_internal (bool gl
 	  rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
 	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
 	}
+      if (TARGET_VSX
+	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DI))
+	{
+	  rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DI;
+	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DI;
+	}
       if (TARGET_P8_VECTOR
 	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
 	{
@@ -4126,6 +4146,13 @@ rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
     }
 
+  if (TARGET_UPPER_REGS_DI && !TARGET_VSX)
+    {
+      if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DI)
+	error ("-mupper-regs-di requires -mvsx");
+      rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DI;
+    }
+
   if (TARGET_UPPER_REGS_SF && !TARGET_P8_VECTOR)
     {
       if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF)
@@ -4386,6 +4413,7 @@ rs6000_option_override_internal (bool gl
   if (TARGET_FLOAT128_HW
       && (rs6000_isa_flags & (OPTION_MASK_P9_VECTOR
 			      | OPTION_MASK_DIRECT_MOVE
+			      | OPTION_MASK_UPPER_REGS_DI
 			      | OPTION_MASK_UPPER_REGS_DF
 			      | OPTION_MASK_UPPER_REGS_SF)) == 0)
     {
@@ -6284,7 +6312,7 @@ xxspltib_constant_p (rtx op,
   if (mode == VOIDmode)
     mode = GET_MODE (op);
 
-  else if (mode != GET_MODE (op))
+  else if (mode != GET_MODE (op) && GET_MODE (op) != VOIDmode)
     return false;
 
   /* Handle (vec_duplicate <constant>).  */
@@ -6337,8 +6365,8 @@ xxspltib_constant_p (rtx op,
     }
 
   /* Handle integer constants being loaded into the upper part of the VSX
-     register as a scalar.  If the value isn't 0/-1, only allow it if
-     the mode can go in Altivec registers.  */
+     register as a scalar.  If the value isn't 0/-1, only allow it if the mode
+     can go in Altivec registers.  Prefer VSPLTISW/VUPKHSW over XXSPLTIB.  */
   else if (CONST_INT_P (op))
     {
       if (!SCALAR_INT_MODE_P (mode))
@@ -6348,9 +6376,14 @@ xxspltib_constant_p (rtx op,
       if (!IN_RANGE (value, -128, 127))
 	return false;
 
-      if (!IN_RANGE (value, -1, 0)
-	  && (reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_VALID) == 0)
-	return false;
+      if (!IN_RANGE (value, -1, 0))
+	{
+	  if (!(reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_VALID))
+	    return false;
+
+	  if (EASY_VECTOR_15 (value))
+	    return false;
+	}
     }
 
   else
@@ -35485,6 +35518,7 @@ static struct rs6000_opt_mask const rs60
   { "string",			OPTION_MASK_STRING,		false, true  },
   { "toc-fusion",		OPTION_MASK_TOC_FUSION,		false, true  },
   { "update",			OPTION_MASK_NO_UPDATE,		true , true  },
+  { "upper-regs-di",		OPTION_MASK_UPPER_REGS_DI,	false, true  },
   { "upper-regs-df",		OPTION_MASK_UPPER_REGS_DF,	false, true  },
   { "upper-regs-sf",		OPTION_MASK_UPPER_REGS_SF,	false, true  },
   { "vsx",			OPTION_MASK_VSX,		false, true  },
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/rs6000.md	(.../gcc/config/rs6000)	(working copy)
@@ -4866,7 +4866,7 @@ (define_insn "lfiwax"
 (define_insn_and_split "floatsi<mode>2_lfiwax"
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
 	(float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r")))
-   (clobber (match_scratch:DI 2 "=wj"))]
+   (clobber (match_scratch:DI 2 "=wi"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX
    && <SI_CONVERT_FP> && can_create_pseudo_p ()"
   "#"
@@ -4905,11 +4905,11 @@ (define_insn_and_split "floatsi<mode>2_l
    (set_attr "type" "fpload")])
 
 (define_insn_and_split "floatsi<mode>2_lfiwax_mem"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fa>")
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
 	(float:SFDF
 	 (sign_extend:DI
-	  (match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z"))))
-   (clobber (match_scratch:DI 2 "=0,d"))]
+	  (match_operand:SI 1 "indexed_or_indirect_operand" "Z"))))
+   (clobber (match_scratch:DI 2 "=wi"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX
    && <SI_CONVERT_FP>"
   "#"
@@ -4941,7 +4941,7 @@ (define_insn "lfiwzx"
 (define_insn_and_split "floatunssi<mode>2_lfiwzx"
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
 	(unsigned_float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r")))
-   (clobber (match_scratch:DI 2 "=wj"))]
+   (clobber (match_scratch:DI 2 "=wi"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX
    && <SI_CONVERT_FP>"
   "#"
@@ -4980,11 +4980,11 @@ (define_insn_and_split "floatunssi<mode>
    (set_attr "type" "fpload")])
 
 (define_insn_and_split "floatunssi<mode>2_lfiwzx_mem"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fa>")
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
 	(unsigned_float:SFDF
 	 (zero_extend:DI
-	  (match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z"))))
-   (clobber (match_scratch:DI 2 "=0,d"))]
+	  (match_operand:SI 1 "indexed_or_indirect_operand" "Z"))))
+   (clobber (match_scratch:DI 2 "=wi"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX
    && <SI_CONVERT_FP>"
   "#"
@@ -5288,7 +5288,7 @@ (define_expand "fix_trunc<mode>di2"
 
 (define_insn "*fix_trunc<mode>di2_fctidz"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi")
-	(fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fa>")))]
+	(fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
   "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS
     && TARGET_FCFID"
   "@
@@ -5360,7 +5360,7 @@ (define_expand "fixuns_trunc<mode>di2"
 
 (define_insn "*fixuns_trunc<mode>di2_fctiduz"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi")
-	(unsigned_fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fa>")))]
+	(unsigned_fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
   "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS
     && TARGET_FCTIDUZ"
   "@
@@ -7700,9 +7700,25 @@ (define_insn "p8_mfvsrd_4_disf"
 ;; non-offsettable address by using r->r which won't make progress.
 ;; Use of fprs is disparaged slightly otherwise reload prefers to reload
 ;; a gpr into a fpr instead of reloading an invalid 'Y' address
+
+;;        GPR store  GPR load   GPR move   FPR store  FPR load    FPR move
+;;        GPR const  AVX store  AVX store  AVX load   AVX load    VSX move
+;;        P9 0       P9 -1      AVX 0/-1   VSX 0      VSX -1      P9 const
+;;        AVX const  
+
 (define_insn "*movdi_internal32"
-  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand" "=Y,r,r,?m,?*d,?*d,r")
-	(match_operand:DI 1 "input_operand" "r,Y,r,d,m,d,IJKnGHF"))]
+  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand"
+         "=Y,        r,         r,         ?m,        ?*d,        ?*d,
+          r,         ?Y,        ?Z,        ?*wb,      ?*wv,       ?wi,
+          ?wo,       ?wo,       ?wv,       ?wi,       ?wi,        ?wv,
+          ?wv")
+
+	(match_operand:DI 1 "input_operand"
+          "r,        Y,         r,         d,         m,          d,
+           IJKnGHF,  wb,        wv,        Y,         Z,          wi,
+           Oj,       wM,        OjwM,      Oj,        wM,         wS,
+           wB"))]
+
   "! TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], DImode)
        || gpc_reg_operand (operands[1], DImode))"
@@ -7713,8 +7729,24 @@ (define_insn "*movdi_internal32"
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
    fmr %0,%1
+   #
+   stxsd %1,%0
+   stxsdx %x1,%y0
+   lxsd %0,%1
+   lxsdx %x0,%y1
+   xxlor %x0,%x1,%x1
+   xxspltib %x0,0
+   xxspltib %x0,255
+   vspltisw %0,%1
+   xxlxor %x0,%x0,%x0
+   xxlorc %x0,%x0,%x0
+   #
    #"
-  [(set_attr "type" "store,load,*,fpstore,fpload,fp,*")])
+  [(set_attr "type"
+               "store,     load,      *,         fpstore,   fpload,     fp,
+                *,         fpstore,   fpstore,   fpload,    fpload,     vecsimple,
+                vecsimple, vecsimple, vecsimple, vecsimple, vecsimple,  vecsimple,
+                vecsimple")])
 
 (define_split
   [(set (match_operand:DI 0 "gpc_reg_operand" "")
@@ -7744,9 +7776,26 @@ (define_split
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
 
+;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR #
+;;              FPR store  FPR load   FPR move   AVX store  AVX store   AVX load
+;;              AVX load   VSX move   P9 0       P9 -1      AVX 0/-1    VSX 0
+;;              VSX -1     P9 const   AVX const  From SPR   To SPR      SPR<->SPR
+;;              FPR->GPR   GPR->FPR   VSX->GPR   GPR->VSX
 (define_insn "*movdi_internal64"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=Y,r,r,r,r,r,?m,?*d,?*d,r,*h,*h,r,?*wg,r,?*wj,?*wi")
-	(match_operand:DI 1 "input_operand" "r,Y,r,I,L,nF,d,m,d,*h,r,0,*wg,r,*wj,r,O"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand"
+               "=Y,        r,         r,         r,         r,          r,
+                ?m,        ?*d,       ?*d,       ?Y,        ?Z,         ?*wb,
+                ?*wv,      ?wi,       ?wo,       ?wo,       ?wv,        ?wi,
+                ?wi,       ?wv,       ?wv,       r,         *h,         *h,
+                ?*r,       ?*wg,      ?*r,       ?*wj")
+
+	(match_operand:DI 1 "input_operand"
+                "r,        Y,         r,         I,         L,          nF,
+                 d,        m,         d,         wb,        wv,         Y,
+                 Z,        wi,        Oj,        wM,        OjwM,       Oj,
+                 wM,       wS,        wB,        *h,        r,          0,
+                 wg,       r,         wj,        r"))]
+
   "TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], DImode)
        || gpc_reg_operand (operands[1], DImode))"
@@ -7760,21 +7809,43 @@ (define_insn "*movdi_internal64"
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
    fmr %0,%1
+   stxsd %1,%0
+   stxsdx %x1,%y0
+   lxsd %0,%1
+   lxsdx %x0,%y1
+   xxlor %x0,%x1,%x1
+   xxspltib %x0,0
+   xxspltib %x0,255
+   vspltisw %0,%1
+   xxlxor %x0,%x0,%x0
+   xxlorc %x0,%x0,%x0
+   #
+   #
    mf%1 %0
    mt%0 %1
    nop
    mftgpr %0,%1
    mffgpr %0,%1
    mfvsrd %0,%x1
-   mtvsrd %x0,%1
-   xxlxor %x0,%x0,%x0"
-  [(set_attr "type" "store,load,*,*,*,*,fpstore,fpload,fp,mfjmpr,mtjmpr,*,mftgpr,mffgpr,mftgpr,mffgpr,vecsimple")
-   (set_attr "length" "4,4,4,4,4,20,4,4,4,4,4,4,4,4,4,4,4")])
+   mtvsrd %x0,%1"
+  [(set_attr "type"
+               "store,     load,      *,         *,         *,          *,
+                fpstore,   fpload,    fp,        fpstore,   fpstore,    fpload,
+                fpload,    vecsimple, vecsimple, vecsimple, vecsimple,  vecsimple,
+                vecsimple, vecsimple, vecsimple, mfjmpr,    mtjmpr,     *,
+                mftgpr,    mffgpr,    mftgpr,    mffgpr")
+
+   (set_attr "length"
+               "4,         4,         4,         4,         4,          20,
+                4,         4,         4,         4,         4,          4,
+                4,         4,         4,         4,         4,          8,
+                8,         4,         4,         4,         4,          4,
+                4,         4,         4,         4")])
 
 ; Some DImode loads are best done as a load of -1 followed by a mask
 ; instruction.
 (define_split
-  [(set (match_operand:DI 0 "gpc_reg_operand")
+  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
 	(match_operand:DI 1 "const_int_operand"))]
   "TARGET_POWERPC64
    && num_insns_constant (operands[1], DImode) > 1
@@ -7791,7 +7862,7 @@ (define_split
 ;; When non-easy constants can go in the TOC, this should use
 ;; easy_fp_constant predicate.
 (define_split
-  [(set (match_operand:DI 0 "gpc_reg_operand" "")
+  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo" "")
 	(match_operand:DI 1 "const_int_operand" ""))]
   "TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1"
   [(set (match_dup 0) (match_dup 2))
@@ -7805,7 +7876,7 @@ (define_split
 }")
 
 (define_split
-  [(set (match_operand:DI 0 "gpc_reg_operand" "")
+  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo" "")
 	(match_operand:DI 1 "const_scalar_int_operand" ""))]
   "TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1"
   [(set (match_dup 0) (match_dup 2))
@@ -7817,6 +7888,43 @@ (define_split
   else
     FAIL;
 }")
+
+(define_split
+  [(set (match_operand:DI 0 "altivec_register_operand" "")
+	(match_operand:DI 1 "s5bit_cint_operand" ""))]
+  "TARGET_UPPER_REGS_DI && TARGET_VSX && reload_completed"
+  [(const_int 0)]
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  int r = REGNO (op0);
+  rtx op0_v4si = gen_rtx_REG (V4SImode, r);
+
+  emit_insn (gen_altivec_vspltisw (op0_v4si, op1));
+  if (op1 != const0_rtx && op1 != constm1_rtx)
+    {
+      rtx op0_v2di = gen_rtx_REG (V2DImode, r);
+      emit_insn (gen_altivec_vupkhsw (op0_v2di, op0_v4si));
+    }
+  DONE;
+})
+
+(define_split
+  [(set (match_operand:DI 0 "altivec_register_operand" "")
+	(match_operand:DI 1 "xxspltib_constant_split" ""))]
+  "TARGET_UPPER_REGS_DI && TARGET_P9_VECTOR && reload_completed"
+  [(const_int 0)]
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  int r = REGNO (op0);
+  rtx op0_v16qi = gen_rtx_REG (V16QImode, r);
+
+  emit_insn (gen_xxspltib_v16qi (op0_v16qi, op1));
+  emit_insn (gen_vsx_sign_extend_qi_di (operands[0], op0_v16qi));
+  DONE;
+})
+
 \f
 ;; TImode/PTImode is similar, except that we usually want to compute the
 ;; address into a register and use lsi/stsi (the exception is during reload).
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/doc)	(revision 237222)
+++ gcc/doc/invoke.texi	(.../gcc/doc)	(working copy)
@@ -1005,6 +1005,7 @@ See RS/6000 and PowerPC Options.
 -mquad-memory-atomic -mno-quad-memory-atomic @gol
 -mcompat-align-parm -mno-compat-align-parm @gol
 -mupper-regs-df -mno-upper-regs-df -mupper-regs-sf -mno-upper-regs-sf @gol
+-mupper-regs-di -mno-upper-regs-di @gol
 -mupper-regs -mno-upper-regs -mmodulo -mno-modulo @gol
 -mfloat128 -mno-float128 -mfloat128-hardware -mno-float128-hardware @gol
 -mpower9-fusion -mno-mpower9-fusion -mpower9-vector -mno-power9-vector @gol
@@ -20210,6 +20211,17 @@ Generate code that uses (does not use) t
 instructions.  The @option{-mquad-memory-atomic} option requires use of
 64-bit mode.
 
+@item -mupper-regs-di
+@itemx -mno-upper-regs-di
+@opindex mupper-regs-di
+@opindex mno-upper-regs-di
+Generate code that uses (does not use) the scalar instructions that
+target all 64 registers in the vector/scalar floating point register
+set that were added in version 2.06 of the PowerPC ISA when processing
+integers.  @option{-mupper-regs-di} is turned on by default if you use
+any of the @option{-mcpu=power7}, @option{-mcpu=power8},
+@option{-mcpu=power9}, or @option{-mvsx} options.
+
 @item -mupper-regs-df
 @itemx -mno-upper-regs-df
 @opindex mupper-regs-df
@@ -20218,8 +20230,8 @@ Generate code that uses (does not use) t
 instructions that target all 64 registers in the vector/scalar
 floating point register set that were added in version 2.06 of the
 PowerPC ISA.  @option{-mupper-regs-df} is turned on by default if you
-use any of the @option{-mcpu=power7}, @option{-mcpu=power8}, or
-@option{-mvsx} options.
+use any of the @option{-mcpu=power7}, @option{-mcpu=power8},
+@option{-mcpu=power9}, or @option{-mvsx} options.
 
 @item -mupper-regs-sf
 @itemx -mno-upper-regs-sf
@@ -20229,8 +20241,8 @@ Generate code that uses (does not use) t
 instructions that target all 64 registers in the vector/scalar
 floating point register set that were added in version 2.07 of the
 PowerPC ISA.  @option{-mupper-regs-sf} is turned on by default if you
-use either of the @option{-mcpu=power8} or @option{-mpower8-vector}
-options.
+use any of the @option{-mcpu=power8}, @option{-mpower8-vector}, or
+@option{-mcpu=power9} options.
 
 @item -mupper-regs
 @itemx -mno-upper-regs
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/doc)	(revision 237222)
+++ gcc/doc/md.texi	(.../gcc/doc)	(working copy)
@@ -3211,6 +3211,9 @@ FP or VSX register to perform ISA 2.07 f
 @item wz
 Floating point register if the LFIWZX instruction is enabled or NO_REGS.
 
+@item wB
+Signed 5-bit constant integer that can be loaded into an altivec register.
+
 @item wD
 Int constant that is the element number of the 64-bit scalar in a vector.
 
Index: gcc/testsuite/gcc.target/powerpc/p9-dimode1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dimode1.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-dimode1.c	(.../gcc/testsuite/gcc.target/powerpc)	(revision 237344)
@@ -0,0 +1,50 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mupper-regs-di" } */
+
+/* Verify P9 changes to allow DImode into Altivec registers, and generate
+   constants using XXSPLTIB.  */
+
+#ifndef _ARCH_PPC64
+#error "This code is 64-bit."
+#endif
+
+double
+p9_zero (void)
+{
+  long l = 0;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+double
+p9_plus_1 (void)
+{
+  long l = 1;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+double
+p9_minus_1 (void)
+{
+  long l = -1;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+/* { dg-final { scan-assembler     "xxspltib" } } */
+/* { dg-final { scan-assembler-not "mtvsrd"   } } */
+/* { dg-final { scan-assembler-not "lfd"      } } */
+/* { dg-final { scan-assembler-not "ld"       } } */
+/* { dg-final { scan-assembler-not "lxsd"     } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dimode2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dimode2.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-dimode2.c	(.../gcc/testsuite/gcc.target/powerpc)	(revision 237344)
@@ -0,0 +1,27 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mupper-regs-di" } */
+
+/* Verify that large integer constants are loaded via direct move instead of being
+   loaded from memory.  */
+
+#ifndef _ARCH_PPC64
+#error "This code is 64-bit."
+#endif
+
+double
+p9_large (void)
+{
+  long l = 0x12345678;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+/* { dg-final { scan-assembler     "mtvsrd"   } } */
+/* { dg-final { scan-assembler-not "ld"       } } */
+/* { dg-final { scan-assembler-not "lfd"      } } */
+/* { dg-final { scan-assembler-not "lxsd"     } } */

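[Editorial note] The first new Altivec define_split in the patch above loads a
signed 5-bit DImode constant by emitting VSPLTISW and then, unless the value
is 0 or -1, VUPKHSW. The following is a rough C model of why the second
instruction is needed; it is a sketch and not GCC code. The struct types, the
model_* function names, and the choice of doubleword element 0 (big-endian
element numbering) as the scalar value are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical views of one 128-bit vector register.  Element 0 is the
   high-order element (big-endian numbering); this is an assumption of
   the model, not something taken from the patch.  */
typedef struct { int32_t w[4]; } v4si;
typedef struct { int64_t d[2]; } v2di;

/* Model of VSPLTISW: splat a signed 5-bit immediate into all 4 words.  */
static v4si
model_vspltisw (int imm)
{
  v4si r;
  assert (imm >= -16 && imm <= 15);
  for (int i = 0; i < 4; i++)
    r.w[i] = imm;
  return r;
}

/* Model of VUPKHSW: sign-extend the two high-order words to doublewords.  */
static v2di
model_vupkhsw (v4si v)
{
  v2di r;
  r.d[0] = (int64_t) v.w[0];
  r.d[1] = (int64_t) v.w[1];
  return r;
}

/* Reinterpret the same 128 bits as two doublewords without conversion.  */
static v2di
model_words_as_dwords (v4si v)
{
  v2di r;
  for (int i = 0; i < 2; i++)
    r.d[i] = (int64_t) (((uint64_t) (uint32_t) v.w[2 * i] << 32)
			| (uint32_t) v.w[2 * i + 1]);
  return r;
}

/* Model of the split: 0 and -1 are already correct after the splat
   (all bits clear or all bits set); other values need the unpack.  */
static int64_t
model_load_di (int imm)
{
  v4si splat = model_vspltisw (imm);
  if (imm == 0 || imm == -1)
    return model_words_as_dwords (splat).d[0];
  return model_vupkhsw (splat).d[0];
}
```

In this model, splatting 5 and reading the register directly as a doubleword
gives 0x0000000500000005 rather than 5, which is why the split emits the
extra unpack for every value other than 0 and -1.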

* Re: [PATCH], PowerPC: Allow DImode in Altivec registers
  2016-06-13 18:30 ` Michael Meissner
@ 2016-06-14 22:54   ` Segher Boessenkool
  2016-06-15 18:25     ` Michael Meissner
  0 siblings, 1 reply; 6+ messages in thread
From: Segher Boessenkool @ 2016-06-14 22:54 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt

On Mon, Jun 13, 2016 at 02:29:41PM -0400, Michael Meissner wrote:
> It would help if I included the patch.

:-)

> > Are these changes ok to install in the trunk?  Assuming they go in the trunk,
> > can I install them in the 6.2 branch if they cause no regression?

Okay for trunk.  Okay for 6 after a week.

> > Note, I will be away from the office, starting Thursday afternoon (June 16th,
> > 2016) and I will return on Monday (June 20th, 2016).  I will not have easy
> > access to email during this time.

If big problems show up, we can always revert the patch ;-)

A few things...

> 	* config/rs6000/rs6000.md (lfiwax): Update clobbers that don't use
> 	direct move to use wi and now wj.

s/now/not/

> +;; wB needs ISA 2.07 VUPKHSW
> +(define_constraint "wB"
> +  "Signed 5-bit constant integer that can be loaded into an altivec register."
> +  (and (match_code "const_int")
> +       (and (match_test "TARGET_P8_VECTOR")
> +	    (match_operand 0 "s5bit_cint_operand"))))

"and" takes as many operands as you want, i.e.

+  (and (match_code "const_int")
+       (match_test "TARGET_P8_VECTOR")
+       (match_operand 0 "s5bit_cint_operand")))
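[Editorial note] The nested and flattened forms of the wB constraint accept
exactly the same operands. A small C model can make that concrete; this is
only a sketch, the model_* names are hypothetical, and it assumes that
s5bit_cint_operand means a constant integer in the signed 5-bit range
-16..15.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed meaning of s5bit_cint_operand: a signed 5-bit value.  */
static bool
model_s5bit_cint (long value)
{
  return value >= -16 && value <= 15;
}

/* Nested form: (and A (and B C)).  */
static bool
model_wB_nested (bool is_const_int, long value, bool target_p8_vector)
{
  return is_const_int && (target_p8_vector && model_s5bit_cint (value));
}

/* Flattened form: (and A B C).  */
static bool
model_wB_flat (bool is_const_int, long value, bool target_p8_vector)
{
  return is_const_int && target_p8_vector && model_s5bit_cint (value);
}

/* Check that both forms agree over a range of values and flag settings.  */
static void
model_wB_self_check (void)
{
  for (long v = -20; v <= 20; v++)
    for (int c = 0; c <= 1; c++)
      for (int p8 = 0; p8 <= 1; p8++)
	assert (model_wB_nested (c, v, p8) == model_wB_flat (c, v, p8));
}
```

Calling model_wB_self_check () exercises both forms over the values -20..20
with each combination of the two flags.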

>  (define_insn "*movdi_internal32"
> -  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand" "=Y,r,r,?m,?*d,?*d,r")
> -	(match_operand:DI 1 "input_operand" "r,Y,r,d,m,d,IJKnGHF"))]
> +  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand"
> +         "=Y,        r,         r,         ?m,        ?*d,        ?*d,
> +          r,         ?Y,        ?Z,        ?*wb,      ?*wv,       ?wi,
> +          ?wo,       ?wo,       ?wv,       ?wi,       ?wi,        ?wv,
> +          ?wv")
> +
> +	(match_operand:DI 1 "input_operand"
> +          "r,        Y,         r,         d,         m,          d,
> +           IJKnGHF,  wb,        wv,        Y,         Z,          wi,

"n" includes "IJK" already?

>  ; Some DImode loads are best done as a load of -1 followed by a mask
>  ; instruction.
>  (define_split
> -  [(set (match_operand:DI 0 "gpc_reg_operand")
> +  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")

Not sure what this is for...  If you want to say this split is only to
be done after RA, just say that explicitly in the split condition (i.e.
"reload_completed").  Or does this mean something else?


Segher


* Re: [PATCH], PowerPC: Allow DImode in Altivec registers
  2016-06-14 22:54   ` Segher Boessenkool
@ 2016-06-15 18:25     ` Michael Meissner
  2016-06-15 19:51       ` Segher Boessenkool
  0 siblings, 1 reply; 6+ messages in thread
From: Michael Meissner @ 2016-06-15 18:25 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt

On Tue, Jun 14, 2016 at 05:53:46PM -0500, Segher Boessenkool wrote:
> On Mon, Jun 13, 2016 at 02:29:41PM -0400, Michael Meissner wrote:
> > It would help if I included the patch.
> 
> :-)
> 
> > > Are these changes ok to install in the trunk?  Assuming they go in the trunk,
> > > can I install them in the 6.2 branch if they cause no regression?
> 
> Okay for trunk.  Okay for 6 after a week.
> 
> > > Note, I will be away from the office, starting Thursday afternoon (June 16th,
> > > 2016) and I will return on Monday (June 20th, 2016).  I will not have easy
> > > access to email during this time.
> 
> If big problems show up, we can always revert the patch ;-)
> 
> A few things...
> 
> > 	* config/rs6000/rs6000.md (lfiwax): Update clobbers that don't use
> > 	direct move to use wi and now wj.
> 
> s/now/not/

Ok.

> > +;; wB needs ISA 2.07 VUPKHSW
> > +(define_constraint "wB"
> > +  "Signed 5-bit constant integer that can be loaded into an altivec register."
> > +  (and (match_code "const_int")
> > +       (and (match_test "TARGET_P8_VECTOR")
> > +	    (match_operand 0 "s5bit_cint_operand"))))
> 
> "and" takes as many operands as you want, i.e.

Ok, useful to know for the future.

> +  (and (match_code "const_int")
> +       (match_test "TARGET_P8_VECTOR")
> +       (match_operand 0 "s5bit_cint_operand")))
> 
> >  (define_insn "*movdi_internal32"
> > -  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand" "=Y,r,r,?m,?*d,?*d,r")
> > -	(match_operand:DI 1 "input_operand" "r,Y,r,d,m,d,IJKnGHF"))]
> > +  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand"
> > +         "=Y,        r,         r,         ?m,        ?*d,        ?*d,
> > +          r,         ?Y,        ?Z,        ?*wb,      ?*wv,       ?wi,
> > +          ?wo,       ?wo,       ?wv,       ?wi,       ?wi,        ?wv,
> > +          ?wv")
> > +
> > +	(match_operand:DI 1 "input_operand"
> > +          "r,        Y,         r,         d,         m,          d,
> > +           IJKnGHF,  wb,        wv,        Y,         Z,          wi,
> 
> "n" includes "IJK" already?

In this case, I merely copied the existing code before formatting it.

> >  ; Some DImode loads are best done as a load of -1 followed by a mask
> >  ; instruction.
> >  (define_split
> > -  [(set (match_operand:DI 0 "gpc_reg_operand")
> > +  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
> 
> Not sure what this is for...  If you want to say this split is only to
> be done after RA, just say that explicitly in the split condition (i.e.
> "reload_completed").  Or does this mean something else?

This is so that constants being loaded into the vector registers aren't split
(they are handled via different define_splits).  Previously, the only constant
that was loaded in vector registers was 0.

The int_reg_operand_not_pseudo allows the split to take place if it has already
gotten hard registers before register allocation.  It could have been the
normal int_reg_operand and then use a reload_completed check.
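[Editorial note] The difference between the two approaches can be sketched
in C. This is a model only and not the rs6000 sources: the register numbering
(GPRs as hard registers 0..31, pseudos starting at an arbitrary 1000) and all
model_* names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative register numbering, not the real rs6000 values.  */
enum
{
  MODEL_FIRST_GPR = 0,
  MODEL_LAST_GPR = 31,
  MODEL_FIRST_PSEUDO = 1000
};

static bool
model_is_pseudo (unsigned regno)
{
  return regno >= MODEL_FIRST_PSEUDO;
}

static bool
model_is_hard_gpr (unsigned regno)
{
  return regno >= MODEL_FIRST_GPR && regno <= MODEL_LAST_GPR;
}

/* Approach used in the patch: match only hard integer registers,
   whether or not register allocation has run yet.  */
static bool
model_int_reg_not_pseudo (unsigned regno)
{
  return !model_is_pseudo (regno) && model_is_hard_gpr (regno);
}

/* Alternative discussed above: accept integer registers or pseudos,
   but only allow the split once reload has completed.  */
static bool
model_int_reg_after_reload (unsigned regno, bool reload_completed)
{
  return reload_completed
	 && (model_is_pseudo (regno) || model_is_hard_gpr (regno));
}

/* The two approaches differ only for a hard GPR seen before RA.  */
static void
model_split_self_check (void)
{
  assert (model_int_reg_not_pseudo (3));
  assert (!model_int_reg_after_reload (3, false));
}
```

In this model a hard GPR that appears before register allocation is split by
the first approach and not by the second, while a vector register (any hard
register number outside 0..31 here) is rejected by both.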

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797


* Re: [PATCH], PowerPC: Allow DImode in Altivec registers
  2016-06-15 18:25     ` Michael Meissner
@ 2016-06-15 19:51       ` Segher Boessenkool
  2016-06-15 21:12         ` Michael Meissner
  0 siblings, 1 reply; 6+ messages in thread
From: Segher Boessenkool @ 2016-06-15 19:51 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt

On Wed, Jun 15, 2016 at 02:24:41PM -0400, Michael Meissner wrote:
> > >  ; Some DImode loads are best done as a load of -1 followed by a mask
> > >  ; instruction.
> > >  (define_split
> > > -  [(set (match_operand:DI 0 "gpc_reg_operand")
> > > +  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
> > 
> > Not sure what this is for...  If you want to say this split is only to
> > be done after RA, just say that explicitly in the split condition (i.e.
> > "reload_completed").  Or does this mean something else?
> 
> This is so that constants being loaded into the vector registers aren't split
> (they are handled via different define_splits).  Previously, the only constant
> that was loaded in vector registers was 0.
> 
> The int_reg_operand_not_pseudo allows the split to take place if it has already
> gotten hard registers before register allocation.

When does that happen?

> It could have been the
> normal int_reg_operand and then use a reload_completed check.

That is preferred if it makes no difference (otherwise, before you know
it we'll have twice as many predicates).


Segher


* Re: [PATCH], PowerPC: Allow DImode in Altivec registers
  2016-06-15 19:51       ` Segher Boessenkool
@ 2016-06-15 21:12         ` Michael Meissner
  0 siblings, 0 replies; 6+ messages in thread
From: Michael Meissner @ 2016-06-15 21:12 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt

On Wed, Jun 15, 2016 at 02:51:20PM -0500, Segher Boessenkool wrote:
> On Wed, Jun 15, 2016 at 02:24:41PM -0400, Michael Meissner wrote:
> > > >  ; Some DImode loads are best done as a load of -1 followed by a mask
> > > >  ; instruction.
> > > >  (define_split
> > > > -  [(set (match_operand:DI 0 "gpc_reg_operand")
> > > > +  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
> > > 
> > > Not sure what this is for...  If you want to say this split is only to
> > > be done after RA, just say that explicitly in the split condition (i.e.
> > > "reload_completed").  Or does this mean something else?
> > 
> > This is so that constants being loaded into the vector registers aren't split
> > (they are handled via different define_splits).  Previously, the only constant
> > that was loaded in vector registers was 0.
> > 
> > The int_reg_operand_not_pseudo allows the split to take place if it has already
> > gotten hard registers before register allocation.
> 
> When does that happen?

With arguments, function return values, and of course explicit register
variables, but I agree it is fairly rare.

> > It could have been the
> > normal int_reg_operand and then use a reload_completed check.
> 
> That is preferred if it makes no difference (otherwise, before you know
> it we'll have twice as many predicates).

We already had the predicate for another use, so I wasn't adding a new one.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797


Thread overview: 6+ messages
2016-06-13 17:28 [PATCH], PowerPC: Allow DImode in Altivec registers Michael Meissner
2016-06-13 18:30 ` Michael Meissner
2016-06-14 22:54   ` Segher Boessenkool
2016-06-15 18:25     ` Michael Meissner
2016-06-15 19:51       ` Segher Boessenkool
2016-06-15 21:12         ` Michael Meissner
