public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH, powerpc] Rework VSX scalar floating point support
@ 2013-08-22 19:02 Michael Meissner
  2013-09-06  3:39 ` David Edelsohn
  2013-09-23 20:37 ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #2 Michael Meissner
  0 siblings, 2 replies; 20+ messages in thread
From: Michael Meissner @ 2013-08-22 19:02 UTC (permalink / raw)
  To: gcc-patches, dje.gcc

[-- Attachment #1: Type: text/plain, Size: 10138 bytes --]

I'm working on adding the secondary reload support in the PowerPC so that we
can use the upper registers for scalar floating point support.  For those of
you who do not know the layout of the PowerPC floating point unit, there are 2
sets of registers (traditional floating point scalar registers and altivec
vector registers).  When ISA 2.06 (power7) came out, it added new instructions
(VSX) that could use the combined register set for either scalar double
precision or vector logical/floating point operations.  In ISA 2.07 (power8),
instructions were added so that scalar single precision support could also be
done on the upper registers.

However, to load data in the upper registers that overlay the Altivec register
set, we can only use register + register addressing, while loading up scalar
floating point values in the traditional floating point register set can use
auto-update and offset addressing modes.

These patches reverse a decision that I made back in the initial ISA 2.06 time
frame, where if you did -mvsx, it only used the VSX form of the arithmetic
instruction, even though scalar values were resticted to using the traditional
floating point registers.  The patches now combine both SFmode and DFmode
expanders and insns with mode iterators.  If all of the registers used are in
the traditional floating point register set, it uses the traditional floating
point instruction (i.e. fadd instead of xsadddp).  If any of the registers come
from the upper register set, it will use the ISA 2.06/2.07 VSX instructions.

I have bootstraped the compiler at subversion id 201798 with these patches, and
ran make check with no regressions.  In addition, I've been running the SpecFP
2006 benchmark suite comparing the results of runs before the change was made
and with the changes for a power7 target, and I don't see any significant
changes in runtime behavior.  I am including patches for the tests that need to
be adjusted with these changes.

This patch adds a few new constraints.  For floating point work, the intention
is that the constraints will be used as follows:

    f	traditional SFmode insns (i.e. fadds)
    d	traditional DFmode insns (i.e. fadd)
    wy	VSX SFmode insns (i.e. xsaddsp), could be FLOAT_REGS or VSX_REGS
    ws	VSX DFmode insns (i.e. xsadddp), could be FLOAT_REGS or VSX_REGS
    wu	SFmode load to or store  from Altivec regs (i.e. lxsspx)
    wv	DFmode load to or store  from Altivec regs (i.e. lxsdx)
    ww	VSX instructions used in converting to/from SFmode

Note, that the wv constraint added in previous power8 changes went from being
SFmode to DFmode (nothing used wv in the current patches that are committed).

The patch adds 3 debug switches (-mvsx-scalar-float, -mupper-regs-sf, and
-mupper-regs-df) that I'm using to debug the reload stuff.  It is anticipated
that -mvsx will imply -mupper-regs-df, and -mpower8-vector will imply
-mupper-regs-sf when the reload patches are done.

At the moment, the current trunk (subversion id 201924) does not bootstrap on
the powerpc, and I will be away from the computer starting on August 24th.  I
won't be getting back until September 3rd, and I don't anticipate checking my
mail until I get back.  Are these patches ok to check in?  I can either check
them in now, or I can delay checking them in until we've fixed the boostrap
bug.  Ideally, I would like to get permission to check these in on September
3rd now, but I can resubmit them on the 3rd if desired.

[gcc]
2013-08-22  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/constraints.md (wa constraint): Add documentation
	to all w* constraints.  Make the documentation agree with
	md.texi.  Sort the w* constraints to be in alphabetical order.
	Add wu, wv, ww, and wy constraints for supporting using the upper
	registers for DFmode under power7 and SFmode under power8.
	(wd constraint): Likewise.
	(wf constraint): Likewise.
	(wg constraint): Likewise.
	(wl constraint): Likewise.
	(wm constraint): Likewise.
	(wn constraint): Likewise.
	(wr constraint): Likewise.
	(ws constraint): Likewise.
	(wt constraint): Likewise.
	(wu constraint): Likewise.
	(wv constraint): Likewise.
	(ww constraint): Likewise.
	(wx constraint): Likewise.
	(wy constraint): Likewise.
	(wz constraint): Likewise.
	* doc/md.texi (PowerPC and IBM RS6000): Likewise.

	* config/rs6000/rs6000-builtin.def (xsrdpim): Use floor, ceil,
	btrunc insns, instead of vsx_<name>.

	* config/rs6000/rs6000.opt (-mvsx-scalar-float): New debug swtich
	to allow/disallow single precision VSX scalar instructions.
	(-mvsx-double-float): Change initial value to 1 from -1.
	(-mvsx-scalar-memory): Make this an alias of -mupper-regs-df.
	(-mupper-regs-df): New debug switches to control whether DFmode
	and SFmode can use the upper registers on power7/power8
	respectively.
	(-mupper-regs-sf): Likewise.

	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Add support
	for -mupper-regs-sf and -mupper-regs-df.
	(rs6000_init_hard_regno_mode_ok): Likewise.
	(rs6000_opt_masks): Likewise.
	(rs6000_debug_reg_global): Print wu, ww, and wy constraints.
	Print which type of floating point unit and registers are
	available for DFmode/SFmode.

	* config/rs6000/vsx.md (vsx_add<mode>3): Move all scalar DF VSX
	support to rs6000.md.  Make SF/DFmode insns common where
	possible.  Add support for power8 scalar float instructions using
	the upper registers.  Don't use VSv and VStype_simple mode
	attributes on the insns that only handle vectors after moving
	scalar support to rs6000.md.  Merge some expanders into the
	define_insn if there is only one option.
	(vsx_sub<mode>3): Likewise.
	(vsx_mul<mode>3): Likewise.
	(vsx_div<mode>3): Likewise.
	(vsx_fre<mode>2): Likewise.
	(vsx_neg<mode>2): Likewise.
	(vsx_abs<mode>2): Likewise.
	(vsx_nabs<mode>2): Likewise.
	(vsx_smax<mode>3): Likewise.
	(vsx_smaxsf3): Likewise.
	(vsx_smin<mode>3): Likewise.
	(vsx_sminsf3): Likewise.
	(vsx_sqrt<mode>2): Likewise.
	(vsx_rsqrte<mode>2): Likewise.
	(vsx_fmadf4): Likewise.
	(vsx_fmsdf4): Likewise.
	(vsx_fms<mode>4): Likewise.
	(vsx_nfmadf4): Likewise.
	(vsx_nfma<mode>4): Likewise.
	(vsx_nfmadf4): Likewise.
	(vsx_cmpdf_internal1): Likewise.
	(vsx_copysign<mode>3): Likewise.
	(vsx_btrunc<mode>2): Likewise.
	(vsx_floor<mode>2): Likewise.
	(vsx_ceil<mode>2): Likewise.
	* config/rs6000/rs6000.md (Ftrad): Likewise.
	(Fvsx): Likewise.
	(Ff): Likewise.
	(Fv): Likewise.
	(Fs): Likewise.
	(Ffre): Likewise.
	(FFRE): Likewise.
	(abs<mode>2): Likewise.
	(abs<mode>2_fpr): Likewise.
	(nabs<mode>2_fpr): Likewise.
	(neg<mode>2): Likewise.
	(neg<mode>2_fpr): Likewise.
	(smax<mode>3): Likewise.
	(smax<mode>3_vsx): Likewise.
	(smin<mode>3): Likewise.
	(smin<mode>3_fpr): Likewise.
	(smin/smax peephole): Likewise.
	(add<mode>3): Likewise.
	(add<mode>3_fpr): Likewise.
	(sub<mode>3): Likewise.
	(sub<mode>3_fpr): Likewise.
	(mul<mode>3): Likewise.
	(mul<mode>3_fpr): Likewise.
	(div<mode>3): Likewise.
	(div<mode>3_fpr): Likewise.
	(fre<Fs>): Likewise.
	(sqrt<mode>2): Likewise.
	(rsqrt<mode>2): Likewise.
	(cmp<mode>_fpr): Likewise.
	(negsf2): Likewise.
	(abssf2): Likewise.
	(negative abssf2 unnamed pattern): Likewise.
	(addsf3): Likewise.
	(subsf3): Likewise.
	(mulsf3): Likewise.
	(divsf3): Likewise.
	(fres): Likewise.
	(fmasf4_fpr): Likewise.
	(fmssf4_fpr): Likewise.
	(nfmasf4_fpr): Likewise.
	(nfmssf4_fpr): Likewise.
	(sqrtsf2): Likewise.
	(rsqrtsf_internal1): Likewise.
	(copysign<mode>3_fcpsgn): Likewise.
	(smaxsf3): Likewise.
	(sminsf3): Likewise.
	(sminsf3/smaxsf3 splitter): Likewise.
	(negdf2): Likewise.
	(negdf2_fpr): Likewise.
	(absdf2): Likewise.
	(absdf2_fpr): Likewise.
	(nabsdf2_fpr): Likewise.
	(adddf3): Likewise.
	(adddf3_fpr): Likewise.
	(subdf3): Likewise.
	(subdf3_fpr): Likewise.
	(muldf3): Likewise.
	(muldf3_fpr): Likewise.
	(divdf3): Likewise.
	(divdf3_fpr): Likewise.
	(fred_fpr): Likewise.
	(rsqrtdf_internal1): Likewise.
	(fmadf4_fpr): Likewise.
	(fmsdf4_fpr): Likewise.
	(nfmadf4_fpr): Likewise.
	(nfmsdf4_fpr): Likewise.
	(sqrtdf2): Likewise.
	(sqrtdf2_fpr): Likewise.
	(smaxdf3): Likewise.
	(smindf3): Likewise.
	(smaxdf3, smindf3 splitter): Likewise.
	(lrint<mode>di2): Likewise.
	(btrunc<mode>2): Likewise.
	(btranc<mode>2_fpr): Likewise.
	(ceil<mode>2): Likewise.
	(ceil<mode>2_fpr): Likewise.
	(floor<mode>2): Likewise.
	(floor<mode>2_fpr): Likewise.
	(cmpsf_internal1): Likewise.
	(cmpdf_internal1): Likewise.
	(fma<mode>4_fpr): Likewise.
	(fms<mode>4_fpr): Likewise.
	(nfma<mode>4): Likewise.
	(nfma<mode>4_fpr): Likewise.
	(fnms<mode>4): Likewise.
	(nfmssf4_fpr): Likewise.
	(zero_extendsidi2_lfiwzx): Restrict using VSX load/stores to
	Altivec registers.  Add support for power8 32-bit VSX memory
	operations.
	(extendsidi2_lfiwax): Likewise.
	(lfiwax): Likewise.
	(floatsi<mode>2_lfiwax_mem): Likewise.
	(lfiwzx): Likewise.
	(floatunssi<mode>2_lfiwzx_mem): Likewise.
	(mov<mode>_hardfloat, SFmode/SDmode): Likewise.
	(mov<mode>_hardfloat32, DFmode/DDmode): Likewise.
	(mov<mode>_hardfloat64, DFmode/DDmode): Likewise.
	(movdi_internal64): Likewise.
	(mov<mode>cc): Merge SF/DF conditional move patterns.
	(fsel<mode>sf4): Likewise.
	(fsel<mode>df4): Likewise.
	(fselsfsf4): Likewise.
	(fselsfdf4): Likewise.
	(fseldfsf4): Likewise.
	(fseldfdf4): Likewise.

	* config/rs6000/rs6000.h (TARGET_SF_SPE): Define new macros to
	simplify testing what kind of SFmode/DFmode floating point unit we
	have.
	(TARGET_DF_SPE): Likewise.
	(TARGET_SF_FPR): Likewise.
	(TARGET_DF_FPR): Likewise.
	(TARGET_SF_INSN): Likewise.
	(TARGET_DF_INSN): Likewise.
	(res6000_reg_class_enum): Add wu, ww, and wy constraints.  Sort
	elements.

[gcc/testsuite]
2013-08-21  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/recip-3.c: Update VSX tests to allow
	generation of traditional floating point ops.
	* gcc.target/powerpc/recip-5.c: Likewise.
	* gcc.target/powerpc/ppc-target-1.c: Likewise.
	* gcc.target/powerpc/ppc-target-2.c: Likewise.
	* gcc.target/powerpc/pr42747.c: Likewise.
	* gcc.target/powerpc/vsx-builtin-3.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[-- Attachment #2: gcc-power8.patch051b --]
[-- Type: text/plain, Size: 91306 bytes --]

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 201924)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -52,29 +52,18 @@ (define_register_constraint "z" "CA_REGS
   "@internal")
 
 ;; Use w as a prefix to add VSX modes
-;; vector double (V2DF)
+;; any VSX register
+(define_register_constraint "wa" "rs6000_constraints[RS6000_CONSTRAINT_wa]"
+  "Any VSX register if the -mvsx option was used or NO_REGS.")
+
 (define_register_constraint "wd" "rs6000_constraints[RS6000_CONSTRAINT_wd]"
-  "@internal")
+  "VSX vector register to hold vector double data or NO_REGS.")
 
-;; vector float (V4SF)
 (define_register_constraint "wf" "rs6000_constraints[RS6000_CONSTRAINT_wf]"
-  "@internal")
-
-;; scalar double (DF)
-(define_register_constraint "ws" "rs6000_constraints[RS6000_CONSTRAINT_ws]"
-  "@internal")
-
-;; TImode in VSX registers
-(define_register_constraint "wt" "rs6000_constraints[RS6000_CONSTRAINT_wt]"
-  "@internal")
-
-;; any VSX register
-(define_register_constraint "wa" "rs6000_constraints[RS6000_CONSTRAINT_wa]"
-  "@internal")
+  "VSX vector register to hold vector float data or NO_REGS.")
 
-;; Register constraints to simplify move patterns
 (define_register_constraint "wg" "rs6000_constraints[RS6000_CONSTRAINT_wg]"
-  "Floating point register if -mmfpgpr is used, or NO_REGS.")
+  "If -mmfpgpr was used, a floating point register or NO_REGS.")
 
 (define_register_constraint "wl" "rs6000_constraints[RS6000_CONSTRAINT_wl]"
   "Floating point register if the LFIWAX instruction is enabled or NO_REGS.")
@@ -82,23 +71,38 @@ (define_register_constraint "wl" "rs6000
 (define_register_constraint "wm" "rs6000_constraints[RS6000_CONSTRAINT_wm]"
   "VSX register if direct move instructions are enabled, or NO_REGS.")
 
+;; NO_REGs register constraint, used to merge mov{sd,sf}, since movsd can use
+;; direct move directly, and movsf can't to move between the register sets.
+;; There is a mode_attr that resolves to wm for SDmode and wn for SFmode
+(define_register_constraint "wn" "NO_REGS" "No register (NO_REGS).")
+
 (define_register_constraint "wr" "rs6000_constraints[RS6000_CONSTRAINT_wr]"
   "General purpose register if 64-bit instructions are enabled or NO_REGS.")
 
+(define_register_constraint "ws" "rs6000_constraints[RS6000_CONSTRAINT_ws]"
+  "VSX vector register to hold scalar double values or NO_REGS.")
+
+(define_register_constraint "wt" "rs6000_constraints[RS6000_CONSTRAINT_wt]"
+  "VSX vector register to hold 128 bit integer or NO_REGS.")
+
+(define_register_constraint "wu" "rs6000_constraints[RS6000_CONSTRAINT_wu]"
+  "Altivec register to use for float/32-bit int loads/stores  or NO_REGS.")
+
 (define_register_constraint "wv" "rs6000_constraints[RS6000_CONSTRAINT_wv]"
-  "Altivec register if -mpower8-vector is used or NO_REGS.")
+  "Altivec register to use for double loads/stores  or NO_REGS.")
+
+(define_register_constraint "ww" "rs6000_constraints[RS6000_CONSTRAINT_ww]"
+  "FP or VSX register to perform float operations under -mvsx or NO_REGS.")
 
 (define_register_constraint "wx" "rs6000_constraints[RS6000_CONSTRAINT_wx]"
   "Floating point register if the STFIWX instruction is enabled or NO_REGS.")
 
+(define_register_constraint "wy" "rs6000_constraints[RS6000_CONSTRAINT_wy]"
+  "VSX vector register to hold scalar float values or NO_REGS.")
+
 (define_register_constraint "wz" "rs6000_constraints[RS6000_CONSTRAINT_wz]"
   "Floating point register if the LFIWZX instruction is enabled or NO_REGS.")
 
-;; NO_REGs register constraint, used to merge mov{sd,sf}, since movsd can use
-;; direct move directly, and movsf can't to move between the register sets.
-;; There is a mode_attr that resolves to wm for SDmode and wn for SFmode
-(define_register_constraint "wn" "NO_REGS")
-
 ;; Lq/stq validates the address for load/store quad
 (define_memory_constraint "wQ"
   "Memory operand suitable for the load/store quad instructions"
Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def	(revision 201924)
+++ gcc/config/rs6000/rs6000-builtin.def	(working copy)
@@ -1209,9 +1209,9 @@ BU_VSX_1 (XVRSPIZ,	      "xvrspiz",	CONS
 
 BU_VSX_1 (XSRDPI,	      "xsrdpi",		CONST,	vsx_xsrdpi)
 BU_VSX_1 (XSRDPIC,	      "xsrdpic",	CONST,	vsx_xsrdpic)
-BU_VSX_1 (XSRDPIM,	      "xsrdpim",	CONST,	vsx_floordf2)
-BU_VSX_1 (XSRDPIP,	      "xsrdpip",	CONST,	vsx_ceildf2)
-BU_VSX_1 (XSRDPIZ,	      "xsrdpiz",	CONST,	vsx_btruncdf2)
+BU_VSX_1 (XSRDPIM,	      "xsrdpim",	CONST,	floordf2)
+BU_VSX_1 (XSRDPIP,	      "xsrdpip",	CONST,	ceildf2)
+BU_VSX_1 (XSRDPIZ,	      "xsrdpiz",	CONST,	btruncdf2)
 
 /* VSX predicate functions.  */
 BU_VSX_P (XVCMPEQSP_P,	      "xvcmpeqsp_p",	CONST,	vector_eq_v4sf_p)
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(revision 201924)
+++ gcc/config/rs6000/rs6000.opt	(working copy)
@@ -181,13 +181,16 @@ mvsx
 Target Report Mask(VSX) Var(rs6000_isa_flags)
 Use vector/scalar (VSX) instructions
 
+mvsx-scalar-float
+Target Undocumented Report Var(TARGET_VSX_SCALAR_FLOAT) Init(1)
+; If -mpower8-vector, use VSX arithmetic instructions for SFmode (on by default)
+
 mvsx-scalar-double
-Target Undocumented Report Var(TARGET_VSX_SCALAR_DOUBLE) Init(-1)
-; If -mvsx, use VSX arithmetic instructions for scalar double (on by default)
+Target Undocumented Report Var(TARGET_VSX_SCALAR_DOUBLE) Init(1)
+; If -mvsx, use VSX arithmetic instructions for DFmode (on by default)
 
 mvsx-scalar-memory
-Target Undocumented Report Var(TARGET_VSX_SCALAR_MEMORY)
-; If -mvsx, use VSX scalar memory reference instructions for scalar double (off by default)
+Target Undocumented Report Alias(mupper-regs-df)
 
 mvsx-align-128
 Target Undocumented Report Var(TARGET_VSX_ALIGN_128)
@@ -550,3 +553,11 @@ Generate the quad word memory instructio
 mcompat-align-parm
 Target Report Var(rs6000_compat_align_parm) Init(0) Save
 Generate aggregate parameter passing code with at most 64-bit alignment.
+
+mupper-regs-df
+Target Undocumented Mask(UPPER_REGS_DF) Var(rs6000_isa_flags)
+Allow double variables in upper registers with -mcpu=power7 or -mvsx
+
+mupper-regs-sf
+Target Undocumented Mask(UPPER_REGS_SF) Var(rs6000_isa_flags)
+Allow float variables in upper registers with -mcpu=power8 or -mp8-vector
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 201924)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -1625,18 +1625,26 @@ rs6000_hard_regno_mode_ok (int regno, en
   /* VSX registers that overlap the FPR registers are larger than for non-VSX
      implementations.  Don't allow an item to be split between a FP register
      and an Altivec register.  */
-  if (VECTOR_MEM_VSX_P (mode))
+  if (VECTOR_UNIT_VSX_OR_P8_VECTOR_P (mode)
+      || VECTOR_MEM_VSX_OR_P8_VECTOR_P (mode))
     {
       if (FP_REGNO_P (regno))
 	return FP_REGNO_P (last_regno);
 
       if (ALTIVEC_REGNO_P (regno))
-	return ALTIVEC_REGNO_P (last_regno);
-    }
+	{
+	  if (!ALTIVEC_REGNO_P (last_regno))
+	    return false;
 
-  /* Allow TImode in all VSX registers if the user asked for it.  */
-  if (mode == TImode && TARGET_VSX_TIMODE && VSX_REGNO_P (regno))
-    return 1;
+	  if (mode == DFmode && !TARGET_UPPER_REGS_DF)
+	    return false;
+
+	  if (mode == SFmode && !TARGET_UPPER_REGS_SF)
+	    return false;
+
+	  return true;
+	}
+    }
 
   /* The GPRs can hold any mode, but values bigger than one register
      cannot go past R31.  */
@@ -1891,8 +1899,11 @@ rs6000_debug_reg_global (void)
 	   "wr reg_class = %s\n"
 	   "ws reg_class = %s\n"
 	   "wt reg_class = %s\n"
+	   "wu reg_class = %s\n"
 	   "wv reg_class = %s\n"
+	   "ww reg_class = %s\n"
 	   "wx reg_class = %s\n"
+	   "wy reg_class = %s\n"
 	   "wz reg_class = %s\n"
 	   "\n",
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_d]],
@@ -1907,8 +1918,11 @@ rs6000_debug_reg_global (void)
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wr]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ws]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wt]],
+	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wu]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wv]],
+	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ww]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wx]],
+	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wy]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wz]]);
 
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
@@ -2135,6 +2149,24 @@ rs6000_debug_reg_global (void)
   if (rs6000_float_gprs)
     fprintf (stderr, DEBUG_FMT_S, "float_gprs", "true");
 
+  if (VECTOR_UNIT_VSX_P (DFmode))
+    fprintf (stderr, DEBUG_FMT_S, "vsx double",
+	     (TARGET_UPPER_REGS_DF) ? "fpr + altivec" : "fpr");
+  else if (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT)
+    fprintf (stderr, DEBUG_FMT_S, "trad. double", "fpr");
+  else if (TARGET_HARD_FLOAT && TARGET_E500_DOUBLE)
+    fprintf (stderr, DEBUG_FMT_S, "E500 double", "gpr");
+  else
+    fprintf (stderr, DEBUG_FMT_S, "software double", "gpr");
+
+  if (VECTOR_UNIT_P8_VECTOR_P (SFmode))
+    fprintf (stderr, DEBUG_FMT_S, "p8 float",
+	     (TARGET_UPPER_REGS_SF) ? "fpr + altivec" : "fpr");
+  else if (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT)
+    fprintf (stderr, DEBUG_FMT_S, "trad. float", "fpr");
+  else
+    fprintf (stderr, DEBUG_FMT_S, "software float", "gpr");
+
   if (TARGET_LINK_STACK)
     fprintf (stderr, DEBUG_FMT_S, "link_stack", "true");
 
@@ -2315,12 +2347,21 @@ rs6000_init_hard_regno_mode_ok (bool glo
       rs6000_vector_align[V2DImode] = align64;
     }
 
-  /* DFmode, see if we want to use the VSX unit.  */
+  /* SFmode, see if we want to use the VSX unit.  Don't set the VSX memory unit
+     for now, to allow use of the traditional floating point instructions.  */
+  if (TARGET_P8_VECTOR && TARGET_VSX_SCALAR_FLOAT)
+    {
+      rs6000_vector_unit[SFmode] = VECTOR_P8_VECTOR;
+      rs6000_vector_mem[SFmode] = VECTOR_NONE;
+      rs6000_vector_align[SFmode] = align32;
+    }
+
+  /* DFmode, see if we want to use the VSX unit.  Don't set the VSX memory unit
+     for now, to allow use of the traditional floating point instructions.  */
   if (TARGET_VSX && TARGET_VSX_SCALAR_DOUBLE)
     {
       rs6000_vector_unit[DFmode] = VECTOR_VSX;
-      rs6000_vector_mem[DFmode]
-	= (TARGET_VSX_SCALAR_MEMORY ? VECTOR_VSX : VECTOR_NONE);
+      rs6000_vector_mem[DFmode] = VECTOR_NONE;
       rs6000_vector_align[DFmode] = align64;
     }
 
@@ -2349,13 +2390,15 @@ rs6000_init_hard_regno_mode_ok (bool glo
 	 V4SF, wd = register class to use for V2DF, and ws = register classs to
 	 use for DF scalars.  */
       rs6000_constraints[RS6000_CONSTRAINT_wa] = VSX_REGS;
-      rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS;
       rs6000_constraints[RS6000_CONSTRAINT_wd] = VSX_REGS;
-      rs6000_constraints[RS6000_CONSTRAINT_ws] = (TARGET_VSX_SCALAR_MEMORY
-						  ? VSX_REGS
-						  : FLOAT_REGS);
+      rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS;
+      rs6000_constraints[RS6000_CONSTRAINT_wu] = ALTIVEC_REGS;
+
       if (TARGET_VSX_TIMODE)
 	rs6000_constraints[RS6000_CONSTRAINT_wt] = VSX_REGS;
+
+      rs6000_constraints[RS6000_CONSTRAINT_ws]
+	= (TARGET_UPPER_REGS_DF) ? VSX_REGS : FLOAT_REGS;
     }
 
   /* Add conditional constraints based on various options, to allow us to
@@ -2376,7 +2419,14 @@ rs6000_init_hard_regno_mode_ok (bool glo
     rs6000_constraints[RS6000_CONSTRAINT_wr] = GENERAL_REGS;
 
   if (TARGET_P8_VECTOR)
-    rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
+    {
+      rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
+      rs6000_constraints[RS6000_CONSTRAINT_wy]
+	= rs6000_constraints[RS6000_CONSTRAINT_ww]
+	= (TARGET_UPPER_REGS_SF) ? VSX_REGS : FLOAT_REGS;
+    }
+  else if (TARGET_VSX)
+    rs6000_constraints[RS6000_CONSTRAINT_ww] = FLOAT_REGS;
 
   if (TARGET_STFIWX)
     rs6000_constraints[RS6000_CONSTRAINT_wx] = FLOAT_REGS;
@@ -2409,20 +2459,22 @@ rs6000_init_hard_regno_mode_ok (bool glo
 	  rs6000_vector_reload[V4SFmode][1]  = CODE_FOR_reload_v4sf_di_load;
 	  rs6000_vector_reload[V2DFmode][0]  = CODE_FOR_reload_v2df_di_store;
 	  rs6000_vector_reload[V2DFmode][1]  = CODE_FOR_reload_v2df_di_load;
-	  if (TARGET_VSX && TARGET_VSX_SCALAR_MEMORY)
+#if 0
+	  if (VECTOR_UNIT_VSX_P (DFmode) && TARGET_UPPER_REGS_DF)
 	    {
 	      rs6000_vector_reload[DFmode][0]  = CODE_FOR_reload_df_di_store;
 	      rs6000_vector_reload[DFmode][1]  = CODE_FOR_reload_df_di_load;
 	      rs6000_vector_reload[DDmode][0]  = CODE_FOR_reload_dd_di_store;
 	      rs6000_vector_reload[DDmode][1]  = CODE_FOR_reload_dd_di_load;
 	    }
-	  if (TARGET_P8_VECTOR)
+	  if (VECTOR_UNIT_P8_VECTOR_P (SFmode) && TARGET_UPPER_REGS_SF)
 	    {
 	      rs6000_vector_reload[SFmode][0]  = CODE_FOR_reload_sf_di_store;
 	      rs6000_vector_reload[SFmode][1]  = CODE_FOR_reload_sf_di_load;
 	      rs6000_vector_reload[SDmode][0]  = CODE_FOR_reload_sd_di_store;
 	      rs6000_vector_reload[SDmode][1]  = CODE_FOR_reload_sd_di_load;
 	    }
+#endif
 	  if (TARGET_VSX_TIMODE)
 	    {
 	      rs6000_vector_reload[TImode][0]  = CODE_FOR_reload_ti_di_store;
@@ -2472,20 +2524,22 @@ rs6000_init_hard_regno_mode_ok (bool glo
 	  rs6000_vector_reload[V4SFmode][1]  = CODE_FOR_reload_v4sf_si_load;
 	  rs6000_vector_reload[V2DFmode][0]  = CODE_FOR_reload_v2df_si_store;
 	  rs6000_vector_reload[V2DFmode][1]  = CODE_FOR_reload_v2df_si_load;
-	  if (TARGET_VSX && TARGET_VSX_SCALAR_MEMORY)
+#if 0
+	  if (VECTOR_UNIT_VSX_P (DFmode) && TARGET_UPPER_REGS_DF)
 	    {
 	      rs6000_vector_reload[DFmode][0]  = CODE_FOR_reload_df_si_store;
 	      rs6000_vector_reload[DFmode][1]  = CODE_FOR_reload_df_si_load;
 	      rs6000_vector_reload[DDmode][0]  = CODE_FOR_reload_dd_si_store;
 	      rs6000_vector_reload[DDmode][1]  = CODE_FOR_reload_dd_si_load;
 	    }
-	  if (TARGET_P8_VECTOR)
+	  if (VECTOR_UNIT_P8_VECTOR_P (SFmode) && TARGET_UPPER_REGS_SF)
 	    {
 	      rs6000_vector_reload[SFmode][0]  = CODE_FOR_reload_sf_si_store;
 	      rs6000_vector_reload[SFmode][1]  = CODE_FOR_reload_sf_si_load;
 	      rs6000_vector_reload[SDmode][0]  = CODE_FOR_reload_sd_si_store;
 	      rs6000_vector_reload[SDmode][1]  = CODE_FOR_reload_sd_si_load;
 	    }
+#endif
 	  if (TARGET_VSX_TIMODE)
 	    {
 	      rs6000_vector_reload[TImode][0]  = CODE_FOR_reload_ti_si_store;
@@ -29162,6 +29216,8 @@ static struct rs6000_opt_mask const rs60
   { "recip-precision",		OPTION_MASK_RECIP_PRECISION,	false, true  },
   { "string",			OPTION_MASK_STRING,		false, true  },
   { "update",			OPTION_MASK_NO_UPDATE,		true , true  },
+  { "upper-regs-df",		OPTION_MASK_UPPER_REGS_DF,	false, false },
+  { "upper-regs-sf",		OPTION_MASK_UPPER_REGS_SF,	false, false },
   { "vsx",			OPTION_MASK_VSX,		false, true  },
   { "vsx-timode",		OPTION_MASK_VSX_TIMODE,		false, true  },
 #ifdef OPTION_MASK_64BIT
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 201924)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -316,40 +316,42 @@ (define_expand "vsx_store_<mode>"
   "")
 
 \f
-;; VSX scalar and vector floating point arithmetic instructions
+;; VSX vector floating point arithmetic instructions.  The VSX scalar
+;; instructions are now combined with the insn for the traditional floating
+;; point unit.
 (define_insn "*vsx_add<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (plus:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-		    (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (plus:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>add<VSs> %x0,%x1,%x2"
-  [(set_attr "type" "<VStype_simple>")
+  "xvadd<VSs> %x0,%x1,%x2"
+  [(set_attr "type" "vecdouble")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_sub<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (minus:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-		     (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (minus:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+		     (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>sub<VSs> %x0,%x1,%x2"
-  [(set_attr "type" "<VStype_simple>")
+  "xvsub<VSs> %x0,%x1,%x2"
+  [(set_attr "type" "vecdouble")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_mul<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (mult:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-		    (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (mult:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>mul<VSs> %x0,%x1,%x2"
-  [(set_attr "type" "<VStype_mul>")
+  "xvmul<VSs> %x0,%x1,%x2"
+  [(set_attr "type" "vecdouble")
    (set_attr "fp_type" "<VSfptype_mul>")])
 
 (define_insn "*vsx_div<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (div:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-		   (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (div:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+		   (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>div<VSs> %x0,%x1,%x2"
+  "xvdiv<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_div>")
    (set_attr "fp_type" "<VSfptype_div>")])
 
@@ -392,94 +394,72 @@ (define_insn "*vsx_tdiv<mode>3_internal"
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_fre<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")]
 		      UNSPEC_FRES))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>re<VSs> %x0,%x1"
-  [(set_attr "type" "<VStype_simple>")
+  "xvre<VSs> %x0,%x1"
+  [(set_attr "type" "vecdouble")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_neg<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (neg:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (neg:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>neg<VSs> %x0,%x1"
-  [(set_attr "type" "<VStype_simple>")
+  "xvneg<VSs> %x0,%x1"
+  [(set_attr "type" "vecdouble")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_abs<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (abs:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (abs:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>abs<VSs> %x0,%x1"
-  [(set_attr "type" "<VStype_simple>")
+  "xvabs<VSs> %x0,%x1"
+  [(set_attr "type" "vecdouble")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_nabs<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (neg:VSX_B
-	 (abs:VSX_B
-	  (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa"))))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (neg:VSX_F
+	 (abs:VSX_F
+	  (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa"))))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>nabs<VSs> %x0,%x1"
-  [(set_attr "type" "<VStype_simple>")
+  "xvnabs<VSs> %x0,%x1"
+  [(set_attr "type" "vecdouble")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_smax<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (smax:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-		    (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (smax:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>max<VSs> %x0,%x1,%x2"
-  [(set_attr "type" "<VStype_simple>")
+  "xvmax<VSs> %x0,%x1,%x2"
+  [(set_attr "type" "vecdouble")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_smin<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (smin:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-		    (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (smin:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>min<VSs> %x0,%x1,%x2"
-  [(set_attr "type" "<VStype_simple>")
+  "xvmin<VSs> %x0,%x1,%x2"
+  [(set_attr "type" "vecdouble")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
-;; Special VSX version of smin/smax for single precision floating point.  Since
-;; both numbers are rounded to single precision, we can just use the DP version
-;; of the instruction.
-
-(define_insn "*vsx_smaxsf3"
-  [(set (match_operand:SF 0 "vsx_register_operand" "=f")
-        (smax:SF (match_operand:SF 1 "vsx_register_operand" "f")
-		 (match_operand:SF 2 "vsx_register_operand" "f")))]
-  "VECTOR_UNIT_VSX_P (DFmode)"
-  "xsmaxdp %x0,%x1,%x2"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_addsub_d")])
-
-(define_insn "*vsx_sminsf3"
-  [(set (match_operand:SF 0 "vsx_register_operand" "=f")
-        (smin:SF (match_operand:SF 1 "vsx_register_operand" "f")
-		 (match_operand:SF 2 "vsx_register_operand" "f")))]
-  "VECTOR_UNIT_VSX_P (DFmode)"
-  "xsmindp %x0,%x1,%x2"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_addsub_d")])
-
 (define_insn "*vsx_sqrt<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (sqrt:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (sqrt:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>sqrt<VSs> %x0,%x1"
+  "xvsqrt<VSs> %x0,%x1"
   [(set_attr "type" "<VStype_sqrt>")
    (set_attr "fp_type" "<VSfptype_sqrt>")])
 
 (define_insn "*vsx_rsqrte<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")]
 		      UNSPEC_RSQRT))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>rsqrte<VSs> %x0,%x1"
+  "xvrsqrte<VSs> %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
@@ -518,26 +498,9 @@ (define_insn "*vsx_tsqrt<mode>2_internal
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
-;; Fused vector multiply/add instructions Support the classical DF versions of
-;; fma, which allows the target to be a separate register from the 3 inputs.
-;; Under VSX, the target must be either the addend or the first multiply.
-;; Where we can, also do the same for the Altivec V4SF fmas.
-
-(define_insn "*vsx_fmadf4"
-  [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws,?wa,?wa,d")
-	(fma:DF
-	  (match_operand:DF 1 "vsx_register_operand" "%ws,ws,wa,wa,d")
-	  (match_operand:DF 2 "vsx_register_operand" "ws,0,wa,0,d")
-	  (match_operand:DF 3 "vsx_register_operand" "0,ws,0,wa,d")))]
-  "VECTOR_UNIT_VSX_P (DFmode)"
-  "@
-   xsmaddadp %x0,%x1,%x2
-   xsmaddmdp %x0,%x1,%x3
-   xsmaddadp %x0,%x1,%x2
-   xsmaddmdp %x0,%x1,%x3
-   fmadd %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
+;; Fused vector multiply/add instructions.  Under VSX, the target must be
+;; either the addend or the first multiply.  Where we can, also do the same for
+;; the Altivec V4SF fmas.
 
 (define_insn "*vsx_fmav4sf4"
   [(set (match_operand:V4SF 0 "vsx_register_operand" "=ws,ws,?wa,?wa,v")
@@ -568,23 +531,6 @@ (define_insn "*vsx_fmav2df4"
    xvmaddmdp %x0,%x1,%x3"
   [(set_attr "type" "vecdouble")])
 
-(define_insn "*vsx_fmsdf4"
-  [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws,?wa,?wa,d")
-	(fma:DF
-	  (match_operand:DF 1 "vsx_register_operand" "%ws,ws,wa,wa,d")
-	  (match_operand:DF 2 "vsx_register_operand" "ws,0,wa,0,d")
-	  (neg:DF
-	    (match_operand:DF 3 "vsx_register_operand" "0,ws,0,wa,d"))))]
-  "VECTOR_UNIT_VSX_P (DFmode)"
-  "@
-   xsmsubadp %x0,%x1,%x2
-   xsmsubmdp %x0,%x1,%x3
-   xsmsubadp %x0,%x1,%x2
-   xsmsubmdp %x0,%x1,%x3
-   fmsub %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
 (define_insn "*vsx_fms<mode>4"
   [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa")
 	(fma:VSX_F
@@ -594,29 +540,12 @@ (define_insn "*vsx_fms<mode>4"
 	    (match_operand:VSX_F 3 "vsx_register_operand" "0,<VSr>,0,wa"))))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "@
-   x<VSv>msuba<VSs> %x0,%x1,%x2
-   x<VSv>msubm<VSs> %x0,%x1,%x3
-   x<VSv>msuba<VSs> %x0,%x1,%x2
-   x<VSv>msubm<VSs> %x0,%x1,%x3"
+   xvmsuba<VSs> %x0,%x1,%x2
+   xvmsubm<VSs> %x0,%x1,%x3
+   xvmsuba<VSs> %x0,%x1,%x2
+   xvmsubm<VSs> %x0,%x1,%x3"
   [(set_attr "type" "<VStype_mul>")])
 
-(define_insn "*vsx_nfmadf4"
-  [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws,?wa,?wa,d")
-	(neg:DF
-	 (fma:DF
-	  (match_operand:DF 1 "vsx_register_operand" "ws,ws,wa,wa,d")
-	  (match_operand:DF 2 "vsx_register_operand" "ws,0,wa,0,d")
-	  (match_operand:DF 3 "vsx_register_operand" "0,ws,0,wa,d"))))]
-  "VECTOR_UNIT_VSX_P (DFmode)"
-  "@
-   xsnmaddadp %x0,%x1,%x2
-   xsnmaddmdp %x0,%x1,%x3
-   xsnmaddadp %x0,%x1,%x2
-   xsnmaddmdp %x0,%x1,%x3
-   fnmadd %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
 (define_insn "*vsx_nfma<mode>4"
   [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa")
 	(neg:VSX_F
@@ -626,31 +555,13 @@ (define_insn "*vsx_nfma<mode>4"
 	  (match_operand:VSX_F 3 "vsx_register_operand" "0,<VSr>,0,wa"))))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "@
-   x<VSv>nmadda<VSs> %x0,%x1,%x2
-   x<VSv>nmaddm<VSs> %x0,%x1,%x3
-   x<VSv>nmadda<VSs> %x0,%x1,%x2
-   x<VSv>nmaddm<VSs> %x0,%x1,%x3"
+   xvnmadda<VSs> %x0,%x1,%x2
+   xvnmaddm<VSs> %x0,%x1,%x3
+   xvnmadda<VSs> %x0,%x1,%x2
+   xvnmaddm<VSs> %x0,%x1,%x3"
   [(set_attr "type" "<VStype_mul>")
    (set_attr "fp_type" "<VSfptype_mul>")])
 
-(define_insn "*vsx_nfmsdf4"
-  [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws,?wa,?wa,d")
-	(neg:DF
-	 (fma:DF
-	   (match_operand:DF 1 "vsx_register_operand" "%ws,ws,wa,wa,d")
-	   (match_operand:DF 2 "vsx_register_operand" "ws,0,wa,0,d")
-	   (neg:DF
-	     (match_operand:DF 3 "vsx_register_operand" "0,ws,0,wa,d")))))]
-  "VECTOR_UNIT_VSX_P (DFmode)"
-  "@
-   xsnmsubadp %x0,%x1,%x2
-   xsnmsubmdp %x0,%x1,%x3
-   xsnmsubadp %x0,%x1,%x2
-   xsnmsubmdp %x0,%x1,%x3
-   fnmsub %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
 (define_insn "*vsx_nfmsv4sf4"
   [(set (match_operand:V4SF 0 "vsx_register_operand" "=wf,wf,?wa,?wa,v")
 	(neg:V4SF
@@ -712,16 +623,6 @@ (define_insn "*vsx_ge<mode>"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
-;; Floating point scalar compare
-(define_insn "*vsx_cmpdf_internal1"
-  [(set (match_operand:CCFP 0 "cc_reg_operand" "=y,?y")
-	(compare:CCFP (match_operand:DF 1 "gpc_reg_operand" "ws,wa")
-		      (match_operand:DF 2 "gpc_reg_operand" "ws,wa")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && VECTOR_UNIT_VSX_P (DFmode)"
-  "xscmpudp %0,%x1,%x2"
-  [(set_attr "type" "fpcompare")])
-
 ;; Compare vectors producing a vector result and a predicate, setting CR6 to
 ;; indicate a combined status
 (define_insn "*vsx_eq_<mode>_p"
@@ -788,14 +689,14 @@ (define_insn "*vsx_xxsel<mode>_uns"
 
 ;; Copy sign
 (define_insn "vsx_copysign<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_B
-	 [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-	  (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+	(unspec:VSX_F
+	 [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+	  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")]
 	 UNSPEC_COPYSIGN))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>cpsgn<VSs> %x0,%x2,%x1"
-  [(set_attr "type" "<VStype_simple>")
+  "xvcpsgn<VSs> %x0,%x2,%x1"
+  [(set_attr "type" "vecdouble")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 ;; For the conversions, limit the register class for the integer value to be
@@ -855,11 +756,11 @@ (define_insn "vsx_x<VSv>r<VSs>ic"
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_btrunc<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(fix:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+	(fix:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>r<VSs>iz %x0,%x1"
-  [(set_attr "type" "<VStype_simple>")
+  "xvr<VSs>iz %x0,%x1"
+  [(set_attr "type" "vecdouble")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_b2trunc<mode>2"
@@ -872,21 +773,21 @@ (define_insn "*vsx_b2trunc<mode>2"
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_floor<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")]
 		      UNSPEC_FRIM))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>r<VSs>im %x0,%x1"
-  [(set_attr "type" "<VStype_simple>")
+  "xvr<VSs>im %x0,%x1"
+  [(set_attr "type" "vecdouble")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_ceil<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")]
 		      UNSPEC_FRIP))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>r<VSs>ip %x0,%x1"
-  [(set_attr "type" "<VStype_simple>")
+  "xvr<VSs>ip %x0,%x1"
+  [(set_attr "type" "vecdouble")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 \f
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 201924)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -617,6 +617,25 @@ extern int rs6000_vector_align[];
 			  || rs6000_cpu == PROCESSOR_PPC8548)
 
 
+/* Whether SF/DF operations are supported on the E500.  */
+#define TARGET_SF_SPE	(TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT	\
+			 && !TARGET_FPRS)
+
+#define TARGET_DF_SPE	(TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT	\
+			 && !TARGET_FPRS && TARGET_E500_DOUBLE)
+
+/* Whether SF/DF operations are supported by by the normal floating point unit
+   (or the vector/scalar unit).  */
+#define TARGET_SF_FPR	(TARGET_HARD_FLOAT && TARGET_FPRS		\
+			 && TARGET_SINGLE_FLOAT)
+
+#define TARGET_DF_FPR	(TARGET_HARD_FLOAT && TARGET_FPRS		\
+			 && TARGET_DOUBLE_FLOAT)
+
+/* Whether SF/DF operations are supported by any hardware.  */
+#define TARGET_SF_INSN	(TARGET_SF_FPR || TARGET_SF_SPE)
+#define TARGET_DF_INSN	(TARGET_DF_FPR || TARGET_DF_SPE)
+
 /* Which machine supports the various reciprocal estimate instructions.  */
 #define TARGET_FRES	(TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT \
 			 && TARGET_FPRS && TARGET_SINGLE_FLOAT)
@@ -1403,15 +1422,18 @@ enum r6000_reg_class_enum {
   RS6000_CONSTRAINT_v,		/* Altivec registers */
   RS6000_CONSTRAINT_wa,		/* Any VSX register */
   RS6000_CONSTRAINT_wd,		/* VSX register for V2DF */
-  RS6000_CONSTRAINT_wg,		/* FPR register for -mmfpgpr */
   RS6000_CONSTRAINT_wf,		/* VSX register for V4SF */
+  RS6000_CONSTRAINT_wg,		/* FPR register for -mmfpgpr */
   RS6000_CONSTRAINT_wl,		/* FPR register for LFIWAX */
   RS6000_CONSTRAINT_wm,		/* VSX register for direct move */
   RS6000_CONSTRAINT_wr,		/* GPR register if 64-bit  */
   RS6000_CONSTRAINT_ws,		/* VSX register for DF */
   RS6000_CONSTRAINT_wt,		/* VSX register for TImode */
-  RS6000_CONSTRAINT_wv,		/* Altivec register for power8 vector */
+  RS6000_CONSTRAINT_wu,		/* Altivec register for float load/stores.  */
+  RS6000_CONSTRAINT_wv,		/* Altivec register for double load/stores.  */
+  RS6000_CONSTRAINT_ww,		/* FP or VSX register for vsx float ops.  */
   RS6000_CONSTRAINT_wx,		/* FPR register for STFIWX */
+  RS6000_CONSTRAINT_wy,		/* VSX register for SF */
   RS6000_CONSTRAINT_wz,		/* FPR register for LFIWZX */
   RS6000_CONSTRAINT_MAX
 };
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 201924)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -335,6 +335,25 @@ (define_mode_iterator RECIPF [SF DF V4SF
 ; Iterator for just SF/DF
 (define_mode_iterator SFDF [SF DF])
 
+; SF/DF suffix for traditional floating instructions
+(define_mode_attr Ftrad		[(SF "s") (DF "")])
+
+; SF/DF suffix for VSX instructions
+(define_mode_attr Fvsx		[(SF "sp") (DF	"dp")])
+
+; SF/DF constraint for arithmetic on traditional floating point registers
+(define_mode_attr Ff		[(SF "f") (DF "d")])
+
+; SF/DF constraint for arithmetic on VSX registers
+(define_mode_attr Fv		[(SF "wy") (DF "ws")])
+
+; s/d suffix for things like fp_addsub_s/fp_addsub_d
+(define_mode_attr Fs		[(SF "s")  (DF "d")])
+
+; FRE/FRES support
+(define_mode_attr Ffre		[(SF "fres") (DF "fre")])
+(define_mode_attr FFRE		[(SF "FRES") (DF "FRE")])
+
 ; Conditional returns.
 (define_code_iterator any_return [return simple_return])
 (define_code_attr return_pred [(return "direct_return ()")
@@ -541,7 +560,7 @@ (define_split
   "")
 
 (define_insn "*zero_extendsidi2_lfiwzx"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wm,!wz,!wm")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wm,!wz,!wu")
 	(zero_extend:DI (match_operand:SI 1 "reg_or_mem_operand" "m,r,r,Z,Z")))]
   "TARGET_POWERPC64 && TARGET_LFIWZX"
   "@
@@ -711,7 +730,7 @@ (define_expand "extendsidi2"
   "")
 
 (define_insn "*extendsidi2_lfiwax"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wm,!wl,!wm")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wm,!wl,!wu")
 	(sign_extend:DI (match_operand:SI 1 "lwa_operand" "m,r,r,Z,Z")))]
   "TARGET_POWERPC64 && TARGET_LFIWAX"
   "@
@@ -5042,232 +5061,292 @@ (define_split
   "")
 
 ;; Floating-point insns, excluding normal data motion.
-;;
-;; PowerPC has a full set of single-precision floating point instructions.
-;;
-;; For the POWER architecture, we pretend that we have both SFmode and
-;; DFmode insns, while, in fact, all fp insns are actually done in double.
-;; The only conversions we will do will be when storing to memory.  In that
-;; case, we will use the "frsp" instruction before storing.
-;;
-;; Note that when we store into a single-precision memory location, we need to
-;; use the frsp insn first.  If the register being stored isn't dead, we
-;; need a scratch register for the frsp.  But this is difficult when the store
-;; is done by reload.  It is not incorrect to do the frsp on the register in
-;; this case, we just lose precision that we would have otherwise gotten but
-;; is not guaranteed.  Perhaps this should be tightened up at some point.
 
-(define_expand "extendsfdf2"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(float_extend:DF (match_operand:SF 1 "reg_or_none500mem_operand" "")))]
-  "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
+;; This section provides the expanders for single/double precision binary
+;; floating point, but the support the E500 floating point instructions is in
+;; spe.md.
+
+;; Use the double varient of the instruction if we aren't doing calculations
+;; that modify the bottom bits (abs, min, max, copysign, etc.) and there isn't
+;; an explicit single precision version.
+
+;; We use the VSX instructions for accessing the upper 32 registers under VSX
+;; for DFmode and p8-vector for SFmode.  We generate the traditional floating
+;; point instructions if all of the registers come from the traditional float
+;; point register set.  Originally when the first support for VSX went in, we
+;; would use the xs* instruction all of the time, but this simplifies things.
+
+(define_expand "abs<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(abs:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")))]
+  "TARGET_<MODE>_INSN"
   "")
 
-(define_insn_and_split "*extendsfdf2_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d,?d,d")
-	(float_extend:DF (match_operand:SF 1 "reg_or_mem_operand" "0,f,m")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+(define_insn "*abs<mode>2_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(abs:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR"
   "@
-   #
-   fmr %0,%1
-   lfs%U1%X1 %0,%1"
-  "&& reload_completed && REG_P (operands[1]) && REGNO (operands[0]) == REGNO (operands[1])"
-  [(const_int 0)]
-{
-  emit_note (NOTE_INSN_DELETED);
-  DONE;
-}
-  [(set_attr_alternative "type"
-      [(const_string "fp")
-       (const_string "fp")
-       (if_then_else
-	 (match_test "update_indexed_address_mem (operands[1], VOIDmode)")
-	 (const_string "fpload_ux")
-	 (if_then_else
-	   (match_test "update_address_mem (operands[1], VOIDmode)")
-	   (const_string "fpload_u")
-	   (const_string "fpload")))])])
-
-(define_expand "truncdfsf2"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(float_truncate:SF (match_operand:DF 1 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
-  "")
+   fabs %0,%1
+   xsabsdp %x0,%x1"
+  [(set_attr "type" "fp")])
 
-(define_insn "*truncdfsf2_fpr"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(float_truncate:SF (match_operand:DF 1 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
-  "frsp %0,%1"
+(define_insn "*nabs<mode>2_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(neg:SFDF
+	 (abs:SFDF
+	  (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>"))))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fnabs %0,%1
+   xsnabsdp %x0,%x1"
   [(set_attr "type" "fp")])
 
-(define_expand "negsf2"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(neg:SF (match_operand:SF 1 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT"
+(define_expand "neg<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(neg:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")))]
+  "TARGET_<MODE>_INSN"
   "")
 
-(define_insn "*negsf2"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(neg:SF (match_operand:SF 1 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fneg %0,%1"
+(define_insn "*neg<mode>2_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(neg:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fneg %0,%1
+   xsnegdp %x0,%x1"
   [(set_attr "type" "fp")])
 
-(define_expand "abssf2"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(abs:SF (match_operand:SF 1 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT"
-  "")
+;; For MIN, MAX on non-VSX machines, and conditional move all of the time, we
+;; use DEFINE_EXPAND's that involve a fsel instruction and some auxiliary
+;; computations.  Then we just have a single DEFINE_INSN for fsel and the
+;; define_splits to make them if made by combine.  On VSX machines we have the
+;; min/max instructions.
+;;
+;; On VSX, we only check for TARGET_VSX instead of checking for a vsx/p8 vector
+;; to allow either DF/SF to use only traditional registers.
 
-(define_insn "*abssf2"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(abs:SF (match_operand:SF 1 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fabs %0,%1"
-  [(set_attr "type" "fp")])
+(define_expand "smax<mode>3"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(if_then_else:SFDF (ge (match_operand:SFDF 1 "gpc_reg_operand" "")
+			       (match_operand:SFDF 2 "gpc_reg_operand" ""))
+			   (match_dup 1)
+			   (match_dup 2)))]
+  "TARGET_<MODE>_FPR && TARGET_PPC_GFXOPT && !flag_trapping_math"
+{
+  rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]);
+  DONE;
+})
 
-(define_insn ""
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(neg:SF (abs:SF (match_operand:SF 1 "gpc_reg_operand" "f"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fnabs %0,%1"
+(define_insn "*smax<mode>3_vsx"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(smax:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,<Fv>")
+		   (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR && TARGET_VSX"
+  "xsmaxdp %x0,%x1,%x2"
   [(set_attr "type" "fp")])
 
-(define_expand "addsf3"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(plus:SF (match_operand:SF 1 "gpc_reg_operand" "")
-		 (match_operand:SF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT"
-  "")
+(define_expand "smin<mode>3"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(if_then_else:SFDF (ge (match_operand:SFDF 1 "gpc_reg_operand" "")
+			       (match_operand:SFDF 2 "gpc_reg_operand" ""))
+			   (match_dup 2)
+			   (match_dup 1)))]
+  "TARGET_<MODE>_FPR && TARGET_PPC_GFXOPT && !flag_trapping_math"
+{
+  rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]);
+  DONE;
+})
 
-(define_insn ""
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(plus:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
-		 (match_operand:SF 2 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fadds %0,%1,%2"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_addsub_s")])
+(define_insn "*smin<mode>3_vsx"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(smin:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,<Fv>")
+		   (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR && TARGET_VSX"
+  "xsmindp %x0,%x1,%x2"
+  [(set_attr "type" "fp")])
 
-(define_expand "subsf3"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(minus:SF (match_operand:SF 1 "gpc_reg_operand" "")
-		  (match_operand:SF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT"
+(define_split
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(match_operator:SFDF 3 "min_max_operator"
+	 [(match_operand:SFDF 1 "gpc_reg_operand" "")
+	  (match_operand:SFDF 2 "gpc_reg_operand" "")]))]
+  "TARGET_<MODE>_FPR && TARGET_PPC_GFXOPT && !flag_trapping_math
+   && !TARGET_VSX"
+  [(const_int 0)]
+{
+  rs6000_emit_minmax (operands[0], GET_CODE (operands[3]), operands[1],
+		      operands[2]);
+  DONE;
+})
+
+(define_expand "add<mode>3"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(plus:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")
+		   (match_operand:SFDF 2 "gpc_reg_operand" "")))]
+  "TARGET_<MODE>_INSN"
   "")
 
-(define_insn ""
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(minus:SF (match_operand:SF 1 "gpc_reg_operand" "f")
-		  (match_operand:SF 2 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fsubs %0,%1,%2"
+(define_insn "*add<mode>3_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(plus:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,<Fv>")
+		   (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fadd<Ftrad> %0,%1,%2
+   xsadd<Fvsx> %x0,%x1,%x2"
   [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_addsub_s")])
+   (set_attr "fp_type" "fp_addsub_<Fs>")])
 
-(define_expand "mulsf3"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(mult:SF (match_operand:SF 1 "gpc_reg_operand" "")
-		 (match_operand:SF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT"
+(define_expand "sub<mode>3"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(minus:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")
+		    (match_operand:SFDF 2 "gpc_reg_operand" "")))]
+  "TARGET_<MODE>_INSN"
   "")
 
-(define_insn ""
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
-		 (match_operand:SF 2 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fmuls %0,%1,%2"
+(define_insn "*sub<mode>3_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(minus:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")
+		    (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fsub<Ftrad> %0,%1,%2
+   xssub<Fvsx> %x0,%x1,%x2"
   [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_mul_s")])
+   (set_attr "fp_type" "fp_addsub_<Fs>")])
 
-(define_expand "divsf3"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(div:SF (match_operand:SF 1 "gpc_reg_operand" "")
-		(match_operand:SF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT && !TARGET_SIMPLE_FPU"
+(define_expand "mul<mode>3"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(mult:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")
+		   (match_operand:SFDF 2 "gpc_reg_operand" "")))]
+  "TARGET_<MODE>_INSN"
   "")
 
-(define_insn ""
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(div:SF (match_operand:SF 1 "gpc_reg_operand" "f")
-		(match_operand:SF 2 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS
-   && TARGET_SINGLE_FLOAT && !TARGET_SIMPLE_FPU"
-  "fdivs %0,%1,%2"
-  [(set_attr "type" "sdiv")])
+(define_insn "*mul<mode>3_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(mult:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,<Fv>")
+		   (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fmul<Ftrad> %0,%1,%2
+   xsmul<Fvsx> %x0,%x1,%x2"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_mul_<Fs>")])
 
-(define_insn "fres"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(unspec:SF [(match_operand:SF 1 "gpc_reg_operand" "f")] UNSPEC_FRES))]
-  "TARGET_FRES"
-  "fres %0,%1"
+(define_expand "div<mode>3"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(div:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")
+		  (match_operand:SFDF 2 "gpc_reg_operand" "")))]
+  "TARGET_<MODE>_INSN && !TARGET_SIMPLE_FPU"
+  "")
+
+(define_insn "*div<mode>3_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(div:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")
+		  (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR && !TARGET_SIMPLE_FPU"
+  "@
+   fdiv<Ftrad> %0,%1,%2
+   xsdiv<Fvsx> %x0,%x1,%x2"
+  [(set_attr "type" "<Fs>div")])
+
+(define_insn "fre<Fs>"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")]
+		     UNSPEC_FRES))]
+  "TARGET_<FFRE>"
+  "@
+   fre<Ftrad> %0,%1
+   xsre<Fvsx> %x0,%x1"
   [(set_attr "type" "fp")])
 
-; builtin fmaf support
-(define_insn "*fmasf4_fpr"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(fma:SF (match_operand:SF 1 "gpc_reg_operand" "f")
-		(match_operand:SF 2 "gpc_reg_operand" "f")
-		(match_operand:SF 3 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fmadds %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_s")])
+(define_insn "sqrt<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(sqrt:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR && !TARGET_SIMPLE_FPU
+   && (TARGET_PPC_GPOPT || (<MODE>mode == SFmode && TARGET_XILINX_FPU))"
+  "@
+   fsqrt<Ftrad> %0,%1
+   xssqrt<Fvsx> %x0,%x1"
+  [(set_attr "type" "<Fs>sqrt")])
+
+(define_insn "*rsqrt<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")]
+		     UNSPEC_RSQRT))]
+  "RS6000_RECIP_HAVE_RSQRTE_P (<MODE>mode)"
+  "@
+   frsqrte<Ftrad> %0,%1
+   xsrsqrte<Fvsx> %x0,%x1"
+  [(set_attr "type" "fp")])
 
-(define_insn "*fmssf4_fpr"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(fma:SF (match_operand:SF 1 "gpc_reg_operand" "f")
-		(match_operand:SF 2 "gpc_reg_operand" "f")
-		(neg:SF (match_operand:SF 3 "gpc_reg_operand" "f"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fmsubs %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_s")])
+(define_insn "*cmp<mode>_fpr"
+  [(set (match_operand:CCFP 0 "cc_reg_operand" "=y,y")
+	(compare:CCFP (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")
+		      (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fcmpu %0,%1,%2
+   xscmpudp %x0,%x1,%x2"
+  [(set_attr "type" "fpcompare")])
 
-(define_insn "*nfmasf4_fpr"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(neg:SF (fma:SF (match_operand:SF 1 "gpc_reg_operand" "f")
-			(match_operand:SF 2 "gpc_reg_operand" "f")
-			(match_operand:SF 3 "gpc_reg_operand" "f"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fnmadds %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_s")])
+(define_expand "extendsfdf2"
+  [(set (match_operand:DF 0 "gpc_reg_operand" "")
+	(float_extend:DF (match_operand:SF 1 "reg_or_none500mem_operand" "")))]
+  "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
+  "")
 
-(define_insn "*nfmssf4_fpr"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(neg:SF (fma:SF (match_operand:SF 1 "gpc_reg_operand" "f")
-			(match_operand:SF 2 "gpc_reg_operand" "f")
-			(neg:SF (match_operand:SF 3 "gpc_reg_operand" "f")))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fnmsubs %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_s")])
+(define_insn_and_split "*extendsfdf2_fpr"
+  [(set (match_operand:DF 0 "gpc_reg_operand" "=d,?d,d,wy,?wy,wy")
+	(float_extend:DF (match_operand:SF 1 "reg_or_mem_operand" "0,f,m,0,wy,Z")))]
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+  "@
+   #
+   fmr %0,%1
+   lfs%U1%X1 %0,%1
+   #
+   xxlor %x0,%x1,%x1
+   lxsspx %x0,%y1"
+  "&& reload_completed && REG_P (operands[1]) && REGNO (operands[0]) == REGNO (operands[1])"
+  [(const_int 0)]
+{
+  emit_note (NOTE_INSN_DELETED);
+  DONE;
+}
+  [(set_attr_alternative "type"
+      [(const_string "fp")
+       (const_string "fp")
+       (if_then_else
+	 (match_test "update_indexed_address_mem (operands[1], VOIDmode)")
+	 (const_string "fpload_ux")
+	 (if_then_else
+	   (match_test "update_address_mem (operands[1], VOIDmode)")
+	   (const_string "fpload_u")
+	   (const_string "fpload")))
+       (const_string "fp")
+       (const_string "vecsimple")
+       (if_then_else
+	(match_test "update_indexed_address_mem (operands[1], VOIDmode)")
+	(const_string "fpload_ux")
+	(if_then_else
+	 (match_test "update_address_mem (operands[1], VOIDmode)")
+	 (const_string "fpload_u")
+	 (const_string "fpload")))])])
 
-(define_expand "sqrtsf2"
+(define_expand "truncdfsf2"
   [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(sqrt:SF (match_operand:SF 1 "gpc_reg_operand" "")))]
-  "(TARGET_PPC_GPOPT || TARGET_XILINX_FPU)
-   && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT
-   && !TARGET_SIMPLE_FPU"
+	(float_truncate:SF (match_operand:DF 1 "gpc_reg_operand" "")))]
+  "TARGET_DF_INSN"
   "")
 
-(define_insn ""
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(sqrt:SF (match_operand:SF 1 "gpc_reg_operand" "f")))]
-  "(TARGET_PPC_GPOPT || TARGET_XILINX_FPU) && TARGET_HARD_FLOAT
-   && TARGET_FPRS && TARGET_SINGLE_FLOAT && !TARGET_SIMPLE_FPU"
-  "fsqrts %0,%1"
-  [(set_attr "type" "ssqrt")])
-
-(define_insn "*rsqrtsf_internal1"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(unspec:SF [(match_operand:SF 1 "gpc_reg_operand" "f")]
-		   UNSPEC_RSQRT))]
-  "TARGET_FRSQRTES"
-  "frsqrtes %0,%1"
+(define_insn "*truncdfsf2_fpr"
+  [(set (match_operand:SF 0 "gpc_reg_operand" "=f,wy")
+	(float_truncate:SF (match_operand:DF 1 "gpc_reg_operand" "d,wy")))]
+  "TARGET_DF_FPR"
+  "@
+   frsp %0,%1
+   xscvdpsp %x0,%x1"
   [(set_attr "type" "fp")])
 
 ;; This expander is here to avoid FLOAT_WORDS_BIGENDIAN tests in
@@ -5337,52 +5416,16 @@ (define_expand "copysign<mode>3"
 ;; Use an unspec rather providing an if-then-else in RTL, to prevent the
 ;; compiler from optimizing -0.0
 (define_insn "copysign<mode>3_fcpsgn"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<rreg2>")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<rreg2>")
-		      (match_operand:SFDF 2 "gpc_reg_operand" "<rreg2>")]
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")
+		      (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")]
 		     UNSPEC_COPYSIGN))]
-  "TARGET_CMPB && !VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "fcpsgn %0,%2,%1"
+  "TARGET_<MODE>_FPR && TARGET_CMPB"
+  "@
+   fcpsgn %0,%2,%1
+   xscpsgn<VSs> %x0,%x2,%x1"
   [(set_attr "type" "fp")])
 
-;; For MIN, MAX, and conditional move, we use DEFINE_EXPAND's that involve a
-;; fsel instruction and some auxiliary computations.  Then we just have a
-;; single DEFINE_INSN for fsel and the define_splits to make them if made by
-;; combine.
-(define_expand "smaxsf3"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(if_then_else:SF (ge (match_operand:SF 1 "gpc_reg_operand" "")
-			     (match_operand:SF 2 "gpc_reg_operand" ""))
-			 (match_dup 1)
-			 (match_dup 2)))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS 
-   && TARGET_SINGLE_FLOAT && !flag_trapping_math"
-  "{ rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]); DONE;}")
-
-(define_expand "sminsf3"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(if_then_else:SF (ge (match_operand:SF 1 "gpc_reg_operand" "")
-			     (match_operand:SF 2 "gpc_reg_operand" ""))
-			 (match_dup 2)
-			 (match_dup 1)))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS 
-   && TARGET_SINGLE_FLOAT && !flag_trapping_math"
-  "{ rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]); DONE;}")
-
-(define_split
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(match_operator:SF 3 "min_max_operator"
-	 [(match_operand:SF 1 "gpc_reg_operand" "")
-	  (match_operand:SF 2 "gpc_reg_operand" "")]))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS 
-   && TARGET_SINGLE_FLOAT && !flag_trapping_math"
-  [(const_int 0)]
-  "
-{ rs6000_emit_minmax (operands[0], GET_CODE (operands[3]),
-		      operands[1], operands[2]);
-  DONE;
-}")
-
 (define_expand "mov<mode>cc"
    [(set (match_operand:GPR 0 "gpc_reg_operand" "")
 	 (if_then_else:GPR (match_operand 1 "comparison_operator" "")
@@ -5465,12 +5508,12 @@ (define_insn "*isel_reversed_unsigned_<m
   [(set_attr "type" "isel")
    (set_attr "length" "4")])
 
-(define_expand "movsfcc"
-   [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	 (if_then_else:SF (match_operand 1 "comparison_operator" "")
-			  (match_operand:SF 2 "gpc_reg_operand" "")
-			  (match_operand:SF 3 "gpc_reg_operand" "")))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
+(define_expand "mov<mode>cc"
+   [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	 (if_then_else:SFDF (match_operand 1 "comparison_operator" "")
+			    (match_operand:SFDF 2 "gpc_reg_operand" "")
+			    (match_operand:SFDF 3 "gpc_reg_operand" "")))]
+  "TARGET_<MODE>_FPR && TARGET_PPC_GFXOPT"
   "
 {
   if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
@@ -5479,282 +5522,37 @@ (define_expand "movsfcc"
     FAIL;
 }")
 
-(define_insn "*fselsfsf4"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(if_then_else:SF (ge (match_operand:SF 1 "gpc_reg_operand" "f")
-			     (match_operand:SF 4 "zero_fp_constant" "F"))
-			 (match_operand:SF 2 "gpc_reg_operand" "f")
-			 (match_operand:SF 3 "gpc_reg_operand" "f")))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fsel %0,%1,%2,%3"
-  [(set_attr "type" "fp")])
-
-(define_insn "*fseldfsf4"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(if_then_else:SF (ge (match_operand:DF 1 "gpc_reg_operand" "d")
-			     (match_operand:DF 4 "zero_fp_constant" "F"))
-			 (match_operand:SF 2 "gpc_reg_operand" "f")
-			 (match_operand:SF 3 "gpc_reg_operand" "f")))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_SINGLE_FLOAT"
+;; There is no single VSX instruction that does what fsel does, so limit this
+;; to traditional floating point registers.
+(define_insn "*fsel<mode>sf4"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>")
+	(if_then_else:SFDF (ge (match_operand:SF 1 "gpc_reg_operand" "f")
+			       (match_operand:SF 4 "zero_fp_constant" "F"))
+			 (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>")
+			 (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>")))]
+  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT
+   && TARGET_DOUBLE_FLOAT"
   "fsel %0,%1,%2,%3"
   [(set_attr "type" "fp")])
 
-(define_expand "negdf2"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(neg:DF (match_operand:DF 1 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
-  "")
-
-(define_insn "*negdf2_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(neg:DF (match_operand:DF 1 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fneg %0,%1"
-  [(set_attr "type" "fp")])
-
-(define_expand "absdf2"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(abs:DF (match_operand:DF 1 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
-  "")
-
-(define_insn "*absdf2_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(abs:DF (match_operand:DF 1 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fabs %0,%1"
-  [(set_attr "type" "fp")])
-
-(define_insn "*nabsdf2_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(neg:DF (abs:DF (match_operand:DF 1 "gpc_reg_operand" "d"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fnabs %0,%1"
-  [(set_attr "type" "fp")])
-
-(define_expand "adddf3"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(plus:DF (match_operand:DF 1 "gpc_reg_operand" "")
-		 (match_operand:DF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
-  "")
-
-(define_insn "*adddf3_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(plus:DF (match_operand:DF 1 "gpc_reg_operand" "%d")
-		 (match_operand:DF 2 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fadd %0,%1,%2"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_addsub_d")])
-
-(define_expand "subdf3"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(minus:DF (match_operand:DF 1 "gpc_reg_operand" "")
-		  (match_operand:DF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
-  "")
-
-(define_insn "*subdf3_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(minus:DF (match_operand:DF 1 "gpc_reg_operand" "d")
-		  (match_operand:DF 2 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fsub %0,%1,%2"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_addsub_d")])
-
-(define_expand "muldf3"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(mult:DF (match_operand:DF 1 "gpc_reg_operand" "")
-		 (match_operand:DF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
-  "")
-
-(define_insn "*muldf3_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(mult:DF (match_operand:DF 1 "gpc_reg_operand" "%d")
-		 (match_operand:DF 2 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fmul %0,%1,%2"
-  [(set_attr "type" "dmul")
-   (set_attr "fp_type" "fp_mul_d")])
-
-(define_expand "divdf3"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(div:DF (match_operand:DF 1 "gpc_reg_operand" "")
-		(match_operand:DF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT
-   && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)
-   && !TARGET_SIMPLE_FPU"
-  "")
-
-(define_insn "*divdf3_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(div:DF (match_operand:DF 1 "gpc_reg_operand" "d")
-		(match_operand:DF 2 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && !TARGET_SIMPLE_FPU
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fdiv %0,%1,%2"
-  [(set_attr "type" "ddiv")])
-
-(define_insn "*fred_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
-	(unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRES))]
-  "TARGET_FRE && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fre %0,%1"
-  [(set_attr "type" "fp")])
-
-(define_insn "*rsqrtdf_internal1"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "d")]
-		   UNSPEC_RSQRT))]
-  "TARGET_FRSQRTE && !VECTOR_UNIT_VSX_P (DFmode)"
-  "frsqrte %0,%1"
-  [(set_attr "type" "fp")])
-
-; builtin fma support
-(define_insn "*fmadf4_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
-	(fma:DF (match_operand:DF 1 "gpc_reg_operand" "f")
-		(match_operand:DF 2 "gpc_reg_operand" "f")
-		(match_operand:DF 3 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && VECTOR_UNIT_NONE_P (DFmode)"
-  "fmadd %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
-(define_insn "*fmsdf4_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
-	(fma:DF (match_operand:DF 1 "gpc_reg_operand" "f")
-		(match_operand:DF 2 "gpc_reg_operand" "f")
-		(neg:DF (match_operand:DF 3 "gpc_reg_operand" "f"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && VECTOR_UNIT_NONE_P (DFmode)"
-  "fmsub %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
-(define_insn "*nfmadf4_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
-	(neg:DF (fma:DF (match_operand:DF 1 "gpc_reg_operand" "f")
-			(match_operand:DF 2 "gpc_reg_operand" "f")
-			(match_operand:DF 3 "gpc_reg_operand" "f"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && VECTOR_UNIT_NONE_P (DFmode)"
-  "fnmadd %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
-(define_insn "*nfmsdf4_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
-	(neg:DF (fma:DF (match_operand:DF 1 "gpc_reg_operand" "f")
-			(match_operand:DF 2 "gpc_reg_operand" "f")
-			(neg:DF (match_operand:DF 3 "gpc_reg_operand" "f")))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && VECTOR_UNIT_NONE_P (DFmode)"
-  "fnmsub %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
-(define_expand "sqrtdf2"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(sqrt:DF (match_operand:DF 1 "gpc_reg_operand" "")))]
-  "TARGET_PPC_GPOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
-  "")
-
-(define_insn "*sqrtdf2_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(sqrt:DF (match_operand:DF 1 "gpc_reg_operand" "d")))]
-  "TARGET_PPC_GPOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fsqrt %0,%1"
-  [(set_attr "type" "dsqrt")])
-
-;; The conditional move instructions allow us to perform max and min
-;; operations even when
-
-(define_expand "smaxdf3"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(if_then_else:DF (ge (match_operand:DF 1 "gpc_reg_operand" "")
-			     (match_operand:DF 2 "gpc_reg_operand" ""))
-			 (match_dup 1)
-			 (match_dup 2)))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT 
-   && !flag_trapping_math"
-  "{ rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]); DONE;}")
-
-(define_expand "smindf3"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(if_then_else:DF (ge (match_operand:DF 1 "gpc_reg_operand" "")
-			     (match_operand:DF 2 "gpc_reg_operand" ""))
-			 (match_dup 2)
-			 (match_dup 1)))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT 
-   && !flag_trapping_math"
-  "{ rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]); DONE;}")
-
-(define_split
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(match_operator:DF 3 "min_max_operator"
-	 [(match_operand:DF 1 "gpc_reg_operand" "")
-	  (match_operand:DF 2 "gpc_reg_operand" "")]))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT 
-   && !flag_trapping_math"
-  [(const_int 0)]
-  "
-{ rs6000_emit_minmax (operands[0], GET_CODE (operands[3]),
-		      operands[1], operands[2]);
-  DONE;
-}")
-
-(define_expand "movdfcc"
-   [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	 (if_then_else:DF (match_operand 1 "comparison_operator" "")
-			  (match_operand:DF 2 "gpc_reg_operand" "")
-			  (match_operand:DF 3 "gpc_reg_operand" "")))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
-  "
-{
-  if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
-    DONE;
-  else
-    FAIL;
-}")
-
-(define_insn "*fseldfdf4"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(if_then_else:DF (ge (match_operand:DF 1 "gpc_reg_operand" "d")
-			     (match_operand:DF 4 "zero_fp_constant" "F"))
-			 (match_operand:DF 2 "gpc_reg_operand" "d")
-			 (match_operand:DF 3 "gpc_reg_operand" "d")))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+(define_insn "*fsel<mode>df4"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>")
+	(if_then_else:SFDF (ge (match_operand:DF 1 "gpc_reg_operand" "d")
+			       (match_operand:DF 4 "zero_fp_constant" "F"))
+			 (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>")
+			 (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>")))]
+  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT
+   && TARGET_DOUBLE_FLOAT"
   "fsel %0,%1,%2,%3"
   [(set_attr "type" "fp")])
 
-(define_insn "*fselsfdf4"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(if_then_else:DF (ge (match_operand:SF 1 "gpc_reg_operand" "f")
-			     (match_operand:SF 4 "zero_fp_constant" "F"))
-			 (match_operand:DF 2 "gpc_reg_operand" "d")
-			 (match_operand:DF 3 "gpc_reg_operand" "d")))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_SINGLE_FLOAT"
-  "fsel %0,%1,%2,%3"
-  [(set_attr "type" "fp")])
 \f
 ;; Conversions to and from floating-point.
 
 ; We don't define lfiwax/lfiwzx with the normal definition, because we
 ; don't want to support putting SImode in FPR registers.
 (define_insn "lfiwax"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wm,!wm")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wu,wm")
 	(unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r")]
 		   UNSPEC_LFIWAX))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX"
@@ -5811,11 +5609,11 @@ (define_insn_and_split "floatsi<mode>2_l
    (set_attr "type" "fpload")])
 
 (define_insn_and_split "floatsi<mode>2_lfiwax_mem"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,<rreg2>")
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,<rreg2>,wu")
 	(float:SFDF
 	 (sign_extend:DI
-	  (match_operand:SI 1 "memory_operand" "Z,Z"))))
-   (clobber (match_scratch:DI 2 "=0,d"))]
+	  (match_operand:SI 1 "memory_operand" "Z,Z,Z"))))
+   (clobber (match_scratch:DI 2 "=0,d,wa"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX
    && <SI_CONVERT_FP>"
   "#"
@@ -5834,7 +5632,7 @@ (define_insn_and_split "floatsi<mode>2_l
    (set_attr "type" "fpload")])
 
 (define_insn "lfiwzx"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wm,!wm")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wu,wm")
 	(unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r")]
 		   UNSPEC_LFIWZX))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX"
@@ -5886,11 +5684,11 @@ (define_insn_and_split "floatunssi<mode>
    (set_attr "type" "fpload")])
 
 (define_insn_and_split "floatunssi<mode>2_lfiwzx_mem"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,<rreg2>")
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,<rreg2>,wu")
 	(unsigned_float:SFDF
 	 (zero_extend:DI
-	  (match_operand:SI 1 "memory_operand" "Z,Z"))))
-   (clobber (match_scratch:DI 2 "=0,d"))]
+	  (match_operand:SI 1 "memory_operand" "Z,Z,Z"))))
+   (clobber (match_scratch:DI 2 "=0,d,wa"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX
    && <SI_CONVERT_FP>"
   "#"
@@ -6384,56 +6182,38 @@ (define_insn "lrint<mode>di2"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=d")
 	(unspec:DI [(match_operand:SFDF 1 "gpc_reg_operand" "<rreg2>")]
 		   UNSPEC_FCTID))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>"
+  "TARGET_<MODE>_FPR && TARGET_FPRND"
   "fctid %0,%1"
   [(set_attr "type" "fp")])
 
-(define_expand "btrunc<mode>2"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "")]
-		     UNSPEC_FRIZ))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>"
-  "")
-
-(define_insn "*btrunc<mode>2_fpr"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<rreg2>")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<rreg2>")]
+(define_insn "btrunc<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")]
 		     UNSPEC_FRIZ))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>
-   && !VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "friz %0,%1"
+  "TARGET_<MODE>_FPR && TARGET_FPRND"
+  "@
+   friz %0,%1
+   xsrdpiz %x0,%x1"
   [(set_attr "type" "fp")])
 
-(define_expand "ceil<mode>2"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "")]
-		     UNSPEC_FRIP))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>"
-  "")
-
-(define_insn "*ceil<mode>2_fpr"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<rreg2>")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<rreg2>")]
+(define_insn "ceil<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")]
 		     UNSPEC_FRIP))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>
-   && !VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "frip %0,%1"
+  "TARGET_<MODE>_FPR && TARGET_FPRND"
+  "@
+   frip %0,%1
+   xsrdpip %x0,%x1"
   [(set_attr "type" "fp")])
 
-(define_expand "floor<mode>2"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "")]
-		     UNSPEC_FRIM))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>"
-  "")
-
-(define_insn "*floor<mode>2_fpr"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<rreg2>")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<rreg2>")]
+(define_insn "floor<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")]
 		     UNSPEC_FRIM))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>
-   && !VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "frim %0,%1"
+  "TARGET_<MODE>_FPR && TARGET_FPRND"
+  "@
+   frim %0,%1
+   xsrdpim %x0,%x1"
   [(set_attr "type" "fp")])
 
 ;; No VSX equivalent to frin
@@ -9288,8 +9068,8 @@ (define_split
 }")
 
 (define_insn "mov<mode>_hardfloat"
-  [(set (match_operand:FMOVE32 0 "nonimmediate_operand" "=!r,!r,m,f,wa,wa,<f32_lr>,<f32_sm>,wm,Z,?<f32_dm>,?r,*c*l,!r,*h,!r,!r")
-	(match_operand:FMOVE32 1 "input_operand" "r,m,r,f,wa,j,<f32_lm>,<f32_sr>,Z,wm,r,<f32_dm>,r,h,0,G,Fn"))]
+  [(set (match_operand:FMOVE32 0 "nonimmediate_operand" "=!r,!r,m,f,wa,wa,<f32_lr>,<f32_sm>,wu,Z,?<f32_dm>,?r,*c*l,!r,*h,!r,!r")
+	(match_operand:FMOVE32 1 "input_operand" "r,m,r,f,wa,j,<f32_lm>,<f32_sr>,Z,wu,r,<f32_dm>,r,h,0,G,Fn"))]
   "(gpc_reg_operand (operands[0], <MODE>mode)
    || gpc_reg_operand (operands[1], <MODE>mode))
    && (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT)"
@@ -9490,8 +9270,8 @@ (define_split
 ;; reloading.
 
 (define_insn "*mov<mode>_hardfloat32"
-  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,ws,?wa,Z,?Z,ws,?wa,wa,Y,r,!r,!r,!r,!r")
-	(match_operand:FMOVE64 1 "input_operand" "d,m,d,Z,Z,ws,wa,ws,wa,j,r,Y,r,G,H,F"))]
+  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,wv,Z,wa,wa,Y,r,!r,!r,!r,!r")
+	(match_operand:FMOVE64 1 "input_operand" "d,m,d,Z,wv,wa,j,r,Y,r,G,H,F"))]
   "! TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT 
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
@@ -9500,11 +9280,8 @@ (define_insn "*mov<mode>_hardfloat32"
    lfd%U1%X1 %0,%1
    fmr %0,%1
    lxsd%U1x %x0,%y1
-   lxsd%U1x %x0,%y1
-   stxsd%U0x %x1,%y0
    stxsd%U0x %x1,%y0
    xxlor %x0,%x1,%x1
-   xxlor %x0,%x1,%x1
    xxlxor %x0,%x0,%x0
    #
    #
@@ -9533,27 +9310,18 @@ (define_insn "*mov<mode>_hardfloat32"
 	 (const_string "fpload_ux")
 	 (const_string "fpload"))
        (if_then_else
-	 (match_test "update_indexed_address_mem (operands[1], VOIDmode)")
-	 (const_string "fpload_ux")
-	 (const_string "fpload"))
-       (if_then_else
-	 (match_test "update_indexed_address_mem (operands[0], VOIDmode)")
-	 (const_string "fpstore_ux")
-	 (const_string "fpstore"))
-       (if_then_else
 	 (match_test "update_indexed_address_mem (operands[0], VOIDmode)")
 	 (const_string "fpstore_ux")
 	 (const_string "fpstore"))
        (const_string "vecsimple")
        (const_string "vecsimple")
-       (const_string "vecsimple")
        (const_string "store")
        (const_string "load")
        (const_string "two")
        (const_string "fp")
        (const_string "fp")
        (const_string "*")])
-   (set_attr "length" "4,4,4,4,4,4,4,4,4,4,8,8,8,8,12,16")])
+   (set_attr "length" "4,4,4,4,4,4,4,8,8,8,8,12,16")])
 
 (define_insn "*mov<mode>_softfloat32"
   [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=Y,r,r,r,r,r")
@@ -9570,8 +9338,8 @@ (define_insn "*mov<mode>_softfloat32"
 ; ld/std require word-aligned displacements -> 'Y' constraint.
 ; List Y->r and r->Y before r->r for reload.
 (define_insn "*mov<mode>_hardfloat64"
-  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,ws,?wa,Z,?Z,ws,?wa,wa,Y,r,!r,*c*l,!r,*h,!r,!r,!r,r,wg,r,wm")
-	(match_operand:FMOVE64 1 "input_operand" "d,m,d,Z,Z,ws,wa,ws,wa,j,r,Y,r,r,h,0,G,H,F,wg,r,wm,r"))]
+  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,wv,Z,wa,wa,Y,r,!r,*c*l,!r,*h,!r,!r,!r,r,wg,r,wm")
+	(match_operand:FMOVE64 1 "input_operand" "d,m,d,Z,wv,wa,j,r,Y,r,r,h,0,G,H,F,wg,r,wm,r"))]
   "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
@@ -9580,10 +9348,7 @@ (define_insn "*mov<mode>_hardfloat64"
    lfd%U1%X1 %0,%1
    fmr %0,%1
    lxsd%U1x %x0,%y1
-   lxsd%U1x %x0,%y1
    stxsd%U0x %x1,%y0
-   stxsd%U0x %x1,%y0
-   xxlor %x0,%x1,%x1
    xxlor %x0,%x1,%x1
    xxlxor %x0,%x0,%x0
    std%U0%X0 %1,%0
@@ -9620,18 +9385,9 @@ (define_insn "*mov<mode>_hardfloat64"
 	 (const_string "fpload_ux")
 	 (const_string "fpload"))
        (if_then_else
-	 (match_test "update_indexed_address_mem (operands[1], VOIDmode)")
-	 (const_string "fpload_ux")
-	 (const_string "fpload"))
-       (if_then_else
 	 (match_test "update_indexed_address_mem (operands[0], VOIDmode)")
 	 (const_string "fpstore_ux")
 	 (const_string "fpstore"))
-       (if_then_else
-	 (match_test "update_indexed_address_mem (operands[0], VOIDmode)")
-	 (const_string "fpstore_ux")
-	 (const_string "fpstore"))
-       (const_string "vecsimple")
        (const_string "vecsimple")
        (const_string "vecsimple")
        (if_then_else
@@ -9659,7 +9415,7 @@ (define_insn "*mov<mode>_hardfloat64"
        (const_string "mffgpr")
        (const_string "mftgpr")
        (const_string "mffgpr")])
-   (set_attr "length" "4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,8,12,16,4,4,4,4")])
+   (set_attr "length" "4,4,4,4,4,4,4,4,4,4,4,4,4,8,12,16,4,4,4,4")])
 
 (define_insn "*mov<mode>_softfloat64"
   [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=Y,r,r,cl,r,r,r,r,*h")
@@ -10322,8 +10078,8 @@ (define_split
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
 
 (define_insn "*movdi_internal64"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=Y,r,r,r,r,r,?m,?*d,?*d,?Z,?wa,?wa,r,*h,*h,?wa,r,?*wg,r,?*wm")
-	(match_operand:DI 1 "input_operand" "r,Y,r,I,L,nF,d,m,d,wa,Z,wa,*h,r,0,O,*wg,r,*wm,r"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=Y,r,r,r,r,r,?m,?*d,?*d,?Z,?wv,?wa,r,*h,*h,?wa,r,?*wg,r,?*wm")
+	(match_operand:DI 1 "input_operand" "r,Y,r,I,L,nF,d,m,d,wv,Z,wa,*h,r,0,O,*wg,r,*wm,r"))]
   "TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], DImode)
        || gpc_reg_operand (operands[1], DImode))"
@@ -13339,23 +13095,6 @@ (define_split
   [(set (match_dup 3) (compare:CCUNS (match_dup 1) (match_dup 2)))
    (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 4)))])
 
-(define_insn "*cmpsf_internal1"
-  [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
-	(compare:CCFP (match_operand:SF 1 "gpc_reg_operand" "f")
-		      (match_operand:SF 2 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fcmpu %0,%1,%2"
-  [(set_attr "type" "fpcompare")])
-
-(define_insn "*cmpdf_internal1"
-  [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
-	(compare:CCFP (match_operand:DF 1 "gpc_reg_operand" "d")
-		      (match_operand:DF 2 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fcmpu %0,%1,%2"
-  [(set_attr "type" "fpcompare")])
-
 ;; Only need to compare second words if first words equal
 (define_insn "*cmptf_internal1"
   [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
@@ -15638,8 +15377,9 @@ (define_insn "bpermd_<mode>"
   [(set_attr "type" "popcnt")])
 
 \f
-;; Builtin fma support.  Handle 
-;; Note that the conditions for expansion are in the FMA_F iterator.
+;; Builtin fma support.  Handle VSX instructions needing to overlap the
+;; desination with one of the two inputs.  Note that the conditions for
+;; expansion are in the FMA_F iterator.
 
 (define_expand "fma<mode>4"
   [(set (match_operand:FMA_F 0 "register_operand" "")
@@ -15650,6 +15390,20 @@ (define_expand "fma<mode>4"
   ""
   "")
 
+(define_insn "*fma<mode>4_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>,<Fv>")
+	(fma:SFDF
+	  (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,<Fv>,<Fv>")
+	  (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>,0")
+	  (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>,0,<Fv>")))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fmadd<Ftrad> %0,%1,%2,%3
+   xsmadda<Fvsx> %x0,%x1,%x2
+   xsmaddm<Fvsx> %x0,%x1,%x3"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_maddsub_<Fs>")])
+
 ; Altivec only has fma and nfms.
 (define_expand "fms<mode>4"
   [(set (match_operand:FMA_F 0 "register_operand" "")
@@ -15660,6 +15414,20 @@ (define_expand "fms<mode>4"
   "!VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
   "")
 
+(define_insn "*fms<mode>4_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>,<Fv>")
+	(fma:SFDF
+	 (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>,<Fv>")
+	 (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>,0")
+	 (neg:SFDF (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>,0,<Fv>"))))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fmsub<Ftrad> %0,%1,%2,%3
+   xsmsuba<Fvsx> %x0,%x1,%x2
+   xsmsubm<Fvsx> %x0,%x1,%x3"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_maddsub_<Fs>")])
+
 ;; If signed zeros are ignored, -(a * b - c) = -a * b + c.
 (define_expand "fnma<mode>4"
   [(set (match_operand:FMA_F 0 "register_operand" "")
@@ -15693,6 +15461,21 @@ (define_expand "nfma<mode>4"
   "!VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
   "")
 
+(define_insn "*nfma<mode>4_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>,<Fv>")
+	(neg:SFDF
+	 (fma:SFDF
+	  (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>,<Fv>")
+	  (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>,0")
+	  (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>,0,<Fv>"))))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fnmadd<Ftrad> %0,%1,%2,%3
+   xsnmadda<Fvsx> %x0,%x1,%x2
+   xsnmaddm<Fvsx> %x0,%x1,%x3"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_maddsub_<Fs>")])
+
 ; Not an official optab name, but used from builtins.
 (define_expand "nfms<mode>4"
   [(set (match_operand:FMA_F 0 "register_operand" "")
@@ -15704,6 +15487,24 @@ (define_expand "nfms<mode>4"
   ""
   "")
 
+(define_insn "*nfmssf4_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>,<Fv>")
+	(neg:SFDF
+	 (fma:SFDF
+	  (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>,<Fv>")
+	  (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>,0")
+	  (neg:SFDF
+	   (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>,0,<Fv>")))))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fnmsub<Ftrad> %0,%1,%2,%3
+   xsnmsuba<Fvsx> %x0,%x1,%x2
+   xsnmsubm<Fvsx> %x0,%x1,%x3"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_maddsub_<Fs>")])
+
+\f
+;; Timebase builtins
 (define_expand "rs6000_get_timebase"
   [(use (match_operand:DI 0 "gpc_reg_operand" ""))]
   ""
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 201924)
+++ gcc/doc/md.texi	(working copy)
@@ -2067,40 +2067,52 @@ Floating point register (containing 32-b
 Altivec vector register
 
 @item wa
-Any VSX register
+Any VSX register if the -mvsx option was used or NO_REGS.
 
 @item wd
-VSX vector register to hold vector double data
+VSX vector register to hold vector double data or NO_REGS.
 
 @item wf
-VSX vector register to hold vector float data
+VSX vector register to hold vector float data or NO_REGS.
 
 @item wg
-If @option{-mmfpgpr} was used, a floating point register
+If @option{-mmfpgpr} was used, a floating point register or NO_REGS.
 
 @item wl
-If the LFIWAX instruction is enabled, a floating point register
+Floating point register if the LFIWAX instruction is enabled or NO_REGS.
 
 @item wm
-If direct moves are enabled, a VSX register.
+VSX register if direct move instructions are enabled, or NO_REGS.
 
 @item wn
-No register.
+No register (NO_REGS).
 
 @item wr
-General purpose register if 64-bit mode is used
+General purpose register if 64-bit instructions are enabled or NO_REGS.
 
 @item ws
-VSX vector register to hold scalar float data
+VSX vector register to hold scalar double values or NO_REGS.
 
 @item wt
-VSX vector register to hold 128 bit integer
+VSX vector register to hold 128 bit integer or NO_REGS.
+
+@item wu
+Altivec register to use for float/32-bit int loads/stores  or NO_REGS.
+
+@item wv
+Altivec register to use for double loads/stores  or NO_REGS.
+
+@item ww
+FP or VSX register to perform float operations under @option{-mvsx} or NO_REGS.
 
 @item wx
-If the STFIWX instruction is enabled, a floating point register
+Floating point register if the STFIWX instruction is enabled or NO_REGS.
+
+@item wy
+VSX vector register to hold scalar float values or NO_REGS.
 
 @item wz
-If the LFIWZX instruction is enabled, a floating point register
+Floating point register if the LFIWZX instruction is enabled or NO_REGS.
 
 @item wQ
 A memory address that will work with the @code{lq} and @code{stq}
Index: gcc/testsuite/gcc.target/powerpc/ppc-target-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ppc-target-2.c	(revision 201924)
+++ gcc/testsuite/gcc.target/powerpc/ppc-target-2.c	(working copy)
@@ -5,8 +5,7 @@
 /* { dg-final { scan-assembler-times "fabs" 3 } } */
 /* { dg-final { scan-assembler-times "fnabs" 3 } } */
 /* { dg-final { scan-assembler-times "fsel" 3 } } */
-/* { dg-final { scan-assembler-times "fcpsgn" 3 } } */
-/* { dg-final { scan-assembler-times "xscpsgndp" 1 } } */
+/* { dg-final { scan-assembler-times "fcpsgn\|xscpsgndp" 4 } } */
 
 /* fabs/fnabs/fsel */
 double normal1 (double a, double b) { return __builtin_copysign (a, b); }
Index: gcc/testsuite/gcc.target/powerpc/recip-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/recip-3.c	(revision 201924)
+++ gcc/testsuite/gcc.target/powerpc/recip-3.c	(working copy)
@@ -1,14 +1,14 @@
 /* { dg-do compile { target { { powerpc*-*-* } && { ! powerpc*-apple-darwin* } } } } */
 /* { dg-require-effective-target powerpc_fprs } */
 /* { dg-options "-O2 -mrecip -ffast-math -mcpu=power7" } */
-/* { dg-final { scan-assembler-times "xsrsqrtedp" 1 } } */
+/* { dg-final { scan-assembler-times "xsrsqrtedp\|frsqrte\ " 1 } } */
 /* { dg-final { scan-assembler-times "xsmsub.dp\|fmsub\ " 1 } } */
-/* { dg-final { scan-assembler-times "xsmuldp" 4 } } */
+/* { dg-final { scan-assembler-times "xsmuldp\|fmul\ " 4 } } */
 /* { dg-final { scan-assembler-times "xsnmsub.dp\|fnmsub\ " 2 } } */
-/* { dg-final { scan-assembler-times "frsqrtes" 1 } } */
-/* { dg-final { scan-assembler-times "fmsubs" 1 } } */
-/* { dg-final { scan-assembler-times "fmuls" 2 } } */
-/* { dg-final { scan-assembler-times "fnmsubs" 1 } } */
+/* { dg-final { scan-assembler-times "xsrsqrtesp\|frsqrtes" 1 } } */
+/* { dg-final { scan-assembler-times "xsmsub.sp\|fmsubs" 1 } } */
+/* { dg-final { scan-assembler-times "xsmulsp\|fmuls" 2 } } */
+/* { dg-final { scan-assembler-times "xsnmsub.sp\|fnmsubs" 1 } } */
 
 double
 rsqrt_d (double a)
Index: gcc/testsuite/gcc.target/powerpc/pr42747.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr42747.c	(revision 201924)
+++ gcc/testsuite/gcc.target/powerpc/pr42747.c	(working copy)
@@ -5,4 +5,4 @@
 
 double foo (double x) { return __builtin_sqrt (x); }
 
-/* { dg-final { scan-assembler "xssqrtdp" } } */
+/* { dg-final { scan-assembler "xssqrtdp\|fsqrt" } } */
Index: gcc/testsuite/gcc.target/powerpc/recip-5.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/recip-5.c	(revision 201924)
+++ gcc/testsuite/gcc.target/powerpc/recip-5.c	(working copy)
@@ -4,12 +4,12 @@
 /* { dg-options "-O3 -ftree-vectorize -mrecip=all -ffast-math -mcpu=power7 -fno-unroll-loops" } */
 /* { dg-final { scan-assembler-times "xvredp" 4 } } */
 /* { dg-final { scan-assembler-times "xvresp" 5 } } */
-/* { dg-final { scan-assembler-times "xsredp" 2 } } */
-/* { dg-final { scan-assembler-times "fres" 2 } } */
-/* { dg-final { scan-assembler-times "fmuls" 2 } } */
-/* { dg-final { scan-assembler-times "fnmsubs" 2 } } */
-/* { dg-final { scan-assembler-times "xsmuldp" 2 } } */
-/* { dg-final { scan-assembler-times "xsnmsub.dp" 4 } } */
+/* { dg-final { scan-assembler-times "xsredp\|fre\ " 2 } } */
+/* { dg-final { scan-assembler-times "fres\|xsresp" 2 } } */
+/* { dg-final { scan-assembler-times "fmuls\|xsmulsp" 2 } } */
+/* { dg-final { scan-assembler-times "fnmsubs\|xsnmsub.sp" 2 } } */
+/* { dg-final { scan-assembler-times "xsmuldp\|fmul\ " 2 } } */
+/* { dg-final { scan-assembler-times "xsnmsub.dp\|fnmsub\ " 4 } } */
 /* { dg-final { scan-assembler-times "xvmulsp" 7 } } */
 /* { dg-final { scan-assembler-times "xvnmsub.sp" 5 } } */
 /* { dg-final { scan-assembler-times "xvmuldp" 6 } } */
Index: gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c	(revision 201924)
+++ gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c	(working copy)
@@ -16,9 +16,9 @@
 /* { dg-final { scan-assembler "xvrspiz" } } */
 /* { dg-final { scan-assembler "xsrdpi" } } */
 /* { dg-final { scan-assembler "xsrdpic" } } */
-/* { dg-final { scan-assembler "xsrdpim" } } */
-/* { dg-final { scan-assembler "xsrdpip" } } */
-/* { dg-final { scan-assembler "xsrdpiz" } } */
+/* { dg-final { scan-assembler "xsrdpim\|frim" } } */
+/* { dg-final { scan-assembler "xsrdpip\|frip" } } */
+/* { dg-final { scan-assembler "xsrdpiz\|friz" } } */
 /* { dg-final { scan-assembler "xsmaxdp" } } */
 /* { dg-final { scan-assembler "xsmindp" } } */
 /* { dg-final { scan-assembler "xxland" } } */
Index: gcc/testsuite/gcc.target/powerpc/ppc-target-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ppc-target-1.c	(revision 201924)
+++ gcc/testsuite/gcc.target/powerpc/ppc-target-1.c	(working copy)
@@ -5,8 +5,7 @@
 /* { dg-final { scan-assembler-times "fabs" 3 } } */
 /* { dg-final { scan-assembler-times "fnabs" 3 } } */
 /* { dg-final { scan-assembler-times "fsel" 3 } } */
-/* { dg-final { scan-assembler-times "fcpsgn" 3 } } */
-/* { dg-final { scan-assembler-times "xscpsgndp" 1 } } */
+/* { dg-final { scan-assembler-times "fcpsgn\|xscpsgndp" 4 } } */
 
 double normal1 (double, double);
 double power5  (double, double) __attribute__((__target__("cpu=power5")));

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework VSX scalar floating point support
  2013-08-22 19:02 [PATCH, powerpc] Rework VSX scalar floating point support Michael Meissner
@ 2013-09-06  3:39 ` David Edelsohn
  2013-09-21  2:09   ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #1 Michael Meissner
  2013-09-23 20:37 ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #2 Michael Meissner
  1 sibling, 1 reply; 20+ messages in thread
From: David Edelsohn @ 2013-09-06  3:39 UTC (permalink / raw)
  To: Michael Meissner, GCC Patches, David Edelsohn

Mike,

Why does rs6000_init_hard_regno_mode_ok() include #if 0 code?

Because the patch affects e500, has this patch been tested in an e500
configuration?

The cover message did not make it clear that you are merging SFmode
and DFmode patterns using mode iterators. That really should have been
a separate patch. We'll proceed with the current form.  But, would you
please clarify the ChangeLog entry for rs6000.md to make it clear
exactly which patterns are being deleted instead of listing everything
as "Likewise"?

Thanks, David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #1
  2013-09-06  3:39 ` David Edelsohn
@ 2013-09-21  2:09   ` Michael Meissner
  2013-09-23 17:38     ` David Edelsohn
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Meissner @ 2013-09-21  2:09 UTC (permalink / raw)
  To: David Edelsohn, GCC Patches, Michael Meissner

[-- Attachment #1: Type: text/plain, Size: 2996 bytes --]

I am now breaking the patches down to be more bite size.  Ultimately, I hope
these patches will provide support to allow scalar floating point to occupy the
Altivec (upper) registers if the ISA allows it (ISA 2.06 for DFmode, ISA 2.07
for SFmode).  One effect of later patches will be to go back to using the
traditional DFmode instructions for VSX if all of the registers come from the
traditional floating point register set.

This patch adds the new switches, and constraints that will be used in future
patches.  It produces exactly the same code on the targets I tested on and
passes the bootstrap/make check stages.  Is it ok to apply this patch?

I have tested the code generation for the following targets:

	power4, power5, power5+, power6, power6x, power7, power8 for 64/32-bit
	power7 with VSX disabled using altivec instructions for 64/32-bit
	power7 with both VSX and altivec disabled for 64/32-bit
	cell 64/32-bit
	e5500, e6500 64/32-bit
	G4 32-bit
	G5 64/32-bit
	linuxpaired 32-bit
	linuxspe 32-bit

2013-09-20  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000.h (enum r6000_reg_class_enum): Add new
	constraints: wu, ww, and wy.  Repurpose wv constraint added during
	power8 changes.  Put wg constraint in alphabetical order.

	* config/rs6000/rs6000.opt (-mvsx-scalar-float): New debug switch
	for future work to add ISA 2.07 VSX single precision support.
	(-mvsx-scalar-double): Change default from -1 to 1, update
	documentation comment.
	(-mvsx-scalar-memory): Rename debug switch to -mupper-regs-df.
	(-mupper-regs-df): New debug switch to control whether DF values
	can go in the traditional Altivec registers.
	(-mupper-regs-sf): New debug switch to control whether SF values
	can go in the traditional Altivec registers.

	* config/rs6000/rs6000.c (rs6000_debug_reg_global): Print wu, ww,
	and wy constraints.
	(rs6000_init_hard_regno_mode_ok): Use ssize_t instead of int for
	loop variables.  Rename -mvsx-scalar-memory to -mupper-regs-df.
	Add new constraints, wu/ww/wy.  Repurpose wv constraint.
	(rs6000_debug_legitimate_address_p): Print if we are running
	before, during, or after reload.
	(rs6000_secondary_reload): Add a comment.
	(rs6000_opt_masks): Add -mupper-regs-df, -mupper-regs-sf.

	* config/rs6000/constraints.md (wa constraint): Sort w<x>
	constraints.  Update documentation string.
	(wd constraint): Likewise.
	(wf constraint): Likewise.
	(wg constraint): Likewise.
	(wn constraint): Likewise.
	(ws constraint): Likewise.
	(wt constraint): Likewise.
	(wx constraint): Likewise.
	(wz constraint): Likewise.
	(wu constraint): New constraint for ISA 2.07 SFmode scalar
	instructions.
	(ww constraint): Likewise.
	(wy constraint): Likewise.
	(wv constraint): Repurpose ISA 2.07 constraint that we not used in
	the previous submissions.
	* doc/md.texi (PowerPC and IBM RS6000): Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[-- Attachment #2: gcc-power8.patch056b --]
[-- Type: text/plain, Size: 16674 bytes --]

Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 202793)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -1403,15 +1403,18 @@ enum r6000_reg_class_enum {
   RS6000_CONSTRAINT_v,		/* Altivec registers */
   RS6000_CONSTRAINT_wa,		/* Any VSX register */
   RS6000_CONSTRAINT_wd,		/* VSX register for V2DF */
-  RS6000_CONSTRAINT_wg,		/* FPR register for -mmfpgpr */
   RS6000_CONSTRAINT_wf,		/* VSX register for V4SF */
+  RS6000_CONSTRAINT_wg,		/* FPR register for -mmfpgpr */
   RS6000_CONSTRAINT_wl,		/* FPR register for LFIWAX */
   RS6000_CONSTRAINT_wm,		/* VSX register for direct move */
   RS6000_CONSTRAINT_wr,		/* GPR register if 64-bit  */
   RS6000_CONSTRAINT_ws,		/* VSX register for DF */
   RS6000_CONSTRAINT_wt,		/* VSX register for TImode */
-  RS6000_CONSTRAINT_wv,		/* Altivec register for power8 vector */
+  RS6000_CONSTRAINT_wu,		/* Altivec register for float load/stores.  */
+  RS6000_CONSTRAINT_wv,		/* Altivec register for double load/stores.  */
+  RS6000_CONSTRAINT_ww,		/* FP or VSX register for vsx float ops.  */
   RS6000_CONSTRAINT_wx,		/* FPR register for STFIWX */
+  RS6000_CONSTRAINT_wy,		/* VSX register for SF */
   RS6000_CONSTRAINT_wz,		/* FPR register for LFIWZX */
   RS6000_CONSTRAINT_MAX
 };
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(revision 202793)
+++ gcc/config/rs6000/rs6000.opt	(working copy)
@@ -181,13 +181,16 @@ mvsx
 Target Report Mask(VSX) Var(rs6000_isa_flags)
 Use vector/scalar (VSX) instructions
 
+mvsx-scalar-float
+Target Undocumented Report Var(TARGET_VSX_SCALAR_FLOAT) Init(1)
+; If -mpower8-vector, use VSX arithmetic instructions for SFmode (on by default)
+
 mvsx-scalar-double
-Target Undocumented Report Var(TARGET_VSX_SCALAR_DOUBLE) Init(-1)
-; If -mvsx, use VSX arithmetic instructions for scalar double (on by default)
+Target Undocumented Report Var(TARGET_VSX_SCALAR_DOUBLE) Init(1)
+; If -mvsx, use VSX arithmetic instructions for DFmode (on by default)
 
 mvsx-scalar-memory
-Target Undocumented Report Var(TARGET_VSX_SCALAR_MEMORY)
-; If -mvsx, use VSX scalar memory reference instructions for scalar double (off by default)
+Target Undocumented Report Alias(mupper-regs-df)
 
 mvsx-align-128
 Target Undocumented Report Var(TARGET_VSX_ALIGN_128)
@@ -550,3 +553,11 @@ Generate the quad word memory instructio
 mcompat-align-parm
 Target Report Var(rs6000_compat_align_parm) Init(0) Save
 Generate aggregate parameter passing code with at most 64-bit alignment.
+
+mupper-regs-df
+Target Undocumented Mask(UPPER_REGS_DF) Var(rs6000_isa_flags)
+Allow double variables in upper registers with -mcpu=power7 or -mvsx
+
+mupper-regs-sf
+Target Undocumented Mask(UPPER_REGS_SF) Var(rs6000_isa_flags)
+Allow float variables in upper registers with -mcpu=power8 or -mp8-vector
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 202793)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -1891,8 +1891,11 @@ rs6000_debug_reg_global (void)
 	   "wr reg_class = %s\n"
 	   "ws reg_class = %s\n"
 	   "wt reg_class = %s\n"
+	   "wu reg_class = %s\n"
 	   "wv reg_class = %s\n"
+	   "ww reg_class = %s\n"
 	   "wx reg_class = %s\n"
+	   "wy reg_class = %s\n"
 	   "wz reg_class = %s\n"
 	   "\n",
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_d]],
@@ -1907,8 +1910,11 @@ rs6000_debug_reg_global (void)
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wr]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ws]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wt]],
+	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wu]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wv]],
+	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ww]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wx]],
+	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wy]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wz]]);
 
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
@@ -2168,7 +2174,7 @@ rs6000_debug_reg_global (void)
 static void
 rs6000_init_hard_regno_mode_ok (bool global_init_p)
 {
-  int r, m, c;
+  ssize_t r, m, c;
   int align64;
   int align32;
 
@@ -2320,7 +2326,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
     {
       rs6000_vector_unit[DFmode] = VECTOR_VSX;
       rs6000_vector_mem[DFmode]
-	= (TARGET_VSX_SCALAR_MEMORY ? VECTOR_VSX : VECTOR_NONE);
+	= (TARGET_UPPER_REGS_DF ? VECTOR_VSX : VECTOR_NONE);
       rs6000_vector_align[DFmode] = align64;
     }
 
@@ -2334,7 +2340,34 @@ rs6000_init_hard_regno_mode_ok (bool glo
   /* TODO add SPE and paired floating point vector support.  */
 
   /* Register class constraints for the constraints that depend on compile
-     switches.  */
+     switches. When the VSX code was added, different constraints were added
+     based on the type (DFmode, V2DFmode, V4SFmode).  For the vector types, all
+     of the VSX registers are used.  The register classes for scalar floating
+     point types is set, based on whether we allow that type into the upper
+     (Altivec) registers.  GCC has register classes to target the Altivec
+     registers for load/store operations, to select using a VSX memory
+     operation instead of the traditional floating point operation.  The
+     constraints are:
+
+	d  - Register class to use with traditional DFmode instructions.
+	f  - Register class to use with traditional SFmode instructions.
+	v  - Altivec register.
+	wa - Any VSX register.
+	wd - Preferred register class for V2DFmode.
+	wf - Preferred register class for V4SFmode.
+	wg - Float register for power6x move insns.
+	wl - Float register if we can do 32-bit signed int loads.
+	wm - VSX register for ISA 2.07 direct move operations.
+	wr - GPR if 64-bit mode is permitted.
+	ws - Register class to do ISA 2.06 DF operations.
+	wu - Altivec register for ISA 2.07 VSX SF/SI load/stores.
+	wv - Altivec register for ISA 2.06 VSX DF/DI load/stores.
+	wt - VSX register for TImode in VSX registers.
+	ww - Register class to do SF conversions in with VSX operations.
+	wx - Float register if we can do 32-bit int stores.
+	wy - Register class to do ISA 2.07 SF operations.
+	wz - Float register if we can do 32-bit unsigned int loads.  */
+
   if (TARGET_HARD_FLOAT && TARGET_FPRS)
     rs6000_constraints[RS6000_CONSTRAINT_f] = FLOAT_REGS;
 
@@ -2343,19 +2376,16 @@ rs6000_init_hard_regno_mode_ok (bool glo
 
   if (TARGET_VSX)
     {
-      /* At present, we just use VSX_REGS, but we have different constraints
-	 based on the use, in case we want to fine tune the default register
-	 class used.  wa = any VSX register, wf = register class to use for
-	 V4SF, wd = register class to use for V2DF, and ws = register classs to
-	 use for DF scalars.  */
       rs6000_constraints[RS6000_CONSTRAINT_wa] = VSX_REGS;
-      rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS;
       rs6000_constraints[RS6000_CONSTRAINT_wd] = VSX_REGS;
-      rs6000_constraints[RS6000_CONSTRAINT_ws] = (TARGET_VSX_SCALAR_MEMORY
-						  ? VSX_REGS
-						  : FLOAT_REGS);
+      rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS;
+      rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
+
       if (TARGET_VSX_TIMODE)
 	rs6000_constraints[RS6000_CONSTRAINT_wt] = VSX_REGS;
+
+      rs6000_constraints[RS6000_CONSTRAINT_ws]
+	= (TARGET_UPPER_REGS_DF) ? VSX_REGS : FLOAT_REGS;
     }
 
   /* Add conditional constraints based on various options, to allow us to
@@ -2376,7 +2406,14 @@ rs6000_init_hard_regno_mode_ok (bool glo
     rs6000_constraints[RS6000_CONSTRAINT_wr] = GENERAL_REGS;
 
   if (TARGET_P8_VECTOR)
-    rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
+    {
+      rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
+      rs6000_constraints[RS6000_CONSTRAINT_wy]
+	= rs6000_constraints[RS6000_CONSTRAINT_ww]
+	= (TARGET_UPPER_REGS_SF) ? VSX_REGS : FLOAT_REGS;
+    }
+  else if (TARGET_VSX)
+    rs6000_constraints[RS6000_CONSTRAINT_ww] = FLOAT_REGS;
 
   if (TARGET_STFIWX)
     rs6000_constraints[RS6000_CONSTRAINT_wx] = FLOAT_REGS;
@@ -2409,7 +2446,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
 	  rs6000_vector_reload[V4SFmode][1]  = CODE_FOR_reload_v4sf_di_load;
 	  rs6000_vector_reload[V2DFmode][0]  = CODE_FOR_reload_v2df_di_store;
 	  rs6000_vector_reload[V2DFmode][1]  = CODE_FOR_reload_v2df_di_load;
-	  if (TARGET_VSX && TARGET_VSX_SCALAR_MEMORY)
+	  if (TARGET_VSX && TARGET_UPPER_REGS_DF)
 	    {
 	      rs6000_vector_reload[DFmode][0]  = CODE_FOR_reload_df_di_store;
 	      rs6000_vector_reload[DFmode][1]  = CODE_FOR_reload_df_di_load;
@@ -2472,7 +2509,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
 	  rs6000_vector_reload[V4SFmode][1]  = CODE_FOR_reload_v4sf_si_load;
 	  rs6000_vector_reload[V2DFmode][0]  = CODE_FOR_reload_v2df_si_store;
 	  rs6000_vector_reload[V2DFmode][1]  = CODE_FOR_reload_v2df_si_load;
-	  if (TARGET_VSX && TARGET_VSX_SCALAR_MEMORY)
+	  if (TARGET_VSX && TARGET_UPPER_REGS_DF)
 	    {
 	      rs6000_vector_reload[DFmode][0]  = CODE_FOR_reload_df_si_store;
 	      rs6000_vector_reload[DFmode][1]  = CODE_FOR_reload_df_si_load;
@@ -7195,10 +7232,13 @@ rs6000_debug_legitimate_address_p (enum 
   bool ret = rs6000_legitimate_address_p (mode, x, reg_ok_strict);
   fprintf (stderr,
 	   "\nrs6000_legitimate_address_p: return = %s, mode = %s, "
-	   "strict = %d, code = %s\n",
+	   "strict = %d, reload = %s, code = %s\n",
 	   ret ? "true" : "false",
 	   GET_MODE_NAME (mode),
 	   reg_ok_strict,
+	   (reload_completed
+	    ? "after"
+	    : (reload_in_progress ? "progress" : "before")),
 	   GET_RTX_NAME (GET_CODE (x)));
   debug_rtx (x);
 
@@ -14865,6 +14905,7 @@ rs6000_secondary_reload (bool in_p,
 	  from_type = exchange;
 	}
 
+      /* Can we do a direct move of some sort?  */
       if (rs6000_secondary_reload_move (to_type, from_type, mode, sri,
 					altivec_p))
 	{
@@ -29162,6 +29203,8 @@ static struct rs6000_opt_mask const rs60
   { "recip-precision",		OPTION_MASK_RECIP_PRECISION,	false, true  },
   { "string",			OPTION_MASK_STRING,		false, true  },
   { "update",			OPTION_MASK_NO_UPDATE,		true , true  },
+  { "upper-regs-df",		OPTION_MASK_UPPER_REGS_DF,	false, false },
+  { "upper-regs-sf",		OPTION_MASK_UPPER_REGS_SF,	false, false },
   { "vsx",			OPTION_MASK_VSX,		false, true  },
   { "vsx-timode",		OPTION_MASK_VSX_TIMODE,		false, true  },
 #ifdef OPTION_MASK_64BIT
Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 202793)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -52,29 +52,18 @@ (define_register_constraint "z" "CA_REGS
   "@internal")
 
 ;; Use w as a prefix to add VSX modes
-;; vector double (V2DF)
+;; any VSX register
+(define_register_constraint "wa" "rs6000_constraints[RS6000_CONSTRAINT_wa]"
+  "Any VSX register if the -mvsx option was used or NO_REGS.")
+
 (define_register_constraint "wd" "rs6000_constraints[RS6000_CONSTRAINT_wd]"
-  "@internal")
+  "VSX vector register to hold vector double data or NO_REGS.")
 
-;; vector float (V4SF)
 (define_register_constraint "wf" "rs6000_constraints[RS6000_CONSTRAINT_wf]"
-  "@internal")
-
-;; scalar double (DF)
-(define_register_constraint "ws" "rs6000_constraints[RS6000_CONSTRAINT_ws]"
-  "@internal")
-
-;; TImode in VSX registers
-(define_register_constraint "wt" "rs6000_constraints[RS6000_CONSTRAINT_wt]"
-  "@internal")
-
-;; any VSX register
-(define_register_constraint "wa" "rs6000_constraints[RS6000_CONSTRAINT_wa]"
-  "@internal")
+  "VSX vector register to hold vector float data or NO_REGS.")
 
-;; Register constraints to simplify move patterns
 (define_register_constraint "wg" "rs6000_constraints[RS6000_CONSTRAINT_wg]"
-  "Floating point register if -mmfpgpr is used, or NO_REGS.")
+  "If -mmfpgpr was used, a floating point register or NO_REGS.")
 
 (define_register_constraint "wl" "rs6000_constraints[RS6000_CONSTRAINT_wl]"
   "Floating point register if the LFIWAX instruction is enabled or NO_REGS.")
@@ -82,23 +71,38 @@ (define_register_constraint "wl" "rs6000
 (define_register_constraint "wm" "rs6000_constraints[RS6000_CONSTRAINT_wm]"
   "VSX register if direct move instructions are enabled, or NO_REGS.")
 
+;; NO_REGs register constraint, used to merge mov{sd,sf}, since movsd can use
+;; direct move directly, and movsf can't to move between the register sets.
+;; There is a mode_attr that resolves to wm for SDmode and wn for SFmode
+(define_register_constraint "wn" "NO_REGS" "No register (NO_REGS).")
+
 (define_register_constraint "wr" "rs6000_constraints[RS6000_CONSTRAINT_wr]"
   "General purpose register if 64-bit instructions are enabled or NO_REGS.")
 
+(define_register_constraint "ws" "rs6000_constraints[RS6000_CONSTRAINT_ws]"
+  "VSX vector register to hold scalar double values or NO_REGS.")
+
+(define_register_constraint "wt" "rs6000_constraints[RS6000_CONSTRAINT_wt]"
+  "VSX vector register to hold 128 bit integer or NO_REGS.")
+
+(define_register_constraint "wu" "rs6000_constraints[RS6000_CONSTRAINT_wu]"
+  "Altivec register to use for float/32-bit int loads/stores  or NO_REGS.")
+
 (define_register_constraint "wv" "rs6000_constraints[RS6000_CONSTRAINT_wv]"
-  "Altivec register if -mpower8-vector is used or NO_REGS.")
+  "Altivec register to use for double loads/stores  or NO_REGS.")
+
+(define_register_constraint "ww" "rs6000_constraints[RS6000_CONSTRAINT_ww]"
+  "FP or VSX register to perform float operations under -mvsx or NO_REGS.")
 
 (define_register_constraint "wx" "rs6000_constraints[RS6000_CONSTRAINT_wx]"
   "Floating point register if the STFIWX instruction is enabled or NO_REGS.")
 
+(define_register_constraint "wy" "rs6000_constraints[RS6000_CONSTRAINT_wy]"
+  "VSX vector register to hold scalar float values or NO_REGS.")
+
 (define_register_constraint "wz" "rs6000_constraints[RS6000_CONSTRAINT_wz]"
   "Floating point register if the LFIWZX instruction is enabled or NO_REGS.")
 
-;; NO_REGs register constraint, used to merge mov{sd,sf}, since movsd can use
-;; direct move directly, and movsf can't to move between the register sets.
-;; There is a mode_attr that resolves to wm for SDmode and wn for SFmode
-(define_register_constraint "wn" "NO_REGS")
-
 ;; Lq/stq validates the address for load/store quad
 (define_memory_constraint "wQ"
   "Memory operand suitable for the load/store quad instructions"
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 202793)
+++ gcc/doc/md.texi	(working copy)
@@ -2067,40 +2067,52 @@ Floating point register (containing 32-b
 Altivec vector register
 
 @item wa
-Any VSX register
+Any VSX register if the -mvsx option was used or NO_REGS.
 
 @item wd
-VSX vector register to hold vector double data
+VSX vector register to hold vector double data or NO_REGS.
 
 @item wf
-VSX vector register to hold vector float data
+VSX vector register to hold vector float data or NO_REGS.
 
 @item wg
-If @option{-mmfpgpr} was used, a floating point register
+If @option{-mmfpgpr} was used, a floating point register or NO_REGS.
 
 @item wl
-If the LFIWAX instruction is enabled, a floating point register
+Floating point register if the LFIWAX instruction is enabled or NO_REGS.
 
 @item wm
-If direct moves are enabled, a VSX register.
+VSX register if direct move instructions are enabled, or NO_REGS.
 
 @item wn
-No register.
+No register (NO_REGS).
 
 @item wr
-General purpose register if 64-bit mode is used
+General purpose register if 64-bit instructions are enabled or NO_REGS.
 
 @item ws
-VSX vector register to hold scalar float data
+VSX vector register to hold scalar double values or NO_REGS.
 
 @item wt
-VSX vector register to hold 128 bit integer
+VSX vector register to hold 128 bit integer or NO_REGS.
+
+@item wu
+Altivec register to use for float/32-bit int loads/stores  or NO_REGS.
+
+@item wv
+Altivec register to use for double loads/stores  or NO_REGS.
+
+@item ww
+FP or VSX register to perform float operations under @option{-mvsx} or NO_REGS.
 
 @item wx
-If the STFIWX instruction is enabled, a floating point register
+Floating point register if the STFIWX instruction is enabled or NO_REGS.
+
+@item wy
+VSX vector register to hold scalar float values or NO_REGS.
 
 @item wz
-If the LFIWZX instruction is enabled, a floating point register
+Floating point register if the LFIWZX instruction is enabled or NO_REGS.
 
 @item wQ
 A memory address that will work with the @code{lq} and @code{stq}

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #1
  2013-09-21  2:09   ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #1 Michael Meissner
@ 2013-09-23 17:38     ` David Edelsohn
  0 siblings, 0 replies; 20+ messages in thread
From: David Edelsohn @ 2013-09-23 17:38 UTC (permalink / raw)
  To: Michael Meissner, GCC Patches

On Fri, Sep 20, 2013 at 7:32 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> I am now breaking the patches down to be more bite size.  Ultimately, I hope
> these patches will provide support to allow scalar floating point to occupy the
> Altivec (upper) registers if the ISA allows it (ISA 2.06 for DFmode, ISA 2.07
> for SFmode).  One effect of later patches will be to go back to using the
> traditional DFmode instructions for VSX if all of the registers come from the
> traditional floating point register set.
>
> This patch adds the new switches, and constraints that will be used in future
> patches.  It produces exactly the same code on the targets I tested on and
> passes the bootstrap/make check stages.  Is it ok to apply this patch?
>
> I have tested the code generation for the following targets:
>
>         power4, power5, power5+, power6, power6x, power7, power8 for 64/32-bit
>         power7 with VSX disabled using altivec instructions for 64/32-bit
>         power7 with both VSX and altivec disabled for 64/32-bit
>         cell 64/32-bit
>         e5500, e6500 64/32-bit
>         G4 32-bit
>         G5 64/32-bit
>         linuxpaired 32-bit
>         linuxspe 32-bit
>
> 2013-09-20  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         * config/rs6000/rs6000.h (enum r6000_reg_class_enum): Add new
>         constraints: wu, ww, and wy.  Repurpose wv constraint added during
>         power8 changes.  Put wg constraint in alphabetical order.
>
>         * config/rs6000/rs6000.opt (-mvsx-scalar-float): New debug switch
>         for future work to add ISA 2.07 VSX single precision support.
>         (-mvsx-scalar-double): Change default from -1 to 1, update
>         documentation comment.
>         (-mvsx-scalar-memory): Rename debug switch to -mupper-regs-df.
>         (-mupper-regs-df): New debug switch to control whether DF values
>         can go in the traditional Altivec registers.
>         (-mupper-regs-sf): New debug switch to control whether SF values
>         can go in the traditional Altivec registers.
>
>         * config/rs6000/rs6000.c (rs6000_debug_reg_global): Print wu, ww,
>         and wy constraints.
>         (rs6000_init_hard_regno_mode_ok): Use ssize_t instead of int for
>         loop variables.  Rename -mvsx-scalar-memory to -mupper-regs-df.
>         Add new constraints, wu/ww/wy.  Repurpose wv constraint.
>         (rs6000_debug_legitimate_address_p): Print if we are running
>         before, during, or after reload.
>         (rs6000_secondary_reload): Add a comment.
>         (rs6000_opt_masks): Add -mupper-regs-df, -mupper-regs-sf.
>
>         * config/rs6000/constraints.md (wa constraint): Sort w<x>
>         constraints.  Update documentation string.
>         (wd constraint): Likewise.
>         (wf constraint): Likewise.
>         (wg constraint): Likewise.
>         (wn constraint): Likewise.
>         (ws constraint): Likewise.
>         (wt constraint): Likewise.
>         (wx constraint): Likewise.
>         (wz constraint): Likewise.
>         (wu constraint): New constraint for ISA 2.07 SFmode scalar
>         instructions.
>         (ww constraint): Likewise.
>         (wy constraint): Likewise.
>         (wv constraint): Repurpose ISA 2.07 constraint that we not used in
>         the previous submissions.
>         * doc/md.texi (PowerPC and IBM RS6000): Likewise.

This patch is okay.

Thanks, David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #2
  2013-08-22 19:02 [PATCH, powerpc] Rework VSX scalar floating point support Michael Meissner
  2013-09-06  3:39 ` David Edelsohn
@ 2013-09-23 20:37 ` Michael Meissner
  2013-09-24  7:05   ` David Edelsohn
  2013-09-24 21:46   ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #3 Michael Meissner
  1 sibling, 2 replies; 20+ messages in thread
From: Michael Meissner @ 2013-09-23 20:37 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

[-- Attachment #1: Type: text/plain, Size: 991 bytes --]

This patch is an infrastructure patch that combines the 3 reload helper
function arrays into a single array of structures.  All machines generate the
same code with this patch (and no regressions were found in bootstrap/make
check).  Is it ok to apply?

2013-09-23  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_vector_reload): Delete, combine
	reload helper function arrays into a single array reg_addr.
	(reload_fpr_gpr): Likewise.
	(reload_gpr_vsx): Likewise.
	(reload_vsx_gpr): Likewise.
	(struct rs6000_reg_addr): Likewise.
	(reg_addr): Likewise.
	(rs6000_debug_reg_global): Change rs6000_vector_reload,
	reload_fpr_gpr, reload_gpr_vsx, reload_vsx_gpr uses to reg_addr.
	(rs6000_init_hard_regno_mode_ok): Likewise.
	(rs6000_secondary_reload_direct_move): Likewise.
	(rs6000_secondary_reload): Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[-- Attachment #2: gcc-power8.patch057b --]
[-- Type: text/plain, Size: 15639 bytes --]

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 202804)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -189,9 +189,6 @@ unsigned char rs6000_hard_regno_nregs[NU
 /* Map register number to register class.  */
 enum reg_class rs6000_regno_regclass[FIRST_PSEUDO_REGISTER];
 
-/* Reload functions based on the type and the vector unit.  */
-static enum insn_code rs6000_vector_reload[NUM_MACHINE_MODES][2];
-
 static int dbg_cost_ctrl;
 
 /* Built in types.  */
@@ -316,11 +313,16 @@ static enum rs6000_reg_type reg_class_to
 
 #define IS_FP_VECT_REG_TYPE(RTYPE) IN_RANGE(RTYPE, VSX_REG_TYPE, FPR_REG_TYPE)
 
-/* Direct moves to/from vsx/gpr registers that need an additional register to
-   do the move.  */
-static enum insn_code reload_fpr_gpr[NUM_MACHINE_MODES];
-static enum insn_code reload_gpr_vsx[NUM_MACHINE_MODES];
-static enum insn_code reload_vsx_gpr[NUM_MACHINE_MODES];
+/* Register type masks based on the type, of valid addressing modes.  */
+struct rs6000_reg_addr {
+  enum insn_code reload_load;		/* INSN to reload for loading. */
+  enum insn_code reload_store;		/* INSN to reload for storing.  */
+  enum insn_code reload_fpr_gpr;	/* INSN to move from FPR to GPR.  */
+  enum insn_code reload_gpr_vsx;	/* INSN to move from GPR to VSX.  */
+  enum insn_code reload_vsx_gpr;	/* INSN to move from VSX to GPR.  */
+};
+
+static struct rs6000_reg_addr reg_addr[NUM_MACHINE_MODES];
 
 \f
 /* Target cpu costs.  */
@@ -1919,8 +1921,8 @@ rs6000_debug_reg_global (void)
 
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
     if (rs6000_vector_unit[m] || rs6000_vector_mem[m]
-	|| (rs6000_vector_reload[m][0] != CODE_FOR_nothing)
-	|| (rs6000_vector_reload[m][1] != CODE_FOR_nothing))
+	|| (reg_addr[m].reload_load != CODE_FOR_nothing)
+	|| (reg_addr[m].reload_store != CODE_FOR_nothing))
       {
 	nl = "\n";
 	fprintf (stderr,
@@ -1929,8 +1931,8 @@ rs6000_debug_reg_global (void)
 		 GET_MODE_NAME (m),
 		 rs6000_debug_vector_unit[ rs6000_vector_unit[m] ],
 		 rs6000_debug_vector_unit[ rs6000_vector_mem[m] ],
-		 (rs6000_vector_reload[m][0] != CODE_FOR_nothing) ? 'y' : 'n',
-		 (rs6000_vector_reload[m][1] != CODE_FOR_nothing) ? 'y' : 'n');
+		 (reg_addr[m].reload_store != CODE_FOR_nothing) ? 'y' : 'n',
+		 (reg_addr[m].reload_load != CODE_FOR_nothing) ? 'y' : 'n');
       }
 
   if (nl)
@@ -2239,13 +2241,17 @@ rs6000_init_hard_regno_mode_ok (bool glo
       reg_class_to_reg_type[(int)ALTIVEC_REGS] = ALTIVEC_REG_TYPE;
     }
 
-  /* Precalculate vector information, this must be set up before the
-     rs6000_hard_regno_nregs_internal below.  */
+  /* Precalculate the valid memory formats as well as the vector information,
+     this must be set up before the rs6000_hard_regno_nregs_internal calls
+     below.  */
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
     {
       rs6000_vector_unit[m] = rs6000_vector_mem[m] = VECTOR_NONE;
-      rs6000_vector_reload[m][0] = CODE_FOR_nothing;
-      rs6000_vector_reload[m][1] = CODE_FOR_nothing;
+      reg_addr[m].reload_load = CODE_FOR_nothing;
+      reg_addr[m].reload_store = CODE_FOR_nothing;
+      reg_addr[m].reload_fpr_gpr = CODE_FOR_nothing;
+      reg_addr[m].reload_gpr_vsx = CODE_FOR_nothing;
+      reg_addr[m].reload_vsx_gpr = CODE_FOR_nothing;
     }
 
   for (c = 0; c < (int)(int)RS6000_CONSTRAINT_MAX; c++)
@@ -2421,112 +2427,104 @@ rs6000_init_hard_regno_mode_ok (bool glo
   if (TARGET_LFIWZX)
     rs6000_constraints[RS6000_CONSTRAINT_wz] = FLOAT_REGS;
 
-  /* Setup the direct move combinations.  */
-  for (m = 0; m < NUM_MACHINE_MODES; ++m)
-    {
-      reload_fpr_gpr[m] = CODE_FOR_nothing;
-      reload_gpr_vsx[m] = CODE_FOR_nothing;
-      reload_vsx_gpr[m] = CODE_FOR_nothing;
-    }
-
   /* Set up the reload helper and direct move functions.  */
   if (TARGET_VSX || TARGET_ALTIVEC)
     {
       if (TARGET_64BIT)
 	{
-	  rs6000_vector_reload[V16QImode][0] = CODE_FOR_reload_v16qi_di_store;
-	  rs6000_vector_reload[V16QImode][1] = CODE_FOR_reload_v16qi_di_load;
-	  rs6000_vector_reload[V8HImode][0]  = CODE_FOR_reload_v8hi_di_store;
-	  rs6000_vector_reload[V8HImode][1]  = CODE_FOR_reload_v8hi_di_load;
-	  rs6000_vector_reload[V4SImode][0]  = CODE_FOR_reload_v4si_di_store;
-	  rs6000_vector_reload[V4SImode][1]  = CODE_FOR_reload_v4si_di_load;
-	  rs6000_vector_reload[V2DImode][0]  = CODE_FOR_reload_v2di_di_store;
-	  rs6000_vector_reload[V2DImode][1]  = CODE_FOR_reload_v2di_di_load;
-	  rs6000_vector_reload[V4SFmode][0]  = CODE_FOR_reload_v4sf_di_store;
-	  rs6000_vector_reload[V4SFmode][1]  = CODE_FOR_reload_v4sf_di_load;
-	  rs6000_vector_reload[V2DFmode][0]  = CODE_FOR_reload_v2df_di_store;
-	  rs6000_vector_reload[V2DFmode][1]  = CODE_FOR_reload_v2df_di_load;
+	  reg_addr[V16QImode].reload_store = CODE_FOR_reload_v16qi_di_store;
+	  reg_addr[V16QImode].reload_load  = CODE_FOR_reload_v16qi_di_load;
+	  reg_addr[V8HImode].reload_store  = CODE_FOR_reload_v8hi_di_store;
+	  reg_addr[V8HImode].reload_load   = CODE_FOR_reload_v8hi_di_load;
+	  reg_addr[V4SImode].reload_store  = CODE_FOR_reload_v4si_di_store;
+	  reg_addr[V4SImode].reload_load   = CODE_FOR_reload_v4si_di_load;
+	  reg_addr[V2DImode].reload_store  = CODE_FOR_reload_v2di_di_store;
+	  reg_addr[V2DImode].reload_load   = CODE_FOR_reload_v2di_di_load;
+	  reg_addr[V4SFmode].reload_store  = CODE_FOR_reload_v4sf_di_store;
+	  reg_addr[V4SFmode].reload_load   = CODE_FOR_reload_v4sf_di_load;
+	  reg_addr[V2DFmode].reload_store  = CODE_FOR_reload_v2df_di_store;
+	  reg_addr[V2DFmode].reload_load   = CODE_FOR_reload_v2df_di_load;
 	  if (TARGET_VSX && TARGET_UPPER_REGS_DF)
 	    {
-	      rs6000_vector_reload[DFmode][0]  = CODE_FOR_reload_df_di_store;
-	      rs6000_vector_reload[DFmode][1]  = CODE_FOR_reload_df_di_load;
-	      rs6000_vector_reload[DDmode][0]  = CODE_FOR_reload_dd_di_store;
-	      rs6000_vector_reload[DDmode][1]  = CODE_FOR_reload_dd_di_load;
+	      reg_addr[DFmode].reload_store  = CODE_FOR_reload_df_di_store;
+	      reg_addr[DFmode].reload_load   = CODE_FOR_reload_df_di_load;
+	      reg_addr[DDmode].reload_store  = CODE_FOR_reload_dd_di_store;
+	      reg_addr[DDmode].reload_load   = CODE_FOR_reload_dd_di_load;
 	    }
 	  if (TARGET_P8_VECTOR)
 	    {
-	      rs6000_vector_reload[SFmode][0]  = CODE_FOR_reload_sf_di_store;
-	      rs6000_vector_reload[SFmode][1]  = CODE_FOR_reload_sf_di_load;
-	      rs6000_vector_reload[SDmode][0]  = CODE_FOR_reload_sd_di_store;
-	      rs6000_vector_reload[SDmode][1]  = CODE_FOR_reload_sd_di_load;
+	      reg_addr[SFmode].reload_store  = CODE_FOR_reload_sf_di_store;
+	      reg_addr[SFmode].reload_load   = CODE_FOR_reload_sf_di_load;
+	      reg_addr[SDmode].reload_store  = CODE_FOR_reload_sd_di_store;
+	      reg_addr[SDmode].reload_load   = CODE_FOR_reload_sd_di_load;
 	    }
 	  if (TARGET_VSX_TIMODE)
 	    {
-	      rs6000_vector_reload[TImode][0]  = CODE_FOR_reload_ti_di_store;
-	      rs6000_vector_reload[TImode][1]  = CODE_FOR_reload_ti_di_load;
+	      reg_addr[TImode].reload_store  = CODE_FOR_reload_ti_di_store;
+	      reg_addr[TImode].reload_load   = CODE_FOR_reload_ti_di_load;
 	    }
 	  if (TARGET_DIRECT_MOVE)
 	    {
 	      if (TARGET_POWERPC64)
 		{
-		  reload_gpr_vsx[TImode]    = CODE_FOR_reload_gpr_from_vsxti;
-		  reload_gpr_vsx[V2DFmode]  = CODE_FOR_reload_gpr_from_vsxv2df;
-		  reload_gpr_vsx[V2DImode]  = CODE_FOR_reload_gpr_from_vsxv2di;
-		  reload_gpr_vsx[V4SFmode]  = CODE_FOR_reload_gpr_from_vsxv4sf;
-		  reload_gpr_vsx[V4SImode]  = CODE_FOR_reload_gpr_from_vsxv4si;
-		  reload_gpr_vsx[V8HImode]  = CODE_FOR_reload_gpr_from_vsxv8hi;
-		  reload_gpr_vsx[V16QImode] = CODE_FOR_reload_gpr_from_vsxv16qi;
-		  reload_gpr_vsx[SFmode]    = CODE_FOR_reload_gpr_from_vsxsf;
-
-		  reload_vsx_gpr[TImode]    = CODE_FOR_reload_vsx_from_gprti;
-		  reload_vsx_gpr[V2DFmode]  = CODE_FOR_reload_vsx_from_gprv2df;
-		  reload_vsx_gpr[V2DImode]  = CODE_FOR_reload_vsx_from_gprv2di;
-		  reload_vsx_gpr[V4SFmode]  = CODE_FOR_reload_vsx_from_gprv4sf;
-		  reload_vsx_gpr[V4SImode]  = CODE_FOR_reload_vsx_from_gprv4si;
-		  reload_vsx_gpr[V8HImode]  = CODE_FOR_reload_vsx_from_gprv8hi;
-		  reload_vsx_gpr[V16QImode] = CODE_FOR_reload_vsx_from_gprv16qi;
-		  reload_vsx_gpr[SFmode]    = CODE_FOR_reload_vsx_from_gprsf;
+		  reg_addr[TImode].reload_gpr_vsx    = CODE_FOR_reload_gpr_from_vsxti;
+		  reg_addr[V2DFmode].reload_gpr_vsx  = CODE_FOR_reload_gpr_from_vsxv2df;
+		  reg_addr[V2DImode].reload_gpr_vsx  = CODE_FOR_reload_gpr_from_vsxv2di;
+		  reg_addr[V4SFmode].reload_gpr_vsx  = CODE_FOR_reload_gpr_from_vsxv4sf;
+		  reg_addr[V4SImode].reload_gpr_vsx  = CODE_FOR_reload_gpr_from_vsxv4si;
+		  reg_addr[V8HImode].reload_gpr_vsx  = CODE_FOR_reload_gpr_from_vsxv8hi;
+		  reg_addr[V16QImode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv16qi;
+		  reg_addr[SFmode].reload_gpr_vsx    = CODE_FOR_reload_gpr_from_vsxsf;
+
+		  reg_addr[TImode].reload_vsx_gpr    = CODE_FOR_reload_vsx_from_gprti;
+		  reg_addr[V2DFmode].reload_vsx_gpr  = CODE_FOR_reload_vsx_from_gprv2df;
+		  reg_addr[V2DImode].reload_vsx_gpr  = CODE_FOR_reload_vsx_from_gprv2di;
+		  reg_addr[V4SFmode].reload_vsx_gpr  = CODE_FOR_reload_vsx_from_gprv4sf;
+		  reg_addr[V4SImode].reload_vsx_gpr  = CODE_FOR_reload_vsx_from_gprv4si;
+		  reg_addr[V8HImode].reload_vsx_gpr  = CODE_FOR_reload_vsx_from_gprv8hi;
+		  reg_addr[V16QImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv16qi;
+		  reg_addr[SFmode].reload_vsx_gpr    = CODE_FOR_reload_vsx_from_gprsf;
 		}
 	      else
 		{
-		  reload_fpr_gpr[DImode] = CODE_FOR_reload_fpr_from_gprdi;
-		  reload_fpr_gpr[DDmode] = CODE_FOR_reload_fpr_from_gprdd;
-		  reload_fpr_gpr[DFmode] = CODE_FOR_reload_fpr_from_gprdf;
+		  reg_addr[DImode].reload_fpr_gpr = CODE_FOR_reload_fpr_from_gprdi;
+		  reg_addr[DDmode].reload_fpr_gpr = CODE_FOR_reload_fpr_from_gprdd;
+		  reg_addr[DFmode].reload_fpr_gpr = CODE_FOR_reload_fpr_from_gprdf;
 		}
 	    }
 	}
       else
 	{
-	  rs6000_vector_reload[V16QImode][0] = CODE_FOR_reload_v16qi_si_store;
-	  rs6000_vector_reload[V16QImode][1] = CODE_FOR_reload_v16qi_si_load;
-	  rs6000_vector_reload[V8HImode][0]  = CODE_FOR_reload_v8hi_si_store;
-	  rs6000_vector_reload[V8HImode][1]  = CODE_FOR_reload_v8hi_si_load;
-	  rs6000_vector_reload[V4SImode][0]  = CODE_FOR_reload_v4si_si_store;
-	  rs6000_vector_reload[V4SImode][1]  = CODE_FOR_reload_v4si_si_load;
-	  rs6000_vector_reload[V2DImode][0]  = CODE_FOR_reload_v2di_si_store;
-	  rs6000_vector_reload[V2DImode][1]  = CODE_FOR_reload_v2di_si_load;
-	  rs6000_vector_reload[V4SFmode][0]  = CODE_FOR_reload_v4sf_si_store;
-	  rs6000_vector_reload[V4SFmode][1]  = CODE_FOR_reload_v4sf_si_load;
-	  rs6000_vector_reload[V2DFmode][0]  = CODE_FOR_reload_v2df_si_store;
-	  rs6000_vector_reload[V2DFmode][1]  = CODE_FOR_reload_v2df_si_load;
+	  reg_addr[V16QImode].reload_store = CODE_FOR_reload_v16qi_si_store;
+	  reg_addr[V16QImode].reload_load  = CODE_FOR_reload_v16qi_si_load;
+	  reg_addr[V8HImode].reload_store  = CODE_FOR_reload_v8hi_si_store;
+	  reg_addr[V8HImode].reload_load   = CODE_FOR_reload_v8hi_si_load;
+	  reg_addr[V4SImode].reload_store  = CODE_FOR_reload_v4si_si_store;
+	  reg_addr[V4SImode].reload_load   = CODE_FOR_reload_v4si_si_load;
+	  reg_addr[V2DImode].reload_store  = CODE_FOR_reload_v2di_si_store;
+	  reg_addr[V2DImode].reload_load   = CODE_FOR_reload_v2di_si_load;
+	  reg_addr[V4SFmode].reload_store  = CODE_FOR_reload_v4sf_si_store;
+	  reg_addr[V4SFmode].reload_load   = CODE_FOR_reload_v4sf_si_load;
+	  reg_addr[V2DFmode].reload_store  = CODE_FOR_reload_v2df_si_store;
+	  reg_addr[V2DFmode].reload_load   = CODE_FOR_reload_v2df_si_load;
 	  if (TARGET_VSX && TARGET_UPPER_REGS_DF)
 	    {
-	      rs6000_vector_reload[DFmode][0]  = CODE_FOR_reload_df_si_store;
-	      rs6000_vector_reload[DFmode][1]  = CODE_FOR_reload_df_si_load;
-	      rs6000_vector_reload[DDmode][0]  = CODE_FOR_reload_dd_si_store;
-	      rs6000_vector_reload[DDmode][1]  = CODE_FOR_reload_dd_si_load;
+	      reg_addr[DFmode].reload_store  = CODE_FOR_reload_df_si_store;
+	      reg_addr[DFmode].reload_load   = CODE_FOR_reload_df_si_load;
+	      reg_addr[DDmode].reload_store  = CODE_FOR_reload_dd_si_store;
+	      reg_addr[DDmode].reload_load   = CODE_FOR_reload_dd_si_load;
 	    }
 	  if (TARGET_P8_VECTOR)
 	    {
-	      rs6000_vector_reload[SFmode][0]  = CODE_FOR_reload_sf_si_store;
-	      rs6000_vector_reload[SFmode][1]  = CODE_FOR_reload_sf_si_load;
-	      rs6000_vector_reload[SDmode][0]  = CODE_FOR_reload_sd_si_store;
-	      rs6000_vector_reload[SDmode][1]  = CODE_FOR_reload_sd_si_load;
+	      reg_addr[SFmode].reload_store  = CODE_FOR_reload_sf_si_store;
+	      reg_addr[SFmode].reload_load   = CODE_FOR_reload_sf_si_load;
+	      reg_addr[SDmode].reload_store  = CODE_FOR_reload_sd_si_store;
+	      reg_addr[SDmode].reload_load   = CODE_FOR_reload_sd_si_load;
 	    }
 	  if (TARGET_VSX_TIMODE)
 	    {
-	      rs6000_vector_reload[TImode][0]  = CODE_FOR_reload_ti_si_store;
-	      rs6000_vector_reload[TImode][1]  = CODE_FOR_reload_ti_si_load;
+	      reg_addr[TImode].reload_store  = CODE_FOR_reload_ti_si_store;
+	      reg_addr[TImode].reload_load   = CODE_FOR_reload_ti_si_load;
 	    }
 	}
     }
@@ -14745,7 +14743,7 @@ rs6000_secondary_reload_direct_move (enu
 	  if (to_type == VSX_REG_TYPE && from_type == GPR_REG_TYPE)
 	    {
 	      cost = 3;			/* 2 mtvsrd's, 1 xxpermdi.  */
-	      icode = reload_vsx_gpr[(int)mode];
+	      icode = reg_addr[mode].reload_vsx_gpr;
 	    }
 
 	  /* Handle moving 128-bit values from VSX point registers to GPRs on
@@ -14754,7 +14752,7 @@ rs6000_secondary_reload_direct_move (enu
 	  else if (to_type == GPR_REG_TYPE && from_type == VSX_REG_TYPE)
 	    {
 	      cost = 3;			/* 2 mfvsrd's, 1 xxpermdi.  */
-	      icode = reload_gpr_vsx[(int)mode];
+	      icode = reg_addr[mode].reload_gpr_vsx;
 	    }
 	}
 
@@ -14763,13 +14761,13 @@ rs6000_secondary_reload_direct_move (enu
 	  if (to_type == GPR_REG_TYPE && from_type == VSX_REG_TYPE)
 	    {
 	      cost = 3;			/* xscvdpspn, mfvsrd, and.  */
-	      icode = reload_gpr_vsx[(int)mode];
+	      icode = reg_addr[mode].reload_gpr_vsx;
 	    }
 
 	  else if (to_type == VSX_REG_TYPE && from_type == GPR_REG_TYPE)
 	    {
 	      cost = 2;			/* mtvsrz, xscvspdpn.  */
-	      icode = reload_vsx_gpr[(int)mode];
+	      icode = reg_addr[mode].reload_vsx_gpr;
 	    }
 	}
     }
@@ -14782,7 +14780,7 @@ rs6000_secondary_reload_direct_move (enu
       if (to_type == VSX_REG_TYPE && from_type == GPR_REG_TYPE)
 	{
 	  cost = 3;			/* 2 mtvsrd's, 1 xxpermdi.  */
-	  icode = reload_vsx_gpr[(int)mode];
+	  icode = reg_addr[mode].reload_vsx_gpr;
 	}
 
       /* Handle moving 128-bit values from VSX point registers to GPRs on
@@ -14791,7 +14789,7 @@ rs6000_secondary_reload_direct_move (enu
       else if (to_type == GPR_REG_TYPE && from_type == VSX_REG_TYPE)
 	{
 	  cost = 3;			/* 2 mfvsrd's, 1 xxpermdi.  */
-	  icode = reload_gpr_vsx[(int)mode];
+	  icode = reg_addr[mode].reload_gpr_vsx;
 	}
     }
 
@@ -14807,7 +14805,7 @@ rs6000_secondary_reload_direct_move (enu
       if (to_type == VSX_REG_TYPE && from_type == GPR_REG_TYPE && !altivec_p)
 	{
 	  cost = 3;			/* 2 mtvsrwz's, 1 fmrgow.  */
-	  icode = reload_fpr_gpr[(int)mode];
+	  icode = reg_addr[mode].reload_fpr_gpr;
 	}
     }
 
@@ -14890,7 +14888,9 @@ rs6000_secondary_reload (bool in_p,
   bool default_p = false;
 
   sri->icode = CODE_FOR_nothing;
-  icode = rs6000_vector_reload[mode][in_p != false];
+  icode = ((in_p)
+	   ? reg_addr[mode].reload_load
+	   : reg_addr[mode].reload_store);
 
   if (REG_P (x) || register_operand (x, mode))
     {

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #2
  2013-09-23 20:37 ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #2 Michael Meissner
@ 2013-09-24  7:05   ` David Edelsohn
  2013-09-24 21:46   ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #3 Michael Meissner
  1 sibling, 0 replies; 20+ messages in thread
From: David Edelsohn @ 2013-09-24  7:05 UTC (permalink / raw)
  To: Michael Meissner, GCC Patches

On Mon, Sep 23, 2013 at 4:06 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> This patch is an infrastructure patch that combines the 3 reload helper
> function arrays into a single array of structures.  All machines generate the
> same code with this patch (and no regressions were found in bootstrap/make
> check).  Is it ok to apply?
>
> 2013-09-23  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         * config/rs6000/rs6000.c (rs6000_vector_reload): Delete, combine
>         reload helper function arrays into a single array reg_addr.
>         (reload_fpr_gpr): Likewise.
>         (reload_gpr_vsx): Likewise.
>         (reload_vsx_gpr): Likewise.
>         (struct rs6000_reg_addr): Likewise.
>         (reg_addr): Likewise.
>         (rs6000_debug_reg_global): Change rs6000_vector_reload,
>         reload_fpr_gpr, reload_gpr_vsx, reload_vsx_gpr uses to reg_addr.
>         (rs6000_init_hard_regno_mode_ok): Likewise.
>         (rs6000_secondary_reload_direct_move): Likewise.
>         (rs6000_secondary_reload): Likewise.

Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #3
  2013-09-23 20:37 ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #2 Michael Meissner
  2013-09-24  7:05   ` David Edelsohn
@ 2013-09-24 21:46   ` Michael Meissner
  2013-09-26  8:34     ` David Edelsohn
  1 sibling, 1 reply; 20+ messages in thread
From: Michael Meissner @ 2013-09-24 21:46 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, dje.gcc

[-- Attachment #1: Type: text/plain, Size: 1696 bytes --]

This patch adds the initial support for putting DI, DF, and SF values in the
upper registers (traditional Altivec registers) using the -mupper-regs-df and
-mupper-regs-sf patches.  Those switches will not be enabled by default until
the rest of the changes are made.  This patch passes the bootstrap test and
make check test.  I tested all of the targets I tested previously (power4-8,
G4/G5, SPE, cell, e5500/e5600, and paired floating point), and all machines
generate the same code.  Is it ok to install this patch?

[gcc]
2013-09-24  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Allow
	DFmode, DImode, and SFmode in the upper VSX registers based on the
	-mupper-regs-{df,sf} flags.  Fix wu constraint to be ALTIVEC_REGS
	if -mpower8-vector.  Combine -mvsx-timode handling with the rest
	of the VSX register handling.

	* config/rs6000/rs6000.md (f32_lv): Use %x0 for VSX regsters.
	(f32_sv): Likewise.
	(zero_extendsidi2_lfiwzx): Add support for loading into the
	Altivec registers with -mpower8-vector.  Use wu/wv constraints to
	only do VSX memory options on Altivec registers.
	(extendsidi2_lfiwax): Likewise.
	(extendsfdf2_fpr): Likewise.
	(mov<mode>_hardfloat, SF/SD modes): Likewise.
	(mov<mode>_hardfloat32, DF/DD modes): Likewise.
	(mov<mode>_hardfloat64, DF/DD modes): Likewise.
	(movdi_internal64): Likewise.

[gcc/testsuite]
2013-09-24  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/p8vector-ldst.c: New test for -mupper-regs-sf
	and -mupper-regs-df.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[-- Attachment #2: gcc-power8.patch058b --]
[-- Type: text/plain, Size: 11842 bytes --]

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 202855)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -1626,19 +1626,28 @@ rs6000_hard_regno_mode_ok (int regno, en
 
   /* VSX registers that overlap the FPR registers are larger than for non-VSX
      implementations.  Don't allow an item to be split between a FP register
-     and an Altivec register.  */
-  if (VECTOR_MEM_VSX_P (mode))
+     and an Altivec register.  Allow TImode in all VSX registers if the user
+     asked for it.  */
+  if (TARGET_VSX && VSX_REGNO_P (regno)
+      && (VECTOR_MEM_VSX_P (mode)
+	  || (TARGET_VSX_SCALAR_FLOAT && mode == SFmode)
+	  || (TARGET_VSX_SCALAR_DOUBLE && (mode == DFmode || mode == DImode))
+	  || (TARGET_VSX_TIMODE && mode == TImode)))
     {
       if (FP_REGNO_P (regno))
 	return FP_REGNO_P (last_regno);
 
       if (ALTIVEC_REGNO_P (regno))
-	return ALTIVEC_REGNO_P (last_regno);
-    }
+	{
+	  if (mode == SFmode && !TARGET_UPPER_REGS_SF)
+	    return 0;
 
-  /* Allow TImode in all VSX registers if the user asked for it.  */
-  if (mode == TImode && TARGET_VSX_TIMODE && VSX_REGNO_P (regno))
-    return 1;
+	  if ((mode == DFmode || mode == DImode) && !TARGET_UPPER_REGS_DF)
+	    return 0;
+
+	  return ALTIVEC_REGNO_P (last_regno);
+	}
+    }
 
   /* The GPRs can hold any mode, but values bigger than one register
      cannot go past R31.  */
@@ -2413,7 +2422,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
 
   if (TARGET_P8_VECTOR)
     {
-      rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
+      rs6000_constraints[RS6000_CONSTRAINT_wu] = ALTIVEC_REGS;
       rs6000_constraints[RS6000_CONSTRAINT_wy]
 	= rs6000_constraints[RS6000_CONSTRAINT_ww]
 	= (TARGET_UPPER_REGS_SF) ? VSX_REGS : FLOAT_REGS;
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 202846)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -314,13 +314,13 @@ (define_mode_attr real_value_to_target [
 (define_mode_attr f32_lr [(SF "f")		 (SD "wz")])
 (define_mode_attr f32_lm [(SF "m")		 (SD "Z")])
 (define_mode_attr f32_li [(SF "lfs%U1%X1 %0,%1") (SD "lfiwzx %0,%y1")])
-(define_mode_attr f32_lv [(SF "lxsspx %0,%y1")	 (SD "lxsiwzx %0,%y1")])
+(define_mode_attr f32_lv [(SF "lxsspx %x0,%y1")	 (SD "lxsiwzx %x0,%y1")])
 
 ; Definitions for store from 32-bit fpr register
 (define_mode_attr f32_sr [(SF "f")		  (SD "wx")])
 (define_mode_attr f32_sm [(SF "m")		  (SD "Z")])
 (define_mode_attr f32_si [(SF "stfs%U0%X0 %1,%0") (SD "stfiwx %1,%y0")])
-(define_mode_attr f32_sv [(SF "stxsspx %1,%y0")	  (SD "stxsiwzx %1,%y0")])
+(define_mode_attr f32_sv [(SF "stxsspx %x1,%y0")  (SD "stxsiwzx %x1,%y0")])
 
 ; Definitions for 32-bit fpr direct move
 (define_mode_attr f32_dm [(SF "wn") (SD "wm")])
@@ -541,7 +541,7 @@ (define_split
   "")
 
 (define_insn "*zero_extendsidi2_lfiwzx"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wm,!wz,!wm")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wm,!wz,!wu")
 	(zero_extend:DI (match_operand:SI 1 "reg_or_mem_operand" "m,r,r,Z,Z")))]
   "TARGET_POWERPC64 && TARGET_LFIWZX"
   "@
@@ -711,7 +711,7 @@ (define_expand "extendsidi2"
   "")
 
 (define_insn "*extendsidi2_lfiwax"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wm,!wl,!wm")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wm,!wl,!wu")
 	(sign_extend:DI (match_operand:SI 1 "lwa_operand" "m,r,r,Z,Z")))]
   "TARGET_POWERPC64 && TARGET_LFIWAX"
   "@
@@ -5066,13 +5066,16 @@ (define_expand "extendsfdf2"
   "")
 
 (define_insn_and_split "*extendsfdf2_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d,?d,d")
-	(float_extend:DF (match_operand:SF 1 "reg_or_mem_operand" "0,f,m")))]
+  [(set (match_operand:DF 0 "gpc_reg_operand" "=d,?d,d,wy,?wy,wv")
+	(float_extend:DF (match_operand:SF 1 "reg_or_mem_operand" "0,f,m,0,wz,Z")))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
   "@
    #
    fmr %0,%1
-   lfs%U1%X1 %0,%1"
+   lfs%U1%X1 %0,%1
+   #
+   xxlor %x0,%x1,%x1
+   lxsspx %x0,%y1"
   "&& reload_completed && REG_P (operands[1]) && REGNO (operands[0]) == REGNO (operands[1])"
   [(const_int 0)]
 {
@@ -5088,7 +5091,16 @@ (define_insn_and_split "*extendsfdf2_fpr
 	 (if_then_else
 	   (match_test "update_address_mem (operands[1], VOIDmode)")
 	   (const_string "fpload_u")
-	   (const_string "fpload")))])])
+	   (const_string "fpload")))
+       (const_string "fp")
+       (const_string "vecsimple")
+       (if_then_else
+	(match_test "update_indexed_address_mem (operands[1], VOIDmode)")
+	(const_string "fpload_ux")
+	(if_then_else
+	 (match_test "update_address_mem (operands[1], VOIDmode)")
+	 (const_string "fpload_u")
+	 (const_string "fpload")))])])
 
 (define_expand "truncdfsf2"
   [(set (match_operand:SF 0 "gpc_reg_operand" "")
@@ -9290,8 +9302,8 @@ (define_split
 }")
 
 (define_insn "mov<mode>_hardfloat"
-  [(set (match_operand:FMOVE32 0 "nonimmediate_operand" "=!r,!r,m,f,wa,wa,<f32_lr>,<f32_sm>,wm,Z,?<f32_dm>,?r,*c*l,!r,*h,!r,!r")
-	(match_operand:FMOVE32 1 "input_operand" "r,m,r,f,wa,j,<f32_lm>,<f32_sr>,Z,wm,r,<f32_dm>,r,h,0,G,Fn"))]
+  [(set (match_operand:FMOVE32 0 "nonimmediate_operand" "=!r,!r,m,f,wa,wa,<f32_lr>,<f32_sm>,wu,Z,?<f32_dm>,?r,*c*l,!r,*h,!r,!r")
+	(match_operand:FMOVE32 1 "input_operand" "r,m,r,f,wa,j,<f32_lm>,<f32_sr>,Z,wu,r,<f32_dm>,r,h,0,G,Fn"))]
   "(gpc_reg_operand (operands[0], <MODE>mode)
    || gpc_reg_operand (operands[1], <MODE>mode))
    && (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT)"
@@ -9492,8 +9504,8 @@ (define_split
 ;; reloading.
 
 (define_insn "*mov<mode>_hardfloat32"
-  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,ws,?wa,Z,?Z,ws,?wa,wa,Y,r,!r,!r,!r,!r")
-	(match_operand:FMOVE64 1 "input_operand" "d,m,d,Z,Z,ws,wa,ws,wa,j,r,Y,r,G,H,F"))]
+  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,wv,Z,wa,wa,Y,r,!r,!r,!r,!r")
+	(match_operand:FMOVE64 1 "input_operand" "d,m,d,Z,wv,wa,j,r,Y,r,G,H,F"))]
   "! TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT 
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
@@ -9502,11 +9514,8 @@ (define_insn "*mov<mode>_hardfloat32"
    lfd%U1%X1 %0,%1
    fmr %0,%1
    lxsd%U1x %x0,%y1
-   lxsd%U1x %x0,%y1
-   stxsd%U0x %x1,%y0
    stxsd%U0x %x1,%y0
    xxlor %x0,%x1,%x1
-   xxlor %x0,%x1,%x1
    xxlxor %x0,%x0,%x0
    #
    #
@@ -9535,27 +9544,18 @@ (define_insn "*mov<mode>_hardfloat32"
 	 (const_string "fpload_ux")
 	 (const_string "fpload"))
        (if_then_else
-	 (match_test "update_indexed_address_mem (operands[1], VOIDmode)")
-	 (const_string "fpload_ux")
-	 (const_string "fpload"))
-       (if_then_else
-	 (match_test "update_indexed_address_mem (operands[0], VOIDmode)")
-	 (const_string "fpstore_ux")
-	 (const_string "fpstore"))
-       (if_then_else
 	 (match_test "update_indexed_address_mem (operands[0], VOIDmode)")
 	 (const_string "fpstore_ux")
 	 (const_string "fpstore"))
        (const_string "vecsimple")
        (const_string "vecsimple")
-       (const_string "vecsimple")
        (const_string "store")
        (const_string "load")
        (const_string "two")
        (const_string "fp")
        (const_string "fp")
        (const_string "*")])
-   (set_attr "length" "4,4,4,4,4,4,4,4,4,4,8,8,8,8,12,16")])
+   (set_attr "length" "4,4,4,4,4,4,4,8,8,8,8,12,16")])
 
 (define_insn "*mov<mode>_softfloat32"
   [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=Y,r,r,r,r,r")
@@ -9572,8 +9572,8 @@ (define_insn "*mov<mode>_softfloat32"
 ; ld/std require word-aligned displacements -> 'Y' constraint.
 ; List Y->r and r->Y before r->r for reload.
 (define_insn "*mov<mode>_hardfloat64"
-  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,ws,?wa,Z,?Z,ws,?wa,wa,Y,r,!r,*c*l,!r,*h,!r,!r,!r,r,wg,r,wm")
-	(match_operand:FMOVE64 1 "input_operand" "d,m,d,Z,Z,ws,wa,ws,wa,j,r,Y,r,r,h,0,G,H,F,wg,r,wm,r"))]
+  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,wv,Z,wa,wa,Y,r,!r,*c*l,!r,*h,!r,!r,!r,r,wg,r,wm")
+	(match_operand:FMOVE64 1 "input_operand" "d,m,d,Z,wv,wa,j,r,Y,r,r,h,0,G,H,F,wg,r,wm,r"))]
   "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
@@ -9582,11 +9582,8 @@ (define_insn "*mov<mode>_hardfloat64"
    lfd%U1%X1 %0,%1
    fmr %0,%1
    lxsd%U1x %x0,%y1
-   lxsd%U1x %x0,%y1
-   stxsd%U0x %x1,%y0
    stxsd%U0x %x1,%y0
    xxlor %x0,%x1,%x1
-   xxlor %x0,%x1,%x1
    xxlxor %x0,%x0,%x0
    std%U0%X0 %1,%0
    ld%U1%X1 %0,%1
@@ -9622,20 +9619,11 @@ (define_insn "*mov<mode>_hardfloat64"
 	 (const_string "fpload_ux")
 	 (const_string "fpload"))
        (if_then_else
-	 (match_test "update_indexed_address_mem (operands[1], VOIDmode)")
-	 (const_string "fpload_ux")
-	 (const_string "fpload"))
-       (if_then_else
-	 (match_test "update_indexed_address_mem (operands[0], VOIDmode)")
-	 (const_string "fpstore_ux")
-	 (const_string "fpstore"))
-       (if_then_else
 	 (match_test "update_indexed_address_mem (operands[0], VOIDmode)")
 	 (const_string "fpstore_ux")
 	 (const_string "fpstore"))
        (const_string "vecsimple")
        (const_string "vecsimple")
-       (const_string "vecsimple")
        (if_then_else
 	 (match_test "update_indexed_address_mem (operands[0], VOIDmode)")
 	 (const_string "store_ux")
@@ -9661,7 +9649,7 @@ (define_insn "*mov<mode>_hardfloat64"
        (const_string "mffgpr")
        (const_string "mftgpr")
        (const_string "mffgpr")])
-   (set_attr "length" "4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,8,12,16,4,4,4,4")])
+   (set_attr "length" "4,4,4,4,4,4,4,4,4,4,4,4,4,8,12,16,4,4,4,4")])
 
 (define_insn "*mov<mode>_softfloat64"
   [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=Y,r,r,cl,r,r,r,r,*h")
@@ -10324,8 +10312,8 @@ (define_split
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
 
 (define_insn "*movdi_internal64"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=Y,r,r,r,r,r,?m,?*d,?*d,?Z,?wa,?wa,r,*h,*h,?wa,r,?*wg,r,?*wm")
-	(match_operand:DI 1 "input_operand" "r,Y,r,I,L,nF,d,m,d,wa,Z,wa,*h,r,0,O,*wg,r,*wm,r"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=Y,r,r,r,r,r,?m,?*d,?*d,?Z,?wv,?wa,r,*h,*h,?wa,r,?*wg,r,?*wm")
+	(match_operand:DI 1 "input_operand" "r,Y,r,I,L,nF,d,m,d,wv,Z,wa,*h,r,0,O,*wg,r,*wm,r"))]
   "TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], DImode)
        || gpc_reg_operand (operands[1], DImode))"
Index: gcc/testsuite/gcc.target/powerpc/p8vector-ldst.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p8vector-ldst.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p8vector-ldst.c	(revision 0)
@@ -0,0 +1,42 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mcpu=power8 -O2 -mupper-regs-df -mupper-regs-sf" } */
+
+float load_sf (float *p)
+{
+  float f = *p;
+  __asm__ ("# reg %x0" : "+v" (f));
+  return f;
+}
+
+double load_df (double *p)
+{
+  double d = *p;
+  __asm__ ("# reg %x0" : "+v" (d));
+  return d;
+}
+
+double load_dfsf (float *p)
+{
+  double d = (double) *p;
+  __asm__ ("# reg %x0" : "+v" (d));
+  return d;
+}
+
+void store_sf (float *p, float f)
+{
+  __asm__ ("# reg %x0" : "+v" (f));
+  *p = f;
+}
+
+void store_df (double *p, double d)
+{
+  __asm__ ("# reg %x0" : "+v" (d));
+  *p = d;
+}
+
+/* { dg-final { scan-assembler-times "lxsspx"  2 } } */
+/* { dg-final { scan-assembler-times "lxsdx"   1 } } */
+/* { dg-final { scan-assembler-times "stxsspx" 1 } } */
+/* { dg-final { scan-assembler-times "stxsdx"  1 } } */

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #3
  2013-09-24 21:46   ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #3 Michael Meissner
@ 2013-09-26  8:34     ` David Edelsohn
  2013-09-26 21:37       ` Michael Meissner
  0 siblings, 1 reply; 20+ messages in thread
From: David Edelsohn @ 2013-09-26  8:34 UTC (permalink / raw)
  To: Michael Meissner, GCC Patches

On Tue, Sep 24, 2013 at 4:33 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> This patch adds the initial support for putting DI, DF, and SF values in the
> upper registers (traditional Altivec registers) using the -mupper-regs-df and
> -mupper-regs-sf patches.  Those switches will not be enabled by default until
> the rest of the changes are made.  This patch passes the bootstrap test and
> make check test.  I tested all of the targets I tested previously (power4-8,
> G4/G5, SPE, cell, e5500/e5600, and paired floating point), and all machines
> generate the same code.  Is it ok to install this patch?
>
> [gcc]
> 2013-09-24  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         * config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Allow
>         DFmode, DImode, and SFmode in the upper VSX registers based on the
>         -mupper-regs-{df,sf} flags.  Fix wu constraint to be ALTIVEC_REGS
>         if -mpower8-vector.  Combine -mvsx-timode handling with the rest
>         of the VSX register handling.
>
>         * config/rs6000/rs6000.md (f32_lv): Use %x0 for VSX regsters.
>         (f32_sv): Likewise.
>         (zero_extendsidi2_lfiwzx): Add support for loading into the
>         Altivec registers with -mpower8-vector.  Use wu/wv constraints to
>         only do VSX memory options on Altivec registers.
>         (extendsidi2_lfiwax): Likewise.
>         (extendsfdf2_fpr): Likewise.
>         (mov<mode>_hardfloat, SF/SD modes): Likewise.
>         (mov<mode>_hardfloat32, DF/DD modes): Likewise.
>         (mov<mode>_hardfloat64, DF/DD modes): Likewise.
>         (movdi_internal64): Likewise.
>
> [gcc/testsuite]
> 2013-09-24  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         * gcc.target/powerpc/p8vector-ldst.c: New test for -mupper-regs-sf
>         and -mupper-regs-df.

Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #3
  2013-09-26  8:34     ` David Edelsohn
@ 2013-09-26 21:37       ` Michael Meissner
  2013-09-26 21:50         ` Michael Meissner
                           ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Michael Meissner @ 2013-09-26 21:37 UTC (permalink / raw)
  To: David Edelsohn, Michael Meissner, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 687 bytes --]

I discovered that I was setting the wv/wu constraints incorrectly to
ALTIVEC_REGS, which leads to reload failures in some cases.

Is this patch ok to apply along with the previous patch assuming it bootstraps
and has no regressions with make check?  It builds the programs that had
failures with the previous patch.

2013-09-26  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Don't
	allow wv/wu constraints to be ALTIVEC_REGISTERS unless DF/SF can
	occupy the Altivec registers.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[-- Attachment #2: gcc-power8.patch059c --]
[-- Type: text/plain, Size: 3900 bytes --]

Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def	(revision 202846)
+++ gcc/config/rs6000/rs6000-builtin.def	(working copy)
@@ -1209,9 +1209,9 @@ BU_VSX_1 (XVRSPIZ,	      "xvrspiz",	CONS
 
 BU_VSX_1 (XSRDPI,	      "xsrdpi",		CONST,	vsx_xsrdpi)
 BU_VSX_1 (XSRDPIC,	      "xsrdpic",	CONST,	vsx_xsrdpic)
-BU_VSX_1 (XSRDPIM,	      "xsrdpim",	CONST,	vsx_floordf2)
-BU_VSX_1 (XSRDPIP,	      "xsrdpip",	CONST,	vsx_ceildf2)
-BU_VSX_1 (XSRDPIZ,	      "xsrdpiz",	CONST,	vsx_btruncdf2)
+BU_VSX_1 (XSRDPIM,	      "xsrdpim",	CONST,	floordf2)
+BU_VSX_1 (XSRDPIP,	      "xsrdpip",	CONST,	ceildf2)
+BU_VSX_1 (XSRDPIZ,	      "xsrdpiz",	CONST,	btruncdf2)
 
 /* VSX predicate functions.  */
 BU_VSX_P (XVCMPEQSP_P,	      "xvcmpeqsp_p",	CONST,	vector_eq_v4sf_p)
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 202874)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -2394,13 +2394,17 @@ rs6000_init_hard_regno_mode_ok (bool glo
       rs6000_constraints[RS6000_CONSTRAINT_wa] = VSX_REGS;
       rs6000_constraints[RS6000_CONSTRAINT_wd] = VSX_REGS;
       rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS;
-      rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
 
       if (TARGET_VSX_TIMODE)
 	rs6000_constraints[RS6000_CONSTRAINT_wt] = VSX_REGS;
 
-      rs6000_constraints[RS6000_CONSTRAINT_ws]
-	= (TARGET_UPPER_REGS_DF) ? VSX_REGS : FLOAT_REGS;
+      if (TARGET_UPPER_REGS_DF)
+	{
+	  rs6000_constraints[RS6000_CONSTRAINT_ws] = VSX_REGS;
+	  rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
+	}
+      else
+	rs6000_constraints[RS6000_CONSTRAINT_ws] = FLOAT_REGS;
     }
 
   /* Add conditional constraints based on various options, to allow us to
@@ -2420,12 +2424,16 @@ rs6000_init_hard_regno_mode_ok (bool glo
   if (TARGET_POWERPC64)
     rs6000_constraints[RS6000_CONSTRAINT_wr] = GENERAL_REGS;
 
-  if (TARGET_P8_VECTOR)
+  if (TARGET_P8_VECTOR && TARGET_UPPER_REGS_SF)
     {
       rs6000_constraints[RS6000_CONSTRAINT_wu] = ALTIVEC_REGS;
-      rs6000_constraints[RS6000_CONSTRAINT_wy]
-	= rs6000_constraints[RS6000_CONSTRAINT_ww]
-	= (TARGET_UPPER_REGS_SF) ? VSX_REGS : FLOAT_REGS;
+      rs6000_constraints[RS6000_CONSTRAINT_wy] = VSX_REGS;
+      rs6000_constraints[RS6000_CONSTRAINT_ww] = VSX_REGS;
+    }
+  else if (TARGET_P8_VECTOR)
+    {
+      rs6000_constraints[RS6000_CONSTRAINT_wy] = FLOAT_REGS;
+      rs6000_constraints[RS6000_CONSTRAINT_ww] = FLOAT_REGS;
     }
   else if (TARGET_VSX)
     rs6000_constraints[RS6000_CONSTRAINT_ww] = FLOAT_REGS;
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 202846)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -617,6 +617,25 @@ extern int rs6000_vector_align[];
 			  || rs6000_cpu == PROCESSOR_PPC8548)
 
 
+/* Whether SF/DF operations are supported on the E500.  */
+#define TARGET_SF_SPE	(TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT	\
+			 && !TARGET_FPRS)
+
+#define TARGET_DF_SPE	(TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT	\
+			 && !TARGET_FPRS && TARGET_E500_DOUBLE)
+
+/* Whether SF/DF operations are supported by by the normal floating point unit
+   (or the vector/scalar unit).  */
+#define TARGET_SF_FPR	(TARGET_HARD_FLOAT && TARGET_FPRS		\
+			 && TARGET_SINGLE_FLOAT)
+
+#define TARGET_DF_FPR	(TARGET_HARD_FLOAT && TARGET_FPRS		\
+			 && TARGET_DOUBLE_FLOAT)
+
+/* Whether SF/DF operations are supported by any hardware.  */
+#define TARGET_SF_INSN	(TARGET_SF_FPR || TARGET_SF_SPE)
+#define TARGET_DF_INSN	(TARGET_DF_FPR || TARGET_DF_SPE)
+
 /* Which machine supports the various reciprocal estimate instructions.  */
 #define TARGET_FRES	(TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT \
 			 && TARGET_FPRS && TARGET_SINGLE_FLOAT)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #3
  2013-09-26 21:37       ` Michael Meissner
@ 2013-09-26 21:50         ` Michael Meissner
  2013-09-27  2:15         ` David Edelsohn
  2013-10-01 17:52         ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #4 Michael Meissner
  2 siblings, 0 replies; 20+ messages in thread
From: Michael Meissner @ 2013-09-26 21:50 UTC (permalink / raw)
  To: Michael Meissner, David Edelsohn, GCC Patches

Just to be clear, I was only asking about the change in rs6000.c.  The other
two changes (rs6000-builtins.def, rs6000.h) will be part of the next patch
set.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #3
  2013-09-26 21:37       ` Michael Meissner
  2013-09-26 21:50         ` Michael Meissner
@ 2013-09-27  2:15         ` David Edelsohn
  2013-09-27  4:18           ` Michael Meissner
  2013-10-01 17:27           ` Michael Meissner
  2013-10-01 17:52         ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #4 Michael Meissner
  2 siblings, 2 replies; 20+ messages in thread
From: David Edelsohn @ 2013-09-27  2:15 UTC (permalink / raw)
  To: Michael Meissner, GCC Patches

On Thu, Sep 26, 2013 at 4:51 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> I discovered that I was setting the wv/wu constraints incorrectly to
> ALTIVEC_REGS, which leads to reload failures in some cases.
>
> Is this patch ok to apply along with the previous patch assuming it bootstraps
> and has no regressions with make check?  It builds the programs that had
> failures with the previous patch.
>
> 2013-09-26  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         * config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Don't
>         allow wv/wu constraints to be ALTIVEC_REGISTERS unless DF/SF can
>         occupy the Altivec registers.

Okay.

Can you add a testcase to catch this in the future?

Thanks, David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #3
  2013-09-27  2:15         ` David Edelsohn
@ 2013-09-27  4:18           ` Michael Meissner
  2013-10-01 17:27           ` Michael Meissner
  1 sibling, 0 replies; 20+ messages in thread
From: Michael Meissner @ 2013-09-27  4:18 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Michael Meissner, GCC Patches

On Thu, Sep 26, 2013 at 06:56:37PM -0400, David Edelsohn wrote:
> On Thu, Sep 26, 2013 at 4:51 PM, Michael Meissner
> <meissner@linux.vnet.ibm.com> wrote:
> > I discovered that I was setting the wv/wu constraints incorrectly to
> > ALTIVEC_REGS, which leads to reload failures in some cases.
> >
> > Is this patch ok to apply along with the previous patch assuming it bootstraps
> > and has no regressions with make check?  It builds the programs that had
> > failures with the previous patch.
> >
> > 2013-09-26  Michael Meissner  <meissner@linux.vnet.ibm.com>
> >
> >         * config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Don't
> >         allow wv/wu constraints to be ALTIVEC_REGISTERS unless DF/SF can
> >         occupy the Altivec registers.
> 
> Okay.
> 
> Can you add a testcase to catch this in the future?

You only see it in big programs with agressive optimizations.  I did not see it
during the normal testing (bootstrap, etc.).

The failure is reload complaining it can't find an Altivec register to spill if
the move pattern has an option for only Altivec registers.  It isn't like bad
code is silently generated.  I will check the 5 spec benchmarks that failed
with to see if I can extract one module that shows it off.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #3
  2013-09-27  2:15         ` David Edelsohn
  2013-09-27  4:18           ` Michael Meissner
@ 2013-10-01 17:27           ` Michael Meissner
  1 sibling, 0 replies; 20+ messages in thread
From: Michael Meissner @ 2013-10-01 17:27 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Michael Meissner, GCC Patches

On Thu, Sep 26, 2013 at 06:56:37PM -0400, David Edelsohn wrote:
> On Thu, Sep 26, 2013 at 4:51 PM, Michael Meissner
> <meissner@linux.vnet.ibm.com> wrote:
> > I discovered that I was setting the wv/wu constraints incorrectly to
> > ALTIVEC_REGS, which leads to reload failures in some cases.
> >
> > Is this patch ok to apply along with the previous patch assuming it bootstraps
> > and has no regressions with make check?  It builds the programs that had
> > failures with the previous patch.
> >
> > 2013-09-26  Michael Meissner  <meissner@linux.vnet.ibm.com>
> >
> >         * config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Don't
> >         allow wv/wu constraints to be ALTIVEC_REGISTERS unless DF/SF can
> >         occupy the Altivec registers.
> 
> Okay.
> 
> Can you add a testcase to catch this in the future?

I looked into this, and of the 5 spec benchmarks that did not compile without
the fix, 4 are in fortran, and the 5th in C++ did not show up the error if I
deleted any code, so there is no simple test case I know of.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #4
  2013-09-26 21:37       ` Michael Meissner
  2013-09-26 21:50         ` Michael Meissner
  2013-09-27  2:15         ` David Edelsohn
@ 2013-10-01 17:52         ` Michael Meissner
  2013-10-01 23:15           ` David Edelsohn
  2 siblings, 1 reply; 20+ messages in thread
From: Michael Meissner @ 2013-10-01 17:52 UTC (permalink / raw)
  To: Michael Meissner, David Edelsohn, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 7965 bytes --]

This patch moves most of the VSX DFmode operations from vsx.md to rs6000.md to
use the traditional floating point instructions (f*) instead of the VSX scalar
instructions (xs*) if all of the registers come from the traditional floating
point register set.  The add, subtract, multiply, divide, reciprocal estimate,
square root, absolute value, negate, round functions, and multiply/add
instructions were changed.  Some of the converts have not been changed with
these patches.  If the -mupper-regs-df switch is used, it will attempt to use
the upper registers (those that overlay on the traditional Altivec register
set).

This patch also combines the scalar SFmode/DFmode support on non-SPE systems.
It adds in ISA 2.07 (power8) single precision floating point support if the
-mupper-regs-sf switch is used.

At present, neither -mupper-regs-df nor -mupper-regs-sf is usable if reload has
to do anything.  A future patch will address this.

I did need to adjust a few tests that were specifically testing VSX scalar code
generation.  In addition, I put in a simple test to make sure the initial
-mupper-regs-df and -mupper-regs-sf works correctly.

I tested this an except for power7, power8 I could not find any changes in code
generated for power4, power5, power6, power6x, G4, G5, cell, e5500, e6500,
xilinx (sp_full, sp_lite, dp_full, dp_lite, none), 8548/8540 (spe), 750cl
(paired floating point).

For VSX systems there is code generation differences:

    1)	The traditional fp instruction is generated instead of VSX;

    2)	Because of #1, the code generator favors the 4 operand of multiply/add
	instructions, where the target register does not overlap with any of
	the inputs over the VSX version that that requires overlap.

    3)	A few of the vectorized tests on power8 now generate more direct move
	instructions, instead of moving values through the stack than
	previously.  These tests are integer tests, where you are doing an
	operation between an integer vector and a scalar value.  Previously in
	some cases, the register allocator would store the value from a GPR and
	reload it to the vector registers.

    4)	There is a slight scheduling difference in doing long double abs, that
	causes a different register to be used.  The code for long double abs
	needs to be improved in any case (the early splitting is causing spills
	to the stack).

I had no differences in doing bootstrap and make check (with the testsuite
fixes applied).

In addition, I am running Spec 2006 floating point tests on a power7 box to
compare the effects of going back to the traditional floating point tests.  For
most tests, there is less than 2% difference between the runs.  One benchmark
(482.sphinx3) is slightly faster with these changes, and it is dominated by
floating point multiply/add operations.

Can I apply these patches?

[gcc]
2013-09-30  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000-builtin.def (XSRDPIM): Use floatdf2,
	ceildf2, btruncdf2, instead of vsx_* name.

	* config/rs6000/vsx.md (vsx_add<mode>3): Change arithmetic
	iterators to only do V2DF and V4SF here.  Move the DF code to
	rs6000.md where it is combined with SF mode.  Replace <VSv> with
	just 'v' since only vector operations are handled with these insns
	after moving the DF support to rs6000.md.
	(vsx_sub<mode>3): Likewise.
	(vsx_mul<mode>3): Likewise.
	(vsx_div<mode>3): Likewise.
	(vsx_fre<mode>2): Likewise.
	(vsx_neg<mode>2): Likewise.
	(vsx_abs<mode>2): Likewise.
	(vsx_nabs<mode>2): Likewise.
	(vsx_smax<mode>3): Likewise.
	(vsx_smin<mode>3): Likewise.
	(vsx_sqrt<mode>2): Likewise.
	(vsx_rsqrte<mode>2): Likewise.
	(vsx_fms<mode>4): Likewise.
	(vsx_nfma<mode>4): Likewise.
	(vsx_copysign<mode>3): Likewise.
	(vsx_btrunc<mode>2): Likewise.
	(vsx_floor<mode>2): Likewise.
	(vsx_ceil<mode>2): Likewise.
	(vsx_smaxsf3): Delete scalar ops that were moved to rs6000.md.
	(vsx_sminsf3): Likewise.
	(vsx_fmadf4): Likewise.
	(vsx_fmsdf4): Likewise.
	(vsx_nfmadf4): Likewise.
	(vsx_nfmsdf4): Likewise.
	(vsx_cmpdf_internal1): Likewise.

	* config/rs6000/rs6000.h (TARGET_SF_SPE): Define macros to make it
	simpler to select whether a target has SPE or traditional floating
	point support in iterators.
	(TARGET_DF_SPE): Likewise.
	(TARGET_SF_FPR): Likewise.
	(TARGET_DF_FPR): Likewise.
	(TARGET_SF_INSN): Macros to say whether floating point support
	exists for a given operation for expanders.
	(TARGET_DF_INSN): Likewise.

	* config/rs6000/rs6000.c (Ftrad): New mode attributes to allow
	combining of SF/DF mode operations, using both traditional and VSX
	registers.
	(Fvsx): Likewise.
	(Ff): Likewise.
	(Fv): Likewise.
	(Fs): Likewise.
	(Ffre): Likewise.
	(FFRE): Likewise.
	(abs<mode>2): Combine SF/DF modes using traditional floating point
	instructions.  Add support for using the upper DF registers with
	VSX support, and SF registers with power8-vector support.  Update
	expanders for operations supported by both the SPE and traditional
	floating point units.
	(abs<mode>2_fpr): Likewise.
	(nabs<mode>2): Likewise.
	(nabs<mode>2_fpr): Likewise.
	(neg<mode>2): Likewise.
	(neg<mode>2_fpr): Likewise.
	(add<mode>3): Likewise.
	(add<mode>3_fpr): Likewise.
	(sub<mode>3): Likewise.
	(sub<mode>3_fpr): Likewise.
	(mul<mode>3): Likewise.
	(mul<mode>3_fpr): Likewise.
	(div<mode>3): Likewise.
	(div<mode>3_fpr): Likewise.
	(sqrt<mode>3): Likewise.
	(sqrt<mode>3_fpr): Likewise.
	(fre<Fs>): Likewise.
	(rsqrt<mode>2): Likewise.
	(cmp<mode>_fpr): Likewise.
	(smax<mode>3): Likewise.
	(smin<mode>3): Likewise.
	(smax<mode>3_vsx): Likewise.
	(smin<mode>3_vsx): Likewise.
	(negsf2): Delete SF operations that are merged with DF.
	(abssf2): Likewise.
	(addsf3): Likewise.
	(subsf3): Likewise.
	(mulsf3): Likewise.
	(divsf3): Likewise.
	(fres): Likewise.
	(fmasf4_fpr): Likewise.
	(fmssf4_fpr): Likewise.
	(nfmasf4_fpr): Likewise.
	(nfmssf4_fpr): Likewise.
	(sqrtsf2): Likewise.
	(rsqrtsf_internal1): Likewise.
	(smaxsf3): Likewise.
	(sminsf3): Likewise.
	(cmpsf_internal1): Likewise.
	(copysign<mode>3_fcpsgn): Add VSX/power8-vector support.
	(negdf2): Delete DF operations that are merged with SF.
	(absdf2): Likewise.
	(nabsdf2): Likewise.
	(adddf3): Likewise.
	(subdf3): Likewise.
	(muldf3): Likewise.
	(divdf3): Likewise.
	(fred): Likewise.
	(rsqrtdf_internal1): Likewise.
	(fmadf4_fpr): Likewise.
	(fmsdf4_fpr): Likewise.
	(nfmadf4_fpr): Likewise.
	(nfmsdf4_fpr): Likewise.
	(sqrtdf2): Likewise.
	(smaxdf3): Likewise.
	(smindf3): Likewise.
	(cmpdf_internal1): Likewise.
	(lrint<mode>di2): Use TARGET_<MODE>_FPR macro.
	(btrunc<mode>2): Delete separate expander, and combine with the
	insn and add VSX instruction support.  Use TARGET_<MODE>_FPR.
	(btrunc<mode>2_fpr): Likewise.
	(ceil<mode>2): Likewise.
	(ceil<mode>2_fpr): Likewise.
	(floor<mode>2): Likewise.
	(floor<mode>2_fpr): Likewise.
	(fma<mode>4_fpr): Combine SF and DF fused multiply/add support.
	Add support for using the upper registers with VSX and
	power8-vector.  Move insns to be closer to the define_expands. On
	VSX systems, prefer the traditional form of FMA over the VSX
	version, since the traditional form allows the target not to
	overlap with the inputs.
	(fms<mode>4_fpr): Likewise.
	(nfma<mode>4_fpr): Likewise.
	(nfms<mode>4_fpr): Likewise.

[gcc/testsuite]
2013-09-30  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/p8vector-fp.c: New test for floating point
	scalar operations when using -mupper-regs-sf and -mupper-regs-df.
	* gcc.target/powerpc/ppc-target-1.c: Update tests to allow either
	VSX scalar operations or the traditional floating point form of
	the instruction.
	* gcc.target/powerpc/ppc-target-2.c: Likewise.
	* gcc.target/powerpc/recip-3.c: Likewise.
	* gcc.target/powerpc/recip-5.c: Likewise.
	* gcc.target/powerpc/pr72747.c: Likewise.
	* gcc.target/powerpc/vsx-builtin-3.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[-- Attachment #2: gcc-power8.patch062b --]
[-- Type: text/plain, Size: 61927 bytes --]

Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 203034)
+++ gcc/config/rs6000/rs6000-builtin.def	(.../gcc/config/rs6000)	(working copy)
@@ -1209,9 +1209,9 @@ BU_VSX_1 (XVRSPIZ,	      "xvrspiz",	CONS
 
 BU_VSX_1 (XSRDPI,	      "xsrdpi",		CONST,	vsx_xsrdpi)
 BU_VSX_1 (XSRDPIC,	      "xsrdpic",	CONST,	vsx_xsrdpic)
-BU_VSX_1 (XSRDPIM,	      "xsrdpim",	CONST,	vsx_floordf2)
-BU_VSX_1 (XSRDPIP,	      "xsrdpip",	CONST,	vsx_ceildf2)
-BU_VSX_1 (XSRDPIZ,	      "xsrdpiz",	CONST,	vsx_btruncdf2)
+BU_VSX_1 (XSRDPIM,	      "xsrdpim",	CONST,	floordf2)
+BU_VSX_1 (XSRDPIP,	      "xsrdpip",	CONST,	ceildf2)
+BU_VSX_1 (XSRDPIZ,	      "xsrdpiz",	CONST,	btruncdf2)
 
 /* VSX predicate functions.  */
 BU_VSX_P (XVCMPEQSP_P,	      "xvcmpeqsp_p",	CONST,	vector_eq_v4sf_p)
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 203034)
+++ gcc/config/rs6000/vsx.md	(.../gcc/config/rs6000)	(working copy)
@@ -316,40 +316,42 @@ (define_expand "vsx_store_<mode>"
   "")
 
 \f
-;; VSX scalar and vector floating point arithmetic instructions
+;; VSX vector floating point arithmetic instructions.  The VSX scalar
+;; instructions are now combined with the insn for the traditional floating
+;; point unit.
 (define_insn "*vsx_add<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (plus:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-		    (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (plus:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>add<VSs> %x0,%x1,%x2"
+  "xvadd<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_sub<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (minus:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-		     (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (minus:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+		     (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>sub<VSs> %x0,%x1,%x2"
+  "xvsub<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_mul<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (mult:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-		    (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (mult:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>mul<VSs> %x0,%x1,%x2"
-  [(set_attr "type" "<VStype_mul>")
+  "xvmul<VSs> %x0,%x1,%x2"
+  [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_mul>")])
 
 (define_insn "*vsx_div<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (div:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-		   (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (div:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+		   (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>div<VSs> %x0,%x1,%x2"
+  "xvdiv<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_div>")
    (set_attr "fp_type" "<VSfptype_div>")])
 
@@ -392,94 +394,72 @@ (define_insn "*vsx_tdiv<mode>3_internal"
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_fre<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")]
 		      UNSPEC_FRES))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>re<VSs> %x0,%x1"
+  "xvre<VSs> %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_neg<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (neg:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (neg:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>neg<VSs> %x0,%x1"
+  "xvneg<VSs> %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_abs<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (abs:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (abs:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>abs<VSs> %x0,%x1"
+  "xvabs<VSs> %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_nabs<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (neg:VSX_B
-	 (abs:VSX_B
-	  (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa"))))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (neg:VSX_F
+	 (abs:VSX_F
+	  (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa"))))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>nabs<VSs> %x0,%x1"
+  "xvnabs<VSs> %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_smax<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (smax:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-		    (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (smax:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>max<VSs> %x0,%x1,%x2"
+  "xvmax<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_smin<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (smin:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-		    (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (smin:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+		    (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>min<VSs> %x0,%x1,%x2"
+  "xvmin<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
-;; Special VSX version of smin/smax for single precision floating point.  Since
-;; both numbers are rounded to single precision, we can just use the DP version
-;; of the instruction.
-
-(define_insn "*vsx_smaxsf3"
-  [(set (match_operand:SF 0 "vsx_register_operand" "=f")
-        (smax:SF (match_operand:SF 1 "vsx_register_operand" "f")
-		 (match_operand:SF 2 "vsx_register_operand" "f")))]
-  "VECTOR_UNIT_VSX_P (DFmode)"
-  "xsmaxdp %x0,%x1,%x2"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_addsub_d")])
-
-(define_insn "*vsx_sminsf3"
-  [(set (match_operand:SF 0 "vsx_register_operand" "=f")
-        (smin:SF (match_operand:SF 1 "vsx_register_operand" "f")
-		 (match_operand:SF 2 "vsx_register_operand" "f")))]
-  "VECTOR_UNIT_VSX_P (DFmode)"
-  "xsmindp %x0,%x1,%x2"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_addsub_d")])
-
 (define_insn "*vsx_sqrt<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-        (sqrt:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+        (sqrt:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>sqrt<VSs> %x0,%x1"
+  "xvsqrt<VSs> %x0,%x1"
   [(set_attr "type" "<VStype_sqrt>")
    (set_attr "fp_type" "<VSfptype_sqrt>")])
 
 (define_insn "*vsx_rsqrte<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")]
 		      UNSPEC_RSQRT))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>rsqrte<VSs> %x0,%x1"
+  "xvrsqrte<VSs> %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
@@ -518,26 +498,10 @@ (define_insn "*vsx_tsqrt<mode>2_internal
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
-;; Fused vector multiply/add instructions Support the classical DF versions of
-;; fma, which allows the target to be a separate register from the 3 inputs.
-;; Under VSX, the target must be either the addend or the first multiply.
-;; Where we can, also do the same for the Altivec V4SF fmas.
-
-(define_insn "*vsx_fmadf4"
-  [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws,?wa,?wa,d")
-	(fma:DF
-	  (match_operand:DF 1 "vsx_register_operand" "%ws,ws,wa,wa,d")
-	  (match_operand:DF 2 "vsx_register_operand" "ws,0,wa,0,d")
-	  (match_operand:DF 3 "vsx_register_operand" "0,ws,0,wa,d")))]
-  "VECTOR_UNIT_VSX_P (DFmode)"
-  "@
-   xsmaddadp %x0,%x1,%x2
-   xsmaddmdp %x0,%x1,%x3
-   xsmaddadp %x0,%x1,%x2
-   xsmaddmdp %x0,%x1,%x3
-   fmadd %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
+;; Fused vector multiply/add instructions. Support the classical Altivec
+;; versions of fma, which allows the target to be a separate register from the
+;; 3 inputs.  Under VSX, the target must be either the addend or the first
+;; multiply.
 
 (define_insn "*vsx_fmav4sf4"
   [(set (match_operand:V4SF 0 "vsx_register_operand" "=ws,ws,?wa,?wa,v")
@@ -568,23 +532,6 @@ (define_insn "*vsx_fmav2df4"
    xvmaddmdp %x0,%x1,%x3"
   [(set_attr "type" "vecdouble")])
 
-(define_insn "*vsx_fmsdf4"
-  [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws,?wa,?wa,d")
-	(fma:DF
-	  (match_operand:DF 1 "vsx_register_operand" "%ws,ws,wa,wa,d")
-	  (match_operand:DF 2 "vsx_register_operand" "ws,0,wa,0,d")
-	  (neg:DF
-	    (match_operand:DF 3 "vsx_register_operand" "0,ws,0,wa,d"))))]
-  "VECTOR_UNIT_VSX_P (DFmode)"
-  "@
-   xsmsubadp %x0,%x1,%x2
-   xsmsubmdp %x0,%x1,%x3
-   xsmsubadp %x0,%x1,%x2
-   xsmsubmdp %x0,%x1,%x3
-   fmsub %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
 (define_insn "*vsx_fms<mode>4"
   [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa")
 	(fma:VSX_F
@@ -594,29 +541,12 @@ (define_insn "*vsx_fms<mode>4"
 	    (match_operand:VSX_F 3 "vsx_register_operand" "0,<VSr>,0,wa"))))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "@
-   x<VSv>msuba<VSs> %x0,%x1,%x2
-   x<VSv>msubm<VSs> %x0,%x1,%x3
-   x<VSv>msuba<VSs> %x0,%x1,%x2
-   x<VSv>msubm<VSs> %x0,%x1,%x3"
+   xvmsuba<VSs> %x0,%x1,%x2
+   xvmsubm<VSs> %x0,%x1,%x3
+   xvmsuba<VSs> %x0,%x1,%x2
+   xvmsubm<VSs> %x0,%x1,%x3"
   [(set_attr "type" "<VStype_mul>")])
 
-(define_insn "*vsx_nfmadf4"
-  [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws,?wa,?wa,d")
-	(neg:DF
-	 (fma:DF
-	  (match_operand:DF 1 "vsx_register_operand" "ws,ws,wa,wa,d")
-	  (match_operand:DF 2 "vsx_register_operand" "ws,0,wa,0,d")
-	  (match_operand:DF 3 "vsx_register_operand" "0,ws,0,wa,d"))))]
-  "VECTOR_UNIT_VSX_P (DFmode)"
-  "@
-   xsnmaddadp %x0,%x1,%x2
-   xsnmaddmdp %x0,%x1,%x3
-   xsnmaddadp %x0,%x1,%x2
-   xsnmaddmdp %x0,%x1,%x3
-   fnmadd %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
 (define_insn "*vsx_nfma<mode>4"
   [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>,?wa,?wa")
 	(neg:VSX_F
@@ -626,31 +556,13 @@ (define_insn "*vsx_nfma<mode>4"
 	  (match_operand:VSX_F 3 "vsx_register_operand" "0,<VSr>,0,wa"))))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "@
-   x<VSv>nmadda<VSs> %x0,%x1,%x2
-   x<VSv>nmaddm<VSs> %x0,%x1,%x3
-   x<VSv>nmadda<VSs> %x0,%x1,%x2
-   x<VSv>nmaddm<VSs> %x0,%x1,%x3"
+   xvnmadda<VSs> %x0,%x1,%x2
+   xvnmaddm<VSs> %x0,%x1,%x3
+   xvnmadda<VSs> %x0,%x1,%x2
+   xvnmaddm<VSs> %x0,%x1,%x3"
   [(set_attr "type" "<VStype_mul>")
    (set_attr "fp_type" "<VSfptype_mul>")])
 
-(define_insn "*vsx_nfmsdf4"
-  [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws,?wa,?wa,d")
-	(neg:DF
-	 (fma:DF
-	   (match_operand:DF 1 "vsx_register_operand" "%ws,ws,wa,wa,d")
-	   (match_operand:DF 2 "vsx_register_operand" "ws,0,wa,0,d")
-	   (neg:DF
-	     (match_operand:DF 3 "vsx_register_operand" "0,ws,0,wa,d")))))]
-  "VECTOR_UNIT_VSX_P (DFmode)"
-  "@
-   xsnmsubadp %x0,%x1,%x2
-   xsnmsubmdp %x0,%x1,%x3
-   xsnmsubadp %x0,%x1,%x2
-   xsnmsubmdp %x0,%x1,%x3
-   fnmsub %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
 (define_insn "*vsx_nfmsv4sf4"
   [(set (match_operand:V4SF 0 "vsx_register_operand" "=wf,wf,?wa,?wa,v")
 	(neg:V4SF
@@ -712,16 +624,6 @@ (define_insn "*vsx_ge<mode>"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
-;; Floating point scalar compare
-(define_insn "*vsx_cmpdf_internal1"
-  [(set (match_operand:CCFP 0 "cc_reg_operand" "=y,?y")
-	(compare:CCFP (match_operand:DF 1 "gpc_reg_operand" "ws,wa")
-		      (match_operand:DF 2 "gpc_reg_operand" "ws,wa")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && VECTOR_UNIT_VSX_P (DFmode)"
-  "xscmpudp %0,%x1,%x2"
-  [(set_attr "type" "fpcompare")])
-
 ;; Compare vectors producing a vector result and a predicate, setting CR6 to
 ;; indicate a combined status
 (define_insn "*vsx_eq_<mode>_p"
@@ -788,13 +690,13 @@ (define_insn "*vsx_xxsel<mode>_uns"
 
 ;; Copy sign
 (define_insn "vsx_copysign<mode>3"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_B
-	 [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")
-	  (match_operand:VSX_B 2 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+	(unspec:VSX_F
+	 [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")
+	  (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,wa")]
 	 UNSPEC_COPYSIGN))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>cpsgn<VSs> %x0,%x2,%x1"
+  "xvcpsgn<VSs> %x0,%x2,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
@@ -855,10 +757,10 @@ (define_insn "vsx_x<VSv>r<VSs>ic"
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_btrunc<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(fix:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")))]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+	(fix:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>r<VSs>iz %x0,%x1"
+  "xvr<VSs>iz %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
@@ -872,20 +774,20 @@ (define_insn "*vsx_b2trunc<mode>2"
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_floor<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")]
 		      UNSPEC_FRIM))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>r<VSs>im %x0,%x1"
+  "xvr<VSs>im %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_ceil<mode>2"
-  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=<VSr>,?wa")
-	(unspec:VSX_B [(match_operand:VSX_B 1 "vsx_register_operand" "<VSr>,wa")]
+  [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,?wa")
+	(unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>,wa")]
 		      UNSPEC_FRIP))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "x<VSv>r<VSs>ip %x0,%x1"
+  "xvr<VSs>ip %x0,%x1"
   [(set_attr "type" "<VStype_simple>")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 203034)
+++ gcc/config/rs6000/rs6000.h	(.../gcc/config/rs6000)	(working copy)
@@ -617,6 +617,25 @@ extern int rs6000_vector_align[];
 			  || rs6000_cpu == PROCESSOR_PPC8548)
 
 
+/* Whether SF/DF operations are supported on the E500.  */
+#define TARGET_SF_SPE	(TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT	\
+			 && !TARGET_FPRS)
+
+#define TARGET_DF_SPE	(TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT	\
+			 && !TARGET_FPRS && TARGET_E500_DOUBLE)
+
+/* Whether SF/DF operations are supported by by the normal floating point unit
+   (or the vector/scalar unit).  */
+#define TARGET_SF_FPR	(TARGET_HARD_FLOAT && TARGET_FPRS		\
+			 && TARGET_SINGLE_FLOAT)
+
+#define TARGET_DF_FPR	(TARGET_HARD_FLOAT && TARGET_FPRS		\
+			 && TARGET_DOUBLE_FLOAT)
+
+/* Whether SF/DF operations are supported by any hardware.  */
+#define TARGET_SF_INSN	(TARGET_SF_FPR || TARGET_SF_SPE)
+#define TARGET_DF_INSN	(TARGET_DF_FPR || TARGET_DF_SPE)
+
 /* Which machine supports the various reciprocal estimate instructions.  */
 #define TARGET_FRES	(TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT \
 			 && TARGET_FPRS && TARGET_SINGLE_FLOAT)
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 203034)
+++ gcc/config/rs6000/rs6000.md	(.../gcc/config/rs6000)	(working copy)
@@ -337,6 +337,25 @@ (define_mode_iterator RECIPF [SF DF V4SF
 ; Iterator for just SF/DF
 (define_mode_iterator SFDF [SF DF])
 
+; SF/DF suffix for traditional floating instructions
+(define_mode_attr Ftrad		[(SF "s") (DF "")])
+
+; SF/DF suffix for VSX instructions
+(define_mode_attr Fvsx		[(SF "sp") (DF	"dp")])
+
+; SF/DF constraint for arithmetic on traditional floating point registers
+(define_mode_attr Ff		[(SF "f") (DF "d")])
+
+; SF/DF constraint for arithmetic on VSX registers
+(define_mode_attr Fv		[(SF "wy") (DF "ws")])
+
+; s/d suffix for things like fp_addsub_s/fp_addsub_d
+(define_mode_attr Fs		[(SF "s")  (DF "d")])
+
+; FRE/FRES support
+(define_mode_attr Ffre		[(SF "fres") (DF "fre")])
+(define_mode_attr FFRE		[(SF "FRES") (DF "FRE")])
+
 ; Conditional returns.
 (define_code_iterator any_return [return simple_return])
 (define_code_attr return_pred [(return "direct_return ()")
@@ -5045,22 +5064,172 @@ (define_split
 		    (const_int 0)))]
   "")
 
-;; Floating-point insns, excluding normal data motion.
-;;
-;; PowerPC has a full set of single-precision floating point instructions.
-;;
-;; For the POWER architecture, we pretend that we have both SFmode and
-;; DFmode insns, while, in fact, all fp insns are actually done in double.
-;; The only conversions we will do will be when storing to memory.  In that
-;; case, we will use the "frsp" instruction before storing.
-;;
-;; Note that when we store into a single-precision memory location, we need to
-;; use the frsp insn first.  If the register being stored isn't dead, we
-;; need a scratch register for the frsp.  But this is difficult when the store
-;; is done by reload.  It is not incorrect to do the frsp on the register in
-;; this case, we just lose precision that we would have otherwise gotten but
-;; is not guaranteed.  Perhaps this should be tightened up at some point.
+\f
+;; Floating-point insns, excluding normal data motion.  We combine the SF/DF
+;; modes here, and also add in conditional vsx/power8-vector support to access
+;; values in the traditional Altivec registers if the appropriate
+;; -mupper-regs-{df,sf} option is enabled.
+
+(define_expand "abs<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(abs:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")))]
+  "TARGET_<MODE>_INSN"
+  "")
+
+(define_insn "*abs<mode>2_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(abs:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fabs %0,%1
+   xsabsdp %x0,%x1"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_addsub_<Fs>")])
+
+(define_insn "*nabs<mode>2_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(neg:SFDF
+	 (abs:SFDF
+	  (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>"))))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fnabs %0,%1
+   xsnabsdp %x0,%x1"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_addsub_<Fs>")])
+
+(define_expand "neg<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(neg:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")))]
+  "TARGET_<MODE>_INSN"
+  "")
+
+(define_insn "*neg<mode>2_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(neg:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fneg %0,%1
+   xsnegdp %x0,%x1"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_addsub_<Fs>")])
+
+(define_expand "add<mode>3"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(plus:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")
+		   (match_operand:SFDF 2 "gpc_reg_operand" "")))]
+  "TARGET_<MODE>_INSN"
+  "")
+
+(define_insn "*add<mode>3_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(plus:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,<Fv>")
+		   (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fadd<Ftrad> %0,%1,%2
+   xsadd<Fvsx> %x0,%x1,%x2"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_addsub_<Fs>")])
+
+(define_expand "sub<mode>3"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(minus:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")
+		    (match_operand:SFDF 2 "gpc_reg_operand" "")))]
+  "TARGET_<MODE>_INSN"
+  "")
 
+(define_insn "*sub<mode>3_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(minus:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")
+		    (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fsub<Ftrad> %0,%1,%2
+   xssub<Fvsx> %x0,%x1,%x2"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_addsub_<Fs>")])
+
+(define_expand "mul<mode>3"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(mult:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")
+		   (match_operand:SFDF 2 "gpc_reg_operand" "")))]
+  "TARGET_<MODE>_INSN"
+  "")
+
+(define_insn "*mul<mode>3_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(mult:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,<Fv>")
+		   (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fmul<Ftrad> %0,%1,%2
+   xsmul<Fvsx> %x0,%x1,%x2"
+  [(set_attr "type" "dmul")
+   (set_attr "fp_type" "fp_mul_<Fs>")])
+
+(define_expand "div<mode>3"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(div:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")
+		  (match_operand:SFDF 2 "gpc_reg_operand" "")))]
+  "TARGET_<MODE>_INSN && !TARGET_SIMPLE_FPU"
+  "")
+
+(define_insn "*div<mode>3_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(div:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")
+		  (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR && !TARGET_SIMPLE_FPU"
+  "@
+   fdiv<Ftrad> %0,%1,%2
+   xsdiv<Fvsx> %x0,%x1,%x2"
+  [(set_attr "type" "<Fs>div")
+   (set_attr "fp_type" "fp_div_<Fs>")])
+
+(define_insn "sqrt<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(sqrt:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR && !TARGET_SIMPLE_FPU
+   && (TARGET_PPC_GPOPT || (<MODE>mode == SFmode && TARGET_XILINX_FPU))"
+  "@
+   fsqrt<Ftrad> %0,%1
+   xssqrt<Fvsx> %x0,%x1"
+  [(set_attr "type" "<Fs>sqrt")
+   (set_attr "fp_type" "fp_sqrt_<Fs>")])
+
+;; Floating point reciprocal approximation
+(define_insn "fre<Fs>"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")]
+		     UNSPEC_FRES))]
+  "TARGET_<FFRE>"
+  "@
+   fre<Ftrad> %0,%1
+   xsre<Fvsx> %x0,%x1"
+  [(set_attr "type" "fp")])
+
+(define_insn "*rsqrt<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")]
+		     UNSPEC_RSQRT))]
+  "RS6000_RECIP_HAVE_RSQRTE_P (<MODE>mode)"
+  "@
+   frsqrte<Ftrad> %0,%1
+   xsrsqrte<Fvsx> %x0,%x1"
+  [(set_attr "type" "fp")])
+
+;; Floating point comparisons
+(define_insn "*cmp<mode>_fpr"
+  [(set (match_operand:CCFP 0 "cc_reg_operand" "=y,y")
+	(compare:CCFP (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")
+		      (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fcmpu %0,%1,%2
+   xscmpudp %x0,%x1,%x2"
+  [(set_attr "type" "fpcompare")])
+
+;; Floating point conversions
 (define_expand "extendsfdf2"
   [(set (match_operand:DF 0 "gpc_reg_operand" "")
 	(float_extend:DF (match_operand:SF 1 "reg_or_none500mem_operand" "")))]
@@ -5117,175 +5286,6 @@ (define_insn "*truncdfsf2_fpr"
   "frsp %0,%1"
   [(set_attr "type" "fp")])
 
-(define_expand "negsf2"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(neg:SF (match_operand:SF 1 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT"
-  "")
-
-(define_insn "*negsf2"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(neg:SF (match_operand:SF 1 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fneg %0,%1"
-  [(set_attr "type" "fp")])
-
-(define_expand "abssf2"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(abs:SF (match_operand:SF 1 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT"
-  "")
-
-(define_insn "*abssf2"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(abs:SF (match_operand:SF 1 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fabs %0,%1"
-  [(set_attr "type" "fp")])
-
-(define_insn ""
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(neg:SF (abs:SF (match_operand:SF 1 "gpc_reg_operand" "f"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fnabs %0,%1"
-  [(set_attr "type" "fp")])
-
-(define_expand "addsf3"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(plus:SF (match_operand:SF 1 "gpc_reg_operand" "")
-		 (match_operand:SF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT"
-  "")
-
-(define_insn ""
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(plus:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
-		 (match_operand:SF 2 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fadds %0,%1,%2"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_addsub_s")])
-
-(define_expand "subsf3"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(minus:SF (match_operand:SF 1 "gpc_reg_operand" "")
-		  (match_operand:SF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT"
-  "")
-
-(define_insn ""
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(minus:SF (match_operand:SF 1 "gpc_reg_operand" "f")
-		  (match_operand:SF 2 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fsubs %0,%1,%2"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_addsub_s")])
-
-(define_expand "mulsf3"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(mult:SF (match_operand:SF 1 "gpc_reg_operand" "")
-		 (match_operand:SF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT"
-  "")
-
-(define_insn ""
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
-		 (match_operand:SF 2 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fmuls %0,%1,%2"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_mul_s")])
-
-(define_expand "divsf3"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(div:SF (match_operand:SF 1 "gpc_reg_operand" "")
-		(match_operand:SF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && TARGET_SINGLE_FLOAT && !TARGET_SIMPLE_FPU"
-  "")
-
-(define_insn ""
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(div:SF (match_operand:SF 1 "gpc_reg_operand" "f")
-		(match_operand:SF 2 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS
-   && TARGET_SINGLE_FLOAT && !TARGET_SIMPLE_FPU"
-  "fdivs %0,%1,%2"
-  [(set_attr "type" "sdiv")])
-
-(define_insn "fres"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(unspec:SF [(match_operand:SF 1 "gpc_reg_operand" "f")] UNSPEC_FRES))]
-  "TARGET_FRES"
-  "fres %0,%1"
-  [(set_attr "type" "fp")])
-
-; builtin fmaf support
-(define_insn "*fmasf4_fpr"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(fma:SF (match_operand:SF 1 "gpc_reg_operand" "f")
-		(match_operand:SF 2 "gpc_reg_operand" "f")
-		(match_operand:SF 3 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fmadds %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_s")])
-
-(define_insn "*fmssf4_fpr"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(fma:SF (match_operand:SF 1 "gpc_reg_operand" "f")
-		(match_operand:SF 2 "gpc_reg_operand" "f")
-		(neg:SF (match_operand:SF 3 "gpc_reg_operand" "f"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fmsubs %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_s")])
-
-(define_insn "*nfmasf4_fpr"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(neg:SF (fma:SF (match_operand:SF 1 "gpc_reg_operand" "f")
-			(match_operand:SF 2 "gpc_reg_operand" "f")
-			(match_operand:SF 3 "gpc_reg_operand" "f"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fnmadds %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_s")])
-
-(define_insn "*nfmssf4_fpr"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(neg:SF (fma:SF (match_operand:SF 1 "gpc_reg_operand" "f")
-			(match_operand:SF 2 "gpc_reg_operand" "f")
-			(neg:SF (match_operand:SF 3 "gpc_reg_operand" "f")))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fnmsubs %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_s")])
-
-(define_expand "sqrtsf2"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(sqrt:SF (match_operand:SF 1 "gpc_reg_operand" "")))]
-  "(TARGET_PPC_GPOPT || TARGET_XILINX_FPU)
-   && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT
-   && !TARGET_SIMPLE_FPU"
-  "")
-
-(define_insn ""
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(sqrt:SF (match_operand:SF 1 "gpc_reg_operand" "f")))]
-  "(TARGET_PPC_GPOPT || TARGET_XILINX_FPU) && TARGET_HARD_FLOAT
-   && TARGET_FPRS && TARGET_SINGLE_FLOAT && !TARGET_SIMPLE_FPU"
-  "fsqrts %0,%1"
-  [(set_attr "type" "ssqrt")])
-
-(define_insn "*rsqrtsf_internal1"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
-	(unspec:SF [(match_operand:SF 1 "gpc_reg_operand" "f")]
-		   UNSPEC_RSQRT))]
-  "TARGET_FRSQRTES"
-  "frsqrtes %0,%1"
-  [(set_attr "type" "fp")])
-
 ;; This expander is here to avoid FLOAT_WORDS_BIGENDIAN tests in
 ;; builtins.c and optabs.c that are not correct for IBM long double
 ;; when little-endian.
@@ -5353,37 +5353,82 @@ (define_expand "copysign<mode>3"
 ;; Use an unspec rather providing an if-then-else in RTL, to prevent the
 ;; compiler from optimizing -0.0
 (define_insn "copysign<mode>3_fcpsgn"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<rreg2>")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<rreg2>")
-		      (match_operand:SFDF 2 "gpc_reg_operand" "<rreg2>")]
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")
+		      (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")]
 		     UNSPEC_COPYSIGN))]
-  "TARGET_CMPB && !VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "fcpsgn %0,%2,%1"
+  "TARGET_<MODE>_FPR && TARGET_CMPB"
+  "@
+   fcpsgn %0,%2,%1
+   xscpsgn<VSs> %x0,%x2,%x1"
   [(set_attr "type" "fp")])
 
 ;; For MIN, MAX, and conditional move, we use DEFINE_EXPAND's that involve a
 ;; fsel instruction and some auxiliary computations.  Then we just have a
 ;; single DEFINE_INSN for fsel and the define_splits to make them if made by
 ;; combine.
-(define_expand "smaxsf3"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(if_then_else:SF (ge (match_operand:SF 1 "gpc_reg_operand" "")
-			     (match_operand:SF 2 "gpc_reg_operand" ""))
-			 (match_dup 1)
-			 (match_dup 2)))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS 
-   && TARGET_SINGLE_FLOAT && !flag_trapping_math"
-  "{ rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]); DONE;}")
+;; For MIN, MAX on non-VSX machines, and conditional move all of the time, we
+;; use DEFINE_EXPAND's that involve a fsel instruction and some auxiliary
+;; computations.  Then we just have a single DEFINE_INSN for fsel and the
+;; define_splits to make them if made by combine.  On VSX machines we have the
+;; min/max instructions.
+;;
+;; On VSX, we only check for TARGET_VSX instead of checking for a vsx/p8 vector
+;; to allow either DF/SF to use only traditional registers.
 
-(define_expand "sminsf3"
-  [(set (match_operand:SF 0 "gpc_reg_operand" "")
-	(if_then_else:SF (ge (match_operand:SF 1 "gpc_reg_operand" "")
-			     (match_operand:SF 2 "gpc_reg_operand" ""))
-			 (match_dup 2)
-			 (match_dup 1)))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS 
-   && TARGET_SINGLE_FLOAT && !flag_trapping_math"
-  "{ rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]); DONE;}")
+(define_expand "smax<mode>3"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(if_then_else:SFDF (ge (match_operand:SFDF 1 "gpc_reg_operand" "")
+			       (match_operand:SFDF 2 "gpc_reg_operand" ""))
+			   (match_dup 1)
+			   (match_dup 2)))]
+  "TARGET_<MODE>_FPR && TARGET_PPC_GFXOPT && !flag_trapping_math"
+{
+  rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]);
+  DONE;
+})
+
+(define_insn "*smax<mode>3_vsx"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(smax:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,<Fv>")
+		   (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR && TARGET_VSX"
+  "xsmaxdp %x0,%x1,%x2"
+  [(set_attr "type" "fp")])
+
+(define_expand "smin<mode>3"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(if_then_else:SFDF (ge (match_operand:SFDF 1 "gpc_reg_operand" "")
+			       (match_operand:SFDF 2 "gpc_reg_operand" ""))
+			   (match_dup 2)
+			   (match_dup 1)))]
+  "TARGET_<MODE>_FPR && TARGET_PPC_GFXOPT && !flag_trapping_math"
+{
+  rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]);
+  DONE;
+})
+
+(define_insn "*smin<mode>3_vsx"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(smin:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,<Fv>")
+		   (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>")))]
+  "TARGET_<MODE>_FPR && TARGET_VSX"
+  "xsmindp %x0,%x1,%x2"
+  [(set_attr "type" "fp")])
+
+(define_split
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
+	(match_operator:SFDF 3 "min_max_operator"
+	 [(match_operand:SFDF 1 "gpc_reg_operand" "")
+	  (match_operand:SFDF 2 "gpc_reg_operand" "")]))]
+  "TARGET_<MODE>_FPR && TARGET_PPC_GFXOPT && !flag_trapping_math
+   && !TARGET_VSX"
+  [(const_int 0)]
+{
+  rs6000_emit_minmax (operands[0], GET_CODE (operands[3]), operands[1],
+		      operands[2]);
+  DONE;
+})
 
 (define_split
   [(set (match_operand:SF 0 "gpc_reg_operand" "")
@@ -5515,208 +5560,9 @@ (define_insn "*fseldfsf4"
   "fsel %0,%1,%2,%3"
   [(set_attr "type" "fp")])
 
-(define_expand "negdf2"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(neg:DF (match_operand:DF 1 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
-  "")
-
-(define_insn "*negdf2_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(neg:DF (match_operand:DF 1 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fneg %0,%1"
-  [(set_attr "type" "fp")])
-
-(define_expand "absdf2"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(abs:DF (match_operand:DF 1 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
-  "")
-
-(define_insn "*absdf2_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(abs:DF (match_operand:DF 1 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fabs %0,%1"
-  [(set_attr "type" "fp")])
-
-(define_insn "*nabsdf2_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(neg:DF (abs:DF (match_operand:DF 1 "gpc_reg_operand" "d"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fnabs %0,%1"
-  [(set_attr "type" "fp")])
-
-(define_expand "adddf3"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(plus:DF (match_operand:DF 1 "gpc_reg_operand" "")
-		 (match_operand:DF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
-  "")
-
-(define_insn "*adddf3_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(plus:DF (match_operand:DF 1 "gpc_reg_operand" "%d")
-		 (match_operand:DF 2 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fadd %0,%1,%2"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_addsub_d")])
-
-(define_expand "subdf3"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(minus:DF (match_operand:DF 1 "gpc_reg_operand" "")
-		  (match_operand:DF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
-  "")
-
-(define_insn "*subdf3_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(minus:DF (match_operand:DF 1 "gpc_reg_operand" "d")
-		  (match_operand:DF 2 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fsub %0,%1,%2"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_addsub_d")])
-
-(define_expand "muldf3"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(mult:DF (match_operand:DF 1 "gpc_reg_operand" "")
-		 (match_operand:DF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
-  "")
-
-(define_insn "*muldf3_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(mult:DF (match_operand:DF 1 "gpc_reg_operand" "%d")
-		 (match_operand:DF 2 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fmul %0,%1,%2"
-  [(set_attr "type" "dmul")
-   (set_attr "fp_type" "fp_mul_d")])
-
-(define_expand "divdf3"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(div:DF (match_operand:DF 1 "gpc_reg_operand" "")
-		(match_operand:DF 2 "gpc_reg_operand" "")))]
-  "TARGET_HARD_FLOAT
-   && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)
-   && !TARGET_SIMPLE_FPU"
-  "")
-
-(define_insn "*divdf3_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(div:DF (match_operand:DF 1 "gpc_reg_operand" "d")
-		(match_operand:DF 2 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && !TARGET_SIMPLE_FPU
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fdiv %0,%1,%2"
-  [(set_attr "type" "ddiv")])
-
-(define_insn "*fred_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
-	(unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRES))]
-  "TARGET_FRE && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fre %0,%1"
-  [(set_attr "type" "fp")])
-
-(define_insn "*rsqrtdf_internal1"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "d")]
-		   UNSPEC_RSQRT))]
-  "TARGET_FRSQRTE && !VECTOR_UNIT_VSX_P (DFmode)"
-  "frsqrte %0,%1"
-  [(set_attr "type" "fp")])
-
-; builtin fma support
-(define_insn "*fmadf4_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
-	(fma:DF (match_operand:DF 1 "gpc_reg_operand" "f")
-		(match_operand:DF 2 "gpc_reg_operand" "f")
-		(match_operand:DF 3 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && VECTOR_UNIT_NONE_P (DFmode)"
-  "fmadd %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
-(define_insn "*fmsdf4_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
-	(fma:DF (match_operand:DF 1 "gpc_reg_operand" "f")
-		(match_operand:DF 2 "gpc_reg_operand" "f")
-		(neg:DF (match_operand:DF 3 "gpc_reg_operand" "f"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && VECTOR_UNIT_NONE_P (DFmode)"
-  "fmsub %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
-(define_insn "*nfmadf4_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
-	(neg:DF (fma:DF (match_operand:DF 1 "gpc_reg_operand" "f")
-			(match_operand:DF 2 "gpc_reg_operand" "f")
-			(match_operand:DF 3 "gpc_reg_operand" "f"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && VECTOR_UNIT_NONE_P (DFmode)"
-  "fnmadd %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
-(define_insn "*nfmsdf4_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
-	(neg:DF (fma:DF (match_operand:DF 1 "gpc_reg_operand" "f")
-			(match_operand:DF 2 "gpc_reg_operand" "f")
-			(neg:DF (match_operand:DF 3 "gpc_reg_operand" "f")))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && VECTOR_UNIT_NONE_P (DFmode)"
-  "fnmsub %0,%1,%2,%3"
-  [(set_attr "type" "fp")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
-(define_expand "sqrtdf2"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(sqrt:DF (match_operand:DF 1 "gpc_reg_operand" "")))]
-  "TARGET_PPC_GPOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
-  "")
-
-(define_insn "*sqrtdf2_fpr"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-	(sqrt:DF (match_operand:DF 1 "gpc_reg_operand" "d")))]
-  "TARGET_PPC_GPOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fsqrt %0,%1"
-  [(set_attr "type" "dsqrt")])
-
 ;; The conditional move instructions allow us to perform max and min
 ;; operations even when
 
-(define_expand "smaxdf3"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(if_then_else:DF (ge (match_operand:DF 1 "gpc_reg_operand" "")
-			     (match_operand:DF 2 "gpc_reg_operand" ""))
-			 (match_dup 1)
-			 (match_dup 2)))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT 
-   && !flag_trapping_math"
-  "{ rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]); DONE;}")
-
-(define_expand "smindf3"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(if_then_else:DF (ge (match_operand:DF 1 "gpc_reg_operand" "")
-			     (match_operand:DF 2 "gpc_reg_operand" ""))
-			 (match_dup 2)
-			 (match_dup 1)))]
-  "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT 
-   && !flag_trapping_math"
-  "{ rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]); DONE;}")
-
 (define_split
   [(set (match_operand:DF 0 "gpc_reg_operand" "")
 	(match_operator:DF 3 "min_max_operator"
@@ -6400,66 +6246,52 @@ (define_insn "lrint<mode>di2"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=d")
 	(unspec:DI [(match_operand:SFDF 1 "gpc_reg_operand" "<rreg2>")]
 		   UNSPEC_FCTID))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>"
+  "TARGET_<MODE>_FPR && TARGET_FPRND"
   "fctid %0,%1"
   [(set_attr "type" "fp")])
 
-(define_expand "btrunc<mode>2"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "")]
-		     UNSPEC_FRIZ))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>"
-  "")
-
-(define_insn "*btrunc<mode>2_fpr"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<rreg2>")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<rreg2>")]
+(define_insn "btrunc<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")]
 		     UNSPEC_FRIZ))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>
-   && !VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "friz %0,%1"
-  [(set_attr "type" "fp")])
-
-(define_expand "ceil<mode>2"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "")]
-		     UNSPEC_FRIP))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>"
-  "")
+  "TARGET_<MODE>_FPR && TARGET_FPRND"
+  "@
+   friz %0,%1
+   xsrdpiz %x0,%x1"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_addsub_<Fs>")])
 
-(define_insn "*ceil<mode>2_fpr"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<rreg2>")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<rreg2>")]
+(define_insn "ceil<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")]
 		     UNSPEC_FRIP))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>
-   && !VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "frip %0,%1"
-  [(set_attr "type" "fp")])
-
-(define_expand "floor<mode>2"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "")]
-		     UNSPEC_FRIM))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>"
-  "")
+  "TARGET_<MODE>_FPR && TARGET_FPRND"
+  "@
+   frip %0,%1
+   xsrdpip %x0,%x1"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_addsub_<Fs>")])
 
-(define_insn "*floor<mode>2_fpr"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<rreg2>")
-	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<rreg2>")]
+(define_insn "floor<mode>2"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>")
+	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")]
 		     UNSPEC_FRIM))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>
-   && !VECTOR_UNIT_VSX_P (<MODE>mode)"
-  "frim %0,%1"
-  [(set_attr "type" "fp")])
+  "TARGET_<MODE>_FPR && TARGET_FPRND"
+  "@
+   frim %0,%1
+   xsrdpim %x0,%x1"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_addsub_<Fs>")])
 
 ;; No VSX equivalent to frin
 (define_insn "round<mode>2"
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<rreg2>")
 	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<rreg2>")]
 		     UNSPEC_FRIN))]
-  "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && <TARGET_FLOAT>"
+  "TARGET_<MODE>_FPR && TARGET_FPRND"
   "frin %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_addsub_<Fs>")])
 
 ; An UNSPEC is used so we don't have to support SImode in FP registers.
 (define_insn "stfiwx"
@@ -13331,23 +13163,6 @@ (define_split
   [(set (match_dup 3) (compare:CCUNS (match_dup 1) (match_dup 2)))
    (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 4)))])
 
-(define_insn "*cmpsf_internal1"
-  [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
-	(compare:CCFP (match_operand:SF 1 "gpc_reg_operand" "f")
-		      (match_operand:SF 2 "gpc_reg_operand" "f")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT"
-  "fcmpu %0,%1,%2"
-  [(set_attr "type" "fpcompare")])
-
-(define_insn "*cmpdf_internal1"
-  [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
-	(compare:CCFP (match_operand:DF 1 "gpc_reg_operand" "d")
-		      (match_operand:DF 2 "gpc_reg_operand" "d")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
-   && !VECTOR_UNIT_VSX_P (DFmode)"
-  "fcmpu %0,%1,%2"
-  [(set_attr "type" "fpcompare")])
-
 ;; Only need to compare second words if first words equal
 (define_insn "*cmptf_internal1"
   [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
@@ -15642,6 +15457,20 @@ (define_expand "fma<mode>4"
   ""
   "")
 
+(define_insn "*fma<mode>4_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>,<Fv>")
+	(fma:SFDF
+	  (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,<Fv>,<Fv>")
+	  (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>,0")
+	  (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>,0,<Fv>")))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fmadd<Ftrad> %0,%1,%2,%3
+   xsmadda<Fvsx> %x0,%x1,%x2
+   xsmaddm<Fvsx> %x0,%x1,%x3"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_maddsub_<Fs>")])
+
 ; Altivec only has fma and nfms.
 (define_expand "fms<mode>4"
   [(set (match_operand:FMA_F 0 "register_operand" "")
@@ -15652,6 +15481,20 @@ (define_expand "fms<mode>4"
   "!VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
   "")
 
+(define_insn "*fms<mode>4_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>,<Fv>")
+	(fma:SFDF
+	 (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>,<Fv>")
+	 (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>,0")
+	 (neg:SFDF (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>,0,<Fv>"))))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fmsub<Ftrad> %0,%1,%2,%3
+   xsmsuba<Fvsx> %x0,%x1,%x2
+   xsmsubm<Fvsx> %x0,%x1,%x3"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_maddsub_<Fs>")])
+
 ;; If signed zeros are ignored, -(a * b - c) = -a * b + c.
 (define_expand "fnma<mode>4"
   [(set (match_operand:FMA_F 0 "register_operand" "")
@@ -15685,6 +15528,21 @@ (define_expand "nfma<mode>4"
   "!VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
   "")
 
+(define_insn "*nfma<mode>4_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>,<Fv>")
+	(neg:SFDF
+	 (fma:SFDF
+	  (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>,<Fv>")
+	  (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>,0")
+	  (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>,0,<Fv>"))))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fnmadd<Ftrad> %0,%1,%2,%3
+   xsnmadda<Fvsx> %x0,%x1,%x2
+   xsnmaddm<Fvsx> %x0,%x1,%x3"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_maddsub_<Fs>")])
+
 ; Not an official optab name, but used from builtins.
 (define_expand "nfms<mode>4"
   [(set (match_operand:FMA_F 0 "register_operand" "")
@@ -15696,6 +15554,23 @@ (define_expand "nfms<mode>4"
   ""
   "")
 
+(define_insn "*nfmssf4_fpr"
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fv>,<Fv>")
+	(neg:SFDF
+	 (fma:SFDF
+	  (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>,<Fv>")
+	  (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,<Fv>,0")
+	  (neg:SFDF
+	   (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>,0,<Fv>")))))]
+  "TARGET_<MODE>_FPR"
+  "@
+   fnmsub<Ftrad> %0,%1,%2,%3
+   xsnmsuba<Fvsx> %x0,%x1,%x2
+   xsnmsubm<Fvsx> %x0,%x1,%x3"
+  [(set_attr "type" "fp")
+   (set_attr "fp_type" "fp_maddsub_<Fs>")])
+
+\f
 (define_expand "rs6000_get_timebase"
   [(use (match_operand:DI 0 "gpc_reg_operand" ""))]
   ""
Index: gcc/testsuite/gcc.target/powerpc/p8vector-fp.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p8vector-fp.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p8vector-fp.c	(.../gcc/testsuite/gcc.target/powerpc)	(revision 203034)
@@ -0,0 +1,139 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mcpu=power8 -O2 -mupper-regs-df -mupper-regs-sf -fno-math-errno" } */
+
+float abs_sf (float *p)
+{
+  float f = *p;
+  __asm__ ("# reg %x0" : "+v" (f));
+  return __builtin_fabsf (f);
+}
+
+float nabs_sf (float *p)
+{
+  float f = *p;
+  __asm__ ("# reg %x0" : "+v" (f));
+  return - __builtin_fabsf (f);
+}
+
+float neg_sf (float *p)
+{
+  float f = *p;
+  __asm__ ("# reg %x0" : "+v" (f));
+  return - f;
+}
+
+float add_sf (float *p, float *q)
+{
+  float f1 = *p;
+  float f2 = *q;
+  __asm__ ("# reg %x0, %x1" : "+v" (f1), "+v" (f2));
+  return f1 + f2;
+}
+
+float sub_sf (float *p, float *q)
+{
+  float f1 = *p;
+  float f2 = *q;
+  __asm__ ("# reg %x0, %x1" : "+v" (f1), "+v" (f2));
+  return f1 - f2;
+}
+
+float mul_sf (float *p, float *q)
+{
+  float f1 = *p;
+  float f2 = *q;
+  __asm__ ("# reg %x0, %x1" : "+v" (f1), "+v" (f2));
+  return f1 * f2;
+}
+
+float div_sf (float *p, float *q)
+{
+  float f1 = *p;
+  float f2 = *q;
+  __asm__ ("# reg %x0, %x1" : "+v" (f1), "+v" (f2));
+  return f1 / f2;
+}
+
+float sqrt_sf (float *p)
+{
+  float f = *p;
+  __asm__ ("# reg %x0" : "+v" (f));
+  return __builtin_sqrtf (f);
+}
+
+\f
+double abs_df (double *p)
+{
+  double d = *p;
+  __asm__ ("# reg %x0" : "+v" (d));
+  return __builtin_fabs (d);
+}
+
+double nabs_df (double *p)
+{
+  double d = *p;
+  __asm__ ("# reg %x0" : "+v" (d));
+  return - __builtin_fabs (d);
+}
+
+double neg_df (double *p)
+{
+  double d = *p;
+  __asm__ ("# reg %x0" : "+v" (d));
+  return - d;
+}
+
+double add_df (double *p, double *q)
+{
+  double d1 = *p;
+  double d2 = *q;
+  __asm__ ("# reg %x0, %x1" : "+v" (d1), "+v" (d2));
+  return d1 + d2;
+}
+
+double sub_df (double *p, double *q)
+{
+  double d1 = *p;
+  double d2 = *q;
+  __asm__ ("# reg %x0, %x1" : "+v" (d1), "+v" (d2));
+  return d1 - d2;
+}
+
+double mul_df (double *p, double *q)
+{
+  double d1 = *p;
+  double d2 = *q;
+  __asm__ ("# reg %x0, %x1" : "+v" (d1), "+v" (d2));
+  return d1 * d2;
+}
+
+double div_df (double *p, double *q)
+{
+  double d1 = *p;
+  double d2 = *q;
+  __asm__ ("# reg %x0, %x1" : "+v" (d1), "+v" (d2));
+  return d1 / d2;
+}
+
+double sqrt_df (float *p)
+{
+  double d = *p;
+  __asm__ ("# reg %x0" : "+v" (d));
+  return __builtin_sqrt (d);
+}
+
+/* { dg-final { scan-assembler "xsabsdp"  } } */
+/* { dg-final { scan-assembler "xsadddp"  } } */
+/* { dg-final { scan-assembler "xsaddsp"  } } */
+/* { dg-final { scan-assembler "xsdivdp"  } } */
+/* { dg-final { scan-assembler "xsdivsp"  } } */
+/* { dg-final { scan-assembler "xsmuldp"  } } */
+/* { dg-final { scan-assembler "xsmulsp"  } } */
+/* { dg-final { scan-assembler "xsnabsdp" } } */
+/* { dg-final { scan-assembler "xsnegdp"  } } */
+/* { dg-final { scan-assembler "xssqrtdp" } } */
+/* { dg-final { scan-assembler "xssqrtsp" } } */
+/* { dg-final { scan-assembler "xssubdp"  } } */
+/* { dg-final { scan-assembler "xssubsp"  } } */
Index: gcc/testsuite/gcc.target/powerpc/ppc-target-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ppc-target-2.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 203034)
+++ gcc/testsuite/gcc.target/powerpc/ppc-target-2.c	(.../gcc/testsuite/gcc.target/powerpc)	(working copy)
@@ -5,8 +5,7 @@
 /* { dg-final { scan-assembler-times "fabs" 3 } } */
 /* { dg-final { scan-assembler-times "fnabs" 3 } } */
 /* { dg-final { scan-assembler-times "fsel" 3 } } */
-/* { dg-final { scan-assembler-times "fcpsgn" 3 } } */
-/* { dg-final { scan-assembler-times "xscpsgndp" 1 } } */
+/* { dg-final { scan-assembler-times "fcpsgn\|xscpsgndp" 4 } } */
 
 /* fabs/fnabs/fsel */
 double normal1 (double a, double b) { return __builtin_copysign (a, b); }
Index: gcc/testsuite/gcc.target/powerpc/recip-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/recip-3.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 203034)
+++ gcc/testsuite/gcc.target/powerpc/recip-3.c	(.../gcc/testsuite/gcc.target/powerpc)	(working copy)
@@ -1,14 +1,14 @@
 /* { dg-do compile { target { { powerpc*-*-* } && { ! powerpc*-apple-darwin* } } } } */
 /* { dg-require-effective-target powerpc_fprs } */
 /* { dg-options "-O2 -mrecip -ffast-math -mcpu=power7" } */
-/* { dg-final { scan-assembler-times "xsrsqrtedp" 1 } } */
+/* { dg-final { scan-assembler-times "xsrsqrtedp\|frsqrte\ " 1 } } */
 /* { dg-final { scan-assembler-times "xsmsub.dp\|fmsub\ " 1 } } */
-/* { dg-final { scan-assembler-times "xsmuldp" 4 } } */
+/* { dg-final { scan-assembler-times "xsmuldp\|fmul\ " 4 } } */
 /* { dg-final { scan-assembler-times "xsnmsub.dp\|fnmsub\ " 2 } } */
-/* { dg-final { scan-assembler-times "frsqrtes" 1 } } */
-/* { dg-final { scan-assembler-times "fmsubs" 1 } } */
-/* { dg-final { scan-assembler-times "fmuls" 2 } } */
-/* { dg-final { scan-assembler-times "fnmsubs" 1 } } */
+/* { dg-final { scan-assembler-times "xsrsqrtesp\|frsqrtes" 1 } } */
+/* { dg-final { scan-assembler-times "xsmsub.sp\|fmsubs" 1 } } */
+/* { dg-final { scan-assembler-times "xsmulsp\|fmuls" 2 } } */
+/* { dg-final { scan-assembler-times "xsnmsub.sp\|fnmsubs" 1 } } */
 
 double
 rsqrt_d (double a)
Index: gcc/testsuite/gcc.target/powerpc/pr42747.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr42747.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 203034)
+++ gcc/testsuite/gcc.target/powerpc/pr42747.c	(.../gcc/testsuite/gcc.target/powerpc)	(working copy)
@@ -5,4 +5,4 @@
 
 double foo (double x) { return __builtin_sqrt (x); }
 
-/* { dg-final { scan-assembler "xssqrtdp" } } */
+/* { dg-final { scan-assembler "xssqrtdp\|fsqrt" } } */
Index: gcc/testsuite/gcc.target/powerpc/recip-5.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/recip-5.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 203034)
+++ gcc/testsuite/gcc.target/powerpc/recip-5.c	(.../gcc/testsuite/gcc.target/powerpc)	(working copy)
@@ -4,12 +4,12 @@
 /* { dg-options "-O3 -ftree-vectorize -mrecip=all -ffast-math -mcpu=power7 -fno-unroll-loops" } */
 /* { dg-final { scan-assembler-times "xvredp" 4 } } */
 /* { dg-final { scan-assembler-times "xvresp" 5 } } */
-/* { dg-final { scan-assembler-times "xsredp" 2 } } */
-/* { dg-final { scan-assembler-times "fres" 2 } } */
-/* { dg-final { scan-assembler-times "fmuls" 2 } } */
-/* { dg-final { scan-assembler-times "fnmsubs" 2 } } */
-/* { dg-final { scan-assembler-times "xsmuldp" 2 } } */
-/* { dg-final { scan-assembler-times "xsnmsub.dp" 4 } } */
+/* { dg-final { scan-assembler-times "xsredp\|fre\ " 2 } } */
+/* { dg-final { scan-assembler-times "fres\|xsresp" 2 } } */
+/* { dg-final { scan-assembler-times "fmuls\|xsmulsp" 2 } } */
+/* { dg-final { scan-assembler-times "fnmsubs\|xsnmsub.sp" 2 } } */
+/* { dg-final { scan-assembler-times "xsmuldp\|fmul\ " 2 } } */
+/* { dg-final { scan-assembler-times "xsnmsub.dp\|fnmsub\ " 4 } } */
 /* { dg-final { scan-assembler-times "xvmulsp" 7 } } */
 /* { dg-final { scan-assembler-times "xvnmsub.sp" 5 } } */
 /* { dg-final { scan-assembler-times "xvmuldp" 6 } } */
Index: gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 203034)
+++ gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c	(.../gcc/testsuite/gcc.target/powerpc)	(working copy)
@@ -16,9 +16,9 @@
 /* { dg-final { scan-assembler "xvrspiz" } } */
 /* { dg-final { scan-assembler "xsrdpi" } } */
 /* { dg-final { scan-assembler "xsrdpic" } } */
-/* { dg-final { scan-assembler "xsrdpim" } } */
-/* { dg-final { scan-assembler "xsrdpip" } } */
-/* { dg-final { scan-assembler "xsrdpiz" } } */
+/* { dg-final { scan-assembler "xsrdpim\|frim" } } */
+/* { dg-final { scan-assembler "xsrdpip\|frip" } } */
+/* { dg-final { scan-assembler "xsrdpiz\|friz" } } */
 /* { dg-final { scan-assembler "xsmaxdp" } } */
 /* { dg-final { scan-assembler "xsmindp" } } */
 /* { dg-final { scan-assembler "xxland" } } */
Index: gcc/testsuite/gcc.target/powerpc/ppc-target-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ppc-target-1.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 203034)
+++ gcc/testsuite/gcc.target/powerpc/ppc-target-1.c	(.../gcc/testsuite/gcc.target/powerpc)	(working copy)
@@ -5,8 +5,7 @@
 /* { dg-final { scan-assembler-times "fabs" 3 } } */
 /* { dg-final { scan-assembler-times "fnabs" 3 } } */
 /* { dg-final { scan-assembler-times "fsel" 3 } } */
-/* { dg-final { scan-assembler-times "fcpsgn" 3 } } */
-/* { dg-final { scan-assembler-times "xscpsgndp" 1 } } */
+/* { dg-final { scan-assembler-times "fcpsgn\|xscpsgndp" 4 } } */
 
 double normal1 (double, double);
 double power5  (double, double) __attribute__((__target__("cpu=power5")));

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #4
  2013-10-01 17:52         ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #4 Michael Meissner
@ 2013-10-01 23:15           ` David Edelsohn
  2013-10-07 23:05             ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #5 Michael Meissner
  0 siblings, 1 reply; 20+ messages in thread
From: David Edelsohn @ 2013-10-01 23:15 UTC (permalink / raw)
  To: Michael Meissner, GCC Patches

On Tue, Oct 1, 2013 at 1:52 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> This patch moves most of the VSX DFmode operations from vsx.md to rs6000.md to
> use the traditional floating point instructions (f*) instead of the VSX scalar
> instructions (xs*) if all of the registers come from the traditional floating
> point register set.  The add, subtract, multiply, divide, reciprocal estimate,
> square root, absolute value, negate, round functions, and multiply/add
> instructions were changed.  Some of the converts have not been changed with
> these patches.  If the -mupper-regs-df switch is used, it will attempt to use
> the upper registers (those that overlay on the traditional Altivec register
> set).
>
> This patch also combines the scalar SFmode/DFmode support on non-SPE systems.
> It adds in ISA 2.07 (power8) single precision floating point support if the
> -mupper-regs-sf switch is used.
>
> At present, neither -mupper-regs-df nor -mupper-regs-sf is usable if reload has
> to do anything.  A future patch will address this.
>
> I did need to adjust a few tests that were specifically testing VSX scalar code
> generation.  In addition, I put in a simple test to make sure the initial
> -mupper-regs-df and -mupper-regs-sf works correctly.
>
> I tested this an except for power7, power8 I could not find any changes in code
> generated for power4, power5, power6, power6x, G4, G5, cell, e5500, e6500,
> xilinx (sp_full, sp_lite, dp_full, dp_lite, none), 8548/8540 (spe), 750cl
> (paired floating point).
>
> For VSX systems there is code generation differences:
>
>     1)  The traditional fp instruction is generated instead of VSX;
>
>     2)  Because of #1, the code generator favors the 4 operand of multiply/add
>         instructions, where the target register does not overlap with any of
>         the inputs over the VSX version that that requires overlap.
>
>     3)  A few of the vectorized tests on power8 now generate more direct move
>         instructions, instead of moving values through the stack than
>         previously.  These tests are integer tests, where you are doing an
>         operation between an integer vector and a scalar value.  Previously in
>         some cases, the register allocator would store the value from a GPR and
>         reload it to the vector registers.
>
>     4)  There is a slight scheduling difference in doing long double abs, that
>         causes a different register to be used.  The code for long double abs
>         needs to be improved in any case (the early splitting is causing spills
>         to the stack).
>
> I had no differences in doing bootstrap and make check (with the testsuite
> fixes applied).
>
> In addition, I am running Spec 2006 floating point tests on a power7 box to
> compare the effects of going back to the traditional floating point tests.  For
> most tests, there is less than 2% difference between the runs.  One benchmark
> (482.sphinx3) is slightly faster with these changes, and it is dominated by
> floating point multiply/add operations.
>
> Can I apply these patches?
>
> [gcc]
> 2013-09-30  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         * config/rs6000/rs6000-builtin.def (XSRDPIM): Use floatdf2,
>         ceildf2, btruncdf2, instead of vsx_* name.
>
>         * config/rs6000/vsx.md (vsx_add<mode>3): Change arithmetic
>         iterators to only do V2DF and V4SF here.  Move the DF code to
>         rs6000.md where it is combined with SF mode.  Replace <VSv> with
>         just 'v' since only vector operations are handled with these insns
>         after moving the DF support to rs6000.md.
>         (vsx_sub<mode>3): Likewise.
>         (vsx_mul<mode>3): Likewise.
>         (vsx_div<mode>3): Likewise.
>         (vsx_fre<mode>2): Likewise.
>         (vsx_neg<mode>2): Likewise.
>         (vsx_abs<mode>2): Likewise.
>         (vsx_nabs<mode>2): Likewise.
>         (vsx_smax<mode>3): Likewise.
>         (vsx_smin<mode>3): Likewise.
>         (vsx_sqrt<mode>2): Likewise.
>         (vsx_rsqrte<mode>2): Likewise.
>         (vsx_fms<mode>4): Likewise.
>         (vsx_nfma<mode>4): Likewise.
>         (vsx_copysign<mode>3): Likewise.
>         (vsx_btrunc<mode>2): Likewise.
>         (vsx_floor<mode>2): Likewise.
>         (vsx_ceil<mode>2): Likewise.
>         (vsx_smaxsf3): Delete scalar ops that were moved to rs6000.md.
>         (vsx_sminsf3): Likewise.
>         (vsx_fmadf4): Likewise.
>         (vsx_fmsdf4): Likewise.
>         (vsx_nfmadf4): Likewise.
>         (vsx_nfmsdf4): Likewise.
>         (vsx_cmpdf_internal1): Likewise.
>
>         * config/rs6000/rs6000.h (TARGET_SF_SPE): Define macros to make it
>         simpler to select whether a target has SPE or traditional floating
>         point support in iterators.
>         (TARGET_DF_SPE): Likewise.
>         (TARGET_SF_FPR): Likewise.
>         (TARGET_DF_FPR): Likewise.
>         (TARGET_SF_INSN): Macros to say whether floating point support
>         exists for a given operation for expanders.
>         (TARGET_DF_INSN): Likewise.
>
>         * config/rs6000/rs6000.c (Ftrad): New mode attributes to allow
>         combining of SF/DF mode operations, using both traditional and VSX
>         registers.
>         (Fvsx): Likewise.
>         (Ff): Likewise.
>         (Fv): Likewise.
>         (Fs): Likewise.
>         (Ffre): Likewise.
>         (FFRE): Likewise.
>         (abs<mode>2): Combine SF/DF modes using traditional floating point
>         instructions.  Add support for using the upper DF registers with
>         VSX support, and SF registers with power8-vector support.  Update
>         expanders for operations supported by both the SPE and traditional
>         floating point units.
>         (abs<mode>2_fpr): Likewise.
>         (nabs<mode>2): Likewise.
>         (nabs<mode>2_fpr): Likewise.
>         (neg<mode>2): Likewise.
>         (neg<mode>2_fpr): Likewise.
>         (add<mode>3): Likewise.
>         (add<mode>3_fpr): Likewise.
>         (sub<mode>3): Likewise.
>         (sub<mode>3_fpr): Likewise.
>         (mul<mode>3): Likewise.
>         (mul<mode>3_fpr): Likewise.
>         (div<mode>3): Likewise.
>         (div<mode>3_fpr): Likewise.
>         (sqrt<mode>3): Likewise.
>         (sqrt<mode>3_fpr): Likewise.
>         (fre<Fs>): Likewise.
>         (rsqrt<mode>2): Likewise.
>         (cmp<mode>_fpr): Likewise.
>         (smax<mode>3): Likewise.
>         (smin<mode>3): Likewise.
>         (smax<mode>3_vsx): Likewise.
>         (smin<mode>3_vsx): Likewise.
>         (negsf2): Delete SF operations that are merged with DF.
>         (abssf2): Likewise.
>         (addsf3): Likewise.
>         (subsf3): Likewise.
>         (mulsf3): Likewise.
>         (divsf3): Likewise.
>         (fres): Likewise.
>         (fmasf4_fpr): Likewise.
>         (fmssf4_fpr): Likewise.
>         (nfmasf4_fpr): Likewise.
>         (nfmssf4_fpr): Likewise.
>         (sqrtsf2): Likewise.
>         (rsqrtsf_internal1): Likewise.
>         (smaxsf3): Likewise.
>         (sminsf3): Likewise.
>         (cmpsf_internal1): Likewise.
>         (copysign<mode>3_fcpsgn): Add VSX/power8-vector support.
>         (negdf2): Delete DF operations that are merged with SF.
>         (absdf2): Likewise.
>         (nabsdf2): Likewise.
>         (adddf3): Likewise.
>         (subdf3): Likewise.
>         (muldf3): Likewise.
>         (divdf3): Likewise.
>         (fred): Likewise.
>         (rsqrtdf_internal1): Likewise.
>         (fmadf4_fpr): Likewise.
>         (fmsdf4_fpr): Likewise.
>         (nfmadf4_fpr): Likewise.
>         (nfmsdf4_fpr): Likewise.
>         (sqrtdf2): Likewise.
>         (smaxdf3): Likewise.
>         (smindf3): Likewise.
>         (cmpdf_internal1): Likewise.
>         (lrint<mode>di2): Use TARGET_<MODE>_FPR macro.
>         (btrunc<mode>2): Delete separate expander, and combine with the
>         insn and add VSX instruction support.  Use TARGET_<MODE>_FPR.
>         (btrunc<mode>2_fpr): Likewise.
>         (ceil<mode>2): Likewise.
>         (ceil<mode>2_fpr): Likewise.
>         (floor<mode>2): Likewise.
>         (floor<mode>2_fpr): Likewise.
>         (fma<mode>4_fpr): Combine SF and DF fused multiply/add support.
>         Add support for using the upper registers with VSX and
>         power8-vector.  Move insns to be closer to the define_expands. On
>         VSX systems, prefer the traditional form of FMA over the VSX
>         version, since the traditional form allows the target not to
>         overlap with the inputs.
>         (fms<mode>4_fpr): Likewise.
>         (nfma<mode>4_fpr): Likewise.
>         (nfms<mode>4_fpr): Likewise.
>
> [gcc/testsuite]
> 2013-09-30  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         * gcc.target/powerpc/p8vector-fp.c: New test for floating point
>         scalar operations when using -mupper-regs-sf and -mupper-regs-df.
>         * gcc.target/powerpc/ppc-target-1.c: Update tests to allow either
>         VSX scalar operations or the traditional floating point form of
>         the instruction.
>         * gcc.target/powerpc/ppc-target-2.c: Likewise.
>         * gcc.target/powerpc/recip-3.c: Likewise.
>         * gcc.target/powerpc/recip-5.c: Likewise.
>         * gcc.target/powerpc/pr72747.c: Likewise.
>         * gcc.target/powerpc/vsx-builtin-3.c: Likewise.

Okay.  Good cleanups.

Thanks, David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #5
  2013-10-01 23:15           ` David Edelsohn
@ 2013-10-07 23:05             ` Michael Meissner
  2013-10-15  1:14               ` David Edelsohn
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Meissner @ 2013-10-07 23:05 UTC (permalink / raw)
  To: David Edelsohn, Michael Meissner, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 3145 bytes --]

On this patch, I add some infrastructure to say whether a given mode can go in
a general purpose register, floating point register, or altivec register, and
if it can, what kinds of addressing is legal for the mode.  In addition, I have
switched to using memset to setup the arrays indexed by mode.

I've rewritten this infrastructure since I submitted version 1 of the patch.
At present, this is only an infrastructure change, and the compiler generates
the same code underneath before and after the patch.

I switched rs6000_legitimate_address_p so that it uses this new infrastructure
to determine if a mode can do auto update as part of the address.  I found that
the code in rs6000_legitimate_address_p was not obvious what they were trying
to do.  I discovered for some 32-bit abis, the types DImode, DFmode, and
DDmode, would allow addresses with PRE_INC or PRE_DEC, but not PRE_MODIFY.  At
present, I am generating the same code, but in a later patch, we can allow
PRE_MODIFY for those types as well.  I noticed this when I was doing version 1
of the patches, and I was generating more auto update addressing forms than
before.

I have run bootstrap and make check with no regressions with these patches.  I
have also tested code for various 32/64-bit targets (power8, power7, power6x,
power6, power5, power 5+, power4, cell, 476 with/with floating point, g4, g5,
e5500, e6500, and xilinx full/lite/no floating point), and I found the code
generated is identical.

Can I install these patches in GCC 4.9 on the trunk?

2013-10-07  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (enum rs6000_reload_reg_type): Add new
	fields to the reg_addr array that describes the valid addressing
	mode for any register, general purpose registers, floating point
	registers, and Altivec registers.
	(FIRST_RELOAD_REG_CLASS): Likewise.
	(LAST_RELOAD_REG_CLASS): Likewise.
	(struct reload_reg_map_type): Likewise.
	(reload_reg_map_type): Likewise.
	(RELOAD_REG_VALID): Likewise.
	(RELOAD_REG_MULTIPLE): Likewise.
	(RELOAD_REG_INDEXED): Likewise.
	(RELOAD_REG_OFFSET): Likewise.
	(RELOAD_REG_PRE_INCDEC): Likewise.
	(RELOAD_REG_PRE_MODIFY): Likewise.
	(reg_addr): Likewise.
	(mode_supports_pre_incdec_p): New helper functions to say whether
	a given mode supports PRE_INC, PRE_DEC, and PRE_MODIFY.
	(mode_supports_pre_modify_p): Likewise.
	(rs6000_debug_vector_unit): Rearrange the -mdebug=reg output to
	print the valid address mode bits for each mode.
	(rs6000_debug_print_mode): Likewise.
	(rs6000_debug_reg_global): Likewise.
	(rs6000_setup_reg_addr_masks): New function to set up the address
	mask bits for each type.
	(rs6000_init_hard_regno_mode_ok): Use memset to clear arrays.
	Call rs6000_setup_reg_addr_masks to set up the address mask bits.
	(rs6000_legitimate_address_p): Use mode_supports_pre_incdec_p and
	mode_supports_pre_modify_p to determine if PRE_INC, PRE_DEC, and
	PRE_MODIFY are supported.
	(rs6000_print_options_internal): Tweak the debug output slightly.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[-- Attachment #2: gcc-power8.patch064b --]
[-- Type: text/plain, Size: 14407 bytes --]

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 203166)
+++ gcc/config/rs6000/rs6000.c	(.../gcc/config/rs6000)	(working copy)
@@ -313,6 +313,50 @@ static enum rs6000_reg_type reg_class_to
 
 #define IS_FP_VECT_REG_TYPE(RTYPE) IN_RANGE(RTYPE, VSX_REG_TYPE, FPR_REG_TYPE)
 
+
+/* Register classes we care about in secondary reload or go if legitimate
+   address.  We only need to worry about GPR, FPR, and Altivec registers here,
+   along an ANY field that is the OR of the 3 register classes.  */
+
+enum rs6000_reload_reg_type {
+  RELOAD_REG_GPR,			/* General purpose registers.  */
+  RELOAD_REG_FPR,			/* Traditional floating point regs.  */
+  RELOAD_REG_AV,			/* Altivec registers.  */
+  RELOAD_REG_ANY,			/* OR of GPR, FPR, Altivec masks.  */
+  N_RELOAD_REG
+};
+
+/* For setting up register classes, loop through the 3 register classes mapping
+   into real registers, and skip the ANY class, which is just an OR of the
+   bits.  */
+#define FIRST_RELOAD_REG_CLASS	RELOAD_REG_GPR
+#define LAST_RELOAD_REG_CLASS	RELOAD_REG_AV
+
+/* Map reload register type to a register in the register class.  */
+struct reload_reg_map_type {
+  const char *name;			/* Register class name.  */
+  int reg;				/* Register in the register class.  */
+};
+
+static const struct reload_reg_map_type reload_reg_map[N_RELOAD_REG] = {
+  { "Gpr",	FIRST_GPR_REGNO },	/* RELOAD_REG_GPR.  */
+  { "Fpr",	FIRST_FPR_REGNO },	/* RELOAD_REG_FPR.  */
+  { "Av.",	FIRST_ALTIVEC_REGNO },	/* RELOAD_REG_AV.  */
+  { "Any",	-1 },			/* RELOAD_REG_ANY.  */
+};
+
+/* Mask bits for each register class, indexed per mode.  Historically the
+   compiler has been more restrictive which types can do PRE_MODIFY instead of
+   PRE_INC and PRE_DEC, so keep track of sepaate bits for these two.  */
+typedef unsigned char addr_mask_type;
+
+#define RELOAD_REG_VALID	0x01	/* Mode valid in register..  */
+#define RELOAD_REG_MULTIPLE	0x02	/* Mode takes multiple registers.  */
+#define RELOAD_REG_INDEXED	0x04	/* Reg+reg addressing.  */
+#define RELOAD_REG_OFFSET	0x08	/* Reg+offset addressing. */
+#define RELOAD_REG_PRE_INCDEC	0x10	/* PRE_INC/PRE_DEC valid.  */
+#define RELOAD_REG_PRE_MODIFY	0x20	/* PRE_MODIFY valid.  */
+
 /* Register type masks based on the type, of valid addressing modes.  */
 struct rs6000_reg_addr {
   enum insn_code reload_load;		/* INSN to reload for loading. */
@@ -320,10 +364,27 @@ struct rs6000_reg_addr {
   enum insn_code reload_fpr_gpr;	/* INSN to move from FPR to GPR.  */
   enum insn_code reload_gpr_vsx;	/* INSN to move from GPR to VSX.  */
   enum insn_code reload_vsx_gpr;	/* INSN to move from VSX to GPR.  */
+  addr_mask_type addr_mask[(int)N_RELOAD_REG]; /* Valid address masks.  */
 };
 
 static struct rs6000_reg_addr reg_addr[NUM_MACHINE_MODES];
 
+/* Helper function to say whether a mode supports PRE_INC or PRE_DEC.  */
+static inline bool
+mode_supports_pre_incdec_p (enum machine_mode mode)
+{
+  return ((reg_addr[mode].addr_mask[RELOAD_REG_ANY] & RELOAD_REG_PRE_INCDEC)
+	  != 0);
+}
+
+/* Helper function to say whether a mode supports PRE_MODIFY.  */
+static inline bool
+mode_supports_pre_modify_p (enum machine_mode mode)
+{
+  return ((reg_addr[mode].addr_mask[RELOAD_REG_ANY] & RELOAD_REG_PRE_MODIFY)
+	  != 0);
+}
+
 \f
 /* Target cpu costs.  */
 
@@ -1777,6 +1838,63 @@ rs6000_debug_reg_print (int first_regno,
     }
 }
 
+static const char *
+rs6000_debug_vector_unit (enum rs6000_vector v)
+{
+  const char *ret;
+
+  switch (v)
+    {
+    case VECTOR_NONE:	   ret = "none";      break;
+    case VECTOR_ALTIVEC:   ret = "altivec";   break;
+    case VECTOR_VSX:	   ret = "vsx";       break;
+    case VECTOR_P8_VECTOR: ret = "p8_vector"; break;
+    case VECTOR_PAIRED:	   ret = "paired";    break;
+    case VECTOR_SPE:	   ret = "spe";       break;
+    case VECTOR_OTHER:	   ret = "other";     break;
+    default:		   ret = "unknown";   break;
+    }
+
+  return ret;
+}
+
+/* Print the address masks in a human readble fashion.  */
+DEBUG_FUNCTION void
+rs6000_debug_print_mode (ssize_t m)
+{
+  ssize_t rc;
+
+  fprintf (stderr, "Mode: %-5s", GET_MODE_NAME (m));
+  for (rc = 0; rc < N_RELOAD_REG; rc++)
+    {
+      addr_mask_type mask = reg_addr[m].addr_mask[rc];
+      fprintf (stderr,
+	       "  %s: %c%c%c%c%c%c",
+	       reload_reg_map[rc].name,
+	       (mask & RELOAD_REG_VALID)      != 0 ? 'v' : ' ',
+	       (mask & RELOAD_REG_MULTIPLE)   != 0 ? 'm' : ' ',
+	       (mask & RELOAD_REG_INDEXED)    != 0 ? 'i' : ' ',
+	       (mask & RELOAD_REG_OFFSET)     != 0 ? 'o' : ' ',
+	       (mask & RELOAD_REG_PRE_INCDEC) != 0 ? '+' : ' ',
+	       (mask & RELOAD_REG_PRE_MODIFY) != 0 ? '+' : ' ');
+    }
+
+  if (rs6000_vector_unit[m] != VECTOR_NONE
+      || rs6000_vector_mem[m] != VECTOR_NONE
+      || (reg_addr[m].reload_store != CODE_FOR_nothing)
+      || (reg_addr[m].reload_load != CODE_FOR_nothing))
+    {
+      fprintf (stderr,
+	       "  Vector-arith=%-10s Vector-mem=%-10s Reload=%c%c",
+	       rs6000_debug_vector_unit (rs6000_vector_unit[m]),
+	       rs6000_debug_vector_unit (rs6000_vector_mem[m]),
+	       (reg_addr[m].reload_store != CODE_FOR_nothing) ? 's' : '*',
+	       (reg_addr[m].reload_load != CODE_FOR_nothing) ? 'l' : '*');
+    }
+
+  fputs ("\n", stderr);
+}
+
 #define DEBUG_FMT_ID "%-32s= "
 #define DEBUG_FMT_D   DEBUG_FMT_ID "%d\n"
 #define DEBUG_FMT_WX  DEBUG_FMT_ID "%#.12" HOST_WIDE_INT_PRINT "x: "
@@ -1800,17 +1918,6 @@ rs6000_debug_reg_global (void)
   const char *cmodel_str;
   struct cl_target_option cl_opts;
 
-  /* Map enum rs6000_vector to string.  */
-  static const char *rs6000_debug_vector_unit[] = {
-    "none",
-    "altivec",
-    "vsx",
-    "p8_vector",
-    "paired",
-    "spe",
-    "other"
-  };
-
   /* Modes we want tieable information on.  */
   static const enum machine_mode print_tieable_modes[] = {
     QImode,
@@ -1928,24 +2035,11 @@ rs6000_debug_reg_global (void)
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wy]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wz]]);
 
+  nl = "\n";
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
-    if (rs6000_vector_unit[m] || rs6000_vector_mem[m]
-	|| (reg_addr[m].reload_load != CODE_FOR_nothing)
-	|| (reg_addr[m].reload_store != CODE_FOR_nothing))
-      {
-	nl = "\n";
-	fprintf (stderr,
-		 "Vector mode: %-5s arithmetic: %-10s move: %-10s "
-		 "reload-out: %c reload-in: %c\n",
-		 GET_MODE_NAME (m),
-		 rs6000_debug_vector_unit[ rs6000_vector_unit[m] ],
-		 rs6000_debug_vector_unit[ rs6000_vector_mem[m] ],
-		 (reg_addr[m].reload_store != CODE_FOR_nothing) ? 'y' : 'n',
-		 (reg_addr[m].reload_load != CODE_FOR_nothing) ? 'y' : 'n');
-      }
+    rs6000_debug_print_mode (m);
 
-  if (nl)
-    fputs (nl, stderr);
+  fputs ("\n", stderr);
 
   for (m1 = 0; m1 < ARRAY_SIZE (print_tieable_modes); m1++)
     {
@@ -2181,6 +2275,101 @@ rs6000_debug_reg_global (void)
 	   (int)RS6000_BUILTIN_COUNT);
 }
 
+\f
+/* Update the addr mask bits in reg_addr to help secondary reload and go if
+   legitimate address support to figure out the appropriate addressing to
+   use.  */
+
+static void
+rs6000_setup_reg_addr_masks (void)
+{
+  ssize_t rc, reg, m, nregs;
+  addr_mask_type any_addr_mask, addr_mask;
+
+  for (m = 0; m < NUM_MACHINE_MODES; ++m)
+    {
+      /* SDmode is special in that we want to access it only via REG+REG
+	 addressing on power7 and above, since we want to use the LFIWZX and
+	 STFIWZX instructions to load it.  */
+      bool indexed_only_p = (m == SDmode && TARGET_NO_SDMODE_STACK);
+
+      any_addr_mask = 0;
+      for (rc = FIRST_RELOAD_REG_CLASS; rc <= LAST_RELOAD_REG_CLASS; rc++)
+	{
+	  addr_mask = 0;
+	  reg = reload_reg_map[rc].reg;
+
+	  /* Can mode values go in the GPR/FPR/Altivec registers?  */
+	  if (reg >= 0 && rs6000_hard_regno_mode_ok_p[m][reg])
+	    {
+	      nregs = rs6000_hard_regno_nregs[m][reg];
+	      addr_mask |= RELOAD_REG_VALID;
+
+	      /* Indicate if the mode takes more than 1 physical register.  If
+		 it takes a single register, indicate it can do REG+REG
+		 addressing.  */
+	      if (nregs > 1 || m == BLKmode)
+		addr_mask |= RELOAD_REG_MULTIPLE;
+	      else
+		addr_mask |= RELOAD_REG_INDEXED;
+
+	      /* Figure out if we can do PRE_INC, PRE_DEC, or PRE_MODIFY
+		 addressing.  Restrict addressing on SPE for 64-bit types
+		 because of the SUBREG hackery used to address 64-bit floats in
+		 '32-bit' GPRs.  To simplify secondary reload, don't allow
+		 update forms on scalar floating point types that can go in the
+		 upper registers.  */
+
+	      if (TARGET_UPDATE
+		  && (rc == RELOAD_REG_GPR || rc == RELOAD_REG_FPR)
+		  && GET_MODE_SIZE (m) <= 8
+		  && !VECTOR_MODE_P (m)
+		  && !COMPLEX_MODE_P (m)
+		  && !indexed_only_p
+		  && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m) == 8)
+		  && !(m == DFmode && TARGET_UPPER_REGS_DF)
+		  && !(m == SFmode && TARGET_UPPER_REGS_SF))
+		{
+		  addr_mask |= RELOAD_REG_PRE_INCDEC;
+
+		  /* PRE_MODIFY is more restricted than PRE_INC/PRE_DEC in that
+		     we don't allow PRE_MODIFY for some multi-register
+		     operations.  */
+		  switch (m)
+		    {
+		    default:
+		      addr_mask |= RELOAD_REG_PRE_MODIFY;
+		      break;
+
+		    case DImode:
+		      if (TARGET_POWERPC64)
+			addr_mask |= RELOAD_REG_PRE_MODIFY;
+		      break;
+
+		    case DFmode:
+		    case DDmode:
+		      if (TARGET_DF_INSN)
+			addr_mask |= RELOAD_REG_PRE_MODIFY;
+		      break;
+		    }
+		}
+	    }
+
+	  /* GPR and FPR registers can do REG+OFFSET addressing, except
+	     possibly for SDmode.  */
+	  if ((addr_mask != 0) && !indexed_only_p
+	      && (rc == RELOAD_REG_GPR || rc == RELOAD_REG_FPR))
+	    addr_mask |= RELOAD_REG_OFFSET;
+
+	  reg_addr[m].addr_mask[rc] = addr_mask;
+	  any_addr_mask |= addr_mask;
+	}
+
+      reg_addr[m].addr_mask[RELOAD_REG_ANY] = any_addr_mask;
+    }
+}
+
+\f
 /* Initialize the various global tables that are based on register size.  */
 static void
 rs6000_init_hard_regno_mode_ok (bool global_init_p)
@@ -2253,18 +2442,15 @@ rs6000_init_hard_regno_mode_ok (bool glo
   /* Precalculate the valid memory formats as well as the vector information,
      this must be set up before the rs6000_hard_regno_nregs_internal calls
      below.  */
-  for (m = 0; m < NUM_MACHINE_MODES; ++m)
-    {
-      rs6000_vector_unit[m] = rs6000_vector_mem[m] = VECTOR_NONE;
-      reg_addr[m].reload_load = CODE_FOR_nothing;
-      reg_addr[m].reload_store = CODE_FOR_nothing;
-      reg_addr[m].reload_fpr_gpr = CODE_FOR_nothing;
-      reg_addr[m].reload_gpr_vsx = CODE_FOR_nothing;
-      reg_addr[m].reload_vsx_gpr = CODE_FOR_nothing;
-    }
+  gcc_assert ((int)VECTOR_NONE == 0);
+  memset ((void *) &rs6000_vector_unit[0], '\0', sizeof (rs6000_vector_unit));
+  memset ((void *) &rs6000_vector_mem[0], '\0', sizeof (rs6000_vector_unit));
+
+  gcc_assert ((int)CODE_FOR_nothing == 0);
+  memset ((void *) &reg_addr[0], '\0', sizeof (reg_addr));
 
-  for (c = 0; c < (int)(int)RS6000_CONSTRAINT_MAX; c++)
-    rs6000_constraints[c] = NO_REGS;
+  gcc_assert ((int)NO_REGS == 0);
+  memset ((void *) &rs6000_constraints[0], '\0', sizeof (rs6000_constraints));
 
   /* The VSX hardware allows native alignment for vectors, but control whether the compiler
      believes it can use native alignment or still uses 128-bit alignment.  */
@@ -2660,6 +2846,11 @@ rs6000_init_hard_regno_mode_ok (bool glo
 	}
     }
 
+  /* Update the addr mask bits in reg_addr to help secondary reload and go if
+     legitimate address support to figure out the appropriate addressing to
+     use.  */
+  rs6000_setup_reg_addr_masks ();
+
   if (global_init_p || TARGET_DEBUG_TARGET)
     {
       if (TARGET_DEBUG_REG)
@@ -7162,17 +7353,9 @@ rs6000_legitimate_address_p (enum machin
     return 0;
   if (legitimate_indirect_address_p (x, reg_ok_strict))
     return 1;
-  if ((GET_CODE (x) == PRE_INC || GET_CODE (x) == PRE_DEC)
-      && !ALTIVEC_OR_VSX_VECTOR_MODE (mode)
-      && !SPE_VECTOR_MODE (mode)
-      && mode != TFmode
-      && mode != TDmode
-      && mode != TImode
-      && mode != PTImode
-      /* Restrict addressing for DI because of our SUBREG hackery.  */
-      && !(TARGET_E500_DOUBLE
-	   && (mode == DFmode || mode == DDmode || mode == DImode))
-      && TARGET_UPDATE
+  if (TARGET_UPDATE
+      && (GET_CODE (x) == PRE_INC || GET_CODE (x) == PRE_DEC)
+      && mode_supports_pre_incdec_p (mode)
       && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
     return 1;
   if (virtual_stack_registers_memory_p (x))
@@ -7212,21 +7395,8 @@ rs6000_legitimate_address_p (enum machin
       && !avoiding_indexed_address_p (mode)
       && legitimate_indexed_address_p (x, reg_ok_strict))
     return 1;
-  if (GET_CODE (x) == PRE_MODIFY
-      && mode != TImode
-      && mode != PTImode
-      && mode != TFmode
-      && mode != TDmode
-      && ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT)
-	  || TARGET_POWERPC64
-	  || ((mode != DFmode && mode != DDmode) || TARGET_E500_DOUBLE))
-      && (TARGET_POWERPC64 || mode != DImode)
-      && !ALTIVEC_OR_VSX_VECTOR_MODE (mode)
-      && !SPE_VECTOR_MODE (mode)
-      /* Restrict addressing for DI because of our SUBREG hackery.  */
-      && !(TARGET_E500_DOUBLE
-	   && (mode == DFmode || mode == DDmode || mode == DImode))
-      && TARGET_UPDATE
+  if (TARGET_UPDATE && GET_CODE (x) == PRE_MODIFY
+      && mode_supports_pre_modify_p (mode)
       && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict)
       && (rs6000_legitimate_offset_address_p (mode, XEXP (x, 1),
 					      reg_ok_strict, false)
@@ -29803,7 +29973,6 @@ rs6000_print_options_internal (FILE *fil
   size_t cur_column;
   size_t max_column = 76;
   const char *comma = "";
-  const char *nl = "\n";
 
   if (indent)
     start_column += fprintf (file, "%*s", indent, "");
@@ -29834,7 +30003,6 @@ rs6000_print_options_internal (FILE *fil
 	      fprintf (stderr, ", \\\n%*s", (int)start_column, "");
 	      cur_column = start_column + len;
 	      comma = "";
-	      nl = "\n\n";
 	    }
 
 	  fprintf (file, "%s%s%s%s", comma, prefix, no_str,
@@ -29844,7 +30012,7 @@ rs6000_print_options_internal (FILE *fil
 	}
     }
 
-  fputs (nl, file);
+  fputs ("\n", file);
 }
 
 /* Helper function to print the current isa options on a line.  */

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #5
  2013-10-07 23:05             ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #5 Michael Meissner
@ 2013-10-15  1:14               ` David Edelsohn
  2013-10-16 19:08                 ` Michael Meissner
  2013-10-17 19:32                 ` Michael Meissner
  0 siblings, 2 replies; 20+ messages in thread
From: David Edelsohn @ 2013-10-15  1:14 UTC (permalink / raw)
  To: Michael Meissner, GCC Patches

On Mon, Oct 7, 2013 at 7:00 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> On this patch, I add some infrastructure to say whether a given mode can go in
> a general purpose register, floating point register, or altivec register, and
> if it can, what kinds of addressing is legal for the mode.  In addition, I have
> switched to using memset to setup the arrays indexed by mode.
>
> I've rewritten this infrastructure since I submitted version 1 of the patch.
> At present, this is only an infrastructure change, and the compiler generates
> the same code underneath before and after the patch.
>
> I switched rs6000_legitimate_address_p so that it uses this new infrastructure
> to determine if a mode can do auto update as part of the address.  I found that
> the code in rs6000_legitimate_address_p was not obvious what they were trying
> to do.  I discovered for some 32-bit abis, the types DImode, DFmode, and
> DDmode, would allow addresses with PRE_INC or PRE_DEC, but not PRE_MODIFY.  At
> present, I am generating the same code, but in a later patch, we can allow
> PRE_MODIFY for those types as well.  I noticed this when I was doing version 1
> of the patches, and I was generating more auto update addressing forms than
> before.
>
> I have run bootstrap and make check with no regressions with these patches.  I
> have also tested code for various 32/64-bit targets (power8, power7, power6x,
> power6, power5, power 5+, power4, cell, 476 with/with floating point, g4, g5,
> e5500, e6500, and xilinx full/lite/no floating point), and I found the code
> generated is identical.
>
> Can I install these patches in GCC 4.9 on the trunk?
>
> 2013-10-07  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         * config/rs6000/rs6000.c (enum rs6000_reload_reg_type): Add new
>         fields to the reg_addr array that describes the valid addressing
>         mode for any register, general purpose registers, floating point
>         registers, and Altivec registers.
>         (FIRST_RELOAD_REG_CLASS): Likewise.
>         (LAST_RELOAD_REG_CLASS): Likewise.
>         (struct reload_reg_map_type): Likewise.
>         (reload_reg_map_type): Likewise.
>         (RELOAD_REG_VALID): Likewise.
>         (RELOAD_REG_MULTIPLE): Likewise.
>         (RELOAD_REG_INDEXED): Likewise.
>         (RELOAD_REG_OFFSET): Likewise.
>         (RELOAD_REG_PRE_INCDEC): Likewise.
>         (RELOAD_REG_PRE_MODIFY): Likewise.
>         (reg_addr): Likewise.
>         (mode_supports_pre_incdec_p): New helper functions to say whether
>         a given mode supports PRE_INC, PRE_DEC, and PRE_MODIFY.
>         (mode_supports_pre_modify_p): Likewise.
>         (rs6000_debug_vector_unit): Rearrange the -mdebug=reg output to
>         print the valid address mode bits for each mode.
>         (rs6000_debug_print_mode): Likewise.
>         (rs6000_debug_reg_global): Likewise.
>         (rs6000_setup_reg_addr_masks): New function to set up the address
>         mask bits for each type.
>         (rs6000_init_hard_regno_mode_ok): Use memset to clear arrays.
>         Call rs6000_setup_reg_addr_masks to set up the address mask bits.
>         (rs6000_legitimate_address_p): Use mode_supports_pre_incdec_p and
>         mode_supports_pre_modify_p to determine if PRE_INC, PRE_DEC, and
>         PRE_MODIFY are supported.
>         (rs6000_print_options_internal): Tweak the debug output slightly.

Please use "VMX" instead of "AV", not because I am pedantic about
IBM's name for the feature, but "AV" is meaningless to me and anyone
else reading the code. If you need a three-letter name, use "VMX".

Okay with that change.

Thanks, David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #5
  2013-10-15  1:14               ` David Edelsohn
@ 2013-10-16 19:08                 ` Michael Meissner
  2013-10-16 21:35                   ` David Edelsohn
  2013-10-17 19:32                 ` Michael Meissner
  1 sibling, 1 reply; 20+ messages in thread
From: Michael Meissner @ 2013-10-16 19:08 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Michael Meissner, GCC Patches

On Mon, Oct 14, 2013 at 08:59:50PM -0400, David Edelsohn wrote:
> Please use "VMX" instead of "AV", not because I am pedantic about
> IBM's name for the feature, but "AV" is meaningless to me and anyone
> else reading the code. If you need a three-letter name, use "VMX".
> 
> Okay with that change.

I can change it to VMX, but VMX is not referenced very much in the port.  For
better or worse, Altivec is used everywhere.  So I think I will just spell it
out.

I had used _av previously in rs6000_output_move_128bit.  Did you want me to
clean up those places as well?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #5
  2013-10-16 19:08                 ` Michael Meissner
@ 2013-10-16 21:35                   ` David Edelsohn
  0 siblings, 0 replies; 20+ messages in thread
From: David Edelsohn @ 2013-10-16 21:35 UTC (permalink / raw)
  To: Michael Meissner, GCC Patches

On Wed, Oct 16, 2013 at 3:00 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> On Mon, Oct 14, 2013 at 08:59:50PM -0400, David Edelsohn wrote:
>> Please use "VMX" instead of "AV", not because I am pedantic about
>> IBM's name for the feature, but "AV" is meaningless to me and anyone
>> else reading the code. If you need a three-letter name, use "VMX".
>>
>> Okay with that change.
>
> I can change it to VMX, but VMX is not referenced very much in the port.  For
> better or worse, Altivec is used everywhere.  So I think I will just spell it
> out.
>
> I had used _av previously in rs6000_output_move_128bit.  Did you want me to
> clean up those places as well?

Yes, please remove uses of "av" because I really don't think that it
is meaningful or informative to someone reading the source code. If
you change that use to "_vmx", then VMX will be used more in the code.
I'm not going to insist on VMX, but it never will be used more in the
code if we always avoid using it.

Thanks, David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #5
  2013-10-15  1:14               ` David Edelsohn
  2013-10-16 19:08                 ` Michael Meissner
@ 2013-10-17 19:32                 ` Michael Meissner
  1 sibling, 0 replies; 20+ messages in thread
From: Michael Meissner @ 2013-10-17 19:32 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Michael Meissner, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 2069 bytes --]

On Mon, Oct 14, 2013 at 08:59:50PM -0400, David Edelsohn wrote:
> Please use "VMX" instead of "AV", not because I am pedantic about
> IBM's name for the feature, but "AV" is meaningless to me and anyone
> else reading the code. If you need a three-letter name, use "VMX".
> 
> Okay with that change.

I just applied the following patch.

2013-10-17  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (enum rs6000_reload_reg_type): Add new
	fields to the reg_addr array that describes the valid addressing
	mode for any register, general purpose registers, floating point
	registers, and Altivec registers.
	(FIRST_RELOAD_REG_CLASS): Likewise.
	(LAST_RELOAD_REG_CLASS): Likewise.
	(struct reload_reg_map_type): Likewise.
	(reload_reg_map_type): Likewise.
	(RELOAD_REG_VALID): Likewise.
	(RELOAD_REG_MULTIPLE): Likewise.
	(RELOAD_REG_INDEXED): Likewise.
	(RELOAD_REG_OFFSET): Likewise.
	(RELOAD_REG_PRE_INCDEC): Likewise.
	(RELOAD_REG_PRE_MODIFY): Likewise.
	(reg_addr): Likewise.
	(mode_supports_pre_incdec_p): New helper functions to say whether
	a given mode supports PRE_INC, PRE_DEC, and PRE_MODIFY.
	(mode_supports_pre_modify_p): Likewise.
	(rs6000_debug_vector_unit): Rearrange the -mdebug=reg output to
	print the valid address mode bits for each mode.
	(rs6000_debug_print_mode): Likewise.
	(rs6000_debug_reg_global): Likewise.
	(rs6000_setup_reg_addr_masks): New function to set up the address
	mask bits for each type.
	(rs6000_init_hard_regno_mode_ok): Use memset to clear arrays.
	Call rs6000_setup_reg_addr_masks to set up the address mask bits.
	(rs6000_legitimate_address_p): Use mode_supports_pre_incdec_p and
	mode_supports_pre_modify_p to determine if PRE_INC, PRE_DEC, and
	PRE_MODIFY are supported.
	(rs6000_output_move_128bit): Change to use {src,dest}_vmx_p for altivec
	registers, instead of {src,dest}_av_p.
	(rs6000_print_options_internal): Tweak the debug output slightly.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[-- Attachment #2: gcc-power8.patch064c --]
[-- Type: text/plain, Size: 17387 bytes --]

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 203789)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -313,6 +313,50 @@ static enum rs6000_reg_type reg_class_to
 
 #define IS_FP_VECT_REG_TYPE(RTYPE) IN_RANGE(RTYPE, VSX_REG_TYPE, FPR_REG_TYPE)
 
+
+/* Register classes we care about in secondary reload or go if legitimate
+   address.  We only need to worry about GPR, FPR, and Altivec registers here,
+   along an ANY field that is the OR of the 3 register classes.  */
+
+enum rs6000_reload_reg_type {
+  RELOAD_REG_GPR,			/* General purpose registers.  */
+  RELOAD_REG_FPR,			/* Traditional floating point regs.  */
+  RELOAD_REG_VMX,			/* Altivec (VMX) registers.  */
+  RELOAD_REG_ANY,			/* OR of GPR, FPR, Altivec masks.  */
+  N_RELOAD_REG
+};
+
+/* For setting up register classes, loop through the 3 register classes mapping
+   into real registers, and skip the ANY class, which is just an OR of the
+   bits.  */
+#define FIRST_RELOAD_REG_CLASS	RELOAD_REG_GPR
+#define LAST_RELOAD_REG_CLASS	RELOAD_REG_VMX
+
+/* Map reload register type to a register in the register class.  */
+struct reload_reg_map_type {
+  const char *name;			/* Register class name.  */
+  int reg;				/* Register in the register class.  */
+};
+
+static const struct reload_reg_map_type reload_reg_map[N_RELOAD_REG] = {
+  { "Gpr",	FIRST_GPR_REGNO },	/* RELOAD_REG_GPR.  */
+  { "Fpr",	FIRST_FPR_REGNO },	/* RELOAD_REG_FPR.  */
+  { "VMX",	FIRST_ALTIVEC_REGNO },	/* RELOAD_REG_VMX.  */
+  { "Any",	-1 },			/* RELOAD_REG_ANY.  */
+};
+
+/* Mask bits for each register class, indexed per mode.  Historically the
+   compiler has been more restrictive which types can do PRE_MODIFY instead of
+   PRE_INC and PRE_DEC, so keep track of sepaate bits for these two.  */
+typedef unsigned char addr_mask_type;
+
+#define RELOAD_REG_VALID	0x01	/* Mode valid in register..  */
+#define RELOAD_REG_MULTIPLE	0x02	/* Mode takes multiple registers.  */
+#define RELOAD_REG_INDEXED	0x04	/* Reg+reg addressing.  */
+#define RELOAD_REG_OFFSET	0x08	/* Reg+offset addressing. */
+#define RELOAD_REG_PRE_INCDEC	0x10	/* PRE_INC/PRE_DEC valid.  */
+#define RELOAD_REG_PRE_MODIFY	0x20	/* PRE_MODIFY valid.  */
+
 /* Register type masks based on the type, of valid addressing modes.  */
 struct rs6000_reg_addr {
   enum insn_code reload_load;		/* INSN to reload for loading. */
@@ -320,10 +364,27 @@ struct rs6000_reg_addr {
   enum insn_code reload_fpr_gpr;	/* INSN to move from FPR to GPR.  */
   enum insn_code reload_gpr_vsx;	/* INSN to move from GPR to VSX.  */
   enum insn_code reload_vsx_gpr;	/* INSN to move from VSX to GPR.  */
+  addr_mask_type addr_mask[(int)N_RELOAD_REG]; /* Valid address masks.  */
 };
 
 static struct rs6000_reg_addr reg_addr[NUM_MACHINE_MODES];
 
+/* Helper function to say whether a mode supports PRE_INC or PRE_DEC.  */
+static inline bool
+mode_supports_pre_incdec_p (enum machine_mode mode)
+{
+  return ((reg_addr[mode].addr_mask[RELOAD_REG_ANY] & RELOAD_REG_PRE_INCDEC)
+	  != 0);
+}
+
+/* Helper function to say whether a mode supports PRE_MODIFY.  */
+static inline bool
+mode_supports_pre_modify_p (enum machine_mode mode)
+{
+  return ((reg_addr[mode].addr_mask[RELOAD_REG_ANY] & RELOAD_REG_PRE_MODIFY)
+	  != 0);
+}
+
 \f
 /* Target cpu costs.  */
 
@@ -1777,6 +1838,63 @@ rs6000_debug_reg_print (int first_regno,
     }
 }
 
+static const char *
+rs6000_debug_vector_unit (enum rs6000_vector v)
+{
+  const char *ret;
+
+  switch (v)
+    {
+    case VECTOR_NONE:	   ret = "none";      break;
+    case VECTOR_ALTIVEC:   ret = "altivec";   break;
+    case VECTOR_VSX:	   ret = "vsx";       break;
+    case VECTOR_P8_VECTOR: ret = "p8_vector"; break;
+    case VECTOR_PAIRED:	   ret = "paired";    break;
+    case VECTOR_SPE:	   ret = "spe";       break;
+    case VECTOR_OTHER:	   ret = "other";     break;
+    default:		   ret = "unknown";   break;
+    }
+
+  return ret;
+}
+
+/* Print the address masks in a human readble fashion.  */
+DEBUG_FUNCTION void
+rs6000_debug_print_mode (ssize_t m)
+{
+  ssize_t rc;
+
+  fprintf (stderr, "Mode: %-5s", GET_MODE_NAME (m));
+  for (rc = 0; rc < N_RELOAD_REG; rc++)
+    {
+      addr_mask_type mask = reg_addr[m].addr_mask[rc];
+      fprintf (stderr,
+	       "  %s: %c%c%c%c%c%c",
+	       reload_reg_map[rc].name,
+	       (mask & RELOAD_REG_VALID)      != 0 ? 'v' : ' ',
+	       (mask & RELOAD_REG_MULTIPLE)   != 0 ? 'm' : ' ',
+	       (mask & RELOAD_REG_INDEXED)    != 0 ? 'i' : ' ',
+	       (mask & RELOAD_REG_OFFSET)     != 0 ? 'o' : ' ',
+	       (mask & RELOAD_REG_PRE_INCDEC) != 0 ? '+' : ' ',
+	       (mask & RELOAD_REG_PRE_MODIFY) != 0 ? '+' : ' ');
+    }
+
+  if (rs6000_vector_unit[m] != VECTOR_NONE
+      || rs6000_vector_mem[m] != VECTOR_NONE
+      || (reg_addr[m].reload_store != CODE_FOR_nothing)
+      || (reg_addr[m].reload_load != CODE_FOR_nothing))
+    {
+      fprintf (stderr,
+	       "  Vector-arith=%-10s Vector-mem=%-10s Reload=%c%c",
+	       rs6000_debug_vector_unit (rs6000_vector_unit[m]),
+	       rs6000_debug_vector_unit (rs6000_vector_mem[m]),
+	       (reg_addr[m].reload_store != CODE_FOR_nothing) ? 's' : '*',
+	       (reg_addr[m].reload_load != CODE_FOR_nothing) ? 'l' : '*');
+    }
+
+  fputs ("\n", stderr);
+}
+
 #define DEBUG_FMT_ID "%-32s= "
 #define DEBUG_FMT_D   DEBUG_FMT_ID "%d\n"
 #define DEBUG_FMT_WX  DEBUG_FMT_ID "%#.12" HOST_WIDE_INT_PRINT "x: "
@@ -1800,17 +1918,6 @@ rs6000_debug_reg_global (void)
   const char *cmodel_str;
   struct cl_target_option cl_opts;
 
-  /* Map enum rs6000_vector to string.  */
-  static const char *rs6000_debug_vector_unit[] = {
-    "none",
-    "altivec",
-    "vsx",
-    "p8_vector",
-    "paired",
-    "spe",
-    "other"
-  };
-
   /* Modes we want tieable information on.  */
   static const enum machine_mode print_tieable_modes[] = {
     QImode,
@@ -1928,24 +2035,11 @@ rs6000_debug_reg_global (void)
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wy]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wz]]);
 
+  nl = "\n";
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
-    if (rs6000_vector_unit[m] || rs6000_vector_mem[m]
-	|| (reg_addr[m].reload_load != CODE_FOR_nothing)
-	|| (reg_addr[m].reload_store != CODE_FOR_nothing))
-      {
-	nl = "\n";
-	fprintf (stderr,
-		 "Vector mode: %-5s arithmetic: %-10s move: %-10s "
-		 "reload-out: %c reload-in: %c\n",
-		 GET_MODE_NAME (m),
-		 rs6000_debug_vector_unit[ rs6000_vector_unit[m] ],
-		 rs6000_debug_vector_unit[ rs6000_vector_mem[m] ],
-		 (reg_addr[m].reload_store != CODE_FOR_nothing) ? 'y' : 'n',
-		 (reg_addr[m].reload_load != CODE_FOR_nothing) ? 'y' : 'n');
-      }
+    rs6000_debug_print_mode (m);
 
-  if (nl)
-    fputs (nl, stderr);
+  fputs ("\n", stderr);
 
   for (m1 = 0; m1 < ARRAY_SIZE (print_tieable_modes); m1++)
     {
@@ -2181,6 +2275,101 @@ rs6000_debug_reg_global (void)
 	   (int)RS6000_BUILTIN_COUNT);
 }
 
+\f
+/* Update the addr mask bits in reg_addr to help secondary reload and go if
+   legitimate address support to figure out the appropriate addressing to
+   use.  */
+
+static void
+rs6000_setup_reg_addr_masks (void)
+{
+  ssize_t rc, reg, m, nregs;
+  addr_mask_type any_addr_mask, addr_mask;
+
+  for (m = 0; m < NUM_MACHINE_MODES; ++m)
+    {
+      /* SDmode is special in that we want to access it only via REG+REG
+	 addressing on power7 and above, since we want to use the LFIWZX and
+	 STFIWZX instructions to load it.  */
+      bool indexed_only_p = (m == SDmode && TARGET_NO_SDMODE_STACK);
+
+      any_addr_mask = 0;
+      for (rc = FIRST_RELOAD_REG_CLASS; rc <= LAST_RELOAD_REG_CLASS; rc++)
+	{
+	  addr_mask = 0;
+	  reg = reload_reg_map[rc].reg;
+
+	  /* Can mode values go in the GPR/FPR/Altivec registers?  */
+	  if (reg >= 0 && rs6000_hard_regno_mode_ok_p[m][reg])
+	    {
+	      nregs = rs6000_hard_regno_nregs[m][reg];
+	      addr_mask |= RELOAD_REG_VALID;
+
+	      /* Indicate if the mode takes more than 1 physical register.  If
+		 it takes a single register, indicate it can do REG+REG
+		 addressing.  */
+	      if (nregs > 1 || m == BLKmode)
+		addr_mask |= RELOAD_REG_MULTIPLE;
+	      else
+		addr_mask |= RELOAD_REG_INDEXED;
+
+	      /* Figure out if we can do PRE_INC, PRE_DEC, or PRE_MODIFY
+		 addressing.  Restrict addressing on SPE for 64-bit types
+		 because of the SUBREG hackery used to address 64-bit floats in
+		 '32-bit' GPRs.  To simplify secondary reload, don't allow
+		 update forms on scalar floating point types that can go in the
+		 upper registers.  */
+
+	      if (TARGET_UPDATE
+		  && (rc == RELOAD_REG_GPR || rc == RELOAD_REG_FPR)
+		  && GET_MODE_SIZE (m) <= 8
+		  && !VECTOR_MODE_P (m)
+		  && !COMPLEX_MODE_P (m)
+		  && !indexed_only_p
+		  && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m) == 8)
+		  && !(m == DFmode && TARGET_UPPER_REGS_DF)
+		  && !(m == SFmode && TARGET_UPPER_REGS_SF))
+		{
+		  addr_mask |= RELOAD_REG_PRE_INCDEC;
+
+		  /* PRE_MODIFY is more restricted than PRE_INC/PRE_DEC in that
+		     we don't allow PRE_MODIFY for some multi-register
+		     operations.  */
+		  switch (m)
+		    {
+		    default:
+		      addr_mask |= RELOAD_REG_PRE_MODIFY;
+		      break;
+
+		    case DImode:
+		      if (TARGET_POWERPC64)
+			addr_mask |= RELOAD_REG_PRE_MODIFY;
+		      break;
+
+		    case DFmode:
+		    case DDmode:
+		      if (TARGET_DF_INSN)
+			addr_mask |= RELOAD_REG_PRE_MODIFY;
+		      break;
+		    }
+		}
+	    }
+
+	  /* GPR and FPR registers can do REG+OFFSET addressing, except
+	     possibly for SDmode.  */
+	  if ((addr_mask != 0) && !indexed_only_p
+	      && (rc == RELOAD_REG_GPR || rc == RELOAD_REG_FPR))
+	    addr_mask |= RELOAD_REG_OFFSET;
+
+	  reg_addr[m].addr_mask[rc] = addr_mask;
+	  any_addr_mask |= addr_mask;
+	}
+
+      reg_addr[m].addr_mask[RELOAD_REG_ANY] = any_addr_mask;
+    }
+}
+
+\f
 /* Initialize the various global tables that are based on register size.  */
 static void
 rs6000_init_hard_regno_mode_ok (bool global_init_p)
@@ -2253,18 +2442,15 @@ rs6000_init_hard_regno_mode_ok (bool glo
   /* Precalculate the valid memory formats as well as the vector information,
      this must be set up before the rs6000_hard_regno_nregs_internal calls
      below.  */
-  for (m = 0; m < NUM_MACHINE_MODES; ++m)
-    {
-      rs6000_vector_unit[m] = rs6000_vector_mem[m] = VECTOR_NONE;
-      reg_addr[m].reload_load = CODE_FOR_nothing;
-      reg_addr[m].reload_store = CODE_FOR_nothing;
-      reg_addr[m].reload_fpr_gpr = CODE_FOR_nothing;
-      reg_addr[m].reload_gpr_vsx = CODE_FOR_nothing;
-      reg_addr[m].reload_vsx_gpr = CODE_FOR_nothing;
-    }
+  gcc_assert ((int)VECTOR_NONE == 0);
+  memset ((void *) &rs6000_vector_unit[0], '\0', sizeof (rs6000_vector_unit));
+  memset ((void *) &rs6000_vector_mem[0], '\0', sizeof (rs6000_vector_unit));
+
+  gcc_assert ((int)CODE_FOR_nothing == 0);
+  memset ((void *) &reg_addr[0], '\0', sizeof (reg_addr));
 
-  for (c = 0; c < (int)(int)RS6000_CONSTRAINT_MAX; c++)
-    rs6000_constraints[c] = NO_REGS;
+  gcc_assert ((int)NO_REGS == 0);
+  memset ((void *) &rs6000_constraints[0], '\0', sizeof (rs6000_constraints));
 
   /* The VSX hardware allows native alignment for vectors, but control whether the compiler
      believes it can use native alignment or still uses 128-bit alignment.  */
@@ -2660,6 +2846,11 @@ rs6000_init_hard_regno_mode_ok (bool glo
 	}
     }
 
+  /* Update the addr mask bits in reg_addr to help secondary reload and go if
+     legitimate address support to figure out the appropriate addressing to
+     use.  */
+  rs6000_setup_reg_addr_masks ();
+
   if (global_init_p || TARGET_DEBUG_TARGET)
     {
       if (TARGET_DEBUG_REG)
@@ -7166,17 +7357,9 @@ rs6000_legitimate_address_p (enum machin
     return 0;
   if (legitimate_indirect_address_p (x, reg_ok_strict))
     return 1;
-  if ((GET_CODE (x) == PRE_INC || GET_CODE (x) == PRE_DEC)
-      && !ALTIVEC_OR_VSX_VECTOR_MODE (mode)
-      && !SPE_VECTOR_MODE (mode)
-      && mode != TFmode
-      && mode != TDmode
-      && mode != TImode
-      && mode != PTImode
-      /* Restrict addressing for DI because of our SUBREG hackery.  */
-      && !(TARGET_E500_DOUBLE
-	   && (mode == DFmode || mode == DDmode || mode == DImode))
-      && TARGET_UPDATE
+  if (TARGET_UPDATE
+      && (GET_CODE (x) == PRE_INC || GET_CODE (x) == PRE_DEC)
+      && mode_supports_pre_incdec_p (mode)
       && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
     return 1;
   if (virtual_stack_registers_memory_p (x))
@@ -7216,21 +7399,8 @@ rs6000_legitimate_address_p (enum machin
       && !avoiding_indexed_address_p (mode)
       && legitimate_indexed_address_p (x, reg_ok_strict))
     return 1;
-  if (GET_CODE (x) == PRE_MODIFY
-      && mode != TImode
-      && mode != PTImode
-      && mode != TFmode
-      && mode != TDmode
-      && ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT)
-	  || TARGET_POWERPC64
-	  || ((mode != DFmode && mode != DDmode) || TARGET_E500_DOUBLE))
-      && (TARGET_POWERPC64 || mode != DImode)
-      && !ALTIVEC_OR_VSX_VECTOR_MODE (mode)
-      && !SPE_VECTOR_MODE (mode)
-      /* Restrict addressing for DI because of our SUBREG hackery.  */
-      && !(TARGET_E500_DOUBLE
-	   && (mode == DFmode || mode == DDmode || mode == DImode))
-      && TARGET_UPDATE
+  if (TARGET_UPDATE && GET_CODE (x) == PRE_MODIFY
+      && mode_supports_pre_modify_p (mode)
       && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict)
       && (rs6000_legitimate_offset_address_p (mode, XEXP (x, 1),
 					      reg_ok_strict, false)
@@ -16000,21 +16170,21 @@ rs6000_output_move_128bit (rtx operands[
   enum machine_mode mode = GET_MODE (dest);
   int dest_regno;
   int src_regno;
-  bool dest_gpr_p, dest_fp_p, dest_av_p, dest_vsx_p;
-  bool src_gpr_p, src_fp_p, src_av_p, src_vsx_p;
+  bool dest_gpr_p, dest_fp_p, dest_vmx_p, dest_vsx_p;
+  bool src_gpr_p, src_fp_p, src_vmx_p, src_vsx_p;
 
   if (REG_P (dest))
     {
       dest_regno = REGNO (dest);
       dest_gpr_p = INT_REGNO_P (dest_regno);
       dest_fp_p = FP_REGNO_P (dest_regno);
-      dest_av_p = ALTIVEC_REGNO_P (dest_regno);
-      dest_vsx_p = dest_fp_p | dest_av_p;
+      dest_vmx_p = ALTIVEC_REGNO_P (dest_regno);
+      dest_vsx_p = dest_fp_p | dest_vmx_p;
     }
   else
     {
       dest_regno = -1;
-      dest_gpr_p = dest_fp_p = dest_av_p = dest_vsx_p = false;
+      dest_gpr_p = dest_fp_p = dest_vmx_p = dest_vsx_p = false;
     }
 
   if (REG_P (src))
@@ -16022,13 +16192,13 @@ rs6000_output_move_128bit (rtx operands[
       src_regno = REGNO (src);
       src_gpr_p = INT_REGNO_P (src_regno);
       src_fp_p = FP_REGNO_P (src_regno);
-      src_av_p = ALTIVEC_REGNO_P (src_regno);
-      src_vsx_p = src_fp_p | src_av_p;
+      src_vmx_p = ALTIVEC_REGNO_P (src_regno);
+      src_vsx_p = src_fp_p | src_vmx_p;
     }
   else
     {
       src_regno = -1;
-      src_gpr_p = src_fp_p = src_av_p = src_vsx_p = false;
+      src_gpr_p = src_fp_p = src_vmx_p = src_vsx_p = false;
     }
 
   /* Register moves.  */
@@ -16052,7 +16222,7 @@ rs6000_output_move_128bit (rtx operands[
 	    return "#";
 	}
 
-      else if (TARGET_ALTIVEC && dest_av_p && src_av_p)
+      else if (TARGET_ALTIVEC && dest_vmx_p && src_vmx_p)
 	return "vor %0,%1,%1";
 
       else if (dest_fp_p && src_fp_p)
@@ -16070,7 +16240,7 @@ rs6000_output_move_128bit (rtx operands[
 	    return "#";
 	}
 
-      else if (TARGET_ALTIVEC && dest_av_p
+      else if (TARGET_ALTIVEC && dest_vmx_p
 	       && altivec_indexed_or_indirect_operand (src, mode))
 	return "lvx %0,%y1";
 
@@ -16082,7 +16252,7 @@ rs6000_output_move_128bit (rtx operands[
 	    return "lxvd2x %x0,%y1";
 	}
 
-      else if (TARGET_ALTIVEC && dest_av_p)
+      else if (TARGET_ALTIVEC && dest_vmx_p)
 	return "lvx %0,%y1";
 
       else if (dest_fp_p)
@@ -16100,7 +16270,7 @@ rs6000_output_move_128bit (rtx operands[
 	    return "#";
 	}
 
-      else if (TARGET_ALTIVEC && src_av_p
+      else if (TARGET_ALTIVEC && src_vmx_p
 	       && altivec_indexed_or_indirect_operand (src, mode))
 	return "stvx %1,%y0";
 
@@ -16112,7 +16282,7 @@ rs6000_output_move_128bit (rtx operands[
 	    return "stxvd2x %x1,%y0";
 	}
 
-      else if (TARGET_ALTIVEC && src_av_p)
+      else if (TARGET_ALTIVEC && src_vmx_p)
 	return "stvx %1,%y0";
 
       else if (src_fp_p)
@@ -16131,7 +16301,7 @@ rs6000_output_move_128bit (rtx operands[
       else if (TARGET_VSX && dest_vsx_p && zero_constant (src, mode))
 	return "xxlxor %x0,%x0,%x0";
 
-      else if (TARGET_ALTIVEC && dest_av_p)
+      else if (TARGET_ALTIVEC && dest_vmx_p)
 	return output_vec_const_move (operands);
     }
 
@@ -30038,7 +30208,6 @@ rs6000_print_options_internal (FILE *fil
   size_t cur_column;
   size_t max_column = 76;
   const char *comma = "";
-  const char *nl = "\n";
 
   if (indent)
     start_column += fprintf (file, "%*s", indent, "");
@@ -30069,7 +30238,6 @@ rs6000_print_options_internal (FILE *fil
 	      fprintf (stderr, ", \\\n%*s", (int)start_column, "");
 	      cur_column = start_column + len;
 	      comma = "";
-	      nl = "\n\n";
 	    }
 
 	  fprintf (file, "%s%s%s%s", comma, prefix, no_str,
@@ -30079,7 +30247,7 @@ rs6000_print_options_internal (FILE *fil
 	}
     }
 
-  fputs (nl, file);
+  fputs ("\n", file);
 }
 
 /* Helper function to print the current isa options on a line.  */

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2013-10-17 19:12 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-22 19:02 [PATCH, powerpc] Rework VSX scalar floating point support Michael Meissner
2013-09-06  3:39 ` David Edelsohn
2013-09-21  2:09   ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #1 Michael Meissner
2013-09-23 17:38     ` David Edelsohn
2013-09-23 20:37 ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #2 Michael Meissner
2013-09-24  7:05   ` David Edelsohn
2013-09-24 21:46   ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #3 Michael Meissner
2013-09-26  8:34     ` David Edelsohn
2013-09-26 21:37       ` Michael Meissner
2013-09-26 21:50         ` Michael Meissner
2013-09-27  2:15         ` David Edelsohn
2013-09-27  4:18           ` Michael Meissner
2013-10-01 17:27           ` Michael Meissner
2013-10-01 17:52         ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #4 Michael Meissner
2013-10-01 23:15           ` David Edelsohn
2013-10-07 23:05             ` [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #5 Michael Meissner
2013-10-15  1:14               ` David Edelsohn
2013-10-16 19:08                 ` Michael Meissner
2013-10-16 21:35                   ` David Edelsohn
2013-10-17 19:32                 ` Michael Meissner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).