From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-384511-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 16226 invoked by alias); 14 Nov 2014 20:16:44 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 16211 invoked by uid 89); 14 Nov 2014 20:16:44 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.2
X-HELO: e9.ny.us.ibm.com
Received: from e9.ny.us.ibm.com (HELO e9.ny.us.ibm.com) (32.97.182.139) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Fri, 14 Nov 2014 20:16:42 +0000
Received: from /spool/local	by e9.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted	for <gcc-patches@gcc.gnu.org> from <meissner@ibm-tiger.the-meissners.org>;	Fri, 14 Nov 2014 15:16:40 -0500
Received: from d01dlp01.pok.ibm.com (9.56.250.166)	by e9.ny.us.ibm.com (192.168.1.109) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted;	Fri, 14 Nov 2014 15:16:37 -0500
Received: from b01cxnp23032.gho.pok.ibm.com (b01cxnp23032.gho.pok.ibm.com [9.57.198.27])	by d01dlp01.pok.ibm.com (Postfix) with ESMTP id 3537338C804D	for <gcc-patches@gcc.gnu.org>; Fri, 14 Nov 2014 15:11:10 -0500 (EST)
Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217])	by b01cxnp23032.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id sAEKGagq24576210	for <gcc-patches@gcc.gnu.org>; Fri, 14 Nov 2014 20:16:36 GMT
Received: from d01av03.pok.ibm.com (localhost [127.0.0.1])	by d01av03.pok.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id sAEKGZIw026026	for <gcc-patches@gcc.gnu.org>; Fri, 14 Nov 2014 15:16:35 -0500
Received: from ibm-tiger.the-meissners.org (dhcp-9-32-77-206.usma.ibm.com [9.32.77.206])	by d01av03.pok.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id sAEKGZCV025979;	Fri, 14 Nov 2014 15:16:35 -0500
Received: by ibm-tiger.the-meissners.org (Postfix, from userid 500)	id 0BDAB4249C; Fri, 14 Nov 2014 15:16:34 -0500 (EST)
Date: Fri, 14 Nov 2014 20:47:00 -0000
From: Michael Meissner <meissner@linux.vnet.ibm.com>
To: Michael Meissner <meissner@linux.vnet.ibm.com>, gcc-patches@gcc.gnu.org,        dje.gcc@gmail.com, joseph@codesourcery.com, macro@codesourcery.com,        pattyo.lists@gmail.com, segher@kernel.crashing.org,        hainque@adacore.com, dmalcolm@redhat.com
Subject: Re: PATCH [8 of 8], rs6000, add support for scalar floating point in Altivec registers
Message-ID: <20141114201634.GA6247@ibm-tiger.the-meissners.org>
Mail-Followup-To: Michael Meissner <meissner@linux.vnet.ibm.com>,	gcc-patches@gcc.gnu.org, dje.gcc@gmail.com, joseph@codesourcery.com,	macro@codesourcery.com, pattyo.lists@gmail.com,	segher@kernel.crashing.org, hainque@adacore.com,	dmalcolm@redhat.com
References: <20141112002113.GA1489@ibm-tiger.the-meissners.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="J2SCkAp4GZ/dPZZf"
Content-Disposition: inline
In-Reply-To: <20141112002113.GA1489@ibm-tiger.the-meissners.org>
User-Agent: Mutt/1.5.20 (2009-12-10)
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 14111420-0033-0000-0000-0000010D70E5
X-IsSubscribed: yes
X-SW-Source: 2014-11/txt/msg01890.txt.bz2


--J2SCkAp4GZ/dPZZf
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-length: 3141

I tracked down the regression in the SPEC benchmarks.  It was due to turning
off pre-increment/pre-decrement for floating point values, and these two
benchmarks use pre-increment/pre-decrement quite a bit.  My secondary reload
handlers are capable of adding in the pre-increment/pre-decrement if such an
operation is attempted on an Altivec register, so this patch allows those
addressing modes again.
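
For illustration only (this is not code from the benchmarks, and the function
name is made up), a loop of the following shape is the kind of source that
tends to use pre-increment addressing on floating point values.  A PowerPC
compiler may turn the *++p access into an update-form load, but the exact
instruction selection depends on the compiler and options:

```c
#include <stddef.h>

/* Hypothetical example: sum N doubles, walking the array with a
   pre-incremented pointer.  Returns 0.0 for an empty array.  */
double
sum_preinc (const double *a, size_t n)
{
  const double *p = a;
  double sum;

  if (n == 0)
    return 0.0;

  /* Take the first element directly, then advance the pointer before
     each remaining load.  */
  sum = *p;
  while (--n > 0)
    sum += *++p;

  return sum;
}
```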

I am also including a patch to make the compiler work with -ffast-math.  If you
use -ffast-math, the easy_fp_constant predicate says that all constants are
easy in order to enable using the reciprocal approximation instructions for
division.  I put in a define_split to move the constants to the constant pool
after the reciprocal approximation work has been done but before reload
starts.  I had this patch in during development, but thought I did not need
it when preparing the patch series; perhaps recent changes to the register
allocator make it necessary again.
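
As a rough sketch of the kind of source involved (the function name and the
constant are made up for illustration), the function below mixes a floating
point constant with a division.  When built with -ffast-math and reciprocal
estimation enabled (for example with -mrecip), the compiler may expand the
division using a reciprocal estimate, and the constant 1.5 still has to be
loaded from the constant pool at some point:

```c
/* Hypothetical example: multiply X by a floating point constant and
   divide the result by Y.  */
double
scale_div (double x, double y)
{
  /* 1.5 is a const_double in the compiler's RTL; the division is the
     operation that may be replaced by a reciprocal estimate sequence.  */
  return (x * 1.5) / y;
}
```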

I added an option (-mupper-regs) to simplify setting both -mupper-regs-sf and
-mupper-regs-df.  It will only set the options that the particular machine
supports.
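
The intended behavior can be modeled outside the compiler roughly as follows.
This is a simplified stand-alone sketch, not the code in the patch: the mask
values, the struct, and the function name are hypothetical, and the has_vsx
and has_p8_vector arguments stand in for TARGET_VSX and TARGET_P8_VECTOR:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real option masks; the values are
   arbitrary and chosen only for this sketch.  */
enum { UPPER_REGS_DF = 1, UPPER_REGS_SF = 2 };

struct opts
{
  unsigned flags;      /* models rs6000_isa_flags */
  unsigned explicit_;  /* models rs6000_isa_flags_explicit */
};

/* Set or clear one mask unless the user already set it explicitly.  */
static void
apply_one (struct opts *o, unsigned mask, bool enable)
{
  if (o->explicit_ & mask)
    return;

  if (enable)
    o->flags |= mask;
  else
    o->flags &= ~mask;
  o->explicit_ |= mask;
}

/* UPPER_REGS is -1 if neither option was given, 0 for -mno-upper-regs,
   and 1 for -mupper-regs.  Only the options the machine supports are
   touched.  */
void
apply_upper_regs (struct opts *o, int upper_regs,
		  bool has_vsx, bool has_p8_vector)
{
  if (upper_regs < 0)
    return;

  if (has_vsx)
    apply_one (o, UPPER_REGS_DF, upper_regs > 0);
  if (has_p8_vector)
    apply_one (o, UPPER_REGS_SF, upper_regs > 0);
}
```

In this model, -mupper-regs on a machine with VSX but without the power8
vector instructions sets only the DF flag, and an individual option given
explicitly by the user is left alone.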

Finally, I changed the defaults to turn on -mupper-regs-df on power7/power8
systems, and -mupper-regs-sf on power8 systems.  I have run the regression test
suite with these options on, and there were no regressions.  Once all of the
other patches go in, can I check in these patches?

If you would prefer the default for GCC 5.0 not to enable the upper register
support, let me know, and I can remove the lines in rs6000-cpus.def that set
the default.

2014-11-14  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/predicates.md (memory_fp_constant): New predicate
	to return true if the operand is a floating point constant that
	must be put into the constant pool, before register allocation
	occurs.

	* config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Enable
	-mupper-regs-df by default.
	(ISA_2_7_MASKS_SERVER): Enable -mupper-regs-sf by default.
	(POWERPC_MASKS): Add -mupper-regs-{sf,df} as options set by the
	various -mcpu=... options.
	(power7 cpu): Enable -mupper-regs-df by default.

	* config/rs6000/rs6000.opt (-mupper-regs): New combination option
	that sets -mupper-regs-sf and -mupper-regs-df by default if the
	cpu supports the instructions.

	* config/rs6000/rs6000.c (rs6000_setup_reg_addr_masks): Allow
	pre-increment and pre-decrement on floating point, even if the
	-mupper-regs-{sf,df} options were used.
	(rs6000_option_override_internal): If -mupper-regs, set both
	-mupper-regs-sf and -mupper-regs-df, depending on the underlying
	cpu.

	* config/rs6000/rs6000.md (DFmode splitter): Add a define_split to
	move floating point constants to the constant pool before register
	allocation.  Normally constants are put into the pool immediately,
	but -ffast-math delays putting them into the constant pool for the
	reciprocal approximation support.
	(SFmode splitter): Likewise.

	* doc/invoke.texi (RS/6000 and PowerPC Options): Document
	-mupper-regs.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

--J2SCkAp4GZ/dPZZf
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="gcc-power8.patch132i"
Content-length: 9699

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 217448)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -521,6 +521,27 @@ (define_predicate "easy_fp_constant"
   }
 })
 
+;; Return 1 if the operand must be loaded from memory.  This is used by a
+;; define_split to ensure constants get pushed to the constant pool before
+;; reload.  If -ffast-math is used, easy_fp_constant will allow move insns to
+;; have constants in order not to interfere with reciprocal estimation.
+;; However, with -mupper-regs support, these constants must be moved to the
+;; constant pool before register allocation.
+
+(define_predicate "memory_fp_constant"
+  (match_code "const_double")
+{
+  if (TARGET_VSX && op == CONST0_RTX (mode))
+    return 0;
+
+  if (!TARGET_HARD_FLOAT || !TARGET_FPRS
+      || (mode == SFmode && !TARGET_SINGLE_FLOAT)
+      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
+    return 0;
+
+  return 1;
+})
+
 ;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
 ;; vector register without using memory.
 (define_predicate "easy_vector_constant"
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(revision 217448)
+++ gcc/config/rs6000/rs6000-cpus.def	(working copy)
@@ -44,7 +44,8 @@
 #define ISA_2_6_MASKS_SERVER	(ISA_2_5_MASKS_SERVER			\
 				 | OPTION_MASK_POPCNTD			\
 				 | OPTION_MASK_ALTIVEC			\
-				 | OPTION_MASK_VSX)
+				 | OPTION_MASK_VSX			\
+				 | OPTION_MASK_UPPER_REGS_DF)
 
 /* For now, don't provide an embedded version of ISA 2.07.  */
 #define ISA_2_7_MASKS_SERVER	(ISA_2_6_MASKS_SERVER			\
@@ -54,7 +55,8 @@
 				 | OPTION_MASK_DIRECT_MOVE		\
 				 | OPTION_MASK_HTM			\
 				 | OPTION_MASK_QUAD_MEMORY		\
-  				 | OPTION_MASK_QUAD_MEMORY_ATOMIC)
+  				 | OPTION_MASK_QUAD_MEMORY_ATOMIC	\
+				 | OPTION_MASK_UPPER_REGS_SF)
 
 #define POWERPC_7400_MASK	(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_ALTIVEC)
 
@@ -94,6 +96,8 @@
 				 | OPTION_MASK_RECIP_PRECISION		\
 				 | OPTION_MASK_SOFT_FLOAT		\
 				 | OPTION_MASK_STRICT_ALIGN_OPTIONAL	\
+				 | OPTION_MASK_UPPER_REGS_DF		\
+				 | OPTION_MASK_UPPER_REGS_SF		\
 				 | OPTION_MASK_VSX			\
 				 | OPTION_MASK_VSX_TIMODE)
 
@@ -184,7 +188,7 @@ RS6000_CPU ("power6x", PROCESSOR_POWER6,
 RS6000_CPU ("power7", PROCESSOR_POWER7,   /* Don't add MASK_ISEL by default */
 	    POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
 	    | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
-	    | MASK_VSX | MASK_RECIP_PRECISION)
+	    | MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF)
 RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER)
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
 RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, MASK_PPC_GFXOPT | MASK_POWERPC64)
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(revision 217448)
+++ gcc/config/rs6000/rs6000.opt	(working copy)
@@ -589,6 +589,10 @@ mupper-regs-sf
 Target Report Mask(UPPER_REGS_SF) Var(rs6000_isa_flags)
 Allow float variables in upper registers with -mcpu=power8 or -mpower8-vector
 
+mupper-regs
+Target Report Var(TARGET_UPPER_REGS) Init(-1) Save
+Allow float/double variables in upper registers if cpu allows it
+
 moptimize-swaps
 Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save
 Analyze and remove doubleword swaps from VSX computations.
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 217448)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -2462,9 +2462,7 @@ rs6000_setup_reg_addr_masks (void)
 	      /* Figure out if we can do PRE_INC, PRE_DEC, or PRE_MODIFY
 		 addressing.  Restrict addressing on SPE for 64-bit types
 		 because of the SUBREG hackery used to address 64-bit floats in
-		 '32-bit' GPRs.  To simplify secondary reload, don't allow
-		 update forms on scalar floating point types that can go in the
-		 upper registers.  */
+		 '32-bit' GPRs.  */
 
 	      if (TARGET_UPDATE
 		  && (rc == RELOAD_REG_GPR || rc == RELOAD_REG_FPR)
@@ -2472,8 +2470,7 @@ rs6000_setup_reg_addr_masks (void)
 		  && !VECTOR_MODE_P (m2)
 		  && !COMPLEX_MODE_P (m2)
 		  && !indexed_only_p
-		  && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m2) == 8)
-		  && !reg_addr[m2].scalar_in_vmx_p)
+		  && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m2) == 8))
 		{
 		  addr_mask |= RELOAD_REG_PRE_INCDEC;
 
@@ -3509,6 +3506,40 @@ rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_DFP;
     }
 
+  /* Allow an explicit -mupper-regs to set both -mupper-regs-df and
+     -mupper-regs-sf, depending on the cpu, unless the user explicitly also set
+     the individual option.  */
+  if (TARGET_UPPER_REGS > 0)
+    {
+      if (TARGET_VSX
+	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF))
+	{
+	  rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DF;
+	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
+	}
+      if (TARGET_P8_VECTOR
+	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
+	{
+	  rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_SF;
+	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_SF;
+	}
+    }
+  else if (TARGET_UPPER_REGS == 0)
+    {
+      if (TARGET_VSX
+	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF))
+	{
+	  rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
+	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
+	}
+      if (TARGET_P8_VECTOR
+	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
+	{
+	  rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_SF;
+	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_SF;
+	}
+    }
+
   if (TARGET_UPPER_REGS_DF && !TARGET_VSX)
     {
       if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF)
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 217448)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -8137,6 +8137,21 @@ (define_insn_and_split "*mov<mode>_softf
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
   [(set_attr "length" "20,20,16")])
 
+;; If we are using -ffast-math, easy_fp_constant assumes all constants are
+;; 'easy' in order to allow for reciprocal estimation.  Make sure the constant
+;; is in the constant pool before reload occurs.  This simplifies accessing
+;; scalars in the traditional Altivec registers.
+
+(define_split
+  [(set (match_operand:SFDF 0 "register_operand" "")
+	(match_operand:SFDF 1 "memory_fp_constant" ""))]
+  "TARGET_<MODE>_FPR && flag_unsafe_math_optimizations
+   && !reload_in_progress && !reload_completed && !lra_in_progress"
+  [(set (match_dup 0) (match_dup 2))]
+{
+  operands[2] = validize_mem (force_const_mem (<MODE>mode, operands[1]));
+})
+
 (define_expand "extenddftf2"
   [(set (match_operand:TF 0 "nonimmediate_operand" "")
 	(float_extend:TF (match_operand:DF 1 "input_operand" "")))]
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 217448)
+++ gcc/doc/invoke.texi	(working copy)
@@ -940,7 +940,8 @@ See RS/6000 and PowerPC Options.
 -mquad-memory -mno-quad-memory @gol
 -mquad-memory-atomic -mno-quad-memory-atomic @gol
 -mcompat-align-parm -mno-compat-align-parm @gol
--mupper-regs-df -mno-upper-regs-df -mupper-regs-sf -mno-upper-regs-sf}
+-mupper-regs-df -mno-upper-regs-df -mupper-regs-sf -mno-upper-regs-sf @gol
+-mupper-regs -mno-upper-regs}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles  -m32bit-doubles  -fpu  -nofpu@gol
@@ -19691,10 +19692,9 @@ instructions.  The @option{-mquad-memory
 Generate code that uses (does not use) the scalar double precision
 instructions that target all 64 registers in the vector/scalar
 floating point register set that were added in version 2.06 of the
-PowerPC ISA.  If @option{-mupper-regs-df} is not set, the traditional
-floating instructions will be generated that target the first 32
-registers.  This option requires the @option{-mvsx},
-@option{-mcpu=power7}, or @option{-mcpu=power8} options to be set.
+PowerPC ISA.  The @option{-mupper-regs-df} option is turned on by default if
+you use any of the @option{-mcpu=power7}, @option{-mcpu=power8}, or
+@option{-mvsx} options.
 
 @item -mupper-regs-sf
 @itemx -mno-upper-regs-sf
@@ -19703,10 +19703,20 @@ registers.  This option requires the @op
 Generate code that uses (does not use) the scalar single precision
 instructions that target all 64 registers in the vector/scalar
 floating point register set that were added in version 2.07 of the
-PowerPC ISA.  If @option{-mupper-regs-sf} is not set, the traditional
-floating instructions will be generated that target the first 32
-registers.  This option requires the @option{-mpower8-vector},
-@option{-mcpu=power7}, or @option{-mcpu=power8} options to be set.
+PowerPC ISA.  The @option{-mupper-regs-sf} option is turned on by default if
+you use either of the @option{-mcpu=power8} or @option{-mpower8-vector}
+options.
+
+@item -mupper-regs
+@itemx -mno-upper-regs
+@opindex mupper-regs
+@opindex mno-upper-regs
+Generate code that uses (does not use) the scalar
+instructions that target all 64 registers in the vector/scalar
+floating point register set, depending on the model of the machine.
+
+If the @option{-mno-upper-regs} option is used, it turns off both the
+@option{-mupper-regs-sf} and @option{-mupper-regs-df} options.
 
 @item -mfloat-gprs=@var{yes/single/double/no}
 @itemx -mfloat-gprs

--J2SCkAp4GZ/dPZZf--