* [PATCH] PowerPC merge TD/TF moves
From: Michael Meissner @ 2013-01-30 22:50 UTC
To: gcc-patches, dje.gcc
[-- Attachment #1: Type: text/plain, Size: 2021 bytes --]
This patch, like the previous 2 patches, combines the decimal and binary
floating point moves, this time for 128-bit floating point.
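As a rough illustration of what merging the TF and TD patterns under one mode iterator means, the sketch below models the expansion in plain C. It is not GCC code: expand_name and combine_condition are hypothetical helpers, and conditions are treated as strings only to show that each iterator entry yields one pattern name and that the entry's condition is ANDed with the pattern's own condition.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: replace the first "<mode>" in TMPL with the
   lower-cased MODE name.  Returns OUT, or NULL if the result does not
   fit in OUT_LEN bytes.  */
static char *
expand_name (const char *tmpl, const char *mode, char *out, size_t out_len)
{
  assert (tmpl != NULL && mode != NULL && out != NULL);

  const char *marker = strstr (tmpl, "<mode>");
  if (marker == NULL)
    {
      /* Nothing to substitute; copy the template unchanged.  */
      if (strlen (tmpl) >= out_len)
        return NULL;
      strcpy (out, tmpl);
      return out;
    }

  size_t prefix = (size_t) (marker - tmpl);
  const char *suffix = marker + strlen ("<mode>");
  if (prefix + strlen (mode) + strlen (suffix) >= out_len)
    return NULL;

  memcpy (out, tmpl, prefix);
  size_t i;
  for (i = 0; mode[i] != '\0'; i++)
    out[prefix + i] = (char) tolower ((unsigned char) mode[i]);
  strcpy (out + prefix + i, suffix);
  return out;
}

/* Hypothetical helper: AND the iterator entry's condition with the
   pattern's condition.  An empty pattern condition leaves only the
   iterator condition.  Returns OUT, or NULL on truncation.  */
static char *
combine_condition (const char *iter_cond, const char *insn_cond,
                   char *out, size_t out_len)
{
  assert (iter_cond != NULL && insn_cond != NULL && out != NULL);

  int n;
  if (insn_cond[0] == '\0')
    n = snprintf (out, out_len, "%s", iter_cond);
  else
    n = snprintf (out, out_len, "(%s) && (%s)", iter_cond, insn_cond);
  return (n < 0 || (size_t) n >= out_len) ? NULL : out;
}
```

With the FMOVE128 entries from the patch, expand_name ("mov<mode>_internal", "TD", ...) gives "movtd_internal", and because the merged expander has an empty condition, its effective condition is just the one attached to each mode in the iterator.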
In doing this patch, I discovered that the previous patch left out the code
that enables the wg constraint, which is what lets -mcpu=power6x use the
direct move instructions. So I added that code in this patch, and also created
a test to make sure that direct moves keep being generated in the future.
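The effect of the missing wg setup can be sketched with a small stand-alone model. This is only an illustration, not the rs6000.c code: the enums, struct target_flags and init_constraints are simplified, hypothetical stand-ins, and the sketch assumes every constraint starts out as NO_REGS, so a constraint that is never enabled matches no register and the alternatives that use it are never chosen.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the register classes and constraints.  */
enum reg_class { NO_REGS, FLOAT_REGS, ALTIVEC_REGS };
enum constraint { CONSTRAINT_v, CONSTRAINT_wg, CONSTRAINT_wl, CONSTRAINT_MAX };

/* Hypothetical bundle of the target flags consulted below.  */
struct target_flags
{
  bool altivec;   /* models TARGET_ALTIVEC */
  bool mfpgpr;    /* models TARGET_MFPGPR */
  bool lfiwax;    /* models TARGET_LFIWAX */
};

/* Fill TABLE so that each constraint maps to a usable register class
   only when the matching target feature is on.  */
static void
init_constraints (const struct target_flags *flags,
                  enum reg_class table[CONSTRAINT_MAX])
{
  for (int c = 0; c < CONSTRAINT_MAX; c++)
    table[c] = NO_REGS;

  if (flags->altivec)
    table[CONSTRAINT_v] = ALTIVEC_REGS;

  /* The step that was missing: without it, wg stays NO_REGS and the
     direct move alternatives cannot match.  */
  if (flags->mfpgpr)
    table[CONSTRAINT_wg] = FLOAT_REGS;

  if (flags->lfiwax)
    table[CONSTRAINT_wl] = FLOAT_REGS;
}

/* Look up the register class for constraint C.  */
static enum reg_class
constraint_class (enum reg_class table[CONSTRAINT_MAX], enum constraint c)
{
  assert (c < CONSTRAINT_MAX);
  return table[c];
}
```

In this model, calling init_constraints with mfpgpr set makes constraint_class (table, CONSTRAINT_wg) return FLOAT_REGS; with it clear, the lookup returns NO_REGS.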
I also added the reload helper for DDmode to rs6000_vector_reload that was
missed in the last patch. The omission was harmless, since that helper is only
used with an undocumented debug switch. Hopefully sometime in the future, I
will allow scalar floating point to be loaded in the upper 32 VSX registers
that are overlaid over the Altivec registers.
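A minimal sketch of the reload helper table follows, under stated assumptions. It models rs6000_vector_reload as a table indexed by mode and by direction (0 for store, 1 for load), as in the patch. The is_64bit and scalar_reload_enabled parameters are hypothetical: the first assumes the di and si helpers correspond to 64-bit and 32-bit addresses, and the second stands in for the undocumented debug switch. The insn_code enum only mirrors the CODE_FOR_* names that appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>

enum fp_mode { DFmode, DDmode, NUM_FP_MODES };
enum direction { RELOAD_STORE = 0, RELOAD_LOAD = 1 };

/* Stand-ins for the generated insn codes named in the patch.  */
enum insn_code
{
  CODE_FOR_nothing,
  CODE_FOR_reload_df_di_store, CODE_FOR_reload_df_di_load,
  CODE_FOR_reload_dd_di_store, CODE_FOR_reload_dd_di_load,
  CODE_FOR_reload_df_si_store, CODE_FOR_reload_df_si_load,
  CODE_FOR_reload_dd_si_store, CODE_FOR_reload_dd_si_load
};

/* Register the scalar reload helpers.  DDmode gets the same treatment
   as DFmode; before the fix only the DFmode rows were filled in.  */
static void
init_scalar_reload (bool is_64bit, bool scalar_reload_enabled,
                    enum insn_code table[NUM_FP_MODES][2])
{
  for (int m = 0; m < NUM_FP_MODES; m++)
    table[m][RELOAD_STORE] = table[m][RELOAD_LOAD] = CODE_FOR_nothing;

  if (!scalar_reload_enabled)
    return;

  if (is_64bit)
    {
      table[DFmode][RELOAD_STORE] = CODE_FOR_reload_df_di_store;
      table[DFmode][RELOAD_LOAD] = CODE_FOR_reload_df_di_load;
      table[DDmode][RELOAD_STORE] = CODE_FOR_reload_dd_di_store;
      table[DDmode][RELOAD_LOAD] = CODE_FOR_reload_dd_di_load;
    }
  else
    {
      table[DFmode][RELOAD_STORE] = CODE_FOR_reload_df_si_store;
      table[DFmode][RELOAD_LOAD] = CODE_FOR_reload_df_si_load;
      table[DDmode][RELOAD_STORE] = CODE_FOR_reload_dd_si_store;
      table[DDmode][RELOAD_LOAD] = CODE_FOR_reload_dd_si_load;
    }
}

/* Return the helper for MODE and DIR, or CODE_FOR_nothing if none.  */
static enum insn_code
reload_helper (enum insn_code table[NUM_FP_MODES][2],
               enum fp_mode mode, enum direction dir)
{
  assert (mode < NUM_FP_MODES);
  return table[mode][dir];
}
```

When scalar_reload_enabled is false, every lookup returns CODE_FOR_nothing, which is consistent with the missing DDmode entries having been harmless in normal builds.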
Like the previous 2 patches, I've bootstrapped this and run make check with no
regressions. Is it ok to apply when GCC 4.9 opens up?
I have one more patch in this insn-combining series to post, which combines
movdi on systems with normal floating point and with the power6 direct move
instructions.
[gcc]
2013-01-30 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/rs6000.c (rs6000_debug_reg_global): Print out wg
constraint if -mdebug=reg.
(rs6000_init_hard_regno_mode_ok): Enable wg constraint if
-mmfpgpr. Enable using dd reload support if needed.
* config/rs6000/dfp.md (movtd): Delete, combine with 128-bit
binary and decimal floating point moves in rs6000.md.
(movtd_internal): Likewise.
* config/rs6000/rs6000.md (FMOVE128): Combine 128-bit binary and
decimal floating point moves.
(movtf): Likewise.
(movtf_internal): Likewise.
(mov<mode>_internal, TDmode/TFmode): Likewise.
(movtf_softfloat): Likewise.
(mov<mode>_softfloat, TDmode/TFmode): Likewise.
[gcc/testsuite]
2013-01-30 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/mmfpgpr.c: New test.
--
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meissner@linux.vnet.ibm.com fax +1 (978) 399-6899
[-- Attachment #2: gcc-power7.patch387b --]
[-- Type: text/plain, Size: 7117 bytes --]
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 195586)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -1737,6 +1737,7 @@ rs6000_debug_reg_global (void)
"wa reg_class = %s\n"
"wd reg_class = %s\n"
"wf reg_class = %s\n"
+ "wg reg_class = %s\n"
"wl reg_class = %s\n"
"ws reg_class = %s\n"
"wx reg_class = %s\n"
@@ -1748,6 +1749,7 @@ rs6000_debug_reg_global (void)
reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wa]],
reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wd]],
reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wf]],
+ reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wg]],
reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wl]],
reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ws]],
reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wx]],
@@ -2120,6 +2122,9 @@ rs6000_init_hard_regno_mode_ok (bool glo
if (TARGET_ALTIVEC)
rs6000_constraints[RS6000_CONSTRAINT_v] = ALTIVEC_REGS;
+ if (TARGET_MFPGPR)
+ rs6000_constraints[RS6000_CONSTRAINT_wg] = FLOAT_REGS;
+
if (TARGET_LFIWAX)
rs6000_constraints[RS6000_CONSTRAINT_wl] = FLOAT_REGS;
@@ -2150,6 +2155,8 @@ rs6000_init_hard_regno_mode_ok (bool glo
{
rs6000_vector_reload[DFmode][0] = CODE_FOR_reload_df_di_store;
rs6000_vector_reload[DFmode][1] = CODE_FOR_reload_df_di_load;
+ rs6000_vector_reload[DDmode][0] = CODE_FOR_reload_dd_di_store;
+ rs6000_vector_reload[DDmode][1] = CODE_FOR_reload_dd_di_load;
}
}
else
@@ -2170,6 +2177,8 @@ rs6000_init_hard_regno_mode_ok (bool glo
{
rs6000_vector_reload[DFmode][0] = CODE_FOR_reload_df_si_store;
rs6000_vector_reload[DFmode][1] = CODE_FOR_reload_df_si_load;
+ rs6000_vector_reload[DDmode][0] = CODE_FOR_reload_dd_si_store;
+ rs6000_vector_reload[DDmode][1] = CODE_FOR_reload_dd_si_load;
}
}
}
Index: gcc/config/rs6000/dfp.md
===================================================================
--- gcc/config/rs6000/dfp.md (revision 195590)
+++ gcc/config/rs6000/dfp.md (working copy)
@@ -144,27 +144,6 @@ (define_insn "*nabstd2_fpr"
"fnabs %0,%1"
[(set_attr "type" "fp")])
-(define_expand "movtd"
- [(set (match_operand:TD 0 "general_operand" "")
- (match_operand:TD 1 "any_operand" ""))]
- "TARGET_HARD_FLOAT && TARGET_FPRS"
- "{ rs6000_emit_move (operands[0], operands[1], TDmode); DONE; }")
-
-; It's important to list the Y->r and r->Y moves before r->r because
-; otherwise reload, given m->r, will try to pick r->r and reload it,
-; which doesn't make progress.
-(define_insn_and_split "*movtd_internal"
- [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
- (match_operand:TD 1 "input_operand" "d,m,d,r,YGHF,r"))]
- "TARGET_HARD_FLOAT && TARGET_FPRS
- && (gpc_reg_operand (operands[0], TDmode)
- || gpc_reg_operand (operands[1], TDmode))"
- "#"
- "&& reload_completed"
- [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
- [(set_attr "length" "8,8,8,20,20,16")])
-
;; Hardware support for decimal floating point operations.
(define_insn "extendddtd2"
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 195590)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -257,6 +257,8 @@ (define_mode_iterator FMA_F [
(define_mode_iterator FMOVE32 [SF SD])
(define_mode_iterator FMOVE64 [DF DD])
(define_mode_iterator FMOVE64X [DI DF DD])
+(define_mode_iterator FMOVE128 [(TF "!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128")
+ (TD "TARGET_HARD_FLOAT && TARGET_FPRS")])
; Whether a floating point move is ok, don't allow SD without hardware FP
(define_mode_attr fmove_ok [(SF "")
@@ -8148,35 +8150,33 @@ (define_insn "*mov<mode>_softfloat64"
[(set_attr "type" "store,load,*,mtjmpr,mfjmpr,*,*,*,*")
(set_attr "length" "4,4,4,4,4,8,12,16,4")])
\f
-(define_expand "movtf"
- [(set (match_operand:TF 0 "general_operand" "")
- (match_operand:TF 1 "any_operand" ""))]
- "!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128"
- "{ rs6000_emit_move (operands[0], operands[1], TFmode); DONE; }")
+(define_expand "mov<mode>"
+ [(set (match_operand:FMOVE128 0 "general_operand" "")
+ (match_operand:FMOVE128 1 "any_operand" ""))]
+ ""
+ "{ rs6000_emit_move (operands[0], operands[1], <MODE>mode); DONE; }")
;; It's important to list Y->r and r->Y before r->r because otherwise
;; reload, given m->r, will try to pick r->r and reload it, which
;; doesn't make progress.
-(define_insn_and_split "*movtf_internal"
- [(set (match_operand:TF 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
- (match_operand:TF 1 "input_operand" "d,m,d,r,YGHF,r"))]
- "!TARGET_IEEEQUAD
- && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128
- && (gpc_reg_operand (operands[0], TFmode)
- || gpc_reg_operand (operands[1], TFmode))"
+(define_insn_and_split "*mov<mode>_internal"
+ [(set (match_operand:FMOVE128 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
+ (match_operand:FMOVE128 1 "input_operand" "d,m,d,r,YGHF,r"))]
+ "TARGET_HARD_FLOAT && TARGET_FPRS
+ && (gpc_reg_operand (operands[0], <MODE>mode)
+ || gpc_reg_operand (operands[1], <MODE>mode))"
"#"
"&& reload_completed"
[(pc)]
{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
[(set_attr "length" "8,8,8,20,20,16")])
-(define_insn_and_split "*movtf_softfloat"
- [(set (match_operand:TF 0 "rs6000_nonimmediate_operand" "=Y,r,r")
- (match_operand:TF 1 "input_operand" "r,YGHF,r"))]
- "!TARGET_IEEEQUAD
- && (TARGET_SOFT_FLOAT || !TARGET_FPRS) && TARGET_LONG_DOUBLE_128
- && (gpc_reg_operand (operands[0], TFmode)
- || gpc_reg_operand (operands[1], TFmode))"
+(define_insn_and_split "*mov<mode>_softfloat"
+ [(set (match_operand:FMOVE128 0 "rs6000_nonimmediate_operand" "=Y,r,r")
+ (match_operand:FMOVE128 1 "input_operand" "r,YGHF,r"))]
+ "(TARGET_SOFT_FLOAT || !TARGET_FPRS)
+ && (gpc_reg_operand (operands[0], <MODE>mode)
+ || gpc_reg_operand (operands[1], <MODE>mode))"
"#"
"&& reload_completed"
[(pc)]
Index: gcc/testsuite/gcc.target/powerpc/mmfpgpr.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/mmfpgpr.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/mmfpgpr.c (revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mcpu=power6x -mmfpgpr" } */
+/* { dg-final { scan-assembler "mffgpr" } } */
+/* { dg-final { scan-assembler "mftgpr" } } */
+
+/* Test that we generate the instructions to move between the GPR and FPR
+ registers under power6x. */
+
+extern long return_long (void);
+extern double return_double (void);
+
+double return_double2 (void)
+{
+ return (double) return_long ();
+}
+
+long return_long2 (void)
+{
+ return (long) return_double ();
+}
* Re: [PATCH] PowerPC merge TD/TF moves
From: David Edelsohn @ 2013-02-05 18:46 UTC
To: Michael Meissner, gcc-patches
This patch is okay after the 4.9 tree opens. Again, please confirm that
it works on pre-POWER7 systems.
Thanks, David
* Re: [PATCH] PowerPC merge TD/TF moves
From: David Edelsohn @ 2013-03-08 1:45 UTC
To: Michael Meissner, gcc-patches
Mike,
Which of these sets of patches adjusts and updates
rs6000_register_move_cost for -mfpgpr and for VSRs and FPRs sharing
the same register file?
Thanks, David
* Re: [PATCH] PowerPC merge TD/TF moves
From: Michael Meissner @ 2013-03-11 19:05 UTC
To: David Edelsohn; +Cc: Michael Meissner, gcc-patches
On Thu, Mar 07, 2013 at 08:45:10PM -0500, David Edelsohn wrote:
> Mike,
>
> Which of these sets of patches adjusts and updates
> rs6000_register_move_cost for -mfpgpr and for VSRs and FPRs sharing
> the same register file?
None of these patches adjust register_move_cost.
--
Michael Meissner, IBM
Now: M/S 2757, 5 Technology Place Drive, Westford, MA 01886-3141, USA
March 20: M/S 2506R, 550 King Street, Littleton, MA 01460, USA
meissner@linux.vnet.ibm.com fax +1 (978) 399-6899