* [PATCH] Add support for sparc VIS3 fp<-->int moves.
@ 2011-10-24 5:01 David Miller
2011-10-24 21:29 ` Richard Henderson
0 siblings, 1 reply; 3+ messages in thread
From: David Miller @ 2011-10-24 5:01 UTC (permalink / raw)
To: gcc-patches; +Cc: ebotcazou, rth
The non-trivial aspects (and what took the most time for me) of these
changes are:
1) Getting the register move costs and class preferencing right such
that the VIS3 moves do get effectively used for incoming
float/vector argument passing on 32-bit, yet IRA and reload don't
go nuts allocating integer registers to float/vector mode values
and vice versa.
Non-optimized compiles are particularly sensitive to this because
there are simply a lot of moves that don't get cleaned up. We might
have six moves, three on each side of a single real calculation, so
the register classes of the moves dominate the IRA costs.
2) Making sure we don't merge a VIS3 move into a restore instruction.
3) Dealing with the restriction that we can't operate on 32-bit pieces
of values contained in the upper 32 v9 float registers.
We deal with this in two ways.
First, we indicate an FP_REGS or GENERAL_OR_FP_REGS preferred
reload class when we see reload try to load an integer register
into class EXTRA_FP_REGS or GENERAL_OR_EXTRA_FP_REGS.
Second, we teach reload that if it tries to move between float and
integer regs, and a register class involving EXTRA_FP_REGS is in
play, an intermediate FP_REGS class register may be needed to
successfully complete the reload.
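The two-step reload handling in point 3 can be sketched as a pair of
hooks. This is a hypothetical, simplified model, not the real sparc.c
code (the real hooks take rtx/reg_class_t arguments, consult
true_regnum, and handle both move directions):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified register-class model: just enough to show
   the two-step strategy.  Real SPARC register classes carry more
   members and the real hooks see rtx operands, not a flag.  */
enum rclass { NO_REGS, GENERAL_REGS, FP_REGS, EXTRA_FP_REGS,
              GENERAL_OR_FP_REGS, GENERAL_OR_EXTRA_FP_REGS };

/* Step 1: steer reload away from the upper float regs when the value
   currently lives in an integer register (32-bit with VIS3 only).  */
static enum rclass
preferred_reload_class (bool value_in_int_reg, enum rclass rc)
{
  if (value_in_int_reg && rc == EXTRA_FP_REGS)
    return FP_REGS;
  if (value_in_int_reg && rc == GENERAL_OR_EXTRA_FP_REGS)
    return GENERAL_OR_FP_REGS;
  return rc;
}

/* Step 2: if an EXTRA_FP_REGS<->int move survives anyway, request an
   FP_REGS intermediate and report an extra cost of 2.  */
static enum rclass
secondary_reload (bool value_in_int_reg, enum rclass rc, int *extra_cost)
{
  *extra_cost = 0;
  if (value_in_int_reg && rc == EXTRA_FP_REGS)
    {
      *extra_cost = 2;
      return FP_REGS;   /* intermediate move through a lower float reg */
    }
  return NO_REGS;       /* no secondary reload needed */
}
```

The class names mirror the ones in the patch, but the regno test is
collapsed into a boolean for illustration.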
The rest is mostly mechanical work of splitting the existing v9/64-bit
move patterns into non-vis3 and vis3 variants.
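The cost tuning in point 1 boils down to one branch: a VIS3 fp<-->int
move is cheap (one instruction, or two when an 8-byte value must move
in 4-byte pieces on 32-bit), while anything else still pays for a trip
through memory. A simplified standalone model of that branch follows;
the memory fallback cost of 8 is a placeholder, since the real
sparc_register_move_cost picks a CPU-specific value there:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the VIS3 branch added to
   sparc_register_move_cost.  Cost 4 models a single
   movstouw/movwtos/movdtox/movxtod; cost 6 models the two 32-bit
   moves needed for an 8-byte value on 32-bit.  */
static int
fp_int_move_cost (bool arch32, int mode_size, bool vis3)
{
  if (vis3 && (mode_size == 4 || mode_size == 8))
    {
      if (!arch32 || mode_size == 4)
        return 4;           /* one direct fp<->int move */
      return 6;             /* two 32-bit moves on 32-bit */
    }
  return 8;                 /* placeholder: spill through memory */
}
```

Keeping the VIS3 costs below the memory cost, but above plain
int<->int moves, is what lets IRA use the new instructions without
scattering values across the wrong register file.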
Because of how float arguments are passed on 32-bit, these instructions
help a lot. This is evident even in the simplest examples. This C code:
float fnegs (float a) { return -a; }
double fnegd (double a) { return -a; }
would generate:
fnegs:
add %sp, -104, %sp
st %o0, [%sp+100]
ld [%sp+100], %f8
sub %sp, -104, %sp
jmp %o7+8
fnegs %f8, %f0
fnegd:
add %sp, -104, %sp
std %o0, [%sp+96]
ldd [%sp+96], %f8
sub %sp, -104, %sp
jmp %o7+8
fnegd %f8, %f0
but with VIS3 moves we get:
fnegs:
movwtos %o0, %f8
jmp %o7+8
fnegs %f8, %f0
fnegd:
movwtos %o0, %f8
movwtos %o1, %f9
jmp %o7+8
fnegd %f8, %f0
And with our good friend pdist.c we get the following code for
function 'foo' with VIS3 moves:
foo:
fzero %f8
movwtos %o0, %f10
movwtos %o1, %f11
movwtos %o2, %f12
movwtos %o3, %f13
pdist %f10, %f12, %f8
movstouw %f8, %o0
jmp %o7+8
movstouw %f9, %o1
Another good example of significantly improved code generation can be
found in the output of libgcc2.c:_mulsc3().
Of course, sometimes we generate spurious secondary reloads because
the use of the EXTRA_FP_REGS (and GENERAL_OR_EXTRA_FP_REGS) register
class doesn't necessarily result in using one of the upper 32 v9 float
registers. Maybe if we used segregated register classes for the lower
and upper float regs we could attack this issue effectively.
In the future, these VIS3 patterns can also be used for more crafty
constant and non-constant vec_init sequences.
This was regstrapped both with the compiler defaulting to vis3, and
without.
Committed to trunk.
gcc/
* config/sparc/sparc.h (SECONDARY_MEMORY_NEEDED): We can move
between float and non-float regs when VIS3.
* config/sparc/sparc.c (eligible_for_restore_insn): We can't
use a restore when the source is a float register.
(sparc_split_regreg_legitimate): When VIS3 allow moves between
float and integer regs.
(sparc_register_move_cost): Adjust to account for VIS3 moves.
(sparc_preferred_reload_class): On 32-bit with VIS3 when moving an
integer reg to a class containing EXTRA_FP_REGS, constrain to
FP_REGS.
(sparc_secondary_reload): On 32-bit with VIS3 when moving between
float and integer regs we sometimes need a FP_REGS class
intermediate move to satisfy the reload. When this happens
specify an extra cost of 2.
(*movsi_insn): Rename to have "_novis3" suffix and add !VIS3
guard.
(*movdi_insn_sp32_v9): Likewise.
(*movdi_insn_sp64): Likewise.
(*movsf_insn): Likewise.
(*movdf_insn_sp32_v9): Likewise.
(*movdf_insn_sp64): Likewise.
(*zero_extendsidi2_insn_sp64): Likewise.
(*sign_extendsidi2_insn): Likewise.
(*movsi_insn_vis3): New insn.
(*movdi_insn_sp32_v9_vis3): New insn.
(*movdi_insn_sp64_vis3): New insn.
(*movsf_insn_vis3): New insn.
(*movdf_insn_sp32_v9_vis3): New insn.
(*movdf_insn_sp64_vis3): New insn.
(*zero_extendsidi2_insn_sp64_vis3): New insn.
(*sign_extendsidi2_insn_vis3): New insn.
(TFmode reg/reg split): Make sure both REG operands are float.
(*mov<VM32:mode>_insn): Add "_novis3" suffix and !VIS3 guard. Remove
easy constant to integer reg alternatives.
(*mov<VM64:mode>_insn_sp64): Likewise.
(*mov<VM64:mode>_insn_sp32_novis3): Likewise.
(*mov<VM32:mode>_insn_vis3): New insn.
(*mov<VM64:mode>_insn_sp64_vis3): New insn.
(*mov<VM64:mode>_insn_sp32_vis3): New insn.
(VM64 reg<-->reg split): New splitter for 32-bit.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180360 138bc75d-0d04-0410-961f-82ee72b054a4
---
gcc/ChangeLog | 41 +++++
gcc/config/sparc/sparc.c | 85 ++++++++++-
gcc/config/sparc/sparc.h | 9 +-
gcc/config/sparc/sparc.md | 375 +++++++++++++++++++++++++++++++++++++++++----
4 files changed, 469 insertions(+), 41 deletions(-)
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index dfa4caf..1842402 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,46 @@
2011-10-23 David S. Miller <davem@davemloft.net>
+ * config/sparc/sparc.h (SECONDARY_MEMORY_NEEDED): We can move
+ between float and non-float regs when VIS3.
+ * config/sparc/sparc.c (eligible_for_restore_insn): We can't
+ use a restore when the source is a float register.
+ (sparc_split_regreg_legitimate): When VIS3 allow moves between
+ float and integer regs.
+ (sparc_register_move_cost): Adjust to account for VIS3 moves.
+ (sparc_preferred_reload_class): On 32-bit with VIS3 when moving an
+ integer reg to a class containing EXTRA_FP_REGS, constrain to
+ FP_REGS.
+ (sparc_secondary_reload): On 32-bit with VIS3 when moving between
+ float and integer regs we sometimes need a FP_REGS class
+ intermediate move to satisfy the reload. When this happens
+ specify an extra cost of 2.
+ (*movsi_insn): Rename to have "_novis3" suffix and add !VIS3
+ guard.
+ (*movdi_insn_sp32_v9): Likewise.
+ (*movdi_insn_sp64): Likewise.
+ (*movsf_insn): Likewise.
+ (*movdf_insn_sp32_v9): Likewise.
+ (*movdf_insn_sp64): Likewise.
+ (*zero_extendsidi2_insn_sp64): Likewise.
+ (*sign_extendsidi2_insn): Likewise.
+ (*movsi_insn_vis3): New insn.
+ (*movdi_insn_sp32_v9_vis3): New insn.
+ (*movdi_insn_sp64_vis3): New insn.
+ (*movsf_insn_vis3): New insn.
+ (*movdf_insn_sp32_v9_vis3): New insn.
+ (*movdf_insn_sp64_vis3): New insn.
+ (*zero_extendsidi2_insn_sp64_vis3): New insn.
+ (*sign_extendsidi2_insn_vis3): New insn.
+ (TFmode reg/reg split): Make sure both REG operands are float.
+ (*mov<VM32:mode>_insn): Add "_novis3" suffix and !VIS3 guard. Remove
+ easy constant to integer reg alternatives.
+ (*mov<VM64:mode>_insn_sp64): Likewise.
+ (*mov<VM64:mode>_insn_sp32_novis3): Likewise.
+ (*mov<VM32:mode>_insn_vis3): New insn.
+ (*mov<VM64:mode>_insn_sp64_vis3): New insn.
+ (*mov<VM64:mode>_insn_sp32_vis3): New insn.
+ (VM64 reg<-->reg split): New splitter for 32-bit.
+
* config/sparc/sparc.c (sparc_split_regreg_legitimate): New
function.
* config/sparc/sparc-protos.h (sparc_split_regreg_legitimate):
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 29d2847..79bb821 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -2996,10 +2996,23 @@ eligible_for_restore_insn (rtx trial, bool return_p)
{
rtx pat = PATTERN (trial);
rtx src = SET_SRC (pat);
+ bool src_is_freg = false;
+ rtx src_reg;
+
+ /* Since we now can do moves between float and integer registers when
+ VIS3 is enabled, we have to catch this case. We can allow such
+ moves when doing a 'return' however. */
+ src_reg = src;
+ if (GET_CODE (src_reg) == SUBREG)
+ src_reg = SUBREG_REG (src_reg);
+ if (GET_CODE (src_reg) == REG
+ && SPARC_FP_REG_P (REGNO (src_reg)))
+ src_is_freg = true;
/* The 'restore src,%g0,dest' pattern for word mode and below. */
if (GET_MODE_CLASS (GET_MODE (src)) != MODE_FLOAT
- && arith_operand (src, GET_MODE (src)))
+ && arith_operand (src, GET_MODE (src))
+ && ! src_is_freg)
{
if (TARGET_ARCH64)
return GET_MODE_SIZE (GET_MODE (src)) <= GET_MODE_SIZE (DImode);
@@ -3009,7 +3022,8 @@ eligible_for_restore_insn (rtx trial, bool return_p)
/* The 'restore src,%g0,dest' pattern for double-word mode. */
else if (GET_MODE_CLASS (GET_MODE (src)) != MODE_FLOAT
- && arith_double_operand (src, GET_MODE (src)))
+ && arith_double_operand (src, GET_MODE (src))
+ && ! src_is_freg)
return GET_MODE_SIZE (GET_MODE (src)) <= GET_MODE_SIZE (DImode);
/* The 'restore src,%g0,dest' pattern for float if no FPU. */
@@ -7784,6 +7798,13 @@ sparc_split_regreg_legitimate (rtx reg1, rtx reg2)
if (SPARC_INT_REG_P (regno1) && SPARC_INT_REG_P (regno2))
return 1;
+ if (TARGET_VIS3)
+ {
+ if ((SPARC_INT_REG_P (regno1) && SPARC_FP_REG_P (regno2))
+ || (SPARC_FP_REG_P (regno1) && SPARC_INT_REG_P (regno2)))
+ return 1;
+ }
+
return 0;
}
@@ -10302,10 +10323,28 @@ static int
sparc_register_move_cost (enum machine_mode mode ATTRIBUTE_UNUSED,
reg_class_t from, reg_class_t to)
{
- if ((FP_REG_CLASS_P (from) && general_or_i64_p (to))
- || (general_or_i64_p (from) && FP_REG_CLASS_P (to))
- || from == FPCC_REGS
- || to == FPCC_REGS)
+ bool need_memory = false;
+
+ if (from == FPCC_REGS || to == FPCC_REGS)
+ need_memory = true;
+ else if ((FP_REG_CLASS_P (from) && general_or_i64_p (to))
+ || (general_or_i64_p (from) && FP_REG_CLASS_P (to)))
+ {
+ if (TARGET_VIS3)
+ {
+ int size = GET_MODE_SIZE (mode);
+ if (size == 8 || size == 4)
+ {
+ if (! TARGET_ARCH32 || size == 4)
+ return 4;
+ else
+ return 6;
+ }
+ }
+ need_memory = true;
+ }
+
+ if (need_memory)
{
if (sparc_cpu == PROCESSOR_ULTRASPARC
|| sparc_cpu == PROCESSOR_ULTRASPARC3
@@ -11163,6 +11202,18 @@ sparc_preferred_reload_class (rtx x, reg_class_t rclass)
}
}
+ if (TARGET_VIS3
+ && ! TARGET_ARCH64
+ && (rclass == EXTRA_FP_REGS
+ || rclass == GENERAL_OR_EXTRA_FP_REGS))
+ {
+ int regno = true_regnum (x);
+
+ if (SPARC_INT_REG_P (regno))
+ return (rclass == EXTRA_FP_REGS
+ ? FP_REGS : GENERAL_OR_FP_REGS);
+ }
+
return rclass;
}
@@ -11275,6 +11326,9 @@ sparc_secondary_reload (bool in_p, rtx x, reg_class_t rclass_i,
{
enum reg_class rclass = (enum reg_class) rclass_i;
+ sri->icode = CODE_FOR_nothing;
+ sri->extra_cost = 0;
+
/* We need a temporary when loading/storing a HImode/QImode value
between memory and the FPU registers. This can happen when combine puts
a paradoxical subreg in a float/fix conversion insn. */
@@ -11307,6 +11361,25 @@ sparc_secondary_reload (bool in_p, rtx x, reg_class_t rclass_i,
return NO_REGS;
}
+ if (TARGET_VIS3 && TARGET_ARCH32)
+ {
+ int regno = true_regnum (x);
+
+ /* When using VIS3 fp<-->int register moves, on 32-bit we have
+ to move 8-byte values in 4-byte pieces. This only works via
+ FP_REGS, and not via EXTRA_FP_REGS. Therefore if we try to
+ move between EXTRA_FP_REGS and GENERAL_REGS, we will need
+ an FP_REGS intermediate move. */
+ if ((rclass == EXTRA_FP_REGS && SPARC_INT_REG_P (regno))
+ || ((general_or_i64_p (rclass)
+ || rclass == GENERAL_OR_FP_REGS)
+ && SPARC_FP_REG_P (regno)))
+ {
+ sri->extra_cost = 2;
+ return FP_REGS;
+ }
+ }
+
return NO_REGS;
}
diff --git a/gcc/config/sparc/sparc.h b/gcc/config/sparc/sparc.h
index 76240f0..aed18fc 100644
--- a/gcc/config/sparc/sparc.h
+++ b/gcc/config/sparc/sparc.h
@@ -1040,10 +1040,13 @@ extern char leaf_reg_remap[];
#define SPARC_SETHI32_P(X) \
(SPARC_SETHI_P ((unsigned HOST_WIDE_INT) (X) & GET_MODE_MASK (SImode)))
-/* On SPARC it is not possible to directly move data between
- GENERAL_REGS and FP_REGS. */
+/* On SPARC when not VIS3 it is not possible to directly move data
+ between GENERAL_REGS and FP_REGS. */
#define SECONDARY_MEMORY_NEEDED(CLASS1, CLASS2, MODE) \
- (FP_REG_CLASS_P (CLASS1) != FP_REG_CLASS_P (CLASS2))
+ ((FP_REG_CLASS_P (CLASS1) != FP_REG_CLASS_P (CLASS2)) \
+ && (! TARGET_VIS3 \
+ || GET_MODE_SIZE (MODE) > 8 \
+ || GET_MODE_SIZE (MODE) < 4))
/* Get_secondary_mem widens its argument to BITS_PER_WORD which loses on v9
because the movsi and movsf patterns don't handle r/f moves.
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index b84699a..0f716d6 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -1312,11 +1312,12 @@
DONE;
})
-(define_insn "*movsi_insn"
+(define_insn "*movsi_insn_novis3"
[(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,m,!f,!f,!m,d,d")
(match_operand:SI 1 "input_operand" "rI,K,m,rJ,f,m,f,J,P"))]
- "(register_operand (operands[0], SImode)
- || register_or_zero_or_all_ones_operand (operands[1], SImode))"
+ "(! TARGET_VIS3
+ && (register_operand (operands[0], SImode)
+ || register_or_zero_or_all_ones_operand (operands[1], SImode)))"
"@
mov\t%1, %0
sethi\t%%hi(%a1), %0
@@ -1329,6 +1330,26 @@
fones\t%0"
[(set_attr "type" "*,*,load,store,fpmove,fpload,fpstore,fga,fga")])
+(define_insn "*movsi_insn_vis3"
+ [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r, m, r,*f,*f,*f, m,d,d")
+ (match_operand:SI 1 "input_operand" "rI,K,m,rJ,*f, r, f, m,*f,J,P"))]
+ "(TARGET_VIS3
+ && (register_operand (operands[0], SImode)
+ || register_or_zero_or_all_ones_operand (operands[1], SImode)))"
+ "@
+ mov\t%1, %0
+ sethi\t%%hi(%a1), %0
+ ld\t%1, %0
+ st\t%r1, %0
+ movstouw\t%1, %0
+ movwtos\t%1, %0
+ fmovs\t%1, %0
+ ld\t%1, %0
+ st\t%1, %0
+ fzeros\t%0
+ fones\t%0"
+ [(set_attr "type" "*,*,load,store,*,*,fpmove,fpload,fpstore,fga,fga")])
+
(define_insn "*movsi_lo_sum"
[(set (match_operand:SI 0 "register_operand" "=r")
(lo_sum:SI (match_operand:SI 1 "register_operand" "r")
@@ -1486,13 +1507,14 @@
[(set_attr "type" "store,store,load,*,*,*,*,fpstore,fpload,*,*,*")
(set_attr "length" "2,*,*,2,2,2,2,*,*,2,2,2")])
-(define_insn "*movdi_insn_sp32_v9"
+(define_insn "*movdi_insn_sp32_v9_novis3"
[(set (match_operand:DI 0 "nonimmediate_operand"
"=T,o,T,U,o,r,r,r,?T,?f,?f,?o,?e,?e,?W,b,b")
(match_operand:DI 1 "input_operand"
" J,J,U,T,r,o,i,r, f, T, o, f, e, W, e,J,P"))]
"! TARGET_ARCH64
&& TARGET_V9
+ && ! TARGET_VIS3
&& (register_operand (operands[0], DImode)
|| register_or_zero_operand (operands[1], DImode))"
"@
@@ -1517,10 +1539,45 @@
(set_attr "length" "*,2,*,*,2,2,2,2,*,*,2,2,*,*,*,*,*")
(set_attr "fptype" "*,*,*,*,*,*,*,*,*,*,*,*,double,*,*,double,double")])
-(define_insn "*movdi_insn_sp64"
+(define_insn "*movdi_insn_sp32_v9_vis3"
+ [(set (match_operand:DI 0 "nonimmediate_operand"
+ "=T,o,T,U,o,r,r,r,?T,?*f,?*f,?o,?*e, r,?*f,?*e,?W,b,b")
+ (match_operand:DI 1 "input_operand"
+ " J,J,U,T,r,o,i,r,*f, T, o,*f, *e,?*f, r, W,*e,J,P"))]
+ "! TARGET_ARCH64
+ && TARGET_V9
+ && TARGET_VIS3
+ && (register_operand (operands[0], DImode)
+ || register_or_zero_operand (operands[1], DImode))"
+ "@
+ stx\t%%g0, %0
+ #
+ std\t%1, %0
+ ldd\t%1, %0
+ #
+ #
+ #
+ #
+ std\t%1, %0
+ ldd\t%1, %0
+ #
+ #
+ fmovd\t%1, %0
+ #
+ #
+ ldd\t%1, %0
+ std\t%1, %0
+ fzero\t%0
+ fone\t%0"
+ [(set_attr "type" "store,store,store,load,*,*,*,*,fpstore,fpload,*,*,*,*,fpmove,fpload,fpstore,fga,fga")
+ (set_attr "length" "*,2,*,*,2,2,2,2,*,*,2,2,*,2,2,*,*,*,*")
+ (set_attr "fptype" "*,*,*,*,*,*,*,*,*,*,*,*,double,*,*,*,*,double,double")])
+
+(define_insn "*movdi_insn_sp64_novis3"
[(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r,m,?e,?e,?W,b,b")
(match_operand:DI 1 "input_operand" "rI,N,m,rJ,e,W,e,J,P"))]
"TARGET_ARCH64
+ && ! TARGET_VIS3
&& (register_operand (operands[0], DImode)
|| register_or_zero_or_all_ones_operand (operands[1], DImode))"
"@
@@ -1536,6 +1593,28 @@
[(set_attr "type" "*,*,load,store,fpmove,fpload,fpstore,fga,fga")
(set_attr "fptype" "*,*,*,*,double,*,*,double,double")])
+(define_insn "*movdi_insn_sp64_vis3"
+ [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r, m, r,*e,?*e,?*e,?W,b,b")
+ (match_operand:DI 1 "input_operand" "rI,N,m,rJ,*e, r, *e, W,*e,J,P"))]
+ "TARGET_ARCH64
+ && TARGET_VIS3
+ && (register_operand (operands[0], DImode)
+ || register_or_zero_or_all_ones_operand (operands[1], DImode))"
+ "@
+ mov\t%1, %0
+ sethi\t%%hi(%a1), %0
+ ldx\t%1, %0
+ stx\t%r1, %0
+ movdtox\t%1, %0
+ movxtod\t%1, %0
+ fmovd\t%1, %0
+ ldd\t%1, %0
+ std\t%1, %0
+ fzero\t%0
+ fone\t%0"
+ [(set_attr "type" "*,*,load,store,*,*,fpmove,fpload,fpstore,fga,fga")
+ (set_attr "fptype" "*,*,*,*,*,*,double,*,*,double,double")])
+
(define_expand "movdi_pic_label_ref"
[(set (match_dup 3) (high:DI
(unspec:DI [(match_operand:DI 1 "label_ref_operand" "")
@@ -1933,10 +2012,11 @@
DONE;
})
-(define_insn "*movsf_insn"
+(define_insn "*movsf_insn_novis3"
[(set (match_operand:SF 0 "nonimmediate_operand" "=d, d,f, *r,*r,*r,f,*r,m, m")
(match_operand:SF 1 "input_operand" "GY,ZC,f,*rRY, Q, S,m, m,f,*rGY"))]
"TARGET_FPU
+ && ! TARGET_VIS3
&& (register_operand (operands[0], SFmode)
|| register_or_zero_or_all_ones_operand (operands[1], SFmode))"
{
@@ -1979,6 +2059,57 @@
}
[(set_attr "type" "fga,fga,fpmove,*,*,*,fpload,load,fpstore,store")])
+(define_insn "*movsf_insn_vis3"
+ [(set (match_operand:SF 0 "nonimmediate_operand" "=d, d,f, *r,*r,*r,*r, f, f,*r, m, m")
+ (match_operand:SF 1 "input_operand" "GY,ZC,f,*rRY, Q, S, f,*r, m, m, f,*rGY"))]
+ "TARGET_FPU
+ && TARGET_VIS3
+ && (register_operand (operands[0], SFmode)
+ || register_or_zero_or_all_ones_operand (operands[1], SFmode))"
+{
+ if (GET_CODE (operands[1]) == CONST_DOUBLE
+ && (which_alternative == 3
+ || which_alternative == 4
+ || which_alternative == 5))
+ {
+ REAL_VALUE_TYPE r;
+ long i;
+
+ REAL_VALUE_FROM_CONST_DOUBLE (r, operands[1]);
+ REAL_VALUE_TO_TARGET_SINGLE (r, i);
+ operands[1] = GEN_INT (i);
+ }
+
+ switch (which_alternative)
+ {
+ case 0:
+ return "fzeros\t%0";
+ case 1:
+ return "fones\t%0";
+ case 2:
+ return "fmovs\t%1, %0";
+ case 3:
+ return "mov\t%1, %0";
+ case 4:
+ return "sethi\t%%hi(%a1), %0";
+ case 5:
+ return "#";
+ case 6:
+ return "movstouw\t%1, %0";
+ case 7:
+ return "movwtos\t%1, %0";
+ case 8:
+ case 9:
+ return "ld\t%1, %0";
+ case 10:
+ case 11:
+ return "st\t%r1, %0";
+ default:
+ gcc_unreachable ();
+ }
+}
+ [(set_attr "type" "fga,fga,fpmove,*,*,*,*,*,fpload,load,fpstore,store")])
+
;; Exactly the same as above, except that all `f' cases are deleted.
;; This is necessary to prevent reload from ever trying to use a `f' reg
;; when -mno-fpu.
@@ -2107,11 +2238,12 @@
(set_attr "length" "*,*,2,2,2")])
;; We have available v9 double floats but not 64-bit integer registers.
-(define_insn "*movdf_insn_sp32_v9"
+(define_insn "*movdf_insn_sp32_v9_novis3"
[(set (match_operand:DF 0 "nonimmediate_operand" "=b, b,e, e, T,W,U,T, f, *r, o")
(match_operand:DF 1 "input_operand" "GY,ZC,e,W#F,GY,e,T,U,o#F,*roGYDF,*rGYf"))]
"TARGET_FPU
&& TARGET_V9
+ && ! TARGET_VIS3
&& ! TARGET_ARCH64
&& (register_operand (operands[0], DFmode)
|| register_or_zero_or_all_ones_operand (operands[1], DFmode))"
@@ -2131,6 +2263,33 @@
(set_attr "length" "*,*,*,*,*,*,*,*,2,2,2")
(set_attr "fptype" "double,double,double,*,*,*,*,*,*,*,*")])
+(define_insn "*movdf_insn_sp32_v9_vis3"
+ [(set (match_operand:DF 0 "nonimmediate_operand" "=b, b,e,*r, f, e, T,W,U,T, f, *r, o")
+ (match_operand:DF 1 "input_operand" "GY,ZC,e, f,*r,W#F,GY,e,T,U,o#F,*roGYDF,*rGYf"))]
+ "TARGET_FPU
+ && TARGET_V9
+ && TARGET_VIS3
+ && ! TARGET_ARCH64
+ && (register_operand (operands[0], DFmode)
+ || register_or_zero_or_all_ones_operand (operands[1], DFmode))"
+ "@
+ fzero\t%0
+ fone\t%0
+ fmovd\t%1, %0
+ #
+ #
+ ldd\t%1, %0
+ stx\t%r1, %0
+ std\t%1, %0
+ ldd\t%1, %0
+ std\t%1, %0
+ #
+ #
+ #"
+ [(set_attr "type" "fga,fga,fpmove,*,*,load,store,store,load,store,*,*,*")
+ (set_attr "length" "*,*,*,2,2,*,*,*,*,*,2,2,2")
+ (set_attr "fptype" "double,double,double,*,*,*,*,*,*,*,*,*,*")])
+
(define_insn "*movdf_insn_sp32_v9_no_fpu"
[(set (match_operand:DF 0 "nonimmediate_operand" "=U,T,T,r,o")
(match_operand:DF 1 "input_operand" "T,U,G,ro,rG"))]
@@ -2149,10 +2308,11 @@
(set_attr "length" "*,*,*,2,2")])
;; We have available both v9 double floats and 64-bit integer registers.
-(define_insn "*movdf_insn_sp64"
+(define_insn "*movdf_insn_sp64_novis3"
[(set (match_operand:DF 0 "nonimmediate_operand" "=b, b,e, e,W, *r,*r, m,*r")
(match_operand:DF 1 "input_operand" "GY,ZC,e,W#F,e,*rGY, m,*rGY,DF"))]
"TARGET_FPU
+ && ! TARGET_VIS3
&& TARGET_ARCH64
&& (register_operand (operands[0], DFmode)
|| register_or_zero_or_all_ones_operand (operands[1], DFmode))"
@@ -2170,6 +2330,30 @@
(set_attr "length" "*,*,*,*,*,*,*,*,2")
(set_attr "fptype" "double,double,double,*,*,*,*,*,*")])
+(define_insn "*movdf_insn_sp64_vis3"
+ [(set (match_operand:DF 0 "nonimmediate_operand" "=b, b,e,*r, e, e,W, *r,*r, m,*r")
+ (match_operand:DF 1 "input_operand" "GY,ZC,e, e,*r,W#F,e,*rGY, m,*rGY,DF"))]
+ "TARGET_FPU
+ && TARGET_ARCH64
+ && TARGET_VIS3
+ && (register_operand (operands[0], DFmode)
+ || register_or_zero_or_all_ones_operand (operands[1], DFmode))"
+ "@
+ fzero\t%0
+ fone\t%0
+ fmovd\t%1, %0
+ movdtox\t%1, %0
+ movxtod\t%1, %0
+ ldd\t%1, %0
+ std\t%1, %0
+ mov\t%r1, %0
+ ldx\t%1, %0
+ stx\t%r1, %0
+ #"
+ [(set_attr "type" "fga,fga,fpmove,*,*,load,store,*,load,store,*")
+ (set_attr "length" "*,*,*,*,*,*,*,*,*,*,2")
+ (set_attr "fptype" "double,double,double,double,double,*,*,*,*,*,*")])
+
(define_insn "*movdf_insn_sp64_no_fpu"
[(set (match_operand:DF 0 "nonimmediate_operand" "=r,r,m")
(match_operand:DF 1 "input_operand" "r,m,rG"))]
@@ -2444,7 +2628,8 @@
&& (! TARGET_ARCH64
|| (TARGET_FPU
&& ! TARGET_HARD_QUAD)
- || ! fp_register_operand (operands[0], TFmode))"
+ || (! fp_register_operand (operands[0], TFmode)
+ && ! fp_register_operand (operands[1], TFmode)))"
[(clobber (const_int 0))]
{
rtx set_dest = operands[0];
@@ -2944,15 +3129,29 @@
""
"")
-(define_insn "*zero_extendsidi2_insn_sp64"
+(define_insn "*zero_extendsidi2_insn_sp64_novis3"
[(set (match_operand:DI 0 "register_operand" "=r,r")
(zero_extend:DI (match_operand:SI 1 "input_operand" "r,m")))]
- "TARGET_ARCH64 && GET_CODE (operands[1]) != CONST_INT"
+ "TARGET_ARCH64
+ && ! TARGET_VIS3
+ && GET_CODE (operands[1]) != CONST_INT"
"@
srl\t%1, 0, %0
lduw\t%1, %0"
[(set_attr "type" "shift,load")])
+(define_insn "*zero_extendsidi2_insn_sp64_vis3"
+ [(set (match_operand:DI 0 "register_operand" "=r,r,r")
+ (zero_extend:DI (match_operand:SI 1 "input_operand" "r,m,*f")))]
+ "TARGET_ARCH64
+ && TARGET_VIS3
+ && GET_CODE (operands[1]) != CONST_INT"
+ "@
+ srl\t%1, 0, %0
+ lduw\t%1, %0
+ movstouw\t%1, %0"
+ [(set_attr "type" "shift,load,*")])
+
(define_insn_and_split "*zero_extendsidi2_insn_sp32"
[(set (match_operand:DI 0 "register_operand" "=r")
(zero_extend:DI (match_operand:SI 1 "register_operand" "r")))]
@@ -3276,16 +3475,27 @@
"TARGET_ARCH64"
"")
-(define_insn "*sign_extendsidi2_insn"
+(define_insn "*sign_extendsidi2_insn_novis3"
[(set (match_operand:DI 0 "register_operand" "=r,r")
(sign_extend:DI (match_operand:SI 1 "input_operand" "r,m")))]
- "TARGET_ARCH64"
+ "TARGET_ARCH64 && ! TARGET_VIS3"
"@
sra\t%1, 0, %0
ldsw\t%1, %0"
[(set_attr "type" "shift,sload")
(set_attr "us3load_type" "*,3cycle")])
+(define_insn "*sign_extendsidi2_insn_vis3"
+ [(set (match_operand:DI 0 "register_operand" "=r,r,r")
+ (sign_extend:DI (match_operand:SI 1 "input_operand" "r,m,*f")))]
+ "TARGET_ARCH64 && TARGET_VIS3"
+ "@
+ sra\t%1, 0, %0
+ ldsw\t%1, %0
+ movstosw\t%1, %0"
+ [(set_attr "type" "shift,sload,*")
+ (set_attr "us3load_type" "*,3cycle,*")])
+
;; Special pattern for optimizing bit-field compares. This is needed
;; because combine uses this as a canonical form.
@@ -7769,10 +7979,11 @@
DONE;
})
-(define_insn "*mov<VM32:mode>_insn"
- [(set (match_operand:VM32 0 "nonimmediate_operand" "=f, f,f,f,m, m,r,m, r, r")
- (match_operand:VM32 1 "input_operand" "GY,ZC,f,m,f,GY,m,r,GY,ZC"))]
+(define_insn "*mov<VM32:mode>_insn_novis3"
+ [(set (match_operand:VM32 0 "nonimmediate_operand" "=f, f,f,f,m, m,r,m,*r")
+ (match_operand:VM32 1 "input_operand" "GY,ZC,f,m,f,GY,m,r,*r"))]
"TARGET_VIS
+ && ! TARGET_VIS3
&& (register_operand (operands[0], <VM32:MODE>mode)
|| register_or_zero_or_all_ones_operand (operands[1], <VM32:MODE>mode))"
"@
@@ -7784,14 +7995,35 @@
st\t%r1, %0
ld\t%1, %0
st\t%1, %0
- mov\t0, %0
- mov\t-1, %0"
- [(set_attr "type" "fga,fga,fga,fpload,fpstore,store,load,store,*,*")])
+ mov\t%1, %0"
+ [(set_attr "type" "fga,fga,fga,fpload,fpstore,store,load,store,*")])
-(define_insn "*mov<VM64:mode>_insn_sp64"
- [(set (match_operand:VM64 0 "nonimmediate_operand" "=e, e,e,e,m, m,r,m, r, r")
- (match_operand:VM64 1 "input_operand" "GY,ZC,e,m,e,GY,m,r,GY,ZC"))]
+(define_insn "*mov<VM32:mode>_insn_vis3"
+ [(set (match_operand:VM32 0 "nonimmediate_operand" "=f, f,f,f,m, m,*r, m,*r,*r, f")
+ (match_operand:VM32 1 "input_operand" "GY,ZC,f,m,f,GY, m,*r,*r, f,*r"))]
"TARGET_VIS
+ && TARGET_VIS3
+ && (register_operand (operands[0], <VM32:MODE>mode)
+ || register_or_zero_or_all_ones_operand (operands[1], <VM32:MODE>mode))"
+ "@
+ fzeros\t%0
+ fones\t%0
+ fsrc1s\t%1, %0
+ ld\t%1, %0
+ st\t%1, %0
+ st\t%r1, %0
+ ld\t%1, %0
+ st\t%1, %0
+ mov\t%1, %0
+ movstouw\t%1, %0
+ movwtos\t%1, %0"
+ [(set_attr "type" "fga,fga,fga,fpload,fpstore,store,load,store,*,*,*")])
+
+(define_insn "*mov<VM64:mode>_insn_sp64_novis3"
+ [(set (match_operand:VM64 0 "nonimmediate_operand" "=e, e,e,e,m, m,r,m,*r")
+ (match_operand:VM64 1 "input_operand" "GY,ZC,e,m,e,GY,m,r,*r"))]
+ "TARGET_VIS
+ && ! TARGET_VIS3
&& TARGET_ARCH64
&& (register_operand (operands[0], <VM64:MODE>mode)
|| register_or_zero_or_all_ones_operand (operands[1], <VM64:MODE>mode))"
@@ -7804,14 +8036,36 @@
stx\t%r1, %0
ldx\t%1, %0
stx\t%1, %0
- mov\t0, %0
- mov\t-1, %0"
- [(set_attr "type" "fga,fga,fga,fpload,fpstore,store,load,store,*,*")])
+ mov\t%1, %0"
+ [(set_attr "type" "fga,fga,fga,fpload,fpstore,store,load,store,*")])
-(define_insn "*mov<VM64:mode>_insn_sp32"
- [(set (match_operand:VM64 0 "nonimmediate_operand" "=e, e,e,e,m, m,U,T,o, r, r")
- (match_operand:VM64 1 "input_operand" "GY,ZC,e,m,e,GY,T,U,r,GY,ZC"))]
+(define_insn "*mov<VM64:mode>_insn_sp64_vis3"
+ [(set (match_operand:VM64 0 "nonimmediate_operand" "=e, e,e,e,m, m,*r, m,*r, f,*r")
+ (match_operand:VM64 1 "input_operand" "GY,ZC,e,m,e,GY, m,*r, f,*r,*r"))]
"TARGET_VIS
+ && TARGET_VIS3
+ && TARGET_ARCH64
+ && (register_operand (operands[0], <VM64:MODE>mode)
+ || register_or_zero_or_all_ones_operand (operands[1], <VM64:MODE>mode))"
+ "@
+ fzero\t%0
+ fone\t%0
+ fsrc1\t%1, %0
+ ldd\t%1, %0
+ std\t%1, %0
+ stx\t%r1, %0
+ ldx\t%1, %0
+ stx\t%1, %0
+ movdtox\t%1, %0
+ movxtod\t%1, %0
+ mov\t%1, %0"
+ [(set_attr "type" "fga,fga,fga,fpload,fpstore,store,load,store,*,*,*")])
+
+(define_insn "*mov<VM64:mode>_insn_sp32_novis3"
+ [(set (match_operand:VM64 0 "nonimmediate_operand" "=e, e,e,e,m, m,U,T,o,*r")
+ (match_operand:VM64 1 "input_operand" "GY,ZC,e,m,e,GY,T,U,r,*r"))]
+ "TARGET_VIS
+ && ! TARGET_VIS3
&& ! TARGET_ARCH64
&& (register_operand (operands[0], <VM64:MODE>mode)
|| register_or_zero_or_all_ones_operand (operands[1], <VM64:MODE>mode))"
@@ -7825,10 +8079,33 @@
ldd\t%1, %0
std\t%1, %0
#
- mov 0, %L0; mov 0, %H0
- mov -1, %L0; mov -1, %H0"
- [(set_attr "type" "fga,fga,fga,fpload,fpstore,store,load,store,*,*,*")
- (set_attr "length" "*,*,*,*,*,*,*,*,2,2,2")])
+ #"
+ [(set_attr "type" "fga,fga,fga,fpload,fpstore,store,load,store,*,*")
+ (set_attr "length" "*,*,*,*,*,*,*,*,2,2")])
+
+(define_insn "*mov<VM64:mode>_insn_sp32_vis3"
+ [(set (match_operand:VM64 0 "nonimmediate_operand" "=e, e,e,*r, f,e,m, m,U,T, o,*r")
+ (match_operand:VM64 1 "input_operand" "GY,ZC,e, f,*r,m,e,GY,T,U,*r,*r"))]
+ "TARGET_VIS
+ && TARGET_VIS3
+ && ! TARGET_ARCH64
+ && (register_operand (operands[0], <VM64:MODE>mode)
+ || register_or_zero_or_all_ones_operand (operands[1], <VM64:MODE>mode))"
+ "@
+ fzero\t%0
+ fone\t%0
+ fsrc1\t%1, %0
+ #
+ #
+ ldd\t%1, %0
+ std\t%1, %0
+ stx\t%r1, %0
+ ldd\t%1, %0
+ std\t%1, %0
+ #
+ #"
+ [(set_attr "type" "fga,fga,fga,*,*,fpload,fpstore,store,load,store,*,*")
+ (set_attr "length" "*,*,*,2,2,*,*,*,*,*,2,2")])
(define_split
[(set (match_operand:VM64 0 "memory_operand" "")
@@ -7851,6 +8128,40 @@
DONE;
})
+(define_split
+ [(set (match_operand:VM64 0 "register_operand" "")
+ (match_operand:VM64 1 "register_operand" ""))]
+ "reload_completed
+ && TARGET_VIS
+ && ! TARGET_ARCH64
+ && sparc_split_regreg_legitimate (operands[0], operands[1])"
+ [(clobber (const_int 0))]
+{
+ rtx set_dest = operands[0];
+ rtx set_src = operands[1];
+ rtx dest1, dest2;
+ rtx src1, src2;
+
+ dest1 = gen_highpart (SImode, set_dest);
+ dest2 = gen_lowpart (SImode, set_dest);
+ src1 = gen_highpart (SImode, set_src);
+ src2 = gen_lowpart (SImode, set_src);
+
+ /* Now emit using the real source and destination we found, swapping
+ the order if we detect overlap. */
+ if (reg_overlap_mentioned_p (dest1, src2))
+ {
+ emit_insn (gen_movsi (dest2, src2));
+ emit_insn (gen_movsi (dest1, src1));
+ }
+ else
+ {
+ emit_insn (gen_movsi (dest1, src1));
+ emit_insn (gen_movsi (dest2, src2));
+ }
+ DONE;
+})
+
(define_expand "vec_init<mode>"
[(match_operand:VMALL 0 "register_operand" "")
(match_operand:VMALL 1 "" "")]
--
1.7.6.401.g6a319
* Re: [PATCH] Add support for sparc VIS3 fp<-->int moves.
2011-10-24 5:01 [PATCH] Add support for sparc VIS3 fp<-->int moves David Miller
@ 2011-10-24 21:29 ` Richard Henderson
2011-10-24 23:02 ` David Miller
0 siblings, 1 reply; 3+ messages in thread
From: Richard Henderson @ 2011-10-24 21:29 UTC (permalink / raw)
To: David Miller; +Cc: gcc-patches, ebotcazou
On 10/23/2011 08:53 PM, David Miller wrote:
> -(define_insn "*movsi_insn"
> +(define_insn "*movsi_insn_novis3"
> [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,m,!f,!f,!m,d,d")
> (match_operand:SI 1 "input_operand" "rI,K,m,rJ,f,m,f,J,P"))]
> - "(register_operand (operands[0], SImode)
> - || register_or_zero_or_all_ones_operand (operands[1], SImode))"
> + "(! TARGET_VIS3
> + && (register_operand (operands[0], SImode)
> + || register_or_zero_or_all_ones_operand (operands[1], SImode)))"
> "@
> mov\t%1, %0
> sethi\t%%hi(%a1), %0
> @@ -1329,6 +1330,26 @@
> fones\t%0"
> [(set_attr "type" "*,*,load,store,fpmove,fpload,fpstore,fga,fga")])
>
> +(define_insn "*movsi_insn_vis3"
> + [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r, m, r,*f,*f,*f, m,d,d")
> + (match_operand:SI 1 "input_operand" "rI,K,m,rJ,*f, r, f, m,*f,J,P"))]
> + "(TARGET_VIS3
> + && (register_operand (operands[0], SImode)
> + || register_or_zero_or_all_ones_operand (operands[1], SImode)))"
> + "@
> + mov\t%1, %0
> + sethi\t%%hi(%a1), %0
> + ld\t%1, %0
> + st\t%r1, %0
> + movstouw\t%1, %0
> + movwtos\t%1, %0
> + fmovs\t%1, %0
> + ld\t%1, %0
> + st\t%1, %0
> + fzeros\t%0
> + fones\t%0"
> + [(set_attr "type" "*,*,load,store,*,*,fpmove,fpload,fpstore,fga,fga")])
You shouldn't need to split these anymore. See the enabled attribute, as
used on several other targets so far.
r~
* Re: [PATCH] Add support for sparc VIS3 fp<-->int moves.
2011-10-24 21:29 ` Richard Henderson
@ 2011-10-24 23:02 ` David Miller
0 siblings, 0 replies; 3+ messages in thread
From: David Miller @ 2011-10-24 23:02 UTC (permalink / raw)
To: rth; +Cc: gcc-patches, ebotcazou
From: Richard Henderson <rth@redhat.com>
Date: Mon, 24 Oct 2011 14:05:28 -0700
> You shouldn't need to split these anymore. See the enabled attribute, as
> used on several other targets so far.
See the patch I posted 2 hours after this one.