* Re: [PATCH v4, rs6000] gcc mainline, add builtin support for vec_float, vec_float2, vec_floate, vec_floate, builtins
From: Carl Love @ 2017-06-16 16:23 UTC
To: gcc-patches, David Edelsohn, Bill Schmidt, Segher Boessenkool; +Cc: cel
GCC Maintainers:
I have addressed Segher's latest comments on the patch: the formatting
issues and the renaming of the new define_mode_attr. I believe that
covers all of the issues raised. I have also rechecked the patch for
formatting problems.
I retested the changes on powerpc64le-unknown-linux-gnu (Power 8 LE)
only.
Please let me know if there are any additional issues that need fixing.
Thanks.
Carl Love
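For readers unfamiliar with the four built-ins, the element placement implied by the test expectations in builtins-3-runnable.c can be sketched in portable C. This is only a scalar model, not the rs6000 implementation: the model_* types and functions are hypothetical names introduced for illustration, and the zero written to the unchecked elements of the floate/floato results is a placeholder, since the test does not examine those elements.

```c
#include <assert.h>

/* Hypothetical scalar stand-ins for the vector types (not GCC types).  */
typedef struct { int w[4]; } model_v4si;
typedef struct { long long d[2]; } model_v2di;
typedef struct { float w[4]; } model_v4sf;

/* vec_float: element-wise conversion of four ints to four floats.  */
static model_v4sf model_vec_float (model_v4si a)
{
  model_v4sf r;
  int i;
  for (i = 0; i < 4; i++)
    r.w[i] = (float) a.w[i];
  return r;
}

/* vec_float2: both elements of the first operand, then both elements
   of the second operand.  */
static model_v4sf model_vec_float2 (model_v2di a, model_v2di b)
{
  model_v4sf r = { { (float) a.d[0], (float) a.d[1],
                     (float) b.d[0], (float) b.d[1] } };
  return r;
}

/* vec_floate: results land in elements 0 and 2.  Elements 1 and 3 are
   not checked by the test; zero is only a placeholder here.  */
static model_v4sf model_vec_floate (model_v2di a)
{
  model_v4sf r = { { (float) a.d[0], 0.0f, (float) a.d[1], 0.0f } };
  return r;
}

/* vec_floato: results land in elements 1 and 3.  Elements 0 and 2 are
   not checked by the test; zero is only a placeholder here.  */
static model_v4sf model_vec_floato (model_v2di a)
{
  model_v4sf r = { { 0.0f, (float) a.d[0], 0.0f, (float) a.d[1] } };
  return r;
}

/* Small self-check using values that are exactly representable.  */
static void model_self_check (void)
{
  model_v4si i4 = { { -1, 3, -5, 1024 } };
  model_v2di a = { { -12, 34 } };
  model_v2di b = { { 12, 56 } };
  model_v4sf r = model_vec_float2 (a, b);

  assert (model_vec_float (i4).w[3] == 1024.0f);
  assert (r.w[0] == -12.0f && r.w[1] == 34.0f);
  assert (r.w[2] == 12.0f && r.w[3] == 56.0f);
  assert (model_vec_floate (a).w[0] == -12.0f);
  assert (model_vec_floate (a).w[2] == 34.0f);
  assert (model_vec_floato (a).w[1] == -12.0f);
  assert (model_vec_floato (a).w[3] == 34.0f);
}
```

The element indices follow the vec_result[i] indexing used by test_result_sp in the patch, where EVEN checks i = 0, 2 and ODD checks i = 1, 3.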
----------------------------------------------------
gcc/ChangeLog:
2017-06-16 Carl Love <cel@us.ibm.com>
* config/rs6000/rs6000-c.c (altivec_overloaded_builtins[]): Add
definitions for vec_float, vec_float2, vec_floato,
vec_floate built-ins.
* config/rs6000/vsx.md (define_c_enum "unspec"): Add RTL code
for instructions vsx_xvcvsxwsp, vsx_xvcvuxwsp, float2, floato and
floate.
* config/rs6000/rs6000-builtin.def (FLOAT2_V2DI, FLOATE_V2D*,
FLOATO_V2D*, XVCVSXWSP_V4SF, UNS_FLOATO_V2DI, UNS_FLOATE_V2DI): Add
definitions.
* config/altivec.md (define_insn "p8_vmrgew_<mode>",
define_mode_attr VF_sxddp):Add V4SF type to p8_vmrgew.
* config/rs6000/altivec.h (vec_float, vec_float2, vec_floate,
vec_floato): Add builtin defines.
* doc/extend.texi (vec_float, vec_float2, vec_floate, vec_floato):
Update the built-in documentation file for the new built-in
functions.
gcc/testsuite/ChangeLog:
2017-06-16 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/builtins-3-runnable.c (test_result_sp(),
main()): Add runnable tests and test checker for vec_float,
vec_float2, vec_floate and vec_floato builtins.
---
gcc/config/rs6000/altivec.h | 4 +
gcc/config/rs6000/altivec.md | 17 ++-
gcc/config/rs6000/rs6000-builtin.def | 19 ++-
gcc/config/rs6000/rs6000-c.c | 28 +++-
gcc/config/rs6000/rs6000-protos.h | 1 +
gcc/config/rs6000/rs6000.c | 45 +++++-
gcc/config/rs6000/vsx.md | 158 +++++++++++++++++++++
gcc/doc/extend.texi | 14 ++
.../gcc.target/powerpc/builtins-3-runnable.c | 82 +++++++++++
9 files changed, 356 insertions(+), 12 deletions(-)
diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 20050eb..d542315 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -133,6 +133,10 @@
#define vec_doublel __builtin_vec_doublel
#define vec_doubleh __builtin_vec_doubleh
#define vec_expte __builtin_vec_expte
+#define vec_float __builtin_vec_float
+#define vec_float2 __builtin_vec_float2
+#define vec_floate __builtin_vec_floate
+#define vec_floato __builtin_vec_floato
#define vec_floor __builtin_vec_floor
#define vec_loge __builtin_vec_loge
#define vec_madd __builtin_vec_madd
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 487b9a4..fd15286 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -207,6 +207,9 @@
;; versus floating point
(define_mode_attr VS_sxwsp [(V4SI "sxw") (V4SF "sp")])
+;; Mode attribute for vector floate and floato conversions
+(define_mode_attr VF_sxddp [(V2DI "sxd") (V2DF "dp")])
+
;; Specific iterator for parity which does not have a byte/half-word form, but
;; does have a quad word form
(define_mode_iterator VParity [V4SI
@@ -1316,13 +1319,13 @@
}
[(set_attr "type" "vecperm")])
-;; Power8 vector merge even/odd
-(define_insn "p8_vmrgew"
- [(set (match_operand:V4SI 0 "register_operand" "=v")
- (vec_select:V4SI
- (vec_concat:V8SI
- (match_operand:V4SI 1 "register_operand" "v")
- (match_operand:V4SI 2 "register_operand" "v"))
+;; Power8 vector merge two V4SF/V4SI even words to V4SF
+(define_insn "p8_vmrgew_<mode>"
+ [(set (match_operand:VSX_W 0 "register_operand" "=v")
+ (vec_select:VSX_W
+ (vec_concat:<VS_double>
+ (match_operand:VSX_W 1 "register_operand" "v")
+ (match_operand:VSX_W 2 "register_operand" "v"))
(parallel [(const_int 0) (const_int 4)
(const_int 2) (const_int 6)])))]
"TARGET_P8_VECTOR"
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 241c439..4682628 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1591,6 +1591,8 @@ BU_VSX_2 (CMPLE_U16QI, "cmple_u16qi", CONST, vector_ngtuv16qi)
BU_VSX_2 (CMPLE_U8HI, "cmple_u8hi", CONST, vector_ngtuv8hi)
BU_VSX_2 (CMPLE_U4SI, "cmple_u4si", CONST, vector_ngtuv4si)
BU_VSX_2 (CMPLE_U2DI, "cmple_u2di", CONST, vector_ngtuv2di)
+BU_VSX_2 (FLOAT2_V2DI, "float2_v2di", CONST, float2_v2di)
+BU_VSX_2 (UNS_FLOAT2_V2DI, "uns_float2_v2di", CONST, uns_float2_v2di)
/* VSX abs builtin functions. */
BU_VSX_A (XVABSDP, "xvabsdp", CONST, absv2df2)
@@ -1648,6 +1650,16 @@ BU_VSX_1 (XVCVSPSXDS, "xvcvspsxds", CONST, vsx_xvcvspsxds)
BU_VSX_1 (XVCVSPUXDS, "xvcvspuxds", CONST, vsx_xvcvspuxds)
BU_VSX_1 (XVCVSXDSP, "xvcvsxdsp", CONST, vsx_xvcvsxdsp)
BU_VSX_1 (XVCVUXDSP, "xvcvuxdsp", CONST, vsx_xvcvuxdsp)
+
+BU_VSX_1 (XVCVSXWSP_V4SF, "vsx_xvcvsxwsp", CONST, vsx_xvcvsxwsp)
+BU_VSX_1 (XVCVUXWSP_V4SF, "vsx_xvcvuxwsp", CONST, vsx_xvcvuxwsp)
+BU_VSX_1 (FLOATE_V2DI, "floate_v2di", CONST, floatev2di)
+BU_VSX_1 (FLOATE_V2DF, "floate_v2df", CONST, floatev2df)
+BU_VSX_1 (FLOATO_V2DI, "floato_v2di", CONST, floatov2di)
+BU_VSX_1 (FLOATO_V2DF, "floato_v2df", CONST, floatov2df)
+BU_VSX_1 (UNS_FLOATO_V2DI, "uns_floato_v2di", CONST, unsfloatov2di)
+BU_VSX_1 (UNS_FLOATE_V2DI, "uns_floate_v2di", CONST, unsfloatev2di)
+
BU_VSX_1 (XVRSPI, "xvrspi", CONST, vsx_xvrspi)
BU_VSX_1 (XVRSPIC, "xvrspic", CONST, vsx_xvrspic)
BU_VSX_1 (XVRSPIM, "xvrspim", CONST, vsx_floorv4sf2)
@@ -1760,6 +1772,8 @@ BU_VSX_OVERLOAD_2 (XXMRGHW, "xxmrghw")
BU_VSX_OVERLOAD_2 (XXMRGLW, "xxmrglw")
BU_VSX_OVERLOAD_2 (XXSPLTD, "xxspltd")
BU_VSX_OVERLOAD_2 (XXSPLTW, "xxspltw")
+BU_VSX_OVERLOAD_2 (FLOAT2, "float2")
+BU_VSX_OVERLOAD_2 (UNS_FLOAT2, "uns_float2")
/* 1 argument VSX overloaded builtin functions. */
BU_VSX_OVERLOAD_1 (DOUBLE, "double")
@@ -1771,6 +1785,9 @@ BU_VSX_OVERLOAD_1 (DOUBLEH, "doubleh")
BU_VSX_OVERLOAD_1 (UNS_DOUBLEH, "uns_doubleh")
BU_VSX_OVERLOAD_1 (DOUBLEL, "doublel")
BU_VSX_OVERLOAD_1 (UNS_DOUBLEL, "uns_doublel")
+BU_VSX_OVERLOAD_1 (FLOAT, "float")
+BU_VSX_OVERLOAD_1 (FLOATE, "floate")
+BU_VSX_OVERLOAD_1 (FLOATO, "floato")
/* VSX builtins that are handled as special cases. */
BU_VSX_OVERLOAD_X (LD, "ld")
@@ -1812,7 +1829,7 @@ BU_P8V_AV_2 (VMINSD, "vminsd", CONST, sminv2di3)
BU_P8V_AV_2 (VMAXSD, "vmaxsd", CONST, smaxv2di3)
BU_P8V_AV_2 (VMINUD, "vminud", CONST, uminv2di3)
BU_P8V_AV_2 (VMAXUD, "vmaxud", CONST, umaxv2di3)
-BU_P8V_AV_2 (VMRGEW, "vmrgew", CONST, p8_vmrgew)
+BU_P8V_AV_2 (VMRGEW_V4SI, "vmrgew_v4si", CONST, p8_vmrgew_v4si)
BU_P8V_AV_2 (VMRGOW, "vmrgow", CONST, p8_vmrgow)
BU_P8V_AV_2 (VBPERMQ, "vbpermq", CONST, altivec_vbpermq)
BU_P8V_AV_2 (VBPERMQ2, "vbpermq2", CONST, altivec_vbpermq2)
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index f1e8d3d..19f6d9c 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -1538,6 +1538,28 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
{ VSX_BUILTIN_VEC_DOUBLEL, VSX_BUILTIN_DOUBLEL_V4SF,
RS6000_BTI_V2DF, RS6000_BTI_V4SF, 0, 0 },
+ { VSX_BUILTIN_VEC_FLOAT, VSX_BUILTIN_XVCVSXWSP_V4SF,
+ RS6000_BTI_V4SF, RS6000_BTI_V4SI, 0, 0 },
+ { VSX_BUILTIN_VEC_FLOAT, VSX_BUILTIN_XVCVUXWSP_V4SF,
+ RS6000_BTI_V4SF, RS6000_BTI_unsigned_V4SI, 0, 0 },
+ { VSX_BUILTIN_VEC_FLOAT2, VSX_BUILTIN_FLOAT2_V2DI,
+ RS6000_BTI_V4SF, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
+ { VSX_BUILTIN_VEC_FLOAT2, VSX_BUILTIN_UNS_FLOAT2_V2DI,
+ RS6000_BTI_V4SF, RS6000_BTI_unsigned_V2DI,
+ RS6000_BTI_unsigned_V2DI, 0 },
+ { VSX_BUILTIN_VEC_FLOATE, VSX_BUILTIN_FLOATE_V2DF,
+ RS6000_BTI_V4SF, RS6000_BTI_V2DF, 0, 0 },
+ { VSX_BUILTIN_VEC_FLOATE, VSX_BUILTIN_FLOATE_V2DI,
+ RS6000_BTI_V4SF, RS6000_BTI_V2DI, 0, 0 },
+ { VSX_BUILTIN_VEC_FLOATE, VSX_BUILTIN_UNS_FLOATE_V2DI,
+ RS6000_BTI_V4SF, RS6000_BTI_unsigned_V2DI, 0, 0 },
+ { VSX_BUILTIN_VEC_FLOATO, VSX_BUILTIN_FLOATO_V2DF,
+ RS6000_BTI_V4SF, RS6000_BTI_V2DF, 0, 0 },
+ { VSX_BUILTIN_VEC_FLOATO, VSX_BUILTIN_FLOATO_V2DI,
+ RS6000_BTI_V4SF, RS6000_BTI_V2DI, 0, 0 },
+ { VSX_BUILTIN_VEC_FLOATO, VSX_BUILTIN_UNS_FLOATO_V2DI,
+ RS6000_BTI_V4SF, RS6000_BTI_unsigned_V2DI, 0, 0 },
+
{ ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V2DF,
RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF, 0 },
{ ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V2DI,
@@ -5262,12 +5284,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
RS6000_BTI_unsigned_V2DI, 0 },
- { P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VMRGEW,
+ { P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VMRGEW_V4SI,
RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
- { P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VMRGEW,
+ { P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VMRGEW_V4SI,
RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI,
RS6000_BTI_unsigned_V4SI, 0 },
- { P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VMRGEW,
+ { P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VMRGEW_V4SI,
RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 0 },
{ P8V_BUILTIN_VEC_VMRGOW, P8V_BUILTIN_VMRGOW,
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 8a231f5..8165d04 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -72,6 +72,7 @@ extern void altivec_expand_stvex_be (rtx, rtx, machine_mode, unsigned);
extern void rs6000_expand_extract_even (rtx, rtx, rtx);
extern void rs6000_expand_interleave (rtx, rtx, rtx, bool);
extern void rs6000_scale_v2df (rtx, rtx, int);
+extern void rs6000_generate_float2_code (bool, rtx, rtx, rtx);
extern int expand_block_clear (rtx[]);
extern int expand_block_move (rtx[]);
extern bool expand_block_compare (rtx[]);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 941c0c2..43ba6e5 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -36798,7 +36798,7 @@ altivec_expand_vec_perm_const (rtx operands[4])
(BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw_direct
: CODE_FOR_altivec_vmrghw_direct),
{ 8, 9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
- { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew,
+ { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew_v4si,
{ 0, 1, 2, 3, 16, 17, 18, 19, 8, 9, 10, 11, 24, 25, 26, 27 } },
{ OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgow,
{ 4, 5, 6, 7, 20, 21, 22, 23, 12, 13, 14, 15, 28, 29, 30, 31 } }
@@ -42389,6 +42389,49 @@ rs6000_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
*update = build2 (COMPOUND_EXPR, void_type_node, update_mffs, update_mtfsf);
}
+void
+rs6000_generate_float2_code (bool signed_convert, rtx dst, rtx src1, rtx src2)
+{
+ rtx rtx_tmp0, rtx_tmp1, rtx_tmp2, rtx_tmp3;
+
+ rtx_tmp0 = gen_reg_rtx (V2DImode);
+ rtx_tmp1 = gen_reg_rtx (V2DImode);
+
+ /* The destination of the vmrgew instruction layout is:
+ rtx_tmp2[0] rtx_tmp3[0] rtx_tmp2[1] rtx_tmp3[1].
+ Setup rtx_tmp0 and rtx_tmp1 to ensure the order of the elements after the
+ vmrgew instruction will be correct. */
+ if (VECTOR_ELT_ORDER_BIG)
+ {
+ emit_insn (gen_vsx_xxpermdi_v2di_be (rtx_tmp0, src1, src2, GEN_INT (0)));
+ emit_insn (gen_vsx_xxpermdi_v2di_be (rtx_tmp1, src1, src2, GEN_INT (3)));
+ }
+ else
+ {
+ emit_insn (gen_vsx_xxpermdi_v2di (rtx_tmp0, src1, src2, GEN_INT (3)));
+ emit_insn (gen_vsx_xxpermdi_v2di (rtx_tmp1, src1, src2, GEN_INT (0)));
+ }
+
+ rtx_tmp2 = gen_reg_rtx (V4SFmode);
+ rtx_tmp3 = gen_reg_rtx (V4SFmode);
+
+ if (signed_convert)
+ {
+ emit_insn (gen_vsx_xvcvsxdsp (rtx_tmp2, rtx_tmp0));
+ emit_insn (gen_vsx_xvcvsxdsp (rtx_tmp3, rtx_tmp1));
+ }
+ else
+ {
+ emit_insn (gen_vsx_xvcvuxdsp (rtx_tmp2, rtx_tmp0));
+ emit_insn (gen_vsx_xvcvuxdsp (rtx_tmp3, rtx_tmp1));
+ }
+
+ if (VECTOR_ELT_ORDER_BIG)
+ emit_insn (gen_p8_vmrgew_v4sf (dst, rtx_tmp2, rtx_tmp3));
+ else
+ emit_insn (gen_p8_vmrgew_v4sf (dst, rtx_tmp3, rtx_tmp2));
+}
+
/* Implement the TARGET_OPTAB_SUPPORTED_P hook. */
static bool
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 141aa42..fe2a388 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -331,6 +331,14 @@
UNSPEC_VSX_CVUXDSP
UNSPEC_VSX_CVSPSXDS
UNSPEC_VSX_CVSPUXDS
+ UNSPEC_VSX_CVSXWSP
+ UNSPEC_VSX_CVUXWSP
+ UNSPEC_VSX_FLOAT2
+ UNSPEC_VSX_UNS_FLOAT2
+ UNSPEC_VSX_FLOATE
+ UNSPEC_VSX_UNS_FLOATE
+ UNSPEC_VSX_FLOATO
+ UNSPEC_VSX_UNS_FLOATO
UNSPEC_VSX_TDIV
UNSPEC_VSX_TSQRT
UNSPEC_VSX_SET
@@ -1976,6 +1984,156 @@
"xvcvspuxds %x0,%x1"
[(set_attr "type" "vecdouble")])
+(define_insn "vsx_xvcvsxwsp"
+ [(set (match_operand:V4SF 0 "vsx_register_operand" "=wa")
+ (unspec:V4SF [(match_operand:V4SI 1 "vsx_register_operand" "wa")]
+ UNSPEC_VSX_CVSXWSP))]
+ "VECTOR_UNIT_VSX_P (V4SFmode)"
+ "xvcvsxwsp %x0,%x1"
+ [(set_attr "type" "vecdouble")])
+
+(define_insn "vsx_xvcvuxwsp"
+ [(set (match_operand:V4SF 0 "vsx_register_operand" "=wa")
+ (unspec:V4SF [(match_operand:V4SI 1 "vsx_register_operand" "wa")]
+ UNSPEC_VSX_CVUXWSP))]
+ "VECTOR_UNIT_VSX_P (V4SFmode)"
+ "xvcvuxwsp %x0,%x1"
+ [(set_attr "type" "vecdouble")])
+
+;; Generate float2
+;; convert two long long signed ints to float
+(define_expand "float2_v2di"
+ [(use (match_operand:V4SF 0 "register_operand" "=wa"))
+ (use (match_operand:V2DI 1 "register_operand" "wa"))
+ (use (match_operand:V2DI 2 "register_operand" "wa"))]
+ "VECTOR_UNIT_VSX_P (V4SFmode)"
+{
+ rtx rtx_src1, rtx_src2, rtx_dst;
+
+ rtx_dst = operands[0];
+ rtx_src1 = operands[1];
+ rtx_src2 = operands[2];
+
+ rs6000_generate_float2_code (true, rtx_dst, rtx_src1, rtx_src2);
+ DONE;
+})
+
+;; Generate uns_float2
+;; convert two long long unsigned ints to float
+(define_expand "uns_float2_v2di"
+ [(use (match_operand:V4SF 0 "register_operand" "=wa"))
+ (use (match_operand:V2DI 1 "register_operand" "wa"))
+ (use (match_operand:V2DI 2 "register_operand" "wa"))]
+ "VECTOR_UNIT_VSX_P (V4SFmode)"
+{
+ rtx rtx_src1, rtx_src2, rtx_dst;
+
+ rtx_dst = operands[0];
+ rtx_src1 = operands[1];
+ rtx_src2 = operands[2];
+
+ rs6000_generate_float2_code (false, rtx_dst, rtx_src1, rtx_src2);
+ DONE;
+})
+
+;; Generate floate
+;; convert double or long long signed to float
+;;(Only even words are valid, BE numbering)
+(define_expand "floate<mode>"
+ [(use (match_operand:V4SF 0 "register_operand" "=wa"))
+ (use (match_operand:VSX_D 1 "register_operand" "wa"))]
+ "VECTOR_UNIT_VSX_P (V4SFmode)"
+{
+ if (VECTOR_ELT_ORDER_BIG)
+ {
+ /* Shift left one word to put the even word in the correct location.  */
+ rtx rtx_tmp;
+ rtx rtx_val = GEN_INT (4);
+
+ rtx_tmp = gen_reg_rtx (V4SFmode);
+ emit_insn (gen_vsx_xvcv<VF_sxddp>sp (rtx_tmp, operands[1]));
+ emit_insn (gen_altivec_vsldoi_v4sf (operands[0],
+ rtx_tmp, rtx_tmp, rtx_val));
+ }
+ else
+ emit_insn (gen_vsx_xvcv<VF_sxddp>sp (operands[0], operands[1]));
+
+ DONE;
+})
+
+;; Generate uns_floate
+;; convert long long unsigned to float
+;; (Only even words are valid, BE numbering)
+(define_expand "unsfloatev2di"
+ [(use (match_operand:V4SF 0 "register_operand" "=wa"))
+ (use (match_operand:V2DI 1 "register_operand" "wa"))]
+ "VECTOR_UNIT_VSX_P (V4SFmode)"
+{
+ if (VECTOR_ELT_ORDER_BIG)
+ {
+ /* Shift left one word to put the even word in the correct location.  */
+ rtx rtx_tmp;
+ rtx rtx_val = GEN_INT (4);
+
+ rtx_tmp = gen_reg_rtx (V4SFmode);
+ emit_insn (gen_vsx_xvcvuxdsp (rtx_tmp, operands[1]));
+ emit_insn (gen_altivec_vsldoi_v4sf (operands[0],
+ rtx_tmp, rtx_tmp, rtx_val));
+ }
+ else
+ emit_insn (gen_vsx_xvcvuxdsp (operands[0], operands[1]));
+
+ DONE;
+})
+
+;; Generate floato
+;; convert double or long long signed to float
+;; (Only odd words are valid, BE numbering)
+(define_expand "floato<mode>"
+ [(use (match_operand:V4SF 0 "register_operand" "=wa"))
+ (use (match_operand:VSX_D 1 "register_operand" "wa"))]
+ "VECTOR_UNIT_VSX_P (V4SFmode)"
+{
+ if (VECTOR_ELT_ORDER_BIG)
+ emit_insn (gen_vsx_xvcv<VF_sxddp>sp (operands[0], operands[1]));
+ else
+ {
+ /* Shift left one word to put the odd word in the correct location.  */
+ rtx rtx_tmp;
+ rtx rtx_val = GEN_INT (4);
+
+ rtx_tmp = gen_reg_rtx (V4SFmode);
+ emit_insn (gen_vsx_xvcv<VF_sxddp>sp (rtx_tmp, operands[1]));
+ emit_insn (gen_altivec_vsldoi_v4sf (operands[0],
+ rtx_tmp, rtx_tmp, rtx_val));
+ }
+ DONE;
+})
+
+;; Generate uns_floato
+;; convert long long unsigned to float
+;; (Only odd words are valid, BE numbering)
+(define_expand "unsfloatov2di"
+ [(use (match_operand:V4SF 0 "register_operand" "=wa"))
+ (use (match_operand:V2DI 1 "register_operand" "wa"))]
+ "VECTOR_UNIT_VSX_P (V4SFmode)"
+{
+ if (VECTOR_ELT_ORDER_BIG)
+ emit_insn (gen_vsx_xvcvuxdsp (operands[0], operands[1]));
+ else
+ {
+ /* Shift left one word to put the odd word in the correct location.  */
+ rtx rtx_tmp;
+ rtx rtx_val = GEN_INT (4);
+
+ rtx_tmp = gen_reg_rtx (V4SFmode);
+ emit_insn (gen_vsx_xvcvuxdsp (rtx_tmp, operands[1]));
+ emit_insn (gen_altivec_vsldoi_v4sf (operands[0],
+ rtx_tmp, rtx_tmp, rtx_val));
+ }
+ DONE;
+})
+
;; Only optimize (float (fix x)) -> frz if we are in fast-math mode, since
;; since the xvrdpiz instruction does not truncate the value if the floating
;; point value is < LONG_MIN or > LONG_MAX.
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 7d39335..a662aeb 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -16039,6 +16039,20 @@ vector float vec_expte (vector float);
vector float vec_floor (vector float);
+vector float vec_float (vector signed int);
+vector float vec_float (vector unsigned int);
+
+vector float vec_float2 (vector signed long long, vector signed long long);
+vector float vec_float2 (vector unsigned long long, vector unsigned long long);
+
+vector float vec_floate (vector double);
+vector float vec_floate (vector signed long long);
+vector float vec_floate (vector unsigned long long);
+
+vector float vec_floato (vector double);
+vector float vec_floato (vector signed long long);
+vector float vec_floato (vector unsigned long long);
+
vector float vec_ld (int, const vector float *);
vector float vec_ld (int, const float *);
vector bool int vec_ld (int, const vector bool int *);
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
index 60ec617..8e09a92 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
@@ -5,8 +5,37 @@
#include <altivec.h> // vector
+#define ALL 1
+#define EVEN 2
+#define ODD 3
+
void abort (void);
+void test_result_sp(int check, vector float vec_result, vector float vec_expected)
+{
+ int i;
+ for(i = 0; i<4; i++) {
+
+ switch (check) {
+ case ALL:
+ break;
+ case EVEN:
+ if (i%2 == 0)
+ break;
+ else
+ continue;
+ case ODD:
+ if (i%2 != 0)
+ break;
+ else
+ continue;
+ }
+
+ if (vec_result[i] != vec_expected[i])
+ abort();
+ }
+}
+
void test_result_dp(vector double vec_result, vector double vec_expected)
{
if (vec_result[0] != vec_expected[0])
@@ -21,11 +50,17 @@ int main()
int i;
vector unsigned int vec_unint;
vector signed int vec_int;
+ vector long long int vec_ll_int0, vec_ll_int1;
+ vector long long unsigned int vec_ll_uns_int0, vec_ll_uns_int1;
vector float vec_flt, vec_flt_result, vec_flt_expected;
vector double vec_dble0, vec_dble1, vec_dble_result, vec_dble_expected;
vec_int = (vector signed int){ -1, 3, -5, 1234567 };
+ vec_ll_int0 = (vector long long int){ -12, -12345678901234 };
+ vec_ll_int1 = (vector long long int){ 12, 9876543210 };
vec_unint = (vector unsigned int){ 9, 11, 15, 2468013579 };
+ vec_ll_uns_int0 = (vector unsigned long long int){ 102, 9753108642 };
+ vec_ll_uns_int1 = (vector unsigned long long int){ 23, 29 };
vec_flt = (vector float){ -21., 3.5, -53., 78. };
vec_dble0 = (vector double){ 34.0, 97.0 };
vec_dble1 = (vector double){ 214.0, -5.5 };
@@ -81,4 +116,51 @@ int main()
vec_dble_result = vec_doubleh (vec_unint);
test_result_dp(vec_dble_result, vec_dble_expected);
+ vec_dble_expected = (vector double){-21.000000, 3.500000};
+ vec_dble_result = vec_doubleh (vec_flt);
+ test_result_dp(vec_dble_result, vec_dble_expected);
+
+ /* conversion of integer vector to single precision float vector */
+ vec_flt_expected = (vector float){-1.00, 3.00, -5.00, 1234567.00};
+ vec_flt_result = vec_float (vec_int);
+ test_result_sp(ALL, vec_flt_result, vec_flt_expected);
+
+ vec_flt_expected = (vector float){9.00, 11.00, 15.00, 2468013579.0};
+ vec_flt_result = vec_float (vec_unint);
+ test_result_sp(ALL, vec_flt_result, vec_flt_expected);
+
+ /* conversion of two long long integer vectors to single precision vector */
+ vec_flt_expected = (vector float){-12.00, -12345678901234.00, 12.00, 9876543210.00};
+ vec_flt_result = vec_float2 (vec_ll_int0, vec_ll_int1);
+ test_result_sp(ALL, vec_flt_result, vec_flt_expected);
+
+ vec_flt_expected = (vector float){102.00, 9753108642.00, 23.00, 29.00};
+ vec_flt_result = vec_float2 (vec_ll_uns_int0, vec_ll_uns_int1);
+ test_result_sp(ALL, vec_flt_result, vec_flt_expected);
+
+ /* conversion of double word elements to the even words of a single precision vector */
+ vec_flt_expected = (vector float){-12.00, 00.00, -12345678901234.00, 0.00};
+ vec_flt_result = vec_floate (vec_ll_int0);
+ test_result_sp(EVEN, vec_flt_result, vec_flt_expected);
+
+ vec_flt_expected = (vector float){102.00, 0.00, 9753108642.00, 0.00};
+ vec_flt_result = vec_floate (vec_ll_uns_int0);
+ test_result_sp(EVEN, vec_flt_result, vec_flt_expected);
+
+ vec_flt_expected = (vector float){34.00, 0.00, 97.00, 0.00};
+ vec_flt_result = vec_floate (vec_dble0);
+ test_result_sp(EVEN, vec_flt_result, vec_flt_expected);
+
+ /* conversion of double word elements to the odd words of a single precision vector */
+ vec_flt_expected = (vector float){0.00, -12.00, 00.00, -12345678901234.00};
+ vec_flt_result = vec_floato (vec_ll_int0);
+ test_result_sp(ODD, vec_flt_result, vec_flt_expected);
+
+ vec_flt_expected = (vector float){0.00, 102.00, 0.00, 9753108642.00};
+ vec_flt_result = vec_floato (vec_ll_uns_int0);
+ test_result_sp(ODD, vec_flt_result, vec_flt_expected);
+
+ vec_flt_expected = (vector float){0.00, 34.00, 0.00, 97.00};
+ vec_flt_result = vec_floato (vec_dble0);
+ test_result_sp(ODD, vec_flt_result, vec_flt_expected);
}
--
1.9.1
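The data flow in rs6000_generate_float2_code above can be traced with a small scalar simulation. The sketch below models only the VECTOR_ELT_ORDER_BIG branch for the signed case, with all elements in big-endian numbering; the little-endian branch is not modeled. The sim_* names are hypothetical, and the instruction behaviour is an assumption based on my reading of the Power ISA: xxpermdi selects one doubleword from each source, xvcvsxdsp writes its results to words 0 and 2, and vmrgew interleaves words 0 and 2 of its two sources. The zero in words 1 and 3 of the conversion result is a placeholder for contents the code does not rely on.

```c
#include <assert.h>

/* Hypothetical register models, elements in big-endian order.  */
typedef struct { long long d[2]; } sim_v2di;
typedef struct { float w[4]; } sim_v4sf;

/* xxpermdi model: the high bit of dm picks the doubleword taken from a,
   the low bit picks the doubleword taken from b.  */
static sim_v2di sim_xxpermdi (sim_v2di a, sim_v2di b, int dm)
{
  sim_v2di r;
  r.d[0] = a.d[(dm >> 1) & 1];
  r.d[1] = b.d[dm & 1];
  return r;
}

/* xvcvsxdsp model: each signed doubleword is converted to float and
   placed in words 0 and 2.  Words 1 and 3 get a placeholder zero.  */
static sim_v4sf sim_xvcvsxdsp (sim_v2di a)
{
  sim_v4sf r = { { (float) a.d[0], 0.0f, (float) a.d[1], 0.0f } };
  return r;
}

/* vmrgew model: interleave the even words of a and b.  */
static sim_v4sf sim_vmrgew (sim_v4sf a, sim_v4sf b)
{
  sim_v4sf r = { { a.w[0], b.w[0], a.w[2], b.w[2] } };
  return r;
}

/* Big-endian element order path of the float2 expansion.  */
static sim_v4sf sim_float2_be (sim_v2di src1, sim_v2di src2)
{
  sim_v2di tmp0 = sim_xxpermdi (src1, src2, 0);  /* { src1[0], src2[0] } */
  sim_v2di tmp1 = sim_xxpermdi (src1, src2, 3);  /* { src1[1], src2[1] } */
  sim_v4sf tmp2 = sim_xvcvsxdsp (tmp0);
  sim_v4sf tmp3 = sim_xvcvsxdsp (tmp1);
  return sim_vmrgew (tmp2, tmp3);
}

/* Self-check: the merged result is src1[0], src1[1], src2[0], src2[1].  */
static void sim_self_check (void)
{
  sim_v2di s1 = { { -12, 34 } };
  sim_v2di s2 = { { 12, 56 } };
  sim_v4sf r = sim_float2_be (s1, s2);

  assert (r.w[0] == -12.0f && r.w[1] == 34.0f);
  assert (r.w[2] == 12.0f && r.w[3] == 56.0f);
}
```

Under these assumptions the two xxpermdi selections pair up element 0 of both sources and element 1 of both sources, so that the final even-word merge restores the order src1[0], src1[1], src2[0], src2[1].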
* Re: [PATCH v4, rs6000] gcc mainline, add builtin support for vec_float, vec_float2, vec_floate, vec_floate, builtins
From: Segher Boessenkool @ 2017-06-16 21:11 UTC
To: Carl Love; +Cc: gcc-patches, David Edelsohn, Bill Schmidt
Hi Carl,
On Fri, Jun 16, 2017 at 09:23:06AM -0700, Carl Love wrote:
> * config/rs6000/rs6000-c.c (altivec_overloaded_builtins[]): Add
> definitions for vec_float, vec_float2, vec_floato,
> vec_floate built-ins.
No [], just the name.
> * config/rs6000/rs6000-builtin.def (FLOAT2_V2DI, FLOATE_V2D*,
> FLOATO_V2D*, XVCVSXWSP_V4SF, UNS_FLOATO_V2DI, UNS_FLOATE_V2DI): Add
> definitions.
Please spell out FLOATE_V2DF, FLOATE_V2DI -- it's only two of-em, and
it makes things easier to find.
> * config/altivec.md (define_insn "p8_vmrgew_<mode>",
> define_mode_attr VF_sxddp):Add V4SF type to p8_vmrgew.
Space after colon.
> * gcc.target/powerpc/builtins-3-runnable.c (test_result_sp(),
> main()): Add runnable tests and test checker for vec_float,
> vec_float2, vec_floate and vec_floato builtins.
No () please.
> +(define_insn "vsx_xvcvsxwsp"
> + [(set (match_operand:V4SF 0 "vsx_register_operand" "=wa")
> + (unspec:V4SF [(match_operand:V4SI 1 "vsx_register_operand" "wa")]
> + UNSPEC_VSX_CVSXWSP))]
> + "VECTOR_UNIT_VSX_P (V4SFmode)"
> + "xvcvsxwsp %x0,%x1"
> + [(set_attr "type" "vecdouble")])
Hrm, is that the best type? Maybe vecfloat is better.
> +;; Generate floate
> +;; convert double or long long signed to float
> +;;(Only even words are valid, BE numbering)
Single space before double; space before (.
> +(define_expand "floato<mode>"
> + [(use (match_operand:V4SF 0 "register_operand" "=wa"))
> + (use (match_operand:VSX_D 1 "register_operand" "wa"))]
> + "VECTOR_UNIT_VSX_P (V4SFmode)"
These last three lines should be indented one more space.
> +{
> + if (VECTOR_ELT_ORDER_BIG)
> + emit_insn (gen_vsx_xvcv<VFC_inst>sp (operands[0], operands[1]));
> + else
> + {
> + /* Shift left one word to put odd word correct location */
> + rtx rtx_tmp;
> + rtx rtx_val = GEN_INT (4);
> +
> + rtx_tmp = gen_reg_rtx (V4SFmode);
> + emit_insn (gen_vsx_xvcv<VFC_inst>sp (rtx_tmp, operands[1]));
> + emit_insn (gen_altivec_vsldoi_v4sf (operands[0],
> + rtx_tmp, rtx_tmp, rtx_val));
This indent should use tabs. There are more like this.
Okay with those last trivialities fixed. Thanks!
Segher