* [PATCH 1/2] rs6000: Init V4SF vector without converting SP to DP
From: Xiong Hu Luo @ 2020-07-10 2:14 UTC
To: gcc-patches; +Cc: segher, wschmidt, luoxhu
Move V4SF to V4SI, initialize the vector as V4SI, and move the result back
to V4SF.
This produces a better instruction sequence on Power9:
lfs + xxpermdi + xvcvdpsp + vmrgew
=>
lwz + (sldi + or) + mtvsrdd
With the follow-up patch, this can be optimized further to:
lwz + rldimi + mtvsrdd
The point is to use lwz to avoid converting single-precision to
double-precision on load, and to pack the four 32-bit values into one
128-bit register directly.
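The packing described above can be sketched in plain C. This is only an illustrative model, not code from the patch: the names float_bits, pack_pair and pack_v4sf are hypothetical, the big_endian flag stands in for BYTES_BIG_ENDIAN, and float is assumed to be IEEE 754 single precision.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit integer.  No value
   conversion takes place, which models loading with lwz rather
   than lfs.  */
uint32_t
float_bits (float f)
{
  uint32_t u;
  assert (sizeof f == sizeof u);  /* Assumption: 32-bit float.  */
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Zero-extend both halves and merge them: HI goes to the upper 32 bits,
   LO to the lower 32 bits (the sldi + or step).  */
uint64_t
pack_pair (uint32_t hi, uint32_t lo)
{
  return ((uint64_t) hi << 32) | (uint64_t) lo;
}

/* Build the two doublewords that would be handed to the V2DI concat.
   On little-endian the elements of each pair are swapped first,
   mirroring the std::swap calls in the patch.  */
void
pack_v4sf (const float e[4], int big_endian, uint64_t dw[2])
{
  uint32_t b[4];
  for (int i = 0; i < 4; i++)
    b[i] = float_bits (e[i]);

  if (big_endian)
    {
      dw[0] = pack_pair (b[0], b[1]);
      dw[1] = pack_pair (b[2], b[3]);
    }
  else
    {
      dw[0] = pack_pair (b[1], b[0]);
      dw[1] = pack_pair (b[3], b[2]);
    }
}
```

For the input {1.0f, 2.0f, 3.0f, 4.0f} with big_endian set, dw[0] holds the bit patterns of 1.0f and 2.0f in its upper and lower halves respectively; no double-precision value is ever formed.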
gcc/ChangeLog:
2020-07-10 Xionghu Luo <luoxhu@linux.ibm.com>
* config/rs6000/rs6000.c (rs6000_expand_vector_init):
Move V4SF to V4SI, init vector like V4SI and move to V4SF back.
---
gcc/config/rs6000/rs6000.c | 49 +++++++++++++++++++-------------------
1 file changed, 24 insertions(+), 25 deletions(-)
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 58f5d780603..d94e88c23a5 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -6423,35 +6423,34 @@ rs6000_expand_vector_init (rtx target, rtx vals)
}
else
{
- rtx dbl_even = gen_reg_rtx (V2DFmode);
- rtx dbl_odd = gen_reg_rtx (V2DFmode);
- rtx flt_even = gen_reg_rtx (V4SFmode);
- rtx flt_odd = gen_reg_rtx (V4SFmode);
- rtx op0 = force_reg (SFmode, XVECEXP (vals, 0, 0));
- rtx op1 = force_reg (SFmode, XVECEXP (vals, 0, 1));
- rtx op2 = force_reg (SFmode, XVECEXP (vals, 0, 2));
- rtx op3 = force_reg (SFmode, XVECEXP (vals, 0, 3));
-
- /* Use VMRGEW if we can instead of doing a permute. */
- if (TARGET_P8_VECTOR)
+ rtx tmpSF[4];
+ rtx tmpSI[4];
+ rtx tmpDI[4];
+ rtx mrgDI[4];
+ for (i = 0; i < 4; i++)
{
- emit_insn (gen_vsx_concat_v2sf (dbl_even, op0, op2));
- emit_insn (gen_vsx_concat_v2sf (dbl_odd, op1, op3));
- emit_insn (gen_vsx_xvcvdpsp (flt_even, dbl_even));
- emit_insn (gen_vsx_xvcvdpsp (flt_odd, dbl_odd));
- if (BYTES_BIG_ENDIAN)
- emit_insn (gen_p8_vmrgew_v4sf_direct (target, flt_even, flt_odd));
- else
- emit_insn (gen_p8_vmrgew_v4sf_direct (target, flt_odd, flt_even));
+ tmpSI[i] = gen_reg_rtx (SImode);
+ tmpDI[i] = gen_reg_rtx (DImode);
+ mrgDI[i] = gen_reg_rtx (DImode);
+ tmpSF[i] = force_reg (SFmode, XVECEXP (vals, 0, i));
+ emit_insn (gen_movsi_from_sf (tmpSI[i], tmpSF[i]));
+ emit_insn (gen_zero_extendsidi2 (tmpDI[i], tmpSI[i]));
}
- else
+
+ if (!BYTES_BIG_ENDIAN)
{
- emit_insn (gen_vsx_concat_v2sf (dbl_even, op0, op1));
- emit_insn (gen_vsx_concat_v2sf (dbl_odd, op2, op3));
- emit_insn (gen_vsx_xvcvdpsp (flt_even, dbl_even));
- emit_insn (gen_vsx_xvcvdpsp (flt_odd, dbl_odd));
- rs6000_expand_extract_even (target, flt_even, flt_odd);
+ std::swap (tmpDI[0], tmpDI[1]);
+ std::swap (tmpDI[2], tmpDI[3]);
}
+
+ emit_insn (gen_ashldi3 (mrgDI[0], tmpDI[0], GEN_INT (32)));
+ emit_insn (gen_iordi3 (mrgDI[1], mrgDI[0], tmpDI[1]));
+ emit_insn (gen_ashldi3 (mrgDI[2], tmpDI[2], GEN_INT (32)));
+ emit_insn (gen_iordi3 (mrgDI[3], mrgDI[2], tmpDI[3]));
+
+ rtx tmpV2DI = gen_reg_rtx (V2DImode);
+ emit_insn (gen_vsx_concat_v2di (tmpV2DI, mrgDI[1], mrgDI[3]));
+ emit_move_insn (target, gen_lowpart (V4SFmode, tmpV2DI));
}
return;
}
--
2.27.0.90.geebb51ba8c
* [PATCH 2/2] rs6000: Define define_insn_and_split to split unspec sldi+or to rldimi
From: Xiong Hu Luo @ 2020-07-10 2:14 UTC
To: gcc-patches; +Cc: segher, wschmidt, luoxhu
The combine pass can recognize the pattern defined here, which is then
split in split1. This patch optimizes:
21: r130:DI=r133:DI<<0x20
11: {r129:DI=zero_extend(unspec[[r145:DI]] 87);clobber scratch;}
22: r134:DI=r130:DI|r129:DI
to
21: {r149:DI=zero_extend(unspec[[r145:DI]] 87);clobber scratch;}
22: r134:DI=r149:DI&0xffffffff|r133:DI<<0x20
so that rldimi is generated instead of sldi+or.
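To see why the second form can be a single insert instruction, here is a small C model of the arithmetic. It is a simplified sketch, not the ISA definition: rotl64, rotl_insert and shift_or are hypothetical names, and rotl_insert only models the rotate-and-insert case where everything above the low SH bits is replaced.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate X left by N bits, 0 < N < 64.  */
uint64_t
rotl64 (uint64_t x, unsigned n)
{
  assert (n > 0 && n < 64);
  return (x << n) | (x >> (64 - n));
}

/* Model of the combined operation: keep the low SH bits of RA and
   replace all higher bits with the corresponding bits of RS rotated
   left by SH.  This is (ra & mask) | (rs << sh) written as a
   rotate-and-insert.  */
uint64_t
rotl_insert (uint64_t ra, uint64_t rs, unsigned sh)
{
  assert (sh > 0 && sh < 64);
  uint64_t keep = (UINT64_C (1) << sh) - 1;
  return (rotl64 (rs, sh) & ~keep) | (ra & keep);
}

/* The two-instruction form: shift HI left by 32, then OR in the
   zero-extended LO.  */
uint64_t
shift_or (uint64_t hi, uint32_t lo)
{
  return (hi << 32) | (uint64_t) lo;
}
```

With sh equal to 32 and a zero-extended low operand, rotl_insert (lo, hi, 32) and shift_or (hi, lo) give the same result, which is the equivalence the new pattern relies on.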
gcc/ChangeLog:
2020-07-10 Xionghu Luo <luoxhu@linux.ibm.com>
* config/rs6000/rs6000.md (rotl_unspec): New
define_insn_and_split.
gcc/testsuite/ChangeLog:
2020-07-10 Xionghu Luo <luoxhu@linux.ibm.com>
* gcc.target/powerpc/vector_float.c: New test.
---
gcc/config/rs6000/rs6000.md | 26 +++++++++++++++++++
.../gcc.target/powerpc/vector_float.c | 14 ++++++++++
2 files changed, 40 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/powerpc/vector_float.c
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 0aa5265d199..64b655df363 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4239,6 +4239,32 @@
operands[5] = GEN_INT ((HOST_WIDE_INT_1U << <bits>) - 1);
})
+; rldimi with UNSPEC_SI_FROM_SF.
+(define_insn_and_split "*rotl_unspec"
+ [(set (match_operand:DI 0 "gpc_reg_operand")
+ (ior:DI
+ (ashift:DI (match_operand:DI 1 "gpc_reg_operand")
+ (match_operand:SI 2 "const_int_operand"))
+ (zero_extend:DI
+ (unspec:QHSI
+ [(match_operand:SF 3 "memory_operand")]
+ UNSPEC_SI_FROM_SF))))
+ (clobber (match_scratch:V4SF 4))]
+ "INTVAL (operands[2]) == <bits>"
+ "#"
+ ""
+ [(parallel [(set (match_dup 5)
+ (zero_extend:DI (unspec:QHSI [(match_dup 3)] UNSPEC_SI_FROM_SF)))
+ (clobber (match_dup 4))])
+ (set (match_dup 0)
+ (ior:DI
+ (and:DI (match_dup 5) (match_dup 6))
+ (ashift:DI (match_dup 1) (match_dup 2))))]
+{
+ operands[5] = gen_reg_rtx (DImode);
+ operands[6] = GEN_INT ((HOST_WIDE_INT_1U << <bits>) - 1);
+})
+
; rlwimi, too.
(define_split
[(set (match_operand:SI 0 "gpc_reg_operand")
diff --git a/gcc/testsuite/gcc.target/powerpc/vector_float.c b/gcc/testsuite/gcc.target/powerpc/vector_float.c
new file mode 100644
index 00000000000..414824ad264
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector_float.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
+
+vector float
+test (float *a, float *b, float *c, float *d)
+{
+ return (vector float){*a, *b, *c, *d};
+}
+
+/* { dg-final { scan-assembler-not {\mlxsspx\M} } } */
+/* { dg-final { scan-assembler-not {\mlfs\M} } } */
+/* { dg-final { scan-assembler-times {\mlwz\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mrldimi\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
--
2.27.0.90.geebb51ba8c
* Re: [PATCH 1/2] rs6000: Init V4SF vector without converting SP to DP
From: luoxhu @ 2020-07-10 2:37 UTC
To: gcc-patches; +Cc: segher, wschmidt
This updated patch keeps the existing logic for non-TARGET_P8_VECTOR targets.
Please ignore the previous [PATCH 1/2]. Sorry!
Move V4SF to V4SI, initialize the vector as V4SI, and move the result back
to V4SF.
This produces a better instruction sequence on Power9:
lfs + xxpermdi + xvcvdpsp + vmrgew
=>
lwz + (sldi + or) + mtvsrdd
With the follow-up patch, this can be optimized further to:
lwz + rldimi + mtvsrdd
The point is to use lwz to avoid converting single-precision to
double-precision on load, and to pack the four 32-bit values into one
128-bit register directly.
gcc/ChangeLog:
2020-07-10 Xionghu Luo <luoxhu@linux.ibm.com>
* config/rs6000/rs6000.c (rs6000_expand_vector_init):
Move V4SF to V4SI, init vector like V4SI and move to V4SF back.
---
gcc/config/rs6000/rs6000.c | 55 +++++++++++++++++++++++++-------------
1 file changed, 37 insertions(+), 18 deletions(-)
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 58f5d780603..00972fb5165 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -6423,29 +6423,48 @@ rs6000_expand_vector_init (rtx target, rtx vals)
}
else
{
- rtx dbl_even = gen_reg_rtx (V2DFmode);
- rtx dbl_odd = gen_reg_rtx (V2DFmode);
- rtx flt_even = gen_reg_rtx (V4SFmode);
- rtx flt_odd = gen_reg_rtx (V4SFmode);
- rtx op0 = force_reg (SFmode, XVECEXP (vals, 0, 0));
- rtx op1 = force_reg (SFmode, XVECEXP (vals, 0, 1));
- rtx op2 = force_reg (SFmode, XVECEXP (vals, 0, 2));
- rtx op3 = force_reg (SFmode, XVECEXP (vals, 0, 3));
-
- /* Use VMRGEW if we can instead of doing a permute. */
if (TARGET_P8_VECTOR)
{
- emit_insn (gen_vsx_concat_v2sf (dbl_even, op0, op2));
- emit_insn (gen_vsx_concat_v2sf (dbl_odd, op1, op3));
- emit_insn (gen_vsx_xvcvdpsp (flt_even, dbl_even));
- emit_insn (gen_vsx_xvcvdpsp (flt_odd, dbl_odd));
- if (BYTES_BIG_ENDIAN)
- emit_insn (gen_p8_vmrgew_v4sf_direct (target, flt_even, flt_odd));
- else
- emit_insn (gen_p8_vmrgew_v4sf_direct (target, flt_odd, flt_even));
+ rtx tmpSF[4];
+ rtx tmpSI[4];
+ rtx tmpDI[4];
+ rtx mrgDI[4];
+ for (i = 0; i < 4; i++)
+ {
+ tmpSI[i] = gen_reg_rtx (SImode);
+ tmpDI[i] = gen_reg_rtx (DImode);
+ mrgDI[i] = gen_reg_rtx (DImode);
+ tmpSF[i] = force_reg (SFmode, XVECEXP (vals, 0, i));
+ emit_insn (gen_movsi_from_sf (tmpSI[i], tmpSF[i]));
+ emit_insn (gen_zero_extendsidi2 (tmpDI[i], tmpSI[i]));
+ }
+
+ if (!BYTES_BIG_ENDIAN)
+ {
+ std::swap (tmpDI[0], tmpDI[1]);
+ std::swap (tmpDI[2], tmpDI[3]);
+ }
+
+ emit_insn (gen_ashldi3 (mrgDI[0], tmpDI[0], GEN_INT (32)));
+ emit_insn (gen_iordi3 (mrgDI[1], mrgDI[0], tmpDI[1]));
+ emit_insn (gen_ashldi3 (mrgDI[2], tmpDI[2], GEN_INT (32)));
+ emit_insn (gen_iordi3 (mrgDI[3], mrgDI[2], tmpDI[3]));
+
+ rtx tmpV2DI = gen_reg_rtx (V2DImode);
+ emit_insn (gen_vsx_concat_v2di (tmpV2DI, mrgDI[1], mrgDI[3]));
+ emit_move_insn (target, gen_lowpart (V4SFmode, tmpV2DI));
}
else
{
+ rtx dbl_even = gen_reg_rtx (V2DFmode);
+ rtx dbl_odd = gen_reg_rtx (V2DFmode);
+ rtx flt_even = gen_reg_rtx (V4SFmode);
+ rtx flt_odd = gen_reg_rtx (V4SFmode);
+ rtx op0 = force_reg (SFmode, XVECEXP (vals, 0, 0));
+ rtx op1 = force_reg (SFmode, XVECEXP (vals, 0, 1));
+ rtx op2 = force_reg (SFmode, XVECEXP (vals, 0, 2));
+ rtx op3 = force_reg (SFmode, XVECEXP (vals, 0, 3));
+
emit_insn (gen_vsx_concat_v2sf (dbl_even, op0, op1));
emit_insn (gen_vsx_concat_v2sf (dbl_odd, op2, op3));
emit_insn (gen_vsx_xvcvdpsp (flt_even, dbl_even));
--
2.27.0.90.geebb51ba8c
* Re: [PATCH 1/2] rs6000: Init V4SF vector without converting SP to DP
From: Segher Boessenkool @ 2020-07-10 23:34 UTC
To: Xiong Hu Luo; +Cc: gcc-patches, wschmidt
Hi!
On Thu, Jul 09, 2020 at 09:14:44PM -0500, Xiong Hu Luo wrote:
> Move V4SF to V4SI, init vector like V4SI and move to V4SF back.
> Better instruction sequence could be generated on Power9:
> The point is to use lwz to avoid converting the single-precision to
> double-precision upon load, pack four 32-bit data into one 128-bit
> register directly.
> + rtx tmpSF[4];
> + rtx tmpSI[4];
> + rtx tmpDI[4];
> + rtx mrgDI[4];
Don't use upper case in variable names like this please. Either tmpsf
or tmp_sf is fine.
> + emit_move_insn (target, gen_lowpart (V4SFmode, tmpV2DI));
(This is a good example of why: it isn't obvious from just seeing this
that the tmpV2DI is a variable, while the V4SFmode is a symbolic
constant).
Looks fine other than that :-)
Segher
* Re: [PATCH 2/2] rs6000: Define define_insn_and_split to split unspec sldi+or to rldimi
From: Segher Boessenkool @ 2020-07-11 0:28 UTC
To: Xiong Hu Luo; +Cc: gcc-patches, wschmidt
Hi!
On Thu, Jul 09, 2020 at 09:14:45PM -0500, Xiong Hu Luo wrote:
> * config/rs6000/rs6000.md (rotl_unspec): New
> define_insn_and_split.
> +; rldimi with UNSPEC_SI_FROM_SF.
> +(define_insn_and_split "*rotl_unspec"
Please have rotldi3_insert in the name. "unspec" in the name doesn't
really mean much... Can you put "sf" in the name, instead? So something
like "*rotldi3_insert_sf"?
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vector_float.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
This needs p9vector_ok (yes, that name doesn't make too much sense).
> +vector float
> +test (float *a, float *b, float *c, float *d)
> +{
> + return (vector float){*a, *b, *c, *d};
> +}
> +
> +/* { dg-final { scan-assembler-not {\mlxsspx\M} } } */
> +/* { dg-final { scan-assembler-not {\mlfs\M} } } */
No lxssp or lfsx either... or the update forms...
/* { dg-final { scan-assembler-not {\mlxssp} } } */
/* { dg-final { scan-assembler-not {\mlfs} } } */
works fine (there are no other mnemonics starting with those strings).
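The effect of dropping the trailing \M can be illustrated with a plain prefix check in C. This is only a rough analogy for a single mnemonic token, not how the DejaGnu scan itself is implemented, and starts_with is a hypothetical helper name.

```c
#include <assert.h>
#include <string.h>

/* Return nonzero if MNEMONIC begins with PREFIX.  This mimics a pattern
   anchored at the start of a word but not at its end, so one pattern
   covers a base mnemonic and its indexed and update forms.  */
int
starts_with (const char *mnemonic, const char *prefix)
{
  assert (mnemonic != NULL && prefix != NULL);
  return strncmp (mnemonic, prefix, strlen (prefix)) == 0;
}
```

Under this model the prefix lfs covers lfs, lfsx, lfsu and lfsux, and lxssp covers lxssp and lxsspx, while lwz is matched by neither.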
> +/* { dg-final { scan-assembler-times {\mlwz\M} 4 } } */
> +/* { dg-final { scan-assembler-times {\mrldimi\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
Okay for trunk with those changes (or post again if you prefer). Thanks!
Segher
* Re: [PATCH 2/2] rs6000: Define define_insn_and_split to split unspec sldi+or to rldimi
From: luoxhu @ 2020-07-13 1:27 UTC
To: Segher Boessenkool; +Cc: gcc-patches, wschmidt
On 2020/7/11 08:28, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Jul 09, 2020 at 09:14:45PM -0500, Xiong Hu Luo wrote:
>> * config/rs6000/rs6000.md (rotl_unspec): New
>> define_insn_and_split.
>
>> +; rldimi with UNSPEC_SI_FROM_SF.
>> +(define_insn_and_split "*rotl_unspec"
>
> Please have rotldi3_insert in the name. "unspec" in the name doesn't
> really mean much... Can you put "sf" in the name, instead? So something
> like "*rotldi3_insert_sf"?
>
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/vector_float.c
>> @@ -0,0 +1,14 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
>
> This needs p9vector_ok (yes, that name doesn't make too much sense).
>
>> +vector float
>> +test (float *a, float *b, float *c, float *d)
>> +{
>> + return (vector float){*a, *b, *c, *d};
>> +}
>> +
>> +/* { dg-final { scan-assembler-not {\mlxsspx\M} } } */
>> +/* { dg-final { scan-assembler-not {\mlfs\M} } } */
>
> No lxssp or lfsx either... or the update forms...
>
> /* { dg-final { scan-assembler-not {\mlxssp} } } */
> /* { dg-final { scan-assembler-not {\mlfs} } } */
>
> works fine (there are no other mnemonics starting with those strings).
>
>> +/* { dg-final { scan-assembler-times {\mlwz\M} 4 } } */
>> +/* { dg-final { scan-assembler-times {\mrldimi\M} 2 } } */
>> +/* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
>
> Okay for trunk with those changes (or post again if you prefer). Thanks!
>
Thanks. Both patches were committed to trunk (r11-2043, r11-2044) after
the modifications.
Xionghu
* Re: [PATCH 2/2] rs6000: Define define_insn_and_split to split unspec sldi+or to rldimi
From: Andreas Schwab @ 2020-08-03 11:21 UTC
To: Xiong Hu Luo via Gcc-patches; +Cc: Xiong Hu Luo, wschmidt, segher
On Jul 09 2020, Xiong Hu Luo via Gcc-patches wrote:
> diff --git a/gcc/testsuite/gcc.target/powerpc/vector_float.c b/gcc/testsuite/gcc.target/powerpc/vector_float.c
> new file mode 100644
> index 00000000000..414824ad264
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vector_float.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
> +
> +vector float
> +test (float *a, float *b, float *c, float *d)
> +{
> + return (vector float){*a, *b, *c, *d};
> +}
> +
> +/* { dg-final { scan-assembler-not {\mlxsspx\M} } } */
> +/* { dg-final { scan-assembler-not {\mlfs\M} } } */
> +/* { dg-final { scan-assembler-times {\mlwz\M} 4 } } */
> +/* { dg-final { scan-assembler-times {\mrldimi\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
This fails with -m32.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
* Re: [PATCH 2/2] rs6000: Define define_insn_and_split to split unspec sldi+or to rldimi
From: Segher Boessenkool @ 2020-08-03 22:20 UTC
To: Andreas Schwab; +Cc: Xiong Hu Luo via Gcc-patches, Xiong Hu Luo, wschmidt
On Mon, Aug 03, 2020 at 01:21:21PM +0200, Andreas Schwab wrote:
> On Jul 09 2020, Xiong Hu Luo via Gcc-patches wrote:
>
> > diff --git a/gcc/testsuite/gcc.target/powerpc/vector_float.c b/gcc/testsuite/gcc.target/powerpc/vector_float.c
> > new file mode 100644
> > index 00000000000..414824ad264
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/vector_float.c
> > @@ -0,0 +1,14 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
> > +
> > +vector float
> > +test (float *a, float *b, float *c, float *d)
> > +{
> > + return (vector float){*a, *b, *c, *d};
> > +}
> > +
> > +/* { dg-final { scan-assembler-not {\mlxsspx\M} } } */
> > +/* { dg-final { scan-assembler-not {\mlfs\M} } } */
> > +/* { dg-final { scan-assembler-times {\mlwz\M} 4 } } */
> > +/* { dg-final { scan-assembler-times {\mrldimi\M} 2 } } */
> > +/* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
>
> This fails with -m32.
Fixed ( https://gcc.gnu.org/g:c004b383aa41 ). Thanks!
Segher