* [PATCH] Fix ICE with xmm{16-31} in *truncdfsf_fast_mixed with -mtune=barcelona (PR target/70086)
@ 2016-03-05 6:40 Jakub Jelinek
2016-03-09 13:06 ` Uros Bizjak
0 siblings, 1 reply; 5+ messages in thread
From: Jakub Jelinek @ 2016-03-05 6:40 UTC
To: Kirill Yukhin, Uros Bizjak; +Cc: gcc-patches
Hi!
The r222470 commit changed =x into =v constraint in *truncdfsf_fast_mixed.
The problem is that for some tunings we have a splitter
/* For converting DF(xmm2) to SF(xmm1), use the following code instead of
cvtsd2ss:
unpcklpd xmm2,xmm2 ; packed conversion might crash on signaling NaNs
cvtpd2ps xmm2,xmm1
If the input operand is in memory, the splitter attempts to emit the
sse2_loadlpd instruction. But that define_insn doesn't have any v
constraints, so we fail to recognize it when the destination is one of
xmm16-31. The two-operand vmovsd m -> v load is also implemented by
*vec_concatv2df, which does have v alternatives.
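As an illustration of the recognition failure described above, here is a
toy C model (not GCC code) of how the x and v constraint letters differ.
The function names and the have_evex flag are hypothetical
simplifications: in GCC, x covers xmm0-15, while v additionally covers
xmm16-31 when the enabled ISA permits it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC code: does a single constraint letter accept the
   given xmm register number?  'x' accepts xmm0-xmm15 only; 'v' also
   accepts xmm16-xmm31, but only when EVEX encodings are available
   (have_evex is a simplification of the real ISA checks).  */
bool
constraint_accepts (char constraint, int xmm_regno, bool have_evex)
{
  if (xmm_regno < 0 || xmm_regno > 31)
    return false;
  if (xmm_regno < 16)
    return constraint == 'x' || constraint == 'v';
  return constraint == 'v' && have_evex;
}

/* Hypothetical helper: ALTS holds one constraint letter per alternative
   of the destination operand.  The insn is "recognized" in this model
   if at least one alternative accepts the register.  */
bool
any_alternative_accepts (const char *alts, int xmm_regno, bool have_evex)
{
  assert (alts != 0);
  for (; *alts; alts++)
    if (constraint_accepts (*alts, xmm_regno, have_evex))
      return true;
  return false;
}
```

In this model, a pattern whose destination alternatives are all x (like
sse2_loadlpd as described above) rejects xmm16, while a pattern with at
least one v alternative (like *vec_concatv2df) accepts it.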
So I see 3 options for this:
1) as the patch does, emit *vec_concatv2df manually
2) rename *vec_concatv2df to vec_concatv2df and use gen_vec_concatv2df
in the splitter; possibly use it instead of sse2_loadlpd there, because
that insn has uglier/more complex pattern
3) tweak sse2_loadlpd - add various v alternatives to it, guard them with
avx512vl isa, etc.
I bet option 3) is the desirable long-term treatment, and many other
instructions likely need it too, but it doesn't sound like stage4 material
to me; I find it quite risky. Do you agree? If yes, the following patch
can work temporarily
(bootstrapped/regtested on x86_64-linux and i686-linux), or I can do 2),
but in that case I'd like to know your preferences about the suboption
(whether to replace gen_sse2_loadlpd with gen_vec_concatv2df or whether
to use it only for the EXT_REX_SSE_REG_P regs).
2016-03-04 Jakub Jelinek <jakub@redhat.com>
PR target/70086
* config/i386/i386.md (truncdfsf2 splitter): Handle
EXT_REX_SSE_REG_P destination with memory input.
* gcc.target/i386/pr70086-1.c: New test.
* gcc.target/i386/pr70086-2.c: New test.
--- gcc/config/i386/i386.md.jj 2016-03-02 14:09:50.000000000 +0100
+++ gcc/config/i386/i386.md 2016-03-04 22:56:32.206840674 +0100
@@ -4392,6 +4392,11 @@ (define_split
operands[4] = simplify_gen_subreg (V2DFmode, operands[1], DFmode, 0);
emit_insn (gen_vec_dupv2df (operands[4], operands[1]));
}
+ else if (EXT_REX_SSE_REG_P (operands[4]))
+ /* Emit *vec_concatv2df. */
+ emit_insn (gen_rtx_SET (operands[4],
+ gen_rtx_VEC_CONCAT (V2DFmode, operands[1],
+ CONST0_RTX (DFmode))));
else
emit_insn (gen_sse2_loadlpd (operands[4],
CONST0_RTX (V2DFmode), operands[1]));
--- gcc/testsuite/gcc.target/i386/pr70086-1.c.jj 2016-03-04 23:01:07.447081169 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-1.c 2016-03-04 23:00:27.000000000 +0100
@@ -0,0 +1,11 @@
+/* PR target/70086 */
+/* { dg-do compile } */
+/* { dg-options "-mtune=barcelona -mavx512vl -ffloat-store" } */
+
+float
+foo (float a, float b, double c, float d, double e, float f)
+{
+ e -= d;
+ d *= e;
+ return e + d;
+}
--- gcc/testsuite/gcc.target/i386/pr70086-2.c.jj 2016-03-04 23:01:07.447081169 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-2.c 2016-03-04 23:00:27.000000000 +0100
@@ -0,0 +1,12 @@
+/* PR target/70086 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mtune=barcelona -mavx512vl" } */
+
+float
+foo (double *p)
+{
+ register float xmm16 __asm ("xmm16");
+ xmm16 = *p;
+ asm volatile ("" : "+v" (xmm16));
+ return xmm16;
+}
Jakub
* Re: [PATCH] Fix ICE with xmm{16-31} in *truncdfsf_fast_mixed with -mtune=barcelona (PR target/70086)
From: Uros Bizjak @ 2016-03-09 13:06 UTC
To: Jakub Jelinek; +Cc: Kirill Yukhin, gcc-patches
On Sat, Mar 5, 2016 at 7:39 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> The r222470 commit changed =x into =v constraint in *truncdfsf_fast_mixed.
> The problem is that for some tunings we have a splitter
> /* For converting DF(xmm2) to SF(xmm1), use the following code instead of
> cvtsd2ss:
> unpcklpd xmm2,xmm2 ; packed conversion might crash on signaling NaNs
> cvtpd2ps xmm2,xmm1
> If the input operand is memory, it attempts to emit sse2_loadlpd
> instruction. But, that define_insn doesn't have any v constraints and so we
> fail to recognize it. For the vmovsd 2 operand m -> v instruction
> *vec_concatv2df implements that too.
> So I see 3 options for this:
> 1) as the patch does, emit *vec_concatv2df manually
> 2) rename *vec_concatv2df to vec_concatv2df and use gen_vec_concatv2df
> in the splitter; possibly use it instead of sse2_loadlpd there, because
> that insn has uglier/more complex pattern
> 3) tweak sse2_loadlpd - add various v alternatives to it, guard them with
> avx512vl isa, etc.
>
> I bet the 3) treatment is desirable and likely many other instructions need
> it, but that doesn't sound like stage4 material to me, I find it quite
> risky, do you agree? If yes, the following patch can work temporarily
> (bootstrapped/regtested on x86_64-linux and i686-linux), or I can do 2),
> but in that case I'd like to know your preferences about the suboption
> (whether to replace gen_sse2_loadlpd with gen_vec_concatv2df or whether
> to use it only for the EXT_REX_SSE_REG_P regs).
Let's go with option 2) and always generate vec_concatv2df, as we
only need it for the [v,m,C] alternative. In the long term, we should
enhance all patterns with new alternatives, but not in stage 4.
The attached (lightly tested) patch implements option 2) and also allows
us to simplify the splitter's enable condition a bit.
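For readers unfamiliar with the RTL involved, the following is a scalar C
sketch of what the memory-input path computes: a vec_concat of the loaded
DF value with zero (the [v,m,C] case), followed by the packed
double-to-float conversion. The struct and function names are
hypothetical stand-ins; this models the arithmetic only, not GCC
internals or exception behaviour.

```c
#include <assert.h>

/* Scalar stand-ins for the V2DF and V4SF vector modes (illustrative).  */
struct v2df { double e[2]; };
struct v4sf { float e[4]; };

/* Models (vec_concat:V2DF lo hi): LO goes to element 0, HI to element 1.  */
struct v2df
model_vec_concat (double lo, double hi)
{
  struct v2df r = { { lo, hi } };
  return r;
}

/* Models cvtpd2ps: convert both doubles to float in the two low
   elements and zero the two high elements.  */
struct v4sf
model_cvtpd2ps (struct v2df x)
{
  struct v4sf r = { { (float) x.e[0], (float) x.e[1], 0.0f, 0.0f } };
  return r;
}

/* Memory-input path: load *P into the low element with the high
   element zeroed, convert, and take element 0 as the SF result.  */
float
model_trunc_from_mem (const double *p)
{
  assert (p != 0);
  struct v4sf r = model_cvtpd2ps (model_vec_concat (*p, 0.0));
  return r.e[0];
}
```

Because the upper element is a known zero rather than leftover register
contents, the packed conversion only ever sees the value being truncated
plus a harmless 0.0.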
Uros.
[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 971 bytes --]
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index cb8bcec..ef80d6a 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4362,9 +4362,8 @@
(match_operand:DF 1 "nonimmediate_operand")))]
"TARGET_USE_VECTOR_FP_CONVERTS
&& optimize_insn_for_speed_p ()
- && reload_completed && SSE_REG_P (operands[0])
- && (!EXT_REX_SSE_REG_P (operands[0])
- || TARGET_AVX512VL)"
+ && reload_completed
+ && SSE_REG_P (operands[0])"
[(set (match_dup 2)
(vec_concat:V4SF
(float_truncate:V2SF
@@ -4393,8 +4392,10 @@
emit_insn (gen_vec_dupv2df (operands[4], operands[1]));
}
else
- emit_insn (gen_sse2_loadlpd (operands[4],
- CONST0_RTX (V2DFmode), operands[1]));
+ /* Emit *vec_concatv2df. */
+ emit_insn (gen_rtx_SET (operands[4],
+ gen_rtx_VEC_CONCAT (V2DFmode, operands[1],
+ CONST0_RTX (DFmode))));
})
;; It's more profitable to split and then extend in the same register.
* Re: [PATCH] Fix ICE with xmm{16-31} in *truncdfsf_fast_mixed with -mtune=barcelona (PR target/70086)
From: Jakub Jelinek @ 2016-03-09 14:51 UTC
To: Uros Bizjak; +Cc: Kirill Yukhin, gcc-patches
On Wed, Mar 09, 2016 at 02:06:03PM +0100, Uros Bizjak wrote:
> Let's go with the option 2) and always generate vec_concatv2df, as we
> only need it for [v,m,C] alternative. In the long term, we should
> enhance all patterns with new alternatives, but not in stage-4.
Ok, see patch below.
> Attached (lightly tested) patch that implements option 2) also allows
> us to simplify splitter enable condition a bit.
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index cb8bcec..ef80d6a 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -4362,9 +4362,8 @@
> (match_operand:DF 1 "nonimmediate_operand")))]
> "TARGET_USE_VECTOR_FP_CONVERTS
> && optimize_insn_for_speed_p ()
> - && reload_completed && SSE_REG_P (operands[0])
> - && (!EXT_REX_SSE_REG_P (operands[0])
> - || TARGET_AVX512VL)"
> + && reload_completed
> + && SSE_REG_P (operands[0])"
> [(set (match_dup 2)
> (vec_concat:V4SF
> (float_truncate:V2SF
Unfortunately, the simplified enable condition really doesn't seem to work;
I get ICEs on the testcases. I've tried to allow EXT_REX_SSE_REG_P for
-mavx512f -mno-avx512vl just for MEM_P (operands[1]), but even that ICEs.
Perhaps there are bugs in other splitters.
I'll bootstrap/regtest this then:
2016-03-04 Jakub Jelinek <jakub@redhat.com>
PR target/70086
* config/i386/i386.md (truncdfsf2 splitter): Use gen_vec_concatv2df
instead of gen_sse2_loadlpd.
* config/i386/sse.md (*vec_concatv2df): Rename to...
(vec_concatv2df): ... this.
* gcc.target/i386/pr70086-1.c: New test.
* gcc.target/i386/pr70086-2.c: New test.
* gcc.target/i386/pr70086-3.c: New test.
--- gcc/config/i386/i386.md.jj 2016-03-08 09:01:50.871475493 +0100
+++ gcc/config/i386/i386.md 2016-03-09 15:40:00.102942847 +0100
@@ -4393,8 +4393,8 @@ (define_split
emit_insn (gen_vec_dupv2df (operands[4], operands[1]));
}
else
- emit_insn (gen_sse2_loadlpd (operands[4],
- CONST0_RTX (V2DFmode), operands[1]));
+ emit_insn (gen_vec_concatv2df (operands[4], operands[1],
+ CONST0_RTX (DFmode)));
})
;; It's more profitable to split and then extend in the same register.
--- gcc/config/i386/sse.md.jj 2016-03-09 15:08:17.000000000 +0100
+++ gcc/config/i386/sse.md 2016-03-09 15:15:10.346223894 +0100
@@ -8951,7 +8951,7 @@ (define_insn "vec_dupv2df<mask_name>"
(set_attr "prefix" "orig,maybe_vex,evex")
(set_attr "mode" "V2DF,DF,DF")])
-(define_insn "*vec_concatv2df"
+(define_insn "vec_concatv2df"
[(set (match_operand:V2DF 0 "register_operand" "=x,x,v,x,v,x,x,v,x,x")
(vec_concat:V2DF
(match_operand:DF 1 "nonimmediate_operand" " 0,x,v,m,m,0,x,m,0,0")
--- gcc/testsuite/gcc.target/i386/pr70086-1.c.jj 2016-03-09 15:12:55.177060382 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-1.c 2016-03-09 15:12:55.177060382 +0100
@@ -0,0 +1,11 @@
+/* PR target/70086 */
+/* { dg-do compile } */
+/* { dg-options "-mtune=barcelona -mavx512vl -ffloat-store" } */
+
+float
+foo (float a, float b, double c, float d, double e, float f)
+{
+ e -= d;
+ d *= e;
+ return e + d;
+}
--- gcc/testsuite/gcc.target/i386/pr70086-2.c.jj 2016-03-09 15:12:55.177060382 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-2.c 2016-03-09 15:35:52.000000000 +0100
@@ -0,0 +1,21 @@
+/* PR target/70086 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mtune=barcelona -mavx512vl" } */
+
+float
+foo (double *p)
+{
+ register float xmm16 __asm ("xmm16");
+ xmm16 = *p;
+ asm volatile ("" : "+v" (xmm16));
+ return xmm16;
+}
+
+float
+bar (double x)
+{
+ register float xmm16 __asm ("xmm16");
+ xmm16 = x;
+ asm volatile ("" : "+v" (xmm16));
+ return xmm16;
+}
--- gcc/testsuite/gcc.target/i386/pr70086-3.c.jj 2016-03-09 15:36:28.332831118 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-3.c 2016-03-09 15:35:33.000000000 +0100
@@ -0,0 +1,21 @@
+/* PR target/70086 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mtune=barcelona -mavx512f -mno-avx512vl" } */
+
+float
+foo (double *p)
+{
+ register float xmm16 __asm ("xmm16");
+ xmm16 = *p;
+ asm volatile ("" : "+v" (xmm16));
+ return xmm16;
+}
+
+float
+bar (double x)
+{
+ register float xmm16 __asm ("xmm16");
+ xmm16 = x;
+ asm volatile ("" : "+v" (xmm16));
+ return xmm16;
+}
Jakub
* Re: [PATCH] Fix ICE with xmm{16-31} in *truncdfsf_fast_mixed with -mtune=barcelona (PR target/70086)
From: Jakub Jelinek @ 2016-03-09 16:58 UTC
To: Uros Bizjak; +Cc: Kirill Yukhin, gcc-patches
On Wed, Mar 09, 2016 at 03:51:04PM +0100, Jakub Jelinek wrote:
> Unfortunately, this really doesn't seem to work, I get ICEs on the
> testcases. I've tried to allow EXT_REX_SSE_REG_P for -mavx512f -mno-avx512vl
> just for MEM_P (operands[1]), but even that ICEs. Perhaps there are bugs
> in other splitters.
>
> I'll bootstrap/regtest this then:
>
> 2016-03-04 Jakub Jelinek <jakub@redhat.com>
>
> PR target/70086
> * config/i386/i386.md (truncdfsf2 splitter): Use gen_vec_concatv2df
> instead of gen_sse2_loadlpd.
> * config/i386/sse.md (*vec_concatv2df): Rename to...
> (vec_concatv2df): ... this.
>
> * gcc.target/i386/pr70086-1.c: New test.
> * gcc.target/i386/pr70086-2.c: New test.
> * gcc.target/i386/pr70086-3.c: New test.
Now successfully bootstrapped/regtested on x86_64-linux and i686-linux.
Ok for trunk?
Jakub
* Re: [PATCH] Fix ICE with xmm{16-31} in *truncdfsf_fast_mixed with -mtune=barcelona (PR target/70086)
From: Uros Bizjak @ 2016-03-09 19:37 UTC
To: Jakub Jelinek; +Cc: Kirill Yukhin, gcc-patches
On Wed, Mar 9, 2016 at 5:58 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Mar 09, 2016 at 03:51:04PM +0100, Jakub Jelinek wrote:
>> Unfortunately, this really doesn't seem to work, I get ICEs on the
>> testcases. I've tried to allow EXT_REX_SSE_REG_P for -mavx512f -mno-avx512vl
>> just for MEM_P (operands[1]), but even that ICEs. Perhaps there are bugs
>> in other splitters.
>>
>> I'll bootstrap/regtest this then:
>>
>> 2016-03-04 Jakub Jelinek <jakub@redhat.com>
>>
>> PR target/70086
>> * config/i386/i386.md (truncdfsf2 splitter): Use gen_vec_concatv2df
>> instead of gen_sse2_loadlpd.
>> * config/i386/sse.md (*vec_concatv2df): Rename to...
>> (vec_concatv2df): ... this.
>>
>> * gcc.target/i386/pr70086-1.c: New test.
>> * gcc.target/i386/pr70086-2.c: New test.
>> * gcc.target/i386/pr70086-3.c: New test.
>
> Now successfully bootstrapped/regtested on x86_64-linux and i686-linux.
> Ok for trunk?
OK.
Thanks,
Uros.