public inbox for gcc-patches@gcc.gnu.org
* [PATCH] Fix ICE with xmm{16-31} in *truncdfsf_fast_mixed with -mtune=barcelona (PR target/70086)
@ 2016-03-05  6:40 Jakub Jelinek
  2016-03-09 13:06 ` Uros Bizjak
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Jelinek @ 2016-03-05  6:40 UTC (permalink / raw)
  To: Kirill Yukhin, Uros Bizjak; +Cc: gcc-patches

Hi!

The r222470 commit changed the =x constraint into =v in *truncdfsf_fast_mixed.
The problem is that for some tunings we have a splitter, documented as:
/* For converting DF(xmm2) to SF(xmm1), use the following code instead of
   cvtsd2ss:
      unpcklpd xmm2,xmm2   ; packed conversion might crash on signaling NaNs
      cvtpd2ps xmm2,xmm1
If the input operand is in memory, the splitter attempts to emit the
sse2_loadlpd instruction.  But that define_insn doesn't have any v
constraints, so we fail to recognize it.  The two-operand vmovsd m -> v form
is also implemented by *vec_concatv2df.
So I see 3 options for this:
1) as the patch does, emit *vec_concatv2df manually
2) rename *vec_concatv2df to vec_concatv2df and use gen_vec_concatv2df
   in the splitter; possibly use it instead of sse2_loadlpd there, because
   that insn has uglier/more complex pattern
3) tweak sse2_loadlpd - add various v alternatives to it, guard them with
   avx512vl isa, etc.

I bet that 3) is the desirable treatment and that many other instructions
likely need it too, but it doesn't sound like stage4 material to me; I find
it quite risky.  Do you agree?  If so, the following patch can serve as a
temporary fix (bootstrapped/regtested on x86_64-linux and i686-linux), or I
can do 2), but in that case I'd like to know your preference about the
suboption (whether to replace gen_sse2_loadlpd with gen_vec_concatv2df
unconditionally or to use it only for the EXT_REX_SSE_REG_P regs).
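
For readers unfamiliar with the split sequence, here is a rough C model of
what the memory-input path computes.  It is only a sketch and not part of the
patch: the struct types and function names are hypothetical stand-ins, and it
assumes the load zeroes the upper lane, as the CONST0_RTX (DFmode) operand of
the VEC_CONCAT suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a V2DF and a V4SF register; not GCC types.  */
typedef struct { double lane[2]; } v2df_model;
typedef struct { float lane[4]; } v4sf_model;

/* Models (vec_concat:V2DF (mem:DF) (const 0.0)): the low lane comes from
   memory and the high lane is zero.  */
v2df_model
load_low_zero_high (const double *p)
{
  assert (p != NULL);
  v2df_model r = { { *p, 0.0 } };
  return r;
}

/* Models cvtpd2ps: both double lanes are converted to float and the two
   upper float lanes of the result are zeroed.  */
v4sf_model
packed_truncate (v2df_model v)
{
  v4sf_model r = { { (float) v.lane[0], (float) v.lane[1], 0.0f, 0.0f } };
  return r;
}

/* DF -> SF conversion of a memory operand through the packed path; only
   the low lane of the packed result is the value we care about.  */
float
truncdfsf_from_mem (const double *p)
{
  return packed_truncate (load_low_zero_high (p)).lane[0];
}
```

In this model the high lane fed to the packed conversion is always 0.0, so
only the low lane carries data from the original operand.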

2016-03-04  Jakub Jelinek  <jakub@redhat.com>

	PR target/70086
	* config/i386/i386.md (truncdfsf2 splitter): Handle
	EXT_REX_SSE_REG_P destination with memory input.

	* gcc.target/i386/pr70086-1.c: New test.
	* gcc.target/i386/pr70086-2.c: New test.

--- gcc/config/i386/i386.md.jj	2016-03-02 14:09:50.000000000 +0100
+++ gcc/config/i386/i386.md	2016-03-04 22:56:32.206840674 +0100
@@ -4392,6 +4392,11 @@ (define_split
 	operands[4] = simplify_gen_subreg (V2DFmode, operands[1], DFmode, 0);
       emit_insn (gen_vec_dupv2df (operands[4], operands[1]));
     }
+  else if (EXT_REX_SSE_REG_P (operands[4]))
+    /* Emit *vec_concatv2df.  */
+    emit_insn (gen_rtx_SET (operands[4],
+			    gen_rtx_VEC_CONCAT (V2DFmode, operands[1],
+						CONST0_RTX (DFmode))));
   else
     emit_insn (gen_sse2_loadlpd (operands[4],
 				 CONST0_RTX (V2DFmode), operands[1]));
--- gcc/testsuite/gcc.target/i386/pr70086-1.c.jj	2016-03-04 23:01:07.447081169 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-1.c	2016-03-04 23:00:27.000000000 +0100
@@ -0,0 +1,11 @@
+/* PR target/70086 */
+/* { dg-do compile } */
+/* { dg-options "-mtune=barcelona -mavx512vl -ffloat-store" } */
+
+float
+foo (float a, float b, double c, float d, double e, float f)
+{
+  e -= d;
+  d *= e;
+  return e + d;
+}
--- gcc/testsuite/gcc.target/i386/pr70086-2.c.jj	2016-03-04 23:01:07.447081169 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-2.c	2016-03-04 23:00:27.000000000 +0100
@@ -0,0 +1,12 @@
+/* PR target/70086 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mtune=barcelona -mavx512vl" } */
+
+float
+foo (double *p)
+{
+  register float xmm16 __asm ("xmm16");
+  xmm16 = *p;
+  asm volatile ("" : "+v" (xmm16));
+  return xmm16;
+}

	Jakub


* Re: [PATCH] Fix ICE with xmm{16-31} in *truncdfsf_fast_mixed with -mtune=barcelona (PR target/70086)
  2016-03-05  6:40 [PATCH] Fix ICE with xmm{16-31} in *truncdfsf_fast_mixed with -mtune=barcelona (PR target/70086) Jakub Jelinek
@ 2016-03-09 13:06 ` Uros Bizjak
  2016-03-09 14:51   ` Jakub Jelinek
  0 siblings, 1 reply; 5+ messages in thread
From: Uros Bizjak @ 2016-03-09 13:06 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Kirill Yukhin, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1873 bytes --]

On Sat, Mar 5, 2016 at 7:39 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> The r222470 commit changed =x into =v constraint in *truncdfsf_fast_mixed.
> The problem is that for some tunings we have a splitter
> /* For converting DF(xmm2) to SF(xmm1), use the following code instead of
>    cvtsd2ss:
>       unpcklpd xmm2,xmm2   ; packed conversion might crash on signaling NaNs
>       cvtpd2ps xmm2,xmm1
> If the input operand is memory, it attempts to emit sse2_loadlpd
> instruction.  But, that define_insn doesn't have any v constraints and so we
> fail to recognize it.  For the vmovsd 2 operand m -> v instruction
> *vec_concatv2df implements that too.
> So I see 3 options for this:
> 1) as the patch does, emit *vec_concatv2df manually
> 2) rename *vec_concatv2df to vec_concatv2df and use gen_vec_concatv2df
>    in the splitter; possibly use it instead of sse2_loadlpd there, because
>    that insn has uglier/more complex pattern
> 3) tweak sse2_loadlpd - add various v alternatives to it, guard them with
>    avx512vl isa, etc.
>
> I bet the 3) treatment is desirable and likely many other instructions need
> it, but that doesn't sound like stage4 material to me, I find it quite
> risky, do you agree?  If yes, the following patch can work temporarily
> (bootstrapped/regtested on x86_64-linux and i686-linux), or I can do 2),
> but in that case I'd like to know your preferences about the suboption
> (whether to replace gen_sse2_loadlpd with gen_vec_concatv2df or whether
> to use it only for the EXT_REX_SSE_REG_P regs).

Let's go with option 2) and always generate vec_concatv2df, as we
only need it for the [v,m,C] alternative. In the long term, we should
enhance all patterns with new alternatives, but not in stage-4.
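
As a rough illustration only (not GCC code), the following C sketch scans the
two operand constraint strings of vec_concatv2df that appear in the sse.md
hunk of this thread and lists the alternatives pairing a given destination
class with a given source class.  The helper names are hypothetical, and real
constraint matching in GCC involves far more than a character lookup.  The
constraint string for operand 2 is not shown in the thread, so the sketch
cannot tell which of the matches is the [v,m,C] alternative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy alternative number ALT (0-based) of a comma-separated constraint
   string into BUF, skipping leading spaces and the '=' modifier.
   Returns 0 on success, -1 if ALT does not exist or BUF is too small.  */
int
get_alternative (const char *constraints, int alt, char *buf, size_t buflen)
{
  assert (constraints != NULL && buf != NULL);
  const char *p = constraints;
  for (int i = 0; i < alt; i++)
    {
      p = strchr (p, ',');
      if (p == NULL)
        return -1;
      p++;
    }
  while (*p == ' ' || *p == '=')
    p++;
  size_t len = strcspn (p, ",");
  if (len + 1 > buflen)
    return -1;
  memcpy (buf, p, len);
  buf[len] = '\0';
  return 0;
}

/* Store in OUT (at most MAX entries) the alternatives in which operand 0
   lists class DEST and operand 1 lists class SRC.  Returns the number of
   entries stored.  */
int
matching_alternatives (const char *op0, const char *op1,
                       char dest, char src, int *out, int max)
{
  int n = 0;
  char a[8], b[8];
  for (int alt = 0;
       get_alternative (op0, alt, a, sizeof a) == 0
       && get_alternative (op1, alt, b, sizeof b) == 0;
       alt++)
    if (strchr (a, dest) != NULL && strchr (b, src) != NULL && n < max)
      out[n++] = alt;
  return n;
}
```

Called with "=x,x,v,x,v,x,x,v,x,x" and " 0,x,v,m,m,0,x,m,0,0", the pair
('v', 'm') yields alternatives 4 and 7 (counting from 0), which is where a
memory source can be loaded into a v-class destination.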

The attached (lightly tested) patch implements option 2) and also allows
us to simplify the splitter enable condition a bit.

Uros.

[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 971 bytes --]

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index cb8bcec..ef80d6a 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4362,9 +4362,8 @@
 	  (match_operand:DF 1 "nonimmediate_operand")))]
   "TARGET_USE_VECTOR_FP_CONVERTS
    && optimize_insn_for_speed_p ()
-   && reload_completed && SSE_REG_P (operands[0])
-   && (!EXT_REX_SSE_REG_P (operands[0])
-       || TARGET_AVX512VL)"
+   && reload_completed
+   && SSE_REG_P (operands[0])"
    [(set (match_dup 2)
 	 (vec_concat:V4SF
 	   (float_truncate:V2SF
@@ -4393,8 +4392,10 @@
       emit_insn (gen_vec_dupv2df (operands[4], operands[1]));
     }
   else
-    emit_insn (gen_sse2_loadlpd (operands[4],
-				 CONST0_RTX (V2DFmode), operands[1]));
+    /* Emit *vec_concatv2df.  */
+    emit_insn (gen_rtx_SET (operands[4],
+			    gen_rtx_VEC_CONCAT (V2DFmode, operands[1],
+						CONST0_RTX (DFmode))));
 })
 
 ;; It's more profitable to split and then extend in the same register.


* Re: [PATCH] Fix ICE with xmm{16-31} in *truncdfsf_fast_mixed with -mtune=barcelona (PR target/70086)
  2016-03-09 13:06 ` Uros Bizjak
@ 2016-03-09 14:51   ` Jakub Jelinek
  2016-03-09 16:58     ` Jakub Jelinek
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Jelinek @ 2016-03-09 14:51 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Kirill Yukhin, gcc-patches

On Wed, Mar 09, 2016 at 02:06:03PM +0100, Uros Bizjak wrote:
> Let's go with the option 2) and always generate vec_concatv2df, as we
> only need it for [v,m,C] alternative. In the long term, we should
> enhance all patterns with new alternatives, but not in stage-4.

Ok, see patch below.

> Attached (lightly tested) patch that implements option 2) also allows
> us to simplify splitter enable condition a bit.

> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index cb8bcec..ef80d6a 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -4362,9 +4362,8 @@
>  	  (match_operand:DF 1 "nonimmediate_operand")))]
>    "TARGET_USE_VECTOR_FP_CONVERTS
>     && optimize_insn_for_speed_p ()
> -   && reload_completed && SSE_REG_P (operands[0])
> -   && (!EXT_REX_SSE_REG_P (operands[0])
> -       || TARGET_AVX512VL)"
> +   && reload_completed
> +   && SSE_REG_P (operands[0])"
>     [(set (match_dup 2)
>  	 (vec_concat:V4SF
>  	   (float_truncate:V2SF

Unfortunately, this really doesn't seem to work; I get ICEs on the
testcases.  I've tried to allow EXT_REX_SSE_REG_P for -mavx512f -mno-avx512vl
just for MEM_P (operands[1]), but even that ICEs.  Perhaps there are bugs
in other splitters.
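
For reference, the enable condition that therefore stays in place (the one
the quoted hunk would have relaxed) can be modelled roughly as below.  This
is a sketch under stated assumptions: registers are represented by a plain
xmm index, the model_* helpers are hypothetical and only mimic the intent of
SSE_REG_P and EXT_REX_SSE_REG_P (the real macros take an rtx), and the
tuning and optimize-for-speed checks are folded into one flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: xmm0..xmm31 are SSE registers.  */
bool
model_sse_reg_p (int xmm)
{
  return xmm >= 0 && xmm <= 31;
}

/* Hypothetical model: xmm16..xmm31 are the extended registers.  */
bool
model_ext_rex_sse_reg_p (int xmm)
{
  return xmm >= 16 && xmm <= 31;
}

/* Mirrors the shape of the unchanged splitter condition: split only after
   reload, only for an SSE destination, and for xmm16..xmm31 only when
   AVX512VL is available.  */
bool
split_enabled (bool tune_and_speed, bool reload_completed,
               int dest_xmm, bool avx512vl)
{
  return tune_and_speed
         && reload_completed
         && model_sse_reg_p (dest_xmm)
         && (!model_ext_rex_sse_reg_p (dest_xmm) || avx512vl);
}

/* A few sample evaluations of the model.  */
void
check_examples (void)
{
  assert (split_enabled (true, true, 3, false));    /* xmm3, no AVX512VL */
  assert (split_enabled (true, true, 16, true));    /* xmm16 with AVX512VL */
  assert (!split_enabled (true, true, 16, false));  /* xmm16 without it */
}
```

Under this model, the xmm16 destination in the -mavx512f -mno-avx512vl test
is simply not split.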

I'll bootstrap/regtest this then:

2016-03-04  Jakub Jelinek  <jakub@redhat.com>

	PR target/70086
	* config/i386/i386.md (truncdfsf2 splitter): Use gen_vec_concatv2df
	instead of gen_sse2_loadlpd.
	* config/i386/sse.md (*vec_concatv2df): Rename to...
	(vec_concatv2df): ... this.

	* gcc.target/i386/pr70086-1.c: New test.
	* gcc.target/i386/pr70086-2.c: New test.
	* gcc.target/i386/pr70086-3.c: New test.

--- gcc/config/i386/i386.md.jj	2016-03-08 09:01:50.871475493 +0100
+++ gcc/config/i386/i386.md	2016-03-09 15:40:00.102942847 +0100
@@ -4393,8 +4393,8 @@ (define_split
       emit_insn (gen_vec_dupv2df (operands[4], operands[1]));
     }
   else
-    emit_insn (gen_sse2_loadlpd (operands[4],
-				 CONST0_RTX (V2DFmode), operands[1]));
+    emit_insn (gen_vec_concatv2df (operands[4], operands[1],
+				   CONST0_RTX (DFmode)));
 })
 
 ;; It's more profitable to split and then extend in the same register.
--- gcc/config/i386/sse.md.jj	2016-03-09 15:08:17.000000000 +0100
+++ gcc/config/i386/sse.md	2016-03-09 15:15:10.346223894 +0100
@@ -8951,7 +8951,7 @@ (define_insn "vec_dupv2df<mask_name>"
    (set_attr "prefix" "orig,maybe_vex,evex")
    (set_attr "mode" "V2DF,DF,DF")])
 
-(define_insn "*vec_concatv2df"
+(define_insn "vec_concatv2df"
   [(set (match_operand:V2DF 0 "register_operand"     "=x,x,v,x,v,x,x,v,x,x")
 	(vec_concat:V2DF
 	  (match_operand:DF 1 "nonimmediate_operand" " 0,x,v,m,m,0,x,m,0,0")
--- gcc/testsuite/gcc.target/i386/pr70086-1.c.jj	2016-03-09 15:12:55.177060382 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-1.c	2016-03-09 15:12:55.177060382 +0100
@@ -0,0 +1,11 @@
+/* PR target/70086 */
+/* { dg-do compile } */
+/* { dg-options "-mtune=barcelona -mavx512vl -ffloat-store" } */
+
+float
+foo (float a, float b, double c, float d, double e, float f)
+{
+  e -= d;
+  d *= e;
+  return e + d;
+}
--- gcc/testsuite/gcc.target/i386/pr70086-2.c.jj	2016-03-09 15:12:55.177060382 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-2.c	2016-03-09 15:35:52.000000000 +0100
@@ -0,0 +1,21 @@
+/* PR target/70086 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mtune=barcelona -mavx512vl" } */
+
+float
+foo (double *p)
+{
+  register float xmm16 __asm ("xmm16");
+  xmm16 = *p;
+  asm volatile ("" : "+v" (xmm16));
+  return xmm16;
+}
+
+float
+bar (double x)
+{
+  register float xmm16 __asm ("xmm16");
+  xmm16 = x;
+  asm volatile ("" : "+v" (xmm16));
+  return xmm16;
+}
--- gcc/testsuite/gcc.target/i386/pr70086-3.c.jj	2016-03-09 15:36:28.332831118 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-3.c	2016-03-09 15:35:33.000000000 +0100
@@ -0,0 +1,21 @@
+/* PR target/70086 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mtune=barcelona -mavx512f -mno-avx512vl" } */
+
+float
+foo (double *p)
+{
+  register float xmm16 __asm ("xmm16");
+  xmm16 = *p;
+  asm volatile ("" : "+v" (xmm16));
+  return xmm16;
+}
+
+float
+bar (double x)
+{
+  register float xmm16 __asm ("xmm16");
+  xmm16 = x;
+  asm volatile ("" : "+v" (xmm16));
+  return xmm16;
+}

	Jakub


* Re: [PATCH] Fix ICE with xmm{16-31} in *truncdfsf_fast_mixed with -mtune=barcelona (PR target/70086)
  2016-03-09 14:51   ` Jakub Jelinek
@ 2016-03-09 16:58     ` Jakub Jelinek
  2016-03-09 19:37       ` Uros Bizjak
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Jelinek @ 2016-03-09 16:58 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Kirill Yukhin, gcc-patches

On Wed, Mar 09, 2016 at 03:51:04PM +0100, Jakub Jelinek wrote:
> Unfortunately, this really doesn't seem to work, I get ICEs on the
> testcases.  I've tried to allow EXT_REX_SSE_REG_P for -mavx512f -mno-avx512vl
> just for MEM_P (operands[1]), but even that ICEs.  Perhaps there are bugs
> in other splitters.
> 
> I'll bootstrap/regtest this then:
> 
> 2016-03-04  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/70086
> 	* config/i386/i386.md (truncdfsf2 splitter): Use gen_vec_concatv2df
> 	instead of gen_sse2_loadlpd.
> 	* config/i386/sse.md (*vec_concatv2df): Rename to...
> 	(vec_concatv2df): ... this.
> 
> 	* gcc.target/i386/pr70086-1.c: New test.
> 	* gcc.target/i386/pr70086-2.c: New test.
> 	* gcc.target/i386/pr70086-3.c: New test.

Now successfully bootstrapped/regtested on x86_64-linux and i686-linux.
Ok for trunk?


	Jakub


* Re: [PATCH] Fix ICE with xmm{16-31} in *truncdfsf_fast_mixed with -mtune=barcelona (PR target/70086)
  2016-03-09 16:58     ` Jakub Jelinek
@ 2016-03-09 19:37       ` Uros Bizjak
  0 siblings, 0 replies; 5+ messages in thread
From: Uros Bizjak @ 2016-03-09 19:37 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Kirill Yukhin, gcc-patches

On Wed, Mar 9, 2016 at 5:58 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Mar 09, 2016 at 03:51:04PM +0100, Jakub Jelinek wrote:
>> Unfortunately, this really doesn't seem to work, I get ICEs on the
>> testcases.  I've tried to allow EXT_REX_SSE_REG_P for -mavx512f -mno-avx512vl
>> just for MEM_P (operands[1]), but even that ICEs.  Perhaps there are bugs
>> in other splitters.
>>
>> I'll bootstrap/regtest this then:
>>
>> 2016-03-04  Jakub Jelinek  <jakub@redhat.com>
>>
>>       PR target/70086
>>       * config/i386/i386.md (truncdfsf2 splitter): Use gen_vec_concatv2df
>>       instead of gen_sse2_loadlpd.
>>       * config/i386/sse.md (*vec_concatv2df): Rename to...
>>       (vec_concatv2df): ... this.
>>
>>       * gcc.target/i386/pr70086-1.c: New test.
>>       * gcc.target/i386/pr70086-2.c: New test.
>>       * gcc.target/i386/pr70086-3.c: New test.
>
> Now successfully bootstrapped/regtested on x86_64-linux and i686-linux.
> Ok for trunk?

OK.

Thanks,
Uros.


