public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [rtl, patch] combine concat+shuffle
@ 2012-05-07 20:17 Marc Glisse
  2012-05-08  9:40 ` Richard Sandiford
  0 siblings, 1 reply; 12+ messages in thread
From: Marc Glisse @ 2012-05-07 20:17 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1443 bytes --]

Hello,

this patch combines for vectors a concat and a shuffle. An example on x86 
would be:

__m128d f(double d){
   __m128d x=_mm_setr_pd(-d,d);
   return _mm_shuffle_pd(x,x,1);
}

which was compiled as:

 	vmovsd	.LC0(%rip), %xmm1
 	vxorpd	%xmm0, %xmm1, %xmm1
 	vunpcklpd	%xmm0, %xmm1, %xmm0
 	vshufpd	$1, %xmm0, %xmm0, %xmm0

and with the patch:

 	vmovsd	.LC0(%rip), %xmm1
 	vxorpd	%xmm0, %xmm1, %xmm1
 	vunpcklpd	%xmm1, %xmm0, %xmm0

This happens a lot in my code, for interval arithmetics, where I have a 
number d, build an interval (-d,d) from it, then subtract that interval 
from an other one, and subtraction is implemented as shufpd+addpd.

The patch is quite specialized, but I guessed I could start there, and it 
can always be generalized later.

For the testsuite, since the patch is not in a particular target, it would 
be better to have a generic test (in gcc.dg?), but I don't really know how 
to write a generic one, so would a test in gcc.target/i386 that scans 
the asm for shuf or perm be ok?

Ah, and if I use __builtin_shuffle instead of _mm_shuffle_pd, the patch 
works without -mavx, but -mavx uses vpermilpd (ie a vec_select:V2DF 
(reg:V2DF) ...) instead of a vshufpd, so I'll probably want to handle that 
too later. I thought about doing a general transformation from 
vec_select(vec_concat(x,x),*) to vec_select(x,*) (reducing the indexes in 
* so they fit), but that seemed way too dangerous.

-- 
Marc Glisse

[-- Attachment #2: Type: TEXT/PLAIN, Size: 1328 bytes --]

Index: simplify-rtx.c
===================================================================
--- simplify-rtx.c	(revision 187228)
+++ simplify-rtx.c	(working copy)
@@ -3268,10 +3268,32 @@ simplify_binary_operation_1 (enum rtx_co
 
 	  if (GET_MODE (vec) == mode)
 	    return vec;
 	}
 
+      /* If we build {a,b} then permute it, build the result directly.  */
+      if (XVECLEN (trueop1, 0) == 2
+	  && CONST_INT_P (XVECEXP (trueop1, 0, 0))
+	  && CONST_INT_P (XVECEXP (trueop1, 0, 1))
+	  && GET_CODE (trueop0) == VEC_CONCAT
+	  && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop0, 1))
+	  && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
+	  && GET_MODE (XEXP (trueop0, 0)) == mode)
+	{
+	  int offset0 = INTVAL (XVECEXP (trueop1, 0, 0)) % 2;
+	  int offset1 = INTVAL (XVECEXP (trueop1, 0, 1)) % 2;
+	  rtx baseop  = XEXP (trueop0, 0);
+	  rtx baseop0 = XEXP (baseop , 0);
+	  rtx baseop1 = XEXP (baseop , 1);
+	  baseop0 = avoid_constant_pool_reference (baseop0);
+	  baseop1 = avoid_constant_pool_reference (baseop1);
+
+	  return simplify_gen_binary (VEC_CONCAT, mode,
+	     offset0 ? baseop1 : baseop0,
+	     offset1 ? baseop1 : baseop0);
+	}
+
       return 0;
     case VEC_CONCAT:
       {
 	enum machine_mode op0_mode = (GET_MODE (trueop0) != VOIDmode
 				      ? GET_MODE (trueop0)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [rtl, patch] combine concat+shuffle
  2012-05-07 20:17 [rtl, patch] combine concat+shuffle Marc Glisse
@ 2012-05-08  9:40 ` Richard Sandiford
  2012-05-08 10:30   ` Marc Glisse
  2012-05-08 16:33   ` Marc Glisse
  0 siblings, 2 replies; 12+ messages in thread
From: Richard Sandiford @ 2012-05-08  9:40 UTC (permalink / raw)
  To: Marc Glisse; +Cc: gcc-patches

Looks like a good idea.

Marc Glisse <marc.glisse@inria.fr> writes:
> For the testsuite, since the patch is not in a particular target, it would 
> be better to have a generic test (in gcc.dg?), but I don't really know how 
> to write a generic one, so would a test in gcc.target/i386 that scans 
> the asm for shuf or perm be ok?

Tree-level vectorisation tests tend to go in gcc.dg/vector, but it's
hard to generalise rtl-level transforms like these.  An x86-only test
sounds good to me FWIW.

> Index: simplify-rtx.c
> ===================================================================
> --- simplify-rtx.c	(revision 187228)
> +++ simplify-rtx.c	(working copy)
> @@ -3268,10 +3268,32 @@ simplify_binary_operation_1 (enum rtx_co
>
>  	  if (GET_MODE (vec) == mode)
>  	    return vec;
>  	}
> 
> +      /* If we build {a,b} then permute it, build the result directly.  */
> +      if (XVECLEN (trueop1, 0) == 2
> +	  && CONST_INT_P (XVECEXP (trueop1, 0, 0))
> +	  && CONST_INT_P (XVECEXP (trueop1, 0, 1))
> +	  && GET_CODE (trueop0) == VEC_CONCAT
> +	  && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop0, 1))
> +	  && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
> +	  && GET_MODE (XEXP (trueop0, 0)) == mode)
> +	{
> +	  int offset0 = INTVAL (XVECEXP (trueop1, 0, 0)) % 2;
> +	  int offset1 = INTVAL (XVECEXP (trueop1, 0, 1)) % 2;
> +	  rtx baseop  = XEXP (trueop0, 0);
> +	  rtx baseop0 = XEXP (baseop , 0);
> +	  rtx baseop1 = XEXP (baseop , 1);
> +	  baseop0 = avoid_constant_pool_reference (baseop0);
> +	  baseop1 = avoid_constant_pool_reference (baseop1);
> +
> +	  return simplify_gen_binary (VEC_CONCAT, mode,
> +	     offset0 ? baseop1 : baseop0,
> +	     offset1 ? baseop1 : baseop0);
> +	}
> +

I know you said that generalising it could be done later,
and that's fine, but it looks in some ways like it would
be easier to go straight for the more general:

	  && GET_CODE (trueop0) == VEC_CONCAT
	  && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
	  && GET_MODE (XEXP (trueop0, 0)) == mode
	  && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT
	  && GET_MODE (XEXP (trueop0, 1)) == mode)
	{
	  unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
	  unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1));
	  rtx op0, op1;

	  gcc_assert (i0 < 4 && i1 < 4);
	  op0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2);
	  op1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2);

	  return simplify_gen_binary (VEC_CONCAT, mode, op0, op1);
	}

(completely untested).  avoid_constant_pool_reference shouldn't be
called here.

Very minor, but this code probably belongs in the else part of the
if (!VECTOR_MODE_P (mode)) block.

Richard

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [rtl, patch] combine concat+shuffle
  2012-05-08  9:40 ` Richard Sandiford
@ 2012-05-08 10:30   ` Marc Glisse
  2012-05-08 11:09     ` Richard Sandiford
  2012-05-08 16:33   ` Marc Glisse
  1 sibling, 1 reply; 12+ messages in thread
From: Marc Glisse @ 2012-05-08 10:30 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches

On Tue, 8 May 2012, Richard Sandiford wrote:

> I know you said that generalising it could be done later,
> and that's fine, but it looks in some ways like it would
> be easier to go straight for the more general:
>
> 	  && GET_CODE (trueop0) == VEC_CONCAT
> 	  && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
> 	  && GET_MODE (XEXP (trueop0, 0)) == mode
> 	  && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT
> 	  && GET_MODE (XEXP (trueop0, 1)) == mode)
> 	{
> 	  unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
> 	  unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1));
> 	  rtx op0, op1;
>
> 	  gcc_assert (i0 < 4 && i1 < 4);
> 	  op0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2);
> 	  op1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2);
>
> 	  return simplify_gen_binary (VEC_CONCAT, mode, op0, op1);
> 	}

Yes, I hesitated.

> (completely untested).  avoid_constant_pool_reference shouldn't be
> called here.

I wasn't quite sure what it was for, but it looked safer to call it for 
nothing than to forget it ;-)

> Very minor, but this code probably belongs in the else part of the
> if (!VECTOR_MODE_P (mode)) block.

Thanks, I'll update the patch with your comments, add a testcase and 
ChangeLog and re-send it here.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [rtl, patch] combine concat+shuffle
  2012-05-08 10:30   ` Marc Glisse
@ 2012-05-08 11:09     ` Richard Sandiford
  2012-05-08 12:09       ` Marc Glisse
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Sandiford @ 2012-05-08 11:09 UTC (permalink / raw)
  To: Marc Glisse; +Cc: gcc-patches

Marc Glisse <marc.glisse@inria.fr> writes:
> On Tue, 8 May 2012, Richard Sandiford wrote:
>> I know you said that generalising it could be done later,
>> and that's fine, but it looks in some ways like it would
>> be easier to go straight for the more general:
>>
>> 	  && GET_CODE (trueop0) == VEC_CONCAT
>> 	  && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
>> 	  && GET_MODE (XEXP (trueop0, 0)) == mode
>> 	  && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT
>> 	  && GET_MODE (XEXP (trueop0, 1)) == mode)
>> 	{
>> 	  unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
>> 	  unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1));
>> 	  rtx op0, op1;
>>
>> 	  gcc_assert (i0 < 4 && i1 < 4);
>> 	  op0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2);
>> 	  op1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2);
>>
>> 	  return simplify_gen_binary (VEC_CONCAT, mode, op0, op1);
>> 	}
>
> Yes, I hesitated.

Realised afterwards that both versions need to check
GET_MODE_NUNITS (mode) == 2, because we're requiring OP0 and OP1
to be scalar.  Sorry for not noticing first time.

Richard

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [rtl, patch] combine concat+shuffle
  2012-05-08 11:09     ` Richard Sandiford
@ 2012-05-08 12:09       ` Marc Glisse
  2012-05-08 12:56         ` Richard Sandiford
  0 siblings, 1 reply; 12+ messages in thread
From: Marc Glisse @ 2012-05-08 12:09 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches

On Tue, 8 May 2012, Richard Sandiford wrote:

> Marc Glisse <marc.glisse@inria.fr> writes:
>> On Tue, 8 May 2012, Richard Sandiford wrote:
>>> I know you said that generalising it could be done later,
>>> and that's fine, but it looks in some ways like it would
>>> be easier to go straight for the more general:
>>>
>>> 	  && GET_CODE (trueop0) == VEC_CONCAT
>>> 	  && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
>>> 	  && GET_MODE (XEXP (trueop0, 0)) == mode
>>> 	  && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT
>>> 	  && GET_MODE (XEXP (trueop0, 1)) == mode)
>>> 	{
>>> 	  unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
>>> 	  unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1));
>>> 	  rtx op0, op1;
>>>
>>> 	  gcc_assert (i0 < 4 && i1 < 4);
>>> 	  op0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2);
>>> 	  op1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2);
>>>
>>> 	  return simplify_gen_binary (VEC_CONCAT, mode, op0, op1);
>>> 	}
>>
>> Yes, I hesitated.
>
> Realised afterwards that both versions need to check
> GET_MODE_NUNITS (mode) == 2, because we're requiring OP0 and OP1
> to be scalar.  Sorry for not noticing first time.

I thought that was a consequence of

XVECLEN (trueop1, 0) == 2

(in the lines before your first &&)
ie the result (which has mode "mode") is a vec_select of 2 objects.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [rtl, patch] combine concat+shuffle
  2012-05-08 12:09       ` Marc Glisse
@ 2012-05-08 12:56         ` Richard Sandiford
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Sandiford @ 2012-05-08 12:56 UTC (permalink / raw)
  To: Marc Glisse; +Cc: gcc-patches

Marc Glisse <marc.glisse@inria.fr> writes:
>> Realised afterwards that both versions need to check
>> GET_MODE_NUNITS (mode) == 2, because we're requiring OP0 and OP1
>> to be scalar.  Sorry for not noticing first time.
>
> I thought that was a consequence of
>
> XVECLEN (trueop1, 0) == 2
>
> (in the lines before your first &&)
> ie the result (which has mode "mode") is a vec_select of 2 objects.

Yeah, you're right of course.

Richard

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [rtl, patch] combine concat+shuffle
  2012-05-08  9:40 ` Richard Sandiford
  2012-05-08 10:30   ` Marc Glisse
@ 2012-05-08 16:33   ` Marc Glisse
  2012-05-08 17:53     ` Richard Sandiford
  1 sibling, 1 reply; 12+ messages in thread
From: Marc Glisse @ 2012-05-08 16:33 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1061 bytes --]

Here is a new version.

gcc/ChangeLog
2012-05-08  Marc Glisse  <marc.glisse@inria.fr>

 	* simplify-rtx.c (simplify_binary_operation_1): Optimize shuffle
 	of concatenations.

gcc/testsuite/ChangeLog
2012-05-08  Marc Glisse  <marc.glisse@inria.fr>

 	* gcc.target/i386/shuf-concat.c: New test.


On Tue, 8 May 2012, Richard Sandiford wrote:

> Very minor, but this code probably belongs in the else part of the
> if (!VECTOR_MODE_P (mode)) block.

I moved it in the block. Note that the piece of code right below, that 
starts with:

       if (XVECLEN (trueop1, 0) == 1
           && CONST_INT_P (XVECEXP (trueop1, 0, 0))
           && GET_CODE (trueop0) == VEC_CONCAT)

could probably move too. I had put mine right below because they do 
similar things. By the way, reusing that piece of code and applying it to 
each of the 2 selected parts of the vector would be one way to generalize 
my patch so it also applies to the vpermilpd case.

Note to self: if you want to grep for "shuf" in the asm, don't put "shuf" 
in the name of the file...

-- 
Marc Glisse

[-- Attachment #2: Type: TEXT/PLAIN, Size: 2769 bytes --]

Index: testsuite/gcc.target/i386/shuf-concat.c
===================================================================
--- testsuite/gcc.target/i386/shuf-concat.c	(revision 0)
+++ testsuite/gcc.target/i386/shuf-concat.c	(revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+typedef double v2df __attribute__ ((__vector_size__ (16)));
+
+v2df f(double d,double e){
+  v2df x={-d,d};
+  v2df y={-e,e};
+  return __builtin_ia32_shufpd(x,y,1);
+}
+
+/* { dg-final { scan-assembler-not "shufpd" } } */
+/* { dg-final { scan-assembler-times "unpck" 1 } } */

Property changes on: testsuite/gcc.target/i386/shuf-concat.c
___________________________________________________________________
Added: svn:eol-style
   + native
Added: svn:keywords
   + Author Date Id Revision URL

Index: simplify-rtx.c
===================================================================
--- simplify-rtx.c	(revision 187276)
+++ simplify-rtx.c	(working copy)
@@ -1,10 +1,10 @@
 /* RTL simplification functions for GNU compiler.
    Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
    1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
-   2011  Free Software Foundation, Inc.
+   2011, 2012  Free Software Foundation, Inc.
 
 This file is part of GCC.
 
 GCC is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License as published by the Free
 Software Foundation; either version 3, or (at your option) any later
@@ -3239,12 +3239,33 @@ simplify_binary_operation_1 (enum rtx_co
 		  RTVEC_ELT (v, i) = CONST_VECTOR_ELT (trueop0,
 						       INTVAL (x));
 		}
 
 	      return gen_rtx_CONST_VECTOR (mode, v);
 	    }
+
+	  /* If we build {a,b} then permute it, build the result directly.  */
+	  if (XVECLEN (trueop1, 0) == 2
+	      && CONST_INT_P (XVECEXP (trueop1, 0, 0))
+	      && CONST_INT_P (XVECEXP (trueop1, 0, 1))
+	      && GET_CODE (trueop0) == VEC_CONCAT
+	      && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
+	      && GET_MODE (XEXP (trueop0, 0)) == mode
+	      && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT
+	      && GET_MODE (XEXP (trueop0, 1)) == mode)
+	    {
+	      unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
+	      unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1));
+	      rtx subop0, subop1;
+
+	      gcc_assert (i0 < 4 && i1 < 4);
+	      subop0 = XEXP (XEXP (trueop0, i0 / 2), i0 % 2);
+	      subop1 = XEXP (XEXP (trueop0, i1 / 2), i1 % 2);
+
+	      return simplify_gen_binary (VEC_CONCAT, mode, subop0, subop1);
+	    }
 	}
 
       if (XVECLEN (trueop1, 0) == 1
 	  && CONST_INT_P (XVECEXP (trueop1, 0, 0))
 	  && GET_CODE (trueop0) == VEC_CONCAT)
 	{

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [rtl, patch] combine concat+shuffle
  2012-05-08 16:33   ` Marc Glisse
@ 2012-05-08 17:53     ` Richard Sandiford
  2012-05-08 18:40       ` Marc Glisse
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Sandiford @ 2012-05-08 17:53 UTC (permalink / raw)
  To: Marc Glisse; +Cc: gcc-patches

Marc Glisse <marc.glisse@inria.fr> writes:
> Here is a new version.
>
> gcc/ChangeLog
> 2012-05-08  Marc Glisse  <marc.glisse@inria.fr>
>
>  	* simplify-rtx.c (simplify_binary_operation_1): Optimize shuffle
>  	of concatenations.

OK, thanks.  I'll leave an x86 maintainer to review the testcase,
but it looks like it'll need some markup to ensure an SSE target.

> Note to self: if you want to grep for "shuf" in the asm, don't put "shuf" 
> in the name of the file...

Yeah :-)  For MIPS tests I tend to add "\t" to the beginning of the regexp.
(And to the end if possible.)

Richard

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [rtl, patch] combine concat+shuffle
  2012-05-08 17:53     ` Richard Sandiford
@ 2012-05-08 18:40       ` Marc Glisse
  2012-05-10 20:16         ` [i386] New testcase (was: [rtl, patch] combine concat+shuffle) Marc Glisse
  0 siblings, 1 reply; 12+ messages in thread
From: Marc Glisse @ 2012-05-08 18:40 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1191 bytes --]

On Tue, 8 May 2012, Richard Sandiford wrote:

> Marc Glisse <marc.glisse@inria.fr> writes:
>> Here is a new version.
>>
>> gcc/ChangeLog
>> 2012-05-08  Marc Glisse  <marc.glisse@inria.fr>
>>
>>  	* simplify-rtx.c (simplify_binary_operation_1): Optimize shuffle
>>  	of concatenations.
>
> OK, thanks.  I'll leave an x86 maintainer to review the testcase,
> but it looks like it'll need some markup to ensure an SSE target.

Oups, I'd thought about that, then completely forgot. For 64 bits, it 
always works. For 32 bits, it requires -msse2 -mfpmath=sse (without 
-mfpmath=sse we can still test for shufpd, but apparently not unpcklpd, I 
could remove that second test if people prefer, as it isn't important). 
Since this is a compile-only test, I think this would be enough:

/* { dg-options "-O -msse2 -mfpmath=sse" } */

>> Note to self: if you want to grep for "shuf" in the asm, don't put "shuf"
>> in the name of the file...
>
> Yeah :-)  For MIPS tests I tend to add "\t" to the beginning of the regexp.
> (And to the end if possible.)

Good idea. I was trying to make the check as wide as possible, but that's 
not so useful. Attached a new version of the testcase.

-- 
Marc Glisse

[-- Attachment #2: Type: TEXT/PLAIN, Size: 817 bytes --]

Index: gcc.target/i386/shuf-concat.c
===================================================================
--- gcc.target/i386/shuf-concat.c	(revision 0)
+++ gcc.target/i386/shuf-concat.c	(revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O -msse2 -mfpmath=sse" } */
+
+typedef double v2df __attribute__ ((__vector_size__ (16)));
+
+v2df f(double d,double e){
+  v2df x={-d,d};
+  v2df y={-e,e};
+  return __builtin_ia32_shufpd(x,y,1);
+}
+
+/* { dg-final { scan-assembler-not "\tv?shufpd\t" } } */
+/* { dg-final { scan-assembler-times "\tv?unpcklpd\t" 1 } } */

Property changes on: gcc.target/i386/shuf-concat.c
___________________________________________________________________
Added: svn:eol-style
   + native
Added: svn:keywords
   + Author Date Id Revision URL


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [i386] New testcase (was: [rtl, patch] combine concat+shuffle)
  2012-05-08 18:40       ` Marc Glisse
@ 2012-05-10 20:16         ` Marc Glisse
  2012-05-28 13:38           ` Marc Glisse
  0 siblings, 1 reply; 12+ messages in thread
From: Marc Glisse @ 2012-05-10 20:16 UTC (permalink / raw)
  To: gcc-patches

Hello,

could an i386 maintainer take a look at the following testcase?

gcc/testsuite/ChangeLog
2012-05-08  Marc Glisse  <marc.glisse@inria.fr>

 	* gcc.target/i386/shuf-concat.c: New test.


--- gcc.target/i386/shuf-concat.c	(revision 0)
+++ gcc.target/i386/shuf-concat.c	(revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O -msse2 -mfpmath=sse" } */
+
+typedef double v2df __attribute__ ((__vector_size__ (16)));
+
+v2df f(double d,double e){
+  v2df x={-d,d};
+  v2df y={-e,e};
+  return __builtin_ia32_shufpd(x,y,1);
+}
+
+/* { dg-final { scan-assembler-not "\tv?shufpd\t" } } */
+/* { dg-final { scan-assembler-times "\tv?unpcklpd\t" 1 } } */


The conversation on this patch started at 
http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00504.html


On Tue, 8 May 2012, Marc Glisse wrote:

> On Tue, 8 May 2012, Richard Sandiford wrote:
>
>> Marc Glisse <marc.glisse@inria.fr> writes:
>>> Here is a new version.
>>> 
>>> gcc/ChangeLog
>>> 2012-05-08  Marc Glisse  <marc.glisse@inria.fr>
>>>
>>>  	* simplify-rtx.c (simplify_binary_operation_1): Optimize shuffle
>>>  	of concatenations.
>> 
>> OK, thanks.  I'll leave an x86 maintainer to review the testcase,
>> but it looks like it'll need some markup to ensure an SSE target.
>
> Oups, I'd thought about that, then completely forgot. For 64 bits, it always 
> works. For 32 bits, it requires -msse2 -mfpmath=sse (without -mfpmath=sse we 
> can still test for shufpd, but apparently not unpcklpd, I could remove that 
> second test if people prefer, as it isn't important). Since this is a 
> compile-only test, I think this would be enough:
>
> /* { dg-options "-O -msse2 -mfpmath=sse" } */
>
>>> Note to self: if you want to grep for "shuf" in the asm, don't put "shuf"
>>> in the name of the file...
>> 
>> Yeah :-)  For MIPS tests I tend to add "\t" to the beginning of the regexp.
>> (And to the end if possible.)
>
> Good idea. I was trying to make the check as wide as possible, but that's not 
> so useful. Attached a new version of the testcase.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [i386] New testcase (was: [rtl, patch] combine concat+shuffle)
  2012-05-10 20:16         ` [i386] New testcase (was: [rtl, patch] combine concat+shuffle) Marc Glisse
@ 2012-05-28 13:38           ` Marc Glisse
  2012-05-28 15:48             ` Uros Bizjak
  0 siblings, 1 reply; 12+ messages in thread
From: Marc Glisse @ 2012-05-28 13:38 UTC (permalink / raw)
  To: gcc-patches; +Cc: ubizjak


Ping? The rest of the patch has been approved already.


On Thu, 10 May 2012, Marc Glisse wrote:

> Hello,
>
> could an i386 maintainer take a look at the following testcase?
>
> gcc/testsuite/ChangeLog
> 2012-05-08  Marc Glisse  <marc.glisse@inria.fr>
>
> 	* gcc.target/i386/shuf-concat.c: New test.
>
>
> --- gcc.target/i386/shuf-concat.c	(revision 0)
> +++ gcc.target/i386/shuf-concat.c	(revision 0)
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -msse2 -mfpmath=sse" } */
> +
> +typedef double v2df __attribute__ ((__vector_size__ (16)));
> +
> +v2df f(double d,double e){
> +  v2df x={-d,d};
> +  v2df y={-e,e};
> +  return __builtin_ia32_shufpd(x,y,1);
> +}
> +
> +/* { dg-final { scan-assembler-not "\tv?shufpd\t" } } */
> +/* { dg-final { scan-assembler-times "\tv?unpcklpd\t" 1 } } */
>
>
> The conversation on this patch started at 
> http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00504.html
>
>
> On Tue, 8 May 2012, Marc Glisse wrote:
>
>> On Tue, 8 May 2012, Richard Sandiford wrote:
>> 
>>> Marc Glisse <marc.glisse@inria.fr> writes:
>>>> Here is a new version.
>>>> 
>>>> gcc/ChangeLog
>>>> 2012-05-08  Marc Glisse  <marc.glisse@inria.fr>
>>>>
>>>>  	* simplify-rtx.c (simplify_binary_operation_1): Optimize shuffle
>>>>  	of concatenations.
>>> 
>>> OK, thanks.  I'll leave an x86 maintainer to review the testcase,
>>> but it looks like it'll need some markup to ensure an SSE target.
>> 
>> Oups, I'd thought about that, then completely forgot. For 64 bits, it 
>> always works. For 32 bits, it requires -msse2 -mfpmath=sse (without 
>> -mfpmath=sse we can still test for shufpd, but apparently not unpcklpd, I 
>> could remove that second test if people prefer, as it isn't important). 
>> Since this is a compile-only test, I think this would be enough:
>> 
>> /* { dg-options "-O -msse2 -mfpmath=sse" } */
>> 
>>>> Note to self: if you want to grep for "shuf" in the asm, don't put "shuf"
>>>> in the name of the file...
>>> 
>>> Yeah :-)  For MIPS tests I tend to add "\t" to the beginning of the 
>>> regexp.
>>> (And to the end if possible.)
>> 
>> Good idea. I was trying to make the check as wide as possible, but that's 
>> not so useful. Attached a new version of the testcase.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [i386] New testcase (was: [rtl, patch] combine concat+shuffle)
  2012-05-28 13:38           ` Marc Glisse
@ 2012-05-28 15:48             ` Uros Bizjak
  0 siblings, 0 replies; 12+ messages in thread
From: Uros Bizjak @ 2012-05-28 15:48 UTC (permalink / raw)
  To: Marc Glisse; +Cc: gcc-patches

On Mon, May 28, 2012 at 3:37 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>
> Ping? The rest of the patch has been approved already.
>
>
> On Thu, 10 May 2012, Marc Glisse wrote:
>
>> Hello,
>>
>> could an i386 maintainer take a look at the following testcase?
>>
>> gcc/testsuite/ChangeLog
>> 2012-05-08  Marc Glisse  <marc.glisse@inria.fr>
>>
>>        * gcc.target/i386/shuf-concat.c: New test.
>>
>>
>> --- gcc.target/i386/shuf-concat.c       (revision 0)
>> +++ gcc.target/i386/shuf-concat.c       (revision 0)
>> @@ -0,0 +1,13 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O -msse2 -mfpmath=sse" } */
>> +
>> +typedef double v2df __attribute__ ((__vector_size__ (16)));
>> +
>> +v2df f(double d,double e){
>> +  v2df x={-d,d};
>> +  v2df y={-e,e};
>> +  return __builtin_ia32_shufpd(x,y,1);
>> +}
>> +
>> +/* { dg-final { scan-assembler-not "\tv?shufpd\t" } } */
>> +/* { dg-final { scan-assembler-times "\tv?unpcklpd\t" 1 } } */
>>
>>
>> The conversation on this patch started at
>> http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00504.html
>>
>>
>> On Tue, 8 May 2012, Marc Glisse wrote:
>>
>>> On Tue, 8 May 2012, Richard Sandiford wrote:
>>>
>>>> Marc Glisse <marc.glisse@inria.fr> writes:
>>>>>
>>>>> Here is a new version.
>>>>>
>>>>> gcc/ChangeLog
>>>>> 2012-05-08  Marc Glisse  <marc.glisse@inria.fr>
>>>>>
>>>>>        * simplify-rtx.c (simplify_binary_operation_1): Optimize shuffle
>>>>>        of concatenations.
>>>>
>>>>
>>>> OK, thanks.  I'll leave an x86 maintainer to review the testcase,
>>>> but it looks like it'll need some markup to ensure an SSE target.
>>>
>>>
>>> Oups, I'd thought about that, then completely forgot. For 64 bits, it
>>> always works. For 32 bits, it requires -msse2 -mfpmath=sse (without
>>> -mfpmath=sse we can still test for shufpd, but apparently not unpcklpd, I
>>> could remove that second test if people prefer, as it isn't important).
>>> Since this is a compile-only test, I think this would be enough:
>>>
>>> /* { dg-options "-O -msse2 -mfpmath=sse" } */
>>>
>>>>> Note to self: if you want to grep for "shuf" in the asm, don't put
>>>>> "shuf"
>>>>> in the name of the file...
>>>>
>>>>
>>>> Yeah :-)  For MIPS tests I tend to add "\t" to the beginning of the
>>>> regexp.
>>>> (And to the end if possible.)
>>>
>>>
>>> Good idea. I was trying to make the check as wide as possible, but that's
>>> not so useful. Attached a new version of the testcase.

Please add "\[ \t\]" at the end of add string instead of only "\t".

OK with that change.

Thanks,
Uros.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-05-28 15:48 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-07 20:17 [rtl, patch] combine concat+shuffle Marc Glisse
2012-05-08  9:40 ` Richard Sandiford
2012-05-08 10:30   ` Marc Glisse
2012-05-08 11:09     ` Richard Sandiford
2012-05-08 12:09       ` Marc Glisse
2012-05-08 12:56         ` Richard Sandiford
2012-05-08 16:33   ` Marc Glisse
2012-05-08 17:53     ` Richard Sandiford
2012-05-08 18:40       ` Marc Glisse
2012-05-10 20:16         ` [i386] New testcase (was: [rtl, patch] combine concat+shuffle) Marc Glisse
2012-05-28 13:38           ` Marc Glisse
2012-05-28 15:48             ` Uros Bizjak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).