[PATCH] [rtl-optimization] Simplify vector shift/rotate with const_vec_duplicate to vector shift/rotate with const

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH] [rtl-optimization] Simplify vector shift/rotate with const_vec_duplicate to vector shift/rotate with const_int element.
@ 2021-08-06  7:04 liuhongt
  2021-08-06 10:55 ` Richard Sandiford
  0 siblings, 1 reply; 3+ messages in thread
From: liuhongt @ 2021-08-06  7:04 UTC (permalink / raw)
  To: gcc-patches; +Cc: crazylht, richard.sandiford, segher

Hi:
  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
  Ok for trunk?

gcc/ChangeLog:

	PR rtl-optimization/101796
	* simplify-rtx.c
	(simplify_context::simplify_binary_operation_1): Simplify
	vector shift/rotate with const_vec_duplicate to vector
	shift/rotate with const_int element.

gcc/testsuite/ChangeLog:

	PR rtl-optimization/101796
	* gcc.target/i386/pr101796.c: New test.
---
 gcc/simplify-rtx.c                       | 15 ++++++
 gcc/testsuite/gcc.target/i386/pr101796.c | 65 ++++++++++++++++++++++++
 2 files changed, 80 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101796.c

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index a719f57870f..75f3e455562 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -3970,6 +3970,21 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
 	    return simplify_gen_binary (code, mode, op0,
 					gen_int_shift_amount (mode, val));
 	}
+
+      /* Optimize vector shift/rotate with const_vec_duplicate
+	 to vector shift/rotate with const_int element.
+      /* TODO: vec_duplicate with variable can also be simplified,
+	 but GCC only require operand 2 of shift/rotate to be a scalar type
+	 which can have different modes in different backends, it makes
+	 simplication difficult to decide which mode should be choosed
+	 for shift/rotate count.  */
+      if ((code == ASHIFTRT || code == LSHIFTRT
+	   || code == ASHIFT || code == ROTATERT
+	   || code == ROTATE)
+	  && const_vec_duplicate_p (op1))
+	return simplify_gen_binary (code, mode, op0,
+				    unwrap_const_vec_duplicate (op1));
+
       break;
 
     case ASHIFT:
diff --git a/gcc/testsuite/gcc.target/i386/pr101796.c b/gcc/testsuite/gcc.target/i386/pr101796.c
new file mode 100644
index 00000000000..c22d6267fe5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101796.c
@@ -0,0 +1,65 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2 " } */
+/* { dg-final { scan-assembler-not "vpbroadcast" } }  */
+/* { dg-final { scan-assembler-not "vpsrlv\[dwq\]" } }  */
+/* { dg-final { scan-assembler-not "vpsllv\[dwq\]" } }  */
+/* { dg-final { scan-assembler-not "vpsrav\[dwq\]" } }  */
+/* { dg-final { scan-assembler-times "vpsrl\[dwq\]" 3 } }  */
+/* { dg-final { scan-assembler-times "vpsll\[dwq\]" 3 } }  */
+/* { dg-final { scan-assembler-times "vpsra\[dwq\]" 3 } }  */
+
+#include <immintrin.h>
+
+__m512i
+foo (__m512i a)
+{
+  return _mm512_srlv_epi16 (a, _mm512_set1_epi16 (3));
+}
+
+__m512i
+foo1 (__m512i a)
+{
+  return _mm512_srlv_epi32 (a, _mm512_set1_epi32 (3));
+}
+
+__m512i
+foo2 (__m512i a, long long b)
+{
+  return _mm512_srlv_epi64 (a, _mm512_set1_epi64 (3));
+}
+
+__m512i
+foo3 (__m512i a)
+{
+  return _mm512_srav_epi16 (a, _mm512_set1_epi16 (3));
+}
+
+__m512i
+foo4 (__m512i a)
+{
+  return _mm512_srav_epi32 (a, _mm512_set1_epi32 (3));
+}
+
+__m512i
+foo5 (__m512i a, long long b)
+{
+  return _mm512_srav_epi64 (a, _mm512_set1_epi64 (3));
+}
+
+__m512i
+foo6 (__m512i a)
+{
+  return _mm512_sllv_epi16 (a, _mm512_set1_epi16 (3));
+}
+
+__m512i
+foo7 (__m512i a)
+{
+  return _mm512_sllv_epi32 (a, _mm512_set1_epi32 (3));
+}
+
+__m512i
+foo8 (__m512i a, long long b)
+{
+  return _mm512_sllv_epi64 (a, _mm512_set1_epi64 (3));
+}
-- 
2.27.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] [rtl-optimization] Simplify vector shift/rotate with const_vec_duplicate to vector shift/rotate with const_int element.
  2021-08-06  7:04 [PATCH] [rtl-optimization] Simplify vector shift/rotate with const_vec_duplicate to vector shift/rotate with const_int element liuhongt
@ 2021-08-06 10:55 ` Richard Sandiford
  2021-08-06 14:08   ` Segher Boessenkool
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Sandiford @ 2021-08-06 10:55 UTC (permalink / raw)
  To: liuhongt via Gcc-patches; +Cc: liuhongt, segher

liuhongt via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> Hi:
>   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
>   Ok for trunk?

I think if anything the canonicalisation should be the other way:
if the shift amount is an in-range constant, we know that it fits
within a vector element, and so the vector form should be preferred.

IMO it's a wart that we support (say):

  (ashift:V4SI (reg:V4SI R) (const_int 1))

even though:

  (plus:V4SI (reg:V4SI R) (const_int 1))
  (minus:V4SI (const_int 1) (reg:V4SI R))

are not well-formed (at least AFAIK).

Thanks,
Richard

>
> gcc/ChangeLog:
>
> 	PR rtl-optimization/101796
> 	* simplify-rtx.c
> 	(simplify_context::simplify_binary_operation_1): Simplify
> 	vector shift/rotate with const_vec_duplicate to vector
> 	shift/rotate with const_int element.
>
> gcc/testsuite/ChangeLog:
>
> 	PR rtl-optimization/101796
> 	* gcc.target/i386/pr101796.c: New test.
> ---
>  gcc/simplify-rtx.c                       | 15 ++++++
>  gcc/testsuite/gcc.target/i386/pr101796.c | 65 ++++++++++++++++++++++++
>  2 files changed, 80 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101796.c
>
> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> index a719f57870f..75f3e455562 100644
> --- a/gcc/simplify-rtx.c
> +++ b/gcc/simplify-rtx.c
> @@ -3970,6 +3970,21 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
>  	    return simplify_gen_binary (code, mode, op0,
>  					gen_int_shift_amount (mode, val));
>  	}
> +
> +      /* Optimize vector shift/rotate with const_vec_duplicate
> +	 to vector shift/rotate with const_int element.
> +      /* TODO: vec_duplicate with variable can also be simplified,
> +	 but GCC only require operand 2 of shift/rotate to be a scalar type
> +	 which can have different modes in different backends, it makes
> +	 simplication difficult to decide which mode should be choosed
> +	 for shift/rotate count.  */
> +      if ((code == ASHIFTRT || code == LSHIFTRT
> +	   || code == ASHIFT || code == ROTATERT
> +	   || code == ROTATE)
> +	  && const_vec_duplicate_p (op1))
> +	return simplify_gen_binary (code, mode, op0,
> +				    unwrap_const_vec_duplicate (op1));
> +
>        break;
>  
>      case ASHIFT:
> diff --git a/gcc/testsuite/gcc.target/i386/pr101796.c b/gcc/testsuite/gcc.target/i386/pr101796.c
> new file mode 100644
> index 00000000000..c22d6267fe5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr101796.c
> @@ -0,0 +1,65 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512bw -O2 " } */
> +/* { dg-final { scan-assembler-not "vpbroadcast" } }  */
> +/* { dg-final { scan-assembler-not "vpsrlv\[dwq\]" } }  */
> +/* { dg-final { scan-assembler-not "vpsllv\[dwq\]" } }  */
> +/* { dg-final { scan-assembler-not "vpsrav\[dwq\]" } }  */
> +/* { dg-final { scan-assembler-times "vpsrl\[dwq\]" 3 } }  */
> +/* { dg-final { scan-assembler-times "vpsll\[dwq\]" 3 } }  */
> +/* { dg-final { scan-assembler-times "vpsra\[dwq\]" 3 } }  */
> +
> +#include <immintrin.h>
> +
> +__m512i
> +foo (__m512i a)
> +{
> +  return _mm512_srlv_epi16 (a, _mm512_set1_epi16 (3));
> +}
> +
> +__m512i
> +foo1 (__m512i a)
> +{
> +  return _mm512_srlv_epi32 (a, _mm512_set1_epi32 (3));
> +}
> +
> +__m512i
> +foo2 (__m512i a, long long b)
> +{
> +  return _mm512_srlv_epi64 (a, _mm512_set1_epi64 (3));
> +}
> +
> +__m512i
> +foo3 (__m512i a)
> +{
> +  return _mm512_srav_epi16 (a, _mm512_set1_epi16 (3));
> +}
> +
> +__m512i
> +foo4 (__m512i a)
> +{
> +  return _mm512_srav_epi32 (a, _mm512_set1_epi32 (3));
> +}
> +
> +__m512i
> +foo5 (__m512i a, long long b)
> +{
> +  return _mm512_srav_epi64 (a, _mm512_set1_epi64 (3));
> +}
> +
> +__m512i
> +foo6 (__m512i a)
> +{
> +  return _mm512_sllv_epi16 (a, _mm512_set1_epi16 (3));
> +}
> +
> +__m512i
> +foo7 (__m512i a)
> +{
> +  return _mm512_sllv_epi32 (a, _mm512_set1_epi32 (3));
> +}
> +
> +__m512i
> +foo8 (__m512i a, long long b)
> +{
> +  return _mm512_sllv_epi64 (a, _mm512_set1_epi64 (3));
> +}

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] [rtl-optimization] Simplify vector shift/rotate with const_vec_duplicate to vector shift/rotate with const_int element.
  2021-08-06 10:55 ` Richard Sandiford
@ 2021-08-06 14:08   ` Segher Boessenkool
  0 siblings, 0 replies; 3+ messages in thread
From: Segher Boessenkool @ 2021-08-06 14:08 UTC (permalink / raw)
  To: liuhongt via Gcc-patches, liuhongt, richard.sandiford

On Fri, Aug 06, 2021 at 11:55:52AM +0100, Richard Sandiford wrote:
> liuhongt via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > Hi:
> >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
> >   Ok for trunk?
> 
> I think if anything the canonicalisation should be the other way:
> if the shift amount is an in-range constant, we know that it fits
> within a vector element, and so the vector form should be preferred.

Yeah.  And the canonicalisation needs to be documented *first*, i.e. we
have to agree on it first, *before* patches doing this to simplify-rtx
are acceptable.  We don't do design-by-fait-accompli.

Any canonicalisation also has to fit in well with other canonicalisations,
or we will be better off not having canonical forms.

If it turns out there is no good canonical form, we will simply have to
handle both forms (or more than two perhaps).  This isn't the end of the
world, we have to do that already.  If we can simplify things with a
canonical form, that is great; if that causes too much extra work
instead, it is not so great.  These things have to be thought about.

Segher

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-08-06 14:09 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-06  7:04 [PATCH] [rtl-optimization] Simplify vector shift/rotate with const_vec_duplicate to vector shift/rotate with const_int element liuhongt
2021-08-06 10:55 ` Richard Sandiford
2021-08-06 14:08   ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).