public inbox for gcc-patches@gcc.gnu.org
* [Patch-2v3, rs6000] Eliminate unnecessary byte swaps for duplicated constant vector store [PR113325]
From: HAO CHEN GUI @ 2024-06-17  8:59 UTC
  To: gcc-patches; +Cc: Segher Boessenkool, David, Kewen.Lin, Peter Bergner

Hi,
  This patch adds an insn_and_split pattern that lets the fwprop pass
replace the source pseudo of a store insn with the duplicated constant
vector. The store can then be implemented by a single stxvd2x, which
eliminates the unnecessary byte swap insn on P8 LE. The test case shows
the optimization.
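
  As a rough illustration of why the swap is redundant for a duplicated
constant, here is a small standalone C sketch. It is not part of the patch
and is not GCC code: the helper names swap_doublewords and swap_is_noop are
hypothetical, and a four-word array only models a VSX_W vector. The selector
[2 3 0 1] is the one used by the vec_select in the pattern; permuting a
vector whose elements are all equal gives back the same vector.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not GCC code: models a vec_select with selector
   [2 3 0 1] on a four-word vector, i.e. a doubleword swap.  */
static void
swap_doublewords (const uint32_t in[4], uint32_t out[4])
{
  static const int sel[4] = { 2, 3, 0, 1 };
  for (int i = 0; i < 4; i++)
    out[i] = in[sel[i]];
}

/* Return 1 if the doubleword swap leaves the vector unchanged.  */
static int
swap_is_noop (const uint32_t v[4])
{
  uint32_t swapped[4];
  swap_doublewords (v, swapped);
  return memcmp (v, swapped, sizeof swapped) == 0;
}
```

  For a splat such as { 5, 5, 5, 5 } the helper reports that the swap
changes nothing, while for { 1, 2, 3, 4 } it does not, which is why the
pattern requires const_vec_duplicate_p (operands[1]).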

  The patch depends on the first generic patch, which uses insn cost in fwprop.

  Compared to the previous version, the main change is to move
"can_create_pseudo_p ()" into the insn condition.

  Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
regressions. Is it OK for trunk?

Thanks
Gui Haochen


ChangeLog
rs6000: Eliminate unnecessary byte swaps for duplicated constant vector store

gcc/
	PR target/113325
	* config/rs6000/vsx.md (vsx_stxvd2x4_le_const_<mode>): New.

gcc/testsuite/
	PR target/113325
	* gcc.target/powerpc/pr113325.c: New.


patch.diff
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index f135fa079bd..d350c92141c 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -3368,6 +3368,31 @@ (define_insn "*vsx_stxvd2x4_le_<mode>"
   "stxvd2x %x1,%y0"
   [(set_attr "type" "vecstore")])

+(define_insn_and_split "vsx_stxvd2x4_le_const_<mode>"
+  [(set (match_operand:VSX_W 0 "memory_operand" "=Z")
+	(match_operand:VSX_W 1 "immediate_operand" "W"))]
+  "!BYTES_BIG_ENDIAN
+   && VECTOR_MEM_VSX_P (<MODE>mode)
+   && !TARGET_P9_VECTOR
+   && const_vec_duplicate_p (operands[1])
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 2)
+	(match_dup 1))
+   (set (match_dup 0)
+	(vec_select:VSX_W
+	  (match_dup 2)
+	  (parallel [(const_int 2) (const_int 3)
+		     (const_int 0) (const_int 1)])))]
+{
+  /* Here all the constants must be loaded without memory.  */
+  gcc_assert (easy_altivec_constant (operands[1], <MODE>mode));
+  operands[2] = gen_reg_rtx (<MODE>mode);
+}
+  [(set_attr "type" "vecstore")
+   (set_attr "length" "8")])
+
 (define_insn "*vsx_stxvd2x8_le_V8HI"
   [(set (match_operand:V8HI 0 "memory_operand" "=Z")
         (vec_select:V8HI
diff --git a/gcc/testsuite/gcc.target/powerpc/pr113325.c b/gcc/testsuite/gcc.target/powerpc/pr113325.c
new file mode 100644
index 00000000000..3ca1fcbc9ba
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr113325.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -mvsx" } */
+/* { dg-require-effective-target powerpc_vsx } */
+/* { dg-final { scan-assembler-not {\mxxpermdi\M} } } */
+
+void* foo (void* s1)
+{
+  return __builtin_memset (s1, 0, 32);
+}


* Re: [Patch-2v3, rs6000] Eliminate unnecessary byte swaps for duplicated constant vector store [PR113325]
From: Kewen.Lin @ 2024-06-18  2:55 UTC
  To: HAO CHEN GUI; +Cc: Segher Boessenkool, David, Peter Bergner, gcc-patches

Hi Haochen,

on 2024/6/17 16:59, HAO CHEN GUI wrote:
> Hi,
>   This patch adds an insn_and_split pattern that lets the fwprop pass
> replace the source pseudo of a store insn with the duplicated constant
> vector. The store can then be implemented by a single stxvd2x, which
> eliminates the unnecessary byte swap insn on P8 LE. The test case shows
> the optimization.
> 
>   The patch depends on the first generic patch, which uses insn cost in fwprop.
> 
>   Compared to the previous version, the main change is to move
> "can_create_pseudo_p ()" into the insn condition.
> 
>   Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
> regressions. Is it OK for trunk?

OK, thanks!

BR,
Kewen


