public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v
@ 2023-04-05  8:38 Prathamesh Kulkarni
  2023-04-11  8:46 ` Richard Biener
  0 siblings, 1 reply; 11+ messages in thread
From: Prathamesh Kulkarni @ 2023-04-05  8:38 UTC (permalink / raw)
  To: gcc Patches, Richard Sandiford

[-- Attachment #1: Type: text/plain, Size: 602 bytes --]

Hi,
For the following test:

svint32_t f(svint32_t v)
{
  return svrev_s32 (svrev_s32 (v));
}

We generate 2 rev instructions instead of nop:
f:
        rev     z0.s, z0.s
        rev     z0.s, z0.s
        ret

The attached patch tries to fix that by trying to recognize the following
pattern in match.pd:
v1 = VEC_PERM_EXPR (v0, v0, mask)
v2 = VEC_PERM_EXPR (v1, v1, mask)
-->
v2 = v0
if mask is { nelts - 1, nelts - 2, nelts - 3, ... }

Code-gen with patch:
f:
        ret

Bootstrap+test passes on aarch64-linux-gnu, and SVE bootstrap in progress.
Does it look OK for stage-1 ?

Thanks,
Prathamesh

[-- Attachment #2: gnu-830-1.txt --]
[-- Type: text/plain, Size: 1609 bytes --]

gcc/ChangeLog:
	* match.pd: New pattern to simplify two successive VEC_PERM_EXPRs with single
	operand and same mask, where mask chooses elements in reverse order.

gcc/testesuite/ChangeLog:
	* gcc.target/aarch64/sve/acle/general/rev-1.c: New test.

diff --git a/gcc/match.pd b/gcc/match.pd
index b8d3538b809..19dfc8f3722 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -8456,3 +8456,27 @@ and,
       }
       (if (full_perm_p)
 	(vec_perm (op@3 @0 @1) @3 @2))))))
+
+/* Transform:
+   v1 = VEC_PERM_EXPR (v0, v0, mask)
+   v2 = VEC_PERM_EXPR (v1, v1, mask)
+   -->
+   v2 = v0
+   if mask is {nelts - 1, nelts - 2, ...}  */
+
+(simplify
+ (vec_perm (vec_perm@2 @0 @0 VECTOR_CST@1) @2 @1)
+  (with
+   {
+    vec_perm_builder builder;
+    bool rev_p = false;
+    if (tree_to_vec_perm_builder (&builder, @1))
+      {
+	poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (type);
+	vec_perm_indices sel (builder, 1, nelts);
+	if (sel.series_p (0, 1, nelts - 1, -1))
+	  rev_p = true;
+      }
+   }
+   (if (rev_p)
+    @0)))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
new file mode 100644
index 00000000000..e57ee67d716
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+
+#include <arm_sve.h>
+
+svint32_t f(svint32_t v)
+{
+  return svrev_s32 (svrev_s32 (v));
+}
+
+/* { dg-final { scan-tree-dump "return v_1\\(D\\)" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized" } } */

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v
  2023-04-05  8:38 [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v Prathamesh Kulkarni
@ 2023-04-11  8:46 ` Richard Biener
  2023-04-11 14:06   ` Prathamesh Kulkarni
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Biener @ 2023-04-11  8:46 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: gcc Patches, Richard Sandiford

On Wed, Apr 5, 2023 at 10:39 AM Prathamesh Kulkarni via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
> For the following test:
>
> svint32_t f(svint32_t v)
> {
>   return svrev_s32 (svrev_s32 (v));
> }
>
> We generate 2 rev instructions instead of nop:
> f:
>         rev     z0.s, z0.s
>         rev     z0.s, z0.s
>         ret
>
> The attached patch tries to fix that by trying to recognize the following
> pattern in match.pd:
> v1 = VEC_PERM_EXPR (v0, v0, mask)
> v2 = VEC_PERM_EXPR (v1, v1, mask)
> -->
> v2 = v0
> if mask is { nelts - 1, nelts - 2, nelts - 3, ... }
>
> Code-gen with patch:
> f:
>         ret
>
> Bootstrap+test passes on aarch64-linux-gnu, and SVE bootstrap in progress.
> Does it look OK for stage-1 ?

I didn't look at the patch but tree-ssa-forwprop.cc:simplify_permutation should
handle two consecutive permutes with the is_combined_permutation_identity
which might need tweaking for VLA vectors

Richard.

>
> Thanks,
> Prathamesh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v
  2023-04-11  8:46 ` Richard Biener
@ 2023-04-11 14:06   ` Prathamesh Kulkarni
  2023-04-19  9:21     ` Prathamesh Kulkarni
  0 siblings, 1 reply; 11+ messages in thread
From: Prathamesh Kulkarni @ 2023-04-11 14:06 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc Patches, Richard Sandiford

[-- Attachment #1: Type: text/plain, Size: 1426 bytes --]

On Tue, 11 Apr 2023 at 14:17, Richard Biener <richard.guenther@gmail.com> wrote:
>
> On Wed, Apr 5, 2023 at 10:39 AM Prathamesh Kulkarni via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Hi,
> > For the following test:
> >
> > svint32_t f(svint32_t v)
> > {
> >   return svrev_s32 (svrev_s32 (v));
> > }
> >
> > We generate 2 rev instructions instead of nop:
> > f:
> >         rev     z0.s, z0.s
> >         rev     z0.s, z0.s
> >         ret
> >
> > The attached patch tries to fix that by trying to recognize the following
> > pattern in match.pd:
> > v1 = VEC_PERM_EXPR (v0, v0, mask)
> > v2 = VEC_PERM_EXPR (v1, v1, mask)
> > -->
> > v2 = v0
> > if mask is { nelts - 1, nelts - 2, nelts - 3, ... }
> >
> > Code-gen with patch:
> > f:
> >         ret
> >
> > Bootstrap+test passes on aarch64-linux-gnu, and SVE bootstrap in progress.
> > Does it look OK for stage-1 ?
>
> I didn't look at the patch but tree-ssa-forwprop.cc:simplify_permutation should
> handle two consecutive permutes with the is_combined_permutation_identity
> which might need tweaking for VLA vectors
Hi Richard,
Thanks for the suggestions. The attached patch modifies
is_combined_permutation_identity
to recognize the above pattern.
Does it look OK ?
Bootstrap+test in progress on aarch64-linux-gnu and x86_64-linux-gnu.

Thanks,
Prathamesh
>
> Richard.
>
> >
> > Thanks,
> > Prathamesh

[-- Attachment #2: gnu-830-2.txt --]
[-- Type: text/plain, Size: 2983 bytes --]

gcc/ChangeLog:
	* tree-ssa-forwprop.cc (is_combined_permutation_identity):
	New parameter def_stmt.
	Try to simplify two successive VEC_PERM_EXPRs with single operand
	and same mask, where mask chooses elements in reverse order.

gcc/testesuite/ChangeLog:
	* gcc.target/aarch64/sve/acle/general/rev-1.c: New test.

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
new file mode 100644
index 00000000000..e57ee67d716
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+
+#include <arm_sve.h>
+
+svint32_t f(svint32_t v)
+{
+  return svrev_s32 (svrev_s32 (v));
+}
+
+/* { dg-final { scan-tree-dump "return v_1\\(D\\)" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized" } } */
diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
index 9b567440ba4..5cbee077d89 100644
--- a/gcc/tree-ssa-forwprop.cc
+++ b/gcc/tree-ssa-forwprop.cc
@@ -2532,7 +2532,7 @@ simplify_bitfield_ref (gimple_stmt_iterator *gsi)
    gives back one of the input.  */
 
 static int
-is_combined_permutation_identity (tree mask1, tree mask2)
+is_combined_permutation_identity (tree mask1, tree mask2, gimple *def_stmt = NULL)
 {
   tree mask;
   unsigned HOST_WIDE_INT nelts, i, j;
@@ -2541,7 +2541,36 @@ is_combined_permutation_identity (tree mask1, tree mask2)
 
   gcc_checking_assert (TREE_CODE (mask1) == VECTOR_CST
 		       && TREE_CODE (mask2) == VECTOR_CST);
+  if (def_stmt)
+    gcc_checking_assert (is_gimple_assign (def_stmt)
+			 && gimple_assign_rhs_code (def_stmt) == VEC_PERM_EXPR);
+
   mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2);
+
+  /* For VLA masks, check for the following pattern:
+     v1 = VEC_PERM_EXPR (v0, v0, mask)
+     v2 = VEC_PERM_EXPR (v1, v1, mask)
+     -->
+     v2 = v0
+     if mask is {nelts - 1, nelts - 2, ...}.  */
+
+  if (operand_equal_p (mask1, mask2, 0)
+      && !VECTOR_CST_NELTS (mask1).is_constant ()
+      && def_stmt
+      && operand_equal_p (gimple_assign_rhs1 (def_stmt),
+			  gimple_assign_rhs2 (def_stmt), 0))
+    {
+      vec_perm_builder builder;
+      if (tree_to_vec_perm_builder (&builder, mask1))
+	{
+	  poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1));
+	  vec_perm_indices sel (builder, 1, nelts);
+	  if (sel.series_p (0, 1, nelts - 1, -1))
+	    return 1;
+	}
+      return 0;
+    }
+
   if (mask == NULL_TREE || TREE_CODE (mask) != VECTOR_CST)
     return 0;
 
@@ -2629,7 +2658,7 @@ simplify_permutation (gimple_stmt_iterator *gsi)
       op3 = gimple_assign_rhs3 (def_stmt);
       if (TREE_CODE (op3) != VECTOR_CST)
 	return 0;
-      ident = is_combined_permutation_identity (op3, op2);
+      ident = is_combined_permutation_identity (op3, op2, def_stmt);
       if (!ident)
 	return 0;
       orig = (ident == 1) ? gimple_assign_rhs1 (def_stmt)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v
  2023-04-11 14:06   ` Prathamesh Kulkarni
@ 2023-04-19  9:21     ` Prathamesh Kulkarni
  2023-04-19 10:45       ` Richard Biener
  0 siblings, 1 reply; 11+ messages in thread
From: Prathamesh Kulkarni @ 2023-04-19  9:21 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc Patches, Richard Sandiford

On Tue, 11 Apr 2023 at 19:36, Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
>
> On Tue, 11 Apr 2023 at 14:17, Richard Biener <richard.guenther@gmail.com> wrote:
> >
> > On Wed, Apr 5, 2023 at 10:39 AM Prathamesh Kulkarni via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > Hi,
> > > For the following test:
> > >
> > > svint32_t f(svint32_t v)
> > > {
> > >   return svrev_s32 (svrev_s32 (v));
> > > }
> > >
> > > We generate 2 rev instructions instead of nop:
> > > f:
> > >         rev     z0.s, z0.s
> > >         rev     z0.s, z0.s
> > >         ret
> > >
> > > The attached patch tries to fix that by trying to recognize the following
> > > pattern in match.pd:
> > > v1 = VEC_PERM_EXPR (v0, v0, mask)
> > > v2 = VEC_PERM_EXPR (v1, v1, mask)
> > > -->
> > > v2 = v0
> > > if mask is { nelts - 1, nelts - 2, nelts - 3, ... }
> > >
> > > Code-gen with patch:
> > > f:
> > >         ret
> > >
> > > Bootstrap+test passes on aarch64-linux-gnu, and SVE bootstrap in progress.
> > > Does it look OK for stage-1 ?
> >
> > I didn't look at the patch but tree-ssa-forwprop.cc:simplify_permutation should
> > handle two consecutive permutes with the is_combined_permutation_identity
> > which might need tweaking for VLA vectors
> Hi Richard,
> Thanks for the suggestions. The attached patch modifies
> is_combined_permutation_identity
> to recognize the above pattern.
> Does it look OK ?
> Bootstrap+test in progress on aarch64-linux-gnu and x86_64-linux-gnu.
Hi,
ping https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615502.html

Thanks,
Prathamesh
>
> Thanks,
> Prathamesh
> >
> > Richard.
> >
> > >
> > > Thanks,
> > > Prathamesh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v
  2023-04-19  9:21     ` Prathamesh Kulkarni
@ 2023-04-19 10:45       ` Richard Biener
  2023-04-19 12:20         ` Prathamesh Kulkarni
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Biener @ 2023-04-19 10:45 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: gcc Patches, Richard Sandiford

On Wed, Apr 19, 2023 at 11:21 AM Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
>
> On Tue, 11 Apr 2023 at 19:36, Prathamesh Kulkarni
> <prathamesh.kulkarni@linaro.org> wrote:
> >
> > On Tue, 11 Apr 2023 at 14:17, Richard Biener <richard.guenther@gmail.com> wrote:
> > >
> > > On Wed, Apr 5, 2023 at 10:39 AM Prathamesh Kulkarni via Gcc-patches
> > > <gcc-patches@gcc.gnu.org> wrote:
> > > >
> > > > Hi,
> > > > For the following test:
> > > >
> > > > svint32_t f(svint32_t v)
> > > > {
> > > >   return svrev_s32 (svrev_s32 (v));
> > > > }
> > > >
> > > > We generate 2 rev instructions instead of nop:
> > > > f:
> > > >         rev     z0.s, z0.s
> > > >         rev     z0.s, z0.s
> > > >         ret
> > > >
> > > > The attached patch tries to fix that by trying to recognize the following
> > > > pattern in match.pd:
> > > > v1 = VEC_PERM_EXPR (v0, v0, mask)
> > > > v2 = VEC_PERM_EXPR (v1, v1, mask)
> > > > -->
> > > > v2 = v0
> > > > if mask is { nelts - 1, nelts - 2, nelts - 3, ... }
> > > >
> > > > Code-gen with patch:
> > > > f:
> > > >         ret
> > > >
> > > > Bootstrap+test passes on aarch64-linux-gnu, and SVE bootstrap in progress.
> > > > Does it look OK for stage-1 ?
> > >
> > > I didn't look at the patch but tree-ssa-forwprop.cc:simplify_permutation should
> > > handle two consecutive permutes with the is_combined_permutation_identity
> > > which might need tweaking for VLA vectors
> > Hi Richard,
> > Thanks for the suggestions. The attached patch modifies
> > is_combined_permutation_identity
> > to recognize the above pattern.
> > Does it look OK ?
> > Bootstrap+test in progress on aarch64-linux-gnu and x86_64-linux-gnu.
> Hi,
> ping https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615502.html

Can you instead of def_stmt pass in a bool whether rhs1 is equal to rhs2
and amend the function comment accordingly, say,

  tem = VEC_PERM <op0, op1, MASK1>;
  res = VEC_PERM <tem, tem, MASK2>;

SAME_P specifies whether op0 and op1 compare equal.  */

+  if (def_stmt)
+    gcc_checking_assert (is_gimple_assign (def_stmt)
+                        && gimple_assign_rhs_code (def_stmt) == VEC_PERM_EXPR);
this is then unnecessary

   mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2);
+
+  /* For VLA masks, check for the following pattern:
+     v1 = VEC_PERM_EXPR (v0, v0, mask)
+     v2 = VEC_PERM_EXPR (v1, v1, mask)
+     -->
+     v2 = v0

you are not using 'mask' so please defer fold_ternary until after your
special-case.

+  if (operand_equal_p (mask1, mask2, 0)
+      && !VECTOR_CST_NELTS (mask1).is_constant ()
+      && def_stmt
+      && operand_equal_p (gimple_assign_rhs1 (def_stmt),
+                         gimple_assign_rhs2 (def_stmt), 0))
+    {
+      vec_perm_builder builder;
+      if (tree_to_vec_perm_builder (&builder, mask1))
+       {
+         poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1));
+         vec_perm_indices sel (builder, 1, nelts);
+         if (sel.series_p (0, 1, nelts - 1, -1))
+           return 1;
+       }
+      return 0;

I'm defering to Richard whether this is the correct way to check for a vector
reversing mask (I wonder how constructing such mask is even possible)

Richard.

> Thanks,
> Prathamesh
> >
> > Thanks,
> > Prathamesh
> > >
> > > Richard.
> > >
> > > >
> > > > Thanks,
> > > > Prathamesh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v
  2023-04-19 10:45       ` Richard Biener
@ 2023-04-19 12:20         ` Prathamesh Kulkarni
  2023-04-19 12:32           ` Richard Biener
  2023-04-21 16:27           ` Richard Sandiford
  0 siblings, 2 replies; 11+ messages in thread
From: Prathamesh Kulkarni @ 2023-04-19 12:20 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc Patches, Richard Sandiford

[-- Attachment #1: Type: text/plain, Size: 4638 bytes --]

On Wed, 19 Apr 2023 at 16:17, Richard Biener <richard.guenther@gmail.com> wrote:
>
> On Wed, Apr 19, 2023 at 11:21 AM Prathamesh Kulkarni
> <prathamesh.kulkarni@linaro.org> wrote:
> >
> > On Tue, 11 Apr 2023 at 19:36, Prathamesh Kulkarni
> > <prathamesh.kulkarni@linaro.org> wrote:
> > >
> > > On Tue, 11 Apr 2023 at 14:17, Richard Biener <richard.guenther@gmail.com> wrote:
> > > >
> > > > On Wed, Apr 5, 2023 at 10:39 AM Prathamesh Kulkarni via Gcc-patches
> > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > >
> > > > > Hi,
> > > > > For the following test:
> > > > >
> > > > > svint32_t f(svint32_t v)
> > > > > {
> > > > >   return svrev_s32 (svrev_s32 (v));
> > > > > }
> > > > >
> > > > > We generate 2 rev instructions instead of nop:
> > > > > f:
> > > > >         rev     z0.s, z0.s
> > > > >         rev     z0.s, z0.s
> > > > >         ret
> > > > >
> > > > > The attached patch tries to fix that by trying to recognize the following
> > > > > pattern in match.pd:
> > > > > v1 = VEC_PERM_EXPR (v0, v0, mask)
> > > > > v2 = VEC_PERM_EXPR (v1, v1, mask)
> > > > > -->
> > > > > v2 = v0
> > > > > if mask is { nelts - 1, nelts - 2, nelts - 3, ... }
> > > > >
> > > > > Code-gen with patch:
> > > > > f:
> > > > >         ret
> > > > >
> > > > > Bootstrap+test passes on aarch64-linux-gnu, and SVE bootstrap in progress.
> > > > > Does it look OK for stage-1 ?
> > > >
> > > > I didn't look at the patch but tree-ssa-forwprop.cc:simplify_permutation should
> > > > handle two consecutive permutes with the is_combined_permutation_identity
> > > > which might need tweaking for VLA vectors
> > > Hi Richard,
> > > Thanks for the suggestions. The attached patch modifies
> > > is_combined_permutation_identity
> > > to recognize the above pattern.
> > > Does it look OK ?
> > > Bootstrap+test in progress on aarch64-linux-gnu and x86_64-linux-gnu.
> > Hi,
> > ping https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615502.html
>
> Can you instead of def_stmt pass in a bool whether rhs1 is equal to rhs2
> and amend the function comment accordingly, say,
>
>   tem = VEC_PERM <op0, op1, MASK1>;
>   res = VEC_PERM <tem, tem, MASK2>;
>
> SAME_P specifies whether op0 and op1 compare equal.  */
>
> +  if (def_stmt)
> +    gcc_checking_assert (is_gimple_assign (def_stmt)
> +                        && gimple_assign_rhs_code (def_stmt) == VEC_PERM_EXPR);
> this is then unnecessary
>
>    mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2);
> +
> +  /* For VLA masks, check for the following pattern:
> +     v1 = VEC_PERM_EXPR (v0, v0, mask)
> +     v2 = VEC_PERM_EXPR (v1, v1, mask)
> +     -->
> +     v2 = v0
>
> you are not using 'mask' so please defer fold_ternary until after your
> special-case.
>
> +  if (operand_equal_p (mask1, mask2, 0)
> +      && !VECTOR_CST_NELTS (mask1).is_constant ()
> +      && def_stmt
> +      && operand_equal_p (gimple_assign_rhs1 (def_stmt),
> +                         gimple_assign_rhs2 (def_stmt), 0))
> +    {
> +      vec_perm_builder builder;
> +      if (tree_to_vec_perm_builder (&builder, mask1))
> +       {
> +         poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1));
> +         vec_perm_indices sel (builder, 1, nelts);
> +         if (sel.series_p (0, 1, nelts - 1, -1))
> +           return 1;
> +       }
> +      return 0;
>
> I'm defering to Richard whether this is the correct way to check for a vector
> reversing mask (I wonder how constructing such mask is even possible)
Hi Richard,
Thanks for the suggestions, I have updated the patch accordingly.

The following hunk from svrev_impl::fold() constructs mask in reverse:
    /* Permute as { nelts - 1, nelts - 2, nelts - 3, ... }.  */
    poly_int64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (f.lhs));
    vec_perm_builder builder (nelts, 1, 3);
    for (int i = 0; i < 3; ++i)
      builder.quick_push (nelts - i - 1);
    return fold_permute (f, builder);

To see if mask chooses elements in reverse, I borrowed it from function comment
for series_p in vec-perm-indices.cc:
/* Return true if index OUT_BASE + I * OUT_STEP selects input
   element IN_BASE + I * IN_STEP.  For example, the call to test
   whether a permute reverses a vector of N elements would be:

     series_p (0, 1, N - 1, -1)

   which would return true for { N - 1, N - 2, N - 3, ... }.  */

Thanks,
Prathamesh
>
> Richard.
>
> > Thanks,
> > Prathamesh
> > >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > Richard.
> > > >
> > > > >
> > > > > Thanks,
> > > > > Prathamesh

[-- Attachment #2: gnu-830-4.txt --]
[-- Type: text/plain, Size: 3229 bytes --]

gcc/ChangeLog:
	* tree-ssa-forwprop.cc (is_combined_permutation_identity):
	New parameter same_p.
	Try to simplify two successive VEC_PERM_EXPRs with single operand
	and same mask, where mask chooses elements in reverse order.

gcc/testesuite/ChangeLog:
	* gcc.target/aarch64/sve/acle/general/rev-1.c: New test.

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
new file mode 100644
index 00000000000..e57ee67d716
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+
+#include <arm_sve.h>
+
+svint32_t f(svint32_t v)
+{
+  return svrev_s32 (svrev_s32 (v));
+}
+
+/* { dg-final { scan-tree-dump "return v_1\\(D\\)" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized" } } */
diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
index 9b567440ba4..ebd4a368ae9 100644
--- a/gcc/tree-ssa-forwprop.cc
+++ b/gcc/tree-ssa-forwprop.cc
@@ -2528,11 +2528,16 @@ simplify_bitfield_ref (gimple_stmt_iterator *gsi)
   return true;
 }
 
-/* Determine whether applying the 2 permutations (mask1 then mask2)
-   gives back one of the input.  */
+/* For the following sequence:
+   tem = VEC_PERM_EXPR <op0, op1, mask1>
+   res = VEC_PERM_EXPR <tem, tem, mask2>
+
+   Determine whether applying the 2 permutations (mask1 then mask2)
+   gives back one of the input. SAME_P specifies whether op0
+   and op1 compare equal.  */
 
 static int
-is_combined_permutation_identity (tree mask1, tree mask2)
+is_combined_permutation_identity (tree mask1, tree mask2, bool same_p)
 {
   tree mask;
   unsigned HOST_WIDE_INT nelts, i, j;
@@ -2541,6 +2546,29 @@ is_combined_permutation_identity (tree mask1, tree mask2)
 
   gcc_checking_assert (TREE_CODE (mask1) == VECTOR_CST
 		       && TREE_CODE (mask2) == VECTOR_CST);
+
+  /* For VLA masks, check for the following pattern:
+     v1 = VEC_PERM_EXPR (v0, v0, mask1)
+     v2 = VEC_PERM_EXPR (v1, v1, mask2)
+     -->
+     v2 = v0
+     if mask1 == mask2 == {nelts - 1, nelts - 2, ...}.  */
+
+  if (operand_equal_p (mask1, mask2, 0)
+      && !VECTOR_CST_NELTS (mask1).is_constant ()
+      && same_p)
+    {
+      vec_perm_builder builder;
+      if (tree_to_vec_perm_builder (&builder, mask1))
+	{
+	  poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1));
+	  vec_perm_indices sel (builder, 1, nelts);
+	  if (sel.series_p (0, 1, nelts - 1, -1))
+	    return 1;
+	}
+      return 0;
+    }
+
   mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2);
   if (mask == NULL_TREE || TREE_CODE (mask) != VECTOR_CST)
     return 0;
@@ -2629,7 +2657,9 @@ simplify_permutation (gimple_stmt_iterator *gsi)
       op3 = gimple_assign_rhs3 (def_stmt);
       if (TREE_CODE (op3) != VECTOR_CST)
 	return 0;
-      ident = is_combined_permutation_identity (op3, op2);
+      bool same_p = operand_equal_p (gimple_assign_rhs1 (def_stmt),
+				     gimple_assign_rhs2 (def_stmt), 0);
+      ident = is_combined_permutation_identity (op3, op2, same_p);
       if (!ident)
 	return 0;
       orig = (ident == 1) ? gimple_assign_rhs1 (def_stmt)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v
  2023-04-19 12:20         ` Prathamesh Kulkarni
@ 2023-04-19 12:32           ` Richard Biener
  2023-04-21 16:27           ` Richard Sandiford
  1 sibling, 0 replies; 11+ messages in thread
From: Richard Biener @ 2023-04-19 12:32 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: gcc Patches, Richard Sandiford

On Wed, Apr 19, 2023 at 2:20 PM Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
>
> On Wed, 19 Apr 2023 at 16:17, Richard Biener <richard.guenther@gmail.com> wrote:
> >
> > On Wed, Apr 19, 2023 at 11:21 AM Prathamesh Kulkarni
> > <prathamesh.kulkarni@linaro.org> wrote:
> > >
> > > On Tue, 11 Apr 2023 at 19:36, Prathamesh Kulkarni
> > > <prathamesh.kulkarni@linaro.org> wrote:
> > > >
> > > > On Tue, 11 Apr 2023 at 14:17, Richard Biener <richard.guenther@gmail.com> wrote:
> > > > >
> > > > > On Wed, Apr 5, 2023 at 10:39 AM Prathamesh Kulkarni via Gcc-patches
> > > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > > >
> > > > > > Hi,
> > > > > > For the following test:
> > > > > >
> > > > > > svint32_t f(svint32_t v)
> > > > > > {
> > > > > >   return svrev_s32 (svrev_s32 (v));
> > > > > > }
> > > > > >
> > > > > > We generate 2 rev instructions instead of nop:
> > > > > > f:
> > > > > >         rev     z0.s, z0.s
> > > > > >         rev     z0.s, z0.s
> > > > > >         ret
> > > > > >
> > > > > > The attached patch tries to fix that by trying to recognize the following
> > > > > > pattern in match.pd:
> > > > > > v1 = VEC_PERM_EXPR (v0, v0, mask)
> > > > > > v2 = VEC_PERM_EXPR (v1, v1, mask)
> > > > > > -->
> > > > > > v2 = v0
> > > > > > if mask is { nelts - 1, nelts - 2, nelts - 3, ... }
> > > > > >
> > > > > > Code-gen with patch:
> > > > > > f:
> > > > > >         ret
> > > > > >
> > > > > > Bootstrap+test passes on aarch64-linux-gnu, and SVE bootstrap in progress.
> > > > > > Does it look OK for stage-1 ?
> > > > >
> > > > > I didn't look at the patch but tree-ssa-forwprop.cc:simplify_permutation should
> > > > > handle two consecutive permutes with the is_combined_permutation_identity
> > > > > which might need tweaking for VLA vectors
> > > > Hi Richard,
> > > > Thanks for the suggestions. The attached patch modifies
> > > > is_combined_permutation_identity
> > > > to recognize the above pattern.
> > > > Does it look OK ?
> > > > Bootstrap+test in progress on aarch64-linux-gnu and x86_64-linux-gnu.
> > > Hi,
> > > ping https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615502.html
> >
> > Can you instead of def_stmt pass in a bool whether rhs1 is equal to rhs2
> > and amend the function comment accordingly, say,
> >
> >   tem = VEC_PERM <op0, op1, MASK1>;
> >   res = VEC_PERM <tem, tem, MASK2>;
> >
> > SAME_P specifies whether op0 and op1 compare equal.  */
> >
> > +  if (def_stmt)
> > +    gcc_checking_assert (is_gimple_assign (def_stmt)
> > +                        && gimple_assign_rhs_code (def_stmt) == VEC_PERM_EXPR);
> > this is then unnecessary
> >
> >    mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2);
> > +
> > +  /* For VLA masks, check for the following pattern:
> > +     v1 = VEC_PERM_EXPR (v0, v0, mask)
> > +     v2 = VEC_PERM_EXPR (v1, v1, mask)
> > +     -->
> > +     v2 = v0
> >
> > you are not using 'mask' so please defer fold_ternary until after your
> > special-case.
> >
> > +  if (operand_equal_p (mask1, mask2, 0)
> > +      && !VECTOR_CST_NELTS (mask1).is_constant ()
> > +      && def_stmt
> > +      && operand_equal_p (gimple_assign_rhs1 (def_stmt),
> > +                         gimple_assign_rhs2 (def_stmt), 0))
> > +    {
> > +      vec_perm_builder builder;
> > +      if (tree_to_vec_perm_builder (&builder, mask1))
> > +       {
> > +         poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1));
> > +         vec_perm_indices sel (builder, 1, nelts);
> > +         if (sel.series_p (0, 1, nelts - 1, -1))
> > +           return 1;
> > +       }
> > +      return 0;
> >
> > I'm defering to Richard whether this is the correct way to check for a vector
> > reversing mask (I wonder how constructing such mask is even possible)
> Hi Richard,
> Thanks for the suggestions, I have updated the patch accordingly.
>
> The following hunk from svrev_impl::fold() constructs mask in reverse:
>     /* Permute as { nelts - 1, nelts - 2, nelts - 3, ... }.  */
>     poly_int64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (f.lhs));
>     vec_perm_builder builder (nelts, 1, 3);
>     for (int i = 0; i < 3; ++i)
>       builder.quick_push (nelts - i - 1);
>     return fold_permute (f, builder);
>
> To see if mask chooses elements in reverse, I borrowed it from function comment
> for series_p in vec-perm-indices.cc:
> /* Return true if index OUT_BASE + I * OUT_STEP selects input
>    element IN_BASE + I * IN_STEP.  For example, the call to test
>    whether a permute reverses a vector of N elements would be:
>
>      series_p (0, 1, N - 1, -1)
>
>    which would return true for { N - 1, N - 2, N - 3, ... }.  */

Looks good from my side now, but as said I defer to Richard for the check.

Richard.

> Thanks,
> Prathamesh
> >
> > Richard.
> >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > Thanks,
> > > > Prathamesh
> > > > >
> > > > > Richard.
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > Prathamesh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v
  2023-04-19 12:20         ` Prathamesh Kulkarni
  2023-04-19 12:32           ` Richard Biener
@ 2023-04-21 16:27           ` Richard Sandiford
  2023-04-23  6:11             ` Prathamesh Kulkarni
  1 sibling, 1 reply; 11+ messages in thread
From: Richard Sandiford @ 2023-04-21 16:27 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: Richard Biener, gcc Patches

Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> On Wed, 19 Apr 2023 at 16:17, Richard Biener <richard.guenther@gmail.com> wrote:
>>
>> On Wed, Apr 19, 2023 at 11:21 AM Prathamesh Kulkarni
>> <prathamesh.kulkarni@linaro.org> wrote:
>> >
>> > On Tue, 11 Apr 2023 at 19:36, Prathamesh Kulkarni
>> > <prathamesh.kulkarni@linaro.org> wrote:
>> > >
>> > > On Tue, 11 Apr 2023 at 14:17, Richard Biener <richard.guenther@gmail.com> wrote:
>> > > >
>> > > > On Wed, Apr 5, 2023 at 10:39 AM Prathamesh Kulkarni via Gcc-patches
>> > > > <gcc-patches@gcc.gnu.org> wrote:
>> > > > >
>> > > > > Hi,
>> > > > > For the following test:
>> > > > >
>> > > > > svint32_t f(svint32_t v)
>> > > > > {
>> > > > >   return svrev_s32 (svrev_s32 (v));
>> > > > > }
>> > > > >
>> > > > > We generate 2 rev instructions instead of nop:
>> > > > > f:
>> > > > >         rev     z0.s, z0.s
>> > > > >         rev     z0.s, z0.s
>> > > > >         ret
>> > > > >
>> > > > > The attached patch tries to fix that by trying to recognize the following
>> > > > > pattern in match.pd:
>> > > > > v1 = VEC_PERM_EXPR (v0, v0, mask)
>> > > > > v2 = VEC_PERM_EXPR (v1, v1, mask)
>> > > > > -->
>> > > > > v2 = v0
>> > > > > if mask is { nelts - 1, nelts - 2, nelts - 3, ... }
>> > > > >
>> > > > > Code-gen with patch:
>> > > > > f:
>> > > > >         ret
>> > > > >
>> > > > > Bootstrap+test passes on aarch64-linux-gnu, and SVE bootstrap in progress.
>> > > > > Does it look OK for stage-1 ?
>> > > >
>> > > > I didn't look at the patch but tree-ssa-forwprop.cc:simplify_permutation should
>> > > > handle two consecutive permutes with the is_combined_permutation_identity
>> > > > which might need tweaking for VLA vectors
>> > > Hi Richard,
>> > > Thanks for the suggestions. The attached patch modifies
>> > > is_combined_permutation_identity
>> > > to recognize the above pattern.
>> > > Does it look OK ?
>> > > Bootstrap+test in progress on aarch64-linux-gnu and x86_64-linux-gnu.
>> > Hi,
>> > ping https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615502.html
>>
>> Can you instead of def_stmt pass in a bool whether rhs1 is equal to rhs2
>> and amend the function comment accordingly, say,
>>
>>   tem = VEC_PERM <op0, op1, MASK1>;
>>   res = VEC_PERM <tem, tem, MASK2>;
>>
>> SAME_P specifies whether op0 and op1 compare equal.  */
>>
>> +  if (def_stmt)
>> +    gcc_checking_assert (is_gimple_assign (def_stmt)
>> +                        && gimple_assign_rhs_code (def_stmt) == VEC_PERM_EXPR);
>> this is then unnecessary
>>
>>    mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2);
>> +
>> +  /* For VLA masks, check for the following pattern:
>> +     v1 = VEC_PERM_EXPR (v0, v0, mask)
>> +     v2 = VEC_PERM_EXPR (v1, v1, mask)
>> +     -->
>> +     v2 = v0
>>
>> you are not using 'mask' so please defer fold_ternary until after your
>> special-case.
>>
>> +  if (operand_equal_p (mask1, mask2, 0)
>> +      && !VECTOR_CST_NELTS (mask1).is_constant ()
>> +      && def_stmt
>> +      && operand_equal_p (gimple_assign_rhs1 (def_stmt),
>> +                         gimple_assign_rhs2 (def_stmt), 0))
>> +    {
>> +      vec_perm_builder builder;
>> +      if (tree_to_vec_perm_builder (&builder, mask1))
>> +       {
>> +         poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1));
>> +         vec_perm_indices sel (builder, 1, nelts);
>> +         if (sel.series_p (0, 1, nelts - 1, -1))
>> +           return 1;
>> +       }
>> +      return 0;
>>
>> I'm defering to Richard whether this is the correct way to check for a vector
>> reversing mask (I wonder how constructing such mask is even possible)
> Hi Richard,
> Thanks for the suggestions, I have updated the patch accordingly.
>
> The following hunk from svrev_impl::fold() constructs mask in reverse:
>     /* Permute as { nelts - 1, nelts - 2, nelts - 3, ... }.  */
>     poly_int64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (f.lhs));
>     vec_perm_builder builder (nelts, 1, 3);
>     for (int i = 0; i < 3; ++i)
>       builder.quick_push (nelts - i - 1);
>     return fold_permute (f, builder);
>
> To see if mask chooses elements in reverse, I borrowed it from function comment
> for series_p in vec-perm-indices.cc:
> /* Return true if index OUT_BASE + I * OUT_STEP selects input
>    element IN_BASE + I * IN_STEP.  For example, the call to test
>    whether a permute reverses a vector of N elements would be:
>
>      series_p (0, 1, N - 1, -1)
>
>    which would return true for { N - 1, N - 2, N - 3, ... }.  */
>
> Thanks,
> Prathamesh
>>
>> Richard.
>>
>> > Thanks,
>> > Prathamesh
>> > >
>> > > Thanks,
>> > > Prathamesh
>> > > >
>> > > > Richard.
>> > > >
>> > > > >
>> > > > > Thanks,
>> > > > > Prathamesh
>
> gcc/ChangeLog:
> 	* tree-ssa-forwprop.cc (is_combined_permutation_identity):
> 	New parameter same_p.
> 	Try to simplify two successive VEC_PERM_EXPRs with single operand
> 	and same mask, where mask chooses elements in reverse order.
>
> gcc/testesuite/ChangeLog:
> 	* gcc.target/aarch64/sve/acle/general/rev-1.c: New test.
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
> new file mode 100644
> index 00000000000..e57ee67d716
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fdump-tree-optimized" } */
> +
> +#include <arm_sve.h>
> +
> +svint32_t f(svint32_t v)
> +{
> +  return svrev_s32 (svrev_s32 (v));
> +}
> +
> +/* { dg-final { scan-tree-dump "return v_1\\(D\\)" "optimized" } } */
> +/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized" } } */
> diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
> index 9b567440ba4..ebd4a368ae9 100644
> --- a/gcc/tree-ssa-forwprop.cc
> +++ b/gcc/tree-ssa-forwprop.cc
> @@ -2528,11 +2528,16 @@ simplify_bitfield_ref (gimple_stmt_iterator *gsi)
>    return true;
>  }
>  
> -/* Determine whether applying the 2 permutations (mask1 then mask2)
> -   gives back one of the input.  */
> +/* For the following sequence:
> +   tem = VEC_PERM_EXPR <op0, op1, mask1>
> +   res = VEC_PERM_EXPR <tem, tem, mask2>
> +
> +   Determine whether applying the 2 permutations (mask1 then mask2)
> +   gives back one of the input. SAME_P specifies whether op0
> +   and op1 compare equal.  */
>  
>  static int
> -is_combined_permutation_identity (tree mask1, tree mask2)
> +is_combined_permutation_identity (tree mask1, tree mask2, bool same_p)
>  {
>    tree mask;
>    unsigned HOST_WIDE_INT nelts, i, j;
> @@ -2541,6 +2546,29 @@ is_combined_permutation_identity (tree mask1, tree mask2)
>  
>    gcc_checking_assert (TREE_CODE (mask1) == VECTOR_CST
>  		       && TREE_CODE (mask2) == VECTOR_CST);
> +
> +  /* For VLA masks, check for the following pattern:
> +     v1 = VEC_PERM_EXPR (v0, v0, mask1)
> +     v2 = VEC_PERM_EXPR (v1, v1, mask2)
> +     -->
> +     v2 = v0
> +     if mask1 == mask2 == {nelts - 1, nelts - 2, ...}.  */
> +
> +  if (operand_equal_p (mask1, mask2, 0)
> +      && !VECTOR_CST_NELTS (mask1).is_constant ()
> +      && same_p)

The same_p isn't needed, since a reverse mask only picks from the
first operand.  Canonicalisation rules should ensure that op0 and op1
end up being equal in that case, but this fold doesn't depend on that.

> +    {
> +      vec_perm_builder builder;
> +      if (tree_to_vec_perm_builder (&builder, mask1))
> +	{
> +	  poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1));
> +	  vec_perm_indices sel (builder, 1, nelts);
> +	  if (sel.series_p (0, 1, nelts - 1, -1))
> +	    return 1;
> +	}

I have conflicting feelings about this.

(1) It seems quite special-purpose.  I'm a bit worried that we'll end
    up with a list of highly specific folds (which could grow quite long)
    rather than something more general.

(2) Constructing a vec_perm_indices seems quite heavyweight for this
    single check.  We could go directly from the mask instead.

But “fixing” (2) reduces the generality of the structure and so leans
more heavily into (1).

I agree the check is correct though, so let's go with the patch
in this form (without the same_p thing).  However:

> +      return 0;
> +    }
> +

I don't think we want this early exit.  It won't be used for all VLA cases
due the operand_equal_p check.  And in future, more VLA-compatible stuff
might be added further down.

Thanks,
Richard

>    mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2);
>    if (mask == NULL_TREE || TREE_CODE (mask) != VECTOR_CST)
>      return 0;
> @@ -2629,7 +2657,9 @@ simplify_permutation (gimple_stmt_iterator *gsi)
>        op3 = gimple_assign_rhs3 (def_stmt);
>        if (TREE_CODE (op3) != VECTOR_CST)
>  	return 0;
> -      ident = is_combined_permutation_identity (op3, op2);
> +      bool same_p = operand_equal_p (gimple_assign_rhs1 (def_stmt),
> +				     gimple_assign_rhs2 (def_stmt), 0);
> +      ident = is_combined_permutation_identity (op3, op2, same_p);
>        if (!ident)
>  	return 0;
>        orig = (ident == 1) ? gimple_assign_rhs1 (def_stmt)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v
  2023-04-21 16:27           ` Richard Sandiford
@ 2023-04-23  6:11             ` Prathamesh Kulkarni
  2023-04-24  9:32               ` Richard Sandiford
  0 siblings, 1 reply; 11+ messages in thread
From: Prathamesh Kulkarni @ 2023-04-23  6:11 UTC (permalink / raw)
  To: Prathamesh Kulkarni, Richard Biener, gcc Patches, richard.sandiford

[-- Attachment #1: Type: text/plain, Size: 10184 bytes --]

On Fri, 21 Apr 2023 at 21:57, Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> > On Wed, 19 Apr 2023 at 16:17, Richard Biener <richard.guenther@gmail.com> wrote:
> >>
> >> On Wed, Apr 19, 2023 at 11:21 AM Prathamesh Kulkarni
> >> <prathamesh.kulkarni@linaro.org> wrote:
> >> >
> >> > On Tue, 11 Apr 2023 at 19:36, Prathamesh Kulkarni
> >> > <prathamesh.kulkarni@linaro.org> wrote:
> >> > >
> >> > > On Tue, 11 Apr 2023 at 14:17, Richard Biener <richard.guenther@gmail.com> wrote:
> >> > > >
> >> > > > On Wed, Apr 5, 2023 at 10:39 AM Prathamesh Kulkarni via Gcc-patches
> >> > > > <gcc-patches@gcc.gnu.org> wrote:
> >> > > > >
> >> > > > > Hi,
> >> > > > > For the following test:
> >> > > > >
> >> > > > > svint32_t f(svint32_t v)
> >> > > > > {
> >> > > > >   return svrev_s32 (svrev_s32 (v));
> >> > > > > }
> >> > > > >
> >> > > > > We generate 2 rev instructions instead of nop:
> >> > > > > f:
> >> > > > >         rev     z0.s, z0.s
> >> > > > >         rev     z0.s, z0.s
> >> > > > >         ret
> >> > > > >
> >> > > > > The attached patch tries to fix that by trying to recognize the following
> >> > > > > pattern in match.pd:
> >> > > > > v1 = VEC_PERM_EXPR (v0, v0, mask)
> >> > > > > v2 = VEC_PERM_EXPR (v1, v1, mask)
> >> > > > > -->
> >> > > > > v2 = v0
> >> > > > > if mask is { nelts - 1, nelts - 2, nelts - 3, ... }
> >> > > > >
> >> > > > > Code-gen with patch:
> >> > > > > f:
> >> > > > >         ret
> >> > > > >
> >> > > > > Bootstrap+test passes on aarch64-linux-gnu, and SVE bootstrap in progress.
> >> > > > > Does it look OK for stage-1 ?
> >> > > >
> >> > > > I didn't look at the patch but tree-ssa-forwprop.cc:simplify_permutation should
> >> > > > handle two consecutive permutes with the is_combined_permutation_identity
> >> > > > which might need tweaking for VLA vectors
> >> > > Hi Richard,
> >> > > Thanks for the suggestions. The attached patch modifies
> >> > > is_combined_permutation_identity
> >> > > to recognize the above pattern.
> >> > > Does it look OK ?
> >> > > Bootstrap+test in progress on aarch64-linux-gnu and x86_64-linux-gnu.
> >> > Hi,
> >> > ping https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615502.html
> >>
> >> Can you instead of def_stmt pass in a bool whether rhs1 is equal to rhs2
> >> and amend the function comment accordingly, say,
> >>
> >>   tem = VEC_PERM <op0, op1, MASK1>;
> >>   res = VEC_PERM <tem, tem, MASK2>;
> >>
> >> SAME_P specifies whether op0 and op1 compare equal.  */
> >>
> >> +  if (def_stmt)
> >> +    gcc_checking_assert (is_gimple_assign (def_stmt)
> >> +                        && gimple_assign_rhs_code (def_stmt) == VEC_PERM_EXPR);
> >> this is then unnecessary
> >>
> >>    mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2);
> >> +
> >> +  /* For VLA masks, check for the following pattern:
> >> +     v1 = VEC_PERM_EXPR (v0, v0, mask)
> >> +     v2 = VEC_PERM_EXPR (v1, v1, mask)
> >> +     -->
> >> +     v2 = v0
> >>
> >> you are not using 'mask' so please defer fold_ternary until after your
> >> special-case.
> >>
> >> +  if (operand_equal_p (mask1, mask2, 0)
> >> +      && !VECTOR_CST_NELTS (mask1).is_constant ()
> >> +      && def_stmt
> >> +      && operand_equal_p (gimple_assign_rhs1 (def_stmt),
> >> +                         gimple_assign_rhs2 (def_stmt), 0))
> >> +    {
> >> +      vec_perm_builder builder;
> >> +      if (tree_to_vec_perm_builder (&builder, mask1))
> >> +       {
> >> +         poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1));
> >> +         vec_perm_indices sel (builder, 1, nelts);
> >> +         if (sel.series_p (0, 1, nelts - 1, -1))
> >> +           return 1;
> >> +       }
> >> +      return 0;
> >>
> >> I'm defering to Richard whether this is the correct way to check for a vector
> >> reversing mask (I wonder how constructing such mask is even possible)
> > Hi Richard,
> > Thanks for the suggestions, I have updated the patch accordingly.
> >
> > The following hunk from svrev_impl::fold() constructs mask in reverse:
> >     /* Permute as { nelts - 1, nelts - 2, nelts - 3, ... }.  */
> >     poly_int64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (f.lhs));
> >     vec_perm_builder builder (nelts, 1, 3);
> >     for (int i = 0; i < 3; ++i)
> >       builder.quick_push (nelts - i - 1);
> >     return fold_permute (f, builder);
> >
> > To see if mask chooses elements in reverse, I borrowed it from function comment
> > for series_p in vec-perm-indices.cc:
> > /* Return true if index OUT_BASE + I * OUT_STEP selects input
> >    element IN_BASE + I * IN_STEP.  For example, the call to test
> >    whether a permute reverses a vector of N elements would be:
> >
> >      series_p (0, 1, N - 1, -1)
> >
> >    which would return true for { N - 1, N - 2, N - 3, ... }.  */
> >
> > Thanks,
> > Prathamesh
> >>
> >> Richard.
> >>
> >> > Thanks,
> >> > Prathamesh
> >> > >
> >> > > Thanks,
> >> > > Prathamesh
> >> > > >
> >> > > > Richard.
> >> > > >
> >> > > > >
> >> > > > > Thanks,
> >> > > > > Prathamesh
> >
> > gcc/ChangeLog:
> >       * tree-ssa-forwprop.cc (is_combined_permutation_identity):
> >       New parameter same_p.
> >       Try to simplify two successive VEC_PERM_EXPRs with single operand
> >       and same mask, where mask chooses elements in reverse order.
> >
> > gcc/testesuite/ChangeLog:
> >       * gcc.target/aarch64/sve/acle/general/rev-1.c: New test.
> >
> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
> > new file mode 100644
> > index 00000000000..e57ee67d716
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O3 -fdump-tree-optimized" } */
> > +
> > +#include <arm_sve.h>
> > +
> > +svint32_t f(svint32_t v)
> > +{
> > +  return svrev_s32 (svrev_s32 (v));
> > +}
> > +
> > +/* { dg-final { scan-tree-dump "return v_1\\(D\\)" "optimized" } } */
> > +/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized" } } */
> > diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
> > index 9b567440ba4..ebd4a368ae9 100644
> > --- a/gcc/tree-ssa-forwprop.cc
> > +++ b/gcc/tree-ssa-forwprop.cc
> > @@ -2528,11 +2528,16 @@ simplify_bitfield_ref (gimple_stmt_iterator *gsi)
> >    return true;
> >  }
> >
> > -/* Determine whether applying the 2 permutations (mask1 then mask2)
> > -   gives back one of the input.  */
> > +/* For the following sequence:
> > +   tem = VEC_PERM_EXPR <op0, op1, mask1>
> > +   res = VEC_PERM_EXPR <tem, tem, mask2>
> > +
> > +   Determine whether applying the 2 permutations (mask1 then mask2)
> > +   gives back one of the input. SAME_P specifies whether op0
> > +   and op1 compare equal.  */
> >
> >  static int
> > -is_combined_permutation_identity (tree mask1, tree mask2)
> > +is_combined_permutation_identity (tree mask1, tree mask2, bool same_p)
> >  {
> >    tree mask;
> >    unsigned HOST_WIDE_INT nelts, i, j;
> > @@ -2541,6 +2546,29 @@ is_combined_permutation_identity (tree mask1, tree mask2)
> >
> >    gcc_checking_assert (TREE_CODE (mask1) == VECTOR_CST
> >                      && TREE_CODE (mask2) == VECTOR_CST);
> > +
> > +  /* For VLA masks, check for the following pattern:
> > +     v1 = VEC_PERM_EXPR (v0, v0, mask1)
> > +     v2 = VEC_PERM_EXPR (v1, v1, mask2)
> > +     -->
> > +     v2 = v0
> > +     if mask1 == mask2 == {nelts - 1, nelts - 2, ...}.  */
> > +
> > +  if (operand_equal_p (mask1, mask2, 0)
> > +      && !VECTOR_CST_NELTS (mask1).is_constant ()
> > +      && same_p)
>
> The same_p isn't needed, since a reverse mask only picks from the
> first operand.  Canonicalisation rules should ensure that op0 and op1
> end up being equal in that case, but this fold doesn't depend on that.
>
> > +    {
> > +      vec_perm_builder builder;
> > +      if (tree_to_vec_perm_builder (&builder, mask1))
> > +     {
> > +       poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1));
> > +       vec_perm_indices sel (builder, 1, nelts);
> > +       if (sel.series_p (0, 1, nelts - 1, -1))
> > +         return 1;
> > +     }
>
> I have conflicting feelings about this.
>
> (1) It seems quite special-purpose.  I'm a bit worried that we'll end
>     up with a list of highly specific folds (which could grow quite long)
>     rather than something more general.
>
> (2) Constructing a vec_perm_indices seems quite heavyweight for this
>     single check.  We could go directly from the mask instead.
>
> But “fixing” (2) reduces the generality of the structure and so leans
> more heavily into (1).
>
> I agree the check is correct though, so let's go with the patch
> in this form (without the same_p thing).  However:
>
> > +      return 0;
> > +    }
> > +
>
> I don't think we want this early exit.  It won't be used for all VLA cases
> due the operand_equal_p check.  And in future, more VLA-compatible stuff
> might be added further down.
Hi Richard,
Thanks for the suggestions, does the attached patch look OK ?
Bootstrapped+tested on aarch64-linux-gnu.

Thanks,
Prathamesh
>
> Thanks,
> Richard
>
> >    mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2);
> >    if (mask == NULL_TREE || TREE_CODE (mask) != VECTOR_CST)
> >      return 0;
> > @@ -2629,7 +2657,9 @@ simplify_permutation (gimple_stmt_iterator *gsi)
> >        op3 = gimple_assign_rhs3 (def_stmt);
> >        if (TREE_CODE (op3) != VECTOR_CST)
> >       return 0;
> > -      ident = is_combined_permutation_identity (op3, op2);
> > +      bool same_p = operand_equal_p (gimple_assign_rhs1 (def_stmt),
> > +                                  gimple_assign_rhs2 (def_stmt), 0);
> > +      ident = is_combined_permutation_identity (op3, op2, same_p);
> >        if (!ident)
> >       return 0;
> >        orig = (ident == 1) ? gimple_assign_rhs1 (def_stmt)

[-- Attachment #2: gnu-830-5.txt --]
[-- Type: text/plain, Size: 1989 bytes --]

gcc/ChangeLog:
	* tree-ssa-forwprop.cc (is_combined_permutation_identity): Try to
	simplify two successive VEC_PERM_EXPRs with single operand and same
	mask, where mask chooses elements in reverse order.

gcc/testesuite/ChangeLog:
	* gcc.target/aarch64/sve/acle/general/rev-1.c: New test.

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
new file mode 100644
index 00000000000..e57ee67d716
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+
+#include <arm_sve.h>
+
+svint32_t f(svint32_t v)
+{
+  return svrev_s32 (svrev_s32 (v));
+}
+
+/* { dg-final { scan-tree-dump "return v_1\\(D\\)" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized" } } */
diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
index 9b567440ba4..61df7efe82c 100644
--- a/gcc/tree-ssa-forwprop.cc
+++ b/gcc/tree-ssa-forwprop.cc
@@ -2541,6 +2541,27 @@ is_combined_permutation_identity (tree mask1, tree mask2)
 
   gcc_checking_assert (TREE_CODE (mask1) == VECTOR_CST
 		       && TREE_CODE (mask2) == VECTOR_CST);
+
+  /* For VLA masks, check for the following pattern:
+     v1 = VEC_PERM_EXPR (v0, v0, mask1)
+     v2 = VEC_PERM_EXPR (v1, v1, mask2)
+     -->
+     v2 = v0
+     if mask1 == mask2 == {nelts - 1, nelts - 2, ...}.  */
+
+  if (operand_equal_p (mask1, mask2, 0)
+      && !VECTOR_CST_NELTS (mask1).is_constant ())
+    {
+      vec_perm_builder builder;
+      if (tree_to_vec_perm_builder (&builder, mask1))
+	{
+	  poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1));
+	  vec_perm_indices sel (builder, 1, nelts);
+	  if (sel.series_p (0, 1, nelts - 1, -1))
+	    return 1;
+	}
+    }
+
   mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2);
   if (mask == NULL_TREE || TREE_CODE (mask) != VECTOR_CST)
     return 0;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v
  2023-04-23  6:11             ` Prathamesh Kulkarni
@ 2023-04-24  9:32               ` Richard Sandiford
  2023-04-24 19:53                 ` Prathamesh Kulkarni
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Sandiford @ 2023-04-24  9:32 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: Richard Biener, gcc Patches

Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> gcc/ChangeLog:
> 	* tree-ssa-forwprop.cc (is_combined_permutation_identity): Try to
> 	simplify two successive VEC_PERM_EXPRs with single operand and same
> 	mask, where mask chooses elements in reverse order.
>
> gcc/testesuite/ChangeLog:
> 	* gcc.target/aarch64/sve/acle/general/rev-1.c: New test.
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
> new file mode 100644
> index 00000000000..e57ee67d716
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fdump-tree-optimized" } */
> +
> +#include <arm_sve.h>
> +
> +svint32_t f(svint32_t v)
> +{
> +  return svrev_s32 (svrev_s32 (v));
> +}
> +
> +/* { dg-final { scan-tree-dump "return v_1\\(D\\)" "optimized" } } */
> +/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized" } } */
> diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
> index 9b567440ba4..61df7efe82c 100644
> --- a/gcc/tree-ssa-forwprop.cc
> +++ b/gcc/tree-ssa-forwprop.cc
> @@ -2541,6 +2541,27 @@ is_combined_permutation_identity (tree mask1, tree mask2)
>  
>    gcc_checking_assert (TREE_CODE (mask1) == VECTOR_CST
>  		       && TREE_CODE (mask2) == VECTOR_CST);
> +
> +  /* For VLA masks, check for the following pattern:
> +     v1 = VEC_PERM_EXPR (v0, v0, mask1)
> +     v2 = VEC_PERM_EXPR (v1, v1, mask2)

Maybe blank out the second operands using "...":

     v1 = VEC_PERM_EXPR (v0, ..., mask1)
     v2 = VEC_PERM_EXPR (v1, ..., mask2)

to make it clear that they don't matter.

OK with that change, thanks.

Richard

> +     -->
> +     v2 = v0
> +     if mask1 == mask2 == {nelts - 1, nelts - 2, ...}.  */
> +
> +  if (operand_equal_p (mask1, mask2, 0)
> +      && !VECTOR_CST_NELTS (mask1).is_constant ())
> +    {
> +      vec_perm_builder builder;
> +      if (tree_to_vec_perm_builder (&builder, mask1))
> +	{
> +	  poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1));
> +	  vec_perm_indices sel (builder, 1, nelts);
> +	  if (sel.series_p (0, 1, nelts - 1, -1))
> +	    return 1;
> +	}
> +    }
> +
>    mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2);
>    if (mask == NULL_TREE || TREE_CODE (mask) != VECTOR_CST)
>      return 0;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v
  2023-04-24  9:32               ` Richard Sandiford
@ 2023-04-24 19:53                 ` Prathamesh Kulkarni
  0 siblings, 0 replies; 11+ messages in thread
From: Prathamesh Kulkarni @ 2023-04-24 19:53 UTC (permalink / raw)
  To: Prathamesh Kulkarni, Richard Biener, gcc Patches, richard.sandiford

On Mon, 24 Apr 2023 at 15:02, Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> > gcc/ChangeLog:
> >       * tree-ssa-forwprop.cc (is_combined_permutation_identity): Try to
> >       simplify two successive VEC_PERM_EXPRs with single operand and same
> >       mask, where mask chooses elements in reverse order.
> >
> > gcc/testesuite/ChangeLog:
> >       * gcc.target/aarch64/sve/acle/general/rev-1.c: New test.
> >
> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
> > new file mode 100644
> > index 00000000000..e57ee67d716
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O3 -fdump-tree-optimized" } */
> > +
> > +#include <arm_sve.h>
> > +
> > +svint32_t f(svint32_t v)
> > +{
> > +  return svrev_s32 (svrev_s32 (v));
> > +}
> > +
> > +/* { dg-final { scan-tree-dump "return v_1\\(D\\)" "optimized" } } */
> > +/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized" } } */
> > diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
> > index 9b567440ba4..61df7efe82c 100644
> > --- a/gcc/tree-ssa-forwprop.cc
> > +++ b/gcc/tree-ssa-forwprop.cc
> > @@ -2541,6 +2541,27 @@ is_combined_permutation_identity (tree mask1, tree mask2)
> >
> >    gcc_checking_assert (TREE_CODE (mask1) == VECTOR_CST
> >                      && TREE_CODE (mask2) == VECTOR_CST);
> > +
> > +  /* For VLA masks, check for the following pattern:
> > +     v1 = VEC_PERM_EXPR (v0, v0, mask1)
> > +     v2 = VEC_PERM_EXPR (v1, v1, mask2)
>
> Maybe blank out the second operands using "...":
>
>      v1 = VEC_PERM_EXPR (v0, ..., mask1)
>      v2 = VEC_PERM_EXPR (v1, ..., mask2)
>
> to make it clear that they don't matter.
>
> OK with that change, thanks.
Thanks, committed in:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f0eabc52c9a2d3da0bfc201da7a5c1658b76e9a4

Thanks,
Prathamesh
>
> Richard
>
> > +     -->
> > +     v2 = v0
> > +     if mask1 == mask2 == {nelts - 1, nelts - 2, ...}.  */
> > +
> > +  if (operand_equal_p (mask1, mask2, 0)
> > +      && !VECTOR_CST_NELTS (mask1).is_constant ())
> > +    {
> > +      vec_perm_builder builder;
> > +      if (tree_to_vec_perm_builder (&builder, mask1))
> > +     {
> > +       poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1));
> > +       vec_perm_indices sel (builder, 1, nelts);
> > +       if (sel.series_p (0, 1, nelts - 1, -1))
> > +         return 1;
> > +     }
> > +    }
> > +
> >    mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2);
> >    if (mask == NULL_TREE || TREE_CODE (mask) != VECTOR_CST)
> >      return 0;

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-04-24 19:54 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-05  8:38 [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v Prathamesh Kulkarni
2023-04-11  8:46 ` Richard Biener
2023-04-11 14:06   ` Prathamesh Kulkarni
2023-04-19  9:21     ` Prathamesh Kulkarni
2023-04-19 10:45       ` Richard Biener
2023-04-19 12:20         ` Prathamesh Kulkarni
2023-04-19 12:32           ` Richard Biener
2023-04-21 16:27           ` Richard Sandiford
2023-04-23  6:11             ` Prathamesh Kulkarni
2023-04-24  9:32               ` Richard Sandiford
2023-04-24 19:53                 ` Prathamesh Kulkarni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).