[PATCH v2 9/18]middle-end optimize slp simplify back to back permutes.

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH v2 9/18]middle-end optimize slp simplify back to back permutes.
@ 2020-11-03 15:07 Tamar Christina
  2020-11-04 13:00 ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Tamar Christina @ 2020-11-03 15:07 UTC (permalink / raw)
  To: gcc-patches; +Cc: nd, rguenther, ook

[-- Attachment #1: Type: text/plain, Size: 1289 bytes --]

Hi All,

This optimizes sequential permutes. i.e. if there are two permutes back to back
this function applies the permute of the parent to the child and removed the
parent.

If the resulting permute in the child is now a no-op.  Then the child is also
dropped from the graph and the parent's parent attached to the child's child.

This relies on the materialization point calculation in optimize SLP.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Tests are included as part of the final patch as they need the SLP pattern
matcher to insert permutes in between.

This allows us to remove useless permutes such as

	ldr     q0, [x0, x3]
	ldr     q2, [x1, x3]
	trn1    v1.4s, v0.4s, v0.4s
	trn2    v0.4s, v0.4s, v0.4s
	trn1    v0.4s, v1.4s, v0.4s
	mov     v1.16b, v3.16b
	fcmla   v1.4s, v0.4s, v2.4s, #0
	fcmla   v1.4s, v0.4s, v2.4s, #90
	str     q1, [x2, x3]

from the sequence the vectorizer puts out and give

	ldr     q0, [x0, x3]
	ldr     q2, [x1, x3]
	mov     v1.16b, v3.16b
	fcmla   v1.4s, v0.4s, v2.4s, #0
	fcmla   v1.4s, v0.4s, v2.4s, #90
	str     q1, [x2, x3]

instead

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* tree-vect-slp.c (vect_slp_tree_permute_noop_p): New.
	(vect_optimize_slp): Optimize permutes.
	(vectorizable_slp_permutation): Fix typo.

-- 

[-- Attachment #2: rb13719.patch --]
[-- Type: text/x-diff, Size: 2762 bytes --]

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index c03fc2fbecad1a2219504ac9daae75495e691775..48f615e1952707de4827f0e69e443c0a7db27d81 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -290,6 +290,27 @@ vect_slp_tree_uniform_p (slp_tree node)
   return true;
 }
 
+/* Return true when the node is a permute node and the permutation the node
+   contains is a no-op.  */
+
+static bool
+vect_slp_tree_permute_noop_p (slp_tree node)
+{
+  gcc_assert (SLP_TREE_CODE (node) == VEC_PERM_EXPR);
+
+  if (!SLP_TREE_LANE_PERMUTATION (node).exists ())
+    return true;
+
+  unsigned x, seed;
+  lane_permutation_t perms = SLP_TREE_LANE_PERMUTATION (node);
+  seed = perms[0].second;
+  for (x = 1; x < perms.length (); x++)
+    if (perms[x].first != perms[0].first || perms[x].second != ++seed)
+      return false;
+
+  return true;
+}
+
 /* Find the place of the data-ref in STMT_INFO in the interleaving chain
    that starts from FIRST_STMT_INFO.  Return -1 if the data-ref is not a part
    of the chain.  */
@@ -3150,6 +3171,41 @@ vect_optimize_slp (vec_info *vinfo)
 	    /* For loads simply drop the permutation, the load permutation
 	       already performs the desired permutation.  */
 	    ;
+	  else if (SLP_TREE_LANE_PERMUTATION (node).exists ())
+	    {
+	      /* If the node if already a permute node we just need to apply
+		 the permutation to the permute node itself.  */
+	      if (dump_enabled_p ())
+		dump_printf_loc (MSG_NOTE, vect_location,
+				 "simplifying permute node %p\n",
+				 node);
+
+	      vect_slp_permute (perms[perm], SLP_TREE_LANE_PERMUTATION (node),
+				true);
+
+	      /* If the remaining permute is a no-op then we can just drop the
+		 node instead of materializing it.  */
+	      if (vect_slp_tree_permute_noop_p (node))
+		{
+		  if (dump_enabled_p ())
+		    dump_printf_loc (MSG_NOTE, vect_location,
+				     "removing unneeded permute node %p\n",
+				     node);
+
+		   unsigned idx = SLP_TREE_LANE_PERMUTATION (node)[0].first;
+		   slp_tree value = SLP_TREE_CHILDREN (node)[idx];
+		   unsigned src = slpg->vertices[node->vertex].pred->src;
+		   slp_tree prev = vertices[src];
+		   unsigned dest;
+		   slp_tree tmp;
+		   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (prev), dest, tmp)
+		     if (tmp == node)
+		       {
+			  SLP_TREE_CHILDREN (prev)[dest] = value;
+			  break;
+		        }
+		}
+	    }
 	  else
 	    {
 	      if (dump_enabled_p ())
@@ -5361,7 +5417,7 @@ vectorizable_slp_permutation (vec_info *vinfo, gimple_stmt_iterator *gsi,
 	  if (dump_enabled_p ())
 	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 			     "permutation requires at "
-			     "least three vectors");
+			     "least three vectors\n");
 	  gcc_assert (!gsi);
 	  return false;
 	}


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2 9/18]middle-end optimize slp simplify back to back permutes.
  2020-11-03 15:07 [PATCH v2 9/18]middle-end optimize slp simplify back to back permutes Tamar Christina
@ 2020-11-04 13:00 ` Richard Biener
  2020-11-04 13:08   ` Tamar Christina
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Biener @ 2020-11-04 13:00 UTC (permalink / raw)
  To: Tamar Christina; +Cc: gcc-patches, nd, ook

On Tue, 3 Nov 2020, Tamar Christina wrote:

> Hi All,
> 
> This optimizes sequential permutes. i.e. if there are two permutes back to back
> this function applies the permute of the parent to the child and removed the
> parent.
> 
> If the resulting permute in the child is now a no-op.  Then the child is also
> dropped from the graph and the parent's parent attached to the child's child.
> 
> This relies on the materialization point calculation in optimize SLP.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> Tests are included as part of the final patch as they need the SLP pattern
> matcher to insert permutes in between.
> 
> This allows us to remove useless permutes such as
> 
> 	ldr     q0, [x0, x3]
> 	ldr     q2, [x1, x3]
> 	trn1    v1.4s, v0.4s, v0.4s
> 	trn2    v0.4s, v0.4s, v0.4s
> 	trn1    v0.4s, v1.4s, v0.4s
> 	mov     v1.16b, v3.16b
> 	fcmla   v1.4s, v0.4s, v2.4s, #0
> 	fcmla   v1.4s, v0.4s, v2.4s, #90
> 	str     q1, [x2, x3]
> 
> from the sequence the vectorizer puts out and give
> 
> 	ldr     q0, [x0, x3]
> 	ldr     q2, [x1, x3]
> 	mov     v1.16b, v3.16b
> 	fcmla   v1.4s, v0.4s, v2.4s, #0
> 	fcmla   v1.4s, v0.4s, v2.4s, #90
> 	str     q1, [x2, x3]
> 
> instead
> 
> Ok for master?

+             /* If the remaining permute is a no-op then we can just drop 
the
+                node instead of materializing it.  */
+             if (vect_slp_tree_permute_noop_p (node))
+               {
+                 if (dump_enabled_p ())
+                   dump_printf_loc (MSG_NOTE, vect_location,
+                                    "removing unneeded permute node 
%p\n",
+                                    node);
+
+                  unsigned idx = SLP_TREE_LANE_PERMUTATION 
(node)[0].first;
+                  slp_tree value = SLP_TREE_CHILDREN (node)[idx];
+                  unsigned src = slpg->vertices[node->vertex].pred->src;
+                  slp_tree prev = vertices[src];
+                  unsigned dest;
+                  slp_tree tmp;
+                  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (prev), dest, tmp)
+                    if (tmp == node)
+                      {
+                         SLP_TREE_CHILDREN (prev)[dest] = value;
+                         break;
+                       }

so I don't think this will work reliably since we do not update
the graph when inserting permute nodes and thus the "parent"
can refer to a permute rather than the original node now (we're
just walking over all vertices in no specific order during
materialization - guess using IPO might fix this apart from in
cycles).  You would also need to iterate over preds here (pred_next).
I guess removing no-op permutes is only important for costing?
They should not cause any actual code generation?

You also need to adjust reference counts when you change
SLP_TREE_CHILDREN (prev)[dest], first add to that of VALUE
and then slp_tree_free node itself (which might be tricky
at this point).

+static bool
+vect_slp_tree_permute_noop_p (slp_tree node)
+{
+  gcc_assert (SLP_TREE_CODE (node) == VEC_PERM_EXPR);
+
+  if (!SLP_TREE_LANE_PERMUTATION (node).exists ())
+    return true;
+
+  unsigned x, seed;
+  lane_permutation_t perms = SLP_TREE_LANE_PERMUTATION (node);
+  seed = perms[0].second;
+  for (x = 1; x < perms.length (); x++)
+    if (perms[x].first != perms[0].first || perms[x].second != ++seed)
+      return false;

'seed' needs to be zero to be a noop permute and
SLP_TREE_LANES (SLP_TREE_CHILDREN (node)[perms[0].first]) needs
to be the same as SLP_TREE_LANES (node).  Otherwise you'll make
permutes that select parts of a vector no-op.

Maybe simplify the patch and do the vect_slp_tree_permute_noop_p
check in vectorizable_slp_permutation instead?

The permute node adjustment part is OK, thus up to

+         else if (SLP_TREE_LANE_PERMUTATION (node).exists ())
+           {
+             /* If the node if already a permute node we just need to 
apply
+                the permutation to the permute node itself.  */
+             if (dump_enabled_p ())
+               dump_printf_loc (MSG_NOTE, vect_location,
+                                "simplifying permute node %p\n",
+                                node);
+
+             vect_slp_permute (perms[perm], SLP_TREE_LANE_PERMUTATION 
(node),
+                               true);

in case you want to split up, independently of the rest of the
patches.

Thanks,
Richard.

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 	* tree-vect-slp.c (vect_slp_tree_permute_noop_p): New.
> 	(vect_optimize_slp): Optimize permutes.
> 	(vectorizable_slp_permutation): Fix typo.
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [PATCH v2 9/18]middle-end optimize slp simplify back to back permutes.
  2020-11-04 13:00 ` Richard Biener
@ 2020-11-04 13:08   ` Tamar Christina
  2020-11-04 13:12     ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Tamar Christina @ 2020-11-04 13:08 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, nd, ook

Hi Richi,

> -----Original Message-----
> From: rguenther@c653.arch.suse.de <rguenther@c653.arch.suse.de> On
> Behalf Of Richard Biener
> Sent: Wednesday, November 4, 2020 1:00 PM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; ook@ucw.cz
> Subject: Re: [PATCH v2 9/18]middle-end optimize slp simplify back to back
> permutes.
> 
> On Tue, 3 Nov 2020, Tamar Christina wrote:
> 
> > Hi All,
> >
> > This optimizes sequential permutes. i.e. if there are two permutes
> > back to back this function applies the permute of the parent to the
> > child and removed the parent.
> >
> > If the resulting permute in the child is now a no-op.  Then the child
> > is also dropped from the graph and the parent's parent attached to the
> child's child.
> >
> > This relies on the materialization point calculation in optimize SLP.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > Tests are included as part of the final patch as they need the SLP
> > pattern matcher to insert permutes in between.
> >
> > This allows us to remove useless permutes such as
> >
> > 	ldr     q0, [x0, x3]
> > 	ldr     q2, [x1, x3]
> > 	trn1    v1.4s, v0.4s, v0.4s
> > 	trn2    v0.4s, v0.4s, v0.4s
> > 	trn1    v0.4s, v1.4s, v0.4s
> > 	mov     v1.16b, v3.16b
> > 	fcmla   v1.4s, v0.4s, v2.4s, #0
> > 	fcmla   v1.4s, v0.4s, v2.4s, #90
> > 	str     q1, [x2, x3]
> >
> > from the sequence the vectorizer puts out and give
> >
> > 	ldr     q0, [x0, x3]
> > 	ldr     q2, [x1, x3]
> > 	mov     v1.16b, v3.16b
> > 	fcmla   v1.4s, v0.4s, v2.4s, #0
> > 	fcmla   v1.4s, v0.4s, v2.4s, #90
> > 	str     q1, [x2, x3]
> >
> > instead
> >
> > Ok for master?
> 
> +             /* If the remaining permute is a no-op then we can just
> + drop
> the
> +                node instead of materializing it.  */
> +             if (vect_slp_tree_permute_noop_p (node))
> +               {
> +                 if (dump_enabled_p ())
> +                   dump_printf_loc (MSG_NOTE, vect_location,
> +                                    "removing unneeded permute node
> %p\n",
> +                                    node);
> +
> +                  unsigned idx = SLP_TREE_LANE_PERMUTATION
> (node)[0].first;
> +                  slp_tree value = SLP_TREE_CHILDREN (node)[idx];
> +                  unsigned src = slpg->vertices[node->vertex].pred->src;
> +                  slp_tree prev = vertices[src];
> +                  unsigned dest;
> +                  slp_tree tmp;
> +                  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (prev), dest, tmp)
> +                    if (tmp == node)
> +                      {
> +                         SLP_TREE_CHILDREN (prev)[dest] = value;
> +                         break;
> +                       }
> 
> so I don't think this will work reliably since we do not update the graph when
> inserting permute nodes and thus the "parent"
> can refer to a permute rather than the original node now (we're just walking
> over all vertices in no specific order during materialization - guess using IPO
> might fix this apart from in cycles).  You would also need to iterate over preds
> here (pred_next).
> I guess removing no-op permutes is only important for costing?
> They should not cause any actual code generation?

Yeah, it's just for costing, the simplification of the permute part is the one fixing
the codegen. I could just remove the lane permute (as in, clear it) and change the
costing function to not cost VEC_PERMS with no lane permutes (if it doesn't already do that).

> 
> You also need to adjust reference counts when you change
> SLP_TREE_CHILDREN (prev)[dest], first add to that of VALUE and then
> slp_tree_free node itself (which might be tricky at this point).
> 
> +static bool
> +vect_slp_tree_permute_noop_p (slp_tree node) {
> +  gcc_assert (SLP_TREE_CODE (node) == VEC_PERM_EXPR);
> +
> +  if (!SLP_TREE_LANE_PERMUTATION (node).exists ())
> +    return true;
> +
> +  unsigned x, seed;
> +  lane_permutation_t perms = SLP_TREE_LANE_PERMUTATION (node);
> seed =
> + perms[0].second;  for (x = 1; x < perms.length (); x++)
> +    if (perms[x].first != perms[0].first || perms[x].second != ++seed)
> +      return false;
> 
> 'seed' needs to be zero to be a noop permute and SLP_TREE_LANES
> (SLP_TREE_CHILDREN (node)[perms[0].first]) needs to be the same as
> SLP_TREE_LANES (node).  Otherwise you'll make permutes that select parts
> of a vector no-op.
> 
> Maybe simplify the patch and do the vect_slp_tree_permute_noop_p check
> in vectorizable_slp_permutation instead?
> 
> The permute node adjustment part is OK, thus up to
> 
> +         else if (SLP_TREE_LANE_PERMUTATION (node).exists ())
> +           {
> +             /* If the node if already a permute node we just need to
> apply
> +                the permutation to the permute node itself.  */
> +             if (dump_enabled_p ())
> +               dump_printf_loc (MSG_NOTE, vect_location,
> +                                "simplifying permute node %p\n",
> +                                node);
> +
> +             vect_slp_permute (perms[perm], SLP_TREE_LANE_PERMUTATION
> (node),
> +                               true);
> 
> in case you want to split up, independently of the rest of the patches.
> 

I could but don't think I can generate a testcase for this as the build function wouldn't
have generated back to back permutes.  So I'd still need the rest to get approved for a
testcase 😊

Thanks,
Tamar
> Thanks,
> Richard.
> 
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > 	* tree-vect-slp.c (vect_slp_tree_permute_noop_p): New.
> > 	(vect_optimize_slp): Optimize permutes.
> > 	(vectorizable_slp_permutation): Fix typo.
> >
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> Nuernberg, Germany; GF: Felix Imend

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [PATCH v2 9/18]middle-end optimize slp simplify back to back permutes.
  2020-11-04 13:08   ` Tamar Christina
@ 2020-11-04 13:12     ` Richard Biener
  0 siblings, 0 replies; 4+ messages in thread
From: Richard Biener @ 2020-11-04 13:12 UTC (permalink / raw)
  To: Tamar Christina; +Cc: gcc-patches, nd, ook

On Wed, 4 Nov 2020, Tamar Christina wrote:

> Hi Richi,
> 
> > -----Original Message-----
> > From: rguenther@c653.arch.suse.de <rguenther@c653.arch.suse.de> On
> > Behalf Of Richard Biener
> > Sent: Wednesday, November 4, 2020 1:00 PM
> > To: Tamar Christina <Tamar.Christina@arm.com>
> > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; ook@ucw.cz
> > Subject: Re: [PATCH v2 9/18]middle-end optimize slp simplify back to back
> > permutes.
> > 
> > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > 
> > > Hi All,
> > >
> > > This optimizes sequential permutes. i.e. if there are two permutes
> > > back to back this function applies the permute of the parent to the
> > > child and removed the parent.
> > >
> > > If the resulting permute in the child is now a no-op.  Then the child
> > > is also dropped from the graph and the parent's parent attached to the
> > child's child.
> > >
> > > This relies on the materialization point calculation in optimize SLP.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > Tests are included as part of the final patch as they need the SLP
> > > pattern matcher to insert permutes in between.
> > >
> > > This allows us to remove useless permutes such as
> > >
> > > 	ldr     q0, [x0, x3]
> > > 	ldr     q2, [x1, x3]
> > > 	trn1    v1.4s, v0.4s, v0.4s
> > > 	trn2    v0.4s, v0.4s, v0.4s
> > > 	trn1    v0.4s, v1.4s, v0.4s
> > > 	mov     v1.16b, v3.16b
> > > 	fcmla   v1.4s, v0.4s, v2.4s, #0
> > > 	fcmla   v1.4s, v0.4s, v2.4s, #90
> > > 	str     q1, [x2, x3]
> > >
> > > from the sequence the vectorizer puts out and give
> > >
> > > 	ldr     q0, [x0, x3]
> > > 	ldr     q2, [x1, x3]
> > > 	mov     v1.16b, v3.16b
> > > 	fcmla   v1.4s, v0.4s, v2.4s, #0
> > > 	fcmla   v1.4s, v0.4s, v2.4s, #90
> > > 	str     q1, [x2, x3]
> > >
> > > instead
> > >
> > > Ok for master?
> > 
> > +             /* If the remaining permute is a no-op then we can just
> > + drop
> > the
> > +                node instead of materializing it.  */
> > +             if (vect_slp_tree_permute_noop_p (node))
> > +               {
> > +                 if (dump_enabled_p ())
> > +                   dump_printf_loc (MSG_NOTE, vect_location,
> > +                                    "removing unneeded permute node
> > %p\n",
> > +                                    node);
> > +
> > +                  unsigned idx = SLP_TREE_LANE_PERMUTATION
> > (node)[0].first;
> > +                  slp_tree value = SLP_TREE_CHILDREN (node)[idx];
> > +                  unsigned src = slpg->vertices[node->vertex].pred->src;
> > +                  slp_tree prev = vertices[src];
> > +                  unsigned dest;
> > +                  slp_tree tmp;
> > +                  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (prev), dest, tmp)
> > +                    if (tmp == node)
> > +                      {
> > +                         SLP_TREE_CHILDREN (prev)[dest] = value;
> > +                         break;
> > +                       }
> > 
> > so I don't think this will work reliably since we do not update the graph when
> > inserting permute nodes and thus the "parent"
> > can refer to a permute rather than the original node now (we're just walking
> > over all vertices in no specific order during materialization - guess using IPO
> > might fix this apart from in cycles).  You would also need to iterate over preds
> > here (pred_next).
> > I guess removing no-op permutes is only important for costing?
> > They should not cause any actual code generation?
> 
> Yeah, it's just for costing, the simplification of the permute part is the one fixing
> the codegen. I could just remove the lane permute (as in, clear it) and change the
> costing function to not cost VEC_PERMS with no lane permutes (if it doesn't already do that).

I think clearing the lane permute is even not necessary.  The vec
perm code generation should already not cost anything here
since it is also able to elide costs when the permute aligns
naturally with vector boundaries as in { [0, 2], [0, 3], [0, 0], [0, 1] }
for two-element vectors. 

> > 
> > You also need to adjust reference counts when you change
> > SLP_TREE_CHILDREN (prev)[dest], first add to that of VALUE and then
> > slp_tree_free node itself (which might be tricky at this point).
> > 
> > +static bool
> > +vect_slp_tree_permute_noop_p (slp_tree node) {
> > +  gcc_assert (SLP_TREE_CODE (node) == VEC_PERM_EXPR);
> > +
> > +  if (!SLP_TREE_LANE_PERMUTATION (node).exists ())
> > +    return true;
> > +
> > +  unsigned x, seed;
> > +  lane_permutation_t perms = SLP_TREE_LANE_PERMUTATION (node);
> > seed =
> > + perms[0].second;  for (x = 1; x < perms.length (); x++)
> > +    if (perms[x].first != perms[0].first || perms[x].second != ++seed)
> > +      return false;
> > 
> > 'seed' needs to be zero to be a noop permute and SLP_TREE_LANES
> > (SLP_TREE_CHILDREN (node)[perms[0].first]) needs to be the same as
> > SLP_TREE_LANES (node).  Otherwise you'll make permutes that select parts
> > of a vector no-op.
> > 
> > Maybe simplify the patch and do the vect_slp_tree_permute_noop_p check
> > in vectorizable_slp_permutation instead?
> > 
> > The permute node adjustment part is OK, thus up to
> > 
> > +         else if (SLP_TREE_LANE_PERMUTATION (node).exists ())
> > +           {
> > +             /* If the node if already a permute node we just need to
> > apply
> > +                the permutation to the permute node itself.  */
> > +             if (dump_enabled_p ())
> > +               dump_printf_loc (MSG_NOTE, vect_location,
> > +                                "simplifying permute node %p\n",
> > +                                node);
> > +
> > +             vect_slp_permute (perms[perm], SLP_TREE_LANE_PERMUTATION
> > (node),
> > +                               true);
> > 
> > in case you want to split up, independently of the rest of the patches.
> > 
> 
> I could but don't think I can generate a testcase for this as the build function wouldn't
> have generated back to back permutes.  So I'd still need the rest to get approved for a
> testcase ?

You can add a testcase once you have one, the above is quite obvious
IMHO.

Richard.

> Thanks,
> Tamar
> > Thanks,
> > Richard.
> > 
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > > 	* tree-vect-slp.c (vect_slp_tree_permute_noop_p): New.
> > > 	(vect_optimize_slp): Optimize permutes.
> > > 	(vectorizable_slp_permutation): Fix typo.
> > >
> > >
> > 
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > Nuernberg, Germany; GF: Felix Imend
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-11-04 13:12 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-03 15:07 [PATCH v2 9/18]middle-end optimize slp simplify back to back permutes Tamar Christina
2020-11-04 13:00 ` Richard Biener
2020-11-04 13:08   ` Tamar Christina
2020-11-04 13:12     ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).