public inbox for gcc-patches@gcc.gnu.org
* [PATCH] tree-optimization/108314 - avoid BIT_NOT optimization for extract-last
@ 2023-01-10  9:46 Richard Biener
  2023-01-10 10:42 ` Richard Sandiford
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Biener @ 2023-01-10  9:46 UTC
  To: gcc-patches; +Cc: richard.sandiford

The extract-last reduction internal function takes its then clause as a
vector and its else clause as a scalar, so we cannot optimize away the
inversion of the condition by swapping the then/else clauses.
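
To illustrate the constraint, here is a minimal scalar model in C.  It is
only a sketch: fold_extract_last_model, invert_mask and
check_extract_last_model are hypothetical names, and the semantics shown
(return the last active element of the vector, or the scalar fallback if
no lane is active) are an assumed reading of the extract-last fold, not
code taken from GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar model of an extract-last fold: return the element
   of VEC in the last lane where MASK is true, or FALLBACK if no lane is
   active.  FALLBACK is a scalar, VEC is a vector of N lanes.  */
static int
fold_extract_last_model (int fallback, const bool *mask,
                         const int *vec, size_t n)
{
  int result = fallback;
  for (size_t i = 0; i < n; i++)
    if (mask[i])
      result = vec[i];
  return result;
}

/* Write the lane-wise negation of MASK into OUT.  */
static void
invert_mask (const bool *mask, bool *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = !mask[i];
}

/* Small self-check of the model on a four-lane example.  */
static void
check_extract_last_model (void)
{
  const bool mask[4] = { true, false, true, false };
  const int vec[4] = { 10, 20, 30, 40 };
  bool inv[4];

  invert_mask (mask, inv, 4);
  /* Last active lane of MASK is lane 2.  */
  assert (fold_extract_last_model (-1, mask, vec, 4) == 30);
  /* Last active lane of the inverted mask is lane 3.  */
  assert (fold_extract_last_model (-1, inv, vec, 4) == 40);
}
```

Because the fallback is a scalar and the data operand is a vector, there
is nothing to swap in this model; an inverted condition has to be handled
by inverting the mask itself.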

Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

	PR tree-optimization/108314
	* tree-vect-stmts.cc (vectorizable_condition): Do not
	perform BIT_NOT_EXPR optimization for EXTRACT_LAST_REDUCTION.

	* gcc.dg/vect/pr108314.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr108314.c | 16 ++++++++++++++++
 gcc/tree-vect-stmts.cc               | 13 +++++++++----
 2 files changed, 25 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr108314.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr108314.c b/gcc/testsuite/gcc.dg/vect/pr108314.c
new file mode 100644
index 00000000000..07260e06915
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr108314.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=armv9-a" { target aarch64-*-* } } */
+
+int x, y, z;
+
+void f(void)
+{
+  int t = 4;
+  for (; x; x++)
+    {
+      if (y)
+	continue;
+      t = 0;
+    }
+  z = t;
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 6ddd41fb473..eb4ca1f184e 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -10677,7 +10677,8 @@ vectorizable_condition (vec_info *vinfo,
 	      vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	      if (bitop2 == NOP_EXPR)
 		vec_compare = new_temp;
-	      else if (bitop2 == BIT_NOT_EXPR)
+	      else if (bitop2 == BIT_NOT_EXPR
+		       && reduction_type != EXTRACT_LAST_REDUCTION)
 		{
 		  /* Instead of doing ~x ? y : z do x ? z : y.  */
 		  vec_compare = new_temp;
@@ -10686,9 +10687,13 @@ vectorizable_condition (vec_info *vinfo,
 	      else
 		{
 		  vec_compare = make_ssa_name (vec_cmp_type);
-		  new_stmt
-		    = gimple_build_assign (vec_compare, bitop2,
-					   vec_cond_lhs, new_temp);
+		  if (bitop2 == BIT_NOT_EXPR)
+		    new_stmt
+		      = gimple_build_assign (vec_compare, bitop2, new_temp);
+		  else
+		    new_stmt
+		      = gimple_build_assign (vec_compare, bitop2,
+					     vec_cond_lhs, new_temp);
 		  vect_finish_stmt_generation (vinfo, stmt_info,
 					       new_stmt, gsi);
 		}
-- 
2.35.3

* Re: [PATCH] tree-optimization/108314 - avoid BIT_NOT optimization for extract-last
  2023-01-10  9:46 [PATCH] tree-optimization/108314 - avoid BIT_NOT optimization for extract-last Richard Biener
@ 2023-01-10 10:42 ` Richard Sandiford
  2023-01-10 11:05   ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Sandiford @ 2023-01-10 10:42 UTC
  To: Richard Biener; +Cc: gcc-patches

Richard Biener <rguenther@suse.de> writes:
> The extract-last reduction internal function expects the then and
> else clause as vector and scalar and thus we cannot perform optimization
> of the inversion of the condition by swapping the then/else clauses.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK?

Sorry for not having found the time to look at the PR yet.
As you say in the PR trail, it seems kind of familiar.

I think we should instead prevent taking the else branch in:

	  scalar_cond_masked_key cond (cond_expr, ncopies);
	  if (loop_vinfo->scalar_cond_masked_set.contains (cond))
	    masks = &LOOP_VINFO_MASKS (loop_vinfo);
	  else
	    {

for EXTRACT_LAST.  We've lost as soon as swap_cond_operands gets
set to true.

Thanks,
Richard

* Re: [PATCH] tree-optimization/108314 - avoid BIT_NOT optimization for extract-last
  2023-01-10 10:42 ` Richard Sandiford
@ 2023-01-10 11:05   ` Richard Biener
  2023-01-10 11:32     ` Richard Sandiford
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Biener @ 2023-01-10 11:05 UTC
  To: Richard Sandiford; +Cc: gcc-patches

On Tue, 10 Jan 2023, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:
> > The extract-last reduction internal function expects the then and
> > else clause as vector and scalar and thus we cannot perform optimization
> > of the inversion of the condition by swapping the then/else clauses.
> >
> > Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK?
> 
> Sorry for not having found the time to look at the PR yet.
> Like you say in the trail, it seems kind-of familiar.
> 
> I think we should instead prevent the else in:
> 
> 	  scalar_cond_masked_key cond (cond_expr, ncopies);
> 	  if (loop_vinfo->scalar_cond_masked_set.contains (cond))
> 	    masks = &LOOP_VINFO_MASKS (loop_vinfo);
> 	  else
> 	    {
> 
> for EXTRACT_LAST.  We've lost as soon as swap_cond_operands gets
> set to true.

But we never get there: the above is guarded with

      if (reduction_type == EXTRACT_LAST_REDUCTION) 
        masks = &LOOP_VINFO_MASKS (loop_vinfo); 
      else
        {

instead we run into

      if (masked)
        vec_compare = vec_cond_lhs;
      else
        {
          vec_cond_rhs = vec_oprnds1[i];
          if (bitop1 == NOP_EXPR)
            {
...
          else
            {
...
              else if (bitop2 == BIT_NOT_EXPR)
                {
                  /* Instead of doing ~x ? y : z do x ? z : y.  */
                  vec_compare = new_temp;
                  std::swap (vec_then_clause, vec_else_clause);

So we could instead reject vectorization for EQ_EXPR, but applying the
negation to the condition allows this to be vectorized just fine
(which is what the patch does)?
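
For an ordinary vector select the two forms are interchangeable, which is
why swapping the clauses is a valid optimization there; only the
extract-last case has to keep the explicit negation.  A minimal sketch of
that equivalence in C follows.  All function names are hypothetical and
the code models the lane-wise behaviour only; it is not GCC code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: compute ~x ? y : z by materializing the negated
   mask first, which is the form kept for the extract-last case.  */
static void
select_with_negated_mask (const bool *x, const int *y, const int *z,
                          int *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      bool not_x = !x[i];
      out[i] = not_x ? y[i] : z[i];
    }
}

/* Hypothetical model: compute the same value as x ? z : y, i.e. keep
   the mask and swap the then/else clauses.  */
static void
select_with_swapped_clauses (const bool *x, const int *y, const int *z,
                             int *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = x[i] ? z[i] : y[i];
}

/* Return true if both forms agree on every lane (N must be at most 8).  */
static bool
selects_agree (const bool *x, const int *y, const int *z, size_t n)
{
  int a[8], b[8];

  assert (n <= 8);
  select_with_negated_mask (x, y, z, a, n);
  select_with_swapped_clauses (x, y, z, b, n);
  for (size_t i = 0; i < n; i++)
    if (a[i] != b[i])
      return false;
  return true;
}
```

The swap relies on both clauses being vectors of the same shape, which is
the property the extract-last form lacks.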

Richard.

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

* Re: [PATCH] tree-optimization/108314 - avoid BIT_NOT optimization for extract-last
  2023-01-10 11:05   ` Richard Biener
@ 2023-01-10 11:32     ` Richard Sandiford
  0 siblings, 0 replies; 4+ messages in thread
From: Richard Sandiford @ 2023-01-10 11:32 UTC
  To: Richard Biener; +Cc: gcc-patches

Richard Biener <rguenther@suse.de> writes:
> On Tue, 10 Jan 2023, Richard Sandiford wrote:
>
>> Richard Biener <rguenther@suse.de> writes:
>> > The extract-last reduction internal function expects the then and
>> > else clause as vector and scalar and thus we cannot perform optimization
>> > of the inversion of the condition by swapping the then/else clauses.
>> >
>> > Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK?
>> 
>> Sorry for not having found the time to look at the PR yet.
>> Like you say in the trail, it seems kind-of familiar.
>> 
>> I think we should instead prevent the else in:
>> 
>> 	  scalar_cond_masked_key cond (cond_expr, ncopies);
>> 	  if (loop_vinfo->scalar_cond_masked_set.contains (cond))
>> 	    masks = &LOOP_VINFO_MASKS (loop_vinfo);
>> 	  else
>> 	    {
>> 
>> for EXTRACT_LAST.  We've lost as soon as swap_cond_operands gets
>> set to true.
>
> But we're not getting there - the above is guarded with
>
>       if (reduction_type == EXTRACT_LAST_REDUCTION) 
>         masks = &LOOP_VINFO_MASKS (loop_vinfo); 
>       else
>         {
>
> instead we run into
>
>       if (masked)
>         vec_compare = vec_cond_lhs;
>       else
>         {
>           vec_cond_rhs = vec_oprnds1[i];
>           if (bitop1 == NOP_EXPR)
>             {
> ...
>           else
>             {
> ...
>               else if (bitop2 == BIT_NOT_EXPR)
>                 {
>                   /* Instead of doing ~x ? y : z do x ? z : y.  */
>                   vec_compare = new_temp;
>                   std::swap (vec_then_clause, vec_else_clause);
>
> so we could instead reject vectorizing for EQ_EXPR but then
> applying the negation to the condition allows this to be
> vectorized just fine (which is what the patch does)?

Ah, OK.  I wasn't sure which of the paths we were going down to get here.

So yeah, I agree the patch is OK.  Sorry for the noise.

Richard