public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] middle-end/114070 - VEC_COND_EXPR folding
@ 2024-02-29  8:35 Richard Biener
  2024-03-03 17:02 ` Jeff Law
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Biener @ 2024-02-29  8:35 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, richard.sandiford

The following amends the PR114070 fix to optimistically allow
the folding when we cannot expand the current vec_cond using
vcond_mask and we're still before vector lowering.  This leaves
a small window between vectorization and lowering where we could
break vec_conds that can be expanded via vcond{,u,eq}, most
susceptible is the loop unrolling pass which applies VN and thus
possibly folding to the unrolled body of a vectorized loop.

This gets back the folding for targets that cannot do vectorization.
It doesn't get back the folding for x86 with AVX512 for example
since that can handle the original IL but not the folded since
it misses some vcond_mask expanders.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

As said for stage1 I want to move vector lowering before vectorization.
While I'm not entirely happy with this patch it forces us into the
correct direction, getting vcond_mask and vcmp{,u,eq} patterns
implemented.  We could use canonicalize_math_p () to close the
vectorizer -> vector lowering gap but this only works when that
pass is run (not with -Og or when disabled).  We could add a new
PROP_vectorizer_il and disable the folding if the vectorizer ran.

Or we could simply live with the regression.

Any preferences?

Thanks,
Richard.

	PR middle-end/114070
	* match.pd ((c ? a : b) op d  -->  c ? (a op d) : (b op d)):
	Allow the folding if before lowering and the current IL
	isn't supported with vcond_mask.
---
 gcc/match.pd | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index f3fffd8dec2..4edba7c84fb 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5153,7 +5153,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (op (vec_cond:s @0 @1 @2) (vec_cond:s @0 @3 @4))
   (if (TREE_CODE_CLASS (op) != tcc_comparison
        || types_match (type, TREE_TYPE (@1))
-       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
+       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
+       || (optimize_vectors_before_lowering_p ()
+	   /* The following is optimistic on the side of non-support, we are
+	      missing the legacy vcond{,u,eq} cases.  Do this only when
+	      lowering will be able to fixup..  */
+	   && !expand_vec_cond_expr_p (TREE_TYPE (@1),
+				       TREE_TYPE (@0), ERROR_MARK)))
    (vec_cond @0 (op! @1 @3) (op! @2 @4))))
 
 /* (c ? a : b) op d  -->  c ? (a op d) : (b op d) */
@@ -5161,13 +5167,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (op (vec_cond:s @0 @1 @2) @3)
   (if (TREE_CODE_CLASS (op) != tcc_comparison
        || types_match (type, TREE_TYPE (@1))
-       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
+       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
+       || (optimize_vectors_before_lowering_p ()
+	   && !expand_vec_cond_expr_p (TREE_TYPE (@1),
+				       TREE_TYPE (@0), ERROR_MARK)))
    (vec_cond @0 (op! @1 @3) (op! @2 @3))))
  (simplify
   (op @3 (vec_cond:s @0 @1 @2))
   (if (TREE_CODE_CLASS (op) != tcc_comparison
        || types_match (type, TREE_TYPE (@1))
-       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
+       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
+       || (optimize_vectors_before_lowering_p ()
+	   && !expand_vec_cond_expr_p (TREE_TYPE (@1),
+				       TREE_TYPE (@0), ERROR_MARK)))
    (vec_cond @0 (op! @3 @1) (op! @3 @2)))))
 
 #if GIMPLE
-- 
2.35.3

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] middle-end/114070 - VEC_COND_EXPR folding
  2024-02-29  8:35 [PATCH] middle-end/114070 - VEC_COND_EXPR folding Richard Biener
@ 2024-03-03 17:02 ` Jeff Law
  0 siblings, 0 replies; 5+ messages in thread
From: Jeff Law @ 2024-03-03 17:02 UTC (permalink / raw)
  To: Richard Biener, gcc-patches; +Cc: Jakub Jelinek, richard.sandiford



On 2/29/24 01:35, Richard Biener wrote:
> The following amends the PR114070 fix to optimistically allow
> the folding when we cannot expand the current vec_cond using
> vcond_mask and we're still before vector lowering.  This leaves
> a small window between vectorization and lowering where we could
> break vec_conds that can be expanded via vcond{,u,eq}, most
> susceptible is the loop unrolling pass which applies VN and thus
> possibly folding to the unrolled body of a vectorized loop.
> 
> This gets back the folding for targets that cannot do vectorization.
> It doesn't get back the folding for x86 with AVX512 for example
> since that can handle the original IL but not the folded since
> it misses some vcond_mask expanders.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> As said for stage1 I want to move vector lowering before vectorization.
> While I'm not entirely happy with this patch it forces us into the
> correct direction, getting vcond_mask and vcmp{,u,eq} patterns
> implemented.  We could use canonicalize_math_p () to close the
> vectorizer -> vector lowering gap but this only works when that
> pass is run (not with -Og or when disabled).  We could add a new
> PROP_vectorizer_il and disable the folding if the vectorizer ran.
> 
> Or we could simply live with the regression.
> 
> Any preferences?
Not really.  As I think I said, I consider the regression insignificant 
an I could certainly live with it.

jeff


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] middle-end/114070 - VEC_COND_EXPR folding
  2024-02-29 10:44 ` Jakub Jelinek
@ 2024-03-01 12:11   ` Richard Biener
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Biener @ 2024-03-01 12:11 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, richard.sandiford

On Thu, 29 Feb 2024, Jakub Jelinek wrote:

> On Thu, Feb 29, 2024 at 11:16:54AM +0100, Richard Biener wrote:
> > That said, the quick experiment shows this isn't anything for stage4.
> 
> The earlier the vector lowering is moved in the pass list, the higher
> are the possibilities that match.pd or some other optimization reintroduces
> unsupportable vector operations into the IL.
> 
> Guess your patch looks reasonable.

Pushed.

Thanks,
Richard.

> > > 	PR middle-end/114070
> > > 	* match.pd ((c ? a : b) op d  -->  c ? (a op d) : (b op d)):
> > > 	Allow the folding if before lowering and the current IL
> > > 	isn't supported with vcond_mask.
> > > ---
> > >  gcc/match.pd | 18 +++++++++++++++---
> > >  1 file changed, 15 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > index f3fffd8dec2..4edba7c84fb 100644
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -5153,7 +5153,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >    (op (vec_cond:s @0 @1 @2) (vec_cond:s @0 @3 @4))
> > >    (if (TREE_CODE_CLASS (op) != tcc_comparison
> > >         || types_match (type, TREE_TYPE (@1))
> > > -       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
> > > +       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
> > > +       || (optimize_vectors_before_lowering_p ()
> > > +	   /* The following is optimistic on the side of non-support, we are
> > > +	      missing the legacy vcond{,u,eq} cases.  Do this only when
> > > +	      lowering will be able to fixup..  */
> > > +	   && !expand_vec_cond_expr_p (TREE_TYPE (@1),
> > > +				       TREE_TYPE (@0), ERROR_MARK)))
> > >     (vec_cond @0 (op! @1 @3) (op! @2 @4))))
> > >  
> > >  /* (c ? a : b) op d  -->  c ? (a op d) : (b op d) */
> > > @@ -5161,13 +5167,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >    (op (vec_cond:s @0 @1 @2) @3)
> > >    (if (TREE_CODE_CLASS (op) != tcc_comparison
> > >         || types_match (type, TREE_TYPE (@1))
> > > -       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
> > > +       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
> > > +       || (optimize_vectors_before_lowering_p ()
> > > +	   && !expand_vec_cond_expr_p (TREE_TYPE (@1),
> > > +				       TREE_TYPE (@0), ERROR_MARK)))
> > >     (vec_cond @0 (op! @1 @3) (op! @2 @3))))
> > >   (simplify
> > >    (op @3 (vec_cond:s @0 @1 @2))
> > >    (if (TREE_CODE_CLASS (op) != tcc_comparison
> > >         || types_match (type, TREE_TYPE (@1))
> > > -       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
> > > +       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
> > > +       || (optimize_vectors_before_lowering_p ()
> > > +	   && !expand_vec_cond_expr_p (TREE_TYPE (@1),
> > > +				       TREE_TYPE (@0), ERROR_MARK)))
> > >     (vec_cond @0 (op! @3 @1) (op! @3 @2)))))
> > >  
> > >  #if GIMPLE
> > > 
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] middle-end/114070 - VEC_COND_EXPR folding
  2024-02-29 10:16 Richard Biener
@ 2024-02-29 10:44 ` Jakub Jelinek
  2024-03-01 12:11   ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Jelinek @ 2024-02-29 10:44 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, richard.sandiford

On Thu, Feb 29, 2024 at 11:16:54AM +0100, Richard Biener wrote:
> That said, the quick experiment shows this isn't anything for stage4.

The earlier the vector lowering is moved in the pass list, the higher
are the possibilities that match.pd or some other optimization reintroduces
unsupportable vector operations into the IL.

Guess your patch looks reasonable.

> > 	PR middle-end/114070
> > 	* match.pd ((c ? a : b) op d  -->  c ? (a op d) : (b op d)):
> > 	Allow the folding if before lowering and the current IL
> > 	isn't supported with vcond_mask.
> > ---
> >  gcc/match.pd | 18 +++++++++++++++---
> >  1 file changed, 15 insertions(+), 3 deletions(-)
> > 
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index f3fffd8dec2..4edba7c84fb 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -5153,7 +5153,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >    (op (vec_cond:s @0 @1 @2) (vec_cond:s @0 @3 @4))
> >    (if (TREE_CODE_CLASS (op) != tcc_comparison
> >         || types_match (type, TREE_TYPE (@1))
> > -       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
> > +       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
> > +       || (optimize_vectors_before_lowering_p ()
> > +	   /* The following is optimistic on the side of non-support, we are
> > +	      missing the legacy vcond{,u,eq} cases.  Do this only when
> > +	      lowering will be able to fixup..  */
> > +	   && !expand_vec_cond_expr_p (TREE_TYPE (@1),
> > +				       TREE_TYPE (@0), ERROR_MARK)))
> >     (vec_cond @0 (op! @1 @3) (op! @2 @4))))
> >  
> >  /* (c ? a : b) op d  -->  c ? (a op d) : (b op d) */
> > @@ -5161,13 +5167,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >    (op (vec_cond:s @0 @1 @2) @3)
> >    (if (TREE_CODE_CLASS (op) != tcc_comparison
> >         || types_match (type, TREE_TYPE (@1))
> > -       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
> > +       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
> > +       || (optimize_vectors_before_lowering_p ()
> > +	   && !expand_vec_cond_expr_p (TREE_TYPE (@1),
> > +				       TREE_TYPE (@0), ERROR_MARK)))
> >     (vec_cond @0 (op! @1 @3) (op! @2 @3))))
> >   (simplify
> >    (op @3 (vec_cond:s @0 @1 @2))
> >    (if (TREE_CODE_CLASS (op) != tcc_comparison
> >         || types_match (type, TREE_TYPE (@1))
> > -       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
> > +       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
> > +       || (optimize_vectors_before_lowering_p ()
> > +	   && !expand_vec_cond_expr_p (TREE_TYPE (@1),
> > +				       TREE_TYPE (@0), ERROR_MARK)))
> >     (vec_cond @0 (op! @3 @1) (op! @3 @2)))))
> >  
> >  #if GIMPLE
> > 

	Jakub


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] middle-end/114070 - VEC_COND_EXPR folding
@ 2024-02-29 10:16 Richard Biener
  2024-02-29 10:44 ` Jakub Jelinek
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Biener @ 2024-02-29 10:16 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, richard.sandiford

On Thu, 29 Feb 2024, Richard Biener wrote:

> The following amends the PR114070 fix to optimistically allow
> the folding when we cannot expand the current vec_cond using
> vcond_mask and we're still before vector lowering.  This leaves
> a small window between vectorization and lowering where we could
> break vec_conds that can be expanded via vcond{,u,eq}, most
> susceptible is the loop unrolling pass which applies VN and thus
> possibly folding to the unrolled body of a vectorized loop.
> 
> This gets back the folding for targets that cannot do vectorization.
> It doesn't get back the folding for x86 with AVX512 for example
> since that can handle the original IL but not the folded since
> it misses some vcond_mask expanders.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> As said for stage1 I want to move vector lowering before vectorization.
> While I'm not entirely happy with this patch it forces us into the
> correct direction, getting vcond_mask and vcmp{,u,eq} patterns
> implemented.  We could use canonicalize_math_p () to close the
> vectorizer -> vector lowering gap but this only works when that
> pass is run (not with -Og or when disabled).  We could add a new
> PROP_vectorizer_il and disable the folding if the vectorizer ran.
> 
> Or we could simply live with the regression.
> 
> Any preferences?

I've tried moving vector lowering, first try to after the first
forwprop after IPA.  That exposes (at least) invariant motion
creating unsupported COND_EXPRs - we hoist a vector PHI as
_2 ? _3 : _6 and that might lead to unsupported BLKmode moves.

I think there's some latent issues to be fixed in passes.

A more conservative move is to duplicate vector lowering into
the loop/non-loop sections and put it right before vectorization
(but there's invariant motion after it, so the above issue will
prevail).  Since the vectorizer currently cannot handle existing
vector code "re-vectorization" is best done on lowered code (after
that got some cleanup).  Also SLP can interface with existing
vectors, but currently only non-BLKmode ones, so that would benefit
as well.

That said, the quick experiment shows this isn't anything for stage4.

Richard.

> Thanks,
> Richard.
> 
> 	PR middle-end/114070
> 	* match.pd ((c ? a : b) op d  -->  c ? (a op d) : (b op d)):
> 	Allow the folding if before lowering and the current IL
> 	isn't supported with vcond_mask.
> ---
>  gcc/match.pd | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index f3fffd8dec2..4edba7c84fb 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -5153,7 +5153,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>    (op (vec_cond:s @0 @1 @2) (vec_cond:s @0 @3 @4))
>    (if (TREE_CODE_CLASS (op) != tcc_comparison
>         || types_match (type, TREE_TYPE (@1))
> -       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
> +       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
> +       || (optimize_vectors_before_lowering_p ()
> +	   /* The following is optimistic on the side of non-support, we are
> +	      missing the legacy vcond{,u,eq} cases.  Do this only when
> +	      lowering will be able to fixup..  */
> +	   && !expand_vec_cond_expr_p (TREE_TYPE (@1),
> +				       TREE_TYPE (@0), ERROR_MARK)))
>     (vec_cond @0 (op! @1 @3) (op! @2 @4))))
>  
>  /* (c ? a : b) op d  -->  c ? (a op d) : (b op d) */
> @@ -5161,13 +5167,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>    (op (vec_cond:s @0 @1 @2) @3)
>    (if (TREE_CODE_CLASS (op) != tcc_comparison
>         || types_match (type, TREE_TYPE (@1))
> -       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
> +       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
> +       || (optimize_vectors_before_lowering_p ()
> +	   && !expand_vec_cond_expr_p (TREE_TYPE (@1),
> +				       TREE_TYPE (@0), ERROR_MARK)))
>     (vec_cond @0 (op! @1 @3) (op! @2 @3))))
>   (simplify
>    (op @3 (vec_cond:s @0 @1 @2))
>    (if (TREE_CODE_CLASS (op) != tcc_comparison
>         || types_match (type, TREE_TYPE (@1))
> -       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
> +       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
> +       || (optimize_vectors_before_lowering_p ()
> +	   && !expand_vec_cond_expr_p (TREE_TYPE (@1),
> +				       TREE_TYPE (@0), ERROR_MARK)))
>     (vec_cond @0 (op! @3 @1) (op! @3 @2)))))
>  
>  #if GIMPLE
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-03-03 17:02 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-29  8:35 [PATCH] middle-end/114070 - VEC_COND_EXPR folding Richard Biener
2024-03-03 17:02 ` Jeff Law
2024-02-29 10:16 Richard Biener
2024-02-29 10:44 ` Jakub Jelinek
2024-03-01 12:11   ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).