public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64
@ 2024-02-26 21:13 haochen.jiang
  2024-02-27  7:43 ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: haochen.jiang @ 2024-02-26 21:13 UTC (permalink / raw)
  To: rguenther, gcc-regression, gcc-patches, haochen.jiang

On Linux/x86_64,

af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
commit af66ad89e8169f44db723813662917cf4cbb78fc
Author: Richard Biener <rguenther@suse.de>
Date:   Fri Feb 23 16:06:05 2024 +0100

    middle-end/114070 - folding breaking VEC_COND expansion

caused

FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr"

with GCC configured with

../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-9173/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/andnot-2.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/andnot-2.c --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me at haochen dot jiang at intel.com.)
(If you met problems with cascadelake related, disabling AVX512F in command line might save that.)
(However, please make sure that there is no potential problems with AVX512.)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64
  2024-02-26 21:13 [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64 haochen.jiang
@ 2024-02-27  7:43 ` Richard Biener
  2024-02-27  8:01   ` Hongtao Liu
                     ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Richard Biener @ 2024-02-27  7:43 UTC (permalink / raw)
  To: haochen.jiang; +Cc: gcc-patches, richard.sandiford, haochen.jiang, hongtao.liu

On Tue, 27 Feb 2024, haochen.jiang wrote:

> On Linux/x86_64,
> 
> af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
> commit af66ad89e8169f44db723813662917cf4cbb78fc
> Author: Richard Biener <rguenther@suse.de>
> Date:   Fri Feb 23 16:06:05 2024 +0100
> 
>     middle-end/114070 - folding breaking VEC_COND expansion
> 
> caused
> 
> FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr"

This shows that the x86 backend is missing vcond_mask_qiqi and friends
(for AVX512 mask modes).  Either that or both expand_vec_cond_expr_p
and all the machinery behind it (ISEL pass, lowering) should handle
pure integer mode VEC_COND_EXPR via bit operations.  I think quite some
targets now implement patterns for these variants, whatever their
boolean vector modes are.

One complication with the change, which was

  (simplify
   (op @3 (vec_cond:s @0 @1 @2))
-  (vec_cond @0 (op! @3 @1) (op! @3 @2))))
+  (if (TREE_CODE_CLASS (op) != tcc_comparison
+       || types_match (type, TREE_TYPE (@1))
+       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
+   (vec_cond @0 (op! @3 @1) (op! @3 @2)))))

is that expand_vec_cond_expr_p can also handle comparison defined
masks, but whether or not we have this isn't visible here so we
can only check whether vcond_mask expansion would work.

We have optimize_vectors_before_lowering_p but we shouldn't even there
turn supported into not supported ops and as said, what's supported or
not cannot be finally decided (if it's only vcond and not vcond_mask
that is supported).  Also optimize_vectors_before_lowering_p is set
for a short time between vectorization and vector lowering and we
definitely do not want to turn supported vectorizer emitted stmts
into ones that we need to lower.  For GCC 15 we should see to move
vector lowering before vectorization (before loop optimization I'd
say) to close this particula hole (and also reliably ICE when the
vectorizer creates unsupported IL).  We also definitely want to
retire vcond expanders (no target I know of supports single-instruction
compare-and-select).

So short term we either live with this regression (the testcase
verifies we perform constant folding to { 0, 0 }), implement
the four missing patterns (qi, hi, si and di missing value mode
vcond_mask patterns) or see to implement generic code for this.

Given precedent I'd tend towards adding the x86 patterns.

Hongtao, can you handle that?

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64
  2024-02-27  7:43 ` Richard Biener
@ 2024-02-27  8:01   ` Hongtao Liu
  2024-02-27 13:20   ` Jeff Law
  2024-03-07 21:03   ` Richard Sandiford
  2 siblings, 0 replies; 12+ messages in thread
From: Hongtao Liu @ 2024-02-27  8:01 UTC (permalink / raw)
  To: Richard Biener
  Cc: haochen.jiang, gcc-patches, richard.sandiford, haochen.jiang,
	hongtao.liu

On Tue, Feb 27, 2024 at 3:44 PM Richard Biener <rguenther@suse.de> wrote:
>
> On Tue, 27 Feb 2024, haochen.jiang wrote:
>
> > On Linux/x86_64,
> >
> > af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
> > commit af66ad89e8169f44db723813662917cf4cbb78fc
> > Author: Richard Biener <rguenther@suse.de>
> > Date:   Fri Feb 23 16:06:05 2024 +0100
> >
> >     middle-end/114070 - folding breaking VEC_COND expansion
> >
> > caused
> >
> > FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr"
>
> This shows that the x86 backend is missing vcond_mask_qiqi and friends
Interesting, so both operand and mask are vector boolean.
> (for AVX512 mask modes).  Either that or both expand_vec_cond_expr_p
> and all the machinery behind it (ISEL pass, lowering) should handle
> pure integer mode VEC_COND_EXPR via bit operations.  I think quite some
> targets now implement patterns for these variants, whatever their
> boolean vector modes are.
>
> One complication with the change, which was
>
>   (simplify
>    (op @3 (vec_cond:s @0 @1 @2))
> -  (vec_cond @0 (op! @3 @1) (op! @3 @2))))
> +  (if (TREE_CODE_CLASS (op) != tcc_comparison
> +       || types_match (type, TREE_TYPE (@1))
> +       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
> +   (vec_cond @0 (op! @3 @1) (op! @3 @2)))))
>
> is that expand_vec_cond_expr_p can also handle comparison defined
> masks, but whether or not we have this isn't visible here so we
> can only check whether vcond_mask expansion would work.
>
> We have optimize_vectors_before_lowering_p but we shouldn't even there
> turn supported into not supported ops and as said, what's supported or
> not cannot be finally decided (if it's only vcond and not vcond_mask
> that is supported).  Also optimize_vectors_before_lowering_p is set
> for a short time between vectorization and vector lowering and we
> definitely do not want to turn supported vectorizer emitted stmts
> into ones that we need to lower.  For GCC 15 we should see to move
> vector lowering before vectorization (before loop optimization I'd
> say) to close this particula hole (and also reliably ICE when the
> vectorizer creates unsupported IL).  We also definitely want to
> retire vcond expanders (no target I know of supports single-instruction
> compare-and-select).
>
> So short term we either live with this regression (the testcase
> verifies we perform constant folding to { 0, 0 }), implement
> the four missing patterns (qi, hi, si and di missing value mode
> vcond_mask patterns) or see to implement generic code for this.
>
> Given precedent I'd tend towards adding the x86 patterns.
>
> Hongtao, can you handle that?
Sure, I'll take a look.
>
> Thanks,
> Richard.



-- 
BR,
Hongtao

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64
  2024-02-27  7:43 ` Richard Biener
  2024-02-27  8:01   ` Hongtao Liu
@ 2024-02-27 13:20   ` Jeff Law
  2024-02-27 13:53     ` Richard Biener
  2024-03-07 21:03   ` Richard Sandiford
  2 siblings, 1 reply; 12+ messages in thread
From: Jeff Law @ 2024-02-27 13:20 UTC (permalink / raw)
  To: Richard Biener, haochen.jiang
  Cc: gcc-patches, richard.sandiford, haochen.jiang, hongtao.liu



On 2/27/24 00:43, Richard Biener wrote:
> On Tue, 27 Feb 2024, haochen.jiang wrote:
> 
>> On Linux/x86_64,
>>
>> af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
>> commit af66ad89e8169f44db723813662917cf4cbb78fc
>> Author: Richard Biener <rguenther@suse.de>
>> Date:   Fri Feb 23 16:06:05 2024 +0100
>>
>>      middle-end/114070 - folding breaking VEC_COND expansion
>>
>> caused
>>
>> FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr"
> 
> This shows that the x86 backend is missing vcond_mask_qiqi and friends
> (for AVX512 mask modes).  Either that or both expand_vec_cond_expr_p
> and all the machinery behind it (ISEL pass, lowering) should handle
> pure integer mode VEC_COND_EXPR via bit operations.  I think quite some
> targets now implement patterns for these variants, whatever their
> boolean vector modes are.
There may be more going on than just that.  The andnot-2 test started 
regressing on most targets overnight, including on targets without 
vector capabilities.

fr30-elf for example:


> Tests that now fail, but worked before (2 tests):
> 
> fr30-sim: gcc: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr"
> fr30-sim: gcc: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr"


Jeff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64
  2024-02-27 13:20   ` Jeff Law
@ 2024-02-27 13:53     ` Richard Biener
  2024-02-27 14:11       ` Jeff Law
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2024-02-27 13:53 UTC (permalink / raw)
  To: Jeff Law
  Cc: haochen.jiang, gcc-patches, richard.sandiford, haochen.jiang,
	hongtao.liu

On Tue, 27 Feb 2024, Jeff Law wrote:

> 
> 
> On 2/27/24 00:43, Richard Biener wrote:
> > On Tue, 27 Feb 2024, haochen.jiang wrote:
> > 
> >> On Linux/x86_64,
> >>
> >> af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
> >> commit af66ad89e8169f44db723813662917cf4cbb78fc
> >> Author: Richard Biener <rguenther@suse.de>
> >> Date:   Fri Feb 23 16:06:05 2024 +0100
> >>
> >>      middle-end/114070 - folding breaking VEC_COND expansion
> >>
> >> caused
> >>
> >> FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr"
> > 
> > This shows that the x86 backend is missing vcond_mask_qiqi and friends
> > (for AVX512 mask modes).  Either that or both expand_vec_cond_expr_p
> > and all the machinery behind it (ISEL pass, lowering) should handle
> > pure integer mode VEC_COND_EXPR via bit operations.  I think quite some
> > targets now implement patterns for these variants, whatever their
> > boolean vector modes are.
> There may be more going on than just that.  The andnot-2 test started
> regressing on most targets overnight, including on targets without vector
> capabilities.

Yes, we fail this generic vector simplification test now (not sure
why the test didn't test forwprop1).  As said the problem is that
we can't test whether the existing VEC_COND_EXPR is handled
(we just can test if it's handled by vcond_mask).

As can be seen from x86 we don't want to turn the existing working
vcond into unsupported vcond_mask as that will fail to expand
(that's what the patch fixed for SPARC) or similarly bad, would
be lowered during vector lowering into inefficient scalar code.

In the end this is fallout from splitting out the condition from
VEC_COND_EXPR but not getting rid of all vcond{,u,eq} expanders,
rewriting target support to vcmp{,u,eq} + vcond_mask ... (those partial
transitions...).

Grepping shows there's unfortunately plenty of targets with
vcond{,u,eq} patterns, maybe most of them have vcmp{,u,eq}
patterns as well but only cutting those off hard by simply
removing the expansion path will tell who's affected.

One could try to fix this by adding a 2nd set of patterns where
the defining conditionals are visible, so we can check for
vcond[u] expansion support (and compare before/after state),
and allow the patterns w/o visible compares only when vcond_mask
is available.

Richard.

> fr30-elf for example:
> 
> 
> > Tests that now fail, but worked before (2 tests):
> > 
> > fr30-sim: gcc: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3
> > "_expr"
> > fr30-sim: gcc: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3
> > "_expr"
> 
> 
> Jeff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64
  2024-02-27 13:53     ` Richard Biener
@ 2024-02-27 14:11       ` Jeff Law
  2024-02-27 14:31         ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff Law @ 2024-02-27 14:11 UTC (permalink / raw)
  To: Richard Biener
  Cc: haochen.jiang, gcc-patches, richard.sandiford, haochen.jiang,
	hongtao.liu



On 2/27/24 06:53, Richard Biener wrote:
> On Tue, 27 Feb 2024, Jeff Law wrote:
> 
>>
>>
>> On 2/27/24 00:43, Richard Biener wrote:
>>> On Tue, 27 Feb 2024, haochen.jiang wrote:
>>>
>>>> On Linux/x86_64,
>>>>
>>>> af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
>>>> commit af66ad89e8169f44db723813662917cf4cbb78fc
>>>> Author: Richard Biener <rguenther@suse.de>
>>>> Date:   Fri Feb 23 16:06:05 2024 +0100
>>>>
>>>>       middle-end/114070 - folding breaking VEC_COND expansion
>>>>
>>>> caused
>>>>
>>>> FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr"
>>>
>>> This shows that the x86 backend is missing vcond_mask_qiqi and friends
>>> (for AVX512 mask modes).  Either that or both expand_vec_cond_expr_p
>>> and all the machinery behind it (ISEL pass, lowering) should handle
>>> pure integer mode VEC_COND_EXPR via bit operations.  I think quite some
>>> targets now implement patterns for these variants, whatever their
>>> boolean vector modes are.
>> There may be more going on than just that.  The andnot-2 test started
>> regressing on most targets overnight, including on targets without vector
>> capabilities.
> 
> Yes, we fail this generic vector simplification test now (not sure
> why the test didn't test forwprop1).  As said the problem is that
> we can't test whether the existing VEC_COND_EXPR is handled
> (we just can test if it's handled by vcond_mask).
ACK.  I'll force those targets to regenerate their baselines.

I hadn't read things too closely and just wanted to raise the issue that 
this impacts additional targets w/o vector support.  It sounds like it's 
largely expected fallout.

Jeff



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64
  2024-02-27 14:11       ` Jeff Law
@ 2024-02-27 14:31         ` Richard Biener
  2024-02-28 10:05           ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2024-02-27 14:31 UTC (permalink / raw)
  To: Jeff Law
  Cc: haochen.jiang, gcc-patches, richard.sandiford, haochen.jiang,
	hongtao.liu

On Tue, 27 Feb 2024, Jeff Law wrote:

> 
> 
> On 2/27/24 06:53, Richard Biener wrote:
> > On Tue, 27 Feb 2024, Jeff Law wrote:
> > 
> >>
> >>
> >> On 2/27/24 00:43, Richard Biener wrote:
> >>> On Tue, 27 Feb 2024, haochen.jiang wrote:
> >>>
> >>>> On Linux/x86_64,
> >>>>
> >>>> af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
> >>>> commit af66ad89e8169f44db723813662917cf4cbb78fc
> >>>> Author: Richard Biener <rguenther@suse.de>
> >>>> Date:   Fri Feb 23 16:06:05 2024 +0100
> >>>>
> >>>>       middle-end/114070 - folding breaking VEC_COND expansion
> >>>>
> >>>> caused
> >>>>
> >>>> FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr"
> >>>
> >>> This shows that the x86 backend is missing vcond_mask_qiqi and friends
> >>> (for AVX512 mask modes).  Either that or both expand_vec_cond_expr_p
> >>> and all the machinery behind it (ISEL pass, lowering) should handle
> >>> pure integer mode VEC_COND_EXPR via bit operations.  I think quite some
> >>> targets now implement patterns for these variants, whatever their
> >>> boolean vector modes are.
> >> There may be more going on than just that.  The andnot-2 test started
> >> regressing on most targets overnight, including on targets without vector
> >> capabilities.
> > 
> > Yes, we fail this generic vector simplification test now (not sure
> > why the test didn't test forwprop1).  As said the problem is that
> > we can't test whether the existing VEC_COND_EXPR is handled
> > (we just can test if it's handled by vcond_mask).
> ACK.  I'll force those targets to regenerate their baselines.
> 
> I hadn't read things too closely and just wanted to raise the issue that this
> impacts additional targets w/o vector support.  It sounds like it's largely
> expected fallout.

No, I didn't expect it (well, kind-of but my own testing came up 
clean...).  I'm going to try the extra patterns.

Richard.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64
  2024-02-27 14:31         ` Richard Biener
@ 2024-02-28 10:05           ` Richard Biener
  2024-02-28 15:04             ` Jeff Law
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2024-02-28 10:05 UTC (permalink / raw)
  To: Jeff Law
  Cc: haochen.jiang, gcc-patches, richard.sandiford, haochen.jiang,
	hongtao.liu

On Tue, 27 Feb 2024, Richard Biener wrote:

> On Tue, 27 Feb 2024, Jeff Law wrote:
> 
> > 
> > 
> > On 2/27/24 06:53, Richard Biener wrote:
> > > On Tue, 27 Feb 2024, Jeff Law wrote:
> > > 
> > >>
> > >>
> > >> On 2/27/24 00:43, Richard Biener wrote:
> > >>> On Tue, 27 Feb 2024, haochen.jiang wrote:
> > >>>
> > >>>> On Linux/x86_64,
> > >>>>
> > >>>> af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
> > >>>> commit af66ad89e8169f44db723813662917cf4cbb78fc
> > >>>> Author: Richard Biener <rguenther@suse.de>
> > >>>> Date:   Fri Feb 23 16:06:05 2024 +0100
> > >>>>
> > >>>>       middle-end/114070 - folding breaking VEC_COND expansion
> > >>>>
> > >>>> caused
> > >>>>
> > >>>> FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr"
> > >>>
> > >>> This shows that the x86 backend is missing vcond_mask_qiqi and friends
> > >>> (for AVX512 mask modes).  Either that or both expand_vec_cond_expr_p
> > >>> and all the machinery behind it (ISEL pass, lowering) should handle
> > >>> pure integer mode VEC_COND_EXPR via bit operations.  I think quite some
> > >>> targets now implement patterns for these variants, whatever their
> > >>> boolean vector modes are.
> > >> There may be more going on than just that.  The andnot-2 test started
> > >> regressing on most targets overnight, including on targets without vector
> > >> capabilities.
> > > 
> > > Yes, we fail this generic vector simplification test now (not sure
> > > why the test didn't test forwprop1).  As said the problem is that
> > > we can't test whether the existing VEC_COND_EXPR is handled
> > > (we just can test if it's handled by vcond_mask).
> > ACK.  I'll force those targets to regenerate their baselines.
> > 
> > I hadn't read things too closely and just wanted to raise the issue that this
> > impacts additional targets w/o vector support.  It sounds like it's largely
> > expected fallout.
> 
> No, I didn't expect it (well, kind-of but my own testing came up 
> clean...).  I'm going to try the extra patterns.

So the following fixes it for AVR for example but it doesn't fix it
with -march=znver4 since we can expand the original IL but not the
one we attempt to fold to.  The testcase at hand would be also
fixed if we force simplification of the outer vec_cond.  But we
currently can't use '!' at the outermost operation, or at least
it wouldn't do the expected thing.

The testcase which is

typedef long vec __attribute__((vector_size(16)));
vec f(vec x){
  vec y = x < 10;
  return y & (y == 0);
}

really expects y & (y == 0) to simplify, the x < 10 is not really
important besides making 'y' a boolean.

typedef long vec __attribute__((vector_size(16)));
vec f(vec y){
  return (y != 0) & (y == 0);
}

is the gist of what we're simplifying and we're doing this by
a quite complicated dance.

The patch below comes at the cost of a bit of pattern explosion and
it still shows that some targets need to be adjusted to get the
constant folding that's desired here.  Unfortunately there's no
good way to express logical_inverted_value to make

/* X & !X -> 0.  */
(simplify 
 (bit_and:c @0 (logical_inverted_value @0))
 { build_zero_cst (type); })

catch it.  The (match ..) syntax isn't powerful enough to
return an alternate tree to match like maybe a syntax like
the following could express:

(match (logical_inverted_value (vec_cond@@0 (ne @1 @2) @3 @4))
 (vec_cond (eq @1 @2) @3 @4))

which I think can be only made "working" when actually "inlining"
the (match ..) stuff into the (simplify ..) uses, which was my
original plan but which of course would act as another source of pattern
explosion.

Untested fix for targets that cannot handle the original IL below.
I'm not convinced that's the way to go here, is it?  Or scrap
the testcase?  Or have a cheap way to say "this target doesn't
support _any_ vec_cond"?

Another way, but for stage1 I think, would be to delay all the
vector checking to the "final" maybe_push_res_to_seq, but that
requires aggressively cleaning out intermediate folding results
no longer needed and also still requires patterns to do the
"is the incoming IL supported on the target" as later we do not
know what parts we matched (this could in theory be automated
as well, of course).  There's also no way to say "don't apply
this unless the final simplification result is a constant".
While we might be able to add a new '*' modifier saying
"if the result is supported by the target" there's again no
way of conditionalizing this on the original operation being
supported.  OK, maybe on the match allow '*1' and when '*1'
is used in the result we fail if any of the '*1' annotated
parts in the match were supported but at least one of the '*1'
in the result is not.  Supported only for select operations.
But it still doesn't help as we'd have to somehow check the
re-simplified result and we can't really track where a
vec_cond we put in goes.

Thanks,
Richard.

diff --git a/gcc/match.pd b/gcc/match.pd
index f3fffd8dec2..4c4b9484a7a 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5168,7 +5168,30 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (TREE_CODE_CLASS (op) != tcc_comparison
        || types_match (type, TREE_TYPE (@1))
        || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
-   (vec_cond @0 (op! @3 @1) (op! @3 @2)))))
+   (vec_cond @0 (op! @3 @1) (op! @3 @2))))
+
+ (for op2 (tcc_comparison)
+  (simplify
+   (op (vec_cond:s (op2@0 @00 @01) @1 @2) (vec_cond:s @0 @3 @4))
+   (if (TREE_CODE_CLASS (op) != tcc_comparison
+        || types_match (type, TREE_TYPE (@1))
+        || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
+       || !expand_vec_cond_expr_p (TREE_TYPE (@1), TREE_TYPE (@00), op2))
+    (vec_cond @0 (op! @1 @3) (op! @2 @4))))
+  (simplify
+   (op (vec_cond:s (op2@0 @00 @01) @1 @2) @3)
+   (if (TREE_CODE_CLASS (op) != tcc_comparison
+        || types_match (type, TREE_TYPE (@1))
+        || expand_vec_cond_expr_p (type, TREE_TYPE (@00), op2)
+       || !expand_vec_cond_expr_p (TREE_TYPE (@1), TREE_TYPE (@00), op2))
+    (vec_cond @0 (op! @1 @3) (op! @2 @3))))
+  (simplify
+   (op @3 (vec_cond:s (op2@0 @00 @01) @1 @2))
+   (if (TREE_CODE_CLASS (op) != tcc_comparison
+        || types_match (type, TREE_TYPE (@1))
+        || expand_vec_cond_expr_p (type, TREE_TYPE (@00), op2)
+       || !expand_vec_cond_expr_p (TREE_TYPE (@1), TREE_TYPE (@00), op2))
+    (vec_cond @0 (op! @3 @1) (op! @3 @2))))))
 
 #if GIMPLE
 (match (nop_atomic_bit_test_and_p @0 @1 @4)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64
  2024-02-28 10:05           ` Richard Biener
@ 2024-02-28 15:04             ` Jeff Law
  2024-02-28 16:47               ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff Law @ 2024-02-28 15:04 UTC (permalink / raw)
  To: Richard Biener
  Cc: haochen.jiang, gcc-patches, richard.sandiford, haochen.jiang,
	hongtao.liu



On 2/28/24 03:05, Richard Biener wrote:

> 
> Untested fix for targets that cannot handle the original IL below.
> I'm not convinced that's the way to go here, is it?  Or scrap
> the testcase?  Or have a cheap way to say "this target doesn't
> support _any_ vec_cond"?
> 
> Another way, but for stage1 I think, would be to delay all the
> vector checking to the "final" maybe_push_res_to_seq, but that
> requires aggressively cleaning out intermediate folding results
> no longer needed and also still requires patterns to do the
> "is the incoming IL supported on the target" as later we do not
> know what parts we matched (this could in theory be automated
> as well, of course).  There's also no way to say "don't apply
> this unless the final simplification result is a constant".
> While we might be able to add a new '*' modifier saying
> "if the result is supported by the target" there's again no
> way of conditionalizing this on the original operation being
> supported.  OK, maybe on the match allow '*1' and when '*1'
> is used in the result we fail if any of the '*1' annotated
> parts in the match were supported but at least one of the '*1'
> in the result is not.  Supported only for select operations.
> But it still doesn't help as we'd have to somehow check the
> re-simplified result and we can't really track where a
> vec_cond we put in goes.
So I don't mind completely deferring to the next stage1; I don't 
consider this a significant quality regression and if we think we can do 
something cleaner and avoid the pattern explosion I certainly don't mind 
waiting.

Seems like we ought to open a bug so it doesn't get lost, particularly 
if you'd prefer to defer.

jeff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64
  2024-02-28 15:04             ` Jeff Law
@ 2024-02-28 16:47               ` Richard Biener
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Biener @ 2024-02-28 16:47 UTC (permalink / raw)
  To: Jeff Law
  Cc: Richard Biener, haochen.jiang, gcc-patches, richard.sandiford,
	haochen.jiang, hongtao.liu



> Am 28.02.2024 um 16:05 schrieb Jeff Law <jeffreyalaw@gmail.com>:
> 
> 
> 
>> On 2/28/24 03:05, Richard Biener wrote:
>> 
>> Untested fix for targets that cannot handle the original IL below.
>> I'm not convinced that's the way to go here, is it?  Or scrap
>> the testcase?  Or have a cheap way to say "this target doesn't
>> support _any_ vec_cond"?
>> Another way, but for stage1 I think, would be to delay all the
>> vector checking to the "final" maybe_push_res_to_seq, but that
>> requires aggressively cleaning out intermediate folding results
>> no longer needed and also still requires patterns to do the
>> "is the incoming IL supported on the target" as later we do not
>> know what parts we matched (this could in theory be automated
>> as well, of course).  There's also no way to say "don't apply
>> this unless the final simplification result is a constant".
>> While we might be able to add a new '*' modifier saying
>> "if the result is supported by the target" there's again no
>> way of conditionalizing this on the original operation being
>> supported.  OK, maybe on the match allow '*1' and when '*1'
>> is used in the result we fail if any of the '*1' annotated
>> parts in the match were supported but at least one of the '*1'
>> in the result is not.  Supported only for select operations.
>> But it still doesn't help as we'd have to somehow check the
>> re-simplified result and we can't really track where a
>> vec_cond we put in goes.
> So I don't mind completely deferring to the next stage1; I don't consider this a significant quality regression and if we think we can do something cleaner and avoid the pattern explosion I certainly don't mind waiting.
> 
> Seems like we ought to open a bug so it doesn't get lost, particularly if you'd prefer to defer.

I have some other idea to get away with a leaner not supported query when we’re before vector lowering.  I’ll see how that goes tomorrow.

Richard 

> jeff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64
  2024-02-27  7:43 ` Richard Biener
  2024-02-27  8:01   ` Hongtao Liu
  2024-02-27 13:20   ` Jeff Law
@ 2024-03-07 21:03   ` Richard Sandiford
  2024-03-08  6:57     ` Richard Biener
  2 siblings, 1 reply; 12+ messages in thread
From: Richard Sandiford @ 2024-03-07 21:03 UTC (permalink / raw)
  To: Richard Biener; +Cc: haochen.jiang, gcc-patches, haochen.jiang, hongtao.liu

Sorry, still catching up on email, but: 

Richard Biener <rguenther@suse.de> writes:
> We have optimize_vectors_before_lowering_p but we shouldn't even there
> turn supported into not supported ops and as said, what's supported or
> not cannot be finally decided (if it's only vcond and not vcond_mask
> that is supported).  Also optimize_vectors_before_lowering_p is set
> for a short time between vectorization and vector lowering and we
> definitely do not want to turn supported vectorizer emitted stmts
> into ones that we need to lower.  For GCC 15 we should see to move
> vector lowering before vectorization (before loop optimization I'd
> say) to close this particula hole (and also reliably ICE when the
> vectorizer creates unsupported IL).  We also definitely want to
> retire vcond expanders (no target I know of supports single-instruction
> compare-and-select).

...definitely agree with this FWIW.  Sounds like a much cleaner approach.

One of the main tricks that vcond*s tend to do is invert "difficult"
comparisons and swap the data operands to match.  But I think we should
move to a situation where targets don't provide comparison patterns
that require an inversion, and instead move inversions to generic code.

Richard

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64
  2024-03-07 21:03   ` Richard Sandiford
@ 2024-03-08  6:57     ` Richard Biener
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Biener @ 2024-03-08  6:57 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: haochen.jiang, gcc-patches, haochen.jiang, hongtao.liu

On Thu, 7 Mar 2024, Richard Sandiford wrote:

> Sorry, still catching up on email, but: 
> 
> Richard Biener <rguenther@suse.de> writes:
> > We have optimize_vectors_before_lowering_p but we shouldn't even there
> > turn supported into not supported ops and as said, what's supported or
> > not cannot be finally decided (if it's only vcond and not vcond_mask
> > that is supported).  Also optimize_vectors_before_lowering_p is set
> > for a short time between vectorization and vector lowering and we
> > definitely do not want to turn supported vectorizer emitted stmts
> > into ones that we need to lower.  For GCC 15 we should see to move
> > vector lowering before vectorization (before loop optimization I'd
> > say) to close this particula hole (and also reliably ICE when the
> > vectorizer creates unsupported IL).  We also definitely want to
> > retire vcond expanders (no target I know of supports single-instruction
> > compare-and-select).
> 
> ...definitely agree with this FWIW.  Sounds like a much cleaner approach.
> 
> One of the main tricks that vcond*s tend to do is invert "difficult"
> comparisons and swap the data operands to match.  But I think we should
> move to a situation where targets don't provide comparison patterns
> that require an inversion, and instead move inversions to generic code.

Yes, that would be good as well - of course it will likely require fending
off some more simplification pattenrs.  Sounds like step #2 anyway, making
all ports happy with no vcond will definitely require some work unless
we force it by simply no longer using vcond ...

Richard.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-03-08  6:57 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-26 21:13 [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64 haochen.jiang
2024-02-27  7:43 ` Richard Biener
2024-02-27  8:01   ` Hongtao Liu
2024-02-27 13:20   ` Jeff Law
2024-02-27 13:53     ` Richard Biener
2024-02-27 14:11       ` Jeff Law
2024-02-27 14:31         ` Richard Biener
2024-02-28 10:05           ` Richard Biener
2024-02-28 15:04             ` Jeff Law
2024-02-28 16:47               ` Richard Biener
2024-03-07 21:03   ` Richard Sandiford
2024-03-08  6:57     ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).