* [Bug tree-optimization/112361] [14 Regression] avx512f-reduce-op-1.c miscompiled since r14-5076
2023-11-03 7:20 [Bug tree-optimization/112361] New: [14 Regression] avx512f-reduce-op-1.c miscompiled since r14-5076 jakub at gcc dot gnu.org
@ 2023-11-03 7:20 ` jakub at gcc dot gnu.org
2023-11-03 7:22 ` pinskia at gcc dot gnu.org
` (8 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-11-03 7:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112361
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
                 CC|                            |rdapp at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
   Target Milestone|---                         |14.0
From: pinskia at gcc dot gnu.org @ 2023-11-03 7:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112361
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-11-03
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed by me too:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635063.html
From: rdapp at gcc dot gnu.org @ 2023-11-03 7:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112361
--- Comment #2 from Robin Dapp <rdapp at gcc dot gnu.org> ---
I can have a look. I did test the patch, but neither the compile farm machine
(gcc188) I used nor my local machine can run AVX512 code. Is there anywhere
else I can test it?
From: rguenth at gcc dot gnu.org @ 2023-11-03 8:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112361
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
You can use SDE (the simulator from Intel), but I suppose just inspecting the
vectorized code should work fine as well. I suspect we fail to reject
vectorization when we have a masked op but no native masked_fold_left, as
we cannot open-code that variant.
From: jakub at gcc dot gnu.org @ 2023-11-03 8:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112361
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
You can just look at the dumps.
Generally, I'd expect that we shouldn't be creating .COND_ADD etc. calls for
conditional reductions for SCALAR_FLOAT_TYPE_P if !flag_associative_math, but
the fold-left reduction code probably also needs to either assert that the
reduction isn't conditional or handle that case.
E.g. needs_fold_left_reduction_p returns true if it has to do an in-order
reduction.
But I guess Richard will know the details much better.
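For illustration, here is a minimal C sketch of why such a conditional float reduction has to stay in order without -fassociative-math. The function names and the data are hypothetical and not taken from avx512f-reduce-op-1.c; the second function only mimics the kind of reassociation a vectorizer could apply, and the comments assume IEEE single precision without excess precision.

```c
#include <assert.h>
#include <stddef.h>

/* In-order (fold-left) conditional sum: elements with c[i] == 0 are
   skipped, all others are added strictly left to right.  */
float cond_sum_in_order(const float *a, const int *c, size_t n, float init)
{
    assert(a != NULL && c != NULL);
    float r = init;
    for (size_t i = 0; i < n; i++)
        if (c[i])
            r += a[i];
    return r;
}

/* Same elements, but accumulated in two interleaved partial sums that
   are combined at the end.  This changes the order of the additions,
   which is only valid when reassociation is allowed.  */
float cond_sum_reassociated(const float *a, const int *c, size_t n, float init)
{
    assert(a != NULL && c != NULL);
    float even = init, odd = 0.0f;
    for (size_t i = 0; i < n; i++) {
        if (!c[i])
            continue;
        if (i % 2 == 0)
            even += a[i];
        else
            odd += a[i];
    }
    return even + odd;
}
```

With a = {1e8f, 1.0f, -1e8f, 1.0f} and every condition true, the in-order sum absorbs the first 1.0f into 1e8f, while the two-accumulator variant keeps both small terms, so the two functions return different values.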
From: pinskia at gcc dot gnu.org @ 2023-11-04 5:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112361
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The patch that caused this one also causes a bootstrap comparison failure with
--with-arch=skylake-avx512; see PR 112374.
From: rdapp at gcc dot gnu.org @ 2023-11-06 9:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112361
--- Comment #6 from Robin Dapp <rdapp at gcc dot gnu.org> ---
So "before" we created
  vect__3.12_55 = MEM <vector(16) float> [(float *)vectp_a.10_53];
  vect__ifc__43.13_57 = VEC_COND_EXPR <mask__24.9_52, vect__3.12_55,
    { 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
      0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 }>;
  // _ifc__43 = _24 ? _3 : 0.0;
  stmp__44.14_58 = BIT_FIELD_REF <vect__ifc__43.13_57, 32, 0>;
  stmp__44.14_59 = r3_29 + stmp__44.14_58;
  ...
in vect_expand_fold_left.
Now, as intended, there is no VEC_COND anymore and we just create the bit-field
reduction over the unmasked vector.
We could refrain from creating the COND_OP in the first place as Jakub
mentioned (I guess we know already in if-conv that we shouldn't), re-insert a
VEC_COND or create a COND_OP chain (instead of an OP chain) in
vect_expand_fold_left by passing it the mask (and is_cond_op).
Having several COND_OPs here might make analysis of subsequent passes more
difficult?
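As a rough scalar model of the difference described above (a sketch only, not the code GCC emits), the two expansions can be written in plain C. LANES and both function names are assumptions made for this illustration; the first function mirrors the VEC_COND_EXPR followed by per-lane BIT_FIELD_REF adds, the second one mirrors the reduction over the unmasked vector.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 16 /* matches the vector(16) float in the dump */

/* Fold-left with a select in front: inactive lanes are replaced by
   else_val before the ordered chain of adds.  */
float fold_left_with_select(float r, const float *vec,
                            const unsigned char *mask, float else_val)
{
    assert(vec != NULL && mask != NULL);
    for (size_t i = 0; i < LANES; i++) {
        float lane = mask[i] ? vec[i] : else_val; /* VEC_COND_EXPR */
        r = r + lane;                             /* BIT_FIELD_REF + add */
    }
    return r;
}

/* Fold-left over the unmasked vector: the mask is ignored and every
   lane is added, including lanes the scalar loop would have skipped.  */
float fold_left_unmasked(float r, const float *vec)
{
    assert(vec != NULL);
    for (size_t i = 0; i < LANES; i++)
        r = r + vec[i];
    return r;
}
```

For lanes 1..16 with only every other lane active, the first function sums the eight active lanes while the second sums all sixteen, which is the kind of wrong result the testcase would observe.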
From: rguenther at suse dot de @ 2023-11-06 10:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112361
--- Comment #7 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 6 Nov 2023, rdapp at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112361
>
> --- Comment #6 from Robin Dapp <rdapp at gcc dot gnu.org> ---
> So "before" we created
>
> vect__3.12_55 = MEM <vector(16) float> [(float *)vectp_a.10_53];
> vect__ifc__43.13_57 = VEC_COND_EXPR <mask__24.9_52, vect__3.12_55, { 0.0,
> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 }>;
> // _ifc__43 = _24 ? _3 : 0.0;
> stmp__44.14_58 = BIT_FIELD_REF <vect__ifc__43.13_57, 32, 0>;
> stmp__44.14_59 = r3_29 + stmp__44.14_58;
> ...
>
> in vect_expand_fold_left.
Note that this wasn't correct in all cases (wrt signed zeros and
sign-dependent rounding).
> Now, as intended, there is no VEC_COND anymore and we just create the bit-field
> reduction over the unmasked vector.
That's invalid for a COND_OP. We either have to emulate that COND_OP
by materializing a VEC_COND_EXPR as before when that's semantically
valid, or refrain from vectorizing (I don't think we want to emit
N compare & jump to scalarize the mask effect).
> We could refrain from creating the COND_OP in the first place as Jakub
> mentioned (I guess we know already in if-conv that we shouldn't), re-insert a
> VEC_COND or create a COND_OP chain (instead of an OP chain) in
> vect_expand_fold_left by passing it the mask (and is_cond_op).
> Having several COND_OPs here might make analysis of subsequent passes more
> difficult?
Pass in the mask and is_cond_op and create the VEC_COND_EXPR in
vect_expand_fold_left. But make sure to disallow vectorizing the
invalid cases.
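The signed-zero caveat mentioned above can be reproduced with a small scalar model. This is a sketch under the assumption of IEEE arithmetic in the default round-to-nearest mode; all function names are hypothetical. It compares a loop that skips inactive elements with a select-then-add emulation that substitutes an else value for them.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Scalar reference: inactive elements are skipped entirely, so the
   accumulator is left untouched for them.  */
float cond_sum_scalar(float r, const float *a, const unsigned char *m, size_t n)
{
    assert(a != NULL && m != NULL);
    for (size_t i = 0; i < n; i++)
        if (m[i])
            r += a[i];
    return r;
}

/* Select-then-add emulation: inactive elements still take part in the
   addition, with else_val substituted for their value.  */
float cond_sum_select(float r, const float *a, const unsigned char *m,
                      size_t n, float else_val)
{
    assert(a != NULL && m != NULL);
    for (size_t i = 0; i < n; i++)
        r += m[i] ? a[i] : else_val;
    return r;
}

/* Helper: true only for -0.0f.  */
int is_negative_zero(float x)
{
    return x == 0.0f && signbit(x);
}
```

Starting from r = -0.0f with every element inactive, the scalar loop returns -0.0f, whereas the emulation with else_val = 0.0f returns +0.0f because -0.0 + 0.0 is +0.0 in round-to-nearest. Using -0.0f as the else value preserves the sign in this mode; as far as I understand, which zero is neutral depends on the rounding direction, which is presumably the sign-dependent rounding concern.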
From: cvs-commit at gcc dot gnu.org @ 2023-11-07 21:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112361
--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Robin Dapp <rdapp@gcc.gnu.org>:

https://gcc.gnu.org/g:fd940d248bfccb6994794152681dc4c693160919

commit r14-5231-gfd940d248bfccb6994794152681dc4c693160919
Author: Robin Dapp <rdapp@ventanamicro.com>
Date:   Mon Nov 6 11:24:37 2023 +0100

    vect/ifcvt: Add vec_cond fallback and check for vector versioning.

    This restricts tree-ifcvt to only create COND_OPs when we versioned the
    loop for vectorization.  Apart from that it re-creates a VEC_COND_EXPR
    in vect_expand_fold_left if we emitted a COND_OP.

    gcc/ChangeLog:

            PR tree-optimization/112361
            PR target/112359
            PR middle-end/112406
            * tree-if-conv.cc (convert_scalar_cond_reduction): Remember if
            loop was versioned and only then create COND_OPs.
            (predicate_scalar_phi): Do not create COND_OP when not
            vectorizing.
            * tree-vect-loop.cc (vect_expand_fold_left): Re-create
            VEC_COND_EXPR.
            (vectorize_fold_left_reduction): Pass mask to
            vect_expand_fold_left.

    gcc/testsuite/ChangeLog:

            * gcc.dg/pr112359.c: New test.
From: rguenth at gcc dot gnu.org @ 2023-11-08 7:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112361
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED
--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.