public inbox for gcc-bugs@sourceware.org
* [Bug target/105197] New: [12 Regression] SVE: wrong code with -O -ftree-vectorize
From: acoplan at gcc dot gnu.org @ 2022-04-07 16:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105197
Bug ID: 105197
Summary: [12 Regression] SVE: wrong code with -O
-ftree-vectorize
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: acoplan at gcc dot gnu.org
Target Milestone: ---
The following C code:
unsigned char arr_7[9][3];
unsigned char (*main_arr_7)[3] = arr_7;
int main() {
  char arr_2[9];
  int arr_6[9];
  int x;
  unsigned i;
  for (i = 0; i < 9; ++i) {
    arr_2[i] = 21;
    arr_6[i] = 6;
  }
  for (i = arr_2[8] - 21; i < 2; i++)
    x = arr_6[i] ? (main_arr_7[8][i] ? main_arr_7[8][i] : 8) : (char)arr_6[i];
  if (x != 8)
    __builtin_abort ();
}
appears to be miscompiled with -march=armv8.2-a+sve -O -ftree-vectorize. The
issue doesn't seem to occur with GCC 11.
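For reference, the value the test expects can be worked out by hand. The following is a minimal sketch of the scalar semantics of the second loop, assuming arr_7 is zero-initialized because it has static storage duration. The helper name expected_x is hypothetical and not part of the report.

```c
#include <assert.h>

/* Zero-initialized because it has static storage duration, as in the report. */
static unsigned char arr_7[9][3];

/* Hypothetical helper: evaluates the second loop of the test case
   statement by statement, without relying on vectorization. */
static int expected_x(void)
{
    int arr_6[9];
    int x = 0;
    unsigned i;

    for (i = 0; i < 9; ++i)
        arr_6[i] = 6;

    /* arr_2[8] - 21 is 0 in the report, so the loop covers i = 0 and i = 1. */
    for (i = 0; i < 2; i++) {
        if (arr_6[i])                              /* 6, so always taken */
            x = arr_7[8][i] ? arr_7[8][i] : 8;     /* element is 0, so 8 */
        else
            x = (char)arr_6[i];
    }
    return x;
}
```

Under these assumptions expected_x() yields 8, which is why the reproducer aborts only when the loop is miscompiled.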
* [Bug target/105197] [12 Regression] SVE: wrong code with -O -ftree-vectorize
From: rguenth at gcc dot gnu.org @ 2022-04-08 8:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105197
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
      Known to work|                            |11.2.1
* [Bug target/105197] [12 Regression] SVE: wrong code with -O -ftree-vectorize
From: rguenth at gcc dot gnu.org @ 2022-04-08 8:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105197
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
The GIMPLE doesn't look wrong. We're using an EXTRACT_LAST, so that might be
the special thing. Vectorization of the first loop is probably not necessary
to trigger the failure.
* [Bug target/105197] [12 Regression] SVE: wrong code with -O -ftree-vectorize
From: tnfchris at gcc dot gnu.org @ 2022-04-10 7:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105197
Tamar Christina <tnfchris at gcc dot gnu.org> changed:
           What    |Removed                      |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                  |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org|tnfchris at gcc dot gnu.org
                 CC|                             |tnfchris at gcc dot gnu.org
   Last reconfirmed|                             |2022-04-10
     Ever confirmed|0                            |1
--- Comment #2 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #1)
> the GIMPLE doesn't look wrong. We're using an EXTRACT_LAST, so that might
> be the special thing. Vectorization of the first loop is probably not
> necessary to trigger the failure.
Hmm, it looks like the GIMPLE is wrong: the masks it combines create a
contradiction.
At the GIMPLE level we have:
mask__44.14_114 = vect__4.13_112 != { 0, ... };
mask__26.22_128 = vect__6.17_121 == { 0, ... };
mask_patt_65.24_130 = mask__44.14_114 & mask__26.22_128;
mask__43.26_135 = vect__4.13_112 == { 0, ... };
mask__25.18_123 = vect__6.17_121 != { 0, ... };
_137 = mask__43.26_135 & loop_mask_111;
_163 = mask_patt_65.24_130 & _137;
where _163 demands both vect__4.13_112 != 0 (via mask__44.14_114) and
vect__4.13_112 == 0 (via mask__43.26_135), which can never hold.
_163 should have been _163 = mask_patt_65.24_130 & loop_mask_111;
so it looks like the wrong loop masks are combined.
Mine.
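The contradiction can be checked with a small lane-mask model. This is only a sketch: it assumes 8 lanes with one bit per lane, and the helper names (ne_zero, eq_zero, mask_163_as_emitted, mask_163_expected) are hypothetical; only the mask combinations mirror the GIMPLE quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8

/* One bit per lane; variable names mirror the SSA names quoted above. */
typedef uint8_t lane_mask;

/* Lane-wise v != 0. */
static lane_mask ne_zero(const int *v)
{
    lane_mask m = 0;
    for (int i = 0; i < LANES; i++)
        if (v[i] != 0)
            m |= (lane_mask)(1u << i);
    return m;
}

/* Lane-wise v == 0, i.e. the complement of ne_zero. */
static lane_mask eq_zero(const int *v)
{
    return (lane_mask)~ne_zero(v);
}

/* _163 as emitted: mask_patt_65 & (mask__43 & loop_mask). */
static lane_mask mask_163_as_emitted(const int *v4, const int *v6,
                                     lane_mask loop_mask)
{
    lane_mask mask_44 = ne_zero(v4);
    lane_mask mask_26 = eq_zero(v6);
    lane_mask mask_patt_65 = (lane_mask)(mask_44 & mask_26);
    lane_mask mask_43 = eq_zero(v4);
    lane_mask m137 = (lane_mask)(mask_43 & loop_mask);
    return (lane_mask)(mask_patt_65 & m137);
}

/* _163 as suggested in the comment: mask_patt_65 & loop_mask. */
static lane_mask mask_163_expected(const int *v4, const int *v6,
                                   lane_mask loop_mask)
{
    lane_mask mask_patt_65 = (lane_mask)(ne_zero(v4) & eq_zero(v6));
    return (lane_mask)(mask_patt_65 & loop_mask);
}
```

Because mask_44 and mask_43 are complements of each other, the emitted combination is empty for any input, whereas the suggested form keeps the lanes selected by mask_patt_65.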
* [Bug target/105197] [12 Regression] SVE: wrong code with -O -ftree-vectorize
From: tnfchris at gcc dot gnu.org @ 2022-04-11 6:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105197
Tamar Christina <tnfchris at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
--- Comment #3 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
Looks like this started with
commit d846f225c25c5885250c303c8d118caa08c447ab
Author: Richard Biener <rguenther@suse.de>
Date: Tue May 4 15:51:20 2021 +0200
tree-optimization/79333 - fold stmts following SSA edges in VN
This makes sure to follow SSA edges when folding eliminated stmts.
This reaps the same benefit as forwprop folding all stmts, not
waiting for one to produce copysign in the new testcase.
2021-05-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/79333
* tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_stmt):
Fold stmt following SSA edges.
* gcc.dg/tree-ssa/ssa-fre-94.c: New testcase.
* gcc.dg/graphite/fuse-1.c: Adjust.
* gcc.dg/pr43864-4.c: Likewise.
and what's happening is that the vectorizer relies on a mask A and its inverse
~A being represented by a negation of the mask. Ifcvt used to enforce this, but
with the change it now pushes the ~ into the mask operation if it can.
So previously we would generate
_26 = ~_25;
_43 = ~_44;
out of ifcvt, and now we generate
_26 = _6 == 0;
_43 = _4 == 0;
and force the creation of two extra masks, as it de-optimizes the vectorizer's
ability to immediately see a mask inversion.
We however still detect that those two are inverses of
_25 = _6 != 0;
_44 = _4 != 0;
and when generating the second VEC_COND for the operation we end up flipping
the arguments somehow
------>vectorizing statement: _ifc__41 = _43 ? 0 : _ifc__40;
created new init_stmt: vect_cst__136 = { 0, ... }
add new stmt: _137 = mask__43.26_135 & loop_mask_111
note: add new stmt: vect__ifc__41.27_138 = VEC_COND_EXPR <_137,
vect__ifc__40.25_133, vect_cst__136>;
so we've vectorized _ifc__41 = _43 ? _ifc__40 : 0; instead, without negating _137,
which is where the contradiction gets introduced. I'll fix that bug, but the
question remains whether we want this simplification to now happen in ifcvt for
masks.
It makes the vectorizer generate a lot more intermediate masks, which are cleaned
up by the RPO pass I added at the end, but we also lose the fact that they are
simple inverses, i.e. at -O3 on these integer masks we could have just
generated a NOT.
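The effect of swapping the arms without inverting the mask can be illustrated per lane. This is a sketch, not the vectorizer's code: the function names are hypothetical, and each function models a single lane of the statement shown in the dump above.

```c
#include <assert.h>
#include <stdbool.h>

/* Scalar statement from the dump: _ifc__41 = _43 ? 0 : _ifc__40. */
static int scalar_select(bool c43, int ifc40)
{
    return c43 ? 0 : ifc40;
}

/* One lane of the emitted VEC_COND_EXPR <_137, vect__ifc__40, { 0, ... }>
   with _137 = mask__43 & loop_mask: arms swapped, mask not inverted. */
static int lane_as_emitted(bool c43, bool loop_active, int ifc40)
{
    bool m137 = c43 && loop_active;
    return m137 ? ifc40 : 0;
}

/* One lane with the arms swapped and the mask inverted as well,
   which matches the scalar statement on active lanes. */
static int lane_swapped_and_inverted(bool c43, bool loop_active, int ifc40)
{
    bool m = !c43 && loop_active;
    return m ? ifc40 : 0;
}
```

On an active lane with _43 false and _ifc__40 equal to 8, the scalar statement yields 8 while the emitted form yields 0. Inverting the mask together with the swap restores agreement.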
* [Bug target/105197] [12 Regression] SVE: wrong code with -O -ftree-vectorize
From: rguenth at gcc dot gnu.org @ 2022-04-11 8:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105197
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
The following improves the output from if-conversion by simplifying
~cond ? a : b to cond ? b : a, possibly reducing the number of conds. Ideally
if-conversion would track predicates in a more conscious way, of course.
diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index 7495ed653c0..dd3d5255a38 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -472,6 +472,14 @@ fold_build_cond_expr (tree type, tree cond, tree rhs, tree lhs)
           && (integer_zerop (op1)))
         cond = op0;
     }
+  gassign *ass;
+  if (TREE_CODE (cond) == SSA_NAME
+      && (ass = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (cond)))
+      && gimple_assign_rhs_code (ass) == BIT_NOT_EXPR)
+    {
+      cond = gimple_assign_rhs1 (ass);
+      std::swap (rhs, lhs);
+    }
   cond_expr = fold_ternary (COND_EXPR, type, cond, rhs, lhs);
   if (cond_expr == NULL_TREE)
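The identity the patch relies on, ~cond ? a : b being the same as cond ? b : a, can be modelled in a few lines. This is a sketch only: struct select_expr, eval_select and strip_negation are hypothetical stand-ins, not GCC data structures or functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of a COND_EXPR:
   (negated ? !cond : cond) ? rhs : lhs. */
struct select_expr {
    bool negated;   /* true when the condition is wrapped in a NOT */
    bool cond;
    int rhs;        /* value when the (possibly negated) condition holds */
    int lhs;        /* value otherwise */
};

static int eval_select(struct select_expr e)
{
    bool c = e.negated ? !e.cond : e.cond;
    return c ? e.rhs : e.lhs;
}

/* Strip the negation and swap the arms together, mirroring the idea of
   the patch: the two changes are only valid as a pair. */
static struct select_expr strip_negation(struct select_expr e)
{
    if (e.negated) {
        int tmp = e.rhs;
        e.rhs = e.lhs;
        e.lhs = tmp;
        e.negated = false;
    }
    return e;
}
```

For either value of cond, eval_select gives the same result before and after strip_negation, which is the property the rewrite needs to preserve.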
* [Bug target/105197] [12 Regression] SVE: wrong code with -O -ftree-vectorize
From: cvs-commit at gcc dot gnu.org @ 2022-04-11 14:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105197
--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tamar Christina <tnfchris@gcc.gnu.org>:
https://gcc.gnu.org/g:78c718490bc2843d4dadcef8a0ae14aed1d15a32
commit r12-8080-g78c718490bc2843d4dadcef8a0ae14aed1d15a32
Author: Tamar Christina <tamar.christina@arm.com>
Date: Mon Apr 11 15:09:05 2022 +0100
middle-end: Prevent the use of the cond inversion detection code when both
conditions are external. [PR105197]
Previously ifcvt used to enforce that a mask A and the inverse of said mask be
represented as ~A. So for the masks
    _25 = _6 != 0;
    _44 = _4 != 0;
ifcvt would produce for an operation requiring the inverse of said mask
    _26 = ~_25;
    _43 = ~_44;
but now that VN is applied to the entire function body we get a simplification
on the mask and produce:
    _26 = _6 == 0;
    _43 = _4 == 0;
This in itself is not a problem semantically speaking (though it does create
more masks that need to be tracked), but when vectorizing the masked conditional
we would still detect _26 and _43 to be inverses of _25 and _44 and mark them
as requiring their operands be swapped.
When vectorizing we swap the operands but don't find the BIT_NOT_EXPR to remove,
and so we leave the condition as is, which produces invalid code:
    ------>vectorizing statement: _ifc__41 = _43 ? 0 : _ifc__40;
    created new init_stmt: vect_cst__136 = { 0, ... }
    add new stmt: _137 = mask__43.26_135 & loop_mask_111
    note: add new stmt: vect__ifc__41.27_138 = VEC_COND_EXPR <_137,
    vect__ifc__40.25_133, vect_cst__136>;
This fixes the problem by disabling the inversion detection code when the loop
isn't masked, since both conditionals would be external. In that case we'd not
use the new cond_code and would incorrectly still swap the operands.
The resulting code is also better than GCC-11, with most operations now
predicated on the loop mask rather than a ptrue.
gcc/ChangeLog:
PR target/105197
* tree-vect-stmts.cc (vectorizable_condition): Prevent cond swap when
not masked.
gcc/testsuite/ChangeLog:
PR target/105197
* gcc.target/aarch64/sve/pr105197-1.c: New test.
* gcc.target/aarch64/sve/pr105197-2.c: New test.
* [Bug target/105197] [12 Regression] SVE: wrong code with -O -ftree-vectorize
From: tnfchris at gcc dot gnu.org @ 2022-04-11 14:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105197
Tamar Christina <tnfchris at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
--- Comment #6 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
Fixed, with no codegen regression relative to GCC 11.