public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/97043] New: latent wrong-code with SLP vectorization
@ 2020-09-14 9:01 rguenth at gcc dot gnu.org
2020-09-14 9:16 ` [Bug tree-optimization/97043] " rguenth at gcc dot gnu.org
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-09-14 9:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Bug ID: 97043
Summary: latent wrong-code with SLP vectorization
Product: gcc
Version: 10.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---
There's a latent wrong-code bug on the branches visible with the
gcc.dg/vect/pr81410.c testcase. The issue is avoided on trunk
by means of delaying optimizing/validating of SLP permutations until
the vectorization factor is final whilst on branches we throw away
the permutation prematurely via
if (!this_load_permuted
/* The load requires permutation when unrolling exposes
a gap either because the group is larger than the SLP
group-size or because there is a gap between the groups.
*/
&& (known_eq (unrolling_factor, 1U)
|| (group_size == DR_GROUP_SIZE (first_stmt_info)
&& DR_GROUP_GAP (first_stmt_info) == 0)))
{
SLP_TREE_LOAD_PERMUTATION (load_node).release ();
because unrolling_factor is 1 but later is upped to 2 due to hybrid
SLP/non-SLP vectorization. This causes us to run into the gap adjustment
code added by the PR81410 fix:
/* With SLP permutation we load the gaps as well, without
we need to skip the gaps after we manage to fully load
all elements. group_gap_adj is DR_GROUP_SIZE here. */
group_elt += nunits;
if (maybe_ne (group_gap_adj, 0U)
&& !slp_perm
&& known_eq (group_elt, group_size - group_gap_adj))
{
poly_wide_int bump_val
= (wi::to_wide (TYPE_SIZE_UNIT (elem_type))
* group_gap_adj);
tree bump = wide_int_to_tree (sizetype, bump_val);
dataref_ptr = bump_vector_ptr (dataref_ptr, ptr_incr, gsi,
stmt_info, bump);
group_elt = 0;
}
which notes that we do not load the gaps. Now, alignment analysis
correctly analyzes the load of x[] to be aligned due to the VF being 2:
poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
step_preserves_misalignment_p
= multiple_p (DR_STEP_ALIGNMENT (dr_info->dr) * vf, vect_align_c);
but that assumes contiguous aligned loads. The
VMAT_CONTIGUOUS-with-gap-without-permutation vectorization OTOH assumes that
each individual instance
of the group is aligned which it is not. Trying to fix up there
would also require unaligned access support checking at analysis time
as well as possible cost adjustments.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/97043] latent wrong-code with SLP vectorization
2020-09-14 9:01 [Bug tree-optimization/97043] New: latent wrong-code with SLP vectorization rguenth at gcc dot gnu.org
@ 2020-09-14 9:16 ` rguenth at gcc dot gnu.org
2020-09-14 9:30 ` rguenth at gcc dot gnu.org
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-09-14 9:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |wrong-code
Ever confirmed|0 |1
Blocks| |96522
Status|UNCONFIRMED |NEW
Last reconfirmed| |2020-09-14
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
This blocks backporting the fix for PR96522, causing the gcc.dg/vect/pr81410.c
testcase to FAIL execution with an unaligned access using an aligned load.
The trunk rev. that fixed this is gbc484e250990393e887f7239157cc85ce6fadcce
A pragmatic fix might be
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index f6331eeea86..3fdf56f9335 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2309,9 +2309,8 @@ vect_analyze_slp_instance (vec_info *vinfo,
/* The load requires permutation when unrolling exposes
a gap either because the group is larger than the SLP
group-size or because there is a gap between the groups.
*/
- && (known_eq (unrolling_factor, 1U)
- || (group_size == DR_GROUP_SIZE (first_stmt_info)
- && DR_GROUP_GAP (first_stmt_info) == 0)))
+ && group_size == DR_GROUP_SIZE (first_stmt_info)
+ && DR_GROUP_GAP (first_stmt_info) == 0)
{
SLP_TREE_LOAD_PERMUTATION (load_node).release ();
continue;
with biggest effects eventually on load-lane targets (arm/aarch64) where
we then eventually prefer more of those. For the testcase in question
we then generate the following, matching trunk
movdqa (%rdx), %xmm2
movdqa 16(%rdx), %xmm0
shufpd $1, 32(%rdx), %xmm0
instead of
movdqa (%rdx), %xmm1
addq $48, %rdx
movdqu -24(%rdx), %xmm2
(or with the backport of PR96522 a wrong movdqa in place of the movdqu).
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96522
[Bug 96522] [9/10 Regression] Incorrect with with -O -fno-tree-pta
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/97043] latent wrong-code with SLP vectorization
2020-09-14 9:01 [Bug tree-optimization/97043] New: latent wrong-code with SLP vectorization rguenth at gcc dot gnu.org
2020-09-14 9:16 ` [Bug tree-optimization/97043] " rguenth at gcc dot gnu.org
@ 2020-09-14 9:30 ` rguenth at gcc dot gnu.org
2020-09-14 13:22 ` cvs-commit at gcc dot gnu.org
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-09-14 9:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
Status|NEW |ASSIGNED
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/97043] latent wrong-code with SLP vectorization
2020-09-14 9:01 [Bug tree-optimization/97043] New: latent wrong-code with SLP vectorization rguenth at gcc dot gnu.org
2020-09-14 9:16 ` [Bug tree-optimization/97043] " rguenth at gcc dot gnu.org
2020-09-14 9:30 ` rguenth at gcc dot gnu.org
@ 2020-09-14 13:22 ` cvs-commit at gcc dot gnu.org
2020-10-06 12:20 ` rguenth at gcc dot gnu.org
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-09-14 13:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Richard Biener
<rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:e93428a8b056aed83a7678d4dc8272131ab671ba
commit r10-8759-ge93428a8b056aed83a7678d4dc8272131ab671ba
Author: Richard Biener <rguenther@suse.de>
Date: Mon Sep 14 11:25:04 2020 +0200
tree-optimization/97043 - fix latent wrong-code with SLP vectorization
When the unrolling decision comes late and would have prevented
eliding a SLP load permutation we can end up generating aligned
loads when the load is in fact unaligned. Most of the time
alignment analysis figures out the load is in fact unaligned
but that cannot be relied upon.
The following removes the SLP load permutation eliding based on
the still premature vectorization factor.
2020-09-14 Richard Biener <rguenther@suse.de>
PR tree-optimization/97043
* tree-vect-slp.c (vect_analyze_slp_instance): Do not
elide a load permutation if the current vectorization
factor is one.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/97043] latent wrong-code with SLP vectorization
2020-09-14 9:01 [Bug tree-optimization/97043] New: latent wrong-code with SLP vectorization rguenth at gcc dot gnu.org
` (2 preceding siblings ...)
2020-09-14 13:22 ` cvs-commit at gcc dot gnu.org
@ 2020-10-06 12:20 ` rguenth at gcc dot gnu.org
2021-02-03 8:26 ` rguenth at gcc dot gnu.org
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-10-06 12:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Bug 97043 depends on bug 97236, which changed state.
Bug 97236 Summary: [10 Regression] g:e93428a8b056aed83a7678 triggers vlc miscompile
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97236
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/97043] latent wrong-code with SLP vectorization
2020-09-14 9:01 [Bug tree-optimization/97043] New: latent wrong-code with SLP vectorization rguenth at gcc dot gnu.org
` (3 preceding siblings ...)
2020-10-06 12:20 ` rguenth at gcc dot gnu.org
@ 2021-02-03 8:26 ` rguenth at gcc dot gnu.org
2021-02-03 8:27 ` rguenth at gcc dot gnu.org
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-02-03 8:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Bug 97043 depends on bug 97236, which changed state.
Bug 97236 Summary: [8/9 Regression] g:e93428a8b056aed83a7678 triggers vlc miscompile
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97236
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |REOPENED
Resolution|FIXED |---
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/97043] latent wrong-code with SLP vectorization
2020-09-14 9:01 [Bug tree-optimization/97043] New: latent wrong-code with SLP vectorization rguenth at gcc dot gnu.org
` (4 preceding siblings ...)
2021-02-03 8:26 ` rguenth at gcc dot gnu.org
@ 2021-02-03 8:27 ` rguenth at gcc dot gnu.org
2021-02-08 10:43 ` rguenth at gcc dot gnu.org
2022-02-18 8:21 ` cvs-commit at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-02-03 8:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed, not backporting further.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/97043] latent wrong-code with SLP vectorization
2020-09-14 9:01 [Bug tree-optimization/97043] New: latent wrong-code with SLP vectorization rguenth at gcc dot gnu.org
` (5 preceding siblings ...)
2021-02-03 8:27 ` rguenth at gcc dot gnu.org
@ 2021-02-08 10:43 ` rguenth at gcc dot gnu.org
2022-02-18 8:21 ` cvs-commit at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-02-08 10:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Bug 97043 depends on bug 97236, which changed state.
Bug 97236 Summary: [8 Regression] g:e93428a8b056aed83a7678 triggers vlc miscompile
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97236
What |Removed |Added
----------------------------------------------------------------------------
Status|REOPENED |RESOLVED
Resolution|--- |FIXED
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/97043] latent wrong-code with SLP vectorization
2020-09-14 9:01 [Bug tree-optimization/97043] New: latent wrong-code with SLP vectorization rguenth at gcc dot gnu.org
` (6 preceding siblings ...)
2021-02-08 10:43 ` rguenth at gcc dot gnu.org
@ 2022-02-18 8:21 ` cvs-commit at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-02-18 8:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Richard Biener
<rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:e75e5d2c41d294c4da4adfe610204ce5d97c3a4e
commit r9-9959-ge75e5d2c41d294c4da4adfe610204ce5d97c3a4e
Author: Richard Biener <rguenther@suse.de>
Date: Mon Sep 14 11:25:04 2020 +0200
tree-optimization/97043 - fix latent wrong-code with SLP vectorization
When the unrolling decision comes late and would have prevented
eliding a SLP load permutation we can end up generating aligned
loads when the load is in fact unaligned. Most of the time
alignment analysis figures out the load is in fact unaligned
but that cannot be relied upon.
The following removes the SLP load permutation eliding based on
the still premature vectorization factor.
2020-09-14 Richard Biener <rguenther@suse.de>
PR tree-optimization/97043
* tree-vect-slp.c (vect_analyze_slp_instance): Do not
elide a load permutation if the current vectorization
factor is one.
(cherry picked from commit e93428a8b056aed83a7678d4dc8272131ab671ba)
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2022-02-18 8:21 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-14 9:01 [Bug tree-optimization/97043] New: latent wrong-code with SLP vectorization rguenth at gcc dot gnu.org
2020-09-14 9:16 ` [Bug tree-optimization/97043] " rguenth at gcc dot gnu.org
2020-09-14 9:30 ` rguenth at gcc dot gnu.org
2020-09-14 13:22 ` cvs-commit at gcc dot gnu.org
2020-10-06 12:20 ` rguenth at gcc dot gnu.org
2021-02-03 8:26 ` rguenth at gcc dot gnu.org
2021-02-03 8:27 ` rguenth at gcc dot gnu.org
2021-02-08 10:43 ` rguenth at gcc dot gnu.org
2022-02-18 8:21 ` cvs-commit at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).