public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/64745] New: Generic vectorization missed opportunities
@ 2015-01-23 12:10 rguenth at gcc dot gnu.org
2015-01-23 12:15 ` [Bug tree-optimization/64745] " rguenth at gcc dot gnu.org
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-01-23 12:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64745
Bug ID: 64745
Summary: Generic vectorization missed opportunities
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target: x86_64-*-*, i?86-*-*
unsigned short a[2], b[2];

void foo (void)
{
  int i;
  for (i = 0; i < 2; ++i)
    a[i] = b[i];
}

unsigned char x[4], y[4];

void bar (void)
{
  int i;
  for (i = 0; i < 4; ++i)
    x[i] = y[i];
}
Vectorizing this on i?86 (without SSE) fails for the first testcase at -O3
because we unroll the loop and SLP refuses to handle the "unaligned" load.
For the second testcase we loop-vectorize it but apply versioning for alignment.
The alignment checks in the vectorizer do not account for non-vector modes.
If we fix that, the first loop still fails to SLP vectorize because of a bogus
cost calculation:
t.c:6:13: note: Cost model analysis:
Vector inside of basic block cost: 4
Vector prologue cost: 0
Vector epilogue cost: 0
Scalar cost of basic block: 4
t.c:6:13: note: not vectorized: vectorization is not profitable.
because of the unaligned load/store cost:
t.c:6:13: note: vect_model_load_cost: unaligned supported by hardware.
t.c:6:13: note: vect_model_load_cost: inside_cost = 2, prologue_cost = 0 .
...
t.c:6:13: note: vect_model_store_cost: unaligned supported by hardware.
t.c:6:13: note: vect_model_store_cost: inside_cost = 2, prologue_cost = 0 .
That's a backend bug: ix86_builtin_vectorization_cost doesn't consider
vector types with a non-vector mode (!VECTOR_MODE_P). OTOH, for SLP
vectorization, if the costs are equal we can assume fewer stmts will be
used, so we could just vectorize anyway in that case.
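
To make the tie concrete, here is a simplified model of the profitability comparison, using the numbers from the dump above. It is only a sketch: the struct and both function names are made up for this example and are not GCC's implementation.

```c
#include <stdbool.h>

/* Simplified model of the basic-block cost comparison.  This is an
   illustration only, not GCC's code; all names are hypothetical.  */
struct bb_cost
{
  int vec_inside;    /* "Vector inside of basic block cost" */
  int vec_prologue;  /* "Vector prologue cost" */
  int vec_epilogue;  /* "Vector epilogue cost" */
  int scalar;        /* "Scalar cost of basic block" */
};

/* Strict rule: the vector form must be strictly cheaper.  */
bool profitable_strict (struct bb_cost c)
{
  return c.vec_inside + c.vec_prologue + c.vec_epilogue < c.scalar;
}

/* Relaxed rule suggested above: on a tie, take the vector form,
   assuming it needs fewer stmts.  */
bool profitable_allow_tie (struct bb_cost c)
{
  return c.vec_inside + c.vec_prologue + c.vec_epilogue <= c.scalar;
}
```

With the dumped costs (4, 0, 0 against a scalar cost of 4), the strict rule rejects the block and the relaxed rule accepts it.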
The real issue, of course, is that generic vectorization is not attempted
if a vector ISA is available, but then we fail to vectorize the above cases,
where SLP vectorization would take care of combining small loads and stores.
So we'd need to support HImode, SImode (and DImode on x86_64) vectorization
sizes. That probably comes at too big a cost to consider in general, though
basic-block vectorization (knowing the size of the loads) could try anyway.
But that needs some re-org of the analysis.
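
For reference, the result wanted for the two testcases amounts to a single integer-sized load and store each. A hand-written sketch of that, assuming x86 type sizes (2-byte unsigned short); the `_merged` names are made up for this example, and `memcpy` with a constant size is used so the access is well defined regardless of alignment:

```c
#include <string.h>

unsigned short a[2], b[2];
unsigned char x[4], y[4];

/* Hand-written equivalent of foo: one 4-byte (SImode-sized) copy
   instead of two 2-byte copies.  Hypothetical name.  */
void foo_merged (void)
{
  memcpy (a, b, sizeof a);
}

/* Hand-written equivalent of bar: one 4-byte copy instead of four
   1-byte copies.  Hypothetical name.  */
void bar_merged (void)
{
  memcpy (x, y, sizeof x);
}
```

Compilers typically expand such a fixed-size `memcpy` inline, so this is a convenient baseline to compare the vectorizer's output against.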
* [Bug tree-optimization/64745] Generic vectorization missed opportunities
From: rguenth at gcc dot gnu.org @ 2015-01-23 12:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64745
Richard Biener <rguenth at gcc dot gnu.org> changed:
            What|Removed                      |Added
----------------------------------------------------------------------------
          Status|UNCONFIRMED                  |ASSIGNED
Last reconfirmed|                             |2015-01-23
        Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org
  Ever confirmed|0                            |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Mine. The alignment issue is easily fixed (I have a patch); the cost model
issue is, well, a cost model issue, and also easily fixed.
A big required change is to re-structure basic-block vectorization to
perform SLP analysis independent of vector types/sizes and to vectorize
independent SLP instances separately (allowing different vector
sizes in a BB).
Loop vectorization could also do SLP analysis first (basically splitting it) to
reduce the number of applicable vectorization factors. Other analysis phases
could also contribute to that, and it would also help compile time not to
re-do dataref and dependence analysis for each size.
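
As a rough illustration of picking a size per SLP instance from the accesses themselves, rather than from a fixed vector size, the helper below maps a group of adjacent accesses to a single integer access width. It is a sketch under stated assumptions: the function name and parameters are hypothetical and do not correspond to any existing vectorizer interface.

```c
#include <stddef.h>

/* Hypothetical helper: for a group of COUNT adjacent accesses of
   ELEM_BYTES each, return the width in bytes of one integer access
   covering the whole group (2 = HImode, 4 = SImode, 8 = DImode),
   or 0 if the group does not fit such an access of at most
   MAX_WORD_BYTES.  */
size_t group_access_bytes (size_t elem_bytes, size_t count,
                           size_t max_word_bytes)
{
  size_t total = elem_bytes * count;

  /* A single access is not a group, and the target word size
     bounds what can be combined.  */
  if (count < 2 || total > max_word_bytes)
    return 0;

  if (total == 2 || total == 4 || total == 8)
    return total;

  return 0;
}
```

Under this model both testcases from the report map to a 4-byte access, while a group of two 4-byte elements maps to 8 bytes only when the word limit is 8 (as on x86_64).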
* [Bug tree-optimization/64745] Generic vectorization missed opportunities
From: pinskia at gcc dot gnu.org @ 2021-12-26 22:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64745
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
    What|Removed |Added
----------------------------------------------------------------------------
Severity|normal  |enhancement
* [Bug tree-optimization/64745] Generic vectorization missed opportunities
From: pinskia at gcc dot gnu.org @ 2021-12-26 22:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64745
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
            What|Removed |Added
----------------------------------------------------------------------------
          Status|ASSIGNED|RESOLVED
   Known to work|        |10.1.0
      Resolution|---     |FIXED
   Known to fail|        |6.1.0, 8.1.0, 9.1.0
Target Milestone|---     |10.0
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Both testcases are now caught by SLP in GCC 10+.
Note that store merging is able to catch them in GCC 8+ too.
So closing as fixed in GCC 10 for the SLP part of the bug.