public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/56612] New: basic-block vectorization does not replace all scalar uses
@ 2013-03-13 10:50 rguenth at gcc dot gnu.org
2013-03-13 11:10 ` [Bug tree-optimization/56612] " rguenth at gcc dot gnu.org
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-03-13 10:50 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56612
Bug #: 56612
Summary: basic-block vectorization does not replace all scalar uses
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: rguenth@gcc.gnu.org
When vectorizing stmts in a basic-block we do not verify that the SLP
instance covers all uses of the definitions of the stmts in the SLP tree.
This can easily result in both the scalar and the vectorized set of stmts
being kept live and executed.
See PR56608 for an example (trivial re-use of the SLP root's stored values).
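A minimal sketch of the kind of block described here. This is a made-up illustration, not the PR56608 testcase; the names sum_and_store, a, b and c are hypothetical. The two stores form a candidate SLP root, while the return statement uses the same scalar definitions outside the SLP tree.

```c
/* Hypothetical illustration only, not taken from PR56608.  */
double a[2], b[2], c[2];

double sum_and_store (void)
{
  double t0 = a[0] + b[0];
  double t1 = a[1] + b[1];
  c[0] = t0;        /* the two adjacent stores can seed an SLP instance */
  c[1] = t1;
  return t0 * t1;   /* scalar uses of t0 and t1 not covered by that instance */
}
```

If the adds and stores are vectorized while the return statement keeps using t0 and t1, the scalar adds stay live next to the vector add.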
* [Bug tree-optimization/56612] basic-block vectorization does not replace all scalar uses
From: rguenth at gcc dot gnu.org @ 2013-03-13 11:10 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56612
Richard Biener <rguenth at gcc dot gnu.org> changed:
            What|Removed                      |Added
----------------------------------------------------------------------------
          Status|UNCONFIRMED                  |ASSIGNED
Last reconfirmed|                             |2013-03-13
      AssignedTo|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org
  Ever Confirmed|0                            |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> 2013-03-13 11:09:55 UTC ---
The choice is either to not vectorize or to replace the remaining scalar
uses with vector extracts.
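To make the second option concrete, here is a hand-written sketch of what "vector extracts" for the remaining scalar uses could look like at the source level. It uses the GCC/Clang vector_size extension; the names v2df, p, q, out and sum_and_store_vec are hypothetical, and this only mirrors the idea, not the code GCC actually emits.

```c
#include <string.h>

/* GCC/Clang vector extension: two doubles in one 16-byte vector.  */
typedef double v2df __attribute__ ((vector_size (16)));

double p[2], q[2], out[2];

double sum_and_store_vec (void)
{
  v2df vp, vq, vt;
  memcpy (&vp, p, sizeof vp);     /* vector load of p[0..1] */
  memcpy (&vq, q, sizeof vq);     /* vector load of q[0..1] */
  vt = vp + vq;                   /* one vector add replaces two scalar adds */
  memcpy (out, &vt, sizeof vt);   /* vector store to out[0..1] */
  return vt[0] * vt[1];           /* remaining scalar uses become lane extracts */
}
```

With the extracts in place, no scalar add needs to stay live for the return value.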
* [Bug tree-optimization/56612] basic-block vectorization does not replace all scalar uses
From: rguenth at gcc dot gnu.org @ 2023-08-09 9:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56612
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
We now try hard to generate lane extracts for those uses, but when we still
fail (and know so during analysis; there's some support for "late" fails) we
try to adjust the costing for this.
double x[1024];
double y;
double foo ()
{
  y = x[1];
  double r = x[0];
  return r + x[1] + x[2];
}
is currently not handled, for example (detected during analysis), because
of a ???, since the use in y = x[1] comes before the last scalar stmt in
the SLP node (r = x[0]), despite us emitting vector loads before the
first scalar stmt (last is correct for any other stmt, but the story to
compute the insert location is really complicated).
Swapping y = x[1] and r = x[0] creates a lane extract as requested.
vectorizable_live_operation has fallback code that refrains from replacing
some uses; avoiding that should be priority one.
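For reference, the swapped ordering mentioned above, spelled out. Only the function name foo_swapped is made up here; the two statements are simply exchanged, so the use in y = x[1] now comes after r = x[0].

```c
double x[1024];
double y;

/* Same computation as the testcase, with the first two statements swapped.  */
double foo_swapped (void)
{
  double r = x[0];
  y = x[1];                 /* this use now follows the last scalar stmt of the load node */
  return r + x[1] + x[2];
}
```

Both orderings compute the same values; only the position of the scalar use relative to the loads differs.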
* [Bug tree-optimization/56612] basic-block vectorization does not replace all scalar uses
From: rguenth at gcc dot gnu.org @ 2023-08-09 13:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56612
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Another example is derived from gcc.dg/vect/bb-slp-46.c (which has 'int'
instead of 'unsigned int'):
unsigned int a[4], b[4];
unsigned int foo ()
{
  unsigned int tem0 = a[0] + b[0];
  unsigned int temx = tem0 * 17; /* this fails without a real need */
  unsigned int tem1 = a[1] + b[1];
  unsigned int tem2 = a[2] + b[2];
  unsigned int tem3 = a[3] + b[3];
  unsigned int temy = tem3 * 13;
  a[0] = tem0;
  a[1] = tem1;
  a[2] = tem2;
  a[3] = tem3;
  return temx + temy;
}
Here we first build an SLP instance for the temx + temy reduction and mark
the stmts as PURE_SLP, but then during stmt analysis figure we cannot do the
multiplication and remove the instance. That causes the liveness analysis
to think temx and temy are vectorized when they are in fact not.
Note that even if they were vectorized we'd vectorize the loads twice,
since we do not consider "live" lanes between instances (but we'd at least
cost them together). That's an artifact of how our SLP building works.
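As a rough source-level picture of the desired outcome for this testcase, the sketch below does the loads and the add once as vectors and feeds temx and temy from lanes 0 and 3. It relies on the GCC/Clang vector_size extension; v4su and foo_vec are hypothetical names, and this is a hand-written approximation, not what the vectorizer produces.

```c
#include <string.h>

/* GCC/Clang vector extension: four unsigned ints in one 16-byte vector.  */
typedef unsigned int v4su __attribute__ ((vector_size (16)));

unsigned int a[4], b[4];

unsigned int foo_vec (void)
{
  v4su va, vb, vt;
  memcpy (&va, a, sizeof va);   /* a[0..3] loaded once */
  memcpy (&vb, b, sizeof vb);   /* b[0..3] loaded once */
  vt = va + vb;                 /* tem0..tem3 as a single vector add */
  memcpy (a, &vt, sizeof vt);   /* the four stores as one vector store */
  /* temx and temy use lanes 0 and 3 of the shared vector result.  */
  return vt[0] * 17 + vt[3] * 13;
}
```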