public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/56612] New: basic-block vectorization does not replace all scalar uses
@ 2013-03-13 10:50 rguenth at gcc dot gnu.org
  2013-03-13 11:10 ` [Bug tree-optimization/56612] " rguenth at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-03-13 10:50 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56612

             Bug #: 56612
           Summary: basic-block vectorization does not replace all scalar
                    uses
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: rguenth@gcc.gnu.org


When vectorizing stmts in a basic block we do not verify that the SLP
instance covers all uses of the definitions of the stmts in the SLP tree.
This can easily result in both the scalar and the vectorized set of stmts
being kept live and executed.

See PR56608 for an example (trivial re-use of the values stored by the SLP
roots).
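
A minimal sketch of the pattern described above, for illustration only. This
is not the PR56608 testcase; the function name store_and_reuse and the arrays
a, b, c are made up here. The four stores are the kind of group a basic-block
SLP instance can be rooted at, and the return statement is an extra scalar use
of t1 that lies outside that group. Whether a particular GCC version
vectorizes this and keeps the scalar add alive is not claimed.

```c
double a[4], b[4], c[4];

/* Hypothetical example: four adjacent stores plus one re-use of a stored
   value as a scalar.  */
double
store_and_reuse (void)
{
  double t0 = b[0] + c[0];
  double t1 = b[1] + c[1];
  double t2 = b[2] + c[2];
  double t3 = b[3] + c[3];
  /* Candidate SLP roots: stores to adjacent elements of a.  */
  a[0] = t0;
  a[1] = t1;
  a[2] = t2;
  a[3] = t3;
  /* Scalar use of t1 that is not covered by the store group.  If the adds
     are vectorized and this use is left alone, the scalar add computing t1
     has to stay as well.  */
  return t1 * 2.0;
}
```

The point of the sketch is only that t1 has two consumers: the store, which
is inside the SLP tree, and the multiplication, which is not.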



* [Bug tree-optimization/56612] basic-block vectorization does not replace all scalar uses
From: rguenth at gcc dot gnu.org @ 2013-03-13 11:10 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56612

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2013-03-13
         AssignedTo|unassigned at gcc dot       |rguenth at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> 2013-03-13 11:09:55 UTC ---
The choice is either to not vectorize or to replace the remaining scalar uses
with vector extracts.
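
To make the second option concrete, here is a hand-written sketch using the
GCC vector extensions (vector_size attribute and vector subscripting). It only
mimics at source level what "replace the scalar use with a vector extract"
means; it is not what the vectorizer emits. The type v2df and the function
names are assumptions made for this example.

```c
#include <string.h>

typedef double v2df __attribute__ ((vector_size (16)));

double a[2], b[2], c[2];

/* Scalar form: t1 is both stored and re-used.  */
double
scalar_version (void)
{
  double t0 = b[0] + c[0];
  double t1 = b[1] + c[1];
  a[0] = t0;
  a[1] = t1;
  return t1 * 2.0;
}

/* Hand-vectorized form: the remaining scalar use reads lane 1 of the
   vector result instead of keeping a separate scalar add alive.  */
double
lane_extract_version (void)
{
  v2df vb, vc, va;
  memcpy (&vb, b, sizeof vb);
  memcpy (&vc, c, sizeof vc);
  va = vb + vc;                 /* one vector add for both lanes */
  memcpy (a, &va, sizeof va);   /* vector store to a[0..1] */
  return va[1] * 2.0;           /* lane extract replaces scalar t1 */
}
```

Both functions compute the same result; the second one simply has no scalar
copy of the addition left.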



* [Bug tree-optimization/56612] basic-block vectorization does not replace all scalar uses
From: rguenth at gcc dot gnu.org @ 2023-08-09  9:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56612

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
We now try hard to generate lane extracts for those uses, but when we still
fail (and know so during analysis; there is some support for "late" fails)
we try to adjust the costing for this.

double x[1024];
double y;
double foo ()
{
  y = x[1];
  double r = x[0];
  return r + x[1] + x[2];
}

is currently not handled, for example (detected during analysis), because
of a ???: the use in y = x[1] is before the last scalar stmt in the SLP
node (r = x[0]), despite us emitting vector loads before the first scalar
stmt (last is correct for any other stmt, but the story to compute the
insert location is really complicated).

Swapping y = x[1] and r = x[0] creates a lane extract as requested.
vectorizable_live_operation has fallback code that refrains from replacing
some uses; avoiding that fallback should be priority one.
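
For reference, the swapped variant mentioned above written out in full. The
name foo_swapped is chosen here only to keep it apart from foo; the body is
the testcase with the first two statements exchanged.

```c
double x[1024];
double y;

/* Same as foo above, but with r = x[0] ahead of y = x[1], so the use of
   x[1] in the store to y comes after the last scalar stmt of the load
   group.  */
double
foo_swapped (void)
{
  double r = x[0];
  y = x[1];
  return r + x[1] + x[2];
}
```

One way to look at what the vectorizer did with it is to compile with
optimization and -fopt-info-vec; the exact output depends on the GCC version
and target and is not reproduced here.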


* [Bug tree-optimization/56612] basic-block vectorization does not replace all scalar uses
From: rguenth at gcc dot gnu.org @ 2023-08-09 13:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56612

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Another example is derived from gcc.dg/vect/bb-slp-46.c (which has 'int'
instead of 'unsigned int'):

unsigned int a[4], b[4];
unsigned int foo ()
{
  unsigned int tem0 = a[0] + b[0];
  unsigned int temx = tem0 * 17;  /* this fails without a real need */
  unsigned int tem1 = a[1] + b[1];
  unsigned int tem2 = a[2] + b[2];
  unsigned int tem3 = a[3] + b[3];
  unsigned int temy = tem3 * 13;
  a[0] = tem0;
  a[1] = tem1;
  a[2] = tem2;
  a[3] = tem3;
  return temx + temy;
}

Here we first build an SLP instance for the temx + temy reduction and mark
the stmts as PURE_SLP, but then during stmt analysis figure out we cannot do
the multiplication and remove the instance.  That causes the liveness
analysis to think temx and temy are vectorized when in fact they are not.
Note that even if they were vectorized we'd vectorize the loads twice, since
we do not consider "live" lanes between instances (but we'd at least cost
them together).  That's an artifact of how our SLP building works.
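
As a point of comparison, a hand-written sketch of the shape one would hope
for here: a single vector add feeding the store group, with tem0 and tem3
taken as lane extracts for the two scalar multiplications. It uses the GCC
vector extensions; the type v4su and the name foo_by_hand are assumptions for
this example, and it is not a claim about what the vectorizer produces.

```c
#include <string.h>

typedef unsigned int v4su __attribute__ ((vector_size (16)));

unsigned int a[4], b[4];

/* Hand-vectorized equivalent of foo above.  */
unsigned int
foo_by_hand (void)
{
  v4su va, vb, vt;
  memcpy (&va, a, sizeof va);
  memcpy (&vb, b, sizeof vb);
  vt = va + vb;                    /* tem0..tem3 in one vector add */
  memcpy (a, &vt, sizeof vt);      /* a[0..3] = tem0..tem3 */
  unsigned int temx = vt[0] * 17;  /* lane 0 extract instead of scalar tem0 */
  unsigned int temy = vt[3] * 13;  /* lane 3 extract instead of scalar tem3 */
  return temx + temy;
}
```

The loads and the add appear once, and the multiplications stay scalar on
the extracted lanes.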
