public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/49955] New: Fails to do partial basic-block SLP
@ 2011-08-03  9:08 rguenth at gcc dot gnu.org
  2011-08-03 15:17 ` [Bug tree-optimization/49955] " rguenth at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-08-03  9:08 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49955

           Summary: Fails to do partial basic-block SLP
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: rguenth@gcc.gnu.org
                CC: irar@gcc.gnu.org


410.bwaves in shell_lam.f has a lot of arrays with inner dimension 5 operated
on in loops that are either unrolled by early unrolling or manually unrolled
in source.  All but one loop in shell_lam.f are not vectorized.

One reason is that basic-block vectorization gives up if it sees
interleaving size that is not a multiple of a supported vectorization
factor.  Testcase:

double a[1024], b[1024];

void foo (int k)
{
  int j;
  a[k*5 + 0] = a[k*5 + 0] + b[k*5 + 0];
  a[k*5 + 1] = a[k*5 + 1] + b[k*5 + 1];
  a[k*5 + 2] = a[k*5 + 2] + b[k*5 + 2];
  a[k*5 + 3] = a[k*5 + 3] + b[k*5 + 3];
  a[k*5 + 4] = a[k*5 + 4] + b[k*5 + 4];
}

taken from the last loop in shell_lam.f which has its innermost loop unrolled
(and loop SLP refuses to vectorize as well, see separate bug).

For the above we get:

t.c:6: note: === vect_analyze_data_ref_accesses ===
t.c:6: note: Detected interleaving of size 5
t.c:6: note: Detected interleaving of size 5
t.c:6: note: Detected interleaving of size 5
t.c:6: note: Vectorizing an unaligned access.
t.c:6: note: Vectorizing an unaligned access.
t.c:6: note: Vectorizing an unaligned access.
t.c:6: note: === vect_analyze_slp ===
t.c:6: note: get vectype with 2 units of type double
t.c:6: note: vectype: vector(2) double
t.c:6: note: Build SLP failed: unrolling required in basic block SLP
t.c:6: note: Failed to SLP the basic block.
t.c:6: note: not vectorized: failed to find SLP opportunities in basic block.

but of course we could simply vectorize with an interleaving size of 4
leaving the excess operations unvectorized (with optimization opportunity
if we can pick a properly sized and aligned set of accesses).


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-08-08 12:38 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-03  9:08 [Bug tree-optimization/49955] New: Fails to do partial basic-block SLP rguenth at gcc dot gnu.org
2011-08-03 15:17 ` [Bug tree-optimization/49955] " rguenth at gcc dot gnu.org
2011-08-05 10:41 ` irar at il dot ibm.com
2011-08-05 10:51 ` irar at il dot ibm.com
2023-08-04 20:05 ` pinskia at gcc dot gnu.org
2023-08-07  9:10 ` rguenth at gcc dot gnu.org
2023-08-08 12:38 ` cvs-commit at gcc dot gnu.org
2023-08-08 12:38 ` rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).