* [Bug tree-optimization/50789] New: Gather vectorization
@ 2011-10-19  7:51 jakub at gcc dot gnu.org
  2011-10-19  8:10 ` [Bug tree-optimization/50789] " rguenth at gcc dot gnu.org
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-10-19  7:51 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50789

             Bug #: 50789
           Summary: Gather vectorization
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: jakub@gcc.gnu.org
                CC: hjl.tools@gmail.com, irar@gcc.gnu.org,
                    kirill.yukhin@intel.com


This is to track progress on vectorization using AVX2 v*gather* instructions.

The instructions allow a plain unconditional gather, e.g.:
#define N 1024
float f[N];
int k[N];
float *l[N];
int **m[N];

float
f1 (void)
{
  int i;
  float g = 0.0;
  for (i = 0; i < N; i++)
    g += f[k[i]];
  return g;
}

float
f2 (float *p)
{
  int i;
  float g = 0.0;
  for (i = 0; i < N; i++)
    g += p[k[i]];
  return g;
}

float
f3 (void)
{
  int i;
  float g = 0.0;
  for (i = 0; i < N; i++)
    g += *l[i];
  return g;
}

int
f4 (void)
{
  int i;
  int g = 0;
  for (i = 0; i < N; i++)
    g += **m[i];
  return g;
}

The vectorizer should be able to vectorize all 4 loops.  In f1/f2 it would use
a non-zero base (the vector would contain just indexes into some array, which
vgather sign-extends and adds to the base); in f3/f4 it would use a zero base,
and the vectors would be vectors of pointers (resp. uintptr_t).
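
To make the two addressing forms concrete, here is a rough scalar emulation of
what a vectorized f2 (non-zero base, vector of indexes) and f3 (zero base,
vector of pointers) would compute.  This is only a sketch: the lane count VL
and the helper names gather_ps_emul, sum_indexed and sum_pointers are made up
for illustration, it assumes the trip count is a multiple of VL, and it is not
what the vectorizer would actually emit.  Note that the per-lane accumulators
reassociate the float reduction, which a real vectorizer may only do when the
options in effect permit it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VL 8 /* assumed lane count: 8 floats per 256-bit vector */

/* Emulates one unconditional gather with a non-zero base: every lane loads
   from base + sign-extended index (scaled by the element size).  */
static void
gather_ps_emul (float dst[VL], const float *base, const int32_t idx[VL])
{
  for (int lane = 0; lane < VL; lane++)
    dst[lane] = base[(ptrdiff_t) idx[lane]];
}

/* f1/f2 style loop: g += p[k[i]], processed VL elements at a time.  */
float
sum_indexed (const float *p, const int32_t *k, size_t n)
{
  float acc[VL] = { 0 };
  float v[VL];

  assert (n % VL == 0); /* sketch only: no epilogue loop */
  for (size_t i = 0; i < n; i += VL)
    {
      gather_ps_emul (v, p, &k[i]);
      for (int lane = 0; lane < VL; lane++)
        acc[lane] += v[lane];
    }

  /* Final horizontal reduction of the per-lane partial sums.  */
  float g = 0.0f;
  for (int lane = 0; lane < VL; lane++)
    g += acc[lane];
  return g;
}

/* f3 style loop: g += *l[i].  The "index" vector holds full pointers and
   the base is zero, so each lane simply dereferences its own element.  */
float
sum_pointers (float *const *l, size_t n)
{
  float acc[VL] = { 0 };

  assert (n % VL == 0);
  for (size_t i = 0; i < n; i += VL)
    for (int lane = 0; lane < VL; lane++)
      acc[lane] += *l[i + lane];

  float g = 0.0f;
  for (int lane = 0; lane < VL; lane++)
    g += acc[lane];
  return g;
}
```

With small integer-valued inputs both functions return exactly the same sum as
the scalar loops above; in general the reordered float additions can round
differently.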

To vectorize the above I'm afraid we'd need to modify tree-data-ref.c as well
as tree-vect-data-ref.c, because the memory accesses aren't affine and
dr_analyze_innermost already gives up on those and doesn't fill in any of the
DR_* fields.  Perhaps with some flag, and when the base resp. offset has a vdef
in the same loop, we could mark it somehow and at least fill in the other
fields.  It would probably make alias decisions (in tree-vect-data-ref.c?)
harder.  Any ideas?

It is additionally possible to make loads conditional, whether they are affine
or not.  So something like:
for (i = 0; i < N; i++)
  {
    c = 6;
    if (a[i] > 24)
      c = b[i];
    d[i] = c + e[i];
  }
would cover the affine conditional accesses: the index vector could be just
{ 0, 1, 2, 3, ... }, with the mask taken from the comparison.
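
As a rough illustration of that idea, the conditional loop can be emulated in
scalar C as a masked gather whose index vector is the identity and whose mask
is the result of the comparison.  This is only a sketch under assumptions: the
arrays are taken to be int (the snippet above does not declare them), the trip
count is assumed to be a multiple of VL, and mask_gather_emul, cond_load_vec
and cond_load_scalar are hypothetical names, not anything in GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VL 8 /* assumed lane count: 8 ints per 256-bit vector */

/* Emulates a masked gather: lanes whose mask is set load base[idx[lane]],
   the remaining lanes keep the value from src and touch no memory.  */
static void
mask_gather_emul (int32_t dst[VL], const int32_t src[VL],
                  const int32_t *base, const int32_t idx[VL],
                  const int mask[VL])
{
  for (int lane = 0; lane < VL; lane++)
    dst[lane] = mask[lane] ? base[idx[lane]] : src[lane];
}

/* Vector-style version of the conditional loop.  */
void
cond_load_vec (int32_t *d, const int32_t *a, const int32_t *b,
               const int32_t *e, size_t n)
{
  assert (n % VL == 0); /* sketch only: no epilogue loop */
  for (size_t i = 0; i < n; i += VL)
    {
      int32_t idx[VL], src[VL], c[VL];
      int mask[VL];

      for (int lane = 0; lane < VL; lane++)
        {
          idx[lane] = lane;              /* affine access: { 0, 1, 2, ... } */
          src[lane] = 6;                 /* value used when the mask is off */
          mask[lane] = a[i + lane] > 24; /* mask from the comparison */
        }
      mask_gather_emul (c, src, &b[i], idx, mask);
      for (int lane = 0; lane < VL; lane++)
        d[i + lane] = c[lane] + e[i + lane];
    }
}

/* Scalar reference, same semantics as the loop in the report.  */
void
cond_load_scalar (int32_t *d, const int32_t *a, const int32_t *b,
                  const int32_t *e, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      int32_t c = 6;
      if (a[i] > 24)
        c = b[i];
      d[i] = c + e[i];
    }
}
```

The point of the mask is that lanes where the condition is false never read
b[], which matches the scalar loop and matters if b[i] may not be safe to
load for those iterations.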


Thread overview: 14+ messages
2011-10-19  7:51 [Bug tree-optimization/50789] New: Gather vectorization jakub at gcc dot gnu.org
2011-10-19  8:10 ` [Bug tree-optimization/50789] " rguenth at gcc dot gnu.org
2011-10-19  8:49 ` irar at il dot ibm.com
2011-10-19  9:03 ` jakub at gcc dot gnu.org
2011-10-19  9:40 ` irar at il dot ibm.com
2011-10-24  8:40 ` jakub at gcc dot gnu.org
2011-10-25 21:17 ` jakub at gcc dot gnu.org
2011-10-26 16:57 ` jakub at gcc dot gnu.org
2011-11-07 16:02 ` jakub at gcc dot gnu.org
2011-11-08 13:26 ` jakub at gcc dot gnu.org
2013-04-02 16:50 ` vincenzo.innocente at cern dot ch
2013-04-17  8:31 ` andrey.turetskiy at gmail dot com
2013-04-17  8:53 ` rguenther at suse dot de
2013-07-03  9:22 ` vincenzo.innocente at cern dot ch
