public inbox for gcc-bugs@sourceware.org
* [Bug target/110456] New: vectorization with loop masking prone to STLF issues
@ 2023-06-28 12:53 rguenth at gcc dot gnu.org
  2023-06-28 13:25 ` [Bug target/110456] " rguenth at gcc dot gnu.org
  0 siblings, 1 reply; 2+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-28 12:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110456

            Bug ID: 110456
           Summary: vectorization with loop masking prone to STLF issues
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

#include <stdlib.h>	/* atoi */

void __attribute__((noipa))
test (double * __restrict a, double *b, int n, int m)
{
  for (int j = 0; j < m; ++j)
    for (int i = 0; i < n; ++i)
      a[i + j*n] = a[i + j*n /* + 512 */] + b[i + j*n];
}

double a[1024];
double b[1024]; 

int main(int argc, char **argv)
{
  int m = atoi (argv[1]);
  for (long i = 0; i < 1000000000; ++i)
    test (a + 4, b + 4, 4, m);
}


This shows that when we apply loop masking with --param
vect-partial-vector-usage, masked stores will generally prohibit
store-to-load forwarding (STLF), especially when a following load only
partially overlaps the store, as happens when traversing a
multi-dimensional array as above.  The above runs noticeably slower
than when the loads are offset (uncomment the /* + 512 */).
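
To make the partial overlap concrete, here is a small sketch that only
does address arithmetic for the loop above.  It assumes 8 doubles per
vector (512-bit vectors) and that a masked access is tracked at its
full vector width; neither is stated in the report, and the helper
names are made up for illustration.

```c
#include <stddef.h>

/* Assumption (not from the report): 512-bit vectors, 8 doubles each.  */
enum { VEC_LANES = 8 };

/* Half-open byte range [begin, end) relative to the start of a[].  */
struct byte_range { size_t begin, end; };

/* Full-width footprint of one vector access starting at element
   FIRST, ignoring the mask.  */
static struct byte_range
vector_footprint (size_t first)
{
  struct byte_range r = { first * sizeof (double),
			  (first + VEC_LANES) * sizeof (double) };
  return r;
}

/* Nonzero if the two ranges share at least one byte.  */
static int
ranges_overlap (struct byte_range x, struct byte_range y)
{
  return x.begin < y.end && y.begin < x.end;
}

/* Row J stores to a[J*N ...]; row J + 1 then loads from
   a[(J+1)*N + LOAD_OFFSET ...].  Returns nonzero if the footprint of
   that load overlaps the footprint of the preceding masked store.  */
static int
next_load_overlaps_store (size_t j, size_t n, size_t load_offset)
{
  struct byte_range store = vector_footprint (j * n);
  struct byte_range load = vector_footprint ((j + 1) * n + load_offset);
  return ranges_overlap (store, load);
}
```

With n = 4 and no offset the footprints overlap by half a vector,
while with the + 512 offset, or with n equal to the assumed vector
length, they are disjoint.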

The situation is difficult to avoid in general, but there might be
easy heuristics that could be implemented, such as avoiding loop
masking when a loop contains a read-modify-write operation on the same
memory location (with or without an immediately visible outer loop).
For unknown dependences, and thus runtime disambiguation, a sufficient
distance between any read and write operation could be ensured as well.
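
As a rough illustration of the runtime-distance idea, the check below
is the kind of test a versioned loop could perform before choosing the
masked variant.  This is only a sketch: the helper name and the
min_bytes threshold are hypothetical, and it does not reflect what the
vectorizer actually emits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero when the write stream WR and the read
   stream RD start at least MIN_BYTES apart, in either direction.
   MIN_BYTES would be chosen by the caller, e.g. one vector width.  */
static int
streams_far_apart (const void *wr, const void *rd, size_t min_bytes)
{
  uintptr_t w = (uintptr_t) wr;
  uintptr_t r = (uintptr_t) rd;
  uintptr_t dist = w > r ? w - r : r - w;
  return dist >= min_bytes;
}
```

A caller would take the masked loop only when the check succeeds and
fall back to an unmasked loop otherwise.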


* [Bug target/110456] vectorization with loop masking prone to STLF issues
  2023-06-28 12:53 [Bug target/110456] New: vectorization with loop masking prone to STLF issues rguenth at gcc dot gnu.org
@ 2023-06-28 13:25 ` rguenth at gcc dot gnu.org
  0 siblings, 0 replies; 2+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-28 13:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110456

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Correction, the testcase should look like

#include <stdlib.h>	/* atoi */

void __attribute__((noipa))
test (double * __restrict a, double *b, int n, int m)
{
  for (int j = 0; j < m; ++j)
    for (int i = 0; i < n; ++i)
      a[i + j*n] = a[i + j*n /* + 512 */] + b[i + j*n];
}

double a[1024];
double b[1024];

int main(int argc, char **argv)
{
  int m = atoi (argv[1]);
  for (long i = 0; i < m; ++i)
    test (a + 4, b + 4, 4, 1024/4);
  return 0;
}

