public inbox for gcc-bugs@sourceware.org
* [Bug target/110456] New: vectorization with loop masking prone to STLF issues
@ 2023-06-28 12:53 rguenth at gcc dot gnu.org
From: rguenth at gcc dot gnu.org @ 2023-06-28 12:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110456
Bug ID: 110456
Summary: vectorization with loop masking prone to STLF issues
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---

void __attribute__((noipa))
test (double * __restrict a, double *b, int n, int m)
{
  for (int j = 0; j < m; ++j)
    for (int i = 0; i < n; ++i)
      a[i + j*n] = a[i + j*n /* + 512 */] + b[i + j*n];
}

double a[1024];
double b[1024];

int main(int argc, char **argv)
{
  int m = atoi (argv[1]);
  for (long i = 0; i < 1000000000; ++i)
    test (a + 4, b + 4, 4, m);
}

This shows that when loop masking is applied with
--param vect-partial-vector-usage, masked stores generally prevent
store-to-load forwarding (STLF), especially when there is only a partial
overlap with a following load, as happens when traversing a
multi-dimensional array as above. The testcase runs noticeably slower
than when the loads are offset (uncomment the /* + 512 */).
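
To make the partial overlap concrete, here is a minimal sketch that only
counts address ranges. It assumes a hypothetical vector of 8 doubles and
that a masked access occupies its full vector width regardless of the
mask; whether real hardware treats inactive lanes this way is
microarchitecture-specific. The names footprint, overlap and
store_load_overlap are made up for illustration and are not GCC internals.

```c
#include <assert.h>

/* Half-open range [lo, hi) of array elements touched by one vector access.  */
struct range { long lo, hi; };

/* Full-width footprint of a vector access starting at element START.
   Assumption: the footprint ignores which lanes the mask enables.  */
static struct range
footprint (long start, int vf)
{
  assert (vf > 0);
  struct range r = { start, start + vf };
  return r;
}

/* Number of elements shared by two footprints.  */
static long
overlap (struct range a, struct range b)
{
  long lo = a.lo > b.lo ? a.lo : b.lo;
  long hi = a.hi < b.hi ? a.hi : b.hi;
  return hi > lo ? hi - lo : 0;
}

/* Elements shared by the masked store of row J and the masked load of
   a[] in row J + 1.  LOAD_OFFSET is 0 for the testcase as written and
   512 with the commented-out offset enabled.  */
static long
store_load_overlap (int n, int vf, long j, long load_offset)
{
  struct range st = footprint (j * n, vf);
  struct range ld = footprint ((j + 1) * n + load_offset, vf);
  return overlap (st, ld);
}
```

Under these assumptions, with n = 4 and 8 lanes the store of one row and
the load of the next share 4 elements, while the 512-element offset
leaves the two footprints disjoint.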
The situation is difficult to avoid in general, but there might be easy
heuristics that could be implemented, such as avoiding loop masking when
a loop contains a read-modify-write operation on the same memory
location (with or without an immediately visible outer loop). For
unknown dependences, and thus runtime disambiguation, a sufficient
distance between any read and write operation could be ensured as well.
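
The two heuristics could be sketched roughly as follows. This is only an
illustration: struct mem_ref, has_read_modify_write and far_enough_apart
are hypothetical names, not part of the GCC vectorizer, and a real
implementation would work on the vectorizer's own data-reference
analysis rather than on a flat array like this. The minimum distance is
left as a parameter because the report does not name a value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of one memory reference in a loop body.  */
struct mem_ref
{
  int base;       /* id of the underlying array or pointer */
  long offset;    /* constant element offset relative to the loop index */
  bool is_store;  /* true for a store, false for a load */
};

/* Return true if some store and some load use the same base and offset,
   i.e. the loop body reads and then writes one memory location.  A
   caller could use this to prefer an unmasked loop.  */
static bool
has_read_modify_write (const struct mem_ref *refs, size_t count)
{
  assert (refs != NULL || count == 0);
  for (size_t i = 0; i < count; ++i)
    for (size_t k = 0; k < count; ++k)
      if (refs[i].is_store && !refs[k].is_store
          && refs[i].base == refs[k].base
          && refs[i].offset == refs[k].offset)
        return true;
  return false;
}

/* Runtime check for the unknown-dependence case: return true if the two
   addresses are at least MIN_DISTANCE bytes apart.  */
static bool
far_enough_apart (const void *load_addr, const void *store_addr,
                  size_t min_distance)
{
  uintptr_t l = (uintptr_t) load_addr;
  uintptr_t s = (uintptr_t) store_addr;
  uintptr_t d = l > s ? l - s : s - l;
  return d >= min_distance;
}
```

For the testcase above, the load and the store of a[i + j*n] share base
and offset, so the first check would fire; with the 512-element offset
applied to the load it would not.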
* [Bug target/110456] vectorization with loop masking prone to STLF issues
From: rguenth at gcc dot gnu.org @ 2023-06-28 13:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110456
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Correction: the testcase should look like this:

#include <stdlib.h>  /* atoi */

void __attribute__((noipa))
test (double * __restrict a, double *b, int n, int m)
{
  for (int j = 0; j < m; ++j)
    for (int i = 0; i < n; ++i)
      a[i + j*n] = a[i + j*n /* + 512 */] + b[i + j*n];
}

double a[1024];
double b[1024];

int main(int argc, char **argv)
{
  int m = atoi (argv[1]);
  for (long i = 0; i < m; ++i)
    test (a + 4, b + 4, 4, 1024/4);
  return 0;
}