public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/63202] New: tree vectorizer does not make use of alignment information from VRP/CCP
@ 2014-09-08 4:05 andi-gcc at firstfloor dot org
2014-09-08 7:36 ` [Bug tree-optimization/63202] " jakub at gcc dot gnu.org
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: andi-gcc at firstfloor dot org @ 2014-09-08 4:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202
Bug ID: 63202
Summary: tree vectorizer does not make use of alignment
information from VRP/CCP
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
#include <stdint.h>	/* uintptr_t */

char b[100];
void alignment(int *p)
{
	/* Promise the optimizer that p is 16-byte aligned. */
	if ((uintptr_t)p & 15) __builtin_unreachable();
	int i;
	for (i = 0; i < 64; i++)
		b[i] = p[i] ^ 0x1f;
}
Compiling with -O3 results in:
leaq 256(%rdi), %rax
cmpq $b, %rax
jbe .L9
cmpq $b+64, %rdi
jb .L5
.L9:
movdqu (%rdi), %xmm0
movdqu 16(%rdi), %xmm2
movdqa %xmm0, %xmm1
punpcklwd %xmm2, %xmm0
movdqu 48(%rdi), %xmm3
punpckhwd %xmm2, %xmm1
movdqu 112(%rdi), %xmm4
...
.L5:
xorl %eax, %eax
.p2align 4,,10
.p2align 3
.L8:
movzbl (%rdi,%rax,4), %edx
addq $1, %rax
xorl $31, %edx
movb %dl, b-1(%rax)
cmpq $64, %rax
jne .L8
rep ret
The extra loop for the unaligned case shouldn't be needed, because VRP or CCP
can prove from the __builtin_unreachable test that the pointer is always aligned.
vrp1 doesn't handle this
p_3(D): VARYING
p.0_4: [0, +INF]
It is only known in vrp2, which is too late for the vectorizer?
p_1: ~[0B, 0B] EQUIVALENCES: { p_3(D) } (1 elements)
Also the vectorizer uses a different variable which does not inherit the known
alignment:
<bb 2>:
p.0_4 = (long unsigned int) p_3(D);
_5 = p.0_4 & 15;
if (_5 != 0)
goto <bb 3>;
else
goto <bb 4>;
<bb 3>:
__builtin_unreachable ();
p.0_4 has an unknown range again:
p.0_4: [0, +INF]
Fixing this would allow implementing an __assume() macro that behaves similarly
to the one in VC++.
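As a rough illustration of what such a macro could look like, here is a minimal sketch. The name MY_ASSUME is hypothetical and not something GCC provides; it only wraps the same __builtin_unreachable pattern used in the testcase, so whether the vectorizer benefits from it is exactly what this report is about.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GCC: tells the optimizer that cond
   always holds. Behavior is undefined if cond is ever false. */
#define MY_ASSUME(cond) \
	do { if (!(cond)) __builtin_unreachable(); } while (0)

char b[100];

void alignment(int *p)
{
	/* Same promise as in the testcase: p is 16-byte aligned. */
	MY_ASSUME(((uintptr_t)p & 15) == 0);
	for (int i = 0; i < 64; i++)
		b[i] = p[i] ^ 0x1f;
}
```

The caller is responsible for passing a pointer that really is 16-byte aligned; otherwise the call has undefined behavior.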
* [Bug tree-optimization/63202] tree vectorizer does not make use of alignment information from VRP/CCP
From: jakub at gcc dot gnu.org @ 2014-09-08 7:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I guess the cast prevents this from being handled by maybe_set_nonzero_bits;
it could probably be handled there.
That said, it is extremely fragile, because we insert the range and non-zero
bits info on SSA_NAMEs and have this single exception for function parameters
if they aren't used anywhere before the __builtin_unreachable check. As soon
as e.g. the function is inlined, there might be more uses and the info can be
lost.
Richard didn't want to disable forward propagation when some SSA_NAME holds
useful range info which the to-be-propagated SSA_NAME does not hold (in that
case, we'd keep a new SSA_NAME with the more precise range/non-zero info around
and be able to stick it somewhere).
The reason why __builtin_assume_aligned is defined the way it is is that
there is always an SSA_NAME to stick that info to, and it is clear in which part
of the function the condition is true.
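For comparison, the testcase from the report could be written with __builtin_assume_aligned roughly as follows. This is only a sketch of the usage pattern described above: the alignment guarantee is attached to the value the builtin returns, so the loop has to use q rather than p.

```c
char b[100];

void alignment(int *p)
{
	/* The builtin returns a pointer carrying the 16-byte alignment
	   guarantee; the guarantee applies to q, not to p itself. */
	int *q = __builtin_assume_aligned(p, 16);
	for (int i = 0; i < 64; i++)
		b[i] = q[i] ^ 0x1f;
}
```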
* [Bug tree-optimization/63202] tree vectorizer does not make use of alignment information from VRP/CCP
From: rguenth at gcc dot gnu.org @ 2014-09-08 8:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, as with restrict it would be nice to be able to annotate the memory
references themselves with alignment info.
Btw, a possibility would be to insert assume_aligned calls into the IL
from the
if (p & 15)
__builtin_unreachable ();
pattern and remove the test & __builtin_unreachable ().
Of course this is quite special and breaks down for assume (!(p & 15) && a == b).
As Jakub said, the testcase can be handled with the existing code as
there is no use of p before the conditional.
Note that there isn't an extra loop for the "unaligned" case but
the extra loop is for the case where there is aliasing between
p and b.
But yes, we fail to use aligned loads here (but movdqu doesn't have a
penalty for that).
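To make the aliasing point concrete, the two compares at the top of the assembly listing can be read as the following runtime overlap check. This is a hand-written sketch; the function name ranges_disjoint is made up for illustration and does not appear in GCC or in the generated code.

```c
#include <stdbool.h>
#include <stdint.h>

char b[100];

/* Returns true when the 64 ints read starting at p cannot overlap the
   64 bytes written to b. The vectorized loop is only safe in that case;
   otherwise the scalar loop at .L5 is used. */
bool ranges_disjoint(const void *p)
{
	uintptr_t src = (uintptr_t)p;
	uintptr_t dst = (uintptr_t)b;

	return src + 64 * sizeof(int) <= dst	/* leaq 256(%rdi); cmpq $b; jbe .L9 */
	    || src >= dst + 64;			/* cmpq $b+64, %rdi; jb .L5 otherwise */
}
```

Neither compare looks at the low bits of p, which is why alignment information alone cannot remove the scalar loop.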
* [Bug tree-optimization/63202] tree vectorizer does not make use of alignment information from VRP/CCP
From: andi-gcc at firstfloor dot org @ 2014-09-08 17:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202
--- Comment #3 from Andi Kleen <andi-gcc at firstfloor dot org> ---
I'm not sure rewriting the pattern to assume_aligned would be useful. After all
the user could already use assume_aligned directly.
I was more thinking of cases when VRP/CCP can prove alignment in other ways
from the code, and the vectorizer should use that.
Good point that the fallback is not for the unaligned case. I should probably
use a fancier test case where misalignment matters for the cost model.
One interesting case is avoiding the need for tail code when the iteration
count is not known to be a multiple of the vector length.
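A sketch of the tail-code case might look like the following. The function xor_all and its parameters are hypothetical, and the multiple-of-4 promise assumes 4-int vectors; no claim is made here about which GCC versions actually drop the epilogue loop for this input.

```c
#include <stddef.h>

/* Hypothetical example: the caller promises that n is a multiple of 4,
   so in principle no scalar tail loop is needed after a loop vectorized
   with 4-int vectors. Behavior is undefined if the promise is broken. */
void xor_all(int *restrict dst, const int *restrict src, size_t n)
{
	if (n & 3)
		__builtin_unreachable();
	for (size_t i = 0; i < n; i++)
		dst[i] = src[i] ^ 0x1f;
}
```

The restrict qualifiers are there so that an aliasing fallback does not obscure the effect being tested.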
* [Bug tree-optimization/63202] tree vectorizer does not make use of alignment information from VRP/CCP
From: pinskia at gcc dot gnu.org @ 2021-07-20 7:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Ever confirmed|0 |1
Severity|normal |enhancement
Keywords| |missed-optimization
Last reconfirmed| |2021-07-20
Status|UNCONFIRMED |NEW
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.
Still happens on the trunk.