public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/63202] New: tree vectorizer does not make use of alignment information from VRP/CCP
@ 2014-09-08  4:05 andi-gcc at firstfloor dot org
  2014-09-08  7:36 ` [Bug tree-optimization/63202] " jakub at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: andi-gcc at firstfloor dot org @ 2014-09-08  4:05 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202

            Bug ID: 63202
           Summary: tree vectorizer does not make use of alignment
                    information from VRP/CCP
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andi-gcc at firstfloor dot org

#include <stdint.h>     /* uintptr_t */

char b[100];

void alignment(int *p)
{
        /* Promise the compiler that p is 16-byte aligned. */
        if ((uintptr_t)p & 15) __builtin_unreachable();
        int i;
        for (i = 0; i < 64; i++)
                b[i] = p[i] ^ 0x1f;
}

Compiling with -O3 results in:

        leaq    256(%rdi), %rax
        cmpq    $b, %rax
        jbe     .L9
        cmpq    $b+64, %rdi
        jb      .L5
.L9:
        movdqu  (%rdi), %xmm0
        movdqu  16(%rdi), %xmm2
        movdqa  %xmm0, %xmm1
        punpcklwd       %xmm2, %xmm0
        movdqu  48(%rdi), %xmm3
        punpckhwd       %xmm2, %xmm1
        movdqu  112(%rdi), %xmm4
...

.L5:
        xorl    %eax, %eax
        .p2align 4,,10
        .p2align 3
.L8:
        movzbl  (%rdi,%rax,4), %edx
        addq    $1, %rax
        xorl    $31, %edx
        movb    %dl, b-1(%rax)
        cmpq    $64, %rax
        jne     .L8
        rep ret


The extra loop for the unaligned case shouldn't be needed, because VRP or CCP
can prove from the __builtin_unreachable test that the pointer is always
aligned.

vrp1 doesn't handle this:

p_3(D): VARYING
p.0_4: [0, +INF]

It is only known in vrp2, which is too late for the vectorizer?

p_1: ~[0B, 0B]  EQUIVALENCES: { p_3(D) } (1 elements)

Also the vectorizer uses a different variable which does not inherit the known
alignment:

 <bb 2>:
  p.0_4 = (long unsigned int) p_3(D);
  _5 = p.0_4 & 15;
  if (_5 != 0)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  __builtin_unreachable ();

p.0_4 has an unknown range again:

p.0_4: [0, +INF]

Fixing this would allow implementing an __assume() macro that behaves
similarly to the one in VC++.
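
A minimal sketch of such a macro, built on the same __builtin_unreachable
pattern as the test case. The name ASSUME is an assumption made for
illustration; it is not an existing GCC facility. Unlike a true assumption
primitive, the condition here is an ordinary expression, so it should be free
of side effects. As this report shows, the vectorizer may not pick up the
alignment fact from it.

```c
#include <stdint.h>

/* Hypothetical ASSUME() macro (the name is an assumption, not a GCC
   facility): if the condition is false, control reaches
   __builtin_unreachable(), so the optimizer may treat it as true. */
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)

char b[100];

void alignment(int *p)
{
        /* Same promise as the test case: p is 16-byte aligned. */
        ASSUME(((uintptr_t)p & 15) == 0);
        for (int i = 0; i < 64; i++)
                b[i] = p[i] ^ 0x1f;
}
```

Calling alignment() with a pointer that is not 16-byte aligned is undefined
behavior, exactly as with the open-coded test.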



* [Bug tree-optimization/63202] tree vectorizer does not make use of alignment information from VRP/CCP
From: jakub at gcc dot gnu.org @ 2014-09-08  7:36 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I guess the cast prevents this from being handled by maybe_set_nonzero_bits;
I guess it could be handled there.

That said, it is extremely fragile, because we insert the range and non-zero
bits info on SSA_NAMEs and have this single exception for function parameters
if they aren't used anywhere before the __builtin_unreachable check.  As soon
as e.g. the function is inlined, there might be more uses and the info can be
lost.
Richard didn't want to disable forward propagation if some SSA_NAME holds a
useful range info which the to-be-propagated SSA_NAME does not hold (in that
case, we'd keep a new SSA_NAME with the more precise range/non-zero info around
and be able to stick it somewhere).
The reason why we have __builtin_assume_aligned defined the way it is is that
there is always an SSA_NAME to stick that info to, and it is clear in which
part of the function the condition is true.
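
For comparison, a sketch of the test case written with
__builtin_assume_aligned. The function name alignment_aa is made up for
illustration. The builtin returns its first argument, and the alignment fact
is attached to the returned pointer, so the loop has to use q rather than p.

```c
char b[100];

/* alignment_aa is a hypothetical name for this variant of the test case. */
void alignment_aa(int *p)
{
        /* The builtin returns p; only the result q carries the promise
           that the address is 16-byte aligned. */
        int *q = __builtin_assume_aligned(p, 16);
        for (int i = 0; i < 64; i++)
                b[i] = q[i] ^ 0x1f;
}
```

Because q is a fresh value defined at the point of the call, the information
has an obvious place to live and an obvious region where it holds.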



* [Bug tree-optimization/63202] tree vectorizer does not make use of alignment information from VRP/CCP
From: rguenth at gcc dot gnu.org @ 2014-09-08  8:38 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, as with restrict it would be nice to be able to annotate the memory
references themselves with alignment info.

Btw, a possibility would be to insert assume_aligned calls into the IL
from the

 if (p & 15)
   __builtin_unreachable ();

pattern and remove the test & __builtin_unreachable ().

Of course that is quite special and breaks down for
assume (!(p & 15) && a == b).

As Jakub said, the testcase can be handled with the existing code as
there is no use of p before the conditional.

Note that there isn't an extra loop for the "unaligned" case but
the extra loop is for the case where there is aliasing between
p and b.

But yes, we fail to use aligned loads here (but movdqu doesn't have a
penalty for that).
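
To make the aliasing point concrete, here is a sketch in C of what the two
comparisons at the top of the -O3 listing appear to compute. The helper name
no_overlap is hypothetical, and the byte counts assume a 4-byte int as on
x86-64. The vectorized path is taken when the 256 bytes read through p and
the 64 bytes written to b cannot overlap; otherwise the scalar loop runs.

```c
#include <stdbool.h>
#include <stdint.h>

char b[100];

/* Hypothetical helper mirroring the runtime check in the listing:
   true when [p, p + 64 ints) and [b, b + 64) are disjoint. */
bool no_overlap(const int *p)
{
        uintptr_t lo = (uintptr_t)p;
        uintptr_t hi = lo + 64 * sizeof(int);   /* leaq 256(%rdi), %rax */
        uintptr_t b_lo = (uintptr_t)b;

        /* cmpq $b, %rax; jbe .L9   and   cmpq $b+64, %rdi; jb .L5 */
        return hi <= b_lo || lo >= b_lo + 64;
}
```

Alignment of p plays no part in this check, which is why the alignment
promise cannot remove the second loop.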



* [Bug tree-optimization/63202] tree vectorizer does not make use of alignment information from VRP/CCP
From: andi-gcc at firstfloor dot org @ 2014-09-08 17:49 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202

--- Comment #3 from Andi Kleen <andi-gcc at firstfloor dot org> ---
I'm not sure rewriting the pattern to assume_aligned would be useful. After all
the user could already use assume_aligned directly.

I was more thinking of cases when VRP/CCP can prove alignment in other ways
from the code, and the vectorizer should use that.

Good point that the fallback is not for misalignment. I should probably use a
fancier test case where misalignment matters for the cost model.

One interesting case is avoiding the need for tail code when the iteration
count is not known to be a multiple of the vector length.
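
A sketch of what such a tail-code case could look like. The function xor_n,
the ASSUME macro and the factor 16 are assumptions chosen for illustration;
restrict is used only to keep aliasing out of the picture. Whether a given
GCC version actually drops the scalar epilogue from this hint is not
guaranteed.

```c
/* Hypothetical ASSUME() macro, same pattern as the original test case. */
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)

/* xor_n is a made-up example: n is promised to be a multiple of 16, so a
   vectorized loop handling 16 elements per iteration needs no tail. */
void xor_n(char *restrict dst, const int *restrict src, unsigned n)
{
        ASSUME(n % 16 == 0);
        for (unsigned i = 0; i < n; i++)
                dst[i] = src[i] ^ 0x1f;
}
```

Here the useful fact is about the trip count rather than a pointer, so
__builtin_assume_aligned cannot express it.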



* [Bug tree-optimization/63202] tree vectorizer does not make use of alignment information from VRP/CCP
From: pinskia at gcc dot gnu.org @ 2021-07-20  7:37 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2021-07-20
             Status|UNCONFIRMED                 |NEW

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.

Still happens on the trunk.
