public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/63202] New: tree vectorizer does not make use of alignment information from VRP/CCP
@ 2014-09-08  4:05 andi-gcc at firstfloor dot org
  2014-09-08  7:36 ` [Bug tree-optimization/63202] " jakub at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: andi-gcc at firstfloor dot org @ 2014-09-08  4:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202

            Bug ID: 63202
           Summary: tree vectorizer does not make use of alignment
                    information from VRP/CCP
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andi-gcc at firstfloor dot org

#include <stdint.h>     /* uintptr_t */

char b[100];

void alignment(int *p)
{
        /* Tell the compiler that p is 16-byte aligned. */
        if ((uintptr_t)p & 15) __builtin_unreachable();
        int i;
        for (i = 0; i < 64; i++)
                b[i] = p[i] ^ 0x1f;
}

Compiling with -O3 results in:

        leaq    256(%rdi), %rax
        cmpq    $b, %rax
        jbe     .L9
        cmpq    $b+64, %rdi
        jb      .L5
.L9:
        movdqu  (%rdi), %xmm0
        movdqu  16(%rdi), %xmm2
        movdqa  %xmm0, %xmm1
        punpcklwd       %xmm2, %xmm0
        movdqu  48(%rdi), %xmm3
        punpckhwd       %xmm2, %xmm1
        movdqu  112(%rdi), %xmm4
...

.L5:
        xorl    %eax, %eax
        .p2align 4,,10
        .p2align 3
.L8:
        movzbl  (%rdi,%rax,4), %edx
        addq    $1, %rax
        xorl    $31, %edx
        movb    %dl, b-1(%rax)
        cmpq    $64, %rax
        jne     .L8
        rep ret


The extra loop for the unaligned case shouldn't be needed, because VRP or CCP
can prove from the __builtin_unreachable() test that the pointer is always
aligned.

vrp1 doesn't handle this:

p_3(D): VARYING
p.0_4: [0, +INF]

A range is only known in vrp2, which is too late for the vectorizer?

p_1: ~[0B, 0B]  EQUIVALENCES: { p_3(D) } (1 elements)

Also the vectorizer uses a different variable which does not inherit the known
alignment:

  <bb 2>:
  p.0_4 = (long unsigned int) p_3(D);
  _5 = p.0_4 & 15;
  if (_5 != 0)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  __builtin_unreachable ();

p.0_4 has an unknown range again:

p.0_4: [0, +INF]
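
For comparison, here is a sketch of the same loop using GCC's documented
__builtin_assume_aligned, which attaches the alignment to the pointer itself
rather than to an integer cast of it such as p.0_4. The function name
alignment_builtin is made up for illustration, and whether this changes the
generated code depends on the compiler version and target.

```c
char b[100];

/* Hypothetical variant of the test case: the alignment promise is made
   on the pointer value, not on a (uintptr_t) cast of it. */
void alignment_builtin(int *p)
{
        /* The builtin returns its argument; the compiler may assume the
           returned pointer is 16-byte aligned. */
        int *q = __builtin_assume_aligned(p, 16);
        int i;
        for (i = 0; i < 64; i++)
                b[i] = q[i] ^ 0x1f;
}
```

As with the __builtin_unreachable() form, calling this with a pointer that is
not 16-byte aligned is undefined behavior.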

Fixing this would allow implementing an __assume() macro that behaves
similarly to the one in VC++.
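
A minimal sketch of what such a macro could look like, assuming GCC or a
compatible compiler. The name ASSUME is hypothetical (identifiers starting
with a double underscore are reserved for the implementation), and with the
behavior described in this report the hint may not reach the vectorizer.

```c
#include <stdint.h>

/* Hypothetical macro: if cond is false, control reaches
   __builtin_unreachable(), so the optimizer may treat cond as true. */
#define ASSUME(cond) \
        do { if (!(cond)) __builtin_unreachable(); } while (0)

char b[100];

void alignment_assume(int *p)
{
        /* Same promise as the original test case: p is 16-byte aligned. */
        ASSUME(((uintptr_t)p & 15) == 0);
        int i;
        for (i = 0; i < 64; i++)
                b[i] = p[i] ^ 0x1f;
}
```

The condition is an ordinary expression that is evaluated at the point of
use, so it should be free of side effects.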



Thread overview: 5+ messages
2014-09-08  4:05 [Bug tree-optimization/63202] New: tree vectorizer does not make use of alignment information from VRP/CCP andi-gcc at firstfloor dot org
2014-09-08  7:36 ` [Bug tree-optimization/63202] " jakub at gcc dot gnu.org
2014-09-08  8:38 ` rguenth at gcc dot gnu.org
2014-09-08 17:49 ` andi-gcc at firstfloor dot org
2021-07-20  7:37 ` pinskia at gcc dot gnu.org
