public inbox for gcc-bugs@sourceware.org
* [Bug c++/64655] New: Vectorizer is always using load aligned instructions with objects with the "aligned" attribute
From: adrien at guinet dot me @ 2015-01-18 16:33 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64655

            Bug ID: 64655
           Summary: Vectorizer is always using load aligned instructions
                    with objects with the "aligned" attribute
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: adrien at guinet dot me

I think there is an issue with the way gcc uses the __attribute__((aligned))
information when generating SIMD load instructions during auto-vectorization.

For instance, using a structure like this one:

struct A
{
        uint32_t i[N] __attribute__((aligned(32)));
        uint32_t j[N] __attribute__((aligned(32)));
        uint32_t r[N] __attribute__((aligned(32)));
};

with a loop like this one:

struct A *a = (struct A*) malloc(sizeof(struct A));
for (size_t i = 0; i < N; i++) {
   a->r[i] = a->i[i] + a->j[i];
}

then the vectorizer uses "load aligned" instructions (vmovdqa). The issue is
that, even if i, j and r are aligned *inside* the A structure, nothing tells gcc
that the "a" pointer returned by malloc is itself 32-byte aligned.
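
For comparison, here is a minimal sketch of an allocation that does guarantee
the alignment the struct asks for. It assumes C11 aligned_alloc is available
(posix_memalign would be an alternative on older C libraries). The helper
names alloc_A and is_aligned_for_A are hypothetical, and N = 1700 is an assumed
value, since the value used by the attached test case is not shown here.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>

#define N 1700 /* assumed value, for illustration only */

struct A
{
    uint32_t i[N] __attribute__((aligned(32)));
    uint32_t j[N] __attribute__((aligned(32)));
    uint32_t r[N] __attribute__((aligned(32)));
};

/* The member attributes raise the alignment of the whole struct. */
static_assert(alignof(struct A) >= 32, "struct A should be 32-byte aligned");

/* Hypothetical helper: nonzero if p meets the alignment of struct A. */
int is_aligned_for_A(const void *p)
{
    return ((uintptr_t)p % alignof(struct A)) == 0;
}

/* Hypothetical helper: allocate a struct A with an explicit alignment
 * request. aligned_alloc wants the size to be a multiple of the alignment,
 * which always holds for sizeof(T) and alignof(T) of the same type. */
struct A *alloc_A(void)
{
    return aligned_alloc(alignof(struct A), sizeof(struct A));
}
```

A pointer obtained from alloc_A() passes is_aligned_for_A(), whereas a pointer
from plain malloc may or may not, depending on the allocator.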

You can reproduce this using the attached "test_align.c" test case, and
compiling it like this (using AVX2 for instance, whose aligned 256-bit loads
need 32-byte alignment):

$ gcc -O3 -march=core-avx2 test_align.c -std=c99 -o test_align

Running it on a machine that supports AVX2 segfaults. If you don't have AVX2
on your machine, you can use the excellent Intel SDE
(https://software.intel.com/en-us/articles/intel-software-development-emulator),
which uses Pin to "emulate" AVX2. Running the sample under SDE even reports the
faulting instruction:

$ sde64 -- ./test_align 
SDE ERROR:  TID: 0 executed instruction with an unaligned memory reference to
address 0x602ab0 INSTR: 0x000400550: IFORM: VMOVDQA_YMMqq_MEMqq :: vmovdqa
ymm0, ymmword ptr [rax+0x1aa0]
        IMAGE:    /tmp/test_align
        FUNCTION: main

Indeed, if we take a look at the assembly produced, we see:

[.. init code with rand() calls .., then, without any guards checking aligned
pointers]
lea     rdx, [r13+1A80h]
mov     rax, r13
loc_400550:
  vmovdqa ymm0, ymmword ptr [rax+1AA0h]  <- aligned load
  vpaddd  ymm0, ymm0, ymmword ptr [rax]  <- memory operand (VEX encoding, no alignment check)
  add     rax, 20h
  vmovdqa ymmword ptr [rax+3520h], ymm0 <- aligned store
  cmp     rax, rdx
  jnz     short loc_400550

If we remove the aligned attributes, ending with this structure:

struct A
{
    uint32_t i[N];
    uint32_t j[N];
    uint32_t r[N];
};

then gcc generates guards that check for unaligned pointers, and everything
runs fine!
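
To illustrate what such a guard amounts to, here is a rough C sketch of a
runtime alignment check followed by a scalar prologue. This is not gcc's
actual output; peel_count and add_guarded are hypothetical names, and the
sketch assumes the pointers are at least 4-byte aligned, as uint32_t requires.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: how many leading uint32_t elements must be handled
 * one at a time before p reaches a 32-byte boundary (capped at n). */
size_t peel_count(const uint32_t *p, size_t n)
{
    size_t misalign = (uintptr_t)p % 32; /* bytes past the last boundary */
    size_t peel = misalign ? (32 - misalign) / sizeof(uint32_t) : 0;
    return peel < n ? peel : n;
}

/* Hypothetical helper: r[k] = i[k] + j[k], with a scalar prologue that runs
 * until the store address r + k is 32-byte aligned. */
void add_guarded(uint32_t *r, const uint32_t *i, const uint32_t *j, size_t n)
{
    size_t k = 0;
    size_t peel = peel_count(r, n);

    /* Scalar prologue: elements before the first aligned store address. */
    for (; k < peel; k++)
        r[k] = i[k] + j[k];

    /* Main loop: r + k is now 32-byte aligned (or k == n), so aligned
     * stores would be safe here. i and j may still be misaligned, so their
     * loads would have to stay unaligned. */
    for (; k < n; k++)
        r[k] = i[k] + j[k];
}
```

Only one of the three arrays can be brought to a boundary by peeling, which
is why the remaining accesses still need unaligned instructions or a separate
loop version.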

Note that clang uses unaligned loads even with the aligned attributes (and thus
the binary does not segfault). The disassembly from the binary produced with
clang 3.5 is this one:

mov     rax, 0FFFFFFFFFFFFF960h
loc_400670:                             
  vmovdqu ymm0, ymmword ptr [r14+rax*4+3510h]
  vpaddd  ymm0, ymm0, ymmword ptr [r14+rax*4+1A80h]
  vmovdqu ymmword ptr [r14+rax*4+4FA0h], ymm0
  add     rax, 8
  jnz     short loc_400670

Bug seen in GCC 4.8.3 and GCC 4.9.2.

Thanks for any thoughts about this!

Regards,

Adrien.



Thread overview: 11+ messages
2015-01-18 16:33 [Bug c++/64655] New: Vectorizer is always using load aligned instructions with objects with the "aligned" attribute adrien at guinet dot me
2015-01-18 16:33 ` [Bug c++/64655] " adrien at guinet dot me
2015-01-18 17:39 ` [Bug c/64655] " jakub at gcc dot gnu.org
2015-01-18 17:52 ` adrien at guinet dot me
2015-01-18 17:53 ` adrien at guinet dot me
2015-01-18 17:57 ` adrien at guinet dot me
2015-01-18 18:06 ` jakub at gcc dot gnu.org
2015-01-18 18:34 ` adrien at guinet dot me
2015-01-18 19:05 ` jakub at gcc dot gnu.org
2015-01-18 19:05 ` jakub at gcc dot gnu.org
2015-01-18 19:25 ` adrien at guinet dot me
