* [Bug target/55454] New: [PPC] unaligned memory accesses do not work correctly for vector extensions when using altivec
From: siarhei.siamashka at gmail dot com @ 2012-11-23 21:02 UTC
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55454

             Bug #: 55454
           Summary: [PPC] unaligned memory accesses do not work correctly
                    for vector extensions when using altivec
    Classification: Unclassified
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: siarhei.siamashka@gmail.com


The following test program reproduces the problem:

/*******************************************************/

#include <stdint.h>
#include <assert.h>

typedef uint8_t uint8x16 __attribute__ ((vector_size(16)));
typedef struct { char dummy; uint8x16 data; } __attribute__((packed)) foo;

char __attribute__((aligned(16))) buffer[32];

void __attribute__((noinline)) init_buffer(const uint8x16 *a)
{
    ((foo *)(buffer + 9))->data = *a;
}

int main (void)
{
    const uint8x16 a = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16 };
    assert(sizeof(foo) == 17);
    init_buffer(&a);
    assert(buffer[0] == 0);
    return 0;
}

/*******************************************************/

$ gcc -O2 -maltivec -o test test.c
$ ./test
test: test.c:19: main: Assertion `buffer[0] == 0' failed.
Aborted


--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> 2012-11-25 17:51:40 UTC ---
Aside from whether the testcase is valid, 4.8 should do a better job here.


--- Comment #2 from Siarhei Siamashka <siarhei.siamashka at gmail dot com> 2012-11-25 18:18:16 UTC ---
(In reply to comment #1)
> Aside from whether the testcase is valid

According to http://gcc.gnu.org/onlinedocs/gcc/Type-Attributes.html

"packed - This attribute, attached to struct or union type definition,
specifies that each member (other than zero-width bit-fields) of the structure
or union is placed to minimize the memory required. When attached to an enum
definition, it indicates that the smallest integral type should be used."

Is it safe to assume that the size of this "foo" struct is always expected to
be 17 bytes in the testcase? If yes, then it must be safe to use any alignment
for this struct, because the elements of an array of "foo" will end up at
addresses with every possible alignment. As such, any memory location can be
safely cast to foo* and used. Is there anything wrong with these assumptions?
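
The layout assumptions in the question above can be checked directly. The
following is a minimal sketch, assuming a GCC-compatible compiler with C11
support; check_foo_layout is a hypothetical helper name that does not appear
in the original testcase.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t uint8x16 __attribute__ ((vector_size(16)));
typedef struct { char dummy; uint8x16 data; } __attribute__((packed)) foo;

/* Hypothetical helper (not part of the original testcase): verifies the
   layout properties that the argument above relies on. */
static void check_foo_layout(void)
{
    foo pair[2];

    assert(sizeof(foo) == 17);         /* 1 + 16 bytes, no padding */
    assert(_Alignof(foo) == 1);        /* packed lowers the alignment to 1 */
    assert(offsetof(foo, data) == 1);  /* the vector member sits at offset 1 */

    /* Consecutive array elements are 17 bytes apart, so their addresses
       cycle through all 16 residues modulo 16. */
    assert((char *)&pair[1] - (char *)&pair[0] == 17);
}
```

If these assertions hold, the compiler itself agrees that a "foo" (and hence
its "data" member) may live at any byte address.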


But in fact what I want is just to somehow tell gcc that I'm going to write
this vector data type at an unaligned memory location. For example, x86 SSE2
and ARM NEON have unaligned load/store instructions. PPC Altivec can't do it
easily, but that's a headache for GCC, and the application developer (me)
should not have to care. After all, if running out of options, one can always use

    memcpy(buffer + 10, a, sizeof(*a));

instead of

    ((foo *)(buffer + 9))->data = *a;

Performance suffers badly, though. That would in fact be an acceptable
solution for PPC, but x86 and ARM can definitely do much better.
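
For reference, the memcpy variant can be wrapped up as follows. This is only a
sketch: store_unaligned and store_hits_only_target are hypothetical names
introduced for illustration, and the check assumes the compiler lowers memcpy
correctly for an unaligned destination.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t uint8x16 __attribute__ ((vector_size(16)));

char __attribute__((aligned(16))) buffer[32];

/* Hypothetical helper: store a vector at an arbitrary byte address.
   memcpy makes no alignment assumption about dst. */
static void store_unaligned(void *dst, const uint8x16 *src)
{
    memcpy(dst, src, sizeof(*src));
}

/* Hypothetical check: returns 1 if only bytes 10..25 of buffer were
   written and everything else is still zero, 0 otherwise. */
static int store_hits_only_target(void)
{
    const uint8x16 a = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16 };
    int i;

    memset(buffer, 0, sizeof(buffer));
    store_unaligned(buffer + 10, &a);

    for (i = 0; i < 32; i++) {
        char expected = (i >= 10 && i < 26) ? (char)(i - 9) : 0;
        if (buffer[i] != expected)
            return 0;
    }
    return 1;
}
```

Unlike the single assert(buffer[0] == 0) in the testcase, this check looks at
every byte, so it also catches a store that lands at the wrong offset inside
the buffer.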

> 4.8 should do a better job here.

Thanks, I'll check GCC 4.8 a bit later.


--- Comment #3 from Siarhei Siamashka <siarhei.siamashka at gmail dot com> 2012-11-25 19:32:02 UTC ---
Also fails with GCC trunk (gcc version 4.8.0 20120518 (experimental))

The disassembly listing for "init_buffer" function:

00000000 <init_buffer>:
   0:    7d 80 42 a6     mfvrsave r12
   4:    94 21 ff e0     stwu    r1,-32(r1)
   8:    91 81 00 1c     stw     r12,28(r1)
   c:    65 8c 80 00     oris    r12,r12,32768
  10:    7d 80 43 a6     mtvrsave r12
  14:    3d 40 00 00     lis     r10,0
  18:    7c 00 18 ce     lvx     v0,r0,r3
  1c:    39 20 00 0a     li      r9,10
  20:    39 4a 00 00     addi    r10,r10,0
  24:    7c 0a 49 ce     stvx    v0,r10,r9
                        ^^^^
Here it happily uses the STVX instruction, which silently aligns the address
down to a 16-byte boundary, so the write effectively goes to &buffer[0]
instead of &buffer[10].

  28:    81 81 00 1c     lwz     r12,28(r1)
  2c:    7d 80 43 a6     mtvrsave r12
  30:    38 21 00 20     addi    r1,r1,32
  34:    4e 80 00 20     blr
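
The address truncation performed by stvx can be modelled in plain C. This is
a sketch of the effective-address calculation only, not of the store itself;
the function names and the example address 0x1000 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models the effective address of "stvx vS,rA,rB": the sum of the two
   registers with the low 4 bits forced to zero. */
static uintptr_t stvx_effective_address(uintptr_t base, uintptr_t index)
{
    return (base + index) & ~(uintptr_t)0xF;
}

/* Hypothetical self-check using an arbitrary 16-byte-aligned address in
   place of &buffer[0]. */
static void check_stvx_model(void)
{
    const uintptr_t buf = 0x1000;

    /* r10 = buffer, r9 = 10: the store lands at buffer + 0, not + 10. */
    assert(stvx_effective_address(buf, 10) == buf);

    /* Only a multiple of 16 is left unchanged. */
    assert(stvx_effective_address(buf, 16) == buf + 16);
}
```

With r10 holding the 16-byte-aligned address of buffer and r9 holding 10, the
model gives buffer + 0, which matches the failing assert(buffer[0] == 0) in
the testcase.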


And by the way, the memcpy workaround mentioned above is also broken in GCC
4.8, because it tries to be clever and generates exactly the same code relying
on STVX :)


With GCC 4.7.2, at least the memcpy variant works correctly:

00000000 <init_buffer>:
   0:    3d 40 00 00     lis     r10,0
   4:    80 a3 00 00     lwz     r5,0(r3)
   8:    80 c3 00 04     lwz     r6,4(r3)
   c:    80 e3 00 08     lwz     r7,8(r3)
  10:    39 2a 00 0a     addi    r9,r10,10
  14:    81 03 00 0c     lwz     r8,12(r3)
  18:    90 aa 00 0a     stw     r5,10(r10)
  1c:    90 c9 00 04     stw     r6,4(r9)
  20:    90 e9 00 08     stw     r7,8(r9)
  24:    91 09 00 0c     stw     r8,12(r9)
  28:    4e 80 00 20     blr


--- Comment #4 from Siarhei Siamashka <siarhei.siamashka at gmail dot com> 2012-11-25 21:16:53 UTC ---
(In reply to comment #3)
> Also fails with GCC trunk (gcc version 4.8.0 20120518 (experimental))
                                         ^^^^^^^^^^^^^^
Sorry, I accidentally compiled GCC from a stale old directory. The recent
trunk 4.8.0 20121120 (experimental) has the memcpy issue fixed. The STVX
problem is still there:

00000000 <init_buffer>:
   0:    7c 00 18 ce     lvx     v0,r0,r3
   4:    3d 40 00 00     lis     r10,0
   8:    39 20 00 0a     li      r9,10
   c:    39 4a 00 00     addi    r10,r10,0
  10:    7c 0a 49 ce     stvx    v0,r10,r9
  14:    4e 80 00 20     blr


Siarhei Siamashka <siarhei.siamashka at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE

--- Comment #5 from Siarhei Siamashka <siarhei.siamashka at gmail dot com> 2012-12-09 22:25:17 UTC ---
It appears that this is a duplicate of
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55614

As for memcpy, it looks like this is indeed the preferable "portable" way of
storing vectors to unaligned memory (albeit somewhat buggy at the moment).

And ARM just happens to have a performance issue related to memcpy, but it can
be tracked elsewhere: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55634

*** This bug has been marked as a duplicate of bug 55614 ***

