* [Bug c/57908] New: alignment of arrays allocated stack on amd64/x86_64: 16 bytes ?
@ 2013-07-16 14:35 yann at droneaud dot fr
  2013-07-16 14:39 ` [Bug c/57908] " yann at droneaud dot fr
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: yann at droneaud dot fr @ 2013-07-16 14:35 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57908

            Bug ID: 57908
           Summary: alignment of arrays allocated stack on amd64/x86_64:
                    16 bytes ?
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: yann at droneaud dot fr

According to "System V Application Binary Interface, AMD64 Architecture
Processor Supplement, Draft Version 0.90"

    Aggregates and Unions
    ---------------------

    An array uses the same alignment as its elements, except that a local or 
    global array variable that requires at least 16 bytes, or a C99 local or 
    global variable-length array variable, always has alignment of at least 16 
    bytes.[4]
    No other changes required.

    [4] The alignment requirement allows the use of SSE instructions when 
        operating on the array.
        The compiler cannot in general calculate the size of a variable-length 
        array (VLA), but it is expected that most VLAs will require at least 16 
        bytes, so it is logical to mandate that VLAs have at least a 16-byte 
        alignment.


As I understand the ABI specification, arrays allocated on the stack must be
aligned on 16-byte boundaries, whatever their length: e.g. an array of 7 bytes
gets aligned on a 16-byte boundary.

A test program seems to confirm that with gcc version 4.8.1 20130603 (Red Hat
4.8.1-1) (GCC):

     kind         name              address   size   alignment   required

   Arrays                                                                
   object |         u8 |     0x7fffefdd486f |    1 |         1 |        1
   object |       u8_0 |     0x7fffefdd4860 |    8 |        32 |        1
   object |       u8_1 |     0x7fffefdd4850 |    7 |        16 |        1
   object |       u8_2 |     0x7fffefdd4840 |    6 |        64 |        1
   object |       u8_3 |     0x7fffefdd4830 |    5 |        16 |        1
   object |       u8_4 |     0x7fffefdd4820 |    4 |        32 |        1
   object |       u8_5 |     0x7fffefdd4810 |    3 |        16 |        1
   object |       u8_6 |     0x7fffefdd4800 |    2 |      2048 |        1
   object |       u8_7 |     0x7fffefdd47ff |    1 |         1 |        1
   object |       u8_8 |     0x7fffefdd47fe |    1 |         2 |        1

IMHO it's a waste of memory, and this behavior is inconsistent with structure
layout: e.g. arrays in a structure are not aligned on a 16-byte boundary.

But let's say the specification does mandate that arrays on the stack be
aligned on a 16-byte boundary.

Then enter LLVM/Clang, clang version 3.3 (tags/RELEASE_33/rc3):

     kind         name              address   size   alignment   required

   Arrays                                                                
   object |         u8 |     0x7fff45f4154f |    1 |         1 |        1
   object |       u8_0 |     0x7fff45f41547 |    8 |         1 |        1
   object |       u8_1 |     0x7fff45f41540 |    7 |        64 |        1
   object |       u8_2 |     0x7fff45f4153a |    6 |         2 |        1
   object |       u8_3 |     0x7fff45f41535 |    5 |         1 |        1
   object |       u8_4 |     0x7fff45f41531 |    4 |         1 |        1
   object |       u8_5 |     0x7fff45f4152e |    3 |         2 |        1
   object |       u8_6 |     0x7fff45f4152c |    2 |         4 |        1
   object |       u8_7 |     0x7fff45f4152b |    1 |         1 |        1
   object |       u8_8 |     0x7fff45f4152a |    1 |         2 |        1

It seems that Clang does not align arrays on the stack to a 16-byte boundary.

Note: both compilers were tested on Fedora 19. The results above were produced
with the test program compiled at the default optimisation level; with -O3, the
results are much the same.

The source code of the test is available here:
https://gitorious.org/opteya/alignment/blobs/master/stack-alignment.c



* [Bug c/57908] alignment of arrays allocated stack on amd64/x86_64: 16 bytes ?
  2013-07-16 14:35 [Bug c/57908] New: alignment of arrays allocated stack on amd64/x86_64: 16 bytes ? yann at droneaud dot fr
@ 2013-07-16 14:39 ` yann at droneaud dot fr
  2013-07-16 15:03 ` pinskia at gcc dot gnu.org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: yann at droneaud dot fr @ 2013-07-16 14:39 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57908

--- Comment #1 from Yann Droneaud <yann at droneaud dot fr> ---
Additionally, for the ARM target (ARMv7), it seems GCC aligns arrays on the
stack to a 4-byte boundary ... but I haven't found the ABI specification that
requires such alignment.

     kind         name              address   size   alignment   required

   Arrays                                                                
   object |         u8 |         0xf6fff017 |    1 |         1 |        1
   object |       u8_0 |         0xf6fff00c |    8 |         4 |        1
   object |       u8_1 |         0xf6fff004 |    7 |         4 |        1
   object |       u8_2 |         0xf6ffeffc |    6 |         4 |        1
   object |       u8_3 |         0xf6ffeff4 |    5 |         4 |        1
   object |       u8_4 |         0xf6ffeff0 |    4 |        16 |        1
   object |       u8_5 |         0xf6ffefec |    3 |         4 |        1
   object |       u8_6 |         0xf6ffefe8 |    2 |         8 |        1
   object |       u8_7 |         0xf6ffefe4 |    1 |         4 |        1
   object |       u8_8 |         0xf6ffefe3 |    1 |         1 |        1



* [Bug c/57908] alignment of arrays allocated stack on amd64/x86_64: 16 bytes ?
  2013-07-16 14:35 [Bug c/57908] New: alignment of arrays allocated stack on amd64/x86_64: 16 bytes ? yann at droneaud dot fr
  2013-07-16 14:39 ` [Bug c/57908] " yann at droneaud dot fr
@ 2013-07-16 15:03 ` pinskia at gcc dot gnu.org
  2013-07-16 15:06 ` [Bug target/57908] " pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2013-07-16 15:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57908

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Your test program is not fully testing things correctly.
     kind         name              address   size   alignment   required
>   object |       u8_5 |     0x7fffefdd4810 |    3 |        16 |        1
>   object |       u8_6 |     0x7fffefdd4800 |    2 |      2048 |        1
>   object |       u8_7 |     0x7fffefdd47ff |    1 |         1 |        1

This shows why.  The two variables sit right next to each other; the alignment
recorded for u8_6 is 2048, but that is just accidental.  The alignment of u8_6
is really 16, as shown by the next variable at ...4810.



* [Bug target/57908] alignment of arrays allocated stack on amd64/x86_64: 16 bytes ?
  2013-07-16 14:35 [Bug c/57908] New: alignment of arrays allocated stack on amd64/x86_64: 16 bytes ? yann at droneaud dot fr
  2013-07-16 14:39 ` [Bug c/57908] " yann at droneaud dot fr
  2013-07-16 15:03 ` pinskia at gcc dot gnu.org
@ 2013-07-16 15:06 ` pinskia at gcc dot gnu.org
  2013-07-16 15:11 ` yann at droneaud dot fr
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2013-07-16 15:06 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57908

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-linux-gnu
             Status|UNCONFIRMED                 |RESOLVED
          Component|c                           |target
         Resolution|---                         |INVALID

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> As I understand the ABI specifications, arrays allocated on stack must be
> aligned on 16 bytes boundaries, whatever its length is

HUH?  I don't read it that way.  I read it as: if the length is less than 16
bytes, the array has the same alignment as its elements; otherwise it is
16-byte aligned.

It does require VLAs to have 16-byte alignment, though.



* [Bug target/57908] alignment of arrays allocated stack on amd64/x86_64: 16 bytes ?
  2013-07-16 14:35 [Bug c/57908] New: alignment of arrays allocated stack on amd64/x86_64: 16 bytes ? yann at droneaud dot fr
                   ` (2 preceding siblings ...)
  2013-07-16 15:06 ` [Bug target/57908] " pinskia at gcc dot gnu.org
@ 2013-07-16 15:11 ` yann at droneaud dot fr
  2013-07-16 15:18 ` yann at droneaud dot fr
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: yann at droneaud dot fr @ 2013-07-16 15:11 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57908

--- Comment #4 from Yann Droneaud <yann at droneaud dot fr> ---
(In reply to Andrew Pinski from comment #2)
> Your test program is not fully testing things correctly.
>      kind         name              address   size   alignment   required
> >   object |       u8_5 |     0x7fffefdd4810 |    3 |        16 |        1
> >   object |       u8_6 |     0x7fffefdd4800 |    2 |      2048 |        1
> >   object |       u8_7 |     0x7fffefdd47ff |    1 |         1 |        1
> 
> Shows why.  There are two variables right next to each other but the
> alignment recorded is 2048 but that was just accidental.  The alignment of
> u8_6 is 16 due to the next variable at 10.

Have you noticed that u8_7 is an array of only 1 element?
Arrays of only 1 element (1 byte) are not aligned on a 16-byte boundary.
Arrays of 2 bytes or more get aligned on a 16-byte boundary.

Should I show another test case?



* [Bug target/57908] alignment of arrays allocated stack on amd64/x86_64: 16 bytes ?
  2013-07-16 14:35 [Bug c/57908] New: alignment of arrays allocated stack on amd64/x86_64: 16 bytes ? yann at droneaud dot fr
                   ` (3 preceding siblings ...)
  2013-07-16 15:11 ` yann at droneaud dot fr
@ 2013-07-16 15:18 ` yann at droneaud dot fr
  2013-07-16 15:20 ` yann at droneaud dot fr
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: yann at droneaud dot fr @ 2013-07-16 15:18 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57908

Yann Droneaud <yann at droneaud dot fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |---

--- Comment #5 from Yann Droneaud <yann at droneaud dot fr> ---
(In reply to Yann Droneaud from comment #4)
> (In reply to Andrew Pinski from comment #2)
> > Your test program is not fully testing things correctly.
> >      kind         name              address   size   alignment   required
> > >   object |       u8_5 |     0x7fffefdd4810 |    3 |        16 |        1
> > >   object |       u8_6 |     0x7fffefdd4800 |    2 |      2048 |        1
> > >   object |       u8_7 |     0x7fffefdd47ff |    1 |         1 |        1
> > 
> > Shows why.  There are two variables right next to each other but the
> > alignment recorded is 2048 but that was just accidental.  The alignment of
> > u8_6 is 16 due to the next variable at 10.
> 
> Have you noticed that u8_7 is an array of 1 element only ?
> Array of 1 element (bytes) only are not aligned on 16 bytes boundary.
> Array of 2 bytes and greater get aligned on 16 bytes boundary.
> 
> Should I show another test case ?

     kind         name              address   size   alignment   required

     type |    uint8_t |                N/A |    1 |       N/A |        1 
     type | uint8_t[2] |                N/A |    2 |       N/A |        1 

   Arrays                                                                
   object |       u8_0 |     0x7fff7671e25f |    1 |         1 |        1
   object |       u8_1 |     0x7fff7671e250 |    3 |        16 |        1
   object |       u8_2 |     0x7fff7671e240 |    7 |        64 |        1
   object |       u8_3 |     0x7fff7671e230 |    5 |        16 |        1
   object |       u8_4 |     0x7fff7671e220 |    2 |        32 |        1
   object |       u8_5 |     0x7fff7671e21f |    1 |         1 |        1
   object |       u8_6 |     0x7fff7671e210 |    3 |        16 |        1
   object |       u8_7 |     0x7fff7671e200 |    5 |       512 |        1
   object |       u8_8 |     0x7fff7671e1f0 |    7 |        16 |        1
   object |       u8_9 |     0x7fff7671e1e0 |    2 |        32 |        1
   object |      u8_10 |     0x7fff7671e1d0 |   11 |        16 |        1
   object |      u8_11 |     0x7fff7671e1c0 |    3 |        64 |        1
   object |      u8_12 |     0x7fff7671e1b0 |   13 |        16 |        1
   object |      u8_13 |     0x7fff7671e1a0 |    2 |        32 |        1
   object |      u8_14 |     0x7fff7671e19f |    1 |         1 |        1
   object |      u8_15 |     0x7fff7671e190 |    2 |        16 |        1



* [Bug target/57908] alignment of arrays allocated stack on amd64/x86_64: 16 bytes ?
  2013-07-16 14:35 [Bug c/57908] New: alignment of arrays allocated stack on amd64/x86_64: 16 bytes ? yann at droneaud dot fr
                   ` (4 preceding siblings ...)
  2013-07-16 15:18 ` yann at droneaud dot fr
@ 2013-07-16 15:20 ` yann at droneaud dot fr
  2013-07-16 15:28 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: yann at droneaud dot fr @ 2013-07-16 15:20 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57908

--- Comment #6 from Yann Droneaud <yann at droneaud dot fr> ---
Created attachment 30512
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30512&action=edit
Demonstrate stack allocation of array aligned on 16 bytes



* [Bug target/57908] alignment of arrays allocated stack on amd64/x86_64: 16 bytes ?
  2013-07-16 14:35 [Bug c/57908] New: alignment of arrays allocated stack on amd64/x86_64: 16 bytes ? yann at droneaud dot fr
                   ` (5 preceding siblings ...)
  2013-07-16 15:20 ` yann at droneaud dot fr
@ 2013-07-16 15:28 ` pinskia at gcc dot gnu.org
  2013-07-16 15:32 ` yann at droneaud dot fr
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2013-07-16 15:28 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57908

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Use -Os if you want no alignment for those arrays.



* [Bug target/57908] alignment of arrays allocated stack on amd64/x86_64: 16 bytes ?
  2013-07-16 14:35 [Bug c/57908] New: alignment of arrays allocated stack on amd64/x86_64: 16 bytes ? yann at droneaud dot fr
                   ` (6 preceding siblings ...)
  2013-07-16 15:28 ` pinskia at gcc dot gnu.org
@ 2013-07-16 15:32 ` yann at droneaud dot fr
  2013-07-16 15:36 ` pinskia at gcc dot gnu.org
  2013-07-16 15:56 ` yann at droneaud dot fr
  9 siblings, 0 replies; 11+ messages in thread
From: yann at droneaud dot fr @ 2013-07-16 15:32 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57908

--- Comment #8 from Yann Droneaud <yann at droneaud dot fr> ---
Using -Os shows different results:

   Arrays                                                                
   object |       u8_0 |     0x7fff9b4c05bc |    1 |         4 |        1
   object |       u8_1 |     0x7fff9b4c05c7 |    3 |         1 |        1
   object |       u8_2 |     0x7fff9b4c05da |    7 |         2 |        1
   object |       u8_3 |     0x7fff9b4c05d0 |    5 |        16 |        1
   object |       u8_4 |     0x7fff9b4c05bf |    2 |         1 |        1
   object |       u8_5 |     0x7fff9b4c05bd |    1 |         1 |        1
   object |       u8_6 |     0x7fff9b4c05ca |    3 |         2 |        1
   object |       u8_7 |     0x7fff9b4c05d5 |    5 |         1 |        1
   object |       u8_8 |     0x7fff9b4c05e1 |    7 |         1 |        1
   object |       u8_9 |     0x7fff9b4c05c1 |    2 |         1 |        1
   object |      u8_10 |     0x7fff9b4c05e8 |   11 |         8 |        1
   object |      u8_11 |     0x7fff9b4c05cd |    3 |         1 |        1
   object |      u8_12 |     0x7fff9b4c05f3 |   13 |         1 |        1
   object |      u8_13 |     0x7fff9b4c05c3 |    2 |         1 |        1
   object |      u8_14 |     0x7fff9b4c05be |    1 |         2 |        1
   object |      u8_15 |     0x7fff9b4c05c5 |    2 |         1 |        1


So GCC aligns arrays allocated on the stack to 16 bytes by default, but it
does not enforce that alignment.
I guess it's all about optimisation ... but wasting up to 14 bytes to get an
array of 2 bytes aligned might be overkill.

Could someone comment on which optimisation is achieved by aligning such small
arrays?

Regards.



* [Bug target/57908] alignment of arrays allocated stack on amd64/x86_64: 16 bytes ?
  2013-07-16 14:35 [Bug c/57908] New: alignment of arrays allocated stack on amd64/x86_64: 16 bytes ? yann at droneaud dot fr
                   ` (7 preceding siblings ...)
  2013-07-16 15:32 ` yann at droneaud dot fr
@ 2013-07-16 15:36 ` pinskia at gcc dot gnu.org
  2013-07-16 15:56 ` yann at droneaud dot fr
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2013-07-16 15:36 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57908

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Yann Droneaud from comment #8)
> Could someone comment on which optimisation is achieved by aligning such
> small arrays ?

The simple answer is so each array is more likely to fit into a cache line:
   One use of this macro is to increase alignment of medium-size
   data to make it all fit in fewer cache lines.  */



* [Bug target/57908] alignment of arrays allocated stack on amd64/x86_64: 16 bytes ?
  2013-07-16 14:35 [Bug c/57908] New: alignment of arrays allocated stack on amd64/x86_64: 16 bytes ? yann at droneaud dot fr
                   ` (8 preceding siblings ...)
  2013-07-16 15:36 ` pinskia at gcc dot gnu.org
@ 2013-07-16 15:56 ` yann at droneaud dot fr
  9 siblings, 0 replies; 11+ messages in thread
From: yann at droneaud dot fr @ 2013-07-16 15:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57908

--- Comment #10 from Yann Droneaud <yann at droneaud dot fr> ---
(In reply to Andrew Pinski from comment #9)
> (In reply to Yann Droneaud from comment #8)
> > Could someone comment on which optimisation is achieved by aligning such
> > small arrays ?
> 
> The simple answer is so each array is more likely to fit into a cache line:
>    One use of this macro is to increase alignment of medium-size
>    data to make it all fit in fewer cache lines.  */

Thanks for the investigation.

Initially I thought it would be better to "pack" such arrays to fill whole
cache lines: fewer cache lines would be used and most of the arrays would
already be in cache.

But according to http://stackoverflow.com/a/7281770:

"On x86 cache lines are 64 bytes, however, to prevent false sharing, you need
to follow the guidelines of the processor you are targeting (intel has some
special notes on its netburst based processors), generally you need to align to
64 bytes for this (intel states that you should also avoid crossing 16 byte
boundries)."

This starts to make sense to me.

I'm likely to buy the argument for global variables, but local variables, I
think, are probably not going to be shared much across CPUs. But I have no
data on this, so I won't pursue it further.

Thanks a lot for answering my question.

