public inbox for gcc-help@gcc.gnu.org
* stack allocation
@ 2004-12-12 19:21 matt smith
  2004-12-16 19:08 ` jlh
  0 siblings, 1 reply; 4+ messages in thread
From: matt smith @ 2004-12-12 19:21 UTC (permalink / raw)
  To: gcc-help

This issue has been discussed in a few threads that I
have found on google but there was no conclusive
answer given for the phenomenon.  

example1.c

    void function(int a, int b, int c) {
        char buffer1[5];
        char buffer2[10];
    }

    int main(void) {
        function(1, 2, 3);
        return 0;
    }

 

When you issue the "gcc -S -o example1.s example1.c"
command and view the function prolog you see that the
compiler reserves 40 bytes for these two arrays.  

subl $40, %esp

 

To me the expected behavior should be subl $24, %esp 
or in other words reserving 24 bytes of stack space.

 

Why the discrepancy? Thanks

Josh

 

 




		
 


* Re: stack allocation
  2004-12-12 19:21 stack allocation matt smith
@ 2004-12-16 19:08 ` jlh
  0 siblings, 0 replies; 4+ messages in thread
From: jlh @ 2004-12-16 19:08 UTC (permalink / raw)
  To: matt smith, gcc-help



matt smith wrote:
> Why the discrepancy?

I think I might have found the reason for this; here's what I've
been experimenting with today:

extern int i;
extern void f2(void);
void f(void)
{
    f2();
    i = 3;
}

If I compile with "gcc-4.0 -O2" on x86, I get this:

f:      pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        call    f2
        movl    $3, %eax
        movl    %eax, i
        leave
        ret

The pushed %ebp uses 4 bytes on the stack and GCC reserves another
8 bytes (which are never used) for a total of 12 bytes.

Now if I compile the same with the option "-fomit-frame-pointer"
added I get this:

f:      subl    $12, %esp
        call    f2
        movl    $3, %eax
        movl    %eax, i
        addl    $12, %esp
        ret

No more %ebp on the stack, but now GCC reserves 12 bytes.

In both cases the function f() uses 12 bytes of stack and together
with the 4 bytes of return address being on the stack already, it
totals to 16 bytes, which is a nice alignment.  And as you know,
proper alignment makes code faster.  If f() does not call any
function, GCC does not reserve any unnecessary space.

In your sample code, you didn't use optimization at all, so it
probably did the alignment anyway, even if no other function gets
called from your function.  This might be the reason why it
allocates 40 bytes instead of only what it requires for storage.

Then I did some measurements, and apparently calling a function
with the stack not aligned to 16 bytes is slower.  So GCC actually
does a good job here.

Voilà, I hope this wasn't nonsense.  :)

jlh




* Re: Stack allocation
  2011-11-18 17:57 ` Stack allocation Alexandru Juncu
@ 2011-11-18 19:35   ` Andrew Haley
  0 siblings, 0 replies; 4+ messages in thread
From: Andrew Haley @ 2011-11-18 19:35 UTC (permalink / raw)
  To: gcc-help

On 11/18/2011 03:03 PM, Alexandru Juncu wrote:
> Hello!
> 
> [I sent an email on the gcc main list by my mistake, and I am moving
> the discussion here]
> 
> I have a curiosity with something I once tested. I took a simple C
> program and made an assembly file with gcc -S.
> 
> The C file looks something like this:
> int main(void)
> {
>   int a=1, b=2;
>   return 0;
> }
> 
> The assembly instructions look like this:
> 
> subl    $16, %esp
> movl    $1, -4(%ebp)
> movl    $2, -8(%ebp)
> 
> The subl $16 is the allocation of local variables on the stack,
> right? 16 bytes are enough for four 32-bit integers.
> If I have 1, 2, 3 or 4 local variables declared, I get those 16 bytes.
> If I have 5 variables, I get "subl $32, %esp"; 5, 6, 7 or 8 variables are
> the same. With 9, 10, 11 or 12, it is 48 bytes.
> 
> The observation is that gcc allocates increments of 4 variables (if
> they are integers). If I allocate 8bit chars, increments of 16 chars.
> 
> So the allocation is in increments of 16 bytes no matter what.
> 
> OK, that's the observation... my question is why? What's the reason
> for this? Is it an optimization (does it matter which -O level is used?)
> or is it architecture dependent (I ran it on x86)? And is this just in
> gcc, just in a certain version of gcc, or is it universal?
> 
> I got a response saying that it is related to cache line alignment, to
> optimize cache hits.
> But I tried to compile the program with the --param l1-cache-size and
> got the same .s file. Is this ok?

You're not optimizing.  Nothing much will happen with optimization options
when you're not optimizing.

This is x86-specific, but other processors have similar needs.

gcc must keep the stack 16-byte aligned because some data (such as SSE
vector data) must be aligned.  Given that the data must be aligned, so must
the stack.  Also, fetches and stores that straddle cache line boundaries can
be very slow.

Andrew.


* Stack allocation
       [not found] <CAPhGq=bY8hS0DF2rf7_5E8ycYS52uJR8UH=Yjb0NiDBdaSR+6Q@mail.gmail.com>
@ 2011-11-18 17:57 ` Alexandru Juncu
  2011-11-18 19:35   ` Andrew Haley
  0 siblings, 1 reply; 4+ messages in thread
From: Alexandru Juncu @ 2011-11-18 17:57 UTC (permalink / raw)
  To: gcc-help

Hello!

[I sent an email on the gcc main list by my mistake, and I am moving
the discussion here]

I have a curiosity with something I once tested. I took a simple C
program and made an assembly file with gcc -S.

The C file looks something like this:
int main(void)
{
  int a=1, b=2;
  return 0;
}

The assembly instructions look like this:

subl    $16, %esp
movl    $1, -4(%ebp)
movl    $2, -8(%ebp)

The subl $16 is the allocation of local variables on the stack,
right? 16 bytes are enough for four 32-bit integers.
If I have 1, 2, 3 or 4 local variables declared, I get those 16 bytes.
If I have 5 variables, I get "subl $32, %esp"; 5, 6, 7 or 8 variables are
the same. With 9, 10, 11 or 12, it is 48 bytes.

The observation is that gcc allocates increments of 4 variables (if
they are integers). If I allocate 8bit chars, increments of 16 chars.

So the allocation is in increments of 16 bytes no matter what.

OK, that's the observation... my question is why? What's the reason
for this? Is it an optimization (does it matter which -O level is used?)
or is it architecture dependent (I ran it on x86)? And is this just in
gcc, just in a certain version of gcc, or is it universal?

I got a response saying that it is related to cache line alignment, to
optimize cache hits.
But I tried to compile the program with the --param l1-cache-size and
got the same .s file. Is this ok?


Thank you!

--
Alexandru Juncu
ROSEdu

