* Future possible stack based optimization
From: Frediano Ziglio @ 2006-01-25 20:04 UTC
To: gcc
Hi,
I noticed that GCC does not use stack instructions much on Intel
platforms. I think this is a pity, because stack operations are small
(a size optimization) and usually fast (since the Pentium, two
consecutive push/pop instructions are executed together, which is a
speed optimization). Consider this small piece of code:
extern int foo1(int *a);
int foo2(int a)
{
    int b = a + 2;
    return foo1(&b);
}
Compiling with
$ gcc -O2 -mpreferred-stack-boundary=2 -fomit-frame-pointer -S optim1.c
$ gcc --version
gcc (GCC) 4.2.0 20060107 (experimental)
produces the following code:
foo2:
    subl $8, %esp
    movl 12(%esp), %eax
    addl $2, %eax
    movl %eax, 4(%esp)
    leal 4(%esp), %eax
    movl %eax, (%esp)
    call foo1
    addl $8, %esp
    ret
Compiling with
$ gcc -Os -mpreferred-stack-boundary=2 -fomit-frame-pointer -S optim1.c
produces:
foo2:
    subl $4, %esp
    movl 8(%esp), %eax
    addl $2, %eax
    movl %eax, (%esp)
    movl %esp, %eax
    pushl %eax
    call foo1
    popl %edx
    popl %ecx
    ret
This is worse than what 4.0.2 produced:
$ gcc -O2 -mpreferred-stack-boundary=2 -fomit-frame-pointer -S optim1.c
$ gcc --version
gcc (GCC) 4.0.2 20051125 (Red Hat 4.0.2-8)
foo2:
    pushl %eax
    movl 8(%esp), %eax
    addl $2, %eax
    movl %eax, (%esp)
    movl %esp, %eax
    pushl %eax
    call foo1
    addl $8, %esp
    ret
(Note the pushl %eax used as a size optimization instead of subl $4, %esp.)
Would it be possible, instead of allocating memory with subl/pushl and
then storing to it, to allocate and set the memory with pushl only?
Something like:
foo2:
    movl 4(%esp), %eax
    addl $2, %eax
    pushl %eax
    pushl %esp
    call foo1
    popl %edx
    popl %ecx
    ret
(Note that the first pushl both allocates the variable on the stack and
sets it.)
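As a rough check of the size argument, the sketch below tallies the encoded length of each of the four listings above. The per-instruction byte counts are an assumption of this note rather than measurements from the original message: they take the shortest 32-bit x86 encodings (8-bit immediates and displacements, a 5-byte near call). The array and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed encoded size in bytes of each instruction, in listing order.
   Shortest 32-bit x86 encodings; not taken from the original message. */
const unsigned char gcc42_O2[]  = {3, 4, 3, 4, 4, 3, 5, 3, 1};
const unsigned char gcc42_Os[]  = {3, 4, 3, 3, 2, 1, 5, 1, 1, 1};
const unsigned char gcc402_O2[] = {1, 4, 3, 3, 2, 1, 5, 3, 1};
const unsigned char proposed[]  = {4, 3, 1, 1, 5, 1, 1, 1};

/* Sum the instruction sizes of one listing. */
unsigned total_bytes(const unsigned char *sizes, size_t count)
{
    unsigned total = 0;
    assert(sizes != NULL);
    for (size_t i = 0; i < count; i++)
        total += sizes[i];
    return total;
}

/* One wrapper per listing; sizeof gives the element count because the
   elements are single bytes. */
unsigned bytes_gcc42_O2(void)  { return total_bytes(gcc42_O2, sizeof gcc42_O2); }
unsigned bytes_gcc42_Os(void)  { return total_bytes(gcc42_Os, sizeof gcc42_Os); }
unsigned bytes_gcc402_O2(void) { return total_bytes(gcc402_O2, sizeof gcc402_O2); }
unsigned bytes_proposed(void)  { return total_bytes(proposed, sizeof proposed); }
```

Under these assumptions the totals come to 30, 24, 23 and 17 bytes respectively, so the push-only sequence would be the smallest of the four.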
Is anyone working on something in this direction?
bye
Frediano Ziglio
* Re: Future possible stack based optimization
From: Marcel Cox @ 2006-01-25 21:29 UTC
To: Frediano Ziglio; +Cc: gcc
> I noticed that GCC does not use stack instructions much on Intel
> platforms. I think this is a pity, because stack operations are small
> (a size optimization) and usually fast (since the Pentium, two
> consecutive push/pop instructions are executed together, which is a
> speed optimization). Consider this small piece of code:
Whether push/pop instructions or mov instructions are faster depends on
the type of processor used. GCC is well aware of this. If you specify
the desired processor with -mtune, GCC will use whatever is best for
that processor. For example, if you optimize for old Pentium processors
with -mtune=pentium, you will see that the compiler uses push/pop
instructions even when not using -Os.
--
Marcel Cox
* Re: Future possible stack based optimization
From: Frediano Ziglio @ 2006-01-27 6:05 UTC
To: Marcel Cox; +Cc: gcc
On Wed, 25/01/2006 at 22:29 +0100, Marcel Cox wrote:
> Whether push/pop instructions or mov instructions are faster depends on
> the type of processor used. GCC is well aware of this. If you specify
> the desired processor with -mtune, GCC will use whatever is best for
> that processor. For example, if you optimize for old Pentium processors
> with -mtune=pentium, you will see that the compiler uses push/pop
> instructions even when not using -Os.
Marcel,
I tried many options with several gcc versions, and I can confirm that
gcc does not use push in the way I suggest. Perhaps a smaller example
will help:
extern int foo1(int *a);
void foo2()
{
    int x = 2;
    foo1(&x);
}
should become something like:
foo2:
    # here is the optimization I suggested:
    # allocation and set with a single instruction
    pushl $2
    # I don't understand why gcc emits
    # movl %esp, %eax; pushl %eax here
    pushl %esp
    call foo1
    # both pushed words must be released before ret; this can be
    # addl $8, %esp or similar, depending on the options you suggested
    popl %edx
    popl %ecx
    ret
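The key point in the listing above is that pushl %esp stores the value ESP had before the push, which at that moment is the address of x. Below is a minimal C model of that behavior, offered as a sketch only: the toy machine, its word-addressed memory and all names are hypothetical, and the return address pushed by call is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_WORDS 16u

/* Toy machine: a word-addressed stack that grows toward index 0. */
struct machine {
    uint32_t mem[STACK_WORDS];
    uint32_t esp;               /* index of the current top-of-stack word */
};

void machine_init(struct machine *m)
{
    memset(m->mem, 0, sizeof m->mem);
    m->esp = STACK_WORDS;       /* empty stack */
}

/* pushl <value>: make room for one word and store the value there. */
void push(struct machine *m, uint32_t value)
{
    assert(m->esp > 0);
    m->esp -= 1;
    m->mem[m->esp] = value;
}

/* pushl %esp: the stored value is ESP as it was before the decrement
   (the argument is evaluated before push() changes m->esp). */
void push_esp(struct machine *m)
{
    push(m, m->esp);
}

/* Models "pushl $2; pushl %esp" and returns what the callee would read
   through its pointer argument. */
uint32_t value_seen_by_callee(void)
{
    struct machine m;
    machine_init(&m);
    push(&m, 2);                    /* allocates x and sets it to 2 */
    push_esp(&m);                   /* argument: address of x */
    uint32_t arg = m.mem[m.esp];    /* pointer the callee receives */
    return m.mem[arg];              /* dereference it */
}
```

In this model value_seen_by_callee() returns 2, so the two pushes alone are enough to build both the local variable and the pointer argument.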
Is anybody working in this direction?
freddy77