public inbox for gcc@gcc.gnu.org
* Future possible stack based optimization
@ 2006-01-25 20:04 Frediano Ziglio
  2006-01-25 21:29 ` Marcel Cox
From: Frediano Ziglio @ 2006-01-25 20:04 UTC
  To: gcc

Hi,
  I noticed that stack instructions on Intel platforms are not used
very much. I think this is a pity, because stack operations are small
(a size optimization) and usually fast (since the Pentium, two
consecutive push/pop instructions can execute together, a speed
optimization). Consider this small piece of code:

extern int foo1(int *a);

int foo2(int a)
{
	int b = a + 2;
	return foo1(&b);
}

Compiling with

$ gcc -O2 -mpreferred-stack-boundary=2 -fomit-frame-pointer  -S optim1.c

$ gcc --version
gcc (GCC) 4.2.0 20060107 (experimental)

produces the following code:

foo2:
        subl    $8, %esp
        movl    12(%esp), %eax
        addl    $2, %eax
        movl    %eax, 4(%esp)
        leal    4(%esp), %eax
        movl    %eax, (%esp)
        call    foo1
        addl    $8, %esp
        ret

Compiling with -Os instead:

$ gcc -Os -mpreferred-stack-boundary=2 -fomit-frame-pointer  -S optim1.c

foo2:
        subl    $4, %esp
        movl    8(%esp), %eax
        addl    $2, %eax
        movl    %eax, (%esp)
        movl    %esp, %eax
        pushl   %eax
        call    foo1
        popl    %edx
        popl    %ecx
        ret

This is worse than 4.0.2:

$ gcc -O2 -mpreferred-stack-boundary=2 -fomit-frame-pointer  -S optim1.c

$ gcc --version
gcc (GCC) 4.0.2 20051125 (Red Hat 4.0.2-8)

foo2:
        pushl   %eax
        movl    8(%esp), %eax
        addl    $2, %eax
        movl    %eax, (%esp)
        movl    %esp, %eax
        pushl   %eax
        call    foo1
        addl    $8, %esp
        ret

(Note the pushl %eax used as a size optimization in place of subl $4, %esp.)

Would it be possible, instead of allocating memory with subl/pushl and
then storing to it, to allocate and set the memory with pushl alone?
Something like:

foo2:
	movl	4(%esp), %eax
	addl	$2, %eax
	pushl	%eax
	pushl	%esp
	call	foo1
	popl	%edx
	popl	%ecx
	ret

(Note that the first pushl both allocates and sets the variable on the stack.)
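
As a rough check of the size argument, the sketch below tallies the encoded length of each foo2 listing in this message. The byte counts are the usual shortest 32-bit encodings (for example 83 EC 04 for subl $4, %esp and 50 for pushl %eax), and the call is assumed to be the 5-byte E8 rel32 form. The struct and function names are made up for illustration; nothing here comes from GCC itself.

```c
#include <assert.h>
#include <stddef.h>

/* One instruction and its encoded length in bytes (32-bit x86). */
struct insn {
    const char *text;
    int bytes;
};

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Sum the encoded lengths of a sequence of n instructions. */
int code_size(const struct insn *seq, size_t n)
{
    int total = 0;
    assert(seq != NULL);
    for (size_t i = 0; i < n; i++)
        total += seq[i].bytes;
    return total;
}

/* gcc 4.2.0 -O2 listing. */
static const struct insn gcc42_O2[] = {
    {"subl $8, %esp", 3},       {"movl 12(%esp), %eax", 4},
    {"addl $2, %eax", 3},       {"movl %eax, 4(%esp)", 4},
    {"leal 4(%esp), %eax", 4},  {"movl %eax, (%esp)", 3},
    {"call foo1", 5},           {"addl $8, %esp", 3},
    {"ret", 1},
};

/* gcc 4.2.0 -Os listing. */
static const struct insn gcc42_Os[] = {
    {"subl $4, %esp", 3},       {"movl 8(%esp), %eax", 4},
    {"addl $2, %eax", 3},       {"movl %eax, (%esp)", 3},
    {"movl %esp, %eax", 2},     {"pushl %eax", 1},
    {"call foo1", 5},           {"popl %edx", 1},
    {"popl %ecx", 1},           {"ret", 1},
};

/* gcc 4.0.2 -O2 listing. */
static const struct insn gcc402_O2[] = {
    {"pushl %eax", 1},          {"movl 8(%esp), %eax", 4},
    {"addl $2, %eax", 3},       {"movl %eax, (%esp)", 3},
    {"movl %esp, %eax", 2},     {"pushl %eax", 1},
    {"call foo1", 5},           {"addl $8, %esp", 3},
    {"ret", 1},
};

/* Hand-written sequence suggested in this message. */
static const struct insn proposed[] = {
    {"movl 4(%esp), %eax", 4},  {"addl $2, %eax", 3},
    {"pushl %eax", 1},          {"pushl %esp", 1},
    {"call foo1", 5},           {"popl %edx", 1},
    {"popl %ecx", 1},           {"ret", 1},
};

int size_gcc42_O2(void)  { return code_size(gcc42_O2, COUNT(gcc42_O2)); }
int size_gcc42_Os(void)  { return code_size(gcc42_Os, COUNT(gcc42_Os)); }
int size_gcc402_O2(void) { return code_size(gcc402_O2, COUNT(gcc402_O2)); }
int size_proposed(void)  { return code_size(proposed, COUNT(proposed)); }
```

Under these assumptions the four listings come to 30, 24, 23 and 17 bytes respectively. This says nothing about speed, which depends on the processor.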

Is anyone working in this direction?

bye
  Frediano Ziglio



* Re: Future possible stack based optimization
  2006-01-25 20:04 Future possible stack based optimization Frediano Ziglio
@ 2006-01-25 21:29 ` Marcel Cox
  2006-01-27  6:05   ` Frediano Ziglio
From: Marcel Cox @ 2006-01-25 21:29 UTC
  To: Frediano Ziglio; +Cc: gcc


>   I noticed that stack instructions on Intel platforms are not used
> very much. I think this is a pity, because stack operations are small
> (a size optimization) and usually fast (since the Pentium, two
> consecutive push/pop instructions can execute together, a speed
> optimization). Consider this small piece of code:


Whether push/pop instructions or mov instructions are faster depends on
the type of processor used. GCC is well aware of this. If you specify
the target processor with -mtune, GCC will use whatever is best
for that processor. For example, if you optimize for old Pentium
processors with -mtune=pentium, you will see that the compiler uses
push/pop instructions even when not using -Os.
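
One way to try this is sketched below: the test case from the first message with a stub for foo1 added so that it also links and runs. The stub, the output file names and the choice of pentium4 as the second tuning are assumptions made for illustration; the generated assembly will differ between GCC versions, so no particular output is claimed.

```c
/*
 * optim1.c from this thread, plus a stub for foo1.
 *
 * To compare instruction selection under two tunings (32-bit x86
 * target assumed; on an x86-64 host add -m32):
 *
 *   gcc -O2 -mtune=pentium  -mpreferred-stack-boundary=2 \
 *       -fomit-frame-pointer -S optim1.c -o optim1-pentium.s
 *   gcc -O2 -mtune=pentium4 -mpreferred-stack-boundary=2 \
 *       -fomit-frame-pointer -S optim1.c -o optim1-pentium4.s
 *
 * and then diff the two .s files.
 */
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stub: the thread only declares foo1. The GCC noinline
 * attribute keeps the call in foo2 from being inlined away. For the
 * closest match to the original, keep foo1 in a separate file.
 */
__attribute__((noinline)) int foo1(int *a)
{
    assert(a != NULL);
    return *a;
}

int foo2(int a)
{
    int b = a + 2;      /* local whose address escapes to foo1 */
    return foo1(&b);
}
```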
-- 
Marcel Cox 


* Re: Future possible stack based optimization
  2006-01-25 21:29 ` Marcel Cox
@ 2006-01-27  6:05   ` Frediano Ziglio
From: Frediano Ziglio @ 2006-01-27  6:05 UTC
  To: Marcel Cox; +Cc: gcc

On Wed, 25/01/2006 at 22:29 +0100, Marcel Cox wrote:
> >   I noticed that stack instructions on Intel platforms are not used
> > very much. I think this is a pity, because stack operations are small
> > (a size optimization) and usually fast (since the Pentium, two
> > consecutive push/pop instructions can execute together, a speed
> > optimization). Consider this small piece of code:
> 
> 
> Whether push/pop instructions or mov instructions are faster depends on
> the type of processor used. GCC is well aware of this. If you specify
> the target processor with -mtune, GCC will use whatever is best
> for that processor. For example, if you optimize for old Pentium
> processors with -mtune=pentium, you will see that the compiler uses
> push/pop instructions even when not using -Os.

Marcel,
  I tried many options with several gcc versions, and I can confirm that
gcc does not use push in the way I suggest. Perhaps a smaller example
will help:

extern int foo1(int *a);

void foo2()
{
	int x = 2;
	foo1(&x);
}

should become something like:

foo2:
# here is the optimization I suggest:
# allocate and set the variable with a single instruction
	pushl	$2
# I don't understand why gcc emits
# movl %esp, %eax ; pushl %eax   here instead
	pushl	%esp
	call	foo1
# two words were pushed, so two must be removed before ret; this can
# be addl $8, %esp or similar depending on the options you suggested
	popl	%edx
	popl	%ecx
	ret
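
To see that the two pushes build the same stack as the longer form, here is a toy model in C. It assumes only that pushl reads its operand before decrementing esp, so pushl %esp stores the old stack pointer (the behavior on the 286 and later). The mov-based sequence is a transcription of the pattern in the earlier listings applied to this function, not actual compiler output, and all names in the model are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_WORDS 16

/* Toy 32-bit stack: mem[] holds words, esp is a byte offset into it. */
struct machine {
    uint32_t mem[STACK_WORDS];
    uint32_t esp;               /* grows downward */
    uint32_t eax;
};

void machine_init(struct machine *m)
{
    memset(m, 0, sizeof *m);
    m->esp = STACK_WORDS * 4;   /* empty stack */
}

void store(struct machine *m, uint32_t addr, uint32_t value)
{
    assert(addr % 4 == 0 && addr / 4 < STACK_WORDS);
    m->mem[addr / 4] = value;
}

uint32_t load(const struct machine *m, uint32_t addr)
{
    assert(addr % 4 == 0 && addr / 4 < STACK_WORDS);
    return m->mem[addr / 4];
}

/* pushl: 'value' is evaluated by the caller before esp changes, which
 * mirrors "pushl %esp" storing the pre-decrement stack pointer. */
void push(struct machine *m, uint32_t value)
{
    assert(m->esp >= 4);
    m->esp -= 4;
    store(m, m->esp, value);
}

/* Allocate, store, copy esp, push (pattern from the gcc listings). */
void run_current(struct machine *m)
{
    m->esp -= 4;                /* subl  $4, %esp   */
    store(m, m->esp, 2);        /* movl  $2, (%esp) */
    m->eax = m->esp;            /* movl  %esp, %eax */
    push(m, m->eax);            /* pushl %eax       */
}

/* Suggested sequence: each push allocates and sets in one step. */
void run_proposed(struct machine *m)
{
    push(m, 2);                 /* pushl $2   */
    push(m, m->esp);            /* pushl %esp */
}

/* Nonzero when both machines have the same esp and stack contents. */
int same_stack(const struct machine *a, const struct machine *b)
{
    return a->esp == b->esp &&
           memcmp(a->mem, b->mem, sizeof a->mem) == 0;
}
```

Running both sequences from the same initial state leaves the word at esp holding the address of x and the word above it holding 2 in each case; only eax differs, and it is not live across the call.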

Is anybody working in this direction?

freddy77


