public inbox for gcc-help@gcc.gnu.org
* Program fails when optimizing for speed under gcc 4.6
@ 2011-12-20 20:16 Wander Lairson Costa
  2011-12-20 21:52 ` kevin diggs
  0 siblings, 1 reply; 4+ messages in thread
From: Wander Lairson Costa @ 2011-12-20 20:16 UTC (permalink / raw)
  To: gcc-help

Dear all,

I have some homemade alpha-blending code that worked up to gcc 4.5
but fails with gcc 4.6 (tested on gcc 4.6.1 [ubuntu] and gcc 4.6.2
[archlinux]) when I optimize for speed (-O1). If I optimize for
size (-Os) it works fine. To make a long story short, when optimizing
for speed gcc generates code that accesses local variables through the
esp register, which causes trouble in a part of my code that is
written in assembly:

        __asm__ __volatile__ (
            /* Initialize the counter and skip */
            /* if the latter is equal to zero. */
            "movl   %0,%%ecx\n\t"
            "cmpl   $0,%%ecx\n\t"
            "jz     not_blend\n\t"

            /* Load the frame buffer pointers into the registers. */

            "pushl      %%ebx\n\t"      /* <-- HERE IS THE ROOT OF THE PROBLEM */
            "movl       %1,%%edi\n\t"   /* <-- In these three lines gcc accesses */
            "movl       %2,%%esi\n\t"   /* <-- the %1, %2, and %3 variables */
            "movl       %3,%%ebx\n\t"   /* <-- using the esp register */

The problem is that inside the assembly code I execute a "pushl %%ebx"
instruction, which changes the esp register, and after it I access
local variables using the "%n" operands. When optimizing for speed,
gcc emits code that addresses those variables relative to esp, and
those offsets are no longer valid after the push. When no optimization
is applied, or when optimizing for size, the local variables are
accessed through the ebp register and everything runs fine.
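
As a toy illustration of the paragraph above (plain C, not the original
blend code; every name in it is made up for the example): the array
stands in for the stack and `sp` for esp. An offset that was correct
before a push points at a different word afterwards, which is what
happens to an esp-relative operand after a hand-written "pushl".

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a downward-growing stack; all names are hypothetical. */
enum { STACK_WORDS = 16 };

struct toy_stack {
    uint32_t mem[STACK_WORDS];
    int sp;                 /* index of the top word, plays the role of esp */
};

/* Like "pushl": move the stack pointer down one word, then store. */
static void toy_push(struct toy_stack *s, uint32_t value)
{
    assert(s->sp > 0);
    s->sp -= 1;
    s->mem[s->sp] = value;
}

/* Like "movl off(%esp), reg": read a word relative to the current top. */
static uint32_t toy_load_sp_relative(const struct toy_stack *s, int offset)
{
    assert(s->sp + offset >= 0 && s->sp + offset < STACK_WORDS);
    return s->mem[s->sp + offset];
}

/* Returns 1 if an offset computed before a push reads the wrong word after it. */
int toy_offset_goes_stale(void)
{
    struct toy_stack s = { {0}, STACK_WORDS };

    toy_push(&s, 0x1111);                           /* a "local variable" at offset 0 */
    uint32_t before = toy_load_sp_relative(&s, 0);  /* reads the local */

    toy_push(&s, 0xEBEB);                           /* the push the compiler never saw */
    uint32_t after = toy_load_sp_relative(&s, 0);   /* same offset, different word */

    /* The local is still there, but now one word further from the top. */
    return before == 0x1111 && after == 0xEBEB
        && toy_load_sp_relative(&s, 1) == 0x1111;
}
```

Calling toy_offset_goes_stale() returns 1: the load at offset 0 sees the
pushed value, and the local is found only at offset 1.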

Now I am not sure whether I am missing some detail of the IA-32 spec
that prohibits me from pushing things onto the stack, or whether gcc is
emitting some kind of invalid code. Any ideas?

Thanks in advance.

-- 
Best Regards,
Wander Lairson Costa


* Re: Program fails when optimizing for speed under gcc 4.6
From: kevin diggs @ 2011-12-20 21:52 UTC (permalink / raw)
  To: Wander Lairson Costa; +Cc: gcc-help

Hi,

Can you use '-O1 -fno-omit-frame-pointer'? Does -O1 enable
-fomit-frame-pointer? You can also try -fverbose-asm to see the list of
-f options that are passed to cc1 (only useful in a -S compile). Maybe
you can look for the culprit that is haunting your code that way?
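
If changing the flags for the whole build is not an option, GCC's
`optimize` function attribute may let you request the same thing for a
single function. A rough sketch, with assumptions: the macro name and
`blend_channel` are made up for the example, the body is plain C
standing in for a routine that would contain the asm, and the GCC
manual describes this attribute as meant for debugging rather than
production use. A frame pointer also does not by itself guarantee how
gcc will address any particular operand.

```c
#include <assert.h>

#if defined(__GNUC__) && !defined(__clang__)
/* GCC only: strings not starting with "O" are treated as -f options,
 * so this asks for -fno-omit-frame-pointer in the marked function. */
#define KEEP_FRAME_POINTER __attribute__((optimize("no-omit-frame-pointer")))
#else
/* Other compilers: expand to nothing (assumption: no equivalent needed). */
#define KEEP_FRAME_POINTER
#endif

/* Hypothetical stand-in for a routine that would hold hand-written asm. */
KEEP_FRAME_POINTER
int blend_channel(int dst, int src, int alpha)
{
    assert(alpha >= 0 && alpha <= 255);
    /* Plain-C alpha blend of one 8-bit channel. */
    return (src * alpha + dst * (255 - alpha)) / 255;
}
```

Compiling with -S (and -fverbose-asm) should show whether the function
sets up ebp as expected.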

kevin


* Re: Program fails when optimizing for speed under gcc 4.6
From: Wander Lairson Costa @ 2011-12-20 22:41 UTC (permalink / raw)
  To: gcc-help

2011/12/20 kevin diggs <diggskevin38@gmail.com>:
>
> Hi,
>
> Can you use '-O1 -fno-omit-frame-pointer'? Does -O1 enable
> -fomit-frame-pointer? You can also try -fverbose-asm to see the list of
> -f options that are passed to cc1 (only useful in a -S compile). Maybe
> you can look for the culprit that is haunting your code that way?
>
> kevin

Hi kevin,

Thanks for pointing this out. From the gcc 4.6.2 manual:

"Starting with GCC version 4.6, the default setting (when not
optimizing for size) for 32-bit Linux x86 and 32-bit Darwin x86
targets has been changed to -fomit-frame-pointer. The default can be
reverted to -fno-omit-frame-pointer by configuring GCC with the
--enable-frame-pointer configure option."
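
Since the frame pointer can no longer be relied on by default, one way
to make the asm independent of it is to avoid memory operands and
manual push/pop altogether. A minimal sketch, assuming GCC-style
extended asm on x86 (the function name is hypothetical and this is not
the original blend routine): operands requested with the "r" constraint
are substituted as register names, so they cannot go stale if esp
moves, and a scratch register listed as a clobber is saved and restored
by the compiler where needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns a + b, using ecx as a scratch register. */
uint32_t add_with_scratch(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ (
        "movl %1, %%ecx\n\t"    /* copy b into the scratch register */
        "addl %%ecx, %0"        /* a += scratch */
        : "+r" (a)              /* %0: read-write, kept in a register */
        : "r" (b)               /* %1: input, kept in a register */
        : "ecx", "cc"           /* tell the compiler what the asm overwrites */
    );
    return a;
#else
    /* Fallback for other targets, so the sketch still builds. */
    return a + b;
#endif
}
```

For example, add_with_scratch(2, 3) returns 5 whether or not the frame
pointer is omitted, because no operand is addressed through esp or ebp.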

I am out of office now. Tomorrow I will test compiling with
-fno-omit-frame-pointer. Thanks again.

-- 
Best Regards,
Wander Lairson Costa


* Re: Program fails when optimizing for speed under gcc 4.6
From: Wander Lairson Costa @ 2011-12-21 23:24 UTC (permalink / raw)
  To: gcc-help

2011/12/20 Wander Lairson Costa <wander.lairson@gmail.com>:
> I am out of office now. Tomorrow I will test compiling with
> -fno-omit-frame-pointer. Thanks again.
>

Just confirming that using -fno-omit-frame-pointer worked. Thanks once more.


-- 
Best Regards,
Wander Lairson Costa
