public inbox for gcc@gcc.gnu.org
* abysmal code generated by gcc 3.2
@ 2002-10-21  0:56 Denys Duchier
  2002-10-21  7:37 ` Fergus Henderson
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Denys Duchier @ 2002-10-21  0:56 UTC
  To: gcc

my application is the implementation of a virtual machine for an
emulated programming language.  Switching from gcc 2.95.x to 3.2
brought a few expected pains due to the change in data layout, but the
major issue is that gcc 3.2 produces extremely poor code for my
application on x86 (also on others, but I have not measured those
personally).

Measuring just the impact on the main emulator loop (which uses the
classical threaded code technique, i.e. jumps to first class labels) I
found that the emulator was slowed down by a FACTOR of 8.27.
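
For readers unfamiliar with the technique, the following is a minimal
sketch of threaded-code dispatch using GNU C's labels-as-values
extension (`&&label` and `goto *`).  It is not taken from the emulator
discussed here: the function name, the register variables, and the
HALT instruction are hypothetical, and the MOVEXX handler is only a
guess at the semantics suggested by the assembly quoted later in this
message (two pointer operands, then a three-slot advance of the
program counter).

```c
#include <assert.h>

/* Hypothetical miniature emulator: runs two MOVEXX instructions and
   then HALT.  Returns the final value of emulated register c. */
long run_threaded_demo(long initial)
{
    long a = initial, b = 0, c = 0;   /* emulated registers */

    /* Each instruction is a handler address followed by its operands.
       &&label (address of a label) is a GNU C extension. */
    void *program[] = {
        &&MOVEXX, &a, &b,   /* b = a */
        &&MOVEXX, &b, &c,   /* c = b */
        &&HALT
    };
    void **pc = program;

    goto **pc;              /* dispatch the first instruction */

MOVEXX:
    assert(pc[1] && pc[2]);             /* both operands must be present */
    *(long *)pc[2] = *(long *)pc[1];    /* copy source to destination */
    pc += 3;                            /* skip handler + two operands */
    goto **pc;                          /* jump straight to next handler */

HALT:
    return c;
}
```

Calling `run_threaded_demo(42)` copies the value through both moves
and returns it.  Each handler ends with its own indirect jump, so
there is no central dispatch loop; that is why any extra work the
compiler places before the final jump is paid on every emulated
instruction.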

Looking at the generated assembly code, it is clear that the 3.2
compiler expends a lot of effort trying to keep a certain set of
values in registers.  On x86, this is a horrible policy (especially in
a threaded code interpretation loop).

Part of the problem comes from an interaction with inlining.  I turned
inlining off for a couple of non-critical functions which were
exposing values that the compiler ended up trying to keep in
registers, and I declared one variable volatile (much better results
than trying to switch off gcse).
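
The message does not say how inlining was switched off or which
variable was made volatile.  As one possible illustration, assuming
GCC's `noinline` function attribute, a sketch of both workarounds
could look like the following; every name in it is hypothetical.

```c
#include <assert.h>

/* Hypothetical global state, standing in for something like a heap
   pointer.  volatile forces every access to go through memory. */
volatile long demo_heap_cur = 0;

/* Hypothetical non-critical helper, kept out of line so that its
   values are not exposed to the caller's register allocation. */
__attribute__((noinline))
long slow_path_alloc(long nbytes)
{
    assert(nbytes >= 0);
    long old = demo_heap_cur;       /* volatile read from memory */
    demo_heap_cur = old + nbytes;   /* volatile write back to memory */
    return old;
}

/* Hot path: calls the helper instead of absorbing its body. */
long hot_loop_demo(int iterations)
{
    long last = -1;
    for (int i = 0; i < iterations; i++)
        last = slow_path_alloc(8);
    return last;
}
```

Whether this helps in a given interpreter depends on the compiler
version and the surrounding code; it only shows the mechanism.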

This got me to only a factor 1.37 slowdown :-) ... measured on
basically pure emulated recursion (i.e. the speed of looping while
doing nothing else).

Which of course still sucks majorly since this is the MAIN emulator
loop (and since _every_ part of the implementation has been sizeably
slowed down... aargh!)

Here is an example of what I still cannot get rid of.  First, the
code produced by gcc 2.95.x for the MOVEXX instruction:

#APP
         MOVEXX:
#NO_APP
        movl 4(%ebp),%edx
        movl 8(%ebp),%eax
        addl $12,%ebp
        movl (%edx),%edx
        movl %edx,(%eax)
        jmp *(%ebp)

Here is the code produced by gcc 3.2:

#APP
         MOVEXX:
#NO_APP
        movl    4(%ebp), %esi
        movl    8(%ebp), %eax
        addl    $12, %ebp               #  PC
        movl    (%esi), %ebx
        movl    _oz_heap_end, %esi      #  _oz_heap_end
        movl    %ebx, (%eax)
        movl    _oz_heap_cur, %ebx      #  _oz_heap_cur,  sPointer
        movl    480(%esp), %eax         #  CAP
        movl    am+52, %ecx             #  <variable>._currentOptVar, <anonymous>
        movl    am+28, %edx             #  <variable>.statusReg,  <anonymous>
        leal    12(%eax), %edi          #  <anonymous>
        jmp     *(%ebp)                 # * PC

To my uneducated eye, it looks like gcc is now trying very hard to
keep a bunch of values in registers.  Every emulated instruction is
like that, thus resulting in considerable overhead.  I tried to
declare _oz_heap_end and _oz_heap_cur volatile, but, curiously, that
had no effect on this particular code generation.

I am at my wits' end.  Can anyone help?  (I realize that my application
is atypical).

Cheers,

PS: the compiler options used for the emulator file are:
-fno-exceptions -O3 -pipe -fstrict-aliasing -march=pentium -mcpu=pentiumpro -fomit-frame-pointer

-- 
Dr. Denys Duchier			Denys.Duchier@ps.uni-sb.de
Forschungsbereich Programmiersysteme	(Programming Systems Lab)
Universitaet des Saarlandes, Geb. 45	http://www.ps.uni-sb.de/~duchier
Postfach 15 11 50			Phone: +49 681 302 5618
66041 Saarbruecken, Germany		Fax:   +49 681 302 5615

[parent not found: <20021021142058.25276.qmail@web21104.mail.yahoo.com>]
* Re: abysmal code generated by gcc 3.2
@ 2002-10-21 11:01 Joe Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Joe Wilson @ 2002-10-21 11:01 UTC
  To: Denys Duchier; +Cc: gcc

My mistake.  I did not see the cross-jump in the -O3 -finline-limit case.
If you follow the jumps you have many more instructions for MOVEXX.

-O2 does produce "optimal" code, though.

How does one disable these cross-jumps (if that's the correct term) in GCC 3.2?

--- Joe Wilson <developir@yahoo.com> wrote:
> The following GCC 3.2 flags:
> 
> -S -fno-exceptions -O3 -pipe -fstrict-aliasing -march=pentium -mcpu=pentiumpro
> -fomit-frame-pointer emulate.ii -finline-limit=10000000
> 
> also produce:
> 
>          MOVEXX:
> /NO_APP 
>         movl    4(%ebp), %esi
>         movl    8(%ebp), %eax
>         movl    (%esi), %ebx
>         movl    %ebx, (%eax)
> L6530:  
>         addl    $12, %ebp
>         jmp     L6288 
> 
> Using -O2 with the default inline limit produces comparable results as well.
> 
> --- Denys Duchier <Denys.Duchier@ps.uni-sb.de> wrote: 
> > As I mentioned to Brad Lucier (pc), it seems that the poor code
> > generation is somehow triggered in connection with inlining.  IIRC (I
> > have tried so many variations), if I supply -fno-inline-functions then
> > indeed I get the code above.  This very marginally improves straight
> > emulated recursion, but degrades the rest of the emulated
> > instructions.  Overall, it's a loss.  Any further lowering of the
> > optimization level also leads to a degradation in performance.

end of thread, other threads:[~2002-10-22 16:56 UTC | newest]

Thread overview: 16+ messages
2002-10-21  0:56 abysmal code generated by gcc 3.2 Denys Duchier
2002-10-21  7:37 ` Fergus Henderson
2002-10-21  8:48   ` Denys Duchier
2002-10-21 12:59 ` Mike Stump
2002-10-21 15:07   ` Denys Duchier
2002-10-21 15:12     ` Fergus Henderson
2002-10-21 15:37     ` Mike Stump
2002-10-21 16:06       ` Dale Johannesen
2002-10-22  6:03         ` Michael Matz
2002-10-22  8:30           ` Kurt Garloff
2002-10-22 11:29           ` Dale Johannesen
2002-10-21 18:43 ` Denys Duchier
2002-10-22  3:56   ` Richard Henderson
     [not found] <20021021142058.25276.qmail@web21104.mail.yahoo.com>
2002-10-21 10:29 ` Denys Duchier
2002-10-21 10:47   ` Joe Wilson
2002-10-21 11:01 Joe Wilson
