public inbox for gcc@gcc.gnu.org
* Optimizations
@ 1997-12-09  9:52 David M. Ronis
  1997-12-09 11:19 ` Optimizations Jeffrey A Law
  1997-12-10 10:46 ` [EGCS] Optimizations Marc Lehmann
  0 siblings, 2 replies; 27+ messages in thread
From: David M. Ronis @ 1997-12-09  9:52 UTC
  To: egcs

There's been some discussion of egcs vs gcc vs MSC benchmark results
on comp.os.linux.development.apps recently, much of which ends up with
various suggestions of what compiler flags should be specified.  For
example, one poster suggests:

gcc -O6 -mpentium -fomit-frame-pointer -fexpensive-optimizations \
>-ffast-math

To that, add:
        -march=pentium
        -fschedule-insns
        -fschedule-insns2
        -fregmove
        -fdelayed-branch

According to the gcc info description, all the -f options are enabled
(if supported) when -O2 (and I presume -O6) is specified.  Is this
correct?  

To the above I normally add the following:

-malign-double -malign-loops=0 -malign-jumps=0 -malign-functions=0\
-mno-ieee-fp 

Are the -malign directives implied by -march=pentium? (They probably
should be, and in either case, this should be described in the info
pages.)

Is -mno-ieee-fp implied by -ffast-math?


David Ronis


* Re: [EGCS] Optimizations
@ 1997-12-14 14:30 meissner
  1997-12-15  5:38 ` Optimizations Marc Lehmann
       [not found] ` <19971216000653.24186.cygnus.egcs@cerebro.laendle>
  0 siblings, 2 replies; 27+ messages in thread
From: meissner @ 1997-12-14 14:30 UTC
  To: egcs

| Marc Lehmann wrote:
| > -fschedule-insns is a *loss* on x86 cpu's!
| 
| care to explain why it is a loss (and most probably also -fschedule-insns2)

The problem is that -fschedule-insns, -funroll-{,all-}loops, and
-fstrength-reduce all tend to work by creating more temporaries to hold
intermediate results, which in turn need more registers (in compiler speak,
this raises register pressure).
Obviously, -fschedule-insns2 doesn't suffer from this problem, since it only
schedules things after register allocation has been done (and thus on a machine
that has plenty of registers has little effect, other than to move spills
around).

--
Michael Meissner, Cygnus Solutions (Massachusetts office)
4th floor, 955 Massachusetts Avenue, Cambridge, MA 02139, USA
meissner@cygnus.com,	617-354-5416 (office),	617-354-7161 (fax)

* Optimizations
@ 2000-03-10  1:46 Virgil Palanciuc
  0 siblings, 0 replies; 27+ messages in thread
From: Virgil Palanciuc @ 2000-03-10  1:46 UTC
  To: mrs, gcc

>Well, optimizing gcc's memory allocation should be an interesting
>project.  Certainly we do little in this area, so any improvement you
>could contribute would be great.  Also, with the advent of 200+ cycle
>memory misses, it might be fairly profitable.  As for register
>allocation, I would hope that it would be hard to improve gcc's
>scheme, but the right person with the right algorithm...
   My project is specifically for the SC100 family. I never dreamed of being
able to do any general optimizations (I thought people had been optimizing
gcc for too long for me to be able to bring anything new in only two
months). However, if you think optimizing gcc's memory allocation is
possible (in AT MOST 2 months), please send me some details. Frankly, I
don't know how gcc does memory allocation; my approach was to write a new
phase that discards whatever memory allocation gcc has done so far.

> > schemes for a specific architecture (Motorola's SC100 family).
>Our code isn't specific to any processor and any code you submit that
>we included, would have to be fairly machine independent, or at least,
>it would have to not harm performance on other machines.
    I already explained: I didn't think I could make machine-independent
optimizations. Of course, I would be glad to be proven wrong on this.

                            Virgil.

* optimizations
@ 2003-01-14 22:58 Reza Roboubi
  2003-01-15  0:15 ` optimizations Andrew Pinski
  0 siblings, 1 reply; 27+ messages in thread
From: Reza Roboubi @ 2003-01-14 22:58 UTC
  To: gcc-help, gcc

In the following code, it is clear that the return value of mm() can be
eliminated.  In fact, many optimizations are possible here.  Yet gcc seems not
to be able to do these optimizations.  Below, I posted the assembly code that
gcc generated (for the while() loop).

I compiled this code with gcc -O2 -Wall.  

I was wondering if I am doing something wrong.  If not, then please comment on
current gcc developments in this regard, and what it takes to add some of these
features.

Please also comment on how other compilers would compare with gcc in this case.

Are there any non-obvious remedies you have for this case?

PS: Please tell me if I must report this as a gcc bug.

Thanks in advance for any help you provide.


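#include <unistd.h>     /* write() */

/* Step *i towards 0x10: return 0 once it is there, otherwise bump it
   and return 1.  static so that it also links under C99 inline rules. */
static inline int mm(int *i)
{
        if ((*i) == 0x10) { return 0; }
        (*i)++;
        return 1;
}

int
main()
{
        int k = 0;

        while (mm(&k)) {}       /* the return value only drives this loop */
        write(1, &k, 1);        /* write the first byte of k to stdout */

        return 0;
}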


Associated assembly code for the while() loop:

0x80483b0 <main+16>:	mov    0xfffffffc(%ebp),%eax
0x80483b3 <main+19>:	xor    %edx,%edx
0x80483b5 <main+21>:	cmp    $0x10,%eax
0x80483b8 <main+24>:	je     0x80483c3 <main+35>
0x80483ba <main+26>:	inc    %eax
0x80483bb <main+27>:	mov    $0x1,%edx
0x80483c0 <main+32>:	mov    %eax,0xfffffffc(%ebp)
0x80483c3 <main+35>:	test   %edx,%edx
0x80483c5 <main+37>:	jne    0x80483b0 <main+16>

* Re: optimizations
@ 2003-01-15 23:20 Bonzini
  2003-01-16 10:53 ` optimizations Reza Roboubi
  0 siblings, 1 reply; 27+ messages in thread
From: Bonzini @ 2003-01-15 23:20 UTC
  To: gcc, gcc-help; +Cc: reza

> > Could you please also tell me if 3.3 and 3.4 remove the extra mov's in and out
> > of %eax. Ideally, there should be no more than 4 instructions in the critical
> > loop.
>
> .L2:
> movl -4(%ebp), %eax <== still does the load
> cmpl $16, %eax
> je .L7
> incl %eax
> movl %eax, -4(%ebp) <== and store
> jmp .L2
> .L7:
>
> For some reason it is not (even with -fnew-ra), but on PPC there
> is no extra load/store.

Instruction counts do not tell the whole story; gcc is simply putting more
pressure on the decoding unit but less pressure on the execution unit (which
otherwise would execute two loads in the `taken' case).  Things might be
different if gcc is given other options like -mtune=i386.

|_  _  _ __
|_)(_)| ),'
------- '---



Thread overview: 27+ messages
1997-12-09  9:52 Optimizations David M. Ronis
1997-12-09 11:19 ` Optimizations Jeffrey A Law
1997-12-10 10:46 ` [EGCS] Optimizations Marc Lehmann
1997-12-14  5:39   ` Philipp Thomas
1997-12-14 15:14     ` Optimizations Marc Lehmann
1997-12-14 20:14       ` Optimizations Jeffrey A Law
1997-12-14 14:30 [EGCS] Optimizations meissner
1997-12-15  5:38 ` Optimizations Marc Lehmann
1997-12-15 11:29   ` Optimizations Dave Love
1997-12-15 15:43     ` Optimizations Marc Lehmann
     [not found] ` <19971216000653.24186.cygnus.egcs@cerebro.laendle>
1997-12-23  7:51   ` Optimizations Stan Cox
2000-03-10  1:46 Optimizations Virgil Palanciuc
2003-01-14 22:58 optimizations Reza Roboubi
2003-01-15  0:15 ` optimizations Andrew Pinski
2003-01-15  5:10   ` optimizations Reza Roboubi
2003-01-15  6:31     ` optimizations Reza Roboubi
2003-01-15 17:37       ` optimizations Andrew Pinski
2003-01-15 17:46         ` optimizations Reza Roboubi
2003-01-15 23:20 optimizations Bonzini
2003-01-16 10:53 ` optimizations Reza Roboubi
2003-01-16 11:03   ` optimizations tm_gccmail
2003-01-16 12:34     ` optimizations Reza Roboubi
2003-02-18 18:13       ` optimizations Håkan Hjort
2003-02-18 18:16         ` optimizations Andrew Pinski
2003-02-18 18:17         ` optimizations Zack Weinberg
2003-02-18 18:40           ` optimizations Håkan Hjort
2003-02-19  5:02           ` optimizations David Edelsohn
2003-01-16 11:53   ` optimizations Paolo Bonzini
