Re: -O2 versus -O1 (Was: Re: GCSE store motion)

* Re: -O2 versus -O1 (Was: Re: GCSE store motion)
@ 2002-05-16  8:55 Brad Lucier
  0 siblings, 0 replies; 10+ messages in thread
From: Brad Lucier @ 2002-05-16  8:55 UTC (permalink / raw)
  To: gcc; +Cc: lucier

There was a long thread in 1999 about compile time versus -Ok, k=1,2,...

In the end, I suggested the following guidelines.

If an optimization takes O(N) or O(N log N) time, put it in -O1.
(Here N is the number of instructions or edges.)  Otherwise, it should go in
-O2.

If an optimization can be computed using one bit per edge or per
register-basic block pair, then put it in -O1, otherwise put it in -O2.
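
The two guidelines above can be sketched as a small classifier. This is only an illustration under stated assumptions: the type and function names are hypothetical, and combining the time and space rules with a logical AND is one reading of the guidelines, not something the original thread spells out.

```c
/* Hypothetical cost classes for an optimization pass; N is the number
   of instructions or edges, as in the guideline above. */
enum time_class { TIME_LINEAR, TIME_N_LOG_N, TIME_SUPERLINEAR };

/* Hypothetical description of a pass: its running time and how many
   bits of state it keeps per edge (or per register/basic-block pair). */
struct pass_cost {
    const char *name;
    enum time_class time;
    unsigned bits_per_unit;
};

/* Return the lowest -O level at which the guidelines would admit the
   pass: 1 if it is both cheap in time and needs at most one bit per
   edge or register/basic-block pair, otherwise 2. */
int suggested_level(const struct pass_cost *p)
{
    int time_ok = (p->time == TIME_LINEAR || p->time == TIME_N_LOG_N);
    int space_ok = (p->bits_per_unit <= 1);
    return (time_ok && space_ok) ? 1 : 2;
}
```

Under this reading, a linear-time pass that needs a full word per edge would still land in -O2, because it fails the space rule.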

The subject line was " How long should -O1 compiles take?" and it began
in October.  My final suggestion had the subject line
"How much resources for -Ok, k=1,2,...?" on November 15.

Brad

* Re: -O2 versus -O1 (Was: Re: GCSE store motion)
  2002-05-17  4:33 Robert Dewar
@ 2002-05-17  5:03 ` Richard Earnshaw
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Earnshaw @ 2002-05-17  5:03 UTC (permalink / raw)
  To: Robert Dewar; +Cc: degger, gcc, Richard.Earnshaw

> We never want -g to affect the generated code. Actually what I would like
> to see (I mentioned this before) is a -O.5 which would do all optimizations
> that did not affect debugging. One of the problems with GCC is that -O0 is
> really terrible, much worse than other compilers in "no optimize" mode. That's
> a problem for two reasons.

Hmm, -Og might be a better name.  In particular it would be useful if such 
a mode inhibited optimizations that blurred statement boundaries, so

	int a;

	a = 1;

	a++;

would display correct results after each statement was executed.  'a' 
could still be put in a register though.
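
As a self-contained variant of the fragment above, the following sketch wraps the same three statements in a function. The function name is hypothetical. An optimizing compiler is free to fold the body into a constant return, in which case the intermediate value 1 never exists at run time and a debugger has nothing to show for it; a mode that preserved statement boundaries would keep both steps observable.

```c
/* Hypothetical helper: each statement sits on its own line so that a
   debugger can stop between them. */
int statement_boundaries(void)
{
    int a;      /* may live in a register rather than on the stack */

    a = 1;      /* stopping after this line should show a == 1 */

    a++;        /* stopping after this line should show a == 2 */

    return a;   /* an optimizer may reduce the whole body to "return 2" */
}
```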

R.

* Re: -O2 versus -O1 (Was: Re: GCSE store motion)
@ 2002-05-17  4:33 Robert Dewar
  2002-05-17  5:03 ` Richard Earnshaw
  0 siblings, 1 reply; 10+ messages in thread
From: Robert Dewar @ 2002-05-17  4:33 UTC (permalink / raw)
  To: degger, dewar; +Cc: gcc

<<What about simply disabling any problematic optimisation when -g is
supplied? It doesn't make sense to me to have another debugging flag
being "-O1".
>>

We never want -g to affect the generated code. Actually what I would like
to see (I mentioned this before) is a -O.5 which would do all optimizations
that did not affect debugging. One of the problems with GCC is that -O0 is
really terrible, much worse than other compilers in "no optimize" mode. That's
a problem for two reasons.

A substantial set of users is permanently set against optimization, either
because they are doing safety critical work (where no optimization is one
religion adhered to), or because they have been burned in the past. Or they
are doing benchmarks against another compiler where optimization does not
work and they think this is the fair way to do it.

Losing debugging information is a problem, and the trouble with -O0 is not
just the speed, but more importantly the huge size of executables.

* Re: -O2 versus -O1 (Was: Re: GCSE store motion)
  2002-05-17  3:42 ` Daniel Egger
@ 2002-05-17  4:00   ` Andreas Schwab
  0 siblings, 0 replies; 10+ messages in thread
From: Andreas Schwab @ 2002-05-17  4:00 UTC (permalink / raw)
  To: Daniel Egger; +Cc: Robert Dewar, GCC Developer Mailinglist

Daniel Egger <degger@fhm.edu> writes:

|> On Thu, 2002-05-16 at 16:59, Robert Dewar wrote:
|> 
|> > Is this really a goal? Because if it is, we fall far short, -O1 code is
|> > nowhere near "as debuggable as possible".
|> 
|> What about simply disabling any problematic optimisation when -g is
|> supplied?

This is against the policy of not changing the generated code when -g is
enabled.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE GmbH, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: -O2 versus -O1 (Was: Re: GCSE store motion)
  2002-05-16  8:21 Robert Dewar
@ 2002-05-17  3:42 ` Daniel Egger
  2002-05-17  4:00   ` Andreas Schwab
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Egger @ 2002-05-17  3:42 UTC (permalink / raw)
  To: Robert Dewar; +Cc: GCC Developer Mailinglist

On Thu, 2002-05-16 at 16:59, Robert Dewar wrote:

> Is this really a goal? Because if it is, we fall far short, -O1 code is
> nowhere near "as debuggable as possible".

What about simply disabling any problematic optimisation when -g is
supplied? It doesn't make sense to me to have another debugging flag
being "-O1".
 
-- 
Servus,
       Daniel

* Re: -O2 versus -O1 (Was: Re: GCSE store motion)
@ 2002-05-16  8:21 Robert Dewar
  2002-05-17  3:42 ` Daniel Egger
  0 siblings, 1 reply; 10+ messages in thread
From: Robert Dewar @ 2002-05-16  8:21 UTC (permalink / raw)
  To: bosch, jh; +Cc: aj, davem, dberlin, dewar, gcc, mark, roger, rth

> A second important goal for -O1 is to keep code as debuggable as
> possible. This rules out at least the sibling calls optimization, but
> probably others as well.

Is this really a goal? Because if it is, we fall far short, -O1 code is
nowhere near "as debuggable as possible".

* Re: -O2 versus -O1 (Was: Re: GCSE store motion)
  2002-05-16  8:04   ` Geert Bosch
@ 2002-05-16  8:06     ` Geert Bosch
  0 siblings, 0 replies; 10+ messages in thread
From: Geert Bosch @ 2002-05-16  8:06 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Robert Dewar, dberlin, mark, roger, aj, davem, gcc, rth


On Thursday, May 16, 2002, at 10:07 , Jan Hubicka wrote:

> On the other hand at -O2 we do some bits that are not that expensive
> and may come to -O1 category.  I would guess for:
>
>       flag_optimize_sibling_calls = 1;
>       flag_rename_registers = 1;
>       flag_caller_saves = 1;
>       flag_force_mem = 1;
>       flag_regmove = 1;
>       flag_strict_aliasing = 1;
>       flag_reorder_blocks = 1;
>       flag_reorder_functions = 1;

A second important goal for -O1 is to keep code as debuggable as
possible. This rules out at least the sibling calls optimization, but
probably others as well.
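
To illustrate why sibling call optimization hurts debuggability, here is a minimal sketch. Both function names are hypothetical and do not come from the thread. The call in `caller` is in tail position, so a compiler performing this optimization may turn it into a jump that reuses `caller`'s stack frame.

```c
/* Hypothetical callee; a breakpoint placed here is the interesting case. */
int callee(int x)
{
    return x + 1;
}

/* Hypothetical caller. Nothing remains to be done after callee()
   returns, so the call may be compiled as a jump. When that happens,
   caller()'s frame is gone by the time callee() runs, and a backtrace
   taken inside callee() would no longer list caller(). */
int caller(int x)
{
    return callee(x * 2);
}
```

The computed result is the same either way; what changes is the call stack a debugger can reconstruct.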

   -Geert

* Re: -O2 versus -O1 (Was: Re: GCSE store motion)
  2002-05-16  7:33 ` -O2 versus -O1 (Was: Re: GCSE store motion) Jan Hubicka
  2002-05-16  8:04   ` Mark Mitchell
@ 2002-05-16  8:04   ` Geert Bosch
  2002-05-16  8:06     ` Geert Bosch
  1 sibling, 1 reply; 10+ messages in thread
From: Geert Bosch @ 2002-05-16  8:04 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Robert Dewar, dberlin, mark, roger, aj, davem, gcc, rth


On Thursday, May 16, 2002, at 10:07 , Jan Hubicka wrote:

> On the other hand at -O2 we do some bits that are not that expensive
> and may come to -O1 category.  I would guess for:
>
>       flag_optimize_sibling_calls = 1;
>       flag_rename_registers = 1;
>       flag_caller_saves = 1;
>       flag_force_mem = 1;
>       flag_regmove = 1;
>       flag_strict_aliasing = 1;
>       flag_reorder_blocks = 1;
>       flag_reorder_functions = 1;

A second important goal for -O1 is to keep code as debuggable as
possible. This rules out at least the sibling calls optimization, but
probably others as well.

   -Geert

* Re: -O2 versus -O1 (Was: Re: GCSE store motion)
  2002-05-16  7:33 ` -O2 versus -O1 (Was: Re: GCSE store motion) Jan Hubicka
@ 2002-05-16  8:04   ` Mark Mitchell
  2002-05-16  8:04   ` Geert Bosch
  1 sibling, 0 replies; 10+ messages in thread
From: Mark Mitchell @ 2002-05-16  8:04 UTC (permalink / raw)
  To: Jan Hubicka, Robert Dewar; +Cc: dberlin, roger, aj, davem, gcc, rth

I'm not sure how to change around what's in -O2 and -O1.  We don't want
to confuse people who are used to one set of things, of course.  It's
a tricky question.

> Another thing I believe can be worthwhile is to have a switch that enables
> the aggressive bits, like loop unrolling or prefetch, that people can use
> for benchmarks or very CPU bound code.  It appears to be a common problem
> of GCC reviews that they use suboptimal switches, and partly it is our
> mistake, I guess. It is very difficult to set it up.

See my earlier rants about why it is bad to have so dang many options...

I'm not sure what to do, exactly, but you're right that it would be nice
if you tended to get the fastest code with "-O2" or "-O3" and not
"-O2 -fno-this -fthat".  If that's not turning out to be true, we should
see if we could tune it somewhat.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

* -O2 versus -O1 (Was: Re: GCSE store motion)
  2002-05-16  5:30 GCSE store motion Robert Dewar
@ 2002-05-16  7:33 ` Jan Hubicka
  2002-05-16  8:04   ` Mark Mitchell
  2002-05-16  8:04   ` Geert Bosch
  0 siblings, 2 replies; 10+ messages in thread
From: Jan Hubicka @ 2002-05-16  7:33 UTC (permalink / raw)
  To: Robert Dewar; +Cc: dberlin, mark, roger, aj, davem, gcc, rth

> > That means we shouldn't be spending much time trying to do software
> > loop pipelining when compiling GCC, so the optimization shouldn't
> > make compiling the compiler significantly slower.
> 
> I don't see how you conclude this. You have to do the analysis on every
> loop. There will definitely be loops in GCC where the optimization is
> possible, there will be loops where it is not. I would expect the
> compiler to spend quite a bit of time trying to improve code for
> loops in GCC. What I am saying is that I doubt that the overall
> effect will be that beneficial for GCC.

I don't think the rule should be taken literally for each optimization.
Software pipelining, profile feedback, loop unrolling, function inlining,
prefetch code generation, and scheduling on i386 are all optimizations that
will lose in such a test and are still worthwhile to have, since for numeric
code, for instance, they are a must.

I think we have -O1 for those "I want sane code but don't have time to wait"
and -O2 for "I can wait to save an extra few %".

On the other hand, what I think is worthwhile is to reconsider what
optimizations should be enabled at -O1. Currently we do:

      flag_defer_pop = 1;
      flag_thread_jumps = 1;
#ifdef DELAY_SLOTS
      flag_delayed_branch = 1;
#endif
#ifdef CAN_DEBUG_WITHOUT_FP
      flag_omit_frame_pointer = 1;
#endif
      flag_guess_branch_prob = 1;
      flag_cprop_registers = 1;
      flag_loop_optimize = 1;
      flag_crossjumping = 1;
      flag_if_conversion = 1;
      flag_if_conversion2 = 1;

I believe crossjumping, jump threading and perhaps if conversion 2 are examples
of optimizations that are expensive and bring not so much benefit.
Do you think it makes sense to run some tests and think about disabling them?
Would "bootstrap -O1" be considered a valuable rule of thumb?

On the other hand at -O2 we do some bits that are not that expensive
and may come to -O1 category.  I would guess for:

      flag_optimize_sibling_calls = 1;
      flag_rename_registers = 1;
      flag_caller_saves = 1;
      flag_force_mem = 1;
      flag_regmove = 1;
      flag_strict_aliasing = 1;
      flag_reorder_blocks = 1;
      flag_reorder_functions = 1;

What do you think?  If we get some kind of agreement, I can run a series of
tests for these optimizations...
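
The two flag lists above can be pictured as level-gated defaults. The sketch below is not the GCC source: the struct, its field names, and `flags_for_level` are hypothetical stand-ins for a handful of the flags listed, meant only to show that moving an optimization between -O1 and -O2 amounts to moving one assignment from one block to the other.

```c
/* Hypothetical stand-ins for a few of the flags listed above. */
struct opt_flags {
    int thread_jumps;
    int crossjumping;
    int optimize_sibling_calls;
    int regmove;
};

/* Fill in defaults for a given -O level: the -O1 set for level >= 1,
   plus the additional -O2 set for level >= 2. */
struct opt_flags flags_for_level(int level)
{
    struct opt_flags f = { 0, 0, 0, 0 };

    if (level >= 1) {
        f.thread_jumps = 1;
        f.crossjumping = 1;
    }
    if (level >= 2) {
        f.optimize_sibling_calls = 1;
        f.regmove = 1;
    }
    return f;
}
```

With this shape, the proposed experiments come down to toggling one field at a time and measuring compile time and generated code quality at each level.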

Another thing I believe can be worthwhile is to have a switch that enables
the aggressive bits, like loop unrolling or prefetch, that people can use for
benchmarks or very CPU bound code.  It appears to be a common problem of
GCC reviews that they use suboptimal switches, and partly it is our mistake,
I guess. It is very difficult to set it up.

Honza
