public inbox for gcc@gcc.gnu.org
* -Os is weak...
@ 2010-09-09 16:43 DJ Delorie
  2010-09-09 17:16 ` Ian Lance Taylor
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: DJ Delorie @ 2010-09-09 16:43 UTC (permalink / raw)
  To: gcc


The docs say...

@item -Os
@opindex Os
Optimize for size.  @option{-Os} enables all @option{-O2} optimizations that
do not typically increase code size.  It also performs further
optimizations designed to reduce code size.

@option{-Os} disables the following optimization flags:
@gccoptlist{-falign-functions  -falign-jumps  -falign-loops @gol
-falign-labels  -freorder-blocks  -freorder-blocks-and-partition @gol
-fprefetch-loop-arrays  -ftree-vect-loop-version}

But in reality, the only thing -Os does beyond -O2, aside from a few
niche special cases, is enable inlining, and maybe scheduling, which
for some cases may be the wrong thing to do.

Is this what we want?





  flag_schedule_insns = opt2 && ! optimize_size;

  if (optimize_size)
    {
      /* Inlining of functions reducing size is a good idea regardless of them
	 being declared inline.  */
      flag_inline_functions = 1;

      /* Basic optimization options.  */
      optimize_size = 1;
      if (optimize > 2)
	optimize = 2;

      /* We want to crossjump as much as possible.  */
      set_param_value ("min-crossjump-insns", 1);
    }
  else
    set_param_value ("min-crossjump-insns", initial_min_crossjump_insns);


$ grep optimize_size *.c
genconditions.c:   { "! optimize_size && ! TARGET_READ_MODIFY_WRITE",
genconditions.c:     __builtin_constant_p (! optimize_size && ! TARGET_READ_MODIFY_WRITE)
genconditions.c:     ? (int) (! optimize_size && ! TARGET_READ_MODIFY_WRITE)
opts.c:       optimize_size = 0;
opts.c:           optimize_size = 0;
opts.c:   optimize_size = 1;
opts.c:   optimize_size = 0;
opts.c:  flag_schedule_insns = opt2 && ! optimize_size;
opts.c:  if (optimize_size)
opts.c:      optimize_size = 1;
opts.c:  OPTIMIZATION_OPTIONS (optimize, optimize_size);
predict.c:  if (optimize_size)
predict.c:  return (optimize_size
toplev.c:   The only valid values are zero and nonzero. When optimize_size is
toplev.c:int optimize_size = 0;
toplev.c:  if (flag_prefetch_loop_arrays > 0 && optimize_size)
tree-inline.c:  if (size < 0 || size > MOVE_MAX_PIECES * MOVE_RATIO (!optimize_size))
tree-inline.c:    || (caller_opt->optimize_size != callee_opt->optimize_size))


* Re: -Os is weak...
  2010-09-09 16:43 -Os is weak DJ Delorie
@ 2010-09-09 17:16 ` Ian Lance Taylor
  2010-09-09 17:20   ` Andrew Pinski
  2010-09-09 17:38   ` DJ Delorie
  2010-09-09 19:42 ` Steven Bosscher
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 13+ messages in thread
From: Ian Lance Taylor @ 2010-09-09 17:16 UTC (permalink / raw)
  To: DJ Delorie; +Cc: gcc

DJ Delorie <dj@redhat.com> writes:

> But in reality, the only thing -Os does beyond -O2, aside from a few
> niche special cases, is enable inlining, and maybe scheduling, which
> for some cases may be the wrong thing to do.

Some backends also check optimize_size to change their cost algorithms
to favor shorter instruction sequences.

Ian


* Re: -Os is weak...
  2010-09-09 17:16 ` Ian Lance Taylor
@ 2010-09-09 17:20   ` Andrew Pinski
  2010-09-09 17:38   ` DJ Delorie
  1 sibling, 0 replies; 13+ messages in thread
From: Andrew Pinski @ 2010-09-09 17:20 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: DJ Delorie, gcc

On Thu, Sep 9, 2010 at 10:16 AM, Ian Lance Taylor <iant@google.com> wrote:
> Some backends also check optimize_size to change their cost algorithms
> to favor shorter instruction sequences.

Also see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16996 for all the
other known code size improvements that could be done.

Thanks,
Andrew Pinski


* Re: -Os is weak...
  2010-09-09 17:16 ` Ian Lance Taylor
  2010-09-09 17:20   ` Andrew Pinski
@ 2010-09-09 17:38   ` DJ Delorie
  1 sibling, 0 replies; 13+ messages in thread
From: DJ Delorie @ 2010-09-09 17:38 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc


> Some backends also check optimize_size to change their cost algorithms
> to favor shorter instruction sequences.

But why doesn't it do what the documentation says?  -falign-* seems
like an obvious one - aligning labels and such always makes the code
bigger.
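
To put a number on the alignment cost, here is a minimal sketch, assuming
a label that would otherwise land at byte offset `offset` and a
power-of-two alignment `align`.  The helper name `padding_for` is made up
for illustration and is not GCC code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC: number of padding bytes needed
   to round OFFSET up to the next multiple of ALIGN.  ALIGN must be a
   nonzero power of two.  */
size_t
padding_for (size_t offset, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (align - (offset & (align - 1))) & (align - 1);
}
```

With `align` set to 16, a label at offset 13 costs 3 bytes of padding.
With `align` set to 1 the result is always 0.  The padding is never
negative, so alignment can only add bytes, never remove them.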


* Re: -Os is weak...
  2010-09-09 16:43 -Os is weak DJ Delorie
  2010-09-09 17:16 ` Ian Lance Taylor
@ 2010-09-09 19:42 ` Steven Bosscher
  2010-09-09 20:00 ` Steven Bosscher
  2010-09-10  8:44 ` Steven Bosscher
  3 siblings, 0 replies; 13+ messages in thread
From: Steven Bosscher @ 2010-09-09 19:42 UTC (permalink / raw)
  To: DJ Delorie; +Cc: gcc

On Thu, Sep 9, 2010 at 6:43 PM, DJ Delorie <dj@redhat.com> wrote:

> $ grep optimize_size *.c

Try egrep "optimize_.*_for_speed|optimize_.*_for_size" * config/*/*

Ciao!
Steven


* Re: -Os is weak...
  2010-09-09 16:43 -Os is weak DJ Delorie
  2010-09-09 17:16 ` Ian Lance Taylor
  2010-09-09 19:42 ` Steven Bosscher
@ 2010-09-09 20:00 ` Steven Bosscher
  2010-09-10  8:44 ` Steven Bosscher
  3 siblings, 0 replies; 13+ messages in thread
From: Steven Bosscher @ 2010-09-09 20:00 UTC (permalink / raw)
  To: DJ Delorie; +Cc: gcc, Jan Hubicka

On Thu, Sep 9, 2010 at 6:43 PM, DJ Delorie <dj@redhat.com> wrote:
> $ grep optimize_size *.c
> genconditions.c:   { "! optimize_size && ! TARGET_READ_MODIFY_WRITE",
> genconditions.c:     __builtin_constant_p (! optimize_size && ! TARGET_READ_MODIFY_WRITE)
> genconditions.c:     ? (int) (! optimize_size && ! TARGET_READ_MODIFY_WRITE)

These are in comments, not actual tests of optimize_size.


> opts.c:       optimize_size = 0;
> opts.c:           optimize_size = 0;
> opts.c:   optimize_size = 1;
> opts.c:   optimize_size = 0;
> opts.c:  flag_schedule_insns = opt2 && ! optimize_size;
> opts.c:  if (optimize_size)
> opts.c:      optimize_size = 1;
> opts.c:  OPTIMIZATION_OPTIONS (optimize, optimize_size);

Various initialization bits for optimize_size, this is OK.


> predict.c:  if (optimize_size)

This looks like a bug; it should probably be:

if (optimize_function_for_size_p (DECL_STRUCT_FUNCTION (edge->caller->decl)))

Honza, what do you think about this one?


> predict.c:  return (optimize_size

This is OK, this is inside optimize_function_for_size_p.


> toplev.c:   The only valid values are zero and nonzero. When optimize_size is
> toplev.c:int optimize_size = 0;
> toplev.c:  if (flag_prefetch_loop_arrays > 0 && optimize_size)

These are OK.


> tree-inline.c:  if (size < 0 || size > MOVE_MAX_PIECES * MOVE_RATIO (!optimize_size))

This lacks context to call one of the optimize_*_for_size_p functions.
So this is OK.


> tree-inline.c:    || (caller_opt->optimize_size != callee_opt->optimize_size))

This is inside an #if 0'ed block and would not be a reference to the
global variable optimize_size anyway. It looks like this code, if
enabled again, would need modifications to make it compile again.


In general, any reference to the global var optimize_size should be
checked to verify that there shouldn't be a more fine-grained check
instead.
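
The difference between a global check and a more fine-grained one can be
sketched with a toy model.  This is not GCC source: every `toy_` name is
hypothetical, and the assumption that the finer predicate looks at a
per-function "unlikely executed" marker is only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not GCC source: every name prefixed with toy_ is
   hypothetical.  This flag stands in for the global -Os setting.  */
bool toy_optimize_size = false;

struct toy_function
{
  const char *name;
  bool unlikely_executed;   /* assumed per-function "cold" marker */
};

/* Coarse check: consult only the global flag.  */
bool
toy_global_size_check (void)
{
  return toy_optimize_size;
}

/* Finer check: also treat cold functions as size-optimized, even when
   the global flag is off.  */
bool
toy_function_for_size_p (const struct toy_function *fn)
{
  assert (fn != NULL);
  return toy_optimize_size || fn->unlikely_executed;
}
```

In this model a cold function is optimized for size even without -Os,
which a test of the global flag alone would miss.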

Ciao!
Steven


* Re: -Os is weak...
  2010-09-09 16:43 -Os is weak DJ Delorie
                   ` (2 preceding siblings ...)
  2010-09-09 20:00 ` Steven Bosscher
@ 2010-09-10  8:44 ` Steven Bosscher
  2010-09-10 16:49   ` DJ Delorie
                     ` (2 more replies)
  3 siblings, 3 replies; 13+ messages in thread
From: Steven Bosscher @ 2010-09-10  8:44 UTC (permalink / raw)
  To: DJ Delorie; +Cc: gcc

On Thu, Sep 9, 2010 at 6:43 PM, DJ Delorie <dj@redhat.com> wrote:
>
> The docs say...
>
> @item -Os
> @opindex Os
> Optimize for size.  @option{-Os} enables all @option{-O2} optimizations that
> do not typically increase code size.  It also performs further
> optimizations designed to reduce code size.
>
> @option{-Os} disables the following optimization flags:
> @gccoptlist{-falign-functions  -falign-jumps  -falign-loops @gol
> -falign-labels  -freorder-blocks  -freorder-blocks-and-partition @gol
> -fprefetch-loop-arrays  -ftree-vect-loop-version}
>
> But in reality, the only thing -Os does beyond -O2, aside from a few
> niche special cases, is enable inlining, and maybe scheduling, which
> for some cases may be the wrong thing to do.
>
> Is this what we want?

So yesterday I already sent out a few mails explaining that there is
really more than just the things you described above. It seems that
you haven't followed GCC development closely enough for a while to
notice that the "optimize_size" checks have mostly been replaced with
more fine-grained checks, even at the level of individual insns.

What you quote above, from the documentation, is also actually
incomplete. The -Os option also enables optimizations that are not
performed at -O[123], e.g. code hoisting only runs at -Os (see
gcse.c:pass_rtl_hoist).
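
As a rough illustration of what code hoisting buys: an expression that is
evaluated on every path leaving a branch can be computed once before the
branch, so only one copy of its code remains.  The sketch below shows the
effect at the C source level.  The function names are hypothetical, and
the actual pass works on RTL rather than on source code.

```c
#include <assert.h>

/* Before hoisting: a * b is computed in both arms of the branch.  */
int
before_hoist (int a, int b, int c)
{
  if (c)
    return a * b + 1;
  else
    return a * b - 1;
}

/* After hoisting: a * b is computed once, ahead of the branch.  */
int
after_hoist (int a, int b, int c)
{
  int t = a * b;
  if (c)
    return t + 1;
  return t - 1;
}

/* Compare both versions over a small range of inputs.  */
void
check_hoist_equivalence (void)
{
  for (int a = -3; a <= 3; a++)
    for (int b = -3; b <= 3; b++)
      for (int c = 0; c <= 1; c++)
        assert (before_hoist (a, b, c) == after_hoist (a, b, c));
}
```

The hoisted form contains one multiply instead of two, which is a size
win even though neither path executes fewer instructions.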

That said, it is true that GCC does not have the strongest code size
optimizations compared to other compilers on the market. There are
many things GCC could do better/more to improve code size further.
This is a matter of focus, as you know: If no-one cares enough to
commit enough resources to code-size optimizations, they will not get
implemented in GCC.

I guess the most important missing optimizations are various forms of
code unification, such as the sequence abstraction code that GCC used
to have (http://gcc.gnu.org/projects/cfo.html, but it never worked
properly and it was way too slow), or some suffix-tree based sequence
finding code. Various algorithms can be found in the academic
literature about code size optimizations via abstraction (see e.g.
"Procedural Abstraction with Reverse Prefix Trees",
http://portal.acm.org/citation.cfm?id=1545074).
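
To make "sequence finding" concrete, here is a deliberately naive sketch:
it scans an array of opcodes for the longest run that occurs twice
without overlapping, which is the kind of candidate an abstraction pass
would turn into a shared routine.  It is a brute-force scan, cubic in the
worst case, and not the suffix-tree or reverse-prefix-tree algorithm from
the literature.  Opcodes as plain ints and all `toy_` names are
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: start of both copies and their length.  */
struct toy_repeat
{
  size_t first;
  size_t second;
  size_t length;
};

/* Find the longest sequence that occurs twice in OPS[0..N) without
   overlapping.  Brute force: try every pair of start positions.  */
struct toy_repeat
toy_longest_repeat (const int *ops, size_t n)
{
  struct toy_repeat best = { 0, 0, 0 };

  assert (ops != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    for (size_t j = i + 1; j < n; j++)
      {
        size_t len = 0;
        /* Extend while the elements match, the first copy stays
           before J, and the second copy stays inside the array.  */
        while (i + len < j && j + len < n
               && ops[i + len] == ops[j + len])
          len++;
        if (len > best.length)
          {
            best.first = i;
            best.second = j;
            best.length = len;
          }
      }
  return best;
}
```

Whether abstracting a found sequence pays off depends on its length
versus the cost of the call and return that replace it, which this
sketch does not model.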

Even for the existing code size optimizations, improvements are
possible. I've played with some ideas myself, with new work
(implementing code hoisting for GIMPLE, cf.
http://gcc.gnu.org/PR23286) and extending other people's work
(if-conversion and cross-jumping, cf. http://gcc.gnu.org/PR20070). If
you have plans to work on improved -Os optimizations, those two could
be good starting points to warm up.

Personally, I had hoped that the ARM folks (Linaro, or what's it
called?) would work on -Os. While I've never actually used it, a web
search suggests that the RealView compilers generate code that is as
much as 20% smaller than GCC at -Os (for unnamed benchmarks), so
apparently there is a lot of room for improvement in GCC and the ARM
people should know where.

Finally, of course there are just various issues with instruction
selection in GCC that result in larger-than-necessary code. It seems
that this doesn't hurt code speed so much, but for code size GCC
doesn't always select the shortest sequence possible. Some Google folks
(Carrot Wei in particular) have filed bugs and patches for a couple of
cases for ARM, but there is no target-independent framework for
selecting the shortest insn or sequence.

Is there a particular target you're interested in?

Ciao!
Steven


* Re: -Os is weak...
  2010-09-10  8:44 ` Steven Bosscher
@ 2010-09-10 16:49   ` DJ Delorie
  2010-09-16 17:13   ` Yao Qi
  2010-09-27  3:50   ` Gerald Pfeifer
  2 siblings, 0 replies; 13+ messages in thread
From: DJ Delorie @ 2010-09-10 16:49 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc


> Is there a particular target you're interested in?

Not in that way, no.  My biggest concern is that the documentation is
wrong.  My second concern is that the help option says it basically
does nothing (well, one or two options) instead of the big list it
used to do (or that the other -O* do).  My third concern is that it
doesn't globally affect as many options as you'd expect - like forcing
alignments to 1 for all targets.  Why should every target duplicate
that code?

I wasn't really asking "how do I make code smaller" I was asking "why
does -Os appear to be useless" - emphasis on "appear".


* Re: -Os is weak...
  2010-09-10  8:44 ` Steven Bosscher
  2010-09-10 16:49   ` DJ Delorie
@ 2010-09-16 17:13   ` Yao Qi
  2010-09-16 22:06     ` Andi Kleen
  2010-09-16 22:15     ` Steven Bosscher
  2010-09-27  3:50   ` Gerald Pfeifer
  2 siblings, 2 replies; 13+ messages in thread
From: Yao Qi @ 2010-09-16 17:13 UTC (permalink / raw)
  To: gcc

On Fri, Sep 10, 2010 at 10:44:24AM +0200, Steven Bosscher wrote:
> On Thu, Sep 9, 2010 at 6:43 PM, DJ Delorie <dj@redhat.com> wrote:
> 
> 
> I guess the most important missing optimizations are various forms of
> code unification, such as the sequence abstraction code that GCC used
> to have (http://gcc.gnu.org/projects/cfo.html, but it never worked
> properly and it was way too slow), or some suffix-tree based sequence
> finding code. Various algorithms can be found in the academic
> literature about code size optimizations via abstraction (see e.g.
> "Procedural Abstraction with Reverse Prefix Trees",
> http://portal.acm.org/citation.cfm?id=1545074).

Was CFO finally merged to mainline?  At least, I can't find it in
current gcc.

> Personally, I had hoped that the ARM folks (Linaro, or what's it
> called?) would work on -Os. While I've never actually used it, a web
> search suggests that the RealView compilers generate code that is as
> much as 20% smaller than GCC at -Os (for unnamed benchmarks), so
> apparently there is a lot of room for improvement in GCC and the ARM
> people should know where.
> 

We, the Linaro Toolchain Working Group, are investigating code size
improvements on Thumb-2.  As you said, there is a lot of room for
improvement; here is the report we have so far, FYI:
http://lists.linaro.org/pipermail/linaro-toolchain/2010-September/000202.html

> Finally, of course there are just various issues with instruction
> selection in GCC that result in larger-than-necessary code. It seems
> that this doesn't hurt code speed so much, but for code size GCC
> doesn't always select the shortest sequence possible. Some Google folks
> (Carrot Wei in particular) have filed bugs and patches for a couple of
> cases for ARM, but there is no target-independent framework for
> selecting the shortest insn or sequence.

During the investigation, I have found that all the potential improvements
are identified by ARM experts or by reading asm code manually.  This
approach doesn't scale very well.  IMO, it is necessary to have a
target-independent framework for code size optimization.  I have no
idea how to build that framework, though.

-- 
Yao Qi
CodeSourcery
yao@codesourcery.com
(650) 331-3385 x739


* Re: -Os is weak...
  2010-09-16 17:13   ` Yao Qi
@ 2010-09-16 22:06     ` Andi Kleen
  2010-09-18 13:31       ` Jakub Jelinek
  2010-09-16 22:15     ` Steven Bosscher
  1 sibling, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2010-09-16 22:06 UTC (permalink / raw)
  To: Yao Qi; +Cc: gcc

Yao Qi <yao@codesourcery.com> writes:
>
> During the investigation, I have found that all the potential improvements
> are identified by ARM experts or by reading asm code manually.  This
> approach doesn't scale very well.  IMO, it is necessary to have a
> target-independent framework for code size optimization.  I have no
> idea how to build that framework, though.

On x86 gcc is definitely behind some other compilers in terms
of code size.

Try reading some examples from http://embed.cs.utah.edu/embarrassing/
Since the criterion of the comparisons is code size, it can show
you where gcc is behind some other compilers.

(but note that these comparisons do not include the best compilers
for small size and also do not run with -Os currently)

This is for x86, but could probably be used for other architectures
too.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: -Os is weak...
  2010-09-16 17:13   ` Yao Qi
  2010-09-16 22:06     ` Andi Kleen
@ 2010-09-16 22:15     ` Steven Bosscher
  1 sibling, 0 replies; 13+ messages in thread
From: Steven Bosscher @ 2010-09-16 22:15 UTC (permalink / raw)
  To: Yao Qi; +Cc: gcc

On Thu, Sep 16, 2010 at 9:35 AM, Yao Qi <yao@codesourcery.com> wrote:
> Was CFO finally merged to mainline?  At least, I can't find it in
> current gcc.

Yes, it was merged.

And then it was removed again because the implementation had several
big problems. Such as, it didn't actually work.

Ciao!
Steven


* Re: -Os is weak...
  2010-09-16 22:06     ` Andi Kleen
@ 2010-09-18 13:31       ` Jakub Jelinek
  0 siblings, 0 replies; 13+ messages in thread
From: Jakub Jelinek @ 2010-09-18 13:31 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Yao Qi, gcc

On Thu, Sep 16, 2010 at 10:55:22AM +0200, Andi Kleen wrote:
> Try reading some examples from http://embed.cs.utah.edu/embarrassing/
> Since the criterion of the comparisons is code size, it can show
> you where gcc is behind some other compilers.
> 
> (but note that these comparisons do not include the best compilers
> for small size and also do not run with -Os currently)

I'm not denying that there is lots of room for -Os code size improvements,
but from the http://embed.cs.utah.edu/embarrassing/ results
GCC doesn't perform that badly in comparison with other compilers
(gcc 3.4 best, then icc and then gcc 4.5), and all the results there
were from -Os compilations across the different compilers.

	Jakub


* Re: -Os is weak...
  2010-09-10  8:44 ` Steven Bosscher
  2010-09-10 16:49   ` DJ Delorie
  2010-09-16 17:13   ` Yao Qi
@ 2010-09-27  3:50   ` Gerald Pfeifer
  2 siblings, 0 replies; 13+ messages in thread
From: Gerald Pfeifer @ 2010-09-27  3:50 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: DJ Delorie, gcc


On Fri, 10 Sep 2010, Steven Bosscher wrote:
>> The docs say...
>>
>> @item -Os
>> @opindex Os
>> Optimize for size.  @option{-Os} enables all @option{-O2} optimizations that
>> do not typically increase code size.  It also performs further
>> optimizations designed to reduce code size.
>>
>> @option{-Os} disables the following optimization flags:
>> @gccoptlist{-falign-functions  -falign-jumps  -falign-loops @gol
>> -falign-labels  -freorder-blocks  -freorder-blocks-and-partition @gol
>> -fprefetch-loop-arrays  -ftree-vect-loop-version}
> What you quote above, from the documentation, is also actually
> incomplete. The -Os option also enables optimizations that are not
> performed at -O[123], e.g. code hoisting only runs at -Os (see
> gcse.c:pass_rtl_hoist).

Any chance you could update the documentation, Steven or DJ?

Gerald
