public inbox for gcc@gcc.gnu.org
* Re: C compile time
@ 2003-06-19 20:16 Dara Hazeghi
  2003-06-19 20:16 ` Andrew Pinski
                   ` (3 more replies)
  0 siblings, 4 replies; 71+ messages in thread
From: Dara Hazeghi @ 2003-06-19 20:16 UTC (permalink / raw)
  To: gcc; +Cc: pinskia

Hello,

I've now updated results at
http://www.myownlittleworld.com/computers/gcctable.html
to include a table of the percentage change for
compile time between the various compilers.

Some things which stick out:

1) Andrew Pinski's patch gets us back on average about
10% on mainline and 7.5% on branch. Hopefully his
copyright assignment arrives soon!

2) Kaveh's work on the garbage collection algorithm means
that gcc 3.3 is the first major gcc release since egcs
1.0.3a that's not more than 5% slower than the last
major release on the preceding branch. gcc 3.2.3 for
instance is between 14% and 42% slower than gcc 3.0.4
depending on optimizations.

3) The biggest slowdown in gcc was between gcc 3.0.X
and 3.1.X (i.e. different branches), but between 3.1.1
and 3.2.3, there was quite a slowdown too, ~6-9%. I
didn't realize that such big changes occurred on
release branches.

4) -funit-at-a-time is expensive!

Some (possibly impractical) suggestions:

1) At this point, the SPEC testers are keeping track
of runtime. We already have regression testers for the
testsuite which automatically report regressions to the
list. Should the same be done for compile time, i.e.
should a compile-time increase of more than 1% prompt a
message to the list?

1b) This may not be practical because of noise between
runs. Possibly do multiple runs and take their mean, or
some such, to reduce noise? Maybe not practical with
SPEC2K, but with SPEC95? (A rough sketch of such a
checker appears after this list of suggestions.)

2) Establish clear criteria for new optimizations,
and where they fit. For instance, -O1 according to the
manual means avoiding time-consuming optimizations.
Yet as of 3.3, it's 55% slower than in 2.95.3 and 35%
slower than in 3.0.4. Perhaps state that certain
optimization levels aren't allowed to slow down by more
than a certain percentage between releases for certain
important benchmarks (e.g. SPEC2K, linux-kernel, etc.).

2b) Set criteria for new optimizations to be added.
Mandate a certain amount of runtime improvement on a
certain benchmark before an optimization is included
with -O2, for example.
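
[To make 1/1b concrete, here is a rough, hypothetical sketch of such a
checker -- it is not an existing GCC tool, and the input file format and
the 1% threshold are assumptions for illustration. It averages several
timing runs per revision to damp noise and complains if the new revision
is more than the threshold slower.]

/* compare-compile-times.cc -- hypothetical sketch, not part of GCC.
   Reads one wall-clock time (in seconds) per line from two files,
   one per compiler revision, averages them to reduce noise, and
   reports a regression if the new revision is more than 1% slower.  */
#include <fstream>
#include <iostream>
#include <numeric>
#include <vector>

static double
mean_of (const char *path)
{
  std::ifstream in (path);
  std::vector<double> runs;
  double t;
  while (in >> t)
    runs.push_back (t);
  return runs.empty ()
         ? 0.0
         : std::accumulate (runs.begin (), runs.end (), 0.0) / runs.size ();
}

int
main (int argc, char **argv)
{
  if (argc != 3)
    {
      std::cerr << "usage: " << argv[0] << " old-times new-times\n";
      return 2;
    }
  const double threshold = 1.0;  /* percent; value assumed for illustration */
  double old_mean = mean_of (argv[1]);
  double new_mean = mean_of (argv[2]);
  if (old_mean <= 0.0)
    {
      std::cerr << "no baseline timings\n";
      return 2;
    }
  double change = (new_mean - old_mean) / old_mean * 100.0;
  std::cout << "compile time change: " << change << "%\n";
  if (change > threshold)
    {
      std::cout << "possible compile-time regression -- mail the list\n";
      return 1;
    }
  return 0;
}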

Cheers,

Dara

* Re: C compile time
@ 2003-06-30 15:28 Robert Dewar
  0 siblings, 0 replies; 71+ messages in thread
From: Robert Dewar @ 2003-06-30 15:28 UTC (permalink / raw)
  To: dberlin, dewar; +Cc: coyote, dhazeghi, gcc, pkoning

> It would be better to improve GDB than to dumb down gcc's optimizations.
> Also work on merging things like var-tracking and dwarf2 location list
> support from the rtlopt-branch/cfg-branch for gcc, which helps immensely
> with optimized debugging.

Yes, but you still get transformations in the optimized code that cannot
be followed by the debugger in a clearly intelligent way.

I am not talking about dumbing down -O here, but rather in practice making
-O0 better.

In our world at least, people use -O0 primarily because they can't
debug at higher levels. Yes, it would be nice if this were fixed,
but it would also be nice if there were a debuggable level which did
not generate so much junk.

In comparison with other compilers, gcc's performance in unoptimized,
fully debuggable mode is rather poor, even if its -O2 code compares
favorably.

* Re: C compile time
@ 2003-06-30 14:25 Robert Dewar
  2003-06-30 14:58 ` Daniel Berlin
  0 siblings, 1 reply; 71+ messages in thread
From: Robert Dewar @ 2003-06-30 14:25 UTC (permalink / raw)
  To: dewar, pkoning; +Cc: coyote, dhazeghi, gcc

> Absolutely.  The notion that you do debugging with -O0 and only final
> build with -O2 is obsolete -- and actually never was a good one.  
> 
> If you want reliable software, you have to debug what you ship -- not
> something totally different.

Mind you, in practice we find that gdb is pretty weak at debugging -O2 code,
so this is a tricky requirement. It sometimes helps to use -O1 as a
compromise, but I would still very much like to see -Od, meaning do all
the optimization you can that does not interfere with debugging :-)

* Re: C compile time
@ 2003-06-29 13:51 Robert Dewar
  2003-06-30 13:50 ` Paul Koning
  0 siblings, 1 reply; 71+ messages in thread
From: Robert Dewar @ 2003-06-29 13:51 UTC (permalink / raw)
  To: coyote, dhazeghi; +Cc: gcc

> But for the vast majority of working programmers, I simply don't see
> why optimized compile times are such an issue.

In our world, we have many customers building very large systems that require
optimization to be turned on once they get past the initial integration
stage, so the great majority of development work is done in optimized mode.
The model where you do everything unoptimized, turn on optimization for
the final build, and are done is not reasonable for large critical systems
which must be extensively tested in essentially final form.

So for us, -O2 compilation time is indeed critical.

* Re: C compile time
@ 2003-06-19 14:58 Richard Guenther
  0 siblings, 0 replies; 71+ messages in thread
From: Richard Guenther @ 2003-06-19 14:58 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc, jh, Benjamin Kosnik

>> It would be nice if some of the inlining issues got sorted out for 3.4,
>> and -Winline became deterministic again.
>
>For 3.4, we could consider going back to the "bottom-up" inlining
>strategy.  That might be better than what we have now, even though it's
>inherently quadratic.  Implementing bottom-up inlining wouldn't be
>terribly hard; all the same tree-inlining machinery would work.
>
>One of the things we seem to forget in all the inlining discussion is
>that inlining has never worked well.  In fact, one of the big
>motivations in going to function-at-a-time was to try to fix all the
>lameness in the RTL inliner!  On many large C++ programs, the 2.95 era
>compilers would simply exhaust all memory trying to do inlining...
>
>I'm pretty convinced that there's no easy fix, unfortunately.

I think that at least with a callgraph available we can do better. Also,
implementing more user hints (like __attribute__((leafify))) correctly
needs a callgraph due to our function deferring stuff.

Richard.

* Re: C compile time
@ 2003-06-18 21:51 Chris Lattner
  0 siblings, 0 replies; 71+ messages in thread
From: Chris Lattner @ 2003-06-18 21:51 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Zack Weinberg, gcc


Mark Mitchell said:
>> With information about the complete translation unit in hand, I don't
>> think bottom-up has to be quadratic.  Consider the following
>> algorithm.

> I think we need to say quadratic in what.  You're right that it's a
> linear number of inlining operations, but that's quadratic in terms of
> the number of nodes in the trees.

That's actually _exponential_ in the number of nodes in the tree, which
is why a good heuristic is a _must_.
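
[To make the growth concrete, here is a tiny made-up illustration, not
taken from the thread: with a chain of n two-call functions, fully
inlining each level into the next duplicates the innermost body 2^n
times.]

/* Made-up illustration of exponential growth under naive full inlining.  */
static int f0 (int x) { return x + 1; }
static int f1 (int x) { return f0 (x) + f0 (x + 1); }  /* 2 copies of f0  */
static int f2 (int x) { return f1 (x) + f1 (x + 1); }  /* 4 copies of f0  */
static int f3 (int x) { return f2 (x) + f2 (x + 1); }  /* 8 copies of f0  */
/* Fully inlining a chain of n such levels copies f0's body 2^n times,
   which is why a size-based cut-off heuristic is a must.  */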

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

* Re: C compile time
@ 2003-06-18 21:18 Chris Lattner
  0 siblings, 0 replies; 71+ messages in thread
From: Chris Lattner @ 2003-06-18 21:18 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Mark Mitchell, Benjamin Kosnik, gcc, guenth, jh


Zack said:
> With information about the complete translation unit in hand, I don't
> think bottom-up has to be quadratic.  Consider the following
> algorithm.
> 1) Construct a complete call graph for the translation unit.
 ...

> The key property of this algorithm is that each function is processed
> for inlining exactly once, and we never have to inline more than one
> level, because all call sites in an inlinee have already been either
> inlined or marked do-not-inline.  Thus, no quadratic term.  IIRC
> topological sort is O(n log n), which we can live with.

This is similar to what we do in LLVM, except that we don't build an
explicit call graph.  If you're interested, here's the code for the
heuristic:
http://llvm.cs.uiuc.edu/doxygen/FunctionInlining_8cpp-source.html

Basically we do bottom-up inlining based on what amounts to the call
graph (we just don't explicitly create one).  Our heuristic is currently
set to inline only when it will shrink the program, so the cut-off is
set to be very conservative.  Besides that, though, you may be interested
in why we try harder to inline some functions than others.

The only major problem that I have with the LLVM implementation right now
is that it doesn't cache information about the size of the function it is
considering inlining (it recalculates it every time it considers the
function).  Aside from this though (and the fact that it happens to use an
std::map instead of a hash_map), the algorithm is linear in the number of
function inlines it performs.
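
[For readers following along, a hedged sketch of the bottom-up scheme
Zack describes, with the size caching Chris mentions; this is not GCC's
or LLVM's actual code, and all names and the size threshold are made up.]

/* Hypothetical sketch of bottom-up inlining decisions over a call graph.
   Functions are visited callee-first, each is processed exactly once,
   and per-function sizes are cached so they are never recomputed.  */
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Function
{
  std::string name;
  int size;                          /* rough statement count          */
  std::vector<Function *> callees;   /* direct call sites in the body  */
};

static const int inline_size_limit = 20;  /* threshold made up for the sketch */

/* Post-order walk of the call graph yields a callee-before-caller order
   (recursive cycles are simply not revisited here; a real implementation
   must break them explicitly).  */
static void
postorder (Function *f, std::map<Function *, bool> &seen,
           std::vector<Function *> &order)
{
  if (seen[f])
    return;
  seen[f] = true;
  for (Function *callee : f->callees)
    postorder (callee, seen, order);
  order.push_back (f);
}

static void
inline_bottom_up (const std::vector<Function *> &roots)
{
  std::map<Function *, bool> seen;
  std::vector<Function *> order;
  for (Function *f : roots)
    postorder (f, seen, order);

  std::map<Function *, int> final_size;  /* size after inlining, cached */
  std::map<std::pair<Function *, Function *>, bool> inlined;
  for (Function *f : order)
    {
      int size = f->size;
      for (Function *callee : f->callees)
        {
          /* The callee was already processed, so its cached final size is
             known and its own call sites are already resolved; inlining it
             here is therefore a single-level operation.  */
          bool do_inline = final_size[callee] <= inline_size_limit;
          inlined[std::make_pair (f, callee)] = do_inline;
          if (do_inline)
            size += final_size[callee];
        }
      final_size[f] = size;
    }
}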

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

* Re: C compile time
@ 2003-06-18 17:52 Chris Lattner
  2003-06-18 18:01 ` Jan Hubicka
  2003-06-18 18:28 ` Wolfgang Bangerth
  0 siblings, 2 replies; 71+ messages in thread
From: Chris Lattner @ 2003-06-18 17:52 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Wolfgang Bangerth, Richard Guenther, gcc


Jan Hubicka wrote:
> Actually there are so few static functions in C++ that unit-at-a-time as
> currently implemented has almost no effect.

How much of this is because anonymous namespaces are not marking their
contents as static?  In C++, static functions are deprecated...
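
[For readers not deep in the front end, a tiny made-up example of the
distinction Chris is asking about: both helpers are usable only from
this translation unit, and the question is whether the front end treats
the anonymous-namespace spelling as "static" for optimization purposes.]

/* Both helpers are local to this translation unit; the question is
   whether the compiler exploits that equally for both spellings.  */
static int
helper_static (int x)      /* file-static: the deprecated C-style spelling */
{
  return x * 2;
}

namespace                  /* anonymous namespace: the preferred C++ spelling */
{
  int
  helper_anon (int x)
  {
    return x * 3;
  }
}

int
use_both (int x)
{
  return helper_static (x) + helper_anon (x);
}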

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

* Re: C compile time
@ 2003-06-18 16:34 Benjamin Kosnik
  2003-06-18 20:11 ` Mark Mitchell
  0 siblings, 1 reply; 71+ messages in thread
From: Benjamin Kosnik @ 2003-06-18 16:34 UTC (permalink / raw)
  To: gcc; +Cc: guenth, mark, jh

> Note that I consider not having unit-at-a-time for C++ for 3.4 as a
> showstopper as it is possibly the only way to get sane inlining and such
> sane performance out of scientific C++ codes like POOMA.

It's not just scientific C++ code. It's pretty much all C++ code.

It would be nice if some of the inlining issues got sorted out for 3.4,
and -Winline became deterministic again. I'm not working on this, so
all I can say is, "would be nice." Any efforts on this front would be
appreciated.

If I had extra gnudits, I'd spend them on this.

;)

-benjamin

* Re: C compile time
@ 2003-06-18 16:12 Wolfgang Bangerth
  2003-06-18 17:48 ` Jan Hubicka
  0 siblings, 1 reply; 71+ messages in thread
From: Wolfgang Bangerth @ 2003-06-18 16:12 UTC (permalink / raw)
  To: Richard Guenther, Jan Hubicka, gcc


> Note that I consider not having unit-at-a-time for C++ for 3.4 as a
> showstopper as it is possibly the only way to get sane inlining and
> such sane performance out of scientific C++ codes like POOMA.

I don't think that this is a particularly true statement. C++ codes have many 
member functions that can be called from everywhere, and few file-static 
functions. So being able to inline static functions that are only called once 
is not a very important feature. The general ability to inline small accessor 
and template functions is much more important, but that is independent of 
unit-at-a-time.
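
[As a made-up illustration of the kind of small accessor and template
functions Wolfgang has in mind -- the functions whose inlining matters
far more here than inlining file-static functions does.]

/* Typical scientific-C++ pattern: tiny accessors and template functions
   called in hot loops, which are only fast if the compiler inlines them.  */
#include <cstddef>
#include <vector>

template <typename T>
class Vec
{
public:
  explicit Vec (std::size_t n) : data_ (n) {}
  std::size_t size () const { return data_.size (); }    /* tiny accessor */
  T &operator() (std::size_t i) { return data_[i]; }     /* tiny accessor */
  const T &operator() (std::size_t i) const { return data_[i]; }
private:
  std::vector<T> data_;
};

/* Small template function: without inlining, every element access above
   becomes an out-of-line call.  */
template <typename T>
T
dot (const Vec<T> &a, const Vec<T> &b)
{
  T sum = T ();
  for (std::size_t i = 0; i < a.size (); ++i)
    sum += a (i) * b (i);
  return sum;
}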

W.

-------------------------------------------------------------------------
Wolfgang Bangerth              email:            bangerth@ices.utexas.edu
                               www: http://www.ices.utexas.edu/~bangerth/

* Re: C compile time
@ 2003-06-18 13:08 Richard Guenther
  0 siblings, 0 replies; 71+ messages in thread
From: Richard Guenther @ 2003-06-18 13:08 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Mark Mitchell, gcc

>> > Yes -- you recently mentioned that Mark had some ideas about C++
>> > unit-at-a-time. Could you tell a bit more about that?
>>
>> Mark does not like my current approach. The C++ frontend currently
>> contains a loop that iterates over all known functions, virtual tables
>> and static data and outputs one only when a reference to it has
>> already been assembled. (The template instantiations and virtual
>> tables are created lazily when needed.)
>>
>> I modified it to walk the function bodies to discover what static
>> initializers and templates are needed so I don't need to actually
>> assemble to see what is needed, but Mark would prefer an approach that
>> expands everything available and then removes dead objects.
>>
>> That means that I need to extend unit-at-a-time first to deal with data
>> structures as well (so one can cancel a data structure from being output
>> when it has already been expanded), and doing so also brings a memory
>> explosion. Mark thinks we should reduce the memory overhead of the
>> frontend first; this is a more involved change than I feel able to do
>> in the C++ frontend on the 3.4 horizon.
>
>Note that we also discussed an "in the middle" solution that actually
>expands all functions but delays expansion of the virtual tables and
>examines what is needed.  I am trying to implement it, but at the moment
>it looks even uglier than the original and it slows libstdc++
>compilation down noticeably (12%), apparently due to extra expansion and
>memory overhead; I will try to cut this down somewhat.
>
>The expansion loop in C++ as currently written is really twisted, and all
>my approaches to get unit-at-a-time in make it even worse :(((

Note that I consider not having unit-at-a-time for C++ for 3.4 as a
showstopper as it is possibly the only way to get sane inlining and
such sane performance out of scientific C++ codes like POOMA.

Richard.

--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

* Re: C compile time
@ 2003-06-18  3:38 Andrew Pinski
  2003-06-18  7:43 ` Dara Hazeghi
  0 siblings, 1 reply; 71+ messages in thread
From: Andrew Pinski @ 2003-06-18  3:38 UTC (permalink / raw)
  To: dhazeghi, gcc; +Cc: Andrew Pinski

Dara,
For some reason (University of Cincinnati's network sucks), I cannot
reply to the original message; I also have to use the list archives, :(.

Can you see how my patch in 10962 will help compile time on the mainline
(mostly it will help at -O0, but it could help in other places too)?
I have several patches which I will be submitting (after I get my
copyright forms submitted; still waiting for them to arrive). Most of
them remove invariant loads in loops (Shikari [shameless plug] from the
CHUD tools on Mac OS X is good for finding these), and some will cause
sibling calls to happen in more places.

Also, did you compile mainline and 3.5-tree-ssa with --disable-checking?
From looking at the numbers, it looks like you did not.

Also, for mainline (and maybe 3.5-tree-ssa), could you run with
-ftime-report? Yes, I know this will report a lot of information, but
you could use a spreadsheet to look at the data and maybe get an idea
of where the problem is.

Thanks,
Andrew Pinski

> Hello,
>
> After Wolfgang's post about C++ compile times, my
> curiosity was piqued to check how C compile times have
> been going. I used compiling gcc 3.2.3's cc1 as the
> test.
>
> Hopefully this table gives some sense of where things
> are at.
>
> gcc version     -O0     -O1     -O2     -O3
> 2.7.2.3         128.04  131.02  163.51  176.01
> 2.8.1           128.86  140.79  182.35  194.63
> 2.90.29         130.60  140.57  186.29  199.32
> 2.91.66         132.44  148.48  203.71  219.21
> 2.95.3          143.38  180.97  250.94  276.85
> 3.0.4           169.79  210.73  320.24  365.15
> 3.2.3           193.48  269.43  424.74  519.85
> 3.3             184.15  282.57  442.64  529.93
> 3.3-branch      184.15  283.89  443.66  535.10
> 3.4-mainline    203.58  326.35  514.49  783.59
> 3.5-tree-ssa    223.33  327.91  503.58  702.11
>
> icc* version    -O0     -O1*    -O2     -O3*
> 5.01*           158.47  ~       293.47  293.11
> 6.0             142.81  ~       227.25  227.88
> 7.1             153.95  ~       243.35  243.78
>
> *icc is Intel's C++ Compiler for Linux (unsupported
> noncommercial version)
> *icc sets -O1 and -O2 to be the same
> *icc claims -O2 and -O3 are different, but I'm not
> sure how, as the compile times suggest otherwise
> *icc 5.0 would not compile df.c, so df.c was compiled
> with gcc for this test
>
> Test conducted was compiling cc1 from gcc 3.2.3 on
> i686-pc-linux-gnu with different versions of gcc. cvs
> snapshots for 3.3-branch, 3.4-mainline and
> 3.5-tree-ssa were from 20030614.
>
> Cheers,
>
> Dara

* C compile time
@ 2003-06-18  2:31 Dara Hazeghi
  2003-06-18 10:38 ` Joseph S. Myers
  0 siblings, 1 reply; 71+ messages in thread
From: Dara Hazeghi @ 2003-06-18  2:31 UTC (permalink / raw)
  To: gcc

Hello,

After Wolfgang's post about C++ compile times, my
curiosity was piqued to check how C compile times have
been going. I used compiling gcc 3.2.3's cc1 as the
test.

Hopefully this table gives some sense of where things
are at.

gcc version     -O0     -O1     -O2     -O3
2.7.2.3         128.04  131.02  163.51  176.01
2.8.1           128.86  140.79  182.35  194.63
2.90.29         130.60  140.57  186.29  199.32
2.91.66         132.44  148.48  203.71  219.21
2.95.3          143.38  180.97  250.94  276.85
3.0.4           169.79  210.73  320.24  365.15
3.2.3           193.48  269.43  424.74  519.85
3.3             184.15  282.57  442.64  529.93
3.3-branch      184.15  283.89  443.66  535.10
3.4-mainline    203.58  326.35  514.49  783.59
3.5-tree-ssa    223.33  327.91  503.58  702.11

icc* version    -O0     -O1*    -O2     -O3*
5.01*           158.47  ~       293.47  293.11
6.0             142.81  ~       227.25  227.88
7.1             153.95  ~       243.35  243.78

*icc is Intel's C++ Compiler for Linux (unsupported
noncommercial version)
*icc sets -O1 and -O2 to be the same
*icc claims -O2 and -O3 are different, but I'm not
sure how, as the compile times suggest otherwise
*icc 5.0 would not compile df.c, so df.c was compiled
with gcc for this test

Test conducted was compiling cc1 from gcc 3.2.3 on
i686-pc-linux-gnu with different versions of gcc. cvs
snapshots for 3.3-branch, 3.4-mainline and
3.5-tree-ssa were from 20030614.

Cheers,

Dara


Thread overview: 71+ messages
2003-06-19 20:16 C compile time Dara Hazeghi
2003-06-19 20:16 ` Andrew Pinski
2003-06-19 20:22 ` Diego Novillo
2003-06-19 21:58   ` Dara Hazeghi
2003-06-19 21:58     ` Diego Novillo
2003-06-20 22:42       ` Dara Hazeghi
2003-06-21  0:34         ` Diego Novillo
2003-06-19 21:59     ` Jan Hubicka
2003-06-19 20:44 ` Jan Hubicka
2003-06-19 21:23   ` Dara Hazeghi
2003-06-19 21:23     ` Jan Hubicka
2003-06-19 21:26       ` Dara Hazeghi
2003-06-19 21:31         ` Jan Hubicka
2003-06-19 21:59           ` Jan Hubicka
2003-06-20  0:55             ` Dara Hazeghi
2003-06-19 22:10 ` Steven Bosscher
2003-06-19 22:30   ` Steven Bosscher
  -- strict thread matches above, loose matches on Subject: below --
2003-06-30 15:28 Robert Dewar
2003-06-30 14:25 Robert Dewar
2003-06-30 14:58 ` Daniel Berlin
2003-06-29 13:51 Robert Dewar
2003-06-30 13:50 ` Paul Koning
2003-06-19 14:58 Richard Guenther
2003-06-18 21:51 Chris Lattner
2003-06-18 21:18 Chris Lattner
2003-06-18 17:52 Chris Lattner
2003-06-18 18:01 ` Jan Hubicka
2003-06-18 18:08   ` Chris Lattner
2003-06-18 18:28 ` Wolfgang Bangerth
2003-06-18 18:48   ` Chris Lattner
2003-06-18 18:57     ` Wolfgang Bangerth
2003-06-18 19:28       ` Chris Lattner
2003-06-18 19:30         ` Wolfgang Bangerth
2003-06-18 19:31           ` Chris Lattner
2003-06-18 16:34 Benjamin Kosnik
2003-06-18 20:11 ` Mark Mitchell
2003-06-18 20:49   ` Jan Hubicka
2003-06-18 20:52   ` Zack Weinberg
2003-06-18 21:26     ` Mark Mitchell
2003-06-18 21:51       ` Zack Weinberg
2003-06-18 23:09         ` Mark Mitchell
2003-06-19 15:21       ` Jan Hubicka
2003-06-19 16:31         ` Mark Mitchell
2003-06-19 16:36           ` Jan Hubicka
2003-06-19 16:41             ` Mark Mitchell
2003-06-19 17:08               ` Jan Hubicka
2003-06-19 17:33               ` Jeff Sturm
2003-06-18 16:12 Wolfgang Bangerth
2003-06-18 17:48 ` Jan Hubicka
     [not found] <Pine.LNX.4.44.0306181249160.6712-100000@bellatrix.tat.physik.uni-tuebingen. de>
2003-06-18 15:54 ` Mark Mitchell
2003-06-18 17:42   ` Jan Hubicka
2003-06-18 13:08 Richard Guenther
     [not found] <3EEFA473.1020800@student.tudelft.nl>
2003-06-18  4:36 ` Dara Hazeghi
2003-06-18  3:38 Andrew Pinski
2003-06-18  7:43 ` Dara Hazeghi
2003-06-18  8:41   ` Steven Bosscher
2003-06-18  9:14     ` Jan Hubicka
2003-06-18  9:15       ` Steven Bosscher
2003-06-18 10:07         ` Jan Hubicka
2003-06-18 10:55           ` Steven Bosscher
2003-06-18 12:38             ` Jan Hubicka
2003-06-18 12:51               ` Jan Hubicka
2003-06-18 22:03         ` Dara Hazeghi
2003-06-20 20:36           ` Scott Robert Ladd
2003-06-21  0:31             ` Dara Hazeghi
2003-06-21 16:14             ` Michael S. Zick
2003-07-04  7:14           ` Ben Elliston
2003-06-18 14:00   ` Scott Robert Ladd
2003-06-18  2:31 Dara Hazeghi
2003-06-18 10:38 ` Joseph S. Myers
2003-06-18 20:36   ` Dara Hazeghi
