GCC 3.5 Plan, Take 2

public inbox for gcc@gcc.gnu.org

* GCC 3.5 Plan, Take 2
@ 2004-08-16  2:25 Mark Mitchell
  2004-08-16  3:59 ` Andrew Pinski
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Mark Mitchell @ 2004-08-16  2:25 UTC (permalink / raw)
  To: gcc

[-- Attachment #1: Type: text/plain, Size: 75 bytes --]

-- 
Mark Mitchell
CodeSourcery, LLC
(916) 791-8304
mark@codesourcery.com

[-- Attachment #2: gcc-status.txt --]
[-- Type: text/plain, Size: 5440 bytes --]

There have been a lot of comments about the GCC 3.5 plan I posted.

It's clear that there is a set of improvements that people want to
incorporate before GCC 3.5.  That's not surprising; at any time
that we go to make a release, there will always be more improvements
that we could incorporate if we waited just a little longer.  Hence all
the articles about software feature creep in the management
literature.

It's also clear that I'm not fully aware of all the things people have
in the pipeline.  And, it's clear that I can't make good decisions
without knowing that information.  Below, I'll discuss how I want to
correct that problem.

First, I will talk a bit about my goals for GCC 3.5.  

First and foremost, as with all releases, I would like to see the
benefits -- technical and non-technical -- of upgrading to GCC 3.5
outweigh the benefits of staying with GCC 3.4.x for most of those
users who might consider upgrading.  (There is a class of users for
which no major upgrade is ever a good idea.)  There are a lot of ways
to get there: better code generation, better compile times, better
language conformance, more targets, more languages, better support
from the GCC team for bugs, more support from distributors, better
error messages, fewer wrong-code bugs, a better manual, etc., etc.
There's no objective measure of all of these things, and the
definition of "good enough" varies with lots of inputs, which is why I
have shied away from lots of criteria descriptions.  I view it as my
job to synthesize comments and information from people and then,
eventually, say "OK, that's good enough, let's ship it."

We're already better than GCC 3.4 on a lot of these axes.  Right now,
we're probably losing a bit on compile-time, especially with
optimization enabled, and most people seem to think code generation is
about a wash.

I expect it to be hard to get GCC 3.5's code-generation to be
noticeably better than GCC 3.4.x's across the board within a few
months.  I expect that it will not be until six months to a year hence
that we'll see noticeably better code generation on most test cases.
Since I don't think that most people want to wait that long for GCC
3.5, I think we need to accept that we're shooting for code generation
that is, in general, not worse, and may be better on some particularly
high-profile cases.  In the worst case, we may even have to accept
worse code on some real cases.  If we can vectorize some loops, and
SRA provides big wins on some C++ test cases, and we're in general
pretty close to GCC 3.4, I think the release will be well-received,
given all the other improvements.

In fact, I think that breaking even would be a great result given the
switch to tree-ssa; it's hard not to be markedly worse when building a
whole new set of optimizers from scratch.  

Thus, if your opinion is that we shouldn't release until and unless
GCC 3.5 is knocking the socks off of GCC 3.4, you and I are going to
have to agree to disagree.

At the same time, I don't want to impose an arbitrary deadline that
causes us to miss substantial performance wins that we could achieve
with just a little bit more time.  

So, here is what I would like to do.  

For each project -- optimization or otherwise -- that you are working
on that you would like to have included in GCC 3.5, please send an
email *directly to me* using the following form:

  Name of Developer:

  Name of Improvement:

  Dependencies:

  [The names of other improvements upon which this one depends, if
  any.  Coordinate with the developers of these other improvements to
  obtain the names that they are using, please.]

  Description of Improvement:

  [A paragraph or ten explaining what you are doing, what benefit your
  improvement will have, with quantification where possible, and what
  risks your improvement will entail, and how these risks will be
  mitigated.  Warning: I will disbelieve descriptions that indicate
  that there are no risks to the indicated change.

  Be as detailed as you can be.]

  Delivery Date:

  [The date on which you can commit to delivering this improvement.
  Be conservative!  If this improvement is already in progress, please
  describe the current state, in terms of percentage complete and
  any testing done.  Warning: late (for a definition of "late" not yet
  determined) delivery dates may result in your improvement not being
  considered for GCC 3.5.  Warning: missing your delivery dates may
  also result in your improvement not being included in GCC 3.5, and
  I'll not be very sympathetic about extenuating circumstances.  So,
  think hard before you say "I can get this done in a week."]

If your organization has a lot of improvements in the pipeline, please
try to work together with the others in your organization, and send me
one coordinated response containing all the improvement descriptions,
rather than multiple possibly-conflicting descriptions.

I will gather these together on a web page, or pages, as they come in.
I'll accept these submissions through Sunday, August 22nd.  I'll not
be terribly rigid about someone who is on vacation this week, but I'll
not be terribly flexible either, so let's get these in, please!  

I'll then make a decision about which ones are in and which are out,
probably after contacting some of you directly for help.  After the
decision is made, I'll expect that you respect my decision, even if
you disagree.

Thanks,

* Re: GCC 3.5 Plan, Take 2
  2004-08-16  2:25 GCC 3.5 Plan, Take 2 Mark Mitchell
@ 2004-08-16  3:59 ` Andrew Pinski
  2004-08-16  6:00   ` Daniel Berlin
  2004-08-16 11:19   ` Steven Bosscher
  2004-08-16 19:27 ` Matt Austern
  2004-08-16 22:47 ` Joseph S. Myers
  2 siblings, 2 replies; 9+ messages in thread
From: Andrew Pinski @ 2004-08-16  3:59 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc

On Aug 15, 2004, at 6:47 PM, Mark Mitchell wrote:

> We're already better than GCC 3.4 on a lot of these axes.  Right now,
> we're probably losing a bit on compile-time, especially with
> optimization enabled, and most people seem to think code generation is
> about a wash.

I don't think you can claim that compile time is slower with optimization
enabled.  I will say that there is a bug report for a test case that was
slow for all of GCC up till tree-ssa: see PR 2692, which says that compile
time with tree-ssa has improved in many different respects.  There are
other examples where expand was taking a huge amount of time because
inlining was happening before expand; now we actually do many
optimizations before expand, making expand less important.  What is even
more impressive is that we added more optimization passes and the overall
compile time is about the same as before the tree-ssa merge.

Most of the current compile time problems with the tree-ssa optimizations
are that they are O(n^2); see PR 15524 for an example of where one problem
is.  Steven and I had hoped the rewrite of jump threading would improve
the situation here, but it seems we were wrong.

Also, the reason behind the wash in code generation is that the tree
optimizations do not do much more than the current generation of RTL
optimizers, except for SRA, which is the single biggest win for most C++
programs.  The other big wins are DCE done right and done before sibling
call detection happens.

I also see that we can find uninitialized variables a lot better, and
IMA optimization is done with the correct aliasing sets, and ...

I could continue on what makes up this release so far: there are 244 bugs
fixed for 3.5.0 which are not regressions and not in fortran or libjava
(AWT and SWING also).

Oh, we also have a beginning implementation of AWT and SWING now for
3.5.0 for libjava.

Everyone in GCC keeps forgetting about GCJ and the other languages
besides C and C++.

-- Pinski

* Re: GCC 3.5 Plan, Take 2
  2004-08-16  3:59 ` Andrew Pinski
@ 2004-08-16  6:00   ` Daniel Berlin
  2004-08-16  9:27     ` Nathan Sidwell
  2004-08-16 11:19   ` Steven Bosscher
  1 sibling, 1 reply; 9+ messages in thread
From: Daniel Berlin @ 2004-08-16  6:00 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc, Mark Mitchell

On Aug 15, 2004, at 10:40 PM, Andrew Pinski wrote:

>
> On Aug 15, 2004, at 6:47 PM, Mark Mitchell wrote:
>
>> We're already better than GCC 3.4 on a lot of these axes.  Right now,
>> we're probably losing a bit on compile-time, especially with
>> optimization enabled, and most people seem to think code generation is
>> about a wash.
>
> Also, the reason behind the wash in code generation is that the tree
> optimizations do not do much more than the current generation of RTL
> optimizers, except for SRA, which is the single biggest win for most C++
> programs.

Uh, as we've demonstrated earlier today, GVN-PRE is doing a heck of a lot
more than the RTL level PRE.  Otherwise, the RTL level PRE would have had
the same exact problem by now (since it doesn't have any register pressure
controls either)!

Just off the top of my head.

I doubt you can make blanket statements like you did above and have them
be anywhere near true.

You also have to realize that we are going to soon hit a point (if we
aren't there already) where we are going to need to significantly improve
our register allocation and scheduling behavior in order to actually be
producing better code.  Personally, except for some aliasing issues, and
high level loop nest optimizations (i.e. vectorization, unimodular
transforms, etc.), I'm pretty happy when I look at the code the middle end
now produces.

Does anyone actually look at the .optimized dump these days and say "Wow,
this is just terrible?" (as opposed to "we missed something here or
there")?

* Re: GCC 3.5 Plan, Take 2
  2004-08-16  6:00   ` Daniel Berlin
@ 2004-08-16  9:27     ` Nathan Sidwell
  0 siblings, 0 replies; 9+ messages in thread
From: Nathan Sidwell @ 2004-08-16  9:27 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Andrew Pinski, gcc, Mark Mitchell

Daniel Berlin wrote:

> You also have to realize that we are going to soon hit a point (if we 
> aren't there already) where we are going to need to significantly 
> improve our register allocation and scheduling behavior in order to 
> actually be producing better code.
Both of those projects would take some time, and hence delay a release
unacceptably, if they were release criteria.  That seems good evidence
for a 'flush the pipeline and release' plan.

nathan

-- 
Nathan Sidwell    ::   http://www.codesourcery.com   ::     CodeSourcery LLC
nathan@codesourcery.com    ::     http://www.planetfall.pwp.blueyonder.co.uk


* Re: GCC 3.5 Plan, Take 2
  2004-08-16  3:59 ` Andrew Pinski
  2004-08-16  6:00   ` Daniel Berlin
@ 2004-08-16 11:19   ` Steven Bosscher
  2004-08-16 12:48     ` Karel Gardas
  2004-08-16 13:18     ` Scott Robert Ladd
  1 sibling, 2 replies; 9+ messages in thread
From: Steven Bosscher @ 2004-08-16 11:19 UTC (permalink / raw)
  To: Andrew Pinski, Mark Mitchell, bje; +Cc: gcc

On Monday 16 August 2004 04:40, Andrew Pinski wrote:
> On Aug 15, 2004, at 6:47 PM, Mark Mitchell wrote:
> > We're already better than GCC 3.4 on a lot of these axes.  Right now,
> > we're probably losing a bit on compile-time, especially with
> > optimization enabled, and most people seem to think code generation is
> > about a wash.
>
> I don't think you can claim that compile time is slower with optimization
> enabled.  I will say that there is a bug report for a test case that was
> slow for all of GCC up till tree-ssa: see PR 2692, which says that compile
> time with tree-ssa has improved in many different respects.  There are
> other examples where expand was taking a huge amount of time because
> inlining was happening before expand; now we actually do many
> optimizations before expand, making expand less important.  What is even
> more impressive is that we added more optimization passes and the overall
> compile time is about the same as before the tree-ssa merge.

But overall slower than any other GCC ever released before.  That's clearly
demonstrated by the occasional Mico timings and by Diego's testers:
http://people.redhat.com/dnovillo/spec2000/gcc/individual-build-secs_elapsed.html

> Most of the current compile time problems with the tree-ssa optimizations
> are that they are O(n^2)

Not true.  There are almost no quadratic bottlenecks in tree-ssa.  There
are no inherently O(n^2) algorithms in tree-ssa (it's *hard* to write an SSA
algorithm with non-linear behavior! ;-)
The only bottleneck I'm aware of is in DOM, the PR you mentioned.

Most of the current compile time problems come from us not being able to
turn off expensive, or even cheap, RTL passes, so that we have only added
a whole new set of passes.  What can you expect?  It's going to slow down.

> see PR 15524 for an example of where one problem is.  Steven and I had
> hoped the rewrite of jump threading would improve the situation here,
> but it seems we were wrong.

Actually that's only part of the problem.  The quadratic-ness of that test
case is now less a problem than the other one I've already mailed you and
bje about.   For everyone else:

From tree-ssa-dom.c:

  2089832: 2551:  for (e = bb->succ; e; e = e->succ_next)
        -: 2552:    {
(...)
        -: 2584:          /* If the hint is valid (!= phi_num_args), see if it points
        -: 2585:             us to the desired phi alternative.  */
  2687602: 2586:          if (hint != phi_num_args && PHI_ARG_EDGE (phi, hint) == e)
        -: 2587:            ;
        -: 2588:          else
        -: 2589:            {
        -: 2590:              /* The hint was either invalid or did not point to the
        -: 2591:                 correct phi alternative.  Search all the alternatives
        -: 2592:                 for the correct one.  Update the hint.  */
317517478: 2593:              for (i = 0; i < phi_num_args; i++)
317517478: 2594:                if (PHI_ARG_EDGE (phi, i) == e)
316924800: 2595:                  break;
   592678: 2596:              hint = i;
        -: 2597:            }

... and this loop's mirror in tree-flow-inline.h:

        -:  407:/* Return the phi index number for an edge.  */
        -:  408:static inline int
        -:  409:phi_arg_from_edge (tree phi, edge e)
  5360704:  410:{
  5360704:  411:  int i;
        -:  412:#if defined ENABLE_CHECKING
        -:  413:  if (!phi || TREE_CODE (phi) != PHI_NODE)
        -:  414:    abort();
        -:  415:#endif
        -:  416:
322669434:  417:  for (i = 0; i < PHI_NUM_ARGS (phi); i++)
322669434:  418:    if (PHI_ARG_EDGE (phi, i) == e)
        -:  419:      return i;
        -:  420:
        -:  421:  return -1;
  5360704:  422:}

Might not be typical code (insn-attrtab), but it shows that there really
are some data structures we use that are suboptimal: >1.2 billion runtime
conditional branches to look for 8 million PHI arguments!  We spend an
awful lot of time there, wading through memory.  This happens in DOM, but
also for any other tree SSA pass.
This will mostly go away when the edge-vector-branch is ready, hopefully
in time for GCC 3.5.
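
The lookup cost in the two listings above can be illustrated with a
minimal sketch.  Everything here is a simplified, hypothetical stand-in
(struct edge, struct phi, the dest_idx field and both function names are
assumptions made for illustration, not GCC's actual definitions), and the
indexed variant is only one possible way to avoid the scan, not
necessarily what the edge-vector-branch does.  For scale, the four loop
and compare counts in the listings sum to roughly 1.28 billion, and the
two entry counts (2687602 and 5360704) to roughly 8 million, which is
consistent with the figures quoted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the edge and PHI structures.  */
struct edge
{
  /* Assumed field: position of this edge among the incoming edges of
     its destination block.  */
  size_t dest_idx;
};

struct phi
{
  size_t num_args;
  struct edge **arg_edge;	/* arg_edge[i] is the edge feeding argument i.  */
};

/* Linear scan, in the style of the profiled loops: each lookup costs
   O(num_args), so looking up every edge of a PHI with n arguments
   costs O(n^2) in total.  Returns -1 if E feeds no argument.  */
static int
phi_arg_from_edge_linear (const struct phi *phi, const struct edge *e)
{
  size_t i;

  assert (phi != NULL && e != NULL);
  for (i = 0; i < phi->num_args; i++)
    if (phi->arg_edge[i] == e)
      return (int) i;
  return -1;
}

/* Constant-time lookup, under the assumption that PHI arguments are
   kept in the same order as the incoming edges, so the index stored on
   the edge is also the argument index.  The comparison guards against
   an edge that does not belong to this PHI.  */
static int
phi_arg_from_edge_indexed (const struct phi *phi, const struct edge *e)
{
  assert (phi != NULL && e != NULL);
  if (e->dest_idx < phi->num_args && phi->arg_edge[e->dest_idx] == e)
    return (int) e->dest_idx;
  return -1;
}
```

Both functions return the same answer; the difference is that the second
touches one array slot per lookup instead of walking the whole argument
list, at the price of keeping the stored index up to date whenever edges
are added or removed.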

(bje, http://gcc.gnu.org/ml/gcc/2004-08/msg00685.html -- don't forget to
send an email about your project to Mark!)

> Also, the reason behind the wash in code generation is that the tree
> optimizations do not do much more than the current generation of RTL
> optimizers, except for SRA, which is the single biggest win for most C++
> programs.  The other big wins are DCE done right and done before sibling
> call detection happens.

The kind of optimizations that we do on trees may on paper be not very
different from the ones we do on RTL, but in practice they often do a lot
more than their RTL counterparts.

Just look at the GVN-PRE problem I reported the other day.  Look at jump
threading: we thread many more jumps than we did on just RTL.  The named
return value optimization is now in the middle-end, and as you mentioned,
dead store and dead code elimination do a better job.  For sibling calls,
last time I looked we still had a few regressions wrt. sib/tail calls,
but overall we should catch more of them.

This does result in better code in many instances.  Look at SPEC's mcf
and parser on Diego's SPEC pages:
http://people.redhat.com/dnovillo/spec2000/gcc/individual-run-ratio.html
From the merge of the tree-ssa branch there's a big jump there in the
right direction: up to the level of icc.

Also note that we may actually be generating much better code on non-x86
than older GCCs, but nobody is tracking performance on, say, ppc, ia64,
or MIPS, in public.

Overall the code we generate has not improved (not for SPEC anyway).
Sometimes we generate worse code, so we need to find and understand
where and why we do that.  I showed one case of register pressure,
and there are probably more cases like that.  Another is still just poor
RTL generation at times.  Other people know more reasons, I suppose.

> I also see that we can find uninitialized variables a lot better, and
> IMA optimization is done with the correct aliasing sets, and ...

...and debugging information is worse, and stack slot allocation is borked,
and we have no clear picture of the memory footprint, and...

Gr.
Steven

* Re: GCC 3.5 Plan, Take 2
  2004-08-16 11:19   ` Steven Bosscher
@ 2004-08-16 12:48     ` Karel Gardas
  2004-08-16 13:18     ` Scott Robert Ladd
  1 sibling, 0 replies; 9+ messages in thread
From: Karel Gardas @ 2004-08-16 12:48 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: GCC Mailing List

On Mon, 16 Aug 2004, Steven Bosscher wrote:

> But overall slower than any other GCC ever released before.  That's clearly
> demonstrated by the occasional Mico timings and by Diego's testers:
> http://people.redhat.com/dnovillo/spec2000/gcc/individual-build-secs_elapsed.html

First of all, I wanted to send a ``seconded'' message, but then I
measured recent trunk against the 3.4.1 release on my set of MICO sources
and found that, at least for -O0, it is faster than 3.4.1; only by a few
seconds, but it is:

3.5.0 20040816:
real    9m33.197s
user    8m53.104s
sys     0m23.014s

3.4.1:
real    9m53.070s
user    9m13.973s
sys     0m27.936s

I will post a full comparison table, hopefully later today.  My last report
(http://gcc.gnu.org/ml/gcc/2004-07/msg00391.html) claims that main trunk
as of 20040630 was about 10% slower on -O0 than 3.4.0 -- so there really
seems to be some progress, and I'm looking forward to seeing all the other
numbers...
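
To put the "few seconds" above in the same terms as the earlier 10%
figure, here is a small sketch of the arithmetic.  The helper names
to_seconds and percent_change are made up for illustration; the inputs
are the -O0 timings listed in this message.

```c
#include <assert.h>

/* Convert a time(1)-style "XmY.YYYs" reading to seconds.  */
static double
to_seconds (int minutes, double seconds)
{
  return minutes * 60.0 + seconds;
}

/* Relative change of NEW_S with respect to OLD_S, in percent.
   A negative result means the new measurement is faster.  */
static double
percent_change (double old_s, double new_s)
{
  assert (old_s > 0.0);
  return (new_s - old_s) / old_s * 100.0;
}
```

With the real times above, percent_change (to_seconds (9, 53.070),
to_seconds (9, 33.197)) comes out at about -3.4, and the user times give
about -3.8, so on this one run the 20040816 trunk is a few percent faster
than 3.4.1 at -O0.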

Thanks,

Karel
--
Karel Gardas                  kgardas@objectsecurity.com
ObjectSecurity Ltd.           http://www.objectsecurity.com

* Re: GCC 3.5 Plan, Take 2
  2004-08-16 11:19   ` Steven Bosscher
  2004-08-16 12:48     ` Karel Gardas
@ 2004-08-16 13:18     ` Scott Robert Ladd
  1 sibling, 0 replies; 9+ messages in thread
From: Scott Robert Ladd @ 2004-08-16 13:18 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Andrew Pinski, Mark Mitchell, bje, gcc

Steven Bosscher wrote:
> Also note that we may actually be generating much better code on non-x86
> than older GCCs, but nobody is tracking performance on, say, ppc, ia64,
> or MIPS, in public.

If we don't hear from people representing a specific architecture, then 
I think we can assume that code generation is (at least) "good enough" 
for their needs. It's not as if IBM, SGI, and Intel couldn't do their 
own benchmarks and report the results. ;)

..Scott

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing

* Re: GCC 3.5 Plan, Take 2
  2004-08-16  2:25 GCC 3.5 Plan, Take 2 Mark Mitchell
  2004-08-16  3:59 ` Andrew Pinski
@ 2004-08-16 19:27 ` Matt Austern
  2004-08-16 22:47 ` Joseph S. Myers
  2 siblings, 0 replies; 9+ messages in thread
From: Matt Austern @ 2004-08-16 19:27 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc

On Aug 15, 2004, at 6:47 PM, Mark Mitchell wrote:

> I expect it to be hard to get GCC 3.5's code-generation to be
> noticeably better than GCC 3.4.x's across the board within a few
> months.  I expect that it will not be until six months to a year hence
> that we'll see noticeably better code generation on most test cases.
> Since I don't think that most people want to wait that long for GCC
> 3.5, I think we need to accept that we're shooting for code generation
> that is, in general, not worse, and may be better on some particularly
> high-profile cases.  In the worst case, we may even have to accept
> worse code on some real cases.  If we can vectorize some loops, and
> SRA provides big wins on some C++ test cases, and we're in general
> pretty close to GCC 3.4, I think the release will be well-received,
> given all the other improvements.

I'm concerned specifically about lno-branch.  My understanding is
that we agreed not to do a wholesale merge from lno-branch into
mainline and that instead we would move things in on a case by case
basis.  That's fine, but we need to make sure we're realistically
on track for all those patches to get into mainline before we move
into stage 3.  Right now it looks like the lno-branch patch
approvals are going pretty slowly.

			--Matt


* Re: GCC 3.5 Plan, Take 2
  2004-08-16  2:25 GCC 3.5 Plan, Take 2 Mark Mitchell
  2004-08-16  3:59 ` Andrew Pinski
  2004-08-16 19:27 ` Matt Austern
@ 2004-08-16 22:47 ` Joseph S. Myers
  2 siblings, 0 replies; 9+ messages in thread
From: Joseph S. Myers @ 2004-08-16 22:47 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc

On Sun, 15 Aug 2004, Mark Mitchell wrote:

> It's also clear that I'm not fully aware of all the things people have
> in the pipeline.  And, it's clear that I can't make good decisions
> without knowing that information.  Below, I'll discuss how I want to
> correct that problem.

This suggests that contributewhy.html isn't being effective.  Status 
information on major projects in progress is of use at all times, not just 
when immediately planning the next release, although of course the more 
specific and detailed information you request here is of especial value 
now.

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
    jsm@polyomino.org.uk (personal mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)
