From mboxrd@z Thu Jan  1 00:00:00 1970
From: Daniel Berlin <dan@cgsoftware.com>
To: Gerald Pfeifer <pfeifer@dbai.tuwien.ac.at>
Cc: Mark Mitchell <mark@codesourcery.com>, Joe Buck <jbuck@synopsys.com>, "gcc@gcc.gnu.org" <gcc@gcc.gnu.org>
Subject: Re: C++ compile-time regressions (was: GCC 3.0.1 Status Report)
Date: Mon, 23 Jul 2001 11:54:00 -0000
Message-id: <877kwzearp.fsf@cgsoftware.com>
References: <Pine.BSF.4.33.0107231933360.90992-100000@deneb.dbai.tuwien.ac.at>
X-SW-Source: 2001-07/msg01529.html

Gerald Pfeifer <pfeifer@dbai.tuwien.ac.at> writes:

> On Fri, 20 Jul 2001, Mark Mitchell wrote:
>>> you'll see that C++ projects heavily relying on STL apparently
>>> simply cannot use GCC 3.0.
>> An extreme statement, but, that aside, as you know people are working
>> on fixes.
> 
> Yes, and I really appreciate that.
> 
> And I especially hope that our new development model will make prevention
> of such problems easier by providing a much more stable head branch that
> we can benchmark against regularly.
> 
> On Fri, 20 Jul 2001, Joe Buck wrote:
>> Gerald, could you test Daniel's change and see if it helps on your code?
> 
> Of course! Here we go:
> 
>           GCC 2.95.3           GCC 3.0          GCC 3.0.1-pre
>          Time    Size        Time   Size         Time   Size
>   -O0    6:19    3915128     8:20   4159780      8:00   4159588
>   -O1    4:20    4203480    11:40   4829732      7:09   3997668
>   -O2    5:56    4209368    14:09   4862532      7:53   3987556
>   -O3    5:47    4221464    32:04   6166052      7:54   3987140
> 
> The huge compile-time regression is gone, though we are still noticeably
> slower than GCC 2.95; and binary size is even better than it used to be.
> 
> That is, there is still work to do to make the compiler faster, but it's
> *much* nicer now.
> 
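
To put the table above in relative terms, here is a small sketch that
converts the quoted wall-clock times to seconds and divides by the
GCC 2.95.3 time at the same optimization level. The numbers are copied
from the table; the names TIMES, to_seconds and slowdown are made up for
this illustration and are not part of any GCC tooling.

```python
# Wall-clock compile times from the quoted table, as "minutes:seconds".
TIMES = {
    "-O0": {"2.95.3": "6:19", "3.0": "8:20", "3.0.1-pre": "8:00"},
    "-O1": {"2.95.3": "4:20", "3.0": "11:40", "3.0.1-pre": "7:09"},
    "-O2": {"2.95.3": "5:56", "3.0": "14:09", "3.0.1-pre": "7:53"},
    "-O3": {"2.95.3": "5:47", "3.0": "32:04", "3.0.1-pre": "7:54"},
}


def to_seconds(mmss):
    """Convert a "minutes:seconds" string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def slowdown(level, version, baseline="2.95.3"):
    """Compile time of `version` divided by the baseline time at `level`."""
    row = TIMES[level]
    return to_seconds(row[version]) / to_seconds(row[baseline])


if __name__ == "__main__":
    for level in TIMES:
        print(level,
              "3.0: %.2fx" % slowdown(level, "3.0"),
              "3.0.1-pre: %.2fx" % slowdown(level, "3.0.1-pre"))
```

By this measure the -O3 build goes from roughly 5.5 times the 2.95.3
time under GCC 3.0 to roughly 1.4 times under 3.0.1-pre.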

Tree-based optimizations should help make it faster, as should the
new parser.
After that, our new bottlenecks *should* be (and if they aren't,
we should be able to make it so they are):

1. Scheduling.
2. Register Allocation.
3. Instruction combiner.


With the store motion changes, we can turn off CSE follow-jumps and
skip-blocks. I tested it and got the same performance with those off
as with them on.  Without the store motion fixes I submitted, which
haven't been reviewed yet, we'll get slower code with those off.

With follow-jumps and skip-blocks on, CSE takes longer than GCSE with
store motion fixed.
This is kinda silly, since CSE is a local algorithm.
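
To illustrate what "local" means here, below is a minimal sketch of
common subexpression elimination confined to a single basic block.  It
is not GCC's CSE pass; the tuple format (dest, op, arg1, arg2) and the
function name local_cse are assumptions made up for this example, and
it ignores commutativity, memory, and side effects.

```python
def local_cse(block):
    """Eliminate repeated expressions within one basic block.

    `block` is a list of (dest, op, arg1, arg2) tuples.  A repeated
    (op, arg1, arg2) whose operands have not been overwritten is
    replaced by a copy from the name that already holds the value.
    """
    available = {}  # (op, arg1, arg2) -> name currently holding the value
    out = []
    for dest, op, a, b in block:
        key = (op, a, b)
        holder = available.get(key)
        if holder is not None:
            out.append((dest, "copy", holder, None))
        else:
            out.append((dest, op, a, b))
        # Writing `dest` kills every expression that reads `dest` and
        # every entry whose value was being held in `dest`.
        available = {k: v for k, v in available.items()
                     if v != dest and dest not in k[1:]}
        # The value is now available in `dest`, unless `dest` was one
        # of its own operands (e.g. x = x + 1).
        if dest not in (a, b):
            available.setdefault(key, dest)
    return out


if __name__ == "__main__":
    block = [
        ("t1", "add", "a", "b"),
        ("t2", "add", "a", "b"),   # same value as t1: becomes a copy
        ("a", "mul", "t1", "t2"),  # overwrites a, so "add a b" is killed
        ("t3", "add", "a", "b"),   # must be recomputed
    ]
    for insn in local_cse(block):
        print(insn)
```

Everything the sketch knows is thrown away at the end of the block,
which is why a pass of this kind is normally cheap.  As I understand
the GCC options, follow-jumps and skip-blocks let CSE keep scanning
across certain jumps, so it has more to track than a purely per-block
pass does.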

If GCSE still takes too long (it seems to be a constant 12% of the
time once I start throwing stuff at it large enough that its time isn't
lost in the noise), we can convert to SSA-based PRE.
Past that, every other compiler I know of started using region-based
compiling (HP's ELCOR, SGI Pro64, SGI's MIPSpro, Intel's compilers,
IBM's compilers, etc.) in order to cut down the compile time while
speeding up the code.

As far as I understand, it works quite well.

Some tried flat interprocedural analysis, but it was too slow.  Some
still do it for alias analysis, however.

--Dan


> About run-time performance, I'll report later when tests have been
> finished.
> 
> Gerald
> -- 
> 
> Gerald "Jerry" pfeifer@dbai.tuwien.ac.at http://www.dbai.tuwien.ac.at/~pfeifer/

-- 
"I bought my brother some gift-wrap for Christmas.  I took it to
the Gift Wrap Department and told them to wrap it, but in a
different print so he would know when to stop unwrapping."
-Steven Wright