From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-121456-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 16462 invoked by alias); 7 Dec 2004 14:35:33 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 16311 invoked by alias); 7 Dec 2004 14:35:25 -0000
Date: Tue, 07 Dec 2004 14:35:00 -0000
Message-ID: <20041207143525.16309.qmail@sourceware.org>
From: "rguenth at tat dot physik dot uni-tuebingen dot de" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
In-Reply-To: <20041128181553.18704.rguenth@tat.physik.uni-tuebingen.de>
References: <20041128181553.18704.rguenth@tat.physik.uni-tuebingen.de>
Reply-To: gcc-bugzilla@gcc.gnu.org
Subject: [Bug tree-optimization/18704] [4.0 Regression] Inlining limits cause 340% performance regression
X-Bugzilla-Reason: CC
X-SW-Source: 2004-12/txt/msg00989.txt.bz2
List-Id: <gcc-bugs.sourceware.org>


------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de  2004-12-07 14:35 -------
Subject: Re:  [4.0 Regression] Inlining limits
 cause 340% performance regression

On 6 Dec 2004, hubicka at ucw dot cz wrote:

> Looks like I get 4fold speedup on tree profiling with profiling compared
> to tree profiling on mainline that is equivalent to speedup you are
> seeing for leafify patch. That sounds pretty promising (so the new
> heuristics can get the leafify idea without the hint from user and
> hitting the code growth problems).

Yes, it seems so.  Really nice improvement.  Though profiling is
sloooooow.  I guess you avoid doing any CFG changing transformation
for the profiling stage?  I.e. not even inline the simplest functions?
That would be the reason the Intel compiler is unusable with profiling
for me.  -fprofile-generate comes with a 50fold increase in runtime!

> It would be nice to experiment with this a little - in general the
> heuristics can be viewed as having three players.  There are the limits
> (specified via --param) that it must obey, there is the cost model
> (estimated growth for inlining into all callees without profiling and
> the execute_count to estimated growth for inlining to one call with
> profiling) and the bin packing algorithm optimizing the gains while
> obeying the limits.
>
> With profiling, the cost model is pretty much realistic and it would
> be nice to figure out how the performance behaves when the individual
> limits are changed and why.  If you have some time for experimentation,
> it would be very useful.  I am trying to do the same with SPEC and GCC
> but I have difficulty playing with pooma or Gerald's application as I
> have little understanding of what is going on there.  I will try it myself
> next but any feedback can be very useful here.

I can produce some numbers for the tramp testcase.
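
To make the "three players" in the quoted text concrete, here is a
minimal sketch of how limits, a profile-based cost model and a greedy
packing step could fit together.  It is not GCC's implementation: the
names CallSite, benefit, choose_inlines and max_unit_growth_percent are
hypothetical, the growth units are arbitrary, and a simple greedy pass
stands in for whatever packing strategy the real inliner uses.

```python
from dataclasses import dataclass


@dataclass
class CallSite:
    caller: str
    callee: str
    growth: int         # estimated code growth if this call is inlined (assumed units)
    execute_count: int  # profile count for this call site


def benefit(site: CallSite) -> float:
    # Cost model with profiling: execute_count relative to estimated growth.
    # Growth is clamped to 1 so shrinking or zero-growth inlines do not divide by zero.
    return site.execute_count / max(site.growth, 1)


def choose_inlines(sites, unit_size, max_unit_growth_percent):
    # Limit: total growth may not exceed a percentage of the unit size
    # (a stand-in for a --param style limit; the name is an assumption).
    budget = unit_size * max_unit_growth_percent // 100
    chosen, used = [], 0
    # Greedy packing: take the most beneficial call sites first and
    # skip any candidate that would exceed the remaining budget.
    for site in sorted(sites, key=benefit, reverse=True):
        if used + site.growth <= budget:
            chosen.append(site)
            used += site.growth
    return chosen, used


if __name__ == "__main__":
    sites = [
        CallSite("main", "hot", growth=20, execute_count=1000),
        CallSite("main", "cold", growth=10, execute_count=1),
        CallSite("loop", "kernel", growth=50, execute_count=5000),
    ]
    chosen, used = choose_inlines(sites, unit_size=200, max_unit_growth_percent=30)
    for site in chosen:
        print(site.caller, "->", site.callee, "growth", site.growth)
    print("total growth:", used)
```

Varying max_unit_growth_percent in such a toy model shows how the limit,
rather than the cost model, can decide which hot call gets inlined; that
is the kind of effect experiments with the individual limits would have
to separate out.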

> My plan is to try to understand the limits first and then try to get the
> cost model better without profiling as it is a bit too clumsy to do both
> at once.

Do you have some written overview of the cost model?

Richard.

--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18704