From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 20516 invoked by alias); 6 Dec 2004 12:45:05 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 20422 invoked by alias); 6 Dec 2004 12:44:55 -0000
Date: Mon, 06 Dec 2004 12:45:00 -0000
Message-ID: <20041206124455.20421.qmail@sourceware.org>
From: "hubicka at ucw dot cz"
To: gcc-bugs@gcc.gnu.org
In-Reply-To: <20041128181553.18704.rguenth@tat.physik.uni-tuebingen.de>
References: <20041128181553.18704.rguenth@tat.physik.uni-tuebingen.de>
Reply-To: gcc-bugzilla@gcc.gnu.org
Subject: [Bug tree-optimization/18704] [4.0 Regression] Inlining limits cause 340% performance regression
X-Bugzilla-Reason: CC
X-SW-Source: 2004-12/txt/msg00836.txt.bz2
List-Id:

------- Additional Comments From hubicka at ucw dot cz  2004-12-06 12:44 -------
Subject: Re: [4.0 Regression] Inlining limits cause 340% performance regression

>
> ------- Additional Comments From rguenth at tat dot physik dot uni-tuebingen dot de  2004-12-06 09:53 -------
> Subject: Re: [4.0 Regression] Inlining limits
>  cause 340% performance regression
>
> On 6 Dec 2004, pinskia at gcc dot gnu dot org wrote:
>
> > No reason to keep this one open, there is PR 17863 still.  Also note I heard from Honza that the tree
> > profiling branch with feedback can optimize better than with your leafy patch.
>
> Wow, that would be cool.  Does the tree-profiling branch contain the
> cfg inliner?  I'll try it asap.

The cfg inliner per se is not too interesting.  What matters here is the
code size estimation and the profitability estimation.  I am playing with
this now and trying to get profile-based inlining working.

For -n10 and tramp3d.cc I need 2m14s on mainline and 1m31s on the current
tree-profiling.  With my new implementation I need 0m27s with profile
feedback and 2m53s without.
I wonder what makes the new heuristics work worse without profiling, but
by just increasing inline-unit-growth very slightly (to 155) I get 0m42s.
This might be just a little instability in the order of inlining decisions
affecting the result.  I would be curious how those results compare to
leafify and whether the 0m27s is not caused by a misoptimization.

Unless I observe otherwise (on SPEC with intermodule), I will apply my
current patch and try to improve the profitability analysis without
profiling incrementally.  Ideally we ought to build an estimated profile
and use it, but that needs some work, so for the moment I guess I will
experiment with making loop depth available to the cgraph code.

Honza

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18704