From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 30467 invoked by alias); 17 Apr 2011 08:35:46 -0000
Received: (qmail 30433 invoked by uid 22791); 17 Apr 2011 08:35:45 -0000
X-SWARE-Spam-Status: No, hits=-1.8 required=5.0 tests=AWL,BAYES_00,T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from nikam.ms.mff.cuni.cz (HELO nikam.ms.mff.cuni.cz) (195.113.20.16) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Sun, 17 Apr 2011 08:35:29 +0000
Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 616719AC863; Sun, 17 Apr 2011 10:35:28 +0200 (CEST)
Date: Sun, 17 Apr 2011 10:23:00 -0000
From: Jan Hubicka
To: Dominique Dhumieres
Cc: gcc-patches@gcc.gnu.org, richard.guenthe@gmail.com, hubicka@ucw.cz
Subject: Re: More of ipa-inline housekeeping
Message-ID: <20110417083528.GA28329@kam.mff.cuni.cz>
References: <20110415151933.7A5923BE19@mailhost.lps.ens.fr>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20110415151933.7A5923BE19@mailhost.lps.ens.fr>
User-Agent: Mutt/1.5.18 (2008-05-17)
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2011-04/txt/msg01300.txt.bz2

> AFAICT revision 172430 fixed the original problem in pr45810:
>
> gfc -Ofast -fwhole-program fatigue.f90 : 6.301u 0.003s 0:06.30
> gfc -Ofast -fwhole-program -flto fatigue.f90 : 6.263u 0.003s 0:06.26
>
> However if I play with --param max-inline-insns-auto=*, I get
>
> gfc -Ofast -fwhole-program --param max-inline-insns-auto=124 -fstack-arrays fatigue.f90 : 4.870u 0.002s 0:04.87
> gfc -Ofast -fwhole-program --param max-inline-insns-auto=125 -fstack-arrays fatigue.f90 : 2.872u 0.002s 0:02.87
>
> and
>
> gfc -Ofast -fwhole-program -flto --param max-inline-insns-auto=515 -fstack-arrays fatigue.f90 : 4.965u 0.003s 0:04.97
> gfc -Ofast -fwhole-program -flto --param max-inline-insns-auto=516
> -fstack-arrays fatigue.f90 : 2.732u 0.002s 0:02.73
>
> while I get the same threshold=125 with/without -flto at revision 172429.
> Note that I get the same thresholds without -fstack-arrays, the run times
> are only larger.

Thanks for the notice.  This was not really expected, but it gives some
insight.  I just tested a new cleanup patch of mine in which I fixed a few
minor bugs in corner cases.  One of the bugs I noticed was introduced by
this patch (an oversight while converting the code to the new accessor).
In the case of nested inlining, the stack usage was misaccounted, and
consequently we allowed more inlining than --param large-stack-frame-growth
would normally permit.  The vortex and wupwise improvements seem to be
gone, so I think they were due to this issue.

I never really tuned the stack frame growth heuristics, since they did not
cause any problems in the benchmarks.  On Fortran the situation is quite
different because large I/O blocks hit the limits very commonly, so I will
look into making the heuristics more permissive.  We can simply bump up
the limits, and/or we can teach the heuristics that when a call dominates
the return, there is not much stack to be saved by preventing inlining,
since both stack frames will end up on the stack anyway.  This means
adding a new bit recording whether a call edge dominates the exit and
using that info.  Simple noreturn IPA discovery can also be based on this,
and I recently noticed that it might be important for Mozilla.  So I will
give it a try soonish.

I will also look into the estimate_size ICE reported today.

Honza