From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 1405 invoked by alias); 20 Nov 2004 05:04:22 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 1368 invoked from network); 20 Nov 2004 05:04:16 -0000 Received: from unknown (HELO nikam.ms.mff.cuni.cz) (195.113.20.16) by sourceware.org with SMTP; 20 Nov 2004 05:04:16 -0000 Received: from camelot.ms.mff.cuni.cz (kampanus.ms.mff.cuni.cz [195.113.18.107]) by nikam.ms.mff.cuni.cz (Postfix) with SMTP id 58B1F4DDEB; Sat, 20 Nov 2004 06:04:18 +0100 (CET) Received: by camelot.ms.mff.cuni.cz (sSMTP sendmail emulation); Sat, 20 Nov 2004 06:05:12 +0100 Date: Sat, 20 Nov 2004 14:14:00 -0000 From: Jan Hubicka To: Jeffrey A Law Cc: jh@suse.com, gcc@gcc.gnu.org Subject: Re: Slow profile updating (pr 15524) Message-ID: <20041120050512.GO19066@kam.mff.cuni.cz> References: <1100904370.27318.54.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1100904370.27318.54.camel@localhost.localdomain> User-Agent: Mutt/1.3.28i X-SW-Source: 2004-11/txt/msg00715.txt.bz2 > > Believe it or not we're at a point where updating of the profile in > response to a jump thread is the most expensive routine in the > compiler for PR 15524. > > > If we look up update_bb_profile_for_threading we see one loop of > significance: > > else > FOR_EACH_EDGE (c, ei, bb->succs) > c->probability = ((c->probability * REG_BR_PROB_BASE) / (double) > prob); > > > So anytime we thread through some block BB, we have to walk through > all its successors to rescale their probabilities. Needless to say > that gets rather expensive, especially if BB has a large switch > statement and several of its incoming edges are threadable. > > Is it the case that the computation of c->probability actually > has to happen in the order you specified via the parenthesis? If > not, then we could precompute REG_BR_PROB_BASE / (double) prob > outside the loop which would result in a loop like > > tmp = REG_BR_PROB_BASE / (double) prob; > FOR_EACH_EDGE (c, ei, bb->succs) > c->probability *= tmp; > > Which would probably provide a reasonable improvement. It is just rescaling. I think the (double) cast is actually redundant here, but if we want to go into double precision we might lift the division out as you sugest. > > And if that's safe, then we'd probably want to rewrite it like > > else if (prob != REG_BR_PROB_BASE) > { > double tmp = REG_BR_PROB_BASE / (double) prob; > FOR_EACH_EDGE 9c, ei, bb->succs) > c->probability *= tmp; > } > > Which avoids the loop completely if nothing is going to change (as is > the case for pr15524). Yep, that would be fine too ;) > > Doing something like this would give us a net improvement of 15-20% on > PR 15524. > > Alternately we might want to look at whether or not we can rescale the > successor blocks en-masse after all the redirections for a particular > block are complete. This might be good idea too, but we need to be careful that we won't confuse ourselves while doing the other updates, so it is rather fragile... Honza > > Thoughts? > > jeff >