From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 86358 invoked by alias); 30 May 2016 12:38:55 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 86301 invoked by uid 89); 30 May 2016 12:38:54 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.2 required=5.0 tests=AWL,BAYES_50,KAM_LAZY_DOMAIN_SECURITY,RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=noninnermost, sk:flag_p, UD:tree-ssa-loop-ivcanon.c, non-innermost X-HELO: nikam.ms.mff.cuni.cz Received: from nikam.ms.mff.cuni.cz (HELO nikam.ms.mff.cuni.cz) (195.113.20.16) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Mon, 30 May 2016 12:38:44 +0000 Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id BC63C543A70; Mon, 30 May 2016 14:38:40 +0200 (CEST) Date: Mon, 30 May 2016 14:39:00 -0000 From: Jan Hubicka To: gcc-patches@gcc.gnu.org Subject: Fix profile updating in loop peeling Message-ID: <20160530123840.GA82571@kam.mff.cuni.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-SW-Source: 2016-05/txt/msg02342.txt.bz2 Hi, this patch fixes profile updates in loop peeling pass. First it correctly set wont_exit which can only be set when we know the number of iterations tested by EXIT and this number is higher than maximal number of iterations (an unlikely case which is usually removed by VRP pass or earlier cunroll). Second problem is that we determine number of peelings as number of estimated iterations + 1. After peeling we currently underflow updating estimates which makes them to be re-computed later to bogus values. Last change is way we netermine profile for the new loop header. We used to drop the loop frequency 1/1000. This patch makes use of the info on remaining entry edges to the loop. While working on this I noticed that try_peel_loop has tendency to iterate and peel one loop many times. I will prepare followup for that. Also testcases will come once I commit the change enabling loop peeling at -O3 by using likely estimates. Bootstrapped/regtested x86_64-linux. * tree-ssa-loop-ivcanon.c (try_peel_loop): Correctly set wont_exit for peeled copies; avoid underflow when updating estimates; correctly scale loop profile. Index: tree-ssa-loop-ivcanon.c =================================================================== --- tree-ssa-loop-ivcanon.c (revision 236874) +++ tree-ssa-loop-ivcanon.c (working copy) @@ -970,7 +970,9 @@ try_peel_loop (struct loop *loop, if (!flag_peel_loops || PARAM_VALUE (PARAM_MAX_PEEL_TIMES) <= 0) return false; - /* Peel only innermost loops. */ + /* Peel only innermost loops. + While the code is perfectly capable of peeling non-innermost loops, + the heuristics would probably need some improvements. */ if (loop->inner) { if (dump_file) @@ -1029,13 +1031,23 @@ try_peel_loop (struct loop *loop, /* Duplicate possibly eliminating the exits. */ initialize_original_copy_tables (); wont_exit = sbitmap_alloc (npeel + 1); - bitmap_ones (wont_exit); - bitmap_clear_bit (wont_exit, 0); + if (exit && niter + && TREE_CODE (niter) == INTEGER_CST + && wi::leu_p (npeel, wi::to_widest (niter))) + { + bitmap_ones (wont_exit); + if (wi::eq_p (wi::to_widest (niter), npeel)) + bitmap_clear_bit (wont_exit, 0); + } + else + { + exit = NULL; + bitmap_clear (wont_exit); + } if (!gimple_duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop), npeel, wont_exit, exit, &to_remove, - DLTHE_FLAG_UPDATE_FREQ - | DLTHE_FLAG_COMPLETTE_PEEL)) + DLTHE_FLAG_UPDATE_FREQ)) { free_original_copy_tables (); free (wont_exit); @@ -1053,14 +1065,48 @@ try_peel_loop (struct loop *loop, fprintf (dump_file, "Peeled loop %d, %i times.\n", loop->num, (int) npeel); } + if (loop->any_estimate) + { + if (wi::ltu_p (npeel, loop->nb_iterations_estimate)) + loop->nb_iterations_estimate -= npeel; + else + loop->nb_iterations_estimate = 0; + } if (loop->any_upper_bound) - loop->nb_iterations_upper_bound -= npeel; + { + if (wi::ltu_p (npeel, loop->nb_iterations_estimate)) + loop->nb_iterations_upper_bound -= npeel; + else + loop->nb_iterations_upper_bound = 0; + } if (loop->any_likely_upper_bound) - loop->nb_iterations_likely_upper_bound -= npeel; - loop->nb_iterations_estimate = 0; - /* Make sure to mark loop cold so we do not try to peel it more. */ - scale_loop_profile (loop, 1, 0); - loop->header->count = 0; + { + if (wi::ltu_p (npeel, loop->nb_iterations_estimate)) + loop->nb_iterations_likely_upper_bound -= npeel; + else + { + loop->any_estimate = true; + loop->nb_iterations_estimate = 0; + loop->nb_iterations_likely_upper_bound = 0; + } + } + gcov_type entry_count = 0; + int entry_freq = 0; + + edge_iterator ei; + FOR_EACH_EDGE (e, ei, loop->header->preds) + if (e->src != loop->latch) + { + entry_count += e->src->count; + entry_freq += e->src->frequency; + gcc_assert (!flow_bb_inside_loop_p (loop, e->src)); + } + int scale = 1; + if (loop->header->count) + scale = RDIV (entry_count * REG_BR_PROB_BASE, loop->header->count); + else if (loop->header->frequency) + scale = RDIV (entry_freq * REG_BR_PROB_BASE, loop->header->frequency); + scale_loop_profile (loop, scale, 0); return true; } /* Adds a canonical induction variable to LOOP if suitable.