From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 127173 invoked by alias); 21 Feb 2017 15:49:46 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 127163 invoked by uid 89); 21 Feb 2017 15:49:46 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.0 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY,RP_MATCHES_RCVD autolearn=no version=3.3.2 spammy=Hx-languages-length:4594
X-HELO: nikam.ms.mff.cuni.cz
Received: from nikam.ms.mff.cuni.cz (HELO nikam.ms.mff.cuni.cz) (195.113.20.16)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 21 Feb 2017 15:49:44 +0000
Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202)
 id 7AC07547D2A; Tue, 21 Feb 2017 16:49:41 +0100 (CET)
Date: Tue, 21 Feb 2017 15:52:00 -0000
From: Jan Hubicka
To: "Bin.Cheng"
Cc: Jan Hubicka, Richard Biener, "gcc-patches@gcc.gnu.org", "pthaugen@linux.vnet.ibm.com"
Subject: Re: [PATCH PR77536]Generate correct profiling information for vectorized loop
Message-ID: <20170221154941.GA33977@kam.mff.cuni.cz>
References: <20170220140210.GA2932@kam.mff.cuni.cz> <20170220151705.GA29965@kam.mff.cuni.cz> <20170220160509.GA2669@kam.mff.cuni.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.23 (2014-03-12)
X-SW-Source: 2017-02/txt/msg01304.txt.bz2

> 2017-02-21  Bin Cheng
>
> 	PR tree-optimization/77536
> 	* tree-ssa-loop-manip.c (niter_for_unrolled_loop): New function.
> 	(tree_transform_and_unroll_loop): Use above function to compute the
> 	estimated niter of unrolled loop and use it when scaling profile.
> 	* tree-ssa-loop-manip.h niter_for_unrolled_loop(): New declaration.
> 	* tree-vect-loop.c (scale_profile_for_vect_loop): New function.
> 	(vect_transform_loop): Call above function.
>
> gcc/testsuite/ChangeLog
> 2017-02-21  Bin Cheng
>
> 	PR tree-optimization/77536
> 	* gcc.dg/vect/pr79347.c: Revise testing string.

> @@ -1329,7 +1339,12 @@ tree_transform_and_unroll_loop (struct loop *loop, unsigned factor,
>    freq_h = loop->header->frequency;
>    freq_e = EDGE_FREQUENCY (loop_preheader_edge (loop));
>    if (freq_h != 0)
> -    scale_loop_frequencies (loop, freq_e * (new_est_niter + 1), freq_h);
> +    {
> +      gcov_type scale;
> +      /* This should not overflow. */
> +      scale = GCOV_COMPUTE_SCALE (freq_e * (new_est_niter + 1), freq_h);
> +      scale_loop_frequencies (loop, scale, REG_BR_PROB_BASE);

You need to use counts when new_est_niter is derived from profile feedback.
This is because frequencies are capped at 10000, so if the loop iterates very
many times, new_est_niter will be large, freq_h will be 10000 and freq_e will
be 0.

Also watch the case when freq_e == loop_preheader_edge (loop)->count == 0 and
freq_h is non-zero.  Just do MAX (freq_e, 1), so that the loop body profile
is not dropped to 0.

> +/* Scale profiling counters by estimation for LOOP which is vectorized
> +   by factor VF. */
> +
> +static void
> +scale_profile_for_vect_loop (struct loop *loop, unsigned vf)
> +{
> +  edge preheader = loop_preheader_edge (loop);
> +  unsigned freq_h = loop->header->frequency;
> +  unsigned freq_e = EDGE_FREQUENCY (preheader);
> +  /* Reduce loop iterations by the vectorization factor. */
> +  gcov_type new_est_niter = niter_for_unrolled_loop (loop, vf);
> +
> +  /* Use profiling count information if frequencies are zero. */
> +  if (freq_h == 0 || freq_e == 0)
> +    {
> +      freq_e = preheader->count;
> +      freq_h = loop->header->count;
> +    }
> +
> +  if (freq_h != 0)
> +    {
> +      gcov_type scale;
> +      /* This should not overflow. */
> +      scale = GCOV_COMPUTE_SCALE (freq_e * (new_est_niter + 1), freq_h);
> +      scale_loop_frequencies (loop, scale, REG_BR_PROB_BASE);
> +    }

Similarly here.  Use counts when they are non-zero and use MAX (freq_e, 1).
freq_e and freq_h need to be gcov_type in that case.

Patch is OK with these changes.  Thanks a lot!
Honza
> +
> +  basic_block exit_bb = single_pred (loop->latch);
> +  edge exit_e = single_exit (loop);
> +  exit_e->count = loop_preheader_edge (loop)->count;
> +  exit_e->probability = REG_BR_PROB_BASE / (new_est_niter + 1);
> +
> +  edge exit_l = single_pred_edge (loop->latch);
> +  int prob = exit_l->probability;
> +  exit_l->probability = REG_BR_PROB_BASE - exit_e->probability;
> +  exit_l->count = exit_bb->count - exit_e->count;
> +  if (exit_l->count < 0)
> +    exit_l->count = 0;
> +  if (prob > 0)
> +    scale_bbs_frequencies_int (&loop->latch, 1, exit_l->probability, prob);
> +}
> +
>  /* Function vect_transform_loop.
>
>     The analysis phase has determined that the loop is vectorizable.
> @@ -6743,16 +6785,10 @@ vect_transform_loop (loop_vec_info loop_vinfo)
>    bool transform_pattern_stmt = false;
>    bool check_profitability = false;
>    int th;
> -  /* Record number of iterations before we started tampering with the profile. */
> -  gcov_type expected_iterations = expected_loop_iterations_unbounded (loop);
>
>    if (dump_enabled_p ())
>      dump_printf_loc (MSG_NOTE, vect_location, "=== vec_transform_loop ===\n");
>
> -  /* If profile is inprecise, we have chance to fix it up. */
> -  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
> -    expected_iterations = LOOP_VINFO_INT_NITERS (loop_vinfo);
> -
>    /* Use the more conservative vectorization threshold. If the number
>       of iterations is constant assume the cost check has been performed
>       by our caller. If the threshold makes all loops profitable that
> @@ -7068,9 +7104,8 @@ vect_transform_loop (loop_vec_info loop_vinfo)
>
>    slpeel_make_loop_iterate_ntimes (loop, niters_vector);
>
> -  /* Reduce loop iterations by the vectorization factor. */
> -  scale_loop_profile (loop, GCOV_COMPUTE_SCALE (1, vf),
> -		      expected_iterations / vf);
> +  scale_profile_for_vect_loop (loop, vf);
> +
>    /* The minimum number of iterations performed by the epilogue. This
>       is 1 when peeling for gaps because we always need a final scalar
>       iteration. */
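
For illustration only, here is a minimal standalone sketch in C of the
count-versus-frequency point made in the review comments above.  It is not
the GCC code: gcov_type and REG_BR_PROB_BASE are redefined locally as
stand-ins, compute_scale is a simplified model of a rounded scale
computation, and vect_loop_scale is a hypothetical helper name.  It assumes
counts are preferred whenever the header count is non-zero and that the
entry value is clamped to at least 1.

```c
#include <stdint.h>

/* Local stand-ins for GCC internals, assumed here for illustration.  */
typedef int64_t gcov_type;
#define REG_BR_PROB_BASE 10000

/* Simplified model of a rounded scale NUM / DEN in REG_BR_PROB_BASE units.  */
static gcov_type
compute_scale (gcov_type num, gcov_type den)
{
  return den ? (num * REG_BR_PROB_BASE + den / 2) / den : REG_BR_PROB_BASE;
}

/* Hypothetical helper: scale to apply to the loop body profile.
   Prefer profile counts when the header count is non-zero, otherwise
   fall back to frequencies.  The entry value is clamped to at least 1
   so the loop body profile is not dropped to 0.  */
static gcov_type
vect_loop_scale (gcov_type count_e, gcov_type count_h,
                 int freq_e, int freq_h, gcov_type new_est_niter)
{
  gcov_type e, h;

  if (count_h != 0)
    {
      /* Counts are not capped, so they stay meaningful for hot loops.  */
      e = count_e;
      h = count_h;
    }
  else
    {
      /* Frequencies are capped at REG_BR_PROB_BASE.  */
      e = freq_e;
      h = freq_h;
    }

  if (h == 0)
    return REG_BR_PROB_BASE;	/* No profile at all: leave unscaled.  */

  if (e < 1)
    e = 1;			/* The MAX (freq_e, 1) suggestion.  */

  return compute_scale (e * (new_est_niter + 1), h);
}
```

With a loop entered once whose header count is 1000000 and whose estimated
iteration count after vectorization is 249999, the frequency view would have
freq_e == 0 and freq_h == 10000, while the count view still carries the real
ratio between entry and header.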