From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 20594 invoked by alias); 30 May 2016 16:05:02 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 20564 invoked by uid 89); 30 May 2016 16:05:01 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_PASS autolearn=ham version=3.3.2 spammy=22f, D*cz, COST
X-HELO: mx2.suse.de
Received: from mx2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Mon, 30 May 2016 16:04:51 +0000
Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id EC28BAB43; Mon, 30 May 2016 16:04:47 +0000 (UTC)
Subject: Re: [PATCH 2/3] Add profiling support for IVOPTS
To: "Bin.Cheng"
References: <00578bce6fdccccacdd740ff0067ccde46f98f51.1461931011.git.mliska@suse.cz> <5739D16A.9020907@suse.cz> <573D9549.4070700@suse.cz>
Cc: gcc-patches List
From: =?UTF-8?Q?Martin_Li=c5=a1ka?=
Message-ID: <574C649F.8000808@suse.cz>
Date: Mon, 30 May 2016 19:51:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.2
MIME-Version: 1.0
In-Reply-To:
Content-Type: multipart/mixed; boundary="------------030507020600000603010006"
X-IsSubscribed: yes
X-SW-Source: 2016-05/txt/msg02359.txt.bz2

This is a multi-part message in MIME format.
--------------030507020600000603010006
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Content-length: 449

On 05/24/2016 12:11 PM, Bin.Cheng wrote:
> Hi,
> Could you please factor out this as a function and remove the goto
> statements?  Okay with this change if no fallout in benchmarks you
> run.
>
> Thanks,
> bin

Hi.
Thanks for the review, I've just verified that it does not introduce any regression
on SPECv6 and it improves a couple of SPEC2006 benchmarks w/ PGO.

I'm going to install the patch and make a control run of benchmarks.

Thanks
Martin

--------------030507020600000603010006
Content-Type: text/x-patch; name="0002-Add-profiling-support-for-IVOPTS-final.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="0002-Add-profiling-support-for-IVOPTS-final.patch"
Content-length: 4118

>From 2991622862dd934e464f542e9e58270bf0088544 Mon Sep 17 00:00:00 2001
From: marxin
Date: Tue, 17 May 2016 15:22:43 +0200
Subject: [PATCH 1/5] Add profiling support for IVOPTS

gcc/ChangeLog:

2016-05-17  Martin Liska

        * tree-ssa-loop-ivopts.c (get_computation_cost_at): Scale
        computed costs by frequency of BB they belong to.
        (get_scaled_computation_cost_at): New function.
---
 gcc/tree-ssa-loop-ivopts.c | 62 ++++++++++++++++++++++++++++++++++------------
 1 file changed, 46 insertions(+), 16 deletions(-)

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index d770ec9..a541ef8 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -4794,7 +4794,33 @@ get_loop_invariant_expr (struct ivopts_data *data, tree ubase,
   return record_inv_expr (data, expr);
 }
 
+/* Scale (multiply) the computed COST (except scratch part that should be
+   hoisted out a loop) by header->frequency / AT->frequency,
+   which makes expected cost more accurate.  */
 
+static comp_cost
+get_scaled_computation_cost_at (ivopts_data *data, gimple *at, iv_cand *cand,
+                                comp_cost cost)
+{
+  int loop_freq = data->current_loop->header->frequency;
+  int bb_freq = at->bb->frequency;
+  if (loop_freq != 0)
+    {
+      gcc_assert (cost.scratch <= cost.cost);
+      int scaled_cost
+        = cost.scratch + (cost.cost - cost.scratch) * bb_freq / loop_freq;
+
+      if (dump_file && (dump_flags & TDF_DETAILS))
+        fprintf (dump_file, "Scaling iv_use based on cand %d "
+                 "by %2.2f: %d (scratch: %d) -> %d (%d/%d)\n",
+                 cand->id, 1.0f * bb_freq / loop_freq, cost.cost,
+                 cost.scratch, scaled_cost, bb_freq, loop_freq);
+
+      cost.cost = scaled_cost;
+    }
+
+  return cost;
+}
 
 /* Determines the cost of the computation by that USE is expressed
    from induction variable CAND.  If ADDRESS_P is true, we just need
@@ -4982,18 +5008,21 @@ get_computation_cost_at (struct ivopts_data *data,
      (symbol/var1/const parts may be omitted).  If we are looking for an
      address, find the cost of addressing this.  */
   if (address_p)
-    return cost + get_address_cost (symbol_present, var_present,
-                                    offset, ratio, cstepi,
-                                    mem_mode,
-                                    TYPE_ADDR_SPACE (TREE_TYPE (utype)),
-                                    speed, stmt_is_after_inc, can_autoinc);
+    {
+      cost += get_address_cost (symbol_present, var_present,
+                                offset, ratio, cstepi,
+                                mem_mode,
+                                TYPE_ADDR_SPACE (TREE_TYPE (utype)),
+                                speed, stmt_is_after_inc, can_autoinc);
+      return get_scaled_computation_cost_at (data, at, cand, cost);
+    }
 
   /* Otherwise estimate the costs for computing the expression.  */
   if (!symbol_present && !var_present && !offset)
     {
       if (ratio != 1)
         cost += mult_by_coeff_cost (ratio, TYPE_MODE (ctype), speed);
-      return cost;
+      return get_scaled_computation_cost_at (data, at, cand, cost);
     }
 
   /* Symbol + offset should be compile-time computable so consider that they
@@ -5012,24 +5041,25 @@ get_computation_cost_at (struct ivopts_data *data,
   aratio = ratio > 0 ? ratio : -ratio;
   if (aratio != 1)
     cost += mult_by_coeff_cost (aratio, TYPE_MODE (ctype), speed);
-  return cost;
+
+  return get_scaled_computation_cost_at (data, at, cand, cost);
 
 fallback:
   if (can_autoinc)
     *can_autoinc = false;
 
-  {
-    /* Just get the expression, expand it and measure the cost.  */
-    tree comp = get_computation_at (data->current_loop, use, cand, at);
+  /* Just get the expression, expand it and measure the cost.  */
+  tree comp = get_computation_at (data->current_loop, use, cand, at);
 
-    if (!comp)
-      return infinite_cost;
+  if (!comp)
+    return infinite_cost;
+
+  if (address_p)
+    comp = build_simple_mem_ref (comp);
 
-    if (address_p)
-      comp = build_simple_mem_ref (comp);
+  cost = comp_cost (computation_cost (comp, speed), 0);
 
-    return comp_cost (computation_cost (comp, speed), 0);
-  }
+  return get_scaled_computation_cost_at (data, at, cand, cost);
 }
 
 /* Determines the cost of the computation by that USE is expressed
-- 
2.8.3


--------------030507020600000603010006--
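
For readers who want to check the scaling arithmetic in get_scaled_computation_cost_at outside of GCC, here is a minimal standalone sketch in C. The names comp_cost_sketch and scale_cost_sketch are hypothetical stand-ins, not GCC identifiers, and the struct keeps only the two fields the patch touches. As in the patch body, the non-scratch part of the cost is multiplied by bb_freq / loop_freq (the frequency of the block containing AT over the loop header frequency) using integer arithmetic, and the cost is left unchanged when the header frequency is 0.

```c
#include <assert.h>

/* Hypothetical stand-in for GCC's comp_cost: only the two fields
   used by the scaling code in the patch.  */
struct comp_cost_sketch
{
  int cost;     /* Total cost, including the scratch part.  */
  int scratch;  /* Part expected to be hoisted out of the loop.  */
};

/* Scale the non-scratch part of C by BB_FREQ / LOOP_FREQ, mirroring
   the arithmetic in the patch.  When LOOP_FREQ is 0 the cost is
   returned unchanged, which also avoids a division by zero.  */
static struct comp_cost_sketch
scale_cost_sketch (struct comp_cost_sketch c, int bb_freq, int loop_freq)
{
  if (loop_freq != 0)
    {
      /* Same precondition the patch checks with gcc_assert.  */
      assert (c.scratch <= c.cost);
      c.cost = c.scratch + (c.cost - c.scratch) * bb_freq / loop_freq;
    }
  return c;
}
```

With a cost of 10 and a scratch part of 2, a use in a block of frequency 2500 inside a loop whose header has frequency 10000 scales to 2 + 8 * 2500 / 10000 = 4, while a use in a block as frequent as the header keeps its cost of 10.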