From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 89035 invoked by alias); 14 Sep 2016 12:05:09 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 89021 invoked by uid 89); 14 Sep 2016 12:05:08 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,RCVD_IN_SORBS_SPAM,SPF_PASS autolearn=ham version=3.3.2 spammy=Discover, sop, sk:single_, sk:SINGLE_ X-HELO: mail-wm0-f49.google.com Received: from mail-wm0-f49.google.com (HELO mail-wm0-f49.google.com) (74.125.82.49) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 14 Sep 2016 12:04:57 +0000 Received: by mail-wm0-f49.google.com with SMTP id 1so26483473wmz.1 for ; Wed, 14 Sep 2016 05:04:57 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=ASMzBpRP98stzNaPtSMNQYAgRtJ0wDURLCK+cgIaMkg=; b=Y5oKnbBOYPzeqVa3tzZZJQJJDd4QCvmw9NZVydRZdiOJYa34X4v904P31PnEvZfMVA LX8n5vxy5jGnvlrw7ODgFK6wfk29PeU1bNQ+fhcbZOyj9k9RpwbNdBu0lVE9JWu7XG2y 82QFDScDYRk7MPJtt4BPCrYHdH0VQqBiu+qWn2q+u4qc0W9I1UUQoBPQv8y2Iq+JNpwS a9xdUnVIYTcmD8dEMeLG/gTTJymzkqgbY99Gzpjikmxx4ny5R6J5UTraYJXlqFWNqF+k tgMYgP55NzC7AR2rOAKddDqohdGoMkyooGYDVSpOaE3VBc+DlYx2/JJbpZlin+Azt2kA i6wQ== X-Gm-Message-State: AE9vXwM/tOHTh1J/3+PTDLNFl/tkfWBIc5xSahNO+S8qzYJqzLqyejBSJRvV0DEvRvk4IB0C1iAmaskWSvO7Gw== X-Received: by 10.194.61.72 with SMTP id n8mr2328845wjr.74.1473854695532; Wed, 14 Sep 2016 05:04:55 -0700 (PDT) MIME-Version: 1.0 Received: by 10.28.137.129 with HTTP; Wed, 14 Sep 2016 05:04:54 -0700 (PDT) In-Reply-To: References: <57886949.8010300@linaro.org> <57886A71.6030103@linaro.org> <57888BD6.6070200@linaro.org> <578891C8.7090609@linaro.org> <19ff8188-aed7-0f9e-cc0b-0603698787ff@linaro.org> <48e42d0c-057c-312a-4e41-cd78c8b38b5e@linaro.org> From: Richard Biener Date: Wed, 14 Sep 2016 12:11:00 -0000 Message-ID: Subject: Re: [RFC][IPA-VRP] Early VRP Implementation To: Kugan Vivekanandarajah Cc: Andrew Pinski , "gcc-patches@gcc.gnu.org" , Jan Hubicka , Martin Jambor Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes X-SW-Source: 2016-09/txt/msg00807.txt.bz2 On Tue, Aug 23, 2016 at 4:11 AM, Kugan Vivekanandarajah wrote: > Hi, > > On 19 August 2016 at 21:41, Richard Biener wrote: >> On Tue, Aug 16, 2016 at 9:45 AM, kugan >> wrote: >>> Hi Richard, >>> >>> On 12/08/16 20:43, Richard Biener wrote: >>>> >>>> On Wed, Aug 3, 2016 at 3:17 AM, kugan >>>> wrote: >>> >>> >>> [SNIP] >>> >>>> >>>> diff --git a/gcc/common.opt b/gcc/common.opt >>>> index 8a292ed..7028cd4 100644 >>>> --- a/gcc/common.opt >>>> +++ b/gcc/common.opt >>>> @@ -2482,6 +2482,10 @@ ftree-vrp >>>> Common Report Var(flag_tree_vrp) Init(0) Optimization >>>> Perform Value Range Propagation on trees. >>>> >>>> +fdisable-tree-evrp >>>> +Common Report Var(flag_disable_early_vrp) Init(0) Optimization >>>> +Disable Early Value Range Propagation on trees. >>>> + >>>> >>>> no please, this is automatically supported via -fdisable- >>> >>> >>> I am now having -ftree-evrp which is enabled all the time. But This will >>> only be used for disabling the early-vrp. That is, early-vrp will be run >>> when ftree-vrp is enabled and ftree-evrp is not explicitly disabled. Is this >>> OK? >> >> Why would one want to disable early-vrp? I see you do this in the testsuite >> for non-early VRP unit-tests but using -fdisable-tree-evrp1 there >> would be ok as well. > > Removed it altogether. I though that you wanted a way to disable > early-vrp for testing purposes. But there is via the generic -fdisable-tree-DUMPFILE way. >>>> >>>> @@ -1728,11 +1736,12 @@ extract_range_from_assert (value_range *vr_p, tree >>>> expr) >>>> always false. */ >>>> >>>> static void >>>> -extract_range_from_ssa_name (value_range *vr, tree var) >>>> +extract_range_from_ssa_name (value_range *vr, bool dom_p, tree var) >>>> { >>>> value_range *var_vr = get_value_range (var); >>>> >>>> - if (var_vr->type != VR_VARYING) >>>> + if (var_vr->type != VR_VARYING >>>> + && (!dom_p || var_vr->type != VR_UNDEFINED)) >>>> copy_value_range (vr, var_vr); >>>> else >>>> set_value_range (vr, VR_RANGE, var, var, NULL); >>>> >>>> why do you need these changes? I think I already told you you need to >>>> initialize the lattice to sth else than VR_UNDEFINED and that you can't >>>> fully re-use update_value_range. If you don't want to do that then >>>> instead >>>> of doing changes all over the place do it in get_value_range and have a >>>> global flag. >>> >>> >>> I have now added a global early_vrp_p and use this to initialize >>> VR_INITIALIZER and get_value_range default to VR_VARYING. >> >> ICK. Ok, I see that this works, but it is quite ugly, so (see below) >> >>>> >>>> >>>> @@ -3594,7 +3643,8 @@ extract_range_from_cond_expr (value_range *vr, >>>> gassign *stmt) >>>> on the range of its operand and the expression code. */ >>>> >>>> static void >>>> -extract_range_from_comparison (value_range *vr, enum tree_code code, >>>> +extract_range_from_comparison (value_range *vr, >>>> + enum tree_code code, >>>> tree type, tree op0, tree op1) >>>> { >>>> bool sop = false; >>>> >>>> remove these kind of no-op changes. >>> >>> >>> Done. >>> >>> >>>> >>>> +/* Initialize local data structures for VRP. If DOM_P is true, >>>> + we will be calling this from early_vrp where value range propagation >>>> + is done by visiting stmts in dominator tree. ssa_propagate engine >>>> + is not used in this case and that part of the ininitialization will >>>> + be skipped. */ >>>> >>>> static void >>>> -vrp_initialize (void) >>>> +vrp_initialize (bool dom_p) >>>> { >>>> basic_block bb; >>>> >>>> @@ -6949,6 +7010,9 @@ vrp_initialize (void) >>>> vr_phi_edge_counts = XCNEWVEC (int, num_ssa_names); >>>> bitmap_obstack_initialize (&vrp_equiv_obstack); >>>> >>>> + if (dom_p) >>>> + return; >>>> + >>>> >>>> split the function instead. >>>> >>>> @@ -7926,7 +7992,8 @@ vrp_visit_switch_stmt (gswitch *stmt, edge >>>> *taken_edge_p) >>>> If STMT produces a varying value, return SSA_PROP_VARYING. */ >>>> >>>> static enum ssa_prop_result >>>> -vrp_visit_stmt (gimple *stmt, edge *taken_edge_p, tree *output_p) >>>> +vrp_visit_stmt_worker (gimple *stmt, bool dom_p, edge *taken_edge_p, >>>> + tree *output_p) >>>> { >>>> tree def; >>>> ssa_op_iter iter; >>>> @@ -7940,7 +8007,7 @@ vrp_visit_stmt (gimple *stmt, edge >>>> *taken_edge_p, tree *output_p) >>>> if (!stmt_interesting_for_vrp (stmt)) >>>> gcc_assert (stmt_ends_bb_p (stmt)); >>>> else if (is_gimple_assign (stmt) || is_gimple_call (stmt)) >>>> - return vrp_visit_assignment_or_call (stmt, output_p); >>>> + return vrp_visit_assignment_or_call (stmt, dom_p, output_p); >>>> else if (gimple_code (stmt) == GIMPLE_COND) >>>> return vrp_visit_cond_stmt (as_a (stmt), taken_edge_p); >>>> else if (gimple_code (stmt) == GIMPLE_SWITCH) >>>> @@ -7954,6 +8021,12 @@ vrp_visit_stmt (gimple *stmt, edge >>>> *taken_edge_p, tree *output_p) >>>> return SSA_PROP_VARYING; >>>> } >>>> >>>> +static enum ssa_prop_result >>>> +vrp_visit_stmt (gimple *stmt, edge *taken_edge_p, tree *output_p) >>>> +{ >>>> + return vrp_visit_stmt_worker (stmt, false, taken_edge_p, output_p); >>>> +} >>>> >>>> as said the refactoring that would be appreciated is to split out the >>>> update_value_range calls >>>> from the worker functions so you can call the respective functions >>>> from the DOM implementations. >>>> That they are globbed in vrp_visit_stmt currently is due to the API of >>>> the SSA propagator. >>> >>> Sent this as separate patch for easy reviewing and testing. >> >> Thanks. >> >>>> >>>> @@ -8768,6 +8842,12 @@ vrp_visit_phi_node (gphi *phi) >>>> fprintf (dump_file, "\n"); >>>> } >>>> >>>> + if (dom_p && vr_arg.type == VR_UNDEFINED) >>>> + { >>>> + set_value_range_to_varying (&vr_result); >>>> + break; >>>> + } >>>> + >>>> >>>> eh... ok, so another way to attack this is, instead of initializing >>>> the lattice to sth else >>>> than VR_UNDEFINED, make sure to drop the lattice to varying for all PHI >>>> args on >>>> yet unvisited incoming edges (you're not doing optimistic VRP). That's >>>> the only >>>> place you _have_ to do it. >>> >>> >>> Even when it is initialize (as I am doing now), we can still end up with >>> VR_UNDEFINED during the propagation. I have just left the above so that we >>> dont end up with the wrong VR. >> >> How would we end up with wrong VRs? I see >> >> + /* Discover VR when condition is true. */ >> + extract_range_for_var_from_comparison_expr (op0, code, op0, op1, &vr); >> + if (old_vr->type == VR_RANGE || old_vr->type == VR_ANTI_RANGE) >> + vrp_intersect_ranges (&vr, old_vr); >> >> where you already disregard UNDEFINED as old_vr. >> >> What you might be missing is to set all DEFs on !stmt_interesting_for_vrp to >> VARYING. Also >> >> + if (stmt_interesting_for_vrp (stmt)) >> + { >> + tree lhs = gimple_get_lhs (stmt); >> + value_range vr = VR_INITIALIZER; >> + vrp_visit_stmt_worker (stmt, &taken_edge, &output, &vr); >> + update_value_range (lhs, &vr); >> + >> + /* Try folding stmts with the VR discovered. */ >> + if (fold_stmt (&gsi, follow_single_use_edges)) >> + update_stmt (gsi_stmt (gsi)); >> >> folding here can introduce additional stmts and thus unvisited SSA defs >> which would end up with VR_UNINITIALIZED defs - I don't see exactly >> where that would cause issues but I'm not sure it might not. So better >> use fold_stmt_inplace or alternatively if fold_stmt returned true then, >> remembering the previous stmt gsi, iterate over all stmts emitted and >> set their defs to varying (or re-visit them). See for example how >> tree-ssa-forwprop.c does this re-visiting (it revisits all changed stmts). > > I am now setting the results of stmts that are not interesting to vrp > to VR_VARYING. > > Also, before visiting PHI, if any of the arguments are undefined, > result of the PHI is also set varying. > > I have removed all the others. This now bootstraps and passes > regressions (see attached patch). > > >> Note that you want to have a custom valueize function instead of just >> follow_single_use_edges as you want to valueize all SSA names according >> to their lattice value (if it has a single value). You can use vrp_valueize >> for this though that gets you non-single-use edge following as well. >> Eventually it's going to be cleaner to do what the SSA propagator does and >> before folding do >> >> did_replace = replace_uses_in (stmt, vrp_valueize); >> if (fold_stmt (&gsi, follow_single_use_edges) >> || did_replace) >> update_stmt (gsi_stmt (gsi)); >> >> exporting replace_uses_in for this is ok. I guess I prefer this for now. > > I also added the above. I noticed that I need > recompute_tree_invariant_for_addr_expr as in ssa_propagate. My initial > implementation also had gimple_purge_all_dead_eh_edges and > fixup_noreturn_call as in ssa_propagat but I thinj that is not needed > as it would be done at the end of the pass. I don't see this being done at the end of the pass. So please re-instantiate that parts. > With this I noticed more stmts are folded before vrp1. This required > me to adjust some testcases. > >> >> Overall this now looks good apart from the folding and the VR_INITIALIZER thing. >> >> You can commit the already approved refactoring changes and combine this >> patch with the struct value_range move, this way I can more easily look into >> issues with the UNDEFINED thing if you can come up with a testcase that >> doesn't work. >> > > I have now committed all the dependent patches. > > Attached patch passes regression and bootstrap except pr33738.C. This > is an unrelated issue as discussed in > https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01386.html > > Is this OK? +/* Initialize local data structures for VRP. If DOM_P is true, + we will be calling this from early_vrp where value range propagation + is done by visiting stmts in dominator tree. ssa_propagate engine + is not used in this case and that part of the ininitialization will + be skipped. */ + +static void +vrp_initialize () comment needs updating now. static void -extract_range_from_phi_node (gphi *phi, value_range *vr_result) +extract_range_from_phi_node (gphi *phi, value_range *vr_result, + bool early_vrp_p) { I don't think you need this changes now that you have stmt_visit_phi_node_in_dom_p guarding its call. +static bool +stmt_visit_phi_node_in_dom_p (gphi *phi) +{ + ssa_op_iter iter; + use_operand_p oprnd; + tree op; + value_range *vr; + FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE) + { + op = USE_FROM_PTR (oprnd); + if (TREE_CODE (op) == SSA_NAME) + { + vr = get_value_range (op); + if (vr->type == VR_UNDEFINED) + return false; + } + } I think this is overly conservative in never allowing UNDEFINED on PHI node args (even if the def was actually visited). I think that the most easy way to improve this bit would be to actually track visited blocks. You already set the EDGE_EXECUTABLE flag on edges so you could clear BB_VISITED on all blocks and set it in the before_dom_children hook (at the end). Then the above can be folded into the PHI visiting: bool has_unvisited_pred = false; FOR_EACH_EDGE (e, ei, bb->preds) if (!(e->src->flags & BB_VISITED)) { has_unvisited_preds = true; break; } + /* Visit PHI stmts and discover any new VRs possible. */ + gimple_stmt_iterator gsi; + for (gphi_iterator gpi = gsi_start_phis (bb); + !gsi_end_p (gpi); gsi_next (&gpi)) + { + gphi *phi = gpi.phi (); + tree lhs = PHI_RESULT (phi); + value_range vr_result = VR_INITIALIZER; + if (! has_unvisived_preds && stmt_interesting_for_vrp (phi) + && stmt_visit_phi_node_in_dom_p (phi)) + extract_range_from_phi_node (phi, &vr_result, true); + else + set_value_range_to_varying (&vr_result); + update_value_range (lhs, &vr_result); + } due to a bug in IRA you need to make sure to un-set BB_VISITED after early-vrp is finished again. + /* Try folding stmts with the VR discovered. */ + bool did_replace = replace_uses_in (stmt, evrp_valueize); + if (fold_stmt (&gsi, follow_single_use_edges) + || did_replace) + update_stmt (gsi_stmt (gsi)); you should be able to re-use vrp_valueize here. + def_operand_p def_p = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF); + /* Set the SSA with the value range. */ + if (def_p + && TREE_CODE (DEF_FROM_PTR (def_p)) == SSA_NAME + && INTEGRAL_TYPE_P (TREE_TYPE (DEF_FROM_PTR (def_p)))) + { + tree def = DEF_FROM_PTR (def_p); + unsigned ver = SSA_NAME_VERSION (def); + if ((vr_value[ver]->type == VR_RANGE Use get_value_range () please, not direct access to vr_value. + || vr_value[ver]->type == VR_ANTI_RANGE) + && (TREE_CODE (vr_value[ver]->min) == INTEGER_CST) + && (TREE_CODE (vr_value[ver]->max) == INTEGER_CST)) + set_range_info (def, vr_value[ver]->type, vr_value[ver]->min, + vr_value[ver]->max); + } Otherwise the patch looks good now (with a lot of improvement possibilities of course). Thanks and sorry for the delay, Richard. > Thanks, > Kugan > > >> Thanks, >> Richard. >> >>> I also noticed that g++.dg/warn/pr33738.C testcase is now failing. This is >>> because, with early-vrp setting value range ccp2 is optimizing without >>> issuing a warning. I will look into it. >>> >>> bootstrap and regression testing is in progress. >>> >>> Thanks, >>> Kugan