From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 6884 invoked by alias); 2 Jun 2015 09:40:23 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 6868 invoked by uid 89); 2 Jun 2015 09:40:22 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-oi0-f53.google.com Received: from mail-oi0-f53.google.com (HELO mail-oi0-f53.google.com) (209.85.218.53) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Tue, 02 Jun 2015 09:40:20 +0000 Received: by oiww2 with SMTP id w2so121379176oiw.0 for ; Tue, 02 Jun 2015 02:40:17 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.202.79.196 with SMTP id d187mr20449357oib.46.1433238017867; Tue, 02 Jun 2015 02:40:17 -0700 (PDT) Received: by 10.76.115.167 with HTTP; Tue, 2 Jun 2015 02:40:17 -0700 (PDT) In-Reply-To: References: <000001d097a3$c846a200$58d3e600$@arm.com> Date: Tue, 02 Jun 2015 09:42:00 -0000 Message-ID: Subject: Re: [PATCH GCC]Improve how we handle overflow in scev by using overflow information computed for control iv in loop niter, part II From: Richard Biener To: "Bin.Cheng" Cc: Bin Cheng , GCC Patches Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes X-SW-Source: 2015-06/txt/msg00170.txt.bz2 On Tue, Jun 2, 2015 at 11:30 AM, Bin.Cheng wrote: > On Tue, Jun 2, 2015 at 4:40 PM, Richard Biener > wrote: >> On Tue, Jun 2, 2015 at 4:55 AM, Bin.Cheng wrote: >>> On Mon, Jun 1, 2015 at 6:45 PM, Richard Biener >>> wrote: >>>> On Tue, May 26, 2015 at 1:04 PM, Bin Cheng wrote: >>>>> Hi, >>>>> My first part patch improving how we handle overflow in scev is posted at >>>>> https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01795.html . Here comes the >>>>> second part patch. >>>>> >>>>> This patch does below improvements: >>>>> 1) Computes and records control iv for each loop's exit edge. This >>>>> provides a way to compute overflow information in loop niter and use it in >>>>> different customers. It think it's useful, especially with option >>>>> -funsafe-loop-optimizers. >>>>> 2) Improve chrec_convert by adding new interface >>>>> loop_exits_before_overflow. It checks if a converted IV overflows wrto its >>>>> type and loop using overflow information of loop's control iv. This >>>>> basically propagates no-overflow information from control iv to ivs >>>>> converted from control iv. Moreover, we can further improve the logic by >>>>> using possible VRP information in the future. >>>> >>>> But 2) you already posted (and I have approved it but you didn't commit yet?). >>>> >>>> Can you commit that approved patch and only send the parts I didn't approve >>>> yet? >>>> >>>> Thanks, >>>> Richard. >>>> >>>>> With this patch, cases like scev-9.c and scev-10.c in patch can be handled >>>>> now. Cases reported in PR48052 can be vectorized too. >>>>> Opinions? >>>>> >>>>> Thanks, >>>>> bin >>>>> >>>>> >>>>> 2015-05-26 Bin Cheng >>>>> >>>>> * cfgloop.h (struct control_iv): New. >>>>> (struct loop): New field control_ivs. >>>>> * tree-ssa-loop-niter.c : Include "stor-layout.h". >>>>> (number_of_iterations_lt): Set no_overflow information. >>>>> (number_of_iterations_exit): Init control iv in niter struct. >>>>> (record_control_iv): New. >>>>> (estimate_numbers_of_iterations_loop): Call record_control_iv. >>>>> (loop_exits_before_overflow): New. Interface factored out of >>>>> scev_probably_wraps_p. >>>>> (scev_probably_wraps_p): Factor loop niter related code into >>>>> loop_exits_before_overflow. >>>>> (free_numbers_of_iterations_estimates_loop): Free control ivs. >>>>> * tree-ssa-loop-niter.h (free_loop_control_ivs): New. >>>>> >>>>> gcc/testsuite/ChangeLog >>>>> 2015-05-26 Bin Cheng >>>>> >>>>> PR tree-optimization/48052 >>>>> * gcc.dg/tree-ssa/scev-8.c: New. >>>>> * gcc.dg/tree-ssa/scev-9.c: New. >>>>> * gcc.dg/tree-ssa/scev-10.c: New. >>>>> * gcc.dg/vect/pr48052.c: New. >>>>> >>> >>> Hi Richard, >>> I think you replied the review message of this patch to another >>> thread. Sorry for being mis-leading. S I copied and answered your >>> review comments in this thread thus we can continue here. >>> >>>>> + /* Done proving if this is a no-overflow control IV. */ >>>>> + if (operand_equal_p (base, civ->base, 0)) >>>>> + return true; >>>> >>>> so all control IVs are no-overflow? >>> >>> This patch only records known no-overflow control ivs in loop >>> structure, so it depends on loop niter analyzer. For now, this patch >>> (and the existing code) sets no-overflow flag only for two cases. One >>> is the step-1 case, the other one is in assert_no_overflow_lt. >>> As a matter of fact, we may want to set no_overflow flag for all cases >>> with -funsafe-loop-optimizations in the future. In that case, we will >>> assume all control IVs are no-overflow. >>> >>>> >>>>> + base <= UPPER_BOUND (type) - step ;;step > 0 >>>>> + base >= LOWER_BOUND (type) - step ;;step < 0 >>>>> + >>>>> + by using loop's initial condition. */ >>>>> + stepped = fold_build2 (PLUS_EXPR, TREE_TYPE (base), base, step); >>>>> + if (operand_equal_p (stepped, civ->base, 0)) >>>>> + { >>>>> + if (tree_int_cst_sign_bit (step)) >>>>> + { >>>>> + code = LT_EXPR; >>>>> + extreme = lower_bound_in_type (type, type); >>>>> + } >>>>> + else >>>>> + { >>>>> + code = GT_EXPR; >>>>> + extreme = upper_bound_in_type (type, type); >>>>> + } >>>>> + extreme = fold_build2 (MINUS_EXPR, type, extreme, step); >>>>> + e = fold_build2 (code, boolean_type_node, base, extreme); >>>> >>>> looks like you are actually computing base + step <= UPPER_BOUND (type) >>>> so maybe adjust the comment. But as both step and UPPER_BOUND (type) >>>> are constants why not compute it the way the comment specifies it? Comparison >>>> codes also don't match the comment and we try to prove the condition is false. >>> I tried to prove the condition are satisfied by proving the reverse >>> condition ("base > UPPER_BOUND (type) - step") is false here. In the >>> updated patch, I revised comments to reflect that logic. Is it ok? >>> >>>> >>>> This also reminds me of eventually pushing forward my idea of strengthening >>>> simplify_using_initial_ >>>> conditions by using the VRP machinery (I have a small >>>> prototype patch for that). >>> Interesting. If I understand correctly, VRP info is hold for ssa var >>> on a global scope basis? The loop's initial condition may strengthen >>> var's range information, rather than the contrary. Actually I tried >>> to refine vrp info in scope of loop and used the refined vrp >>> information in loop optimizer (In the thread which you replied this >>> review message to). Looking forward to seeing how it's handled in >>> your patch. >> >> Attached (don't remember the state of the patch, but of course some >> caching is missing here). > If I understand correctly, you are trying to compute/refine > control-flow sensitive vrp information by using loop's initial > conditions. I did kind of same thing in the old patch, but yes, your > patch is of course more general. One point is we can extend the idea > to non-loop control flows, for example, if-then-else. I guess it > might be harder to develop a caching structure dealing with non-loop > control flows Well, the caching is of course dependent on the "context" BB for which we could keep a lattice for SSA value-ranges we compute. So a caching interface could for example keep a -> value-range map and for a loop context just use the loop header as basic-block to key on. Of course the code isn't very clever with determining context-based value-ranges and I've strided off to start coding a dominator-based VRP instead of a SSA-based inserting assert-exprs one. But I didn't get very far on that. At least it would allow computing value-ranges on-demand like we'd like to have for the niter use. Richard. >> >>>> >>>>> +/* The structure describing a non-overflow control induction variable >>>>> + of loop's exit edge. */ >>>>> +struct GTY ((chain_next ("%h.next"))) control_iv { >>>>> + tree base; >>>>> + tree step; >>>>> + struct control_iv *next; >>>>> +}; >>>> >>>> The comment suggests these don't exist for multiple edges? Maybe you >>>> can clarify this. >>> As the linked list structure suggests, we can support one no-overflow >>> iv for each loop's exit edge. I revised the comment a little bit. >>> >>>> >>>> Otherwise the patch looks ok. >>> >>> I attached the updated patch, does it look fine? >> >> Yes. > Will apply the patch later. > > Thanks, > bin >> >> Thanks, >> Richard. >> >>> Thanks, >>> bin >>>> >>>> Thanks, >>>> Richard.