From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 6233 invoked by alias); 13 Oct 2016 10:15:59 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 5359 invoked by uid 89); 13 Oct 2016 10:15:58 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.4 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,RCVD_IN_SORBS_SPAM,SPF_PASS autolearn=no version=3.3.2 spammy=ysrumyan@gmail.com, ysrumyangmailcom, premature, *loop X-HELO: mail-qk0-f178.google.com Received: from mail-qk0-f178.google.com (HELO mail-qk0-f178.google.com) (209.85.220.178) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 13 Oct 2016 10:15:55 +0000 Received: by mail-qk0-f178.google.com with SMTP id n189so79593712qke.0 for ; Thu, 13 Oct 2016 03:15:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=TBek1aQl5/nYzEw484JIFoIaDiHvdoH/cYA1JEnx4Rg=; b=CQQhsA0ONjc2xFCXzDhFrB5gUvmLRn6MpMXcoTFjn22JLYtvf1GjZZlynBC81JYM/y 5H5ymweu98AM6FShYDlGqmsBUSiKUtV2qBNRo0l3wdxbc2WalB2/4zBTJ+I7wTBPCY3j bWb9u8uVT7jQMxseO+5VlMRvVZ6UXdcjkVCRPajSwJn1HeI7+0V4iztul6jjTdStlKKd TzkwRDdhA/bmVzikbjxOCgKGAzEAB5KHDmNXjHJ4IdlkkU4yLOy+G9IDV1j3EKra9KfZ lKu309IrsOD+E+8S4POzJFQZq39lQfJQNmtVx5aUFpSW7Ihm/NofAkv4u2vlB4t+IbL2 D0/g== X-Gm-Message-State: AA6/9RmXSEb2bGmRTisGDbq8MbSLFMxXzhQSZfwt3vPxVnRIEElBk5X50d2fNfH/KzCDtUL6KGOundinnJiJLw== X-Received: by 10.194.30.137 with SMTP id s9mr6575983wjh.77.1476353753526; Thu, 13 Oct 2016 03:15:53 -0700 (PDT) MIME-Version: 1.0 Received: by 10.28.155.146 with HTTP; Thu, 13 Oct 2016 03:15:52 -0700 (PDT) In-Reply-To: References: From: Richard Biener Date: Thu, 13 Oct 2016 10:15:00 -0000 Message-ID: Subject: Re: Compile-time improvement for if conversion. To: Yuri Rumyantsev Cc: gcc-patches Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes X-SW-Source: 2016-10/txt/msg00999.txt.bz2 On Wed, Oct 12, 2016 at 3:37 PM, Yuri Rumyantsev wrote: > Richard, > > Here is updated patch . I avoided creation of new entry/exit blocks > but instead add check to border cases - do not consider also blocks > which are out of region. > > Any comments will be appreciated. I mostly like it now. Can you split out all the common parts from the dom_info constructor to a helper method please and just compute m_n_basic_blocks and a max_index and do all the memory allocation in that common function? @@ -276,9 +334,10 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb) bn = e->src; /* If the next node BN is either already visited or a border - block the current edge is useless, and simply overwritten - with the next edge out of the current node. */ - if (bn == m_end_block || m_dfs_order[bn->index]) + block or out of region the current edge is useless, and simply + overwritten with the next edge out of the current node. */ + if (bn == m_end_block || bn->dom[d_i] == NULL clever idea ;) Just thinking if this means we support single entry, multiple exit regions for CDI_DOMINATORs and multiple entry, single exit regions for CDI_POST_DOMINATORs. I think so. Please update the function comments accordingly. + m_dfsnum = 1; + m_nodes = 0; + m_fake_exit_edge = NULL; /* Assume that region is SCC. */ you mean SESE rather than SCC? @@ -511,7 +573,7 @@ dom_info::calc_idoms () : ei_start (bb->preds); edge_iterator einext; - if (m_reverse) + if (m_reverse && !in_region) { /* If this block has a fake edge to exit, process that first. */ if (bitmap_bit_p (m_fake_exit_edge, bb->index)) I think it's better to simply change the if (m_reverse) test to a if (m_fake_exit_edge) test. I noticed you do not initialize n_bbs_in_dom_tree (just set it to zero), it's not really used anywhere (but in an assert). free_dominance_info_for_region needs a function level comment (per coding guidelines). Please make free_dominance_info_for_region take a struct function * pointer. @@ -1318,11 +1350,13 @@ if_convertible_loop_p_1 (struct loop *loop, vec *refs) { unsigned int i; basic_block exit_bb = NULL; + vec region; if (find_data_references_in_loop (loop, refs) == chrec_dont_know) return false; - calculate_dominance_info (CDI_DOMINATORS); + if (dom_info_state (CDI_DOMINATORS) != DOM_OK) + calculate_dominance_info (CDI_DOMINATORS); This is a premature (unwanted) change. Other than that the only O(function-size) part left is the zeroing in + /* Determine max basic block index in region. */ + int max_index = region[0]->index; + for (size_t i = 1; i < num; i++) + if (region[i]->index > max_index) + max_index = region[i]->index; + max_index += 1; + m_dfs_order = new_zero_array (max_index + 1); + m_dfs_last = &m_dfs_order[max_index]; I suppose we can try addressing this as followup. Thanks a lot for doing this. Richard. > 2016-10-11 16:48 GMT+03:00 Richard Biener : >> On Tue, Oct 11, 2016 at 3:23 PM, Yuri Rumyantsev wrote: >>> Richard, >>> >>> I implemented this by passing callback function in_region which >>> returns true if block belongs to region. >>> I am testing it now >>> >>> I attach modified patch for your quick review. >> >> + FOR_EACH_VEC_ELT (region, i, bb) >> + { >> + bb->dom[dir_index] = et_new_tree (bb); >> + } >> + /* Check that region is SESE region. */ >> + entry = region[0]; >> + if ( EDGE_COUNT (entry->succs) > 1) >> + { >> + /* Create new entry block with the only successor. */ >> + basic_block succ = NULL; >> + bool found = false; >> + FOR_EACH_EDGE (e, ei, entry->succs) >> + if (in_region (region, e->dest)) >> + { >> + gcc_assert (!found); >> >> is that check equal to e->dest->dom[dir_index] != NULL? I think we >> shouldn't need the callback. >> >> + new_entry = create_empty_bb (entry); >> + unchecked_make_edge (new_entry, succ, 0); >> >> I still think this is somewhat gross and we should try to avoid it >> somehow - at least it's well-hidden now ;) >> >> + /* Put it to region as entry node. */ >> + region[0] = new_entry; >> >> so region[0] is overwritten now? >> >> As far as I understand calc_dfs_tree it should be possible to compute >> the region on-the-fly >> from calling calc_dfs_tree_nonrec on the entry/exit (also maybe >> avoiding some of the still >> "large" complexity of zeroing arrays in the constructor). >> >> And if we use that DFS walk directly we should be able to avoid >> creating those fake entry/exit >> blocks by using entry/exit edges instead... (somehow). >> >> Richard. >> >> >> >>> Thanks. >>> >>> 2016-10-11 13:33 GMT+03:00 Richard Biener : >>>> On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev wrote: >>>>> Richard, >>>>> >>>>> If "fake" exit or entry block is created in dominance how we can >>>>> determine what is its the only predecessor or successor without using >>>>> a notion of loop? >>>> >>>> The caller passes in an entry and exit edge instead of a block or loop. >>>> >>>> Richard. >>>> >>>>> 2016-10-10 15:00 GMT+03:00 Richard Biener : >>>>>> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev wrote: >>>>>>> Thanks Richard for your comments. >>>>>>> I'd like to answer on your last comment regarding use split_edge() >>>>>>> instead of creating fake post-header. I started with this splitting >>>>>>> but it requires to fix-up closed ssa form by creating additional phi >>>>>>> nodes, so I decided to use only cfg change without updating ssa form. >>>>>>> Other changes look reasonable and will fix them. >>>>>> >>>>>> Ah. In this case can you investigate what it takes to make the entry/exit >>>>>> edges rather than BBs? That is, introduce those "fakes" only internally >>>>>> in dominance.c? >>>>>> >>>>>>> 2016-10-10 12:52 GMT+03:00 Richard Biener : >>>>>>>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev wrote: >>>>>>>>> Hi All, >>>>>>>>> >>>>>>>>> Here is implementation of Richard proposal: >>>>>>>>> >>>>>>>>> < For general infrastructure it would be nice to expose a (post-)dominator >>>>>>>>> < compute for MESE (post-dominators) / SEME (dominators) regions. I believe >>>>>>>>> < what makes if-conversion expensive is the post-dom compute which happens >>>>>>>>> < for each loop for the whole function. It shouldn't be very difficult >>>>>>>>> < to write this, >>>>>>>>> < sharing as much as possible code with the current DOM code might need >>>>>>>>> < quite some refactoring though. >>>>>>>>> >>>>>>>>> I implemented this proposal by adding calculation of dominance info >>>>>>>>> for SESE regions and incorporate this change to if conversion pass. >>>>>>>>> SESE region is built by adding loop pre-header and possibly fake >>>>>>>>> post-header blocks to loop body. Fake post-header is deleted after >>>>>>>>> predication completion. >>>>>>>>> >>>>>>>>> Bootstrapping and regression testing did not show any new failures. >>>>>>>>> >>>>>>>>> Is it OK for trunk? >>>>>>>> >>>>>>>> It's mostly reasonable but I have a few comments. First, re-using >>>>>>>> bb->dom[] for the dominator info is somewhat fragile but indeed >>>>>>>> a requirement to make the patch reasonably small. Please, >>>>>>>> in calculate_dominance_info_for_region, make sure that >>>>>>>> !dom_info_available_p (dir). >>>>>>>> >>>>>>>> You pass loop * everywhere but require ->aux to be set up as >>>>>>>> an array of BBs forming the region with special BBs at array ends. >>>>>>>> >>>>>>>> Please instead pass in a vec which avoids using ->aux >>>>>>>> and also allows other non-loop-based SESE regions to be used >>>>>>>> (I couldn't spot anything that relies on this being a loop). >>>>>>>> >>>>>>>> Adding a convenience wrapper for loop * would be of course nice, >>>>>>>> to cover the special pre/post-header code in tree-if-conv.c. >>>>>>>> >>>>>>>> In theory a SESE region is fully specified by its entry end exit _edge_, >>>>>>>> so you might want to see if it's possible to use such a pair of edges >>>>>>>> to guard the dfs/idom walks to avoid the need to create fake blocks. >>>>>>>> >>>>>>>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc. >>>>>>>> please use split_edge() of the entry/exit edges. >>>>>>>> >>>>>>>> Richard. >>>>>>>> >>>>>>>>> ChangeLog: >>>>>>>>> 2016-10-05 Yuri Rumyantsev >>>>>>>>> >>>>>>>>> * dominance.c : Include cfgloop.h for loop recognition. >>>>>>>>> (dom_info): Add new functions and add boolean argument to recognize >>>>>>>>> computation for loop region. >>>>>>>>> (dom_info::dom_info): New function. >>>>>>>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not >>>>>>>>> handle unvisited blocks. >>>>>>>>> (dom_info::calc_idoms): Likewise. >>>>>>>>> (compute_dom_fast_query_in_region): New function. >>>>>>>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with >>>>>>>>> false argument. >>>>>>>>> (calculate_dominance_info_for_region): New function. >>>>>>>>> (free_dominance_info_for_region): Likewise. >>>>>>>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false >>>>>>>>> argument. >>>>>>>>> * dominance.h: Add prototype for introduced functions >>>>>>>>> calculate_dominance_info_for_region and >>>>>>>>> free_dominance_info_for_region. >>>>>>>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader. >>>>>>>>> (build_sese_region): New function. >>>>>>>>> (if_convertible_loop_p_1): Invoke local version of post-dominators >>>>>>>>> calculation, free it after basic block predication and delete created >>>>>>>>> fake post-header block if any. >>>>>>>>> (tree_if_conversion): Delete call of free_dominance_info for >>>>>>>>> post-dominators, free ifc_sese_bbs which represents SESE region. >>>>>>>>> (pass_if_conversion::execute): Delete detection of infinite loops >>>>>>>>> and fake edges to exit block since post-dominator calculation is >>>>>>>>> performed per if-converted loop only.