From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 110470 invoked by alias); 12 Nov 2015 10:58:17 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 110366 invoked by uid 89); 12 Nov 2015 10:58:16 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.4 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,KAM_LINEPADDING,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=no version=3.3.2 X-HELO: mail-yk0-f172.google.com Received: from mail-yk0-f172.google.com (HELO mail-yk0-f172.google.com) (209.85.160.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Thu, 12 Nov 2015 10:58:07 +0000 Received: by ykba77 with SMTP id a77so90048046ykb.2 for ; Thu, 12 Nov 2015 02:58:05 -0800 (PST) MIME-Version: 1.0 X-Received: by 10.13.210.4 with SMTP id u4mr14625251ywd.68.1447325885176; Thu, 12 Nov 2015 02:58:05 -0800 (PST) Received: by 10.37.93.11 with HTTP; Thu, 12 Nov 2015 02:58:05 -0800 (PST) In-Reply-To: <5643A732.4040707@redhat.com> References: <37378DC5BCD0EE48BA4B082E0B55DFAA41F3F56C@XAP-PVEXMBX02.xlnx.xilinx.com> <37378DC5BCD0EE48BA4B082E0B55DFAA41F41150@XAP-PVEXMBX02.xlnx.xilinx.com> <37378DC5BCD0EE48BA4B082E0B55DFAA41F42541@XAP-PVEXMBX02.xlnx.xilinx.com> <37378DC5BCD0EE48BA4B082E0B55DFAA4295155D@XAP-PVEXMBX02.xlnx.xilinx.com> <37378DC5BCD0EE48BA4B082E0B55DFAA4295ADCB@XAP-PVEXMBX02.xlnx.xilinx.com> <55D4F921.2020708@redhat.com> <37378DC5BCD0EE48BA4B082E0B55DFAA4297704C@XAP-PVEXMBX02.xlnx.xilinx.com> <5643A732.4040707@redhat.com> Date: Thu, 12 Nov 2015 10:58:00 -0000 Message-ID: Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation From: Richard Biener To: Jeff Law Cc: Ajit Kumar Agarwal , GCC Patches , Vinod Kathail , Shail Aditya Gupta , Vidhumouli Hunsigida , Nagaraju Mekala Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes X-SW-Source: 2015-11/txt/msg01479.txt.bz2 On Wed, Nov 11, 2015 at 9:38 PM, Jeff Law wrote: > On 09/04/2015 11:36 AM, Ajit Kumar Agarwal wrote: > >>> diff --git a/gcc/passes.def b/gcc/passes.def >>> index 6b66f8f..20ddf3d 100644 >>> --- a/gcc/passes.def >>> +++ b/gcc/passes.def >>> @@ -82,6 +82,7 @@ along with GCC; see the file COPYING3. If not see >>> NEXT_PASS (pass_ccp); >>> /* After CCP we rewrite no longer addressed locals into SSA >>> form if possible. */ >>> + NEXT_PASS (pass_path_split); >>> NEXT_PASS (pass_forwprop); >>> NEXT_PASS (pass_sra_early); >> >> I can't recall if we've discussed the location of the pass at all. I'm >> not objecting to this location, but would like to hear why you chose >> this particular location in the optimization pipeline. > > So returning to the question of where this would live in the optimization > pipeline and how it interacts with if-conversion and vectorization. Note that adding passes to the early pipeline that do code duplication is a no-no. The early pipeline should be exclusively for things making functions more suitable for inlining. > The concern with moving it to late in the pipeline was that we'd miss > VRP/DCE/CSE opportunities. I'm not sure if you're aware, but we actually > run those passes more than once. So it would be possible to run path > splitting after if-conversion & vectorization, but before the second passs > of VRP & DOM. But trying that seems to result in something scrambling the > loop enough that the path splitting opportunity is missed. That might be > worth deeper investigation if we can't come up with some kind of heuristics > to fire or suppress path splitting. As I still think it is a transform similar to tracer just put it next to that. But IIRC you mentioned it should enable vectorization or so? In this case that's obviously too late. Richard. > Other random notes as I look over the code: > > Call the pass "path-split", not "path_split". I don't think we have any > passes with underscores in their names, dump files, etc. > > You factored out the code for transform_duplicate. When you create new > functions, they should all have a block comment indicating what they do, > return values, etc. > > I asked you to trim down the #includes in tree-ssa-path-split.c Most were > ultimately unnecessary. The trimmed list is just 11 headers. > > Various functions in tree-ssa-path-split.c were missing their block > comments. There were several places in tree-ssa-path-split that I felt > deserved a comment. While you are familiar with the code, it's likely > someone else will have to look at and modify this code at some point in the > future. The comments help make that easier. > > In find_trace_loop_latch_same_as_join_blk, we find the immediate dominator > of the latch and verify it ends in a conditional. That's fine. Then we > look at the predecessors of the latch to see if one is succeeded only by the > latch and falls through to the latch. That is the block we'll end up > redirecting to a copy of the latch. Also fine. > > Note how there is no testing for the relationship between the immediate > dominator of the latch and the predecessors of the latch. ISTM that we can > have a fairly arbitrary region in the THEN/ELSE arms of the conditional. > Was this intentional? Would it be advisable to verify that the THEN/ELSE > arms are single blocks? Do we want to verify that neither the THEN/ELSE > arms transfer control other than to the latch? Do we want to verify the > predecessors of the latch are immediate successors of the latch's immediate > dominator? > > The is_feasible_trace routine was still checking if the block had a > conversion and rejecting it. I removed that check. It does seem to me that > we need an upper limit on the number of statements. I wonder if we should > factor out the maximum statements to copy code from jump threading and use > it for both jump threading and path splitting. > > Instead of creating loop with multiple latches, what ever happened to the > idea of duplicating the latch block twice -- once into each path. Remove the > control statement in each duplicate. Then remove everything but the control > statement in the original latch. > > > I added some direct dump support. Essentially anytime we split the path, we > output something like this: > > Split path in loop: latch block 9, predecessor 7. > > That allows tests in the testsuite to look for the "Split path in loop" > string rather than inferring the information from the SSA graph update's > replacement table. It also allows us to do things like count how many paths > get split if we have more complex tests. > > On the topic of tests. Is the one you provided something where path > splitting results in a significant improvement? From looking at the x86_64 > output, I can see the path splitting transformation occur, but not any > improvement in the final code. > > While the existing test is useful, testing on code that actually improves as > a result of path splitting is better. Ideally we'd test both that path > splitting occurred and that the secondary optimizations we wanted triggered. > > The tests should go into gcc.dg/tree-ssa rather than just gcc.dg. > > ANyway, here's my work-in-progress. Your thoughts on the various questions, > concerns, ideas noted above would be appreciated. Obviously I'd like to > wrap things up quickly and include this patch in gcc6. > > Note, I haven't bootstrapped or regression tested this version. > > > > > > > > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > index 34d2356..6613e83 100644 > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -1474,6 +1474,7 @@ OBJS = \ > tree-ssa-loop.o \ > tree-ssa-math-opts.o \ > tree-ssa-operands.o \ > + tree-ssa-path-split.o \ > tree-ssa-phionlycprop.o \ > tree-ssa-phiopt.o \ > tree-ssa-phiprop.o \ > diff --git a/gcc/common.opt b/gcc/common.opt > index 757ce85..9cc94e2 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -2403,6 +2403,10 @@ ftree-vrp > Common Report Var(flag_tree_vrp) Init(0) Optimization > Perform Value Range Propagation on trees. > > +ftree-path-split > +Common Report Var(flag_tree_path_split) Init(0) Optimization > +Perform Path Splitting on trees for loop backedges > + > funit-at-a-time > Common Report Var(flag_unit_at_a_time) Init(1) > Compile whole compilation unit at a time. > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 213a9d0..b1e95da 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -354,6 +354,7 @@ Objective-C and Objective-C++ Dialects}. > -fdump-tree-fre@r{[}-@var{n}@r{]} @gol > -fdump-tree-vtable-verify @gol > -fdump-tree-vrp@r{[}-@var{n}@r{]} @gol > +-fdump-tree-path-split@r{[}-@var{n}@r{]} @gol > -fdump-tree-storeccp@r{[}-@var{n}@r{]} @gol > -fdump-final-insns=@var{file} @gol > -fcompare-debug@r{[}=@var{opts}@r{]} -fcompare-debug-second @gol > @@ -462,7 +463,7 @@ Objective-C and Objective-C++ Dialects}. > -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-partial-pre -ftree-pta > @gol > -ftree-reassoc -ftree-sink -ftree-slsr -ftree-sra @gol > -ftree-switch-conversion -ftree-tail-merge -ftree-ter @gol > --ftree-vectorize -ftree-vrp @gol > +-ftree-vectorize -ftree-vrp @gol -ftree-path-split @gol > -funit-at-a-time -funroll-all-loops -funroll-loops @gol > -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops > @gol > -fipa-ra -fvariable-expansion-in-unroller -fvect-cost-model -fvpt @gol > @@ -7169,6 +7170,11 @@ output on to @file{stderr}. If two conflicting dump > filenames are > given for the same pass, then the latter option overrides the earlier > one. > > +@item path-split > +@opindex fdump-tree-path-split > +Dump each function after path splitting. The file name is made by > +appending @file{.path-split} to the source file name > + > @item all > Turn on all options, except @option{raw}, @option{slim}, @option{verbose} > and @option{lineno}. > @@ -7811,6 +7817,7 @@ also turns on the following optimization flags: > -ftree-switch-conversion -ftree-tail-merge @gol > -ftree-pre @gol > -ftree-vrp @gol > +-ftree-path-split @gol > -fipa-ra} > > Please note the warning under @option{-fgcse} about > @@ -8819,7 +8826,7 @@ currently enabled, but may be enabled by @option{-O2} > in the future. > > @item -ftree-sink > @opindex ftree-sink > -Perform forward store motion on trees. This flag is > +Perform forward store motion on trees. This flag is > enabled by default at @option{-O} and higher. > > @item -ftree-bit-ccp > @@ -9125,6 +9132,13 @@ enabled by default at @option{-O2} and higher. Null > pointer check > elimination is only done if @option{-fdelete-null-pointer-checks} is > enabled. > > +@item -ftree-path-split > +@opindex ftree-path-split > +Perform Path Splitting on trees. When the two execution paths of a > +if-then-else merge at the loop latch node, try to duplicate the > +merge node into two paths. This is enabled by default at @option{-O2} > +and above. > + > @item -fsplit-ivs-in-unroller > @opindex fsplit-ivs-in-unroller > Enables expression of values of induction variables in later iterations > diff --git a/gcc/opts.c b/gcc/opts.c > index 9a3fbb3..9a0b27c 100644 > --- a/gcc/opts.c > +++ b/gcc/opts.c > @@ -509,6 +509,7 @@ static const struct default_options > default_options_table[] = > { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 > }, > { OPT_LEVELS_2_PLUS, OPT_fipa_ra, NULL, 1 }, > { OPT_LEVELS_2_PLUS, OPT_flra_remat, NULL, 1 }, > + { OPT_LEVELS_2_PLUS, OPT_ftree_path_split, NULL, 1 }, > > /* -O3 optimizations. */ > { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 }, > diff --git a/gcc/passes.def b/gcc/passes.def > index c0ab6b9..e0c1cd8 100644 > --- a/gcc/passes.def > +++ b/gcc/passes.def > @@ -79,6 +79,7 @@ along with GCC; see the file COPYING3. If not see > NEXT_PASS (pass_remove_cgraph_callee_edges); > NEXT_PASS (pass_object_sizes); > NEXT_PASS (pass_ccp); > + NEXT_PASS (pass_path_split); > /* After CCP we rewrite no longer addressed locals into SSA > form if possible. */ > NEXT_PASS (pass_forwprop); > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/path-split-1.c > b/gcc/testsuite/gcc.dg/tree-ssa/path-split-1.c > new file mode 100644 > index 0000000..4b8637b > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/path-split-1.c > @@ -0,0 +1,67 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2 -fdump-tree-path-split-details " } */ > + > +#include > +#include > + > +#define RGBMAX 255 > + > +int > +test() > +{ > + int i, Pels; > + unsigned char sum = 0; > + unsigned char xr, xg, xb; > + unsigned char xc, xm, xy, xk; > + unsigned char *ReadPtr, *EritePtr; > + > + ReadPtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100); > + EritePtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100); > + > + for (i = 0; i < 100;i++) > + { > + ReadPtr[i] = 100 - i; > + } > + > + for (i = 0; i < 100; i++) > + { > + xr = *ReadPtr++; > + xg = *ReadPtr++; > + xb = *ReadPtr++; > + > + xc = (unsigned char) (RGBMAX - xr); > + xm = (unsigned char) (RGBMAX - xg); > + xy = (unsigned char) (RGBMAX - xb); > + > + if (xc < xm) > + { > + xk = (unsigned char) (xc < xy ? xc : xy); > + } > + else > + { > + xk = (unsigned char) (xm < xy ? xm : xy); > + } > + > + xc = (unsigned char) (xc - xk); > + xm = (unsigned char) (xm - xk); > + xy = (unsigned char) (xy - xk); > + > + *EritePtr++ = xc; > + *EritePtr++ = xm; > + *EritePtr++ = xy; > + *EritePtr++ = xk; > + sum += *EritePtr; > + } > + return sum; > +} > + > +int > +main() > +{ > + if (test() != 33) > + abort(); > + > + return 0; > +} > + > +/* { dg-final { scan-tree-dump "Split path in loop:" "path-split" } } */ > diff --git a/gcc/timevar.def b/gcc/timevar.def > index b429faf..3dba68b 100644 > --- a/gcc/timevar.def > +++ b/gcc/timevar.def > @@ -300,3 +300,4 @@ DEFTIMEVAR (TV_LINK , "link JIT code") > DEFTIMEVAR (TV_LOAD , "load JIT result") > DEFTIMEVAR (TV_JIT_ACQUIRING_MUTEX , "acquiring JIT mutex") > DEFTIMEVAR (TV_JIT_CLIENT_CODE , "JIT client code") > +DEFTIMEVAR (TV_TREE_PATH_SPLIT , "tree path split") > diff --git a/gcc/tracer.c b/gcc/tracer.c > index 941dc20..c7b5150 100644 > --- a/gcc/tracer.c > +++ b/gcc/tracer.c > @@ -51,9 +51,9 @@ > #include "tree-inline.h" > #include "cfgloop.h" > #include "fibonacci_heap.h" > +#include "tracer.h" > > static int count_insns (basic_block); > -static bool ignore_bb_p (const_basic_block); > static bool better_p (const_edge, const_edge); > static edge find_best_successor (basic_block); > static edge find_best_predecessor (basic_block); > @@ -85,7 +85,7 @@ bb_seen_p (basic_block bb) > } > > /* Return true if we should ignore the basic block for purposes of tracing. > */ > -static bool > +bool > ignore_bb_p (const_basic_block bb) > { > if (bb->index < NUM_FIXED_BLOCKS) > @@ -226,6 +226,24 @@ find_trace (basic_block bb, basic_block *trace) > return i; > } > > +/* Duplicate block BB2, placing it after BB in the CFG. Return the > + newly created block. */ > +basic_block > +transform_duplicate (basic_block bb, basic_block bb2) > +{ > + edge e; > + basic_block copy; > + > + e = find_edge (bb, bb2); > + > + copy = duplicate_block (bb2, e, bb); > + flush_pending_stmts (e); > + > + add_phi_args_after_copy (©, 1, NULL); > + > + return (copy); > +} > + > /* Look for basic blocks in frequency order, construct traces and tail > duplicate > if profitable. */ > > @@ -321,17 +339,8 @@ tail_duplicate (void) > entries or at least rotate the loop. */ > && bb2->loop_father->header != bb2) > { > - edge e; > - basic_block copy; > - > - nduplicated += counts [bb2->index]; > - > - e = find_edge (bb, bb2); > - > - copy = duplicate_block (bb2, e, bb); > - flush_pending_stmts (e); > - > - add_phi_args_after_copy (©, 1, NULL); > + nduplicated += counts [bb2->index]; > + basic_block copy = transform_duplicate (bb, bb2); > > /* Reconsider the original copy of block we've duplicated. > Removing the most common predecessor may make it to be > diff --git a/gcc/tracer.h b/gcc/tracer.h > new file mode 100644 > index 0000000..454d3b7 > --- /dev/null > +++ b/gcc/tracer.h > @@ -0,0 +1,26 @@ > +/* Header file for Tracer. > + Copyright (C) 2015 Free Software Foundation, Inc. > + > +This file is part of GCC. > + > +GCC is free software; you can redistribute it and/or modify it under > +the terms of the GNU General Public License as published by the Free > +Software Foundation; either version 3, or (at your option) any later > +version. > + > +GCC is distributed in the hope that it will be useful, but WITHOUT ANY > +WARRANTY; without even the implied warranty of MERCHANTABILITY or > +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License > + for more details. > + > +You should have received a copy of the GNU General Public License > +along with GCC; see the file COPYING3. If not see > +. */ > + > +#ifndef GCC_TRACER_H > +#define GCC_TRACER_H > + > +extern basic_block transform_duplicate (basic_block bb, basic_block bb2); > +extern bool ignore_bb_p (const_basic_block bb); > + > +#endif /* GCC_TRaCER_H */ > diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h > index 49e22a9..6963acc 100644 > --- a/gcc/tree-pass.h > +++ b/gcc/tree-pass.h > @@ -390,6 +390,7 @@ extern gimple_opt_pass *make_pass_tree_loop_done > (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_ch (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_ch_vect (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_ccp (gcc::context *ctxt); > +extern gimple_opt_pass *make_pass_path_split (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_phi_only_cprop (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_build_ssa (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_build_alias (gcc::context *ctxt); > diff --git a/gcc/tree-ssa-path-split.c b/gcc/tree-ssa-path-split.c > new file mode 100644 > index 0000000..33dbc4d > --- /dev/null > +++ b/gcc/tree-ssa-path-split.c > @@ -0,0 +1,254 @@ > +/* Support routines for Path Splitting. > + Copyright (C) 2015 Free Software Foundation, Inc. > + Contributed by Ajit Kumar Agarwal . > + > + This file is part of GCC. > + > + GCC is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3, or (at your option) > + any later version. > + > +GCC is distributed in the hope that it will be useful, > +but WITHOUT ANY WARRANTY; without even the implied warranty of > +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +GNU General Public License for more details. > + > +You should have received a copy of the GNU General Public License > +along with GCC; see the file COPYING3. If not see > +. */ > + > +#include "config.h" > +#include "system.h" > +#include "coretypes.h" > +#include "backend.h" > +#include "tree.h" > +#include "gimple.h" > +#include "tree-pass.h" > +#include "cfganal.h" > +#include "cfgloop.h" > +#include "gimple-iterator.h" > +#include "tracer.h" > + > +/* Given LOOP, if its latch has an immediate dominator that ends > + in a simple conditional, then search for a block that is a > + predecessor of the latch that falls through to the latch and > + has no other successors. > + > + When found, put that block into TRACE[0] and the latch block into > + TRACE[1]. Otherwise do nothing. */ > + > +static void > +find_trace_loop_latch_same_as_join_blk (loop_p loop, basic_block *trace) > +{ > + basic_block latch = loop->latch; > + > + /* We don't support path splitting if the latch has more than two > + predecessors. */ > + if (EDGE_COUNT (latch->preds) == 2) > + { > + basic_block bb = get_immediate_dominator (CDI_DOMINATORS, latch); > + gimple *last = gsi_stmt (gsi_last_bb (bb)); > + > + /* The immediate dominator of the latch must end in a conditional. > */ > + if (last && gimple_code (last) != GIMPLE_COND) > + return; > + > + /* This looks for a predecessor of the latch which has a single > + fallthru successor (that is the latch). Only one of the blocks > + can match that pattern. > + > + Do we need to check that E->src is a successor of BB? */ > + edge_iterator ei; > + edge e; > + FOR_EACH_EDGE (e, ei, latch->preds) > + { > + if (!single_succ_p (e->src) > + || !(single_succ_edge (e->src)->flags & EDGE_FALLTHRU)) > + break; > + else > + { > + trace[0] = e->src; > + trace[1] = latch; > + break; > + } > + } > + } > +} > + > +/* Return TRUE if BB is a reasonable block to duplicate by examining > + its size, false otherwise. BB will always be a loop latch block. > + > + Should this use the same tests as we do for jump threading? */ > + > +static bool > +is_feasible_trace (basic_block bb) > +{ > + int num_stmt = 0; > + gimple_stmt_iterator gsi; > + > + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > + { > + gimple *stmt = gsi_stmt (gsi); > + if (!is_gimple_debug (stmt)) > + num_stmt++; > + } > + > + /* We may want to limit how many statements we copy. */ > + if (num_stmt > 1) > + return true; > + > + return false; > +} > + > +/* If the immediate dominator of the latch of the loop is > + block with conditional branch, then the loop latch is > + duplicated to its predecessors path preserving the SSA > + semantics. > + > + CFG before transformation. > + > + : > + xk_35 = MIN_EXPR ; > + goto ; > + > + : > + xk_36 = MIN_EXPR ; > + > + : > + # xk_4 = PHI > + xc_37 = xc_32 - xk_4; > + xm_38 = xm_33 - xk_4; > + xy_39 = xy_34 - xk_4; > + > + CFG After Path Splitting transformation > + before cleanup phase. > + > + : > + xk_35 = MIN_EXPR ; > + > + : > + # xk_29 = PHI > + xc_56 = xc_32 - xk_29; > + xm_57 = xm_33 - xk_29; > + xy_58 = xy_34 - xk_29; > + goto ; > + > + : > + xk_36 = MIN_EXPR ; > + > + : > + # xk_4 = PHI > + xc_37 = xc_32 - xk_4; > + xm_38 = xm_33 - xk_4; > + xy_39 = xy_34 - xk_4; > + > + : ....... */ > + > +static bool > +perform_path_splitting () > +{ > + bool changed = false; > + basic_block trace[2] = {NULL, NULL}; > + loop_p loop; > + > + loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS); > + initialize_original_copy_tables (); > + calculate_dominance_info (CDI_DOMINATORS); > + > + FOR_EACH_LOOP (loop, LI_FROM_INNERMOST) > + { > + /* If we're optimizing for size, or should not duplicate > + the latch block for some other reason, then ignore this > + loop. */ > + if (ignore_bb_p (loop->latch)) > + continue; > + > + /* Get the latch node and its predecessor, storing them into > + trace[0] and trace[1] respectively. */ > + find_trace_loop_latch_same_as_join_blk (loop, trace); > + > + if (trace[0] && trace[1] && is_feasible_trace (trace[1])) > + { > + if (dump_file && (dump_flags & TDF_DETAILS)) > + fprintf (dump_file, > + "Split path in loop: latch block %d, predecessor > %d.\n", > + trace[1]->index, trace[0]->index); > + transform_duplicate (trace[0], trace[1]); > + trace[0] = NULL; > + trace[1] = NULL; > + changed = true; > + } > + } > + > + loop_optimizer_finalize (); > + free_original_copy_tables (); > + return changed; > +} > + > +/* Main entry point for path splitting. Returns TODO_cleanup_cfg if any > + paths where split, otherwise return zero. */ > + > +static unsigned int > +execute_path_split (void) > +{ > + /* If we don't have at least 2 real blocks and backedges in the > + CFG, then there's no point in trying to perform path splitting. */ > + if (n_basic_blocks_for_fn (cfun) <= NUM_FIXED_BLOCKS + 1 > + || !mark_dfs_back_edges ()) > + return 0; > + > + bool changed = perform_path_splitting(); > + if (changed) > + { > + free_dominance_info (CDI_DOMINATORS); > + /* If we changed the CFG schedule loops for fixup by cleanup_cfg. */ > + if (current_loops) > + loops_state_set (LOOPS_NEED_FIXUP); > + } > + > + return changed ? TODO_cleanup_cfg : 0; > + > +} > + > +static bool > +gate_path_split(void) > +{ > + return flag_tree_path_split != 0; > +} > + > +namespace { > + > +const pass_data pass_data_path_split = > +{ > + GIMPLE_PASS, /* type */ > + "path-split", /* name */ > + OPTGROUP_NONE, /* optinfo_flags */ > + TV_TREE_PATH_SPLIT, /* tv_id */ > + PROP_ssa, /* properties_required */ > + 0, /* properties_provided */ > + 0, /* properties_destroyed */ > + 0, /* todo_flags_start */ > + TODO_update_ssa, /* todo_flags_finish */ > +}; > + > +class pass_path_split : public gimple_opt_pass > +{ > + public: > + pass_path_split (gcc::context *ctxt) > + : gimple_opt_pass (pass_data_path_split, ctxt) > + {} > + /* opt_pass methods: */ > + opt_pass * clone () { return new pass_path_split (m_ctxt); } > + virtual bool gate (function *) { return gate_path_split (); } > + virtual unsigned int execute (function *) { return execute_path_split > (); } > + > +}; // class pass_path_split > + > +} // anon namespace > + > +gimple_opt_pass * > +make_pass_path_split (gcc::context *ctxt) > +{ > + return new pass_path_split (ctxt); > +} >