From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-402187-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 112228 invoked by alias); 7 Jul 2015 08:50:55 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 112219 invoked by uid 89); 7 Jul 2015 08:50:54 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2
X-HELO: mail-ob0-f173.google.com
Received: from mail-ob0-f173.google.com (HELO mail-ob0-f173.google.com) (209.85.214.173) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Tue, 07 Jul 2015 08:50:50 +0000
Received: by obdbs4 with SMTP id bs4so124244125obd.3        for <gcc-patches@gcc.gnu.org>; Tue, 07 Jul 2015 01:50:48 -0700 (PDT)
MIME-Version: 1.0
X-Received: by 10.202.0.212 with SMTP id 203mr2699620oia.58.1436259048719; Tue, 07 Jul 2015 01:50:48 -0700 (PDT)
Received: by 10.76.115.167 with HTTP; Tue, 7 Jul 2015 01:50:48 -0700 (PDT)
In-Reply-To: <37378DC5BCD0EE48BA4B082E0B55DFAA41F41150@XAP-PVEXMBX02.xlnx.xilinx.com>
References: <37378DC5BCD0EE48BA4B082E0B55DFAA41F3F56C@XAP-PVEXMBX02.xlnx.xilinx.com>	<CAFiYyc1wXMQ_69b=Q9V=bz5KAiLAp8C9VugpfyOZqt9iiDMzXQ@mail.gmail.com>	<37378DC5BCD0EE48BA4B082E0B55DFAA41F41150@XAP-PVEXMBX02.xlnx.xilinx.com>
Date: Tue, 07 Jul 2015 08:50:00 -0000
Message-ID: <CAFiYyc0T+8CeeruKp-S2byhyBXW41ub7esqin+OuAYAy8q1WJA@mail.gmail.com>
Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation
From: Richard Biener <richard.guenther@gmail.com>
To: Ajit Kumar Agarwal <ajit.kumar.agarwal@xilinx.com>
Cc: "law@redhat.com" <law@redhat.com>, GCC Patches <gcc-patches@gcc.gnu.org>, 	Vinod Kathail <vinodk@xilinx.com>, Shail Aditya Gupta <shailadi@xilinx.com>, 	Vidhumouli Hunsigida <vidhum@xilinx.com>, Nagaraju Mekala <nmekala@xilinx.com>
Content-Type: text/plain; charset=UTF-8
X-IsSubscribed: yes
X-SW-Source: 2015-07/txt/msg00448.txt.bz2

On Sat, Jul 4, 2015 at 2:39 PM, Ajit Kumar Agarwal
<ajit.kumar.agarwal@xilinx.com> wrote:
>
>
> -----Original Message-----
> From: Richard Biener [mailto:richard.guenther@gmail.com]
> Sent: Tuesday, June 30, 2015 4:42 PM
> To: Ajit Kumar Agarwal
> Cc: law@redhat.com; GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation
>
> On Tue, Jun 30, 2015 at 10:16 AM, Ajit Kumar Agarwal <ajit.kumar.agarwal@xilinx.com> wrote:
>> All:
>>
>> The below patch added a new path Splitting optimization pass on SSA
>> representation. The Path Splitting optimization Pass moves the join
>> block of if-then-else same as loop latch to its predecessors and get merged with the predecessors Preserving the SSA representation.
>>
>> The patch is tested for Microblaze and i386 target. The EEMBC/Mibench
>> benchmarks is run with the Microblaze target And the performance gain
>> of 9.15% and rgbcmy01_lite(EEMBC benchmarks). The Deja GNU tests is run for Mircroblaze Target and no regression is seen for Microblaze target and the new testcase attached are passed.
>>
>> For i386 bootstrapping goes through fine and the Spec cpu2000
>> benchmarks is run with this patch. Following observation were seen with spec cpu2000 benchmarks.
>>
>> Ratio of path splitting change vs Ratio of not having path splitting change is 3653.353 vs 3652.14 for INT benchmarks.
>> Ratio of path splitting change vs Ratio of not having path splitting change is  4353.812 vs 4345.351 for FP benchmarks.
>>
>> Based on comments from RFC patch following changes were done.
>>
>> 1. Added a new pass for path splitting changes.
>> 2. Placed the new path  Splitting Optimization pass before the copy propagation pass.
>> 3. The join block same as the Loop latch is wired into its
>> predecessors so that the CFG Cleanup pass will merge the blocks Wired together.
>> 4. Copy propagation routines added for path splitting changes is not
>> needed as suggested by Jeff. They are removed in the patch as The copy propagation in the copied join blocks will be done by the existing copy propagation pass and the update ssa pass.
>> 5. Only the propagation of phi results of the join block with the phi
>> argument is done which will not be done by the existing update_ssa Or copy propagation pass on tree ssa representation.
>> 6. Added 2 tests.
>>     a) compilation check  tests.
>>    b) execution tests.
>> 7. Refactoring of the code for the feasibility check and finding the join block same as loop latch node.
>>
>>     [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation.
>>
>>     Added a new pass on path splitting on tree SSA representation. The path
>>     splitting optimization does the CFG transformation of join block of the
>>     if-then-else same as the loop latch node is moved and merged with the
>>     predecessor blocks after preserving the SSA representation.
>>
>>     ChangeLog:
>>     2015-06-30  Ajit Agarwal  <ajitkum@xilinx.com>
>>
>>         * gcc/Makefile.in: Add the build of the new file
>>         tree-ssa-path-split.c
>>         * gcc/common.opt: Add the new flag ftree-path-split.
>>         * gcc/opts.c: Add an entry for Path splitting pass
>>         with optimization flag greater and equal to O2.
>>         * gcc/passes.def: Enable and add new pass path splitting.
>>         * gcc/timevar.def: Add the new entry for TV_TREE_PATH_SPLIT.
>>         * gcc/tree-pass.h: Extern Declaration of make_pass_path_split.
>>         * gcc/tree-ssa-path-split.c: New file for path splitting pass.
>>         * gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c: New testcase.
>>         * gcc/testsuite/gcc.dg/path-split-1.c: New testcase.
>
>>>I'm not 100% sure I understand the transform but what I see from the testcases it tail-duplicates from a conditional up to a loop latch block (not sure if it >>includes it and thus ends up creating a loop nest or not).
>
>>>An observation I have is that the pass should at least share the transform stage to some extent with the existing tracer pass (tracer.c) which essentially does >>the same but not restricted to loops in any way.
>
> The following piece of code from tracer.c can be shared with the existing path splitting pass.
>
> {
>              e = find_edge (bb, bb2);
>
>               copy = duplicate_block (bb2, e, bb);
>               flush_pending_stmts (e);
>
>               add_phi_args_after_copy (&copy, 1, NULL);
> }
>
> Sharing the above code of the transform stage of tracer.c with the path splitting pass has the following limitation.
>
> 1. The duplicated loop latch node is wired to its predecessors and the existing phi node in the loop latch node with the
> Phi arguments from its corresponding predecessors is moved to the duplicated loop latch node that is wired into its predecessors. Due
> To this, the duplicated loop latch nodes wired into its predecessors will not be merged with the original predecessors by CFG cleanup phase .
>
>>> So I wonder if your pass could be simply another heuristic to compute paths to trace in the existing tracer pass.
>
> Sorry, I am not very clear when you say the above.  I am trying to figure out whether you expect the existing pass of tracer.c should be modified
> Or the path splitting pass should coexist.

Yes, I was wondering whether tracer.c could be simply modified.  Both
transforms are doing something very similar.

> My understanding of existing tracer pass is to  find out the traces  based on the frequency  with the profile data.
> Based on the traces,  tail duplication is done in order to enable the superblock regions. So I wonder the path splitting pass could be  incorporated in the
> existing  pass  to compute the path of the traces.

Yes, your pass would simply compute extra traces based on the new heuristic.

Richard.

> Thanks & Regards
> Ajit
>
> Thanks,
> Richard.
>
>>     Signed-off-by:Ajit Agarwal ajitkum@xilinx.com.
>>
>> gcc/Makefile.in                              |   1 +
>>  gcc/common.opt                               |   4 +
>>  gcc/opts.c                                   |   1 +
>>  gcc/passes.def                               |   1 +
>>  gcc/testsuite/gcc.dg/path-split-1.c          |  65 ++++
>>  gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c |  62 ++++
>>  gcc/timevar.def                              |   1 +
>>  gcc/tree-pass.h                              |   1 +
>>  gcc/tree-ssa-path-split.c                    | 462 +++++++++++++++++++++++++++
>>  9 files changed, 598 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.dg/path-split-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c
>>  create mode 100644 gcc/tree-ssa-path-split.c
>>
>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 5f9261f..35ac363
>> 100644
>> --- a/gcc/Makefile.in
>> +++ b/gcc/Makefile.in
>> @@ -1476,6 +1476,7 @@ OBJS = \
>>         tree-vect-slp.o \
>>         tree-vectorizer.o \
>>         tree-vrp.o \
>> +        tree-ssa-path-split.o \
>>         tree.o \
>>         valtrack.o \
>>         value-prof.o \
>> diff --git a/gcc/common.opt b/gcc/common.opt index e104269..c63b100
>> 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -2328,6 +2328,10 @@ ftree-vrp
>>  Common Report Var(flag_tree_vrp) Init(0) Optimization  Perform Value
>> Range Propagation on trees
>>
>> +ftree-path-split
>> +Common Report Var(flag_tree_path_split) Init(0) Optimization Perform
>> +Path Splitting
>> +
>>  funit-at-a-time
>>  Common Report Var(flag_unit_at_a_time) Init(1) Optimization  Compile
>> whole compilation unit at a time diff --git a/gcc/opts.c b/gcc/opts.c
>> index 8a16116..31947ff 100644
>> --- a/gcc/opts.c
>> +++ b/gcc/opts.c
>> @@ -508,6 +508,7 @@ static const struct default_options default_options_table[] =
>>      { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 },
>>      { OPT_LEVELS_2_PLUS, OPT_fipa_ra, NULL, 1 },
>>      { OPT_LEVELS_2_PLUS, OPT_flra_remat, NULL, 1 },
>> +    { OPT_LEVELS_2_PLUS, OPT_ftree_path_split, NULL, 1 },
>>
>>      /* -O3 optimizations.  */
>>      { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1
>> }, diff --git a/gcc/passes.def b/gcc/passes.def index c0ddee4..43618eb
>> 100644
>> --- a/gcc/passes.def
>> +++ b/gcc/passes.def
>> @@ -155,6 +155,7 @@ along with GCC; see the file COPYING3.  If not see
>>        NEXT_PASS (pass_ccp);
>>        /* After CCP we rewrite no longer addressed locals into SSA
>>          form if possible.  */
>> +      NEXT_PASS (pass_path_split);
>>        NEXT_PASS (pass_copy_prop);
>>        NEXT_PASS (pass_complete_unrolli);
>>        NEXT_PASS (pass_phiprop);
>> diff --git a/gcc/testsuite/gcc.dg/path-split-1.c
>> b/gcc/testsuite/gcc.dg/path-split-1.c
>> new file mode 100644
>> index 0000000..075dc87
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/path-split-1.c
>> @@ -0,0 +1,65 @@
>> +/* { dg-do run } */
>> +/* { dg-options "-O2 " } */
>> +
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +
>> +#define RGBMAX 255
>> +
>> +int
>> +test()
>> +{
>> +  int i, Pels;
>> +  unsigned char sum = 0;
>> +  unsigned char xr, xg, xb;
>> +  unsigned char xc, xm, xy, xk;
>> +  unsigned char *ReadPtr, *EritePtr;
>> +
>> +  ReadPtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100);
>> + EritePtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100);
>> +
>> +  for (i = 0; i < 100;i++)
>> +     {
>> +       ReadPtr[i] = 100 - i;
>> +     }
>> +
>> +  for (i = 0; i < 100; i++)
>> +     {
>> +       xr = *ReadPtr++;
>> +       xg = *ReadPtr++;
>> +       xb = *ReadPtr++;
>> +
>> +       xc = (unsigned char) (RGBMAX - xr);
>> +       xm = (unsigned char) (RGBMAX - xg);
>> +       xy = (unsigned char) (RGBMAX - xb);
>> +
>> +       if (xc < xm)
>> +         {
>> +           xk = (unsigned char) (xc < xy ? xc : xy);
>> +         }
>> +       else
>> +        {
>> +          xk = (unsigned char) (xm < xy ? xm : xy);
>> +        }
>> +
>> +       xc = (unsigned char) (xc - xk);
>> +       xm = (unsigned char) (xm - xk);
>> +       xy = (unsigned char) (xy - xk);
>> +
>> +       *EritePtr++ = xc;
>> +       *EritePtr++ = xm;
>> +       *EritePtr++ = xy;
>> +       *EritePtr++ = xk;
>> +       sum += *EritePtr;
>> +    }
>> +  return sum;
>> +}
>> +
>> +int
>> +main()
>> +{
>> +  if (test() != 33)
>> +    abort();
>> +
>> +  return 0;
>> +}
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c
>> b/gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c
>> new file mode 100644
>> index 0000000..19f277c
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c
>> @@ -0,0 +1,62 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-path_split" } */
>> +
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +
>> +#define RGBMAX 255
>> +
>> +int
>> +test()
>> +{
>> +  int i, Pels;
>> +  unsigned char sum = 0;
>> +  unsigned char xr, xg, xb;
>> +  unsigned char xc, xm, xy, xk;
>> +  unsigned char *ReadPtr, *EritePtr;
>> +
>> +  ReadPtr = (unsigned char *) malloc (sizeof (unsigned char) * 100);
>> + EritePtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100);
>> +
>> +  for (i = 0; i < 100;i++)
>> +     {
>> +       ReadPtr[i] = 100 - i;
>> +     }
>> +
>> +  for (i = 0; i < 100; i++)
>> +     {
>> +       xr = *ReadPtr++;
>> +       xg = *ReadPtr++;
>> +       xb = *ReadPtr++;
>> +
>> +       xc = ( unsigned char) (RGBMAX - xr);
>> +       xm = ( unsigned char) (RGBMAX - xg);
>> +       xy = ( unsigned char) (RGBMAX - xb);
>> +
>> +       if (xc < xm)
>> +         {
>> +           xk = ( unsigned char) (xc < xy ? xc : xy);
>> +         }
>> +       else
>> +         {
>> +           xk = ( unsigned char) (xm < xy ? xm : xy);
>> +         }
>> +
>> +       xc = (unsigned char) (xc - xk);
>> +       xm = (unsigned char) (xm - xk);
>> +       xy = (unsigned char) (xy - xk);
>> +
>> +       *EritePtr++ = xc;
>> +       *EritePtr++ = xm;
>> +       *EritePtr++ = xy;
>> +       *EritePtr++ = xk;
>> +       sum += *EritePtr;
>> +    }
>> +  return sum;
>> +}
>> +
>> +/* { dg-final { scan-tree-dump "xc_[0-9][0-9]* -> { xc_[0-9][0-9]* }"
>> +"path_split"} } */
>> +/* { dg-final { scan-tree-dump "xm_[0-9][0-9]* -> { xm_[0-9][0-9]* }"
>> +"path_split"} } */
>> +/* { dg-final { scan-tree-dump "xy_[0-9][0-9]* -> { xy_[0-9][0-9]* }"
>> +"path_split"} } */
>> +/* { dg-final { scan-tree-dump "Merging blocks" "path_split"} } */
>> +/* { dg-final { cleanup-tree-dump "path_split" } } */
>> diff --git a/gcc/timevar.def b/gcc/timevar.def index 711bbed..6217a8e
>> 100644
>> --- a/gcc/timevar.def
>> +++ b/gcc/timevar.def
>> @@ -288,3 +288,4 @@ DEFTIMEVAR (TV_JIT_REPLAY        , "replay of JIT client activity")
>>  DEFTIMEVAR (TV_ASSEMBLE             , "assemble JIT code")
>>  DEFTIMEVAR (TV_LINK                 , "link JIT code")
>>  DEFTIMEVAR (TV_LOAD                 , "load JIT result")
>> +DEFTIMEVAR (TV_TREE_PATH_SPLIT  , "tree path_split")
>> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 398ab83..e00639e
>> 100644
>> --- a/gcc/tree-pass.h
>> +++ b/gcc/tree-pass.h
>> @@ -379,6 +379,7 @@ extern gimple_opt_pass *make_pass_iv_optimize
>> (gcc::context *ctxt);  extern gimple_opt_pass
>> *make_pass_tree_loop_done (gcc::context *ctxt);  extern
>> gimple_opt_pass *make_pass_ch (gcc::context *ctxt);  extern
>> gimple_opt_pass *make_pass_ccp (gcc::context *ctxt);
>> +extern gimple_opt_pass *make_pass_path_split (gcc::context *ctxt);
>>  extern gimple_opt_pass *make_pass_phi_only_cprop (gcc::context
>> *ctxt);  extern gimple_opt_pass *make_pass_build_ssa (gcc::context
>> *ctxt);  extern gimple_opt_pass *make_pass_build_alias (gcc::context
>> *ctxt); diff --git a/gcc/tree-ssa-path-split.c
>> b/gcc/tree-ssa-path-split.c new file mode 100644 index
>> 0000000..3da7791
>> --- /dev/null
>> +++ b/gcc/tree-ssa-path-split.c
>> @@ -0,0 +1,462 @@
>> +/* Support routines for Path Splitting.
>> +   Copyright (C) 2015 Free Software Foundation, Inc.
>> +   Contributed by Ajit Kumar Agarwal <ajitkum@xilinx.com>.
>> +
>> + This file is part of GCC.
>> +
>> + GCC is free software; you can redistribute it and/or modify it under
>> + the terms of the GNU General Public License as published by the Free
>> + Software Foundation; either version 3, or (at your option) any later
>> + version.
>> +
>> +GCC is distributed in the hope that it will be useful, but WITHOUT
>> +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>> +for more details.
>> +
>> +You should have received a copy of the GNU General Public License
>> +along with GCC; see the file COPYING3.  If not see
>> +<http://www.gnu.org/licenses/>.  */
>> +
>> +#include "config.h"
>> +#include "system.h"
>> +#include "coretypes.h"
>> +#include "tm.h"
>> +#include "flags.h"
>> +#include "tree.h"
>> +#include "stor-layout.h"
>> +#include "calls.h"
>> +#include "predict.h"
>> +#include "vec.h"
>> +#include "hashtab.h"
>> +#include "hash-set.h"
>> +#include "machmode.h"
>> +#include "hard-reg-set.h"
>> +#include "input.h"
>> +#include "function.h"
>> +#include "dominance.h"
>> +#include "cfg.h"
>> +#include "cfganal.h"
>> +#include "basic-block.h"
>> +#include "tree-ssa-alias.h"
>> +#include "internal-fn.h"
>> +#include "gimple-fold.h"
>> +#include "tree-eh.h"
>> +#include "gimple-expr.h"
>> +#include "is-a.h"
>> +#include "gimple.h"
>> +#include "gimple-iterator.h"
>> +#include "gimple-walk.h"
>> +#include "gimple-ssa.h"
>> +#include "tree-cfg.h"
>> +#include "tree-phinodes.h"
>> +#include "ssa-iterators.h"
>> +#include "stringpool.h"
>> +#include "tree-ssanames.h"
>> +#include "tree-ssa-loop-manip.h"
>> +#include "tree-ssa-loop-niter.h"
>> +#include "tree-ssa-loop.h"
>> +#include "tree-into-ssa.h"
>> +#include "tree-ssa.h"
>> +#include "tree-pass.h"
>> +#include "tree-dump.h"
>> +#include "gimple-pretty-print.h"
>> +#include "diagnostic-core.h"
>> +#include "intl.h"
>> +#include "cfgloop.h"
>> +#include "tree-scalar-evolution.h"
>> +#include "tree-ssa-propagate.h"
>> +#include "tree-chrec.h"
>> +#include "tree-ssa-threadupdate.h"
>> +#include "expr.h"
>> +#include "insn-codes.h"
>> +#include "optabs.h"
>> +#include "tree-ssa-threadedge.h"
>> +#include "wide-int.h"
>> +
>> +/* Replace_uses_phi function propagates the phi results with the
>> +   first phi argument into each of the copied join blocks wired into
>> +   its predecessors. This function is called from the replace_uses_phi
>> +   to replace the uses of first phi arguments with the second
>> +   phi arguments in the next copy of join block.  */
>> +
>> +static void
>> +replace_use_phi_operand1_with_operand2 (basic_block b,
>> +                                        tree use1,
>> +                                        tree use2) {
>> +  use_operand_p use;
>> +  ssa_op_iter iter;
>> +  gimple_stmt_iterator gsi;
>> +
>> +  for (gsi = gsi_start_bb (b); !gsi_end_p (gsi);)
>> +     {
>> +       gimple stmt = gsi_stmt (gsi);
>> +       FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
>> +       {
>> +         tree tuse = USE_FROM_PTR (use);
>> +          if (use1 == tuse || use1 == NULL_TREE)
>> +            {
>> +              propagate_value (use, use2);
>> +              update_stmt(stmt);
>> +            }
>> +        }
>> +       gsi_next(&gsi);
>> +     }
>> +}
>> +
>> +/* This function propagates the phi result into the use points with
>> +   the phi arguments. The join block is copied and wired into the
>> +   predecessors. Since the use points of the phi results will be same
>> +   in the each of the copy join blocks in the  predecessors, it
>> +   propagates the phi arguments in the copy of the join blocks wired
>> +   into its predecessor.  */
>> +
>> +static
>> +void replace_uses_phi (basic_block b, basic_block temp_bb) {
>> +  gimple_seq phis = phi_nodes (b);
>> +  gimple phi = gimple_seq_first_stmt (phis);
>> +  tree def = gimple_phi_result (phi), use = gimple_phi_arg_def
>> +(phi,0);
>> +  tree use2 = gimple_phi_arg_def (phi,1);
>> +
>> +  if (virtual_operand_p (def))
>> +    {
>> +      imm_use_iterator iter;
>> +      use_operand_p use_p;
>> +      gimple stmt;
>> +
>> +      FOR_EACH_IMM_USE_STMT (stmt, iter, def)
>> +        FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
>> +          SET_USE (use_p, use);
>> +      if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (def))
>> +        SSA_NAME_OCCURS_IN_ABNORMAL_PHI (use) = 1;
>> +    }
>> +   else
>> +     replace_uses_by (def, use);
>> +   replace_use_phi_operand1_with_operand2 (temp_bb, use, use2); }
>> +
>> +/* Returns true if the block bb has label or call statements.
>> +   Otherwise return false.  */
>> +
>> +static bool
>> +is_block_has_label_call (basic_block bb) {
>> +  gimple_stmt_iterator gsi;
>> +
>> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>> +     {
>> +       gimple stmt = gsi_stmt(gsi);
>> +       if (dyn_cast <glabel *> (stmt))
>> +         {
>> +           return true;
>> +         }
>> +       if (is_gimple_call (stmt))
>> +         return true;
>> +     }
>> +  return false;
>> +}
>> +
>> +/* This function performs the feasibility tests for path splitting
>> +   to perform. Return false if the feasibility for path splitting
>> +   is not done and returns true if the feasbility for path splitting
>> +   is done. Following feasibility tests are performed.
>> +
>> +   1. Return false if the join block has call gimple statements.
>> +   2. Return false if the join block has rhs casting for assign
>> +      gimple statements.
>> +   3. If the number of phis is greater than 1 or the phi node in
>> +      the join block has virtual operand return false.
>> +   4. Return false if the number of sequential statements is
>> +      greater than 2.
>> +   5. If the predecessors blocks has labels and call statements
>> +      return false.
>> +   6. If the phi result in the phi node of the join block is not
>> +      used inside the same join block return false.
>> +   7. Otherwise returns true.  */
>> +
>> +static bool
>> +is_feasible_path_splitting (basic_block join_node, basic_block pred1,
>> +                           basic_block pred2) {
>> +  int num_stmt = 0, num_phis = 0;
>> +  gimple_stmt_iterator psi, gsi;
>> +
>> +  for (gsi = gsi_start_bb (join_node); !gsi_end_p (gsi); gsi_next (&gsi))
>> +     {
>> +       gimple stmt = gsi_stmt(gsi);
>> +
>> +       if (gimple_assign_cast_p (stmt))
>> +         return false;
>> +
>> +       if (is_gimple_call (stmt))
>> +         return false;
>> +
>> +       if (!is_gimple_debug(stmt))
>> +         {
>> +           num_stmt++;
>> +         }
>> +     }
>> +
>> +   if (pred1 && pred2 && (num_stmt > 2))
>> +     {
>> +       bool found_virtual_result = false;
>> +
>> +       for (psi = gsi_start_phis (join_node); !gsi_end_p (psi); )
>> +          {
>> +            use_operand_p use_p;
>> +            imm_use_iterator iter;
>> +            gimple stmt = gsi_stmt(psi);
>> +
>> +            if (!virtual_operand_p (gimple_phi_result (stmt)))
>> +              num_phis++;
>> +            else
>> +              found_virtual_result = true;
>> +
>> +            FOR_EACH_IMM_USE_FAST (use_p, iter, gimple_phi_result (stmt))
>> +            {
>> +              gimple use_stmt = USE_STMT (use_p);
>> +
>> +              if (gimple_bb (use_stmt) != join_node)
>> +                return false;
>> +            }
>> +
>> +            gsi_next(&psi);
>> +         }
>> +
>> +       if ((num_phis >1) || found_virtual_result)
>> +          return false;
>> +
>> +       if(is_block_has_label_call(pred1) || is_block_has_label_call(pred2))
>> +         return false;
>> +
>> +       return true;
>> +    }
>> +  return false;
>> +}
>> +
>> +/* Update the statements in the basic block with the basic
>> +   basic block.  */
>> +
>> +static void
>> +update_stmt_bb(basic_block b)
>> +{
>> +  gimple_stmt_iterator gsi;
>> +  for(gsi = gsi_start_bb(b); !gsi_end_p(gsi); gsi_next(&gsi))
>> +   {
>> +     gimple stmt = gsi_stmt(gsi);
>> +     gimple_set_bb(stmt,b);
>> +   }
>> +}
>> +
>> +/* This function gets the join blocks same as the source
>> +   node of the loop latch nodes and the predecessors of
>> +   the join block is updated in the pred1 and pred2 passed
>> +   as the reference arguments into the function. Return
>> +   the join block.  */
>> +
>> +static basic_block
>> +get_join_blk_same_as_loop_latch (basic_block bb,
>> +                                 basic_block &pred1,
>> +                                 basic_block &pred2) {
>> +  vec<basic_block> bbs;
>> +  basic_block bb1;
>> +  unsigned int i;
>> +  edge_iterator ei;
>> +  edge e1;
>> +  bool found = false ,found1;
>> +  bbs = get_all_dominated_blocks (CDI_DOMINATORS,
>> +                                  bb );
>> +  FOR_EACH_VEC_ELT (bbs, i, bb1)
>> +  {
>> +    found1 = false;
>> +    FOR_EACH_EDGE (e1, ei, bb->succs)
>> +    {
>> +      if ( bb1 == e1->dest)
>> +        {
>> +          found = true;
>> +          found1 = true;
>> +        }
>> +    }
>> +    if (!found1 && found)
>> +      {
>> +        found = false;
>> +        FOR_EACH_EDGE (e1, ei, bb1->succs)
>> +        {
>> +          if (e1->flags & (EDGE_DFS_BACK))
>> +            found = true;
>> +        }
>> +
>> +        if (found && EDGE_COUNT(bb1->preds) == 2)
>> +          {
>> +            unsigned int k = 0;
>> +            FOR_EACH_EDGE (e1, ei, bb1->preds)
>> +            {
>> +              if ((e1->flags & (EDGE_DFS_BACK)))
>> +                continue;
>> +
>> +              if ( k == 1)
>> +                {
>> +                  if (single_succ_p(e1->src) &&
>> +                      single_succ_edge (e1->src)->flags & EDGE_FALLTHRU)
>> +                    {
>> +                      pred2 = e1->src;
>> +                    }
>> +                }
>> +                else
>> +                  {
>> +                    if (single_succ_p(e1->src) &&
>> +                        single_succ_edge (e1->src)->flags & EDGE_FALLTHRU)
>> +                      {
>> +                        pred1 = e1->src;
>> +                      }
>> +                  }
>> +                k++;
>> +            }
>> +            bbs.release();
>> +            return bb1;
>> +          }
>> +       }
>> +   }
>> +   bbs.release();
>> +   return NULL;
>> +}
>> +
>> +/* This is the core function to perform path splitting. The join
>> +   same as the source of the loop latch node is identified along
>> +   with their predecessors. Based on the feasibility tests for
>> +   path splitting the path splitting is performed by wiring the
>> +   copy of join blocks into the predecessors and propagating the phi
>> +   result with the corresponding phi arguments into each of the copy
>> +   of join blocks wired with the original predecessors of the join
>> +   block.
>> +
>> +   The  tree-cfg-cleanup will merge the blocks in the predecessors
>> +   path and the update-ssa will update the ssa representation after
>> +   the path splitting is performed.  */
>> +
>> +static void
>> +perform_path_splitting (basic_block bb) {
>> +  basic_block pred1 = NULL, pred2 = NULL, join_block = NULL;
>> +
>> +  join_block = get_join_blk_same_as_loop_latch (bb, pred1, pred2);
>> +
>> +  if (join_block  &&
>> +      is_feasible_path_splitting (join_block, pred1, pred2))
>> +    {
>> +      basic_block new_bb1 = NULL, new_bb2 = NULL;
>> +      gimple_stmt_iterator last;
>> +      basic_block temp_bb = NULL;
>> +      edge_iterator ei;
>> +      edge e1;
>> +
>> +      temp_bb = duplicate_block (join_block, NULL, NULL);
>> +
>> +      FOR_EACH_EDGE (e1, ei, pred1->succs)
>> +        new_bb1 = split_edge (e1);
>> +
>> +      FOR_EACH_EDGE (e1, ei, pred2->succs)
>> +        new_bb2 = split_edge (e1);
>> +
>> +      last = gsi_start_bb (new_bb1);
>> +      gsi_insert_seq_after (&last, bb_seq (join_block), GSI_NEW_STMT);
>> +      last = gsi_start_bb (new_bb2);
>> +      gsi_insert_seq_after (&last, bb_seq (temp_bb), GSI_NEW_STMT);
>> +      update_stmt_bb (new_bb1);
>> +      update_stmt_bb (new_bb2);
>> +
>> +      replace_uses_phi (join_block, new_bb2);
>> +
>> +      set_bb_seq (join_block, NULL);
>> +      set_bb_seq(temp_bb,NULL);
>> +      delete_basic_block (temp_bb);
>> +      return;
>> +    }
>> +}
>> +
>> +static unsigned int
>> +execute_path_split (void)
>> +{
>> +  basic_block bb;
>> +
>> +  loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
>> + initialize_original_copy_tables();
>> +
>> +  calculate_dominance_info (CDI_DOMINATORS);
>> + calculate_dominance_info (CDI_POST_DOMINATORS);
>> +
>> +  mark_dfs_back_edges ();
>> +
>> +  FOR_EACH_BB_FN (bb, cfun)
>> +  {
>> +    gimple last;
>> +
>> +    /* We only care about blocks ending in a COND_EXPR. */
>> +
>> +    last = gsi_stmt (gsi_last_bb (bb));
>> +
>> +    /* We're basically looking for a switch or any kind of conditional with
>> +       integral or pointer type arguments.  Note the type of the second
>> +       argument will be the same as the first argument, so no need to
>> +       check it explicitly.  */
>> +    if ((last && (gimple_code (last) == GIMPLE_COND
>> +            && TREE_CODE (gimple_cond_lhs (last)) == SSA_NAME
>> +            && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (last)))
>> +            || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (last))))
>> +            && (TREE_CODE (gimple_cond_rhs (last)) == SSA_NAME
>> +            || is_gimple_min_invariant (gimple_cond_rhs (last))))))
>> +      {
>> +
>> +         if (gimple_code(last) == GIMPLE_COND)
>> +           {
>> +              perform_path_splitting (bb);
>> +           }
>> +      }
>> +   }
>> +
>> +   loop_optimizer_finalize ();
>> +   free_original_copy_tables ();
>> +   free_dominance_info (CDI_DOMINATORS);
>> +   free_dominance_info (CDI_POST_DOMINATORS);
>> +   return 0;
>> +}
>> +
>> +namespace {
>> +
>> +const pass_data pass_data_path_split = {
>> +   GIMPLE_PASS, /* type */
>> +   "path_split", /* name */
>> +    OPTGROUP_NONE, /* optinfo_flags */
>> +    TV_TREE_PATH_SPLIT, /* tv_id */
>> +    PROP_ssa, /* properties_required */
>> +    0, /* properties_provided */
>> +    0, /* properties_destroyed */
>> +    0, /* todo_flags_start */
>> +    ( TODO_cleanup_cfg | TODO_update_ssa ), /* todo_flags_finish */
>> +};
>> +
>> +class pass_path_split : public gimple_opt_pass {
>> +   public:
>> +    pass_path_split (gcc::context *ctxt)
>> +      : gimple_opt_pass (pass_data_path_split, ctxt)
>> +    {}
>> +
>> +   /* opt_pass methods: */
>> +   opt_pass * clone () { return new pass_path_split (m_ctxt); }
>> +   virtual bool gate (function *) { return flag_tree_path_split != 0; }
>> +   virtual unsigned int execute (function *) { return
>> + execute_path_split (); }
>> +
>> +}; // class pass_path_split
>> +
>> +} // anon namespace
>> +
>> +gimple_opt_pass *
>> +make_pass_path_split (gcc::context *ctxt) {
>> +  return new pass_path_split (ctxt);
>> +}
>> --
>> 1.8.2.1
>>
>> Thanks & Regards
>> Ajit