public inbox for gcc@gcc.gnu.org
* Peeling loops at tree level?
@ 2004-09-13 14:34 Richard Guenther
  2004-09-13 14:40 ` Richard Guenther
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Guenther @ 2004-09-13 14:34 UTC
  To: gcc

Hi!

Do we (I suspect not) peel small loops at tree level yet?  This seems
to inhibit further (tree) optimization of inlined dimension-unaware
code like:

template <int Dim>
struct Vector
{
        int operator[](int i) const { return val[i]; }
        int val[Dim];
};

template <int Dim>
inline int foo(const Vector<Dim>& x)
{
        int res;
        for (int i=0; i<Dim; ++i)
                res += x[i];
        return res;
}

int bar(const Vector<3>& x)
{
        return foo(x);
}


where the optimized tree dump for -O2 -funroll-loops looks like


;; Function int bar(const Vector<3>&) (_Z3barRK6VectorILi3EE)

int bar(const Vector<3>&) (x)
{
  struct Vector<3> & x.15;
  <unnamed type> D.1651;
  <unnamed type> D.1652;
  const int * ivtmp.8;
  int i.2;
  int D.1634;
  struct Vector<3> * const this;
  int i;
  int i;
  int res;
  int D.1625;
  int retval.1;
  int D.1623;
  bool retval.0;
  struct Vector<3> & x;
  int D.1591;
  int D.1590;

<bb 0>:
  ivtmp.8 = &x->val[0];
  i = 0;

Invalid sum of incoming frequencies 12233, should be 10000
<L0>:;
  res = *ivtmp.8 + res;
  D.1652 = (<unnamed type>) i + 1;
  i = (int) D.1652;
  ivtmp.8 = ivtmp.8 + 4B;
  if (D.1652 != 3) goto <L0>; else goto <L3>;

Invalid sum of incoming frequencies 1100, should be 3333
<L3>:;
  return res;

}

and the loop is unrolled only in the assembler dump (g++-3.5 (GCC) 4.0.0
20040913 (experimental)).

Is there some magic option to tell the tree-level loop optimizer to
completely peel loops with a constant iteration count?
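
For reference, here is a minimal sketch of what complete peeling would
amount to at the source level for Dim == 3.  The names sum_loop and
sum_peeled3 are made up for this illustration, and res is initialized to
0 here, which is an assumption (the snippet above leaves it
uninitialized):

```cpp
template <int Dim>
struct Vector
{
        int operator[](int i) const { return val[i]; }
        int val[Dim];
};

// Loop form, as in the snippet above, except that res starts at 0
// (an assumption made for this sketch).
template <int Dim>
inline int sum_loop(const Vector<Dim>& x)
{
        int res = 0;
        for (int i = 0; i < Dim; ++i)
                res += x[i];
        return res;
}

// Hand-peeled equivalent for Dim == 3: all three iterations are
// written out and the loop control has disappeared.
inline int sum_peeled3(const Vector<3>& x)
{
        return x.val[0] + x.val[1] + x.val[2];
}
```

With the loop gone, later tree passes would see three plain loads and
two additions instead of an induction variable and a back edge.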

Thanks,
Richard.

--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

* Re: Peeling loops at tree level?
  2004-09-13 14:34 Peeling loops at tree level? Richard Guenther
@ 2004-09-13 14:40 ` Richard Guenther
  2004-09-13 14:57   ` Zdenek Dvorak
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Guenther @ 2004-09-13 14:40 UTC
  To: Richard Guenther; +Cc: gcc

On Mon, 13 Sep 2004 16:09:03 +0200 (CEST), Richard Guenther
<rguenth@tat.physik.uni-tuebingen.de> wrote:
> Hi!
> 
> Do we (I suspect not) peel small loops at tree level yet?  This seems
> to inhibit further (tree) optimization of inlined dimension-unaware
> code like:
> 
> template <int Dim>
> struct Vector
> {
>         int operator[](int i) const { return val[i]; }
>         int val[Dim];
> };
> 
> template <int Dim>
> inline int foo(const Vector<Dim>& x)
> {
>         int res;
>         for (int i=0; i<Dim; ++i)
>                 res += x[i];
>         return res;
> }
> 
> int bar(const Vector<3>& x)
> {
>         return foo(x);
> }

Btw. lno from g++ (GCC) 3.5-tree-ssa-lno 20040630 (merged 20040612)
seems to do the trick:

;; Function int bar(constVector<3>&) (_Z3barRK6VectorILi3EE)

int bar(constVector<3>&) (x)
{
[...]
<bb 0>:
  return *&x->val[2] + x->val[1] + x->val[0] + res;

}

So I guess we'll get this feature during further merge of lno?

Thanks,
Richard.

* Re: Peeling loops at tree level?
  2004-09-13 14:40 ` Richard Guenther
@ 2004-09-13 14:57   ` Zdenek Dvorak
       [not found]     ` <Pine.LNX.4.44.0409131647340.9036-100000@alwazn.tat.physik.uni-tuebingen.de>
  0 siblings, 1 reply; 9+ messages in thread
From: Zdenek Dvorak @ 2004-09-13 14:57 UTC
  To: Richard Guenther; +Cc: Richard Guenther, gcc

Hello,

> > Do we (I suspect not) peel small loops at tree level yet?
> Btw. lno from g++ (GCC) 3.5-tree-ssa-lno 20040630 (merged 20040612)
> seems to do the trick:
> 
> ;; Function int bar(constVector<3>&) (_Z3barRK6VectorILi3EE)
> 
> int bar(constVector<3>&) (x)
> {
> [...]
> <bb 0>:
>   return *&x->val[2] + x->val[1] + x->val[0] + res;
> 
> }
> 
> So I guess we'll get this feature during further merge of lno?

indeed -- this will work once this lno branch merge patch

http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00325.html

is in and the optimization is enabled.

Zdenek

* Re: Peeling loops at tree level?, Lno branch merge patches ping
       [not found]     ` <Pine.LNX.4.44.0409131647340.9036-100000@alwazn.tat.physik.uni-tuebingen.de>
@ 2004-09-13 15:49       ` Zdenek Dvorak
  2004-09-16  2:21         ` Richard Henderson
                           ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Zdenek Dvorak @ 2004-09-13 15:49 UTC
  To: Richard Guenther; +Cc: gcc, gcc-patches

Hello,

> > > > Do we (I suspect not) peel small loops at tree level yet?
> > > Btw. lno from g++ (GCC) 3.5-tree-ssa-lno 20040630 (merged 20040612)
> > > seems to do the trick:
> > >
> > > ;; Function int bar(constVector<3>&) (_Z3barRK6VectorILi3EE)
> > >
> > > int bar(constVector<3>&) (x)
> > > {
> > > [...]
> > > <bb 0>:
> > >   return *&x->val[2] + x->val[1] + x->val[0] + res;
> > >
> > > }
> > >
> > > So I guess we'll get this feature during further merge of lno?
> >
> > indeed -- this will work once this lno branch merge patch
> >
> > http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00325.html
> >
> > is in and the optimization is enabled.
> 
> Uh, this looks "old" - maybe you should ping for this merge?

The patch is of course a bit out of date by now; however, only minor
changes should be needed, such as renaming uses of build_int_cst to
build_int_cst_type and removing parts already merged with ivopts.

> Current status of lno-merge seems a bit unclear at the moment.

Patches #9, 10 and 11 (urls below) are waiting for review.  Some
optimizations that use parts of those patches need to be fully enabled
once the patches are in.  There are several followup patches, but they
are just small ones (and we can survive without them if they are not
accepted due to stage 3 restrictions).

http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00300.html (part 9)
http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00325.html (part 10)
http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00404.html (part 11)

The patches need to be reviewed in this order, since they depend on the
previous ones.

There are also several loop optimizer related patches submitted
independently of the merge -- induction variables splitting in
loop optimizer (newest version at
http://gcc.gnu.org/ml/gcc-patches/2004-09/msg01185.html)
and removal of empty loops
(http://gcc.gnu.org/ml/gcc-patches/2004-07/msg01679.html).

Zdenek

* Re: Peeling loops at tree level?, Lno branch merge patches ping
  2004-09-13 15:49       ` Peeling loops at tree level?, Lno branch merge patches ping Zdenek Dvorak
@ 2004-09-16  2:21         ` Richard Henderson
  2004-09-16  2:44         ` Richard Henderson
  2004-09-16  2:46         ` Richard Henderson
  2 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2004-09-16  2:21 UTC
  To: Zdenek Dvorak; +Cc: Richard Guenther, gcc, gcc-patches

On Mon, Sep 13, 2004 at 05:05:42PM +0200, Zdenek Dvorak wrote:
> http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00300.html (part 9)

Ok.


r~

* Re: Peeling loops at tree level?, Lno branch merge patches ping
  2004-09-13 15:49       ` Peeling loops at tree level?, Lno branch merge patches ping Zdenek Dvorak
  2004-09-16  2:21         ` Richard Henderson
@ 2004-09-16  2:44         ` Richard Henderson
  2004-09-16 23:05           ` Zdenek Dvorak
  2004-09-16  2:46         ` Richard Henderson
  2 siblings, 1 reply; 9+ messages in thread
From: Richard Henderson @ 2004-09-16  2:44 UTC
  To: Zdenek Dvorak; +Cc: Richard Guenther, gcc, gcc-patches

On Mon, Sep 13, 2004 at 05:05:42PM +0200, Zdenek Dvorak wrote:
> http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00325.html (part 10)

This patch isn't up-to-date wrt force_gimple_operand and
others.  Would you please regenerate?


r~

* Re: Peeling loops at tree level?, Lno branch merge patches ping
  2004-09-13 15:49       ` Peeling loops at tree level?, Lno branch merge patches ping Zdenek Dvorak
  2004-09-16  2:21         ` Richard Henderson
  2004-09-16  2:44         ` Richard Henderson
@ 2004-09-16  2:46         ` Richard Henderson
  2 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2004-09-16  2:46 UTC
  To: Zdenek Dvorak; +Cc: Richard Guenther, gcc, gcc-patches

On Mon, Sep 13, 2004 at 05:05:42PM +0200, Zdenek Dvorak wrote:
> http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00404.html (part 11)

Ok.


r~

* Re: Peeling loops at tree level?, Lno branch merge patches ping
  2004-09-16  2:44         ` Richard Henderson
@ 2004-09-16 23:05           ` Zdenek Dvorak
  2004-09-20 20:41             ` Richard Henderson
  0 siblings, 1 reply; 9+ messages in thread
From: Zdenek Dvorak @ 2004-09-16 23:05 UTC
  To: Richard Henderson, Richard Guenther, gcc, gcc-patches

Hello,

> On Mon, Sep 13, 2004 at 05:05:42PM +0200, Zdenek Dvorak wrote:
> > http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00325.html (part 10)
> 
> This patch isn't up-to-date wrt force_gimple_operand and
> others.  Would you please regenerate?

Here it is.  A large part of the patch was already committed with other
patches.  The functions to split iterations of a loop and to align the
number of iterations of a loop are so far used only in my experiments, so
I left them out as unused.

Zdenek

	* cfgloop.h (update_single_exits_after_duplication): Declare.
	(loopify, split_loop_bb): Declaration changed.
	* cfgloopmanip.c (split_loop_bb): Take void * as an argument instead
	of rtx.
	(loopify): Added redirect_all_edges argument.
	(update_single_exits_after_duplication): Export.
	* loop-unswitch.c (unswitch_loop): Changed due to loopify change.
	* tree-flow.h (tree_duplicate_loop_to_header_edge,
	tree_ssa_loop_version): Declare.
	* tree-ssa-loop-manip.c (copy_phi_node_args, rename_variables,
	set_phi_def_stmts, tree_duplicate_loop_to_header_edge,
	lv_adjust_loop_header_phi, lv_adjust_loop_entry_edge,
	lv_update_pending_stmts, tree_ssa_loop_version): New functions.

Index: cfgloop.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cfgloop.h,v
retrieving revision 1.31
diff -c -3 -p -r1.31 cfgloop.h
*** cfgloop.h	16 Sep 2004 16:16:13 -0000	1.31
--- cfgloop.h	16 Sep 2004 21:57:02 -0000
*************** extern int flow_loop_scan (struct loop *
*** 254,259 ****
--- 254,261 ----
  extern void flow_loop_free (struct loop *);
  void mark_irreducible_loops (struct loops *);
  void mark_single_exit_loops (struct loops *);
+ void update_single_exits_after_duplication (basic_block *, unsigned,
+ 					    struct loop *);
  extern void create_loop_notes (void);
  
  /* Loop data structure manipulation/querying.  */
*************** extern struct loop * duplicate_loop (str
*** 313,322 ****
  extern int duplicate_loop_to_header_edge (struct loop *, edge, struct loops *,
  					  unsigned, sbitmap, edge, edge *,
  					  unsigned *, int);
! extern struct loop *loopify (struct loops *, edge, edge, basic_block);
  extern void unloop (struct loops *, struct loop *);
  extern bool remove_path (struct loops *, edge);
! extern edge split_loop_bb (basic_block, rtx);
  
  /* Induction variable analysis.  */
  
--- 315,324 ----
  extern int duplicate_loop_to_header_edge (struct loop *, edge, struct loops *,
  					  unsigned, sbitmap, edge, edge *,
  					  unsigned *, int);
! extern struct loop *loopify (struct loops *, edge, edge, basic_block, bool);
  extern void unloop (struct loops *, struct loop *);
  extern bool remove_path (struct loops *, edge);
! extern edge split_loop_bb (basic_block, void *);
  
  /* Induction variable analysis.  */
  
Index: cfgloopmanip.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cfgloopmanip.c,v
retrieving revision 1.31
diff -c -3 -p -r1.31 cfgloopmanip.c
*** cfgloopmanip.c	16 Sep 2004 16:16:13 -0000	1.31
--- cfgloopmanip.c	16 Sep 2004 21:57:02 -0000
*************** static void fix_irreducible_loops (basic
*** 53,59 ****
  /* Splits basic block BB after INSN, returns created edge.  Updates loops
     and dominators.  */
  edge
! split_loop_bb (basic_block bb, rtx insn)
  {
    edge e;
  
--- 53,59 ----
  /* Splits basic block BB after INSN, returns created edge.  Updates loops
     and dominators.  */
  edge
! split_loop_bb (basic_block bb, void *insn)
  {
    edge e;
  
*************** scale_loop_frequencies (struct loop *loo
*** 486,492 ****
  
  struct loop *
  loopify (struct loops *loops, edge latch_edge, edge header_edge, 
! 	 basic_block switch_bb)
  {
    basic_block succ_bb = latch_edge->dest;
    basic_block pred_bb = header_edge->src;
--- 486,492 ----
  
  struct loop *
  loopify (struct loops *loops, edge latch_edge, edge header_edge, 
! 	 basic_block switch_bb, bool redirect_all_edges)
  {
    basic_block succ_bb = latch_edge->dest;
    basic_block pred_bb = header_edge->src;
*************** loopify (struct loops *loops, edge latch
*** 513,524 ****
    loop_redirect_edge (latch_edge, loop->header);
    loop_redirect_edge (BRANCH_EDGE (switch_bb), succ_bb);
  
!   loop_redirect_edge (header_edge, switch_bb);
!   loop_redirect_edge (FALLTHRU_EDGE (switch_bb), loop->header); 
! 
!   /* Update dominators.  */
!   set_immediate_dominator (CDI_DOMINATORS, switch_bb, pred_bb);
!   set_immediate_dominator (CDI_DOMINATORS, loop->header, switch_bb);
  
    set_immediate_dominator (CDI_DOMINATORS, succ_bb, switch_bb);
  
--- 513,529 ----
    loop_redirect_edge (latch_edge, loop->header);
    loop_redirect_edge (BRANCH_EDGE (switch_bb), succ_bb);
  
!   /* During loop versioning, one of the switch_bb edges is already properly
!      set. Do not redirect it again unless redirect_all_edges is true.  */
!   if (redirect_all_edges)
!     {
!       loop_redirect_edge (header_edge, switch_bb);
!       loop_redirect_edge (FALLTHRU_EDGE (switch_bb), loop->header); 
!      
!       /* Update dominators.  */
!       set_immediate_dominator (CDI_DOMINATORS, switch_bb, pred_bb);
!       set_immediate_dominator (CDI_DOMINATORS, loop->header, switch_bb);
!     }
  
    set_immediate_dominator (CDI_DOMINATORS, succ_bb, switch_bb);
  
*************** can_duplicate_loop_p (struct loop *loop)
*** 812,818 ****
  /* The NBBS blocks in BBS will get duplicated and the copies will be placed
     to LOOP.  Update the single_exit information in superloops of LOOP.  */
  
! static void
  update_single_exits_after_duplication (basic_block *bbs, unsigned nbbs,
  				       struct loop *loop)
  {
--- 817,823 ----
  /* The NBBS blocks in BBS will get duplicated and the copies will be placed
     to LOOP.  Update the single_exit information in superloops of LOOP.  */
  
! void
  update_single_exits_after_duplication (basic_block *bbs, unsigned nbbs,
  				       struct loop *loop)
  {
*************** update_single_exits_after_duplication (b
*** 834,840 ****
      bbs[i]->rbi->duplicated = 0;
  }
  
- 
  /* Duplicates body of LOOP to given edge E NDUPL times.  Takes care of updating
     LOOPS structure and dominators.  E's destination must be LOOP header for
     this to work, i.e. it must be entry or latch edge of this loop; these are
--- 839,844 ----
Index: loop-unswitch.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/loop-unswitch.c,v
retrieving revision 1.21
diff -c -3 -p -r1.21 loop-unswitch.c
*** loop-unswitch.c	10 Sep 2004 11:02:24 -0000	1.21
--- loop-unswitch.c	16 Sep 2004 21:57:02 -0000
*************** unswitch_loop (struct loops *loops, stru
*** 475,481 ****
  
    /* Loopify from the copy of LOOP body, constructing the new loop.  */
    nloop = loopify (loops, latch_edge,
! 		   loop->header->rbi->copy->pred, switch_bb);
  
    /* Remove branches that are now unreachable in new loops.  */
    remove_path (loops, true_edge);
--- 475,481 ----
  
    /* Loopify from the copy of LOOP body, constructing the new loop.  */
    nloop = loopify (loops, latch_edge,
! 		   loop->header->rbi->copy->pred, switch_bb, true);
  
    /* Remove branches that are now unreachable in new loops.  */
    remove_path (loops, true_edge);
Index: tree-flow.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-flow.h,v
retrieving revision 2.46
diff -c -3 -p -r2.46 tree-flow.h
*** tree-flow.h	16 Sep 2004 21:29:38 -0000	2.46
--- tree-flow.h	16 Sep 2004 21:57:02 -0000
*************** void standard_iv_increment_position (str
*** 675,680 ****
--- 675,686 ----
  				     bool *);
  basic_block ip_end_pos (struct loop *);
  basic_block ip_normal_pos (struct loop *);
+ bool tree_duplicate_loop_to_header_edge (struct loop *, edge, struct loops *,
+ 					 unsigned int, sbitmap,
+ 					 edge, edge *,
+ 					 unsigned int *, int);
+ struct loop *tree_ssa_loop_version (struct loops *, struct loop *, tree,
+ 				    basic_block *);
  
  /* In tree-ssa-loop-im.c  */
  /* The possibilities of statement movement.  */
Index: tree-ssa-loop-manip.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-manip.c,v
retrieving revision 2.6
diff -c -3 -p -r2.6 tree-ssa-loop-manip.c
*** tree-ssa-loop-manip.c	12 Sep 2004 20:20:58 -0000	2.6
--- tree-ssa-loop-manip.c	16 Sep 2004 21:57:02 -0000
*************** standard_iv_increment_position (struct l
*** 506,508 ****
--- 506,854 ----
        *insert_after = false;
      }
  }
+ 
+ /* Copies phi node arguments for duplicated blocks.  The index of the first
+    duplicated block is FIRST_NEW_BLOCK.  */
+ 
+ static void
+ copy_phi_node_args (unsigned first_new_block)
+ {
+   unsigned i;
+ 
+   for (i = first_new_block; i < (unsigned) last_basic_block; i++)
+     BASIC_BLOCK (i)->rbi->duplicated = 1;
+ 
+   for (i = first_new_block; i < (unsigned) last_basic_block; i++)
+     add_phi_args_after_copy_bb (BASIC_BLOCK (i));
+ 
+   for (i = first_new_block; i < (unsigned) last_basic_block; i++)
+     BASIC_BLOCK (i)->rbi->duplicated = 0;
+ }
+ 
+ /* Renames variables in the area copied by tree_duplicate_loop_to_header_edge.
+    FIRST_NEW_BLOCK is the first block in the copied area.   DEFINITIONS is
+    a bitmap of all ssa names defined inside the loop.  */
+ 
+ static void
+ rename_variables (unsigned first_new_block, bitmap definitions)
+ {
+   unsigned i, copy_number = 0;
+   basic_block bb;
+   htab_t ssa_name_map = NULL;
+ 
+   for (i = first_new_block; i < (unsigned) last_basic_block; i++)
+     {
+       bb = BASIC_BLOCK (i);
+ 
+       /* We assume that first come all blocks from the first copy, then all
+ 	 blocks from the second copy, etc.  */
+       if (copy_number != (unsigned) bb->rbi->copy_number)
+ 	{
+ 	  allocate_ssa_names (definitions, &ssa_name_map);
+ 	  copy_number = bb->rbi->copy_number;
+ 	}
+ 
+       rewrite_to_new_ssa_names_bb (bb, ssa_name_map);
+     }
+ 
+   htab_delete (ssa_name_map);
+ }
+ 
+ /* Sets SSA_NAME_DEF_STMT for results of all phi nodes in BB.  */
+ 
+ static void
+ set_phi_def_stmts (basic_block bb)
+ {
+   tree phi;
+ 
+   for (phi = phi_nodes (bb); phi; phi = TREE_CHAIN (phi))
+     SSA_NAME_DEF_STMT (PHI_RESULT (phi)) = phi;
+ }
+ 
+ /* The same as cfgloopmanip.c:duplicate_loop_to_header_edge, but also updates
+    ssa.  In order to achieve this, only loops whose exits all lead to the same
+    location are handled.
+    
+    FIXME: we create some degenerate phi nodes that could be avoided by copy
+    propagating them instead.  Unfortunately this is not completely
+    straightforward due to problems with constant folding.  */
+ 
+ bool
+ tree_duplicate_loop_to_header_edge (struct loop *loop, edge e,
+ 				    struct loops *loops,
+ 				    unsigned int ndupl, sbitmap wont_exit,
+ 				    edge orig, edge *to_remove,
+ 				    unsigned int *n_to_remove, int flags)
+ {
+   unsigned first_new_block;
+   basic_block bb;
+   unsigned i;
+   tree phi, arg, map, def;
+   bitmap definitions;
+ 
+   if (!(loops->state & LOOPS_HAVE_SIMPLE_LATCHES))
+     return false;
+   if (!(loops->state & LOOPS_HAVE_PREHEADERS))
+     return false;
+ 
+ #ifdef ENABLE_CHECKING
+   verify_loop_closed_ssa ();
+ #endif
+ 
+   gcc_assert (!any_marked_for_rewrite_p ());
+ 
+   first_new_block = last_basic_block;
+   if (!duplicate_loop_to_header_edge (loop, e, loops, ndupl, wont_exit,
+ 				      orig, to_remove, n_to_remove, flags))
+     return false;
+ 
+   /* Readd the removed phi args for e.  */
+   map = PENDING_STMT (e);
+   PENDING_STMT (e) = NULL;
+ 
+   for (phi = phi_nodes (e->dest), arg = map;
+        phi;
+        phi = TREE_CHAIN (phi), arg = TREE_CHAIN (arg))
+     {
+       def = TREE_VALUE (arg);
+       add_phi_arg (&phi, def, e);
+     }
+   gcc_assert (arg == NULL);
+ 
+   /* Copy the phi node arguments.  */
+   copy_phi_node_args (first_new_block);
+ 
+   /* Rename the variables.  */
+   definitions = marked_ssa_names ();
+   rename_variables (first_new_block, definitions);
+   unmark_all_for_rewrite ();
+   BITMAP_XFREE (definitions);
+ 
+   /* For some time we have the identical ssa names as results in multiple phi
+      nodes.  When phi node is resized, it sets SSA_NAME_DEF_STMT of its result
+      to the new copy.  This means that we cannot easily ensure that the ssa
+      names defined in those phis are pointing to the right one -- so just
+      recompute SSA_NAME_DEF_STMT for them.  */ 
+ 
+   for (i = first_new_block; i < (unsigned) last_basic_block; i++)
+     {
+       bb = BASIC_BLOCK (i);
+       set_phi_def_stmts (bb);
+       if (bb->rbi->copy_number == 1)
+   	set_phi_def_stmts (bb->rbi->original);
+     }
+ 
+   scev_reset ();
+ #ifdef ENABLE_CHECKING
+   verify_loop_closed_ssa ();
+ #endif
+ 
+   return true;
+ }
+ 
+ /*---------------------------------------------------------------------------
+   Loop versioning
+   ---------------------------------------------------------------------------*/
+  
+ /* Adjust phi nodes for 'first' basic block.  'second' basic block is a copy
+    of 'first'. Both of them are dominated by 'new_head' basic block. When
+    'new_head' was created by 'second's incoming edge it received phi arguments
+    on the edge by split_edge(). Later, additional edge 'e' was created to
+    connect 'new_head' and 'first'. Now this routine adds phi args on this 
+    additional edge 'e' that new_head to second edge received as part of edge 
+    splitting.
+ */
+ 
+ static void
+ lv_adjust_loop_header_phi (basic_block first, basic_block second,
+ 			   basic_block new_head, edge e)
+ {
+   tree phi1, phi2;
+ 
+   /* Browse all 'second' basic block phi nodes and add phi args to
+      edge 'e' for 'first' head. PHI args are always in correct order.  */
+ 
+   for (phi2 = phi_nodes (second), phi1 = phi_nodes (first); 
+        phi2 && phi1; 
+        phi2 = TREE_CHAIN (phi2),  phi1 = TREE_CHAIN (phi1))
+     {
+       int i;
+       for (i = 0; i < PHI_NUM_ARGS (phi2); i++)
+ 	{
+ 	  if (PHI_ARG_EDGE (phi2, i)->src == new_head)
+ 	    {
+ 	      tree def = PHI_ARG_DEF (phi2, i);
+ 	      add_phi_arg (&phi1, def, e);
+ 	    }
+ 	}
+     }
+ }
+ 
+ /* Adjust entry edge for lv.
+    
+   e is an incoming edge.
+ 
+   --- edge e ---- > [second_head]
+ 
+   Split it and insert new conditional expression and adjust edges.
+    
+    --- edge e ---> [cond expr] ---> [first_head]
+                         |
+                         +---------> [second_head]
+ 
+ */
+    
+ static basic_block
+ lv_adjust_loop_entry_edge (basic_block first_head,
+ 			   basic_block second_head,
+ 			   edge e,
+ 			   tree cond_expr)
+ { 
+   block_stmt_iterator bsi;
+   basic_block orig_head = e->src;
+   basic_block new_head = NULL;
+   tree goto1 = NULL_TREE;
+   tree goto2 = NULL_TREE;
+   tree new_cond_expr = NULL_TREE;
+   edge e0, e1;
+ 
+   /* Split edge 'e'. This will create a new basic block, where we can
+      insert conditional expr.  */
+   new_head = split_edge (e);
+   set_immediate_dominator (CDI_DOMINATORS, new_head, orig_head);
+ 
+   /* Build new conditional expr */
+   goto1 = build1 (GOTO_EXPR, void_type_node, tree_block_label (first_head));
+   goto2 = build1 (GOTO_EXPR, void_type_node, tree_block_label (second_head));
+   new_cond_expr = build3 (COND_EXPR, void_type_node, cond_expr, goto1, goto2);
+ 
+   /* Add new cond. in new head.  */ 
+   bsi = bsi_start (new_head); 
+   bsi_insert_after (&bsi, new_cond_expr, BSI_NEW_STMT);
+ 
+   /* Adjust edges appropriately to connect new head with first head
+      as well as second head.  */
+   e1 = make_edge (new_head, first_head, EDGE_TRUE_VALUE);
+   set_immediate_dominator (CDI_DOMINATORS, first_head, new_head);
+   make_edge (new_head, second_head, EDGE_FALSE_VALUE);
+   set_immediate_dominator (CDI_DOMINATORS, second_head, new_head);
+ 
+   /* Adjust loop header phi nodes.  */
+   lv_adjust_loop_header_phi (first_head, second_head, new_head, e1);
+ 
+   /* When edge 'e' was split, it created a fall through edge
+       from new head to second head. Above created FALSE edge
+       from new head to second head and now we do not need the
+       fall through edge.  */
+   for (e0 = new_head->succ; e0; e0 = e0->succ_next)
+     if (e0->dest == second_head)
+       e0->flags &= ~EDGE_FALLTHRU;
+ 
+   return new_head;
+ }
+ 
+ /* Add phi args using PENDING_STMT list.  */
+ 
+ static void
+ lv_update_pending_stmts (edge e)
+ {
+   basic_block dest;
+   tree phi, arg, def;
+ 
+   if (!PENDING_STMT (e))
+     return;
+ 
+   dest = e->dest;
+ 
+   for (phi = phi_nodes (dest), arg = PENDING_STMT (e);
+        phi;
+        phi = TREE_CHAIN (phi), arg = TREE_CHAIN (arg))
+     {
+       def = TREE_VALUE (arg);
+       add_phi_arg (&phi, def, e);
+     }
+ 
+   PENDING_STMT (e) = NULL;
+ }
+ 
+ 
+ /* Main entry point for Loop Versioning transformation.
+    
+ This transformation given a condition and a loop, creates
+ -if (condition) { loop_copy1 } else { loop_copy2 },
+ where loop_copy1 is the loop transformed in one way, and loop_copy2
+ is the loop transformed in another way (or unchanged). 'condition'
+ may be a run time test for things that were not resolved by static
+ analysis (overlapping ranges (anti-aliasing), alignment, etc.).  */
+ 
+ struct loop *
+ tree_ssa_loop_version (struct loops *loops, struct loop * loop, 
+ 		       tree cond_expr, basic_block *condition_bb)
+ {
+   edge entry, latch_edge, exit;
+   basic_block first_head, second_head;
+   int irred_flag;
+   struct loop *nloop;
+ 
+   /* CHECKME: Loop versioning does not handle nested loop at this point.  */
+   if (loop->inner)
+     return NULL;
+ 
+   /* Record the entry edge of the loop.  */
+   entry = loop_preheader_edge (loop);
+ 
+   /* Note down head of loop as first_head.  */
+   first_head = entry->dest;
+ 
+   /* Duplicate loop.  */
+   irred_flag = entry->flags & EDGE_IRREDUCIBLE_LOOP;
+   entry->flags &= ~EDGE_IRREDUCIBLE_LOOP;
+   if (!tree_duplicate_loop_to_header_edge (loop, entry, loops, 1,
+ 					   NULL, NULL, NULL, NULL, 0))
+     {
+       entry->flags |= irred_flag;
+       return NULL;
+     }
+ 
+   /* After duplication entry edge now points to new loop head block.
+      Note down new head as second_head.  */
+   second_head = entry->dest;
+ 
+   /* Split loop entry edge and insert new block with cond expr.  */
+   *condition_bb = lv_adjust_loop_entry_edge (first_head, second_head, entry, 
+ 					    cond_expr); 
+ 
+   latch_edge = loop->latch->rbi->copy->succ;
+   nloop = loopify (loops, 
+ 		   latch_edge,
+ 		   loop->header->rbi->copy->pred,
+ 		   *condition_bb,
+ 		   false /* Do not redirect all edges.  */);
+ 
+   exit = loop->single_exit;
+   if (exit)
+     nloop->single_exit = find_edge (exit->src->rbi->copy, exit->dest);
+ 
+   /* loopify redirected latch_edge. Update its PENDING_STMTS.  */ 
+   lv_update_pending_stmts (latch_edge);
+ 
+   /* loopify redirected condition_bb's succ edge. Update its PENDING_STMTS.  */ 
+   lv_update_pending_stmts (FALLTHRU_EDGE (*condition_bb));
+ 
+   /* Adjust irreducible flag.  */
+   if (irred_flag)
+     {
+       (*condition_bb)->flags |= BB_IRREDUCIBLE_LOOP;
+       loop_preheader_edge (loop)->flags |= EDGE_IRREDUCIBLE_LOOP;
+       loop_preheader_edge (nloop)->flags |= EDGE_IRREDUCIBLE_LOOP;
+       (*condition_bb)->pred->flags |= EDGE_IRREDUCIBLE_LOOP;
+     }
+ 
+   /* At this point condition_bb is the loop preheader with two successors,
+      first_head and second_head.  Make sure that the loop preheader has only
+      one successor.  */
+   loop_split_edge_with (loop_preheader_edge (loop), NULL);
+   loop_split_edge_with (loop_preheader_edge (nloop), NULL);
+ 
+   return nloop;
+ }
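
As a source-level illustration of the "if (condition) { loop_copy1 } else
{ loop_copy2 }" shape described in the tree_ssa_loop_version comment
above, here is a small sketch.  It is not the GIMPLE transformation
itself; the names no_overlap and add_one_versioned and the particular
overlap test are hypothetical, chosen only to show a run-time condition
guarding two copies of the same loop:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical run-time test: true if the n-element ranges starting at
// a and b do not overlap.
static bool no_overlap(const int* a, const int* b, std::size_t n)
{
        std::uintptr_t pa = reinterpret_cast<std::uintptr_t>(a);
        std::uintptr_t pb = reinterpret_cast<std::uintptr_t>(b);
        std::uintptr_t bytes = n * sizeof(int);
        return pa + bytes <= pb || pb + bytes <= pa;
}

// Versioned loop: if (condition) { loop_copy1 } else { loop_copy2 }.
void add_one_versioned(int* dst, const int* src, std::size_t n)
{
        if (no_overlap(dst, src, n)) {
                // loop_copy1: iterations are independent here, so this
                // copy could be transformed further (e.g. vectorized).
                for (std::size_t i = 0; i < n; ++i)
                        dst[i] = src[i] + 1;
        } else {
                // loop_copy2: the unchanged loop, kept for the
                // overlapping case.
                for (std::size_t i = 0; i < n; ++i)
                        dst[i] = src[i] + 1;
        }
}
```

Both copies compute the same result; the point of the split is only that
the first copy runs under a condition that static analysis could not
prove.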

* Re: Peeling loops at tree level?, Lno branch merge patches ping
  2004-09-16 23:05           ` Zdenek Dvorak
@ 2004-09-20 20:41             ` Richard Henderson
  0 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2004-09-20 20:41 UTC
  To: Zdenek Dvorak; +Cc: Richard Guenther, gcc, gcc-patches

On Fri, Sep 17, 2004 at 12:01:16AM +0200, Zdenek Dvorak wrote:
> +   make_edge (new_head, second_head, EDGE_FALSE_VALUE);
...
> +   /* When edge 'e' was split, it created a fall through edge
> +       from new head to second head. Above created FALSE edge
> +       from new head to second head and now we do not need the
> +       fall through edge.  */
> +   for (e0 = new_head->succ; e0; e0 = e0->succ_next)
> +     if (e0->dest == second_head)
> +       e0->flags &= ~EDGE_FALLTHRU;

  e0 = make_edge (new_head, second_head, EDGE_FALSE_VALUE);


Otherwise ok.


r~
