conditional lim - Evgeniya Maenkova

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Evgeniya Maenkova <evgeniya.maenkova@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>, gcc-patches@gcc.gnu.org
Subject: conditional lim
Date: Fri, 08 May 2015 21:07:00 -0000	[thread overview]
Message-ID: <CANVW7MPfRvpNjABKuqup9GxbUk7A=BaRqwsPcbP8DRpGoGMWdw@mail.gmail.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 5080 bytes --]

Hi,

Could you please review my patch for predicated lim?

Let me note some details about it:



1)      Phi statements are still moved only if they have 1 or 2
arguments. However, phi statements could be move under conditions (as
it’s done for the other statements).  Probably, phi statement motion
with 3 + arguments could be implemented in the next patch after
predicated lim.

2)      Patch has limitations/features like (it was ok to me to
implement it such way, maybe I’m not correct. ):

a)      Loop1

    {

              If (a)

                 Loop2

                     {

                       Stmt - Invariant for Loop1

                     }

             }

       In this case Stmt will be moved only out of Loop2, because of if (a).

b)      Or

Loop1

    {

         …

         If (cond1)

              If (cond2)

                  If (cond3)

                      Stmt;

       }

Stmt will be moved out only if cond1 is always executed in Loop1.

c)       It took me a long time to write all of these code, so there
might be other peculiarities which I forgot to mention. :)

                       Let’s discuss these ones as you will review my patch.

3)      Patch consists of 9 files:

a)      gcc/testsuite/gcc.dg/tree-ssa/loop-7.c,
gcc/testsuite/gcc.dg/tree-ssa/recip-3.c – changed tests:

-          gcc/testsuite/gcc.dg/tree-ssa/loop-7.c  changed as
predicated lim moves 2 more statements out of the loop;

-          gcc/testsuite/gcc.dg/tree-ssa/recip-3.c – with conditional
lim recip optimization in this test doesn’t work (the corresponding
value is below threshold as I could see in the code for recip, 1<3).
So to have recip working in this test I changed test a little bit.

b)      gcc/tree-ssa-loop-im.c – the patched lim per se

c)       gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-13.c,
gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-14.c,

gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-15.c,
gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-16.c,

gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-17.c,
gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c

the tests for conditional lim.

4)      Patch testing:

a)      make –k check (no difference in results for me for the clean
build and the patched one,

-          Revision: 222849,

-          uname -a

               Linux Istanbul 3.16.0-23-generic #31-Ubuntu SMP Tue Oct
21 18:00:35 UTC 2014 i686 i686 i686 GNU/Linux

b)      Bootstrap.

 It goes well now, however to fix it I have made a temporary hack in
the lim code. And with this fix patch definitely shouldn’t be
committed.

I did so, as I would like to discuss this issue first.

The problem is: I got stage2-stage3 comparison failure on the single
file (tree-vect-data-refs.o). After some investigation I understood
that tree-vect-data-refs.o differs being compiled with and without
‘-g’ option (yes, more exactly on stage 2 this is  ‘-g –O2 –gtoggle’,
and for stage 3 this is ‘-g –O2’. But to simplify things I can
reproduce this difference on the same build (even not bootstrapped),
with option ‘ –g’ and without it).

Of course, the file compiled with –g option will contain debug
information and will differ from the corresponding file without debug
information. I mean there is the difference reported by
contrib/compare-debug (I mean we talk about stripped files).

Then I compared assemblers and lim dump files and made a conclusion
the difference is:

There is statement _484=something, which is conditionally moved out of
loop X. In non debug cases no usages of _484 are left inside the loop
X. In debug case, there is the single use in debug statement. So for
debug case my patch generates phi statement for _484 to make it
available inside the loop X. For non debug case, no such phi statement
generated as there is no uses of _484.

There is one more statement with lhs=_493, with exactly this pattern
of use. But this is not important for our discussion.



So, in my opinion it’s ok to generate additional phi node for debug
case. But I’m not a compiler expert and maybe there is some
requirement that debug and non-debug versions should differ only by
debug statements, I don’t know.



My temporary hack fix is just removing of such debug statements (no
debug statements, no problems J).



The alternatives I see are:

-          Move debug statements outside the loop (not good in my opinion);

-          Somehow reset values in debug statements (not good in my
opinion, if I could provide correct information for them);

-          Generate phi for debug statements and fix bootstrap (say,
let’s have some list of files, which we have special check for. I mean
for my case, the difference is not in stage2 and stage3, it’s in debug
and non-debug). I like this variant. However, as I don’t see such list
of special files in the whole gcc, I think I might be missing
something.

What do you think?



Thanks,



Evgeniya

[-- Attachment #2: patch_May8_3 --]
[-- Type: application/octet-stream, Size: 65515 bytes --]

Index: gcc/testsuite/gcc.dg/tree-ssa/loop-7.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-7.c	(revision 222849)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-7.c	(working copy)
@@ -28,8 +28,9 @@ int xxx (void)
 }
 
 /* Calls to cst_fun1 and pure_fun1 may be moved out of the loop.
-   Calls to cst_fun2 and pure_fun2 should not be, since calling
-   with k = 0 may be invalid.  */
+   Calls to cst_fun2 and pure_fun2 should be moved under condition,
+   since calling with k = 0 may be invalid.  */
 
-/* { dg-final { scan-tree-dump-times "Moving statement" 2 "lim1" } } */
+/* { dg-final { scan-tree-dump-times "Moving statement" 4 "lim1" } } */
+/* { dg-final { scan-tree-dump-times "Conditionally executed" 2 "lim1" } } */
 /* { dg-final { cleanup-tree-dump "lim1" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/recip-3.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/recip-3.c	(revision 222849)
+++ gcc/testsuite/gcc.dg/tree-ssa/recip-3.c	(working copy)
@@ -7,18 +7,19 @@ double F[2] = { 0.0, 0.0 }, e;
 float h ()
 {
 	int i;
-	double E, W, P, d;
+	double E, W, P1, d, P2, S;
 
 	W = 1.1;
 	d = 2.*e;
 	E = 1. - d;
 
 	for( i=0; i < 2; i++ )
-		if( d > 0.01 )
-		{
-			P = ( W < E ) ? (W - E)/d : (E - W)/d;
-			F[i] += P;
-		}
+	  {
+	    S = (W < E) ? (W - E) : (E - W);
+	    P1 = S / d;
+	    P2 = (S * 3) /d;
+	    F[i] += (P1 > P2) ? P1 : P2;
+	  }
 
 	F[0] += E / d;
 }
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-13.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-13.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-13.c	(working copy)
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-lim1-details" } */
+
+void bar (int);
+void foo (int n, int m, int k)
+{
+  unsigned i;
+  for (i = 0; i < n; ++i)
+    {
+      if (k)
+	{
+	  int x;
+	  if (m < 0)
+	    x = 1;
+	  else
+	    x = m;
+	  bar(x);
+	}
+
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "Moving PHI node" 1 "lim1"  } } */
+/* { dg-final { scan-tree-dump-times "Conditionally executed" 1 "lim1"  } } */
+/* { dg-final { scan-tree-dump-times "Moving statement" 0 "lim1"  } } */
+/* { dg-final { cleanup-tree-dump "lim1" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-14.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-14.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-14.c	(working copy)
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-lim1-details" } */
+
+void bar (int);
+void foo (int n, int m, int k)
+{
+  unsigned i;
+  for (i = 0; i < n; ++i)
+    {
+      if (k + i < 5)
+	{
+	  int x;
+	  if (m < 0)
+	    x = 1;
+	  else
+	    x = m;
+	  bar(x);
+	}
+
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "Moving" 0 "lim1"  } } */
+/* { dg-final { cleanup-tree-dump "lim1" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-15.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-15.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-15.c	(working copy)
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-lim1-details" } */
+
+void bar (int);
+void foo (int n, int m, int k)
+{
+  unsigned i;
+  for (i = 0; i < n; ++i)
+    {
+      if (n + m > 5000)
+	continue;
+      if (k)
+	{
+	  int x;
+	  if (m < 0)
+	    x = 1;
+	  else
+	    x = m;
+	  bar(x);
+	}
+
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "Moving PHI node" 1 "lim1"  } } */
+/* { dg-final { scan-tree-dump-times "Conditionally executed" 1 "lim1"  } } */
+/* { dg-final { cleanup-tree-dump "lim1" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-16.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-16.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-16.c	(working copy)
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-lim1-details" } */
+
+void bar (long);
+void foo (int n, int m, int k, int s)
+{
+  unsigned x,y,z;
+  long a,b,c;
+
+  for (x = 0; x < 1000; ++x)
+    {
+      c = c + k;
+      for (z = 0; z < 2000; z +=3)
+	{
+	  if (k)
+	    {
+	      a += m * n;
+	      b += m * x;
+	    }
+	}
+    }
+  bar (a);
+  bar (b);
+  bar (c);
+
+}
+
+/* { dg-final { scan-tree-dump-times "Conditionally executed" 3 "lim1"  } } */
+/* { dg-final { scan-tree-dump-times "Moving statement" 3 "lim1"  } } */
+/* { dg-final { scan-tree-dump-times "out of loop 2" 2 "lim1"  } } */
+/* { dg-final { scan-tree-dump-times "out of loop 1" 1 "lim1"  } } */
+/* { dg-final { cleanup-tree-dump "lim1" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-17.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-17.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-17.c	(working copy)
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-lim1-details" } */
+
+void bar (long);
+void foo (int n, int m, int k, int s)
+{
+  unsigned x,y,z;
+  long a,b,c;
+
+  for (x = 0; x < 1000; ++x)
+    {
+      c = c + k;
+      if (c + k < 200)
+	continue;
+      for (z = 0; z < 2000; z +=3)
+	{
+	  if (k)
+	    {
+	      a += m * n;
+	      b += m * x;
+	    }
+	}
+    }
+  bar (a);
+  bar (b);
+  bar (c);
+
+}
+
+/* { dg-final { scan-tree-dump-times "Conditionally executed" 3 "lim1"  } } */
+/* { dg-final { scan-tree-dump-times "Moving statement" 3 "lim1"  } } */
+/* { dg-final { scan-tree-dump-times "out of loop 2" 3 "lim1"  } } */
+/* { dg-final { scan-tree-dump-times "out of loop 1" 0 "lim1"  } } */
+/* { dg-final { cleanup-tree-dump "lim1" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c	(working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-lim1-details" } */
+
+struct { int x; int y; } global;
+void foo(int n, int m)
+{
+  int i;
+  for ( i=0; i<n; i++)
+    if (m > 0)
+      global.y += global.x*global.x;
+}
+
+/* { dg-final { scan-tree-dump "Executing store motion of global.y" "lim1" } } */
+/* { dg-final { scan-tree-dump "Moving statement.*global.x.*out of loop 1" "lim1" } } */
+/* { dg-final { cleanup-tree-dump "lim1" } } */
Index: gcc/tree-ssa-loop-im.c
===================================================================
--- gcc/tree-ssa-loop-im.c	(revision 222849)
+++ gcc/tree-ssa-loop-im.c	(working copy)
@@ -65,34 +65,13 @@ along with GCC; see the file COPYING3.
 #include "domwalk.h"
 #include "params.h"
 #include "tree-pass.h"
+#include "tree-dfa.h"
 #include "flags.h"
 #include "tree-affine.h"
 #include "tree-ssa-propagate.h"
 #include "trans-mem.h"
 #include "gimple-fold.h"
 
-/* TODO:  Support for predicated code motion.  I.e.
-
-   while (1)
-     {
-       if (cond)
-	 {
-	   a = inv;
-	   something;
-	 }
-     }
-
-   Where COND and INV are invariants, but evaluating INV may trap or be
-   invalid from some other reason if !COND.  This may be transformed to
-
-   if (cond)
-     a = inv;
-   while (1)
-     {
-       if (cond)
-	 something;
-     }  */
-
 /* The auxiliary data kept for each statement.  */
 
 struct lim_aux_data
@@ -116,11 +95,65 @@ struct lim_aux_data
 				   is hoisted; i.e. those that define the
 				   operands of the statement and are inside of
 				   the MAX_LOOP loop.  */
+  bool ignore_condition;	/* This is used for store motion. */
+};
+
+/* The data used to create phi nodes after computations movement (per each loop
+   where stmts were moved to). */
+struct phi_data
+{
+  vec<gimple> stmts;		/* Stmts which phi nodes should be created
+				       for. As left parts of these stmts are
+				       used in stmts in some other levels. */
+  vec<gimple> stmts_fl_phi_only;
+			       /* Stmt which phi nodes should be created only
+				  on the first level. */
+
+  hash_map<tree, tree> *names_map;  /* For each of the statements in stmts
+				       field we store the name of phi node,
+				       which will be used outside of this
+				       level (i.e. in the base block which
+				       is the preheader dominator).
+				       I.e.
+					 (cond)
+					   /   \
+				 (bb1) (cond2)
+					 |       /    \
+					 | (bb2) (bb with stmt)
+					  \       \    /
+					   \ (bb with intermidiate phi node)
+					    \       /
+					 (phi node with
+					     the name from this map,
+					  this is the preheader dominator)*/
+  hash_map<tree, tree> *fl_names_map; /* This is the first level phi node names,
+					 i.e phi node in the first bb after
+					 bb with stmt (see the picture above).
+					 This map is used only to replace var
+					 uses in phi with 2 arguments.*/
+  hash_map<tree, tree> *defs_map;   /* This is used when we create phi nodes
+				       and need to know definitions on outer
+				       loops. */
+
 };
 
 /* Maps statements to their lim_aux_data.  */
 
 static hash_map<gimple, lim_aux_data *> *lim_aux_data_map;
+typedef hash_map<gimple, basic_block> cond_data_map;
+/* This information is used when we move computations. Per loop we store
+base blocks created to keep conditions. */
+static hash_map<struct loop*, cond_data_map*> *level_cond_data_map;
+/* As by this pass the information about the post dominators is not collected
+ (CDI_POST_DOMINATORS flag), we do create some kind of post dominator info for
+created based blocks. This is only locally created, locally used and for anew
+created blocks only. */
+static hash_map<basic_block, basic_block> *post_dominators_map;
+/* This vector keeps the statements with lhs which is used in the conditions.
+After computations movements we do iterate over this to replace lhs as needed
+in these conditions. As there could be several copies of the same conditions,
+we cannot do these replacements before the all movements. */
+static vec<gimple> stmts_with_changed_lhs;
 
 /* Description of a memory reference location.  */
 
@@ -130,21 +163,49 @@ typedef struct mem_ref_loc
   gimple stmt;			/* The statement in that it occurs.  */
 } *mem_ref_loc_p;
 
+/* The type of movement condtion */
+enum cond_type
+{
+   TRUE_TARGET_OF,		     /* bb with move_cond is true target */
+   FALSE_TARGET_OF,		    /* bb with move_cond is false target */
+   ALWAYS_EXECUTED_IN,		 /* bb with move_cond is always executed
+					  in some loop */
+   NEW_BB_FROM_PREHEADER_REGION_OF     /* this is for anew created bb,
+					  indicates which preheader region
+					  this is. */
+};
+
+/* This data is per base block. */
+struct move_cond
+{
+   enum cond_type type;
+   int aux;
+   union {
+       gimple cond_stmt;   /* if cond_type is TRUE_TARGET_OF of FALSE_TARGET_OF,
+			      this is parent condition */
+       struct loop* loop;  /* this is either the loop where correspoding base
+			      block is always executed
+			 (type == ALWAYS_EXECUTED_IN) or the loop which
+			      preheader region corresponding base block belongs
+			      to (NEW_BB_FROM_PREHEADER_REGION_OF). */
+   };
+};
+
 
 /* Description of a memory reference.  */
 
 typedef struct im_mem_ref
 {
   unsigned id;			/* ID assigned to the memory reference
-				   (its index in memory_accesses.refs_list)  */
+				 (its index in memory_accesses.refs_list)  */
   hashval_t hash;		/* Its hash value.  */
 
   /* The memory access itself and associated caching of alias-oracle
      query meta-data.  */
   ao_ref mem;
 
-  bitmap stored;		/* The set of loops in that this memory location
-				   is stored to.  */
+  bitmap stored;		/* The set of loops in that this memory
+				   location is stored to.  */
   vec<mem_ref_loc>		accesses_in_loop;
 				/* The locations of the accesses.  Vector
 				   indexed by the loop number.  */
@@ -157,8 +218,8 @@ typedef struct im_mem_ref
 				   If it is stored in the loop, this store
 				     is independent on all other loads and
 				     stores.
-				   If it is only loaded, then it is independent
-				     on all stores in the loop.  */
+				   If it is only loaded, then it is
+				   independent on all stores in the loop. */
   bitmap_head dep_loop;		/* The complement of INDEP_LOOP.  */
 } *mem_ref_p;
 
@@ -228,10 +289,54 @@ static bool ref_indep_loop_p (struct loo
 /* Minimum cost of an expensive expression.  */
 #define LIM_EXPENSIVE ((unsigned) PARAM_VALUE (PARAM_LIM_EXPENSIVE))
 
+#define BB_MOVE_COND(BB) ((struct move_cond *) (BB->aux))
 /* The outermost loop for which execution of the header guarantees that the
    block will be executed.  */
-#define ALWAYS_EXECUTED_IN(BB) ((struct loop *) (BB)->aux)
-#define SET_ALWAYS_EXECUTED_IN(BB, VAL) ((BB)->aux = (void *) (VAL))
+#define ALWAYS_EXECUTED_IN(BB)\
+		 (BB_MOVE_COND (BB) != NULL\
+		 && BB_MOVE_COND (BB)->type == ALWAYS_EXECUTED_IN\
+		 ? BB_MOVE_COND (BB)->loop: NULL)
+#define SET_ALWAYS_EXECUTED_IN(BB, VAL)\
+		  BB->aux = (void *) (XCNEW (struct move_cond));\
+		  BB_MOVE_COND (BB)->type=ALWAYS_EXECUTED_IN;\
+		  BB_MOVE_COND (BB)->loop=VAL;
+#define IS_TRUE_TARGET(BB) (BB_MOVE_COND (BB) != NULL\
+			    && BB_MOVE_COND (BB)->type == TRUE_TARGET_OF)
+#define IS_FALSE_TARGET(BB) (BB_MOVE_COND (BB) != NULL\
+			    && BB_MOVE_COND (BB)->type == FALSE_TARGET_OF)
+#define IS_THE_SAME_TARGET_TYPE(BB, type) (BB_MOVE_COND (BB) != NULL\
+			     && BB_MOVE_COND (BB)->type == type)
+
+/* Whether this BB is true or false target. (If the outermost condition is
+   not always executed in the loop, correspoding move_cond of the all child
+   cond and the BB will be NULL */
+#define IS_TRUE_OR_FALSE_TARGET(BB) (IS_TRUE_TARGET (BB)\
+	  || IS_FALSE_TARGET (BB))
+/* Condition statement for BB */
+#define COND_STMT(BB) (BB_MOVE_COND (BB)->cond_stmt)
+#define SET_TARGET_OF(cond,type,BB)  if (BB->aux == NULL)\
+				     BB->aux = (void *) (new move_cond);\
+				     BB_MOVE_COND (BB)->type=type;\
+				     BB_MOVE_COND (BB)->cond_stmt=cond;\
+				     BB_MOVE_COND (BB)->aux=false;
+#define SET_MOVE_COND_AUX(BB, VALUE) (BB_MOVE_COND (BB)->aux = VALUE)
+/* This is used when we evaluate whether the outermost condition is always
+   executed in the loop. */
+#define GET_MOVE_COND_AUX(BB) (BB_MOVE_COND (BB)->aux)
+#define IS_NEW_BB(BB) (BB_MOVE_COND (BB) != NULL\
+		       && BB_MOVE_COND (BB)->type\
+	       == NEW_BB_FROM_PREHEADER_REGION_OF)
+#define SET_LOOP_OF_PREHEADER_REGION(BB,VAL)\
+	   if (BB->aux == NULL)\
+	      {BB->aux = (void *)\
+			 (XCNEW (struct move_cond));\
+	       BB_MOVE_COND (BB)->type\
+		 =NEW_BB_FROM_PREHEADER_REGION_OF;\
+	       BB_MOVE_COND (BB)->loop=VAL;}
+#define LOOP_OF_PREHEADER_REGION(BB) (IS_NEW_BB (BB)\
+	       ? BB_MOVE_COND (BB)->loop: NULL)
+
+#define FREE_AUX_FOR_BB(BB) XDELETE (BB_MOVE_COND (BB));BB->aux=NULL;
 
 /* ID of the shared unanalyzable mem.  */
 #define UNANALYZABLE_MEM_ID 0
@@ -244,7 +349,7 @@ init_lim_data (gimple stmt)
 {
   lim_aux_data *p = XCNEW (struct lim_aux_data);
   lim_aux_data_map->put (stmt, p);
-
+  p->ignore_condition = false;
   return p;
 }
 
@@ -258,6 +363,26 @@ get_lim_data (gimple stmt)
   return *p;
 }
 
+static cond_data_map *get_cond_data_map (struct loop * level)
+{
+  cond_data_map **p = level_cond_data_map->get (level);
+  if (!p)
+    {
+      cond_data_map *map = new cond_data_map;
+      level_cond_data_map->put (level, map);
+      return map;
+    }
+   return *p;
+}
+
+static basic_block get_bb_by_cond (cond_data_map *map, gimple cond)
+{
+  basic_block *p = map->get (cond);
+  if (!p)
+    return NULL;
+  return *p;
+}
+
 /* Releases the memory occupied by DATA.  */
 
 static void
@@ -302,13 +427,9 @@ movement_possibility (gimple stmt)
   tree lhs;
   enum move_pos ret = MOVE_POSSIBLE;
 
-  if (flag_unswitch_loops
-      && gimple_code (stmt) == GIMPLE_COND)
-    {
-      /* If we perform unswitching, force the operands of the invariant
-	 condition to be moved out of the loop.  */
-      return MOVE_POSSIBLE;
-    }
+  if (gimple_code (stmt) == GIMPLE_COND)
+    return MOVE_POSSIBLE;
+
 
   if (gimple_code (stmt) == GIMPLE_PHI
       && gimple_phi_num_args (stmt) <= 2
@@ -540,13 +661,13 @@ stmt_cost (gimple stmt)
 
     case CONSTRUCTOR:
       /* Make vector construction cost proportional to the number
-         of elements.  */
+	 of elements.  */
       return CONSTRUCTOR_NELTS (gimple_assign_rhs1 (stmt));
 
     case SSA_NAME:
     case PAREN_EXPR:
       /* Whether or not something is wrapped inside a PAREN_EXPR
-         should not change move cost.  Nor should an intermediate
+	 should not change move cost.  Nor should an intermediate
 	 unpropagated SSA name copy.  */
       return 0;
 
@@ -707,21 +828,18 @@ extract_true_false_args_from_phi (basic_
    is defined in, and true otherwise.  */
 
 static bool
-determine_max_movement (gimple stmt, bool must_preserve_exec)
+determine_max_movement (gimple stmt, struct loop *start_level)
 {
   basic_block bb = gimple_bb (stmt);
   struct loop *loop = bb->loop_father;
-  struct loop *level;
+  struct loop *level = start_level;
   struct lim_aux_data *lim_data = get_lim_data (stmt);
   tree val;
   ssa_op_iter iter;
 
-  if (must_preserve_exec)
-    level = ALWAYS_EXECUTED_IN (bb);
-  else
-    level = superloop_at_depth (loop, 1);
   lim_data->max_loop = level;
 
+
   if (gphi *phi = dyn_cast <gphi *> (stmt))
     {
       use_operand_p use_p;
@@ -762,7 +880,6 @@ determine_max_movement (gimple stmt, boo
 
       min_cost = MIN (min_cost, total_cost);
       lim_data->cost += min_cost;
-
       if (gimple_phi_num_args (phi) > 1)
 	{
 	  basic_block dom = get_immediate_dominator (CDI_DOMINATORS, bb);
@@ -887,7 +1004,7 @@ nonpure_call_p (gimple stmt)
   return gimple_has_side_effects (stmt);
 }
 
-/* Rewrite a/b to a*(1/b).  Return the invariant stmt to process.  */
+/* Rewrite a/b to a* (1/b).  Return the invariant stmt to process.  */
 
 static gimple
 rewrite_reciprocal (gimple_stmt_iterator *bsi)
@@ -921,7 +1038,7 @@ rewrite_reciprocal (gimple_stmt_iterator
 }
 
 /* Check if the pattern at *BSI is a bittest of the form
-   (A >> B) & 1 != 0 and in this case rewrite it to A & (1 << B) != 0.  */
+ (A >> B) & 1 != 0 and in this case rewrite it to A & (1 << B) != 0.  */
 
 static gimple
 rewrite_bittest (gimple_stmt_iterator *bsi)
@@ -1027,6 +1144,151 @@ public:
   virtual void before_dom_children (basic_block);
 };
 
+#define EVALUATION_IN_PROGRESS (0x01 << 1)
+#define EVALUATED (0x01 << 2)
+
+int is_first_level_cond_always_executed (basic_block bb)
+{
+  if (BB_MOVE_COND (bb) == NULL)
+    return false;
+  /* We have recursive conditions, will not move these computations. */
+  if (GET_MOVE_COND_AUX (bb) & EVALUATION_IN_PROGRESS)
+    {
+      FREE_AUX_FOR_BB (bb);
+      return false;
+    }
+
+  if (GET_MOVE_COND_AUX (bb) & EVALUATED)
+    return true;
+
+  if (ALWAYS_EXECUTED_IN (bb) != NULL)
+    {
+      SET_MOVE_COND_AUX (bb, EVALUATED);
+      return true;
+    }
+  if (!IS_TRUE_OR_FALSE_TARGET (bb))
+    {
+      FREE_AUX_FOR_BB (bb);
+      return false;
+    }
+
+   SET_MOVE_COND_AUX (bb, EVALUATION_IN_PROGRESS);
+   bool result = is_first_level_cond_always_executed (COND_STMT (bb)->bb);
+
+   if (!result)
+     {
+       FREE_AUX_FOR_BB (bb);
+     } else
+       SET_MOVE_COND_AUX (bb, EVALUATED);
+
+   return result;
+}
+
+/* Checks whether bb is conditionally executed. However, we only count
+   as conditionally executed the blocks which outermost condition
+   is always executed.
+   I.e.
+   if (a) if (b) if (c) stmt; ('a' should be always executed in some loop,
+   then base blocks corresponging to 'b', 'c' and 'stmt' will be called
+   conditionally executed, otherwise correspoding move_cond will be NULL
+   for basic blocks corresponding to 'b', 'c' and 'stmt'). */
+bool check_conditionally_executed (basic_block bb)
+{
+  if (!IS_TRUE_OR_FALSE_TARGET (bb))
+    return false;
+  basic_block cond_bb = COND_STMT (bb)->bb;
+  if (cond_bb == bb)
+    {
+      FREE_AUX_FOR_BB (bb);
+      return false;
+    }
+  return is_first_level_cond_always_executed (cond_bb);
+}
+
+bool calculate_tgt_level (gimple stmt, struct loop *outermost,
+			 bool cond_executed, basic_block bb)
+{
+  struct lim_aux_data *lim_data = init_lim_data (stmt);
+  lim_data->always_executed_in = outermost;
+
+  struct loop * start_level = NULL;
+  if (cond_executed)
+    {
+      gimple cond = COND_STMT (bb);
+      struct lim_aux_data *cond_lim_data = get_lim_data (cond);
+      if (cond_lim_data)
+	start_level = cond_lim_data->max_loop;
+	struct loop *current_loop = bb->loop_father;
+	if (start_level != NULL && current_loop != start_level
+	    && !flow_loop_nested_p (start_level, current_loop))
+	  start_level = NULL;
+     }
+  else /* The reason why we don't set start_level for MOVE_POSSIBLE
+	  statements to more outer level is: this statement could depend on
+	  conditional statements and even probably is not defined without this
+	  condition. So either we should analyze these ones and move
+	  under condition somehow or keep more strong start_level */
+    start_level = ALWAYS_EXECUTED_IN (bb);
+  if (start_level == NULL || !determine_max_movement (stmt, start_level))
+    {
+       lim_data->max_loop = NULL;
+       return false;
+    }
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+       print_gimple_stmt (dump_file, stmt, 2, 0);
+       fprintf (dump_file, "  invariant up to level %d, cost %d.\n\n",
+	       loop_depth (lim_data->max_loop),
+	       lim_data->cost);
+     }
+
+  if (lim_data->cost >= LIM_EXPENSIVE)
+    set_profitable_level (stmt);
+  return true;
+}
+
+/* This method is called only from handle_cond_stmt, which is called
+   in CDI_DOMINATORS order, so we call SET_TARGET_OF for all nodes which
+   is predicated by this condition and for _some_ of them predicated by
+   additional subconditions (as this method will be called again for
+   sub-conditions to correct things) */
+
+void handle_branch_edge (edge e, enum cond_type type, gimple stmt)
+ {
+   edge edge_loc = e;
+   basic_block bb = edge_loc->dest;
+   if (stmt->bb->loop_father != bb->loop_father)
+     return;
+   if (!single_pred_p (bb))
+     return;
+   basic_block dominator = bb;
+   struct loop *loop = bb->loop_father;
+   do
+     {
+       if (IS_THE_SAME_TARGET_TYPE (bb, type) && COND_STMT (bb) == stmt)
+	 {
+	    FREE_AUX_FOR_BB (bb);
+	    break;
+	 }
+       SET_TARGET_OF (stmt, type, bb);
+       edge_loc = EDGE_COUNT (bb->succs) > 0 ? EDGE_SUCC (bb, 0) : NULL;
+       if (edge_loc)
+	 bb = edge_loc->dest;
+      } while (edge_loc
+	    && bb->loop_father == loop
+	    &&  dominated_by_p (CDI_DOMINATORS, bb, dominator));
+ }
+
+void handle_cond_stmt (gimple stmt)
+{
+   basic_block bb = stmt->bb;
+   edge true_edge,false_edge;
+   extract_true_false_edges_from_block (bb, &true_edge, &false_edge);
+   handle_branch_edge (true_edge, TRUE_TARGET_OF, stmt);
+   handle_branch_edge (false_edge, FALSE_TARGET_OF, stmt);
+}
+
+
 /* Determine the outermost loops in that statements in basic block BB are
    invariant, and record them to the LIM_DATA associated with the statements.
    Callback for dom_walker.  */
@@ -1039,10 +1301,10 @@ invariantness_dom_walker::before_dom_chi
   gimple stmt;
   bool maybe_never = ALWAYS_EXECUTED_IN (bb) == NULL;
   struct loop *outermost = ALWAYS_EXECUTED_IN (bb);
-  struct lim_aux_data *lim_data;
 
   if (!loop_outer (bb->loop_father))
     return;
+  bool cond_executed = maybe_never && check_conditionally_executed (bb);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     fprintf (dump_file, "Basic block %d (loop %d -- depth %d):\n\n",
@@ -1063,32 +1325,16 @@ invariantness_dom_walker::before_dom_chi
 	pos = movement_possibility (stmt);
 	if (pos == MOVE_IMPOSSIBLE)
 	  continue;
-
-	lim_data = init_lim_data (stmt);
-	lim_data->always_executed_in = outermost;
-
-	if (!determine_max_movement (stmt, false))
-	  {
-	    lim_data->max_loop = NULL;
-	    continue;
-	  }
-
-	if (dump_file && (dump_flags & TDF_DETAILS))
-	  {
-	    print_gimple_stmt (dump_file, stmt, 2, 0);
-	    fprintf (dump_file, "  invariant up to level %d, cost %d.\n\n",
-		     loop_depth (lim_data->max_loop),
-		     lim_data->cost);
-	  }
-
-	if (lim_data->cost >= LIM_EXPENSIVE)
-	  set_profitable_level (stmt);
+	if (!calculate_tgt_level (stmt, outermost, cond_executed, bb))
+	  continue;
       }
 
-  for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
+  for (gimple_stmt_iterator bsi = gsi_start_bb (bb); !gsi_end_p (bsi);
+						 gsi_next (&bsi))
     {
       stmt = gsi_stmt (bsi);
-
+      if (gimple_code (stmt) == GIMPLE_COND)
+	handle_cond_stmt (stmt);
       pos = movement_possibility (stmt);
       if (pos == MOVE_IMPOSSIBLE)
 	{
@@ -1116,7 +1362,7 @@ invariantness_dom_walker::before_dom_chi
 	  struct loop *ol1 = outermost_invariant_loop (op1,
 					loop_containing_stmt (stmt));
 
-	  /* If divisor is invariant, convert a/b to a*(1/b), allowing reciprocal
+	  /* If divisor is invariant, convert a/b to a* (1/b), allowing reciprocal
 	     to be hoisted out of loop, saving expensive divide.  */
 	  if (pos == MOVE_POSSIBLE
 	      && gimple_assign_rhs_code (stmt) == RDIV_EXPR
@@ -1137,55 +1383,656 @@ invariantness_dom_walker::before_dom_chi
 	    stmt = rewrite_bittest (&bsi);
 	}
 
-      lim_data = init_lim_data (stmt);
-      lim_data->always_executed_in = outermost;
+      if (maybe_never && !cond_executed && pos == MOVE_PRESERVE_EXECUTION)
+	continue;
 
-      if (maybe_never && pos == MOVE_PRESERVE_EXECUTION)
+      if (!calculate_tgt_level (stmt, outermost, cond_executed, bb))
 	continue;
+    }
+}
+
+class move_computations_dom_walker : public dom_walker
+{
+public:
+  move_computations_dom_walker (cdi_direction direction)
+    : dom_walker (direction), todo_ (0) {}
+
+  virtual void before_dom_children (basic_block);
+
+  unsigned int todo_;
+};
+
+/* This function should be called only for anew created basic blocks */
+basic_block get_postdominator (basic_block bb)
+{
+  basic_block *p = post_dominators_map->get (bb);
+  /* as this is called only for anew created bb */
+  gcc_assert (p != NULL);
+  return *p;
+}
+
+/* This function is called only for bb created in this pass. */
+edge find_branch_edge (basic_block bb, enum cfg_edge_flags flag)
+{
+  edge e;
+  edge_iterator ei;
 
-      if (!determine_max_movement (stmt, pos == MOVE_PRESERVE_EXECUTION))
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (e->flags & flag)
+      break;
+  if (e != NULL && get_postdominator (bb) != e->dest)
+    return e;
+  return NULL;
+}
+
+/* This function is called only for bb created in this pass.
+   Called to get basic block where to add some statement to. The branch
+   is defined by first_in_branch (the is true or false edge, which src is bb
+   and dest is first_in_branch).*/
+basic_block get_most_remoute_in_branch (basic_block bb,
+				       basic_block first_in_branch,
+				       struct loop * level)
+{
+  basic_block post_dominator = get_postdominator (bb);
+  edge e;
+  edge_iterator ei;
+  basic_block merger = NULL;
+  edge prev= NULL;
+  struct vec<edge> *edges_to_change_dest = XCNEW (struct vec<edge>);
+
+  FOR_EACH_EDGE (e, ei, post_dominator->preds)
+    {
+      if (dominated_by_p (CDI_DOMINATORS, e->src, first_in_branch))
 	{
-	  lim_data->max_loop = NULL;
-	  continue;
+	  if (prev != NULL)
+	    {
+	      if (merger == NULL)
+		{
+		  merger = create_empty_bb (first_in_branch);
+
+		  SET_LOOP_OF_PREHEADER_REGION (merger, level);
+		  edges_to_change_dest->safe_push (prev);
+		  merger->frequency = EDGE_FREQUENCY (prev);
+		  set_immediate_dominator (CDI_DOMINATORS, merger,
+					   first_in_branch);
+		  post_dominators_map->put (first_in_branch, merger);
+		  add_bb_to_loop (merger, first_in_branch->loop_father);
+		}
+	      edges_to_change_dest->safe_push (e);
+	      merger->frequency += EDGE_FREQUENCY (e);
+	    }
+	  else
+	    prev = e;
 	}
+    }
+   if (merger != NULL)
+     {
+       edge new_edge = make_edge (merger, post_dominator,
+					    EDGE_FALLTHRU);
+       new_edge->probability = REG_BR_PROB_BASE;
+       unsigned i;
+       for (i = 0; edges_to_change_dest->iterate (i, &e); ++i)
+	 redirect_edge_succ (e, merger);
+       free (edges_to_change_dest);
+       return merger;
+     }
+  return prev->src;
+}
+
+/* This function is called to get basic block to add statement
+ (under condition from bb, whether true or false edge is defined by flag).
+   Probability fields of the edges added during predicated lim is computed
+   as described in profile.c when profile is absent.
+   Counts fileds of the edges and basic blocks added during predicatedlim
+   is not updated.
+   Frequency fields of the basic blocks added during predicated lim is
+   computed as sum of the edges frequencies calculated by EDGE_FREQUENCY.
+  */
+
+basic_block make_branch (basic_block bb, enum cfg_edge_flags flag,
+			struct loop *level)
+{
+  edge e;
+  edge_iterator ei;
+  basic_block result = NULL;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (e->flags & flag)
+      break;
+  if (e != NULL)
+    result = split_edge (e);
+  else
+    {
+      basic_block new_bb = create_empty_bb (bb);
+      set_immediate_dominator (CDI_DOMINATORS, new_bb, bb);
+      edge e1 = make_edge (bb, new_bb, flag);
+      edge fallthru_edge = find_fallthru_edge (bb->succs);
+      fallthru_edge->flags = (flag & EDGE_TRUE_VALUE ? EDGE_FALSE_VALUE
+			       : EDGE_TRUE_VALUE);
+      int old_freq = EDGE_FREQUENCY (fallthru_edge);
+      basic_block post_dominator = fallthru_edge->dest;
+      edge e2 = make_edge (new_bb, post_dominator, EDGE_FALLTHRU);
+      add_bb_to_loop (new_bb, bb->loop_father);
+      post_dominators_map->put (bb, post_dominator);
+      e1->probability = REG_BR_PROB_BASE/2;
+      fallthru_edge->probability = REG_BR_PROB_BASE/2;
+      e2->probability = REG_BR_PROB_BASE;
+      new_bb->frequency = EDGE_FREQUENCY (e1);
+      post_dominator->frequency +=EDGE_FREQUENCY (fallthru_edge)
+				  + EDGE_FREQUENCY (e2) - old_freq;
+      result = new_bb;
+    }
+  SET_LOOP_OF_PREHEADER_REGION (result, level);
+  return result;
+}
+
+/* This method is called to get a basic block to add a conditional statement.*/
+basic_block get_block_for_predicated_statement (gimple cond, bool branch,
+					       edge preheader,
+					       cond_data_map *cond_map,
+					       struct loop *level)
+{
+  basic_block bb = get_bb_by_cond (cond_map, cond);
+
+  enum cfg_edge_flags flag = branch ? EDGE_TRUE_VALUE : EDGE_FALSE_VALUE;
+  if (bb != NULL)
+    {
+      edge be = find_branch_edge (bb, flag);
+      if (be != NULL)
+	return get_most_remoute_in_branch (bb, be->dest, level);
+      else
+	return make_branch (bb, flag, level);
+    }
+  else
+    {
+      basic_block cond_bb = cond->bb;
+      bool always_executed = (ALWAYS_EXECUTED_IN (cond_bb) != NULL);
+      edge edge_for_cond;
+      if (always_executed)
+	{
+	  edge_for_cond = preheader;
+	}
+      else
+	{
+	  gimple parent_cond = COND_STMT (cond_bb);
+	  bool branch = IS_TRUE_TARGET (cond_bb);
+	  cond_bb = get_block_for_predicated_statement (parent_cond, branch,
+						       preheader, cond_map,
+						       level);
+	  edge_for_cond = find_fallthru_edge (cond_bb->succs);
+	}
+      gimple cond_copy = gimple_copy (cond);
+      gsi_insert_on_edge_immediate (edge_for_cond, cond_copy);
+      cond_bb = cond_copy->bb;
+      if (always_executed)
+	split_edge (preheader);
+      cond_map->put (cond, cond_bb);
+      SET_LOOP_OF_PREHEADER_REGION (cond_bb, level);
+      return make_branch (cond_bb, flag, level);
+    }
+}
+
+/* phi_data->names_map is going to be used more often than
+   fl_names_map and defs_map. So we initialize it together
+   with phi_data initialization. */
+phi_data *init_phi_data (struct loop *level)
+{
+  phi_data *data = XCNEW (struct phi_data);
+  data->names_map = new hash_map<tree,tree>;
+  data->fl_names_map = NULL;
+  data->defs_map = NULL;
+  level->aux = data;
+  return data;
+}
+
+tree create_key (tree var)
+{
+  tree key = SSA_NAME_VAR (var);
+  if (key == NULL) //temporary_variable
+    key = var;
+  return key;
+}
+
+tree get_name (hash_map<tree, tree> *map, tree key)
+{
+  if (map == NULL)
+    return NULL;
+  tree *p = map->get (key);
+  if (p != NULL)
+    return *p;
+  return NULL;
+}
+
+
+hash_map<tree, tree> *get_names_map (struct phi_data *data,
+				    bool from_fl_names_map)
+{
+  gcc_assert (data != NULL);
+  if (from_fl_names_map)
+    {
+      if (data->fl_names_map == NULL)
+	data->fl_names_map = new hash_map<tree, tree>;
+      return data->fl_names_map;
+    }
+  else
+    return data->names_map;
+}
+
+tree get_new_name (struct loop * level, tree var,
+		   bool from_fl_names_map)
+{
+  struct phi_data *data = (phi_data *) (level->aux);
+  tree key = create_key (var);
+  tree new_name = NULL;
+  if (data != NULL)
+    new_name = get_name (get_names_map (data,
+				      from_fl_names_map),
+			key);
+  else
+    data = init_phi_data (level);
+  if (new_name == NULL)
+    {
+      new_name = copy_ssa_name (var);
+      get_names_map (data, from_fl_names_map)->put (key,
+						   new_name);
+    }
+  return new_name;
+}
+
+tree get_new_name (struct loop * level, tree var)
+{
+  return get_new_name (level, var, false);
+}
+
+void set_default_def (struct loop* level, tree key, tree def)
+{
+  struct phi_data *data = (phi_data *) (level->aux);
+  if (data == NULL)
+    data = init_phi_data (level);
+  if (data->defs_map == NULL)
+    data->defs_map = new hash_map<tree, tree>;
+  data->defs_map->put (key, def);
+}
+
+/*!!! TODO: debug statements removing is a very temporary solution!!!
+  This is terrible and only before the first review of predicated lim.
+  The problem is: when debug statement is the only statement which we need
+  create phi for there is the different code for 'g' option and without 'g'
+  option.
+  That's why bootstrap build fails. The difference between stage2 and stage3
+  is in the single file onl (tree-vect-data-refs.o).
+  And actually the difference is not in stage2 and
+  stage3. The difference is in between compiler options (-gtoogle).
+  So as a temporary solution, we just remove such debug statements.
+  No statements - no problems :) . */
+void remove_debug_stmt (vec<gimple> *stmts, bool cond)
+{
+  if (!cond)
+    return;
+
+  gimple stmt;
+  gimple_stmt_iterator gsi;
+  unsigned i;
 
+  for (i = 0; stmts->iterate (i, &stmt); ++i)
+    {
+      gsi = gsi_for_stmt (stmt);
+      gsi_remove (&gsi, true);
+    }
+}
+
+tree create_tmp_va (tree old_var, gimple stmt,
+		    struct loop *loop_with_data,
+		    struct loop *level)
+{
+  tree result = old_var;
+  use_operand_p use_p;
+  imm_use_iterator iter;
+  bool cond_for_tmp_var = false;
+  bool remove_debug_stmts_flag = false;
+  struct vec<gimple> *debug_stmts= NULL;
+  FOR_EACH_IMM_USE_FAST (use_p, iter, old_var)
+    {
+      gimple use_stmt = USE_STMT (use_p);
+      struct lim_aux_data * lim_data = get_lim_data (use_stmt);
+      struct loop *tgt_loop = lim_data != NULL ? lim_data->tgt_loop
+					      : NULL;
+      gphi * phi;
+      if (tgt_loop != level && gimple_code (use_stmt) == GIMPLE_DEBUG)
+	{
+	  remove_debug_stmts_flag = true;
+	  if (debug_stmts == NULL)
+	     debug_stmts = XCNEW (struct vec<gimple>);
+	  debug_stmts->safe_push (use_stmt);
+	}
+      else if (tgt_loop != level
+	  || gimple_code (use_stmt) == GIMPLE_COND
+	  || (phi = dyn_cast <gphi *> (use_stmt)))
+	{
+	  cond_for_tmp_var = true;
+	  remove_debug_stmts_flag = false;
+	  break;
+	}
+    }
+  remove_debug_stmt (debug_stmts, remove_debug_stmts_flag);
+  free (debug_stmts);
+  /* This is limitation from make_ssa_name and
+     get_or_create_ssa_default_def. We need these methods further
+     when create phi nodes. */
+  cond_for_tmp_var &= !(TREE_CODE (old_var) == VAR_DECL
+			|| TREE_CODE (old_var) == PARM_DECL
+			|| TREE_CODE (old_var) == RESULT_DECL);
+  if (cond_for_tmp_var)
+    {
+      tree tmp_var = create_tmp_reg (TREE_TYPE (old_var), "plim");
+      result = make_ssa_name (tmp_var, NULL);
+      gimple_set_lhs (stmt, result);
       if (dump_file && (dump_flags & TDF_DETAILS))
 	{
-	  print_gimple_stmt (dump_file, stmt, 2, 0);
-	  fprintf (dump_file, "  invariant up to level %d, cost %d.\n\n",
-		   loop_depth (lim_data->max_loop),
-		   lim_data->cost);
+	  fprintf (dump_file, "created new name ");
+	  print_generic_stmt (dump_file, result, 0);
+	  fprintf (dump_file, "\n");
+	}
+
+      set_default_def (loop_with_data, create_key (result),
+		     build_zero_cst (TREE_TYPE (old_var)));
+    }
+  return result;
+}
+
+#define WAS_NOT_ADDED (0x0)
+#define ADDED_TO_PHI_DATA (0x01 << 1)
+#define ADDED_TO_STMTS_WITH_CH_LHS (0x01 << 2)
+#define ADDED_TO_FL_PHI_DATA (0x01 << 3)
+#define ADDED_TO_EVERYWHERE (ADDED_TO_PHI_DATA | ADDED_TO_STMTS_WITH_CH_LHS)
+
+/* This function is called from move_computations routines, when
+   stmt is going to be moved outside its current loop. So we analyze whether
+   phi nodes should be added in the new loop, store corresponding info
+   into loop->aux (as phi_data->stmts, ADDED_TO_PHI_DATA result code),
+   replace uses of lhs of stmt, if any,
+   to suggested phi name (phi_data->names_map) and also
+   add stmt to stmts_with_changed_lhs to update lhs uses in conditions later
+ (ADDED_TO_STMTS_WITH_CH_LHS result code). */
+int add_info_for_phi (gimple stmt, struct loop *level)
+{
+  int result = WAS_NOT_ADDED;
+  tree old_var = gimple_get_lhs (stmt);
+  if (old_var == NULL)
+     return result;
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "Will analyze uses of ");
+      print_generic_expr (dump_file, old_var, 0);
+      fprintf (dump_file,"\n");
+    }
+  struct loop* loop_with_data = loop_outer (level);
+  tree name_for_key = create_tmp_va (old_var, stmt, loop_with_data,
+				     level);
+  bool tmp_reg_created = (name_for_key != old_var);
+
+  use_operand_p use_p;
+  imm_use_iterator iter;
+  gimple use_stmt;
+  tree new_name = NULL;
+  tree fl_new_name = NULL;
+  struct lim_aux_data *lim_data;
+  bool create_phi_node = false;
+  gphi *phi = NULL;
+  if (TREE_CODE (old_var) == SSA_NAME && !has_zero_uses (old_var))
+    {
+      FOR_EACH_IMM_USE_STMT (use_stmt, iter, old_var)
+	{
+	  lim_data = get_lim_data (use_stmt);
+	  struct loop *tgt_loop = lim_data != NULL ? lim_data->tgt_loop
+						       : NULL;
+	  bool is_phi = (phi = dyn_cast <gphi *> (use_stmt))
+	      && (gimple_phi_num_args (phi) == 2);
+	  bool is_the_same_level = (tgt_loop == level);
+	  bool is_cond = gimple_code (use_stmt) == GIMPLE_COND;
+	  if ((lim_data == NULL || tgt_loop == NULL
+	       || loop_depth (tgt_loop) > loop_depth (level))
+	       || (is_the_same_level && (is_phi || is_cond))
+	       || tmp_reg_created)
+	    {
+	      bool is_phi_on_the_same_level = is_phi && is_the_same_level;
+	      tree replacement = NULL;
+	      if (is_phi_on_the_same_level
+		   && check_conditionally_executed (use_stmt->bb))
+		{
+		  if (fl_new_name == NULL)
+		    fl_new_name = get_new_name (loop_with_data, name_for_key,
+						true);
+		  replacement = fl_new_name;
+		}
+	      else
+		{
+		  if ((is_phi_on_the_same_level
+		      || !(tmp_reg_created && is_the_same_level)) && new_name == NULL)
+		    new_name = get_new_name (loop_with_data, name_for_key);
+		  replacement = is_the_same_level && !is_phi ? name_for_key : new_name;
+		}
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		{
+		  fprintf (dump_file, "Will replace uses in: ");
+		  print_gimple_stmt (dump_file, use_stmt, 2, 0);
+		}
+	      FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
+		{
+		   SET_USE (use_p, replacement);
+		}
+	      update_stmt (use_stmt);
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		{
+		  fprintf (dump_file, "After replacement: ");
+		  print_gimple_stmt (dump_file, use_stmt, 2, 0);
+		}
+
+	      if (is_cond)
+		{ /* If cond is on the same level, we will leave it in the initial loop,
+		    So we do need phi anyway */
+		  create_phi_node = true;
+		  if ((result & ADDED_TO_STMTS_WITH_CH_LHS) == 0)
+		    {
+		      stmts_with_changed_lhs.safe_push (stmt);
+		      result |= ADDED_TO_STMTS_WITH_CH_LHS;
+		    }
+		  if (new_name == NULL)
+		    new_name = get_new_name (loop_with_data, name_for_key);
+		}
+	    }
+	}
+    }
+  if (new_name != NULL || create_phi_node || fl_new_name != NULL)
+    {
+      ((phi_data *)loop_with_data->aux)->stmts.safe_push (stmt);
+      result |= ADDED_TO_PHI_DATA;
+      if (new_name == NULL && fl_new_name != NULL)
+	{
+	  ((phi_data *)loop_with_data->aux)->stmts_fl_phi_only.safe_push (stmt);
+	  result |= ADDED_TO_FL_PHI_DATA;
 	}
+    }
+  return result;
+}
+/* This is called after computations movement. As there could appear
+   several copies of the same conditional statement
+ (as statements under the same condition could go to different levels),
+   we may want to change some uses in these conditions (definition
+   statements of some operands could also be moved). */
+void replace_uses_in_conditions ()
+{
+  use_operand_p use_p;
+  imm_use_iterator iter;
+  gimple use_stmt;
+  tree new_name;
+  unsigned i;
+  gimple stmt;
+  tree old_var;
+  struct loop *def_loop;
+  struct loop *use_loop;
+  FOR_EACH_VEC_ELT (stmts_with_changed_lhs, i, stmt)
+    {
+      new_name = NULL;
+      old_var = gimple_get_lhs (stmt);
+      def_loop = gimple_bb (stmt)->loop_father;
 
-      if (lim_data->cost >= LIM_EXPENSIVE)
-	set_profitable_level (stmt);
+      FOR_EACH_IMM_USE_STMT (use_stmt, iter, old_var)
+	{
+	  if (gimple_code (use_stmt) != GIMPLE_COND)
+	    continue;
+	  use_loop = gimple_bb (use_stmt)->loop_father;
+	  if (use_loop != def_loop)
+	    {
+	      if (new_name == NULL)
+		new_name = get_new_name (def_loop, old_var);
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		{
+		  fprintf (dump_file, "Statement: ");
+		  print_gimple_stmt (dump_file, use_stmt, 2, 0);
+		}
+	      FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
+		{
+		  SET_USE (use_p, new_name);
+		}
+	      update_stmt (use_stmt);
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		{
+		  fprintf (dump_file, "changed to: ");
+		  print_gimple_stmt (dump_file, use_stmt, 2, 0);
+		}
+	    }
+	}
     }
 }
 
-class move_computations_dom_walker : public dom_walker
+/* Get ssa default definition corresponding to key recursively asking outer
+   loops. Once found for the given loop, the default definition is stored in
+  phi_data->defs_map (loop->aux), so there is no need in recursion in further
+  calls of this function. */
+tree get_default_def (struct loop *loop, tree key, bool is_first_level)
 {
-public:
-  move_computations_dom_walker (cdi_direction direction)
-    : dom_walker (direction), todo_ (0) {}
+  phi_data *data = (phi_data *)loop->aux;
+  tree default_def = NULL;
+  if (data != NULL)
+    {
+      hash_map<tree, tree> *map = is_first_level ? data->defs_map
+				   : data->names_map;
+      if (map != NULL)
+	 default_def = get_name (map, key);
+     if (default_def != NULL)
+	 return default_def;
+    }
+  struct loop *outer_loop = loop_outer (loop);
+  if (outer_loop != NULL)
+    default_def = get_default_def (outer_loop, key, false);
+  if (is_first_level)
+    {
+      if (default_def == NULL)
+	default_def = get_or_create_ssa_default_def (cfun, key);
+      if (data == NULL)
+	data = init_phi_data (loop);
+      if (data->defs_map == NULL)
+	data->defs_map = new hash_map<tree, tree>;
+      data->defs_map->put (key, default_def);
+    }
+  return default_def;
+}
 
-  virtual void before_dom_children (basic_block);
+void conditionally_replace_last (vec<gimple> vec, int cond, gimple stmt)
+{
+  if (cond)
+    {
+      vec.pop ();
+      vec.safe_push (stmt);
+    }
+}
 
-  unsigned int todo_;
-};
+void check_and_correct_vectors (struct loop *level, gimple stmt, int cond)
+{
+  phi_data *data = (phi_data *)level->aux;
+  if (data != NULL)
+    {
+      conditionally_replace_last (data->stmts, cond & ADDED_TO_PHI_DATA, stmt);
+      conditionally_replace_last (data->stmts_fl_phi_only,
+				  cond & ADDED_TO_FL_PHI_DATA, stmt);
+    }
+  conditionally_replace_last (stmts_with_changed_lhs,
+			     cond & ADDED_TO_STMTS_WITH_CH_LHS, stmt);
+}
+
+basic_block get_bb_for_phi (basic_block bb)
+{
+  edge e;
+  basic_block res = bb;
+  while (1)
+    {
+      e = EDGE_SUCC (res, 0);
+      res = e->dest;
+      if (!dominated_by_p (CDI_DOMINATORS, res, bb))
+	break;
+    };
+  return res;
+}
+
+int prepare_edge_for_stm (gimple stmt,
+			  struct loop *level,
+			  bool cond_executed,
+			  edge &e,
+			  basic_block bb)
+{
+  int add_info_for_phi_result = WAS_NOT_ADDED;
+  e = loop_preheader_edge (level);
+
+  if (cond_executed)
+    {
+      basic_block bb_for_stmt
+	 = get_block_for_predicated_statement (COND_STMT (bb),
+					       IS_TRUE_TARGET (bb),
+					       e,
+					       get_cond_data_map (level),
+					       level);
+      add_info_for_phi_result
+	   = add_info_for_phi (stmt, level);
+      e = find_fallthru_edge (bb_for_stmt->succs);
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	  fprintf (dump_file,
+	  "Conditionally executed, created bb to move statement is %d\n",
+	   bb_for_stmt->index);
+
+      }
+
+  return add_info_for_phi_result;
+}
+
+void set_loop_of_preheader_regio (basic_block stmt_bb, struct loop *level,
+				  bool cond_executed)
+{
+  if (stmt_bb != NULL && cond_executed)
+    {
+      SET_LOOP_OF_PREHEADER_REGION (stmt_bb, level);
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	  fprintf (dump_file,
+	 "Statement added to bb = %d\n",
+	 stmt_bb->index);
+    }
+}
 
 /* Hoist the statements in basic block BB out of the loops prescribed by
    data stored in LIM_DATA structures associated with each statement.  Callback
    for walk_dominator_tree.  */
-
 void
 move_computations_dom_walker::before_dom_children (basic_block bb)
 {
   struct loop *level;
   unsigned cost = 0;
   struct lim_aux_data *lim_data;
+  edge e = NULL;
 
   if (!loop_outer (bb->loop_father))
     return;
+  struct loop * always_executed_in = ALWAYS_EXECUTED_IN (bb);
+  bool bb_cond_executed = check_conditionally_executed (bb);
 
   for (gphi_iterator bsi = gsi_start_phis (bb); !gsi_end_p (bsi); )
     {
@@ -1216,7 +2063,6 @@ move_computations_dom_walker::before_dom
 	  fprintf (dump_file, "(cost %u) out of loop %d.\n\n",
 		   cost, level->num);
 	}
-
       if (gimple_phi_num_args (stmt) == 1)
 	{
 	  tree arg = PHI_ARG_DEF (stmt, 0);
@@ -1247,23 +2093,22 @@ move_computations_dom_walker::before_dom
 	  SSA_NAME_RANGE_INFO (lhs) = NULL;
 	  SSA_NAME_ANTI_RANGE_P (lhs) = 0;
 	}
-      gsi_insert_on_edge (loop_preheader_edge (level), new_stmt);
+      prepare_edge_for_stm (new_stmt, level, bb_cond_executed, e, bb);
+      basic_block stmt_bb = gsi_insert_on_edge_immediate (e, new_stmt);
+      set_loop_of_preheader_regio (stmt_bb, level, bb_cond_executed);
       remove_phi_node (&bsi, false);
     }
 
   for (gimple_stmt_iterator bsi = gsi_start_bb (bb); !gsi_end_p (bsi); )
     {
-      edge e;
-
+      bool cond_executed = bb_cond_executed;
       gimple stmt = gsi_stmt (bsi);
-
       lim_data = get_lim_data (stmt);
       if (lim_data == NULL)
 	{
 	  gsi_next (&bsi);
 	  continue;
 	}
-
       cost = lim_data->cost;
       level = lim_data->tgt_loop;
       clear_lim_data (stmt);
@@ -1283,11 +2128,11 @@ move_computations_dom_walker::before_dom
 	{
 	  fprintf (dump_file, "Moving statement\n");
 	  print_gimple_stmt (dump_file, stmt, 0, 0);
-	  fprintf (dump_file, "(cost %u) out of loop %d.\n\n",
-		   cost, level->num);
+	  fprintf (dump_file, " (cost %u) out of loop %d from bb %d.\n\n",
+		   cost, level->num, bb->index);
 	}
 
-      e = loop_preheader_edge (level);
+      edge e = loop_preheader_edge (level);
       gcc_assert (!gimple_vdef (stmt));
       if (gimple_vuse (stmt))
 	{
@@ -1305,36 +2150,256 @@ move_computations_dom_walker::before_dom
 		}
 	    }
 	}
-      gsi_remove (&bsi, false);
+      cond_executed &= !lim_data->ignore_condition;
+      int add_info_for_phi_result =
+	 prepare_edge_for_stm (stmt, level, cond_executed, e, bb);
+
       if (gimple_has_lhs (stmt)
 	  && TREE_CODE (gimple_get_lhs (stmt)) == SSA_NAME
 	  && INTEGRAL_TYPE_P (TREE_TYPE (gimple_get_lhs (stmt)))
-	  && (!ALWAYS_EXECUTED_IN (bb)
-	      || !(ALWAYS_EXECUTED_IN (bb) == level
-		   || flow_loop_nested_p (ALWAYS_EXECUTED_IN (bb), level))))
+	  && (!always_executed_in
+	      || ! (always_executed_in == level
+	      || flow_loop_nested_p (always_executed_in, level))))
 	{
 	  tree lhs = gimple_get_lhs (stmt);
 	  SSA_NAME_RANGE_INFO (lhs) = NULL;
 	  SSA_NAME_ANTI_RANGE_P (lhs) = 0;
 	}
+
+       gsi_remove (&bsi, false);
       /* In case this is a stmt that is not unconditionally executed
-         when the target loop header is executed and the stmt may
+	 when the target loop header is executed and the stmt may
 	 invoke undefined integer or pointer overflow rewrite it to
 	 unsigned arithmetic.  */
+      basic_block stmt_bb = NULL;
       if (is_gimple_assign (stmt)
 	  && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (stmt)))
 	  && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (gimple_assign_lhs (stmt)))
 	  && arith_code_with_undefined_signed_overflow
-	       (gimple_assign_rhs_code (stmt))
-	  && (!ALWAYS_EXECUTED_IN (bb)
-	      || !(ALWAYS_EXECUTED_IN (bb) == level
+	 (gimple_assign_rhs_code (stmt))
+	  && (!always_executed_in
+	      || ! (always_executed_in == level
 		   || flow_loop_nested_p (ALWAYS_EXECUTED_IN (bb), level))))
-	gsi_insert_seq_on_edge (e, rewrite_to_defined_overflow (stmt));
-      else
-	gsi_insert_on_edge (e, stmt);
+	{
+	  gimple_seq seq = rewrite_to_defined_overflow (stmt);
+	  /* stmt which we was going to move could have changed lhs after
+	  rewrite_to_defined_overflow call.
+	  So the last stmt in the sequence will contain another statement,
+	  but with needed lhs. So we need to make the replacement in our
+	  helper structures (check_and_correct_vectors). */
+	  gimple last_stmt = gimple_seq_last_stmt (seq);
+	  stmt_bb = gsi_insert_seq_on_edge_immediate (e, seq);
+	  check_and_correct_vectors (stmt->bb->loop_father, last_stmt,
+				   add_info_for_phi_result);
+	}
+	else
+	  stmt_bb = gsi_insert_on_edge_immediate (e, stmt);
+	todo_ |= TODO_cleanup_cfg;
+	set_loop_of_preheader_regio (stmt_bb, level, cond_executed);
+    }
+}
+
+
+void add_phi_args (gphi *phi, tree lhs, tree default_def,
+		  bool add_default_def, basic_block stmt_bb,
+		  basic_block phi_bb)
+{
+  edge e;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, phi_bb->preds)
+    {
+      if (dominated_by_p (CDI_DOMINATORS, e->src, stmt_bb))
+	add_phi_arg (phi, lhs, e, UNKNOWN_LOCATION);
+      else if (add_default_def)
+	add_phi_arg (phi, default_def, e, UNKNOWN_LOCATION);
+    }
+}
+
+gphi *get_phi_from_map (basic_block bb, tree var,
+		       hash_map<basic_block,
+		       hash_map<tree, gphi*>*> *map)
+{
+  hash_map<tree, gphi*> **p1 = map->get (bb);
+  if (p1 == NULL)
+    return NULL;
+  gphi **p2 = (*p1)->get (var);
+  if (p2 == NULL)
+    return NULL;
+  return *p2;
+}
+
+void put_phi_into_map (basic_block bb, tree var, gphi *phi,
+		      hash_map<basic_block, hash_map<tree, gphi*>*> *map)
+{
+   hash_map<tree, gphi*> **p1 = map->get (bb);
+   hash_map<tree, gphi*> *phi_map;
+   if (p1 == NULL)
+     {
+       phi_map = new hash_map<tree, gphi*>;
+       map->put (bb, phi_map);
+     }
+   else
+     phi_map = *p1;
+   phi_map->put (var, phi);
+}
+
+
+gphi *add_phi_node (basic_block phi_bb,
+		  basic_block stmt_bb, tree lhs,
+		  struct loop *loop, phi_data * data,
+		  bool is_preheader_dominator,
+		  hash_map<basic_block, hash_map<tree, gphi*>*> *map,
+		  bool is_first_phi)
+{
+  tree key = create_key (lhs);
+  gphi *phi = NULL;
+  bool is_tmp = (key == lhs);
+  /* if this is not temorary variable the phi node could be created
+     already (for another branch) */
+  if (!is_tmp)
+    phi = get_phi_from_map (phi_bb, key, map);
+
+  tree default_def = get_default_def (loop, key, true);
+  if (phi == NULL)
+    {
+      phi = create_phi_node (key, phi_bb);
+      if (is_preheader_dominator)
+	{
+	  /* The name of 'the last' phi node should be already
+	     choosen, when moving computations. */
+	  tree phi_name = get_name (data->names_map, key);
+	  gimple_phi_set_result (phi, phi_name);
+	}
+      else if (is_first_phi)
+	{
+	  /* If we move phi with 2 argument this name was choosen
+	     earlier. */
+	  tree phi_name = get_name (data->fl_names_map, key);
+	  if (phi_name != NULL)
+	    gimple_phi_set_result (phi, phi_name);
+	}
+
+      if (!is_tmp)
+	put_phi_into_map (phi_bb, key, phi, map);
+      add_phi_args (phi, lhs, default_def, true, stmt_bb, phi_bb);
+      return phi;
+    }
+  else
+   {
+     add_phi_args (phi, lhs, default_def, false, stmt_bb, phi_bb);
+     return NULL;
+   }
+}
+
+
+void delete_bb_phi_map (hash_map<basic_block, hash_map<tree, gphi*>*> *map)
+{
+  for (hash_map<basic_block, hash_map<tree, gphi*>*>::iterator
+       iter = map->begin ();
+       iter != map->end (); ++iter)
+    {
+      basic_block key = (*iter).first;
+      hash_map<tree, gphi*> **phi_map = map->get (key);
+      delete *phi_map;
+      phi_map = NULL;
+    }
+  delete map;
+  map = NULL;
+}
+
+bool check_fl_phi_cond (phi_data *data, gimple stmt, bool is_first_phi)
+{
+  if (!is_first_phi)
+    return false;
+  vec<gimple> v = data->stmts_fl_phi_only;
+  if (v.length () == 0)
+    return false;
+  if (v.last () == stmt)
+    {
+      v.pop ();
+      return true;
+    }
+  return false;
+}
+
+
+/* After computations movement, we create phi nodes. */
+void create_phi_nodes ()
+{
+  struct loop * loop;
+  phi_data *data;
+  basic_block phi_bb = NULL;
+  basic_block preheader;
+  /* Here we store information about phi nodes in bb.
+  This map corresponding to the current loop in the cycle below.*/
+  hash_map<basic_block, hash_map<tree, gphi*>*> *phi_map;
+  gphi *phi;
+  edge preheader_edge;
+  FOR_EACH_LOOP (loop,1)
+    {
+      phi_map = new hash_map<basic_block, hash_map<tree, gphi*>*>;
+      data = (phi_data *)loop->aux;
+      if (data == NULL)
+	continue;
+      gimple stmt;
+      /* data->stmts keeps the statements which require phi node
+	 because of their lhs uses outside of current level. */
+      while (data->stmts.length () > 0)
+	{
+	  stmt = data->stmts.pop ();
+	  bool is_preheader_dominator = false;
+	  tree lhs = gimple_get_lhs (stmt);
+	  basic_block stmt_bb = stmt->bb;
+	  preheader_edge =
+	   loop_preheader_edge (LOOP_OF_PREHEADER_REGION (stmt_bb));
+	  preheader = preheader_edge->src;
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    {
+	      fprintf (dump_file, "Will create phi node for: ");
+	      print_generic_expr (dump_file, lhs, 0);
+	      fprintf (dump_file, "\n, def bb = %d, loop = %d,\
+		      preheader.src = %d\n",
+		      stmt_bb->index, loop->num, preheader->index);
+	    }
+	  /* We add phi nodes while basic block with phi node doesn't
+	     dominate preheader (of the loop which we moved statement from).
+	     That's why we keep
+	     LOOP_OF_PREHEADER_REGION (stmt_bb) information. */
+	  bool is_first_phi = true;
+	  do
+	    {
+	      phi_bb = get_bb_for_phi (stmt_bb);
+	      is_preheader_dominator = dominated_by_p (CDI_DOMINATORS, preheader,
+						      phi_bb);
+	      phi = add_phi_node (phi_bb, stmt_bb, lhs, loop, data,
+				 is_preheader_dominator,
+				 phi_map,
+				 is_first_phi);
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		{
+		  if (phi == NULL)
+		    fprintf (dump_file, "phi node is NULL");
+		  else
+		    {
+		      fprintf (dump_file, "phi node create in the bb = %d:",
+		       	   phi->bb->index);
+		      print_gimple_stmt (dump_file, phi, 2, 0);
+		    }
+		}
+	      if (phi == NULL || check_fl_phi_cond (data, stmt, is_first_phi))
+		break;
+	      lhs = PHI_RESULT (phi);
+	      stmt_bb = phi_bb;
+	      is_first_phi = false;
+	    }
+	  while (!is_preheader_dominator);
+	}
+      delete_bb_phi_map (phi_map);
     }
 }
 
+
 /* Hoist the statements out of the loops prescribed by data stored in
    LIM_DATA structures associated with each statement.*/
 
@@ -1344,10 +2409,11 @@ move_computations (void)
   move_computations_dom_walker walker (CDI_DOMINATORS);
   walker.walk (cfun->cfg->x_entry_block_ptr);
 
-  gsi_commit_edge_inserts ();
+  create_phi_nodes ();
+  replace_uses_in_conditions ();
+
   if (need_ssa_update_p (cfun))
     rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
-
   return walker.todo_;
 }
 
@@ -1564,8 +2630,8 @@ static unsigned *bb_loop_postorder;
 static int
 sort_bbs_in_loop_postorder_cmp (const void *bb1_, const void *bb2_)
 {
-  basic_block bb1 = *(basic_block *)const_cast<void *>(bb1_);
-  basic_block bb2 = *(basic_block *)const_cast<void *>(bb2_);
+  basic_block bb1 = * (basic_block *)const_cast<void *> (bb1_);
+  basic_block bb2 = * (basic_block *)const_cast<void *> (bb2_);
   struct loop *loop1 = bb1->loop_father;
   struct loop *loop2 = bb2->loop_father;
   if (loop1->num == loop2->num)
@@ -1578,8 +2644,8 @@ sort_bbs_in_loop_postorder_cmp (const vo
 static int
 sort_locs_in_loop_postorder_cmp (const void *loc1_, const void *loc2_)
 {
-  mem_ref_loc *loc1 = (mem_ref_loc *)const_cast<void *>(loc1_);
-  mem_ref_loc *loc2 = (mem_ref_loc *)const_cast<void *>(loc2_);
+  mem_ref_loc *loc1 = (mem_ref_loc *)const_cast<void *> (loc1_);
+  mem_ref_loc *loc2 = (mem_ref_loc *)const_cast<void *> (loc2_);
   struct loop *loop1 = gimple_bb (loc1->stmt)->loop_father;
   struct loop *loop2 = gimple_bb (loc2->stmt)->loop_father;
   if (loop1->num == loop2->num)
@@ -1598,7 +2664,7 @@ analyze_memory_references (void)
   unsigned i, n;
 
   /* Collect all basic-blocks in loops and sort them after their
-     loops postorder.  */
+   ps postorder.  */
   i = 0;
   bbs = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS);
   FOR_EACH_BB_FN (bb, cfun)
@@ -1613,7 +2679,7 @@ analyze_memory_references (void)
     {
       basic_block bb = bbs[i];
       for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
-        gather_mem_refs_stmt (bb->loop_father, gsi_stmt (bsi));
+	gather_mem_refs_stmt (bb->loop_father, gsi_stmt (bsi));
     }
 
   /* Sort the location list of gathered memory references after their
@@ -1685,8 +2751,8 @@ mem_refs_may_alias_p (mem_ref_p mem1, me
 static int
 find_ref_loc_in_loop_cmp (const void *loop_, const void *loc_)
 {
-  struct loop *loop = (struct loop *)const_cast<void *>(loop_);
-  mem_ref_loc *loc = (mem_ref_loc *)const_cast<void *>(loc_);
+  struct loop *loop = (struct loop *)const_cast<void *> (loop_);
+  mem_ref_loc *loc = (mem_ref_loc *)const_cast<void *> (loc_);
   struct loop *loc_loop = gimple_bb (loc->stmt)->loop_father;
   if (loop->num  == loc_loop->num
       || flow_loop_nested_p (loop, loc_loop))
@@ -1696,7 +2762,7 @@ find_ref_loc_in_loop_cmp (const void *lo
 }
 
 /* Iterates over all locations of REF in LOOP and its subloops calling
-   fn.operator() with the location as argument.  When that operator
+   fn.operator () with the location as argument.  When that operator
    returns true the iteration is stopped and true is returned.
    Otherwise false is returned.  */
 
@@ -1808,9 +2874,9 @@ struct prev_flag_edges {
 
      for (...) {
        if (foo)
-         stuff;
+	 stuff;
        else
-         MEM = TMP_VAR;
+	 MEM = TMP_VAR;
      }
 
    into:
@@ -1818,9 +2884,9 @@ struct prev_flag_edges {
      lsm = MEM;
      for (...) {
        if (foo)
-         stuff;
+	 stuff;
        else
-         lsm = TMP_VAR;
+	 lsm = TMP_VAR;
      }
      MEM = lsm;
 
@@ -1832,10 +2898,10 @@ struct prev_flag_edges {
      ...
      for (...) {
        if (foo)
-         stuff;
+	 stuff;
        else {
-         lsm = TMP_VAR;
-         lsm_flag = true;
+	 lsm = TMP_VAR;
+	 lsm_flag = true;
        }
      }
      if (lsm_flag)	<--
@@ -2013,7 +3079,8 @@ execute_sm (struct loop *loop, vec<edge>
   /* Emit the load code on a random exit edge or into the latch if
      the loop does not exit, so that we are sure it will be processed
      by move_computations after all dependencies.  */
-  gsi = gsi_for_stmt (first_mem_ref_loc (loop, ref)->stmt);
+  gimple stmt = first_mem_ref_loc (loop, ref)->stmt;
+  gsi = gsi_for_stmt (stmt);
 
   /* FIXME/TODO: For the multi-threaded variant, we could avoid this
      load altogether, since the store is predicated by a flag.  We
@@ -2022,12 +3089,18 @@ execute_sm (struct loop *loop, vec<edge>
   lim_data = init_lim_data (load);
   lim_data->max_loop = loop;
   lim_data->tgt_loop = loop;
+  basic_block bb = gimple_bb (stmt);
+  bool ignore_condition = IS_TRUE_OR_FALSE_TARGET (bb);
+  lim_data->ignore_condition = ignore_condition;
+
+
   gsi_insert_before (&gsi, load, GSI_SAME_STMT);
 
   if (multi_threaded_model_p)
     {
       load = gimple_build_assign (store_flag, boolean_false_node);
       lim_data = init_lim_data (load);
+      lim_data->ignore_condition = ignore_condition;
       lim_data->max_loop = loop;
       lim_data->tgt_loop = loop;
       gsi_insert_before (&gsi, load, GSI_SAME_STMT);
@@ -2202,6 +3275,7 @@ ref_indep_loop_p_2 (struct loop *loop, m
 
   bool indep_p = ref_indep_loop_p_1 (loop, ref, stored_p);
 
+
   if (dump_file && (dump_flags & TDF_DETAILS))
     fprintf (dump_file, "Querying dependencies of ref %u in loop %d: %s\n",
 	     ref->id, loop->num, indep_p ? "independent" : "dependent");
@@ -2428,9 +3502,12 @@ fill_always_executed_in_1 (struct loop *
     fill_always_executed_in_1 (loop, contains_call);
 }
 
-/* Fills ALWAYS_EXECUTED_IN information for basic blocks, i.e.
-   for each such basic block bb records the outermost loop for that execution
-   of its header implies execution of bb.  */
+
+/* Fills move_cond information for basic blocks (either ALWAYS_EXECUTED_IN
+   or IS_[TRUE/FALSE]_TARGET, i.e.
+   for each such basic block bb records either the outermost loop for that execution
+   of its header implies execution of bb or this is conditionally executed bb
+   or nothing.  */
 
 static void
 fill_always_executed_in (void)
@@ -2444,8 +3521,8 @@ fill_always_executed_in (void)
     {
       gimple_stmt_iterator gsi;
       for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
-	{
-	  if (nonpure_call_p (gsi_stmt (gsi)))
+	{ gimple stmt = gsi_stmt (gsi);
+	  if (nonpure_call_p (stmt))
 	    break;
 	}
 
@@ -2471,6 +3548,8 @@ tree_ssa_lim_initialize (void)
   bitmap_obstack_initialize (&lim_bitmap_obstack);
   gcc_obstack_init (&mem_ref_obstack);
   lim_aux_data_map = new hash_map<gimple, lim_aux_data *>;
+  level_cond_data_map = new hash_map<struct loop*, cond_data_map *>;
+  post_dominators_map = new hash_map<basic_block, basic_block>;
 
   if (flag_tm)
     compute_transaction_bits ();
@@ -2481,7 +3560,7 @@ tree_ssa_lim_initialize (void)
   memory_accesses.refs_list.create (100);
   /* Allocate a special, unanalyzable mem-ref with ID zero.  */
   memory_accesses.refs_list.quick_push
-    (mem_ref_alloc (error_mark_node, 0, UNANALYZABLE_MEM_ID));
+ (mem_ref_alloc (error_mark_node, 0, UNANALYZABLE_MEM_ID));
 
   memory_accesses.refs_in_loop.create (number_of_loops (cfun));
   memory_accesses.refs_in_loop.quick_grow (number_of_loops (cfun));
@@ -2510,6 +3589,7 @@ tree_ssa_lim_initialize (void)
     bb_loop_postorder[loop->num] = i++;
 }
 
+
 /* Cleans up after the invariant motion pass.  */
 
 static void
@@ -2522,10 +3602,25 @@ tree_ssa_lim_finalize (void)
   free_aux_for_edges ();
 
   FOR_EACH_BB_FN (bb, cfun)
-    SET_ALWAYS_EXECUTED_IN (bb, NULL);
+  {
+      FREE_AUX_FOR_BB (bb);
+  }
 
   bitmap_obstack_release (&lim_bitmap_obstack);
   delete lim_aux_data_map;
+  delete post_dominators_map;
+
+  for (hash_map<struct loop * , cond_data_map *>::iterator iter
+	  = level_cond_data_map->begin ();
+       iter != level_cond_data_map->end (); ++iter)
+  {
+    cond_data_map **map = level_cond_data_map->get ((*iter).first);
+    delete *map;
+    map = NULL;
+  }
+
+
+  delete level_cond_data_map;
 
   delete memory_accesses.refs;
   memory_accesses.refs = NULL;
@@ -2543,21 +3638,41 @@ tree_ssa_lim_finalize (void)
     free_affine_expand_cache (&memory_accesses.ttae_cache);
 
   free (bb_loop_postorder);
+  struct loop *loop;
+  FOR_EACH_LOOP (loop, 1)
+  {
+    phi_data *data = (phi_data*)loop->aux;
+    if (data != NULL)
+    {
+      data->stmts.release ();
+      data->stmts_fl_phi_only.release ();
+      delete data->names_map;
+      data->names_map = NULL;
+      delete data->fl_names_map;
+      data->fl_names_map = NULL;
+      delete data->defs_map;
+      data->defs_map = NULL;
+      free (data);
+      loop->aux = NULL;
+    }
+  }
+
+  stmts_with_changed_lhs.release ();
 }
 
-/* Moves invariants from loops.  Only "expensive" invariants are moved out --
+/*  Moves invariants from loops.  Only "expensive" invariants are moved out --
    i.e. those that are likely to be win regardless of the register pressure.  */
 
 unsigned int
 tree_ssa_lim (void)
 {
   unsigned int todo;
-
   tree_ssa_lim_initialize ();
 
   /* Gathers information about memory accesses in the loops.  */
   analyze_memory_references ();
 
+
   /* Fills ALWAYS_EXECUTED_IN information for basic blocks.  */
   fill_always_executed_in ();
 
@@ -2574,7 +3689,6 @@ tree_ssa_lim (void)
   todo = move_computations ();
 
   tree_ssa_lim_finalize ();
-
   return todo;
 }
 
@@ -2627,3 +3741,4 @@ make_pass_lim (gcc::context *ctxt)
 }
 
 
+

next             reply	other threads:[~2015-05-08 21:07 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-08 21:07 Evgeniya Maenkova [this message]
2015-05-09 12:41 ` Andi Kleen
2015-05-26 10:32 ` Richard Biener
2015-05-26 13:38   ` Evgeniya Maenkova
2015-05-27 10:52     ` Richard Biener
2015-05-27 13:36       ` Evgeniya Maenkova
2015-05-29 13:32 ` Evgeniya Maenkova
2015-06-09 11:48   ` Richard Biener
2015-06-09 20:12     ` Evgeniya Maenkova
2015-06-29 13:15       ` Richard Biener
2015-06-29 14:24         ` Evgeniya Maenkova
2015-07-14 10:54           ` Richard Biener
2015-07-15  8:43             ` Evgeniya Maenkova
2015-07-15 10:58               ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CANVW7MPfRvpNjABKuqup9GxbUk7A=BaRqwsPcbP8DRpGoGMWdw@mail.gmail.com' \
    --to=evgeniya.maenkova@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).