public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 1/4] Add the -floop-if-convert flag.
  2010-07-07 20:23 [PATCH 0/4] if-conversion of loops with conditionals containing memory writes Sebastian Pop
@ 2010-07-07 20:22 ` Sebastian Pop
  2010-07-07 21:39   ` Richard Guenther
  2010-07-07 20:23 ` [PATCH 3/4] Add flag -floop-if-convert-memory-writes Sebastian Pop
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Sebastian Pop @ 2010-07-07 20:22 UTC (permalink / raw)
  To: gcc-patches; +Cc: rguenther, Sebastian Pop

	* common.opt (floop-if-convert): New flag.
	* doc/invoke.texi (floop-if-convert): Documented.
	* tree-if-conv.c (gate_tree_if_conversion): Enable if-conversion
	when flag_loop_if_convert is set.
---
 gcc/common.opt      |    4 ++++
 gcc/doc/invoke.texi |   10 ++++++++--
 gcc/tree-if-conv.c  |    3 ++-
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 6ca787a..2a5d391 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -653,6 +653,10 @@ fif-conversion2
 Common Report Var(flag_if_conversion2) Optimization
 Perform conversion of conditional jumps to conditional execution
 
+floop-if-convert
+Common Report Var(flag_loop_if_convert) Optimization
+Convert conditional jumps in innermost loops to branchless equivalents
+
 ; -finhibit-size-directive inhibits output of .size for ELF.
 ; This is used only for compiling crtstuff.c,
 ; and it may be extended to other effects
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d70f130..da14bf9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -342,7 +342,7 @@ Objective-C and Objective-C++ Dialects}.
 -fearly-inlining -fipa-sra -fexpensive-optimizations -ffast-math @gol
 -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
 -fforward-propagate -ffunction-sections @gol
--fgcse -fgcse-after-reload -fgcse-las -fgcse-lm @gol
+-fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
 -fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol
 -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
 -finline-small-functions -fipa-cp -fipa-cp-clone -fipa-matrix-reorg -fipa-pta @gol
@@ -352,7 +352,7 @@ Objective-C and Objective-C++ Dialects}.
 -fira-loop-pressure -fno-ira-share-save-slots @gol
 -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
 -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
--floop-block -floop-interchange -floop-strip-mine -fgraphite-identity @gol
+-floop-block -floop-if-convert -floop-interchange -floop-strip-mine @gol
 -floop-parallelize-all -flto -flto-compression-level -flto-report -fltrans @gol
 -fltrans-output-list -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
 -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
@@ -6883,6 +6883,12 @@ profitable to parallelize the loops.
 Compare the results of several data dependence analyzers.  This option
 is used for debugging the data dependence analyzers.
 
+@item -floop-if-convert
+Attempt to transform conditional jumps in the innermost loops to
+branch-less equivalents.  The intent is to remove control-flow from
+the innermost loops in order to improve the ability of the
+auto-vectorization pass to handle these loops.
+
 @item -ftree-loop-distribution
 Perform loop distribution.  This flag can improve cache performance on
 big loop bodies and allow further loop optimizations, like
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 8d5d226..ad106d7 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -1242,7 +1242,8 @@ main_tree_if_conversion (void)
 static bool
 gate_tree_if_conversion (void)
 {
-  return flag_tree_vectorize != 0;
+  return flag_tree_vectorize
+    || flag_loop_if_convert;
 }
 
 struct gimple_opt_pass pass_if_conversion =
-- 
1.7.0.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 4/4] Do not check whether memory references accessed in every iteration trap.
  2010-07-07 20:23 [PATCH 0/4] if-conversion of loops with conditionals containing memory writes Sebastian Pop
  2010-07-07 20:22 ` [PATCH 1/4] Add the -floop-if-convert flag Sebastian Pop
  2010-07-07 20:23 ` [PATCH 3/4] Add flag -floop-if-convert-memory-writes Sebastian Pop
@ 2010-07-07 20:23 ` Sebastian Pop
  2010-07-08 12:29   ` Michael Matz
  2010-07-07 20:24 ` [PATCH 2/4] Call maybe_fold_or_comparisons to fold OR-ed predicates Sebastian Pop
  3 siblings, 1 reply; 17+ messages in thread
From: Sebastian Pop @ 2010-07-07 20:23 UTC (permalink / raw)
  To: gcc-patches; +Cc: rguenther, Sebastian Pop

This patch relaxes the checks from gimple_could_trap_p in order to
allow the flag_loop_if_convert_memory_writes to if-convert more loops
in which it is possible to prove that:

- the accesses to an array in a loop do not trap (more than the
  original non-if-converted loop).  This is true when the memory
  accesses are executed at every iteration of the if-converted loop.

- the writes to memory occur on arrays that are not const qualified.
  This is true when there exists at least one unconditional write to
  the array in the analyzed program.  In this patch this analysis is
  limited to the loop to be if-converted.

	* gimple.c (gimple_op_could_trap_p): Outlined from
	gimple_could_trap_p_1.
	(gimple_could_trap_p_1): Call gimple_op_could_trap_p.
	* gimple.h (gimple_op_could_trap_p): Declared.
	* tree-data-ref.h (same_data_refs_base_objects): New.
	(same_data_refs): New.
	* tree-if-conv.c (memrefs_read_or_written_unconditionally): New.
	(write_memrefs_written_at_least_once): New.
	(ifcvt_memrefs_wont_trap): New.
	(operations_could_trap): New.
	(ifcvt_could_trap_p): New.
	(if_convertible_gimple_assign_stmt_p): Call ifcvt_could_trap_p.
	Gets a vector of data refs.
	(if_convertible_stmt_p): Same.
	(if_convertible_loop_p): Before freeing the data refs vector,
	pass it to if_convertible_stmt_p.

	* gcc.dg/tree-ssa/ifc-5.c: New.
	* gcc.dg/vect/vect-cond-1.c: Update pattern.
	* gcc.dg/vect/vect-cond-3.c: Same.
	* gcc.dg/vect/vect-cond-4.c: Same.
	* gcc.dg/vect/vect-cond-6.c: Same.
---
 gcc/gimple.c                            |   38 ++++---
 gcc/gimple.h                            |    1 +
 gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c   |   24 ++++
 gcc/testsuite/gcc.dg/vect/vect-cond-1.c |    2 +-
 gcc/testsuite/gcc.dg/vect/vect-cond-3.c |    2 +-
 gcc/testsuite/gcc.dg/vect/vect-cond-4.c |    2 +-
 gcc/testsuite/gcc.dg/vect/vect-cond-6.c |    1 +
 gcc/tree-data-ref.h                     |   34 +++++
 gcc/tree-if-conv.c                      |  211 +++++++++++++++++++++++++++----
 9 files changed, 270 insertions(+), 45 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c

diff --git a/gcc/gimple.c b/gcc/gimple.c
index fa5b804..a082940 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -2385,24 +2385,14 @@ gimple_rhs_has_side_effects (const_gimple s)
   return false;
 }
 
+/* Helper for gimple_could_trap_p_1.  Return true if the operation of
+   S can trap.  */
 
-/* Helper for gimple_could_trap_p and gimple_assign_rhs_could_trap_p.
-   Return true if S can trap.  If INCLUDE_LHS is true and S is a
-   GIMPLE_ASSIGN, the LHS of the assignment is also checked.
-   Otherwise, only the RHS of the assignment is checked.  */
-
-static bool
-gimple_could_trap_p_1 (gimple s, bool include_lhs)
+bool
+gimple_op_could_trap_p (gimple s)
 {
-  unsigned i, start;
-  tree t, div = NULL_TREE;
   enum tree_code op;
-
-  start = (is_gimple_assign (s) && !include_lhs) ? 1 : 0;
-
-  for (i = start; i < gimple_num_ops (s); i++)
-    if (tree_could_trap_p (gimple_op (s, i)))
-      return true;
+  tree t, div = NULL_TREE;
 
   switch (gimple_code (s))
     {
@@ -2431,7 +2421,25 @@ gimple_could_trap_p_1 (gimple s, bool include_lhs)
     }
 
   return false;
+}
+
+/* Helper for gimple_could_trap_p and gimple_assign_rhs_could_trap_p.
+   Return true if S can trap.  If INCLUDE_LHS is true and S is a
+   GIMPLE_ASSIGN, the LHS of the assignment is also checked.
+   Otherwise, only the RHS of the assignment is checked.  */
+
+static bool
+gimple_could_trap_p_1 (gimple s, bool include_lhs)
+{
+  unsigned i, start;
+
+  start = (is_gimple_assign (s) && !include_lhs) ? 1 : 0;
+
+  for (i = start; i < gimple_num_ops (s); i++)
+    if (tree_could_trap_p (gimple_op (s, i)))
+      return true;
 
+  return gimple_op_could_trap_p (s);
 }
 
 
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 7d2289b..2716f6a 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -886,6 +886,7 @@ gimple gimple_build_cond_from_tree (tree, tree, tree);
 void gimple_cond_set_condition_from_tree (gimple, tree);
 bool gimple_has_side_effects (const_gimple);
 bool gimple_rhs_has_side_effects (const_gimple);
+bool gimple_op_could_trap_p (gimple);
 bool gimple_could_trap_p (gimple);
 bool gimple_assign_rhs_could_trap_p (gimple);
 void gimple_regimplify_operands (gimple, gimple_stmt_iterator *);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c
new file mode 100644
index 0000000..a9cc816
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-c -O2 -ftree-vectorize -fdump-tree-ifcvt-stats" { target *-*-* } } */
+
+void
+dct_unquantize_h263_inter_c (short *block, int n, int qscale, int nCoeffs)
+{
+  int i, level, qmul, qadd;
+
+  qadd = (qscale - 1) | 1;
+  qmul = qscale << 1;
+
+  for (i = 0; i <= nCoeffs; i++)
+    {
+      level = block[i];
+      if (level < 0)
+	level = level * qmul - qadd;
+      else
+	level = level * qmul + qadd;
+      block[i] = level;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */
+/* { dg-final { cleanup-tree-dump "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-cond-1.c b/gcc/testsuite/gcc.dg/vect/vect-cond-1.c
index 4ee6713..888e5cb 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-cond-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-cond-1.c
@@ -52,7 +52,7 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-cond-3.c b/gcc/testsuite/gcc.dg/vect/vect-cond-3.c
index 56cfbb2..42b9f35 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-cond-3.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-cond-3.c
@@ -60,7 +60,7 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-cond-4.c b/gcc/testsuite/gcc.dg/vect/vect-cond-4.c
index c3a1585..1d1d8fc 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-cond-4.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-cond-4.c
@@ -57,7 +57,7 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-cond-6.c b/gcc/testsuite/gcc.dg/vect/vect-cond-6.c
index e5e9391..7820444 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-cond-6.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-cond-6.c
@@ -56,5 +56,6 @@ int main ()
 }
 
 /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
       
diff --git a/gcc/tree-data-ref.h b/gcc/tree-data-ref.h
index eff5348..391d6fc 100644
--- a/gcc/tree-data-ref.h
+++ b/gcc/tree-data-ref.h
@@ -417,6 +417,40 @@ extern void create_rdg_vertices (struct graph *, VEC (gimple, heap) *);
 extern bool dr_may_alias_p (const struct data_reference *,
 			    const struct data_reference *);
 
+
+/* Return true when the base objects of data references A and B are
+   the same memory object.  */
+
+static inline bool
+same_data_refs_base_objects (data_reference_p a, data_reference_p b)
+{
+  return DR_NUM_DIMENSIONS (a) == DR_NUM_DIMENSIONS (b)
+    && dr_may_alias_p (a, b)
+    && operand_equal_p (DR_BASE_OBJECT (a), DR_BASE_OBJECT (b), 0);
+}
+
+/* Return true when the data references A and B are accessing the same
+   memory object with the same access functions.  */
+
+static inline bool
+same_data_refs (data_reference_p a, data_reference_p b)
+{
+  unsigned int i;
+
+  /* The references are exactly the same.  */
+  if (operand_equal_p (DR_REF (a), DR_REF (b), 0))
+    return true;
+
+  if (!same_data_refs_base_objects (a, b))
+    return false;
+
+  for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
+    if (!eq_evolutions_p (DR_ACCESS_FN (a, i), DR_ACCESS_FN (b, i)))
+      return false;
+
+  return true;
+}
+
 /* Return true when the DDR contains two data references that have the
    same access functions.  */
 
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 2733715..d262d45 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -442,6 +442,157 @@ if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi)
   return true;
 }
 
+/* Returns true when the memory references of STMT are read or written
+   unconditionally.  */
+
+static bool
+memrefs_read_or_written_unconditionally (gimple stmt,
+					 VEC (data_reference_p, heap) *refs)
+{
+  struct data_reference *a, *b;
+  unsigned int i, j;
+  tree cond = bb_predicate (gimple_bb (stmt));
+
+  for (i = 0; VEC_iterate (data_reference_p, refs, i, a); i++)
+    if (DR_STMT (a) == stmt)
+      {
+	for (j = 0; VEC_iterate (data_reference_p, refs, j, b); j++)
+	  if (i != j && same_data_refs (a, b))
+	    {
+	      basic_block bb;
+
+	      if (!is_predicated (gimple_bb (DR_STMT (b))))
+		{
+		  cond = boolean_true_node;
+		  break;
+		}
+
+	      bb = gimple_bb (DR_STMT (b));
+	      cond = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, cond,
+				  bb_predicate (bb));
+
+	      if (is_true_predicate (cond))
+		break;
+	    }
+
+	if (!is_true_predicate (cond))
+	  return false;
+      }
+
+  return true;
+}
+
+/* Returns true when the memory references of STMT are unconditionally
+   written.  */
+
+static bool
+write_memrefs_written_at_least_once (gimple stmt,
+				     VEC (data_reference_p, heap) *refs)
+{
+  struct data_reference *a, *b;
+  unsigned int i, j;
+  tree cond = bb_predicate (gimple_bb (stmt));
+
+  for (i = 0; VEC_iterate (data_reference_p, refs, i, a); i++)
+    if (DR_STMT (a) == stmt
+	&& !DR_IS_READ (a))
+      {
+	for (j = 0; VEC_iterate (data_reference_p, refs, j, b); j++)
+	  if (i != j
+	      && !DR_IS_READ (b)
+	      && same_data_refs_base_objects (a, b))
+	    {
+	      basic_block bb;
+
+	      if (!is_predicated (gimple_bb (DR_STMT (b))))
+		{
+		  cond = boolean_true_node;
+		  break;
+		}
+
+	      bb = gimple_bb (DR_STMT (b));
+	      cond = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, cond,
+				  bb_predicate (bb));
+
+	      if (is_true_predicate (cond))
+		break;
+	    }
+
+	if (!is_true_predicate (cond))
+	  return false;
+      }
+
+  return true;
+}
+
+/* Return true when the memory references of STMT won't trap in the
+   if-converted code.  There are two things that we have to check for:
+
+   - writes to memory occur to writable memory: if-conversion of
+   memory writes transforms the conditional memory writes into
+   unconditional writes, i.e. "if (cond) A[i] = foo" is transformed
+   into "A[i] = cond ? foo : A[i]", and as the write to memory may not
+   be executed at all in the original code, it may be a readonly
+   memory.  To check that A is not const-qualified, we check that
+   there exists at least an unconditional write to A in the current
+   function.
+
+   - reads or writes to memory are valid memory accesses for every
+   iteration.  To check that the memory accesses are correctly formed
+   and that we are allowed to read and write in these locations, we
+   check that the memory accesses to be if-converted occur at every
+   iteration unconditionally.  */
+
+static bool
+ifcvt_memrefs_wont_trap (gimple stmt, VEC (data_reference_p, heap) *refs)
+{
+  return write_memrefs_written_at_least_once (stmt, refs)
+    && memrefs_read_or_written_unconditionally (stmt, refs);
+}
+
+/* Same as gimple_could_trap_p_1, returns true when the statement S
+   could trap, but do not check the memory references.  */
+
+static bool
+operations_could_trap (gimple s)
+{
+  unsigned i;
+
+  for (i = 0; i < gimple_num_ops (s); i++)
+    {
+      tree expr = gimple_op (s, i);
+
+      switch (TREE_CODE (expr))
+	{
+	case ARRAY_REF:
+	case INDIRECT_REF:
+	case MISALIGNED_INDIRECT_REF:
+	  break;
+
+	default:
+	  if (tree_could_trap_p (expr))
+	    return true;
+	}
+    }
+
+  return gimple_op_could_trap_p (s);
+}
+
+/* Wrapper around gimple_could_trap_p refined for the needs of the
+   if-conversion.  Try to prove that the memory accesses of STMT could
+   not trap in the innermost loop containing STMT.  */
+
+static bool
+ifcvt_could_trap_p (gimple stmt, VEC (data_reference_p, heap) *refs)
+{
+  if (gimple_has_mem_ops (stmt)
+      && ifcvt_memrefs_wont_trap (stmt, refs)
+      && !operations_could_trap (stmt))
+    return false;
+
+  return gimple_could_trap_p (stmt);
+}
+
 /* Return true when STMT is if-convertible.
 
    GIMPLE_ASSIGN statement is not if-convertible if,
@@ -450,7 +601,8 @@ if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi)
    - LHS is not var decl.  */
 
 static bool
-if_convertible_gimple_assign_stmt_p (gimple stmt)
+if_convertible_gimple_assign_stmt_p (gimple stmt,
+				     VEC (data_reference_p, heap) *refs)
 {
   tree lhs = gimple_assign_lhs (stmt);
   basic_block bb;
@@ -478,7 +630,7 @@ if_convertible_gimple_assign_stmt_p (gimple stmt)
 
   if (flag_loop_if_convert_memory_writes)
     {
-      if (gimple_could_trap_p (stmt))
+      if (ifcvt_could_trap_p (stmt, refs))
 	{
 	  if (dump_file && (dump_flags & TDF_DETAILS))
 	    fprintf (dump_file, "tree could trap...\n");
@@ -518,7 +670,7 @@ if_convertible_gimple_assign_stmt_p (gimple stmt)
    - it is a GIMPLE_LABEL or a GIMPLE_COND.  */
 
 static bool
-if_convertible_stmt_p (gimple stmt)
+if_convertible_stmt_p (gimple stmt, VEC (data_reference_p, heap) *refs)
 {
   switch (gimple_code (stmt))
     {
@@ -528,7 +680,7 @@ if_convertible_stmt_p (gimple stmt)
       return true;
 
     case GIMPLE_ASSIGN:
-      return if_convertible_gimple_assign_stmt_p (stmt);
+      return if_convertible_gimple_assign_stmt_p (stmt, refs);
 
     default:
       /* Don't know what to do with 'em so don't do anything.  */
@@ -811,6 +963,9 @@ if_convertible_loop_p (struct loop *loop)
   edge e;
   edge_iterator ei;
   basic_block exit_bb = NULL;
+  bool res = false;
+  VEC (data_reference_p, heap) *refs;
+  VEC (ddr_p, heap) *ddrs;
 
   /* Handle only innermost loop.  */
   if (!loop || loop->inner)
@@ -848,18 +1003,14 @@ if_convertible_loop_p (struct loop *loop)
 
   /* Don't if-convert the loop when the data dependences cannot be
      computed: the loop won't be vectorized in that case.  */
-  {
-    VEC (data_reference_p, heap) *refs = VEC_alloc (data_reference_p, heap, 5);
-    VEC (ddr_p, heap) *ddrs = VEC_alloc (ddr_p, heap, 25);
-    bool res = compute_data_dependences_for_loop (loop, true, &refs, &ddrs);
+  refs = VEC_alloc (data_reference_p, heap, 5);
+  ddrs = VEC_alloc (ddr_p, heap, 25);
+  res = compute_data_dependences_for_loop (loop, true, &refs, &ddrs);
 
-    free_data_refs (refs);
-    free_dependence_relations (ddrs);
-
-    if (!res)
-      return false;
-  }
+  if (!res)
+    goto cleanup;
 
+  res = false;
   calculate_dominance_info (CDI_DOMINATORS);
 
   /* Allow statements that can be handled during if-conversion.  */
@@ -868,7 +1019,7 @@ if_convertible_loop_p (struct loop *loop)
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file, "Irreducible loop\n");
-      return false;
+      goto cleanup;
     }
 
   for (i = 0; i < loop->num_nodes; i++)
@@ -876,14 +1027,18 @@ if_convertible_loop_p (struct loop *loop)
       basic_block bb = ifc_bbs[i];
 
       if (!if_convertible_bb_p (loop, bb, exit_bb))
-	return false;
+	goto cleanup;
 
       if (bb_with_exit_edge_p (loop, bb))
 	exit_bb = bb;
     }
 
-  if (!predicate_bbs (loop))
-    return false;
+  res = predicate_bbs (loop);
+
+  if (!res)
+    goto cleanup;
+
+  res = false;
 
   for (i = 0; i < loop->num_nodes; i++)
     {
@@ -892,21 +1047,23 @@ if_convertible_loop_p (struct loop *loop)
 
       for (itr = gsi_start_phis (bb); !gsi_end_p (itr); gsi_next (&itr))
 	if (!if_convertible_phi_p (loop, bb, gsi_stmt (itr)))
-	  return false;
-
-      /* For non predicated BBs, don't check their statements.  */
-      if (!is_predicated (bb))
-	continue;
+	  goto cleanup;
 
-      for (itr = gsi_start_bb (bb); !gsi_end_p (itr); gsi_next (&itr))
-	if (!if_convertible_stmt_p (gsi_stmt (itr)))
-	  return false;
+      /* Check the if-convertibility of statements in predicated BBs.  */
+      if (is_predicated (bb))
+	for (itr = gsi_start_bb (bb); !gsi_end_p (itr); gsi_next (&itr))
+	  if (!if_convertible_stmt_p (gsi_stmt (itr), refs))
+	    goto cleanup;
     }
 
+  res = true;
   if (dump_file)
     fprintf (dump_file, "Applying if-conversion\n");
 
-  return true;
+ cleanup:
+  free_data_refs (refs);
+  free_dependence_relations (ddrs);
+  return res;
 }
 
 /* Basic block BB has two predecessors.  Using predecessor's bb
-- 
1.7.0.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 3/4] Add flag -floop-if-convert-memory-writes.
  2010-07-07 20:23 [PATCH 0/4] if-conversion of loops with conditionals containing memory writes Sebastian Pop
  2010-07-07 20:22 ` [PATCH 1/4] Add the -floop-if-convert flag Sebastian Pop
@ 2010-07-07 20:23 ` Sebastian Pop
  2010-07-07 22:54   ` [PATCH 3/4] Add flag -ftree-loop-if-convert-memory-writes Sebastian Pop
  2010-07-07 20:23 ` [PATCH 4/4] Do not check whether memory references accessed in every iteration trap Sebastian Pop
  2010-07-07 20:24 ` [PATCH 2/4] Call maybe_fold_or_comparisons to fold OR-ed predicates Sebastian Pop
  3 siblings, 1 reply; 17+ messages in thread
From: Sebastian Pop @ 2010-07-07 20:23 UTC (permalink / raw)
  To: gcc-patches; +Cc: rguenther, Sebastian Pop

This patch adds a flag that controls the replacement of the memory
writes that are in predicated basic blocks with a full write:

for (...)
  if (cond)
    A[i] = foo

is replaced with:

for (...)
  A[i] = cond ? foo : A[i]

In order to do this, we have to call gimple_could_trap_p instead of
gimple_assign_rhs_could_trap_p, as we have to also check that the LHS
of assign stmts does not trap.

	* common.opt (floop-if-convert-memory-writes): New flag.
	* doc/invoke.texi (floop-if-convert-memory-writes): Documented.
	* tree-if-conv.c (if_convertible_phi_p): Allow virtual phi nodes when
	flag_loop_if_convert_memory_writes is set.
	(if_convertible_gimple_assign_stmt_p): Allow memory reads and writes
	Do not handle types that do not match is_gimple_reg_type.
	Remove loop and bb parameters.  Call gimple_could_trap_p instead of
	when flag_loop_if_convert_memory_writes	is set, as LHS can contain
	memory refs.
	(if_convertible_stmt_p): Remove loop and bb parameters.  Update calls
	to if_convertible_gimple_assign_stmt_p.
	(if_convertible_loop_p): Update call to if_convertible_stmt_p.
	(gimplify_condition): New.
	(replace_phi_with_cond_gimple_assign_stmt): Renamed
	predicate_scalar_phi.  Do not handle virtual phi nodes.
	(ifconvert_phi_nodes): Renamed predicate_all_scalar_phis.
	Call predicate_scalar_phi.
	(insert_gimplified_predicates): Insert the gimplified predicate of a BB
	just after the labels for flag_loop_if_convert_memory_writes, otherwise
	insert the predicate in the end of the BB.
	(predicate_mem_writes): New.
	(combine_blocks): Call predicate_all_scalar_phis.  When
	flag_loop_if_convert_memory_writes is set, call predicate_mem_writes.
	(tree_if_conversion): Call mark_sym_for_renaming when
	flag_loop_if_convert_memory_writes is set.
	(main_tree_if_conversion): Call update_ssa when
	flag_loop_if_convert_memory_writes is set.

	* gcc.dg/tree-ssa/ifc-4.c: New.
	* gcc.dg/tree-ssa/ifc-7.c: New.
---
 gcc/common.opt                        |    4 +
 gcc/doc/invoke.texi                   |   20 ++++-
 gcc/testsuite/gcc.dg/tree-ssa/ifc-4.c |   53 ++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/ifc-7.c |   29 +++++
 gcc/tree-if-conv.c                    |  181 ++++++++++++++++++++++++++-------
 5 files changed, 247 insertions(+), 40 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifc-4.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifc-7.c

diff --git a/gcc/common.opt b/gcc/common.opt
index 2a5d391..51acb18 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -657,6 +657,10 @@ floop-if-convert
 Common Report Var(flag_loop_if_convert) Optimization
 Convert conditional jumps in innermost loops to branchless equivalents
 
+floop-if-convert-memory-writes
+Common Report Var(flag_loop_if_convert_memory_writes) Optimization
+Also if-convert conditional jumps containing memory writes
+
 ; -finhibit-size-directive inhibits output of .size for ELF.
 ; This is used only for compiling crtstuff.c,
 ; and it may be extended to other effects
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index da14bf9..0dc01d6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -352,7 +352,8 @@ Objective-C and Objective-C++ Dialects}.
 -fira-loop-pressure -fno-ira-share-save-slots @gol
 -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
 -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
--floop-block -floop-if-convert -floop-interchange -floop-strip-mine @gol
+-floop-block -floop-if-convert -floop-if-convert-memory-writes @gol
+-floop-interchange -floop-strip-mine @gol
 -floop-parallelize-all -flto -flto-compression-level -flto-report -fltrans @gol
 -fltrans-output-list -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
 -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
@@ -6889,6 +6890,23 @@ branch-less equivalents.  The intent is to remove control-flow from
 the innermost loops in order to improve the ability of the
 auto-vectorization pass to handle these loops.
 
+@item -floop-if-convert-memory-writes
+Attempt to also if-convert conditional jumps containing memory writes.
+This transformation can be unsafe for multi-threaded programs as it
+transforms conditional memory writes into unconditional memory writes.
+For example,
+@smallexample
+for (i = 0; i < N; i++)
+  if (cond)
+    A[i] = expr;
+@end smallexample
+would be transformed to
+@smallexample
+for (i = 0; i < N; i++)
+  A[i] = cond ? expr : A[i];
+@end smallexample
+potentially producing data races.
+
 @item -ftree-loop-distribution
 Perform loop distribution.  This flag can improve cache performance on
 big loop bodies and allow further loop optimizations, like
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-4.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-4.c
new file mode 100644
index 0000000..beb1a0e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-4.c
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-options "-c -O2 -ftree-vectorize -fdump-tree-ifcvt-stats" { target *-*-* } } */
+
+struct ht
+{
+  void * (*alloc_subobject) (int);
+};
+typedef struct cpp_reader cpp_reader;
+typedef struct cpp_token cpp_token;
+typedef struct cpp_macro cpp_macro;
+enum cpp_ttype
+{
+    CPP_PASTE,
+};
+struct cpp_token {
+  __extension__ enum cpp_ttype type : 8;
+} cpp_comment_table;
+struct cpp_macro {
+  union cpp_macro_u
+  {
+    cpp_token * tokens;
+  } exp;
+  unsigned int count;
+};
+struct cpp_reader
+{
+  struct ht *hash_table;
+};
+create_iso_definition (cpp_reader *pfile, cpp_macro *macro)
+{
+  unsigned int num_extra_tokens = 0;
+  {
+    cpp_token *tokns =
+      (cpp_token *) pfile->hash_table->alloc_subobject (sizeof (cpp_token)
+							* macro->count);
+    {
+      cpp_token *normal_dest = tokns;
+      cpp_token *extra_dest = tokns + macro->count - num_extra_tokens;
+      unsigned int i;
+      for (i = 0; i < macro->count; i++)
+	{
+	  if (macro->exp.tokens[i].type == CPP_PASTE)
+	    *extra_dest++ = macro->exp.tokens[i];
+	  else
+	    *normal_dest++ = macro->exp.tokens[i];
+	}
+    }
+  }
+}
+
+/* This cannot be if-converted because the stores are to aggregate types.  */
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 0 "ifcvt" } } */
+/* { dg-final { cleanup-tree-dump "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-7.c
new file mode 100644
index 0000000..4d26dc7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-7.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-c -O2 -ftree-vectorize" { target *-*-* } } */
+
+typedef struct eqn_d
+{
+  int *coef;
+} *eqn;
+typedef struct omega_pb_d
+{
+  eqn subs;
+} *omega_pb;
+
+omega_pb omega_solve_problem (omega_pb);
+
+omega_pb
+omega_solve_geq (omega_pb pb, int n)
+{
+  int i, e;
+  int j = 0;
+
+  for (e = n - 1; e >= 0; e--)
+    if (pb->subs[e].coef[i] != pb->subs[e].coef[j])
+      {
+	pb->subs[e].coef[i] = j;
+	pb->subs[e].coef[j] = i;
+      }
+
+  return omega_solve_problem (pb);
+}
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 03e453e..2733715 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -387,9 +387,12 @@ bb_with_exit_edge_p (struct loop *loop, basic_block bb)
    and it belongs to basic block BB.
 
    PHI is not if-convertible if:
-   - it has more than 2 arguments,
-   - virtual PHI is immediately used in another PHI node,
-   - virtual PHI on BB other than header.  */
+   - it has more than 2 arguments.
+
+   When the flag_loop_if_convert_memory_writes is not set, PHI is not
+   if-convertible if:
+   - a virtual PHI is immediately used in another PHI node,
+   - there is a virtual PHI in a BB other than the loop->header.  */
 
 static bool
 if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi)
@@ -407,6 +410,12 @@ if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi)
       return false;
     }
 
+  if (flag_loop_if_convert_memory_writes)
+    return true;
+
+  /* When the flag_loop_if_convert_memory_writes is not set, check
+     that there are no memory writes in the branches of the loop to be
+     if-converted.  */
   if (!is_gimple_reg (SSA_NAME_VAR (gimple_phi_result (phi))))
     {
       imm_use_iterator imm_iter;
@@ -415,9 +424,10 @@ if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi)
       if (bb != loop->header)
 	{
 	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "Virtual phi not on loop header.\n");
+	    fprintf (dump_file, "Virtual phi not on loop->header.\n");
 	  return false;
 	}
+
       FOR_EACH_IMM_USE_FAST (use_p, imm_iter, gimple_phi_result (phi))
 	{
 	  if (gimple_code (USE_STMT (use_p)) == GIMPLE_PHI)
@@ -437,15 +447,13 @@ if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi)
    GIMPLE_ASSIGN statement is not if-convertible if,
    - it is not movable,
    - it could trap,
-   - LHS is not var decl.
-
-   GIMPLE_ASSIGN is part of block BB, which is inside loop LOOP.  */
+   - LHS is not var decl.  */
 
 static bool
-if_convertible_gimple_assign_stmt_p (struct loop *loop, basic_block bb,
-    				     gimple stmt)
+if_convertible_gimple_assign_stmt_p (gimple stmt)
 {
   tree lhs = gimple_assign_lhs (stmt);
+  basic_block bb;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -453,6 +461,9 @@ if_convertible_gimple_assign_stmt_p (struct loop *loop, basic_block bb,
       print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
     }
 
+  if (!is_gimple_reg_type (TREE_TYPE (lhs)))
+    return false;
+
   /* Some of these constrains might be too conservative.  */
   if (stmt_ends_bb_p (stmt)
       || gimple_has_volatile_ops (stmt)
@@ -465,6 +476,17 @@ if_convertible_gimple_assign_stmt_p (struct loop *loop, basic_block bb,
       return false;
     }
 
+  if (flag_loop_if_convert_memory_writes)
+    {
+      if (gimple_could_trap_p (stmt))
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file, "tree could trap...\n");
+	  return false;
+	}
+      return true;
+    }
+
   if (gimple_assign_rhs_could_trap_p (stmt))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
@@ -472,9 +494,11 @@ if_convertible_gimple_assign_stmt_p (struct loop *loop, basic_block bb,
       return false;
     }
 
+  bb = gimple_bb (stmt);
+
   if (TREE_CODE (lhs) != SSA_NAME
-      && bb != loop->header
-      && !bb_with_exit_edge_p (loop, bb))
+      && bb != bb->loop_father->header
+      && !bb_with_exit_edge_p (bb->loop_father, bb))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	{
@@ -491,12 +515,10 @@ if_convertible_gimple_assign_stmt_p (struct loop *loop, basic_block bb,
 
    A statement is if-convertible if:
    - it is an if-convertible GIMPLE_ASSGIN,
-   - it is a GIMPLE_LABEL or a GIMPLE_COND.
-
-   STMT is inside BB, which is inside loop LOOP.  */
+   - it is a GIMPLE_LABEL or a GIMPLE_COND.  */
 
 static bool
-if_convertible_stmt_p (struct loop *loop, basic_block bb, gimple stmt)
+if_convertible_stmt_p (gimple stmt)
 {
   switch (gimple_code (stmt))
     {
@@ -506,7 +528,7 @@ if_convertible_stmt_p (struct loop *loop, basic_block bb, gimple stmt)
       return true;
 
     case GIMPLE_ASSIGN:
-      return if_convertible_gimple_assign_stmt_p (loop, bb, stmt);
+      return if_convertible_gimple_assign_stmt_p (stmt);
 
     default:
       /* Don't know what to do with 'em so don't do anything.  */
@@ -877,7 +899,7 @@ if_convertible_loop_p (struct loop *loop)
 	continue;
 
       for (itr = gsi_start_bb (bb); !gsi_end_p (itr); gsi_next (&itr))
-	if (!if_convertible_stmt_p (loop, bb, gsi_stmt (itr)))
+	if (!if_convertible_stmt_p (gsi_stmt (itr)))
 	  return false;
     }
 
@@ -897,7 +919,7 @@ if_convertible_loop_p (struct loop *loop)
 static basic_block
 find_phi_replacement_condition (struct loop *loop,
 				basic_block bb, tree *cond,
-                                gimple_stmt_iterator *gsi)
+				gimple_stmt_iterator *gsi)
 {
   edge first_edge, second_edge;
   tree tmp_cond;
@@ -982,8 +1004,9 @@ find_phi_replacement_condition (struct loop *loop,
   return first_edge->src;
 }
 
-/* Replace PHI node with conditional modify expr using COND.  This
-   routine does not handle PHI nodes with more than two arguments.
+/* Replace a scalar PHI node with a COND_EXPR using COND as condition.
+   This routine does not handle PHI nodes with more than two
+   arguments.
 
    For example,
      S1: A = PHI <x1(1), x2(5)
@@ -995,18 +1018,22 @@ find_phi_replacement_condition (struct loop *loop,
    TRUE_BB is selected.  */
 
 static void
-replace_phi_with_cond_gimple_assign_stmt (gimple phi, tree cond,
-    					  basic_block true_bb,
-                                   	  gimple_stmt_iterator *gsi)
+predicate_scalar_phi (gimple phi, tree cond,
+		      basic_block true_bb,
+		      gimple_stmt_iterator *gsi)
 {
   gimple new_stmt;
   basic_block bb;
-  tree rhs;
-  tree arg;
+  tree rhs, res, arg;
 
   gcc_assert (gimple_code (phi) == GIMPLE_PHI
 	      && gimple_phi_num_args (phi) == 2);
 
+  res = gimple_phi_result (phi);
+  /* Do not handle virtual phi nodes.  */
+  if (!is_gimple_reg (SSA_NAME_VAR (res)))
+    return;
+
   bb = gimple_bb (phi);
 
   arg = degenerate_phi_result (phi);
@@ -1028,11 +1055,11 @@ replace_phi_with_cond_gimple_assign_stmt (gimple phi, tree cond,
 	}
 
       /* Build new RHS using selected condition and arguments.  */
-      rhs = build3 (COND_EXPR, TREE_TYPE (PHI_RESULT (phi)),
+      rhs = build3 (COND_EXPR, TREE_TYPE (res),
 		    unshare_expr (cond), arg_0, arg_1);
     }
 
-  new_stmt = gimple_build_assign (PHI_RESULT (phi), rhs);
+  new_stmt = gimple_build_assign (res, rhs);
   SSA_NAME_DEF_STMT (gimple_phi_result (phi)) = new_stmt;
   gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
   update_stmt (new_stmt);
@@ -1044,11 +1071,11 @@ replace_phi_with_cond_gimple_assign_stmt (gimple phi, tree cond,
     }
 }
 
-/* Replaces in LOOP all the phi nodes other than those in the
+/* Replaces in LOOP all the scalar phi nodes other than those in the
    LOOP->header block with conditional modify expressions.  */
 
 static void
-ifconvert_phi_nodes (struct loop *loop)
+predicate_all_scalar_phis (struct loop *loop)
 {
   basic_block bb;
   unsigned int orig_loop_num_nodes = loop->num_nodes;
@@ -1077,7 +1104,7 @@ ifconvert_phi_nodes (struct loop *loop)
       while (!gsi_end_p (phi_gsi))
 	{
 	  phi = gsi_stmt (phi_gsi);
-	  replace_phi_with_cond_gimple_assign_stmt (phi, cond, true_bb, &gsi);
+	  predicate_scalar_phi (phi, cond, true_bb, &gsi);
 	  release_phi_node (phi);
 	  gsi_next (&phi_gsi);
 	}
@@ -1097,7 +1124,7 @@ insert_gimplified_predicates (loop_p loop)
   for (i = 0; i < loop->num_nodes; i++)
     {
       basic_block bb = ifc_bbs[i];
-      gimple_seq stmts = bb_predicate_gimplified_stmts (bb);
+      gimple_seq stmts;
 
       if (!is_predicated (bb))
 	{
@@ -1108,15 +1135,30 @@ insert_gimplified_predicates (loop_p loop)
 	  continue;
 	}
 
+      stmts = bb_predicate_gimplified_stmts (bb);
       if (stmts)
 	{
-	  gimple_stmt_iterator gsi = gsi_last_bb (bb);
-
-	  if (gsi_end_p (gsi)
-	      || gimple_code (gsi_stmt (gsi)) == GIMPLE_COND)
-	    gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT);
+	  if (flag_loop_if_convert_memory_writes)
+	    {
+	      /* Insert the predicate of the BB just after the label,
+		 as the if-conversion of memory writes will use this
+		 predicate.  */
+	      gimple_stmt_iterator gsi = gsi_after_labels (bb);
+	      gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT);
+	    }
 	  else
-	    gsi_insert_seq_after (&gsi, stmts, GSI_SAME_STMT);
+	    {
+	      /* Insert the predicate of the BB at the end of the BB
+		 as this would reduce the register pressure: the only
+		 use of this predicate will be in successor BBs.  */
+	      gimple_stmt_iterator gsi = gsi_last_bb (bb);
+
+	      if (gsi_end_p (gsi)
+		  || gimple_code (gsi_stmt (gsi)) == GIMPLE_COND)
+		gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT);
+	      else
+		gsi_insert_seq_after (&gsi, stmts, GSI_SAME_STMT);
+	    }
 
 	  /* Once the sequence is code generated, set it to NULL.  */
 	  set_bb_predicate_gimplified_stmts (bb, NULL);
@@ -1124,6 +1166,56 @@ insert_gimplified_predicates (loop_p loop)
     }
 }
 
+/* Predicate each write to memory in LOOP.
+
+   Replace a statement "A[i] = foo" with "A[i] = cond ? foo : A[i]"
+   with the condition COND determined from the predicate of the basic
+   block containing the statement.  */
+
+static void
+predicate_mem_writes (loop_p loop)
+{
+  unsigned int i, orig_loop_num_nodes = loop->num_nodes;
+
+  for (i = 1; i < orig_loop_num_nodes; i++)
+    {
+      gimple_stmt_iterator gsi;
+      basic_block bb = ifc_bbs[i];
+      tree cond = bb_predicate (bb);
+
+      if (is_true_predicate (cond))
+	continue;
+
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	if (is_gimple_assign (gsi_stmt (gsi))
+	    && gimple_vdef (gsi_stmt (gsi)))
+	  {
+	    gimple new_stmt, lhs_stmt, rhs_stmt;
+	    gimple stmt = gsi_stmt (gsi);
+	    tree lhs = gimple_get_lhs (stmt);
+	    tree rhs = gimple_op (stmt, 1);
+
+	    gcc_assert (gimple_num_ops (stmt) == 2);
+
+	    rhs_stmt = ifc_temp_var (TREE_TYPE (rhs), unshare_expr (rhs));
+	    gsi_insert_before (&gsi, rhs_stmt, GSI_SAME_STMT);
+	    rhs = gimple_get_lhs (rhs_stmt);
+
+	    lhs_stmt = ifc_temp_var (TREE_TYPE (lhs), unshare_expr (lhs));
+	    gsi_insert_before (&gsi, lhs_stmt, GSI_SAME_STMT);
+	    lhs = gimple_get_lhs (lhs_stmt);
+
+	    cond = unshare_expr (cond);
+	    rhs = build3 (COND_EXPR, TREE_TYPE (lhs), cond, rhs, lhs);
+
+	    new_stmt = ifc_temp_var (TREE_TYPE (lhs), rhs);
+	    gsi_insert_before (&gsi, new_stmt, GSI_SAME_STMT);
+	    gimple_set_op (stmt, 1, gimple_get_lhs (new_stmt));
+	    update_stmt (stmt);
+	  }
+    }
+}
+
 /* Remove all GIMPLE_CONDs and GIMPLE_LABELs of all the basic blocks
    other than the exit and latch of the LOOP.  Also resets the
    GIMPLE_DEBUG information.  */
@@ -1180,7 +1272,10 @@ combine_blocks (struct loop *loop)
 
   remove_conditions_and_labels (loop);
   insert_gimplified_predicates (loop);
-  ifconvert_phi_nodes (loop);
+  predicate_all_scalar_phis (loop);
+
+  if (flag_loop_if_convert_memory_writes)
+    predicate_mem_writes (loop);
 
   /* Merge basic blocks: first remove all the edges in the loop,
      except for those from the exit block.  */
@@ -1282,6 +1377,10 @@ tree_if_conversion (struct loop *loop)
      blocks into one huge basic block doing the if-conversion
      on-the-fly.  */
   combine_blocks (loop);
+
+  if (flag_loop_if_convert_memory_writes)
+    mark_sym_for_renaming (gimple_vop (cfun));
+
   changed = true;
 
  cleanup:
@@ -1314,6 +1413,9 @@ main_tree_if_conversion (void)
   FOR_EACH_LOOP (li, loop, 0)
     changed |= tree_if_conversion (loop);
 
+  if (changed && flag_loop_if_convert_memory_writes)
+    update_ssa (TODO_update_ssa);
+
   return changed ? TODO_cleanup_cfg : 0;
 }
 
@@ -1321,7 +1423,8 @@ static bool
 gate_tree_if_conversion (void)
 {
   return flag_tree_vectorize
-    || flag_loop_if_convert;
+    || flag_loop_if_convert
+    || flag_loop_if_convert_memory_writes;
 }
 
 struct gimple_opt_pass pass_if_conversion =
-- 
1.7.0.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 0/4] if-conversion of loops with conditionals containing memory writes
@ 2010-07-07 20:23 Sebastian Pop
  2010-07-07 20:22 ` [PATCH 1/4] Add the -floop-if-convert flag Sebastian Pop
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Sebastian Pop @ 2010-07-07 20:23 UTC (permalink / raw)
  To: gcc-patches; +Cc: rguenther, Sebastian Pop

The patch-set adds -floop-if-convert, -floop-if-convert-memory-writes,
and improves the code generation of if-conversion, fixing the PR44710:
"if-conversion generates redundant statements."

This patch-set has been regtested on amd64-linux.  Ok for trunk?

Thanks,
Sebastian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 2/4] Call maybe_fold_or_comparisons to fold OR-ed predicates.
  2010-07-07 20:23 [PATCH 0/4] if-conversion of loops with conditionals containing memory writes Sebastian Pop
                   ` (2 preceding siblings ...)
  2010-07-07 20:23 ` [PATCH 4/4] Do not check whether memory references accessed in every iteration trap Sebastian Pop
@ 2010-07-07 20:24 ` Sebastian Pop
  2010-07-08  9:57   ` Richard Guenther
  3 siblings, 1 reply; 17+ messages in thread
From: Sebastian Pop @ 2010-07-07 20:24 UTC (permalink / raw)
  To: gcc-patches; +Cc: rguenther, Sebastian Pop

	PR tree-optimization/44710
	* tree-if-conv.c (parse_predicate): New.
	(add_to_predicate_list): Call it, call maybe_fold_or_comparisons.
	Make sure that the predicates are either SSA_NAMEs or gimple_condexpr.

	* gcc.dg/tree-ssa/ifc-6.c: New.
---
 gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c |   15 +++++
 gcc/tree-if-conv.c                    |   92 ++++++++++++++++++++++++++++++---
 2 files changed, 100 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
new file mode 100644
index 0000000..a9c5db3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-c -O2 -ftree-vectorize" { target *-*-* } } */
+
+static int x;
+foo (int n, int *A)
+{
+  int i;
+  for (i = 0; i < n; i++)
+    {
+      if (A[i])
+	x = 2;
+      if (A[i + 1])
+	x = 1;
+    }
+}
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index ad106d7..03e453e 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -259,17 +259,95 @@ is_predicated (basic_block bb)
   return !is_true_predicate (bb_predicate (bb));
 }
 
-/* Add condition NEW_COND to the predicate list of basic block BB.  */
+/* Parses the predicate COND and returns its comparison code and
+   operands OP0 and OP1.  */
+
+static enum tree_code
+parse_predicate (tree cond, tree *op0, tree *op1)
+{
+  gimple s;
+
+  if (TREE_CODE (cond) == SSA_NAME
+      && is_gimple_assign (s = SSA_NAME_DEF_STMT (cond)))
+    {
+      if (TREE_CODE_CLASS (gimple_assign_rhs_code (s)) == tcc_comparison)
+	{
+	  *op0 = gimple_assign_rhs1 (s);
+	  *op1 = gimple_assign_rhs2 (s);
+	  return gimple_assign_rhs_code (s);
+	}
+
+      else if (gimple_assign_rhs_code (s) == TRUTH_NOT_EXPR)
+	{
+	  tree op = gimple_assign_rhs1 (s);
+	  tree type = TREE_TYPE (op);
+	  enum tree_code code = parse_predicate (op, op0, op1);
+
+	  return code == ERROR_MARK ? ERROR_MARK :
+	    invert_tree_comparison (code, HONOR_NANS (TYPE_MODE (type)));
+	}
+
+      return ERROR_MARK;
+    }
+
+  if (TREE_CODE_CLASS (TREE_CODE (cond)) == tcc_comparison)
+    {
+      *op0 = TREE_OPERAND (cond, 0);
+      *op1 = TREE_OPERAND (cond, 1);
+      return TREE_CODE (cond);
+    }
+
+  return ERROR_MARK;
+}
+
+/* Add condition NC to the predicate list of basic block BB.  */
 
 static inline void
-add_to_predicate_list (basic_block bb, tree new_cond)
+add_to_predicate_list (basic_block bb, tree nc)
 {
-  tree cond = bb_predicate (bb);
+  tree bc;
 
-  set_bb_predicate (bb, is_true_predicate (cond) ? new_cond :
-		    fold_build2_loc (EXPR_LOCATION (cond),
-				     TRUTH_OR_EXPR, boolean_type_node,
-				     cond, new_cond));
+  if (is_true_predicate (nc))
+    return;
+
+  if (!is_predicated (bb))
+    bc = nc;
+  else
+    {
+      enum tree_code code1, code2;
+      tree op1a, op1b, op2a, op2b;
+
+      bc = bb_predicate (bb);
+      code1 = parse_predicate (bc, &op1a, &op1b);
+      code2 = parse_predicate (nc, &op2a, &op2b);
+
+      if (code1 != ERROR_MARK && code2 != ERROR_MARK)
+	{
+	  bc = maybe_fold_or_comparisons (code1, op1a, op1b, code2, op2a, op2b);
+
+	  if (!bc)
+	    {
+	      bc = bb_predicate (bb);
+	      bc = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
+				    boolean_type_node, bc, nc);
+	    }
+	}
+      else
+	bc = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
+			      boolean_type_node, bc, nc);
+    }
+
+  if (!is_gimple_condexpr (bc))
+    {
+      gimple_seq stmts;
+      bc = force_gimple_operand (bc, &stmts, true, NULL_TREE);
+      add_bb_predicate_gimplified_stmts (bb, stmts);
+    }
+
+  if (is_true_predicate (bc))
+    reset_bb_predicate (bb);
+  else
+    set_bb_predicate (bb, bc);
 }
 
 /* Add the condition COND to the previous condition PREV_COND, and add
-- 
1.7.0.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] Add the -floop-if-convert flag.
  2010-07-07 20:22 ` [PATCH 1/4] Add the -floop-if-convert flag Sebastian Pop
@ 2010-07-07 21:39   ` Richard Guenther
  2010-07-07 21:49     ` Sebastian Pop
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Guenther @ 2010-07-07 21:39 UTC (permalink / raw)
  To: Sebastian Pop; +Cc: gcc-patches, rguenther

On Wed, Jul 7, 2010 at 10:22 PM, Sebastian Pop <sebpop@gmail.com> wrote:
>        * common.opt (floop-if-convert): New flag.
>        * doc/invoke.texi (floop-if-convert): Documented.
>        * tree-if-conv.c (gate_tree_if_conversion): Enable if-conversion
>        when flag_loop_if_convert is set.
> ---
>  gcc/common.opt      |    4 ++++
>  gcc/doc/invoke.texi |   10 ++++++++--
>  gcc/tree-if-conv.c  |    3 ++-
>  3 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 6ca787a..2a5d391 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -653,6 +653,10 @@ fif-conversion2
>  Common Report Var(flag_if_conversion2) Optimization
>  Perform conversion of conditional jumps to conditional execution
>
> +floop-if-convert
> +Common Report Var(flag_loop_if_convert) Optimization
> +Convert conditional jumps in innermost loops to branchless equivalents
> +

Init(0) is missing.  Also can you name it -ftree-loop-if-conversion instead
consistent with -ftree-loop-* and -fif-conversion.

There is still no way to disable if-conversion if the vectorizer is enabled,
so I guess a tri-state -1, disabled and enabled would be more useful
(as you wanted it for debugging in the first place).

Richard.

>  ; -finhibit-size-directive inhibits output of .size for ELF.
>  ; This is used only for compiling crtstuff.c,
>  ; and it may be extended to other effects
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index d70f130..da14bf9 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -342,7 +342,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fearly-inlining -fipa-sra -fexpensive-optimizations -ffast-math @gol
>  -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
>  -fforward-propagate -ffunction-sections @gol
> --fgcse -fgcse-after-reload -fgcse-las -fgcse-lm @gol
> +-fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
>  -fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol
>  -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
>  -finline-small-functions -fipa-cp -fipa-cp-clone -fipa-matrix-reorg -fipa-pta @gol
> @@ -352,7 +352,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fira-loop-pressure -fno-ira-share-save-slots @gol
>  -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
>  -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
> --floop-block -floop-interchange -floop-strip-mine -fgraphite-identity @gol
> +-floop-block -floop-if-convert -floop-interchange -floop-strip-mine @gol
>  -floop-parallelize-all -flto -flto-compression-level -flto-report -fltrans @gol
>  -fltrans-output-list -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
>  -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
> @@ -6883,6 +6883,12 @@ profitable to parallelize the loops.
>  Compare the results of several data dependence analyzers.  This option
>  is used for debugging the data dependence analyzers.
>
> +@item -floop-if-convert
> +Attempt to transform conditional jumps in the innermost loops to
> +branch-less equivalents.  The intent is to remove control-flow from
> +the innermost loops in order to improve the ability of the
> +auto-vectorization pass to handle these loops.
> +
>  @item -ftree-loop-distribution
>  Perform loop distribution.  This flag can improve cache performance on
>  big loop bodies and allow further loop optimizations, like
> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
> index 8d5d226..ad106d7 100644
> --- a/gcc/tree-if-conv.c
> +++ b/gcc/tree-if-conv.c
> @@ -1242,7 +1242,8 @@ main_tree_if_conversion (void)
>  static bool
>  gate_tree_if_conversion (void)
>  {
> -  return flag_tree_vectorize != 0;
> +  return flag_tree_vectorize
> +    || flag_loop_if_convert;
>  }
>
>  struct gimple_opt_pass pass_if_conversion =
> --
> 1.7.0.4
>
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] Add the -floop-if-convert flag.
  2010-07-07 21:39   ` Richard Guenther
@ 2010-07-07 21:49     ` Sebastian Pop
  2010-07-07 22:53       ` [PATCH 1/4] Add the -ftree-loop-if-convert flag Sebastian Pop
  0 siblings, 1 reply; 17+ messages in thread
From: Sebastian Pop @ 2010-07-07 21:49 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, rguenther

On Wed, Jul 7, 2010 at 16:39, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Wed, Jul 7, 2010 at 10:22 PM, Sebastian Pop <sebpop@gmail.com> wrote:
>>        * common.opt (floop-if-convert): New flag.
>>        * doc/invoke.texi (floop-if-convert): Documented.
>>        * tree-if-conv.c (gate_tree_if_conversion): Enable if-conversion
>>        when flag_loop_if_convert is set.
>> ---
>>  gcc/common.opt      |    4 ++++
>>  gcc/doc/invoke.texi |   10 ++++++++--
>>  gcc/tree-if-conv.c  |    3 ++-
>>  3 files changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index 6ca787a..2a5d391 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -653,6 +653,10 @@ fif-conversion2
>>  Common Report Var(flag_if_conversion2) Optimization
>>  Perform conversion of conditional jumps to conditional execution
>>
>> +floop-if-convert
>> +Common Report Var(flag_loop_if_convert) Optimization
>> +Convert conditional jumps in innermost loops to branchless equivalents
>> +
>
> Init(0) is missing.  Also can you name it -ftree-loop-if-conversion instead
> consistent with -ftree-loop-* and -fif-conversion.
>

Ok, I will do this.
Although, I do not like the "tree" in the flags, as I find that there is no
reason to expose the internals of GCC to GCC users.

> There is still no way to disable if-conversion if the vectorizer is enabled,
> so I guess a tri-state -1, disabled and enabled would be more useful
> (as you wanted it for debugging in the first place).
>

Good idea.
I will post an updated patch with these modifications.

Sebastian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] Add the -ftree-loop-if-convert flag.
  2010-07-07 21:49     ` Sebastian Pop
@ 2010-07-07 22:53       ` Sebastian Pop
  2010-07-08  9:02         ` Richard Guenther
  0 siblings, 1 reply; 17+ messages in thread
From: Sebastian Pop @ 2010-07-07 22:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: rguenther, Sebastian Pop

	* common.opt (ftree-loop-if-convert): New flag.
	* doc/invoke.texi (ftree-loop-if-convert): Documented.
	* tree-if-conv.c (gate_tree_if_conversion): Enable if-conversion
	when flag_tree_loop_if_convert is set.
---
 gcc/common.opt      |    4 ++++
 gcc/doc/invoke.texi |   14 ++++++++++----
 gcc/tree-if-conv.c  |    6 +++++-
 3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 6ca787a..111d7b7 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -653,6 +653,10 @@ fif-conversion2
 Common Report Var(flag_if_conversion2) Optimization
 Perform conversion of conditional jumps to conditional execution
 
+ftree-loop-if-convert
+Common Report Var(flag_tree_loop_if_convert) Init(-1) Optimization
+Convert conditional jumps in innermost loops to branchless equivalents
+
 ; -finhibit-size-directive inhibits output of .size for ELF.
 ; This is used only for compiling crtstuff.c,
 ; and it may be extended to other effects
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d70f130..0847e01 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -342,7 +342,7 @@ Objective-C and Objective-C++ Dialects}.
 -fearly-inlining -fipa-sra -fexpensive-optimizations -ffast-math @gol
 -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
 -fforward-propagate -ffunction-sections @gol
--fgcse -fgcse-after-reload -fgcse-las -fgcse-lm @gol
+-fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
 -fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol
 -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
 -finline-small-functions -fipa-cp -fipa-cp-clone -fipa-matrix-reorg -fipa-pta @gol
@@ -352,7 +352,7 @@ Objective-C and Objective-C++ Dialects}.
 -fira-loop-pressure -fno-ira-share-save-slots @gol
 -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
 -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
--floop-block -floop-interchange -floop-strip-mine -fgraphite-identity @gol
+-floop-block -floop-interchange -floop-strip-mine @gol
 -floop-parallelize-all -flto -flto-compression-level -flto-report -fltrans @gol
 -fltrans-output-list -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
 -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
@@ -382,8 +382,8 @@ Objective-C and Objective-C++ Dialects}.
 -fsplit-wide-types -fstack-protector -fstack-protector-all @gol
 -fstrict-aliasing -fstrict-overflow -fthread-jumps -ftracer @gol
 -ftree-builtin-call-dce -ftree-ccp -ftree-ch -ftree-copy-prop @gol
--ftree-copyrename -ftree-dce @gol
--ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-im @gol
+-ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
+-ftree-forwprop -ftree-fre -ftree-loop-if-convert -ftree-loop-im @gol
 -ftree-phiprop -ftree-loop-distribution @gol
 -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
 -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-pta -ftree-reassoc @gol
@@ -6883,6 +6883,12 @@ profitable to parallelize the loops.
 Compare the results of several data dependence analyzers.  This option
 is used for debugging the data dependence analyzers.
 
+@item -ftree-loop-if-convert
+Attempt to transform conditional jumps in the innermost loops to
+branch-less equivalents.  The intent is to remove control-flow from
+the innermost loops in order to improve the ability of the
+auto-vectorization pass to handle these loops.
+
 @item -ftree-loop-distribution
 Perform loop distribution.  This flag can improve cache performance on
 big loop bodies and allow further loop optimizations, like
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 8d5d226..873cd89 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -1242,7 +1242,11 @@ main_tree_if_conversion (void)
 static bool
 gate_tree_if_conversion (void)
 {
-  return flag_tree_vectorize != 0;
+  if (flag_tree_vectorize
+      && flag_tree_loop_if_convert < 0)
+    flag_tree_loop_if_convert = 1;
+
+  return flag_tree_loop_if_convert > 0;
 }
 
 struct gimple_opt_pass pass_if_conversion =
-- 
1.7.0.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 3/4] Add flag -ftree-loop-if-convert-memory-writes.
  2010-07-07 20:23 ` [PATCH 3/4] Add flag -floop-if-convert-memory-writes Sebastian Pop
@ 2010-07-07 22:54   ` Sebastian Pop
  0 siblings, 0 replies; 17+ messages in thread
From: Sebastian Pop @ 2010-07-07 22:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: rguenther, Sebastian Pop

This patch adds a flag that controls the replacement of the memory
writes that are in predicated basic blocks with a full write:

for (...)
  if (cond)
    A[i] = foo

is replaced with:

for (...)
  A[i] = cond ? foo : A[i]

In order to do this, we have to call gimple_could_trap_p instead of
gimple_assign_rhs_could_trap_p, as we have to also check that the LHS
of assign stmts does not trap.

	* common.opt (ftree-loop-if-convert-memory-writes): New flag.
	* doc/invoke.texi (ftree-loop-if-convert-memory-writes): Documented.
	* tree-if-conv.c (if_convertible_phi_p): Allow virtual phi nodes when
	flag_loop_if_convert_memory_writes is set.
	(if_convertible_gimple_assign_stmt_p): Allow memory reads and writes
	Do not handle types that do not match is_gimple_reg_type.
	Remove loop and bb parameters.  Call gimple_could_trap_p instead of
	when flag_loop_if_convert_memory_writes	is set, as LHS can contain
	memory refs.
	(if_convertible_stmt_p): Remove loop and bb parameters.  Update calls
	to if_convertible_gimple_assign_stmt_p.
	(if_convertible_loop_p): Update call to if_convertible_stmt_p.
	(gimplify_condition): New.
	(replace_phi_with_cond_gimple_assign_stmt): Renamed
	predicate_scalar_phi.  Do not handle virtual phi nodes.
	(ifconvert_phi_nodes): Renamed predicate_all_scalar_phis.
	Call predicate_scalar_phi.
	(insert_gimplified_predicates): Insert the gimplified predicate of a BB
	just after the labels for flag_loop_if_convert_memory_writes, otherwise
	insert the predicate in the end of the BB.
	(predicate_mem_writes): New.
	(combine_blocks): Call predicate_all_scalar_phis.  When
	flag_loop_if_convert_memory_writes is set, call predicate_mem_writes.
	(tree_if_conversion): Call mark_sym_for_renaming when
	flag_loop_if_convert_memory_writes is set.
	(main_tree_if_conversion): Call update_ssa when
	flag_loop_if_convert_memory_writes is set.

	* gcc.dg/tree-ssa/ifc-4.c: New.
	* gcc.dg/tree-ssa/ifc-7.c: New.
---
 gcc/common.opt                        |    4 +
 gcc/doc/invoke.texi                   |   20 ++++-
 gcc/testsuite/gcc.dg/tree-ssa/ifc-4.c |   53 ++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/ifc-7.c |   29 +++++
 gcc/tree-if-conv.c                    |  183 +++++++++++++++++++++++++-------
 5 files changed, 248 insertions(+), 41 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifc-4.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifc-7.c

diff --git a/gcc/common.opt b/gcc/common.opt
index 111d7b7..ceb94f8 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -657,6 +657,10 @@ ftree-loop-if-convert
 Common Report Var(flag_tree_loop_if_convert) Init(-1) Optimization
 Convert conditional jumps in innermost loops to branchless equivalents
 
+ftree-loop-if-convert-memory-writes
+Common Report Var(flag_tree_loop_if_convert_memory_writes) Optimization
+Also if-convert conditional jumps containing memory writes
+
 ; -finhibit-size-directive inhibits output of .size for ELF.
 ; This is used only for compiling crtstuff.c,
 ; and it may be extended to other effects
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0847e01..eb04fd5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -383,7 +383,8 @@ Objective-C and Objective-C++ Dialects}.
 -fstrict-aliasing -fstrict-overflow -fthread-jumps -ftracer @gol
 -ftree-builtin-call-dce -ftree-ccp -ftree-ch -ftree-copy-prop @gol
 -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
--ftree-forwprop -ftree-fre -ftree-loop-if-convert -ftree-loop-im @gol
+-ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
+-ftree-loop-if-convert-memory-writes -ftree-loop-im @gol
 -ftree-phiprop -ftree-loop-distribution @gol
 -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
 -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-pta -ftree-reassoc @gol
@@ -6889,6 +6890,23 @@ branch-less equivalents.  The intent is to remove control-flow from
 the innermost loops in order to improve the ability of the
 auto-vectorization pass to handle these loops.
 
+@item -ftree-loop-if-convert-memory-writes
+Attempt to also if-convert conditional jumps containing memory writes.
+This transformation can be unsafe for multi-threaded programs as it
+transforms conditional memory writes into unconditional memory writes.
+For example,
+@smallexample
+for (i = 0; i < N; i++)
+  if (cond)
+    A[i] = expr;
+@end smallexample
+would be transformed to
+@smallexample
+for (i = 0; i < N; i++)
+  A[i] = cond ? expr : A[i];
+@end smallexample
+potentially producing data races.
+
 @item -ftree-loop-distribution
 Perform loop distribution.  This flag can improve cache performance on
 big loop bodies and allow further loop optimizations, like
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-4.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-4.c
new file mode 100644
index 0000000..beb1a0e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-4.c
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-options "-c -O2 -ftree-vectorize -fdump-tree-ifcvt-stats" { target *-*-* } } */
+
+struct ht
+{
+  void * (*alloc_subobject) (int);
+};
+typedef struct cpp_reader cpp_reader;
+typedef struct cpp_token cpp_token;
+typedef struct cpp_macro cpp_macro;
+enum cpp_ttype
+{
+    CPP_PASTE,
+};
+struct cpp_token {
+  __extension__ enum cpp_ttype type : 8;
+} cpp_comment_table;
+struct cpp_macro {
+  union cpp_macro_u
+  {
+    cpp_token * tokens;
+  } exp;
+  unsigned int count;
+};
+struct cpp_reader
+{
+  struct ht *hash_table;
+};
+create_iso_definition (cpp_reader *pfile, cpp_macro *macro)
+{
+  unsigned int num_extra_tokens = 0;
+  {
+    cpp_token *tokns =
+      (cpp_token *) pfile->hash_table->alloc_subobject (sizeof (cpp_token)
+							* macro->count);
+    {
+      cpp_token *normal_dest = tokns;
+      cpp_token *extra_dest = tokns + macro->count - num_extra_tokens;
+      unsigned int i;
+      for (i = 0; i < macro->count; i++)
+	{
+	  if (macro->exp.tokens[i].type == CPP_PASTE)
+	    *extra_dest++ = macro->exp.tokens[i];
+	  else
+	    *normal_dest++ = macro->exp.tokens[i];
+	}
+    }
+  }
+}
+
+/* This cannot be if-converted because the stores are to aggregate types.  */
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 0 "ifcvt" } } */
+/* { dg-final { cleanup-tree-dump "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-7.c
new file mode 100644
index 0000000..4d26dc7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-7.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-c -O2 -ftree-vectorize" { target *-*-* } } */
+
+typedef struct eqn_d
+{
+  int *coef;
+} *eqn;
+typedef struct omega_pb_d
+{
+  eqn subs;
+} *omega_pb;
+
+omega_pb omega_solve_problem (omega_pb);
+
+omega_pb
+omega_solve_geq (omega_pb pb, int n)
+{
+  int i, e;
+  int j = 0;
+
+  for (e = n - 1; e >= 0; e--)
+    if (pb->subs[e].coef[i] != pb->subs[e].coef[j])
+      {
+	pb->subs[e].coef[i] = j;
+	pb->subs[e].coef[j] = i;
+      }
+
+  return omega_solve_problem (pb);
+}
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index d5b0f7e..673b129 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -387,9 +387,12 @@ bb_with_exit_edge_p (struct loop *loop, basic_block bb)
    and it belongs to basic block BB.
 
    PHI is not if-convertible if:
-   - it has more than 2 arguments,
-   - virtual PHI is immediately used in another PHI node,
-   - virtual PHI on BB other than header.  */
+   - it has more than 2 arguments.
+
+   When the flag_tree_loop_if_convert_memory_writes is not set, PHI is not
+   if-convertible if:
+   - a virtual PHI is immediately used in another PHI node,
+   - there is a virtual PHI in a BB other than the loop->header.  */
 
 static bool
 if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi)
@@ -407,6 +410,12 @@ if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi)
       return false;
     }
 
+  if (flag_tree_loop_if_convert_memory_writes)
+    return true;
+
+  /* When the flag_tree_loop_if_convert_memory_writes is not set, check
+     that there are no memory writes in the branches of the loop to be
+     if-converted.  */
   if (!is_gimple_reg (SSA_NAME_VAR (gimple_phi_result (phi))))
     {
       imm_use_iterator imm_iter;
@@ -415,9 +424,10 @@ if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi)
       if (bb != loop->header)
 	{
 	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "Virtual phi not on loop header.\n");
+	    fprintf (dump_file, "Virtual phi not on loop->header.\n");
 	  return false;
 	}
+
       FOR_EACH_IMM_USE_FAST (use_p, imm_iter, gimple_phi_result (phi))
 	{
 	  if (gimple_code (USE_STMT (use_p)) == GIMPLE_PHI)
@@ -437,15 +447,13 @@ if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi)
    GIMPLE_ASSIGN statement is not if-convertible if,
    - it is not movable,
    - it could trap,
-   - LHS is not var decl.
-
-   GIMPLE_ASSIGN is part of block BB, which is inside loop LOOP.  */
+   - LHS is not var decl.  */
 
 static bool
-if_convertible_gimple_assign_stmt_p (struct loop *loop, basic_block bb,
-    				     gimple stmt)
+if_convertible_gimple_assign_stmt_p (gimple stmt)
 {
   tree lhs = gimple_assign_lhs (stmt);
+  basic_block bb;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -453,6 +461,9 @@ if_convertible_gimple_assign_stmt_p (struct loop *loop, basic_block bb,
       print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
     }
 
+  if (!is_gimple_reg_type (TREE_TYPE (lhs)))
+    return false;
+
   /* Some of these constrains might be too conservative.  */
   if (stmt_ends_bb_p (stmt)
       || gimple_has_volatile_ops (stmt)
@@ -465,6 +476,17 @@ if_convertible_gimple_assign_stmt_p (struct loop *loop, basic_block bb,
       return false;
     }
 
+  if (flag_tree_loop_if_convert_memory_writes)
+    {
+      if (gimple_could_trap_p (stmt))
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file, "tree could trap...\n");
+	  return false;
+	}
+      return true;
+    }
+
   if (gimple_assign_rhs_could_trap_p (stmt))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
@@ -472,9 +494,11 @@ if_convertible_gimple_assign_stmt_p (struct loop *loop, basic_block bb,
       return false;
     }
 
+  bb = gimple_bb (stmt);
+
   if (TREE_CODE (lhs) != SSA_NAME
-      && bb != loop->header
-      && !bb_with_exit_edge_p (loop, bb))
+      && bb != bb->loop_father->header
+      && !bb_with_exit_edge_p (bb->loop_father, bb))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	{
@@ -491,12 +515,10 @@ if_convertible_gimple_assign_stmt_p (struct loop *loop, basic_block bb,
 
    A statement is if-convertible if:
    - it is an if-convertible GIMPLE_ASSGIN,
-   - it is a GIMPLE_LABEL or a GIMPLE_COND.
-
-   STMT is inside BB, which is inside loop LOOP.  */
+   - it is a GIMPLE_LABEL or a GIMPLE_COND.  */
 
 static bool
-if_convertible_stmt_p (struct loop *loop, basic_block bb, gimple stmt)
+if_convertible_stmt_p (gimple stmt)
 {
   switch (gimple_code (stmt))
     {
@@ -506,7 +528,7 @@ if_convertible_stmt_p (struct loop *loop, basic_block bb, gimple stmt)
       return true;
 
     case GIMPLE_ASSIGN:
-      return if_convertible_gimple_assign_stmt_p (loop, bb, stmt);
+      return if_convertible_gimple_assign_stmt_p (stmt);
 
     default:
       /* Don't know what to do with 'em so don't do anything.  */
@@ -877,7 +899,7 @@ if_convertible_loop_p (struct loop *loop)
 	continue;
 
       for (itr = gsi_start_bb (bb); !gsi_end_p (itr); gsi_next (&itr))
-	if (!if_convertible_stmt_p (loop, bb, gsi_stmt (itr)))
+	if (!if_convertible_stmt_p (gsi_stmt (itr)))
 	  return false;
     }
 
@@ -897,7 +919,7 @@ if_convertible_loop_p (struct loop *loop)
 static basic_block
 find_phi_replacement_condition (struct loop *loop,
 				basic_block bb, tree *cond,
-                                gimple_stmt_iterator *gsi)
+				gimple_stmt_iterator *gsi)
 {
   edge first_edge, second_edge;
   tree tmp_cond;
@@ -982,8 +1004,9 @@ find_phi_replacement_condition (struct loop *loop,
   return first_edge->src;
 }
 
-/* Replace PHI node with conditional modify expr using COND.  This
-   routine does not handle PHI nodes with more than two arguments.
+/* Replace a scalar PHI node with a COND_EXPR using COND as condition.
+   This routine does not handle PHI nodes with more than two
+   arguments.
 
    For example,
      S1: A = PHI <x1(1), x2(5)
@@ -995,18 +1018,22 @@ find_phi_replacement_condition (struct loop *loop,
    TRUE_BB is selected.  */
 
 static void
-replace_phi_with_cond_gimple_assign_stmt (gimple phi, tree cond,
-    					  basic_block true_bb,
-                                   	  gimple_stmt_iterator *gsi)
+predicate_scalar_phi (gimple phi, tree cond,
+		      basic_block true_bb,
+		      gimple_stmt_iterator *gsi)
 {
   gimple new_stmt;
   basic_block bb;
-  tree rhs;
-  tree arg;
+  tree rhs, res, arg;
 
   gcc_assert (gimple_code (phi) == GIMPLE_PHI
 	      && gimple_phi_num_args (phi) == 2);
 
+  res = gimple_phi_result (phi);
+  /* Do not handle virtual phi nodes.  */
+  if (!is_gimple_reg (SSA_NAME_VAR (res)))
+    return;
+
   bb = gimple_bb (phi);
 
   arg = degenerate_phi_result (phi);
@@ -1028,11 +1055,11 @@ replace_phi_with_cond_gimple_assign_stmt (gimple phi, tree cond,
 	}
 
       /* Build new RHS using selected condition and arguments.  */
-      rhs = build3 (COND_EXPR, TREE_TYPE (PHI_RESULT (phi)),
+      rhs = build3 (COND_EXPR, TREE_TYPE (res),
 		    unshare_expr (cond), arg_0, arg_1);
     }
 
-  new_stmt = gimple_build_assign (PHI_RESULT (phi), rhs);
+  new_stmt = gimple_build_assign (res, rhs);
   SSA_NAME_DEF_STMT (gimple_phi_result (phi)) = new_stmt;
   gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
   update_stmt (new_stmt);
@@ -1044,11 +1071,11 @@ replace_phi_with_cond_gimple_assign_stmt (gimple phi, tree cond,
     }
 }
 
-/* Replaces in LOOP all the phi nodes other than those in the
+/* Replaces in LOOP all the scalar phi nodes other than those in the
    LOOP->header block with conditional modify expressions.  */
 
 static void
-ifconvert_phi_nodes (struct loop *loop)
+predicate_all_scalar_phis (struct loop *loop)
 {
   basic_block bb;
   unsigned int orig_loop_num_nodes = loop->num_nodes;
@@ -1077,7 +1104,7 @@ ifconvert_phi_nodes (struct loop *loop)
       while (!gsi_end_p (phi_gsi))
 	{
 	  phi = gsi_stmt (phi_gsi);
-	  replace_phi_with_cond_gimple_assign_stmt (phi, cond, true_bb, &gsi);
+	  predicate_scalar_phi (phi, cond, true_bb, &gsi);
 	  release_phi_node (phi);
 	  gsi_next (&phi_gsi);
 	}
@@ -1097,7 +1124,7 @@ insert_gimplified_predicates (loop_p loop)
   for (i = 0; i < loop->num_nodes; i++)
     {
       basic_block bb = ifc_bbs[i];
-      gimple_seq stmts = bb_predicate_gimplified_stmts (bb);
+      gimple_seq stmts;
 
       if (!is_predicated (bb))
 	{
@@ -1108,15 +1135,30 @@ insert_gimplified_predicates (loop_p loop)
 	  continue;
 	}
 
+      stmts = bb_predicate_gimplified_stmts (bb);
       if (stmts)
 	{
-	  gimple_stmt_iterator gsi = gsi_last_bb (bb);
-
-	  if (gsi_end_p (gsi)
-	      || gimple_code (gsi_stmt (gsi)) == GIMPLE_COND)
-	    gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT);
+	  if (flag_tree_loop_if_convert_memory_writes)
+	    {
+	      /* Insert the predicate of the BB just after the label,
+		 as the if-conversion of memory writes will use this
+		 predicate.  */
+	      gimple_stmt_iterator gsi = gsi_after_labels (bb);
+	      gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT);
+	    }
 	  else
-	    gsi_insert_seq_after (&gsi, stmts, GSI_SAME_STMT);
+	    {
+	      /* Insert the predicate of the BB at the end of the BB
+		 as this would reduce the register pressure: the only
+		 use of this predicate will be in successor BBs.  */
+	      gimple_stmt_iterator gsi = gsi_last_bb (bb);
+
+	      if (gsi_end_p (gsi)
+		  || gimple_code (gsi_stmt (gsi)) == GIMPLE_COND)
+		gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT);
+	      else
+		gsi_insert_seq_after (&gsi, stmts, GSI_SAME_STMT);
+	    }
 
 	  /* Once the sequence is code generated, set it to NULL.  */
 	  set_bb_predicate_gimplified_stmts (bb, NULL);
@@ -1124,6 +1166,56 @@ insert_gimplified_predicates (loop_p loop)
     }
 }
 
+/* Predicate each write to memory in LOOP.
+
+   Replace a statement "A[i] = foo" with "A[i] = cond ? foo : A[i]"
+   with the condition COND determined from the predicate of the basic
+   block containing the statement.  */
+
+static void
+predicate_mem_writes (loop_p loop)
+{
+  unsigned int i, orig_loop_num_nodes = loop->num_nodes;
+
+  for (i = 1; i < orig_loop_num_nodes; i++)
+    {
+      gimple_stmt_iterator gsi;
+      basic_block bb = ifc_bbs[i];
+      tree cond = bb_predicate (bb);
+
+      if (is_true_predicate (cond))
+	continue;
+
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	if (is_gimple_assign (gsi_stmt (gsi))
+	    && gimple_vdef (gsi_stmt (gsi)))
+	  {
+	    gimple new_stmt, lhs_stmt, rhs_stmt;
+	    gimple stmt = gsi_stmt (gsi);
+	    tree lhs = gimple_get_lhs (stmt);
+	    tree rhs = gimple_op (stmt, 1);
+
+	    gcc_assert (gimple_num_ops (stmt) == 2);
+
+	    rhs_stmt = ifc_temp_var (TREE_TYPE (rhs), unshare_expr (rhs));
+	    gsi_insert_before (&gsi, rhs_stmt, GSI_SAME_STMT);
+	    rhs = gimple_get_lhs (rhs_stmt);
+
+	    lhs_stmt = ifc_temp_var (TREE_TYPE (lhs), unshare_expr (lhs));
+	    gsi_insert_before (&gsi, lhs_stmt, GSI_SAME_STMT);
+	    lhs = gimple_get_lhs (lhs_stmt);
+
+	    cond = unshare_expr (cond);
+	    rhs = build3 (COND_EXPR, TREE_TYPE (lhs), cond, rhs, lhs);
+
+	    new_stmt = ifc_temp_var (TREE_TYPE (lhs), rhs);
+	    gsi_insert_before (&gsi, new_stmt, GSI_SAME_STMT);
+	    gimple_set_op (stmt, 1, gimple_get_lhs (new_stmt));
+	    update_stmt (stmt);
+	  }
+    }
+}
+
 /* Remove all GIMPLE_CONDs and GIMPLE_LABELs of all the basic blocks
    other than the exit and latch of the LOOP.  Also resets the
    GIMPLE_DEBUG information.  */
@@ -1180,7 +1272,10 @@ combine_blocks (struct loop *loop)
 
   remove_conditions_and_labels (loop);
   insert_gimplified_predicates (loop);
-  ifconvert_phi_nodes (loop);
+  predicate_all_scalar_phis (loop);
+
+  if (flag_tree_loop_if_convert_memory_writes)
+    predicate_mem_writes (loop);
 
   /* Merge basic blocks: first remove all the edges in the loop,
      except for those from the exit block.  */
@@ -1282,6 +1377,10 @@ tree_if_conversion (struct loop *loop)
      blocks into one huge basic block doing the if-conversion
      on-the-fly.  */
   combine_blocks (loop);
+
+  if (flag_tree_loop_if_convert_memory_writes)
+    mark_sym_for_renaming (gimple_vop (cfun));
+
   changed = true;
 
  cleanup:
@@ -1314,14 +1413,18 @@ main_tree_if_conversion (void)
   FOR_EACH_LOOP (li, loop, 0)
     changed |= tree_if_conversion (loop);
 
+  if (changed && flag_tree_loop_if_convert_memory_writes)
+    update_ssa (TODO_update_ssa);
+
   return changed ? TODO_cleanup_cfg : 0;
 }
 
 static bool
 gate_tree_if_conversion (void)
 {
-  if (flag_tree_vectorize
-      && flag_tree_loop_if_convert < 0)
+  if ((flag_tree_vectorize
+       && flag_tree_loop_if_convert < 0)
+      || flag_tree_loop_if_convert_memory_writes > 0)
     flag_tree_loop_if_convert = 1;
 
   return flag_tree_loop_if_convert > 0;
-- 
1.7.0.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] Add the -ftree-loop-if-convert flag.
  2010-07-07 22:53       ` [PATCH 1/4] Add the -ftree-loop-if-convert flag Sebastian Pop
@ 2010-07-08  9:02         ` Richard Guenther
  2010-07-08 16:40           ` Sebastian Pop
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Guenther @ 2010-07-08  9:02 UTC (permalink / raw)
  To: Sebastian Pop; +Cc: gcc-patches

On Wed, 7 Jul 2010, Sebastian Pop wrote:

> 	* common.opt (ftree-loop-if-convert): New flag.
> 	* doc/invoke.texi (ftree-loop-if-convert): Documented.
> 	* tree-if-conv.c (gate_tree_if_conversion): Enable if-conversion
> 	when flag_tree_loop_if_convert is set.
> ---
>  gcc/common.opt      |    4 ++++
>  gcc/doc/invoke.texi |   14 ++++++++++----
>  gcc/tree-if-conv.c  |    6 +++++-
>  3 files changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 6ca787a..111d7b7 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -653,6 +653,10 @@ fif-conversion2
>  Common Report Var(flag_if_conversion2) Optimization
>  Perform conversion of conditional jumps to conditional execution
>  
> +ftree-loop-if-convert
> +Common Report Var(flag_tree_loop_if_convert) Init(-1) Optimization
> +Convert conditional jumps in innermost loops to branchless equivalents
> +
>  ; -finhibit-size-directive inhibits output of .size for ELF.
>  ; This is used only for compiling crtstuff.c,
>  ; and it may be extended to other effects
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index d70f130..0847e01 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -342,7 +342,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fearly-inlining -fipa-sra -fexpensive-optimizations -ffast-math @gol
>  -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
>  -fforward-propagate -ffunction-sections @gol
> --fgcse -fgcse-after-reload -fgcse-las -fgcse-lm @gol
> +-fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
>  -fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol
>  -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
>  -finline-small-functions -fipa-cp -fipa-cp-clone -fipa-matrix-reorg -fipa-pta @gol
> @@ -352,7 +352,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fira-loop-pressure -fno-ira-share-save-slots @gol
>  -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
>  -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
> --floop-block -floop-interchange -floop-strip-mine -fgraphite-identity @gol
> +-floop-block -floop-interchange -floop-strip-mine @gol
>  -floop-parallelize-all -flto -flto-compression-level -flto-report -fltrans @gol
>  -fltrans-output-list -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
>  -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
> @@ -382,8 +382,8 @@ Objective-C and Objective-C++ Dialects}.
>  -fsplit-wide-types -fstack-protector -fstack-protector-all @gol
>  -fstrict-aliasing -fstrict-overflow -fthread-jumps -ftracer @gol
>  -ftree-builtin-call-dce -ftree-ccp -ftree-ch -ftree-copy-prop @gol
> --ftree-copyrename -ftree-dce @gol
> --ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-im @gol
> +-ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> +-ftree-forwprop -ftree-fre -ftree-loop-if-convert -ftree-loop-im @gol
>  -ftree-phiprop -ftree-loop-distribution @gol
>  -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
>  -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-pta -ftree-reassoc @gol
> @@ -6883,6 +6883,12 @@ profitable to parallelize the loops.
>  Compare the results of several data dependence analyzers.  This option
>  is used for debugging the data dependence analyzers.
>  
> +@item -ftree-loop-if-convert
> +Attempt to transform conditional jumps in the innermost loops to
> +branch-less equivalents.  The intent is to remove control-flow from
> +the innermost loops in order to improve the ability of the
> +auto-vectorization pass to handle these loops.
> +

Please state that this is enabled by default if vectorization is enabled.

>  @item -ftree-loop-distribution
>  Perform loop distribution.  This flag can improve cache performance on
>  big loop bodies and allow further loop optimizations, like
> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
> index 8d5d226..873cd89 100644
> --- a/gcc/tree-if-conv.c
> +++ b/gcc/tree-if-conv.c
> @@ -1242,7 +1242,11 @@ main_tree_if_conversion (void)
>  static bool
>  gate_tree_if_conversion (void)
>  {
> -  return flag_tree_vectorize != 0;
> +  if (flag_tree_vectorize
> +      && flag_tree_loop_if_convert < 0)
> +    flag_tree_loop_if_convert = 1;

Err, no.  This should be

  return ((flag_tree_vectorize && flag_tree_loop_if_convert != 0)
          || flag_tree_loop_if_convert == 1);

not set flag_tree_loop_if_convert here.

But on a 2nd thought please follow what -ftree-cselim does, do
Init(2) (ISTR -1 is now problematic for some reason), and in
process_options () set flag_tree_loop_if_convert if it is
equal to AUTODETECT_VALUE (2) to the setting of flag_tree_vectorize.

The gate function then simply can return flag_tree_loop_if_convert.

Ok with that change.

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] Call maybe_fold_or_comparisons to fold OR-ed predicates.
  2010-07-07 20:24 ` [PATCH 2/4] Call maybe_fold_or_comparisons to fold OR-ed predicates Sebastian Pop
@ 2010-07-08  9:57   ` Richard Guenther
  2010-07-08 16:40     ` Sebastian Pop
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Guenther @ 2010-07-08  9:57 UTC (permalink / raw)
  To: Sebastian Pop; +Cc: gcc-patches

On Wed, 7 Jul 2010, Sebastian Pop wrote:

> 	PR tree-optimization/44710
> 	* tree-if-conv.c (parse_predicate): New.
> 	(add_to_predicate_list): Call it, call maybe_fold_or_comparisons.
> 	Make sure that the predicates are either SSA_NAMEs or gimple_condexpr.
> 
> 	* gcc.dg/tree-ssa/ifc-6.c: New.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c |   15 +++++
>  gcc/tree-if-conv.c                    |   92 ++++++++++++++++++++++++++++++---
>  2 files changed, 100 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
> new file mode 100644
> index 0000000..a9c5db3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-c -O2 -ftree-vectorize" { target *-*-* } } */
> +
> +static int x;
> +foo (int n, int *A)
> +{
> +  int i;
> +  for (i = 0; i < n; i++)
> +    {
> +      if (A[i])
> +	x = 2;
> +      if (A[i + 1])
> +	x = 1;
> +    }
> +}
> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
> index ad106d7..03e453e 100644
> --- a/gcc/tree-if-conv.c
> +++ b/gcc/tree-if-conv.c
> @@ -259,17 +259,95 @@ is_predicated (basic_block bb)
>    return !is_true_predicate (bb_predicate (bb));
>  }
>  
> -/* Add condition NEW_COND to the predicate list of basic block BB.  */
> +/* Parses the predicate COND and returns its comparison code and
> +   operands OP0 and OP1.  */
> +
> +static enum tree_code
> +parse_predicate (tree cond, tree *op0, tree *op1)
> +{
> +  gimple s;
> +
> +  if (TREE_CODE (cond) == SSA_NAME
> +      && is_gimple_assign (s = SSA_NAME_DEF_STMT (cond)))
> +    {
> +      if (TREE_CODE_CLASS (gimple_assign_rhs_code (s)) == tcc_comparison)
> +	{
> +	  *op0 = gimple_assign_rhs1 (s);
> +	  *op1 = gimple_assign_rhs2 (s);
> +	  return gimple_assign_rhs_code (s);
> +	}
> +
> +      else if (gimple_assign_rhs_code (s) == TRUTH_NOT_EXPR)
> +	{
> +	  tree op = gimple_assign_rhs1 (s);
> +	  tree type = TREE_TYPE (op);
> +	  enum tree_code code = parse_predicate (op, op0, op1);
> +
> +	  return code == ERROR_MARK ? ERROR_MARK :
> +	    invert_tree_comparison (code, HONOR_NANS (TYPE_MODE (type)));

The : goes on the next line.

> +	}
> +
> +      return ERROR_MARK;
> +    }
> +
> +  if (TREE_CODE_CLASS (TREE_CODE (cond)) == tcc_comparison)
> +    {
> +      *op0 = TREE_OPERAND (cond, 0);
> +      *op1 = TREE_OPERAND (cond, 1);
> +      return TREE_CODE (cond);
> +    }
> +
> +  return ERROR_MARK;
> +}
> +
> +/* Add condition NC to the predicate list of basic block BB.  */
>  
>  static inline void
> -add_to_predicate_list (basic_block bb, tree new_cond)
> +add_to_predicate_list (basic_block bb, tree nc)
>  {
> -  tree cond = bb_predicate (bb);
> +  tree bc;
>  
> -  set_bb_predicate (bb, is_true_predicate (cond) ? new_cond :
> -		    fold_build2_loc (EXPR_LOCATION (cond),
> -				     TRUTH_OR_EXPR, boolean_type_node,
> -				     cond, new_cond));
> +  if (is_true_predicate (nc))
> +    return;
> +
> +  if (!is_predicated (bb))
> +    bc = nc;
> +  else
> +    {
> +      enum tree_code code1, code2;
> +      tree op1a, op1b, op2a, op2b;
> +
> +      bc = bb_predicate (bb);
> +      code1 = parse_predicate (bc, &op1a, &op1b);
> +      code2 = parse_predicate (nc, &op2a, &op2b);
> +
> +      if (code1 != ERROR_MARK && code2 != ERROR_MARK)
> +	{
> +	  bc = maybe_fold_or_comparisons (code1, op1a, op1b, code2, op2a, op2b);
> +
> +	  if (!bc)
> +	    {
> +	      bc = bb_predicate (bb);
> +	      bc = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
> +				    boolean_type_node, bc, nc);
> +	    }
> +	}
> +      else
> +	bc = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
> +			      boolean_type_node, bc, nc);

your re-use of bc makes this overly complicated.  Please use a new
tem for the maybe_fold_or_comparisons result and commonize the
!temp fold_build2_loc paths (and avoid re-loading bb_predicate)

> +    }
> +
> +  if (!is_gimple_condexpr (bc))
> +    {
> +      gimple_seq stmts;
> +      bc = force_gimple_operand (bc, &stmts, true, NULL_TREE);
> +      add_bb_predicate_gimplified_stmts (bb, stmts);
> +    }
> +
> +  if (is_true_predicate (bc))
> +    reset_bb_predicate (bb);
> +  else
> +    set_bb_predicate (bb, bc);
>  }
>  
>  /* Add the condition COND to the previous condition PREV_COND, and add
> 

Ok with that change.

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] Do not check whether memory references accessed in every iteration trap.
  2010-07-07 20:23 ` [PATCH 4/4] Do not check whether memory references accessed in every iteration trap Sebastian Pop
@ 2010-07-08 12:29   ` Michael Matz
  0 siblings, 0 replies; 17+ messages in thread
From: Michael Matz @ 2010-07-08 12:29 UTC (permalink / raw)
  To: Sebastian Pop; +Cc: gcc-patches, rguenther

Hi,

On Wed, 7 Jul 2010, Sebastian Pop wrote:

> +static bool
> +memrefs_read_or_written_unconditionally (gimple stmt,
> +					 VEC (data_reference_p, heap) *refs)
> +{
> +  struct data_reference *a, *b;
> +  unsigned int i, j;
> +  tree cond = bb_predicate (gimple_bb (stmt));
> +
> +  for (i = 0; VEC_iterate (data_reference_p, refs, i, a); i++)
> +    if (DR_STMT (a) == stmt)
> +      {
> +	for (j = 0; VEC_iterate (data_reference_p, refs, j, b); j++)

So, this is O(num-data-refs).

> +static bool
> +ifcvt_could_trap_p (gimple stmt, VEC (data_reference_p, heap) *refs)
> +{
> +  if (gimple_has_mem_ops (stmt)
> +      && ifcvt_memrefs_wont_trap (stmt, refs)
> +      && !operations_could_trap (stmt))
> +    return false;
> +
> +  return gimple_could_trap_p (stmt);

Therefore this is too.

> @@ -478,7 +630,7 @@ if_convertible_gimple_assign_stmt_p (gimple stmt)
>  
>    if (flag_loop_if_convert_memory_writes)
>      {
> -      if (gimple_could_trap_p (stmt))
> +      if (ifcvt_could_trap_p (stmt, refs))

And here you replace a O(1) with the above O(num-data-refs).  This itself 
is called on every assign statement for loops.  I think it would be much 
better to compute the trapping/non-trapping condition once for each 
data-ref and mark the associated statements in some way (we have 
pass-local flags for that for instance).


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] Add the -ftree-loop-if-convert flag.
  2010-07-08  9:02         ` Richard Guenther
@ 2010-07-08 16:40           ` Sebastian Pop
  0 siblings, 0 replies; 17+ messages in thread
From: Sebastian Pop @ 2010-07-08 16:40 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

On Thu, Jul 8, 2010 at 04:01, Richard Guenther <rguenther@suse.de> wrote:
> On Wed, 7 Jul 2010, Sebastian Pop wrote:
>
>>       * common.opt (ftree-loop-if-convert): New flag.
>>       * doc/invoke.texi (ftree-loop-if-convert): Documented.
>>       * tree-if-conv.c (gate_tree_if_conversion): Enable if-conversion
>>       when flag_tree_loop_if_convert is set.
>> ---
>>  gcc/common.opt      |    4 ++++
>>  gcc/doc/invoke.texi |   14 ++++++++++----
>>  gcc/tree-if-conv.c  |    6 +++++-
>>  3 files changed, 19 insertions(+), 5 deletions(-)
>>
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index 6ca787a..111d7b7 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -653,6 +653,10 @@ fif-conversion2
>>  Common Report Var(flag_if_conversion2) Optimization
>>  Perform conversion of conditional jumps to conditional execution
>>
>> +ftree-loop-if-convert
>> +Common Report Var(flag_tree_loop_if_convert) Init(-1) Optimization
>> +Convert conditional jumps in innermost loops to branchless equivalents
>> +
>>  ; -finhibit-size-directive inhibits output of .size for ELF.
>>  ; This is used only for compiling crtstuff.c,
>>  ; and it may be extended to other effects
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index d70f130..0847e01 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -342,7 +342,7 @@ Objective-C and Objective-C++ Dialects}.
>>  -fearly-inlining -fipa-sra -fexpensive-optimizations -ffast-math @gol
>>  -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
>>  -fforward-propagate -ffunction-sections @gol
>> --fgcse -fgcse-after-reload -fgcse-las -fgcse-lm @gol
>> +-fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
>>  -fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol
>>  -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
>>  -finline-small-functions -fipa-cp -fipa-cp-clone -fipa-matrix-reorg -fipa-pta @gol
>> @@ -352,7 +352,7 @@ Objective-C and Objective-C++ Dialects}.
>>  -fira-loop-pressure -fno-ira-share-save-slots @gol
>>  -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
>>  -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
>> --floop-block -floop-interchange -floop-strip-mine -fgraphite-identity @gol
>> +-floop-block -floop-interchange -floop-strip-mine @gol
>>  -floop-parallelize-all -flto -flto-compression-level -flto-report -fltrans @gol
>>  -fltrans-output-list -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
>>  -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
>> @@ -382,8 +382,8 @@ Objective-C and Objective-C++ Dialects}.
>>  -fsplit-wide-types -fstack-protector -fstack-protector-all @gol
>>  -fstrict-aliasing -fstrict-overflow -fthread-jumps -ftracer @gol
>>  -ftree-builtin-call-dce -ftree-ccp -ftree-ch -ftree-copy-prop @gol
>> --ftree-copyrename -ftree-dce @gol
>> --ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-im @gol
>> +-ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
>> +-ftree-forwprop -ftree-fre -ftree-loop-if-convert -ftree-loop-im @gol
>>  -ftree-phiprop -ftree-loop-distribution @gol
>>  -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
>>  -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-pta -ftree-reassoc @gol
>> @@ -6883,6 +6883,12 @@ profitable to parallelize the loops.
>>  Compare the results of several data dependence analyzers.  This option
>>  is used for debugging the data dependence analyzers.
>>
>> +@item -ftree-loop-if-convert
>> +Attempt to transform conditional jumps in the innermost loops to
>> +branch-less equivalents.  The intent is to remove control-flow from
>> +the innermost loops in order to improve the ability of the
>> +auto-vectorization pass to handle these loops.
>> +
>
> Please state that this is enabled by default if vectorization is enabled.
>
>>  @item -ftree-loop-distribution
>>  Perform loop distribution.  This flag can improve cache performance on
>>  big loop bodies and allow further loop optimizations, like
>> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
>> index 8d5d226..873cd89 100644
>> --- a/gcc/tree-if-conv.c
>> +++ b/gcc/tree-if-conv.c
>> @@ -1242,7 +1242,11 @@ main_tree_if_conversion (void)
>>  static bool
>>  gate_tree_if_conversion (void)
>>  {
>> -  return flag_tree_vectorize != 0;
>> +  if (flag_tree_vectorize
>> +      && flag_tree_loop_if_convert < 0)
>> +    flag_tree_loop_if_convert = 1;
>
> Err, no.  This should be
>
>  return ((flag_tree_vectorize && flag_tree_loop_if_convert != 0)
>          || flag_tree_loop_if_convert == 1);
>
> not set flag_tree_loop_if_convert here.
>
> But on a 2nd thought please follow what -ftree-cselim does, do
> Init(2) (ISTR -1 is now problematic for some reason), and in
> process_options () set flag_tree_loop_if_convert if it is
> equal to AUTODETECT_VALUE (2) to the setting of flag_tree_vectorize.
>
> The gate function then simply can return flag_tree_loop_if_convert.
>
> Ok with that change.
>

Committed r161963.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] Call maybe_fold_or_comparisons to fold OR-ed  predicates.
  2010-07-08  9:57   ` Richard Guenther
@ 2010-07-08 16:40     ` Sebastian Pop
  2010-07-09 12:11       ` Richard Guenther
  0 siblings, 1 reply; 17+ messages in thread
From: Sebastian Pop @ 2010-07-08 16:40 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

On Thu, Jul 8, 2010 at 04:57, Richard Guenther <rguenther@suse.de> wrote:
> On Wed, 7 Jul 2010, Sebastian Pop wrote:
>
>>       PR tree-optimization/44710
>>       * tree-if-conv.c (parse_predicate): New.
>>       (add_to_predicate_list): Call it, call maybe_fold_or_comparisons.
>>       Make sure that the predicates are either SSA_NAMEs or gimple_condexpr.
>>
>>       * gcc.dg/tree-ssa/ifc-6.c: New.
>> ---
>>  gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c |   15 +++++
>>  gcc/tree-if-conv.c                    |   92 ++++++++++++++++++++++++++++++---
>>  2 files changed, 100 insertions(+), 7 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
>>
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
>> new file mode 100644
>> index 0000000..a9c5db3
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
>> @@ -0,0 +1,15 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-c -O2 -ftree-vectorize" { target *-*-* } } */
>> +
>> +static int x;
>> +foo (int n, int *A)
>> +{
>> +  int i;
>> +  for (i = 0; i < n; i++)
>> +    {
>> +      if (A[i])
>> +     x = 2;
>> +      if (A[i + 1])
>> +     x = 1;
>> +    }
>> +}
>> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
>> index ad106d7..03e453e 100644
>> --- a/gcc/tree-if-conv.c
>> +++ b/gcc/tree-if-conv.c
>> @@ -259,17 +259,95 @@ is_predicated (basic_block bb)
>>    return !is_true_predicate (bb_predicate (bb));
>>  }
>>
>> -/* Add condition NEW_COND to the predicate list of basic block BB.  */
>> +/* Parses the predicate COND and returns its comparison code and
>> +   operands OP0 and OP1.  */
>> +
>> +static enum tree_code
>> +parse_predicate (tree cond, tree *op0, tree *op1)
>> +{
>> +  gimple s;
>> +
>> +  if (TREE_CODE (cond) == SSA_NAME
>> +      && is_gimple_assign (s = SSA_NAME_DEF_STMT (cond)))
>> +    {
>> +      if (TREE_CODE_CLASS (gimple_assign_rhs_code (s)) == tcc_comparison)
>> +     {
>> +       *op0 = gimple_assign_rhs1 (s);
>> +       *op1 = gimple_assign_rhs2 (s);
>> +       return gimple_assign_rhs_code (s);
>> +     }
>> +
>> +      else if (gimple_assign_rhs_code (s) == TRUTH_NOT_EXPR)
>> +     {
>> +       tree op = gimple_assign_rhs1 (s);
>> +       tree type = TREE_TYPE (op);
>> +       enum tree_code code = parse_predicate (op, op0, op1);
>> +
>> +       return code == ERROR_MARK ? ERROR_MARK :
>> +         invert_tree_comparison (code, HONOR_NANS (TYPE_MODE (type)));
>
> The : goes on the next line.
>
>> +     }
>> +
>> +      return ERROR_MARK;
>> +    }
>> +
>> +  if (TREE_CODE_CLASS (TREE_CODE (cond)) == tcc_comparison)
>> +    {
>> +      *op0 = TREE_OPERAND (cond, 0);
>> +      *op1 = TREE_OPERAND (cond, 1);
>> +      return TREE_CODE (cond);
>> +    }
>> +
>> +  return ERROR_MARK;
>> +}
>> +
>> +/* Add condition NC to the predicate list of basic block BB.  */
>>
>>  static inline void
>> -add_to_predicate_list (basic_block bb, tree new_cond)
>> +add_to_predicate_list (basic_block bb, tree nc)
>>  {
>> -  tree cond = bb_predicate (bb);
>> +  tree bc;
>>
>> -  set_bb_predicate (bb, is_true_predicate (cond) ? new_cond :
>> -                 fold_build2_loc (EXPR_LOCATION (cond),
>> -                                  TRUTH_OR_EXPR, boolean_type_node,
>> -                                  cond, new_cond));
>> +  if (is_true_predicate (nc))
>> +    return;
>> +
>> +  if (!is_predicated (bb))
>> +    bc = nc;
>> +  else
>> +    {
>> +      enum tree_code code1, code2;
>> +      tree op1a, op1b, op2a, op2b;
>> +
>> +      bc = bb_predicate (bb);
>> +      code1 = parse_predicate (bc, &op1a, &op1b);
>> +      code2 = parse_predicate (nc, &op2a, &op2b);
>> +
>> +      if (code1 != ERROR_MARK && code2 != ERROR_MARK)
>> +     {
>> +       bc = maybe_fold_or_comparisons (code1, op1a, op1b, code2, op2a, op2b);
>> +
>> +       if (!bc)
>> +         {
>> +           bc = bb_predicate (bb);
>> +           bc = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
>> +                                 boolean_type_node, bc, nc);
>> +         }
>> +     }
>> +      else
>> +     bc = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
>> +                           boolean_type_node, bc, nc);
>
> your re-use of bc makes this overly complicated.  Please use a new
> tem for the maybe_fold_or_comparisons result and commonize the
> !temp fold_build2_loc paths (and avoid re-loading bb_predicate)
>
>> +    }
>> +
>> +  if (!is_gimple_condexpr (bc))
>> +    {
>> +      gimple_seq stmts;
>> +      bc = force_gimple_operand (bc, &stmts, true, NULL_TREE);
>> +      add_bb_predicate_gimplified_stmts (bb, stmts);
>> +    }
>> +
>> +  if (is_true_predicate (bc))
>> +    reset_bb_predicate (bb);
>> +  else
>> +    set_bb_predicate (bb, bc);
>>  }
>>
>>  /* Add the condition COND to the previous condition PREV_COND, and add
>>
>
> Ok with that change.
>

Committed r161964.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] Call maybe_fold_or_comparisons to fold OR-ed  predicates.
  2010-07-08 16:40     ` Sebastian Pop
@ 2010-07-09 12:11       ` Richard Guenther
  2010-07-09 16:41         ` Sebastian Pop
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Guenther @ 2010-07-09 12:11 UTC (permalink / raw)
  To: Sebastian Pop; +Cc: gcc-patches

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5564 bytes --]

On Thu, 8 Jul 2010, Sebastian Pop wrote:

> On Thu, Jul 8, 2010 at 04:57, Richard Guenther <rguenther@suse.de> wrote:
> > On Wed, 7 Jul 2010, Sebastian Pop wrote:
> >
> >>       PR tree-optimization/44710
> >>       * tree-if-conv.c (parse_predicate): New.
> >>       (add_to_predicate_list): Call it, call maybe_fold_or_comparisons.
> >>       Make sure that the predicates are either SSA_NAMEs or gimple_condexpr.
> >>
> >>       * gcc.dg/tree-ssa/ifc-6.c: New.
> >> ---
> >>  gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c |   15 +++++
> >>  gcc/tree-if-conv.c                    |   92 ++++++++++++++++++++++++++++++---
> >>  2 files changed, 100 insertions(+), 7 deletions(-)
> >>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
> >>
> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
> >> new file mode 100644
> >> index 0000000..a9c5db3
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
> >> @@ -0,0 +1,15 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-c -O2 -ftree-vectorize" { target *-*-* } } */
> >> +
> >> +static int x;
> >> +foo (int n, int *A)
> >> +{
> >> +  int i;
> >> +  for (i = 0; i < n; i++)
> >> +    {
> >> +      if (A[i])
> >> +     x = 2;
> >> +      if (A[i + 1])
> >> +     x = 1;
> >> +    }
> >> +}
> >> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
> >> index ad106d7..03e453e 100644
> >> --- a/gcc/tree-if-conv.c
> >> +++ b/gcc/tree-if-conv.c
> >> @@ -259,17 +259,95 @@ is_predicated (basic_block bb)
> >>    return !is_true_predicate (bb_predicate (bb));
> >>  }
> >>
> >> -/* Add condition NEW_COND to the predicate list of basic block BB.  */
> >> +/* Parses the predicate COND and returns its comparison code and
> >> +   operands OP0 and OP1.  */
> >> +
> >> +static enum tree_code
> >> +parse_predicate (tree cond, tree *op0, tree *op1)
> >> +{
> >> +  gimple s;
> >> +
> >> +  if (TREE_CODE (cond) == SSA_NAME
> >> +      && is_gimple_assign (s = SSA_NAME_DEF_STMT (cond)))
> >> +    {
> >> +      if (TREE_CODE_CLASS (gimple_assign_rhs_code (s)) == tcc_comparison)
> >> +     {
> >> +       *op0 = gimple_assign_rhs1 (s);
> >> +       *op1 = gimple_assign_rhs2 (s);
> >> +       return gimple_assign_rhs_code (s);
> >> +     }
> >> +
> >> +      else if (gimple_assign_rhs_code (s) == TRUTH_NOT_EXPR)
> >> +     {
> >> +       tree op = gimple_assign_rhs1 (s);
> >> +       tree type = TREE_TYPE (op);
> >> +       enum tree_code code = parse_predicate (op, op0, op1);
> >> +
> >> +       return code == ERROR_MARK ? ERROR_MARK :
> >> +         invert_tree_comparison (code, HONOR_NANS (TYPE_MODE (type)));
> >
> > The : goes on the next line.
> >
> >> +     }
> >> +
> >> +      return ERROR_MARK;
> >> +    }
> >> +
> >> +  if (TREE_CODE_CLASS (TREE_CODE (cond)) == tcc_comparison)
> >> +    {
> >> +      *op0 = TREE_OPERAND (cond, 0);
> >> +      *op1 = TREE_OPERAND (cond, 1);
> >> +      return TREE_CODE (cond);
> >> +    }
> >> +
> >> +  return ERROR_MARK;
> >> +}
> >> +
> >> +/* Add condition NC to the predicate list of basic block BB.  */
> >>
> >>  static inline void
> >> -add_to_predicate_list (basic_block bb, tree new_cond)
> >> +add_to_predicate_list (basic_block bb, tree nc)
> >>  {
> >> -  tree cond = bb_predicate (bb);
> >> +  tree bc;
> >>
> >> -  set_bb_predicate (bb, is_true_predicate (cond) ? new_cond :
> >> -                 fold_build2_loc (EXPR_LOCATION (cond),
> >> -                                  TRUTH_OR_EXPR, boolean_type_node,
> >> -                                  cond, new_cond));
> >> +  if (is_true_predicate (nc))
> >> +    return;
> >> +
> >> +  if (!is_predicated (bb))
> >> +    bc = nc;
> >> +  else
> >> +    {
> >> +      enum tree_code code1, code2;
> >> +      tree op1a, op1b, op2a, op2b;
> >> +
> >> +      bc = bb_predicate (bb);
> >> +      code1 = parse_predicate (bc, &op1a, &op1b);
> >> +      code2 = parse_predicate (nc, &op2a, &op2b);
> >> +
> >> +      if (code1 != ERROR_MARK && code2 != ERROR_MARK)
> >> +     {
> >> +       bc = maybe_fold_or_comparisons (code1, op1a, op1b, code2, op2a, op2b);
> >> +
> >> +       if (!bc)
> >> +         {
> >> +           bc = bb_predicate (bb);
> >> +           bc = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
> >> +                                 boolean_type_node, bc, nc);
> >> +         }
> >> +     }
> >> +      else
> >> +     bc = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
> >> +                           boolean_type_node, bc, nc);
> >
> > your re-use of bc makes this overly complicated.  Please use a new
> > tem for the maybe_fold_or_comparisons result and commonize the
> > !temp fold_build2_loc paths (and avoid re-loading bb_predicate)
> >
> >> +    }
> >> +
> >> +  if (!is_gimple_condexpr (bc))
> >> +    {
> >> +      gimple_seq stmts;
> >> +      bc = force_gimple_operand (bc, &stmts, true, NULL_TREE);
> >> +      add_bb_predicate_gimplified_stmts (bb, stmts);
> >> +    }
> >> +
> >> +  if (is_true_predicate (bc))
> >> +    reset_bb_predicate (bb);
> >> +  else
> >> +    set_bb_predicate (bb, bc);
> >>  }
> >>
> >>  /* Add the condition COND to the previous condition PREV_COND, and add
> >>
> >
> > Ok with that change.
> >
> 
> Committed r161964.

Why did you commit without the requested change?

Richard.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] Call maybe_fold_or_comparisons to fold OR-ed  predicates.
  2010-07-09 12:11       ` Richard Guenther
@ 2010-07-09 16:41         ` Sebastian Pop
  2010-07-09 16:45           ` Richard Guenther
  0 siblings, 1 reply; 17+ messages in thread
From: Sebastian Pop @ 2010-07-09 16:41 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

On Fri, Jul 9, 2010 at 07:10, Richard Guenther <rguenther@suse.de> wrote:
> On Thu, 8 Jul 2010, Sebastian Pop wrote:
>
>> On Thu, Jul 8, 2010 at 04:57, Richard Guenther <rguenther@suse.de> wrote:
>> > On Wed, 7 Jul 2010, Sebastian Pop wrote:
>> >
>> >>       PR tree-optimization/44710
>> >>       * tree-if-conv.c (parse_predicate): New.
>> >>       (add_to_predicate_list): Call it, call maybe_fold_or_comparisons.
>> >>       Make sure that the predicates are either SSA_NAMEs or gimple_condexpr.
>> >>
>> >>       * gcc.dg/tree-ssa/ifc-6.c: New.
>> >> ---
>> >>  gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c |   15 +++++
>> >>  gcc/tree-if-conv.c                    |   92 ++++++++++++++++++++++++++++++---
>> >>  2 files changed, 100 insertions(+), 7 deletions(-)
>> >>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
>> >>
>> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
>> >> new file mode 100644
>> >> index 0000000..a9c5db3
>> >> --- /dev/null
>> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
>> >> @@ -0,0 +1,15 @@
>> >> +/* { dg-do compile } */
>> >> +/* { dg-options "-c -O2 -ftree-vectorize" { target *-*-* } } */
>> >> +
>> >> +static int x;
>> >> +foo (int n, int *A)
>> >> +{
>> >> +  int i;
>> >> +  for (i = 0; i < n; i++)
>> >> +    {
>> >> +      if (A[i])
>> >> +     x = 2;
>> >> +      if (A[i + 1])
>> >> +     x = 1;
>> >> +    }
>> >> +}
>> >> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
>> >> index ad106d7..03e453e 100644
>> >> --- a/gcc/tree-if-conv.c
>> >> +++ b/gcc/tree-if-conv.c
>> >> @@ -259,17 +259,95 @@ is_predicated (basic_block bb)
>> >>    return !is_true_predicate (bb_predicate (bb));
>> >>  }
>> >>
>> >> -/* Add condition NEW_COND to the predicate list of basic block BB.  */
>> >> +/* Parses the predicate COND and returns its comparison code and
>> >> +   operands OP0 and OP1.  */
>> >> +
>> >> +static enum tree_code
>> >> +parse_predicate (tree cond, tree *op0, tree *op1)
>> >> +{
>> >> +  gimple s;
>> >> +
>> >> +  if (TREE_CODE (cond) == SSA_NAME
>> >> +      && is_gimple_assign (s = SSA_NAME_DEF_STMT (cond)))
>> >> +    {
>> >> +      if (TREE_CODE_CLASS (gimple_assign_rhs_code (s)) == tcc_comparison)
>> >> +     {
>> >> +       *op0 = gimple_assign_rhs1 (s);
>> >> +       *op1 = gimple_assign_rhs2 (s);
>> >> +       return gimple_assign_rhs_code (s);
>> >> +     }
>> >> +
>> >> +      else if (gimple_assign_rhs_code (s) == TRUTH_NOT_EXPR)
>> >> +     {
>> >> +       tree op = gimple_assign_rhs1 (s);
>> >> +       tree type = TREE_TYPE (op);
>> >> +       enum tree_code code = parse_predicate (op, op0, op1);
>> >> +
>> >> +       return code == ERROR_MARK ? ERROR_MARK :
>> >> +         invert_tree_comparison (code, HONOR_NANS (TYPE_MODE (type)));
>> >
>> > The : goes on the next line.
>> >
>> >> +     }
>> >> +
>> >> +      return ERROR_MARK;
>> >> +    }
>> >> +
>> >> +  if (TREE_CODE_CLASS (TREE_CODE (cond)) == tcc_comparison)
>> >> +    {
>> >> +      *op0 = TREE_OPERAND (cond, 0);
>> >> +      *op1 = TREE_OPERAND (cond, 1);
>> >> +      return TREE_CODE (cond);
>> >> +    }
>> >> +
>> >> +  return ERROR_MARK;
>> >> +}
>> >> +
>> >> +/* Add condition NC to the predicate list of basic block BB.  */
>> >>
>> >>  static inline void
>> >> -add_to_predicate_list (basic_block bb, tree new_cond)
>> >> +add_to_predicate_list (basic_block bb, tree nc)
>> >>  {
>> >> -  tree cond = bb_predicate (bb);
>> >> +  tree bc;
>> >>
>> >> -  set_bb_predicate (bb, is_true_predicate (cond) ? new_cond :
>> >> -                 fold_build2_loc (EXPR_LOCATION (cond),
>> >> -                                  TRUTH_OR_EXPR, boolean_type_node,
>> >> -                                  cond, new_cond));
>> >> +  if (is_true_predicate (nc))
>> >> +    return;
>> >> +
>> >> +  if (!is_predicated (bb))
>> >> +    bc = nc;
>> >> +  else
>> >> +    {
>> >> +      enum tree_code code1, code2;
>> >> +      tree op1a, op1b, op2a, op2b;
>> >> +
>> >> +      bc = bb_predicate (bb);
>> >> +      code1 = parse_predicate (bc, &op1a, &op1b);
>> >> +      code2 = parse_predicate (nc, &op2a, &op2b);
>> >> +
>> >> +      if (code1 != ERROR_MARK && code2 != ERROR_MARK)
>> >> +     {
>> >> +       bc = maybe_fold_or_comparisons (code1, op1a, op1b, code2, op2a, op2b);
>> >> +
>> >> +       if (!bc)
>> >> +         {
>> >> +           bc = bb_predicate (bb);
>> >> +           bc = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
>> >> +                                 boolean_type_node, bc, nc);
>> >> +         }
>> >> +     }
>> >> +      else
>> >> +     bc = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
>> >> +                           boolean_type_node, bc, nc);
>> >
>> > your re-use of bc makes this overly complicated.  Please use a new
>> > tem for the maybe_fold_or_comparisons result and commonize the
>> > !temp fold_build2_loc paths (and avoid re-loading bb_predicate)
>> >
>> >> +    }
>> >> +
>> >> +  if (!is_gimple_condexpr (bc))
>> >> +    {
>> >> +      gimple_seq stmts;
>> >> +      bc = force_gimple_operand (bc, &stmts, true, NULL_TREE);
>> >> +      add_bb_predicate_gimplified_stmts (bb, stmts);
>> >> +    }
>> >> +
>> >> +  if (is_true_predicate (bc))
>> >> +    reset_bb_predicate (bb);
>> >> +  else
>> >> +    set_bb_predicate (bb, bc);
>> >>  }
>> >>
>> >>  /* Add the condition COND to the previous condition PREV_COND, and add
>> >>
>> >
>> > Ok with that change.
>> >
>>
>> Committed r161964.
>
> Why did you commit without the requested change?

The two changes that you requested are implemented in the
patch that I committed.  I addressed this in the exact way
you described: using a temp.

+      if (code1 != ERROR_MARK && code2 != ERROR_MARK)
+	{
+	  tree t = maybe_fold_or_comparisons (code1, op1a, op1b,
+					      code2, op2a, op2b);
+	  if (!t)
+	    t = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
+				 boolean_type_node, bc, nc);
+	  bc = t;
+	}

Sebastian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] Call maybe_fold_or_comparisons to fold OR-ed  predicates.
  2010-07-09 16:41         ` Sebastian Pop
@ 2010-07-09 16:45           ` Richard Guenther
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Guenther @ 2010-07-09 16:45 UTC (permalink / raw)
  To: Sebastian Pop; +Cc: Richard Guenther, gcc-patches

On Fri, Jul 9, 2010 at 6:40 PM, Sebastian Pop <sebpop@gmail.com> wrote:
> On Fri, Jul 9, 2010 at 07:10, Richard Guenther <rguenther@suse.de> wrote:
>> On Thu, 8 Jul 2010, Sebastian Pop wrote:
>>
>>> On Thu, Jul 8, 2010 at 04:57, Richard Guenther <rguenther@suse.de> wrote:
>>> > On Wed, 7 Jul 2010, Sebastian Pop wrote:
>>> >
>>> >>       PR tree-optimization/44710
>>> >>       * tree-if-conv.c (parse_predicate): New.
>>> >>       (add_to_predicate_list): Call it, call maybe_fold_or_comparisons.
>>> >>       Make sure that the predicates are either SSA_NAMEs or gimple_condexpr.
>>> >>
>>> >>       * gcc.dg/tree-ssa/ifc-6.c: New.
>>> >> ---
>>> >>  gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c |   15 +++++
>>> >>  gcc/tree-if-conv.c                    |   92 ++++++++++++++++++++++++++++++---
>>> >>  2 files changed, 100 insertions(+), 7 deletions(-)
>>> >>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
>>> >>
>>> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
>>> >> new file mode 100644
>>> >> index 0000000..a9c5db3
>>> >> --- /dev/null
>>> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-6.c
>>> >> @@ -0,0 +1,15 @@
>>> >> +/* { dg-do compile } */
>>> >> +/* { dg-options "-c -O2 -ftree-vectorize" { target *-*-* } } */
>>> >> +
>>> >> +static int x;
>>> >> +foo (int n, int *A)
>>> >> +{
>>> >> +  int i;
>>> >> +  for (i = 0; i < n; i++)
>>> >> +    {
>>> >> +      if (A[i])
>>> >> +     x = 2;
>>> >> +      if (A[i + 1])
>>> >> +     x = 1;
>>> >> +    }
>>> >> +}
>>> >> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
>>> >> index ad106d7..03e453e 100644
>>> >> --- a/gcc/tree-if-conv.c
>>> >> +++ b/gcc/tree-if-conv.c
>>> >> @@ -259,17 +259,95 @@ is_predicated (basic_block bb)
>>> >>    return !is_true_predicate (bb_predicate (bb));
>>> >>  }
>>> >>
>>> >> -/* Add condition NEW_COND to the predicate list of basic block BB.  */
>>> >> +/* Parses the predicate COND and returns its comparison code and
>>> >> +   operands OP0 and OP1.  */
>>> >> +
>>> >> +static enum tree_code
>>> >> +parse_predicate (tree cond, tree *op0, tree *op1)
>>> >> +{
>>> >> +  gimple s;
>>> >> +
>>> >> +  if (TREE_CODE (cond) == SSA_NAME
>>> >> +      && is_gimple_assign (s = SSA_NAME_DEF_STMT (cond)))
>>> >> +    {
>>> >> +      if (TREE_CODE_CLASS (gimple_assign_rhs_code (s)) == tcc_comparison)
>>> >> +     {
>>> >> +       *op0 = gimple_assign_rhs1 (s);
>>> >> +       *op1 = gimple_assign_rhs2 (s);
>>> >> +       return gimple_assign_rhs_code (s);
>>> >> +     }
>>> >> +
>>> >> +      else if (gimple_assign_rhs_code (s) == TRUTH_NOT_EXPR)
>>> >> +     {
>>> >> +       tree op = gimple_assign_rhs1 (s);
>>> >> +       tree type = TREE_TYPE (op);
>>> >> +       enum tree_code code = parse_predicate (op, op0, op1);
>>> >> +
>>> >> +       return code == ERROR_MARK ? ERROR_MARK :
>>> >> +         invert_tree_comparison (code, HONOR_NANS (TYPE_MODE (type)));
>>> >
>>> > The : goes on the next line.
>>> >
>>> >> +     }
>>> >> +
>>> >> +      return ERROR_MARK;
>>> >> +    }
>>> >> +
>>> >> +  if (TREE_CODE_CLASS (TREE_CODE (cond)) == tcc_comparison)
>>> >> +    {
>>> >> +      *op0 = TREE_OPERAND (cond, 0);
>>> >> +      *op1 = TREE_OPERAND (cond, 1);
>>> >> +      return TREE_CODE (cond);
>>> >> +    }
>>> >> +
>>> >> +  return ERROR_MARK;
>>> >> +}
>>> >> +
>>> >> +/* Add condition NC to the predicate list of basic block BB.  */
>>> >>
>>> >>  static inline void
>>> >> -add_to_predicate_list (basic_block bb, tree new_cond)
>>> >> +add_to_predicate_list (basic_block bb, tree nc)
>>> >>  {
>>> >> -  tree cond = bb_predicate (bb);
>>> >> +  tree bc;
>>> >>
>>> >> -  set_bb_predicate (bb, is_true_predicate (cond) ? new_cond :
>>> >> -                 fold_build2_loc (EXPR_LOCATION (cond),
>>> >> -                                  TRUTH_OR_EXPR, boolean_type_node,
>>> >> -                                  cond, new_cond));
>>> >> +  if (is_true_predicate (nc))
>>> >> +    return;
>>> >> +
>>> >> +  if (!is_predicated (bb))
>>> >> +    bc = nc;
>>> >> +  else
>>> >> +    {
>>> >> +      enum tree_code code1, code2;
>>> >> +      tree op1a, op1b, op2a, op2b;
>>> >> +
>>> >> +      bc = bb_predicate (bb);
>>> >> +      code1 = parse_predicate (bc, &op1a, &op1b);
>>> >> +      code2 = parse_predicate (nc, &op2a, &op2b);
>>> >> +
>>> >> +      if (code1 != ERROR_MARK && code2 != ERROR_MARK)
>>> >> +     {
>>> >> +       bc = maybe_fold_or_comparisons (code1, op1a, op1b, code2, op2a, op2b);
>>> >> +
>>> >> +       if (!bc)
>>> >> +         {
>>> >> +           bc = bb_predicate (bb);
>>> >> +           bc = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
>>> >> +                                 boolean_type_node, bc, nc);
>>> >> +         }
>>> >> +     }
>>> >> +      else
>>> >> +     bc = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
>>> >> +                           boolean_type_node, bc, nc);
>>> >
>>> > your re-use of bc makes this overly complicated.  Please use a new
>>> > tem for the maybe_fold_or_comparisons result and commonize the
>>> > !temp fold_build2_loc paths (and avoid re-loading bb_predicate)
>>> >
>>> >> +    }
>>> >> +
>>> >> +  if (!is_gimple_condexpr (bc))
>>> >> +    {
>>> >> +      gimple_seq stmts;
>>> >> +      bc = force_gimple_operand (bc, &stmts, true, NULL_TREE);
>>> >> +      add_bb_predicate_gimplified_stmts (bb, stmts);
>>> >> +    }
>>> >> +
>>> >> +  if (is_true_predicate (bc))
>>> >> +    reset_bb_predicate (bb);
>>> >> +  else
>>> >> +    set_bb_predicate (bb, bc);
>>> >>  }
>>> >>
>>> >>  /* Add the condition COND to the previous condition PREV_COND, and add
>>> >>
>>> >
>>> > Ok with that change.
>>> >
>>>
>>> Committed r161964.
>>
>> Why did you commit without the requested change?
>
> The two changes that you requested are implemented in the
> patch that I committed.  I addressed this in the exact way
> you described: using a temp.
>
> +      if (code1 != ERROR_MARK && code2 != ERROR_MARK)
> +       {
> +         tree t = maybe_fold_or_comparisons (code1, op1a, op1b,
> +                                             code2, op2a, op2b);
> +         if (!t)
> +           t = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
> +                                boolean_type_node, bc, nc);
> +         bc = t;
> +       }

I see in my tree

      enum tree_code code1, code2;
      tree op1a, op1b, op2a, op2b;

      bc = bb_predicate (bb);
      code1 = parse_predicate (bc, &op1a, &op1b);
      code2 = parse_predicate (nc, &op2a, &op2b);

      if (code1 != ERROR_MARK && code2 != ERROR_MARK)
        {
          tree t = maybe_fold_or_comparisons (code1, op1a, op1b,
                                              code2, op2a, op2b);
          if (!t)
            t = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
                                 boolean_type_node, bc, nc);
          bc = t;
        }
      else
        bc = fold_build2_loc (EXPR_LOCATION (bc), TRUTH_OR_EXPR,
                              boolean_type_node, bc, nc);

which is indeed different from the version you proposed initially
(sorry for not noticing), but there appearantly was a misunderstanding
as that code still calls fold_build2_loc twice with the same args.
Sth you fixed with the followup patch.

Sorry for the noise,
Richard.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2010-07-09 16:45 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-07-07 20:23 [PATCH 0/4] if-conversion of loops with conditionals containing memory writes Sebastian Pop
2010-07-07 20:22 ` [PATCH 1/4] Add the -floop-if-convert flag Sebastian Pop
2010-07-07 21:39   ` Richard Guenther
2010-07-07 21:49     ` Sebastian Pop
2010-07-07 22:53       ` [PATCH 1/4] Add the -ftree-loop-if-convert flag Sebastian Pop
2010-07-08  9:02         ` Richard Guenther
2010-07-08 16:40           ` Sebastian Pop
2010-07-07 20:23 ` [PATCH 3/4] Add flag -floop-if-convert-memory-writes Sebastian Pop
2010-07-07 22:54   ` [PATCH 3/4] Add flag -ftree-loop-if-convert-memory-writes Sebastian Pop
2010-07-07 20:23 ` [PATCH 4/4] Do not check whether memory references accessed in every iteration trap Sebastian Pop
2010-07-08 12:29   ` Michael Matz
2010-07-07 20:24 ` [PATCH 2/4] Call maybe_fold_or_comparisons to fold OR-ed predicates Sebastian Pop
2010-07-08  9:57   ` Richard Guenther
2010-07-08 16:40     ` Sebastian Pop
2010-07-09 12:11       ` Richard Guenther
2010-07-09 16:41         ` Sebastian Pop
2010-07-09 16:45           ` Richard Guenther

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).