public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Yuri Rumyantsev <ysrumyan@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	Igor Zamyatin <izamyatin@gmail.com>
Subject: Re: [PATCH 2/3] Extended if-conversion
Date: Mon, 01 Dec 2014 15:53:00 -0000	[thread overview]
Message-ID: <CAEoMCqQX5qEExZ7Xtz_JNzus5FQizvWkbFSW5y+TGNjt71BHaQ@mail.gmail.com> (raw)
In-Reply-To: <CAFiYyc3ONiHL1nsLK1MSEqZR94pHYmb+rK-M24n2fzACvFDiVA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 6882 bytes --]

Hi Richard,

I resend you patch1 and patch2 with minor changes:
1. I renamed flag_force_vectorize to aggressive_if_conv.
2. Use static cast for the first argument of gimple_phi_arg_edge.
I also very sorry that I sent you bad patch.

Now let me answer on your questions related to second patch.
1. Why we need both predicate_extended_scalar_phi and
predicate_arbitrary_scalar_phi?

Let's consider the following simple test-case:

  #pragma omp simd safelen(8)
  for (i=0; i<512; i++)
  {
    float t = a[i];
    if (t > 0.0f & t < 1.0e+17f)
      if (c[i] != 0)  /* c is integer array. */
res += 1;
  }

we can see the following phi node correspondent to res:

# res_1 = PHI <res_15(3), res_15(4), res_10(5)>

It is clear that we can optimize it to phi node with 2 arguments only
and only one check can be used for phi predication (for reduction in
our case), namely predicate of bb_5. In general case we can't do it
even if we sort all phi argument values since we still have to produce
a chain of cond expressions to perform phi predication (see comments
for predicate_arbitrary_scalar_phi).
2. Why we need to introduce find_insertion_point?
 Let's consider another test-case extracted from 175.vpr ( t5.c is
attached) and we can see that bb_7 and bb_9 containig phi nodes has
only critical incoming edges and both contain code computing edge
predicates, e.g.

<bb 7>:
# xmax_edge_30 = PHI <xmax_edge_36(6), xmax_edge_18(5)>
_46 = xmax_17 == xmax_37;
_47 = xmax_17 == xmax_27;
_48 = _46 & _47;
_53 = xmax_17 == xmax_37;
_54 = ~_53;
_55 = xmax_17 == xmax_27;
_56 = _54 & _55;
_57 = _48 | _56;
xmax_edge_19 = xmax_edge_39 + 1;
goto <bb 11>;

It is evident that we can not put phi predication at the block
beginning but need to put it after predicate computations.
Note also that if there are no critical edges for phi arguments
insertion point will be "after labels" Note also that phi result can
have use in this block too, so we can't put predication code to the
block end.

Let me know if you still have any questions.

Best regards.
Yuri.




2014-11-28 15:43 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
> On Wed, Nov 12, 2014 at 2:35 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>> Hi All,
>>
>> Here is the second patch related to extended predication.
>> Few comments which explain a main goal of design.
>>
>> 1. I don't want to insert any critical edge splitting since it may
>> lead to less efficient binaries.
>> 2. One special case of extended PHI node predication was introduced
>> when #arguments is more than 2 but only two arguments are different
>> and one argument has the only occurrence. For such PHI conditional
>> scalar reduction is applied.
>> This is correspondent to the following statement:
>>     if (q1 && q2 && q3) var++
>>  New function phi_has_two_different_args was introduced to detect such phi.
>> 3. Original algorithm for PHI predication used assumption that at
>> least one incoming edge for blocks containing PHI is not critical - it
>> guarantees that all computations related to predicate of normal edge
>> are already inserted above this block and
>> code related to PHI predication can be inserted at the beginning of
>> block. But this is not true for critical edges for which predicate
>> computations are  in the block where code for phi predication must be
>> inserted. So new function find_insertion_point is introduced which is
>> simply found out the last statement in block defining predicates
>> correspondent to all incoming edges and insert phi predication code
>> after it (with some minor exceptions).
>
> Unfortunately the patch doesn't apply for me - I get
>
> patch: **** malformed patch at line 505: @@ -1720,6 +2075,8 @@
> predicate_all_scalar_phis (struct loop *loop)
>
> a few remarks nevertheless.  I don't see how we need both
> predicate_extended_scalar_phi and predicate_arbitrary_scalar_phi.
> Couldn't we simply sort an array of (edge, value) pairs after value
> and handle equal values specially in predicate_extended_scalar_phi?
> That would even make PHI <a, a, b, c, c> more optimal.
>
> I don't understand the need for find_insertion_point.  All SSA names
> required for the predicates are defined upward - and the complex CFG
> is squashed to a single basic-block, thus the defs will dominate the
> inserted code if you insert after labels just like for the other case.
> Or what am I missing?  ("flattening" of the basic-blocks of course needs
> to happen in dominator order - but I guess that happens already?)
>
> I'd like the extended PHI handling to be enablable by a flag even
> for !force-vectorization - I've seen cases with 3 PHI args multiple
> times that would have been nice to vectorize.  I suggest to
> add -ftree-loop-if-convert-aggressive for this.  We can do this as
> followup, but please rename the local flag_force_vectorize flag
> to something less looking like a flag, like simply 'aggressive'.
>
> Otherwise patch 2 looks ok to me.
>
> Richard.
>
>
>> ChangeLog:
>>
>> 2014-10-24  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>
>> * tree-if-conv.c (ifcvt_can_use_mask_load_store): Use
>> FLAG_FORCE_VECTORIZE instead of loop flag.
>> (if_convertible_bb_p): Allow bb has more than 2 predecessors if
>> FLAG_FORCE_VECTORIZE is true.
>> (if_convertible_bb_p): Delete check that bb has at least one
>> non-critical incoming edge.
>> (phi_has_two_different_args): New function.
>> (is_cond_scalar_reduction): Add argument EXTENDED to choose access
>> to phi arguments. Invoke phi_has_two_different_args to get phi
>> arguments if EXTENDED is true. Change check that block
>> containing reduction statement candidate is predecessor
>> of phi-block since phi may have more than two arguments.
>> (convert_scalar_cond_reduction): Add argument BEFORE to insert
>> statement before/after gsi point.
>> (predicate_scalar_phi): Add argument false (which means non-extended
>> predication) to call of is_cond_scalar_reduction. Add argument
>> true (which correspondent to argument BEFORE) to call of
>> convert_scalar_cond_reduction.
>> (get_predicate_for_edge): New function.
>> (predicate_arbitrary_scalar_phi): New function.
>> (predicate_extended_scalar_phi): New function.
>> (find_insertion_point): New function.
>> (predicate_all_scalar_phis): Add two boolean variables EXTENDED and
>> BEFORE. Initialize EXTENDED to true if BB containing phi has more
>> than 2 predecessors or both incoming edges are critical. Invoke
>> find_phi_replacement_condition and predicate_scalar_phi or
>> find_insertion_point and predicate_extended_scalar_phi depending on
>> EXTENDED value.
>> (insert_gimplified_predicates): Add check that non-predicated block
>> may have statements to insert. Insert predicate of BB just after label
>> if FLAG_FORCE_VECTORIZE is true.
>> (tree_if_conversion): Add initialization of FLAG_FORCE_VECTORIZE which
>> is copy of inner or outer loop field force_vectorize.

[-- Attachment #2: t5.c --]
[-- Type: text/x-csrc, Size: 713 bytes --]

#define N 512
#define max(x,y) (x) >= (y)? (x) : (y)
#define min(x,y) (x) <= (y)? (x) : (y)
int c_X[N];
int x_max;
int x_min;
extern int nx;

void foo (int n)
{
  int i, x;
  int xmin, xmax;
  int xmin_edge, xmax_edge;

  x = c_X[0];
  xmin = xmax = x;
  xmin_edge = xmax_edge  = 1;
#pragma omp simd safelen(8)
  for (i = 1; i<n; i++) {
    x = c_X[i];
    x = max(min(x,nx),1);
    if (x == xmin) {  
       xmin_edge++;
    }
    if (x == xmax) {
       xmax_edge++;
    }
    else if (x < xmin) {
       xmin = x;
       xmin_edge = 1;
    }
    else if (x > xmax) {
       xmax = x;
       xmax_edge = 1;
    }

  }
    x_max = xmax_edge;
    x_min = xmin_edge;
}

   

[-- Attachment #3: if-conv.patch1.1 --]
[-- Type: application/octet-stream, Size: 7951 bytes --]

diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index f7befac..6c9ad32 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -131,6 +131,9 @@ along with GCC; see the file COPYING3.  If not see
 /* List of basic blocks in if-conversion-suitable order.  */
 static basic_block *ifc_bbs;
 
+/* Apply more aggressive (extended) if-conversion if true.  */
+static bool aggressive_if_conv;
+
 /* Structure used to predicate basic blocks.  This is attached to the
    ->aux field of the BBs in the loop to be if-converted.  */
 typedef struct bb_predicate_s {
@@ -160,6 +163,17 @@ bb_predicate (basic_block bb)
   return ((bb_predicate_p) bb->aux)->predicate;
 }
 
+/* Returns predicate for critical edge E.  */
+
+static inline tree
+edge_predicate (edge e)
+{
+  gcc_assert (EDGE_COUNT (e->src->succs) >= 2);
+  gcc_assert (EDGE_COUNT (e->dest->preds) >= 2);
+  gcc_assert (e->aux != NULL);
+  return (tree) e->aux;
+}
+
 /* Sets the gimplified predicate COND for basic block BB.  */
 
 static inline void
@@ -171,6 +185,16 @@ set_bb_predicate (basic_block bb, tree cond)
   ((bb_predicate_p) bb->aux)->predicate = cond;
 }
 
+/* Sets predicate COND for critical edge E.
+   Assumes that #(E->src->succs) >=2 & #(E->dest->preds) >= 2.  */
+
+static inline void
+set_edge_predicate (edge e, tree cond)
+{
+  gcc_assert (cond != NULL_TREE);
+  e->aux = cond;
+}
+
 /* Returns the sequence of statements of the gimplification of the
    predicate for basic block BB.  */
 
@@ -485,10 +509,16 @@ add_to_dst_predicate_list (struct loop *loop, edge e,
     cond = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
 			prev_cond, cond);
 
-  add_to_predicate_list (loop, e->dest, cond);
+  if (!dominated_by_p (CDI_DOMINATORS, loop->latch, e->dest))
+    add_to_predicate_list (loop, e->dest, cond);
+
+  /* If edge E is critical save predicate on it.
+     Assume that #(e->src->succs) >= 2.  */
+  if (EDGE_COUNT (e->dest->preds) >= 2)
+    set_edge_predicate (e, cond);
 }
 
-/* Return true if one of the successor edges of BB exits LOOP.  */
+/* Returns true if one of the successor edges of BB exits LOOP.  */
 
 static bool
 bb_with_exit_edge_p (struct loop *loop, basic_block bb)
@@ -512,7 +542,9 @@ bb_with_exit_edge_p (struct loop *loop, basic_block bb)
    When the flag_tree_loop_if_convert_stores is not set, PHI is not
    if-convertible if:
    - a virtual PHI is immediately used in another PHI node,
-   - there is a virtual PHI in a BB other than the loop->header.  */
+   - there is a virtual PHI in a BB other than the loop->header.
+   When the aggressive_if_conv is set, PHI can have more than
+   two arguments.  */
 
 static bool
 if_convertible_phi_p (struct loop *loop, basic_block bb, gphi *phi,
@@ -524,11 +556,17 @@ if_convertible_phi_p (struct loop *loop, basic_block bb, gphi *phi,
       print_gimple_stmt (dump_file, phi, 0, TDF_SLIM);
     }
 
-  if (bb != loop->header && gimple_phi_num_args (phi) != 2)
+  if (bb != loop->header)
     {
-      if (dump_file && (dump_flags & TDF_DETAILS))
-	fprintf (dump_file, "More than two phi node args.\n");
-      return false;
+      if (gimple_phi_num_args (phi) != 2)
+	{
+	  if (!aggressive_if_conv)
+	    {
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		fprintf (dump_file, "More than two phi node args.\n");
+	      return false;
+	    }
+        }
     }
 
   if (flag_tree_loop_if_convert_stores || any_mask_load_store)
@@ -895,7 +933,8 @@ if_convertible_gimple_assign_stmt_p (gimple stmt,
 
    A statement is if-convertible if:
    - it is an if-convertible GIMPLE_ASSIGN,
-   - it is a GIMPLE_LABEL or a GIMPLE_COND.  */
+   - it is a GIMPLE_LABEL or a GIMPLE_COND,
+   - it is builtins call.  */
 
 static bool
 if_convertible_stmt_p (gimple stmt, vec<data_reference_p> refs,
@@ -942,6 +981,22 @@ if_convertible_stmt_p (gimple stmt, vec<data_reference_p> refs,
   return true;
 }
 
+/* Assumes that BB has more than 1 predecessors.
+   Returns false if at least one successor is not on critical edge
+   and true otherwise.  */
+
+static inline bool
+all_preds_critical_p (basic_block bb)
+{
+  edge e;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, bb->preds)
+    if (EDGE_COUNT (e->src->succs) == 1)
+      return false;
+  return true;
+}
+
 /* Return true when BB is if-convertible.  This routine does not check
    basic block's statements and phis.
 
@@ -950,6 +1005,9 @@ if_convertible_stmt_p (gimple stmt, vec<data_reference_p> refs,
    - it is after the exit block but before the latch,
    - its edges are not normal.
 
+   Last restriction will be deleted after adding support for extended
+   predication.
+
    EXIT_BB is the basic block containing the exit of the LOOP.  BB is
    inside LOOP.  */
 
@@ -1001,18 +1059,17 @@ if_convertible_bb_p (struct loop *loop, basic_block bb, basic_block exit_bb)
 
   /* At least one incoming edge has to be non-critical as otherwise edge
      predicates are not equal to basic-block predicates of the edge
-     source.  */
+     source. This restriction will be removed after adding support for
+     extended predication.  */
   if (EDGE_COUNT (bb->preds) > 1
       && bb != loop->header)
     {
-      bool found = false;
-      FOR_EACH_EDGE (e, ei, bb->preds)
-	if (EDGE_COUNT (e->src->succs) == 1)
-	  found = true;
-      if (!found)
+      if (!aggressive_if_conv && all_preds_critical_p (bb))
 	{
 	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "only critical predecessors\n");
+	    fprintf (dump_file, "only critical predecessors in bb#%d\n",
+		      bb->index);
+
 	  return false;
 	}
     }
@@ -1126,11 +1183,12 @@ predicate_bbs (loop_p loop)
       tree cond;
       gimple stmt;
 
-      /* The loop latch is always executed and has no extra conditions
-	 to be processed: skip it.  */
-      if (bb == loop->latch)
+      /* The loop latch and loop exit block are always executed and
+	 have no extra conditions to be processed: skip them.  */
+      if (bb == loop->latch
+	  || bb_with_exit_edge_p (loop, bb))
 	{
-	  reset_bb_predicate (loop->latch);
+	  reset_bb_predicate (bb);
 	  continue;
 	}
 
@@ -1141,7 +1199,7 @@ predicate_bbs (loop_p loop)
 	  tree c2;
 	  edge true_edge, false_edge;
 	  location_t loc = gimple_location (stmt);
-	  tree c = fold_build2_loc (loc, gimple_cond_code (stmt),
+	  tree c = build2_loc (loc, gimple_cond_code (stmt),
 				    boolean_type_node,
 				    gimple_cond_lhs (stmt),
 				    gimple_cond_rhs (stmt));
@@ -1150,6 +1208,8 @@ predicate_bbs (loop_p loop)
 	  extract_true_false_edges_from_block (gimple_bb (stmt),
 					       &true_edge, &false_edge);
 
+          true_edge->aux = false_edge->aux = NULL;
+
 	  /* If C is true, then TRUE_EDGE is taken.  */
 	  add_to_dst_predicate_list (loop, true_edge, unshare_expr (cond),
 				     unshare_expr (c));
@@ -2145,6 +2205,9 @@ tree_if_conversion (struct loop *loop)
   ifc_bbs = NULL;
   bool any_mask_load_store = false;
 
+  /* Temporary set up this flag to false.  */
+  aggressive_if_conv = false;
+
   if (!if_convertible_loop_p (loop, &any_mask_load_store)
       || !dbg_cnt (if_conversion_tree))
     goto cleanup;
@@ -2154,7 +2217,9 @@ tree_if_conversion (struct loop *loop)
 	  || loop->dont_vectorize))
     goto cleanup;
 
-  if (any_mask_load_store && !version_loop_for_if_conversion (loop))
+  if ((any_mask_load_store
+       || (loop->force_vectorize && flag_tree_loop_if_convert != 1))
+      && !version_loop_for_if_conversion (loop))
     goto cleanup;
 
   /* Now all statements are if-convertible.  Combine all the basic
@@ -2175,7 +2240,15 @@ tree_if_conversion (struct loop *loop)
       unsigned int i;
 
       for (i = 0; i < loop->num_nodes; i++)
-	free_bb_predicate (ifc_bbs[i]);
+	{
+	  basic_block bb = ifc_bbs[i];
+	  free_bb_predicate (bb);
+	  if (EDGE_COUNT (bb->succs) == 2)
+	    {
+	      EDGE_SUCC (bb, 0)->aux = NULL;
+	      EDGE_SUCC (bb, 1)->aux = NULL;
+	    }
+	}
 
       free (ifc_bbs);
       ifc_bbs = NULL;

[-- Attachment #4: if-conv.patch2.1 --]
[-- Type: application/octet-stream, Size: 17100 bytes --]

diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 6c9ad32..bde2119 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -1020,10 +1020,15 @@ if_convertible_bb_p (struct loop *loop, basic_block bb, basic_block exit_bb)
   if (dump_file && (dump_flags & TDF_DETAILS))
     fprintf (dump_file, "----------[%d]-------------\n", bb->index);
 
-  if (EDGE_COUNT (bb->preds) > 2
-      || EDGE_COUNT (bb->succs) > 2)
+  if (EDGE_COUNT (bb->succs) > 2)
     return false;
 
+  if (EDGE_COUNT (bb->preds) > 2)
+    {
+      if (!aggressive_if_conv)
+	return false;
+    }
+
   if (exit_bb)
     {
       if (bb != loop->latch)
@@ -1057,23 +1062,6 @@ if_convertible_bb_p (struct loop *loop, basic_block bb, basic_block exit_bb)
 	return false;
       }
 
-  /* At least one incoming edge has to be non-critical as otherwise edge
-     predicates are not equal to basic-block predicates of the edge
-     source. This restriction will be removed after adding support for
-     extended predication.  */
-  if (EDGE_COUNT (bb->preds) > 1
-      && bb != loop->header)
-    {
-      if (!aggressive_if_conv && all_preds_critical_p (bb))
-	{
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "only critical predecessors in bb#%d\n",
-		      bb->index);
-
-	  return false;
-	}
-    }
-
   return true;
 }
 
@@ -1477,6 +1465,66 @@ find_phi_replacement_condition (basic_block bb, tree *cond,
   return first_edge->src;
 }
 
+/* Returns true if phi arguments are equal except for one; argument values and
+   index of exclusive argument are saved if needed.  */
+
+static bool
+phi_has_two_different_args (gimple phi, tree *arg_0, tree *arg_1,
+			    unsigned int *index)
+{
+  unsigned int i, ind0 = 0, ind1;
+  tree arg0, arg1 = NULL_TREE;
+  bool seen_same = false;
+
+  arg0 = gimple_phi_arg_def (phi, 0);
+  for (i = 1; i < gimple_phi_num_args (phi); i++)
+    {
+      tree tmp;
+      tmp = gimple_phi_arg_def (phi, i);
+      if (arg0 == NULL_TREE
+	  && operand_equal_p (tmp, arg1, 0) == 0)
+	{
+	  arg0 = tmp;
+	  ind0 = i;
+	}
+      else if (seen_same && operand_equal_p (tmp, arg1, 0) != 0)
+	continue;
+      else if (operand_equal_p (tmp, arg0, 0) == 0)
+	{
+	  if (arg1 == NULL_TREE)
+	    {
+	      arg1 = tmp;
+	      ind1 = i;
+	    }
+	  else if (operand_equal_p (tmp, arg1, 0) == 0)
+	    return false;
+	  else
+	    seen_same = true;
+	}
+      else if (!seen_same)
+	{
+	  /* Swap arguments.  */
+	  seen_same = true;
+	  arg0 = arg1;
+	  arg1 = tmp;
+	  ind0 = ind1;
+	}
+      else
+	return false;
+    }
+  if (arg0 == NULL_TREE)
+    return false;
+
+  if (arg_0)
+    *arg_0 = arg0;
+  if (arg_1)
+    *arg_1 = arg1;
+  if (index)
+    *index = ind0;
+
+  return true;
+}
+
 /* Returns true if def-stmt for phi argument ARG is simple increment/decrement
    which is in predicated basic block.
    In fact, the following PHI pattern is searching:
@@ -1487,11 +1535,12 @@ find_phi_replacement_condition (basic_block bb, tree *cond,
 	  reduc_3 = ...
 	reduc_2 = PHI <reduc_1, reduc_3>
 
-   REDUC, OP0 and OP1 contain reduction stmt and its operands.  */
+   REDUC, OP0 and OP1 contain reduction stmt and its operands.
+   EXTENDED is used to get phi arguments through different methods.  */
 
 static bool
 is_cond_scalar_reduction (gimple phi, gimple *reduc,
-			  tree *op0, tree *op1)
+			  tree *op0, tree *op1, bool extended)
 {
   tree lhs, r_op1, r_op2;
   tree arg_0, arg_1;
@@ -1503,9 +1552,19 @@ is_cond_scalar_reduction (gimple phi, gimple *reduc,
   edge latch_e = loop_latch_edge (loop);
   imm_use_iterator imm_iter;
   use_operand_p use_p;
+  edge e;
+  edge_iterator ei;
+  bool result = false;
+
+  if (!extended)
+    {
+      arg_0 = PHI_ARG_DEF (phi, 0);
+      arg_1 = PHI_ARG_DEF (phi, 1);
+    }
+  else
+    /* Phi may have more than 2 arguments, but only two are different.  */
+    phi_has_two_different_args (phi, &arg_0, &arg_1, NULL);
 
-  arg_0 = PHI_ARG_DEF (phi, 0);
-  arg_1 = PHI_ARG_DEF (phi, 1);
   if (TREE_CODE (arg_0) != SSA_NAME || TREE_CODE (arg_1) != SSA_NAME)
     return false;
 
@@ -1540,8 +1599,13 @@ is_cond_scalar_reduction (gimple phi, gimple *reduc,
     return false;
 
   /* Check that stmt-block is predecessor of phi-block.  */
-  if (EDGE_PRED (bb, 0)->src != gimple_bb (stmt)
-      && EDGE_PRED (bb, 1)->src != gimple_bb (stmt))
+  FOR_EACH_EDGE (e, ei, gimple_bb (stmt)->succs)
+    if (e->dest == bb)
+      {
+	result = true;
+	break;
+      }
+  if (!result)
     return false;
 
   if (!has_single_use (lhs))
@@ -1597,11 +1661,13 @@ is_cond_scalar_reduction (gimple phi, gimple *reduc,
     res_2 = res_13 + _ifc__1;
   Argument SWAP tells that arguments of conditional expression should be
   swapped.
+  Argument BEFORE is used to insert new statement before/after.
   Returns rhs of resulting PHI assignment.  */
 
 static tree
 convert_scalar_cond_reduction (gimple reduc, gimple_stmt_iterator *gsi,
-			       tree cond, tree op0, tree op1, bool swap)
+			       tree cond, tree op0, tree op1, bool swap,
+			       bool before)
 {
   gimple_stmt_iterator stmt_it;
   gimple new_assign;
@@ -1626,7 +1692,10 @@ convert_scalar_cond_reduction (gimple reduc, gimple_stmt_iterator *gsi,
 
   /* Create assignment stmt and insert it at GSI.  */
   new_assign = gimple_build_assign (tmp, c);
-  gsi_insert_before (gsi, new_assign, GSI_SAME_STMT);
+  if (before)
+    gsi_insert_before (gsi, new_assign, GSI_SAME_STMT);
+  else
+    gsi_insert_after (gsi, new_assign, GSI_NEW_STMT);
   /* Build rhs for unconditional increment/decrement.  */
   rhs = fold_build2 (gimple_assign_rhs_code (reduc),
 		     TREE_TYPE (rhs1), op0, tmp);
@@ -1694,10 +1763,11 @@ predicate_scalar_phi (gphi *phi, tree cond,
 	  arg_0 = gimple_phi_arg_def (phi, 0);
 	  arg_1 = gimple_phi_arg_def (phi, 1);
 	}
-      if (is_cond_scalar_reduction (phi, &reduc, &op0, &op1))
+      if (is_cond_scalar_reduction (phi, &reduc, &op0, &op1, false))
 	/* Convert reduction stmt into vectorizable form.  */
 	rhs = convert_scalar_cond_reduction (reduc, gsi, cond, op0, op1,
-					     true_bb != gimple_bb (reduc));
+					     true_bb != gimple_bb (reduc),
+					     true);
       else
 	/* Build new RHS using selected condition and arguments.  */
 	rhs = fold_build_cond_expr (TREE_TYPE (res), unshare_expr (cond),
@@ -1715,6 +1785,288 @@ predicate_scalar_phi (gphi *phi, tree cond,
     }
 }
 
+/* Returns predicate of edge associated with argument of phi node.  */
+
+static tree
+get_predicate_for_edge (edge e)
+{
+  tree c;
+  basic_block b = e->src;
+
+  if (EDGE_COUNT (b->succs) == 1 || EDGE_COUNT (e->dest->preds) == 1)
+    /* Edge E is not critical, use predicate of edge source bb.  */
+    c = bb_predicate (b);
+  else
+    /* Edge E is critical and its aux field contains predicate.  */
+    c = edge_predicate (e);
+  return c;
+}
+
+/* This is enhancement for predication of a phi node with arbitrary
+   number of arguments, i.e. for
+	x = phi (x_1, x_2, ..., x_k)
+   a chain of recurrent cond expressions will be produced.
+   For example,
+	bb_0
+	if (_5 != 0) goto bb_1 else goto bb_2
+	end_bb_0
+
+	bb_1
+	res_2 = some computations;
+	goto bb_5
+	end_bb_1
+
+	bb_2
+	if (_9 != 0) goto bb_3 else goto bb_4
+	end_bb_2
+
+	bb_3
+	res_3 = ...;
+	goto bb_5
+	end_bb_3
+
+	bb4
+	res_4 = ...;
+	end_bb_4
+
+	bb_5
+	# res_1 = PHI <res_2(1), res_3(3), res_4(4)>
+
+    will be if-converted into chain of unconditional assignments:
+	_ifc__42 = <PRD_3> ? res_3 : res_4;
+	res_1 = _5 != 0 ? res_2 : _ifc__42;
+
+    where <PRD_3> is predicate of <bb_3>.
+
+    All created intermediate statements are inserted at GSI point
+    using value of argumnet BEFORE.
+    Returns cond expression correspondent to rhs of new phi
+    replacement stmt.  */
+
+static tree
+predicate_arbitrary_scalar_phi (gimple phi, gimple_stmt_iterator *gsi,
+				bool before)
+{
+  int i;
+  int num = (int) gimple_phi_num_args (phi);
+  tree last = gimple_phi_arg_def (phi, num - 1);
+  tree type = TREE_TYPE (gimple_phi_result (phi));
+  tree curr;
+  tree c;
+  gimple stmt;
+  tree lhs;
+  tree cond;
+  bool swap = false;
+
+  gcc_assert (aggressive_if_conv);
+  for (i = num - 2; i > 0; i--)
+    {
+      curr = gimple_phi_arg_def (phi, i);
+      lhs = make_temp_ssa_name (type, NULL, "_ifc_");
+      cond = get_predicate_for_edge (gimple_phi_arg_edge (as_a <gphi *> (phi),
+				     i));
+      swap = false;
+      if (TREE_CODE (cond) == TRUTH_NOT_EXPR)
+	{
+	  cond = TREE_OPERAND (cond, 0);
+	  swap = true;
+	}
+      /* Gimplify the condition to a valid cond-expr conditonal operand.  */
+      if (before)
+	cond = force_gimple_operand_gsi_1 (gsi, unshare_expr (cond),
+					   is_gimple_condexpr, NULL_TREE,
+					   true, GSI_SAME_STMT);
+      else
+	cond = force_gimple_operand_gsi_1 (gsi, unshare_expr (cond),
+					   is_gimple_condexpr, NULL_TREE,
+					   false, GSI_CONTINUE_LINKING);
+
+      c = fold_build_cond_expr (type, unshare_expr (cond),
+				 swap? last : curr,
+				 swap? curr : last);
+      stmt = gimple_build_assign (lhs, c);
+
+      if (before)
+	gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+      else
+	gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
+      update_stmt (stmt);
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	{
+	  fprintf (dump_file, "Create new assign stmt for phi arg#%d\n", i);
+	  print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
+	}
+      last = lhs;
+    }
+  curr = gimple_phi_arg_def (phi, 0);
+  cond = get_predicate_for_edge (gimple_phi_arg_edge (as_a <gphi *> (phi), 0));
+  swap = false;
+  if (TREE_CODE (cond) == TRUTH_NOT_EXPR)
+    {
+      cond = TREE_OPERAND (cond, 0);
+      swap = true;
+    }
+  if (before)
+    cond = force_gimple_operand_gsi_1 (gsi, unshare_expr (cond),
+				       is_gimple_condexpr, NULL_TREE, true,
+				       GSI_SAME_STMT);
+  else
+    cond = force_gimple_operand_gsi_1 (gsi, unshare_expr (cond),
+				       is_gimple_condexpr, NULL_TREE, false,
+				       GSI_CONTINUE_LINKING);
+
+  return fold_build_cond_expr (type,
+			       unshare_expr (cond),
+			       swap? last : curr,
+			       swap? curr : last);
+}
+
+/* Replace scalar phi node with more than 2 arguments. Distinguish
+   one important particular case if phi has only two different
+   arguments and one of them has the only occurance.  */
+
+static void
+predicate_extended_scalar_phi (gimple phi, gimple_stmt_iterator *gsi,
+			       bool before)
+{
+  gimple new_stmt, reduc;
+  tree rhs, res, arg0, arg1, op0, op1;
+  tree cond;
+  unsigned int index0;
+  edge e;
+  bool swap = false;
+
+  res = gimple_phi_result (phi);
+  if (virtual_operand_p (res))
+    return;
+
+  if (!phi_has_two_different_args (phi, &arg0, &arg1, &index0))
+    rhs = predicate_arbitrary_scalar_phi (phi, gsi, before);
+  else
+    {
+      e = gimple_phi_arg_edge (as_a <gphi *> (phi), index0);
+      cond = get_predicate_for_edge (e);
+      if (TREE_CODE (cond) == TRUTH_NOT_EXPR)
+	{
+	  swap = true;
+	  cond = TREE_OPERAND (cond, 0);
+	}
+      /* Gimplify the condition to a valid cond-expr conditonal operand.  */
+      if (before)
+	cond = force_gimple_operand_gsi_1 (gsi, unshare_expr (cond),
+					   is_gimple_condexpr, NULL_TREE,
+					   true, GSI_SAME_STMT);
+      else
+	cond = force_gimple_operand_gsi_1 (gsi, unshare_expr (cond),
+					   is_gimple_condexpr, NULL_TREE,
+					   false, GSI_CONTINUE_LINKING);
+      if (!(is_cond_scalar_reduction (phi, &reduc, &op0, &op1, true)))
+	rhs = fold_build_cond_expr (TREE_TYPE (res), unshare_expr (cond),
+				    swap? arg1 : arg0,
+				    swap? arg0 : arg1);
+      else
+	/* Convert reduction stmt into vectorizable form.  */
+	rhs = convert_scalar_cond_reduction (reduc, gsi, cond, op0, op1,
+					     swap, before);
+    }
+  new_stmt = gimple_build_assign (res, rhs);
+  if (before)
+    gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+  else
+    gsi_insert_after (gsi, new_stmt, GSI_NEW_STMT);
+  update_stmt (new_stmt);
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "new extended phi replacement stmt\n");
+      print_gimple_stmt (dump_file, new_stmt, 0, TDF_SLIM);
+    }
+}
+
+/* Returns gimple statement iterator to insert code for predicated phi.  */
+
+static gimple_stmt_iterator
+find_insertion_point (basic_block bb, bool* before)
+{
+  edge e;
+  edge_iterator ei;
+  tree cond;
+  gimple last = NULL;
+  gimple curr;
+  int num_opnd;
+  tree opnd1, opnd2;
+
+  /* Found last statement in bb after which code for predicated phi can be
+     inserted using edge predicates.  */
+  FOR_EACH_EDGE (e, ei, bb->preds)
+    {
+      cond = get_predicate_for_edge (e);
+      if (TREE_CODE (cond) == SSA_NAME)
+	{
+	  opnd1 = cond;
+	  opnd2 = NULL_TREE;
+	}
+      else if (TREE_CONSTANT (cond))
+	continue;
+      else if ((num_opnd = TREE_OPERAND_LENGTH (cond)) == 2)
+	{
+	  opnd1 = TREE_OPERAND (cond, 0);
+	  opnd2 = TREE_OPERAND (cond, 1);
+	}
+      else
+	{
+	  gcc_assert (num_opnd == 1);
+	  opnd1 = TREE_OPERAND (cond, 0);
+	  opnd2 = NULL_TREE;
+	}
+      /* Process each operand of cond to determine the latest defenition.  */
+      while (true)
+	{
+	  if (TREE_CODE (opnd1) == SSA_NAME)
+	    {
+	      curr = SSA_NAME_DEF_STMT (opnd1);
+	      /* Skip defenition in other bb's.  */
+	      if (gimple_bb (curr) == bb)
+		{
+		  if (last == NULL)
+		    last = curr;
+		  else
+		    {
+		      /* Determine what stmt is latest in bb.  */
+		      gimple_stmt_iterator gsi;
+		      gimple stmt;
+		      for (gsi = gsi_last_bb (bb);
+			   !gsi_end_p (gsi);
+			    gsi_prev (&gsi))
+			if ((stmt = gsi_stmt (gsi)) == last)
+			  break;
+			else if (stmt == curr)
+			  {
+			    last = curr;
+			    break;
+			  }
+		    }
+		}
+	    }
+	    if (opnd2 != NULL_TREE)
+	      {
+		opnd1 = opnd2;
+		opnd2 = NULL_TREE;
+	      }
+	    else
+	      break;
+	}
+    }
+
+  if (last == NULL)
+    {
+      *before = true;
+      return gsi_after_labels (bb);
+    }
+  *before = false;
+  return gsi_for_stmt (last);
+}
+
 /* Replaces in LOOP all the scalar phi nodes other than those in the
    LOOP->header block with conditional modify expressions.  */
 
@@ -1724,6 +2076,8 @@ predicate_all_scalar_phis (struct loop *loop)
   basic_block bb;
   unsigned int orig_loop_num_nodes = loop->num_nodes;
   unsigned int i;
+  bool extended;
+  bool before = false;
 
   for (i = 1; i < orig_loop_num_nodes; i++)
     {
@@ -1741,15 +2095,26 @@ predicate_all_scalar_phis (struct loop *loop)
       if (gsi_end_p (phi_gsi))
 	continue;
 
-      /* BB has two predecessors.  Using predecessor's aux field, set
-	 appropriate condition for the PHI node replacement.  */
-      gsi = gsi_after_labels (bb);
-      true_bb = find_phi_replacement_condition (bb, &cond, &gsi);
+      /* If BB has more than 2 predecessors or all incoming edges to bb
+	 are critical, must handle PHI through extended predication.  */
+      extended = EDGE_COUNT (bb->preds) != 2 || all_preds_critical_p (bb);
+      if (!extended)
+	{
+	  /* BB has two predecessors.  Using predecessor's aux field, set
+	     appropriate condition for the PHI node replacement.  */
+	  gsi = gsi_after_labels (bb);
+	  true_bb = find_phi_replacement_condition (bb, &cond, &gsi);
+	}
+      else
+	gsi = find_insertion_point (bb, &before);
 
       while (!gsi_end_p (phi_gsi))
 	{
 	  phi = phi_gsi.phi ();
-	  predicate_scalar_phi (phi, cond, true_bb, &gsi);
+	  if (!extended)
+	    predicate_scalar_phi (phi, cond, true_bb, &gsi);
+	  else
+	    predicate_extended_scalar_phi (phi, &gsi, before);
 	  release_phi_node (phi);
 	  gsi_next (&phi_gsi);
 	}
@@ -1771,7 +2136,8 @@ insert_gimplified_predicates (loop_p loop, bool any_mask_load_store)
       basic_block bb = ifc_bbs[i];
       gimple_seq stmts;
 
-      if (!is_predicated (bb))
+      /* Non-predicated join blocks can have the statements to insert.  */
+      if (!is_predicated (bb) && bb_predicate_gimplified_stmts (bb) == NULL)
 	{
 	  /* Do not insert statements for a basic block that is not
 	     predicated.  Also make sure that the predicate of the
@@ -1784,7 +2150,8 @@ insert_gimplified_predicates (loop_p loop, bool any_mask_load_store)
       if (stmts)
 	{
 	  if (flag_tree_loop_if_convert_stores
-	      || any_mask_load_store)
+	      || any_mask_load_store
+	      || aggressive_if_conv)
 	    {
 	      /* Insert the predicate of the BB just after the label,
 		 as the if-conversion of memory writes will use this
@@ -2205,8 +2572,15 @@ tree_if_conversion (struct loop *loop)
   ifc_bbs = NULL;
   bool any_mask_load_store = false;
 
-  /* Temporary set up this flag to false.  */
-  aggressive_if_conv = false;
+  /* Set-up aggressive if-conversion for loops marked with simd pragma.  */
+  aggressive_if_conv = loop->force_vectorize;
+  /* Check either outer loop was marked with simd pragma.  */
+  if (!aggressive_if_conv)
+    {
+      struct loop *outer_loop = loop_outer (loop);
+      if (outer_loop && outer_loop->force_vectorize)
+	aggressive_if_conv = true;
+    }
 
   if (!if_convertible_loop_p (loop, &any_mask_load_store)
       || !dbg_cnt (if_conversion_tree))

  reply	other threads:[~2014-12-01 15:53 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-12 13:36 Yuri Rumyantsev
2014-11-28 12:46 ` Richard Biener
2014-12-01 15:53   ` Yuri Rumyantsev [this message]
2014-12-02 13:28     ` Richard Biener
2014-12-02 15:29       ` Yuri Rumyantsev
2014-12-04 12:41         ` Richard Biener
2014-12-04 13:15           ` Yuri Rumyantsev
2014-12-04 13:37             ` Richard Biener
2014-12-09 13:11               ` Yuri Rumyantsev
2014-12-09 15:21                 ` Richard Biener
2014-12-10 10:54                   ` Yuri Rumyantsev
2014-12-10 14:31                     ` Richard Biener
2014-12-10 15:22                       ` Yuri Rumyantsev
2014-12-11  8:59                         ` Richard Biener
2014-12-16 15:16                           ` Yuri Rumyantsev
2014-12-17 15:45                             ` Richard Biener
2014-12-18 13:48                               ` Yuri Rumyantsev
2014-12-19 11:46                                 ` Richard Biener
2014-12-22 14:49                                   ` Yuri Rumyantsev
2015-01-09 12:31                                     ` Richard Biener
2015-01-14 13:33                                       ` Yuri Rumyantsev
2015-01-14 15:00                                         ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAEoMCqQX5qEExZ7Xtz_JNzus5FQizvWkbFSW5y+TGNjt71BHaQ@mail.gmail.com \
    --to=ysrumyan@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=izamyatin@gmail.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).