public inbox for gcc-patches@gcc.gnu.org
* Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
@ 2014-10-13 10:00 Yuri Rumyantsev
  2014-10-15 10:09 ` Richard Biener
From: Yuri Rumyantsev @ 2014-10-13 10:00 UTC
  To: Richard Biener; +Cc: gcc-patches, Igor Zamyatin

[-- Attachment #1: Type: text/plain, Size: 22557 bytes --]

Richard,

Here is the updated patch (part 1) for extended if-conversion.

The second part of the patch will be sent later.

Changelog.

2014-10-13  Yuri Rumyantsev  <ysrumyan@gmail.com>

* tree-if-conv.c (cgraph.h): Add include file to detect function clone.
(flag_force_vectorize): New variable.
(edge_predicate): New function.
(set_edge_predicate): New function.
(add_to_predicate_list): Check unconditionally whether bb is always
executed and exit early if so. Use predicate of cd-equivalent block
for join blocks if it exists.
(add_to_dst_predicate_list): Invoke add_to_predicate_list if
destination block of edge is not always executed. Set up predicate
for critical edge.
(if_convertible_phi_p): Accept phi nodes with more than two args
if FLAG_FORCE_VECTORIZE is set.
(ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
(if_convertible_stmt_p): Fix up pre-function comments.
(all_edges_are_critical): New function.
(if_convertible_bb_p): Allow bb to have more than two predecessors if
FLAG_FORCE_VECTORIZE is set. Call all_edges_are_critical to reject
if-conversion of a block whose incoming edges are all critical, but
only if FLAG_FORCE_VECTORIZE is not set.
(predicate_bbs): Skip loop exit block also. If fold_build2 produces
a bool conversion, recompute predicate using build2_loc. Zero the
edge 'aux' field under FLAG_FORCE_VECTORIZE.
(if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
FLAG_FORCE_VECTORIZE is set, to calculate cd-equivalent bb's.
(find_phi_replacement_condition): Extend function interface:
it returns NULL if given phi node must be handled by means of
extended phi node predication. If number of predecessors of phi-block
equals 2 and at least one incoming edge is not critical, the original
algorithm is used.
(get_predicate_for_edge): New function.
(find_insertion_point): New function.
(predicate_arbitrary_scalar_phi): New function.
(predicate_all_scalar_phis): Introduce new variable BEFORE.
Invoke find_insertion_point to initialize gsi, and invoke
predicate_arbitrary_scalar_phi if TRUE_BB is NULL (this signals
that extended predication must be applied).
(insert_gimplified_predicates): Add test for non-predicated basic
blocks that there are no gimplified statements to insert. Insert
predicates at the block beginning for extended if-conversion.
(tree_if_conversion): Initialize flag_force_vectorize from current
loop or outer loop (to support pragma omp declare). Do loop versioning
for innermost loop marked with pragma omp simd when
FLAG_TREE_LOOP_IF_CONVERT is not set. Nullify 'aux' field of edges
for blocks with two successors.
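
To illustrate the cd-equivalence entry for add_to_predicate_list above, here is a minimal sketch in plain C (not GCC code). It assumes a join block with two predecessors whose predicates are p1 & p2 and p1 & !p2, and checks that taking the predicate p1 of the cd-equivalent block gives the same result as the disjunction. The function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: predicate of a join block built as the
   disjunction of its two incoming predicates, p1 & p2 and p1 & !p2.  */
static bool
join_predicate_by_disjunction (bool p1, bool p2)
{
  return (p1 && p2) || (p1 && !p2);
}

/* Hypothetical model: predicate copied from the cd-equivalent block,
   which is simply p1.  */
static bool
join_predicate_by_cd_equivalence (bool p1, bool p2)
{
  (void) p2;
  return p1;
}

/* Returns the number of truth assignments on which the two forms
   disagree.  */
static int
count_mismatches (void)
{
  int mismatches = 0;

  for (int p1 = 0; p1 <= 1; p1++)
    for (int p2 = 0; p2 <= 1; p2++)
      if (join_predicate_by_disjunction (p1 != 0, p2 != 0)
	  != join_predicate_by_cd_equivalence (p1 != 0, p2 != 0))
	mismatches++;

  return mismatches;
}
```

Both forms agree on all four assignments, so the shorter predicate can stand in for the disjunction in this two-predecessor case.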




2014-09-22 12:28 GMT+04:00 Yuri Rumyantsev <ysrumyan@gmail.com>:
> Richard,
>
> here is the reduced patch (part 1), which is almost half its previous size.
> Let me also answer your comments.
>
> 1. I do use the edge field 'aux' to keep the predicate for critical edges.
> My previous code was not correct; it now looks like this:
>
>   if (EDGE_COUNT (b->succs) == 1 || EDGE_COUNT (e->dest->preds) == 1)
>     /* Edge E is not critical,  use predicate of edge source bb. */
>     c = bb_predicate (b);
>   else
>     /* Edge E is critical and its aux field contains predicate.  */
>     c = edge_predicate (e);
>
> 2. I completely deleted all code related to creation of conditional
> expressions and now rely entirely on bool pattern recognition in the
> vectorizer. But we need to delete all dead predicate computations
> which are not used, since they prevent vectorization. I will add this
> local-dce function in the next patch.
> 3. I also did not include in this patch recognition of general
> phi-nodes with two arguments only, for which conversion of conditional
> scalar reduction can also be applied.
> Note that all these changes are applied to loops marked with pragma
> omp simd only.
>
> 2014-09-22  Yuri Rumyantsev  <ysrumyan@gmail.com>
>
> * tree-if-conv.c (cgraph.h): Add include file to detect function clone.
> (flag_force_vectorize): New variable.
> (edge_predicate): New function.
> (set_edge_predicate): New function.
> (convert_name_to_cmp): New function.
> (add_to_predicate_list): Check unconditionally that bb is always
> executed to early exit. Use predicate of cd-equivalent block
> for join blocks if it exists.
> (add_to_dst_predicate_list): Invoke add_to_predicate_list if
> destination block of edge is not always executed. Set-up predicate
> for critical edge.
> (if_convertible_phi_p): Accept phi nodes with more than two args
> if FLAG_FORCE_VECTORIZE was set-up.
> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
> (if_convertible_stmt_p): Fix up pre-function comments.
> (all_edges_are_critical): New function.
> (if_convertible_bb_p): Allow bb to have more than two predecessors if
> FLAG_FORCE_VECTORIZE is set. Call all_edges_are_critical to reject
> if-conversion of a block whose incoming edges are all critical, but
> only if FLAG_FORCE_VECTORIZE is not set.
> (predicate_bbs): Skip loop exit block also. Add check that if
> fold_build2 produces bool conversion, recompute predicate using
> build2_loc. Add zeroing of edge 'aux' field under FLAG_FORCE_VECTORIZE.
> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
> FLAG_FORCE_VECTORIZE was set-up to calculate cd equivalent bb's.
> (find_phi_replacement_condition): Extend function interface:
> it returns NULL if given phi node must be handled by means of
> extended phi node predication. If number of predecessors of phi-block
> equals 2 and at least one incoming edge is not critical, the original
> algorithm is used.
> (get_predicate_for_edge): New function.
> (find_insertion_point): New function.
> (predicate_arbitrary_scalar_phi): New function.
> (predicate_all_scalar_phis): Introduce new variable BEFORE.
> Invoke find_insertion_point to initialize gsi, and invoke
> predicate_arbitrary_scalar_phi if TRUE_BB is NULL (this signals
> that extended predication must be applied).
> (insert_gimplified_predicates): Add test for non-predicated basic
> blocks that there are no gimplified statements to insert. Insert
> predicates at the block beginning for extended if-conversion.
> (tree_if_conversion): Initialize flag_force_vectorize from current
> loop or outer loop (to support pragma omp declare). Do loop versioning
> for innermost loop marked with pragma omp simd when
> FLAG_TREE_LOOP_IF_CONVERT is not set. Nullify 'aux' field of edges
> for blocks with two successors.
>
>
>
>
> 2014-09-08 17:10 GMT+04:00 Richard Biener <richard.guenther@gmail.com>:
>> On Fri, Aug 15, 2014 at 2:02 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Richard!
>>> Here is updated patch with the following changes:
>>>
>>> 1. Any restrictions on phi-function were eliminated for extended conversion.
>>> 2.  Put predicate for critical edges to 'aux' field of edge, i.e.
>>> negate_predicate was deleted.
>>> 3. Deleted splitting of critical edges, i.e. both outgoing edges can
>>> be critical.
>>> 4. Use notion of cd-equivalence to set-up predicate for join basic
>>> blocks to simplify it.
>>> 5. I decided not to design a pre-pass, since it would lead to generating
>>> a chain of cond expressions for phi-node if-conversion, whereas for a phi
>>> of kind
>>>   x = PHI <1(2), 1(3), 2(4)>
>>> only one cond expression is required, and this is treated as a simple
>>> optimization for an arbitrary phi-function. More precisely,
>>> if a phi-function has only two different arguments and one of them has
>>> a single occurrence, if-conversion is performed as if the phi had only 2
>>> arguments.
>>> For an arbitrary phi function a chain of cond expressions is produced.
>>>
>>> Updated patch is attached.
>>>
>>> Any comments will be appreciated.
>>
>> The patch is still very big and does multiple things at once which makes
>> it hard to review.
>>
>> In addition to that it changes function signatures without updating
>> the function comments.  For example what is the convert_bool
>> argument doing to add_to_dst_predicate_list?  Why do we need
>> all this added logic.
>>
>> You duplicate operand_equal_for_phi_arg_p.
>>
>> I think the code handling PHIs with more than two operands but
>> only two unequal operands is useful generally, so that's an obvious
>> candidate for splitting out into a separate patch.
>>
>> +   CONVERT_BOOL argument was added to convert bool predicate computations
>> +   which is not supported by vectorizer to int type through creating of
>> +   conditional expressions.  */
>>
>> Example?  The vectorizer has patterns for bool predicate computations.
>> This seems to be another feature that needs splitting out.
>>
>> The way you get around the critical edge parts looks awkward to me.
>> Please either do _all_ predicates as edge predicates or simply
>> split critical edges (of the respective loop body).
>>
>> I still think that a utility doing same PHI arg merging by introducing
>> forwarder blocks would be nicer to have.
>>
>> I'd restructure the main tree_if_conversion function to apply these
>> CFG pre-transforms when we are going to version the loop
>> for if conversion (eventually transitioning to always doing that).
>>
>> So - please split up the patch.  It's way too big.
>>
>> Thanks,
>> Richard.
>>
>>> 2014-08-15  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>
>>> * tree-if-conv.c (cgraph.h): Add include file to detect function clone.
>>> (flag_force_vectorize): New variable.
>>> (edge_predicate): New function.
>>> (set_edge_predicate): New function.
>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function.
>>> (init_bb_predicate): Add initialization of negate_predicate field.
>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE.
>>> (convert_name_to_cmp): New function.
>>> (get_type_for_cond): New function.
>>> (convert_bool_predicate): New function.
>>> (predicate_disjunction): New function.
>>> (predicate_conjunction): New function.
>>> (add_to_predicate_list): Add convert_bool argument.
>>> Use predicate of cd-equivalent block if convert_bool is true and
>>> such bb exists; save it in static variable for further possible use.
>>> Add call of predicate_disjunction if convert_bool argument is true.
>>> (add_to_dst_predicate_list): Add convert_bool argument.
>>> Add early function exit if edge target block is always executed.
>>> Add call of predicate_conjunction if convert_bool argument is true.
>>> Pass convert_bool argument for add_to_predicate_list.
>>> Set up predicate for critical edge if convert_bool is true.
>>> (equal_phi_args): New function.
>>> (phi_has_two_different_args): New function.
>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>> if flag_force_vectorize was set up.
>>> (ifcvt_can_use_mask_load_store): Add test on flag_force_vectorize.
>>> (if_convertible_stmt_p): Allow calls of function clones if
>>> flag_force_vectorize was set-up.
>>> (all_edges_are_critical): New function.
>>> (if_convertible_bb_p): Allow bb to have more than two predecessors if
>>> flag_force_vectorize was set up. Call all_edges_are_critical
>>> to reject if-conversion of a block whose incoming edges are all
>>> critical, but only if flag_force_vectorize was not set up.
>>> (walk_cond_tree): New function.
>>> (vect_bool_pattern_is_applicable): New function.
>>> (predicate_bbs): Add convert_bool argument which is used to transform
>>> comparison expressions of boolean type into conditional expressions
>>> with integral operands. If convert_bool argument was set up and
>>> vect bool pattern can be applied, perform the following transformation:
>>> (bool) x != 0  --> y = (int) x; x != 0;
>>> Add check that if fold_build2 produces bool conversion if convert_bool
>>> was set up, recompute predicate using build2_loc. Additional argument
>>> 'convert_bool' is passed to add_to_dst_predicate_list and
>>> add_to_predicate_list.
>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
>>> flag_force_vectorize was set-up to calculate cd equivalent bb's.
>>> Call predicate_bbs with additional argument equal to false.
>>> (find_phi_replacement_condition): Extend function interface:
>>> it returns NULL if given phi node must be handled by means of
>>> extended phi node predication. If number of predecessors of phi-block
>>> equals 2 and at least one incoming edge is not critical, the original
>>> algorithm is used.
>>> (is_cond_scalar_reduction): Add 'extended' argument which signals that
>>> phi arguments must be evaluated through phi_has_two_different_args.
>>> (predicate_scalar_phi): Add invocation of convert_name_to_cmp if cond
>>> is SSA_NAME. Add 'false' argument to call of is_cond_scalar_reduction.
>>> (get_predicate_for_edge): New function.
>>> (find_insertion_point): New function.
>>> (predicate_arbitrary_phi): New function.
>>> (predicate_extended_scalar_phi): New function.
>>> (predicate_all_scalar_phis): Add code to set-up gimple statement
>>> iterator for predication of extended scalar phi's for insertion.
>>> (insert_gimplified_predicates): Add test for non-predicated basic
>>> blocks that there are no gimplified statements to insert. Insert
>>> predicates at the block beginning for extended if-conversion.
>>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended
>>> predication to build mask.
>>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs.
>>> (tree_if_conversion): Initialize flag_force_vectorize from current
>>> loop or outer loop (to support pragma omp declare). Do loop versioning
>>> for innermost loop marked with pragma omp simd.
>>>
>>> 2014-08-01 13:40 GMT+04:00 Richard Biener <richard.guenther@gmail.com>:
>>>> On Wed, Jun 25, 2014 at 4:06 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Hi All,
>>>>>
>>>>> We implemented additional support for pragma omp simd in part of
>>>>> extended if-conversion loops with such pragma. These extensions
>>>>> include:
>>>>>
>>>>> 1. All extensions are performed only if considered loop or its outer
>>>>>    loop was marked with pragma omp simd (force_vectorize); For ordinary
>>>>>    loops behavior was not changed.
>>>>> 2. Took off cfg restriction on basic block which can have more than 2
>>>>>    predecessors.
>>>>> 3. Put additional restriction on phi nodes which was missed in current design:
>>>>>    all phi nodes must be in non-predicated basic block to conform
>>>>>    semantic of COND_EXPR which is used for transformation.
>>>>
>>>> How is that so?  If the PHI is predicated then its result will be used
>>>> in a PHI node again and thus we'd create a sequence of COND_EXPRs.
>>>>
>>>> No?
>>>>
>>>>> 4. Extend predication of phi nodes: phi may have more than 2 arguments
>>>>> with some limitations:
>>>>>    - for phi nodes which have more than 2 arguments, but only two
>>>>>    arguments are different and one of them occurs only once,
>>>>>    transformation to a single COND_EXPR can be done.
>>>>>    - if phi node has more different arguments and all edge predicates
>>>>>    corresponding to phi-arguments are disjoint, a chain of COND_EXPR
>>>>>    will be generated for it. In current design a very simple check is used:
>>>>>    check, starting from the end, that two edges corresponding to neighbor
>>>>>    arguments have a common predecessor, which is used for further check
>>>>>    with the next edge.
>>>>>  These guarantee that phi predication will produce the correct result.
>>>>
>>>> Btw, you can think of these extensions as unfactoring a PHI node by
>>>> inserting forwarder blocks.  Thus
>>>>
>>>>    x = PHI <1(2), 1(3), 2(4)>
>>>>
>>>> becomes
>>>>
>>>>   bb 5: <forwarder-from(2)-and(3)>
>>>>
>>>>   x = PHI <1(5), 2(4)>
>>>>
>>>> and
>>>>
>>>>   x = PHI <1(2), 2(3), 3(4)>
>>>>
>>>> becomes
>>>>
>>>>   bb 5:
>>>>   x' = PHI <1(2), 2(3)>
>>>>
>>>>   b = PHI<x'(5), 3(4)>
>>>>
>>>> which means that 3) has to work.  Note that we want this kind of
>>>> PHI transforms for out-of-SSA as well to reduce the number of
>>>> copies we need to insert on edges.
>>>>
>>>> Thus it would be nice if you implemented 4) in terms of a pre-pass
>>>> over the force_vect loops PHI nodes, applying that CFG transform.
>>>> And make 3) work properly if it doesn't already.
>>>>
>>>> It looks like you introduce a "negate predicate" to work around the
>>>> critical edge limitation?  Please instead change if-conversion to
>>>> work with edge predicates (as opposed to BB predicates).
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>>
>>>>> Here is example of such extended predication (compile with -march=core-avx2):
>>>>> #pragma omp simd safelen(8)
>>>>>   for (i=0; i<512; i++)
>>>>>   {
>>>>>     float t = a[i];
>>>>>     if (t > 0 & t < 1.0e+17f)
>>>>>       if (c[i] != 0)
>>>>>         res += 1;
>>>>>   }
>>>>>   <bb 4>:
>>>>>   # res_15 = PHI <res_1(5), 0(3)>
>>>>>   # i_16 = PHI <i_11(5), 0(3)>
>>>>>   # ivtmp_17 = PHI <ivtmp_14(5), 512(3)>
>>>>>   t_5 = a[i_16];
>>>>>   _6 = t_5 > 0.0;
>>>>>   _7 = t_5 < 9.9999998430674944e+16;
>>>>>   _8 = _7 & _6;
>>>>>   _ifc__28 = (unsigned int) _8;
>>>>>   _10 = &c[i_16];
>>>>>   _ifc__36 = _ifc__28 != 0 ? 4294967295 : 0;
>>>>>   _9 = MASK_LOAD (_10, 0B, _ifc__36);
>>>>>   _ifc__29 = _ifc__28 != 0 ? 1 : 0;
>>>>>   _ifc__30 = (int) _ifc__29;
>>>>>   _ifc__31 = _9 != 0 ? _ifc__30 : 0;
>>>>>   _ifc__32 = _ifc__28 != 0 ? 1 : 0;
>>>>>   _ifc__33 = (int) _ifc__32;
>>>>>   _ifc__34 = _9 == 0 ? _ifc__33 : 0;
>>>>>   _ifc__35 = _ifc__31 != 0 ? 1 : 0;
>>>>>   res_1 = res_15 + _ifc__35;
>>>>>   i_11 = i_16 + 1;
>>>>>   ivtmp_14 = ivtmp_17 - 1;
>>>>>   if (ivtmp_14 != 0)
>>>>>     goto <bb 4>;
>>>>>
>>>>> Bootstrap and regression testing did not show any new failures.
>>>>>
>>>>> gcc/ChangeLog
>>>>>
>>>>> 2014-06-25  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>
>>>>> * tree-if-conv.c (flag_force_vectorize): New variable.
>>>>> (struct bb_predicate_s): Add negate_predicate field.
>>>>> (bb_negate_predicate): New function.
>>>>> (set_bb_negate_predicate): New function.
>>>>> (bb_copy_predicate): New function.
>>>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function.
>>>>> (init_bb_predicate): Add initialization of negate_predicate field.
>>>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE.
>>>>> (convert_name_to_cmp): New function.
>>>>> (get_type_for_cond): New function.
>>>>> (convert_bool_predicate): New function.
>>>>> (predicate_disjunction): New function.
>>>>> (predicate_conjunction): New function.
>>>>> (add_to_predicate_list): Add convert_bool argument.
>>>>> Add call of predicate_disjunction if convert_bool argument is true.
>>>>> (add_to_dst_predicate_list): Add convert_bool argument.
>>>>> Add early function exit if edge target block is always executed.
>>>>> Add call of predicate_conjunction if convert_bool argument is true.
>>>>> Pass convert_bool argument for add_to_predicate_list.
>>>>> (equal_phi_args): New function.
>>>>> (phi_has_two_different_args): New function.
>>>>> (phi_args_disjoint): New function.
>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>>> for loops marked with pragma omp simd. Add check that phi nodes are
>>>>> in non-predicated basic blocks.
>>>>> (ifcvt_can_use_mask_load_store): Use flag_force_vectorize.
>>>>> (all_edges_are_critical): New function.
>>>>> (if_convertible_bb_p): Allow bb to have more than two predecessors if
>>>>> flag_force_vectorize was set up. Call all_edges_are_critical
>>>>> to reject if-conversion of a block whose incoming edges are all
>>>>> critical, but only if flag_force_vectorize was not set up.
>>>>> (walk_cond_tree): New function.
>>>>> (vect_bool_pattern_is_applicable): New function.
>>>>> (predicate_bbs): Add convert_bool argument that is used to transform
>>>>> comparison expressions of boolean type into conditional expressions
>>>>> with integral operands. If bool_conv argument is false or both
>>>>> outgoing edges are not critical, the old algorithm of predicate assignment
>>>>> is used; otherwise the following code was added: check the applicability
>>>>> of vect-bool-pattern recognition and transformation of
>>>>> (bool) x != 0  --> y = (int) x; x != 0;
>>>>> compute predicates for both outgoing edges, one of which is critical,
>>>>> using the 'normal' edge, i.e. compute true and false predicates using
>>>>> the normal outgoing edge only; evaluated predicates are stored in
>>>>> predicate and negate_predicate fields of struct bb_predicate_s and
>>>>> negate_predicate of normal edge contains predicate of critical edge,
>>>>> but generated gimplified statements are stored in their destination
>>>>> block fields. Additional argument 'convert_bool' is passed to
>>>>> add_to_dst_predicate_list and add_to_predicate_list.
>>>>> (if_convertible_loop_p_1): Call predicate_bbs with additional argument
>>>>> equal to false.
>>>>> (find_phi_replacement_condition): Extend function interface:
>>>>> it returns NULL if given phi node must be handled by means of
>>>>> extended phi node predication. If number of predecessors of phi-block
>>>>> equals 2 and at least one incoming edge is not critical, the original
>>>>> algorithm is used.
>>>>> (is_cond_scalar_reduction): Add 'extended' argument which signals that
>>>>> both phi arguments must be evaluated through phi_has_two_different_args.
>>>>> (predicate_scalar_phi): Add invocation of convert_name_to_cmp if cond
>>>>> is SSA_NAME. Add 'false' argument to call of is_cond_scalar_reduction.
>>>>> (get_predicate_for_edge): New function.
>>>>> (find_insertion_point): New function.
>>>>> (predicate_phi_disjoint_args): New function.
>>>>> (predicate_extended_scalar_phi): New function.
>>>>> (predicate_all_scalar_phis): Add code to set-up gimple statement
>>>>> iterator for predication of extended scalar phi's for insertion.
>>>>> (insert_gimplified_predicates): Add test for non-predicated basic
>>>>> blocks that there are no gimplified statements to insert. Insert
>>>>> predicates at the block beginning for extended if-conversion.
>>>>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended
>>>>> predication to build mask.
>>>>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs.
>>>>> (split_crit_edge): New function.
>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current
>>>>> loop or outer loop (to support pragma omp declare). Invoke
>>>>> split_crit_edge for extended predication. Do loop versioning for
>>>>> innermost loop marked with pragma omp simd.
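
The two phi shapes discussed in the quoted thread can be sketched in plain C (not GCC code) as follows. This is only an illustration, assuming the edge predicates are disjoint and exactly one of them holds; the function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* x = PHI <1(2), 1(3), 2(4)>: only two different values, and the value 2
   occurs once, so a single conditional on the (hypothetical) predicate of
   the edge from bb 4 is enough.  */
static int
predicate_phi_two_values (bool pred_edge_from_bb4)
{
  return pred_edge_from_bb4 ? 2 : 1;
}

/* x = PHI <1(2), 2(3), 3(4)>: three different values, so a chain of
   conditionals is needed.  The last argument needs no test of its own
   because the predicates are assumed disjoint and exhaustive.  */
static int
predicate_phi_chain (bool pred_edge_from_bb2, bool pred_edge_from_bb3)
{
  return pred_edge_from_bb2 ? 1 : (pred_edge_from_bb3 ? 2 : 3);
}
```

The first form is why a phi with many arguments but only two different values can be handled as if it had two arguments.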

[-- Attachment #2: patch.part-1 --]
[-- Type: application/octet-stream, Size: 21959 bytes --]

diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
old mode 100644
new mode 100755
index 1f8ef03..f213506
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -120,6 +120,9 @@ along with GCC; see the file COPYING3.  If not see
 /* List of basic blocks in if-conversion-suitable order.  */
 static basic_block *ifc_bbs;
 
+/* Copy of 'force_vectorize' field of loop.  */
+static bool flag_force_vectorize;
+
 /* Structure used to predicate basic blocks.  This is attached to the
    ->aux field of the BBs in the loop to be if-converted.  */
 typedef struct bb_predicate_s {
@@ -149,6 +152,16 @@ bb_predicate (basic_block bb)
   return ((bb_predicate_p) bb->aux)->predicate;
 }
 
+/* Returns predicate for critical edge E.  */
+
+static inline tree
+edge_predicate (edge e)
+{
+  gcc_assert (EDGE_COUNT (e->dest->preds) >= 2);
+  gcc_assert (e->aux != NULL);
+  return (tree) e->aux;
+}
+
 /* Sets the gimplified predicate COND for basic block BB.  */
 
 static inline void
@@ -160,6 +173,16 @@ set_bb_predicate (basic_block bb, tree cond)
   ((bb_predicate_p) bb->aux)->predicate = cond;
 }
 
+/* Sets predicate COND for critical edge E.  */
+
+static inline void
+set_edge_predicate (edge e, tree cond)
+{
+  gcc_assert (EDGE_COUNT (e->dest->preds) >= 2);
+  gcc_assert (cond != NULL_TREE);
+  e->aux = cond;
+}
+
 /* Returns the sequence of statements of the gimplification of the
    predicate for basic block BB.  */
 
@@ -396,25 +419,51 @@ fold_build_cond_expr (tree type, tree cond, tree rhs, tree lhs)
 }
 
 /* Add condition NC to the predicate list of basic block BB.  LOOP is
-   the loop to be if-converted.  */
+   the loop to be if-converted. Use predicate of cd-equivalent block
+   if it exists for join bb.  */
 
 static inline void
 add_to_predicate_list (struct loop *loop, basic_block bb, tree nc)
 {
   tree bc, *tp;
+  basic_block dom_bb;
+  static basic_block join_bb = NULL;
 
   if (is_true_predicate (nc))
     return;
 
-  if (!is_predicated (bb))
+  /* If dominance tells us this basic block is always executed,
+     don't record any predicates for it.  */
+  if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
+    return;
+
+  /* If predicate has been already set up for given bb using cd-equivalent
+     block predicate, simply escape. Post-dominator tree was built under
+     flag_force_vectorize only.  */
+  if (flag_force_vectorize)
     {
-      /* If dominance tells us this basic block is always executed, don't
-	 record any predicates for it.  */
-      if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
+      if (join_bb == bb)
 	return;
+      dom_bb = get_immediate_dominator (CDI_DOMINATORS, bb);
+      /* We use notion of cd equivalence to get simplier predicate for
+	 join block, e.g. if join block has 2 predecessors with predicates
+	 p1 & p2 and p1 & !p2, we'd like to get p1 for it instead of
+	 p1 & p2 | p1 & !p2.  */
+      if (dom_bb != loop->header
+	  && get_immediate_dominator (CDI_POST_DOMINATORS, dom_bb) == bb)
+	{
+	  gcc_assert (flow_bb_inside_loop_p (loop, dom_bb));
+	  bc = bb_predicate (dom_bb);
+	  gcc_assert (!is_true_predicate (bc));
+	  set_bb_predicate (bb, bc);
 
-      bc = nc;
+	  /* Save bb in join_bb to not handle it once more.  */
+	  join_bb = bb;
+	  return;
+	}
     }
+  if (!is_predicated (bb))
+    bc = nc;
   else
     {
       bc = bb_predicate (bb);
@@ -455,10 +504,15 @@ add_to_dst_predicate_list (struct loop *loop, edge e,
     cond = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
 			prev_cond, cond);
 
-  add_to_predicate_list (loop, e->dest, cond);
+  if (!dominated_by_p (CDI_DOMINATORS, loop->latch, e->dest))
+    add_to_predicate_list (loop, e->dest, cond);
+
+  /* If edge E is critical save predicate on it.  */
+  if (EDGE_COUNT (e->dest->preds) >= 2)
+    set_edge_predicate (e, cond);
 }
 
-/* Return true if one of the successor edges of BB exits LOOP.  */
+/* Returns true if one of the successor edges of BB exits LOOP.  */
 
 static bool
 bb_with_exit_edge_p (struct loop *loop, basic_block bb)
@@ -482,7 +536,9 @@ bb_with_exit_edge_p (struct loop *loop, basic_block bb)
    When the flag_tree_loop_if_convert_stores is not set, PHI is not
    if-convertible if:
    - a virtual PHI is immediately used in another PHI node,
-   - there is a virtual PHI in a BB other than the loop->header.  */
+   - there is a virtual PHI in a BB other than the loop->header.
+   When the flag_force_vectorize is set, PHI can have more than
+   two arguments.  */
 
 static bool
 if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi,
@@ -494,11 +550,18 @@ if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi,
       print_gimple_stmt (dump_file, phi, 0, TDF_SLIM);
     }
 
-  if (bb != loop->header && gimple_phi_num_args (phi) != 2)
+  if (bb != loop->header)
     {
-      if (dump_file && (dump_flags & TDF_DETAILS))
-	fprintf (dump_file, "More than two phi node args.\n");
-      return false;
+      if (gimple_phi_num_args (phi) != 2)
+	{
+	  if (!flag_force_vectorize)
+	    {
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		fprintf (dump_file, "More than two phi node args.\n");
+	      return false;
+	    }
+
+        }
     }
 
   if (flag_tree_loop_if_convert_stores || any_mask_load_store)
@@ -728,7 +791,7 @@ ifcvt_can_use_mask_load_store (gimple stmt)
   basic_block bb = gimple_bb (stmt);
   bool is_load;
 
-  if (!(flag_tree_loop_vectorize || bb->loop_father->force_vectorize)
+  if (!(flag_tree_loop_vectorize || flag_force_vectorize)
       || bb->loop_father->dont_vectorize
       || !gimple_assign_single_p (stmt)
       || gimple_has_volatile_ops (stmt))
@@ -865,7 +928,8 @@ if_convertible_gimple_assign_stmt_p (gimple stmt,
 
    A statement is if-convertible if:
    - it is an if-convertible GIMPLE_ASSIGN,
-   - it is a GIMPLE_LABEL or a GIMPLE_COND.  */
+   - it is a GIMPLE_LABEL or a GIMPLE_COND,
+   - it is builtins call.  */
 
 static bool
 if_convertible_stmt_p (gimple stmt, vec<data_reference_p> refs,
@@ -912,6 +976,22 @@ if_convertible_stmt_p (gimple stmt, vec<data_reference_p> refs,
   return true;
 }
 
+/* Assumes that BB has more than 2 predecessors.
+   Returns false if at least one successor is not on critical edge
+   and true otherwise.  */
+
+static inline bool
+all_edges_are_critical (basic_block bb)
+{
+  edge e;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, bb->preds)
+    if (EDGE_COUNT (e->src->succs) == 1)
+      return false;
+  return true;
+}
+
 /* Return true when BB is if-convertible.  This routine does not check
    basic block's statements and phis.
 
@@ -920,6 +1000,8 @@ if_convertible_stmt_p (gimple stmt, vec<data_reference_p> refs,
    - it is after the exit block but before the latch,
    - its edges are not normal.
 
+   Last restriction is not applicable for loops marked with simd pragma.
+
    EXIT_BB is the basic block containing the exit of the LOOP.  BB is
    inside LOOP.  */
 
@@ -932,9 +1014,13 @@ if_convertible_bb_p (struct loop *loop, basic_block bb, basic_block exit_bb)
   if (dump_file && (dump_flags & TDF_DETAILS))
     fprintf (dump_file, "----------[%d]-------------\n", bb->index);
 
-  if (EDGE_COUNT (bb->preds) > 2
-      || EDGE_COUNT (bb->succs) > 2)
+  if (EDGE_COUNT (bb->succs) > 2)
     return false;
+  if (EDGE_COUNT (bb->preds) > 2)
+    {
+      if (!flag_force_vectorize)
+	return false;
+    }
 
   if (exit_bb)
     {
@@ -971,18 +1057,17 @@ if_convertible_bb_p (struct loop *loop, basic_block bb, basic_block exit_bb)
 
   /* At least one incoming edge has to be non-critical as otherwise edge
      predicates are not equal to basic-block predicates of the edge
-     source.  */
+     source. This restriction is not valid for loops marked with
+     simd pragma.  */
   if (EDGE_COUNT (bb->preds) > 1
       && bb != loop->header)
     {
-      bool found = false;
-      FOR_EACH_EDGE (e, ei, bb->preds)
-	if (EDGE_COUNT (e->src->succs) == 1)
-	  found = true;
-      if (!found)
+      if (!flag_force_vectorize && all_edges_are_critical (bb))
 	{
 	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    fprintf (dump_file, "only critical predecessors\n");
+	    fprintf (dump_file, "only critical predecessors in bb#%d\n",
+		      bb->index);
+
 	  return false;
 	}
     }
@@ -1064,6 +1149,7 @@ get_loop_body_in_if_conv_order (const struct loop *loop)
   return blocks;
 }
 
+
 /* Returns true when the analysis of the predicates for all the basic
    blocks in LOOP succeeded.
 
@@ -1096,9 +1182,10 @@ predicate_bbs (loop_p loop)
       tree cond;
       gimple stmt;
 
-      /* The loop latch is always executed and has no extra conditions
-	 to be processed: skip it.  */
-      if (bb == loop->latch)
+      /* The loop latch and loop exit block are always executed and
+	 have no extra conditions to be processed: skip them.  */
+      if (bb == loop->latch
+	  || bb_with_exit_edge_p (loop, bb))
 	{
 	  reset_bb_predicate (loop->latch);
 	  continue;
@@ -1108,27 +1195,41 @@ predicate_bbs (loop_p loop)
       stmt = last_stmt (bb);
       if (stmt && gimple_code (stmt) == GIMPLE_COND)
 	{
-	  tree c2;
+	  tree c, c2;
 	  edge true_edge, false_edge;
 	  location_t loc = gimple_location (stmt);
-	  tree c = fold_build2_loc (loc, gimple_cond_code (stmt),
-				    boolean_type_node,
-				    gimple_cond_lhs (stmt),
-				    gimple_cond_rhs (stmt));
-
-	  /* Add new condition into destination's predicate list.  */
-	  extract_true_false_edges_from_block (gimple_bb (stmt),
-					       &true_edge, &false_edge);
+	  tree lopnd = gimple_cond_lhs (stmt);
+	  enum tree_code code = gimple_cond_code (stmt);
+
+	  /* Compute predicates for true and false edges.  */
+	  c = fold_build2_loc (loc, code,
+			       boolean_type_node,
+			       lopnd,
+			       gimple_cond_rhs (stmt));
+	  /* fold_build2 can produce a bool conversion, which is not
+	     supported by the vectorizer, so rebuild it without folding.
+	     For example, such a conversion is generated for the sequence:
+		_Bool _7, _8, _9;
+		_7 = _6 != 13; _8 = _6 != 0; _9 = _7 & _8;
+		if (_9 != 0)  --> (bool)_9.  */
+
+	  if (CONVERT_EXPR_P (c)
+	      && TREE_CODE_CLASS (code) == tcc_comparison)
+	    c = build2_loc (loc, code, boolean_type_node,
+			    lopnd, gimple_cond_rhs (stmt));
+	  c2 = build1_loc (loc, TRUTH_NOT_EXPR, boolean_type_node,
+			   unshare_expr (c));
 
+	  extract_true_false_edges_from_block (bb, &true_edge, &false_edge);
+	  if (flag_force_vectorize)
+	    true_edge->aux = false_edge->aux = NULL;
 	  /* If C is true, then TRUE_EDGE is taken.  */
 	  add_to_dst_predicate_list (loop, true_edge, unshare_expr (cond),
 				     unshare_expr (c));
 
 	  /* If C is false, then FALSE_EDGE is taken.  */
-	  c2 = build1_loc (loc, TRUTH_NOT_EXPR, boolean_type_node,
-			   unshare_expr (c));
-	  add_to_dst_predicate_list (loop, false_edge,
-				     unshare_expr (cond), c2);
+	  add_to_dst_predicate_list (loop, false_edge, unshare_expr (cond),
+				     unshare_expr (c2));
 
 	  cond = NULL_TREE;
 	}
@@ -1176,6 +1277,8 @@ if_convertible_loop_p_1 (struct loop *loop,
     return false;
 
   calculate_dominance_info (CDI_DOMINATORS);
+  if (flag_force_vectorize)
+    calculate_dominance_info (CDI_POST_DOMINATORS);
 
   /* Allow statements that can be handled during if-conversion.  */
   ifc_bbs = get_loop_body_in_if_conv_order (loop);
@@ -1337,7 +1440,9 @@ if_convertible_loop_p (struct loop *loop, bool *any_mask_load_store)
    replacement.  Return the true block whose phi arguments are
    selected when cond is true.  LOOP is the loop containing the
    if-converted region, GSI is the place to insert the code for the
-   if-conversion.  */
+   if-conversion.
+   Returns NULL if given phi node must be handled by means of extended
+   phi node predication.  */
 
 static basic_block
 find_phi_replacement_condition (basic_block bb, tree *cond,
@@ -1346,7 +1451,13 @@ find_phi_replacement_condition (basic_block bb, tree *cond,
   edge first_edge, second_edge;
   tree tmp_cond;
 
-  gcc_assert (EDGE_COUNT (bb->preds) == 2);
+  if (EDGE_COUNT (bb->preds) != 2
+      || all_edges_are_critical (bb))
+    {
+      gcc_assert (flag_force_vectorize);
+      return NULL;
+    }
+
   first_edge = EDGE_PRED (bb, 0);
   second_edge = EDGE_PRED (bb, 1);
 
@@ -1624,6 +1735,237 @@ predicate_scalar_phi (gimple phi, tree cond,
     }
 }
 
+/* Returns predicate of edge associated with argument of phi node.  */
+
+static tree
+get_predicate_for_edge (edge e)
+{
+  tree c;
+  basic_block b = e->src;
+
+  if (EDGE_COUNT (b->succs) == 1 || EDGE_COUNT (e->dest->preds) == 1)
+    /* Edge E is not critical, use predicate of edge source bb.  */
+    c = bb_predicate (b);
+  else
+    /* Edge E is critical and its aux field contains predicate.  */
+    c = edge_predicate (e);
+  return c;
+}
+
+/* This is an enhancement for predication of a phi node with an arbitrary
+   number of arguments, i.e. for
+	x = phi (x_1, x_2, ..., x_k)
+   a chain of recurrent cond expressions will be produced.
+   For example,
+	bb_0
+	if (_5 != 0) goto bb_1 else goto bb_2
+	end_bb_0
+
+	bb_1
+	res_2 = some computations;
+	goto bb_5
+	end_bb_1
+
+	bb_2
+	if (_9 != 0) goto bb_3 else goto bb_4
+	end_bb_2
+
+	bb_3
+	res_3 = ...;
+	goto bb_5
+	end_bb_3
+
+	bb_4
+	res_4 = ...;
+	end_bb_4
+
+	bb_5
+	# res_1 = PHI <res_2(1), res_3(3), res_4(4)>
+
+    will be if-converted into a chain of unconditional assignments:
+	_ifc__42 = <PRD_3> ? res_3 : res_4;
+	res_1 = _5 != 0 ? res_2 : _ifc__42;
+
+    where <PRD_3> is the predicate of <bb_3>.
+
+    All created intermediate statements are inserted at GSI point.  */
+
+static void
+predicate_arbitrary_scalar_phi (gimple phi, gimple_stmt_iterator *gsi,
+				bool before)
+{
+  int i;
+  int num = (int) gimple_phi_num_args (phi);
+  tree last = gimple_phi_arg_def (phi, num - 1);
+  tree type = TREE_TYPE (gimple_phi_result (phi));
+  tree curr;
+  gimple stmt;
+  tree lhs;
+  tree rhs;
+  tree res;
+  tree cond;
+  bool swap = false;
+
+  res = gimple_phi_result (phi);
+  if (virtual_operand_p (res))
+    return;
+
+  for (i = num - 2; i > 0; i--)
+    {
+      curr = gimple_phi_arg_def (phi, i);
+      lhs = make_temp_ssa_name (type, NULL, "_ifc_");
+      cond = get_predicate_for_edge (gimple_phi_arg_edge (phi, i));
+      swap = false;
+      if (TREE_CODE (cond) == TRUTH_NOT_EXPR)
+	{
+	  cond = TREE_OPERAND (cond, 0);
+	  swap = true;
+	}
+      /* Gimplify the condition to a valid cond-expr conditional operand.  */
+      if (before)
+	cond = force_gimple_operand_gsi_1 (gsi, unshare_expr (cond),
+					   is_gimple_condexpr, NULL_TREE,
+					   true, GSI_SAME_STMT);
+      else
+	cond = force_gimple_operand_gsi_1 (gsi, unshare_expr (cond),
+					   is_gimple_condexpr, NULL_TREE,
+					   false, GSI_CONTINUE_LINKING);
+
+      stmt = gimple_build_assign_with_ops (COND_EXPR, lhs,
+					   unshare_expr (cond),
+					   swap? last : curr,
+					   swap? curr : last);
+
+      if (before)
+	gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+      else
+	gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
+      update_stmt (stmt);
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	{
+	  fprintf (dump_file, "Create new assign stmt for phi arg#%d\n", i);
+	  print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
+	}
+      last = lhs;
+    }
+  curr = gimple_phi_arg_def (phi, 0);
+  cond = get_predicate_for_edge (gimple_phi_arg_edge (phi, 0));
+  swap = false;
+  if (TREE_CODE (cond) == TRUTH_NOT_EXPR)
+    {
+      cond = TREE_OPERAND (cond, 0);
+      swap = true;
+    }
+  if (before)
+    cond = force_gimple_operand_gsi_1 (gsi, unshare_expr (cond),
+				       is_gimple_condexpr, NULL_TREE, true,
+				       GSI_SAME_STMT);
+  else
+    cond = force_gimple_operand_gsi_1 (gsi, unshare_expr (cond),
+				       is_gimple_condexpr, NULL_TREE, false,
+				       GSI_CONTINUE_LINKING);
+  rhs = fold_build_cond_expr (type,
+			      unshare_expr (cond),
+			      swap? last : curr,
+			      swap? curr : last);
+  stmt = gimple_build_assign (res, rhs);
+  if (before)
+    gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+  else
+    gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
+  update_stmt (stmt);
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "new phi replacement stmt\n");
+      print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
+    }
+}
+
+/* Returns gimple statement iterator to insert code for predicated phi.  */
+
+static gimple_stmt_iterator
+find_insertion_point (basic_block bb, bool* before)
+{
+  edge e;
+  edge_iterator ei;
+  tree cond;
+  gimple last = NULL;
+  gimple curr;
+  int num_opnd;
+  tree opnd1, opnd2;
+
+  /* Find the last statement in bb after which code for a predicated phi
+     can be inserted using edge predicates.  */
+  FOR_EACH_EDGE (e, ei, bb->preds)
+    {
+      cond = get_predicate_for_edge (e);
+      if (TREE_CODE (cond) == SSA_NAME)
+	{
+	  opnd1 = cond;
+	  opnd2 = NULL_TREE;
+	}
+      else if (TREE_CONSTANT (cond))
+	continue;
+      else if ((num_opnd = TREE_OPERAND_LENGTH (cond)) == 2)
+	{
+	  opnd1 = TREE_OPERAND (cond, 0);
+	  opnd2 = TREE_OPERAND (cond, 1);
+	}
+      else
+	{
+	  gcc_assert (num_opnd == 1);
+	  opnd1 = TREE_OPERAND (cond, 0);
+	  opnd2 = NULL_TREE;
+	}
+      /* Process each operand of cond to determine the latest definition.  */
+      while (true)
+	{
+	  if (TREE_CODE (opnd1) == SSA_NAME)
+	    {
+	      curr = SSA_NAME_DEF_STMT (opnd1);
+	      /* Skip definitions in other bb's.  */
+	      if (gimple_bb (curr) == bb)
+		{
+		  if (last == NULL)
+		    last = curr;
+		  else
+		    {
+		      /* Determine which stmt is the latest in bb.  */
+		      gimple_stmt_iterator gsi;
+		      gimple stmt;
+		      for (gsi = gsi_last_bb (bb);
+			   !gsi_end_p (gsi);
+			    gsi_prev (&gsi))
+			if ((stmt = gsi_stmt (gsi)) == last)
+			  break;
+			else if (stmt == curr)
+			  {
+			    last = curr;
+			    break;
+			  }
+		    }
+		}
+	    }
+	    if (opnd2 != NULL_TREE)
+	      {
+		opnd1 = opnd2;
+		opnd2 = NULL_TREE;
+	      }
+	    else
+	      break;
+	}
+    }
+
+  if (last == NULL)
+    {
+      *before = true;
+      return gsi_after_labels (bb);
+    }
+  *before = false;
+  return gsi_for_stmt (last);
+}
+
 /* Replaces in LOOP all the scalar phi nodes other than those in the
    LOOP->header block with conditional modify expressions.  */
 
@@ -1633,6 +1975,7 @@ predicate_all_scalar_phis (struct loop *loop)
   basic_block bb;
   unsigned int orig_loop_num_nodes = loop->num_nodes;
   unsigned int i;
+  bool before = false;
 
   for (i = 1; i < orig_loop_num_nodes; i++)
     {
@@ -1653,11 +1996,17 @@ predicate_all_scalar_phis (struct loop *loop)
 	 appropriate condition for the PHI node replacement.  */
       gsi = gsi_after_labels (bb);
       true_bb = find_phi_replacement_condition (bb, &cond, &gsi);
+      if (!true_bb)
+	/* Will use extended predication, find out insertion point.  */
+	gsi = find_insertion_point (bb, &before);
 
       while (!gsi_end_p (phi_gsi))
 	{
 	  phi = gsi_stmt (phi_gsi);
-	  predicate_scalar_phi (phi, cond, true_bb, &gsi);
+	  if (true_bb)
+	    predicate_scalar_phi (phi, cond, true_bb, &gsi);
+	  else
+	    predicate_arbitrary_scalar_phi (phi, &gsi, before);
 	  release_phi_node (phi);
 	  gsi_next (&phi_gsi);
 	}
@@ -1673,13 +2022,12 @@ static void
 insert_gimplified_predicates (loop_p loop, bool any_mask_load_store)
 {
   unsigned int i;
-
   for (i = 0; i < loop->num_nodes; i++)
     {
       basic_block bb = ifc_bbs[i];
       gimple_seq stmts;
 
-      if (!is_predicated (bb))
+      if (!is_predicated (bb) && bb_predicate_gimplified_stmts (bb) == NULL)
 	{
 	  /* Do not insert statements for a basic block that is not
 	     predicated.  Also make sure that the predicate of the
@@ -1692,7 +2040,8 @@ insert_gimplified_predicates (loop_p loop, bool any_mask_load_store)
       if (stmts)
 	{
 	  if (flag_tree_loop_if_convert_stores
-	      || any_mask_load_store)
+	      || any_mask_load_store
+	      || flag_force_vectorize)
 	    {
 	      /* Insert the predicate of the BB just after the label,
 		 as the if-conversion of memory writes will use this
@@ -1849,7 +2198,6 @@ predicate_mem_writes (loop_p loop)
 	  swap = true;
 	  cond = TREE_OPERAND (cond, 0);
 	}
-
       for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
 	if (!gimple_assign_single_p (stmt = gsi_stmt (gsi)))
 	  continue;
@@ -2102,6 +2450,7 @@ version_loop_for_if_conversion (struct loop *loop)
   return true;
 }
 
+
 /* If-convert LOOP when it is legal.  For the moment this pass has no
    profitability analysis.  Returns non-zero todo flags when something
    changed.  */
@@ -2113,6 +2462,15 @@ tree_if_conversion (struct loop *loop)
   ifc_bbs = NULL;
   bool any_mask_load_store = false;
 
+  flag_force_vectorize = loop->force_vectorize;
+  /* Check whether the outer loop was marked with the simd pragma.  */
+  if (!flag_force_vectorize)
+    {
+      struct loop *outer_loop = loop_outer (loop);
+      if (outer_loop && outer_loop->force_vectorize)
+	flag_force_vectorize = true;
+    }
+
   if (!if_convertible_loop_p (loop, &any_mask_load_store)
       || !dbg_cnt (if_conversion_tree))
     goto cleanup;
@@ -2122,7 +2480,9 @@ tree_if_conversion (struct loop *loop)
 	  || loop->dont_vectorize))
     goto cleanup;
 
-  if (any_mask_load_store && !version_loop_for_if_conversion (loop))
+  if ((any_mask_load_store
+       || (loop->force_vectorize && flag_tree_loop_if_convert != 1))
+      && !version_loop_for_if_conversion (loop))
     goto cleanup;
 
   /* Now all statements are if-convertible.  Combine all the basic
@@ -2143,7 +2503,15 @@ tree_if_conversion (struct loop *loop)
       unsigned int i;
 
       for (i = 0; i < loop->num_nodes; i++)
-	free_bb_predicate (ifc_bbs[i]);
+	{
+	  basic_block bb = ifc_bbs[i];
+	  free_bb_predicate (bb);
+	  if (EDGE_COUNT (bb->succs) == 2)
+	    {
+	      EDGE_SUCC (bb, 0)->aux = NULL;
+	      EDGE_SUCC (bb, 1)->aux = NULL;
+	    }
+	}
 
       free (ifc_bbs);
       ifc_bbs = NULL;


end of thread, other threads:[~2014-11-07 14:08 UTC | newest]

Thread overview: 18+ messages
2014-10-13 10:00 [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd Yuri Rumyantsev
2014-10-15 10:09 ` Richard Biener
2014-10-16 15:52   ` Yuri Rumyantsev
2014-10-17  9:11     ` Richard Biener
2014-10-17 14:15       ` Yuri Rumyantsev
2014-10-20  8:02         ` Richard Biener
2014-10-20 14:11           ` Yuri Rumyantsev
2014-10-21 12:29             ` Yuri Rumyantsev
2014-10-21 12:56               ` Richard Biener
2014-10-21 13:26                 ` Yuri Rumyantsev
2014-10-21 13:45                   ` Richard Biener
2014-10-21 14:01                     ` Yuri Rumyantsev
2014-10-21 14:11                       ` Richard Biener
2014-10-21 14:20                         ` Richard Biener
2014-10-21 14:36                           ` Yuri Rumyantsev
2014-10-24  9:14                             ` Richard Biener
2014-10-24 10:23                               ` Yuri Rumyantsev
2014-11-07 14:08                                 ` Yuri Rumyantsev
