Date: Wed, 11 Oct 2023 12:09:57 +0000 (UTC)
From: Richard Biener
To: Tamar Christina
cc: "gcc-patches@gcc.gnu.org", nd, "jlaw@ventanamicro.com"
Subject: RE: [PATCH 3/3]middle-end: maintain LCSSA throughout loop peeling

On Wed, 11 Oct 2023, Tamar Christina wrote:

> > > +  auto loop_exits = get_loop_exit_edges (loop);
> > > +  auto_vec doms;
> > > +
> > >    if (at_exit) /* Add the loop copy at exit.  */
> > >      {
> > > -      if (scalar_loop != loop)
> > > +      if (scalar_loop != loop && new_exit->dest != exit_dest)
> > >         {
> > > -          gphi_iterator gsi;
> > >           new_exit = redirect_edge_and_branch (new_exit, exit_dest);
> > > +          flush_pending_stmts (new_exit);
> > > +        }
> > >
> > > -          for (gsi = gsi_start_phis (exit_dest); !gsi_end_p (gsi);
> > > -               gsi_next (&gsi))
> > > -            {
> > > -              gphi *phi = gsi.phi ();
> > > -              tree orig_arg = PHI_ARG_DEF_FROM_EDGE (phi, e);
> > > -              location_t orig_locus
> > > -                = gimple_phi_arg_location_from_edge (phi, e);
> > > +      auto_vec new_phis;
> > > +      hash_map new_phi_args;
> > > +      /* First create the empty phi nodes so that when we flush the
> > > +         statements they can be filled in.  However because there is no order
> > > +         between the PHI nodes in the exits and the loop headers we need to
> > > +         order them base on the order of the two headers.  First record the new
> > > +         phi nodes.  */
> > > +      for (auto gsi_from = gsi_start_phis (scalar_exit->dest);
> > > +           !gsi_end_p (gsi_from); gsi_next (&gsi_from))
> > > +        {
> > > +          gimple *from_phi = gsi_stmt (gsi_from);
> > > +          tree new_res = copy_ssa_name (gimple_phi_result (from_phi));
> > > +          gphi *res = create_phi_node (new_res, new_preheader);
> > > +          new_phis.safe_push (res);
> > > +        }
> > >
> > > -              add_phi_arg (phi, orig_arg, new_exit, orig_locus);
> > > +      /* Then redirect the edges and flush the changes.  This writes out the new
> > > +         SSA names.  */
> > > +      for (edge exit : loop_exits)
> >
> > I realize at the moment it's the same, but we are redirecting multiple exit edges
> > here and from the walk above expect them all to have the same set of PHI
> > nodes - that looks a bit fragile?
>
> No, it only expects the two preheaders to have the same PHI nodes.  Since one loop
> is copied from the other we know that to be true.
>
> Now of course there are cases where your exit blocks have more PHI nodes than the
> headers (e.g. live values) but those are handled later in the hunk below (with new_phi_args).
>
> For the flush_pending_stmts to work I had to make sure the order of the phi nodes is the
> same as the original.  This is why I can't iterate over the values in the exit block instead and
> need to handle it in two steps.
>
> > Does this need adjustments later for the early exit vectorization?
> >
>
> I believe (need to finish the rebase) that the only adjustment I'll need here for multiple exits
> is the updates of the dominators.  I don't think I'll need more.  I had issues with live values that
> I had to handle specially before, but I think this new approach should deal with it already.

OK.
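(For reference, the two-step ordering constraint discussed above condenses to roughly
the following.  This is a simplified sketch of the hunk, not the literal patch code; it
omits the declarations of new_phis/new_phi_args and any error handling.)

   /* 1) Create empty PHIs in NEW_PREHEADER in the same order as the PHIs
         in the copied loop's exit block, i.e. in header order.  */
   for (auto gsi = gsi_start_phis (scalar_exit->dest);
        !gsi_end_p (gsi); gsi_next (&gsi))
     {
       tree new_res = copy_ssa_name (gimple_phi_result (gsi_stmt (gsi)));
       new_phis.safe_push (create_phi_node (new_res, new_preheader));
     }

   /* 2) Redirect the exit edges; flush_pending_stmts then fills the queued
         PHI arguments into the PHIs created above, which is why the creation
         order has to match.  */
   for (edge exit : loop_exits)
     flush_pending_stmts (redirect_edge_and_branch (exit, new_preheader));

   /* 3) Record the resulting SSA names so the LC PHI copying further down
         can reuse them instead of materializing new ones.  */
   for (gphi *phi : new_phis)
     new_phi_args.put (gimple_phi_arg (phi, 0)->def, gimple_phi_result (phi));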
> > This also somewhat confuses the original redirection of 'e', the main exit with
> > the later (*)
> >
> > > +        {
> > > +          edge e = redirect_edge_and_branch (exit, new_preheader);
> > > +          flush_pending_stmts (e);
> > > +        }
> > > +
> > > +      /* Record the new SSA names in the cache so that we can skip materializing
> > > +         them again when we fill in the rest of the LCSSA variables.  */
> > > +      for (auto phi : new_phis)
> > > +        {
> > > +          tree new_arg = gimple_phi_arg (phi, 0)->def;
> >
> > and here you look at the (for now) single edge we redirected ...
> >
> > > +          new_phi_args.put (new_arg, gimple_phi_result (phi));
> > > +        }
> > > +
> > > +      /* Copy the current loop LC PHI nodes between the original loop exit
> > > +         block and the new loop header.  This allows us to later split the
> > > +         preheader block and still find the right LC nodes.  */
> > > +      edge latch_new = single_succ_edge (new_preheader);
> >
> > odd name - the single successor of a loop preheader is the loop header and the
> > corresponding edge is the loop entry edge, not the latch?
> >
> > > +      for (auto gsi_from = gsi_start_phis (loop->header),
> > > +           gsi_to = gsi_start_phis (new_loop->header);
> > > +           flow_loops && !gsi_end_p (gsi_from) && !gsi_end_p (gsi_to);
> >
> > Eh, can we have
> >
> >   if (flow_loops)
> >     for (auto ...)
> >
> > please, even if that indents more?
> >
> > > +           gsi_next (&gsi_from), gsi_next (&gsi_to))
> > > +        {
> > > +          gimple *from_phi = gsi_stmt (gsi_from);
> > > +          gimple *to_phi = gsi_stmt (gsi_to);
> > > +          tree new_arg = PHI_ARG_DEF_FROM_EDGE (from_phi,
> > > +                                                loop_latch_edge (loop));
> > > +
> > > +          /* Check if we've already created a new phi node during edge
> > > +             redirection.  If we have, only propagate the value downwards.  */
> > > +          if (tree *res = new_phi_args.get (new_arg))
> > > +            {
> > > +              adjust_phi_and_debug_stmts (to_phi, latch_new, *res);
> > > +              continue;
> > >             }
> > > +
> > > +          tree new_res = copy_ssa_name (gimple_phi_result (from_phi));
> > > +          gphi *lcssa_phi = create_phi_node (new_res, e->dest);
> > > +
> > > +          /* Main loop exit should use the final iter value.  */
> > > +          add_phi_arg (lcssa_phi, new_arg, loop_exit, UNKNOWN_LOCATION);
> >
> > For all other edges into the loop besides 'e' there's missing PHI arguments?
> > You are using 'e' here again, but also use that as temporary in for blocks,
> > shadowing the parameter - that makes it difficult to read.  Also it's sometimes
> > 'e->dest' and sometimes new_preheader - I think you want to use
> > new_preheader here as well (in create_phi_node) for consistency and ease of
> > understanding.
> >
> > ISTR when early break vectorization lands we're going to redirect the alternate
> > exits away again "fixing" the missing PHI args.
> >
>
> We indeed had a discussion about this, and I'll expand more on the reasoning in the
> patch for early breaks.  But I think not redirecting the edges away for early break makes
> more sense as it treats early break, alignment peeling and epilogue vectorization the same
> way and the only difference is in the statement inside the guard blocks.
>
> But also more importantly this representation also makes it easier to implement First-Faulting
> Loads support.  For FFL we'll copy the main loop and at the "fault" check we branch to a new
> loop remainder that has the same sequences as the remainder of the main vector loop but
> with different predicates.  The reason for this is to remove the predicate mangling from the
> optimal/likely loop body, which is critical for performance.
>
> Now since FFL is intended to pair naturally with early break, having the early exit edges all
> lead into the same block makes the flow a lot easier to manage.
>
> But I'll make sure to include a diagram in the early break peeling patch.

Thanks.  So with the minor pending adjustments this series should be OK.

Thanks,
Richard.
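(A small aside on the "latch_new" naming question above: with the usual CFG helpers
the edge in question is the loop-entry edge, not the latch edge.  An illustration only,
using hypothetical local names, not part of the patch.)

   /* The preheader's single successor edge enters the loop header ...  */
   edge entry_e = single_succ_edge (new_preheader);  /* entry_e->dest == new_loop->header */
   /* ... whereas the latch edge is the back edge from the latch block.  */
   edge latch_e = loop_latch_edge (new_loop);        /* latch_e->src == new_loop->latch */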
> Thanks,
> Tamar
>
> > > +
> > > +          adjust_phi_and_debug_stmts (to_phi, latch_new, new_res);
> > >         }
> > > -  redirect_edge_and_branch_force (e, new_preheader);
> > > -  flush_pending_stmts (e);
> > > +
> > >    set_immediate_dominator (CDI_DOMINATORS, new_preheader, e->src);
> > > -  if (was_imm_dom || duplicate_outer_loop)
> > > +
> > > +  if ((was_imm_dom || duplicate_outer_loop))
> >
> > extra ()s
> >
> > >      set_immediate_dominator (CDI_DOMINATORS, exit_dest, new_exit->src);
> > >
> > >    /* And remove the non-necessary forwarder again.  Keep the other
> > > @@ -1598,6 +1680,22 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> > >      }
> > >    else /* Add the copy at entry.  */
> > >      {
> > > +      /* Copy the current loop LC PHI nodes between the original loop exit
> > > +         block and the new loop header.  This allows us to later split the
> > > +         preheader block and still find the right LC nodes.  */
> > > +      for (auto gsi_from = gsi_start_phis (new_loop->header),
> > > +           gsi_to = gsi_start_phis (loop->header);
> > > +           flow_loops && !gsi_end_p (gsi_from) && !gsi_end_p (gsi_to);
> >
> > same if (flow_loops)
> >
> > > +           gsi_next (&gsi_from), gsi_next (&gsi_to))
> > > +        {
> > > +          gimple *from_phi = gsi_stmt (gsi_from);
> > > +          gimple *to_phi = gsi_stmt (gsi_to);
> > > +          tree new_arg = PHI_ARG_DEF_FROM_EDGE (from_phi,
> > > +                                                loop_latch_edge (new_loop));
> >
> > this looks wrong?  IMHO it should be the PHI_RESULT, no?  Note this only
> > triggers for alignment peeling ...
> >
> > Otherwise looks OK.
> >
> > Thanks,
> > Richard.
> >
> >
> > > +          adjust_phi_and_debug_stmts (to_phi, loop_preheader_edge (loop),
> > > +                                      new_arg);
> > > +        }
> > > +
> > >        if (scalar_loop != loop)
> > >         {
> > >           /* Remove the non-necessary forwarder of scalar_loop again.  */
> > > @@ -1627,29 +1725,6 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> > >                                     loop_preheader_edge (new_loop)->src);
> > >         }
> > >
> > > -  if (scalar_loop != loop)
> > > -    {
> > > -      /* Update new_loop->header PHIs, so that on the preheader
> > > -         edge they are the ones from loop rather than scalar_loop.  */
> > > -      gphi_iterator gsi_orig, gsi_new;
> > > -      edge orig_e = loop_preheader_edge (loop);
> > > -      edge new_e = loop_preheader_edge (new_loop);
> > > -
> > > -      for (gsi_orig = gsi_start_phis (loop->header),
> > > -           gsi_new = gsi_start_phis (new_loop->header);
> > > -           !gsi_end_p (gsi_orig) && !gsi_end_p (gsi_new);
> > > -           gsi_next (&gsi_orig), gsi_next (&gsi_new))
> > > -        {
> > > -          gphi *orig_phi = gsi_orig.phi ();
> > > -          gphi *new_phi = gsi_new.phi ();
> > > -          tree orig_arg = PHI_ARG_DEF_FROM_EDGE (orig_phi, orig_e);
> > > -          location_t orig_locus
> > > -            = gimple_phi_arg_location_from_edge (orig_phi, orig_e);
> > > -
> > > -          add_phi_arg (new_phi, orig_arg, new_e, orig_locus);
> > > -        }
> > > -    }
> > > -
> > >    free (new_bbs);
> > >    free (bbs);
> > >
> > > @@ -2579,139 +2654,36 @@ vect_gen_vector_loop_niters_mult_vf (loop_vec_info loop_vinfo,
> > >
> > >  /* LCSSA_PHI is a lcssa phi of EPILOG loop which is copied from LOOP,
> > >     this function searches for the corresponding lcssa phi node in exit
> > > -   bb of LOOP.  If it is found, return the phi result; otherwise return
> > > -   NULL.  */
> > > +   bb of LOOP following the LCSSA_EDGE to the exit node.  If it is found,
> > > +   return the phi result; otherwise return NULL.  */
> > >
> > >  static tree
> > >  find_guard_arg (class loop *loop ATTRIBUTE_UNUSED,
> > >                  class loop *epilog ATTRIBUTE_UNUSED,
> > > -                const_edge e, gphi *lcssa_phi)
> > > +                const_edge e, gphi *lcssa_phi, int lcssa_edge = 0)
> > >  {
> > >    gphi_iterator gsi;
> > >
> > > -  gcc_assert (single_pred_p (e->dest));
> > >    for (gsi = gsi_start_phis (e->dest); !gsi_end_p (gsi); gsi_next (&gsi))
> > >      {
> > >        gphi *phi = gsi.phi ();
> > > -      if (operand_equal_p (PHI_ARG_DEF (phi, 0),
> > > -                           PHI_ARG_DEF (lcssa_phi, 0), 0))
> > > -        return PHI_RESULT (phi);
> > > -    }
> > > -  return NULL_TREE;
> > > -}
> > > -
> > > -/* Function slpeel_tree_duplicate_loop_to_edge_cfg duplciates FIRST/SECOND
> > > -   from SECOND/FIRST and puts it at the original loop's preheader/exit
> > > -   edge, the two loops are arranged as below:
> > > -
> > > -       preheader_a:
> > > -     first_loop:
> > > -       header_a:
> > > -         i_1 = PHI;
> > > -         ...
> > > -         i_2 = i_1 + 1;
> > > -         if (cond_a)
> > > -           goto latch_a;
> > > -         else
> > > -           goto between_bb;
> > > -       latch_a:
> > > -         goto header_a;
> > > -
> > > -       between_bb:
> > > -         ;; i_x = PHI;          ;; LCSSA phi node to be created for FIRST,
> > > -
> > > -     second_loop:
> > > -       header_b:
> > > -         i_3 = PHI;             ;; Use of i_0 to be replaced with i_x,
> > > -                                   or with i_2 if no LCSSA phi is created
> > > -                                   under condition of CREATE_LCSSA_FOR_IV_PHIS.
> > > -         ...
> > > -         i_4 = i_3 + 1;
> > > -         if (cond_b)
> > > -           goto latch_b;
> > > -         else
> > > -           goto exit_bb;
> > > -       latch_b:
> > > -         goto header_b;
> > > -
> > > -       exit_bb:
> > > -
> > > -   This function creates loop closed SSA for the first loop; update the
> > > -   second loop's PHI nodes by replacing argument on incoming edge with the
> > > -   result of newly created lcssa PHI nodes.  IF CREATE_LCSSA_FOR_IV_PHIS
> > > -   is false, Loop closed ssa phis will only be created for non-iv phis for
> > > -   the first loop.
> > > -
> > > -   This function assumes exit bb of the first loop is preheader bb of the
> > > -   second loop, i.e, between_bb in the example code.  With PHIs updated,
> > > -   the second loop will execute rest iterations of the first.  */
> > > -
> > > -static void
> > > -slpeel_update_phi_nodes_for_loops (loop_vec_info loop_vinfo,
> > > -                                   class loop *first, edge first_loop_e,
> > > -                                   class loop *second, edge second_loop_e,
> > > -                                   bool create_lcssa_for_iv_phis)
> > > -{
> > > -  gphi_iterator gsi_update, gsi_orig;
> > > -  class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> > > -
> > > -  edge first_latch_e = EDGE_SUCC (first->latch, 0);
> > > -  edge second_preheader_e = loop_preheader_edge (second);
> > > -  basic_block between_bb = first_loop_e->dest;
> > > -
> > > -  gcc_assert (between_bb == second_preheader_e->src);
> > > -  gcc_assert (single_pred_p (between_bb) && single_succ_p (between_bb));
> > > -  /* Either the first loop or the second is the loop to be vectorized.  */
> > > -  gcc_assert (loop == first || loop == second);
> > > -
> > > -  for (gsi_orig = gsi_start_phis (first->header),
> > > -       gsi_update = gsi_start_phis (second->header);
> > > -       !gsi_end_p (gsi_orig) && !gsi_end_p (gsi_update);
> > > -       gsi_next (&gsi_orig), gsi_next (&gsi_update))
> > > -    {
> > > -      gphi *orig_phi = gsi_orig.phi ();
> > > -      gphi *update_phi = gsi_update.phi ();
> > > -
> > > -      tree arg = PHI_ARG_DEF_FROM_EDGE (orig_phi, first_latch_e);
> > > -      /* Generate lcssa PHI node for the first loop.  */
> > > -      gphi *vect_phi = (loop == first) ? orig_phi : update_phi;
> > > -      stmt_vec_info vect_phi_info = loop_vinfo->lookup_stmt (vect_phi);
> > > -      if (create_lcssa_for_iv_phis || !iv_phi_p (vect_phi_info))
> > > +      /* Nested loops with multiple exits can have different no# phi node
> > > +         arguments between the main loop and epilog as epilog falls to the
> > > +         second loop.  */
> > > +      if (gimple_phi_num_args (phi) > e->dest_idx)
> > >         {
> > > -          tree new_res = copy_ssa_name (PHI_RESULT (orig_phi));
> > > -          gphi *lcssa_phi = create_phi_node (new_res, between_bb);
> > > -          add_phi_arg (lcssa_phi, arg, first_loop_e, UNKNOWN_LOCATION);
> > > -          arg = new_res;
> > > -        }
> > > -
> > > -      /* Update PHI node in the second loop by replacing arg on the loop's
> > > -         incoming edge.  */
> > > -      adjust_phi_and_debug_stmts (update_phi, second_preheader_e, arg);
> > > -    }
> > > -
> > > -  /* For epilogue peeling we have to make sure to copy all LC PHIs
> > > -     for correct vectorization of live stmts.  */
> > > -  if (loop == first)
> > > -    {
> > > -      basic_block orig_exit = second_loop_e->dest;
> > > -      for (gsi_orig = gsi_start_phis (orig_exit);
> > > -           !gsi_end_p (gsi_orig); gsi_next (&gsi_orig))
> > > -        {
> > > -          gphi *orig_phi = gsi_orig.phi ();
> > > -          tree orig_arg = PHI_ARG_DEF (orig_phi, 0);
> > > -          if (TREE_CODE (orig_arg) != SSA_NAME || virtual_operand_p (orig_arg))
> > > -            continue;
> > > -
> > > -          const_edge exit_e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > > -          /* Already created in the above loop.  */
> > > -          if (find_guard_arg (first, second, exit_e, orig_phi))
> > > +          tree var = PHI_ARG_DEF (phi, e->dest_idx);
> > > +          if (TREE_CODE (var) != SSA_NAME)
> > >             continue;
> > > -
> > > -          tree new_res = copy_ssa_name (orig_arg);
> > > -          gphi *lcphi = create_phi_node (new_res, between_bb);
> > > -          add_phi_arg (lcphi, orig_arg, first_loop_e, UNKNOWN_LOCATION);
> > > +          tree def = get_current_def (var);
> > > +          if (!def)
> > > +            continue;
> > > +          if (operand_equal_p (def,
> > > +                               PHI_ARG_DEF (lcssa_phi, lcssa_edge), 0))
> > > +            return PHI_RESULT (phi);
> > >         }
> > >      }
> > > +  return NULL_TREE;
> > >  }
> > >
> > >  /* Function slpeel_add_loop_guard adds guard skipping from the beginning
> > > @@ -2796,11 +2768,11 @@ slpeel_update_phi_nodes_for_guard1 (class loop *skip_loop,
> > >         }
> > >      }
> > >
> > > -/* LOOP and EPILOG are two consecutive loops in CFG and EPILOG is copied
> > > -   from LOOP.  Function slpeel_add_loop_guard adds guard skipping from a
> > > -   point between the two loops to the end of EPILOG.  Edges GUARD_EDGE
> > > -   and MERGE_EDGE are the two pred edges of merge_bb at the end of EPILOG.
> > > -   The CFG looks like:
> > > +/* LOOP and EPILOG are two consecutive loops in CFG connected by LOOP_EXIT edge
> > > +   and EPILOG is copied from LOOP.  Function slpeel_add_loop_guard adds guard
> > > +   skipping from a point between the two loops to the end of EPILOG.  Edges
> > > +   GUARD_EDGE and MERGE_EDGE are the two pred edges of merge_bb at the end of
> > > +   EPILOG.  The CFG looks like:
> > >
> > >       loop:
> > >         header_a:
> > > @@ -2851,6 +2823,7 @@ slpeel_update_phi_nodes_for_guard1 (class loop *skip_loop,
> > >
> > >  static void
> > >  slpeel_update_phi_nodes_for_guard2 (class loop *loop, class loop *epilog,
> > > +                                    const_edge loop_exit,
> > >                                      edge guard_edge, edge merge_edge)
> > >  {
> > >    gphi_iterator gsi;
> > > @@ -2859,13 +2832,11 @@ slpeel_update_phi_nodes_for_guard2 (class loop *loop, class loop *epilog,
> > >    gcc_assert (single_succ_p (merge_bb));
> > >    edge e = single_succ_edge (merge_bb);
> > >    basic_block exit_bb = e->dest;
> > > -  gcc_assert (single_pred_p (exit_bb));
> > > -  gcc_assert (single_pred (exit_bb) == single_exit (epilog)->dest);
> > >
> > >    for (gsi = gsi_start_phis (exit_bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > >      {
> > >        gphi *update_phi = gsi.phi ();
> > > -      tree old_arg = PHI_ARG_DEF (update_phi, 0);
> > > +      tree old_arg = PHI_ARG_DEF (update_phi, e->dest_idx);
> > >
> > >        tree merge_arg = NULL_TREE;
> > >
> > > @@ -2877,8 +2848,8 @@ slpeel_update_phi_nodes_for_guard2 (class loop *loop, class loop *epilog,
> > >        if (!merge_arg)
> > >         merge_arg = old_arg;
> > >
> > > -      tree guard_arg
> > > -        = find_guard_arg (loop, epilog, single_exit (loop), update_phi);
> > > +      tree guard_arg = find_guard_arg (loop, epilog, loop_exit,
> > > +                                       update_phi, e->dest_idx);
> > >        /* If the var is live after loop but not a reduction, we simply
> > >           use the old arg.  */
> > >        if (!guard_arg)
> > > @@ -2898,21 +2869,6 @@ slpeel_update_phi_nodes_for_guard2 (class loop *loop, class loop *epilog,
> > >         }
> > >      }
> > >
> > > -/* EPILOG loop is duplicated from the original loop for vectorizing,
> > > -   the arg of its loop closed ssa PHI needs to be updated.  */
> > > -
> > > -static void
> > > -slpeel_update_phi_nodes_for_lcssa (class loop *epilog)
> > > -{
> > > -  gphi_iterator gsi;
> > > -  basic_block exit_bb = single_exit (epilog)->dest;
> > > -
> > > -  gcc_assert (single_pred_p (exit_bb));
> > > -  edge e = EDGE_PRED (exit_bb, 0);
> > > -  for (gsi = gsi_start_phis (exit_bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > > -    rename_use_op (PHI_ARG_DEF_PTR_FROM_EDGE (gsi.phi (), e));
> > > -}
> > > -
> > >  /* LOOP_VINFO is an epilogue loop whose corresponding main loop can be skipped.
> > >     Return a value that equals:
> > >
> > > @@ -3255,8 +3211,7 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> > >                                          e, &prolog_e);
> > >        gcc_assert (prolog);
> > >        prolog->force_vectorize = false;
> > > -      slpeel_update_phi_nodes_for_loops (loop_vinfo, prolog, prolog_e, loop,
> > > -                                         exit_e, true);
> > > +
> > >        first_loop = prolog;
> > >        reset_original_copy_tables ();
> > >
> > > @@ -3336,8 +3291,6 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> > >        LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo) = new_epilog_e;
> > >        gcc_assert (epilog);
> > >        epilog->force_vectorize = false;
> > > -      slpeel_update_phi_nodes_for_loops (loop_vinfo, loop, e, epilog,
> > > -                                         new_epilog_e, false);
> > >        bb_before_epilog = loop_preheader_edge (epilog)->src;
> > >
> > >        /* Scalar version loop may be preferred.  In this case, add guard
> > > @@ -3430,7 +3383,9 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> > >                                          irred_flag);
> > >           if (vect_epilogues)
> > >             epilogue_vinfo->skip_this_loop_edge = guard_e;
> > > -          slpeel_update_phi_nodes_for_guard2 (loop, epilog, guard_e, epilog_e);
> > > +          edge main_iv = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > > +          slpeel_update_phi_nodes_for_guard2 (loop, epilog, main_iv, guard_e,
> > > +                                              epilog_e);
> > >           /* Only need to handle basic block before epilog loop if it's not
> > >              the guard_bb, which is the case when skip_vector is true.  */
> > >           if (guard_bb != bb_before_epilog)
> > > @@ -3441,8 +3396,6 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> > >             }
> > >           scale_loop_profile (epilog, prob_epilog, -1);
> > >         }
> > > -      else
> > > -        slpeel_update_phi_nodes_for_lcssa (epilog);
> > >
> > >        unsigned HOST_WIDE_INT bound;
> > >        if (bound_scalar.is_constant (&bound))
> > > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> > > index f1caa5f207d3b13da58c3a313b11d1ef98374349..327cab0f736da7f1bd3e024d666df46ef9208107 100644
> > > --- a/gcc/tree-vect-loop.cc
> > > +++ b/gcc/tree-vect-loop.cc
> > > @@ -5877,7 +5877,7 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
> > >    basic_block exit_bb;
> > >    tree scalar_dest;
> > >    tree scalar_type;
> > > -  gimple *new_phi = NULL, *phi;
> > > +  gimple *new_phi = NULL, *phi = NULL;
> > >    gimple_stmt_iterator exit_gsi;
> > >    tree new_temp = NULL_TREE, new_name, new_scalar_dest;
> > >    gimple *epilog_stmt = NULL;
> > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > > index 55b6771b271d5072fa1327d595e1dddb112cfdf6..25ceb6600673d71fd6012443403997e921066483 100644
> > > --- a/gcc/tree-vectorizer.h
> > > +++ b/gcc/tree-vectorizer.h
> > > @@ -2183,7 +2183,7 @@ extern bool slpeel_can_duplicate_loop_p (const class loop *, const_edge,
> > >                                                   const_edge);
> > >  class loop *slpeel_tree_duplicate_loop_to_edge_cfg (class loop *, edge,
> > >                                                      class loop *, edge,
> > > -                                                    edge, edge *);
> > > +                                                    edge, edge *, bool = true);
> > >  class loop *vect_loop_versioning (loop_vec_info, gimple *);
> > >  extern class loop *vect_do_peeling (loop_vec_info, tree, tree,
> > >                                      tree *, tree *, tree *, int, bool, bool,
> > >
> > >
> > >
> >
> > --
> > Richard Biener
> > SUSE Software Solutions Germany GmbH,
> > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
>

--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)