From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [IPv6:2001:67c:2178:6::1d]) by sourceware.org (Postfix) with ESMTPS id E05FC3858D35 for ; Wed, 15 Nov 2023 13:01:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E05FC3858D35 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E05FC3858D35 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:67c:2178:6::1d ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700053286; cv=none; b=ExW5V7viA8B5hSfDAePqIuQZ6RGFKN33OGpLZM9MukPGqYNxSDfqN0bscGjD6B33/yZUbivlMX8a6xXs2qugQXcrjKAGhHM4cGTZHmhAGH2uFrcGhkj0hmi8Jwqn6laFWVzujP7whgKmumvVNNHDejNG4ADhvc1rWskyAnXvjL4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700053286; c=relaxed/simple; bh=oF1I7Lo7pzO39QUsqyn7dg5XRn5mGiWtsC5K4SHvXd8=; h=Date:From:To:Subject:Message-ID:MIME-Version; b=inBQneDr7+yL133fY7gRVimPyBvxiOxpHFrylKvoCOPcVlCqSPzCHKVhRWYYrciz00T96TqbETATGGgoACHMAKeYzioICaPwy+9Prx2gujyJM/EES/CKyHzsBMj+kRajV2QWkvAp+qxlYc8SBPCGbz/dgZ4XX22laDjvhJuMbno= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 2FFBF204D9; Wed, 15 Nov 2023 13:01:23 +0000 (UTC) Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id D418C2D3F5; Wed, 15 Nov 2023 13:01:22 +0000 (UTC) Date: Wed, 15 Nov 2023 13:01:22 +0000 (UTC) From: Richard Biener To: Tamar Christina cc: "gcc-patches@gcc.gnu.org" , nd , "jlaw@ventanamicro.com" Subject: RE: [PATCH 7/21]middle-end: update IV update code to support early breaks and arbitrary exits In-Reply-To: Message-ID: References: User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Level: Authentication-Results: smtp-out2.suse.de; none X-Rspamd-Server: rspamd2 X-Spamd-Result: default: False [-4.00 / 50.00]; REPLY(-4.00)[] X-Spam-Score: -4.00 X-Rspamd-Queue-Id: 2FFBF204D9 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00,GIT_PATCH_0,KAM_DMARC_STATUS,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Wed, 15 Nov 2023, Tamar Christina wrote: > Patch updated to latest trunk: > > Hi All, > > This changes the PHI node updates to support early breaks. > It has to support both the case where the loop's exit matches the normal loop > exit and one where the early exit is "inverted", i.e. it's an early exit edge. > > In the latter case we must always restart the loop for VF iterations. For an > early exit the reason is obvious, but there are cases where the "normal" exit > is located before the early one. This exit then does a check on ivtmp resulting > in us leaving the loop since it thinks we're done. > > In these case we may still have side-effects to perform so we also go to the > scalar loop. > > For the "normal" exit niters has already been adjusted for peeling, for the > early exits we must find out how many iterations we actually did. So we have > to recalculate the new position for each exit. > > Thanks, > Tamar > > gcc/ChangeLog: > > * tree-vect-loop-manip.cc (vect_set_loop_condition_normal): Hide unused. > (vect_update_ivs_after_vectorizer): Support early break. > (vect_do_peeling): Use it. > > --- inline copy of patch --- > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc > index d3fa8699271c4d7f404d648a38a95beabeabc99a..e1d210ab4617c894dab3d2654cf1c842baac58f5 100644 > --- a/gcc/tree-vect-loop-manip.cc > +++ b/gcc/tree-vect-loop-manip.cc > @@ -1200,7 +1200,7 @@ vect_set_loop_condition_partial_vectors_avx512 (class loop *loop, > loop handles exactly VF scalars per iteration. */ > > static gcond * > -vect_set_loop_condition_normal (loop_vec_info loop_vinfo, edge exit_edge, > +vect_set_loop_condition_normal (loop_vec_info /* loop_vinfo */, edge exit_edge, > class loop *loop, tree niters, tree step, > tree final_iv, bool niters_maybe_zero, > gimple_stmt_iterator loop_cond_gsi) > @@ -1412,7 +1412,7 @@ vect_set_loop_condition (class loop *loop, edge loop_e, loop_vec_info loop_vinfo > When this happens we need to flip the understanding of main and other > exits by peeling and IV updates. */ > > -bool inline > +bool > vect_is_loop_exit_latch_pred (edge loop_exit, class loop *loop) > { > return single_pred (loop->latch) == loop_exit->src; > @@ -2142,6 +2142,7 @@ vect_can_advance_ivs_p (loop_vec_info loop_vinfo) > Input: > - LOOP - a loop that is going to be vectorized. The last few iterations > of LOOP were peeled. > + - VF - The chosen vectorization factor for LOOP. > - NITERS - the number of iterations that LOOP executes (before it is > vectorized). i.e, the number of times the ivs should be bumped. > - UPDATE_E - a successor edge of LOOP->exit that is on the (only) path the comment on this is now a bit misleading, can you try to update it and/or move the comment bits to the docs on EARLY_EXIT? > @@ -2152,6 +2153,9 @@ vect_can_advance_ivs_p (loop_vec_info loop_vinfo) > The phi args associated with the edge UPDATE_E in the bb > UPDATE_E->dest are updated accordingly. > > + - restart_loop - Indicates whether the scalar loop needs to restart the params are ALL_CAPS > + iteration count where the vector loop began. > + > Assumption 1: Like the rest of the vectorizer, this function assumes > a single loop exit that has a single predecessor. > > @@ -2169,18 +2173,22 @@ vect_can_advance_ivs_p (loop_vec_info loop_vinfo) > */ > > static void > -vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo, > - tree niters, edge update_e) > +vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo, poly_uint64 vf, LOOP_VINFO_VECT_FACTOR? > + tree niters, edge update_e, bool restart_loop) I think 'bool early_exit' is better here? I wonder if we have an "early" exit after the main exit we are probably sure there are no side-effects to re-execute and could avoid this restarting? > { > gphi_iterator gsi, gsi1; > class loop *loop = LOOP_VINFO_LOOP (loop_vinfo); > basic_block update_bb = update_e->dest; > - > - basic_block exit_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest; > - > - /* Make sure there exists a single-predecessor exit bb: */ > - gcc_assert (single_pred_p (exit_bb)); > - gcc_assert (single_succ_edge (exit_bb) == update_e); > + bool inversed_iv > + = !vect_is_loop_exit_latch_pred (LOOP_VINFO_IV_EXIT (loop_vinfo), > + LOOP_VINFO_LOOP (loop_vinfo)); > + bool needs_interm_block = LOOP_VINFO_EARLY_BREAKS (loop_vinfo) > + && flow_bb_inside_loop_p (loop, update_e->src); > + edge loop_e = LOOP_VINFO_IV_EXIT (loop_vinfo); > + gcond *cond = get_loop_exit_condition (loop_e); > + basic_block exit_bb = loop_e->dest; > + basic_block iv_block = NULL; > + gimple_stmt_iterator last_gsi = gsi_last_bb (exit_bb); > > for (gsi = gsi_start_phis (loop->header), gsi1 = gsi_start_phis (update_bb); > !gsi_end_p (gsi) && !gsi_end_p (gsi1); > @@ -2190,7 +2198,6 @@ vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo, > tree step_expr, off; > tree type; > tree var, ni, ni_name; > - gimple_stmt_iterator last_gsi; > > gphi *phi = gsi.phi (); > gphi *phi1 = gsi1.phi (); > @@ -2222,11 +2229,52 @@ vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo, > enum vect_induction_op_type induction_type > = STMT_VINFO_LOOP_PHI_EVOLUTION_TYPE (phi_info); > > - if (induction_type == vect_step_op_add) > + tree iv_var = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (loop)); > + /* create_iv always places it on the LHS. Alternatively we can set a > + property during create_iv to identify it. */ > + bool ivtemp = gimple_cond_lhs (cond) == iv_var; > + if (restart_loop && ivtemp) > { > + type = TREE_TYPE (gimple_phi_result (phi)); > + ni = build_int_cst (type, vf); > + if (inversed_iv) > + ni = fold_build2 (MINUS_EXPR, type, ni, > + fold_convert (type, step_expr)); > + } > + else if (induction_type == vect_step_op_add) > + { > + > tree stype = TREE_TYPE (step_expr); > - off = fold_build2 (MULT_EXPR, stype, > - fold_convert (stype, niters), step_expr); > + > + /* Early exits always use last iter value not niters. */ > + if (restart_loop) > + { > + /* Live statements in the non-main exit shouldn't be adjusted. We > + normally didn't have this problem with a single exit as live > + values would be in the exit block. However when dealing with > + multiple exits all exits are redirected to the merge block > + and we restart the iteration. */ Hmm, I fail to see how this works - we're either using the value to continue the induction or not, independent of STMT_VINFO_LIVE_P. > + if (STMT_VINFO_LIVE_P (phi_info)) > + continue; > + > + /* For early break the final loop IV is: > + init + (final - init) * vf which takes into account peeling > + values and non-single steps. The main exit can use niters > + since if you exit from the main exit you've done all vector > + iterations. For an early exit we don't know when we exit so we > + must re-calculate this on the exit. */ > + tree start_expr = gimple_phi_result (phi); > + off = fold_build2 (MINUS_EXPR, stype, > + fold_convert (stype, start_expr), > + fold_convert (stype, init_expr)); > + /* Now adjust for VF to get the final iteration value. */ > + off = fold_build2 (MULT_EXPR, stype, off, > + build_int_cst (stype, vf)); > + } > + else > + off = fold_build2 (MULT_EXPR, stype, > + fold_convert (stype, niters), step_expr); > + > if (POINTER_TYPE_P (type)) > ni = fold_build_pointer_plus (init_expr, off); > else > @@ -2238,6 +2286,8 @@ vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo, > /* Don't bother call vect_peel_nonlinear_iv_init. */ > else if (induction_type == vect_step_op_neg) > ni = init_expr; > + else if (restart_loop) > + continue; This looks all a bit complicated - why wouldn't we simply always use the PHI result when 'restart_loop'? Isn't that the correct old start value in all cases? > else > ni = vect_peel_nonlinear_iv_init (&stmts, init_expr, > niters, step_expr, > @@ -2245,9 +2295,20 @@ vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo, > > var = create_tmp_var (type, "tmp"); > > - last_gsi = gsi_last_bb (exit_bb); > gimple_seq new_stmts = NULL; > ni_name = force_gimple_operand (ni, &new_stmts, false, var); > + > + /* For non-main exit create an intermediat edge to get any updated iv > + calculations. */ > + if (needs_interm_block > + && !iv_block > + && (!gimple_seq_empty_p (stmts) || !gimple_seq_empty_p (new_stmts))) > + { > + iv_block = split_edge (update_e); > + update_e = single_succ_edge (update_e->dest); > + last_gsi = gsi_last_bb (iv_block); > + } > + > /* Exit_bb shouldn't be empty. */ > if (!gsi_end_p (last_gsi)) > { > @@ -3342,8 +3403,26 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1, > niters_vector_mult_vf steps. */ > gcc_checking_assert (vect_can_advance_ivs_p (loop_vinfo)); > update_e = skip_vector ? e : loop_preheader_edge (epilog); > - vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf, > - update_e); > + if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo)) > + update_e = single_succ_edge (e->dest); > + bool inversed_iv > + = !vect_is_loop_exit_latch_pred (LOOP_VINFO_IV_EXIT (loop_vinfo), > + LOOP_VINFO_LOOP (loop_vinfo)); You are computing this here and in vect_update_ivs_after_vectorizer? > + > + /* Update the main exit first. */ > + vect_update_ivs_after_vectorizer (loop_vinfo, vf, niters_vector_mult_vf, > + update_e, inversed_iv); > + > + /* And then update the early exits. */ > + for (auto exit : get_loop_exit_edges (loop)) > + { > + if (exit == LOOP_VINFO_IV_EXIT (loop_vinfo)) > + continue; > + > + vect_update_ivs_after_vectorizer (loop_vinfo, vf, > + niters_vector_mult_vf, > + exit, true); ... why does the same not work here? Wouldn't the proper condition be !dominated_by_p (CDI_DOMINATORS, exit->src, LOOP_VINFO_IV_EXIT (loop_vinfo)->src) or similar? That is, whether the exit is at or after the main IV exit? (consider having two) > + } > > if (skip_epilog) > { >