From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id AA4BD38582BE for ; Tue, 14 Nov 2023 07:56:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AA4BD38582BE Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de ARC-Filter: OpenARC Filter v1.0.0 sourceware.org AA4BD38582BE Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=195.135.220.28 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699948568; cv=none; b=QrkueNpLKNSmkFk+J4cUu3kMTHQQNrooG2EViIGAx+849RjztQqoiAHrq/LHzHNfVFOjgmpYBYRxAtESXYmyjhNRLwjk1j6MOFdZsl2cPGe8igkAIhuRl3dpW6z2an8EONdUpw8bvnhvy3LRcRkgEnVbbIm0s13G68lQVJSTSZI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699948568; c=relaxed/simple; bh=k6aJTTAqEhalJ8vP9xgDup2DYO2T+kbo98ySuQffIWo=; h=DKIM-Signature:DKIM-Signature:Date:From:To:Subject:Message-ID: MIME-Version; b=arJBsAQBuAYAhyu/E7uozXUdt/B8QrfmIi2qlc+YcqP0yJE/bLG8HKyAH373UIUMoR07nhuA6nh4f2VpYfilf7PHCpSnEhhXk5cLQtJCOCGK8oGKAyDlwYToBMW8NqsxGYiiENAFzCcSHOGFsA+N9UM73A6nbmZNq7p2g9SmUH4= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id CF92E218B2; Tue, 14 Nov 2023 07:56:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1699948565; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=BBK91dekLw/AGh6NNJGW06jN2bL6ZMIYWE7z/5Klpe0=; b=X6e73DubKejKpEouPhHU/yJM6lztX83KVLjtPFX9wYbElvGht/n9rx+hcX4BZAPIlZ1Es4 cs6ilt3cFymhbcMblOHPmmGLf9nY/zyu7Ue+gp6OMoBHmZgonQkldMDflrx1MVeC03yiGo RYw5VF5qXVCPp4vaCjnw3Er0rxB4cdg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1699948565; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=BBK91dekLw/AGh6NNJGW06jN2bL6ZMIYWE7z/5Klpe0=; b=hPKFLKFPj0MnE+j/KYEnA1m21AEAUEwFkVxiw348zR+mEVJpNNjon2FBNLtMXCvTC6WfXi zOjoBFY/Ois6eoCQ== Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id AA15C2D34E; Tue, 14 Nov 2023 07:56:05 +0000 (UTC) Date: Tue, 14 Nov 2023 07:56:05 +0000 (UTC) From: Richard Biener To: Tamar Christina cc: "gcc-patches@gcc.gnu.org" , nd , "jlaw@ventanamicro.com" Subject: RE: [PATCH 5/21]middle-end: update vectorizer's control update to support picking an exit other than loop latch In-Reply-To: Message-ID: References: User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Mon, 13 Nov 2023, Tamar Christina wrote: > > -----Original Message----- > > From: Richard Biener > > Sent: Tuesday, November 7, 2023 3:04 PM > > To: Tamar Christina > > Cc: gcc-patches@gcc.gnu.org; nd ; jlaw@ventanamicro.com > > Subject: Re: [PATCH 5/21]middle-end: update vectorizer's control update to > > support picking an exit other than loop latch > > > > On Mon, 6 Nov 2023, Tamar Christina wrote: > > > > > Hi All, > > > > > > As requested, the vectorizer is now free to pick it's own exit which > > > can be different than what the loop CFG infrastucture uses. The > > > vectorizer makes use of this to vectorize loops that it previously could not. > > > > > > But this means that loop control must be materialized in the block > > > that needs it less we corrupt the SSA chain. This makes it so we use > > > the vectorizer's main IV block instead of the loop infra. > > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > > > > > Ok for master? > > > > > > Thanks, > > > Tamar > > > > > > gcc/ChangeLog: > > > > > > * tree-ssa-loop-manip.cc (standard_iv_increment_position): > > Conditionally > > > take dest BB. > > > * tree-ssa-loop-manip.h (standard_iv_increment_position): Likewise. > > > * tree-vect-loop-manip.cc (vect_set_loop_controls_directly): Use it. > > > (vect_set_loop_condition_partial_vectors_avx512): Likewise. > > > (vect_set_loop_condition_normal): Likewise. > > > > > > --- inline copy of patch -- > > > diff --git a/gcc/tree-ssa-loop-manip.h b/gcc/tree-ssa-loop-manip.h > > > index > > > > > bda09f51d5619420331c513a9906831c779fd2b4..5938588c8882d842b00 > > 301423df1 > > > 11cbe7bf7ba8 100644 > > > --- a/gcc/tree-ssa-loop-manip.h > > > +++ b/gcc/tree-ssa-loop-manip.h > > > @@ -38,7 +38,8 @@ extern basic_block split_loop_exit_edge (edge, bool > > > = false); extern basic_block ip_end_pos (class loop *); extern > > > basic_block ip_normal_pos (class loop *); extern void > > > standard_iv_increment_position (class loop *, > > > - gimple_stmt_iterator *, bool *); > > > + gimple_stmt_iterator *, bool *, > > > + basic_block = NULL); > > > extern bool > > > gimple_duplicate_loop_body_to_header_edge (class loop *, edge, unsigned > > int, > > > sbitmap, edge, vec *, int); diff > > --git > > > a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc index > > > > > e7436915e01297e7af2a3bcf1afd01e014de6f32..bdc7a3d74a788f450ca5d > > de6c294 > > > 92ce4d4e4550 100644 > > > --- a/gcc/tree-ssa-loop-manip.cc > > > +++ b/gcc/tree-ssa-loop-manip.cc > > > @@ -792,14 +792,19 @@ ip_normal_pos (class loop *loop) > > > > > > /* Stores the standard position for induction variable increment in LOOP > > > (just before the exit condition if it is available and latch block is empty, > > > - end of the latch block otherwise) to BSI. INSERT_AFTER is set to true if > > > - the increment should be inserted after *BSI. */ > > > + end of the latch block otherwise) to BSI. If DEST_BB is specified then that > > > + basic block is used as the destination instead of the loop latch source > > > + block. INSERT_AFTER is set to true if the increment should be inserted > > after > > > + *BSI. */ > > > > > > void > > > standard_iv_increment_position (class loop *loop, gimple_stmt_iterator > > *bsi, > > > - bool *insert_after) > > > + bool *insert_after, basic_block dest_bb) > > > { > > > - basic_block bb = ip_normal_pos (loop), latch = ip_end_pos (loop); > > > + basic_block bb = dest_bb; > > > + if (!bb) > > > + bb = ip_normal_pos (loop); > > > + basic_block latch = ip_end_pos (loop); > > > > I don't think that's a good API extension. Given that we don't support an early > > exit after the main IV exit doesn't this code already work fine as-is? It chooses > > the last exit. The position is also not semantically relevant, we just try to keep > > the latch empty here (that is, it's a bit of a "bad" API). > > > > So, do you really need this change? > > Yes I do, If you look at these kinds of loops https://gist.github.com/Mistuke/66f14fe5c1be32b91ce149bd9b8bb35f > > You'll see that the main exit, i.e. the one attached to the latch block is the early break. > Because SCEV can't analyze it picks the main exit to be the one in BB4. > > This means that the loop control must be placed in BB4. If we place ivtmp_10 = ivtmp_9 - 1 > In BB 3 then that's broken SSA. If we use `ivtmp_9` in BB4 then we'll have an off by one issue. OK, but then I think the fix is to not use standard_iv_increment_position (it's a weird API anyway). Instead insert before the main exit condition. Btw, I assumed this order of main / early exit cannot happen. But I didn't re-review the main exit identification code yet. Richard. > You could have reached the end of the valid range for the loop when you re-enter BB4, since > loads are still allowed you'll then read out of bounds before checking that you exit. > > This is also annoyingly hard to get correct, which Is what took me a long time. Such loops mean > You need to restart the scalar loop at i_7 if you take the main exit. > > Regards, > Tamar > > > > > Maybe we're really using standard_iv_increment_position wrong here, the > > result is supposed to _only_ feed the PHI latch argument. > > Richard. > > > > > gimple *last = last_nondebug_stmt (latch); > > > > > > if (!bb > > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc > > > index > > > > > 6fbb5b80986fd657814b48eb009b52b094f331e6..3d59119787d6afdc5a64 > > 65a547d1 > > > ea2d3d940373 100644 > > > --- a/gcc/tree-vect-loop-manip.cc > > > +++ b/gcc/tree-vect-loop-manip.cc > > > @@ -531,7 +531,8 @@ vect_set_loop_controls_directly (class loop *loop, > > loop_vec_info loop_vinfo, > > > tree index_before_incr, index_after_incr; > > > gimple_stmt_iterator incr_gsi; > > > bool insert_after; > > > - standard_iv_increment_position (loop, &incr_gsi, &insert_after); > > > + edge exit_e = LOOP_VINFO_IV_EXIT (loop_vinfo); > > > + standard_iv_increment_position (loop, &incr_gsi, &insert_after, > > > + exit_e->src); > > > if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo)) > > > { > > > /* Create an IV that counts down from niters_total and whose > > > step @@ -1017,7 +1018,8 @@ > > vect_set_loop_condition_partial_vectors_avx512 (class loop *loop, > > > tree index_before_incr, index_after_incr; > > > gimple_stmt_iterator incr_gsi; > > > bool insert_after; > > > - standard_iv_increment_position (loop, &incr_gsi, &insert_after); > > > + standard_iv_increment_position (loop, &incr_gsi, &insert_after, > > > + exit_edge->src); > > > create_iv (niters_adj, MINUS_EXPR, iv_step, NULL_TREE, loop, > > > &incr_gsi, insert_after, &index_before_incr, > > > &index_after_incr); > > > @@ -1185,7 +1187,7 @@ vect_set_loop_condition_partial_vectors_avx512 > > (class loop *loop, > > > loop handles exactly VF scalars per iteration. */ > > > > > > static gcond * > > > -vect_set_loop_condition_normal (loop_vec_info /* loop_vinfo */, edge > > > exit_edge, > > > +vect_set_loop_condition_normal (loop_vec_info loop_vinfo, edge > > > +exit_edge, > > > class loop *loop, tree niters, tree step, > > > tree final_iv, bool niters_maybe_zero, > > > gimple_stmt_iterator loop_cond_gsi) @@ - > > 1278,7 +1280,8 @@ > > > vect_set_loop_condition_normal (loop_vec_info /* loop_vinfo */, edge > > exit_edge, > > > } > > > } > > > > > > - standard_iv_increment_position (loop, &incr_gsi, &insert_after); > > > + standard_iv_increment_position (loop, &incr_gsi, &insert_after, > > > + exit_edge->src); > > > create_iv (init, PLUS_EXPR, step, NULL_TREE, loop, > > > &incr_gsi, insert_after, &indx_before_incr, &indx_after_incr); > > > indx_after_incr = force_gimple_operand_gsi (&loop_cond_gsi, > > > indx_after_incr, > > > > > > > > > > > > > > > > > > > -- > > Richard Biener > > SUSE Software Solutions Germany GmbH, > > Frankenstrasse 146, 90461 Nuernberg, Germany; > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG > > Nuernberg) > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)