From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 7 Feb 2024 10:49:58 +0100 (CET)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: tamar.christina@arm.com
Subject: [PATCH] Apply TLC to vect_analyze_early_break_dependences
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Message-ID: <20240207094958.ZBn4t8tOf5Hdr-yHcBX9GimZM1Fu9oLCFF2Lkprg_7Q@z>

There has been some confusion in my understanding of how early breaks
work.  The following clarifies some comments and undoes one change that
shouldn't have been necessary.  It also fixes the dependence test
to avoid TBAA (we're moving stores down across loads).

I'm bootstrapping and testing this on x86_64-unknown-linux-gnu.  So far
it has passed vect.exp testing with SSE4.2 and AVX512.

Richard.

	* tree-vect-data-refs.cc (vect_analyze_early_break_dependences):
	Only check whether reads are in-bound in places that are not
	safe.  Fix dependence check.  Add missing newline.  Clarify
	comments.
---
 gcc/tree-vect-data-refs.cc | 43 ++++++++++++++++++++++----------------
 1 file changed, 25 insertions(+), 18 deletions(-)

diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index f79ade9509b..69ba4fb7a82 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -684,9 +684,10 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
   /* Since we don't support general control flow, the location we'll move the
      side-effects to is always the latch connected exit.  When we support
      general control flow we can do better but for now this is fine.  Move
-     side-effects to the in-loop destination of the last early exit.  For the PEELED
-     case we move the side-effects to the latch block as this is guaranteed to be the
-     last block to be executed when a vector iteration finished.  */
+     side-effects to the in-loop destination of the last early exit.  For the
+     PEELED case we move the side-effects to the latch block as this is
+     guaranteed to be the last block to be executed when a vector iteration
+     finished.  */
   if (LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
     dest_bb = loop->latch;
   else
@@ -697,9 +698,9 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
      loads.  */
   basic_block bb = dest_bb;
 
-  /* In the peeled case we need to check all the loads in the loop since to move the
-     the stores we lift the stores over all loads into the latch.  */
-  bool check_deps = LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo);
+  /* We move stores across all loads to the beginning of dest_bb, so
+     the first block processed below doesn't need dependence checking.  */
+  bool check_deps = false;
 
   do
     {
@@ -711,8 +712,7 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
         {
           gimple *stmt = gsi_stmt (gsi);
           gsi_prev (&gsi);
-          if (!gimple_has_ops (stmt)
-              || is_gimple_debug (stmt))
+          if (is_gimple_debug (stmt))
             continue;
 
           stmt_vec_info stmt_vinfo = loop_vinfo->lookup_stmt (stmt);
@@ -720,18 +720,25 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
           if (!dr_ref)
             continue;
 
+          /* We know everything below dest_bb is safe since we know we
+             had a full vector iteration when reaching it.  Either by
+             the loop entry / IV exit test being last or because this
+             is the loop latch itself.  */
+          if (!check_deps)
+            continue;
+
           /* Check if vector accesses to the object will be within bounds.
              must be a constant or assume loop will be versioned or niters
-             bounded by VF so accesses are within range.  We only need to check the
-             reads since writes are moved to a safe place where if we get there we
-             know they are safe to perform.  */
+             bounded by VF so accesses are within range.  We only need to check
+             the reads since writes are moved to a safe place where if we get
+             there we know they are safe to perform.  */
           if (DR_IS_READ (dr_ref)
               && !ref_within_array_bound (stmt, DR_REF (dr_ref)))
             {
               if (dump_enabled_p ())
                 dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                                  "early breaks not supported: vectorization "
-                                 "would %s beyond size of obj.",
+                                 "would %s beyond size of obj.\n",
                                  DR_IS_READ (dr_ref) ? "read" : "write");
               return opt_result::failure_at (stmt,
                                  "can't safely apply code motion to "
@@ -739,9 +746,6 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
                                  "the early exit.\n", stmt);
             }
 
-          if (!check_deps)
-            continue;
-
           if (DR_IS_READ (dr_ref))
             bases.safe_push (dr_ref);
           else if (DR_IS_WRITE (dr_ref))
@@ -768,7 +772,11 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
                  the store.  */
 
               for (auto dr_read : bases)
-                if (dr_may_alias_p (dr_ref, dr_read, loop_nest))
+                /* Note we're not passing the DRs in stmt order here
+                   since the DR dependence checking routine does not
+                   envision we're moving stores down.  The read-write
+                   order tricks it to avoid applying TBAA.  */
+                if (dr_may_alias_p (dr_read, dr_ref, loop_nest))
                   {
                     if (dump_enabled_p ())
                       dump_printf_loc (MSG_MISSED_OPTIMIZATION,
@@ -808,8 +816,7 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
           break;
         }
 
-      /* For the non-PEELED case we don't want to check the loads in the IV exit block
-         for dependencies with the stores, but any block preceeding it we do.  */
+      /* All earlier blocks need dependence checking.  */
       check_deps = true;
       bb = single_pred (bb);
     }
-- 
2.35.3
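
For readers following along, here is a rough toy model of the walk the patch
leaves in place.  It is only a sketch and not GCC code: Kind, Access, Block and
can_sink_stores are hypothetical names, and "may alias" is reduced to "same
base name", which is far cruder than dr_may_alias_p.  Blocks are visited from
dest_bb backwards through single predecessors, statements are visited last to
first, the first block is exempt from checking, and every store found in an
earlier block is tested against the loads it would be sunk across.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for data references; none of these exist in GCC.
enum class Kind { Load, Store };

struct Access
{
  Kind kind;
  std::string base;   // toy alias oracle: equal base means "may alias"
};

// The memory accesses of one basic block, in statement order.
using Block = std::vector<Access>;

// blocks_from_dest[0] models dest_bb, blocks_from_dest[1] its single
// predecessor, and so on.  Returns true if every store in the earlier
// blocks could be sunk to the start of dest_bb in this toy model.
bool
can_sink_stores (const std::vector<Block> &blocks_from_dest)
{
  assert (!blocks_from_dest.empty ());

  std::vector<std::string> loads_seen;   // plays the role of 'bases'
  bool check_deps = false;               // the first block is exempt

  for (const Block &bb : blocks_from_dest)
    {
      // Walk the statements of the block from last to first.
      for (auto it = bb.rbegin (); it != bb.rend (); ++it)
        {
          if (!check_deps)
            continue;

          if (it->kind == Kind::Load)
            loads_seen.push_back (it->base);
          else
            for (const std::string &load_base : loads_seen)
              if (load_base == it->base)
                return false;   // the store would cross an aliasing load
        }

      // All earlier blocks need dependence checking.
      check_deps = true;
    }
  return true;
}
```

In this model a store followed by a load of the same base in the same earlier
block is rejected, while a load in the dest_bb stand-in never blocks the
motion, since stores land at the beginning of that block.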