From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 19 Jan 2024 14:55:40 +0100 (CET)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: tamar.christina@arm.com
Subject: [PATCH] tree-optimization/113373 - add missing LC PHIs for live operations
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Message-Id: <20240119135540.875441388C@imap1.dmz-prg2.suse.org>

The following makes reduction epilogue code generation happy by properly
adding LC PHIs to the exit blocks for multiple exit vectorized loops.

Some refactoring might make the flow easier to follow but I've
refrained from doing that with this patch.

I've kept some fixes in reduction epilogue generation from the
earlier attempt fixing this PR.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

I'm waiting for the linaro CI and on Monday will follow up with
some refactoring.

Richard.

	PR tree-optimization/113373
	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
	Create LC PHIs in the exit blocks where necessary.
	* tree-vect-loop.cc (vectorizable_live_operation): Do not try
	to handle missing LC PHIs.
	(find_connected_edge): Remove.
	(vect_create_epilog_for_reduction): Cleanup use of auto_vec.

	* gcc.dg/vect/vect-early-break_104-pr113373.c: New testcase.
---
 .../vect/vect-early-break_104-pr113373.c      | 19 ++++++++
 gcc/tree-vect-loop-manip.cc                   | 34 ++++++++++++--
 gcc/tree-vect-loop.cc                         | 46 +++++--------------
 3 files changed, 60 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break_104-pr113373.c

diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_104-pr113373.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_104-pr113373.c
new file mode 100644
index 00000000000..1601aafb3e6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_104-pr113373.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+
+struct asCArray {
+  unsigned *array;
+  int length;
+};
+unsigned asCReaderTranslateFunction(struct asCArray b, unsigned t)
+{
+  int size = 0;
+  for (unsigned num; num < t; num++)
+  {
+    if (num >= b.length)
+      __builtin_abort();
+    size += b.array[num];
+  }
+  return size;
+}
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 1477906e96e..eacbc022549 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -1696,7 +1696,8 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
               /* Check if we've already created a new phi node during edge
                  redirection.  If we have, only propagate the value
                  downwards in case there is no merge block.  */
-              if (tree *res = new_phi_args.get (new_arg))
+              tree *res;
+              if ((res = new_phi_args.get (new_arg)))
                 {
                   if (multiple_exits_p)
                     new_arg = *res;
@@ -1717,7 +1718,7 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
                   /* Similar to the single exit case, If we have an existing
                      LCSSA variable thread through the original value otherwise
                      skip it and directly use the final value.  */
-                  if (tree *res = new_phi_args.get (tmp_arg))
+                  if ((res = new_phi_args.get (tmp_arg)))
                     new_arg = *res;
                   else if (!virtual_operand_p (new_arg))
                     new_arg = tmp_arg;
@@ -1728,9 +1729,20 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
 
               /* Otherwise, main loop exit should use the final iter value.  */
               if (multiple_exits_p)
-                SET_PHI_ARG_DEF_ON_EDGE (lcssa_phi,
-                                         single_succ_edge (main_loop_exit_block),
-                                         new_arg);
+                {
+                  /* Create a LC PHI if it doesn't already exist.  */
+                  if (!virtual_operand_p (new_arg) && !res)
+                    {
+                      tree new_def = copy_ssa_name (new_arg);
+                      gphi *lc_phi
+                        = create_phi_node (new_def, main_loop_exit_block);
+                      SET_PHI_ARG_DEF (lc_phi, 0, new_arg);
+                      new_arg = new_def;
+                    }
+                  SET_PHI_ARG_DEF_ON_EDGE (lcssa_phi,
+                                           single_succ_edge (main_loop_exit_block),
+                                           new_arg);
+                }
               else
                 SET_PHI_ARG_DEF_ON_EDGE (lcssa_phi, loop_exit, new_arg);
 
@@ -1766,6 +1778,18 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
                       if (vphi)
                         alt_arg = gimple_phi_result (vphi);
                     }
+                  /* For other live args we didn't create LC PHI nodes.
+                     Do so here.  */
+                  else
+                    {
+                      tree alt_def = copy_ssa_name (alt_arg);
+                      gphi *lc_phi
+                        = create_phi_node (alt_def, alt_loop_exit_block);
+                      for (unsigned i = 0; i < gimple_phi_num_args (lc_phi);
+                           ++i)
+                        SET_PHI_ARG_DEF (lc_phi, i, alt_arg);
+                      alt_arg = alt_def;
+                    }
                   edge main_e = single_succ_edge (alt_loop_exit_block);
                   SET_PHI_ARG_DEF_ON_EDGE (to_phi, main_e, alt_arg);
                 }
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 4769d6f53e4..fe631252dc2 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -6017,7 +6017,6 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
   int j, i;
   vec<tree> &scalar_results = reduc_info->reduc_scalar_results;
   unsigned int group_size = 1, k;
-  auto_vec<gimple *> phis;
   /* SLP reduction without reduction chain, e.g.,
      # a1 = phi
      # b1 = phi
@@ -6930,12 +6929,12 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
      use  */
 
   gcc_assert (live_out_stmts.size () == scalar_results.length ());
+  auto_vec<gimple *> phis;
   for (k = 0; k < live_out_stmts.size (); k++)
     {
       stmt_vec_info scalar_stmt_info = vect_orig_stmt (live_out_stmts[k]);
       scalar_dest = gimple_get_lhs (scalar_stmt_info->stmt);
 
-      phis.create (3);
       /* Find the loop-closed-use at the loop exit of the original scalar
          result.  (The reduction result is expected to have two immediate uses,
          one at the latch block, and one at the loop exit).  For double
@@ -6988,7 +6987,7 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
             }
         }
 
-      phis.release ();
+      phis.truncate (0);
     }
 }
 
@@ -10710,18 +10709,6 @@ vectorizable_live_operation_1 (loop_vec_info loop_vinfo,
   return new_tree;
 }
 
-/* Find the edge that's the final one in the path from SRC to DEST and
-   return it.  This edge must exist in at most one forwarder edge between.  */
-
-static edge
-find_connected_edge (edge src, basic_block dest)
-{
-   if (src->dest == dest)
-     return src;
-
-  return find_edge (src->dest, dest);
-}
-
 /* Function vectorizable_live_operation.
 
    STMT_INFO computes a value that is used outside the loop.  Check if
@@ -10964,13 +10951,8 @@ vectorizable_live_operation (vec_info *vinfo, stmt_vec_info stmt_info,
                 {
                   edge e = gimple_phi_arg_edge (as_a <gphi *> (use_stmt),
                                            phi_arg_index_from_use (use_p));
-                  bool main_exit_edge = e == main_e
-                                        || find_connected_edge (main_e, e->src);
-
-                  /* Early exits have an merge block, we want the merge block itself
-                     so use ->src.  For main exit the merge block is the
-                     destination.  */
-                  basic_block dest = main_exit_edge ? main_e->dest : e->src;
+                  gcc_assert (loop_exit_edge_p (loop, e));
+                  bool main_exit_edge = e == main_e;
                   tree tmp_vec_lhs = vec_lhs;
                   tree tmp_bitstart = bitstart;
 
@@ -10988,22 +10970,18 @@ vectorizable_live_operation (vec_info *vinfo, stmt_vec_info stmt_info,
                   gimple_stmt_iterator exit_gsi;
                   tree new_tree
                     = vectorizable_live_operation_1 (loop_vinfo, stmt_info,
-                                                     dest, vectype, ncopies,
+                                                     e->dest, vectype, ncopies,
                                                      slp_node, bitsize,
                                                      tmp_bitstart, tmp_vec_lhs,
                                                      lhs_type, &exit_gsi);
 
-                  if (gimple_phi_num_args (use_stmt) == 1)
-                    {
-                      auto gsi = gsi_for_stmt (use_stmt);
-                      remove_phi_node (&gsi, false);
-                      tree lhs_phi = gimple_phi_result (use_stmt);
-                      gimple *copy = gimple_build_assign (lhs_phi, new_tree);
-                      gsi_insert_before (&exit_gsi, copy, GSI_SAME_STMT);
-                    }
-                  else
-                    SET_PHI_ARG_DEF (use_stmt, e->dest_idx, new_tree);
-                }
+                  auto gsi = gsi_for_stmt (use_stmt);
+                  remove_phi_node (&gsi, false);
+                  tree lhs_phi = gimple_phi_result (use_stmt);
+                  gimple *copy = gimple_build_assign (lhs_phi, new_tree);
+                  gsi_insert_before (&exit_gsi, copy, GSI_SAME_STMT);
+                  break;
+                }
 
             /* There a no further out-of-loop uses of lhs by LC-SSA construction.  */
             FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, lhs)
-- 
2.35.3
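
Illustration only, not part of the patch: the commit message talks about
values that are live after a vectorized loop with more than one exit.
The sketch below is a runnable variant of that loop shape.  The name
sum_until and its parameters are hypothetical and do not appear in GCC
or in the testcase; the comments describe loop-closed SSA only
informally, assuming the usual convention that a value defined inside a
loop reaches outside uses through a single-argument PHI in an exit block.

```c
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not from the patch).
   Sums a[0..n-1] but leaves the loop early when an element equals STOP.
   The loop has two exits: the normal "i < n" exit and the early break.
   SUM is a reduction whose value is used after the loop, so it is live
   on both exits.  */
unsigned
sum_until (const unsigned *a, size_t n, unsigned stop)
{
  unsigned sum = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (a[i] == stop)   /* early exit */
        break;
      sum += a[i];        /* reduction */
    }
  /* Use outside the loop: in loop-closed SSA form this use would refer
     to a PHI placed in an exit block rather than to the in-loop
     definition of SUM directly.  */
  return sum;
}
```

With a = {1, 2, 3, 4}, sum_until (a, 4, 3) leaves through the early exit
and returns 3, while sum_until (a, 4, 9) runs to the main exit and
returns 10.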