From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by sourceware.org (Postfix) with ESMTPS id 29F9F3858C2C for ; Fri, 8 Jul 2022 10:03:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 29F9F3858C2C Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 903C41FF7B; Fri, 8 Jul 2022 10:02:59 +0000 (UTC) Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 7EB5F2C141; Fri, 8 Jul 2022 10:02:59 +0000 (UTC) Date: Fri, 8 Jul 2022 10:02:59 +0000 (UTC) From: Richard Biener To: gcc-patches@gcc.gnu.org cc: richard.sandiford@arm.com Subject: [PATCH] tree-optimization/106226 - move vectorizer virtual SSA update Message-ID: User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jul 2022 10:03:03 -0000 When we knowingly have broken virtual SSA form we need to update it before we eventually perform slpeel manual updating which will call delete_update_ssa. Currently that's done on-demand but communicating whether it's a known unavoidable case is broken there. The following makes that a synchronous operation but instead of actually performing the update we instead recod the need, clear the update SSA sub-state and force virtual renaming at the very end of the vectorization pass. This reorg exposes an issue like with LOAD_LANES in simd-clone handling which I patch up in a similar fashion. So it might uncover new latent issues elsewhere. Bootstrap / regtest re-running on x86_64-unknown-linux-gnu after fixing the simdclode issue. Richard. PR tree-optimization/106226 * tree-vect-loop-manip.cc (vect_do_peeling): Assert that no SSA update is needed. Move virtual SSA update ... * tree-vectorizer.cc (pass_vectorize::execute): ... here, via forced virtual renaming when TODO_update_ssa_only_virtuals is queued. (vect_transform_loops): Return TODO_update_ssa_only_virtuals when virtual SSA update is required. (try_vectorize_loop_1): Adjust. * tree-vect-stmts.c (vectorizable_simd_clone_call): Allow virtual renaming if the ABI forces an aggregate return but the original call did not have a virtual definition. * gfortran.dg/pr106226.f: New testcase. --- gcc/testsuite/gfortran.dg/pr106226.f | 37 ++++++++++++++++++++++++++++ gcc/tree-vect-loop-manip.cc | 11 ++++----- gcc/tree-vect-stmts.cc | 8 ++++++ gcc/tree-vectorizer.cc | 29 +++++++++++++++++++--- 4 files changed, 76 insertions(+), 9 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/pr106226.f diff --git a/gcc/testsuite/gfortran.dg/pr106226.f b/gcc/testsuite/gfortran.dg/pr106226.f new file mode 100644 index 00000000000..19237bc5a71 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr106226.f @@ -0,0 +1,37 @@ +! { dg-do compile } +! { dg-options "-O3 -std=legacy" } + + SUBROUTINE EFTORD(DM,CHDINT,L4) + IMPLICIT DOUBLE PRECISION (A-H,O-Z) + PARAMETER (MXPT=100, MXFRG=50, MXFGPT=MXPT*MXFRG) + DIMENSION DM(*),CHDINT(L4) + COMMON /FGRAD / DEF0,DEFT0,TORQ0 + * ,ATORQ(3,MXFRG) + COMMON /CSSTV / CX,CY,CZ + * EFBTRM(MXFGPT),EFATRM2(MXFGPT),EFBTRM2(MXFGPT), + * EFDIP(3,MXFGPT),EFQAD(6,MXFGPT), + * EFOCT(10,MXFGPT),FRGNME(MXFGPT) + IF(NROOTS.EQ.5) CALL ROOT5 + IF(NROOTS.EQ.6) CALL ROOT6 + IF(NROOTS.GE.7) THEN + CALL ABRT + END IF + DO 403 I = 1,IJ + CHDINT(ICC)=CHDINT(ICC)-DUM*DUMY + ICC=ICC+1 + 403 CONTINUE + CHDINT(ICC)=CHDINT(ICC)-DUM*DUMY + DO 550 J=MINJ,MAX + LJ=LOCJ+J + IF (LI-LJ) 920,940,940 + 920 ID = LJ + GO TO 960 + 940 ID = LI + 960 NN = (ID*(ID-1))/2+JD + DUM = DM(NN) + ATORQ(1,INF)=ATORQ(1,INF)-DUM*(CHDINT(ICC+1)*EFDIP(3,IC) + $ -CHDINT(ICC+2)*EFDIP(2,IC)) + ICC=ICC+1 + ICC=ICC+1 + 550 CONTINUE + END diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc index d7410d7b4bd..2c2b4f7bd53 100644 --- a/gcc/tree-vect-loop-manip.cc +++ b/gcc/tree-vect-loop-manip.cc @@ -2696,12 +2696,11 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1, class loop *first_loop = loop; bool irred_flag = loop_preheader_edge (loop)->flags & EDGE_IRREDUCIBLE_LOOP; - /* We should not have to update virtual SSA form here but some - transforms involve creating new virtual definitions which makes - updating difficult. */ - gcc_assert (!need_ssa_update_p (cfun) - || loop_vinfo->any_known_not_updated_vssa); - update_ssa (TODO_update_ssa_only_virtuals); + /* SSA form needs to be up-to-date since we are going to manually + update SSA form in slpeel_tree_duplicate_loop_to_edge_cfg and delete all + update SSA state after that, so we have to make sure to not lose any + pending update needs. */ + gcc_assert (!need_ssa_update_p (cfun)); create_lcssa_for_virtual_phi (loop); diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 3db6620dd42..01d982eea98 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4247,6 +4247,14 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, if (!vec_stmt) /* transformation not required. */ { + /* When the original call is pure or const but the SIMD ABI dictates + an aggregate return we will have to use a virtual definition and + in a loop eventually even need to add a virtual PHI. That's + not straight-forward so allow to fix this up via renaming. */ + if (gimple_call_lhs (stmt) + && !gimple_vdef (stmt) + && TREE_CODE (TREE_TYPE (TREE_TYPE (bestn->decl))) == ARRAY_TYPE) + vinfo->any_known_not_updated_vssa = true; STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (bestn->decl); for (i = 0; i < nargs; i++) if ((bestn->simdclone->args[i].arg_type diff --git a/gcc/tree-vectorizer.cc b/gcc/tree-vectorizer.cc index 53dc4520963..6ec49511d74 100644 --- a/gcc/tree-vectorizer.cc +++ b/gcc/tree-vectorizer.cc @@ -82,6 +82,7 @@ along with GCC; see the file COPYING3. If not see #include "opt-problem.h" #include "internal-fn.h" #include "tree-ssa-sccvn.h" +#include "tree-into-ssa.h" /* Loop or bb location, with hotness information. */ dump_user_location_t vect_location; @@ -982,7 +983,7 @@ set_uid_loop_bbs (loop_vec_info loop_vinfo, gimple *loop_vectorized_call, /* Generate vectorized code for LOOP and its epilogues. */ -static void +static unsigned vect_transform_loops (hash_table *&simduid_to_vf_htab, loop_p loop, gimple *loop_vectorized_call, function *fun) @@ -1020,9 +1021,25 @@ vect_transform_loops (hash_table *&simduid_to_vf_htab, = simduid_to_vf_data; } + /* We should not have to update virtual SSA form here but some + transforms involve creating new virtual definitions which makes + updating difficult. + We delay the actual update to the end of the pass but avoid + confusing ourselves by forcing need_ssa_update_p () to false. */ + unsigned todo = 0; + if (need_ssa_update_p (cfun)) + { + gcc_assert (loop_vinfo->any_known_not_updated_vssa); + fun->gimple_df->ssa_renaming_needed = false; + todo |= TODO_update_ssa_only_virtuals; + } + gcc_assert (!need_ssa_update_p (cfun)); + /* Epilogue of vectorized loop must be vectorized too. */ if (new_loop) - vect_transform_loops (simduid_to_vf_htab, new_loop, NULL, fun); + todo |= vect_transform_loops (simduid_to_vf_htab, new_loop, NULL, fun); + + return todo; } /* Try to vectorize LOOP. */ @@ -1133,7 +1150,8 @@ try_vectorize_loop_1 (hash_table *&simduid_to_vf_htab, (*num_vectorized_loops)++; /* Transform LOOP and its epilogues. */ - vect_transform_loops (simduid_to_vf_htab, loop, loop_vectorized_call, fun); + ret |= vect_transform_loops (simduid_to_vf_htab, loop, + loop_vectorized_call, fun); if (loop_vectorized_call) { @@ -1332,6 +1350,11 @@ pass_vectorize::execute (function *fun) if (num_vectorized_loops > 0) { + /* We are collecting some corner cases where we need to update + virtual SSA form via the TODO but delete the queued update-SSA + state. Force renaming if we think that might be necessary. */ + if (ret & TODO_update_ssa_only_virtuals) + mark_virtual_operands_for_renaming (cfun); /* If we vectorized any loop only virtual SSA form needs to be updated. ??? Also while we try hard to update loop-closed SSA form we fail to properly do this in some corner-cases (see PR56286). */ -- 2.36.1