From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 1666)
	id 1338E3858D1E; Fri, 23 Feb 2024 07:31:10 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 1338E3858D1E
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1708673470;
	bh=VlX2ka2ZxAmEogMxhAXuvuk+V9PrwDL6FmGoBDl43WA=;
	h=From:To:Subject:Date:From;
	b=LxE8DtJXTIUI8OKzopPyofuYoDKOPbXU+2lB/srIEJwn6TvN5l7PFLyzBnrunF7Oz
	 soOuOfjjB294qjVVDZtfciHE0hwx57fNgcTl3y3CQeA3XtOoDYIZf5P5uv49EfdrL/
	 XfYK3H+c3DRmbvXZEzxySGxR2AFl5/GKTnC4WvPw=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Richard Biener
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/rguenth/heads/vect-force-slp)] Add --param vect-single-lane-slp
X-Act-Checkin: gcc
X-Git-Author: Richard Biener
X-Git-Refname: refs/users/rguenth/heads/vect-force-slp
X-Git-Oldrev: e5d482cbd046c89325670e2675f18b15c9214ca3
X-Git-Newrev: f9c2a5d3b3a0438aec38a31d01d32927f8134a5d
Message-Id: <20240223073110.1338E3858D1E@sourceware.org>
Date: Fri, 23 Feb 2024 07:31:10 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:f9c2a5d3b3a0438aec38a31d01d32927f8134a5d

commit f9c2a5d3b3a0438aec38a31d01d32927f8134a5d
Author: Richard Biener
Date:   Fri Sep 29 12:54:17 2023 +0200

    Add --param vect-single-lane-slp

    The following adds --param vect-single-lane-slp to guard single-lane
    loop SLP discovery.  As first client we look at non-grouped stores
    with an assert that SLP discovery works to discover gaps in it.

            * params.opt (-param=vect-single-lane-slp=): New.
            * tree-vect-slp.cc (vect_analyze_slp): Perform single-lane
            loop SLP discovery for non-grouped stores if requested.

Diff:
---
 gcc/params.opt       |  4 ++++
 gcc/tree-vect-slp.cc | 26 ++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/gcc/params.opt b/gcc/params.opt
index e1848e6cc2af..ae5d0ed24e85 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1198,6 +1198,10 @@ The maximum factor which the loop vectorizer applies to the cost of statements i
 Common Joined UInteger Var(param_vect_induction_float) Init(1) IntegerRange(0, 1) Param Optimization
 Enable loop vectorization of floating point inductions.
 
+-param=vect-single-lane-slp=
+Common Joined UInteger Var(param_vect_single_lane_slp) Init(0) IntegerRange(0, 1) Param Optimization
+Enable single lane SLP discovery.
+
 -param=vect-force-slp=
 Common Joined UInteger Var(param_vect_force_slp) Init(0) IntegerRange(0, 1) Param Optimization
 Fail vectorization when falling back to non-SLP.
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 7cf9504398c9..a89836f0df35 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -3628,6 +3628,7 @@ vect_analyze_slp_instance (vec_info *vinfo,
 opt_result
 vect_analyze_slp (vec_info *vinfo, unsigned max_tree_size)
 {
+  loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
   unsigned int i;
   stmt_vec_info first_element;
   slp_instance instance;
@@ -3643,6 +3644,31 @@ vect_analyze_slp (vec_info *vinfo, unsigned max_tree_size)
   FOR_EACH_VEC_ELT (vinfo->grouped_stores, i, first_element)
     vect_analyze_slp_instance (vinfo, bst_map, first_element,
                                slp_inst_kind_store, max_tree_size, &limit);
+  if (loop_vinfo && param_vect_single_lane_slp != 0)
+    {
+      data_reference_p dr;
+      FOR_EACH_VEC_ELT (vinfo->shared->datarefs, i, dr)
+        if (DR_IS_WRITE (dr))
+          {
+            stmt_vec_info stmt_info = vinfo->lookup_dr (dr)->stmt;
+            /* It works a bit to dissolve the group but that's
+               not really what we want to do. Instead group analysis
+               above starts discovery for each lane and pieces them together
+               to a single store to the whole group. */
+            if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
+              continue;
+            vec<stmt_vec_info> stmts;
+            vec<stmt_vec_info> roots = vNULL;
+            vec<tree> remain = vNULL;
+            stmts.create (1);
+            stmts.quick_push (stmt_info);
+            bool res = vect_build_slp_instance (vinfo, slp_inst_kind_store,
+                                                stmts, roots, remain,
+                                                max_tree_size, &limit,
+                                                bst_map, NULL);
+            gcc_assert (res);
+          }
+    }
 
   if (bb_vec_info bb_vinfo = dyn_cast <bb_vec_info> (vinfo))
     {
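
For illustration only, and not part of the commit: a minimal sketch of the two kinds of store loops the hunk above distinguishes. The function names scale_store and pair_store are hypothetical. The first loop has one store per iteration, which is presumably what the new path treats as a non-grouped store and builds a single-lane SLP instance for. The second loop writes two adjacent elements per iteration, the sort of access that would normally form a store group and so be skipped by the STMT_VINFO_GROUPED_ACCESS check. Whether the vectorizer classifies a given loop this way depends on the target and on the earlier analysis phases, so treat the comments as assumptions.

```c
#include <stddef.h>

/* One store per iteration to consecutive elements.  Assumed to be a
   non-grouped store, i.e. the case the new single-lane discovery
   is meant to pick up.  */
void
scale_store (int *restrict dst, const int *restrict src, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    dst[i] = src[i] * 2;
}

/* Two stores per iteration to adjacent elements.  Assumed to form a
   store group, which the existing grouped-store discovery handles
   and the new loop skips.  */
void
pair_store (int *restrict dst, const int *restrict src, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    {
      dst[2 * i] = src[i];
      dst[2 * i + 1] = src[i] + 1;
    }
}
```

With a compiler built from this branch, compiling such a file with something like `gcc -O3 --param vect-single-lane-slp=1 -fopt-info-vec` should exercise the new code. The parameter defaults to 0, so without it the behavior is unchanged.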