From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com,
segher@kernel.crashing.org, bergner@linux.ibm.com
Subject: [PATCH 3/9 v2] vect: Adjust vectorizable_load costing on VMAT_INVARIANT
Date: Mon, 3 Jul 2023 10:58:30 +0800
Message-ID: <06e499be-2151-5c64-52be-ac8f69c46ad9@linux.ibm.com>
In-Reply-To: <CAFiYyc1YS1TnR8edxm5+=gjt7_PROLQN4dy9L+rZdukv_jDYaA@mail.gmail.com>
Hi Richi,
on 2023/6/30 19:18, Richard Biener wrote:
> On Tue, Jun 13, 2023 at 4:03 AM Kewen Lin <linkw@linux.ibm.com> wrote:
>>
>> This patch adjusts the cost handling on VMAT_INVARIANT in
>> function vectorizable_load. We don't call function
>> vect_model_load_cost for it any more.
>>
>> To make the costing on VMAT_INVARIANT better, this patch is
>> to query hoist_defs_of_uses for hoisting decision, and add
>> costs for different "where" based on it. Currently function
>> hoist_defs_of_uses would always hoist the defs of all SSA
>> uses, adding one argument HOIST_P aims to avoid the actual
>> hoisting during costing phase.
>>
>> gcc/ChangeLog:
>>
>> * tree-vect-stmts.cc (hoist_defs_of_uses): Add one argument HOIST_P.
>> (vectorizable_load): Adjust the handling on VMAT_INVARIANT to respect
>> hoisting decision and without calling vect_model_load_cost.
>> (vect_model_load_cost): Assert it won't get VMAT_INVARIANT any more
>> and remove VMAT_INVARIANT related handlings.
>> ---
>> gcc/tree-vect-stmts.cc | 61 +++++++++++++++++++++++++++---------------
>> 1 file changed, 39 insertions(+), 22 deletions(-)
>>
>> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
>> index 744cdf40e26..19c61d703c8 100644
>> --- a/gcc/tree-vect-stmts.cc
>> +++ b/gcc/tree-vect-stmts.cc
>> @@ -1135,7 +1135,8 @@ vect_model_load_cost (vec_info *vinfo,
>> slp_tree slp_node,
>> stmt_vector_for_cost *cost_vec)
>> {
>> - gcc_assert (memory_access_type != VMAT_GATHER_SCATTER || !gs_info->decl);
>> + gcc_assert ((memory_access_type != VMAT_GATHER_SCATTER || !gs_info->decl)
>> + && memory_access_type != VMAT_INVARIANT);
>>
>> unsigned int inside_cost = 0, prologue_cost = 0;
>> bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info);
>> @@ -1238,16 +1239,6 @@ vect_model_load_cost (vec_info *vinfo,
>> ncopies * assumed_nunits,
>> scalar_load, stmt_info, 0, vect_body);
>> }
>> - else if (memory_access_type == VMAT_INVARIANT)
>> - {
>> - /* Invariant loads will ideally be hoisted and splat to a vector. */
>> - prologue_cost += record_stmt_cost (cost_vec, 1,
>> - scalar_load, stmt_info, 0,
>> - vect_prologue);
>> - prologue_cost += record_stmt_cost (cost_vec, 1,
>> - scalar_to_vec, stmt_info, 0,
>> - vect_prologue);
>> - }
>> else
>> vect_get_load_cost (vinfo, stmt_info, ncopies,
>> alignment_support_scheme, misalignment, first_stmt_p,
>> @@ -9121,10 +9112,11 @@ permute_vec_elements (vec_info *vinfo,
>> /* Hoist the definitions of all SSA uses on STMT_INFO out of the loop LOOP,
>> inserting them on the loops preheader edge. Returns true if we
>> were successful in doing so (and thus STMT_INFO can be moved then),
>> - otherwise returns false. */
>> + otherwise returns false. HOIST_P indicates if we want to hoist the
>> + definitions of all SSA uses, it would be false when we are costing. */
>>
>> static bool
>> -hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop)
>> +hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop, bool hoist_p)
>> {
>> ssa_op_iter i;
>> tree op;
>> @@ -9158,6 +9150,9 @@ hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop)
>> if (!any)
>> return true;
>>
>> + if (!hoist_p)
>> + return true;
>> +
>> FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
>> {
>> gimple *def_stmt = SSA_NAME_DEF_STMT (op);
>> @@ -9510,14 +9505,6 @@ vectorizable_load (vec_info *vinfo,
>>
>> if (memory_access_type == VMAT_INVARIANT)
>> {
>> - if (costing_p)
>> - {
>> - vect_model_load_cost (vinfo, stmt_info, ncopies, vf,
>> - memory_access_type, alignment_support_scheme,
>> - misalignment, &gs_info, slp_node, cost_vec);
>> - return true;
>> - }
>> -
>> gcc_assert (!grouped_load && !mask && !bb_vinfo);
>> /* If we have versioned for aliasing or the loop doesn't
>> have any data dependencies that would preclude this,
>> @@ -9525,7 +9512,37 @@ vectorizable_load (vec_info *vinfo,
>> thus we can insert it on the preheader edge. */
>> bool hoist_p = (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo)
>> && !nested_in_vect_loop
>> - && hoist_defs_of_uses (stmt_info, loop));
>> + && hoist_defs_of_uses (stmt_info, loop, !costing_p));
>
> 'hoist_defs_of_uses' should ideally be computed once at analysis time and
> the result remembered. It's not so easy in this case so maybe just
> add a comment
> for this here.
Ok, updated with:
/* If we have versioned for aliasing or the loop doesn't
have any data dependencies that would preclude this,
then we are sure this is a loop invariant load and
- thus we can insert it on the preheader edge. */
+ thus we can insert it on the preheader edge.
+ TODO: hoist_defs_of_uses should ideally be computed
+ once at analysis time, remembered and reused at
+ transform time. */
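(As a stand-alone illustration of the HOIST_P gating pattern the patch applies to hoist_defs_of_uses -- a simplified sketch with hypothetical types, not GCC's real IL or data structures:)

```cpp
// Simplified sketch: an analysis/transform function that reports whether
// hoisting is possible, but only mutates state when hoist_p is true, so
// the costing phase can query the decision side-effect free.
#include <cassert>
#include <vector>

// Hypothetical stand-in for a loop: statements in the body, plus the
// statements that have been hoisted to the preheader.
struct Loop
{
  std::vector<int> body;
  std::vector<int> preheader;
};

// Returns true if all DEFS can be hoisted out of LOOP.  Performs the
// actual hoisting only when HOIST_P is true; a costing pass calls it
// with hoist_p == false to get the same answer without modification.
static bool
hoist_defs_of_uses (Loop &loop, const std::vector<int> &defs, bool hoist_p)
{
  // Pretend only non-negative defs are loop invariant and hoistable.
  for (int d : defs)
    if (d < 0)
      return false;

  if (!hoist_p)
    return true;        // costing: decision only, no side effects

  for (int d : defs)
    loop.preheader.push_back (d);   // transform: really move them
  return true;
}
```

(The two calls then agree by construction, which is what lets the costing path pass `!costing_p` and still see the transform-time decision.)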
>
>> + if (costing_p)
>> + {
>> + if (hoist_p)
>> + {
>> + unsigned int prologue_cost;
>> + prologue_cost = record_stmt_cost (cost_vec, 1, scalar_load,
>> + stmt_info, 0, vect_prologue);
>> + prologue_cost += record_stmt_cost (cost_vec, 1, scalar_to_vec,
>> + stmt_info, 0, vect_prologue);
>> + if (dump_enabled_p ())
>> + dump_printf_loc (MSG_NOTE, vect_location,
>> + "vect_model_load_cost: inside_cost = 0, "
>> + "prologue_cost = %d .\n",
>> + prologue_cost);
>> + }
>> + else
>> + {
>> + unsigned int inside_cost;
>> + inside_cost = record_stmt_cost (cost_vec, 1, scalar_load,
>> + stmt_info, 0, vect_body);
>> + inside_cost += record_stmt_cost (cost_vec, 1, scalar_to_vec,
>> + stmt_info, 0, vect_body);
>> + if (dump_enabled_p ())
>> + dump_printf_loc (MSG_NOTE, vect_location,
>> + "vect_model_load_cost: inside_cost = %d, "
>> + "prologue_cost = 0 .\n",
>> + inside_cost);
>> + }
>
> Please instead do
>
> enum vect_cost_model_location loc = hoist_p ?
> vect_prologue : vect_body;
>
> and merge the two branches which otherwise look identical to me.
Good idea. The dump_printf_loc calls also differ slightly; updated with:
+ if (costing_p)
+ {
+ enum vect_cost_model_location cost_loc
+ = hoist_p ? vect_prologue : vect_body;
+ unsigned int cost = record_stmt_cost (cost_vec, 1, scalar_load,
+ stmt_info, 0, cost_loc);
+ cost += record_stmt_cost (cost_vec, 1, scalar_to_vec, stmt_info, 0,
+ cost_loc);
+ unsigned int prologue_cost = hoist_p ? cost : 0;
+ unsigned int inside_cost = hoist_p ? 0 : cost;
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_NOTE, vect_location,
+ "vect_model_load_cost: inside_cost = %d, "
+ "prologue_cost = %d .\n",
+ inside_cost, prologue_cost);
+ return true;
+ }
---------------------
The whole v2 patch is below:
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index dd8f5421d4e..ce53cb30c79 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1136,7 +1136,8 @@ vect_model_load_cost (vec_info *vinfo,
slp_tree slp_node,
stmt_vector_for_cost *cost_vec)
{
- gcc_assert (memory_access_type != VMAT_GATHER_SCATTER || !gs_info->decl);
+ gcc_assert ((memory_access_type != VMAT_GATHER_SCATTER || !gs_info->decl)
+ && memory_access_type != VMAT_INVARIANT);
unsigned int inside_cost = 0, prologue_cost = 0;
bool grouped_access_p = STMT_VINFO_GROUPED_ACCESS (stmt_info);
@@ -1241,16 +1242,6 @@ vect_model_load_cost (vec_info *vinfo,
ncopies * assumed_nunits,
scalar_load, stmt_info, 0, vect_body);
}
- else if (memory_access_type == VMAT_INVARIANT)
- {
- /* Invariant loads will ideally be hoisted and splat to a vector. */
- prologue_cost += record_stmt_cost (cost_vec, 1,
- scalar_load, stmt_info, 0,
- vect_prologue);
- prologue_cost += record_stmt_cost (cost_vec, 1,
- scalar_to_vec, stmt_info, 0,
- vect_prologue);
- }
else
vect_get_load_cost (vinfo, stmt_info, ncopies,
alignment_support_scheme, misalignment, first_stmt_p,
@@ -9269,10 +9260,11 @@ permute_vec_elements (vec_info *vinfo,
/* Hoist the definitions of all SSA uses on STMT_INFO out of the loop LOOP,
inserting them on the loops preheader edge. Returns true if we
were successful in doing so (and thus STMT_INFO can be moved then),
- otherwise returns false. */
+ otherwise returns false. HOIST_P indicates whether we want to hoist
+ the definitions of all SSA uses; it is false when we are costing. */
static bool
-hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop)
+hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop, bool hoist_p)
{
ssa_op_iter i;
tree op;
@@ -9306,6 +9298,9 @@ hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop)
if (!any)
return true;
+ if (!hoist_p)
+ return true;
+
FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
{
gimple *def_stmt = SSA_NAME_DEF_STMT (op);
@@ -9658,22 +9653,34 @@ vectorizable_load (vec_info *vinfo,
if (memory_access_type == VMAT_INVARIANT)
{
- if (costing_p)
- {
- vect_model_load_cost (vinfo, stmt_info, ncopies, vf,
- memory_access_type, alignment_support_scheme,
- misalignment, &gs_info, slp_node, cost_vec);
- return true;
- }
-
gcc_assert (!grouped_load && !mask && !bb_vinfo);
/* If we have versioned for aliasing or the loop doesn't
have any data dependencies that would preclude this,
then we are sure this is a loop invariant load and
- thus we can insert it on the preheader edge. */
+ thus we can insert it on the preheader edge.
+ TODO: hoist_defs_of_uses should ideally be computed
+ once at analysis time, remembered and reused at
+ transform time. */
bool hoist_p = (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo)
&& !nested_in_vect_loop
- && hoist_defs_of_uses (stmt_info, loop));
+ && hoist_defs_of_uses (stmt_info, loop, !costing_p));
+ if (costing_p)
+ {
+ enum vect_cost_model_location cost_loc
+ = hoist_p ? vect_prologue : vect_body;
+ unsigned int cost = record_stmt_cost (cost_vec, 1, scalar_load,
+ stmt_info, 0, cost_loc);
+ cost += record_stmt_cost (cost_vec, 1, scalar_to_vec, stmt_info, 0,
+ cost_loc);
+ unsigned int prologue_cost = hoist_p ? cost : 0;
+ unsigned int inside_cost = hoist_p ? 0 : cost;
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_NOTE, vect_location,
+ "vect_model_load_cost: inside_cost = %d, "
+ "prologue_cost = %d .\n",
+ inside_cost, prologue_cost);
+ return true;
+ }
if (hoist_p)
{
gassign *stmt = as_a <gassign *> (stmt_info->stmt);
--
2.31.1