From: "Kewen.Lin" <linkw@linux.ibm.com>
To: GCC Patches <gcc-patches@gcc.gnu.org>
Cc: Richard Biener <richard.guenther@gmail.com>,
Richard Sandiford <richard.sandiford@arm.com>,
Segher Boessenkool <segher@kernel.crashing.org>,
Peter Bergner <bergner@linux.ibm.com>
Subject: [PATCH 2/2] vect: Enhance cost evaluation in vect_transform_slp_perm_load_1
Date: Wed, 17 May 2023 14:15:00 +0800
Message-ID: <71fda837-6a92-7f74-43e1-90b046919f6a@linux.ibm.com>
In-Reply-To: <72a5c5db-bc06-eded-d229-82af34342515@linux.ibm.com>
Hi,
Following Richi's suggestion in [1], I'm working on deferring
cost evaluation so that it sits next to the transformation.
This patch enhances vect_transform_slp_perm_load_1, which
could under-cost vector permutations: the costing doesn't
take nvectors_per_build into account, which makes it
inconsistent with the transformation part.
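To illustrate the under-costing, here is a minimal standalone sketch of
the counting only. It is not the GCC code: count_perms_old,
count_perms_new and clamp_nvectors are hypothetical names, and n_builds
stands in for the number of permutation masks that get built.

```c
#include <assert.h>

/* Hypothetical model of the old counting: n_perms goes up once per
   mask build, no matter how many vectors reuse that mask.  */
static unsigned
count_perms_old (unsigned n_builds, unsigned nvectors_per_build)
{
  unsigned n_perms = 0;
  (void) nvectors_per_build;	/* Ignored, which is the under-costing.  */
  for (unsigned b = 0; b < n_builds; ++b)
    ++n_perms;
  return n_perms;
}

/* Hypothetical model of the zero guard: nstmts may be zero during
   analysis, so use at least one.  */
static unsigned
clamp_nvectors (unsigned nstmts)
{
  return nstmts > 0 ? nstmts : 1;
}

/* Hypothetical model of the new counting: one permute is counted for
   every vector of every mask build, matching what would be emitted.  */
static unsigned
count_perms_new (unsigned n_builds, unsigned nvectors_per_build)
{
  assert (nvectors_per_build > 0);
  unsigned n_perms = 0;
  for (unsigned b = 0; b < n_builds; ++b)
    for (unsigned ri = 0; ri < nvectors_per_build; ++ri)
      ++n_perms;
  return n_perms;
}
```

With one mask build shared by two vectors (illustrative values, not
taken from a dump), count_perms_old (1, 2) gives 1 while
count_perms_new (1, 2) gives 2.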
Bootstrapped and regtested on x86_64-redhat-linux,
aarch64-linux-gnu and powerpc64{,le}-linux-gnu.
Is it ok for trunk?
[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563624.html
BR,
Kewen
-----
gcc/ChangeLog:
* tree-vect-slp.cc (vect_transform_slp_perm_load_1): Adjust the
calculation on n_perms by considering nvectors_per_build.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c: New test.
---
.../vect/costmodel/ppc/costmodel-slp-perm.c | 23 +++++++
gcc/tree-vect-slp.cc | 66 ++++++++++---------
2 files changed, 57 insertions(+), 32 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c
new file mode 100644
index 00000000000..e5c4dceddfb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-slp-perm.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* Specify power9 to ensure the vectorization is profitable
+ and test point stands, otherwise it could be not profitable
+ to vectorize. */
+/* { dg-additional-options "-mdejagnu-cpu=power9 -mpower9-vector" } */
+
+/* Verify we cost the exact count for required vec_perm. */
+
+int x[1024], y[1024];
+
+void
+foo ()
+{
+ for (int i = 0; i < 512; ++i)
+ {
+ x[2 * i] = y[1023 - (2 * i)];
+ x[2 * i + 1] = y[1023 - (2 * i + 1)];
+ }
+}
+
+/* { dg-final { scan-tree-dump-times "2 times vec_perm" 1 "vect" } } */
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index e5c9d7e766e..af9a6dd4fa9 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -8115,12 +8115,12 @@ vect_transform_slp_perm_load_1 (vec_info *vinfo, slp_tree node,
mode = TYPE_MODE (vectype);
poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
+ unsigned int nstmts = SLP_TREE_NUMBER_OF_VEC_STMTS (node);
/* Initialize the vect stmts of NODE to properly insert the generated
stmts later. */
if (! analyze_only)
- for (unsigned i = SLP_TREE_VEC_STMTS (node).length ();
- i < SLP_TREE_NUMBER_OF_VEC_STMTS (node); i++)
+ for (unsigned i = SLP_TREE_VEC_STMTS (node).length (); i < nstmts; i++)
SLP_TREE_VEC_STMTS (node).quick_push (NULL);
/* Generate permutation masks for every NODE. Number of masks for each NODE
@@ -8161,7 +8161,10 @@ vect_transform_slp_perm_load_1 (vec_info *vinfo, slp_tree node,
(b) the permutes only need a single vector input. */
mask.new_vector (nunits, group_size, 3);
nelts_to_build = mask.encoded_nelts ();
- nvectors_per_build = SLP_TREE_VEC_STMTS (node).length ();
+ /* It's possible to obtain zero nstmts during analyze_only, so make
+ it at least one to ensure the later computation for n_perms
+ proceed. */
+ nvectors_per_build = nstmts > 0 ? nstmts : 1;
in_nlanes = DR_GROUP_SIZE (stmt_info) * 3;
}
else
@@ -8252,40 +8255,39 @@ vect_transform_slp_perm_load_1 (vec_info *vinfo, slp_tree node,
return false;
}
- ++*n_perms;
-
+ tree mask_vec = NULL_TREE;
if (!analyze_only)
- {
- tree mask_vec = vect_gen_perm_mask_checked (vectype, indices);
+ mask_vec = vect_gen_perm_mask_checked (vectype, indices);
- if (second_vec_index == -1)
- second_vec_index = first_vec_index;
+ if (second_vec_index == -1)
+ second_vec_index = first_vec_index;
- for (unsigned int ri = 0; ri < nvectors_per_build; ++ri)
+ for (unsigned int ri = 0; ri < nvectors_per_build; ++ri)
+ {
+ ++*n_perms;
+ if (analyze_only)
+ continue;
+ /* Generate the permute statement if necessary. */
+ tree first_vec = dr_chain[first_vec_index + ri];
+ tree second_vec = dr_chain[second_vec_index + ri];
+ gassign *stmt = as_a<gassign *> (stmt_info->stmt);
+ tree perm_dest
+ = vect_create_destination_var (gimple_assign_lhs (stmt),
+ vectype);
+ perm_dest = make_ssa_name (perm_dest);
+ gimple *perm_stmt
+ = gimple_build_assign (perm_dest, VEC_PERM_EXPR, first_vec,
+ second_vec, mask_vec);
+ vect_finish_stmt_generation (vinfo, stmt_info, perm_stmt,
+ gsi);
+ if (dce_chain)
{
- /* Generate the permute statement if necessary. */
- tree first_vec = dr_chain[first_vec_index + ri];
- tree second_vec = dr_chain[second_vec_index + ri];
- gassign *stmt = as_a<gassign *> (stmt_info->stmt);
- tree perm_dest
- = vect_create_destination_var (gimple_assign_lhs (stmt),
- vectype);
- perm_dest = make_ssa_name (perm_dest);
- gimple *perm_stmt
- = gimple_build_assign (perm_dest, VEC_PERM_EXPR,
- first_vec, second_vec, mask_vec);
- vect_finish_stmt_generation (vinfo, stmt_info, perm_stmt,
- gsi);
- if (dce_chain)
- {
- bitmap_set_bit (used_defs, first_vec_index + ri);
- bitmap_set_bit (used_defs, second_vec_index + ri);
- }
-
- /* Store the vector statement in NODE. */
- SLP_TREE_VEC_STMTS (node) [vect_stmts_counter++]
- = perm_stmt;
+ bitmap_set_bit (used_defs, first_vec_index + ri);
+ bitmap_set_bit (used_defs, second_vec_index + ri);
}
+
+ /* Store the vector statement in NODE. */
+ SLP_TREE_VEC_STMTS (node)[vect_stmts_counter++] = perm_stmt;
}
}
else if (!analyze_only)
--
2.39.1