public inbox for gcc-cvs@sourceware.org
* [gcc/devel/omp/gcc-11] [og10] vect: Add target hook to prefer gather/scatter instructions
@ 2021-05-13 16:18 Kwok Yeung
From: Kwok Yeung @ 2021-05-13 16:18 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:2c625c740e5d5b0f984f8aa6b0fa3525d146f7ef
commit 2c625c740e5d5b0f984f8aa6b0fa3525d146f7ef
Author: Julian Brown <julian@codesourcery.com>
Date: Wed Nov 25 09:08:01 2020 -0800
[og10] vect: Add target hook to prefer gather/scatter instructions
For AMD GCN, the instructions available for loading/storing vectors are
always scatter/gather operations (i.e. there are separate addresses for
each vector lane), so the current heuristic to avoid gather/scatter
operations with too many elements in get_group_load_store_type is
counterproductive. Avoiding such operations in that function can
subsequently lead to a missed vectorization opportunity: later analyses
in the vectorizer try to use a very wide array type which is not
available on this target, and so the vectorizer bails out.
The attached patch adds a target hook to override the "single_element_p"
heuristic in that function, and activates it for GCN. This allows much
better code to be generated for affected loops.
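
As a rough illustration of the decision described above, the following
standalone C sketch models the patched condition. It is not the GCC
implementation: the enum, the function name choose_access and the
strided_gs_usable parameter are hypothetical stand-ins (the last one for
the result of vect_use_strided_gather_scatters_p), and the loop_vinfo
check is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two access kinds relevant here; the
   real vectorizer distinguishes more kinds than these.  */
enum access_type
{
  ACCESS_ELEMENTWISE,
  ACCESS_GATHER_SCATTER
};

/* Simplified model of the patched test, starting from an access that
   has already been classified as elementwise.

   prefer_gather_scatter: value of the new target hook (assumed input).
   single_element_p:      the existing heuristic flag.
   strided_gs_usable:     hypothetical stand-in for whether strided
                          gather/scatter can be used for this access.  */
static enum access_type
choose_access (bool prefer_gather_scatter, bool single_element_p,
               bool strided_gs_usable)
{
  /* Before the patch only single_element_p could enable this path;
     the hook now enables it as well.  */
  if ((prefer_gather_scatter || single_element_p) && strided_gs_usable)
    return ACCESS_GATHER_SCATTER;

  return ACCESS_ELEMENTWISE;
}
```

In this model, a target that sets the hook gets gather/scatter for a
multi-element group whenever the strided form is usable, whereas a
target that leaves the hook at its default of false keeps the
elementwise fallback for the same group.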
2021-01-13 Julian Brown <julian@codesourcery.com>
gcc/
* doc/tm.texi.in (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Add
documentation hook.
* doc/tm.texi: Regenerate.
* target.def (prefer_gather_scatter): Add target hook under vectorizer.
* tree-vect-stmts.c (get_group_load_store_type): Optionally prefer
gather/scatter instructions to scalar/elementwise fallback.
* config/gcn/gcn.c (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Define
hook.
Diff:
---
gcc/config/gcn/gcn.c | 2 ++
gcc/doc/tm.texi | 5 +++++
gcc/doc/tm.texi.in | 2 ++
gcc/target.def | 8 ++++++++
gcc/tree-vect-stmts.c | 9 +++++++--
5 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index 95b19c485fc..c1823e8747e 100644
--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -6534,6 +6534,8 @@ gcn_dwarf_register_span (rtx rtl)
gcn_vector_alignment_reachable
#undef TARGET_VECTOR_MODE_SUPPORTED_P
#define TARGET_VECTOR_MODE_SUPPORTED_P gcn_vector_mode_supported_p
+#undef TARGET_VECTORIZE_PREFER_GATHER_SCATTER
+#define TARGET_VECTORIZE_PREFER_GATHER_SCATTER true
struct gcc_target targetm = TARGET_INITIALIZER;
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8259b6dbb38..873c4919a22 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6143,6 +6143,11 @@ The default is @code{NULL_TREE} which means to not vectorize scatter
stores.
@end deftypefn
+@deftypevr {Target Hook} bool TARGET_VECTORIZE_PREFER_GATHER_SCATTER
+This hook is set to TRUE if gather loads or scatter stores are cheaper on
+this target than a sequence of elementwise loads or stores.
+@end deftypevr
+
@deftypefn {Target Hook} int TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN (struct cgraph_node *@var{}, struct cgraph_simd_clone *@var{}, @var{tree}, @var{int})
This hook should set @var{vecsize_mangle}, @var{vecsize_int}, @var{vecsize_float}
fields in @var{simd_clone} structure pointed by @var{clone_info} argument and also
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 193b3478b5c..69531ac46cf 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4199,6 +4199,8 @@ address; but often a machine-dependent strategy can generate better code.
@hook TARGET_VECTORIZE_BUILTIN_SCATTER
+@hook TARGET_VECTORIZE_PREFER_GATHER_SCATTER
+
@hook TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN
@hook TARGET_SIMD_CLONE_ADJUST
diff --git a/gcc/target.def b/gcc/target.def
index 3b3719dd3b7..a00eded91e0 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2012,6 +2012,14 @@ all zeros. GCC can then try to branch around the instruction instead.",
(unsigned ifn),
default_empty_mask_is_expensive)
+/* Prefer gather/scatter loads/stores to e.g. elementwise accesses if\n\
+we cannot use a contiguous access. */
+DEFHOOKPOD
+(prefer_gather_scatter,
+ "This hook is set to TRUE if gather loads or scatter stores are cheaper on\n\
+this target than a sequence of elementwise loads or stores.",
+ bool, false)
+
/* Target builtin that implements vector gather operation. */
DEFHOOK
(builtin_gather,
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 4c01e82ff39..35288a03ab8 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -2264,9 +2264,14 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
it probably isn't a win to use separate strided accesses based
on nearby locations. Or, even if it's a win over scalar code,
it might not be a win over vectorizing at a lower VF, if that
- allows us to use contiguous accesses. */
+ allows us to use contiguous accesses.
+
+ On some targets (e.g. AMD GCN), always use gather/scatter accesses
+ here since those are the only types of vector loads/stores available,
+ and the fallback case of using elementwise accesses is very
+ inefficient. */
if (*memory_access_type == VMAT_ELEMENTWISE
- && single_element_p
+ && (targetm.vectorize.prefer_gather_scatter || single_element_p)
&& loop_vinfo
&& vect_use_strided_gather_scatters_p (stmt_info, loop_vinfo,
masked_p, gs_info))