From: Julian Brown <julian@codesourcery.com>
To: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
Cc: <gcc-patches@gcc.gnu.org>, Tom de Vries <tdevries@suse.de>,
"Chung-Lin Tang" <cltang@codesourcery.com>, <jakub@redhat.com>
Subject: Re: [PATCH, OpenACC] Add support for gang local storage allocation in shared memory
Date: Mon, 03 Jun 2019 16:03:00 -0000 [thread overview]
Message-ID: <20190603170245.4a62a0ad@squid.athome> (raw)
In-Reply-To: <20181211150811.47a032cf@squid.athome>
[-- Attachment #1: Type: text/plain, Size: 3290 bytes --]
On Tue, 11 Dec 2018 15:08:11 +0000
Julian Brown <julian@codesourcery.com> wrote:
> Is this version OK? Re-tested with offloading to NVPTX.
This is a ping for the patch posted here:
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00749.html
This is a new version of the patch, rebased and with a couple of
additional bugfixes, as follows:
Firstly, in mark_oacc_gangprivate, each decl is looked up (using
maybe_lookup_decl) to apply the "oacc gangprivate" attribute to the
innermost-nested copy of the decl.
Secondly, I'd misunderstood when the maximum parallelism level was
calculated for each nested omp_context, meaning that the code to
trigger adding the "oacc gangprivate" attribute could trigger in the
wrong circumstances. I've fixed this by moving the attribute-setting to
execute_lower_omp.
I've also added a new testcase (gangprivate-attrib-2.f90). Re-tested
with offloading to nvptx.
OK for trunk?
Thank you,
Julian
2019-06-03 Julian Brown <julian@codesourcery.com>
Chung-Lin Tang <cltang@codesourcery.com>
gcc/
* config/nvptx/nvptx.c (tree-hash-traits.h): Include.
(gangprivate_shared_size): New global variable.
(gangprivate_shared_align): Likewise.
(gangprivate_shared_sym): Likewise.
(gangprivate_shared_hmap): Likewise.
(nvptx_option_override): Initialize gangprivate_shared_sym,
gangprivate_shared_align.
(nvptx_file_end): Output gangprivate_shared_sym.
(nvptx_goacc_expand_accel_var): New function.
(nvptx_set_current_function): Initialise gangprivate_shared_hmap. Add
function comment.
(TARGET_GOACC_EXPAND_ACCEL): Likewise.
* doc/tm.texi (TARGET_GOACC_EXPAND_ACCEL_VAR): Document new hook.
* doc/tm.texi.in (TARGET_GOACC_EXPAND_ACCEL_VAR): Likewise.
* expr.c (expand_expr_real_1): Remap VAR_DECLs marked with the
"oacc gangprivate" attribute.
* omp-low.c (omp_context): Add oacc_partitioning_level and
oacc_addressable_var_decls fields.
(new_omp_context): Initialize oacc_addressable_var_decls in new
omp_context.
(delete_omp_context): Delete oacc_addressable_var_decls in old
omp_context.
(lower_oacc_head_tail): Record partitioning-level count in omp context.
(oacc_record_private_var_clauses, oacc_record_vars_in_bind,
mark_oacc_gangprivate): New functions.
(lower_omp_for): Call oacc_record_private_var_clauses with "for"
clauses.
(lower_omp_target): Likewise, for "target" clauses.
Call mark_oacc_gangprivate for offloaded target regions.
(process_oacc_gangprivate_1): New function.
(lower_omp_1): Call oacc_record_vars_in_bind for GIMPLE_BIND within OMP
regions.
(execute_lower_omp): Call process_oacc_gangprivate_1 for each OMP
context.
* target.def (expand_accel_var): New hook.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/gang-private-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: New test.
* testsuite/libgomp.oacc-c/pr85465.c: New test.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: New test.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: New test.
[-- Attachment #2: 0001-OpenACC-Add-support-for-gang-local-storage-allocatio.patch --]
[-- Type: text/x-patch, Size: 21812 bytes --]
From 917189cd07fcb68ba289c5fbcd768b7d4dff785f Mon Sep 17 00:00:00 2001
From: Julian Brown <julian@codesourcery.com>
Date: Thu, 9 Aug 2018 20:27:04 -0700
Subject: [PATCH] [OpenACC] Add support for gang local storage allocation in
shared memory
2019-06-03 Julian Brown <julian@codesourcery.com>
Chung-Lin Tang <cltang@codesourcery.com>
gcc/
* config/nvptx/nvptx.c (tree-hash-traits.h): Include.
(gangprivate_shared_size): New global variable.
(gangprivate_shared_align): Likewise.
(gangprivate_shared_sym): Likewise.
(gangprivate_shared_hmap): Likewise.
(nvptx_option_override): Initialize gangprivate_shared_sym,
gangprivate_shared_align.
(nvptx_file_end): Output gangprivate_shared_sym.
(nvptx_goacc_expand_accel_var): New function.
(nvptx_set_current_function): Initialise gangprivate_shared_hmap. Add
function comment.
(TARGET_GOACC_EXPAND_ACCEL): Likewise.
* doc/tm.texi (TARGET_GOACC_EXPAND_ACCEL_VAR): Document new hook.
* doc/tm.texi.in (TARGET_GOACC_EXPAND_ACCEL_VAR): Likewise.
* expr.c (expand_expr_real_1): Remap VAR_DECLs marked with the
"oacc gangprivate" attribute.
* omp-low.c (omp_context): Add oacc_partitioning_level and
oacc_addressable_var_decls fields.
(new_omp_context): Initialize oacc_addressable_var_decls in new
omp_context.
(delete_omp_context): Delete oacc_addressable_var_decls in old
omp_context.
(lower_oacc_head_tail): Record partitioning-level count in omp context.
(oacc_record_private_var_clauses, oacc_record_vars_in_bind,
mark_oacc_gangprivate): New functions.
(lower_omp_for): Call oacc_record_private_var_clauses with "for"
clauses.
(lower_omp_target): Likewise, for "target" clauses.
Call mark_oacc_gangprivate for offloaded target regions.
(process_oacc_gangprivate_1): New function.
(lower_omp_1): Call oacc_record_vars_in_bind for GIMPLE_BIND within OMP
regions.
(execute_lower_omp): Call process_oacc_gangprivate_1 for each OMP
context.
* target.def (expand_accel_var): New hook.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/gang-private-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: New test.
* testsuite/libgomp.oacc-c/pr85465.c: New test.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: New test.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: New test.
---
gcc/config/nvptx/nvptx.c | 53 +++++++++
gcc/doc/tm.texi | 8 ++
gcc/doc/tm.texi.in | 2 +
gcc/expr.c | 13 +-
gcc/omp-low.c | 111 ++++++++++++++++++
gcc/target.def | 10 ++
.../gang-private-1.c | 38 ++++++
.../libgomp.oacc-c-c++-common/loop-gwv-2.c | 95 +++++++++++++++
libgomp/testsuite/libgomp.oacc-c/pr85465.c | 11 ++
.../gangprivate-attrib-1.f90 | 25 ++++
.../gangprivate-attrib-2.f90 | 23 ++++
11 files changed, 388 insertions(+), 1 deletion(-)
create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c
create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
create mode 100644 libgomp/testsuite/libgomp.oacc-c/pr85465.c
create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index a28099ac89d..c93fb926609 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -74,6 +74,7 @@
#include "cfgloop.h"
#include "fold-const.h"
#include "intl.h"
+#include "tree-hash-traits.h"
/* This file should be included last. */
#include "target-def.h"
@@ -166,6 +167,12 @@ static unsigned vector_red_align;
static unsigned vector_red_partition;
static GTY(()) rtx vector_red_sym;
+/* Shared memory block for gang-private variables. */
+static unsigned gangprivate_shared_size;
+static unsigned gangprivate_shared_align;
+static GTY(()) rtx gangprivate_shared_sym;
+static hash_map<tree_decl_hash, unsigned int> gangprivate_shared_hmap;
+
/* Global lock variable, needed for 128bit worker & gang reductions. */
static GTY(()) tree global_lock_var;
@@ -247,6 +254,10 @@ nvptx_option_override (void)
vector_red_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
vector_red_partition = 0;
+ gangprivate_shared_sym = gen_rtx_SYMBOL_REF (Pmode, "__gangprivate_shared");
+ SET_SYMBOL_DATA_AREA (gangprivate_shared_sym, DATA_AREA_SHARED);
+ gangprivate_shared_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
+
diagnose_openacc_conflict (TARGET_GOMP, "-mgomp");
diagnose_openacc_conflict (TARGET_SOFT_STACK, "-msoft-stack");
diagnose_openacc_conflict (TARGET_UNIFORM_SIMT, "-muniform-simt");
@@ -5237,6 +5248,10 @@ nvptx_file_end (void)
write_shared_buffer (asm_out_file, vector_red_sym,
vector_red_align, vector_red_size);
+ if (gangprivate_shared_size)
+ write_shared_buffer (asm_out_file, gangprivate_shared_sym,
+ gangprivate_shared_align, gangprivate_shared_size);
+
if (need_softstack_decl)
{
write_var_marker (asm_out_file, false, true, "__nvptx_stacks");
@@ -6430,14 +6445,49 @@ nvptx_can_change_mode_class (machine_mode, machine_mode, reg_class_t)
return false;
}
+/* Implement TARGET_GOACC_EXPAND_ACCEL_VAR. Place "oacc gangprivate"
+ variables in shared memory. */
+
+static rtx
+nvptx_goacc_expand_accel_var (tree var)
+{
+ if (VAR_P (var)
+ && lookup_attribute ("oacc gangprivate", DECL_ATTRIBUTES (var)))
+ {
+ unsigned int offset, *poffset;
+ poffset = gangprivate_shared_hmap.get (var);
+ if (poffset)
+ offset = *poffset;
+ else
+ {
+ unsigned HOST_WIDE_INT align = DECL_ALIGN (var);
+ gangprivate_shared_size
+ = (gangprivate_shared_size + align - 1) & ~(align - 1);
+ if (gangprivate_shared_align < align)
+ gangprivate_shared_align = align;
+
+ offset = gangprivate_shared_size;
+ bool existed = gangprivate_shared_hmap.put (var, offset);
+ gcc_assert (!existed);
+ gangprivate_shared_size += tree_to_uhwi (DECL_SIZE_UNIT (var));
+ }
+ rtx addr = plus_constant (Pmode, gangprivate_shared_sym, offset);
+ return gen_rtx_MEM (TYPE_MODE (TREE_TYPE (var)), addr);
+ }
+ return NULL_RTX;
+}
+
static GTY(()) tree nvptx_previous_fndecl;
+/* Implement TARGET_SET_CURRENT_FUNCTION. Reset per-function context. */
+
static void
nvptx_set_current_function (tree fndecl)
{
if (!fndecl || fndecl == nvptx_previous_fndecl)
return;
+ gangprivate_shared_hmap.empty ();
nvptx_previous_fndecl = fndecl;
vector_red_partition = 0;
oacc_bcast_partition = 0;
@@ -6579,6 +6629,9 @@ nvptx_set_current_function (tree fndecl)
#undef TARGET_HAVE_SPECULATION_SAFE_VALUE
#define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed
+#undef TARGET_GOACC_EXPAND_ACCEL_VAR
+#define TARGET_GOACC_EXPAND_ACCEL_VAR nvptx_goacc_expand_accel_var
+
#undef TARGET_SET_CURRENT_FUNCTION
#define TARGET_SET_CURRENT_FUNCTION nvptx_set_current_function
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 622e8cf240f..61da9709268 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6161,6 +6161,14 @@ like @code{cond_add@var{m}}. The default implementation returns a zero
constant of type @var{type}.
@end deftypefn
+@deftypefn {Target Hook} rtx TARGET_GOACC_EXPAND_ACCEL_VAR (tree @var{var})
+This hook, if defined, is used by accelerator target back-ends to expand
+specially handled kinds of VAR_DECL expressions. A particular use is to
+place variables with specific attributes inside special accelarator
+memories. A return value of NULL indicates that the target does not
+handle this VAR_DECL, and normal RTL expanding is resumed.
+@end deftypefn
+
@node Anchored Addresses
@section Anchored Addresses
@cindex anchored addresses
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 17560fce6b7..5579623e331 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4210,6 +4210,8 @@ address; but often a machine-dependent strategy can generate better code.
@hook TARGET_PREFERRED_ELSE_VALUE
+@hook TARGET_GOACC_EXPAND_ACCEL_VAR
+
@node Anchored Addresses
@section Anchored Addresses
@cindex anchored addresses
diff --git a/gcc/expr.c b/gcc/expr.c
index c78bc74c0d9..34510aab55d 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9974,8 +9974,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
exp = SSA_NAME_VAR (ssa_name);
goto expand_decl_rtl;
- case PARM_DECL:
case VAR_DECL:
+ /* Allow accel compiler to handle specific cases of variables,
+ specifically those tagged with the "oacc gangprivate" attribute,
+ which may be intended to be placed in special memory in GPUs. */
+ if (flag_openacc && targetm.goacc.expand_accel_var)
+ {
+ temp = targetm.goacc.expand_accel_var (exp);
+ if (temp)
+ return temp;
+ }
+ /* ... fall through ... */
+
+ case PARM_DECL:
/* If a static var's type was incomplete when the decl was written,
but the type is complete now, lay out the decl now. */
if (DECL_SIZE (exp) == 0
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index cfc237cd895..d0ed5c2255c 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -137,6 +137,12 @@ struct omp_context
/* True if this construct can be cancelled. */
bool cancellable;
+
+ /* The number of levels of OpenACC partitioning invoked in this context. */
+ unsigned oacc_partitioning_levels;
+
+ /* Addressable variable decls in this context. */
+ vec<tree> *oacc_addressable_var_decls;
};
static splay_tree all_contexts;
@@ -878,6 +884,7 @@ new_omp_context (gimple *stmt, omp_context *outer_ctx)
}
ctx->cb.decl_map = new hash_map<tree, tree>;
+ ctx->oacc_addressable_var_decls = new vec<tree> ();
return ctx;
}
@@ -960,6 +967,7 @@ delete_omp_context (splay_tree_value value)
}
delete ctx->lastprivate_conditional_map;
+ delete ctx->oacc_addressable_var_decls;
XDELETE (ctx);
}
@@ -6757,6 +6765,9 @@ lower_oacc_head_tail (location_t loc, tree clauses,
tree join_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_JOIN);
gcc_assert (count);
+
+ ctx->oacc_partitioning_levels = count;
+
for (unsigned done = 1; count; count--, done++)
{
gimple_seq fork_seq = NULL;
@@ -8458,6 +8469,79 @@ lower_omp_for_lastprivate (struct omp_for_data *fd, gimple_seq *body_p,
}
}
+/* Record vars listed in private clauses in CLAUSES in CTX. This information
+ is used to mark up variables that should be made private per-gang. */
+
+static void
+oacc_record_private_var_clauses (omp_context *ctx, tree clauses)
+{
+ tree c;
+
+ if (!ctx)
+ return;
+
+ for (c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
+ if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_PRIVATE)
+ {
+ tree decl = OMP_CLAUSE_DECL (c);
+ if (VAR_P (decl) && TREE_ADDRESSABLE (decl))
+ ctx->oacc_addressable_var_decls->safe_push (decl);
+ }
+}
+
+/* Record addressable vars declared in BINDVARS in CTX. This information is
+ used to mark up variables that should be made private per-gang. */
+
+static void
+oacc_record_vars_in_bind (omp_context *ctx, tree bindvars)
+{
+ if (!ctx)
+ return;
+
+ for (tree v = bindvars; v; v = DECL_CHAIN (v))
+ if (VAR_P (v) && TREE_ADDRESSABLE (v))
+ ctx->oacc_addressable_var_decls->safe_push (v);
+}
+
+/* Mark addressable variables which are declared implicitly or explicitly as
+ gang private with a special attribute. These may need to have their
+ declarations altered later on in compilation (e.g. in
+ execute_oacc_device_lower or the backend, depending on how the OpenACC
+ execution model is implemented on a given target) to ensure that sharing
+ semantics are correct. */
+
+static void
+mark_oacc_gangprivate (vec<tree> *decls, omp_context *ctx)
+{
+ int i;
+ tree decl;
+
+ FOR_EACH_VEC_ELT (*decls, i, decl)
+ {
+ for (omp_context *thisctx = ctx; thisctx; thisctx = thisctx->outer)
+ {
+ tree inner_decl = maybe_lookup_decl (decl, thisctx);
+ if (inner_decl)
+ {
+ decl = inner_decl;
+ break;
+ }
+ }
+ if (!lookup_attribute ("oacc gangprivate", DECL_ATTRIBUTES (decl)))
+ {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file,
+ "Setting 'oacc gangprivate' attribute for decl:");
+ print_generic_decl (dump_file, decl, TDF_SLIM);
+ fputc ('\n', dump_file);
+ }
+ DECL_ATTRIBUTES (decl)
+ = tree_cons (get_identifier ("oacc gangprivate"),
+ NULL, DECL_ATTRIBUTES (decl));
+ }
+ }
+}
/* Lower code for an OMP loop directive. */
@@ -8475,6 +8559,8 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
push_gimplify_context ();
+ oacc_record_private_var_clauses (ctx, gimple_omp_for_clauses (stmt));
+
lower_omp (gimple_omp_for_pre_body_ptr (stmt), ctx);
block = make_node (BLOCK);
@@ -9420,6 +9506,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
clauses = gimple_omp_target_clauses (stmt);
+ oacc_record_private_var_clauses (ctx, clauses);
+
gimple_seq dep_ilist = NULL;
gimple_seq dep_olist = NULL;
if (omp_find_clause (clauses, OMP_CLAUSE_DEPEND))
@@ -9670,6 +9758,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
if (offloaded)
{
+ mark_oacc_gangprivate (ctx->oacc_addressable_var_decls, ctx);
+
/* Declare all the variables created by mapping and the variables
declared in the scope of the target body. */
record_vars_into (ctx->block_vars, child_fn);
@@ -10521,6 +10611,25 @@ lower_omp_grid_body (gimple_stmt_iterator *gsi_p, omp_context *ctx)
gimple_build_omp_return (false));
}
+/* Find gang-private variables in a context. */
+
+static int
+process_oacc_gangprivate_1 (splay_tree_node node, void * ARG_UNUSED (data))
+{
+ omp_context *ctx = (omp_context *) node->value;
+ unsigned level_total = 0;
+ omp_context *thisctx;
+
+ for (thisctx = ctx; thisctx; thisctx = thisctx->outer)
+ level_total += thisctx->oacc_partitioning_levels;
+
+ /* If the current context and parent contexts are distributed over a
+ total of one parallelism level, we have gang partitioning. */
+ if (level_total == 1)
+ mark_oacc_gangprivate (ctx->oacc_addressable_var_decls, ctx);
+
+ return 0;
+}
/* Callback for lower_omp_1. Return non-NULL if *tp needs to be
regimplified. If DATA is non-NULL, lower_omp_1 is outside
@@ -10665,6 +10774,7 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
ctx);
break;
case GIMPLE_BIND:
+ oacc_record_vars_in_bind (ctx, gimple_bind_vars (as_a <gbind *> (stmt)));
lower_omp (gimple_bind_body_ptr (as_a <gbind *> (stmt)), ctx);
maybe_remove_omp_member_access_dummy_vars (as_a <gbind *> (stmt));
break;
@@ -10905,6 +11015,7 @@ execute_lower_omp (void)
if (all_contexts)
{
+ splay_tree_foreach (all_contexts, process_oacc_gangprivate_1, NULL);
splay_tree_delete (all_contexts);
all_contexts = NULL;
}
diff --git a/gcc/target.def b/gcc/target.def
index 7d52102c815..5334c206afa 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1719,6 +1719,16 @@ for allocating any storage for reductions when necessary.",
void, (gcall *call),
default_goacc_reduction)
+DEFHOOK
+(expand_accel_var,
+"This hook, if defined, is used by accelerator target back-ends to expand\n\
+specially handled kinds of VAR_DECL expressions. A particular use is to\n\
+place variables with specific attributes inside special accelarator\n\
+memories. A return value of NULL indicates that the target does not\n\
+handle this VAR_DECL, and normal RTL expanding is resumed.",
+rtx, (tree var),
+NULL)
+
HOOK_VECTOR_END (goacc)
/* Functions relating to vectorization. */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c
new file mode 100644
index 00000000000..f378346ed0a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c
@@ -0,0 +1,38 @@
+#include <assert.h>
+
+int main (void)
+{
+ int ret;
+
+ #pragma acc parallel num_gangs(1) num_workers(32) copyout(ret)
+ {
+ int w = 0;
+
+ #pragma acc loop worker
+ for (int i = 0; i < 32; i++)
+ {
+ #pragma acc atomic update
+ w++;
+ }
+
+ ret = (w == 32);
+ }
+ assert (ret);
+
+ #pragma acc parallel num_gangs(1) vector_length(32) copyout(ret)
+ {
+ int v = 0;
+
+ #pragma acc loop vector
+ for (int i = 0; i < 32; i++)
+ {
+ #pragma acc atomic update
+ v++;
+ }
+
+ ret = (v == 32);
+ }
+ assert (ret);
+
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
new file mode 100644
index 00000000000..a4f81a39e24
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
@@ -0,0 +1,95 @@
+#include <stdio.h>
+#include <openacc.h>
+#include <alloca.h>
+#include <string.h>
+#include <gomp-constants.h>
+#include <stdlib.h>
+
+#if 0
+#define DEBUG(DIM, IDX, VAL) \
+ fprintf (stderr, "%sdist[%d] = %d\n", (DIM), (IDX), (VAL))
+#else
+#define DEBUG(DIM, IDX, VAL)
+#endif
+
+#define N (32*32*32)
+
+int
+check (const char *dim, int *dist, int dimsize)
+{
+ int ix;
+ int exit = 0;
+
+ for (ix = 0; ix < dimsize; ix++)
+ {
+ DEBUG(dim, ix, dist[ix]);
+ if (dist[ix] < (N) / (dimsize + 0.5)
+ || dist[ix] > (N) / (dimsize - 0.5))
+ {
+ fprintf (stderr, "did not distribute to %ss (%d not between %d "
+ "and %d)\n", dim, dist[ix], (int) ((N) / (dimsize + 0.5)),
+ (int) ((N) / (dimsize - 0.5)));
+ exit |= 1;
+ }
+ }
+
+ return exit;
+}
+
+int main ()
+{
+ int ary[N];
+ int ix;
+ int exit = 0;
+ int gangsize = 0, workersize = 0, vectorsize = 0;
+ int *gangdist, *workerdist, *vectordist;
+
+ for (ix = 0; ix < N;ix++)
+ ary[ix] = -1;
+
+#pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
+ copy(ary) copyout(gangsize, workersize, vectorsize)
+ {
+#pragma acc loop gang worker vector
+ for (unsigned ix = 0; ix < N; ix++)
+ {
+ int g, w, v;
+
+ g = __builtin_goacc_parlevel_id (GOMP_DIM_GANG);
+ w = __builtin_goacc_parlevel_id (GOMP_DIM_WORKER);
+ v = __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR);
+
+ ary[ix] = (g << 16) | (w << 8) | v;
+ }
+
+ gangsize = __builtin_goacc_parlevel_size (GOMP_DIM_GANG);
+ workersize = __builtin_goacc_parlevel_size (GOMP_DIM_WORKER);
+ vectorsize = __builtin_goacc_parlevel_size (GOMP_DIM_VECTOR);
+ }
+
+ gangdist = (int *) alloca (gangsize * sizeof (int));
+ workerdist = (int *) alloca (workersize * sizeof (int));
+ vectordist = (int *) alloca (vectorsize * sizeof (int));
+ memset (gangdist, 0, gangsize * sizeof (int));
+ memset (workerdist, 0, workersize * sizeof (int));
+ memset (vectordist, 0, vectorsize * sizeof (int));
+
+ /* Test that work is shared approximately equally amongst each active
+ gang/worker/vector. */
+ for (ix = 0; ix < N; ix++)
+ {
+ int g = (ary[ix] >> 16) & 255;
+ int w = (ary[ix] >> 8) & 255;
+ int v = ary[ix] & 255;
+
+ gangdist[g]++;
+ workerdist[w]++;
+ vectordist[v]++;
+ }
+
+ exit = check ("gang", gangdist, gangsize);
+ exit |= check ("worker", workerdist, workersize);
+ exit |= check ("vector", vectordist, vectorsize);
+
+ return exit;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c/pr85465.c b/libgomp/testsuite/libgomp.oacc-c/pr85465.c
new file mode 100644
index 00000000000..329e8a09cf9
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c/pr85465.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-w" } */
+
+int
+main (void)
+{
+#pragma acc parallel
+ foo ();
+
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
new file mode 100644
index 00000000000..5f8a5e650ea
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
@@ -0,0 +1,25 @@
+! Test for "oacc gangprivate" attribute on gang-private variables
+
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-omplower-details" }
+! { dg-final { scan-tree-dump-times "Setting 'oacc gangprivate' attribute for decl: integer\\(kind=4\\) w;" 1 "omplower" } } */
+
+program main
+ integer :: w, arr(0:31)
+
+ !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
+ !$acc loop gang private(w)
+ do j = 0, 31
+ w = 0
+ !$acc loop seq
+ do i = 0, 31
+ !$acc atomic update
+ w = w + 1
+ !$acc end atomic
+ end do
+ arr(j) = w
+ end do
+ !$acc end parallel
+
+ if (any (arr .ne. 32)) stop 1
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
new file mode 100644
index 00000000000..d147229d91e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
@@ -0,0 +1,23 @@
+! Test for lack of "oacc gangprivate" attribute on worker-private variables
+
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-omplower-details" }
+! { dg-final { scan-tree-dump-times "Setting 'oacc gangprivate' attribute for decl" 0 "omplower" } } */
+
+program main
+ integer :: w, arr(0:31)
+
+ !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
+ !$acc loop gang worker private(w)
+ do j = 0, 31
+ w = 0
+ !$acc loop seq
+ do i = 0, 31
+ w = w + 1
+ end do
+ arr(j) = w
+ end do
+ !$acc end parallel
+
+ if (any (arr .ne. 32)) stop 1
+end program main
--
2.20.1
next prev parent reply other threads:[~2019-06-03 16:03 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-02-27 16:21 [gomp4] add " Cesar Philippidis
2018-08-13 16:22 ` [PATCH, OpenACC] Add " Julian Brown
2018-08-13 18:42 ` Cesar Philippidis
2018-08-13 19:06 ` Cesar Philippidis
2018-08-15 16:46 ` Julian Brown
2018-08-15 19:57 ` Bernhard Reutner-Fischer
2018-08-16 15:47 ` Julian Brown
2018-08-17 16:39 ` Bernhard Reutner-Fischer
2018-12-11 15:08 ` Julian Brown
2019-06-03 16:03 ` Julian Brown [this message]
2019-06-03 16:23 ` Jakub Jelinek
2019-06-07 14:08 ` Julian Brown
2019-06-12 10:23 ` Jakub Jelinek
2019-06-12 10:32 ` Tom de Vries
2019-06-12 11:57 ` Thomas Schwinge
2019-06-12 19:43 ` Julian Brown
2019-11-06 22:59 ` Julian Brown
2021-05-21 19:05 ` Thomas Schwinge
2022-02-14 15:56 ` Thomas Schwinge
2022-02-15 13:40 ` Julian Brown
2022-03-10 11:28 ` [OpenACC privatization] Analyze 'lookup_decl'-translated DECL [PR90115, PR102330, PR104774] Thomas Schwinge
2022-03-10 11:13 ` Add 'gfortran.dg/goacc-gomp/pr102330-{1,2,3}.f90' [PR102330] Thomas Schwinge
2022-03-10 11:18 ` Add 'c-c++-common/goacc/kernels-decompose-pr104774-1.c' [PR104774] Thomas Schwinge
2018-10-05 14:07 ` [PATCH, OpenACC] Add support for gang local storage allocation in shared memory Tom de Vries
2018-08-13 20:42 ` Julian Brown
2021-05-19 12:10 ` Add 'libgomp.oacc-c-c++-common/loop-gwv-2.c' (was: [PATCH, OpenACC] Add support for gang local storage allocation in shared memory) Thomas Schwinge
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190603170245.4a62a0ad@squid.athome \
--to=julian@codesourcery.com \
--cc=cltang@codesourcery.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
--cc=rep.dot.nop@gmail.com \
--cc=tdevries@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).