public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed
* [gcc/devel/omp/gcc-11] Re-do OpenACC private variable resolution
@ 2021-05-13 16:13 Kwok Yeung
  0 siblings, 0 replies; only message in thread
From: Kwok Yeung @ 2021-05-13 16:13 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:f98376f55369775069dc2e3a4d3dcfe9f85caff5

commit f98376f55369775069dc2e3a4d3dcfe9f85caff5
Author: Julian Brown <julian@codesourcery.com>
Date:   Mon Oct 14 13:12:39 2019 -0700

    Re-do OpenACC private variable resolution
    
            gcc/
            * config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename
            to...
            (gcn_goacc_adjust_private_decl): ...this.
            * config/gcn/gcn-tree.c (diagnostic-core.h): Include.
            (gcn_goacc_adjust_gangprivate_decl): Rename to...
            (gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
            * config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
            (TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
            * config/nvptx/nvptx.c (tree-pretty-print.h): Include.
            (nvptx_goacc_adjust_private_decl): New function.
            (TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook using above function.
            * doc/tm.texi.in (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
            (TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
            * doc/tm.texi: Regenerated.
            * internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
            * internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
            * omp-low.c (omp_context): Remove oacc_partitioning_levels field.
            (lower_oacc_reductions): Add PRIVATE_MARKER parameter.  Insert before
            fork.
            (lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify its
            gimple call arguments as appropriate. Don't set
            oacc_partitioning_levels in omp_context. Pass private_marker to
            lower_oacc_reductions.
            (oacc_record_private_var_clauses): Don't check for NULL ctx.
            (mark_oacc_gangprivate): Remove unused function.
            (make_oacc_private_marker): New function.
            (lower_omp_for): Only call oacc_record_vars_in_bind for
            OpenACC contexts.  Create private marker and pass to
            lower_oacc_head_tail.
            (lower_omp_target): Remove unnecessary call to
            oacc_record_private_var_clauses. Remove call to mark_oacc_gangprivate.
            Create private marker and pass to lower_oacc_reductions.
            (process_oacc_gangprivate_1): Remove.
            (lower_omp_1): Only call oacc_record_vars_in_bind for OpenACC.  Don't
            iterate over contexts calling process_oacc_gangprivate_1.
            (omp-offload.c (oacc_loop_xform_head_tail): Treat
            private-variable markers like fork/join when transforming head/tail
            sequences.
            (execute_oacc_device_lower): Use IFN_UNIQUE_OACC_PRIVATE instead of
            "oacc gangprivate" attributes to determine partitioning level of
            variables.  Remove unused variables.
            * omp-sese.c (find_gangprivate_vars): New function.
            (find_local_vars_to_propagate): Use GANGPRIVATE_VARS parameter instead
            of "oacc gangprivate" attribute to determine which variables are
            gang-private.
            (oacc_do_neutering): Use find_gangprivate_vars.
            * target.def (adjust_gangprivate_decl): Rename to...
            (adjust_private_decl): ...this.  Update documentation (briefly).
    
            libgomp/
            * testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Use
            oaccdevlow dump and update scanned output.
            * testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Likewise.
            Add missing atomic to force worker partitioning for test variable.

Diff:
---
 gcc/ChangeLog.omp                                  |  52 +++++++
 gcc/config/gcn/gcn-protos.h                        |   2 +-
 gcc/config/gcn/gcn-tree.c                          |   6 +-
 gcc/config/gcn/gcn.c                               |   4 +-
 gcc/config/nvptx/nvptx.c                           |  26 ++++
 gcc/doc/tm.texi                                    |   5 +-
 gcc/doc/tm.texi.in                                 |   2 +-
 gcc/internal-fn.c                                  |   2 +
 gcc/internal-fn.h                                  |   3 +-
 gcc/omp-low.c                                      | 164 +++++++++++----------
 gcc/omp-offload.c                                  |  59 ++++++--
 gcc/omp-sese.c                                     |  54 ++++++-
 gcc/target.def                                     |   7 +-
 libgomp/ChangeLog.omp                              |   7 +
 .../libgomp.oacc-fortran/gangprivate-attrib-1.f90  |   4 +-
 .../libgomp.oacc-fortran/gangprivate-attrib-2.f90  |   8 +-
 16 files changed, 291 insertions(+), 114 deletions(-)

diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp
index 63dc5b23904..c374ebe92ae 100644
--- a/gcc/ChangeLog.omp
+++ b/gcc/ChangeLog.omp
@@ -1,3 +1,55 @@
+2019-10-16  Julian Brown  <julian@codesourcery.com>
+	    Tobias Burnus  <tobias@codesourcery.com>
+
+	* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename
+	to...
+	(gcn_goacc_adjust_private_decl): ...this.
+	* config/gcn/gcn-tree.c (diagnostic-core.h): Include.
+	(gcn_goacc_adjust_gangprivate_decl): Rename to...
+	(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
+	* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
+	(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
+	* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
+	(nvptx_goacc_adjust_private_decl): New function.
+	(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook using above function.
+	* doc/tm.texi.in (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
+	(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
+	* doc/tm.texi: Regenerated.
+	* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
+	* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
+	* omp-low.c (omp_context): Remove oacc_partitioning_levels field.
+	(lower_oacc_reductions): Add PRIVATE_MARKER parameter.  Insert before
+	fork.
+	(lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify its
+	gimple call arguments as appropriate. Don't set
+	oacc_partitioning_levels in omp_context. Pass private_marker to
+	lower_oacc_reductions.
+	(oacc_record_private_var_clauses): Don't check for NULL ctx.
+	(mark_oacc_gangprivate): Remove unused function.
+	(make_oacc_private_marker): New function.
+	(lower_omp_for): Only call oacc_record_vars_in_bind for
+	OpenACC contexts.  Create private marker and pass to
+	lower_oacc_head_tail.
+	(lower_omp_target): Remove unnecessary call to
+	oacc_record_private_var_clauses. Remove call to mark_oacc_gangprivate.
+	Create private marker and pass to lower_oacc_reductions.
+	(process_oacc_gangprivate_1): Remove.
+	(lower_omp_1): Only call oacc_record_vars_in_bind for OpenACC.  Don't
+	iterate over contexts calling process_oacc_gangprivate_1.
+	(omp-offload.c (oacc_loop_xform_head_tail): Treat
+	private-variable markers like fork/join when transforming head/tail
+	sequences.
+	(execute_oacc_device_lower): Use IFN_UNIQUE_OACC_PRIVATE instead of
+	"oacc gangprivate" attributes to determine partitioning level of
+	variables.  Remove unused variables.
+	* omp-sese.c (find_gangprivate_vars): New function.
+	(find_local_vars_to_propagate): Use GANGPRIVATE_VARS parameter instead
+	of "oacc gangprivate" attribute to determine which variables are
+	gang-private.
+	(oacc_do_neutering): Use find_gangprivate_vars.
+	* target.def (adjust_gangprivate_decl): Rename to...
+	(adjust_private_decl): ...this.  Update documentation (briefly).
+
 2019-09-20  Julian Brown  <julian@codesourcery.com>
 
 	* gimplify.c (localize_reductions): Rewrite references for
diff --git a/gcc/config/gcn/gcn-protos.h b/gcc/config/gcn/gcn-protos.h
index 2e510569c4f..61a2b00e936 100644
--- a/gcc/config/gcn/gcn-protos.h
+++ b/gcc/config/gcn/gcn-protos.h
@@ -40,7 +40,7 @@ extern rtx gcn_gen_undef (machine_mode);
 extern bool gcn_global_address_p (rtx);
 extern tree gcn_goacc_create_propagation_record (tree record_type, bool sender,
 						 const char *name);
-extern void gcn_goacc_adjust_gangprivate_decl (tree var);
+extern void gcn_goacc_adjust_private_decl (tree var, int level);
 extern void gcn_goacc_reduction (gcall *call);
 extern bool gcn_hard_regno_rename_ok (unsigned int from_reg,
 				      unsigned int to_reg);
diff --git a/gcc/config/gcn/gcn-tree.c b/gcc/config/gcn/gcn-tree.c
index b8128f87641..90be7f15151 100644
--- a/gcc/config/gcn/gcn-tree.c
+++ b/gcc/config/gcn/gcn-tree.c
@@ -44,6 +44,7 @@
 #include "cgraph.h"
 #include "targhooks.h"
 #include "langhooks-def.h"
+#include "diagnostic-core.h"
 
 /* }}}  */
 /* {{{ OpenACC reductions.  */
@@ -578,8 +579,11 @@ gcn_goacc_create_propagation_record (tree record_type, bool sender,
 }
 
 void
-gcn_goacc_adjust_gangprivate_decl (tree var)
+gcn_goacc_adjust_private_decl (tree var, int level)
 {
+  if (level != GOMP_DIM_GANG)
+    return;
+
   tree type = TREE_TYPE (var);
   tree lds_type = build_qualified_type (type,
 		    TYPE_QUALS_NO_ADDR_SPACE (type)
diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index 286018b936b..3741853e76f 100644
--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -6313,8 +6313,8 @@ gcn_dwarf_register_span (rtx rtl)
 #undef  TARGET_GOACC_CREATE_PROPAGATION_RECORD
 #define TARGET_GOACC_CREATE_PROPAGATION_RECORD \
   gcn_goacc_create_propagation_record
-#undef  TARGET_GOACC_ADJUST_GANGPRIVATE_DECL
-#define TARGET_GOACC_ADJUST_GANGPRIVATE_DECL gcn_goacc_adjust_gangprivate_decl
+#undef  TARGET_GOACC_ADJUST_PRIVATE_DECL
+#define TARGET_GOACC_ADJUST_PRIVATE_DECL gcn_goacc_adjust_private_decl
 #undef  TARGET_GOACC_FORK_JOIN
 #define TARGET_GOACC_FORK_JOIN gcn_fork_join
 #undef  TARGET_GOACC_REDUCTION
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 347ca534c87..28f197f214a 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -81,6 +81,7 @@
 #include "opts.h"
 #include "tree-hash-traits.h"
 #include "omp-sese.h"
+#include "tree-pretty-print.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -6171,6 +6172,28 @@ nvptx_truly_noop_truncation (poly_uint64, poly_uint64)
   return false;
 }
 
+/* Implement TARGET_GOACC_ADJUST_PRIVATE_DECL.  Set "oacc gangprivate"
+   attribute for gang-private variable declarations.  */
+
+void
+nvptx_goacc_adjust_private_decl (tree decl, int level)
+{
+  if (level != GOMP_DIM_GANG)
+    return;
+
+  if (!lookup_attribute ("oacc gangprivate", DECL_ATTRIBUTES (decl)))
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	{
+	  fprintf (dump_file, "Setting 'oacc gangprivate' attribute for decl:");
+	  print_generic_decl (dump_file, decl, TDF_SLIM);
+	  fputc ('\n', dump_file);
+	}
+      tree id = get_identifier ("oacc gangprivate");
+      DECL_ATTRIBUTES (decl) = tree_cons (id, NULL, DECL_ATTRIBUTES (decl));
+    }
+}
+
 /* Implement TARGET_GOACC_EXPAND_ACCEL_VAR.  Place "oacc gangprivate"
    variables in shared memory.  */
 
@@ -6598,6 +6621,9 @@ nvptx_expand_to_rtl_hook (void)
 #undef TARGET_HAVE_SPECULATION_SAFE_VALUE
 #define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed
 
+#undef TARGET_GOACC_ADJUST_PRIVATE_DECL
+#define TARGET_GOACC_ADJUST_PRIVATE_DECL nvptx_goacc_adjust_private_decl
+
 #undef TARGET_GOACC_EXPAND_ACCEL_VAR
 #define TARGET_GOACC_EXPAND_ACCEL_VAR nvptx_goacc_expand_accel_var
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 6bbbd9fafef..54acc5169b6 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6235,8 +6235,9 @@ memories.  A return value of NULL indicates that the target does not
 handle this VAR_DECL, and normal RTL expanding is resumed.
 @end deftypefn
 
-@deftypefn {Target Hook} void TARGET_GOACC_ADJUST_GANGPRIVATE_DECL (tree @var{var})
-Tweak variable declaration for a gang-private variable.
+@deftypefn {Target Hook} void TARGET_GOACC_ADJUST_PRIVATE_DECL (tree @var{var}, @var{int})
+Tweak variable declaration for a private variable at the specified
+parallelism level.
 @end deftypefn
 
 @deftypevr {Target Hook} bool TARGET_GOACC_WORKER_PARTITIONING
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 1fccdb30f01..6c97edecb2e 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4221,7 +4221,7 @@ address;  but often a machine-dependent strategy can generate better code.
 
 @hook TARGET_GOACC_EXPAND_ACCEL_VAR
 
-@hook TARGET_GOACC_ADJUST_GANGPRIVATE_DECL
+@hook TARGET_GOACC_ADJUST_PRIVATE_DECL
 
 @hook TARGET_GOACC_WORKER_PARTITIONING
 
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index dd7173126fb..e6611e8572f 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2957,6 +2957,8 @@ expand_UNIQUE (internal_fn, gcall *stmt)
       else
 	gcc_unreachable ();
       break;
+    case IFN_UNIQUE_OACC_PRIVATE:
+      break;
     }
 
   if (pattern)
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index c6599ce4894..9004840e0f5 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -36,7 +36,8 @@ along with GCC; see the file COPYING3.  If not see
 #define IFN_UNIQUE_CODES				  \
   DEF(UNSPEC),	\
     DEF(OACC_FORK), DEF(OACC_JOIN),		\
-    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK)
+    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK),	\
+    DEF(OACC_PRIVATE)
 
 enum ifn_unique_kind {
 #define DEF(X) IFN_UNIQUE_##X
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index c584e42f147..206f4328e02 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -153,9 +153,6 @@ struct omp_context
   /* True if this construct can be cancelled.  */
   bool cancellable;
 
-  /* The number of levels of OpenACC partitioning invoked in this context.  */
-  unsigned oacc_partitioning_levels;
-
   /* Addressable variable decls in this context.  */
   vec<tree> *oacc_addressable_var_decls;
 
@@ -7373,8 +7370,9 @@ lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *body_p,
 
 static void
 lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
-		       gcall *fork, gcall *join, gimple_seq *fork_seq,
-		       gimple_seq *join_seq, omp_context *ctx)
+		       gcall *fork, gcall *private_marker, gcall *join,
+		       gimple_seq *fork_seq, gimple_seq *join_seq,
+		       omp_context *ctx)
 {
   gimple_seq before_fork = NULL;
   gimple_seq after_fork = NULL;
@@ -7578,6 +7576,8 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
 
   /* Now stitch things together.  */
   gimple_seq_add_seq (fork_seq, before_fork);
+  if (private_marker)
+    gimple_seq_add_stmt (fork_seq, private_marker);
   if (fork)
     gimple_seq_add_stmt (fork_seq, fork);
   gimple_seq_add_seq (fork_seq, after_fork);
@@ -8320,7 +8320,7 @@ lower_oacc_loop_marker (location_t loc, tree ddvar, bool head,
    HEAD and TAIL.  */
 
 static void
-lower_oacc_head_tail (location_t loc, tree clauses,
+lower_oacc_head_tail (location_t loc, tree clauses, gcall *private_marker,
 		      gimple_seq *head, gimple_seq *tail, omp_context *ctx)
 {
   bool inner = false;
@@ -8328,13 +8328,19 @@ lower_oacc_head_tail (location_t loc, tree clauses,
   gimple_seq_add_stmt (head, gimple_build_assign (ddvar, integer_zero_node));
 
   unsigned count = lower_oacc_head_mark (loc, ddvar, clauses, head, ctx);
+
+  if (private_marker)
+    {
+      gimple_set_location (private_marker, loc);
+      gimple_call_set_lhs (private_marker, ddvar);
+      gimple_call_set_arg (private_marker, 1, ddvar);
+    }
+
   tree fork_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_FORK);
   tree join_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_JOIN);
 
   gcc_assert (count);
 
-  ctx->oacc_partitioning_levels = count;
-
   for (unsigned done = 1; count; count--, done++)
     {
       gimple_seq fork_seq = NULL;
@@ -8361,7 +8367,8 @@ lower_oacc_head_tail (location_t loc, tree clauses,
 			      &join_seq);
 
       lower_oacc_reductions (loc, clauses, place, inner,
-			     fork, join, &fork_seq, &join_seq,  ctx);
+			     fork, (count == 1) ? private_marker : NULL,
+			     join, &fork_seq, &join_seq,  ctx);
 
       /* Append this level to head. */
       gimple_seq_add_seq (head, fork_seq);
@@ -10116,9 +10123,6 @@ oacc_record_private_var_clauses (omp_context *ctx, tree clauses)
 {
   tree c;
 
-  if (!ctx)
-    return;
-
   for (c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
     if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_PRIVATE)
       {
@@ -10142,45 +10146,6 @@ oacc_record_vars_in_bind (omp_context *ctx, tree bindvars)
       ctx->oacc_addressable_var_decls->safe_push (v);
 }
 
-/* Mark addressable variables which are declared implicitly or explicitly as
-   gang private with a special attribute.  These may need to have their
-   declarations altered later on in compilation (e.g. in
-   execute_oacc_device_lower or the backend, depending on how the OpenACC
-   execution model is implemented on a given target) to ensure that sharing
-   semantics are correct.  */
-
-static void
-mark_oacc_gangprivate (vec<tree> *decls, omp_context *ctx)
-{
-  int i;
-  tree decl;
-
-  FOR_EACH_VEC_ELT (*decls, i, decl)
-    {
-      for (omp_context *thisctx = ctx; thisctx; thisctx = thisctx->outer)
-	{
-	  tree inner_decl = maybe_lookup_decl (decl, thisctx);
-	  if (inner_decl)
-	    {
-	      decl = inner_decl;
-	      break;
-	    }
-	}
-      if (!lookup_attribute ("oacc gangprivate", DECL_ATTRIBUTES (decl)))
-	{
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    {
-	      fprintf (dump_file,
-		       "Setting 'oacc gangprivate' attribute for decl:");
-	      print_generic_decl (dump_file, decl, TDF_SLIM);
-	      fputc ('\n', dump_file);
-	    }
-	  DECL_ATTRIBUTES (decl)
-	    = tree_cons (get_identifier ("oacc gangprivate"),
-			 NULL, DECL_ATTRIBUTES (decl));
-	}
-    }
-}
 
 /* Gimplify a GIMPLE_OMP_CRITICAL statement.  This is a relatively simple
    substitution of a couple of function calls.  But in the NAMED case,
@@ -11228,6 +11193,58 @@ lower_omp_for_scan (gimple_seq *body_p, gimple_seq *dlist, gomp_for *stmt,
   *dlist = new_dlist;
 }
 
+/* Build an internal UNIQUE function with type IFN_UNIQUE_OACC_PRIVATE listing
+   the addresses of variables that should be made private at the surrounding
+   parallelism level.  Such functions appear in the gimple code stream in two
+   forms, e.g. for a partitioned loop:
+
+      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6, 1, 68);
+      .data_dep.6 = .UNIQUE (OACC_PRIVATE, .data_dep.6, -1, &w);
+      .data_dep.6 = .UNIQUE (OACC_FORK, .data_dep.6, -1);
+      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6);
+
+   or alternatively, OACC_PRIVATE can appear at the top level of a parallel,
+   not as part of a HEAD_MARK sequence:
+
+      .UNIQUE (OACC_PRIVATE, 0, 0, &w);
+
+   For such stand-alone appearances, the 3rd argument is always 0, denoting
+   gang partitioning.  */
+
+static gcall *
+make_oacc_private_marker (omp_context *ctx)
+{
+  int i;
+  tree decl;
+
+  if (ctx->oacc_addressable_var_decls->length () == 0)
+    return NULL;
+
+  auto_vec<tree, 5> args;
+
+  args.quick_push (build_int_cst (integer_type_node,
+				 IFN_UNIQUE_OACC_PRIVATE));
+  args.quick_push (integer_zero_node);
+  args.quick_push (integer_minus_one_node);
+
+  FOR_EACH_VEC_ELT (*ctx->oacc_addressable_var_decls, i, decl)
+    {
+      for (omp_context *thisctx = ctx; thisctx; thisctx = thisctx->outer)
+	{
+	  tree inner_decl = maybe_lookup_decl (decl, thisctx);
+	  if (inner_decl)
+	    {
+	      decl = inner_decl;
+	      break;
+	    }
+	}
+      tree addr = build_fold_addr_expr (decl);
+      args.safe_push (addr);
+    }
+
+  return gimple_build_call_internal_vec (IFN_UNIQUE, args);
+}
+
 /* Lower code for an OMP loop directive.  */
 
 static void
@@ -11264,6 +11281,8 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
       gbind *inner_bind
 	= as_a <gbind *> (gimple_seq_first_stmt (omp_for_body));
       tree vars = gimple_bind_vars (inner_bind);
+      if (is_gimple_omp_oacc (ctx->stmt))
+	oacc_record_vars_in_bind (ctx, vars);
       gimple_bind_append_vars (new_stmt, vars);
       /* bind_vars/BLOCK_VARS are being moved to new_stmt/block, don't
 	 keep them on the inner_bind and it's block.  */
@@ -11377,6 +11396,12 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
   lower_omp (gimple_omp_body_ptr (stmt), ctx);
 
+  gcall *private_marker = NULL;
+  if (is_gimple_omp_oacc (ctx->stmt)
+      && !gimple_seq_empty_p (omp_for_body)
+      && !gimple_seq_empty_p (omp_for_body))
+    private_marker = make_oacc_private_marker (ctx);
+
   /* Lower the header expressions.  At this point, we can assume that
      the header is of the form:
 
@@ -11431,7 +11456,7 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   if (is_gimple_omp_oacc (ctx->stmt)
       && !ctx_in_oacc_kernels_region (ctx))
     lower_oacc_head_tail (gimple_location (stmt),
-			  gimple_omp_for_clauses (stmt),
+			  gimple_omp_for_clauses (stmt), private_marker,
 			  &oacc_head, &oacc_tail, ctx);
 
   /* Add OpenACC partitioning and reduction markers just before the loop.  */
@@ -12306,8 +12331,6 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
   clauses = gimple_omp_target_clauses (stmt);
 
-  oacc_record_private_var_clauses (ctx, clauses);
-
   gimple_seq dep_ilist = NULL;
   gimple_seq dep_olist = NULL;
   if (omp_find_clause (clauses, OMP_CLAUSE_DEPEND))
@@ -12625,8 +12648,6 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
   if (offloaded)
     {
-      mark_oacc_gangprivate (ctx->oacc_addressable_var_decls, ctx);
-
       /* Declare all the variables created by mapping and the variables
 	 declared in the scope of the target body.  */
       record_vars_into (ctx->block_vars, child_fn);
@@ -13607,8 +13628,14 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	     them as a dummy GANG loop.  */
 	  tree level = build_int_cst (integer_type_node, GOMP_DIM_GANG);
 
+	  gcall *private_marker = make_oacc_private_marker (ctx);
+
+	  if (private_marker)
+	    gimple_call_set_arg (private_marker, 2, level);
+
 	  lower_oacc_reductions (gimple_location (ctx->stmt), clauses, level,
-				 false, NULL, NULL, &fork_seq, &join_seq, ctx);
+				 false, NULL, private_marker, NULL, &fork_seq,
+				 &join_seq, ctx);
 	}
 
       gimple_seq_add_seq (&new_body, fork_seq);
@@ -13707,24 +13734,6 @@ lower_omp_teams (gimple_stmt_iterator *gsi_p, omp_context *ctx)
     TREE_USED (block) = 1;
 }
 
-static int
-process_oacc_gangprivate_1 (splay_tree_node node, void * /* data */)
-{
-  omp_context *ctx = (omp_context *) node->value;
-  unsigned level_total = 0;
-  omp_context *thisctx;
-
-  for (thisctx = ctx; thisctx; thisctx = thisctx->outer)
-    level_total += thisctx->oacc_partitioning_levels;
-
-  /* If the current context and parent contexts are distributed over a
-     total of one parallelism level, we have gang partitioning.  */
-  if (level_total == 1)
-    mark_oacc_gangprivate (ctx->oacc_addressable_var_decls, ctx);
-
-  return 0;
-}
-
 /* Callback for lower_omp_1.  Return non-NULL if *tp needs to be
    regimplified.  If DATA is non-NULL, lower_omp_1 is outside
    of OMP context, but with task_shared_vars set.  */
@@ -13868,7 +13877,9 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 		 ctx);
       break;
     case GIMPLE_BIND:
-      oacc_record_vars_in_bind (ctx, gimple_bind_vars (as_a <gbind *> (stmt)));
+      if (ctx && is_gimple_omp_oacc (ctx->stmt))
+	oacc_record_vars_in_bind (ctx,
+				  gimple_bind_vars (as_a <gbind *> (stmt)));
       lower_omp (gimple_bind_body_ptr (as_a <gbind *> (stmt)), ctx);
       maybe_remove_omp_member_access_dummy_vars (as_a <gbind *> (stmt));
       break;
@@ -14118,7 +14129,6 @@ execute_lower_omp (void)
 
   if (all_contexts)
     {
-      splay_tree_foreach (all_contexts, process_oacc_gangprivate_1, NULL);
       splay_tree_delete (all_contexts);
       all_contexts = NULL;
     }
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 5788456d88f..b10047b96d2 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -1365,7 +1365,9 @@ oacc_loop_xform_head_tail (gcall *from, int level)
 	    = ((enum ifn_unique_kind)
 	       TREE_INT_CST_LOW (gimple_call_arg (stmt, 0)));
 
-	  if (k == IFN_UNIQUE_OACC_FORK || k == IFN_UNIQUE_OACC_JOIN)
+	  if (k == IFN_UNIQUE_OACC_FORK
+	      || k == IFN_UNIQUE_OACC_JOIN
+	      || k == IFN_UNIQUE_OACC_PRIVATE)
 	    *gimple_call_arg_ptr (stmt, 2) = replacement;
 	  else if (k == kind && stmt != from)
 	    break;
@@ -2094,6 +2096,8 @@ execute_oacc_device_lower ()
   for (unsigned i = 0; i < GOMP_DIM_MAX; i++)
     dims[i] = oacc_get_fn_dim_size (current_function_decl, i);
 
+  hash_set<tree> adjusted_vars;
+
   /* Now lower internal loop functions to target-specific code
      sequences.  */
   basic_block bb;
@@ -2170,6 +2174,43 @@ execute_oacc_device_lower ()
 		case IFN_UNIQUE_OACC_TAIL_MARK:
 		  remove = true;
 		  break;
+
+		case IFN_UNIQUE_OACC_PRIVATE:
+		  {
+		    HOST_WIDE_INT level
+		      = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
+		    if (level == -1)
+		      break;
+		    for (unsigned i = 3;
+			 i < gimple_call_num_args (call);
+			 i++)
+		      {
+			tree arg = gimple_call_arg (call, i);
+			gcc_assert (TREE_CODE (arg) == ADDR_EXPR);
+			tree decl = TREE_OPERAND (arg, 0);
+			if (dump_file && (dump_flags & TDF_DETAILS))
+			  {
+			    static char const *const axes[] =
+			      /* Must be kept in sync with GOMP_DIM
+				 enumeration.  */
+			      { "gang", "worker", "vector" };
+			    fprintf (dump_file, "Decl UID %u has %s "
+				     "partitioning:", DECL_UID (decl),
+				     axes[level]);
+			    print_generic_decl (dump_file, decl, TDF_SLIM);
+			    fputc ('\n', dump_file);
+			  }
+			if (targetm.goacc.adjust_private_decl)
+			  {
+			    tree oldtype = TREE_TYPE (decl);
+			    targetm.goacc.adjust_private_decl (decl, level);
+			    if (TREE_TYPE (decl) != oldtype)
+			      adjusted_vars.add (decl);
+			  }
+		      }
+		    remove = true;
+		  }
+		  break;
 		}
 	      break;
 	    }
@@ -2218,22 +2259,8 @@ execute_oacc_device_lower ()
      uses (2).  At least on AMD GCN, there are atomic operations that work
      directly in the LDS address space.  */
 
-  if (targetm.goacc.adjust_gangprivate_decl)
+  if (targetm.goacc.adjust_private_decl)
     {
-      tree var;
-      unsigned i;
-      hash_set<tree> adjusted_vars;
-
-      FOR_EACH_LOCAL_DECL (cfun, i, var)
-	{
-	  if (!VAR_P (var)
-	      || !lookup_attribute ("oacc gangprivate", DECL_ATTRIBUTES (var)))
-	    continue;
-
-	  targetm.goacc.adjust_gangprivate_decl (var);
-	  adjusted_vars.add (var);
-	}
-
       FOR_ALL_BB_FN (bb, cfun)
 	for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
 	     !gsi_end_p (gsi);
diff --git a/gcc/omp-sese.c b/gcc/omp-sese.c
index d9be28ae08f..cfdbb803929 100644
--- a/gcc/omp-sese.c
+++ b/gcc/omp-sese.c
@@ -712,19 +712,61 @@ find_partitioned_var_uses (parallel_g *par, unsigned outer_mask,
     }
 }
 
+/* Gang-private variables (typically placed in a GPU's shared memory) do not
+   need to be processed by the worker-propagation mechanism.  Populate the
+   GANGPRIVATE_VARS set with any such variables found in the current
+   function.  */
+
+static void
+find_gangprivate_vars (hash_set<tree> *gangprivate_vars)
+{
+  basic_block block;
+
+  FOR_EACH_BB_FN (block, cfun)
+    {
+      for (gimple_stmt_iterator gsi = gsi_start_bb (block);
+	   !gsi_end_p (gsi);
+	   gsi_next (&gsi))
+	{
+	  gimple *stmt = gsi_stmt (gsi);
+
+	  if (gimple_call_internal_p (stmt, IFN_UNIQUE))
+	    {
+	      enum ifn_unique_kind k = ((enum ifn_unique_kind)
+		TREE_INT_CST_LOW (gimple_call_arg (stmt, 0)));
+	      if (k == IFN_UNIQUE_OACC_PRIVATE)
+		{
+		  HOST_WIDE_INT level
+		    = TREE_INT_CST_LOW (gimple_call_arg (stmt, 2));
+		  if (level != GOMP_DIM_GANG)
+		    continue;
+		  for (unsigned i = 3; i < gimple_call_num_args (stmt); i++)
+		    {
+		      tree arg = gimple_call_arg (stmt, i);
+		      gcc_assert (TREE_CODE (arg) == ADDR_EXPR);
+		      tree decl = TREE_OPERAND (arg, 0);
+		      gangprivate_vars->add (decl);
+		    }
+		}
+	    }
+	}
+    }
+}
+
 static void
 find_local_vars_to_propagate (parallel_g *par, unsigned outer_mask,
 			      hash_set<tree> *partitioned_var_uses,
+			      hash_set<tree> *gangprivate_vars,
 			      vec<propagation_set *> *prop_set)
 {
   unsigned mask = outer_mask | par->mask;
 
   if (par->inner)
     find_local_vars_to_propagate (par->inner, mask, partitioned_var_uses,
-				  prop_set);
+				  gangprivate_vars, prop_set);
   if (par->next)
     find_local_vars_to_propagate (par->next, outer_mask, partitioned_var_uses,
-				  prop_set);
+				  gangprivate_vars, prop_set);
 
   if (!(mask & GOMP_DIM_MASK (GOMP_DIM_WORKER)))
     {
@@ -746,8 +788,7 @@ find_local_vars_to_propagate (parallel_g *par, unsigned outer_mask,
 		      || is_global_var (var)
 		      || AGGREGATE_TYPE_P (TREE_TYPE (var))
 		      || !partitioned_var_uses->contains (var)
-		      || lookup_attribute ("oacc gangprivate",
-					   DECL_ATTRIBUTES (var)))
+		      || gangprivate_vars->contains (var))
 		    continue;
 
 		  if (stmt_may_clobber_ref_p (stmt, var))
@@ -1352,9 +1393,12 @@ oacc_do_neutering (void)
 			       &prop_set);
 
   hash_set<tree> partitioned_var_uses;
+  hash_set<tree> gangprivate_vars;
 
+  find_gangprivate_vars (&gangprivate_vars);
   find_partitioned_var_uses (par, mask, &partitioned_var_uses);
-  find_local_vars_to_propagate (par, mask, &partitioned_var_uses, &prop_set);
+  find_local_vars_to_propagate (par, mask, &partitioned_var_uses,
+				&gangprivate_vars, &prop_set);
 
   FOR_ALL_BB_FN (bb, cfun)
     {
diff --git a/gcc/target.def b/gcc/target.def
index e884627c26c..a3f67fa8223 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1737,9 +1737,10 @@ rtx, (tree var),
 NULL)
 
 DEFHOOK
-(adjust_gangprivate_decl,
-"Tweak variable declaration for a gang-private variable.",
-void, (tree var),
+(adjust_private_decl,
+"Tweak variable declaration for a private variable at the specified\n\
+parallelism level.",
+void, (tree var, int),
 NULL)
 
 DEFHOOK
diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index e0a6bff168f..6ff1f707fed 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,3 +1,10 @@
+2019-10-16  Julian Brown  <julian@codesourcery.com>
+
+	* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Use
+	oaccdevlow dump and update scanned output.
+	* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Likewise.
+	Add missing atomic to force worker partitioning for test variable.
+
 2019-09-20  Julian Brown  <julian@codesourcery.com>
 
 	* testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: New test.
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
index 9158b6f4768..dafc70c743e 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
@@ -1,8 +1,8 @@
 ! Test for "oacc gangprivate" attribute on gang-private variables
 
 ! { dg-do run }
-! { dg-additional-options "-fdump-tree-omplower-details" }
-! { dg-final { scan-tree-dump-times "Setting 'oacc gangprivate' attribute for decl:  integer\\(kind=4\\) w;" 1 "omplower" } } */
+! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
+! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has gang partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
 
 program main
   integer :: w, arr(0:31)
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
index d147229d91e..90e06be24ff 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
@@ -1,8 +1,8 @@
-! Test for lack of "oacc gangprivate" attribute on worker-private variables
+! Test for worker-private variables
 
 ! { dg-do run }
-! { dg-additional-options "-fdump-tree-omplower-details" }
-! { dg-final { scan-tree-dump-times "Setting 'oacc gangprivate' attribute for decl" 0 "omplower" } } */
+! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
+! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has worker partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
 
 program main
   integer :: w, arr(0:31)
@@ -13,7 +13,9 @@ program main
       w = 0
       !$acc loop seq
       do i = 0, 31
+        !$acc atomic update
         w = w + 1
+        !$acc end atomic
       end do
       arr(j) = w
     end do


^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2021-05-13 16:13 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-13 16:13 [gcc/devel/omp/gcc-11] Re-do OpenACC private variable resolution Kwok Yeung

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).