public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/3] openacc: Gang-private variables in shared memory
@ 2021-02-26 12:34 Julian Brown
  2021-02-26 12:34 ` [PATCH 1/3] openacc: Add support for gang local storage allocation " Julian Brown
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Julian Brown @ 2021-02-26 12:34 UTC
  To: gcc-patches; +Cc: jakub, Thomas Schwinge, Tom de Vries

This series contains a rebased/updated/bug-fixed version of the patch
to place gang-local variables in GPU shared memory, last posted here:

  https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534551.html

Further commentary is included with the individual patches. I am posting
this for review now, but I would not expect to commit it until stage 1.

Thanks,

Julian

Julian Brown (3):
  openacc: Add support for gang local storage allocation in shared
    memory
  amdgcn: AMD GCN parts for OpenACC private variables patch
  nvptx: NVPTX parts for OpenACC private variables patch

 gcc/config/gcn/gcn-protos.h                   |   2 +-
 gcc/config/gcn/gcn-tree.c                     |   9 +-
 gcc/config/gcn/gcn.c                          |   4 +-
 gcc/config/nvptx/nvptx.c                      |  78 ++++++
 gcc/doc/tm.texi                               |  26 ++
 gcc/doc/tm.texi.in                            |   4 +
 gcc/expr.c                                    |  13 +-
 gcc/internal-fn.c                             |   2 +
 gcc/internal-fn.h                             |   3 +-
 gcc/omp-low.c                                 | 122 +++++++++-
 gcc/omp-offload.c                             | 225 +++++++++++++++++-
 gcc/target.def                                |  30 +++
 .../gang-private-1.c                          |  38 +++
 .../libgomp.oacc-c-c++-common/loop-gwv-2.c    |  95 ++++++++
 .../gangprivate-attrib-1.f90                  |  25 ++
 .../gangprivate-attrib-2.f90                  |  25 ++
 16 files changed, 687 insertions(+), 14 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90

-- 
2.29.2



* [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory
  2021-02-26 12:34 [PATCH 0/3] openacc: Gang-private variables in shared memory Julian Brown
@ 2021-02-26 12:34 ` Julian Brown
  2021-04-15 17:26   ` Thomas Schwinge
  2021-05-21 19:12   ` [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory Thomas Schwinge
  2021-02-26 12:34 ` [PATCH 2/3] amdgcn: AMD GCN parts for OpenACC private variables patch Julian Brown
  2021-02-26 12:34 ` [PATCH 3/3] nvptx: NVPTX " Julian Brown
  2 siblings, 2 replies; 24+ messages in thread
From: Julian Brown @ 2021-02-26 12:34 UTC
  To: gcc-patches; +Cc: jakub, Thomas Schwinge, Tom de Vries

This patch implements a method to track the "private-ness" of
OpenACC variables declared in offload regions in gang-partitioned,
worker-partitioned or vector-partitioned modes. Variables declared
implicitly in scoped blocks and those declared "private" on enclosing
directives (e.g. "acc parallel") are both handled. Variables that are
e.g. gang-private can then be adjusted so they reside in GPU shared
memory.

The reason for doing this is twofold: correct implementation of OpenACC
semantics, and optimisation, since shared memory might be faster than
the main memory on a GPU. Handling of private variables is intimately
tied to the execution model for gangs/workers/vectors implemented by
a particular target: for current targets, we use (or on mainline, will
soon use) a broadcasting/neutering scheme.

That is sufficient for code that e.g. sets a variable in worker-single
mode and expects to use the value in worker-partitioned mode. The
difficulty (semantics-wise) comes when the user wants to do something like
an atomic operation in worker-partitioned mode and expects a worker-single
(gang private) variable to be shared across each partitioned worker.
Forcing use of shared memory for such variables makes that work properly.
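
As a rough illustration of the case described above, here is a minimal
sketch modelled on the gang-private-1.c test added by this patch. The
function name count_worker_iterations is hypothetical and is not part of
the patch. The variable "w" is declared in worker-single mode, so there
should be one copy per gang, and every worker in the partitioned loop is
expected to update that same copy atomically.

```c
#include <assert.h>

/* Hypothetical helper, not part of the patch.  Counts loop iterations
   through a gang-private variable.  If OpenACC is not enabled, the
   pragmas are ignored and the loop runs sequentially, which yields the
   same count.  */
int
count_worker_iterations (void)
{
  int ret = 0;

  #pragma acc parallel num_gangs(1) num_workers(32) copyout(ret)
  {
    /* Declared in worker-single mode: one copy per gang.  */
    int w = 0;

    #pragma acc loop worker
    for (int i = 0; i < 32; i++)
      {
        /* Each worker must update the same 'w', not a per-worker copy.  */
        #pragma acc atomic update
        w++;
      }

    ret = w;
  }

  /* All 32 increments must be visible in the single gang-private copy.  */
  assert (ret == 32);
  return ret;
}
```

If each worker instead kept its own copy of "w", the final value seen in
worker-single mode would not reflect all 32 increments, which is the
failure that placing the variable in shared memory is meant to avoid.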

In terms of implementation, the parallelism level of a given loop is
not fixed until the oaccdevlow pass in the offload compiler, so the
patch delays fixing the parallelism level of variables declared on or
within such loops until the same point. This is done by adding a new
internal UNIQUE function (OACC_PRIVATE) that takes (the address of) each
private variable as an argument, along with further arguments from which
the correct parallelism level for the listed variables can be determined.
This new internal function fits into the existing scheme for
demarcating OpenACC loops, as described in comments in the patch.
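
The two-phase idea can be sketched with a toy model in plain C. This is
only an illustration under stated assumptions: struct private_marker,
make_marker, resolve_marker and vars_to_adjust are hypothetical names, and
the model does not use GCC's real gimple representation. It only mirrors
the argument layout (a level slot that starts out as -1, followed by the
listed variables) and the fact that a marker whose level is still unset is
skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GOMP_DIM_* values; -1 means "not yet
   known", matching the placeholder used for the level argument.  */
enum { DIM_UNSET = -1, DIM_GANG = 0, DIM_WORKER = 1, DIM_VECTOR = 2 };

/* Toy model of an OACC_PRIVATE marker; not GCC's representation.  */
struct private_marker
{
  int level;            /* Models argument 2 of the .UNIQUE call.  */
  size_t nvars;         /* Number of listed variables.  */
  const char *vars[4];  /* Models arguments 3..n (variable addresses).  */
};

/* Models lowering time: the variables are known, the level is not.  */
struct private_marker
make_marker (const char *const *names, size_t n)
{
  struct private_marker m = { DIM_UNSET, 0, { NULL } };

  assert (n <= 4);
  for (size_t i = 0; i < n; i++)
    m.vars[m.nvars++] = names[i];
  return m;
}

/* Models device-lowering time: the loop's level has been fixed, so it
   is written into the marker.  */
void
resolve_marker (struct private_marker *m, int level)
{
  assert (level >= DIM_GANG && level <= DIM_VECTOR);
  m->level = level;
}

/* Number of variables that would be passed on for adjustment.  A marker
   whose level is still unset contributes nothing.  */
size_t
vars_to_adjust (const struct private_marker *m)
{
  return m->level == DIM_UNSET ? 0 : m->nvars;
}
```

In this model, a marker built for the single variable "w" reports zero
variables to adjust until resolve_marker has been called with a concrete
level, after which it reports one.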

Two new target hooks are introduced: TARGET_GOACC_ADJUST_PRIVATE_DECL and
TARGET_GOACC_EXPAND_VAR_DECL.  The first can tweak a variable declaration
at oaccdevlow time, and the second at expand time.  A given offload target
can use the first hook alone or both hooks together, depending on its
strategy for implementing private variables.
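
How the two hooks are meant to cooperate can be sketched with a small
stand-alone model. Everything here is hypothetical (struct toy_decl, the
toy_* functions, the enum values) and stands in for GCC's tree and rtx
types; it only shows the calling convention: both hooks are optional, the
adjust hook modifies a declaration for a given level, and the expand hook
may decline, in which case normal expansion proceeds.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_DIM_GANG = 0, TOY_DIM_WORKER = 1, TOY_DIM_VECTOR = 2 };

/* Hypothetical result codes standing in for an rtx.  TOY_RTL_NONE models
   a NULL return, i.e. "the target does not handle this decl".  */
enum toy_rtl { TOY_RTL_NONE, TOY_RTL_DEFAULT, TOY_RTL_SHARED };

/* Toy variable declaration.  */
struct toy_decl
{
  int address_space;  /* 0 = default, 1 = hypothetical shared space.  */
  int is_static;
};

/* Optional hooks; a NULL pointer means the target does not define it.  */
struct toy_goacc_hooks
{
  struct toy_decl *(*adjust_private_decl) (struct toy_decl *, int);
  enum toy_rtl (*expand_var_decl) (struct toy_decl *);
};

/* Example adjust hook: gang-level decls go to the shared space and
   become static; other levels are left alone.  */
struct toy_decl *
toy_adjust_private_decl (struct toy_decl *decl, int level)
{
  assert (decl != NULL);
  if (level == TOY_DIM_GANG)
    {
      decl->address_space = 1;
      decl->is_static = 1;
    }
  return decl;
}

/* Example expand hook: handles only decls in the shared space.  */
enum toy_rtl
toy_expand_var_decl (struct toy_decl *decl)
{
  assert (decl != NULL);
  return decl->address_space == 1 ? TOY_RTL_SHARED : TOY_RTL_NONE;
}

/* Caller side at expand time: try the hook first, then fall back to the
   default expansion if it is undefined or declines.  */
enum toy_rtl
toy_expand (const struct toy_goacc_hooks *hooks, struct toy_decl *decl)
{
  if (hooks->expand_var_decl)
    {
      enum toy_rtl r = hooks->expand_var_decl (decl);
      if (r != TOY_RTL_NONE)
        return r;
    }
  return TOY_RTL_DEFAULT;
}
```

A target that only needs to change a declaration's address space would
define just the adjust hook in this model; one that also needs a
non-standard expansion would define both.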

Tested with offloading to AMD GCN and (separately) to NVPTX.

OK (for stage 1)?

Thanks,

Julian

2021-02-22  Julian Brown  <julian@codesourcery.com>
	    Chung-Lin Tang  <cltang@codesourcery.com>

gcc/
	* doc/tm.texi.in (TARGET_GOACC_EXPAND_VAR_DECL,
	TARGET_GOACC_ADJUST_PRIVATE_DECL): Add documentation hooks.
	* doc/tm.texi: Regenerate.
	* expr.c (expand_expr_real_1): Expand decls using the expand_var_decl
	OpenACC hook if defined.
	* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
	* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
	* omp-low.c (omp_context): Add oacc_addressable_var_decls field.
	(lower_oacc_reductions): Add PRIVATE_MARKER parameter.  Insert before
	fork.
	(lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify private
	marker's gimple call arguments, and pass it to lower_oacc_reductions.
	(oacc_record_private_var_clauses, oacc_record_vars_in_bind,
	make_oacc_private_marker): New functions.
	(lower_omp_for): Call oacc_record_private_var_clauses with "for"
	clauses. Call oacc_record_vars_in_bind for OpenACC contexts. Create
	private marker and pass to lower_oacc_head_tail.
	(lower_omp_target): Create private marker and pass to
	lower_oacc_reductions.
	(lower_omp_1): Call oacc_record_vars_in_bind for OpenACC.
	* omp-offload.c (convert.h): Include.
	(oacc_loop_xform_head_tail): Treat private-variable markers like
	fork/join when transforming head/tail sequences.
	(struct var_decl_rewrite_info): Add struct.
	(oacc_rewrite_var_decl): New function.
	(is_sync_builtin_call): New function.
	(execute_oacc_device_lower): Support rewriting gang-private variables
	using target hook, and fix up addr_expr and var_decl nodes afterwards.
	* target.def (expand_var_decl, adjust_private_decl): New hooks.

libgomp/
	* testsuite/libgomp.oacc-c-c++-common/gang-private-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: Likewise.
	* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Likewise.
---
 gcc/doc/tm.texi                               |  26 ++
 gcc/doc/tm.texi.in                            |   4 +
 gcc/expr.c                                    |  13 +-
 gcc/internal-fn.c                             |   2 +
 gcc/internal-fn.h                             |   3 +-
 gcc/omp-low.c                                 | 122 +++++++++-
 gcc/omp-offload.c                             | 225 +++++++++++++++++-
 gcc/target.def                                |  30 +++
 .../gang-private-1.c                          |  38 +++
 .../libgomp.oacc-c-c++-common/loop-gwv-2.c    |  95 ++++++++
 .../gangprivate-attrib-1.f90                  |  25 ++
 .../gangprivate-attrib-2.f90                  |  25 ++
 12 files changed, 599 insertions(+), 9 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 062785af1e2..94927ea7b2b 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6227,6 +6227,32 @@ like @code{cond_add@var{m}}.  The default implementation returns a zero
 constant of type @var{type}.
 @end deftypefn
 
+@deftypefn {Target Hook} rtx TARGET_GOACC_EXPAND_VAR_DECL (tree @var{var})
+This hook, if defined, is used by accelerator target back-ends to expand
+specially handled kinds of @code{VAR_DECL} expressions.  A particular use is
+to place variables with specific attributes inside special accelerator
+memories.  A return value of @code{NULL} indicates that the target does not
+handle this @code{VAR_DECL}, and normal RTL expanding is resumed.
+
+Only define this hook if your accelerator target needs to expand certain
+@code{VAR_DECL} nodes in a way that differs from the default.  You can also adjust
+private variables at OpenACC device-lowering time using the
+@code{TARGET_GOACC_ADJUST_PRIVATE_DECL} target hook.
+@end deftypefn
+
+@deftypefn {Target Hook} tree TARGET_GOACC_ADJUST_PRIVATE_DECL (tree @var{var}, int @var{level})
+This hook, if defined, is used by accelerator target back-ends to adjust
+OpenACC variable declarations that should be made private to the given
+parallelism level (i.e. @code{GOMP_DIM_GANG}, @code{GOMP_DIM_WORKER} or
+@code{GOMP_DIM_VECTOR}).  A typical use for this hook is to force variable
+declarations at the @code{gang} level to reside in GPU shared memory, by
+setting the address space of the decl and making it static.
+
+You may also use the @code{TARGET_GOACC_EXPAND_VAR_DECL} hook if the
+adjusted variable declaration needs to be expanded to RTL in a non-standard
+way.
+@end deftypefn
+
 @node Anchored Addresses
 @section Anchored Addresses
 @cindex anchored addresses
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 3b19e6f4281..b8c23cf6db5 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4219,6 +4219,10 @@ address;  but often a machine-dependent strategy can generate better code.
 
 @hook TARGET_PREFERRED_ELSE_VALUE
 
+@hook TARGET_GOACC_EXPAND_VAR_DECL
+
+@hook TARGET_GOACC_ADJUST_PRIVATE_DECL
+
 @node Anchored Addresses
 @section Anchored Addresses
 @cindex anchored addresses
diff --git a/gcc/expr.c b/gcc/expr.c
index 86dc1b6c973..349825cf286 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -10224,8 +10224,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       exp = SSA_NAME_VAR (ssa_name);
       goto expand_decl_rtl;
 
-    case PARM_DECL:
     case VAR_DECL:
+      /* Allow accel compiler to handle variables that require special
+	 treatment, e.g. if they have been modified in some way earlier in
+	 compilation by the adjust_private_decl OpenACC hook.  */
+      if (flag_openacc && targetm.goacc.expand_var_decl)
+	{
+	  temp = targetm.goacc.expand_var_decl (exp);
+	  if (temp)
+	    return temp;
+	}
+      /* ... fall through ...  */
+
+    case PARM_DECL:
       /* If a static var's type was incomplete when the decl was written,
 	 but the type is complete now, lay out the decl now.  */
       if (DECL_SIZE (exp) == 0
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index dd7173126fb..e6611e8572f 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2957,6 +2957,8 @@ expand_UNIQUE (internal_fn, gcall *stmt)
       else
 	gcc_unreachable ();
       break;
+    case IFN_UNIQUE_OACC_PRIVATE:
+      break;
     }
 
   if (pattern)
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index c6599ce4894..9004840e0f5 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -36,7 +36,8 @@ along with GCC; see the file COPYING3.  If not see
 #define IFN_UNIQUE_CODES				  \
   DEF(UNSPEC),	\
     DEF(OACC_FORK), DEF(OACC_JOIN),		\
-    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK)
+    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK),	\
+    DEF(OACC_PRIVATE)
 
 enum ifn_unique_kind {
 #define DEF(X) IFN_UNIQUE_##X
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index df5b6cec586..fd8025e0e3f 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -171,6 +171,9 @@ struct omp_context
 
   /* True if there is bind clause on the construct (i.e. a loop construct).  */
   bool loop_p;
+
+  /* Addressable variable decls in this context.  */
+  vec<tree> oacc_addressable_var_decls;
 };
 
 static splay_tree all_contexts;
@@ -7048,8 +7051,9 @@ lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *body_p,
 
 static void
 lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
-		       gcall *fork, gcall *join, gimple_seq *fork_seq,
-		       gimple_seq *join_seq, omp_context *ctx)
+		       gcall *fork, gcall *private_marker, gcall *join,
+		       gimple_seq *fork_seq, gimple_seq *join_seq,
+		       omp_context *ctx)
 {
   gimple_seq before_fork = NULL;
   gimple_seq after_fork = NULL;
@@ -7253,6 +7257,8 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
 
   /* Now stitch things together.  */
   gimple_seq_add_seq (fork_seq, before_fork);
+  if (private_marker)
+    gimple_seq_add_stmt (fork_seq, private_marker);
   if (fork)
     gimple_seq_add_stmt (fork_seq, fork);
   gimple_seq_add_seq (fork_seq, after_fork);
@@ -7989,7 +7995,7 @@ lower_oacc_loop_marker (location_t loc, tree ddvar, bool head,
    HEAD and TAIL.  */
 
 static void
-lower_oacc_head_tail (location_t loc, tree clauses,
+lower_oacc_head_tail (location_t loc, tree clauses, gcall *private_marker,
 		      gimple_seq *head, gimple_seq *tail, omp_context *ctx)
 {
   bool inner = false;
@@ -7997,6 +8003,14 @@ lower_oacc_head_tail (location_t loc, tree clauses,
   gimple_seq_add_stmt (head, gimple_build_assign (ddvar, integer_zero_node));
 
   unsigned count = lower_oacc_head_mark (loc, ddvar, clauses, head, ctx);
+
+  if (private_marker)
+    {
+      gimple_set_location (private_marker, loc);
+      gimple_call_set_lhs (private_marker, ddvar);
+      gimple_call_set_arg (private_marker, 1, ddvar);
+    }
+
   tree fork_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_FORK);
   tree join_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_JOIN);
 
@@ -8027,7 +8041,8 @@ lower_oacc_head_tail (location_t loc, tree clauses,
 			      &join_seq);
 
       lower_oacc_reductions (loc, clauses, place, inner,
-			     fork, join, &fork_seq, &join_seq,  ctx);
+			     fork, (count == 1) ? private_marker : NULL,
+			     join, &fork_seq, &join_seq,  ctx);
 
       /* Append this level to head. */
       gimple_seq_add_seq (head, fork_seq);
@@ -9992,6 +10007,32 @@ lower_omp_for_lastprivate (struct omp_for_data *fd, gimple_seq *body_p,
     }
 }
 
+/* Record vars listed in private clauses in CLAUSES in CTX.  This information
+   is used to mark up variables that should be made private per-gang.  */
+
+static void
+oacc_record_private_var_clauses (omp_context *ctx, tree clauses)
+{
+  for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
+    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_PRIVATE)
+      {
+	tree decl = OMP_CLAUSE_DECL (c);
+	if (VAR_P (decl) && TREE_ADDRESSABLE (decl))
+	  ctx->oacc_addressable_var_decls.safe_push (decl);
+      }
+}
+
+/* Record addressable vars declared in BINDVARS in CTX.  This information is
+   used to mark up variables that should be made private per-gang.  */
+
+static void
+oacc_record_vars_in_bind (omp_context *ctx, tree bindvars)
+{
+  for (tree v = bindvars; v; v = DECL_CHAIN (v))
+    if (VAR_P (v) && TREE_ADDRESSABLE (v))
+      ctx->oacc_addressable_var_decls.safe_push (v);
+}
+
 /* Callback for walk_gimple_seq.  Find #pragma omp scan statement.  */
 
 static tree
@@ -10821,6 +10862,57 @@ lower_omp_for_scan (gimple_seq *body_p, gimple_seq *dlist, gomp_for *stmt,
   *dlist = new_dlist;
 }
 
+/* Build an internal UNIQUE function with type IFN_UNIQUE_OACC_PRIVATE listing
+   the addresses of variables that should be made private at the surrounding
+   parallelism level.  Such functions appear in the gimple code stream in two
+   forms, e.g. for a partitioned loop:
+
+      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6, 1, 68);
+      .data_dep.6 = .UNIQUE (OACC_PRIVATE, .data_dep.6, -1, &w);
+      .data_dep.6 = .UNIQUE (OACC_FORK, .data_dep.6, -1);
+      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6);
+
+   or alternatively, OACC_PRIVATE can appear at the top level of a parallel,
+   not as part of a HEAD_MARK sequence:
+
+      .UNIQUE (OACC_PRIVATE, 0, 0, &w);
+
+   For such stand-alone appearances, the 3rd argument is always 0, denoting
+   gang partitioning.  */
+
+static gcall *
+make_oacc_private_marker (omp_context *ctx)
+{
+  int i;
+  tree decl;
+
+  if (ctx->oacc_addressable_var_decls.length () == 0)
+    return NULL;
+
+  auto_vec<tree, 5> args;
+
+  args.quick_push (build_int_cst (integer_type_node, IFN_UNIQUE_OACC_PRIVATE));
+  args.quick_push (integer_zero_node);
+  args.quick_push (integer_minus_one_node);
+
+  FOR_EACH_VEC_ELT (ctx->oacc_addressable_var_decls, i, decl)
+    {
+      for (omp_context *thisctx = ctx; thisctx; thisctx = thisctx->outer)
+	{
+	  tree inner_decl = maybe_lookup_decl (decl, thisctx);
+	  if (inner_decl)
+	    {
+	      decl = inner_decl;
+	      break;
+	    }
+	}
+      tree addr = build_fold_addr_expr (decl);
+      args.safe_push (addr);
+    }
+
+  return gimple_build_call_internal_vec (IFN_UNIQUE, args);
+}
+
 /* Lower code for an OMP loop directive.  */
 
 static void
@@ -10837,6 +10929,8 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
   push_gimplify_context ();
 
+  oacc_record_private_var_clauses (ctx, gimple_omp_for_clauses (stmt));
+
   lower_omp (gimple_omp_for_pre_body_ptr (stmt), ctx);
 
   block = make_node (BLOCK);
@@ -10855,6 +10949,8 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
       gbind *inner_bind
 	= as_a <gbind *> (gimple_seq_first_stmt (omp_for_body));
       tree vars = gimple_bind_vars (inner_bind);
+      if (is_gimple_omp_oacc (ctx->stmt))
+	oacc_record_vars_in_bind (ctx, vars);
       gimple_bind_append_vars (new_stmt, vars);
       /* bind_vars/BLOCK_VARS are being moved to new_stmt/block, don't
 	 keep them on the inner_bind and it's block.  */
@@ -10968,6 +11064,11 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
   lower_omp (gimple_omp_body_ptr (stmt), ctx);
 
+  gcall *private_marker = NULL;
+  if (is_gimple_omp_oacc (ctx->stmt)
+      && !gimple_seq_empty_p (omp_for_body))
+    private_marker = make_oacc_private_marker (ctx);
+
   /* Lower the header expressions.  At this point, we can assume that
      the header is of the form:
 
@@ -11022,7 +11123,7 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   if (is_gimple_omp_oacc (ctx->stmt)
       && !ctx_in_oacc_kernels_region (ctx))
     lower_oacc_head_tail (gimple_location (stmt),
-			  gimple_omp_for_clauses (stmt),
+			  gimple_omp_for_clauses (stmt), private_marker,
 			  &oacc_head, &oacc_tail, ctx);
 
   /* Add OpenACC partitioning and reduction markers just before the loop.  */
@@ -13019,8 +13120,14 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	     them as a dummy GANG loop.  */
 	  tree level = build_int_cst (integer_type_node, GOMP_DIM_GANG);
 
+	  gcall *private_marker = make_oacc_private_marker (ctx);
+
+	  if (private_marker)
+	    gimple_call_set_arg (private_marker, 2, level);
+
 	  lower_oacc_reductions (gimple_location (ctx->stmt), clauses, level,
-				 false, NULL, NULL, &fork_seq, &join_seq, ctx);
+				 false, NULL, private_marker, NULL, &fork_seq,
+				 &join_seq, ctx);
 	}
 
       gimple_seq_add_seq (&new_body, fork_seq);
@@ -13262,6 +13369,9 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 		 ctx);
       break;
     case GIMPLE_BIND:
+      if (ctx && is_gimple_omp_oacc (ctx->stmt))
+	oacc_record_vars_in_bind (ctx,
+				  gimple_bind_vars (as_a <gbind *> (stmt)));
       lower_omp (gimple_bind_body_ptr (as_a <gbind *> (stmt)), ctx);
       maybe_remove_omp_member_access_dummy_vars (as_a <gbind *> (stmt));
       break;
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 57be342da97..b3f543b597a 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "cfgloop.h"
 #include "context.h"
+#include "convert.h"
 
 /* Describe the OpenACC looping structure of a function.  The entire
    function is held in a 'NULL' loop.  */
@@ -1356,7 +1357,9 @@ oacc_loop_xform_head_tail (gcall *from, int level)
 	    = ((enum ifn_unique_kind)
 	       TREE_INT_CST_LOW (gimple_call_arg (stmt, 0)));
 
-	  if (k == IFN_UNIQUE_OACC_FORK || k == IFN_UNIQUE_OACC_JOIN)
+	  if (k == IFN_UNIQUE_OACC_FORK
+	      || k == IFN_UNIQUE_OACC_JOIN
+	      || k == IFN_UNIQUE_OACC_PRIVATE)
 	    *gimple_call_arg_ptr (stmt, 2) = replacement;
 	  else if (k == kind && stmt != from)
 	    break;
@@ -1773,6 +1776,136 @@ default_goacc_reduction (gcall *call)
   gsi_replace_with_seq (&gsi, seq, true);
 }
 
+struct var_decl_rewrite_info
+{
+  gimple *stmt;
+  hash_map<tree, tree> *adjusted_vars;
+  bool avoid_pointer_conversion;
+  bool modified;
+};
+
+/* Helper function for execute_oacc_device_lower.  Rewrite VAR_DECLs (by
+   themselves or wrapped in various other nodes) according to ADJUSTED_VARS in
+   the var_decl_rewrite_info pointed to via DATA.  Used as part of coercing
+   gang-private variables in OpenACC offload regions to reside in GPU shared
+   memory.  */
+
+static tree
+oacc_rewrite_var_decl (tree *tp, int *walk_subtrees, void *data)
+{
+  walk_stmt_info *wi = (walk_stmt_info *) data;
+  var_decl_rewrite_info *info = (var_decl_rewrite_info *) wi->info;
+
+  if (TREE_CODE (*tp) == ADDR_EXPR)
+    {
+      tree arg = TREE_OPERAND (*tp, 0);
+      tree *new_arg = info->adjusted_vars->get (arg);
+
+      if (new_arg)
+	{
+	  if (info->avoid_pointer_conversion)
+	    {
+	      *tp = build_fold_addr_expr (*new_arg);
+	      info->modified = true;
+	      *walk_subtrees = 0;
+	    }
+	  else
+	    {
+	      gimple_stmt_iterator gsi = gsi_for_stmt (info->stmt);
+	      tree repl = build_fold_addr_expr (*new_arg);
+	      gimple *stmt1
+		= gimple_build_assign (make_ssa_name (TREE_TYPE (repl)), repl);
+	      tree conv = convert_to_pointer (TREE_TYPE (*tp),
+					      gimple_assign_lhs (stmt1));
+	      gimple *stmt2
+		= gimple_build_assign (make_ssa_name (TREE_TYPE (*tp)), conv);
+	      gsi_insert_before (&gsi, stmt1, GSI_SAME_STMT);
+	      gsi_insert_before (&gsi, stmt2, GSI_SAME_STMT);
+	      *tp = gimple_assign_lhs (stmt2);
+	      info->modified = true;
+	      *walk_subtrees = 0;
+	    }
+	}
+    }
+  else if (TREE_CODE (*tp) == COMPONENT_REF || TREE_CODE (*tp) == ARRAY_REF)
+    {
+      tree *base = &TREE_OPERAND (*tp, 0);
+
+      while (TREE_CODE (*base) == COMPONENT_REF
+	     || TREE_CODE (*base) == ARRAY_REF)
+	base = &TREE_OPERAND (*base, 0);
+
+      if (TREE_CODE (*base) != VAR_DECL)
+	return NULL;
+
+      tree *new_decl = info->adjusted_vars->get (*base);
+      if (!new_decl)
+	return NULL;
+
+      int base_quals = TYPE_QUALS (TREE_TYPE (*new_decl));
+      tree field = TREE_OPERAND (*tp, 1);
+
+      /* Adjust the type of the field.  */
+      int field_quals = TYPE_QUALS (TREE_TYPE (field));
+      if (TREE_CODE (field) == FIELD_DECL && field_quals != base_quals)
+	{
+	  tree *field_type = &TREE_TYPE (field);
+	  while (TREE_CODE (*field_type) == ARRAY_TYPE)
+	    field_type = &TREE_TYPE (*field_type);
+	  field_quals |= base_quals;
+	  *field_type = build_qualified_type (*field_type, field_quals);
+	}
+
+      /* Adjust the type of the component ref itself.  */
+      tree comp_type = TREE_TYPE (*tp);
+      int comp_quals = TYPE_QUALS (comp_type);
+      if (TREE_CODE (*tp) == COMPONENT_REF && comp_quals != base_quals)
+	{
+	  comp_quals |= base_quals;
+	  TREE_TYPE (*tp)
+	    = build_qualified_type (comp_type, comp_quals);
+	}
+
+      *base = *new_decl;
+      info->modified = true;
+    }
+  else if (TREE_CODE (*tp) == VAR_DECL)
+    {
+      tree *new_decl = info->adjusted_vars->get (*tp);
+      if (new_decl)
+	{
+	  *tp = *new_decl;
+	  info->modified = true;
+	}
+    }
+
+  return NULL_TREE;
+}
+
+/* Return TRUE if CALL is a call to a builtin atomic/sync operation.  */
+
+static bool
+is_sync_builtin_call (gcall *call)
+{
+  tree callee = gimple_call_fndecl (call);
+
+  if (callee != NULL_TREE
+      && gimple_call_builtin_p (call, BUILT_IN_NORMAL))
+    switch (DECL_FUNCTION_CODE (callee))
+      {
+#undef DEF_SYNC_BUILTIN
+#define DEF_SYNC_BUILTIN(ENUM, NAME, TYPE, ATTRS) case ENUM:
+#include "sync-builtins.def"
+#undef DEF_SYNC_BUILTIN
+	return true;
+
+      default:
+	;
+      }
+
+  return false;
+}
+
 /* Main entry point for oacc transformations which run on the device
    compiler after LTO, so we know what the target device is at this
    point (including the host fallback).  */
@@ -1922,6 +2055,8 @@ execute_oacc_device_lower ()
      dominance information to update SSA.  */
   calculate_dominance_info (CDI_DOMINATORS);
 
+  hash_map<tree, tree> adjusted_vars;
+
   /* Now lower internal loop functions to target-specific code
      sequences.  */
   basic_block bb;
@@ -1998,6 +2133,45 @@ execute_oacc_device_lower ()
 		case IFN_UNIQUE_OACC_TAIL_MARK:
 		  remove = true;
 		  break;
+
+		case IFN_UNIQUE_OACC_PRIVATE:
+		  {
+		    HOST_WIDE_INT level
+		      = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
+		    if (level == -1)
+		      break;
+		    for (unsigned i = 3;
+			 i < gimple_call_num_args (call);
+			 i++)
+		      {
+			tree arg = gimple_call_arg (call, i);
+			gcc_assert (TREE_CODE (arg) == ADDR_EXPR);
+			tree decl = TREE_OPERAND (arg, 0);
+			if (dump_file && (dump_flags & TDF_DETAILS))
+			  {
+			    static char const *const axes[] =
+			      /* Must be kept in sync with GOMP_DIM
+				 enumeration.  */
+			      { "gang", "worker", "vector" };
+			    fprintf (dump_file, "Decl UID %u has %s "
+				     "partitioning:", DECL_UID (decl),
+				     axes[level]);
+			    print_generic_decl (dump_file, decl, TDF_SLIM);
+			    fputc ('\n', dump_file);
+			  }
+			if (targetm.goacc.adjust_private_decl)
+			  {
+			    tree oldtype = TREE_TYPE (decl);
+			    tree newdecl
+			      = targetm.goacc.adjust_private_decl (decl, level);
+			    if (TREE_TYPE (newdecl) != oldtype
+				|| newdecl != decl)
+			      adjusted_vars.put (decl, newdecl);
+			  }
+		      }
+		    remove = true;
+		  }
+		  break;
 		}
 	      break;
 	    }
@@ -2029,6 +2203,55 @@ execute_oacc_device_lower ()
 	  gsi_next (&gsi);
       }
 
+  /* Make adjustments to gang-private local variables if required by the
+     target, e.g. forcing them into a particular address space.  Afterwards,
+     ADDR_EXPR nodes which have adjusted variables as their argument need to
+     be modified in one of two ways:
+
+       1. They can be recreated, making a pointer to the variable in the new
+	  address space, or
+
+       2. The address of the variable in the new address space can be taken,
+	  converted to the default (original) address space, and the result of
+	  that conversion substituted in place of the original ADDR_EXPR node.
+
+     Which of these is done depends on the gimple statement being processed.
+     At present atomic operations and inline asms use (1), and everything else
+     uses (2).  At least on AMD GCN, there are atomic operations that work
+     directly in the LDS address space.
+
+     COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
+     the new decl, adjusting types of appropriate tree nodes as necessary.  */
+
+  if (targetm.goacc.adjust_private_decl)
+    {
+      FOR_ALL_BB_FN (bb, cfun)
+	for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	     !gsi_end_p (gsi);
+	     gsi_next (&gsi))
+	  {
+	    gimple *stmt = gsi_stmt (gsi);
+	    walk_stmt_info wi;
+	    var_decl_rewrite_info info;
+
+	    info.avoid_pointer_conversion
+	      = (is_gimple_call (stmt)
+		 && is_sync_builtin_call (as_a <gcall *> (stmt)))
+		|| gimple_code (stmt) == GIMPLE_ASM;
+	    info.stmt = stmt;
+	    info.modified = false;
+	    info.adjusted_vars = &adjusted_vars;
+
+	    memset (&wi, 0, sizeof (wi));
+	    wi.info = &info;
+
+	    walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
+
+	    if (info.modified)
+	      update_stmt (stmt);
+	  }
+    }
+
   free_oacc_loop (loops);
 
   return 0;
diff --git a/gcc/target.def b/gcc/target.def
index be7fcde961a..00b6f8f1bc9 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1712,6 +1712,36 @@ for allocating any storage for reductions when necessary.",
 void, (gcall *call),
 default_goacc_reduction)
 
+DEFHOOK
+(expand_var_decl,
+"This hook, if defined, is used by accelerator target back-ends to expand\n\
+specially handled kinds of @code{VAR_DECL} expressions.  A particular use is\n\
+to place variables with specific attributes inside special accelerator\n\
+memories.  A return value of @code{NULL} indicates that the target does not\n\
+handle this @code{VAR_DECL}, and normal RTL expanding is resumed.\n\
+\n\
+Only define this hook if your accelerator target needs to expand certain\n\
+@code{VAR_DECL} nodes in a way that differs from the default.  You can also adjust\n\
+private variables at OpenACC device-lowering time using the\n\
+@code{TARGET_GOACC_ADJUST_PRIVATE_DECL} target hook.",
+rtx, (tree var),
+NULL)
+
+DEFHOOK
+(adjust_private_decl,
+"This hook, if defined, is used by accelerator target back-ends to adjust\n\
+OpenACC variable declarations that should be made private to the given\n\
+parallelism level (i.e. @code{GOMP_DIM_GANG}, @code{GOMP_DIM_WORKER} or\n\
+@code{GOMP_DIM_VECTOR}).  A typical use for this hook is to force variable\n\
+declarations at the @code{gang} level to reside in GPU shared memory, by\n\
+setting the address space of the decl and making it static.\n\
+\n\
+You may also use the @code{TARGET_GOACC_EXPAND_VAR_DECL} hook if the\n\
+adjusted variable declaration needs to be expanded to RTL in a non-standard\n\
+way.",
+tree, (tree var, int level),
+NULL)
+
 HOOK_VECTOR_END (goacc)
 
 /* Functions relating to vectorization.  */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c
new file mode 100644
index 00000000000..28222c25da3
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c
@@ -0,0 +1,38 @@
+#include <assert.h>
+
+int main (void)
+{
+  int ret;
+
+  #pragma acc parallel num_gangs(1) num_workers(32) copyout(ret)
+  {
+    int w = 0;
+
+    #pragma acc loop worker
+    for (int i = 0; i < 32; i++)
+      {
+	#pragma acc atomic update
+	w++;
+      }
+
+    ret = (w == 32);
+  }
+  assert (ret);
+
+  #pragma acc parallel num_gangs(1) vector_length(32) copyout(ret)
+  {
+    int v = 0;
+
+    #pragma acc loop vector
+    for (int i = 0; i < 32; i++)
+      {
+	#pragma acc atomic update
+	v++;
+      }
+
+    ret = (v == 32);
+  }
+  assert (ret);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
new file mode 100644
index 00000000000..a4f81a39e24
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
@@ -0,0 +1,95 @@
+#include <stdio.h>
+#include <openacc.h>
+#include <alloca.h>
+#include <string.h>
+#include <gomp-constants.h>
+#include <stdlib.h>
+
+#if 0
+#define DEBUG(DIM, IDX, VAL) \
+  fprintf (stderr, "%sdist[%d] = %d\n", (DIM), (IDX), (VAL))
+#else
+#define DEBUG(DIM, IDX, VAL)
+#endif
+
+#define N (32*32*32)
+
+int
+check (const char *dim, int *dist, int dimsize)
+{
+  int ix;
+  int exit = 0;
+
+  for (ix = 0; ix < dimsize; ix++)
+    {
+      DEBUG(dim, ix, dist[ix]);
+      if (dist[ix] < (N) / (dimsize + 0.5)
+	  || dist[ix] > (N) / (dimsize - 0.5))
+	{
+	  fprintf (stderr, "did not distribute to %ss (%d not between %d "
+		   "and %d)\n", dim, dist[ix], (int) ((N) / (dimsize + 0.5)),
+		   (int) ((N) / (dimsize - 0.5)));
+	  exit |= 1;
+	}
+    }
+
+  return exit;
+}
+
+int main ()
+{
+  int ary[N];
+  int ix;
+  int exit = 0;
+  int gangsize = 0, workersize = 0, vectorsize = 0;
+  int *gangdist, *workerdist, *vectordist;
+
+  for (ix = 0; ix < N;ix++)
+    ary[ix] = -1;
+
+#pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
+	    copy(ary) copyout(gangsize, workersize, vectorsize)
+  {
+#pragma acc loop gang worker vector
+    for (unsigned ix = 0; ix < N; ix++)
+      {
+	int g, w, v;
+
+	g = __builtin_goacc_parlevel_id (GOMP_DIM_GANG);
+	w = __builtin_goacc_parlevel_id (GOMP_DIM_WORKER);
+	v = __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR);
+
+	ary[ix] = (g << 16) | (w << 8) | v;
+      }
+
+    gangsize = __builtin_goacc_parlevel_size (GOMP_DIM_GANG);
+    workersize = __builtin_goacc_parlevel_size (GOMP_DIM_WORKER);
+    vectorsize = __builtin_goacc_parlevel_size (GOMP_DIM_VECTOR);
+  }
+
+  gangdist = (int *) alloca (gangsize * sizeof (int));
+  workerdist = (int *) alloca (workersize * sizeof (int));
+  vectordist = (int *) alloca (vectorsize * sizeof (int));
+  memset (gangdist, 0, gangsize * sizeof (int));
+  memset (workerdist, 0, workersize * sizeof (int));
+  memset (vectordist, 0, vectorsize * sizeof (int));
+
+  /* Test that work is shared approximately equally amongst each active
+     gang/worker/vector.  */
+  for (ix = 0; ix < N; ix++)
+    {
+      int g = (ary[ix] >> 16) & 255;
+      int w = (ary[ix] >> 8) & 255;
+      int v = ary[ix] & 255;
+
+      gangdist[g]++;
+      workerdist[w]++;
+      vectordist[v]++;
+    }
+
+  exit = check ("gang", gangdist, gangsize);
+  exit |= check ("worker", workerdist, workersize);
+  exit |= check ("vector", vectordist, vectorsize);
+
+  return exit;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
new file mode 100644
index 00000000000..f330f7de1be
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
@@ -0,0 +1,25 @@
+! Test for "oacc gangprivate" attribute on gang-private variables
+
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-oaccdevlow-details -w" }
+
+program main
+  integer :: w, arr(0:31)
+
+  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
+    !$acc loop gang private(w)
+! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has gang partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
+    do j = 0, 31
+      w = 0
+      !$acc loop seq
+      do i = 0, 31
+        !$acc atomic update
+        w = w + 1
+        !$acc end atomic
+      end do
+      arr(j) = w
+    end do
+  !$acc end parallel
+
+  if (any (arr .ne. 32)) stop 1
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
new file mode 100644
index 00000000000..f4e67b0c708
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
@@ -0,0 +1,25 @@
+! Test for worker-private variables
+
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
+
+program main
+  integer :: w, arr(0:31)
+
+  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
+    !$acc loop gang worker private(w)
+! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has worker partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
+    do j = 0, 31
+      w = 0
+      !$acc loop seq
+      do i = 0, 31
+        !$acc atomic update
+        w = w + 1
+        !$acc end atomic
+      end do
+      arr(j) = w
+    end do
+  !$acc end parallel
+
+  if (any (arr .ne. 32)) stop 1
+end program main
-- 
2.29.2



* [PATCH 2/3] amdgcn: AMD GCN parts for OpenACC private variables patch
  2021-02-26 12:34 [PATCH 0/3] openacc: Gang-private variables in shared memory Julian Brown
  2021-02-26 12:34 ` [PATCH 1/3] openacc: Add support for gang local storage allocation " Julian Brown
@ 2021-02-26 12:34 ` Julian Brown
  2021-02-26 12:34 ` [PATCH 3/3] nvptx: NVPTX " Julian Brown
  2 siblings, 0 replies; 24+ messages in thread
From: Julian Brown @ 2021-02-26 12:34 UTC (permalink / raw)
  To: gcc-patches; +Cc: jakub, Thomas Schwinge, Tom de Vries

This patch updates the TARGET_GOACC_ADJUST_PRIVATE_DECL target hook in
the AMD GCN backend to the current name and prototype. (An earlier
version of the hook was already present, but dormant.)

(I can self-approve this. I will commit as/when the previous patch
is approved.)

Thanks,

Julian

gcc/
	* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename
	to...
	(gcn_goacc_adjust_private_decl): ...this.
	* config/gcn/gcn-tree.c (gcn_goacc_adjust_gangprivate_decl): Rename
	to...
	(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
	* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename
	definition using gcn_goacc_adjust_gangprivate_decl...
	(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...to this, using
	gcn_goacc_adjust_private_decl.
---
 gcc/config/gcn/gcn-protos.h | 2 +-
 gcc/config/gcn/gcn-tree.c   | 9 +++++++--
 gcc/config/gcn/gcn.c        | 4 ++--
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/gcc/config/gcn/gcn-protos.h b/gcc/config/gcn/gcn-protos.h
index dc9331c445d..7ef7ae8af46 100644
--- a/gcc/config/gcn/gcn-protos.h
+++ b/gcc/config/gcn/gcn-protos.h
@@ -40,7 +40,7 @@ extern rtx gcn_gen_undef (machine_mode);
 extern bool gcn_global_address_p (rtx);
 extern tree gcn_goacc_adjust_propagation_record (tree record_type, bool sender,
 						 const char *name);
-extern void gcn_goacc_adjust_gangprivate_decl (tree var);
+extern tree gcn_goacc_adjust_private_decl (tree var, int level);
 extern void gcn_goacc_reduction (gcall *call);
 extern bool gcn_hard_regno_rename_ok (unsigned int from_reg,
 				      unsigned int to_reg);
diff --git a/gcc/config/gcn/gcn-tree.c b/gcc/config/gcn/gcn-tree.c
index 8f270991c86..75ea50c59dd 100644
--- a/gcc/config/gcn/gcn-tree.c
+++ b/gcc/config/gcn/gcn-tree.c
@@ -577,9 +577,12 @@ gcn_goacc_adjust_propagation_record (tree record_type, bool sender,
   return decl;
 }
 
-void
-gcn_goacc_adjust_gangprivate_decl (tree var)
+tree
+gcn_goacc_adjust_private_decl (tree var, int level)
 {
+  if (level != GOMP_DIM_GANG)
+    return var;
+
   tree type = TREE_TYPE (var);
   tree lds_type = build_qualified_type (type,
 		    TYPE_QUALS_NO_ADDR_SPACE (type)
@@ -597,6 +600,8 @@ gcn_goacc_adjust_gangprivate_decl (tree var)
 
   if (machfun)
     machfun->use_flat_addressing = true;
+
+  return var;
 }
 
 /* }}}  */
diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index e8bb0b63756..1ea919bf058 100644
--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -6317,8 +6317,8 @@ gcn_dwarf_register_span (rtx rtl)
 #undef  TARGET_GOACC_ADJUST_PROPAGATION_RECORD
 #define TARGET_GOACC_ADJUST_PROPAGATION_RECORD \
   gcn_goacc_adjust_propagation_record
-#undef  TARGET_GOACC_ADJUST_GANGPRIVATE_DECL
-#define TARGET_GOACC_ADJUST_GANGPRIVATE_DECL gcn_goacc_adjust_gangprivate_decl
+#undef  TARGET_GOACC_ADJUST_PRIVATE_DECL
+#define TARGET_GOACC_ADJUST_PRIVATE_DECL gcn_goacc_adjust_private_decl
 #undef  TARGET_GOACC_FORK_JOIN
 #define TARGET_GOACC_FORK_JOIN gcn_fork_join
 #undef  TARGET_GOACC_REDUCTION
-- 
2.29.2



* [PATCH 3/3] nvptx: NVPTX parts for OpenACC private variables patch
  2021-02-26 12:34 [PATCH 0/3] openacc: Gang-private variables in shared memory Julian Brown
  2021-02-26 12:34 ` [PATCH 1/3] openacc: Add support for gang local storage allocation " Julian Brown
  2021-02-26 12:34 ` [PATCH 2/3] amdgcn: AMD GCN parts for OpenACC private variables patch Julian Brown
@ 2021-02-26 12:34 ` Julian Brown
  2021-05-21 18:59   ` Thomas Schwinge
  2 siblings, 1 reply; 24+ messages in thread
From: Julian Brown @ 2021-02-26 12:34 UTC (permalink / raw)
  To: gcc-patches; +Cc: jakub, Thomas Schwinge, Tom de Vries

This patch contains the NVPTX backend support for placing OpenACC
gang-private variables in GPU shared memory.

Tested with offloading to NVPTX.

This is substantially the same as the version previously posted: I will
assume it is already approved (unless I hear objections), and will commit
it at the same time as the rest of the series.

  (https://gcc.gnu.org/pipermail/gcc-patches/2018-October/507909.html)

Thanks,

Julian

2021-02-23  Chung-Lin Tang  <cltang@codesourcery.com>
	    Julian Brown  <julian@codesourcery.com>

gcc/
	* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
	(gangprivate_shared_size): New global variable.
	(gangprivate_shared_align): Likewise.
	(gangprivate_shared_sym): Likewise.
	(gangprivate_shared_hmap): Likewise.
	(nvptx_option_override): Initialize gangprivate_shared_sym,
	gangprivate_shared_align.
	(nvptx_file_end): Output gangprivate_shared_sym.
	(nvptx_goacc_adjust_private_decl, nvptx_goacc_expand_var_decl): New
	functions.
	(nvptx_set_current_function): Clear gangprivate_shared_hmap.
	(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook.
	(TARGET_GOACC_EXPAND_VAR_DECL): Likewise.
---
 gcc/config/nvptx/nvptx.c | 78 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 794c5a69db0..a0474b0077b 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -75,6 +75,7 @@
 #include "fold-const.h"
 #include "intl.h"
 #include "opts.h"
+#include "tree-pretty-print.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -167,6 +168,12 @@ static unsigned vector_red_align;
 static unsigned vector_red_partition;
 static GTY(()) rtx vector_red_sym;
 
+/* Shared memory block for gang-private variables.  */
+static unsigned gangprivate_shared_size;
+static unsigned gangprivate_shared_align;
+static GTY(()) rtx gangprivate_shared_sym;
+static hash_map<tree_decl_hash, unsigned int> gangprivate_shared_hmap;
+
 /* Global lock variable, needed for 128bit worker & gang reductions.  */
 static GTY(()) tree global_lock_var;
 
@@ -251,6 +258,10 @@ nvptx_option_override (void)
   vector_red_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
   vector_red_partition = 0;
 
+  gangprivate_shared_sym = gen_rtx_SYMBOL_REF (Pmode, "__gangprivate_shared");
+  SET_SYMBOL_DATA_AREA (gangprivate_shared_sym, DATA_AREA_SHARED);
+  gangprivate_shared_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
+
   diagnose_openacc_conflict (TARGET_GOMP, "-mgomp");
   diagnose_openacc_conflict (TARGET_SOFT_STACK, "-msoft-stack");
   diagnose_openacc_conflict (TARGET_UNIFORM_SIMT, "-muniform-simt");
@@ -5355,6 +5366,10 @@ nvptx_file_end (void)
     write_shared_buffer (asm_out_file, vector_red_sym,
 			 vector_red_align, vector_red_size);
 
+  if (gangprivate_shared_size)
+    write_shared_buffer (asm_out_file, gangprivate_shared_sym,
+			 gangprivate_shared_align, gangprivate_shared_size);
+
   if (need_softstack_decl)
     {
       write_var_marker (asm_out_file, false, true, "__nvptx_stacks");
@@ -6582,6 +6597,62 @@ nvptx_truly_noop_truncation (poly_uint64, poly_uint64)
   return false;
 }
 
+/* Implement TARGET_GOACC_ADJUST_PRIVATE_DECL.  Set "oacc gangprivate"
+   attribute for gang-private variable declarations.  */
+
+static tree
+nvptx_goacc_adjust_private_decl (tree decl, int level)
+{
+  if (level != GOMP_DIM_GANG)
+    return decl;
+
+  if (!lookup_attribute ("oacc gangprivate", DECL_ATTRIBUTES (decl)))
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	{
+	  fprintf (dump_file, "Setting 'oacc gangprivate' attribute for decl:");
+	  print_generic_decl (dump_file, decl, TDF_SLIM);
+	  fputc ('\n', dump_file);
+	}
+      tree id = get_identifier ("oacc gangprivate");
+      DECL_ATTRIBUTES (decl) = tree_cons (id, NULL, DECL_ATTRIBUTES (decl));
+    }
+
+  return decl;
+}
+
+/* Implement TARGET_GOACC_EXPAND_VAR_DECL.  Place "oacc gangprivate"
+   variables in shared memory.  */
+
+static rtx
+nvptx_goacc_expand_var_decl (tree var)
+{
+  if (VAR_P (var)
+      && lookup_attribute ("oacc gangprivate", DECL_ATTRIBUTES (var)))
+    {
+      unsigned int offset, *poffset;
+      poffset = gangprivate_shared_hmap.get (var);
+      if (poffset)
+	offset = *poffset;
+      else
+	{
+	  unsigned HOST_WIDE_INT align = DECL_ALIGN (var);
+	  gangprivate_shared_size
+	    = (gangprivate_shared_size + align - 1) & ~(align - 1);
+	  if (gangprivate_shared_align < align)
+	    gangprivate_shared_align = align;
+
+	  offset = gangprivate_shared_size;
+	  bool existed = gangprivate_shared_hmap.put (var, offset);
+	  gcc_assert (!existed);
+	  gangprivate_shared_size += tree_to_uhwi (DECL_SIZE_UNIT (var));
+	}
+      rtx addr = plus_constant (Pmode, gangprivate_shared_sym, offset);
+      return gen_rtx_MEM (TYPE_MODE (TREE_TYPE (var)), addr);
+    }
+  return NULL_RTX;
+}
+
 static GTY(()) tree nvptx_previous_fndecl;
 
 static void
@@ -6590,6 +6661,7 @@ nvptx_set_current_function (tree fndecl)
   if (!fndecl || fndecl == nvptx_previous_fndecl)
     return;
 
+  gangprivate_shared_hmap.empty ();
   nvptx_previous_fndecl = fndecl;
   vector_red_partition = 0;
   oacc_bcast_partition = 0;
@@ -6754,6 +6826,12 @@ nvptx_libc_has_function (enum function_class fn_class, tree type)
 #undef TARGET_HAVE_SPECULATION_SAFE_VALUE
 #define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed
 
+#undef TARGET_GOACC_ADJUST_PRIVATE_DECL
+#define TARGET_GOACC_ADJUST_PRIVATE_DECL nvptx_goacc_adjust_private_decl
+
+#undef TARGET_GOACC_EXPAND_VAR_DECL
+#define TARGET_GOACC_EXPAND_VAR_DECL nvptx_goacc_expand_var_decl
+
 #undef TARGET_SET_CURRENT_FUNCTION
 #define TARGET_SET_CURRENT_FUNCTION nvptx_set_current_function
 
-- 
2.29.2



* Re: [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory
  2021-02-26 12:34 ` [PATCH 1/3] openacc: Add support for gang local storage allocation " Julian Brown
@ 2021-04-15 17:26   ` Thomas Schwinge
  2021-04-16 16:05     ` Andrew Stubbs
  2021-04-19 11:23     ` Julian Brown
  2021-05-21 19:12   ` [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory Thomas Schwinge
  1 sibling, 2 replies; 24+ messages in thread
From: Thomas Schwinge @ 2021-04-15 17:26 UTC (permalink / raw)
  To: Julian Brown; +Cc: gcc-patches, Jakub Jelinek, Tom de Vries

Hi!

On 2021-02-26T04:34:50-0800, Julian Brown <julian@codesourcery.com> wrote:
> This patch

Thanks, Julian, for your continued improving of these changes!

This has iterated through several conceptually different designs and
implementations, by several people, over the past several years.

It's now been made my task to finish it up -- but I'll very much
appreciate your input (Julian's, primarily) on the following remarks,
which are basically my open work items.


> implements a method to track the "private-ness" of
> OpenACC variables declared in offload regions in gang-partitioned,
> worker-partitioned or vector-partitioned modes. Variables declared
> implicitly in scoped blocks and those declared "private" on enclosing
> directives (e.g. "acc parallel") are both handled. Variables that are
> e.g. gang-private can then be adjusted so they reside in GPU shared
> memory.
>
> The reason for doing this is twofold: correct implementation of OpenACC
> semantics

ACK, and as mentioned before, this very much relates to
<https://gcc.gnu.org/PR90115> "OpenACC: predetermined private levels for
variables declared in blocks" (plus the corresponding use of 'private'
clauses, implicit/explicit, including 'firstprivate') and
<https://gcc.gnu.org/PR90114> "Predetermined private levels for variables
declared in OpenACC accelerator routines", which we thus should refer to in
testcases/ChangeLog/commit log, as appropriate.  I do understand we're
not yet addressing all of that (and that's fine!), but we should capture
remaining work items of the PRs and Cesar's list (in
<http://mid.mail-archive.com/70d27ebd-762e-59a3-082f-48fa0c687212@codesourcery.com>),
as appropriate.


I was surprised that we didn't really have to fix up any existing libgomp
testcases, because there seem to be quite some that contain a pattern
(exemplified by the 'tmp' variable) as follows:

    int main()
    {
    #define N 123
      int data[N];
      int tmp;

    #pragma acc parallel // implicit 'firstprivate(tmp)'
      {
        // 'tmp' now conceptually made gang-private here.
    #pragma acc loop gang
        for (int i = 0; i < 123; ++i)
          {
            tmp = i + 234;
            data[i] = tmp;
          }
      }

      for (int i = 0; i < 123; ++i)
        if (data[i] != i + 234)
          __builtin_abort ();

      return 0;
    }

With the code changes as posted, this actually now does *not* use
gang-private memory for 'tmp', but instead continues to use
"thread-private registers", as before.

Same for:

    --- s3.c    2021-04-13 17:26:49.628739379 +0200
    +++ s3_2.c  2021-04-13 17:29:43.484579664 +0200
    @@ -4,6 +4,6 @@
       int data[N];
    -  int tmp;

    -#pragma acc parallel // implicit 'firstprivate(tmp)'
    +#pragma acc parallel
       {
    +    int tmp;
         // 'tmp' now conceptually made gang-private here.
     #pragma acc loop gang

I suppose that's due to conditionalizing this transformation on
'TREE_ADDRESSABLE' (as you're doing), so we should be mostly "safe"
regarding such existing testcases (but I haven't verified that yet in
detail).

That needs to be documented in testcases, with some kind of dump scanning
(host compilation-side even; see below).

A note for later: if this weren't just a 'gang' loop, but 'gang' plus
'worker' and/or 'vector', we'd actually be fixing up user code with
undefined behavior into "correct" code (by *not* making 'tmp'
gang-private, but thread-private), right?

As that may not be obvious to the reader, I'd like to have the
'TREE_ADDRESSABLE' conditionalization be documented in the code.  You had
explained that in
<http://mid.mail-archive.com/20190612204216.0ec83e4e@squid.athome>: "a
non-addressable variable [...]".


> and optimisation, since shared memory might be faster than
> the main memory on a GPU.

Do we potentially have a problem that making more use of (scarce)
gang-private memory may negatively affect performance, because potentially
fewer OpenACC gangs may then be launched to the GPU hardware in parallel?
(Of course, OpenACC semantics conformance firstly is more important than
performance, but there may be ways to be conformant and performant;
"quality of implementation".)  Have you run any such performance testing
with the benchmarking codes that we've got set up?

(As I'm more familiar with that, I'm using nvptx offloading examples in
the following, whilst assuming that similar discussion may apply for GCN
offloading, which uses similar hardware concepts, as far as I remember.)

Looking at the existing 'libgomp.oacc-c-c++-common/private-variables.c'
(random example), for nvptx offloading, '-O0', we see the following PTX
JIT compilation changes (word-'diff' of 'GOMP_DEBUG=1' at run-time):

    info    : Function properties for 'local_g_1$_omp_fn$0':
    info    : used 27 registers, 32 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
    info    : Function properties for 'local_w_1$_omp_fn$0':
    info    : used 40 registers, 48 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
    info    : Function properties for 'local_w_2$_omp_fn$0':
    [...]
    info    : Function properties for 'parallel_g_1$_omp_fn$0':
    info    : used 27 registers, 32 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
    info    : Function properties for 'parallel_g_2$_omp_fn$0':
    info    : used 32 registers, 160 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem

... that is, PTX '.shared' usage increases from 176 to 256 bytes for
*all* functions, even though only 'loop_g_4$_omp_fn$0' and
'loop_g_5$_omp_fn$0' are actually using gang-private memory.

Execution testing works before (original code, not using gang-private
memory) as well as after (code changes as posted, using gang-private
memory), so use of gang-private memory doesn't seem necessary here for
"correct execution" -- or at least: "expected execution result".  ;-)
I haven't looked yet whether there's a potential issue in the testcases
here.

The additional '256 - 176 = 80' bytes of PTX '.shared' memory requested
are due to GCC nvptx back end implementation's use of a global "Shared
memory block for gang-private variables":

     // BEGIN VAR DEF: __oacc_bcast
     .shared .align 8 .u8 __oacc_bcast[176];
    +// BEGIN VAR DEF: __gangprivate_shared
    +.shared .align 32 .u8 __gangprivate_shared[64];

..., plus (I suppose) an additional '80 - 64 = 16' padding/unused bytes
to establish '.align 32' after '.align 8' for '__oacc_bcast'.

Per
<https://docs.nvidia.com/cuda/cuda-c-programming-guide/#compute-capabilities>,
"Table 15. Technical Specifications per Compute Capability", "Compute
Capability": "3.5", we have a "Maximum amount of shared memory per SM":
"48 KB", so with '176 bytes smem', that permits '48 * 1024 / 176 = 279'
thread blocks ('num_gangs') resident at one point in time, whereas with
'256 bytes smem', it's just '48 * 1024 / 256 = 192' thread blocks
resident at one point in time.  (Not sure that I got all the details
right, but you get the idea/concern?)
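
The back-of-the-envelope bound above can be reproduced with a small C
helper.  This is only a sketch: 'max_resident_blocks' is a hypothetical
name, and it considers static shared memory alone, ignoring the other
per-SM limits (registers, resident threads, resident blocks) that would
also cap the real figure.

```c
#include <assert.h>

/* Upper bound on thread blocks resident on one SM at a time, derived
   from static shared memory usage only.  Hypothetical helper for
   illustration; other hardware limits are deliberately ignored.  */
static unsigned
max_resident_blocks (unsigned smem_per_sm_bytes, unsigned smem_per_block_bytes)
{
  assert (smem_per_block_bytes > 0);
  /* Integer division rounds down: a partially fitting block does not count.  */
  return smem_per_sm_bytes / smem_per_block_bytes;
}
```

Called with '48 * 1024' and the two 'smem' figures from the JIT log (176
and 256), it yields the 279 and 192 quoted above.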

Anyway, that shall be OK for now, but we shall later look into optimizing
that; can't we have '.shared' local to the relevant PTX functions instead
of global?

Interestingly, compiling with '-O2', we see:

    // BEGIN VAR DEF: __oacc_bcast
    .shared .align 8 .u8 __oacc_bcast[144];
    {+// BEGIN VAR DEF: __gangprivate_shared+}
    {+.shared .align 128 .u8 __gangprivate_shared[32];+}

With '-O2', only 'loop_g_5$_omp_fn$0' is using gang-private memory, and
apparently the PTX JIT is able to figure that out from the PTX code that
GCC generates, and is then able to localize '.shared' memory usage to
just 'loop_g_5$_omp_fn$0':

    [...]
    info    : Function properties for 'loop_g_4$_omp_fn$0':
    info    : used 12 registers, 0 stack, 144 bytes smem, 328 bytes cmem[0], 0 bytes lmem
    info    : Function properties for 'loop_g_5$_omp_fn$0':
    info    : used [-30-]{+32+} registers, 32 stack, [-144-]{+288+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
    info    : Function properties for 'loop_g_6$_omp_fn$0':
    info    : used 13 registers, 0 stack, 144 bytes smem, 328 bytes cmem[0], 0 bytes lmem
    [...]

This strongly suggests to me that indeed there must exist a programmatic
way to get rid of the global "Shared memory block for gang-private
variables".

The additional '288 - 144 = 144' bytes of PTX '.shared' memory requested
are 32 bytes for 'int x[8]' ('#pragma acc loop gang private(x)') plus
'288 - 32 - 144 = 112' padding/unused bytes to establish '.align 128' (!)
after '.align 8' for '__oacc_bcast'.  That's clearly not ideal: 112 bytes
wasted in contrast to just '144 + 32 = 176' bytes actually used.  (I have
not yet looked why/whether this really needs '.align 128'.)
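
The padding figures for both the '-O0' and the '-O2' case are consistent
with a simple layout model, sketched below in C.  The model is an
assumption, not something verified against the PTX JIT: it places
'__oacc_bcast' at offset 0 and '__gangprivate_shared' at the next offset
satisfying the latter's '.align'.  The helper names are hypothetical; the
round-up expression is the same one used in 'nvptx_goacc_expand_var_decl'.

```c
#include <assert.h>

/* Round OFFSET up to the next multiple of ALIGN (a power of two).  */
static unsigned
align_up (unsigned offset, unsigned align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (offset + align - 1) & ~(align - 1);
}

/* Bytes wasted between a first buffer of FIRST_SIZE bytes at offset 0
   and a second buffer requiring SECOND_ALIGN.  */
static unsigned
padding_bytes (unsigned first_size, unsigned second_align)
{
  return align_up (first_size, second_align) - first_size;
}

/* Total shared memory under the assumed two-buffer layout.  */
static unsigned
total_smem (unsigned first_size, unsigned second_align, unsigned second_size)
{
  return align_up (first_size, second_align) + second_size;
}
```

With the '-O0' values (176, '.align 32', 64) this gives 16 bytes of
padding and 256 bytes in total; with the '-O2' values (144, '.align 128',
32) it gives 112 bytes of padding and 288 bytes in total, matching the
'smem' numbers in the logs.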

I have not yet looked whether similar concerns exist for the GCC GCN back
end implementation.  (That one also does set 'TREE_STATIC' for
gang-private memory, so it's a global allocation?)


> Handling of private variables is intimately
> tied to the execution model for gangs/workers/vectors implemented by
> a particular target: for current targets, we use (or on mainline, will
> soon use) a broadcasting/neutering scheme.
>
> That is sufficient for code that e.g. sets a variable in worker-single
> mode and expects to use the value in worker-partitioned mode. The
> difficulty (semantics-wise) comes when the user wants to do something like
> an atomic operation in worker-partitioned mode and expects a worker-single
> (gang private) variable to be shared across each partitioned worker.
> Forcing use of shared memory for such variables makes that work properly.

Are we reliably making sure that gang-private variables (and other
levels, in general) are not subject to the usual broadcasting scheme
(nvptx, at least), or does that currently work "by accident"?  (I haven't
looked into that, yet.)


> In terms of implementation, the parallelism level of a given loop is
> not fixed until the oaccdevlow pass in the offload compiler, so the
> patch delays fixing the parallelism level of variables declared on or
> within such loops until the same point. This is done by adding a new
> internal UNIQUE function (OACC_PRIVATE) that lists (the address of) each
> private variable as an argument, and other arguments set so as to be able
> to determine the correct parallelism level to use for the listed
> variables. This new internal function fits into the existing scheme for
> demarcating OpenACC loops, as described in comments in the patch.

Yes, thanks, that's conceptually now much better than the earlier
variants that we had.  :-) (Hooray, again, for Nathan's OpenACC execution
model design!)

What we should add, though, is a bunch of testcases to verify that the
expected processing does/doesn't happen for relevant source code
constructs.  I'm thinking that when the transformation is/isn't done,
that gets logged, and we can then scan the dumps accordingly.  Some of
that is implemented already; we should be able to do such scanning
generally for host compilation, too, not just offloading compilation.


Generally, we also have to make sure that the expected privatizations
(plural) happen if there are multiple levels of parallelism involved:
(deep) loop nests with 'gang', 'worker', 'vector', 'seq' as well as
combinations of 'gang', 'worker', 'vector' on one level.

    #pragma acc parallel
    {
      int x;
      // What's 'x' at this level?
      #pragma acc loop seq private(x)
      [for]
        {
          // What's 'x' at this level?
          #pragma acc loop private(x)
          [for]
            {
              // What's 'x' at this level?
              #pragma acc loop worker vector private(x)
              [for...]
                {
                  // What's 'x' at this level?

Etc.


> Two new target hooks are introduced: TARGET_GOACC_ADJUST_PRIVATE_DECL and
> TARGET_GOACC_EXPAND_VAR_DECL.  The first can tweak a variable declaration
> at oaccdevlow time, and the second at expand time.  The first or both
> of these target hooks can be used by a given offload target, depending
> on its strategy for implementing private variables.

ACK.

So, currently we're only looking at making the gang-private level work.
Regarding that, we have two configurations: (1) for GCN offloading,
'targetm.goacc.adjust_private_decl' does the work (in particular, change
'TREE_TYPE' etc.) and there is no 'targetm.goacc.expand_var_decl', and
(2) for nvptx offloading, 'targetm.goacc.adjust_private_decl' only sets a
marker ('oacc gangprivate' attribute) and then
'targetm.goacc.expand_var_decl' does the work.

Therefore I suggest we clarify the (currently) expected handling similar
to:

    --- gcc/omp-offload.c
    +++ gcc/omp-offload.c
    @@ -1854,6 +1854,19 @@ oacc_rewrite_var_decl (tree *tp, int *walk_subtrees, void *data)
       return NULL_TREE;
     }

    +static tree
    +oacc_rewrite_var_decl_ (tree *tp, int *walk_subtrees, void *data)
    +{
    +  tree t = oacc_rewrite_var_decl (tp, walk_subtrees, data);
    +  if (targetm.goacc.expand_var_decl)
    +    {
    +      walk_stmt_info *wi = (walk_stmt_info *) data;
    +      var_decl_rewrite_info *info = (var_decl_rewrite_info *) wi->info;
    +      gcc_assert (!info->modified);
    +    }
    +  return t;
    +}
    +
     /* Return TRUE if CALL is a call to a builtin atomic/sync operation.  */

     static bool
    @@ -2195,6 +2208,9 @@ execute_oacc_device_lower ()
          COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
          the new decl, adjusting types of appropriate tree nodes as necessary.  */

    +  if (targetm.goacc.expand_var_decl)
    +    gcc_assert (adjusted_vars.is_empty ());
    +
       if (targetm.goacc.adjust_private_decl)
         {
           FOR_ALL_BB_FN (bb, cfun)
    @@ -2217,7 +2233,7 @@ execute_oacc_device_lower ()
                memset (&wi, 0, sizeof (wi));
                wi.info = &info;

    -           walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
    +           walk_gimple_op (stmt, oacc_rewrite_var_decl_, &wi);

                if (info.modified)
                  update_stmt (stmt);

Or, in fact, 'if (targetm.goacc.expand_var_decl)', skip the
'adjusted_vars' handling completely?

I do understand that eventually (in particular, for worker-private
level?), both 'targetm.goacc.adjust_private_decl' and
'targetm.goacc.expand_var_decl' may need to do things, but that's
currently not meant to be addressed, and thus not fully worked out and
implemented, and thus untested.  Hence, 'assert' what currently is
implemented/tested, only.

(Given that eventual goal, that's probably sufficient motivation to
indeed add the 'adjusted_vars' handling in generic 'gcc/omp-offload.c'
instead of moving it into the GCN back end?)
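
To make the two currently handled configurations concrete, here is a toy
model in plain C.  Nothing in it is GCC code: 'toy_decl', 'toy_hooks',
'toy_lower' and the '*_like_*' functions are hypothetical stand-ins, and
the two flags only approximate "the decl itself was rewritten" (GCN) versus
"only a marker was set, expansion does the work" (nvptx).

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the GOMP_DIM_* parallelism levels.  */
enum toy_level { TOY_GANG, TOY_WORKER, TOY_VECTOR };

/* Toy declaration; this is not GCC's 'tree'.  */
struct toy_decl
{
  int rewritten;  /* GCN-like: the decl itself was changed.  */
  int marked;     /* nvptx-like: only an attribute-style marker was set.  */
};

/* Toy hook vector; a NULL entry means "hook not defined".  */
struct toy_hooks
{
  void (*adjust_private_decl) (struct toy_decl *, enum toy_level);
  int (*expand_var_decl) (const struct toy_decl *);
};

static void
gcn_like_adjust (struct toy_decl *d, enum toy_level level)
{
  if (level == TOY_GANG)
    d->rewritten = 1;
}

static void
nvptx_like_adjust (struct toy_decl *d, enum toy_level level)
{
  if (level == TOY_GANG)
    d->marked = 1;
}

/* Nonzero means "expanded specially" (here: placed in shared memory).  */
static int
nvptx_like_expand (const struct toy_decl *d)
{
  return d->marked;
}

static const struct toy_hooks gcn_like = { gcn_like_adjust, NULL };
static const struct toy_hooks nvptx_like = { nvptx_like_adjust, nvptx_like_expand };

/* Run the adjust hook, then report whether the outcome is one of the two
   configurations discussed: a target defining expand_var_decl must not
   also have rewritten the decl at device-lowering time.  */
static int
toy_lower (const struct toy_hooks *h, struct toy_decl *d, enum toy_level level)
{
  if (h->adjust_private_decl)
    h->adjust_private_decl (d, level);
  return !(h->expand_var_decl && d->rewritten);
}
```

In this model both 'gcn_like' and 'nvptx_like' pass the check, whereas a
mixed target (rewriting the decl and also defining the expand hook) is
rejected, which is the situation the proposed 'gcc_assert's are meant to
flag as not yet implemented/tested.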


For 'libgomp.oacc-c-c++-common/static-variable-1.c' that I've recently
added, the code changes here cause execution test FAILs for nvptx
offloading (because of making 'static' variables gang-private), and
trigger an ICE with GCN offloading compilation.  It isn't clear to me
what the desired semantics are for (user-specified) 'static' variables --
see <https://github.com/OpenACC/openacc-spec/issues/372> "C/C++ 'static'
variables" (only visible to members of the GitHub OpenACC organization)
-- but an ICE clearly isn't the right answer.  ;-)

As 'static' variables may be synthesized in the GCC middle end for
certain transformations/optimizations, I suppose we should preserve the
status quo (as documented via
'libgomp.oacc-c-c++-common/static-variable-1.c') until #372 gets resolved
in OpenACC?  (I suppose, skip the transformation if 'TREE_STATIC' is set,
or similar.)


A few individual comments (search for '[TS]'), for easy reference
embedded in full-quote of the generic code changes.  GCN and nvptx back
end code changes to be found in
<http://mid.mail-archive.com/d6ae43626eed9fd968250ee10109433e810d1048.1614342218.git.julian@codesourcery.com>,
<http://mid.mail-archive.com/aab0a87b99797e1fcc73e7f3e76152405289805a.1614342218.git.julian@codesourcery.com>.


> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -1712,6 +1712,36 @@ for allocating any storage for reductions when necessary.",
>  void, (gcall *call),
>  default_goacc_reduction)
>
> +DEFHOOK
> +(expand_var_decl,
> +"This hook, if defined, is used by accelerator target back-ends to expand\n\
> +specially handled kinds of @code{VAR_DECL} expressions.  A particular use is\n\
> +to place variables with specific attributes inside special accelarator\n\
> +memories.  A return value of @code{NULL} indicates that the target does not\n\
> +handle this @code{VAR_DECL}, and normal RTL expanding is resumed.\n\
> +\n\
> +Only define this hook if your accelerator target needs to expand certain\n\
> +@code{VAR_DECL} nodes in a way that differs from the default.  You can also adjust\n\
> +private variables at OpenACC device-lowering time using the\n\
> +@code{TARGET_GOACC_ADJUST_PRIVATE_DECL} target hook.",
> +rtx, (tree var),
> +NULL)
> +
> +DEFHOOK
> +(adjust_private_decl,
> +"This hook, if defined, is used by accelerator target back-ends to adjust\n\
> +OpenACC variable declarations that should be made private to the given\n\
> +parallelism level (i.e. @code{GOMP_DIM_GANG}, @code{GOMP_DIM_WORKER} or\n\
> +@code{GOMP_DIM_VECTOR}).  A typical use for this hook is to force variable\n\
> +declarations at the @code{gang} level to reside in GPU shared memory, by\n\
> +setting the address space of the decl and making it static.\n\
> +\n\
> +You may also use the @code{TARGET_GOACC_EXPAND_VAR_DECL} hook if the\n\
> +adjusted variable declaration needs to be expanded to RTL in a non-standard\n\
> +way.",
> +tree, (tree var, int level),
> +NULL)
> +
>  HOOK_VECTOR_END (goacc)
>
>  /* Functions relating to vectorization.  */

> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -6227,6 +6227,32 @@ like @code{cond_add@var{m}}.  The default implementation returns a zero
>  constant of type @var{type}.
>  @end deftypefn
>
> +@deftypefn {Target Hook} rtx TARGET_GOACC_EXPAND_VAR_DECL (tree @var{var})
> +This hook, if defined, is used by accelerator target back-ends to expand
> +specially handled kinds of @code{VAR_DECL} expressions.  A particular use is
> +to place variables with specific attributes inside special accelarator
> +memories.  A return value of @code{NULL} indicates that the target does not
> +handle this @code{VAR_DECL}, and normal RTL expanding is resumed.
> +
> +Only define this hook if your accelerator target needs to expand certain
> +@code{VAR_DECL} nodes in a way that differs from the default.  You can also adjust
> +private variables at OpenACC device-lowering time using the
> +@code{TARGET_GOACC_ADJUST_PRIVATE_DECL} target hook.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} tree TARGET_GOACC_ADJUST_PRIVATE_DECL (tree @var{var}, int @var{level})
> +This hook, if defined, is used by accelerator target back-ends to adjust
> +OpenACC variable declarations that should be made private to the given
> +parallelism level (i.e. @code{GOMP_DIM_GANG}, @code{GOMP_DIM_WORKER} or
> +@code{GOMP_DIM_VECTOR}).  A typical use for this hook is to force variable
> +declarations at the @code{gang} level to reside in GPU shared memory, by
> +setting the address space of the decl and making it static.
> +
> +You may also use the @code{TARGET_GOACC_EXPAND_VAR_DECL} hook if the
> +adjusted variable declaration needs to be expanded to RTL in a non-standard
> +way.
> +@end deftypefn
> +
>  @node Anchored Addresses
>  @section Anchored Addresses
>  @cindex anchored addresses

> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -4219,6 +4219,10 @@ address;  but often a machine-dependent strategy can generate better code.
>
>  @hook TARGET_PREFERRED_ELSE_VALUE
>
> +@hook TARGET_GOACC_EXPAND_VAR_DECL
> +
> +@hook TARGET_GOACC_ADJUST_PRIVATE_DECL
> +
>  @node Anchored Addresses
>  @section Anchored Addresses
>  @cindex anchored addresses


> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -10224,8 +10224,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>        exp = SSA_NAME_VAR (ssa_name);
>        goto expand_decl_rtl;
>
> -    case PARM_DECL:
>      case VAR_DECL:
> +      /* Allow accel compiler to handle variables that require special
> +      treatment, e.g. if they have been modified in some way earlier in
> +      compilation by the adjust_private_decl OpenACC hook.  */
> +      if (flag_openacc && targetm.goacc.expand_var_decl)
> +     {
> +       temp = targetm.goacc.expand_var_decl (exp);
> +       if (temp)
> +         return temp;
> +     }
> +      /* ... fall through ...  */
> +
> +    case PARM_DECL:

[TS] Are we sure that we don't need the same handling for a 'PARM_DECL',
too?  (If yes, to document and verify that, should we thus again unify
the two 'case's, and in 'targetm.goacc.expand_var_decl' add a
'gcc_checking_assert (TREE_CODE (var) == VAR_DECL)'?)

Also, are we sure that none of the following existing processing needs
to happen before the 'return temp' (see above)?  That's not a concern for
GCN (which doesn't use 'targetm.goacc.expand_var_decl', and thus does
execute all of the following existing processing), but it is for nvptx
(which does use 'targetm.goacc.expand_var_decl', and thus skips all of
the following existing processing if the hook returned something).  Or,
is 'targetm.goacc.expand_var_decl' conceptually and practically meant to
implement all of the following processing itself, or is that processing
for other reasons not relevant in the 'targetm.goacc.expand_var_decl'
case:

>        /* If a static var's type was incomplete when the decl was written,
>        but the type is complete now, lay out the decl now.  */
>        if (DECL_SIZE (exp) == 0
|            && COMPLETE_OR_UNBOUND_ARRAY_TYPE_P (TREE_TYPE (exp))
|            && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
|          layout_decl (exp, 0);
|
|        /* fall through */
|
|      case FUNCTION_DECL:
|      case RESULT_DECL:
|        decl_rtl = DECL_RTL (exp);
|      expand_decl_rtl:
|        gcc_assert (decl_rtl);
|
|        /* DECL_MODE might change when TYPE_MODE depends on attribute target
|           settings for VECTOR_TYPE_P that might switch for the function.  */
|        if (currently_expanding_to_rtl
|            && code == VAR_DECL && MEM_P (decl_rtl)
|            && VECTOR_TYPE_P (type) && exp && DECL_MODE (exp) != mode)
|          decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0);
|        else
|          decl_rtl = copy_rtx (decl_rtl);
|
|        /* Record writes to register variables.  */
|        if (modifier == EXPAND_WRITE
|            && REG_P (decl_rtl)
|            && HARD_REGISTER_P (decl_rtl))
|          add_to_hard_reg_set (&crtl->asm_clobbers,
|                               GET_MODE (decl_rtl), REGNO (decl_rtl));
|
|        /* Ensure variable marked as used even if it doesn't go through
|           a parser.  If it hasn't be used yet, write out an external
|           definition.  */
|        if (exp)
|          TREE_USED (exp) = 1;
|
|        /* Show we haven't gotten RTL for this yet.  */
|        temp = 0;
|
|        /* Variables inherited from containing functions should have
|           been lowered by this point.  */
|        if (exp)
|          context = decl_function_context (exp);
|        gcc_assert (!exp
|                    || SCOPE_FILE_SCOPE_P (context)
|                    || context == current_function_decl
|                    || TREE_STATIC (exp)
|                    || DECL_EXTERNAL (exp)
|                    /* ??? C++ creates functions that are not TREE_STATIC.  */
|                    || TREE_CODE (exp) == FUNCTION_DECL);
|
|        /* This is the case of an array whose size is to be determined
|           from its initializer, while the initializer is still being parsed.
|           ??? We aren't parsing while expanding anymore.  */
|
|        if (MEM_P (decl_rtl) && REG_P (XEXP (decl_rtl, 0)))
|          temp = validize_mem (decl_rtl);
|
|        /* If DECL_RTL is memory, we are in the normal case and the
|           address is not valid, get the address into a register.  */
|
|        else if (MEM_P (decl_rtl) && modifier != EXPAND_INITIALIZER)
|          {
|            if (alt_rtl)
|              *alt_rtl = decl_rtl;
|            decl_rtl = use_anchored_address (decl_rtl);
|            if (modifier != EXPAND_CONST_ADDRESS
|                && modifier != EXPAND_SUM
|                && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
|                                                 : GET_MODE (decl_rtl),
|                                                 XEXP (decl_rtl, 0),
|                                                 MEM_ADDR_SPACE (decl_rtl)))
|              temp = replace_equiv_address (decl_rtl,
|                                            copy_rtx (XEXP (decl_rtl, 0)));
|          }
|
|        /* If we got something, return it.  But first, set the alignment
|           if the address is a register.  */
|        if (temp != 0)
|          {
|            if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
|              mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
|          }
|        else if (MEM_P (decl_rtl))
|          temp = decl_rtl;
|
|        if (temp != 0)
|          {
|            if (MEM_P (temp)
|                && modifier != EXPAND_WRITE
|                && modifier != EXPAND_MEMORY
|                && modifier != EXPAND_INITIALIZER
|                && modifier != EXPAND_CONST_ADDRESS
|                && modifier != EXPAND_SUM
|                && !inner_reference_p
|                && mode != BLKmode
|                && MEM_ALIGN (temp) < GET_MODE_ALIGNMENT (mode))
|              temp = expand_misaligned_mem_ref (temp, mode, unsignedp,
|                                                MEM_ALIGN (temp), NULL_RTX, NULL);
|
|            return temp;
|          }
| [...]

[TS] I don't understand that yet.  :-|

Instead of the current "early-return" handling:

    temp = targetm.goacc.expand_var_decl (exp);
    if (temp)
      return temp;

... should we maybe just set:

    DECL_RTL (exp) = targetm.goacc.expand_var_decl (exp)

... (or similar), and then let the usual processing continue?
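
To illustrate the difference between the two shapes, here is a toy model
in plain C (not GCC code).  All 'toy_*' names are invented for this
sketch: 'rtl' stands in for 'DECL_RTL', 'used' for 'TREE_USED', and
'toy_common_tail' for the existing processing quoted above, reduced to a
single side effect.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are GCC types.  */
enum toy_rtl { TOY_RTL_NONE, TOY_RTL_STACK, TOY_RTL_SHARED };

struct toy_decl
{
  enum toy_rtl rtl;    /* Stands in for DECL_RTL.  */
  bool used;           /* Stands in for TREE_USED.  */
  bool wants_shared;   /* Hypothetical: target wants special placement.  */
};

/* Hypothetical target hook: return replacement RTL, or TOY_RTL_NONE if
   the target does not handle this decl.  */

static enum toy_rtl
toy_expand_var_decl (const struct toy_decl *d)
{
  return d->wants_shared ? TOY_RTL_SHARED : TOY_RTL_NONE;
}

/* Stand-in for the generic processing after DECL_RTL is fetched; here
   it only marks the decl as used.  */

static enum toy_rtl
toy_common_tail (struct toy_decl *d)
{
  assert (d->rtl != TOY_RTL_NONE);
  d->used = true;
  return d->rtl;
}

/* Variant 1: early return.  The common tail is skipped whenever the
   hook returns something.  */

static enum toy_rtl
toy_expand_early_return (struct toy_decl *d)
{
  enum toy_rtl temp = toy_expand_var_decl (d);
  if (temp != TOY_RTL_NONE)
    return temp;
  return toy_common_tail (d);
}

/* Variant 2: store the hook's result and let the common tail run.  */

static enum toy_rtl
toy_expand_fall_through (struct toy_decl *d)
{
  enum toy_rtl temp = toy_expand_var_decl (d);
  if (temp != TOY_RTL_NONE)
    d->rtl = temp;
  return toy_common_tail (d);
}
```

Both variants hand back the hook's result, but only the second one still
runs the common processing.  Whether any of that processing actually
matters for the nvptx case is exactly the open question.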


> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -2957,6 +2957,8 @@ expand_UNIQUE (internal_fn, gcall *stmt)
>        else
>       gcc_unreachable ();
>        break;
> +    case IFN_UNIQUE_OACC_PRIVATE:
> +      break;
>      }
>
>    if (pattern)

> --- a/gcc/internal-fn.h
> +++ b/gcc/internal-fn.h
> @@ -36,7 +36,8 @@ along with GCC; see the file COPYING3.  If not see
>  #define IFN_UNIQUE_CODES                               \
>    DEF(UNSPEC),       \
>      DEF(OACC_FORK), DEF(OACC_JOIN),          \
> -    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK)
> +    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK),        \
> +    DEF(OACC_PRIVATE)
>
>  enum ifn_unique_kind {
>  #define DEF(X) IFN_UNIQUE_##X


> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -171,6 +171,9 @@ struct omp_context
>
>    /* True if there is bind clause on the construct (i.e. a loop construct).  */
>    bool loop_p;
> +
> +  /* Addressable variable decls in this context.  */
> +  vec<tree> oacc_addressable_var_decls;
>  };
>
>  static splay_tree all_contexts;
> @@ -7048,8 +7051,9 @@ lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *body_p,
>
>  static void
>  lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
> -                    gcall *fork, gcall *join, gimple_seq *fork_seq,
> -                    gimple_seq *join_seq, omp_context *ctx)
> +                    gcall *fork, gcall *private_marker, gcall *join,
> +                    gimple_seq *fork_seq, gimple_seq *join_seq,
> +                    omp_context *ctx)
>  {
>    gimple_seq before_fork = NULL;
>    gimple_seq after_fork = NULL;
> @@ -7253,6 +7257,8 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
>
>    /* Now stitch things together.  */
>    gimple_seq_add_seq (fork_seq, before_fork);
> +  if (private_marker)
> +    gimple_seq_add_stmt (fork_seq, private_marker);
>    if (fork)
>      gimple_seq_add_stmt (fork_seq, fork);
>    gimple_seq_add_seq (fork_seq, after_fork);
> @@ -7989,7 +7995,7 @@ lower_oacc_loop_marker (location_t loc, tree ddvar, bool head,
>     HEAD and TAIL.  */
>
>  static void
> -lower_oacc_head_tail (location_t loc, tree clauses,
> +lower_oacc_head_tail (location_t loc, tree clauses, gcall *private_marker,
>                     gimple_seq *head, gimple_seq *tail, omp_context *ctx)
>  {
>    bool inner = false;
> @@ -7997,6 +8003,14 @@ lower_oacc_head_tail (location_t loc, tree clauses,
>    gimple_seq_add_stmt (head, gimple_build_assign (ddvar, integer_zero_node));
>
>    unsigned count = lower_oacc_head_mark (loc, ddvar, clauses, head, ctx);
> +
> +  if (private_marker)
> +    {
> +      gimple_set_location (private_marker, loc);
> +      gimple_call_set_lhs (private_marker, ddvar);
> +      gimple_call_set_arg (private_marker, 1, ddvar);
> +    }
> +
>    tree fork_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_FORK);
>    tree join_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_JOIN);
>
> @@ -8027,7 +8041,8 @@ lower_oacc_head_tail (location_t loc, tree clauses,
>                             &join_seq);
>
>        lower_oacc_reductions (loc, clauses, place, inner,
> -                          fork, join, &fork_seq, &join_seq,  ctx);
> +                          fork, (count == 1) ? private_marker : NULL,
> +                          join, &fork_seq, &join_seq,  ctx);
>
>        /* Append this level to head. */
>        gimple_seq_add_seq (head, fork_seq);

[TS] That looks good in principle.  Via the testing mentioned above, I
just want to make sure that this does all the expected things regarding
differently nested loops and privatization levels.
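
As a sketch of the kind of nesting I have in mind (a hypothetical
testcase, not part of the posted patch; 'toy_nested_gang_worker' is a
made-up name), here is a variable declared inside a gang loop body and
updated from a nested worker loop.  It is the C analogue of combining
the 'gang-private-1.c' and 'gangprivate-attrib-1.f90' patterns:

```c
#include <assert.h>

/* Hypothetical test body: 8 gangs, 32 workers each.  'w' is declared
   inside the gang loop, so each gang should get its own copy, shared
   by that gang's workers.  ARR must have room for 8 elements.  */

static void
toy_nested_gang_worker (int *arr)
{
  assert (arr != 0);

  #pragma acc parallel num_gangs(8) num_workers(32) \
	  copyout(arr[0:8])
  {
    #pragma acc loop gang
    for (int j = 0; j < 8; j++)
      {
	int w = 0;

	#pragma acc loop worker
	for (int i = 0; i < 32; i++)
	  {
	    #pragma acc atomic update
	    w++;
	  }

	arr[j] = w;
      }
  }
}
```

A caller would then check that every 'arr[j]' equals 32.  If compiled
without '-fopenacc', the pragmas are ignored and the function runs
sequentially, giving the same values.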

> @@ -9992,6 +10007,32 @@ lower_omp_for_lastprivate (struct omp_for_data *fd, gimple_seq *body_p,
>      }
>  }
>
> +/* Record vars listed in private clauses in CLAUSES in CTX.  This information
> +   is used to mark up variables that should be made private per-gang.  */
> +
> +static void
> +oacc_record_private_var_clauses (omp_context *ctx, tree clauses)
> +{
> +  for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
> +    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_PRIVATE)
> +      {
> +     tree decl = OMP_CLAUSE_DECL (c);
> +     if (VAR_P (decl) && TREE_ADDRESSABLE (decl))
> +       ctx->oacc_addressable_var_decls.safe_push (decl);
> +      }
> +}
> +
> +/* Record addressable vars declared in BINDVARS in CTX.  This information is
> +   used to mark up variables that should be made private per-gang.  */
> +
> +static void
> +oacc_record_vars_in_bind (omp_context *ctx, tree bindvars)
> +{
> +  for (tree v = bindvars; v; v = DECL_CHAIN (v))
> +    if (VAR_P (v) && TREE_ADDRESSABLE (v))
> +      ctx->oacc_addressable_var_decls.safe_push (v);
> +}
> +

[TS] For these two, we'd add the 'TREE_ADDRESSABLE' rationale mentioned
above.

>  /* Callback for walk_gimple_seq.  Find #pragma omp scan statement.  */
>
>  static tree
> @@ -10821,6 +10862,57 @@ lower_omp_for_scan (gimple_seq *body_p, gimple_seq *dlist, gomp_for *stmt,
>    *dlist = new_dlist;
>  }
>
> +/* Build an internal UNIQUE function with type IFN_UNIQUE_OACC_PRIVATE listing
> +   the addresses of variables that should be made private at the surrounding
> +   parallelism level.  Such functions appear in the gimple code stream in two
> +   forms, e.g. for a partitioned loop:
> +
> +      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6, 1, 68);
> +      .data_dep.6 = .UNIQUE (OACC_PRIVATE, .data_dep.6, -1, &w);
> +      .data_dep.6 = .UNIQUE (OACC_FORK, .data_dep.6, -1);
> +      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6);
> +
> +   or alternatively, OACC_PRIVATE can appear at the top level of a parallel,
> +   not as part of a HEAD_MARK sequence:
> +
> +      .UNIQUE (OACC_PRIVATE, 0, 0, &w);
> +
> +   For such stand-alone appearances, the 3rd argument is always 0, denoting
> +   gang partitioning.  */
> +
> +static gcall *
> +make_oacc_private_marker (omp_context *ctx)
> +{
> +  int i;
> +  tree decl;
> +
> +  if (ctx->oacc_addressable_var_decls.length () == 0)
> +    return NULL;
> +
> +  auto_vec<tree, 5> args;
> +
> +  args.quick_push (build_int_cst (integer_type_node, IFN_UNIQUE_OACC_PRIVATE));
> +  args.quick_push (integer_zero_node);
> +  args.quick_push (integer_minus_one_node);
> +
> +  FOR_EACH_VEC_ELT (ctx->oacc_addressable_var_decls, i, decl)
> +    {
> +      for (omp_context *thisctx = ctx; thisctx; thisctx = thisctx->outer)
> +     {
> +       tree inner_decl = maybe_lookup_decl (decl, thisctx);
> +       if (inner_decl)
> +         {
> +           decl = inner_decl;
> +           break;
> +         }
> +     }
> +      tree addr = build_fold_addr_expr (decl);
> +      args.safe_push (addr);
> +    }
> +
> +  return gimple_build_call_internal_vec (IFN_UNIQUE, args);
> +}
> +
>  /* Lower code for an OMP loop directive.  */
>
>  static void
> @@ -10837,6 +10929,8 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>
>    push_gimplify_context ();
>
> +  oacc_record_private_var_clauses (ctx, gimple_omp_for_clauses (stmt));
> +
>    lower_omp (gimple_omp_for_pre_body_ptr (stmt), ctx);
>
>    block = make_node (BLOCK);
> @@ -10855,6 +10949,8 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>        gbind *inner_bind
>       = as_a <gbind *> (gimple_seq_first_stmt (omp_for_body));
>        tree vars = gimple_bind_vars (inner_bind);
> +      if (is_gimple_omp_oacc (ctx->stmt))
> +     oacc_record_vars_in_bind (ctx, vars);
>        gimple_bind_append_vars (new_stmt, vars);
>        /* bind_vars/BLOCK_VARS are being moved to new_stmt/block, don't
>        keep them on the inner_bind and it's block.  */
> @@ -10968,6 +11064,11 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>
>    lower_omp (gimple_omp_body_ptr (stmt), ctx);
>
> +  gcall *private_marker = NULL;
> +  if (is_gimple_omp_oacc (ctx->stmt)
> +      && !gimple_seq_empty_p (omp_for_body))
> +    private_marker = make_oacc_private_marker (ctx);
> +
>    /* Lower the header expressions.  At this point, we can assume that
>       the header is of the form:
>
> @@ -11022,7 +11123,7 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>    if (is_gimple_omp_oacc (ctx->stmt)
>        && !ctx_in_oacc_kernels_region (ctx))
>      lower_oacc_head_tail (gimple_location (stmt),
> -                       gimple_omp_for_clauses (stmt),
> +                       gimple_omp_for_clauses (stmt), private_marker,
>                         &oacc_head, &oacc_tail, ctx);
>
>    /* Add OpenACC partitioning and reduction markers just before the loop.  */
> @@ -13019,8 +13120,14 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>            them as a dummy GANG loop.  */
>         tree level = build_int_cst (integer_type_node, GOMP_DIM_GANG);
>
> +       gcall *private_marker = make_oacc_private_marker (ctx);
> +
> +       if (private_marker)
> +         gimple_call_set_arg (private_marker, 2, level);
> +
>         lower_oacc_reductions (gimple_location (ctx->stmt), clauses, level,
> -                              false, NULL, NULL, &fork_seq, &join_seq, ctx);
> +                              false, NULL, private_marker, NULL, &fork_seq,
> +                              &join_seq, ctx);
>       }
>
>        gimple_seq_add_seq (&new_body, fork_seq);
> @@ -13262,6 +13369,9 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>                ctx);
>        break;
>      case GIMPLE_BIND:
> +      if (ctx && is_gimple_omp_oacc (ctx->stmt))
> +     oacc_record_vars_in_bind (ctx,
> +                               gimple_bind_vars (as_a <gbind *> (stmt)));
>        lower_omp (gimple_bind_body_ptr (as_a <gbind *> (stmt)), ctx);
>        maybe_remove_omp_member_access_dummy_vars (as_a <gbind *> (stmt));
>        break;

[TS] I have not yet verified whether these lowering cases are sufficient
to also handle the <https://gcc.gnu.org/PR90114> "Predetermined private
levels for variables declared in OpenACC accelerator routines" case.  (If
yes, then that needs testcases, too; if not, then we need to add a TODO
note for later.)


> --- a/gcc/omp-offload.c
> +++ b/gcc/omp-offload.c
> @@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "attribs.h"
>  #include "cfgloop.h"
>  #include "context.h"
> +#include "convert.h"
>
>  /* Describe the OpenACC looping structure of a function.  The entire
>     function is held in a 'NULL' loop.  */
> @@ -1356,7 +1357,9 @@ oacc_loop_xform_head_tail (gcall *from, int level)
>           = ((enum ifn_unique_kind)
>              TREE_INT_CST_LOW (gimple_call_arg (stmt, 0)));
>
> -       if (k == IFN_UNIQUE_OACC_FORK || k == IFN_UNIQUE_OACC_JOIN)
> +       if (k == IFN_UNIQUE_OACC_FORK
> +           || k == IFN_UNIQUE_OACC_JOIN
> +           || k == IFN_UNIQUE_OACC_PRIVATE)
>           *gimple_call_arg_ptr (stmt, 2) = replacement;
>         else if (k == kind && stmt != from)
>           break;
> @@ -1773,6 +1776,136 @@ default_goacc_reduction (gcall *call)
>    gsi_replace_with_seq (&gsi, seq, true);
>  }
>
> +struct var_decl_rewrite_info
> +{
> +  gimple *stmt;
> +  hash_map<tree, tree> *adjusted_vars;
> +  bool avoid_pointer_conversion;
> +  bool modified;
> +};
> +
> +/* Helper function for execute_oacc_device_lower.  Rewrite VAR_DECLs (by
> +   themselves or wrapped in various other nodes) according to ADJUSTED_VARS in
> +   the var_decl_rewrite_info pointed to via DATA.  Used as part of coercing
> +   gang-private variables in OpenACC offload regions to reside in GPU shared
> +   memory.  */
> +
> +static tree
> +oacc_rewrite_var_decl (tree *tp, int *walk_subtrees, void *data)
> +{
> +  walk_stmt_info *wi = (walk_stmt_info *) data;
> +  var_decl_rewrite_info *info = (var_decl_rewrite_info *) wi->info;
> +
> +  if (TREE_CODE (*tp) == ADDR_EXPR)
> +    {
> +      tree arg = TREE_OPERAND (*tp, 0);
> +      tree *new_arg = info->adjusted_vars->get (arg);
> +
> +      if (new_arg)
> +     {
> +       if (info->avoid_pointer_conversion)
> +         {
> +           *tp = build_fold_addr_expr (*new_arg);
> +           info->modified = true;
> +           *walk_subtrees = 0;
> +         }
> +       else
> +         {
> +           gimple_stmt_iterator gsi = gsi_for_stmt (info->stmt);
> +           tree repl = build_fold_addr_expr (*new_arg);
> +           gimple *stmt1
> +             = gimple_build_assign (make_ssa_name (TREE_TYPE (repl)), repl);
> +           tree conv = convert_to_pointer (TREE_TYPE (*tp),
> +                                           gimple_assign_lhs (stmt1));
> +           gimple *stmt2
> +             = gimple_build_assign (make_ssa_name (TREE_TYPE (*tp)), conv);
> +           gsi_insert_before (&gsi, stmt1, GSI_SAME_STMT);
> +           gsi_insert_before (&gsi, stmt2, GSI_SAME_STMT);
> +           *tp = gimple_assign_lhs (stmt2);
> +           info->modified = true;
> +           *walk_subtrees = 0;
> +         }
> +     }
> +    }
> +  else if (TREE_CODE (*tp) == COMPONENT_REF || TREE_CODE (*tp) == ARRAY_REF)
> +    {
> +      tree *base = &TREE_OPERAND (*tp, 0);
> +
> +      while (TREE_CODE (*base) == COMPONENT_REF
> +          || TREE_CODE (*base) == ARRAY_REF)
> +     base = &TREE_OPERAND (*base, 0);
> +
> +      if (TREE_CODE (*base) != VAR_DECL)
> +     return NULL;
> +
> +      tree *new_decl = info->adjusted_vars->get (*base);
> +      if (!new_decl)
> +     return NULL;
> +
> +      int base_quals = TYPE_QUALS (TREE_TYPE (*new_decl));
> +      tree field = TREE_OPERAND (*tp, 1);
> +
> +      /* Adjust the type of the field.  */
> +      int field_quals = TYPE_QUALS (TREE_TYPE (field));
> +      if (TREE_CODE (field) == FIELD_DECL && field_quals != base_quals)
> +     {
> +       tree *field_type = &TREE_TYPE (field);
> +       while (TREE_CODE (*field_type) == ARRAY_TYPE)
> +         field_type = &TREE_TYPE (*field_type);
> +       field_quals |= base_quals;
> +       *field_type = build_qualified_type (*field_type, field_quals);
> +     }
> +
> +      /* Adjust the type of the component ref itself.  */
> +      tree comp_type = TREE_TYPE (*tp);
> +      int comp_quals = TYPE_QUALS (comp_type);
> +      if (TREE_CODE (*tp) == COMPONENT_REF && comp_quals != base_quals)
> +     {
> +       comp_quals |= base_quals;
> +       TREE_TYPE (*tp)
> +         = build_qualified_type (comp_type, comp_quals);
> +     }
> +
> +      *base = *new_decl;
> +      info->modified = true;
> +    }
> +  else if (TREE_CODE (*tp) == VAR_DECL)
> +    {
> +      tree *new_decl = info->adjusted_vars->get (*tp);
> +      if (new_decl)
> +     {
> +       *tp = *new_decl;
> +       info->modified = true;
> +     }
> +    }
> +
> +  return NULL_TREE;
> +}
> +
> +/* Return TRUE if CALL is a call to a builtin atomic/sync operation.  */
> +
> +static bool
> +is_sync_builtin_call (gcall *call)
> +{
> +  tree callee = gimple_call_fndecl (call);
> +
> +  if (callee != NULL_TREE
> +      && gimple_call_builtin_p (call, BUILT_IN_NORMAL))
> +    switch (DECL_FUNCTION_CODE (callee))
> +      {
> +#undef DEF_SYNC_BUILTIN
> +#define DEF_SYNC_BUILTIN(ENUM, NAME, TYPE, ATTRS) case ENUM:
> +#include "sync-builtins.def"
> +#undef DEF_SYNC_BUILTIN
> +     return true;
> +
> +      default:
> +     ;
> +      }
> +
> +  return false;
> +}
> +
>  /* Main entry point for oacc transformations which run on the device
>     compiler after LTO, so we know what the target device is at this
>     point (including the host fallback).  */
> @@ -1922,6 +2055,8 @@ execute_oacc_device_lower ()
>       dominance information to update SSA.  */
>    calculate_dominance_info (CDI_DOMINATORS);
>
> +  hash_map<tree, tree> adjusted_vars;
> +
>    /* Now lower internal loop functions to target-specific code
>       sequences.  */
>    basic_block bb;
> @@ -1998,6 +2133,45 @@ execute_oacc_device_lower ()
>               case IFN_UNIQUE_OACC_TAIL_MARK:
>                 remove = true;
>                 break;
> +
> +             case IFN_UNIQUE_OACC_PRIVATE:
> +               {
> +                 HOST_WIDE_INT level
> +                   = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
> +                 if (level == -1)
> +                   break;
> +                 for (unsigned i = 3;
> +                      i < gimple_call_num_args (call);
> +                      i++)
> +                   {
> +                     tree arg = gimple_call_arg (call, i);
> +                     gcc_assert (TREE_CODE (arg) == ADDR_EXPR);
> +                     tree decl = TREE_OPERAND (arg, 0);
> +                     if (dump_file && (dump_flags & TDF_DETAILS))
> +                       {
> +                         static char const *const axes[] =
> +                           /* Must be kept in sync with GOMP_DIM
> +                              enumeration.  */
> +                           { "gang", "worker", "vector" };
> +                         fprintf (dump_file, "Decl UID %u has %s "
> +                                  "partitioning:", DECL_UID (decl),
> +                                  axes[level]);
> +                         print_generic_decl (dump_file, decl, TDF_SLIM);
> +                         fputc ('\n', dump_file);
> +                       }
> +                     if (targetm.goacc.adjust_private_decl)
> +                       {
> +                         tree oldtype = TREE_TYPE (decl);
> +                         tree newdecl
> +                           = targetm.goacc.adjust_private_decl (decl, level);
> +                         if (TREE_TYPE (newdecl) != oldtype
> +                             || newdecl != decl)
> +                           adjusted_vars.put (decl, newdecl);
> +                       }
> +                   }
> +                 remove = true;
> +               }
> +               break;
>               }
>             break;
>           }
> @@ -2029,6 +2203,55 @@ execute_oacc_device_lower ()
>         gsi_next (&gsi);
>        }
>
> +  /* Make adjustments to gang-private local variables if required by the
> +     target, e.g. forcing them into a particular address space.  Afterwards,
> +     ADDR_EXPR nodes which have adjusted variables as their argument need to
> +     be modified in one of two ways:
> +
> +       1. They can be recreated, making a pointer to the variable in the new
> +       address space, or
> +
> +       2. The address of the variable in the new address space can be taken,
> +       converted to the default (original) address space, and the result of
> +       that conversion subsituted in place of the original ADDR_EXPR node.
> +
> +     Which of these is done depends on the gimple statement being processed.
> +     At present atomic operations and inline asms use (1), and everything else
> +     uses (2).  At least on AMD GCN, there are atomic operations that work
> +     directly in the LDS address space.
> +
> +     COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
> +     the new decl, adjusting types of appropriate tree nodes as necessary.  */

[TS] As I understand, this is only relevant for GCN offloading, but not
nvptx, and I'll trust that these two variants make sense from a GCN point
of view (which I cannot verify easily).

> +
> +  if (targetm.goacc.adjust_private_decl)
> +    {
> +      FOR_ALL_BB_FN (bb, cfun)
> +     for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
> +          !gsi_end_p (gsi);
> +          gsi_next (&gsi))
> +       {
> +         gimple *stmt = gsi_stmt (gsi);
> +         walk_stmt_info wi;
> +         var_decl_rewrite_info info;
> +
> +         info.avoid_pointer_conversion
> +           = (is_gimple_call (stmt)
> +              && is_sync_builtin_call (as_a <gcall *> (stmt)))
> +             || gimple_code (stmt) == GIMPLE_ASM;
> +         info.stmt = stmt;
> +         info.modified = false;
> +         info.adjusted_vars = &adjusted_vars;
> +
> +         memset (&wi, 0, sizeof (wi));
> +         wi.info = &info;
> +
> +         walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
> +
> +         if (info.modified)
> +           update_stmt (stmt);
> +       }
> +    }
> +
>    free_oacc_loop (loops);
>
>    return 0;

[TS] As discussed above, maybe we can completely skip the 'adjusted_vars'
rewriting for nvptx offloading?


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c

[TS] Without any code changes, this one FAILs (as expected) with nvptx
offloading, but with GCN offloading, it already PASSes.

> @@ -0,0 +1,38 @@
> +#include <assert.h>
> +
> +int main (void)
> +{
> +  int ret;
> +
> +  #pragma acc parallel num_gangs(1) num_workers(32) copyout(ret)
> +  {
> +    int w = 0;
> +
> +    #pragma acc loop worker
> +    for (int i = 0; i < 32; i++)
> +      {
> +     #pragma acc atomic update
> +     w++;
> +      }
> +
> +    ret = (w == 32);
> +  }
> +  assert (ret);
> +
> +  #pragma acc parallel num_gangs(1) vector_length(32) copyout(ret)
> +  {
> +    int v = 0;
> +
> +    #pragma acc loop vector
> +    for (int i = 0; i < 32; i++)
> +      {
> +     #pragma acc atomic update
> +     v++;
> +      }
> +
> +    ret = (v == 32);
> +  }
> +  assert (ret);
> +
> +  return 0;
> +}


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c

[TS] Both with nvptx and GCN offloading, that one already PASSes without
any code changes.

> @@ -0,0 +1,95 @@
> +#include <stdio.h>
> +#include <openacc.h>
> +#include <alloca.h>
> +#include <string.h>
> +#include <gomp-constants.h>
> +#include <stdlib.h>
> +
> +#if 0
> +#define DEBUG(DIM, IDX, VAL) \
> +  fprintf (stderr, "%sdist[%d] = %d\n", (DIM), (IDX), (VAL))
> +#else
> +#define DEBUG(DIM, IDX, VAL)
> +#endif
> +
> +#define N (32*32*32)
> +
> +int
> +check (const char *dim, int *dist, int dimsize)
> +{
> +  int ix;
> +  int exit = 0;
> +
> +  for (ix = 0; ix < dimsize; ix++)
> +    {
> +      DEBUG(dim, ix, dist[ix]);
> +      if (dist[ix] < (N) / (dimsize + 0.5)
> +       || dist[ix] > (N) / (dimsize - 0.5))
> +     {
> +       fprintf (stderr, "did not distribute to %ss (%d not between %d "
> +                "and %d)\n", dim, dist[ix], (int) ((N) / (dimsize + 0.5)),
> +                (int) ((N) / (dimsize - 0.5)));
> +       exit |= 1;
> +     }
> +    }
> +
> +  return exit;
> +}
> +
> +int main ()
> +{
> +  int ary[N];
> +  int ix;
> +  int exit = 0;
> +  int gangsize = 0, workersize = 0, vectorsize = 0;
> +  int *gangdist, *workerdist, *vectordist;
> +
> +  for (ix = 0; ix < N;ix++)
> +    ary[ix] = -1;
> +
> +#pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
> +         copy(ary) copyout(gangsize, workersize, vectorsize)
> +  {
> +#pragma acc loop gang worker vector
> +    for (unsigned ix = 0; ix < N; ix++)
> +      {
> +     int g, w, v;
> +
> +     g = __builtin_goacc_parlevel_id (GOMP_DIM_GANG);
> +     w = __builtin_goacc_parlevel_id (GOMP_DIM_WORKER);
> +     v = __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR);
> +
> +     ary[ix] = (g << 16) | (w << 8) | v;
> +      }
> +
> +    gangsize = __builtin_goacc_parlevel_size (GOMP_DIM_GANG);
> +    workersize = __builtin_goacc_parlevel_size (GOMP_DIM_WORKER);
> +    vectorsize = __builtin_goacc_parlevel_size (GOMP_DIM_VECTOR);
> +  }
> +
> +  gangdist = (int *) alloca (gangsize * sizeof (int));
> +  workerdist = (int *) alloca (workersize * sizeof (int));
> +  vectordist = (int *) alloca (vectorsize * sizeof (int));
> +  memset (gangdist, 0, gangsize * sizeof (int));
> +  memset (workerdist, 0, workersize * sizeof (int));
> +  memset (vectordist, 0, vectorsize * sizeof (int));
> +
> +  /* Test that work is shared approximately equally amongst each active
> +     gang/worker/vector.  */
> +  for (ix = 0; ix < N; ix++)
> +    {
> +      int g = (ary[ix] >> 16) & 255;
> +      int w = (ary[ix] >> 8) & 255;
> +      int v = ary[ix] & 255;
> +
> +      gangdist[g]++;
> +      workerdist[w]++;
> +      vectordist[v]++;
> +    }
> +
> +  exit = check ("gang", gangdist, gangsize);
> +  exit |= check ("worker", workerdist, workersize);
> +  exit |= check ("vector", vectordist, vectorsize);
> +
> +  return exit;
> +}


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90

[TS] This one does show the expected behavior: FAILs without code
changes, PASSes with code changes as posted.

> @@ -0,0 +1,25 @@
> +! Test for "oacc gangprivate" attribute on gang-private variables
> +
> +! { dg-do run }
> +! { dg-additional-options "-fdump-tree-oaccdevlow-details -w" }
> +
> +program main
> +  integer :: w, arr(0:31)
> +
> +  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
> +    !$acc loop gang private(w)
> +! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has gang partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } }
> +    do j = 0, 31
> +      w = 0
> +      !$acc loop seq
> +      do i = 0, 31
> +        !$acc atomic update
> +        w = w + 1
> +        !$acc end atomic
> +      end do
> +      arr(j) = w
> +    end do
> +  !$acc end parallel
> +
> +  if (any (arr .ne. 32)) stop 1
> +end program main


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90

[TS] With code changes as posted, this one FAILs for nvptx offloading
execution.  (... for all but the Nvidia Titan V GPU in my set of testing
configurations, huh?)

> @@ -0,0 +1,25 @@
> +! Test for worker-private variables
> +
> +! { dg-do run }
> +! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
> +
> +program main
> +  integer :: w, arr(0:31)
> +
> +  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
> +    !$acc loop gang worker private(w)
> +! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has worker partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } }
> +    do j = 0, 31
> +      w = 0
> +      !$acc loop seq
> +      do i = 0, 31
> +        !$acc atomic update
> +        w = w + 1
> +        !$acc end atomic
> +      end do
> +      arr(j) = w
> +    end do
> +  !$acc end parallel
> +
> +  if (any (arr .ne. 32)) stop 1
> +end program main


[TS] So we'll have to verify whether these are sufficiently testing what
they're meant to be testing, and fix up as necessary.


Grüße
 Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf


* Re: [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory
  2021-04-15 17:26   ` Thomas Schwinge
@ 2021-04-16 16:05     ` Andrew Stubbs
  2021-04-16 17:30       ` Thomas Schwinge
  2021-04-19 11:23     ` Julian Brown
  1 sibling, 1 reply; 24+ messages in thread
From: Andrew Stubbs @ 2021-04-16 16:05 UTC (permalink / raw)
  To: Thomas Schwinge, Julian Brown; +Cc: Jakub Jelinek, gcc-patches

On 15/04/2021 18:26, Thomas Schwinge wrote:
>> and optimisation, since shared memory might be faster than
>> the main memory on a GPU.
> 
> Do we potentially have a problem that making more use of (scarce)
> gang-private memory may negatively affect performance, because potentially
> fewer OpenACC gangs may then be launched to the GPU hardware in parallel?
> (Of course, OpenACC semantics conformance firstly is more important than
> performance, but there may be ways to be conformant and performant;
> "quality of implementation".)  Have you run any such performance testing
> with the benchmarking codes that we've got set up?
> 
> (As I'm more familiar with that, I'm using nvptx offloading examples in
> the following, whilst assuming that similar discussion may apply for GCN
> offloading, which uses similar hardware concepts, as far as I remember.)

Yes, that could happen. However, there's space for quite a lot of 
scalars before performance is affected: 64KB of LDS memory shared by a 
hardware-defined maximum of 40 threads gives about 1.5KB of space for 
worker-reduction variables and gang-private variables. We might have a 
problem if there are large private arrays.

I believe we have a "good enough" solution for the usual case, and a 
v2.0 full solution is going to be big and hairy enough for a whole patch 
of its own (requiring per-gang dynamic allocation, a different memory 
address space and possibly different instruction selection too).

Andrew


* Re: [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory
  2021-04-16 16:05     ` Andrew Stubbs
@ 2021-04-16 17:30       ` Thomas Schwinge
  2021-04-18 22:53         ` Andrew Stubbs
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Schwinge @ 2021-04-16 17:30 UTC (permalink / raw)
  To: Andrew Stubbs, Julian Brown; +Cc: Jakub Jelinek, gcc-patches, Tom de Vries

Hi!

On 2021-04-16T17:05:24+0100, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 15/04/2021 18:26, Thomas Schwinge wrote:
>>> and optimisation, since shared memory might be faster than
>>> the main memory on a GPU.
>>
>> Do we potentially have a problem that making more use of (scarce)
>> gang-private memory may negatively affect performance, because potentially
>> fewer OpenACC gangs may then be launched to the GPU hardware in parallel?
>> (Of course, OpenACC semantics conformance firstly is more important than
>> performance, but there may be ways to be conformant and performant;
>> "quality of implementation".)  Have you run any such performance testing
>> with the benchmarking codes that we've got set up?
>>
>> (As I'm more familiar with that, I'm using nvptx offloading examples in
>> the following, whilst assuming that similar discussion may apply for GCN
>> offloading, which uses similar hardware concepts, as far as I remember.)
>
> Yes, that could happen.

Thanks for sharing the GCN perspective.

> However, there's space for quite a lot of
> scalars before performance is affected: 64KB of LDS memory shared by a
> hardware-defined maximum of 40 threads

(Instead of threads, something like thread blocks, I suppose?)

> gives about 1.5KB of space for
> worker-reduction variables and gang-private variables.

PTX, as I understand this, may generally have a lot of Thread Blocks in
flight: all for the same GPU kernel as well as any GPU kernels running
asynchronously/generally concurrently (system-wide), and libgomp does try
launching a high number of Thread Blocks ('num_gangs') (for purposes of
hiding memory access latency?).  Random example:

    nvptx_exec: kernel t0_r$_omp_fn$0: launch gangs=1920, workers=32, vectors=32

With that, PTX's 48 KiB of '.shared' memory per SM (processor) are then
not so much anymore: just '48 * 1024 / 1920 = 25' bytes of gang-private
memory available for each of the 1920 gangs: 'double x, y, z'?  (... for
the simple case where just one GPU kernel is executing.)

(I suppose that calculation is valid for a GPU hardware variant where
there is just one SM.  If there are several (typically in the order of a
few dozens?), I suppose the Thread Blocks launched will be distributed
over all these, thus improving the situation correspondingly.)

(And of course, there are certainly other factors that also limit the
number of Thread Blocks that are actually executing in parallel.)

> We might have a
> problem if there are large private arrays.

Yes, that's understood.

Also, directly related, the problem that comes with supporting
worker-private memory, which basically calculates to the amount necessary
for gang-private memory multiplied by the number of workers?  (Out of
scope at present.)

> I believe we have a "good enough" solution for the usual case

So you believe that.  ;-)

It's certainly what I'd hope, too!  But we don't know yet whether there's
any noticeable performance impact if we run with (potentially) lesser
parallelism, hence my question whether this patch has been run through
performance testing.

> and a
> v2.0 full solution is going to be big and hairy enough for a whole patch
> of its own (requiring per-gang dynamic allocation, a different memory
> address space and possibly different instruction selection too).

Agree that a fully dynamic allocation scheme likely is going to be ugly,
so I'd certainly like to avoid that.

Before attempting that, we'd first try to optimize gang-private memory
allocation: so that it's function-local (and thus GPU kernel-local)
instead of device-global (assuming that's indeed possible), and try not
using gang-private memory in cases where it's not actually necessary
(semantically not observable, and not necessary for performance reasons).


Grüße
 Thomas


* Re: [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory
  2021-04-16 17:30       ` Thomas Schwinge
@ 2021-04-18 22:53         ` Andrew Stubbs
  2021-04-19 11:06           ` Thomas Schwinge
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Stubbs @ 2021-04-18 22:53 UTC (permalink / raw)
  To: Thomas Schwinge, Julian Brown; +Cc: Jakub Jelinek, gcc-patches, Tom de Vries

On 16/04/2021 18:30, Thomas Schwinge wrote:
> Hi!
> 
> On 2021-04-16T17:05:24+0100, Andrew Stubbs <ams@codesourcery.com> wrote:
>> On 15/04/2021 18:26, Thomas Schwinge wrote:
>>>> and optimisation, since shared memory might be faster than
>>>> the main memory on a GPU.
>>>
>>> Do we potentially have a problem that making more use of (scarce)
>>> gang-private memory may negatively affect performance, because potentially
>>> fewer OpenACC gangs may then be launched to the GPU hardware in parallel?
>>> (Of course, OpenACC semantics conformance firstly is more important than
>>> performance, but there may be ways to be conformant and performant;
>>> "quality of implementation".)  Have you run any such performance testing
>>> with the benchmarking codes that we've got set up?
>>>
>>> (As I'm more familiar with that, I'm using nvptx offloading examples in
>>> the following, whilst assuming that similar discussion may apply for GCN
>>> offloading, which uses similar hardware concepts, as far as I remember.)
>>
>> Yes, that could happen.
> 
> Thanks for sharing the GCN perspective.
> 
>> However, there's space for quite a lot of
>> scalars before performance is affected: 64KB of LDS memory shared by a
>> hardware-defined maximum of 40 threads
> 
> (Instead of threads, something like thread blocks, I suppose?)

Workers. Wavefronts. The terminology is so confusing for these cases! 
They look like CPU threads running SIMD instructions, at least on GCN. 
OpenMP calls them threads.

Each GCN compute unit can run up to 40 of them. A gang can have up to 16 
workers (in AMD terminology, a work group can have up to 16 wavefronts), so 
each compute unit will usually have at least two gangs, meaning each 
gang would get 32KB local memory. If there are no worker loops then you 
get 40 gangs (of one worker each) per compute unit, hence the minimum of 
1.5KB per gang.

The local memory is specific to the compute unit and gangs launched 
there will stay there until they're done, so the 40 gangs really is the 
limit for memory division. If you launch more gangs than there are 
resources then they get queued, so the memory doesn't get divided any more.
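
As a rough sketch of that arithmetic (the function names are made up for
illustration, and the constants are just the figures quoted above, not
anything read from the GCN back end): with 16 workers this gives 2 gangs
and 32768 bytes each, and with 1 worker it gives 40 gangs and 1638 bytes
each, which is the "about 1.5KB" minimum.

```c
#include <assert.h>

/* Figures as quoted in this thread for a GCN compute unit; assumed
   constants for illustration only.  */
enum
{
  LDS_BYTES_PER_CU = 64 * 1024,
  MAX_WAVEFRONTS_PER_CU = 40,
  MAX_WORKERS_PER_GANG = 16
};

/* Number of gangs that can be resident on one compute unit when each
   gang uses NUM_WORKERS wavefronts.  Hypothetical helper.  */
int
gangs_per_cu (int num_workers)
{
  assert (num_workers >= 1 && num_workers <= MAX_WORKERS_PER_GANG);
  return MAX_WAVEFRONTS_PER_CU / num_workers;
}

/* LDS bytes available to each resident gang, assuming an even split.  */
int
lds_bytes_per_gang (int num_workers)
{
  return LDS_BYTES_PER_CU / gangs_per_cu (num_workers);
}
```

Launching more gangs than gangs_per_cu () allows does not change the
result, since the extra gangs are queued rather than made resident.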

>> gives about 1.5KB of space for
>> worker-reduction variables and gang-private variables.
> 
> PTX, as I understand this, may generally have a lot of Thread Blocks in
> flight: all for the same GPU kernel as well as any GPU kernels running
> asynchronously/generally concurrently (system-wide), and libgomp does try
> launching a high number of Thread Blocks ('num_gangs') (for purposes of
> hiding memory access latency?).  Random example:
> 
>      nvptx_exec: kernel t0_r$_omp_fn$0: launch gangs=1920, workers=32, vectors=32
> 
> With that, PTX's 48 KiB of '.shared' memory per SM (processor) are then
> not so much anymore: just '48 * 1024 / 1920 = 25' bytes of gang-private
> memory available for each of the 1920 gangs: 'double x, y, z'?  (... for
> the simple case where just one GPU kernel is executing.)

Your maths feels way off to me. That's not enough memory for any use, 
and it's not the only resource that will be stretched thin: how many GPU 
registers does an SM have? (I doubt that register contents are getting 
paged in and out.)

For comparison, with the maximum num_workers(16) GCN can run only 2 
gangs on each compute unit. Each compute unit can run 40 gangs 
simultaneously with num_workers(1), but that is the limit. If you launch 
more gangs than that then they are queued; even if you launch 100,000 
single-worker gangs, each one will still get 1/40th of the resources.

I doubt that NVPTX is magically running 1920 gangs of 32 workers on one 
SM without any queueing and with the gang resources split 1920 ways (and 
the worker resources split 61440 ways).

> (I suppose that calculation is valid for a GPU hardware variant where
> there is just one SM.  If there are several (typically in the order of a
> few dozens?), I suppose the Thread Blocks launched will be distributed
> over all these, thus improving the situation correspondingly.)
> 
> (And of course, there are certainly other factors that also limit the
> number of Thread Blocks that are actually executing in parallel.)
> 
>> We might have a
>> problem if there are large private arrays.
> 
> Yes, that's understood.
> 
> Also, directly related, the problem that comes with supporting
> worker-private memory, which basically calculates to the amount necessary
> for gang-private memory multiplied by the number of workers?  (Out of
> scope at present.)

GCN just uses the stack space for that, which lives in main memory. 
That's a limited resource, of course, but it's not architectural. I don't 
know what NVPTX does here.

>> I believe we have a "good enough" solution for the usual case
> 
> So you believe that.  ;-)
> 
> It's certainly what I'd hope, too!  But we don't know yet whether there's
> any noticeable performance impact if we run with (potentially) lesser
> parallelism, hence my question whether this patch has been run through
> performance testing.

Well, indeed I don't know the comparative situation with benchmark 
results because the benchmarks couldn't run at full occupancy, on GCN, 
without it. The purpose of this patch was precisely to allow us to 
reduce the local memory allocation enough to increase occupancy for 
benchmarks that don't use worker loops.

>> and a
>> v2.0 full solution is going to be big and hairy enough for a whole patch
>> of its own (requiring per-gang dynamic allocation, a different memory
>> address space and possibly different instruction selection too).
> 
> Agree that a fully dynamic allocation scheme likely is going to be ugly,
> so I'd certainly like to avoid that.
> 
> Before attempting that, we'd first try to optimize gang-private memory
> allocation: so that it's function-local (and thus GPU kernel-local)
> instead of device-global (assuming that's indeed possible), and try not
> using gang-private memory in cases where it's not actually necessary
> (semantically not observable, and not necessary for performance reasons).

Global layout isn't ideal, but I don't know how we know how much to 
reserve otherwise? I suppose one would set the shared gang memory up as 
a stack, complete with a stack pointer in the ABI, which would allow 
recursion etc., but that would have other issues.

Andrew


* Re: [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory
  2021-04-18 22:53         ` Andrew Stubbs
@ 2021-04-19 11:06           ` Thomas Schwinge
  0 siblings, 0 replies; 24+ messages in thread
From: Thomas Schwinge @ 2021-04-19 11:06 UTC (permalink / raw)
  To: Andrew Stubbs, Julian Brown; +Cc: Jakub Jelinek, gcc-patches, Tom de Vries

Hi!

On 2021-04-18T23:53:01+0100, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 16/04/2021 18:30, Thomas Schwinge wrote:
>> On 2021-04-16T17:05:24+0100, Andrew Stubbs <ams@codesourcery.com> wrote:
>>> On 15/04/2021 18:26, Thomas Schwinge wrote:
>>>>> and optimisation, since shared memory might be faster than
>>>>> the main memory on a GPU.
>>>>
>>>> Do we potentially have a problem that making more use of (scarce)
>>>> gang-private memory may negatively affect performance, because potentially
>>>> fewer OpenACC gangs may then be launched to the GPU hardware in parallel?
>>>> (Of course, OpenACC semantics conformance firstly is more important than
>>>> performance, but there may be ways to be conformant and performant;
>>>> "quality of implementation".)  Have you run any such performance testing
>>>> with the benchmarking codes that we've got set up?
>>>>
>>>> (As I'm more familiar with that, I'm using nvptx offloading examples in
>>>> the following, whilst assuming that similar discussion may apply for GCN
>>>> offloading, which uses similar hardware concepts, as far as I remember.)
>>>
>>> Yes, that could happen.
>>
>> Thanks for sharing the GCN perspective.
>>
>>> However, there's space for quite a lot of
>>> scalars before performance is affected: 64KB of LDS memory shared by a
>>> hardware-defined maximum of 40 threads
>>
>> (Instead of threads, something like thread blocks, I suppose?)
>
> Workers. Wavefronts.

(ACK.)

> The terminology is so confusing for these cases!

Absolutely!  Everyone has their own, and slightly redefines the meaning of
certain words -- and then again uses different words for the same
things/concepts...

> They look like CPU threads running SIMD instructions, at least on GCN.
> OpenMP calls them threads.

Alright -- and in OpenACC (which is the context here), "a thread is any
one vector lane of one worker of one gang" (that is, any element of a GCN
SIMD instruction).

> Each GCN compute unit can run up to 40 of them. A gang can have up to 16
> workers (in AMD terminology, a work group can have up to 16 wavefronts), so
> each compute unit will usually have at least two gangs, meaning each
> gang would get 32KB local memory. If there are no worker loops then you
> get 40 gangs (of one worker each) per compute unit, hence the minimum of
> 1.5KB per gang.
>
> The local memory is specific to the compute unit and gangs launched
> there will stay there until they're done, so the 40 gangs really is the
> limit for memory division. If you launch more gangs than there are
> resources then they get queued, so the memory doesn't get divided any more.
>
>>> gives about 1.5KB of space for
>>> worker-reduction variables and gang-private variables.
>>
>> PTX, as I understand this, may generally have a lot of Thread Blocks in
>> flight: all for the same GPU kernel as well as any GPU kernels running
>> asynchronously/generally concurrently (system-wide), and libgomp does try
>> launching a high number of Thread Blocks ('num_gangs') (for purposes of
>> hiding memory access latency?).  Random example:
>>
>>      nvptx_exec: kernel t0_r$_omp_fn$0: launch gangs=1920, workers=32, vectors=32
>>
>> With that, PTX's 48 KiB of '.shared' memory per SM (processor) are then
>> not so much anymore: just '48 * 1024 / 1920 = 25' bytes of gang-private
>> memory available for each of the 1920 gangs: 'double x, y, z'?  (... for
>> the simple case where just one GPU kernel is executing.)
>
> Your maths feels way off to me. That's not enough memory for any use,
> and it's not the only resource that will be stretched thin:

Might be way off, yes.  I did mention "other [limiting] factors" later
on, and:

According to the documentation that I'd pointed to, CC 3.5 may have
"Maximum number of resident blocks per SM": "16".

(Aha, and if, for example, we assume there are 80 SMs, then libgomp
launching 1920 gangs means '1920 / 80 = 24' Thread Blocks per SM -- which
seems reasonable.)

What I don't know is whether "resident" means scheduled/executing and the
same applies to the '.shared' memory allocation -- or whether the two
parts are separate (thus you can occupy '.shared' memory without having
it used via execution).  If we assume that allocation and execution are
done in one step, and there is no pre-emption once launched, that indeed
simplifies the considerations quite a bit.

We'd then have a decent '48 * 1024 / 16 = 3072' bytes of gang-private
memory available for each of the 16 "resident" gangs (per SM).
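
To make the numbers above easy to re-check, here is a small sketch of the
same integer arithmetic.  The helper names are hypothetical, the 80 SMs
are the assumed figure from above, and the model (even distribution over
SMs, '.shared' memory split evenly between resident blocks) is an
assumption rather than documented Nvidia behavior.

```c
#include <assert.h>

/* Thread Blocks each SM would get if the launch were spread evenly.  */
int
blocks_per_sm (int num_gangs, int num_sms)
{
  assert (num_sms > 0);
  return num_gangs / num_sms;
}

/* Blocks actually resident on one SM: capped by the per-SM limit, the
   remainder being assumed to wait in a queue.  */
int
resident_blocks_per_sm (int num_gangs, int num_sms, int max_resident)
{
  int wanted = blocks_per_sm (num_gangs, num_sms);
  return wanted < max_resident ? wanted : max_resident;
}

/* '.shared' bytes per block if the SM's memory is split evenly.  */
int
shared_bytes_per_block (int shared_bytes_per_sm, int blocks)
{
  assert (blocks > 0);
  return shared_bytes_per_sm / blocks;
}
```

With 1920 gangs on 80 SMs, blocks_per_sm () gives 24, the cap of 16
resident blocks applies, and shared_bytes_per_block (48 * 1024, 16) gives
the 3072 bytes mentioned above; dividing by 1920 instead reproduces the
pessimistic 25 bytes from my earlier email.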

> how many GPU
> registers does an SM have?

"Number of 32-bit registers per SM": "64 K", and with "Maximum number of
resident threads per SM": "2048", that means '64 K / 2048 = 32' registers
in this configuration vs. "Maximum number of 32-bit registers per
thread": "255" with correspondingly reduced occupancy.
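
The register trade-off can be sketched the same way.  Again the function
names are made up, and the model ignores any allocation granularity the
hardware may impose, so treat the results as rough upper bounds only.

```c
#include <assert.h>

/* Registers available to each thread when RESIDENT_THREADS threads share
   an SM's register file evenly.  */
int
registers_per_thread (int regs_per_sm, int resident_threads)
{
  assert (resident_threads > 0);
  return regs_per_sm / resident_threads;
}

/* Threads that can be resident when each one needs REGS_PER_THREAD
   registers, capped by the hardware thread limit.  */
int
max_resident_threads (int regs_per_sm, int regs_per_thread,
		      int hw_thread_limit)
{
  assert (regs_per_thread > 0);
  int by_regs = regs_per_sm / regs_per_thread;
  return by_regs < hw_thread_limit ? by_regs : hw_thread_limit;
}
```

For the figures quoted, registers_per_thread (64 * 1024, 2048) is 32,
whereas a kernel using 255 registers per thread would be limited to
max_resident_threads (64 * 1024, 255, 2048), that is 257 threads.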

> (I doubt that register contents are getting
> paged in and out.)

(Again, I have not looked up to what extent Nvidia GPUs/Driver are doing
any such things.)

> For comparison, with the maximum num_workers(16) GCN can run only 2
> gangs on each compute unit. Each compute unit can run 40 gangs
> simultaneously with num_workers(1), but that is the limit. If you launch
> more gangs than that then they are queued; even if you launch 100,000
> single-worker gangs, each one will still get 1/40th of the resources.
>
> I doubt that NVPTX is magically running 1920 gangs of 32 workers on one
> SM without any queueing and with the gang resources split 1920 ways (and
> the worker resources split 61440 ways).

No, indeed.  As I'd said:

>> (I suppose that calculation is valid for a GPU hardware variant where
>> there is just one SM.  If there are several (typically in the order of a
>> few dozens?), I suppose the Thread Blocks launched will be distributed
>> over all these, thus improving the situation correspondingly.)
>>
>> (And of course, there are certainly other factors that also limit the
>> number of Thread Blocks that are actually executing in parallel.)


>>> We might have a
>>> problem if there are large private arrays.
>>
>> Yes, that's understood.
>>
>> Also, directly related, the problem that comes with supporting
>> worker-private memory, which basically calculates to the amount necessary
>> for gang-private memory multiplied by the number of workers?  (Out of
>> scope at present.)
>
> GCN just uses the stack space for that, which lives in main memory.
> That's a limited resource, of course, but it's not architectural. I don't
> know what NVPTX does here.

Per my understanding, neither GCN nor nvptx supports OpenACC
worker-private memory yet.


>>> I believe we have a "good enough" solution for the usual case
>>
>> So you believe that.  ;-)
>>
>> It's certainly what I'd hope, too!  But we don't know yet whether there's
>> any noticeable performance impact if we run with (potentially) lesser
>> parallelism, hence my question whether this patch has been run through
>> performance testing.
>
> Well, indeed I don't know the comparative situation with benchmark
> results because the benchmarks couldn't run at full occupancy, on GCN,
> without it. The purpose of this patch was precisely to allow us to
> reduce the local memory allocation enough to increase occupancy for
> benchmarks that don't use worker loops.

ACK, that's the GCN perspective.  But for nvptx, we ought to be careful
not to regress existing functionality/performance.

Plus, we all agree, the proposed code changes do improve certain aspects
of OpenACC specification conformance: the concept of gang-private memory.


>>> and a
>>> v2.0 full solution is going to be big and hairy enough for a whole patch
>>> of its own (requiring per-gang dynamic allocation, a different memory
>>> address space and possibly different instruction selection too).
>>
>> Agree that a fully dynamic allocation scheme likely is going to be ugly,
>> so I'd certainly like to avoid that.
>>
>> Before attempting that, we'd first try to optimize gang-private memory
>> allocation: so that it's function-local (and thus GPU kernel-local)
>> instead of device-global (assuming that's indeed possible), and try not
>> using gang-private memory in cases where it's not actually necessary
>> (semantically not observable, and not necessary for performance reasons).
>
> Global layout isn't ideal, but I don't know how we know how much to
> reserve otherwise? I suppose one would set the shared gang memory up as
> a stack, complete with a stack pointer in the ABI, which would allow
> recursion etc., but that would have other issues.

Due to lack of in-depth knowledge, I haven't made an attempt to reason
about how to implement that on GCN, but for nvptx there certainly is
evidence of '.shared' memory allocation per function, building a complete
call graph from the GPU kernel entry point onwards, and thus '.shared'
memory allocation per each individual GPU kernel launch.

(Yet, again, I'm totally fine to defer all these things for later --
unless the nvptx performance testing numbers mandate otherwise.)


Grüße
 Thomas


* Re: [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory
  2021-04-15 17:26   ` Thomas Schwinge
  2021-04-16 16:05     ` Andrew Stubbs
@ 2021-04-19 11:23     ` Julian Brown
  2021-05-21 18:55       ` Thomas Schwinge
                         ` (3 more replies)
  1 sibling, 4 replies; 24+ messages in thread
From: Julian Brown @ 2021-04-19 11:23 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: gcc-patches, Jakub Jelinek, Tom de Vries, Chung-Lin Tang

Hi,

(Chung-Lin, question for you buried below.)

On Thu, 15 Apr 2021 19:26:54 +0200
Thomas Schwinge <thomas@codesourcery.com> wrote:

> Hi!
> 
> On 2021-02-26T04:34:50-0800, Julian Brown <julian@codesourcery.com>
> wrote:
> > This patch  
> 
> Thanks, Julian, for your continued improving of these changes!

You're welcome!

> This has iterated through several conceptually different designs and
> implementations, by several people, over the past several years.

I hope this wasn't a hint that I'd failed to attribute the authorship of
the patch properly? Many apologies if so, that certainly wasn't my
intention!

> > implements a method to track the "private-ness" of
> > OpenACC variables declared in offload regions in gang-partitioned,
> > worker-partitioned or vector-partitioned modes. Variables declared
> > implicitly in scoped blocks and those declared "private" on
> > enclosing directives (e.g. "acc parallel") are both handled.
> > Variables that are e.g. gang-private can then be adjusted so they
> > reside in GPU shared memory.
> >
> > The reason for doing this is twofold: correct implementation of
> > OpenACC semantics  
> 
> ACK, and as mentioned before, this very much relates to
> <https://gcc.gnu.org/PR90115> "OpenACC: predetermined private levels
> for variables declared in blocks" (plus the corresponding use of
> 'private' clauses, implicit/explicit, including 'firstprivate') and
> <https://gcc.gnu.org/PR90114> "Predetermined private levels for
> variables declared in OpenACC accelerator routines", which we thus
> should refer to in testcases/ChangeLog/commit log, as appropriate.  I do
> understand we're not yet addressing all of that (and that's fine!),
> but we should capture remaining work items of the PRs and Cesar's
> list in
> <http://mid.mail-archive.com/70d27ebd-762e-59a3-082f-48fa0c687212@codesourcery.com>,
> as appropriate.

From that list:

>  * Currently variables in private clauses inside acc loops will not
>    utilize shared memory.

The patch should handle this properly now.

>  * OpenACC routines don't use shared memory, except for reductions and
>    worker state propagation.

Routines weren't a focus of this patch (at the point I inherited it),
and I did not attempt to extend it to cover routines either. TBH the
state there is a bit of an unknown (but the patch won't make the
situation any worse).

>  * Variables local to worker loops don't use shared memory.

That's still true, and IIUC for that to work we'd need to expand
scalars into indexed array references (i.e. "var" ->
"var_arr[vector_lane]" or similar). It's not clear if/when/why we'd
want to do that.

As an aside, if we want to avoid shared memory for some reason but want
to maintain OpenACC semantics, we'd also have to do a similar
transformation for gang-private variables ("var" ->
"var[gang_number]", where the array is on the stack or in global
memory, or similar). Then for worker-private variables we need to do
"var" -> "var[gang_number * num_workers + worker_number]". We've
avoided needing to do that so far, but for some cases -- maybe large
local private arrays? -- it might be necessary, at some point.
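
To illustrate the indexing only (this is not what the patch does, and the
helper names are hypothetical), the two transformations would address a
backing array roughly like this:

```c
#include <assert.h>
#include <stddef.h>

/* Slot for a gang-private "var" expanded to "var[gang_number]".  */
size_t
gang_private_index (size_t gang_number)
{
  return gang_number;
}

/* Slot for a worker-private "var" expanded to
   "var[gang_number * num_workers + worker_number]".  */
size_t
worker_private_index (size_t gang_number, size_t num_workers,
		      size_t worker_number)
{
  assert (worker_number < num_workers);
  return gang_number * num_workers + worker_number;
}
```

The backing array would then need num_gangs elements in the first case
and num_gangs * num_workers elements in the second, e.g. 1024 slots for
32 gangs of 32 workers, with the last worker of the last gang at index
1023.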

>  * Variables local to automatically partitioned gang and worker loops
>    don't use shared memory.

Local variables in automatically-partitioned gang loops should work fine
now.

>  * Shared memory is allocated globally, not locally on a per-function
>    basis. We're not sure if that matters though.

Arguably, that's down to the target, not this middle-end patch -- this
patch itself might not *help* do per-function allocation, but it
doesn't set a policy that allocation must be global either.

> I was surprised that we didn't really have to fix up any existing
> libgomp testcases, because there seem to be quite some that contain a
> pattern (exemplified by the 'tmp' variable) as follows:
> 
>     int main()
>     {
>     #define N 123
>       int data[N];
>       int tmp;
>     
>     #pragma acc parallel // implicit 'firstprivate(tmp)'
>       {
>         // 'tmp' now conceptually made gang-private here.
>     #pragma acc loop gang
>         for (int i = 0; i < 123; ++i)
>           {
>             tmp = i + 234;
>             data[i] = tmp;
>           }
>       }
>     
>       for (int i = 0; i < 123; ++i)
>         if (data[i] != i + 234)
>           __builtin_abort ();
>       
>       return 0;
>     }
> 
> With the code changes as posted, this actually now does *not* use
> gang-private memory for 'tmp', but instead continues to use
> "thread-private registers", as before.

When "tmp" is a local, non-address-taken scalar like that, it'll
probably end up in a register in offloaded code (or of course be
compiled out completely), both before and after this patch. So I'd
expect this to work in the pre-patch state too.

> Same for:
> 
>     --- s3.c	2021-04-13 17:26:49.628739379 +0200
>     +++ s3_2.c	2021-04-13 17:29:43.484579664 +0200
>     @@ -4,6 +4,6 @@
>        int data[N];
>     -  int tmp;
>      
>     -#pragma acc parallel // implicit 'firstprivate(tmp)'
>     +#pragma acc parallel
>        {
>     +    int tmp;
>          // 'tmp' now conceptually made gang-private here.
>      #pragma acc loop gang
> 
> I suppose that's due to conditionalizing this transformation on
> 'TREE_ADDRESSABLE' (as you're doing), so we should be mostly "safe"
> regarding such existing testcases (but I haven't verified that yet in
> detail).

Right.
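
For contrast, a case that should be affected is one where the private
variable does get its address taken.  The sketch below is a C analogue of
the gangprivate-attrib-1.f90 test in this series; the function name
count_per_gang is made up, and it is an assumption (not verified here)
that the atomic update is what makes "w" addressable and hence a
candidate for gang-private storage.

```c
#include <assert.h>

/* Each gang iteration counts to 32 in its own private copy of 'w'.
   Built without -fopenacc the pragmas are ignored and the loops simply
   run sequentially on the host, with the same result.  */
void
count_per_gang (int arr[32])
{
  int w;

#pragma acc parallel num_gangs(32) num_workers(32) copyout(arr[0:32])
  {
#pragma acc loop gang private(w)
    for (int j = 0; j < 32; j++)
      {
	w = 0;
#pragma acc loop seq
	for (int i = 0; i < 32; i++)
	  {
#pragma acc atomic update
	    w = w + 1;
	  }
	arr[j] = w;
      }
  }

  /* Same check as the Fortran test's 'stop 1'.  */
  for (int j = 0; j < 32; j++)
    assert (arr[j] == 32);
}
```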

> That needs to be documented in testcases, with some kind of dump
> scanning (host compilation-side even; see below).
> 
> A note for later: if this weren't just a 'gang' loop, but 'gang' plus
> 'worker' and/or 'vector', we'd actually be fixing up user code with
> undefined behavior into "correct" code (by *not* making 'tmp'
> gang-private, but thread-private), right?

Possibly -- coming up with a case like that might need a little
"ingenuity"...

> As that may not be obvious to the reader, I'd like to have the
> 'TREE_ADDRESSABLE' conditionalization be documented in the code.  You
> had explained that in
> <http://mid.mail-archive.com/20190612204216.0ec83e4e@squid.athome>: "a
> non-addressable variable [...]".

Yeah that probably makes sense.

> > and optimisation, since shared memory might be faster than
> > the main memory on a GPU.  
> 
> Do we potentially have a problem that making more use of (scarce)
> gang-private memory may negatively affect performance, because
> potentially fewer OpenACC gangs may then be launched to the GPU
> hardware in parallel? (Of course, OpenACC semantics conformance
> firstly is more important than performance, but there may be ways to
> be conformant and performant; "quality of implementation".)  Have you
> run any such performance testing with the benchmarking codes that
> we've got set up?

I don't have any numbers for this patch, no. As for whether there are
constructs that are currently compiled in a semantically-correct way
but that this patch pessimises -- I'm not aware of anything like that,
but there might be.

> (As I'm more familiar with that, I'm using nvptx offloading examples
> in the following, whilst assuming that similar discussion may apply
> for GCN offloading, which uses similar hardware concepts, as far as I
> remember.)
> 
> Looking at the existing
> 'libgomp.oacc-c-c++-common/private-variables.c' (random example), for
> nvptx offloading, '-O0', we see the following PTX JIT compilation
> changes (word-'diff' of 'GOMP_DEBUG=1' at run-time):
> 
>     info    : Function properties for 'local_g_1$_omp_fn$0':
>     info    : used 27 registers, 32 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
>     info    : Function properties for 'local_w_1$_omp_fn$0':
>     info    : used 40 registers, 48 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
>     info    : Function properties for 'local_w_2$_omp_fn$0':
>     [...]
>     info    : Function properties for 'parallel_g_1$_omp_fn$0':
>     info    : used 27 registers, 32 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
>     info    : Function properties for 'parallel_g_2$_omp_fn$0':
>     info    : used 32 registers, 160 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
> 
> ... that is, PTX '.shared' usage increases from 176 to 256 bytes for
> *all* functions, even though only 'loop_g_4$_omp_fn$0' and
> 'loop_g_5$_omp_fn$0' are actually using gang-private memory.
> 
> Execution testing works before (original code, not using gang-private
> memory) as well as after (code changes as posted, using gang-private
> memory), so use of gang-private memory doesn't seem necessary here for
> "correct execution" -- or at least: "expected execution result".  ;-)
> I haven't looked yet whether there's a potential issue in the
> testcases here.
> 
> The additional '256 - 176 = 80' bytes of PTX '.shared' memory
> requested are due to GCC nvptx back end implementation's use of a
> global "Shared memory block for gang-private variables":
> 
>      // BEGIN VAR DEF: __oacc_bcast
>      .shared .align 8 .u8 __oacc_bcast[176];
>     +// BEGIN VAR DEF: __gangprivate_shared
>     +.shared .align 32 .u8 __gangprivate_shared[64];
> 
> ..., plus (I suppose) an additional '80 - 64 = 16' padding/unused
> bytes to establish '.align 32' after '.align 8' for '__oacc_bcast'.
> 
> Per
> <https://docs.nvidia.com/cuda/cuda-c-programming-guide/#compute-capabilities>,
> "Table 15. Technical Specifications per Compute Capability", "Compute
> Capability": "3.5", we have a "Maximum amount of shared memory per
> SM": "48 KB", so with '176 bytes smem', that permits '48 * 1024 / 176
> = 279' thread blocks ('num_gangs') resident at one point in time,
> whereas with '256 bytes smem', it's just '48 * 1024 / 256 = 192'
> thread blocks resident at one point in time.  (Not sure that I got
> all the details right, but you get the idea/concern?)
> 
> Anyway, that shall be OK for now, but we shall later look into
> optimizing that; can't we have '.shared' local to the relevant PTX
> functions instead of global?

As mentioned in a previous posting (probably some time ago!), the NVPTX
backend parts are a bit of the patch that I inherited from its earliest
versions and didn't alter much. The possibility of function-local
allocation has been raised before (for NVPTX), but I haven't
investigated whether it's possible or beneficial.
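
For reference, the residency estimate quoted above can be written down
as a tiny helper. This is only a sketch of that back-of-the-envelope
calculation: 'max_resident_blocks' is a made-up name, and it treats
shared memory as the only limit, whereas real hardware also caps
residency by registers, threads and a maximum block count per SM.

```c
/* Hypothetical helper, not part of GCC or CUDA: upper bound on thread
   blocks (gangs) resident on one SM if shared memory were the only
   limiting resource.  Both sizes are in bytes.  */
unsigned
max_resident_blocks (unsigned smem_per_sm, unsigned smem_per_block)
{
  return smem_per_sm / smem_per_block;
}
```

With 48 KB per SM, 'max_resident_blocks (48 * 1024, 176)' gives 279 and
'max_resident_blocks (48 * 1024, 256)' gives 192, matching the figures
in the quoted text.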

> Interestingly, compiling with '-O2', we see:
> 
>     // BEGIN VAR DEF: __oacc_bcast
>     .shared .align 8 .u8 __oacc_bcast[144];
>     {+// BEGIN VAR DEF: __gangprivate_shared+}
>     {+.shared .align 128 .u8 __gangprivate_shared[32];+}
> 
> With '-O2', only 'loop_g_5$_omp_fn$0' is using gang-private memory,
> and apparently the PTX JIT is able to figure that out from the PTX
> code that GCC generates, and is then able to localize '.shared'
> memory usage to just 'loop_g_5$_omp_fn$0':
> 
>     [...]
>     info    : Function properties for 'loop_g_4$_omp_fn$0':
>     info    : used 12 registers, 0 stack, 144 bytes smem, 328 bytes cmem[0], 0 bytes lmem
>     info    : Function properties for 'loop_g_5$_omp_fn$0':
>     info    : used [-30-]{+32+} registers, 32 stack, [-144-]{+288+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
>     info    : Function properties for 'loop_g_6$_omp_fn$0':
>     info    : used 13 registers, 0 stack, 144 bytes smem, 328 bytes cmem[0], 0 bytes lmem
>     [...]
> 
> This strongly suggests to me that indeed there must exist a
> programmatic way to get rid of the global "Shared memory block for
> gang-private variables".
> 
> The additional '288 - 144 = 144' bytes of PTX '.shared' memory
> requested are 32 bytes for 'int x[8]' ('#pragma acc loop gang
> private(x)') plus '288 - 32 - 144 = 112' padding/unused bytes to
> establish '.align 128' (!) after '.align 8' for '__oacc_bcast'.
> That's clearly not ideal: 112 bytes wasted in contrast to just '144 +
> 32 = 176' bytes actually used.  (I have not yet looked why/whether
> this really needs '.align 128'.)

I'm sure improvements are possible there (maybe later?).
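
The padding figures in the quoted text are consistent with the
gang-private block simply being placed at the next suitably-aligned
offset after '__oacc_bcast'. A sketch of that arithmetic follows; the
layout is an assumption (it is what the numbers suggest, not something
I've checked in the PTX JIT), and 'align_up'/'smem_total' are made-up
names.

```c
/* Round OFFSET up to the next multiple of ALIGN.  ALIGN must be a
   power of two.  */
unsigned
align_up (unsigned offset, unsigned align)
{
  return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical model: total '.shared' bytes if a gang-private block of
   GP_SIZE bytes with alignment GP_ALIGN is laid out directly after a
   broadcast buffer of BCAST_SIZE bytes.  */
unsigned
smem_total (unsigned bcast_size, unsigned gp_align, unsigned gp_size)
{
  return align_up (bcast_size, gp_align) + gp_size;
}
```

For the '-O0' case, 'smem_total (176, 32, 64)' is 256 (16 bytes of
padding); for the '-O2' case, 'smem_total (144, 128, 32)' is 288 (112
bytes of padding). So it's the '.align 128' that costs most of the
extra space in the latter.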

> I have not yet looked whether similar concerns exist for the GCC GCN
> back end implementation.  (That one also does set 'TREE_STATIC' for
> gang-private memory, so it's a global allocation?)

Yes, or rather per-CU allocation.

> > Handling of private variables is intimately
> > tied to the execution model for gangs/workers/vectors implemented by
> > a particular target: for current targets, we use (or on mainline,
> > will soon use) a broadcasting/neutering scheme.
> >
> > That is sufficient for code that e.g. sets a variable in
> > worker-single mode and expects to use the value in
> > worker-partitioned mode. The difficulty (semantics-wise) comes when
> > the user wants to do something like an atomic operation in
> > worker-partitioned mode and expects a worker-single (gang private)
> > variable to be shared across each partitioned worker. Forcing use
> > of shared memory for such variables makes that work properly.  
> 
> Are we reliably making sure that gang-private variables (and other
> levels, in general) are not subject to the usual broadcasting scheme
> (nvptx, at least), or does that currently work "by accident"?  (I
> haven't looked into that, yet.)

Yes, that case is explicitly handled by the broadcasting/neutering patch
recently posted. (One of the reasons that patch depends on this one.)
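
For anyone following along, the problematic pattern described in the
quoted text looks roughly like the sketch below: a gang-private counter
updated atomically from a worker-partitioned loop. This is written from
memory as an illustration rather than copied from the testsuite, and
'count_per_gang' is a made-up name. Without -fopenacc the pragmas are
ignored and it runs as plain host code.

```c
#define N 32

/* Hypothetical illustration: each gang counts the iterations of its
   worker loop in a gang-private variable.  Returns the number of
   gangs that ended up with the wrong count.  */
int
count_per_gang (int *arr)
{
#pragma acc parallel num_gangs(32) num_workers(32) copyout(arr[0:N])
  {
#pragma acc loop gang
    for (int j = 0; j < N; j++)
      {
        int w = 0;  /* Worker-single here, so private to the gang.  */
#pragma acc loop worker
        for (int i = 0; i < N; i++)
          {
            /* All workers of the gang must update the same 'w'.  */
#pragma acc atomic update
            w++;
          }
        arr[j] = w;
      }
  }

  int errors = 0;
  for (int j = 0; j < N; j++)
    if (arr[j] != N)
      ++errors;
  return errors;
}
```

If each worker had its own register copy of 'w', the atomic updates
would not accumulate in one place; keeping 'w' in memory shared by the
whole gang is what makes every 'arr[j]' come out as 32.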

> > In terms of implementation, the parallelism level of a given loop is
> > not fixed until the oaccdevlow pass in the offload compiler, so the
> > patch delays fixing the parallelism level of variables declared on
> > or within such loops until the same point. This is done by adding a
> > new internal UNIQUE function (OACC_PRIVATE) that lists (the address
> > of) each private variable as an argument, and other arguments set
> > so as to be able to determine the correct parallelism level to use
> > for the listed variables. This new internal function fits into the
> > existing scheme for demarcating OpenACC loops, as described in
> > comments in the patch.  
> 
> Yes, thanks, that's conceptually now much better than the earlier
> variants that we had.  :-) (Hooray, again, for Nathan's OpenACC
> execution model design!)
> 
> What we should add, though, is a bunch of testcases to verify that the
> expected processing does/doesn't happen for relevant source code
> constructs.  I'm thinking that when the transformation is/isn't done,
> that gets logged, and we can then scan the dumps accordingly.  Some of
> that is implemented already; we should be able to do such scanning
> generally for host compilation, too, not just offloading compilation.

More test coverage is always welcome, of course.

> > Two new target hooks are introduced:
> > TARGET_GOACC_ADJUST_PRIVATE_DECL and TARGET_GOACC_EXPAND_VAR_DECL.
> > The first can tweak a variable declaration at oaccdevlow time, and
> > the second at expand time.  The first or both of these target hooks
> > can be used by a given offload target, depending on its strategy
> > for implementing private variables.  
> 
> ACK.
> 
> So, currently we're only looking at making the gang-private level
> work. Regarding that, we have two configurations: (1) for GCN
> offloading, 'targetm.goacc.adjust_private_decl' does the work (in
> particular, change 'TREE_TYPE' etc.) and there is no
> 'targetm.goacc.expand_var_decl', and (2) for nvptx offloading,
> 'targetm.goacc.adjust_private_decl' only sets a marker ('oacc
> gangprivate' attribute) and then 'targetm.goacc.expand_var_decl' does
> the work.
> 
> Therefore I suggest we clarify the (currently) expected handling
> similar to:
> 
>     --- gcc/omp-offload.c
>     +++ gcc/omp-offload.c
>     @@ -1854,6 +1854,19 @@ oacc_rewrite_var_decl (tree *tp, int *walk_subtrees, void *data)
>        return NULL_TREE;
>      }
>      
>     +static tree
>     +oacc_rewrite_var_decl_ (tree *tp, int *walk_subtrees, void *data)
>     +{
>     +  tree t = oacc_rewrite_var_decl (tp, walk_subtrees, data);
>     +  if (targetm.goacc.expand_var_decl)
>     +    {
>     +      walk_stmt_info *wi = (walk_stmt_info *) data;
>     +      var_decl_rewrite_info *info = (var_decl_rewrite_info *) wi->info;
>     +      gcc_assert (!info->modified);
>     +    }
>     +  return t;
>     +}

Why the ugly _ tail on the function name!? I don't think that's a
typical GNU coding standards thing, is it?

>     +
>      /* Return TRUE if CALL is a call to a builtin atomic/sync operation.  */
>      
>      static bool
>     @@ -2195,6 +2208,9 @@ execute_oacc_device_lower ()
>           COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
>           the new decl, adjusting types of appropriate tree nodes as necessary.  */
>      
>     +  if (targetm.goacc.expand_var_decl)
>     +    gcc_assert (adjusted_vars.is_empty ());

If you like -- or do something like

>        if (targetm.goacc.adjust_private_decl
             && !adjusted_vars.is_empty ())

perhaps.

>          {
>            FOR_ALL_BB_FN (bb, cfun)
>     @@ -2217,7 +2233,7 @@ execute_oacc_device_lower ()
>                 memset (&wi, 0, sizeof (wi));
>                 wi.info = &info;
>      
>     -           walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
>     +           walk_gimple_op (stmt, oacc_rewrite_var_decl_, &wi);
>      
>                 if (info.modified)
>                   update_stmt (stmt);
> 
> Or, in fact, 'if (targetm.goacc.expand_var_decl)', skip the
> 'adjusted_vars' handling completely?

For the current pair of implementations, sure. I don't think it's
necessary to set that as a constraint for future targets though? I
guess it doesn't matter much until such a target exists.

> I do understand that eventually (in particular, for worker-private
> level?), both 'targetm.goacc.adjust_private_decl' and
> 'targetm.goacc.expand_var_decl' may need to do things, but that's
> currently not meant to be addressed, and thus not fully worked out and
> implemented, and thus untested.  Hence, 'assert' what currently is
> implemented/tested, only.

If you like, no strong feelings from me on that.

> (Given that eventual goal, that's probably sufficient motivation to
> indeed add the 'adjusted_vars' handling in generic 'gcc/omp-offload.c'
> instead of moving it into the GCN back end?)

I'm not sure what moving it to the GCN back end would look like. I
guess it's a question of keeping the right abstractions in the right
place.

> For 'libgomp.oacc-c-c++-common/static-variable-1.c' that I've recently
> added, the code changes here cause execution test FAILs for nvptx
> offloading (because of making 'static' variables gang-private), and
> trigger an ICE with GCN offloading compilation.  It isn't clear to me
> what the desired semantics are for (user-specified) 'static'
> variables -- see <https://github.com/OpenACC/openacc-spec/issues/372>
> "C/C++ 'static' variables" (only visible to members of the GitHub
> OpenACC organization) -- but an ICE clearly isn't the right answer.
> ;-)
> 
> As for certain transformation/optimizations, 'static' variables may be
> synthesized in the GCC middle end, I suppose we should preserve the
> status quo (as documented via
> 'libgomp.oacc-c-c++-common/static-variable-1.c') until #372 gets
> resolved in OpenACC?  (I suppose, skip the transformation if
> 'TREE_STATIC' is set, or similar.)

ICEs are bad -- but a user expecting static variables to do something
meaningful in offloaded code is being somewhat optimistic, I think!

> > --- a/gcc/expr.c
> > +++ b/gcc/expr.c
> > @@ -10224,8 +10224,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> >        exp = SSA_NAME_VAR (ssa_name);
> >        goto expand_decl_rtl;
> >  
> > -    case PARM_DECL:
> >      case VAR_DECL:
> > +      /* Allow accel compiler to handle variables that require special
> > +	 treatment, e.g. if they have been modified in some way earlier in
> > +	 compilation by the adjust_private_decl OpenACC hook.  */
> > +      if (flag_openacc && targetm.goacc.expand_var_decl)
> > +	{
> > +	  temp = targetm.goacc.expand_var_decl (exp);
> > +	  if (temp)
> > +	    return temp;
> > +	}
> > +      /* ... fall through ...  */
> > +
> > +    case PARM_DECL:  
> 
> [TS] Are we sure that we don't need the same handling for a
> 'PARM_DECL', too?  (If yes, to document and verify that, should we
> thus again unify the two 'case's, and in
> 'targetm.goacc.expand_var_decl' add a 'gcc_checking_assert (TREE_CODE
> (var) == VAR_DECL)'?)

Maybe for routines? Those bits date from the earliest version of the
patch and (same excuse again) I didn't have call to revisit those
decisions.

> Also, are we sure that all the following existing processing is not
> relevant to do before the 'return temp' (see above)?  That's not a
> concern for GCN (which doesn't use 'targetm.goacc.expand_var_decl',
> and thus does execute all this following existing processing), but it
> is for nvptx (which does use 'targetm.goacc.expand_var_decl', and
> thus doesn't execute all this following existing processing if that
> returned something).  Or, is 'targetm.goacc.expand_var_decl'
> conceptually and practically meant to implement all of the following
> processing, or is this for other reasons not relevant in the
> 'targetm.goacc.expand_var_decl' case:
> 
> >        /* If a static var's type was incomplete when the decl was written,
> >           but the type is complete now, lay out the decl now.  */
> >        if (DECL_SIZE (exp) == 0  
> |            && COMPLETE_OR_UNBOUND_ARRAY_TYPE_P (TREE_TYPE (exp))
> |            && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
> |          layout_decl (exp, 0);
> |  
> |        /* fall through */
> |  
> |      case FUNCTION_DECL:
> |      case RESULT_DECL:
> |        decl_rtl = DECL_RTL (exp);
> |      expand_decl_rtl:
> |        gcc_assert (decl_rtl);
> |  
> |        /* DECL_MODE might change when TYPE_MODE depends on attribute target
> |           settings for VECTOR_TYPE_P that might switch for the function.  */
> |        if (currently_expanding_to_rtl
> |            && code == VAR_DECL && MEM_P (decl_rtl)
> |            && VECTOR_TYPE_P (type) && exp && DECL_MODE (exp) != mode)
> |          decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0);
> |        else
> |          decl_rtl = copy_rtx (decl_rtl);
> |  
> |        /* Record writes to register variables.  */
> |        if (modifier == EXPAND_WRITE
> |            && REG_P (decl_rtl)
> |            && HARD_REGISTER_P (decl_rtl))
> |          add_to_hard_reg_set (&crtl->asm_clobbers,
> |                               GET_MODE (decl_rtl), REGNO (decl_rtl));
> |  
> |        /* Ensure variable marked as used even if it doesn't go through
> |           a parser.  If it hasn't be used yet, write out an external
> |           definition.  */
> |        if (exp)
> |          TREE_USED (exp) = 1;
> |  
> |        /* Show we haven't gotten RTL for this yet.  */
> |        temp = 0;
> |  
> |        /* Variables inherited from containing functions should have
> |           been lowered by this point.  */
> |        if (exp)
> |          context = decl_function_context (exp);
> |        gcc_assert (!exp
> |                    || SCOPE_FILE_SCOPE_P (context)
> |                    || context == current_function_decl
> |                    || TREE_STATIC (exp)
> |                    || DECL_EXTERNAL (exp)
> |                    /* ??? C++ creates functions that are not TREE_STATIC.  */
> |                    || TREE_CODE (exp) == FUNCTION_DECL);
> |  
> |        /* This is the case of an array whose size is to be determined
> |           from its initializer, while the initializer is still being parsed.
> |           ??? We aren't parsing while expanding anymore.  */
> |  
> |        if (MEM_P (decl_rtl) && REG_P (XEXP (decl_rtl, 0)))
> |          temp = validize_mem (decl_rtl);
> |  
> |        /* If DECL_RTL is memory, we are in the normal case and the
> |           address is not valid, get the address into a register.  */
> |  
> |        else if (MEM_P (decl_rtl) && modifier != EXPAND_INITIALIZER)
> |          {
> |            if (alt_rtl)
> |              *alt_rtl = decl_rtl;
> |            decl_rtl = use_anchored_address (decl_rtl);
> |            if (modifier != EXPAND_CONST_ADDRESS
> |                && modifier != EXPAND_SUM
> |                && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
> |                                                 : GET_MODE (decl_rtl),
> |                                                 XEXP (decl_rtl, 0),
> |                                                 MEM_ADDR_SPACE (decl_rtl)))
> |              temp = replace_equiv_address (decl_rtl,
> |                                            copy_rtx (XEXP (decl_rtl, 0)));
> |          }
> |  
> |        /* If we got something, return it.  But first, set the alignment
> |           if the address is a register.  */
> |        if (temp != 0)
> |          {
> |            if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
> |              mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
> |          }
> |        else if (MEM_P (decl_rtl))
> |          temp = decl_rtl;
> |  
> |        if (temp != 0)
> |          {
> |            if (MEM_P (temp)
> |                && modifier != EXPAND_WRITE
> |                && modifier != EXPAND_MEMORY
> |                && modifier != EXPAND_INITIALIZER
> |                && modifier != EXPAND_CONST_ADDRESS
> |                && modifier != EXPAND_SUM
> |                && !inner_reference_p
> |                && mode != BLKmode
> |                && MEM_ALIGN (temp) < GET_MODE_ALIGNMENT (mode))
> |              temp = expand_misaligned_mem_ref (temp, mode, unsignedp,
> |                                                MEM_ALIGN (temp), NULL_RTX, NULL);
> |  
> |            return temp;
> |          }
> | [...]
> 
> [TS] I don't understand that yet.  :-|
> 
> Instead of the current "early-return" handling:
> 
>     temp = targetm.goacc.expand_var_decl (exp);
>     if (temp)
>       return temp;
> 
> ... should we maybe just set:
> 
>     DECL_RTL (exp) = targetm.goacc.expand_var_decl (exp)
> 
> ... (or similar), and then let the usual processing continue?

Hum, not sure about that. See above excuse... maybe Chung-Lin
remembers? My guess is the extra processing doesn't matter in practice
for the limited kinds of variables that are handled by that hook, at
least for NVPTX (which skips register allocation, etc. anyway).

> > [snip]
> >    tree fork_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_FORK);
> >    tree join_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_JOIN);
> >  
> > @@ -8027,7 +8041,8 @@ lower_oacc_head_tail (location_t loc, tree clauses,
> >  			      &join_seq);
> >  
> >        lower_oacc_reductions (loc, clauses, place, inner,
> > -			     fork, join, &fork_seq, &join_seq, ctx);
> > +			     fork, (count == 1) ? private_marker : NULL,
> > +			     join, &fork_seq, &join_seq,  ctx);
> >  
> >        /* Append this level to head. */
> >        gimple_seq_add_seq (head, fork_seq);  
> 
> [TS] That looks good in principle.  Via the testing mentioned above, I
> just want to make sure that this does all the expected things
> regarding differently nested loops and privatization levels.

Feel free to extend test coverage as you see fit...

> >        gimple_seq_add_seq (&new_body, fork_seq);
> > @@ -13262,6 +13369,9 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
> >  		 ctx);
> >        break;
> >      case GIMPLE_BIND:
> > +      if (ctx && is_gimple_omp_oacc (ctx->stmt))
> > +	oacc_record_vars_in_bind (ctx,
> > +				  gimple_bind_vars (as_a <gbind *> (stmt)));
> >        lower_omp (gimple_bind_body_ptr (as_a <gbind *> (stmt)), ctx);
> >        maybe_remove_omp_member_access_dummy_vars (as_a <gbind *> (stmt));
> >        break;
> 
> [TS] I have not yet verified whether these lowering cases are
> sufficient to also handle the <https://gcc.gnu.org/PR90114>
> "Predetermined private levels for variables declared in OpenACC
> accelerator routines" case.  (If yes, then that needs testcases, too,
> if not, then need to add a TODO note, for later.)

I believe that's a TODO.

> > +       1. They can be recreated, making a pointer to the variable in the new
> > +	  address space, or
> > +
> > +       2. The address of the variable in the new address space can be taken,
> > +	  converted to the default (original) address space, and the result of
> > +	  that conversion subsituted in place of the original ADDR_EXPR node.
> > +
> > +     Which of these is done depends on the gimple statement being processed.
> > +     At present atomic operations and inline asms use (1), and everything else
> > +     uses (2).  At least on AMD GCN, there are atomic operations that work
> > +     directly in the LDS address space.
> > +
> > +     COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
> > +     the new decl, adjusting types of appropriate tree nodes as necessary.  */
> 
> [TS] As I understand, this is only relevant for GCN offloading, but
> not nvptx, and I'll trust that these two variants make sense from a
> GCN point of view (which I cannot verify easily).

The idea (hope) is that that's what's necessary "generically", though
the only target using that support is GCN at present. I.e. it's not
supposed to be GCN-specific, necessarily. Of course though, who knows
what some other exotic target will need? (We don't want to be in the
state where each target has to start completely from scratch for this
sort of thing, if we can help it.)

> > +  if (targetm.goacc.adjust_private_decl)
> > +    {
> > +      FOR_ALL_BB_FN (bb, cfun)
> > +	for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
> > +	     !gsi_end_p (gsi);
> > +	     gsi_next (&gsi))
> > +	  {
> > +	    gimple *stmt = gsi_stmt (gsi);
> > +	    walk_stmt_info wi;
> > +	    var_decl_rewrite_info info;
> > +
> > +	    info.avoid_pointer_conversion
> > +	      = (is_gimple_call (stmt)
> > +		 && is_sync_builtin_call (as_a <gcall *> (stmt)))
> > +		|| gimple_code (stmt) == GIMPLE_ASM;
> > +	    info.stmt = stmt;
> > +	    info.modified = false;
> > +	    info.adjusted_vars = &adjusted_vars;
> > +
> > +	    memset (&wi, 0, sizeof (wi));
> > +	    wi.info = &info;
> > +
> > +	    walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
> > +
> > +	    if (info.modified)
> > +	      update_stmt (stmt);
> > +	  }
> > +    }
> > +
> >    free_oacc_loop (loops);
> >  
> >    return 0;  
> 
> [TS] As discussed above, maybe we can completely skip the 'adjusted_vars'
> rewriting for nvptx offloading?

Yeah sure, if you like.

> > --- /dev/null
> > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c  
> 
> [TS] Without any code changes, this one FAILs (as expected) with nvptx
> offloading, but with GCN offloading, it already PASSes.

Not sure about that -- of course, one gets lucky sometimes.
 
> > --- /dev/null
> > +++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
> 
> [TS] With code changes as posted, this one FAILs for nvptx offloading
> execution.  (... for all but the Nvidia Titan V GPU in my set of
> testing configurations, huh?)
> 
> > @@ -0,0 +1,25 @@
> > +! Test for worker-private variables
> > +
> > +! { dg-do run }
> > +! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
> > +
> > +program main
> > +  integer :: w, arr(0:31)
> > +
> > +  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
> > +    !$acc loop gang worker private(w)
> > +! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has worker partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
> > +    do j = 0, 31
> > +      w = 0
> > +      !$acc loop seq
> > +      do i = 0, 31
> > +        !$acc atomic update
> > +        w = w + 1
> > +        !$acc end atomic
> > +      end do
> > +      arr(j) = w
> > +    end do
> > +  !$acc end parallel
> > +
> > +  if (any (arr .ne. 32)) stop 1
> > +end program main  

Boo. I don't think I saw such a failure on the systems I tested on.
That needs investigation (though it might be something CUDA-version or
GPU specific, hence not directly a GCC problem? Not sure.)

Thanks for review, and please ask if there's anything I can help
further with.

Julian


* Re: [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory
  2021-04-19 11:23     ` Julian Brown
@ 2021-05-21 18:55       ` Thomas Schwinge
  2021-05-21 19:18       ` Thomas Schwinge
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: Thomas Schwinge @ 2021-05-21 18:55 UTC (permalink / raw)
  To: Julian Brown, gcc-patches; +Cc: Jakub Jelinek


Hi!

On 2021-04-19T12:23:56+0100, Julian Brown <julian@codesourcery.com> wrote:
> On Thu, 15 Apr 2021 19:26:54 +0200
> Thomas Schwinge <thomas@codesourcery.com> wrote:
>> This has iterated through several conceptually different designs and
>> implementations, by several people, over the past several years.
>
> I hope this wasn't a hint that I'd failed to attribute the authorship of
> the patch properly? Many apologies if so, that certainly wasn't my
> intention!

No, not at all -- this was just to highlight the several iterations this
work has gone through.


With a first set of my modifications merged in, I've now pushed "openacc:
Add support for gang local storage allocation in shared memory [PR90115]"
to master branch in commit 29a2f51806c5b30e17a8d0e9ba7915a3c53c34ff, see
attached.  I shall now follow up with a number of further changes, and
more to come later (once developed).


>> On 2021-02-26T04:34:50-0800, Julian Brown <julian@codesourcery.com>
>> wrote:
>> > This patch implements a method to track the "private-ness" of
>> > OpenACC variables declared in offload regions in gang-partitioned,
>> > worker-partitioned or vector-partitioned modes. Variables declared
>> > implicitly in scoped blocks and those declared "private" on
>> > enclosing directives (e.g. "acc parallel") are both handled.
>> > Variables that are e.g. gang-private can then be adjusted so they
>> > reside in GPU shared memory.
>> >
>> > The reason for doing this is twofold: correct implementation of
>> > OpenACC semantics
>>
>> ACK, and as mentioned before, this very much relates to
>> <https://gcc.gnu.org/PR90115> "OpenACC: predetermined private levels
>> for variables declared in blocks" (plus the corresponding use of
>> 'private' clauses, implicit/explicit, including 'firstprivate') and
>> <https://gcc.gnu.org/PR90114> "Predetermined private levels for
>> variables declared in OpenACC accelerator routines", which we thus
>> should refer to in testcases/ChangeLog/commit log, as appropriate.  I do
>> understand we're not yet addressing all of that (and that's fine!),
>> but we should capture remaining work items of the PRs and Cesar's
>> list in
>> <http://mid.mail-archive.com/70d27ebd-762e-59a3-082f-48fa0c687212@codesourcery.com>),
>> as appropriate.
>
> From that list: [...]

Thanks, that'll be useful for later.


>> > Handling of private variables is intimately
>> > tied to the execution model for gangs/workers/vectors implemented by
>> > a particular target: for current targets, we use (or on mainline,
>> > will soon use) a broadcasting/neutering scheme.
>> >
>> > That is sufficient for code that e.g. sets a variable in
>> > worker-single mode and expects to use the value in
>> > worker-partitioned mode. The difficulty (semantics-wise) comes when
>> > the user wants to do something like an atomic operation in
>> > worker-partitioned mode and expects a worker-single (gang private)
>> > variable to be shared across each partitioned worker. Forcing use
>> > of shared memory for such variables makes that work properly.
>>
>> Are we reliably making sure that gang-private variables (and other
>> levels, in general) are not subject to the usual broadcasting scheme
>> (nvptx, at least), or does that currently work "by accident"?  (I
>> haven't looked into that, yet.)
>
> Yes, that case is explicitly handled by the broadcasting/neutering patch
> recently posted. (One of the reasons that patch depends on this one.)

OK, I shall look into these GCN patches soon -- and I still haven't
looked into the nvptx aspect.


>> > --- a/gcc/expr.c
>> > +++ b/gcc/expr.c
>> > @@ -10224,8 +10224,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>> >        exp = SSA_NAME_VAR (ssa_name);
>> >        goto expand_decl_rtl;
>> >
>> > -    case PARM_DECL:
>> >      case VAR_DECL:
>> > +      /* Allow accel compiler to handle variables that require special
>> > +   treatment, e.g. if they have been modified in some way earlier in
>> > +   compilation by the adjust_private_decl OpenACC hook.  */
>> > +      if (flag_openacc && targetm.goacc.expand_var_decl)
>> > +  {
>> > +    temp = targetm.goacc.expand_var_decl (exp);
>> > +    if (temp)
>> > +      return temp;
>> > +  }
>> > +      /* ... fall through ...  */
>> > +
>> > +    case PARM_DECL:
>>
>> [TS] Are we sure that we don't need the same handling for a
>> 'PARM_DECL', too?  (If yes, to document and verify that, should we
>> thus again unify the two 'case's, and in
>> 'targetm.goacc.expand_var_decl' add a 'gcc_checking_assert (TREE_CODE
>> (var) == VAR_DECL)'?)
>
> Maybe for routines? Those bits date from the earliest version of the
> patch and (same excuse again) I didn't have call to revisit those
> decisions.

Indeed we're currently not handling 'p' here:

    int f(int p)
    {
      int l;
      #pragma acc parallel
        {
          #pragma acc loop gang private(l, p) // 'l' is, but 'p' is *not* made gang-private here.
          for ([...])

... to be fixed at some later point.


>> Also, are we sure that all the following existing processing is not
>> relevant to do before the 'return temp' (see above)?  That's not a
>> concern for GCN (which doesn't use 'targetm.goacc.expand_var_decl',
>> and thus does execute all this following existing processing), but it
>> is for nvptx (which does use 'targetm.goacc.expand_var_decl', and
>> thus doesn't execute all this following existing processing if that
>> returned something).  Or, is 'targetm.goacc.expand_var_decl'
>> conceptually and practically meant to implement all of the following
>> processing, or is this for other reasons not relevant in the
>> 'targetm.goacc.expand_var_decl' case:
>>
>> >        /* If a static var's type was incomplete when the decl was
>> > written, but the type is complete now, lay out the decl now.  */
>> >        if (DECL_SIZE (exp) == 0
>> |            && COMPLETE_OR_UNBOUND_ARRAY_TYPE_P (TREE_TYPE (exp))
>> |            && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
>> |          layout_decl (exp, 0);
>> |
>> |        /* fall through */
>> |
>> |      case FUNCTION_DECL:
>> |      case RESULT_DECL:
>> |        decl_rtl = DECL_RTL (exp);
>> |      expand_decl_rtl:
>> |        gcc_assert (decl_rtl);
>> |
>> |        /* DECL_MODE might change when TYPE_MODE depends on attribute target
>> |           settings for VECTOR_TYPE_P that might switch for the
>> |        function.  */
>> |        if (currently_expanding_to_rtl
>> |            && code == VAR_DECL && MEM_P (decl_rtl)
>> |            && VECTOR_TYPE_P (type) && exp && DECL_MODE (exp) != mode)
>> |          decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0);
>> |        else
>> |          decl_rtl = copy_rtx (decl_rtl);
>> |
>> |        /* Record writes to register variables.  */
>> |        if (modifier == EXPAND_WRITE
>> |            && REG_P (decl_rtl)
>> |            && HARD_REGISTER_P (decl_rtl))
>> |          add_to_hard_reg_set (&crtl->asm_clobbers,
>> |                               GET_MODE (decl_rtl), REGNO (decl_rtl));
>> |
>> |        /* Ensure variable marked as used even if it doesn't go through
>> |           a parser.  If it hasn't be used yet, write out an external
>> |           definition.  */
>> |        if (exp)
>> |          TREE_USED (exp) = 1;
>> |
>> |        /* Show we haven't gotten RTL for this yet.  */
>> |        temp = 0;
>> |
>> |        /* Variables inherited from containing functions should have
>> |           been lowered by this point.  */
>> |        if (exp)
>> |          context = decl_function_context (exp);
>> |        gcc_assert (!exp
>> |                    || SCOPE_FILE_SCOPE_P (context)
>> |                    || context == current_function_decl
>> |                    || TREE_STATIC (exp)
>> |                    || DECL_EXTERNAL (exp)
>> |                    /* ??? C++ creates functions that are not TREE_STATIC.  */
>> |                    || TREE_CODE (exp) == FUNCTION_DECL);
>> |
>> |        /* This is the case of an array whose size is to be determined
>> |           from its initializer, while the initializer is still being parsed.
>> |           ??? We aren't parsing while expanding anymore.  */
>> |
>> |        if (MEM_P (decl_rtl) && REG_P (XEXP (decl_rtl, 0)))
>> |          temp = validize_mem (decl_rtl);
>> |
>> |        /* If DECL_RTL is memory, we are in the normal case and the
>> |           address is not valid, get the address into a register.  */
>> |
>> |        else if (MEM_P (decl_rtl) && modifier != EXPAND_INITIALIZER)
>> |          {
>> |            if (alt_rtl)
>> |              *alt_rtl = decl_rtl;
>> |            decl_rtl = use_anchored_address (decl_rtl);
>> |            if (modifier != EXPAND_CONST_ADDRESS
>> |                && modifier != EXPAND_SUM
>> |                && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
>> |                                                 : GET_MODE (decl_rtl),
>> |                                                 XEXP (decl_rtl, 0),
>> |                                                 MEM_ADDR_SPACE (decl_rtl)))
>> |              temp = replace_equiv_address (decl_rtl,
>> |                                            copy_rtx (XEXP (decl_rtl, 0)));
>> |          }
>> |
>> |        /* If we got something, return it.  But first, set the alignment
>> |           if the address is a register.  */
>> |        if (temp != 0)
>> |          {
>> |            if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
>> |              mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
>> |          }
>> |        else if (MEM_P (decl_rtl))
>> |          temp = decl_rtl;
>> |
>> |        if (temp != 0)
>> |          {
>> |            if (MEM_P (temp)
>> |                && modifier != EXPAND_WRITE
>> |                && modifier != EXPAND_MEMORY
>> |                && modifier != EXPAND_INITIALIZER
>> |                && modifier != EXPAND_CONST_ADDRESS
>> |                && modifier != EXPAND_SUM
>> |                && !inner_reference_p
>> |                && mode != BLKmode
>> |                && MEM_ALIGN (temp) < GET_MODE_ALIGNMENT (mode))
>> |              temp = expand_misaligned_mem_ref (temp, mode, unsignedp,
>> |                                                MEM_ALIGN (temp), NULL_RTX, NULL);
>> |
>> |            return temp;
>> |          }
>> | [...]
>>
>> [TS] I don't understand that yet.  :-|
>>
>> Instead of the current "early-return" handling:
>>
>>     temp = targetm.goacc.expand_var_decl (exp);
>>     if (temp)
>>       return temp;
>>
>> ... should we maybe just set:
>>
>>     DECL_RTL (exp) = targetm.goacc.expand_var_decl (exp)
>>
>> ... (or similar), and then let the usual processing continue?
>
> Hum, not sure about that. See above excuse... maybe Chung-Lin
> remembers? My guess is the extra processing doesn't matter in practice
> for the limited kinds of variables that are handled by that hook, at
> least for NVPTX (which skips register allocation, etc. anyway).

I haven't yet looked into that further.
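
For reference, the difference between the two control flows discussed
above can be modelled with a toy sketch. Nothing here is GCC code:
'toy_decl', 'toy_hook' and 'toy_common_tail' are hypothetical stand-ins
for the decl, the 'expand_var_decl' hook and the existing processing
that follows in 'expand_expr_real_1'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a decl; not a GCC type.  */
struct toy_decl
{
  bool special;  /* Would the hypothetical hook handle this decl?  */
  bool used;     /* Set by the common processing, like TREE_USED.  */
  int rtl;       /* Placeholder for the decl's RTL.  */
};

/* Hypothetical hook: returns nonzero only for decls it handles.  */
static int
toy_hook (const struct toy_decl *d)
{
  return d->special ? 42 : 0;
}

/* Stand-in for the existing processing after the hook call.  */
static int
toy_common_tail (struct toy_decl *d)
{
  assert (d != NULL);
  d->used = true;
  return d->rtl;
}

/* Variant 1: early return, skipping the common processing.  */
int
expand_early_return (struct toy_decl *d)
{
  int temp = toy_hook (d);
  if (temp)
    return temp;
  return toy_common_tail (d);
}

/* Variant 2: record the hook's result, then continue as usual.  */
int
expand_set_and_continue (struct toy_decl *d)
{
  int temp = toy_hook (d);
  if (temp)
    d->rtl = temp;
  return toy_common_tail (d);
}
```

Both variants return the hook's value for a decl the hook handles, but
only the second one runs the common processing for it. Whether that
matters for the real hook is exactly the open question.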


Grüße
 Thomas



[-- Attachment #2: 0001-openacc-Add-support-for-gang-local-storage-allocatio.patch --]
[-- Type: text/x-diff, Size: 39156 bytes --]

From 29a2f51806c5b30e17a8d0e9ba7915a3c53c34ff Mon Sep 17 00:00:00 2001
From: Julian Brown <julian@codesourcery.com>
Date: Fri, 26 Feb 2021 04:34:49 -0800
Subject: [PATCH] openacc: Add support for gang local storage allocation in
 shared memory [PR90115]

This patch implements a method to track the "private-ness" of
OpenACC variables declared in offload regions in gang-partitioned,
worker-partitioned or vector-partitioned modes. Variables declared
implicitly in scoped blocks and those declared "private" on enclosing
directives (e.g. "acc parallel") are both handled. Variables that are
e.g. gang-private can then be adjusted so they reside in GPU shared
memory.

The reason for doing this is twofold: correct implementation of OpenACC
semantics, and optimisation, since shared memory might be faster than
the main memory on a GPU. Handling of private variables is intimately
tied to the execution model for gangs/workers/vectors implemented by
a particular target: for current targets, we use (or on mainline, will
soon use) a broadcasting/neutering scheme.

That is sufficient for code that e.g. sets a variable in worker-single
mode and expects to use the value in worker-partitioned mode. The
difficulty (semantics-wise) comes when the user wants to do something like
an atomic operation in worker-partitioned mode and expects a worker-single
(gang private) variable to be shared across each partitioned worker.
Forcing use of shared memory for such variables makes that work properly.

In terms of implementation, the parallelism level of a given loop is
not fixed until the oaccdevlow pass in the offload compiler, so the
patch delays fixing the parallelism level of variables declared on or
within such loops until the same point. This is done by adding a new
internal UNIQUE function (OACC_PRIVATE) that lists (the address of) each
private variable as an argument, and other arguments set so as to be able
to determine the correct parallelism level to use for the listed
variables. This new internal function fits into the existing scheme for
demarcating OpenACC loops, as described in comments in the patch.

Two new target hooks are introduced: TARGET_GOACC_ADJUST_PRIVATE_DECL and
TARGET_GOACC_EXPAND_VAR_DECL.  The first can tweak a variable declaration
at oaccdevlow time, and the second at expand time.  The first or both
of these target hooks can be used by a given offload target, depending
on its strategy for implementing private variables.

This patch updates the TARGET_GOACC_ADJUST_PRIVATE_DECL target hook in
the AMD GCN backend to the current name and prototype. (An earlier
version of the hook was already present, but dormant.)

	gcc/
	PR middle-end/90115
	* doc/tm.texi.in (TARGET_GOACC_EXPAND_VAR_DECL)
	(TARGET_GOACC_ADJUST_PRIVATE_DECL): Add documentation hooks.
	* doc/tm.texi: Regenerate.
	* expr.c (expand_expr_real_1): Expand decls using the
	expand_var_decl OpenACC hook if defined.
	* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
	* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
	* omp-low.c (omp_context): Add oacc_privatization_candidates
	field.
	(lower_oacc_reductions): Add PRIVATE_MARKER parameter.  Insert
	before fork.
	(lower_oacc_head_tail): Add PRIVATE_MARKER parameter.  Modify
	private marker's gimple call arguments, and pass it to
	lower_oacc_reductions.
	(oacc_privatization_scan_clause_chain)
	(oacc_privatization_scan_decl_chain, lower_oacc_private_marker):
	New functions.
	(lower_omp_for, lower_omp_target, lower_omp_1): Use these.
	* omp-offload.c (convert.h): Include.
	(oacc_loop_xform_head_tail): Treat private-variable markers like
	fork/join when transforming head/tail sequences.
	(struct var_decl_rewrite_info): Add struct.
	(oacc_rewrite_var_decl, is_sync_builtin_call): New functions.
	(execute_oacc_device_lower): Support rewriting gang-private
	variables using target hook, and fix up addr_expr and var_decl
	nodes afterwards.
	* target.def (adjust_private_decl, expand_var_decl): New hooks.
	* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl):
	Rename to...
	(gcn_goacc_adjust_private_decl): ...this.
	* config/gcn/gcn-tree.c (gcn_goacc_adjust_gangprivate_decl):
	Rename to...
	(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
	* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename
	definition using gcn_goacc_adjust_gangprivate_decl...
	(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...to this, using
	gcn_goacc_adjust_private_decl.
	* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
	(gang_private_shared_size): New global variable.
	(gang_private_shared_align): Likewise.
	(gang_private_shared_sym): Likewise.
	(gang_private_shared_hmap): Likewise.
	(nvptx_option_override): Initialize these.
	(nvptx_file_end): Output gang_private_shared_sym.
	(nvptx_goacc_adjust_private_decl, nvptx_goacc_expand_var_decl):
	New functions.
	(nvptx_set_current_function): Clear gang_private_shared_hmap.
	(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook.
	(TARGET_GOACC_EXPAND_VAR_DECL): Likewise.
	libgomp/
	PR middle-end/90115
	* testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c: New
	test.
	* testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90:
	Likewise.

Co-Authored-By: Chung-Lin Tang <cltang@codesourcery.com>
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
---
 gcc/config/gcn/gcn-protos.h                   |   2 +-
 gcc/config/gcn/gcn-tree.c                     |   9 +-
 gcc/config/gcn/gcn.c                          |   4 +-
 gcc/config/nvptx/nvptx.c                      |  80 +++++++
 gcc/doc/tm.texi                               |  25 ++
 gcc/doc/tm.texi.in                            |   4 +
 gcc/expr.c                                    |  13 +-
 gcc/internal-fn.c                             |   2 +
 gcc/internal-fn.h                             |   8 +-
 gcc/omp-low.c                                 | 125 +++++++++-
 gcc/omp-offload.c                             | 225 +++++++++++++++++-
 gcc/target.def                                |  29 +++
 .../private-atomic-1-gang.c                   |  38 +++
 .../private-atomic-1-gang.f90                 |  25 ++
 .../private-atomic-1-worker.f90               |  32 +++
 15 files changed, 606 insertions(+), 15 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90

diff --git a/gcc/config/gcn/gcn-protos.h b/gcc/config/gcn/gcn-protos.h
index dc9331c445d..7ef7ae8af46 100644
--- a/gcc/config/gcn/gcn-protos.h
+++ b/gcc/config/gcn/gcn-protos.h
@@ -40,7 +40,7 @@ extern rtx gcn_gen_undef (machine_mode);
 extern bool gcn_global_address_p (rtx);
 extern tree gcn_goacc_adjust_propagation_record (tree record_type, bool sender,
 						 const char *name);
-extern void gcn_goacc_adjust_gangprivate_decl (tree var);
+extern tree gcn_goacc_adjust_private_decl (tree var, int level);
 extern void gcn_goacc_reduction (gcall *call);
 extern bool gcn_hard_regno_rename_ok (unsigned int from_reg,
 				      unsigned int to_reg);
diff --git a/gcc/config/gcn/gcn-tree.c b/gcc/config/gcn/gcn-tree.c
index 8f270991c86..75ea50c59dd 100644
--- a/gcc/config/gcn/gcn-tree.c
+++ b/gcc/config/gcn/gcn-tree.c
@@ -577,9 +577,12 @@ gcn_goacc_adjust_propagation_record (tree record_type, bool sender,
   return decl;
 }
 
-void
-gcn_goacc_adjust_gangprivate_decl (tree var)
+tree
+gcn_goacc_adjust_private_decl (tree var, int level)
 {
+  if (level != GOMP_DIM_GANG)
+    return var;
+
   tree type = TREE_TYPE (var);
   tree lds_type = build_qualified_type (type,
 		    TYPE_QUALS_NO_ADDR_SPACE (type)
@@ -597,6 +600,8 @@ gcn_goacc_adjust_gangprivate_decl (tree var)
 
   if (machfun)
     machfun->use_flat_addressing = true;
+
+  return var;
 }
 
 /* }}}  */
diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index 9660ca6eaa4..283a91fe50a 100644
--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -6320,8 +6320,8 @@ gcn_dwarf_register_span (rtx rtl)
 #undef  TARGET_GOACC_ADJUST_PROPAGATION_RECORD
 #define TARGET_GOACC_ADJUST_PROPAGATION_RECORD \
   gcn_goacc_adjust_propagation_record
-#undef  TARGET_GOACC_ADJUST_GANGPRIVATE_DECL
-#define TARGET_GOACC_ADJUST_GANGPRIVATE_DECL gcn_goacc_adjust_gangprivate_decl
+#undef  TARGET_GOACC_ADJUST_PRIVATE_DECL
+#define TARGET_GOACC_ADJUST_PRIVATE_DECL gcn_goacc_adjust_private_decl
 #undef  TARGET_GOACC_FORK_JOIN
 #define TARGET_GOACC_FORK_JOIN gcn_fork_join
 #undef  TARGET_GOACC_REDUCTION
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 722b0faa330..80116e570d6 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -75,6 +75,7 @@
 #include "fold-const.h"
 #include "intl.h"
 #include "opts.h"
+#include "tree-pretty-print.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -167,6 +168,12 @@ static unsigned vector_red_align;
 static unsigned vector_red_partition;
 static GTY(()) rtx vector_red_sym;
 
+/* Shared memory block for gang-private variables.  */
+static unsigned gang_private_shared_size;
+static unsigned gang_private_shared_align;
+static GTY(()) rtx gang_private_shared_sym;
+static hash_map<tree_decl_hash, unsigned int> gang_private_shared_hmap;
+
 /* Global lock variable, needed for 128bit worker & gang reductions.  */
 static GTY(()) tree global_lock_var;
 
@@ -251,6 +258,10 @@ nvptx_option_override (void)
   vector_red_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
   vector_red_partition = 0;
 
+  gang_private_shared_sym = gen_rtx_SYMBOL_REF (Pmode, "__gang_private_shared");
+  SET_SYMBOL_DATA_AREA (gang_private_shared_sym, DATA_AREA_SHARED);
+  gang_private_shared_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
+
   diagnose_openacc_conflict (TARGET_GOMP, "-mgomp");
   diagnose_openacc_conflict (TARGET_SOFT_STACK, "-msoft-stack");
   diagnose_openacc_conflict (TARGET_UNIFORM_SIMT, "-muniform-simt");
@@ -5435,6 +5446,10 @@ nvptx_file_end (void)
     write_shared_buffer (asm_out_file, vector_red_sym,
 			 vector_red_align, vector_red_size);
 
+  if (gang_private_shared_size)
+    write_shared_buffer (asm_out_file, gang_private_shared_sym,
+			 gang_private_shared_align, gang_private_shared_size);
+
   if (need_softstack_decl)
     {
       write_var_marker (asm_out_file, false, true, "__nvptx_stacks");
@@ -6662,6 +6677,64 @@ nvptx_truly_noop_truncation (poly_uint64, poly_uint64)
   return false;
 }
 
+/* Implement TARGET_GOACC_ADJUST_PRIVATE_DECL.  */
+
+static tree
+nvptx_goacc_adjust_private_decl (tree decl, int level)
+{
+  if (level != GOMP_DIM_GANG)
+    return decl;
+
+  /* Set "oacc gang-private" attribute for gang-private variable
+     declarations.  */
+  if (!lookup_attribute ("oacc gang-private", DECL_ATTRIBUTES (decl)))
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	{
+	  fprintf (dump_file, "Setting 'oacc gang-private' attribute for decl:");
+	  print_generic_decl (dump_file, decl, TDF_SLIM);
+	  fputc ('\n', dump_file);
+	}
+      tree id = get_identifier ("oacc gang-private");
+      DECL_ATTRIBUTES (decl) = tree_cons (id, NULL, DECL_ATTRIBUTES (decl));
+    }
+
+  return decl;
+}
+
+/* Implement TARGET_GOACC_EXPAND_VAR_DECL.  */
+
+static rtx
+nvptx_goacc_expand_var_decl (tree var)
+{
+  /* Place "oacc gang-private" variables in shared memory.  */
+  if (VAR_P (var)
+      && lookup_attribute ("oacc gang-private", DECL_ATTRIBUTES (var)))
+    {
+      unsigned int offset, *poffset;
+      poffset = gang_private_shared_hmap.get (var);
+      if (poffset)
+	offset = *poffset;
+      else
+	{
+	  unsigned HOST_WIDE_INT align = DECL_ALIGN (var);
+	  gang_private_shared_size
+	    = (gang_private_shared_size + align - 1) & ~(align - 1);
+	  if (gang_private_shared_align < align)
+	    gang_private_shared_align = align;
+
+	  offset = gang_private_shared_size;
+	  bool existed = gang_private_shared_hmap.put (var, offset);
+	  gcc_checking_assert (!existed);
+	  gang_private_shared_size += tree_to_uhwi (DECL_SIZE_UNIT (var));
+	}
+      rtx addr = plus_constant (Pmode, gang_private_shared_sym, offset);
+      return gen_rtx_MEM (TYPE_MODE (TREE_TYPE (var)), addr);
+    }
+
+  return NULL_RTX;
+}
+
 static GTY(()) tree nvptx_previous_fndecl;
 
 static void
@@ -6670,6 +6743,7 @@ nvptx_set_current_function (tree fndecl)
   if (!fndecl || fndecl == nvptx_previous_fndecl)
     return;
 
+  gang_private_shared_hmap.empty ();
   nvptx_previous_fndecl = fndecl;
   vector_red_partition = 0;
   oacc_bcast_partition = 0;
@@ -6834,6 +6908,12 @@ nvptx_libc_has_function (enum function_class fn_class, tree type)
 #undef TARGET_HAVE_SPECULATION_SAFE_VALUE
 #define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed
 
+#undef TARGET_GOACC_ADJUST_PRIVATE_DECL
+#define TARGET_GOACC_ADJUST_PRIVATE_DECL nvptx_goacc_adjust_private_decl
+
+#undef TARGET_GOACC_EXPAND_VAR_DECL
+#define TARGET_GOACC_EXPAND_VAR_DECL nvptx_goacc_expand_var_decl
+
 #undef TARGET_SET_CURRENT_FUNCTION
 #define TARGET_SET_CURRENT_FUNCTION nvptx_set_current_function
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 85ea9395560..78c330c292d 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6236,6 +6236,31 @@ like @code{cond_add@var{m}}.  The default implementation returns a zero
 constant of type @var{type}.
 @end deftypefn
 
+@deftypefn {Target Hook} tree TARGET_GOACC_ADJUST_PRIVATE_DECL (tree @var{var}, int @var{level})
+This hook, if defined, is used by accelerator target back-ends to adjust
+OpenACC variable declarations that should be made private to the given
+parallelism level (i.e. @code{GOMP_DIM_GANG}, @code{GOMP_DIM_WORKER} or
+@code{GOMP_DIM_VECTOR}).  A typical use for this hook is to force variable
+declarations at the @code{gang} level to reside in GPU shared memory.
+
+You may also use the @code{TARGET_GOACC_EXPAND_VAR_DECL} hook if the
+adjusted variable declaration needs to be expanded to RTL in a non-standard
+way.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_GOACC_EXPAND_VAR_DECL (tree @var{var})
+This hook, if defined, is used by accelerator target back-ends to expand
+specially handled kinds of @code{VAR_DECL} expressions.  A particular use is
+to place variables with specific attributes inside special accelerator
+memories.  A return value of @code{NULL} indicates that the target does not
+handle this @code{VAR_DECL}, and normal RTL expanding is resumed.
+
+Only define this hook if your accelerator target needs to expand certain
+@code{VAR_DECL} nodes in a way that differs from the default.  You can also adjust
+private variables at OpenACC device-lowering time using the
+@code{TARGET_GOACC_ADJUST_PRIVATE_DECL} target hook.
+@end deftypefn
+
 @node Anchored Addresses
 @section Anchored Addresses
 @cindex anchored addresses
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index d8e3de14af1..d9fbbe20e6f 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4221,6 +4221,10 @@ address;  but often a machine-dependent strategy can generate better code.
 
 @hook TARGET_PREFERRED_ELSE_VALUE
 
+@hook TARGET_GOACC_ADJUST_PRIVATE_DECL
+
+@hook TARGET_GOACC_EXPAND_VAR_DECL
+
 @node Anchored Addresses
 @section Anchored Addresses
 @cindex anchored addresses
diff --git a/gcc/expr.c b/gcc/expr.c
index ba61eb98b3b..e4660f0e90a 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -10419,8 +10419,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       exp = SSA_NAME_VAR (ssa_name);
       goto expand_decl_rtl;
 
-    case PARM_DECL:
     case VAR_DECL:
+      /* Allow accel compiler to handle variables that require special
+	 treatment, e.g. if they have been modified in some way earlier in
+	 compilation by the adjust_private_decl OpenACC hook.  */
+      if (flag_openacc && targetm.goacc.expand_var_decl)
+	{
+	  temp = targetm.goacc.expand_var_decl (exp);
+	  if (temp)
+	    return temp;
+	}
+      /* ... fall through ...  */
+
+    case PARM_DECL:
       /* If a static var's type was incomplete when the decl was written,
 	 but the type is complete now, lay out the decl now.  */
       if (DECL_SIZE (exp) == 0
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index d209a52f823..d92080c8077 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2969,6 +2969,8 @@ expand_UNIQUE (internal_fn, gcall *stmt)
       else
 	gcc_unreachable ();
       break;
+    case IFN_UNIQUE_OACC_PRIVATE:
+      break;
     }
 
   if (pattern)
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index c6599ce4894..5bc5660c1ff 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -32,11 +32,15 @@ along with GCC; see the file COPYING3.  If not see
    or leaving partitioned execution.
       DEP_VAR = UNIQUE ({HEAD,TAIL}_MARK, REMAINING_MARKS, ...PRIMARY_FLAGS)
 
-   The PRIMARY_FLAGS only occur on the first HEAD_MARK of a sequence.  */
+   The PRIMARY_FLAGS only occur on the first HEAD_MARK of a sequence.
+
+   PRIVATE captures variables to be made private at the surrounding parallelism
+   level.  */
 #define IFN_UNIQUE_CODES				  \
   DEF(UNSPEC),	\
     DEF(OACC_FORK), DEF(OACC_JOIN),		\
-    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK)
+    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK),	\
+    DEF(OACC_PRIVATE)
 
 enum ifn_unique_kind {
 #define DEF(X) IFN_UNIQUE_##X
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index d1136d181b3..da827ef2e34 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -179,6 +179,9 @@ struct omp_context
   /* Only used for omp target contexts.  True if an OpenMP construct other
      than teams is strictly nested in it.  */
   bool nonteams_nested_p;
+
+  /* Candidates for adjusting OpenACC privatization level.  */
+  vec<tree> oacc_privatization_candidates;
 };
 
 static splay_tree all_contexts;
@@ -7132,8 +7135,9 @@ lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *body_p,
 
 static void
 lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
-		       gcall *fork, gcall *join, gimple_seq *fork_seq,
-		       gimple_seq *join_seq, omp_context *ctx)
+		       gcall *fork, gcall *private_marker, gcall *join,
+		       gimple_seq *fork_seq, gimple_seq *join_seq,
+		       omp_context *ctx)
 {
   gimple_seq before_fork = NULL;
   gimple_seq after_fork = NULL;
@@ -7337,6 +7341,8 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
 
   /* Now stitch things together.  */
   gimple_seq_add_seq (fork_seq, before_fork);
+  if (private_marker)
+    gimple_seq_add_stmt (fork_seq, private_marker);
   if (fork)
     gimple_seq_add_stmt (fork_seq, fork);
   gimple_seq_add_seq (fork_seq, after_fork);
@@ -8116,7 +8122,7 @@ lower_oacc_loop_marker (location_t loc, tree ddvar, bool head,
    HEAD and TAIL.  */
 
 static void
-lower_oacc_head_tail (location_t loc, tree clauses,
+lower_oacc_head_tail (location_t loc, tree clauses, gcall *private_marker,
 		      gimple_seq *head, gimple_seq *tail, omp_context *ctx)
 {
   bool inner = false;
@@ -8124,6 +8130,14 @@ lower_oacc_head_tail (location_t loc, tree clauses,
   gimple_seq_add_stmt (head, gimple_build_assign (ddvar, integer_zero_node));
 
   unsigned count = lower_oacc_head_mark (loc, ddvar, clauses, head, ctx);
+
+  if (private_marker)
+    {
+      gimple_set_location (private_marker, loc);
+      gimple_call_set_lhs (private_marker, ddvar);
+      gimple_call_set_arg (private_marker, 1, ddvar);
+    }
+
   tree fork_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_FORK);
   tree join_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_JOIN);
 
@@ -8154,7 +8168,8 @@ lower_oacc_head_tail (location_t loc, tree clauses,
 			      &join_seq);
 
       lower_oacc_reductions (loc, clauses, place, inner,
-			     fork, join, &fork_seq, &join_seq,  ctx);
+			     fork, (count == 1) ? private_marker : NULL,
+			     join, &fork_seq, &join_seq,  ctx);
 
       /* Append this level to head. */
       gimple_seq_add_seq (head, fork_seq);
@@ -10129,6 +10144,32 @@ lower_omp_for_lastprivate (struct omp_for_data *fd, gimple_seq *body_p,
     }
 }
 
+/* Scan CLAUSES for candidates for adjusting OpenACC privatization level in
+   CTX.  */
+
+static void
+oacc_privatization_scan_clause_chain (omp_context *ctx, tree clauses)
+{
+  for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
+    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_PRIVATE)
+      {
+	tree decl = OMP_CLAUSE_DECL (c);
+	if (VAR_P (decl) && TREE_ADDRESSABLE (decl))
+	  ctx->oacc_privatization_candidates.safe_push (decl);
+      }
+}
+
+/* Scan DECLS for candidates for adjusting OpenACC privatization level in
+   CTX.  */
+
+static void
+oacc_privatization_scan_decl_chain (omp_context *ctx, tree decls)
+{
+  for (tree decl = decls; decl; decl = DECL_CHAIN (decl))
+    if (VAR_P (decl) && TREE_ADDRESSABLE (decl))
+      ctx->oacc_privatization_candidates.safe_push (decl);
+}
+
 /* Callback for walk_gimple_seq.  Find #pragma omp scan statement.  */
 
 static tree
@@ -10958,6 +10999,58 @@ lower_omp_for_scan (gimple_seq *body_p, gimple_seq *dlist, gomp_for *stmt,
   *dlist = new_dlist;
 }
 
+/* Build an internal UNIQUE function with type IFN_UNIQUE_OACC_PRIVATE listing
+   the addresses of variables to be made private at the surrounding
+   parallelism level.  Such functions appear in the gimple code stream in two
+   forms, e.g. for a partitioned loop:
+
+      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6, 1, 68);
+      .data_dep.6 = .UNIQUE (OACC_PRIVATE, .data_dep.6, -1, &w);
+      .data_dep.6 = .UNIQUE (OACC_FORK, .data_dep.6, -1);
+      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6);
+
+   or alternatively, OACC_PRIVATE can appear at the top level of a parallel,
+   not as part of a HEAD_MARK sequence:
+
+      .UNIQUE (OACC_PRIVATE, 0, 0, &w);
+
+   For such stand-alone appearances, the 3rd argument is always 0, denoting
+   gang partitioning.  */
+
+static gcall *
+lower_oacc_private_marker (omp_context *ctx)
+{
+  if (ctx->oacc_privatization_candidates.length () == 0)
+    return NULL;
+
+  auto_vec<tree, 5> args;
+
+  args.quick_push (build_int_cst (integer_type_node, IFN_UNIQUE_OACC_PRIVATE));
+  args.quick_push (integer_zero_node);
+  args.quick_push (integer_minus_one_node);
+
+  int i;
+  tree decl;
+  FOR_EACH_VEC_ELT (ctx->oacc_privatization_candidates, i, decl)
+    {
+      for (omp_context *thisctx = ctx; thisctx; thisctx = thisctx->outer)
+	{
+	  tree inner_decl = maybe_lookup_decl (decl, thisctx);
+	  if (inner_decl)
+	    {
+	      decl = inner_decl;
+	      break;
+	    }
+	}
+      gcc_checking_assert (decl);
+
+      tree addr = build_fold_addr_expr (decl);
+      args.safe_push (addr);
+    }
+
+  return gimple_build_call_internal_vec (IFN_UNIQUE, args);
+}
+
 /* Lower code for an OMP loop directive.  */
 
 static void
@@ -10974,6 +11067,8 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
   push_gimplify_context ();
 
+  oacc_privatization_scan_clause_chain (ctx, gimple_omp_for_clauses (stmt));
+
   lower_omp (gimple_omp_for_pre_body_ptr (stmt), ctx);
 
   block = make_node (BLOCK);
@@ -10992,6 +11087,8 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
       gbind *inner_bind
 	= as_a <gbind *> (gimple_seq_first_stmt (omp_for_body));
       tree vars = gimple_bind_vars (inner_bind);
+      if (is_gimple_omp_oacc (ctx->stmt))
+	oacc_privatization_scan_decl_chain (ctx, vars);
       gimple_bind_append_vars (new_stmt, vars);
       /* bind_vars/BLOCK_VARS are being moved to new_stmt/block, don't
 	 keep them on the inner_bind and it's block.  */
@@ -11105,6 +11202,11 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
   lower_omp (gimple_omp_body_ptr (stmt), ctx);
 
+  gcall *private_marker = NULL;
+  if (is_gimple_omp_oacc (ctx->stmt)
+      && !gimple_seq_empty_p (omp_for_body))
+    private_marker = lower_oacc_private_marker (ctx);
+
   /* Lower the header expressions.  At this point, we can assume that
      the header is of the form:
 
@@ -11159,7 +11261,7 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   if (is_gimple_omp_oacc (ctx->stmt)
       && !ctx_in_oacc_kernels_region (ctx))
     lower_oacc_head_tail (gimple_location (stmt),
-			  gimple_omp_for_clauses (stmt),
+			  gimple_omp_for_clauses (stmt), private_marker,
 			  &oacc_head, &oacc_tail, ctx);
 
   /* Add OpenACC partitioning and reduction markers just before the loop.  */
@@ -13156,8 +13258,14 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	     them as a dummy GANG loop.  */
 	  tree level = build_int_cst (integer_type_node, GOMP_DIM_GANG);
 
+	  gcall *private_marker = lower_oacc_private_marker (ctx);
+
+	  if (private_marker)
+	    gimple_call_set_arg (private_marker, 2, level);
+
 	  lower_oacc_reductions (gimple_location (ctx->stmt), clauses, level,
-				 false, NULL, NULL, &fork_seq, &join_seq, ctx);
+				 false, NULL, private_marker, NULL, &fork_seq,
+				 &join_seq, ctx);
 	}
 
       gimple_seq_add_seq (&new_body, fork_seq);
@@ -13399,6 +13507,11 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 		 ctx);
       break;
     case GIMPLE_BIND:
+      if (ctx && is_gimple_omp_oacc (ctx->stmt))
+	{
+	  tree vars = gimple_bind_vars (as_a <gbind *> (stmt));
+	  oacc_privatization_scan_decl_chain (ctx, vars);
+	}
       lower_omp (gimple_bind_body_ptr (as_a <gbind *> (stmt)), ctx);
       maybe_remove_omp_member_access_dummy_vars (as_a <gbind *> (stmt));
       break;
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 16124613fa7..080bdddfe88 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "cfgloop.h"
 #include "context.h"
+#include "convert.h"
 
 /* Describe the OpenACC looping structure of a function.  The entire
    function is held in a 'NULL' loop.  */
@@ -1357,7 +1358,9 @@ oacc_loop_xform_head_tail (gcall *from, int level)
 	    = ((enum ifn_unique_kind)
 	       TREE_INT_CST_LOW (gimple_call_arg (stmt, 0)));
 
-	  if (k == IFN_UNIQUE_OACC_FORK || k == IFN_UNIQUE_OACC_JOIN)
+	  if (k == IFN_UNIQUE_OACC_FORK
+	      || k == IFN_UNIQUE_OACC_JOIN
+	      || k == IFN_UNIQUE_OACC_PRIVATE)
 	    *gimple_call_arg_ptr (stmt, 2) = replacement;
 	  else if (k == kind && stmt != from)
 	    break;
@@ -1774,6 +1777,136 @@ default_goacc_reduction (gcall *call)
   gsi_replace_with_seq (&gsi, seq, true);
 }
 
+struct var_decl_rewrite_info
+{
+  gimple *stmt;
+  hash_map<tree, tree> *adjusted_vars;
+  bool avoid_pointer_conversion;
+  bool modified;
+};
+
+/* Helper function for execute_oacc_device_lower.  Rewrite VAR_DECLs (by
+   themselves or wrapped in various other nodes) according to ADJUSTED_VARS in
+   the var_decl_rewrite_info pointed to via DATA.  Used as part of coercing
+   gang-private variables in OpenACC offload regions to reside in GPU shared
+   memory.  */
+
+static tree
+oacc_rewrite_var_decl (tree *tp, int *walk_subtrees, void *data)
+{
+  walk_stmt_info *wi = (walk_stmt_info *) data;
+  var_decl_rewrite_info *info = (var_decl_rewrite_info *) wi->info;
+
+  if (TREE_CODE (*tp) == ADDR_EXPR)
+    {
+      tree arg = TREE_OPERAND (*tp, 0);
+      tree *new_arg = info->adjusted_vars->get (arg);
+
+      if (new_arg)
+	{
+	  if (info->avoid_pointer_conversion)
+	    {
+	      *tp = build_fold_addr_expr (*new_arg);
+	      info->modified = true;
+	      *walk_subtrees = 0;
+	    }
+	  else
+	    {
+	      gimple_stmt_iterator gsi = gsi_for_stmt (info->stmt);
+	      tree repl = build_fold_addr_expr (*new_arg);
+	      gimple *stmt1
+		= gimple_build_assign (make_ssa_name (TREE_TYPE (repl)), repl);
+	      tree conv = convert_to_pointer (TREE_TYPE (*tp),
+					      gimple_assign_lhs (stmt1));
+	      gimple *stmt2
+		= gimple_build_assign (make_ssa_name (TREE_TYPE (*tp)), conv);
+	      gsi_insert_before (&gsi, stmt1, GSI_SAME_STMT);
+	      gsi_insert_before (&gsi, stmt2, GSI_SAME_STMT);
+	      *tp = gimple_assign_lhs (stmt2);
+	      info->modified = true;
+	      *walk_subtrees = 0;
+	    }
+	}
+    }
+  else if (TREE_CODE (*tp) == COMPONENT_REF || TREE_CODE (*tp) == ARRAY_REF)
+    {
+      tree *base = &TREE_OPERAND (*tp, 0);
+
+      while (TREE_CODE (*base) == COMPONENT_REF
+	     || TREE_CODE (*base) == ARRAY_REF)
+	base = &TREE_OPERAND (*base, 0);
+
+      if (TREE_CODE (*base) != VAR_DECL)
+	return NULL;
+
+      tree *new_decl = info->adjusted_vars->get (*base);
+      if (!new_decl)
+	return NULL;
+
+      int base_quals = TYPE_QUALS (TREE_TYPE (*new_decl));
+      tree field = TREE_OPERAND (*tp, 1);
+
+      /* Adjust the type of the field.  */
+      int field_quals = TYPE_QUALS (TREE_TYPE (field));
+      if (TREE_CODE (field) == FIELD_DECL && field_quals != base_quals)
+	{
+	  tree *field_type = &TREE_TYPE (field);
+	  while (TREE_CODE (*field_type) == ARRAY_TYPE)
+	    field_type = &TREE_TYPE (*field_type);
+	  field_quals |= base_quals;
+	  *field_type = build_qualified_type (*field_type, field_quals);
+	}
+
+      /* Adjust the type of the component ref itself.  */
+      tree comp_type = TREE_TYPE (*tp);
+      int comp_quals = TYPE_QUALS (comp_type);
+      if (TREE_CODE (*tp) == COMPONENT_REF && comp_quals != base_quals)
+	{
+	  comp_quals |= base_quals;
+	  TREE_TYPE (*tp)
+	    = build_qualified_type (comp_type, comp_quals);
+	}
+
+      *base = *new_decl;
+      info->modified = true;
+    }
+  else if (TREE_CODE (*tp) == VAR_DECL)
+    {
+      tree *new_decl = info->adjusted_vars->get (*tp);
+      if (new_decl)
+	{
+	  *tp = *new_decl;
+	  info->modified = true;
+	}
+    }
+
+  return NULL_TREE;
+}
+
+/* Return TRUE if CALL is a call to a builtin atomic/sync operation.  */
+
+static bool
+is_sync_builtin_call (gcall *call)
+{
+  tree callee = gimple_call_fndecl (call);
+
+  if (callee != NULL_TREE
+      && gimple_call_builtin_p (call, BUILT_IN_NORMAL))
+    switch (DECL_FUNCTION_CODE (callee))
+      {
+#undef DEF_SYNC_BUILTIN
+#define DEF_SYNC_BUILTIN(ENUM, NAME, TYPE, ATTRS) case ENUM:
+#include "sync-builtins.def"
+#undef DEF_SYNC_BUILTIN
+	return true;
+
+      default:
+	;
+      }
+
+  return false;
+}
+
 /* Main entry point for oacc transformations which run on the device
    compiler after LTO, so we know what the target device is at this
    point (including the host fallback).  */
@@ -1923,6 +2056,8 @@ execute_oacc_device_lower ()
      dominance information to update SSA.  */
   calculate_dominance_info (CDI_DOMINATORS);
 
+  hash_map<tree, tree> adjusted_vars;
+
   /* Now lower internal loop functions to target-specific code
      sequences.  */
   basic_block bb;
@@ -1999,6 +2134,45 @@ execute_oacc_device_lower ()
 		case IFN_UNIQUE_OACC_TAIL_MARK:
 		  remove = true;
 		  break;
+
+		case IFN_UNIQUE_OACC_PRIVATE:
+		  {
+		    HOST_WIDE_INT level
+		      = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
+		    if (level == -1)
+		      break;
+		    for (unsigned i = 3;
+			 i < gimple_call_num_args (call);
+			 i++)
+		      {
+			tree arg = gimple_call_arg (call, i);
+			gcc_checking_assert (TREE_CODE (arg) == ADDR_EXPR);
+			tree decl = TREE_OPERAND (arg, 0);
+			if (dump_file && (dump_flags & TDF_DETAILS))
+			  {
+			    static char const *const axes[] =
+			      /* Must be kept in sync with GOMP_DIM
+				 enumeration.  */
+			      { "gang", "worker", "vector" };
+			    fprintf (dump_file, "Decl UID %u has %s "
+				     "partitioning:", DECL_UID (decl),
+				     axes[level]);
+			    print_generic_decl (dump_file, decl, TDF_SLIM);
+			    fputc ('\n', dump_file);
+			  }
+			if (targetm.goacc.adjust_private_decl)
+			  {
+			    tree oldtype = TREE_TYPE (decl);
+			    tree newdecl
+			      = targetm.goacc.adjust_private_decl (decl, level);
+			    if (TREE_TYPE (newdecl) != oldtype
+				|| newdecl != decl)
+			      adjusted_vars.put (decl, newdecl);
+			  }
+		      }
+		    remove = true;
+		  }
+		  break;
 		}
 	      break;
 	    }
@@ -2030,6 +2204,55 @@ execute_oacc_device_lower ()
 	  gsi_next (&gsi);
       }
 
+  /* Make adjustments to gang-private local variables if required by the
+     target, e.g. forcing them into a particular address space.  Afterwards,
+     ADDR_EXPR nodes which have adjusted variables as their argument need to
+     be modified in one of two ways:
+
+       1. They can be recreated, making a pointer to the variable in the new
+	  address space, or
+
+       2. The address of the variable in the new address space can be taken,
+	  converted to the default (original) address space, and the result of
+	  that conversion substituted in place of the original ADDR_EXPR node.
+
+     Which of these is done depends on the gimple statement being processed.
+     At present atomic operations and inline asms use (1), and everything else
+     uses (2).  At least on AMD GCN, there are atomic operations that work
+     directly in the LDS address space.
+
+     COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
+     the new decl, adjusting types of appropriate tree nodes as necessary.  */
+
+  if (targetm.goacc.adjust_private_decl)
+    {
+      FOR_ALL_BB_FN (bb, cfun)
+	for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	     !gsi_end_p (gsi);
+	     gsi_next (&gsi))
+	  {
+	    gimple *stmt = gsi_stmt (gsi);
+	    walk_stmt_info wi;
+	    var_decl_rewrite_info info;
+
+	    info.avoid_pointer_conversion
+	      = (is_gimple_call (stmt)
+		 && is_sync_builtin_call (as_a <gcall *> (stmt)))
+		|| gimple_code (stmt) == GIMPLE_ASM;
+	    info.stmt = stmt;
+	    info.modified = false;
+	    info.adjusted_vars = &adjusted_vars;
+
+	    memset (&wi, 0, sizeof (wi));
+	    wi.info = &info;
+
+	    walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
+
+	    if (info.modified)
+	      update_stmt (stmt);
+	  }
+    }
+
   free_oacc_loop (loops);
 
   return 0;
diff --git a/gcc/target.def b/gcc/target.def
index bbaf6b4f3a0..660b69f5cb5 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1726,6 +1726,35 @@ for allocating any storage for reductions when necessary.",
 void, (gcall *call),
 default_goacc_reduction)
 
+DEFHOOK
+(adjust_private_decl,
+"This hook, if defined, is used by accelerator target back-ends to adjust\n\
+OpenACC variable declarations that should be made private to the given\n\
+parallelism level (i.e. @code{GOMP_DIM_GANG}, @code{GOMP_DIM_WORKER} or\n\
+@code{GOMP_DIM_VECTOR}).  A typical use for this hook is to force variable\n\
+declarations at the @code{gang} level to reside in GPU shared memory.\n\
+\n\
+You may also use the @code{TARGET_GOACC_EXPAND_VAR_DECL} hook if the\n\
+adjusted variable declaration needs to be expanded to RTL in a non-standard\n\
+way.",
+tree, (tree var, int level),
+NULL)
+
+DEFHOOK
+(expand_var_decl,
+"This hook, if defined, is used by accelerator target back-ends to expand\n\
+specially handled kinds of @code{VAR_DECL} expressions.  A particular use is\n\
+to place variables with specific attributes inside special accelerator\n\
+memories.  A return value of @code{NULL} indicates that the target does not\n\
+handle this @code{VAR_DECL}, and normal RTL expanding is resumed.\n\
+\n\
+Only define this hook if your accelerator target needs to expand certain\n\
+@code{VAR_DECL} nodes in a way that differs from the default.  You can also adjust\n\
+private variables at OpenACC device-lowering time using the\n\
+@code{TARGET_GOACC_ADJUST_PRIVATE_DECL} target hook.",
+rtx, (tree var),
+NULL)
+
 HOOK_VECTOR_END (goacc)
 
 /* Functions relating to vectorization.  */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c
new file mode 100644
index 00000000000..28222c25da3
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c
@@ -0,0 +1,38 @@
+#include <assert.h>
+
+int main (void)
+{
+  int ret;
+
+  #pragma acc parallel num_gangs(1) num_workers(32) copyout(ret)
+  {
+    int w = 0;
+
+    #pragma acc loop worker
+    for (int i = 0; i < 32; i++)
+      {
+	#pragma acc atomic update
+	w++;
+      }
+
+    ret = (w == 32);
+  }
+  assert (ret);
+
+  #pragma acc parallel num_gangs(1) vector_length(32) copyout(ret)
+  {
+    int v = 0;
+
+    #pragma acc loop vector
+    for (int i = 0; i < 32; i++)
+      {
+	#pragma acc atomic update
+	v++;
+      }
+
+    ret = (v == 32);
+  }
+  assert (ret);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90 b/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90
new file mode 100644
index 00000000000..81487d7a7e0
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90
@@ -0,0 +1,25 @@
+! Test for "oacc gang-private" attribute on gang-private variables
+
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-oaccdevlow-details -w" }
+
+program main
+  integer :: w, arr(0:31)
+
+  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
+    !$acc loop gang private(w)
+! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has gang partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } }
+    do j = 0, 31
+      w = 0
+      !$acc loop seq
+      do i = 0, 31
+        !$acc atomic update
+        w = w + 1
+        !$acc end atomic
+      end do
+      arr(j) = w
+    end do
+  !$acc end parallel
+
+  if (any (arr .ne. 32)) stop 1
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90 b/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90
new file mode 100644
index 00000000000..21d13754591
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90
@@ -0,0 +1,32 @@
+! Test for worker-private variables
+
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
+
+program main
+  integer :: w, arr(0:31)
+
+  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
+    !$acc loop gang worker private(w)
+! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has worker partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } }
+    do j = 0, 31
+      w = 0
+      !$acc loop seq
+      do i = 0, 31
+        !$acc atomic update
+        w = w + 1
+        ! nvptx offloading: PR83812 "operation not supported on global/shared address space".
+        ! { dg-output "(\n|\r\n|\r)libgomp: cuStreamSynchronize error: operation not supported on global/shared address space(\n|\r\n|\r)$" { target openacc_nvidia_accel_selected } }
+        !   Scan for what we expect in the "XFAILed" case (without actually XFAILing).
+        ! { dg-shouldfail "XFAILed" { openacc_nvidia_accel_selected } }
+        !   ... instead of 'dg-xfail-run-if' so that 'dg-output' is evaluated at all.
+        ! { dg-final { if { [dg-process-target { xfail openacc_nvidia_accel_selected }] == "F" } { xfail "[testname-for-summary] really is XFAILed" } } }
+        !   ... so that we still get an XFAIL visible in the log.
+        !$acc end atomic
+      end do
+      arr(j) = w
+    end do
+  !$acc end parallel
+
+  if (any (arr .ne. 32)) stop 1
+end program main
-- 
2.30.2



* Re: [PATCH 3/3] nvptx: NVPTX parts for OpenACC private variables patch
  2021-02-26 12:34 ` [PATCH 3/3] nvptx: NVPTX " Julian Brown
@ 2021-05-21 18:59   ` Thomas Schwinge
  0 siblings, 0 replies; 24+ messages in thread
From: Thomas Schwinge @ 2021-05-21 18:59 UTC (permalink / raw)
  To: Julian Brown, gcc-patches, Tom de Vries; +Cc: jakub

[-- Attachment #1: Type: text/plain, Size: 901 bytes --]

Hi!

On 2021-02-26T04:34:52-0800, Julian Brown <julian@codesourcery.com> wrote:
> This patch contains the NVPTX backend support for placing OpenACC
> gang-private variables in GPU shared memory.
>
> Tested with offloading to NVPTX.
>
> This is substantially the same as the version previously posted: I will
> assume it is already approved (unless I hear objections), and will commit
> it at the same time as the rest of the series.
>
>   (https://gcc.gnu.org/pipermail/gcc-patches/2018-October/507909.html)

I've additionally pushed "[OpenACC privatization, nvptx] Tighten some
aspects [PR90115]" to master branch in commit
f6f45309d9fc140006886456b291e4ac24812cea, see attached.


Grüße
 Thomas


-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

[-- Attachment #2: 0001-OpenACC-privatization-nvptx-Tighten-some-aspects-PR9.patch --]
[-- Type: text/x-diff, Size: 1681 bytes --]

From f6f45309d9fc140006886456b291e4ac24812cea Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Thu, 20 May 2021 15:08:38 +0200
Subject: [PATCH] [OpenACC privatization, nvptx] Tighten some aspects [PR90115]

No functional change.

	gcc/
	PR middle-end/90115
	* config/nvptx/nvptx.c (nvptx_goacc_adjust_private_decl)
	(nvptx_goacc_expand_var_decl): Tighten.
---
 gcc/config/nvptx/nvptx.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 80116e570d6..60d3f079048 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -6682,12 +6682,12 @@ nvptx_truly_noop_truncation (poly_uint64, poly_uint64)
 static tree
 nvptx_goacc_adjust_private_decl (tree decl, int level)
 {
-  if (level != GOMP_DIM_GANG)
-    return decl;
+  gcc_checking_assert (!lookup_attribute ("oacc gang-private",
+					  DECL_ATTRIBUTES (decl)));
 
   /* Set "oacc gang-private" attribute for gang-private variable
      declarations.  */
-  if (!lookup_attribute ("oacc gang-private", DECL_ATTRIBUTES (decl)))
+  if (level == GOMP_DIM_GANG)
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	{
@@ -6708,9 +6708,10 @@ static rtx
 nvptx_goacc_expand_var_decl (tree var)
 {
   /* Place "oacc gang-private" variables in shared memory.  */
-  if (VAR_P (var)
-      && lookup_attribute ("oacc gang-private", DECL_ATTRIBUTES (var)))
+  if (lookup_attribute ("oacc gang-private", DECL_ATTRIBUTES (var)))
     {
+      gcc_checking_assert (VAR_P (var));
+
       unsigned int offset, *poffset;
       poffset = gang_private_shared_hmap.get (var);
       if (poffset)
-- 
2.30.2



* Re: [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory
  2021-02-26 12:34 ` [PATCH 1/3] openacc: Add support for gang local storage allocation " Julian Brown
  2021-04-15 17:26   ` Thomas Schwinge
@ 2021-05-21 19:12   ` Thomas Schwinge
  1 sibling, 0 replies; 24+ messages in thread
From: Thomas Schwinge @ 2021-05-21 19:12 UTC (permalink / raw)
  To: Julian Brown, gcc-patches; +Cc: jakub

[-- Attachment #1: Type: text/plain, Size: 1829 bytes --]

Hi!

On 2021-02-26T04:34:50-0800, Julian Brown <julian@codesourcery.com> wrote:
> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -2957,6 +2957,8 @@ expand_UNIQUE (internal_fn, gcall *stmt)
>        else
>       gcc_unreachable ();
>        break;
> +    case IFN_UNIQUE_OACC_PRIVATE:
> +      break;
>      }
>
>    if (pattern)

That's unexpected.  Meaning: better if this doesn't happen.

> --- a/gcc/omp-offload.c
> +++ b/gcc/omp-offload.c

> @@ -1998,6 +2133,45 @@ execute_oacc_device_lower ()
>               case IFN_UNIQUE_OACC_TAIL_MARK:
>                 remove = true;
>                 break;
> +
> +             case IFN_UNIQUE_OACC_PRIVATE:
> +               {
> +                 HOST_WIDE_INT level
> +                   = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
> +                 if (level == -1)
> +                   break;

They should all be "handled" here (meaning: for 'level == -1', too, do
'remove = true' after the real handling):

> +                 for (unsigned i = 3;
> +                      i < gimple_call_num_args (call);
> +                      i++)
> +                   {
> +                     [...]
> +                   }
> +                 remove = true;
> +               }
> +               break;
>               }
>             break;
>           }

Why we can have 'level == -1' cases at all is a separate bug to be fixed.

I've pushed "[OpenACC privatization] Don't let unhandled
'IFN_UNIQUE_OACC_PRIVATE' linger [PR90115]" to master branch in commit
ff451ea723deb3fe8471eb96ac9381c063ec6533, see attached.


Grüße
 Thomas



[-- Attachment #2: 0001-OpenACC-privatization-Don-t-let-unhandled-IFN_UNIQUE.patch --]
[-- Type: text/x-diff, Size: 2113 bytes --]

From ff451ea723deb3fe8471eb96ac9381c063ec6533 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Thu, 20 May 2021 15:37:07 +0200
Subject: [PATCH] [OpenACC privatization] Don't let unhandled
 'IFN_UNIQUE_OACC_PRIVATE' linger [PR90115]

Make sure they're all handled in 'execute_oacc_device_lower'.  Why we at all
can have 'level == -1' cases is a separate bug to be fixed.

	gcc/
	PR middle-end/90115
	* omp-offload.c (execute_oacc_device_lower)
	<IFN_UNIQUE_OACC_PRIVATE>: Diagnose and handle for 'level == -1'
	case, too.
	* internal-fn.c (expand_UNIQUE): Don't expect
	'IFN_UNIQUE_OACC_PRIVATE'.
---
 gcc/internal-fn.c |  2 --
 gcc/omp-offload.c | 10 ++++++----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index d92080c8077..d209a52f823 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2969,8 +2969,6 @@ expand_UNIQUE (internal_fn, gcall *stmt)
       else
 	gcc_unreachable ();
       break;
-    case IFN_UNIQUE_OACC_PRIVATE:
-      break;
     }
 
   if (pattern)
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 080bdddfe88..36bd2e44d81 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -2139,8 +2139,9 @@ execute_oacc_device_lower ()
 		  {
 		    HOST_WIDE_INT level
 		      = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
-		    if (level == -1)
-		      break;
+		    gcc_checking_assert (level == -1
+					 || (level >= 0
+					     && level < GOMP_DIM_MAX));
 		    for (unsigned i = 3;
 			 i < gimple_call_num_args (call);
 			 i++)
@@ -2156,11 +2157,12 @@ execute_oacc_device_lower ()
 			      { "gang", "worker", "vector" };
 			    fprintf (dump_file, "Decl UID %u has %s "
 				     "partitioning:", DECL_UID (decl),
-				     axes[level]);
+				     (level == -1 ? "UNKNOWN" : axes[level]));
 			    print_generic_decl (dump_file, decl, TDF_SLIM);
 			    fputc ('\n', dump_file);
 			  }
-			if (targetm.goacc.adjust_private_decl)
+			if (level != -1
+			    && targetm.goacc.adjust_private_decl)
 			  {
 			    tree oldtype = TREE_TYPE (decl);
 			    tree newdecl
-- 
2.30.2



* Re: [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory
  2021-04-19 11:23     ` Julian Brown
  2021-05-21 18:55       ` Thomas Schwinge
@ 2021-05-21 19:18       ` Thomas Schwinge
  2021-05-21 19:20       ` Thomas Schwinge
  2021-05-21 19:29       ` Thomas Schwinge
  3 siblings, 0 replies; 24+ messages in thread
From: Thomas Schwinge @ 2021-05-21 19:18 UTC (permalink / raw)
  To: Julian Brown, gcc-patches; +Cc: Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 7526 bytes --]

Hi!

On 2021-04-19T12:23:56+0100, Julian Brown <julian@codesourcery.com> wrote:
> On Thu, 15 Apr 2021 19:26:54 +0200
> Thomas Schwinge <thomas@codesourcery.com> wrote:
>> On 2021-02-26T04:34:50-0800, Julian Brown <julian@codesourcery.com>
>> wrote:
>> > Two new target hooks are introduced:
>> > TARGET_GOACC_ADJUST_PRIVATE_DECL and TARGET_GOACC_EXPAND_VAR_DECL.
>> > The first can tweak a variable declaration at oaccdevlow time, and
>> > the second at expand time.  The first or both of these target hooks
>> > can be used by a given offload target, depending on its strategy
>> > for implementing private variables.
>>
>> ACK.
>>
>> So, currently we're only looking at making the gang-private level
>> work. Regarding that, we have two configurations: (1) for GCN
>> offloading, 'targetm.goacc.adjust_private_decl' does the work (in
>> particular, change 'TREE_TYPE' etc.) and there is no
>> 'targetm.goacc.expand_var_decl', and (2) for nvptx offloading,
>> 'targetm.goacc.adjust_private_decl' only sets a marker ('oacc
>> gangprivate' attribute) and then 'targetm.goacc.expand_var_decl' does
>> the work.
>>
>> Therefore I suggest we clarify the (currently) expected handling
>> similar to:
>>
>>     --- gcc/omp-offload.c
>>     +++ gcc/omp-offload.c
>>     @@ -1854,6 +1854,19 @@ oacc_rewrite_var_decl (tree *tp, int *walk_subtrees, void *data) return NULL_TREE;
>>      }
>>
>>     +static tree
>>     +oacc_rewrite_var_decl_ (tree *tp, int *walk_subtrees, void *data)
>>     +{
>>     +  tree t = oacc_rewrite_var_decl (tp, walk_subtrees, data);
>>     +  if (targetm.goacc.expand_var_decl)
>>     +    {
>>     +      walk_stmt_info *wi = (walk_stmt_info *) data;
>>     +      var_decl_rewrite_info *info = (var_decl_rewrite_info *) wi->info;
>>     +      gcc_assert (!info->modified);
>>     +    }
>>     +  return t;
>>     +}
>
> Why the ugly _ tail on the function name!? I don't think that's a
> typical GNU coding standards thing, is it?

Heh, that was just to make the WIP prototype changes diff as small as
possible.  ;-)

>>     +
>>      /* Return TRUE if CALL is a call to a builtin atomic/sync operation.  */
>>      static bool
>>     @@ -2195,6 +2208,9 @@ execute_oacc_device_lower ()
>>           COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also
>> rewritten to use the new decl, adjusting types of appropriate tree
>> nodes as necessary.  */
>>     +  if (targetm.goacc.expand_var_decl)
>>     +    gcc_assert (adjusted_vars.is_empty ());
>
> If you like

I've pushed "[OpenACC privatization] Explain two different configurations
[PR90115]" to master branch in commit
21803fcaebeab36de0d7b6b8cf6abb9389f5e51f, see attached.

> -- or do something like
>
>>        if (targetm.goacc.adjust_private_decl)
>              && !adjusted_vars.is_empty ())
>
> perhaps.

That, too, additionally: I've pushed "[OpenACC privatization] Skip
processing if no work to be done [PR90115]" to master branch in commit
ad4612cb048b261f6834e9155e41e40e9252c80b, see attached.

>>          {
>>            FOR_ALL_BB_FN (bb, cfun)
>>     @@ -2217,7 +2233,7 @@ execute_oacc_device_lower ()
>>                 memset (&wi, 0, sizeof (wi));
>>                 wi.info = &info;
>>
>>     -           walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
>>     +           walk_gimple_op (stmt, oacc_rewrite_var_decl_, &wi);
>>
>>                 if (info.modified)
>>                   update_stmt (stmt);
>>
>> Or, in fact, 'if (targetm.goacc.expand_var_decl)', skip the
>> 'adjusted_vars' handling completely?
>
> For the current pair of implementations, sure. I don't think it's
> necessary to set that as a constraint for future targets though? I
> guess it doesn't matter much until such a target exists.
>
>> I do understand that eventually (in particular, for worker-private
>> level?), both 'targetm.goacc.adjust_private_decl' and
>> 'targetm.goacc.expand_var_decl' may need to do things, but that's
>> currently not meant to be addressed, and thus not fully worked out and
>> implemented, and thus untested.  Hence, 'assert' what currently is
>> implemented/tested, only.
>
> If you like, no strong feelings from me on that.
>
>> (Given that eventual goal, that's probably sufficient motivation to
>> indeed add the 'adjusted_vars' handling in generic 'gcc/omp-offload.c'
>> instead of moving it into the GCN back end?)
>
> I'm not sure what moving it to the GCN back end would look like. I
> guess it's a question of keeping the right abstractions in the right
> place.

Right.  I guess we'll figure that out once we have more than one back end
using the 'adjusted_vars' machinery.
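
As a rough illustration of what the 'adjusted_vars' machinery does, including
the "skip if no work to be done" early-out, here is a toy model in standalone
C++.  It is a sketch under assumptions, not GCC code: 'Stmt', 'AdjustedVars',
'rewrite_stmt' and 'rewrite_all' are hypothetical names, statements are
reduced to lists of operand names, and the map plays the role of
'hash_map<tree, tree> adjusted_vars'.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for a gimple statement: just a list of operand names.
struct Stmt
{
  std::vector<std::string> operands;
  bool updated = false;  // Models the effect of 'update_stmt'.
};

// Maps an original decl name to its adjusted replacement.
using AdjustedVars = std::unordered_map<std::string, std::string>;

// Replace every operand that has an entry in ADJUSTED.  Returns true if
// anything changed (this mirrors 'info.modified').
static bool
rewrite_stmt (Stmt &stmt, const AdjustedVars &adjusted)
{
  bool modified = false;
  for (std::string &op : stmt.operands)
    {
      auto it = adjusted.find (op);
      if (it != adjusted.end ())
        {
          op = it->second;
          modified = true;
        }
    }
  return modified;
}

// Walk all statements, but only if the target has an adjustment hook and
// at least one decl was actually adjusted.  Returns the number of
// statements visited, so that the early-out is observable.
static unsigned
rewrite_all (std::vector<Stmt> &stmts, const AdjustedVars &adjusted,
             bool have_adjust_hook)
{
  if (!have_adjust_hook || adjusted.empty ())
    return 0;

  unsigned visited = 0;
  for (Stmt &stmt : stmts)
    {
      ++visited;
      if (rewrite_stmt (stmt, adjusted))
        stmt.updated = true;
    }
  return visited;
}
```

In this model, a target that only sets a marker in its adjustment hook (as
described for nvptx) leaves the map empty, so no statement is ever visited.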

>> > +       1. They can be recreated, making a pointer to the variable in the new
>> > +    address space, or
>> > +
>> > +       2. The address of the variable in the new address space can be taken,
>> > +    converted to the default (original) address space, and the result of
>> > +    that conversion substituted in place of the original ADDR_EXPR node.
>> > +
>> > +     Which of these is done depends on the gimple statement being processed.
>> > +     At present atomic operations and inline asms use (1), and everything else
>> > +     uses (2).  At least on AMD GCN, there are atomic operations that work
>> > +     directly in the LDS address space.
>> > +
>> > +     COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
>> > +     the new decl, adjusting types of appropriate tree nodes as necessary.  */
>>
>> [TS] As I understand, this is only relevant for GCN offloading, but
>> not nvptx, and I'll trust that these two variants make sense from a
>> GCN point of view (which I cannot verify easily).
>
> The idea (hope) is that that's what's necessary "generically", though
> the only target using that support is GCN at present. I.e. it's not
> supposed to be GCN-specific, necessarily. Of course though, who knows
> what some other exotic target will need? (We don't want to be in the
> state where each target has to start completely from scratch for this
> sort of thing, if we can help it.)
>
>> > +  if (targetm.goacc.adjust_private_decl)
>> > +    {
>> > +      FOR_ALL_BB_FN (bb, cfun)
>> > +  for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
>> > +       !gsi_end_p (gsi);
>> > +       gsi_next (&gsi))
>> > +    {
>> > +      gimple *stmt = gsi_stmt (gsi);
>> > +      walk_stmt_info wi;
>> > +      var_decl_rewrite_info info;
>> > +
>> > +      info.avoid_pointer_conversion
>> > +        = (is_gimple_call (stmt)
>> > +           && is_sync_builtin_call (as_a <gcall *> (stmt)))
>> > +          || gimple_code (stmt) == GIMPLE_ASM;
>> > +      info.stmt = stmt;
>> > +      info.modified = false;
>> > +      info.adjusted_vars = &adjusted_vars;
>> > +
>> > +      memset (&wi, 0, sizeof (wi));
>> > +      wi.info = &info;
>> > +
>> > +      walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
>> > +
>> > +      if (info.modified)
>> > +        update_stmt (stmt);
>> > +    }
>> > +    }
>> > +
>> >    free_oacc_loop (loops);
>> >
>> >    return 0;
>>
>> [TS] As discussed above, maybe we can completely skip the 'adjusted_vars'
>> rewriting for nvptx offloading?
>
> Yeah sure, if you like.
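
The choice between the two ADDR_EXPR rewriting strategies quoted above can be
summarized with a small standalone C++ sketch.  It is illustrative only:
'StmtKind', 'avoid_pointer_conversion' (as a free function) and
'rewrite_address' are hypothetical, and the "rewritten" address is just
rendered as text rather than built as a tree.

```cpp
#include <string>

// Hypothetical classification of the statement containing the ADDR_EXPR.
enum class StmtKind { SyncBuiltinCall, OtherCall, InlineAsm, Assign };

// Atomic/sync builtins and inline asms take the address directly in the
// new address space; everything else gets a converted pointer.
static bool
avoid_pointer_conversion (StmtKind kind)
{
  return kind == StmtKind::SyncBuiltinCall || kind == StmtKind::InlineAsm;
}

// Describe, as text, what replaces '&VAR' once VAR has become NEW_VAR.
static std::string
rewrite_address (StmtKind kind, const std::string &new_var)
{
  std::string addr = "&" + new_var;
  if (avoid_pointer_conversion (kind))
    return addr;                  // Strategy (1): pointer in new space.
  return "(generic *) " + addr;   // Strategy (2): convert back to default.
}
```

Here "(generic *)" merely stands for the conversion to the default (original)
address space; in the patch that conversion is emitted as separate gimple
assignments before the statement.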


Grüße
 Thomas



[-- Attachment #2: 0001-OpenACC-privatization-Explain-two-different-configur.patch --]
[-- Type: text/x-diff, Size: 1836 bytes --]

From 21803fcaebeab36de0d7b6b8cf6abb9389f5e51f Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Thu, 20 May 2021 15:44:09 +0200
Subject: [PATCH 1/2] [OpenACC privatization] Explain two different
 configurations [PR90115]

	gcc/
	PR middle-end/90115
	* omp-offload.c (execute_oacc_device_lower): Explain.
---
 gcc/omp-offload.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 36bd2e44d81..336b48d5a3b 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -2206,6 +2206,26 @@ execute_oacc_device_lower ()
 	  gsi_next (&gsi);
       }
 
+  /* Regarding the OpenACC privatization level, we're currently only looking at
+     making the gang-private level work.  Regarding that, we have the following
+     configurations:
+
+       - GCN offloading: 'targetm.goacc.adjust_private_decl' does the work (in
+	 particular, change 'TREE_TYPE', etc.) and there is no
+	 'targetm.goacc.expand_var_decl'.
+
+       - nvptx offloading: 'targetm.goacc.adjust_private_decl' only sets a
+	 marker and then 'targetm.goacc.expand_var_decl' does the work.
+
+     Eventually (in particular, for worker-private level?), both
+     'targetm.goacc.adjust_private_decl' and 'targetm.goacc.expand_var_decl'
+     may need to do things, but that's currently not meant to be addressed, and
+     thus not fully worked out and implemented, and thus untested.  Hence,
+     'assert' what currently is implemented/tested, only.  */
+
+  if (targetm.goacc.expand_var_decl)
+    gcc_assert (adjusted_vars.is_empty ());
+
   /* Make adjustments to gang-private local variables if required by the
      target, e.g. forcing them into a particular address space.  Afterwards,
      ADDR_EXPR nodes which have adjusted variables as their argument need to
-- 
2.30.2


[-- Attachment #3: 0002-OpenACC-privatization-Skip-processing-if-no-work-to-.patch --]
[-- Type: text/x-diff, Size: 1019 bytes --]

From ad4612cb048b261f6834e9155e41e40e9252c80b Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Thu, 20 May 2021 15:45:06 +0200
Subject: [PATCH 2/2] [OpenACC privatization] Skip processing if no work to be
 done [PR90115]

	gcc/
	PR middle-end/90115
	* omp-offload.c (execute_oacc_device_lower): Skip processing if no
	work to be done.
---
 gcc/omp-offload.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 336b48d5a3b..8bfb8b36cf0 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -2246,7 +2246,8 @@ execute_oacc_device_lower ()
      COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
      the new decl, adjusting types of appropriate tree nodes as necessary.  */
 
-  if (targetm.goacc.adjust_private_decl)
+  if (targetm.goacc.adjust_private_decl
+      && !adjusted_vars.is_empty ())
     {
       FOR_ALL_BB_FN (bb, cfun)
 	for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory
  2021-04-19 11:23     ` Julian Brown
  2021-05-21 18:55       ` Thomas Schwinge
  2021-05-21 19:18       ` Thomas Schwinge
@ 2021-05-21 19:20       ` Thomas Schwinge
  2021-05-21 19:29       ` Thomas Schwinge
  3 siblings, 0 replies; 24+ messages in thread
From: Thomas Schwinge @ 2021-05-21 19:20 UTC (permalink / raw)
  To: Julian Brown, gcc-patches; +Cc: Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 879 bytes --]

Hi!

On 2021-04-19T12:23:56+0100, Julian Brown <julian@codesourcery.com> wrote:
> On Thu, 15 Apr 2021 19:26:54 +0200
> Thomas Schwinge <thomas@codesourcery.com> wrote:
>> As that may not be obvious to the reader, I'd like to have the
>> 'TREE_ADDRESSABLE' conditionalization be documented in the code.  You
>> had explained that in
>> <http://mid.mail-archive.com/20190612204216.0ec83e4e@squid.athome>: "a
>> non-addressable variable [...]".
>
> Yeah that probably makes sense.

I've pushed "[OpenACC privatization] Explain OpenACC privatization
candidate selection [PR90115]" to master branch in commit
5a0fe1f6c4ad0e50bf4684e723ae2ba17d94c9e4, see attached.


Regards
 Thomas


-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-OpenACC-privatization-Explain-OpenACC-privatization-.patch --]
[-- Type: text/x-diff, Size: 2865 bytes --]

From 5a0fe1f6c4ad0e50bf4684e723ae2ba17d94c9e4 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Thu, 20 May 2021 15:55:18 +0200
Subject: [PATCH] [OpenACC privatization] Explain OpenACC privatization
 candidate selection [PR90115]

	gcc/
	PR middle-end/90115
	* omp-low.c (oacc_privatization_candidate_p): New function.
	(oacc_privatization_scan_clause_chain)
	(oacc_privatization_scan_decl_chain): Use it.  Also
	'gcc_checking_assert' that we're not seeing duplicates.
---
 gcc/omp-low.c | 45 ++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index a86c6c1e82c..577676b2a16 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -10144,6 +10144,36 @@ lower_omp_for_lastprivate (struct omp_for_data *fd, gimple_seq *body_p,
     }
 }
 
+/* OpenACC privatization.
+
+   Or, in other words, *sharing* at the respective OpenACC level of
+   parallelism.
+
+   From a correctness perspective, a non-addressable variable can't be accessed
+   outside the current thread, so it can go in a (faster than shared memory)
+   register -- though that register may need to be broadcast in some
+   circumstances.  A variable can only meaningfully be "shared" across workers
+   or vector lanes if its address is taken, e.g. by a call to an atomic
+   builtin.
+
+   From an optimisation perspective, the answer might be fuzzier: maybe
+   sometimes, using shared memory directly would be faster than
+   broadcasting.  */
+
+static bool
+oacc_privatization_candidate_p (const tree decl)
+{
+  bool res = true;
+
+  if (res && !VAR_P (decl))
+    res = false;
+
+  if (res && !TREE_ADDRESSABLE (decl))
+    res = false;
+
+  return res;
+}
+
 /* Scan CLAUSES for candidates for adjusting OpenACC privatization level in
    CTX.  */
 
@@ -10154,8 +10184,12 @@ oacc_privatization_scan_clause_chain (omp_context *ctx, tree clauses)
     if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_PRIVATE)
       {
 	tree decl = OMP_CLAUSE_DECL (c);
-	if (VAR_P (decl) && TREE_ADDRESSABLE (decl))
-	  ctx->oacc_privatization_candidates.safe_push (decl);
+
+	if (!oacc_privatization_candidate_p (decl))
+	  continue;
+
+	gcc_checking_assert (!ctx->oacc_privatization_candidates.contains (decl));
+	ctx->oacc_privatization_candidates.safe_push (decl);
       }
 }
 
@@ -10166,8 +10200,13 @@ static void
 oacc_privatization_scan_decl_chain (omp_context *ctx, tree decls)
 {
   for (tree decl = decls; decl; decl = DECL_CHAIN (decl))
-    if (VAR_P (decl) && TREE_ADDRESSABLE (decl))
+    {
+      if (!oacc_privatization_candidate_p (decl))
+	continue;
+
+      gcc_checking_assert (!ctx->oacc_privatization_candidates.contains (decl));
       ctx->oacc_privatization_candidates.safe_push (decl);
+    }
 }
 
 /* Callback for walk_gimple_seq.  Find #pragma omp scan statement.  */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory
  2021-04-19 11:23     ` Julian Brown
                         ` (2 preceding siblings ...)
  2021-05-21 19:20       ` Thomas Schwinge
@ 2021-05-21 19:29       ` Thomas Schwinge
  2021-05-22  1:40         ` [r12-989 Regression] FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -Os (test for warnings, line 98) on Linux/x86_64 sunil.k.pandey
                           ` (5 more replies)
  3 siblings, 6 replies; 24+ messages in thread
From: Thomas Schwinge @ 2021-05-21 19:29 UTC (permalink / raw)
  To: Julian Brown, gcc-patches; +Cc: Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 7767 bytes --]

Hi!

On 2021-04-19T12:23:56+0100, Julian Brown <julian@codesourcery.com> wrote:
> On Thu, 15 Apr 2021 19:26:54 +0200
> Thomas Schwinge <thomas@codesourcery.com> wrote:
>> On 2021-02-26T04:34:50-0800, Julian Brown <julian@codesourcery.com>
>> wrote:
>> I was surprised that we didn't really have to fix up any existing
>> libgomp testcases, because there seem to be quite some that contain a
>> pattern (exemplified by the 'tmp' variable) as follows:
>>
>>     int main()
>>     {
>>     #define N 123
>>       int data[N];
>>       int tmp;
>>
>>     #pragma acc parallel // implicit 'firstprivate(tmp)'
>>       {
>>         // 'tmp' now conceptually made gang-private here.
>>     #pragma acc loop gang
>>         for (int i = 0; i < 123; ++i)
>>           {
>>             tmp = i + 234;
>>             data[i] = tmp;
>>           }
>>       }
>>
>>       for (int i = 0; i < 123; ++i)
>>         if (data[i] != i + 234)
>>           __builtin_abort ();
>>
>>       return 0;
>>     }
>>
>> With the code changes as posted, this actually now does *not* use
>> gang-private memory for 'tmp', but instead continues to use
>> "thread-private registers", as before.
>
> When "tmp" is a local, non-address-taken scalar like that, it'll
> probably end up in a register in offloaded code (or of course be
> compiled out completely), both before and after this patch. So I
> wouldn't expect this to not work in the pre-patch state.

Of course, in the example as posted, there's no need to make 'tmp'
gang-private.  However, even if the 'i' loop did something more
spectacular (that makes 'tmp' addressable/potentially shared), at present
we still wouldn't handle that case: (a) we're not processing clauses on
OpenACC compute constructs (only 'loop' construct), and (b) we're not
processing 'firstprivate' clauses (only 'private').  That's now all easy
to fix (and reflect in the testsuite), but needs proper time allocated.

Relatedly, should we also think about using that new privatization
functionality for the 'private' aspect that comes with 'reduction'
clauses?

>> Same for:
>>
>>     --- s3.c 2021-04-13 17:26:49.628739379 +0200
>>     +++ s3_2.c       2021-04-13 17:29:43.484579664 +0200
>>     @@ -4,6 +4,6 @@
>>        int data[N];
>>     -  int tmp;
>>
>>     -#pragma acc parallel // implicit 'firstprivate(tmp)'
>>     +#pragma acc parallel
>>        {
>>     +    int tmp;
>>          // 'tmp' now conceptually made gang-private here.
>>      #pragma acc loop gang
>>
>> I suppose that's due to conditionalizing this transformation on
>> 'TREE_ADDRESSABLE' (as you're doing), so we should be mostly "safe"
>> regarding such existing testcases (but I haven't verified that yet in
>> detail).
>
> Right.
>
>> That needs to be documented in testcases, with some kind of dump
>> scanning (host compilation-side even; see below).

Done.

>> A note for later: if this weren't just a 'gang' loop, but 'gang' plus
>> 'worker' and/or 'vector', we'd actually be fixing up user code with
>> undefined behavior into "correct" code (by *not* making 'tmp'
>> gang-private, but thread-private), right?
>
> Possibly -- coming up with a case like that might need a little
> "ingenuity"...

Still to be done.

>> > In terms of implementation, the parallelism level of a given loop is
>> > not fixed until the oaccdevlow pass in the offload compiler, so the
>> > patch delays fixing the parallelism level of variables declared on
>> > or within such loops until the same point. This is done by adding a
>> > new internal UNIQUE function (OACC_PRIVATE) that lists (the address
>> > of) each private variable as an argument, and other arguments set
>> > so as to be able to determine the correct parallelism level to use
>> > for the listed variables. This new internal function fits into the
>> > existing scheme for demarcating OpenACC loops, as described in
>> > comments in the patch.
>>
>> Yes, thanks, that's conceptually now much better than the earlier
>> variants that we had.  :-) (Hooray, again, for Nathan's OpenACC
>> execution model design!)
>>
>> What we should add, though, is a bunch of testcases to verify that the
>> expected processing does/doesn't happen for relevant source code
>> constructs.  I'm thinking that when the transformation is/isn't done,
>> that gets logged, and we can then scan the dumps accordingly.  Some of
>> that is implemented already; we should be able to do such scanning
>> generally for host compilation, too, not just offloading compilation.
>
> More test coverage is always welcome, of course.

;-) I couldn't resist -- and along the way found/fixed several issues in
the code.

>> > [snip]
>> >    tree fork_kind = build_int_cst (unsigned_type_node,
>> > IFN_UNIQUE_OACC_FORK); tree join_kind = build_int_cst
>> > (unsigned_type_node, IFN_UNIQUE_OACC_JOIN);
>> > @@ -8027,7 +8041,8 @@ lower_oacc_head_tail (location_t loc, tree
>> > clauses, &join_seq);
>> >
>> >        lower_oacc_reductions (loc, clauses, place, inner,
>> > -                       fork, join, &fork_seq, &join_seq, ctx);
>> > +                       fork, (count == 1) ? private_marker : NULL,
>> > +                       join, &fork_seq, &join_seq,  ctx);
>> >
>> >        /* Append this level to head. */
>> >        gimple_seq_add_seq (head, fork_seq);
>>
>> [TS] That looks good in principle.  Via the testing mentioned above, I
>> just want to make sure that this does all the expected things
>> regarding differently nested loops and privatization levels.
>
> Feel free to extend test coverage as you see fit...

A little bit added, but more still to be done.

I've pushed "[OpenACC privatization] Largely extend diagnostics and
corresponding testsuite coverage [PR90115]" to master branch in commit
11b8286a83289f5b54e813f14ff56d730c3f3185, see attached.  I had, of
course, developed that in several iterations, intertwined with
implementation changes, but didn't now feel like disentangling all that,
sorry.


>> > --- /dev/null
>> > +++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
>>
>> [TS] With code changes as posted, this one FAILs for nvptx offloading
>> execution.  (... for all but the Nvidia Titan V GPU in my set of
>> testing configurations, huh?)
>>
>> > @@ -0,0 +1,25 @@
>> > +! Test for worker-private variables
>> > +
>> > +! { dg-do run }
>> > +! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
>> > +
>> > +program main
>> > +  integer :: w, arr(0:31)
>> > +
>> > +  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
>> > +    !$acc loop gang worker private(w)
>> > +! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has worker partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
>> > +    do j = 0, 31
>> > +      w = 0
>> > +      !$acc loop seq
>> > +      do i = 0, 31
>> > +        !$acc atomic update
>> > +        w = w + 1
>> > +        !$acc end atomic
>> > +      end do
>> > +      arr(j) = w
>> > +    end do
>> > +  !$acc end parallel
>> > +
>> > +  if (any (arr .ne. 32)) stop 1
>> > +end program main
>
> Boo. I don't think I saw such a failure on the systems I tested on.
> That needs investigation (though it might be something CUDA-version or
> GPU specific, hence not directly a GCC problem? Not sure.)

That's <https://gcc.gnu.org/PR100678> "[OpenACC/nvptx]
'libgomp.oacc-c-c++-common/private-atomic-1.c' FAILs (differently) in
certain configurations", now XFAILed.


Regards
 Thomas



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-OpenACC-privatization-Largely-extend-diagnostics-and.patch --]
[-- Type: text/x-diff, Size: 331599 bytes --]

From 11b8286a83289f5b54e813f14ff56d730c3f3185 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Thu, 20 May 2021 16:11:37 +0200
Subject: [PATCH] [OpenACC privatization] Largely extend diagnostics and
 corresponding testsuite coverage [PR90115]

	gcc/
	PR middle-end/90115
	* flag-types.h (enum openacc_privatization): New.
	* params.opt (-param=openacc-privatization): New.
	* doc/invoke.texi (openacc-privatization): Document it.
	* omp-general.h (get_openacc_privatization_dump_flags): New
	function.
	* omp-low.c (oacc_privatization_candidate_p): Add diagnostics.
	* omp-offload.c (execute_oacc_device_lower)
	<IFN_UNIQUE_OACC_PRIVATE>: Re-work diagnostics.
	* target.def (goacc.adjust_private_decl): Add 'location_t'
	parameter.
	* doc/tm.texi: Regenerate.
	* config/gcn/gcn-protos.h (gcn_goacc_adjust_private_decl): Adjust.
	* config/gcn/gcn-tree.c (gcn_goacc_adjust_private_decl): Likewise.
	* config/nvptx/nvptx.c (nvptx_goacc_adjust_private_decl):
	Likewise.  Preserve it for...
	(nvptx_goacc_expand_var_decl): ... use here.
	gcc/testsuite/
	PR middle-end/90115
	* c-c++-common/goacc/privatization-1-compute-loop.c: New file.
	* c-c++-common/goacc/privatization-1-compute.c: Likewise.
	* c-c++-common/goacc/privatization-1-routine_gang-loop.c:
	Likewise.
	* c-c++-common/goacc/privatization-1-routine_gang.c: Likewise.
	* gfortran.dg/goacc/privatization-1-compute-loop.f90: Likewise.
	* gfortran.dg/goacc/privatization-1-compute.f90: Likewise.
	* gfortran.dg/goacc/privatization-1-routine_gang-loop.f90:
	Likewise.
	* gfortran.dg/goacc/privatization-1-routine_gang.f90: Likewise.
	* c-c++-common/goacc-gomp/nesting-1.c: Update.
	* c-c++-common/goacc/private-reduction-1.c: Likewise.
	* gfortran.dg/goacc/private-3.f95: Likewise.
	libgomp/
	PR middle-end/90115
	* testsuite/libgomp.oacc-fortran/private-atomic-1-vector.f90: New
	file.
	* testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Update.
	* testsuite/libgomp.oacc-c-c++-common/host_data-7.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-g-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-v-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/private-atomic-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/private-variables.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/routine-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-variable-1.c:
	Likewise.
	* testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise.
	* testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise.
	* testsuite/libgomp.oacc-fortran/declare-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/host_data-5.F90: Likewise.
	* testsuite/libgomp.oacc-fortran/if-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/optional-private.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/private-variables.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/routine-7.f90: Likewise.
---
 gcc/config/gcn/gcn-protos.h                   |   2 +-
 gcc/config/gcn/gcn-tree.c                     |   2 +-
 gcc/config/nvptx/nvptx.c                      |  61 ++-
 gcc/doc/invoke.texi                           |   8 +
 gcc/doc/tm.texi                               |   3 +-
 gcc/flag-types.h                              |   7 +
 gcc/omp-general.h                             |  13 +
 gcc/omp-low.c                                 |  75 +++-
 gcc/omp-offload.c                             |  73 +++-
 gcc/params.opt                                |  13 +
 gcc/target.def                                |   3 +-
 .../c-c++-common/goacc-gomp/nesting-1.c       |  14 +
 .../c-c++-common/goacc/private-reduction-1.c  |   6 +
 .../goacc/privatization-1-compute-loop.c      |  95 +++++
 .../goacc/privatization-1-compute.c           |  90 +++++
 .../goacc/privatization-1-routine_gang-loop.c |  95 +++++
 .../goacc/privatization-1-routine_gang.c      |  93 +++++
 gcc/testsuite/gfortran.dg/goacc/private-3.f95 |   7 +-
 .../goacc/privatization-1-compute-loop.f90    |  57 +++
 .../goacc/privatization-1-compute.f90         |  48 +++
 .../privatization-1-routine_gang-loop.f90     |  56 +++
 .../goacc/privatization-1-routine_gang.f90    |  47 +++
 .../firstprivate-1.c                          |  10 +
 .../libgomp.oacc-c-c++-common/host_data-7.c   |  16 +-
 .../kernels-decompose-1.c                     |  14 +-
 .../kernels-private-vars-local-worker-1.c     |  16 +
 .../kernels-private-vars-local-worker-2.c     |  13 +
 .../kernels-private-vars-local-worker-3.c     |  13 +
 .../kernels-private-vars-local-worker-4.c     |  14 +
 .../kernels-private-vars-local-worker-5.c     |  13 +
 .../kernels-private-vars-loop-gang-1.c        |   8 +
 .../kernels-private-vars-loop-gang-2.c        |  10 +
 .../kernels-private-vars-loop-gang-3.c        |  10 +
 .../kernels-private-vars-loop-gang-4.c        |  11 +
 .../kernels-private-vars-loop-gang-5.c        |  10 +
 .../kernels-private-vars-loop-gang-6.c        |  10 +
 .../kernels-private-vars-loop-vector-1.c      |  14 +
 .../kernels-private-vars-loop-vector-2.c      |  12 +
 .../kernels-private-vars-loop-worker-1.c      |  10 +
 .../kernels-private-vars-loop-worker-2.c      |  12 +
 .../kernels-private-vars-loop-worker-3.c      |  16 +
 .../kernels-private-vars-loop-worker-4.c      |  13 +
 .../kernels-private-vars-loop-worker-5.c      |  14 +
 .../kernels-private-vars-loop-worker-6.c      |  13 +
 .../kernels-private-vars-loop-worker-7.c      |  13 +
 .../libgomp.oacc-c-c++-common/loop-g-1.c      |  11 +
 .../libgomp.oacc-c-c++-common/loop-g-2.c      |  11 +
 .../libgomp.oacc-c-c++-common/loop-gwv-1.c    |  11 +
 .../libgomp.oacc-c-c++-common/loop-gwv-2.c    |  11 +
 .../libgomp.oacc-c-c++-common/loop-red-g-1.c  |  12 +
 .../loop-red-gwv-1.c                          |  12 +
 .../libgomp.oacc-c-c++-common/loop-red-v-1.c  |  12 +
 .../libgomp.oacc-c-c++-common/loop-red-v-2.c  |  13 +
 .../libgomp.oacc-c-c++-common/loop-red-w-1.c  |  14 +-
 .../libgomp.oacc-c-c++-common/loop-red-w-2.c  |  15 +-
 .../libgomp.oacc-c-c++-common/loop-red-wv-1.c |  12 +
 .../libgomp.oacc-c-c++-common/loop-v-1.c      |  11 +
 .../libgomp.oacc-c-c++-common/loop-w-1.c      |  13 +-
 .../libgomp.oacc-c-c++-common/loop-wv-1.c     |  11 +
 .../parallel-reduction.c                      |   7 +
 .../private-atomic-1-gang.c                   |  85 +++-
 .../private-atomic-1.c                        |  13 +
 .../private-variables.c                       | 378 +++++++++++++-----
 .../libgomp.oacc-c-c++-common/routine-4.c     |  13 +
 .../static-variable-1.c                       |  24 ++
 .../acc_on_device-1-1.f90                     |  11 +-
 .../libgomp.oacc-fortran/acc_on_device-1-2.f  |  11 +-
 .../libgomp.oacc-fortran/acc_on_device-1-3.f  |  11 +-
 .../libgomp.oacc-fortran/declare-1.f90        |  18 +
 .../libgomp.oacc-fortran/host_data-5.F90      |  56 ++-
 .../testsuite/libgomp.oacc-fortran/if-1.f90   | 149 +++++--
 .../kernels-private-vars-loop-gang-1.f90      |   8 +
 .../kernels-private-vars-loop-gang-2.f90      |   9 +
 .../kernels-private-vars-loop-gang-3.f90      |   9 +
 .../kernels-private-vars-loop-gang-6.f90      |   9 +
 .../kernels-private-vars-loop-vector-1.f90    |  12 +
 .../kernels-private-vars-loop-vector-2.f90    |  10 +
 .../kernels-private-vars-loop-worker-1.f90    |   9 +
 .../kernels-private-vars-loop-worker-2.f90    |  10 +
 .../kernels-private-vars-loop-worker-3.f90    |  13 +
 .../kernels-private-vars-loop-worker-4.f90    |  11 +
 .../kernels-private-vars-loop-worker-5.f90    |  12 +
 .../kernels-private-vars-loop-worker-6.f90    |  11 +
 .../kernels-private-vars-loop-worker-7.f90    |  11 +
 .../libgomp.oacc-fortran/optional-private.f90 |  16 +
 .../libgomp.oacc-fortran/parallel-dims.f90    |  13 +
 .../private-atomic-1-gang.f90                 |  16 +-
 .../private-atomic-1-vector.f90               |  42 ++
 .../private-atomic-1-worker.f90               |  16 +-
 .../private-variables.f90                     | 175 ++++++--
 .../libgomp.oacc-fortran/privatized-ref-2.f90 |  71 +++-
 .../libgomp.oacc-fortran/routine-7.f90        |  14 +
 92 files changed, 2342 insertions(+), 253 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/privatization-1-compute-loop.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/privatization-1-compute.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/privatization-1-routine_gang-loop.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/privatization-1-routine_gang.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/privatization-1-compute.f90
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang-loop.f90
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-vector.f90

diff --git a/gcc/config/gcn/gcn-protos.h b/gcc/config/gcn/gcn-protos.h
index 7ef7ae8af46..8bd0b434a84 100644
--- a/gcc/config/gcn/gcn-protos.h
+++ b/gcc/config/gcn/gcn-protos.h
@@ -40,7 +40,7 @@ extern rtx gcn_gen_undef (machine_mode);
 extern bool gcn_global_address_p (rtx);
 extern tree gcn_goacc_adjust_propagation_record (tree record_type, bool sender,
 						 const char *name);
-extern tree gcn_goacc_adjust_private_decl (tree var, int level);
+extern tree gcn_goacc_adjust_private_decl (location_t, tree var, int level);
 extern void gcn_goacc_reduction (gcall *call);
 extern bool gcn_hard_regno_rename_ok (unsigned int from_reg,
 				      unsigned int to_reg);
diff --git a/gcc/config/gcn/gcn-tree.c b/gcc/config/gcn/gcn-tree.c
index 75ea50c59dd..1eb8882d4bf 100644
--- a/gcc/config/gcn/gcn-tree.c
+++ b/gcc/config/gcn/gcn-tree.c
@@ -578,7 +578,7 @@ gcn_goacc_adjust_propagation_record (tree record_type, bool sender,
 }
 
 tree
-gcn_goacc_adjust_private_decl (tree var, int level)
+gcn_goacc_adjust_private_decl (location_t, tree var, int level)
 {
   if (level != GOMP_DIM_GANG)
     return var;
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 60d3f079048..6642bdfa867 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -6680,7 +6680,7 @@ nvptx_truly_noop_truncation (poly_uint64, poly_uint64)
 /* Implement TARGET_GOACC_ADJUST_PRIVATE_DECL.  */
 
 static tree
-nvptx_goacc_adjust_private_decl (tree decl, int level)
+nvptx_goacc_adjust_private_decl (location_t loc, tree decl, int level)
 {
   gcc_checking_assert (!lookup_attribute ("oacc gang-private",
 					  DECL_ATTRIBUTES (decl)));
@@ -6689,14 +6689,12 @@ nvptx_goacc_adjust_private_decl (tree decl, int level)
      declarations.  */
   if (level == GOMP_DIM_GANG)
     {
-      if (dump_file && (dump_flags & TDF_DETAILS))
-	{
-	  fprintf (dump_file, "Setting 'oacc gang-private' attribute for decl:");
-	  print_generic_decl (dump_file, decl, TDF_SLIM);
-	  fputc ('\n', dump_file);
-	}
       tree id = get_identifier ("oacc gang-private");
-      DECL_ATTRIBUTES (decl) = tree_cons (id, NULL, DECL_ATTRIBUTES (decl));
+      /* For later diagnostic purposes, pass LOC as VALUE (wrapped as a
+	 TREE).  */
+      tree loc_tree = build_empty_stmt (loc);
+      DECL_ATTRIBUTES (decl)
+	= tree_cons (id, loc_tree, DECL_ATTRIBUTES (decl));
     }
 
   return decl;
@@ -6708,7 +6706,8 @@ static rtx
 nvptx_goacc_expand_var_decl (tree var)
 {
   /* Place "oacc gang-private" variables in shared memory.  */
-  if (lookup_attribute ("oacc gang-private", DECL_ATTRIBUTES (var)))
+  if (tree attr = lookup_attribute ("oacc gang-private",
+				    DECL_ATTRIBUTES (var)))
     {
       gcc_checking_assert (VAR_P (var));
 
@@ -6728,6 +6727,50 @@ nvptx_goacc_expand_var_decl (tree var)
 	  bool existed = gang_private_shared_hmap.put (var, offset);
 	  gcc_checking_assert (!existed);
 	  gang_private_shared_size += tree_to_uhwi (DECL_SIZE_UNIT (var));
+
+	  location_t loc = EXPR_LOCATION (TREE_VALUE (attr));
+#if 0 /* For some reason, this doesn't work.  */
+	  if (dump_enabled_p ())
+	    {
+	      dump_flags_t l_dump_flags
+		= get_openacc_privatization_dump_flags ();
+
+	      const dump_user_location_t d_u_loc
+		= dump_user_location_t::from_location_t (loc);
+/* PR100695 "Format decoder, quoting in 'dump_printf' etc." */
+#if __GNUC__ >= 10
+# pragma GCC diagnostic push
+# pragma GCC diagnostic ignored "-Wformat"
+#endif
+	      dump_printf_loc (l_dump_flags, d_u_loc,
+			       "variable %<%T%> adjusted for OpenACC"
+			       " privatization level: %qs\n",
+			       var, "gang");
+#if __GNUC__ >= 10
+# pragma GCC diagnostic pop
+#endif
+	    }
+#else /* ..., thus emulate that, good enough for testsuite usage.  */
+	  if (param_openacc_privatization != OPENACC_PRIVATIZATION_QUIET)
+	    inform (loc,
+		    "variable %qD adjusted for OpenACC privatization level:"
+		    " %qs",
+		    var, "gang");
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    {
+	      /* 'dumpfile.c:dump_loc' */
+	      fprintf (dump_file, "%s:%d:%d: ", LOCATION_FILE (loc),
+		       LOCATION_LINE (loc), LOCATION_COLUMN (loc));
+	      fprintf (dump_file, "%s: ", "note");
+
+	      fprintf (dump_file,
+		       "variable '");
+	      print_generic_expr (dump_file, var, TDF_SLIM);
+	      fprintf (dump_file,
+		       "' adjusted for OpenACC privatization level: '%s'\n",
+		       "gang");
+	    }
+#endif
 	}
       rtx addr = plus_constant (Pmode, gang_private_shared_sym, offset);
       return gen_rtx_MEM (TYPE_MODE (TREE_TYPE (var)), addr);
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9bcbcdc777c..5cd4e2d993c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14425,6 +14425,14 @@ With @option{--param=openacc-kernels=parloops}, OpenACC `kernels'
 constructs are handled by the @samp{parloops} pass, en bloc.
 This is the current default.
 
+@item openacc-privatization
+Specify mode of OpenACC privatization diagnostics for
+@option{-fopt-info-omp-note} and applicable
+@option{-fdump-tree-*-details}.
+With @option{--param=openacc-privatization=quiet}, don't diagnose.
+This is the current default.
+With @option{--param=openacc-privatization=noisy}, do diagnose.
+
 @end table
 
 The following choices of @var{name} are available on AArch64 targets:
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 78c330c292d..e3a080e4a7c 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6236,12 +6236,13 @@ like @code{cond_add@var{m}}.  The default implementation returns a zero
 constant of type @var{type}.
 @end deftypefn
 
-@deftypefn {Target Hook} tree TARGET_GOACC_ADJUST_PRIVATE_DECL (tree @var{var}, int @var{level})
+@deftypefn {Target Hook} tree TARGET_GOACC_ADJUST_PRIVATE_DECL (location_t @var{loc}, tree @var{var}, int @var{level})
 This hook, if defined, is used by accelerator target back-ends to adjust
 OpenACC variable declarations that should be made private to the given
 parallelism level (i.e. @code{GOMP_DIM_GANG}, @code{GOMP_DIM_WORKER} or
 @code{GOMP_DIM_VECTOR}).  A typical use for this hook is to force variable
 declarations at the @code{gang} level to reside in GPU shared memory.
+@var{loc} may be used for diagnostic purposes.
 
 You may also use the @code{TARGET_GOACC_EXPAND_VAR_DECL} hook if the
 adjusted variable declaration needs to be expanded to RTL in a non-standard
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index d60bb307c52..375448ebf5f 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -442,6 +442,13 @@ enum openacc_kernels
   OPENACC_KERNELS_PARLOOPS
 };
 
+/* Modes of OpenACC privatization diagnostics.  */
+enum openacc_privatization
+{
+  OPENACC_PRIVATIZATION_QUIET,
+  OPENACC_PRIVATIZATION_NOISY
+};
+
 #endif
 
 #endif /* ! GCC_FLAG_TYPES_H */
diff --git a/gcc/omp-general.h b/gcc/omp-general.h
index aa04895e16d..5c3e0f0e205 100644
--- a/gcc/omp-general.h
+++ b/gcc/omp-general.h
@@ -132,4 +132,17 @@ enum omp_requires {
 
 extern GTY(()) enum omp_requires omp_requires_mask;
 
+static inline dump_flags_t
+get_openacc_privatization_dump_flags ()
+{
+  dump_flags_t l_dump_flags = MSG_NOTE;
+
+  /* For '--param=openacc-privatization=quiet', diagnostics only go to dump
+     files.  */
+  if (param_openacc_privatization == OPENACC_PRIVATIZATION_QUIET)
+    l_dump_flags |= MSG_PRIORITY_INTERNALS;
+
+  return l_dump_flags;
+}
+
 #endif /* GCC_OMP_GENERAL_H */
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 577676b2a16..0d63e8243ae 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -10160,16 +10160,81 @@ lower_omp_for_lastprivate (struct omp_for_data *fd, gimple_seq *body_p,
    sometimes, using shared memory directly would be faster than
    broadcasting.  */
 
+static void
+oacc_privatization_begin_diagnose_var (const dump_flags_t l_dump_flags,
+				       const location_t loc, const tree c,
+				       const tree decl)
+{
+  const dump_user_location_t d_u_loc
+    = dump_user_location_t::from_location_t (loc);
+/* PR100695 "Format decoder, quoting in 'dump_printf' etc." */
+#if __GNUC__ >= 10
+# pragma GCC diagnostic push
+# pragma GCC diagnostic ignored "-Wformat"
+#endif
+  dump_printf_loc (l_dump_flags, d_u_loc,
+		   "variable %<%T%> ", decl);
+#if __GNUC__ >= 10
+# pragma GCC diagnostic pop
+#endif
+  if (c)
+    dump_printf (l_dump_flags,
+		 "in %qs clause ",
+		 omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
+  else
+    dump_printf (l_dump_flags,
+		 "declared in block ");
+}
+
 static bool
-oacc_privatization_candidate_p (const tree decl)
+oacc_privatization_candidate_p (const location_t loc, const tree c,
+				const tree decl)
 {
+  dump_flags_t l_dump_flags = get_openacc_privatization_dump_flags ();
+
   bool res = true;
 
   if (res && !VAR_P (decl))
-    res = false;
+    {
+      res = false;
+
+      if (dump_enabled_p ())
+	{
+	  oacc_privatization_begin_diagnose_var (l_dump_flags, loc, c, decl);
+	  dump_printf (l_dump_flags,
+		       "potentially has improper OpenACC privatization level: %qs\n",
+		       get_tree_code_name (TREE_CODE (decl)));
+	}
+    }
 
   if (res && !TREE_ADDRESSABLE (decl))
-    res = false;
+    {
+      res = false;
+
+      if (dump_enabled_p ())
+	{
+	  oacc_privatization_begin_diagnose_var (l_dump_flags, loc, c, decl);
+	  dump_printf (l_dump_flags,
+		       "isn%'t candidate for adjusting OpenACC privatization level: %s\n",
+		       "not addressable");
+	}
+    }
+
+  if (res)
+    {
+      if (dump_enabled_p ())
+	{
+	  oacc_privatization_begin_diagnose_var (l_dump_flags, loc, c, decl);
+	  dump_printf (l_dump_flags,
+		       "is candidate for adjusting OpenACC privatization level\n");
+	}
+    }
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      print_generic_decl (dump_file, decl, dump_flags);
+      fprintf (dump_file, "\n");
+    }
 
   return res;
 }
@@ -10185,7 +10250,7 @@ oacc_privatization_scan_clause_chain (omp_context *ctx, tree clauses)
       {
 	tree decl = OMP_CLAUSE_DECL (c);
 
-	if (!oacc_privatization_candidate_p (decl))
+	if (!oacc_privatization_candidate_p (OMP_CLAUSE_LOCATION (c), c, decl))
 	  continue;
 
 	gcc_checking_assert (!ctx->oacc_privatization_candidates.contains (decl));
@@ -10201,7 +10266,7 @@ oacc_privatization_scan_decl_chain (omp_context *ctx, tree decls)
 {
   for (tree decl = decls; decl; decl = DECL_CHAIN (decl))
     {
-      if (!oacc_privatization_candidate_p (decl))
+      if (!oacc_privatization_candidate_p (gimple_location (ctx->stmt), NULL, decl))
 	continue;
 
       gcc_checking_assert (!ctx->oacc_privatization_candidates.contains (decl));
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 8bfb8b36cf0..e9078278382 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -2137,6 +2137,15 @@ execute_oacc_device_lower ()
 
 		case IFN_UNIQUE_OACC_PRIVATE:
 		  {
+		    dump_flags_t l_dump_flags
+		      = get_openacc_privatization_dump_flags ();
+
+		    location_t loc = gimple_location (stmt);
+		    if (LOCATION_LOCUS (loc) == UNKNOWN_LOCATION)
+		      loc = DECL_SOURCE_LOCATION (current_function_decl);
+		    const dump_user_location_t d_u_loc
+		      = dump_user_location_t::from_location_t (loc);
+
 		    HOST_WIDE_INT level
 		      = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
 		    gcc_checking_assert (level == -1
@@ -2146,31 +2155,65 @@ execute_oacc_device_lower ()
 			 i < gimple_call_num_args (call);
 			 i++)
 		      {
+			static char const *const axes[] =
+			/* Must be kept in sync with GOMP_DIM enumeration.  */
+			  { "gang", "worker", "vector" };
+
 			tree arg = gimple_call_arg (call, i);
 			gcc_checking_assert (TREE_CODE (arg) == ADDR_EXPR);
 			tree decl = TREE_OPERAND (arg, 0);
-			if (dump_file && (dump_flags & TDF_DETAILS))
+			if (dump_enabled_p ())
+/* PR100695 "Format decoder, quoting in 'dump_printf' etc." */
+#if __GNUC__ >= 10
+# pragma GCC diagnostic push
+# pragma GCC diagnostic ignored "-Wformat"
+#endif
+			  dump_printf_loc (l_dump_flags, d_u_loc,
+					   "variable %<%T%> ought to be"
+					   " adjusted for OpenACC"
+					   " privatization level: %qs\n",
+					   decl,
+					   (level == -1
+					    ? "UNKNOWN" : axes[level]));
+#if __GNUC__ >= 10
+# pragma GCC diagnostic pop
+#endif
+			bool adjusted;
+			if (level == -1)
+			  adjusted = false;
+			else if (!targetm.goacc.adjust_private_decl)
+			  adjusted = false;
+			else if (level == GOMP_DIM_VECTOR)
 			  {
-			    static char const *const axes[] =
-			      /* Must be kept in sync with GOMP_DIM
-				 enumeration.  */
-			      { "gang", "worker", "vector" };
-			    fprintf (dump_file, "Decl UID %u has %s "
-				     "partitioning:", DECL_UID (decl),
-				     (level == -1 ? "UNKNOWN" : axes[level]));
-			    print_generic_decl (dump_file, decl, TDF_SLIM);
-			    fputc ('\n', dump_file);
+			    /* That's the default behavior.  */
+			    adjusted = true;
 			  }
-			if (level != -1
-			    && targetm.goacc.adjust_private_decl)
+			else
 			  {
 			    tree oldtype = TREE_TYPE (decl);
 			    tree newdecl
-			      = targetm.goacc.adjust_private_decl (decl, level);
-			    if (TREE_TYPE (newdecl) != oldtype
-				|| newdecl != decl)
+			      = targetm.goacc.adjust_private_decl (loc, decl,
+								   level);
+			    adjusted = (TREE_TYPE (newdecl) != oldtype
+					|| newdecl != decl);
+			    if (adjusted)
 			      adjusted_vars.put (decl, newdecl);
 			  }
+			if (adjusted
+			    && dump_enabled_p ())
+/* PR100695 "Format decoder, quoting in 'dump_printf' etc." */
+#if __GNUC__ >= 10
+# pragma GCC diagnostic push
+# pragma GCC diagnostic ignored "-Wformat"
+#endif
+			  dump_printf_loc (l_dump_flags, d_u_loc,
+					   "variable %<%T%> adjusted for"
+					   " OpenACC privatization level:"
+					   " %qs\n",
+					   decl, axes[level]);
+#if __GNUC__ >= 10
+# pragma GCC diagnostic pop
+#endif
 		      }
 		    remove = true;
 		  }
diff --git a/gcc/params.opt b/gcc/params.opt
index 82600b930ba..0d0dcd216f6 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -795,6 +795,19 @@ Enum(openacc_kernels) String(decompose) Value(OPENACC_KERNELS_DECOMPOSE)
 EnumValue
 Enum(openacc_kernels) String(parloops) Value(OPENACC_KERNELS_PARLOOPS)
 
+-param=openacc-privatization=
+Common Joined Enum(openacc_privatization) Var(param_openacc_privatization) Init(OPENACC_PRIVATIZATION_QUIET) Param
+--param=openacc-privatization=[quiet|noisy]	Specify mode of OpenACC privatization diagnostics.
+
+Enum
+Name(openacc_privatization) Type(enum openacc_privatization)
+
+EnumValue
+Enum(openacc_privatization) String(quiet) Value(OPENACC_PRIVATIZATION_QUIET)
+
+EnumValue
+Enum(openacc_privatization) String(noisy) Value(OPENACC_PRIVATIZATION_NOISY)
+
 -param=parloops-chunk-size=
 Common Joined UInteger Var(param_parloops_chunk_size) Param Optimization
 Chunk size of omp schedule for loops parallelized by parloops.
diff --git a/gcc/target.def b/gcc/target.def
index 660b69f5cb5..1dffedc81e4 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1733,11 +1733,12 @@ OpenACC variable declarations that should be made private to the given\n\
 parallelism level (i.e. @code{GOMP_DIM_GANG}, @code{GOMP_DIM_WORKER} or\n\
 @code{GOMP_DIM_VECTOR}).  A typical use for this hook is to force variable\n\
 declarations at the @code{gang} level to reside in GPU shared memory.\n\
+@var{loc} may be used for diagnostic purposes.\n\
 \n\
 You may also use the @code{TARGET_GOACC_EXPAND_VAR_DECL} hook if the\n\
 adjusted variable declaration needs to be expanded to RTL in a non-standard\n\
 way.",
-tree, (tree var, int level),
+tree, (location_t loc, tree var, int level),
 NULL)
 
 DEFHOOK
diff --git a/gcc/testsuite/c-c++-common/goacc-gomp/nesting-1.c b/gcc/testsuite/c-c++-common/goacc-gomp/nesting-1.c
index aaf0e7a4ee6..b0b78374016 100644
--- a/gcc/testsuite/c-c++-common/goacc-gomp/nesting-1.c
+++ b/gcc/testsuite/c-c++-common/goacc-gomp/nesting-1.c
@@ -1,7 +1,14 @@
+/* { dg-additional-options "-fopt-info-omp-note" } */
+/* { dg-additional-options "--param=openacc-privatization=noisy" } for
+   testing/documenting aspects of that functionality.  */
+
+
 void
 f_acc_data (void)
 {
 #pragma acc data
+  /* { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  /* { dg-note {variable 'i' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 } */
   {
     int i;
 #pragma omp atomic write
@@ -13,6 +20,8 @@ void
 f_acc_kernels (void)
 {
 #pragma acc kernels
+  /* { dg-note {variable 'i' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-1 }
+     { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } .-2 } */
   {
     int i;
 #pragma omp atomic write
@@ -27,6 +36,9 @@ f_acc_loop (void)
   int i;
 
 #pragma acc loop
+  /* { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  /* { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 }
+     { dg-bogus {note: variable 'i' ought to be adjusted for OpenACC privatization level: 'UNKNOWN'} "TODO" { xfail *-*-* } .-3 } */
   for (i = 0; i < 2; ++i)
     {
 #pragma omp atomic write
@@ -38,6 +50,8 @@ void
 f_acc_parallel (void)
 {
 #pragma acc parallel
+  /* { dg-note {variable 'i' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-1 }
+     { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } .-2 } */
   {
     int i;
 #pragma omp atomic write
diff --git a/gcc/testsuite/c-c++-common/goacc/private-reduction-1.c b/gcc/testsuite/c-c++-common/goacc/private-reduction-1.c
index d4e399531f6..38f6b7acf2b 100644
--- a/gcc/testsuite/c-c++-common/goacc/private-reduction-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/private-reduction-1.c
@@ -1,3 +1,7 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" } for
+   testing/documenting aspects of that functionality.  */
+
 int
 reduction ()
 {
@@ -5,6 +9,8 @@ reduction ()
 
   #pragma acc parallel
   #pragma acc loop private (r) reduction (+:r)
+  /* { dg-note {variable 'r' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} r { target *-*-* } .-1 } */
+  /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} i { target *-*-* } .-2 } */
   for (i = 0; i < 100; i++)
     r += 10;
 
diff --git a/gcc/testsuite/c-c++-common/goacc/privatization-1-compute-loop.c b/gcc/testsuite/c-c++-common/goacc/privatization-1-compute-loop.c
new file mode 100644
index 00000000000..4bfb5270690
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/privatization-1-compute-loop.c
@@ -0,0 +1,95 @@
+/* OpenACC privatization: 'loop' construct inside compute construct */
+
+/* { dg-additional-options "-fopt-info-omp-note" } */
+/* { dg-additional-options "--param=openacc-privatization=noisy" } for
+   testing/documenting aspects of that functionality.  */
+
+/* See also '../../gfortran.dg/goacc/privatization-1-compute-loop.f90'.  */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_loop 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+extern int e;
+static int s;
+int g;
+
+void
+f (int i, int j, int a)
+{
+  extern int ex;
+  static int st;
+  int x, y;
+#pragma acc parallel
+#pragma acc loop collapse(2) private(a) private (e, s, g) private(ex, st, x, y) /* { dg-line l_loop[incr c_loop] } */
+  for (i = 0; i < 20; ++i)
+    for (j = 0; j < 25; ++j)
+      {
+	__label__ ll;
+	/* Nested scopes fun.  */
+	{
+	  struct s_ss { int i; } ss;
+	  {
+	    extern int func (int *, int *, int *);
+	    /* Don't know how to effect a 'CONST_DECL' here.  (See Fortran example.)  */
+	    /* Don't know how to effect a 'RESULT_DECL' here; only saw this for OpenMP 'lastprivate'.  */
+
+	    a = func (&i, &j, &a);
+	  }
+	  ss.i = a;
+	  {
+	    extern int func2 (int *, int *, int *, int *, int *, int *, int *);
+	    extern int ext;
+	    static int sta;
+	    a = func2 (&e, &s, &g, &ex, &st, &ext, &sta);
+	  }
+	}
+	x = a;
+#pragma acc atomic write
+	y = a;
+	{
+	  int xx, yy;
+	  xx = a;
+#pragma acc atomic write
+	  yy = a;
+	}
+
+      ll:
+	;
+      }
+  /* { dg-note {variable 'y' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'y' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'st' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'st' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'ex' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'ex' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'g' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'g' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 's' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 's' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'e' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'e' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'a' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'j\.1' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'i\.0' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'j' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'i' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'll' declared in block potentially has improper OpenACC privatization level: 'label_decl'} "TODO" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'struct struct s_ss' declared in block potentially has improper OpenACC privatization level: 'type_decl'} "TODO" { target c } l_loop$c_loop }
+     { dg-note {variable 's_ss' declared in block potentially has improper OpenACC privatization level: 'type_decl'} "TODO" { target c++ } l_loop$c_loop } */
+  /* { dg-note {variable 'ss' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'func' declared in block potentially has improper OpenACC privatization level: 'function_decl'} "TODO" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'func2' declared in block potentially has improper OpenACC privatization level: 'function_decl'} "TODO" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'ext' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'ext' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'sta' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'sta' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'xx' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'yy' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'yy' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+}
diff --git a/gcc/testsuite/c-c++-common/goacc/privatization-1-compute.c b/gcc/testsuite/c-c++-common/goacc/privatization-1-compute.c
new file mode 100644
index 00000000000..4de45e5c1ed
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/privatization-1-compute.c
@@ -0,0 +1,90 @@
+/* OpenACC privatization: compute construct */
+
+/* { dg-additional-options "-fopt-info-omp-note" } */
+/* { dg-additional-options "--param=openacc-privatization=noisy" } for
+   testing/documenting aspects of that functionality.  */
+
+/* See also '../../gfortran.dg/goacc/privatization-1-compute.f90'.  */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+extern int e;
+static int s;
+int g;
+
+void
+f (int i, int j, int a)
+{
+  extern int ex;
+  static int st;
+  int x, y;
+#pragma acc parallel private(i, j, a) private (e, s, g) private(ex, st, x, y) /* { dg-line l_compute[incr c_compute] } */
+      {
+	__label__ ll;
+	/* Nested scopes fun.  */
+	{
+	  struct s_ss { int i; } ss;
+	  {
+	    extern int func (int *, int *, int *);
+	    /* Don't know how to effect a 'CONST_DECL' here.  (See Fortran example.)  */
+	    /* Don't know how to effect a 'RESULT_DECL' here; only saw this for OpenMP 'lastprivate'.  */
+
+	    a = func (&i, &j, &a);
+	  }
+	  ss.i = a;
+	  {
+	    extern int func2 (int *, int *, int *, int *, int *, int *, int *);
+	    extern int ext;
+	    static int sta;
+	    a = func2 (&e, &s, &g, &ex, &st, &ext, &sta);
+	  }
+	}
+	x = a;
+#pragma acc atomic write
+	y = a;
+	{
+	  int xx, yy;
+	  xx = a;
+#pragma acc atomic write
+	  yy = a;
+	}
+
+      ll:
+	;
+      }
+  /* { dg-note {variable 'y' in 'private' clause is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_compute$c_compute }
+     { dg-note {variable 'y' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "TODO" { xfail *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'st' in 'private' clause is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_compute$c_compute }
+     { dg-note {variable 'st' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'ex' in 'private' clause is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_compute$c_compute }
+     { dg-note {variable 'ex' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'g' in 'private' clause is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_compute$c_compute }
+     { dg-note {variable 'g' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 's' in 'private' clause is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_compute$c_compute }
+     { dg-note {variable 's' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'e' in 'private' clause is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_compute$c_compute }
+     { dg-note {variable 'e' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'a' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'j' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'i' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'll' declared in block potentially has improper OpenACC privatization level: 'label_decl'} "TODO" { target *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'struct struct s_ss' declared in block potentially has improper OpenACC privatization level: 'type_decl'} "TODO" { target c } l_compute$c_compute }
+     { dg-note {variable 's_ss' declared in block potentially has improper OpenACC privatization level: 'type_decl'} "TODO" { target c++ } l_compute$c_compute } */
+  /* { dg-note {variable 'ss' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'func' declared in block potentially has improper OpenACC privatization level: 'function_decl'} "TODO" { target *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'func2' declared in block potentially has improper OpenACC privatization level: 'function_decl'} "TODO" { target *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'ext' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_compute$c_compute }
+     { dg-note {variable 'ext' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'sta' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_compute$c_compute }
+     { dg-note {variable 'sta' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'xx' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
+  /* { dg-note {variable 'yy' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_compute$c_compute }
+     { dg-note {variable 'yy' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_compute$c_compute } */
+}
diff --git a/gcc/testsuite/c-c++-common/goacc/privatization-1-routine_gang-loop.c b/gcc/testsuite/c-c++-common/goacc/privatization-1-routine_gang-loop.c
new file mode 100644
index 00000000000..fcc233b0886
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/privatization-1-routine_gang-loop.c
@@ -0,0 +1,95 @@
+/* OpenACC privatization: 'loop' construct inside 'routine' */
+
+/* { dg-additional-options "-fopt-info-omp-note" } */
+/* { dg-additional-options "--param=openacc-privatization=noisy" } for
+   testing/documenting aspects of that functionality.  */
+
+/* See also '../../gfortran.dg/goacc/privatization-1-routine_gang-loop.f90'.  */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_loop 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+extern int e;
+static int s;
+int g;
+
+#pragma acc routine gang
+void
+f (int i, int j, int a)
+{
+  extern int ex;
+  static int st;
+  int x, y;
+#pragma acc loop collapse(2) private(a) private (e, s, g) private(ex, st, x, y) /* { dg-line l_loop[incr c_loop] } */
+  for (i = 0; i < 20; ++i)
+    for (j = 0; j < 25; ++j)
+      {
+	__label__ ll;
+	/* Nested scopes fun.  */
+	{
+	  struct s_ss { int i; } ss;
+	  {
+	    extern int func (int *, int *, int *);
+	    /* Don't know how to effect a 'CONST_DECL' here.  (See Fortran example.)  */
+	    /* Don't know how to effect a 'RESULT_DECL' here; only saw this for OpenMP 'lastprivate'.  */
+
+	    a = func (&i, &j, &a);
+	  }
+	  ss.i = a;
+	  {
+	    extern int func2 (int *, int *, int *, int *, int *, int *, int *);
+	    extern int ext;
+	    static int sta;
+	    a = func2 (&e, &s, &g, &ex, &st, &ext, &sta);
+	  }
+	}
+	x = a;
+#pragma acc atomic write
+	y = a;
+	{
+	  int xx, yy;
+	  xx = a;
+#pragma acc atomic write
+	  yy = a;
+	}
+
+      ll:
+	;
+      }
+  /* { dg-note {variable 'y' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'y' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'st' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'st' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'ex' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'ex' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'g' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'g' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 's' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 's' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'e' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'e' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'a' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'j\.1' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'i\.0' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'j' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'i' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'll' declared in block potentially has improper OpenACC privatization level: 'label_decl'} "TODO" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'struct struct s_ss' declared in block potentially has improper OpenACC privatization level: 'type_decl'} "TODO" { target c } l_loop$c_loop }
+     { dg-note {variable 's_ss' declared in block potentially has improper OpenACC privatization level: 'type_decl'} "TODO" { target c++ } l_loop$c_loop } */
+  /* { dg-note {variable 'ss' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'func' declared in block potentially has improper OpenACC privatization level: 'function_decl'} "TODO" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'func2' declared in block potentially has improper OpenACC privatization level: 'function_decl'} "TODO" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'ext' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'ext' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'sta' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'sta' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'xx' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+  /* { dg-note {variable 'yy' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     { dg-note {variable 'yy' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop } */
+}
diff --git a/gcc/testsuite/c-c++-common/goacc/privatization-1-routine_gang.c b/gcc/testsuite/c-c++-common/goacc/privatization-1-routine_gang.c
new file mode 100644
index 00000000000..cd6708ff205
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/privatization-1-routine_gang.c
@@ -0,0 +1,93 @@
+/* OpenACC privatization: 'routine' */
+
+/* { dg-additional-options "-fopt-info-omp-note" } */
+/* { dg-additional-options "--param=openacc-privatization=noisy" } for
+   testing/documenting aspects of that functionality.  */
+
+/* See also '../../gfortran.dg/goacc/privatization-1-routine_gang.f90'.  */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_routine 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+extern int e;
+static int s;
+int g;
+#pragma acc declare device_resident(e, s, g)
+
+#pragma acc routine gang /* { dg-line l_routine[incr c_routine] } */
+void
+f (int i, int j, int a)
+{
+  extern int ex;
+  static int st;
+#pragma acc declare device_resident(ex /* , st */)
+  int x, y;
+      {
+	__label__ ll;
+	/* Nested scopes fun.  */
+	{
+	  struct s_ss { int i; } ss;
+	  {
+	    extern int func (int *, int *, int *);
+	    /* Don't know how to effect a 'CONST_DECL' here.  (See Fortran example.)  */
+	    /* Don't know how to effect a 'RESULT_DECL' here; only saw this for OpenMP 'lastprivate'.  */
+
+	    a = func (&i, &j, &a);
+	  }
+	  ss.i = a;
+	  {
+	    extern int func2 (int *, int *, int *, int *, int *, int *, int *);
+	    extern int ext;
+	    static int sta;
+#pragma acc declare device_resident(ext /* , sta */)
+	    a = func2 (&e, &s, &g, &ex, &st, &ext, &sta);
+	  }
+	}
+	x = a;
+#pragma acc atomic write
+	y = a;
+	{
+	  int xx, yy;
+	  xx = a;
+#pragma acc atomic write
+	  yy = a;
+	}
+
+      ll:
+	;
+      }
+}
+  /* { dg-note {variable 'y' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_routine$c_routine }
+     { dg-note {variable 'y' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'st' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_routine$c_routine }
+     { dg-note {variable 'st' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'ex' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_routine$c_routine }
+     { dg-note {variable 'ex' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'g' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_routine$c_routine }
+     { dg-note {variable 'g' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 's' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_routine$c_routine }
+     { dg-note {variable 's' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'e' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_routine$c_routine }
+     { dg-note {variable 'e' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'a' declared in block potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'j' declared in block potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'i' declared in block potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'll' declared in block potentially has improper OpenACC privatization level: 'label_decl'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'struct struct s_ss' declared in block potentially has improper OpenACC privatization level: 'type_decl'} "TODO" { target c xfail *-*-* } l_routine$c_routine }
+     { dg-note {variable 's_ss' declared in block potentially has improper OpenACC privatization level: 'type_decl'} "TODO" { target c++ xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'ss' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'func' declared in block potentially has improper OpenACC privatization level: 'function_decl'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'func2' declared in block potentially has improper OpenACC privatization level: 'function_decl'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'ext' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_routine$c_routine }
+     { dg-note {variable 'ext' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'sta' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_routine$c_routine }
+     { dg-note {variable 'sta' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'xx' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "TODO" { xfail *-*-* } l_routine$c_routine } */
+  /* { dg-note {variable 'yy' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { xfail *-*-* } l_routine$c_routine }
+     { dg-note {variable 'yy' ought to be adjusted for OpenACC privatization level: 'gang'} "TODO" { xfail *-*-* } l_routine$c_routine } */
diff --git a/gcc/testsuite/gfortran.dg/goacc/private-3.f95 b/gcc/testsuite/gfortran.dg/goacc/private-3.f95
index a7c6d81ad4e..1bfb4f1554d 100644
--- a/gcc/testsuite/gfortran.dg/goacc/private-3.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/private-3.f95
@@ -1,7 +1,9 @@
-! { dg-do compile }
-
 ! test for private variables in a reduction clause
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" } for
+! testing/documenting aspects of that functionality.
+
 program test
   implicit none
   integer, parameter :: n = 100
@@ -16,6 +18,7 @@ program test
   !$acc parallel private (k)
   k = 0
   !$acc loop reduction (+:k)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 1, n
      k = k + 1
   end do
diff --git a/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90 b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90
new file mode 100644
index 00000000000..bcd7159ae5b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90
@@ -0,0 +1,57 @@
+! OpenACC privatization: 'loop' construct
+
+! { dg-additional-options "-fopt-info-omp-note" }
+! { dg-additional-options "--param=openacc-privatization=noisy" } for
+! testing/documenting aspects of that functionality.
+
+! See also '../../c-c++-common/goacc/privatization-1-compute-loop.c'.
+!TODO More cases should be added here.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_loop 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
+
+module m
+contains
+  subroutine f (i, j, a)
+    implicit none
+    integer :: i, j, a
+    integer :: x, y
+    integer, parameter :: c = 3
+    integer, external :: g
+
+    !$acc parallel
+    !$acc loop collapse(2) private(a) private(x, y) ! { dg-line l_loop[incr c_loop] }
+    do i = 1, 20
+       do j = 1, 25
+          ! Can't have nested scopes fun.  (Fortran 'block' construct supported only starting with OpenACC 3.1.)
+
+          ! Don't know how to effect a 'LABEL_DECL' here.
+          ! Don't know how to effect a 'TYPE_DECL' here.
+          ! Don't know how to effect a 'FUNCTION_DECL' here.
+          ! Don't know how to effect a 'RESULT_DECL' here.
+          ! Don't know how to effect a 'VAR_DECL' here.
+          ! (See C/C++ example.)
+
+          a = g (i, j, a, c)
+          x = a
+          !$acc atomic write
+          y = a
+       end do
+    end do
+    ! { dg-note {variable 'count\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'i' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'j' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'a' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'y' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'y' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop }
+    !$acc end parallel
+  end subroutine f
+end module m
diff --git a/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute.f90 b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute.f90
new file mode 100644
index 00000000000..ed7e9ec6437
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute.f90
@@ -0,0 +1,48 @@
+! OpenACC privatization: compute construct
+
+! { dg-additional-options "-fopt-info-omp-note" }
+! { dg-additional-options "--param=openacc-privatization=noisy" } for
+! testing/documenting aspects of that functionality.
+
+! See also '../../c-c++-common/goacc/privatization-1-compute.c'.
+!TODO More cases should be added here.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
+
+module m
+contains
+  subroutine f (i, j, a)
+    implicit none
+    integer :: i, j, a
+    integer :: x, y
+    integer, parameter :: c = 3
+    integer, external :: g
+
+    !$acc parallel private(i, j, a) private(x, y) ! { dg-line l_compute[incr c_compute] }
+          ! Can't have nested scopes fun.  (Fortran 'block' construct supported only starting with OpenACC 3.1.)
+
+          ! Don't know how to effect a 'LABEL_DECL' here.
+          ! Don't know how to effect a 'TYPE_DECL' here.
+          ! Don't know how to effect a 'FUNCTION_DECL' here.
+          ! Don't know how to effect a 'RESULT_DECL' here.
+          ! Don't know how to effect a 'VAR_DECL' here.
+          ! (See C/C++ example.)
+
+          a = g (i, j, a, c)
+          x = a
+          !$acc atomic write ! ... to force 'TREE_ADDRESSABLE'.
+          y = a
+    !$acc end parallel
+    ! { dg-note {variable 'i' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'j' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'a' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+  end subroutine f
+end module m
diff --git a/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang-loop.f90 b/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang-loop.f90
new file mode 100644
index 00000000000..db6d8226ed0
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang-loop.f90
@@ -0,0 +1,56 @@
+! OpenACC privatization: 'loop' construct inside 'routine'
+
+! { dg-additional-options "-fopt-info-omp-note" }
+! { dg-additional-options "--param=openacc-privatization=noisy" } for
+! testing/documenting aspects of that functionality.
+
+! See also '../../c-c++-common/goacc/privatization-1-routine_gang-loop.c'.
+!TODO More cases should be added here.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_loop 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
+
+module m
+contains
+  subroutine f (i, j, a)
+    implicit none
+    integer :: i, j, a
+    !$acc routine (f) gang
+    integer :: x, y
+    integer, parameter :: c = 3
+    integer, external :: g
+
+    !$acc loop collapse(2) private(a) private(x, y) ! { dg-line l_loop[incr c_loop] }
+    do i = 1, 20
+       do j = 1, 25
+          ! Can't have nested scopes fun.  (Fortran 'block' construct supported only starting with OpenACC 3.1.)
+
+          ! Don't know how to effect a 'LABEL_DECL' here.
+          ! Don't know how to effect a 'TYPE_DECL' here.
+          ! Don't know how to effect a 'FUNCTION_DECL' here.
+          ! Don't know how to effect a 'RESULT_DECL' here.
+          ! Don't know how to effect a 'VAR_DECL' here.
+          ! (See C/C++ example.)
+
+          a = g (i, j, a, c)
+          x = a
+          !$acc atomic write
+          y = a
+       end do
+    end do
+    ! { dg-note {variable 'count\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'i' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'j' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'a' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'y' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'y' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop }
+  end subroutine f
+end module m
diff --git a/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang.f90 b/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang.f90
new file mode 100644
index 00000000000..725bd5e2ebe
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang.f90
@@ -0,0 +1,47 @@
+! OpenACC privatization: 'routine'
+
+! { dg-additional-options "-fopt-info-omp-note" }
+! { dg-additional-options "--param=openacc-privatization=noisy" } for
+! testing/documenting aspects of that functionality.
+
+! See also '../../c-c++-common/goacc/privatization-1-routine_gang.c'.
+!TODO More cases should be added here.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_routine 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
+
+module m
+contains
+  subroutine f (i, j, a)
+    implicit none
+    integer :: i, j, a
+    !$acc routine (f) gang ! { dg-line l_routine[incr c_routine] }
+    integer :: x, y
+    integer, parameter :: c = 3
+    integer, external :: g
+
+          ! Can't have nested scopes fun.  (Fortran 'block' construct supported only starting with OpenACC 3.1.)
+
+          ! Don't know how to effect a 'LABEL_DECL' here.
+          ! Don't know how to effect a 'TYPE_DECL' here.
+          ! Don't know how to effect a 'FUNCTION_DECL' here.
+          ! Don't know how to effect a 'RESULT_DECL' here.
+          ! Don't know how to effect a 'VAR_DECL' here.
+          ! (See C/C++ example.)
+
+          a = g (i, j, a, c)
+          x = a
+          !$acc atomic write ! ... to force 'TREE_ADDRESSABLE'.
+          y = a
+  end subroutine f
+    ! { dg-note {variable 'i' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_routine$c_routine }
+    ! { dg-note {variable 'j' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_routine$c_routine }
+    ! { dg-note {variable 'a' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_routine$c_routine }
+    ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { xfail *-*-* } l_routine$c_routine }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "TODO" { xfail *-*-* } l_routine$c_routine }
+end module m
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c
index 0990e3db224..fff0c28e8ad 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 /* { dg-additional-options "-Wopenacc-parallelism" } for testing/documenting
    aspects of that functionality.  */
 
@@ -15,9 +21,11 @@ void t1 ()
     ary[i] = ~0;
   
 #pragma acc parallel num_gangs (32) copy (ok) firstprivate (val) copy(ary, ondev)
+  /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     ondev = acc_on_device (acc_device_not_host);
 #pragma acc loop gang(static:1)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (unsigned i = 0; i < 32; i++)
       {
 	if (val != 2)
@@ -79,6 +87,7 @@ void t3 ()
 
   #pragma acc parallel num_gangs (n) firstprivate (a)
   #pragma acc loop gang
+  /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   for (i = 0; i < n; i++)
     {
       a = a + i;
@@ -124,6 +133,7 @@ void t4 ()
   /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-2 } */
   {
 #pragma acc loop gang
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       arr[i] += x;
   }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-7.c
index 6830ef1e7ed..66501e614fb 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-7.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-7.c
@@ -1,6 +1,11 @@
-/* { dg-do run } */
-
 /* Test if, if_present clauses on host_data construct.  */
+
+/* { dg-additional-options "-fopt-info-all-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-all-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 /* C/C++ variant of 'libgomp.oacc-fortran/host_data-5.F90' */
 
 #include <assert.h>
@@ -14,15 +19,19 @@ foo (float *p, intptr_t host_p, int cond)
 #pragma acc data copyin(host_p)
   {
 #pragma acc host_data use_device(p) if_present
+    /* { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     /* p not mapped yet, so it will be equal to the host pointer.  */
     assert (p == (float *) host_p);
 
 #pragma acc data copy(p[0:100])
+    /* { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
     {
       /* Not inside a host_data construct, so p is still the host pointer.  */
       assert (p == (float *) host_p);
 
 #pragma acc host_data use_device(p)
+      /* { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
       {
 #if ACC_MEM_SHARED
 	assert (p == (float *) host_p);
@@ -33,6 +42,7 @@ foo (float *p, intptr_t host_p, int cond)
       }
 
 #pragma acc host_data use_device(p) if_present
+      /* { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
       {
 #if ACC_MEM_SHARED
 	assert (p == (float *) host_p);
@@ -43,6 +53,8 @@ foo (float *p, intptr_t host_p, int cond)
       }
 
 #pragma acc host_data use_device(p) if(cond)
+      /* { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+      /* { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target { ! openacc_host_selected } } .-2 } */
       {
 #if ACC_MEM_SHARED
 	assert (p == (float *) host_p);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c
index dd83557b6aa..e08cfa56e3c 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c
@@ -3,11 +3,17 @@
 /* { dg-additional-options "-fopt-info-omp-all" } */
 /* { dg-additional-options "--param=openacc-kernels=decompose" } */
 
+/* { dg-additional-options "-fopt-info-all-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-all-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 /* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
    passed to 'incr' may be unset, and in that case, it will be set to [...]",
    so to maintain compatibility with earlier Tcl releases, we manually
    initialize counter variables:
-   { dg-line l_dummy[variable c_loop_i 0] }
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0] }
    { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
    "WARNING: dg-line var l_dummy defined, but not used".  */
 
@@ -22,15 +28,19 @@ int main()
 #define N 123
   int b[N] = { 0 };
 
-#pragma acc kernels
+#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */
   {
     int c = 234; /* { dg-message "note: beginning 'gang-single' part in OpenACC 'kernels' region" } */
+    /* { dg-note {variable 'c' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_compute$c_compute }
+       { dg-note {variable 'c\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
 
     /*TODO Hopefully, this is the same issue as '../../../gcc/testsuite/c-c++-common/goacc/kernels-decompose-ice-1.c'.  */
     (volatile int *) &c;
 
 #pragma acc loop independent gang /* { dg-line l_loop_i[incr c_loop_i] } */
     /* { dg-message "note: parallelized loop nest in OpenACC 'kernels' region" "" { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop_i$c_loop_i } */
     /* { dg-optimized "assigned OpenACC gang loop parallelism" "" { target *-*-* } l_loop_i$c_loop_i } */
     for (int i = 0; i < N; ++i)
       b[i] = c;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c
index bcbe28a6778..f28513dd208 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of worker-private variables declared in a local scope, broadcasting
@@ -12,30 +18,40 @@ main (int argc, char* argv[])
     arr[i] = i;
 
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
 	#pragma acc loop worker(num:32)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	/* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    int x = i ^ j * 3;
 
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
 
 	#pragma acc loop worker(num:32)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	/* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    int x = i | j * 5;
 	    
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c
index a944486fac3..21f25114d68 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of worker-private variables declared in a local scope, broadcasting
@@ -12,25 +18,32 @@ main (int argc, char* argv[])
     arr[i] = i;
 
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
         #pragma acc loop worker(num:32)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	/* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    int x = i ^ j * 3;
 
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	    
 	    x = i | j * 5;
 	    
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c
index ba0b44dc5be..8b4cde87ce9 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of worker-private variables declared in a local scope, broadcasting
@@ -17,13 +23,18 @@ main (int argc, char* argv[])
     arr[i] = i;
 
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
         #pragma acc loop worker(num:32)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	/* { dg-note {variable 'pt' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -33,10 +44,12 @@ main (int argc, char* argv[])
 	    pt.y = i | j * 5;
 
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.x * k;
 	    
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.y * k;
 	  }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c
index 7189d2a99cd..a658d167236 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of worker-private variables declared in a local scope, broadcasting
@@ -17,13 +23,19 @@ main (int argc, char* argv[])
     arr[i] = i;
 
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
         #pragma acc loop worker(num:32)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	/* { dg-note {variable 'pt' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-3 } */
+	/* { dg-note {variable 'ptp' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -34,12 +46,14 @@ main (int argc, char* argv[])
 	    pt.x = i ^ j * 3;
 
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += ptp->x * k;
 
 	    ptp->y = i | j * 5;
 	    
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.y * k;
 	  }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c
index 854ad7e9b3b..b82b9bf210a 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of worker-private variables declared in a local scope, broadcasting
@@ -12,13 +18,18 @@ main (int argc, char* argv[])
     arr[i] = i;
 
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
         #pragma acc loop worker(num:32)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	/* { dg-note {variable 'pt' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -27,12 +38,14 @@ main (int argc, char* argv[])
 	    pt[0] = i ^ j * 3;
 
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt[0] * k;
 
 	    pt[1] = i | j * 5;
 	    
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt[1] * k;
 	  }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c
index 5bc90c2367b..38d89c726ca 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of gang-private variables declared on loop directive.  */
@@ -13,6 +19,8 @@ main (int argc, char* argv[])
   #pragma acc kernels copy(arr)
   {
     #pragma acc loop gang(num:32) private(x)
+    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
     for (i = 0; i < 32; i++)
       {
 	x = i * 2;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c
index 3eb11670e36..62dd12fb790 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of gang-private variables declared on loop directive, with broadcasting
@@ -14,11 +20,15 @@ main (int argc, char* argv[])
   #pragma acc kernels copy(arr)
   {
     #pragma acc loop gang(num:32) private(x)
+    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
     for (i = 0; i < 32; i++)
       {
 	x = i * 2;
 
 	#pragma acc loop worker(num:32)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += x;
       }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c
index 86b9a7179e1..c22c3b43e31 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of gang-private variables declared on loop directive, with broadcasting
@@ -14,11 +20,15 @@ main (int argc, char* argv[])
   #pragma acc kernels copy(arr)
   {
     #pragma acc loop gang(num:32) private(x)
+    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
     for (i = 0; i < 32; i++)
       {
 	x = i * 2;
 
 	#pragma acc loop vector(length:32)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += x;
       }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c
index 4174248ee4e..27a8e804129 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of gang-private addressable variable declared on loop directive, with
@@ -14,6 +20,10 @@ main (int argc, char* argv[])
   #pragma acc kernels copy(arr)
   {
     #pragma acc loop gang(num:32) private(x)
+    /* { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'p' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
     for (i = 0; i < 32; i++)
       {
         int *p = &x;
@@ -21,6 +31,7 @@ main (int argc, char* argv[])
 	x = i * 2;
 
 	#pragma acc loop worker(num:32)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += x;
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c
index b160eaa604d..f570c222940 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of gang-private array variable declared on loop directive, with
@@ -14,12 +20,16 @@ main (int argc, char* argv[])
   #pragma acc kernels copy(arr)
   {
     #pragma acc loop gang(num:32) private(x)
+    /* { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
     for (i = 0; i < 32; i++)
       {
         for (int j = 0; j < 8; j++)
 	  x[j] = j * 2;
 
 	#pragma acc loop worker(num:32)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += x[j % 8];
       }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c
index 88ab245b0ce..5b776f18f72 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of gang-private aggregate variable declared on loop directive, with
@@ -20,6 +26,9 @@ main (int argc, char* argv[])
   #pragma acc kernels copy(arr)
   {
     #pragma acc loop gang private(pt)
+    /* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
     for (i = 0; i < 32; i++)
       {
         pt.x = i;
@@ -28,6 +37,7 @@ main (int argc, char* argv[])
 	pt.attr[5] = i * 6;
 
 	#pragma acc loop worker
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += pt.x + pt.y + pt.z + pt.attr[5];
       }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c
index df4add11df4..696da0f204f 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of vector-private variables declared on loop directive.  */
@@ -11,18 +17,24 @@ main (int argc, char* argv[])
     arr[i] = i;
 
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
         #pragma acc loop worker(num:32)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 
 	    #pragma acc loop vector(length:32) private(x)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
 	    for (k = 0; k < 32; k++)
 	      {
 		x = i ^ j * 3;
@@ -30,6 +42,8 @@ main (int argc, char* argv[])
 	      }
 
 	    #pragma acc loop vector(length:32) private(x)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
 	    for (k = 0; k < 32; k++)
 	      {
 		x = i | j * 5;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c
index 53c56b2d362..2e3b635b023 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of vector-private variables declared on loop directive. Array type.  */
@@ -11,18 +17,24 @@ main (int argc, char* argv[])
     arr[i] = i;
 
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
         #pragma acc loop worker(num:32)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 
 	    #pragma acc loop vector(length:32) private(pt)
+	    /* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
 	    for (k = 0; k < 32; k++)
 	      {
 	        pt[0] = i ^ j * 3;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c
index 95db2f8912e..1aedc7964e2 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of worker-private variables declared on a loop directive.  */
@@ -11,13 +17,17 @@ main (int argc, char* argv[])
     arr[i] = i;
 
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
         #pragma acc loop worker(num:32) private(x)
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    x = i ^ j * 3;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c
index ceaa3ee9ecd..3bf62aae174 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of worker-private variables declared on a loop directive, broadcasting
@@ -12,19 +18,25 @@ main (int argc, char* argv[])
     arr[i] = i;
 
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
         #pragma acc loop worker(num:32) private(x)
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    x = i ^ j * 3;
 
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c
index 193a1d1063b..8de551635ea 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of worker-private variables declared on a loop directive, broadcasting
@@ -12,30 +18,40 @@ main (int argc, char* argv[])
     arr[i] = i;
 
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
         #pragma acc loop worker(num:32) private(x)
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    x = i ^ j * 3;
 
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
 
 	#pragma acc loop worker(num:32) private(x)
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    x = i | j * 5;
 	    
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c
index 4320cd81e69..425fe6321fa 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of worker-private variables declared on a loop directive, broadcasting
@@ -12,25 +18,32 @@ main (int argc, char* argv[])
     arr[i] = i;
 
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
         #pragma acc loop worker(num:32) private(x)
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    x = i ^ j * 3;
 
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	    
 	    x = i | j * 5;
 	    
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c
index 80992eed0f8..c027c024b9c 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of worker-private variables declared on a loop directive, broadcasting
@@ -12,13 +18,19 @@ main (int argc, char* argv[])
     arr[i] = i;
 
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
         #pragma acc loop worker(num:32) private(x)
+	/* { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+	/* { dg-note {variable 'p' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -27,12 +39,14 @@ main (int argc, char* argv[])
 	    x = i ^ j * 3;
 
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	    
 	    *p = i | j * 5;
 	    
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c
index 005ba60a341..4f17566f8f9 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of worker-private variables declared on a loop directive, broadcasting
@@ -18,13 +24,18 @@ main (int argc, char* argv[])
     arr[i] = i;
 
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
         #pragma acc loop worker(num:32) private(pt)
+	/* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -33,10 +44,12 @@ main (int argc, char* argv[])
 	    pt.y = i | j * 5;
 
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.x * k;
 	    
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.y * k;
 	  }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c
index 8d367fb00e0..12b4c548156 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 /* Test of worker-private variables declared on loop directive, broadcasting
@@ -15,14 +21,19 @@ main (int argc, char* argv[])
   /* "pt" is treated as "present_or_copy" on the kernels directive because it
      is an array variable.  */
   #pragma acc kernels copy(arr)
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
     int j;
 
     #pragma acc loop gang(num:32)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 32; i++)
       {
         /* But here, it is made private per-worker.  */
         #pragma acc loop worker(num:32) private(pt)
+	/* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -30,12 +41,14 @@ main (int argc, char* argv[])
 	    pt[0] = i ^ j * 3;
 
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt[0] * k;
 
 	    pt[1] = i | j * 5;
 	    
 	    #pragma acc loop vector(length:32)
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt[1] * k;
 	  }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c
index 98f02e9840a..12272add471 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
@@ -14,8 +20,13 @@ int main ()
     ary[ix] = -1;
   
 #pragma acc parallel num_gangs(32) copy(ary) copy(ondev)
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
 #pragma acc loop gang
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	if (acc_on_device (acc_device_not_host))
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c
index 4152a4e6c82..683bd126279 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
@@ -14,8 +20,13 @@ int main ()
     ary[ix] = -1;
   
 #pragma acc parallel num_gangs(32) copy(ary) copy(ondev)
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
 #pragma acc loop gang (static:1)
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	if (acc_on_device (acc_device_not_host))
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c
index 5c843012061..e5ed2ab7006 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
@@ -16,8 +22,13 @@ int main ()
   
 #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
 	    copy(ary) copy(ondev) copyout(gangsize, workersize, vectorsize)
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
   {
 #pragma acc loop gang worker vector
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	if (acc_on_device (acc_device_not_host))
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
index a4f81a39e24..cb3878b8d4e 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <stdio.h>
 #include <openacc.h>
 #include <alloca.h>
@@ -49,8 +55,13 @@ int main ()
 
 #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
 	    copy(ary) copyout(gangsize, workersize, vectorsize)
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
   {
 #pragma acc loop gang worker vector
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	int g, w, v;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c
index 7107502e070..0c8402703e7 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
@@ -10,8 +16,14 @@ int main ()
   int t = 0, h = 0;
   
 #pragma acc parallel num_gangs(32) copy(ondev)
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
 #pragma acc loop gang  reduction (+:t)
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'val' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-5 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	int val = ix;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c
index 9c4a85f7b16..c1a2d0cffe1 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
@@ -12,8 +18,14 @@ int main ()
 
 #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
 	copy(ondev) copyout(gangsize, workersize, vectorsize)
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
   {
 #pragma acc loop gang worker vector reduction(+:t)
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'val' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-5 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	int val = ix;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c
index 1173c1f57bb..58c7b6ab57f 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
@@ -12,8 +18,14 @@ int main ()
   int vectorsize;
 
 #pragma acc parallel vector_length(32) copy(ondev) copyout(vectorsize)
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
 #pragma acc loop vector reduction (+:t)
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'val' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-5 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	int val = ix;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c
index 84c2296a7b1..85931f5e433 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
@@ -12,10 +18,17 @@ int main ()
   int vectorsize;
 
 #pragma acc parallel vector_length(32) copy(q) copy(ondev) copyout(vectorsize)
+  /* { dg-note {variable 't' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
   {
     int t = q;
     
 #pragma acc loop vector reduction (+:t)
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'val' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-5 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	int val = ix;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c
index 2f749e04ae0..b9ceec9887d 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 /* { dg-additional-options "-Wopenacc-parallelism" } for testing/documenting
    aspects of that functionality.  */
 
@@ -15,9 +21,15 @@ int main ()
 
 #pragma acc parallel num_workers(32) vector_length(32) copy(ondev) \
 	    copyout(workersize)
-  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-2 } */
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-3 } */
   {
 #pragma acc loop worker reduction(+:t)
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'val' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-5 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	int val = ix;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c
index 9727e22d3c2..ff5e4a1656b 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 /* { dg-additional-options "-Wopenacc-parallelism" } for testing/documenting
    aspects of that functionality.  */
 
@@ -15,11 +21,18 @@ int main ()
 
 #pragma acc parallel num_workers(32) vector_length(32) copy(q) copy(ondev) \
 	    copyout(workersize)
-  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-2 } */
+  /* { dg-note {variable 't' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-4 } */
   {
     int t = q;
     
 #pragma acc loop worker reduction(+:t)
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'val' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-5 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	int val = ix;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c
index c360ad11e7c..5d60899acc1 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
@@ -12,8 +18,14 @@ int main ()
   
 #pragma acc parallel num_workers(32) vector_length(32) copy(ondev) \
 	    copyout(workersize, vectorsize)
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
   {
 #pragma acc loop worker vector reduction (+:t)
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'val' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-5 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	int val = ix;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-v-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-v-1.c
index 8c858f30563..9ccc1a89b13 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-v-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-v-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
@@ -16,8 +22,13 @@ int main ()
   
 #pragma acc parallel vector_length(32) copy(ary) copy(ondev) \
 	    copyout(vectorsize)
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
   {
 #pragma acc loop vector
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	if (acc_on_device (acc_device_not_host))
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c
index d639e14a67c..0e99ec62038 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 /* { dg-additional-options "-Wopenacc-parallelism" } for testing/documenting
    aspects of that functionality.  */
 
@@ -19,9 +25,14 @@ int main ()
   
 #pragma acc parallel num_workers(32) vector_length(32) copy(ary) copy(ondev) \
 	    copyout(workersize)
-  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } .-2 } */
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } .-3 } */
   {
 #pragma acc loop worker
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	if (acc_on_device (acc_device_not_host))
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c
index fd4e4cf5ea9..f4707d15394 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <stdio.h>
 #include <openacc.h>
 #include <gomp-constants.h>
@@ -16,8 +22,13 @@ int main ()
   
 #pragma acc parallel num_workers(32) vector_length(32) copy(ary) copy(ondev) \
 	    copyout(workersize, vectorsize)
+  /* { dg-note {variable 'ix' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
   {
 #pragma acc loop worker vector
+    /* { dg-note {variable 'ix' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'g' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    /* { dg-note {variable 'w' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    /* { dg-note {variable 'v' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
     for (unsigned ix = 0; ix < N; ix++)
       {
 	if (acc_on_device (acc_device_not_host))
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c
index b15ee8b22ff..f88babce5d6 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 /* { dg-additional-options "-Wopenacc-parallelism" } for testing/documenting
    aspects of that functionality.  */
 
@@ -63,6 +69,7 @@ main ()
 #pragma acc parallel num_gangs (10) reduction (+:s1) copy(s1)
   {
 #pragma acc loop gang reduction (+:s1)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < 10; i++)
       s1++;
   }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c
index 28222c25da3..2c1ffb15be1 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c
@@ -1,38 +1,99 @@
+/* Tests for gang-private variables, 'atomic' access */
+
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
 #include <assert.h>
+#include <openacc.h>
 
 int main (void)
 {
   int ret;
 
-  #pragma acc parallel num_gangs(1) num_workers(32) copyout(ret)
+
+  ret = 0;
+  #pragma acc parallel num_gangs(1444) num_workers(32) reduction(+: ret) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'w' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_compute$c_compute }
+     { dg-note {variable 'w' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_compute$c_compute }
+     { dg-note {variable 'w' adjusted for OpenACC privatization level: 'gang'} "" { target { ! openacc_host_selected } } l_compute$c_compute } */
+  /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
-    int w = 0;
+    int w = -22;
 
-    #pragma acc loop worker
-    for (int i = 0; i < 32; i++)
+    #pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+    for (int i = 0; i < 2232; i++)
       {
 	#pragma acc atomic update
 	w++;
       }
 
-    ret = (w == 32);
+    ret = (w == -22 + 2232);
   }
-  assert (ret);
+  if (acc_get_device_type () == acc_device_host)
+    assert (ret == 1);
+  else
+    assert (ret == 1444);
+
 
-  #pragma acc parallel num_gangs(1) vector_length(32) copyout(ret)
+  ret = 0;
+  #pragma acc parallel num_gangs(1414) vector_length(32) reduction(+: ret) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'v' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_compute$c_compute }
+     { dg-note {variable 'v' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_compute$c_compute }
+     { dg-note {variable 'v' adjusted for OpenACC privatization level: 'gang'} "" { target { ! openacc_host_selected } } l_compute$c_compute } */
+  /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
-    int v = 0;
+    int v = 10;
 
-    #pragma acc loop vector
-    for (int i = 0; i < 32; i++)
+    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+    for (int i = 0; i < 3201; i++)
       {
 	#pragma acc atomic update
 	v++;
       }
 
-    ret = (v == 32);
+    ret = (v == 10 + 3201);
+  }
+  if (acc_get_device_type () == acc_device_host)
+    assert (ret == 1);
+  else
+    assert (ret == 1414);
+
+
+  ret = 0;
+#pragma acc parallel num_gangs(314) reduction(+: ret) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'v' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_compute$c_compute }
+     { dg-note {variable 'v' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_compute$c_compute }
+     { dg-note {variable 'v' adjusted for OpenACC privatization level: 'gang'} "" { target { ! openacc_host_selected } } l_compute$c_compute } */
+  {
+    int v = -222;
+
+#pragma acc atomic update
+    ++v;
+#pragma acc atomic update
+    ++v;
+#pragma acc atomic update
+    ++v;
+
+    ret += (v == -222 + 3);
   }
-  assert (ret);
+  if (acc_get_device_type () == acc_device_host)
+    assert (ret == 1);
+  else
+    assert (ret == 314);
+
 
   return 0;
 }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/private-atomic-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-atomic-1.c
index 77197d8fd44..e651012f463 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/private-atomic-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-atomic-1.c
@@ -1,5 +1,11 @@
 // 'atomic' access of thread-private variable
 
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <assert.h>
 
 int main (void)
@@ -8,13 +14,20 @@ int main (void)
 
   res = 0;
 #pragma acc parallel reduction(+: res)
+  /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   {
 #pragma acc loop vector reduction(+: res)
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    /* { dg-note {variable 'v' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 }
+       { dg-note {variable 'v' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } .-3 }
+       { dg-note {variable 'v' adjusted for OpenACC privatization level: 'vector'} "" { target { ! openacc_host_selected } } .-4 } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-5 } */
     for (int i = 0; i < 2322; i++)
     {
       int v = -222;
 
 #pragma acc loop seq
+      /* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
       for (int j = 0; j < 121; ++j)
 	{
 #pragma acc atomic update
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/private-variables.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-variables.c
index 3cc6f150f63..366f818a14d 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/private-variables.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-variables.c
@@ -1,6 +1,20 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 /* { dg-additional-options "-Wopenacc-parallelism" } for testing/documenting
    aspects of that functionality.  */
 
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
 #include <assert.h>
 #include <openacc.h>
 
@@ -24,17 +38,20 @@ void local_g_1()
   for (i = 0; i < 32; i++)
     arr[i] = 3;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } .-1 } */
-  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-2 } */
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } l_compute$c_compute } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } l_compute$c_compute } */
   {
     int x;
 
-    #pragma acc loop gang(static:1)
+    #pragma acc loop gang(static:1) /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       x = i * 2;
 
-    #pragma acc loop gang(static:1)
+    #pragma acc loop gang(static:1) /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
 	if (acc_on_device (acc_device_host))
@@ -58,31 +75,41 @@ void local_w_1()
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32)
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker
+        #pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    int x = i ^ j * 3;
 
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
 
-	#pragma acc loop worker
+	#pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    int x = i | j * 5;
 	    
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
@@ -109,26 +136,33 @@ void local_w_2()
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32)
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker
+        #pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    int x = i ^ j * 3;
 
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	    
 	    x = i | j * 5;
 	    
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
@@ -155,14 +189,19 @@ void local_w_3()
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32)
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker
+        #pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'pt' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -171,11 +210,13 @@ void local_w_3()
 	    pt.x = i ^ j * 3;
 	    pt.y = i | j * 5;
 
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.x * k;
 	    
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.y * k;
 	  }
@@ -202,14 +243,22 @@ void local_w_4()
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32)
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker
+        #pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'pt' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+	   { dg-note {variable 'pt' ought to be adjusted for OpenACC privatization level: 'worker'} "" { target *-*-* } l_loop$c_loop }
+	   { dg-note {variable 'pt' adjusted for OpenACC privatization level: 'worker'} "TODO" { target { ! openacc_host_selected } xfail *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'ptp' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -219,13 +268,15 @@ void local_w_4()
 	    
 	    pt.x = i ^ j * 3;
 
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += ptp->x * k;
 
 	    ptp->y = i | j * 5;
 	    
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.y * k;
 	  }
@@ -252,14 +303,19 @@ void local_w_5()
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32)
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker
+        #pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'pt' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -267,13 +323,15 @@ void local_w_5()
 	    
 	    pt[0] = i ^ j * 3;
 
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt[0] * k;
 
 	    pt[1] = i | j * 5;
 	    
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt[1] * k;
 	  }
@@ -299,11 +357,13 @@ void loop_g_1()
   for (i = 0; i < 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } .-1 } */
-  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-2 } */
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } l_compute$c_compute } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } l_compute$c_compute } */
   {
-    #pragma acc loop gang private(x)
+    #pragma acc loop gang private(x) /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
 	x = i * 2;
@@ -326,15 +386,19 @@ void loop_g_2()
   for (i = 0; i < 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-1 } */
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } l_compute$c_compute } */
   {
-    #pragma acc loop gang private(x)
+    #pragma acc loop gang private(x) /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
 	x = i * 2;
 
-	#pragma acc loop worker
+	#pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += x;
       }
@@ -355,15 +419,19 @@ void loop_g_3()
   for (i = 0; i < 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } .-1 } */
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } l_compute$c_compute } */
   {
-    #pragma acc loop gang private(x)
+    #pragma acc loop gang private(x) /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
 	x = i * 2;
 
-	#pragma acc loop vector
+	#pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += x;
       }
@@ -384,17 +452,33 @@ void loop_g_4()
   for (i = 0; i < 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-1 } */
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } l_compute$c_compute } */
   {
-    #pragma acc loop gang private(x)
+    #pragma acc loop gang private(x) /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+       But, with optimizations enabled, per the '*.ssa' dump ('gcc/tree-ssa.c:execute_update_addresses_taken'):
+           No longer having address taken: x
+	   Now a gimple register: x
+       However, 'x' remains in the candidate set:
+       { dg-note {variable 'x' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_loop$c_loop }
+       Now, for GCN offloading, 'adjust_private_decl' does the privatization change right away:
+       { dg-note {variable 'x' adjusted for OpenACC privatization level: 'gang'} "" { target openacc_radeon_accel_selected } l_loop$c_loop }
+       For nvptx offloading, however, we first mark up 'x', and then later apply the privatization change -- or, with optimizations enabled, don't, because we then don't actually call 'expand_var_decl'.
+       { dg-note {variable 'x' adjusted for OpenACC privatization level: 'gang'} "" { target { openacc_nvidia_accel_selected && { ! __OPTIMIZE__ } } } l_loop$c_loop }
+       { dg-bogus {note: variable 'x' adjusted for OpenACC privatization level: 'gang'} "" { target { openacc_nvidia_accel_selected && __OPTIMIZE__ } } l_loop$c_loop }
+    */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+    /* { dg-note {variable 'p' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
         int *p = &x;
 
 	x = i * 2;
 
-	#pragma acc loop worker
+	#pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += x;
 
@@ -417,16 +501,22 @@ void loop_g_5()
   for (i = 0; i < 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-1 } */
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } l_compute$c_compute } */
   {
-    #pragma acc loop gang private(x)
+    #pragma acc loop gang private(x) /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+       { dg-note {variable 'x' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_loop$c_loop }
+       { dg-note {variable 'x' adjusted for OpenACC privatization level: 'gang'} "" { target { ! openacc_host_selected } } l_loop$c_loop } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
         for (int j = 0; j < 8; j++)
 	  x[j] = j * 2;
 
-	#pragma acc loop worker
+	#pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += x[j % 8];
       }
@@ -448,10 +538,13 @@ void loop_g_6()
   for (i = 0; i < 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-1 } */
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } l_compute$c_compute } */
   {
-    #pragma acc loop gang private(pt)
+    #pragma acc loop gang private(pt) /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+    /* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
         pt.x = i;
@@ -459,7 +552,8 @@ void loop_g_6()
 	pt.z = i * 4;
 	pt.attr[5] = i * 6;
 
-	#pragma acc loop worker
+	#pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += pt.x + pt.y + pt.z + pt.attr[5];
       }
@@ -479,26 +573,34 @@ void loop_v_1()
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32)
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker
+        #pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 
-	    #pragma acc loop vector private(x)
+	    #pragma acc loop vector private(x) /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      {
 		x = i ^ j * 3;
 		arr[i * 1024 + j * 32 + k] += x * k;
 	      }
 
-	    #pragma acc loop vector private(x)
+	    #pragma acc loop vector private(x) /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      {
 		x = i | j * 5;
@@ -527,19 +629,25 @@ void loop_v_2()
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32)
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker
+        #pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 
-	    #pragma acc loop vector private(pt)
+	    #pragma acc loop vector private(pt) /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	    /* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      {
 	        pt[0] = i ^ j * 3;
@@ -570,15 +678,19 @@ void loop_w_1()
   for (i = 0; i < 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-1 } */
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker private(x)
+        #pragma acc loop worker private(x) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    x = i ^ j * 3;
@@ -605,20 +717,26 @@ void loop_w_2()
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32)
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker private(x)
+        #pragma acc loop worker private(x) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    x = i ^ j * 3;
 
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
@@ -645,31 +763,41 @@ void loop_w_3()
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32)
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker private(x)
+        #pragma acc loop worker private(x) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    x = i ^ j * 3;
 
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
 
-	#pragma acc loop worker private(x)
+	#pragma acc loop worker private(x) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    x = i | j * 5;
 	    
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
@@ -696,26 +824,33 @@ void loop_w_4()
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32)
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker private(x)
+        #pragma acc loop worker private(x) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    x = i ^ j * 3;
 
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	    
 	    x = i | j * 5;
 	    
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
@@ -742,14 +877,22 @@ void loop_w_5()
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32)
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker private(x)
+        #pragma acc loop worker private(x) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+	   { dg-note {variable 'x' ought to be adjusted for OpenACC privatization level: 'worker'} "" { target *-*-* } l_loop$c_loop }
+	   { dg-note {variable 'x' adjusted for OpenACC privatization level: 'worker'} "TODO" { target { ! openacc_host_selected } xfail *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'p' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -757,13 +900,15 @@ void loop_w_5()
 	    
 	    x = i ^ j * 3;
 
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	    
 	    *p = i | j * 5;
 	    
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
@@ -791,14 +936,19 @@ void loop_w_6()
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32)
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker private(pt)
+        #pragma acc loop worker private(pt) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -806,11 +956,13 @@ void loop_w_6()
 	    pt.x = i ^ j * 3;
 	    pt.y = i | j * 5;
 
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.x * k;
 	    
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.y * k;
 	  }
@@ -840,28 +992,35 @@ void loop_w_7()
 
   /* "pt" is treated as "present_or_copy" on the parallel directive because it
      is an array variable.  */
-  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32)
+  #pragma acc parallel copy(arr) num_gangs(32) num_workers(32) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
         /* But here, it is made private per-worker.  */
-        #pragma acc loop worker private(pt)
+        #pragma acc loop worker private(pt) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    
 	    pt[0] = i ^ j * 3;
 
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt[0] * k;
 
 	    pt[1] = i | j * 5;
 	    
-	    #pragma acc loop vector
+	    #pragma acc loop vector /* { dg-line l_loop[incr c_loop] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt[1] * k;
 	  }
@@ -887,15 +1046,17 @@ void parallel_g_1()
   for (i = 0; i < 32; i++)
     arr[i] = 3;
 
-  #pragma acc parallel private(x) copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } .-1 } */
-  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-2 } */
+  #pragma acc parallel private(x) copy(arr) num_gangs(32) num_workers(8) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } l_compute$c_compute } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } l_compute$c_compute } */
   {
-    #pragma acc loop gang(static:1)
+    #pragma acc loop gang(static:1) /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       x = i * 2;
 
-    #pragma acc loop gang(static:1)
+    #pragma acc loop gang(static:1) /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
 	if (acc_on_device (acc_device_host))
@@ -918,17 +1079,20 @@ void parallel_g_2()
   for (i = 0; i < 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc parallel private(x) copy(arr) num_gangs(32) num_workers(2) vector_length(32)
-  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-1 } */
+  #pragma acc parallel private(x) copy(arr) num_gangs(32) num_workers(2) vector_length(32) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } l_compute$c_compute } */
   {
-    #pragma acc loop gang
+    #pragma acc loop gang /* { dg-line l_loop[incr c_loop] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
     for (i = 0; i < 32; i++)
       {
         int j;
 	for (j = 0; j < 32; j++)
 	  x[j] = j * 2;
 	
-	#pragma acc loop worker
+	#pragma acc loop worker /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } */
 	for (j = 0; j < 32; j++)
 	  arr[i * 32 + j] += x[31 - j];
       }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-4.c
index d6ff44df5a1..0402e44e3c5 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-4.c
@@ -1,3 +1,9 @@
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 #include <stdlib.h>
 #include <stdio.h>
 
@@ -11,6 +17,7 @@ vector (int *a)
   int i;
 
 #pragma acc loop vector
+  /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   for (i = 0; i < N; i++)
     a[i] -= a[i]; 
 }
@@ -22,9 +29,11 @@ worker (int *b)
   int i, j;
 
 #pragma acc loop worker
+  /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   for (i = 0; i < N; i++)
     {
 #pragma acc loop vector
+      /* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
       for (j = 0; j < M; j++)
         b[i * M + j] += b[i  * M + j]; 
     }
@@ -37,6 +46,7 @@ gang (int *a)
   int i;
 
 #pragma acc loop gang worker vector
+  /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
   for (i = 0; i < N; i++)
     a[i] -= i; 
 }
@@ -66,6 +76,7 @@ main(int argc, char **argv)
 #pragma acc parallel copy (a[0:N])
   {
 #pragma acc loop seq
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < N; i++)
       seq (&a[0]);
   }
@@ -79,6 +90,7 @@ main(int argc, char **argv)
 #pragma acc parallel copy (a[0:N])
   {
 #pragma acc loop seq
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < N; i++)
       gang (&a[0]);
   }
@@ -109,6 +121,7 @@ main(int argc, char **argv)
 #pragma acc parallel copy (a[0:N])
   {
 #pragma acc loop
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
     for (i = 0; i < N; i++)
       vector (&a[0]);
   }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/static-variable-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/static-variable-1.c
index 0c071c37346..6a4c6a0e85f 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/static-variable-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/static-variable-1.c
@@ -9,6 +9,12 @@
    variables" (only visible to members of the GitHub OpenACC organization).
 */
 
+/* { dg-additional-options "-fopt-info-note-omp" }
+   { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=-fopt-info-note-omp" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   for testing/documenting aspects of that functionality.  */
+
 /* { dg-additional-options "-Wopenacc-parallelism" } for testing/documenting
    aspects of that functionality.  */
 
@@ -40,6 +46,9 @@ static void t0_c(void)
 #pragma acc parallel \
   reduction(max:num_gangs_actual) \
   reduction(max:result)
+      /* { dg-note {variable 'var' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-3 }
+	 { dg-note {variable 'var' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } .-4 }
+	 { dg-note {variable 'var' adjusted for OpenACC privatization level: 'gang'} "" { target { ! openacc_host_selected } } .-5 } */
       {
 	num_gangs_actual = 1 + __builtin_goacc_parlevel_id(GOMP_DIM_GANG);
 
@@ -134,6 +143,9 @@ static void t1_c(void)
   num_gangs(num_gangs_request) \
   reduction(max:num_gangs_actual) \
   reduction(max:result)
+      /* { dg-note {variable 'var' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-4 }
+	 { dg-note {variable 'var' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } .-5 }
+	 { dg-note {variable 'var' adjusted for OpenACC privatization level: 'gang'} "" { target { ! openacc_host_selected } } .-6 } */
       {
 	num_gangs_actual = 1 + __builtin_goacc_parlevel_id(GOMP_DIM_GANG);
 
@@ -290,6 +302,7 @@ static void t2(void)
 
 #pragma acc data \
   copy(results_1, results_2, results_3)
+  /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
   {
     for (int i = 0; i < i_limit; ++i)
       {
@@ -304,6 +317,10 @@ static void t2(void)
   present(results_1) \
   num_gangs(num_gangs_request_1) \
   async(1)
+	/* { dg-note {variable 'var' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-4 }
+	   { dg-note {variable 'var' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } .-5 }
+	   { dg-note {variable 'var' adjusted for OpenACC privatization level: 'gang'} "" { target { ! openacc_host_selected } } .-6 } */
+	/* { dg-note {variable 'tmp' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-7 } */
 	{
 	  static int var = var_init_1;
 
@@ -327,6 +344,10 @@ static void t2(void)
   present(results_3) \
   num_gangs(num_gangs_request_3) \
   async(3)
+	/* { dg-note {variable 'var' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-4 }
+	   { dg-note {variable 'var' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } .-5 }
+	   { dg-note {variable 'var' adjusted for OpenACC privatization level: 'gang'} "" { target { ! openacc_host_selected } } .-6 } */
+	/* { dg-note {variable 'tmp' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-7 } */
 	{
 	  static int var = var_init_3;
 
@@ -447,6 +468,9 @@ static void pr84992_1(void)
   int n[1];
   n[0] = 3;
 #pragma acc parallel copy(n)
+  /* { dg-note {variable 'test' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-1 }
+     { dg-note {variable 'test' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } .-2 }
+     { dg-note {variable 'test' adjusted for OpenACC privatization level: 'gang'} "" { target { ! openacc_host_selected } } .-3 } */
   {
     static const int test[] = {1,2,3,4};
     n[0] = test[n[0]];
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90
index 1a8432cfa86..ace935817dc 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90
@@ -1,6 +1,12 @@
 ! { dg-do run }
 ! { dg-additional-options "-cpp" }
-!
+
+! { dg-additional-options "-fopt-info-all-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-all-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 ! TODO: Have to disable the acc_on_device builtin for we want to test the
 ! libgomp library function?  The command line option
 ! '-fno-builtin-acc_on_device' is valid for C/C++/ObjC/ObjC++ but not for
@@ -20,6 +26,8 @@ if (acc_on_device (acc_device_nvidia)) STOP 4
 ! Host via offloading fallback mode.
 
 !$acc parallel if(.false.)
+! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } .-1 }
+!TODO Unhandled 'CONST_DECL' instances for constant arguments in 'acc_on_device' calls.
 if (.not. acc_on_device (acc_device_none)) STOP 5
 if (.not. acc_on_device (acc_device_host)) STOP 6
 if (acc_on_device (acc_device_not_host)) STOP 7
@@ -32,6 +40,7 @@ if (acc_on_device (acc_device_nvidia)) STOP 8
 ! Offloaded.
 
 !$acc parallel
+! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target { ! openacc_host_selected } } .-1 }
 if (acc_on_device (acc_device_none)) STOP 9
 if (acc_on_device (acc_device_host)) STOP 10
 if (.not. acc_on_device (acc_device_not_host)) STOP 11
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f
index 56f99d4f99b..56270b12970 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f
+++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f
@@ -1,6 +1,12 @@
 ! { dg-do run }
 ! { dg-additional-options "-cpp" }
-!
+
+! { dg-additional-options "-fopt-info-all-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-all-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 ! TODO: Have to disable the acc_on_device builtin for we want to test
 ! the libgomp library function?  The command line option
 ! '-fno-builtin-acc_on_device' is valid for C/C++/ObjC/ObjC++ but not
@@ -20,6 +26,8 @@
 !Host via offloading fallback mode.
 
 !$ACC PARALLEL IF(.FALSE.)
+! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } .-1 }
+!TODO Unhandled 'CONST_DECL' instances for constant arguments in 'acc_on_device' calls.
       IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NONE)) STOP 5
       IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_HOST)) STOP 6
       IF (ACC_ON_DEVICE (ACC_DEVICE_NOT_HOST)) STOP 7
@@ -32,6 +40,7 @@
 ! Offloaded.
 
 !$ACC PARALLEL
+! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target { ! openacc_host_selected } } .-1 }
       IF (ACC_ON_DEVICE (ACC_DEVICE_NONE)) STOP 9
       IF (ACC_ON_DEVICE (ACC_DEVICE_HOST)) STOP 10
       IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NOT_HOST)) STOP 11
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f
index 565723851b1..a8b9cddd1ae 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f
+++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f
@@ -1,6 +1,12 @@
 ! { dg-do run }
 ! { dg-additional-options "-cpp" }
-!
+
+! { dg-additional-options "-fopt-info-all-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-all-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 ! TODO: Have to disable the acc_on_device builtin for we want to test
 ! the libgomp library function?  The command line option
 ! '-fno-builtin-acc_on_device' is valid for C/C++/ObjC/ObjC++ but not
@@ -20,6 +26,8 @@
 !Host via offloading fallback mode.
 
 !$ACC PARALLEL IF(.FALSE.)
+! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } .-1 }
+!TODO Unhandled 'CONST_DECL' instances for constant arguments in 'acc_on_device' calls.
       IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NONE)) STOP 5
       IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_HOST)) STOP 6
       IF (ACC_ON_DEVICE (ACC_DEVICE_NOT_HOST)) STOP 7
@@ -32,6 +40,7 @@
 ! Offloaded.
 
 !$ACC PARALLEL
+! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target { ! openacc_host_selected } } .-1 }
       IF (ACC_ON_DEVICE (ACC_DEVICE_NONE)) STOP 9
       IF (ACC_ON_DEVICE (ACC_DEVICE_HOST)) STOP 10
       IF (.NOT. ACC_ON_DEVICE (ACC_DEVICE_NOT_HOST)) STOP 11
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/declare-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/declare-1.f90
index 084f336faa9..51776a1d260 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/declare-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-1.f90
@@ -1,6 +1,12 @@
 ! { dg-do run }
 ! { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } }
 
+! { dg-additional-options "-fopt-info-all-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-all-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 ! Tests to exercise the declare directive along with
 ! the clauses: copy
 !              copyin
@@ -34,6 +40,7 @@ subroutine subr5 (a, b, c, d)
   i = 0
 
   !$acc parallel
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
     do i = 1, N
       b(i) = a(i)
       c(i) = b(i)
@@ -55,6 +62,7 @@ subroutine subr4 (a, b)
   i = 0
 
   !$acc parallel
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 1, N
     b(i) = a(i)
   end do
@@ -74,6 +82,7 @@ subroutine subr3 (a, c)
   i = 0
 
   !$acc parallel
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 1, N
     a(i) = c(i)
     c(i) = 0
@@ -96,6 +105,7 @@ subroutine subr2 (a, b, c)
   i = 0
 
   !$acc parallel
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 1, N
     b(i) = a(i)
     c(i) = b(i) + c(i) + 1
@@ -114,6 +124,7 @@ subroutine subr1 (a)
   i = 0
 
   !$acc parallel
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 1, N
     a(i) = a(i) + 1
   end do
@@ -133,6 +144,9 @@ subroutine test (a, e)
 end subroutine
 
 subroutine subr0 (a, b, c, d)
+  ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } .-1 }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+  ! { dg-note {variable 'a\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 }
   implicit none
   integer, parameter :: N = 8
   integer :: a(N)
@@ -198,6 +212,10 @@ subroutine subr0 (a, b, c, d)
 end subroutine
 
 program main
+  ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } .-1 }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+  ! { dg-note {variable 'S\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 }
+  ! { dg-note {variable 'desc\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { target *-*-* } .-4 }
   use vars
   use openacc
   implicit none
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/host_data-5.F90 b/libgomp/testsuite/libgomp.oacc-fortran/host_data-5.F90
index 483ac3fb668..93e9ee09818 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/host_data-5.F90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/host_data-5.F90
@@ -1,7 +1,13 @@
 ! { dg-do run }
 !
 ! Test if, if_present clauses on host_data construct.
-!
+
+! { dg-additional-options "-fopt-info-all-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-all-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 ! Fortran variant of 'libgomp.oacc-c-c++-common/host_data-7.c'.
 !
 program main
@@ -33,11 +39,24 @@ subroutine foo (p2, parr, host_p, host_parr, cond)
 #endif
   
   !$acc data copyin(host_p, host_parr)
+  ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target { ! openacc_host_selected } } .-1 }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target { ! openacc_host_selected } } .-2 }
+  ! { dg-note {variable 'p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 }
+  ! { dg-note {variable 'parr\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { target { ! openacc_host_selected } } .-5 }
 #if !ACC_MEM_SHARED
     if (acc_is_present(p, c_sizeof(p))) stop 5
     if (acc_is_present(parr, 1)) stop 6
 #endif
     !$acc host_data use_device(p, parr) if_present
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+    ! { dg-note {variable 'transfer\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+    ! { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 }
+    ! { dg-note {variable 'parr\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 }
+    ! { dg-note {variable 'host_parr\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-5 }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { target *-*-* } .-6 }
+    ! { dg-note {variable 'transfer\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { target *-*-* } .-7 }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-8 }
       ! not mapped yet, so it will be equal to the host pointer.
       if (transfer(c_loc(p), host_p) /= host_p) stop 7
       if (transfer(c_loc(parr), host_parr) /= host_parr) stop 8
@@ -48,6 +67,17 @@ subroutine foo (p2, parr, host_p, host_parr, cond)
 #endif
 
     !$acc data copy(p, parr)
+    ! { dg-note {variable 'p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+    ! { dg-note {variable 'parr\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 }
+    ! { dg-note {variable 'transfer\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 }
+    ! { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-5 }
+    ! { dg-note {variable 'host_parr\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-6 }
+    ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } .-7 }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { target *-*-* } .-8 }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { target *-*-* } .-9 }
+    ! { dg-note {variable 'transfer\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { target *-*-* } .-10 }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-11 }
       if (.not. acc_is_present(p, c_sizeof(p))) stop 11
       if (.not. acc_is_present(parr, 1)) stop 12
       ! Not inside a host_data construct, so still the host pointer.
@@ -55,6 +85,14 @@ subroutine foo (p2, parr, host_p, host_parr, cond)
       if (transfer(c_loc(parr), host_parr) /= host_parr) stop 14
       
       !$acc host_data use_device(p, parr)
+      ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+      ! { dg-note {variable 'transfer\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+      ! { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 }
+      ! { dg-note {variable 'parr\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 }
+      ! { dg-note {variable 'host_parr\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-5 }
+      ! { dg-note {variable 'D\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { target *-*-* } .-6 }
+      ! { dg-note {variable 'transfer\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { target *-*-* } .-7 }
+      ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-8 }
 #if ACC_MEM_SHARED
         if (transfer(c_loc(p), host_p) /= host_p) stop 15
         if (transfer(c_loc(parr), host_parr) /= host_parr) stop 16
@@ -66,6 +104,14 @@ subroutine foo (p2, parr, host_p, host_parr, cond)
       !$acc end host_data
 
       !$acc host_data use_device(p, parr) if_present
+        ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        ! { dg-note {variable 'transfer\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+        ! { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 }
+        ! { dg-note {variable 'parr\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 }
+        ! { dg-note {variable 'host_parr\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-5 }
+        ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-6 }
+        ! { dg-note {variable 'D\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { target *-*-* } .-7 }
+        ! { dg-note {variable 'transfer\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { target *-*-* } .-8 }
 #if ACC_MEM_SHARED
         if (transfer(c_loc(p), host_p) /= host_p) stop 19
         if (transfer(c_loc(parr), host_parr) /= host_parr) stop 20
@@ -77,6 +123,14 @@ subroutine foo (p2, parr, host_p, host_parr, cond)
       !$acc end host_data
 
       !$acc host_data use_device(p, parr) if(cond)
+        ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        ! { dg-note {variable 'transfer\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+        ! { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 }
+        ! { dg-note {variable 'parr\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 }
+        ! { dg-note {variable 'host_parr\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-5 }
+        ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-6 }
+        ! { dg-note {variable 'D\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { target *-*-* } .-7 }
+        ! { dg-note {variable 'transfer\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "TODO" { target *-*-* } .-8 }
 #if ACC_MEM_SHARED
         if (transfer(c_loc(p), host_p) /= host_p) stop 23
         if (transfer(c_loc(parr), host_parr) /= host_parr) stop 24
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/if-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/if-1.f90
index f3bf1ee5af6..3089d6a0c43 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/if-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/if-1.f90
@@ -1,6 +1,20 @@
 ! { dg-do run }
 ! { dg-additional-options "-cpp" }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
+
 program main
   use openacc
   implicit none
@@ -19,8 +33,11 @@ program main
 
   a(:) = 4.0
 
-  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (1 == 1)
+  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (1 == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
      do i = 1, N
+        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
+        !TODO Unhandled 'CONST_DECL' instances for constant argument in 'acc_on_device' call.
         if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
           b(i) = a(i) + 1
         else
@@ -41,8 +58,10 @@ program main
 
   a(:) = 16.0
 
-  !$acc parallel if (0 == 1)
+  !$acc parallel if (0 == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
      do i = 1, N
+        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
        if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
          b(i) = a(i) + 1
        else
@@ -57,8 +76,10 @@ program main
 
   a(:) = 8.0
 
-  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (one == 1)
+  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (one == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -79,8 +100,10 @@ program main
 
   a(:) = 22.0
 
-  !$acc parallel if (zero == 1)
+  !$acc parallel if (zero == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -95,8 +118,10 @@ program main
 
   a(:) = 16.0
 
-  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (.TRUE.)
+  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (.TRUE.) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -117,8 +142,10 @@ program main
 
   a(:) = 76.0
 
-  !$acc parallel if (.FALSE.)
+  !$acc parallel if (.FALSE.) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -135,8 +162,10 @@ program main
 
   nn = 1
 
-  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (nn == 1)
+  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (nn == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -159,8 +188,10 @@ program main
 
   nn = 0
 
-  !$acc parallel if (nn == 1)
+  !$acc parallel if (nn == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -177,8 +208,10 @@ program main
 
   nn = 1
 
-  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if ((nn + nn) > 0)
+  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if ((nn + nn) > 0) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -201,8 +234,10 @@ program main
 
   nn = 0;
 
-  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if ((nn + nn) > 0)
+  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if ((nn + nn) > 0) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -217,8 +252,10 @@ program main
 
   a(:) = 91.0
 
-  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (-2 > 0)
+  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (-2 > 0) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -233,8 +270,10 @@ program main
 
   a(:) = 43.0
 
-  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (one == 1)
+  !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (one == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -255,8 +294,10 @@ program main
 
   a(:) = 87.0
 
-  !$acc parallel if (one == 0)
+  !$acc parallel if (one == 0) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -333,8 +374,11 @@ program main
   b(:) = 0.0
 
   !$acc data copyin (a(1:N)) copyout (b(1:N)) if (1 == 1)
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
 
-    !$acc parallel present (a(1:N))
+    !$acc parallel present (a(1:N)) ! { dg-line l_compute[incr c_compute] }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
        do i = 1, N
            b(i) = a(i)
        end do
@@ -349,6 +393,7 @@ program main
   b(:) = 1.0
 
   !$acc data copyin (a(1:N)) copyout (b(1:N)) if (0 == 1)
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target { ! openacc_host_selected } } .-1 }
 
 #if !ACC_MEM_SHARED
   if (acc_is_present (a) .eqv. .TRUE.) STOP 21
@@ -361,18 +406,25 @@ program main
   b(:) = 21.0
 
   !$acc data copyin (a(1:N)) if (1 == 1)
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
 
 #if !ACC_MEM_SHARED
     if (acc_is_present (a) .eqv. .FALSE.) STOP 23
 #endif
 
     !$acc data copyout (b(1:N)) if (0 == 1)
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
 #if !ACC_MEM_SHARED
       if (acc_is_present (b) .eqv. .TRUE.) STOP 24
 #endif
         !$acc data copyout (b(1:N)) if (1 == 1)
+        ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
 
-        !$acc parallel present (a(1:N)) present (b(1:N))
+        !$acc parallel present (a(1:N)) present (b(1:N)) ! { dg-line l_compute[incr c_compute] }
+        ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
           do i = 1, N
             b(i) = a(i)
           end do
@@ -452,8 +504,10 @@ program main
 
   a(:) = 4.0
 
-  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (1 == 1)
+  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (1 == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
      do i = 1, N
+        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
         if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
           b(i) = a(i) + 1
         else
@@ -474,8 +528,10 @@ program main
 
   a(:) = 16.0
 
-  !$acc kernels if (0 == 1)
+  !$acc kernels if (0 == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
      do i = 1, N
+        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
        if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
          b(i) = a(i) + 1
        else
@@ -490,8 +546,10 @@ program main
 
   a(:) = 8.0
 
-  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (one == 1)
+  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (one == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -512,8 +570,10 @@ program main
 
   a(:) = 22.0
 
-  !$acc kernels if (zero == 1)
+  !$acc kernels if (zero == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -528,8 +588,10 @@ program main
 
   a(:) = 16.0
 
-  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (.TRUE.)
+  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (.TRUE.) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -550,8 +612,10 @@ program main
 
   a(:) = 76.0
 
-  !$acc kernels if (.FALSE.)
+  !$acc kernels if (.FALSE.) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -568,8 +632,10 @@ program main
 
   nn = 1
 
-  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (nn == 1)
+  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (nn == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -592,8 +658,10 @@ program main
 
   nn = 0
 
-  !$acc kernels if (nn == 1)
+  !$acc kernels if (nn == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -610,8 +678,10 @@ program main
 
   nn = 1
 
-  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if ((nn + nn) > 0)
+  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if ((nn + nn) > 0) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -634,8 +704,10 @@ program main
 
   nn = 0;
 
-  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if ((nn + nn) > 0)
+  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if ((nn + nn) > 0) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -650,8 +722,10 @@ program main
 
   a(:) = 91.0
 
-  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (-2 > 0)
+  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (-2 > 0) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -666,8 +740,10 @@ program main
 
   a(:) = 43.0
 
-  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (one == 1)
+  !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (one == 1) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -688,8 +764,10 @@ program main
 
   a(:) = 87.0
 
-  !$acc kernels if (one == 0)
+  !$acc kernels if (one == 0) ! { dg-line l_compute[incr c_compute] }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
+      ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
         b(i) = a(i) + 1
       else
@@ -766,8 +844,11 @@ program main
   b(:) = 0.0
 
   !$acc data copyin (a(1:N)) copyout (b(1:N)) if (1 == 1)
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
 
-    !$acc kernels present (a(1:N))
+    !$acc kernels present (a(1:N)) ! { dg-line l_compute[incr c_compute] }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
        do i = 1, N
            b(i) = a(i)
        end do
@@ -782,6 +863,7 @@ program main
   b(:) = 1.0
 
   !$acc data copyin (a(1:N)) copyout (b(1:N)) if (0 == 1)
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target { ! openacc_host_selected } } .-1 }
 
 #if !ACC_MEM_SHARED
   if (acc_is_present (a) .eqv. .TRUE.) STOP 56
@@ -794,18 +876,25 @@ program main
   b(:) = 21.0
 
   !$acc data copyin (a(1:N)) if (1 == 1)
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
 
 #if !ACC_MEM_SHARED
     if (acc_is_present (a) .eqv. .FALSE.) STOP 58
 #endif
 
     !$acc data copyout (b(1:N)) if (0 == 1)
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
 #if !ACC_MEM_SHARED
       if (acc_is_present (b) .eqv. .TRUE.) STOP 59
 #endif
         !$acc data copyout (b(1:N)) if (1 == 1)
+        ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
 
-        !$acc kernels present (a(1:N)) present (b(1:N))
+        !$acc kernels present (a(1:N)) present (b(1:N)) ! { dg-line l_compute[incr c_compute] }
+        ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
           do i = 1, N
             b(i) = a(i)
           end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90
index bcc0476d665..0ae7c4bc761 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90
@@ -2,6 +2,12 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 program main
   integer :: x, i, arr(32)
 
@@ -11,6 +17,8 @@ program main
 
   !$acc kernels copy(arr)
   !$acc loop gang(num:32) private(x)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
   do i = 1, 32
      x = i * 2;
      arr(i) = arr(i) + x;
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90
index 5571059588f..e3ff24848b6 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90
@@ -3,6 +3,12 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 program main
   integer :: x, i, j, arr(0:32*32)
 
@@ -12,10 +18,13 @@ program main
 
   !$acc kernels copy(arr)
   !$acc loop gang(num:32) private(x)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
   do i = 0, 31
      x = i * 2;
 
      !$acc loop worker(num:32)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
      do j = 0, 31
         arr(i * 32 + j) = arr(i * 32 + j) + x;
      end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90
index 6abbed7f489..370a25a7db6 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90
@@ -3,6 +3,12 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 program main
   integer :: x, i, j, arr(0:32*32)
 
@@ -12,10 +18,13 @@ program main
 
   !$acc kernels copy(arr)
   !$acc loop gang(num:32) private(x)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
   do i = 0, 31
      x = i * 2;
 
      !$acc loop vector(length:32)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
      do j = 0, 31
         arr(i * 32 + j) = arr(i * 32 + j) + x;
      end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90
index d92be2d4f0e..abb86d0824f 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90
@@ -3,6 +3,12 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 program main
   type vec3
      integer x, y, z, attr(13)
@@ -17,6 +23,8 @@ program main
 
   !$acc kernels copy(arr)
   !$acc loop gang(num:32) private(pt)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  ! { dg-note {variable 'pt' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 }
   do i = 0, 31
      pt%x = i
      pt%y = i * 2
@@ -24,6 +32,7 @@ program main
      pt%attr(5) = i * 6
 
      !$acc loop vector(length:32)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
      do j = 0, 31
         arr(i * 32 + j) = arr(i * 32 + j) + pt%x + pt%y + pt%z + pt%attr(5);
      end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90
index e9c0fb3f130..fe796f3ba46 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90
@@ -2,6 +2,12 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 program main
   integer :: x, i, j, k, idx, arr(0:32*32*32)
 
@@ -11,15 +17,21 @@ program main
 
   !$acc kernels copy(arr)
   !$acc loop gang(num:32)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 0, 31
      !$acc loop worker(num:8)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
      do j = 0, 31
         !$acc loop vector(length:32) private(x)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
         do k = 0, 31
            x = ieor(i, j * 3)
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
         !$acc loop vector(length:32) private(x)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
         do k = 0, 31
            x = ior(i, j * 5)
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90
index 13badb51919..b5cefeccc22 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90
@@ -2,6 +2,12 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 program main
   integer :: i, j, k, idx, arr(0:32*32*32), pt(2)
 
@@ -11,10 +17,14 @@ program main
 
   !$acc kernels copy(arr)
   !$acc loop gang(num:32)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 0, 31
      !$acc loop worker(num:8)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
      do j = 0, 31
         !$acc loop vector(length:32) private(x, pt)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        ! { dg-note {variable 'pt' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 }
         do k = 0, 31
            pt(1) = ieor(i, j * 3)
            pt(2) = ior(i, j * 5)
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90
index 04d732ef410..3fd1239da4b 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90
@@ -2,6 +2,12 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 program main
   integer :: x, i, j, arr(0:32*32)
   common x
@@ -12,8 +18,11 @@ program main
 
   !$acc kernels copy(arr)
   !$acc loop gang(num:32) private(x)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 0, 31
      !$acc loop worker(num:8) private(x)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
      do j = 0, 31
         x = ieor(i, j * 3)
         arr(i * 32 + j) = arr(i * 32 + j) + x
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90
index 6c9a6b81c8a..1dc5d9e8eff 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90
@@ -3,6 +3,12 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 program main
   integer :: x, i, j, k, idx, arr(0:32*32*32)
 
@@ -12,12 +18,16 @@ program main
 
   !$acc kernels copy(arr)
   !$acc loop gang(num:32)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 0, 31
      !$acc loop worker(num:8) private(x)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
      do j = 0, 31
         x = ieor(i, j * 3)
 
         !$acc loop vector(length:32)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90
index fab14c3a953..25bc67abb8b 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90
@@ -3,6 +3,12 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 program main
   integer :: x, i, j, k, idx, arr(0:32*32*32)
 
@@ -12,22 +18,29 @@ program main
 
   !$acc kernels copy(arr)
   !$acc loop gang(num:32)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 0, 31
      !$acc loop worker(num:8) private(x)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
      do j = 0, 31
         x = ieor(i, j * 3)
 
         !$acc loop vector(length:32)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
      end do
 
      !$acc loop worker(num:8) private(x)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
      do j = 0, 31
         x = ior(i, j * 5)
 
         !$acc loop vector(length:32)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90
index 71f4a110acb..b3f66eaf773 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90
@@ -3,6 +3,12 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 program main
   integer :: x, i, j, k, idx, arr(0:32*32*32)
 
@@ -12,12 +18,16 @@ program main
 
   !$acc kernels copy(arr)
   !$acc loop gang(num:32)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 0, 31
      !$acc loop worker(num:8) private(x)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
      do j = 0, 31
         x = ieor(i, j * 3)
 
         !$acc loop vector(length:32)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
@@ -25,6 +35,7 @@ program main
         x = ior(i, j * 5)
 
         !$acc loop vector(length:32)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90
index bb457555a42..d9dbb0736f3 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90
@@ -3,6 +3,12 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 program main
   integer :: i, j, k, idx, arr(0:32*32*32)
   integer, target :: x
@@ -14,13 +20,18 @@ program main
 
   !$acc kernels copy(arr)
   !$acc loop gang(num:32)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 0, 31
      !$acc loop worker(num:8) private(x, p)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     ! { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 }
+     ! { dg-note {variable 'p' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 }
      do j = 0, 31
         p => x
         x = ieor(i, j * 3)
 
         !$acc loop vector(length:32)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
@@ -28,6 +39,7 @@ program main
         p = ior(i, j * 5)
 
         !$acc loop vector(length:32)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90
index e169714dd51..b4225c2bf47 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90
@@ -3,6 +3,12 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 program main
   type vec2
      integer x, y
@@ -17,18 +23,23 @@ program main
 
   !$acc kernels copy(arr)
   !$acc loop gang(num:32)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 0, 31
      !$acc loop worker(num:8) private(pt)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     ! { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
      do j = 0, 31
         pt%x = ieor(i, j * 3)
         pt%y = ior(i, j * 5)
         
         !$acc loop vector(length:32)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt%x * k
         end do
 
         !$acc loop vector(length:32)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt%y * k
         end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90
index e262c02ac00..76bbda72787 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90
@@ -3,6 +3,12 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 program main
   integer :: i, j, k, idx, arr(0:32*32*32), pt(2)
 
@@ -12,18 +18,23 @@ program main
 
   !$acc kernels copy(arr)
   !$acc loop gang(num:32)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 0, 31
      !$acc loop worker(num:8) private(pt)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     ! { dg-note {variable 'pt' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 }
      do j = 0, 31
         pt(1) = ieor(i, j * 3)
         pt(2) = ior(i, j * 5)
         
         !$acc loop vector(length:32)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt(1) * k
         end do
 
         !$acc loop vector(length:32)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt(2) * k
         end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90 b/libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90
index 4d36d869b0c..4e67809f769 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/optional-private.f90
@@ -4,9 +4,16 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 ! { dg-additional-options "-Wopenacc-parallelism" } for testing/documenting
 ! aspects of that functionality.
 
+
 program main
   implicit none
 
@@ -36,6 +43,8 @@ contains
     ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } .-1 }
     ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-2 }
     !$acc loop gang private(x)
+    ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+    ! { dg-note {variable 'x' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } .-2 }
     do i = 1, 32
        x = i * 2;
        arr(i) = arr(i) + x
@@ -62,6 +71,8 @@ contains
     !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
     ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } .-1 }
     !$acc loop gang private(pt)
+    ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+    ! { dg-note {variable 'pt' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } .-2 }
     do i = 0, 31
        pt%x = i
        pt%y = i * 2
@@ -69,6 +80,7 @@ contains
        pt%attr(5) = i * 6
 
        !$acc loop vector
+       ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
        do j = 0, 31
           arr(i * 32 + j) = arr(i * 32 + j) + pt%x + pt%y + pt%z + pt%attr(5);
        end do
@@ -92,10 +104,14 @@ contains
 
     !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
     !$acc loop gang
+    ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
     do i = 0, 31
        !$acc loop worker
+       ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
        do j = 0, 31
           !$acc loop vector private(pt)
+          ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+          ! { dg-note {variable 'pt' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { target *-*-* } .-2 }
           do k = 0, 31
              pt(1) = ieor(i, j * 3)
              pt(2) = ior(i, j * 5)
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90 b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
index f69ab5a6642..fad3d9d6a80 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
@@ -5,6 +5,12 @@
 ! { dg-do run }
 ! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 ! { dg-additional-options "-Wopenacc-parallelism" } for testing/documenting
 ! aspects of that functionality.
 
@@ -62,6 +68,7 @@ program main
   vectors_max = -huge(gangs_max) - 1 ! INT_MIN
   !$acc serial &
   !$acc   reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } }
+  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 100, -99, -1
      gangs_min = acc_gang ();
      gangs_max = acc_gang ();
@@ -90,6 +97,8 @@ program main
   ! { dg-bogus "\[Ww\]arning: region contains gang partitioned code but is not gang partitioned" "TODO 'serial'" { xfail *-*-* } .-1 }
   ! { dg-bogus "\[Ww\]arning: region contains worker partitioned code but is not worker partitioned" "TODO 'serial'" { xfail *-*-* } .-2 }
   ! { dg-bogus "\[Ww\]arning: region contains vector partitioned code but is not vector partitioned" "TODO 'serial'" { xfail *-*-* } .-3 }
+  ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } .-4 }
+  !TODO Unhandled 'CONST_DECL' instance for constant argument in 'acc_on_device' call.
   if (acc_on_device (acc_device_nvidia)) then
      ! The GCC nvptx back end enforces vector_length (32).
      ! It's unclear if that's actually permissible here;
@@ -98,10 +107,14 @@ program main
    vectors_actual = 32
   end if
   !$acc loop gang reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 100, -99, -1
      !$acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
      do j = 100, -99, -1
         !$acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
         do k = 100 * vectors_actual, -99 * vectors_actual, -1
            gangs_min = acc_gang ();
            gangs_max = acc_gang ();
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90 b/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90
index 81487d7a7e0..4be7507e7ab 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90
@@ -1,17 +1,27 @@
-! Test for "oacc gang-private" attribute on gang-private variables
+! 'atomic' access of gang-private variable
 
 ! { dg-do run }
-! { dg-additional-options "-fdump-tree-oaccdevlow-details -w" }
+
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 
 program main
   integer :: w, arr(0:31)
 
   !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
     !$acc loop gang private(w)
-! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has gang partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
+    ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+    ! { dg-note {variable 'w' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 }
+    ! { dg-note {variable 'w' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } .-3 }
+    ! { dg-note {variable 'w' adjusted for OpenACC privatization level: 'gang'} "" { target { ! openacc_host_selected } } .-4 }
     do j = 0, 31
       w = 0
       !$acc loop seq
+      ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
       do i = 0, 31
         !$acc atomic update
         w = w + 1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-vector.f90 b/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-vector.f90
new file mode 100644
index 00000000000..e916837fc8f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-vector.f90
@@ -0,0 +1,42 @@
+! 'atomic' access of vector-private variable
+
+! { dg-do run }
+
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
+
+program main
+  integer :: w, arr(0:31)
+
+  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
+    !$acc loop gang worker vector private(w)
+    ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+    ! { dg-note {variable 'w' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 }
+    ! { dg-note {variable 'w' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } .-3 }
+    ! { dg-note {variable 'w' adjusted for OpenACC privatization level: 'vector'} "" { target { ! openacc_host_selected } } .-4 }
+    do j = 0, 31
+      w = 0
+      !$acc loop seq
+      ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+      do i = 0, 31
+        !$acc atomic update
+        w = w + 1
+        ! nvptx offloading: PR83812 "operation not supported on global/shared address space".
+        ! { dg-output "(\n|\r\n|\r)libgomp: cuStreamSynchronize error: operation not supported on global/shared address space(\n|\r\n|\r)$" { target openacc_nvidia_accel_selected } }
+        !   Scan for what we expect in the "XFAILed" case (without actually XFAILing).
+        ! { dg-shouldfail "XFAILed" { openacc_nvidia_accel_selected } }
+        !   ... instead of 'dg-xfail-run-if' so that 'dg-output' is evaluated at all.
+        ! { dg-final { if { [dg-process-target { xfail openacc_nvidia_accel_selected }] == "F" } { xfail "[testname-for-summary] really is XFAILed" } } }
+        !   ... so that we still get an XFAIL visible in the log.
+        !$acc end atomic
+      end do
+      arr(j) = w
+    end do
+  !$acc end parallel
+
+  if (any (arr .ne. 32)) stop 1
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90 b/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90
index 21d13754591..5fa157b1674 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90
@@ -1,17 +1,27 @@
-! Test for worker-private variables
+! 'atomic' access of worker-private variable
 
 ! { dg-do run }
-! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
+
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 
 program main
   integer :: w, arr(0:31)
 
   !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
     !$acc loop gang worker private(w)
-! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has worker partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
+    ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+    ! { dg-note {variable 'w' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 }
+    ! { dg-note {variable 'w' ought to be adjusted for OpenACC privatization level: 'worker'} "" { target *-*-* } .-3 }
+    ! { dg-note {variable 'w' adjusted for OpenACC privatization level: 'worker'} "TODO" { target { ! openacc_host_selected } xfail *-*-* } .-4 }
     do j = 0, 31
       w = 0
       !$acc loop seq
+      ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
       do i = 0, 31
         !$acc atomic update
         w = w + 1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90 b/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90
index 81043a22fd8..e40a82fff10 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90
@@ -2,9 +2,23 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 ! { dg-additional-options "-Wopenacc-parallelism" } for testing/documenting
 ! aspects of that functionality.
 
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_loop 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
+
 
 ! Test of gang-private variables declared on loop directive.
 
@@ -18,7 +32,9 @@ subroutine t1()
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
   ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } .-1 }
   ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-2 }
-  !$acc loop gang private(x)
+  !$acc loop gang private(x) ! { dg-line l_loop[incr c_loop] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+  ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
   do i = 1, 32
      x = i * 2;
      arr(i) = arr(i) + x
@@ -43,11 +59,14 @@ subroutine t2()
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
   ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-1 }
-  !$acc loop gang private(x)
+  !$acc loop gang private(x) ! { dg-line l_loop[incr c_loop] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+  ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
   do i = 0, 31
      x = i * 2;
 
-     !$acc loop worker
+     !$acc loop worker ! { dg-line l_loop[incr c_loop] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
      do j = 0, 31
         arr(i * 32 + j) = arr(i * 32 + j) + x
      end do
@@ -72,11 +91,14 @@ subroutine t3()
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
   ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } .-1 }
-  !$acc loop gang private(x)
+  !$acc loop gang private(x) ! { dg-line l_loop[incr c_loop] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+  ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
   do i = 0, 31
      x = i * 2;
 
-     !$acc loop vector
+     !$acc loop vector ! { dg-line l_loop[incr c_loop] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
      do j = 0, 31
         arr(i * 32 + j) = arr(i * 32 + j) + x
      end do
@@ -106,14 +128,26 @@ subroutine t4()
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
   ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } .-1 }
-  !$acc loop gang private(pt)
+  !$acc loop gang private(pt) ! { dg-line l_loop[incr c_loop] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+  ! { dg-note {variable 'pt' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+  ! But, with optimizations enabled, per the '*.ssa' dump ('gcc/tree-ssa.c:execute_update_addresses_taken'):
+  !     No longer having address taken: pt
+  ! However, 'pt' remains in the candidate set:
+  ! { dg-note {variable 'pt' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_loop$c_loop }
+  ! Now, for GCN offloading, 'adjust_private_decl' does the privatization change right away:
+  ! { dg-note {variable 'pt' adjusted for OpenACC privatization level: 'gang'} "" { target openacc_radeon_accel_selected } l_loop$c_loop }
+  ! For nvptx offloading however, we first mark up 'pt', and then later apply the privatization change -- or, with optimizations enabled, don't, because we then don't actually call 'expand_var_decl'.
+  ! { dg-note {variable 'pt' adjusted for OpenACC privatization level: 'gang'} "" { target { openacc_nvidia_accel_selected && { ! __OPTIMIZE__ } } } l_loop$c_loop }
+  ! { dg-bogus {note: variable 'pt' adjusted for OpenACC privatization level: 'gang'} "" { target { openacc_nvidia_accel_selected && __OPTIMIZE__ } } l_loop$c_loop }
   do i = 0, 31
      pt%x = i
      pt%y = i * 2
      pt%z = i * 4
      pt%attr(5) = i * 6
 
-     !$acc loop vector
+     !$acc loop vector ! { dg-line l_loop[incr c_loop] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
      do j = 0, 31
         arr(i * 32 + j) = arr(i * 32 + j) + pt%x + pt%y + pt%z + pt%attr(5);
      end do
@@ -136,16 +170,22 @@ subroutine t5()
   end do
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  !$acc loop gang
+  !$acc loop gang ! { dg-line l_loop[incr c_loop] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
   do i = 0, 31
-     !$acc loop worker
+     !$acc loop worker ! { dg-line l_loop[incr c_loop] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
      do j = 0, 31
-        !$acc loop vector private(x)
+        !$acc loop vector private(x) ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+        ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
         do k = 0, 31
            x = ieor(i, j * 3)
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
-        !$acc loop vector private(x)
+        !$acc loop vector private(x) ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+        ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
         do k = 0, 31
            x = ior(i, j * 5)
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
@@ -177,11 +217,18 @@ subroutine t6()
   end do
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  !$acc loop gang
+  !$acc loop gang ! { dg-line l_loop[incr c_loop] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
   do i = 0, 31
-     !$acc loop worker
+     !$acc loop worker ! { dg-line l_loop[incr c_loop] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
      do j = 0, 31
-        !$acc loop vector private(x, pt)
+        !$acc loop vector private(x, pt) ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+        ! { dg-bogus {note: variable 'x' in 'private' clause} "" { target *-*-* } l_loop$c_loop }
+        ! { dg-note {variable 'pt' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+        ! { dg-note {variable 'pt' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop }
+        ! { dg-note {variable 'pt' adjusted for OpenACC privatization level: 'vector'} "" { target { ! openacc_host_selected } } l_loop$c_loop }
         do k = 0, 31
            pt(1) = ieor(i, j * 3)
            pt(2) = ior(i, j * 5)
@@ -217,9 +264,13 @@ subroutine t7()
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
   ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-1 }
-  !$acc loop gang private(x)
+  !$acc loop gang private(x) ! { dg-line l_loop[incr c_loop] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+  ! { dg-bogus {note: variable 'x' in 'private' clause} "" { target *-*-* } l_loop$c_loop }
   do i = 0, 31
-     !$acc loop worker private(x)
+     !$acc loop worker private(x) ! { dg-line l_loop[incr c_loop] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
      do j = 0, 31
         x = ieor(i, j * 3)
         arr(i * 32 + j) = arr(i * 32 + j) + x
@@ -244,13 +295,17 @@ subroutine t8()
   end do
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  !$acc loop gang
+  !$acc loop gang ! { dg-line l_loop[incr c_loop] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
   do i = 0, 31
-     !$acc loop worker private(x)
+     !$acc loop worker private(x) ! { dg-line l_loop[incr c_loop] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
      do j = 0, 31
         x = ieor(i, j * 3)
 
-        !$acc loop vector
+        !$acc loop vector ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
@@ -280,23 +335,30 @@ subroutine t9()
   end do
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  !$acc loop gang
+  !$acc loop gang ! { dg-line l_loop[incr c_loop] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
   do i = 0, 31
-     !$acc loop worker private(x)
+     !$acc loop worker private(x) ! { dg-line l_loop[incr c_loop] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
      do j = 0, 31
         x = ieor(i, j * 3)
 
-        !$acc loop vector
+        !$acc loop vector ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
      end do
 
-     !$acc loop worker private(x)
+     !$acc loop worker private(x) ! { dg-line l_loop[incr c_loop] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
      do j = 0, 31
         x = ior(i, j * 5)
 
-        !$acc loop vector
+        !$acc loop vector ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
@@ -328,20 +390,25 @@ subroutine t10()
   end do
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  !$acc loop gang
+  !$acc loop gang ! { dg-line l_loop[incr c_loop] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
   do i = 0, 31
-     !$acc loop worker private(x)
+     !$acc loop worker private(x) ! { dg-line l_loop[incr c_loop] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
      do j = 0, 31
         x = ieor(i, j * 3)
 
-        !$acc loop vector
+        !$acc loop vector ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
 
         x = ior(i, j * 5)
 
-        !$acc loop vector
+        !$acc loop vector ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
@@ -375,21 +442,29 @@ subroutine t11()
   end do
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  !$acc loop gang
+  !$acc loop gang ! { dg-line l_loop[incr c_loop] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
   do i = 0, 31
-     !$acc loop worker private(x, p)
+     !$acc loop worker private(x, p) ! { dg-line l_loop[incr c_loop] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+     ! { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     ! { dg-note {variable 'x' ought to be adjusted for OpenACC privatization level: 'worker'} "" { target *-*-* } l_loop$c_loop }
+     ! { dg-note {variable 'x' adjusted for OpenACC privatization level: 'worker'} "TODO" { target { ! openacc_host_selected } xfail *-*-* } l_loop$c_loop }
+     ! { dg-note {variable 'p' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
      do j = 0, 31
         p => x
         x = ieor(i, j * 3)
 
-        !$acc loop vector
+        !$acc loop vector ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
 
         p = ior(i, j * 5)
 
-        !$acc loop vector
+        !$acc loop vector ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
@@ -426,19 +501,24 @@ subroutine t12()
   end do
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  !$acc loop gang
+  !$acc loop gang ! { dg-line l_loop[incr c_loop] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
   do i = 0, 31
-     !$acc loop worker private(pt)
+     !$acc loop worker private(pt) ! { dg-line l_loop[incr c_loop] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+     ! { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
      do j = 0, 31
         pt%x = ieor(i, j * 3)
         pt%y = ior(i, j * 5)
         
-        !$acc loop vector
+        !$acc loop vector ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt%x * k
         end do
 
-        !$acc loop vector
+        !$acc loop vector ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt%y * k
         end do
@@ -470,19 +550,26 @@ subroutine t13()
   end do
 
   !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
-  !$acc loop gang
+  !$acc loop gang ! { dg-line l_loop[incr c_loop] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
   do i = 0, 31
-     !$acc loop worker private(pt)
+     !$acc loop worker private(pt) ! { dg-line l_loop[incr c_loop] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+     ! { dg-note {variable 'pt' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+     ! { dg-note {variable 'pt' ought to be adjusted for OpenACC privatization level: 'worker'} "" { target *-*-* } l_loop$c_loop }
+     ! { dg-note {variable 'pt' adjusted for OpenACC privatization level: 'worker'} "TODO" { target { ! openacc_host_selected } xfail *-*-* } l_loop$c_loop } */
      do j = 0, 31
         pt(1) = ieor(i, j * 3)
         pt(2) = ior(i, j * 5)
         
-        !$acc loop vector
+        !$acc loop vector ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt(1) * k
         end do
 
-        !$acc loop vector
+        !$acc loop vector ! { dg-line l_loop[incr c_loop] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt(2) * k
         end do
@@ -518,13 +605,17 @@ subroutine t14()
   !$acc parallel private(x) copy(arr) num_gangs(n) num_workers(8) vector_length(32)
   ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } .-1 }
   ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-2 }
-    !$acc loop gang(static:1)
+    !$acc loop gang(static:1) ! { dg-line l_loop[incr c_loop] }
+    ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     do i = 1, n
       x = i * 2;
     end do
 
-   !$acc loop gang(static:1)
+   !$acc loop gang(static:1) ! { dg-line l_loop[incr c_loop] }
+    ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     do i = 1, n
+       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_loop$c_loop }
+       !TODO Unhandled 'CONST_DECL' instance for constant argument in 'acc_on_device' call.
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) x = i * 2
       arr(i) = arr(i) + x
     end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90
index 907f0245f93..ba638da8628 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90
@@ -7,6 +7,20 @@
 ! XFAIL the "UNRESOLVED: [...] compilation failed to produce executable", or
 ! get rid of it, unfortunately.
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".  */
+
 program main
   implicit none (type, external)
   integer :: j
@@ -28,10 +42,18 @@ contains
     integer :: i, nn
     integer :: array(nn)
 
-    !$acc parallel copyout(array)
+    !$acc parallel copyout(array) ! { dg-line l_compute[incr c_compute] }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'atmp\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'shadow_loopvar\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'offset\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'S\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'test\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     array = [(-i, i = 1, nn)]
-    !$acc loop gang private(array)
-    ! { dg-message {sorry, unimplemented: target cannot support alloca} PR65181 { target openacc_nvidia_accel_selected } .-1 }
+    !$acc loop gang private(array) ! { dg-line l_loop[incr c_loop] }
+    ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'array' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-message {sorry, unimplemented: target cannot support alloca} PR65181 { target openacc_nvidia_accel_selected } l_loop$c_loop }
     do i = 1, 10
       array(i) = i
     end do
@@ -42,10 +64,26 @@ contains
     integer :: i
     integer :: array(:)
 
-    !$acc parallel copyout(array)
+    !$acc parallel copyout(array) ! { dg-line l_compute[incr c_compute] }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'atmp\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'shadow_loopvar\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'offset\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'S\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'test\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'parm\.[0-9]+' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'parm\.[0-9]+' adjusted for OpenACC privatization level: 'gang'} "" { target { ! { openacc_host_selected || openacc_nvidia_accel_selected } } } l_compute$c_compute }
+    ! { dg-note {variable 'A\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'A\.[0-9]+' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'A\.[0-9]+' adjusted for OpenACC privatization level: 'gang'} "" { target { ! { openacc_host_selected || openacc_nvidia_accel_selected } } } l_compute$c_compute }
     array = [(-2*i, i = 1, size(array))]
-    !$acc loop gang private(array)
-    ! { dg-message {sorry, unimplemented: target cannot support alloca} PR65181 { target openacc_nvidia_accel_selected } .-1 }
+    !$acc loop gang private(array) ! { dg-line l_loop[incr c_loop] }
+    ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'array\.[0-9]+' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'array\.[0-9]+' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'array\.[0-9]+' adjusted for OpenACC privatization level: 'gang'} "" { target { ! { openacc_host_selected || openacc_nvidia_accel_selected } } } l_loop$c_loop }
+    ! { dg-message {sorry, unimplemented: target cannot support alloca} PR65181 { target openacc_nvidia_accel_selected } l_loop$c_loop }
     do i = 1, 10
       array(i) = 9*i
     end do
@@ -56,10 +94,17 @@ contains
     integer :: i
     character(len=*) :: str
 
-    !$acc parallel copyout(str)
+    !$acc parallel copyout(str) ! { dg-line l_compute[incr c_compute] }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     str = "abcdefghij"
-    !$acc loop gang private(str)
-    ! { dg-message {sorry, unimplemented: target cannot support alloca} PR65181 { target openacc_nvidia_accel_selected } .-1 }
+    !$acc loop gang private(str) ! { dg-line l_loop[incr c_loop] }
+    ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'str' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'char\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'char\.[0-9]+' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'char\.[0-9]+' adjusted for OpenACC privatization level: 'gang'} "" { target { ! { openacc_host_selected || openacc_nvidia_accel_selected } } } l_loop$c_loop }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-message {sorry, unimplemented: target cannot support alloca} PR65181 { target openacc_nvidia_accel_selected } l_loop$c_loop }
     do i = 1, 10
       str(i:i) = achar(ichar('A') + i)
     end do
@@ -88,7 +133,13 @@ contains
 
     !$acc parallel copyout(scalar)
     scalar = "abcdefghi-12345"
-    !$acc loop gang private(scalar)
+    !$acc loop gang private(scalar) ! { dg-line l_loop[incr c_loop] }
+    ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'scalar' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'char\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'char\.[0-9]+' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'char\.[0-9]+' adjusted for OpenACC privatization level: 'gang'} "" { target { ! { openacc_host_selected || openacc_nvidia_accel_selected } } } l_loop$c_loop }
+    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     do i = 1, 15
       scalar(i:i) = achar(ichar('A') + i)
     end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
index c34de3a4963..75660bb39b5 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
@@ -2,6 +2,12 @@
 ! { dg-do run }
 ! { dg-additional-options "-cpp" }
 
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! for testing/documenting aspects of that functionality.
+
 ! { dg-additional-options "-Wopenacc-parallelism" } for testing/documenting
 ! aspects of that functionality.
 !TODO { dg-additional-options "-fno-inline" } for stable results regarding OpenACC 'routine'.
@@ -20,6 +26,7 @@ program main
 
   !$acc parallel copy (a)
   !$acc loop seq
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
     do i = 1, N
       call seq (a)
     end do
@@ -31,6 +38,7 @@ program main
 
   !$acc parallel copy (a)
   !$acc loop seq
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
     do i = 1, N 
       call gang (a)
     end do
@@ -46,6 +54,7 @@ program main
 
   !$acc parallel copy (b)
   !$acc loop seq
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
     do i = 1, N
       call worker (b)
     end do
@@ -61,6 +70,7 @@ program main
 
   !$acc parallel copy (a)
   !$acc loop seq
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
     do i = 1, N
       call vector (a)
     end do
@@ -78,6 +88,7 @@ subroutine vector (a)
   integer :: i
 
   !$acc loop vector
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 1, N
     a(i) = a(i) - a(i) 
   end do
@@ -90,8 +101,10 @@ subroutine worker (b)
   integer :: i, j
 
   !$acc loop worker
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 1, N
   !$acc loop vector
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
     do j = 1, M
       b(j + ((i - 1) * M)) = b(j + ((i - 1) * M)) + 1
     end do
@@ -107,6 +120,7 @@ subroutine gang (a)
   integer :: i
 
   !$acc loop gang
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   do i = 1, N
     a(i) = a(i) - i 
   end do
-- 
2.30.2



* [r12-989 Regression] FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 98) on Linux/x86_64
@ 2021-05-22  1:40         ` sunil.k.pandey
  2021-05-22  8:41           ` Thomas Schwinge
  0 siblings, 1 reply; 24+ messages in thread
From: sunil.k.pandey @ 2021-05-22  1:40 UTC (permalink / raw)
  To: gcc-patches, gcc-regression, thomas

On Linux/x86_64,

325aa13996bafce0c4927876c315d1fa706d9881 is the first bad commit
commit 325aa13996bafce0c4927876c315d1fa706d9881
Author: Thomas Schwinge <thomas@codesourcery.com>
Date:   Fri May 21 08:51:47 2021 +0200

    [OpenACC privatization] Reject 'static', 'external' in blocks [PR90115]

caused

FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O0   (test for warnings, line 134)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O0   (test for warnings, line 98)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O1   (test for warnings, line 134)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O1   (test for warnings, line 98)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2   (test for warnings, line 134)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2   (test for warnings, line 98)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions   (test for warnings, line 134)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions   (test for warnings, line 98)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O3 -g   (test for warnings, line 134)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O3 -g   (test for warnings, line 98)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 134)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 98)

with GCC configured with

../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-989/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check RUNTESTFLAGS="fortran.exp=libgomp.oacc-fortran/privatized-ref-2.f90 --target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check RUNTESTFLAGS="fortran.exp=libgomp.oacc-fortran/privatized-ref-2.f90 --target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me at skpgkp2 at gmail dot com)


* Re: [r12-989 Regression] FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 98) on Linux/x86_64
  2021-05-22  1:40         ` [r12-989 Regression] FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -Os (test for warnings, line 98) on Linux/x86_64 sunil.k.pandey
@ 2021-05-22  8:41           ` Thomas Schwinge
  2021-05-25  1:03             ` Sunil Pandey
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Schwinge @ 2021-05-22  8:41 UTC (permalink / raw)
  To: skpgkp2, gcc-patches; +Cc: gcc-regression

[-- Attachment #1: Type: text/plain, Size: 3429 bytes --]

Hi!

First: many thanks for running this automated regression testing
machinery!

On 2021-05-21T18:40:55-0700, "sunil.k.pandey via Gcc-patches" <gcc-patches@gcc.gnu.org> wrote:
> On Linux/x86_64,
>
> 325aa13996bafce0c4927876c315d1fa706d9881 is the first bad commit
> commit 325aa13996bafce0c4927876c315d1fa706d9881
> Author: Thomas Schwinge <thomas@codesourcery.com>
> Date:   Fri May 21 08:51:47 2021 +0200
>
>     [OpenACC privatization] Reject 'static', 'external' in blocks [PR90115]

Actually not that one, but instead one commit before is the culprit:

    commit 11b8286a83289f5b54e813f14ff56d730c3f3185
    Author: Thomas Schwinge <thomas@codesourcery.com>
    Date:   Thu May 20 16:11:37 2021 +0200

        [OpenACC privatization] Largely extend diagnostics and corresponding testsuite coverage [PR90115]

(Probably your testing aggregates commits that appear in some period of
time?  Maybe reflect that in the reporting emails?)

> caused
>
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O0   (test for warnings, line 134)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O0   (test for warnings, line 98)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O1   (test for warnings, line 134)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O1   (test for warnings, line 98)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2   (test for warnings, line 134)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2   (test for warnings, line 98)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions   (test for warnings, line 134)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions   (test for warnings, line 98)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O3 -g   (test for warnings, line 134)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O3 -g   (test for warnings, line 98)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 134)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 98)

Sorry, and ACK, and I'm confused why I didn't see that in my own testing.
I've now pushed "[OpenACC privatization] Prune uninteresting/varying
diagnostics in 'libgomp.oacc-fortran/privatized-ref-2.f90'" to master
branch in commit 3050a1a18276d7cdd8946e34cc1344e30efb7030, see attached.


Regards
 Thomas


-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

[-- Attachment #2: 0001-OpenACC-privatization-Prune-uninteresting-varying-di.patch --]
[-- Type: text/x-diff, Size: 5625 bytes --]

From 3050a1a18276d7cdd8946e34cc1344e30efb7030 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Sat, 22 May 2021 10:28:34 +0200
Subject: [PATCH] [OpenACC privatization] Prune uninteresting/varying
 diagnostics in 'libgomp.oacc-fortran/privatized-ref-2.f90'

Minor fix-up for my recent commit 11b8286a83289f5b54e813f14ff56d730c3f3185
"[OpenACC privatization] Largely extend diagnostics and corresponding testsuite
coverage [PR90115]".

	libgomp/
	PR testsuite/90115
	* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Prune
	uninteresting/varying diagnostics.

Reported-by: Sunil K Pandey <skpandey@sc.intel.com>
---
 .../testsuite/libgomp.oacc-fortran/privatized-ref-2.f90  | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90
index 60803e48cbe..baaee02b82c 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90
@@ -12,6 +12,8 @@
 ! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
 ! for testing/documenting aspects of that functionality.
+! Prune a few: uninteresting, and varying depending on GCC configuration (data types):
+! { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} }
 
 ! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
 ! passed to 'incr' may be unset, and in that case, it will be set to [...]",
@@ -43,7 +45,6 @@ contains
     integer :: array(nn)
 
     !$acc parallel copyout(array) ! { dg-line l_compute[incr c_compute] }
-    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     ! { dg-note {variable 'atmp\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     ! { dg-note {variable 'shadow_loopvar\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     ! { dg-note {variable 'offset\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
@@ -65,7 +66,6 @@ contains
     integer :: array(:)
 
     !$acc parallel copyout(array) ! { dg-line l_compute[incr c_compute] }
-    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     ! { dg-note {variable 'atmp\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     ! { dg-note {variable 'shadow_loopvar\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     ! { dg-note {variable 'offset\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
@@ -92,8 +92,7 @@ contains
     integer :: i
     character(len=*) :: str
 
-    !$acc parallel copyout(str) ! { dg-line l_compute[incr c_compute] }
-    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    !$acc parallel copyout(str)
     str = "abcdefghij"
     !$acc loop gang private(str) ! { dg-line l_loop[incr c_loop] }
     ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
@@ -101,7 +100,6 @@ contains
     ! { dg-note {variable 'char\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'char\.[0-9]+' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'char\.[0-9]+' adjusted for OpenACC privatization level: 'gang'} "" { target { ! { openacc_host_selected || openacc_nvidia_accel_selected } } } l_loop$c_loop }
-    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-message {sorry, unimplemented: target cannot support alloca} PR65181 { target openacc_nvidia_accel_selected } l_loop$c_loop }
     do i = 1, 10
       str(i:i) = achar(ichar('A') + i)
@@ -137,7 +135,6 @@ contains
     ! { dg-note {variable 'char\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'char\.[0-9]+' ought to be adjusted for OpenACC privatization level: 'gang'} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'char\.[0-9]+' adjusted for OpenACC privatization level: 'gang'} "" { target { ! { openacc_host_selected || openacc_nvidia_accel_selected } } } l_loop$c_loop }
-    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     do i = 1, 15
       scalar(i:i) = achar(ichar('A') + i)
     end do
-- 
2.25.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [r12-989 Regression] FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -Os (test for warnings, line 98) on Linux/x86_64
  2021-05-22  8:41           ` Thomas Schwinge
@ 2021-05-25  1:03             ` Sunil Pandey
  0 siblings, 0 replies; 24+ messages in thread
From: Sunil Pandey @ 2021-05-25  1:03 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: GCC Patches, gcc-regression, Hongjiu Lu

Hi Thomas,

I reproduced this issue manually, and it turns out to be a special case.

The script takes its input from https://gcc.gnu.org/pipermail/gcc-regression/
and matches the exact error message during triage. This failure was reported
on gcc-regression:
https://gcc.gnu.org/pipermail/gcc-regression/2021-May/074806.html

It triaged 325aa13996bafce0c4927876c315d1fa706d9881 rather than
11b8286a83289f5b54e813f14ff56d730c3f3185 because commit
325aa13996bafce0c4927876c315d1fa706d9881 is the first commit whose failure
message exactly matches the one reported on gcc-regression. Note the
difference in line numbers.

Error messages produced by commit 325aa13996bafce0c4927876c315d1fa706d9881:
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 98)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 134)

vs.

Error messages produced by commit 11b8286a83289f5b54e813f14ff56d730c3f3185:
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 100)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 136)
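
The exact-match behaviour described above can be illustrated with a small
sketch. This is not the actual triage script: the function name
`first_exact_match`, the helper `fail_lines`, and the data layout are
assumptions made purely for illustration. Only the commit hashes and the line
numbers are taken from the messages quoted above.

```python
# Hypothetical sketch of exact-match triage; not the real regression script.
# Commits are listed oldest first, each with the FAIL lines it produces.
PREFIX = ("FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 "
          "-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -Os   ")


def fail_lines(*line_numbers):
    """Build the set of FAIL messages for the given expected-warning lines."""
    return {f"{PREFIX}(test for warnings, line {n})" for n in line_numbers}


HISTORY = [
    ("11b8286a83289f5b54e813f14ff56d730c3f3185", fail_lines(100, 136)),
    ("325aa13996bafce0c4927876c315d1fa706d9881", fail_lines(98, 134)),
]


def first_exact_match(history, reported):
    """Return the oldest commit whose output contains every reported line verbatim."""
    for commit, produced in history:
        if reported <= produced:  # every reported message must match exactly
            return commit
    return None


reported = fail_lines(98, 134)  # the messages as posted to gcc-regression
culprit = first_exact_match(HISTORY, reported)
print(culprit)  # prints 325aa13996bafce0c4927876c315d1fa706d9881
```

Under these assumptions, a comparison that ignored the line number would
presumably stop at 11b8286a83289f5b54e813f14ff56d730c3f3185 instead, since the
same test already fails there with different line numbers.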

Thank you so much,
Sunil Pandey


On Sat, May 22, 2021 at 1:41 AM Thomas Schwinge <thomas@codesourcery.com>
wrote:

> Hi!
>
> First: many thanks for running this automated regression testing
> machinery!
>
> On 2021-05-21T18:40:55-0700, "sunil.k.pandey via Gcc-patches" <
> gcc-patches@gcc.gnu.org> wrote:
> > On Linux/x86_64,
> >
> > 325aa13996bafce0c4927876c315d1fa706d9881 is the first bad commit
> > commit 325aa13996bafce0c4927876c315d1fa706d9881
> > Author: Thomas Schwinge <thomas@codesourcery.com>
> > Date:   Fri May 21 08:51:47 2021 +0200
> >
> >     [OpenACC privatization] Reject 'static', 'external' in blocks
> [PR90115]
>
> Actually not that one, but instead one commit before is the culprit:
>
>     commit 11b8286a83289f5b54e813f14ff56d730c3f3185
>     Author: Thomas Schwinge <thomas@codesourcery.com>
>     Date:   Thu May 20 16:11:37 2021 +0200
>
>         [OpenACC privatization] Largely extend diagnostics and
> corresponding testsuite coverage [PR90115]
>
> (Probably your testing aggregates commits that appear in some period of
> time?  Maybe reflect that in the reporting emails?)
>
> > caused
> >
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O0   (test for warnings, line 134)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O0   (test for warnings, line 98)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O1   (test for warnings, line 134)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O1   (test for warnings, line 98)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O2   (test for warnings, line 134)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O2   (test for warnings, line 98)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions   (test for
> warnings, line 134)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions   (test for
> warnings, line 98)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O3 -g   (test for warnings, line 134)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O3 -g   (test for warnings, line 98)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 134)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 98)
>
> Sorry, and ACK, and I'm confused why I didn't see that in my own testing.
> I've now pushed "[OpenACC privatization] Prune uninteresting/varying
> diagnostics in 'libgomp.oacc-fortran/privatized-ref-2.f90'" to master
> branch in commit 3050a1a18276d7cdd8946e34cc1344e30efb7030, see attached.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Test '-fopt-info-omp-all' in 'libgomp.oacc-*/kernels-private-vars-*'
  2021-05-21 19:29       ` Thomas Schwinge
  2021-05-22  1:40         ` [r12-989 Regression] FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -Os (test for warnings, line 98) on Linux/x86_64 sunil.k.pandey
@ 2022-03-04 13:51         ` Thomas Schwinge
  2022-03-10 11:10         ` Enhance further testcases to verify handling of OpenACC privatization level [PR90115] Thomas Schwinge
                           ` (3 subsequent siblings)
  5 siblings, 0 replies; 24+ messages in thread
From: Thomas Schwinge @ 2022-03-04 13:51 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 464 bytes --]

Hi!

Pushed to master branch commit 07395f19dff610f03d1b1d30c8bd640f610c45dc
"Test '-fopt-info-omp-all' in 'libgomp.oacc-*/kernels-private-vars-*'",
see attached.


Regards
 Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Attachment #2: 0001-Test-fopt-info-omp-all-in-libgomp.oacc-kernels-priva.patch --]
[-- Type: text/x-diff, Size: 148783 bytes --]

From 07395f19dff610f03d1b1d30c8bd640f610c45dc Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Wed, 16 Feb 2022 15:44:27 +0100
Subject: [PATCH] Test '-fopt-info-omp-all' in
 'libgomp.oacc-*/kernels-private-vars-*'

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c:
	Test '-fopt-info-omp-all'.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90:
	Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90:
	Likewise.
---
 .../kernels-private-vars-local-worker-1.c     | 57 +++++++++++--------
 .../kernels-private-vars-local-worker-2.c     | 49 +++++++++-------
 .../kernels-private-vars-local-worker-3.c     | 49 +++++++++-------
 .../kernels-private-vars-local-worker-4.c     | 51 ++++++++++-------
 .../kernels-private-vars-local-worker-5.c     | 49 +++++++++-------
 .../kernels-private-vars-loop-gang-1.c        | 31 ++++++----
 .../kernels-private-vars-loop-gang-2.c        | 39 ++++++++-----
 .../kernels-private-vars-loop-gang-3.c        | 39 ++++++++-----
 .../kernels-private-vars-loop-gang-4.c        | 41 +++++++------
 .../kernels-private-vars-loop-gang-5.c        | 39 ++++++++-----
 .../kernels-private-vars-loop-gang-6.c        | 39 ++++++++-----
 .../kernels-private-vars-loop-vector-1.c      | 49 +++++++++-------
 .../kernels-private-vars-loop-vector-2.c      | 43 ++++++++------
 .../kernels-private-vars-loop-worker-1.c      | 37 +++++++-----
 .../kernels-private-vars-loop-worker-2.c      | 45 +++++++++------
 .../kernels-private-vars-loop-worker-3.c      | 57 +++++++++++--------
 .../kernels-private-vars-loop-worker-4.c      | 49 +++++++++-------
 .../kernels-private-vars-loop-worker-5.c      | 51 ++++++++++-------
 .../kernels-private-vars-loop-worker-6.c      | 49 +++++++++-------
 .../kernels-private-vars-loop-worker-7.c      | 49 +++++++++-------
 .../kernels-private-vars-loop-gang-1.f90      | 23 +++++---
 .../kernels-private-vars-loop-gang-2.f90      | 27 ++++++---
 .../kernels-private-vars-loop-gang-3.f90      | 27 ++++++---
 .../kernels-private-vars-loop-gang-6.f90      | 27 ++++++---
 .../kernels-private-vars-loop-vector-1.f90    | 37 +++++++-----
 .../kernels-private-vars-loop-vector-2.f90    | 31 ++++++----
 .../kernels-private-vars-loop-worker-1.f90    | 27 ++++++---
 .../kernels-private-vars-loop-worker-2.f90    | 31 ++++++----
 .../kernels-private-vars-loop-worker-3.f90    | 41 +++++++------
 .../kernels-private-vars-loop-worker-4.f90    | 35 +++++++-----
 .../kernels-private-vars-loop-worker-5.f90    | 37 +++++++-----
 .../kernels-private-vars-loop-worker-6.f90    | 35 +++++++-----
 .../kernels-private-vars-loop-worker-7.f90    | 35 +++++++-----
 33 files changed, 811 insertions(+), 514 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c
index f28513dd208..acbeb65273f 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of worker-private variables declared in a local scope, broadcasting
    to vector-partitioned mode.  Back-to-back worker loops.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 int
 main (int argc, char* argv[])
 {
@@ -17,46 +25,47 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
-	#pragma acc loop worker(num:32)
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-	/* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+	#pragma acc loop worker(num:32) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    int x = i ^ j * 3;
 
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
 
-	#pragma acc loop worker(num:32)
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-	/* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+	#pragma acc loop worker(num:32) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    int x = i | j * 5;
 	    
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     for (int j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c
index 21f25114d68..2558a68eb94 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of worker-private variables declared in a local scope, broadcasting
    to vector-partitioned mode.  Successive vector loops.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 int
 main (int argc, char* argv[])
 {
@@ -17,38 +25,39 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker(num:32)
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-	/* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+        #pragma acc loop worker(num:32) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'x' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    int x = i ^ j * 3;
 
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	    
 	    x = i | j * 5;
 	    
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     for (int j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c
index 8b4cde87ce9..b2a208c163f 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of worker-private variables declared in a local scope, broadcasting
    to vector-partitioned mode.  Aggregate worker variable.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 typedef struct
 {
   int x, y;
@@ -22,19 +30,19 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker(num:32)
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-	/* { dg-note {variable 'pt' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+        #pragma acc loop worker(num:32) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'pt' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -43,18 +51,19 @@ main (int argc, char* argv[])
 	    pt.x = i ^ j * 3;
 	    pt.y = i | j * 5;
 
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.x * k;
 	    
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.y * k;
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     for (int j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c
index a658d167236..46c395620fe 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of worker-private variables declared in a local scope, broadcasting
    to vector-partitioned mode.  Addressable worker variable.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 typedef struct
 {
   int x, y;
@@ -22,20 +30,20 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker(num:32)
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-	/* { dg-note {variable 'pt' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-3 } */
-	/* { dg-note {variable 'ptp' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
+        #pragma acc loop worker(num:32) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'pt' declared in block is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'ptp' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -45,20 +53,21 @@ main (int argc, char* argv[])
 	    
 	    pt.x = i ^ j * 3;
 
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += ptp->x * k;
 
 	    ptp->y = i | j * 5;
 	    
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.y * k;
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     for (int j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c
index b82b9bf210a..4b5a15e6ad4 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of worker-private variables declared in a local scope, broadcasting
    to vector-partitioned mode.  Array worker variable.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 int
 main (int argc, char* argv[])
 {
@@ -17,19 +25,19 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker(num:32)
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-	/* { dg-note {variable 'pt' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+        #pragma acc loop worker(num:32) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'pt' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -37,20 +45,21 @@ main (int argc, char* argv[])
 	    
 	    pt[0] = i ^ j * 3;
 
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt[0] * k;
 
 	    pt[1] = i | j * 5;
 	    
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt[1] * k;
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     for (int j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c
index 38d89c726ca..4a824941427 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c
@@ -1,12 +1,20 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
+/* Test of gang-private variables declared on loop directive.  */
 
-#include <assert.h>
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
 
-/* Test of gang-private variables declared on loop directive.  */
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
 
 int
 main (int argc, char* argv[])
@@ -16,17 +24,18 @@ main (int argc, char* argv[])
   for (i = 0; i < 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
   {
-    #pragma acc loop gang(num:32) private(x)
-    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+    #pragma acc loop gang(num:32) private(x) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
 	x = i * 2;
 	arr[i] += x;
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     assert (arr[i] == i * 3);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c
index 62dd12fb790..039053f3c86 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of gang-private variables declared on loop directive, with broadcasting
    to partitioned workers.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 int
 main (int argc, char* argv[])
 {
@@ -17,22 +25,23 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
   {
-    #pragma acc loop gang(num:32) private(x)
-    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    #pragma acc loop gang(num:32) private(x) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
 	x = i * 2;
 
-	#pragma acc loop worker(num:32)
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	#pragma acc loop worker(num:32) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += x;
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32 * 32; i++)
     assert (arr[i] == i + (i / 32) * 2);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c
index c22c3b43e31..2b89659a007 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of gang-private variables declared on loop directive, with broadcasting
    to partitioned vectors.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 int
 main (int argc, char* argv[])
 {
@@ -17,22 +25,23 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
   {
-    #pragma acc loop gang(num:32) private(x)
-    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    #pragma acc loop gang(num:32) private(x) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
 	x = i * 2;
 
-	#pragma acc loop vector(length:32)
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	#pragma acc loop vector(length:32) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += x;
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32 * 32; i++)
     assert (arr[i] == i + (i / 32) * 2);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c
index 27a8e804129..70760705925 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of gang-private addressable variable declared on loop directive, with
    broadcasting to partitioned workers.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 int
 main (int argc, char* argv[])
 {
@@ -17,27 +25,28 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
   {
-    #pragma acc loop gang(num:32) private(x)
-    /* { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-1 } */
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-    /* { dg-note {variable 'p' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
-    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
+    #pragma acc loop gang(num:32) private(x) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'p' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
         int *p = &x;
 
 	x = i * 2;
 
-	#pragma acc loop worker(num:32)
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	#pragma acc loop worker(num:32) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += x;
 
 	(*p)--;
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32 * 32; i++)
     assert (arr[i] == i + (i / 32) * 2);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c
index f570c222940..edf0e24af8b 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of gang-private array variable declared on loop directive, with
    broadcasting to partitioned workers.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 int
 main (int argc, char* argv[])
 {
@@ -17,23 +25,24 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
   {
-    #pragma acc loop gang(num:32) private(x)
-    /* { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-1 } */
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    #pragma acc loop gang(num:32) private(x) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
         for (int j = 0; j < 8; j++)
 	  x[j] = j * 2;
 
-	#pragma acc loop worker(num:32)
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	#pragma acc loop worker(num:32) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += x[j % 8];
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32 * 32; i++)
     assert (arr[i] == i + (i % 8) * 2);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c
index 5b776f18f72..a2df33b767d 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of gang-private aggregate variable declared on loop directive, with
    broadcasting to partitioned workers.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 typedef struct {
   int x, y, z;
   int attr[13];
@@ -23,12 +31,12 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
   {
-    #pragma acc loop gang private(pt)
-    /* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+    #pragma acc loop gang private(pt) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
         pt.x = i;
@@ -36,12 +44,13 @@ main (int argc, char* argv[])
 	pt.z = i * 4;
 	pt.attr[5] = i * 6;
 
-	#pragma acc loop worker
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	#pragma acc loop worker /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (int j = 0; j < 32; j++)
 	  arr[i * 32 + j] += pt.x + pt.y + pt.z + pt.attr[5];
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32 * 32; i++)
     assert (arr[i] == i + (i / 32) * 13);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c
index 696da0f204f..51c1de53414 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c
@@ -1,12 +1,20 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
+/* Test of vector-private variables declared on loop directive.  */
 
-#include <assert.h>
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
 
-/* Test of vector-private variables declared on loop directive.  */
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
 
 int
 main (int argc, char* argv[])
@@ -16,34 +24,34 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker(num:32)
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+        #pragma acc loop worker(num:32) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 
-	    #pragma acc loop vector(length:32) private(x)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	    #pragma acc loop vector(length:32) private(x) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
+	    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      {
 		x = i ^ j * 3;
 		arr[i * 1024 + j * 32 + k] += x * k;
 	      }
 
-	    #pragma acc loop vector(length:32) private(x)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	    #pragma acc loop vector(length:32) private(x) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
+	    /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      {
 		x = i | j * 5;
@@ -52,6 +60,7 @@ main (int argc, char* argv[])
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     for (int j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c
index 2e3b635b023..cb90eaab99d 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c
@@ -1,12 +1,20 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
+/* Test of vector-private variables declared on loop directive. Array type.  */
 
-#include <assert.h>
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
 
-/* Test of vector-private variables declared on loop directive. Array type.  */
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
 
 int
 main (int argc, char* argv[])
@@ -16,25 +24,25 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker(num:32)
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+        #pragma acc loop worker(num:32) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 
-	    #pragma acc loop vector(length:32) private(pt)
-	    /* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+	    #pragma acc loop vector(length:32) private(pt) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      {
 	        pt[0] = i ^ j * 3;
@@ -45,6 +53,7 @@ main (int argc, char* argv[])
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     for (int j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c
index 1aedc7964e2..54e1c93714b 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c
@@ -1,12 +1,20 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
+/* Test of worker-private variables declared on a loop directive.  */
 
-#include <assert.h>
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
 
-/* Test of worker-private variables declared on a loop directive.  */
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
 
 int
 main (int argc, char* argv[])
@@ -16,18 +24,18 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker(num:32) private(x)
-	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
+        #pragma acc loop worker(num:32) private(x) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    x = i ^ j * 3;
@@ -38,6 +46,7 @@ main (int argc, char* argv[])
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32 * 32; i++)
     assert (arr[i] == i + ((i / 32) ^ (i % 32) * 3));
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c
index 3bf62aae174..80ac99013d6 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of worker-private variables declared on a loop directive, broadcasting
    to vector-partitioned mode.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 int
 main (int argc, char* argv[])
 {
@@ -17,31 +25,32 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker(num:32) private(x)
-	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+        #pragma acc loop worker(num:32) private(x) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    x = i ^ j * 3;
 
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     for (int j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c
index 8de551635ea..a05ac609123 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of worker-private variables declared on a loop directive, broadcasting
    to vector-partitioned mode.  Back-to-back worker loops.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 int
 main (int argc, char* argv[])
 {
@@ -17,46 +25,47 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker(num:32) private(x)
-	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+        #pragma acc loop worker(num:32) private(x) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    x = i ^ j * 3;
 
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
 
-	#pragma acc loop worker(num:32) private(x)
-	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+	#pragma acc loop worker(num:32) private(x) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    x = i | j * 5;
 	    
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     for (int j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c
index 425fe6321fa..d46bb948626 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of worker-private variables declared on a loop directive, broadcasting
    to vector-partitioned mode.  Successive vector loops.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 int
 main (int argc, char* argv[])
 {
@@ -17,38 +25,39 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker(num:32) private(x)
-	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+        #pragma acc loop worker(num:32) private(x) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    x = i ^ j * 3;
 
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	    
 	    x = i | j * 5;
 	    
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     for (int j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c
index c027c024b9c..644c6175863 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of worker-private variables declared on a loop directive, broadcasting
    to vector-partitioned mode.  Addressable worker variable.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 int
 main (int argc, char* argv[])
 {
@@ -17,20 +25,20 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker(num:32) private(x)
-	/* { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
-	/* { dg-note {variable 'p' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-4 } */
+        #pragma acc loop worker(num:32) private(x) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'p' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -38,20 +46,21 @@ main (int argc, char* argv[])
 	    
 	    x = i ^ j * 3;
 
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	    
 	    *p = i | j * 5;
 	    
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += x * k;
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     for (int j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c
index 4f17566f8f9..182a12a1b3f 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of worker-private variables declared on a loop directive, broadcasting
    to vector-partitioned mode.  Aggregate worker variable.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 typedef struct
 {
   int x, y;
@@ -23,19 +31,19 @@ main (int argc, char* argv[])
   for (i = 0; i < 32 * 32 * 32; i++)
     arr[i] = i;
 
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
-        #pragma acc loop worker(num:32) private(pt)
-	/* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+        #pragma acc loop worker(num:32) private(pt) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
@@ -43,18 +51,19 @@ main (int argc, char* argv[])
 	    pt.x = i ^ j * 3;
 	    pt.y = i | j * 5;
 
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.x * k;
 	    
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt.y * k;
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     for (int j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c
index 12b4c548156..bdfbb59c5c5 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c
@@ -1,14 +1,22 @@
-/* { dg-additional-options "-fopt-info-note-omp" }
-   { dg-additional-options "--param=openacc-privatization=noisy" }
-   { dg-additional-options "-foffload=-fopt-info-note-omp" }
-   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
-
-#include <assert.h>
-
 /* Test of worker-private variables declared on loop directive, broadcasting
    to vector-partitioned mode.  Array worker variable.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include <assert.h>
+
 int
 main (int argc, char* argv[])
 {
@@ -20,40 +28,41 @@ main (int argc, char* argv[])
 
   /* "pt" is treated as "present_or_copy" on the kernels directive because it
      is an array variable.  */
-  #pragma acc kernels copy(arr)
-  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+  #pragma acc kernels copy(arr) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     int j;
 
-    #pragma acc loop gang(num:32)
-    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+    #pragma acc loop gang(num:32) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 32; i++)
       {
         /* But here, it is made private per-worker.  */
-        #pragma acc loop worker(num:32) private(pt)
-	/* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
-	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 } */
+        #pragma acc loop worker(num:32) private(pt) /* { dg-line l_loop_j[incr c_loop_j] } */
+	/* { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (j = 0; j < 32; j++)
 	  {
 	    int k;
 	    
 	    pt[0] = i ^ j * 3;
 
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt[0] * k;
 
 	    pt[1] = i | j * 5;
 	    
-	    #pragma acc loop vector(length:32)
-	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
+	    #pragma acc loop vector(length:32) /* { dg-line l_loop_k[incr c_loop_k] } */
+	    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
 	    for (k = 0; k < 32; k++)
 	      arr[i * 1024 + j * 32 + k] += pt[1] * k;
 	  }
       }
   }
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
 
   for (i = 0; i < 32; i++)
     for (int j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90
index 0ae7c4bc761..09ab3956624 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90
@@ -2,11 +2,19 @@
 
 ! { dg-do run }
 
-! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-fopt-info-omp-all" }
+! { dg-additional-options "-foffload=-fopt-info-omp-all" }
+
 ! { dg-additional-options "--param=openacc-privatization=noisy" }
-! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
 
 program main
   integer :: x, i, arr(32)
@@ -15,15 +23,16 @@ program main
      arr(i) = i
   end do
 
-  !$acc kernels copy(arr)
-  !$acc loop gang(num:32) private(x)
-  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-  ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+  !$acc kernels copy(arr) ! { dg-line l_compute[incr c_compute] }
+  !$acc loop gang(num:32) private(x) ! { dg-line l_loop_i[incr c_loop_i] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
+  ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
   do i = 1, 32
      x = i * 2;
      arr(i) = arr(i) + x;
   end do
   !$acc end kernels
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute }
 
   do i = 1, 32
      if (arr(i) .ne. i * 3) stop 1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90
index e3ff24848b6..bec1069c2a8 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90
@@ -3,11 +3,19 @@
 
 ! { dg-do run }
 
-! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-fopt-info-omp-all" }
+! { dg-additional-options "-foffload=-fopt-info-omp-all" }
+
 ! { dg-additional-options "--param=openacc-privatization=noisy" }
-! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
 
 program main
   integer :: x, i, j, arr(0:32*32)
@@ -16,20 +24,21 @@ program main
      arr(i) = i
   end do
 
-  !$acc kernels copy(arr)
-  !$acc loop gang(num:32) private(x)
-  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-  ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+  !$acc kernels copy(arr) ! { dg-line l_compute[incr c_compute] }
+  !$acc loop gang(num:32) private(x) ! { dg-line l_loop_i[incr c_loop_i] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
+  ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
   do i = 0, 31
      x = i * 2;
 
-     !$acc loop worker(num:32)
-     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     !$acc loop worker(num:32) ! { dg-line l_loop_j[incr c_loop_j] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
      do j = 0, 31
         arr(i * 32 + j) = arr(i * 32 + j) + x;
      end do
   end do
   !$acc end kernels
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute }
 
   do i = 0, 32 * 32 - 1
      if (arr(i) .ne. i + (i / 32) * 2) stop 1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90
index 370a25a7db6..9fde012c19c 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90
@@ -3,11 +3,19 @@
 
 ! { dg-do run }
 
-! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-fopt-info-omp-all" }
+! { dg-additional-options "-foffload=-fopt-info-omp-all" }
+
 ! { dg-additional-options "--param=openacc-privatization=noisy" }
-! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
 
 program main
   integer :: x, i, j, arr(0:32*32)
@@ -16,20 +24,21 @@ program main
      arr(i) = i
   end do
 
-  !$acc kernels copy(arr)
-  !$acc loop gang(num:32) private(x)
-  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-  ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+  !$acc kernels copy(arr) ! { dg-line l_compute[incr c_compute] }
+  !$acc loop gang(num:32) private(x) ! { dg-line l_loop_i[incr c_loop_i] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
+  ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
   do i = 0, 31
      x = i * 2;
 
-     !$acc loop vector(length:32)
-     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     !$acc loop vector(length:32) ! { dg-line l_loop_j[incr c_loop_j] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
      do j = 0, 31
         arr(i * 32 + j) = arr(i * 32 + j) + x;
      end do
   end do
   !$acc end kernels
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute }
 
   do i = 0, 32 * 32 - 1
      if (arr(i) .ne. i + (i / 32) * 2) stop 1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90
index abb86d0824f..02e09b31cf7 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90
@@ -3,11 +3,19 @@
 
 ! { dg-do run }
 
-! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-fopt-info-omp-all" }
+! { dg-additional-options "-foffload=-fopt-info-omp-all" }
+
 ! { dg-additional-options "--param=openacc-privatization=noisy" }
-! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
 
 program main
   type vec3
@@ -21,23 +29,24 @@ program main
      arr(i) = i
   end do
 
-  !$acc kernels copy(arr)
-  !$acc loop gang(num:32) private(pt)
-  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-  ! { dg-note {variable 'pt' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 }
+  !$acc kernels copy(arr) ! { dg-line l_compute[incr c_compute] }
+  !$acc loop gang(num:32) private(pt) ! { dg-line l_loop_i[incr c_loop_i] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
+  ! { dg-note {variable 'pt' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_i$c_loop_i }
   do i = 0, 31
      pt%x = i
      pt%y = i * 2
      pt%z = i * 4
      pt%attr(5) = i * 6
 
-     !$acc loop vector(length:32)
-     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     !$acc loop vector(length:32) ! { dg-line l_loop_j[incr c_loop_j] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
      do j = 0, 31
         arr(i * 32 + j) = arr(i * 32 + j) + pt%x + pt%y + pt%z + pt%attr(5);
      end do
   end do
   !$acc end kernels
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute }
 
   do i = 0, 32 * 32 - 1
      if (arr(i) .ne. i + (i / 32) * 13) stop 1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90
index fe796f3ba46..5811d0c41b7 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90
@@ -2,11 +2,19 @@
 
 ! { dg-do run }
 
-! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-fopt-info-omp-all" }
+! { dg-additional-options "-foffload=-fopt-info-omp-all" }
+
 ! { dg-additional-options "--param=openacc-privatization=noisy" }
-! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
 
 program main
   integer :: x, i, j, k, idx, arr(0:32*32*32)
@@ -15,23 +23,23 @@ program main
      arr(i) = i
   end do
 
-  !$acc kernels copy(arr)
-  !$acc loop gang(num:32)
-  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  !$acc kernels copy(arr) ! { dg-line l_compute[incr c_compute] }
+  !$acc loop gang(num:32) ! { dg-line l_loop_i[incr c_loop_i] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
   do i = 0, 31
-     !$acc loop worker(num:8)
-     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     !$acc loop worker(num:8) ! { dg-line l_loop_j[incr c_loop_j] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
      do j = 0, 31
-        !$acc loop vector(length:32) private(x)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-        ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+        !$acc loop vector(length:32) private(x) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
+        ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            x = ieor(i, j * 3)
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
-        !$acc loop vector(length:32) private(x)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-        ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+        !$acc loop vector(length:32) private(x) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
+        ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            x = ior(i, j * 5)
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
@@ -39,6 +47,7 @@ program main
      end do
   end do
   !$acc end kernels
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute }
 
   do i = 0, 32 - 1
      do j = 0, 32 -1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90
index b5cefeccc22..81125a24d0d 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90
@@ -2,11 +2,19 @@
 
 ! { dg-do run }
 
-! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-fopt-info-omp-all" }
+! { dg-additional-options "-foffload=-fopt-info-omp-all" }
+
 ! { dg-additional-options "--param=openacc-privatization=noisy" }
-! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
 
 program main
   integer :: i, j, k, idx, arr(0:32*32*32), pt(2)
@@ -15,16 +23,16 @@ program main
      arr(i) = i
   end do
 
-  !$acc kernels copy(arr)
-  !$acc loop gang(num:32)
-  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  !$acc kernels copy(arr) ! { dg-line l_compute[incr c_compute] }
+  !$acc loop gang(num:32) ! { dg-line l_loop_i[incr c_loop_i] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
   do i = 0, 31
-     !$acc loop worker(num:8)
-     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+     !$acc loop worker(num:8) ! { dg-line l_loop_j[incr c_loop_j] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
      do j = 0, 31
-        !$acc loop vector(length:32) private(x, pt)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-        ! { dg-note {variable 'pt' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 }
+        !$acc loop vector(length:32) private(x, pt) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
+        ! { dg-note {variable 'pt' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            pt(1) = ieor(i, j * 3)
            pt(2) = ior(i, j * 5)
@@ -34,6 +42,7 @@ program main
      end do
   end do
   !$acc end kernels
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute }
 
   do i = 0, 32 - 1
      do j = 0, 32 -1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90
index 3fd1239da4b..824c198799c 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90
@@ -2,11 +2,19 @@
 
 ! { dg-do run }
 
-! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-fopt-info-omp-all" }
+! { dg-additional-options "-foffload=-fopt-info-omp-all" }
+
 ! { dg-additional-options "--param=openacc-privatization=noisy" }
-! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
 
 program main
   integer :: x, i, j, arr(0:32*32)
@@ -16,19 +24,20 @@ program main
      arr(i) = i
   end do
 
-  !$acc kernels copy(arr)
-  !$acc loop gang(num:32) private(x)
-  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  !$acc kernels copy(arr) ! { dg-line l_compute[incr c_compute] }
+  !$acc loop gang(num:32) private(x) ! { dg-line l_loop_i[incr c_loop_i] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
   do i = 0, 31
-     !$acc loop worker(num:8) private(x)
-     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+     !$acc loop worker(num:8) private(x) ! { dg-line l_loop_j[incr c_loop_j] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
      do j = 0, 31
         x = ieor(i, j * 3)
         arr(i * 32 + j) = arr(i * 32 + j) + x
      end do
   end do
   !$acc end kernels
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute }
 
   do i = 0, 32 * 32 - 1
      if (arr(i) .ne. i + ieor(i / 32, mod(i, 32) * 3)) stop 1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90
index 1dc5d9e8eff..d25d419a11e 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90
@@ -3,11 +3,19 @@
 
 ! { dg-do run }
 
-! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-fopt-info-omp-all" }
+! { dg-additional-options "-foffload=-fopt-info-omp-all" }
+
 ! { dg-additional-options "--param=openacc-privatization=noisy" }
-! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
 
 program main
   integer :: x, i, j, k, idx, arr(0:32*32*32)
@@ -16,24 +24,25 @@ program main
      arr(i) = i
   end do
 
-  !$acc kernels copy(arr)
-  !$acc loop gang(num:32)
-  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  !$acc kernels copy(arr) ! { dg-line l_compute[incr c_compute] }
+  !$acc loop gang(num:32) ! { dg-line l_loop_i[incr c_loop_i] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
   do i = 0, 31
-     !$acc loop worker(num:8) private(x)
-     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+     !$acc loop worker(num:8) private(x) ! { dg-line l_loop_j[incr c_loop_j] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
      do j = 0, 31
         x = ieor(i, j * 3)
 
-        !$acc loop vector(length:32)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        !$acc loop vector(length:32) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
      end do
   end do
   !$acc end kernels
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute }
 
   do i = 0, 32 - 1
      do j = 0, 32 -1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90
index 25bc67abb8b..7a69145b3b6 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90
@@ -3,11 +3,19 @@
 
 ! { dg-do run }
 
-! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-fopt-info-omp-all" }
+! { dg-additional-options "-foffload=-fopt-info-omp-all" }
+
 ! { dg-additional-options "--param=openacc-privatization=noisy" }
-! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
 
 program main
   integer :: x, i, j, k, idx, arr(0:32*32*32)
@@ -16,37 +24,38 @@ program main
      arr(i) = i
   end do
 
-  !$acc kernels copy(arr)
-  !$acc loop gang(num:32)
-  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  !$acc kernels copy(arr) ! { dg-line l_compute[incr c_compute] }
+  !$acc loop gang(num:32) ! { dg-line l_loop_i[incr c_loop_i] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
   do i = 0, 31
-     !$acc loop worker(num:8) private(x)
-     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+     !$acc loop worker(num:8) private(x) ! { dg-line l_loop_j[incr c_loop_j] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
      do j = 0, 31
         x = ieor(i, j * 3)
 
-        !$acc loop vector(length:32)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        !$acc loop vector(length:32) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
      end do
 
-     !$acc loop worker(num:8) private(x)
-     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+     !$acc loop worker(num:8) private(x) ! { dg-line l_loop_j[incr c_loop_j] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
      do j = 0, 31
         x = ior(i, j * 5)
 
-        !$acc loop vector(length:32)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        !$acc loop vector(length:32) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
      end do
   end do
   !$acc end kernels
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute }
 
   do i = 0, 32 - 1
      do j = 0, 32 -1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90
index b3f66eaf773..2c1d5665ce5 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90
@@ -3,11 +3,19 @@
 
 ! { dg-do run }
 
-! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-fopt-info-omp-all" }
+! { dg-additional-options "-foffload=-fopt-info-omp-all" }
+
 ! { dg-additional-options "--param=openacc-privatization=noisy" }
-! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
 
 program main
   integer :: x, i, j, k, idx, arr(0:32*32*32)
@@ -16,32 +24,33 @@ program main
      arr(i) = i
   end do
 
-  !$acc kernels copy(arr)
-  !$acc loop gang(num:32)
-  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  !$acc kernels copy(arr) ! { dg-line l_compute[incr c_compute] }
+  !$acc loop gang(num:32) ! { dg-line l_loop_i[incr c_loop_i] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
   do i = 0, 31
-     !$acc loop worker(num:8) private(x)
-     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+     !$acc loop worker(num:8) private(x) ! { dg-line l_loop_j[incr c_loop_j] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
+     ! { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
      do j = 0, 31
         x = ieor(i, j * 3)
 
-        !$acc loop vector(length:32)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        !$acc loop vector(length:32) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
 
         x = ior(i, j * 5)
 
-        !$acc loop vector(length:32)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        !$acc loop vector(length:32) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
      end do
   end do
   !$acc end kernels
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute }
 
   do i = 0, 32 - 1
      do j = 0, 32 -1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90
index d9dbb0736f3..4936e569b44 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90
@@ -3,11 +3,19 @@
 
 ! { dg-do run }
 
-! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-fopt-info-omp-all" }
+! { dg-additional-options "-foffload=-fopt-info-omp-all" }
+
 ! { dg-additional-options "--param=openacc-privatization=noisy" }
-! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
 
 program main
   integer :: i, j, k, idx, arr(0:32*32*32)
@@ -18,34 +26,35 @@ program main
      arr(i) = i
   end do
 
-  !$acc kernels copy(arr)
-  !$acc loop gang(num:32)
-  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  !$acc kernels copy(arr) ! { dg-line l_compute[incr c_compute] }
+  !$acc loop gang(num:32) ! { dg-line l_loop_i[incr c_loop_i] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
   do i = 0, 31
-     !$acc loop worker(num:8) private(x, p)
-     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-     ! { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 }
-     ! { dg-note {variable 'p' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-3 }
+     !$acc loop worker(num:8) private(x, p) ! { dg-line l_loop_j[incr c_loop_j] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
+     ! { dg-note {variable 'x' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_j$c_loop_j }
+     ! { dg-note {variable 'p' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
      do j = 0, 31
         p => x
         x = ieor(i, j * 3)
 
-        !$acc loop vector(length:32)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        !$acc loop vector(length:32) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
 
         p = ior(i, j * 5)
 
-        !$acc loop vector(length:32)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        !$acc loop vector(length:32) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + x * k
         end do
      end do
   end do
   !$acc end kernels
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute }
 
   do i = 0, 32 - 1
      do j = 0, 32 -1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90
index b4225c2bf47..6b2ec1a0047 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90
@@ -3,11 +3,19 @@
 
 ! { dg-do run }
 
-! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-fopt-info-omp-all" }
+! { dg-additional-options "-foffload=-fopt-info-omp-all" }
+
 ! { dg-additional-options "--param=openacc-privatization=noisy" }
-! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
 
 program main
   type vec2
@@ -21,31 +29,32 @@ program main
      arr(i) = i
   end do
 
-  !$acc kernels copy(arr)
-  !$acc loop gang(num:32)
-  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  !$acc kernels copy(arr) ! { dg-line l_compute[incr c_compute] }
+  !$acc loop gang(num:32) ! { dg-line l_loop_i[incr c_loop_i] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
   do i = 0, 31
-     !$acc loop worker(num:8) private(pt)
-     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-     ! { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+     !$acc loop worker(num:8) private(pt) ! { dg-line l_loop_j[incr c_loop_j] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
+     ! { dg-note {variable 'pt' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
      do j = 0, 31
         pt%x = ieor(i, j * 3)
         pt%y = ior(i, j * 5)
         
-        !$acc loop vector(length:32)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        !$acc loop vector(length:32) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt%x * k
         end do
 
-        !$acc loop vector(length:32)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        !$acc loop vector(length:32) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt%y * k
         end do
      end do
   end do
   !$acc end kernels
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute }
 
   do i = 0, 32 - 1
      do j = 0, 32 -1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90
index 76bbda72787..a90be1d45e8 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90
@@ -3,11 +3,19 @@
 
 ! { dg-do run }
 
-! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-fopt-info-omp-all" }
+! { dg-additional-options "-foffload=-fopt-info-omp-all" }
+
 ! { dg-additional-options "--param=openacc-privatization=noisy" }
-! { dg-additional-options "-foffload=-fopt-info-note-omp" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
 
 program main
   integer :: i, j, k, idx, arr(0:32*32*32), pt(2)
@@ -16,31 +24,32 @@ program main
      arr(i) = i
   end do
 
-  !$acc kernels copy(arr)
-  !$acc loop gang(num:32)
-  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  !$acc kernels copy(arr) ! { dg-line l_compute[incr c_compute] }
+  !$acc loop gang(num:32) ! { dg-line l_loop_i[incr c_loop_i] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i }
   do i = 0, 31
-     !$acc loop worker(num:8) private(pt)
-     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-     ! { dg-note {variable 'pt' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 }
+     !$acc loop worker(num:8) private(pt) ! { dg-line l_loop_j[incr c_loop_j] }
+     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j }
+     ! { dg-note {variable 'pt' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_j$c_loop_j }
      do j = 0, 31
         pt(1) = ieor(i, j * 3)
         pt(2) = ior(i, j * 5)
         
-        !$acc loop vector(length:32)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        !$acc loop vector(length:32) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt(1) * k
         end do
 
-        !$acc loop vector(length:32)
-        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+        !$acc loop vector(length:32) ! { dg-line l_loop_k[incr c_loop_k] }
+        ! { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k }
         do k = 0, 31
            arr(i * 1024 + j * 32 + k) = arr(i * 1024 + j * 32 + k) + pt(2) * k
         end do
      end do
   end do
   !$acc end kernels
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute }
 
   do i = 0, 32 - 1
      do j = 0, 32 -1
-- 
2.34.1



* Enhance further testcases to verify handling of OpenACC privatization level [PR90115]
  2021-05-21 19:29       ` Thomas Schwinge
  2021-05-22  1:40         ` [r12-989 Regression] FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -Os (test for warnings, line 98) on Linux/x86_64 sunil.k.pandey
  2022-03-04 13:51         ` Test '-fopt-info-omp-all' in 'libgomp.oacc-*/kernels-private-vars-*' Thomas Schwinge
@ 2022-03-10 11:10         ` Thomas Schwinge
  2022-03-12 13:05         ` Thomas Schwinge
                           ` (2 subsequent siblings)
  5 siblings, 0 replies; 24+ messages in thread
From: Thomas Schwinge @ 2022-03-10 11:10 UTC (permalink / raw)
  To: gcc-patches; +Cc: Julian Brown, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 783 bytes --]

Hi!

On 2021-05-21T21:29:19+0200, I wrote:
> I've pushed "[OpenACC privatization] Largely extend diagnostics and
> corresponding testsuite coverage [PR90115]" to master branch in commit
> 11b8286a83289f5b54e813f14ff56d730c3f3185

To demonstrate what later changes do and don't change, I've pushed
commit 1d9dc3dd74eddd192bec1ac6f4d6548a81deb9a5 "Enhance further
testcases to verify handling of OpenACC privatization level [PR90115]"
to master branch, see attached.
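
For readers unfamiliar with the counter idiom used throughout the attached
patch, here is a rough model of how the line names resolve. It is a sketch
only: the class and method names are hypothetical and are not part of DejaGnu
or GCC. It assumes that '[incr c_loop]' expands to the incremented counter
value and that '$c_loop' expands to the current value, which is standard Tcl
substitution behavior.

```python
class DgLineCounter:
    """Toy model (hypothetical) of the Tcl counters behind the dg-line names."""

    def __init__(self, **initial):
        # Models '[variable c_loop 0 ...]': counters are initialized explicitly.
        self.counters = dict(initial)

    def incr(self, name):
        # Models Tcl 'incr name': add one and return the new value.
        # An unset counter raises KeyError here, standing in for the error
        # that Tcl releases before 8.5 report for 'incr' on an unset variable.
        self.counters[name] += 1
        return self.counters[name]

    def define_line(self, prefix, counter):
        # Models '{ dg-line l_loop[incr c_loop] }': a fresh, numbered name.
        return f"{prefix}{self.incr(counter)}"

    def current_line(self, prefix, counter):
        # Models 'l_loop$c_loop' in a later dg-note or dg-optimized directive.
        return f"{prefix}{self.counters[counter]}"


dg = DgLineCounter(c_loop=0)
first = dg.define_line("l_loop", "c_loop")    # "l_loop1"
same = dg.current_line("l_loop", "c_loop")    # "l_loop1", the same location
second = dg.define_line("l_loop", "c_loop")   # "l_loop2", for the next loop
```

Each 'l_loop$c_loop' reference therefore resolves to the most recently
defined line, which is why one name can be reused for every loop in a file.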


Regards
 Thomas



[-- Attachment #2: 0001-Enhance-further-testcases-to-verify-handling-of-Open.patch --]
[-- Type: text/x-diff, Size: 33747 bytes --]

From 1d9dc3dd74eddd192bec1ac6f4d6548a81deb9a5 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Tue, 8 Mar 2022 11:51:55 +0100
Subject: [PATCH] Enhance further testcases to verify handling of OpenACC
 privatization level [PR90115]

As originally introduced in commit 11b8286a83289f5b54e813f14ff56d730c3f3185
"[OpenACC privatization] Largely extend diagnostics and corresponding testsuite
coverage [PR90115]".

	PR middle-end/90115
	gcc/testsuite/
	* c-c++-common/goacc/nesting-1.c: Enhance.
	* gcc.dg/goacc/nested-function-1.c: Likewise.
	* gcc.dg/goacc/nested-function-2.c: Likewise.
	* gfortran.dg/goacc/nested-function-1.f90: Likewise.
	libgomp/
	* testsuite/libgomp.oacc-fortran/routine-1.f90: Enhance.
	* testsuite/libgomp.oacc-fortran/routine-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/routine-3.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/routine-9.f90: Likewise.
---
 gcc/testsuite/c-c++-common/goacc/nesting-1.c  | 57 +++++++++++++----
 .../gcc.dg/goacc/nested-function-1.c          | 54 ++++++++++++----
 .../gcc.dg/goacc/nested-function-2.c          | 28 ++++++++-
 .../gfortran.dg/goacc/nested-function-1.f90   | 62 +++++++++++++++----
 .../libgomp.oacc-fortran/routine-1.f90        | 19 +++++-
 .../libgomp.oacc-fortran/routine-2.f90        | 19 +++++-
 .../libgomp.oacc-fortran/routine-3.f90        | 19 +++++-
 .../libgomp.oacc-fortran/routine-9.f90        | 19 +++++-
 8 files changed, 227 insertions(+), 50 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/goacc/nesting-1.c b/gcc/testsuite/c-c++-common/goacc/nesting-1.c
index cab4f98950d..83cbff767a4 100644
--- a/gcc/testsuite/c-c++-common/goacc/nesting-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/nesting-1.c
@@ -1,3 +1,15 @@
+/* { dg-additional-options "-fopt-info-all-omp" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0] }
+   { dg-message dummy {} { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
 extern int i;
 
 void
@@ -5,7 +17,11 @@ f_acc_parallel (void)
 {
 #pragma acc parallel
   {
-#pragma acc loop
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_i$c_loop_i }
+       { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'vector'} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 2; ++i)
       ;
   }
@@ -15,9 +31,12 @@ f_acc_parallel (void)
 void
 f_acc_kernels (void)
 {
-#pragma acc kernels
+#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
   {
-#pragma acc loop
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (i = 0; i < 2; ++i)
       ;
   }
@@ -34,17 +53,25 @@ f_acc_data (void)
 
 #pragma acc parallel
     {
-#pragma acc loop
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */
+      /* { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_i$c_loop_i }
+	 { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'vector'} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (i = 0; i < 2; ++i)
 	;
     }
 
-#pragma acc kernels
+#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
     ;
 
-#pragma acc kernels
+#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
     {
-#pragma acc loop
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */
+      /* { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (i = 0; i < 2; ++i)
 	;
     }
@@ -65,17 +92,25 @@ f_acc_data (void)
 
 #pragma acc parallel
       {
-#pragma acc loop
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */
+	/* { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+	/* { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_i$c_loop_i }
+	   { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'vector'} {} { target *-*-* } l_loop_i$c_loop_i } */
+	/* { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
 	for (i = 0; i < 2; ++i)
 	  ;
       }
 
-#pragma acc kernels
+#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */
+      /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
       ;
 
-#pragma acc kernels
+#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */
+      /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
       {
-#pragma acc loop
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */
+	/* { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+	/* { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_i$c_loop_i } */
 	for (i = 0; i < 2; ++i)
 	  ;
       }
diff --git a/gcc/testsuite/gcc.dg/goacc/nested-function-1.c b/gcc/testsuite/gcc.dg/goacc/nested-function-1.c
index e17c0e2227f..c34bcb0d601 100644
--- a/gcc/testsuite/gcc.dg/goacc/nested-function-1.c
+++ b/gcc/testsuite/gcc.dg/goacc/nested-function-1.c
@@ -2,6 +2,20 @@
 /* See gcc/testsuite/gfortran.dg/goacc/nested-function-1.f90 for the Fortran
    version.  */
 
+/* { dg-additional-options "-fopt-info-all-omp" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+   { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute_loop 0 c_loop 0] }
+   { dg-message dummy {} { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
 int main ()
 {
 #define N 100
@@ -25,32 +39,40 @@ int main ()
       local_a[i] = 5;
     local_arg = 5;
 
-#pragma acc kernels loop \
+#pragma acc kernels loop /* { dg-line l_compute_loop[incr c_compute_loop] } */ \
   gang(num:local_arg) worker(local_arg) vector(local_arg) \
   wait async(local_arg)
+    /* { dg-note {variable 'local_i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute_loop$c_compute_loop } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute_loop$c_compute_loop } */
     for (local_i = 0; local_i < N; ++local_i)
       {
 #pragma acc cache (local_a[local_i:5])
 	local_a[local_i] = 100;
-#pragma acc loop seq tile(*)
+#pragma acc loop seq tile(*) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'local_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
 	for (local_j = 0; local_j < N; ++local_j)
 	  ;
-#pragma acc loop auto independent tile(1)
+#pragma acc loop auto independent tile(1) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'local_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
 	for (local_j = 0; local_j < N; ++local_j)
 	  ;
       }
 
-#pragma acc kernels loop \
+#pragma acc kernels loop /* { dg-line l_compute_loop[incr c_compute_loop] } */ \
   gang(static:local_arg) worker(local_arg) vector(local_arg) \
   wait(local_arg, local_arg + 1, local_arg + 2) async
+    /* { dg-note {variable 'local_i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute_loop$c_compute_loop } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute_loop$c_compute_loop } */
     for (local_i = 0; local_i < N; ++local_i)
       {
 #pragma acc cache (local_a[local_i:4])
 	local_a[local_i] = 100;
-#pragma acc loop seq tile(1)
+#pragma acc loop seq tile(1) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'local_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
 	for (local_j = 0; local_j < N; ++local_j)
 	  ;
-#pragma acc loop auto independent tile(*)
+#pragma acc loop auto independent tile(*) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'local_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
 	for (local_j = 0; local_j < N; ++local_j)
 	  ;
       }
@@ -62,32 +84,40 @@ int main ()
       nonlocal_a[i] = 5;
     nonlocal_arg = 5;
 
-#pragma acc kernels loop \
+#pragma acc kernels loop /* { dg-line l_compute_loop[incr c_compute_loop] } */ \
   gang(num:nonlocal_arg) worker(nonlocal_arg) vector(nonlocal_arg) \
   wait async(nonlocal_arg)
+    /* { dg-note {variable 'nonlocal_i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute_loop$c_compute_loop } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute_loop$c_compute_loop } */
     for (nonlocal_i = 0; nonlocal_i < N; ++nonlocal_i)
       {
 #pragma acc cache (nonlocal_a[nonlocal_i:3])
 	nonlocal_a[nonlocal_i] = 100;
-#pragma acc loop seq tile(2)
+#pragma acc loop seq tile(2) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'nonlocal_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
 	for (nonlocal_j = 0; nonlocal_j < N; ++nonlocal_j)
 	  ;
-#pragma acc loop auto independent tile(3)
+#pragma acc loop auto independent tile(3) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'nonlocal_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
 	for (nonlocal_j = 0; nonlocal_j < N; ++nonlocal_j)
 	  ;
       }
 
-#pragma acc kernels loop \
+#pragma acc kernels loop /* { dg-line l_compute_loop[incr c_compute_loop] } */ \
   gang(static:nonlocal_arg) worker(nonlocal_arg) vector(nonlocal_arg) \
   wait(nonlocal_arg, nonlocal_arg + 1, nonlocal_arg + 2) async
+    /* { dg-note {variable 'nonlocal_i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute_loop$c_compute_loop } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute_loop$c_compute_loop } */
     for (nonlocal_i = 0; nonlocal_i < N; ++nonlocal_i)
       {
 #pragma acc cache (nonlocal_a[nonlocal_i:2])
 	nonlocal_a[nonlocal_i] = 100;
-#pragma acc loop seq tile(*)
+#pragma acc loop seq tile(*) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'nonlocal_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
 	for (nonlocal_j = 0; nonlocal_j < N; ++nonlocal_j)
 	  ;
-#pragma acc loop auto independent tile(*)
+#pragma acc loop auto independent tile(*) /* { dg-line l_loop[incr c_loop] } */
+	/* { dg-note {variable 'nonlocal_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
 	for (nonlocal_j = 0; nonlocal_j < N; ++nonlocal_j)
 	  ;
       }
diff --git a/gcc/testsuite/gcc.dg/goacc/nested-function-2.c b/gcc/testsuite/gcc.dg/goacc/nested-function-2.c
index 70c9ec8ebfa..407006948da 100644
--- a/gcc/testsuite/gcc.dg/goacc/nested-function-2.c
+++ b/gcc/testsuite/gcc.dg/goacc/nested-function-2.c
@@ -1,5 +1,17 @@
 /* Exercise nested function decomposition, gcc/tree-nested.c.  */
 
+/* { dg-additional-options "-fopt-info-all-omp" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_loop 0] }
+   { dg-message dummy {} { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
 int
 main (void)
 {
@@ -9,7 +21,9 @@ main (void)
     int i;
 #pragma acc parallel
     {
-#pragma acc loop
+#pragma acc loop /* { dg-line l_loop[incr c_loop] } */
+      /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
+      /* { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l_loop$c_loop } */
       for (i = 0; i < m; i+= k)
 	j = (m + i - j) * l;
     }
@@ -19,7 +33,11 @@ main (void)
     int x, y, z;
 #pragma acc parallel
     {
-#pragma acc loop collapse (3)
+#pragma acc loop collapse (3) /* { dg-line l_loop[incr c_loop] } */
+      /* { dg-note {variable 'z' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
+      /* { dg-note {variable 'y' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
+      /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
+      /* { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l_loop$c_loop } */
       for (x = 0; x < k; x++)
 	for (y = -5; y < l; y++)
 	  for (z = 0; z < m; z++)
@@ -31,7 +49,11 @@ main (void)
     int x, y, z;
 #pragma acc parallel reduction (+:j)
     {
-#pragma acc loop reduction (+:j) collapse (3)
+#pragma acc loop reduction (+:j) collapse (3) /* { dg-line l_loop[incr c_loop] } */
+      /* { dg-note {variable 'z' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
+      /* { dg-note {variable 'y' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
+      /* { dg-note {variable 'x' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop } */
+      /* { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l_loop$c_loop } */
       for (x = 0; x < k; x++)
 	for (y = -5; y < l; y++)
 	  for (z = 0; z < m; z++)
diff --git a/gcc/testsuite/gfortran.dg/goacc/nested-function-1.f90 b/gcc/testsuite/gfortran.dg/goacc/nested-function-1.f90
index 005193f30a7..50fd0c82e14 100644
--- a/gcc/testsuite/gfortran.dg/goacc/nested-function-1.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/nested-function-1.f90
@@ -1,6 +1,20 @@
 ! Exercise nested function decomposition, gcc/tree-nested.c.
 ! See gcc/testsuite/gcc.dg/goacc/nested-function-1.c for the C version.
 
+! { dg-additional-options "-fopt-info-all-omp" }
+
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+! { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} }
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute_loop 0 c_loop 0] }
+! { dg-message dummy {} { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
+
 program main
   integer, parameter :: N = 100
   integer :: nonlocal_arg
@@ -29,14 +43,20 @@ contains
 
     !$acc kernels loop &
     !$acc gang(num:local_arg) worker(local_arg) vector(local_arg) &
-    !$acc wait async(local_arg)
+    !$acc wait async(local_arg) ! { dg-line l_compute_loop[incr c_compute_loop] }
+    ! { dg-note {variable 'local_i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_compute_loop$c_compute_loop }
+    ! { dg-note {variable 'local_i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute_loop$c_compute_loop }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute_loop$c_compute_loop }
+    ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute_loop$c_compute_loop }
     do local_i = 1, N
        !$acc cache (local_a(local_i:local_i + 5))
        local_a(local_i) = 100
-       !$acc loop seq tile(*)
+       !$acc loop seq tile(*) ! { dg-line l_loop[incr c_loop] }
+       ! { dg-note {variable 'local_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop }
        do local_j = 1, N
        enddo
-       !$acc loop auto independent tile(1)
+       !$acc loop auto independent tile(1) ! { dg-line l_loop[incr c_loop] }
+       ! { dg-note {variable 'local_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop }
        do local_j = 1, N
        enddo
     enddo
@@ -44,14 +64,20 @@ contains
 
     !$acc kernels loop &
     !$acc gang(static:local_arg) worker(local_arg) vector(local_arg) &
-    !$acc wait(local_arg, local_arg + 1, local_arg + 2) async
+    !$acc wait(local_arg, local_arg + 1, local_arg + 2) async ! { dg-line l_compute_loop[incr c_compute_loop] }
+    ! { dg-note {variable 'local_i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_compute_loop$c_compute_loop }
+    ! { dg-note {variable 'local_i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute_loop$c_compute_loop }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute_loop$c_compute_loop }
+    ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute_loop$c_compute_loop }
     do local_i = 1, N
        !$acc cache (local_a(local_i:local_i + 4))
        local_a(local_i) = 100
-       !$acc loop seq tile(1)
+       !$acc loop seq tile(1) ! { dg-line l_loop[incr c_loop] }
+       ! { dg-note {variable 'local_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop }
        do local_j = 1, N
        enddo
-       !$acc loop auto independent tile(*)
+       !$acc loop auto independent tile(*) ! { dg-line l_loop[incr c_loop] }
+       ! { dg-note {variable 'local_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop }
        do local_j = 1, N
        enddo
     enddo
@@ -68,14 +94,20 @@ contains
 
     !$acc kernels loop &
     !$acc gang(num:nonlocal_arg) worker(nonlocal_arg) vector(nonlocal_arg) &
-    !$acc wait async(nonlocal_arg)
+    !$acc wait async(nonlocal_arg) ! { dg-line l_compute_loop[incr c_compute_loop] }
+    ! { dg-note {variable 'nonlocal_i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_compute_loop$c_compute_loop }
+    ! { dg-note {variable 'nonlocal_i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute_loop$c_compute_loop }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute_loop$c_compute_loop }
+    ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute_loop$c_compute_loop }
     do nonlocal_i = 1, N
        !$acc cache (nonlocal_a(nonlocal_i:nonlocal_i + 3))
        nonlocal_a(nonlocal_i) = 100
-       !$acc loop seq tile(2)
+       !$acc loop seq tile(2) ! { dg-line l_loop[incr c_loop] }
+       ! { dg-note {variable 'nonlocal_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop }
        do nonlocal_j = 1, N
        enddo
-       !$acc loop auto independent tile(3)
+       !$acc loop auto independent tile(3) ! { dg-line l_loop[incr c_loop] }
+       ! { dg-note {variable 'nonlocal_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop }
        do nonlocal_j = 1, N
        enddo
     enddo
@@ -83,14 +115,20 @@ contains
 
     !$acc kernels loop &
     !$acc gang(static:nonlocal_arg) worker(nonlocal_arg) vector(nonlocal_arg) &
-    !$acc wait(nonlocal_arg, nonlocal_arg + 1, nonlocal_arg + 2) async
+    !$acc wait(nonlocal_arg, nonlocal_arg + 1, nonlocal_arg + 2) async ! { dg-line l_compute_loop[incr c_compute_loop] }
+    ! { dg-note {variable 'nonlocal_i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_compute_loop$c_compute_loop }
+    ! { dg-note {variable 'nonlocal_i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute_loop$c_compute_loop }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute_loop$c_compute_loop }
+    ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute_loop$c_compute_loop }
     do nonlocal_i = 1, N
        !$acc cache (nonlocal_a(nonlocal_i:nonlocal_i + 2))
        nonlocal_a(nonlocal_i) = 100
-       !$acc loop seq tile(*)
+       !$acc loop seq tile(*) ! { dg-line l_loop[incr c_loop] }
+       ! { dg-note {variable 'nonlocal_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop }
        do nonlocal_j = 1, N
        enddo
-       !$acc loop auto independent tile(*)
+       !$acc loop auto independent tile(*) ! { dg-line l_loop[incr c_loop] }
+       ! { dg-note {variable 'nonlocal_j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop$c_loop }
        do nonlocal_j = 1, N
        enddo
     enddo
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-1.f90
index 6a573218b7a..95d8752f8a0 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/routine-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-1.f90
@@ -1,6 +1,14 @@
 ! { dg-do run }
 ! { dg-options "-fno-inline" }
 
+! { dg-additional-options "-fopt-info-all-omp" }
+! { dg-additional-options "-foffload=-fopt-info-all-omp" }
+
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+! { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} }
+
   interface
     recursive function fact (x)
       !$acc routine
@@ -11,9 +19,14 @@
   integer, parameter :: n = 10
   integer :: a(n), i
   !$acc parallel
-  !$acc loop
+  !$acc loop ! { dg-line l_loop1 }
+  ! { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop1 }
+  !   { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'vector'} {} { target *-*-* } l_loop1 }
+  !   { dg-note {variable 'i' adjusted for OpenACC privatization level: 'vector'} {} { target { ! openacc_host_selected } } l_loop1 }
+  ! { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop1 }
+  ! { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l_loop1 }
   do i = 1, n
-     a(i) = fact (i)
+     a(i) = fact (i) ! { dg-optimized {assigned OpenACC seq loop parallelism} }
   end do
   !$acc end parallel
   do i = 1, n
@@ -27,6 +40,6 @@ recursive function fact (x) result (res)
   if (x < 1) then
      res = 1
   else
-     res = x * fact (x - 1)
+     res = x * fact (x - 1) ! { dg-optimized {assigned OpenACC seq loop parallelism} }
   end if
 end function fact
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-2.f90
index b6979747902..9e8eb96dbf2 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/routine-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-2.f90
@@ -1,6 +1,14 @@
 ! { dg-do run }
 ! { dg-options "-fno-inline" }
 
+! { dg-additional-options "-fopt-info-all-omp" }
+! { dg-additional-options "-foffload=-fopt-info-all-omp" }
+
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+! { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} }
+
   module m1
     contains
     recursive function fact (x) result (res)
@@ -10,7 +18,7 @@
       if (x < 1) then
          res = 1
       else
-         res = x * fact (x - 1)
+         res = x * fact (x - 1) ! { dg-optimized {assigned OpenACC seq loop parallelism} }
       end if
     end function fact
   end module m1
@@ -18,9 +26,14 @@
   integer, parameter :: n = 10
   integer :: a(n), i
   !$acc parallel
-  !$acc loop
+  !$acc loop ! { dg-line l_loop1 }
+  ! { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop1 }
+  !   { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'vector'} {} { target *-*-* } l_loop1 }
+  !   { dg-note {variable 'i' adjusted for OpenACC privatization level: 'vector'} {} { target { ! openacc_host_selected } } l_loop1 }
+  ! { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop1 }
+  ! { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l_loop1 }
   do i = 1, n
-     a(i) = fact (i)
+     a(i) = fact (i) ! { dg-optimized {assigned OpenACC seq loop parallelism} }
   end do
   !$acc end parallel
   do i = 1, n
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-3.f90
index e7b9d8ab364..38218263851 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/routine-3.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-3.f90
@@ -1,14 +1,27 @@
 ! { dg-do run }
 ! { dg-options "-fno-inline" }
 
+! { dg-additional-options "-fopt-info-all-omp" }
+! { dg-additional-options "-foffload=-fopt-info-all-omp" }
+
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+! { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} }
+
   integer, parameter :: n = 10
   integer :: a(n), i
   integer, external :: fact
   !$acc routine (fact)
   !$acc parallel
-  !$acc loop
+  !$acc loop ! { dg-line l_loop1 }
+  ! { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop1 }
+  !   { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'vector'} {} { target *-*-* } l_loop1 }
+  !   { dg-note {variable 'i' adjusted for OpenACC privatization level: 'vector'} {} { target { ! openacc_host_selected } } l_loop1 }
+  ! { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop1 }
+  ! { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l_loop1 }
   do i = 1, n
-     a(i) = fact (i)
+     a(i) = fact (i) ! { dg-optimized {assigned OpenACC seq loop parallelism} }
   end do
   !$acc end parallel
   do i = 1, n
@@ -22,6 +35,6 @@ recursive function fact (x) result (res)
   if (x < 1) then
      res = 1
   else
-     res = x * fact (x - 1)
+     res = x * fact (x - 1) ! { dg-optimized {assigned OpenACC seq loop parallelism} }
   end if
 end function fact
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-9.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-9.f90
index b1a1338dd8c..dbd2e4de743 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/routine-9.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-9.f90
@@ -1,6 +1,14 @@
 ! { dg-do run }
 ! { dg-options "-fno-inline" }
 
+! { dg-additional-options "-fopt-info-all-omp" }
+! { dg-additional-options "-foffload=-fopt-info-all-omp" }
+
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+! { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} }
+
 program main
   implicit none
   integer, parameter :: n = 10
@@ -8,9 +16,14 @@ program main
   integer, external :: fact
   !$acc routine (fact)
   !$acc parallel
-  !$acc loop
+  !$acc loop ! { dg-line l_loop1 }
+  ! { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop1 }
+  !   { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'vector'} {} { target *-*-* } l_loop1 }
+  !   { dg-note {variable 'i' adjusted for OpenACC privatization level: 'vector'} {} { target { ! openacc_host_selected } } l_loop1 }
+  ! { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop1 }
+  ! { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l_loop1 }
   do i = 1, n
-     a(i) = fact (i)
+     a(i) = fact (i) ! { dg-optimized {assigned OpenACC seq loop parallelism} }
   end do
   !$acc end parallel
   do i = 1, n
@@ -26,6 +39,6 @@ recursive function fact (x) result (res)
   if (x < 1) then
      res = 1
   else
-     res = x * fact(x - 1)
+     res = x * fact(x - 1) ! { dg-optimized {assigned OpenACC seq loop parallelism} }
   end if
 end function fact
-- 
2.34.1



* Enhance further testcases to verify handling of OpenACC privatization level [PR90115]
  2021-05-21 19:29       ` Thomas Schwinge
                           ` (2 preceding siblings ...)
  2022-03-10 11:10         ` Enhance further testcases to verify handling of OpenACC privatization level [PR90115] Thomas Schwinge
@ 2022-03-12 13:05         ` Thomas Schwinge
  2022-03-16  9:20         ` OpenACC privatization diagnostics vs. 'assert' [PR102841] Thomas Schwinge
  2022-03-17  7:59         ` Enhance further testcases to verify handling of OpenACC privatization level [PR90115] Thomas Schwinge
  5 siblings, 0 replies; 24+ messages in thread
From: Thomas Schwinge @ 2022-03-12 13:05 UTC (permalink / raw)
  To: gcc-patches; +Cc: Julian Brown, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 783 bytes --]

Hi!

On 2021-05-21T21:29:19+0200, I wrote:
> I've pushed "[OpenACC privatization] Largely extend diagnostics and
> corresponding testsuite coverage [PR90115]" to master branch in commit
> 11b8286a83289f5b54e813f14ff56d730c3f3185

To demonstrate where later changes do and don't alter behavior, I've
pushed to master branch commit 2e53fa7bb2ae9fe1152c27e423be9e261da82ddc
"Enhance further testcases to verify handling of OpenACC privatization
level [PR90115]", see attached.


Regards,
 Thomas



[-- Attachment #2: 0001-Enhance-further-testcases-to-verify-handling-of-Open.patch --]
[-- Type: text/x-diff, Size: 43361 bytes --]

From 2e53fa7bb2ae9fe1152c27e423be9e261da82ddc Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 11 Mar 2022 15:10:59 +0100
Subject: [PATCH] Enhance further testcases to verify handling of OpenACC
 privatization level [PR90115]

As originally introduced in commit 11b8286a83289f5b54e813f14ff56d730c3f3185
"[OpenACC privatization] Largely extend diagnostics and corresponding testsuite
coverage [PR90115]".

	PR middle-end/90115
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/default-1.c: Enhance.
	* testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90: Likewise.
---
 .../libgomp.oacc-c-c++-common/default-1.c     |  32 ++-
 .../kernels-reduction-1.c                     |  14 +-
 .../libgomp.oacc-c-c++-common/parallel-dims.c | 261 +++++++++++++++---
 .../kernels-reduction-1.f90                   |  14 +-
 4 files changed, 266 insertions(+), 55 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c
index 1ac0b9587b9..0ac8d7132d4 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c
@@ -1,4 +1,18 @@
-/* { dg-do run } */
+/* { dg-additional-options "-fopt-info-all-omp" }
+   { dg-additional-options "-foffload=-fopt-info-all-omp" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+   { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0] }
+   { dg-message dummy {} { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
 
 #include  <openacc.h>
 
@@ -13,10 +27,15 @@ int test_parallel ()
     ary[i] = ~0;
 
   /* val defaults to firstprivate, ary defaults to copy.  */
-#pragma acc parallel num_gangs (32) copy (ok) copy(ondev)
+#pragma acc parallel num_gangs (32) copy (ok) copy(ondev) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     ondev = acc_on_device (acc_device_not_host);
-#pragma acc loop gang(static:1)
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+       ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
+#pragma acc loop gang(static:1) /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+    /* { dg-optimized {assigned OpenACC gang loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (unsigned i = 0; i < 32; i++)
       {
 	if (val != 2)
@@ -51,10 +70,13 @@ int test_kernels ()
     ary[i] = ~0;
 
   /* val defaults to copy, ary defaults to copy.  */
-#pragma acc kernels copy(ondev)
+#pragma acc kernels copy(ondev) /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
   {
     ondev = acc_on_device (acc_device_not_host);
-#pragma acc loop 
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
     for (unsigned i = 0; i < 32; i++)
       {
 	ary[i] = val;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c
index 95f1b77986c..fbd9815f683 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c
@@ -1,6 +1,14 @@
 /* Verify that a simple, explicit acc loop reduction works inside
  a kernels region.  */
 
+/* { dg-additional-options "-fopt-info-all-omp" }
+   { dg-additional-options "-foffload=-fopt-info-all-omp" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+   { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} } */
+
 #include <stdlib.h>
 
 #define N 100
@@ -10,9 +18,11 @@ main ()
 {
   int i, red = 0;
 
-#pragma acc kernels
+#pragma acc kernels /* { dg-line l_compute1 } */
+  /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute1 } */
   {
-#pragma acc loop reduction (+:red)
+#pragma acc loop reduction (+:red) /* { dg-line l_loop_i1 } */
+    /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i1 } */
   for (i = 0; i < N; i++)
     red++;
   }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
index c2f264a1ec8..f9c7aed3a56 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
@@ -1,6 +1,22 @@
 /* OpenACC parallelism dimensions clauses: num_gangs, num_workers,
    vector_length.  */
 
+/* { dg-additional-options "-fopt-info-all-omp" }
+   { dg-additional-options "-foffload=-fopt-info-all-omp" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+   { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop_i 0 c_loop_j 0 c_loop_k 0] }
+   { dg-message dummy {} { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
 /* { dg-additional-options "-Wopenacc-parallelism" } for testing/documenting
    aspects of that functionality.  */
 
@@ -11,18 +27,21 @@
 #include <gomp-constants.h>
 
 #pragma acc routine seq
+inline __attribute__ ((always_inline))
 static int acc_gang ()
 {
   return __builtin_goacc_parlevel_id (GOMP_DIM_GANG);
 }
 
 #pragma acc routine seq
+inline __attribute__ ((always_inline))
 static int acc_worker ()
 {
   return __builtin_goacc_parlevel_id (GOMP_DIM_WORKER);
 }
 
 #pragma acc routine seq
+inline __attribute__ ((always_inline))
 static int acc_vector ()
 {
   return __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR);
@@ -39,14 +58,19 @@ int main ()
 
   /* GR, WS, VS.  */
   {
-#define GANGS 0 /* { dg-warning "'num_gangs' value must be positive" "" { target c } } */
+#define GANGS 0
+    /* { dg-warning {'num_gangs' value must be positive} {} { target c } .-1 } */
     int gangs_actual = GANGS;
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (gangs_actual) \
+#pragma acc parallel /* { dg-line l_compute[incr c_compute] } */ \
+  copy (gangs_actual) \
   reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) \
-  num_gangs (GANGS) /* { dg-warning "'num_gangs' value must be positive" "" { target c++ } } */
+  num_gangs (GANGS)
+    /* { dg-note {in expansion of macro 'GANGS'} {} { target c } .-1 } */
+    /* { dg-warning {'num_gangs' value must be positive} {} { target c++ } .-2 } */
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
     {
       /* We're actually executing with num_gangs (1).  */
       gangs_actual = 1;
@@ -68,18 +92,27 @@ int main ()
 
   /* GP, WS, VS.  */
   {
-#define GANGS 0 /* { dg-warning "'num_gangs' value must be positive" "" { target c } } */
+#define GANGS 0
+    /* { dg-warning {'num_gangs' value must be positive} {} { target c } .-1 } */
     int gangs_actual = GANGS;
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (gangs_actual) \
-  num_gangs (GANGS) /* { dg-warning "'num_gangs' value must be positive" "" { target c++ } } */
-    /* { dg-warning "region contains gang partitioned code but is not gang partitioned" "" { target *-*-* } .-2 } */
+#pragma acc parallel /* { dg-line l_compute[incr c_compute] } */ \
+  copy (gangs_actual) \
+  num_gangs (GANGS)
+    /* { dg-note {in expansion of macro 'GANGS'} {} { target c } .-1 } */
+    /* { dg-warning {'num_gangs' value must be positive} {} { target c++ } .-2 } */
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-warning {region contains gang partitioned code but is not gang partitioned} {} { target *-*-* } l_compute$c_compute } */
     {
       /* We're actually executing with num_gangs (1).  */
       gangs_actual = 1;
-#pragma acc loop gang reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */ \
+  gang \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-optimized {assigned OpenACC gang loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (int i = 100 * gangs_actual; i > -100 * gangs_actual; --i)
 	{
 	  gangs_min = gangs_max = acc_gang ();
@@ -98,18 +131,27 @@ int main ()
 
   /* GR, WP, VS.  */
   {
-#define WORKERS 0 /* { dg-warning "'num_workers' value must be positive" "" { target c } } */
+#define WORKERS 0
+    /* { dg-warning {'num_workers' value must be positive} {} { target c } .-1 } */
     int workers_actual = WORKERS;
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (workers_actual) \
-  num_workers (WORKERS) /* { dg-warning "'num_workers' value must be positive" "" { target c++ } } */
-    /* { dg-warning "region contains worker partitioned code but is not worker partitioned" "" { target *-*-* } .-2 } */
+#pragma acc parallel /* { dg-line l_compute[incr c_compute] } */ \
+  copy (workers_actual) \
+  num_workers (WORKERS)
+    /* { dg-note {in expansion of macro 'WORKERS'} {} { target c } .-1 } */
+    /* { dg-warning {'num_workers' value must be positive} {} { target c++ } .-2 } */
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-warning {region contains worker partitioned code but is not worker partitioned} {} { target *-*-* } l_compute$c_compute } */
     {
       /* We're actually executing with num_workers (1).  */
       workers_actual = 1;
-#pragma acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */ \
+  worker \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-optimized {assigned OpenACC worker loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (int i = 100 * workers_actual; i > -100 * workers_actual; --i)
 	{
 	  gangs_min = gangs_max = acc_gang ();
@@ -128,22 +170,34 @@ int main ()
 
   /* GR, WS, VP.  */
   {
-#define VECTORS 0 /* { dg-warning "'vector_length' value must be positive" "" { target c } } */
+#define VECTORS 0
+    /* { dg-warning {'vector_length' value must be positive} {} { target c } .-1 } */
     int vectors_actual = VECTORS;
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (vectors_actual) /* { dg-warning "using .vector_length \\(32\\)., ignoring 1" "" { target openacc_nvidia_accel_selected } } */ \
-  vector_length (VECTORS) /* { dg-warning "'vector_length' value must be positive" "" { target c++ } } */
-    /* { dg-warning "region contains vector partitioned code but is not vector partitioned" "" { target *-*-* } .-2 } */
+#pragma acc parallel /* { dg-line l_compute[incr c_compute] } */ \
+  copy (vectors_actual) \
+  vector_length (VECTORS)
+    /* { dg-note {in expansion of macro 'VECTORS'} {} { target c } .-1 } */
+    /* { dg-warning {'vector_length' value must be positive} {} { target c++ } .-2 } */
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-warning {region contains vector partitioned code but is not vector partitioned} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-warning {using 'vector_length \(32\)', ignoring 1} {} { target openacc_nvidia_accel_selected } l_compute$c_compute } */
     {
       /* We're actually executing with vector_length (1), just the GCC nvptx
 	 back end enforces vector_length (32).  */
       if (acc_on_device (acc_device_nvidia))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	vectors_actual = 32;
       else
 	vectors_actual = 1;
-#pragma acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */ \
+  vector \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-optimized {assigned OpenACC vector loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (int i = 100 * vectors_actual; i > -100 * vectors_actual; --i)
 	{
 	  gangs_min = gangs_max = acc_gang ();
@@ -178,12 +232,16 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (gangs_actual) \
+#pragma acc parallel /* { dg-line l_compute[incr c_compute] } */ \
+  copy (gangs_actual) \
   reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) \
   num_gangs (gangs)
-    /* { dg-bogus "warning: region is gang partitioned but does not contain gang partitioned code" "TODO 'reduction'" { xfail *-*-* } .-3 } */
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-bogus {warning: region is gang partitioned but does not contain gang partitioned code} {TODO 'reduction'} { xfail *-*-* } l_compute$c_compute } */
     {
       if (acc_on_device (acc_device_host))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* We're actually executing with num_gangs (1).  */
 	  gangs_actual = 1;
@@ -214,15 +272,23 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (gangs_actual) \
+#pragma acc parallel /* { dg-line l_compute[incr c_compute] } */ \
+  copy (gangs_actual) \
   num_gangs (gangs)
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
     {
       if (acc_on_device (acc_device_host))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* We're actually executing with num_gangs (1).  */
 	  gangs_actual = 1;
 	}
-#pragma acc loop gang reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */ \
+  gang \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-optimized {assigned OpenACC gang loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (int i = 100 * gangs_actual; i > -100 * gangs_actual; --i)
 	{
 	  gangs_min = gangs_max = acc_gang ();
@@ -246,27 +312,40 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (workers_actual) /* { dg-warning "using .num_workers \\(32\\)., ignoring 2097152" "" { target openacc_nvidia_accel_selected } } */ \
+#pragma acc parallel /* { dg-line l_compute[incr c_compute] } */ \
+  copy (workers_actual) \
   num_workers (WORKERS)
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-warning {using 'num_workers \(32\)', ignoring 2097152} {} { target openacc_nvidia_accel_selected } l_compute$c_compute } */
     {
       if (acc_on_device (acc_device_host))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* We're actually executing with num_workers (1).  */
 	  workers_actual = 1;
 	}
       else if (acc_on_device (acc_device_nvidia))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* The GCC nvptx back end enforces num_workers (32).  */
 	  workers_actual = 32;
 	}
       else if (acc_on_device (acc_device_radeon))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* The GCC GCN back end is limited to num_workers (16).  */
 	  workers_actual = 16;
 	}
       else
 	__builtin_abort ();
-#pragma acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */ \
+  worker \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-optimized {assigned OpenACC worker loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (int i = 100 * workers_actual; i > -100 * workers_actual; --i)
 	{
 	  gangs_min = gangs_max = acc_gang ();
@@ -297,27 +376,39 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (workers_actual) \
+#pragma acc parallel /* { dg-line l_compute[incr c_compute] } */ \
+  copy (workers_actual) \
   num_workers (workers)
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
     {
       if (acc_on_device (acc_device_host))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* We're actually executing with num_workers (1).  */
 	  workers_actual = 1;
 	}
       else if (acc_on_device (acc_device_nvidia))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* We're actually executing with num_workers (32).  */
 	  /* workers_actual = 32; */
 	}
       else if (acc_on_device (acc_device_radeon))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* The GCC GCN back end is limited to num_workers (16).  */
 	  workers_actual = 16;
 	}
       else
 	__builtin_abort ();
-#pragma acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */ \
+  worker \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-optimized {assigned OpenACC worker loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (int i = 100 * workers_actual; i > -100 * workers_actual; --i)
 	{
 	  gangs_min = gangs_max = acc_gang ();
@@ -341,27 +432,40 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (vectors_actual) /* { dg-warning "using .vector_length \\(1024\\)., ignoring 2097152" "" { target openacc_nvidia_accel_selected } } */ \
+#pragma acc parallel /* { dg-line l_compute[incr c_compute] } */ \
+  copy (vectors_actual) \
   vector_length (VECTORS)
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-warning {using 'vector_length \(1024\)', ignoring 2097152} {} { target openacc_nvidia_accel_selected } l_compute$c_compute } */
     {
       if (acc_on_device (acc_device_host))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* We're actually executing with vector_length (1).  */
 	  vectors_actual = 1;
 	}
       else if (acc_on_device (acc_device_nvidia))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* The GCC nvptx back end reduces to vector_length (1024).  */
 	  vectors_actual = 1024;
 	}
       else if (acc_on_device (acc_device_radeon))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* The GCC GCN back end enforces vector_length (1): autovectorize. */
 	  vectors_actual = 1;
 	}
       else
 	__builtin_abort ();
-#pragma acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */ \
+  vector \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-optimized {assigned OpenACC vector loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (int i = 100 * vectors_actual; i > -100 * vectors_actual; --i)
 	{
 	  gangs_min = gangs_max = acc_gang ();
@@ -386,20 +490,29 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (vectors_actual) /* { dg-warning "using .vector_length \\(32\\)., ignoring runtime setting" "" { target openacc_nvidia_accel_selected } } */ \
+#pragma acc parallel /* { dg-line l_compute[incr c_compute] } */ \
+  copy (vectors_actual) \
   vector_length (vectors)
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-warning {using 'vector_length \(32\)', ignoring runtime setting} {} { target openacc_nvidia_accel_selected } l_compute$c_compute } */
     {
       if (acc_on_device (acc_device_host))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* We're actually executing with vector_length (1).  */
 	  vectors_actual = 1;
 	}
       else if (acc_on_device (acc_device_nvidia))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* The GCC nvptx back end enforces vector_length (32).  */
 	  vectors_actual = 32;
 	}
       else if (acc_on_device (acc_device_radeon))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* Because of the way vectors are implemented for GCN, a vector loop
 	     containing a seq routine call will not vectorize calls to that
@@ -408,7 +521,11 @@ int main ()
 	}
       else
 	__builtin_abort ();
-#pragma acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */ \
+  vector \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-optimized {assigned OpenACC vector loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (int i = 100 * vectors_actual; i > -100 * vectors_actual; --i)
 	{
 	  gangs_min = gangs_max = acc_gang ();
@@ -443,12 +560,17 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (gangs_actual, workers_actual, vectors_actual) /* { dg-warning "using .vector_length \\(32\\)., ignoring 11" "" { target openacc_nvidia_accel_selected } } */ \
+#pragma acc parallel /* { dg-line l_compute[incr c_compute] } */ \
+  copy (gangs_actual, workers_actual, vectors_actual) \
   num_gangs (gangs) \
   num_workers (WORKERS) \
   vector_length (VECTORS)
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-warning {using 'vector_length \(32\)', ignoring 11} {} { target openacc_nvidia_accel_selected } l_compute$c_compute } */
     {
       if (acc_on_device (acc_device_host))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* We're actually executing with num_gangs (1), num_workers (1),
 	     vector_length (1).  */
@@ -457,22 +579,40 @@ int main ()
 	  vectors_actual = 1;
 	}
       else if (acc_on_device (acc_device_nvidia))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* The GCC nvptx back end enforces vector_length (32).  */
 	  vectors_actual = 32;
 	}
       else if (acc_on_device (acc_device_radeon))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* See above comments about GCN vectors_actual.  */
 	  vectors_actual = 1;
 	}
       else
 	__builtin_abort ();
-#pragma acc loop gang reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */ \
+  gang \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-optimized {assigned OpenACC gang loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (int i = 100 * gangs_actual; i > -100 * gangs_actual; --i)
-#pragma acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_j[incr c_loop_j] } */ \
+  worker \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-optimized {assigned OpenACC worker loop parallelism} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (int j = 100 * workers_actual; j > -100 * workers_actual; --j)
-#pragma acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_k[incr c_loop_k] } */ \
+  vector \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+	  /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
+	  /* { dg-optimized {assigned OpenACC vector loop parallelism} {} { target *-*-* } l_loop_k$c_loop_k } */
 	  for (int k = 100 * vectors_actual; k > -100 * vectors_actual; --k)
 	    {
 	      gangs_min = gangs_max = acc_gang ();
@@ -502,12 +642,16 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc kernels
+#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
     {
       /* This is to make the OpenACC kernels construct unparallelizable.  */
       asm volatile ("" : : : "memory");
 
-#pragma acc loop reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */ \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (int i = 100; i > -100; --i)
 	{
 	  gangs_min = gangs_max = acc_gang ();
@@ -532,15 +676,19 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc kernels \
+#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ \
   num_gangs (gangs) \
   num_workers (WORKERS) \
   vector_length (VECTORS)
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
     {
       /* This is to make the OpenACC kernels construct unparallelizable.  */
       asm volatile ("" : : : "memory");
 
-#pragma acc loop reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */ \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (int i = 100; i > -100; --i)
 	{
 	  gangs_min = gangs_max = acc_gang ();
@@ -564,8 +712,10 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc serial /* { dg-warning "using .vector_length \\(32\\)., ignoring 1" "" { target openacc_nvidia_accel_selected } } */ \
+#pragma acc serial /* { dg-line l_compute[incr c_compute] } */ \
   reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-warning {using 'vector_length \(32\)', ignoring 1} {} { target openacc_nvidia_accel_selected } l_compute$c_compute } */
     {
       for (int i = 100; i > -100; i--)
 	{
@@ -586,13 +736,18 @@ int main ()
     int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
     gangs_min = workers_min = vectors_min = INT_MAX;
     gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc serial copy (vectors_actual) /* { dg-warning "using .vector_length \\(32\\)., ignoring 1" "" { target openacc_nvidia_accel_selected } } */ \
+#pragma acc serial /* { dg-line l_compute[incr c_compute] } */ \
+  copy (vectors_actual) \
   copy (gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max)
-    /* { dg-bogus "warning: region contains gang partitioned code but is not gang partitioned" "TODO 'serial'" { xfail *-*-* } .-2 }
-       { dg-bogus "warning: region contains worker partitioned code but is not worker partitioned" "TODO 'serial'" { xfail *-*-* } .-3 }
-       { dg-bogus "warning: region contains vector partitioned code but is not vector partitioned" "TODO 'serial'" { xfail *-*-* } .-4 } */
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-bogus {warning: region contains gang partitioned code but is not gang partitioned} {TODO 'serial'} { xfail *-*-* } l_compute$c_compute }
+       { dg-bogus {warning: region contains worker partitioned code but is not worker partitioned} {TODO 'serial'} { xfail *-*-* } l_compute$c_compute }
+       { dg-bogus {warning: region contains vector partitioned code but is not vector partitioned} {TODO 'serial'} { xfail *-*-* } l_compute$c_compute } */
+    /* { dg-warning {using 'vector_length \(32\)', ignoring 1} {} { target openacc_nvidia_accel_selected } l_compute$c_compute } */
     {
       if (acc_on_device (acc_device_nvidia))
+	/* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { c++ && { ! __OPTIMIZE__ } } } .-1 }
+	   ..., as without optimizations, we're not inlining the C++ 'acc_on_device' wrapper.  */
 	{
 	  /* The GCC nvptx back end enforces vector_length (32).  */
 	  /* It's unclear if that's actually permissible here;
@@ -600,11 +755,25 @@ int main ()
 	     'serial' construct might not actually be serial".  */
 	  vectors_actual = 32;
 	}
-#pragma acc loop gang reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_i[incr c_loop_i] } */ \
+  gang \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      /* { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-note {variable 'j' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i$c_loop_i } */
+      /* { dg-optimized {assigned OpenACC gang loop parallelism} {} { target *-*-* } l_loop_i$c_loop_i } */
       for (int i = 100; i > -100; i--)
-#pragma acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_j[incr c_loop_j] } */ \
+  worker \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+	/* { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_j$c_loop_j } */
+	/* { dg-optimized {assigned OpenACC worker loop parallelism} {} { target *-*-* } l_loop_j$c_loop_j } */
 	for (int j = 100; j > -100; j--)
-#pragma acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+#pragma acc loop /* { dg-line l_loop_k[incr c_loop_k] } */ \
+  vector \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+	  /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k$c_loop_k } */
+	  /* { dg-optimized {assigned OpenACC vector loop parallelism} {} { target *-*-* } l_loop_k$c_loop_k } */
 	  for (int k = 100 * vectors_actual; k > -100 * vectors_actual; k--)
 	    {
 	      gangs_min = gangs_max = acc_gang ();
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90
index 4b85608f0de..6ff740efc32 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90
@@ -2,14 +2,24 @@
 
 ! { dg-do run }
 
+! { dg-additional-options "-fopt-info-all-omp" }
+! { dg-additional-options "-foffload=-fopt-info-all-omp" } */
+
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+! { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} } */
+
 program reduction
   integer, parameter     :: n = 20
   integer                :: i, red
 
   red = 0
 
-  !$acc kernels
-  !$acc loop reduction (+:red)
+  !$acc kernels ! { dg-line l_compute1 } */
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute1 }
+  !$acc loop reduction (+:red) ! { dg-line l_loop_i1 }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_i1 }
   do i = 1, n
      red = red + 1
   end do
-- 
2.34.1



* OpenACC privatization diagnostics vs. 'assert' [PR102841]
  2021-05-21 19:29       ` Thomas Schwinge
                           ` (3 preceding siblings ...)
  2022-03-12 13:05         ` Thomas Schwinge
@ 2022-03-16  9:20         ` Thomas Schwinge
  2022-03-17  7:59         ` Enhance further testcases to verify handling of OpenACC privatization level [PR90115] Thomas Schwinge
  5 siblings, 0 replies; 24+ messages in thread
From: Thomas Schwinge @ 2022-03-16  9:20 UTC (permalink / raw)
  To: gcc-patches; +Cc: Julian Brown, Jakub Jelinek, ro, iains

[-- Attachment #1: Type: text/plain, Size: 3564 bytes --]

Hi!

On 2021-05-21T21:29:19+0200, I wrote:
> I've pushed "[OpenACC privatization] Largely extend diagnostics and
> corresponding testsuite coverage [PR90115]" to master branch in commit
> 11b8286a83289f5b54e813f14ff56d730c3f3185

Pushed to master branch commit ab46fc7c3bf01337ea4554f08f4f6b0be8173557
"OpenACC privatization diagnostics vs. 'assert' [PR102841]", see
attached.


Grüße
 Thomas


> --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-7.c
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-7.c
> @@ -1,6 +1,11 @@
> -/* { dg-do run } */
> -
>  /* Test if, if_present clauses on host_data construct.  */
> +
> +/* { dg-additional-options "-fopt-info-all-omp" }
> +   { dg-additional-options "--param=openacc-privatization=noisy" }
> +   { dg-additional-options "-foffload=-fopt-info-all-omp" }
> +   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
> +   for testing/documenting aspects of that functionality.  */
> +
>  /* C/C++ variant of 'libgomp.oacc-fortran/host_data-5.F90' */
>
>  #include <assert.h>
> @@ -14,15 +19,19 @@ foo (float *p, intptr_t host_p, int cond)
>  #pragma acc data copyin(host_p)
>    {
>  #pragma acc host_data use_device(p) if_present
> +    /* { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
>      /* p not mapped yet, so it will be equal to the host pointer.  */
>      assert (p == (float *) host_p);
>
>  #pragma acc data copy(p[0:100])
> +    /* { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
> +    /* { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
>      {
>        /* Not inside a host_data construct, so p is still the host pointer.  */
>        assert (p == (float *) host_p);
>
>  #pragma acc host_data use_device(p)
> +      /* { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
>        {
>  #if ACC_MEM_SHARED
>       assert (p == (float *) host_p);
> @@ -33,6 +42,7 @@ foo (float *p, intptr_t host_p, int cond)
>        }
>
>  #pragma acc host_data use_device(p) if_present
> +      /* { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
>        {
>  #if ACC_MEM_SHARED
>       assert (p == (float *) host_p);
> @@ -43,6 +53,8 @@ foo (float *p, intptr_t host_p, int cond)
>        }
>
>  #pragma acc host_data use_device(p) if(cond)
> +      /* { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
> +      /* { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target { ! openacc_host_selected } } .-2 } */
>        {
>  #if ACC_MEM_SHARED
>       assert (p == (float *) host_p);


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Attachment #2: 0001-OpenACC-privatization-diagnostics-vs.-assert-PR10284.patch --]
[-- Type: text/x-diff, Size: 3129 bytes --]

From ab46fc7c3bf01337ea4554f08f4f6b0be8173557 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Wed, 16 Mar 2022 08:02:39 +0100
Subject: [PATCH] OpenACC privatization diagnostics vs. 'assert' [PR102841]

It's an orthogonal concern why these diagnostics do appear at all for
non-offloaded OpenACC constructs (where they're not relevant at all); PR90115.

Depending on how 'assert' is implemented, it may cause temporaries to be
created, and/or may lower into 'COND_EXPR's, and
'gcc/gimplify.cc:gimplify_cond_expr' uses 'create_tmp_var (type, "iftmp")'.
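
For illustration only, here is a minimal sketch of why the shape of an
'assert' expansion matters.  The macros 'MY_ASSERT_STMT' and 'MY_ASSERT_EXPR'
and the handler 'my_assert_fail' are hypothetical names made up for this
example; they are not taken from any particular C library.  The statement
form expands to a plain 'if', whereas the expression form expands to a
conditional expression with a non-void type, which is the kind of construct
that may be gimplified via a compiler-generated temporary.  Whether a given
'assert' implementation and GCC version actually produce an 'iftmp' here is
an assumption, which is why the testcase prunes such notes instead of
matching them.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical failure handler, standing in for whatever a C library
   calls when an assertion fails.  Returns 'int' only so that it can be
   used as an operand of a conditional expression.  */
static int
my_assert_fail (const char *expr, const char *file, int line)
{
  fprintf (stderr, "%s:%d: assertion '%s' failed\n", file, line, expr);
  abort ();
  return 0; /* Not reached.  */
}

/* Statement-style expansion: a plain 'if', no value is produced.  */
#define MY_ASSERT_STMT(cond) \
  do { if (!(cond)) my_assert_fail (#cond, __FILE__, __LINE__); } while (0)

/* Expression-style expansion: a conditional expression of type 'int'
   whose value is then discarded.  */
#define MY_ASSERT_EXPR(cond) \
  ((void) ((cond) ? 1 : my_assert_fail (#cond, __FILE__, __LINE__)))

/* Each helper checks the same condition as the testcase does
   ('p == host pointer') and returns 1 if the check passed.  */
int
same_pointer_stmt (float *p, float *q)
{
  MY_ASSERT_STMT (p == q);
  return 1;
}

int
same_pointer_expr (float *p, float *q)
{
  MY_ASSERT_EXPR (p == q);
  return 1;
}

int
same_pointer_libc (float *p, float *q)
{
  /* Whatever the installed C library's 'assert' expands to.  */
  assert (p == q);
  return 1;
}
```

All three helpers behave identically at run time; they differ only in the
intermediate representation the compiler sees, and thus potentially in which
artificial variables ('D.1234', 'iftmp.5', ...) show up in the
'--param=openacc-privatization=noisy' notes.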

Fix-up for commit 11b8286a83289f5b54e813f14ff56d730c3f3185
"[OpenACC privatization] Largely extend diagnostics and
corresponding testsuite coverage [PR90115]".

	PR testsuite/102841
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/host_data-7.c: Adjust.
---
 libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-7.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-7.c
index 66501e614fb..50b4fc264d0 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-7.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-7.c
@@ -4,7 +4,9 @@
    { dg-additional-options "--param=openacc-privatization=noisy" }
    { dg-additional-options "-foffload=-fopt-info-all-omp" }
    { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
+   Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types) or 'assert' implementation:
+   { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} }
+   { dg-prune-output {note: variable 'iftmp\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} } */
 
 /* C/C++ variant of 'libgomp.oacc-fortran/host_data-5.F90' */
 
@@ -25,7 +27,6 @@ foo (float *p, intptr_t host_p, int cond)
 
 #pragma acc data copy(p[0:100])
     /* { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-    /* { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 } */
     {
       /* Not inside a host_data construct, so p is still the host pointer.  */
       assert (p == (float *) host_p);
@@ -54,7 +55,6 @@ foo (float *p, intptr_t host_p, int cond)
 
 #pragma acc host_data use_device(p) if(cond)
       /* { dg-note {variable 'host_p\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-      /* { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target { ! openacc_host_selected } } .-2 } */
       {
 #if ACC_MEM_SHARED
 	assert (p == (float *) host_p);
-- 
2.34.1



* Enhance further testcases to verify handling of OpenACC privatization level [PR90115]
  2021-05-21 19:29       ` Thomas Schwinge
                           ` (4 preceding siblings ...)
  2022-03-16  9:20         ` OpenACC privatization diagnostics vs. 'assert' [PR102841] Thomas Schwinge
@ 2022-03-17  7:59         ` Thomas Schwinge
  5 siblings, 0 replies; 24+ messages in thread
From: Thomas Schwinge @ 2022-03-17  7:59 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 711 bytes --]

Hi!

On 2021-05-21T21:29:19+0200, I wrote:
> I've pushed "[OpenACC privatization] Largely extend diagnostics and
> corresponding testsuite coverage [PR90115]" to master branch in commit
> 11b8286a83289f5b54e813f14ff56d730c3f3185

Pushed to master branch commit 004fc4f2fc686d3366c9e1a2d8b9183796073866
"Enhance further testcases to verify handling of OpenACC privatization
level [PR90115]", see attached.


Grüße
 Thomas



[-- Attachment #2: 0001-Enhance-further-testcases-to-verify-handling-of-Open.patch --]
[-- Type: text/x-diff, Size: 32365 bytes --]

From 004fc4f2fc686d3366c9e1a2d8b9183796073866 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Wed, 16 Mar 2022 12:15:01 +0100
Subject: [PATCH] Enhance further testcases to verify handling of OpenACC
 privatization level [PR90115]

As originally introduced in commit 11b8286a83289f5b54e813f14ff56d730c3f3185
"[OpenACC privatization] Largely extend diagnostics and corresponding testsuite
coverage [PR90115]".

	PR middle-end/90115
	gcc/testsuite/
	* c-c++-common/goacc-gomp/nesting-1.c: Enhance.
	* gfortran.dg/goacc/common-block-3.f90: Likewise.
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Enhance.
	* testsuite/libgomp.oacc-fortran/if-1.f90: Likewise.
---
 .../c-c++-common/goacc-gomp/nesting-1.c       |  9 +-
 .../gfortran.dg/goacc/common-block-3.f90      | 13 ++-
 .../acc_prof-kernels-1.c                      | 35 ++++++--
 .../testsuite/libgomp.oacc-fortran/if-1.f90   | 82 +++++--------------
 4 files changed, 68 insertions(+), 71 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/goacc-gomp/nesting-1.c b/gcc/testsuite/c-c++-common/goacc-gomp/nesting-1.c
index b0b78374016..39b92712b31 100644
--- a/gcc/testsuite/c-c++-common/goacc-gomp/nesting-1.c
+++ b/gcc/testsuite/c-c++-common/goacc-gomp/nesting-1.c
@@ -1,14 +1,15 @@
 /* { dg-additional-options "-fopt-info-omp-note" } */
-/* { dg-additional-options "--param=openacc-privatization=noisy" } for
-   testing/documenting aspects of that functionality.  */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+   { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} } */
 
 
 void
 f_acc_data (void)
 {
 #pragma acc data
-  /* { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 } */
-  /* { dg-note {variable 'i' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-2 } */
+  /* { dg-note {variable 'i' declared in block is candidate for adjusting OpenACC privatization level} "" { target *-*-* } .-1 } */
   {
     int i;
 #pragma omp atomic write
diff --git a/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90 b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
index 5defe2ea85d..9dbfa4cd2f0 100644
--- a/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
@@ -1,5 +1,11 @@
 ! { dg-options "-fopenacc -fdump-tree-omplower" }
 
+! { dg-additional-options "-fopt-info-omp-all" }
+
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+! { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} }
+
 module consts
   integer, parameter :: n = 100
 end module consts
@@ -15,11 +21,14 @@ program main
   common /KERNELS_BLOCK/ x, y, z
 
   c = 1.0
-  !$acc parallel loop copy(/BLOCK/)
+  !$acc parallel loop copy(/BLOCK/) ! { dg-line l1 }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l1 }
+  ! { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l1 }
   do i = 1, n
      a(i) = b(i) + c
   end do
-  !$acc kernels
+  !$acc kernels ! { dg-line l2 }
+  ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l2 }
   do i = 1, n
      x(i) = y(i) + c
   end do
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c
index d7f47627438..c82a7edbfa0 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c
@@ -1,5 +1,21 @@
 /* Test dispatch of events to callbacks.  */
 
+/* { dg-additional-options "-fopt-info-omp-all" }
+   { dg-additional-options "-foffload=-fopt-info-omp-all" } */
+
+/* { dg-additional-options "--param=openacc-privatization=noisy" }
+   { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+   Prune a few: uninteresting:
+   { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0] }
+   { dg-message dummy {} { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
 #undef NDEBUG
 #include <assert.h>
 #include <stdlib.h>
@@ -164,7 +180,10 @@ int main()
   {
 #define N 100
     int x[N];
-#pragma acc kernels
+#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { ! __OPTIMIZE__ } } l_compute$c_compute }
+       { dg-optimized {assigned OpenACC gang loop parallelism} {} { target __OPTIMIZE__ } l_compute$c_compute } */
     {
       for (int i = 0; i < N; ++i)
 	x[i] = i * i;
@@ -187,9 +206,12 @@ int main()
   {
 #define N 100
     int x[N];
-#pragma acc kernels \
+#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ \
   num_gangs (30) num_workers (3) vector_length (5)
-    /* { dg-prune-output "using .vector_length \\(32\\)., ignoring 5" } */
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-warning {using 'vector_length \(32\)', ignoring 5} {} { target { __OPTIMIZE__ && openacc_nvidia_accel_selected } } l_compute$c_compute } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { ! __OPTIMIZE__ } } l_compute$c_compute }
+       { dg-optimized {assigned OpenACC gang loop parallelism} {} { target __OPTIMIZE__ } l_compute$c_compute } */
     {
       for (int i = 0; i < N; ++i)
 	x[i] = i * i;
@@ -212,9 +234,12 @@ int main()
   {
 #define N 100
     int x[N];
-#pragma acc kernels \
+#pragma acc kernels /* { dg-line l_compute[incr c_compute] } */ \
   num_gangs (num_gangs) num_workers (num_workers) vector_length (vector_length)
-    /* { dg-prune-output "using .vector_length \\(32\\)., ignoring runtime setting" } */
+    /* { dg-note {variable 'i' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
+    /* { dg-warning {using 'vector_length \(32\)', ignoring runtime setting} {} { target { __OPTIMIZE__ && openacc_nvidia_accel_selected } } l_compute$c_compute } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target { ! __OPTIMIZE__ } } l_compute$c_compute }
+       { dg-optimized {assigned OpenACC gang loop parallelism} {} { target __OPTIMIZE__ } l_compute$c_compute } */
     {
       for (int i = 0; i < N; ++i)
 	x[i] = i * i;
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/if-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/if-1.f90
index 9eadfcf9738..3c4d9a6efb7 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/if-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/if-1.f90
@@ -2,10 +2,12 @@
 ! { dg-additional-options "-cpp" }
 
 ! { dg-additional-options "-fopt-info-note-omp" }
-! { dg-additional-options "--param=openacc-privatization=noisy" }
 ! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+
+! { dg-additional-options "--param=openacc-privatization=noisy" }
 ! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-! for testing/documenting aspects of that functionality.
+! Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+! { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} }
 
 ! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
 ! passed to 'incr' may be unset, and in that case, it will be set to [...]",
@@ -34,7 +36,6 @@ program main
   a(:) = 4.0
 
   !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (1 == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
      do i = 1, N
         ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
         !TODO Unhandled 'CONST_DECL' instances for constant argument in 'acc_on_device' call.
@@ -59,7 +60,6 @@ program main
   a(:) = 16.0
 
   !$acc parallel if (0 == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
      do i = 1, N
         ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
        if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -77,7 +77,6 @@ program main
   a(:) = 8.0
 
   !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (one == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -101,7 +100,6 @@ program main
   a(:) = 22.0
 
   !$acc parallel if (zero == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -119,7 +117,6 @@ program main
   a(:) = 16.0
 
   !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (.TRUE.) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -143,7 +140,6 @@ program main
   a(:) = 76.0
 
   !$acc parallel if (.FALSE.) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -163,7 +159,6 @@ program main
   nn = 1
 
   !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (nn == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -189,7 +184,6 @@ program main
   nn = 0
 
   !$acc parallel if (nn == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -209,7 +203,6 @@ program main
   nn = 1
 
   !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if ((nn + nn) > 0) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -235,7 +228,6 @@ program main
   nn = 0;
 
   !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if ((nn + nn) > 0) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -253,7 +245,6 @@ program main
   a(:) = 91.0
 
   !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (-2 > 0) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -271,7 +262,6 @@ program main
   a(:) = 43.0
 
   !$acc parallel copyin (a(1:N)) copyout (b(1:N)) if (one == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -295,7 +285,6 @@ program main
   a(:) = 87.0
 
   !$acc parallel if (one == 0) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -374,11 +363,9 @@ program main
   b(:) = 0.0
 
   !$acc data copyin (a(1:N)) copyout (b(1:N)) if (1 == 1)
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-  ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
 
-    !$acc parallel present (a(1:N)) ! { dg-line l_compute[incr c_compute] }
-    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    !$acc parallel present (a(1:N))
        do i = 1, N
            b(i) = a(i)
        end do
@@ -393,8 +380,7 @@ program main
   b(:) = 1.0
 
   !$acc data copyin (a(1:N)) copyout (b(1:N)) if (0 == 1)
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target { ! openacc_host_selected } } .-1 }
-  ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-2 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-1 }
 
 #if !ACC_MEM_SHARED
   if (acc_is_present (a) .eqv. .TRUE.) STOP 21
@@ -407,27 +393,23 @@ program main
   b(:) = 21.0
 
   !$acc data copyin (a(1:N)) if (1 == 1)
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-  ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
-  ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-3 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-2 }
 
 #if !ACC_MEM_SHARED
     if (acc_is_present (a) .eqv. .FALSE.) STOP 23
 #endif
 
     !$acc data copyout (b(1:N)) if (0 == 1)
-    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-    ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
-    ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-3 }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-2 }
 #if !ACC_MEM_SHARED
       if (acc_is_present (b) .eqv. .TRUE.) STOP 24
 #endif
         !$acc data copyout (b(1:N)) if (1 == 1)
-        ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-        ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+        ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
 
-        !$acc parallel present (a(1:N)) present (b(1:N)) ! { dg-line l_compute[incr c_compute] }
-        ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+        !$acc parallel present (a(1:N)) present (b(1:N))
           do i = 1, N
             b(i) = a(i)
           end do
@@ -508,7 +490,6 @@ program main
   a(:) = 4.0
 
   !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (1 == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
      do i = 1, N
         ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
         if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -532,7 +513,6 @@ program main
   a(:) = 16.0
 
   !$acc kernels if (0 == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
      do i = 1, N
         ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
        if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -550,7 +530,6 @@ program main
   a(:) = 8.0
 
   !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (one == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -574,7 +553,6 @@ program main
   a(:) = 22.0
 
   !$acc kernels if (zero == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -592,7 +570,6 @@ program main
   a(:) = 16.0
 
   !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (.TRUE.) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -616,7 +593,6 @@ program main
   a(:) = 76.0
 
   !$acc kernels if (.FALSE.) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -636,7 +612,6 @@ program main
   nn = 1
 
   !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (nn == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -662,7 +637,6 @@ program main
   nn = 0
 
   !$acc kernels if (nn == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -682,7 +656,6 @@ program main
   nn = 1
 
   !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if ((nn + nn) > 0) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -708,7 +681,6 @@ program main
   nn = 0;
 
   !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if ((nn + nn) > 0) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -726,7 +698,6 @@ program main
   a(:) = 91.0
 
   !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (-2 > 0) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -744,7 +715,6 @@ program main
   a(:) = 43.0
 
   !$acc kernels copyin (a(1:N)) copyout (b(1:N)) if (one == 1) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
        ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -768,7 +738,6 @@ program main
   a(:) = 87.0
 
   !$acc kernels if (one == 0) ! { dg-line l_compute[incr c_compute] }
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     do i = 1, N
       ! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_compute$c_compute }
       if (acc_on_device (acc_device_host) .eqv. .TRUE.) then
@@ -847,11 +816,9 @@ program main
   b(:) = 0.0
 
   !$acc data copyin (a(1:N)) copyout (b(1:N)) if (1 == 1)
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-  ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
 
-    !$acc kernels present (a(1:N)) ! { dg-line l_compute[incr c_compute] }
-    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    !$acc kernels present (a(1:N))
        do i = 1, N
            b(i) = a(i)
        end do
@@ -866,8 +833,7 @@ program main
   b(:) = 1.0
 
   !$acc data copyin (a(1:N)) copyout (b(1:N)) if (0 == 1)
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target { ! openacc_host_selected } } .-1 }
-  ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-2 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-1 }
 
 #if !ACC_MEM_SHARED
   if (acc_is_present (a) .eqv. .TRUE.) STOP 56
@@ -880,27 +846,23 @@ program main
   b(:) = 21.0
 
   !$acc data copyin (a(1:N)) if (1 == 1)
-  ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-  ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
-  ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-3 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-2 }
 
 #if !ACC_MEM_SHARED
     if (acc_is_present (a) .eqv. .FALSE.) STOP 58
 #endif
 
     !$acc data copyout (b(1:N)) if (0 == 1)
-    ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-    ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
-    ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-3 }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
+    ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-2 }
 #if !ACC_MEM_SHARED
       if (acc_is_present (b) .eqv. .TRUE.) STOP 59
 #endif
         !$acc data copyout (b(1:N)) if (1 == 1)
-        ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
-        ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+        ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
 
-        !$acc kernels present (a(1:N)) present (b(1:N)) ! { dg-line l_compute[incr c_compute] }
-        ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+        !$acc kernels present (a(1:N)) present (b(1:N))
           do i = 1, N
             b(i) = a(i)
           end do
-- 
2.34.1
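
Editorial note on the 'l_compute[incr c_compute]' idiom used in the patch above: the comment in acc_prof-kernels-1.c explains that the counter is initialized manually because 'incr' on an unset variable only works from Tcl 8.5 on. The following is a minimal Python sketch of how the directive arguments are presumably expanded, assuming ordinary Tcl command and variable substitution. The function name 'expand' and the 'variables' dictionary are hypothetical and are not part of DejaGnu or the GCC testsuite.

```python
import re


def expand(directive_arg, variables):
    """Mimic the two Tcl substitutions the dg-line counter idiom relies on.

    This is an illustration only: it handles just `[variable NAME VALUE]`,
    `[incr NAME]` and `$NAME`, not general Tcl syntax.
    """

    def run_command(match):
        words = match.group(1).split()
        if words[0] == "variable":
            # `variable c_compute 0` sets the counter and yields an empty
            # string, so `l_dummy[variable c_compute 0]` is just `l_dummy`.
            variables[words[1]] = int(words[2])
            return ""
        if words[0] == "incr":
            # Raises KeyError when the counter is unset, which stands in for
            # the error that Tcl releases before 8.5 would report.
            variables[words[1]] += 1
            return str(variables[words[1]])
        raise ValueError("unsupported command: " + words[0])

    # Command substitution first, then variable substitution.
    text = re.sub(r"\[([^\]]+)\]", run_command, directive_arg)
    return re.sub(r"\$(\w+)", lambda m: str(variables[m.group(1)]), text)


counters = {}
names = [
    expand("l_dummy[variable c_compute 0]", counters),
    expand("l_compute[incr c_compute]", counters),
    expand("l_compute$c_compute", counters),
    expand("l_compute[incr c_compute]", counters),
    expand("l_compute$c_compute", counters),
]
print(names)
```

Under these assumptions, each 'dg-line l_compute[incr c_compute]' defines a fresh line name, and the 'l_compute$c_compute' references that follow it resolve to the most recently defined one, so the same directive text can be repeated for every compute construct in a file.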



end of thread, other threads:[~2022-03-17  7:59 UTC | newest]

Thread overview: 24+ messages
2021-02-26 12:34 [PATCH 0/3] openacc: Gang-private variables in shared memory Julian Brown
2021-02-26 12:34 ` [PATCH 1/3] openacc: Add support for gang local storage allocation " Julian Brown
2021-04-15 17:26   ` Thomas Schwinge
2021-04-16 16:05     ` Andrew Stubbs
2021-04-16 17:30       ` Thomas Schwinge
2021-04-18 22:53         ` Andrew Stubbs
2021-04-19 11:06           ` Thomas Schwinge
2021-04-19 11:23     ` Julian Brown
2021-05-21 18:55       ` Thomas Schwinge
2021-05-21 19:18       ` Thomas Schwinge
2021-05-21 19:20       ` Thomas Schwinge
2021-05-21 19:29       ` Thomas Schwinge
2021-05-22  1:40         ` [r12-989 Regression] FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -Os (test for warnings, line 98) on Linux/x86_64 sunil.k.pandey
2021-05-22  8:41           ` Thomas Schwinge
2021-05-25  1:03             ` Sunil Pandey
2022-03-04 13:51         ` Test '-fopt-info-omp-all' in 'libgomp.oacc-*/kernels-private-vars-*' Thomas Schwinge
2022-03-10 11:10         ` Enhance further testcases to verify handling of OpenACC privatization level [PR90115] Thomas Schwinge
2022-03-12 13:05         ` Thomas Schwinge
2022-03-16  9:20         ` OpenACC privatization diagnostics vs. 'assert' [PR102841] Thomas Schwinge
2022-03-17  7:59         ` Enhance further testcases to verify handling of OpenACC privatization level [PR90115] Thomas Schwinge
2021-05-21 19:12   ` [PATCH 1/3] openacc: Add support for gang local storage allocation in shared memory Thomas Schwinge
2021-02-26 12:34 ` [PATCH 2/3] amdgcn: AMD GCN parts for OpenACC private variables patch Julian Brown
2021-02-26 12:34 ` [PATCH 3/3] nvptx: NVPTX " Julian Brown
2021-05-21 18:59   ` Thomas Schwinge
