public inbox for fortran@gcc.gnu.org
* [gomp4] add support for fortran allocate support with declare create
@ 2017-04-05 15:24 Cesar Philippidis
  2017-04-06  9:05 ` Thomas Schwinge
  2022-11-02 20:04 ` Add 'libgomp.oacc-fortran/declare-allocatable-1.f90' (was: [gomp4] add support for fortran allocate support with declare create) Thomas Schwinge
  0 siblings, 2 replies; 9+ messages in thread
From: Cesar Philippidis @ 2017-04-05 15:24 UTC (permalink / raw)
  To: gcc-patches, Fortran List, Thomas Schwinge, Chung-Lin Tang

[-- Attachment #1: Type: text/plain, Size: 1557 bytes --]

This patch implements the OpenACC 2.5 behavior of Fortran allocate on
variables marked with declare create, as defined in Section 2.13.2 of the
OpenACC spec. To do so, I've added two new data mappings,
GOMP_MAP_DECLARE_ALLOCATE and GOMP_MAP_DECLARE_DEALLOCATE.
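
As a rough illustration of how the two new mapping kinds are composed
(a "declare" flag bit OR'ed onto a forced to/from transfer kind), here
is a minimal, standalone C sketch. It is not part of the patch. All
SKETCH_* names are hypothetical, and apart from the (1 << 6) value of
the new flag, the bit positions are assumptions that should be checked
against include/gomp-constants.h.

```c
#include <assert.h>

/* Hypothetical stand-ins for the gomp-constants.h flag bits.  Only the
   SPECIAL_4 value (1 << 6) is taken from the patch; the other bit
   positions are assumptions made for this sketch.  */
#define SKETCH_FLAG_TO        (1 << 0)
#define SKETCH_FLAG_FROM      (1 << 1)
#define SKETCH_FLAG_SPECIAL_4 (1 << 6)
#define SKETCH_FLAG_FORCE     (1 << 7)

enum sketch_map_kind
{
  SKETCH_FORCE_TO = SKETCH_FLAG_FORCE | SKETCH_FLAG_TO,
  SKETCH_FORCE_FROM = SKETCH_FLAG_FORCE | SKETCH_FLAG_FROM,
  SKETCH_DECLARE = SKETCH_FLAG_SPECIAL_4,
  SKETCH_DECLARE_ALLOCATE = SKETCH_DECLARE | SKETCH_FORCE_TO,
  SKETCH_DECLARE_DEALLOCATE = SKETCH_DECLARE | SKETCH_FORCE_FROM
};

/* Classify a kind the way an enter/exit data entry point might:
   1 for "enter", 0 for "exit", -1 for anything this sketch does not
   know about.  Only the low byte carries the kind.  */
static int
sketch_is_data_enter (unsigned short kind)
{
  switch (kind & 0xff)
    {
    case SKETCH_FORCE_TO:
    case SKETCH_DECLARE_ALLOCATE:
      return 1;
    case SKETCH_FORCE_FROM:
    case SKETCH_DECLARE_DEALLOCATE:
      return 0;
    default:
      return -1;
    }
}

/* Strip the declare bit to recover the underlying transfer kind.
   The caller must pass a kind that has the declare bit set.  */
static int
sketch_base_kind (unsigned short kind)
{
  assert ((kind & 0xff) & SKETCH_DECLARE);
  return (kind & 0xff) & ~SKETCH_DECLARE;
}
```

Under these assumptions, sketch_is_data_enter (SKETCH_DECLARE_ALLOCATE)
classifies the kind as an "enter" and sketch_base_kind recovers the
forced "to" kind that it builds on.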

While working on adding support for allocate, I noticed that OpenACC
declare has a number of quirks. For starters, the Fortran FE wasn't
lowering declare clauses properly, so there was no way for omplower to
utilize them inside acc parallel regions.

Next, I think the "omp declare target" attribute was being improperly
assigned to all declared variables. The semantics of OpenACC declare
are slightly different from OpenMP. In OpenACC, declared variables may
have non-global lifetimes. Therefore, this patch relaxes the Fortran FE
to only apply "omp declare target" to OpenACC declared variables with
the device_resident clause (which specifies that only a device can have
a copy of a variable). This ultimately enabled the use of declared
variables inside update directives, which in turn enables additional
declare testing.

There is still some unimplemented functionality.

 * The C and C++ FEs should be updated with the same declare behavior,
   and we could use more declare test coverage in general.

 * Allocate only works on arrays, not scalar values.

 * This doesn't implement support for allocate as specified in Section
   2.13.1. That one involves adding malloc support inside acc routines
   and possibly other Fortran runtime changes.
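
For context on why the implemented part is tied to arrays: at allocate
time the runtime has to create device storage for the array data and
then make the device copy of the array descriptor point at it. The
following host-only C sketch models that sequence with malloc standing
in for device memory. It is only an illustration, not the libgomp
code. The sketch_* types and functions are hypothetical, and the
descriptor is reduced to a data pointer plus an element count.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified array descriptor.  A real gfortran
   descriptor carries more fields (offset, dtype, bounds).  */
struct sketch_desc
{
  void *data;
  size_t nelems;
};

/* Simulated device-side state for one declared array.  dev_desc stands
   for the descriptor copy assumed to be placed on the device already by
   the declare directive; dev_data is the array storage.  */
struct sketch_mapping
{
  struct sketch_desc *dev_desc;
  void *dev_data;
};

/* Model of the allocate path: create the "device" buffer, refresh the
   device descriptor from the host one, then overwrite its data pointer
   so that it refers to the device buffer instead of the host buffer.
   Returns 0 on success, -1 if the buffer could not be created.  */
static int
sketch_declare_allocate (struct sketch_mapping *m,
                         const struct sketch_desc *host, size_t elem_size)
{
  assert (m->dev_desc != NULL);

  m->dev_data = malloc (host->nelems * elem_size);
  if (m->dev_data == NULL)
    return -1;

  memcpy (m->dev_desc, host, sizeof *host);
  m->dev_desc->data = m->dev_data;
  return 0;
}

/* Model of the deallocate path: release the "device" buffer and clear
   the stale pointer in the device descriptor.  */
static void
sketch_declare_deallocate (struct sketch_mapping *m)
{
  free (m->dev_data);
  m->dev_data = NULL;
  m->dev_desc->data = NULL;
}
```

A caller would set up a host descriptor and a mapping whose dev_desc
points at a zeroed descriptor, call sketch_declare_allocate at allocate
time, and call sketch_declare_deallocate before the host deallocate. A
scalar has no descriptor to patch in this model, which is one way to
picture the array-only limitation listed above.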

I've applied this patch to gomp-4_0-branch.

Cesar


[-- Attachment #2: gomp4-declare-allocatable.diff --]
[-- Type: text/x-patch, Size: 19403 bytes --]

2017-04-05  Cesar Philippidis  <cesar@codesourcery.com>

	gcc/fortran/
	* gfortran.h (enum gfc_omp_map_op): Add OMP_MAP_DECLARE_ALLOCATE,
	OMP_MAP_DECLARE_DEALLOCATE.
	* openmp.c (gfc_match_oacc_declare): Add support for OMP_MAP_ALLOC and
	OMP_MAP_TO, as those match the OpenACC 2.5 semantics.
	* trans-array.c (gfc_array_allocate): Call
	gfc_trans_oacc_declare_allocate for decls with oacc_declare_create
	attributes set.
	(gfc_array_deallocate): Likewise.
	* trans-decl.c (add_attributes_to_decl): Enable lowering of OpenACC
	declared create, copyin and deviceptr clauses.
	(add_clause): Don't duplicate OpenACC declare clauses.
	(find_module_oacc_declare_clauses): Relax oacc_declare_create to
	OMP_MAP_ALLOC, and oacc_declare_copyin to OMP_MAP_TO.  This matches
	the OpenACC 2.5 semantics.
	* trans-openmp.c (gfc_trans_omp_clauses_1): Handle
	OMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
	(gfc_trans_oacc_declare_allocate): New function.
	* trans-stmt.h: Declare gfc_trans_oacc_declare_allocate.

	gcc/
	* omp-low.c (scan_sharing_clauses): Update handling of OpenACC declare
	create, copyin and deviceptr to have local lifetimes.
	(lower_omp_target): Handle GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
	* tree-pretty-print.c (dump_omp_clause): Likewise.

	gcc/testsuite/
	* gfortran.dg/goacc/declare-allocatable-1.f90: New test.

	include/
	* gomp-constants.h (enum gomp_map_kind): Define GOMP_MAP_DECLARE,
	GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.

	libgomp/
	* libgomp.h: Declare gomp_acc_declare_allocate.
	* oacc-mem.c (gomp_acc_declare_allocate): New function.
	* oacc-parallel.c (GOACC_enter_exit_data): Handle
	GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
	* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: New test.


diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 2adbe4c..75217c7 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1156,7 +1156,9 @@ enum gfc_omp_map_op
   OMP_MAP_RELEASE,
   OMP_MAP_ALWAYS_TO,
   OMP_MAP_ALWAYS_FROM,
-  OMP_MAP_ALWAYS_TOFROM
+  OMP_MAP_ALWAYS_TOFROM,
+  OMP_MAP_DECLARE_ALLOCATE,
+  OMP_MAP_DECLARE_DEALLOCATE
 };
 
 enum gfc_omp_linear_op
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 354e6ff..31e4885 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -2233,10 +2233,12 @@ gfc_match_oacc_declare (void)
       switch (n->u.map_op)
 	{
 	  case OMP_MAP_FORCE_ALLOC:
+	  case OMP_MAP_ALLOC:
 	    s->attr.oacc_declare_create = 1;
 	    break;
 
 	  case OMP_MAP_FORCE_TO:
+	  case OMP_MAP_TO:
 	    s->attr.oacc_declare_copyin = 1;
 	    break;
 
diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 278eaff..de84a67 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -88,6 +88,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "trans-types.h"
 #include "trans-array.h"
 #include "trans-const.h"
+#include "trans-stmt.h"
 #include "dependency.h"
 
 static bool gfc_get_array_constructor_size (mpz_t *, gfc_constructor_base);
@@ -5394,6 +5395,7 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
   gfc_expr **upper;
   gfc_ref *ref, *prev_ref = NULL;
   bool allocatable, coarray, dimension, alloc_w_e3_arr_spec = false;
+  bool oacc_declare = false;
 
   ref = expr->ref;
 
@@ -5408,6 +5410,7 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
       allocatable = expr->symtree->n.sym->attr.allocatable;
       coarray = expr->symtree->n.sym->attr.codimension;
       dimension = expr->symtree->n.sym->attr.dimension;
+      oacc_declare = expr->symtree->n.sym->attr.oacc_declare_create;
     }
   else
     {
@@ -5540,7 +5543,12 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
 
   /* Update the array descriptors.  */
   if (dimension)
-    gfc_conv_descriptor_offset_set (&set_descriptor_block, se->expr, offset);
+    {
+      gfc_conv_descriptor_offset_set (&set_descriptor_block, se->expr, offset);
+
+      if (oacc_declare)
+	gfc_trans_oacc_declare_allocate (&set_descriptor_block, expr, true);
+    }
 
   set_descriptor = gfc_finish_block (&set_descriptor_block);
   if (status != NULL_TREE)
@@ -5581,6 +5589,7 @@ gfc_array_deallocate (tree descriptor, tree pstat, tree errmsg, tree errlen,
   tree tmp;
   stmtblock_t block;
   bool coarray = gfc_is_coarray (expr);
+  gfc_symbol *sym = expr->symtree->n.sym;
 
   gfc_start_block (&block);
 
@@ -5588,6 +5597,9 @@ gfc_array_deallocate (tree descriptor, tree pstat, tree errmsg, tree errlen,
   var = gfc_conv_descriptor_data_get (descriptor);
   STRIP_NOPS (var);
 
+  if (!coarray && sym->attr.oacc_declare_create)
+    gfc_trans_oacc_declare_allocate (&block, expr, false);
+
   /* Parameter is the address of the data component.  */
   tmp = gfc_deallocate_with_status (coarray ? descriptor : var, pstat, errmsg,
 				    errlen, label_finish, false, expr, coarray);
diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 109bdf7..b4db6b0 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -1324,10 +1324,10 @@ add_attributes_to_decl (symbol_attribute sym_attr, tree list)
       }
 
   if (sym_attr.omp_declare_target
-#if 0 /* TODO */
       || sym_attr.oacc_declare_create
       || sym_attr.oacc_declare_copyin
       || sym_attr.oacc_declare_deviceptr
+#if 0 /* TODO */
       || sym_attr.oacc_declare_device_resident
 #endif
       )
@@ -5932,13 +5932,17 @@ add_clause (gfc_symbol *sym, gfc_omp_map_op map_op)
 {
   gfc_omp_namelist *n;
 
+  if (!module_oacc_clauses)
+    module_oacc_clauses = gfc_get_omp_clauses ();
+
+  for (n = module_oacc_clauses->lists[OMP_LIST_MAP]; n != NULL; n = n->next)
+    if (n->sym->backend_decl == sym->backend_decl)
+      return;
+
   n = gfc_get_omp_namelist ();
   n->sym = sym;
   n->u.map_op = map_op;
 
-  if (!module_oacc_clauses)
-    module_oacc_clauses = gfc_get_omp_clauses ();
-
   if (module_oacc_clauses->lists[OMP_LIST_MAP])
     n->next = module_oacc_clauses->lists[OMP_LIST_MAP];
 
@@ -5954,10 +5958,10 @@ find_module_oacc_declare_clauses (gfc_symbol *sym)
       gfc_omp_map_op map_op;
 
       if (sym->attr.oacc_declare_create)
-	map_op = OMP_MAP_FORCE_ALLOC;
+	map_op = OMP_MAP_ALLOC;
 
       if (sym->attr.oacc_declare_copyin)
-	map_op = OMP_MAP_FORCE_TO;
+	map_op = OMP_MAP_TO;
 
       if (sym->attr.oacc_declare_deviceptr)
 	map_op = OMP_MAP_FORCE_DEVICEPTR;
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 17984bb..ba738a9 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -2360,6 +2360,12 @@ gfc_trans_omp_clauses_1 (stmtblock_t *block, gfc_omp_clauses *clauses,
 		case OMP_MAP_FORCE_DEVICEPTR:
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_FORCE_DEVICEPTR);
 		  break;
+		case OMP_MAP_DECLARE_ALLOCATE:
+		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DECLARE_ALLOCATE);
+		  break;
+		case OMP_MAP_DECLARE_DEALLOCATE:
+		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DECLARE_DEALLOCATE);
+		  break;
 		default:
 		  gcc_unreachable ();
 		}
@@ -5369,6 +5375,41 @@ gfc_trans_oacc_declare (gfc_code *code)
   return gfc_finish_block (&block);
 }
 
+/* Create an OpenACC enter or exit data construct for an OpenACC declared
+   variable that has been allocated or deallocated.  */
+
+tree
+gfc_trans_oacc_declare_allocate (stmtblock_t *block, gfc_expr *expr,
+				 bool allocate)
+{
+  gfc_omp_clauses *clauses = gfc_get_omp_clauses ();
+  gfc_omp_namelist *p = gfc_get_omp_namelist ();
+  tree oacc_clauses, stmt;
+  enum tree_code construct_code;
+
+  p->sym = expr->symtree->n.sym;
+  p->where = expr->where;
+
+  if (allocate)
+    {
+      p->u.map_op = OMP_MAP_DECLARE_ALLOCATE;
+      construct_code = OACC_ENTER_DATA;
+    }
+  else
+    {
+      p->u.map_op = OMP_MAP_DECLARE_DEALLOCATE;
+      construct_code = OACC_EXIT_DATA;
+    }
+  clauses->lists[OMP_LIST_MAP] = p;
+
+  oacc_clauses = gfc_trans_omp_clauses (block, clauses, expr->where);
+  stmt = build1_loc (input_location, construct_code, void_type_node,
+		     oacc_clauses);
+  gfc_add_expr_to_block (block, stmt);
+
+  return stmt;
+}
+
 tree
 gfc_trans_oacc_directive (gfc_code *code)
 {
diff --git a/gcc/fortran/trans-stmt.h b/gcc/fortran/trans-stmt.h
index 6ca0c1b..aed3214 100644
--- a/gcc/fortran/trans-stmt.h
+++ b/gcc/fortran/trans-stmt.h
@@ -67,6 +67,7 @@ tree gfc_trans_omp_directive (gfc_code *);
 void gfc_trans_omp_declare_simd (gfc_namespace *);
 tree gfc_trans_oacc_directive (gfc_code *);
 tree gfc_trans_oacc_declare (gfc_namespace *);
+tree gfc_trans_oacc_declare_allocate (stmtblock_t *, gfc_expr *, bool);
 
 /* trans-io.c */
 tree gfc_trans_open (gfc_code *);
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index f98fa54..a584a44 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -2265,7 +2265,8 @@ scan_sharing_clauses (tree clauses, omp_context *ctx,
 	      && is_global_var (maybe_lookup_decl_in_outer_ctx (decl, ctx))
 	      && varpool_node::get_create (decl)->offloadable
 	      && !lookup_attribute ("omp declare target link",
-				    DECL_ATTRIBUTES (decl)))
+				    DECL_ATTRIBUTES (decl))
+	      && !(is_gimple_omp_oacc (ctx->stmt)))
 	    break;
 	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 	      && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_POINTER)
@@ -16740,6 +16741,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	  case GOMP_MAP_FORCE_PRESENT:
 	  case GOMP_MAP_FORCE_DEVICEPTR:
 	  case GOMP_MAP_DEVICE_RESIDENT:
+	  case GOMP_MAP_DECLARE_ALLOCATE:
+	  case GOMP_MAP_DECLARE_DEALLOCATE:
 	  case GOMP_MAP_DYNAMIC_ARRAY_TO:
 	  case GOMP_MAP_DYNAMIC_ARRAY_FROM:
 	  case GOMP_MAP_DYNAMIC_ARRAY_TOFROM:
diff --git a/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90 b/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90
new file mode 100644
index 0000000..9195055
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90
@@ -0,0 +1,25 @@
+! Verify that OpenACC declared allocatable arrays have implicit
+! OpenACC enter and exit pragmas at the time of allocation and
+! deallocation.
+
+! { dg-additional-options "-fdump-tree-original" }
+
+program allocate
+  implicit none
+  integer, allocatable :: a(:)
+  integer, parameter :: n = 100
+  integer i
+  !$acc declare create(a)
+
+  allocate (a(n))
+
+  !$acc parallel loop copyout(a)
+  do i = 1, n
+     a(i) = i
+  end do
+
+  deallocate (a)
+end program allocate
+
+! { dg-final { scan-tree-dump-times "pragma acc enter data map.declare_allocate" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "pragma acc exit data map.declare_deallocate" 1 "gimple" } }
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 16efdc4..aae409d 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -764,6 +764,12 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, int flags)
 	case GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT:
 	  pp_string (pp, "force_present,dynamic_array");
 	  break;
+	case GOMP_MAP_DECLARE_ALLOCATE:
+	  pp_string (pp, "declare_allocate");
+	  break;
+	case GOMP_MAP_DECLARE_DEALLOCATE:
+	  pp_string (pp, "declare_deallocate");
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
diff --git a/include/gomp-constants.h b/include/gomp-constants.h
index e60d07d..8f17f78 100644
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -41,6 +41,7 @@
 #define GOMP_MAP_FLAG_SPECIAL_1		(1 << 3)
 #define GOMP_MAP_FLAG_SPECIAL_2		(1 << 4)
 #define GOMP_MAP_FLAG_SPECIAL_3		(1 << 5)
+#define GOMP_MAP_FLAG_SPECIAL_4		(1 << 6)
 #define GOMP_MAP_FLAG_SPECIAL		(GOMP_MAP_FLAG_SPECIAL_1 \
 					 | GOMP_MAP_FLAG_SPECIAL_0)
 /* Flag to force a specific behavior (or else, trigger a run-time error).  */
@@ -149,6 +150,12 @@ enum gomp_map_kind
 						 | GOMP_MAP_FORCE_ALLOC),
     GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT =	(GOMP_MAP_DYNAMIC_ARRAY
 						 | GOMP_MAP_FORCE_PRESENT),
+    /* Mapping kinds for allocatable arrays.  */
+    GOMP_MAP_DECLARE =			(GOMP_MAP_FLAG_SPECIAL_4),
+    GOMP_MAP_DECLARE_ALLOCATE =		(GOMP_MAP_DECLARE
+					 | GOMP_MAP_FORCE_TO),
+    GOMP_MAP_DECLARE_DEALLOCATE =	(GOMP_MAP_DECLARE
+					 | GOMP_MAP_FORCE_FROM),
     /* Internal to GCC, not used in libgomp.  */
     /* Do not map, but pointer assign a pointer instead.  */
     GOMP_MAP_FIRSTPRIVATE_POINTER =	(GOMP_MAP_LAST | 1),
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index f4bfc06..31cb103 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -974,6 +974,8 @@ enum gomp_map_vars_kind
 
 extern void gomp_acc_insert_pointer (size_t, void **, size_t *, void *);
 extern void gomp_acc_remove_pointer (void *, bool, int, int);
+extern void gomp_acc_declare_allocate (bool, size_t, void **, size_t *,
+				       unsigned short *);
 
 extern struct target_mem_desc *gomp_map_vars (struct gomp_device_descr *,
 					      size_t, void **, void **,
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 6cff777..14b39b2 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -704,6 +704,34 @@ acc_update_self_async (void *h, size_t s, int async)
 }
 
 void
+gomp_acc_declare_allocate (bool allocate, size_t mapnum, void **hostaddrs,
+			   size_t *sizes, unsigned short *kinds)
+{
+  gomp_debug (0, "  %s: processing\n", __FUNCTION__);
+
+  if (allocate)
+    {
+      assert (mapnum == 3);
+
+      /* Allocate memory for the array data.  */
+      uintptr_t data = (uintptr_t) acc_create (hostaddrs[0], sizes[0]);
+
+      /* Update the PSET.  */
+      acc_update_device (hostaddrs[1], sizes[1]);
+      void *pset = acc_deviceptr (hostaddrs[1]);
+      acc_memcpy_to_device (pset, &data, sizeof (uintptr_t));
+    }
+  else
+    {
+      /* Deallocate memory for the array data.  */
+      void *data = acc_deviceptr (hostaddrs[0]);
+      acc_free (data);
+    }
+
+  gomp_debug (0, "  %s: end\n", __FUNCTION__);
+}
+
+void
 gomp_acc_insert_pointer (size_t mapnum, void **hostaddrs, size_t *sizes,
 			 void *kinds)
 {
diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
index 36e2431..d6ced64 100644
--- a/libgomp/oacc-parallel.c
+++ b/libgomp/oacc-parallel.c
@@ -484,14 +484,16 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 	  || kind == GOMP_MAP_FORCE_PRESENT
 	  || kind == GOMP_MAP_FORCE_TO
 	  || kind == GOMP_MAP_TO
-	  || kind == GOMP_MAP_ALLOC)
+	  || kind == GOMP_MAP_ALLOC
+	  || kind == GOMP_MAP_DECLARE_ALLOCATE)
 	{
 	  data_enter = true;
 	  break;
 	}
 
       if (kind == GOMP_MAP_DELETE
-	  || kind == GOMP_MAP_FORCE_FROM)
+	  || kind == GOMP_MAP_FORCE_FROM
+	  || kind == GOMP_MAP_DECLARE_DEALLOCATE)
 	break;
 
       gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
@@ -540,7 +542,10 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 	    }
 	  else
 	    {
-	      if (!acc_is_present (hostaddrs[i], sizes[i]))
+	      if (kind == GOMP_MAP_DECLARE_ALLOCATE)
+		gomp_acc_declare_allocate (true, pointer, &hostaddrs[i],
+					   &sizes[i], &kinds[i]);
+	      else if (!acc_is_present (hostaddrs[i], sizes[i]))
 		{
 		  gomp_acc_insert_pointer (pointer, &hostaddrs[i],
 					   &sizes[i], &kinds[i]);
@@ -579,7 +584,10 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 	  }
 	else
 	  {
-	    if (acc_is_present (hostaddrs[i], sizes[i]))
+	    if (kind == GOMP_MAP_DECLARE_DEALLOCATE)
+	      gomp_acc_declare_allocate (false, pointer, &hostaddrs[i],
+					 &sizes[i], &kinds[i]);
+	    else if (acc_is_present (hostaddrs[i], sizes[i]))
 	      {
 		gomp_acc_remove_pointer (hostaddrs[i], (kinds[i] & 0xff)
 					 == GOMP_MAP_FORCE_FROM, async,
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
new file mode 100644
index 0000000..5167dee
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
@@ -0,0 +1,211 @@
+! Test declare create with allocatable arrays.
+
+! { dg-do run }
+
+module vars
+  implicit none
+  integer, parameter :: n = 100
+  real*8, allocatable :: b(:)
+ !$acc declare create (b)
+end module vars
+
+program test
+  use vars
+  use openacc
+  implicit none
+  real*8 :: a
+  integer :: i
+
+  interface
+     subroutine sub1
+       !$acc routine gang
+     end subroutine sub1
+
+     subroutine sub2
+     end subroutine sub2
+
+     real*8 function fun1 (ix)
+       integer ix
+       !$acc routine seq
+     end function fun1
+
+     real*8 function fun2 (ix)
+       integer ix
+       !$acc routine seq
+     end function fun2
+  end interface
+
+  if (allocated (b)) call abort
+
+  ! Test local usage of an allocated declared array.
+
+  allocate (b(n))
+
+  if (.not.allocated (b)) call abort
+  if (acc_is_present (b) .neqv. .true.) call abort
+
+  a = 2.0
+
+  !$acc parallel loop
+  do i = 1, n
+     b(i) = i * a
+  end do
+
+  if (.not.acc_is_present (b)) call abort
+
+  !$acc update host(b)
+
+  do i = 1, n
+     if (b(i) /= i*a) call abort
+  end do
+
+  deallocate (b)
+
+  ! Test the usage of an allocated declared array inside an acc
+  ! routine subroutine.
+
+  allocate (b(n))
+
+  if (.not.allocated (b)) call abort
+  if (acc_is_present (b) .neqv. .true.) call abort
+
+  !$acc parallel
+  call sub1
+  !$acc end parallel
+
+  if (.not.acc_is_present (b)) call abort
+
+  !$acc update host(b)
+
+  do i = 1, n
+     if (b(i) /= i*2) call abort
+  end do
+
+  deallocate (b)
+
+  ! Test the usage of an allocated declared array inside a host
+  ! subroutine.
+
+  call sub2
+
+  if (.not.acc_is_present (b)) call abort
+
+  !$acc update host(b)
+
+  do i = 1, n
+     if (b(i) /= 1.0) call abort
+  end do
+
+  deallocate (b)
+
+  if (allocated (b)) call abort
+
+  ! Test the usage of an allocated declared array inside an acc
+  ! routine function.
+
+  allocate (b(n))
+
+  if (.not.allocated (b)) call abort
+  if (acc_is_present (b) .neqv. .true.) call abort
+
+  !$acc parallel loop
+  do i = 1, n
+     b(i) = 1.0
+  end do
+
+  !$acc parallel loop
+  do i = 1, n
+     b(i) = fun1 (i)
+  end do
+
+  if (.not.acc_is_present (b)) call abort
+
+  !$acc update host(b)
+
+  do i = 1, n
+     if (b(i) /= i) call abort
+  end do
+
+  deallocate (b)
+
+  ! Test the usage of an allocated declared array inside a host
+  ! function.
+
+  allocate (b(n))
+
+  if (.not.allocated (b)) call abort
+  if (acc_is_present (b) .neqv. .true.) call abort
+
+  !$acc parallel loop
+  do i = 1, n
+     b(i) = 1.0
+  end do
+
+  !$acc update host(b)
+
+  do i = 1, n
+     b(i) = fun2 (i)
+  end do
+
+  if (.not.acc_is_present (b)) call abort
+
+  do i = 1, n
+     if (b(i) /= i*i) call abort
+  end do
+
+  deallocate (b)
+end program test
+
+! Set each element in array 'b' at index i to i*2.
+
+subroutine sub1 ! { dg-warning "region is worker partitioned" }
+  use vars
+  implicit none
+  integer i
+  !$acc routine gang
+
+  !$acc loop
+  do i = 1, n
+     b(i) = i*2
+  end do
+end subroutine sub1
+
+! Allocate array 'b', and set it to all 1.0.
+
+subroutine sub2
+  use vars
+  use openacc
+  implicit none
+  integer i
+
+  allocate (b(n))
+
+  if (.not.allocated (b)) call abort
+  if (acc_is_present (b) .neqv. .true.) call abort
+
+  !$acc parallel loop
+  do i = 1, n
+     b(i) = 1.0
+  end do
+end subroutine sub2
+
+! Return b(i) * i;
+
+real*8 function fun1 (i)
+  use vars
+  implicit none
+  integer i
+  !$acc routine seq
+
+  fun1 = b(i) * i
+end function fun1
+
+! Return b(i) * i * i;
+
+real*8 function fun2 (i)
+  use vars
+  implicit none
+  integer i
+
+  fun2 = b(i) * i * i
+end function fun2

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [gomp4] add support for fortran allocate support with declare create
  2017-04-05 15:24 [gomp4] add support for fortran allocate support with declare create Cesar Philippidis
@ 2017-04-06  9:05 ` Thomas Schwinge
  2017-04-06 14:26   ` Cesar Philippidis
  2022-11-02 20:04 ` Add 'libgomp.oacc-fortran/declare-allocatable-1.f90' (was: [gomp4] add support for fortran allocate support with declare create) Thomas Schwinge
  1 sibling, 1 reply; 9+ messages in thread
From: Thomas Schwinge @ 2017-04-06  9:05 UTC (permalink / raw)
  To: Cesar Philippidis; +Cc: gcc-patches, Fortran List, Chung-Lin Tang

Hi Cesar!

On Wed, 5 Apr 2017 08:23:58 -0700, Cesar Philippidis <cesar@codesourcery.com> wrote:
> This patch implements the OpenACC 2.5 behavior of fortran allocate on
> variables marked with declare create as defined in Section 2.13.2 in the
> OpenACC spec.

Thanks!


> While working on adding support for allocate, I noticed that OpenACC
> declare has a number of quirks. For starters, the fortran FE wasn't
> lowering them properly, so there was no way for omplower to utilize them
> inside acc parallel regions.

> There is still some unimplemented functionality.
> [...]

File (at least some of these?) as separate issues, I guess?


> I've applied this patch to gomp-4_0-branch.

Not reviewed, but I noticed:

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90
> @@ -0,0 +1,25 @@
> +! Verify that OpenACC declared allocatable arrays have implicit
> +! OpenACC enter and exit pragmas at the time of allocation and
> +! deallocation.
> +
> +! { dg-additional-options "-fdump-tree-original" }
> +[...]
> +! { dg-final { scan-tree-dump-times "pragma acc enter data map.declare_allocate" 1 "gimple" } }
> +! { dg-final { scan-tree-dump-times "pragma acc exit data map.declare_deallocate" 1 "gimple" } }

    UNRESOLVED: gfortran.dg/goacc/declare-allocatable-1.f90   -O   scan-tree-dump-times gimple "pragma acc enter data map.declare_allocate" 1
    UNRESOLVED: gfortran.dg/goacc/declare-allocatable-1.f90   -O   scan-tree-dump-times gimple "pragma acc exit data map.declare_deallocate" 1
    PASS: gfortran.dg/goacc/declare-allocatable-1.f90   -O  (test for excess errors)

"original" vs. "gimple" -- which one should it be?


Grüße
 Thomas


* Re: [gomp4] add support for fortran allocate support with declare create
  2017-04-06  9:05 ` Thomas Schwinge
@ 2017-04-06 14:26   ` Cesar Philippidis
  0 siblings, 0 replies; 9+ messages in thread
From: Cesar Philippidis @ 2017-04-06 14:26 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: gcc-patches, Fortran List, Chung-Lin Tang

[-- Attachment #1: Type: text/plain, Size: 1259 bytes --]

On 04/06/2017 02:05 AM, Thomas Schwinge wrote:

>> --- /dev/null
>> +++ b/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90
>> @@ -0,0 +1,25 @@
>> +! Verify that OpenACC declared allocatable arrays have implicit
>> +! OpenACC enter and exit pragmas at the time of allocation and
>> +! deallocation.
>> +
>> +! { dg-additional-options "-fdump-tree-original" }
>> +[...]
>> +! { dg-final { scan-tree-dump-times "pragma acc enter data map.declare_allocate" 1 "gimple" } }
>> +! { dg-final { scan-tree-dump-times "pragma acc exit data map.declare_deallocate" 1 "gimple" } }
> 
>     UNRESOLVED: gfortran.dg/goacc/declare-allocatable-1.f90   -O   scan-tree-dump-times gimple "pragma acc enter data map.declare_allocate" 1
>     UNRESOLVED: gfortran.dg/goacc/declare-allocatable-1.f90   -O   scan-tree-dump-times gimple "pragma acc exit data map.declare_deallocate" 1
>     PASS: gfortran.dg/goacc/declare-allocatable-1.f90   -O  (test for excess errors)
> 
> "original" vs. "gimple" -- which one should it be?

I'm bad at noticing new unresolved test cases.

It could be either, but I changed it to "original" to ensure that the
Fortran FE inserts those acc enter/exit data directives appropriately.

This patch has been committed to gomp-4_0-branch.

Cesar

[-- Attachment #2: gomp4-declare-test.diff --]
[-- Type: text/x-patch, Size: 896 bytes --]

2017-04-06  Cesar Philippidis  <cesar@codesourcery.com>

	gcc/testsuite/
	* gfortran.dg/goacc/declare-allocatable-1.f90: Correct test.

diff --git a/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90 b/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90
index 9195055..b6bb6b3 100644
--- a/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/declare-allocatable-1.f90
@@ -21,5 +21,5 @@ program allocate
   deallocate (a)
 end program allocate
 
-! { dg-final { scan-tree-dump-times "pragma acc enter data map.declare_allocate" 1 "gimple" } }
-! { dg-final { scan-tree-dump-times "pragma acc exit data map.declare_deallocate" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "pragma acc enter data map.declare_allocate" 1 "original" } }
+! { dg-final { scan-tree-dump-times "pragma acc exit data map.declare_deallocate" 1 "original" } }


* Add 'libgomp.oacc-fortran/declare-allocatable-1.f90' (was: [gomp4] add support for fortran allocate support with declare create)
  2017-04-05 15:24 [gomp4] add support for fortran allocate support with declare create Cesar Philippidis
  2017-04-06  9:05 ` Thomas Schwinge
@ 2022-11-02 20:04 ` Thomas Schwinge
  2022-11-02 20:10   ` Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90' (was: Add 'libgomp.oacc-fortran/declare-allocatable-1.f90') Thomas Schwinge
       [not found]   ` <1ECCE9C8-0CE9-46EA-A0F2-3F3FA50F4681@gmail.com>
  1 sibling, 2 replies; 9+ messages in thread
From: Thomas Schwinge @ 2022-11-02 20:04 UTC (permalink / raw)
  To: gcc-patches, fortran

[-- Attachment #1: Type: text/plain, Size: 1382 bytes --]

Hi!

On 2017-04-05T08:23:58-0700, Cesar Philippidis <cesar@codesourcery.com> wrote:
> This patch implements the OpenACC 2.5 behavior of fortran allocate on
> variables marked with declare create as defined in Section 2.13.2 in the
> OpenACC spec.

That functionality is still missing in the GCC master branch; however, a
test case included in that submission:

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
> @@ -0,0 +1,211 @@
> +! Test declare create with allocatable arrays.

... is useful in a different (though related) context that I'm currently
working on.  Having applied the following changes:

  - Replace 'call abort' by 'error stop' (in spirit of earlier PR84381
    changes).
  - Replace '[logical] .neqv. .true.' by '.not.[logical]'.
  - Add scanning for OpenACC compiler diagnostics.
  - 'dg-xfail-run-if' for '-DACC_MEM_SHARED=0' (see above).

..., I've then pushed to master branch
commit 8c357d884b16cb3c14cba8a61be5b53fd04a6bfe
"Add 'libgomp.oacc-fortran/declare-allocatable-1.f90'", see attached.


Grüße
 Thomas



[-- Attachment #2: 0001-Add-libgomp.oacc-fortran-declare-allocatable-1.f90.patch --]
[-- Type: text/x-diff, Size: 9987 bytes --]

From 8c357d884b16cb3c14cba8a61be5b53fd04a6bfe Mon Sep 17 00:00:00 2001
From: Cesar Philippidis <cesar@codesourcery.com>
Date: Wed, 5 Apr 2017 08:23:58 -0700
Subject: [PATCH] Add 'libgomp.oacc-fortran/declare-allocatable-1.f90'

	libgomp/
	* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: New.

Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
---
 .../declare-allocatable-1.f90                 | 268 ++++++++++++++++++
 1 file changed, 268 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
new file mode 100644
index 00000000000..1c8ccd9f61f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
@@ -0,0 +1,268 @@
+! Test OpenACC 'declare create' with allocatable arrays.
+
+! { dg-do run }
+
+!TODO-OpenACC-declare-allocate
+! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
+! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
+! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
+! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
+
+!TODO { dg-additional-options -fno-inline } for stable results regarding OpenACC 'routine'.
+
+! { dg-additional-options -fopt-info-all-omp }
+! { dg-additional-options -foffload=-fopt-info-all-omp }
+
+! { dg-additional-options --param=openacc-privatization=noisy }
+! { dg-additional-options -foffload=--param=openacc-privatization=noisy }
+! Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+! { dg-prune-output {note: variable '[Di]\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} }
+
+! { dg-additional-options -Wopenacc-parallelism }
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c 0] }
+! { dg-message dummy {} { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
+
+
+module vars
+  implicit none
+  integer, parameter :: n = 100
+  real*8, allocatable :: b(:)
+ !$acc declare create (b)
+end module vars
+
+program test
+  use vars
+  use openacc
+  implicit none
+  real*8 :: a
+  integer :: i
+
+  interface
+     subroutine sub1
+       !$acc routine gang
+     end subroutine sub1
+
+     subroutine sub2
+     end subroutine sub2
+
+     real*8 function fun1 (ix)
+       integer ix
+       !$acc routine seq
+     end function fun1
+
+     real*8 function fun2 (ix)
+       integer ix
+       !$acc routine seq
+     end function fun2
+  end interface
+
+  if (allocated (b)) error stop
+
+  ! Test local usage of an allocated declared array.
+
+  allocate (b(n))
+
+  if (.not.allocated (b)) error stop
+  if (.not.acc_is_present (b)) error stop
+
+  a = 2.0
+
+  !$acc parallel loop ! { dg-line l[incr c] }
+  ! { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l$c }
+  !   { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'vector'} {} { target *-*-* } l$c }
+  !   { dg-note {variable 'i' adjusted for OpenACC privatization level: 'vector'} {} { target { ! openacc_host_selected } } l$c }
+  ! { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l$c }
+  ! { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l$c }
+  do i = 1, n
+     b(i) = i * a
+  end do
+
+  if (.not.acc_is_present (b)) error stop
+
+  !$acc update host(b)
+
+  do i = 1, n
+     if (b(i) /= i*a) error stop
+  end do
+
+  deallocate (b)
+
+  ! Test the usage of an allocated declared array inside an acc
+  ! routine subroutine.
+
+  allocate (b(n))
+
+  if (.not.allocated (b)) error stop
+  if (.not.acc_is_present (b)) error stop
+
+  !$acc parallel
+  call sub1 ! { dg-line l[incr c] }
+  ! { dg-optimized {assigned OpenACC gang worker vector loop parallelism} {} { target *-*-* } l$c }
+  !$acc end parallel
+
+  if (.not.acc_is_present (b)) error stop
+
+  !$acc update host(b)
+
+  do i = 1, n
+     if (b(i) /= i*2) error stop
+  end do
+
+  deallocate (b)
+
+  ! Test the usage of an allocated declared array inside a host
+  ! subroutine.
+
+  call sub2
+
+  if (.not.acc_is_present (b)) error stop
+
+  !$acc update host(b)
+
+  do i = 1, n
+     if (b(i) /= 1.0) error stop
+  end do
+
+  deallocate (b)
+
+  if (allocated (b)) error stop
+
+  ! Test the usage of an allocated declared array inside an acc
+  ! routine function.
+
+  allocate (b(n))
+
+  if (.not.allocated (b)) error stop
+  if (.not.acc_is_present (b)) error stop
+
+  !$acc parallel loop ! { dg-line l[incr c] }
+  ! { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l$c }
+  !   { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'vector'} {} { target *-*-* } l$c }
+  !   { dg-note {variable 'i' adjusted for OpenACC privatization level: 'vector'} {} { target { ! openacc_host_selected } } l$c }
+  ! { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l$c }
+  ! { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l$c }
+  do i = 1, n
+     b(i) = 1.0
+  end do
+
+  !$acc parallel loop ! { dg-line l[incr c] }
+  ! { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l$c }
+  !   { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'vector'} {} { target *-*-* } l$c }
+  !   { dg-note {variable 'i' adjusted for OpenACC privatization level: 'vector'} {} { target { ! openacc_host_selected } } l$c }
+  ! { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l$c }
+  ! { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l$c }
+  do i = 1, n
+     b(i) = fun1 (i) ! { dg-line l[incr c] }
+     ! { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l$c }
+  end do
+
+  if (.not.acc_is_present (b)) error stop
+
+  !$acc update host(b)
+
+  do i = 1, n
+     if (b(i) /= i) error stop
+  end do
+
+  deallocate (b)
+
+  ! Test the usage of an allocated declared array inside a host
+  ! function.
+
+  allocate (b(n))
+
+  if (.not.allocated (b)) error stop
+  if (.not.acc_is_present (b)) error stop
+
+  !$acc parallel loop ! { dg-line l[incr c] }
+  ! { dg-note {variable 'i' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l$c }
+  !   { dg-note {variable 'i' ought to be adjusted for OpenACC privatization level: 'vector'} {} { target *-*-* } l$c }
+  !   { dg-note {variable 'i' adjusted for OpenACC privatization level: 'vector'} {} { target { ! openacc_host_selected } } l$c }
+  ! { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l$c }
+  ! { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l$c }
+  do i = 1, n
+     b(i) = 1.0
+  end do
+
+  !$acc update host(b)
+
+  do i = 1, n
+     b(i) = fun2 (i)
+  end do
+
+  if (.not.acc_is_present (b)) error stop
+
+  do i = 1, n
+     if (b(i) /= i*i) error stop
+  end do
+
+  deallocate (b)
+end program test ! { dg-line l[incr c] }
+! { dg-bogus {note: variable 'overflow\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {TODO n/a} { xfail *-*-* } l$c }
+! { dg-bogus {note: variable 'not_prev_allocated\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {TODO n/a} { xfail *-*-* } l$c }
+! { dg-bogus {note: variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: artificial} {TODO n/a} { xfail *-*-* } l$c }
+
+! Set each element in array 'b' at index i to i*2.
+
+subroutine sub1 ! { dg-line subroutine_sub1 }
+  use vars
+  implicit none
+  integer i
+  !$acc routine gang
+  ! { dg-bogus {[Ww]arning: region is worker partitioned but does not contain worker partitioned code} {TODO default 'gang' 'vector'} { xfail *-*-* } subroutine_sub1 }
+
+  !$acc loop ! { dg-line l[incr c] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l$c }
+  ! { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l$c }
+  do i = 1, n
+     b(i) = i*2
+  end do
+end subroutine sub1
+
+! Allocate array 'b', and set it to all 1.0.
+
+subroutine sub2
+  use vars
+  use openacc
+  implicit none
+  integer i
+
+  allocate (b(n))
+
+  if (.not.allocated (b)) error stop
+  if (.not.acc_is_present (b)) error stop
+
+  !$acc parallel loop ! { dg-line l[incr c] }
+  ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l$c }
+  ! { dg-optimized {assigned OpenACC gang vector loop parallelism} {} { target *-*-* } l$c }
+  do i = 1, n
+     b(i) = 1.0
+  end do
+end subroutine sub2
+
+! Return b(i) * i;
+
+real*8 function fun1 (i)
+  use vars
+  implicit none
+  integer i
+  !$acc routine seq
+
+  fun1 = b(i) * i
+end function fun1
+
+! Return b(i) * i * i;
+
+real*8 function fun2 (i)
+  use vars
+  implicit none
+  integer i
+
+  fun2 = b(i) * i * i
+end function fun2
-- 
2.35.1



* Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90' (was: Add 'libgomp.oacc-fortran/declare-allocatable-1.f90')
  2022-11-02 20:04 ` Add 'libgomp.oacc-fortran/declare-allocatable-1.f90' (was: [gomp4] add support for fortran allocate support with declare create) Thomas Schwinge
@ 2022-11-02 20:10   ` Thomas Schwinge
  2022-11-02 20:15     ` Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90' Thomas Schwinge
       [not found]   ` <1ECCE9C8-0CE9-46EA-A0F2-3F3FA50F4681@gmail.com>
  1 sibling, 1 reply; 9+ messages in thread
From: Thomas Schwinge @ 2022-11-02 20:10 UTC (permalink / raw)
  To: gcc-patches, fortran

[-- Attachment #1: Type: text/plain, Size: 2279 bytes --]

Hi!

On 2022-11-02T21:04:56+0100, I wrote:
> On 2017-04-05T08:23:58-0700, Cesar Philippidis <cesar@codesourcery.com> wrote:
>> This patch implements the OpenACC 2.5 behavior of fortran allocate on
>> variables marked with declare create as defined in Section 2.13.2 in the
>> OpenACC spec.
>
> That functionality is still missing in GCC master branch, however a test
> case included in that submission here:
>
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
>> @@ -0,0 +1,211 @@
>> +! Test declare create with allocatable arrays.
>
> ... is useful in a different (though related) context that I'm currently
> working on.  Having applied the following changes:
>
>   - Replace 'call abort' by 'error stop' (in spirit of earlier PR84381
>     changes).
>   - Replace '[logical] .neqv. .true.' by '.not.[logical]'.
>   - Add scanning for OpenACC compiler diagnostics.
>   - 'dg-xfail-run-if' for '-DACC_MEM_SHARED=0' (see above).
>
> ..., I've then pushed to master branch
> commit 8c357d884b16cb3c14cba8a61be5b53fd04a6bfe
> "Add 'libgomp.oacc-fortran/declare-allocatable-1.f90'", see attached.

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
> @@ -0,0 +1,268 @@
> +! Test OpenACC 'declare create' with allocatable arrays.
> +
> +! { dg-do run }
> +
> +!TODO-OpenACC-declare-allocate
> +! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
> +! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
> +! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
> +
> +[...]

Getting rid of the "'dg-xfail-run-if' for '-DACC_MEM_SHARED=0'" via a
workaround (as seen in real-world code), I've pushed to master branch
commit 59c6c5dbf267cd9d0a8df72b2a5eb5657b64268e
"Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'", see
attached.
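
For illustration, the workaround can be sketched as follows.  This is a
minimal sketch and not the pushed test case: the names 'demo_vars' and
'demo' are made up here, and it assumes compilation with gfortran
'-fopenacc'.  The only point it shows is the pairing of 'acc_create' after
'allocate' and 'acc_delete' before 'deallocate'.

```fortran
! Hypothetical reduced example; 'demo_vars' and 'demo' are illustrative names.
module demo_vars
  implicit none
  integer, parameter :: n = 100
  double precision, allocatable :: b(:)
  !$acc declare create (b)
end module demo_vars

program demo
  use demo_vars
  use openacc
  implicit none
  integer :: i

  allocate (b(n))
  ! Create the device copy manually, as 'allocate' alone is assumed not
  ! to do so here.
  call acc_create (b)

  if (.not.acc_is_present (b)) error stop

  !$acc parallel loop
  do i = 1, n
     b(i) = i * 2.0d0
  end do

  !$acc update host (b)

  do i = 1, n
     if (b(i) /= i * 2.0d0) error stop
  end do

  ! Delete the device copy manually before the host 'deallocate'.
  call acc_delete (b)
  deallocate (b)
end program demo
```

With shared-memory (host fallback) execution, the two calls are expected to
be harmless no-ops, so the same source should serve both configurations.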


Grüße
 Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Attachment #2: 0001-Add-libgomp.oacc-fortran-declare-allocatable-1-runti.patch --]
[-- Type: text/x-diff, Size: 3853 bytes --]

From 59c6c5dbf267cd9d0a8df72b2a5eb5657b64268e Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 14 Oct 2022 17:36:51 +0200
Subject: [PATCH] Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'

... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted
for missing support for OpenACC "Changes from Version 2.0 to 2.5":
"The 'declare create' directive with a Fortran 'allocatable' has new behavior".
Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
manually.

	libgomp/
	* testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90:
	New.
---
 ...ble-1.f90 => declare-allocatable-1-runtime.f90} | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
 copy libgomp/testsuite/libgomp.oacc-fortran/{declare-allocatable-1.f90 => declare-allocatable-1-runtime.f90} (96%)

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90
similarity index 96%
copy from libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
copy to libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90
index 1c8ccd9f61f..e4cb9c378a3 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90
@@ -3,10 +3,10 @@
 ! { dg-do run }
 
 !TODO-OpenACC-declare-allocate
-! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
 ! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
 ! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
-! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
+! Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
+! manually.
 
 !TODO { dg-additional-options -fno-inline } for stable results regarding OpenACC 'routine'.
 
@@ -67,6 +67,7 @@ program test
   ! Test local usage of an allocated declared array.
 
   allocate (b(n))
+  call acc_create (b)
 
   if (.not.allocated (b)) error stop
   if (.not.acc_is_present (b)) error stop
@@ -91,12 +92,14 @@ program test
      if (b(i) /= i*a) error stop
   end do
 
+  call acc_delete (b)
   deallocate (b)
 
   ! Test the usage of an allocated declared array inside an acc
   ! routine subroutine.
 
   allocate (b(n))
+  call acc_create (b)
 
   if (.not.allocated (b)) error stop
   if (.not.acc_is_present (b)) error stop
@@ -114,6 +117,7 @@ program test
      if (b(i) /= i*2) error stop
   end do
 
+  call acc_delete (b)
   deallocate (b)
 
   ! Test the usage of an allocated declared array inside a host
@@ -129,6 +133,7 @@ program test
      if (b(i) /= 1.0) error stop
   end do
 
+  call acc_delete (b)
   deallocate (b)
 
   if (allocated (b)) error stop
@@ -137,6 +142,7 @@ program test
   ! routine function.
 
   allocate (b(n))
+  call acc_create (b)
 
   if (.not.allocated (b)) error stop
   if (.not.acc_is_present (b)) error stop
@@ -170,12 +176,14 @@ program test
      if (b(i) /= i) error stop
   end do
 
+  call acc_delete (b)
   deallocate (b)
 
   ! Test the usage of an allocated declared array inside a host
   ! function.
 
   allocate (b(n))
+  call acc_create (b)
 
   if (.not.allocated (b)) error stop
   if (.not.acc_is_present (b)) error stop
@@ -202,6 +210,7 @@ program test
      if (b(i) /= i*i) error stop
   end do
 
+  call acc_delete (b)
   deallocate (b)
 end program test ! { dg-line l[incr c] }
 ! { dg-bogus {note: variable 'overflow\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {TODO n/a} { xfail *-*-* } l$c }
@@ -234,6 +243,7 @@ subroutine sub2
   integer i
 
   allocate (b(n))
+  call acc_create (b)
 
   if (.not.allocated (b)) error stop
   if (.not.acc_is_present (b)) error stop
-- 
2.35.1



* Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'
  2022-11-02 20:10   ` Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90' (was: Add 'libgomp.oacc-fortran/declare-allocatable-1.f90') Thomas Schwinge
@ 2022-11-02 20:15     ` Thomas Schwinge
  2022-11-02 20:22       ` Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643] Thomas Schwinge
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Schwinge @ 2022-11-02 20:15 UTC (permalink / raw)
  To: gcc-patches, fortran

[-- Attachment #1: Type: text/plain, Size: 2901 bytes --]

Hi!

On 2022-11-02T21:10:54+0100, I wrote:
> On 2022-11-02T21:04:56+0100, I wrote:
>> On 2017-04-05T08:23:58-0700, Cesar Philippidis <cesar@codesourcery.com> wrote:
>>> This patch implements the OpenACC 2.5 behavior of fortran allocate on
>>> variables marked with declare create as defined in Section 2.13.2 in the
>>> OpenACC spec.
>>
>> That functionality is still missing in GCC master branch, however a test
>> case included in that submission here:
>>
>>> --- /dev/null
>>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
>>> @@ -0,0 +1,211 @@
>>> +! Test declare create with allocatable arrays.
>>
>> ... is useful in a different (though related) context that I'm currently
>> working on.  Having applied the following changes:
>>
>>   - Replace 'call abort' by 'error stop' (in spirit of earlier PR84381
>>     changes).
>>   - Replace '[logical] .neqv. .true.' by '.not.[logical]'.
>>   - Add scanning for OpenACC compiler diagnostics.
>>   - 'dg-xfail-run-if' for '-DACC_MEM_SHARED=0' (see above).
>>
>> ..., I've then pushed to master branch
>> commit 8c357d884b16cb3c14cba8a61be5b53fd04a6bfe
>> "Add 'libgomp.oacc-fortran/declare-allocatable-1.f90'", see attached.
>
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
>> @@ -0,0 +1,268 @@
>> +! Test OpenACC 'declare create' with allocatable arrays.
>> +
>> +! { dg-do run }
>> +
>> +!TODO-OpenACC-declare-allocate
>> +! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
>> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
>> +! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
>> +! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
>> +
>> +[...]
>
> Getting rid of the "'dg-xfail-run-if' for '-DACC_MEM_SHARED=0'" via a
> work around (as seen in real-world code), I've pushed to master branch
> commit 59c6c5dbf267cd9d0a8df72b2a5eb5657b64268e
> "Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'"

> ... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted
> for missing support for OpenACC "Changes from Version 2.0 to 2.5":
> "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
> Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
> manually.

I've pushed a similar test case, but with a different focus, to master
branch in commit abeaf3735fe2568b9d5b8096318da866b1fe1e5c
"Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'",
see attached.


Grüße
 Thomas



[-- Attachment #2: 0001-Add-libgomp.oacc-fortran-declare-allocatable-array_d.patch --]
[-- Type: text/x-diff, Size: 16050 bytes --]

From abeaf3735fe2568b9d5b8096318da866b1fe1e5c Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Wed, 26 Oct 2022 23:47:29 +0200
Subject: [PATCH] Add
 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'

	libgomp/
	* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90:
	New.
---
 ...allocatable-array_descriptor-1-runtime.f90 | 402 ++++++++++++++++++
 1 file changed, 402 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90 b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90
new file mode 100644
index 00000000000..b27f312631d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90
@@ -0,0 +1,402 @@
+! Test OpenACC 'declare create' with allocatable arrays.
+
+! { dg-do run }
+
+! Note that we're not testing OpenACC semantics here, but rather documenting
+! current GCC behavior, specifically, behavior concerning updating of
+! host/device array descriptors.
+! { dg-skip-if n/a { *-*-* } { -DACC_MEM_SHARED=1 } }
+
+!TODO-OpenACC-declare-allocate
+! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
+! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
+! Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
+! manually.
+
+
+!TODO { dg-additional-options -fno-inline } for stable results regarding OpenACC 'routine'.
+
+
+!TODO OpenACC 'serial' vs. GCC/nvptx:
+!TODO { dg-prune-output {using 'vector_length \(32\)', ignoring 1} }
+
+
+! { dg-additional-options -fdump-tree-original }
+! { dg-additional-options -fdump-tree-gimple }
+
+
+module vars
+  implicit none
+  integer, parameter :: n1_lb = -3
+  integer, parameter :: n1_ub = 6
+  integer, parameter :: n2_lb = -9999
+  integer, parameter :: n2_ub = 22222
+
+  integer, allocatable :: b(:)
+  !$acc declare create (b)
+
+end module vars
+
+program test
+  use vars
+  use openacc
+  implicit none
+  integer :: i
+
+  ! Identifiers for purposes of reliable '-fdump-tree-[...]' scanning.
+  integer :: id1_1, id1_2
+
+  interface
+
+     subroutine verify_initial
+       implicit none
+       !$acc routine seq
+     end subroutine verify_initial
+
+     subroutine verify_n1_allocated
+       implicit none
+       !$acc routine seq
+     end subroutine verify_n1_allocated
+
+     subroutine verify_n1_values (addend)
+       implicit none
+       !$acc routine gang
+       integer, value :: addend
+     end subroutine verify_n1_values
+
+     subroutine verify_n1_deallocated (expect_allocated)
+       implicit none
+       !$acc routine seq
+       logical, value :: expect_allocated
+     end subroutine verify_n1_deallocated
+
+     subroutine verify_n2_allocated
+       implicit none
+       !$acc routine seq
+     end subroutine verify_n2_allocated
+
+     subroutine verify_n2_values (addend)
+       implicit none
+       !$acc routine gang
+       integer, value :: addend
+     end subroutine verify_n2_values
+
+     subroutine verify_n2_deallocated (expect_allocated)
+       implicit none
+       !$acc routine seq
+       logical, value :: expect_allocated
+     end subroutine verify_n2_deallocated
+
+  end interface
+
+  call acc_create (id1_1)
+  call acc_create (id1_2)
+
+  call verify_initial
+  ! It is important here (and similarly, following) that there is no data
+  ! clause for 'b' (explicit or implicit): no 'GOMP_MAP_TO_PSET'.
+  !$acc serial
+  call verify_initial
+  !$acc end serial
+
+  allocate (b(n1_lb:n1_ub))
+  call verify_n1_allocated
+  if (acc_is_present (b)) error stop
+  call acc_create (b)
+  ! This is now OpenACC "present":
+  if (.not.acc_is_present (b)) error stop
+  ! This still has the initial array descriptor:
+  !$acc serial
+  call verify_initial
+  !$acc end serial
+
+  do i = n1_lb, n1_ub
+     b(i) = i - 1
+  end do
+
+  ! Verify that host-to-device copy doesn't touch the device-side (still
+  ! initial) array descriptor (but it does copy the array data).
+  call acc_update_device (b)
+  !$acc serial
+  call verify_initial
+  !$acc end serial
+
+  b = 40
+
+  ! Verify that device-to-host copy doesn't touch the host-side array
+  ! descriptor, doesn't copy out the device-side (still initial) array
+  ! descriptor (but it does copy the array data).
+  call acc_update_self (b)
+  call verify_n1_allocated
+
+  do i = n1_lb, n1_ub
+     if (b(i) /= i - 1) error stop
+     b(i) = b(i) + 2
+  end do
+
+  ! The same using the OpenACC 'update' directive.
+
+  !$acc update device (b) self (id1_1)
+  ! We do have 'GOMP_MAP_TO_PSET' here:
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc update map\(force_to:\*\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[pointer assign, bias: 0\]\) map\(force_from:id1_1\);$} 1 original } }
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_update map\(force_to:MEM <integer\(kind=[0-9]+\)\[0:\]> \[\(integer\(kind=[0-9]+\)\[0:\] \*\)[^\]]+\] \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:b\.data \[pointer assign, bias: 0\]\) map\(force_from:id1_1 \[len: [0-9]+\]\)$} 1 gimple } }
+  ! ..., but it's silently skipped in 'GOACC_update'.
+  !$acc serial
+  call verify_initial
+  !$acc end serial
+
+  b = 41
+
+  !$acc update self (b) self (id1_2)
+  ! We do have 'GOMP_MAP_TO_PSET' here:
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc update map\(force_from:\*\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[pointer assign, bias: 0\]\) map\(force_from:id1_2\);$} 1 original } }
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_update map\(force_from:MEM <integer\(kind=[0-9]+\)\[0:\]> \[\(integer\(kind=[0-9]+\)\[0:\] \*\)[^\]]+\] \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:b\.data \[pointer assign, bias: 0\]\) map\(force_from:id1_2 \[len: [0-9]+\]\)$} 1 gimple } }
+  ! ..., but it's silently skipped in 'GOACC_update'.
+  call verify_n1_allocated
+
+  do i = n1_lb, n1_ub
+     if (b(i) /= i + 1) error stop
+     b(i) = b(i) + 2
+  end do
+
+  ! Now install the actual array descriptor, via a data clause for 'b'
+  ! (explicit or implicit): must get a 'GOMP_MAP_TO_PSET', which then in
+  ! 'gomp_map_vars_internal' is handled as 'declare target', and because of
+  ! '*(void **) hostaddrs[i] != NULL', we've got 'has_always_ptrset == true',
+  ! 'always_to_cnt == 1', and therefore 'gomp_map_vars_existing' does update
+  ! the 'GOMP_MAP_TO_PSET'.
+  !$acc serial present (b) copyin (id1_1)
+  call verify_initial
+  id1_1 = 0
+  !$acc end serial
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc serial map\(force_present:\*\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[pointer assign, bias: 0\]\) map\(to:id1_1\)$} 1 original } }
+  !TODO ..., but without an actual use of 'b', the gimplifier removes the
+  !TODO 'GOMP_MAP_TO_PSET':
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_serial map\(force_present:MEM <integer\(kind=[0-9]+\)\[0:\]> \[\(integer\(kind=[0-9]+\)\[0:\] \*\)[^\]]+\] \[len: [^\]]+\]\) map\(alloc:b\.data \[pointer assign, bias: 0\]\) map\(to:id1_1 \[len: [0-9]+\]\)$} 1 gimple } }
+  !$acc serial present (b) copyin (id1_2)
+  call verify_n1_allocated
+  !TODO Use of 'b':
+  id1_2 = ubound (b, 1)
+  !$acc end serial
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc serial map\(force_present:\*\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[pointer assign, bias: 0\]\) map\(to:id1_2\)$} 1 original } }
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_serial map\(force_present:MEM <integer\(kind=[0-9]+\)\[0:\]> \[\(integer\(kind=[0-9]+\)\[0:\] \*\)[^\]]+\] \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:b\.data \[pointer assign, bias: 0\]\) map\(to:id1_2 \[len: [0-9]+\]\)$} 1 gimple } }
+
+  !$acc parallel copyin (id1_1) ! No data clause for 'b' (explicit or implicit): no 'GOMP_MAP_TO_PSET'.
+  call verify_n1_values (1)
+  id1_1 = 0
+  !$acc end parallel
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc parallel map\(to:id1_1\)$} 1 original } }
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_parallel map\(to:id1_1 \[len: [0-9]+\]\)$} 1 gimple } }
+
+  !$acc parallel copy (b) copyin (id1_2)
+  ! As already present, 'copy (b)' doesn't copy; addend is still '1'.
+  call verify_n1_values (1)
+  id1_2 = 0
+  !$acc end parallel
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc parallel map\(tofrom:\*\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[pointer assign, bias: 0\]\) map\(to:id1_2\)$} 1 original } }
+  !TODO ..., but without an actual use of 'b', the gimplifier removes the
+  !TODO 'GOMP_MAP_TO_PSET':
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_parallel map\(tofrom:MEM <integer\(kind=[0-9]+\)\[0:\]> \[\(integer\(kind=[0-9]+\)\[0:\] \*\)[^\]]+\] \[len: [^\]]+\]\) map\(alloc:b\.data \[pointer assign, bias: 0\]\) map\(to:id1_2 \[len: [0-9]+\]\)$} 1 gimple } }
+
+  call verify_n1_allocated
+  if (.not.acc_is_present (b)) error stop
+
+  call acc_delete (b)
+  if (.not.allocated (b)) error stop
+  if (acc_is_present (b)) error stop
+  ! The device-side array descriptor doesn't get updated, so 'b' still appears
+  ! as "allocated":
+  !$acc serial
+  call verify_n1_allocated
+  !$acc end serial
+
+  deallocate (b)
+  call verify_n1_deallocated (.false.)
+  ! The device-side array descriptor doesn't get updated, so 'b' still appears
+  ! as "allocated":
+  !$acc serial
+  call verify_n1_allocated
+  !$acc end serial
+
+  ! Now try to install the actual array descriptor, via a data clause for 'b'
+  ! (explicit or implicit): must get a 'GOMP_MAP_TO_PSET', which then in
+  ! 'gomp_map_vars_internal' is handled as 'declare target', but because of
+  ! '*(void **) hostaddrs[i] == NULL', we've got 'has_always_ptrset == false',
+  ! 'always_to_cnt == 0', and therefore 'gomp_map_vars_existing' doesn't update
+  ! the 'GOMP_MAP_TO_PSET'.
+  ! The device-side array descriptor doesn't get updated, so 'b' still appears
+  ! as "allocated":
+  !TODO Why does 'present (b)' still work here?
+  !$acc serial present (b) copyout (id1_2)
+  call verify_n1_deallocated (.true.)
+  !TODO Use of 'b'.
+  id1_2 = ubound (b, 1)
+  !$acc end serial
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc serial map\(force_present:\*\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[pointer assign, bias: 0\]\) map\(from:id1_2\)$} 1 original } }
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_serial map\(force_present:MEM <integer\(kind=[0-9]+\)\[0:\]> \[\(integer\(kind=[0-9]+\)\[0:\] \*\)[^\]]+\] \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:b\.data \[pointer assign, bias: 0\]\) map\(from:id1_2 \[len: [0-9]+\]\)$} 1 gimple } }
+
+
+  ! Restart the procedure, with different array dimensions.
+
+  allocate (b(n2_lb:n2_ub))
+  call verify_n2_allocated
+  if (acc_is_present (b)) error stop
+  call acc_create (b)
+  if (.not.acc_is_present (b)) error stop
+  ! This still has the previous (n1) array descriptor:
+  !$acc serial
+  call verify_n1_deallocated (.true.)
+  !$acc end serial
+
+  do i = n2_lb, n2_ub
+     b(i) = i + 20
+  end do
+
+  call acc_update_device (b)
+  !$acc serial
+  call verify_n1_deallocated (.true.)
+  !$acc end serial
+
+  b = -40
+
+  call acc_update_self (b)
+  call verify_n2_allocated
+
+  do i = n2_lb, n2_ub
+     if (b(i) /= i + 20) error stop
+     b(i) = b(i) - 40
+  end do
+
+  !$acc update device (b)
+  !$acc serial
+  call verify_n1_deallocated (.true.)
+  !$acc end serial
+
+  b = -41
+
+  !$acc update self (b)
+  call verify_n2_allocated
+
+  do i = n2_lb, n2_ub
+     if (b(i) /= i - 20) error stop
+     b(i) = b(i) + 10
+  end do
+
+  !$acc serial present (b) copy (id1_2)
+  call verify_n2_allocated
+  !TODO Use of 'b':
+  id1_2 = ubound (b, 1)
+  !$acc end serial
+
+  !$acc parallel
+  call verify_n2_values (-20)
+  !$acc end parallel
+
+  !$acc parallel copy (b)
+  call verify_n2_values (-20)
+  !$acc end parallel
+
+  call verify_n2_allocated
+  if (.not.acc_is_present (b)) error stop
+
+  call acc_delete (b)
+  if (.not.allocated (b)) error stop
+  if (acc_is_present (b)) error stop
+  !$acc serial
+  call verify_n2_allocated
+  !$acc end serial
+
+  deallocate (b)
+  call verify_n2_deallocated (.false.)
+  !$acc serial
+  call verify_n2_allocated
+  !$acc end serial
+
+  !$acc serial present (b) copy (id1_2)
+  call verify_n2_deallocated (.true.)
+  !TODO Use of 'b':
+  id1_2 = ubound (b, 1)
+  !$acc end serial
+
+end program test
+
+
+subroutine verify_initial
+  use vars
+  implicit none
+  !$acc routine seq
+
+  if (allocated (b)) error stop "verify_initial allocated"
+  if (any (lbound (b) /= [0])) error stop "verify_initial lbound"
+  if (any (ubound (b) /= [0])) error stop "verify_initial ubound"
+end subroutine verify_initial
+
+subroutine verify_n1_allocated
+  use vars
+  implicit none
+  !$acc routine seq
+
+  if (.not.allocated (b)) error stop "verify_n1_allocated allocated"
+  if (any (lbound (b) /= [n1_lb])) error stop "verify_n1_allocated lbound"
+  if (any (ubound (b) /= [n1_ub])) error stop "verify_n1_allocated ubound"
+end subroutine verify_n1_allocated
+
+subroutine verify_n1_values (addend)
+  use vars
+  implicit none
+  !$acc routine gang
+  integer, value :: addend
+  integer :: i
+
+  !$acc loop
+  do i = n1_lb, n1_ub
+     if (b(i) /= i + addend) error stop
+  end do
+end subroutine verify_n1_values
+
+subroutine verify_n1_deallocated (expect_allocated)
+  use vars
+  implicit none
+  !$acc routine seq
+  logical, value :: expect_allocated
+
+  if (allocated(b) .neqv. expect_allocated) error stop "verify_n1_deallocated allocated"
+  ! Apparently 'deallocate'ing doesn't unset the bounds.
+  if (any (lbound (b) /= [n1_lb])) error stop "verify_n1_deallocated lbound"
+  if (any (ubound (b) /= [n1_ub])) error stop "verify_n1_deallocated ubound"
+end subroutine verify_n1_deallocated
+
+subroutine verify_n2_allocated
+  use vars
+  implicit none
+  !$acc routine seq
+
+  if (.not.allocated(b)) error stop "verify_n2_allocated allocated"
+  if (any (lbound (b) /= [n2_lb])) error stop "verify_n2_allocated lbound"
+  if (any (ubound (b) /= [n2_ub])) error stop "verify_n2_allocated ubound"
+end subroutine verify_n2_allocated
+
+subroutine verify_n2_values (addend)
+  use vars
+  implicit none
+  !$acc routine gang
+  integer, value :: addend
+  integer :: i
+
+  !$acc loop
+  do i = n2_lb, n2_ub
+     if (b(i) /= i + addend) error stop
+  end do
+end subroutine verify_n2_values
+
+subroutine verify_n2_deallocated (expect_allocated)
+  use vars
+  implicit none
+  !$acc routine seq
+  logical, value :: expect_allocated
+
+  if (allocated(b) .neqv. expect_allocated) error stop "verify_n2_deallocated allocated"
+  ! Apparently 'deallocate'ing doesn't unset the bounds.
+  if (any (lbound (b) /= [n2_lb])) error stop "verify_n2_deallocated lbound"
+  if (any (ubound (b) /= [n2_ub])) error stop "verify_n2_deallocated ubound"
+end subroutine verify_n2_deallocated
-- 
2.35.1



* Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643]
  2022-11-02 20:15     ` Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90' Thomas Schwinge
@ 2022-11-02 20:22       ` Thomas Schwinge
  2022-11-02 20:34         ` Support OpenACC 'declare create' with Fortran allocatable arrays, part II [PR106643, PR96668] (was: Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643]) Thomas Schwinge
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Schwinge @ 2022-11-02 20:22 UTC (permalink / raw)
  To: gcc-patches, fortran; +Cc: hberre3, rcheruku

[-- Attachment #1: Type: text/plain, Size: 3010 bytes --]

Hi!

On 2022-11-02T21:15:31+0100, I wrote:
> On 2022-11-02T21:10:54+0100, I wrote:
>> On 2022-11-02T21:04:56+0100, I wrote:
>>> --- /dev/null
>>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
>>> @@ -0,0 +1,268 @@
>>> +! Test OpenACC 'declare create' with allocatable arrays.
>>> +
>>> +! { dg-do run }
>>> +
>>> +!TODO-OpenACC-declare-allocate
>>> +! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
>>> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
>>> +! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
>>> +! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
>>> +
>>> +[...]
>>
>> Getting rid of the "'dg-xfail-run-if' for '-DACC_MEM_SHARED=0'" via a
>> workaround (as seen in real-world code), I've pushed to master branch
>> commit 59c6c5dbf267cd9d0a8df72b2a5eb5657b64268e
>> "Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'"
>
>> ... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted
>> for missing support for OpenACC "Changes from Version 2.0 to 2.5":
>> "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
>> Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
>> manually.
>
> A similar test case, but with different focus, I've pushed to master
> branch in commit abeaf3735fe2568b9d5b8096318da866b1fe1e5c
> "Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'",
> see attached.

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90
> @@ -0,0 +1,402 @@
> +! Test OpenACC 'declare create' with allocatable arrays.
> +
> +! { dg-do run }
> +
> +! Note that we're not testing OpenACC semantics here, but rather documenting
> +! current GCC behavior, specifically, behavior concerning updating of
> +! host/device array descriptors.
> +! { dg-skip-if n/a { *-*-* } { -DACC_MEM_SHARED=1 } }
> +
> +!TODO-OpenACC-declare-allocate
> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
> +! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
> +! Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
> +! manually.

If instead of calling 'acc_create'/'acc_delete' we'd like to use
'!$acc enter data create'/'!$acc exit data delete', we run into
<https://gcc.gnu.org/PR106643>
"[gfortran + OpenACC] Allocate in module causes refcount error".
Pushed to master branch commit da8e0e1191c5512244a752b30dea0eba83e3d10c
"Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643]",
see attached.
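
For illustration only, the gist of the 'goacc_enter_data_internal' change
can be modeled in a few lines of standalone C.  This is a toy sketch, not
the libgomp code: every 'toy_*'/'TOY_*' name is made up, and the refcount
bump in the non-'declare' path is a deliberate simplification of what
libgomp really does there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the libgomp definitions.  */
#define TOY_REFCOUNT_INFINITY UINTPTR_MAX

enum toy_kind { TOY_MAP_TO_PSET, TOY_MAP_POINTER, TOY_MAP_ALLOC };

struct toy_mapping
{
  uintptr_t refcount;          /* structured reference count */
  uintptr_t dynamic_refcount;  /* 'enter data'/'exit data' count */
  size_t list_count;           /* models 'tgt->list_count' */
};

/* Toy model of entering one mapping group for an already-present
   mapping N.  Returns true if the group was recognized as the array
   descriptor of a 'declare'd variable, in which case no reference
   count is touched.  */
static bool
toy_enter_group (struct toy_mapping *n, const enum toy_kind *kinds,
                 size_t groupnum)
{
  if (kinds[0] == TOY_MAP_TO_PSET && n->list_count == 0)
    {
      /* 'declare target': such a mapping lives forever.  */
      assert (n->refcount == TOY_REFCOUNT_INFINITY);

      /* The descriptor may only be followed by pointer mappings.  */
      for (size_t k = 1; k < groupnum; k++)
        assert (kinds[k] == TOY_MAP_POINTER);

      return true;
    }

  /* Everything else must be an ordinary, countable mapping.  */
  assert (n->refcount != TOY_REFCOUNT_INFINITY);
  n->refcount++;
  n->dynamic_refcount++;
  return false;
}
```

With that, a 'TOY_MAP_TO_PSET' group hitting a 'declare'd mapping is
accepted and leaves the counters alone, whereas any other mapping is still
reference-counted as before.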


Regards
 Thomas


-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company; Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München; Registry court: München, HRB 106955

[-- Attachment #2: 0001-Support-OpenACC-declare-create-with-Fortran-allocata.patch --]
[-- Type: text/x-diff, Size: 8308 bytes --]

From da8e0e1191c5512244a752b30dea0eba83e3d10c Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Thu, 27 Oct 2022 21:52:07 +0200
Subject: [PATCH] Support OpenACC 'declare create' with Fortran allocatable
 arrays, part I [PR106643]

	PR libgomp/106643
	libgomp/
	* oacc-mem.c (goacc_enter_data_internal): Support
	OpenACC 'declare create' with Fortran allocatable arrays, part I.
	* testsuite/libgomp.oacc-fortran/declare-allocatable-1-directive.f90:
	New.
	* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90:
	New.
---
 libgomp/oacc-mem.c                            | 28 +++++++++++++++++--
 ...90 => declare-allocatable-1-directive.f90} | 14 ++++++++--
 ...ocatable-array_descriptor-1-directive.f90} | 12 ++++----
 3 files changed, 44 insertions(+), 10 deletions(-)
 copy libgomp/testsuite/libgomp.oacc-fortran/{declare-allocatable-1.f90 => declare-allocatable-1-directive.f90} (95%)
 copy libgomp/testsuite/libgomp.oacc-fortran/{declare-allocatable-array_descriptor-1-runtime.f90 => declare-allocatable-array_descriptor-1-directive.f90} (98%)

diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 73b2710c2b8..ba010fddbb3 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -1150,8 +1150,7 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 	}
       else if (n && groupnum > 1)
 	{
-	  assert (n->refcount != REFCOUNT_INFINITY
-		  && n->refcount != REFCOUNT_LINK);
+	  assert (n->refcount != REFCOUNT_LINK);
 
 	  for (size_t j = i + 1; j <= group_last; j++)
 	    if ((kinds[j] & 0xff) == GOMP_MAP_ATTACH)
@@ -1166,6 +1165,31 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 	  bool processed = false;
 
 	  struct target_mem_desc *tgt = n->tgt;
+
+	  /* Arrange so that OpenACC 'declare' code à la PR106643
+	     "[gfortran + OpenACC] Allocate in module causes refcount error"
+	     has a chance to work.  */
+	  if ((kinds[i] & 0xff) == GOMP_MAP_TO_PSET
+	      && tgt->list_count == 0)
+	    {
+	      /* 'declare target'.  */
+	      assert (n->refcount == REFCOUNT_INFINITY);
+
+	      for (size_t k = 1; k < groupnum; k++)
+		{
+		  /* The only thing we expect to see here.  */
+		  assert ((kinds[i + k] & 0xff) == GOMP_MAP_POINTER);
+		}
+
+	      /* Given that 'goacc_exit_data_internal'/'goacc_exit_datum_1'
+		 will always see 'n->refcount == REFCOUNT_INFINITY',
+		 there's no need to adjust 'n->dynamic_refcount' here.  */
+
+	      processed = true;
+	    }
+	  else
+	    assert (n->refcount != REFCOUNT_INFINITY);
+
 	  for (size_t j = 0; j < tgt->list_count; j++)
 	    if (tgt->list[j].key == n)
 	      {
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1-directive.f90
similarity index 95%
copy from libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
copy to libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1-directive.f90
index 1c8ccd9f61f..759873bad67 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1-directive.f90
@@ -3,10 +3,10 @@
 ! { dg-do run }
 
 !TODO-OpenACC-declare-allocate
-! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
 ! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
 ! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
-! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
+! Thus, after 'allocate'/before 'deallocate', do
+! '!$acc enter data create'/'!$acc exit data delete' manually.
 
 !TODO { dg-additional-options -fno-inline } for stable results regarding OpenACC 'routine'.
 
@@ -67,6 +67,7 @@ program test
   ! Test local usage of an allocated declared array.
 
   allocate (b(n))
+  !$acc enter data create (b)
 
   if (.not.allocated (b)) error stop
   if (.not.acc_is_present (b)) error stop
@@ -91,12 +92,14 @@ program test
      if (b(i) /= i*a) error stop
   end do
 
+  !$acc exit data delete (b)
   deallocate (b)
 
   ! Test the usage of an allocated declared array inside an acc
   ! routine subroutine.
 
   allocate (b(n))
+  !$acc enter data create (b)
 
   if (.not.allocated (b)) error stop
   if (.not.acc_is_present (b)) error stop
@@ -114,6 +117,7 @@ program test
      if (b(i) /= i*2) error stop
   end do
 
+  !$acc exit data delete (b)
   deallocate (b)
 
   ! Test the usage of an allocated declared array inside a host
@@ -129,6 +133,7 @@ program test
      if (b(i) /= 1.0) error stop
   end do
 
+  !$acc exit data delete (b)
   deallocate (b)
 
   if (allocated (b)) error stop
@@ -137,6 +142,7 @@ program test
   ! routine function.
 
   allocate (b(n))
+  !$acc enter data create (b)
 
   if (.not.allocated (b)) error stop
   if (.not.acc_is_present (b)) error stop
@@ -170,12 +176,14 @@ program test
      if (b(i) /= i) error stop
   end do
 
+  !$acc exit data delete (b)
   deallocate (b)
 
   ! Test the usage of an allocated declared array inside a host
   ! function.
 
   allocate (b(n))
+  !$acc enter data create (b)
 
   if (.not.allocated (b)) error stop
   if (.not.acc_is_present (b)) error stop
@@ -202,6 +210,7 @@ program test
      if (b(i) /= i*i) error stop
   end do
 
+  !$acc exit data delete (b)
   deallocate (b)
 end program test ! { dg-line l[incr c] }
 ! { dg-bogus {note: variable 'overflow\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {TODO n/a} { xfail *-*-* } l$c }
@@ -234,6 +243,7 @@ subroutine sub2
   integer i
 
   allocate (b(n))
+  !$acc enter data create (b)
 
   if (.not.allocated (b)) error stop
   if (.not.acc_is_present (b)) error stop
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90 b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90
similarity index 98%
copy from libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90
copy to libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90
index b27f312631d..10e1d5bc378 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90
@@ -10,8 +10,8 @@
 !TODO-OpenACC-declare-allocate
 ! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
 ! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
-! Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
-! manually.
+! Thus, after 'allocate'/before 'deallocate', do
+! '!$acc enter data create'/'!$acc exit data delete' manually.
 
 
 !TODO { dg-additional-options -fno-inline } for stable results regarding OpenACC 'routine'.
@@ -102,7 +102,7 @@ program test
   allocate (b(n1_lb:n1_ub))
   call verify_n1_allocated
   if (acc_is_present (b)) error stop
-  call acc_create (b)
+  !$acc enter data create (b)
   ! This is now OpenACC "present":
   if (.not.acc_is_present (b)) error stop
   ! This still has the initial array descriptor:
@@ -201,7 +201,7 @@ program test
   call verify_n1_allocated
   if (.not.acc_is_present (b)) error stop
 
-  call acc_delete (b)
+  !$acc exit data delete (b)
   if (.not.allocated (b)) error stop
   if (acc_is_present (b)) error stop
   ! The device-side array descriptor doesn't get updated, so 'b' still appears
@@ -241,7 +241,7 @@ program test
   allocate (b(n2_lb:n2_ub))
   call verify_n2_allocated
   if (acc_is_present (b)) error stop
-  call acc_create (b)
+  !$acc enter data create (b)
   if (.not.acc_is_present (b)) error stop
   ! This still has the previous (n1) array descriptor:
   !$acc serial
@@ -299,7 +299,7 @@ program test
   call verify_n2_allocated
   if (.not.acc_is_present (b)) error stop
 
-  call acc_delete (b)
+  !$acc exit data delete (b)
   if (.not.allocated (b)) error stop
   if (acc_is_present (b)) error stop
   !$acc serial
-- 
2.35.1



* Support OpenACC 'declare create' with Fortran allocatable arrays, part II [PR106643, PR96668] (was: Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643])
  2022-11-02 20:22       ` Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643] Thomas Schwinge
@ 2022-11-02 20:34         ` Thomas Schwinge
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Schwinge @ 2022-11-02 20:34 UTC (permalink / raw)
  To: gcc-patches, fortran; +Cc: hberre3, rcheruku, Tobias Burnus

[-- Attachment #1: Type: text/plain, Size: 5114 bytes --]

Hi!

On 2022-11-02T21:22:25+0100, I wrote:
> On 2022-11-02T21:15:31+0100, I wrote:
>> On 2022-11-02T21:10:54+0100, I wrote:
>>> On 2022-11-02T21:04:56+0100, I wrote:
>>>> --- /dev/null
>>>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
>>>> @@ -0,0 +1,268 @@
>>>> +! Test OpenACC 'declare create' with allocatable arrays.
>>>> +
>>>> +! { dg-do run }
>>>> +
>>>> +!TODO-OpenACC-declare-allocate
>>>> +! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
>>>> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
>>>> +! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
>>>> +! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
>>>> +
>>>> +[...]
>>>
>>> Getting rid of the "'dg-xfail-run-if' for '-DACC_MEM_SHARED=0'" via a
>>> workaround (as seen in real-world code), I've pushed to master branch
>>> commit 59c6c5dbf267cd9d0a8df72b2a5eb5657b64268e
>>> "Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'"
>>
>>> ... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted
>>> for missing support for OpenACC "Changes from Version 2.0 to 2.5":
>>> "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
>>> Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
>>> manually.
>>
>> A similar test case, but with different focus, I've pushed to master
>> branch in commit abeaf3735fe2568b9d5b8096318da866b1fe1e5c
>> "Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'",
>> see attached.
>
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90
>> @@ -0,0 +1,402 @@
>> +! Test OpenACC 'declare create' with allocatable arrays.
>> +
>> +! { dg-do run }
>> +
>> +! Note that we're not testing OpenACC semantics here, but rather documenting
>> +! current GCC behavior, specifically, behavior concerning updating of
>> +! host/device array descriptors.
>> +! { dg-skip-if n/a { *-*-* } { -DACC_MEM_SHARED=1 } }
>> +
>> +!TODO-OpenACC-declare-allocate
>> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
>> +! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
>> +! Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
>> +! manually.
>
> If instead of calling 'acc_create'/'acc_delete' we'd like to use
> '!$acc enter data create'/'!$acc exit data delete', we run into
> <https://gcc.gnu.org/PR106643>
> "[gfortran + OpenACC] Allocate in module causes refcount error".
> Pushed to master branch commit da8e0e1191c5512244a752b30dea0eba83e3d10c
> "Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643]",
> see attached.

> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c

> @@ -1166,6 +1165,31 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
>         bool processed = false;
>
>         struct target_mem_desc *tgt = n->tgt;
> +
> +       /* Arrange so that OpenACC 'declare' code à la PR106643
> +          "[gfortran + OpenACC] Allocate in module causes refcount error"
> +          has a chance to work.  */
> +       if ((kinds[i] & 0xff) == GOMP_MAP_TO_PSET
> +           && tgt->list_count == 0)
> +         {
> +           /* 'declare target'.  */
> +           assert (n->refcount == REFCOUNT_INFINITY);
> +
> +           for (size_t k = 1; k < groupnum; k++)
> +             {
> +               /* The only thing we expect to see here.  */
> +               assert ((kinds[i + k] & 0xff) == GOMP_MAP_POINTER);
> +             }
> +
> +           /* Given that 'goacc_exit_data_internal'/'goacc_exit_datum_1'
> +              will always see 'n->refcount == REFCOUNT_INFINITY',
> +              there's no need to adjust 'n->dynamic_refcount' here.  */
> +
> +           processed = true;
> +         }

To make slightly more interesting (real-world) test cases work, we also
have to process the 'GOMP_MAP_TO_PSET', 'GOMP_MAP_POINTER' here.
Tobias had implemented such a thing in the context of OpenMP PR96668
"[OpenMP] Re-mapping allocated but previously unallocated allocatable does not work"
a while ago, and we may do something similar here.  Side note: in the
first version of my changes, I had actually re-implemented the
corresponding ("somewhat ugly") logic here in
'libgomp/oacc-mem.c:goacc_enter_data_internal', until at some point I
realized that I could instead simply call into the existing code, greatly
reducing the complexity here.  Pushed to master branch
commit f6ce1e77bbf5d3a096f52e674bfd7354c6537d10
"Support OpenACC 'declare create' with Fortran allocatable arrays, part II [PR106643, PR96668]",
see attached.
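
As a rough illustration of the "call into the existing code" approach: the
caller drops the device lock, lets the existing mapping routine do the
work, and then re-takes the lock.  The sketch below assumes (as the
unlock/lock pair in the patch suggests) that the callee acquires the lock
itself.  All 'toy_*' names are hypothetical, and the lock is modeled by a
plain flag rather than a real mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy types; these are NOT libgomp's.  */
struct toy_descriptor
{
  long lbound;
  long ubound;
};

struct toy_device
{
  bool locked;                        /* models a non-recursive mutex */
  struct toy_descriptor device_copy;  /* device-side array descriptor */
};

static void
toy_lock (struct toy_device *d)
{
  /* Taking the lock twice would be a deadlock with a real mutex.  */
  assert (!d->locked);
  d->locked = true;
}

static void
toy_unlock (struct toy_device *d)
{
  assert (d->locked);
  d->locked = false;
}

/* Models the existing mapping code: it takes the lock itself and
   refreshes the device-side descriptor from the host-side one.  */
static void
toy_map_vars (struct toy_device *d, const struct toy_descriptor *host)
{
  toy_lock (d);
  d->device_copy = *host;
  toy_unlock (d);
}

/* Models the caller: it holds the lock for its own lookup, but has to
   release it around the delegated call.  */
static void
toy_enter_data (struct toy_device *d, const struct toy_descriptor *host)
{
  toy_lock (d);
  /* ... look up the existing mapping while holding the lock ...  */

  toy_unlock (d);
  toy_map_vars (d, host);
  toy_lock (d);

  /* ... remaining bookkeeping, still under the lock ...  */
  toy_unlock (d);
}
```

Delegating this way keeps the descriptor-update logic in one place; the
price is the short window in which the lock is not held.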


Regards
 Thomas



[-- Attachment #2: 0001-Support-OpenACC-declare-create-with-Fortran-allocata.patch --]
[-- Type: text/x-diff, Size: 11881 bytes --]

From f6ce1e77bbf5d3a096f52e674bfd7354c6537d10 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 28 Oct 2022 15:06:45 +0200
Subject: [PATCH] Support OpenACC 'declare create' with Fortran allocatable
 arrays, part II [PR106643, PR96668]

	PR libgomp/106643
	PR fortran/96668
	libgomp/
	* oacc-mem.c (goacc_enter_data_internal): Support
	OpenACC 'declare create' with Fortran allocatable arrays, part II.
	* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90:
	Adjust.
	* testsuite/libgomp.oacc-fortran/pr106643-1.f90: New.
---
 libgomp/oacc-mem.c                            | 15 +++-
 ...locatable-array_descriptor-1-directive.f90 | 90 +++++++++++++------
 .../libgomp.oacc-fortran/pr106643-1.f90       | 83 +++++++++++++++++
 3 files changed, 160 insertions(+), 28 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr106643-1.f90

diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index ba010fddbb3..233fe0e4c1d 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -1166,7 +1166,10 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 
 	  struct target_mem_desc *tgt = n->tgt;
 
-	  /* Arrange so that OpenACC 'declare' code à la PR106643
+	  /* Minimal OpenACC variant corresponding to PR96668
+	     "[OpenMP] Re-mapping allocated but previously unallocated
+	     allocatable does not work" 'libgomp/target.c' changes, so that
+	     OpenACC 'declare' code à la PR106643
 	     "[gfortran + OpenACC] Allocate in module causes refcount error"
 	     has a chance to work.  */
 	  if ((kinds[i] & 0xff) == GOMP_MAP_TO_PSET
@@ -1181,6 +1184,16 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 		  assert ((kinds[i + k] & 0xff) == GOMP_MAP_POINTER);
 		}
 
+	      /* Let 'goacc_map_vars' -> 'gomp_map_vars_internal' handle
+		 this.  */
+	      gomp_mutex_unlock (&acc_dev->lock);
+	      struct target_mem_desc *tgt_
+		= goacc_map_vars (acc_dev, aq, groupnum, &hostaddrs[i], NULL,
+				  &sizes[i], &kinds[i], true,
+				  GOMP_MAP_VARS_ENTER_DATA);
+	      assert (tgt_ == NULL);
+	      gomp_mutex_lock (&acc_dev->lock);
+
 	      /* Given that 'goacc_exit_data_internal'/'goacc_exit_datum_1'
 		 will always see 'n->refcount == REFCOUNT_INFINITY',
 		 there's no need to adjust 'n->dynamic_refcount' here.  */
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90 b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90
index 10e1d5bc378..6604f72c5c1 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90
@@ -105,27 +105,50 @@ program test
   !$acc enter data create (b)
   ! This is now OpenACC "present":
   if (.not.acc_is_present (b)) error stop
-  ! This still has the initial array descriptor:
+  ! ..., and got the actual array descriptor installed:
   !$acc serial
-  call verify_initial
+  call verify_n1_allocated
   !$acc end serial
 
   do i = n1_lb, n1_ub
      b(i) = i - 1
   end do
 
-  ! Verify that host-to-device copy doesn't touch the device-side (still
-  ! initial) array descriptor (but it does copy the array data).
+  ! In 'declare-allocatable-array_descriptor-1-runtime.f90', this does "verify
+  ! that host-to-device copy doesn't touch the device-side (still initial)
+  ! array descriptor (but it does copy the array data").  This is here not
+  ! applicable anymore, as we've already gotten the actual array descriptor
+  ! installed.  Thus now verify that it does copy the array data.
   call acc_update_device (b)
   !$acc serial
-  call verify_initial
+  call verify_n1_allocated
   !$acc end serial
 
   b = 40
 
-  ! Verify that device-to-host copy doesn't touch the host-side array
-  ! descriptor, doesn't copy out the device-side (still initial) array
-  ! descriptor (but it does copy the array data).
+  !$acc parallel copyout (id1_1) ! No data clause for 'b' (explicit or implicit): no 'GOMP_MAP_TO_PSET'.
+  call verify_n1_values (-1)
+  id1_1 = 0
+  !$acc end parallel
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc parallel map\(from:id1_1\)$} 1 original } }
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_parallel map\(from:id1_1 \[len: [0-9]+\]\)$} 1 gimple } }
+
+  !$acc parallel copy (b) copyout (id1_2)
+  ! As already present, 'copy (b)' doesn't copy; addend is still '-1'.
+  call verify_n1_values (-1)
+  id1_2 = 0
+  !$acc end parallel
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc parallel map\(tofrom:\*\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[pointer assign, bias: 0\]\) map\(from:id1_2\)$} 1 original } }
+  !TODO ..., but without an actual use of 'b', the gimplifier removes the
+  !TODO 'GOMP_MAP_TO_PSET':
+  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_parallel map\(tofrom:MEM <integer\(kind=[0-9]+\)\[0:\]> \[\(integer\(kind=[0-9]+\)\[0:\] \*\)[^\]]+\] \[len: [^\]]+\]\) map\(alloc:b\.data \[pointer assign, bias: 0\]\) map\(from:id1_2 \[len: [0-9]+\]\)$} 1 gimple } }
+
+  ! In 'declare-allocatable-array_descriptor-1-runtime.f90', this does "verify
+  ! that device-to-host copy doesn't touch the host-side array descriptor,
+  ! doesn't copy out the device-side (still initial) array descriptor (but it
+  ! does copy the array data)".  This is here not applicable anymore, as we've
+  ! already gotten the actual array descriptor installed.  Thus now verify that
+  ! it does copy the array data.
   call acc_update_self (b)
   call verify_n1_allocated
 
@@ -142,11 +165,19 @@ program test
   ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_update map\(force_to:MEM <integer\(kind=[0-9]+\)\[0:\]> \[\(integer\(kind=[0-9]+\)\[0:\] \*\)[^\]]+\] \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:b\.data \[pointer assign, bias: 0\]\) map\(force_from:id1_1 \[len: [0-9]+\]\)$} 1 gimple } }
   ! ..., but it's silently skipped in 'GOACC_update'.
   !$acc serial
-  call verify_initial
+  call verify_n1_allocated
   !$acc end serial
 
   b = 41
 
+  !$acc parallel
+  call verify_n1_values (1)
+  !$acc end parallel
+
+  !$acc parallel copy (b)
+  call verify_n1_values (1)
+  !$acc end parallel
+
   !$acc update self (b) self (id1_2)
   ! We do have 'GOMP_MAP_TO_PSET' here:
   ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc update map\(force_from:\*\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[pointer assign, bias: 0\]\) map\(force_from:id1_2\);$} 1 original } }
@@ -159,20 +190,9 @@ program test
      b(i) = b(i) + 2
   end do
 
-  ! Now install the actual array descriptor, via a data clause for 'b'
-  ! (explicit or implicit): must get a 'GOMP_MAP_TO_PSET', which then in
-  ! 'gomp_map_vars_internal' is handled as 'declare target', and because of
-  ! '*(void **) hostaddrs[i] != NULL', we've got 'has_always_ptrset == true',
-  ! 'always_to_cnt == 1', and therefore 'gomp_map_vars_existing' does update
-  ! the 'GOMP_MAP_TO_PSET'.
-  !$acc serial present (b) copyin (id1_1)
-  call verify_initial
-  id1_1 = 0
-  !$acc end serial
-  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc serial map\(force_present:\*\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[pointer assign, bias: 0\]\) map\(to:id1_1\)$} 1 original } }
-  !TODO ..., but without an actual use of 'b', the gimplifier removes the
-  !TODO 'GOMP_MAP_TO_PSET':
-  ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_serial map\(force_present:MEM <integer\(kind=[0-9]+\)\[0:\]> \[\(integer\(kind=[0-9]+\)\[0:\] \*\)[^\]]+\] \[len: [^\]]+\]\) map\(alloc:b\.data \[pointer assign, bias: 0\]\) map\(to:id1_1 \[len: [0-9]+\]\)$} 1 gimple } }
+  ! Now test that (potentially re-)installing the actual array descriptor is a
+  ! no-op, via a data clause for 'b' (explicit or implicit): must get a
+  ! 'GOMP_MAP_TO_PSET'.
   !$acc serial present (b) copyin (id1_2)
   call verify_n1_allocated
   !TODO Use of 'b':
@@ -243,9 +263,9 @@ program test
   if (acc_is_present (b)) error stop
   !$acc enter data create (b)
   if (.not.acc_is_present (b)) error stop
-  ! This still has the previous (n1) array descriptor:
+  ! ..., and got the actual array descriptor installed:
   !$acc serial
-  call verify_n1_deallocated (.true.)
+  call verify_n2_allocated
   !$acc end serial
 
   do i = n2_lb, n2_ub
@@ -254,11 +274,19 @@ program test
 
   call acc_update_device (b)
   !$acc serial
-  call verify_n1_deallocated (.true.)
+  call verify_n2_allocated
   !$acc end serial
 
   b = -40
 
+  !$acc parallel
+  call verify_n2_values (20)
+  !$acc end parallel
+
+  !$acc parallel copy (b)
+  call verify_n2_values (20)
+  !$acc end parallel
+
   call acc_update_self (b)
   call verify_n2_allocated
 
@@ -269,11 +297,19 @@ program test
 
   !$acc update device (b)
   !$acc serial
-  call verify_n1_deallocated (.true.)
+  call verify_n2_allocated
   !$acc end serial
 
   b = -41
 
+  !$acc parallel
+  call verify_n2_values (-20)
+  !$acc end parallel
+
+  !$acc parallel copy (b)
+  call verify_n2_values (-20)
+  !$acc end parallel
+
   !$acc update self (b)
   call verify_n2_allocated
 
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr106643-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr106643-1.f90
new file mode 100644
index 00000000000..a9c969e3361
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr106643-1.f90
@@ -0,0 +1,83 @@
+! { dg-do run }
+! { dg-additional-options -cpp }
+
+
+!TODO OpenACC 'serial' vs. GCC/nvptx:
+!TODO { dg-prune-output {using 'vector_length \(32\)', ignoring 1} }
+
+
+module m_macron
+
+    implicit none
+
+    real(kind(0d0)), allocatable, dimension(:) :: valls
+    !$acc declare create(valls)
+
+contains
+
+    subroutine s_macron_compute(size)
+
+        integer :: size
+
+        !$acc routine seq
+
+#if ACC_MEM_SHARED
+        if (valls(size) /= 1) error stop
+#else
+        if (valls(size) /= size - 2) error stop
+#endif
+
+        valls(size) = size + 2
+
+    end subroutine s_macron_compute
+
+    subroutine s_macron_init(size)
+
+        integer :: size
+
+        print*, "size=", size
+
+        print*, "allocate(valls(1:size))"
+        allocate(valls(1:size))
+
+        print*, "acc enter data create(valls(1:size))"
+        !$acc enter data create(valls(1:size))
+
+        print*, "!$acc update device(valls(1:size))"
+        valls(size) = size - 2
+        !$acc update device(valls(1:size))
+
+        valls(size) = 1
+
+        !$acc serial
+        call s_macron_compute(size)
+        !$acc end serial
+
+        valls(size) = -1
+
+        !$acc update host(valls(1:size))
+#if ACC_MEM_SHARED
+        if (valls(size) /= -1) error stop
+#else
+        if (valls(size) /= size + 2) error stop
+#endif
+
+        print*, valls(1:size)
+
+        print*, "acc exit data delete(valls)"
+        !$acc exit data delete(valls)
+
+    end subroutine s_macron_init
+
+end module m_macron
+
+
+program p_main
+
+    use m_macron
+
+    implicit none
+
+    call s_macron_init(10)
+
+end program p_main
-- 
2.35.1



* Re: Add 'libgomp.oacc-fortran/declare-allocatable-1.f90' (was: [gomp4] add support for fortran allocate support with declare create)
       [not found]   ` <1ECCE9C8-0CE9-46EA-A0F2-3F3FA50F4681@gmail.com>
@ 2022-11-03 10:47     ` Thomas Schwinge
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Schwinge @ 2022-11-03 10:47 UTC (permalink / raw)
  To: Bernhard Reutner-Fischer; +Cc: fortran, Tobias Burnus

Hi!

Let me add back CC: <fortran@gcc.gnu.org>, so that others may comment,
too.

On 2022-11-03T01:37:10+0100, Bernhard Reutner-Fischer <rep.dot.nop@gmail.com> wrote:
> On 2 November 2022 21:04:56 CET, Thomas Schwinge <thomas@codesourcery.com> wrote:
>>> --- /dev/null
>>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
>>> @@ -0,0 +1,211 @@
>>> +! Test declare create with allocatable arrays.
>>
>>... is useful in a different (though related) context that I'm currently
>>working on.  Having applied the following changes:
>>
>>  - Replace 'call abort' by 'error stop' (in spirit of earlier PR84381
>>    changes).
>
> Please remind me why you don't stop N
> but error stop?

  - Don't have to re-number if changing test case later on.
  - Prints a backtrace (where supported).
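
As a minimal sketch of the first point (not taken from the patch; the
program name and the array 'a' are made up for illustration), compare
the two styles of check:

```fortran
program error_stop_sketch
    implicit none
    integer :: a(3)   ! hypothetical data, for illustration only

    a = [1, 2, 3]

    ! 'stop N' style: every check carries its own exit code, so adding
    ! a check in the middle means renumbering the ones after it.
    if (a(1) /= 1) stop 1
    if (a(2) /= 2) stop 2

    ! 'error stop' style: nothing to number, so checks can be added,
    ! removed, or reordered freely.
    if (a(3) /= 3) error stop
    if (sum(a) /= 6) error stop

    print *, "all checks passed"
end program error_stop_sketch
```

All four checks hold here, so the program runs to the final print;
whether a failing 'error stop' also produces a backtrace depends on the
compiler and runtime.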

> Re: https://gcc.gnu.org/legacy-ml/fortran/2018-09/msg00173.html
>
> You'd obviously tweak
> sub(/call\s\s*abort/, "stop " i)
> with error\s\s*stop
>
> Or is your output so br^W lacking that you cannot write but just return? But then I think that error stop writes, too, so that cannot be the case, can it?

Right.


Regards
 Thomas


Thread overview: 9+ messages
2017-04-05 15:24 [gomp4] add support for fortran allocate support with declare create Cesar Philippidis
2017-04-06  9:05 ` Thomas Schwinge
2017-04-06 14:26   ` Cesar Philippidis
2022-11-02 20:04 ` Add 'libgomp.oacc-fortran/declare-allocatable-1.f90' (was: [gomp4] add support for fortran allocate support with declare create) Thomas Schwinge
2022-11-02 20:10   ` Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90' (was: Add 'libgomp.oacc-fortran/declare-allocatable-1.f90') Thomas Schwinge
2022-11-02 20:15     ` Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90' Thomas Schwinge
2022-11-02 20:22       ` Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643] Thomas Schwinge
2022-11-02 20:34         ` Support OpenACC 'declare create' with Fortran allocatable arrays, part II [PR106643, PR96668] (was: Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643]) Thomas Schwinge
     [not found]   ` <1ECCE9C8-0CE9-46EA-A0F2-3F3FA50F4681@gmail.com>
2022-11-03 10:47     ` Add 'libgomp.oacc-fortran/declare-allocatable-1.f90' (was: [gomp4] add support for fortran allocate support with declare create) Thomas Schwinge
