public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 0/7] [OpenACC] Dynamic reference counts for mapped data
@ 2020-05-22 22:16 Julian Brown
  2020-05-22 22:16 ` [PATCH 1/7] [OpenACC] Missing unlocking on error paths in attach/detach code Julian Brown
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Julian Brown @ 2020-05-22 22:16 UTC (permalink / raw)
  To: gcc-patches; +Cc: thomas, Catherine_Moore, jakub

After questions from Thomas:

https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545399.html

here is a patch series that adjusts how reference counting works
(again) for dynamic data lifetimes in OpenACC.  Since the "overhaul"
patch was applied,

https://gcc.gnu.org/legacy-ml/gcc-patches/2019-12/msg01249.html

dynamic data lifetimes have been represented by counting "excess"
references beyond those that are explicitly part of the linked splay
tree/target memory descriptor data structure.  This allowed self-test
code to be written to ensure that reference counts remained consistent
throughout execution.

However, there were some awkward corner-cases, which -- though fixable
-- made some of the code more complex than it could have been.  So,
this patch series reverts the dynamic reference counting implementation
to the previous semantics, which are that the dynamic reference count
recorded in each mapping's splay tree key corresponds more directly to
the source-level semantics (i.e. "enter data" operations increment the
reference count, and "exit data" operations decrement it).

This is not a plain revert of the "overhaul" patch above.  I have tried
to keep various refactoring introduced in that patch in place, though
I have re-introduced some error checking that the aforementioned patch
removed.

I have also managed to adjust the (still optional, development use only)
self-checking code to be able to work with the "dynamic_refcount" scheme.
A couple of patches containing minor cleanups are included too.

Tested with offloading to NVPTX (as a series). OK?

Julian Brown (7):
  [OpenACC] Missing unlocking on error paths in attach/detach code
  [OpenACC] Adjust dynamic reference count semantics
  [OpenACC] Don't pass kind array via pointer to goacc_enter_datum
  [OpenACC] Fix incompatible copyout for acc_map_data (PR92843)
  [OpenACC] Distinguish structural/dynamic mappings in libgomp
  [OpenACC] Reference count self-checking (dynamic_refcount version)
  [OpenACC] Stricter dynamic data unmapping testing (WIP)

 libgomp/libgomp.h                             |  29 +-
 libgomp/oacc-int.h                            |   3 +
 libgomp/oacc-mem.c                            | 306 +++++++++++++-----
 libgomp/oacc-parallel.c                       |  27 ++
 libgomp/target.c                              | 231 +++++++++++--
 .../libgomp.oacc-c-c++-common/pr92843-1.c     |   1 -
 .../libgomp.oacc-c-c++-common/refcounting-1.c |  31 ++
 .../libgomp.oacc-c-c++-common/refcounting-2.c |  31 ++
 8 files changed, 542 insertions(+), 117 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/refcounting-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/refcounting-2.c

-- 
2.23.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/7] [OpenACC] Missing unlocking on error paths in attach/detach code
  2020-05-22 22:16 [PATCH 0/7] [OpenACC] Dynamic reference counts for mapped data Julian Brown
@ 2020-05-22 22:16 ` Julian Brown
  2020-06-04 18:00   ` Thomas Schwinge
  2020-05-22 22:16 ` [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics Julian Brown
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Julian Brown @ 2020-05-22 22:16 UTC (permalink / raw)
  To: gcc-patches; +Cc: thomas, Catherine_Moore, jakub

This patch adds some missing unlocking from error paths in the OpenACC
attach/detach code, noticed during development of other patches in
this series.

OK?

Julian

ChangeLog

	libgomp/
	* oacc-mem.c (acc_attach_async): Add missing gomp_mutex_unlock on
	error path.
	(goacc_detach_internal): Likewise.
---
 libgomp/oacc-mem.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 2d4bba78efd..c06b7341cbb 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -887,7 +887,10 @@ acc_attach_async (void **hostaddr, int async)
   n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
   if (n == NULL)
-    gomp_fatal ("struct not mapped for acc_attach");
+    {
+      gomp_mutex_unlock (&acc_dev->lock);
+      gomp_fatal ("struct not mapped for acc_attach");
+    }
 
   gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, n, (uintptr_t) hostaddr,
 		       0, NULL);
@@ -920,7 +923,10 @@ goacc_detach_internal (void **hostaddr, int async, bool finalize)
   n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
   if (n == NULL)
-    gomp_fatal ("struct not mapped for acc_detach");
+    {
+      gomp_mutex_unlock (&acc_dev->lock);
+      gomp_fatal ("struct not mapped for acc_detach");
+    }
 
   gomp_detach_pointer (acc_dev, aq, n, (uintptr_t) hostaddr, finalize, NULL);
 
-- 
2.23.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics
  2020-05-22 22:16 [PATCH 0/7] [OpenACC] Dynamic reference counts for mapped data Julian Brown
  2020-05-22 22:16 ` [PATCH 1/7] [OpenACC] Missing unlocking on error paths in attach/detach code Julian Brown
@ 2020-05-22 22:16 ` Julian Brown
  2020-06-03 12:36   ` Thomas Schwinge
  2020-06-04 18:26   ` [OpenACC] Don't open-code 'gomp_remove_var' in 'acc_unmap_data' (was: [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics) Thomas Schwinge
  2020-05-22 22:16 ` [PATCH 3/7] [OpenACC] Don't pass kind array via pointer to goacc_enter_datum Julian Brown
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 16+ messages in thread
From: Julian Brown @ 2020-05-22 22:16 UTC (permalink / raw)
  To: gcc-patches; +Cc: thomas, Catherine_Moore, jakub

This patch adjusts the semantics of dynamic reference counts, as described
in the parent email. There are also two new test cases derived from
Thomas's test in the email:

https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546166.html

that work now.

OK?

Julian

ChangeLog

	libgomp/
	* libgomp.h (struct splay_tree_key_s): Change virtual_refcount to
	dynamic_refcount.
	(struct gomp_device_descr): Remove GOMP_MAP_VARS_OPENACC_ENTER_DATA.
	* oacc-mem.c (acc_map_data): Substitute virtual_refcount for
	dynamic_refcount.
	(acc_unmap_data): Replace open-coded refcount handling with call to
	gomp_remove_var.
	(goacc_enter_datum): Adjust for dynamic_refcount semantics.  Use tgt
	returned from gomp_map_vars_async.  Update assertions.
	(goacc_exit_datum): Re-add some error checking.  Adjust for
	dynamic_refcount semantics.  Fix is_tgt_unmapped test for struct
	mappings.
	(goacc_enter_data_internal): Implement "present" case of dynamic
	memory-map handling here.  Update "non-present" case for
	dynamic_refcount semantics.
	(goacc_exit_data_internal): Update for dynamic_refcount semantics.
	Re-introduce error checking for tgt unmapping when appropriate.
	* target.c (gomp_map_vars_internal): Remove
	GOMP_MAP_VARS_OPENACC_ENTER_DATA handling.  Update for dynamic_refcount
	handling.
	(gomp_unmap_vars_internal): Remove virtual_refcount handling.
	(gomp_load_image_to_device): Substitute dynamic_refcount for
	virtual_refcount.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/refcounting-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/refcounting-2.c: New test.
---
 libgomp/libgomp.h                             |   8 +-
 libgomp/oacc-mem.c                            | 241 ++++++++++++------
 libgomp/target.c                              |  38 +--
 .../libgomp.oacc-c-c++-common/refcounting-1.c |  31 +++
 .../libgomp.oacc-c-c++-common/refcounting-2.c |  31 +++
 5 files changed, 243 insertions(+), 106 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/refcounting-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/refcounting-2.c

diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index ca42e0de640..7b52ce7d5c2 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -1016,11 +1016,8 @@ struct splay_tree_key_s {
   uintptr_t tgt_offset;
   /* Reference count.  */
   uintptr_t refcount;
-  /* Reference counts beyond those that represent genuine references in the
-     linked splay tree key/target memory structures, e.g. for multiple OpenACC
-     "present increment" operations (via "acc enter data") referring to the same
-     host-memory block.  */
-  uintptr_t virtual_refcount;
+  /* Dynamic reference count.  */
+  uintptr_t dynamic_refcount;
   struct splay_tree_aux *aux;
 };
 
@@ -1153,7 +1150,6 @@ struct gomp_device_descr
 enum gomp_map_vars_kind
 {
   GOMP_MAP_VARS_OPENACC,
-  GOMP_MAP_VARS_OPENACC_ENTER_DATA,
   GOMP_MAP_VARS_TARGET,
   GOMP_MAP_VARS_DATA,
   GOMP_MAP_VARS_ENTER_DATA
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index c06b7341cbb..fff0d573f59 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -407,7 +407,7 @@ acc_map_data (void *h, void *d, size_t s)
       assert (tgt);
       splay_tree_key n = tgt->list[0].key;
       assert (n->refcount == 1);
-      assert (n->virtual_refcount == 0);
+      assert (n->dynamic_refcount == 0);
       /* Special reference counting behavior.  */
       n->refcount = REFCOUNT_INFINITY;
 
@@ -454,7 +454,7 @@ acc_unmap_data (void *h)
 		  (void *) n->host_start, (int) host_size, (void *) h);
     }
   /* TODO This currently doesn't catch 'REFCOUNT_INFINITY' usage different from
-     'acc_map_data'.  Maybe 'virtual_refcount' can be used for disambiguating
+     'acc_map_data'.  Maybe 'dynamic_refcount' can be used for disambiguating
      the different 'REFCOUNT_INFINITY' cases, or simply separate
      'REFCOUNT_INFINITY' values per different usage ('REFCOUNT_ACC_MAP_DATA'
      etc.)?  */
@@ -475,14 +475,19 @@ acc_unmap_data (void *h)
       gomp_mutex_unlock (&acc_dev->lock);
       gomp_fatal ("cannot unmap target block");
     }
-  else if (tgt->refcount > 1)
-    tgt->refcount--;
-  else
+
+  if (tgt->refcount == 1)
     {
-      free (tgt->array);
-      free (tgt);
+      /* This is the last reference.  Nullifying these fields prevents
+	 'gomp_unmap_tgt' via 'gomp_remove_var' from freeing the target
+	 memory.  */
+      tgt->tgt_end = 0;
+      tgt->to_free = NULL;
     }
 
+  bool is_tgt_unmapped = gomp_remove_var (acc_dev, n);
+  assert (is_tgt_unmapped);
+
   gomp_mutex_unlock (&acc_dev->lock);
 
   if (profiling_p)
@@ -540,10 +545,8 @@ goacc_enter_datum (void **hostaddrs, size_t *sizes, void *kinds, int async)
 
       assert (n->refcount != REFCOUNT_LINK);
       if (n->refcount != REFCOUNT_INFINITY)
-	{
-	  n->refcount++;
-	  n->virtual_refcount++;
-	}
+	n->refcount++;
+      n->dynamic_refcount++;
 
       gomp_mutex_unlock (&acc_dev->lock);
     }
@@ -555,16 +558,18 @@ goacc_enter_datum (void **hostaddrs, size_t *sizes, void *kinds, int async)
 
       goacc_aq aq = get_goacc_asyncqueue (async);
 
-      gomp_map_vars_async (acc_dev, aq, mapnum, hostaddrs, NULL, sizes, kinds,
-			   true, GOMP_MAP_VARS_OPENACC_ENTER_DATA);
+      struct target_mem_desc *tgt
+	= gomp_map_vars_async (acc_dev, aq, mapnum, hostaddrs, NULL, sizes,
+			       kinds, true, GOMP_MAP_VARS_ENTER_DATA);
+      assert (tgt);
+      assert (tgt->list_count == 1);
+      n = tgt->list[0].key;
+      assert (n);
+      assert (n->refcount == 1);
+      assert (n->dynamic_refcount == 0);
+      n->dynamic_refcount++;
 
-      gomp_mutex_lock (&acc_dev->lock);
-      n = lookup_host (acc_dev, hostaddrs[0], sizes[0]);
-      assert (n != NULL);
-      assert (n->tgt_offset == 0);
-      assert ((uintptr_t) hostaddrs[0] == n->host_start);
-      d = (void *) n->tgt->tgt_start;
-      gomp_mutex_unlock (&acc_dev->lock);
+      d = (void *) tgt->tgt_start;
     }
 
   if (profiling_p)
@@ -683,23 +688,28 @@ goacc_exit_datum (void *h, size_t s, unsigned short kind, int async)
 		  (void *) h, (int) s, (void *) n->host_start, (int) host_size);
     }
 
+  assert (n->refcount != REFCOUNT_LINK);
+  if (n->refcount != REFCOUNT_INFINITY
+      && n->refcount < n->dynamic_refcount)
+    {
+      gomp_mutex_unlock (&acc_dev->lock);
+      gomp_fatal ("Dynamic reference counting assert fail\n");
+    }
+
   bool finalize = (kind == GOMP_MAP_DELETE
 		   || kind == GOMP_MAP_FORCE_FROM);
   if (finalize)
     {
       if (n->refcount != REFCOUNT_INFINITY)
-	n->refcount -= n->virtual_refcount;
-      n->virtual_refcount = 0;
+	n->refcount -= n->dynamic_refcount;
+      n->dynamic_refcount = 0;
     }
-
-  if (n->virtual_refcount > 0)
+  else if (n->dynamic_refcount)
     {
       if (n->refcount != REFCOUNT_INFINITY)
 	n->refcount--;
-      n->virtual_refcount--;
+      n->dynamic_refcount--;
     }
-  else if (n->refcount > 0 && n->refcount != REFCOUNT_INFINITY)
-    n->refcount--;
 
   if (n->refcount == 0)
     {
@@ -722,8 +732,16 @@ goacc_exit_datum (void *h, size_t s, unsigned short kind, int async)
 	gomp_remove_var_async (acc_dev, n, aq);
       else
 	{
+	  int num_mappings = 0;
+	  /* If the target_mem_desc represents a single data mapping, we can
+	     check that it is freed when this splay tree key's refcount
+	     reaches zero.  Otherwise (e.g. for a struct mapping with multiple
+	     members), fall back to skipping the test.  */
+	  for (int i = 0; i < n->tgt->list_count; i++)
+	    if (n->tgt->list[i].key)
+	      num_mappings++;
 	  bool is_tgt_unmapped = gomp_remove_var (acc_dev, n);
-	  assert (is_tgt_unmapped);
+	  assert (num_mappings > 1 || is_tgt_unmapped);
 	}
     }
 
@@ -1018,13 +1036,102 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 {
   for (size_t i = 0; i < mapnum; i++)
     {
-      int group_last = find_group_last (i, mapnum, sizes, kinds);
+      splay_tree_key n;
+      size_t group_last = find_group_last (i, mapnum, sizes, kinds);
+      bool struct_p = false;
+      size_t size, groupnum = (group_last - i) + 1;
 
-      gomp_map_vars_async (acc_dev, aq,
-			   (group_last - i) + 1,
-			   &hostaddrs[i], NULL,
-			   &sizes[i], &kinds[i], true,
-			   GOMP_MAP_VARS_OPENACC_ENTER_DATA);
+      switch (kinds[i] & 0xff)
+	{
+	case GOMP_MAP_STRUCT:
+	  {
+	    int last = i + sizes[i];
+	    size = (uintptr_t) hostaddrs[last] + sizes[last]
+		   - (uintptr_t) hostaddrs[i];
+	    struct_p = true;
+	  }
+	  break;
+
+	case GOMP_MAP_ATTACH:
+	  size = sizeof (void *);
+	  break;
+
+	default:
+	  size = sizes[i];
+	}
+
+      n = lookup_host (acc_dev, hostaddrs[i], size);
+
+      if (n && struct_p)
+	{
+	  if (n->refcount != REFCOUNT_INFINITY)
+	    n->refcount += groupnum - 1;
+	  n->dynamic_refcount += groupnum - 1;
+	  gomp_mutex_unlock (&acc_dev->lock);
+	}
+      else if (n && groupnum == 1)
+	{
+	  void *h = hostaddrs[i];
+	  size_t s = sizes[i];
+
+	  /* A standalone attach clause.  */
+	  if ((kinds[i] & 0xff) == GOMP_MAP_ATTACH)
+	    gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, n,
+				 (uintptr_t) h, s, NULL);
+	  else if (h + s > (void *) n->host_end)
+	    {
+	      gomp_mutex_unlock (&acc_dev->lock);
+	      gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s);
+	    }
+
+	  assert (n->refcount != REFCOUNT_LINK);
+	  if (n->refcount != REFCOUNT_INFINITY)
+	    n->refcount++;
+	  n->dynamic_refcount++;
+
+	  gomp_mutex_unlock (&acc_dev->lock);
+	}
+      else if (n && groupnum > 1)
+	{
+	  assert (n->refcount != REFCOUNT_INFINITY
+		  && n->refcount != REFCOUNT_LINK);
+
+	  bool processed = false;
+
+	  struct target_mem_desc *tgt = n->tgt;
+	  for (size_t j = 0; j < tgt->list_count; j++)
+	    if (tgt->list[j].key == n)
+	      {
+		for (size_t k = 0; k < groupnum; k++)
+		  if (j + k < tgt->list_count && tgt->list[j + k].key)
+		    {
+		      tgt->list[j + k].key->refcount++;
+		      tgt->list[j + k].key->dynamic_refcount++;
+		    }
+		processed = true;
+	      }
+
+	  gomp_mutex_unlock (&acc_dev->lock);
+	  if (!processed)
+	    gomp_fatal ("dynamic refcount incrementing failed for "
+			"pointer/pset");
+	}
+      else if (hostaddrs[i])
+	{
+	  gomp_mutex_unlock (&acc_dev->lock);
+
+	  struct target_mem_desc *tgt
+	    = gomp_map_vars_async (acc_dev, aq, groupnum, &hostaddrs[i], NULL,
+				   &sizes[i], &kinds[i], true,
+				   GOMP_MAP_VARS_ENTER_DATA);
+	  assert (tgt);
+	  for (size_t j = 0; j < tgt->list_count; j++)
+	    {
+	      n = tgt->list[j].key;
+	      if (n)
+		n->dynamic_refcount++;
+	    }
+	}
 
       i = group_last;
     }
@@ -1115,18 +1222,15 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 	    if (finalize)
 	      {
 		if (n->refcount != REFCOUNT_INFINITY)
-		  n->refcount -= n->virtual_refcount;
-		n->virtual_refcount = 0;
+		  n->refcount -= n->dynamic_refcount;
+		n->dynamic_refcount = 0;
 	      }
-
-	    if (n->virtual_refcount > 0)
+	    else if (n->dynamic_refcount)
 	      {
 		if (n->refcount != REFCOUNT_INFINITY)
 		  n->refcount--;
-		n->virtual_refcount--;
+		n->dynamic_refcount--;
 	      }
-	    else if (n->refcount > 0 && n->refcount != REFCOUNT_INFINITY)
-	      n->refcount--;
 
 	    if (copyfrom
 		&& (kind != GOMP_MAP_FROM || n->refcount == 0))
@@ -1137,45 +1241,40 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 				  cur_node.host_end - cur_node.host_start);
 
 	    if (n->refcount == 0)
-	      gomp_remove_var_async (acc_dev, n, aq);
-	  }
-	  break;
-
-	case GOMP_MAP_STRUCT:
-	  {
-	    int elems = sizes[i];
-	    for (int j = 1; j <= elems; j++)
 	      {
-		struct splay_tree_key_s k;
-		k.host_start = (uintptr_t) hostaddrs[i + j];
-		k.host_end = k.host_start + sizes[i + j];
-		splay_tree_key str;
-		str = splay_tree_lookup (&acc_dev->mem_map, &k);
-		if (str)
+		if (aq)
 		  {
-		    if (finalize)
-		      {
-			if (str->refcount != REFCOUNT_INFINITY)
-			  str->refcount -= str->virtual_refcount;
-			str->virtual_refcount = 0;
-		      }
-		    if (str->virtual_refcount > 0)
+		    /* TODO The way the following code is currently
+		       implemented, we need the 'is_tgt_unmapped' return
+		       value from 'gomp_remove_var', so can't use
+		       'gomp_remove_var_async' here -- see the
+		       'gomp_unref_tgt' comment in
+		       <http://mid.mail-archive.com/878snl36eu.fsf@euler.schwinge.homeip.net>;
+		       PR92881 -- so have to synchronize here.  */
+		    if (!acc_dev->openacc.async.synchronize_func (aq))
 		      {
-			if (str->refcount != REFCOUNT_INFINITY)
-			  str->refcount--;
-			str->virtual_refcount--;
+			gomp_mutex_unlock (&acc_dev->lock);
+			gomp_fatal ("synchronize failed");
 		      }
-		    else if (str->refcount > 0
-			     && str->refcount != REFCOUNT_INFINITY)
-		      str->refcount--;
-		    if (str->refcount == 0)
-		      gomp_remove_var_async (acc_dev, str, aq);
 		  }
+		int num_mappings = 0;
+		/* If the target_mem_desc represents a single data mapping, we
+		   can check that it is freed when this splay tree key's
+		   refcount reaches zero.  Otherwise (e.g. for a struct
+		   mapping with multiple members), fall back to skipping the
+		   test.  */
+		for (int j = 0; j < n->tgt->list_count; j++)
+		  if (n->tgt->list[j].key)
+		    num_mappings++;
+		bool is_tgt_unmapped = gomp_remove_var (acc_dev, n);
+		assert (num_mappings > 1 || is_tgt_unmapped);
 	      }
-	    i += elems;
 	  }
 	  break;
 
+	case GOMP_MAP_STRUCT:
+	  continue;
+
 	default:
 	  gomp_fatal (">>>> goacc_exit_data_internal UNHANDLED kind 0x%.2x",
 			  kind);
diff --git a/libgomp/target.c b/libgomp/target.c
index 36425477dcb..3f2becdae0e 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -666,8 +666,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
   struct target_mem_desc *tgt
     = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
   tgt->list_count = mapnum;
-  tgt->refcount = (pragma_kind == GOMP_MAP_VARS_ENTER_DATA
-		   || pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA) ? 0 : 1;
+  tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
   tgt->device_descr = devicep;
   tgt->prev = NULL;
   struct gomp_coalesce_buf cbuf, *cbufp = NULL;
@@ -1094,7 +1093,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
 		      tgt->list[i].copy_from = false;
 		      tgt->list[i].always_copy_from = false;
 		      tgt->list[i].do_detach
-			= (pragma_kind != GOMP_MAP_VARS_OPENACC_ENTER_DATA);
+			= (pragma_kind != GOMP_MAP_VARS_ENTER_DATA);
 		      n->refcount++;
 		    }
 		  else
@@ -1155,7 +1154,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
 		tgt->list[i].offset = 0;
 		tgt->list[i].length = k->host_end - k->host_start;
 		k->refcount = 1;
-		k->virtual_refcount = 0;
+		k->dynamic_refcount = 0;
 		tgt->refcount++;
 		array->left = NULL;
 		array->right = NULL;
@@ -1294,20 +1293,8 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
   /* If the variable from "omp target enter data" map-list was already mapped,
      tgt is not needed.  Otherwise tgt will be freed by gomp_unmap_vars or
      gomp_exit_data.  */
-  if ((pragma_kind == GOMP_MAP_VARS_ENTER_DATA
-       || pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA)
-      && tgt->refcount == 0)
-    {
-      /* If we're about to discard a target_mem_desc with no "structural"
-	 references (tgt->refcount == 0), any splay keys linked in the tgt's
-	 list must have their virtual refcount incremented to represent that
-	 "lost" reference in order to implement the semantics of the OpenACC
-	 "present increment" operation properly.  */
-      if (pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA)
-	for (i = 0; i < tgt->list_count; i++)
-	  if (tgt->list[i].key)
-	    tgt->list[i].key->virtual_refcount++;
-
+  if (pragma_kind == GOMP_MAP_VARS_ENTER_DATA && tgt->refcount == 0)
+    {
       free (tgt);
       tgt = NULL;
     }
@@ -1459,14 +1446,7 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom,
 	continue;
 
       bool do_unmap = false;
-      if (k->tgt == tgt
-	  && k->virtual_refcount > 0
-	  && k->refcount != REFCOUNT_INFINITY)
-	{
-	  k->virtual_refcount--;
-	  k->refcount--;
-	}
-      else if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY)
+      if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY)
 	k->refcount--;
       else if (k->refcount == 1)
 	{
@@ -1631,7 +1611,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
       k->tgt = tgt;
       k->tgt_offset = target_table[i].start;
       k->refcount = REFCOUNT_INFINITY;
-      k->virtual_refcount = 0;
+      k->dynamic_refcount = 0;
       k->aux = NULL;
       array->left = NULL;
       array->right = NULL;
@@ -1665,7 +1645,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
       k->tgt = tgt;
       k->tgt_offset = target_var->start;
       k->refcount = is_link_var ? REFCOUNT_LINK : REFCOUNT_INFINITY;
-      k->virtual_refcount = 0;
+      k->dynamic_refcount = 0;
       k->aux = NULL;
       array->left = NULL;
       array->right = NULL;
@@ -2935,7 +2915,7 @@ omp_target_associate_ptr (const void *host_ptr, const void *device_ptr,
       k->tgt = tgt;
       k->tgt_offset = (uintptr_t) device_ptr + device_offset;
       k->refcount = REFCOUNT_INFINITY;
-      k->virtual_refcount = 0;
+      k->dynamic_refcount = 0;
       k->aux = NULL;
       array->left = NULL;
       array->right = NULL;
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/refcounting-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/refcounting-1.c
new file mode 100644
index 00000000000..4e6d06d48d5
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/refcounting-1.c
@@ -0,0 +1,31 @@
+/* Test dynamic unmapping of separate structure members.  */
+
+/* { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } } */
+
+#include <assert.h>
+#include <openacc.h>
+
+struct s
+{
+  char a;
+  char b;
+};
+
+int main ()
+{
+  struct s s;
+
+#pragma acc enter data create(s.a, s.b)
+
+  assert (acc_is_present (&s.a, sizeof s.a));
+  assert (acc_is_present (&s.b, sizeof s.b));
+
+#pragma acc exit data delete(s.a)
+#pragma acc exit data delete(s.b)
+
+  assert (!acc_is_present (&s.a, sizeof s.a));
+  assert (!acc_is_present (&s.b, sizeof s.b));
+
+  return 0;
+}
+
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/refcounting-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/refcounting-2.c
new file mode 100644
index 00000000000..5539fd8d57f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/refcounting-2.c
@@ -0,0 +1,31 @@
+/* Test dynamic unmapping of separate structure members.  */
+
+/* { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } } */
+
+#include <assert.h>
+#include <openacc.h>
+
+struct s
+{
+  char a;
+  char b;
+};
+
+int main ()
+{
+  struct s s;
+
+#pragma acc enter data create(s.a, s.b)
+
+  assert (acc_is_present (&s.a, sizeof s.a));
+  assert (acc_is_present (&s.b, sizeof s.b));
+
+  acc_delete (&s.a, sizeof s.a);
+  acc_delete (&s.b, sizeof s.b);
+
+  assert (!acc_is_present (&s.a, sizeof s.a));
+  assert (!acc_is_present (&s.b, sizeof s.b));
+
+  return 0;
+}
+
-- 
2.23.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 3/7] [OpenACC] Don't pass kind array via pointer to goacc_enter_datum
  2020-05-22 22:16 [PATCH 0/7] [OpenACC] Dynamic reference counts for mapped data Julian Brown
  2020-05-22 22:16 ` [PATCH 1/7] [OpenACC] Missing unlocking on error paths in attach/detach code Julian Brown
  2020-05-22 22:16 ` [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics Julian Brown
@ 2020-05-22 22:16 ` Julian Brown
  2020-06-25 10:52   ` Thomas Schwinge
  2020-05-22 22:16 ` [PATCH 4/7] [OpenACC] Fix incompatible copyout for acc_map_data (PR92843) Julian Brown
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Julian Brown @ 2020-05-22 22:16 UTC (permalink / raw)
  To: gcc-patches; +Cc: thomas, Catherine_Moore, jakub

Since goacc_enter_datum only maps a single data item now, there is no
need to pass "kinds" as an array.  Passing as a scalar allows for some
simplification in the function's callers.

OK?

Julian

ChangeLog

	libgomp/
	* oacc-mem.c (goacc_enter_datum): Use scalar kind argument instead of
	kinds array.
	(acc_create, acc_create_async, acc_copyin, acc_copyin_async): Update
	calls to goacc_enter_datum.
---
 libgomp/oacc-mem.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index fff0d573f59..20d241382a8 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -501,7 +501,8 @@ acc_unmap_data (void *h)
 /* Enter dynamic mapping for a single datum.  Return the device pointer.  */
 
 static void *
-goacc_enter_datum (void **hostaddrs, size_t *sizes, void *kinds, int async)
+goacc_enter_datum (void **hostaddrs, size_t *sizes, unsigned short kind,
+		   int async)
 {
   void *d;
   splay_tree_key n;
@@ -560,7 +561,7 @@ goacc_enter_datum (void **hostaddrs, size_t *sizes, void *kinds, int async)
 
       struct target_mem_desc *tgt
 	= gomp_map_vars_async (acc_dev, aq, mapnum, hostaddrs, NULL, sizes,
-			       kinds, true, GOMP_MAP_VARS_ENTER_DATA);
+			       &kind, true, GOMP_MAP_VARS_ENTER_DATA);
       assert (tgt);
       assert (tgt->list_count == 1);
       n = tgt->list[0].key;
@@ -584,15 +585,13 @@ goacc_enter_datum (void **hostaddrs, size_t *sizes, void *kinds, int async)
 void *
 acc_create (void *h, size_t s)
 {
-  unsigned short kinds[1] = { GOMP_MAP_ALLOC };
-  return goacc_enter_datum (&h, &s, &kinds, acc_async_sync);
+  return goacc_enter_datum (&h, &s, GOMP_MAP_ALLOC, acc_async_sync);
 }
 
 void
 acc_create_async (void *h, size_t s, int async)
 {
-  unsigned short kinds[1] = { GOMP_MAP_ALLOC };
-  goacc_enter_datum (&h, &s, &kinds, async);
+  goacc_enter_datum (&h, &s, GOMP_MAP_ALLOC, async);
 }
 
 /* acc_present_or_create used to be what acc_create is now.  */
@@ -617,15 +616,13 @@ acc_pcreate (void *h, size_t s)
 void *
 acc_copyin (void *h, size_t s)
 {
-  unsigned short kinds[1] = { GOMP_MAP_TO };
-  return goacc_enter_datum (&h, &s, &kinds, acc_async_sync);
+  return goacc_enter_datum (&h, &s, GOMP_MAP_TO, acc_async_sync);
 }
 
 void
 acc_copyin_async (void *h, size_t s, int async)
 {
-  unsigned short kinds[1] = { GOMP_MAP_TO };
-  goacc_enter_datum (&h, &s, &kinds, async);
+  goacc_enter_datum (&h, &s, GOMP_MAP_TO, async);
 }
 
 /* acc_present_or_copyin used to be what acc_copyin is now.  */
-- 
2.23.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 4/7] [OpenACC] Fix incompatible copyout for acc_map_data (PR92843)
  2020-05-22 22:16 [PATCH 0/7] [OpenACC] Dynamic reference counts for mapped data Julian Brown
                   ` (2 preceding siblings ...)
  2020-05-22 22:16 ` [PATCH 3/7] [OpenACC] Don't pass kind array via pointer to goacc_enter_datum Julian Brown
@ 2020-05-22 22:16 ` Julian Brown
  2020-05-22 22:21 ` [PATCH 5/7] [OpenACC] Distinguish structural/dynamic mappings in libgomp Julian Brown
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Julian Brown @ 2020-05-22 22:16 UTC (permalink / raw)
  To: gcc-patches; +Cc: thomas, Catherine_Moore, jakub

This patch provides a bug fix (on top of previous patches in this
series) that allows the PR92843 test case to pass. Data mapped in with
"acc_map_data" is not copied out by an "exit data" directive.

OK?

Julian

ChangeLog

	PR libgomp/92843

	libgomp/
	* oacc-mem.c (goacc_exit_data_internal): Don't copyout data mapped
	with acc_map_data in exit data directive.
---
 libgomp/oacc-mem.c                                      | 1 +
 libgomp/testsuite/libgomp.oacc-c-c++-common/pr92843-1.c | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 20d241382a8..c2b4a131a5f 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -1230,6 +1230,7 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 	      }
 
 	    if (copyfrom
+		&& n->refcount != REFCOUNT_INFINITY
 		&& (kind != GOMP_MAP_FROM || n->refcount == 0))
 	      gomp_copy_dev2host (acc_dev, aq, (void *) cur_node.host_start,
 				  (void *) (n->tgt->tgt_start + n->tgt_offset
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr92843-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr92843-1.c
index f16c46a37bf..db5b35b08d9 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr92843-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr92843-1.c
@@ -1,7 +1,6 @@
 /* Verify that 'acc_copyout' etc. is a no-op if there's still a structured
    reference count.  */
 
-/* { dg-xfail-run-if "TODO PR92843" { *-*-* } } */
 /* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
 
 #include <assert.h>
-- 
2.23.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 5/7] [OpenACC] Distinguish structural/dynamic mappings in libgomp
  2020-05-22 22:16 [PATCH 0/7] [OpenACC] Dynamic reference counts for mapped data Julian Brown
                   ` (3 preceding siblings ...)
  2020-05-22 22:16 ` [PATCH 4/7] [OpenACC] Fix incompatible copyout for acc_map_data (PR92843) Julian Brown
@ 2020-05-22 22:21 ` Julian Brown
  2020-05-22 22:21 ` [PATCH 6/7] [OpenACC] Reference count self-checking (dynamic_refcount version) Julian Brown
  2020-05-22 22:21 ` [PATCH 7/7] [OpenACC] Stricter dynamic data unmapping testing (WIP) Julian Brown
  6 siblings, 0 replies; 16+ messages in thread
From: Julian Brown @ 2020-05-22 22:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: thomas, Catherine_Moore, jakub

This patch provides support for distinguishing target_mem_descs introduced
via structured data lifetimes from those arising from dynamic data
lifetimes.  This is a prerequisite for the following reference-count
self-checking patch.

This patch (and those following it) are not vital for this patch series,
but are "nice-to-have" additions.

OK?

Julian

	libgomp/
	* libgomp.h (struct target_mem_desc): Update comment on prev field.
	* oacc-int.h (goacc_mark_dynamic, goacc_marked_dynamic_p): Add
	prototypes.
	* oacc-mem.c (dyn_tgt_sentinel): New.
	(goacc_mark_dynamic, goacc_marked_dynamic_p): New functions.
	(goacc_enter_datum): Call goacc_mark_dynamic.
	(goacc_enter_data_internal): Likewise.
	* target.c (gomp_unmap_vars_internal): Convert a target_mem_desc from
	a structural mapping to dynamic when appropriate.
---
 libgomp/libgomp.h  |  3 ++-
 libgomp/oacc-int.h |  3 +++
 libgomp/oacc-mem.c | 28 ++++++++++++++++++++++++++++
 libgomp/target.c   |  8 +++++++-
 4 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 7b52ce7d5c2..0d1978ffb13 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -971,7 +971,8 @@ struct target_mem_desc {
   uintptr_t tgt_end;
   /* Handle to free.  */
   void *to_free;
-  /* Previous target_mem_desc.  */
+  /* Previous target_mem_desc.  Also used in OpenACC to indicate that this
+     target_mem_desc is used only for an "enter data" mapping.  */
   struct target_mem_desc *prev;
   /* Number of items in following list.  */
   size_t list_count;
diff --git a/libgomp/oacc-int.h b/libgomp/oacc-int.h
index 3c2c9b84b2f..2d8d3eb5a4b 100644
--- a/libgomp/oacc-int.h
+++ b/libgomp/oacc-int.h
@@ -165,6 +165,9 @@ bool _goacc_profiling_setup_p (struct goacc_thread *,
 void goacc_profiling_dispatch (acc_prof_info *, acc_event_info *,
 			       acc_api_info *);
 
+extern void goacc_mark_dynamic (struct target_mem_desc *);
+extern bool goacc_marked_dynamic_p (struct target_mem_desc *tgt);
+
 #ifdef HAVE_ATTRIBUTE_VISIBILITY
 # pragma GCC visibility pop
 #endif
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index c2b4a131a5f..038ab68e8a2 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -497,6 +497,30 @@ acc_unmap_data (void *h)
     }
 }
 
+/* Indicate (via storing its address in the "prev" field) a target_mem_desc
+   that is used for an "enter data" mapping.  */
+const static struct target_mem_desc dyn_tgt_sentinel;
+
+attribute_hidden void
+goacc_mark_dynamic (struct target_mem_desc *tgt)
+{
+  tgt->prev = (struct target_mem_desc *) &dyn_tgt_sentinel;
+}
+
+attribute_hidden bool
+goacc_marked_dynamic_p (struct target_mem_desc *tgt)
+{
+  return tgt->prev == (struct target_mem_desc *) &dyn_tgt_sentinel;
+}
 
 /* Enter dynamic mapping for a single datum.  Return the device pointer.  */
 
@@ -563,6 +587,8 @@ goacc_enter_datum (void **hostaddrs, size_t *sizes, unsigned short kind,
 	= gomp_map_vars_async (acc_dev, aq, mapnum, hostaddrs, NULL, sizes,
 			       &kind, true, GOMP_MAP_VARS_ENTER_DATA);
       assert (tgt);
+      goacc_mark_dynamic (tgt);
+
       assert (tgt->list_count == 1);
       n = tgt->list[0].key;
       assert (n);
@@ -1122,6 +1148,8 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 				   &sizes[i], &kinds[i], true,
 				   GOMP_MAP_VARS_ENTER_DATA);
 	  assert (tgt);
+	  goacc_mark_dynamic (tgt);
+
 	  for (size_t j = 0; j < tgt->list_count; j++)
 	    {
 	      n = tgt->list[j].key;
diff --git a/libgomp/target.c b/libgomp/target.c
index 3f2becdae0e..1d60d0cb573 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1447,7 +1447,13 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom,
 
       bool do_unmap = false;
       if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY)
-	k->refcount--;
+	{
+	  k->refcount--;
+	  /* If we only have dynamic references left, mark the tgt_mem_desc
+	     appropriately.  */
+	  if (k->refcount == k->dynamic_refcount)
+	    goacc_mark_dynamic (k->tgt);
+	}
       else if (k->refcount == 1)
 	{
 	  k->refcount--;
-- 
2.23.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 6/7] [OpenACC] Reference count self-checking (dynamic_refcount version)
  2020-05-22 22:16 [PATCH 0/7] [OpenACC] Dynamic reference counts for mapped data Julian Brown
                   ` (4 preceding siblings ...)
  2020-05-22 22:21 ` [PATCH 5/7] [OpenACC] Distinguish structural/dynamic mappings in libgomp Julian Brown
@ 2020-05-22 22:21 ` Julian Brown
  2020-06-18 18:40   ` Julian Brown
  2020-05-22 22:21 ` [PATCH 7/7] [OpenACC] Stricter dynamic data unmapping testing (WIP) Julian Brown
  6 siblings, 1 reply; 16+ messages in thread
From: Julian Brown @ 2020-05-22 22:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: thomas, Catherine_Moore, jakub

This is a new version of the reference count self-checking code, adjusted
to work with the new (old) dynamic_refcount counting scheme.  The key
observation is that a target_mem_desc that was created from a dynamic
data lifetime should not contribute to the structured refcount for splay
tree keys in its variable list.  We can figure out which target_mem_descs
that applies to using the information recorded in the previous patch.

In a sense, this takes the "awkward corner cases" from the
virtual_refcount ("overhaul") patch, and moves them to the optional
self-test code, where they can potentially do less harm.  With this, we
still have a formal-ish model of what refcounts mean and some confidence
that they remain consistent (at least throughout execution of a test run),
which I think is a good thing.

OK? (We probably want a way of configuring-in this testing automatically,
as mentioned previously.)

Julian

ChangeLog

	libgomp/
	* libgomp.h (RC_CHECKING): New macro, disabled by default, guarding all
	hunks in this patch.
	(target_mem_desc): Add refcount_chk, mark fields.
	(splay_tree_key_s): Add refcount_chk field.
	(dump_tgt, gomp_rc_check): Add prototypes.
	* oacc-mem.c (GOACC_enter_exit_data): Add refcount self-check code.
	* oacc-parallel.c (GOACC_parallel_keyed_internal): Add refcount
	self-check code.
	(GOACC_data_start, GOACC_data_end, GOACC_enter_exit_data): Likewise.
	* target.c (stdio.h): Include.
	(dump_tgt, rc_check_clear, rc_check_count, rc_check_verify,
	gomp_rc_check): New functions to consistency-check reference counts.
---
 libgomp/libgomp.h       |  18 ++++
 libgomp/oacc-mem.c      |   6 ++
 libgomp/oacc-parallel.c |  27 ++++++
 libgomp/target.c        | 185 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 236 insertions(+)

diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 0d1978ffb13..eaa7c6ebb4c 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -960,9 +960,17 @@ struct target_var_desc {
   uintptr_t length;
 };
 
+/* Uncomment to enable reference-count consistency checking (for development
+   use only).  */
+//#define RC_CHECKING 1
+
 struct target_mem_desc {
   /* Reference count.  */
   uintptr_t refcount;
+#ifdef RC_CHECKING
+  uintptr_t refcount_chk;
+  bool mark;
+#endif
   /* All the splay nodes allocated together.  */
   splay_tree_node array;
   /* Start of the target region.  */
@@ -1019,6 +1027,10 @@ struct splay_tree_key_s {
   uintptr_t refcount;
   /* Dynamic reference count.  */
   uintptr_t dynamic_refcount;
+#ifdef RC_CHECKING
+  /* The recalculated reference count, for verification.  */
+  uintptr_t refcount_chk;
+#endif
   struct splay_tree_aux *aux;
 };
 
@@ -1174,6 +1186,12 @@ extern void gomp_detach_pointer (struct gomp_device_descr *,
 				 struct goacc_asyncqueue *, splay_tree_key,
 				 uintptr_t, bool, struct gomp_coalesce_buf *);
 
+#ifdef RC_CHECKING
+extern void dump_tgt (const char *, struct target_mem_desc *);
+extern void gomp_rc_check (struct gomp_device_descr *,
+			   struct target_mem_desc *);
+#endif
+
 extern struct target_mem_desc *gomp_map_vars (struct gomp_device_descr *,
 					      size_t, void **, void **,
 					      size_t *, void *, bool,
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 038ab68e8a2..c8ec3c9a7dd 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -1450,4 +1450,10 @@ GOACC_enter_exit_data (int flags_m, size_t mapnum, void **hostaddrs,
       thr->prof_info = NULL;
       thr->api_info = NULL;
     }
+
+#ifdef RC_CHECKING
+  gomp_mutex_lock (&acc_dev->lock);
+  gomp_rc_check (acc_dev, thr->mapped_data);
+  gomp_mutex_unlock (&acc_dev->lock);
+#endif
 }
diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
index c7e46e35bd6..0774cdc7e4f 100644
--- a/libgomp/oacc-parallel.c
+++ b/libgomp/oacc-parallel.c
@@ -301,6 +301,15 @@ GOACC_parallel_keyed (int flags_m, void (*fn) (void *),
 				&api_info);
     }
   
+#ifdef RC_CHECKING
+  gomp_mutex_lock (&acc_dev->lock);
+  assert (tgt);
+  dump_tgt (__FUNCTION__, tgt);
+  tgt->prev = thr->mapped_data;
+  gomp_rc_check (acc_dev, tgt);
+  gomp_mutex_unlock (&acc_dev->lock);
+#endif
+
   devaddrs = gomp_alloca (sizeof (void *) * mapnum);
   for (i = 0; i < mapnum; i++)
     devaddrs[i] = (void *) gomp_map_val (tgt, hostaddrs, i);
@@ -347,6 +356,12 @@ GOACC_parallel_keyed (int flags_m, void (*fn) (void *),
       thr->prof_info = NULL;
       thr->api_info = NULL;
     }
+
+#ifdef RC_CHECKING
+  gomp_mutex_lock (&acc_dev->lock);
+  gomp_rc_check (acc_dev, thr->mapped_data);
+  gomp_mutex_unlock (&acc_dev->lock);
+#endif
 }
 
 /* Legacy entry point (GCC 5).  Only provide host fallback execution.  */
@@ -481,6 +496,12 @@ GOACC_data_start (int flags_m, size_t mapnum,
       thr->prof_info = NULL;
       thr->api_info = NULL;
     }
+
+#ifdef RC_CHECKING
+  gomp_mutex_lock (&acc_dev->lock);
+  gomp_rc_check (acc_dev, thr->mapped_data);
+  gomp_mutex_unlock (&acc_dev->lock);
+#endif
 }
 
 void
@@ -554,6 +575,12 @@ GOACC_data_end (void)
       thr->prof_info = NULL;
       thr->api_info = NULL;
     }
+
+#ifdef RC_CHECKING
+  gomp_mutex_lock (&thr->dev->lock);
+  gomp_rc_check (thr->dev, thr->mapped_data);
+  gomp_mutex_unlock (&thr->dev->lock);
+#endif
 }
 
 void
diff --git a/libgomp/target.c b/libgomp/target.c
index 1d60d0cb573..9a51e1c70f6 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -38,6 +38,9 @@
 #include <string.h>
 #include <assert.h>
 #include <errno.h>
+#ifdef RC_CHECKING
+#include <stdio.h>
+#endif
 
 #ifdef PLUGIN_SUPPORT
 #include <dlfcn.h>
@@ -347,6 +350,188 @@ gomp_free_device_memory (struct gomp_device_descr *devicep, void *devptr)
     }
 }
 
+#ifdef RC_CHECKING
+void
+dump_tgt (const char *where, struct target_mem_desc *tgt)
+{
+  if (!getenv ("GOMP_DEBUG_TGT"))
+    return;
+
+  fprintf (stderr, "%s: %s: tgt=%p\n", __FUNCTION__, where, (void*) tgt);
+  fprintf (stderr, "refcount=%d\n", (int) tgt->refcount);
+  fprintf (stderr, "tgt_start=%p\n", (void*) tgt->tgt_start);
+  fprintf (stderr, "tgt_end=%p\n", (void*) tgt->tgt_end);
+  fprintf (stderr, "to_free=%p\n", tgt->to_free);
+  fprintf (stderr, "list_count=%d\n", (int) tgt->list_count);
+  for (int i = 0; i < tgt->list_count; i++)
+    {
+      fprintf (stderr, "list item %d:\n", i);
+      fprintf (stderr, "  key: %p\n", (void*) tgt->list[i].key);
+      if (tgt->list[i].key)
+	{
+	  fprintf (stderr, "  key.host_start=%p\n",
+		   (void*) tgt->list[i].key->host_start);
+	  fprintf (stderr, "  key.host_end=%p\n",
+		   (void*) tgt->list[i].key->host_end);
+	  fprintf (stderr, "  key.tgt=%p\n", (void*) tgt->list[i].key->tgt);
+	  fprintf (stderr, "  key.offset=%d\n",
+		   (int) tgt->list[i].key->tgt_offset);
+	  fprintf (stderr, "  key.refcount=%d\n",
+		   (int) tgt->list[i].key->refcount);
+	  fprintf (stderr, "  key.dynamic_refcount=%d\n",
+		   (int) tgt->list[i].key->dynamic_refcount);
+	  if (tgt->list[i].key->aux)
+	    {
+	      fprintf (stderr, "  key.aux.link_key=%p\n",
+		       (void*) tgt->list[i].key->aux->link_key);
+	      fprintf (stderr, "  key.aux.attach_count=%p\n",
+		       (void*) tgt->list[i].key->aux->attach_count);
+	    }
+	}
+    }
+  fprintf (stderr, "\n");
+}
+
+static void
+rc_check_clear (splay_tree_node node)
+{
+  splay_tree_key k = &node->key;
+
+  k->refcount_chk = 0;
+  k->tgt->refcount_chk = 0;
+  k->tgt->mark = false;
+
+  if (node->left)
+    rc_check_clear (node->left);
+  if (node->right)
+    rc_check_clear (node->right);
+}
+
+static void
+rc_check_count (splay_tree_node node)
+{
+  splay_tree_key k = &node->key;
+  struct target_mem_desc *t;
+
+  /* Add dynamic reference counts ("acc enter data", etc.) for this key.  */
+  k->refcount_chk += k->dynamic_refcount;
+
+  t = k->tgt;
+  t->refcount_chk++;
+
+  /* Do not count references from tgt_mem_descs that arise from dynamic data
+     lifetimes: those are counted already by their keys' dynamic_refcount.  */
+  if (!t->mark && goacc_marked_dynamic_p (t))
+    t->mark = true;
+  else if (!t->mark)
+    {
+      for (int i = 0; i < t->list_count; i++)
+	if (t->list[i].key)
+	  t->list[i].key->refcount_chk++;
+
+      t->mark = true;
+    }
+
+  if (node->left)
+    rc_check_count (node->left);
+  if (node->right)
+    rc_check_count (node->right);
+}
+
+static bool
+rc_check_verify (splay_tree_node node, bool noisy, bool errors)
+{
+  splay_tree_key k = &node->key;
+  struct target_mem_desc *t;
+
+  if (k->refcount != REFCOUNT_INFINITY)
+    {
+      if (noisy)
+	fprintf (stderr, "key %p (%p..+%d): rc=%d/%d, dyn_rc=%d\n", k,
+		 (void *) k->host_start, (int) (k->host_end - k->host_start),
+		 (int) k->refcount, (int) k->refcount_chk,
+		 (int) k->dynamic_refcount);
+
+      if (k->refcount != k->refcount_chk)
+	{
+	  if (noisy)
+	    fprintf (stderr, "  -- key refcount mismatch!\n");
+	  errors = true;
+	}
+
+      t = k->tgt;
+
+      if (noisy)
+	fprintf (stderr, "tgt %p: rc=%d/%d\n", t, (int) t->refcount,
+		 (int) t->refcount_chk);
+
+      if (t->refcount != t->refcount_chk)
+	{
+	  if (noisy)
+	    fprintf (stderr,
+		     "  -- target memory descriptor refcount mismatch!\n");
+	  errors = true;
+	}
+    }
+
+  if (node->left)
+    errors |= rc_check_verify (node->left, noisy, errors);
+  if (node->right)
+    errors |= rc_check_verify (node->right, noisy, errors);
+
+  return errors;
+}
+
+/* Call with device locked.  */
+
+attribute_hidden void
+gomp_rc_check (struct gomp_device_descr *devicep, struct target_mem_desc *tgt)
+{
+  splay_tree sp = &devicep->mem_map;
+
+  bool noisy = getenv ("GOMP_DEBUG_TGT") != 0;
+
+  if (noisy)
+    fprintf (stderr, "\n*** GOMP_RC_CHECK ***\n\n");
+
+  if (sp->root)
+    {
+      rc_check_clear (sp->root);
+
+      for (struct target_mem_desc *t = tgt; t; t = t->prev)
+	{
+	  t->refcount_chk = 0;
+	  t->mark = false;
+	}
+
+      /* Add references for interconnected splay-tree keys.  */
+      rc_check_count (sp->root);
+
+      /* Add references for the tgt for a currently-executing kernel and/or
+	 any enclosing data directives.  */
+      for (struct target_mem_desc *t = tgt; t; t = t->prev)
+	{
+	  t->refcount_chk++;
+
+	  if (!t->mark)
+	    {
+	      for (int i = 0; i < t->list_count; i++)
+		if (t->list[i].key)
+		  t->list[i].key->refcount_chk++;
+
+	      t->mark = true;
+	    }
+	}
+
+      if (rc_check_verify (sp->root, noisy, false))
+	{
+	  gomp_mutex_unlock (&devicep->lock);
+	  gomp_fatal ("refcount checking failure");
+	}
+    }
+}
+#endif
+
 /* Handle the case where gomp_map_lookup, splay_tree_lookup or
    gomp_map_0len_lookup found oldn for newn.
    Helper function of gomp_map_vars.  */
-- 
2.23.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 7/7] [OpenACC] Stricter dynamic data unmapping testing (WIP)
  2020-05-22 22:16 [PATCH 0/7] [OpenACC] Dynamic reference counts for mapped data Julian Brown
                   ` (5 preceding siblings ...)
  2020-05-22 22:21 ` [PATCH 6/7] [OpenACC] Reference count self-checking (dynamic_refcount version) Julian Brown
@ 2020-05-22 22:21 ` Julian Brown
  6 siblings, 0 replies; 16+ messages in thread
From: Julian Brown @ 2020-05-22 22:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: thomas, Catherine_Moore, jakub

Using the ability to distinguish structural from dynamic mappings'
target_mem_descs, we can adjust how the assertions in goacc_exit_datum
and goacc_exit_data_internal work.  This is possibly a slightly stronger
test than the one introduced earlier in this patch series -- though
actually I haven't quite convinced myself of that.

Anyway, this passes a regression run, with the refcount self-checking
code enabled also.

OK, or any comments?

Julian

ChangeLog

	libgomp/
	* oacc-mem.c (goacc_exit_datum): Adjust self-test code.
	(goacc_exit_data_internal): Likewise.
	* target.c (gomp_unmap_vars_internal): Clear target_mem_desc variable
	list keys on unmapping.
---
 libgomp/oacc-mem.c | 43 ++++++++++++++++++++++++-------------------
 libgomp/target.c   |  8 +++++++-
 2 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index c8ec3c9a7dd..d7a1d87c9ef 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -755,16 +755,19 @@ goacc_exit_datum (void *h, size_t s, unsigned short kind, int async)
 	gomp_remove_var_async (acc_dev, n, aq);
       else
 	{
-	  int num_mappings = 0;
-	  /* If the target_mem_desc represents a single data mapping, we can
-	     check that it is freed when this splay tree key's refcount
-	     reaches zero.  Otherwise (e.g. for a struct mapping with multiple
-	     members), fall back to skipping the test.  */
-	  for (int i = 0; i < n->tgt->list_count; i++)
-	    if (n->tgt->list[i].key)
-	      num_mappings++;
+	  int remaining_mappings = 0;
+	  bool dynamic = goacc_marked_dynamic_p (n->tgt);
+	  if (dynamic)
+	    {
+	      /* For dynamic mappings, we may have more than one live splay
+		 tree in the target_mem_desc's variable list.  That's not an
+		 error.  */
+	      for (int i = 0; i < n->tgt->list_count; i++)
+		if (n->tgt->list[i].key)
+		  remaining_mappings++;
+	    }
 	  bool is_tgt_unmapped = gomp_remove_var (acc_dev, n);
-	  assert (num_mappings > 1 || is_tgt_unmapped);
+	  assert ((dynamic && remaining_mappings > 0) || is_tgt_unmapped);
 	}
     }
 
@@ -1283,17 +1286,19 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 			gomp_fatal ("synchronize failed");
 		      }
 		  }
-		int num_mappings = 0;
-		/* If the target_mem_desc represents a single data mapping, we
-		   can check that it is freed when this splay tree key's
-		   refcount reaches zero.  Otherwise (e.g. for a struct
-		   mapping with multiple members), fall back to skipping the
-		   test.  */
-		for (int j = 0; j < n->tgt->list_count; j++)
-		  if (n->tgt->list[j].key)
-		    num_mappings++;
+		int remaining_mappings = 0;
+		bool dynamic = goacc_marked_dynamic_p (n->tgt);
+		if (dynamic)
+		  {
+		   /* For dynamic mappings, we may have more than one live
+		      splay tree in the target_mem_desc's variable list.
+		      That's not an error.  */
+		    for (int j = 0; j < n->tgt->list_count; j++)
+		      if (n->tgt->list[j].key)
+			remaining_mappings++;
+		  }
 		bool is_tgt_unmapped = gomp_remove_var (acc_dev, n);
-		assert (num_mappings > 1 || is_tgt_unmapped);
+		assert ((dynamic && remaining_mappings > 0) || is_tgt_unmapped);
 	      }
 	  }
 	  break;
diff --git a/libgomp/target.c b/libgomp/target.c
index 9a51e1c70f6..f072e050cc1 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1630,6 +1630,7 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom,
       if (k == NULL)
 	continue;
 
+      bool clear_mapping = true;
       bool do_unmap = false;
       if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY)
 	{
@@ -1637,7 +1638,10 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom,
 	  /* If we only have dynamic references left, mark the tgt_mem_desc
 	     appropriately.  */
 	  if (k->refcount == k->dynamic_refcount)
-	    goacc_mark_dynamic (k->tgt);
+	    {
+	      goacc_mark_dynamic (k->tgt);
+	      clear_mapping = false;
+	    }
 	}
       else if (k->refcount == 1)
 	{
@@ -1662,6 +1666,8 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom,
 	  assert (!is_tgt_unmapped
 		  || k_tgt != tgt);
 	}
+      if (clear_mapping)
+	tgt->list[i].key = NULL;
     }
 
   if (aq)
-- 
2.23.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics
  2020-05-22 22:16 ` [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics Julian Brown
@ 2020-06-03 12:36   ` Thomas Schwinge
  2020-06-03 15:19     ` Thomas Schwinge
  2020-06-04 18:26   ` [OpenACC] Don't open-code 'gomp_remove_var' in 'acc_unmap_data' (was: [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics) Thomas Schwinge
  1 sibling, 1 reply; 16+ messages in thread
From: Thomas Schwinge @ 2020-06-03 12:36 UTC (permalink / raw)
  To: Julian Brown; +Cc: jakub, gcc-patches

Hi Julian!

On 2020-05-22T15:16:05-0700, Julian Brown <julian@codesourcery.com> wrote:
> This patch adjusts the semantics of dynamic reference counts, as described
> in the parent email.

Thanks!

A few questions, but no need to send an updated patch.

> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c

> @@ -1018,13 +1036,102 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
>  {
>    for (size_t i = 0; i < mapnum; i++)
>      {
> -      int group_last = find_group_last (i, mapnum, sizes, kinds);
> +      splay_tree_key n;
> +      size_t group_last = find_group_last (i, mapnum, sizes, kinds);
> +      bool struct_p = false;
> +      size_t size, groupnum = (group_last - i) + 1;
>
> -      gomp_map_vars_async (acc_dev, aq,
> -                        (group_last - i) + 1,
> -                        &hostaddrs[i], NULL,
> -                        &sizes[i], &kinds[i], true,
> -                        GOMP_MAP_VARS_OPENACC_ENTER_DATA);
> +      switch (kinds[i] & 0xff)
> +     {
> +     case GOMP_MAP_STRUCT:
> +       {
> +         int last = i + sizes[i];

The 'last' calculated here must always equal the 'group_last' calculated
above.  ;-) (... so we might just use 'group_last' instead of 'last' in
the following.)

> +         size = (uintptr_t) hostaddrs[last] + sizes[last]
> +                - (uintptr_t) hostaddrs[i];
> +         struct_p = true;
> +       }
> +       break;
> +
> +     case GOMP_MAP_ATTACH:
> +       size = sizeof (void *);
> +       break;
> +
> +     default:
> +       size = sizes[i];
> +     }
> +
> +      n = lookup_host (acc_dev, hostaddrs[i], size);
> +

> +      if (n && struct_p)
> +     {
> +       if (n->refcount != REFCOUNT_INFINITY)
> +         n->refcount += groupnum - 1;
> +       n->dynamic_refcount += groupnum - 1;
> +       gomp_mutex_unlock (&acc_dev->lock);
> +     }

Is the 'GOMP_MAP_STRUCT' handling here specifically necessary, or is that
just an optimization of the 'n && groupnum > 1' case below?

> +      else if (n && groupnum == 1)
> +     {
> +       void *h = hostaddrs[i];
> +       size_t s = sizes[i];
> +
> +       /* A standalone attach clause.  */
> +       if ((kinds[i] & 0xff) == GOMP_MAP_ATTACH)
> +         gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, n,
> +                              (uintptr_t) h, s, NULL);
> +       else if (h + s > (void *) n->host_end)
> +         {
> +           gomp_mutex_unlock (&acc_dev->lock);
> +           gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s);
> +         }
> +
> +       assert (n->refcount != REFCOUNT_LINK);
> +       if (n->refcount != REFCOUNT_INFINITY)
> +         n->refcount++;
> +       n->dynamic_refcount++;
> +
> +       gomp_mutex_unlock (&acc_dev->lock);
> +     }

> +      else if (n && groupnum > 1)
> +     {
> +       assert (n->refcount != REFCOUNT_INFINITY
> +               && n->refcount != REFCOUNT_LINK);
> +
> +       bool processed = false;
> +
> +       struct target_mem_desc *tgt = n->tgt;
> +       for (size_t j = 0; j < tgt->list_count; j++)
> +         if (tgt->list[j].key == n)
> +           {
> +             for (size_t k = 0; k < groupnum; k++)
> +               if (j + k < tgt->list_count && tgt->list[j + k].key)
> +                 {
> +                   tgt->list[j + k].key->refcount++;
> +                   tgt->list[j + k].key->dynamic_refcount++;
> +                 }
> +             processed = true;
> +           }
> +
> +       gomp_mutex_unlock (&acc_dev->lock);
> +       if (!processed)
> +         gomp_fatal ("dynamic refcount incrementing failed for "
> +                     "pointer/pset");
> +     }

Please add some text to explain the nested 'j', 'k' loops and their 'if'
conditionals, and the 'groupnum' usage in the 'k' loop boundary.  Should
the 'k' loop maybe run 'for (size_t k = j; k < tgt->list_count; ++k)'
(..., or is 'groupnum' relevant?), and in the loop body then use 'k'
instead of 'j + k'?  (Maybe I've now confused myself, staring at this for
a while...)

> +      else if (hostaddrs[i])
> +     {
> +       gomp_mutex_unlock (&acc_dev->lock);
> +
> +       struct target_mem_desc *tgt
> +         = gomp_map_vars_async (acc_dev, aq, groupnum, &hostaddrs[i], NULL,
> +                                &sizes[i], &kinds[i], true,
> +                                GOMP_MAP_VARS_ENTER_DATA);
> +       assert (tgt);
> +       for (size_t j = 0; j < tgt->list_count; j++)
> +         {
> +           n = tgt->list[j].key;
> +           if (n)
> +             n->dynamic_refcount++;
> +         }
> +     }

... else nothing.  This latter "nothing" case (not present, and no
'hostaddrs[i]') is exercised by
'libgomp.oacc-fortran/optional-data-enter-exit.f90' (only).  Is that
alright?

>
>        i = group_last;
>      }


> @@ -1137,45 +1241,40 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,

(Diff slightly edited.)

>           if (n->refcount == 0)
> -           gomp_remove_var_async (acc_dev, n, aq);

> +           {
> +             if (aq)
> +               {
> +                 /* TODO The way the following code is currently
> +                    implemented, we need the 'is_tgt_unmapped' return
> +                    value from 'gomp_remove_var', so can't use
> +                    'gomp_remove_var_async' here -- see the
> +                    'gomp_unref_tgt' comment in
> +                    <http://mid.mail-archive.com/878snl36eu.fsf@euler.schwinge.homeip.net>;
> +                    PR92881 -- so have to synchronize here.  */
> +                 if (!acc_dev->openacc.async.synchronize_func (aq))
> +                   {
> +                     gomp_mutex_unlock (&acc_dev->lock);
> +                     gomp_fatal ("synchronize failed");
> +                   }
> +               }

As far as I understand, it's no longer true that "The way the following
code is [...] implemented, we need the 'is_tgt_unmapped' return value
from 'gomp_remove_var'".  In particular, we now can/should "use
'gomp_remove_var_async' here", and no longer "have to synchronize here"?

Indeed I'm happy to see that the logic below no longer depends on
'is_tgt_unmapped' for its loop exit condition.  Instead of the above,
this now can use the standard pattern:

    if (aq)
      /* TODO We can't do the 'is_tgt_unmapped' checking -- see the
         'gomp_unref_tgt' comment in
         <http://mid.mail-archive.com/878snl36eu.fsf@euler.schwinge.homeip.net>;
         PR92881.  */
      gomp_remove_var_async (acc_dev, n, aq);
    else
      { [as follows] }

> +             int num_mappings = 0;
> +             /* If the target_mem_desc represents a single data mapping, we
> +                can check that it is freed when this splay tree key's
> +                refcount reaches zero.  Otherwise (e.g. for a struct
> +                mapping with multiple members), fall back to skipping the
> +                test.  */
> +             for (int j = 0; j < n->tgt->list_count; j++)
> +               if (n->tgt->list[j].key)
> +                 num_mappings++;
> +             bool is_tgt_unmapped = gomp_remove_var (acc_dev, n);
> +             assert (num_mappings > 1 || is_tgt_unmapped);
> +           }
>         }
>         break;

For reference, the old logic (mandating what was described in the comment
above) was:

    bool is_tgt_unmapped = false;
    for (size_t i = 0; i < t->list_count; i++)
     {
       is_tgt_unmapped = gomp_remove_var (acc_dev, t->list[i].key);
       if (is_tgt_unmapped)
         break;
     }
    assert (is_tgt_unmapped);


Grüße
 Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics
  2020-06-03 12:36   ` Thomas Schwinge
@ 2020-06-03 15:19     ` Thomas Schwinge
  2020-06-05 22:03       ` Julian Brown
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Schwinge @ 2020-06-03 15:19 UTC (permalink / raw)
  To: Julian Brown; +Cc: jakub, gcc-patches

Hi Julian!

On 2020-06-03T14:36:14+0200, I wrote:
> On 2020-05-22T15:16:05-0700, Julian Brown <julian@codesourcery.com> wrote:
>> This patch adjusts the semantics of dynamic reference counts, as described
>> in the parent email.
>
> Thanks!
>
> A few questions, but no need to send an updated patch.
>
>> --- a/libgomp/oacc-mem.c
>> +++ b/libgomp/oacc-mem.c
>
>> @@ -1018,13 +1036,102 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
>>  {
>>    for (size_t i = 0; i < mapnum; i++)
>>      {
>> -      int group_last = find_group_last (i, mapnum, sizes, kinds);
>> +      splay_tree_key n;
>> +      size_t group_last = find_group_last (i, mapnum, sizes, kinds);
>> +      bool struct_p = false;
>> +      size_t size, groupnum = (group_last - i) + 1;
>>
>> -      gomp_map_vars_async (acc_dev, aq,
>> -                       (group_last - i) + 1,
>> -                       &hostaddrs[i], NULL,
>> -                       &sizes[i], &kinds[i], true,
>> -                       GOMP_MAP_VARS_OPENACC_ENTER_DATA);
>> +      switch (kinds[i] & 0xff)
>> +    {
>> +    case GOMP_MAP_STRUCT:
>> +      {
>> +        int last = i + sizes[i];
>
> The 'last' calculated here must always equal the 'group_last' calculated
> above.  ;-) (... so we might just use 'group_last' instead of 'last' in
> the following.)
>
>> +        size = (uintptr_t) hostaddrs[last] + sizes[last]
>> +               - (uintptr_t) hostaddrs[i];
>> +        struct_p = true;
>> +      }
>> +      break;
>> +
>> +    case GOMP_MAP_ATTACH:
>> +      size = sizeof (void *);
>> +      break;
>> +
>> +    default:
>> +      size = sizes[i];
>> +    }
>> +
>> +      n = lookup_host (acc_dev, hostaddrs[i], size);
>> +
>
>> +      if (n && struct_p)
>> +    {
>> +      if (n->refcount != REFCOUNT_INFINITY)
>> +        n->refcount += groupnum - 1;
>> +      n->dynamic_refcount += groupnum - 1;
>> +      gomp_mutex_unlock (&acc_dev->lock);
>> +    }
>
> Is the 'GOMP_MAP_STRUCT' handling here specifically necessary, or is that
> just an optimization of the 'n && groupnum > 1' case below?

Eh, OK, I think I see where this is going; the 'n && groupnum > 1' case
below might not necessarily take care of the 'groupnum - 1' refcounts
that we're filing here?

>> +      else if (n && groupnum == 1)
>> +    {
>> +      void *h = hostaddrs[i];
>> +      size_t s = sizes[i];
>> +
>> +      /* A standalone attach clause.  */
>> +      if ((kinds[i] & 0xff) == GOMP_MAP_ATTACH)
>> +        gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, n,
>> +                             (uintptr_t) h, s, NULL);
>> +      else if (h + s > (void *) n->host_end)
>> +        {
>> +          gomp_mutex_unlock (&acc_dev->lock);
>> +          gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s);
>> +        }
>> +
>> +      assert (n->refcount != REFCOUNT_LINK);
>> +      if (n->refcount != REFCOUNT_INFINITY)
>> +        n->refcount++;
>> +      n->dynamic_refcount++;
>> +
>> +      gomp_mutex_unlock (&acc_dev->lock);
>> +    }
>
>> +      else if (n && groupnum > 1)
>> +    {
>> +      assert (n->refcount != REFCOUNT_INFINITY
>> +              && n->refcount != REFCOUNT_LINK);
>> +
>> +      bool processed = false;
>> +
>> +      struct target_mem_desc *tgt = n->tgt;
>> +      for (size_t j = 0; j < tgt->list_count; j++)
>> +        if (tgt->list[j].key == n)
>> +          {
>> +            for (size_t k = 0; k < groupnum; k++)
>> +              if (j + k < tgt->list_count && tgt->list[j + k].key)
>> +                {
>> +                  tgt->list[j + k].key->refcount++;
>> +                  tgt->list[j + k].key->dynamic_refcount++;
>> +                }
>> +            processed = true;
>> +          }
>> +
>> +      gomp_mutex_unlock (&acc_dev->lock);
>> +      if (!processed)
>> +        gomp_fatal ("dynamic refcount incrementing failed for "
>> +                    "pointer/pset");
>> +    }
>
> Please add some text to explain the nested 'j', 'k' loops and their 'if'
> conditionals, and the 'groupnum' usage in the 'k' loop boundary.  Should
> the 'k' loop maybe run 'for (size_t k = j; k < tgt->list_count; ++k)'
> (..., or is 'groupnum' relevant?), and in the loop body then use 'k'
> instead of 'j + k'?  (Maybe I've now confused myself, staring at this for
> a while...)

Audacious as I am sometimes, I did put a '__builtin_abort' right after
'tgt->list[j].key == n' -- and it doesn't trigger one single time for the
current libgomp test cases, meaning this is all dead code?  I'm confused.

>> +      else if (hostaddrs[i])
>> +    {
>> +      gomp_mutex_unlock (&acc_dev->lock);
>> +
>> +      struct target_mem_desc *tgt
>> +        = gomp_map_vars_async (acc_dev, aq, groupnum, &hostaddrs[i], NULL,
>> +                               &sizes[i], &kinds[i], true,
>> +                               GOMP_MAP_VARS_ENTER_DATA);
>> +      assert (tgt);
>> +      for (size_t j = 0; j < tgt->list_count; j++)
>> +        {
>> +          n = tgt->list[j].key;
>> +          if (n)
>> +            n->dynamic_refcount++;
>> +        }
>> +    }
>
> ... else nothing.  This latter "nothing" case (not present, and no
> 'hostaddrs[i]') is exercised by
> 'libgomp.oacc-fortran/optional-data-enter-exit.f90' (only).  Is that
> alright?
>
>>
>>        i = group_last;
>>      }
>
>
>> @@ -1137,45 +1241,40 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
>
> (Diff slightly edited.)
>
>>          if (n->refcount == 0)
>> -          gomp_remove_var_async (acc_dev, n, aq);
>
>> +          {
>> +            if (aq)
>> +              {
>> +                /* TODO The way the following code is currently
>> +                   implemented, we need the 'is_tgt_unmapped' return
>> +                   value from 'gomp_remove_var', so can't use
>> +                   'gomp_remove_var_async' here -- see the
>> +                   'gomp_unref_tgt' comment in
>> +                   <http://mid.mail-archive.com/878snl36eu.fsf@euler.schwinge.homeip.net>;
>> +                   PR92881 -- so have to synchronize here.  */
>> +                if (!acc_dev->openacc.async.synchronize_func (aq))
>> +                  {
>> +                    gomp_mutex_unlock (&acc_dev->lock);
>> +                    gomp_fatal ("synchronize failed");
>> +                  }
>> +              }
>
> As far as I understand, it's no longer true that "The way the following
> code is [...] implemented, we need the 'is_tgt_unmapped' return value
> from 'gomp_remove_var'".  In particular, we now can/should "use
> 'gomp_remove_var_async' here", and no longer "have to synchronize here"?
>
> Indeed I'm happy to see that the logic below no longer depends on
> 'is_tgt_unmapped' for its loop exit condition.  Instead of the above,
> this now can use the standard pattern:
>
>     if (aq)
>       /* TODO We can't do the 'is_tgt_unmapped' checking -- see the
>          'gomp_unref_tgt' comment in
>          <http://mid.mail-archive.com/878snl36eu.fsf@euler.schwinge.homeip.net>;
>          PR92881.  */
>       gomp_remove_var_async (acc_dev, n, aq);
>     else
>       { [as follows] }
>
>> +            int num_mappings = 0;
>> +            /* If the target_mem_desc represents a single data mapping, we
>> +               can check that it is freed when this splay tree key's
>> +               refcount reaches zero.  Otherwise (e.g. for a struct
>> +               mapping with multiple members), fall back to skipping the
>> +               test.  */
>> +            for (int j = 0; j < n->tgt->list_count; j++)
>> +              if (n->tgt->list[j].key)
>> +                num_mappings++;
>> +            bool is_tgt_unmapped = gomp_remove_var (acc_dev, n);
>> +            assert (num_mappings > 1 || is_tgt_unmapped);
>> +          }
>>        }
>>        break;
>
> For reference, the old logic (mandating what was described in the comment
> above) was:
>
>     bool is_tgt_unmapped = false;
>     for (size_t i = 0; i < t->list_count; i++)
>      {
>        is_tgt_unmapped = gomp_remove_var (acc_dev, t->list[i].key);
>        if (is_tgt_unmapped)
>          break;
>      }
>     assert (is_tgt_unmapped);


Grüße
 Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/7] [OpenACC] Missing unlocking on error paths in attach/detach code
  2020-05-22 22:16 ` [PATCH 1/7] [OpenACC] Missing unlocking on error paths in attach/detach code Julian Brown
@ 2020-06-04 18:00   ` Thomas Schwinge
  0 siblings, 0 replies; 16+ messages in thread
From: Thomas Schwinge @ 2020-06-04 18:00 UTC (permalink / raw)
  To: Julian Brown, gcc-patches; +Cc: jakub

[-- Attachment #1: Type: text/plain, Size: 979 bytes --]

Hi!

On 2020-05-22T15:16:04-0700, Julian Brown <julian@codesourcery.com> wrote:
> This patch adds some missing unlocking from error paths in the OpenACC
> attach/detach code, noticed during development of other patches in
> this series.

Thanks, pushed "[OpenACC] Missing unlocking on error paths in
attach/detach code" to master branch in commit
dc9541545d9e5705a97c41713c557f55522b54dc, and releases/gcc-10 branch in
commit 16faaf529cfdefbf577cf6affc5869506b484599, see attached.

..., and then pushed "[OpenACC] Missing unlocking on error paths in
attach/detach code, part II" to master branch in commit
2e24d457d8c97e409549848715ff046cfa9efd3d, and releases/gcc-10 branch in
commit 52022880530d3ae9cdae81db74a40a92f2265c2c, see attached, too.


Grüße
 Thomas


-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-OpenACC-Missing-unlocking-on-error-paths-in-attach-d.patch --]
[-- Type: text/x-diff, Size: 1002 bytes --]

From 2e24d457d8c97e409549848715ff046cfa9efd3d Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Sat, 23 May 2020 17:20:30 +0200
Subject: [PATCH] [OpenACC] Missing unlocking on error paths in attach/detach
 code, part II

	libgomp/
	* oacc-mem.c (goacc_exit_data_internal): Unlock on error path.
---
 libgomp/oacc-mem.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index c06b7341cbbf..6713846c9429 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -1060,7 +1060,10 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 	      = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
 	    if (n == NULL)
-	      gomp_fatal ("struct not mapped for detach operation");
+	      {
+		gomp_mutex_unlock (&acc_dev->lock);
+		gomp_fatal ("struct not mapped for detach operation");
+	      }
 
 	    gomp_detach_pointer (acc_dev, aq, n, hostaddr, finalize, NULL);
 	  }
-- 
2.26.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0001-OpenACC-Missing-unlocking-on-error-paths-in-atta.g10.patch --]
[-- Type: text/x-diff, Size: 1072 bytes --]

From 52022880530d3ae9cdae81db74a40a92f2265c2c Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Sat, 23 May 2020 17:20:30 +0200
Subject: [PATCH] [OpenACC] Missing unlocking on error paths in attach/detach
 code, part II

	libgomp/
	* oacc-mem.c (goacc_exit_data_internal): Unlock on error path.

(cherry picked from commit 2e24d457d8c97e409549848715ff046cfa9efd3d)
---
 libgomp/oacc-mem.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index c06b7341cbbf..6713846c9429 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -1060,7 +1060,10 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 	      = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
 	    if (n == NULL)
-	      gomp_fatal ("struct not mapped for detach operation");
+	      {
+		gomp_mutex_unlock (&acc_dev->lock);
+		gomp_fatal ("struct not mapped for detach operation");
+	      }
 
 	    gomp_detach_pointer (acc_dev, aq, n, hostaddr, finalize, NULL);
 	  }
-- 
2.26.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: 0001-OpenACC-Missing-unlocking-on-error-paths-in-attach-d.patch --]
[-- Type: text/x-diff, Size: 1002 bytes --]

From 2e24d457d8c97e409549848715ff046cfa9efd3d Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Sat, 23 May 2020 17:20:30 +0200
Subject: [PATCH] [OpenACC] Missing unlocking on error paths in attach/detach
 code, part II

	libgomp/
	* oacc-mem.c (goacc_exit_data_internal): Unlock on error path.
---
 libgomp/oacc-mem.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index c06b7341cbbf..6713846c9429 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -1060,7 +1060,10 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 	      = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
 	    if (n == NULL)
-	      gomp_fatal ("struct not mapped for detach operation");
+	      {
+		gomp_mutex_unlock (&acc_dev->lock);
+		gomp_fatal ("struct not mapped for detach operation");
+	      }
 
 	    gomp_detach_pointer (acc_dev, aq, n, hostaddr, finalize, NULL);
 	  }
-- 
2.26.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #5: 0001-OpenACC-Missing-unlocking-on-error-paths-in-atta.g10.patch --]
[-- Type: text/x-diff, Size: 1072 bytes --]

From 52022880530d3ae9cdae81db74a40a92f2265c2c Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Sat, 23 May 2020 17:20:30 +0200
Subject: [PATCH] [OpenACC] Missing unlocking on error paths in attach/detach
 code, part II

	libgomp/
	* oacc-mem.c (goacc_exit_data_internal): Unlock on error path.

(cherry picked from commit 2e24d457d8c97e409549848715ff046cfa9efd3d)
---
 libgomp/oacc-mem.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index c06b7341cbbf..6713846c9429 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -1060,7 +1060,10 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 	      = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
 	    if (n == NULL)
-	      gomp_fatal ("struct not mapped for detach operation");
+	      {
+		gomp_mutex_unlock (&acc_dev->lock);
+		gomp_fatal ("struct not mapped for detach operation");
+	      }
 
 	    gomp_detach_pointer (acc_dev, aq, n, hostaddr, finalize, NULL);
 	  }
-- 
2.26.2


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [OpenACC] Don't open-code 'gomp_remove_var' in 'acc_unmap_data' (was: [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics)
  2020-05-22 22:16 ` [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics Julian Brown
  2020-06-03 12:36   ` Thomas Schwinge
@ 2020-06-04 18:26   ` Thomas Schwinge
  1 sibling, 0 replies; 16+ messages in thread
From: Thomas Schwinge @ 2020-06-04 18:26 UTC (permalink / raw)
  To: Julian Brown, gcc-patches; +Cc: jakub

[-- Attachment #1: Type: text/plain, Size: 1347 bytes --]

Hi Julian!

On 2020-05-22T15:16:05-0700, Julian Brown <julian@codesourcery.com> wrote:
> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c

> @@ -475,14 +475,19 @@ acc_unmap_data (void *h)
>        gomp_mutex_unlock (&acc_dev->lock);
>        gomp_fatal ("cannot unmap target block");
>      }
> -  else if (tgt->refcount > 1)
> -    tgt->refcount--;
> -  else
> +
> +  if (tgt->refcount == 1)
>      {
> -      free (tgt->array);
> -      free (tgt);
> +      /* This is the last reference.  Nullifying these fields prevents
> +      'gomp_unmap_tgt' via 'gomp_remove_var' from freeing the target
> +      memory.  */
> +      tgt->tgt_end = 0;
> +      tgt->to_free = NULL;
>      }
>
> +  bool is_tgt_unmapped = gomp_remove_var (acc_dev, n);
> +  assert (is_tgt_unmapped);

Also should again remove the 'splay_tree_remove' that you'd added further
up.  I've pushed "[OpenACC] Don't open-code 'gomp_remove_var' in
'acc_unmap_data'" to master branch in commit
2112d3242f413979931e371423dcead9d19440e7, and releases/gcc-10 branch in
commit 1bca30efec5d684f03dfb88ed93cbe26e68d35b0, see attached.


Grüße
 Thomas


-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-OpenACC-Don-t-open-code-gomp_remove_var-in-acc_unmap.patch --]
[-- Type: text/x-diff, Size: 1226 bytes --]

From 2112d3242f413979931e371423dcead9d19440e7 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 29 May 2020 14:21:03 +0200
Subject: [PATCH] [OpenACC] Don't open-code 'gomp_remove_var' in
 'acc_unmap_data'

	libgomp/
	* oacc-mem.c (acc_unmap_data): Don't open-code 'gomp_remove_var'.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
---
 libgomp/oacc-mem.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 6314f5d8b686..8e8c7c3093d5 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -468,8 +468,6 @@ acc_unmap_data (void *h)
 		  (void *) h, (int) host_size);
     }
 
-  splay_tree_remove (&acc_dev->mem_map, n);
-
   struct target_mem_desc *tgt = n->tgt;
 
   if (tgt->refcount == REFCOUNT_INFINITY)
@@ -482,8 +480,12 @@ acc_unmap_data (void *h)
      'acc_map_data'.  */
   assert (tgt->refcount == 1);
 
-  free (tgt->array);
-  free (tgt);
+  /* Nullifying these fields prevents 'gomp_unmap_tgt' via 'gomp_remove_var'
+     from freeing the target memory.  */
+  tgt->tgt_end = 0;
+  tgt->to_free = NULL;
+
+  gomp_remove_var (acc_dev, n);
 
   gomp_mutex_unlock (&acc_dev->lock);
 
-- 
2.26.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0001-OpenACC-Don-t-open-code-gomp_remove_var-in-acc_u.g10.patch --]
[-- Type: text/x-diff, Size: 1295 bytes --]

From 1bca30efec5d684f03dfb88ed93cbe26e68d35b0 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 29 May 2020 14:21:03 +0200
Subject: [PATCH] [OpenACC] Don't open-code 'gomp_remove_var' in
 'acc_unmap_data'

	libgomp/
	* oacc-mem.c (acc_unmap_data): Don't open-code 'gomp_remove_var'.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
(cherry picked from commit 2112d3242f413979931e371423dcead9d19440e7)
---
 libgomp/oacc-mem.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 6314f5d8b686..8e8c7c3093d5 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -468,8 +468,6 @@ acc_unmap_data (void *h)
 		  (void *) h, (int) host_size);
     }
 
-  splay_tree_remove (&acc_dev->mem_map, n);
-
   struct target_mem_desc *tgt = n->tgt;
 
   if (tgt->refcount == REFCOUNT_INFINITY)
@@ -482,8 +480,12 @@ acc_unmap_data (void *h)
      'acc_map_data'.  */
   assert (tgt->refcount == 1);
 
-  free (tgt->array);
-  free (tgt);
+  /* Nullifying these fields prevents 'gomp_unmap_tgt' via 'gomp_remove_var'
+     from freeing the target memory.  */
+  tgt->tgt_end = 0;
+  tgt->to_free = NULL;
+
+  gomp_remove_var (acc_dev, n);
 
   gomp_mutex_unlock (&acc_dev->lock);
 
-- 
2.26.2


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics
  2020-06-03 15:19     ` Thomas Schwinge
@ 2020-06-05 22:03       ` Julian Brown
  0 siblings, 0 replies; 16+ messages in thread
From: Julian Brown @ 2020-06-05 22:03 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: jakub, gcc-patches, Tobias Burnus

[-- Attachment #1: Type: text/plain, Size: 7735 bytes --]

On Wed, 3 Jun 2020 17:19:47 +0200
Thomas Schwinge <thomas@codesourcery.com> wrote:

> Hi Julian!
> 
> On 2020-06-03T14:36:14+0200, I wrote:
> > On 2020-05-22T15:16:05-0700, Julian Brown <julian@codesourcery.com>
> > wrote:  
> >> This patch adjusts the semantics of dynamic reference counts, as
> >> described in the parent email.  
> >
> > Thanks!
> >
> > A few questions, but no need to send an updated patch.
> >  
> >> --- a/libgomp/oacc-mem.c
> >> +++ b/libgomp/oacc-mem.c  
> >  
> >> @@ -1018,13 +1036,102 @@ goacc_enter_data_internal (struct
> >> gomp_device_descr *acc_dev, size_t mapnum, {
> >>    for (size_t i = 0; i < mapnum; i++)
> >>      {
> >> -      int group_last = find_group_last (i, mapnum, sizes, kinds);
> >> +      splay_tree_key n;
> >> +      size_t group_last = find_group_last (i, mapnum, sizes,
> >> kinds);
> >> +      bool struct_p = false;
> >> +      size_t size, groupnum = (group_last - i) + 1;
> >>  
> >> -      gomp_map_vars_async (acc_dev, aq,
> >> -			   (group_last - i) + 1,
> >> -			   &hostaddrs[i], NULL,
> >> -			   &sizes[i], &kinds[i], true,
> >> -			   GOMP_MAP_VARS_OPENACC_ENTER_DATA);
> >> +      switch (kinds[i] & 0xff)
> >> +	{
> >> +	case GOMP_MAP_STRUCT:
> >> +	  {
> >> +	    int last = i + sizes[i];  
> >
> > The 'last' calculated here must always equal the 'group_last'
> > calculated above.  ;-) (... so we might just use 'group_last'
> > instead of 'last' in the following.)
> >  
> >> +	    size = (uintptr_t) hostaddrs[last] + sizes[last]
> >> +		   - (uintptr_t) hostaddrs[i];
> >> +	    struct_p = true;
> >> +	  }
> >> +	  break;
> >> +
> >> +	case GOMP_MAP_ATTACH:
> >> +	  size = sizeof (void *);
> >> +	  break;
> >> +
> >> +	default:
> >> +	  size = sizes[i];
> >> +	}
> >> +
> >> +      n = lookup_host (acc_dev, hostaddrs[i], size);
> >> +  
> >  
> >> +      if (n && struct_p)
> >> +	{
> >> +	  if (n->refcount != REFCOUNT_INFINITY)
> >> +	    n->refcount += groupnum - 1;
> >> +	  n->dynamic_refcount += groupnum - 1;
> >> +	  gomp_mutex_unlock (&acc_dev->lock);
> >> +	}  
> >
> > Is the 'GOMP_MAP_STRUCT' handling here specifically necessary, or
> > is that just an optimization of the 'n && groupnum > 1' case below?
> >  
> 
> Eh, OK, I think I see where this is going; the 'n && groupnum > 1'
> case below might not necessarily take care of the 'groupnum - 1'
> refcounts that we're filing here?

Right. GOMP_MAP_STRUCT is a little special in this case.

> >> +      else if (n && groupnum == 1)
> >> +	{
> >> +	  void *h = hostaddrs[i];
> >> +	  size_t s = sizes[i];
> >> +
> >> +	  /* A standalone attach clause.  */
> >> +	  if ((kinds[i] & 0xff) == GOMP_MAP_ATTACH)
> >> +	    gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map,
> >> n,
> >> +				 (uintptr_t) h, s, NULL);
> >> +	  else if (h + s > (void *) n->host_end)
> >> +	    {
> >> +	      gomp_mutex_unlock (&acc_dev->lock);
> >> +	      gomp_fatal ("[%p,+%d] not mapped", (void *)h,
> >> (int)s);
> >> +	    }
> >> +
> >> +	  assert (n->refcount != REFCOUNT_LINK);
> >> +	  if (n->refcount != REFCOUNT_INFINITY)
> >> +	    n->refcount++;
> >> +	  n->dynamic_refcount++;
> >> +
> >> +	  gomp_mutex_unlock (&acc_dev->lock);
> >> +	}  
> >  
> >> +      else if (n && groupnum > 1)
> >> +	{
> >> +	  assert (n->refcount != REFCOUNT_INFINITY
> >> +		  && n->refcount != REFCOUNT_LINK);
> >> +
> >> +	  bool processed = false;
> >> +
> >> +	  struct target_mem_desc *tgt = n->tgt;
> >> +	  for (size_t j = 0; j < tgt->list_count; j++)
> >> +	    if (tgt->list[j].key == n)
> >> +	      {
> >> +		for (size_t k = 0; k < groupnum; k++)
> >> +		  if (j + k < tgt->list_count && tgt->list[j +
> >> k].key)
> >> +		    {
> >> +		      tgt->list[j + k].key->refcount++;
> >> +		      tgt->list[j + k].key->dynamic_refcount++;
> >> +		    }
> >> +		processed = true;
> >> +	      }
> >> +
> >> +	  gomp_mutex_unlock (&acc_dev->lock);
> >> +	  if (!processed)
> >> +	    gomp_fatal ("dynamic refcount incrementing failed for
> >> "
> >> +			"pointer/pset");
> >> +	}  
> >
> > Please add some text to explain the nested 'j', 'k' loops and their
> > 'if' conditionals, and the 'groupnum' usage in the 'k' loop
> > boundary.  Should the 'k' loop maybe run 'for (size_t k = j; k <
> > tgt->list_count; ++k)' (..., or is 'groupnum' relevant?), and in
> > the loop body then use 'k' instead of 'j + k'?  (Maybe I've now
> > confused myself, staring at this for a while...)  
> 
> Audacious as I am sometimes, I did put a '__builtin_abort' right after
> 'tgt->list[j].key == n' -- and it doesn't trigger one single time for
> the current libgomp test cases, meaning this is all dead code?  I'm
> confused.

Huh, I didn't expect that! Indeed that stanza appears to be dead code
(at least with mapping clauses generated from current GCC). The reason
is a late bug-fix to the manual deep copy code that strips
GOMP_MAP_TO_PSET and GOMP_MAP_POINTER from OpenACC enter/exit mappings
altogether. (In
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-12/msg01253.html).

That means "grouped" mappings are actually only now used
for GOMP_MAP_STRUCT, so actually even more of the find_group_last
function is probably dead now too, modulo backward compatibility issues.

Rewinding a bit, here is an explanation of the problem that the removal
of those clauses fixes, in case we want to revisit that.

With the attached patch (reverting the fix), the attached test case
fails (e.g. compiled at -O0). The problem is that with a dynamic data
lifetime, it's possible for an array descriptor on the stack to go out
of scope before the array data it is associated with does. This might
well be violating either Fortran rules or OpenACC semantics -- if that's
the case, then we had no problem here. (I did see a similar problem "in
the wild", but hadn't come up with a standalone test case until now.)

The attached test case starts out with a explicit-shape array local. It
passes this to a subroutine "enterdata_wrapper". This subroutine
fabricates an assumed-shape array pointer to its argument (creating an
array descriptor), and passes it to another subroutine "enterdata".

The "enterdata" subroutine then performs an OpenACC "enter data"
operation with the array -- whose data comes from the original
explicit-shape array in the main program, but whose descriptor comes
from the stack frame of the caller (i.e. "enterdata_wrapper"). This
descriptor then goes out of scope before returning to the main program.

The test case tries to fiddle with the stack layout by adding arbitrary
other arrays, and does the same dance again with nested subroutines to
perform an "exit data" operation.  But now the address of the (new)
descriptor is different, and the unmapping operation fails.

In short -- OpenACC "enter data" operations can (could) create hidden
dangling references to array descriptors, in some circumstances.

So, the fix was to strip out GOMP_MAP_TO_PSET (and GOMP_MAP_POINTER,
which I don't think has any meaning on these directives) from OpenACC
"enter data" and "exit data" directives altogether. If an array has a
descriptor when we get to a compute kernel, that descriptor is copied
to the target anyway, *even for present clauses*, so passing the
array descriptor to "enter data" descriptor doesn't appear to be
necessary, even in cases where it stays in scope before unmapping from
the target.

So, questions:

1. Does the attached program violate Fortran semantics in some way?

2. Or OpenACC semantics?

3. Are there unintended side-effects of removing GOMP_MAP_TO_PSET and
   GOMP_MAP_POINTER from OpenACC enter/exit data directives?

4. Should the clauses be stripped from the equivalent OpenMP directives
   too?

(FAOD, I'm not asking for review on the attached patch at this time.)

HTH,

Julian

[-- Attachment #2: enter-data-pset.f90 --]
[-- Type: text/x-fortran, Size: 1442 bytes --]

program myprog
  implicit none
  integer :: a(16)
  integer :: i

  call enterdata_wrapper(a, 16)
  call exitdata_wrapper(a, 16)

  contains

  subroutine enterdata_wrapper(a, n)
    implicit none
    integer :: n
    integer, target :: a(n)
    integer :: aa(16)
    integer :: bb(16)
    integer, pointer :: ap(:)
    integer :: cc(16)
    integer :: dd(16)

    ! An array descriptor appears somewhere around here...
    ap => a

    !$acc enter data copyin(aa,bb,cc,dd)
    call enterdata(ap)
    !$acc exit data copyout(aa,bb,cc,dd)

    ! ...and goes out of scope.
  end subroutine enterdata_wrapper

  subroutine enterdata(a)
    implicit none
    integer, pointer :: a(:)

    ! Map "to(a.data) to_pset(a) pointer(a.data)"
    !$acc enter data copyin(a)
  end subroutine enterdata

  subroutine exitdata_wrapper(a, n)
    implicit none
    integer :: n
    integer, target :: a(n)
    integer :: aa(32)
    integer :: bb(32)
    integer, pointer :: ap(:)
    integer :: cc(32)
    integer :: dd(32)

    ! A different array descriptor appears...
    ap => a

    !$acc enter data copyin(aa,bb,cc,dd)
    call exitdata(ap)
    !$acc exit data copyout(aa,bb,cc,dd)

    ! ...and goes out of scope.
  end subroutine exitdata_wrapper

  subroutine exitdata(a)
    implicit none
    integer, pointer :: a(:)

    ! Try to unmap the fresh array descriptor: FAILS.
    !$acc exit data copyout(a)
  end subroutine exitdata
end program myprog

[-- Attachment #3: remove-pset-dangling-ref-bugfix-1.diff --]
[-- Type: text/x-patch, Size: 2977 bytes --]

commit 7a4d9939a7c5f770f3d2fcd02be01bfd146589ce
Author: Julian Brown <julian@codesourcery.com>
Date:   Fri Jun 5 14:46:41 2020 -0700

    Remove GOMP_MAP_TO_PSET dangling reference bugfix

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index e14932fafaf..79120e53129 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -8748,6 +8748,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 	    case OMP_TARGET_DATA:
 	    case OMP_TARGET_ENTER_DATA:
 	    case OMP_TARGET_EXIT_DATA:
+	    case OACC_ENTER_DATA:
+	    case OACC_EXIT_DATA:
 	    case OACC_HOST_DATA:
 	      if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER
 		  || (OMP_CLAUSE_MAP_KIND (c)
@@ -8756,15 +8758,6 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		   mapped, but not the pointer to it.  */
 		remove = true;
 	      break;
-	    case OACC_ENTER_DATA:
-	    case OACC_EXIT_DATA:
-	      if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_POINTER
-		  || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_TO_PSET
-		  || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER
-		  || (OMP_CLAUSE_MAP_KIND (c)
-		      == GOMP_MAP_FIRSTPRIVATE_REFERENCE))
-		remove = true;
-	      break;
 	    default:
 	      break;
 	    }
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index bc25527c616..c462cbb1007 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -1015,9 +1015,12 @@ find_group_last (int pos, size_t mapnum, size_t *sizes, unsigned short *kinds)
   switch (kind0)
     {
     case GOMP_MAP_TO_PSET:
-      while (pos + 1 < mapnum && (kinds[pos + 1] & 0xff) == GOMP_MAP_POINTER)
+      while (pos + 1 < mapnum
+	     && ((kinds[pos + 1] & 0xff) == GOMP_MAP_POINTER
+		 || (kinds[pos + 1] & 0xff) == GOMP_MAP_ATTACH))
 	pos++;
-      /* We expect at least one GOMP_MAP_POINTER after a GOMP_MAP_TO_PSET.  */
+      /* We expect at least one GOMP_MAP_POINTER (or GOMP_MAP_ATTACH)
+	 after a GOMP_MAP_TO_PSET.  */
       assert (pos > first_pos);
       break;
 
@@ -1044,7 +1047,9 @@ find_group_last (int pos, size_t mapnum, size_t *sizes, unsigned short *kinds)
 
       /* We can have zero or more GOMP_MAP_POINTER mappings after a to/from
 	 (etc.) mapping.  */
-      while (pos + 1 < mapnum && (kinds[pos + 1] & 0xff) == GOMP_MAP_POINTER)
+      while (pos + 1 < mapnum
+	     && ((kinds[pos + 1] & 0xff) == GOMP_MAP_POINTER
+		 || (kinds[pos + 1] & 0xff) == GOMP_MAP_ATTACH))
 	pos++;
     }
 
@@ -1122,6 +1127,15 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
 	  assert (n->refcount != REFCOUNT_INFINITY
 		  && n->refcount != REFCOUNT_LINK);
 
+	  for (size_t j = i + 1; j <= group_last; j++)
+	    if ((kinds[j] & 0xff) == GOMP_MAP_ATTACH)
+	      {
+		splay_tree_key m
+		  = lookup_host (acc_dev, hostaddrs[j], sizeof (void *));
+		gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, m,
+				     (uintptr_t) hostaddrs[j], sizes[j], NULL);
+	      }
+
 	  bool processed = false;
 
 	  struct target_mem_desc *tgt = n->tgt;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 6/7] [OpenACC] Reference count self-checking (dynamic_refcount version)
  2020-05-22 22:21 ` [PATCH 6/7] [OpenACC] Reference count self-checking (dynamic_refcount version) Julian Brown
@ 2020-06-18 18:40   ` Julian Brown
  0 siblings, 0 replies; 16+ messages in thread
From: Julian Brown @ 2020-06-18 18:40 UTC (permalink / raw)
  To: gcc-patches; +Cc: Catherine_Moore, jakub, thomas

[-- Attachment #1: Type: text/plain, Size: 1574 bytes --]

On Fri, 22 May 2020 15:21:44 -0700
Julian Brown <julian@codesourcery.com> wrote:

> This is a new version of the reference count self-checking code,
> adjusted to work with the new (old) dynamic_refcount counting scheme.
>  The key observation is that a target_mem_desc that was created from
> a dynamic data lifetime should not contribute to the structured
> refcount for splay tree keys in its variable list.  We can figure out
> which target_mem_descs that applies to using the information recorded
> in the previous patch.
> 
> In a sense, this takes the "awkward corner cases" from the
> virtual_refcount ("overhaul") patch, and moves them to the optional
> self-test code, where they can potentially do less harm.  With this,
> we still have a formal-ish model of what refcounts mean and some
> confidence that they remain consistent (at least throughout execution
> of a test run), which I think is a good thing.
> 
> OK? (We probably want a way of configuring-in this testing
> automatically, as mentioned previously.)

This is a new version of the self-checking patch that works with the
recent patch to stop attach/detach operations from affecting reference
counts:

https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548529.html

This patch depends on the previous patch to distinguish structural from
dynamic reference counts:

https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546334.html

There still isn't a way of neatly configuring self-checking behaviour
on/off, so this is mostly just useful for development (or as a
proof-of-concept).

Thanks,

Julian

[-- Attachment #2: refcount-selfchecking-2.diff --]
[-- Type: text/x-patch, Size: 10063 bytes --]

commit 920a44da7b74ddbe4e6d908a56a67e98d2078756
Author: Julian Brown <julian@codesourcery.com>
Date:   Thu May 21 02:41:32 2020 -0700

    [OpenACC] Reference count self-checking (dynamic_refcount version)
    
            libgomp/
            * libgomp.h (RC_CHECKING): New macro, disabled by default, guarding all
            hunks in this patch.
            (target_mem_desc): Add refcount_chk, mark fields.
            (splay_tree_key_s): Add refcount_chk field.
            (dump_tgt, gomp_rc_check): Add prototypes.
            * oacc-mem.c (GOACC_enter_exit_data): Add refcount self-check code.
            * oacc-parallel.c (GOACC_parallel_keyed_internal): Add refcount
            self-check code.
            (GOACC_data_start, GOACC_data_end, GOACC_enter_exit_data): Likewise.
            * target.c (stdio.h): Include.
            (dump_tgt, rc_check_clear, rc_check_count, rc_check_verify,
            gomp_rc_check): New functions to consistency-check reference counts.

diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 0d1978ffb13..eaa7c6ebb4c 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -960,9 +960,17 @@ struct target_var_desc {
   uintptr_t length;
 };
 
+/* Uncomment to enable reference-count consistency checking (for development
+   use only).  */
+//#define RC_CHECKING 1
+
 struct target_mem_desc {
   /* Reference count.  */
   uintptr_t refcount;
+#ifdef RC_CHECKING
+  uintptr_t refcount_chk;
+  bool mark;
+#endif
   /* All the splay nodes allocated together.  */
   splay_tree_node array;
   /* Start of the target region.  */
@@ -1019,6 +1027,10 @@ struct splay_tree_key_s {
   uintptr_t refcount;
   /* Dynamic reference count.  */
   uintptr_t dynamic_refcount;
+#ifdef RC_CHECKING
+  /* The recalculated reference count, for verification.  */
+  uintptr_t refcount_chk;
+#endif
   struct splay_tree_aux *aux;
 };
 
@@ -1174,6 +1186,12 @@ extern void gomp_detach_pointer (struct gomp_device_descr *,
 				 struct goacc_asyncqueue *, splay_tree_key,
 				 uintptr_t, bool, struct gomp_coalesce_buf *);
 
+#ifdef RC_CHECKING
+extern void dump_tgt (const char *, struct target_mem_desc *);
+extern void gomp_rc_check (struct gomp_device_descr *,
+			   struct target_mem_desc *);
+#endif
+
 extern struct target_mem_desc *gomp_map_vars (struct gomp_device_descr *,
 					      size_t, void **, void **,
 					      size_t *, void *, bool,
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index a9682e832be..1816b06bf2d 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -1522,4 +1522,10 @@ GOACC_enter_exit_data (int flags_m, size_t mapnum, void **hostaddrs,
       thr->prof_info = NULL;
       thr->api_info = NULL;
     }
+
+#ifdef RC_CHECKING
+  gomp_mutex_lock (&acc_dev->lock);
+  gomp_rc_check (acc_dev, thr->mapped_data);
+  gomp_mutex_unlock (&acc_dev->lock);
+#endif
 }
diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
index c7e46e35bd6..0774cdc7e4f 100644
--- a/libgomp/oacc-parallel.c
+++ b/libgomp/oacc-parallel.c
@@ -301,6 +301,15 @@ GOACC_parallel_keyed (int flags_m, void (*fn) (void *),
 				&api_info);
     }
   
+#ifdef RC_CHECKING
+  gomp_mutex_lock (&acc_dev->lock);
+  assert (tgt);
+  dump_tgt (__FUNCTION__, tgt);
+  tgt->prev = thr->mapped_data;
+  gomp_rc_check (acc_dev, tgt);
+  gomp_mutex_unlock (&acc_dev->lock);
+#endif
+
   devaddrs = gomp_alloca (sizeof (void *) * mapnum);
   for (i = 0; i < mapnum; i++)
     devaddrs[i] = (void *) gomp_map_val (tgt, hostaddrs, i);
@@ -347,6 +356,12 @@ GOACC_parallel_keyed (int flags_m, void (*fn) (void *),
       thr->prof_info = NULL;
       thr->api_info = NULL;
     }
+
+#ifdef RC_CHECKING
+  gomp_mutex_lock (&acc_dev->lock);
+  gomp_rc_check (acc_dev, thr->mapped_data);
+  gomp_mutex_unlock (&acc_dev->lock);
+#endif
 }
 
 /* Legacy entry point (GCC 5).  Only provide host fallback execution.  */
@@ -481,6 +496,12 @@ GOACC_data_start (int flags_m, size_t mapnum,
       thr->prof_info = NULL;
       thr->api_info = NULL;
     }
+
+#ifdef RC_CHECKING
+  gomp_mutex_lock (&acc_dev->lock);
+  gomp_rc_check (acc_dev, thr->mapped_data);
+  gomp_mutex_unlock (&acc_dev->lock);
+#endif
 }
 
 void
@@ -554,6 +575,12 @@ GOACC_data_end (void)
       thr->prof_info = NULL;
       thr->api_info = NULL;
     }
+
+#ifdef RC_CHECKING
+  gomp_mutex_lock (&thr->dev->lock);
+  gomp_rc_check (thr->dev, thr->mapped_data);
+  gomp_mutex_unlock (&thr->dev->lock);
+#endif
 }
 
 void
diff --git a/libgomp/target.c b/libgomp/target.c
index 35a76c4ac39..badc254a777 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -38,6 +38,9 @@
 #include <string.h>
 #include <assert.h>
 #include <errno.h>
+#ifdef RC_CHECKING
+#include <stdio.h>
+#endif
 
 #ifdef PLUGIN_SUPPORT
 #include <dlfcn.h>
@@ -347,6 +350,192 @@ gomp_free_device_memory (struct gomp_device_descr *devicep, void *devptr)
     }
 }
 
+#ifdef RC_CHECKING
+void
+dump_tgt (const char *where, struct target_mem_desc *tgt)
+{
+  if (!getenv ("GOMP_DEBUG_TGT"))
+    return;
+
+  fprintf (stderr, "%s: %s: tgt=%p\n", __FUNCTION__, where, (void*) tgt);
+  fprintf (stderr, "refcount=%d\n", (int) tgt->refcount);
+  fprintf (stderr, "tgt_start=%p\n", (void*) tgt->tgt_start);
+  fprintf (stderr, "tgt_end=%p\n", (void*) tgt->tgt_end);
+  fprintf (stderr, "to_free=%p\n", tgt->to_free);
+  fprintf (stderr, "list_count=%d\n", (int) tgt->list_count);
+  for (int i = 0; i < tgt->list_count; i++)
+    {
+      fprintf (stderr, "list item %d:\n", i);
+      fprintf (stderr, "  key: %p\n", (void*) tgt->list[i].key);
+      if (tgt->list[i].key)
+	{
+	  fprintf (stderr, "  key.host_start=%p\n",
+		   (void*) tgt->list[i].key->host_start);
+	  fprintf (stderr, "  key.host_end=%p\n",
+		   (void*) tgt->list[i].key->host_end);
+	  fprintf (stderr, "  key.tgt=%p\n", (void*) tgt->list[i].key->tgt);
+	  fprintf (stderr, "  key.offset=%d\n",
+		   (int) tgt->list[i].key->tgt_offset);
+	  fprintf (stderr, "  key.refcount=%d\n",
+		   (int) tgt->list[i].key->refcount);
+	  fprintf (stderr, "  key.dynamic_refcount=%d\n",
+		   (int) tgt->list[i].key->dynamic_refcount);
+	  if (tgt->list[i].key->aux)
+	    {
+	      fprintf (stderr, "  key.aux.link_key=%p\n",
+		       (void*) tgt->list[i].key->aux->link_key);
+	      fprintf (stderr, "  key.aux.attach_count=%p\n",
+		       (void*) tgt->list[i].key->aux->attach_count);
+	    }
+	}
+    }
+  fprintf (stderr, "\n");
+}
+
+static void
+rc_check_clear (splay_tree_node node)
+{
+  splay_tree_key k = &node->key;
+
+  k->refcount_chk = 0;
+  k->tgt->refcount_chk = 0;
+  k->tgt->mark = false;
+
+  if (node->left)
+    rc_check_clear (node->left);
+  if (node->right)
+    rc_check_clear (node->right);
+}
+
+static void
+rc_check_count (splay_tree_node node)
+{
+  splay_tree_key k = &node->key;
+  struct target_mem_desc *t;
+
+  /* Add dynamic reference counts ("acc enter data", etc.) for this key.  */
+  k->refcount_chk += k->dynamic_refcount;
+
+  t = k->tgt;
+  t->refcount_chk++;
+
+  /* Do not count references from tgt_mem_descs that arise from dynamic data
+     lifetimes: those are counted already by their keys' dynamic_refcount.  */
+  if (!t->mark && goacc_marked_dynamic_p (t))
+    t->mark = true;
+  else if (!t->mark)
+    {
+      /* This tgt counts towards each splay tree key in the list's reference
+	 count -- apart from "attach" operations, which don't affect the
+	 reference count.  We can detect those by examining the do_detach
+	 flag.  */
+      for (int i = 0; i < t->list_count; i++)
+	if (t->list[i].key && !t->list[i].do_detach)
+	  t->list[i].key->refcount_chk++;
+
+      t->mark = true;
+    }
+
+  if (node->left)
+    rc_check_count (node->left);
+  if (node->right)
+    rc_check_count (node->right);
+}
+
+static bool
+rc_check_verify (splay_tree_node node, bool noisy, bool errors)
+{
+  splay_tree_key k = &node->key;
+  struct target_mem_desc *t;
+
+  if (k->refcount != REFCOUNT_INFINITY)
+    {
+      if (noisy)
+	fprintf (stderr, "key %p (%p..+%d): rc=%d/%d, dyn_rc=%d\n", k,
+		 (void *) k->host_start, (int) (k->host_end - k->host_start),
+		 (int) k->refcount, (int) k->refcount_chk,
+		 (int) k->dynamic_refcount);
+
+      if (k->refcount != k->refcount_chk)
+	{
+	  if (noisy)
+	    fprintf (stderr, "  -- key refcount mismatch!\n");
+	  errors = true;
+	}
+
+      t = k->tgt;
+
+      if (noisy)
+	fprintf (stderr, "tgt %p: rc=%d/%d\n", t, (int) t->refcount,
+		 (int) t->refcount_chk);
+
+      if (t->refcount != t->refcount_chk)
+	{
+	  if (noisy)
+	    fprintf (stderr,
+		     "  -- target memory descriptor refcount mismatch!\n");
+	  errors = true;
+	}
+    }
+
+  if (node->left)
+    errors |= rc_check_verify (node->left, noisy, errors);
+  if (node->right)
+    errors |= rc_check_verify (node->right, noisy, errors);
+
+  return errors;
+}
+
+/* Call with device locked.  */
+
+attribute_hidden void
+gomp_rc_check (struct gomp_device_descr *devicep, struct target_mem_desc *tgt)
+{
+  splay_tree sp = &devicep->mem_map;
+
+  bool noisy = getenv ("GOMP_DEBUG_TGT") != 0;
+
+  if (noisy)
+    fprintf (stderr, "\n*** GOMP_RC_CHECK ***\n\n");
+
+  if (sp->root)
+    {
+      rc_check_clear (sp->root);
+
+      for (struct target_mem_desc *t = tgt; t; t = t->prev)
+	{
+	  t->refcount_chk = 0;
+	  t->mark = false;
+	}
+
+      /* Add references for interconnected splay-tree keys.  */
+      rc_check_count (sp->root);
+
+      /* Add references for the tgt for a currently-executing kernel and/or
+	 any enclosing data directives.  */
+      for (struct target_mem_desc *t = tgt; t; t = t->prev)
+	{
+	  t->refcount_chk++;
+
+	  if (!t->mark)
+	    {
+	      for (int i = 0; i < t->list_count; i++)
+		if (t->list[i].key && !t->list[i].do_detach)
+		  t->list[i].key->refcount_chk++;
+
+	      t->mark = true;
+	    }
+	}
+
+      if (rc_check_verify (sp->root, noisy, false))
+	{
+	  gomp_mutex_unlock (&devicep->lock);
+	  gomp_fatal ("refcount checking failure");
+	}
+    }
+}
+#endif
+
 /* Handle the case where gomp_map_lookup, splay_tree_lookup or
    gomp_map_0len_lookup found oldn for newn.
    Helper function of gomp_map_vars.  */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/7] [OpenACC] Don't pass kind array via pointer to goacc_enter_datum
  2020-05-22 22:16 ` [PATCH 3/7] [OpenACC] Don't pass kind array via pointer to goacc_enter_datum Julian Brown
@ 2020-06-25 10:52   ` Thomas Schwinge
  2020-07-10 12:06     ` Julian Brown
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Schwinge @ 2020-06-25 10:52 UTC (permalink / raw)
  To: Julian Brown; +Cc: Catherine_Moore, jakub, gcc-patches

Hi Julian!

On 2020-05-22T15:16:06-0700, Julian Brown <julian@codesourcery.com> wrote:
> Since goacc_enter_datum only maps a single data item now, there is no
> need to pass "kinds" as an array.  Passing as a scalar allows for some
> simplification in the function's callers.

You'd hope (didn't verify) that the compiler can do the same
transformation/optimization.  ;-)

But, au contraire: in my opinion (but please tell if you disagree), we
should instead get (back) to the state where the runtime API and the
pragma variants of the respective OpenACC functionality map to the same
libgomp implementation.

That's what we had a while ago: 'acc_create' calling the same
'goacc_enter_data' as 'GOACC_enter_exit_data' did for OpenACC 'enter
data' with 'create' clause, etc.  You then removed/changed that in
2019-12-20 commit 378da98fcc907d05002bcd3d6ff7951f0cf485e5 "OpenACC
reference count overhaul", reason unknown.

The idea is (a) to match in the libgomp implementation what the OpenACC
specification states ("The 'acc_create' routines are equivalent to the
'enter data' directive with a 'create' clause", etc.), and (b) to reduce
code duplication and thus potential for bugs -- like we've seen in the
case of 'attach'/'detach', where one variant didn't do reference counting
(runtime API variant; correct), and the other variant did (pragma
variant; incorrect).

As it must be able to handle the very same things (and more), my
understanding/expectation is that 'goacc_enter_data_internal' must offer
a superset of 'goacc_enter_datum' functionality, so the latter can just
go away?

And same story for the 'exit data' implementations, of course:
'goacc_exit_datum' vs. 'goacc_exit_data_internal'.


Grüße
 Thomas


>       libgomp/
>       * oacc-mem.c (goacc_enter_datum): Use scalar kind argument instead of
>       kinds array.
>       (acc_create, acc_create_async, acc_copyin, acc_copyin_async): Update
>       calls to goacc_enter_datum.
> ---
>  libgomp/oacc-mem.c | 17 +++++++----------
>  1 file changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
> index fff0d573f59..20d241382a8 100644
> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c
> @@ -501,7 +501,8 @@ acc_unmap_data (void *h)
>  /* Enter dynamic mapping for a single datum.  Return the device pointer.  */
>
>  static void *
> -goacc_enter_datum (void **hostaddrs, size_t *sizes, void *kinds, int async)
> +goacc_enter_datum (void **hostaddrs, size_t *sizes, unsigned short kind,
> +                int async)
>  {
>    void *d;
>    splay_tree_key n;
> @@ -560,7 +561,7 @@ goacc_enter_datum (void **hostaddrs, size_t *sizes, void *kinds, int async)
>
>        struct target_mem_desc *tgt
>       = gomp_map_vars_async (acc_dev, aq, mapnum, hostaddrs, NULL, sizes,
> -                            kinds, true, GOMP_MAP_VARS_ENTER_DATA);
> +                            &kind, true, GOMP_MAP_VARS_ENTER_DATA);
>        assert (tgt);
>        assert (tgt->list_count == 1);
>        n = tgt->list[0].key;
> @@ -584,15 +585,13 @@ goacc_enter_datum (void **hostaddrs, size_t *sizes, void *kinds, int async)
>  void *
>  acc_create (void *h, size_t s)
>  {
> -  unsigned short kinds[1] = { GOMP_MAP_ALLOC };
> -  return goacc_enter_datum (&h, &s, &kinds, acc_async_sync);
> +  return goacc_enter_datum (&h, &s, GOMP_MAP_ALLOC, acc_async_sync);
>  }
>
>  void
>  acc_create_async (void *h, size_t s, int async)
>  {
> -  unsigned short kinds[1] = { GOMP_MAP_ALLOC };
> -  goacc_enter_datum (&h, &s, &kinds, async);
> +  goacc_enter_datum (&h, &s, GOMP_MAP_ALLOC, async);
>  }
>
>  /* acc_present_or_create used to be what acc_create is now.  */
> @@ -617,15 +616,13 @@ acc_pcreate (void *h, size_t s)
>  void *
>  acc_copyin (void *h, size_t s)
>  {
> -  unsigned short kinds[1] = { GOMP_MAP_TO };
> -  return goacc_enter_datum (&h, &s, &kinds, acc_async_sync);
> +  return goacc_enter_datum (&h, &s, GOMP_MAP_TO, acc_async_sync);
>  }
>
>  void
>  acc_copyin_async (void *h, size_t s, int async)
>  {
> -  unsigned short kinds[1] = { GOMP_MAP_TO };
> -  goacc_enter_datum (&h, &s, &kinds, async);
> +  goacc_enter_datum (&h, &s, GOMP_MAP_TO, async);
>  }
>
>  /* acc_present_or_copyin used to be what acc_copyin is now.  */
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/7] [OpenACC] Don't pass kind array via pointer to goacc_enter_datum
  2020-06-25 10:52   ` Thomas Schwinge
@ 2020-07-10 12:06     ` Julian Brown
  0 siblings, 0 replies; 16+ messages in thread
From: Julian Brown @ 2020-07-10 12:06 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: Catherine_Moore, jakub, gcc-patches

On Thu, 25 Jun 2020 12:52:23 +0200
Thomas Schwinge <thomas@codesourcery.com> wrote:

> Hi Julian!
> 
> On 2020-05-22T15:16:06-0700, Julian Brown <julian@codesourcery.com>
> wrote:
> > Since goacc_enter_datum only maps a single data item now, there is
> > no need to pass "kinds" as an array.  Passing as a scalar allows
> > for some simplification in the function's callers.  
> 
> You'd hope (didn't verify) that the compiler can do the same
> transformation/optimization.  ;-)
> 
> But, au contraire: in my opinion (but please tell if you disagree), we
> should instead get (back) to the state where the runtime API and the
> pragma variants of the respective OpenACC functionality map to the
> same libgomp implementation.

It's a little ugly for "enter data" because the API routines return the
device pointer, but the directive implementation may involve several
mappings and a single "device pointer" return doesn't really make sense
in that case. I didn't much like the previous approach of returning
NULL.

We can still try to factor out the duplicated code though. I've posted a
new approach here (see the parent "0/2" patch also):

https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549773.html

Julian

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-07-10 12:07 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-22 22:16 [PATCH 0/7] [OpenACC] Dynamic reference counts for mapped data Julian Brown
2020-05-22 22:16 ` [PATCH 1/7] [OpenACC] Missing unlocking on error paths in attach/detach code Julian Brown
2020-06-04 18:00   ` Thomas Schwinge
2020-05-22 22:16 ` [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics Julian Brown
2020-06-03 12:36   ` Thomas Schwinge
2020-06-03 15:19     ` Thomas Schwinge
2020-06-05 22:03       ` Julian Brown
2020-06-04 18:26   ` [OpenACC] Don't open-code 'gomp_remove_var' in 'acc_unmap_data' (was: [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics) Thomas Schwinge
2020-05-22 22:16 ` [PATCH 3/7] [OpenACC] Don't pass kind array via pointer to goacc_enter_datum Julian Brown
2020-06-25 10:52   ` Thomas Schwinge
2020-07-10 12:06     ` Julian Brown
2020-05-22 22:16 ` [PATCH 4/7] [OpenACC] Fix incompatible copyout for acc_map_data (PR92843) Julian Brown
2020-05-22 22:21 ` [PATCH 5/7] [OpenACC] Distinguish structural/dynamic mappings in libgomp Julian Brown
2020-05-22 22:21 ` [PATCH 6/7] [OpenACC] Reference count self-checking (dynamic_refcount version) Julian Brown
2020-06-18 18:40   ` Julian Brown
2020-05-22 22:21 ` [PATCH 7/7] [OpenACC] Stricter dynamic data unmapping testing (WIP) Julian Brown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).