Hi! On Tue, 9 Jun 2020 12:41:21 +0200 Thomas Schwinge wrote: > Hi Julian! > > On 2020-06-05T21:31:08+0100, Julian Brown > wrote: > > On Fri, 5 Jun 2020 13:17:09 +0200 > > Thomas Schwinge wrote: > >> On 2019-12-17T21:03:47-0800, Julian Brown > >> wrote: > >> > This part contains the libgomp runtime support for the > >> > GOMP_MAP_ATTACH and GOMP_MAP_DETACH mapping kinds > >> > >> > --- a/libgomp/target.c > >> > +++ b/libgomp/target.c > >> > >> > @@ -1203,6 +1211,32 @@ gomp_map_vars_internal (struct > >> > gomp_device_descr *devicep, > >> > >> > + case GOMP_MAP_ATTACH: > >> > + { > >> > + cur_node.host_start = (uintptr_t) > >> > hostaddrs[i]; > >> > + cur_node.host_end = cur_node.host_start + > >> > sizeof (void *); > >> > + splay_tree_key n = splay_tree_lookup > >> > (mem_map, &cur_node); > >> > + if (n != NULL) > >> > + { > >> > + tgt->list[i].key = n; > >> > + tgt->list[i].offset = cur_node.host_start > >> > - n->host_start; > >> > + tgt->list[i].length = n->host_end - > >> > n->host_start; > >> > + tgt->list[i].copy_from = false; > >> > + tgt->list[i].always_copy_from = false; > >> > + tgt->list[i].do_detach > >> > + = (pragma_kind != > >> > GOMP_MAP_VARS_OPENACC_ENTER_DATA); > >> > + n->refcount++; > >> > + } > >> > + else > >> > + { > >> > + gomp_mutex_unlock (&devicep->lock); > >> > + gomp_fatal ("outer struct not mapped for > >> > attach"); > >> > + } > >> > + gomp_attach_pointer (devicep, aq, mem_map, n, > >> > + (uintptr_t) > >> > hostaddrs[i], sizes[i], > >> > + cbufp); > >> > + continue; > >> > + } > >> > >> For the OpenACC runtime API 'acc_attach' etc. routines they don't, > >> so what's the conceptual reason that for the corresponding OpenACC > >> directive variants, 'GOMP_MAP_ATTACH' etc. here participate in > >> reference counting ('n->refcount++' above)? I understand OpenACC > >> 'attach'/'detach' clauses to be simple "executable clauses", which > >> just update some values somewhere (say, like > >> 'GOMP_MAP_ALWAYS_POINTER'), but they don't alter any mapping state, > >> thus wouldn't appear to need reference counting? > > > > IIUC, n->refcount is not directly the "structural reference count" > > as seen at source level, but rather counts the number of > > target_var_descs in the lists appended to each target_mem_desc -- > > and GOMP_MAP_ATTACH have variable entries in those lists. > > That may be OK if that's purely an implementation detail that isn't > visible to the user, however: > > > That's not the case for the API > > routines. > > As I had mentioned, the problem is: in contrast to 'acc_attach', an > OpenACC 'enter data' directive with 'attach' clause currently uses > this same reference-counted code path, and thus such an 'attach' > without corresponding 'detach' inhibits unmapping; [...] The attached patch stops attach/detach operations from affecting reference counts (either structured or dynamic). This isn't as invasive as I'd imagined: we can extend the use of the "do_detach" flag in target_mem_descs' variable lists to mark mappings that correspond to attach operations, then use that flag to avoid refcount increment/decrements. (The flag should possibly be renamed now.) I've modified the refcount self-testing code successfully to work with this new scheme too, in case that's helpful. I'll send the patches for that separately. Tested with offloading to NVPTX. OK? Thanks, Julian ChangeLog libgomp/ * oacc-mem.c (goacc_enter_data_internal): Don't affect reference counts for attach mappings. (goacc_exit_data_internal): Don't affect reference counts for detach mappings. * target.c (gomp_map_vars_existing): Don't affect reference counts for attach mappings. (gomp_map_vars_internal): Set do_detach flag unconditionally to mark attach mappings. (gomp_unmap_vars_internal): Use above flag to prevent affecting reference count for attach mappings. * testsuite/libgomp.oacc-c-c++-common/attach-detach-rc-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/attach-detach-rc-2.c: Likewise. * testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90: Mark test as shouldfail. * testsuite/libgomp.oacc-fortran/deep-copy-6.f90: Adjust to fail gracefully in no-finalize mode.