* [PATCH 0/2] [OpenACC] Attached deep-copy pointers, diagnostics & modifications @ 2020-06-22 12:14 Julian Brown 2020-06-22 12:14 ` [PATCH 1/2] [OpenACC] Refuse update/copyout for blocks with attached pointers Julian Brown 2020-06-22 12:14 ` [PATCH 2/2] [OpenACC] Detect pointer updates for attach operations (PR95590) Julian Brown 0 siblings, 2 replies; 6+ messages in thread From: Julian Brown @ 2020-06-22 12:14 UTC (permalink / raw) To: gcc-patches; +Cc: Thomas Schwinge, Jakub Jelinek, Catherine Moore Investigating PR95590, I realised that we can do better at diagnosing some potentially troublesome usage of OpenACC "attach" behaviour, namely updating blocks with attached pointers. Updating either the host copy or device copy of such a block is problematic -- for a host update, the host may get a clobbered (device) version of a host pointer in its local version of the block (e.g. struct). A device update may clobber an attached device pointer with a host pointer. The spec text (OpenACC 3.0, "2.6.8. Attachment Counter") covering this case is: "Pointer members of structs, classes, or derived types in device or host memory can be overwritten due to update directives or API routines. It is the user’s responsibility to ensure that the pointers have the appropriate values before or after the data movement in either direction. The behavior of the program is undefined if any of the pointer members are attached when an update of a composite variable is performed." The first patch in this series addresses that paragraph by making such updates (as well as copyouts, similarly) be runtime errors. The same text *might* also be an argument _against_ the fix in the second patch, though that is perhaps debatable (further discussion on that patch). Tested with offloading to NVPTX. OK? Julian Julian Brown (2): [OpenACC] Refuse update/copyout for blocks with attached pointers [OpenACC] Detect pointer updates for attach operations (PR95590) libgomp/oacc-mem.c | 42 +++++++++-- libgomp/target.c | 56 ++++++++++++-- .../attach-ptr-change-1.c | 74 +++++++++++++++++++ .../copyback-attached-dynamic-1.c | 31 ++++++++ .../copyback-attached-structural-1.c | 30 ++++++++ .../copyback-attached-structural-2.c | 31 ++++++++ .../copyback-attached-structural-3.c | 26 +++++++ .../delete-attached-dynamic-1.c | 26 +++++++ .../delete-attached-structural-1.c | 25 +++++++ .../delete-attached-structural-2.c | 26 +++++++ .../update-attached-1.c | 33 +++++++++ .../deep-copy-6-no_finalize.F90 | 6 +- 12 files changed, 390 insertions(+), 16 deletions(-) create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/attach-ptr-change-1.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-dynamic-1.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-1.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-2.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-3.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-dynamic-1.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-1.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-2.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/update-attached-1.c -- 2.23.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] [OpenACC] Refuse update/copyout for blocks with attached pointers 2020-06-22 12:14 [PATCH 0/2] [OpenACC] Attached deep-copy pointers, diagnostics & modifications Julian Brown @ 2020-06-22 12:14 ` Julian Brown 2020-07-24 13:37 ` Thomas Schwinge 2020-06-22 12:14 ` [PATCH 2/2] [OpenACC] Detect pointer updates for attach operations (PR95590) Julian Brown 1 sibling, 1 reply; 6+ messages in thread From: Julian Brown @ 2020-06-22 12:14 UTC (permalink / raw) To: gcc-patches; +Cc: Thomas Schwinge, Jakub Jelinek, Catherine Moore As mentioned in the parent email, this patch adds diagnostics for probably-broken code that updates (host/device) or copies-out blocks that still have attached pointers. Several new tests have been added. OK? Julian ChangeLog libgomp/ * oacc-mem.c (update_dev_host): Raise error on update of block with attached pointers. (goacc_exit_data_internal): Raise error on copyout of block with attached pointers. * target.c (gomp_unmap_vars_internal): Likewise. * testsuite/libgomp.oacc-c-c++-common/copyback-attached-dynamic-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/delete-attached-dynamic-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-2.c: New test. * testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-3.c: New test. * testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/update-attached.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90: Update for new diagnostic. --- libgomp/oacc-mem.c | 42 ++++++++++++++++--- libgomp/target.c | 27 +++++++++--- .../copyback-attached-dynamic-1.c | 31 ++++++++++++++ .../copyback-attached-structural-1.c | 30 +++++++++++++ .../copyback-attached-structural-2.c | 31 ++++++++++++++ .../copyback-attached-structural-3.c | 26 ++++++++++++ .../delete-attached-dynamic-1.c | 26 ++++++++++++ .../delete-attached-structural-1.c | 25 +++++++++++ .../delete-attached-structural-2.c | 26 ++++++++++++ .../update-attached-1.c | 33 +++++++++++++++ .../deep-copy-6-no_finalize.F90 | 6 +-- 11 files changed, 290 insertions(+), 13 deletions(-) create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-dynamic-1.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-1.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-2.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-3.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-dynamic-1.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-1.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-2.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/update-attached-1.c diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c index 1816b06bf2d..cf054f14b12 100644 --- a/libgomp/oacc-mem.c +++ b/libgomp/oacc-mem.c @@ -865,6 +865,23 @@ update_dev_host (int is_dev, void *h, size_t s, int async) gomp_fatal ("[%p,%d] is not mapped", h, (int)s); } + if (n->aux && n->aux->attach_count) + { + size_t nptrs = (n->host_end - n->host_start + sizeof (void *) - 1) + / sizeof (void *); + for (size_t i = 0; i < nptrs; i++) + if (n->aux->attach_count[i] > 0) + { + gomp_mutex_unlock (&acc_dev->lock); + if (is_dev) + gomp_fatal ("[%p,+%d] device update would overwrite attached " + "pointers", h, (int) s); + else + gomp_fatal ("host update from block [%p,+%d] with attached " + "pointers", h, (int) s); + } + } + d = (void *) (n->tgt->tgt_start + n->tgt_offset + (uintptr_t) h - n->host_start); @@ -1329,11 +1346,26 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum, if (copyfrom && n->refcount != REFCOUNT_INFINITY && (kind != GOMP_MAP_FROM || n->refcount == 0)) - gomp_copy_dev2host (acc_dev, aq, (void *) cur_node.host_start, - (void *) (n->tgt->tgt_start + n->tgt_offset - + cur_node.host_start - - n->host_start), - cur_node.host_end - cur_node.host_start); + { + if (n->aux && n->aux->attach_count) + { + size_t nptrs = (n->host_end - n->host_start + + sizeof (void *) - 1) / sizeof (void *); + for (size_t j = 0; j < nptrs; j++) + if (n->aux->attach_count[j] > 0) + { + gomp_mutex_unlock (&acc_dev->lock); + gomp_fatal ("copyout of block [%p,+%d] with " + "attached pointers", hostaddrs[i], + (int) size); + } + } + gomp_copy_dev2host (acc_dev, aq, (void *) cur_node.host_start, + (void *) (n->tgt->tgt_start + n->tgt_offset + + cur_node.host_start + - n->host_start), + cur_node.host_end - cur_node.host_start); + } if (n->refcount == 0) { diff --git a/libgomp/target.c b/libgomp/target.c index badc254a777..db6f56a8ff8 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -1649,11 +1649,28 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom, if ((do_unmap && do_copyfrom && tgt->list[i].copy_from) || tgt->list[i].always_copy_from) - gomp_copy_dev2host (devicep, aq, - (void *) (k->host_start + tgt->list[i].offset), - (void *) (k->tgt->tgt_start + k->tgt_offset - + tgt->list[i].offset), - tgt->list[i].length); + { + if (k->aux && k->aux->attach_count) + { + size_t nptrs = (k->host_end - k->host_start + + sizeof (void *) - 1) / sizeof (void *); + for (size_t j = 0; j < nptrs; j++) + if (k->aux->attach_count[j] > 0) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("copyout of block [%p,+%d] with " + "attached pointers", + (void *) (k->host_start + tgt->list[i].offset), + (int) (k->host_end - k->host_start)); + } + } + gomp_copy_dev2host (devicep, aq, + (void *) (k->host_start + tgt->list[i].offset), + (void *) (k->tgt->tgt_start + k->tgt_offset + + tgt->list[i].offset), + tgt->list[i].length); + } + if (do_unmap) { struct target_mem_desc *k_tgt = k->tgt; diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-dynamic-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-dynamic-1.c new file mode 100644 index 00000000000..bc4e297fa6f --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-dynamic-1.c @@ -0,0 +1,31 @@ +/* { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } } */ + +#include <stdlib.h> + +struct mystruct { + int *arr; +}; + +int +main (int argc, char *argv[]) +{ + int localarray[1024]; + struct mystruct s; + s.arr = localarray; + + #pragma acc enter data copyin(s) + + #pragma acc data copy(s.arr[0:1024]) + { + /* This directive does one too many attachments: it should fail when we try + to do the copyout below. */ + #pragma acc enter data attach(s.arr) + /* { dg-output "copyout of block \\\[0x\[0-9a-f\]+,\\+\[0-9\]+\\\] with attached pointers" } */ + } + + #pragma acc exit data copyout(s) + + return 0; +} + +/* { dg-shouldfail "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-1.c new file mode 100644 index 00000000000..7846c8c717c --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-1.c @@ -0,0 +1,30 @@ +/* { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } } */ + +#include <stdlib.h> + +struct mystruct { + int *arr; +}; + +int +main (int argc, char *argv[]) +{ + int localarray[1024]; + struct mystruct s; + s.arr = localarray; + + #pragma acc data copy(s) + { + #pragma acc data copy(s.arr[0:1024]) + { + /* This directive does one too many attachments: it should fail when we try + to do the copyout below. */ + #pragma acc enter data attach(s.arr) + /* { dg-output "copyout of block \\\[0x\[0-9a-f\]+,\\+\[0-9\]+\\\] with attached pointers" } */ + } + } + + return 0; +} + +/* { dg-shouldfail "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-2.c new file mode 100644 index 00000000000..bffa06eb725 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-2.c @@ -0,0 +1,31 @@ +/* { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } } */ + +#include <stdlib.h> + +struct mystruct { + int *arr; +}; + +int +main (int argc, char *argv[]) +{ + int localarray[1024]; + struct mystruct s; + s.arr = localarray; + + #pragma acc enter data copyin(localarray[0:1024]) + + #pragma acc data copy(s) + { + /* This directive does one too many attachments: it should fail when we try + to do the copyout below. */ + #pragma acc enter data attach(s.arr) + /* { dg-output "copyout of block \\\[0x\[0-9a-f\]+,\\+\[0-9\]+\\\] with attached pointers" } */ + } + + #pragma acc exit data delete(localarray[0:1024]) + + return 0; +} + +/* { dg-shouldfail "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-3.c new file mode 100644 index 00000000000..4b21677af09 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-3.c @@ -0,0 +1,26 @@ +#include <stdlib.h> + +struct mystruct { + int *arr; +}; + +int +main (int argc, char *argv[]) +{ + int localarray[1024]; + struct mystruct s; + s.arr = localarray; + + #pragma acc enter data copyin(localarray[0:1024]) + + #pragma acc data copy(s) + { + /* Here the attach and detach balance: this should work. */ + #pragma acc enter data attach(s.arr) + #pragma acc exit data detach(s.arr) + } + + #pragma acc exit data delete(localarray[0:1024]) + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-dynamic-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-dynamic-1.c new file mode 100644 index 00000000000..e074d507fb2 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-dynamic-1.c @@ -0,0 +1,26 @@ +#include <stdlib.h> + +struct mystruct { + int *arr; +}; + +int +main (int argc, char *argv[]) +{ + int localarray[1024]; + struct mystruct s; + s.arr = localarray; + + #pragma acc enter data copyin(s) + + #pragma acc data copy(s.arr[0:1024]) + { + /* We delete 's' from the target below: this extra attachment is not + dangerous and we do not raise an error. */ + #pragma acc enter data attach(s.arr) + } + + #pragma acc exit data delete(s) + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-1.c new file mode 100644 index 00000000000..e675762ecd8 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-1.c @@ -0,0 +1,25 @@ +#include <stdlib.h> + +struct mystruct { + int *arr; +}; + +int +main (int argc, char *argv[]) +{ + int localarray[1024]; + struct mystruct s; + s.arr = localarray; + + #pragma acc data copyin(s) + { + #pragma acc data copy(s.arr[0:1024]) + { + /* This directive does one too many attachments: it should fail when we try + to do the copyout below. */ + #pragma acc enter data attach(s.arr) + } + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-2.c new file mode 100644 index 00000000000..d2095255ad3 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-2.c @@ -0,0 +1,26 @@ +#include <stdlib.h> + +struct mystruct { + int *arr; +}; + +int +main (int argc, char *argv[]) +{ + int localarray[1024]; + struct mystruct s; + s.arr = localarray; + + #pragma acc enter data copyin(localarray[0:1024]) + + #pragma acc data copyin(s) + { + /* We only try to copy in: the extra attachment we're left over with is not + harmful and we don't raise an error. */ + #pragma acc enter data attach(s.arr) + } + + #pragma acc exit data delete(localarray[0:1024]) + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/update-attached-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/update-attached-1.c new file mode 100644 index 00000000000..9f60bfa56f4 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/update-attached-1.c @@ -0,0 +1,33 @@ +/* { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } } */ + +#include <stdlib.h> + +struct mystruct { + int *arr; +}; + +int +main (int argc, char *argv[]) +{ + int localarray[1024]; + int localarray2[1024]; + struct mystruct s; + s.arr = localarray; + + #pragma acc enter data copyin(s) + + #pragma acc data copy(s.arr[0:1024]) + { + s.arr = localarray2; + /* This update is dangerous because we have attached pointers: raise an + error. */ + #pragma acc update device(s) + /* { dg-output "\\\[0x\[0-9a-f\]+,\\+\[0-9\]+\\\] device update would overwrite attached pointers" } */ + } + + #pragma acc exit data delete(s) + + return 0; +} + +/* { dg-shouldfail "" } */ diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90 index ad8da71d7c9..355a381b625 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90 @@ -8,7 +8,7 @@ ! { dg-output ".*CheCKpOInT1(\n|\r\n|\r)" } ! { dg-output ".*CheCKpOInT2(\n|\r\n|\r)" } -! Without the finalize, we do not detach properly so the host sees a device -! pointer, and fails with this STOP code. -! { dg-output "STOP 7(\n|\r\n|\r)+" } +! Without the finalize, we do not detach properly and raise an error on attempting +! the copyout. +! { dg-output ".*copyout of block \\\[0x\[0-9a-f\]+,\\+\[0-9\]+\\\] with attached pointers(\n|\r\n|\r)+" } ! { dg-shouldfail "" } -- 2.23.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/2] [OpenACC] Refuse update/copyout for blocks with attached pointers 2020-06-22 12:14 ` [PATCH 1/2] [OpenACC] Refuse update/copyout for blocks with attached pointers Julian Brown @ 2020-07-24 13:37 ` Thomas Schwinge 0 siblings, 0 replies; 6+ messages in thread From: Thomas Schwinge @ 2020-07-24 13:37 UTC (permalink / raw) To: Julian Brown; +Cc: Jakub Jelinek, gcc-patches Hi Julian! Quoting your parent email: On 2020-06-22T05:14:42-0700, Julian Brown <julian@codesourcery.com> wrote: > Investigating PR95590, I realised that we can do better at diagnosing > some potentially troublesome usage of OpenACC "attach" behaviour, namely > updating blocks with attached pointers. Updating either the host copy > or device copy of such a block is problematic -- for a host update, > the host may get a clobbered (device) version of a host pointer in its > local version of the block (e.g. struct). A device update may clobber > an attached device pointer with a host pointer. ACK. > The spec text (OpenACC 3.0, "2.6.8. Attachment Counter") covering this > case is: > > "Pointer members of structs, classes, or derived types in device > or host memory can be overwritten due to update directives or API > routines. It is the user’s responsibility to ensure that the pointers > have the appropriate values before or after the data movement in > either direction. The behavior of the program is undefined if any > of the pointer members are attached when an update of a composite > variable is performed." > > The first patch in this series addresses that paragraph by making > such updates (as well as copyouts, similarly) be runtime errors. Hmm. But why do you say "addresses [...] by making [...] be runtime errors" if the specification text *explicitly* states ("It is the user's responsibility") that doing such things invokes undefined behavior, and thus a user must not do that. (Here, the undefined behavior is: copying of host vs. device pointers -- I wouldn't assume (user), respectively imply (implementor) anything worse?) It's of couse good if we can (without much overhead) be helpful to the user (your proposed runtime error), but I want to make sure that I'm correctly understanding your rationale here. On 2020-06-22T05:14:43-0700, Julian Brown <julian@codesourcery.com> wrote: > As mentioned in the parent email, this patch adds diagnostics for > probably-broken code that updates (host/device) or copies-out blocks > that still have attached pointers. Several new tests have been added. > > OK? I so far haven't managed to really convince myself that we want to incur this overhead here. (I suppose it's not too much overhead.) I may re-consider this still. I suppose we can put this onto the backburner -- nothing else functionally depends on this? Assuming this checking does get installed (and enabled by default), I had the idea that we may (rather easily?) add a flag variable (ICV; initialized from an environment variable) to guard this checking behavior? I suppose we may now have a few libgomp testcases that actually do use 'acc_update_self' etc. to read out pointer values from visible device copies, and verify these, which wouldn't work any longer with that checking enabled. Such tests could then 'dg-set-target-env-var "GOMP_ATTACH_CHECKING" "0"' (better name is desirable), and have one variant with and one variant without the checking. Grüße Thomas > libgomp/ > * oacc-mem.c (update_dev_host): Raise error on update of block with > attached pointers. > (goacc_exit_data_internal): Raise error on copyout of block with > attached pointers. > * target.c (gomp_unmap_vars_internal): Likewise. > * testsuite/libgomp.oacc-c-c++-common/copyback-attached-dynamic-1.c: > New test. > * testsuite/libgomp.oacc-c-c++-common/delete-attached-dynamic-1.c: > New test. > * testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-1.c: > New test. > * testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-2.c: > New test. > * testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-3.c: > New test. > * testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-1.c: > New test. > * testsuite/libgomp.oacc-c-c++-common/update-attached.c: New test. > * testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90: Update > for new diagnostic. > --- > libgomp/oacc-mem.c | 42 ++++++++++++++++--- > libgomp/target.c | 27 +++++++++--- > .../copyback-attached-dynamic-1.c | 31 ++++++++++++++ > .../copyback-attached-structural-1.c | 30 +++++++++++++ > .../copyback-attached-structural-2.c | 31 ++++++++++++++ > .../copyback-attached-structural-3.c | 26 ++++++++++++ > .../delete-attached-dynamic-1.c | 26 ++++++++++++ > .../delete-attached-structural-1.c | 25 +++++++++++ > .../delete-attached-structural-2.c | 26 ++++++++++++ > .../update-attached-1.c | 33 +++++++++++++++ > .../deep-copy-6-no_finalize.F90 | 6 +-- > 11 files changed, 290 insertions(+), 13 deletions(-) > create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-dynamic-1.c > create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-1.c > create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-2.c > create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-3.c > create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-dynamic-1.c > create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-1.c > create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-2.c > create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/update-attached-1.c > > diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c > index 1816b06bf2d..cf054f14b12 100644 > --- a/libgomp/oacc-mem.c > +++ b/libgomp/oacc-mem.c > @@ -865,6 +865,23 @@ update_dev_host (int is_dev, void *h, size_t s, int async) > gomp_fatal ("[%p,%d] is not mapped", h, (int)s); > } > > + if (n->aux && n->aux->attach_count) > + { > + size_t nptrs = (n->host_end - n->host_start + sizeof (void *) - 1) > + / sizeof (void *); > + for (size_t i = 0; i < nptrs; i++) > + if (n->aux->attach_count[i] > 0) > + { > + gomp_mutex_unlock (&acc_dev->lock); > + if (is_dev) > + gomp_fatal ("[%p,+%d] device update would overwrite attached " > + "pointers", h, (int) s); > + else > + gomp_fatal ("host update from block [%p,+%d] with attached " > + "pointers", h, (int) s); > + } > + } > + > d = (void *) (n->tgt->tgt_start + n->tgt_offset > + (uintptr_t) h - n->host_start); > > @@ -1329,11 +1346,26 @@ goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum, > if (copyfrom > && n->refcount != REFCOUNT_INFINITY > && (kind != GOMP_MAP_FROM || n->refcount == 0)) > - gomp_copy_dev2host (acc_dev, aq, (void *) cur_node.host_start, > - (void *) (n->tgt->tgt_start + n->tgt_offset > - + cur_node.host_start > - - n->host_start), > - cur_node.host_end - cur_node.host_start); > + { > + if (n->aux && n->aux->attach_count) > + { > + size_t nptrs = (n->host_end - n->host_start > + + sizeof (void *) - 1) / sizeof (void *); > + for (size_t j = 0; j < nptrs; j++) > + if (n->aux->attach_count[j] > 0) > + { > + gomp_mutex_unlock (&acc_dev->lock); > + gomp_fatal ("copyout of block [%p,+%d] with " > + "attached pointers", hostaddrs[i], > + (int) size); > + } > + } > + gomp_copy_dev2host (acc_dev, aq, (void *) cur_node.host_start, > + (void *) (n->tgt->tgt_start + n->tgt_offset > + + cur_node.host_start > + - n->host_start), > + cur_node.host_end - cur_node.host_start); > + } > > if (n->refcount == 0) > { > diff --git a/libgomp/target.c b/libgomp/target.c > index badc254a777..db6f56a8ff8 100644 > --- a/libgomp/target.c > +++ b/libgomp/target.c > @@ -1649,11 +1649,28 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom, > > if ((do_unmap && do_copyfrom && tgt->list[i].copy_from) > || tgt->list[i].always_copy_from) > - gomp_copy_dev2host (devicep, aq, > - (void *) (k->host_start + tgt->list[i].offset), > - (void *) (k->tgt->tgt_start + k->tgt_offset > - + tgt->list[i].offset), > - tgt->list[i].length); > + { > + if (k->aux && k->aux->attach_count) > + { > + size_t nptrs = (k->host_end - k->host_start > + + sizeof (void *) - 1) / sizeof (void *); > + for (size_t j = 0; j < nptrs; j++) > + if (k->aux->attach_count[j] > 0) > + { > + gomp_mutex_unlock (&devicep->lock); > + gomp_fatal ("copyout of block [%p,+%d] with " > + "attached pointers", > + (void *) (k->host_start + tgt->list[i].offset), > + (int) (k->host_end - k->host_start)); > + } > + } > + gomp_copy_dev2host (devicep, aq, > + (void *) (k->host_start + tgt->list[i].offset), > + (void *) (k->tgt->tgt_start + k->tgt_offset > + + tgt->list[i].offset), > + tgt->list[i].length); > + } > + > if (do_unmap) > { > struct target_mem_desc *k_tgt = k->tgt; > diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-dynamic-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-dynamic-1.c > new file mode 100644 > index 00000000000..bc4e297fa6f > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-dynamic-1.c > @@ -0,0 +1,31 @@ > +/* { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } } */ > + > +#include <stdlib.h> > + > +struct mystruct { > + int *arr; > +}; > + > +int > +main (int argc, char *argv[]) > +{ > + int localarray[1024]; > + struct mystruct s; > + s.arr = localarray; > + > + #pragma acc enter data copyin(s) > + > + #pragma acc data copy(s.arr[0:1024]) > + { > + /* This directive does one too many attachments: it should fail when we try > + to do the copyout below. */ > + #pragma acc enter data attach(s.arr) > + /* { dg-output "copyout of block \\\[0x\[0-9a-f\]+,\\+\[0-9\]+\\\] with attached pointers" } */ > + } > + > + #pragma acc exit data copyout(s) > + > + return 0; > +} > + > +/* { dg-shouldfail "" } */ > diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-1.c > new file mode 100644 > index 00000000000..7846c8c717c > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-1.c > @@ -0,0 +1,30 @@ > +/* { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } } */ > + > +#include <stdlib.h> > + > +struct mystruct { > + int *arr; > +}; > + > +int > +main (int argc, char *argv[]) > +{ > + int localarray[1024]; > + struct mystruct s; > + s.arr = localarray; > + > + #pragma acc data copy(s) > + { > + #pragma acc data copy(s.arr[0:1024]) > + { > + /* This directive does one too many attachments: it should fail when we try > + to do the copyout below. */ > + #pragma acc enter data attach(s.arr) > + /* { dg-output "copyout of block \\\[0x\[0-9a-f\]+,\\+\[0-9\]+\\\] with attached pointers" } */ > + } > + } > + > + return 0; > +} > + > +/* { dg-shouldfail "" } */ > diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-2.c > new file mode 100644 > index 00000000000..bffa06eb725 > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-2.c > @@ -0,0 +1,31 @@ > +/* { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } } */ > + > +#include <stdlib.h> > + > +struct mystruct { > + int *arr; > +}; > + > +int > +main (int argc, char *argv[]) > +{ > + int localarray[1024]; > + struct mystruct s; > + s.arr = localarray; > + > + #pragma acc enter data copyin(localarray[0:1024]) > + > + #pragma acc data copy(s) > + { > + /* This directive does one too many attachments: it should fail when we try > + to do the copyout below. */ > + #pragma acc enter data attach(s.arr) > + /* { dg-output "copyout of block \\\[0x\[0-9a-f\]+,\\+\[0-9\]+\\\] with attached pointers" } */ > + } > + > + #pragma acc exit data delete(localarray[0:1024]) > + > + return 0; > +} > + > +/* { dg-shouldfail "" } */ > diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-3.c > new file mode 100644 > index 00000000000..4b21677af09 > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/copyback-attached-structural-3.c > @@ -0,0 +1,26 @@ > +#include <stdlib.h> > + > +struct mystruct { > + int *arr; > +}; > + > +int > +main (int argc, char *argv[]) > +{ > + int localarray[1024]; > + struct mystruct s; > + s.arr = localarray; > + > + #pragma acc enter data copyin(localarray[0:1024]) > + > + #pragma acc data copy(s) > + { > + /* Here the attach and detach balance: this should work. */ > + #pragma acc enter data attach(s.arr) > + #pragma acc exit data detach(s.arr) > + } > + > + #pragma acc exit data delete(localarray[0:1024]) > + > + return 0; > +} > diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-dynamic-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-dynamic-1.c > new file mode 100644 > index 00000000000..e074d507fb2 > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-dynamic-1.c > @@ -0,0 +1,26 @@ > +#include <stdlib.h> > + > +struct mystruct { > + int *arr; > +}; > + > +int > +main (int argc, char *argv[]) > +{ > + int localarray[1024]; > + struct mystruct s; > + s.arr = localarray; > + > + #pragma acc enter data copyin(s) > + > + #pragma acc data copy(s.arr[0:1024]) > + { > + /* We delete 's' from the target below: this extra attachment is not > + dangerous and we do not raise an error. */ > + #pragma acc enter data attach(s.arr) > + } > + > + #pragma acc exit data delete(s) > + > + return 0; > +} > diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-1.c > new file mode 100644 > index 00000000000..e675762ecd8 > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-1.c > @@ -0,0 +1,25 @@ > +#include <stdlib.h> > + > +struct mystruct { > + int *arr; > +}; > + > +int > +main (int argc, char *argv[]) > +{ > + int localarray[1024]; > + struct mystruct s; > + s.arr = localarray; > + > + #pragma acc data copyin(s) > + { > + #pragma acc data copy(s.arr[0:1024]) > + { > + /* This directive does one too many attachments: it should fail when we try > + to do the copyout below. */ > + #pragma acc enter data attach(s.arr) > + } > + } > + > + return 0; > +} > diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-2.c > new file mode 100644 > index 00000000000..d2095255ad3 > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/delete-attached-structural-2.c > @@ -0,0 +1,26 @@ > +#include <stdlib.h> > + > +struct mystruct { > + int *arr; > +}; > + > +int > +main (int argc, char *argv[]) > +{ > + int localarray[1024]; > + struct mystruct s; > + s.arr = localarray; > + > + #pragma acc enter data copyin(localarray[0:1024]) > + > + #pragma acc data copyin(s) > + { > + /* We only try to copy in: the extra attachment we're left over with is not > + harmful and we don't raise an error. */ > + #pragma acc enter data attach(s.arr) > + } > + > + #pragma acc exit data delete(localarray[0:1024]) > + > + return 0; > +} > diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/update-attached-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/update-attached-1.c > new file mode 100644 > index 00000000000..9f60bfa56f4 > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/update-attached-1.c > @@ -0,0 +1,33 @@ > +/* { dg-skip-if "" { *-*-* } { "-DACC_MEM_SHARED=1" } } */ > + > +#include <stdlib.h> > + > +struct mystruct { > + int *arr; > +}; > + > +int > +main (int argc, char *argv[]) > +{ > + int localarray[1024]; > + int localarray2[1024]; > + struct mystruct s; > + s.arr = localarray; > + > + #pragma acc enter data copyin(s) > + > + #pragma acc data copy(s.arr[0:1024]) > + { > + s.arr = localarray2; > + /* This update is dangerous because we have attached pointers: raise an > + error. */ > + #pragma acc update device(s) > + /* { dg-output "\\\[0x\[0-9a-f\]+,\\+\[0-9\]+\\\] device update would overwrite attached pointers" } */ > + } > + > + #pragma acc exit data delete(s) > + > + return 0; > +} > + > +/* { dg-shouldfail "" } */ > diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90 > index ad8da71d7c9..355a381b625 100644 > --- a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90 > +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90 > @@ -8,7 +8,7 @@ > ! { dg-output ".*CheCKpOInT1(\n|\r\n|\r)" } > ! { dg-output ".*CheCKpOInT2(\n|\r\n|\r)" } > > -! Without the finalize, we do not detach properly so the host sees a device > -! pointer, and fails with this STOP code. > -! { dg-output "STOP 7(\n|\r\n|\r)+" } > +! Without the finalize, we do not detach properly and raise an error on attempting > +! the copyout. > +! { dg-output ".*copyout of block \\\[0x\[0-9a-f\]+,\\+\[0-9\]+\\\] with attached pointers(\n|\r\n|\r)+" } > ! { dg-shouldfail "" } ----------------- Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 2/2] [OpenACC] Detect pointer updates for attach operations (PR95590) 2020-06-22 12:14 [PATCH 0/2] [OpenACC] Attached deep-copy pointers, diagnostics & modifications Julian Brown 2020-06-22 12:14 ` [PATCH 1/2] [OpenACC] Refuse update/copyout for blocks with attached pointers Julian Brown @ 2020-06-22 12:14 ` Julian Brown 2020-07-24 14:04 ` Thomas Schwinge 1 sibling, 1 reply; 6+ messages in thread From: Julian Brown @ 2020-06-22 12:14 UTC (permalink / raw) To: gcc-patches; +Cc: Thomas Schwinge, Jakub Jelinek, Catherine Moore As mentioned in the parent email, this is a fix for PR95590 that detects updates of attached pointers in blocks, and rewrites the attached pointer and resets its attachment counter appropriately. I am however not entirely sure this is desirable or required by the spec: points against are: - To avoid expensive copies from the device to the host and/or "wrong way" device-to-host splay tree lookups, it requires keeping an extra shadow copy of mapped blocks on the host in order to detect if a host pointer with attachments in the block has been changed between attach operations. We incur this overhead unconditionally if attach/detach are in use for what's not likely to be a common use case (it's slightly tricky to write a test case to exercise the behaviour, even -- Thomas's unmodified original for the PR raises an error after the previous patch in this series). - From a user perspective, I think it's going to be quite easy to get confused wrt. the hidden attachment counter state, with this kind of reset-on-host-pointer-modification behaviour. Mind you, silently *not* doing the update is likewise going to be confusing (the stale device pointer would be updated at present). Maybe this should be detected as an error instead? - The text in "2.6.8. Attachment Counter" *might* contribute to the argument that this kind of pointer-update detection is not required. Anyway, thoughts, or OK for mainline? Thanks, Julian ChangeLog PR libgomp/95590 libgomp/ * target.c (gomp_attach_pointer): Initialise shadow copy of block with attached pointers, and use to detect modifications of those pointers. * testsuite/libgomp.oacc-c-c++-common/attach-ptr-change-1.c: New test. --- libgomp/target.c | 29 +++++++- .../attach-ptr-change-1.c | 74 +++++++++++++++++++ 2 files changed, 100 insertions(+), 3 deletions(-) create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/attach-ptr-change-1.c diff --git a/libgomp/target.c b/libgomp/target.c index db6f56a8ff8..076cc2bbbcb 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -691,6 +691,8 @@ gomp_attach_pointer (struct gomp_device_descr *devicep, { struct splay_tree_key_s s; size_t size, idx; + char *shadow_block; + size_t shadow_size = n->host_end - n->host_start; if (n == NULL) { @@ -707,9 +709,31 @@ gomp_attach_pointer (struct gomp_device_descr *devicep, if (!n->aux) n->aux = gomp_malloc_cleared (sizeof (struct splay_tree_aux)); + bool first = false; + if (!n->aux->attach_count) - n->aux->attach_count - = gomp_malloc_cleared (sizeof (*n->aux->attach_count) * size); + { + n->aux->attach_count + = gomp_malloc_cleared (sizeof (*n->aux->attach_count) * size + + shadow_size); + first = true; + } + + shadow_block = ((char *) n->aux->attach_count) + + sizeof (*n->aux->attach_count) * size; + + if (first) + memcpy (shadow_block, (const void *) n->host_start, shadow_size); + + uintptr_t target = (uintptr_t) *(void **) attach_to; + uintptr_t shadow_target + = (uintptr_t) *(void **) (shadow_block + attach_to - n->host_start); + if (target != shadow_target) + { + n->aux->attach_count[idx] = 0; + memcpy ((char *) shadow_block + attach_to - n->host_start, + (const void *) target, sizeof (void *)); + } if (n->aux->attach_count[idx] < UINTPTR_MAX) n->aux->attach_count[idx]++; @@ -723,7 +747,6 @@ gomp_attach_pointer (struct gomp_device_descr *devicep, { uintptr_t devptr = n->tgt->tgt_start + n->tgt_offset + attach_to - n->host_start; - uintptr_t target = (uintptr_t) *(void **) attach_to; splay_tree_key tn; uintptr_t data; diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/attach-ptr-change-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/attach-ptr-change-1.c new file mode 100644 index 00000000000..d4d84fdb092 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/attach-ptr-change-1.c @@ -0,0 +1,74 @@ +#include <assert.h> +#include <stdlib.h> +#include <openacc.h> + +struct str { + unsigned char *c; +}; + +int main() +{ + const int size_1 = sizeof (void *); + unsigned char *data_1 = (unsigned char *) malloc(sizeof (void *)); + assert(data_1); + void *data_1_d = acc_create(data_1, size_1); + assert(data_1_d); + assert(acc_is_present(data_1, size_1)); + + const int size_2 = sizeof (void *); + unsigned char *data_2 = (unsigned char *) malloc(size_2); + assert(data_2); + void *data_2_d = acc_create(data_2, size_2); + assert(data_2_d); + assert(acc_is_present(data_2, size_2)); + + struct str data_work; + data_work.c = data_1; + + acc_copyin(&data_work, sizeof data_work); + assert(acc_is_present(&data_work, sizeof data_work)); + assert(data_work.c == data_1); + + /* No attach has taken place so far. We can still do a self-update. */ + acc_update_self(&data_work, sizeof data_work); + assert(data_work.c == data_1); + + data_1[0] = 'a'; + data_2[0] = 'b'; + + acc_update_device (data_1, size_1); + acc_update_device (data_2, size_2); + + acc_attach((void **) &data_work.c); + #pragma acc serial /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ + { + data_work.c[0] = 'c'; + } + + acc_update_self (data_1, size_1); + acc_update_self (data_2, size_2); + + assert (data_1[0] == 'c'); + assert (data_2[0] == 'b'); + + data_1[0] = 'a'; + data_2[0] = 'b'; + + acc_update_device (data_1, size_1); + acc_update_device (data_2, size_2); + + data_work.c = data_2; + acc_attach((void **) &data_work.c); + #pragma acc serial /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ + { + data_work.c[0] = 'd'; + } + + acc_update_self (data_1, size_1); + acc_update_self (data_2, size_2); + + assert (data_1[0] == 'a'); + assert (data_2[0] == 'd'); + + return 0; +} -- 2.23.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] [OpenACC] Detect pointer updates for attach operations (PR95590) 2020-06-22 12:14 ` [PATCH 2/2] [OpenACC] Detect pointer updates for attach operations (PR95590) Julian Brown @ 2020-07-24 14:04 ` Thomas Schwinge 2020-07-24 22:36 ` Julian Brown 0 siblings, 1 reply; 6+ messages in thread From: Thomas Schwinge @ 2020-07-24 14:04 UTC (permalink / raw) To: Julian Brown; +Cc: Jakub Jelinek, gcc-patches Hi Julian! On 2020-06-22T05:14:44-0700, Julian Brown <julian@codesourcery.com> wrote: > As mentioned in the parent email, this is a fix for PR95590 that detects > updates of attached pointers in blocks, and rewrites the attached pointer > and resets its attachment counter appropriately. I am however not entirely > sure this is desirable or required by the spec: points against are: > > - To avoid expensive copies from the device to the host and/or "wrong > way" device-to-host splay tree lookups, it requires keeping an extra > shadow copy of mapped blocks on the host in order to detect if a > host pointer with attachments in the block has been changed between > attach operations. I haven't spent too much time trying, but I too have not yet seen a way to avoid keeping this state ("shadow copy"), or looking it up on demand ("expensive copies from the device to the host"). I suppose we cannot get the necessary information/state from the host-side pointer (value) alone, and/or other state kept in the 'splay_tree_key n' etc.? > We incur this overhead unconditionally if > attach/detach are in use for what's not likely to be a common use case Is the overhead so bad, though? As soon as there's an 'attach', we have to 'malloc' anyway (can combine the two, as you've done), and the checking overhead doesn't seem so bad either? Should we reach out to other OpenACC compiler implementors, and ask for their understanding/approach to this aspect? > (it's slightly tricky to write a test case to exercise the behaviour, > even -- Thomas's unmodified original for the PR raises an error after > the previous patch in this series). Challange accepted! ;-P (..., but not right now.) > - From a user perspective, I think it's going to be quite easy to get > confused wrt. the hidden attachment counter state (Indeed that "hidden" aspect is a bit confusing. I've even thought whether we should add some 'gomp_get_attach_count' function just for our own testing purposes.) > with this kind of > reset-on-host-pointer-modification behaviour. Mind you, silently *not* > doing the update is likewise going to be confusing (the stale device > pointer would be updated at present). Maybe this should be detected > as an error instead? I don't understand that, I'm afraid, because as I have quoted in <https://gcc.gnu.org/PR95590> "OpenACC 'attach' behavior if already attached to different data", OpenACC explicitly mandates the "reset-on-host-pointer-modification" behavior, so I don't see a way to avoid implementing that? > - The text in "2.6.8. Attachment Counter" *might* contribute to the > argument that this kind of pointer-update detection is not required. Do you think these texts are in conflict in some way (that's not obvious to me)? Conceptually we're talking about making this behave: int a[]; #pragma acc enter data create(a) int b[]; #pragma acc enter data create(b) int *p; #pragma acc data create(p) // create long-lived device copy of 'p' { p = a; #pragma acc enter data attach(p) // explicit, or implicit #pragma acc parallel present(p) { fill_array(p); } // writes to device copy of 'a' p = b; #pragma acc enter data attach(p) // explicit, or implicit #pragma acc parallel present(p) { fill_array(p); } // writes to device copy of 'b' } #pragma acc parallel present(a, b) { [use device copies of 'a', 'b'] } ..., or this: int a[]; #pragma acc enter data create(a) int b[]; #pragma acc enter data create(b) struct { int data; int *p; } s s; #pragma acc data create(s) // create long-lived device copy of 's' { s.data = [...]; s.p = a; #pragma acc update device(s) // invokes ("expected") undefined behavior w.r.t. 's.p' #pragma acc parallel // implicit 'attach(s.p)' { fill_array(s.data, s.p); } // writes to device copy of 'a' s.data = [...]; s.p = a; #pragma acc update device(s) // invokes ("expected") undefined behavior w.r.t. 's.p' #pragma acc parallel // implicit 'attach(s.p)' { fill_array(s.data, s.p); } // writes to device copy of 'b' } #pragma acc parallel { [use device copies of 'a', 'b'] } Please verify these conceptually -- and, is there any other, different scenario to consider here? > OK for mainline? I think the implementation is fine, but I'd like to think about all this a bit more. Grüße Thomas > PR libgomp/95590 > > libgomp/ > * target.c (gomp_attach_pointer): Initialise shadow copy of block with > attached pointers, and use to detect modifications of those pointers. > * testsuite/libgomp.oacc-c-c++-common/attach-ptr-change-1.c: New test. > --- > libgomp/target.c | 29 +++++++- > .../attach-ptr-change-1.c | 74 +++++++++++++++++++ > 2 files changed, 100 insertions(+), 3 deletions(-) > create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/attach-ptr-change-1.c > > diff --git a/libgomp/target.c b/libgomp/target.c > index db6f56a8ff8..076cc2bbbcb 100644 > --- a/libgomp/target.c > +++ b/libgomp/target.c > @@ -691,6 +691,8 @@ gomp_attach_pointer (struct gomp_device_descr *devicep, > { > struct splay_tree_key_s s; > size_t size, idx; > + char *shadow_block; > + size_t shadow_size = n->host_end - n->host_start; > > if (n == NULL) > { > @@ -707,9 +709,31 @@ gomp_attach_pointer (struct gomp_device_descr *devicep, > if (!n->aux) > n->aux = gomp_malloc_cleared (sizeof (struct splay_tree_aux)); > > + bool first = false; > + > if (!n->aux->attach_count) > - n->aux->attach_count > - = gomp_malloc_cleared (sizeof (*n->aux->attach_count) * size); > + { > + n->aux->attach_count > + = gomp_malloc_cleared (sizeof (*n->aux->attach_count) * size > + + shadow_size); > + first = true; > + } > + > + shadow_block = ((char *) n->aux->attach_count) > + + sizeof (*n->aux->attach_count) * size; > + > + if (first) > + memcpy (shadow_block, (const void *) n->host_start, shadow_size); > + > + uintptr_t target = (uintptr_t) *(void **) attach_to; > + uintptr_t shadow_target > + = (uintptr_t) *(void **) (shadow_block + attach_to - n->host_start); > + if (target != shadow_target) > + { > + n->aux->attach_count[idx] = 0; > + memcpy ((char *) shadow_block + attach_to - n->host_start, > + (const void *) target, sizeof (void *)); > + } > > if (n->aux->attach_count[idx] < UINTPTR_MAX) > n->aux->attach_count[idx]++; > @@ -723,7 +747,6 @@ gomp_attach_pointer (struct gomp_device_descr *devicep, > { > uintptr_t devptr = n->tgt->tgt_start + n->tgt_offset + attach_to > - n->host_start; > - uintptr_t target = (uintptr_t) *(void **) attach_to; > splay_tree_key tn; > uintptr_t data; > > diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/attach-ptr-change-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/attach-ptr-change-1.c > new file mode 100644 > index 00000000000..d4d84fdb092 > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/attach-ptr-change-1.c > @@ -0,0 +1,74 @@ > +#include <assert.h> > +#include <stdlib.h> > +#include <openacc.h> > + > +struct str { > + unsigned char *c; > +}; > + > +int main() > +{ > + const int size_1 = sizeof (void *); > + unsigned char *data_1 = (unsigned char *) malloc(sizeof (void *)); > + assert(data_1); > + void *data_1_d = acc_create(data_1, size_1); > + assert(data_1_d); > + assert(acc_is_present(data_1, size_1)); > + > + const int size_2 = sizeof (void *); > + unsigned char *data_2 = (unsigned char *) malloc(size_2); > + assert(data_2); > + void *data_2_d = acc_create(data_2, size_2); > + assert(data_2_d); > + assert(acc_is_present(data_2, size_2)); > + > + struct str data_work; > + data_work.c = data_1; > + > + acc_copyin(&data_work, sizeof data_work); > + assert(acc_is_present(&data_work, sizeof data_work)); > + assert(data_work.c == data_1); > + > + /* No attach has taken place so far. We can still do a self-update. */ > + acc_update_self(&data_work, sizeof data_work); > + assert(data_work.c == data_1); > + > + data_1[0] = 'a'; > + data_2[0] = 'b'; > + > + acc_update_device (data_1, size_1); > + acc_update_device (data_2, size_2); > + > + acc_attach((void **) &data_work.c); > + #pragma acc serial /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ > + { > + data_work.c[0] = 'c'; > + } > + > + acc_update_self (data_1, size_1); > + acc_update_self (data_2, size_2); > + > + assert (data_1[0] == 'c'); > + assert (data_2[0] == 'b'); > + > + data_1[0] = 'a'; > + data_2[0] = 'b'; > + > + acc_update_device (data_1, size_1); > + acc_update_device (data_2, size_2); > + > + data_work.c = data_2; > + acc_attach((void **) &data_work.c); > + #pragma acc serial /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ > + { > + data_work.c[0] = 'd'; > + } > + > + acc_update_self (data_1, size_1); > + acc_update_self (data_2, size_2); > + > + assert (data_1[0] == 'a'); > + assert (data_2[0] == 'd'); > + > + return 0; > +} ----------------- Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] [OpenACC] Detect pointer updates for attach operations (PR95590) 2020-07-24 14:04 ` Thomas Schwinge @ 2020-07-24 22:36 ` Julian Brown 0 siblings, 0 replies; 6+ messages in thread From: Julian Brown @ 2020-07-24 22:36 UTC (permalink / raw) To: Thomas Schwinge; +Cc: Jakub Jelinek, gcc-patches On Fri, 24 Jul 2020 16:04:02 +0200 Thomas Schwinge <thomas@codesourcery.com> wrote: > Hi Julian! > > On 2020-06-22T05:14:44-0700, Julian Brown <julian@codesourcery.com> > wrote: > > As mentioned in the parent email, this is a fix for PR95590 that > > detects updates of attached pointers in blocks, and rewrites the > > attached pointer and resets its attachment counter appropriately. I > > am however not entirely sure this is desirable or required by the > > spec: points against are: > > > > - To avoid expensive copies from the device to the host and/or > > "wrong way" device-to-host splay tree lookups, it requires keeping > > an extra shadow copy of mapped blocks on the host in order to > > detect if a host pointer with attachments in the block has been > > changed between attach operations. > > I haven't spent too much time trying, but I too have not yet seen a > way to avoid keeping this state ("shadow copy"), or looking it up on > demand ("expensive copies from the device to the host"). > > I suppose we cannot get the necessary information/state from the > host-side pointer (value) alone, and/or other state kept in the > 'splay_tree_key n' etc.? I don't think so. A different implementation might keep the attachment counters associated with the target_mem_desc (on the "target side"), rather than the splay tree key (the "host side"), in which case the reset-on-host-pointer-modification might sort-of happen for free. But I think that would be quite problematic for other reasons with our current implementation. (Purely speculating, but maybe it "works" somewhat accidentally for PGI because of the way its host-to-device pointer mapping is implemented?) > > We incur this overhead unconditionally if > > attach/detach are in use for what's not likely to be a common > > use case > > Is the overhead so bad, though? As soon as there's an 'attach', we > have to 'malloc' anyway (can combine the two, as you've done), and the > checking overhead doesn't seem so bad either? > > Should we reach out to other OpenACC compiler implementors, and ask > for their understanding/approach to this aspect? I haven't measured the performance impact (it's probably negligible). It may be worth trying to get clarification from OpenACC upstream, though. > > (it's slightly tricky to write a test case to exercise the > > behaviour, even -- Thomas's unmodified original for the PR raises > > an error after the previous patch in this series). > > Challange accepted! ;-P (..., but not right now.) There was a test case attached to the parent email, too :-). > > - From a user perspective, I think it's going to be quite easy to > > get confused wrt. the hidden attachment counter state > > (Indeed that "hidden" aspect is a bit confusing. I've even thought > whether we should add some 'gomp_get_attach_count' function just for > our own testing purposes.) Yeah, maybe. > > with this kind of > > reset-on-host-pointer-modification behaviour. Mind you, > > silently *not* doing the update is likewise going to be confusing > > (the stale device pointer would be updated at present). Maybe this > > should be detected as an error instead? > > I don't understand that, I'm afraid, because as I have quoted in > <https://gcc.gnu.org/PR95590> "OpenACC 'attach' behavior if already > attached to different data", OpenACC explicitly mandates the > "reset-on-host-pointer-modification" behavior, so I don't see a way to > avoid implementing that? > > > - The text in "2.6.8. Attachment Counter" *might* contribute to the > > argument that this kind of pointer-update detection is not > > required. > > Do you think these texts are in conflict in some way (that's not > obvious to me)? I'm still not sure that the intended meaning (in OpenACC 2.6, 2.7.2. "Data Clause Actions", "Attach Action") is what you are reading into it. See also "2.7.1. Data Specification in Data Clauses", under Restrictions: "* In C and C++, modifying pointers in pointer arrays during the data lifetime, either on the host or on the device, may result in undefined behavior." That isn't explicitly about pointers within structs (as we're talking about here), but is of a similar flavour, I think -- in that recognizing host pointer modifications in arrays of pointers would require similar housekeeping in the runtime, but OpenACC 2.6 makes such modifications undefined behaviour instead. The text in "2.6.7. Attachment Counter" (in OpenACC 2.6) is specifically about update operations (acc_update API routines or equivalent directives), but again, detecting pointer modifications (on the host side) between successive "attach" operations seems like a departure from *not* needing to do the same for update operations. Should we also support modifications of attached pointers (e.g. in mapped structs) in device-side code? Why or why not? (That wouldn't be impossible, but the details of how it could work would be ugly indeed...) Here's a quick example of "weird" behaviour that would arise with the pointer-modification detection patch: #include <assert.h> #include <stdlib.h> struct mystr { int *ptr; }; #define N 1024 int main (int argc, char *argv[]) { int *arr1 = malloc (sizeof (int) * N); int *arr2 = malloc (sizeof (int) * N); struct mystr s; for (int i = 0; i < N; i++) { arr1[i] = i; arr2[i] = i * 2; } s.ptr = arr1; #pragma acc enter data copyin(s) #pragma acc data copy(s.ptr[0:N]) { s.ptr = arr2; #pragma acc parallel loop copy(s.ptr[0:N]) for (int i = 0; i < N; i++) s.ptr[i] = i * 3; } for (int i = 0; i < N; i++) { assert (arr1[i] == i); assert (arr2[i] == i * 3); } free (arr1); free (arr2); } With the patch, this gives: libgomp: attach count underflow Though of course it doesn't work properly without the pointer-modification detection patch either. This example could be made to work, but it would mean *not* resetting the attachment counter to one on detecting a modified host pointer -- the pointer mapping would be modified but the attachment counter would be incremented as usual (at the start of the "acc parallel"). That's arguably the right thing to do perhaps, but it's clearly not what the spec says, even with your reading. HTH, Julian ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-07-24 22:36 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-06-22 12:14 [PATCH 0/2] [OpenACC] Attached deep-copy pointers, diagnostics & modifications Julian Brown 2020-06-22 12:14 ` [PATCH 1/2] [OpenACC] Refuse update/copyout for blocks with attached pointers Julian Brown 2020-07-24 13:37 ` Thomas Schwinge 2020-06-22 12:14 ` [PATCH 2/2] [OpenACC] Detect pointer updates for attach operations (PR95590) Julian Brown 2020-07-24 14:04 ` Thomas Schwinge 2020-07-24 22:36 ` Julian Brown
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).