* [gomp4] Implement OpenACC 2.5 reference counting, and finalize clause
@ 2017-05-16 13:11 Chung-Lin Tang
2017-05-17 11:51 ` Thomas Schwinge
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Chung-Lin Tang @ 2017-05-16 13:11 UTC (permalink / raw)
To: gcc-patches; +Cc: Thomas Schwinge
[-- Attachment #1: Type: text/plain, Size: 5337 bytes --]
This patch for gomp-4_0-branch implements OpenACC 2.5 reference counting
of mappings, the finalize clause of the exit data directive, and the
corresponding API routines.
Tested without regressions, committed to gomp-4_0-branch.
Chung-Lin
2017-05-16 Chung-Lin Tang <cltang@codesourcery.com>
gcc/c/
* c-parser.c (c_parser_omp_clause_name): Handle 'finalize' clause.
(c_parser_oacc_simple_clause): Add 'finalize' to comments.
(c_parser_oacc_all_clauses): Handle PRAGMA_OACC_CLAUSE_FINALIZE.
(OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_FINALIZE.
* c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE_FINALIZE.
gcc/cp/
* parser.c (cp_parser_omp_clause_name): Handle 'finalize' clause.
(cp_parser_oacc_simple_clause): Add 'finalize' to comments.
(cp_parser_oacc_all_clauses): Handle PRAGMA_OACC_CLAUSE_FINALIZE.
(OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_FINALIZE.
* semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_FINALIZE.
gcc/c-family/
* c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_FINALIZE.
gcc/fortran/
* gfortran.h (struct gfc_omp_clauses): Add 'finalize:1' bitfield.
* openmp.c (enum omp_mask2): Add OMP_CLAUSE_FINALIZE.
(gfc_match_omp_clauses): Handle 'finalize' clause.
(OACC_EXIT_DATA_CLAUSES): Add OMP_CLAUSE_FINALIZE.
* trans-openmp.c (gfc_trans_omp_clauses_1): Handle finalize bit.
gcc/
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_FINALIZE.
* tree.c (omp_clause_num_ops): Add entry for OMP_CLAUSE_FINALIZE.
(omp_clause_code_name): Add "finalize" entry.
* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE_FINALIZE.
(expand_omp_target): Add finalize argument for GOACC_enter_exit_data
call.
* gimplify.c (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_FINALIZE.
(gimplify_adjust_omp_clauses): Likewise.
libgomp/
* openacc.h (acc_copyout_finalize): Declare new API function.
(acc_copyout_finalize_async): Likewise.
(acc_delete_finalize): Likewise.
(acc_delete_finalize_async): Likewise.
* openacc_lib.h (acc_copyout_finalize): Declare new API function.
(acc_copyout_finalize_async): Likewise.
(acc_delete_finalize): Likewise.
(acc_delete_finalize_async): Likewise.
* openacc.f90 (acc_copyout_finalize_32_h): Define.
(acc_copyout_finalize_64_h): Likewise.
(acc_copyout_finalize_array_h): Likewise.
(acc_copyout_finalize_l): Likewise.
(acc_copyout_finalize_async_32_h): Define.
(acc_copyout_finalize_async_64_h): Likewise.
(acc_copyout_finalize_async_array_h): Likewise.
(acc_copyout_finalize_async_l): Likewise.
(acc_delete_finalize_32_h): Define.
(acc_delete_finalize_64_h): Likewise.
(acc_delete_finalize_array_h): Likewise.
(acc_delete_finalize_l): Likewise.
(acc_delete_finalize_async_32_h): Define.
(acc_delete_finalize_async_64_h): Likewise.
(acc_delete_finalize_async_array_h): Likewise.
(acc_delete_finalize_async_l): Likewise.
* libgomp.map (OACC_2.5): Add acc_copyout_finalize* and
acc_delete_finalize* entries.
* libgomp.h (struct splay_tree_key_s): Add 'dynamic_refcount' field.
(gomp_acc_remove_pointer): Adjust declaration.
(gomp_remove_var): New declaration.
* libgomp_g.h (GOACC_enter_exit_data): Adjust declaration.
* oacc-mem.c (acc_map_data): Adjust new key refcount to REFCOUNT_INFINITY.
(acc_unmap_data): Adjust key refcount to 1 for removal.
(present_create_copy): Increment mapping refcounts when mapping exists,
initialize dynamic refcount when creating new mapping.
(FLAG_FINALIZE): Define macro.
(delete_copyout): Adjust delete/copyout handling, add handling for FLAG_FINALIZE.
(acc_delete_finalize): Define new API function.
(acc_delete_finalize_async): Likewise.
(acc_copyout_finalize): Likewise.
(acc_copyout_finalize_async): Likewise.
(gomp_acc_insert_pointer): Adjust handling.
(gomp_acc_remove_pointer): Add finalize parameter, adjust handling.
* oacc-parallel.c (GOACC_parallel_keyed): Disable async registering when no
copyout needed.
(GOACC_enter_exit_data): Add and handle finalize argument, adjust
gomp_acc_insert_pointer and gomp_acc_remove_pointer calls.
(GOACC_declare): Adjust calls to GOACC_enter_exit_data.
* target.c (gomp_map_vars): Initialize dynamic_refcount.
(gomp_remove_var): Abstract out key unreferencing into new function.
(gomp_unmap_vars): Adjust to call gomp_remove_var.
(gomp_unload_image_from_device): Likewise.
(gomp_exit_data): Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-2.c: Adjust testcase for 2.5 reference counting.
* testsuite/libgomp.oacc-c-c++-common/lib-38.c: Likewise.
* testsuite/libgomp.oacc-fortran/data-2.f90: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-4.c: New test.
* testsuite/libgomp.oacc-c-c++-common/data-5.c: Likewise.
* testsuite/libgomp.oacc-fortran/data-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/data-6.f90: Likewise.
[-- Attachment #2: oacc25-refcount.patch --]
[-- Type: text/x-patch, Size: 51065 bytes --]
Index: libgomp/oacc-parallel.c
===================================================================
--- libgomp/oacc-parallel.c (revision 248095)
+++ libgomp/oacc-parallel.c (revision 248096)
@@ -355,7 +355,22 @@
}
}
else
- tgt->device_descr->openacc.register_async_cleanup_func (tgt, async);
+ {
+ bool async_unmap = false;
+ for (size_t i = 0; i < tgt->list_count; i++)
+ {
+ splay_tree_key k = tgt->list[i].key;
+ if (k && k->refcount == 1)
+ {
+ async_unmap = true;
+ break;
+ }
+ }
+ if (async_unmap)
+ tgt->device_descr->openacc.register_async_cleanup_func (tgt, async);
+ else
+ gomp_unmap_vars (tgt, false);
+ }
acc_dev->openacc.async_set_async_func (acc_async_sync);
@@ -586,7 +601,7 @@
void
GOACC_enter_exit_data (int device, size_t mapnum,
void **hostaddrs, size_t *sizes, unsigned short *kinds,
- int async, int num_waits, ...)
+ int async, int finalize, int num_waits, ...)
{
struct goacc_thread *thr;
struct gomp_device_descr *acc_dev;
@@ -749,11 +764,9 @@
if (kind == GOMP_MAP_DECLARE_ALLOCATE)
gomp_acc_declare_allocate (true, pointer, &hostaddrs[i],
&sizes[i], &kinds[i]);
- else if (!acc_is_present (hostaddrs[i], sizes[i]))
- {
- gomp_acc_insert_pointer (pointer, &hostaddrs[i],
- &sizes[i], &kinds[i]);
- }
+ else
+ gomp_acc_insert_pointer (pointer, &hostaddrs[i],
+ &sizes[i], &kinds[i]);
/* Increment 'i' by two because OpenACC requires fortran
arrays to be contiguous, so each PSET is associated with
one of MAP_FORCE_ALLOC/MAP_FORCE_PRESET/MAP_FORCE_TO, and
@@ -775,12 +788,20 @@
{
case GOMP_MAP_DELETE:
if (acc_is_present (hostaddrs[i], sizes[i]))
- acc_delete (hostaddrs[i], sizes[i]);
+ {
+ if (finalize)
+ acc_delete_finalize (hostaddrs[i], sizes[i]);
+ else
+ acc_delete (hostaddrs[i], sizes[i]);
+ }
break;
case GOMP_MAP_DECLARE_DEALLOCATE:
case GOMP_MAP_FROM:
case GOMP_MAP_FORCE_FROM:
- acc_copyout (hostaddrs[i], sizes[i]);
+ if (finalize)
+ acc_copyout_finalize (hostaddrs[i], sizes[i]);
+ else
+ acc_copyout (hostaddrs[i], sizes[i]);
break;
default:
gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
@@ -793,11 +814,12 @@
if (kind == GOMP_MAP_DECLARE_DEALLOCATE)
gomp_acc_declare_allocate (false, pointer, &hostaddrs[i],
&sizes[i], &kinds[i]);
- else if (acc_is_present (hostaddrs[i], sizes[i]))
+ else
{
bool copyfrom = (kind == GOMP_MAP_FORCE_FROM
|| kind == GOMP_MAP_FROM);
- gomp_acc_remove_pointer (hostaddrs[i], copyfrom, async, pointer);
+ gomp_acc_remove_pointer (hostaddrs[i], sizes[i], copyfrom, async,
+ finalize, pointer);
/* See the above comment. */
}
i += pointer - 1;
@@ -1077,7 +1099,7 @@
case GOMP_MAP_POINTER:
case GOMP_MAP_DELETE:
GOACC_enter_exit_data (device, 1, &hostaddrs[i], &sizes[i],
- &kinds[i], 0, 0);
+ &kinds[i], 0, 0, 0);
break;
case GOMP_MAP_FORCE_DEVICEPTR:
@@ -1086,12 +1108,12 @@
case GOMP_MAP_ALLOC:
if (!acc_is_present (hostaddrs[i], sizes[i]))
GOACC_enter_exit_data (device, 1, &hostaddrs[i], &sizes[i],
- &kinds[i], 0, 0);
+ &kinds[i], 0, 0, 0);
break;
case GOMP_MAP_TO:
GOACC_enter_exit_data (device, 1, &hostaddrs[i], &sizes[i],
- &kinds[i], 0, 0);
+ &kinds[i], 0, 0, 0);
break;
@@ -1098,7 +1120,7 @@
case GOMP_MAP_FROM:
kinds[i] = GOMP_MAP_FORCE_FROM;
GOACC_enter_exit_data (device, 1, &hostaddrs[i], &sizes[i],
- &kinds[i], 0, 0);
+ &kinds[i], 0, 0, 0);
break;
case GOMP_MAP_FORCE_PRESENT:
Index: libgomp/libgomp_g.h
===================================================================
--- libgomp/libgomp_g.h (revision 248095)
+++ libgomp/libgomp_g.h (revision 248096)
@@ -304,7 +304,7 @@
unsigned short *);
extern void GOACC_data_end (void);
extern void GOACC_enter_exit_data (int, size_t, void **,
- size_t *, unsigned short *, int, int, ...);
+ size_t *, unsigned short *, int, int, int, ...);
extern void GOACC_update (int, size_t, void **, size_t *,
unsigned short *, int, int, ...);
extern void GOACC_wait (int, int, ...);
Index: libgomp/openacc.h
===================================================================
--- libgomp/openacc.h (revision 248095)
+++ libgomp/openacc.h (revision 248096)
@@ -118,6 +118,12 @@
void acc_memcpy_to_device_async (void *, void *, size_t, int) __GOACC_NOTHROW;
void acc_memcpy_from_device_async (void *, void *, size_t, int) __GOACC_NOTHROW;
+/* Finalize versions of copyout/delete functions, specified in OpenACC 2.5. */
+void acc_copyout_finalize (void *, size_t) __GOACC_NOTHROW;
+void acc_copyout_finalize_async (void *, size_t, int) __GOACC_NOTHROW;
+void acc_delete_finalize (void *, size_t) __GOACC_NOTHROW;
+void acc_delete_finalize_async (void *, size_t, int) __GOACC_NOTHROW;
+
/* Old names. OpenACC does not specify whether these can or must
not be macros, inlines or aliases for the new names. */
#define acc_pcreate acc_present_or_create
Index: libgomp/libgomp.map
===================================================================
--- libgomp/libgomp.map (revision 248095)
+++ libgomp/libgomp.map (revision 248096)
@@ -388,6 +388,14 @@
acc_copyout_async_32_h_;
acc_copyout_async_64_h_;
acc_copyout_async_array_h_;
+ acc_copyout_finalize;
+ acc_copyout_finalize_32_h_;
+ acc_copyout_finalize_64_h_;
+ acc_copyout_finalize_array_h_;
+ acc_copyout_finalize_async;
+ acc_copyout_finalize_async_32_h_;
+ acc_copyout_finalize_async_64_h_;
+ acc_copyout_finalize_async_array_h_;
acc_create_async;
acc_create_async_32_h_;
acc_create_async_64_h_;
@@ -396,6 +404,14 @@
acc_delete_async_32_h_;
acc_delete_async_64_h_;
acc_delete_async_array_h_;
+ acc_delete_finalize;
+ acc_delete_finalize_32_h_;
+ acc_delete_finalize_64_h_;
+ acc_delete_finalize_array_h_;
+ acc_delete_finalize_async;
+ acc_delete_finalize_async_32_h_;
+ acc_delete_finalize_async_64_h_;
+ acc_delete_finalize_async_array_h_;
acc_get_default_async;
acc_get_default_async_h_;
acc_memcpy_from_device_async;
Index: libgomp/testsuite/libgomp.oacc-fortran/data-5.f90
===================================================================
--- libgomp/testsuite/libgomp.oacc-fortran/data-5.f90 (nonexistent)
+++ libgomp/testsuite/libgomp.oacc-fortran/data-5.f90 (revision 248096)
@@ -0,0 +1,56 @@
+! { dg-do run }
+! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } }
+
+program refcount_test
+ use openacc
+ integer, allocatable :: h(:)
+ integer i, N
+
+ N = 256
+ allocate (h(N))
+
+ do i = 1, N
+ h(i) = i
+ end do
+
+ !$acc enter data create (h(1:N))
+ !$acc enter data copyin (h(1:N))
+ !$acc enter data copyin (h(1:N))
+ !$acc enter data copyin (h(1:N))
+
+ call acc_update_self (h)
+ do i = 1, N
+ if (h(i) .eq. i) c = c + 1
+ end do
+ ! h[] should be filled with uninitialized device values,
+ ! abort if it's not.
+ if (c .eq. N) call abort
+
+ h(:) = 0
+
+ !$acc parallel present (h(1:N))
+ do i = 1, N
+ h(i) = 111
+ end do
+ !$acc end parallel
+
+ ! No actual copyout should happen.
+ call acc_copyout (h)
+ do i = 1, N
+ if (h(i) .ne. 0) call abort
+ end do
+
+ !$acc exit data delete (h(1:N))
+
+ ! This should not actually be deleted yet.
+ if (acc_is_present (h) .eqv. .FALSE.) call abort
+
+ !$acc exit data copyout (h(1:N)) finalize
+
+ do i = 1, N
+ if (h(i) .ne. 111) call abort
+ end do
+
+ if (acc_is_present (h) .eqv. .TRUE.) call abort
+
+end program refcount_test
Index: libgomp/testsuite/libgomp.oacc-fortran/data-2.f90
===================================================================
--- libgomp/testsuite/libgomp.oacc-fortran/data-2.f90 (revision 248095)
+++ libgomp/testsuite/libgomp.oacc-fortran/data-2.f90 (revision 248096)
@@ -157,8 +157,8 @@
!$acc exit data delete (c(0:N), d(0:N))
- if (acc_is_present (c) .eqv. .TRUE.) call abort
- if (acc_is_present (d) .eqv. .TRUE.) call abort
+ if (acc_is_present (c) .eqv. .FALSE.) call abort
+ if (acc_is_present (d) .eqv. .FALSE.) call abort
!$acc exit data delete (c(0:N), d(0:N))
@@ -177,13 +177,13 @@
!$acc exit data delete (c(0:N), d(0:N))
- if (acc_is_present (c) .eqv. .TRUE.) call abort
- if (acc_is_present (d) .eqv. .TRUE.) call abort
+ !if (acc_is_present (c) .eqv. .TRUE.) call abort
+ !if (acc_is_present (d) .eqv. .TRUE.) call abort
!$acc exit data delete (c(0:N), d(0:N))
- if (acc_is_present (c) .eqv. .TRUE.) call abort
- if (acc_is_present (d) .eqv. .TRUE.) call abort
+ if (acc_is_present (c) .eqv. .TRUE.) call abort
+ if (acc_is_present (d) .eqv. .TRUE.) call abort
!$acc enter data present_or_copyin (c(0:N))
Index: libgomp/testsuite/libgomp.oacc-fortran/data-6.f90
===================================================================
--- libgomp/testsuite/libgomp.oacc-fortran/data-6.f90 (nonexistent)
+++ libgomp/testsuite/libgomp.oacc-fortran/data-6.f90 (revision 248096)
@@ -0,0 +1,26 @@
+! { dg-do run }
+! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } }
+
+program refcount_test
+ use openacc
+ integer, allocatable :: h(:)
+ integer i, N
+
+ N = 256
+ allocate (h(N))
+
+ do i = 1, N
+ h(i) = i
+ end do
+
+ !$acc data create (h(1:N))
+ !$acc enter data create (h(1:N))
+ !$acc end data
+
+ if (acc_is_present (h) .eqv. .FALSE.) call abort
+
+ !$acc exit data delete (h(1:N))
+
+ if (acc_is_present (h) .eqv. .TRUE.) call abort
+
+end program refcount_test
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/lib-38.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-38.c (revision 248095)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/lib-38.c (revision 248096)
@@ -38,7 +38,7 @@
memset (&h[0], 0, N);
- acc_copyout (h, N);
+ acc_copyout_finalize (h, N);
for (i = 0; i < N; i++)
{
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/data-2.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/data-2.c (revision 248095)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/data-2.c (revision 248096)
@@ -268,10 +268,10 @@
#pragma acc exit data delete (a[0:N], b[0:N])
- if (acc_is_present (a, nbytes))
+ if (!acc_is_present (a, nbytes))
abort ();
- if (acc_is_present (b, nbytes))
+ if (!acc_is_present (b, nbytes))
abort ();
#pragma acc exit data delete (a[0:N], b[0:N])
@@ -300,10 +300,10 @@
#pragma acc exit data delete (a[0:N], b[0:N])
- if (acc_is_present (a, nbytes))
+ if (!acc_is_present (a, nbytes))
abort ();
- if (acc_is_present (b, nbytes))
+ if (!acc_is_present (b, nbytes))
abort ();
#pragma acc exit data delete (a[0:N], b[0:N])
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/data-4.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/data-4.c (nonexistent)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/data-4.c (revision 248096)
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+ const int N = 256;
+ int i;
+ unsigned char *h;
+ void *d1, *d2;
+
+ h = (unsigned char *) malloc (N);
+
+ for (i = 0; i < N; i++)
+ {
+ h[i] = i;
+ }
+
+#pragma acc data create (h[0:N])
+ {
+ #pragma acc enter data create (h[0:N])
+ }
+
+ if (!acc_is_present (h, N))
+ abort ();
+
+#pragma acc exit data delete (h[0:N])
+
+ if (acc_is_present (h, N))
+ abort ();
+
+ free (h);
+ return 0;
+}
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/data-5.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/data-5.c (nonexistent)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/data-5.c (revision 248096)
@@ -0,0 +1,66 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main (int argc, char **argv)
+{
+ const int N = 256;
+ int i, c;
+ unsigned char *h;
+ void *d1, *d2;
+
+ h = (unsigned char *) malloc (N);
+
+ for (i = 0; i < N; i++)
+ h[i] = i;
+
+ #pragma acc enter data create (h[0:N])
+ #pragma acc enter data copyin (h[0:N])
+ #pragma acc enter data copyin (h[0:N])
+ #pragma acc enter data copyin (h[0:N])
+
+ acc_update_self (h, N);
+ for (i = 0, c = 0; i < N; i++)
+ if (h[i] == i)
+ c++;
+ /* h[] should be filled with uninitialized device values,
+ abort if it's not. */
+ if (c == N)
+ abort ();
+
+ for (i = 0; i < N; i++)
+ h[i] = 0;
+
+ #pragma acc parallel present(h[0:N])
+ {
+ for (i = 0; i < N; i++)
+ h[i] = 111;
+ }
+
+ /* No actual copyout should happen. */
+ acc_copyout (h, N);
+ for (i = 0; i < N; i++)
+ if (h[i] != 0)
+ abort ();
+
+ #pragma acc exit data delete (h[0:N])
+ /* This should not actually be deleted yet. */
+ if (!acc_is_present (h, N))
+ abort ();
+
+ #pragma acc exit data copyout (h[0:N]) finalize
+
+ for (i = 0; i < N; i++)
+ if (h[i] != 111)
+ abort ();
+
+ if (acc_is_present (h, N))
+ abort ();
+
+ free (h);
+ return 0;
+}
Index: libgomp/target.c
===================================================================
--- libgomp/target.c (revision 248095)
+++ libgomp/target.c (revision 248096)
@@ -984,6 +984,7 @@
tgt->list[i].offset = 0;
tgt->list[i].length = k->host_end - k->host_start;
k->refcount = 1;
+ k->dynamic_refcount = 0;
tgt->refcount++;
array->left = NULL;
array->right = NULL;
@@ -1242,6 +1243,23 @@
free (tgt);
}
+attribute_hidden bool
+gomp_remove_var (struct gomp_device_descr *devicep, splay_tree_key k)
+{
+ bool is_tgt_unmapped = false;
+ splay_tree_remove (&devicep->mem_map, k);
+ if (k->link_key)
+ splay_tree_insert (&devicep->mem_map, (splay_tree_node) k->link_key);
+ if (k->tgt->refcount > 1)
+ k->tgt->refcount--;
+ else
+ {
+ is_tgt_unmapped = true;
+ gomp_unmap_tgt (k->tgt);
+ }
+ return is_tgt_unmapped;
+}
+
/* Unmap variables described by TGT. If DO_COPYFROM is true, copy relevant
variables back from device to host: if it is false, it is assumed that this
has been done already. */
@@ -1290,16 +1308,7 @@
+ tgt->list[i].offset),
tgt->list[i].length);
if (do_unmap)
- {
- splay_tree_remove (&devicep->mem_map, k);
- if (k->link_key)
- splay_tree_insert (&devicep->mem_map,
- (splay_tree_node) k->link_key);
- if (k->tgt->refcount > 1)
- k->tgt->refcount--;
- else
- gomp_unmap_tgt (k->tgt);
- }
+ gomp_remove_var (devicep, k);
}
if (tgt->refcount > 1)
@@ -1536,17 +1545,7 @@
else
{
splay_tree_key n = splay_tree_lookup (&devicep->mem_map, &k);
- splay_tree_remove (&devicep->mem_map, n);
- if (n->link_key)
- {
- if (n->tgt->refcount > 1)
- n->tgt->refcount--;
- else
- {
- is_tgt_unmapped = true;
- gomp_unmap_tgt (n->tgt);
- }
- }
+ is_tgt_unmapped = gomp_remove_var (devicep, n);
}
}
@@ -2229,16 +2228,7 @@
- k->host_start),
cur_node.host_end - cur_node.host_start);
if (k->refcount == 0)
- {
- splay_tree_remove (&devicep->mem_map, k);
- if (k->link_key)
- splay_tree_insert (&devicep->mem_map,
- (splay_tree_node) k->link_key);
- if (k->tgt->refcount > 1)
- k->tgt->refcount--;
- else
- gomp_unmap_tgt (k->tgt);
- }
+ gomp_remove_var (devicep, k);
break;
default:
Index: libgomp/oacc-mem.c
===================================================================
--- libgomp/oacc-mem.c (revision 248095)
+++ libgomp/oacc-mem.c (revision 248096)
@@ -440,6 +440,7 @@
tgt = gomp_map_vars (acc_dev, mapnum, &hostaddrs, &devaddrs, &sizes,
&kinds, true, GOMP_MAP_VARS_OPENACC);
+ tgt->list[0].key->refcount = REFCOUNT_INFINITY;
}
gomp_mutex_lock (&acc_dev->lock);
@@ -494,6 +495,9 @@
(void *) n->host_start, (int) host_size, (void *) h);
}
+ /* Mark for removal. */
+ n->refcount = 1;
+
t = n->tgt;
if (t->refcount == 2)
@@ -583,6 +587,11 @@
gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s);
}
+ if (n->refcount != REFCOUNT_INFINITY)
+ {
+ n->refcount++;
+ n->dynamic_refcount++;
+ }
gomp_mutex_unlock (&acc_dev->lock);
}
else if (!(f & FLAG_CREATE))
@@ -609,6 +618,8 @@
tgt = gomp_map_vars (acc_dev, mapnum, &hostaddrs, NULL, &s, &kinds, true,
GOMP_MAP_VARS_OPENACC);
+ /* Initialize dynamic refcount. */
+ tgt->list[0].key->dynamic_refcount = 1;
if (async > acc_async_sync)
acc_dev->openacc.async_set_async_func (acc_async_sync);
@@ -678,7 +689,8 @@
}
#endif
-#define FLAG_COPYOUT (1 << 0)
+#define FLAG_COPYOUT (1 << 0)
+#define FLAG_FINALIZE (1 << 1)
static void
delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
@@ -729,22 +741,58 @@
(void *) n->host_start, (int) host_size, (void *) h, (int) s);
}
- gomp_mutex_unlock (&acc_dev->lock);
+ if (n->refcount == REFCOUNT_INFINITY)
+ {
+ n->refcount = 0;
+ n->dynamic_refcount = 0;
+ }
+ if (n->refcount < n->dynamic_refcount)
+ {
+ gomp_mutex_unlock (&acc_dev->lock);
+ gomp_fatal ("Dynamic reference counting assert fail\n");
+ }
- if (async > acc_async_sync)
- acc_dev->openacc.async_set_async_func (async);
+ if (f & FLAG_FINALIZE)
+ {
+ n->refcount -= n->dynamic_refcount;
+ n->dynamic_refcount = 0;
+ }
+ else if (n->dynamic_refcount)
+ {
+ n->dynamic_refcount--;
+ n->refcount--;
+ }
- if (f & FLAG_COPYOUT)
- acc_dev->dev2host_func (acc_dev->target_id, h, d, s);
+ if (n->refcount == 0)
+ {
+ if (n->tgt->refcount == 2)
+ {
+ struct target_mem_desc *tp, *t;
+ for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL;
+ tp = t, t = t->prev)
+ if (n->tgt == t)
+ {
+ if (tp)
+ tp->prev = t->prev;
+ else
+ acc_dev->openacc.data_environ = t->prev;
+ break;
+ }
+ }
- acc_unmap_data (h);
+ if (f & FLAG_COPYOUT)
+ {
+ if (async > acc_async_sync)
+ acc_dev->openacc.async_set_async_func (async);
+ acc_dev->dev2host_func (acc_dev->target_id, h, d, s);
+ if (async > acc_async_sync)
+ acc_dev->openacc.async_set_async_func (acc_async_sync);
+ }
+ gomp_remove_var (acc_dev, n);
+ }
- if (async > acc_async_sync)
- acc_dev->openacc.async_set_async_func (acc_async_sync);
+ gomp_mutex_unlock (&acc_dev->lock);
- if (!acc_dev->free_func (acc_dev->target_id, d))
- gomp_fatal ("error in freeing device memory in %s", libfnname);
-
if (profiling_setup_p)
{
thr->prof_info = NULL;
@@ -765,6 +813,18 @@
}
void
+acc_delete_finalize (void *h , size_t s)
+{
+ delete_copyout (FLAG_FINALIZE, h, s, acc_async_sync, __FUNCTION__);
+}
+
+void
+acc_delete_finalize_async (void *h , size_t s, int async)
+{
+ delete_copyout (FLAG_FINALIZE, h, s, async, __FUNCTION__);
+}
+
+void
acc_copyout (void *h, size_t s)
{
delete_copyout (FLAG_COPYOUT, h, s, acc_async_sync, __FUNCTION__);
@@ -776,6 +836,19 @@
delete_copyout (FLAG_COPYOUT, h, s, async, __FUNCTION__);
}
+void
+acc_copyout_finalize (void *h, size_t s)
+{
+ delete_copyout (FLAG_COPYOUT | FLAG_FINALIZE, h, s, acc_async_sync,
+ __FUNCTION__);
+}
+
+void
+acc_copyout_finalize_async (void *h, size_t s, int async)
+{
+ delete_copyout (FLAG_COPYOUT | FLAG_FINALIZE, h, s, async, __FUNCTION__);
+}
+
static void
update_dev_host (int is_dev, void *h, size_t s, int async)
{
@@ -895,11 +968,37 @@
struct goacc_thread *thr = goacc_thread ();
struct gomp_device_descr *acc_dev = thr->dev;
+ if (acc_is_present (*hostaddrs, *sizes))
+ {
+ splay_tree_key n;
+ gomp_mutex_lock (&acc_dev->lock);
+ n = lookup_host (acc_dev, *hostaddrs, *sizes);
+ gomp_mutex_unlock (&acc_dev->lock);
+
+ tgt = n->tgt;
+ for (size_t i = 0; i < tgt->list_count; i++)
+ if (tgt->list[i].key == n)
+ {
+ for (size_t j = 0; j < mapnum; j++)
+ if (i + j < tgt->list_count && tgt->list[i + j].key)
+ {
+ tgt->list[i + j].key->refcount++;
+ tgt->list[i + j].key->dynamic_refcount++;
+ }
+ return;
+ }
+ /* Should not reach here. */
+ gomp_fatal ("Dynamic refcount incrementing failed for pointer/pset");
+ }
+
gomp_debug (0, " %s: prepare mappings\n", __FUNCTION__);
tgt = gomp_map_vars (acc_dev, mapnum, hostaddrs,
NULL, sizes, kinds, true, GOMP_MAP_VARS_OPENACC);
gomp_debug (0, " %s: mappings prepared\n", __FUNCTION__);
+ /* Initialize dynamic refcount. */
+ tgt->list[0].key->dynamic_refcount = 1;
+
gomp_mutex_lock (&acc_dev->lock);
tgt->prev = acc_dev->openacc.data_environ;
acc_dev->openacc.data_environ = tgt;
@@ -907,7 +1006,8 @@
}
void
-gomp_acc_remove_pointer (void *h, bool force_copyfrom, int async, int mapnum)
+gomp_acc_remove_pointer (void *h, size_t s, bool force_copyfrom, int async,
+ int finalize, int mapnum)
{
struct goacc_thread *thr = goacc_thread ();
struct gomp_device_descr *acc_dev = thr->dev;
@@ -915,6 +1015,9 @@
struct target_mem_desc *t;
int minrefs = (mapnum == 1) ? 2 : 3;
+ if (!acc_is_present (h, s))
+ return;
+
gomp_mutex_lock (&acc_dev->lock);
n = lookup_host (acc_dev, h, 1);
@@ -929,37 +1032,64 @@
t = n->tgt;
- struct target_mem_desc *tp;
+ if (n->refcount < n->dynamic_refcount)
+ {
+ gomp_mutex_unlock (&acc_dev->lock);
+ gomp_fatal ("Dynamic reference counting assert fail\n");
+ }
- if (t->refcount == minrefs)
+ if (finalize)
{
- /* This is the last reference, so pull the descriptor off the
- chain. This pevents gomp_unmap_vars via gomp_unmap_tgt from
- freeing the device memory. */
+ n->refcount -= n->dynamic_refcount;
+ n->dynamic_refcount = 0;
+ }
+ else if (n->dynamic_refcount)
+ {
+ n->dynamic_refcount--;
+ n->refcount--;
+ }
- for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL;
- tp = t, t = t->prev)
+ gomp_mutex_unlock (&acc_dev->lock);
+
+ if (n->refcount == 0)
+ {
+ if (t->refcount == minrefs)
{
- if (n->tgt == t)
+ /* This is the last reference, so pull the descriptor off the
+ chain. This prevents gomp_unmap_vars via gomp_unmap_tgt from
+ freeing the device memory. */
+ struct target_mem_desc *tp;
+ for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL;
+ tp = t, t = t->prev)
{
- if (tp)
- tp->prev = t->prev;
- else
- acc_dev->openacc.data_environ = t->prev;
- break;
+ if (n->tgt == t)
+ {
+ if (tp)
+ tp->prev = t->prev;
+ else
+ acc_dev->openacc.data_environ = t->prev;
+ break;
+ }
}
}
+
+ /* Set refcount to 1 to allow gomp_unmap_vars to unmap it. */
+ n->refcount = 1;
+ t->refcount = minrefs;
+ for (size_t i = 0; i < t->list_count; i++)
+ if (t->list[i].key == n)
+ {
+ t->list[i].copy_from = force_copyfrom ? 1 : 0;
+ break;
+ }
+ if (async > acc_async_sync)
+ acc_dev->openacc.async_set_async_func (async);
+ gomp_unmap_vars (t, true);
+ if (async > acc_async_sync)
+ acc_dev->openacc.async_set_async_func (acc_async_sync);
}
- t->list[0].copy_from = force_copyfrom ? 1 : 0;
-
gomp_mutex_unlock (&acc_dev->lock);
- /* If running synchronously, unmap immediately. */
- if (async < acc_async_noval)
- gomp_unmap_vars (t, true);
- else
- t->device_descr->openacc.register_async_cleanup_func (t, async);
-
gomp_debug (0, " %s: mappings restored\n", __FUNCTION__);
}
Index: libgomp/openacc.f90
===================================================================
--- libgomp/openacc.f90 (revision 248095)
+++ libgomp/openacc.f90 (revision 248096)
@@ -233,6 +233,24 @@
type (*), dimension (..), contiguous :: a
end subroutine
+ subroutine acc_copyout_finalize_32_h (a, len)
+ use iso_c_binding, only: c_int32_t
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int32_t) len
+ end subroutine
+
+ subroutine acc_copyout_finalize_64_h (a, len)
+ use iso_c_binding, only: c_int64_t
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int64_t) len
+ end subroutine
+
+ subroutine acc_copyout_finalize_array_h (a)
+ type (*), dimension (..), contiguous :: a
+ end subroutine
+
subroutine acc_delete_32_h (a, len)
use iso_c_binding, only: c_int32_t
!GCC$ ATTRIBUTES NO_ARG_CHECK :: a
@@ -251,6 +269,24 @@
type (*), dimension (..), contiguous :: a
end subroutine
+ subroutine acc_delete_finalize_32_h (a, len)
+ use iso_c_binding, only: c_int32_t
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int32_t) len
+ end subroutine
+
+ subroutine acc_delete_finalize_64_h (a, len)
+ use iso_c_binding, only: c_int64_t
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int64_t) len
+ end subroutine
+
+ subroutine acc_delete_finalize_array_h (a)
+ type (*), dimension (..), contiguous :: a
+ end subroutine
+
subroutine acc_update_device_32_h (a, len)
use iso_c_binding, only: c_int32_t
!GCC$ ATTRIBUTES NO_ARG_CHECK :: a
@@ -380,6 +416,30 @@
integer (acc_handle_kind) async
end subroutine
+ subroutine acc_copyout_finalize_async_32_h (a, len, async)
+ use iso_c_binding, only: c_int32_t
+ use openacc_kinds, only: acc_handle_kind
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int32_t) len
+ integer (acc_handle_kind) async
+ end subroutine
+
+ subroutine acc_copyout_finalize_async_64_h (a, len, async)
+ use iso_c_binding, only: c_int64_t
+ use openacc_kinds, only: acc_handle_kind
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int64_t) len
+ integer (acc_handle_kind) async
+ end subroutine
+
+ subroutine acc_copyout_finalize_async_array_h (a, async)
+ use openacc_kinds, only: acc_handle_kind
+ type (*), dimension (..), contiguous :: a
+ integer (acc_handle_kind) async
+ end subroutine
+
subroutine acc_delete_async_32_h (a, len, async)
use iso_c_binding, only: c_int32_t
use openacc_kinds, only: acc_handle_kind
@@ -404,6 +464,30 @@
integer (acc_handle_kind) async
end subroutine
+ subroutine acc_delete_finalize_async_32_h (a, len, async)
+ use iso_c_binding, only: c_int32_t
+ use openacc_kinds, only: acc_handle_kind
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int32_t) len
+ integer (acc_handle_kind) async
+ end subroutine
+
+ subroutine acc_delete_finalize_async_64_h (a, len, async)
+ use iso_c_binding, only: c_int64_t
+ use openacc_kinds, only: acc_handle_kind
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int64_t) len
+ integer (acc_handle_kind) async
+ end subroutine
+
+ subroutine acc_delete_finalize_async_array_h (a, async)
+ use openacc_kinds, only: acc_handle_kind
+ type (*), dimension (..), contiguous :: a
+ integer (acc_handle_kind) async
+ end subroutine
+
subroutine acc_update_device_async_32_h (a, len, async)
use iso_c_binding, only: c_int32_t
use openacc_kinds, only: acc_handle_kind
@@ -581,6 +665,14 @@
integer (c_size_t), value :: len
end subroutine
+ subroutine acc_copyout_finalize_l (a, len) &
+ bind (C, name = "acc_copyout_finalize")
+ use iso_c_binding, only: c_size_t
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_size_t), value :: len
+ end subroutine
+
subroutine acc_delete_l (a, len) &
bind (C, name = "acc_delete")
use iso_c_binding, only: c_size_t
@@ -589,6 +681,14 @@
integer (c_size_t), value :: len
end subroutine
+ subroutine acc_delete_finalize_l (a, len) &
+ bind (C, name = "acc_delete_finalize")
+ use iso_c_binding, only: c_size_t
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_size_t), value :: len
+ end subroutine
+
subroutine acc_update_device_l (a, len) &
bind (C, name = "acc_update_device")
use iso_c_binding, only: c_size_t
@@ -641,6 +741,15 @@
integer (c_int), value :: async
end subroutine
+ subroutine acc_copyout_finalize_async_l (a, len, async) &
+ bind (C, name = "acc_copyout_finalize_async")
+ use iso_c_binding, only: c_size_t, c_int
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_size_t), value :: len
+ integer (c_int), value :: async
+ end subroutine
+
subroutine acc_delete_async_l (a, len, async) &
bind (C, name = "acc_delete_async")
use iso_c_binding, only: c_size_t, c_int
@@ -650,6 +759,15 @@
integer (c_int), value :: async
end subroutine
+ subroutine acc_delete_finalize_async_l (a, len, async) &
+ bind (C, name = "acc_delete_finalize_async")
+ use iso_c_binding, only: c_size_t, c_int
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_size_t), value :: len
+ integer (c_int), value :: async
+ end subroutine
+
subroutine acc_update_device_async_l (a, len, async) &
bind (C, name = "acc_update_device_async")
use iso_c_binding, only: c_size_t, c_int
@@ -806,6 +924,12 @@
procedure :: acc_copyout_array_h
end interface
+ interface acc_copyout_finalize
+ procedure :: acc_copyout_finalize_32_h
+ procedure :: acc_copyout_finalize_64_h
+ procedure :: acc_copyout_finalize_array_h
+ end interface
+
interface acc_delete
procedure :: acc_delete_32_h
procedure :: acc_delete_64_h
@@ -812,6 +936,12 @@
procedure :: acc_delete_array_h
end interface
+ interface acc_delete_finalize
+ procedure :: acc_delete_finalize_32_h
+ procedure :: acc_delete_finalize_64_h
+ procedure :: acc_delete_finalize_array_h
+ end interface
+
interface acc_update_device
procedure :: acc_update_device_32_h
procedure :: acc_update_device_64_h
@@ -856,6 +986,12 @@
procedure :: acc_copyout_async_array_h
end interface
+ interface acc_copyout_finalize_async
+ procedure :: acc_copyout_finalize_async_32_h
+ procedure :: acc_copyout_finalize_async_64_h
+ procedure :: acc_copyout_finalize_async_array_h
+ end interface
+
interface acc_delete_async
procedure :: acc_delete_async_32_h
procedure :: acc_delete_async_64_h
@@ -862,6 +998,12 @@
procedure :: acc_delete_async_array_h
end interface
+ interface acc_delete_finalize_async
+ procedure :: acc_delete_finalize_async_32_h
+ procedure :: acc_delete_finalize_async_64_h
+ procedure :: acc_delete_finalize_async_array_h
+ end interface
+
interface acc_update_device_async
procedure :: acc_update_device_async_32_h
procedure :: acc_update_device_async_64_h
@@ -1104,6 +1246,30 @@
call acc_copyout_l (a, sizeof (a))
end subroutine
+subroutine acc_copyout_finalize_32_h (a, len)
+ use iso_c_binding, only: c_int32_t, c_size_t
+ use openacc_internal, only: acc_copyout_finalize_l
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int32_t) len
+ call acc_copyout_finalize_l (a, int (len, kind = c_size_t))
+end subroutine
+
+subroutine acc_copyout_finalize_64_h (a, len)
+ use iso_c_binding, only: c_int64_t, c_size_t
+ use openacc_internal, only: acc_copyout_finalize_l
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int64_t) len
+ call acc_copyout_finalize_l (a, int (len, kind = c_size_t))
+end subroutine
+
+subroutine acc_copyout_finalize_array_h (a)
+ use openacc_internal, only: acc_copyout_finalize_l
+ type (*), dimension (..), contiguous :: a
+ call acc_copyout_finalize_l (a, sizeof (a))
+end subroutine
+
subroutine acc_delete_32_h (a, len)
use iso_c_binding, only: c_int32_t, c_size_t
use openacc_internal, only: acc_delete_l
@@ -1128,6 +1294,30 @@
call acc_delete_l (a, sizeof (a))
end subroutine
+subroutine acc_delete_finalize_32_h (a, len)
+ use iso_c_binding, only: c_int32_t, c_size_t
+ use openacc_internal, only: acc_delete_finalize_l
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int32_t) len
+ call acc_delete_finalize_l (a, int (len, kind = c_size_t))
+end subroutine
+
+subroutine acc_delete_finalize_64_h (a, len)
+ use iso_c_binding, only: c_int64_t, c_size_t
+ use openacc_internal, only: acc_delete_finalize_l
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int64_t) len
+ call acc_delete_finalize_l (a, int (len, kind = c_size_t))
+end subroutine
+
+subroutine acc_delete_finalize_array_h (a)
+ use openacc_internal, only: acc_delete_finalize_l
+ type (*), dimension (..), contiguous :: a
+ call acc_delete_finalize_l (a, sizeof (a))
+end subroutine
+
subroutine acc_update_device_32_h (a, len)
use iso_c_binding, only: c_int32_t, c_size_t
use openacc_internal, only: acc_update_device_l
@@ -1304,6 +1494,37 @@
call acc_copyout_async_l (a, sizeof (a), int (async, kind = c_int))
end subroutine
+subroutine acc_copyout_finalize_async_32_h (a, len, async)
+ use iso_c_binding, only: c_int32_t, c_size_t, c_int
+ use openacc_internal, only: acc_copyout_finalize_async_l
+ use openacc_kinds, only: acc_handle_kind
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int32_t) len
+ integer (acc_handle_kind) async
+ call acc_copyout_finalize_async_l (a, int (len, kind = c_size_t), int (async, kind = c_int))
+end subroutine
+
+subroutine acc_copyout_finalize_async_64_h (a, len, async)
+ use iso_c_binding, only: c_int64_t, c_size_t, c_int
+ use openacc_internal, only: acc_copyout_finalize_async_l
+ use openacc_kinds, only: acc_handle_kind
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int64_t) len
+ integer (acc_handle_kind) async
+ call acc_copyout_finalize_async_l (a, int (len, kind = c_size_t), int (async, kind = c_int))
+end subroutine
+
+subroutine acc_copyout_finalize_async_array_h (a, async)
+ use iso_c_binding, only: c_int
+ use openacc_internal, only: acc_copyout_finalize_async_l
+ use openacc_kinds, only: acc_handle_kind
+ type (*), dimension (..), contiguous :: a
+ integer (acc_handle_kind) async
+ call acc_copyout_finalize_async_l (a, sizeof (a), int (async, kind = c_int))
+end subroutine
+
subroutine acc_delete_async_32_h (a, len, async)
use iso_c_binding, only: c_int32_t, c_size_t, c_int
use openacc_internal, only: acc_delete_async_l
@@ -1335,6 +1556,37 @@
call acc_delete_async_l (a, sizeof (a), int (async, kind = c_int))
end subroutine
+subroutine acc_delete_finalize_async_32_h (a, len, async)
+ use iso_c_binding, only: c_int32_t, c_size_t, c_int
+ use openacc_internal, only: acc_delete_finalize_async_l
+ use openacc_kinds, only: acc_handle_kind
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int32_t) len
+ integer (acc_handle_kind) async
+ call acc_delete_finalize_async_l (a, int (len, kind = c_size_t), int (async, kind = c_int))
+end subroutine
+
+subroutine acc_delete_finalize_async_64_h (a, len, async)
+ use iso_c_binding, only: c_int64_t, c_size_t, c_int
+ use openacc_internal, only: acc_delete_finalize_async_l
+ use openacc_kinds, only: acc_handle_kind
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int64_t) len
+ integer (acc_handle_kind) async
+ call acc_delete_finalize_async_l (a, int (len, kind = c_size_t), int (async, kind = c_int))
+end subroutine
+
+subroutine acc_delete_finalize_async_array_h (a, async)
+ use iso_c_binding, only: c_int
+ use openacc_internal, only: acc_delete_finalize_async_l
+ use openacc_kinds, only: acc_handle_kind
+ type (*), dimension (..), contiguous :: a
+ integer (acc_handle_kind) async
+ call acc_delete_finalize_async_l (a, sizeof (a), int (async, kind = c_int))
+end subroutine
+
subroutine acc_update_device_async_32_h (a, len, async)
use iso_c_binding, only: c_int32_t, c_size_t, c_int
use openacc_internal, only: acc_update_device_async_l
Index: libgomp/libgomp.h
===================================================================
--- libgomp/libgomp.h (revision 248095)
+++ libgomp/libgomp.h (revision 248096)
@@ -835,6 +835,8 @@
uintptr_t tgt_offset;
/* Reference count. */
uintptr_t refcount;
+ /* Dynamic reference count. */
+ uintptr_t dynamic_refcount;
/* Pointer to the original mapping of "omp declare target link" object. */
splay_tree_key link_key;
};
@@ -973,7 +975,7 @@
};
extern void gomp_acc_insert_pointer (size_t, void **, size_t *, void *);
-extern void gomp_acc_remove_pointer (void *, bool, int, int);
+extern void gomp_acc_remove_pointer (void *, size_t, bool, int, int, int);
extern void gomp_acc_declare_allocate (bool, size_t, void **, size_t *,
unsigned short *);
@@ -985,6 +987,7 @@
extern void gomp_init_device (struct gomp_device_descr *);
extern void gomp_unload_device (struct gomp_device_descr *);
extern bool gomp_offload_target_available_p (int);
+extern bool gomp_remove_var (struct gomp_device_descr *, splay_tree_key);
/* work.c */
Index: libgomp/openacc_lib.h
===================================================================
--- libgomp/openacc_lib.h (revision 248095)
+++ libgomp/openacc_lib.h (revision 248096)
@@ -303,6 +303,26 @@
end subroutine
end interface
+ interface acc_copyout_finalize
+ subroutine acc_copyout_finalize_32_h (a, len)
+ use iso_c_binding, only: c_int32_t
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int32_t) len
+ end subroutine
+
+ subroutine acc_copyout_finalize_64_h (a, len)
+ use iso_c_binding, only: c_int64_t
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int64_t) len
+ end subroutine
+
+ subroutine acc_copyout_finalize_array_h (a)
+ type (*), dimension (..), contiguous :: a
+ end subroutine
+ end interface
+
interface acc_delete
subroutine acc_delete_32_h (a, len)
use iso_c_binding, only: c_int32_t
@@ -323,6 +343,26 @@
end subroutine
end interface
+ interface acc_delete_finalize
+ subroutine acc_delete_finalize_32_h (a, len)
+ use iso_c_binding, only: c_int32_t
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int32_t) len
+ end subroutine
+
+ subroutine acc_delete_finalize_64_h (a, len)
+ use iso_c_binding, only: c_int64_t
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int64_t) len
+ end subroutine
+
+ subroutine acc_delete_finalize_array_h (a)
+ type (*), dimension (..), contiguous :: a
+ end subroutine
+ end interface
+
interface acc_update_device
subroutine acc_update_device_32_h (a, len)
use iso_c_binding, only: c_int32_t
@@ -472,6 +512,32 @@
end subroutine
end interface
+ interface acc_copyout_finalize_async
+ subroutine acc_copyout_finalize_async_32_h (a, len, async)
+ use iso_c_binding, only: c_int32_t
+ import acc_handle_kind
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int32_t) len
+ integer (acc_handle_kind) async
+ end subroutine
+
+ subroutine acc_copyout_finalize_async_64_h (a, len, async)
+ use iso_c_binding, only: c_int64_t
+ import acc_handle_kind
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int64_t) len
+ integer (acc_handle_kind) async
+ end subroutine
+
+ subroutine acc_copyout_finalize_async_array_h (a, async_)
+ import acc_handle_kind
+ type (*), dimension (..), contiguous :: a
+ integer (acc_handle_kind) async_
+ end subroutine
+ end interface
+
interface acc_delete_async
subroutine acc_delete_async_32_h (a, len, async)
use iso_c_binding, only: c_int32_t
@@ -498,6 +564,32 @@
end subroutine
end interface
+ interface acc_delete_finalize_async
+ subroutine acc_delete_finalize_async_32_h (a, len, async)
+ use iso_c_binding, only: c_int32_t
+ import acc_handle_kind
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int32_t) len
+ integer (acc_handle_kind) async
+ end subroutine
+
+ subroutine acc_delete_finalize_async_64_h (a, len, async)
+ use iso_c_binding, only: c_int64_t
+ import acc_handle_kind
+ !GCC$ ATTRIBUTES NO_ARG_CHECK :: a
+ type (*), dimension (*) :: a
+ integer (c_int64_t) len
+ integer (acc_handle_kind) async
+ end subroutine
+
+ subroutine acc_delete_finalize_async_array_h (a, async_)
+ import acc_handle_kind
+ type (*), dimension (..), contiguous :: a
+ integer (acc_handle_kind) async_
+ end subroutine
+ end interface
+
interface acc_update_device_async
subroutine acc_update_device_async_32_h (a, len, async)
use iso_c_binding, only: c_int32_t
Index: gcc/c-family/c-pragma.h
===================================================================
--- gcc/c-family/c-pragma.h (revision 248095)
+++ gcc/c-family/c-pragma.h (revision 248096)
@@ -157,6 +157,7 @@
PRAGMA_OACC_CLAUSE_DEVICEPTR,
PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT,
PRAGMA_OACC_CLAUSE_DEVICE_TYPE,
+ PRAGMA_OACC_CLAUSE_FINALIZE,
PRAGMA_OACC_CLAUSE_GANG,
PRAGMA_OACC_CLAUSE_HOST,
PRAGMA_OACC_CLAUSE_INDEPENDENT,
Index: gcc/c/c-parser.c
===================================================================
--- gcc/c/c-parser.c (revision 248095)
+++ gcc/c/c-parser.c (revision 248096)
@@ -10375,6 +10375,8 @@
case 'f':
if (!strcmp ("final", p))
result = PRAGMA_OMP_CLAUSE_FINAL;
+ else if (!strcmp ("finalize", p))
+ result = PRAGMA_OACC_CLAUSE_FINALIZE;
else if (!strcmp ("firstprivate", p))
result = PRAGMA_OMP_CLAUSE_FIRSTPRIVATE;
else if (!strcmp ("from", p))
@@ -11693,8 +11695,9 @@
return list;
}
-/* OpenACC:
+/* OpenACC 2.5:
auto
+ finalize
independent
nohost
seq */
@@ -13171,6 +13174,11 @@
c_name = "device_type";
seen_dtype = true;
break;
+ case PRAGMA_OACC_CLAUSE_FINALIZE:
+ clauses = c_parser_oacc_simple_clause (parser, here,
+ OMP_CLAUSE_FINALIZE, clauses);
+ c_name = "finalize";
+ break;
case PRAGMA_OACC_CLAUSE_FIRSTPRIVATE:
clauses = c_parser_omp_clause_firstprivate (parser, clauses);
c_name = "firstprivate";
@@ -13816,6 +13824,7 @@
| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DELETE) \
+ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FINALIZE) \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
static void
Index: gcc/c/c-typeck.c
===================================================================
--- gcc/c/c-typeck.c (revision 248095)
+++ gcc/c/c-typeck.c (revision 248096)
@@ -13397,6 +13397,7 @@
case OMP_CLAUSE_NOHOST:
case OMP_CLAUSE_TILE:
case OMP_CLAUSE_IF_PRESENT:
+ case OMP_CLAUSE_FINALIZE:
pc = &OMP_CLAUSE_CHAIN (c);
continue;
Index: gcc/tree.c
===================================================================
--- gcc/tree.c (revision 248095)
+++ gcc/tree.c (revision 248096)
@@ -331,7 +331,8 @@
3, /* OMP_CLAUSE_TILE */
2, /* OMP_CLAUSE__GRIDDIM_ */
0, /* OMP_CLAUSE_IF_PRESENT */
- 2 /* OMP_CLAUSE_DEVICE_TYPE */
+ 2, /* OMP_CLAUSE_DEVICE_TYPE */
+ 0 /* OMP_CLAUSE_FINALIZE */
};
const char * const omp_clause_code_name[] =
@@ -406,7 +407,8 @@
"tile",
"_griddim_",
"if_present",
- "device_type"
+ "device_type",
+ "finalize"
};
@@ -11723,6 +11725,7 @@
case OMP_CLAUSE_NOHOST:
case OMP_CLAUSE_TILE:
case OMP_CLAUSE_IF_PRESENT:
+ case OMP_CLAUSE_FINALIZE:
WALK_SUBTREE_TAIL (OMP_CLAUSE_CHAIN (*tp));
case OMP_CLAUSE_DEVICE_TYPE:
Index: gcc/omp-low.c
===================================================================
--- gcc/omp-low.c (revision 248095)
+++ gcc/omp-low.c (revision 248096)
@@ -2431,6 +2431,7 @@
case OMP_CLAUSE_TILE:
case OMP_CLAUSE_IF_PRESENT:
case OMP_CLAUSE_DEVICE_TYPE:
+ case OMP_CLAUSE_FINALIZE:
break;
case OMP_CLAUSE_ALIGNED:
@@ -2606,6 +2607,7 @@
case OMP_CLAUSE__GRIDDIM_:
case OMP_CLAUSE_IF_PRESENT:
case OMP_CLAUSE_DEVICE_TYPE:
+ case OMP_CLAUSE_FINALIZE:
break;
case OMP_CLAUSE_BIND:
@@ -14216,6 +14218,13 @@
if (t_async)
args.safe_push (t_async);
+ if (start_ix == BUILT_IN_GOACC_ENTER_EXIT_DATA)
+ {
+ c = find_omp_clause (clauses, OMP_CLAUSE_FINALIZE);
+ tree t_finalize = c ? integer_one_node : integer_zero_node;
+ args.safe_push (t_finalize);
+ }
+
/* Save the argument index, and ... */
unsigned t_wait_idx = args.length ();
unsigned num_waits = 0;
Index: gcc/cp/semantics.c
===================================================================
--- gcc/cp/semantics.c (revision 248095)
+++ gcc/cp/semantics.c (revision 248096)
@@ -7107,6 +7107,7 @@
case OMP_CLAUSE_BIND:
case OMP_CLAUSE_NOHOST:
case OMP_CLAUSE_IF_PRESENT:
+ case OMP_CLAUSE_FINALIZE:
break;
case OMP_CLAUSE_TILE:
Index: gcc/cp/parser.c
===================================================================
--- gcc/cp/parser.c (revision 248095)
+++ gcc/cp/parser.c (revision 248096)
@@ -29815,6 +29815,8 @@
case 'f':
if (!strcmp ("final", p))
result = PRAGMA_OMP_CLAUSE_FINAL;
+ else if (!strcmp ("finalize", p))
+ result = PRAGMA_OACC_CLAUSE_FINALIZE;
else if (!strcmp ("firstprivate", p))
result = PRAGMA_OMP_CLAUSE_FIRSTPRIVATE;
else if (!strcmp ("from", p))
@@ -30275,8 +30277,9 @@
return list;
}
-/* OpenACC 2.0:
+/* OpenACC 2.5:
auto
+ finalize
independent
nohost
seq */
@@ -32390,6 +32393,11 @@
c_name = "device_type";
seen_dtype = true;
break;
+ case PRAGMA_OACC_CLAUSE_FINALIZE:
+ clauses = cp_parser_oacc_simple_clause (parser, OMP_CLAUSE_FINALIZE,
+ clauses, here);
+ c_name = "finalize";
+ break;
case PRAGMA_OACC_CLAUSE_FIRSTPRIVATE:
clauses = cp_parser_omp_var_list (parser, OMP_CLAUSE_FIRSTPRIVATE,
clauses);
@@ -35582,6 +35590,7 @@
| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DELETE) \
+ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FINALIZE) \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
static tree
Index: gcc/fortran/openmp.c
===================================================================
--- gcc/fortran/openmp.c (revision 248095)
+++ gcc/fortran/openmp.c (revision 248096)
@@ -835,6 +835,7 @@
OMP_CLAUSE_NOHOST,
OMP_CLAUSE_IF_PRESENT,
OMP_CLAUSE_DEVICE_TYPE,
+ OMP_CLAUSE_FINALIZE,
/* This must come last. */
OMP_MASK2_LAST
};
@@ -1304,6 +1305,14 @@
&& c->final_expr == NULL
&& gfc_match ("final ( %e )", &c->final_expr) == MATCH_YES)
continue;
+ if ((mask & OMP_CLAUSE_FINALIZE)
+ && !c->finalize
+ && gfc_match ("finalize") == MATCH_YES)
+ {
+ c->finalize = true;
+ needs_space = true;
+ continue;
+ }
if ((mask & OMP_CLAUSE_FIRSTPRIVATE)
&& gfc_match_omp_variable_list ("firstprivate (",
&c->lists[OMP_LIST_FIRSTPRIVATE],
@@ -2081,7 +2090,7 @@
| OMP_CLAUSE_COPYIN | OMP_CLAUSE_CREATE)
#define OACC_EXIT_DATA_CLAUSES \
(omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT \
- | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE)
+ | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE | OMP_CLAUSE_FINALIZE)
#define OACC_WAIT_CLAUSES \
omp_mask (OMP_CLAUSE_ASYNC)
#define OACC_ROUTINE_CLAUSES \
Index: gcc/fortran/trans-openmp.c
===================================================================
--- gcc/fortran/trans-openmp.c (revision 248095)
+++ gcc/fortran/trans-openmp.c (revision 248096)
@@ -2936,6 +2936,11 @@
c = build_omp_clause (where.lb->location, OMP_CLAUSE_IF_PRESENT);
omp_clauses = gfc_trans_add_clause (c, omp_clauses);
}
+ if (clauses->finalize)
+ {
+ c = build_omp_clause (where.lb->location, OMP_CLAUSE_FINALIZE);
+ omp_clauses = gfc_trans_add_clause (c, omp_clauses);
+ }
if (clauses->independent)
{
c = build_omp_clause (where.lb->location, OMP_CLAUSE_INDEPENDENT);
Index: gcc/fortran/gfortran.h
===================================================================
--- gcc/fortran/gfortran.h (revision 248095)
+++ gcc/fortran/gfortran.h (revision 248096)
@@ -1318,7 +1318,7 @@
gfc_expr_list *tile_list;
unsigned async:1, gang:1, worker:1, vector:1, seq:1, independent:1;
unsigned wait:1, par_auto:1, gang_static:1, nohost:1, acc_collapse:1, bind:1;
- unsigned if_present:1;
+ unsigned if_present:1, finalize:1;
locus loc;
char bind_name[GFC_MAX_SYMBOL_LEN+1];
}
Index: gcc/gimplify.c
===================================================================
--- gcc/gimplify.c (revision 248095)
+++ gcc/gimplify.c (revision 248096)
@@ -7669,6 +7669,7 @@
case OMP_CLAUSE_SIMD:
case OMP_CLAUSE_IF_PRESENT:
case OMP_CLAUSE_DEVICE_TYPE:
+ case OMP_CLAUSE_FINALIZE:
break;
case OMP_CLAUSE_DEFAULTMAP:
@@ -8533,6 +8534,7 @@
case OMP_CLAUSE_TILE:
case OMP_CLAUSE_IF_PRESENT:
case OMP_CLAUSE_DEVICE_TYPE:
+ case OMP_CLAUSE_FINALIZE:
break;
case OMP_CLAUSE_BIND:
Index: gcc/tree-core.h
===================================================================
--- gcc/tree-core.h (revision 248095)
+++ gcc/tree-core.h (revision 248096)
@@ -473,7 +473,10 @@
OMP_CLAUSE_IF_PRESENT,
/* OpenACC clause: device_type ( device-type-list). */
- OMP_CLAUSE_DEVICE_TYPE
+ OMP_CLAUSE_DEVICE_TYPE,
+
+ /* OpenACC clause: finalize. */
+ OMP_CLAUSE_FINALIZE
};
#undef DEFTREESTRUCT
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [gomp4] Implement OpenACC 2.5 reference counting, and finalize clause
2017-05-16 13:11 [gomp4] Implement OpenACC 2.5 reference counting, and finalize clause Chung-Lin Tang
@ 2017-05-17 11:51 ` Thomas Schwinge
2017-05-17 12:06 ` Thomas Schwinge
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Thomas Schwinge @ 2017-05-17 11:51 UTC (permalink / raw)
To: Chung-Lin Tang, gcc-patches
Hi!
On Tue, 16 May 2017 20:55:46 +0800, Chung-Lin Tang <chunglin_tang@mentor.com> wrote:
> finalize clause of the exit data directive
The OpenACC front end code currently maps the OpenACC delete clause to
"OMP_CLAUSE_DELETE" -- however, without a finalize clause, this clause
actually has "OMP_CLAUSE_RELEASE" semantics. Committed to
gomp-4_0-branch in r248147:
commit 09ce4545696d19c65a552098187184f179096b45
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Wed May 17 11:46:53 2017 +0000
Use "GOMP_MAP_RELEASE" for OpenACC delete clause without finalize clause
gcc/c/
* c-parser.c (c_parser_oacc_data_clause)
<PRAGMA_OACC_CLAUSE_DELETE>: Use "GOMP_MAP_RELEASE".
gcc/cp/
* parser.c (cp_parser_oacc_data_clause)
<PRAGMA_OACC_CLAUSE_DELETE>: Use "GOMP_MAP_RELEASE".
gcc/fortran/
* openmp.c (gfc_match_omp_clauses) <OMP_CLAUSE_DELETE>: Use
"OMP_MAP_RELEASE".
gcc/
* gimplify.c (gimplify_oacc_declare_1) <GOMP_MAP_ALLOC>: Use
"GOMP_MAP_RELEASE".
libgomp/
* oacc-parallel.c (GOACC_enter_exit_data, GOACC_declare): Handle
"GOMP_MAP_RELEASE".
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@248147 138bc75d-0d04-0410-961f-82ee72b054a4
---
gcc/ChangeLog.gomp | 5 +++++
gcc/c/ChangeLog.gomp | 5 +++++
gcc/c/c-parser.c | 2 +-
gcc/cp/ChangeLog.gomp | 5 +++++
gcc/cp/parser.c | 2 +-
gcc/fortran/ChangeLog.gomp | 3 +++
gcc/fortran/openmp.c | 2 +-
gcc/gimplify.c | 2 +-
libgomp/ChangeLog.gomp | 5 +++++
libgomp/oacc-parallel.c | 5 ++++-
10 files changed, 31 insertions(+), 5 deletions(-)
diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 77c7899..e858e78 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2017-05-17 Thomas Schwinge <thomas@codesourcery.com>
+
+ * gimplify.c (gimplify_oacc_declare_1) <GOMP_MAP_ALLOC>: Use
+ "GOMP_MAP_RELEASE".
+
2017-05-16 Cesar Philippidis <cesar@codesourcery.com>
* doc/include/texinfo.tex: Backport @title linewrap changes from
diff --git gcc/c/ChangeLog.gomp gcc/c/ChangeLog.gomp
index eca8379..c70003f 100644
--- gcc/c/ChangeLog.gomp
+++ gcc/c/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2017-05-17 Thomas Schwinge <thomas@codesourcery.com>
+
+ * c-parser.c (c_parser_oacc_data_clause)
+ <PRAGMA_OACC_CLAUSE_DELETE>: Use "GOMP_MAP_RELEASE".
+
2017-05-16 Chung-Lin Tang <cltang@codesourcery.com>
* c-parser.c (c_parser_omp_clause_name): Handle 'finalize' clause.
diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index 17045ef..34f8b17 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -10805,7 +10805,7 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
kind = GOMP_MAP_ALLOC;
break;
case PRAGMA_OACC_CLAUSE_DELETE:
- kind = GOMP_MAP_DELETE;
+ kind = GOMP_MAP_RELEASE;
break;
case PRAGMA_OACC_CLAUSE_DEVICE:
kind = GOMP_MAP_FORCE_TO;
diff --git gcc/cp/ChangeLog.gomp gcc/cp/ChangeLog.gomp
index 3fa64ee..9a68194 100644
--- gcc/cp/ChangeLog.gomp
+++ gcc/cp/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2017-05-17 Thomas Schwinge <thomas@codesourcery.com>
+
+ * parser.c (cp_parser_oacc_data_clause)
+ <PRAGMA_OACC_CLAUSE_DELETE>: Use "GOMP_MAP_RELEASE".
+
2017-05-16 Chung-Lin Tang <cltang@codesourcery.com>
* parser.c (cp_parser_omp_clause_name): Handle 'finalize' clause.
diff --git gcc/cp/parser.c gcc/cp/parser.c
index c0fe65d..cbb11d0 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -30222,7 +30222,7 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
kind = GOMP_MAP_ALLOC;
break;
case PRAGMA_OACC_CLAUSE_DELETE:
- kind = GOMP_MAP_DELETE;
+ kind = GOMP_MAP_RELEASE;
break;
case PRAGMA_OACC_CLAUSE_DEVICE:
kind = GOMP_MAP_FORCE_TO;
diff --git gcc/fortran/ChangeLog.gomp gcc/fortran/ChangeLog.gomp
index 03bc82e..b3e4f8d 100644
--- gcc/fortran/ChangeLog.gomp
+++ gcc/fortran/ChangeLog.gomp
@@ -1,5 +1,8 @@
2017-05-17 Thomas Schwinge <thomas@codesourcery.com>
+ * openmp.c (gfc_match_omp_clauses) <OMP_CLAUSE_DELETE>: Use
+ "OMP_MAP_RELEASE".
+
* openmp.c (gfc_match_omp_clauses): Handle "OMP_CLAUSE_DELETE"
just once.
diff --git gcc/fortran/openmp.c gcc/fortran/openmp.c
index 0992f0a..1d191d2 100644
--- gcc/fortran/openmp.c
+++ gcc/fortran/openmp.c
@@ -1149,7 +1149,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, omp_mask mask,
if ((mask & OMP_CLAUSE_DELETE)
&& gfc_match ("delete ( ") == MATCH_YES
&& gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
- OMP_MAP_DELETE, true,
+ OMP_MAP_RELEASE, true,
allow_derived))
continue;
if ((mask & OMP_CLAUSE_DEPEND)
diff --git gcc/gimplify.c gcc/gimplify.c
index e6cdadc..7812471 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -8593,7 +8593,7 @@ gimplify_oacc_declare_1 (tree clause)
switch (kind)
{
case GOMP_MAP_ALLOC:
- new_op = GOMP_MAP_DELETE;
+ new_op = GOMP_MAP_RELEASE;
ret = true;
break;
diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index 856bdf2..61d0358 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2017-05-17 Thomas Schwinge <thomas@codesourcery.com>
+
+ * oacc-parallel.c (GOACC_enter_exit_data, GOACC_declare): Handle
+ "GOMP_MAP_RELEASE".
+
2017-05-16 Cesar Philippidis <cesar@codesourcery.com>
* libgomp.texi: Update OpenACC references to version 2.5. Add entries
diff --git libgomp/oacc-parallel.c libgomp/oacc-parallel.c
index a042aac..5b8435b 100644
--- libgomp/oacc-parallel.c
+++ libgomp/oacc-parallel.c
@@ -628,7 +628,8 @@ GOACC_enter_exit_data (int device, size_t mapnum,
break;
}
- if (kind == GOMP_MAP_DELETE
+ if (kind == GOMP_MAP_RELEASE
+ || kind == GOMP_MAP_DELETE
|| kind == GOMP_MAP_FROM
|| kind == GOMP_MAP_FORCE_FROM
|| kind == GOMP_MAP_DECLARE_DEALLOCATE)
@@ -786,6 +787,7 @@ GOACC_enter_exit_data (int device, size_t mapnum,
{
switch (kind)
{
+ case GOMP_MAP_RELEASE:
case GOMP_MAP_DELETE:
if (acc_is_present (hostaddrs[i], sizes[i]))
{
@@ -1097,6 +1099,7 @@ GOACC_declare (int device, size_t mapnum,
case GOMP_MAP_FORCE_FROM:
case GOMP_MAP_FORCE_TO:
case GOMP_MAP_POINTER:
+ case GOMP_MAP_RELEASE:
case GOMP_MAP_DELETE:
GOACC_enter_exit_data (device, 1, &hostaddrs[i], &sizes[i],
&kinds[i], 0, 0, 0);
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [gomp4] Implement OpenACC 2.5 reference counting, and finalize clause
2017-05-16 13:11 [gomp4] Implement OpenACC 2.5 reference counting, and finalize clause Chung-Lin Tang
2017-05-17 11:51 ` Thomas Schwinge
@ 2017-05-17 12:06 ` Thomas Schwinge
2017-05-17 12:14 ` Thomas Schwinge
2017-05-17 12:16 ` Thomas Schwinge
3 siblings, 0 replies; 5+ messages in thread
From: Thomas Schwinge @ 2017-05-17 12:06 UTC (permalink / raw)
To: Chung-Lin Tang, gcc-patches
Hi!
On Tue, 16 May 2017 20:55:46 +0800, Chung-Lin Tang <chunglin_tang@mentor.com> wrote:
> finalize clause of the exit data directive
This would run into ICEs in the C++ front end (template handling) as well
as C and Fortran front ends (nested function handling), and didn't
pretty-print the "finalize" clause. Also test cases. Committed to
gomp-4_0-branch in r248148:
commit 2d734ec8526f73e69c7bfa9b60ec5e9c5a9e4f13
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Wed May 17 11:52:15 2017 +0000
Complete compiler-side handling of the OpenACC finalize clause
gcc/cp/
* pt.c (tsubst_omp_clauses): Handle "OMP_CLAUSE_FINALIZE".
gcc/
* tree-nested.c (convert_nonlocal_omp_clauses)
(convert_local_omp_clauses): Handle "OMP_CLAUSE_FINALIZE".
* tree-pretty-print.c (dump_omp_clause): Handle
"OMP_CLAUSE_FINALIZE".
gcc/testsuite/
* c-c++-common/goacc/data-2.c: Update.
* g++.dg/goacc/template.C: Likewise.
* gcc.dg/goacc/nested-function-1.c: Likewise.
* gfortran.dg/goacc/enter-exit-data.f95: Likewise.
* gfortran.dg/goacc/nested-function-1.f90: Likewise.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@248148 138bc75d-0d04-0410-961f-82ee72b054a4
---
gcc/ChangeLog.gomp | 5 +++++
gcc/cp/ChangeLog.gomp | 2 ++
gcc/cp/pt.c | 1 +
gcc/testsuite/ChangeLog.gomp | 8 ++++++++
gcc/testsuite/c-c++-common/goacc/data-2.c | 3 +++
gcc/testsuite/g++.dg/goacc/template.C | 9 +++++++++
gcc/testsuite/gcc.dg/goacc/nested-function-1.c | 4 ++++
gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95 | 3 +++
gcc/testsuite/gfortran.dg/goacc/nested-function-1.f90 | 4 ++++
gcc/tree-nested.c | 2 ++
gcc/tree-pretty-print.c | 3 +++
11 files changed, 44 insertions(+)
diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index e858e78..d89897d 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,10 @@
2017-05-17 Thomas Schwinge <thomas@codesourcery.com>
+ * tree-nested.c (convert_nonlocal_omp_clauses)
+ (convert_local_omp_clauses): Handle "OMP_CLAUSE_FINALIZE".
+ * tree-pretty-print.c (dump_omp_clause): Handle
+ "OMP_CLAUSE_FINALIZE".
+
* gimplify.c (gimplify_oacc_declare_1) <GOMP_MAP_ALLOC>: Use
"GOMP_MAP_RELEASE".
diff --git gcc/cp/ChangeLog.gomp gcc/cp/ChangeLog.gomp
index 9a68194..b0c3dbf 100644
--- gcc/cp/ChangeLog.gomp
+++ gcc/cp/ChangeLog.gomp
@@ -1,5 +1,7 @@
2017-05-17 Thomas Schwinge <thomas@codesourcery.com>
+ * pt.c (tsubst_omp_clauses): Handle "OMP_CLAUSE_FINALIZE".
+
* parser.c (cp_parser_oacc_data_clause)
<PRAGMA_OACC_CLAUSE_DELETE>: Use "GOMP_MAP_RELEASE".
diff --git gcc/cp/pt.c gcc/cp/pt.c
index 84f64d8..abe8d36 100644
--- gcc/cp/pt.c
+++ gcc/cp/pt.c
@@ -14751,6 +14751,7 @@ tsubst_omp_clauses (tree clauses, enum c_omp_region_type ort,
case OMP_CLAUSE_AUTO:
case OMP_CLAUSE_SEQ:
case OMP_CLAUSE_IF_PRESENT:
+ case OMP_CLAUSE_FINALIZE:
case OMP_CLAUSE_DEVICE_TYPE:
break;
case OMP_CLAUSE_BIND:
diff --git gcc/testsuite/ChangeLog.gomp gcc/testsuite/ChangeLog.gomp
index 34f0a06..960ad15 100644
--- gcc/testsuite/ChangeLog.gomp
+++ gcc/testsuite/ChangeLog.gomp
@@ -1,3 +1,11 @@
+2017-05-17 Thomas Schwinge <thomas@codesourcery.com>
+
+ * c-c++-common/goacc/data-2.c: Update.
+ * g++.dg/goacc/template.C: Likewise.
+ * gcc.dg/goacc/nested-function-1.c: Likewise.
+ * gfortran.dg/goacc/enter-exit-data.f95: Likewise.
+ * gfortran.dg/goacc/nested-function-1.f90: Likewise.
+
2017-05-15 Thomas Schwinge <thomas@codesourcery.com>
* c-c++-common/cpp/openacc-define-3.c: Update.
diff --git gcc/testsuite/c-c++-common/goacc/data-2.c gcc/testsuite/c-c++-common/goacc/data-2.c
index 1043bf8a..8c5d42a 100644
--- gcc/testsuite/c-c++-common/goacc/data-2.c
+++ gcc/testsuite/c-c++-common/goacc/data-2.c
@@ -10,6 +10,9 @@ foo (void)
#pragma acc exit data delete (a) if (0)
#pragma acc exit data copyout (b) if (a)
#pragma acc exit data delete (b)
+#pragma acc exit data delete (a) if (!0) finalize
+#pragma acc exit data copyout (b) finalize if (!a)
+#pragma acc exit data finalize delete (b)
#pragma acc enter /* { dg-error "expected 'data' after" } */
#pragma acc exit /* { dg-error "expected 'data' after" } */
#pragma acc enter data /* { dg-error "has no data movement clause" } */
diff --git gcc/testsuite/g++.dg/goacc/template.C gcc/testsuite/g++.dg/goacc/template.C
index f4d255c..d1acece 100644
--- gcc/testsuite/g++.dg/goacc/template.C
+++ gcc/testsuite/g++.dg/goacc/template.C
@@ -86,6 +86,8 @@ oacc_parallel_copy (T a)
#pragma acc update self (b)
#pragma acc update device (b)
#pragma acc exit data delete (b)
+#pragma acc exit data finalize copyout (b)
+#pragma acc exit data delete (b) finalize
return b;
}
@@ -133,6 +135,13 @@ oacc_kernels_copy (T a)
b = a;
}
+#pragma acc update host (b)
+#pragma acc update self (b)
+#pragma acc update device (b)
+#pragma acc exit data delete (b)
+#pragma acc exit data finalize copyout (b)
+#pragma acc exit data delete (b) finalize
+
return b;
}
diff --git gcc/testsuite/gcc.dg/goacc/nested-function-1.c gcc/testsuite/gcc.dg/goacc/nested-function-1.c
index 5fc2e46..6b76112 100644
--- gcc/testsuite/gcc.dg/goacc/nested-function-1.c
+++ gcc/testsuite/gcc.dg/goacc/nested-function-1.c
@@ -56,6 +56,8 @@ int main ()
for (local_j = 0; local_j < N; ++local_j)
;
}
+
+#pragma acc exit data copyout(local_a) delete(local_i) finalize
}
void nonlocal ()
@@ -95,6 +97,8 @@ int main ()
for (nonlocal_j = 0; nonlocal_j < N; ++nonlocal_j)
;
}
+
+#pragma acc exit data copyout(nonlocal_a) delete(nonlocal_i) finalize
}
local ();
diff --git gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95 gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
index 8f1715e..805459c 100644
--- gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
+++ gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
@@ -84,5 +84,8 @@ contains
!$acc exit data delete (tip) ! { dg-error "POINTER" }
!$acc exit data delete (tia) ! { dg-error "ALLOCATABLE" }
!$acc exit data copyout (i) delete (i) ! { dg-error "multiple clauses" }
+ !$acc exit data finalize
+ !$acc exit data finalize copyout (i)
+ !$acc exit data finalize delete (i)
end subroutine foo
end module test
diff --git gcc/testsuite/gfortran.dg/goacc/nested-function-1.f90 gcc/testsuite/gfortran.dg/goacc/nested-function-1.f90
index bbb53c3..005193f 100644
--- gcc/testsuite/gfortran.dg/goacc/nested-function-1.f90
+++ gcc/testsuite/gfortran.dg/goacc/nested-function-1.f90
@@ -56,6 +56,8 @@ contains
enddo
enddo
!$acc end kernels loop
+
+ !$acc exit data copyout(local_a) delete(local_i) finalize
end subroutine local
subroutine nonlocal ()
@@ -93,5 +95,7 @@ contains
enddo
enddo
!$acc end kernels loop
+
+ !$acc exit data copyout(nonlocal_a) delete(nonlocal_i) finalize
end subroutine nonlocal
end program main
diff --git gcc/tree-nested.c gcc/tree-nested.c
index 3ddfd65..d6635ab 100644
--- gcc/tree-nested.c
+++ gcc/tree-nested.c
@@ -1203,6 +1203,7 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
case OMP_CLAUSE_INDEPENDENT:
case OMP_CLAUSE_AUTO:
case OMP_CLAUSE_IF_PRESENT:
+ case OMP_CLAUSE_FINALIZE:
break;
case OMP_CLAUSE_DEVICE_TYPE:
@@ -1902,6 +1903,7 @@ convert_local_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
case OMP_CLAUSE_INDEPENDENT:
case OMP_CLAUSE_AUTO:
case OMP_CLAUSE_IF_PRESENT:
+ case OMP_CLAUSE_FINALIZE:
break;
case OMP_CLAUSE_DEVICE_TYPE:
diff --git gcc/tree-pretty-print.c gcc/tree-pretty-print.c
index a208e8f..a8a0073 100644
--- gcc/tree-pretty-print.c
+++ gcc/tree-pretty-print.c
@@ -1093,6 +1093,9 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, int flags)
case OMP_CLAUSE_IF_PRESENT:
pp_string (pp, "if_present");
break;
+ case OMP_CLAUSE_FINALIZE:
+ pp_string (pp, "finalize");
+ break;
default:
pp_string (pp, "unknown");
Grüße
Thomas
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [gomp4] Implement OpenACC 2.5 reference counting, and finalize clause
2017-05-16 13:11 [gomp4] Implement OpenACC 2.5 reference counting, and finalize clause Chung-Lin Tang
2017-05-17 11:51 ` Thomas Schwinge
2017-05-17 12:06 ` Thomas Schwinge
@ 2017-05-17 12:14 ` Thomas Schwinge
2017-05-17 12:16 ` Thomas Schwinge
3 siblings, 0 replies; 5+ messages in thread
From: Thomas Schwinge @ 2017-05-17 12:14 UTC (permalink / raw)
To: Chung-Lin Tang, gcc-patches
Hi!
On Tue, 16 May 2017 20:55:46 +0800, Chung-Lin Tang <chunglin_tang@mentor.com> wrote:
> finalize clause of the exit data directive
Thanks!
> --- libgomp/oacc-parallel.c (revision 248095)
> +++ libgomp/oacc-parallel.c (revision 248096)
> void
> GOACC_enter_exit_data (int device, size_t mapnum,
> void **hostaddrs, size_t *sizes, unsigned short *kinds,
> - int async, int num_waits, ...)
> + int async, int finalize, int num_waits, ...)
> --- gcc/omp-low.c (revision 248095)
> +++ gcc/omp-low.c (revision 248096)
> @@ -14216,6 +14218,13 @@
> if (t_async)
> args.safe_push (t_async);
>
> + if (start_ix == BUILT_IN_GOACC_ENTER_EXIT_DATA)
> + {
> + c = find_omp_clause (clauses, OMP_CLAUSE_FINALIZE);
> + tree t_finalize = c ? integer_one_node : integer_zero_node;
> + args.safe_push (t_finalize);
> + }
> +
> /* Save the argument index, and ... */
> unsigned t_wait_idx = args.length ();
> unsigned num_waits = 0;
This breaks the ABI. (Also this didn't update
gcc/omp-builtins.def:BUILT_IN_GOACC_ENTER_EXIT_DATA. If I remember
correctly, I noted before that we really should add some consistency
checking for definition vs. usage of builtin functions.)
So I changed that to do similar to what Cesar recently had done for the
OpenACC update construct's if_present clause, and also added test cases.
Committed to gomp-4_0-branch in r248149:
commit 98cb728fdc1d848b3beadae5a81b663a30915925
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Wed May 17 12:03:00 2017 +0000
Revert GOACC_enter_exit_data ABI change
Instead, use GOMP_MAP_DELETE/GOMP_MAP_FORCE_FROM to denote that "finalize"
semantics apply to all mappings of this OpenACC directive.
gcc/
* omp-low.c (expand_omp_target): Revert GOACC_enter_exit_data ABI
change.
* gimplify.c (gimplify_omp_target_update): Handle
OMP_CLAUSE_FINALIZE for OACC_EXIT_DATA.
gcc/testsuite/
* c-c++-common/goacc/finalize-1.c: New file.
* gfortran.dg/goacc/finalize-1.f: Likewise.
libgomp/
* oacc-parallel.c (GOACC_enter_exit_data): Locally compute
"finalize", and remove the formal parameter. Adjust all users.
(GOACC_declare): Don't replace GOMP_MAP_FROM with
GOMP_MAP_FORCE_FROM.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@248149 138bc75d-0d04-0410-961f-82ee72b054a4
---
gcc/ChangeLog.gomp | 5 +++++
gcc/gimplify.c | 27 ++++++++++++++++++++++++++
gcc/omp-low.c | 7 -------
gcc/testsuite/ChangeLog.gomp | 3 +++
gcc/testsuite/c-c++-common/goacc/finalize-1.c | 28 +++++++++++++++++++++++++++
gcc/testsuite/gfortran.dg/goacc/finalize-1.f | 27 ++++++++++++++++++++++++++
libgomp/ChangeLog.gomp | 5 +++++
libgomp/libgomp_g.h | 2 +-
libgomp/oacc-parallel.c | 22 +++++++++++++++------
9 files changed, 112 insertions(+), 14 deletions(-)
diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index d89897d..084ac9b 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,10 @@
2017-05-17 Thomas Schwinge <thomas@codesourcery.com>
+ * omp-low.c (expand_omp_target): Revert GOACC_enter_exit_data ABI
+ change.
+ * gimplify.c (gimplify_omp_target_update): Handle
+ OMP_CLAUSE_FINALIZE for OACC_EXIT_DATA.
+
* tree-nested.c (convert_nonlocal_omp_clauses)
(convert_local_omp_clauses): Handle "OMP_CLAUSE_FINALIZE".
* tree-pretty-print.c (dump_omp_clause): Handle
diff --git gcc/gimplify.c gcc/gimplify.c
index 7812471..54c5d43 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -10060,6 +10060,33 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
break;
}
}
+ else if (TREE_CODE (expr) == OACC_EXIT_DATA
+ && find_omp_clause (OMP_STANDALONE_CLAUSES (expr),
+ OMP_CLAUSE_FINALIZE))
+ {
+ /* Use GOMP_MAP_DELETE/GOMP_MAP_FORCE_FROM to denote that "finalize"
+ semantics apply to all mappings of this OpenACC directive. */
+ bool finalize_marked = false;
+ for (tree c = OMP_STANDALONE_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c))
+ if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP)
+ switch (OMP_CLAUSE_MAP_KIND (c))
+ {
+ case GOMP_MAP_FROM:
+ OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_FROM);
+ finalize_marked = true;
+ break;
+ case GOMP_MAP_RELEASE:
+ OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_DELETE);
+ finalize_marked = true;
+ break;
+ default:
+ /* Check consistency: libgomp relies on the very first data
+ mapping clause being marked, so make sure we did that before
+ any other mapping clauses. */
+ gcc_assert (finalize_marked);
+ break;
+ }
+ }
stmt = gimple_build_omp_target (NULL, kind, OMP_STANDALONE_CLAUSES (expr));
gimplify_seq_add_stmt (pre_p, stmt);
diff --git gcc/omp-low.c gcc/omp-low.c
index 394dd47..048d9fb 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -14218,13 +14218,6 @@ expand_omp_target (struct omp_region *region)
if (t_async)
args.safe_push (t_async);
- if (start_ix == BUILT_IN_GOACC_ENTER_EXIT_DATA)
- {
- c = find_omp_clause (clauses, OMP_CLAUSE_FINALIZE);
- tree t_finalize = c ? integer_one_node : integer_zero_node;
- args.safe_push (t_finalize);
- }
-
/* Save the argument index, and ... */
unsigned t_wait_idx = args.length ();
unsigned num_waits = 0;
diff --git gcc/testsuite/ChangeLog.gomp gcc/testsuite/ChangeLog.gomp
index 960ad15..9ff5476 100644
--- gcc/testsuite/ChangeLog.gomp
+++ gcc/testsuite/ChangeLog.gomp
@@ -1,5 +1,8 @@
2017-05-17 Thomas Schwinge <thomas@codesourcery.com>
+ * c-c++-common/goacc/finalize-1.c: New file.
+ * gfortran.dg/goacc/finalize-1.f: Likewise.
+
* c-c++-common/goacc/data-2.c: Update.
* g++.dg/goacc/template.C: Likewise.
* gcc.dg/goacc/nested-function-1.c: Likewise.
diff --git gcc/testsuite/c-c++-common/goacc/finalize-1.c gcc/testsuite/c-c++-common/goacc/finalize-1.c
new file mode 100644
index 0000000..9482029
--- /dev/null
+++ gcc/testsuite/c-c++-common/goacc/finalize-1.c
@@ -0,0 +1,28 @@
+/* Test valid usage and processing of the finalize clause. */
+
+/* { dg-additional-options "-fdump-tree-original -fdump-tree-gimple" } */
+
+extern int del_r;
+extern float del_f[3];
+extern double cpo_r[8];
+extern long cpo_f;
+
+void f ()
+{
+#pragma acc exit data delete (del_r)
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(release:del_r\\);$" 1 "original" } }
+ { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_enter_exit_data map\\(release:del_r \\\[len: \[0-9\]+\\\]\\)$" 1 "gimple" } } */
+
+#pragma acc exit data finalize delete (del_f)
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(release:del_f\\) finalize;$" 1 "original" } }
+ { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_enter_exit_data map\\(delete:del_f \\\[len: \[0-9\]+\\\]\\) finalize$" 1 "gimple" } } */
+
+#pragma acc exit data copyout (cpo_r)
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(from:cpo_r\\);$" 1 "original" } }
+ { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_enter_exit_data map\\(from:cpo_r \\\[len: \[0-9\]+\\\]\\)$" 1 "gimple" } } */
+
+#pragma acc exit data copyout (cpo_f) finalize
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data finalize map\\(from:cpo_f\\);$" 1 "original" } }
+ { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_enter_exit_data finalize map\\(force_from:cpo_f \\\[len: \[0-9\]+\\\]\\)$" 1 "gimple" } } */
+}
+
diff --git gcc/testsuite/gfortran.dg/goacc/finalize-1.f gcc/testsuite/gfortran.dg/goacc/finalize-1.f
new file mode 100644
index 0000000..5c7a921
--- /dev/null
+++ gcc/testsuite/gfortran.dg/goacc/finalize-1.f
@@ -0,0 +1,27 @@
+! Test valid usage and processing of the finalize clause.
+
+! { dg-additional-options "-fdump-tree-original -fdump-tree-gimple" }
+
+ SUBROUTINE f
+ IMPLICIT NONE
+ INTEGER :: del_r
+ REAL, DIMENSION (3) :: del_f
+ DOUBLE PRECISION, DIMENSION (8) :: cpo_r
+ LOGICAL :: cpo_f
+
+!$ACC EXIT DATA DELETE (del_r)
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(release:del_r\\);$" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_enter_exit_data map\\(release:del_r \\\[len: \[0-9\]+\\\]\\)$" 1 "gimple" } }
+
+!$ACC EXIT DATA FINALIZE DELETE (del_f)
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(release:del_f\\) finalize;$" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_enter_exit_data map\\(delete:del_f \\\[len: \[0-9\]+\\\]\\) finalize$" 1 "gimple" } }
+
+!$ACC EXIT DATA COPYOUT (cpo_r)
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(from:cpo_r\\);$" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_enter_exit_data map\\(from:cpo_r \\\[len: \[0-9\]+\\\]\\)$" 1 "gimple" } }
+
+!$ACC EXIT DATA COPYOUT (cpo_f) FINALIZE
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc exit data map\\(from:cpo_f\\) finalize;$" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma omp target oacc_enter_exit_data map\\(force_from:cpo_f \\\[len: \[0-9\]+\\\]\\) finalize$" 1 "gimple" } }
+ END SUBROUTINE f
diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index 61d0358..2ea7215 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,5 +1,10 @@
2017-05-17 Thomas Schwinge <thomas@codesourcery.com>
+ * oacc-parallel.c (GOACC_enter_exit_data): Locally compute
+ "finalize", and remove the formal parameter. Adjust all users.
+ (GOACC_declare): Don't replace GOMP_MAP_FROM with
+ GOMP_MAP_FORCE_FROM.
+
* oacc-parallel.c (GOACC_enter_exit_data, GOACC_declare): Handle
"GOMP_MAP_RELEASE".
diff --git libgomp/libgomp_g.h libgomp/libgomp_g.h
index 00f7a57..864ef3d 100644
--- libgomp/libgomp_g.h
+++ libgomp/libgomp_g.h
@@ -304,7 +304,7 @@ extern void GOACC_data_start (int, size_t, void **, size_t *,
unsigned short *);
extern void GOACC_data_end (void);
extern void GOACC_enter_exit_data (int, size_t, void **,
- size_t *, unsigned short *, int, int, int, ...);
+ size_t *, unsigned short *, int, int, ...);
extern void GOACC_update (int, size_t, void **, size_t *,
unsigned short *, int, int, ...);
extern void GOACC_wait (int, int, ...);
diff --git libgomp/oacc-parallel.c libgomp/oacc-parallel.c
index 5b8435b..ff6e96c 100644
--- libgomp/oacc-parallel.c
+++ libgomp/oacc-parallel.c
@@ -601,7 +601,7 @@ GOACC_data_end (void)
void
GOACC_enter_exit_data (int device, size_t mapnum,
void **hostaddrs, size_t *sizes, unsigned short *kinds,
- int async, int finalize, int num_waits, ...)
+ int async, int num_waits, ...)
{
struct goacc_thread *thr;
struct gomp_device_descr *acc_dev;
@@ -609,6 +609,17 @@ GOACC_enter_exit_data (int device, size_t mapnum,
bool data_enter = false;
size_t i;
+ /* Determine whether "finalize" semantics apply to all mappings of this
+ OpenACC directive. */
+ bool finalize = false;
+ if (mapnum > 0)
+ {
+ unsigned char kind = kinds[0] & 0xff;
+ if (kind == GOMP_MAP_DELETE
+ || kind == GOMP_MAP_FORCE_FROM)
+ finalize = true;
+ }
+
/* Determine if this is an "acc enter data". */
for (i = 0; i < mapnum; ++i)
{
@@ -1102,7 +1113,7 @@ GOACC_declare (int device, size_t mapnum,
case GOMP_MAP_RELEASE:
case GOMP_MAP_DELETE:
GOACC_enter_exit_data (device, 1, &hostaddrs[i], &sizes[i],
- &kinds[i], 0, 0, 0);
+ &kinds[i], 0, 0);
break;
case GOMP_MAP_FORCE_DEVICEPTR:
@@ -1111,19 +1122,18 @@ GOACC_declare (int device, size_t mapnum,
case GOMP_MAP_ALLOC:
if (!acc_is_present (hostaddrs[i], sizes[i]))
GOACC_enter_exit_data (device, 1, &hostaddrs[i], &sizes[i],
- &kinds[i], 0, 0, 0);
+ &kinds[i], 0, 0);
break;
case GOMP_MAP_TO:
GOACC_enter_exit_data (device, 1, &hostaddrs[i], &sizes[i],
- &kinds[i], 0, 0, 0);
+ &kinds[i], 0, 0);
break;
case GOMP_MAP_FROM:
- kinds[i] = GOMP_MAP_FORCE_FROM;
GOACC_enter_exit_data (device, 1, &hostaddrs[i], &sizes[i],
- &kinds[i], 0, 0, 0);
+ &kinds[i], 0, 0);
break;
case GOMP_MAP_FORCE_PRESENT:
Grüße
Thomas
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [gomp4] Implement OpenACC 2.5 reference counting, and finalize clause
2017-05-16 13:11 [gomp4] Implement OpenACC 2.5 reference counting, and finalize clause Chung-Lin Tang
` (2 preceding siblings ...)
2017-05-17 12:14 ` Thomas Schwinge
@ 2017-05-17 12:16 ` Thomas Schwinge
3 siblings, 0 replies; 5+ messages in thread
From: Thomas Schwinge @ 2017-05-17 12:16 UTC (permalink / raw)
To: Chung-Lin Tang, gcc-patches
Hi!
On Tue, 16 May 2017 20:55:46 +0800, Chung-Lin Tang <chunglin_tang@mentor.com> wrote:
> finalize clause of the exit data directive, and the
> corresponding API routines.
> --- libgomp/oacc-parallel.c (revision 248095)
> +++ libgomp/oacc-parallel.c (revision 248096)
> @@ -355,7 +355,22 @@
> }
> }
> else
> - tgt->device_descr->openacc.register_async_cleanup_func (tgt, async);
> + {
> + bool async_unmap = false;
> + for (size_t i = 0; i < tgt->list_count; i++)
> + {
> + splay_tree_key k = tgt->list[i].key;
> + if (k && k->refcount == 1)
> + {
> + async_unmap = true;
> + break;
> + }
> + }
> + if (async_unmap)
> + tgt->device_descr->openacc.register_async_cleanup_func (tgt, async);
> + else
> + gomp_unmap_vars (tgt, false);
> + }
>
> acc_dev->openacc.async_set_async_func (acc_async_sync);
This additional gomp_unmap_vars call also needs be instrumented for the
OpenACC Profiling Interface.
> --- libgomp/openacc.h (revision 248095)
> +++ libgomp/openacc.h (revision 248096)
> +/* Finalize versions of copyout/delete functions, specified in OpenACC 2.5. */
> +void acc_copyout_finalize (void *, size_t) __GOACC_NOTHROW;
> +void acc_copyout_finalize_async (void *, size_t, int) __GOACC_NOTHROW;
> +void acc_delete_finalize (void *, size_t) __GOACC_NOTHROW;
> +void acc_delete_finalize_async (void *, size_t, int) __GOACC_NOTHROW;
And for these, the OpenACC Profiling Interface status needs to be
documented.
Committed to gomp-4_0-branch in r248150:
commit dc97f798ad7f7f44f45b2b8e0ece81f3926fa1c2
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Wed May 17 12:07:30 2017 +0000
OpenACC 2.5 Profiling Interface changes for "finalize" handling
libgomp/
* libgomp.texi (OpenACC Profiling Interface): Update.
* oacc-parallel.c (GOACC_parallel_keyed): Update profiling event
generation.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@248150 138bc75d-0d04-0410-961f-82ee72b054a4
---
libgomp/ChangeLog.gomp | 4 ++++
libgomp/libgomp.texi | 4 ++--
libgomp/oacc-parallel.c | 9 +++++++--
3 files changed, 13 insertions(+), 4 deletions(-)
diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index 2ea7215..996c1f9 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,5 +1,9 @@
2017-05-17 Thomas Schwinge <thomas@codesourcery.com>
+ * libgomp.texi (OpenACC Profiling Interface): Update.
+ * oacc-parallel.c (GOACC_parallel_keyed): Update profiling event
+ generation.
+
* oacc-parallel.c (GOACC_enter_exit_data): Locally compute
"finalize", and remove the formal parameter. Adjust all users.
(GOACC_declare): Don't replace GOMP_MAP_FROM with
diff --git libgomp/libgomp.texi libgomp/libgomp.texi
index 69fb3be..1dea1e2 100644
--- libgomp/libgomp.texi
+++ libgomp/libgomp.texi
@@ -3459,8 +3459,8 @@ offloading devices (it's not clear if they should be):
@item @code{acc_free}
@item @code{acc_copyin}, @code{acc_present_or_copyin}, @code{acc_copyin_async}
@item @code{acc_create}, @code{acc_present_or_create}, @code{acc_create_async}
-@item @code{acc_copyout}, @code{acc_copyout_async}
-@item @code{acc_delete}, @code{acc_delete_async}
+@item @code{acc_copyout}, @code{acc_copyout_async}, @code{acc_copyout_finalize}, @code{acc_copyout_finalize_async}
+@item @code{acc_delete}, @code{acc_delete_async}, @code{acc_delete_finalize}, @code{acc_delete_finalize_async}
@item @code{acc_update_device}, @code{acc_update_device_async}
@item @code{acc_update_self}, @code{acc_update_self_async}
@item @code{acc_map_data}, @code{acc_unmap_data}
diff --git libgomp/oacc-parallel.c libgomp/oacc-parallel.c
index ff6e96c..622c711 100644
--- libgomp/oacc-parallel.c
+++ libgomp/oacc-parallel.c
@@ -333,8 +333,10 @@ GOACC_parallel_keyed (int device, void (*fn) (void *),
async, dims, tgt);
/* If running synchronously, unmap immediately. */
+ bool copyfrom = true;
if (async < acc_async_noval)
{
+ unmap:
if (profiling_dispatch_p)
{
prof_info.event_type = acc_ev_exit_data_start;
@@ -344,7 +346,7 @@ GOACC_parallel_keyed (int device, void (*fn) (void *),
goacc_profiling_dispatch (&prof_info, &enter_exit_data_event_info,
&api_info);
}
- gomp_unmap_vars (tgt, true);
+ gomp_unmap_vars (tgt, copyfrom);
if (profiling_dispatch_p)
{
prof_info.event_type = acc_ev_exit_data_end;
@@ -369,7 +371,10 @@ GOACC_parallel_keyed (int device, void (*fn) (void *),
if (async_unmap)
tgt->device_descr->openacc.register_async_cleanup_func (tgt, async);
else
- gomp_unmap_vars (tgt, false);
+ {
+ copyfrom = false;
+ goto unmap;
+ }
}
acc_dev->openacc.async_set_async_func (acc_async_sync);
(That one can certainly do with some restructuring, to avoid the "goto".)
;-)
Grüße
Thomas
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2017-05-17 12:14 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-16 13:11 [gomp4] Implement OpenACC 2.5 reference counting, and finalize clause Chung-Lin Tang
2017-05-17 11:51 ` Thomas Schwinge
2017-05-17 12:06 ` Thomas Schwinge
2017-05-17 12:14 ` Thomas Schwinge
2017-05-17 12:16 ` Thomas Schwinge
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).