From: Thomas Schwinge <thomas@codesourcery.com>
To: <gcc-patches@gcc.gnu.org>, <fortran@gcc.gnu.org>
Cc: <hberre3@gatech.edu>, <rcheruku@amd.com>,
Tobias Burnus <tobias@codesourcery.com>
Subject: Support OpenACC 'declare create' with Fortran allocatable arrays, part II [PR106643, PR96668] (was: Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643])
Date: Wed, 2 Nov 2022 21:34:29 +0100 [thread overview]
Message-ID: <87tu3hjdt6.fsf@euler.schwinge.homeip.net> (raw)
In-Reply-To: <87y1stjeda.fsf@euler.schwinge.homeip.net>
[-- Attachment #1: Type: text/plain, Size: 5114 bytes --]
Hi!
On 2022-11-02T21:22:25+0100, I wrote:
> On 2022-11-02T21:15:31+0100, I wrote:
>> On 2022-11-02T21:10:54+0100, I wrote:
>>> On 2022-11-02T21:04:56+0100, I wrote:
>>>> --- /dev/null
>>>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90
>>>> @@ -0,0 +1,268 @@
>>>> +! Test OpenACC 'declare create' with allocatable arrays.
>>>> +
>>>> +! { dg-do run }
>>>> +
>>>> +!TODO-OpenACC-declare-allocate
>>>> +! Not currently implementing correct '-DACC_MEM_SHARED=0' behavior:
>>>> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
>>>> +! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
>>>> +! { dg-xfail-run-if TODO { *-*-* } { -DACC_MEM_SHARED=0 } }
>>>> +
>>>> +[...]
>>>
>>> Getting rid of the "'dg-xfail-run-if' for '-DACC_MEM_SHARED=0'" via a
>>> work around (as seen in real-world code), I've pushed to master branch
>>> commit 59c6c5dbf267cd9d0a8df72b2a5eb5657b64268e
>>> "Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'"
>>
>>> ... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted
>>> for missing support for OpenACC "Changes from Version 2.0 to 2.5":
>>> "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
>>> Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
>>> manually.
>>
>> A similar test case, but with different focus, I've pushed to master
>> branch in commit abeaf3735fe2568b9d5b8096318da866b1fe1e5c
>> "Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'",
>> see attached.
>
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90
>> @@ -0,0 +1,402 @@
>> +! Test OpenACC 'declare create' with allocatable arrays.
>> +
>> +! { dg-do run }
>> +
>> +! Note that we're not testing OpenACC semantics here, but rather documenting
>> +! current GCC behavior, specifically, behavior concerning updating of
>> +! host/device array descriptors.
>> +! { dg-skip-if n/a { *-*-* } { -DACC_MEM_SHARED=1 } }
>> +
>> +!TODO-OpenACC-declare-allocate
>> +! Missing support for OpenACC "Changes from Version 2.0 to 2.5":
>> +! "The 'declare create' directive with a Fortran 'allocatable' has new behavior".
>> +! Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
>> +! manually.
>
> If instead of calling 'acc_create'/'acc_delete' we'd like to use
> '!$acc enter data create'/'!$acc exit data delete', we run into
> <https://gcc.gnu.org/PR106643>
> "[gfortran + OpenACC] Allocate in module causes refcount error".
> Pushed to master branchcommit da8e0e1191c5512244a752b30dea0eba83e3d10c
> "Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643]",
> see attached.
> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c
> @@ -1166,6 +1165,31 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
> bool processed = false;
>
> struct target_mem_desc *tgt = n->tgt;
> +
> + /* Arrange so that OpenACC 'declare' code à la PR106643
> + "[gfortran + OpenACC] Allocate in module causes refcount error"
> + has a chance to work. */
> + if ((kinds[i] & 0xff) == GOMP_MAP_TO_PSET
> + && tgt->list_count == 0)
> + {
> + /* 'declare target'. */
> + assert (n->refcount == REFCOUNT_INFINITY);
> +
> + for (size_t k = 1; k < groupnum; k++)
> + {
> + /* The only thing we expect to see here. */
> + assert ((kinds[i + k] & 0xff) == GOMP_MAP_POINTER);
> + }
> +
> + /* Given that 'goacc_exit_data_internal'/'goacc_exit_datum_1'
> + will always see 'n->refcount == REFCOUNT_INFINITY',
> + there's no need to adjust 'n->dynamic_refcount' here. */
> +
> + processed = true;
> + }
To make slightly more interesting (real-world) test cases work, we here
also have to process the 'GOMP_MAP_TO_PSET', 'GOMP_MAP_POINTER' here.
Tobias had implemented such a thing in context of OpenMP PR96668
"[OpenMP] Re-mapping allocated but previously unallocated allocatable does not work"
a while ago, and we may do similar here. Side note: in the first version
of my changes, I had actually here in
'libgomp/oacc-mem.c:goacc_enter_data_internal' re-implemented the
corresponding -- "somewhat ugly" -- logic, when at some point I realized
that I instead could simply call into the existing code, greatly reducing
the complexity here... Pushed to master branch
commit f6ce1e77bbf5d3a096f52e674bfd7354c6537d10
"Support OpenACC 'declare create' with Fortran allocatable arrays, part II [PR106643, PR96668]",
see attached.
Grüße
Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Attachment #2: 0001-Support-OpenACC-declare-create-with-Fortran-allocata.patch --]
[-- Type: text/x-diff, Size: 11881 bytes --]
From f6ce1e77bbf5d3a096f52e674bfd7354c6537d10 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 28 Oct 2022 15:06:45 +0200
Subject: [PATCH] Support OpenACC 'declare create' with Fortran allocatable
arrays, part II [PR106643, PR96668]
PR libgomp/106643
PR fortran/96668
libgomp/
* oacc-mem.c (goacc_enter_data_internal): Support
OpenACC 'declare create' with Fortran allocatable arrays, part II.
* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90:
Adjust.
* testsuite/libgomp.oacc-fortran/pr106643-1.f90: New.
---
libgomp/oacc-mem.c | 15 +++-
...locatable-array_descriptor-1-directive.f90 | 90 +++++++++++++------
.../libgomp.oacc-fortran/pr106643-1.f90 | 83 +++++++++++++++++
3 files changed, 160 insertions(+), 28 deletions(-)
create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr106643-1.f90
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index ba010fddbb3..233fe0e4c1d 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -1166,7 +1166,10 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
struct target_mem_desc *tgt = n->tgt;
- /* Arrange so that OpenACC 'declare' code à la PR106643
+ /* Minimal OpenACC variant corresponding to PR96668
+ "[OpenMP] Re-mapping allocated but previously unallocated
+ allocatable does not work" 'libgomp/target.c' changes, so that
+ OpenACC 'declare' code à la PR106643
"[gfortran + OpenACC] Allocate in module causes refcount error"
has a chance to work. */
if ((kinds[i] & 0xff) == GOMP_MAP_TO_PSET
@@ -1181,6 +1184,16 @@ goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum,
assert ((kinds[i + k] & 0xff) == GOMP_MAP_POINTER);
}
+ /* Let 'goacc_map_vars' -> 'gomp_map_vars_internal' handle
+ this. */
+ gomp_mutex_unlock (&acc_dev->lock);
+ struct target_mem_desc *tgt_
+ = goacc_map_vars (acc_dev, aq, groupnum, &hostaddrs[i], NULL,
+ &sizes[i], &kinds[i], true,
+ GOMP_MAP_VARS_ENTER_DATA);
+ assert (tgt_ == NULL);
+ gomp_mutex_lock (&acc_dev->lock);
+
/* Given that 'goacc_exit_data_internal'/'goacc_exit_datum_1'
will always see 'n->refcount == REFCOUNT_INFINITY',
there's no need to adjust 'n->dynamic_refcount' here. */
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90 b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90
index 10e1d5bc378..6604f72c5c1 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90
@@ -105,27 +105,50 @@ program test
!$acc enter data create (b)
! This is now OpenACC "present":
if (.not.acc_is_present (b)) error stop
- ! This still has the initial array descriptor:
+ ! ..., and got the actual array descriptor installed:
!$acc serial
- call verify_initial
+ call verify_n1_allocated
!$acc end serial
do i = n1_lb, n1_ub
b(i) = i - 1
end do
- ! Verify that host-to-device copy doesn't touch the device-side (still
- ! initial) array descriptor (but it does copy the array data).
+ ! In 'declare-allocatable-array_descriptor-1-runtime.f90', this does "verify
+ ! that host-to-device copy doesn't touch the device-side (still initial)
+ ! array descriptor (but it does copy the array data"). This is here not
+ ! applicable anymore, as we've already gotten the actual array descriptor
+ ! installed. Thus now verify that it does copy the array data.
call acc_update_device (b)
!$acc serial
- call verify_initial
+ call verify_n1_allocated
!$acc end serial
b = 40
- ! Verify that device-to-host copy doesn't touch the host-side array
- ! descriptor, doesn't copy out the device-side (still initial) array
- ! descriptor (but it does copy the array data).
+ !$acc parallel copyout (id1_1) ! No data clause for 'b' (explicit or implicit): no 'GOMP_MAP_TO_PSET'.
+ call verify_n1_values (-1)
+ id1_1 = 0
+ !$acc end parallel
+ ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc parallel map\(from:id1_1\)$} 1 original } }
+ ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_parallel map\(from:id1_1 \[len: [0-9]+\]\)$} 1 gimple } }
+
+ !$acc parallel copy (b) copyout (id1_2)
+ ! As already present, 'copy (b)' doesn't copy; addend is still '-1'.
+ call verify_n1_values (-1)
+ id1_2 = 0
+ !$acc end parallel
+ ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc parallel map\(tofrom:\*\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[pointer assign, bias: 0\]\) map\(from:id1_2\)$} 1 original } }
+ !TODO ..., but without an actual use of 'b', the gimplifier removes the
+ !TODO 'GOMP_MAP_TO_PSET':
+ ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_parallel map\(tofrom:MEM <integer\(kind=[0-9]+\)\[0:\]> \[\(integer\(kind=[0-9]+\)\[0:\] \*\)[^\]]+\] \[len: [^\]]+\]\) map\(alloc:b\.data \[pointer assign, bias: 0\]\) map\(from:id1_2 \[len: [0-9]+\]\)$} 1 gimple } }
+
+ ! In 'declare-allocatable-array_descriptor-1-runtime.f90', this does "verify
+ ! that device-to-host copy doesn't touch the host-side array descriptor,
+ ! doesn't copy out the device-side (still initial) array descriptor (but it
+ ! does copy the array data)". This is here not applicable anymore, as we've
+ ! already gotten the actual array descriptor installed. Thus now verify that
+ ! it does copy the array data.
call acc_update_self (b)
call verify_n1_allocated
@@ -142,11 +165,19 @@ program test
! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_update map\(force_to:MEM <integer\(kind=[0-9]+\)\[0:\]> \[\(integer\(kind=[0-9]+\)\[0:\] \*\)[^\]]+\] \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:b\.data \[pointer assign, bias: 0\]\) map\(force_from:id1_1 \[len: [0-9]+\]\)$} 1 gimple } }
! ..., but it's silently skipped in 'GOACC_update'.
!$acc serial
- call verify_initial
+ call verify_n1_allocated
!$acc end serial
b = 41
+ !$acc parallel
+ call verify_n1_values (1)
+ !$acc end parallel
+
+ !$acc parallel copy (b)
+ call verify_n1_values (1)
+ !$acc end parallel
+
!$acc update self (b) self (id1_2)
! We do have 'GOMP_MAP_TO_PSET' here:
! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc update map\(force_from:\*\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[pointer assign, bias: 0\]\) map\(force_from:id1_2\);$} 1 original } }
@@ -159,20 +190,9 @@ program test
b(i) = b(i) + 2
end do
- ! Now install the actual array descriptor, via a data clause for 'b'
- ! (explicit or implicit): must get a 'GOMP_MAP_TO_PSET', which then in
- ! 'gomp_map_vars_internal' is handled as 'declare target', and because of
- ! '*(void **) hostaddrs[i] != NULL', we've got 'has_always_ptrset == true',
- ! 'always_to_cnt == 1', and therefore 'gomp_map_vars_existing' does update
- ! the 'GOMP_MAP_TO_PSET'.
- !$acc serial present (b) copyin (id1_1)
- call verify_initial
- id1_1 = 0
- !$acc end serial
- ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma acc serial map\(force_present:\*\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[len: [^\]]+\]\) map\(to:b \[pointer set, len: [0-9]+\]\) map\(alloc:\(integer\(kind=[0-9]+\)\[0:\] \* restrict\) b\.data \[pointer assign, bias: 0\]\) map\(to:id1_1\)$} 1 original } }
- !TODO ..., but without an actual use of 'b', the gimplifier removes the
- !TODO 'GOMP_MAP_TO_PSET':
- ! { dg-final { scan-tree-dump-times {(?n)^ *#pragma omp target oacc_serial map\(force_present:MEM <integer\(kind=[0-9]+\)\[0:\]> \[\(integer\(kind=[0-9]+\)\[0:\] \*\)[^\]]+\] \[len: [^\]]+\]\) map\(alloc:b\.data \[pointer assign, bias: 0\]\) map\(to:id1_1 \[len: [0-9]+\]\)$} 1 gimple } }
+ ! Now test that (potentially re-)installing the actual array descriptor is a
+ ! no-op, via a data clause for 'b' (explicit or implicit): must get a
+ ! 'GOMP_MAP_TO_PSET'.
!$acc serial present (b) copyin (id1_2)
call verify_n1_allocated
!TODO Use of 'b':
@@ -243,9 +263,9 @@ program test
if (acc_is_present (b)) error stop
!$acc enter data create (b)
if (.not.acc_is_present (b)) error stop
- ! This still has the previous (n1) array descriptor:
+ ! ..., and got the actual array descriptor installed:
!$acc serial
- call verify_n1_deallocated (.true.)
+ call verify_n2_allocated
!$acc end serial
do i = n2_lb, n2_ub
@@ -254,11 +274,19 @@ program test
call acc_update_device (b)
!$acc serial
- call verify_n1_deallocated (.true.)
+ call verify_n2_allocated
!$acc end serial
b = -40
+ !$acc parallel
+ call verify_n2_values (20)
+ !$acc end parallel
+
+ !$acc parallel copy (b)
+ call verify_n2_values (20)
+ !$acc end parallel
+
call acc_update_self (b)
call verify_n2_allocated
@@ -269,11 +297,19 @@ program test
!$acc update device (b)
!$acc serial
- call verify_n1_deallocated (.true.)
+ call verify_n2_allocated
!$acc end serial
b = -41
+ !$acc parallel
+ call verify_n2_values (-20)
+ !$acc end parallel
+
+ !$acc parallel copy (b)
+ call verify_n2_values (-20)
+ !$acc end parallel
+
!$acc update self (b)
call verify_n2_allocated
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr106643-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr106643-1.f90
new file mode 100644
index 00000000000..a9c969e3361
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr106643-1.f90
@@ -0,0 +1,83 @@
+! { dg-do run }
+! { dg-additional-options -cpp }
+
+
+!TODO OpenACC 'serial' vs. GCC/nvptx:
+!TODO { dg-prune-output {using 'vector_length \(32\)', ignoring 1} }
+
+
+module m_macron
+
+ implicit none
+
+ real(kind(0d0)), allocatable, dimension(:) :: valls
+ !$acc declare create(valls)
+
+contains
+
+ subroutine s_macron_compute(size)
+
+ integer :: size
+
+ !$acc routine seq
+
+#if ACC_MEM_SHARED
+ if (valls(size) /= 1) error stop
+#else
+ if (valls(size) /= size - 2) error stop
+#endif
+
+ valls(size) = size + 2
+
+ end subroutine s_macron_compute
+
+ subroutine s_macron_init(size)
+
+ integer :: size
+
+ print*, "size=", size
+
+ print*, "allocate(valls(1:size))"
+ allocate(valls(1:size))
+
+ print*, "acc enter data create(valls(1:size))"
+ !$acc enter data create(valls(1:size))
+
+ print*, "!$acc update device(valls(1:size))"
+ valls(size) = size - 2
+ !$acc update device(valls(1:size))
+
+ valls(size) = 1
+
+ !$acc serial
+ call s_macron_compute(size)
+ !$acc end serial
+
+ valls(size) = -1
+
+ !$acc update host(valls(1:size))
+#if ACC_MEM_SHARED
+ if (valls(size) /= -1) error stop
+#else
+ if (valls(size) /= size + 2) error stop
+#endif
+
+ print*, valls(1:size)
+
+ print*, "acc exit data delete(valls)"
+ !$acc exit data delete(valls)
+
+ end subroutine s_macron_init
+
+end module m_macron
+
+
+program p_main
+
+ use m_macron
+
+ implicit none
+
+ call s_macron_init(10)
+
+end program p_main
--
2.35.1
next prev parent reply other threads:[~2022-11-02 20:34 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-05 15:24 [gomp4] add support for fortran allocate support with declare create Cesar Philippidis
2017-04-06 9:05 ` Thomas Schwinge
2017-04-06 14:26 ` Cesar Philippidis
2022-11-02 20:04 ` Add 'libgomp.oacc-fortran/declare-allocatable-1.f90' (was: [gomp4] add support for fortran allocate support with declare create) Thomas Schwinge
2022-11-02 20:10 ` Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90' (was: Add 'libgomp.oacc-fortran/declare-allocatable-1.f90') Thomas Schwinge
2022-11-02 20:15 ` Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90' Thomas Schwinge
2022-11-02 20:22 ` Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643] Thomas Schwinge
2022-11-02 20:34 ` Thomas Schwinge [this message]
[not found] ` <1ECCE9C8-0CE9-46EA-A0F2-3F3FA50F4681@gmail.com>
2022-11-03 10:47 ` Add 'libgomp.oacc-fortran/declare-allocatable-1.f90' (was: [gomp4] add support for fortran allocate support with declare create) Thomas Schwinge
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87tu3hjdt6.fsf@euler.schwinge.homeip.net \
--to=thomas@codesourcery.com \
--cc=fortran@gcc.gnu.org \
--cc=gcc-patches@gcc.gnu.org \
--cc=hberre3@gatech.edu \
--cc=rcheruku@amd.com \
--cc=tobias@codesourcery.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).