* [Bug fortran/111661] New: [OpenACC] Detach+Attach of DT component gives libgomp: [0x405140,96] is not mapped
@ 2023-10-02 10:40 burnus at gcc dot gnu.org
2023-10-04 13:53 ` [Bug fortran/111661] [OpenACC] Detach+Attach of DT component gives libgomp: [0x405140,96] is not mapped when running 'acc update' on DT var itself patrick.begou@univ-grenoble-alpes.fr
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: burnus at gcc dot gnu.org @ 2023-10-02 10:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111661
Bug ID: 111661
Summary: [OpenACC] Detach+Attach of DT component gives libgomp:
[0x405140,96] is not mapped
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: openacc
Severity: normal
Priority: P3
Component: fortran
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org
CC: patrick.begou@univ-grenoble-alpes.fr,
tschwinge at gcc dot gnu.org
Target Milestone: ---
Reported by Patrick. The following snippet gives:
libgomp: [0x406180,96] is not mapped
with GCC (mainline) and reportedly likewise with GCC 13.2, but it works with the
Nvidia and Cray compilers (see comments in the snippet). [Same result with an
older OG13 build: 13.1.1 20230710 [OG13].]
The problem seems to be the 'acc update' command. Here,
0x406180
is the host address of 'tab':
406180- tab - host
1847B20- tab%val - host
1304140000- tab - device
1304140200- tab%val - device
406180- tab - host
18FF8C0- tab%val - host
libgomp: [0x406180,96] is not mapped
In other words, the DETACH seems not only to detach 'tab%val' but also to
somehow unmap 'tab'?
#pragma omp target oacc_exit_data map(delete:tab.val.data [len: 88])
map(delete:MEM <real(kind=8)[0:]> [(real(kind=8)[0:] *)_34] [len: _33])
map(to:tab.val [pointer set, len: 88]) map(force_detach:tab.val.data [bias: 0])
finalize
...
#pragma omp target oacc_update map(force_to:tab [len: 96])
!----------cut-------------------
use iso_c_binding
implicit none
type r2tab
double precision, dimension(:,:), allocatable :: val
integer :: dim1
integer :: dim2
end type r2tab
type(r2tab) :: tab
integer :: i,j
integer(c_intptr_t) :: iloc(2)
!$acc enter data copyin(tab)
tab%dim1 = 10
tab%dim2 = 20
allocate (tab%val(tab%dim1,tab%dim2))
print '(z16,a)', loc(tab), "- tab - host"
print '(z16,a)', loc(tab%val), "- tab%val - host"
!$acc enter data copyin(tab%val)
!$acc serial copyout(iloc)
iloc(1) = loc(tab)
iloc(2) = loc(tab%val)
!$acc end serial
print '(z16,a)', iloc(1), "- tab - device"
print '(z16,a)', iloc(2), "- tab%val - device"
!...
!$acc exit data delete(tab%val) finalize
! Works as is with nvfortran and CCE ftn but gfortran 13 requires
! additionally:
! !$acc exit data delete(tab) finalize
tab%dim1=11
tab%dim2=30
deallocate(tab%val)
allocate (tab%val(tab%dim1,tab%dim2))
print '(z16,a)', loc(tab), "- tab - host"
print '(z16,a)', loc(tab%val), "- tab%val - host"
! For nvfortran and ftn:
!$acc update device(tab)
! gfortran with the change above requires instead:
! !$acc enter data copyin(tab)
!$acc enter data create(tab%val)
!$acc parallel loop
do j = 1, tab%dim2
do i = 1, tab%dim1
tab%val(i,j) = j * 100 + i
end do
end do
!$acc end parallel loop
!$acc exit data copyout(tab%val) finalize
!$acc exit data delete(tab)
do j = 1, tab%dim2
do i = 1, tab%dim1
if (tab%val(i,j) /= j * 100 + i) error stop
end do
end do
end
* [Bug fortran/111661] [OpenACC] Detach+Attach of DT component gives libgomp: [0x405140,96] is not mapped when running 'acc update' on DT var itself
From: patrick.begou@univ-grenoble-alpes.fr @ 2023-10-04 13:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111661
--- Comment #1 from Patrick Bégou <patrick.begou@univ-grenoble-alpes.fr> ---
With tab an instance of the r2tab type described above by Tobias Burnus,
this is what I am doing with GCC 13.2 to resize the allocatable tab%val
attribute to [n,m] elements:
!$acc exit data delete(tab%val) finalize
!$acc exit data delete(tab) finalize
deallocate(tab%val)
allocate(tab%val(n,m))
tab%dim1=n
tab%dim2=m
!$acc enter data copyin(tab)
!$acc enter data create(tab%val)
Previously (with nvfortran or ftn) I was using:
!$acc exit data delete(tab%val) finalize
deallocate(tab%val)
allocate(tab%val(n,m))
tab%dim1=n
tab%dim2=m
!$acc update device(tab)
!$acc enter data create(tab%val)
Unfortunately, as soon as the user-defined type contains more than one
allocatable attribute and only one of them should be resized, the workaround is
no longer usable. The same limitation occurs with linked lists uploaded to the
GPU if one of the elements contains an allocatable attribute that must be
resized: it is no longer possible to remove the tab component and then offload
it to the GPU again.
* [Bug fortran/111661] [OpenACC] Detach+Attach of DT component gives libgomp: [0x405140,96] is not mapped when running 'acc update' on DT var itself
From: burnus at gcc dot gnu.org @ 2023-10-13 8:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111661
--- Comment #2 from Tobias Burnus <burnus at gcc dot gnu.org> ---
@Patrick: It seems to work fine without "finalize".
Can you check whether the big program then works for you?
Usually, one should be able to do proper ref counting such that a
'finalize' -> force setting refcounts to zero
shouldn't be needed.
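To make the reference-counting remark concrete, here is a toy model in C. It is only a sketch, not libgomp's implementation: toy_mapping, toy_enter and toy_exit are names made up for this illustration. It assumes each 'enter data' takes one reference, each 'exit data' drops one, and 'finalize' forces the count to zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one device mapping; NOT libgomp's data structure. */
struct toy_mapping {
    unsigned refcount;   /* outstanding "enter data" references */
};

/* Models "enter data": take one more reference. */
static void toy_enter(struct toy_mapping *m)
{
    assert(m != NULL);
    m->refcount++;
}

/* Models "exit data": drop one reference, or all of them with finalize.
   Returns true when the mapping would be removed from the device. */
static bool toy_exit(struct toy_mapping *m, bool finalize)
{
    assert(m != NULL);
    if (m->refcount == 0)
        return false;            /* nothing mapped, nothing to remove */
    if (finalize)
        m->refcount = 0;         /* force the count to zero */
    else
        m->refcount--;
    return m->refcount == 0;
}

/* Balanced enter/exit pairs reach zero without finalize. */
static bool toy_balanced_reaches_zero(void)
{
    struct toy_mapping m = { 0 };
    toy_enter(&m);
    toy_enter(&m);
    bool removed_early = toy_exit(&m, false);  /* 2 -> 1, still mapped */
    bool removed_late  = toy_exit(&m, false);  /* 1 -> 0, removed */
    return !removed_early && removed_late;
}

/* A single finalize removes the mapping regardless of the count. */
static bool toy_finalize_removes_at_once(void)
{
    struct toy_mapping m = { 0 };
    toy_enter(&m);
    toy_enter(&m);
    return toy_exit(&m, true);   /* 2 -> 0 in one step */
}
```

With matching enter/exit pairs the mapping goes away on the last 'exit data', so 'finalize' only matters when the pairs are not balanced.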
* * *
Looking at the code more closely, the problem is:
#pragma omp target oacc_exit_data map(delete:tab.val.data [len: 88])
this tries to 'delete' the array descriptor - but as tab.val.data is part of
'tab', this deletes all of "tab".
Compare the C example:
struct t { int *a; int n; };
void f() {
struct t s;
#pragma acc enter data copyin(s.a[:s.n])
#pragma acc exit data delete(s.a[:s.n])
// for completeness, not relevant here:
#pragma acc exit data detach(s.a)
#pragma acc exit data delete(s.a)
}
GCC does:
#pragma omp target oacc_enter_data map(struct:s [len: 1]) \
map(alloc:s.a [len: 8]) map(to:*_4 [len: _3]) map(attach:s.a [bias: 0])
#pragma omp target oacc_exit_data map(release:s.a [len: 8]) \
map(release:*_8 [len: _7]) map(detach:s.a [bias: 0])
#pragma omp target oacc_exit_data map(detach:s.a [bias: 0])
#pragma omp target oacc_exit_data map(release:s.a [len: 8])
which seems to be at least consistent. Again, here a 'finalize' would force the
reference counts to zero and, hence, also delete 's': not only the
pointee/pointer target *s.a / s.a[0:s.n] but also the pointer 's.a' itself.
(BTW: Same result since GCC 10; GCC 9 rejects that code.)
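The effect described above can be mimicked with a small toy table in C. This is a sketch under assumptions, not libgomp code: it assumes that a host address is resolved to whichever mapped range contains it, and that the 88-byte descriptor of 'val' sits at offset 0 of the 96-byte 'tab' (the sizes and the address 0x406180 are taken from the report, the offset is a guess). All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One mapped host range in a toy table; NOT libgomp's real data structure. */
struct toy_range {
    uintptr_t host_start;
    size_t    len;
    unsigned  refcount;   /* 0 means "not mapped" */
};

/* Find the mapped range that fully contains [addr, addr + len). */
static struct toy_range *toy_lookup(struct toy_range *tab, size_t n,
                                    uintptr_t addr, size_t len)
{
    assert(tab != NULL);
    for (size_t i = 0; i < n; i++)
        if (tab[i].refcount > 0
            && addr >= tab[i].host_start
            && addr + len <= tab[i].host_start + tab[i].len)
            return &tab[i];
    return NULL;
}

/* Model of "exit data delete(...) finalize" resolved by containment:
   whatever range contains the address gets its count forced to zero. */
static void toy_delete_finalize(struct toy_range *tab, size_t n,
                                uintptr_t addr, size_t len)
{
    struct toy_range *r = toy_lookup(tab, n, addr, len);
    if (r)
        r->refcount = 0;
}

/* Replays the report; returns true if 'tab' is still mapped afterwards. */
static bool toy_tab_survives_member_finalize(void)
{
    /* 'tab' mapped once: 96 bytes at host address 0x406180. */
    struct toy_range table[1] = { { 0x406180u, 96, 1 } };

    /* Delete the 88-byte descriptor, assumed to start at offset 0. */
    toy_delete_finalize(table, 1, 0x406180u, 88);

    /* A later "update device(tab)" needs all 96 bytes to be mapped. */
    return toy_lookup(table, 1, 0x406180u, 96) != NULL;
}
```

In this model the member lookup lands on the enclosing range, so the forced zero count takes 'tab' with it and the later lookup of [0x406180,96] fails, which matches the "is not mapped" message.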
* * *
QUESTION: Is the current code for C (and Fortran) correct according to the
OpenACC specification or not?
FOLLOW UP QUESTION: If GCC's result is incorrect, what should the compiler do
instead?
And if it is correct, the question is: why do both ftn and nvfortran work in
the same way?
* [Bug fortran/111661] [OpenACC] Detach+Attach of DT component gives libgomp: [0x405140,96] is not mapped when running 'acc update' on DT var itself
From: patrick.begou@univ-grenoble-alpes.fr @ 2023-10-13 14:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111661
--- Comment #3 from Patrick Bégou <patrick.begou@univ-grenoble-alpes.fr> ---
Hi Tobias,
thanks for this information.
- Yes, removing the "finalize" makes this small test case work. In my
mind, "finalize" should only remove the allocated attribute from the GPU by
setting its count to zero. Is this because the attribute is an allocatable
and not a pointer? Is the behaviour the same with a pointer attribute?
- Unfortunately, this modification does not bring significant progress
in porting my large code (things are much more complex there), but with the
GNU compilers GDB is working, which is a big step for investigating. I have
isolated the low-level data management to test this module
independently. It works with ftn and Nvidia but not with GNU at this
time. I have to investigate.
Patrick
* [Bug fortran/111661] [OpenACC] Detach+Attach of DT component gives libgomp: [0x405140,96] is not mapped when running 'acc update' on DT var itself
From: tschwinge at gcc dot gnu.org @ 2023-11-16 20:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111661
--- Comment #4 from Thomas Schwinge <tschwinge at gcc dot gnu.org> ---
Created attachment 56608
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56608&action=edit
'pr111661.c'
Before getting the Fortran case to work, let's indeed first look into some
conceptually corresponding C code:
(In reply to Tobias Burnus from comment #2)
> Looking at the code more closely, the problem is:
>
> #pragma omp target oacc_exit_data map(delete:tab.val.data [len: 88])
>
> this tries to 'delete' the array descriptor - but as tab.val.data is part of
> 'tab', this deletes all of "tab".
..., and indeed the same appears to happen in C:
> Compare the C example:
I completed this into functional code, as follows (and attached).
> struct t { int *a; int n; };
> void f() {
> struct t s;
Here, first initialize 's':
s.n = 10;
s.a = __builtin_malloc(s.n * sizeof *s.a);
Now, before 's.a', we first need to establish 's' itself:
#pragma acc enter data copyin(s)
> #pragma acc enter data copyin(s.a[:s.n])
Then, let's do something observable, for example:
#pragma acc serial present(s)
{
for (int i = 0; i < s.n; ++i)
s.a[i] = i * i;
}
To be able to observe the computations, instead of:
> #pragma acc exit data delete(s.a[:s.n])
..., do:
#pragma acc exit data copyout(s.a[:s.n]) //finalize
After this, we expect 's' still to be alive:
if (!acc_is_present(&s, sizeof s))
__builtin_abort();
> // for completeness, not relevant here:
> #pragma acc exit data detach(s.a)
> #pragma acc exit data delete(s.a)
I don't understand what you're doing here; I commented out these two.
Instead, now get rid of 's':
#pragma acc exit data delete(s)
if (acc_is_present(&s, sizeof s))
__builtin_abort();
Verify results, and clean up:
for (int i = 0; i < s.n; ++i)
if (s.a[i] != i * i)
__builtin_abort();
__builtin_free(s.a);
> }
This works fine with 'finalize' commented out. However, with 'finalize'
enabled, we see:
> Again, here a 'finalize' would force
> the reference counts to zero and, hence, also delete 's' and not only the
> pointee/pointer target *s.a / s.a[0:.n] but also the pointer 's.a' itself.
... this behavior.
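For reference, the fragments above can be assembled into one function roughly as follows. This is a sketch, not the attached 'pr111661.c' itself: the function name, the status codes, the '_OPENACC' guards and the 'acc_get_device_type' check are additions made here so that the code also builds and runs when the OpenACC pragmas are ignored. The presence checks only mean something when offloading to a device with separate memory.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENACC
# include <openacc.h>
#endif

struct t { int *a; int n; };

/* Returns 0 on success, -1 if allocation fails, otherwise the number of
   the first check that failed (1: 's' vanished early, 2: 's' still
   present after delete, 3: wrong results). */
int pr111661_sketch(void)
{
    struct t s;
    int status = 0;

    s.n = 10;
    assert(s.n > 0);
    s.a = malloc(s.n * sizeof *s.a);
    if (!s.a)
        return -1;

    /* Establish 's' itself before mapping the data 's.a' points to. */
#pragma acc enter data copyin(s)
#pragma acc enter data copyin(s.a[:s.n])

    /* Do something observable on the device. */
#pragma acc serial present(s)
    {
        for (int i = 0; i < s.n; ++i)
            s.a[i] = i * i;
    }

    /* Appending 'finalize' here is what triggers the reported behavior. */
#pragma acc exit data copyout(s.a[:s.n])

#ifdef _OPENACC
    /* Assumption of this sketch: skip presence checks on the host device. */
    const int offloading = acc_get_device_type() != acc_device_host;
    if (offloading && !acc_is_present(&s, sizeof s))
        status = 1;              /* 's' should still be alive here */
#endif

    /* Now get rid of 's'. */
#pragma acc exit data delete(s)

#ifdef _OPENACC
    if (status == 0 && offloading && acc_is_present(&s, sizeof s))
        status = 2;              /* 's' should be gone by now */
#endif

    /* Verify results, and clean up. */
    for (int i = 0; i < s.n; ++i)
        if (status == 0 && s.a[i] != i * i)
            status = 3;
    free(s.a);
    return status;
}
```

Built without '-fopenacc', the pragmas are ignored and the function simply runs on the host and returns 0. Going by the observations in this thread, adding 'finalize' on the 'copyout' line would be expected to produce status 1 with GCC when offloading.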
I've never in detail looked into the 'struct' mapping stuff -- I suppose the
problem here is not "simply" that '&s == &s.a', and that's confusing the
runtime?
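That '&s == &s.a' holds for the 'struct t' of this example is easy to confirm, since C guarantees that a pointer to a structure, suitably converted, points to its first member. A minimal check (the helper names are made up for this sketch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct t { int *a; int n; };

/* True when the struct and its first member start at the same host address. */
static bool first_member_shares_address(void)
{
    struct t s = { NULL, 0 };
    return (void *)&s == (void *)&s.a;
}

/* Same start address, but different extents: the whole struct is larger
   than the pointer member alone, because it also holds 'n'. */
static bool extents_differ(void)
{
    return sizeof(struct t) > sizeof(int *);
}
```

So a lookup keyed only on the start address cannot tell 's' and 's.a' apart; the length is the only thing that distinguishes the two.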
> QUESTION: Is the current code for C (and Fortran) correct according to the
> OpenACC specification or not?
Per my -- quick, not in-depth -- first look, I'd say the code is correct, and
thus GCC's behavior is wrong.
> FOLLOW UP QUESTION: If GCC's result is incorrect, what should the compiler
> do instead?
It has to treat the outer 's' separately from the inner 's.a'. (..., even if
they happen to have the same address -- in case that's relevant here).
How does corresponding OpenMP code behave?
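One way to pose that question is a probe along the following lines. It is only a sketch and makes no claim about what any compiler actually does: the function name and return convention are invented here, and 'map(delete:)' is used as the OpenMP counterpart of 'finalize' because it also sets the reference count to zero. The answer is only meaningful when a real offload device is in use.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
# include <omp.h>
#endif

struct t { int *a; int n; };

/* Returns 1 if 's' is still present after deleting only the data behind
   's.a', 0 if it is not, and -1 if the check could not be run. */
int omp_probe_struct_survives(void)
{
    struct t s;
    int result = -1;

    s.n = 10;
    assert(s.n > 0);
    s.a = malloc(s.n * sizeof *s.a);
    if (!s.a)
        return -1;

    /* Map the struct first, then the data its pointer member refers to. */
#pragma omp target enter data map(to: s)
#pragma omp target enter data map(to: s.a[:s.n])

    /* Remove only the pointed-to data, forcing its count to zero. */
#pragma omp target exit data map(delete: s.a[:s.n])

#ifdef _OPENMP
    result = omp_target_is_present(&s, omp_get_default_device()) ? 1 : 0;
#endif

    /* Clean up whatever is left of 's' on the device. */
#pragma omp target exit data map(delete: s)
    free(s.a);
    return result;
}
```

Without '-fopenmp' the function returns -1. With host fallback the presence query is not informative, so the result should only be read on a system that actually offloads.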