Hi! On 2023-02-28T11:56:01+0100, I wrote: > I'm currently reviewing 'gomp_copy_host2dev', 'ephemeral' in a different > context, and a question came up here; > commit r13-706-g49d1a2f91325fa8cc011149e27e5093a988b3a49 > "OpenMP: Handle descriptors in target's firstprivate [PR104949]": It doesn't seem as if we're going to address this question anytime soon, but it also isn't necessary; the worrysome condition currently cannot arise. I've therefore now documented that via 'assert (!aq)', and pushed to master branch commit e8fec6998b656dac02d4bc6c69b35a0fb5611e87 "Add caveat/safeguard to OpenMP: Handle descriptors in target's firstprivate [PR104949]", see attached. Grüße Thomas > On 2022-05-11T19:33:00+0200, Tobias Burnus wrote: >> this patch handles (for target regions) >> firstprivate(array_descriptor) >> by not only firstprivatizing the descriptor but also the data >> it points to. This is done by turning it in omp-low.cc the clause >> into >> firstprivate(descr) firstprivate(descr.data) >> and then attaching the latter to the former. That works by >> adding an 'attach' after the last firstprivate (and checking >> for it in libgomp). The attached-to device address for a >> previous (here: the first) firstprivate is obtained by returning >> the device address inside the hostaddrs[i] alias omp_arr array, >> i.e. the compiler generates: >> omp_arr.1 = &descr; /* firstprivate */ >> omp_arr.2 = descr.data; /* firstprivate */ >> omp_arr.3 = &omp_arr.1; /* attach; bias: &desc.data-&desc */ >> and libgomp then knows that the device address is in the >> pointer. > >> Note: The code is not active for OpenACC. The existing code uses, e.g., >> 'goto oacc_firstprivate' – thus, the new code would be >> partially active. I went for making it completely inactive for OpenACC >> by adding one '!is_gimple_omp_oacc'. > > ACK. > >> I bet that a deep copy would be >> also useful for OpenACC, but I have neither checked what the current >> code does nor what the OpenACC spec says about this. > > Instead of adding corresponding handling to the OpenACC 'firstprivate' > special code paths later on, I suggest that we first address known issues > with OpenACC 'firstprivate' -- which probably may largely be achieved by > in fact removing those 'goto oacc_firstprivate's and other special code > paths? For example, see > "OpenACC 'firstprivate' clause: initial value". > > That means, the following code currently isn't active for OpenACC, and > given that OpenMP 'target' doesn't do asynchronous device execution > (meaning: not in the way/implementation of OpenACC 'async'), it thus > doesn't care about the 'ephemeral' argument to 'gomp_copy_host2dev', but > still, for correctness (and once that code gets used for OpenACC): > >> OpenMP: Handle descriptors in target's firstprivate [PR104949] >> >> For allocatable/pointer arrays, a firstprivate to a device >> not only needs to privatize the descriptor but also the actual >> data. This is implemented as: >> firstprivate(x) firstprivate(x.data) attach(x [bias: &x.data-&x) >> where the address of x in device memory is saved in hostaddrs[i] >> by libgomp and the middle end actually passes hostaddrs[i]' to >> attach. > >> --- a/libgomp/target.c >> +++ b/libgomp/target.c >> @@ -1350,7 +1350,24 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep, >> gomp_copy_host2dev (devicep, aq, >> (void *) (tgt->tgt_start + tgt_size), >> (void *) hostaddrs[i], len, false, cbufp); > > Here, passing 'ephemeral <- false' is correct, as 'h <- hostaddrs[i]' > points to non-ephemeral data. > >> + /* Save device address in hostaddr to permit latter availablity >> + when doing a deep-firstprivate with pointer attach. */ >> + hostaddrs[i] = (void *) (tgt->tgt_start + tgt_size); > > Here, we modify 'hostaddrs[i]' (itself -- *not* the data that the > original 'hostaddrs[i]' points to), so the above 'gomp_copy_host2dev' > with 'ephemeral <- false' is still correct, right? > >> tgt_size += len; >> + >> + /* If followed by GOMP_MAP_ATTACH, pointer assign this >> + firstprivate to hostaddrs[i+1], which is assumed to contain a >> + device address. */ >> + if (i + 1 < mapnum >> + && (GOMP_MAP_ATTACH >> + == (typemask & get_kind (short_mapkind, kinds, i+1)))) >> + { >> + uintptr_t target = (uintptr_t) hostaddrs[i]; >> + void *devptr = *(void**) hostaddrs[i+1] + sizes[i+1]; >> + gomp_copy_host2dev (devicep, aq, devptr, &target, >> + sizeof (void *), false, cbufp); > > However, 'h <- &target' here points to data in the local frame > ('target'), which potentially goes out of scope before an asynchronous > 'gomp_copy_host2dev' has completed. Thus, don't we have to pass here > 'ephemeral <- true' instead of 'ephemeral <- false'? Or, actually > instead of '&target', pass '&hostaddrs[i]', which then again points to > non-ephemeral data? Is the latter safe to do, or are we potentially > further down the line modifying the data that '&hostaddrs[i]' points to? > (I got a bit lost in the use of 'hostaddrs[i]' here.) > >> + ++i; >> + } >> continue; >> case GOMP_MAP_FIRSTPRIVATE_INT: >> case GOMP_MAP_ZERO_LEN_ARRAY_SECTION: >> @@ -2517,6 +2534,11 @@ copy_firstprivate_data (char *tgt, size_t mapnum, void **hostaddrs, >> memcpy (tgt + tgt_size, hostaddrs[i], sizes[i]); >> hostaddrs[i] = tgt + tgt_size; >> tgt_size = tgt_size + sizes[i]; >> + if (i + 1 < mapnum && (kinds[i+1] & 0xff) == GOMP_MAP_ATTACH) >> + { >> + *(*(uintptr_t**) hostaddrs[i+1] + sizes[i+1]) = (uintptr_t) hostaddrs[i]; >> + ++i; >> + } >> } >> } > > For reference, the 'gomp_copy_host2dev' code that I highlighted above > still triggers for the following test cases only: > >> --- /dev/null >> +++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-1.f90 >> @@ -0,0 +1,33 @@ >> +! PR fortran/104949 >> + >> +implicit none (type,external) >> +integer, allocatable :: A(:) >> +A = [1,2,3,4,5,6] >> + >> +!$omp parallel firstprivate(A) >> +!$omp master >> + if (any (A /= [1,2,3,4,5])) error stop >> + A(:) = [99,88,77,66,55] >> +!$omp end master >> +!$omp end parallel >> + >> +!$omp target firstprivate(A) >> + if (any (A /= [1,2,3,4,5])) error stop >> + A(:) = [99,88,77,66,55] >> +!$omp end target >> +if (any (A /= [1,2,3,4,5])) error stop >> + >> +!$omp parallel default(firstprivate) >> +!$omp master >> + if (any (A /= [1,2,3,4,5])) error stop >> + A(:) = [99,88,77,66,55] >> +!$omp end master >> +!$omp end parallel >> +if (any (A /= [1,2,3,4,5])) error stop >> + >> +!$omp target defaultmap(firstprivate) >> + if (any (A /= [1,2,3,4,5])) error stop >> + A(:) = [99,88,77,66,55] >> +!$omp end target >> +if (any (A /= [1,2,3,4,5])) error stop >> +end > >> --- /dev/null >> +++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-2.f90 >> @@ -0,0 +1,113 @@ >> +! PR fortran/104949 >> + >> +module m >> +use omp_lib >> +implicit none (type, external) >> + >> +contains >> +subroutine one >> + integer, allocatable :: x(:) >> + integer :: i >> + >> + do i = 1, omp_get_num_devices() + 1 >> + !$omp target firstprivate(x) >> + if (allocated(x)) error stop >> + !$omp end target >> + if (allocated(x)) error stop >> + end do >> + >> + do i = 1, omp_get_num_devices() + 1 >> + !$omp target firstprivate(x, i) >> + if (allocated(x)) error stop >> + x = [10,20,30,40] + i >> + if (any (x /= [10,20,30,40] + i)) error stop >> + ! This leaks memory! >> + ! deallocate(x) >> + !$omp end target >> + if (allocated(x)) error stop >> + end do >> + >> + x = [1,2,3,4] >> + >> + do i = 1, omp_get_num_devices() + 1 >> + !$omp target firstprivate(x, i) >> + if (i <= 0) error stop >> + if (.not.allocated(x)) error stop >> + if (size(x) /= 4) error stop >> + if (lbound(x,1) /= 1) error stop >> + if (any (x /= [1,2,3,4])) error stop >> + ! no reallocation, just malloced + assignment >> + x = [10,20,30,40] + i >> + if (any (x /= [10,20,30,40] + i)) error stop >> + ! This leaks memory! >> + ! deallocate(x) >> + !$omp end target >> + if (.not.allocated(x)) error stop >> + if (size(x) /= 4) error stop >> + if (lbound(x,1) /= 1) error stop >> + if (any (x /= [1,2,3,4])) error stop >> + end do >> + deallocate(x) >> +end >> + >> +subroutine two >> + character(len=:), allocatable :: x(:) >> + character(len=5) :: str >> + integer :: i >> + >> + str = "abcde" ! work around for PR fortran/91544 >> + do i = 1, omp_get_num_devices() + 1 >> + !$omp target firstprivate(x) >> + if (allocated(x)) error stop >> + !$omp end target >> + if (allocated(x)) error stop >> + end do >> + >> + do i = 1, omp_get_num_devices() + 1 >> + !$omp target firstprivate(x, i) >> + if (allocated(x)) error stop >> + ! no reallocation, just malloced + assignment >> + x = [character(len=2+i) :: str,"fhji","klmno"] >> + if (len(x) /= 2+i) error stop >> + if (any (x /= [character(len=2+i) :: str,"fhji","klmno"])) error stop >> + ! This leaks memory! >> + ! deallocate(x) >> + !$omp end target >> + if (allocated(x)) error stop >> + end do >> + >> + x = [character(len=4) :: "ABCDE","FHJI","KLMNO"] >> + >> + do i = 1, omp_get_num_devices() + 1 >> + !$omp target firstprivate(x, i) >> + if (i <= 0) error stop >> + if (.not.allocated(x)) error stop >> + if (size(x) /= 3) error stop >> + if (lbound(x,1) /= 1) error stop >> + if (len(x) /= 4) error stop >> + if (any (x /= [character(len=4) :: "ABCDE","FHJI","KLMNO"])) error stop >> + !! Reallocation runs into the issue PR fortran/105538 >> + !! >> + !!x = [character(len=2+i) :: str,"fhji","klmno"] >> + !!if (len(x) /= 2+i) error stop >> + !!if (any (x /= [character(len=2+i) :: str,"fhji","klmno"])) error stop >> + !! This leaks memory! >> + !! deallocate(x) >> + ! Just assign: >> + x = [character(len=4) :: "abcde","fhji","klmno"] >> + if (any (x /= [character(len=4) :: "abcde","fhji","klmno"])) error stop >> + !$omp end target >> + if (.not.allocated(x)) error stop >> + if (lbound(x,1) /= 1) error stop >> + if (size(x) /= 3) error stop >> + if (len(x) /= 4) error stop >> + if (any (x /= [character(len=4) :: "ABCDE","FHJI","KLMNO"])) error stop >> + end do >> + deallocate(x) >> +end >> +end module m >> + >> +use m >> +call one >> +call two >> +end > >> --- /dev/null >> +++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-3.f90 >> @@ -0,0 +1,24 @@ >> +implicit none >> + integer, allocatable :: x(:) >> + x = [1,2,3,4] >> + call foo(x) >> + if (any (x /= [1,2,3,4])) error stop >> + call foo() >> +contains >> +subroutine foo(c) >> + integer, allocatable, optional :: c(:) >> + logical :: is_present >> + is_present = present (c) >> + !$omp target firstprivate(c) >> + if (is_present) then >> + if (.not. allocated(c)) error stop >> + if (any (c /= [1,2,3,4])) error stop >> + c = [99,88,77,66] >> + if (any (c /= [99,88,77,66])) error stop >> + end if >> + !$omp end target >> + if (is_present) then >> + if (any (c /= [1,2,3,4])) error stop >> + end if >> +end >> +end > > > Grüße > Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955