From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Schwinge
CC: Chung-Lin Tang, Julian Brown
Subject: Re: [PATCH, OpenACC 2.5, libgomp] Add *_async versions of runtime library API functions
In-Reply-To: <87sg1s9s9l.fsf@euler.schwinge.homeip.net>
References: <4f2750a1-9935-6629-b7fd-ce6280f902c0@mentor.com> <87sg1s9s9l.fsf@euler.schwinge.homeip.net>
Date: Tue, 27 Jul 2021 11:22:45 +0200
Message-ID: <871r7kun4q.fsf@euler.schwinge.homeip.net>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
List-Id: Gcc-patches mailing list

--=-=-=
Content-Type: text/plain; charset="utf-8"

Hi!

On 2021-06-08T19:32:22+0200, I wrote:
> Hi Chung-Lin!
>
> ;-) It's been a while:
>
> On 2018-09-10T23:04:18+0800, Chung-Lin Tang wrote:
>> * testsuite/libgomp.oacc-c-c++-common/lib-94.c: New test.
>> * testsuite/libgomp.oacc-c-c++-common/lib-95.c: New test.
>> * testsuite/libgomp.oacc-fortran/lib-16.f90: New test.
>
> Do you happen to remember why in these testcases you're using the
> following pattern:

Apparently not ;-) -- no answer/objection, I've thus now pushed "Fix
OpenACC 'async'/'wait' issues in
'libgomp.oacc-c-c++-common/lib-{94,95}.c',
'libgomp.oacc-fortran/lib-16{,-2}.f90'" to master branch in commit
599e275d7e0b3fb79ff704d4cb2d8fdb0231116e, see attached.


Regards
 Thomas


>> --- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-94.c (nonexistent)
>> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/lib-94.c (working copy)
>> @@ -0,0 +1,42 @@
>> +/* { dg-do run } */
>> +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
>> +
>> +#include
>> +#include
>> +#include
>> +
>> +int
>> +main (int argc, char **argv)
>> +{
>> +  const int N = 256;
>> +  int i;
>> +  int async = 8;
>> +  unsigned char *h;
>> +
>> +  h = (unsigned char *) malloc (N);
>> +
>> +  for (i = 0; i < N; i++)
>> +    {
>> +      h[i] = i;
>> +    }
>> +
>> +  acc_copyin_async (h, N, async);
>> +
>> +  memset (h, 0, N);
>> +
>> +  acc_wait (async);
>
> You first issue 'acc_copyin_async', then (while potentially that's still
> accessing 'h') already 'memset' 'h' (potentially overwriting data that
> 'acc_copyin_async' is still working on), and only then 'acc_wait'?
>
> My understanding of OpenACC would swap 'memset' and 'acc_wait', but maybe
> you have a specific reason to do it in this way?
>
> In particular, the GCC nvptx offloading implementation "doesn't seem to
> care" (as discussed elsewhere; 'OpenACC "ephemeral" asynchronous
> host-to-device copies', etc.) -- but I suppose if you meant to test such
> implementation traits here, you'd have commented that?
>
>> +
>> +  acc_copyout_async (h, N, async + 1);
>> +
>> +  acc_wait (async + 1);
>> +
>> +  for (i = 0; i < N; i++)
>> +    {
>> +      if (h[i] != i)
>> +        abort ();
>> +    }
>> +
>> +  free (h);
>> +
>> +  return 0;
>> +}
>
>> --- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c (nonexistent)
>> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c (working copy)
>> @@ -0,0 +1,45 @@
>> +/* { dg-do run } */
>> +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */
>> +
>> +#include
>> +#include
>> +#include
>> +
>> +int
>> +main (int argc, char **argv)
>> +{
>> +  const int N = 256;
>> +  int i, q = 5;
>> +  unsigned char *h, *g;
>> +  void *d;
>> +
>> +  h = (unsigned char *) malloc (N);
>> +  g = (unsigned char *) malloc (N);
>> +  for (i = 0; i < N; i++)
>> +    {
>> +      g[i] = i;
>> +    }
>> +
>> +  acc_create_async (h, N, q);
>> +
>> +  acc_memcpy_to_device_async (acc_deviceptr (h), g, N, q);
>> +  memset (&h[0], 0, N);
>> +
>> +  acc_wait (q);
>
> Similar here.
>
>> +  acc_update_self_async (h, N, q + 1);
>> +  acc_delete_async (h, N, q + 1);
>> +
>> +  acc_wait (q + 1);
>> +
>> +  for (i = 0; i < N; i++)
>> +    {
>> +      if (h[i] != i)
>> +        abort ();
>> +    }
>> +
>> +  free (h);
>> +  free (g);
>> +
>> +  return 0;
>> +}
>
>> --- libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 (nonexistent)
>> +++ libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 (working copy)
>
> (Later also similarly copied into 'libgomp.oacc-fortran/lib-16-2.f90'.)
>
> Similar:
>
>> @@ -0,0 +1,57 @@
>> +! { dg-do run }
>> +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } }
>> +
>> +program main
>> +  use openacc
>> +  implicit none
>> +
>> +  integer, parameter :: N = 256
>> +  integer, allocatable :: h(:)
>> +  integer :: i
>> +  integer :: async = 5
>> +
>> +  allocate (h(N))
>> +
>> +  do i = 1, N
>> +    h(i) = i
>> +  end do
>> +
>> +  call acc_copyin (h)
>> +
>> +  do i = 1, N
>> +    h(i) = i + i
>> +  end do
>> +
>> +  call acc_update_device_async (h, sizeof (h), async)
>> +
>> +  if (acc_is_present (h) .neqv. .TRUE.) call abort
>
> Don't we need 'acc_wait' here (while 'acc_update_device_async' may still
> be reading from 'h'), before overwriting 'h' here:
>
>> +
>> +  h(:) = 0
>> +
>> +  call acc_copyout_async (h, sizeof (h), async)
>> +
>> +  call acc_wait (async)
>> +
>> +  do i = 1, N
>> +    if (h(i) /= i + i) call abort
>> +  end do
>> +
>> +  call acc_copyin (h, sizeof (h))
>> +
>> +  h(:) = 0
>> +
>> +  call acc_update_self_async (h, sizeof (h), async)
>> +
>> +  if (acc_is_present (h) .neqv. .TRUE.) call abort
>
> Don't we need 'acc_wait' here (to make sure we finish device to host copy
> of 'h'), before evaluating 'h' here:
>
>> +
>> +  do i = 1, N
>> +    if (h(i) /= i + i) call abort
>> +  end do
>> +
>> +  call acc_delete_async (h, async)
>> +
>> +  call acc_wait (async)
>> +
>> +  if (acc_is_present (h) .neqv. .FALSE.) call abort
>> +
>> +end program
>
> Julian has patches for most of these (as part of other commits).

-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company; Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München; Register court: München, HRB 106955

--=-=-=
Content-Type: text/x-diff
Content-Disposition: inline; filename="0001-Fix-OpenACC-async-wait-issues-in-libgomp.oacc-c-c-co.patch"

>From 599e275d7e0b3fb79ff704d4cb2d8fdb0231116e Mon Sep 17 00:00:00 2001
From: Thomas Schwinge
Date: Tue, 8 Jun 2021 19:32:22 +0200
Subject: [PATCH] Fix OpenACC 'async'/'wait' issues in
 'libgomp.oacc-c-c++-common/lib-{94,95}.c',
 'libgomp.oacc-fortran/lib-16{,-2}.f90'

Fix-up for r265842 (commit 58168bbf6f8fb456280cca13343a498ad94878c7)
"[OpenACC 2.5, libgomp] Add *_async versions of runtime library API functions".

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/lib-94.c: Fix OpenACC
	'async'/'wait' issue.
	* testsuite/libgomp.oacc-c-c++-common/lib-95.c: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-16-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-16.f90: Likewise.

Co-Authored-By: Julian Brown
---
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-94.c | 4 ++--
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c | 3 ++-
 libgomp/testsuite/libgomp.oacc-fortran/lib-16-2.f90  | 4 ++++
 libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90    | 4 ++++
 4 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-94.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-94.c
index 54497237b0c..baa3ac83f04 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-94.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-94.c
@@ -22,10 +22,10 @@ main (int argc, char **argv)
 
   acc_copyin_async (h, N, async);
 
-  memset (h, 0, N);
-
   acc_wait (async);
 
+  memset (h, 0, N);
+
   acc_copyout_async (h, N, async + 1);
 
   acc_wait (async + 1);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c
index 85b238d78c8..842fb849e79 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-95.c
@@ -23,10 +23,11 @@ main (int argc, char **argv)
   acc_create_async (h, N, q);
 
   acc_memcpy_to_device_async (acc_deviceptr (h), g, N, q);
-  memset (&h[0], 0, N);
 
   acc_wait (q);
 
+  memset (h, 0, N);
+
   acc_update_self_async (h, N, q + 1);
   acc_delete_async (h, N, q + 1);
 
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-16-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/lib-16-2.f90
index ddd557d3be0..2be75dca98c 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/lib-16-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-16-2.f90
@@ -27,6 +27,8 @@ program main
 
   if (acc_is_present (h) .neqv. .TRUE.) stop 1
 
+  call acc_wait (async)
+
   h(:) = 0
 
   call acc_copyout_async (h, sizeof (h), async)
@@ -45,6 +47,8 @@ program main
 
   if (acc_is_present (h) .neqv. .TRUE.) stop 3
 
+  call acc_wait (async)
+
   do i = 1, N
     if (h(i) /= i + i) stop 4
   end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 b/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
index ccd1ce6ee18..fae0d1031ed 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
@@ -27,6 +27,8 @@ program main
 
   if (acc_is_present (h) .neqv. .TRUE.) stop 1
 
+  call acc_wait (async)
+
   h(:) = 0
 
   call acc_copyout_async (h, sizeof (h), async)
@@ -45,6 +47,8 @@ program main
 
   if (acc_is_present (h) .neqv. .TRUE.) stop 3
 
+  call acc_wait (async)
+
   do i = 1, N
     if (h(i) /= i + i) stop 4
   end do
-- 
2.30.2


--=-=-=--