From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 87690 invoked by alias); 7 Dec 2018 15:39:14 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 87662 invoked by uid 89); 7 Dec 2018 15:39:13 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=sk:select_ X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 07 Dec 2018 15:39:08 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=svr-ies-mbx-01.mgc.mentorg.com) by relay1.mentorg.com with esmtps (TLSv1.2:ECDHE-RSA-AES256-SHA384:256) id 1gVIDa-0006tH-2N from Thomas_Schwinge@mentor.com for gcc-patches@gcc.gnu.org; Fri, 07 Dec 2018 07:39:06 -0800 Received: from hertz.schwinge.homeip.net (137.202.0.90) by svr-ies-mbx-01.mgc.mentorg.com (139.181.222.1) with Microsoft SMTP Server (TLS) id 15.0.1320.4; Fri, 7 Dec 2018 15:39:02 +0000 From: Thomas Schwinge To: Chung-Lin Tang , Subject: [PR88407] [OpenACC] Correctly handle unseen async-arguments (was: [PATCH 5/6, OpenACC, libgomp] Async re-work, C/C++ testsuite changes) In-Reply-To: <8086c63b-f729-891b-3d21-76871d360734@mentor.com> References: <8086c63b-f729-891b-3d21-76871d360734@mentor.com> User-Agent: Notmuch/0.9-101-g81dad07 (http://notmuchmail.org) Emacs/25.2.2 (x86_64-pc-linux-gnu) Date: Fri, 07 Dec 2018 15:39:00 -0000 Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-SW-Source: 2018-12/txt/msg00470.txt.bz2 Hi Chung-Lin! On Tue, 25 Sep 2018 21:11:42 +0800, Chung-Lin Tang wrote: > These are the testsuite/libgomp.oacc-c-c++-common/* changes. > diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-71.c b/libgo= mp/testsuite/libgomp.oacc-c-c++-common/lib-71.c > index c85e824..6afe2a0 100644 > --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-71.c > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-71.c > @@ -92,16 +92,22 @@ main (int argc, char **argv) > abort (); > } >=20=20 > - fprintf (stderr, "CheCKpOInT\n"); > - if (acc_async_test (1) !=3D 0) > + if (acc_async_test (0) !=3D 0) > { > fprintf (stderr, "asynchronous operation not running\n"); > abort (); > } >=20=20 > + /* Test unseen async number. */ > + if (acc_async_test (1) !=3D 1) > + { > + fprintf (stderr, "acc_async_test failed on unseen number\n"); > + abort (); > + } > + > sleep ((int) (dtime / 1000.0f) + 1); >=20=20 > - if (acc_async_test (1) !=3D 1) > + if (acc_async_test (0) !=3D 1) > { > fprintf (stderr, "found asynchronous operation still running\n"); > abort (); > @@ -116,7 +122,3 @@ main (int argc, char **argv) >=20=20 > return 0; > } > - > -/* { dg-output "CheCKpOInT(\n|\r\n|\r).*" } */ > -/* { dg-output "unknown async \[0-9\]+" } */ > -/* { dg-shouldfail "" } */ That's now correct OpenACC usage, but you've now made this one essentially the same as "libgomp.oacc-c-c++-common/lib-69.c". > diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-77.c b/libgo= mp/testsuite/libgomp.oacc-c-c++-common/lib-77.c > index f4f196d..2821f88 100644 > --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-77.c > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-77.c > @@ -111,7 +111,7 @@ main (int argc, char **argv) >=20=20 > start_timer (0); >=20=20 > - acc_wait (1); > + acc_wait (0); >=20=20 > atime =3D stop_timer (0); >=20=20 > @@ -132,7 +132,3 @@ main (int argc, char **argv) >=20=20 > return 0; > } > - > -/* { dg-output "CheCKpOInT(\n|\r\n|\r).*" } */ > -/* { dg-output "unknown async \[0-9\]+" } */ > -/* { dg-shouldfail "" } */ Again, that's now correct OpenACC usage, but you've now made this one essentially the same as "libgomp.oacc-c-c++-common/lib-74.c". So, confused about the intended behavior, I've asked the OpenACC committee to clarify, and filed "[OpenACC] Correctly handle unseen async-arguments". Assuming this gets clarified in the way I think it should, I suggest the following. Any comments? commit a34177a6ce637da8060394f69358f25bce90a8be Author: Thomas Schwinge Date: Fri Dec 7 16:36:53 2018 +0100 [PR88407] [OpenACC] Correctly handle unseen async-arguments =20=20=20=20 ... which turn the operation into a no-op. =20=20=20=20 libgomp/ * plugin/plugin-nvptx.c (nvptx_async_test, nvptx_wait) (nvptx_wait_async): Unseen async-argument is a no-op. * testsuite/libgomp.oacc-c-c++-common/async_queue-1.c: Update. * testsuite/libgomp.oacc-c-c++-common/data-2-lib.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-79.c: Likewise. * testsuite/libgomp.oacc-fortran/lib-12.f90: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-71.c: Merge into... * testsuite/libgomp.oacc-c-c++-common/lib-69.c: ... this. Upda= te. * testsuite/libgomp.oacc-c-c++-common/lib-77.c: Merge into... * testsuite/libgomp.oacc-c-c++-common/lib-74.c: ... this. Upda= te --- libgomp/plugin/plugin-nvptx.c | 13 +- .../libgomp.oacc-c-c++-common/async_queue-1.c | 30 +++++ .../libgomp.oacc-c-c++-common/data-2-lib.c | 2 + .../testsuite/libgomp.oacc-c-c++-common/data-2.c | 2 + .../testsuite/libgomp.oacc-c-c++-common/lib-69.c | 7 ++ .../testsuite/libgomp.oacc-c-c++-common/lib-71.c | 122 ------------------ .../testsuite/libgomp.oacc-c-c++-common/lib-74.c | 4 + .../testsuite/libgomp.oacc-c-c++-common/lib-77.c | 138 -----------------= ---- .../testsuite/libgomp.oacc-c-c++-common/lib-79.c | 24 ++++ libgomp/testsuite/libgomp.oacc-fortran/lib-12.f90 | 5 + 10 files changed, 80 insertions(+), 267 deletions(-) diff --git libgomp/plugin/plugin-nvptx.c libgomp/plugin/plugin-nvptx.c index 7d0d38e0c2e1..6f9b16634b10 100644 --- libgomp/plugin/plugin-nvptx.c +++ libgomp/plugin/plugin-nvptx.c @@ -1539,9 +1539,8 @@ nvptx_async_test (int async) struct ptx_stream *s; =20 s =3D select_stream_for_async (async, pthread_self (), false, NULL); - if (!s) - GOMP_PLUGIN_fatal ("unknown async %d", async); + return 1; =20 r =3D CUDA_CALL_NOCHECK (cuStreamQuery, s->stream); if (r =3D=3D CUDA_SUCCESS) @@ -1596,7 +1595,7 @@ nvptx_wait (int async) =20 s =3D select_stream_for_async (async, pthread_self (), false, NULL); if (!s) - GOMP_PLUGIN_fatal ("unknown async %d", async); + return; =20 CUDA_CALL_ASSERT (cuStreamSynchronize, s->stream); =20 @@ -1610,14 +1609,14 @@ nvptx_wait_async (int async1, int async2) struct ptx_stream *s1, *s2; pthread_t self =3D pthread_self (); =20 + s1 =3D select_stream_for_async (async1, self, false, NULL); + if (!s1) + return; + /* The stream that is waiting (rather than being waited for) doesn't necessarily have to exist already. */ s2 =3D select_stream_for_async (async2, self, true, NULL); =20 - s1 =3D select_stream_for_async (async1, self, false, NULL); - if (!s1) - GOMP_PLUGIN_fatal ("invalid async 1\n"); - if (s1 =3D=3D s2) GOMP_PLUGIN_fatal ("identical parameters"); =20 diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/async_queue-1.c libg= omp/testsuite/libgomp.oacc-c-c++-common/async_queue-1.c index 48e1846a36e3..544b19fe6635 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/async_queue-1.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/async_queue-1.c @@ -41,6 +41,36 @@ int main(void) assert (queues[i].cuda_stream =3D=3D NULL); } =20 + /* No-ops still don't initialize them. */ + { + size_t i =3D 0; + /* Find the first non-special async-argument. */ + while (queues[i].async < 0) + ++i; + assert (i < queues_n); + +#pragma acc wait(queues[i].async) // no-op + + ++i; + assert (i < queues_n); +#pragma acc parallel wait(queues[i].async) // no-op + ; + + ++i; + assert (i < queues_n); + acc_wait(queues[i].async); // no-op + + i +=3D 2; + assert (i < queues_n); + acc_wait_async(queues[i - 1].async, queues[i].async); // no-op, and as= ync queue "i" does not get set up + + for (size_t i =3D 0; i < queues_n; ++i) + { + queues[i].cuda_stream =3D acc_get_cuda_stream (queues[i].async); + assert (queues[i].cuda_stream =3D=3D NULL); + } + } + for (size_t i =3D 0; i < queues_n; ++i) { /* Use the queue to initialize it. */ diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/data-2-lib.c libgomp= /testsuite/libgomp.oacc-c-c++-common/data-2-lib.c index 2ddfa7d4a01b..806c2f405a01 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/data-2-lib.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/data-2-lib.c @@ -65,6 +65,8 @@ main (int argc, char **argv) b[i] =3D a[i]; =20 acc_wait (1); + /* Test unseen async-argument. */ + acc_wait (10); =20 acc_memcpy_from_device (a, d_a, nbytes); acc_memcpy_from_device (b, d_b, nbytes); diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/data-2.c libgomp/tes= tsuite/libgomp.oacc-c-c++-common/data-2.c index 0c6abe69dc17..b552b94529d8 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/data-2.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/data-2.c @@ -85,6 +85,8 @@ main (int argc, char **argv) =20 #pragma acc exit data copyout (a[0:N]) copyout (b[0:N]) wait (1) async (1) #pragma acc wait (1) + /* Test unseen async-argument. */ +#pragma acc wait (10) =20 for (i =3D 0; i < N; i++) { diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/lib-69.c libgomp/tes= tsuite/libgomp.oacc-c-c++-common/lib-69.c index 5462f1253522..c10bebaab136 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-69.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/lib-69.c @@ -103,6 +103,13 @@ main (int argc, char **argv) abort (); } =20 + /* Test unseen async-argument. */ + if (acc_async_test (1) !=3D 1) + { + fprintf (stderr, "acc_async_test failed on unseen async-argument\n"); + abort (); + } + sleep (1); =20 if (acc_async_test (0) !=3D 1) diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/lib-71.c libgomp/tes= tsuite/libgomp.oacc-c-c++-common/lib-71.c deleted file mode 100644 index c85e82459554..000000000000 --- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-71.c +++ /dev/null @@ -1,122 +0,0 @@ -/* { dg-do run { target openacc_nvidia_accel_selected } } */ -/* { dg-additional-options "-lcuda" } */ - -#include -#include -#include -#include - -int -main (int argc, char **argv) -{ - CUdevice dev; - CUfunction delay; - CUmodule module; - CUresult r; - CUstream stream; - unsigned long *a, *d_a, dticks; - int nbytes; - float dtime; - void *kargs[2]; - int clkrate; - int devnum, nprocs; - - acc_init (acc_device_nvidia); - - devnum =3D acc_get_device_num (acc_device_nvidia); - - r =3D cuDeviceGet (&dev, devnum); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuDeviceGet failed: %d\n", r); - abort (); - } - - r =3D - cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUN= T, - dev); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r); - abort (); - } - - r =3D cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, de= v); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r); - abort (); - } - - r =3D cuModuleLoad (&module, "subr.ptx"); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuModuleLoad failed: %d\n", r); - abort (); - } - - r =3D cuModuleGetFunction (&delay, module, "delay"); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuModuleGetFunction failed: %d\n", r); - abort (); - } - - nbytes =3D nprocs * sizeof (unsigned long); - - dtime =3D 200.0; - - dticks =3D (unsigned long) (dtime * clkrate); - - a =3D (unsigned long *) malloc (nbytes); - d_a =3D (unsigned long *) acc_malloc (nbytes); - - acc_map_data (a, d_a, nbytes); - - kargs[0] =3D (void *) &d_a; - kargs[1] =3D (void *) &dticks; - - r =3D cuStreamCreate (&stream, CU_STREAM_DEFAULT); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuStreamCreate failed: %d\n", r); - abort (); - } - - acc_set_cuda_stream (0, stream); - - r =3D cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, stream, kargs, 0); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuLaunchKernel failed: %d\n", r); - abort (); - } - - fprintf (stderr, "CheCKpOInT\n"); - if (acc_async_test (1) !=3D 0) - { - fprintf (stderr, "asynchronous operation not running\n"); - abort (); - } - - sleep ((int) (dtime / 1000.0f) + 1); - - if (acc_async_test (1) !=3D 1) - { - fprintf (stderr, "found asynchronous operation still running\n"); - abort (); - } - - acc_unmap_data (a); - - free (a); - acc_free (d_a); - - acc_shutdown (acc_device_nvidia); - - return 0; -} - -/* { dg-output "CheCKpOInT(\n|\r\n|\r).*" } */ -/* { dg-output "unknown async \[0-9\]+" } */ -/* { dg-shouldfail "" } */ diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/lib-74.c libgomp/tes= tsuite/libgomp.oacc-c-c++-common/lib-74.c index 0726ee420c3f..0efcf0d52229 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-74.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/lib-74.c @@ -103,6 +103,8 @@ main (int argc, char **argv) } =20 acc_wait (0); + /* Test unseen async-argument. */ + acc_wait (1); =20 atime =3D stop_timer (0); =20 @@ -115,6 +117,8 @@ main (int argc, char **argv) start_timer (0); =20 acc_wait (0); + /* Test unseen async-argument. */ + acc_wait (1); =20 atime =3D stop_timer (0); =20 diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/lib-77.c libgomp/tes= tsuite/libgomp.oacc-c-c++-common/lib-77.c deleted file mode 100644 index f4f196def3b7..000000000000 --- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-77.c +++ /dev/null @@ -1,138 +0,0 @@ -/* { dg-do run { target openacc_nvidia_accel_selected } } */ -/* { dg-additional-options "-lcuda" } */ - -#include -#include -#include -#include -#include -#include "timer.h" - -int -main (int argc, char **argv) -{ - CUdevice dev; - CUfunction delay; - CUmodule module; - CUresult r; - CUstream stream; - unsigned long *a, *d_a, dticks; - int nbytes; - float atime, dtime; - void *kargs[2]; - int clkrate; - int devnum, nprocs; - - acc_init (acc_device_nvidia); - - devnum =3D acc_get_device_num (acc_device_nvidia); - - r =3D cuDeviceGet (&dev, devnum); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuDeviceGet failed: %d\n", r); - abort (); - } - - r =3D - cuDeviceGetAttribute (&nprocs, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUN= T, - dev); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r); - abort (); - } - - r =3D cuDeviceGetAttribute (&clkrate, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, de= v); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuDeviceGetAttribute failed: %d\n", r); - abort (); - } - - r =3D cuModuleLoad (&module, "subr.ptx"); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuModuleLoad failed: %d\n", r); - abort (); - } - - r =3D cuModuleGetFunction (&delay, module, "delay"); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuModuleGetFunction failed: %d\n", r); - abort (); - } - - nbytes =3D nprocs * sizeof (unsigned long); - - dtime =3D 200.0; - - dticks =3D (unsigned long) (dtime * clkrate); - - a =3D (unsigned long *) malloc (nbytes); - d_a =3D (unsigned long *) acc_malloc (nbytes); - - acc_map_data (a, d_a, nbytes); - - kargs[0] =3D (void *) &d_a; - kargs[1] =3D (void *) &dticks; - - r =3D cuStreamCreate (&stream, CU_STREAM_DEFAULT); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuStreamCreate failed: %d\n", r); - abort (); - } - - acc_set_cuda_stream (0, stream); - - init_timers (1); - - start_timer (0); - - r =3D cuLaunchKernel (delay, 1, 1, 1, 1, 1, 1, 0, stream, kargs, 0); - if (r !=3D CUDA_SUCCESS) - { - fprintf (stderr, "cuLaunchKernel failed: %d\n", r); - abort (); - } - - fprintf (stderr, "CheCKpOInT\n"); - acc_wait (1); - - atime =3D stop_timer (0); - - if (atime < dtime) - { - fprintf (stderr, "actual time < delay time\n"); - abort (); - } - - start_timer (0); - - acc_wait (1); - - atime =3D stop_timer (0); - - if (0.010 < atime) - { - fprintf (stderr, "actual time < delay time\n"); - abort (); - } - - acc_unmap_data (a); - - fini_timers (); - - free (a); - acc_free (d_a); - - acc_shutdown (acc_device_nvidia); - - return 0; -} - -/* { dg-output "CheCKpOInT(\n|\r\n|\r).*" } */ -/* { dg-output "unknown async \[0-9\]+" } */ -/* { dg-shouldfail "" } */ diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/lib-79.c libgomp/tes= tsuite/libgomp.oacc-c-c++-common/lib-79.c index ef3df13ebc91..b2e2687e4e76 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-79.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/lib-79.c @@ -122,6 +122,13 @@ main (int argc, char **argv) } } =20 + if (acc_async_test (0) !=3D 0) + abort (); + + /* Test unseen async-argument. */ + if (acc_async_test (1) !=3D 1) + abort (); + acc_wait_async (0, 1); =20 if (acc_async_test (0) !=3D 0) @@ -130,6 +137,23 @@ main (int argc, char **argv) if (acc_async_test (1) !=3D 0) abort (); =20 + /* Test unseen async-argument. */ + { + if (acc_async_test (2) !=3D 1) + abort (); + + acc_wait_async (2, 1); + + if (acc_async_test (0) !=3D 0) + abort (); + + if (acc_async_test (1) !=3D 0) + abort (); + + if (acc_async_test (2) !=3D 1) + abort (); + } + acc_wait (1); =20 atime =3D stop_timer (0); diff --git libgomp/testsuite/libgomp.oacc-fortran/lib-12.f90 libgomp/testsu= ite/libgomp.oacc-fortran/lib-12.f90 index 6912f67d444e..4cf62f2728ad 100644 --- libgomp/testsuite/libgomp.oacc-fortran/lib-12.f90 +++ libgomp/testsuite/libgomp.oacc-fortran/lib-12.f90 @@ -17,9 +17,14 @@ program main =20 call acc_wait_async (0, 1) =20 + ! Test unseen async-argument. + if (acc_async_test (2) .neqv. .TRUE.) call abort + call acc_wait_async (2, 1) + call acc_wait (1) =20 if (acc_async_test (0) .neqv. .TRUE.) call abort if (acc_async_test (1) .neqv. .TRUE.) call abort + if (acc_async_test (2) .neqv. .TRUE.) call abort =20 end program Gr=C3=BC=C3=9Fe Thomas