Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
From: Thomas Schwinge
To: Chung-Lin Tang
CC: Tom de Vries
Subject: Re: [PATCH 6/6, OpenACC, libgomp] Async re-work, nvptx changes
Date: Thu, 06 Dec 2018 20:57:00 -0000

Hi Chung-Lin!

On Tue, 25 Sep 2018 21:11:58 +0800, Chung-Lin Tang wrote:
> Hi Tom,
> this patch removes large portions of plugin/plugin-nvptx.c, since a lot of it is
> now in oacc-async.c now.
> The new code is essentially a NVPTX/CUDA-specific implementation
> of the new-style goacc_asyncqueues.

> --- a/libgomp/plugin/plugin-nvptx.c
> +++ b/libgomp/plugin/plugin-nvptx.c

> +struct goacc_asyncqueue *
> +GOMP_OFFLOAD_openacc_async_construct (void)
> +{
> +  struct goacc_asyncqueue *aq
> +    = GOMP_PLUGIN_malloc (sizeof (struct goacc_asyncqueue));
> +  aq->cuda_stream = NULL;
> +  CUDA_CALL_ASSERT (cuStreamCreate, &aq->cuda_stream, CU_STREAM_DEFAULT);

Curiously (this was the same in the code before): does this have to be
"CU_STREAM_DEFAULT" instead of "CU_STREAM_NON_BLOCKING", because we want
to block anything from running in parallel with "acc_async_sync" GPU
kernels that use the "NULL" stream?  (Not asking you to change this
now, but I wonder if this is overly strict?)

> +  if (aq->cuda_stream == NULL)
> +    GOMP_PLUGIN_fatal ("CUDA stream create NULL\n");

Can this actually happen, given the "CUDA_CALL_ASSERT" usage above?

> +  CUDA_CALL_ASSERT (cuStreamSynchronize, aq->cuda_stream);

Why is the synchronization needed here?

> +  return aq;
> +}


Grüße
 Thomas