Subject: Re: [PATCH 6/6, OpenACC, libgomp] Async re-work, nvptx changes
To: Thomas Schwinge, Chung-Lin Tang
CC: Tom de Vries
From: Chung-Lin Tang
Date: Mon, 10 Dec 2018 10:02:00 -0000

On 2018/12/7 04:57 AM, Thomas Schwinge wrote:
>> --- a/libgomp/plugin/plugin-nvptx.c
>> +++ b/libgomp/plugin/plugin-nvptx.c
>
>> +struct goacc_asyncqueue *
>> +GOMP_OFFLOAD_openacc_async_construct (void)
>> +{
>> +  struct goacc_asyncqueue *aq
>> +    = GOMP_PLUGIN_malloc (sizeof (struct goacc_asyncqueue));
>> +  aq->cuda_stream = NULL;
>> +  CUDA_CALL_ASSERT (cuStreamCreate, &aq->cuda_stream, CU_STREAM_DEFAULT);
>
> Curiously (this was the same in the code before): does this have to be
> "CU_STREAM_DEFAULT" instead of "CU_STREAM_NON_BLOCKING", because we want
> to block anything from running in parallel with "acc_async_sync" GPU
> kernels, that use the "NULL" stream?  (Not asking you to change this now,
> but I wonder if this is overly strict?)

IIUC, this non-blocking only pertains to the "Legacy Default Stream" in CUDA, which we're
pretty much ignoring; we should be using the newer per-thread default stream model.
We could review this issue later.

>> +  if (aq->cuda_stream == NULL)
>> +    GOMP_PLUGIN_fatal ("CUDA stream create NULL\n");
>
> Can this actually happen, given the "CUDA_CALL_ASSERT" usage above?

Hmm, yeah I think this is superfluous too...

>> +  CUDA_CALL_ASSERT (cuStreamSynchronize, aq->cuda_stream);
>
> Why is the synchronization needed here?

I don't remember, could likely be something added during debug. I'll remove this
and test if things are okay.

Thanks,
Chung-Lin
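
For reference, with the two points conceded above applied (no NULL check after CUDA_CALL_ASSERT, no cuStreamSynchronize), the constructor might look roughly like the sketch below. This is not the committed patch. To keep it compilable without CUDA or libgomp, the CUstream type, cuStreamCreate, and CUDA_CALL_ASSERT are local stand-ins (the real ones come from <cuda.h> and plugin-nvptx.c), plain malloc replaces GOMP_PLUGIN_malloc, the struct is assumed to hold only the cuda_stream member shown in the quoted hunk, and the trailing "return aq;" is assumed from the function's return type.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-ins for the CUDA driver API (assumption: the real definitions
   come from <cuda.h>; only the flag values mirror the real header).  */
typedef struct CUstream_st *CUstream;
typedef int CUresult;
#define CUDA_SUCCESS 0
#define CU_STREAM_DEFAULT 0x0
#define CU_STREAM_NON_BLOCKING 0x1

/* Hypothetical stream object; the real one is opaque to callers.  */
struct CUstream_st { unsigned int flags; };

/* Stub with the same parameter list as the real cuStreamCreate.  */
static CUresult
cuStreamCreate (CUstream *stream, unsigned int flags)
{
  assert (stream != NULL);
  *stream = malloc (sizeof (struct CUstream_st));
  if (*stream == NULL)
    return 1; /* Any non-zero value stands for a CUDA error here.  */
  (*stream)->flags = flags;
  return CUDA_SUCCESS;
}

/* Stand-in for the plugin's CUDA_CALL_ASSERT: call FN and abort on any
   error.  (The real macro reports through GOMP_PLUGIN_fatal.)  */
#define CUDA_CALL_ASSERT(FN, ...)                          \
  do {                                                     \
    CUresult r_ = FN (__VA_ARGS__);                        \
    if (r_ != CUDA_SUCCESS)                                \
      {                                                    \
        fprintf (stderr, #FN " error: %d\n", (int) r_);    \
        abort ();                                          \
      }                                                    \
  } while (0)

/* Assumed layout: only the member visible in the quoted hunk.  */
struct goacc_asyncqueue { CUstream cuda_stream; };

struct goacc_asyncqueue *
GOMP_OFFLOAD_openacc_async_construct (void)
{
  /* malloc + abort stands in for GOMP_PLUGIN_malloc.  */
  struct goacc_asyncqueue *aq = malloc (sizeof (struct goacc_asyncqueue));
  if (aq == NULL)
    abort ();

  aq->cuda_stream = NULL;
  /* On failure the macro never returns, so no separate NULL check on
     aq->cuda_stream and no cuStreamSynchronize follow.  */
  CUDA_CALL_ASSERT (cuStreamCreate, &aq->cuda_stream, CU_STREAM_DEFAULT);
  return aq; /* Assumed; the quoted hunk does not show the return.  */
}
```

If the flag question is revisited later, only the last argument of the cuStreamCreate call would change: CU_STREAM_NON_BLOCKING creates a stream that does no implicit synchronization with the NULL stream.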