From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 2 Jul 2021 14:51:51 +0100
From: Julian Brown
To: Thomas Schwinge
CC: Jakub Jelinek ,
Subject: Re: [PATCH 0/4] openacc: Async fixes
Message-ID: <20210702145151.0d6a5371@squid.athome>
In-Reply-To: <20210630114033.4e85103f@squid.athome>
References: <87pmw3ycxb.fsf@euler.schwinge.homeip.net> <20210630114033.4e85103f@squid.athome>
Organization: Mentor Graphics
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="MP_/gQWO/G=rMRuGZg_7AVNLJVH"
List-Id: Gcc-patches mailing list

--MP_/gQWO/G=rMRuGZg_7AVNLJVH
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

On Wed, 30 Jun 2021 11:40:33 +0100
Julian Brown wrote:

> On Wed, 30 Jun 2021 10:28:00 +0200
> Thomas Schwinge wrote:
>
> > > - The OpenACC profiling-interface implementation did not measure
> > >   asynchronous operations properly.
> >
> > We'll need to be careful: (possibly, an older version of) that one
> > we internally had identified to be causing some issues; see the
> > "acc_prof-parallel-1.c intermittent failure on og10 branch" emails,
> > 2020-07.
>
> Hmm, I'll check those.

The problem here is that the async callbacks now execute in a different
thread to the main program, so the direct sharing of the 'state'
variable isn't safe. (I verified that by observing the result of
"pthread_self ()" calls from the main thread and from the callback.)

The attached patch appears to make the test run reliably on mainline
(which still exhibits the failure with the parent patch series, very
intermittently). A better solution might be to use the memory-model
builtins for all 'state' variable accesses though.

I think the async profiling callbacks *have to* run in a different
thread to the main program, which would make this a testcase bug (the
spec doesn't explicitly say this as of 3.0 though). However there might
be an argument for making "acc_wait" and friends thread barriers with
respect to the host (i.e.
calling __atomic_thread_fence in the appropriate place in libgomp) --
otherwise you have to "break out of the abstraction" provided by OpenACC
and rely on a non-OpenACC API in order to observe any results measured
in the async profiling callbacks. OTOH the memory-model stuff is part of
C now, so maybe that's fine (and also, I'm doubtful that just adding the
barrier and using regular global variable accesses is sufficient to
ensure thread safety anyway).

Thoughts?

Thanks,

Julian

--MP_/gQWO/G=rMRuGZg_7AVNLJVH
Content-Type: text/x-patch
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="acc_prof-parallel-barrier-1.diff"

commit a24d5c521b66ae88d0ddd05ce7fe247c94802595
Author: Julian Brown
Date:   Fri Jul 2 03:42:41 2021 -0700

    Add barrier, hack

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c
index 9c8af743aba..2bba7bd2e55 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c
@@ -683,6 +683,7 @@ int main()
       state_init = state;
     }
 #pragma acc wait
+    __atomic_thread_fence (__ATOMIC_SEQ_CST);
     assert (state_init == 104);
   }
   assert (state == 108);

--MP_/gQWO/G=rMRuGZg_7AVNLJVH--
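
For illustration, the "memory-model builtins for all 'state' variable accesses" idea from the message above could look roughly like the following. This is a minimal sketch under assumptions, not code from acc_prof-parallel-1.c: the names callback_thread and run_callback_demo are hypothetical, a plain pthread stands in for the thread that runs the async profiling callback, and the values 104 and 108 are borrowed from the asserts in the patch only to keep the flavour of the test.

```c
/* Sketch only: callback_thread and run_callback_demo are hypothetical
   stand-ins, not part of the libgomp testsuite.  */
#include <assert.h>
#include <pthread.h>

/* Shared between the main thread and the "callback" thread; every
   access goes through the GCC __atomic builtins.  */
static int state;

/* Stands in for a profiling callback that runs on a non-main thread.  */
static void *
callback_thread (void *arg)
{
  (void) arg;
  /* Read the value published by the main thread, then advance it.  */
  int seen = __atomic_load_n (&state, __ATOMIC_ACQUIRE);
  assert (seen == 104);
  __atomic_store_n (&state, seen + 4, __ATOMIC_RELEASE);
  return NULL;
}

/* Returns the final value of 'state', or -1 if the thread could not
   be created.  */
int
run_callback_demo (void)
{
  pthread_t tid;

  __atomic_store_n (&state, 104, __ATOMIC_RELEASE);
  if (pthread_create (&tid, NULL, callback_thread, NULL) != 0)
    return -1;

  /* Poll with acquire loads until the callback's release store is
     visible; no separate fence is needed for this variable.  */
  while (__atomic_load_n (&state, __ATOMIC_ACQUIRE) != 108)
    ;

  pthread_join (tid, NULL);
  return __atomic_load_n (&state, __ATOMIC_ACQUIRE);
}
```

Calling run_callback_demo () from a test's main () and asserting on the result exercises both directions of the hand-off. Compared with the single __atomic_thread_fence in the patch, this makes each access to the shared variable explicitly atomic, which is the direction the message suggests may be the more robust one.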