From: Alexander Monakov
Date: Wed, 19 Oct 2022 18:53:47 +0300 (MSK)
To: Tobias Burnus
Cc: Jakub Jelinek, gcc-patches
Subject: Re: [Patch][v5] libgomp/nvptx: Prepare for reverse-offload callback handling
In-Reply-To: <1e56b27f-3c74-8bc2-028b-9091fa1ae7b1@codesourcery.com>

On Wed, 12 Oct 2022, Tobias Burnus wrote:

> On 11.10.22 13:12, Alexander Monakov wrote:
> > My understanding is such trickery should not be necessary with
> > the barrier-based approach, i.e. the sequence of PTX instructions
> >
> >   st   % plain store
> >   membar.sys
> >   st.volatile
> >
> > should be enough to guarantee that the former store is visible on the host
> > before the latter, and work all the way back to sm_20.
>
> If I understand it correctly, you mean:
>
>   GOMP_REV_OFFLOAD_VAR->dev_num = GOMP_ADDITIONAL_ICVS.device_num;
>
>   __sync_synchronize ();  /* membar.sys */
>   asm volatile ("st.volatile.global.u64 [%0], %1;"
>                 : : "r"(addr_struct_fn), "r" (fn) : "memory");
>
>
> And then directly followed by the busy wait:
>
>   while (__atomic_load_n (&GOMP_REV_OFFLOAD_VAR->fn, __ATOMIC_ACQUIRE) != 0)
>     ;  /* spin */
>
> which GCC expands to:
>
>   /* ld.global.u64 %r64,[__gomp_rev_offload_var];
>      ld.u64 %r36,[%r64];
>      membar.sys;  */
>
> The such updated patch is attached.

I think the topic for which I was Cc'ed (memory space and access method for
the synchronization variable) has been resolved nicely.

I am not satisfied with some other points raised in the conversation, I hope
they are noted.

Alexander

> (This is the only change + removing the mkoffload.cc part is the only
> larger change. Otherwise, it only handles the minor comments by Jakub.
> The now removed CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT was used
> until commit r10-304-g1f4c5b9bb2eb81880e2bc725435d596fcd2bdfef i.e.
> it is a really old left over!)
>
> Otherwise, tested* to work with sm_30 (error by mkoffload, unchanged),
> sm_35 and sm_70.
>
> Tobias
>
> *With some added code; until GOMP_OFFLOAD_get_num_devices accepts
> GOMP_REQUIRES_UNIFIED_SHARED_MEMORY and GOMP_OFFLOAD_load_image
> gets passed a non-NULL for rev_fn_table, the current patch is a no op.
>
> Planned next is the related GCN patch – and the actual change
> in libgomp/target.c (+ accepting USM in GOMP_OFFLOAD_get_num_devices)
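
For readers following the thread, the ordering discussed above (plain store of the payload, full barrier, store of the flag, then a spin on an acquire load) can be sketched as a host-only C11 analogue. This is an illustration under stated assumptions, not the libgomp or nvptx code: `struct request`, `host_side` and `run_handshake` are hypothetical names, a pthread stands in for the host, and `atomic_thread_fence (memory_order_seq_cst)` stands in for `membar.sys`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared request slot; not libgomp's struct.  */
struct request
{
  int dev_num;          /* payload, written with a plain store */
  _Atomic uint64_t fn;  /* flag: non-zero means a request is pending */
};

/* Plays the role of the host: wait for a request, handle it, clear the flag.  */
static void *
host_side (void *arg)
{
  struct request *r = arg;

  while (atomic_load_explicit (&r->fn, memory_order_acquire) == 0)
    ;  /* spin until the flag is published */

  r->dev_num += 1;  /* the payload store must already be visible here */
  atomic_store_explicit (&r->fn, 0, memory_order_release);
  return NULL;
}

/* Plays the role of the device: publish payload, then flag, then busy wait.
   Returns the payload as modified by the host side.  */
int
run_handshake (int dev_num)
{
  struct request r = { 0, 0 };
  pthread_t host;
  int rc = pthread_create (&host, NULL, host_side, &r);
  assert (rc == 0);
  (void) rc;

  r.dev_num = dev_num;                         /* plain store */
  atomic_thread_fence (memory_order_seq_cst);  /* analogue of membar.sys */
  atomic_store_explicit (&r.fn, 1, memory_order_relaxed);  /* flag store */

  while (atomic_load_explicit (&r.fn, memory_order_acquire) != 0)
    ;  /* spin until the host clears the flag */

  pthread_join (host, NULL);
  return r.dev_num;
}
```

Calling `run_handshake (7)` returns 8 once the host thread has seen the payload and cleared the flag; on older glibc versions the sketch needs `-pthread` at link time. The fence before the relaxed flag store is what orders the payload ahead of the flag, mirroring the role of the barrier between the two PTX stores.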