Subject: Re: [Patch] libgomp/nvptx: Prepare for reverse-offload callback handling
From: Tobias Burnus
To: Alexander Monakov
CC: Jakub Jelinek, Tom de Vries, gcc-patches
Date: Sun, 2 Oct 2022 20:13:56 +0200
Message-ID: <3ebce406-46e4-8f98-8c53-83b61423644e@codesourcery.com>
References: <57b3ae5e-8f15-8bea-fa09-39bccbaa2414@codesourcery.com> <3f0fc49f-b07f-bee2-51a8-a5d03f1c33ed@codesourcery.com> <30e3ed49-0d14-8015-57ef-3d70b1dea69a@codesourcery.com>
In-Reply-To: <30e3ed49-0d14-8015-57ef-3d70b1dea69a@codesourcery.com>

On 27.09.22 11:23, Tobias Burnus wrote:
> We do support
>   #if __PTX_SM__ >= 600  (CUDA >= 8.0, ptx isa >= 5.0)
> and we also can configure GCC with
>   --with-arch=sm_70 (or sm_80 or ...)
> Thus, adding atomics with .sys scope is possible.
>
> See attached patch. This seems to work fine and I hope I got
> the assembly right in terms of atomic use. (And I do believe that the
> .release/.acquire do not need an additional
> __sync_synchronize()/"membar.sys".)

Regarding this: While 'atom.op' (op = and/or/xor/cas/exch/add/inc/dec/min/max)
with a scope qualifier is an sm_60 feature, the 'st'/'ld' used here with a
scope qualifier and .relaxed/.release ('st') or .relaxed/.acquire ('ld')
require sm_70.

(This does not really matter, as only ..., sm_53 and sm_70, ... are currently
supported but not sm_60; still, the #if should obviously be fixed.)

* * *

Looking at the generated code without inline assembler, we have, instead of
  st.global.release.sys.u64 [%r27],%r39;
and
  ld.acquire.sys.global.u64 %r62,[%r27];
for the older systems (__PTX_SM__ < 700), the code:
  @ %r69 membar.sys;
  @ %r69 atom.exch.b64 _,[%r27],%r41;
and
  ld.global.u64 %r64,[__gomp_rev_offload_var];
  ld.u64 %r36,[%r64];
  membar.sys;

In my understanding, the membar.sys ensures, similar to st.release /
ld.acquire, that the memory accesses are performed in the correct order in
scope .sys. The 'fn' variable is initially 0 and is then only set by the
device, i.e. there is eventually a DMA write device->host, which is atomic as
the whole int64_t is written at once (and not, e.g., first the lower and then
the upper half). Hence, the 'st'/'atom.exch' should work fine, despite having
no .sys scope.
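To make the comparison concrete, here is a minimal host-side sketch in C11 atomics of the two variants discussed above. It is only an analogy and not the libgomp code: C11 has no notion of a .sys scope, the mapping of memory_order_release/memory_order_acquire to st.release/ld.acquire and of atomic_thread_fence (memory_order_seq_cst) to membar.sys is approximate, and struct rev_mailbox as well as all function names are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the reverse-offload mailbox: 'fn' starts at 0
   and becomes nonzero once a request has been published.  */
struct rev_mailbox
{
  uint64_t payload;      /* plain data, written before publishing */
  _Atomic uint64_t fn;   /* flag / function address, initially 0 */
};

/* Variant A: release store (roughly what st.release.sys expresses).  */
static void
publish_release (struct rev_mailbox *m, uint64_t data, uint64_t fn)
{
  m->payload = data;
  atomic_store_explicit (&m->fn, fn, memory_order_release);
}

/* Variant B: full fence, then a relaxed exchange
   (roughly "membar.sys; atom.exch").  */
static void
publish_fence (struct rev_mailbox *m, uint64_t data, uint64_t fn)
{
  m->payload = data;
  atomic_thread_fence (memory_order_seq_cst);
  (void) atomic_exchange_explicit (&m->fn, fn, memory_order_relaxed);
}

/* Variant A: acquire load (roughly ld.acquire.sys).
   Returns 1 and fills *data once 'fn' has been published.  */
static int
consume_acquire (struct rev_mailbox *m, uint64_t *data)
{
  if (atomic_load_explicit (&m->fn, memory_order_acquire) == 0)
    return 0;
  *data = m->payload;
  return 1;
}

/* Variant B: relaxed load, then a full fence (roughly "ld; membar.sys").  */
static int
consume_fence (struct rev_mailbox *m, uint64_t *data)
{
  uint64_t fn = atomic_load_explicit (&m->fn, memory_order_relaxed);
  atomic_thread_fence (memory_order_seq_cst);
  if (fn == 0)
    return 0;
  *data = m->payload;
  return 1;
}

/* Single-threaded smoke test: both variants must hand over the same value.  */
static uint64_t
rev_mailbox_demo (void)
{
  struct rev_mailbox a = { 0, 0 }, b = { 0, 0 };
  uint64_t got_a = 0, got_b = 0;

  int early = consume_acquire (&a, &got_a);   /* nothing published yet */
  assert (early == 0);

  publish_release (&a, 42, 1);
  publish_fence (&b, 42, 1);

  int ok_a = consume_acquire (&a, &got_a);
  int ok_b = consume_fence (&b, &got_b);
  assert (ok_a && ok_b);

  return got_a == got_b ? got_a : 0;
}
```

rev_mailbox_demo () runs both variants on a single thread, so it only checks that they publish and read back the same value. The ordering guarantees themselves only matter once the publishing and the consuming side run on different agents (here: device and host).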
Likewise, the membar.sys also applies in the other direction. Or did I miss
something? If so, would an explicit __sync_synchronize() (= membar.sys) help
between the 'st' and the 'ld'?

Tobias
-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company (Gesellschaft mit beschränkter Haftung); Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München; Registry court: München, HRB 106955