From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from esa4.mentor.iphmx.com (esa4.mentor.iphmx.com [68.232.137.252]) by sourceware.org (Postfix) with ESMTPS id 94A5238533E0 for ; Thu, 15 Dec 2022 20:13:31 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 94A5238533E0 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.96,248,1665475200"; d="diff'?scan'208";a="89839141" Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa4.mentor.iphmx.com with ESMTP; 15 Dec 2022 12:13:28 -0800 IronPort-SDR: k7AOVp1JLE2NAwQw5TIahd2xOtpbXymDgnOdlfCpTEpaO22N4ze8WXYpYkSuXgcAEpB0Ss5PmR xWmmuH5fyNZGOx64WFDiwD3hXUFDTYXETDxezLFhekYi10NF3dK5yPBbg39mC9n9e1ZG1WWhQK zsyDElcR7SUIoNyYF5429gbpeXjIPy4eWY6aBfiUhjivLxB57scEdsspQCoM6//fXCaN6H3EgV IF/6Go/XewsaFbG2Q79ppiUYmO2ESAKDJkkr0hAgdZ+RmXoIVuEUeaMMrmQJAPAZKCIPCNNDBU PQk= Content-Type: multipart/mixed; boundary="------------Xwg0e7k0Ti1wU9jsGxK4B2DM" Message-ID: <1f985418-b6ae-150b-ba11-52a32438d2b5@codesourcery.com> Date: Thu, 15 Dec 2022 21:13:20 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0 Subject: Re: [Patch] libgomp: Handle OpenMP's reverse offloads Content-Language: en-US From: Tobias Burnus To: Thomas Schwinge , , Jakub Jelinek References: <0567b7c6-fede-72b8-63d1-1fc10dca36a0@codesourcery.com> <87ilicfu55.fsf@euler.schwinge.homeip.net> In-Reply-To: X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) To svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12) X-Spam-Status: No, score=-11.4 required=5.0 tests=BAYES_00,GIT_PATCH_0,HEADER_FROM_DIFFERENT_DOMAINS,KAM_DMARC_STATUS,NICE_REPLY_A,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: 
--------------Xwg0e7k0Ti1wU9jsGxK4B2DM
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: quoted-printable

Hi,

On 15.12.22 20:42, Tobias Burnus wrote:
>> If the libgomp plugin doesn't request special
>> 'host_to_dev_cpy'/'dev_to_host_cpy' for 'gomp_target_rev', then standard
>> 'gomp_copy_host2dev'/'gomp_copy_dev2host' are used, which use
>> 'gomp_device_copy', which expects the device to be locked. (As can be
>> told by the unconditional 'gomp_mutex_unlock (&devicep->lock);' before
>> 'gomp_fatal'.) However, in a number of the
>> 'gomp_copy_host2dev'/'gomp_copy_dev2host' calls from 'gomp_target_rev',
>> the device definitely is not locked; see

Actually, after reading it and the source code again, I think it makes
sense to return a boolean, similar to devicep->host2dev_func and
devicep->dev2host_func, and possibly to wrap it in a convenience
function similar to gomp_device_copy. At the least, a bare exit()
without further diagnostic does not seem too user-friendly.

BTW: In line with the other code, you could use CUDA_CALL instead of
CUDA_CALL_ERET; the former already calls the latter with 'false' as
first argument and is used elsewhere.

Regarding the lock: The problem seems to be the copying of
devaddrs/sizes/kinds. That copying does not need any lock: the stack
variables are on the device and are used only for this reverse offload,
so there are no races. However, as the existing gomp_copy_dev2host
unlocks the device on failure, we could simply keep the device locked,
and probably should move the unlock down to just before the
user-function call, removing all (non-error) locks and unlocks on the
way. I mean something like the attached patch.

Finally, I think we need to find a solution for the issue Andrew tried
to address. The current code invokes CUDA_CALL_ASSERT, which calls
GOMP_PLUGIN_fatal.

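To illustrate the "return a boolean and wrap it" idea, here is a
minimal standalone sketch. It is not libgomp code: every sketch_* name
is hypothetical, the "mutex" is a plain flag standing in for
devicep->lock, and the wrapper returns false where the real code would
presumably end in a fatal-error call. It only shows the control flow:
the caller holds the lock, the hook reports failure as a boolean, and
the wrapper unlocks and prints a diagnostic before giving up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the device descriptor.  The lock is a toy
   flag (not a real mutex) so that the sketch has no thread dependency.  */
struct sketch_device
{
  bool lock_held;
  /* Copy hook; returns false on failure instead of exiting.  */
  bool (*dev2host) (void *host_dst, const void *dev_src, size_t size);
};

/* Toy hook that succeeds: "device" memory is ordinary host memory here.  */
static bool
sketch_copy_ok (void *host_dst, const void *dev_src, size_t size)
{
  memcpy (host_dst, dev_src, size);
  return true;
}

/* Toy hook that always reports failure.  */
static bool
sketch_copy_fail (void *host_dst, const void *dev_src, size_t size)
{
  (void) host_dst;
  (void) dev_src;
  (void) size;
  return false;
}

/* Convenience wrapper.  The caller must hold the lock.  On success the
   lock stays held; on failure it is released and a diagnostic is
   printed before false is returned.  */
static bool
sketch_checked_dev2host (struct sketch_device *dev, void *host_dst,
                         const void *dev_src, size_t size)
{
  assert (dev != NULL && dev->dev2host != NULL && dev->lock_held);
  if (dev->dev2host (host_dst, dev_src, size))
    return true;
  dev->lock_held = false;	/* Unlock before reporting the error.  */
  fprintf (stderr, "copying %zu bytes from device to host failed\n", size);
  return false;
}
```

With sketch_copy_ok installed as the hook, the wrapper returns true and
leaves the lock held; with sketch_copy_fail it returns false with the
lock released, so the error path never leaves the device locked.
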
Tobias ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstra=C3=9Fe 201= , 80634 M=C3=BCnchen; Gesellschaft mit beschr=C3=A4nkter Haftung; Gesch=C3= =A4ftsf=C3=BChrer: Thomas Heurung, Frank Th=C3=BCrauf; Sitz der Gesellschaf= t: M=C3=BCnchen; Registergericht M=C3=BCnchen, HRB 106955 --------------Xwg0e7k0Ti1wU9jsGxK4B2DM Content-Type: text/x-patch; charset="UTF-8"; name="patch.diff" Content-Disposition: attachment; filename="patch.diff" Content-Transfer-Encoding: base64 ZGlmZiAtLWdpdCBhL2xpYmdvbXAvdGFyZ2V0LmMgYi9saWJnb21wL3RhcmdldC5jCmluZGV4 IGUzOGNjM2I2ZjFjLi40YjcyMzMzMDdjZCAxMDA2NDQKLS0tIGEvbGliZ29tcC90YXJnZXQu YworKysgYi9saWJnb21wL3RhcmdldC5jCkBAIC0zMzE5LDUgKzMzMTksNiBAQCBnb21wX3Rh cmdldF9yZXYgKHVpbnQ2NF90IGZuX3B0ciwgdWludDY0X3QgbWFwbnVtLCB1aW50NjRfdCBk ZXZhZGRyc19wdHIsCiAgIGdvbXBfbXV0ZXhfbG9jayAoJmRldmljZXAtPmxvY2spOwogICBu ID0gZ29tcF9tYXBfbG9va3VwX3JldiAoJmRldmljZXAtPm1lbV9tYXBfcmV2LCAmayk7Ci0g IGdvbXBfbXV0ZXhfdW5sb2NrICgmZGV2aWNlcC0+bG9jayk7CisgIGlmIChkZXZpY2VwLT5j YXBhYmlsaXRpZXMgJiBHT01QX09GRkxPQURfQ0FQX1NIQVJFRF9NRU0pCisgICAgZ29tcF9t dXRleF91bmxvY2sgKCZkZXZpY2VwLT5sb2NrKTsKIAogICBpZiAobiA9PSBOVUxMKQpAQCAt MzQwOSw1ICszNDEwLDQgQEAgZ29tcF90YXJnZXRfcmV2ICh1aW50NjRfdCBmbl9wdHIsIHVp bnQ2NF90IG1hcG51bSwgdWludDY0X3QgZGV2YWRkcnNfcHRyLAogICAgICAgY2RhdGEgPSBn b21wX2FsbG9jYSAoc2l6ZW9mICgqY2RhdGEpICogbWFwbnVtKTsKICAgICAgIG1lbXNldCAo Y2RhdGEsICdcMCcsIHNpemVvZiAoKmNkYXRhKSAqIG1hcG51bSk7Ci0gICAgICBnb21wX211 dGV4X2xvY2sgKCZkZXZpY2VwLT5sb2NrKTsKICAgICAgIGZvciAodWludDY0X3QgaSA9IDA7 IGkgPCBtYXBudW07IGkrKykKIAl7CkBAIC0zNjQzLDQgKzM2NDMsNSBAQCBnb21wX3Rhcmdl dF9yZXYgKHVpbnQ2NF90IGZuX3B0ciwgdWludDY0X3QgbWFwbnVtLCB1aW50NjRfdCBkZXZh ZGRyc19wdHIsCiAgICAgICB1aW50NjRfdCBzdHJ1Y3RfY3B5ID0gMDsKICAgICAgIGJvb2wg Y2xlYW5fc3RydWN0ID0gZmFsc2U7CisgICAgICBnb21wX211dGV4X2xvY2sgKCZkZXZpY2Vw LT5sb2NrKTsKICAgICAgIGZvciAodWludDY0X3QgaSA9IDA7IGkgPCBtYXBudW07IGkrKykK IAl7CkBAIC0zNjk1LDUgKzM2OTYsNSBAQCBnb21wX3RhcmdldF9yZXYgKHVpbnQ2NF90IGZu 
X3B0ciwgdWludDY0X3QgbWFwbnVtLCB1aW50NjRfdCBkZXZhZGRyc19wdHIsCiAJICAgICAg Z29tcF9hbGlnbmVkX2ZyZWUgKCh2b2lkICopICh1aW50cHRyX3QpIGRldmFkZHJzW2ldKTsK IAkgICAgfQotCisgICAgICBnb21wX211dGV4X3VubG9jayAoJmRldmljZXAtPmxvY2spOwog ICAgICAgZnJlZSAoZGV2YWRkcnMpOwogICAgICAgZnJlZSAoc2l6ZXMpOwo= --------------Xwg0e7k0Ti1wU9jsGxK4B2DM--