From: Andrew Stubbs
To: Tobias Burnus, gcc-patches
CC: Jakub Jelinek
Subject: Re: [Patch] libgomp/gcn: Prepare for reverse-offload callback handling
Date: Wed, 12 Oct 2022 18:09:51 +0100
Message-ID: <3a0eb685-6bb7-ed30-4024-887452c015fd@codesourcery.com>
In-Reply-To: <1c5166f4-91d2-b320-7fd9-6831c7e26342@codesourcery.com>

On 12/10/2022 15:29, Tobias Burnus wrote:
> On 29.09.22 18:24, Andrew Stubbs wrote:
>> On 27/09/2022 14:16, Tobias Burnus wrote:
>>> Andrew did suggest a while back to piggyback on the console_output
>>> handling, avoiding another atomic access. - If this is still wanted,
>>> I like to have some guidance regarding how to actually implement it.
>> [...]
>> The point is that you can use the "msg" and "text" fields for whatever
>> data you want, as long as you invent a new value for "type".
>> [....]
>> You can make "case 4" do whatever you want. There are enough bytes for
>> 4 pointers, and you could use multiple packets (although it's not safe
>> to assume they're contiguous or already arrived; maybe "case 4" for
>> part 1, "case 5" for part 2). It's possible to change this structure,
>> of course, but the target implementation is in newlib so versioning
>> becomes a problem.
>
> I think – also looking at the Newlib write.c implementation - that the
> data is contiguous: there is an atomic add, where instead of passing '1'
> for a single slot, I could also add '2' for two slots.

Right, sorry, the buffer is circular, but the counter is linear. It
simplified reservation that way, but it does mean that there's a limit
to the number of times the buffer can cycle before the counter
saturates. (You'd need to stream out gigabytes of data to hit the limit
though.)

> Attached is one variant – for the decl of the GOMP_OFFLOAD_target_rev,
> it needs the generic parts of the sister nvptx patch.*
>
> 2*128 bytes were not enough, I need 3*128 bytes. (Or rather 5*64 + 32.)
> As target_ext is blocking, I decided to use a stack local variable for
> the remaining arguments and pass it along. Alternatively, I could also
> use 2 slots - and process them together. This would avoid one
> device->host memory copy but would make console_output less clear.
> PS: Currently, device stack variables are private and cannot be accessed
> from the host; this will change in a separate patch. It not only affects
> the "rest" part as used in this patch but also the actual arrays behind
> addr, kinds, and sizes. And quite likely a lot of the map/firstprivate
> variables passed to addr.
>
> As num_devices() will return 0 or -1, this is for now a non-issue.

So, the patch, as is, is known to be non-functional? How can you have
tested it? For the addrs_sizes_kind data to be accessible the asm("s8")
has to be wrong.

I think the patch looks good, in principle. The use of the existing
ring-buffer is the right way to do it, IMO.

Can we get the manually allocated stacks patch in first and then follow
up with these patches when they actually work?

Andrew
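
For readers following the thread, the "circular buffer, linear counter"
reservation discussed above can be sketched roughly as below. This is
only an illustration under stated assumptions: the struct layout, the
names (ring_reserve, ring_slot_for, RING_SLOTS) and the sizes are
hypothetical, not the actual libgomp or newlib definitions, and the
sketch models neither the consumer side nor the counter limit mentioned
above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sizes and names, for illustration only; these are not
   the real libgomp/newlib definitions.  */
#define RING_SLOTS 4u

struct ring_slot
{
  _Atomic uint32_t written;    /* Producer sets this once the payload is complete.  */
  uint32_t type;               /* Message kind; a new value selects a new handler.  */
  unsigned char payload[120];  /* Free-form data for that message kind.  */
};

struct ring
{
  _Atomic uint32_t next_write; /* Linear counter: it only ever increases.  */
  struct ring_slot slots[RING_SLOTS];
};

/* Reserve COUNT consecutive tickets with a single atomic add and
   return the first ticket.  Passing 2 instead of 1 yields two
   adjacent tickets, as suggested in the quoted mail.  */
static uint32_t
ring_reserve (struct ring *r, uint32_t count)
{
  return atomic_fetch_add (&r->next_write, count);
}

/* Map a linear ticket onto its slot in the circular buffer.  */
static struct ring_slot *
ring_slot_for (struct ring *r, uint32_t ticket)
{
  return &r->slots[ticket % RING_SLOTS];
}
```

With a zero-initialised ring, reserving one slot and then two slots
returns tickets 0 and 1; the two-slot reservation covers tickets 1 and
2. Tickets from one reservation are consecutive in the counter, but the
slots they map to can wrap from the last slot back to slot 0.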