From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from esa1.mentor.iphmx.com (esa1.mentor.iphmx.com [68.232.129.153]) by sourceware.org (Postfix) with ESMTPS id 56A5D38493F2 for ; Mon, 20 Feb 2023 09:49:00 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 56A5D38493F2
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com
X-IronPort-AV: E=Sophos;i="5.97,312,1669104000"; d="scan'208";a="101509299"
Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa1.mentor.iphmx.com with ESMTP; 20 Feb 2023 01:48:59 -0800
IronPort-SDR: kCea6dFDs2TU0v8yb5uL4o7X62q9I29DnUKNS0aIecVyb7A8CRqgEoGdO+OOvFBffsZPWQqfav E/TL0cp+wqih7CHwOntmLWYJP0NwWhB0Kom39HXrXBsek0cy/9ovOyN/kd82xfekzX01q7ZC7S EYLhsvJxo9MhrPSYc09/vUyl2QjiuC2ND674hNLdchWelvmfb8uFinPOJgwLWXEPrprnCXYHT2 OWi45+zEh2SQifkbpPoLh3ZAn1xtOGszgZJQIxlVBv5KE7SMWShfegM6Ks2AkdiAxfIYGsPaB2 DpM=
Message-ID: <10037b90-784c-68c1-4299-ac98624e77ec@codesourcery.com>
Date: Mon, 20 Feb 2023 09:48:53 +0000
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.7.2
Subject: Re: [og12] Attempt to register OpenMP pinned memory using a device instead of 'mlock' (was: [PATCH] libgomp, openmp: pinned memory)
Content-Language: en-GB
To: Thomas Schwinge
CC: Jakub Jelinek , Tobias Burnus ,
References: <20220104155558.GG2646553@tucnak> <48ee767a-0d90-53b4-ea54-9deba9edd805@codesourcery.com> <20220104182829.GK2646553@tucnak> <20220104184740.GL2646553@tucnak> <87edzy5g8h.fsf@euler.schwinge.homeip.net> <87cz69tyla.fsf@dem-tschwing-1.ger.mentorg.com> <87lekxxo23.fsf@euler.schwinge.homeip.net> <87fsb4vhfs.fsf@euler.schwinge.homeip.net>
From: Andrew Stubbs
In-Reply-To: <87fsb4vhfs.fsf@euler.schwinge.homeip.net>
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 7bit
X-Originating-IP: [137.202.0.90]
X-ClientProxiedBy:
svr-ies-mbx-14.mgc.mentorg.com (139.181.222.14) To svr-ies-mbx-11.mgc.mentorg.com (139.181.222.11)
X-Spam-Status: No, score=-6.4 required=5.0 tests=BAYES_00,HEADER_FROM_DIFFERENT_DOMAINS,KAM_DMARC_STATUS,NICE_REPLY_A,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id:

On 17/02/2023 08:12, Thomas Schwinge wrote:
> Hi Andrew!
>
> On 2023-02-16T23:06:44+0100, I wrote:
>> On 2023-02-16T16:17:32+0000, "Stubbs, Andrew via Gcc-patches" wrote:
>>> The mmap implementation was not optimized for a lot of small allocations, and I can't see that issue changing here
>>
>> That's correct, 'mmap' remains. Under the hood, 'cuMemHostRegister' must
>> surely also be doing some 'mlock'-like thing, so I figured it's best to
>> feed page-boundary memory regions to it, which 'mmap' gets us.
>>
>>> so I don't know if this can be used for mlockall replacement.
>>>
>>> I had assumed that using the Cuda allocator would fix that limitation.
>>
>> From what I've read (but no first-hand experiments), there's non-trivial
>> overhead with 'cuMemHostRegister' (just like with 'mlock'), so routing
>> all small allocations individually through it probably isn't a good idea
>> either. Therefore, I suppose, we'll indeed want to use some local
>> allocator if we wish this "optimized for a lot of small allocations".
>
> Eh, I suppose your point indirectly was that instead of 'mmap' plus
> 'cuMemHostRegister' we ought to use 'cuMemAllocHost'/'cuMemHostAlloc', as
> we assume those already do implement such a local allocator. Let me
> quickly change that indeed -- we don't currently have a need to use
> 'cuMemHostRegister' instead of 'cuMemAllocHost'/'cuMemHostAlloc'.

Yes, that's right. I suppose it makes sense to register memory we already have, but if we want new memory then trying to reinvent what happens inside cuMemAllocHost is pointless.

Andrew
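
For illustration only, here is roughly what a "local allocator" layered on 'mmap' plus 'mlock' would have to look like, i.e. the work the thread assumes cuMemAllocHost already does internally. This is a minimal sketch and not code from the patch or from libgomp: the names 'pinned_pool', 'pool_init', 'pool_alloc' and 'pool_destroy' are hypothetical, the 16-byte alignment is an arbitrary choice, and there is no free list or thread safety. It only shows the idea of paying the pinning cost once per page-aligned region and carving small allocations out of it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical pool type for this sketch; not part of libgomp.  */
struct pinned_pool
{
  char *base;   /* Start of the mmap'ed region.  */
  size_t size;  /* Region size, a multiple of the page size.  */
  size_t used;  /* Bytes handed out so far.  */
  int locked;   /* Nonzero if mlock succeeded for the region.  */
};

/* Map one page-aligned region of at least MIN_SIZE bytes and try to pin
   it with a single mlock call.  Returns 0 on success, -1 if mmap fails.
   A failing mlock (for example due to RLIMIT_MEMLOCK) is recorded in
   'locked' rather than treated as fatal.  */
static int
pool_init (struct pinned_pool *pool, size_t min_size)
{
  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  size_t size = (min_size + page - 1) / page * page;
  void *p = mmap (NULL, size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;
  pool->base = p;
  pool->size = size;
  pool->used = 0;
  pool->locked = (mlock (p, size) == 0);
  return 0;
}

/* Bump-allocate N bytes, rounded up to 16, from the pool.  Returns NULL
   when the pool is exhausted.  No system call is made per allocation.  */
static void *
pool_alloc (struct pinned_pool *pool, size_t n)
{
  size_t aligned = (n + 15) & ~(size_t) 15;
  assert (pool->used <= pool->size);
  if (aligned < n || aligned > pool->size - pool->used)
    return NULL;
  void *p = pool->base + pool->used;
  pool->used += aligned;
  return p;
}

/* Unmap the region; munmap also drops any lock on its pages.  */
static void
pool_destroy (struct pinned_pool *pool)
{
  munmap (pool->base, pool->size);
  pool->base = NULL;
  pool->size = pool->used = 0;
  pool->locked = 0;
}
```

A caller would run pool_init once, serve many small requests through pool_alloc, and release everything with pool_destroy. Even this toy version needs its own bookkeeping, and a usable one would also need freeing, growth and locking, which is the reinvention argued against above.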