From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from esa2.mentor.iphmx.com (esa2.mentor.iphmx.com [68.232.141.98]) by sourceware.org (Postfix) with ESMTPS id 9ECBC3858D28 for ; Fri, 10 Feb 2023 15:31:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 9ECBC3858D28 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.97,287,1669104000"; d="scan'208";a="97677786" Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa2.mentor.iphmx.com with ESMTP; 10 Feb 2023 07:31:51 -0800 IronPort-SDR: lAI91CPHg11NY+83dMTxXq8xJgO2mST0NViJU4IHH3aKdhrmDUE28pcDDmOTcWu+Y2PuWQafk0 ofx4pce0fVwt6Uef7xJUx0CzFw2KGLNx0K9YP8V8WQzb+uaY74OCOVB2JKbBGnswVWNbBr7BvU /dTBcxkIc9bk98I94mo6Wz/jSoKY6a30kRWsK0MxHkOIIQ0LvUaMaWZ1wbDU30BfC3IN2TG38h onQuGJIII9zY6NwFo3kJh9oxyUu8TLiPysYpuLasXMLy7xDXSxbygd0ySj1nmQEN2j3mpg9aAn NRs= Message-ID: <4417d508-f9b1-8bf2-308d-f1a824b47303@codesourcery.com> Date: Fri, 10 Feb 2023 15:31:47 +0000 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.6.1 Subject: Re: [PATCH 3/5] openmp, nvptx: ompx_unified_shared_mem_alloc Content-Language: en-GB To: Thomas Schwinge CC: Hafiz Abid Qadeer , References: <20220308113059.688551-1-abidh@codesourcery.com> <20220308113059.688551-4-abidh@codesourcery.com> <871qmx1tzo.fsf@euler.schwinge.homeip.net> From: Andrew Stubbs In-Reply-To: <871qmx1tzo.fsf@euler.schwinge.homeip.net> Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) To svr-ies-mbx-11.mgc.mentorg.com (139.181.222.11) X-Spam-Status: No, score=-7.0 required=5.0 tests=BAYES_00,HEADER_FROM_DIFFERENT_DOMAINS,KAM_DMARC_STATUS,NICE_REPLY_A,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 10/02/2023 14:21, Thomas Schwinge wrote: > Is the correct fix the following (conceptually like > 'linux_memspace_alloc' cited above), or is there something that I fail to > understand? > > static void * > linux_memspace_calloc (omp_memspace_handle_t memspace, size_t size, int pin) > { > if (memspace == ompx_unified_shared_mem_space) > { > void *ret = gomp_usm_alloc (size, GOMP_DEVICE_ICV); > memset (ret, 0, size); > return ret; > } > - else if (memspace == ompx_unified_shared_mem_space > - || pin) > + else if (pin) > return linux_memspace_alloc (memspace, size, pin); > else > return calloc (1, size); Yes, I think that is what was intended (and what actually happens). You can have your memory both unified and pinned (well, maybe it's possible, but there's no one Cuda API for that), so the USM takes precedence. > The following ones then again are conceptually like > 'linux_memspace_alloc' cited above: > >> @@ -77,9 +86,9 @@ static void >> linux_memspace_free (omp_memspace_handle_t memspace, void *addr, size_t size, >> int pin) >> { >> - (void)memspace; >> - >> - if (pin) >> + if (memspace == ompx_unified_shared_mem_space) >> + gomp_usm_free (addr, GOMP_DEVICE_ICV); >> + else if (pin) >> munmap (addr, size); >> else >> free (addr); >> @@ -89,7 +98,9 @@ static void * >> linux_memspace_realloc (omp_memspace_handle_t memspace, void *addr, >> size_t oldsize, size_t size, int oldpin, int pin) >> { >> - if (oldpin && pin) >> + if (memspace == ompx_unified_shared_mem_space) >> + goto manual_realloc; >> + else if (oldpin && pin) >> { >> void *newaddr = mremap (addr, oldsize, size, MREMAP_MAYMOVE); >> if (newaddr == MAP_FAILED) >> @@ -98,18 +109,19 @@ linux_memspace_realloc (omp_memspace_handle_t memspace, void *addr, >> [...] Yes. > ..., and similar those here: > >> --- a/libgomp/config/nvptx/allocator.c >> +++ b/libgomp/config/nvptx/allocator.c >> @@ -125,6 +125,8 @@ nvptx_memspace_alloc (omp_memspace_handle_t memspace, size_t size) >> __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE); >> return result; >> } >> + else if (memspace == ompx_host_mem_space) >> + return NULL; >> else >> return malloc (size); >> } >> @@ -145,6 +147,8 @@ nvptx_memspace_calloc (omp_memspace_handle_t memspace, size_t size) >> >> return result; >> } >> + else if (memspace == ompx_host_mem_space) >> + return NULL; >> else >> return calloc (1, size); >> } >> @@ -354,6 +358,8 @@ nvptx_memspace_realloc (omp_memspace_handle_t memspace, void *addr, >> } >> return result; >> } >> + else if (memspace == ompx_host_mem_space) >> + return NULL; >> else >> return realloc (addr, size); >> } > > (I'd have added an explicit no-op (or, 'abort'?) to > 'nvptx_memspace_free', but that's maybe just me...) ;-\ Why? The host memspace is just the regular heap, which can be a thing on any device. It's an extension though so we can define it either way. >> --- a/libgomp/libgomp.h >> +++ b/libgomp/libgomp.h > >> +extern void * gomp_usm_alloc (size_t size, int device_num); >> +extern void gomp_usm_free (void *device_ptr, int device_num); >> +extern bool gomp_is_usm_ptr (void *ptr); > > 'gomp_is_usm_ptr' isn't defined/used anywhere; I'll remove it. I think I started that and then decided against. Thanks. >> --- a/libgomp/target.c >> +++ b/libgomp/target.c > >> @@ -3740,6 +3807,9 @@ gomp_load_plugin_for_device (struct gomp_device_descr *device, >> DLSYM (unload_image); >> DLSYM (alloc); >> DLSYM (free); >> + DLSYM_OPT (usm_alloc, usm_alloc); >> + DLSYM_OPT (usm_free, usm_free); >> + DLSYM_OPT (is_usm_ptr, is_usm_ptr); >> DLSYM (dev2host); >> DLSYM (host2dev); > > As a sanity check, shouldn't we check that either none or all three of > those are defined, like in the 'if (cuda && cuda != 4) { [error] }' check > a bit further down? This is only going to happen when somebody writes a new plugin, and then they'll discover very quickly that there are issues. I've wasted more time writing this sentence than it's worth already. :) > Note that these remarks likewise apply to the current upstream > submission: > > "openmp, nvptx: ompx_unified_shared_mem_alloc". I have new patches to heap on top of this set already on OG12, and more planned, plus these ones you're working on; the whole patchset is going to have to get a rebase, squash, and tidy "soonish". Andrew