From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from esa1.mentor.iphmx.com (esa1.mentor.iphmx.com [68.232.129.153]) by sourceware.org (Postfix) with ESMTPS id 0D3143858D28 for ; Fri, 10 Feb 2023 14:21:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0D3143858D28 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.97,287,1669104000"; d="scan'208";a="100322582" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa1.mentor.iphmx.com with ESMTP; 10 Feb 2023 06:21:38 -0800 IronPort-SDR: x9eBYTKkIO7NDbtm5qdDx3AVgDeaPPXE9LZA4D7/GQxelTh4y5DdWtmCLMl+FCOO6rM8lRYGtI wiv6Xy8fXvH1VK7AW0gr6YNZ+HO46teMfVBqO0OtWiKyTsnEWQ7NbvVrhFJTiVI21GtkugAN27 yUO+w2Vdk+y0dE93h+jW6EmlGqhIruZeWxCuxB7TIapBzSizaecct791+Mut3cJ67bXVhRo+jr 7vq6YVAICxyW42KrnoH663S4fcnXijDKQpvpwRMfcSw8fgBuzL3vFH5AKANeVAzXLwj4fXwH/P jsA= From: Thomas Schwinge To: CC: Hafiz Abid Qadeer , Subject: Re: [PATCH 3/5] openmp, nvptx: ompx_unified_shared_mem_alloc In-Reply-To: <20220308113059.688551-4-abidh@codesourcery.com> References: <20220308113059.688551-1-abidh@codesourcery.com> <20220308113059.688551-4-abidh@codesourcery.com> User-Agent: Notmuch/0.29.3+94~g74c3f1b (https://notmuchmail.org) Emacs/28.2 (x86_64-pc-linux-gnu) Date: Fri, 10 Feb 2023 15:21:31 +0100 Message-ID: <871qmx1tzo.fsf@euler.schwinge.homeip.net> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12) To svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) X-Spam-Status: No, score=-5.9 required=5.0 tests=BAYES_00,HEADER_FROM_DIFFERENT_DOMAINS,KAM_DMARC_STATUS,KAM_SHORT,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Andrew! On 2022-03-08T11:30:57+0000, Hafiz Abid Qadeer wro= te: > From: Andrew Stubbs > > This adds support for using Cuda Managed Memory with omp_alloc. It will = be > used as the underpinnings for "requires unified_shared_memory" in a later > patch. > > There are two new predefined allocators, ompx_unified_shared_mem_alloc an= d > ompx_host_mem_alloc, plus corresponding memory spaces, [...] > --- a/libgomp/config/linux/allocator.c > +++ b/libgomp/config/linux/allocator.c > @@ -42,9 +42,11 @@ > static void * > linux_memspace_alloc (omp_memspace_handle_t memspace, size_t size, int p= in) > { > - (void)memspace; > - > - if (pin) > + if (memspace =3D=3D ompx_unified_shared_mem_space) > + { > + return gomp_usm_alloc (size, GOMP_DEVICE_ICV); > + } > + else if (pin) > { > void *addr =3D mmap (NULL, size, PROT_READ | PROT_WRITE, > MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); This I understand conceptually, but then: > @@ -67,7 +69,14 @@ linux_memspace_alloc (omp_memspace_handle_t memspace, = size_t size, int pin) > static void * > linux_memspace_calloc (omp_memspace_handle_t memspace, size_t size, int = pin) > { > - if (pin) > + if (memspace =3D=3D ompx_unified_shared_mem_space) > + { > + void *ret =3D gomp_usm_alloc (size, GOMP_DEVICE_ICV); > + memset (ret, 0, size); > + return ret; > + } > + else if (memspace =3D=3D ompx_unified_shared_mem_space > + || pin) > return linux_memspace_alloc (memspace, size, pin); > else > return calloc (1, size); ..., here, we've got a duplicated (and thus always-false) expression 'memspace =3D=3D ompx_unified_shared_mem_space' (..., which '-Wduplicated-cond' fails to report; "'-Wduplicated-cond' doesn't diagnose duplicated subexpressions"...). Is the correct fix the following (conceptually like 'linux_memspace_alloc' cited above), or is there something that I fail to understand? static void * linux_memspace_calloc (omp_memspace_handle_t memspace, size_t size, in= t pin) { if (memspace =3D=3D ompx_unified_shared_mem_space) { void *ret =3D gomp_usm_alloc (size, GOMP_DEVICE_ICV); memset (ret, 0, size); return ret; } - else if (memspace =3D=3D ompx_unified_shared_mem_space - || pin) + else if (pin) return linux_memspace_alloc (memspace, size, pin); else return calloc (1, size); The following ones then again are conceptually like 'linux_memspace_alloc' cited above: > @@ -77,9 +86,9 @@ static void > linux_memspace_free (omp_memspace_handle_t memspace, void *addr, size_t = size, > int pin) > { > - (void)memspace; > - > - if (pin) > + if (memspace =3D=3D ompx_unified_shared_mem_space) > + gomp_usm_free (addr, GOMP_DEVICE_ICV); > + else if (pin) > munmap (addr, size); > else > free (addr); > @@ -89,7 +98,9 @@ static void * > linux_memspace_realloc (omp_memspace_handle_t memspace, void *addr, > size_t oldsize, size_t size, int oldpin, int pin) > { > - if (oldpin && pin) > + if (memspace =3D=3D ompx_unified_shared_mem_space) > + goto manual_realloc; > + else if (oldpin && pin) > { > void *newaddr =3D mremap (addr, oldsize, size, MREMAP_MAYMOVE); > if (newaddr =3D=3D MAP_FAILED) > @@ -98,18 +109,19 @@ linux_memspace_realloc (omp_memspace_handle_t memspa= ce, void *addr, > [...] ..., and similar those here: > --- a/libgomp/config/nvptx/allocator.c > +++ b/libgomp/config/nvptx/allocator.c > @@ -125,6 +125,8 @@ nvptx_memspace_alloc (omp_memspace_handle_t memspace,= size_t size) > __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RE= LEASE); > return result; > } > + else if (memspace =3D=3D ompx_host_mem_space) > + return NULL; > else > return malloc (size); > } > @@ -145,6 +147,8 @@ nvptx_memspace_calloc (omp_memspace_handle_t memspace= , size_t size) > > return result; > } > + else if (memspace =3D=3D ompx_host_mem_space) > + return NULL; > else > return calloc (1, size); > } > @@ -354,6 +358,8 @@ nvptx_memspace_realloc (omp_memspace_handle_t memspac= e, void *addr, > } > return result; > } > + else if (memspace =3D=3D ompx_host_mem_space) > + return NULL; > else > return realloc (addr, size); > } (I'd have added an explicit no-op (or, 'abort'?) to 'nvptx_memspace_free', but that's maybe just me...) ;-\ > --- a/libgomp/libgomp.h > +++ b/libgomp/libgomp.h > +extern void * gomp_usm_alloc (size_t size, int device_num); > +extern void gomp_usm_free (void *device_ptr, int device_num); > +extern bool gomp_is_usm_ptr (void *ptr); 'gomp_is_usm_ptr' isn't defined/used anywhere; I'll remove it. > --- a/libgomp/target.c > +++ b/libgomp/target.c > @@ -3740,6 +3807,9 @@ gomp_load_plugin_for_device (struct gomp_device_des= cr *device, > DLSYM (unload_image); > DLSYM (alloc); > DLSYM (free); > + DLSYM_OPT (usm_alloc, usm_alloc); > + DLSYM_OPT (usm_free, usm_free); > + DLSYM_OPT (is_usm_ptr, is_usm_ptr); > DLSYM (dev2host); > DLSYM (host2dev); As a sanity check, shouldn't we check that either none or all three of those are defined, like in the 'if (cuda && cuda !=3D 4) { [error] }' check a bit further down? Note that these remarks likewise apply to the current upstream submission: "openmp, nvptx: ompx_unified_shared_mem_alloc". Gr=C3=BC=C3=9Fe Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstra=C3=9Fe 201= , 80634 M=C3=BCnchen; Gesellschaft mit beschr=C3=A4nkter Haftung; Gesch=C3= =A4ftsf=C3=BChrer: Thomas Heurung, Frank Th=C3=BCrauf; Sitz der Gesellschaf= t: M=C3=BCnchen; Registergericht M=C3=BCnchen, HRB 106955