From: Tobias Burnus <tobias@codesourcery.com>
To: Thomas Schwinge <thomas@codesourcery.com>,
Andrew Stubbs <ams@codesourcery.com>,
Jakub Jelinek <jakub@redhat.com>
Cc: <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] libgomp, openmp: pinned memory
Date: Thu, 9 Jun 2022 12:09:52 +0200
Message-ID: <8c95bbcf-7a74-738d-ffc2-4cae606aac62@codesourcery.com>
In-Reply-To: <87edzy5g8h.fsf@euler.schwinge.homeip.net>
On 09.06.22 11:38, Thomas Schwinge wrote:
> On 2022-06-07T13:28:33+0100, Andrew Stubbs <ams@codesourcery.com> wrote:
>> On 07/06/2022 13:10, Jakub Jelinek wrote:
>>> On Tue, Jun 07, 2022 at 12:05:40PM +0100, Andrew Stubbs wrote:
>>>> The memory pinned via the mlock call does not give the expected performance
>>>> boost. I had not expected that it would do much in my test setup, given that
>>>> the machine has a lot of RAM and my benchmarks are small, but others have
>>>> tried more and on varying machines and architectures.
>>> I don't understand why there should be any expected performance boost (at
>>> least not unless the machine starts swapping out pages),
>>> { omp_atk_pinned, true } is solely about the requirement that the memory
>>> can't be swapped out.
>> It seems like it takes a faster path through the NVidia drivers. [...]
I think this conflates two parts:
* User-defined allocators in general: for those, CUDA does not make much
sense, and without unified-shared memory the memory will always be
inaccessible on the device (w/o explicit/implicit mapping).
* Memory which is supposed to be accessible both on the host and on the
device. That is most obvious when explicitly allocating memory to be
accessible on both. It is less clear-cut when just creating an allocator
with unified-shared memory, as it is not clear when the memory is only
used on the host (e.g. with host-based thread parallelization) and when
it is also relevant for the device.
Currently, the user has no means to express the intent that it should be
accessible on both the host and one/several devices, except for 'omp
requires unified_shared_memory'.
The next OpenMP version will likely provide a means to create an
allocator that expresses this intent; see
https://github.com/OpenMP/spec/issues/1843 (not publicly available;
the slides in the last comment are slightly outdated).
* * *
The only open question is what to do with 'requires unified_shared_memory'
combined with a non-multi-device allocator.
Probably: unified_shared_memory or no nvptx device: just use mlock.
Otherwise (i.e. both an nvptx device and (unified_shared_memory or a
multi-device allocator)), use the CUDA one.
For the latter, I think Thomas' remarks are helpful.
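To make the proposed rule concrete, here is a minimal C sketch of the
selection logic. It is not libgomp code: the enum, the function name and
its three parameters are hypothetical and only serve as illustration. The
"Probably: ..." sentence can be read in more than one way, so the sketch
encodes only the parenthesised condition (nvptx device AND
(unified_shared_memory OR multi-device allocator)) and treats everything
else as the mlock case.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; none of these exist in libgomp.  */
enum pin_method
{
  PIN_MLOCK, /* lock the pages on the host with mlock(2) */
  PIN_CUDA   /* obtain the pinned memory through the CUDA driver */
};

/* Choose how to satisfy { omp_atk_pinned, true } for one allocator.

   have_nvptx_device:      an nvptx offload device is available
   unified_shared_memory:  'omp requires unified_shared_memory' is in effect
   multi_device_allocator: the allocator was created as accessible on both
                           host and device(s) (assumed future OpenMP feature)  */
enum pin_method
choose_pin_method (bool have_nvptx_device, bool unified_shared_memory,
                   bool multi_device_allocator)
{
  /* CUDA path only when a device exists and the memory is meant to be
     shared with it.  */
  if (have_nvptx_device
      && (unified_shared_memory || multi_device_allocator))
    return PIN_CUDA;

  /* Host-only use, or no nvptx device at all: plain mlock is enough.  */
  return PIN_MLOCK;
}
```

With this reading, a host-only allocator on a machine with an nvptx
device still takes the mlock path, and a machine without an nvptx device
never takes the CUDA path regardless of the other two flags.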
Tobias