Use CUDA to pin memory, instead of Linux mlock, when available.

There are two advantages: firstly, this gives a significant speed boost for
NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit
setting.

The design adds a device-independent plugin API for allocating pinned memory,
and then implements it for NVPTX.  At present, the other supported devices do
not have equivalent capabilities (or requirements).

libgomp/ChangeLog:

	* config/linux/allocator.c: Include assert.h.
	(using_device_for_page_locked): New variable.
	(linux_memspace_alloc): Add init0 parameter.  Support device pinning.
	(linux_memspace_calloc): Set init0 to true.
	(linux_memspace_free): Support device pinning.
	(linux_memspace_realloc): Support device pinning.
	(MEMSPACE_ALLOC): Set init0 to false.
	* libgomp-plugin.h (GOMP_OFFLOAD_page_locked_host_alloc): New prototype.
	(GOMP_OFFLOAD_page_locked_host_free): Likewise.
	* libgomp.h (gomp_page_locked_host_alloc): Likewise.
	(gomp_page_locked_host_free): Likewise.
	(struct gomp_device_descr): Add page_locked_host_alloc_func and
	page_locked_host_free_func.
	* libgomp.texi: Adjust the docs for the pinned trait.
	* libgomp_g.h (GOMP_enable_pinned_mode): New prototype.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_page_locked_host_alloc): New function.
	(GOMP_OFFLOAD_page_locked_host_free): Likewise.
	* target.c (device_for_page_locked): New variable.
	(get_device_for_page_locked): New function.
	(gomp_page_locked_host_alloc): Likewise.
	(gomp_page_locked_host_free): Likewise.
	(gomp_load_plugin_for_device): Add page_locked_host_alloc and
	page_locked_host_free.
	* testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX
	devices.
	* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-6.c: Likewise.

Co-Authored-By: Thomas Schwinge
---
 libgomp/config/linux/allocator.c             | 137 ++++++++++++++-----
 libgomp/libgomp-plugin.h                     |   2 +
 libgomp/libgomp.h                            |   4 +
 libgomp/libgomp.texi                         |  11 +-
 libgomp/libgomp_g.h                          |   1 +
 libgomp/plugin/plugin-nvptx.c                |  42 ++++++
 libgomp/target.c                             | 136 ++++++++++++++++++
 libgomp/testsuite/libgomp.c/alloc-pinned-1.c |  26 ++++
 libgomp/testsuite/libgomp.c/alloc-pinned-2.c |  26 ++++
 libgomp/testsuite/libgomp.c/alloc-pinned-3.c |  45 +++++-
 libgomp/testsuite/libgomp.c/alloc-pinned-4.c |  44 +++++-
 libgomp/testsuite/libgomp.c/alloc-pinned-5.c |  26 ++++
 libgomp/testsuite/libgomp.c/alloc-pinned-6.c |  35 ++++-
 13 files changed, 487 insertions(+), 48 deletions(-)
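
Reviewer note: the control flow described in the commit message (use a
device hook for page-locked host memory when one exists, otherwise fall back
to mmap plus mlock) might be sketched roughly as below.  This is not the code
in allocator.c or target.c.  Every type and function name here (pin_device,
pinned_block, pinned_alloc, pinned_free, fake_device_alloc, fake_device_free)
is hypothetical, and the hook signatures are assumptions rather than the real
GOMP_OFFLOAD_page_locked_host_alloc prototype.  The fake device uses malloc as
a stand-in for a CUDA call.  The init0 flag is modelled on the ChangeLog
entry; presumably it exists because anonymous mmap memory is already
zero-filled whereas device-pinned memory need not be.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* For MAP_ANONYMOUS under strict modes.  */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical device hooks; NOT libgomp's real plugin signatures.  */
struct pin_device
{
  bool (*alloc_fn) (void **ptr, size_t size);	/* NULL: cannot pin.  */
  bool (*free_fn) (void *ptr);
};

/* Record how a block was pinned so it is released the same way.  */
enum pin_source { PIN_NONE, PIN_DEVICE, PIN_MLOCK };

struct pinned_block
{
  void *ptr;
  size_t size;
  enum pin_source source;
};

/* Stand-in for a device allocator: returns deliberately non-zero memory.  */
static bool
fake_device_alloc (void **ptr, size_t size)
{
  *ptr = malloc (size);
  if (*ptr)
    memset (*ptr, 0xff, size);
  return *ptr != NULL;
}

static bool
fake_device_free (void *ptr)
{
  free (ptr);
  return true;
}

static struct pinned_block
pinned_alloc (const struct pin_device *dev, size_t size, bool init0)
{
  struct pinned_block b = { NULL, size, PIN_NONE };
  assert (size > 0);

  if (dev && dev->alloc_fn)
    {
      /* Device path: no mlock, so RLIMIT_MEMLOCK is not consulted.  */
      if (dev->alloc_fn (&b.ptr, size))
	{
	  if (init0)
	    memset (b.ptr, 0, size);	/* Device memory may be dirty.  */
	  b.source = PIN_DEVICE;
	}
      else
	b.ptr = NULL;
      return b;
    }

  /* Fallback path: anonymous mmap is zero-filled, so init0 is a no-op.  */
  void *p = mmap (NULL, size, PROT_READ | PROT_WRITE,
		  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return b;
  if (mlock (p, size) != 0)
    {
      munmap (p, size);		/* Typically the rlimit was exceeded.  */
      return b;
    }
  b.ptr = p;
  b.source = PIN_MLOCK;
  return b;
}

static bool
pinned_free (const struct pin_device *dev, struct pinned_block *b)
{
  bool ok = true;
  if (b->source == PIN_DEVICE)
    ok = dev && dev->free_fn && dev->free_fn (b->ptr);
  else if (b->source == PIN_MLOCK)
    ok = munmap (b->ptr, b->size) == 0;	/* munmap also drops the lock.  */
  b->ptr = NULL;
  b->source = PIN_NONE;
  return ok;
}
```

Tracking the source per block is a choice made for this sketch only.  The
patch itself lists a single using_device_for_page_locked variable in
allocator.c, which suggests the real decision is made once rather than per
allocation.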