On Wed, 27 Sep 2023 00:57:58 +0200
Thomas Schwinge wrote:

> On 2023-09-06T02:34:30-0700, Julian Brown wrote:
> > This patch works around behaviour of the 2D and 3D memcpy
> > operations in the CUDA driver runtime.  Particularly in Fortran,
> > the "base pointer" of an array (used for either source or
> > destination of a host/device copy) may lie outside of data that is
> > actually stored on the device.  The fix is to make sure that we use
> > the first element of data to be transferred instead, and adjust
> > parameters accordingly.
>
> Do you (a) have a stand-alone test case for this (that is, not
> depending on your other pending patches, so that this could go in
> directly -- together with the before-FAIL test case).

Thanks for the reply!  Here's a version with a stand-alone test case.

> Do you (b)
> know if is this a bug in our use of the CUDA Driver API or rather in
> CUDA itself?  If the latter, have you reported this to Nvidia?

I don't think the CUDA behaviour is *wrong*, as such -- at least to the
C/C++ way of thinking (or indeed a graphics-oriented way of thinking),
one would normally think of an array as having a zero-based origin, and
these 2D/3D memory copies would be intended as a way of updating just a
part of an array (or texture) that has full duplicate copies on both
the host and device.

Our use-case just happens to be a bit different, both because Fortran
(internally) represents an array by a zero-based origin but may use
1-based (or whatever-based) indices, and because we support partial
mappings of host arrays on the device in all three supported languages
-- which amounts to much the same thing, actually.

That said, it *could* be fixed in CUDA, though probably not in all the
versions currently deployed out there in the world.  So I guess we'd
still need a patch like this anyway.

Julian
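
For illustration only, the adjustment described in the quoted patch
summary (point at the first element actually transferred, then adjust
the remaining parameters) might be sketched in C roughly as below.  This
is not the code from the patch: struct copy2d_side and
rebase_to_first_element are hypothetical names, and the fields only
loosely mirror the XInBytes/Y/Pitch members of the CUDA driver's
CUDA_MEMCPY2D descriptor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one side (source or destination) of a
   2D copy.  Not a libgomp or CUDA type.  */
struct copy2d_side
{
  char *base;        /* Array origin; may lie outside the mapped data.  */
  size_t pitch;      /* Bytes per row of the full array.  */
  size_t x_in_bytes; /* Byte offset of the copied region within a row.  */
  size_t y;          /* Row offset of the copied region.  */
};

/* Fold the X/Y offsets into the pointer so that it addresses the first
   element actually transferred, and zero the offsets.  The region
   described is unchanged; only its representation differs.  */
static void
rebase_to_first_element (struct copy2d_side *s)
{
  assert (s != NULL);
  s->base += s->y * s->pitch + s->x_in_bytes;
  s->x_in_bytes = 0;
  s->y = 0;
}
```

After such a step, the pointer handed to the copy routine refers to
data that is really present on the device, whatever the nominal array
origin was.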