* [Patch] libgomp: Fix declare target link with offset array-section mapping [PR116107]
@ 2024-07-26 18:05 Tobias Burnus
2024-07-29 8:18 ` Jakub Jelinek
2024-08-07 8:46 ` Thomas Schwinge
0 siblings, 2 replies; 4+ messages in thread
From: Tobias Burnus @ 2024-07-26 18:05 UTC (permalink / raw)
To: gcc-patches, Jakub Jelinek
[-- Attachment #1.1: Type: text/plain, Size: 402 bytes --]
The main idea of 'link' is to permit putting only a subset of a
huge array on the device. Well, in order to make this work properly,
it requires that one can map an array section, which does not
start with the first element.
This patch adjusts the pointers such, that this actually works.
(Tested on x86-64-gnu-linux with Nvptx offloading.)
Comments, suggestions, remarks before I commit it?
Tobias
[-- Attachment #2: fix-link-offset.diff --]
[-- Type: text/x-patch, Size: 3359 bytes --]
libgomp: Fix declare target link with offset array-section mapping [PR116107]
Assume that 'int var[100]' is 'omp declare target link(var)'. When now
mapping an array section with offset such as 'map(to:var[20:10])',
the device-side link pointer has to store &<device-storage-data>[0] minus
the offset such that var[20] will access <device-storage-data>[0]. But
the offset calculation was missed such that the device-side 'var' pointed
to the first element of the mapped data - and var[20] points beyond at
some invalid memory.
PR middle-end/116107
libgomp/ChangeLog:
* target.c (gomp_map_vars_internal): Honor array mapping offsets
with declare-target 'link' variables.
* testsuite/libgomp.c-c++-common/target-link-2.c: New test.
libgomp/target.c | 7 ++-
.../testsuite/libgomp.c-c++-common/target-link-2.c | 59 ++++++++++++++++++++++
2 files changed, 64 insertions(+), 2 deletions(-)
diff --git a/libgomp/target.c b/libgomp/target.c
index aa01c1367b9..e3e648f5443 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1820,8 +1820,11 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
if (k->aux && k->aux->link_key)
{
/* Set link pointer on target to the device address of the
- mapped object. */
- void *tgt_addr = (void *) (tgt->tgt_start + k->tgt_offset);
+ mapped object. Also deal with offsets due to
+ array-section mapping. */
+ void *tgt_addr = (void *) (tgt->tgt_start + k->tgt_offset
+ - (k->host_start
+ - k->aux->link_key->host_start));
/* We intentionally do not use coalescing here, as it's not
data allocated by the current call to this function. */
gomp_copy_host2dev (devicep, aq, (void *) n->tgt_offset,
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-link-2.c b/libgomp/testsuite/libgomp.c-c++-common/target-link-2.c
new file mode 100644
index 00000000000..4ff4080da76
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-link-2.c
@@ -0,0 +1,59 @@
+/* PR middle-end/116107 */
+
+#include <omp.h>
+
+int arr[15] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+#pragma omp declare target link(arr)
+
+#pragma omp begin declare target
+void f(int *res)
+{
+ __builtin_memcpy (res, &arr[5], sizeof(int)*10);
+}
+
+void g(int *res)
+{
+ __builtin_memcpy (res, &arr[3], sizeof(int)*10);
+}
+#pragma omp end declare target
+
+int main()
+{
+ int res[10], res2;
+ for (int dev = 0; dev < omp_get_num_devices(); dev++)
+ {
+ __builtin_memset (res, 0, sizeof (res));
+ res2 = 99;
+
+ #pragma omp target enter data map(arr[5:10]) device(dev)
+
+ #pragma omp target map(from: res) device(dev)
+ f (res);
+
+ #pragma omp target map(from: res2) device(dev)
+ res2 = arr[5];
+
+ if (res2 != 6)
+ __builtin_abort ();
+ for (int i = 0; i < 10; i++)
+ if (res[i] != 6 + i)
+ __builtin_abort ();
+
+ #pragma omp target exit data map(release:arr[5:10]) device(dev)
+
+ for (int i = 0; i < 15; i++)
+ res[i] *= 10;
+ __builtin_abort ();
+
+ #pragma omp target enter data map(arr[3:10]) device(dev)
+ __builtin_memset (res, 0, sizeof (res));
+
+ #pragma omp target map(from: res) device(dev)
+ g (res);
+
+ for (int i = 0; i < 10; i++)
+ if (res[i] != (4 + i)*10)
+ __builtin_abort ();
+ }
+ return 0;
+}
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Patch] libgomp: Fix declare target link with offset array-section mapping [PR116107]
2024-07-26 18:05 [Patch] libgomp: Fix declare target link with offset array-section mapping [PR116107] Tobias Burnus
@ 2024-07-29 8:18 ` Jakub Jelinek
2024-08-07 8:46 ` Thomas Schwinge
1 sibling, 0 replies; 4+ messages in thread
From: Jakub Jelinek @ 2024-07-29 8:18 UTC (permalink / raw)
To: Tobias Burnus; +Cc: gcc-patches
On Fri, Jul 26, 2024 at 08:05:43PM +0200, Tobias Burnus wrote:
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -1820,8 +1820,11 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
> if (k->aux && k->aux->link_key)
> {
> /* Set link pointer on target to the device address of the
> - mapped object. */
> - void *tgt_addr = (void *) (tgt->tgt_start + k->tgt_offset);
> + mapped object. Also deal with offsets due to
> + array-section mapping. */
Formatting. Two spaces after . in both spots.
> + void *tgt_addr = (void *) (tgt->tgt_start + k->tgt_offset
> + - (k->host_start
> + - k->aux->link_key->host_start));
Otherwise LGTM.
Jakub
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Patch] libgomp: Fix declare target link with offset array-section mapping [PR116107]
2024-07-26 18:05 [Patch] libgomp: Fix declare target link with offset array-section mapping [PR116107] Tobias Burnus
2024-07-29 8:18 ` Jakub Jelinek
@ 2024-08-07 8:46 ` Thomas Schwinge
2024-08-07 16:07 ` [committed] libgomp.c-c++-common/target-link-2.c: Fix test on multi-device systems (was: Re: [Patch] libgomp: Fix declare target link with offset array-section mapping [PR116107]) Tobias Burnus
1 sibling, 1 reply; 4+ messages in thread
From: Thomas Schwinge @ 2024-08-07 8:46 UTC (permalink / raw)
To: Tobias Burnus; +Cc: gcc-patches, Jakub Jelinek
Hi Tobias!
On 2024-07-26T20:05:43+0200, Tobias Burnus <tburnus@baylibre.com> wrote:
> The main idea of 'link' is to permit putting only a subset of a
> huge array on the device. Well, in order to make this work properly,
> it requires that one can map an array section, which does not
> start with the first element.
>
> This patch adjusts the pointers such, that this actually works.
>
> (Tested on x86-64-gnu-linux with Nvptx offloading.)
> Comments, suggestions, remarks before I commit it?
> libgomp: Fix declare target link with offset array-section mapping [PR116107]
>
> Assume that 'int var[100]' is 'omp declare target link(var)'. When now
> mapping an array section with offset such as 'map(to:var[20:10])',
> the device-side link pointer has to store &<device-storage-data>[0] minus
> the offset such that var[20] will access <device-storage-data>[0]. But
> the offset calculation was missed such that the device-side 'var' pointed
> to the first element of the mapped data - and var[20] points beyond at
> some invalid memory.
>
> PR middle-end/116107
>
> libgomp/ChangeLog:
>
> * target.c (gomp_map_vars_internal): Honor array mapping offsets
> with declare-target 'link' variables.
> * testsuite/libgomp.c-c++-common/target-link-2.c: New test.
>
> libgomp/target.c | 7 ++-
> .../testsuite/libgomp.c-c++-common/target-link-2.c | 59 ++++++++++++++++++++++
> 2 files changed, 64 insertions(+), 2 deletions(-)
The new test case 'libgomp.c-c++-common/target-link-2.c' generally PASSes
on one-GPU systems, but on a multi-GPU system (tested nvidia5):
$ nvidia-smi -L
GPU 0: Tesla K80 (UUID: [...])
GPU 1: Tesla K80 (UUID: [...])
..., I see:
+PASS: libgomp.c/../libgomp.c-c++-common/target-link-2.c (test for excess errors)
+FAIL: libgomp.c/../libgomp.c-c++-common/target-link-2.c execution test
+PASS: libgomp.c++/../libgomp.c-c++-common/target-link-2.c (test for excess errors)
+FAIL: libgomp.c++/../libgomp.c-c++-common/target-link-2.c execution test
[...]
#2 0x00007ffff7b548fc in __GI_abort () at abort.c:79
#3 0x0000000010000bd4 in main () at [...]/libgomp.c-c++-common/target-link-2.c:38
(gdb) frame 3
#3 0x0000000010000bd4 in main () at [...]/libgomp.c-c++-common/target-link-2.c:38
38 __builtin_abort ();
(gdb) list
33
34 #pragma omp target map(from: res2) device(dev)
35 res2 = arr[5];
36
37 if (res2 != 6)
38 __builtin_abort ();
[...]
(gdb) print res2
$1 = 60
I first thought that maybe just:
--- libgomp/testsuite/libgomp.c-c++-common/target-link-2.c
+++ libgomp/testsuite/libgomp.c-c++-common/target-link-2.c
@@ -54,6 +54,8 @@ int main()
for (int i = 0; i < 10; i++)
if (res[i] != (4 + i)*10)
__builtin_abort ();
+
+ #pragma omp target exit data map(release:arr[3:10]) device(dev)
}
return 0;
}
... was missing, but that doesn't resolve the issue: same error state.
Could you please have a look what other state needs to be reset, in which
way?
Grüße
Thomas
> diff --git a/libgomp/target.c b/libgomp/target.c
> index aa01c1367b9..e3e648f5443 100644
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -1820,8 +1820,11 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
> if (k->aux && k->aux->link_key)
> {
> /* Set link pointer on target to the device address of the
> - mapped object. */
> - void *tgt_addr = (void *) (tgt->tgt_start + k->tgt_offset);
> + mapped object. Also deal with offsets due to
> + array-section mapping. */
> + void *tgt_addr = (void *) (tgt->tgt_start + k->tgt_offset
> + - (k->host_start
> + - k->aux->link_key->host_start));
> /* We intentionally do not use coalescing here, as it's not
> data allocated by the current call to this function. */
> gomp_copy_host2dev (devicep, aq, (void *) n->tgt_offset,
> diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-link-2.c b/libgomp/testsuite/libgomp.c-c++-common/target-link-2.c
> new file mode 100644
> index 00000000000..4ff4080da76
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.c-c++-common/target-link-2.c
> @@ -0,0 +1,59 @@
> +/* PR middle-end/116107 */
> +
> +#include <omp.h>
> +
> +int arr[15] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
> +#pragma omp declare target link(arr)
> +
> +#pragma omp begin declare target
> +void f(int *res)
> +{
> + __builtin_memcpy (res, &arr[5], sizeof(int)*10);
> +}
> +
> +void g(int *res)
> +{
> + __builtin_memcpy (res, &arr[3], sizeof(int)*10);
> +}
> +#pragma omp end declare target
> +
> +int main()
> +{
> + int res[10], res2;
> + for (int dev = 0; dev < omp_get_num_devices(); dev++)
> + {
> + __builtin_memset (res, 0, sizeof (res));
> + res2 = 99;
> +
> + #pragma omp target enter data map(arr[5:10]) device(dev)
> +
> + #pragma omp target map(from: res) device(dev)
> + f (res);
> +
> + #pragma omp target map(from: res2) device(dev)
> + res2 = arr[5];
> +
> + if (res2 != 6)
> + __builtin_abort ();
> + for (int i = 0; i < 10; i++)
> + if (res[i] != 6 + i)
> + __builtin_abort ();
> +
> + #pragma omp target exit data map(release:arr[5:10]) device(dev)
> +
> + for (int i = 0; i < 15; i++)
> + res[i] *= 10;
> + __builtin_abort ();
> +
> + #pragma omp target enter data map(arr[3:10]) device(dev)
> + __builtin_memset (res, 0, sizeof (res));
> +
> + #pragma omp target map(from: res) device(dev)
> + g (res);
> +
> + for (int i = 0; i < 10; i++)
> + if (res[i] != (4 + i)*10)
> + __builtin_abort ();
> + }
> + return 0;
> +}
^ permalink raw reply [flat|nested] 4+ messages in thread
* [committed] libgomp.c-c++-common/target-link-2.c: Fix test on multi-device systems (was: Re: [Patch] libgomp: Fix declare target link with offset array-section mapping [PR116107])
2024-08-07 8:46 ` Thomas Schwinge
@ 2024-08-07 16:07 ` Tobias Burnus
0 siblings, 0 replies; 4+ messages in thread
From: Tobias Burnus @ 2024-08-07 16:07 UTC (permalink / raw)
To: Thomas Schwinge; +Cc: gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 977 bytes --]
Hi Thomas,
Thomas Schwinge wrote:
> The new test case 'libgomp.c-c++-common/target-link-2.c' generally PASSes
> on one-GPU systems, but on a multi-GPU system (tested nvidia5):
After having debugged it, it became glaringly obvious, but could
otherwise be missed …
The testcase checks that mapping an array – and then remapping a
different stride works, but to see that it was really remapped, it
changed the host value before.
The issue was that it has to be changed back to the original value for
the next device as the value checks expect always the same value.
Committed as r15-2796-gaa689684d2bf58.
Thanks for the report!
Tobias
PS:
> I first thought that maybe just:
>
> + #pragma omp target exit data map(release:arr[3:10]) device(dev)
I was (and still am) torn between adding it (cleaner) or keeping it, as
both have some merits for testing - and haven't cleaned up after the
remapping. In any case, either testcase is fine (and should work).
[-- Attachment #2: committed.diff --]
[-- Type: text/x-patch, Size: 959 bytes --]
commit aa689684d2bf58d1b7e7938a1392e7a260276d14
Author: Tobias Burnus <tburnus@baylibre.com>
Date: Wed Aug 7 17:59:21 2024 +0200
libgomp.c-c++-common/target-link-2.c: Fix test on multi-device systems
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/target-link-2.c: Reset variable
value to handle multi-device tests.
---
libgomp/testsuite/libgomp.c-c++-common/target-link-2.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-link-2.c b/libgomp/testsuite/libgomp.c-c++-common/target-link-2.c
index 15da1656ebf..b64fbde70e3 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/target-link-2.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-link-2.c
@@ -54,6 +54,9 @@ int main()
for (int i = 0; i < 10; i++)
if (res[i] != (4 + i)*10)
__builtin_abort ();
+
+ for (int i = 0; i < 15; i++) /* Reset. */
+ arr[i] /= 10;
}
return 0;
}
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-08-07 16:07 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-07-26 18:05 [Patch] libgomp: Fix declare target link with offset array-section mapping [PR116107] Tobias Burnus
2024-07-29 8:18 ` Jakub Jelinek
2024-08-07 8:46 ` Thomas Schwinge
2024-08-07 16:07 ` [committed] libgomp.c-c++-common/target-link-2.c: Fix test on multi-device systems (was: Re: [Patch] libgomp: Fix declare target link with offset array-section mapping [PR116107]) Tobias Burnus
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).