* [Patch][1/3] libgomp: Prepare for reverse offload fn lookup
@ 2022-08-25 14:54 Tobias Burnus
2022-08-25 14:54 ` Tobias Burnus
` (3 more replies)
0 siblings, 4 replies; 15+ messages in thread
From: Tobias Burnus @ 2022-08-25 14:54 UTC (permalink / raw)
To: gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 981 bytes --]
Technically, this patch is stand alone, but conceptually it based on the
submitted but not reviewed patch:
"[Patch] OpenMP: Support reverse offload (middle end part)"
https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598662.html
With that patch, for reverse offloads ('omp target device(ancestor:1)'),
calls like the following are added:
GOMP_target_ext (-2 /* initial device */, omp_fn.1
where 'omp_fn.1' on nonhost devices a stub function just required for
looking up the host function pointer via the offload_funcs table.
The attached patch prepare for reverse-offload device->host
function-address lookup by requesting (if needed) the on-device address.
OK for mainline?
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Attachment #2: fn-lookup-gen.diff --]
[-- Type: text/x-patch, Size: 5324 bytes --]
libgomp: Prepare for reverse offload fn lookup
Prepare for reverse-offloading function-pointer lookup by passing
a rev_fn_table argument to GOMP_OFFLOAD_load_image.
The argument will be NULL, unless GOMP_REQUIRES_REVERSE_OFFLOAD is
requested and devices not supported it, are filtered out.
(Up to and including this commit, no non-host device claims such
support and the caller currently always passes NULL.)
libgomp/ChangeLog:
* libgomp-plugin.h (GOMP_OFFLOAD_load_image): Add
'uint64_t **rev_fn_table' argument.
* oacc-host.c (host_load_image): Likewise.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Likewise;
currently unused.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise.
* target.c (gomp_load_image_to_device): Update call but pass
NULL for now.
liboffloadmic/ChangeLog:
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_load_image):
Add (unused) uint64_t **rev_fn_table argument.
libgomp/libgomp-plugin.h | 2 +-
libgomp/oacc-host.c | 3 ++-
libgomp/plugin/plugin-gcn.c | 7 +++++--
libgomp/plugin/plugin-nvptx.c | 7 +++++--
libgomp/target.c | 2 +-
liboffloadmic/plugin/libgomp-plugin-intelmic.cpp | 3 ++-
6 files changed, 16 insertions(+), 8 deletions(-)
diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h
index ab3ed638475..57309303e66 100644
--- a/libgomp/libgomp-plugin.h
+++ b/libgomp/libgomp-plugin.h
@@ -130,7 +130,7 @@ extern bool GOMP_OFFLOAD_init_device (int);
extern bool GOMP_OFFLOAD_fini_device (int);
extern unsigned GOMP_OFFLOAD_version (void);
extern int GOMP_OFFLOAD_load_image (int, unsigned, const void *,
- struct addr_pair **);
+ struct addr_pair **, uint64_t **);
extern bool GOMP_OFFLOAD_unload_image (int, unsigned, const void *);
extern void *GOMP_OFFLOAD_alloc (int, size_t);
extern bool GOMP_OFFLOAD_free (int, void *);
diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c
index eb11b9cf16a..4e3971ae1a9 100644
--- a/libgomp/oacc-host.c
+++ b/libgomp/oacc-host.c
@@ -81,7 +81,8 @@ static int
host_load_image (int n __attribute__ ((unused)),
unsigned v __attribute__ ((unused)),
const void *t __attribute__ ((unused)),
- struct addr_pair **r __attribute__ ((unused)))
+ struct addr_pair **r __attribute__ ((unused)),
+ uint64_t **f __attribute__ ((unused)))
{
return 0;
}
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index ea327bf2ca0..363e2950649 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -3346,11 +3346,14 @@ GOMP_OFFLOAD_init_device (int n)
/* Load GCN object-code module described by struct gcn_image_desc in
TARGET_DATA and return references to kernel descriptors in TARGET_TABLE.
- If there are any constructors then run them. */
+ If there are any constructors then run them. If not NULL, REV_FN_TABLE will
+ contain the on-device addresses of the functions for reverse offload. To be
+ freed by the caller. */
int
GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
- struct addr_pair **target_table)
+ struct addr_pair **target_table,
+ uint64_t **rev_fn_table __attribute__((unused)))
{
if (GOMP_VERSION_DEV (version) != GOMP_VERSION_GCN)
{
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index bc63e274cdf..d130665ed19 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1266,11 +1266,14 @@ nvptx_set_clocktick (CUmodule module, struct ptx_device *dev)
}
/* Load the (partial) program described by TARGET_DATA to device
- number ORD. Allocate and return TARGET_TABLE. */
+ number ORD. Allocate and return TARGET_TABLE. If not NULL, REV_FN_TABLE
+ will contain the on-device addresses of the functions for reverse offload.
+ To be freed by the caller. */
int
GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
- struct addr_pair **target_table)
+ struct addr_pair **target_table,
+ uint64_t **rev_fn_table __attribute__((unused)))
{
CUmodule module;
const char *const *var_names;
diff --git a/libgomp/target.c b/libgomp/target.c
index 135db1d88ab..da907a2fb48 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -2137,7 +2137,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
num_target_entries
= devicep->load_image_func (devicep->target_id, version,
- target_data, &target_table);
+ target_data, &target_table, NULL);
if (num_target_entries != num_funcs + num_vars
/* Others (device_num) are included as trailing entries in pair list. */
diff --git a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp b/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
index 33bae0650b4..7be27f0459d 100644
--- a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
+++ b/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
@@ -349,7 +349,8 @@ GOMP_OFFLOAD_version (void)
extern "C" int
GOMP_OFFLOAD_load_image (int device, const unsigned version,
- const void *target_image, addr_pair **result)
+ const void *target_image, addr_pair **result,
+ uint64_t ** /* rev_fn_table */)
{
TRACE ("(device = %d, target_image = %p)", device, target_image);
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Patch][1/3] libgomp: Prepare for reverse offload fn lookup
2022-08-25 14:54 [Patch][1/3] libgomp: Prepare for reverse offload fn lookup Tobias Burnus
@ 2022-08-25 14:54 ` Tobias Burnus
2022-08-25 15:38 ` [Patch][2/3] GCN: libgomp+mkoffload.cc: " Tobias Burnus
` (2 subsequent siblings)
3 siblings, 0 replies; 15+ messages in thread
From: Tobias Burnus @ 2022-08-25 14:54 UTC (permalink / raw)
To: gcc-patches, Jakub Jelinek
[-- Attachment #1.1: Type: text/plain, Size: 981 bytes --]
Technically, this patch is stand alone, but conceptually it based on the
submitted but not reviewed patch:
"[Patch] OpenMP: Support reverse offload (middle end part)"
https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598662.html
With that patch, for reverse offloads ('omp target device(ancestor:1)'),
calls like the following are added:
GOMP_target_ext (-2 /* initial device */, omp_fn.1
where 'omp_fn.1' on nonhost devices a stub function just required for
looking up the host function pointer via the offload_funcs table.
The attached patch prepare for reverse-offload device->host
function-address lookup by requesting (if needed) the on-device address.
OK for mainline?
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Attachment #2: fn-lookup-gen.diff --]
[-- Type: text/x-patch, Size: 5324 bytes --]
libgomp: Prepare for reverse offload fn lookup
Prepare for reverse-offloading function-pointer lookup by passing
a rev_fn_table argument to GOMP_OFFLOAD_load_image.
The argument will be NULL, unless GOMP_REQUIRES_REVERSE_OFFLOAD is
requested and devices not supported it, are filtered out.
(Up to and including this commit, no non-host device claims such
support and the caller currently always passes NULL.)
libgomp/ChangeLog:
* libgomp-plugin.h (GOMP_OFFLOAD_load_image): Add
'uint64_t **rev_fn_table' argument.
* oacc-host.c (host_load_image): Likewise.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Likewise;
currently unused.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise.
* target.c (gomp_load_image_to_device): Update call but pass
NULL for now.
liboffloadmic/ChangeLog:
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_load_image):
Add (unused) uint64_t **rev_fn_table argument.
libgomp/libgomp-plugin.h | 2 +-
libgomp/oacc-host.c | 3 ++-
libgomp/plugin/plugin-gcn.c | 7 +++++--
libgomp/plugin/plugin-nvptx.c | 7 +++++--
libgomp/target.c | 2 +-
liboffloadmic/plugin/libgomp-plugin-intelmic.cpp | 3 ++-
6 files changed, 16 insertions(+), 8 deletions(-)
diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h
index ab3ed638475..57309303e66 100644
--- a/libgomp/libgomp-plugin.h
+++ b/libgomp/libgomp-plugin.h
@@ -130,7 +130,7 @@ extern bool GOMP_OFFLOAD_init_device (int);
extern bool GOMP_OFFLOAD_fini_device (int);
extern unsigned GOMP_OFFLOAD_version (void);
extern int GOMP_OFFLOAD_load_image (int, unsigned, const void *,
- struct addr_pair **);
+ struct addr_pair **, uint64_t **);
extern bool GOMP_OFFLOAD_unload_image (int, unsigned, const void *);
extern void *GOMP_OFFLOAD_alloc (int, size_t);
extern bool GOMP_OFFLOAD_free (int, void *);
diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c
index eb11b9cf16a..4e3971ae1a9 100644
--- a/libgomp/oacc-host.c
+++ b/libgomp/oacc-host.c
@@ -81,7 +81,8 @@ static int
host_load_image (int n __attribute__ ((unused)),
unsigned v __attribute__ ((unused)),
const void *t __attribute__ ((unused)),
- struct addr_pair **r __attribute__ ((unused)))
+ struct addr_pair **r __attribute__ ((unused)),
+ uint64_t **f __attribute__ ((unused)))
{
return 0;
}
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index ea327bf2ca0..363e2950649 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -3346,11 +3346,14 @@ GOMP_OFFLOAD_init_device (int n)
/* Load GCN object-code module described by struct gcn_image_desc in
TARGET_DATA and return references to kernel descriptors in TARGET_TABLE.
- If there are any constructors then run them. */
+ If there are any constructors then run them. If not NULL, REV_FN_TABLE will
+ contain the on-device addresses of the functions for reverse offload. To be
+ freed by the caller. */
int
GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
- struct addr_pair **target_table)
+ struct addr_pair **target_table,
+ uint64_t **rev_fn_table __attribute__((unused)))
{
if (GOMP_VERSION_DEV (version) != GOMP_VERSION_GCN)
{
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index bc63e274cdf..d130665ed19 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1266,11 +1266,14 @@ nvptx_set_clocktick (CUmodule module, struct ptx_device *dev)
}
/* Load the (partial) program described by TARGET_DATA to device
- number ORD. Allocate and return TARGET_TABLE. */
+ number ORD. Allocate and return TARGET_TABLE. If not NULL, REV_FN_TABLE
+ will contain the on-device addresses of the functions for reverse offload.
+ To be freed by the caller. */
int
GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
- struct addr_pair **target_table)
+ struct addr_pair **target_table,
+ uint64_t **rev_fn_table __attribute__((unused)))
{
CUmodule module;
const char *const *var_names;
diff --git a/libgomp/target.c b/libgomp/target.c
index 135db1d88ab..da907a2fb48 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -2137,7 +2137,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
num_target_entries
= devicep->load_image_func (devicep->target_id, version,
- target_data, &target_table);
+ target_data, &target_table, NULL);
if (num_target_entries != num_funcs + num_vars
/* Others (device_num) are included as trailing entries in pair list. */
diff --git a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp b/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
index 33bae0650b4..7be27f0459d 100644
--- a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
+++ b/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
@@ -349,7 +349,8 @@ GOMP_OFFLOAD_version (void)
extern "C" int
GOMP_OFFLOAD_load_image (int device, const unsigned version,
- const void *target_image, addr_pair **result)
+ const void *target_image, addr_pair **result,
+ uint64_t ** /* rev_fn_table */)
{
TRACE ("(device = %d, target_image = %p)", device, target_image);
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Patch][2/3] GCN: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
2022-08-25 14:54 [Patch][1/3] libgomp: Prepare for reverse offload fn lookup Tobias Burnus
2022-08-25 14:54 ` Tobias Burnus
@ 2022-08-25 15:38 ` Tobias Burnus
2022-08-25 15:38 ` Tobias Burnus
2022-09-09 15:31 ` Jakub Jelinek
2022-08-25 17:30 ` [Patch][2/3] nvptx: " Tobias Burnus
2022-09-09 15:29 ` [Patch][1/3] libgomp: Prepare for reverse offload fn lookup Jakub Jelinek
3 siblings, 2 replies; 15+ messages in thread
From: Tobias Burnus @ 2022-08-25 15:38 UTC (permalink / raw)
To: gcc-patches, Jakub Jelinek, Andrew Stubbs
[-- Attachment #1: Type: text/plain, Size: 864 bytes --]
On 25.08.22 16:54, Tobias Burnus wrote:
The attached patch prepare for reverse-offload device->host
function-address lookup by requesting (if needed) the on-device address.
This patch adds the actual implementation for GCN. A variant would be
to only generate .offload_func_table inside mkoffload when
OMP_REQUIRES_REVERSE_OFFLOAD has been requested.
This is currently effectively a no op as with [1/3] patch, always NULL
is passed and as GOMP_OFFLOAD_get_num_devices returns <= 0 as soon as
'omp requires reverse_offload' has been specified.
OK for mainline?
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Attachment #2: fn-lookup-gcn.diff --]
[-- Type: text/x-patch, Size: 4673 bytes --]
GCN: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
Add support to GCN for reverse lookup of function name to prepare for
'omp target device(ancestor:1)'.
gcc/ChangeLog:
* config/gcn/mkoffload.cc (process_asm): Create .offload_func_table,
similar to pre-existing .offload_var_table.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Read
.offload_func_table to populate rev_fn_table when requested.
gcc/config/gcn/mkoffload.cc | 11 ++++++++++-
libgomp/plugin/plugin-gcn.c | 26 +++++++++++++++++++++++++-
2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/gcc/config/gcn/mkoffload.cc b/gcc/config/gcn/mkoffload.cc
index 4206448703a..24d327355e3 100644
--- a/gcc/config/gcn/mkoffload.cc
+++ b/gcc/config/gcn/mkoffload.cc
@@ -537,63 +537,72 @@ process_asm (FILE *in, FILE *out, FILE *cfile)
case IN_VARS:
{
char *varname;
unsigned varsize;
if (sscanf (buf, " .8byte %ms\n", &varname))
{
fputs (buf, out);
fgets (buf, sizeof (buf), in);
if (!sscanf (buf, " .8byte %u\n", &varsize))
abort ();
var_count++;
}
break;
}
case IN_FUNCS:
{
char *funcname;
if (sscanf (buf, "\t.8byte\t%ms\n", &funcname))
{
+ fputs (buf, out);
obstack_ptr_grow (&fns_os, funcname);
fn_count++;
continue;
}
break;
}
}
char dummy;
if (sscanf (buf, " .section .gnu.offload_vars%c", &dummy) > 0)
{
state = IN_VARS;
/* Add a global symbol to allow plugin-gcn.c to locate the table
at runtime. It can't use the "offload_var_table.N" emitted by
the compiler because a) they're not global, and b) there's one
for each input file combined into the binary. */
fputs (buf, out);
fputs ("\t.global .offload_var_table\n"
"\t.type .offload_var_table, @object\n"
".offload_var_table:\n",
out);
}
else if (sscanf (buf, " .section .gnu.offload_funcs%c", &dummy) > 0)
- state = IN_FUNCS;
+ {
+ state = IN_FUNCS;
+ /* Likewise for .gnu.offload_vars; used for reverse offload. */
+ fputs (buf, out);
+ fputs ("\t.global .offload_func_table\n"
+ "\t.type .offload_func_table, @object\n"
+ ".offload_func_table:\n",
+ out);
+ }
else if (sscanf (buf, " .amdgpu_metadata%c", &dummy) > 0)
{
state = IN_METADATA;
regcount.kernel_name = NULL;
regcount.sgpr_count = regcount.vgpr_count = -1;
}
else if (sscanf (buf, " .section %c", &dummy) > 0
|| sscanf (buf, " .text%c", &dummy) > 0
|| sscanf (buf, " .bss%c", &dummy) > 0
|| sscanf (buf, " .data%c", &dummy) > 0
|| sscanf (buf, " .ident %c", &dummy) > 0)
state = IN_CODE;
else if (sscanf (buf, " .end_amdgpu_metadata%c", &dummy) > 0)
{
state = IN_CODE;
gcc_assert (regcount.kernel_name != NULL
&& regcount.sgpr_count >= 0
&& regcount.vgpr_count >= 0);
obstack_grow (®counts_os, ®count, sizeof (regcount));
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 363e2950649..00603981866 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -3353,7 +3353,7 @@ GOMP_OFFLOAD_init_device (int n)
int
GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
struct addr_pair **target_table,
- uint64_t **rev_fn_table __attribute__((unused)))
+ uint64_t **rev_fn_table)
{
if (GOMP_VERSION_DEV (version) != GOMP_VERSION_GCN)
{
@@ -3520,6 +3520,30 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
if (module->fini_array_func)
kernel_count--;
+ if (rev_fn_table != NULL && kernel_count == 0)
+ *rev_fn_table = NULL;
+ else if (rev_fn_table != NULL)
+ {
+ hsa_status_t status;
+ hsa_executable_symbol_t var_symbol;
+ status = hsa_fns.hsa_executable_get_symbol_fn (agent->executable, NULL,
+ ".offload_func_table",
+ agent->id, 0, &var_symbol);
+ if (status != HSA_STATUS_SUCCESS)
+ hsa_fatal ("Could not find symbol for variable in the code object",
+ status);
+ uint64_t fn_table_addr;
+ status = hsa_fns.hsa_executable_symbol_get_info_fn
+ (var_symbol, HSA_EXECUTABLE_SYMBOL_INFO_VARIABLE_ADDRESS,
+ &fn_table_addr);
+ if (status != HSA_STATUS_SUCCESS)
+ hsa_fatal ("Could not extract a variable from its symbol", status);
+ *rev_fn_table = GOMP_PLUGIN_malloc (kernel_count * sizeof (uint64_t));
+ GOMP_OFFLOAD_dev2host (agent->device_id, *rev_fn_table,
+ (void*) fn_table_addr,
+ kernel_count * sizeof (uint64_t));
+ }
+
return kernel_count + var_count + other_count;
}
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Patch][2/3] GCN: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
2022-08-25 15:38 ` [Patch][2/3] GCN: libgomp+mkoffload.cc: " Tobias Burnus
@ 2022-08-25 15:38 ` Tobias Burnus
2022-09-09 15:31 ` Jakub Jelinek
1 sibling, 0 replies; 15+ messages in thread
From: Tobias Burnus @ 2022-08-25 15:38 UTC (permalink / raw)
To: gcc-patches, Jakub Jelinek, Andrew Stubbs
[-- Attachment #1.1: Type: text/plain, Size: 864 bytes --]
On 25.08.22 16:54, Tobias Burnus wrote:
The attached patch prepare for reverse-offload device->host
function-address lookup by requesting (if needed) the on-device address.
This patch adds the actual implementation for GCN. A variant would be
to only generate .offload_func_table inside mkoffload when
OMP_REQUIRES_REVERSE_OFFLOAD has been requested.
This is currently effectively a no op as with [1/3] patch, always NULL
is passed and as GOMP_OFFLOAD_get_num_devices returns <= 0 as soon as
'omp requires reverse_offload' has been specified.
OK for mainline?
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Attachment #2: fn-lookup-gcn.diff --]
[-- Type: text/x-patch, Size: 4673 bytes --]
GCN: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
Add support to GCN for reverse lookup of function name to prepare for
'omp target device(ancestor:1)'.
gcc/ChangeLog:
* config/gcn/mkoffload.cc (process_asm): Create .offload_func_table,
similar to pre-existing .offload_var_table.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Read
.offload_func_table to populate rev_fn_table when requested.
gcc/config/gcn/mkoffload.cc | 11 ++++++++++-
libgomp/plugin/plugin-gcn.c | 26 +++++++++++++++++++++++++-
2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/gcc/config/gcn/mkoffload.cc b/gcc/config/gcn/mkoffload.cc
index 4206448703a..24d327355e3 100644
--- a/gcc/config/gcn/mkoffload.cc
+++ b/gcc/config/gcn/mkoffload.cc
@@ -537,63 +537,72 @@ process_asm (FILE *in, FILE *out, FILE *cfile)
case IN_VARS:
{
char *varname;
unsigned varsize;
if (sscanf (buf, " .8byte %ms\n", &varname))
{
fputs (buf, out);
fgets (buf, sizeof (buf), in);
if (!sscanf (buf, " .8byte %u\n", &varsize))
abort ();
var_count++;
}
break;
}
case IN_FUNCS:
{
char *funcname;
if (sscanf (buf, "\t.8byte\t%ms\n", &funcname))
{
+ fputs (buf, out);
obstack_ptr_grow (&fns_os, funcname);
fn_count++;
continue;
}
break;
}
}
char dummy;
if (sscanf (buf, " .section .gnu.offload_vars%c", &dummy) > 0)
{
state = IN_VARS;
/* Add a global symbol to allow plugin-gcn.c to locate the table
at runtime. It can't use the "offload_var_table.N" emitted by
the compiler because a) they're not global, and b) there's one
for each input file combined into the binary. */
fputs (buf, out);
fputs ("\t.global .offload_var_table\n"
"\t.type .offload_var_table, @object\n"
".offload_var_table:\n",
out);
}
else if (sscanf (buf, " .section .gnu.offload_funcs%c", &dummy) > 0)
- state = IN_FUNCS;
+ {
+ state = IN_FUNCS;
+ /* Likewise for .gnu.offload_vars; used for reverse offload. */
+ fputs (buf, out);
+ fputs ("\t.global .offload_func_table\n"
+ "\t.type .offload_func_table, @object\n"
+ ".offload_func_table:\n",
+ out);
+ }
else if (sscanf (buf, " .amdgpu_metadata%c", &dummy) > 0)
{
state = IN_METADATA;
regcount.kernel_name = NULL;
regcount.sgpr_count = regcount.vgpr_count = -1;
}
else if (sscanf (buf, " .section %c", &dummy) > 0
|| sscanf (buf, " .text%c", &dummy) > 0
|| sscanf (buf, " .bss%c", &dummy) > 0
|| sscanf (buf, " .data%c", &dummy) > 0
|| sscanf (buf, " .ident %c", &dummy) > 0)
state = IN_CODE;
else if (sscanf (buf, " .end_amdgpu_metadata%c", &dummy) > 0)
{
state = IN_CODE;
gcc_assert (regcount.kernel_name != NULL
&& regcount.sgpr_count >= 0
&& regcount.vgpr_count >= 0);
obstack_grow (®counts_os, ®count, sizeof (regcount));
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 363e2950649..00603981866 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -3353,7 +3353,7 @@ GOMP_OFFLOAD_init_device (int n)
int
GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
struct addr_pair **target_table,
- uint64_t **rev_fn_table __attribute__((unused)))
+ uint64_t **rev_fn_table)
{
if (GOMP_VERSION_DEV (version) != GOMP_VERSION_GCN)
{
@@ -3520,6 +3520,30 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
if (module->fini_array_func)
kernel_count--;
+ if (rev_fn_table != NULL && kernel_count == 0)
+ *rev_fn_table = NULL;
+ else if (rev_fn_table != NULL)
+ {
+ hsa_status_t status;
+ hsa_executable_symbol_t var_symbol;
+ status = hsa_fns.hsa_executable_get_symbol_fn (agent->executable, NULL,
+ ".offload_func_table",
+ agent->id, 0, &var_symbol);
+ if (status != HSA_STATUS_SUCCESS)
+ hsa_fatal ("Could not find symbol for variable in the code object",
+ status);
+ uint64_t fn_table_addr;
+ status = hsa_fns.hsa_executable_symbol_get_info_fn
+ (var_symbol, HSA_EXECUTABLE_SYMBOL_INFO_VARIABLE_ADDRESS,
+ &fn_table_addr);
+ if (status != HSA_STATUS_SUCCESS)
+ hsa_fatal ("Could not extract a variable from its symbol", status);
+ *rev_fn_table = GOMP_PLUGIN_malloc (kernel_count * sizeof (uint64_t));
+ GOMP_OFFLOAD_dev2host (agent->device_id, *rev_fn_table,
+ (void*) fn_table_addr,
+ kernel_count * sizeof (uint64_t));
+ }
+
return kernel_count + var_count + other_count;
}
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Patch][2/3] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
2022-08-25 14:54 [Patch][1/3] libgomp: Prepare for reverse offload fn lookup Tobias Burnus
2022-08-25 14:54 ` Tobias Burnus
2022-08-25 15:38 ` [Patch][2/3] GCN: libgomp+mkoffload.cc: " Tobias Burnus
@ 2022-08-25 17:30 ` Tobias Burnus
2022-08-25 17:30 ` Tobias Burnus
2022-08-29 18:43 ` [Patch][2/3][v2] " Tobias Burnus
2022-09-09 15:29 ` [Patch][1/3] libgomp: Prepare for reverse offload fn lookup Jakub Jelinek
3 siblings, 2 replies; 15+ messages in thread
From: Tobias Burnus @ 2022-08-25 17:30 UTC (permalink / raw)
To: gcc-patches, Jakub Jelinek, Tom de Vries
[-- Attachment #1: Type: text/plain, Size: 1192 bytes --]
On 25.08.22 16:54, Tobias Burnus wrote:
The attached patch prepare for reverse-offload device->host
function-address lookup by requesting (if needed) the on-device address.
This patch adds the actual implementation for NVPTX.
Having array[] = {fn1,fn2}; works with nvptx only since sm_35; hence,
if there is a reverse_offload and sm_30 is used, there will be a compile-time
error.
To avoid incompatibilities, I compile with the same PTX ISA .version and
sm_XX version as the (last) file that contains the reverse offload. While
it should not matter, some newer CUDA might not support, e.g., sm_35 or
do not like a specific ISA version - thus, that seemed to be safer.
This is currently effectively a no op as with [1/3] patch, always NULL
is passed and as GOMP_OFFLOAD_get_num_devices returns <= 0 as soon as
'omp requires reverse_offload' has been specified.
OK for mainline?
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Attachment #2: fn-lookup-nvptx.diff --]
[-- Type: text/x-patch, Size: 11019 bytes --]
nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
Add support to nvptx for reverse lookup of function name to prepare for
'omp target device(ancestor:1)'.
gcc/ChangeLog:
* config/nvptx/mkoffload.cc (record_id): Strip quotations from
function name.
(process): For GOMP_REQUIRES_REVERSE_OFFLOAD, check that -march is
at least sm_35, create '$offload_func_table' global array and init
with reverse-offload function addresses.
* config/nvptx/nvptx.cc (write_fn_proto_1, write_fn_proto): New
force_public attribute to force .visible.
(nvptx_declare_function_name): For "omp target
device_ancestor_nohost" attribut, force .visible/TREE_PUBLIC.
libgomp/ChangeLog:
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Read offload
function address table '$offload_func_table' if rev_fn_table
is not NULL.
gcc/config/nvptx/mkoffload.cc | 104 ++++++++++++++++++++++++++++++++++++++++--
gcc/config/nvptx/nvptx.cc | 20 +++++---
libgomp/plugin/plugin-nvptx.c | 19 +++++++-
3 files changed, 131 insertions(+), 12 deletions(-)
diff --git a/gcc/config/nvptx/mkoffload.cc b/gcc/config/nvptx/mkoffload.cc
index 3eea0a8f138..c496766b1cc 100644
--- a/gcc/config/nvptx/mkoffload.cc
+++ b/gcc/config/nvptx/mkoffload.cc
@@ -108,12 +108,14 @@ xputenv (const char *string)
static void
record_id (const char *p1, id_map ***where)
{
+ gcc_assert (p1[0] == '"');
+ p1++;
const char *end = strchr (p1, '\n');
if (!end)
fatal_error (input_location, "malformed ptx file");
id_map *v = XNEW (id_map);
- size_t len = end - p1;
+ size_t len = end - p1 - 1 ; /* remove tailing '"' */
v->ptx_name = XNEWVEC (char, len + 1);
memcpy (v->ptx_name, p1, len);
v->ptx_name[len] = '\0';
@@ -242,6 +244,10 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
id_map const *id;
unsigned obj_count = 0;
unsigned ix;
+ const char *sm_ver = NULL, *version = NULL;
+ const char *sm_ver2 = NULL, *version2 = NULL;
+ size_t file_cnt = 0;
+ size_t *file_idx = XALLOCAVEC (size_t, len);
fprintf (out, "#include <stdint.h>\n\n");
@@ -250,6 +256,8 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
for (size_t i = 0; i != len;)
{
char c;
+ bool output_fn_ptr = false;
+ file_idx[file_cnt++] = i;
fprintf (out, "static const char ptx_code_%u[] =\n\t\"", obj_count++);
while ((c = input[i++]))
@@ -261,6 +269,16 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
case '\n':
fprintf (out, "\\n\"\n\t\"");
/* Look for mappings on subsequent lines. */
+ if (UNLIKELY (startswith (input + i, ".target sm_")))
+ {
+ sm_ver = input + i + strlen (".target sm_");
+ continue;
+ }
+ if (UNLIKELY (startswith (input + i, ".version ")))
+ {
+ version = input + i + strlen (".version ");
+ continue;
+ }
while (startswith (input + i, "//:"))
{
i += 3;
@@ -268,7 +286,10 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
if (startswith (input + i, "VAR_MAP "))
record_id (input + i + 8, &vars_tail);
else if (startswith (input + i, "FUNC_MAP "))
- record_id (input + i + 9, &funcs_tail);
+ {
+ output_fn_ptr = true;
+ record_id (input + i + 9, &funcs_tail);
+ }
else
abort ();
/* Skip to next line. */
@@ -286,6 +307,81 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
putc (c, out);
}
fprintf (out, "\";\n\n");
+ if (output_fn_ptr
+ && (omp_requires & GOMP_REQUIRES_REVERSE_OFFLOAD) != 0)
+ {
+ if (sm_ver && sm_ver[0] == '3' && sm_ver[1] == '0'
+ && sm_ver[2] == '\n')
+ fatal_error (input_location,
+ "%<omp requires reverse_offload%> requires at least "
+ "%<sm_35%> for %<-misa=%>");
+ sm_ver2 = sm_ver;
+ version2 = version;
+ }
+ }
+
+ /* Create function-pointer array, required for reverse
+ offload function-pointer lookup. */
+
+ if (func_ids && (omp_requires & GOMP_REQUIRES_REVERSE_OFFLOAD) != 0)
+ {
+ const char needle[] = "// BEGIN GLOBAL FUNCTION DECL: ";
+ fprintf (out, "static const char ptx_code_%u[] =\n", obj_count++);
+ fprintf (out, "\t\".version ");
+ for (size_t i = 0; version2[i] != '\0' && version2[i] != '\n'; i++)
+ fputc (version2[i], out);
+ fprintf (out, "\"\n\t\".target sm_");
+ for (size_t i = 0; version2[i] != '\0' && sm_ver2[i] != '\n'; i++)
+ fputc (sm_ver2[i], out);
+ fprintf (out, "\"\n\t\".file 1 \\\"<dummy>\\\"\"\n");
+
+ size_t fidx = 0;
+ for (id = func_ids; id; id = id->next)
+ {
+ /* Only 'nohost' functions are needed - use NULL for the rest.
+ Alternatively, besides searching for 'BEGIN FUNCTION DECL',
+ checking for '.visible .entry ' + id->ptx_name would be
+ required. */
+ if (!endswith (id->ptx_name, "$nohost"))
+ continue;
+ fprintf (out, "\t\".extern ");
+ const char *p = input + file_idx[fidx];
+ while (true)
+ {
+ p = strstr (p, needle);
+ if (!p)
+ {
+ fidx++;
+ if (fidx >= file_cnt)
+ break;
+ p = input + file_idx[fidx];
+ continue;
+ }
+ p += strlen (needle);
+ if (!startswith (p, id->ptx_name))
+ continue;
+ p += strlen (id->ptx_name);
+ if (*p != '\n')
+ continue;
+ p++;
+ gcc_assert (startswith (p, ".visible "));
+ p += strlen (".visible ");
+ for (; *p != '\0' && *p != '\n'; p++)
+ fputc (*p, out);
+ break;
+ }
+ fprintf (out, "\"\n");
+ if (fidx == file_cnt)
+ fatal_error (input_location,
+ "Cannot find function declaration for %qs",
+ id->ptx_name);
+ }
+ fprintf (out, "\t\".visible .global .align 8 .u64 "
+ "$offload_func_table[] = {");
+ for (comma = "", id = func_ids; id; comma = ",", id = id->next)
+ fprintf (out, "%s\"\n\t\t\"%s", comma,
+ endswith (id->ptx_name, "$nohost") ? id->ptx_name : "0");
+ fprintf (out, "};\\n\";\n\n");
}
/* Dump out array of pointers to ptx object strings. */
@@ -300,7 +396,7 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
/* Dump out variable idents. */
fprintf (out, "static const char *const var_mappings[] = {");
for (comma = "", id = var_ids; id; comma = ",", id = id->next)
- fprintf (out, "%s\n\t%s", comma, id->ptx_name);
+ fprintf (out, "%s\n\t\"%s\"", comma, id->ptx_name);
fprintf (out, "\n};\n\n");
/* Dump out function idents. */
@@ -309,7 +405,7 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
" unsigned short dim[%d];\n"
"} func_mappings[] = {\n", GOMP_DIM_MAX);
for (comma = "", id = func_ids; id; comma = ",", id = id->next)
- fprintf (out, "%s\n\t{%s}", comma, id->ptx_name);
+ fprintf (out, "%s\n\t{\"%s\"}", comma, id->ptx_name);
fprintf (out, "\n};\n\n");
fprintf (out,
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index e4297e2d6c3..3293c096822 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -989,15 +989,15 @@ write_var_marker (FILE *file, bool is_defn, bool globalize, const char *name)
static void
write_fn_proto_1 (std::stringstream &s, bool is_defn,
- const char *name, const_tree decl)
+ const char *name, const_tree decl, bool force_public)
{
if (lookup_attribute ("alias", DECL_ATTRIBUTES (decl)) == NULL)
- write_fn_marker (s, is_defn, TREE_PUBLIC (decl), name);
+ write_fn_marker (s, is_defn, TREE_PUBLIC (decl) || force_public, name);
/* PTX declaration. */
if (DECL_EXTERNAL (decl))
s << ".extern ";
- else if (TREE_PUBLIC (decl))
+ else if (TREE_PUBLIC (decl) || force_public)
s << (DECL_WEAK (decl) ? ".weak " : ".visible ");
s << (write_as_kernel (DECL_ATTRIBUTES (decl)) ? ".entry " : ".func ");
@@ -1086,7 +1086,7 @@ write_fn_proto_1 (std::stringstream &s, bool is_defn,
static void
write_fn_proto (std::stringstream &s, bool is_defn,
- const char *name, const_tree decl)
+ const char *name, const_tree decl, bool force_public=false)
{
const char *replacement = nvptx_name_replacement (name);
char *replaced_dots = NULL;
@@ -1103,9 +1103,9 @@ write_fn_proto (std::stringstream &s, bool is_defn,
if (is_defn)
/* Emit a declaration. The PTX assembler gets upset without it. */
- write_fn_proto_1 (s, false, name, decl);
+ write_fn_proto_1 (s, false, name, decl, force_public);
- write_fn_proto_1 (s, is_defn, name, decl);
+ write_fn_proto_1 (s, is_defn, name, decl, force_public);
if (replaced_dots)
XDELETE (replaced_dots);
@@ -1481,7 +1481,13 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
tree fntype = TREE_TYPE (decl);
tree result_type = TREE_TYPE (fntype);
int argno = 0;
+ bool force_public = false;
+ /* For reverse-offload 'nohost' functions: In order to be collectable in
+ '$offload_func_table', cf. mkoffload.cc, the function has to be visible. */
+ if (lookup_attribute ("omp target device_ancestor_nohost",
+ DECL_ATTRIBUTES (decl)))
+ force_public = true;
if (lookup_attribute ("omp target entrypoint", DECL_ATTRIBUTES (decl))
&& !lookup_attribute ("oacc function", DECL_ATTRIBUTES (decl)))
{
@@ -1493,7 +1499,7 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
/* We construct the initial part of the function into a string
stream, in order to share the prototype writing code. */
std::stringstream s;
- write_fn_proto (s, true, name, decl);
+ write_fn_proto (s, true, name, decl, force_public);
s << "{\n";
bool return_in_mem = write_return_type (s, false, result_type);
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index d130665ed19..ac400fc2a1d 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1273,7 +1273,7 @@ nvptx_set_clocktick (CUmodule module, struct ptx_device *dev)
int
GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
struct addr_pair **target_table,
- uint64_t **rev_fn_table __attribute__((unused)))
+ uint64_t **rev_fn_table)
{
CUmodule module;
const char *const *var_names;
@@ -1376,6 +1376,23 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
targ_tbl->start = targ_tbl->end = 0;
targ_tbl++;
+ if (rev_fn_table && fn_entries == 0)
+ *rev_fn_table = NULL;
+ else if (rev_fn_table)
+ {
+ CUdeviceptr var;
+ size_t bytes;
+ r = CUDA_CALL_NOCHECK (cuModuleGetGlobal, &var, &bytes, module,
+ "$offload_func_table");
+ if (r != CUDA_SUCCESS)
+ GOMP_PLUGIN_fatal ("cuModuleGetGlobal error: %s", cuda_error (r));
+ assert (bytes == sizeof (uint64_t) * fn_entries);
+ *rev_fn_table = GOMP_PLUGIN_malloc (sizeof (uint64_t) * fn_entries);
+ r = CUDA_CALL_NOCHECK (cuMemcpyDtoH, *rev_fn_table, var, bytes);
+ if (r != CUDA_SUCCESS)
+ GOMP_PLUGIN_fatal ("cuMemcpyDtoH error: %s", cuda_error (r));
+ }
+
nvptx_set_clocktick (module, dev);
return fn_entries + var_entries + other_entries;
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Patch][2/3] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
2022-08-25 17:30 ` [Patch][2/3] nvptx: " Tobias Burnus
@ 2022-08-25 17:30 ` Tobias Burnus
2022-08-29 18:43 ` [Patch][2/3][v2] " Tobias Burnus
1 sibling, 0 replies; 15+ messages in thread
From: Tobias Burnus @ 2022-08-25 17:30 UTC (permalink / raw)
To: gcc-patches, Jakub Jelinek, Tom de Vries
[-- Attachment #1.1: Type: text/plain, Size: 1192 bytes --]
On 25.08.22 16:54, Tobias Burnus wrote:
The attached patch prepare for reverse-offload device->host
function-address lookup by requesting (if needed) the on-device address.
This patch adds the actual implementation for NVPTX.
Having array[] = {fn1,fn2}; works with nvptx only since sm_35; hence,
if there is a reverse_offload and sm_30 is used, there will be a compile-time
error.
To avoid incompatibilities, I compile with the same PTX ISA .version and
sm_XX version as the (last) file that contains the reverse offload. While
it should not matter, some newer CUDA might not support, e.g., sm_35 or
do not like a specific ISA version - thus, that seemed to be safer.
This is currently effectively a no op as with [1/3] patch, always NULL
is passed and as GOMP_OFFLOAD_get_num_devices returns <= 0 as soon as
'omp requires reverse_offload' has been specified.
OK for mainline?
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Attachment #2: fn-lookup-nvptx.diff --]
[-- Type: text/x-patch, Size: 11019 bytes --]
nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
Add support to nvptx for reverse lookup of function name to prepare for
'omp target device(ancestor:1)'.
gcc/ChangeLog:
* config/nvptx/mkoffload.cc (record_id): Strip quotations from
function name.
(process): For GOMP_REQUIRES_REVERSE_OFFLOAD, check that -march is
at least sm_35, create '$offload_func_table' global array and init
with reverse-offload function addresses.
* config/nvptx/nvptx.cc (write_fn_proto_1, write_fn_proto): New
force_public attribute to force .visible.
(nvptx_declare_function_name): For "omp target
device_ancestor_nohost" attribut, force .visible/TREE_PUBLIC.
libgomp/ChangeLog:
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Read offload
function address table '$offload_func_table' if rev_fn_table
is not NULL.
gcc/config/nvptx/mkoffload.cc | 104 ++++++++++++++++++++++++++++++++++++++++--
gcc/config/nvptx/nvptx.cc | 20 +++++---
libgomp/plugin/plugin-nvptx.c | 19 +++++++-
3 files changed, 131 insertions(+), 12 deletions(-)
diff --git a/gcc/config/nvptx/mkoffload.cc b/gcc/config/nvptx/mkoffload.cc
index 3eea0a8f138..c496766b1cc 100644
--- a/gcc/config/nvptx/mkoffload.cc
+++ b/gcc/config/nvptx/mkoffload.cc
@@ -108,12 +108,14 @@ xputenv (const char *string)
static void
record_id (const char *p1, id_map ***where)
{
+ gcc_assert (p1[0] == '"');
+ p1++;
const char *end = strchr (p1, '\n');
if (!end)
fatal_error (input_location, "malformed ptx file");
id_map *v = XNEW (id_map);
- size_t len = end - p1;
+ size_t len = end - p1 - 1 ; /* remove tailing '"' */
v->ptx_name = XNEWVEC (char, len + 1);
memcpy (v->ptx_name, p1, len);
v->ptx_name[len] = '\0';
@@ -242,6 +244,10 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
id_map const *id;
unsigned obj_count = 0;
unsigned ix;
+ const char *sm_ver = NULL, *version = NULL;
+ const char *sm_ver2 = NULL, *version2 = NULL;
+ size_t file_cnt = 0;
+ size_t *file_idx = XALLOCAVEC (size_t, len);
fprintf (out, "#include <stdint.h>\n\n");
@@ -250,6 +256,8 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
for (size_t i = 0; i != len;)
{
char c;
+ bool output_fn_ptr = false;
+ file_idx[file_cnt++] = i;
fprintf (out, "static const char ptx_code_%u[] =\n\t\"", obj_count++);
while ((c = input[i++]))
@@ -261,6 +269,16 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
case '\n':
fprintf (out, "\\n\"\n\t\"");
/* Look for mappings on subsequent lines. */
+ if (UNLIKELY (startswith (input + i, ".target sm_")))
+ {
+ sm_ver = input + i + strlen (".target sm_");
+ continue;
+ }
+ if (UNLIKELY (startswith (input + i, ".version ")))
+ {
+ version = input + i + strlen (".version ");
+ continue;
+ }
while (startswith (input + i, "//:"))
{
i += 3;
@@ -268,7 +286,10 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
if (startswith (input + i, "VAR_MAP "))
record_id (input + i + 8, &vars_tail);
else if (startswith (input + i, "FUNC_MAP "))
- record_id (input + i + 9, &funcs_tail);
+ {
+ output_fn_ptr = true;
+ record_id (input + i + 9, &funcs_tail);
+ }
else
abort ();
/* Skip to next line. */
@@ -286,6 +307,81 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
putc (c, out);
}
fprintf (out, "\";\n\n");
+ if (output_fn_ptr
+ && (omp_requires & GOMP_REQUIRES_REVERSE_OFFLOAD) != 0)
+ {
+ if (sm_ver && sm_ver[0] == '3' && sm_ver[1] == '0'
+ && sm_ver[2] == '\n')
+ fatal_error (input_location,
+ "%<omp requires reverse_offload%> requires at least "
+ "%<sm_35%> for %<-misa=%>");
+ sm_ver2 = sm_ver;
+ version2 = version;
+ }
+ }
+
+ /* Create function-pointer array, required for reverse
+ offload function-pointer lookup. */
+
+ if (func_ids && (omp_requires & GOMP_REQUIRES_REVERSE_OFFLOAD) != 0)
+ {
+ const char needle[] = "// BEGIN GLOBAL FUNCTION DECL: ";
+ fprintf (out, "static const char ptx_code_%u[] =\n", obj_count++);
+ fprintf (out, "\t\".version ");
+ for (size_t i = 0; version2[i] != '\0' && version2[i] != '\n'; i++)
+ fputc (version2[i], out);
+ fprintf (out, "\"\n\t\".target sm_");
+ for (size_t i = 0; version2[i] != '\0' && sm_ver2[i] != '\n'; i++)
+ fputc (sm_ver2[i], out);
+ fprintf (out, "\"\n\t\".file 1 \\\"<dummy>\\\"\"\n");
+
+ size_t fidx = 0;
+ for (id = func_ids; id; id = id->next)
+ {
+ /* Only 'nohost' functions are needed - use NULL for the rest.
+ Alternatively, besides searching for 'BEGIN FUNCTION DECL',
+ checking for '.visible .entry ' + id->ptx_name would be
+ required. */
+ if (!endswith (id->ptx_name, "$nohost"))
+ continue;
+ fprintf (out, "\t\".extern ");
+ const char *p = input + file_idx[fidx];
+ while (true)
+ {
+ p = strstr (p, needle);
+ if (!p)
+ {
+ fidx++;
+ if (fidx >= file_cnt)
+ break;
+ p = input + file_idx[fidx];
+ continue;
+ }
+ p += strlen (needle);
+ if (!startswith (p, id->ptx_name))
+ continue;
+ p += strlen (id->ptx_name);
+ if (*p != '\n')
+ continue;
+ p++;
+ gcc_assert (startswith (p, ".visible "));
+ p += strlen (".visible ");
+ for (; *p != '\0' && *p != '\n'; p++)
+ fputc (*p, out);
+ break;
+ }
+ fprintf (out, "\"\n");
+ if (fidx == file_cnt)
+ fatal_error (input_location,
+ "Cannot find function declaration for %qs",
+ id->ptx_name);
+ }
+ fprintf (out, "\t\".visible .global .align 8 .u64 "
+ "$offload_func_table[] = {");
+ for (comma = "", id = func_ids; id; comma = ",", id = id->next)
+ fprintf (out, "%s\"\n\t\t\"%s", comma,
+ endswith (id->ptx_name, "$nohost") ? id->ptx_name : "0");
+ fprintf (out, "};\\n\";\n\n");
}
/* Dump out array of pointers to ptx object strings. */
@@ -300,7 +396,7 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
/* Dump out variable idents. */
fprintf (out, "static const char *const var_mappings[] = {");
for (comma = "", id = var_ids; id; comma = ",", id = id->next)
- fprintf (out, "%s\n\t%s", comma, id->ptx_name);
+ fprintf (out, "%s\n\t\"%s\"", comma, id->ptx_name);
fprintf (out, "\n};\n\n");
/* Dump out function idents. */
@@ -309,7 +405,7 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
" unsigned short dim[%d];\n"
"} func_mappings[] = {\n", GOMP_DIM_MAX);
for (comma = "", id = func_ids; id; comma = ",", id = id->next)
- fprintf (out, "%s\n\t{%s}", comma, id->ptx_name);
+ fprintf (out, "%s\n\t{\"%s\"}", comma, id->ptx_name);
fprintf (out, "\n};\n\n");
fprintf (out,
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index e4297e2d6c3..3293c096822 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -989,15 +989,15 @@ write_var_marker (FILE *file, bool is_defn, bool globalize, const char *name)
static void
write_fn_proto_1 (std::stringstream &s, bool is_defn,
- const char *name, const_tree decl)
+ const char *name, const_tree decl, bool force_public)
{
if (lookup_attribute ("alias", DECL_ATTRIBUTES (decl)) == NULL)
- write_fn_marker (s, is_defn, TREE_PUBLIC (decl), name);
+ write_fn_marker (s, is_defn, TREE_PUBLIC (decl) || force_public, name);
/* PTX declaration. */
if (DECL_EXTERNAL (decl))
s << ".extern ";
- else if (TREE_PUBLIC (decl))
+ else if (TREE_PUBLIC (decl) || force_public)
s << (DECL_WEAK (decl) ? ".weak " : ".visible ");
s << (write_as_kernel (DECL_ATTRIBUTES (decl)) ? ".entry " : ".func ");
@@ -1086,7 +1086,7 @@ write_fn_proto_1 (std::stringstream &s, bool is_defn,
static void
write_fn_proto (std::stringstream &s, bool is_defn,
- const char *name, const_tree decl)
+ const char *name, const_tree decl, bool force_public=false)
{
const char *replacement = nvptx_name_replacement (name);
char *replaced_dots = NULL;
@@ -1103,9 +1103,9 @@ write_fn_proto (std::stringstream &s, bool is_defn,
if (is_defn)
/* Emit a declaration. The PTX assembler gets upset without it. */
- write_fn_proto_1 (s, false, name, decl);
+ write_fn_proto_1 (s, false, name, decl, force_public);
- write_fn_proto_1 (s, is_defn, name, decl);
+ write_fn_proto_1 (s, is_defn, name, decl, force_public);
if (replaced_dots)
XDELETE (replaced_dots);
@@ -1481,7 +1481,13 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
tree fntype = TREE_TYPE (decl);
tree result_type = TREE_TYPE (fntype);
int argno = 0;
+ bool force_public = false;
+ /* For reverse-offload 'nohost' functions: In order to be collectable in
+ '$offload_func_table', cf. mkoffload.cc, the function has to be visible. */
+ if (lookup_attribute ("omp target device_ancestor_nohost",
+ DECL_ATTRIBUTES (decl)))
+ force_public = true;
if (lookup_attribute ("omp target entrypoint", DECL_ATTRIBUTES (decl))
&& !lookup_attribute ("oacc function", DECL_ATTRIBUTES (decl)))
{
@@ -1493,7 +1499,7 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
/* We construct the initial part of the function into a string
stream, in order to share the prototype writing code. */
std::stringstream s;
- write_fn_proto (s, true, name, decl);
+ write_fn_proto (s, true, name, decl, force_public);
s << "{\n";
bool return_in_mem = write_return_type (s, false, result_type);
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index d130665ed19..ac400fc2a1d 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1273,7 +1273,7 @@ nvptx_set_clocktick (CUmodule module, struct ptx_device *dev)
int
GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
struct addr_pair **target_table,
- uint64_t **rev_fn_table __attribute__((unused)))
+ uint64_t **rev_fn_table)
{
CUmodule module;
const char *const *var_names;
@@ -1376,6 +1376,23 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
targ_tbl->start = targ_tbl->end = 0;
targ_tbl++;
+ if (rev_fn_table && fn_entries == 0)
+ *rev_fn_table = NULL;
+ else if (rev_fn_table)
+ {
+ CUdeviceptr var;
+ size_t bytes;
+ r = CUDA_CALL_NOCHECK (cuModuleGetGlobal, &var, &bytes, module,
+ "$offload_func_table");
+ if (r != CUDA_SUCCESS)
+ GOMP_PLUGIN_fatal ("cuModuleGetGlobal error: %s", cuda_error (r));
+ assert (bytes == sizeof (uint64_t) * fn_entries);
+ *rev_fn_table = GOMP_PLUGIN_malloc (sizeof (uint64_t) * fn_entries);
+ r = CUDA_CALL_NOCHECK (cuMemcpyDtoH, *rev_fn_table, var, bytes);
+ if (r != CUDA_SUCCESS)
+ GOMP_PLUGIN_fatal ("cuMemcpyDtoH error: %s", cuda_error (r));
+ }
+
nvptx_set_clocktick (module, dev);
return fn_entries + var_entries + other_entries;
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
2022-08-25 17:30 ` [Patch][2/3] nvptx: " Tobias Burnus
2022-08-25 17:30 ` Tobias Burnus
@ 2022-08-29 18:43 ` Tobias Burnus
2022-08-29 18:43 ` Tobias Burnus
` (2 more replies)
1 sibling, 3 replies; 15+ messages in thread
From: Tobias Burnus @ 2022-08-29 18:43 UTC (permalink / raw)
To: gcc-patches, Jakub Jelinek, Tom de Vries
[-- Attachment #1: Type: text/plain, Size: 1327 bytes --]
Slightly revised version, fixing some issues in mkoffload.cc. Otherwise, the same applies:
On 25.08.22 19:30, Tobias Burnus wrote:
On 25.08.22 16:54, Tobias Burnus wrote:
The attached patch prepare for reverse-offload device->host
function-address lookup by requesting (if needed) the on-device address.
This patch adds the actual implementation for NVPTX.
Having array[] = {fn1,fn2}; works with nvptx only since sm_35; hence,
if there is a reverse_offload and sm_30 is used, there will be a compile-time
error.
To avoid incompatibilities, I compile with the same PTX ISA .version and
sm_XX version as the (last) file that contains the reverse offload. While
it should not matter, some newer CUDA might not support, e.g., sm_35 or
do not like a specific ISA version - thus, that seemed to be safer.
This is currently effectively a no op as with [1/3] patch, always NULL
is passed and as GOMP_OFFLOAD_get_num_devices returns <= 0 as soon as
'omp requires reverse_offload' has been specified.
OK for mainline?
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Attachment #2: fn-lookup-nvptx-v2.diff --]
[-- Type: text/x-patch, Size: 11610 bytes --]
nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
Add support to nvptx for reverse lookup of function name to prepare for
'omp target device(ancestor:1)'.
gcc/ChangeLog:
* config/nvptx/mkoffload.cc (struct id_map): Add 'dim' member.
(record_id): Store func name without quotes, store dim separately.
(process): For GOMP_REQUIRES_REVERSE_OFFLOAD, check that -march is
at least sm_35, create '$offload_func_table' global array and init
with reverse-offload function addresses.
* config/nvptx/nvptx.cc (write_fn_proto_1, write_fn_proto): New
force_public attribute to force .visible.
(nvptx_declare_function_name): For "omp target
device_ancestor_nohost" attribut, force .visible/TREE_PUBLIC.
libgomp/ChangeLog:
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Read offload
function address table '$offload_func_table' if rev_fn_table
is not NULL.
gcc/config/nvptx/mkoffload.cc | 119 ++++++++++++++++++++++++++++++++++++++++--
gcc/config/nvptx/nvptx.cc | 20 +++++---
libgomp/plugin/plugin-nvptx.c | 19 +++++++-
3 files changed, 146 insertions(+), 12 deletions(-)
diff --git a/gcc/config/nvptx/mkoffload.cc b/gcc/config/nvptx/mkoffload.cc
index 3eea0a8f138..834b2059aac 100644
--- a/gcc/config/nvptx/mkoffload.cc
+++ b/gcc/config/nvptx/mkoffload.cc
@@ -47,6 +47,7 @@ struct id_map
{
id_map *next;
char *ptx_name;
+ char *dim;
};
static id_map *func_ids, **funcs_tail = &func_ids;
@@ -108,8 +109,11 @@ xputenv (const char *string)
static void
record_id (const char *p1, id_map ***where)
{
- const char *end = strchr (p1, '\n');
- if (!end)
+ gcc_assert (p1[0] == '"');
+ p1++;
+ const char *end = strchr (p1, '"');
+ const char *end2 = strchr (p1, '\n');
+ if (!end || !end2 || end >= end2)
fatal_error (input_location, "malformed ptx file");
id_map *v = XNEW (id_map);
@@ -117,6 +121,16 @@ record_id (const char *p1, id_map ***where)
v->ptx_name = XNEWVEC (char, len + 1);
memcpy (v->ptx_name, p1, len);
v->ptx_name[len] = '\0';
+ p1 = end + 1;
+ if (*end != '\n')
+ {
+ len = end2 - p1;
+ v->dim = XNEWVEC (char, len + 1);
+ memcpy (v->dim, p1, len);
+ v->dim[len] = '\0';
+ }
+ else
+ v->dim = NULL;
v->next = NULL;
id_map **tail = *where;
*tail = v;
@@ -242,6 +256,10 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
id_map const *id;
unsigned obj_count = 0;
unsigned ix;
+ const char *sm_ver = NULL, *version = NULL;
+ const char *sm_ver2 = NULL, *version2 = NULL;
+ size_t file_cnt = 0;
+ size_t *file_idx = XALLOCAVEC (size_t, len);
fprintf (out, "#include <stdint.h>\n\n");
@@ -250,6 +268,8 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
for (size_t i = 0; i != len;)
{
char c;
+ bool output_fn_ptr = false;
+ file_idx[file_cnt++] = i;
fprintf (out, "static const char ptx_code_%u[] =\n\t\"", obj_count++);
while ((c = input[i++]))
@@ -261,6 +281,16 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
case '\n':
fprintf (out, "\\n\"\n\t\"");
/* Look for mappings on subsequent lines. */
+ if (UNLIKELY (startswith (input + i, ".target sm_")))
+ {
+ sm_ver = input + i + strlen (".target sm_");
+ continue;
+ }
+ if (UNLIKELY (startswith (input + i, ".version ")))
+ {
+ version = input + i + strlen (".version ");
+ continue;
+ }
while (startswith (input + i, "//:"))
{
i += 3;
@@ -268,7 +298,10 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
if (startswith (input + i, "VAR_MAP "))
record_id (input + i + 8, &vars_tail);
else if (startswith (input + i, "FUNC_MAP "))
- record_id (input + i + 9, &funcs_tail);
+ {
+ output_fn_ptr = true;
+ record_id (input + i + 9, &funcs_tail);
+ }
else
abort ();
/* Skip to next line. */
@@ -286,6 +319,81 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
putc (c, out);
}
fprintf (out, "\";\n\n");
+ if (output_fn_ptr
+ && (omp_requires & GOMP_REQUIRES_REVERSE_OFFLOAD) != 0)
+ {
+ if (sm_ver && sm_ver[0] == '3' && sm_ver[1] == '0'
+ && sm_ver[2] == '\n')
+ fatal_error (input_location,
+ "%<omp requires reverse_offload%> requires at least "
+ "%<sm_35%> for %<-misa=%>");
+ sm_ver2 = sm_ver;
+ version2 = version;
+ }
+ }
+
+ /* Create function-pointer array, required for reverse
+ offload function-pointer lookup. */
+
+ if (func_ids && (omp_requires & GOMP_REQUIRES_REVERSE_OFFLOAD) != 0)
+ {
+ const char needle[] = "// BEGIN GLOBAL FUNCTION DECL: ";
+ fprintf (out, "static const char ptx_code_%u[] =\n", obj_count++);
+ fprintf (out, "\t\".version ");
+ for (size_t i = 0; version2[i] != '\0' && version2[i] != '\n'; i++)
+ fputc (version2[i], out);
+ fprintf (out, "\"\n\t\".target sm_");
+ for (size_t i = 0; version2[i] != '\0' && sm_ver2[i] != '\n'; i++)
+ fputc (sm_ver2[i], out);
+ fprintf (out, "\"\n\t\".file 1 \\\"<dummy>\\\"\"\n");
+
+ size_t fidx = 0;
+ for (id = func_ids; id; id = id->next)
+ {
+ /* Only 'nohost' functions are needed - use NULL for the rest.
+ Alternatively, besides searching for 'BEGIN FUNCTION DECL',
+ checking for '.visible .entry ' + id->ptx_name would be
+ required. */
+ if (!endswith (id->ptx_name, "$nohost"))
+ continue;
+ fprintf (out, "\t\".extern ");
+ const char *p = input + file_idx[fidx];
+ while (true)
+ {
+ p = strstr (p, needle);
+ if (!p)
+ {
+ fidx++;
+ if (fidx >= file_cnt)
+ break;
+ p = input + file_idx[fidx];
+ continue;
+ }
+ p += strlen (needle);
+ if (!startswith (p, id->ptx_name))
+ continue;
+ p += strlen (id->ptx_name);
+ if (*p != '\n')
+ continue;
+ p++;
+ gcc_assert (startswith (p, ".visible "));
+ p += strlen (".visible ");
+ for (; *p != '\0' && *p != '\n'; p++)
+ fputc (*p, out);
+ break;
+ }
+ fprintf (out, "\"\n");
+ if (fidx == file_cnt)
+ fatal_error (input_location,
+ "Cannot find function declaration for %qs",
+ id->ptx_name);
+ }
+ fprintf (out, "\t\".visible .global .align 8 .u64 "
+ "$offload_func_table[] = {");
+ for (comma = "", id = func_ids; id; comma = ",", id = id->next)
+ fprintf (out, "%s\"\n\t\t\"%s", comma,
+ endswith (id->ptx_name, "$nohost") ? id->ptx_name : "0");
+ fprintf (out, "};\\n\";\n\n");
}
/* Dump out array of pointers to ptx object strings. */
@@ -300,7 +408,7 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
/* Dump out variable idents. */
fprintf (out, "static const char *const var_mappings[] = {");
for (comma = "", id = var_ids; id; comma = ",", id = id->next)
- fprintf (out, "%s\n\t%s", comma, id->ptx_name);
+ fprintf (out, "%s\n\t\"%s\"", comma, id->ptx_name);
fprintf (out, "\n};\n\n");
/* Dump out function idents. */
@@ -309,7 +417,8 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
" unsigned short dim[%d];\n"
"} func_mappings[] = {\n", GOMP_DIM_MAX);
for (comma = "", id = func_ids; id; comma = ",", id = id->next)
- fprintf (out, "%s\n\t{%s}", comma, id->ptx_name);
+ fprintf (out, "%s\n\t{\"%s\"%s}", comma, id->ptx_name,
+ id->dim ? id->dim : "");
fprintf (out, "\n};\n\n");
fprintf (out,
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index e4297e2d6c3..3293c096822 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -989,15 +989,15 @@ write_var_marker (FILE *file, bool is_defn, bool globalize, const char *name)
static void
write_fn_proto_1 (std::stringstream &s, bool is_defn,
- const char *name, const_tree decl)
+ const char *name, const_tree decl, bool force_public)
{
if (lookup_attribute ("alias", DECL_ATTRIBUTES (decl)) == NULL)
- write_fn_marker (s, is_defn, TREE_PUBLIC (decl), name);
+ write_fn_marker (s, is_defn, TREE_PUBLIC (decl) || force_public, name);
/* PTX declaration. */
if (DECL_EXTERNAL (decl))
s << ".extern ";
- else if (TREE_PUBLIC (decl))
+ else if (TREE_PUBLIC (decl) || force_public)
s << (DECL_WEAK (decl) ? ".weak " : ".visible ");
s << (write_as_kernel (DECL_ATTRIBUTES (decl)) ? ".entry " : ".func ");
@@ -1086,7 +1086,7 @@ write_fn_proto_1 (std::stringstream &s, bool is_defn,
static void
write_fn_proto (std::stringstream &s, bool is_defn,
- const char *name, const_tree decl)
+ const char *name, const_tree decl, bool force_public=false)
{
const char *replacement = nvptx_name_replacement (name);
char *replaced_dots = NULL;
@@ -1103,9 +1103,9 @@ write_fn_proto (std::stringstream &s, bool is_defn,
if (is_defn)
/* Emit a declaration. The PTX assembler gets upset without it. */
- write_fn_proto_1 (s, false, name, decl);
+ write_fn_proto_1 (s, false, name, decl, force_public);
- write_fn_proto_1 (s, is_defn, name, decl);
+ write_fn_proto_1 (s, is_defn, name, decl, force_public);
if (replaced_dots)
XDELETE (replaced_dots);
@@ -1481,7 +1481,13 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
tree fntype = TREE_TYPE (decl);
tree result_type = TREE_TYPE (fntype);
int argno = 0;
+ bool force_public = false;
+ /* For reverse-offload 'nohost' functions: In order to be collectable in
+ '$offload_func_table', cf. mkoffload.cc, the function has to be visible. */
+ if (lookup_attribute ("omp target device_ancestor_nohost",
+ DECL_ATTRIBUTES (decl)))
+ force_public = true;
if (lookup_attribute ("omp target entrypoint", DECL_ATTRIBUTES (decl))
&& !lookup_attribute ("oacc function", DECL_ATTRIBUTES (decl)))
{
@@ -1493,7 +1499,7 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
/* We construct the initial part of the function into a string
stream, in order to share the prototype writing code. */
std::stringstream s;
- write_fn_proto (s, true, name, decl);
+ write_fn_proto (s, true, name, decl, force_public);
s << "{\n";
bool return_in_mem = write_return_type (s, false, result_type);
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index d130665ed19..ac400fc2a1d 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1273,7 +1273,7 @@ nvptx_set_clocktick (CUmodule module, struct ptx_device *dev)
int
GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
struct addr_pair **target_table,
- uint64_t **rev_fn_table __attribute__((unused)))
+ uint64_t **rev_fn_table)
{
CUmodule module;
const char *const *var_names;
@@ -1376,6 +1376,23 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
targ_tbl->start = targ_tbl->end = 0;
targ_tbl++;
+ if (rev_fn_table && fn_entries == 0)
+ *rev_fn_table = NULL;
+ else if (rev_fn_table)
+ {
+ CUdeviceptr var;
+ size_t bytes;
+ r = CUDA_CALL_NOCHECK (cuModuleGetGlobal, &var, &bytes, module,
+ "$offload_func_table");
+ if (r != CUDA_SUCCESS)
+ GOMP_PLUGIN_fatal ("cuModuleGetGlobal error: %s", cuda_error (r));
+ assert (bytes == sizeof (uint64_t) * fn_entries);
+ *rev_fn_table = GOMP_PLUGIN_malloc (sizeof (uint64_t) * fn_entries);
+ r = CUDA_CALL_NOCHECK (cuMemcpyDtoH, *rev_fn_table, var, bytes);
+ if (r != CUDA_SUCCESS)
+ GOMP_PLUGIN_fatal ("cuMemcpyDtoH error: %s", cuda_error (r));
+ }
+
nvptx_set_clocktick (module, dev);
return fn_entries + var_entries + other_entries;
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
2022-08-29 18:43 ` [Patch][2/3][v2] " Tobias Burnus
@ 2022-08-29 18:43 ` Tobias Burnus
2022-09-09 15:36 ` Jakub Jelinek
2022-09-23 15:40 ` [og12] Come up with {,UN}LIKELY macros (was: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup) Thomas Schwinge
2 siblings, 0 replies; 15+ messages in thread
From: Tobias Burnus @ 2022-08-29 18:43 UTC (permalink / raw)
To: gcc-patches, Jakub Jelinek, Tom de Vries
[-- Attachment #1.1: Type: text/plain, Size: 1327 bytes --]
Slightly revised version, fixing some issues in mkoffload.cc. Otherwise, the same applies:
On 25.08.22 19:30, Tobias Burnus wrote:
On 25.08.22 16:54, Tobias Burnus wrote:
The attached patch prepare for reverse-offload device->host
function-address lookup by requesting (if needed) the on-device address.
This patch adds the actual implementation for NVPTX.
Having array[] = {fn1,fn2}; works with nvptx only since sm_35; hence,
if there is a reverse_offload and sm_30 is used, there will be a compile-time
error.
To avoid incompatibilities, I compile with the same PTX ISA .version and
sm_XX version as the (last) file that contains the reverse offload. While
it should not matter, some newer CUDA might not support, e.g., sm_35 or
do not like a specific ISA version - thus, that seemed to be safer.
This is currently effectively a no op as with [1/3] patch, always NULL
is passed and as GOMP_OFFLOAD_get_num_devices returns <= 0 as soon as
'omp requires reverse_offload' has been specified.
OK for mainline?
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Attachment #2: fn-lookup-nvptx-v2.diff --]
[-- Type: text/x-patch, Size: 11610 bytes --]
nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
Add support to nvptx for reverse lookup of function name to prepare for
'omp target device(ancestor:1)'.
gcc/ChangeLog:
* config/nvptx/mkoffload.cc (struct id_map): Add 'dim' member.
(record_id): Store func name without quotes, store dim separately.
(process): For GOMP_REQUIRES_REVERSE_OFFLOAD, check that -march is
at least sm_35, create '$offload_func_table' global array and init
with reverse-offload function addresses.
* config/nvptx/nvptx.cc (write_fn_proto_1, write_fn_proto): New
force_public attribute to force .visible.
(nvptx_declare_function_name): For "omp target
device_ancestor_nohost" attribut, force .visible/TREE_PUBLIC.
libgomp/ChangeLog:
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Read offload
function address table '$offload_func_table' if rev_fn_table
is not NULL.
gcc/config/nvptx/mkoffload.cc | 119 ++++++++++++++++++++++++++++++++++++++++--
gcc/config/nvptx/nvptx.cc | 20 +++++---
libgomp/plugin/plugin-nvptx.c | 19 +++++++-
3 files changed, 146 insertions(+), 12 deletions(-)
diff --git a/gcc/config/nvptx/mkoffload.cc b/gcc/config/nvptx/mkoffload.cc
index 3eea0a8f138..834b2059aac 100644
--- a/gcc/config/nvptx/mkoffload.cc
+++ b/gcc/config/nvptx/mkoffload.cc
@@ -47,6 +47,7 @@ struct id_map
{
id_map *next;
char *ptx_name;
+ char *dim;
};
static id_map *func_ids, **funcs_tail = &func_ids;
@@ -108,8 +109,11 @@ xputenv (const char *string)
static void
record_id (const char *p1, id_map ***where)
{
- const char *end = strchr (p1, '\n');
- if (!end)
+ gcc_assert (p1[0] == '"');
+ p1++;
+ const char *end = strchr (p1, '"');
+ const char *end2 = strchr (p1, '\n');
+ if (!end || !end2 || end >= end2)
fatal_error (input_location, "malformed ptx file");
id_map *v = XNEW (id_map);
@@ -117,6 +121,16 @@ record_id (const char *p1, id_map ***where)
v->ptx_name = XNEWVEC (char, len + 1);
memcpy (v->ptx_name, p1, len);
v->ptx_name[len] = '\0';
+ p1 = end + 1;
+ if (*end != '\n')
+ {
+ len = end2 - p1;
+ v->dim = XNEWVEC (char, len + 1);
+ memcpy (v->dim, p1, len);
+ v->dim[len] = '\0';
+ }
+ else
+ v->dim = NULL;
v->next = NULL;
id_map **tail = *where;
*tail = v;
@@ -242,6 +256,10 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
id_map const *id;
unsigned obj_count = 0;
unsigned ix;
+ const char *sm_ver = NULL, *version = NULL;
+ const char *sm_ver2 = NULL, *version2 = NULL;
+ size_t file_cnt = 0;
+ size_t *file_idx = XALLOCAVEC (size_t, len);
fprintf (out, "#include <stdint.h>\n\n");
@@ -250,6 +268,8 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
for (size_t i = 0; i != len;)
{
char c;
+ bool output_fn_ptr = false;
+ file_idx[file_cnt++] = i;
fprintf (out, "static const char ptx_code_%u[] =\n\t\"", obj_count++);
while ((c = input[i++]))
@@ -261,6 +281,16 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
case '\n':
fprintf (out, "\\n\"\n\t\"");
/* Look for mappings on subsequent lines. */
+ if (UNLIKELY (startswith (input + i, ".target sm_")))
+ {
+ sm_ver = input + i + strlen (".target sm_");
+ continue;
+ }
+ if (UNLIKELY (startswith (input + i, ".version ")))
+ {
+ version = input + i + strlen (".version ");
+ continue;
+ }
while (startswith (input + i, "//:"))
{
i += 3;
@@ -268,7 +298,10 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
if (startswith (input + i, "VAR_MAP "))
record_id (input + i + 8, &vars_tail);
else if (startswith (input + i, "FUNC_MAP "))
- record_id (input + i + 9, &funcs_tail);
+ {
+ output_fn_ptr = true;
+ record_id (input + i + 9, &funcs_tail);
+ }
else
abort ();
/* Skip to next line. */
@@ -286,6 +319,81 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
putc (c, out);
}
fprintf (out, "\";\n\n");
+ if (output_fn_ptr
+ && (omp_requires & GOMP_REQUIRES_REVERSE_OFFLOAD) != 0)
+ {
+ if (sm_ver && sm_ver[0] == '3' && sm_ver[1] == '0'
+ && sm_ver[2] == '\n')
+ fatal_error (input_location,
+ "%<omp requires reverse_offload%> requires at least "
+ "%<sm_35%> for %<-misa=%>");
+ sm_ver2 = sm_ver;
+ version2 = version;
+ }
+ }
+
+ /* Create function-pointer array, required for reverse
+ offload function-pointer lookup. */
+
+ if (func_ids && (omp_requires & GOMP_REQUIRES_REVERSE_OFFLOAD) != 0)
+ {
+ const char needle[] = "// BEGIN GLOBAL FUNCTION DECL: ";
+ fprintf (out, "static const char ptx_code_%u[] =\n", obj_count++);
+ fprintf (out, "\t\".version ");
+ for (size_t i = 0; version2[i] != '\0' && version2[i] != '\n'; i++)
+ fputc (version2[i], out);
+ fprintf (out, "\"\n\t\".target sm_");
+ for (size_t i = 0; version2[i] != '\0' && sm_ver2[i] != '\n'; i++)
+ fputc (sm_ver2[i], out);
+ fprintf (out, "\"\n\t\".file 1 \\\"<dummy>\\\"\"\n");
+
+ size_t fidx = 0;
+ for (id = func_ids; id; id = id->next)
+ {
+ /* Only 'nohost' functions are needed - use NULL for the rest.
+ Alternatively, besides searching for 'BEGIN FUNCTION DECL',
+ checking for '.visible .entry ' + id->ptx_name would be
+ required. */
+ if (!endswith (id->ptx_name, "$nohost"))
+ continue;
+ fprintf (out, "\t\".extern ");
+ const char *p = input + file_idx[fidx];
+ while (true)
+ {
+ p = strstr (p, needle);
+ if (!p)
+ {
+ fidx++;
+ if (fidx >= file_cnt)
+ break;
+ p = input + file_idx[fidx];
+ continue;
+ }
+ p += strlen (needle);
+ if (!startswith (p, id->ptx_name))
+ continue;
+ p += strlen (id->ptx_name);
+ if (*p != '\n')
+ continue;
+ p++;
+ gcc_assert (startswith (p, ".visible "));
+ p += strlen (".visible ");
+ for (; *p != '\0' && *p != '\n'; p++)
+ fputc (*p, out);
+ break;
+ }
+ fprintf (out, "\"\n");
+ if (fidx == file_cnt)
+ fatal_error (input_location,
+ "Cannot find function declaration for %qs",
+ id->ptx_name);
+ }
+ fprintf (out, "\t\".visible .global .align 8 .u64 "
+ "$offload_func_table[] = {");
+ for (comma = "", id = func_ids; id; comma = ",", id = id->next)
+ fprintf (out, "%s\"\n\t\t\"%s", comma,
+ endswith (id->ptx_name, "$nohost") ? id->ptx_name : "0");
+ fprintf (out, "};\\n\";\n\n");
}
/* Dump out array of pointers to ptx object strings. */
@@ -300,7 +408,7 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
/* Dump out variable idents. */
fprintf (out, "static const char *const var_mappings[] = {");
for (comma = "", id = var_ids; id; comma = ",", id = id->next)
- fprintf (out, "%s\n\t%s", comma, id->ptx_name);
+ fprintf (out, "%s\n\t\"%s\"", comma, id->ptx_name);
fprintf (out, "\n};\n\n");
/* Dump out function idents. */
@@ -309,7 +417,8 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
" unsigned short dim[%d];\n"
"} func_mappings[] = {\n", GOMP_DIM_MAX);
for (comma = "", id = func_ids; id; comma = ",", id = id->next)
- fprintf (out, "%s\n\t{%s}", comma, id->ptx_name);
+ fprintf (out, "%s\n\t{\"%s\"%s}", comma, id->ptx_name,
+ id->dim ? id->dim : "");
fprintf (out, "\n};\n\n");
fprintf (out,
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index e4297e2d6c3..3293c096822 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -989,15 +989,15 @@ write_var_marker (FILE *file, bool is_defn, bool globalize, const char *name)
static void
write_fn_proto_1 (std::stringstream &s, bool is_defn,
- const char *name, const_tree decl)
+ const char *name, const_tree decl, bool force_public)
{
if (lookup_attribute ("alias", DECL_ATTRIBUTES (decl)) == NULL)
- write_fn_marker (s, is_defn, TREE_PUBLIC (decl), name);
+ write_fn_marker (s, is_defn, TREE_PUBLIC (decl) || force_public, name);
/* PTX declaration. */
if (DECL_EXTERNAL (decl))
s << ".extern ";
- else if (TREE_PUBLIC (decl))
+ else if (TREE_PUBLIC (decl) || force_public)
s << (DECL_WEAK (decl) ? ".weak " : ".visible ");
s << (write_as_kernel (DECL_ATTRIBUTES (decl)) ? ".entry " : ".func ");
@@ -1086,7 +1086,7 @@ write_fn_proto_1 (std::stringstream &s, bool is_defn,
static void
write_fn_proto (std::stringstream &s, bool is_defn,
- const char *name, const_tree decl)
+ const char *name, const_tree decl, bool force_public=false)
{
const char *replacement = nvptx_name_replacement (name);
char *replaced_dots = NULL;
@@ -1103,9 +1103,9 @@ write_fn_proto (std::stringstream &s, bool is_defn,
if (is_defn)
/* Emit a declaration. The PTX assembler gets upset without it. */
- write_fn_proto_1 (s, false, name, decl);
+ write_fn_proto_1 (s, false, name, decl, force_public);
- write_fn_proto_1 (s, is_defn, name, decl);
+ write_fn_proto_1 (s, is_defn, name, decl, force_public);
if (replaced_dots)
XDELETE (replaced_dots);
@@ -1481,7 +1481,13 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
tree fntype = TREE_TYPE (decl);
tree result_type = TREE_TYPE (fntype);
int argno = 0;
+ bool force_public = false;
+ /* For reverse-offload 'nohost' functions: In order to be collectable in
+ '$offload_func_table', cf. mkoffload.cc, the function has to be visible. */
+ if (lookup_attribute ("omp target device_ancestor_nohost",
+ DECL_ATTRIBUTES (decl)))
+ force_public = true;
if (lookup_attribute ("omp target entrypoint", DECL_ATTRIBUTES (decl))
&& !lookup_attribute ("oacc function", DECL_ATTRIBUTES (decl)))
{
@@ -1493,7 +1499,7 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
/* We construct the initial part of the function into a string
stream, in order to share the prototype writing code. */
std::stringstream s;
- write_fn_proto (s, true, name, decl);
+ write_fn_proto (s, true, name, decl, force_public);
s << "{\n";
bool return_in_mem = write_return_type (s, false, result_type);
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index d130665ed19..ac400fc2a1d 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1273,7 +1273,7 @@ nvptx_set_clocktick (CUmodule module, struct ptx_device *dev)
int
GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
struct addr_pair **target_table,
- uint64_t **rev_fn_table __attribute__((unused)))
+ uint64_t **rev_fn_table)
{
CUmodule module;
const char *const *var_names;
@@ -1376,6 +1376,23 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
targ_tbl->start = targ_tbl->end = 0;
targ_tbl++;
+ if (rev_fn_table && fn_entries == 0)
+ *rev_fn_table = NULL;
+ else if (rev_fn_table)
+ {
+ CUdeviceptr var;
+ size_t bytes;
+ r = CUDA_CALL_NOCHECK (cuModuleGetGlobal, &var, &bytes, module,
+ "$offload_func_table");
+ if (r != CUDA_SUCCESS)
+ GOMP_PLUGIN_fatal ("cuModuleGetGlobal error: %s", cuda_error (r));
+ assert (bytes == sizeof (uint64_t) * fn_entries);
+ *rev_fn_table = GOMP_PLUGIN_malloc (sizeof (uint64_t) * fn_entries);
+ r = CUDA_CALL_NOCHECK (cuMemcpyDtoH, *rev_fn_table, var, bytes);
+ if (r != CUDA_SUCCESS)
+ GOMP_PLUGIN_fatal ("cuMemcpyDtoH error: %s", cuda_error (r));
+ }
+
nvptx_set_clocktick (module, dev);
return fn_entries + var_entries + other_entries;
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Patch][1/3] libgomp: Prepare for reverse offload fn lookup
2022-08-25 14:54 [Patch][1/3] libgomp: Prepare for reverse offload fn lookup Tobias Burnus
` (2 preceding siblings ...)
2022-08-25 17:30 ` [Patch][2/3] nvptx: " Tobias Burnus
@ 2022-09-09 15:29 ` Jakub Jelinek
3 siblings, 0 replies; 15+ messages in thread
From: Jakub Jelinek @ 2022-09-09 15:29 UTC (permalink / raw)
To: Tobias Burnus; +Cc: gcc-patches
On Thu, Aug 25, 2022 at 04:54:51PM +0200, Tobias Burnus wrote:
> Technically, this patch is stand alone, but conceptually it based on the
> submitted but not reviewed patch:
> "[Patch] OpenMP: Support reverse offload (middle end part)"
> https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598662.html
>
> With that patch, for reverse offloads ('omp target device(ancestor:1)'),
> calls like the following are added:
> GOMP_target_ext (-2 /* initial device */, omp_fn.1
> where 'omp_fn.1' on nonhost devices a stub function just required for
> looking up the host function pointer via the offload_funcs table.
>
> The attached patch prepare for reverse-offload device->host
> function-address lookup by requesting (if needed) the on-device address.
>
> OK for mainline?
>
> Tobias
>
>
> -----------------
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
> libgomp: Prepare for reverse offload fn lookup
>
> Prepare for reverse-offloading function-pointer lookup by passing
> a rev_fn_table argument to GOMP_OFFLOAD_load_image.
>
> The argument will be NULL, unless GOMP_REQUIRES_REVERSE_OFFLOAD is
> requested and devices not supported it, are filtered out.
> (Up to and including this commit, no non-host device claims such
> support and the caller currently always passes NULL.)
>
> libgomp/ChangeLog:
>
> * libgomp-plugin.h (GOMP_OFFLOAD_load_image): Add
> 'uint64_t **rev_fn_table' argument.
> * oacc-host.c (host_load_image): Likewise.
> * plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Likewise;
> currently unused.
> * plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise.
> * target.c (gomp_load_image_to_device): Update call but pass
> NULL for now.
>
> liboffloadmic/ChangeLog:
>
> * plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_load_image):
> Add (unused) uint64_t **rev_fn_table argument.
Ok, thanks.
Jakub
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Patch][2/3] GCN: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
2022-08-25 15:38 ` [Patch][2/3] GCN: libgomp+mkoffload.cc: " Tobias Burnus
2022-08-25 15:38 ` Tobias Burnus
@ 2022-09-09 15:31 ` Jakub Jelinek
1 sibling, 0 replies; 15+ messages in thread
From: Jakub Jelinek @ 2022-09-09 15:31 UTC (permalink / raw)
To: Tobias Burnus; +Cc: gcc-patches, Andrew Stubbs
On Thu, Aug 25, 2022 at 05:38:58PM +0200, Tobias Burnus wrote:
> On 25.08.22 16:54, Tobias Burnus wrote:
>
> The attached patch prepare for reverse-offload device->host
> function-address lookup by requesting (if needed) the on-device address.
>
>
> This patch adds the actual implementation for GCN. A variant would be
> to only generate .offload_func_table inside mkoffload when
> OMP_REQUIRES_REVERSE_OFFLOAD has been requested.
>
> This is currently effectively a no op as with [1/3] patch, always NULL
> is passed and as GOMP_OFFLOAD_get_num_devices returns <= 0 as soon as
> 'omp requires reverse_offload' has been specified.
>
> OK for mainline?
>
> Tobias
>
>
> -----------------
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
> GCN: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
>
> Add support to GCN for reverse lookup of function name to prepare for
> 'omp target device(ancestor:1)'.
>
> gcc/ChangeLog:
>
> * config/gcn/mkoffload.cc (process_asm): Create .offload_func_table,
> similar to pre-existing .offload_var_table.
>
> libgomp/ChangeLog:
>
> * plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Read
> .offload_func_table to populate rev_fn_table when requested.
Ok.
Jakub
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
2022-08-29 18:43 ` [Patch][2/3][v2] " Tobias Burnus
2022-08-29 18:43 ` Tobias Burnus
@ 2022-09-09 15:36 ` Jakub Jelinek
2022-09-12 12:02 ` [Patch] nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible (was: Re: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup) Tobias Burnus
2022-09-23 15:40 ` [og12] Come up with {,UN}LIKELY macros (was: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup) Thomas Schwinge
2 siblings, 1 reply; 15+ messages in thread
From: Jakub Jelinek @ 2022-09-09 15:36 UTC (permalink / raw)
To: Tobias Burnus; +Cc: gcc-patches, Tom de Vries
On Mon, Aug 29, 2022 at 08:43:26PM +0200, Tobias Burnus wrote:
> Slightly revised version, fixing some issues in mkoffload.cc. Otherwise, the same applies:
>
> On 25.08.22 19:30, Tobias Burnus wrote:
> On 25.08.22 16:54, Tobias Burnus wrote:
>
> The attached patch prepare for reverse-offload device->host
> function-address lookup by requesting (if needed) the on-device address.
>
>
> This patch adds the actual implementation for NVPTX.
>
> Having array[] = {fn1,fn2}; works with nvptx only since sm_35; hence,
> if there is a reverse_offload and sm_30 is used, there will be a compile-time
> error.
Wonder if we instead shouldn't arrange for silent request for no PTX
offloading (or one with warning?) if sm_30 and reverse offload is needed.
Error might be too harsh, the program can still offload to GCN or host
just fine...
Otherwise LGTM.
Jakub
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Patch] nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible (was: Re: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup)
2022-09-09 15:36 ` Jakub Jelinek
@ 2022-09-12 12:02 ` Tobias Burnus
2022-09-12 12:10 ` Jakub Jelinek
2022-10-17 11:59 ` Fix nvptx-specific '-foffload-options' syntax in 'libgomp.c/reverse-offload-sm30.c' (was: [Patch] nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible) Thomas Schwinge
0 siblings, 2 replies; 15+ messages in thread
From: Tobias Burnus @ 2022-09-12 12:02 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: gcc-patches, Tom de Vries
[-- Attachment #1.1: Type: text/plain, Size: 1784 bytes --]
On 09.09.22 17:36, Jakub Jelinek wrote:
Wonder if we instead shouldn't arrange for silent request for no PTX
offloading (or one with warning?) if sm_30 and reverse offload is needed.
Error might be too harsh, the program can still offload to GCN or host
just fine...
Attached patch now implements the warning. I think silently failing is
not the proper solution. It is too confusing and without telling the
user, they may not notice this issue.
I also changed the "progname" variable; it seems as if the only user is
the diagnostic machinery and it seems to make sense to have:
nvptx mkoffload: warning: 'omp requires reverse_offload' requires at least
'sm_35' for '-misa=' - disabling offload-code generation for this device type
instead of guessing whether "mkoffload" is for the host, gcn or nvptx. I know
that the common way is to use the binary name ("lto1") or ... but I still
think this prefix – which is the tool_name. Makes sense.
(BTW: gcc/config/i386/intelmic-mkoffload.cc uses "mkoffload-intelmic".)
I also changed "-misa=" in the diagnostic to "-march=" as Tom changed the default
from "-misa=" (now an alias) to "-march=" in GCC 12 (+plus added -march-map= in addition).
I also added a testcase for this – and updated the testsuite for the dg-warning.
OK for mainline?
Tobias
PS: As no code for nvptx is generated, there is no "device present bit not used" warning
with GOMP_DEBUG, but as there is a compile-time warning, I guess that's fine.
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Attachment #2: nvptx-warn-if-sm-too-old-v3.diff --]
[-- Type: text/x-patch, Size: 7573 bytes --]
nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible
Reverse offload requests at least -misa=sm_35; with this patch, a warning
instead of an error is shown, still permitting reverse offload for all
other configured device types. This is achieved by not calling
GOMP_offload_register_ver (and stopping generating pointless 'static const char'
variables, once known.)
The tool_name as progname changes adds "nvptx " and "gcn " to the
"mkoffload: warning/error:" diagnostic.
gcc/ChangeLog:
* config/nvptx/mkoffload.cc (process): Replace a fatal_error by
a warning + not enabling offloading if -misa=sm_30 prevents
reverse offload.
(main): Use tool_name as progname for diagnostic.
* config/gcn/mkoffload.cc (main): Likewise.
libgomp/ChangeLog:
* libgomp.texi (Offload-Target Specifics: nvptx): Document
that reverse offload requires >= -march=sm_35.
* testsuite/libgomp.c-c++-common/requires-4.c: Build for nvptx
with -misa=sm_35.
* testsuite/libgomp.c-c++-common/requires-5.c: Likewise.
* testsuite/libgomp.c-c++-common/requires-6.c: Likewise.
* testsuite/libgomp.c-c++-common/reverse-offload-1.c: Likewise.
* testsuite/libgomp.fortran/reverse-offload-1.f90: Likewise.
* testsuite/libgomp.c/reverse-offload-sm30.c: New test.
gcc/config/gcn/mkoffload.cc | 2 +-
gcc/config/nvptx/mkoffload.cc | 17 +++++++++++++----
libgomp/libgomp.texi | 3 +++
libgomp/testsuite/libgomp.c-c++-common/requires-4.c | 1 +
libgomp/testsuite/libgomp.c-c++-common/requires-5.c | 1 +
libgomp/testsuite/libgomp.c-c++-common/requires-6.c | 2 ++
.../testsuite/libgomp.c-c++-common/reverse-offload-1.c | 1 +
libgomp/testsuite/libgomp.c/reverse-offload-sm30.c | 15 +++++++++++++++
libgomp/testsuite/libgomp.fortran/reverse-offload-1.f90 | 1 +
9 files changed, 38 insertions(+), 5 deletions(-)
diff --git a/gcc/config/gcn/mkoffload.cc b/gcc/config/gcn/mkoffload.cc
index 24d327355e3..64037806acf 100644
--- a/gcc/config/gcn/mkoffload.cc
+++ b/gcc/config/gcn/mkoffload.cc
@@ -805,7 +805,7 @@ main (int argc, char **argv)
FILE *cfile = stdout;
const char *outname = 0;
- progname = "mkoffload";
+ progname = tool_name;
diagnostic_initialize (global_dc, 0);
obstack_init (&files_to_cleanup);
diff --git a/gcc/config/nvptx/mkoffload.cc b/gcc/config/nvptx/mkoffload.cc
index 834b2059aac..5f3e07ad066 100644
--- a/gcc/config/nvptx/mkoffload.cc
+++ b/gcc/config/nvptx/mkoffload.cc
@@ -324,9 +324,18 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
{
if (sm_ver && sm_ver[0] == '3' && sm_ver[1] == '0'
&& sm_ver[2] == '\n')
- fatal_error (input_location,
- "%<omp requires reverse_offload%> requires at least "
- "%<sm_35%> for %<-misa=%>");
+ {
+ warning_at (input_location, 0,
+ "%<omp requires reverse_offload%> requires at "
+ "least %<sm_35%> for %<-march=%> - disabling "
+ "offload-code generation for this device type");
+ /* As now an empty file is compiled and there is no call to
+ GOMP_offload_register_ver, this device type is effectively
+ disabled. */
+ fflush (out);
+ ftruncate (fileno (out), 0);
+ return;
+ }
sm_ver2 = sm_ver;
version2 = version;
}
@@ -526,7 +535,7 @@ main (int argc, char **argv)
FILE *out = stdout;
const char *outname = 0;
- progname = "mkoffload";
+ progname = tool_name;
diagnostic_initialize (global_dc, 0);
if (atexit (mkoffload_cleanup) != 0)
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 4eaad4348bb..1f402d6df79 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -4386,6 +4386,9 @@ The implementation remark:
@item I/O within OpenMP target regions and OpenACC parallel/kernels is supported
using the C library @code{printf} functions and the Fortran
@code{print}/@code{write} statements.
+@item Compilation OpenMP code that contains @code{requires reverse_offload}
+ requires at least @code{-march=sm_35}, compiling for @code{-march=sm_30}
+ is not supported.
@end itemize
diff --git a/libgomp/testsuite/libgomp.c-c++-common/requires-4.c b/libgomp/testsuite/libgomp.c-c++-common/requires-4.c
index 6ed5a5f647a..5883eff0d93 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/requires-4.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/requires-4.c
@@ -1,4 +1,5 @@
/* { dg-additional-options "-flto" } */
+/* { dg-additional-options "-foffload-options=nvptx-none=-misa=sm_35" { target { offload_target_nvptx } } } */
/* { dg-additional-sources requires-4-aux.c } */
/* Check no diagnostic by device-compiler's or host compiler's lto1.
diff --git a/libgomp/testsuite/libgomp.c-c++-common/requires-5.c b/libgomp/testsuite/libgomp.c-c++-common/requires-5.c
index 7fe0c735d27..d43d78db6fa 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/requires-5.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/requires-5.c
@@ -1,3 +1,4 @@
+/* { dg-additional-options "-foffload-options=nvptx-none=-misa=sm_35" { target { offload_target_nvptx } } } */
/* { dg-additional-sources requires-5-aux.c } */
/* Depending on offload device capabilities, it may print something like the
diff --git a/libgomp/testsuite/libgomp.c-c++-common/requires-6.c b/libgomp/testsuite/libgomp.c-c++-common/requires-6.c
index b00c7459bbc..a25b4d2dedd 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/requires-6.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/requires-6.c
@@ -1,3 +1,5 @@
+/* { dg-additional-options "-foffload-options=nvptx-none=-misa=sm_35" { target { offload_target_nvptx } } } */
+
#pragma omp requires unified_shared_memory, unified_address, reverse_offload
/* The requires line is not active as there is none of:
diff --git a/libgomp/testsuite/libgomp.c-c++-common/reverse-offload-1.c b/libgomp/testsuite/libgomp.c-c++-common/reverse-offload-1.c
index 976e129f560..52d828caf1c 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/reverse-offload-1.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/reverse-offload-1.c
@@ -1,4 +1,5 @@
/* { dg-do run } */
+/* { dg-additional-options "-foffload-options=nvptx-none=-misa=sm_35" { target { offload_target_nvptx } } } */
/* { dg-additional-sources reverse-offload-1-aux.c } */
/* Check that reverse offload works in particular:
diff --git a/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c b/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c
new file mode 100644
index 00000000000..14aed0132b7
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c
@@ -0,0 +1,15 @@
+/* { dg-do link { target { offload_target_nvptx } } } */
+/* { dg-additional-options "-foffload-options=nvptx-none=-march=sm_30 -foffload=-mptx=_" } */
+
+#pragma omp requires reverse_offload
+
+int
+main ()
+{
+ #pragma omp target
+ {
+ }
+ return 0;
+}
+
+/* { dg-warning "'omp requires reverse_offload' requires at least 'sm_35' for '-march=' - disabling offload-code generation for this device type" "" { target *-*-* } 0 } */
diff --git a/libgomp/testsuite/libgomp.fortran/reverse-offload-1.f90 b/libgomp/testsuite/libgomp.fortran/reverse-offload-1.f90
index 7cfb8b6552e..de68011f8f7 100644
--- a/libgomp/testsuite/libgomp.fortran/reverse-offload-1.f90
+++ b/libgomp/testsuite/libgomp.fortran/reverse-offload-1.f90
@@ -1,4 +1,5 @@
! { dg-do run }
+! { dg-additional-options "-foffload-options=nvptx-none=-misa=sm_35" { target { offload_target_nvptx } } }
! { dg-additional-sources reverse-offload-1-aux.f90 }
! Check that reverse offload works in particular:
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Patch] nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible (was: Re: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup)
2022-09-12 12:02 ` [Patch] nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible (was: Re: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup) Tobias Burnus
@ 2022-09-12 12:10 ` Jakub Jelinek
2022-10-17 11:59 ` Fix nvptx-specific '-foffload-options' syntax in 'libgomp.c/reverse-offload-sm30.c' (was: [Patch] nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible) Thomas Schwinge
1 sibling, 0 replies; 15+ messages in thread
From: Jakub Jelinek @ 2022-09-12 12:10 UTC (permalink / raw)
To: Tobias Burnus; +Cc: gcc-patches, Tom de Vries
On Mon, Sep 12, 2022 at 02:02:16PM +0200, Tobias Burnus wrote:
> + {
> + warning_at (input_location, 0,
> + "%<omp requires reverse_offload%> requires at "
> + "least %<sm_35%> for %<-march=%> - disabling "
> + "offload-code generation for this device type");
I wonder whether it shouldn't talk about
-foffload-options=nvptx-none=-march=
instead of just -march=.
Otherwise LGTM.
Jakub
^ permalink raw reply [flat|nested] 15+ messages in thread
* [og12] Come up with {,UN}LIKELY macros (was: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup)
2022-08-29 18:43 ` [Patch][2/3][v2] " Tobias Burnus
2022-08-29 18:43 ` Tobias Burnus
2022-09-09 15:36 ` Jakub Jelinek
@ 2022-09-23 15:40 ` Thomas Schwinge
2 siblings, 0 replies; 15+ messages in thread
From: Thomas Schwinge @ 2022-09-23 15:40 UTC (permalink / raw)
To: Tobias Burnus, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 2160 bytes --]
Hi!
Since the 2022-09-12 backport of this:
On 2022-08-29T20:43:26+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
... to og12 in commit 2b6ad53fd76c7bb9605be417d137a7d9a18f2117, the og12
branch didn't build anymore:
[...]/gcc/config/nvptx/mkoffload.cc: In function 'void process(FILE*, FILE*, uint32_t)':
[...]/gcc/config/nvptx/mkoffload.cc:284:59: error: 'UNLIKELY' was not declared in this scope
if (UNLIKELY (startswith (input + i, ".target sm_")))
^
[...]/gcc/config/nvptx/mkoffload.cc:289:57: error: 'UNLIKELY' was not declared in this scope
if (UNLIKELY (startswith (input + i, ".version ")))
^
make[2]: *** [[...]/gcc/config/nvptx/t-nvptx:8: mkoffload.o] Error 1
> --- a/gcc/config/nvptx/mkoffload.cc
> +++ b/gcc/config/nvptx/mkoffload.cc
> @@ -261,6 +281,16 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
> case '\n':
> fprintf (out, "\\n\"\n\t\"");
> /* Look for mappings on subsequent lines. */
> + if (UNLIKELY (startswith (input + i, ".target sm_")))
> + {
> + sm_ver = input + i + strlen (".target sm_");
> + continue;
> + }
> + if (UNLIKELY (startswith (input + i, ".version ")))
> + {
> + version = input + i + strlen (".version ");
> + continue;
> + }
To fix this, I've pushed a (very much reduced) partial cherry-pick of
commit r13-171-g22d9c8802add09a93308319fc37dd3a0f1125393
"Come up with {,UN}LIKELY macros" to og12 branch in
commit 44b77201a5431450f608b4538fefb1319127de13, see attached.
Grüße
Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Come-up-with-UN-LIKELY-macros.og12.patch --]
[-- Type: text/x-diff, Size: 1429 bytes --]
From 44b77201a5431450f608b4538fefb1319127de13 Mon Sep 17 00:00:00 2001
From: Martin Liska <mliska@suse.cz>
Date: Thu, 3 Feb 2022 10:58:18 +0100
Subject: [PATCH] Come up with {,UN}LIKELY macros.
gcc/ChangeLog:
* system.h (LIKELY): Define.
(UNLIKELY): Likewise.
(cherry picked from commit 22d9c8802add09a93308319fc37dd3a0f1125393, partial)
---
gcc/ChangeLog.omp | 8 ++++++++
gcc/system.h | 3 +++
2 files changed, 11 insertions(+)
diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp
index 4f80bcbd356..30c3abfc15b 100644
--- a/gcc/ChangeLog.omp
+++ b/gcc/ChangeLog.omp
@@ -1,3 +1,11 @@
+2022-09-23 Thomas Schwinge <thomas@codesourcery.com>
+
+ Backport from master branch:
+ 2022-05-09 Martin Liska <mliska@suse.cz>
+
+ * system.h (LIKELY): Define.
+ (UNLIKELY): Likewise.
+
2022-09-12 Tobias Burnus <tobias@codesourcery.com>
Backport from mainline:
diff --git a/gcc/system.h b/gcc/system.h
index e10c34f70ec..6b6868d0bbf 100644
--- a/gcc/system.h
+++ b/gcc/system.h
@@ -736,6 +736,9 @@ extern int vsnprintf (char *, size_t, const char *, va_list);
#define __builtin_expect(a, b) (a)
#endif
+#define LIKELY(x) (__builtin_expect ((x), 1))
+#define UNLIKELY(x) (__builtin_expect ((x), 0))
+
/* Some of the headers included by <memory> can use "abort" within a
namespace, e.g. "_VSTD::abort();", which fails after we use the
preprocessor to redefine "abort" as "fancy_abort" below. */
--
2.35.1
^ permalink raw reply [flat|nested] 15+ messages in thread
* Fix nvptx-specific '-foffload-options' syntax in 'libgomp.c/reverse-offload-sm30.c' (was: [Patch] nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible)
2022-09-12 12:02 ` [Patch] nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible (was: Re: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup) Tobias Burnus
2022-09-12 12:10 ` Jakub Jelinek
@ 2022-10-17 11:59 ` Thomas Schwinge
1 sibling, 0 replies; 15+ messages in thread
From: Thomas Schwinge @ 2022-10-17 11:59 UTC (permalink / raw)
To: Tobias Burnus, gcc-patches; +Cc: Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 1351 bytes --]
Hi!
On 2022-09-12T14:02:16+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c
> @@ -0,0 +1,15 @@
> +/* { dg-do link { target { offload_target_nvptx } } } */
> +/* { dg-additional-options "-foffload-options=nvptx-none=-march=sm_30 -foffload=-mptx=_" } */
Pushed to master branch
commit b61796663ba1fe8fb83203829398f3f89ec212b7
"Fix nvptx-specific '-foffload-options' syntax in
'libgomp.c/reverse-offload-sm30.c'", see attached.
Cherry-picked pushed to devel/omp/gcc-12 branch in
commit f36ce95ad928578aa6739f61480e6c8fbaf2248e
"Fix nvptx-specific '-foffload-options' syntax in
'libgomp.c/reverse-offload-sm30.c'", see attached.
Grüße
Thomas
> +
> +#pragma omp requires reverse_offload
> +
> +int
> +main ()
> +{
> + #pragma omp target
> + {
> + }
> + return 0;
> +}
> +
> +/* { dg-warning "'omp requires reverse_offload' requires at least 'sm_35' for '-march=' - disabling offload-code generation for this device type" "" { target *-*-* } 0 } */
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Fix-nvptx-specific-foffload-options-syntax-in-libgom.patch --]
[-- Type: text/x-diff, Size: 1327 bytes --]
From b61796663ba1fe8fb83203829398f3f89ec212b7 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 23 Sep 2022 11:29:50 +0200
Subject: [PATCH] Fix nvptx-specific '-foffload-options' syntax in
'libgomp.c/reverse-offload-sm30.c'
That is, '-mptx=_' is only valid in '-foffload-options=nvptx-none', too.
Fix test case added in recent
commit r13-2625-g6b43f556f392a7165582aca36a19fe7389d995b2 "nvptx/mkoffload.cc:
Warn instead of error when reverse offload is not possible".
libgomp/
* testsuite/libgomp.c/reverse-offload-sm30.c: Fix nvptx-specific
'-foffload-options' syntax.
---
libgomp/testsuite/libgomp.c/reverse-offload-sm30.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c b/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c
index fbfeae1fd41..7f10fd4ded9 100644
--- a/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c
+++ b/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c
@@ -1,5 +1,5 @@
/* { dg-do link { target { offload_target_nvptx } } } */
-/* { dg-additional-options "-foffload-options=nvptx-none=-march=sm_30 -foffload=-mptx=_" } */
+/* { dg-additional-options "-foffload-options=nvptx-none=-march=sm_30 -foffload-options=nvptx-none=-mptx=_" } */
#pragma omp requires reverse_offload
--
2.35.1
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0001-Fix-nvptx-specific-foffload-options-syntax-in-l.og12.patch --]
[-- Type: text/x-diff, Size: 2001 bytes --]
From f36ce95ad928578aa6739f61480e6c8fbaf2248e Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 23 Sep 2022 11:29:50 +0200
Subject: [PATCH] Fix nvptx-specific '-foffload-options' syntax in
'libgomp.c/reverse-offload-sm30.c'
That is, '-mptx=_' is only valid in '-foffload-options=nvptx-none', too.
Fix test case added in recent
commit r13-2625-g6b43f556f392a7165582aca36a19fe7389d995b2 "nvptx/mkoffload.cc:
Warn instead of error when reverse offload is not possible".
libgomp/
* testsuite/libgomp.c/reverse-offload-sm30.c: Fix nvptx-specific
'-foffload-options' syntax.
(cherry picked from commit b61796663ba1fe8fb83203829398f3f89ec212b7)
---
libgomp/ChangeLog.omp | 8 ++++++++
libgomp/testsuite/libgomp.c/reverse-offload-sm30.c | 2 +-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index cb3541be378..048314eb1be 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,3 +1,11 @@
+2022-10-17 Thomas Schwinge <thomas@codesourcery.com>
+
+ Backported from master:
+ 2022-10-17 Thomas Schwinge <thomas@codesourcery.com>
+
+ * testsuite/libgomp.c/reverse-offload-sm30.c: Fix nvptx-specific
+ '-foffload-options' syntax.
+
2022-10-14 Julian Brown <julian@codesourcery.com>
* testsuite/libgomp.oacc-fortran/declare-1.f90: Adjust scan output.
diff --git a/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c b/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c
index fbfeae1fd41..7f10fd4ded9 100644
--- a/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c
+++ b/libgomp/testsuite/libgomp.c/reverse-offload-sm30.c
@@ -1,5 +1,5 @@
/* { dg-do link { target { offload_target_nvptx } } } */
-/* { dg-additional-options "-foffload-options=nvptx-none=-march=sm_30 -foffload=-mptx=_" } */
+/* { dg-additional-options "-foffload-options=nvptx-none=-march=sm_30 -foffload-options=nvptx-none=-mptx=_" } */
#pragma omp requires reverse_offload
--
2.35.1
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2022-10-17 11:59 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-25 14:54 [Patch][1/3] libgomp: Prepare for reverse offload fn lookup Tobias Burnus
2022-08-25 14:54 ` Tobias Burnus
2022-08-25 15:38 ` [Patch][2/3] GCN: libgomp+mkoffload.cc: " Tobias Burnus
2022-08-25 15:38 ` Tobias Burnus
2022-09-09 15:31 ` Jakub Jelinek
2022-08-25 17:30 ` [Patch][2/3] nvptx: " Tobias Burnus
2022-08-25 17:30 ` Tobias Burnus
2022-08-29 18:43 ` [Patch][2/3][v2] " Tobias Burnus
2022-08-29 18:43 ` Tobias Burnus
2022-09-09 15:36 ` Jakub Jelinek
2022-09-12 12:02 ` [Patch] nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible (was: Re: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup) Tobias Burnus
2022-09-12 12:10 ` Jakub Jelinek
2022-10-17 11:59 ` Fix nvptx-specific '-foffload-options' syntax in 'libgomp.c/reverse-offload-sm30.c' (was: [Patch] nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible) Thomas Schwinge
2022-09-23 15:40 ` [og12] Come up with {,UN}LIKELY macros (was: [Patch][2/3][v2] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup) Thomas Schwinge
2022-09-09 15:29 ` [Patch][1/3] libgomp: Prepare for reverse offload fn lookup Jakub Jelinek
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).