* [Patch][RFC] GCN: Define ISA archs in gcn-devices.def and use it
@ 2024-03-15 12:21 Tobias Burnus
2024-03-15 13:14 ` Andrew Stubbs
0 siblings, 1 reply; 5+ messages in thread
From: Tobias Burnus @ 2024-03-15 12:21 UTC (permalink / raw)
To: Andrew Stubbs, gcc-patches
[-- Attachment #1.1: Type: text/plain, Size: 1236 bytes --]
Given the large number of AMD GPU ISAs and the number of files which
have to be adapted, I wonder whether it makes sense to consolidate this
a bit, especially in the light that we may want to support more in the
future.
Besides using some macros, I also improved the diagnostic if the object
code couldn't be recognized (shouldn't happen) or if the GPU is
unsupported (likely; it now prints the GPU string). I was initially
thinking of resolving the arch encoded in the eflag to a string, but as
this is about GCC-generated code, it seemed to be unlikely of much use.
[It should that rare that we might also go back to the static string
instead of outputting the hex value of the eflag.]
Note: I only modified mkoffload.cc and plugin-gcn.c, but with some
tweaks it could also be used for other files in gcc/config/gcn/.
If you add a new ISA, you still need to update plugin-gcn.c's
max_isa_vgprs and the xnack/sram-ecc handling in mkoffload.c's main, but
that should be all for those two files.
Thoughts?
Tobias
PS: I think the patch is fine and builds, but I have not tested it on an
AMD GPU machine, yet.
PPS: For using for other files, see also in config/nvptx which uses
nvptx-sm.def to generate several files.
[-- Attachment #2: gcn-def.diff --]
[-- Type: text/x-patch, Size: 13094 bytes --]
GCN: Define ISA archs in gcn-devices.def and use it
Adding new a GCN ISAs requires to update many files, making it more
likely to miss a file; by adding the gcn-devices.def file and using
it in config/gcn/mkoffload.cc and libgomp/plugin/plugin-gcn.c, it
reduces the duplications.
gcc/ChangeLog:
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_...): Replace
explicit #define by an enum created from gcn-devices.def.
(main): Use gcn-devices.def definitions for -march=gfx.* string
parsing.
libgomp/ChangeLog:
* plugin/gcn-devices.def: New file.
* plugin/plugin-gcn.c (gcn_..._s): Remove.
(enum EF_AMDGPU_MACH): Generate EF_AMDGPU_MACH_AMDGCN_...
using gcn-devices.def.
(isa_hsa_name, isa_gcc_name, isa_code): Use gcn-devices.def
to handle the ISAs.
(max_isa_vgprs): Update used enum name (GFX90a -> GFX90A).
(isa_matches_agent, GOMP_OFFLOAD_init_device): Be more verbose
in case of an unsupported ISA.
gcc/config/gcn/mkoffload.cc | 42 ++++++---------
libgomp/plugin/gcn-devices.def | 62 ++++++++++++++++++++++
libgomp/plugin/plugin-gcn.c | 118 +++++++++++++++--------------------------
3 files changed, 119 insertions(+), 103 deletions(-)
diff --git a/gcc/config/gcn/mkoffload.cc b/gcc/config/gcn/mkoffload.cc
index fe443abba21..081110d7030 100644
--- a/gcc/config/gcn/mkoffload.cc
+++ b/gcc/config/gcn/mkoffload.cc
@@ -47,20 +47,14 @@
#undef ELFABIVERSION_AMDGPU_HSA_V4
#define ELFABIVERSION_AMDGPU_HSA_V4 2
-#undef EF_AMDGPU_MACH_AMDGCN_GFX803
-#define EF_AMDGPU_MACH_AMDGCN_GFX803 0x2a
-#undef EF_AMDGPU_MACH_AMDGCN_GFX900
-#define EF_AMDGPU_MACH_AMDGCN_GFX900 0x2c
-#undef EF_AMDGPU_MACH_AMDGCN_GFX906
-#define EF_AMDGPU_MACH_AMDGCN_GFX906 0x2f
-#undef EF_AMDGPU_MACH_AMDGCN_GFX908
-#define EF_AMDGPU_MACH_AMDGCN_GFX908 0x30
-#undef EF_AMDGPU_MACH_AMDGCN_GFX90a
-#define EF_AMDGPU_MACH_AMDGCN_GFX90a 0x3f
-#undef EF_AMDGPU_MACH_AMDGCN_GFX1030
-#define EF_AMDGPU_MACH_AMDGCN_GFX1030 0x36
-#undef EF_AMDGPU_MACH_AMDGCN_GFX1100
-#define EF_AMDGPU_MACH_AMDGCN_GFX1100 0x41
+/* Use an enum as macros cannot define macros and
+ assume that EF_AMDGPU_MACH_AMDGCN_... is not #defined. */
+enum {
+#define AMDGPU_ISA(suffix, str, val) \
+ EF_AMDGPU_MACH_AMDGCN_ ## suffix = val,
+#include "../libgomp/plugin/gcn-devices.def"
+#undef AMDGPU_ISA
+};
#define EF_AMDGPU_FEATURE_XNACK_V4 0x300 /* Mask. */
#define EF_AMDGPU_FEATURE_XNACK_UNSUPPORTED_V4 0x000
@@ -959,18 +953,12 @@ main (int argc, char **argv)
dumppfx = argv[++i];
else if (strcmp (argv[i], "-march=fiji") == 0)
elf_arch = EF_AMDGPU_MACH_AMDGCN_GFX803;
- else if (strcmp (argv[i], "-march=gfx900") == 0)
- elf_arch = EF_AMDGPU_MACH_AMDGCN_GFX900;
- else if (strcmp (argv[i], "-march=gfx906") == 0)
- elf_arch = EF_AMDGPU_MACH_AMDGCN_GFX906;
- else if (strcmp (argv[i], "-march=gfx908") == 0)
- elf_arch = EF_AMDGPU_MACH_AMDGCN_GFX908;
- else if (strcmp (argv[i], "-march=gfx90a") == 0)
- elf_arch = EF_AMDGPU_MACH_AMDGCN_GFX90a;
- else if (strcmp (argv[i], "-march=gfx1030") == 0)
- elf_arch = EF_AMDGPU_MACH_AMDGCN_GFX1030;
- else if (strcmp (argv[i], "-march=gfx1100") == 0)
- elf_arch = EF_AMDGPU_MACH_AMDGCN_GFX1100;
+#define AMDGPU_ISA(suffix, str, val) \
+ else if (strcmp (argv[i], "-march=" str) == 0) \
+ elf_arch = EF_AMDGPU_MACH_AMDGCN_ ## suffix;
+#include "../libgomp/plugin/gcn-devices.def"
+#undef AMDGPU_ISA
+
#define STR "-mstack-size="
else if (startswith (argv[i], STR))
gcn_stack_size = atoi (argv[i] + strlen (STR));
@@ -1029,7 +1017,7 @@ main (int argc, char **argv)
if (TEST_SRAM_ECC_UNSET (elf_flags))
SET_SRAM_ECC_ANY (elf_flags);
break;
- case EF_AMDGPU_MACH_AMDGCN_GFX90a:
+ case EF_AMDGPU_MACH_AMDGCN_GFX90A:
if (TEST_XNACK_UNSET (elf_flags))
SET_XNACK_ANY (elf_flags);
if (TEST_SRAM_ECC_UNSET (elf_flags))
diff --git a/libgomp/plugin/gcn-devices.def b/libgomp/plugin/gcn-devices.def
new file mode 100644
index 00000000000..8a51fa16fbc
--- /dev/null
+++ b/libgomp/plugin/gcn-devices.def
@@ -0,0 +1,62 @@
+/* The enum mirrors the corresponding LLVM enum's values for the ISAs.
+ See https://llvm.org/docs/AMDGPUUsage.html#amdgpu-ef-amdgpu-mach-table
+ and llvm/include/llvm/BinaryFormat/ELF.h. */
+
+#ifndef AMDGPU_ISA_UNSUPPORTED
+#define AMDGPU_ISA_UNSUPPORTED(suffix, str, val)
+#endif
+
+AMDGPU_ISA_UNSUPPORTED (GFX600, "gfx600", 0x020)
+AMDGPU_ISA_UNSUPPORTED (GFX601, "gfx601", 0x021)
+AMDGPU_ISA_UNSUPPORTED (GFX700, "gfx700", 0x022)
+AMDGPU_ISA_UNSUPPORTED (GFX701, "gfx701", 0x023)
+AMDGPU_ISA_UNSUPPORTED (GFX702, "gfx702", 0x024)
+AMDGPU_ISA_UNSUPPORTED (GFX703, "gfx703", 0x025)
+AMDGPU_ISA_UNSUPPORTED (GFX704, "gfx704", 0x026)
+AMDGPU_ISA_UNSUPPORTED (RESERVED_0X27, "gfx704", 0x027)
+AMDGPU_ISA_UNSUPPORTED (GFX801, "gfx801", 0x028)
+AMDGPU_ISA_UNSUPPORTED (GFX802, "gfx802", 0x029)
+AMDGPU_ISA (GFX803, "gfx803", 0x02a) /* FIJI */
+AMDGPU_ISA_UNSUPPORTED (GFX810, "gfx810", 0x02b)
+AMDGPU_ISA (GFX900, "gfx900", 0x02c)
+AMDGPU_ISA_UNSUPPORTED (GFX902, "gfx902", 0x02d)
+AMDGPU_ISA_UNSUPPORTED (GFX904, "gfx904", 0x02e)
+AMDGPU_ISA (GFX906, "gfx906", 0x02f)
+AMDGPU_ISA (GFX908, "gfx908", 0x030)
+AMDGPU_ISA_UNSUPPORTED (GFX909, "gfx909", 0x031)
+AMDGPU_ISA_UNSUPPORTED (GFX90C, "gfx90c", 0x032)
+AMDGPU_ISA_UNSUPPORTED (GFX1010, "gfx1010", 0x033)
+AMDGPU_ISA_UNSUPPORTED (GFX1011, "gfx1011", 0x034)
+AMDGPU_ISA_UNSUPPORTED (GFX1012, "gfx1012", 0x035)
+AMDGPU_ISA (GFX1030, "gfx1030", 0x036)
+AMDGPU_ISA_UNSUPPORTED (GFX1031, "gfx1031", 0x037)
+AMDGPU_ISA_UNSUPPORTED (GFX1032, "gfx1032", 0x038)
+AMDGPU_ISA_UNSUPPORTED (GFX1033, "gfx1033", 0x039)
+AMDGPU_ISA_UNSUPPORTED (GFX602, "gfx602", 0x03a)
+AMDGPU_ISA_UNSUPPORTED (GFX705, "gfx705", 0x03b)
+AMDGPU_ISA_UNSUPPORTED (GFX805, "gfx805", 0x03c)
+AMDGPU_ISA_UNSUPPORTED (GFX1035, "gfx1035", 0x03d)
+AMDGPU_ISA_UNSUPPORTED (GFX1034, "gfx1034", 0x03e)
+AMDGPU_ISA (GFX90A, "gfx90a", 0x03f)
+AMDGPU_ISA_UNSUPPORTED (GFX940, "gfx940", 0x040)
+AMDGPU_ISA (GFX1100, "gfx1100", 0x041)
+AMDGPU_ISA_UNSUPPORTED (GFX1013, "gfx1013", 0x042)
+AMDGPU_ISA_UNSUPPORTED (GFX1150, "gfx1150", 0x043)
+AMDGPU_ISA_UNSUPPORTED (GFX1103, "gfx1103", 0x044)
+AMDGPU_ISA_UNSUPPORTED (GFX1036, "gfx1036", 0x045)
+AMDGPU_ISA_UNSUPPORTED (GFX1101, "gfx1101", 0x046)
+AMDGPU_ISA_UNSUPPORTED (GFX1102, "gfx1102", 0x047)
+AMDGPU_ISA_UNSUPPORTED (GFX1200, "gfx1200", 0x048)
+AMDGPU_ISA_UNSUPPORTED (RESERVED_0X49, "reserved_0x49", 0x049)
+AMDGPU_ISA_UNSUPPORTED (GFX1151, "gfx1151", 0x04a)
+AMDGPU_ISA_UNSUPPORTED (GFX941, "gfx941", 0x04b)
+AMDGPU_ISA_UNSUPPORTED (GFX942, "gfx942", 0x04c)
+AMDGPU_ISA_UNSUPPORTED (RESERVED_0X4D, "reserved_0x4d", 0x04d)
+AMDGPU_ISA_UNSUPPORTED (GFX1201, "gfx1201", 0x04e)
+AMDGPU_ISA_UNSUPPORTED (RESERVED_0X4F, "reserved_0x4f", 0x04f)
+AMDGPU_ISA_UNSUPPORTED (RESERVED_0X50, "reserved_0x50", 0x050)
+AMDGPU_ISA_UNSUPPORTED (GFX9_GENERIC, "gfx9_generic", 0x051)
+AMDGPU_ISA_UNSUPPORTED (GFX10_1_GENERIC, "gfx10_1_generic", 0x052)
+AMDGPU_ISA_UNSUPPORTED (GFX10_3_GENERIC, "gfx10_3_generic", 0x053)
+AMDGPU_ISA_UNSUPPORTED (GFX11_GENERIC, "gfx11_generic", 0x054)
+AMDGPU_ISA_UNSUPPORTED (RESERVED_0X55, "reserved 0x55", 0x055)
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 7e141a85f31..31bf8fd22d3 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -379,19 +379,11 @@ struct gcn_image_desc
const unsigned global_variable_count;
};
-/* This enum mirrors the corresponding LLVM enum's values for all ISAs that we
- support.
- See https://llvm.org/docs/AMDGPUUsage.html#amdgpu-ef-amdgpu-mach-table */
-
typedef enum {
EF_AMDGPU_MACH_UNSUPPORTED = -1,
- EF_AMDGPU_MACH_AMDGCN_GFX803 = 0x02a,
- EF_AMDGPU_MACH_AMDGCN_GFX900 = 0x02c,
- EF_AMDGPU_MACH_AMDGCN_GFX906 = 0x02f,
- EF_AMDGPU_MACH_AMDGCN_GFX908 = 0x030,
- EF_AMDGPU_MACH_AMDGCN_GFX90a = 0x03f,
- EF_AMDGPU_MACH_AMDGCN_GFX1030 = 0x036,
- EF_AMDGPU_MACH_AMDGCN_GFX1100 = 0x041
+#define AMDGPU_ISA(suffix, str, val) EF_AMDGPU_MACH_AMDGCN_ ## suffix = val,
+#include "gcn-devices.def"
+#undef AMDGPU_ISA
} EF_AMDGPU_MACH;
const static int EF_AMDGPU_MACH_MASK = 0x000000ff;
@@ -1670,36 +1662,19 @@ elf_gcn_isa_field (Elf64_Ehdr *image)
return image->e_flags & EF_AMDGPU_MACH_MASK;
}
-const static char *gcn_gfx803_s = "gfx803";
-const static char *gcn_gfx900_s = "gfx900";
-const static char *gcn_gfx906_s = "gfx906";
-const static char *gcn_gfx908_s = "gfx908";
-const static char *gcn_gfx90a_s = "gfx90a";
-const static char *gcn_gfx1030_s = "gfx1030";
-const static char *gcn_gfx1100_s = "gfx1100";
-const static int gcn_isa_name_len = 7;
-
/* Returns the name that the HSA runtime uses for the ISA or NULL if we do not
support the ISA. */
static const char*
-isa_hsa_name (int isa) {
+isa_hsa_name (int isa)
+{
switch(isa)
{
- case EF_AMDGPU_MACH_AMDGCN_GFX803:
- return gcn_gfx803_s;
- case EF_AMDGPU_MACH_AMDGCN_GFX900:
- return gcn_gfx900_s;
- case EF_AMDGPU_MACH_AMDGCN_GFX906:
- return gcn_gfx906_s;
- case EF_AMDGPU_MACH_AMDGCN_GFX908:
- return gcn_gfx908_s;
- case EF_AMDGPU_MACH_AMDGCN_GFX90a:
- return gcn_gfx90a_s;
- case EF_AMDGPU_MACH_AMDGCN_GFX1030:
- return gcn_gfx1030_s;
- case EF_AMDGPU_MACH_AMDGCN_GFX1100:
- return gcn_gfx1100_s;
+#define AMDGPU_ISA(suffix, str, val) \
+ case EF_AMDGPU_MACH_AMDGCN_ ## suffix: \
+ return str;
+#include "gcn-devices.def"
+#undef AMDGPU_ISA
}
return NULL;
}
@@ -1709,7 +1684,8 @@ isa_hsa_name (int isa) {
Keep in sync with /gcc/config/gcn/gcn.{c,opt}. */
static const char*
-isa_gcc_name (int isa) {
+isa_gcc_name (int isa)
+{
switch(isa)
{
case EF_AMDGPU_MACH_AMDGCN_GFX803:
@@ -1723,27 +1699,13 @@ isa_gcc_name (int isa) {
the given name (as used by the HSA runtime). */
static gcn_isa
-isa_code(const char *isa) {
- if (!strncmp (isa, gcn_gfx803_s, gcn_isa_name_len))
- return EF_AMDGPU_MACH_AMDGCN_GFX803;
-
- if (!strncmp (isa, gcn_gfx900_s, gcn_isa_name_len))
- return EF_AMDGPU_MACH_AMDGCN_GFX900;
-
- if (!strncmp (isa, gcn_gfx906_s, gcn_isa_name_len))
- return EF_AMDGPU_MACH_AMDGCN_GFX906;
-
- if (!strncmp (isa, gcn_gfx908_s, gcn_isa_name_len))
- return EF_AMDGPU_MACH_AMDGCN_GFX908;
-
- if (!strncmp (isa, gcn_gfx90a_s, gcn_isa_name_len))
- return EF_AMDGPU_MACH_AMDGCN_GFX90a;
-
- if (!strncmp (isa, gcn_gfx1030_s, gcn_isa_name_len))
- return EF_AMDGPU_MACH_AMDGCN_GFX1030;
-
- if (!strncmp (isa, gcn_gfx1100_s, gcn_isa_name_len))
- return EF_AMDGPU_MACH_AMDGCN_GFX1100;
+isa_code(const char *isa)
+{
+#define AMDGPU_ISA(suffix, str, val) \
+ if (!strcmp (str, isa)) \
+ return EF_AMDGPU_MACH_AMDGCN_ ## suffix;
+#include "gcn-devices.def"
+#undef AMDGPU_ISA
return EF_AMDGPU_MACH_UNSUPPORTED;
}
@@ -1760,7 +1722,7 @@ max_isa_vgprs (int isa)
case EF_AMDGPU_MACH_AMDGCN_GFX906:
case EF_AMDGPU_MACH_AMDGCN_GFX908:
return 256;
- case EF_AMDGPU_MACH_AMDGCN_GFX90a:
+ case EF_AMDGPU_MACH_AMDGCN_GFX90A:
return 512;
case EF_AMDGPU_MACH_AMDGCN_GFX1030:
return 512; /* 512 SIMD32 = 256 wavefrontsize64. */
@@ -2466,25 +2428,24 @@ isa_matches_agent (struct agent_info *agent, Elf64_Ehdr *image)
{
int isa_field = elf_gcn_isa_field (image);
const char* isa_s = isa_hsa_name (isa_field);
- if (!isa_s)
- {
- hsa_error ("Unsupported ISA in GCN code object.", HSA_STATUS_ERROR);
- return false;
- }
- if (isa_field != agent->device_isa)
+ if (!isa_s || isa_field != agent->device_isa)
{
- char msg[120];
- const char *agent_isa_s = isa_hsa_name (agent->device_isa);
- const char *agent_isa_gcc_s = isa_gcc_name (agent->device_isa);
- assert (agent_isa_s);
- assert (agent_isa_gcc_s);
-
- snprintf (msg, sizeof msg,
- "GCN code object ISA '%s' does not match GPU ISA '%s'.\n"
- "Try to recompile with '-foffload-options=-march=%s'.\n",
- isa_s, agent_isa_s, agent_isa_gcc_s);
-
+ char msg[101 + 3*7 + 1];
+ if (!isa_s)
+ snprintf (msg, sizeof msg,
+ "Unsupported ISA %x in GCN code object.", isa_field);
+ else
+ {
+ const char *agent_isa_s = isa_hsa_name (agent->device_isa);
+ const char *agent_isa_gcc_s = isa_gcc_name (agent->device_isa);
+ assert (agent_isa_s);
+ assert (agent_isa_gcc_s);
+ snprintf (msg, sizeof msg,
+ "GCN code object ISA '%s' does not match GPU ISA '%s'.\n"
+ "Try to recompile with '-foffload-options=-march=%s'.\n",
+ isa_s, agent_isa_s, agent_isa_gcc_s);
+ }
hsa_error (msg, HSA_STATUS_ERROR);
return false;
}
@@ -3393,7 +3354,12 @@ GOMP_OFFLOAD_init_device (int n)
agent->device_isa = isa_code (agent->name);
if (agent->device_isa == EF_AMDGPU_MACH_UNSUPPORTED)
- return hsa_error ("Unknown GCN agent architecture", HSA_STATUS_ERROR);
+ {
+ char msg[31 + 64 + 1];
+ snprintf (msg, sizeof msg,
+ "Unknown GCN agent architecture %s", agent->name);
+ return hsa_error (msg, HSA_STATUS_ERROR);
+ }
status = hsa_fns.hsa_agent_get_info_fn (agent->id, HSA_AGENT_INFO_VENDOR_NAME,
&agent->vendor_name);
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Patch][RFC] GCN: Define ISA archs in gcn-devices.def and use it
2024-03-15 12:21 [Patch][RFC] GCN: Define ISA archs in gcn-devices.def and use it Tobias Burnus
@ 2024-03-15 13:14 ` Andrew Stubbs
2024-03-15 13:56 ` Tobias Burnus
0 siblings, 1 reply; 5+ messages in thread
From: Andrew Stubbs @ 2024-03-15 13:14 UTC (permalink / raw)
To: Tobias Burnus, gcc-patches
On 15/03/2024 12:21, Tobias Burnus wrote:
> Given the large number of AMD GPU ISAs and the number of files which
> have to be adapted, I wonder whether it makes sense to consolidate this
> a bit, especially in the light that we may want to support more in the
> future.
>
> Besides using some macros, I also improved the diagnostic if the object
> code couldn't be recognized (shouldn't happen) or if the GPU is
> unsupported (likely; it now prints the GPU string). I was initially
> thinking of resolving the arch encoded in the eflag to a string, but as
> this is about GCC-generated code, it seemed to be unlikely of much use.
> [It should that rare that we might also go back to the static string
> instead of outputting the hex value of the eflag.]
>
> Note: I only modified mkoffload.cc and plugin-gcn.c, but with some
> tweaks it could also be used for other files in gcc/config/gcn/.
>
> If you add a new ISA, you still need to update plugin-gcn.c's
> max_isa_vgprs and the xnack/sram-ecc handling in mkoffload.c's main, but
> that should be all for those two files.
>
> Thoughts?
This is more-or-less what I was planning to do myself, but as I want to
include all the other features that get parametrized in gcn.cc, gcn.h,
gcn-hsa.h, gcn-opts.h, I hadn't got around to it yet. Unfortunately, I
think the gcn.opt and config.gcc will always need manually updating, but
if that's all it'll be an improvement.
I don't like the idea of including AMDGPU_ISA_UNSUPPORTED; that list is
going to be permanently out of date, and even if we maintain it
fastidiously last year's release isn't going to have the updated list in
the wild. I think it's not actually active in this patch in any case.
Instead of AMDGPU_ISA, I think "AMDGPU_ELF" makes more sense. The ISA is
"CDNA2" or "RDNA3", etc., and the compiler needs to know about that.
Ultimately, I want to replace many of the conditionals like
"TARGET_CDNA2_PLUS" from the code and replace them with feature flags
derived from a def file, or at least a header file. We've acquired too
many places where there are unsearchable conditionals that need finding
and fixing every time a new device comes along.
I had imagined that this .def file would exist in gcc/config/gcn, but
you've placed it in libgomp.... maybe it makes sense to have multiple
such files if they contain very different data, but I had imagined one
file and I'm not sure that the compiler definitions live in libgomp.
> Tobias
>
> PS: I think the patch is fine and builds, but I have not tested it on an
> AMD GPU machine, yet.
>
> PPS: For using for other files, see also in config/nvptx which uses
> nvptx-sm.def to generate several files.
>
Andrew
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Patch][RFC] GCN: Define ISA archs in gcn-devices.def and use it
2024-03-15 13:14 ` Andrew Stubbs
@ 2024-03-15 13:56 ` Tobias Burnus
2024-03-15 16:35 ` Andrew Stubbs
0 siblings, 1 reply; 5+ messages in thread
From: Tobias Burnus @ 2024-03-15 13:56 UTC (permalink / raw)
To: Andrew Stubbs, gcc-patches
Hi Andrew,
Andrew Stubbs wrote:
> This is more-or-less what I was planning to do myself, but as I want
> to include all the other features that get parametrized in gcn.cc,
> gcn.h, gcn-hsa.h, gcn-opts.h, I hadn't got around to it yet.
> Unfortunately, I think the gcn.opt and config.gcc will always need
> manually updating, but if that's all it'll be an improvement.
Well, for .opt see how nvptx does it – it actually generates an .opt file.
> I don't like the idea of including AMDGPU_ISA_UNSUPPORTED;
I concur – I was initially thinking of reporting the device name
("Unsupported %s") but I then realized that the agent returns a string
while only for GCC generated files (→ eflag) the hexcode is used. Thus,
I ended up not using it.
> Ultimately, I want to replace many of the conditionals like
> "TARGET_CDNA2_PLUS" from the code and replace them with feature flags
> derived from a def file, or at least a header file. We've acquired too
> many places where there are unsearchable conditionals that need
> finding and fixing every time a new device comes along.
I was thinking of having more flags, but those where the only ones
required for the two files.
> I had imagined that this .def file would exist in gcc/config/gcn, but
> you've placed it in libgomp.... maybe it makes sense to have multiple
> such files if they contain very different data, but I had imagined one
> file and I'm not sure that the compiler definitions live in libgomp.
There is already:
gcc/config/darwin-c.cc:#include "../../libcpp/internal.h"
gcc/config/gcn/gcn-run.cc:#include
"../../../libgomp/config/gcn/libgomp-gcn.h"
gcc/fortran/cpp.cc:#include "../../libcpp/internal.h"
gcc/fortran/trigd_fe.inc:#include "../../libgfortran/intrinsics/trigd.inc"
But there is also the reverse:
libcpp/lex.cc:#include "../gcc/config/i386/cpuid.h"
libgfortran/libgfortran.h:#include "../gcc/fortran/libgfortran.h"
lto-plugin/lto-plugin.c:#include "../gcc/lto/common.h"
If you add more items, it is probably better to have it under
gcc/config/gcn/ - and I really prefer a single file for all.
* * *
Talking about feature sets: This would be a bit like LLVM (see below)
but I think they have a bit too much indirections. But I do concur that
we need to consolidate the current support – and hopefully make it
easier to keep adding more GPU support; we seem to have already covered
a larger chunk :-)
I also did wonder whether we should support, e.g., running a gfx1100
code (or a gfx11-generic one) on, e.g., a gfx1103 device. Alternatively,
we could keep the current check which requires an exact match.
BTW: I do note that looking at the feature sets in LLVM that all GFX110x
GPUs seem to have common silicon bugs: FeatureMSAALoadDstSelBug and
FeatureMADIntraFwdBug, while 1100 and 1102 additionally have the
FeatureUserSGPRInit16Bug but 1101 and 1103 don't. — For some reasons,
FeatureISAVersion11_Generic only consists of two of those bugs (it
doesn't have FeatureMADIntraFwdBug), which doesn't seem to be that
consistent. Maybe the workaround has issues elsewhere? If so, a generic
-march=gfx11 might be not as useful as one might hope for.
* * *
If I look at LLVM's
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/AMDGPU.td
,
they first define several features – like 'FeatureUnalignedScratchAccess'.
Then they combine them like in:
def FeatureISAVersion11_Common ... [FeatureGFX11, ...
FeatureAtomicFaddRtnInsts ...
And then they use those to map them to feature sets like:
def FeatureISAVersion11_0_Common ...
listconcat(FeatureISAVersion11_Common.Features,
[FeatureMSAALoadDstSelBug ...
And for gfx1103:
def FeatureISAVersion11_0_3 : FeatureSet<
!listconcat(FeatureISAVersion11_0_Common.Features,
[])>;
The mapping to gfx... names then happens in
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/GCNProcessors.td
such as:
def : ProcessorModel<"gfx1103", GFX11SpeedModel,
FeatureISAVersion11_0_3.Features
>;
Or for the generic one, i.e.:
// [gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151]
def : ProcessorModel<"gfx11-generic", GFX11SpeedModel,
FeatureISAVersion11_Generic.Features
LLVM also has some generic flags like the following in
https://github.com/llvm/llvm-project/blob/main/llvm/lib/TargetParser/TargetParser.cpp
{{"gfx1013"}, {"gfx1013"}, GK_GFX1013,
FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_WAVE32|FEATURE_XNACK|FEATURE_WGP},
I hope that this will give some inspiration – but I assume that at least
the initial implementation will be much shorter.
Tobias
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Patch][RFC] GCN: Define ISA archs in gcn-devices.def and use it
2024-03-15 13:56 ` Tobias Burnus
@ 2024-03-15 16:35 ` Andrew Stubbs
2024-03-18 7:47 ` Richard Biener
0 siblings, 1 reply; 5+ messages in thread
From: Andrew Stubbs @ 2024-03-15 16:35 UTC (permalink / raw)
To: Tobias Burnus, gcc-patches
On 15/03/2024 13:56, Tobias Burnus wrote:
> Hi Andrew,
>
> Andrew Stubbs wrote:
>> This is more-or-less what I was planning to do myself, but as I want
>> to include all the other features that get parametrized in gcn.cc,
>> gcn.h, gcn-hsa.h, gcn-opts.h, I hadn't got around to it yet.
>> Unfortunately, I think the gcn.opt and config.gcc will always need
>> manually updating, but if that's all it'll be an improvement.
>
> Well, for .opt see how nvptx does it – it actually generates an .opt file.
>
>> I don't like the idea of including AMDGPU_ISA_UNSUPPORTED;
>
> I concur – I was initially thinking of reporting the device name
> ("Unsupported %s") but I then realized that the agent returns a string
> while only for GCC generated files (→ eflag) the hexcode is used. Thus,
> I ended up not using it.
>
>> Ultimately, I want to replace many of the conditionals like
>> "TARGET_CDNA2_PLUS" from the code and replace them with feature flags
>> derived from a def file, or at least a header file. We've acquired too
>> many places where there are unsearchable conditionals that need
>> finding and fixing every time a new device comes along.
> I was thinking of having more flags, but those where the only ones
> required for the two files.
>> I had imagined that this .def file would exist in gcc/config/gcn, but
>> you've placed it in libgomp.... maybe it makes sense to have multiple
>> such files if they contain very different data, but I had imagined one
>> file and I'm not sure that the compiler definitions live in libgomp.
>
> There is already:
>
> gcc/config/darwin-c.cc:#include "../../libcpp/internal.h"
>
> gcc/config/gcn/gcn-run.cc:#include
> "../../../libgomp/config/gcn/libgomp-gcn.h"
>
> gcc/fortran/cpp.cc:#include "../../libcpp/internal.h"
>
> gcc/fortran/trigd_fe.inc:#include "../../libgfortran/intrinsics/trigd.inc"
>
> But there is also the reverse:
>
> libcpp/lex.cc:#include "../gcc/config/i386/cpuid.h"
>
> libgfortran/libgfortran.h:#include "../gcc/fortran/libgfortran.h"
>
> lto-plugin/lto-plugin.c:#include "../gcc/lto/common.h"
>
> If you add more items, it is probably better to have it under
> gcc/config/gcn/ - and I really prefer a single file for all.
>
> * * *
>
> Talking about feature sets: This would be a bit like LLVM (see below)
> but I think they have a bit too much indirections. But I do concur that
> we need to consolidate the current support – and hopefully make it
> easier to keep adding more GPU support; we seem to have already covered
> a larger chunk :-)
>
> I also did wonder whether we should support, e.g., running a gfx1100
> code (or a gfx11-generic one) on, e.g., a gfx1103 device. Alternatively,
> we could keep the current check which requires an exact match.
We didn't invent that restriction; the runtime won't let you do it. We
only have the check because the HSA/ROCr error message is not very
user-friendly.
> BTW: I do note that looking at the feature sets in LLVM that all GFX110x
> GPUs seem to have common silicon bugs: FeatureMSAALoadDstSelBug and
> FeatureMADIntraFwdBug, while 1100 and 1102 additionally have the
> FeatureUserSGPRInit16Bug but 1101 and 1103 don't. — For some reasons,
> FeatureISAVersion11_Generic only consists of two of those bugs (it
> doesn't have FeatureMADIntraFwdBug), which doesn't seem to be that
> consistent. Maybe the workaround has issues elsewhere? If so, a generic
> -march=gfx11 might be not as useful as one might hope for.
>
> * * *
>
> If I look at LLVM's
> https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/AMDGPU.td ,
>
> they first define several features – like 'FeatureUnalignedScratchAccess'.
>
> Then they combine them like in:
>
> def FeatureISAVersion11_Common ... [FeatureGFX11, ...
> FeatureAtomicFaddRtnInsts ...
>
> And then they use those to map them to feature sets like:
>
> def FeatureISAVersion11_0_Common ...
> listconcat(FeatureISAVersion11_Common.Features,
> [FeatureMSAALoadDstSelBug ...
>
> And for gfx1103:
>
> def FeatureISAVersion11_0_3 : FeatureSet<
> !listconcat(FeatureISAVersion11_0_Common.Features,
> [])>;
>
> The mapping to gfx... names then happens in
> https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/GCNProcessors.td such as:
>
> def : ProcessorModel<"gfx1103", GFX11SpeedModel,
> FeatureISAVersion11_0_3.Features
> >;
>
> Or for the generic one, i.e.:
>
> // [gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151]
> def : ProcessorModel<"gfx11-generic", GFX11SpeedModel,
> FeatureISAVersion11_Generic.Features
>
> LLVM also has some generic flags like the following in
> https://github.com/llvm/llvm-project/blob/main/llvm/lib/TargetParser/TargetParser.cpp
>
> {{"gfx1013"}, {"gfx1013"}, GK_GFX1013,
> FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_WAVE32|FEATURE_XNACK|FEATURE_WGP},
>
> I hope that this will give some inspiration – but I assume that at least
> the initial implementation will be much shorter.
Yeah, we can have one macro for each arch, or multiple macros for
building different tables. First one seems easier but less readable,
second one will need some thinking about. Probably best to keep it
simple though.
Andrew
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Patch][RFC] GCN: Define ISA archs in gcn-devices.def and use it
2024-03-15 16:35 ` Andrew Stubbs
@ 2024-03-18 7:47 ` Richard Biener
0 siblings, 0 replies; 5+ messages in thread
From: Richard Biener @ 2024-03-18 7:47 UTC (permalink / raw)
To: Andrew Stubbs; +Cc: Tobias Burnus, gcc-patches
On Fri, Mar 15, 2024 at 5:36 PM Andrew Stubbs <ams@baylibre.com> wrote:
>
> On 15/03/2024 13:56, Tobias Burnus wrote:
> > Hi Andrew,
> >
> > Andrew Stubbs wrote:
> >> This is more-or-less what I was planning to do myself, but as I want
> >> to include all the other features that get parametrized in gcn.cc,
> >> gcn.h, gcn-hsa.h, gcn-opts.h, I hadn't got around to it yet.
> >> Unfortunately, I think the gcn.opt and config.gcc will always need
> >> manually updating, but if that's all it'll be an improvement.
> >
> > Well, for .opt see how nvptx does it – it actually generates an .opt file.
> >
> >> I don't like the idea of including AMDGPU_ISA_UNSUPPORTED;
> >
> > I concur – I was initially thinking of reporting the device name
> > ("Unsupported %s") but I then realized that the agent returns a string
> > while only for GCC generated files (→ eflag) the hexcode is used. Thus,
> > I ended up not using it.
> >
> >> Ultimately, I want to replace many of the conditionals like
> >> "TARGET_CDNA2_PLUS" from the code and replace them with feature flags
> >> derived from a def file, or at least a header file. We've acquired too
> >> many places where there are unsearchable conditionals that need
> >> finding and fixing every time a new device comes along.
> > I was thinking of having more flags, but those where the only ones
> > required for the two files.
> >> I had imagined that this .def file would exist in gcc/config/gcn, but
> >> you've placed it in libgomp.... maybe it makes sense to have multiple
> >> such files if they contain very different data, but I had imagined one
> >> file and I'm not sure that the compiler definitions live in libgomp.
> >
> > There is already:
> >
> > gcc/config/darwin-c.cc:#include "../../libcpp/internal.h"
> >
> > gcc/config/gcn/gcn-run.cc:#include
> > "../../../libgomp/config/gcn/libgomp-gcn.h"
> >
> > gcc/fortran/cpp.cc:#include "../../libcpp/internal.h"
> >
> > gcc/fortran/trigd_fe.inc:#include "../../libgfortran/intrinsics/trigd.inc"
> >
> > But there is also the reverse:
> >
> > libcpp/lex.cc:#include "../gcc/config/i386/cpuid.h"
> >
> > libgfortran/libgfortran.h:#include "../gcc/fortran/libgfortran.h"
> >
> > lto-plugin/lto-plugin.c:#include "../gcc/lto/common.h"
> >
> > If you add more items, it is probably better to have it under
> > gcc/config/gcn/ - and I really prefer a single file for all.
> >
> > * * *
> >
> > Talking about feature sets: This would be a bit like LLVM (see below)
> > but I think they have a bit too much indirections. But I do concur that
> > we need to consolidate the current support – and hopefully make it
> > easier to keep adding more GPU support; we seem to have already covered
> > a larger chunk :-)
> >
> > I also did wonder whether we should support, e.g., running a gfx1100
> > code (or a gfx11-generic one) on, e.g., a gfx1103 device. Alternatively,
> > we could keep the current check which requires an exact match.
>
> We didn't invent that restriction; the runtime won't let you do it. We
> only have the check because the HSA/ROCr error message is not very
> user-friendly.
Note that I heard/read somewhere that they plan to support a "blend" version
that would allow running a kernel on any gfx11xx or gfx106x versions (supposedly
at some runtime cost). Guess we'll need to watch the LLVM side of things here
(or the ROCm runtime side of it).
> > BTW: I do note that looking at the feature sets in LLVM that all GFX110x
> > GPUs seem to have common silicon bugs: FeatureMSAALoadDstSelBug and
> > FeatureMADIntraFwdBug, while 1100 and 1102 additionally have the
> > FeatureUserSGPRInit16Bug but 1101 and 1103 don't. — For some reasons,
> > FeatureISAVersion11_Generic only consists of two of those bugs (it
> > doesn't have FeatureMADIntraFwdBug), which doesn't seem to be that
> > consistent. Maybe the workaround has issues elsewhere? If so, a generic
> > -march=gfx11 might be not as useful as one might hope for.
> >
> > * * *
> >
> > If I look at LLVM's
> > https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/AMDGPU.td ,
> >
> > they first define several features – like 'FeatureUnalignedScratchAccess'.
> >
> > Then they combine them like in:
> >
> > def FeatureISAVersion11_Common ... [FeatureGFX11, ...
> > FeatureAtomicFaddRtnInsts ...
> >
> > And then they use those to map them to feature sets like:
> >
> > def FeatureISAVersion11_0_Common ...
> > listconcat(FeatureISAVersion11_Common.Features,
> > [FeatureMSAALoadDstSelBug ...
> >
> > And for gfx1103:
> >
> > def FeatureISAVersion11_0_3 : FeatureSet<
> > !listconcat(FeatureISAVersion11_0_Common.Features,
> > [])>;
> >
> > The mapping to gfx... names then happens in
> > https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/GCNProcessors.td such as:
> >
> > def : ProcessorModel<"gfx1103", GFX11SpeedModel,
> > FeatureISAVersion11_0_3.Features
> > >;
> >
> > Or for the generic one, i.e.:
> >
> > // [gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151]
> > def : ProcessorModel<"gfx11-generic", GFX11SpeedModel,
> > FeatureISAVersion11_Generic.Features
> >
> > LLVM also has some generic flags like the following in
> > https://github.com/llvm/llvm-project/blob/main/llvm/lib/TargetParser/TargetParser.cpp
> >
> > {{"gfx1013"}, {"gfx1013"}, GK_GFX1013,
> > FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_WAVE32|FEATURE_XNACK|FEATURE_WGP},
> >
> > I hope that this will give some inspiration – but I assume that at least
> > the initial implementation will be much shorter.
>
> Yeah, we can have one macro for each arch, or multiple macros for
> building different tables. First one seems easier but less readable,
> second one will need some thinking about. Probably best to keep it
> simple though.
I'd say whatever gets us to smaller required changes to say introduce support
for gfx1013 is welcome. Applies to the libgomp plugin side as well, of course.
Richard.
> Andrew
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-03-18 7:47 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-15 12:21 [Patch][RFC] GCN: Define ISA archs in gcn-devices.def and use it Tobias Burnus
2024-03-15 13:14 ` Andrew Stubbs
2024-03-15 13:56 ` Tobias Burnus
2024-03-15 16:35 ` Andrew Stubbs
2024-03-18 7:47 ` Richard Biener
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).