public inbox for gcc-patches@gcc.gnu.org
* [OpenMP/offloading][RFC] How to handle target/device-specifics with C pre-processor (in general, inside 'omp declare variant')
@ 2023-08-21 16:23 Tobias Burnus
  2023-08-22  7:25 ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Tobias Burnus @ 2023-08-21 16:23 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Thomas Schwinge, Joseph Myers

RFC: any ideas on how best to handle this in GCC? See the two examples
below for what we would like to support.

* * *

In GCC, we handle OpenMP (and OpenACC) by parsing the input file once,
producing an internal representation (in LTO format) for the offloading
code, and only at link time processing it by passing it via the LTO wrapper
to the offloading-device compilers (mkoffload / device lto1).
See https://gcc.gnu.org/wiki/Offloading

This works okayish, even though it causes some issues, for example with
metadirectives (they are implemented on the OG13 branch, however),
and with declare variant or a nohost version, where getting rid of
the host version is not that easy because it has to stay around until
omp-offload.cc's functions are run, which happens rather late.

There are already some issues, for example with -ffast-math
and GLIBC's finite math functions, which are not available
on the device side when using newlib's libm.
(However, GLIBC has removed those.)

Likewise, it would be nice to do what Clang+LLVM does: auto-enable
some device-specific math functions. (Albeit that won't work well
with Fortran.)


However, with OpenMP 5.1, there is a real issue. Appendix B of 5.1
reads:
"For C/C++, the declare variant directive was extended to support elision
of preprocessed code and to allow enclosed function definitions to be
interpreted as variant functions (see Section 7.5)."

The problem is the "elision of preprocessed code", as it permits code like
the following:

#ifdef _OPENMP
#pragma omp begin declare variant match(device={arch=NVPTX})
#include "cuda/math.h"
#pragma omp begin declare variant match(device={isa=sm70})
#include "cuda/sm70/math.h"
#pragma omp end declare variant
#pragma omp end declare variant
#pragma omp begin declare variant match(arch=AMD)
#include "amdgpu/math.h"
#pragma omp end declare variant
#endif

And such code needs to keep working if there is a '#define ABC ...' in
one file and an '#ifndef ABC / #define ABC ...' in the other file.
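
To illustrate the macro interaction, here is a minimal sketch of what a
single preprocessing pass over both regions would see. The header contents
are assumptions made up for this illustration (the real "cuda/math.h" and
"amdgpu/math.h" are not shown in this mail), and the pragmas are left out so
that the snippet compiles with any C compiler:

```c
/* Stand-in for the body of "cuda/math.h" (hypothetical content), which
   would sit inside the match(device={arch=NVPTX}) region.  */
#define ABC 1

/* Stand-in for the body of "amdgpu/math.h" (hypothetical content), which
   would sit inside the AMD region.  In a single pass, ABC is still
   defined from the NVPTX region, so the guarded branch is skipped.  */
#ifndef ABC
#define ABC 2
#define AMD_HEADER_DEFINED_ABC 1
#else
#define AMD_HEADER_DEFINED_ABC 0
#endif

/* Accessors so that the outcome can be inspected at run time.  */
int abc_value (void) { return ABC; }
int amd_header_defined_abc (void) { return AMD_HEADER_DEFINED_ABC; }
```

With one pass, the AMD header never gets to define its own ABC, which is
why the elided regions presumably have to be preprocessed independently of
each other.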

Additionally, it would be neat if this also handled target-specific defines
like '#if __PTX_SM__ == 350' for the relevant parts (here: arch=nvptx).
(We already do support context selectors via the gcc/config/*/t-omp-device files;
see also https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Context-Selectors.html )

Thoughts?

* * *

The question is also what to support: "just" function declarations that are
specific to a device, or some generic replacement of the kind:

#pragma omp begin declare variant match(device={arch=NVPTX})
   #define NUM_THREADS 128
#pragma omp end declare variant
#pragma omp begin declare variant match(device={arch=AMDGCN})
   #define NUM_THREADS 64
#pragma omp end declare variant

#ifndef NUM_THREADS
   #define NUM_THREADS 16
#endif

...
printf ("Running with %d threads\n", NUM_THREADS);
#pragma omp parallel for num_threads(NUM_THREADS)
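
For comparison, a minimal sketch of what each per-target preprocessing pass
would effectively have to see. SKETCH_TARGET_NVPTX and SKETCH_TARGET_AMDGCN
are hypothetical macros invented for this illustration; GCC does not define
them. They stand in for "the context selector of this region matches the
target currently being processed":

```c
/* Hypothetical per-pass macros: at most one of them would be predefined
   for a given device pass, and none of them for the host pass.  */
#if defined (SKETCH_TARGET_NVPTX)
#define NUM_THREADS 128
#elif defined (SKETCH_TARGET_AMDGCN)
#define NUM_THREADS 64
#endif

/* Fallback used when no variant region matched, e.g. on the host.  */
#ifndef NUM_THREADS
#define NUM_THREADS 16
#endif

/* Accessor so that the value chosen by the preprocessor can be checked.  */
int sketch_num_threads (void) { return NUM_THREADS; }
```

Compiled without either macro, sketch_num_threads() yields the host default
of 16; building with -DSKETCH_TARGET_NVPTX or -DSKETCH_TARGET_AMDGCN gives
128 or 64. The open question is how to get the same effect from the
'declare variant' regions themselves.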

* * *

If we only handle 'begin/end declare variant', the following
works in principle:
- Parse the file once, with host-code parsing only, but
  keep track of the delimited 'omp begin declare variant' regions
  whose context selector matches one of the supported
  offload targets.
- Parse the file n more times, this time with the target-specific
  #defines set (using an extended version of gcc/config/*/t-omp-device
  to make them available?).
- When doing so, ignore all non-offloading bits (issue: implicit
  'declare target' + having the data available for variant resolution).
- Store the result in some way.
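
The bookkeeping part of the first step could be sketched as follows. This is
only an illustration under stated assumptions: struct variant_region,
same_arch and count_reparse_passes are hypothetical names, not existing GCC
code, and matching arch names case-insensitively is an assumption made only
so that "NVPTX" from the example above matches a target named "nvptx":

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical record of one delimited 'begin/end declare variant' region
   found during the host-only first pass.  */
struct variant_region
{
  const char *arch;   /* arch from the device selector, e.g. "NVPTX" */
  int begin_line;     /* line of 'begin declare variant' */
  int end_line;       /* line of the matching 'end declare variant' */
};

/* Case-insensitive string equality (an assumption of this sketch).  */
static int
same_arch (const char *a, const char *b)
{
  while (*a && *b)
    {
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return 0;
      a++;
      b++;
    }
  return *a == *b;
}

/* Return how many of the configured offload targets are named by at least
   one recorded region, i.e. how many additional parses would be needed.  */
size_t
count_reparse_passes (const struct variant_region *regions, size_t n_regions,
                      const char *const *targets, size_t n_targets)
{
  size_t passes = 0;
  for (size_t t = 0; t < n_targets; t++)
    for (size_t r = 0; r < n_regions; r++)
      if (same_arch (regions[r].arch, targets[t]))
        {
          passes++;
          break;   /* one pass per target, however many regions match */
        }
  return passes;
}
```

For a file with two NVPTX regions and offload targets "nvptx" and "amdgcn",
this yields one extra pass. It says nothing about the hard parts listed
above (implicit 'declare target', variant resolution, storing the result).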

But it is not really clear to me how to do this in actual code.

Any suggestion?

Tobias

PS: I would like to have some input before the Cauldron, but we might want
to additionally discuss this in detail during the Cauldron, possibly with
some brainstorming before the BoF and then surely also in the BoF.

-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company; Managing directors: Thomas Heurung, Frank Thürauf; Registered office: München; Commercial register: München, HRB 106955

