public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Tobias Burnus <tobias@codesourcery.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [Patch] OpenMP: Move omp requires checks to libgomp
Date: Thu, 9 Jun 2022 14:46:34 +0200	[thread overview]
Message-ID: <7f9c91c1-a479-f94f-ac14-1d6827ce671b@codesourcery.com> (raw)
In-Reply-To: <YqHcI46gvVbB+E/G@tucnak>

On 09.06.22 13:40, Jakub Jelinek via Gcc-patches wrote:
> On Wed, Jun 08, 2022 at 05:56:02AM +0200, Tobias Burnus wrote:
>> +         && lookup_attribute ("omp declare target",
>> +                              DECL_ATTRIBUTES (current_function_decl)))
>> +       omp_requires_mask
>> +         = (enum omp_requires) (omp_requires_mask | OMP_REQUIRES_TARGET_USED);
> I must admit it is unclear what the
> "must appear lexically before any device constructs or device routines."
> restriction actually means for device routines.
> Is that lexically before definition of such device routines, or even their
> declarations?
I have similar issues – also for Fortran (and C++) module use. Hence, I
had filled https://github.com/OpenMP/spec/issues/3240 (not publicly
accessible); I added your issues to the list.
> The above patch snippet is I believe for function definitions that were
> arked as declare target before the definition somehow (another decl for
> it merged with the new one or in between the begin/end).  And is true
> even for device_type (host), to rule that out you'd need to check for
> "omp declare target host" attribute not being present.
> I'm not against the above snippet perhaps adjusted for device_type(host),
> but IMHO we want clarifications from omp-lang
How to proceed for now? And does 'omp_is_initial_device()' on the host a
device function or not? It can be hard-coded to 'true' ...
> [...]
> target update is also a device construct and the above snippet hasn't been
> added for it, ditto for interop which we don't implement yet.
> But, my preference would be instead of adding these snippets to
> c_parser_omp_target_{data,enter_data,exit_data,update} etc. move it from
> c_parser_omp_target to c_parser_omp_all_clauses:
>    if (flag_openmp
>        && (mask & (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEVICE)) != 0)
>      omp_requires_mask
>        = (enum omp_requires) (omp_requires_mask | OMP_REQUIRES_TARGET_USED);
> (somewhere at the start of the function), because the definition of device
> constructs is exactly like that:
> "device construct     An OpenMP construct that accepts the device clause."

Makes sense.

[C++ cases]

> Ditto.
> For Fortran, is the above mostly not needed because requires need to be in
> the specification part and device constructs are executable and appear in
> the part after it?  Do we allow requires in BLOCK's specification part?
We don't allow it in BLOCK – but there are issues related to USE-ing
modules, cf. OpenMP issue.
>> --- a/gcc/gimplify.cc
>> +++ b/gcc/gimplify.cc
>> @@ -3644,6 +3644,9 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, bool want_value)
>> +  if (fndecl && flag_openmp && omp_runtime_api_call (fndecl, true))
>> +    omp_requires_mask
>> +      = (enum omp_requires) (omp_requires_mask | OMP_REQUIRES_TARGET_USED);
> I'm sure device APIs were discussed, but I must be blind and I can't find it
> in either 5.0, 5.1 or 5.2.  All I see is device constructs or device routines
> in those places where I'd also look for device related OpenMP runtime
> library APIs.  Though, if some routine calls omp_get_num_devices (),
> certainly the library at that point needs to know
> reverse_offload/unified_shared_memory/etc. requires because that determines
> how many devices it has.  So, what have I missed (aka on which place in the
> standard the above snippet is based on)?

It is based on your review comments from last year ("Something I miss in
the patch is that for the device API calls") plus what requires some
device initialization. But otherwise, I also did not see it.

In terms of parsing, it makes no difference – contrary to
'unified_shared_memory', where the parser could decide not to add
implicit mapping, the compiler part is not affected by API calls.

I cannot really make up my mind whether it should be required in this
case or not. Maybe, it is not needed.

> + const char *requires_section = ".gnu.gomp_requires";
>> +      tree maskvar = build_decl (UNKNOWN_LOCATION, VAR_DECL,
>> +                                 get_identifier (".gomp_requires_mask"),
>> +                                 unsigned_type_node);
>> +      SET_DECL_ALIGN (maskvar, TYPE_ALIGN (unsigned_type_node));
> Don't we want also DECL_USER_ALIGN (maskvar) = 1; so that
> we never try to increase its alignment?
Probably yes.
> Is it an allocated section, or should it be better non-allocated and then
> dealt with by mkoffload?
>
> Shouldn't the vars in that section be const, so that it is a read-only
> section?
>
> Is unsigned_type_node what we want (say wouldn't be just unsigned_char_node
> be enough, currently we just need 3 bits).

Probably -that would be 8 bits, leaving 5 spare. I have not checked what
Andrew et al. do with the pinned-memory support by -f<some-flag>, but
that will likely use only 1 to 3 bits, if any.

> Also, wonder if for HAVE_GAS_SHF_MERGE && flag_merge_constants
> we shouldn't try to make that section mergeable.  If it goes away during
> linking and is replaced by something, then it doesn't matter, but otherwise,
> as we don't record which TU had what flags, all we care about is that
> there were some TUs which used device construct/routines (and device APIs?)
> and used bitmask 7, other TUs that used bitmask 3 and others that used
> bitmask 4.
(maybe – I am not sure about this, either.)
> @@ -442,6 +463,14 @@ omp_finish_file (void)
>       }
>     else
>       {
> +#ifndef ACCEL_COMPILER
> +      if (flag_openmp
> +       && (omp_requires_mask & OMP_REQUIRES_TARGET_USED)
> +       && (omp_requires_mask & (OMP_REQUIRES_UNIFIED_ADDRESS
> +                                | OMP_REQUIRES_UNIFIED_SHARED_MEMORY
> +                                | OMP_REQUIRES_REVERSE_OFFLOAD)))
> +     sorry ("OpenMP device offloading is not supported for this target");
> +#endif
> I don't understand this snippet.  Without named sections on the host,
> I bet we simply don't support offloading at all,
> the record_offload_symbol target hook is only non-trivially defined
> for nvptx and nvptx isn't typical host for OpenMP offloading,
> because we don't remember it anywhere.

I thought that would address your: "This probably needs to sorry if the
target doesn't support named sections. We probably don't support LTO in
that case either though."

> I wonder if we shouldn't rename it when we change the arguments,
> so that if one mixes an older plugin with newer libgomp or vice versa
> this is easily caught.
Ok.
>> +/* Start/end of .gnu.gomp.requires section of program, defined in
> Isn't it .gnu.gomp_requires ?
Yes.
>> +   crtoffloadbegin/end.o.  */
>> +__attribute__((weak))
>> +extern const unsigned int __requires_mask_table[];
>> +__attribute__((weak))
>> +extern const unsigned int __requires_mask_table_end[];
> I must say it is unclear to me how this works.
> It will only find one such array, say in the executable,
> or if the executable doesn't have it, in one of the shared libraries.
>
> I think we want some solution that will work with OpenMP code
> at least in the executable and some shared libraries it is linked against
> (obviously another case is when a library with certain #pragma omp requires
> is dlopened after we've finalized the number of devices, bet the options
> in that case are either warn or fatal error).

There is the general problem that GCC does not work well with having
device routines in a shared library;  it works fine if the device part
is only in the library – but calling from an .exe program's target
region a declare-target function in a library won't work.

Thus, we may need to find a more general solution for this.

> The choices could be e.g. make __requires_mask_table{,_end} .hidden
> and in the crtoffloadbegin.o (or end) unconditionally call some new libgomp
> routine to register the table, but the disadvantage of that is that we could
> have many of those register calls even when there is nothing to register
> (sure, the .ctor in crtoffloadbegin.o (or end) could compare the 2 addresses
> and not call anything if the table is empty but there would be still code
> executed during initialization of the library).
>
> Yet another possibility is linker plugin case.  We already use it for the
> case where we actually have some offloading LTO bytecode, transform it into
> a data section and register with GOMP_offload_register*.

> So, if we could e.g. at the same time also process those requires arrays,
> diagnose at link time if multiple TUs with that set disagree on the mask
> value and in the target data provide that mask to the library, that would
> be much nicer.
> And the masks either could be gathered from .gnu.gomp_requires or it can be
> somehow encoded in the offloading LTO or its associated data.
> What is important though is that it will work even if we actually don't have
> any "omp declare target" functions or variables in the TU or the whole
> executable or library, just the requires mask.  But that can be dealt with
> e.g. by forcing the LTO sections even for that case or so.
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

  reply	other threads:[~2022-06-09 12:46 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-13 15:07 [PATCH, OpenMP 5.0] More implementation of the requires directive Chung-Lin Tang
2021-01-13 15:27 ` Jakub Jelinek
2021-03-25 11:18 ` Thomas Schwinge
2022-03-29 13:42 ` Andrew Stubbs
2022-06-08  3:56 ` [Patch] OpenMP: Move omp requires checks to libgomp Tobias Burnus
2022-06-09 11:40   ` Jakub Jelinek
2022-06-09 12:46     ` Tobias Burnus [this message]
2022-06-09 14:19       ` Jakub Jelinek
2022-06-29 14:33         ` [Patch][v4] " Tobias Burnus
2022-06-29 17:02           ` Jakub Jelinek
2022-06-29 18:10             ` Tobias Burnus
2022-06-29 20:18               ` Jakub Jelinek
2022-07-01 13:06                 ` [Patch][v5] " Tobias Burnus
2022-07-01 14:34                   ` Jakub Jelinek
2022-07-01 16:31                     ` Tobias Burnus
2022-07-01 16:55                       ` Jakub Jelinek
2022-07-01 21:08                         ` Tobias Burnus
2022-07-04  8:31                           ` Jakub Jelinek
2022-07-07 13:26                           ` Fix one issue in OpenMP 'requires' directive diagnostics (was: [Patch][v5] OpenMP: Move omp requires checks to libgomp) Thomas Schwinge
2022-07-07 13:56                             ` Tobias Burnus
2022-07-08  6:59                               ` Thomas Schwinge
2022-07-06 10:42                   ` Restore 'GOMP_offload_unregister_ver' functionality " Thomas Schwinge
2022-07-06 13:59                     ` Tobias Burnus
2022-07-06 21:08                       ` Thomas Schwinge
2022-08-17 11:45                       ` Jakub Jelinek
2023-09-15  9:41                   ` [Patch][v5] OpenMP: Move omp requires checks to libgomp Thomas Schwinge
2022-07-07  8:37           ` Adjust 'libgomp.c-c++-common/requires-3.c' (was: [Patch][v4] OpenMP: Move omp requires checks to libgomp) Thomas Schwinge
2022-07-07  9:02             ` Tobias Burnus
2022-07-07  8:42           ` Enhance 'libgomp.c-c++-common/requires-4.c', 'libgomp.c-c++-common/requires-5.c' testing " Thomas Schwinge
2022-07-07  9:36             ` Tobias Burnus
2022-07-07 10:42               ` Thomas Schwinge
2022-07-06 10:30   ` Define 'OMP_REQUIRES_[...]', 'GOMP_REQUIRES_[...]' in a single place (was: [Patch] " Thomas Schwinge
2022-07-06 13:40     ` Tobias Burnus
2022-07-06 11:04   ` Fix Intel MIC 'mkoffload' for OpenMP 'requires' " Thomas Schwinge
2022-07-06 11:29     ` Tobias Burnus
2022-07-06 12:38       ` Thomas Schwinge
2022-07-06 13:30         ` Tobias Burnus
2022-07-07 10:46           ` Thomas Schwinge
2022-07-06 14:19     ` Tobias Burnus
2024-03-07 12:38   ` nvptx: 'cuDeviceGetCount' failure is fatal " Thomas Schwinge
2024-03-07 14:28     ` nvptx: 'cuDeviceGetCount' failure is fatal Tobias Burnus
2024-03-08 15:58       ` Thomas Schwinge

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7f9c91c1-a479-f94f-ac14-1d6827ce671b@codesourcery.com \
    --to=tobias@codesourcery.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).