From: Jakub Jelinek <jakub@redhat.com>
To: Marcel Vollweiler <marcel@codesourcery.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [Patch] OpenMP, libgomp, gimple: omp_get_max_teams, omp_set_num_teams, and omp_{gs}et_teams_thread_limit on offload devices
Date: Fri, 30 Sep 2022 11:35:41 +0200
Message-ID: <Yza4bfgQT/j7XhmU@tucnak>
In-Reply-To: <3195cfa5-0612-5b52-4c24-9763c9a56864@codesourcery.com>

On Sun, Sep 18, 2022 at 10:24:43AM +0200, Marcel Vollweiler wrote:
> gcc/ChangeLog:
>
> * gimplify.cc (optimize_target_teams): Set initial num_teams_upper
> to "-2" instead of "1" for non-existing num_teams clause in order to
> disambiguate from the case of an existing num_teams clause with value 1.
>
> libgomp/ChangeLog:
>
> * config/gcn/icv-device.c (omp_get_teams_thread_limit): Added to
> allow processing of device-specific values.
> (omp_set_teams_thread_limit): Likewise.
> (ialias): Likewise.
> * config/nvptx/icv-device.c (omp_get_teams_thread_limit): Likewise.
> (omp_set_teams_thread_limit): Likewise.
> (ialias): Likewise.
> * icv-device.c (omp_get_teams_thread_limit): Likewise.
> (ialias): Likewise.
> (omp_set_teams_thread_limit): Likewise.
> * icv.c (omp_set_teams_thread_limit): Removed.
> (omp_get_teams_thread_limit): Likewise.
> (ialias): Likewise.
> * target.c (get_gomp_offload_icvs): Added teams_thread_limit_var
> handling.
> (gomp_load_image_to_device): Added a size check for the ICVs struct
> variable.
> (gomp_copy_back_icvs): New function that is used in GOMP_target_ext to
> copy back the ICV values from device to host.
> (GOMP_target_ext): Update the number of teams and threads in the kernel
> args also considering device-specific values.
> * testsuite/libgomp.c-c++-common/icv-4.c: Bugfix.
Better say what exactly you changed in words.
> * testsuite/libgomp.c-c++-common/icv-5.c: Extended.
> * testsuite/libgomp.c-c++-common/icv-6.c: Extended.
> * testsuite/libgomp.c-c++-common/icv-7.c: Extended.
> * testsuite/libgomp.c-c++-common/icv-9.c: New test.
> * testsuite/libgomp.fortran/icv-5.f90: New test.
> * testsuite/libgomp.fortran/icv-6.f90: New test.
>
> gcc/testsuite/ChangeLog:
>
> * c-c++-common/gomp/target-teams-1.c: Adapt expected values for
> num_teams from "1" to "-2" in cases without num_teams clause.
> * g++.dg/gomp/target-teams-1.C: Likewise.
> * gfortran.dg/gomp/defaultmap-4.f90: Likewise.
> * gfortran.dg/gomp/defaultmap-5.f90: Likewise.
> * gfortran.dg/gomp/defaultmap-6.f90: Likewise.
> --- a/gcc/gimplify.cc
> +++ b/gcc/gimplify.cc
> @@ -14153,7 +14153,7 @@ optimize_target_teams (tree target, gimple_seq *pre_p)
> struct gimplify_omp_ctx *target_ctx = gimplify_omp_ctxp;
>
> if (teams == NULL_TREE)
> - num_teams_upper = integer_one_node;
> + num_teams_upper = build_int_cst (integer_type_node, -2);
> else
> for (c = OMP_TEAMS_CLAUSES (teams); c; c = OMP_CLAUSE_CHAIN (c))
> {
The function comment above optimize_target_teams contains a detailed
description of what the values mean and why, so it definitely should
document what -2 means and when it is used.
I know you have documentation in libgomp for it, but it should be in both
places.
> + intptr_t new_teams = orig_teams, new_threads = orig_threads;
> + /* ORIG_TEAMS == -2: No explicit teams construct specified. Set to 1.
Two spaces after .
> + ORIG_TEAMS == -1: TEAMS construct with NUM_TEAMS clause specified, but the
> + value could not be specified. No Change.
Likewise.
Also, lowercase "change"?
> + ORIG_TEAMS == 0: TEAMS construct without NUM_TEAMS clause.
> + Set device-specific value.
> + ORIG_TEAMS > 0: Value was already set through e.g. NUM_TEAMS clause.
> + No change. */
> + if (orig_teams == -2)
> + new_teams = 1;
> + else if (orig_teams == 0)
> + {
> + struct gomp_offload_icv_list *item = gomp_get_offload_icv_item (device);
> + if (item != NULL)
> + new_teams = item->icvs.nteams;
> + }
> + /* The device-specific teams-thread-limit is only set if (a) an explicit TEAMS
> + region exists, i.e. ORIG_TEAMS > -2, and (b) THREADS was not already set by
> + e.g. a THREAD_LIMIT clause. */
> + if (orig_teams >= -2 && orig_threads == 0)
The comment talks about ORIG_TEAMS > -2, but the condition is >= -2.
So which one is it?
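To make the question concrete, here is a minimal standalone sketch of the
two readings. It is not the libgomp code: struct device_icvs and the helper
names resolve_teams, limit_applies_per_comment and limit_applies_as_coded
are hypothetical stand-ins, and only the -2/-1/0/>0 encoding is taken from
the quoted comment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-device ICV data; not a libgomp type.  */
struct device_icvs
{
  intptr_t nteams;
};

/* Mirrors the quoted teams logic: -2 (no teams construct) becomes 1,
   0 (teams without num_teams) takes the device value if one is known,
   and -1 or any positive value is passed through unchanged.  */
static intptr_t
resolve_teams (intptr_t orig_teams, const struct device_icvs *icvs)
{
  if (orig_teams == -2)
    return 1;
  if (orig_teams == 0 && icvs != NULL)
    return icvs->nteams;
  return orig_teams;
}

/* What the comment describes: only when a teams region exists.  */
static int
limit_applies_per_comment (intptr_t orig_teams, intptr_t orig_threads)
{
  return orig_teams > -2 && orig_threads == 0;
}

/* What the quoted condition tests: also true for orig_teams == -2.  */
static int
limit_applies_as_coded (intptr_t orig_teams, intptr_t orig_threads)
{
  return orig_teams >= -2 && orig_threads == 0;
}
```

Under these assumptions the two predicates differ only for
orig_teams == -2 with orig_threads == 0, i.e. a target region without any
teams construct, so that is the one case the comment and the code need to
agree on.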
> + /* This tests a large number of teams and threads. If it is larger than
> + 2^15+1 then the according argument in the kernels arguments list
> + is encoded with two items instead of one. On NVIDIA there is an
> + adjustment for too large teams and threads. For AMD such adjustment
> + exists only for threads and will cause runtime errors with a two large
s/two/too/ ?
Shouldn't amdgcn also adjust the number of teams?
As for testcases, have you tested this in a native setup where dg-set-target-env-var
actually works?
Jakub