From: Marcel Vollweiler <marcel@codesourcery.com>
To: <gcc-patches@gcc.gnu.org>
Subject: [Patch] OpenMP, libgomp, gimple: omp_get_max_teams, omp_set_num_teams, and omp_{gs}et_teams_thread_limit on offload devices
Date: Thu, 14 Apr 2022 18:06:24 +0200 [thread overview]
Message-ID: <d32cc89e-74d7-06bd-3ef6-15deadd28a78@codesourcery.com> (raw)
[-- Attachment #1: Type: text/plain, Size: 3076 bytes --]
Hi,
This patch adds support for omp_get_max_teams, omp_set_num_teams, and
omp_{gs}et_teams_thread_limit on offload devices.
The patch builds on the following patches which are submitted, but not yet
approved/committed:
- [PATCH] OpenMP, libgomp: Environment variable syntax extension.
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588728.html
- [PATCH] OpenMP, libgomp: Add new runtime routine omp_get_mapped_ptr.
https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591556.html
The OpenMP runtime routines omp_get_max_teams, omp_set_num_teams, and
omp_{gs}et_teams_thread_limit were introduced in OpenMP 5.1 and where already
implemented for the host usage with patch
https://gcc.gnu.org/pipermail/gcc-patches/2021-October/581283.html
The new patch extends the functionality of these OpenMP runtime routines by the
usage also on the device, i.e. device-specific values for nteams-var and
teams-thread-limit-var ICVs can now be retrieved and set also on the device. The
updated number of teams/threads are then used when launching the kernel.
The following main aspects are considered:
(a) Implemented the functions in the according icv-device files.
(b) Added structures to not only store initial device-specific values (they have
to be kept for omp_display_env) but also device-specific ICV values that can be
changed on the device at runtime.
(c) Changed the gimplification:
(c.1) Introduced integer_minus_two_node.
(c.2) For target regions that do not include teams constructs, now the clause
num_teams(-2) is added instead num_teams(1). This was necessary as num_teams(1)
is ambigious: it can also mean that a teams construct with explicit num_teams(1)
clause was specified inside the target region. The disambiguation is needed in
order to choose the correct thread limit: teams-thread-limit-var is only
intended for teams constructs such that if there is no teams construct, then the
number of threads is limited by thread-limit-var.
(d) Extend GOMP_target_ext. The host needs to set the device-specific ICV values
before the kernel is launched. The number of teams and threads are members of
the args list and are modified when no value was specified in an explicit clause
and the computation of the value was not postponed due to mapped variables.
(d.1) The arguments list is copied in order to guarantee immutability.
(e) Added copy back mechanism for ICVs which are modified on the device. The
only way to change device-specific ICVs is to do it on the device. As the
device-specific values are sometimes needed also on the host when the kernel is
launched (particularly number of teams and threads) they have to be copied back.
The patch was tested on x86_64-linux with nvptx and gcn offloading. All with no
regressions.
Marcel
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Attachment #2: teams_and_threads_on_device.diff --]
[-- Type: text/plain, Size: 75768 bytes --]
gcc/ChangeLog:
* gimplify.cc (optimize_target_teams): Changed integer_one_node to
integer_minus_two_node in case of non-existing teams construct in target
region due to disambiguation. Previously, num_teams(1) was used as
clause on the target construct when (a) no teams construct exists in the
target region or (b) a teams construct with explicit num_teams(1)
clause was specified.
* tree-core.h (enum tree_index): Added TI_INTEGER_MINUS_TWO.
* tree.cc (build_common_tree_nodes): Added integer_minus_two_node.
* tree.h (integer_minus_two_node): Likewise.
libgomp/ChangeLog:
* config/gcn/icv-device.c (omp_set_num_teams): Added.
(omp_get_teams_thread_limit): Added.
(omp_set_teams_thread_limit): Added.
(ialias): Added for omp_set_num_teams and omp_{gs}et_teams_thread_limit.
* config/nvptx/icv-device.c (omp_set_num_teams): Likewise.
(omp_get_teams_thread_limit): Likewise.
(omp_set_teams_thread_limit): Likewise.
(ialias): Likewise.
* env.c (struct gomp_default_icv_t): Added to hold default ICV values.
(struct gomp_icv_list): Removed static.
(omp_display_env): Renaming of used lists.
(add_device_specific_icv): Removed static.
(gomp_add_device_specific_icv): Removed static.
(parse_device_specific): Renaming of used lists and added storing of
parsed values in lists of modifiable ICV values.
* icv-device.c (omp_set_num_teams): Added.
(ialias): Added for omp_set_num_teams and omp_{gs}et_teams_thread_limit.
(omp_get_teams_thread_limit): Added.
(omp_set_teams_thread_limit): Added.
* icv.c (omp_set_num_teams): Removed.
(omp_set_teams_thread_limit): Removed.
(omp_get_teams_thread_limit): Removed.
(ialias): Removed for omp_set_num_teams and
omp_{gs}et_teams_thread_limit.
* libgomp-plugin.h (GOMP_TEAMS_THREAD_LIMIT_VAR): Added definition.
* libgomp.h (struct gomp_default_icv_t): Added.
(gomp_add_device_specific_icv): Added external declaration.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Added
GOMP_TEAMS_THREAD_LIMIT_VAR.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise.
* target.c (gomp_get_icv): Added parameter and changed behaviour
considering also default ICV values.
(gomp_get_or_add_icv_int): Added. Adds ICV values to a list of
device-specific ICVs if they not already there.
(gomp_load_image_to_device): Implemented copy back for device-specific
ICVs which are changed on the device but needed on the host. Currently
only nteams-var and teams-thread-limit-var are copied back.
(gomp_copy_back_icv): Added. Copies back ICV values from target to
device using the pointer mapping.
(GOMP_target_ext): Changing TEAMS and THREADS in the arguments list
considering device-specific ICV values. Also invokes the copy back for
ICV values from target to host.
* testsuite/libgomp.c-c++-common/icv-4.c: Fixed typo for getting
environment variable OMP_TEAMS_THREAD_LIMIT.
* testsuite/libgomp.c-c++-common/icv-5.c: Added several test cases now also including teams-thread-limit-var.
* testsuite/libgomp.c-c++-common/icv-6.c: Likewise.
* testsuite/libgomp.c-c++-common/icv-7.c: Likewise.
* testsuite/libgomp.c-c++-common/icv-8.c: New test.
* testsuite/libgomp.fortran/icv-5.f90: New test.
* testsuite/libgomp.fortran/icv-6.f90: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/target-teams-1.c: Changed num_teams(1) expectation
to num_teams(-2) since now non-existing teams constructs are specified
with num_teams(-2) instead of num_teams(1) which was ambigious.
* g++.dg/gomp/target-teams-1.C: Likewise.
* gfortran.dg/gomp/defaultmap-4.f90: Likewise.
* gfortran.dg/gomp/defaultmap-5.f90: Likewise.
* gfortran.dg/gomp/defaultmap-6.f90: Likewise.
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 2588824..91f75ec 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -13994,7 +13994,7 @@ optimize_target_teams (tree target, gimple_seq *pre_p)
struct gimplify_omp_ctx *target_ctx = gimplify_omp_ctxp;
if (teams == NULL_TREE)
- num_teams_upper = integer_one_node;
+ num_teams_upper = integer_minus_two_node;
else
for (c = OMP_TEAMS_CLAUSES (teams); c; c = OMP_CLAUSE_CHAIN (c))
{
diff --git a/gcc/testsuite/c-c++-common/gomp/target-teams-1.c b/gcc/testsuite/c-c++-common/gomp/target-teams-1.c
index 51b8d48..74d60e1 100644
--- a/gcc/testsuite/c-c++-common/gomp/target-teams-1.c
+++ b/gcc/testsuite/c-c++-common/gomp/target-teams-1.c
@@ -81,5 +81,5 @@ foo (int a, int b, long c, long d)
/* { dg-final { scan-tree-dump-times "thread_limit\\(-1\\)" 3 "gimple" } } */
/* { dg-final { scan-tree-dump-times "num_teams\\(0\\)" 4 "gimple" } } */
/* { dg-final { scan-tree-dump-times "thread_limit\\(0\\)" 6 "gimple" } } */
-/* { dg-final { scan-tree-dump-times "num_teams\\(1\\)" 2 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "num_teams\\(-2\\)" 2 "gimple" } } */
/* { dg-final { scan-tree-dump-times "thread_limit\\(1\\)" 0 "gimple" } } */
diff --git a/gcc/testsuite/g++.dg/gomp/target-teams-1.C b/gcc/testsuite/g++.dg/gomp/target-teams-1.C
index f78a608..29e5597 100644
--- a/gcc/testsuite/g++.dg/gomp/target-teams-1.C
+++ b/gcc/testsuite/g++.dg/gomp/target-teams-1.C
@@ -88,5 +88,5 @@ foo (int a, int b, long c, long d)
/* { dg-final { scan-tree-dump-times "thread_limit\\(-1\\)" 3 "gimple" } } */
/* { dg-final { scan-tree-dump-times "num_teams\\(0\\)" 4 "gimple" } } */
/* { dg-final { scan-tree-dump-times "thread_limit\\(0\\)" 6 "gimple" } } */
-/* { dg-final { scan-tree-dump-times "num_teams\\(1\\)" 2 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "num_teams\\(-2\\)" 2 "gimple" } } */
/* { dg-final { scan-tree-dump-times "thread_limit\\(1\\)" 0 "gimple" } } */
diff --git a/gcc/testsuite/gfortran.dg/gomp/defaultmap-4.f90 b/gcc/testsuite/gfortran.dg/gomp/defaultmap-4.f90
index 7b182b5..9081159 100644
--- a/gcc/testsuite/gfortran.dg/gomp/defaultmap-4.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/defaultmap-4.f90
@@ -141,5 +141,5 @@ end
! { dg-final { scan-tree-dump-times "map\\(to:\\.strxparr \\\[len:" 2 "gimple" } }
! { dg-final { scan-tree-dump-times "map\\(to:strxparr \\\[pointer set, len:" 2 "gimple" } }
! { dg-final { scan-tree-dump-times "map\\(to:\\.strxp \\\[len:" 2 "gimple" } }
-! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(1\\) thread_limit\\(0\\) defaultmap\\(alloc\\)" 1 "gimple" } }
-! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(1\\) thread_limit\\(0\\) defaultmap\\(alloc:scalar\\) defaultmap\\(to:aggregate\\) defaultmap\\(tofrom:allocatable\\) defaultmap\\(firstprivate:pointer\\)" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(-2\\) thread_limit\\(0\\) defaultmap\\(alloc\\)" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(-2\\) thread_limit\\(0\\) defaultmap\\(alloc:scalar\\) defaultmap\\(to:aggregate\\) defaultmap\\(tofrom:allocatable\\) defaultmap\\(firstprivate:pointer\\)" 1 "gimple" } }
diff --git a/gcc/testsuite/gfortran.dg/gomp/defaultmap-5.f90 b/gcc/testsuite/gfortran.dg/gomp/defaultmap-5.f90
index 1391274..91566ed 100644
--- a/gcc/testsuite/gfortran.dg/gomp/defaultmap-5.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/defaultmap-5.f90
@@ -141,5 +141,5 @@ end
! { dg-final { scan-tree-dump-times "map\\(to:strxparr \\\[pointer set, len:" 2 "gimple" } }
! { dg-final { scan-tree-dump-times "map\\(to:\\*strxp \\\[len:" 1 "gimple" } }
! { dg-final { scan-tree-dump-times "map\\(to:\\.strxp \\\[len:" 1 "gimple" } }
-! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(1\\) thread_limit\\(0\\) defaultmap\\(to\\)" 1 "gimple" } }
-! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(1\\) thread_limit\\(0\\) defaultmap\\(to:scalar\\) defaultmap\\(tofrom:aggregate\\) defaultmap\\(firstprivate:allocatable\\) defaultmap\\(default:pointer\\)" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(-2\\) thread_limit\\(0\\) defaultmap\\(to\\)" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(-2\\) thread_limit\\(0\\) defaultmap\\(to:scalar\\) defaultmap\\(tofrom:aggregate\\) defaultmap\\(firstprivate:allocatable\\) defaultmap\\(default:pointer\\)" 1 "gimple" } }
diff --git a/gcc/testsuite/gfortran.dg/gomp/defaultmap-6.f90 b/gcc/testsuite/gfortran.dg/gomp/defaultmap-6.f90
index 9a81d0f..867e41a 100644
--- a/gcc/testsuite/gfortran.dg/gomp/defaultmap-6.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/defaultmap-6.f90
@@ -101,4 +101,4 @@ end
! { dg-final { scan-tree-dump-times "map\\(to:\\.strxparr \\\[len:" 1 "gimple" } }
! { dg-final { scan-tree-dump-times "map\\(to:strxparr \\\[pointer set, len:" 1 "gimple" } }
! { dg-final { scan-tree-dump-times "map\\(to:\\.strxp \\\[len:" 1 "gimple" } }
-! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(1\\) thread_limit\\(0\\) defaultmap\\(default\\)" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(-2\\) thread_limit\\(0\\) defaultmap\\(default\\)" 1 "gimple" } }
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index f1c2b64..7aec940 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -642,6 +642,7 @@ enum tree_index {
TI_INTEGER_ONE,
TI_INTEGER_THREE,
TI_INTEGER_MINUS_ONE,
+ TI_INTEGER_MINUS_TWO,
TI_NULL_POINTER,
TI_SIZE_ZERO,
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 8f83ea1..8cb474d 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -9345,6 +9345,7 @@ build_common_tree_nodes (bool signed_char)
integer_one_node = build_int_cst (integer_type_node, 1);
integer_three_node = build_int_cst (integer_type_node, 3);
integer_minus_one_node = build_int_cst (integer_type_node, -1);
+ integer_minus_two_node = build_int_cst (integer_type_node, -2);
size_zero_node = size_int (0);
size_one_node = size_int (1);
diff --git a/gcc/tree.h b/gcc/tree.h
index cea49a5..1aeb009 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -4206,6 +4206,7 @@ tree_strip_any_location_wrapper (tree exp)
#define integer_one_node global_trees[TI_INTEGER_ONE]
#define integer_three_node global_trees[TI_INTEGER_THREE]
#define integer_minus_one_node global_trees[TI_INTEGER_MINUS_ONE]
+#define integer_minus_two_node global_trees[TI_INTEGER_MINUS_TWO]
#define size_zero_node global_trees[TI_SIZE_ZERO]
#define size_one_node global_trees[TI_SIZE_ONE]
#define bitsize_zero_node global_trees[TI_BITSIZE_ZERO]
diff --git a/libgomp/config/gcn/icv-device.c b/libgomp/config/gcn/icv-device.c
index 26b2e7a..a99aa0d 100644
--- a/libgomp/config/gcn/icv-device.c
+++ b/libgomp/config/gcn/icv-device.c
@@ -37,6 +37,7 @@ volatile int GOMP_DEFAULT_DEVICE_VAR;
volatile int GOMP_MAX_ACTIVE_LEVELS_VAR;
volatile omp_proc_bind_t GOMP_BIND_VAR;
volatile int GOMP_NTEAMS_VAR;
+volatile int GOMP_TEAMS_THREAD_LIMIT_VAR;
void
omp_set_default_device (int device_num __attribute__((unused)))
@@ -80,6 +81,26 @@ omp_get_max_teams (void)
return GOMP_NTEAMS_VAR;
}
+void
+omp_set_num_teams (int num_teams)
+{
+ if (num_teams >= 0)
+ GOMP_NTEAMS_VAR = num_teams;
+}
+
+int
+omp_get_teams_thread_limit (void)
+{
+ return GOMP_TEAMS_THREAD_LIMIT_VAR;
+}
+
+void
+omp_set_teams_thread_limit (int thread_limit)
+{
+ if (thread_limit >= 0)
+ GOMP_TEAMS_THREAD_LIMIT_VAR = thread_limit;
+}
+
ialias (omp_set_default_device)
ialias (omp_get_default_device)
ialias (omp_get_initial_device)
@@ -87,3 +108,6 @@ ialias (omp_get_num_devices)
ialias (omp_is_initial_device)
ialias (omp_get_device_num)
ialias (omp_get_max_teams)
+ialias (omp_set_num_teams)
+ialias (omp_get_teams_thread_limit)
+ialias (omp_set_teams_thread_limit)
diff --git a/libgomp/config/nvptx/icv-device.c b/libgomp/config/nvptx/icv-device.c
index f744e75..4b1106b 100644
--- a/libgomp/config/nvptx/icv-device.c
+++ b/libgomp/config/nvptx/icv-device.c
@@ -37,6 +37,7 @@ static volatile int GOMP_DEFAULT_DEVICE_VAR;
static volatile int GOMP_MAX_ACTIVE_LEVELS_VAR;
static volatile omp_proc_bind_t GOMP_BIND_VAR;
static volatile int GOMP_NTEAMS_VAR;
+static volatile int GOMP_TEAMS_THREAD_LIMIT_VAR;
void
omp_set_default_device (int device_num __attribute__((unused)))
@@ -80,6 +81,26 @@ omp_get_max_teams (void)
return GOMP_NTEAMS_VAR;
}
+void
+omp_set_num_teams (int num_teams)
+{
+ if (num_teams >= 0)
+ GOMP_NTEAMS_VAR = num_teams;
+}
+
+int
+omp_get_teams_thread_limit (void)
+{
+ return GOMP_TEAMS_THREAD_LIMIT_VAR;
+}
+
+void
+omp_set_teams_thread_limit (int thread_limit)
+{
+ if (thread_limit >= 0)
+ GOMP_TEAMS_THREAD_LIMIT_VAR = thread_limit;
+}
+
ialias (omp_set_default_device)
ialias (omp_get_default_device)
ialias (omp_get_initial_device)
@@ -87,3 +108,6 @@ ialias (omp_get_num_devices)
ialias (omp_is_initial_device)
ialias (omp_get_device_num)
ialias (omp_get_max_teams)
+ialias (omp_set_num_teams)
+ialias (omp_get_teams_thread_limit)
+ialias (omp_set_teams_thread_limit)
diff --git a/libgomp/env.c b/libgomp/env.c
index 74803ec..f3a86fa 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -84,6 +84,16 @@ struct gomp_initial_icv_t gomp_initial_icv_all;
by using environment variables like OMP_NUM_TEAMS_DEV. */
struct gomp_initial_icv_t gomp_initial_icv_dev;
+/* Default values of ICVs according to the OpenMP standard. */
+struct gomp_default_icv_t gomp_default_icv_values = {
+ .run_sched_var = GFS_DYNAMIC,
+ .run_sched_chunk_size = 1,
+ .max_active_levels_var = 1,
+ .bind_var = omp_proc_bind_false,
+ .nteams_var = 0,
+ .teams_thread_limit_var = 0
+};
+
/* Returns the element of the list for the specified device number. */
struct gomp_icv_list*
gomp_get_icv_list (struct gomp_icv_list **list, int device_num)
@@ -109,21 +119,25 @@ gomp_get_icv_value_ptr (struct gomp_icv_list **list, int device_num)
/* Lists for initial device-specific ICVs, i.e. ICVs that are configured for
particular devices (with environment variables like OMP_NUM_TEAMS_DEV_42). */
-struct gomp_icv_list *gomp_dyn_var_dev_list = NULL;
-struct gomp_icv_list *gomp_nthreads_var_dev_list = NULL;
-struct gomp_icv_list *gomp_nthreads_var_list_dev_list = NULL;
-struct gomp_icv_list *gomp_nthreads_var_list_len_dev_list = NULL;
-struct gomp_icv_list *gomp_run_sched_var_dev_list = NULL;
-struct gomp_icv_list *gomp_run_sched_chunk_size_dev_list = NULL;
-struct gomp_icv_list *gomp_nteams_var_dev_list = NULL;
-struct gomp_icv_list *gomp_thread_limit_var_dev_list = NULL;
-struct gomp_icv_list *gomp_max_active_levels_var_dev_list = NULL;
-struct gomp_icv_list *gomp_proc_bind_var_dev_list = NULL;
-struct gomp_icv_list *gomp_proc_bind_var_list_dev_list = NULL;
-struct gomp_icv_list *gomp_proc_bind_var_list_len_dev_list = NULL;
+struct gomp_icv_list *gomp_initial_dyn_var_dev_list = NULL;
+struct gomp_icv_list *gomp_initial_nthreads_var_dev_list = NULL;
+struct gomp_icv_list *gomp_initial_nthreads_var_list_dev_list = NULL;
+struct gomp_icv_list *gomp_initial_nthreads_var_list_len_dev_list = NULL;
+struct gomp_icv_list *gomp_initial_run_sched_var_dev_list = NULL;
+struct gomp_icv_list *gomp_initial_run_sched_chunk_size_dev_list = NULL;
+struct gomp_icv_list *gomp_initial_nteams_var_dev_list = NULL;
+struct gomp_icv_list *gomp_initial_thread_limit_var_dev_list = NULL;
+struct gomp_icv_list *gomp_initial_max_active_levels_var_dev_list = NULL;
+struct gomp_icv_list *gomp_initial_proc_bind_var_dev_list = NULL;
+struct gomp_icv_list *gomp_initial_proc_bind_var_list_dev_list = NULL;
+struct gomp_icv_list *gomp_initial_proc_bind_var_list_len_dev_list = NULL;
struct gomp_icv_list *stacksize_dev_list = NULL;
struct gomp_icv_list *wait_policy_dev_list = NULL;
-struct gomp_icv_list *teams_thread_limit_var_dev_list = NULL;
+struct gomp_icv_list *gomp_initial_teams_thread_limit_var_dev_list = NULL;
+
+/* Lists for device-specific ICVs which can be changed, e.g. on the device. */
+struct gomp_icv_list *gomp_nteams_var_dev_list = NULL;
+struct gomp_icv_list *gomp_teams_thread_limit_var_dev_list = NULL;
/* Flags for non-global ICVs to store by which environment variables they are
affected. */
@@ -1425,7 +1439,7 @@ omp_display_env (int verbose)
if (gomp_initial_icv_flags.dyn_var & GOMP_ENV_VAR_SUFFIX_DEV)
fprintf (stderr, " [device] OMP_DYNAMIC = '%s'\n",
gomp_initial_icv_dev.dyn_var ? "TRUE" : "FALSE");
- struct gomp_icv_list* l_dyn_var = gomp_dyn_var_dev_list;
+ struct gomp_icv_list* l_dyn_var = gomp_initial_dyn_var_dev_list;
while (l_dyn_var != NULL)
{
fprintf (stderr, " [%d] OMP_DYNAMIC = '%s'\n", l_dyn_var->device_num,
@@ -1458,19 +1472,19 @@ omp_display_env (int verbose)
fprintf (stderr, ",%lu", gomp_initial_icv_dev.nthreads_var_list[i]);
fputs ("'\n", stderr);
}
- struct gomp_icv_list* l_nthreads_var = gomp_nthreads_var_dev_list;
+ struct gomp_icv_list* l_nthreads_var = gomp_initial_nthreads_var_dev_list;
while (l_nthreads_var != NULL)
{
fprintf (stderr, " [%d] OMP_NUM_THREADS = '%lu",
l_nthreads_var->device_num,
*(unsigned long*)l_nthreads_var->value);
struct gomp_icv_list *len
- = gomp_get_icv_list (&gomp_nthreads_var_list_len_dev_list,
+ = gomp_get_icv_list (&gomp_initial_nthreads_var_list_len_dev_list,
l_nthreads_var->device_num);
if (len != NULL)
{
struct gomp_icv_list *list
- = gomp_get_icv_list (&gomp_nthreads_var_list_dev_list,
+ = gomp_get_icv_list (&gomp_initial_nthreads_var_list_dev_list,
l_nthreads_var->device_num);
for (i = 1; i < *(unsigned long*)len->value; i++)
fprintf (stderr, ",%lu", (*(unsigned long**)list->value)[i]);
@@ -1487,11 +1501,11 @@ omp_display_env (int verbose)
if (gomp_initial_icv_flags.run_sched_var & GOMP_ENV_VAR_SUFFIX_DEV)
print_schedule (gomp_initial_icv_dev.run_sched_var,
gomp_initial_icv_dev.run_sched_chunk_size, "device");
- struct gomp_icv_list* l_run_sched_var = gomp_run_sched_var_dev_list;
+ struct gomp_icv_list* l_run_sched_var = gomp_initial_run_sched_var_dev_list;
while (l_run_sched_var != NULL)
{
struct gomp_icv_list* l_run_sched_chunk_size
- = gomp_get_icv_list (&gomp_run_sched_chunk_size_dev_list,
+ = gomp_get_icv_list (&gomp_initial_run_sched_chunk_size_dev_list,
l_run_sched_var->device_num);
int chunk_size = *(int*)l_run_sched_chunk_size->value;
char dev_num[10];
@@ -1513,14 +1527,14 @@ omp_display_env (int verbose)
print_proc_bind (gomp_initial_icv_dev.bind_var,
gomp_initial_icv_dev.bind_var_list_len,
&gomp_initial_icv_dev.bind_var_list, "device");
- struct gomp_icv_list* l_proc_bind_var = gomp_proc_bind_var_dev_list;
+ struct gomp_icv_list* l_proc_bind_var = gomp_initial_proc_bind_var_dev_list;
while (l_proc_bind_var != NULL)
{
struct gomp_icv_list *list
- = gomp_get_icv_list (&gomp_proc_bind_var_list_dev_list,
+ = gomp_get_icv_list (&gomp_initial_proc_bind_var_list_dev_list,
l_proc_bind_var->device_num);
struct gomp_icv_list *len
- = gomp_get_icv_list (&gomp_proc_bind_var_list_len_dev_list,
+ = gomp_get_icv_list (&gomp_initial_proc_bind_var_list_len_dev_list,
l_proc_bind_var->device_num);
char dev_num[10];
sprintf (dev_num, "%d", l_proc_bind_var->device_num);
@@ -1581,7 +1595,7 @@ omp_display_env (int verbose)
if (gomp_initial_icv_flags.thread_limit_var & GOMP_ENV_VAR_SUFFIX_DEV)
fprintf (stderr, " [device] OMP_THREAD_LIMIT = '%d'\n",
gomp_initial_icv_dev.thread_limit_var);
- struct gomp_icv_list* l_thread_limit = gomp_thread_limit_var_dev_list;
+ struct gomp_icv_list* l_thread_limit = gomp_initial_thread_limit_var_dev_list;
while (l_thread_limit != NULL)
{
fprintf (stderr, " [%d] OMP_THREAD_LIMIT = '%d'\n",
@@ -1598,7 +1612,7 @@ omp_display_env (int verbose)
fprintf (stderr, " [device] OMP_MAX_ACTIVE_LEVELS = '%u'\n",
gomp_initial_icv_dev.max_active_levels_var);
struct gomp_icv_list* l_max_active_levels
- = gomp_max_active_levels_var_dev_list;
+ = gomp_initial_max_active_levels_var_dev_list;
while (l_max_active_levels != NULL)
{
fprintf (stderr, " [%d] OMP_MAX_ACTIVE_LEVELS = '%u'\n",
@@ -1615,7 +1629,7 @@ omp_display_env (int verbose)
if (gomp_initial_icv_flags.nteams_var & GOMP_ENV_VAR_SUFFIX_DEV)
fprintf (stderr, " [device] OMP_NUM_TEAMS = '%d'\n",
gomp_initial_icv_dev.nteams_var);
- struct gomp_icv_list* l_nteams_var = gomp_nteams_var_dev_list;
+ struct gomp_icv_list* l_nteams_var = gomp_initial_nteams_var_dev_list;
while (l_nteams_var != NULL)
{
fprintf (stderr, " [%d] OMP_NUM_TEAMS = '%d'\n",
@@ -1631,7 +1645,8 @@ omp_display_env (int verbose)
if (gomp_initial_icv_flags.teams_thread_limit_var & GOMP_ENV_VAR_SUFFIX_DEV)
fprintf (stderr, " [device] OMP_TEAMS_THREAD_LIMIT = '%u'\n",
gomp_initial_icv_dev.teams_thread_limit_var);
- struct gomp_icv_list* l_teams_thr_limit = teams_thread_limit_var_dev_list;
+ struct gomp_icv_list* l_teams_thr_limit =
+ gomp_initial_teams_thread_limit_var_dev_list;
while (l_teams_thr_limit != NULL)
{
fprintf (stderr, " [%d] OMP_TEAMS_THREAD_LIMIT = '%u'\n",
@@ -1764,8 +1779,9 @@ get_device_num (char **env, int prefix_len, int *dev_num, int *name_len)
/* Helper function for parse_device_specific. Adds a new node to the given
list. */
-static struct gomp_icv_list*
-add_device_specific_icv (int dev_num, size_t size, struct gomp_icv_list **list)
+struct gomp_icv_list *
+gomp_add_device_specific_icv (int dev_num, size_t size,
+ struct gomp_icv_list **list)
{
if (list == NULL)
return NULL;
@@ -1801,12 +1817,12 @@ parse_device_specific ()
enum gomp_schedule_type schedule_type;
int chunk_size;
parse_schedule (name, &schedule_type, &chunk_size);
- new_node = add_device_specific_icv (dev_num,
+ new_node = gomp_add_device_specific_icv (dev_num,
sizeof (enum gomp_schedule_type),
- &gomp_run_sched_var_dev_list);
+ &gomp_initial_run_sched_var_dev_list);
*((enum gomp_schedule_type*)(new_node->value)) = schedule_type;
- new_node = add_device_specific_icv (dev_num, sizeof (int),
- &gomp_run_sched_chunk_size_dev_list);
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (int),
+ &gomp_initial_run_sched_chunk_size_dev_list);
*((int*)(new_node->value)) = chunk_size;
goto next;
}
@@ -1819,8 +1835,8 @@ parse_device_specific ()
bool value;
if (!parse_boolean (name, &value))
continue;
- new_node = add_device_specific_icv (dev_num, sizeof (bool),
- &gomp_dyn_var_dev_list);
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (bool),
+ &gomp_initial_dyn_var_dev_list);
*((bool*)(new_node->value)) = value;
goto next;
}
@@ -1834,8 +1850,8 @@ parse_device_specific ()
if (!parse_unsigned_long (name, &value, false))
continue;
value = value > INT_MAX ? UINT_MAX : value;
- new_node = add_device_specific_icv (dev_num, sizeof (unsigned long),
- &gomp_thread_limit_var_dev_list);
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long),
+ &gomp_initial_thread_limit_var_dev_list);
*((unsigned long*)(new_node->value)) = value;
goto next;
}
@@ -1850,17 +1866,17 @@ parse_device_specific ()
unsigned long nvalues = 0;
if (!parse_unsigned_long_list (name, &value, &pvalues, &nvalues))
continue;
- new_node = add_device_specific_icv (dev_num, sizeof (unsigned long),
- &gomp_nthreads_var_dev_list);
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long),
+ &gomp_initial_nthreads_var_dev_list);
*((unsigned long*)(new_node->value)) = value;
if (nvalues > 0)
{
- new_node = add_device_specific_icv (dev_num, sizeof (unsigned long*),
- &gomp_nthreads_var_list_dev_list);
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long*),
+ &gomp_initial_nthreads_var_list_dev_list);
*((unsigned long**)(new_node->value)) = pvalues;
- new_node = add_device_specific_icv (dev_num, sizeof (unsigned long*),
- &gomp_nthreads_var_list_len_dev_list);
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long*),
+ &gomp_initial_nthreads_var_list_len_dev_list);
*((unsigned long*)(new_node->value)) = nvalues;
}
goto next;
@@ -1874,7 +1890,10 @@ parse_device_specific ()
int value;
if (!parse_int (name, &value, false))
continue;
- new_node = add_device_specific_icv (dev_num, sizeof (int),
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (int),
+ &gomp_initial_nteams_var_dev_list);
+ *((int*)(new_node->value)) = value;
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (int),
&gomp_nteams_var_dev_list);
*((int*)(new_node->value)) = value;
goto next;
@@ -1904,18 +1923,18 @@ parse_device_specific ()
if (value == omp_proc_bind_false)
value = true;
- new_node = add_device_specific_icv (dev_num, sizeof (char),
- &gomp_proc_bind_var_dev_list);
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (char),
+ &gomp_initial_proc_bind_var_dev_list);
*((char*)(new_node->value)) = value;
if (nvalues > 0)
{
- new_node = add_device_specific_icv (dev_num, sizeof (char*),
- &gomp_proc_bind_var_list_dev_list);
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (char*),
+ &gomp_initial_proc_bind_var_list_dev_list);
*((char**)(new_node->value)) = pvalues;
- new_node = add_device_specific_icv (dev_num, sizeof (unsigned long),
- &gomp_proc_bind_var_list_len_dev_list);
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long),
+ &gomp_initial_proc_bind_var_list_len_dev_list);
*((unsigned long*)(new_node->value)) = nvalues;
}
goto next;
@@ -1931,8 +1950,8 @@ parse_device_specific ()
continue;
value = (value > gomp_supported_active_levels)
? gomp_supported_active_levels : value;
- new_node = add_device_specific_icv (dev_num, sizeof (unsigned long),
- &gomp_max_active_levels_var_dev_list);
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long),
+ &gomp_initial_max_active_levels_var_dev_list);
*((unsigned long*)(new_node->value)) = value;
goto next;
}
@@ -1945,7 +1964,7 @@ parse_device_specific ()
unsigned long value;
if (!parse_stacksize (name, &value))
continue;
- new_node = add_device_specific_icv (dev_num, sizeof (unsigned long),
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long),
&stacksize_dev_list);
*((unsigned long*)(new_node->value)) = value;
goto next;
@@ -1959,7 +1978,7 @@ parse_device_specific ()
int value;
if (!parse_wait_policy (name, &value))
continue;
- new_node = add_device_specific_icv (dev_num, sizeof (int),
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (int),
&wait_policy_dev_list);
*((int*)(new_node->value)) = value;
goto next;
@@ -1973,8 +1992,12 @@ parse_device_specific ()
int value;
if (!parse_int (name, &value, false))
continue;
- new_node = add_device_specific_icv (dev_num, sizeof (int),
- &teams_thread_limit_var_dev_list);
+ new_node = gomp_add_device_specific_icv (
+ dev_num, sizeof (int),
+ &gomp_initial_teams_thread_limit_var_dev_list);
+ *((int*)(new_node->value)) = value;
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (int),
+ &gomp_teams_thread_limit_var_dev_list);
*((int*)(new_node->value)) = value;
goto next;
}
diff --git a/libgomp/icv-device.c b/libgomp/icv-device.c
index bc1af97..10b2e33 100644
--- a/libgomp/icv-device.c
+++ b/libgomp/icv-device.c
@@ -88,3 +88,29 @@ omp_get_max_teams (void)
}
ialias (omp_get_max_teams)
+
+void
+omp_set_num_teams (int num_teams)
+{
+ if (num_teams >= 0)
+ gomp_nteams_var = num_teams;
+}
+
+ialias (omp_set_num_teams)
+
+int
+omp_get_teams_thread_limit (void)
+{
+ return gomp_teams_thread_limit_var;
+}
+
+ialias (omp_get_teams_thread_limit)
+
+void
+omp_set_teams_thread_limit (int thread_limit)
+{
+ if (thread_limit >= 0)
+ gomp_teams_thread_limit_var = thread_limit;
+}
+
+ialias (omp_set_teams_thread_limit)
diff --git a/libgomp/icv.c b/libgomp/icv.c
index 5c435d6..9aef91c 100644
--- a/libgomp/icv.c
+++ b/libgomp/icv.c
@@ -148,26 +148,6 @@ omp_get_supported_active_levels (void)
return gomp_supported_active_levels;
}
-void
-omp_set_num_teams (int num_teams)
-{
- if (num_teams >= 0)
- gomp_nteams_var = num_teams;
-}
-
-void
-omp_set_teams_thread_limit (int thread_limit)
-{
- if (thread_limit >= 0)
- gomp_teams_thread_limit_var = thread_limit;
-}
-
-int
-omp_get_teams_thread_limit (void)
-{
- return gomp_teams_thread_limit_var;
-}
-
int
omp_get_cancellation (void)
{
@@ -268,9 +248,6 @@ ialias (omp_get_thread_limit)
ialias (omp_set_max_active_levels)
ialias (omp_get_max_active_levels)
ialias (omp_get_supported_active_levels)
-ialias (omp_set_num_teams)
-ialias (omp_set_teams_thread_limit)
-ialias (omp_get_teams_thread_limit)
ialias (omp_get_cancellation)
ialias (omp_get_proc_bind)
ialias (omp_get_max_task_priority)
diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h
index 072cc47..75d7663 100644
--- a/libgomp/libgomp-plugin.h
+++ b/libgomp/libgomp-plugin.h
@@ -116,6 +116,7 @@ struct addr_pair
#define GOMP_MAX_ACTIVE_LEVELS_VAR __gomp_max_active_levels
#define GOMP_BIND_VAR __gomp_bind
#define GOMP_NTEAMS_VAR __gomp_nteams
+#define GOMP_TEAMS_THREAD_LIMIT_VAR __gomp_teams_thread_limit_var
/* Miscellaneous functions. */
extern void *GOMP_PLUGIN_malloc (size_t) __attribute__ ((malloc));
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 89d1453..b585c05 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -504,6 +504,17 @@ struct gomp_initial_icv_t
unsigned long stacksize;
};
+struct gomp_default_icv_t
+{
+ enum gomp_schedule_type run_sched_var;
+ int run_sched_chunk_size;
+ unsigned char max_active_levels_var;
+ char bind_var;
+ int nteams_var;
+ int teams_thread_limit_var;
+};
+extern struct gomp_default_icv_t gomp_default_icv_values;
+
struct gomp_icv_flags_t
{
enum gomp_env_var_suffix_t nthreads_var;
@@ -527,13 +538,19 @@ struct gomp_icv_list {
extern void *gomp_get_icv_value_ptr (struct gomp_icv_list **list,
int device_num);
-extern struct gomp_icv_list *gomp_run_sched_var_dev_list;
-extern struct gomp_icv_list *gomp_run_sched_chunk_size_dev_list;
+extern struct gomp_icv_list* gomp_add_device_specific_icv (int dev_num,
+ size_t size,
+ struct gomp_icv_list **list);
+extern struct gomp_icv_list *gomp_initial_run_sched_var_dev_list;
+extern struct gomp_icv_list *gomp_initial_run_sched_chunk_size_dev_list;
+extern struct gomp_icv_list *gomp_initial_max_active_levels_var_dev_list;
+extern struct gomp_icv_list *gomp_initial_proc_bind_var_dev_list;
+extern struct gomp_icv_list *gomp_initial_proc_bind_var_list_dev_list;
+extern struct gomp_icv_list *gomp_initial_proc_bind_var_list_len_dev_list;
+extern struct gomp_icv_list *gomp_initial_nteams_var_dev_list;
+
extern struct gomp_icv_list *gomp_nteams_var_dev_list;
-extern struct gomp_icv_list *gomp_max_active_levels_var_dev_list;
-extern struct gomp_icv_list *gomp_proc_bind_var_dev_list;
-extern struct gomp_icv_list *gomp_proc_bind_var_list_dev_list;
-extern struct gomp_icv_list *gomp_proc_bind_var_list_len_dev_list;
+extern struct gomp_icv_list *gomp_teams_thread_limit_var_dev_list;
extern struct gomp_initial_icv_t gomp_initial_icv_all;
extern struct gomp_initial_icv_t gomp_initial_icv_dev;
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 44ab369..3236c38 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -572,7 +572,8 @@ static char *GOMP_ICV_STRINGS[] =
XSTRING (GOMP_DYN_VAR),
XSTRING (GOMP_MAX_ACTIVE_LEVELS_VAR),
XSTRING (GOMP_BIND_VAR),
- XSTRING (GOMP_NTEAMS_VAR)
+ XSTRING (GOMP_NTEAMS_VAR),
+ XSTRING (GOMP_TEAMS_THREAD_LIMIT_VAR)
};
/* }}} */
@@ -3371,7 +3372,7 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
struct kernel_info *kernel;
int kernel_count = image_desc->kernel_count;
unsigned var_count = image_desc->global_variable_count;
- int other_count = 10;
+ int other_count = 11;
agent = get_agent_info (ord);
if (!agent)
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 9d603fd..ced24ca 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -346,7 +346,8 @@ static char *GOMP_ICV_STRINGS[] =
XSTRING (GOMP_DYN_VAR),
XSTRING (GOMP_MAX_ACTIVE_LEVELS_VAR),
XSTRING (GOMP_BIND_VAR),
- XSTRING (GOMP_NTEAMS_VAR)
+ XSTRING (GOMP_NTEAMS_VAR),
+ XSTRING (GOMP_TEAMS_THREAD_LIMIT_VAR)
};
static inline struct nvptx_thread *
@@ -1316,7 +1317,7 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
fn_descs = img_header->fn_descs;
/* Currently, other entry kinds are 'device number' and further ICVs. */
- other_entries = 10;
+ other_entries = 11;
targ_tbl = GOMP_PLUGIN_malloc (sizeof (struct addr_pair)
* (fn_entries + var_entries + other_entries));
diff --git a/libgomp/target.c b/libgomp/target.c
index 4a9e533..1a6afe6 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -2077,13 +2077,51 @@ gomp_update (struct gomp_device_descr *devicep, size_t mapnum, void **hostaddrs,
(_DEV_42, _DEV, _ALL). */
static void*
gomp_get_icv (struct gomp_icv_list **list, int dev_num,
- enum gomp_env_var_suffix_t flag, void *dev_val, void *all_val)
+ enum gomp_env_var_suffix_t flag, void *dev_val, void *all_val,
+ void *default_val)
{
void *val = gomp_get_icv_value_ptr (list, dev_num);
if (val == NULL && (flag & GOMP_ENV_VAR_SUFFIX_DEV))
- val = dev_val;
+ val = dev_val;
if (val == NULL && (flag & GOMP_ENV_VAR_SUFFIX_ALL))
- val = all_val;
+ val = all_val;
+ if (val == NULL)
+ val = default_val;
+ if (val == NULL)
+ {
+ gomp_error ("Expected value for ICV.");
+ return NULL;
+ }
+ return val;
+}
+
+/* Helper function for 'gomp_load_image_to_device'. Similar to gomp_get_icv but
+ also adds the ICV value to LIST if not already there. */
+static void*
+gomp_get_or_add_icv_int (struct gomp_icv_list **list, int dev_num,
+ enum gomp_env_var_suffix_t flag, void *dev_val,
+ void *all_val, void *default_val)
+{
+ void *val = gomp_get_icv_value_ptr (list, dev_num);
+ if (val == NULL)
+ {
+ if (flag & GOMP_ENV_VAR_SUFFIX_DEV)
+ val = dev_val;
+ if (val == NULL && (flag & GOMP_ENV_VAR_SUFFIX_ALL))
+ val = all_val;
+ if (val == NULL)
+ val = default_val;
+ if (val == NULL)
+ {
+ gomp_error ("Expected value for ICV.");
+ return NULL;
+ }
+
+ struct gomp_icv_list *new_node;
+ new_node = gomp_add_device_specific_icv (dev_num, sizeof (int), list);
+ *((int*)(new_node->value)) = *(int*)val;
+ val = new_node->value;
+ }
return val;
}
@@ -2107,8 +2145,12 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
int num_funcs = host_funcs_end - host_func_table;
int num_vars = (host_vars_end - host_var_table) / 2;
- /* Others currently is only 'device_num' */
- int num_others = 10;
+ /* Others is 'device_num' and further ICVs. */
+ int num_others = 11;
+
+ /* Number of ICVs, which need to be copied back to the host.
+ Currently nteams-var and teams-thread-limit-var. */
+ int num_ICVs_copied_back = 2;
/* Load image to device and get target addresses for the image. */
struct addr_pair *target_table = NULL;
@@ -2132,7 +2174,8 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
/* Insert host-target address mapping into splay tree. */
struct target_mem_desc *tgt = gomp_malloc (sizeof (*tgt));
- tgt->array = gomp_malloc ((num_funcs + num_vars) * sizeof (*tgt->array));
+ tgt->array = gomp_malloc ((num_funcs + num_vars + num_ICVs_copied_back)
+ * sizeof (*tgt->array));
tgt->refcount = REFCOUNT_INFINITY;
tgt->tgt_start = 0;
tgt->tgt_end = 0;
@@ -2204,6 +2247,10 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
for (i = 0; i < num_others; ++i)
{
struct addr_pair *var = &target_table[num_funcs + num_vars + i];
+ /* COPY_BACK_SIZE is used if the device-specific ICV needs to be
+ copied back from device to host. */
+ size_t copy_back_size = 0;
+
/* Start address will be non-zero for the current entry if
the variable was found in this image. */
if (var->start != 0)
@@ -2223,16 +2270,23 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
values that are set during kernel entering. */
break;
case 3: // GOMP_RUN_SCHED_VAR
- val = gomp_get_icv (&gomp_run_sched_var_dev_list, dev_num,
- gomp_initial_icv_flags.run_sched_var,
- &gomp_initial_icv_dev.run_sched_var,
- &gomp_initial_icv_all.run_sched_var);
+ val =
+ gomp_get_icv (&gomp_initial_run_sched_var_dev_list,
+ dev_num,
+ gomp_initial_icv_flags.run_sched_var,
+ &gomp_initial_icv_dev.run_sched_var,
+ &gomp_initial_icv_all.run_sched_var,
+ &gomp_default_icv_values.run_sched_var);
break;
case 4: // GOMP_RUN_SCHED_CHUNK_SIZE
- val = gomp_get_icv (&gomp_run_sched_chunk_size_dev_list, dev_num,
- gomp_initial_icv_flags.run_sched_chunk_size,
- &gomp_initial_icv_dev.run_sched_chunk_size,
- &gomp_initial_icv_all.run_sched_chunk_size);
+ val =
+ gomp_get_icv (&gomp_initial_run_sched_chunk_size_dev_list,
+ dev_num,
+ gomp_initial_icv_flags.run_sched_chunk_size,
+ &gomp_initial_icv_dev.run_sched_chunk_size,
+ &gomp_initial_icv_all.run_sched_chunk_size,
+ &gomp_default_icv_values.
+ run_sched_chunk_size);
break;
case 5: // GOMP_DEFAULT_DEVICE_VAR
val = &gomp_global_icv.default_device_var;
@@ -2242,22 +2296,41 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
values that are set during kernel entering. */
break;
case 7: // GOMP_MAX_ACTIVE_LEVELS_VAR
- val = gomp_get_icv (&gomp_max_active_levels_var_dev_list, dev_num,
- gomp_initial_icv_flags.max_active_levels_var,
- &gomp_initial_icv_dev.max_active_levels_var,
- &gomp_initial_icv_all.max_active_levels_var);
+ val =
+ gomp_get_icv
+ (&gomp_initial_max_active_levels_var_dev_list, dev_num,
+ gomp_initial_icv_flags.max_active_levels_var,
+ &gomp_initial_icv_dev.max_active_levels_var,
+ &gomp_initial_icv_all.max_active_levels_var,
+ &gomp_default_icv_values.max_active_levels_var);
break;
case 8: // GOMP_BIND_VAR
- val = gomp_get_icv (&gomp_proc_bind_var_dev_list, dev_num,
+ val = gomp_get_icv (&gomp_initial_proc_bind_var_dev_list,
+ dev_num,
gomp_initial_icv_flags.bind_var,
&gomp_initial_icv_dev.bind_var,
- &gomp_initial_icv_all.bind_var);
+ &gomp_initial_icv_all.bind_var,
+ &gomp_default_icv_values.bind_var);
break;
case 9: // GOMP_NTEAMS_VAR
- val = gomp_get_icv (&gomp_nteams_var_dev_list, dev_num,
- gomp_initial_icv_flags.nteams_var,
- &gomp_initial_icv_dev.nteams_var,
- &gomp_initial_icv_all.nteams_var);
+ val =
+ gomp_get_or_add_icv_int
+ (&gomp_nteams_var_dev_list, dev_num,
+ gomp_initial_icv_flags.nteams_var,
+ &gomp_initial_icv_dev.nteams_var,
+ &gomp_initial_icv_all.nteams_var,
+ &gomp_default_icv_values.nteams_var);
+ copy_back_size = sizeof (int);
+ break;
+ case 10: // GOMP_TEAMS_THREAD_LIMIT_VAR
+ val =
+ gomp_get_or_add_icv_int
+ (&gomp_teams_thread_limit_var_dev_list, dev_num,
+ gomp_initial_icv_flags.teams_thread_limit_var,
+ &gomp_initial_icv_dev.teams_thread_limit_var,
+ &gomp_initial_icv_all.teams_thread_limit_var,
+ &gomp_default_icv_values.teams_thread_limit_var);
+ copy_back_size = sizeof (int);
break;
}
if (val != NULL)
@@ -2267,6 +2340,22 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
actually designating its device number into effect. */
gomp_copy_host2dev (devicep, NULL, (void *) var->start, val,
var_size, false, NULL);
+ if (copy_back_size)
+ {
+ splay_tree_key k = &array->key;
+ k->host_start = (uintptr_t) val;
+ k->host_end =
+ k->host_start + (size_mask & copy_back_size);
+ k->tgt = tgt;
+ k->tgt_offset = var->start;
+ k->refcount = REFCOUNT_INFINITY;
+ k->dynamic_refcount = 0;
+ k->aux = NULL;
+ array->left = NULL;
+ array->right = NULL;
+ splay_tree_insert (&devicep->mem_map, array);
+ array++;
+ }
}
}
}
@@ -2661,6 +2750,16 @@ clear_unsupported_flags (struct gomp_device_descr *devicep, unsigned int flags)
return flags;
}
+static void gomp_copy_back_icv (struct gomp_icv_list **list,
+ struct gomp_device_descr *devicep, int device,
+ size_t size)
+{
+ void *host_ptr = gomp_get_icv_value_ptr (list, device);
+ void *dev_ptr = omp_get_mapped_ptr (host_ptr, device);
+ if (dev_ptr != NULL)
+ gomp_copy_dev2host (devicep, NULL, host_ptr, dev_ptr, size);
+}
+
/* Like GOMP_target, but KINDS is 16-bit, UNUSED is no longer present,
and several arguments have been added:
FLAGS is a bitmask, see GOMP_TARGET_FLAG_* in gomp-constants.h.
@@ -2693,6 +2792,144 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
size_t tgt_align = 0, tgt_size = 0;
bool fpc_done = false;
+ /* Obtain the original TEAMS and THREADS values from ARGS. */
+ intptr_t orig_teams = 1, orig_threads = 0;
+ size_t num_args = 0, len = 1, teams_len = 1, threads_len = 1;
+ void **tmpargs = args;
+ while (*tmpargs)
+ {
+ intptr_t id = (intptr_t) *tmpargs++, val;
+ if (id & GOMP_TARGET_ARG_SUBSEQUENT_PARAM)
+ {
+ val = (intptr_t) *tmpargs++;
+ len = 2;
+ }
+ else
+ {
+ val = id >> GOMP_TARGET_ARG_VALUE_SHIFT;
+ len = 1;
+ }
+ num_args += len;
+ if ((id & GOMP_TARGET_ARG_DEVICE_MASK) != GOMP_TARGET_ARG_DEVICE_ALL)
+ continue;
+ val = val > INT_MAX ? INT_MAX : val;
+ if ((id & GOMP_TARGET_ARG_ID_MASK) == GOMP_TARGET_ARG_NUM_TEAMS)
+ {
+ orig_teams = val;
+ teams_len = len;
+ }
+ else if ((id & GOMP_TARGET_ARG_ID_MASK) == GOMP_TARGET_ARG_THREAD_LIMIT)
+ {
+ orig_threads = val;
+ threads_len = len;
+ }
+ }
+
+ intptr_t new_teams = orig_teams, new_threads = orig_threads;
+ /* ORIG_TEAMS == -2: No explicit teams construct specified. Set to 1.
+ ORIG_TEAMS == -1: Teams construct with NUM_TEAMS clause specified, but the
+ value could not be specified. No Change.
+ ORIG_TEAMS == 0: TEAMS construct without NUM_TEAMS clause.
+ Set device-specific value.
+ ORIG_TEAMS > 0: Value was already set through e.g. NUM_TEAMS clause.
+ No change. */
+ if (orig_teams == -2)
+ new_teams = 1;
+ else if (orig_teams == 0)
+ {
+ void *dev_val =
+ gomp_get_icv_value_ptr (&gomp_nteams_var_dev_list, device);
+ if (dev_val != NULL)
+ new_teams = *((int*) dev_val);
+ }
+ /* The device-specific teams-thread-limit is only set if (a) an explicit TEAMS
+ region exists, i.e. ORIG_TEAMS > -2, and (b) THREADS was not already set by
+ e.g. a THREAD_LIMIT clause. */
+ if (orig_teams >= -2 && orig_threads == 0)
+ {
+ void *dev_val =
+ gomp_get_icv_value_ptr (&gomp_teams_thread_limit_var_dev_list, device);
+ if (dev_val != NULL)
+ new_threads = *((int*) dev_val);
+ }
+
+ /* Copy and change the arguments list only if TEAMS or THREADS need to be
+ updated. */
+ void **new_args = args;
+ if (orig_teams != new_teams || orig_threads != new_threads)
+ {
+ size_t tms_len = (orig_teams == new_teams
+ ? teams_len
+ : (new_teams > -(1 << 15) && new_teams < (1 << 15)
+ ? 1 : 2));
+ size_t ths_len = (orig_threads == new_threads
+ ? threads_len
+ : (new_threads > -(1 << 15) && new_threads < (1 << 15)
+ ? 1 : 2));
+ /* One additional item since the last arg must be NULL. */
+ size_t new_args_cnt = num_args - teams_len - threads_len + tms_len
+ + ths_len + 1;
+ new_args = (void **) gomp_alloca (new_args_cnt * sizeof (void*));
+
+ tmpargs = args;
+ void **tmp_new_args = new_args;
+ while (*tmpargs)
+ {
+ intptr_t id = (intptr_t) *tmpargs;
+ if (((id & GOMP_TARGET_ARG_ID_MASK) == GOMP_TARGET_ARG_NUM_TEAMS
+ && orig_teams != new_teams)
+ || ((id & GOMP_TARGET_ARG_ID_MASK) == GOMP_TARGET_ARG_THREAD_LIMIT
+ && orig_threads != new_threads))
+ {
+ tmpargs++;
+ if (id & GOMP_TARGET_ARG_SUBSEQUENT_PARAM)
+ tmpargs++;
+ }
+ else
+ {
+ *tmp_new_args++ = *tmpargs++;
+ if (id & GOMP_TARGET_ARG_SUBSEQUENT_PARAM)
+ *tmp_new_args++ = *tmpargs++;
+ }
+ }
+
+ if (orig_teams != new_teams)
+ {
+ intptr_t new_val = new_teams;
+ if (tms_len == 1)
+ {
+ new_val = (new_val << GOMP_TARGET_ARG_VALUE_SHIFT)
+ | GOMP_TARGET_ARG_NUM_TEAMS;
+ *tmp_new_args++ = (void *) new_val;
+ }
+ else
+ {
+ *tmp_new_args++ = (void *) (GOMP_TARGET_ARG_SUBSEQUENT_PARAM
+ | GOMP_TARGET_ARG_NUM_TEAMS);
+ *tmp_new_args++ = (void *) new_val;
+ }
+ }
+
+ if (orig_threads != new_threads)
+ {
+ intptr_t new_val = new_threads;
+ if (ths_len == 1)
+ {
+ new_val = (new_val << GOMP_TARGET_ARG_VALUE_SHIFT)
+ | GOMP_TARGET_ARG_THREAD_LIMIT;
+ *tmp_new_args++ = (void *) new_val;
+ }
+ else
+ {
+ *tmp_new_args++ = (void *) (GOMP_TARGET_ARG_SUBSEQUENT_PARAM
+ | GOMP_TARGET_ARG_THREAD_LIMIT);
+ *tmp_new_args++ = (void *) new_val;
+ }
+ }
+
+ *tmp_new_args = NULL;
+ }
+
flags = clear_unsupported_flags (devicep, flags);
if (flags & GOMP_TARGET_FLAG_NOWAIT)
@@ -2731,7 +2968,7 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
&& !thr->task->final_task)
{
gomp_create_target_task (devicep, fn, mapnum, hostaddrs,
- sizes, kinds, flags, depend, args,
+ sizes, kinds, flags, depend, new_args,
GOMP_TARGET_TASK_BEFORE_MAP);
return;
}
@@ -2777,7 +3014,7 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
tgt_align, tgt_size);
}
}
- gomp_target_fallback (fn, hostaddrs, devicep, args);
+ gomp_target_fallback (fn, hostaddrs, devicep, new_args);
return;
}
@@ -2805,9 +3042,10 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
tgt_vars = gomp_map_vars (devicep, mapnum, hostaddrs, NULL, sizes, kinds,
true, &refcount_set, GOMP_MAP_VARS_TARGET);
}
+
devicep->run_func (devicep->target_id, fn_addr,
tgt_vars ? (void *) tgt_vars->tgt_start : hostaddrs,
- args);
+ new_args);
if (tgt_vars)
{
htab_clear (refcount_set);
@@ -2815,6 +3053,14 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
}
if (refcount_set)
htab_free (refcount_set);
+
+ /* Copy back ICVs which were probably changed on the device.
+ HOST_PTR is expected to exist since it was added in
+ gomp_load_image_to_device if not already available. */
+ gomp_copy_back_icv (&gomp_nteams_var_dev_list, devicep, device, sizeof (int));
+ gomp_copy_back_icv (&gomp_teams_thread_limit_var_dev_list, devicep, device,
+ sizeof (int));
+
}
/* Host fallback for GOMP_target_data{,_ext} routines. */
diff --git a/libgomp/testsuite/libgomp.c-c++-common/icv-4.c b/libgomp/testsuite/libgomp.c-c++-common/icv-4.c
index b987a33..9da0d63 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/icv-4.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/icv-4.c
@@ -16,7 +16,7 @@ main ()
}
else
omp_set_num_teams (6);
- if (getenv ("OMP_TEAMS_THREAD_LIMIT") == NULL
+ if (getenv ("OMP_TEAMS_THREAD_LIMIT") != NULL
&& strcmp (getenv ("OMP_TEAMS_THREAD_LIMIT"), "12") == 0)
{
if (omp_get_teams_thread_limit () != 12)
diff --git a/libgomp/testsuite/libgomp.c-c++-common/icv-5.c b/libgomp/testsuite/libgomp.c-c++-common/icv-5.c
index 21bf44d..8f768b3 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/icv-5.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/icv-5.c
@@ -1,3 +1,4 @@
+/* { dg-additional-options "-DAMD" { target openacc_radeon_accel_selected } } */
/* { dg-do run } */
#include <omp.h>
@@ -6,15 +7,22 @@
#include <unistd.h>
char const *varnames[] = {
+ "OMP_NUM_TEAMS_ALL",
+ "OMP_NUM_TEAMS_DEV",
+ "OMP_NUM_TEAMS",
"OMP_NUM_TEAMS_DEV_0",
"OMP_NUM_TEAMS_DEV_1",
"OMP_NUM_TEAMS_DEV_2",
- "OMP_NUM_TEAMS_ALL",
- "OMP_NUM_TEAMS_DEV",
- "OMP_NUM_TEAMS"
+ "OMP_TEAMS_THREAD_LIMIT_ALL",
+ "OMP_TEAMS_THREAD_LIMIT_DEV",
+ "OMP_TEAMS_THREAD_LIMIT",
+ "OMP_TEAMS_THREAD_LIMIT_DEV_0",
+ "OMP_TEAMS_THREAD_LIMIT_DEV_1",
+ "OMP_TEAMS_THREAD_LIMIT_DEV_2"
};
-char const *values[] = { "42", "43", "44", "45", "46", "47" };
-const int cnt = 6;
+char const *values[] = { "3", "4", "5", "6", "7", "8",
+ "2", "3", "4", "5", "6", "7" };
+const int cnt = 12;
int
main (int argc, char *const *argv)
@@ -35,14 +43,192 @@ main (int argc, char *const *argv)
abort ();
}
- if (omp_get_max_teams () != 47)
+ if (omp_get_max_teams () != 5
+ || omp_get_teams_thread_limit () != 4)
+ abort ();
+
+ #pragma omp teams
+ {
+ if (omp_get_num_teams () > 5
+ || omp_get_team_num () >= 5)
+ abort ();
+ #pragma omp parallel
+ if (omp_get_thread_limit () > 4
+ || omp_get_thread_num () >= 4)
+ abort ();
+ }
+
+ omp_set_num_teams (4);
+ omp_set_teams_thread_limit (3);
+ if (omp_get_max_teams () != 4
+ || omp_get_teams_thread_limit () != 3)
+ abort ();
+
+ #pragma omp teams
+ {
+ if (omp_get_num_teams () > 4
+ || omp_get_team_num () >= 4)
+ abort ();
+ #pragma omp parallel
+ if (omp_get_thread_limit () > 3
+ || omp_get_thread_num () >= 3)
+ abort ();
+ }
+
+ #pragma omp teams num_teams(3) thread_limit(2)
+ {
+ if (omp_get_num_teams () != 3
+ || omp_get_team_num () >= 3)
+ abort ();
+ #pragma omp parallel
+ if (omp_get_thread_limit () > 2
+ || omp_get_thread_num () >= 2)
+ abort ();
+ }
+
+ #pragma omp teams num_teams(5) thread_limit(4)
+ {
+ if (omp_get_num_teams () != 5
+ || omp_get_team_num () >= 5)
abort ();
+ #pragma omp parallel
+ if (omp_get_thread_limit () > 4
+ || omp_get_thread_num () >= 4)
+ abort ();
+ }
int num_devices = omp_get_num_devices () > 3 ? 3 : omp_get_num_devices ();
+ if (num_devices <= 0)
+ return 0;
+
for (int i=0; i < num_devices; i++)
- #pragma omp target device (i)
- if (omp_get_max_teams () != 42 + i)
+ {
+ #pragma omp target device (i)
+ if (omp_get_max_teams () != 6 + i
+ || omp_get_teams_thread_limit () != 5 + i)
abort ();
+ #pragma omp target device (i)
+ #pragma omp teams
+ {
+ if (omp_get_num_teams () > 6 + i
+ || omp_get_team_num () >= 6 + i)
+ abort ();
+ #pragma omp parallel
+ if (omp_get_thread_limit () > 5 + i
+ || omp_get_thread_num () >= 5 + i)
+ abort ();
+ }
+
+ #pragma omp target device (i)
+ {
+ omp_set_num_teams (5 + i);
+ omp_set_teams_thread_limit (4 + i);
+ if (omp_get_max_teams () != 5 + i
+ || omp_get_teams_thread_limit () != 4 + i)
+ abort ();
+ }
+
+ /* omp_set_num_teams and omp_set_teams_thread_limit above set the value
+ of nteams-var and teams-thread-limit-var ICVs on device 'i', which has
+ scope 'device' and should be avaible in subsequent target regions. */
+ #pragma omp target device (i)
+ if (omp_get_max_teams () != 5 + i
+ || omp_get_teams_thread_limit () != 4 + i)
+ abort ();
+
+ #pragma omp target device (i)
+ #pragma omp teams
+ {
+ if (omp_get_num_teams () > 5 + i
+ || omp_get_team_num () >= 5 + i)
+ abort ();
+ #pragma omp parallel
+ if (omp_get_thread_limit () > 4 + i
+ || omp_get_thread_num () >= 4 + i)
+ abort ();
+ }
+
+ #pragma omp target device (i)
+ #pragma omp teams num_teams(6 + i) thread_limit(5 + i)
+ {
+ if (omp_get_num_teams () > 6 + i
+ || omp_get_team_num () >= 6 + i)
+ abort ();
+ #pragma omp parallel
+ if (omp_get_thread_limit () > 5 + i
+ || omp_get_thread_num () >= 5 + i
+ || omp_get_num_teams () > 6 + i
+ || omp_get_team_num () >= 6 + i)
+ abort ();
+ }
+
+ #pragma omp target device (i)
+ #pragma omp teams num_teams(4 + i) thread_limit(3 + i)
+ {
+ if (omp_get_num_teams () > 4 + i
+ || omp_get_team_num () >= 4 + i)
+ abort ();
+ #pragma omp parallel
+ if (omp_get_thread_limit () > 3 + i
+ || omp_get_thread_num () >= 3 + i
+ || omp_get_num_teams () > 4 + i
+ || omp_get_team_num () >= 4 + i)
+ abort ();
+ }
+
+ #pragma omp target device (i)
+ #pragma omp teams thread_limit(3 + i) num_teams(4 + i)
+ {
+ if (omp_get_num_teams () > 4 + i
+ || omp_get_team_num () >= 4 + i)
+ abort ();
+ #pragma omp parallel
+ if (omp_get_thread_limit () > 3 + i
+ || omp_get_thread_num () >= 3 + i
+ || omp_get_num_teams () > 4 + i
+ || omp_get_team_num () >= 4 + i)
+ abort ();
+ }
+
+ /* This tests a large number of teams and threads. If it is larger than
+ 2^15+1 then the according argument in the kernels arguments list
+ is encoded with two items instead of one. On NVIDIA there is an
+ adjustment for too large teams and threads. For AMD such adjustment
+ exists only for threads and will cause runtime errors with a two large
+ number of teams. */
+ intptr_t large_num_teams = 66000;
+#ifdef AMD
+ large_num_teams = 8;
+#endif
+ intptr_t large_threads_limit = 67000;
+ #pragma omp target device (i)
+ {
+ omp_set_num_teams (large_num_teams + i);
+ omp_set_teams_thread_limit (large_threads_limit + i);
+ if (omp_get_max_teams () != large_num_teams + i
+ || omp_get_teams_thread_limit () != large_threads_limit + i)
+ abort ();
+ }
+
+ #pragma omp target device (i)
+ if (omp_get_max_teams () != large_num_teams + i
+ || omp_get_teams_thread_limit () != large_threads_limit + i)
+ abort ();
+
+ #pragma omp target device (i)
+ #pragma omp teams
+ {
+ if (omp_get_num_teams () > large_num_teams + i
+ || omp_get_team_num () >= large_num_teams + i)
+ abort ();
+ #pragma omp parallel
+ if (omp_get_thread_limit () > large_threads_limit + i
+ || omp_get_thread_num () >= large_threads_limit + i)
+ abort ();
+ }
+
+ }
+
return 0;
-}
\ No newline at end of file
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/icv-6.c b/libgomp/testsuite/libgomp.c-c++-common/icv-6.c
index 943147b..5e761aa 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/icv-6.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/icv-6.c
@@ -1,8 +1,8 @@
/* { dg-do run } */
-/* This tests the hierarchical usage of ICVs on the device, i.e. if
- OMP_NUM_TEAMS_DEV_<device_num> is not configured, then the value of
- OMP_NUM_TEAMS_DEV should be used. */
+/* This test considers the hierarchical usage of ICVs on the device, i.e. if
+ e.g. OMP_NUM_TEAMS_DEV_<device_num> is not configured, then the value of
+ OMP_NUM_TEAMS_DEV should be used for the targets. */
#include <omp.h>
#include <stdlib.h>
@@ -12,16 +12,21 @@
char const *varnames[] = {
"OMP_NUM_TEAMS_ALL",
"OMP_NUM_TEAMS_DEV",
+ "OMP_TEAMS_THREAD_LIMIT_ALL",
+ "OMP_TEAMS_THREAD_LIMIT_DEV",
};
-char const *values[] = { "42", "43" };
-const int cnt = 2;
+char const *values[] = { "3", "4", "2", "3" };
+const int cnt = 4;
char const *excludes[] = {
"OMP_NUM_TEAMS_DEV_0",
"OMP_NUM_TEAMS_DEV_1",
"OMP_NUM_TEAMS_DEV_2",
+ "OMP_TEAMS_THREAD_LIMIT_DEV_0",
+ "OMP_TEAMS_THREAD_LIMIT_DEV_1",
+ "OMP_TEAMS_THREAD_LIMIT_DEV_2"
};
-const int cnt_exludes = 3;
+const int cnt_exludes = 6;
int
main (int argc, char *const *argv)
@@ -50,14 +55,55 @@ main (int argc, char *const *argv)
abort ();
}
- if (omp_get_max_teams () != 42)
- abort ();
-
int num_devices = omp_get_num_devices () > 3 ? 3 : omp_get_num_devices ();
for (int i=0; i < num_devices; i++)
- #pragma omp target device (i)
- if (omp_get_max_teams () != 43)
+ {
+ #pragma omp target device (i)
+ if (omp_get_max_teams () != 4
+ || omp_get_teams_thread_limit () != 3)
+ abort ();
+
+ #pragma omp target device (i)
+ #pragma omp teams
+ {
+ if (omp_get_num_teams () > 4
+ || omp_get_team_num () >= 4)
+ abort ();
+ #pragma omp parallel
+ if (omp_get_thread_limit () > 3
+ || omp_get_thread_num () >= 3)
+ abort ();
+ }
+
+ #pragma omp target device (i)
+ {
+ omp_set_num_teams (3 + i);
+ omp_set_teams_thread_limit (2 + i);
+ if (omp_get_max_teams () != 3 + i
+ || omp_get_teams_thread_limit () != 2 + i)
+ abort ();
+ }
+
+ /* omp_set_num_teams above set the value of nteams-var ICV on device 'i',
+ which has scope 'device' and should be avaible in subsequent target
+ regions. */
+ #pragma omp target device (i)
+ if (omp_get_max_teams () != 3 + i
+ || omp_get_teams_thread_limit () != 2 + i)
abort ();
+ #pragma omp target device (i)
+ #pragma omp teams
+ {
+ if (omp_get_num_teams () > 3 + i
+ || omp_get_team_num () >= 3 + i)
+ abort ();
+ #pragma omp parallel
+ if (omp_get_thread_limit () > 2 + i
+ || omp_get_thread_num () >= 2 + i)
+ abort ();
+ }
+ }
+
return 0;
-}
\ No newline at end of file
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/icv-7.c b/libgomp/testsuite/libgomp.c-c++-common/icv-7.c
index 857d796..b28a33a 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/icv-7.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/icv-7.c
@@ -12,19 +12,25 @@
#include <unistd.h>
char const *varnames[] = {
- "OMP_NUM_TEAMS_ALL"
+ "OMP_NUM_TEAMS_ALL",
+ "OMP_TEAMS_THREAD_LIMIT_ALL"
};
-char const *values[] = { "42" };
-const int cnt = 1;
+char const *values[] = { "7", "2" };
+const int cnt = 2;
char const *excludes[] = {
"OMP_NUM_TEAMS_DEV_0",
"OMP_NUM_TEAMS_DEV_1",
"OMP_NUM_TEAMS_DEV_2",
"OMP_NUM_TEAMS_DEV",
- "OMP_NUM_TEAMS"
+ "OMP_NUM_TEAMS",
+ "OMP_TEAMS_THREAD_LIMIT_DEV_0",
+ "OMP_TEAMS_THREAD_LIMIT_DEV_1",
+ "OMP_TEAMS_THREAD_LIMIT_DEV_2",
+ "OMP_TEAMS_THREAD_LIMIT_DEV",
+ "OMP_TEAMS_THREAD_LIMIT"
};
-const int cnt_exludes = 5;
+const int cnt_exludes = 10;
int
main (int argc, char *const *argv)
@@ -53,14 +59,73 @@ main (int argc, char *const *argv)
abort ();
}
- if (omp_get_max_teams () != 42)
+ if (omp_get_max_teams () != 7
+ || omp_get_teams_thread_limit () != 2)
+ abort ();
+
+ #pragma omp teams
+ if (omp_get_num_teams () > 7
+ || omp_get_team_num () >= 7)
+ abort ();
+
+ omp_set_num_teams (9);
+ omp_set_teams_thread_limit (3);
+ if (omp_get_max_teams () != 9
+ || omp_get_teams_thread_limit () != 3)
+ abort ();
+
+ #pragma omp teams
+ if (omp_get_num_teams () > 9
+ || omp_get_team_num () >= 9)
+ abort ();
+
+ #pragma omp teams num_teams(5)
+ if (omp_get_num_teams () > 5
+ || omp_get_team_num () >= 5)
abort ();
int num_devices = omp_get_num_devices () > 3 ? 3 : omp_get_num_devices ();
for (int i=0; i < num_devices; i++)
- #pragma omp target device (i)
- if (omp_get_max_teams () != 42)
+ {
+ #pragma omp target device (i)
+ if (omp_get_max_teams () != 7
+ || omp_get_teams_thread_limit () != 2)
abort ();
+ #pragma omp target device (i)
+ #pragma omp teams
+ if (omp_get_num_teams () > 7
+ || omp_get_team_num () >= 7)
+ abort ();
+
+ #pragma omp target device (i)
+ {
+ omp_set_num_teams (8 + i);
+ omp_set_teams_thread_limit (4 + i);
+ if (omp_get_max_teams () != 8 + i
+ || omp_get_teams_thread_limit () != 4 + i)
+ abort ();
+ }
+
+ /* omp_set_num_teams above set the value of nteams-var ICV on device 'i',
+ which has scope 'device' and should be avaible in subsequent target
+ regions. */
+ #pragma omp target device (i)
+ if (omp_get_max_teams () != 8 + i
+ || omp_get_teams_thread_limit () != 4 + i)
+ abort ();
+
+ #pragma omp target device (i)
+ #pragma omp teams
+ if (omp_get_num_teams () > 8 + i
+ || omp_get_team_num () >= 8 + i)
+ abort ();
+
+ #pragma omp target device (i)
+ #pragma omp teams num_teams(5 + i)
+ if (omp_get_num_teams () != 5 + i)
+ abort ();
+ }
+
return 0;
-}
\ No newline at end of file
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/icv-8.c b/libgomp/testsuite/libgomp.c-c++-common/icv-8.c
new file mode 100644
index 0000000..1eda2fd
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/icv-8.c
@@ -0,0 +1,105 @@
+/* { dg-do run } */
+
+/* This tests usage of ICVs on the host and on devices if no corresponding
+ environment variables are configured. */
+
+#include <omp.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+char const *excludes[] = {
+ "OMP_NUM_TEAMS_DEV_0",
+ "OMP_NUM_TEAMS_DEV_1",
+ "OMP_NUM_TEAMS_DEV_2",
+ "OMP_NUM_TEAMS_DEV",
+ "OMP_NUM_TEAMS_ALL",
+ "OMP_NUM_TEAMS",
+ "OMP_TEAMS_THREAD_LIMIT_DEV_0",
+ "OMP_TEAMS_THREAD_LIMIT_DEV_1",
+ "OMP_TEAMS_THREAD_LIMIT_DEV_2",
+ "OMP_TEAMS_THREAD_LIMIT_DEV",
+ "OMP_TEAMS_THREAD_LIMIT_ALL",
+ "OMP_TEAMS_THREAD_LIMIT"
+};
+const int cnt_exludes = 12;
+
+int
+main (int argc, char *const *argv)
+{
+ int updated = 0;
+ for (int i = 0; i < cnt_exludes; i++)
+ if (getenv (excludes[i]) != NULL)
+ {
+ unsetenv (excludes[i]);
+ updated = 1;
+ }
+ if (updated)
+ {
+ execv (argv[0], argv);
+ abort ();
+ }
+
+ if (omp_get_max_teams () != 0
+ || omp_get_teams_thread_limit () != 0)
+ abort ();
+
+ omp_set_num_teams (9);
+ omp_set_teams_thread_limit (2);
+ if (omp_get_max_teams () != 9
+ || omp_get_teams_thread_limit () != 2)
+ abort ();
+
+ #pragma omp teams
+ if (omp_get_num_teams () > 9
+ || omp_get_team_num () >= 9)
+ abort ();
+
+ #pragma omp teams num_teams(5)
+ if (omp_get_num_teams () > 5
+ || omp_get_team_num () >= 5)
+ abort ();
+
+ int num_devices = omp_get_num_devices () > 3 ? 3 : omp_get_num_devices ();
+ for (int i=0; i < num_devices; i++)
+ {
+ #pragma omp target device (i)
+ if (omp_get_max_teams () != 0
+ || omp_get_teams_thread_limit () != 0)
+ abort ();
+
+ #pragma omp target device (i)
+ {
+ omp_set_num_teams (8 + i);
+ omp_set_teams_thread_limit (3 + i);
+ if (omp_get_max_teams () != 8 + i
+ || omp_get_teams_thread_limit () != 3 + i)
+ abort ();
+ }
+
+ /* omp_set_num_teams above set the value of nteams-var ICV on device 'i',
+ which has scope 'device' and should be avaible in subsequent target
+ regions. */
+ #pragma omp target device (i)
+ if (omp_get_max_teams () != 8 + i
+ || omp_get_teams_thread_limit () != 3 + i)
+ abort ();
+
+ #pragma omp target device (i)
+ #pragma omp teams
+ if (omp_get_num_teams () > 8 + i
+ || omp_get_team_num () >= 8 + i)
+ abort ();
+
+ #pragma omp target device (i)
+ #pragma omp teams num_teams(5 + i)
+ if (omp_get_num_teams () > 5 + i
+ || omp_get_team_num () >= 5 + i)
+ abort ();
+
+ #pragma omp target device (i)
+ ;
+ }
+
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.fortran/icv-5.f90 b/libgomp/testsuite/libgomp.fortran/icv-5.f90
new file mode 100644
index 0000000..9462ca9
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/icv-5.f90
@@ -0,0 +1,231 @@
+! { dg-set-target-env-var OMP_NUM_TEAMS_ALL "3" }
+! { dg-set-target-env-var OMP_NUM_TEAMS_DEV "4" }
+! { dg-set-target-env-var OMP_NUM_TEAMS "5" }
+! { dg-set-target-env-var OMP_NUM_TEAMS_DEV_0 "6" }
+! { dg-set-target-env-var OMP_NUM_TEAMS_DEV_1 "7" }
+! { dg-set-target-env-var OMP_NUM_TEAMS_DEV_2 "8" }
+! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_ALL "2" }
+! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_DEV "3" }
+! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT "4" }
+! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_DEV_0 "5" }
+! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_DEV_1 "6" }
+! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_DEV_2 "7" }
+
+use omp_lib
+implicit none (type, external)
+ integer :: num_devices, i, large_num_teams, large_threads_limit
+ logical :: err
+
+ if (omp_get_num_devices () > 3) then
+ num_devices = 3
+ else
+ num_devices = omp_get_num_devices ()
+ end if
+
+ do i=0,num_devices-1
+
+ ! Testing NUM_TEAMS.
+ if (env_is_set_dev ("OMP_NUM_TEAMS_DEV_", i, 6 + i)) then
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ if (omp_get_max_teams () /= 6 + i) err = .true.
+ !$omp end target
+ if (err) stop 1
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams
+ if (omp_get_num_teams () > 6 + i .or. omp_get_team_num () >= 6 + i) &
+ err = .true.
+ !$omp end teams
+ !$omp end target
+ if (err) stop 2
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ call omp_set_num_teams (5 + i)
+ if (omp_get_max_teams () /= 5 + i) err = .true.
+ !$omp end target
+ if (err) stop 3
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ if (omp_get_max_teams () /= 5 + i) err = .true.
+ !$omp end target
+ if (err) stop 4
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams
+ if (omp_get_num_teams () > 5 + i .or. omp_get_team_num () >= 5 + i) &
+ err = .true.
+ !$omp end teams
+ !$omp end target
+ if (err) stop 5
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams num_teams(6 + i)
+ if (omp_get_num_teams () > 6 + i .or. omp_get_team_num () >= 6 + i) &
+ err = .true.
+ !$omp end teams
+ !$omp end target
+ if (err) stop 6
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams num_teams(4 + i)
+ if (omp_get_num_teams () > 4 + i .or. omp_get_team_num () >= 4 + i) &
+ err = .true.
+ !$omp end teams
+ !$omp end target
+ if (err) stop 7
+
+ large_num_teams = 66000
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ call omp_set_num_teams (large_num_teams + i)
+ if (omp_get_max_teams () /= large_num_teams + i) err = .true.
+ !$omp end target
+ if (err) stop 8
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ if (omp_get_max_teams () /= large_num_teams + i) err = .true.
+ !$omp end target
+ if (err) stop 9
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams
+ if (omp_get_num_teams () > large_num_teams + i &
+ .or. omp_get_team_num () >= large_num_teams + i) err = .true.
+ !$omp end teams
+ !$omp end target
+ if (err) stop 10
+ end if
+
+ ! Testing TEAMS-THREAD-LIMIT
+ if (env_is_set_dev ("OMP_TEAMS_THREAD_LIMIT_DEV_", i, 5 + i)) then
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ if (omp_get_teams_thread_limit () /= 5 + i) err = .true.
+ !$omp end target
+ if (err) stop 11
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams
+ !$omp parallel
+ if (omp_get_thread_limit () > 5 + i .or. omp_get_thread_num () >= 5 + i) &
+ err = .true.
+ !$omp end parallel
+ !$omp end teams
+ !$omp end target
+ if (err) stop 12
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ call omp_set_teams_thread_limit (4 + i)
+ if (omp_get_teams_thread_limit () /= 4 + i) err = .true.
+ !$omp end target
+ if (err) stop 13
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ if (omp_get_teams_thread_limit () /= 4 + i) err = .true.
+ !$omp end target
+ if (err) stop 14
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams
+ !$omp parallel
+ if (omp_get_thread_limit () > 4 + i .or. omp_get_thread_num () >= 4 + i) &
+ err = .true.
+ !$omp end parallel
+ !$omp end teams
+ !$omp end target
+ if (err) stop 15
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams thread_limit(5 + i)
+ !$omp parallel
+ if (omp_get_thread_limit () > 5 + i .or. omp_get_thread_num () >= 5 + i) &
+ err = .true.
+ !$omp end parallel
+ !$omp end teams
+ !$omp end target
+ if (err) stop 16
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams thread_limit(3 + i)
+ !$omp parallel
+ if (omp_get_thread_limit () > 3 + i .or. omp_get_thread_num () >= 3 + i) &
+ err = .true.
+ !$omp end parallel
+ !$omp end teams
+ !$omp end target
+ if (err) stop 17
+
+ large_threads_limit = 67000
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ call omp_set_teams_thread_limit (large_threads_limit + i)
+ if (omp_get_teams_thread_limit () /= large_threads_limit + i) err = .true.
+ !$omp end target
+ if (err) stop 18
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ if (omp_get_teams_thread_limit () /= large_threads_limit + i) err = .true.
+ !$omp end target
+ if (err) stop 19
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams
+ !$omp parallel
+ if (omp_get_thread_limit () > large_threads_limit + i &
+ .or. omp_get_thread_num () >= large_threads_limit + i) err = .true.
+ !$omp end parallel
+ !$omp end teams
+ !$omp end target
+ if (err) stop 20
+ end if
+
+ end do
+
+contains
+ logical function env_is_set (name, val)
+ character(len=*) :: name, val
+ character(len=40) :: val2
+ integer :: stat
+ call get_environment_variable (name, val2, status=stat)
+ if (stat == 0) then
+ if (val == val2) then
+ env_is_set = .true.
+ return
+ end if
+ else if (stat /= 1) then
+ error stop 30
+ endif
+ env_is_set = .false.
+ end
+ logical function env_is_set_dev (name, dev_num, val)
+ character(len=*) :: name
+ integer :: dev_num, val
+ character(len=64) :: dev_num_str, env_var, val_str
+ write (dev_num_str,*) dev_num
+ dev_num_str = ADJUSTL(dev_num_str)
+ env_var = name // dev_num_str
+ write (val_str,*) val
+ val_str = ADJUSTL(val_str)
+ write (*,*) ">", TRIM(env_var), "<"
+ write (*,*) ">", TRIM(val_str), "<"
+ write (*,*) "env_is_set (TRIM(env_var), TRIM(device_num): ", env_is_set(TRIM(env_var), TRIM(val_str))
+ env_is_set_dev = env_is_set (TRIM(env_var), TRIM(val_str))
+ end
+end
diff --git a/libgomp/testsuite/libgomp.fortran/icv-6.f90 b/libgomp/testsuite/libgomp.fortran/icv-6.f90
new file mode 100644
index 0000000..c1be2b2
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/icv-6.f90
@@ -0,0 +1,142 @@
+! { dg-set-target-env-var OMP_NUM_TEAMS_ALL "3" }
+! { dg-set-target-env-var OMP_NUM_TEAMS_DEV "4" }
+! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_ALL "2" }
+! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_DEV "3" }
+
+! This test considers the hierarchical usage of ICVs on the device, i.e. if
+! e.g. OMP_NUM_TEAMS_DEV_<device_num> is not configured, then the value of
+! OMP_NUM_TEAMS_DEV should be used for the targets.
+
+use omp_lib
+implicit none (type, external)
+ integer :: num_devices, i, stat, tmp
+ logical :: err
+ character(len=40) :: val
+
+ ! The following environment variables should not be set.
+ call get_environment_variable ("OMP_NUM_TEAMS_DEV_0", val, status=stat)
+ if (stat /= 1) return
+ call get_environment_variable ("OMP_NUM_TEAMS_DEV_1", val, status=stat)
+ if (stat /= 1) return
+ call get_environment_variable ("OMP_NUM_TEAMS_DEV_2", val, status=stat)
+ if (stat /= 1) return
+ call get_environment_variable ("OMP_TEAMS_THREAD_LIMIT_DEV_0", val, status=stat)
+ if (stat /= 1) return
+ call get_environment_variable ("OMP_TEAMS_THREAD_LIMIT_DEV_1", val, status=stat)
+ if (stat /= 1) return
+ call get_environment_variable ("OMP_TEAMS_THREAD_LIMIT_DEV_2", val, status=stat)
+ if (stat /= 1) return
+
+ if (omp_get_num_devices () > 3) then
+ num_devices = 3
+ else
+ num_devices = omp_get_num_devices ()
+ end if
+
+ do i=0,num_devices-1
+
+ ! Testing NUM_TEAMS.
+ write (*,*) 'env_is_set ("OMP_NUM_TEAMS_DEV", "4"): ', env_is_set ("OMP_NUM_TEAMS_DEV", "4")
+ if (env_is_set ("OMP_NUM_TEAMS_DEV", "4")) then
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ if (omp_get_max_teams () /= 4) err = .true.
+ !$omp end target
+ if (err) stop 1
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams
+ if (omp_get_num_teams () > 4 .or. omp_get_team_num () >= 4) &
+ err = .true.
+ !$omp end teams
+ !$omp end target
+ if (err) stop 2
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ call omp_set_num_teams (3 + i)
+ if (omp_get_max_teams () /= 3 + i) err = .true.
+ !$omp end target
+ if (err) stop 3
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ if (omp_get_max_teams () /= 3 + i) err = .true.
+ !$omp end target
+ if (err) stop 4
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams
+ if (omp_get_num_teams () > 3 + i .or. omp_get_team_num () >= 3 + i) &
+ err = .true.
+ !$omp end teams
+ !$omp end target
+ if (err) stop 5
+ end if
+
+ ! Testing TEAMS-THREAD-LIMIT
+ write (*,*) 'env_is_set ("OMP_TEAMS_THREAD_LIMIT_DEV", "3"): ', env_is_set ("OMP_TEAMS_THREAD_LIMIT_DEV", "3")
+ if (env_is_set ("OMP_TEAMS_THREAD_LIMIT_DEV", "3")) then
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ if (omp_get_teams_thread_limit () /= 3) err = .true.
+ !$omp end target
+ if (err) stop 6
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams
+ !$omp parallel
+ if (omp_get_thread_limit () > 3 .or. omp_get_thread_num () >= 3) &
+ err = .true.
+ !$omp end parallel
+ !$omp end teams
+ !$omp end target
+ if (err) stop 7
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ call omp_set_teams_thread_limit (2 + i)
+ if (omp_get_teams_thread_limit () /= 2 + i) err = .true.
+ !$omp end target
+ if (err) stop 8
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ if (omp_get_teams_thread_limit () /= 2 + i) err = .true.
+ !$omp end target
+ if (err) stop 9
+
+ err = .false.
+ !$omp target device(i) map(tofrom: err)
+ !$omp teams
+ !$omp parallel
+ if (omp_get_thread_limit () > 2 + i .or. omp_get_thread_num () >= 2 + i) &
+ err = .true.
+ !$omp end parallel
+ !$omp end teams
+ !$omp end target
+ if (err) stop 10
+ end if
+
+ end do
+
+contains
+ logical function env_is_set (name, val)
+ character(len=*) :: name, val
+ character(len=40) :: val2
+ integer :: stat
+ call get_environment_variable (name, val2, status=stat)
+ if (stat == 0) then
+ if (val == val2) then
+ env_is_set = .true.
+ return
+ end if
+ else if (stat /= 1) then
+ error stop 10
+ endif
+ env_is_set = .false.
+ end
+end
next reply other threads:[~2022-04-14 16:06 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-14 16:06 Marcel Vollweiler [this message]
2022-06-30 13:16 ` Jakub Jelinek
2022-08-03 12:40 ` Marcel Vollweiler
2022-09-18 8:24 ` Marcel Vollweiler
2022-09-30 9:35 ` Jakub Jelinek
2022-11-24 14:09 ` Marcel Vollweiler
2022-12-05 13:50 ` Jakub Jelinek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d32cc89e-74d7-06bd-3ef6-15deadd28a78@codesourcery.com \
--to=marcel@codesourcery.com \
--cc=gcc-patches@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).