gcc/ChangeLog: * gimplify.cc (optimize_target_teams): Changed integer_one_node to integer_minus_two_node in case of non-existing teams construct in target region due to disambiguation. Previously, num_teams(1) was used as clause on the target construct when (a) no teams construct exists in the target region or (b) a teams construct with explicit num_teams(1) clause was specified. * tree-core.h (enum tree_index): Added TI_INTEGER_MINUS_TWO. * tree.cc (build_common_tree_nodes): Added integer_minus_two_node. * tree.h (integer_minus_two_node): Likewise. libgomp/ChangeLog: * config/gcn/icv-device.c (omp_set_num_teams): Added. (omp_get_teams_thread_limit): Added. (omp_set_teams_thread_limit): Added. (ialias): Added for omp_set_num_teams and omp_{gs}et_teams_thread_limit. * config/nvptx/icv-device.c (omp_set_num_teams): Likewise. (omp_get_teams_thread_limit): Likewise. (omp_set_teams_thread_limit): Likewise. (ialias): Likewise. * env.c (struct gomp_default_icv_t): Added to hold default ICV values. (struct gomp_icv_list): Removed static. (omp_display_env): Renaming of used lists. (add_device_specific_icv): Removed static. (gomp_add_device_specific_icv): Removed static. (parse_device_specific): Renaming of used lists and added storing of parsed values in lists of modifiable ICV values. * icv-device.c (omp_set_num_teams): Added. (ialias): Added for omp_set_num_teams and omp_{gs}et_teams_thread_limit. (omp_get_teams_thread_limit): Added. (omp_set_teams_thread_limit): Added. * icv.c (omp_set_num_teams): Removed. (omp_set_teams_thread_limit): Removed. (omp_get_teams_thread_limit): Removed. (ialias): Removed for omp_set_num_teams and omp_{gs}et_teams_thread_limit. * libgomp-plugin.h (GOMP_TEAMS_THREAD_LIMIT_VAR): Added definition. * libgomp.h (struct gomp_default_icv_t): Added. (gomp_add_device_specific_icv): Added external declaration. * plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Added GOMP_TEAMS_THREAD_LIMIT_VAR. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise. * target.c (gomp_get_icv): Added parameter and changed behaviour considering also default ICV values. (gomp_get_or_add_icv_int): Added. Adds ICV values to a list of device-specific ICVs if they not already there. (gomp_load_image_to_device): Implemented copy back for device-specific ICVs which are changed on the device but needed on the host. Currently only nteams-var and teams-thread-limit-var are copied back. (gomp_copy_back_icv): Added. Copies back ICV values from target to device using the pointer mapping. (GOMP_target_ext): Changing TEAMS and THREADS in the arguments list considering device-specific ICV values. Also invokes the copy back for ICV values from target to host. * testsuite/libgomp.c-c++-common/icv-4.c: Fixed typo for getting environment variable OMP_TEAMS_THREAD_LIMIT. * testsuite/libgomp.c-c++-common/icv-5.c: Added several test cases now also including teams-thread-limit-var. * testsuite/libgomp.c-c++-common/icv-6.c: Likewise. * testsuite/libgomp.c-c++-common/icv-7.c: Likewise. * testsuite/libgomp.c-c++-common/icv-8.c: New test. * testsuite/libgomp.fortran/icv-5.f90: New test. * testsuite/libgomp.fortran/icv-6.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/target-teams-1.c: Changed num_teams(1) expectation to num_teams(-2) since now non-existing teams constructs are specified with num_teams(-2) instead of num_teams(1) which was ambigious. * g++.dg/gomp/target-teams-1.C: Likewise. * gfortran.dg/gomp/defaultmap-4.f90: Likewise. * gfortran.dg/gomp/defaultmap-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-6.f90: Likewise. diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index 2588824..91f75ec 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -13994,7 +13994,7 @@ optimize_target_teams (tree target, gimple_seq *pre_p) struct gimplify_omp_ctx *target_ctx = gimplify_omp_ctxp; if (teams == NULL_TREE) - num_teams_upper = integer_one_node; + num_teams_upper = integer_minus_two_node; else for (c = OMP_TEAMS_CLAUSES (teams); c; c = OMP_CLAUSE_CHAIN (c)) { diff --git a/gcc/testsuite/c-c++-common/gomp/target-teams-1.c b/gcc/testsuite/c-c++-common/gomp/target-teams-1.c index 51b8d48..74d60e1 100644 --- a/gcc/testsuite/c-c++-common/gomp/target-teams-1.c +++ b/gcc/testsuite/c-c++-common/gomp/target-teams-1.c @@ -81,5 +81,5 @@ foo (int a, int b, long c, long d) /* { dg-final { scan-tree-dump-times "thread_limit\\(-1\\)" 3 "gimple" } } */ /* { dg-final { scan-tree-dump-times "num_teams\\(0\\)" 4 "gimple" } } */ /* { dg-final { scan-tree-dump-times "thread_limit\\(0\\)" 6 "gimple" } } */ -/* { dg-final { scan-tree-dump-times "num_teams\\(1\\)" 2 "gimple" } } */ +/* { dg-final { scan-tree-dump-times "num_teams\\(-2\\)" 2 "gimple" } } */ /* { dg-final { scan-tree-dump-times "thread_limit\\(1\\)" 0 "gimple" } } */ diff --git a/gcc/testsuite/g++.dg/gomp/target-teams-1.C b/gcc/testsuite/g++.dg/gomp/target-teams-1.C index f78a608..29e5597 100644 --- a/gcc/testsuite/g++.dg/gomp/target-teams-1.C +++ b/gcc/testsuite/g++.dg/gomp/target-teams-1.C @@ -88,5 +88,5 @@ foo (int a, int b, long c, long d) /* { dg-final { scan-tree-dump-times "thread_limit\\(-1\\)" 3 "gimple" } } */ /* { dg-final { scan-tree-dump-times "num_teams\\(0\\)" 4 "gimple" } } */ /* { dg-final { scan-tree-dump-times "thread_limit\\(0\\)" 6 "gimple" } } */ -/* { dg-final { scan-tree-dump-times "num_teams\\(1\\)" 2 "gimple" } } */ +/* { dg-final { scan-tree-dump-times "num_teams\\(-2\\)" 2 "gimple" } } */ /* { dg-final { scan-tree-dump-times "thread_limit\\(1\\)" 0 "gimple" } } */ diff --git a/gcc/testsuite/gfortran.dg/gomp/defaultmap-4.f90 b/gcc/testsuite/gfortran.dg/gomp/defaultmap-4.f90 index 7b182b5..9081159 100644 --- a/gcc/testsuite/gfortran.dg/gomp/defaultmap-4.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/defaultmap-4.f90 @@ -141,5 +141,5 @@ end ! { dg-final { scan-tree-dump-times "map\\(to:\\.strxparr \\\[len:" 2 "gimple" } } ! { dg-final { scan-tree-dump-times "map\\(to:strxparr \\\[pointer set, len:" 2 "gimple" } } ! { dg-final { scan-tree-dump-times "map\\(to:\\.strxp \\\[len:" 2 "gimple" } } -! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(1\\) thread_limit\\(0\\) defaultmap\\(alloc\\)" 1 "gimple" } } -! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(1\\) thread_limit\\(0\\) defaultmap\\(alloc:scalar\\) defaultmap\\(to:aggregate\\) defaultmap\\(tofrom:allocatable\\) defaultmap\\(firstprivate:pointer\\)" 1 "gimple" } } +! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(-2\\) thread_limit\\(0\\) defaultmap\\(alloc\\)" 1 "gimple" } } +! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(-2\\) thread_limit\\(0\\) defaultmap\\(alloc:scalar\\) defaultmap\\(to:aggregate\\) defaultmap\\(tofrom:allocatable\\) defaultmap\\(firstprivate:pointer\\)" 1 "gimple" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/defaultmap-5.f90 b/gcc/testsuite/gfortran.dg/gomp/defaultmap-5.f90 index 1391274..91566ed 100644 --- a/gcc/testsuite/gfortran.dg/gomp/defaultmap-5.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/defaultmap-5.f90 @@ -141,5 +141,5 @@ end ! { dg-final { scan-tree-dump-times "map\\(to:strxparr \\\[pointer set, len:" 2 "gimple" } } ! { dg-final { scan-tree-dump-times "map\\(to:\\*strxp \\\[len:" 1 "gimple" } } ! { dg-final { scan-tree-dump-times "map\\(to:\\.strxp \\\[len:" 1 "gimple" } } -! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(1\\) thread_limit\\(0\\) defaultmap\\(to\\)" 1 "gimple" } } -! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(1\\) thread_limit\\(0\\) defaultmap\\(to:scalar\\) defaultmap\\(tofrom:aggregate\\) defaultmap\\(firstprivate:allocatable\\) defaultmap\\(default:pointer\\)" 1 "gimple" } } +! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(-2\\) thread_limit\\(0\\) defaultmap\\(to\\)" 1 "gimple" } } +! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(-2\\) thread_limit\\(0\\) defaultmap\\(to:scalar\\) defaultmap\\(tofrom:aggregate\\) defaultmap\\(firstprivate:allocatable\\) defaultmap\\(default:pointer\\)" 1 "gimple" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/defaultmap-6.f90 b/gcc/testsuite/gfortran.dg/gomp/defaultmap-6.f90 index 9a81d0f..867e41a 100644 --- a/gcc/testsuite/gfortran.dg/gomp/defaultmap-6.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/defaultmap-6.f90 @@ -101,4 +101,4 @@ end ! { dg-final { scan-tree-dump-times "map\\(to:\\.strxparr \\\[len:" 1 "gimple" } } ! { dg-final { scan-tree-dump-times "map\\(to:strxparr \\\[pointer set, len:" 1 "gimple" } } ! { dg-final { scan-tree-dump-times "map\\(to:\\.strxp \\\[len:" 1 "gimple" } } -! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(1\\) thread_limit\\(0\\) defaultmap\\(default\\)" 1 "gimple" } } +! { dg-final { scan-tree-dump-times "#pragma omp target num_teams\\(-2\\) thread_limit\\(0\\) defaultmap\\(default\\)" 1 "gimple" } } diff --git a/gcc/tree-core.h b/gcc/tree-core.h index f1c2b64..7aec940 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -642,6 +642,7 @@ enum tree_index { TI_INTEGER_ONE, TI_INTEGER_THREE, TI_INTEGER_MINUS_ONE, + TI_INTEGER_MINUS_TWO, TI_NULL_POINTER, TI_SIZE_ZERO, diff --git a/gcc/tree.cc b/gcc/tree.cc index 8f83ea1..8cb474d 100644 --- a/gcc/tree.cc +++ b/gcc/tree.cc @@ -9345,6 +9345,7 @@ build_common_tree_nodes (bool signed_char) integer_one_node = build_int_cst (integer_type_node, 1); integer_three_node = build_int_cst (integer_type_node, 3); integer_minus_one_node = build_int_cst (integer_type_node, -1); + integer_minus_two_node = build_int_cst (integer_type_node, -2); size_zero_node = size_int (0); size_one_node = size_int (1); diff --git a/gcc/tree.h b/gcc/tree.h index cea49a5..1aeb009 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -4206,6 +4206,7 @@ tree_strip_any_location_wrapper (tree exp) #define integer_one_node global_trees[TI_INTEGER_ONE] #define integer_three_node global_trees[TI_INTEGER_THREE] #define integer_minus_one_node global_trees[TI_INTEGER_MINUS_ONE] +#define integer_minus_two_node global_trees[TI_INTEGER_MINUS_TWO] #define size_zero_node global_trees[TI_SIZE_ZERO] #define size_one_node global_trees[TI_SIZE_ONE] #define bitsize_zero_node global_trees[TI_BITSIZE_ZERO] diff --git a/libgomp/config/gcn/icv-device.c b/libgomp/config/gcn/icv-device.c index 26b2e7a..a99aa0d 100644 --- a/libgomp/config/gcn/icv-device.c +++ b/libgomp/config/gcn/icv-device.c @@ -37,6 +37,7 @@ volatile int GOMP_DEFAULT_DEVICE_VAR; volatile int GOMP_MAX_ACTIVE_LEVELS_VAR; volatile omp_proc_bind_t GOMP_BIND_VAR; volatile int GOMP_NTEAMS_VAR; +volatile int GOMP_TEAMS_THREAD_LIMIT_VAR; void omp_set_default_device (int device_num __attribute__((unused))) @@ -80,6 +81,26 @@ omp_get_max_teams (void) return GOMP_NTEAMS_VAR; } +void +omp_set_num_teams (int num_teams) +{ + if (num_teams >= 0) + GOMP_NTEAMS_VAR = num_teams; +} + +int +omp_get_teams_thread_limit (void) +{ + return GOMP_TEAMS_THREAD_LIMIT_VAR; +} + +void +omp_set_teams_thread_limit (int thread_limit) +{ + if (thread_limit >= 0) + GOMP_TEAMS_THREAD_LIMIT_VAR = thread_limit; +} + ialias (omp_set_default_device) ialias (omp_get_default_device) ialias (omp_get_initial_device) @@ -87,3 +108,6 @@ ialias (omp_get_num_devices) ialias (omp_is_initial_device) ialias (omp_get_device_num) ialias (omp_get_max_teams) +ialias (omp_set_num_teams) +ialias (omp_get_teams_thread_limit) +ialias (omp_set_teams_thread_limit) diff --git a/libgomp/config/nvptx/icv-device.c b/libgomp/config/nvptx/icv-device.c index f744e75..4b1106b 100644 --- a/libgomp/config/nvptx/icv-device.c +++ b/libgomp/config/nvptx/icv-device.c @@ -37,6 +37,7 @@ static volatile int GOMP_DEFAULT_DEVICE_VAR; static volatile int GOMP_MAX_ACTIVE_LEVELS_VAR; static volatile omp_proc_bind_t GOMP_BIND_VAR; static volatile int GOMP_NTEAMS_VAR; +static volatile int GOMP_TEAMS_THREAD_LIMIT_VAR; void omp_set_default_device (int device_num __attribute__((unused))) @@ -80,6 +81,26 @@ omp_get_max_teams (void) return GOMP_NTEAMS_VAR; } +void +omp_set_num_teams (int num_teams) +{ + if (num_teams >= 0) + GOMP_NTEAMS_VAR = num_teams; +} + +int +omp_get_teams_thread_limit (void) +{ + return GOMP_TEAMS_THREAD_LIMIT_VAR; +} + +void +omp_set_teams_thread_limit (int thread_limit) +{ + if (thread_limit >= 0) + GOMP_TEAMS_THREAD_LIMIT_VAR = thread_limit; +} + ialias (omp_set_default_device) ialias (omp_get_default_device) ialias (omp_get_initial_device) @@ -87,3 +108,6 @@ ialias (omp_get_num_devices) ialias (omp_is_initial_device) ialias (omp_get_device_num) ialias (omp_get_max_teams) +ialias (omp_set_num_teams) +ialias (omp_get_teams_thread_limit) +ialias (omp_set_teams_thread_limit) diff --git a/libgomp/env.c b/libgomp/env.c index 74803ec..f3a86fa 100644 --- a/libgomp/env.c +++ b/libgomp/env.c @@ -84,6 +84,16 @@ struct gomp_initial_icv_t gomp_initial_icv_all; by using environment variables like OMP_NUM_TEAMS_DEV. */ struct gomp_initial_icv_t gomp_initial_icv_dev; +/* Default values of ICVs according to the OpenMP standard. */ +struct gomp_default_icv_t gomp_default_icv_values = { + .run_sched_var = GFS_DYNAMIC, + .run_sched_chunk_size = 1, + .max_active_levels_var = 1, + .bind_var = omp_proc_bind_false, + .nteams_var = 0, + .teams_thread_limit_var = 0 +}; + /* Returns the element of the list for the specified device number. */ struct gomp_icv_list* gomp_get_icv_list (struct gomp_icv_list **list, int device_num) @@ -109,21 +119,25 @@ gomp_get_icv_value_ptr (struct gomp_icv_list **list, int device_num) /* Lists for initial device-specific ICVs, i.e. ICVs that are configured for particular devices (with environment variables like OMP_NUM_TEAMS_DEV_42). */ -struct gomp_icv_list *gomp_dyn_var_dev_list = NULL; -struct gomp_icv_list *gomp_nthreads_var_dev_list = NULL; -struct gomp_icv_list *gomp_nthreads_var_list_dev_list = NULL; -struct gomp_icv_list *gomp_nthreads_var_list_len_dev_list = NULL; -struct gomp_icv_list *gomp_run_sched_var_dev_list = NULL; -struct gomp_icv_list *gomp_run_sched_chunk_size_dev_list = NULL; -struct gomp_icv_list *gomp_nteams_var_dev_list = NULL; -struct gomp_icv_list *gomp_thread_limit_var_dev_list = NULL; -struct gomp_icv_list *gomp_max_active_levels_var_dev_list = NULL; -struct gomp_icv_list *gomp_proc_bind_var_dev_list = NULL; -struct gomp_icv_list *gomp_proc_bind_var_list_dev_list = NULL; -struct gomp_icv_list *gomp_proc_bind_var_list_len_dev_list = NULL; +struct gomp_icv_list *gomp_initial_dyn_var_dev_list = NULL; +struct gomp_icv_list *gomp_initial_nthreads_var_dev_list = NULL; +struct gomp_icv_list *gomp_initial_nthreads_var_list_dev_list = NULL; +struct gomp_icv_list *gomp_initial_nthreads_var_list_len_dev_list = NULL; +struct gomp_icv_list *gomp_initial_run_sched_var_dev_list = NULL; +struct gomp_icv_list *gomp_initial_run_sched_chunk_size_dev_list = NULL; +struct gomp_icv_list *gomp_initial_nteams_var_dev_list = NULL; +struct gomp_icv_list *gomp_initial_thread_limit_var_dev_list = NULL; +struct gomp_icv_list *gomp_initial_max_active_levels_var_dev_list = NULL; +struct gomp_icv_list *gomp_initial_proc_bind_var_dev_list = NULL; +struct gomp_icv_list *gomp_initial_proc_bind_var_list_dev_list = NULL; +struct gomp_icv_list *gomp_initial_proc_bind_var_list_len_dev_list = NULL; struct gomp_icv_list *stacksize_dev_list = NULL; struct gomp_icv_list *wait_policy_dev_list = NULL; -struct gomp_icv_list *teams_thread_limit_var_dev_list = NULL; +struct gomp_icv_list *gomp_initial_teams_thread_limit_var_dev_list = NULL; + +/* Lists for device-specific ICVs which can be changed, e.g. on the device. */ +struct gomp_icv_list *gomp_nteams_var_dev_list = NULL; +struct gomp_icv_list *gomp_teams_thread_limit_var_dev_list = NULL; /* Flags for non-global ICVs to store by which environment variables they are affected. */ @@ -1425,7 +1439,7 @@ omp_display_env (int verbose) if (gomp_initial_icv_flags.dyn_var & GOMP_ENV_VAR_SUFFIX_DEV) fprintf (stderr, " [device] OMP_DYNAMIC = '%s'\n", gomp_initial_icv_dev.dyn_var ? "TRUE" : "FALSE"); - struct gomp_icv_list* l_dyn_var = gomp_dyn_var_dev_list; + struct gomp_icv_list* l_dyn_var = gomp_initial_dyn_var_dev_list; while (l_dyn_var != NULL) { fprintf (stderr, " [%d] OMP_DYNAMIC = '%s'\n", l_dyn_var->device_num, @@ -1458,19 +1472,19 @@ omp_display_env (int verbose) fprintf (stderr, ",%lu", gomp_initial_icv_dev.nthreads_var_list[i]); fputs ("'\n", stderr); } - struct gomp_icv_list* l_nthreads_var = gomp_nthreads_var_dev_list; + struct gomp_icv_list* l_nthreads_var = gomp_initial_nthreads_var_dev_list; while (l_nthreads_var != NULL) { fprintf (stderr, " [%d] OMP_NUM_THREADS = '%lu", l_nthreads_var->device_num, *(unsigned long*)l_nthreads_var->value); struct gomp_icv_list *len - = gomp_get_icv_list (&gomp_nthreads_var_list_len_dev_list, + = gomp_get_icv_list (&gomp_initial_nthreads_var_list_len_dev_list, l_nthreads_var->device_num); if (len != NULL) { struct gomp_icv_list *list - = gomp_get_icv_list (&gomp_nthreads_var_list_dev_list, + = gomp_get_icv_list (&gomp_initial_nthreads_var_list_dev_list, l_nthreads_var->device_num); for (i = 1; i < *(unsigned long*)len->value; i++) fprintf (stderr, ",%lu", (*(unsigned long**)list->value)[i]); @@ -1487,11 +1501,11 @@ omp_display_env (int verbose) if (gomp_initial_icv_flags.run_sched_var & GOMP_ENV_VAR_SUFFIX_DEV) print_schedule (gomp_initial_icv_dev.run_sched_var, gomp_initial_icv_dev.run_sched_chunk_size, "device"); - struct gomp_icv_list* l_run_sched_var = gomp_run_sched_var_dev_list; + struct gomp_icv_list* l_run_sched_var = gomp_initial_run_sched_var_dev_list; while (l_run_sched_var != NULL) { struct gomp_icv_list* l_run_sched_chunk_size - = gomp_get_icv_list (&gomp_run_sched_chunk_size_dev_list, + = gomp_get_icv_list (&gomp_initial_run_sched_chunk_size_dev_list, l_run_sched_var->device_num); int chunk_size = *(int*)l_run_sched_chunk_size->value; char dev_num[10]; @@ -1513,14 +1527,14 @@ omp_display_env (int verbose) print_proc_bind (gomp_initial_icv_dev.bind_var, gomp_initial_icv_dev.bind_var_list_len, &gomp_initial_icv_dev.bind_var_list, "device"); - struct gomp_icv_list* l_proc_bind_var = gomp_proc_bind_var_dev_list; + struct gomp_icv_list* l_proc_bind_var = gomp_initial_proc_bind_var_dev_list; while (l_proc_bind_var != NULL) { struct gomp_icv_list *list - = gomp_get_icv_list (&gomp_proc_bind_var_list_dev_list, + = gomp_get_icv_list (&gomp_initial_proc_bind_var_list_dev_list, l_proc_bind_var->device_num); struct gomp_icv_list *len - = gomp_get_icv_list (&gomp_proc_bind_var_list_len_dev_list, + = gomp_get_icv_list (&gomp_initial_proc_bind_var_list_len_dev_list, l_proc_bind_var->device_num); char dev_num[10]; sprintf (dev_num, "%d", l_proc_bind_var->device_num); @@ -1581,7 +1595,7 @@ omp_display_env (int verbose) if (gomp_initial_icv_flags.thread_limit_var & GOMP_ENV_VAR_SUFFIX_DEV) fprintf (stderr, " [device] OMP_THREAD_LIMIT = '%d'\n", gomp_initial_icv_dev.thread_limit_var); - struct gomp_icv_list* l_thread_limit = gomp_thread_limit_var_dev_list; + struct gomp_icv_list* l_thread_limit = gomp_initial_thread_limit_var_dev_list; while (l_thread_limit != NULL) { fprintf (stderr, " [%d] OMP_THREAD_LIMIT = '%d'\n", @@ -1598,7 +1612,7 @@ omp_display_env (int verbose) fprintf (stderr, " [device] OMP_MAX_ACTIVE_LEVELS = '%u'\n", gomp_initial_icv_dev.max_active_levels_var); struct gomp_icv_list* l_max_active_levels - = gomp_max_active_levels_var_dev_list; + = gomp_initial_max_active_levels_var_dev_list; while (l_max_active_levels != NULL) { fprintf (stderr, " [%d] OMP_MAX_ACTIVE_LEVELS = '%u'\n", @@ -1615,7 +1629,7 @@ omp_display_env (int verbose) if (gomp_initial_icv_flags.nteams_var & GOMP_ENV_VAR_SUFFIX_DEV) fprintf (stderr, " [device] OMP_NUM_TEAMS = '%d'\n", gomp_initial_icv_dev.nteams_var); - struct gomp_icv_list* l_nteams_var = gomp_nteams_var_dev_list; + struct gomp_icv_list* l_nteams_var = gomp_initial_nteams_var_dev_list; while (l_nteams_var != NULL) { fprintf (stderr, " [%d] OMP_NUM_TEAMS = '%d'\n", @@ -1631,7 +1645,8 @@ omp_display_env (int verbose) if (gomp_initial_icv_flags.teams_thread_limit_var & GOMP_ENV_VAR_SUFFIX_DEV) fprintf (stderr, " [device] OMP_TEAMS_THREAD_LIMIT = '%u'\n", gomp_initial_icv_dev.teams_thread_limit_var); - struct gomp_icv_list* l_teams_thr_limit = teams_thread_limit_var_dev_list; + struct gomp_icv_list* l_teams_thr_limit = + gomp_initial_teams_thread_limit_var_dev_list; while (l_teams_thr_limit != NULL) { fprintf (stderr, " [%d] OMP_TEAMS_THREAD_LIMIT = '%u'\n", @@ -1764,8 +1779,9 @@ get_device_num (char **env, int prefix_len, int *dev_num, int *name_len) /* Helper function for parse_device_specific. Adds a new node to the given list. */ -static struct gomp_icv_list* -add_device_specific_icv (int dev_num, size_t size, struct gomp_icv_list **list) +struct gomp_icv_list * +gomp_add_device_specific_icv (int dev_num, size_t size, + struct gomp_icv_list **list) { if (list == NULL) return NULL; @@ -1801,12 +1817,12 @@ parse_device_specific () enum gomp_schedule_type schedule_type; int chunk_size; parse_schedule (name, &schedule_type, &chunk_size); - new_node = add_device_specific_icv (dev_num, + new_node = gomp_add_device_specific_icv (dev_num, sizeof (enum gomp_schedule_type), - &gomp_run_sched_var_dev_list); + &gomp_initial_run_sched_var_dev_list); *((enum gomp_schedule_type*)(new_node->value)) = schedule_type; - new_node = add_device_specific_icv (dev_num, sizeof (int), - &gomp_run_sched_chunk_size_dev_list); + new_node = gomp_add_device_specific_icv (dev_num, sizeof (int), + &gomp_initial_run_sched_chunk_size_dev_list); *((int*)(new_node->value)) = chunk_size; goto next; } @@ -1819,8 +1835,8 @@ parse_device_specific () bool value; if (!parse_boolean (name, &value)) continue; - new_node = add_device_specific_icv (dev_num, sizeof (bool), - &gomp_dyn_var_dev_list); + new_node = gomp_add_device_specific_icv (dev_num, sizeof (bool), + &gomp_initial_dyn_var_dev_list); *((bool*)(new_node->value)) = value; goto next; } @@ -1834,8 +1850,8 @@ parse_device_specific () if (!parse_unsigned_long (name, &value, false)) continue; value = value > INT_MAX ? UINT_MAX : value; - new_node = add_device_specific_icv (dev_num, sizeof (unsigned long), - &gomp_thread_limit_var_dev_list); + new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long), + &gomp_initial_thread_limit_var_dev_list); *((unsigned long*)(new_node->value)) = value; goto next; } @@ -1850,17 +1866,17 @@ parse_device_specific () unsigned long nvalues = 0; if (!parse_unsigned_long_list (name, &value, &pvalues, &nvalues)) continue; - new_node = add_device_specific_icv (dev_num, sizeof (unsigned long), - &gomp_nthreads_var_dev_list); + new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long), + &gomp_initial_nthreads_var_dev_list); *((unsigned long*)(new_node->value)) = value; if (nvalues > 0) { - new_node = add_device_specific_icv (dev_num, sizeof (unsigned long*), - &gomp_nthreads_var_list_dev_list); + new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long*), + &gomp_initial_nthreads_var_list_dev_list); *((unsigned long**)(new_node->value)) = pvalues; - new_node = add_device_specific_icv (dev_num, sizeof (unsigned long*), - &gomp_nthreads_var_list_len_dev_list); + new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long*), + &gomp_initial_nthreads_var_list_len_dev_list); *((unsigned long*)(new_node->value)) = nvalues; } goto next; @@ -1874,7 +1890,10 @@ parse_device_specific () int value; if (!parse_int (name, &value, false)) continue; - new_node = add_device_specific_icv (dev_num, sizeof (int), + new_node = gomp_add_device_specific_icv (dev_num, sizeof (int), + &gomp_initial_nteams_var_dev_list); + *((int*)(new_node->value)) = value; + new_node = gomp_add_device_specific_icv (dev_num, sizeof (int), &gomp_nteams_var_dev_list); *((int*)(new_node->value)) = value; goto next; @@ -1904,18 +1923,18 @@ parse_device_specific () if (value == omp_proc_bind_false) value = true; - new_node = add_device_specific_icv (dev_num, sizeof (char), - &gomp_proc_bind_var_dev_list); + new_node = gomp_add_device_specific_icv (dev_num, sizeof (char), + &gomp_initial_proc_bind_var_dev_list); *((char*)(new_node->value)) = value; if (nvalues > 0) { - new_node = add_device_specific_icv (dev_num, sizeof (char*), - &gomp_proc_bind_var_list_dev_list); + new_node = gomp_add_device_specific_icv (dev_num, sizeof (char*), + &gomp_initial_proc_bind_var_list_dev_list); *((char**)(new_node->value)) = pvalues; - new_node = add_device_specific_icv (dev_num, sizeof (unsigned long), - &gomp_proc_bind_var_list_len_dev_list); + new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long), + &gomp_initial_proc_bind_var_list_len_dev_list); *((unsigned long*)(new_node->value)) = nvalues; } goto next; @@ -1931,8 +1950,8 @@ parse_device_specific () continue; value = (value > gomp_supported_active_levels) ? gomp_supported_active_levels : value; - new_node = add_device_specific_icv (dev_num, sizeof (unsigned long), - &gomp_max_active_levels_var_dev_list); + new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long), + &gomp_initial_max_active_levels_var_dev_list); *((unsigned long*)(new_node->value)) = value; goto next; } @@ -1945,7 +1964,7 @@ parse_device_specific () unsigned long value; if (!parse_stacksize (name, &value)) continue; - new_node = add_device_specific_icv (dev_num, sizeof (unsigned long), + new_node = gomp_add_device_specific_icv (dev_num, sizeof (unsigned long), &stacksize_dev_list); *((unsigned long*)(new_node->value)) = value; goto next; @@ -1959,7 +1978,7 @@ parse_device_specific () int value; if (!parse_wait_policy (name, &value)) continue; - new_node = add_device_specific_icv (dev_num, sizeof (int), + new_node = gomp_add_device_specific_icv (dev_num, sizeof (int), &wait_policy_dev_list); *((int*)(new_node->value)) = value; goto next; @@ -1973,8 +1992,12 @@ parse_device_specific () int value; if (!parse_int (name, &value, false)) continue; - new_node = add_device_specific_icv (dev_num, sizeof (int), - &teams_thread_limit_var_dev_list); + new_node = gomp_add_device_specific_icv ( + dev_num, sizeof (int), + &gomp_initial_teams_thread_limit_var_dev_list); + *((int*)(new_node->value)) = value; + new_node = gomp_add_device_specific_icv (dev_num, sizeof (int), + &gomp_teams_thread_limit_var_dev_list); *((int*)(new_node->value)) = value; goto next; } diff --git a/libgomp/icv-device.c b/libgomp/icv-device.c index bc1af97..10b2e33 100644 --- a/libgomp/icv-device.c +++ b/libgomp/icv-device.c @@ -88,3 +88,29 @@ omp_get_max_teams (void) } ialias (omp_get_max_teams) + +void +omp_set_num_teams (int num_teams) +{ + if (num_teams >= 0) + gomp_nteams_var = num_teams; +} + +ialias (omp_set_num_teams) + +int +omp_get_teams_thread_limit (void) +{ + return gomp_teams_thread_limit_var; +} + +ialias (omp_get_teams_thread_limit) + +void +omp_set_teams_thread_limit (int thread_limit) +{ + if (thread_limit >= 0) + gomp_teams_thread_limit_var = thread_limit; +} + +ialias (omp_set_teams_thread_limit) diff --git a/libgomp/icv.c b/libgomp/icv.c index 5c435d6..9aef91c 100644 --- a/libgomp/icv.c +++ b/libgomp/icv.c @@ -148,26 +148,6 @@ omp_get_supported_active_levels (void) return gomp_supported_active_levels; } -void -omp_set_num_teams (int num_teams) -{ - if (num_teams >= 0) - gomp_nteams_var = num_teams; -} - -void -omp_set_teams_thread_limit (int thread_limit) -{ - if (thread_limit >= 0) - gomp_teams_thread_limit_var = thread_limit; -} - -int -omp_get_teams_thread_limit (void) -{ - return gomp_teams_thread_limit_var; -} - int omp_get_cancellation (void) { @@ -268,9 +248,6 @@ ialias (omp_get_thread_limit) ialias (omp_set_max_active_levels) ialias (omp_get_max_active_levels) ialias (omp_get_supported_active_levels) -ialias (omp_set_num_teams) -ialias (omp_set_teams_thread_limit) -ialias (omp_get_teams_thread_limit) ialias (omp_get_cancellation) ialias (omp_get_proc_bind) ialias (omp_get_max_task_priority) diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h index 072cc47..75d7663 100644 --- a/libgomp/libgomp-plugin.h +++ b/libgomp/libgomp-plugin.h @@ -116,6 +116,7 @@ struct addr_pair #define GOMP_MAX_ACTIVE_LEVELS_VAR __gomp_max_active_levels #define GOMP_BIND_VAR __gomp_bind #define GOMP_NTEAMS_VAR __gomp_nteams +#define GOMP_TEAMS_THREAD_LIMIT_VAR __gomp_teams_thread_limit_var /* Miscellaneous functions. */ extern void *GOMP_PLUGIN_malloc (size_t) __attribute__ ((malloc)); diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index 89d1453..b585c05 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -504,6 +504,17 @@ struct gomp_initial_icv_t unsigned long stacksize; }; +struct gomp_default_icv_t +{ + enum gomp_schedule_type run_sched_var; + int run_sched_chunk_size; + unsigned char max_active_levels_var; + char bind_var; + int nteams_var; + int teams_thread_limit_var; +}; +extern struct gomp_default_icv_t gomp_default_icv_values; + struct gomp_icv_flags_t { enum gomp_env_var_suffix_t nthreads_var; @@ -527,13 +538,19 @@ struct gomp_icv_list { extern void *gomp_get_icv_value_ptr (struct gomp_icv_list **list, int device_num); -extern struct gomp_icv_list *gomp_run_sched_var_dev_list; -extern struct gomp_icv_list *gomp_run_sched_chunk_size_dev_list; +extern struct gomp_icv_list* gomp_add_device_specific_icv (int dev_num, + size_t size, + struct gomp_icv_list **list); +extern struct gomp_icv_list *gomp_initial_run_sched_var_dev_list; +extern struct gomp_icv_list *gomp_initial_run_sched_chunk_size_dev_list; +extern struct gomp_icv_list *gomp_initial_max_active_levels_var_dev_list; +extern struct gomp_icv_list *gomp_initial_proc_bind_var_dev_list; +extern struct gomp_icv_list *gomp_initial_proc_bind_var_list_dev_list; +extern struct gomp_icv_list *gomp_initial_proc_bind_var_list_len_dev_list; +extern struct gomp_icv_list *gomp_initial_nteams_var_dev_list; + extern struct gomp_icv_list *gomp_nteams_var_dev_list; -extern struct gomp_icv_list *gomp_max_active_levels_var_dev_list; -extern struct gomp_icv_list *gomp_proc_bind_var_dev_list; -extern struct gomp_icv_list *gomp_proc_bind_var_list_dev_list; -extern struct gomp_icv_list *gomp_proc_bind_var_list_len_dev_list; +extern struct gomp_icv_list *gomp_teams_thread_limit_var_dev_list; extern struct gomp_initial_icv_t gomp_initial_icv_all; extern struct gomp_initial_icv_t gomp_initial_icv_dev; diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c index 44ab369..3236c38 100644 --- a/libgomp/plugin/plugin-gcn.c +++ b/libgomp/plugin/plugin-gcn.c @@ -572,7 +572,8 @@ static char *GOMP_ICV_STRINGS[] = XSTRING (GOMP_DYN_VAR), XSTRING (GOMP_MAX_ACTIVE_LEVELS_VAR), XSTRING (GOMP_BIND_VAR), - XSTRING (GOMP_NTEAMS_VAR) + XSTRING (GOMP_NTEAMS_VAR), + XSTRING (GOMP_TEAMS_THREAD_LIMIT_VAR) }; /* }}} */ @@ -3371,7 +3372,7 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data, struct kernel_info *kernel; int kernel_count = image_desc->kernel_count; unsigned var_count = image_desc->global_variable_count; - int other_count = 10; + int other_count = 11; agent = get_agent_info (ord); if (!agent) diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index 9d603fd..ced24ca 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -346,7 +346,8 @@ static char *GOMP_ICV_STRINGS[] = XSTRING (GOMP_DYN_VAR), XSTRING (GOMP_MAX_ACTIVE_LEVELS_VAR), XSTRING (GOMP_BIND_VAR), - XSTRING (GOMP_NTEAMS_VAR) + XSTRING (GOMP_NTEAMS_VAR), + XSTRING (GOMP_TEAMS_THREAD_LIMIT_VAR) }; static inline struct nvptx_thread * @@ -1316,7 +1317,7 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data, fn_descs = img_header->fn_descs; /* Currently, other entry kinds are 'device number' and further ICVs. */ - other_entries = 10; + other_entries = 11; targ_tbl = GOMP_PLUGIN_malloc (sizeof (struct addr_pair) * (fn_entries + var_entries + other_entries)); diff --git a/libgomp/target.c b/libgomp/target.c index 4a9e533..1a6afe6 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -2077,13 +2077,51 @@ gomp_update (struct gomp_device_descr *devicep, size_t mapnum, void **hostaddrs, (_DEV_42, _DEV, _ALL). */ static void* gomp_get_icv (struct gomp_icv_list **list, int dev_num, - enum gomp_env_var_suffix_t flag, void *dev_val, void *all_val) + enum gomp_env_var_suffix_t flag, void *dev_val, void *all_val, + void *default_val) { void *val = gomp_get_icv_value_ptr (list, dev_num); if (val == NULL && (flag & GOMP_ENV_VAR_SUFFIX_DEV)) - val = dev_val; + val = dev_val; if (val == NULL && (flag & GOMP_ENV_VAR_SUFFIX_ALL)) - val = all_val; + val = all_val; + if (val == NULL) + val = default_val; + if (val == NULL) + { + gomp_error ("Expected value for ICV."); + return NULL; + } + return val; +} + +/* Helper function for 'gomp_load_image_to_device'. Similar to gomp_get_icv but + also adds the ICV value to LIST if not already there. */ +static void* +gomp_get_or_add_icv_int (struct gomp_icv_list **list, int dev_num, + enum gomp_env_var_suffix_t flag, void *dev_val, + void *all_val, void *default_val) +{ + void *val = gomp_get_icv_value_ptr (list, dev_num); + if (val == NULL) + { + if (flag & GOMP_ENV_VAR_SUFFIX_DEV) + val = dev_val; + if (val == NULL && (flag & GOMP_ENV_VAR_SUFFIX_ALL)) + val = all_val; + if (val == NULL) + val = default_val; + if (val == NULL) + { + gomp_error ("Expected value for ICV."); + return NULL; + } + + struct gomp_icv_list *new_node; + new_node = gomp_add_device_specific_icv (dev_num, sizeof (int), list); + *((int*)(new_node->value)) = *(int*)val; + val = new_node->value; + } return val; } @@ -2107,8 +2145,12 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version, int num_funcs = host_funcs_end - host_func_table; int num_vars = (host_vars_end - host_var_table) / 2; - /* Others currently is only 'device_num' */ - int num_others = 10; + /* Others is 'device_num' and further ICVs. */ + int num_others = 11; + + /* Number of ICVs, which need to be copied back to the host. + Currently nteams-var and teams-thread-limit-var. */ + int num_ICVs_copied_back = 2; /* Load image to device and get target addresses for the image. */ struct addr_pair *target_table = NULL; @@ -2132,7 +2174,8 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version, /* Insert host-target address mapping into splay tree. */ struct target_mem_desc *tgt = gomp_malloc (sizeof (*tgt)); - tgt->array = gomp_malloc ((num_funcs + num_vars) * sizeof (*tgt->array)); + tgt->array = gomp_malloc ((num_funcs + num_vars + num_ICVs_copied_back) + * sizeof (*tgt->array)); tgt->refcount = REFCOUNT_INFINITY; tgt->tgt_start = 0; tgt->tgt_end = 0; @@ -2204,6 +2247,10 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version, for (i = 0; i < num_others; ++i) { struct addr_pair *var = &target_table[num_funcs + num_vars + i]; + /* COPY_BACK_SIZE is used if the device-specific ICV needs to be + copied back from device to host. */ + size_t copy_back_size = 0; + /* Start address will be non-zero for the current entry if the variable was found in this image. */ if (var->start != 0) @@ -2223,16 +2270,23 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version, values that are set during kernel entering. */ break; case 3: // GOMP_RUN_SCHED_VAR - val = gomp_get_icv (&gomp_run_sched_var_dev_list, dev_num, - gomp_initial_icv_flags.run_sched_var, - &gomp_initial_icv_dev.run_sched_var, - &gomp_initial_icv_all.run_sched_var); + val = + gomp_get_icv (&gomp_initial_run_sched_var_dev_list, + dev_num, + gomp_initial_icv_flags.run_sched_var, + &gomp_initial_icv_dev.run_sched_var, + &gomp_initial_icv_all.run_sched_var, + &gomp_default_icv_values.run_sched_var); break; case 4: // GOMP_RUN_SCHED_CHUNK_SIZE - val = gomp_get_icv (&gomp_run_sched_chunk_size_dev_list, dev_num, - gomp_initial_icv_flags.run_sched_chunk_size, - &gomp_initial_icv_dev.run_sched_chunk_size, - &gomp_initial_icv_all.run_sched_chunk_size); + val = + gomp_get_icv (&gomp_initial_run_sched_chunk_size_dev_list, + dev_num, + gomp_initial_icv_flags.run_sched_chunk_size, + &gomp_initial_icv_dev.run_sched_chunk_size, + &gomp_initial_icv_all.run_sched_chunk_size, + &gomp_default_icv_values. + run_sched_chunk_size); break; case 5: // GOMP_DEFAULT_DEVICE_VAR val = &gomp_global_icv.default_device_var; @@ -2242,22 +2296,41 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version, values that are set during kernel entering. */ break; case 7: // GOMP_MAX_ACTIVE_LEVELS_VAR - val = gomp_get_icv (&gomp_max_active_levels_var_dev_list, dev_num, - gomp_initial_icv_flags.max_active_levels_var, - &gomp_initial_icv_dev.max_active_levels_var, - &gomp_initial_icv_all.max_active_levels_var); + val = + gomp_get_icv + (&gomp_initial_max_active_levels_var_dev_list, dev_num, + gomp_initial_icv_flags.max_active_levels_var, + &gomp_initial_icv_dev.max_active_levels_var, + &gomp_initial_icv_all.max_active_levels_var, + &gomp_default_icv_values.max_active_levels_var); break; case 8: // GOMP_BIND_VAR - val = gomp_get_icv (&gomp_proc_bind_var_dev_list, dev_num, + val = gomp_get_icv (&gomp_initial_proc_bind_var_dev_list, + dev_num, gomp_initial_icv_flags.bind_var, &gomp_initial_icv_dev.bind_var, - &gomp_initial_icv_all.bind_var); + &gomp_initial_icv_all.bind_var, + &gomp_default_icv_values.bind_var); break; case 9: // GOMP_NTEAMS_VAR - val = gomp_get_icv (&gomp_nteams_var_dev_list, dev_num, - gomp_initial_icv_flags.nteams_var, - &gomp_initial_icv_dev.nteams_var, - &gomp_initial_icv_all.nteams_var); + val = + gomp_get_or_add_icv_int + (&gomp_nteams_var_dev_list, dev_num, + gomp_initial_icv_flags.nteams_var, + &gomp_initial_icv_dev.nteams_var, + &gomp_initial_icv_all.nteams_var, + &gomp_default_icv_values.nteams_var); + copy_back_size = sizeof (int); + break; + case 10: // GOMP_TEAMS_THREAD_LIMIT_VAR + val = + gomp_get_or_add_icv_int + (&gomp_teams_thread_limit_var_dev_list, dev_num, + gomp_initial_icv_flags.teams_thread_limit_var, + &gomp_initial_icv_dev.teams_thread_limit_var, + &gomp_initial_icv_all.teams_thread_limit_var, + &gomp_default_icv_values.teams_thread_limit_var); + copy_back_size = sizeof (int); break; } if (val != NULL) @@ -2267,6 +2340,22 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version, actually designating its device number into effect. */ gomp_copy_host2dev (devicep, NULL, (void *) var->start, val, var_size, false, NULL); + if (copy_back_size) + { + splay_tree_key k = &array->key; + k->host_start = (uintptr_t) val; + k->host_end = + k->host_start + (size_mask & copy_back_size); + k->tgt = tgt; + k->tgt_offset = var->start; + k->refcount = REFCOUNT_INFINITY; + k->dynamic_refcount = 0; + k->aux = NULL; + array->left = NULL; + array->right = NULL; + splay_tree_insert (&devicep->mem_map, array); + array++; + } } } } @@ -2661,6 +2750,16 @@ clear_unsupported_flags (struct gomp_device_descr *devicep, unsigned int flags) return flags; } +static void gomp_copy_back_icv (struct gomp_icv_list **list, + struct gomp_device_descr *devicep, int device, + size_t size) +{ + void *host_ptr = gomp_get_icv_value_ptr (list, device); + void *dev_ptr = omp_get_mapped_ptr (host_ptr, device); + if (dev_ptr != NULL) + gomp_copy_dev2host (devicep, NULL, host_ptr, dev_ptr, size); +} + /* Like GOMP_target, but KINDS is 16-bit, UNUSED is no longer present, and several arguments have been added: FLAGS is a bitmask, see GOMP_TARGET_FLAG_* in gomp-constants.h. @@ -2693,6 +2792,144 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum, size_t tgt_align = 0, tgt_size = 0; bool fpc_done = false; + /* Obtain the original TEAMS and THREADS values from ARGS. */ + intptr_t orig_teams = 1, orig_threads = 0; + size_t num_args = 0, len = 1, teams_len = 1, threads_len = 1; + void **tmpargs = args; + while (*tmpargs) + { + intptr_t id = (intptr_t) *tmpargs++, val; + if (id & GOMP_TARGET_ARG_SUBSEQUENT_PARAM) + { + val = (intptr_t) *tmpargs++; + len = 2; + } + else + { + val = id >> GOMP_TARGET_ARG_VALUE_SHIFT; + len = 1; + } + num_args += len; + if ((id & GOMP_TARGET_ARG_DEVICE_MASK) != GOMP_TARGET_ARG_DEVICE_ALL) + continue; + val = val > INT_MAX ? INT_MAX : val; + if ((id & GOMP_TARGET_ARG_ID_MASK) == GOMP_TARGET_ARG_NUM_TEAMS) + { + orig_teams = val; + teams_len = len; + } + else if ((id & GOMP_TARGET_ARG_ID_MASK) == GOMP_TARGET_ARG_THREAD_LIMIT) + { + orig_threads = val; + threads_len = len; + } + } + + intptr_t new_teams = orig_teams, new_threads = orig_threads; + /* ORIG_TEAMS == -2: No explicit teams construct specified. Set to 1. + ORIG_TEAMS == -1: Teams construct with NUM_TEAMS clause specified, but the + value could not be specified. No Change. + ORIG_TEAMS == 0: TEAMS construct without NUM_TEAMS clause. + Set device-specific value. + ORIG_TEAMS > 0: Value was already set through e.g. NUM_TEAMS clause. + No change. */ + if (orig_teams == -2) + new_teams = 1; + else if (orig_teams == 0) + { + void *dev_val = + gomp_get_icv_value_ptr (&gomp_nteams_var_dev_list, device); + if (dev_val != NULL) + new_teams = *((int*) dev_val); + } + /* The device-specific teams-thread-limit is only set if (a) an explicit TEAMS + region exists, i.e. ORIG_TEAMS > -2, and (b) THREADS was not already set by + e.g. a THREAD_LIMIT clause. */ + if (orig_teams >= -2 && orig_threads == 0) + { + void *dev_val = + gomp_get_icv_value_ptr (&gomp_teams_thread_limit_var_dev_list, device); + if (dev_val != NULL) + new_threads = *((int*) dev_val); + } + + /* Copy and change the arguments list only if TEAMS or THREADS need to be + updated. */ + void **new_args = args; + if (orig_teams != new_teams || orig_threads != new_threads) + { + size_t tms_len = (orig_teams == new_teams + ? teams_len + : (new_teams > -(1 << 15) && new_teams < (1 << 15) + ? 1 : 2)); + size_t ths_len = (orig_threads == new_threads + ? threads_len + : (new_threads > -(1 << 15) && new_threads < (1 << 15) + ? 1 : 2)); + /* One additional item since the last arg must be NULL. */ + size_t new_args_cnt = num_args - teams_len - threads_len + tms_len + + ths_len + 1; + new_args = (void **) gomp_alloca (new_args_cnt * sizeof (void*)); + + tmpargs = args; + void **tmp_new_args = new_args; + while (*tmpargs) + { + intptr_t id = (intptr_t) *tmpargs; + if (((id & GOMP_TARGET_ARG_ID_MASK) == GOMP_TARGET_ARG_NUM_TEAMS + && orig_teams != new_teams) + || ((id & GOMP_TARGET_ARG_ID_MASK) == GOMP_TARGET_ARG_THREAD_LIMIT + && orig_threads != new_threads)) + { + tmpargs++; + if (id & GOMP_TARGET_ARG_SUBSEQUENT_PARAM) + tmpargs++; + } + else + { + *tmp_new_args++ = *tmpargs++; + if (id & GOMP_TARGET_ARG_SUBSEQUENT_PARAM) + *tmp_new_args++ = *tmpargs++; + } + } + + if (orig_teams != new_teams) + { + intptr_t new_val = new_teams; + if (tms_len == 1) + { + new_val = (new_val << GOMP_TARGET_ARG_VALUE_SHIFT) + | GOMP_TARGET_ARG_NUM_TEAMS; + *tmp_new_args++ = (void *) new_val; + } + else + { + *tmp_new_args++ = (void *) (GOMP_TARGET_ARG_SUBSEQUENT_PARAM + | GOMP_TARGET_ARG_NUM_TEAMS); + *tmp_new_args++ = (void *) new_val; + } + } + + if (orig_threads != new_threads) + { + intptr_t new_val = new_threads; + if (ths_len == 1) + { + new_val = (new_val << GOMP_TARGET_ARG_VALUE_SHIFT) + | GOMP_TARGET_ARG_THREAD_LIMIT; + *tmp_new_args++ = (void *) new_val; + } + else + { + *tmp_new_args++ = (void *) (GOMP_TARGET_ARG_SUBSEQUENT_PARAM + | GOMP_TARGET_ARG_THREAD_LIMIT); + *tmp_new_args++ = (void *) new_val; + } + } + + *tmp_new_args = NULL; + } + flags = clear_unsupported_flags (devicep, flags); if (flags & GOMP_TARGET_FLAG_NOWAIT) @@ -2731,7 +2968,7 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum, && !thr->task->final_task) { gomp_create_target_task (devicep, fn, mapnum, hostaddrs, - sizes, kinds, flags, depend, args, + sizes, kinds, flags, depend, new_args, GOMP_TARGET_TASK_BEFORE_MAP); return; } @@ -2777,7 +3014,7 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum, tgt_align, tgt_size); } } - gomp_target_fallback (fn, hostaddrs, devicep, args); + gomp_target_fallback (fn, hostaddrs, devicep, new_args); return; } @@ -2805,9 +3042,10 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum, tgt_vars = gomp_map_vars (devicep, mapnum, hostaddrs, NULL, sizes, kinds, true, &refcount_set, GOMP_MAP_VARS_TARGET); } + devicep->run_func (devicep->target_id, fn_addr, tgt_vars ? (void *) tgt_vars->tgt_start : hostaddrs, - args); + new_args); if (tgt_vars) { htab_clear (refcount_set); @@ -2815,6 +3053,14 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum, } if (refcount_set) htab_free (refcount_set); + + /* Copy back ICVs which were probably changed on the device. + HOST_PTR is expected to exist since it was added in + gomp_load_image_to_device if not already available. */ + gomp_copy_back_icv (&gomp_nteams_var_dev_list, devicep, device, sizeof (int)); + gomp_copy_back_icv (&gomp_teams_thread_limit_var_dev_list, devicep, device, + sizeof (int)); + } /* Host fallback for GOMP_target_data{,_ext} routines. */ diff --git a/libgomp/testsuite/libgomp.c-c++-common/icv-4.c b/libgomp/testsuite/libgomp.c-c++-common/icv-4.c index b987a33..9da0d63 100644 --- a/libgomp/testsuite/libgomp.c-c++-common/icv-4.c +++ b/libgomp/testsuite/libgomp.c-c++-common/icv-4.c @@ -16,7 +16,7 @@ main () } else omp_set_num_teams (6); - if (getenv ("OMP_TEAMS_THREAD_LIMIT") == NULL + if (getenv ("OMP_TEAMS_THREAD_LIMIT") != NULL && strcmp (getenv ("OMP_TEAMS_THREAD_LIMIT"), "12") == 0) { if (omp_get_teams_thread_limit () != 12) diff --git a/libgomp/testsuite/libgomp.c-c++-common/icv-5.c b/libgomp/testsuite/libgomp.c-c++-common/icv-5.c index 21bf44d..8f768b3 100644 --- a/libgomp/testsuite/libgomp.c-c++-common/icv-5.c +++ b/libgomp/testsuite/libgomp.c-c++-common/icv-5.c @@ -1,3 +1,4 @@ +/* { dg-additional-options "-DAMD" { target openacc_radeon_accel_selected } } */ /* { dg-do run } */ #include @@ -6,15 +7,22 @@ #include char const *varnames[] = { + "OMP_NUM_TEAMS_ALL", + "OMP_NUM_TEAMS_DEV", + "OMP_NUM_TEAMS", "OMP_NUM_TEAMS_DEV_0", "OMP_NUM_TEAMS_DEV_1", "OMP_NUM_TEAMS_DEV_2", - "OMP_NUM_TEAMS_ALL", - "OMP_NUM_TEAMS_DEV", - "OMP_NUM_TEAMS" + "OMP_TEAMS_THREAD_LIMIT_ALL", + "OMP_TEAMS_THREAD_LIMIT_DEV", + "OMP_TEAMS_THREAD_LIMIT", + "OMP_TEAMS_THREAD_LIMIT_DEV_0", + "OMP_TEAMS_THREAD_LIMIT_DEV_1", + "OMP_TEAMS_THREAD_LIMIT_DEV_2" }; -char const *values[] = { "42", "43", "44", "45", "46", "47" }; -const int cnt = 6; +char const *values[] = { "3", "4", "5", "6", "7", "8", + "2", "3", "4", "5", "6", "7" }; +const int cnt = 12; int main (int argc, char *const *argv) @@ -35,14 +43,192 @@ main (int argc, char *const *argv) abort (); } - if (omp_get_max_teams () != 47) + if (omp_get_max_teams () != 5 + || omp_get_teams_thread_limit () != 4) + abort (); + + #pragma omp teams + { + if (omp_get_num_teams () > 5 + || omp_get_team_num () >= 5) + abort (); + #pragma omp parallel + if (omp_get_thread_limit () > 4 + || omp_get_thread_num () >= 4) + abort (); + } + + omp_set_num_teams (4); + omp_set_teams_thread_limit (3); + if (omp_get_max_teams () != 4 + || omp_get_teams_thread_limit () != 3) + abort (); + + #pragma omp teams + { + if (omp_get_num_teams () > 4 + || omp_get_team_num () >= 4) + abort (); + #pragma omp parallel + if (omp_get_thread_limit () > 3 + || omp_get_thread_num () >= 3) + abort (); + } + + #pragma omp teams num_teams(3) thread_limit(2) + { + if (omp_get_num_teams () != 3 + || omp_get_team_num () >= 3) + abort (); + #pragma omp parallel + if (omp_get_thread_limit () > 2 + || omp_get_thread_num () >= 2) + abort (); + } + + #pragma omp teams num_teams(5) thread_limit(4) + { + if (omp_get_num_teams () != 5 + || omp_get_team_num () >= 5) abort (); + #pragma omp parallel + if (omp_get_thread_limit () > 4 + || omp_get_thread_num () >= 4) + abort (); + } int num_devices = omp_get_num_devices () > 3 ? 3 : omp_get_num_devices (); + if (num_devices <= 0) + return 0; + for (int i=0; i < num_devices; i++) - #pragma omp target device (i) - if (omp_get_max_teams () != 42 + i) + { + #pragma omp target device (i) + if (omp_get_max_teams () != 6 + i + || omp_get_teams_thread_limit () != 5 + i) abort (); + #pragma omp target device (i) + #pragma omp teams + { + if (omp_get_num_teams () > 6 + i + || omp_get_team_num () >= 6 + i) + abort (); + #pragma omp parallel + if (omp_get_thread_limit () > 5 + i + || omp_get_thread_num () >= 5 + i) + abort (); + } + + #pragma omp target device (i) + { + omp_set_num_teams (5 + i); + omp_set_teams_thread_limit (4 + i); + if (omp_get_max_teams () != 5 + i + || omp_get_teams_thread_limit () != 4 + i) + abort (); + } + + /* omp_set_num_teams and omp_set_teams_thread_limit above set the value + of nteams-var and teams-thread-limit-var ICVs on device 'i', which has + scope 'device' and should be avaible in subsequent target regions. */ + #pragma omp target device (i) + if (omp_get_max_teams () != 5 + i + || omp_get_teams_thread_limit () != 4 + i) + abort (); + + #pragma omp target device (i) + #pragma omp teams + { + if (omp_get_num_teams () > 5 + i + || omp_get_team_num () >= 5 + i) + abort (); + #pragma omp parallel + if (omp_get_thread_limit () > 4 + i + || omp_get_thread_num () >= 4 + i) + abort (); + } + + #pragma omp target device (i) + #pragma omp teams num_teams(6 + i) thread_limit(5 + i) + { + if (omp_get_num_teams () > 6 + i + || omp_get_team_num () >= 6 + i) + abort (); + #pragma omp parallel + if (omp_get_thread_limit () > 5 + i + || omp_get_thread_num () >= 5 + i + || omp_get_num_teams () > 6 + i + || omp_get_team_num () >= 6 + i) + abort (); + } + + #pragma omp target device (i) + #pragma omp teams num_teams(4 + i) thread_limit(3 + i) + { + if (omp_get_num_teams () > 4 + i + || omp_get_team_num () >= 4 + i) + abort (); + #pragma omp parallel + if (omp_get_thread_limit () > 3 + i + || omp_get_thread_num () >= 3 + i + || omp_get_num_teams () > 4 + i + || omp_get_team_num () >= 4 + i) + abort (); + } + + #pragma omp target device (i) + #pragma omp teams thread_limit(3 + i) num_teams(4 + i) + { + if (omp_get_num_teams () > 4 + i + || omp_get_team_num () >= 4 + i) + abort (); + #pragma omp parallel + if (omp_get_thread_limit () > 3 + i + || omp_get_thread_num () >= 3 + i + || omp_get_num_teams () > 4 + i + || omp_get_team_num () >= 4 + i) + abort (); + } + + /* This tests a large number of teams and threads. If it is larger than + 2^15+1 then the according argument in the kernels arguments list + is encoded with two items instead of one. On NVIDIA there is an + adjustment for too large teams and threads. For AMD such adjustment + exists only for threads and will cause runtime errors with a two large + number of teams. */ + intptr_t large_num_teams = 66000; +#ifdef AMD + large_num_teams = 8; +#endif + intptr_t large_threads_limit = 67000; + #pragma omp target device (i) + { + omp_set_num_teams (large_num_teams + i); + omp_set_teams_thread_limit (large_threads_limit + i); + if (omp_get_max_teams () != large_num_teams + i + || omp_get_teams_thread_limit () != large_threads_limit + i) + abort (); + } + + #pragma omp target device (i) + if (omp_get_max_teams () != large_num_teams + i + || omp_get_teams_thread_limit () != large_threads_limit + i) + abort (); + + #pragma omp target device (i) + #pragma omp teams + { + if (omp_get_num_teams () > large_num_teams + i + || omp_get_team_num () >= large_num_teams + i) + abort (); + #pragma omp parallel + if (omp_get_thread_limit () > large_threads_limit + i + || omp_get_thread_num () >= large_threads_limit + i) + abort (); + } + + } + return 0; -} \ No newline at end of file +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/icv-6.c b/libgomp/testsuite/libgomp.c-c++-common/icv-6.c index 943147b..5e761aa 100644 --- a/libgomp/testsuite/libgomp.c-c++-common/icv-6.c +++ b/libgomp/testsuite/libgomp.c-c++-common/icv-6.c @@ -1,8 +1,8 @@ /* { dg-do run } */ -/* This tests the hierarchical usage of ICVs on the device, i.e. if - OMP_NUM_TEAMS_DEV_ is not configured, then the value of - OMP_NUM_TEAMS_DEV should be used. */ +/* This test considers the hierarchical usage of ICVs on the device, i.e. if + e.g. OMP_NUM_TEAMS_DEV_ is not configured, then the value of + OMP_NUM_TEAMS_DEV should be used for the targets. */ #include #include @@ -12,16 +12,21 @@ char const *varnames[] = { "OMP_NUM_TEAMS_ALL", "OMP_NUM_TEAMS_DEV", + "OMP_TEAMS_THREAD_LIMIT_ALL", + "OMP_TEAMS_THREAD_LIMIT_DEV", }; -char const *values[] = { "42", "43" }; -const int cnt = 2; +char const *values[] = { "3", "4", "2", "3" }; +const int cnt = 4; char const *excludes[] = { "OMP_NUM_TEAMS_DEV_0", "OMP_NUM_TEAMS_DEV_1", "OMP_NUM_TEAMS_DEV_2", + "OMP_TEAMS_THREAD_LIMIT_DEV_0", + "OMP_TEAMS_THREAD_LIMIT_DEV_1", + "OMP_TEAMS_THREAD_LIMIT_DEV_2" }; -const int cnt_exludes = 3; +const int cnt_exludes = 6; int main (int argc, char *const *argv) @@ -50,14 +55,55 @@ main (int argc, char *const *argv) abort (); } - if (omp_get_max_teams () != 42) - abort (); - int num_devices = omp_get_num_devices () > 3 ? 3 : omp_get_num_devices (); for (int i=0; i < num_devices; i++) - #pragma omp target device (i) - if (omp_get_max_teams () != 43) + { + #pragma omp target device (i) + if (omp_get_max_teams () != 4 + || omp_get_teams_thread_limit () != 3) + abort (); + + #pragma omp target device (i) + #pragma omp teams + { + if (omp_get_num_teams () > 4 + || omp_get_team_num () >= 4) + abort (); + #pragma omp parallel + if (omp_get_thread_limit () > 3 + || omp_get_thread_num () >= 3) + abort (); + } + + #pragma omp target device (i) + { + omp_set_num_teams (3 + i); + omp_set_teams_thread_limit (2 + i); + if (omp_get_max_teams () != 3 + i + || omp_get_teams_thread_limit () != 2 + i) + abort (); + } + + /* omp_set_num_teams above set the value of nteams-var ICV on device 'i', + which has scope 'device' and should be avaible in subsequent target + regions. */ + #pragma omp target device (i) + if (omp_get_max_teams () != 3 + i + || omp_get_teams_thread_limit () != 2 + i) abort (); + #pragma omp target device (i) + #pragma omp teams + { + if (omp_get_num_teams () > 3 + i + || omp_get_team_num () >= 3 + i) + abort (); + #pragma omp parallel + if (omp_get_thread_limit () > 2 + i + || omp_get_thread_num () >= 2 + i) + abort (); + } + } + return 0; -} \ No newline at end of file +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/icv-7.c b/libgomp/testsuite/libgomp.c-c++-common/icv-7.c index 857d796..b28a33a 100644 --- a/libgomp/testsuite/libgomp.c-c++-common/icv-7.c +++ b/libgomp/testsuite/libgomp.c-c++-common/icv-7.c @@ -12,19 +12,25 @@ #include char const *varnames[] = { - "OMP_NUM_TEAMS_ALL" + "OMP_NUM_TEAMS_ALL", + "OMP_TEAMS_THREAD_LIMIT_ALL" }; -char const *values[] = { "42" }; -const int cnt = 1; +char const *values[] = { "7", "2" }; +const int cnt = 2; char const *excludes[] = { "OMP_NUM_TEAMS_DEV_0", "OMP_NUM_TEAMS_DEV_1", "OMP_NUM_TEAMS_DEV_2", "OMP_NUM_TEAMS_DEV", - "OMP_NUM_TEAMS" + "OMP_NUM_TEAMS", + "OMP_TEAMS_THREAD_LIMIT_DEV_0", + "OMP_TEAMS_THREAD_LIMIT_DEV_1", + "OMP_TEAMS_THREAD_LIMIT_DEV_2", + "OMP_TEAMS_THREAD_LIMIT_DEV", + "OMP_TEAMS_THREAD_LIMIT" }; -const int cnt_exludes = 5; +const int cnt_exludes = 10; int main (int argc, char *const *argv) @@ -53,14 +59,73 @@ main (int argc, char *const *argv) abort (); } - if (omp_get_max_teams () != 42) + if (omp_get_max_teams () != 7 + || omp_get_teams_thread_limit () != 2) + abort (); + + #pragma omp teams + if (omp_get_num_teams () > 7 + || omp_get_team_num () >= 7) + abort (); + + omp_set_num_teams (9); + omp_set_teams_thread_limit (3); + if (omp_get_max_teams () != 9 + || omp_get_teams_thread_limit () != 3) + abort (); + + #pragma omp teams + if (omp_get_num_teams () > 9 + || omp_get_team_num () >= 9) + abort (); + + #pragma omp teams num_teams(5) + if (omp_get_num_teams () > 5 + || omp_get_team_num () >= 5) abort (); int num_devices = omp_get_num_devices () > 3 ? 3 : omp_get_num_devices (); for (int i=0; i < num_devices; i++) - #pragma omp target device (i) - if (omp_get_max_teams () != 42) + { + #pragma omp target device (i) + if (omp_get_max_teams () != 7 + || omp_get_teams_thread_limit () != 2) abort (); + #pragma omp target device (i) + #pragma omp teams + if (omp_get_num_teams () > 7 + || omp_get_team_num () >= 7) + abort (); + + #pragma omp target device (i) + { + omp_set_num_teams (8 + i); + omp_set_teams_thread_limit (4 + i); + if (omp_get_max_teams () != 8 + i + || omp_get_teams_thread_limit () != 4 + i) + abort (); + } + + /* omp_set_num_teams above set the value of nteams-var ICV on device 'i', + which has scope 'device' and should be avaible in subsequent target + regions. */ + #pragma omp target device (i) + if (omp_get_max_teams () != 8 + i + || omp_get_teams_thread_limit () != 4 + i) + abort (); + + #pragma omp target device (i) + #pragma omp teams + if (omp_get_num_teams () > 8 + i + || omp_get_team_num () >= 8 + i) + abort (); + + #pragma omp target device (i) + #pragma omp teams num_teams(5 + i) + if (omp_get_num_teams () != 5 + i) + abort (); + } + return 0; -} \ No newline at end of file +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/icv-8.c b/libgomp/testsuite/libgomp.c-c++-common/icv-8.c new file mode 100644 index 0000000..1eda2fd --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/icv-8.c @@ -0,0 +1,105 @@ +/* { dg-do run } */ + +/* This tests usage of ICVs on the host and on devices if no corresponding + environment variables are configured. */ + +#include +#include +#include +#include + +char const *excludes[] = { + "OMP_NUM_TEAMS_DEV_0", + "OMP_NUM_TEAMS_DEV_1", + "OMP_NUM_TEAMS_DEV_2", + "OMP_NUM_TEAMS_DEV", + "OMP_NUM_TEAMS_ALL", + "OMP_NUM_TEAMS", + "OMP_TEAMS_THREAD_LIMIT_DEV_0", + "OMP_TEAMS_THREAD_LIMIT_DEV_1", + "OMP_TEAMS_THREAD_LIMIT_DEV_2", + "OMP_TEAMS_THREAD_LIMIT_DEV", + "OMP_TEAMS_THREAD_LIMIT_ALL", + "OMP_TEAMS_THREAD_LIMIT" +}; +const int cnt_exludes = 12; + +int +main (int argc, char *const *argv) +{ + int updated = 0; + for (int i = 0; i < cnt_exludes; i++) + if (getenv (excludes[i]) != NULL) + { + unsetenv (excludes[i]); + updated = 1; + } + if (updated) + { + execv (argv[0], argv); + abort (); + } + + if (omp_get_max_teams () != 0 + || omp_get_teams_thread_limit () != 0) + abort (); + + omp_set_num_teams (9); + omp_set_teams_thread_limit (2); + if (omp_get_max_teams () != 9 + || omp_get_teams_thread_limit () != 2) + abort (); + + #pragma omp teams + if (omp_get_num_teams () > 9 + || omp_get_team_num () >= 9) + abort (); + + #pragma omp teams num_teams(5) + if (omp_get_num_teams () > 5 + || omp_get_team_num () >= 5) + abort (); + + int num_devices = omp_get_num_devices () > 3 ? 3 : omp_get_num_devices (); + for (int i=0; i < num_devices; i++) + { + #pragma omp target device (i) + if (omp_get_max_teams () != 0 + || omp_get_teams_thread_limit () != 0) + abort (); + + #pragma omp target device (i) + { + omp_set_num_teams (8 + i); + omp_set_teams_thread_limit (3 + i); + if (omp_get_max_teams () != 8 + i + || omp_get_teams_thread_limit () != 3 + i) + abort (); + } + + /* omp_set_num_teams above set the value of nteams-var ICV on device 'i', + which has scope 'device' and should be avaible in subsequent target + regions. */ + #pragma omp target device (i) + if (omp_get_max_teams () != 8 + i + || omp_get_teams_thread_limit () != 3 + i) + abort (); + + #pragma omp target device (i) + #pragma omp teams + if (omp_get_num_teams () > 8 + i + || omp_get_team_num () >= 8 + i) + abort (); + + #pragma omp target device (i) + #pragma omp teams num_teams(5 + i) + if (omp_get_num_teams () > 5 + i + || omp_get_team_num () >= 5 + i) + abort (); + + #pragma omp target device (i) + ; + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.fortran/icv-5.f90 b/libgomp/testsuite/libgomp.fortran/icv-5.f90 new file mode 100644 index 0000000..9462ca9 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/icv-5.f90 @@ -0,0 +1,231 @@ +! { dg-set-target-env-var OMP_NUM_TEAMS_ALL "3" } +! { dg-set-target-env-var OMP_NUM_TEAMS_DEV "4" } +! { dg-set-target-env-var OMP_NUM_TEAMS "5" } +! { dg-set-target-env-var OMP_NUM_TEAMS_DEV_0 "6" } +! { dg-set-target-env-var OMP_NUM_TEAMS_DEV_1 "7" } +! { dg-set-target-env-var OMP_NUM_TEAMS_DEV_2 "8" } +! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_ALL "2" } +! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_DEV "3" } +! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT "4" } +! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_DEV_0 "5" } +! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_DEV_1 "6" } +! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_DEV_2 "7" } + +use omp_lib +implicit none (type, external) + integer :: num_devices, i, large_num_teams, large_threads_limit + logical :: err + + if (omp_get_num_devices () > 3) then + num_devices = 3 + else + num_devices = omp_get_num_devices () + end if + + do i=0,num_devices-1 + + ! Testing NUM_TEAMS. + if (env_is_set_dev ("OMP_NUM_TEAMS_DEV_", i, 6 + i)) then + err = .false. + !$omp target device(i) map(tofrom: err) + if (omp_get_max_teams () /= 6 + i) err = .true. + !$omp end target + if (err) stop 1 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams + if (omp_get_num_teams () > 6 + i .or. omp_get_team_num () >= 6 + i) & + err = .true. + !$omp end teams + !$omp end target + if (err) stop 2 + + err = .false. + !$omp target device(i) map(tofrom: err) + call omp_set_num_teams (5 + i) + if (omp_get_max_teams () /= 5 + i) err = .true. + !$omp end target + if (err) stop 3 + + err = .false. + !$omp target device(i) map(tofrom: err) + if (omp_get_max_teams () /= 5 + i) err = .true. + !$omp end target + if (err) stop 4 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams + if (omp_get_num_teams () > 5 + i .or. omp_get_team_num () >= 5 + i) & + err = .true. + !$omp end teams + !$omp end target + if (err) stop 5 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams num_teams(6 + i) + if (omp_get_num_teams () > 6 + i .or. omp_get_team_num () >= 6 + i) & + err = .true. + !$omp end teams + !$omp end target + if (err) stop 6 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams num_teams(4 + i) + if (omp_get_num_teams () > 4 + i .or. omp_get_team_num () >= 4 + i) & + err = .true. + !$omp end teams + !$omp end target + if (err) stop 7 + + large_num_teams = 66000 + err = .false. + !$omp target device(i) map(tofrom: err) + call omp_set_num_teams (large_num_teams + i) + if (omp_get_max_teams () /= large_num_teams + i) err = .true. + !$omp end target + if (err) stop 8 + + err = .false. + !$omp target device(i) map(tofrom: err) + if (omp_get_max_teams () /= large_num_teams + i) err = .true. + !$omp end target + if (err) stop 9 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams + if (omp_get_num_teams () > large_num_teams + i & + .or. omp_get_team_num () >= large_num_teams + i) err = .true. + !$omp end teams + !$omp end target + if (err) stop 10 + end if + + ! Testing TEAMS-THREAD-LIMIT + if (env_is_set_dev ("OMP_TEAMS_THREAD_LIMIT_DEV_", i, 5 + i)) then + err = .false. + !$omp target device(i) map(tofrom: err) + if (omp_get_teams_thread_limit () /= 5 + i) err = .true. + !$omp end target + if (err) stop 11 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams + !$omp parallel + if (omp_get_thread_limit () > 5 + i .or. omp_get_thread_num () >= 5 + i) & + err = .true. + !$omp end parallel + !$omp end teams + !$omp end target + if (err) stop 12 + + err = .false. + !$omp target device(i) map(tofrom: err) + call omp_set_teams_thread_limit (4 + i) + if (omp_get_teams_thread_limit () /= 4 + i) err = .true. + !$omp end target + if (err) stop 13 + + err = .false. + !$omp target device(i) map(tofrom: err) + if (omp_get_teams_thread_limit () /= 4 + i) err = .true. + !$omp end target + if (err) stop 14 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams + !$omp parallel + if (omp_get_thread_limit () > 4 + i .or. omp_get_thread_num () >= 4 + i) & + err = .true. + !$omp end parallel + !$omp end teams + !$omp end target + if (err) stop 15 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams thread_limit(5 + i) + !$omp parallel + if (omp_get_thread_limit () > 5 + i .or. omp_get_thread_num () >= 5 + i) & + err = .true. + !$omp end parallel + !$omp end teams + !$omp end target + if (err) stop 16 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams thread_limit(3 + i) + !$omp parallel + if (omp_get_thread_limit () > 3 + i .or. omp_get_thread_num () >= 3 + i) & + err = .true. + !$omp end parallel + !$omp end teams + !$omp end target + if (err) stop 17 + + large_threads_limit = 67000 + err = .false. + !$omp target device(i) map(tofrom: err) + call omp_set_teams_thread_limit (large_threads_limit + i) + if (omp_get_teams_thread_limit () /= large_threads_limit + i) err = .true. + !$omp end target + if (err) stop 18 + + err = .false. + !$omp target device(i) map(tofrom: err) + if (omp_get_teams_thread_limit () /= large_threads_limit + i) err = .true. + !$omp end target + if (err) stop 19 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams + !$omp parallel + if (omp_get_thread_limit () > large_threads_limit + i & + .or. omp_get_thread_num () >= large_threads_limit + i) err = .true. + !$omp end parallel + !$omp end teams + !$omp end target + if (err) stop 20 + end if + + end do + +contains + logical function env_is_set (name, val) + character(len=*) :: name, val + character(len=40) :: val2 + integer :: stat + call get_environment_variable (name, val2, status=stat) + if (stat == 0) then + if (val == val2) then + env_is_set = .true. + return + end if + else if (stat /= 1) then + error stop 30 + endif + env_is_set = .false. + end + logical function env_is_set_dev (name, dev_num, val) + character(len=*) :: name + integer :: dev_num, val + character(len=64) :: dev_num_str, env_var, val_str + write (dev_num_str,*) dev_num + dev_num_str = ADJUSTL(dev_num_str) + env_var = name // dev_num_str + write (val_str,*) val + val_str = ADJUSTL(val_str) + write (*,*) ">", TRIM(env_var), "<" + write (*,*) ">", TRIM(val_str), "<" + write (*,*) "env_is_set (TRIM(env_var), TRIM(device_num): ", env_is_set(TRIM(env_var), TRIM(val_str)) + env_is_set_dev = env_is_set (TRIM(env_var), TRIM(val_str)) + end +end diff --git a/libgomp/testsuite/libgomp.fortran/icv-6.f90 b/libgomp/testsuite/libgomp.fortran/icv-6.f90 new file mode 100644 index 0000000..c1be2b2 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/icv-6.f90 @@ -0,0 +1,142 @@ +! { dg-set-target-env-var OMP_NUM_TEAMS_ALL "3" } +! { dg-set-target-env-var OMP_NUM_TEAMS_DEV "4" } +! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_ALL "2" } +! { dg-set-target-env-var OMP_TEAMS_THREAD_LIMIT_DEV "3" } + +! This test considers the hierarchical usage of ICVs on the device, i.e. if +! e.g. OMP_NUM_TEAMS_DEV_ is not configured, then the value of +! OMP_NUM_TEAMS_DEV should be used for the targets. + +use omp_lib +implicit none (type, external) + integer :: num_devices, i, stat, tmp + logical :: err + character(len=40) :: val + + ! The following environment variables should not be set. + call get_environment_variable ("OMP_NUM_TEAMS_DEV_0", val, status=stat) + if (stat /= 1) return + call get_environment_variable ("OMP_NUM_TEAMS_DEV_1", val, status=stat) + if (stat /= 1) return + call get_environment_variable ("OMP_NUM_TEAMS_DEV_2", val, status=stat) + if (stat /= 1) return + call get_environment_variable ("OMP_TEAMS_THREAD_LIMIT_DEV_0", val, status=stat) + if (stat /= 1) return + call get_environment_variable ("OMP_TEAMS_THREAD_LIMIT_DEV_1", val, status=stat) + if (stat /= 1) return + call get_environment_variable ("OMP_TEAMS_THREAD_LIMIT_DEV_2", val, status=stat) + if (stat /= 1) return + + if (omp_get_num_devices () > 3) then + num_devices = 3 + else + num_devices = omp_get_num_devices () + end if + + do i=0,num_devices-1 + + ! Testing NUM_TEAMS. + write (*,*) 'env_is_set ("OMP_NUM_TEAMS_DEV", "4"): ', env_is_set ("OMP_NUM_TEAMS_DEV", "4") + if (env_is_set ("OMP_NUM_TEAMS_DEV", "4")) then + err = .false. + !$omp target device(i) map(tofrom: err) + if (omp_get_max_teams () /= 4) err = .true. + !$omp end target + if (err) stop 1 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams + if (omp_get_num_teams () > 4 .or. omp_get_team_num () >= 4) & + err = .true. + !$omp end teams + !$omp end target + if (err) stop 2 + + err = .false. + !$omp target device(i) map(tofrom: err) + call omp_set_num_teams (3 + i) + if (omp_get_max_teams () /= 3 + i) err = .true. + !$omp end target + if (err) stop 3 + + err = .false. + !$omp target device(i) map(tofrom: err) + if (omp_get_max_teams () /= 3 + i) err = .true. + !$omp end target + if (err) stop 4 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams + if (omp_get_num_teams () > 3 + i .or. omp_get_team_num () >= 3 + i) & + err = .true. + !$omp end teams + !$omp end target + if (err) stop 5 + end if + + ! Testing TEAMS-THREAD-LIMIT + write (*,*) 'env_is_set ("OMP_TEAMS_THREAD_LIMIT_DEV", "3"): ', env_is_set ("OMP_TEAMS_THREAD_LIMIT_DEV", "3") + if (env_is_set ("OMP_TEAMS_THREAD_LIMIT_DEV", "3")) then + err = .false. + !$omp target device(i) map(tofrom: err) + if (omp_get_teams_thread_limit () /= 3) err = .true. + !$omp end target + if (err) stop 6 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams + !$omp parallel + if (omp_get_thread_limit () > 3 .or. omp_get_thread_num () >= 3) & + err = .true. + !$omp end parallel + !$omp end teams + !$omp end target + if (err) stop 7 + + err = .false. + !$omp target device(i) map(tofrom: err) + call omp_set_teams_thread_limit (2 + i) + if (omp_get_teams_thread_limit () /= 2 + i) err = .true. + !$omp end target + if (err) stop 8 + + err = .false. + !$omp target device(i) map(tofrom: err) + if (omp_get_teams_thread_limit () /= 2 + i) err = .true. + !$omp end target + if (err) stop 9 + + err = .false. + !$omp target device(i) map(tofrom: err) + !$omp teams + !$omp parallel + if (omp_get_thread_limit () > 2 + i .or. omp_get_thread_num () >= 2 + i) & + err = .true. + !$omp end parallel + !$omp end teams + !$omp end target + if (err) stop 10 + end if + + end do + +contains + logical function env_is_set (name, val) + character(len=*) :: name, val + character(len=40) :: val2 + integer :: stat + call get_environment_variable (name, val2, status=stat) + if (stat == 0) then + if (val == val2) then + env_is_set = .true. + return + end if + else if (stat /= 1) then + error stop 10 + endif + env_is_set = .false. + end +end