From mboxrd@z Thu Jan 1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 1729) id 4402D384D142; Wed, 29 Jun 2022 14:44:04 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 4402D384D142
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Kwok Yeung
To: gcc-cvs@gcc.gnu.org
Subject: [gcc/devel/omp/gcc-12] openmp: Add support for 'target_device' context selector set
X-Act-Checkin: gcc
X-Git-Author: Kwok Cheung Yeung
X-Git-Refname: refs/heads/devel/omp/gcc-12
X-Git-Oldrev: 67310930e695f7b9a2ad2437386324fd74d8f1b0
X-Git-Newrev: da1da23068d1357cb7471dc9c95cfbafc1b5f12e
Message-Id: <20220629144404.4402D384D142@sourceware.org>
Date: Wed, 29 Jun 2022 14:44:04 +0000 (GMT)
X-BeenThere: gcc-cvs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-cvs mailing list
X-List-Received-Date: Wed, 29 Jun 2022 14:44:04 -0000

https://gcc.gnu.org/g:da1da23068d1357cb7471dc9c95cfbafc1b5f12e

commit da1da23068d1357cb7471dc9c95cfbafc1b5f12e
Author: Kwok Cheung Yeung
Date: Tue Jan 25 11:50:08 2022 -0800

    openmp: Add support for 'target_device' context selector set

    2022-01-25 Kwok Cheung Yeung

    gcc/
        * builtin-types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR): New type.
        * omp-builtins.def (BUILT_IN_GOMP_EVALUATE_TARGET_DEVICE): New builtin.
        * omp-general.cc (omp_context_selector_matches): Handle 'target_device' selector set.
        (omp_dynamic_cond): Generate expression tree for 'target_device' selector set.
        (omp_context_compute_score): Handle selectors in 'target_device' set.

    gcc/c/
        * c-parser.cc (omp_target_device_selectors): New.
        (c_parser_omp_context_selector): Accept 'target_device' selector set. Treat 'device_num' selector as expression.
        (c_parser_omp_context_selector_specification): Handle 'target_device' selector set.

    gcc/cp/
        * parser.cc (omp_target_device_selectors): New.
        (cp_parser_omp_context_selector): Accept 'target_device' selector set. Treat 'device_num' selector as expression.
        (cp_parser_omp_context_selector_specification): Handle 'target_device' selector set.

    gcc/fortran/
        * openmp.cc (omp_target_device_selectors): New.
        (gfc_match_omp_context_selector): Accept 'target_device' selector set. Treat 'device_num' selector as expression.
        (gfc_match_omp_context_selector_specification): Handle 'target_device' selector set.
        * types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR): New type.

    gcc/testsuite/
        * c-c++-common/gomp/metadirective-7.c: New.
        * gfortran.dg/gomp/metadirective-7.f90: New.

    libgomp/
        * Makefile.am (libgomp_la_SOURCES): Add selector.c.
        * Makefile.in: Regenerate.
        * config/gcn/selector.c: New.
        * config/linux/selector.c: New.
        * config/linux/x86/selector.c: New.
        * config/nvptx/selector.c: New.
        * libgomp-plugin.h (GOMP_OFFLOAD_evaluate_device): New.
        * libgomp.h (struct gomp_device_descr): Add evaluate_device_func field.
        * libgomp.map (GOMP_5.1): Add GOMP_evaluate_target_device.
        * libgomp_g.h (GOMP_evaluate_current_device): New.
        (GOMP_evaluate_target_device): New.
        * oacc-host.c (host_evaluate_device): New.
        (host_openacc_exec): Initialize evaluate_device_func field to host_evaluate_device.
        * plugin/plugin-gcn.c (GOMP_OFFLOAD_evaluate_device): New.
        * plugin/plugin-nvptx.c (struct ptx_device): Add compute_major and compute_minor fields.
        (nvptx_open_device): Read compute capability information from device.
        (CHECK_ISA): New macro.
        (GOMP_OFFLOAD_evaluate_device): New.
        * selector.c: New.
        * target.c (GOMP_evaluate_target_device): New.
        (gomp_load_plugin_for_device): Load evaluate_device plugin function.
        * testsuite/libgomp.c-c++-common/metadirective-5.c: New testcase.
        * testsuite/libgomp.fortran/metadirective-5.f90: New testcase.
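
The per-target selector.c files added by this commit all follow the same matching convention for the kind, arch and isa strings: a null pointer means the selector was not specified and therefore matches anything. A minimal sketch of that convention follows. It is not the libgomp code; struct device_traits, traits_match and the example_* data are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one device; not a libgomp type.  */
struct device_traits
{
  const char *kind;
  const char *arch;
  const char *const *isas;	/* NULL-terminated list of supported ISAs.  */
};

/* Return true if every selector that was specified matches DEV.
   A NULL selector means "not specified" and matches anything.  */
bool
traits_match (const struct device_traits *dev, const char *kind,
	      const char *arch, const char *isa)
{
  if (kind && strcmp (kind, dev->kind) != 0)
    return false;

  if (arch && strcmp (arch, dev->arch) != 0)
    return false;

  if (!isa)
    return true;

  for (const char *const *p = dev->isas; *p; p++)
    if (strcmp (isa, *p) == 0)
      return true;

  return false;
}

/* Example data, assumed for illustration only.  */
const char *const example_isas[] = { "gfx900", "gfx906", NULL };
const struct device_traits example_gpu = { "gpu", "gcn", example_isas };
```

With this data, traits_match (&example_gpu, "gpu", NULL, NULL) holds because only the kind is constrained, while an isa that is not in the list causes a mismatch even when kind and arch agree.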
Diff: --- gcc/ChangeLog.omp | 11 + gcc/builtin-types.def | 2 + gcc/c/ChangeLog.omp | 8 + gcc/c/c-parser.cc | 28 +- gcc/cp/ChangeLog.omp | 8 + gcc/cp/parser.cc | 28 +- gcc/fortran/ChangeLog.omp | 9 + gcc/fortran/openmp.cc | 34 ++- gcc/fortran/types.def | 2 + gcc/omp-builtins.def | 3 + gcc/omp-general.cc | 71 ++++- gcc/testsuite/ChangeLog.omp | 5 + gcc/testsuite/c-c++-common/gomp/metadirective-7.c | 31 ++ gcc/testsuite/gfortran.dg/gomp/metadirective-7.f90 | 36 +++ libgomp/ChangeLog.omp | 28 ++ libgomp/Makefile.am | 8 +- libgomp/Makefile.in | 25 +- libgomp/config/gcn/selector.c | 57 ++++ libgomp/config/linux/selector.c | 43 +++ libgomp/config/linux/x86/selector.c | 325 +++++++++++++++++++++ libgomp/config/nvptx/selector.c | 65 +++++ libgomp/libgomp-plugin.h | 2 + libgomp/libgomp.h | 1 + libgomp/libgomp.map | 1 + libgomp/libgomp_g.h | 8 + libgomp/oacc-host.c | 11 + libgomp/plugin/plugin-gcn.c | 14 + libgomp/plugin/plugin-nvptx.c | 46 +++ libgomp/selector.c | 36 +++ libgomp/target.c | 38 +++ libgomp/testsuite/Makefile.in | 1 + .../libgomp.c-c++-common/metadirective-5.c | 46 +++ .../testsuite/libgomp.fortran/metadirective-5.f90 | 44 +++ 33 files changed, 1043 insertions(+), 32 deletions(-) diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp index 5a7891f02ec..eb3a1130f72 100644 --- a/gcc/ChangeLog.omp +++ b/gcc/ChangeLog.omp @@ -1,3 +1,14 @@ +2022-01-25 Kwok Cheung Yeung + + * builtin-types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR): New + type. + * omp-builtins.def (BUILT_IN_GOMP_EVALUATE_TARGET_DEVICE): New builtin. + * omp-general.cc (omp_context_selector_matches): Handle 'target_device' + selector set. + (omp_dynamic_cond): Generate expression tree for 'target_device' + selector set. + (omp_context_compute_score): Handle selectors in 'target_device' set. 
+ 2022-01-25 Kwok Cheung Yeung * omp-general.cc (omp_dynamic_cond): Do not return user condition if diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def index eb21aef101f..d0161537a95 100644 --- a/gcc/builtin-types.def +++ b/gcc/builtin-types.def @@ -681,6 +681,8 @@ DEF_FUNCTION_TYPE_4 (BT_FN_VOID_UINT_PTR_INT_PTR, BT_VOID, BT_INT, BT_PTR, BT_INT, BT_PTR) DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_UINT_UINT_BOOL, BT_BOOL, BT_UINT, BT_UINT, BT_UINT, BT_BOOL) +DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR, + BT_BOOL, BT_INT, BT_CONST_PTR, BT_CONST_PTR, BT_CONST_PTR) DEF_FUNCTION_TYPE_5 (BT_FN_INT_STRING_INT_SIZE_CONST_STRING_VALIST_ARG, BT_INT, BT_STRING, BT_INT, BT_SIZE, BT_CONST_STRING, diff --git a/gcc/c/ChangeLog.omp b/gcc/c/ChangeLog.omp index 103bf158527..487c385fb6c 100644 --- a/gcc/c/ChangeLog.omp +++ b/gcc/c/ChangeLog.omp @@ -1,3 +1,11 @@ +2022-01-25 Kwok Cheung Yeung + + * c-parser.cc (omp_target_device_selectors): New. + (c_parser_omp_context_selector): Accept 'target_device' selector set. + Treat 'device_num' selector as expression. + (c_parser_omp_context_selector_specification): Handle 'target_device' + selector set. 
+ 2022-01-25 Kwok Cheung Yeung * c-parser.cc (c_parser_skip_to_end_of_block_or_statement): Track diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index d215ad62103..fea78ddbb57 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -21468,6 +21468,8 @@ static const char *const omp_device_selectors[] = { static const char *const omp_implementation_selectors[] = { "vendor", "extension", "atomic_default_mem_order", "unified_address", "unified_shared_memory", "dynamic_allocators", "reverse_offload", NULL }; +static const char *const omp_target_device_selectors[] = { + "device_num", "kind", "isa", "arch", NULL }; static const char *const omp_user_selectors[] = { "condition", NULL }; @@ -21525,6 +21527,13 @@ c_parser_omp_context_selector (c_parser *parser, tree set, tree parms, property_limit = 3; property_kind = CTX_PROPERTY_NAME_LIST; break; + case 't': /* target_device */ + selectors = omp_target_device_selectors; + allow_score = false; + allow_user = true; + property_limit = 4; + property_kind = CTX_PROPERTY_NAME_LIST; + break; case 'u': /* user */ selectors = omp_user_selectors; property_limit = 1; @@ -21563,6 +21572,12 @@ c_parser_omp_context_selector (c_parser *parser, tree set, tree parms, "atomic_default_mem_order") == 0) property_kind = CTX_PROPERTY_ID; + if (property_kind == CTX_PROPERTY_NAME_LIST + && IDENTIFIER_POINTER (set)[0] == 't' + && strcmp (IDENTIFIER_POINTER (selector), + "device_num") == 0) + property_kind = CTX_PROPERTY_EXPR; + c_parser_consume_token (parser); if (c_parser_next_token_is (parser, CPP_OPEN_PAREN)) @@ -21787,6 +21802,10 @@ c_parser_omp_context_selector_specification (c_parser *parser, tree parms, if (strcmp (setp, "implementation") == 0) setp = NULL; break; + case 't': + if (metadirective_p && strcmp (setp, "target_device") == 0) + setp = NULL; + break; case 'u': if (strcmp (setp, "user") == 0) setp = NULL; @@ -21796,8 +21815,13 @@ c_parser_omp_context_selector_specification (c_parser *parser, tree parms, } if (setp) { - 
c_parser_error (parser, "expected %<construct%>, %<device%>, " - "%<implementation%> or %<user%>"); + if (metadirective_p) + c_parser_error (parser, "expected %<construct%>, %<device%>, " + "%<implementation%>, %<target_device%> " + "or %<user%>"); + else + c_parser_error (parser, "expected %<construct%>, %<device%>, " + "%<implementation%> or %<user%>"); return error_mark_node; } diff --git a/gcc/cp/ChangeLog.omp b/gcc/cp/ChangeLog.omp index 31b98884749..c2a317e29fa 100644 --- a/gcc/cp/ChangeLog.omp +++ b/gcc/cp/ChangeLog.omp @@ -1,3 +1,11 @@ +2022-01-25 Kwok Cheung Yeung + + * parser.cc (omp_target_device_selectors): New. + (cp_parser_omp_context_selector): Accept 'target_device' selector set. + Treat 'device_num' selector as expression. + (cp_parser_omp_context_selector_specification): Handle 'target_device' + selector set. + 2022-01-25 Kwok Cheung Yeung * parser.cc (cp_parser_skip_to_end_of_statement): Revert. diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index ee838ea28f4..90af3f89a5c 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -45142,6 +45142,8 @@ static const char *const omp_device_selectors[] = { static const char *const omp_implementation_selectors[] = { "vendor", "extension", "atomic_default_mem_order", "unified_address", "unified_shared_memory", "dynamic_allocators", "reverse_offload", NULL }; +static const char *const omp_target_device_selectors[] = { + "device_num", "kind", "isa", "arch", NULL }; static const char *const omp_user_selectors[] = { "condition", NULL }; @@ -45199,6 +45201,13 @@ cp_parser_omp_context_selector (cp_parser *parser, tree set, bool has_parms_p, property_limit = 3; property_kind = CTX_PROPERTY_NAME_LIST; break; + case 't': /* target_device */ + selectors = omp_target_device_selectors; + allow_score = false; + allow_user = true; + property_limit = 4; + property_kind = CTX_PROPERTY_NAME_LIST; + break; case 'u': /* user */ selectors = omp_user_selectors; property_limit = 1; @@ -45236,6 +45245,12 @@ cp_parser_omp_context_selector (cp_parser *parser, tree set, bool has_parms_p, "atomic_default_mem_order") == 0) property_kind = CTX_PROPERTY_ID; + if (property_kind == 
CTX_PROPERTY_NAME_LIST + && IDENTIFIER_POINTER (set)[0] == 't' + && strcmp (IDENTIFIER_POINTER (selector), + "device_num") == 0) + property_kind = CTX_PROPERTY_EXPR; + cp_lexer_consume_token (parser->lexer); if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN)) @@ -45473,6 +45488,10 @@ cp_parser_omp_context_selector_specification (cp_parser *parser, if (strcmp (setp, "implementation") == 0) setp = NULL; break; + case 't': + if (metadirective_p && strcmp (setp, "target_device") == 0) + setp = NULL; + break; case 'u': if (strcmp (setp, "user") == 0) setp = NULL; @@ -45482,8 +45501,13 @@ cp_parser_omp_context_selector_specification (cp_parser *parser, } if (setp) { - cp_parser_error (parser, "expected %<construct%>, %<device%>, " - "%<implementation%> or %<user%>"); + if (metadirective_p) + cp_parser_error (parser, "expected %<construct%>, %<device%>, " + "%<implementation%>, %<target_device%> " + "or %<user%>"); + else + cp_parser_error (parser, "expected %<construct%>, %<device%>, " + "%<implementation%> or %<user%>"); return error_mark_node; } diff --git a/gcc/fortran/ChangeLog.omp b/gcc/fortran/ChangeLog.omp index a90393e489e..e7262204266 100644 --- a/gcc/fortran/ChangeLog.omp +++ b/gcc/fortran/ChangeLog.omp @@ -1,3 +1,12 @@ +2022-01-25 Kwok Cheung Yeung + + * openmp.cc (omp_target_device_selectors): New. + (gfc_match_omp_context_selector): Accept 'target_device' selector set. + Treat 'device_num' selector as expression. + (gfc_match_omp_context_selector_specification): Handle 'target_device' + selector set. + * types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR): New type. 
+ 2022-01-25 Kwok Cheung Yeung * decl.cc (gfc_match_end): Handle COMP_OMP_METADIRECTIVE and diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index 7cfe5a72dbc..0fc7e64dfa5 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -4638,6 +4638,8 @@ static const char *const omp_device_selectors[] = { static const char *const omp_implementation_selectors[] = { "vendor", "extension", "atomic_default_mem_order", "unified_address", "unified_shared_memory", "dynamic_allocators", "reverse_offload", NULL }; +static const char *const omp_target_device_selectors[] = { + "device_num", "kind", "isa", "arch", NULL }; static const char *const omp_user_selectors[] = { "condition", NULL }; @@ -4695,6 +4697,13 @@ gfc_match_omp_context_selector (gfc_omp_set_selector *oss) property_limit = 3; property_kind = CTX_PROPERTY_NAME_LIST; break; + case 't': /* target_device */ + selectors = omp_target_device_selectors; + allow_score = false; + allow_user = true; + property_limit = 4; + property_kind = CTX_PROPERTY_NAME_LIST; + break; case 'u': /* user */ selectors = omp_user_selectors; property_limit = 1; @@ -4730,6 +4739,11 @@ gfc_match_omp_context_selector (gfc_omp_set_selector *oss) && strcmp (selector, "atomic_default_mem_order") == 0) property_kind = CTX_PROPERTY_ID; + if (property_kind == CTX_PROPERTY_NAME_LIST + && oss->trait_set_selector_name[0] == 't' + && strcmp (selector, "device_num") == 0) + property_kind = CTX_PROPERTY_EXPR; + if (gfc_match (" (") == MATCH_YES) { if (property_kind == CTX_PROPERTY_NONE) @@ -4918,13 +4932,14 @@ gfc_match_omp_context_selector (gfc_omp_set_selector *oss) user */ match -gfc_match_omp_context_selector_specification (gfc_omp_set_selector **oss_head) +gfc_match_omp_context_selector_specification (gfc_omp_set_selector **oss_head, + bool metadirective_p = false) { do { match m; - const char *selector_sets[] = { "construct", "device", - "implementation", "user" }; + const char *selector_sets[] = { "construct", "device", 
"implementation", + "target_device", "user" }; const int selector_set_count = sizeof (selector_sets) / sizeof (*selector_sets); int i; @@ -4936,10 +4951,15 @@ gfc_match_omp_context_selector_specification (gfc_omp_set_selector **oss_head) if (strcmp (buf, selector_sets[i]) == 0) break; - if (m != MATCH_YES || i == selector_set_count) + if (m != MATCH_YES || i == selector_set_count + || (!metadirective_p && strcmp (buf, "target_device") == 0)) { - gfc_error ("expected 'construct', 'device', 'implementation' or " - "'user' at %C"); + if (metadirective_p) + gfc_error ("expected 'construct', 'device', 'implementation', " + "'target_device' or 'user' at %C"); + else + gfc_error ("expected 'construct', 'device', 'implementation' " + "or 'user' at %C"); return MATCH_ERROR; } @@ -5113,7 +5133,7 @@ match_omp_metadirective (bool begin_p) if (!default_p) { - if (gfc_match_omp_context_selector_specification (&selectors) + if (gfc_match_omp_context_selector_specification (&selectors, true) != MATCH_YES) return MATCH_ERROR; diff --git a/gcc/fortran/types.def b/gcc/fortran/types.def index d290620c612..a9ada1c2680 100644 --- a/gcc/fortran/types.def +++ b/gcc/fortran/types.def @@ -174,6 +174,8 @@ DEF_FUNCTION_TYPE_4 (BT_FN_VOID_UINT_PTR_INT_PTR, BT_VOID, BT_INT, BT_PTR, BT_INT, BT_PTR) DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_UINT_UINT_BOOL, BT_BOOL, BT_UINT, BT_UINT, BT_UINT, BT_BOOL) +DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR, + BT_BOOL, BT_INT, BT_CONST_PTR, BT_CONST_PTR, BT_CONST_PTR) DEF_FUNCTION_TYPE_5 (BT_FN_VOID_OMPFN_PTR_UINT_UINT_UINT, BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, BT_UINT, diff --git a/gcc/omp-builtins.def b/gcc/omp-builtins.def index 0926f56abae..ce57a93c9d4 100644 --- a/gcc/omp-builtins.def +++ b/gcc/omp-builtins.def @@ -467,3 +467,6 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_WARNING, "GOMP_warning", BT_FN_VOID_CONST_PTR_SIZE, ATTR_NOTHROW_LEAF_LIST) DEF_GOMP_BUILTIN (BUILT_IN_GOMP_ERROR, "GOMP_error", BT_FN_VOID_CONST_PTR_SIZE, 
ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST) +DEF_GOMP_BUILTIN (BUILT_IN_GOMP_EVALUATE_TARGET_DEVICE, "GOMP_evaluate_target_device", + BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR, + ATTR_NOTHROW_LEAF_LIST) diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc index 1e66cd7bce6..48953e11997 100644 --- a/gcc/omp-general.cc +++ b/gcc/omp-general.cc @@ -1339,6 +1339,12 @@ omp_context_selector_matches (tree ctx, bool metadirective_p) ret = -1; continue; } + else if (set == 't') + { + /* The target_device set is dynamic, so treat it as always + resolvable. */ + continue; + } for (tree t2 = TREE_VALUE (t1); t2; t2 = TREE_CHAIN (t2)) { const char *sel = IDENTIFIER_POINTER (TREE_PURPOSE (t2)); @@ -2012,6 +2018,8 @@ omp_get_context_selector (tree ctx, const char *set, const char *sel) static tree omp_dynamic_cond (tree ctx) { + tree expr = NULL_TREE; + tree user = omp_get_context_selector (ctx, "user", "condition"); if (user) { @@ -2021,10 +2029,60 @@ omp_dynamic_cond (tree ctx) /* The user condition is not dynamic if it is constant. 
*/ if (!tree_fits_shwi_p (TREE_VALUE (expr_list))) - return TREE_VALUE (expr_list); + expr = TREE_VALUE (expr_list); } - return NULL_TREE; + tree target_device = omp_get_context_selector (ctx, "target_device", NULL); + if (target_device) + { + tree device_num = null_pointer_node; + tree kind = null_pointer_node; + tree arch = null_pointer_node; + tree isa = null_pointer_node; + + tree device_num_sel = omp_get_context_selector (ctx, "target_device", + "device_num"); + if (device_num_sel) + device_num = TREE_VALUE (TREE_VALUE (device_num_sel)); + else + device_num = build_int_cst (integer_type_node, -1); + + tree kind_sel = omp_get_context_selector (ctx, "target_device", "kind"); + if (kind_sel) + { + const char *str + = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (kind_sel))); + kind = build_string_literal (strlen (str) + 1, str); + } + + tree arch_sel = omp_get_context_selector (ctx, "target_device", "arch"); + if (arch_sel) + { + const char *str + = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (arch_sel))); + arch = build_string_literal (strlen (str) + 1, str); + } + + tree isa_sel = omp_get_context_selector (ctx, "target_device", "isa"); + if (isa_sel) + { + const char *str + = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (isa_sel))); + isa = build_string_literal (strlen (str) + 1, str); + } + + /* Generate a call to GOMP_evaluate_target_device. 
*/ + tree builtin_fn + = builtin_decl_explicit (BUILT_IN_GOMP_EVALUATE_TARGET_DEVICE); + tree call = build_call_expr (builtin_fn, 4, device_num, kind, arch, isa); + + if (expr == NULL_TREE) + expr = call; + else + expr = fold_build2 (TRUTH_ANDIF_EXPR, boolean_type_node, expr, call); + } + + return expr; } /* Return true iff the context selector CTX contains a dynamic element @@ -2045,9 +2103,12 @@ static bool omp_context_compute_score (tree ctx, widest_int *score, bool declare_simd) { tree construct = omp_get_context_selector (ctx, "construct", NULL); - bool has_kind = omp_get_context_selector (ctx, "device", "kind"); - bool has_arch = omp_get_context_selector (ctx, "device", "arch"); - bool has_isa = omp_get_context_selector (ctx, "device", "isa"); + bool has_kind = omp_get_context_selector (ctx, "device", "kind") + || omp_get_context_selector (ctx, "target_device", "kind"); + bool has_arch = omp_get_context_selector (ctx, "device", "arch") + || omp_get_context_selector (ctx, "target_device", "arch"); + bool has_isa = omp_get_context_selector (ctx, "device", "isa") + || omp_get_context_selector (ctx, "target_device", "isa"); bool ret = false; *score = 1; for (tree t1 = ctx; t1; t1 = TREE_CHAIN (t1)) diff --git a/gcc/testsuite/ChangeLog.omp b/gcc/testsuite/ChangeLog.omp index 14857e67f8f..53eeb8b6d96 100644 --- a/gcc/testsuite/ChangeLog.omp +++ b/gcc/testsuite/ChangeLog.omp @@ -1,3 +1,8 @@ +2022-01-25 Kwok Cheung Yeung + + * c-c++-common/gomp/metadirective-7.c: New. + * gfortran.dg/gomp/metadirective-7.f90: New. + 2022-01-25 Kwok Cheung Yeung * c-c++-common/gomp/metadirective-1.c: New. 
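
Note on the omp_dynamic_cond change above: when a variant carries both a non-constant user condition and a target_device selector set, the two are joined with TRUTH_ANDIF_EXPR, so the runtime query is only reached if the user condition holds. When no device_num selector is given, -1 is passed, and unspecified kind, arch or isa strings become null pointers. A minimal sketch of the resulting condition follows. stub_evaluate_target_device is a hypothetical stand-in for GOMP_evaluate_target_device, and its behaviour (accepting only device number -1) is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts calls so the short-circuit behaviour can be observed.  */
int evaluate_calls = 0;

/* Hypothetical stand-in for the runtime query.  The real
   GOMP_evaluate_target_device lives in libgomp; this stub simply
   pretends that only device number -1 matches.  */
bool
stub_evaluate_target_device (int device_num, const char *kind,
			     const char *arch, const char *isa)
{
  (void) kind;
  (void) arch;
  (void) isa;
  evaluate_calls++;
  return device_num == -1;
}

/* Shape of the condition built for a variant such as
     when (user={condition(flag)},
	   target_device={kind("gpu"), arch("nvptx")}: ...)
   No device_num selector is present, so -1 is passed, and isa is
   a null pointer because it was not specified.  */
bool
variant_applies (bool flag)
{
  return flag && stub_evaluate_target_device (-1, "gpu", "nvptx", NULL);
}
```

Calling variant_applies (false) leaves evaluate_calls unchanged, which mirrors the short-circuit evaluation of the generated expression.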
diff --git a/gcc/testsuite/c-c++-common/gomp/metadirective-7.c b/gcc/testsuite/c-c++-common/gomp/metadirective-7.c new file mode 100644 index 00000000000..cf695aa24cb --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/metadirective-7.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-fdump-tree-gimple" } */ + +#define N 256 + +void f (int a[], int num) +{ + int i; + + #pragma omp metadirective \ + when (target_device={device_num(num), kind("gpu"), arch("nvptx")}: \ + target parallel for map(tofrom: a[0:N])) \ + when (target_device={device_num(num), kind("gpu"), \ + arch("amdgcn"), isa("gfx906")}: \ + target parallel for) \ + when (target_device={device_num(num), kind("cpu"), arch("x86_64")}: \ + parallel for) + for (i = 0; i < N; i++) + a[i] += i; + + #pragma omp metadirective \ + when (target_device={kind("gpu"), arch("nvptx")}: \ + target parallel for map(tofrom: a[0:N])) + for (i = 0; i < N; i++) + a[i] += i; +} + +/* { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(num, &\"gpu\"\\\[0\\\], &\"amdgcn\"\\\[0\\\], &\"gfx906\"\\\[0\\\]\\)" "gimple" } } */ +/* { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(num, &\"gpu\"\\\[0\\\], &\"nvptx\"\\\[0\\\], 0B\\)" "gimple" } } */ +/* { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(num, &\"cpu\"\\\[0\\\], &\"x86_64\"\\\[0\\\], 0B\\)" "gimple" } } */ +/* { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(-1, &\"gpu\"\\\[0\\\], &\"nvptx\"\\\[0\\\], 0B\\)" "gimple" } } */ diff --git a/gcc/testsuite/gfortran.dg/gomp/metadirective-7.f90 b/gcc/testsuite/gfortran.dg/gomp/metadirective-7.f90 new file mode 100644 index 00000000000..870ea192fbc --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/metadirective-7.f90 @@ -0,0 +1,36 @@ +! { dg-do compile } +! 
{ dg-additional-options "-fdump-tree-gimple" } + +program main + integer, parameter :: N = 256 +contains + subroutine f (a, num) + integer :: a(N) + integer :: num + integer :: i + + !$omp metadirective & + !$omp& when (target_device={device_num(num), kind("gpu"), arch("nvptx")}: & + !$omp& target parallel do map(tofrom: a(1:N))) & + !$omp& when (target_device={device_num(num), kind("gpu"), & + !$omp& arch("amdgcn"), isa("gfx906")}: & + !$omp& target parallel do) & + !$omp& when (target_device={device_num(num), kind("cpu"), arch("x86_64")}: & + !$omp& parallel do) + do i = 1, N + a(i) = a(i) + i + end do + + !$omp metadirective & + !$omp& when (target_device={kind("gpu"), arch("nvptx")}: & + !$omp& target parallel do map(tofrom: a(1:N))) + do i = 1, N + a(i) = a(i) + i + end do + end subroutine +end program + +! { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(.+, &\"gpu\"\\\[0\\\], &\"amdgcn\"\\\[0\\\], &\"gfx906\"\\\[0\\\]\\)" "gimple" } } +! { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(.+, &\"gpu\"\\\[0\\\], &\"nvptx\"\\\[0\\\], 0B\\)" "gimple" } } +! { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(.+, &\"cpu\"\\\[0\\\], &\"x86_64\"\\\[0\\\], 0B\\)" "gimple" } } +! { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(-1, &\"gpu\"\\\[0\\\], &\"nvptx\"\\\[0\\\], 0B\\)" "gimple" } } diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp index 80fb80d7079..0d47950d20d 100644 --- a/libgomp/ChangeLog.omp +++ b/libgomp/ChangeLog.omp @@ -1,3 +1,31 @@ +2022-01-25 Kwok Cheung Yeung + + * Makefile.am (libgomp_la_SOURCES): Add selector.c. + * Makefile.in: Regenerate. + * config/gcn/selector.c: New. + * config/linux/selector.c: New. + * config/linux/x86/selector.c: New. + * config/nvptx/selector.c: New. + * libgomp-plugin.h (GOMP_OFFLOAD_evaluate_device): New. + * libgomp.h (struct gomp_device_descr): Add evaluate_device_func field. 
+ * libgomp.map (GOMP_5.1): Add GOMP_evaluate_target_device. + * libgomp_g.h (GOMP_evaluate_current_device): New. + (GOMP_evaluate_target_device): New. + * oacc-host.c (host_evaluate_device): New. + (host_openacc_exec): Initialize evaluate_device_func field to + host_evaluate_device. + * plugin/plugin-gcn.c (GOMP_OFFLOAD_evaluate_device): New. + * plugin/plugin-nvptx.c (struct ptx_device): Add compute_major and + compute_minor fields. + (nvptx_open_device): Read compute capability information from device. + (CHECK_ISA): New macro. + (GOMP_OFFLOAD_evaluate_device): New. + * selector.c: New. + * target.c (GOMP_evaluate_target_device): New. + (gomp_load_plugin_for_device): Load evaluate_device plugin function. + * testsuite/libgomp.c-c++-common/metadirective-5.c: New testcase. + * testsuite/libgomp.fortran/metadirective-5.f90: New testcase. + 2022-01-25 Kwok Cheung Yeung * testsuite/libgomp.c-c++-common/metadirective-1.c: New. diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am index fa3104f7321..273c7fc89aa 100644 --- a/libgomp/Makefile.am +++ b/libgomp/Makefile.am @@ -63,10 +63,10 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c error.c \ icv.c icv-device.c iter.c iter_ull.c loop.c loop_ull.c ordered.c \ parallel.c scope.c sections.c single.c task.c team.c work.c lock.c \ mutex.c proc.c sem.c bar.c ptrlock.c time.c fortran.c affinity.c \ - target.c splay-tree.c libgomp-plugin.c oacc-parallel.c oacc-host.c \ - oacc-init.c oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c \ - priority_queue.c affinity-fmt.c teams.c allocator.c oacc-profiling.c \ - oacc-target.c oacc-profiling-acc_register_library.c + selector.c target.c splay-tree.c libgomp-plugin.c oacc-parallel.c \ + oacc-host.c oacc-init.c oacc-mem.c oacc-async.c oacc-plugin.c \ + oacc-cuda.c priority_queue.c affinity-fmt.c teams.c allocator.c \ + oacc-profiling.c oacc-target.c oacc-profiling-acc_register_library.c include $(top_srcdir)/plugin/Makefrag.am diff --git a/libgomp/Makefile.in 
b/libgomp/Makefile.in index 5026d6a546a..0bc746e67e9 100644 --- a/libgomp/Makefile.in +++ b/libgomp/Makefile.in @@ -216,12 +216,12 @@ am_libgomp_la_OBJECTS = alloc.lo atomic.lo barrier.lo critical.lo \ loop.lo loop_ull.lo ordered.lo parallel.lo scope.lo \ sections.lo single.lo task.lo team.lo work.lo lock.lo mutex.lo \ proc.lo sem.lo bar.lo ptrlock.lo time.lo fortran.lo \ - affinity.lo target.lo splay-tree.lo libgomp-plugin.lo \ - oacc-parallel.lo oacc-host.lo oacc-init.lo oacc-mem.lo \ - oacc-async.lo oacc-plugin.lo oacc-cuda.lo priority_queue.lo \ - affinity-fmt.lo teams.lo allocator.lo oacc-profiling.lo \ - oacc-target.lo oacc-profiling-acc_register_library.lo \ - $(am__objects_1) + affinity.lo selector.lo target.lo splay-tree.lo \ + libgomp-plugin.lo oacc-parallel.lo oacc-host.lo oacc-init.lo \ + oacc-mem.lo oacc-async.lo oacc-plugin.lo oacc-cuda.lo \ + priority_queue.lo affinity-fmt.lo teams.lo allocator.lo \ + oacc-profiling.lo oacc-target.lo \ + oacc-profiling-acc_register_library.lo $(am__objects_1) libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS) AM_V_P = $(am__v_P_@AM_V@) am__v_P_ = $(am__v_P_@AM_DEFAULT_V@) @@ -557,12 +557,12 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c \ error.c icv.c icv-device.c iter.c iter_ull.c loop.c loop_ull.c \ ordered.c parallel.c scope.c sections.c single.c task.c team.c \ work.c lock.c mutex.c proc.c sem.c bar.c ptrlock.c time.c \ - fortran.c affinity.c target.c splay-tree.c libgomp-plugin.c \ - oacc-parallel.c oacc-host.c oacc-init.c oacc-mem.c \ - oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \ - affinity-fmt.c teams.c allocator.c oacc-profiling.c \ - oacc-target.c oacc-profiling-acc_register_library.c \ - $(am__append_3) + fortran.c affinity.c selector.c target.c splay-tree.c \ + libgomp-plugin.c oacc-parallel.c oacc-host.c oacc-init.c \ + oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c \ + priority_queue.c affinity-fmt.c teams.c allocator.c \ + oacc-profiling.c oacc-target.c \ + 
oacc-profiling-acc_register_library.c $(am__append_3) # Nvidia PTX OpenACC plugin. @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info $(libtool_VERSION) @@ -775,6 +775,7 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ptrlock.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/scope.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sections.Plo@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/selector.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sem.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/single.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/splay-tree.Plo@am__quote@ diff --git a/libgomp/config/gcn/selector.c b/libgomp/config/gcn/selector.c new file mode 100644 index 00000000000..60793fc05d3 --- /dev/null +++ b/libgomp/config/gcn/selector.c @@ -0,0 +1,57 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + Contributed by Mentor, a Siemens Business. + + This file is part of the GNU Offloading and Multi Processing Library + (libgomp). + + Libgomp is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . 
*/ + +/* This file contains an implementation of GOMP_evaluate_current_device for + an AMD GCN GPU. */ + +#include "libgomp.h" +#include <string.h> + +bool +GOMP_evaluate_current_device (const char *kind, const char *arch, + const char *isa) +{ + if (kind && strcmp (kind, "gpu") != 0) + return false; + + if (arch && strcmp (arch, "gcn") != 0) + return false; + + if (!isa) + return true; + +#ifdef __GCN3__ + if (strcmp (isa, "fiji") == 0 || strcmp (isa, "gfx803") == 0) + return true; +#endif + +#ifdef __GCN5__ + if (strcmp (isa, "gfx900") == 0 || strcmp (isa, "gfx906") == 0 + || strcmp (isa, "gfx908") == 0) + return true; +#endif + + return false; +} diff --git a/libgomp/config/linux/selector.c b/libgomp/config/linux/selector.c new file mode 100644 index 00000000000..84e59c7aabe --- /dev/null +++ b/libgomp/config/linux/selector.c @@ -0,0 +1,43 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + Contributed by Mentor, a Siemens Business. + + This file is part of the GNU Offloading and Multi Processing Library + (libgomp). + + Libgomp is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . 
*/ + +/* This file contains a generic implementation of + GOMP_evaluate_current_device when run on a Linux host. */ + +#include <string.h> +#include "libgomp.h" + +bool +GOMP_evaluate_current_device (const char *kind, const char *arch, + const char *isa) +{ + if (kind && strcmp (kind, "cpu") != 0) + return false; + + if (!arch && !isa) + return true; + + return false; +} diff --git a/libgomp/config/linux/x86/selector.c b/libgomp/config/linux/x86/selector.c new file mode 100644 index 00000000000..2b6c2ba165b --- /dev/null +++ b/libgomp/config/linux/x86/selector.c @@ -0,0 +1,325 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + Contributed by Mentor, a Siemens Business. + + This file is part of the GNU Offloading and Multi Processing Library + (libgomp). + + Libgomp is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + +/* This file contains an implementation of GOMP_evaluate_current_device for + an x86/x64-based Linux host. 
*/ + +#include <string.h> +#include "libgomp.h" + +bool +GOMP_evaluate_current_device (const char *kind, const char *arch, + const char *isa) +{ + if (kind && strcmp (kind, "cpu") != 0) + return false; + + if (arch + && strcmp (arch, "x86") != 0 + && strcmp (arch, "ia32") != 0 +#ifdef __x86_64__ + && strcmp (arch, "x86_64") != 0 +#endif +#ifdef __ILP32__ + && strcmp (arch, "x32") != 0 +#endif + && strcmp (arch, "i386") != 0 + && strcmp (arch, "i486") != 0 +#ifndef __i486__ + && strcmp (arch, "i586") != 0 +#endif +#if !defined (__i486__) && !defined (__i586__) + && strcmp (arch, "i686") != 0 +#endif + ) + return false; + + if (!isa) + return true; + +#ifdef __WBNOINVD__ + if (strcmp (isa, "wbnoinvd") == 0) return true; +#endif +#ifdef __AVX512VP2INTERSECT__ + if (strcmp (isa, "avx512vp2intersect") == 0) return true; +#endif +#ifdef __MMX__ + if (strcmp (isa, "mmx") == 0) return true; +#endif +#ifdef __3dNOW__ + if (strcmp (isa, "3dnow") == 0) return true; +#endif +#ifdef __3dNOW_A__ + if (strcmp (isa, "3dnowa") == 0) return true; +#endif +#ifdef __SSE__ + if (strcmp (isa, "sse") == 0) return true; +#endif +#ifdef __SSE2__ + if (strcmp (isa, "sse2") == 0) return true; +#endif +#ifdef __SSE3__ + if (strcmp (isa, "sse3") == 0) return true; +#endif +#ifdef __SSSE3__ + if (strcmp (isa, "ssse3") == 0) return true; +#endif +#ifdef __SSE4_1__ + if (strcmp (isa, "sse4.1") == 0) return true; +#endif +#ifdef __SSE4_2__ + if (strcmp (isa, "sse4") == 0 || strcmp (isa, "sse4.2") == 0) return true; +#endif +#ifdef __AES__ + if (strcmp (isa, "aes") == 0) return true; +#endif +#ifdef __SHA__ + if (strcmp (isa, "sha") == 0) return true; +#endif +#ifdef __PCLMUL__ + if (strcmp (isa, "pclmul") == 0) return true; +#endif +#ifdef __AVX__ + if (strcmp (isa, "avx") == 0) return true; +#endif +#ifdef __AVX2__ + if (strcmp (isa, "avx2") == 0) return true; +#endif +#ifdef __AVX512F__ + if (strcmp (isa, "avx512f") == 0) return true; +#endif +#ifdef __AVX512ER__ + if (strcmp (isa, "avx512er") == 0) return 
true; +#endif +#ifdef __AVX512CD__ + if (strcmp (isa, "avx512cd") == 0) return true; +#endif +#ifdef __AVX512PF__ + if (strcmp (isa, "avx512pf") == 0) return true; +#endif +#ifdef __AVX512DQ__ + if (strcmp (isa, "avx512dq") == 0) return true; +#endif +#ifdef __AVX512BW__ + if (strcmp (isa, "avx512bw") == 0) return true; +#endif +#ifdef __AVX512VL__ + if (strcmp (isa, "avx512vl") == 0) return true; +#endif +#ifdef __AVX512VBMI__ + if (strcmp (isa, "avx512vbmi") == 0) return true; +#endif +#ifdef __AVX512IFMA__ + if (strcmp (isa, "avx512ifma") == 0) return true; +#endif +#ifdef __AVX5124VNNIW__ + if (strcmp (isa, "avx5124vnniw") == 0) return true; +#endif +#ifdef __AVX512VBMI2__ + if (strcmp (isa, "avx512vbmi2") == 0) return true; +#endif +#ifdef __AVX512VNNI__ + if (strcmp (isa, "avx512vnni") == 0) return true; +#endif +#ifdef __PCONFIG__ + if (strcmp (isa, "pconfig") == 0) return true; +#endif +#ifdef __SGX__ + if (strcmp (isa, "sgx") == 0) return true; +#endif +#ifdef __AVX5124FMAPS__ + if (strcmp (isa, "avx5124fmaps") == 0) return true; +#endif +#ifdef __AVX512BITALG__ + if (strcmp (isa, "avx512bitalg") == 0) return true; +#endif +#ifdef __AVX512VPOPCNTDQ__ + if (strcmp (isa, "avx512vpopcntdq") == 0) return true; +#endif +#ifdef __FMA__ + if (strcmp (isa, "fma") == 0) return true; +#endif +#ifdef __RTM__ + if (strcmp (isa, "rtm") == 0) return true; +#endif +#ifdef __SSE4A__ + if (strcmp (isa, "sse4a") == 0) return true; +#endif +#ifdef __FMA4__ + if (strcmp (isa, "fma4") == 0) return true; +#endif +#ifdef __XOP__ + if (strcmp (isa, "xop") == 0) return true; +#endif +#ifdef __LWP__ + if (strcmp (isa, "lwp") == 0) return true; +#endif +#ifdef __ABM__ + if (strcmp (isa, "abm") == 0) return true; +#endif +#ifdef __BMI__ + if (strcmp (isa, "bmi") == 0) return true; +#endif +#ifdef __BMI2__ + if (strcmp (isa, "bmi2") == 0) return true; +#endif +#ifdef __LZCNT__ + if (strcmp (isa, "lzcnt") == 0) return true; +#endif +#ifdef __TBM__ + if (strcmp (isa, "tbm") == 0) return 
true; +#endif +#ifdef __CRC32__ + if (strcmp (isa, "crc32") == 0) return true; +#endif +#ifdef __POPCNT__ + if (strcmp (isa, "popcnt") == 0) return true; +#endif +#ifdef __FSGSBASE__ + if (strcmp (isa, "fsgsbase") == 0) return true; +#endif +#ifdef __RDRND__ + if (strcmp (isa, "rdrnd") == 0) return true; +#endif +#ifdef __F16C__ + if (strcmp (isa, "f16c") == 0) return true; +#endif +#ifdef __RDSEED__ + if (strcmp (isa, "rdseed") == 0) return true; +#endif +#ifdef __PRFCHW__ + if (strcmp (isa, "prfchw") == 0) return true; +#endif +#ifdef __ADX__ + if (strcmp (isa, "adx") == 0) return true; +#endif +#ifdef __FXSR__ + if (strcmp (isa, "fxsr") == 0) return true; +#endif +#ifdef __XSAVE__ + if (strcmp (isa, "xsave") == 0) return true; +#endif +#ifdef __XSAVEOPT__ + if (strcmp (isa, "xsaveopt") == 0) return true; +#endif +#ifdef __PREFETCHWT1__ + if (strcmp (isa, "prefetchwt1") == 0) return true; +#endif +#ifdef __CLFLUSHOPT__ + if (strcmp (isa, "clflushopt") == 0) return true; +#endif +#ifdef __CLZERO__ + if (strcmp (isa, "clzero") == 0) return true; +#endif +#ifdef __XSAVEC__ + if (strcmp (isa, "xsavec") == 0) return true; +#endif +#ifdef __XSAVES__ + if (strcmp (isa, "xsaves") == 0) return true; +#endif +#ifdef __CLWB__ + if (strcmp (isa, "clwb") == 0) return true; +#endif +#ifdef __MWAITX__ + if (strcmp (isa, "mwaitx") == 0) return true; +#endif +#ifdef __PKU__ + if (strcmp (isa, "pku") == 0) return true; +#endif +#ifdef __RDPID__ + if (strcmp (isa, "rdpid") == 0) return true; +#endif +#ifdef __GFNI__ + if (strcmp (isa, "gfni") == 0) return true; +#endif +#ifdef __SHSTK__ + if (strcmp (isa, "shstk") == 0) return true; +#endif +#ifdef __VAES__ + if (strcmp (isa, "vaes") == 0) return true; +#endif +#ifdef __VPCLMULQDQ__ + if (strcmp (isa, "vpclmulqdq") == 0) return true; +#endif +#ifdef __MOVDIRI__ + if (strcmp (isa, "movdiri") == 0) return true; +#endif +#ifdef __MOVDIR64B__ + if (strcmp (isa, "movdir64b") == 0) return true; +#endif +#ifdef __WAITPKG__ + if (strcmp 
(isa, "waitpkg") == 0) return true; +#endif +#ifdef __CLDEMOTE__ + if (strcmp (isa, "cldemote") == 0) return true; +#endif +#ifdef __SERIALIZE__ + if (strcmp (isa, "serialize") == 0) return true; +#endif +#ifdef __PTWRITE__ + if (strcmp (isa, "ptwrite") == 0) return true; +#endif +#ifdef __AVX512BF16__ + if (strcmp (isa, "avx512bf16") == 0) return true; +#endif +#ifdef __AVX512FP16__ + if (strcmp (isa, "avx512fp16") == 0) return true; +#endif +#ifdef __ENQCMD__ + if (strcmp (isa, "enqcmd") == 0) return true; +#endif +#ifdef __TSXLDTRK__ + if (strcmp (isa, "tsxldtrk") == 0) return true; +#endif +#ifdef __AMX_TILE__ + if (strcmp (isa, "amx-tile") == 0) return true; +#endif +#ifdef __AMX_INT8__ + if (strcmp (isa, "amx-int8") == 0) return true; +#endif +#ifdef __AMX_BF16__ + if (strcmp (isa, "amx-bf16") == 0) return true; +#endif +#ifdef __LAHF_SAHF__ + if (strcmp (isa, "sahf") == 0) return true; +#endif +#ifdef __MOVBE__ + if (strcmp (isa, "movbe") == 0) return true; +#endif +#ifdef __UINTR__ + if (strcmp (isa, "uintr") == 0) return true; +#endif +#ifdef __HRESET__ + if (strcmp (isa, "hreset") == 0) return true; +#endif +#ifdef __KL__ + if (strcmp (isa, "kl") == 0) return true; +#endif +#ifdef __WIDEKL__ + if (strcmp (isa, "widekl") == 0) return true; +#endif + + return false; +} diff --git a/libgomp/config/nvptx/selector.c b/libgomp/config/nvptx/selector.c new file mode 100644 index 00000000000..50b5f9020ac --- /dev/null +++ b/libgomp/config/nvptx/selector.c @@ -0,0 +1,65 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + Contributed by Mentor, a Siemens Business. + + This file is part of the GNU Offloading and Multi Processing Library + (libgomp). + + Libgomp is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. 
+ + Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + +/* This file contains an implementation of GOMP_evaluate_current_device for + an Nvidia GPU. */ + +#include "libgomp.h" +#include <string.h> + +bool +GOMP_evaluate_current_device (const char *kind, const char *arch, + const char *isa) +{ + if (kind && strcmp (kind, "gpu") != 0) + return false; + + if (arch && strcmp (arch, "nvptx") != 0) + return false; + + if (!isa) + return true; + + if (strcmp (isa, "sm_30") == 0) + return true; +#if __PTX_SM__ >= 350 + if (strcmp (isa, "sm_35") == 0) + return true; +#endif +#if __PTX_SM__ >= 530 + if (strcmp (isa, "sm_53") == 0) + return true; +#endif +#if __PTX_SM__ >= 750 + if (strcmp (isa, "sm_75") == 0) + return true; +#endif +#if __PTX_SM__ >= 800 + if (strcmp (isa, "sm_80") == 0) + return true; +#endif + + return false; +} diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h index 73d2ed813eb..7f8087a51ae 100644 --- a/libgomp/libgomp-plugin.h +++ b/libgomp/libgomp-plugin.h @@ -141,6 +141,8 @@ extern bool GOMP_OFFLOAD_dev2dev (int, void *, const void *, size_t); extern bool GOMP_OFFLOAD_can_run (void *); extern void GOMP_OFFLOAD_run (int, void *, void *, void **); extern void GOMP_OFFLOAD_async_run (int, void *, void *, void **, void *); +extern bool GOMP_OFFLOAD_evaluate_device (int, const char *, const char *, + const char *); extern void GOMP_OFFLOAD_openacc_exec (void (*) (void 
*), size_t, void **, void **, unsigned *, void *); diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index f50c21d1fdd..5c97f04dde4 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -1245,6 +1245,7 @@ struct gomp_device_descr __typeof (GOMP_OFFLOAD_can_run) *can_run_func; __typeof (GOMP_OFFLOAD_run) *run_func; __typeof (GOMP_OFFLOAD_async_run) *async_run_func; + __typeof (GOMP_OFFLOAD_evaluate_device) *evaluate_device_func; /* Splay tree containing information about mapped memory regions. */ struct splay_tree_s mem_map; diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map index 2ac58094169..9d94aba3f70 100644 --- a/libgomp/libgomp.map +++ b/libgomp/libgomp.map @@ -400,6 +400,7 @@ GOMP_5.1 { GOMP_scope_start; GOMP_warning; GOMP_teams4; + GOMP_evaluate_target_device; } GOMP_5.0.1; OACC_2.0 { diff --git a/libgomp/libgomp_g.h b/libgomp/libgomp_g.h index 3985f9ec68c..a8e29c576c4 100644 --- a/libgomp/libgomp_g.h +++ b/libgomp/libgomp_g.h @@ -336,6 +336,11 @@ extern void GOMP_single_copy_end (void *); extern void GOMP_scope_start (uintptr_t *); +/* selector.c */ + +extern bool GOMP_evaluate_current_device (const char *, const char *, + const char *); + /* target.c */ extern void GOMP_target (int, void (*) (void *), const void *, @@ -357,6 +362,9 @@ extern void GOMP_target_enter_exit_data (int, size_t, void **, size_t *, extern void GOMP_teams (unsigned int, unsigned int); extern bool GOMP_teams4 (unsigned int, unsigned int, unsigned int, bool); +extern bool GOMP_evaluate_target_device (int, const char *, const char *, + const char *); + /* teams.c */ extern void GOMP_teams_reg (void (*) (void *), void *, unsigned, unsigned, diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c index 8e97fc4f0ab..0ac70af8111 100644 --- a/libgomp/oacc-host.c +++ b/libgomp/oacc-host.c @@ -140,6 +140,16 @@ host_run (int n __attribute__ ((unused)), void *fn_ptr, void *vars, fn (vars); } +static bool +host_evaluate_device (int device_num __attribute__ ((unused)), + const char *kind 
__attribute__ ((unused)), + const char *arch __attribute__ ((unused)), + const char *isa __attribute__ ((unused))) +{ + __builtin_unreachable (); + return false; +} + static void host_openacc_exec (void (*fn) (void *), size_t mapnum __attribute__ ((unused)), @@ -287,6 +297,7 @@ static struct gomp_device_descr host_dispatch = .dev2host_func = host_dev2host, .host2dev_func = host_host2dev, .run_func = host_run, + .evaluate_device_func = host_evaluate_device, .mem_map = { NULL }, /* .lock initialized in goacc_host_init. */ diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c index 37fda2342d9..bbdc6587d18 100644 --- a/libgomp/plugin/plugin-gcn.c +++ b/libgomp/plugin/plugin-gcn.c @@ -3799,6 +3799,20 @@ GOMP_OFFLOAD_async_run (int device, void *tgt_fn, void *tgt_vars, GOMP_PLUGIN_target_task_completion, async_data); } +bool +GOMP_OFFLOAD_evaluate_device (int device_num, const char *kind, + const char *arch, const char *isa) +{ + struct agent_info *agent = get_agent_info (device_num); + + if (kind && strcmp (kind, "gpu") != 0) + return false; + if (arch && strcmp (arch, "gcn") != 0) + return false; + + return !isa || isa_code (isa) == agent->device_isa; +} + /* }}} */ /* {{{ OpenACC Plugin API */ diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index cef64847976..39450b61957 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -311,6 +311,7 @@ struct ptx_device int max_threads_per_block; int max_threads_per_multiprocessor; int default_dims[GOMP_DIM_MAX]; + int compute_major, compute_minor; /* Length as used by the CUDA Runtime API ('struct cudaDeviceProp'). 
*/ char name[256]; @@ -532,6 +533,14 @@ nvptx_open_device (int n) for (int i = 0; i != GOMP_DIM_MAX; i++) ptx_dev->default_dims[i] = 0; + CUDA_CALL_ERET (NULL, cuDeviceGetAttribute, &pi, + CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, dev); + ptx_dev->compute_major = pi; + + CUDA_CALL_ERET (NULL, cuDeviceGetAttribute, &pi, + CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, dev); + ptx_dev->compute_minor = pi; + CUDA_CALL_ERET (NULL, cuDeviceGetName, ptx_dev->name, sizeof ptx_dev->name, dev); @@ -2067,3 +2076,40 @@ GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, void **args) } /* TODO: Implement GOMP_OFFLOAD_async_run. */ + +#define CHECK_ISA(major, minor) \ + if (((device->compute_major == major && device->compute_minor >= minor) \ + || device->compute_major > major) \ + && strcmp (isa, "sm_"#major#minor) == 0) \ + return true + +bool +GOMP_OFFLOAD_evaluate_device (int device_num, const char *kind, + const char *arch, const char *isa) +{ + if (kind && strcmp (kind, "gpu") != 0) + return false; + if (arch && strcmp (arch, "nvptx") != 0) + return false; + if (!isa) + return true; + + struct ptx_device *device = ptx_devices[device_num]; + + CHECK_ISA (3, 0); + CHECK_ISA (3, 5); + CHECK_ISA (3, 7); + CHECK_ISA (5, 0); + CHECK_ISA (5, 2); + CHECK_ISA (5, 3); + CHECK_ISA (6, 0); + CHECK_ISA (6, 1); + CHECK_ISA (6, 2); + CHECK_ISA (7, 0); + CHECK_ISA (7, 2); + CHECK_ISA (7, 5); + CHECK_ISA (8, 0); + CHECK_ISA (8, 6); + + return false; +} diff --git a/libgomp/selector.c b/libgomp/selector.c new file mode 100644 index 00000000000..dc920ee065f --- /dev/null +++ b/libgomp/selector.c @@ -0,0 +1,36 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + Contributed by Mentor, a Siemens Business. + + This file is part of the GNU Offloading and Multi Processing Library + (libgomp). 
+ + Libgomp is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + +/* This file contains a placeholder implementation of + GOMP_evaluate_current_device. */ + +#include "libgomp.h" + +bool +GOMP_evaluate_current_device (const char *kind, const char *arch, + const char *isa) +{ + return false; +} diff --git a/libgomp/target.c b/libgomp/target.c index 3e81e40be92..d5fb1aa93de 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -3895,6 +3895,43 @@ omp_pause_resource_all (omp_pause_resource_t kind) ialias (omp_pause_resource) ialias (omp_pause_resource_all) +bool +GOMP_evaluate_target_device (int device_num, const char *kind, + const char *arch, const char *isa) +{ + bool result = true; + + if (device_num < 0) + device_num = omp_get_default_device (); + + if (kind && strcmp (kind, "any") == 0) + kind = NULL; + + gomp_debug (1, "%s: device_num = %u, kind=%s, arch=%s, isa=%s", + __FUNCTION__, device_num, kind, arch, isa); + + if (omp_get_device_num () == device_num) + result = GOMP_evaluate_current_device (kind, arch, isa); + else + { + if (!omp_is_initial_device ()) + /* Accelerators are not expected to know about other devices. 
*/ + result = false; + else + { + struct gomp_device_descr *device = resolve_device (device_num); + if (device == NULL) + result = false; + else if (device->evaluate_device_func) + result = device->evaluate_device_func (device_num, kind, arch, + isa); + } + } + + gomp_debug (1, " -> %s\n", result ? "true" : "false"); + return result; +} + #ifdef PLUGIN_SUPPORT /* This function tries to load a plugin for DEVICE. Name of plugin is passed @@ -3948,6 +3985,7 @@ gomp_load_plugin_for_device (struct gomp_device_descr *device, DLSYM (free); DLSYM (dev2host); DLSYM (host2dev); + DLSYM (evaluate_device); device->capabilities = device->get_caps_func (); if (device->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400) { diff --git a/libgomp/testsuite/Makefile.in b/libgomp/testsuite/Makefile.in index e48c3f2f9b0..5eed05f5dde 100644 --- a/libgomp/testsuite/Makefile.in +++ b/libgomp/testsuite/Makefile.in @@ -284,6 +284,7 @@ pdfdir = @pdfdir@ prefix = @prefix@ program_transform_name = @program_transform_name@ psdir = @psdir@ +runstatedir = @runstatedir@ sbindir = @sbindir@ sharedstatedir = @sharedstatedir@ srcdir = @srcdir@ diff --git a/libgomp/testsuite/libgomp.c-c++-common/metadirective-5.c b/libgomp/testsuite/libgomp.c-c++-common/metadirective-5.c new file mode 100644 index 00000000000..e8ab7ccb166 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/metadirective-5.c @@ -0,0 +1,46 @@ +/* { dg-do run } */ + +#define N 100 + +#include +#include + +int f(int a[], int num) +{ + int on_device = 0; + int i; + + #pragma omp metadirective \ + when (target_device={device_num(num), kind("gpu")}: \ + target parallel for map(to: a[0:N]), map(from: on_device)) \ + default (parallel for private (on_device)) + for (i = 0; i < N; i++) + { + a[i] += i; + on_device = 1; + } + + return on_device; +} + +int main (void) +{ + int a[N]; + int on_device_count = 0; + int i; + + for (i = 0; i < N; i++) + a[i] = i; + + for (i = 0; i <= omp_get_num_devices (); i++) + on_device_count += f (a, i); + + if 
(on_device_count != omp_get_num_devices ()) + return 1; + + for (i = 0; i < N; i++) + if (a[i] != 2 * i) + return 2; + + return 0; +} diff --git a/libgomp/testsuite/libgomp.fortran/metadirective-5.f90 b/libgomp/testsuite/libgomp.fortran/metadirective-5.f90 new file mode 100644 index 00000000000..3992286dc08 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/metadirective-5.f90 @@ -0,0 +1,44 @@ +! { dg-do run } + +program main + use omp_lib + + implicit none + + integer, parameter :: N = 100 + integer :: a(N) + integer :: on_device_count = 0 + integer :: i + + do i = 1, N + a(i) = i + end do + + do i = 0, omp_get_num_devices () + on_device_count = on_device_count + f (a, i) + end do + + if (on_device_count .ne. omp_get_num_devices ()) stop 1 + + do i = 1, N + if (a(i) .ne. 2 * i) stop 2; + end do +contains + integer function f (a, num) + integer, intent(inout) :: a(N) + integer, intent(in) :: num + integer :: on_device + integer :: i + + on_device = 0 + !$omp metadirective & + !$omp& when (target_device={device_num(num), kind("gpu")}: & + !$omp& target parallel do map(to: a(1:N)), map(from: on_device)) & + !$omp& default (parallel do private(on_device)) + do i = 1, N + a(i) = a(i) + i + on_device = 1 + end do + f = on_device; + end function +end program
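
Illustration only, not part of the commit: the comparison performed by the CHECK_ISA macro in plugin/plugin-nvptx.c can be modelled outside libgomp roughly as below. This is a minimal sketch under stated assumptions. The names fake_ptx_device and fake_device_supports_isa are hypothetical and do not exist in libgomp; the table of levels simply mirrors the CHECK_ISA calls in the patch, and a loop replaces the macro expansion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the two fields the patch adds to
   struct ptx_device.  Not a libgomp type.  */
struct fake_ptx_device
{
  int compute_major;
  int compute_minor;
};

/* Hypothetical helper: return true if ISA is NULL, or if ISA names a
   known "sm_XY" level and the device's compute capability is at
   least X.Y.  Unknown names are rejected.  */
static bool
fake_device_supports_isa (const struct fake_ptx_device *dev, const char *isa)
{
  /* Same (major, minor) pairs as the CHECK_ISA calls in the patch.  */
  static const int levels[][2] = {
    {3, 0}, {3, 5}, {3, 7}, {5, 0}, {5, 2}, {5, 3}, {6, 0},
    {6, 1}, {6, 2}, {7, 0}, {7, 2}, {7, 5}, {8, 0}, {8, 6}
  };
  char name[16];

  assert (dev != NULL);

  /* No isa selector given, so there is nothing to reject.  */
  if (isa == NULL)
    return true;

  for (size_t i = 0; i < sizeof levels / sizeof levels[0]; i++)
    {
      int major = levels[i][0];
      int minor = levels[i][1];

      /* Build the selector name for this level, e.g. "sm_35".  */
      snprintf (name, sizeof name, "sm_%d%d", major, minor);
      if (strcmp (isa, name) != 0)
        continue;

      /* The device matches if its capability is at least major.minor.  */
      return dev->compute_major > major
             || (dev->compute_major == major
                 && dev->compute_minor >= minor);
    }

  return false;
}
```

With a device reporting compute capability 7.5, this sketch accepts "sm_35" and "sm_75" and rejects "sm_80", which is the "at least this level" behaviour the macro encodes. The real plugin reads the capability from cuDeviceGetAttribute in nvptx_open_device; here it is just filled in by hand.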