* [Patch] OpenMP: Support complex/float in && and || reduction
@ 2021-04-30 23:12 Tobias Burnus
  2021-05-03 17:38 ` Jakub Jelinek
From: Tobias Burnus @ 2021-04-30 23:12 UTC
  To: gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 753 bytes --]

C/C++ permit the use of || and && (logical OR and logical AND) with
floating-point and complex scalars; each operand is compared unequal to
zero and the result has the value 0 or 1 and is of type 'int' (in C;
C++ yields 'bool').

While || and && with floating-point operands is somewhat sensible, it
does not really make sense to use a non-bool/non-integer as the
reduction variable. However, as C/C++ permit this and OpenMP follows
the base language, this patch implements it.

OK for mainline?

Tobias

PS: The https://github.com/clang-ykt/omptests testsuite uses it and
this patch silences 1008 'error:' lines.

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

[-- Attachment #2: red-and-or-diff-v2.diff --]
[-- Type: text/x-patch, Size: 21754 bytes --]

OpenMP: Support complex/float in && and || reduction

C/C++ permit logical AND and logical OR also with floating-point or
complex arguments by doing an unequal zero comparison; the result is an
'int' with value one or zero.  Hence, those are also permitted as
reduction variable, even though it is not the most sensible thing to do.

gcc/c/ChangeLog:

	* c-typeck.c (c_finish_omp_clauses): Accept float + complex for || and &&
	reductions.

gcc/cp/ChangeLog:

	* semantics.c (finish_omp_reduction_clause): Accept float + complex for || and &&
	reductions.

gcc/ChangeLog:

	* omp-low.c (lower_rec_input_clauses, lower_reduction_clauses): Handle && and ||
	with floating-point and complex arguments.

libgomp/ChangeLog:

	* testsuite/libgomp.c-c++-common/reduction-1.c: New test.
	* testsuite/libgomp.c-c++-common/reduction-2.c: New test.
* testsuite/libgomp.c-c++-common/reduction-3.c: New test. gcc/c/c-typeck.c | 10 +- gcc/cp/semantics.c | 8 +- gcc/omp-low.c | 76 +++++++- .../testsuite/libgomp.c-c++-common/reduction-1.c | 192 +++++++++++++++++++++ .../testsuite/libgomp.c-c++-common/reduction-2.c | 192 +++++++++++++++++++++ .../testsuite/libgomp.c-c++-common/reduction-3.c | 192 +++++++++++++++++++++ 6 files changed, 651 insertions(+), 19 deletions(-) diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c index 3b45cfda0ff..fdc7bb6125c 100644 --- a/gcc/c/c-typeck.c +++ b/gcc/c/c-typeck.c @@ -14097,6 +14097,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) case PLUS_EXPR: case MULT_EXPR: case MINUS_EXPR: + case TRUTH_ANDIF_EXPR: + case TRUTH_ORIF_EXPR: break; case MIN_EXPR: if (TREE_CODE (type) == COMPLEX_TYPE) @@ -14115,14 +14117,6 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) case BIT_IOR_EXPR: r_name = "|"; break; - case TRUTH_ANDIF_EXPR: - if (FLOAT_TYPE_P (type)) - r_name = "&&"; - break; - case TRUTH_ORIF_EXPR: - if (FLOAT_TYPE_P (type)) - r_name = "||"; - break; default: gcc_unreachable (); } diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index 6224f49f189..0d590c318fb 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -6032,6 +6032,8 @@ finish_omp_reduction_clause (tree c, bool *need_default_ctor, bool *need_dtor) case PLUS_EXPR: case MULT_EXPR: case MINUS_EXPR: + case TRUTH_ANDIF_EXPR: + case TRUTH_ORIF_EXPR: predefined = true; break; case MIN_EXPR: @@ -6047,12 +6049,6 @@ finish_omp_reduction_clause (tree c, bool *need_default_ctor, bool *need_dtor) break; predefined = true; break; - case TRUTH_ANDIF_EXPR: - case TRUTH_ORIF_EXPR: - if (FLOAT_TYPE_P (type)) - break; - predefined = true; - break; default: break; } diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 7b122059c6e..bcdd34590bf 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -6376,6 +6376,11 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, if (code == 
MINUS_EXPR) code = PLUS_EXPR; + /* C/C++ permits FP/complex with || and &&. */ + bool is_fp_and_or + = ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR) + && (FLOAT_TYPE_P (TREE_TYPE (new_var)) + || TREE_CODE (TREE_TYPE (new_var)) == COMPLEX_TYPE)); tree new_vard = new_var; if (is_simd && omp_is_reference (var)) { @@ -6443,8 +6448,23 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, if (is_simd) { tree ref = build_outer_var_ref (var, ctx); - - x = build2 (code, TREE_TYPE (ref), ref, new_var); + tree new_var2 = new_var; + if (is_fp_and_or) + new_var2 = fold_build2_loc ( + clause_loc, NE_EXPR, + integer_type_node, new_var, + build_zero_cst (TREE_TYPE (new_var))); + tree ref2 = ref; + if (is_fp_and_or + && (FLOAT_TYPE_P (TREE_TYPE (ref)) + || TREE_CODE (TREE_TYPE (ref)) + == COMPLEX_TYPE)) + ref2 = fold_build2_loc ( + clause_loc, NE_EXPR, integer_type_node, + ref, build_zero_cst (TREE_TYPE (ref))); + x = build2 (code, TREE_TYPE (ref2), ref2, new_var2); + if (new_var2 != new_var) + x = fold_convert (TREE_TYPE (new_var), x); ref = build_outer_var_ref (var, ctx); gimplify_assign (ref, x, dlist); } @@ -7384,13 +7404,32 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp, if (code == MINUS_EXPR) code = PLUS_EXPR; + /* C/C++ permits FP/complex with || and &&. 
*/ + bool is_fp_and_or = ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR) + && (FLOAT_TYPE_P (TREE_TYPE (new_var)) + || TREE_CODE (TREE_TYPE (new_var)) + == COMPLEX_TYPE)); if (count == 1) { tree addr = build_fold_addr_expr_loc (clause_loc, ref); addr = save_expr (addr); ref = build1 (INDIRECT_REF, TREE_TYPE (TREE_TYPE (addr)), addr); - x = fold_build2_loc (clause_loc, code, TREE_TYPE (ref), ref, new_var); + tree new_var2 = new_var; + if (is_fp_and_or) + new_var2 = fold_build2_loc (clause_loc, NE_EXPR, + integer_type_node, new_var, + build_zero_cst (TREE_TYPE (new_var))); + tree ref2 = ref; + if (is_fp_and_or + && (FLOAT_TYPE_P (TREE_TYPE (ref)) + || TREE_CODE (TREE_TYPE (ref)) == COMPLEX_TYPE)) + ref2 = fold_build2_loc (clause_loc, NE_EXPR, integer_type_node, ref, + build_zero_cst (TREE_TYPE (ref))); + x = fold_build2_loc (clause_loc, code, TREE_TYPE (new_var2), ref2, + new_var2); + if (new_var2 != new_var) + x = fold_convert (TREE_TYPE (new_var), x); x = build2 (OMP_ATOMIC, void_type_node, addr, x); OMP_ATOMIC_MEMORY_ORDER (x) = OMP_MEMORY_ORDER_RELAXED; gimplify_and_add (x, stmt_seqp); @@ -7495,7 +7534,21 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp, } else { - x = build2 (code, TREE_TYPE (out), out, priv); + tree out2 = out; + if (is_fp_and_or) + out2 = fold_build2_loc (clause_loc, NE_EXPR, + integer_type_node, out, + build_zero_cst (type)); + tree priv2 = priv; + if (is_fp_and_or + && (FLOAT_TYPE_P (TREE_TYPE (priv)) + || TREE_CODE (TREE_TYPE (priv)) == COMPLEX_TYPE)) + priv2 = fold_build2_loc (clause_loc, NE_EXPR, + integer_type_node, priv, + build_zero_cst (TREE_TYPE (priv))); + x = build2 (code, TREE_TYPE (out2), out2, priv2); + if (out2 != out) + x = fold_convert (TREE_TYPE (out), x); out = unshare_expr (out); gimplify_assign (out, x, &sub_seq); } @@ -7529,7 +7582,20 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp, } else { - x = build2 (code, TREE_TYPE (ref), ref, new_var); + tree new_var2 = new_var; + if 
(is_fp_and_or) + new_var2 = fold_build2_loc (clause_loc, NE_EXPR, + integer_type_node, new_var, + build_zero_cst (TREE_TYPE (new_var))); + tree ref2 = ref; + if (is_fp_and_or + && (FLOAT_TYPE_P (TREE_TYPE (ref)) + || TREE_CODE (TREE_TYPE (ref)) == COMPLEX_TYPE)) + ref2 = fold_build2_loc (clause_loc, NE_EXPR, integer_type_node, ref, + build_zero_cst (TREE_TYPE (ref))); + x = build2 (code, TREE_TYPE (ref), ref2, new_var2); + if (new_var2 != new_var) + x = fold_convert (TREE_TYPE (new_var), x); ref = build_outer_var_ref (var, ctx); gimplify_assign (ref, x, &sub_seq); } diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-1.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-1.c new file mode 100644 index 00000000000..89a4153b078 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-1.c @@ -0,0 +1,192 @@ +/* C / C++'s logical AND and OR operators take any scalar argument + which compares (un)equal to 0 - the result 1 or 0 and of type int. + + In this testcase, the int result is again converted to a floating-poing + or complex type. + + While having a floating-point/complex array element with || and && can make + sense, having a non-integer/non-bool reduction variable is odd but valid. + + Test: FP reduction variable + FP array. 
*/ + +#define N 1024 +_Complex float rcf[N]; +_Complex double rcd[N]; +float rf[N]; +double rd[N]; + +int +reduction_or () +{ + float orf = 0; + double ord = 0; + _Complex float orfc = 0; + _Complex double ordc = 0; + + #pragma omp parallel reduction(||: orf) + for (int i=0; i < N; ++i) + orf = orf || rf[i]; + + #pragma omp parallel for reduction(||: ord) + for (int i=0; i < N; ++i) + ord = ord || rcd[i]; + + #pragma omp parallel for simd reduction(||: orfc) + for (int i=0; i < N; ++i) + orfc = orfc || rcf[i]; + + #pragma omp parallel loop reduction(||: ordc) + for (int i=0; i < N; ++i) + ordc = ordc || rcd[i]; + + return orf + ord + __real__ orfc + __real__ ordc; +} + +int +reduction_or_teams () +{ + float orf = 0; + double ord = 0; + _Complex float orfc = 0; + _Complex double ordc = 0; + + #pragma omp teams distribute parallel for reduction(||: orf) + for (int i=0; i < N; ++i) + orf = orf || rf[i]; + + #pragma omp teams distribute parallel for simd reduction(||: ord) + for (int i=0; i < N; ++i) + ord = ord || rcd[i]; + + #pragma omp teams distribute parallel for reduction(||: orfc) + for (int i=0; i < N; ++i) + orfc = orfc || rcf[i]; + + #pragma omp teams distribute parallel for simd reduction(||: ordc) + for (int i=0; i < N; ++i) + ordc = ordc || rcd[i]; + + return orf + ord + __real__ orfc + __real__ ordc; +} + +int +reduction_and () +{ + float andf = 1; + double andd = 1; + _Complex float andfc = 1; + _Complex double anddc = 1; + + #pragma omp parallel reduction(&&: andf) + for (int i=0; i < N; ++i) + andf = andf && rf[i]; + + #pragma omp parallel for reduction(&&: andd) + for (int i=0; i < N; ++i) + andd = andd && rcd[i]; + + #pragma omp parallel for simd reduction(&&: andfc) + for (int i=0; i < N; ++i) + andfc = andfc && rcf[i]; + + #pragma omp parallel loop reduction(&&: anddc) + for (int i=0; i < N; ++i) + anddc = anddc && rcd[i]; + + return andf + andd + __real__ andfc + __real__ anddc; +} + +int +reduction_and_teams () +{ + float andf = 1; + double andd 
= 1; + _Complex float andfc = 1; + _Complex double anddc = 1; + + #pragma omp teams distribute parallel for reduction(&&: andf) + for (int i=0; i < N; ++i) + andf = andf && rf[i]; + + #pragma omp teams distribute parallel for simd reduction(&&: andd) + for (int i=0; i < N; ++i) + andd = andd && rcd[i]; + + #pragma omp teams distribute parallel for reduction(&&: andfc) + for (int i=0; i < N; ++i) + andfc = andfc && rcf[i]; + + #pragma omp teams distribute parallel for simd reduction(&&: anddc) + for (int i=0; i < N; ++i) + anddc = anddc && rcd[i]; + + return andf + andd + __real__ andfc + __real__ anddc; +} + +int +main () +{ + for (int i = 0; i < N; ++i) + { + rf[i] = 0; + rd[i] = 0; + rcf[i] = 0; + rcd[i] = 0; + } + + if (reduction_or () != 0) + __builtin_abort (); + if (reduction_or_teams () != 0) + __builtin_abort (); + if (reduction_and () != 0) + __builtin_abort (); + if (reduction_and_teams () != 0) + __builtin_abort (); + + rf[10] = 1.0; + rd[15] = 1.0; + rcf[10] = 1.0; + rcd[15] = 1.0i; + + if (reduction_or () != 4) + __builtin_abort (); + if (reduction_or_teams () != 4) + __builtin_abort (); + if (reduction_and () != 0) + __builtin_abort (); + if (reduction_and_teams () != 0) + __builtin_abort (); + + for (int i = 0; i < N; ++i) + { + rf[i] = 1; + rd[i] = 1; + rcf[i] = 1; + rcd[i] = 1; + } + + if (reduction_or () != 4) + __builtin_abort (); + if (reduction_or_teams () != 4) + __builtin_abort (); + if (reduction_and () != 4) + __builtin_abort (); + if (reduction_and_teams () != 4) + __builtin_abort (); + + rf[10] = 0.0; + rd[15] = 0.0; + rcf[10] = 0.0; + rcd[15] = 0.0; + + if (reduction_or () != 4) + __builtin_abort (); + if (reduction_or_teams () != 4) + __builtin_abort (); + if (reduction_and () != 0) + __builtin_abort (); + if (reduction_and_teams () != 0) + __builtin_abort (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-2.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-2.c new file mode 100644 index 
00000000000..bdcba863767 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-2.c @@ -0,0 +1,192 @@ +/* C / C++'s logical AND and OR operators take any scalar argument + which compares (un)equal to 0 - the result 1 or 0 and of type int. + + In this testcase, the int result is again converted to a floating-poing + or complex type. + + While having a floating-point/complex array element with || and && can make + sense, having a non-integer/non-bool reduction variable is odd but valid. + + Test: FP reduction variable + integer array. */ + +#define N 1024 +char rcf[N]; +short rcd[N]; +int rf[N]; +long rd[N]; + +int +reduction_or () +{ + float orf = 0; + double ord = 0; + _Complex float orfc = 0; + _Complex double ordc = 0; + + #pragma omp parallel reduction(||: orf) + for (int i=0; i < N; ++i) + orf = orf || rf[i]; + + #pragma omp parallel for reduction(||: ord) + for (int i=0; i < N; ++i) + ord = ord || rcd[i]; + + #pragma omp parallel for simd reduction(||: orfc) + for (int i=0; i < N; ++i) + orfc = orfc || rcf[i]; + + #pragma omp parallel loop reduction(||: ordc) + for (int i=0; i < N; ++i) + ordc = ordc || rcd[i]; + + return orf + ord + __real__ orfc + __real__ ordc; +} + +int +reduction_or_teams () +{ + float orf = 0; + double ord = 0; + _Complex float orfc = 0; + _Complex double ordc = 0; + + #pragma omp teams distribute parallel for reduction(||: orf) + for (int i=0; i < N; ++i) + orf = orf || rf[i]; + + #pragma omp teams distribute parallel for simd reduction(||: ord) + for (int i=0; i < N; ++i) + ord = ord || rcd[i]; + + #pragma omp teams distribute parallel for reduction(||: orfc) + for (int i=0; i < N; ++i) + orfc = orfc || rcf[i]; + + #pragma omp teams distribute parallel for simd reduction(||: ordc) + for (int i=0; i < N; ++i) + ordc = ordc || rcd[i]; + + return orf + ord + __real__ orfc + __real__ ordc; +} + +int +reduction_and () +{ + float andf = 1; + double andd = 1; + _Complex float andfc = 1; + _Complex double anddc = 1; + + #pragma 
omp parallel reduction(&&: andf) + for (int i=0; i < N; ++i) + andf = andf && rf[i]; + + #pragma omp parallel for reduction(&&: andd) + for (int i=0; i < N; ++i) + andd = andd && rcd[i]; + + #pragma omp parallel for simd reduction(&&: andfc) + for (int i=0; i < N; ++i) + andfc = andfc && rcf[i]; + + #pragma omp parallel loop reduction(&&: anddc) + for (int i=0; i < N; ++i) + anddc = anddc && rcd[i]; + + return andf + andd + __real__ andfc + __real__ anddc; +} + +int +reduction_and_teams () +{ + float andf = 1; + double andd = 1; + _Complex float andfc = 1; + _Complex double anddc = 1; + + #pragma omp teams distribute parallel for reduction(&&: andf) + for (int i=0; i < N; ++i) + andf = andf && rf[i]; + + #pragma omp teams distribute parallel for simd reduction(&&: andd) + for (int i=0; i < N; ++i) + andd = andd && rcd[i]; + + #pragma omp teams distribute parallel for reduction(&&: andfc) + for (int i=0; i < N; ++i) + andfc = andfc && rcf[i]; + + #pragma omp teams distribute parallel for simd reduction(&&: anddc) + for (int i=0; i < N; ++i) + anddc = anddc && rcd[i]; + + return andf + andd + __real__ andfc + __real__ anddc; +} + +int +main () +{ + for (int i = 0; i < N; ++i) + { + rf[i] = 0; + rd[i] = 0; + rcf[i] = 0; + rcd[i] = 0; + } + + if (reduction_or () != 0) + __builtin_abort (); + if (reduction_or_teams () != 0) + __builtin_abort (); + if (reduction_and () != 0) + __builtin_abort (); + if (reduction_and_teams () != 0) + __builtin_abort (); + + rf[10] = 1; + rd[15] = 1; + rcf[10] = 1; + rcd[15] = 1; + + if (reduction_or () != 4) + __builtin_abort (); + if (reduction_or_teams () != 4) + __builtin_abort (); + if (reduction_and () != 0) + __builtin_abort (); + if (reduction_and_teams () != 0) + __builtin_abort (); + + for (int i = 0; i < N; ++i) + { + rf[i] = 1; + rd[i] = 1; + rcf[i] = 1; + rcd[i] = 1; + } + + if (reduction_or () != 4) + __builtin_abort (); + if (reduction_or_teams () != 4) + __builtin_abort (); + if (reduction_and () != 4) + __builtin_abort (); 
+ if (reduction_and_teams () != 4) + __builtin_abort (); + + rf[10] = 0; + rd[15] = 0; + rcf[10] = 0; + rcd[15] = 0; + + if (reduction_or () != 4) + __builtin_abort (); + if (reduction_or_teams () != 4) + __builtin_abort (); + if (reduction_and () != 0) + __builtin_abort (); + if (reduction_and_teams () != 0) + __builtin_abort (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-3.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-3.c new file mode 100644 index 00000000000..0f09aab40ec --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-3.c @@ -0,0 +1,192 @@ +/* C / C++'s logical AND and OR operators take any scalar argument + which compares (un)equal to 0 - the result 1 or 0 and of type int. + + In this testcase, the int result is again converted to a floating-poing + or complex type. + + While having a floating-point/complex array element with || and && can make + sense, having a non-integer/non-bool reduction variable is odd but valid. + + Test: integer reduction variable + FP array. 
*/ + +#define N 1024 +_Complex float rcf[N]; +_Complex double rcd[N]; +float rf[N]; +double rd[N]; + +int +reduction_or () +{ + char orf = 0; + short ord = 0; + int orfc = 0; + long ordc = 0; + + #pragma omp parallel reduction(||: orf) + for (int i=0; i < N; ++i) + orf = orf || rf[i]; + + #pragma omp parallel for reduction(||: ord) + for (int i=0; i < N; ++i) + ord = ord || rcd[i]; + + #pragma omp parallel for simd reduction(||: orfc) + for (int i=0; i < N; ++i) + orfc = orfc || rcf[i]; + + #pragma omp parallel loop reduction(||: ordc) + for (int i=0; i < N; ++i) + ordc = ordc || rcd[i]; + + return orf + ord + __real__ orfc + __real__ ordc; +} + +int +reduction_or_teams () +{ + char orf = 0; + short ord = 0; + int orfc = 0; + long ordc = 0; + + #pragma omp teams distribute parallel for reduction(||: orf) + for (int i=0; i < N; ++i) + orf = orf || rf[i]; + + #pragma omp teams distribute parallel for simd reduction(||: ord) + for (int i=0; i < N; ++i) + ord = ord || rcd[i]; + + #pragma omp teams distribute parallel for reduction(||: orfc) + for (int i=0; i < N; ++i) + orfc = orfc || rcf[i]; + + #pragma omp teams distribute parallel for simd reduction(||: ordc) + for (int i=0; i < N; ++i) + ordc = ordc || rcd[i]; + + return orf + ord + __real__ orfc + __real__ ordc; +} + +int +reduction_and () +{ + unsigned char andf = 1; + unsigned short andd = 1; + unsigned int andfc = 1; + unsigned long anddc = 1; + + #pragma omp parallel reduction(&&: andf) + for (int i=0; i < N; ++i) + andf = andf && rf[i]; + + #pragma omp parallel for reduction(&&: andd) + for (int i=0; i < N; ++i) + andd = andd && rcd[i]; + + #pragma omp parallel for simd reduction(&&: andfc) + for (int i=0; i < N; ++i) + andfc = andfc && rcf[i]; + + #pragma omp parallel loop reduction(&&: anddc) + for (int i=0; i < N; ++i) + anddc = anddc && rcd[i]; + + return andf + andd + __real__ andfc + __real__ anddc; +} + +int +reduction_and_teams () +{ + unsigned char andf = 1; + unsigned short andd = 1; + unsigned int 
andfc = 1; + unsigned long anddc = 1; + + #pragma omp teams distribute parallel for reduction(&&: andf) + for (int i=0; i < N; ++i) + andf = andf && rf[i]; + + #pragma omp teams distribute parallel for simd reduction(&&: andd) + for (int i=0; i < N; ++i) + andd = andd && rcd[i]; + + #pragma omp teams distribute parallel for reduction(&&: andfc) + for (int i=0; i < N; ++i) + andfc = andfc && rcf[i]; + + #pragma omp teams distribute parallel for simd reduction(&&: anddc) + for (int i=0; i < N; ++i) + anddc = anddc && rcd[i]; + + return andf + andd + __real__ andfc + __real__ anddc; +} + +int +main () +{ + for (int i = 0; i < N; ++i) + { + rf[i] = 0; + rd[i] = 0; + rcf[i] = 0; + rcd[i] = 0; + } + + if (reduction_or () != 0) + __builtin_abort (); + if (reduction_or_teams () != 0) + __builtin_abort (); + if (reduction_and () != 0) + __builtin_abort (); + if (reduction_and_teams () != 0) + __builtin_abort (); + + rf[10] = 1.0; + rd[15] = 1.0; + rcf[10] = 1.0; + rcd[15] = 1.0i; + + if (reduction_or () != 4) + __builtin_abort (); + if (reduction_or_teams () != 4) + __builtin_abort (); + if (reduction_and () != 0) + __builtin_abort (); + if (reduction_and_teams () != 0) + __builtin_abort (); + + for (int i = 0; i < N; ++i) + { + rf[i] = 1; + rd[i] = 1; + rcf[i] = 1; + rcd[i] = 1; + } + + if (reduction_or () != 4) + __builtin_abort (); + if (reduction_or_teams () != 4) + __builtin_abort (); + if (reduction_and () != 4) + __builtin_abort (); + if (reduction_and_teams () != 4) + __builtin_abort (); + + rf[10] = 0.0; + rd[15] = 0.0; + rcf[10] = 0.0; + rcd[15] = 0.0; + + if (reduction_or () != 4) + __builtin_abort (); + if (reduction_or_teams () != 4) + __builtin_abort (); + if (reduction_and () != 0) + __builtin_abort (); + if (reduction_and_teams () != 0) + __builtin_abort (); + + return 0; +} ^ permalink raw reply [flat|nested] 4+ messages in thread
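For readers skimming the thread, the base-language behaviour the patch relies on can be sketched in plain C. This is a minimal illustration and not part of the patch: the function names `logical_or_fp` and `any_nonzero` are made up here, and the OpenMP pragma only takes effect when OpenMP is enabled on a compiler that accepts a floating-point variable in a `||` reduction (which is what this patch adds to GCC). Without OpenMP the loop simply runs serially and gives the same answer.

```c
#include <assert.h>
#include <complex.h>

/* Base-language rule: each operand of || is compared against zero,
   and the result of the expression is an int, either 0 or 1.  */
int
logical_or_fp (double a, double complex b)
{
  return a || b;
}

/* Hypothetical helper: a float reduction variable combined with ||.
   The int result of || is converted back to float on assignment, so
   'any' only ever holds 0.0f or 1.0f.  */
float
any_nonzero (const float *v, int n)
{
  assert (n >= 0);
  float any = 0;
#ifdef _OPENMP
#pragma omp parallel for reduction(||: any)
#endif
  for (int i = 0; i < n; ++i)
    any = any || v[i];
  return any;
}
```

Note that a complex operand counts as nonzero when either its real or its imaginary part is nonzero, which is why the test cases in the patch can set `rcd[15] = 1.0i` and still expect the OR reduction to report 1.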
* Re: [Patch] OpenMP: Support complex/float in && and || reduction
  2021-04-30 23:12 [Patch] OpenMP: Support complex/float in && and || reduction Tobias Burnus
@ 2021-05-03 17:38 ` Jakub Jelinek
  2021-05-04 10:16   ` Tobias Burnus
From: Jakub Jelinek @ 2021-05-03 17:38 UTC
  To: Tobias Burnus; +Cc: gcc-patches

On Sat, May 01, 2021 at 01:12:15AM +0200, Tobias Burnus wrote:
> gcc/c/ChangeLog:
>
>     * c-typeck.c (c_finish_omp_clauses): Accept float + complex for || and &&
>     reductions.
>
> gcc/cp/ChangeLog:
>
>     * semantics.c (finish_omp_reduction_clause): Accept float + complex for || and &&
>     reductions.
>
> gcc/ChangeLog:
>
>     * omp-low.c (lower_rec_input_clauses, lower_reduction_clauses): Handle && and ||
>     with floating-point and complex arguments.

All the above ChangeLog lines are too long.

> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -6376,6 +6376,11 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
>            if (code == MINUS_EXPR)
>              code = PLUS_EXPR;
>
> +          /* C/C++ permits FP/complex with || and &&.  */
> +          bool is_fp_and_or
> +            = ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR)
> +               && (FLOAT_TYPE_P (TREE_TYPE (new_var))
> +                   || TREE_CODE (TREE_TYPE (new_var)) == COMPLEX_TYPE));

The above line is too long too, please use
|| (TREE_CODE (TREE_TYPE (new_var)) == COMPLEX_TYPE)));

>            tree new_vard = new_var;
>            if (is_simd && omp_is_reference (var))
>              {
> @@ -6443,8 +6448,23 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
>                if (is_simd)
>                  {
>                    tree ref = build_outer_var_ref (var, ctx);
> -
> -                  x = build2 (code, TREE_TYPE (ref), ref, new_var);
> +                  tree new_var2 = new_var;
> +                  if (is_fp_and_or)
> +                    new_var2 = fold_build2_loc (
> +                                 clause_loc, NE_EXPR,
> +                                 integer_type_node, new_var,
> +                                 build_zero_cst (TREE_TYPE (new_var)));

Formatting, would be nice to avoid the ( at the end of line, e.g.
  {
    tree zero = build_zero_cst (TREE_TYPE (new_var));
    new_var2 = fold_build2_loc (clause_loc, NE_EXPR,
                                integer_type_node, new_var, zero);
  }

> +                  tree ref2 = ref;
> +                  if (is_fp_and_or
> +                      && (FLOAT_TYPE_P (TREE_TYPE (ref))
> +                          || TREE_CODE (TREE_TYPE (ref))
> +                             == COMPLEX_TYPE))

Please wrap the == into ()s.
Though ref should have the same type as new_var (or at least a compatible type),
so I don't see the point of the && ... in there and of using two separate
if (is_fp_and_or) blocks.
So
  tree new_var2 = new_var;
  tree ref2 = ref;
  if (is_fp_and_or)
    {
      tree zero = ...;
      new_var2 = ...
      ref2 = ...;
    }

> +                    ref2 = fold_build2_loc (
> +                             clause_loc, NE_EXPR, integer_type_node,
> +                             ref, build_zero_cst (TREE_TYPE (ref)));

And try to avoid the ( here too.
Even better would be to split the function a little bit, but that can be
done another day.

> +                  x = build2 (code, TREE_TYPE (ref2), ref2, new_var2);
> +                  if (new_var2 != new_var)
> +                    x = fold_convert (TREE_TYPE (new_var), x);
>                    ref = build_outer_var_ref (var, ctx);
>                    gimplify_assign (ref, x, dlist);
>                  }
> @@ -7384,13 +7404,32 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp,
>        if (code == MINUS_EXPR)
>          code = PLUS_EXPR;
>
> +      /* C/C++ permits FP/complex with || and &&.  */
> +      bool is_fp_and_or = ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR)
> +                           && (FLOAT_TYPE_P (TREE_TYPE (new_var))
> +                               || TREE_CODE (TREE_TYPE (new_var))
> +                                  == COMPLEX_TYPE));

Again, ()s around ==.

>        if (count == 1)
>          {
>            tree addr = build_fold_addr_expr_loc (clause_loc, ref);
>
>            addr = save_expr (addr);
>            ref = build1 (INDIRECT_REF, TREE_TYPE (TREE_TYPE (addr)), addr);
> -          x = fold_build2_loc (clause_loc, code, TREE_TYPE (ref), ref, new_var);
> +          tree new_var2 = new_var;
> +          if (is_fp_and_or)
> +            new_var2 = fold_build2_loc (clause_loc, NE_EXPR,
> +                                        integer_type_node, new_var,
> +                                        build_zero_cst (TREE_TYPE (new_var)));
> +          tree ref2 = ref;
> +          if (is_fp_and_or
> +              && (FLOAT_TYPE_P (TREE_TYPE (ref))
> +                  || TREE_CODE (TREE_TYPE (ref)) == COMPLEX_TYPE))
> +            ref2 = fold_build2_loc (clause_loc, NE_EXPR, integer_type_node, ref,
> +                                    build_zero_cst (TREE_TYPE (ref)));

And similar question as above.  And the line is too long.

> +          x = fold_build2_loc (clause_loc, code, TREE_TYPE (new_var2), ref2,
> +                               new_var2);
> +          if (new_var2 != new_var)
> +            x = fold_convert (TREE_TYPE (new_var), x);
>            x = build2 (OMP_ATOMIC, void_type_node, addr, x);
>            OMP_ATOMIC_MEMORY_ORDER (x) = OMP_MEMORY_ORDER_RELAXED;
>            gimplify_and_add (x, stmt_seqp);
> @@ -7495,7 +7534,21 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp,
>              }
>            else
>              {
> -              x = build2 (code, TREE_TYPE (out), out, priv);
> +              tree out2 = out;
> +              if (is_fp_and_or)
> +                out2 = fold_build2_loc (clause_loc, NE_EXPR,
> +                                        integer_type_node, out,
> +                                        build_zero_cst (type));
> +              tree priv2 = priv;
> +              if (is_fp_and_or
> +                  && (FLOAT_TYPE_P (TREE_TYPE (priv))
> +                      || TREE_CODE (TREE_TYPE (priv)) == COMPLEX_TYPE))

And here too.

> +                priv2 = fold_build2_loc (clause_loc, NE_EXPR,
> +                                         integer_type_node, priv,
> +                                         build_zero_cst (TREE_TYPE (priv)));
> +              x = build2 (code, TREE_TYPE (out2), out2, priv2);
> +              if (out2 != out)
> +                x = fold_convert (TREE_TYPE (out), x);
>                out = unshare_expr (out);
>                gimplify_assign (out, x, &sub_seq);
>              }
> @@ -7529,7 +7582,20 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp,
>          }
>        else
>          {
> -          x = build2 (code, TREE_TYPE (ref), ref, new_var);
> +          tree new_var2 = new_var;
> +          if (is_fp_and_or)
> +            new_var2 = fold_build2_loc (clause_loc, NE_EXPR,
> +                                        integer_type_node, new_var,
> +                                        build_zero_cst (TREE_TYPE (new_var)));
> +          tree ref2 = ref;
> +          if (is_fp_and_or
> +              && (FLOAT_TYPE_P (TREE_TYPE (ref))
> +                  || TREE_CODE (TREE_TYPE (ref)) == COMPLEX_TYPE))
> +            ref2 = fold_build2_loc (clause_loc, NE_EXPR, integer_type_node, ref,
> +                                    build_zero_cst (TREE_TYPE (ref)));
> +          x = build2 (code, TREE_TYPE (ref), ref2, new_var2);
> +          if (new_var2 != new_var)
> +            x = fold_convert (TREE_TYPE (new_var), x);

Likewise.

For the testcases, would be nice to have one with _Complex int, though
perhaps separately from the ones you've included because while float
or _Complex double are standard, _Complex int is a GNU extension.

Otherwise LGTM.

	Jakub
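The shape requested in the review (a single `if (is_fp_and_or)` block, one shared zero constant, both operands compared against it, and the result converted back to the variable's type) corresponds roughly to the source-level computation below. This is only a C analogue of what the tree-building code expresses, written under the assumption that both operands have the same type; the function names `combine_or_float` and `combine_and_complex` are hypothetical and do not appear in GCC. The comments map each step to the calls quoted in the review.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical analogue of the combine step for reduction(||: x)
   when x is a float.  */
float
combine_or_float (float out, float priv)
{
  const float zero = 0;        /* like build_zero_cst (TREE_TYPE (out)) */
  int out2 = (out != zero);    /* NE_EXPR with integer_type_node */
  int priv2 = (priv != zero);  /* the same zero is reused */
  int x = out2 || priv2;       /* TRUTH_ORIF_EXPR on int operands */
  assert (x == 0 || x == 1);
  return (float) x;            /* like fold_convert to the variable's type */
}

/* The same pattern for reduction(&&: x) with a complex variable.  */
double complex
combine_and_complex (double complex out, double complex priv)
{
  const double complex zero = 0;
  int out2 = (out != zero);
  int priv2 = (priv != zero);
  return (double complex) (out2 && priv2);
}
```

Sharing one zero constant is sound only because both operands are assumed to have the same (or a compatible) type, which is the point made in the review about `ref` and `new_var`.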
* Re: [Patch] OpenMP: Support complex/float in && and || reduction
  2021-05-03 17:38 ` Jakub Jelinek
@ 2021-05-04 10:16   ` Tobias Burnus
  2021-05-04 10:26     ` Jakub Jelinek
From: Tobias Burnus @ 2021-05-04 10:16 UTC
  To: Jakub Jelinek; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1599 bytes --]

On 03.05.21 19:38, Jakub Jelinek wrote:
> All the above ChangeLog lines are too long.

Fixed.

> The above line is too long too, ...

I counted 80 characters, but a line break is now required anyway
because I have put '(...)' around '... == ...'.

>> +                  if (is_fp_and_or)
>> +                    new_var2 = fold_build2_loc (
>> +                                 clause_loc, NE_EXPR,
>> +                                 integer_type_node, new_var,
>> +                                 build_zero_cst (TREE_TYPE (new_var)));
> Formatting, would be nice to avoid the ( at the end of line, e.g.
>   {
>     tree zero = build_zero_cst (TREE_TYPE (new_var));

I added now the zero plus

> Please wrap the == into ()s.

Done.

> Though ref should have the same type as new_var (or at least a compatible type),
> so I don't see the point of the && ... in there and of using two separate
> if (is_fp_and_or) blocks.

I have now merged the two parts into a single block - with a single
zero as proposed.

> For the testcases, would be nice to have one with _Complex int, though
> perhaps separately from the ones you've included because while float
> or _Complex double are standard, _Complex int is a GNU extension.

Done (reduction-4.c).

Unless there are further comments, I intend to commit it after the
lunch break.
Tobias

[-- Attachment #2: red-and-or-diff-v3.diff --]
[-- Type: text/x-patch, Size: 27907 bytes --]

OpenMP: Support complex/float in && and || reduction

C/C++ permit logical AND and logical OR also with floating-point or
complex arguments by doing an unequal zero comparison; the result is an
'int' with value one or zero.  Hence, those are also permitted as
reduction variable, even though it is not the most sensible thing to do.

gcc/c/ChangeLog:

	* c-typeck.c (c_finish_omp_clauses): Accept float + complex
	for || and && reductions.

gcc/cp/ChangeLog:

	* semantics.c (finish_omp_reduction_clause): Accept float + complex
	for || and && reductions.

gcc/ChangeLog:

	* omp-low.c (lower_rec_input_clauses, lower_reduction_clauses): Handle
	&& and || with floating-point and complex arguments.

gcc/testsuite/ChangeLog:

	* gcc.dg/gomp/clause-1.c: Use 'reduction(&:' instead of '...(&&:'.

libgomp/ChangeLog:

	* testsuite/libgomp.c-c++-common/reduction-1.c: New test.
	* testsuite/libgomp.c-c++-common/reduction-2.c: New test.
	* testsuite/libgomp.c-c++-common/reduction-3.c: New test.
 gcc/c/c-typeck.c                                   |  10 +-
 gcc/cp/semantics.c                                 |   8 +-
 gcc/omp-low.c                                      |  87 ++++++++-
 gcc/testsuite/gcc.dg/gomp/clause-1.c               |   2 +-
 .../testsuite/libgomp.c-c++-common/reduction-1.c   | 192 ++++++++++++++++++++
 .../testsuite/libgomp.c-c++-common/reduction-2.c   | 192 ++++++++++++++++++++
 .../testsuite/libgomp.c-c++-common/reduction-3.c   | 192 ++++++++++++++++++++
 .../testsuite/libgomp.c-c++-common/reduction-4.c   | 194 +++++++++++++++++++++
 8 files changed, 856 insertions(+), 21 deletions(-)

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 3b45cfda0ff..fdc7bb6125c 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -14097,6 +14097,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
                 case PLUS_EXPR:
                 case MULT_EXPR:
                 case MINUS_EXPR:
+                case TRUTH_ANDIF_EXPR:
+                case TRUTH_ORIF_EXPR:
                   break;
                 case MIN_EXPR:
                   if (TREE_CODE (type) == COMPLEX_TYPE)
@@ -14115,14 +14117,6 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
                 case BIT_IOR_EXPR:
                   r_name = "|";
                   break;
-                case TRUTH_ANDIF_EXPR:
-                  if (FLOAT_TYPE_P (type))
-                    r_name = "&&";
-                  break;
-                case TRUTH_ORIF_EXPR:
-                  if (FLOAT_TYPE_P (type))
-                    r_name = "||";
-                  break;
                 default:
                   gcc_unreachable ();
                 }
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 6224f49f189..0d590c318fb 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -6032,6 +6032,8 @@ finish_omp_reduction_clause (tree c, bool *need_default_ctor, bool *need_dtor)
         case PLUS_EXPR:
         case MULT_EXPR:
         case MINUS_EXPR:
+        case TRUTH_ANDIF_EXPR:
+        case TRUTH_ORIF_EXPR:
           predefined = true;
           break;
         case MIN_EXPR:
@@ -6047,12 +6049,6 @@ finish_omp_reduction_clause (tree c, bool *need_default_ctor, bool *need_dtor)
             break;
           predefined = true;
           break;
-        case TRUTH_ANDIF_EXPR:
-        case TRUTH_ORIF_EXPR:
-          if (FLOAT_TYPE_P (type))
-            break;
-          predefined = true;
-          break;
         default:
           break;
         }
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 1f14c4b1d69..1e31d8e72c4 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -6389,6 +6389,11 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
                   if (code == MINUS_EXPR)
                     code = PLUS_EXPR;
 
+                  /* C/C++ permits FP/complex with || and &&.  */
+                  bool is_fp_and_or
+                    = (((code == TRUTH_ANDIF_EXPR) || (code == TRUTH_ORIF_EXPR))
+                       && (FLOAT_TYPE_P (TREE_TYPE (new_var))
+                           || (TREE_CODE (TREE_TYPE (new_var)) == COMPLEX_TYPE)));
                   tree new_vard = new_var;
                   if (is_simd && omp_is_reference (var))
                     {
@@ -6437,7 +6442,20 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
                               x = build2 (code, TREE_TYPE (ivar), ivar, x);
                               gimplify_assign (ivar, x, &llist[2]);
                             }
-                          x = build2 (code, TREE_TYPE (ref), ref, ivar);
+                          tree ivar2 = ivar;
+                          tree ref2 = ref;
+                          if (is_fp_and_or)
+                            {
+                              tree zero = build_zero_cst (TREE_TYPE (ivar));
+                              ivar2 = fold_build2_loc (clause_loc, NE_EXPR,
+                                                       integer_type_node, ivar,
+                                                       zero);
+                              ref2 = fold_build2_loc (clause_loc, NE_EXPR,
+                                                      integer_type_node, ref, zero);
+                            }
+                          x = build2 (code, TREE_TYPE (ref), ref2, ivar2);
+                          if (is_fp_and_or)
+                            x = fold_convert (TREE_TYPE (ref), x);
                           ref = build_outer_var_ref (var, ctx);
                           gimplify_assign (ref, x, &llist[1]);
 
@@ -6456,8 +6474,22 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
                       if (is_simd)
                         {
                           tree ref = build_outer_var_ref (var, ctx);
-
-                          x = build2 (code, TREE_TYPE (ref), ref, new_var);
+                          tree new_var2 = new_var;
+                          tree ref2 = ref;
+                          if (is_fp_and_or)
+                            {
+                              tree zero = build_zero_cst (TREE_TYPE (new_var));
+                              new_var2
+                                = fold_build2_loc (clause_loc, NE_EXPR,
+                                                   integer_type_node, new_var,
+                                                   zero);
+                              ref2 = fold_build2_loc (clause_loc, NE_EXPR,
+                                                      integer_type_node, ref,
+                                                      zero);
+                            }
+                          x = build2 (code, TREE_TYPE (ref2), ref2, new_var2);
+                          if (is_fp_and_or)
+                            x = fold_convert (TREE_TYPE (new_var), x);
                           ref = build_outer_var_ref (var, ctx);
                           gimplify_assign (ref, x, dlist);
                         }
@@ -7397,13 +7429,32 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp,
       if (code == MINUS_EXPR)
         code = PLUS_EXPR;
 
+      /* C/C++ permits FP/complex with || and &&.  */
+      bool is_fp_and_or = (((code == TRUTH_ANDIF_EXPR)
+                            || (code == TRUTH_ORIF_EXPR))
+                           && (FLOAT_TYPE_P (TREE_TYPE (new_var))
+                               || (TREE_CODE (TREE_TYPE (new_var))
+                                   == COMPLEX_TYPE)));
       if (count == 1)
         {
           tree addr = build_fold_addr_expr_loc (clause_loc, ref);
 
           addr = save_expr (addr);
           ref = build1 (INDIRECT_REF, TREE_TYPE (TREE_TYPE (addr)), addr);
-          x = fold_build2_loc (clause_loc, code, TREE_TYPE (ref), ref, new_var);
+          tree new_var2 = new_var;
+          tree ref2 = ref;
+          if (is_fp_and_or)
+            {
+              tree zero = build_zero_cst (TREE_TYPE (new_var));
+              new_var2 = fold_build2_loc (clause_loc, NE_EXPR,
+                                          integer_type_node, new_var, zero);
+              ref2 = fold_build2_loc (clause_loc, NE_EXPR, integer_type_node,
+                                      ref, zero);
+            }
+          x = fold_build2_loc (clause_loc, code, TREE_TYPE (new_var2), ref2,
+                               new_var2);
+          if (is_fp_and_or)
+            x = fold_convert (TREE_TYPE (new_var), x);
           x = build2 (OMP_ATOMIC, void_type_node, addr, x);
           OMP_ATOMIC_MEMORY_ORDER (x) = OMP_MEMORY_ORDER_RELAXED;
           gimplify_and_add (x, stmt_seqp);
@@ -7508,7 +7559,19 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp,
             }
           else
             {
-              x = build2 (code, TREE_TYPE (out), out, priv);
+              tree out2 = out;
+              tree priv2 = priv;
+              if (is_fp_and_or)
+                {
+                  tree zero = build_zero_cst (TREE_TYPE (out));
+                  out2 = fold_build2_loc (clause_loc, NE_EXPR,
+                                          integer_type_node, out, zero);
+                  priv2 = fold_build2_loc (clause_loc, NE_EXPR,
+                                           integer_type_node, priv, zero);
+                }
+              x = build2 (code, TREE_TYPE (out2), out2, priv2);
+              if (is_fp_and_or)
+                x = fold_convert (TREE_TYPE (out), x);
               out = unshare_expr (out);
               gimplify_assign (out, x, &sub_seq);
             }
@@ -7542,7 +7605,19 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp,
         }
       else
         {
-          x = build2 (code, TREE_TYPE (ref), ref, new_var);
+          tree new_var2 = new_var;
+          tree ref2 = ref;
+          if (is_fp_and_or)
+            {
+              tree zero = build_zero_cst (TREE_TYPE (new_var));
+              new_var2 = fold_build2_loc (clause_loc, NE_EXPR,
+                                          integer_type_node, new_var, zero);
+              ref2 = fold_build2_loc (clause_loc, NE_EXPR, integer_type_node,
+                                      ref, zero);
+            }
+          x = build2 (code, TREE_TYPE (ref), ref2, new_var2);
+          if (is_fp_and_or)
+            x = fold_convert (TREE_TYPE (new_var), x);
           ref = build_outer_var_ref (var, ctx);
           gimplify_assign (ref, x, &sub_seq);
         }
diff --git a/gcc/testsuite/gcc.dg/gomp/clause-1.c b/gcc/testsuite/gcc.dg/gomp/clause-1.c
index 9d34b041606..8e7cc950d22 100644
--- a/gcc/testsuite/gcc.dg/gomp/clause-1.c
+++ b/gcc/testsuite/gcc.dg/gomp/clause-1.c
@@ -56,7 +56,7 @@ foo (int x)
     ;
 #pragma omp p reduction (|:d) /* { dg-error "has invalid type for" } */
     ;
-#pragma omp p reduction (&&:d) /* { dg-error "has invalid type for" } */
+#pragma omp p reduction (&:d) /* { dg-error "has invalid type for" } */
     ;
 #pragma omp p copyin (d) /* { dg-error "must be 'threadprivate'" } */
     ;
diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-1.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-1.c
new file mode 100644
index 00000000000..89a4153b078
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-1.c
@@ -0,0 +1,192 @@
+/* C / C++'s logical AND and OR operators take any scalar argument
+   which compares (un)equal to 0 - the result 1 or 0 and of type int.
+
+   In this testcase, the int result is again converted to a floating-poing
+   or complex type.
+
+   While having a floating-point/complex array element with || and && can make
+   sense, having a non-integer/non-bool reduction variable is odd but valid.
+
+   Test: FP reduction variable + FP array.  */
+
+#define N 1024
+_Complex float rcf[N];
+_Complex double rcd[N];
+float rf[N];
+double rd[N];
+
+int
+reduction_or ()
+{
+  float orf = 0;
+  double ord = 0;
+  _Complex float orfc = 0;
+  _Complex double ordc = 0;
+
+  #pragma omp parallel reduction(||: orf)
+  for (int i=0; i < N; ++i)
+    orf = orf || rf[i];
+
+  #pragma omp parallel for reduction(||: ord)
+  for (int i=0; i < N; ++i)
+    ord = ord || rcd[i];
+
+  #pragma omp parallel for simd reduction(||: orfc)
+  for (int i=0; i < N; ++i)
+    orfc = orfc || rcf[i];
+
+  #pragma omp parallel loop reduction(||: ordc)
+  for (int i=0; i < N; ++i)
+    ordc = ordc || rcd[i];
+
+  return orf + ord + __real__ orfc + __real__ ordc;
+}
+
+int
+reduction_or_teams ()
+{
+  float orf = 0;
+  double ord = 0;
+  _Complex float orfc = 0;
+  _Complex double ordc = 0;
+
+  #pragma omp teams distribute parallel for reduction(||: orf)
+  for (int i=0; i < N; ++i)
+    orf = orf || rf[i];
+
+  #pragma omp teams distribute parallel for simd reduction(||: ord)
+  for (int i=0; i < N; ++i)
+    ord = ord || rcd[i];
+
+  #pragma omp teams distribute parallel for reduction(||: orfc)
+  for (int i=0; i < N; ++i)
+    orfc = orfc || rcf[i];
+
+  #pragma omp teams distribute parallel for simd reduction(||: ordc)
+  for (int i=0; i < N; ++i)
+    ordc = ordc || rcd[i];
+
+  return orf + ord + __real__ orfc + __real__ ordc;
+}
+
+int
+reduction_and ()
+{
+  float andf = 1;
+  double andd = 1;
+  _Complex float andfc = 1;
+  _Complex double anddc = 1;
+
+  #pragma omp parallel reduction(&&: andf)
+  for (int i=0; i < N; ++i)
+    andf = andf && rf[i];
+
+  #pragma omp parallel for reduction(&&: andd)
+  for (int i=0; i < N; ++i)
+    andd = andd && rcd[i];
+
+  #pragma omp parallel for simd reduction(&&: andfc)
+  for (int i=0; i < N; ++i)
+    andfc = andfc && rcf[i];
+
+  #pragma omp parallel loop reduction(&&: anddc)
+  for (int i=0; i < N; ++i)
+    anddc = anddc && rcd[i];
+
+  return andf + andd + __real__ andfc + __real__ anddc;
+}
+
+int
+reduction_and_teams ()
+{
+  float andf = 1;
+  double andd = 1;
+  _Complex float andfc = 1;
+  _Complex double anddc = 1;
+
+  #pragma omp teams distribute parallel for reduction(&&: andf)
+  for (int i=0; i < N; ++i)
+    andf = andf && rf[i];
+
+  #pragma omp teams distribute parallel for simd reduction(&&: andd)
+  for (int i=0; i < N; ++i)
+    andd = andd && rcd[i];
+
+  #pragma omp teams distribute parallel for reduction(&&: andfc)
+  for (int i=0; i < N; ++i)
+    andfc = andfc && rcf[i];
+
+  #pragma omp teams distribute parallel for simd reduction(&&: anddc)
+  for (int i=0; i < N; ++i)
+    anddc = anddc && rcd[i];
+
+  return andf + andd + __real__ andfc + __real__ anddc;
+}
+
+int
+main ()
+{
+  for (int i = 0; i < N; ++i)
+    {
+      rf[i] = 0;
+      rd[i] = 0;
+      rcf[i] = 0;
+      rcd[i] = 0;
+    }
+
+  if (reduction_or () != 0)
+    __builtin_abort ();
+  if (reduction_or_teams () != 0)
+    __builtin_abort ();
+  if (reduction_and () != 0)
+    __builtin_abort ();
+  if (reduction_and_teams () != 0)
+    __builtin_abort ();
+
+  rf[10] = 1.0;
+  rd[15] = 1.0;
+  rcf[10] = 1.0;
+  rcd[15] = 1.0i;
+
+  if (reduction_or () != 4)
+    __builtin_abort ();
+  if (reduction_or_teams () != 4)
+    __builtin_abort ();
+  if (reduction_and () != 0)
+    __builtin_abort ();
+  if (reduction_and_teams () != 0)
+    __builtin_abort ();
+
+  for (int i = 0; i < N; ++i)
+    {
+      rf[i] = 1;
+      rd[i] = 1;
+      rcf[i] = 1;
+      rcd[i] = 1;
+    }
+
+  if (reduction_or () != 4)
+    __builtin_abort ();
+  if (reduction_or_teams () != 4)
+    __builtin_abort ();
+  if (reduction_and () != 4)
+    __builtin_abort ();
+  if (reduction_and_teams () != 4)
+    __builtin_abort ();
+
+  rf[10] = 0.0;
+  rd[15] = 0.0;
+  rcf[10] = 0.0;
+  rcd[15] = 0.0;
+
+  if (reduction_or () != 4)
+    __builtin_abort ();
+  if (reduction_or_teams () != 4)
+    __builtin_abort ();
+  if (reduction_and () != 0)
+    __builtin_abort ();
+  if (reduction_and_teams () != 0)
+    __builtin_abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-2.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-2.c
new file mode 100644
index 00000000000..bdcba863767
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-2.c
@@ -0,0 +1,192 @@
+/* C / C++'s logical AND and OR operators take any scalar argument
+   which compares (un)equal to 0 - the result 1 or 0 and of type int.
+
+   In this testcase, the int result is again converted to a floating-poing
+   or complex type.
+
+   While having a floating-point/complex array element with || and && can make
+   sense, having a non-integer/non-bool reduction variable is odd but valid.
+
+   Test: FP reduction variable + integer array.  */
+
+#define N 1024
+char rcf[N];
+short rcd[N];
+int rf[N];
+long rd[N];
+
+int
+reduction_or ()
+{
+  float orf = 0;
+  double ord = 0;
+  _Complex float orfc = 0;
+  _Complex double ordc = 0;
+
+  #pragma omp parallel reduction(||: orf)
+  for (int i=0; i < N; ++i)
+    orf = orf || rf[i];
+
+  #pragma omp parallel for reduction(||: ord)
+  for (int i=0; i < N; ++i)
+    ord = ord || rcd[i];
+
+  #pragma omp parallel for simd reduction(||: orfc)
+  for (int i=0; i < N; ++i)
+    orfc = orfc || rcf[i];
+
+  #pragma omp parallel loop reduction(||: ordc)
+  for (int i=0; i < N; ++i)
+    ordc = ordc || rcd[i];
+
+  return orf + ord + __real__ orfc + __real__ ordc;
+}
+
+int
+reduction_or_teams ()
+{
+  float orf = 0;
+  double ord = 0;
+  _Complex float orfc = 0;
+  _Complex double ordc = 0;
+
+  #pragma omp teams distribute parallel for reduction(||: orf)
+  for (int i=0; i < N; ++i)
+    orf = orf || rf[i];
+
+  #pragma omp teams distribute parallel for simd reduction(||: ord)
+  for (int i=0; i < N; ++i)
+    ord = ord || rcd[i];
+
+  #pragma omp teams distribute parallel for reduction(||: orfc)
+  for (int i=0; i < N; ++i)
+    orfc = orfc || rcf[i];
+
+  #pragma omp teams distribute parallel for simd reduction(||: ordc)
+  for (int i=0; i < N; ++i)
+    ordc = ordc || rcd[i];
+
+  return orf + ord + __real__ orfc + __real__ ordc;
+}
+
+int
+reduction_and ()
+{
+  float andf = 1;
+  double andd = 1;
+  _Complex float andfc = 1;
+  _Complex double anddc = 1;
+
+  #pragma omp parallel reduction(&&: andf)
+  for (int i=0; i < N; ++i)
+    andf = andf && rf[i];
+
+  #pragma omp parallel for reduction(&&: andd)
+  for (int i=0; i < N; ++i)
+    andd = andd && rcd[i];
+
+  #pragma omp parallel for simd reduction(&&: andfc)
+  for (int i=0; i < N; ++i)
+    andfc = andfc && rcf[i];
+
+  #pragma omp parallel loop reduction(&&: anddc)
+  for (int i=0; i < N; ++i)
+    anddc = anddc && rcd[i];
+
+  return andf + andd + __real__ andfc + __real__ anddc;
+}
+
+int
+reduction_and_teams ()
+{
+  float andf = 1;
+  double andd = 1;
+  _Complex float andfc = 1;
+  _Complex double anddc = 1;
+
+  #pragma omp teams distribute parallel for reduction(&&: andf)
+  for (int i=0; i < N; ++i)
+    andf = andf && rf[i];
+
+  #pragma omp teams distribute parallel for simd reduction(&&: andd)
+  for (int i=0; i < N; ++i)
+    andd = andd && rcd[i];
+
+  #pragma omp teams distribute parallel for reduction(&&: andfc)
+  for (int i=0; i < N; ++i)
+    andfc = andfc && rcf[i];
+
+  #pragma omp teams distribute parallel for simd reduction(&&: anddc)
+  for (int i=0; i < N; ++i)
+    anddc = anddc && rcd[i];
+
+  return andf + andd + __real__ andfc + __real__ anddc;
+}
+
+int
+main ()
+{
+  for (int i = 0; i < N; ++i)
+    {
+      rf[i] = 0;
+      rd[i] = 0;
+      rcf[i] = 0;
+      rcd[i] = 0;
+    }
+
+  if (reduction_or () != 0)
+    __builtin_abort ();
+  if (reduction_or_teams () != 0)
+    __builtin_abort ();
+  if (reduction_and () != 0)
+    __builtin_abort ();
+  if (reduction_and_teams () != 0)
+    __builtin_abort ();
+
+  rf[10] = 1;
+  rd[15] = 1;
+  rcf[10] = 1;
+  rcd[15] = 1;
+
+  if (reduction_or () != 4)
+    __builtin_abort ();
+  if (reduction_or_teams () != 4)
+    __builtin_abort ();
+  if (reduction_and () != 0)
+    __builtin_abort ();
+  if (reduction_and_teams () != 0)
+    __builtin_abort ();
+
+  for (int i = 0; i < N; ++i)
+    {
+      rf[i] = 1;
+      rd[i] = 1;
+      rcf[i] = 1;
+      rcd[i] = 1;
+    }
+
+  if (reduction_or () != 4)
+    __builtin_abort ();
+  if (reduction_or_teams () != 4)
+    __builtin_abort ();
+  if (reduction_and () != 4)
+    __builtin_abort ();
+  if (reduction_and_teams () != 4)
+    __builtin_abort ();
+
+  rf[10] = 0;
+  rd[15] = 0;
+  rcf[10] = 0;
+  rcd[15] = 0;
+
+  if (reduction_or () != 4)
+    __builtin_abort ();
+  if (reduction_or_teams () != 4)
+    __builtin_abort ();
+  if (reduction_and () != 0)
+    __builtin_abort ();
+  if (reduction_and_teams () != 0)
+    __builtin_abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-3.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-3.c
new file mode 100644
index 00000000000..0f09aab40ec
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-3.c
@@ -0,0 +1,192 @@
+/* C / C++'s logical AND and OR operators take any scalar argument
+   which compares (un)equal to 0 - the result 1 or 0 and of type int.
+
+   In this testcase, the int result is again converted to a floating-poing
+   or complex type.
+
+   While having a floating-point/complex array element with || and && can make
+   sense, having a non-integer/non-bool reduction variable is odd but valid.
+
+   Test: integer reduction variable + FP array.  */
+
+#define N 1024
+_Complex float rcf[N];
+_Complex double rcd[N];
+float rf[N];
+double rd[N];
+
+int
+reduction_or ()
+{
+  char orf = 0;
+  short ord = 0;
+  int orfc = 0;
+  long ordc = 0;
+
+  #pragma omp parallel reduction(||: orf)
+  for (int i=0; i < N; ++i)
+    orf = orf || rf[i];
+
+  #pragma omp parallel for reduction(||: ord)
+  for (int i=0; i < N; ++i)
+    ord = ord || rcd[i];
+
+  #pragma omp parallel for simd reduction(||: orfc)
+  for (int i=0; i < N; ++i)
+    orfc = orfc || rcf[i];
+
+  #pragma omp parallel loop reduction(||: ordc)
+  for (int i=0; i < N; ++i)
+    ordc = ordc || rcd[i];
+
+  return orf + ord + __real__ orfc + __real__ ordc;
+}
+
+int
+reduction_or_teams ()
+{
+  char orf = 0;
+  short ord = 0;
+  int orfc = 0;
+  long ordc = 0;
+
+  #pragma omp teams distribute parallel for reduction(||: orf)
+  for (int i=0; i < N; ++i)
+    orf = orf || rf[i];
+
+  #pragma omp teams distribute parallel for simd reduction(||: ord)
+  for (int i=0; i < N; ++i)
+    ord = ord || rcd[i];
+
+  #pragma omp teams distribute parallel for reduction(||: orfc)
+  for (int i=0; i < N; ++i)
+    orfc = orfc || rcf[i];
+
+  #pragma omp teams distribute parallel for simd reduction(||: ordc)
+  for (int i=0; i < N; ++i)
+    ordc = ordc || rcd[i];
+
+  return orf + ord + __real__ orfc + __real__ ordc;
+}
+
+int
+reduction_and ()
+{
+  unsigned char andf = 1;
+  unsigned short andd = 1;
+  unsigned int andfc = 1;
+  unsigned long anddc = 1;
+
+  #pragma omp parallel reduction(&&: andf)
+  for (int i=0; i < N; ++i)
+    andf = andf && rf[i];
+
+  #pragma omp parallel for reduction(&&: andd)
+  for (int i=0; i < N; ++i)
+    andd = andd && rcd[i];
+
+  #pragma omp parallel for simd reduction(&&: andfc)
+  for (int i=0; i < N; ++i)
+    andfc = andfc && rcf[i];
+
+  #pragma omp parallel loop reduction(&&: anddc)
+  for (int i=0; i < N; ++i)
+    anddc = anddc && rcd[i];
+
+  return andf + andd + __real__ andfc + __real__ anddc;
+}
+
+int
+reduction_and_teams ()
+{
+  unsigned char andf = 1;
+  unsigned short andd = 1;
+  unsigned int andfc = 1;
+  unsigned long anddc = 1;
+
+  #pragma omp teams distribute parallel for reduction(&&: andf)
+  for (int i=0; i < N; ++i)
+    andf = andf && rf[i];
+
+  #pragma omp teams distribute parallel for simd reduction(&&: andd)
+  for (int i=0; i < N; ++i)
+    andd = andd && rcd[i];
+
+  #pragma omp teams distribute parallel for reduction(&&: andfc)
+  for (int i=0; i < N; ++i)
+    andfc = andfc && rcf[i];
+
+  #pragma omp teams distribute parallel for simd reduction(&&: anddc)
+  for (int i=0; i < N; ++i)
+    anddc = anddc && rcd[i];
+
+  return andf + andd + __real__ andfc + __real__ anddc;
+}
+
+int
+main ()
+{
+  for (int i = 0; i < N; ++i)
+    {
+      rf[i] = 0;
+      rd[i] = 0;
+      rcf[i] = 0;
+      rcd[i] = 0;
+    }
+
+  if (reduction_or () != 0)
+    __builtin_abort ();
+  if (reduction_or_teams () != 0)
+    __builtin_abort ();
+  if (reduction_and () != 0)
+    __builtin_abort ();
+  if (reduction_and_teams () != 0)
+    __builtin_abort ();
+
+  rf[10] = 1.0;
+  rd[15] = 1.0;
+  rcf[10] = 1.0;
+  rcd[15] = 1.0i;
+
+  if (reduction_or () != 4)
+    __builtin_abort ();
+  if (reduction_or_teams () != 4)
+    __builtin_abort ();
+  if (reduction_and () != 0)
+    __builtin_abort ();
+  if (reduction_and_teams () != 0)
+    __builtin_abort ();
+
+  for (int i = 0; i < N; ++i)
+    {
+      rf[i] = 1;
+      rd[i] = 1;
+      rcf[i] = 1;
+      rcd[i] = 1;
+    }
+
+  if (reduction_or () != 4)
+    __builtin_abort ();
+  if (reduction_or_teams () != 4)
+    __builtin_abort ();
+  if (reduction_and () != 4)
+    __builtin_abort ();
+  if (reduction_and_teams () != 4)
+    __builtin_abort ();
+
+  rf[10] = 0.0;
+  rd[15] = 0.0;
+  rcf[10] = 0.0;
+  rcd[15] = 0.0;
+
+  if (reduction_or () != 4)
+    __builtin_abort ();
+  if (reduction_or_teams () != 4)
+    __builtin_abort ();
+  if (reduction_and () != 0)
+    __builtin_abort ();
+  if (reduction_and_teams () != 0)
+    __builtin_abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-4.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-4.c
new file mode 100644
index 00000000000..a465e10ff4f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-4.c
@@ -0,0 +1,194 @@
+/* C / C++'s logical AND and OR operators take any scalar argument
+   which compares (un)equal to 0 - the result 1 or 0 and of type int.
+
+   In this testcase, the int result is again converted to an integer complex
+   type.
+
+   While having a floating-point/complex array element with || and && can make
+   sense, having a complex reduction variable is odd but valid.
+
+   Test: int complex reduction variable + int complex array.  */
+
+#define N 1024
+_Complex char rcc[N];
+_Complex short rcs[N];
+_Complex int rci[N];
+_Complex long long rcl[N];
+
+int
+reduction_or ()
+{
+  _Complex char orc = 0;
+  _Complex short ors = 0;
+  _Complex int ori = 0;
+  _Complex long orl = 0;
+
+  #pragma omp parallel reduction(||: orc)
+  for (int i=0; i < N; ++i)
+    orc = orc || rcl[i];
+
+  #pragma omp parallel for reduction(||: ors)
+  for (int i=0; i < N; ++i)
+    ors = ors || rci[i];
+
+  #pragma omp parallel for simd reduction(||: ori)
+  for (int i=0; i < N; ++i)
+    ori = ori || rcs[i];
+
+  #pragma omp parallel loop reduction(||: orl)
+  for (int i=0; i < N; ++i)
+    orl = orl || rcc[i];
+
+  return __real__ (orc + ors + ori + orl) + __imag__ (orc + ors + ori + orl);
+}
+
+int
+reduction_or_teams ()
+{
+  _Complex char orc = 0;
+  _Complex short ors = 0;
+  _Complex int ori = 0;
+  _Complex long orl = 0;
+
+  #pragma omp teams distribute parallel for reduction(||: orc)
+  for (int i=0; i < N; ++i)
+    orc = orc || rcc[i];
+
+  #pragma omp teams distribute parallel for simd reduction(||: ors)
+  for (int i=0; i < N; ++i)
+    ors = ors || rcs[i];
+
+  #pragma omp teams distribute parallel for reduction(||: ori)
+  for (int i=0; i < N; ++i)
+    ori = ori || rci[i];
+
+  #pragma omp teams distribute parallel for simd reduction(||: orl)
+  for (int i=0; i < N; ++i)
+    orl = orl || rcl[i];
+
+  return __real__ (orc + ors + ori + orl) + __imag__ (orc + ors + ori + orl);
+}
+
+int
+reduction_and ()
+{
+  _Complex char andc = 1;
+  _Complex short ands = 1;
+  _Complex int andi = 1;
+  _Complex long andl = 1;
+
+  #pragma omp parallel reduction(&&: andc)
+  for (int i=0; i < N; ++i)
+    andc = andc && rcc[i];
+
+  #pragma omp parallel for reduction(&&: ands)
+  for (int i=0; i < N; ++i)
+    ands = ands && rcs[i];
+
+  #pragma omp parallel for simd reduction(&&: andi)
+  for (int i=0; i < N; ++i)
+    andi = andi && rci[i];
+
+  #pragma omp parallel loop reduction(&&: andl)
+  for (int i=0; i < N; ++i)
+    andl = andl && rcl[i];
+
+  return __real__ (andc + ands + andi + andl)
+         + __imag__ (andc + ands + andi + andl);
+}
+
+int
+reduction_and_teams ()
+{
+  _Complex char andc = 1;
+  _Complex short ands = 1;
+  _Complex int andi = 1;
+  _Complex long andl = 1;
+
+  #pragma omp teams distribute parallel for reduction(&&: andc)
+  for (int i=0; i < N; ++i)
+    andc = andc && rcl[i];
+
+  #pragma omp teams distribute parallel for simd reduction(&&: ands)
+  for (int i=0; i < N; ++i)
+    ands = ands && rci[i];
+
+  #pragma omp teams distribute parallel for reduction(&&: andi)
+  for (int i=0; i < N; ++i)
+    andi = andi && rcs[i];
+
+  #pragma omp teams distribute parallel for simd reduction(&&: andl)
+  for (int i=0; i < N; ++i)
+    andl = andl && rcc[i];
+
+  return __real__ (andc + ands + andi + andl)
+         + __imag__ (andc + ands + andi + andl);
+}
+
+int
+main ()
+{
+  for (int i = 0; i < N; ++i)
+    {
+      rcc[i] = 0;
+      rcs[i] = 0;
+      rci[i] = 0;
+      rcl[i] = 0;
+    }
+
+  if (reduction_or () != 0)
+    __builtin_abort ();
+  if (reduction_or_teams () != 0)
+    __builtin_abort ();
+  if (reduction_and () != 0)
+    __builtin_abort ();
+  if (reduction_and_teams () != 0)
+    __builtin_abort ();
+
+  rcc[10] = 1.0;
+  rcs[15] = 1.0i;
+  rci[10] = 1.0;
+  rcl[15] = 1.0i;
+
+  if (reduction_or () != 4)
+    __builtin_abort ();
+  if (reduction_or_teams () != 4)
+    __builtin_abort ();
+  if (reduction_and () != 0)
+    __builtin_abort ();
+  if (reduction_and_teams () != 0)
+    __builtin_abort ();
+
+  for (int i = 0; i < N; ++i)
+    {
+      rcc[i] = 1;
+      rcs[i] = 1i;
+      rci[i] = 1;
+      rcl[i] = 1 + 1i;
+    }
+
+  if (reduction_or () != 4)
+    __builtin_abort ();
+  if (reduction_or_teams () != 4)
+    __builtin_abort ();
+  if (reduction_and () != 4)
+    __builtin_abort ();
+  if (reduction_and_teams () != 4)
+    __builtin_abort ();
+
+  rcc[10] = 0.0;
+  rcs[15] = 0.0;
+  rci[10] = 0.0;
+  rcl[15] = 0.0;
+
+  if (reduction_or () != 4)
+    __builtin_abort ();
+  if (reduction_or_teams () != 4)
+    __builtin_abort ();
+  if (reduction_and () != 0)
+    __builtin_abort ();
+  if (reduction_and_teams () != 0)
+    __builtin_abort ();
+
+  return 0;
+}

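The combine step that the patch adds to omp-low.c can be illustrated in plain C. The following is only a minimal sketch of the value being computed: each operand is compared against zero, the logical operator is applied to the two int results, and the int is converted back to the type of the reduction variable. The helper names `combine_and_double` and `combine_or_complex` are hypothetical and do not exist in GCC; the actual lowering builds NE_EXPR comparisons and a fold_convert on trees rather than calling helpers like these.

```c
#include <complex.h>

/* Hypothetical helper, not part of GCC: '&&' combine of the shared
   value OUT with one thread's private copy PRIV for a 'double'
   reduction variable.  Both operands are compared against zero, the
   logical AND is done on int values, and the int result (0 or 1) is
   converted back to double.  */
static double
combine_and_double (double out, double priv)
{
  return (double) ((out != 0) && (priv != 0));
}

/* Hypothetical helper, not part of GCC: '||' combine for a
   '_Complex double' reduction variable.  A complex value compares
   unequal to zero if its real or its imaginary part is nonzero.  */
static double _Complex
combine_or_complex (double _Complex out, double _Complex priv)
{
  return (double _Complex) ((out != 0) || (priv != 0));
}
```

With these helpers, `combine_and_double (2.5, 0.0)` yields 0.0 and `combine_or_complex (0, 1.0 * I)` yields 1 with a zero imaginary part, which is why the test cases above can sum the reduction results and compare against 0 or 4.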
* Re: [Patch] OpenMP: Support complex/float in && and || reduction
  2021-05-04 10:16     ` Tobias Burnus
@ 2021-05-04 10:26       ` Jakub Jelinek
  0 siblings, 0 replies; 4+ messages in thread
From: Jakub Jelinek @ 2021-05-04 10:26 UTC (permalink / raw)
  To: Tobias Burnus; +Cc: gcc-patches

On Tue, May 04, 2021 at 12:16:50PM +0200, Tobias Burnus wrote:
> Unless there are further comments, I intent to commit it after the lunch
> break.

Just further nits, sorry (but no need to retest):

> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -6389,6 +6389,11 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
>                    if (code == MINUS_EXPR)
>                      code = PLUS_EXPR;
>
> +                  /* C/C++ permits FP/complex with || and &&.  */
> +                  bool is_fp_and_or
> +                    = (((code == TRUTH_ANDIF_EXPR) || (code == TRUTH_ORIF_EXPR))

The ()s around code == TRUTH_*_EXPR are unnecessary.

> +                       && (FLOAT_TYPE_P (TREE_TYPE (new_var))
> +                           || (TREE_CODE (TREE_TYPE (new_var)) == COMPLEX_TYPE)));

And if I count well, this is too long, so should have == on the next line
below TREE_CODE.  If it would fit, the ()s around the == would be
unnecessary too, the whole point of them is to make emacs happy (not an
emacs user myself though), which would otherwise misalign it.  Only
needed if it needs a line split.  Without ()s, I think emacs likes to indent
  foo (....)
  || bar (..................)
     == ...........
as
  foo (....)
  || bar (..................)
  == ...........
and the cure is
  foo (....)
  || (bar (..................)
      == ...........)

> @@ -7397,13 +7429,32 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp,
>        if (code == MINUS_EXPR)
>          code = PLUS_EXPR;
>
> +      /* C/C++ permits FP/complex with || and &&.  */
> +      bool is_fp_and_or = (((code == TRUTH_ANDIF_EXPR)
> +                            || (code == TRUTH_ORIF_EXPR))

See above.

Otherwise ok.

        Jakub
