From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180]) by sourceware.org (Postfix) with ESMTPS id BAFF0383FB94; Fri, 24 Mar 2023 15:51:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BAFF0383FB94 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.98,288,1673942400"; d="scan'208";a="274548" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa3.mentor.iphmx.com with ESMTP; 24 Mar 2023 07:31:26 -0800 IronPort-SDR: f7z0Mukx/xVpIbMmh6yATJaH3WQSY2TGvCVSKYvR0yrdbd1cKZdR6WT4Ml983GyWztwvk61qPr CHOJGEjLcNDvIsJ2OnvLcixxVzsLdV1M5vk/Y9TclpTuq+4qLFwi+cuc+EUNhAl3cogNiZ7FkN M8WCJ8vqNUXRI7NeXzT0ZHmrqvWsJOi/EuPFG5awMaJXODKz3GN5v+JRwu9EYaxj8QEHaBbxLN pLHR2tSjvm/3tzs/lhs+hldJaySsmK9d9S1lJgNS5gvRnZ6XbxZDtLRKNKrsynq+S11PgNwHv3 fYQ= From: Frederik Harwath To: , , , Subject: [PATCH 6/7] openmp: Add Fortran support for loop transformations on inner loops Date: Fri, 24 Mar 2023 16:30:44 +0100 Message-ID: <20230324153046.3996092-7-frederik@codesourcery.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20230324153046.3996092-1-frederik@codesourcery.com> References: <20230324153046.3996092-1-frederik@codesourcery.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-14.mgc.mentorg.com (139.181.222.14) To svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00,GIT_PATCH_0,HEADER_FROM_DIFFERENT_DOMAINS,KAM_DMARC_STATUS,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: So far the implementation of the "omp tile" and "omp unroll" directives restricted their use to the outermost loop of a loop-nest. This commit changes the Fortran front end to parse and verify the directives on inner loops. The transformation clauses are extended to carry the information about the level of the loop nest at which a transformation should be applied. The middle end transformation pass is adjusted to apply the transformations at the correct level of a loop nest and to take their effect on the loop nest depth into account. gcc/fortran/ChangeLog: * openmp.cc (omp_unroll_removes_loop_nest): Move down in file. (resolve_loop_transform_generic): Remove, and ... (resolve_omp_unroll): ... inline and adapt here. Move function. Move functin. (find_nested_loop_in_block): New function. (find_nested_loop_in_chain): New function, used ... (is_outer_iteration_variable): ... here, and ... (expr_is_invariant): ... here. (resolve_omp_do): Adjust code for resolving loop transformations. (resolve_omp_tile): Likewise. * trans-openmp.cc (gfc_trans_omp_clauses): Set OMP_TRANSFROM_LEVEL on new clause. (compute_transformed_depth): New function to compute the depth ("collapse") of a transformed loop nest, used (gfc_trans_omp_do): ... here. gcc/ChangeLog: * omp-transform-loops.cc (gimple_assign_rhs_to_tree): Fix type in comment. (gomp_for_uncollapse): Adjust "collapse" value after uncollapse. (partial_unroll): Add argument for the loop nest level to be transf= ormed. (tile): Likewise. (transform_gomp_for): Pass level to transformatoin functions. (optimize_transformation_clauses): Handle transformation clauses fo= r all levels recursively. * tree-pretty-print.cc (dump_omp_clause): Print OMP_CLAUSE_TRANSFORM_LEVEL for OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE. * tree.cc: Increase number of operands of OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE. * tree.h (OMP_CLAUSE_TRANSFORM_LEVEL): New macro to access clause operand 0. (OMP_CLAUSE_UNROLL_PARTIAL_EXPR): Use operand 1 instead of 0. (OMP_CLAUSE_TILE_SIZES): Likewise. gcc/cp/ChangeLog * parser.cc (cp_parser_omp_clause_unroll_full): Set new OMP_CLAUSE_TRANSFORM_LEVEL operand to default value. (cp_parser_omp_clause_unroll_partial): Likewise. (cp_parser_omp_tile_sizes): Likewise. (cp_parser_omp_loop_transform_clause): Likewise. (cp_parser_omp_nested_loop_transform_clauses): Likewise. (cp_parser_omp_unroll): Likewise. * pt.cc (tsubst_omp_clauses): Adjust OMP_CLAUSE_UNROLL_PARTIAL and OMP_CLAUSE_TILE handling to changed number of operands. gcc/c/ChangeLog * c-parser.cc (c_parser_omp_clause_unroll_full): Set new OMP_CLAUSE_TRANSFORM_LEVEL operand to default value. (c_parser_omp_clause_unroll_partial): Likewise. (c_parser_omp_tile_sizes): Likewise. (c_parser_omp_loop_transform_clause): Likewise. (c_parser_omp_nested_loop_transform_clauses): Likewise. (c_parser_omp_unroll): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/loop-transforms/unroll-8.f90: Adjust. * gfortran.dg/gomp/loop-transforms/unroll-9.f90: Adjust. * gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: Adjust. * gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: Adjust. * gfortran.dg/gomp/loop-transforms/inner-loops.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90: New tes= t. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90: New test= . * gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90: New test= . * gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90: New test= . * gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90: New tes= t. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90: New test= . * gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90: New tes= t. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90: New test= . * gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90: New tes= t. * gfortran.dg/gomp/loop-transforms/tile-3.f90: Adapt to changed diagnostic messages. libgomp/ChangeLog: * testsuite/libgomp.fortran/loop-transforms/inner-1.f90: New test. --- gcc/c/c-parser.cc | 10 +- gcc/cp/parser.cc | 12 +- gcc/cp/pt.cc | 12 +- gcc/fortran/openmp.cc | 173 ++++++++++++------ gcc/fortran/trans-openmp.cc | 74 ++++++-- gcc/omp-transform-loops.cc | 138 ++++++++------ .../gomp/loop-transforms/inner-loops.f90 | 124 +++++++++++++ .../gomp/loop-transforms/tile-3.f90 | 4 +- .../loop-transforms/tile-imperfect-nest.f90 | 93 ++++++++++ .../loop-transforms/tile-inner-loops-1.f90 | 16 ++ .../loop-transforms/tile-inner-loops-2.f90 | 23 +++ .../loop-transforms/tile-inner-loops-3.f90 | 22 +++ .../loop-transforms/tile-inner-loops-3a.f90 | 31 ++++ .../loop-transforms/tile-inner-loops-4.f90 | 30 +++ .../loop-transforms/tile-inner-loops-4a.f90 | 26 +++ .../loop-transforms/tile-inner-loops-5.f90 | 123 +++++++++++++ .../tile-non-rectangular-1.f90 | 71 +++++++ .../tile-non-rectangular-2.f90 | 12 ++ .../gomp/loop-transforms/unroll-8.f90 | 2 +- .../gomp/loop-transforms/unroll-9.f90 | 2 +- .../loop-transforms/unroll-inner-loop.f90 | 57 ++++++ .../loop-transforms/unroll-non-rect-1.f90 | 31 ++++ .../gomp/loop-transforms/unroll-tile-1.f90 | 2 +- .../gomp/loop-transforms/unroll-tile-2.f90 | 2 +- .../loop-transforms/unroll-tile-inner-1.f90 | 25 +++ gcc/tree-pretty-print.cc | 24 +++ gcc/tree.cc | 8 +- gcc/tree.h | 9 +- .../loop-transforms/inner-1.f90 | 77 ++++++++ 29 files changed, 1103 insertions(+), 130 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-lo= ops.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imp= erfect-nest.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inn= er-loops-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inn= er-loops-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inn= er-loops-3.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inn= er-loops-3a.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inn= er-loops-4.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inn= er-loops-4a.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inn= er-loops-5.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non= -rectangular-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non= -rectangular-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-i= nner-loop.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-n= on-rect-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-t= ile-inner-1.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/loop-transforms/inner= -1.f90 diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index aac23dec9c0..41f9fb90037 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -17466,6 +17466,7 @@ c_parser_omp_clause_unroll_full (c_parser *parser, = tree list) location_t loc =3D c_parser_peek_token (parser)->location; tree c =3D build_omp_clause (loc, OMP_CLAUSE_UNROLL_FULL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) =3D build_int_cst (unsigned_type_node, 0)= ; OMP_CLAUSE_CHAIN (c) =3D list; return c; } @@ -17486,6 +17487,7 @@ c_parser_omp_clause_unroll_partial (c_parser *parse= r, tree list) loc =3D c_parser_peek_token (parser)->location; c =3D build_omp_clause (loc, OMP_CLAUSE_UNROLL_PARTIAL); OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) =3D NULL_TREE; + OMP_CLAUSE_TRANSFORM_LEVEL (c) =3D build_int_cst (unsigned_type_node, 0)= ; OMP_CLAUSE_CHAIN (c) =3D list; if (!c_parser_next_token_is (parser, CPP_OPEN_PAREN)) @@ -24011,6 +24013,7 @@ c_parser_omp_tile_sizes (c_parser *parser, location= _t loc) gcc_assert (sizes); tree c =3D build_omp_clause (loc, OMP_CLAUSE_TILE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) =3D build_int_cst (unsigned_type_node, 0)= ; OMP_CLAUSE_TILE_SIZES (c) =3D sizes; return c; @@ -24036,7 +24039,11 @@ c_parser_omp_loop_transform_clause (c_parser *pars= er) if (!c) { if (c_parser_next_token_is (parser, CPP_PRAGMA_EOL)) - c =3D build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE); + { + c =3D build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE= ); + OMP_CLAUSE_TRANSFORM_LEVEL (c) =3D + build_int_cst (unsigned_type_node, 0); + } else c =3D error_mark_node; } @@ -24191,6 +24198,7 @@ c_parser_omp_unroll (location_t loc, c_parser *pars= er, bool *if_p) if (!clauses) { tree c =3D build_omp_clause (loc, OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) =3D build_int_cst (unsigned_type_node= , 0); OMP_CLAUSE_CHAIN (c) =3D clauses; clauses =3D c; } diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index 084ecd3ada5..8219c476153 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -39476,6 +39476,7 @@ cp_parser_omp_clause_unroll_full (tree list, locati= on_t loc) return list; tree c =3D build_omp_clause (loc, OMP_CLAUSE_UNROLL_FULL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) =3D build_int_cst (unsigned_type_node, 0)= ; OMP_CLAUSE_CHAIN (c) =3D list; return c; } @@ -39494,6 +39495,7 @@ cp_parser_omp_clause_unroll_partial (cp_parser *par= ser, tree list, tree c, num =3D error_mark_node; c =3D build_omp_clause (loc, OMP_CLAUSE_UNROLL_PARTIAL); OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) =3D NULL_TREE; + OMP_CLAUSE_TRANSFORM_LEVEL (c) =3D build_int_cst (unsigned_type_node, 0)= ; OMP_CLAUSE_CHAIN (c) =3D list; if (!cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN)) @@ -45786,6 +45788,8 @@ cp_parser_omp_tile_sizes (cp_parser *parser, locati= on_t loc) gcc_assert (sizes); tree c =3D build_omp_clause (loc, OMP_CLAUSE_TILE); OMP_CLAUSE_TILE_SIZES (c) =3D sizes; + OMP_CLAUSE_TRANSFORM_LEVEL (c) + =3D build_int_cst (unsigned_type_node, 0); return c; } @@ -45846,7 +45850,11 @@ cp_parser_omp_loop_transform_clause (cp_parser *pa= rser) if (!c) { if (cp_lexer_next_token_is (lexer, CPP_PRAGMA_EOL)) - c =3D build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE); + { + c =3D build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE= ); + OMP_CLAUSE_TRANSFORM_LEVEL (c) + =3D build_int_cst (unsigned_type_node, 0); + } else c =3D error_mark_node; } @@ -45926,6 +45934,7 @@ cp_parser_omp_nested_loop_transform_clauses (cp_par= ser *parser, tree &clauses, default: gcc_unreachable (); } + OMP_CLAUSE_TRANSFORM_LEVEL (c) =3D build_int_cst (unsigned_type_node= , 0); if (depth < last_depth) { @@ -45974,6 +45983,7 @@ cp_parser_omp_unroll (cp_parser *parser, cp_token *= tok, bool *if_p) if (!clauses) { tree c =3D build_omp_clause (tok->location, OMP_CLAUSE_UNROLL_NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) =3D build_int_cst (unsigned_type_node= , 0); OMP_CLAUSE_CHAIN (c) =3D clauses; clauses =3D c; } diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index a9d36d66caf..aeea36b24d7 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -18086,11 +18086,19 @@ tsubst_omp_clauses (tree clauses, enum c_omp_regi= on_type ort, case OMP_CLAUSE_ASYNC: case OMP_CLAUSE_WAIT: case OMP_CLAUSE_DETACH: - case OMP_CLAUSE_UNROLL_PARTIAL: - case OMP_CLAUSE_TILE: OMP_CLAUSE_OPERAND (nc, 0) =3D tsubst_expr (OMP_CLAUSE_OPERAND (oc, 0), args, complain, in= _decl); break; + case OMP_CLAUSE_UNROLL_PARTIAL: + OMP_CLAUSE_UNROLL_PARTIAL_EXPR (nc) + =3D tsubst_expr (OMP_CLAUSE_UNROLL_PARTIAL_EXPR (oc), args, com= plain, + in_decl); + break; + case OMP_CLAUSE_TILE: + OMP_CLAUSE_TILE_SIZES (nc) + =3D tsubst_expr (OMP_CLAUSE_TILE_SIZES (oc), args, complain, + in_decl); + break; case OMP_CLAUSE_REDUCTION: case OMP_CLAUSE_IN_REDUCTION: case OMP_CLAUSE_TASK_REDUCTION: diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index 1de61029768..86e9e4ead0e 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -9389,27 +9389,79 @@ gfc_resolve_omp_local_vars (gfc_namespace *ns) gfc_traverse_ns (ns, handle_local_var); } + +/* Forward declaration for mutually recursive functions. */ +static gfc_code * +find_nested_loop_in_block (gfc_code *block); + +/* Return the first nested DO loop in CHAIN, or NULL if there + isn't one. Does no error checking on intervening code. */ + +static gfc_code * +find_nested_loop_in_chain (gfc_code *chain) +{ + gfc_code *code; + + if (!chain) + return NULL; + + for (code =3D chain; code; code =3D code->next) + { + if (code->op =3D=3D EXEC_DO) + return code; + else if (loop_transform_p (code->op) && code->block) + { + code =3D code->block; + continue; + } + else if (code->op =3D=3D EXEC_BLOCK) + { + gfc_code *c =3D find_nested_loop_in_block (code); + if (c) + return c; + } + } + return NULL; +} + +/* Return the first nested DO loop in BLOCK, or NULL if there + isn't one. Does no error checking on intervening code. */ +static gfc_code * +find_nested_loop_in_block (gfc_code *block) +{ + gfc_namespace *ns; + gcc_assert (block->op =3D=3D EXEC_BLOCK); + ns =3D block->ext.block.ns; + gcc_assert (ns); + return find_nested_loop_in_chain (ns->code); +} /* CODE is an OMP loop construct. Return true if VAR matches an iteration variable outer to level DEPTH. */ static bool is_outer_iteration_variable (gfc_code *code, int depth, gfc_symbol *var) { int i; - gfc_code *do_code =3D code->block->next; - while (loop_transform_p (do_code->op)) { - if (do_code->block) - do_code =3D do_code->block->next; - else - do_code =3D do_code->next; - } - gcc_assert (!loop_transform_p (do_code->op)); + gfc_code *chain; + if (code->block) + chain =3D code->block->next; + else + { + gcc_assert (loop_transform_p (code->op)); + chain =3D code; + while (loop_transform_p (chain->op)) + chain =3D chain->next; + } for (i =3D 1; i < depth; i++) { + gfc_code *do_code =3D find_nested_loop_in_chain (chain); + gcc_assert (do_code !=3D code); + gcc_assert (do_code && do_code->op =3D=3D EXEC_DO); gfc_symbol *ivar =3D do_code->ext.iterator->var->symtree->n.sym; if (var =3D=3D ivar) return true; - do_code =3D do_code->block->next; + + chain =3D do_code->block->next; } return false; } @@ -9420,21 +9472,22 @@ static bool expr_is_invariant (gfc_code *code, int depth, gfc_expr *expr) { int i; - gfc_code *do_code =3D code->block->next; - while (loop_transform_p (do_code->op)) { - if (do_code->block) - do_code =3D do_code->block->next; - else - do_code =3D do_code->next; - } - gcc_assert (!loop_transform_p (do_code->op)); + gfc_code *do_code =3D code; + + /* Move over loop transformations until the + loop is found. It may also be represented by a + transformation construct (but then with a block) + if it is not associated with any other construct. */ + while (loop_transform_p (do_code->op) && !do_code->block) + do_code =3D do_code->next; for (i =3D 1; i < depth; i++) { + do_code =3D find_nested_loop_in_chain (do_code->block->next); + gcc_assert (do_code); gfc_symbol *ivar =3D do_code->ext.iterator->var->symtree->n.sym; if (gfc_find_sym_in_expr (ivar, expr)) return false; - do_code =3D do_code->block->next; } return true; } @@ -9828,6 +9881,8 @@ resolve_omp_do (gfc_code *code) if (i =3D=3D collapse || c) break; do_code =3D do_code->block; + do_code =3D resolve_nested_loop_transforms (do_code, name, collapse = - i, + &code->loc); if (do_code->op !=3D EXEC_DO && do_code->op !=3D EXEC_DO_WHILE) { gfc_error ("not enough DO loops for collapsed %s at %L", @@ -9835,6 +9890,8 @@ resolve_omp_do (gfc_code *code) break; } do_code =3D do_code->next; + do_code =3D resolve_nested_loop_transforms (do_code, name, collapse = - i, + &code->loc); if (do_code =3D=3D NULL || (do_code->op !=3D EXEC_DO && do_code->op !=3D EXEC_DO_WHILE)) { @@ -9848,7 +9905,7 @@ resolve_omp_do (gfc_code *code) static void resolve_omp_tile (gfc_code *code) { - gfc_code *do_code, *c; + gfc_code *do_code, *next; gfc_symbol *dovar; const char *name =3D "!$OMP TILE"; @@ -9862,65 +9919,78 @@ resolve_omp_tile (gfc_code *code) for (unsigned i =3D 1; i <=3D num_loops; i++) { + + gfc_symbol *start_var =3D NULL, *end_var =3D NULL; + if (do_code->op =3D=3D EXEC_DO_WHILE) { gfc_error ("%s cannot be a DO WHILE or DO without loop control " "at %L", name, &do_code->loc); - break; + return; } if (do_code->op =3D=3D EXEC_DO_CONCURRENT) { gfc_error ("%s cannot be a DO CONCURRENT loop at %L", name, &do_code->loc); - break; + return; } if (do_code->op !=3D EXEC_DO) { gfc_error ("%s must be DO loop at %L", name, &do_code->loc); - break; + return; } gcc_assert (do_code->op !=3D EXEC_OMP_UNROLL); gcc_assert (do_code->op =3D=3D EXEC_DO); dovar =3D do_code->ext.iterator->var->symtree->n.sym; - if (i > 1) + if (is_outer_iteration_variable (code, i, dovar)) { - gfc_code *do_code2 =3D code; - while (loop_transform_p (do_code2->op)) - { - if (do_code2->block) - do_code2 =3D do_code2->block->next; - else - do_code2 =3D do_code2->next; - } - gcc_assert (!loop_transform_p (do_code2->op)); - - for (unsigned j =3D 1; j < i; j++) - { - gfc_symbol *ivar =3D do_code2->ext.iterator->var->symtree->n.= sym; - if (dovar =3D=3D ivar - || gfc_find_sym_in_expr (ivar, do_code->ext.iterator->sta= rt) - || gfc_find_sym_in_expr (ivar, do_code->ext.iterator->end= ) - || gfc_find_sym_in_expr (ivar, do_code->ext.iterator->ste= p)) - { - gfc_error ("%s loops don't form rectangular " - "iteration space at %L", name, &do_code->loc); - break; - } - do_code2 =3D do_code2->block->next; - } + gfc_error ("%s iteration variable used in more than one loop at %= L (depth %d)", + name, &do_code->loc, i); + return; } - for (c =3D do_code->next; c; c =3D c->next) - if (c->op !=3D EXEC_NOP && c->op !=3D EXEC_CONTINUE) + else if (!bound_expr_is_canonical (code, i, + do_code->ext.iterator->start, + &start_var)) + { + gfc_error ("%s loop start expression not in canonical form at %L"= , + name, &do_code->loc); + return; + } + else if (!bound_expr_is_canonical (code, i, + do_code->ext.iterator->end, + &end_var)) + { + gfc_error ("%s loop end expression not in canonical form at %L", + name, &do_code->loc); + return; + } + else if (start_var && end_var && start_var !=3D end_var) + { + gfc_error ("%s loop bounds reference different " + "iteration variables at %L", name, &do_code->loc); + return; + } + else if (!expr_is_invariant (code, i, do_code->ext.iterator->step)) + { + gfc_error ("%s loop increment not in canonical form at %L", + name, &do_code->loc); + return; + } + if (start_var || end_var) + code->ext.omp_clauses->non_rectangular =3D 1; + for (next =3D do_code->next; next; next =3D next->next) + if (next->op !=3D EXEC_NOP && next->op !=3D EXEC_CONTINUE) { gfc_error ("%s loops not perfectly nested at %L", - name, &c->loc); + name, &next->loc); break; } - if (i =3D=3D num_loops || c) + if (i =3D=3D num_loops || next) break; do_code =3D do_code->block; + do_code =3D resolve_nested_loop_transforms (do_code, name, num_loops= - i, &code->loc); if (do_code->op !=3D EXEC_DO && do_code->op !=3D EXEC_DO_WHILE) { gfc_error ("not enough DO loops for %s at %L", @@ -9928,6 +9998,7 @@ resolve_omp_tile (gfc_code *code) break; } do_code =3D do_code->next; + do_code =3D resolve_nested_loop_transforms (do_code, name, num_loops= - i, &code->loc); if (do_code =3D=3D NULL || (do_code->op !=3D EXEC_DO && do_code->op !=3D EXEC_DO_WHILE)) { diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index 6936cd7f5ee..0cef3a8ba3a 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -3893,12 +3893,14 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_= clauses *clauses, if (clauses->unroll_full) { c =3D build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL= _FULL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) =3D build_int_cst (unsigned_type_node= , 0); omp_clauses =3D gfc_trans_add_clause (c, omp_clauses); } if (clauses->unroll_none) { c =3D build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL= _NONE); + OMP_CLAUSE_TRANSFORM_LEVEL (c) =3D build_int_cst (unsigned_type_node= , 0); omp_clauses =3D gfc_trans_add_clause (c, omp_clauses); } @@ -3906,6 +3908,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_cl= auses *clauses, { c =3D build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_UNROLL_PARTIAL); + OMP_CLAUSE_TRANSFORM_LEVEL (c) =3D build_int_cst (unsigned_type_node= , 0); OMP_CLAUSE_UNROLL_PARTIAL_EXPR (c) =3D clauses->unroll_partial_factor ? build_int_cst ( integer_type_node, clauses->unroll_partial_factor) @@ -3926,6 +3929,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_cl= auses *clauses, c =3D build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_TILE); OMP_CLAUSE_TILE_SIZES (c) =3D build_tree_list_vec (tvec); + OMP_CLAUSE_TRANSFORM_LEVEL (c) =3D build_int_cst (unsigned_type_node= , 0); omp_clauses =3D gfc_trans_add_clause (c, omp_clauses); tvec->truncate (0); @@ -5308,6 +5312,29 @@ gfc_expr_list_len (gfc_expr_list *list) return len; } +/* Traverse the loops with nesting depth at most + COLLAPSE from CODE and determine the largest + loop nest depth required by the loop transformations + found on the loops. */ +int compute_transformed_depth (gfc_code *code, int collapse) +{ + int new_collapse =3D collapse; + for (int i =3D 0; i < new_collapse; i++) + { + gcc_assert (code->op =3D=3D EXEC_DO || loop_transform_p (code->op)); + while (loop_transform_p (code->op)) + { + int tile_depth + =3D gfc_expr_list_len (code->ext.omp_clauses->tile_sizes); + new_collapse =3D MAX (new_collapse, i + tile_depth); + code =3D code->block ? code->block->next : code->next; + } + code =3D code->block->next; + } + + return new_collapse; +} + static tree gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, gfc_omp_clauses *do_clauses, tree par_clauses) @@ -5343,6 +5370,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stm= tblock_t *pblock, do" (or similar directive) are represented as clauses on the "omp do"= . */ loop_transform_clauses =3D NULL; int omp_tile_depth =3D gfc_expr_list_len (omp_tile); + tree clauses_tail =3D NULL; while (loop_transform_p (code->op)) { tree clauses =3D gfc_trans_omp_clauses (pblock, code->ext.omp_clause= s, @@ -5354,7 +5382,14 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, st= mtblock_t *pblock, directive, an error will be emitted in pass-omp_transform_loops. *= / omp_tile_depth =3D gfc_expr_list_len (code->ext.omp_clauses->tile_si= zes); - loop_transform_clauses =3D chainon (loop_transform_clauses, clauses)= ; + if (!loop_transform_clauses) + { + loop_transform_clauses =3D clauses; + clauses_tail =3D tree_last (clauses); + } + else + clauses_tail =3D chainon (clauses_tail, clauses); + code =3D code->block ? code->block->next : code->next; } gcc_assert (!loop_transform_p (code->op)); @@ -5371,9 +5406,12 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, st= mtblock_t *pblock, collapse =3D clauses->orderedc; if (collapse <=3D 0) collapse =3D 1; - collapse =3D MAX (collapse, omp_tile_depth); + gfc_code *first_loop =3D loop_transform_p (orig_code->op) ? + orig_code : orig_code->block->next; + int transform_depth =3D compute_transformed_depth (first_loop, collapse)= ; + collapse =3D transform_depth; init =3D make_tree_vec (collapse); cond =3D make_tree_vec (collapse); incr =3D make_tree_vec (collapse); @@ -5384,15 +5422,8 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, st= mtblock_t *pblock, on the simd construct and DO's clauses are translated elsewhere. */ do_clauses->sched_simd =3D false; - if (loop_transform_p (op)) - { - /* This is a loop transformation on a loop which is not associated w= ith - any other directive. Use the directive location instead of the loo= p - location for the clauses. */ - omp_clauses =3D gfc_trans_omp_clauses (pblock, do_clauses, top_loc); - } - else - omp_clauses =3D gfc_trans_omp_clauses (pblock, do_clauses, code->loc); + omp_clauses =3D NULL; + omp_clauses =3D gfc_trans_omp_clauses (pblock, do_clauses, top_loc); omp_clauses =3D chainon (omp_clauses, loop_transform_clauses); for (i =3D 0; i < collapse; i++) @@ -5665,7 +5696,26 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, st= mtblock_t *pblock, } if (i + 1 < collapse) - code =3D code->block->next; + { + code =3D code->block->next; + + loop_transform_clauses =3D NULL; + clauses_tail =3D omp_clauses; + while (loop_transform_p (code->op)) + { + loop_transform_clauses =3D gfc_trans_omp_clauses ( + pblock, code->ext.omp_clauses, code->loc); + for (tree c =3D loop_transform_clauses; c; + c =3D OMP_CLAUSE_CHAIN (c)) + OMP_CLAUSE_TRANSFORM_LEVEL (c) + =3D build_int_cst (unsigned_type_node, i + 1); + + clauses_tail =3D chainon (clauses_tail, loop_transform_clause= s); + clauses_tail =3D tree_last (loop_transform_clauses); + + code =3D code->block ? code->block->next : code->next; + } + } } if (pblock !=3D &block) diff --git a/gcc/omp-transform-loops.cc b/gcc/omp-transform-loops.cc index 858a271261a..517faea537c 100644 --- a/gcc/omp-transform-loops.cc +++ b/gcc/omp-transform-loops.cc @@ -127,7 +127,7 @@ extern tree gimple_assign_rhs_to_tree (gimple *stmt); /* Substitute all definitions from SEQ bottom-up into EXPR. This is used t= o - reconstruct a tree for a gimplified expression for determinig whether o= r not + reconstruct a tree from a gimplified expression for determinig whether = or not the number of iterations of a loop is constant. */ tree @@ -227,6 +227,7 @@ gomp_for_uncollapse (gomp_for *omp_for, int from_depth = =3D 0, bool expand =3D false) { int collapse =3D gimple_omp_for_collapse (omp_for); gcc_assert (from_depth < collapse); + gcc_assert (from_depth >=3D 0); if (collapse <=3D 1) return omp_for; @@ -266,6 +267,7 @@ gomp_for_uncollapse (gomp_for *omp_for, int from_depth = =3D 0, bool expand =3D false) if (from_depth > 0) { gimple_omp_set_body (omp_for, body); + omp_for->collapse =3D from_depth; return omp_for; } @@ -453,7 +455,7 @@ after transform: Misc 6.0: Loop transformations #3440")= in the non-public OpenMP spec repository. */ static gimple_seq -partial_unroll (gomp_for *omp_for, tree unroll_factor, +partial_unroll (gomp_for *omp_for, size_t level, tree unroll_factor, location_t loc, tree transformation_clauses, walk_ctx *ctx) { gcc_assert (unroll_factor); @@ -463,7 +465,7 @@ partial_unroll (gomp_for *omp_for, tree unroll_factor, /* Partial unrolling reduces the loop nest depth of a canonical loop nes= t to 1 hence outer directives cannot require a greater collapse. */ - gcc_assert (gimple_omp_for_collapse (omp_for) <=3D 1); + gcc_assert (gimple_omp_for_collapse (omp_for) <=3D level + 1); if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE | MSG_PRIORITY_INTERNALS, @@ -473,12 +475,12 @@ partial_unroll (gomp_for *omp_for, tree unroll_factor= , gomp_for *unrolled_for =3D as_a (copy_gimple_seq_and_replace= _locals (omp_for)); - tree final =3D gimple_omp_for_final (unrolled_for, 0); - tree incr =3D gimple_omp_for_incr (unrolled_for, 0); - tree index =3D gimple_omp_for_index (unrolled_for, 0); + tree final =3D gimple_omp_for_final (unrolled_for, level); + tree incr =3D gimple_omp_for_incr (unrolled_for, level); + tree index =3D gimple_omp_for_index (unrolled_for, level); gimple_seq body =3D gimple_omp_body (unrolled_for); - tree_code cond =3D gimple_omp_for_cond (unrolled_for, 0); + tree_code cond =3D gimple_omp_for_cond (unrolled_for, level); tree step =3D TREE_OPERAND (incr, 1); gimple_omp_set_body (unrolled_for, build_unroll_body (body, unroll_factor, index, incr, @@ -503,7 +505,7 @@ partial_unroll (gomp_for *omp_for, tree unroll_factor, scaled_step =3D var; } TREE_OPERAND (incr, 1) =3D scaled_step; - gimple_omp_for_set_incr (unrolled_for, 0, incr); + gimple_omp_for_set_incr (unrolled_for, level, incr); pop_gimplify_context (result_bind); @@ -864,7 +866,7 @@ canonicalize_conditions (gomp_for *omp_for) */ static gimple_seq -tile (gomp_for *omp_for, location_t loc, tree tile_sizes, +tile (gomp_for *omp_for, location_t loc, size_t start_level, tree tile_siz= es, tree transformation_clauses, walk_ctx *ctx) { if (dump_enabled_p ()) @@ -896,22 +898,21 @@ tile (gomp_for *omp_for, location_t loc, tree tile_si= zes, collapse_clause =3D c; } - /* The 'omp tile' construct creates a canonical loop-nest whose nesting = depth - equals tiling_depth. The whole loop-nest has depth at least 2 * - omp_tile_depth, but the 'tile loops' at levels - omp_tile_depth+1...2*omp_tile_depth are not in canonical loop-nest fo= rm - and hence cannot be associated with a loop construct. */ - if (clause_collapse > tiling_depth) + /* The tiled loop nest is a canonical loop nest with nesting depth + tiling_depth. The tile loops below that level are not in + canonical loop nest form and hence cannot be associated with a + loop construct. */ + if (clause_collapse > tiling_depth + start_level) { error_at (OMP_CLAUSE_LOCATION (collapse_clause), "collapse cannot extend below the floor loops " "generated by the % construct"); OMP_CLAUSE_COLLAPSE_EXPR (collapse_clause) - =3D build_int_cst (unsigned_type_node, tiling_depth); + =3D build_int_cst (unsigned_type_node, start_level + tiling_depth= ); return transform_gomp_for (omp_for, NULL, ctx); } - if (tiling_depth > collapse) + if (start_level + tiling_depth > collapse) return transform_gomp_for (omp_for, NULL, ctx); gcc_assert (collapse >=3D clause_collapse); @@ -919,13 +920,15 @@ tile (gomp_for *omp_for, location_t loc, tree tile_si= zes, push_gimplify_context (); /* Create the index variables for iterating the tiles in the floor - loops first tiling_depth loops transformed loop nest. */ + loops which will be the loops at levels start_level + ... start_level + tiling_depth of the transformed loop nest. The + loops at level 0 ... start_level - 1 are left unchanged. */ gimple_seq floor_loops_pre_body =3D NULL; size_t tile_level =3D 0; auto_vec sizes_vec; for (tree el =3D tile_sizes; el; el =3D TREE_CHAIN (el), tile_level++) { - size_t nest_level =3D tile_level; + size_t nest_level =3D start_level + tile_level; tree index =3D gimple_omp_for_index (omp_for, nest_level); tree init =3D gimple_omp_for_initial (omp_for, nest_level); tree incr =3D gimple_omp_for_incr (omp_for, nest_level); @@ -956,6 +959,7 @@ tile (gomp_for *omp_for, location_t loc, tree tile_size= s, gimple_omp_for_set_incr (floor_loops, nest_level, incr); gimple_omp_for_set_index (floor_loops, nest_level, tile_index); } + gbind *result_bind =3D gimple_build_bind (NULL, NULL, NULL); pop_gimplify_context (result_bind); gimple_seq_add_seq (gimple_omp_for_pre_body_ptr (floor_loops), @@ -972,6 +976,9 @@ tile (gomp_for *omp_for, location_t loc, tree tile_size= s, to add the incomplete tile checks to each level loop. */ tile_loops =3D gomp_for_uncollapse (as_a (tile_loops)); + for (size_t i =3D 0; i < start_level; i++) + tile_loops =3D gimple_omp_body (tile_loops); + gimple_omp_for_set_kind (as_a (tile_loops), GF_OMP_FOR_KIND_TRANSFORM_LOOP); gimple_omp_for_set_clauses (tile_loops, NULL_TREE); @@ -990,50 +997,51 @@ tile (gomp_for *omp_for, location_t loc, tree tile_si= zes, tree break_label =3D create_artificial_label (UNKNOWN_LOCATION); gimple_seq_add_stmt (surrounding_seq, gimple_build_label (break_label)); - for (size_t level =3D 0; level < tiling_depth; level++) + for (size_t tile_level =3D 0; tile_level < tiling_depth; tile_level++) { - tree original_index =3D gimple_omp_for_index (omp_for, level); - tree original_final =3D gimple_omp_for_final (omp_for, level); + gimple_seq level_preamble =3D NULL; + gimple_seq level_body =3D gimple_omp_body (level_loop); + auto gsi =3D gsi_start (level_body); - tree tile_index =3D gimple_omp_for_index (floor_loops, level); - tree tile_size =3D sizes_vec[level]; + int nest_level =3D start_level + tile_level; + tree original_index =3D gimple_omp_for_index (omp_for, nest_level); + tree original_final =3D gimple_omp_for_final (omp_for, nest_level); + + tree tile_index + =3D gimple_omp_for_index (floor_loops, nest_level); + tree tile_size =3D sizes_vec[tile_level]; tree type =3D TREE_TYPE (tile_index); tree plus_type =3D type; - tree incr =3D gimple_omp_for_incr (omp_for, level); + tree incr =3D gimple_omp_for_incr (omp_for, nest_level); tree step =3D omp_get_for_step_from_incr (gimple_location (omp_for),= incr); gimple_seq *pre_body =3D gimple_omp_for_pre_body_ptr (level_loop); - gimple_seq level_body =3D gimple_omp_body (level_loop); gcc_assert (gimple_omp_for_collapse (level_loop) =3D=3D 1); - tree_code original_cond =3D gimple_omp_for_cond (omp_for, level); + tree_code original_cond =3D gimple_omp_for_cond (omp_for, nest_level= ); gimple_omp_for_set_initial (level_loop, 0, tile_index); tree tile_final =3D create_tmp_var (type); - tree scaled_tile_size =3D fold_build2 (MULT_EXPR, TREE_TYPE (tile_si= ze), - tile_size, step); + tree scaled_tile_size + =3D fold_build2 (MULT_EXPR, TREE_TYPE (tile_size), tile_size, ste= p); tree_code plus_code =3D PLUS_EXPR; if (POINTER_TYPE_P (TREE_TYPE (tile_index))) { plus_code =3D POINTER_PLUS_EXPR; int unsignedp =3D TYPE_UNSIGNED (TREE_TYPE (scaled_tile_size)); - plus_type =3D signed_or_unsigned_type_for (unsignedp, ptrdiff_typ= e_node); + plus_type + =3D signed_or_unsigned_type_for (unsignedp, ptrdiff_type_node= ); } scaled_tile_size =3D fold_convert (plus_type, scaled_tile_size); - gimplify_assign (tile_final, - fold_build2 (plus_code, type, - tile_index, scaled_tile_size), - pre_body); + gimplify_assign ( + tile_final, + fold_build2 (plus_code, type, tile_index, scaled_tile_size), + pre_body); gimple_omp_for_set_final (level_loop, 0, tile_final); - /* Redefine the original loop index variable of OMP_FOR in terms of = the - floor loop and the tiling loop index variable for the current - dimension/level at the top of the loop. */ - gimple_seq level_preamble =3D NULL; - push_gimplify_context (); tree body_label =3D create_artificial_label (UNKNOWN_LOCATION); @@ -1047,7 +1055,6 @@ tile (gomp_for *omp_for, location_t loc, tree tile_si= zes, break_label)); gimple_seq_add_stmt (&level_preamble, gimple_build_label (body_label= )); - auto gsi =3D gsi_start (level_body); gsi_insert_seq_before (&gsi, level_preamble, GSI_SAME_STMT); gbind *level_bind =3D gimple_build_bind (NULL, NULL, NULL); pop_gimplify_context (level_bind); @@ -1057,10 +1064,10 @@ tile (gomp_for *omp_for, location_t loc, tree tile_= sizes, surrounding_seq =3D &level_body; level_loop =3D gsi_stmt (gsi); - /* The label for jumping out of the loop at the next nesting - level. For the outermost level, the label is put after the - loop-nest, for the last one it is not necessary. */ - if (level !=3D tiling_depth - 1) + /* The label for jumping out of the loop at the next + nesting level. For the outermost level, the label is put + after the loop-nest, for the last one it is not necessary. */ + if (tile_level !=3D tiling_depth - 1) { break_label =3D create_artificial_label (UNKNOWN_LOCATION); gsi_insert_after (&gsi, gimple_build_label (break_label), @@ -1093,13 +1100,15 @@ tile (gomp_for *omp_for, location_t loc, tree tile_= sizes, next_transform_depth =3D list_length (OMP_CLAUSE_TILE_SIZES (remaining_clauses)); + size_t next_level + =3D tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (remaining_clauses))= ; /* The current "omp tile" transformation reduces the nesting depth of the canonical loop-nest to TILING_DEPTH. Hence the following "omp tile" transformation is invalid if it requires a greater nesting depth. */ - gcc_assert (next_transform_depth <=3D tiling_depth); - if (next_transform_depth > new_collapse) - new_collapse =3D next_transform_depth; + gcc_assert (next_level + next_transform_depth <=3D start_level + til= ing_depth); + if (next_level + next_transform_depth > new_collapse) + new_collapse =3D next_level + next_transform_depth; } if (collapse > new_collapse) @@ -1260,14 +1269,17 @@ transform_gomp_for (gomp_for *omp_for, tree transfo= rmation, walk_ctx *ctx) gimple_seq result =3D NULL; location_t loc =3D OMP_CLAUSE_LOCATION (transformation); auto dump_loc =3D dump_user_location_t::from_location_t (loc); + size_t level =3D tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (transformatio= n)); switch (OMP_CLAUSE_CODE (transformation)) { case OMP_CLAUSE_UNROLL_FULL: gcc_assert (TREE_CHAIN (transformation) =3D=3D NULL); + gcc_assert (level =3D=3D 0); result =3D full_unroll (omp_for, loc, ctx); break; case OMP_CLAUSE_UNROLL_NONE: gcc_assert (TREE_CHAIN (transformation) =3D=3D NULL); + gcc_assert (level =3D=3D 0); if (assign_unroll_full_clause_p (omp_for, transformation)) { result =3D full_unroll (omp_for, loc, ctx); @@ -1275,7 +1287,7 @@ transform_gomp_for (gomp_for *omp_for, tree transform= ation, walk_ctx *ctx) else if (tree unroll_factor =3D assign_unroll_partial_clause_p (omp_for, transformation)= ) { - result =3D partial_unroll (omp_for, unroll_factor, loc, + result =3D partial_unroll (omp_for, level, unroll_factor, loc, transformation, ctx); } else { @@ -1312,12 +1324,14 @@ transform_gomp_for (gomp_for *omp_for, tree transfo= rmation, walk_ctx *ctx) "factor turned into % clause\n= ", factor); } - result =3D partial_unroll (omp_for, unroll_factor, loc, transformat= ion, - ctx); + + result =3D partial_unroll (omp_for, level, + unroll_factor, loc, transformation, ctx); } break; case OMP_CLAUSE_TILE: - result =3D tile (omp_for, loc, OMP_CLAUSE_TILE_SIZES (transformation= ), + result =3D tile (omp_for, loc, level, + OMP_CLAUSE_TILE_SIZES (transformation), transformation, ctx); break; default: @@ -1418,6 +1432,9 @@ print_optimized_unroll_partial_msg (tree c) static tree optimize_transformation_clauses (tree clauses) { + if (!clauses) + return NULL_TREE; + /* The last unroll_partial clause seen in clauses, if any, or the last merged unroll partial clause. */ tree unroll_partial =3D NULL; @@ -1429,6 +1446,7 @@ optimize_transformation_clauses (tree clauses) since last_non_unroll was seen. */ bool merged_unroll_partial =3D false; + size_t level =3D tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (clauses)); for (tree c =3D clauses; c !=3D NULL_TREE; c =3D OMP_CLAUSE_CHAIN (c)) { enum omp_clause_code code =3D OMP_CLAUSE_CODE (c); @@ -1516,6 +1534,24 @@ optimize_transformation_clauses (tree clauses) default: gcc_unreachable (); } + + /* The transformations are ordered by the level of the loop-nest to = which + they apply in decreasing order. Handle the different levels separa= tely + as long as we do not implement optimizations across the levels. */ + tree next_c =3D OMP_CLAUSE_CHAIN (c); + if (!next_c) + break; + + size_t next_level =3D tree_to_uhwi (OMP_CLAUSE_TRANSFORM_LEVEL (next= _c)); + if (next_level !=3D level) + { + gcc_assert (next_level < level); + tree tail =3D optimize_transformation_clauses (next_c); + OMP_CLAUSE_CHAIN (c) =3D tail; + break; + } + else level =3D next_level; + } if (merged_unroll_partial && dump_enabled_p ()) diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90= b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90 new file mode 100644 index 00000000000..f9ee5184dab --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/inner-loops.f90 @@ -0,0 +1,124 @@ +subroutine test1 + !$omp parallel do collapse(2) + do i=3D0,100 + !$omp unroll partial(2) + do j=3D-300,100 + call dummy (j) + end do + end do +end subroutine test1 + +subroutine test2 + !$omp parallel do collapse(3) + do i=3D0,100 + !$omp unroll partial(2) ! { dg-error {loop nest depth after \!\$OMP U= NROLL at \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + do j=3D-300,100 + do k=3D-300,100 + call dummy (k) + end do + end do + end do +end subroutine test2 + +subroutine test3 +!$omp parallel do collapse(3) +do i=3D0,100 + do j=3D-300,100 + !$omp unroll partial(2) + do k=3D-300,100 + call dummy (k) + end do +end do +end do +end subroutine test3 + +subroutine test4 +!$omp parallel do collapse(3) +do i=3D0,100 + !$omp tile sizes(3) ! { dg-error {loop nest depth after \!\$OMP TILE at= \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + do j=3D-300,100 + !$omp unroll partial(2) + do k=3D-300,100 + call dummy (k) + end do +end do +end do +end subroutine test4 + +subroutine test5 + !$omp parallel do collapse(3) + !$omp tile sizes(3,2) ! { dg-error {loop nest depth after \!\$OMP TILE a= t \(1\) is insufficient for outer \!\$OMP PARALLEL DO} } + do i=3D0,100 + do j=3D-300,100 + do k=3D-300,100 + call dummy (k) + end do + end do + end do +end subroutine test5 + +subroutine test6 +!$omp parallel do collapse(3) +do i=3D0,100 + !$omp tile sizes(3,2) + do j=3D-300,100 + !$omp unroll partial(2) + do k=3D-300,100 + call dummy (k) + end do +end do +end do +end subroutine test6 + +subroutine test7 +!$omp parallel do collapse(3) +do i=3D0,100 + !$omp tile sizes(3,3) + do j=3D-300,100 + !$omp tile sizes(5) + do k=3D-300,100 + call dummy (k) + end do +end do +end do +end subroutine test7 + +subroutine test8 +!$omp parallel do collapse(1) +do i=3D0,100 + !$omp tile sizes(3,3) + do j=3D-300,100 + !$omp tile sizes(5) + do k=3D-300,100 + call dummy (k) + end do +end do +end do +end subroutine test8 + +subroutine test9 +!$omp parallel do collapse(3) +do i=3D0,100 + !$omp tile sizes(3,3,3) ! { dg-error {not enough DO loops for \!\$OMP T= ILE at \(1\)} } + do j=3D-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TILE a= t \(1\) is insufficient for outer \!\$OMP TILE} } + do k=3D-300,100 + call dummy (k) + end do +end do +end do +end subroutine test9 + +subroutine test10 +!$omp parallel do +do i=3D0,100 + !$omp tile sizes(3,3,3) ! { dg-error {not enough DO loops for \!\$OMP T= ILE at \(1\)} } + do j=3D-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TILE a= t \(1\) is insufficient for outer \!\$OMP TILE} } + do k=3D-300,100 + call dummy (k) + end do +end do +end do +end subroutine test10 + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 b/gc= c/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 index eaa7895eaa0..308e3b3e4d0 100644 --- a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-3.f90 @@ -2,9 +2,9 @@ subroutine test implicit none integer :: i, j, k - !$omp parallel do collapse(2) ordered(2) + !$omp parallel do collapse(2) ordered(2) ! { dg-error {'ordered' invalid= in conjunction with 'omp tile'} } !$omp tile sizes (1,2) - do i =3D 1,100 ! { dg-error {'ordered' invalid in conjunction with 'omp = tile'} } + do i =3D 1,100 do j =3D 1,100 call dummy(j) do k =3D 1,100 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-= nest.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-ne= st.f90 new file mode 100644 index 00000000000..3ec1671f01f --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f9= 0 @@ -0,0 +1,93 @@ +subroutine test0 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i =3D 1,m + !$omp tile sizes (8, 1) + do j =3D 1,n + !$omp unroll partial(10) + do k =3D 1, n + if (k =3D=3D 1) then + inner =3D 0 + endif + end do + end do + end do +end subroutine test0 + +subroutine test0m + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + do i =3D 1,m + !$omp tile sizes (8, 1) + do j =3D 1,n + do k =3D 1, n + if (k =3D=3D 1) then + inner =3D 0 + endif + inner =3D inner + a(k, i) * b(j, k) + end do + c(j, i) =3D inner ! { dg-error {\!\$OMP TILE loops not perfectly= nested at \(1\)} } + end do + end do +end subroutine test0m + +subroutine test1 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i =3D 1,m + !$omp tile sizes (8, 1) + do j =3D 1,n + !$omp unroll partial(10) + do k =3D 1, n + if (k =3D=3D 1) then + inner =3D 0 + endif + inner =3D inner + a(k, i) * b(j, k) + end do + c(j, i) =3D inner ! { dg-error {\!\$OMP TILE loops not perfectly= nested at \(1\)} "TODO Fix with upcoming imperfect loop nest handling" { x= fail *-*-* } } + end do + end do +end subroutine test1 + + +subroutine test2 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + !$omp tile sizes (8, 1) + do i =3D 1,m + !$omp tile sizes (8, 1) + do j =3D 1,n + do k =3D 1, n + if (k =3D=3D 1) then + inner =3D 0 + endif + inner =3D inner + a(k, i) * b(j, k) + end do + c(j, i) =3D inner ! { dg-error {\!\$OMP TILE loops not perfectly= nested at \(1\)} } + end do + end do +end subroutine test2 + +subroutine test3 + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + !$omp parallel do collapse(2) private(inner) + do i =3D 1,m + !$omp tile sizes (8, 1) + do j =3D 1,n + do k =3D 1, n + if (k =3D=3D 1) then + inner =3D 0 + endif + inner =3D inner + a(k, i) * b(j, k) + end do + c(j, i) =3D inner ! { dg-error {\!\$OMP TILE loops not perfectly= nested at \(1\)} } + end do + end do +end subroutine test3 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loop= s-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1= .f90 new file mode 100644 index 00000000000..6474b9da1e2 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90 @@ -0,0 +1,16 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test1 + !$omp parallel do collapse(2) + do i=3D0,100 + !$omp tile sizes(4) + do j=3D-300,100 + call dummy (j) + end do + end do +end subroutine test1 + +! Collapse of the gimple_omp_for should be unaffacted by the transformatio= n +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait collapse\(2\)= tile sizes\(4\).1\n +for \(i =3D 0; i <=3D 100; i =3D i \+ 1\)\n +for \(j = =3D -300; j <=3D 100; j =3D j \+ 1\)} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait collapse\(2\)= private\(j.0\) private\(j\)\n +for \(i =3D 0; i < 101; i =3D i \+ 1\)\n +f= or \(.omp_tile_index.\d =3D -300; .omp_tile_index.\d < 101; .omp_tile_index= .\d =3D .omp_tile_index.\d \+ 4\)} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loop= s-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2= .f90 new file mode 100644 index 00000000000..0d462debd72 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90 @@ -0,0 +1,23 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test2 + !$omp parallel do + !$omp tile sizes(3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp tile sizes(3,3) + do k=3D-300,100 + do l=3D0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test2 + +! One gimple_omp_for should cover the outer two loops, another the inner t= wo loops +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait tile sizes\(3= , 3\)@0\n +for \(i =3D 0; i <=3D 100; i =3D i \+ 1\)\n +for \(j =3D -300; j= <=3D 100; j =3D j \+ 1\)\n} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile size= s\(3, 3\)@0\n +for \(k =3D -300; k <=3D 100; k =3D k \+ 1\)\n +for \(l =3D = 0; l <=3D 100; l =3D l \+ 1\)} 1 "original" } } +! Collapse after the transformations should be 1 +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait\n +for \(.omp= _tile_index.\d =3D 0; .omp_tile_index.\d < 101; .omp_tile_index.\d =3D .omp= _tile_index.\d \+ \d\)} 1 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loop= s-3.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3= .f90 new file mode 100644 index 00000000000..3ce87ad8a4b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90 @@ -0,0 +1,22 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test3 + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp tile sizes(3,3) + do k=3D-300,100 + do l=3D0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3 + +! gimple_omp_for collapse should be extended to cover all loops affected b= y the transformations (i.e. 4) +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait tile sizes\(3= , 3, 3\)@0 tile sizes\(3, 3\)@2\n +for \(i =3D 0; i <=3D 100; i =3D i \+ 1\= )\n +for \(j =3D -300; j <=3D 100; j =3D j \+ 1\)\n +for \(k =3D -300; k <= =3D 100; k =3D k \+ 1\)\n +for \(l =3D 0; l <=3D 100; l =3D l \+ 1\)} 1 "or= iginal" } } +! Collapse after the transformations should be 1 +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait private\(l.0\= ) private\(k\)\n +for \(.omp_tile_index.\d =3D 0; .omp_tile_index.\d < 101;= .omp_tile_index.\d =3D .omp_tile_index.\d \+ \d\)} 1 "omp_transform_loops"= } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loop= s-3a.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-= 3a.f90 new file mode 100644 index 00000000000..2c06d2094ba --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f9= 0 @@ -0,0 +1,31 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test + !$omp tile sizes(3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp tile sizes(3,3) + do k=3D-300,100 + do l=3D0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test + +! gimple_omp_for collapse should be extended to cover all loops affected b= y the transformations (i.e. 4) +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile size= s\(3, 3, 3\)@0 tile sizes\(3, 3\)@2\n +for \(i =3D 0; i <=3D 100; i =3D i \= + 1\)\n +for \(j =3D -300; j <=3D 100; j =3D j \+ 1\)\n +for \(k =3D -300; = k <=3D 100; k =3D k \+ 1\)\n +for \(l =3D 0; l <=3D 100; l =3D l \+ 1\)} 1 = "original" } } + +! The loops should be lowered after the tiling transformations +! { dg-final { scan-tree-dump-not {\#pragma omp} "omp_transform_loops" } } + +! Third level is tiled first by the inner construct. The resulting floor l= oop is tiled by the outer construct. +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.1} 2 "omp_transf= orm_loops" } } + +! All other levels are tiled once +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.2} 1 "omp_transf= orm_loops" } } +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.3} 1 "omp_transf= orm_loops" } } +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index.4} 1 "omp_transf= orm_loops" } } + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loop= s-4.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4= .f90 new file mode 100644 index 00000000000..355d977fe35 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90 @@ -0,0 +1,30 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test3 + !$omp parallel do + !$omp tile sizes(3) + do i=3D0,100 + do j=3D-300,100 + !$omp tile sizes(3,3) + do k=3D-300,100 + do l=3D0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3 + +! The outer gimple_omp_for should not cover the loop with the tile transfo= rmation +! { dg-final { scan-tree-dump-times {\#pragma omp for nowait tile sizes\(3= \)@0\n +for \(i =3D 0; i <=3D 100; i =3D i \+ 1\)\n} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile size= s\(3, 3\)@0\n +for \(k =3D -300; k <=3D 100; k =3D k \+ 1\)\n +for \(l =3D = 0; l <=3D 100; l =3D l \+ 1\)} 1 "original" } } + + +! After transformations, the outer loop should be a floor loop created +! by the tiling and the outer construct type and non-transformation +! clauses should be unaffected by the tiling +! { dg-final { scan-tree-dump {\#pragma omp for nowait\n +for \(.omp_tile_= index.\d =3D 0; .omp_tile_index.\d < 101; .omp_tile_index.\d =3D .omp_tile_= index.\d \+ 3\)} "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {\#pragma omp} 2 "omp_transform_loops"= } } +! { dg-final { scan-tree-dump-times {\#pragma omp parallel} 1 "omp_transfo= rm_loops" } } +! { dg-final { scan-tree-dump-times {\#pragma omp for} 1 "omp_transform_lo= ops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loop= s-4a.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-= 4a.f90 new file mode 100644 index 00000000000..0c83da660f5 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f9= 0 @@ -0,0 +1,26 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +subroutine test3 + !$omp tile sizes(3) + do i=3D0,100 + do j=3D-300,100 + !$omp tile sizes(3,3) + do k=3D-300,100 + do l=3D0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3 + +! There should be separate gimple_omp_for constructs for the tile construc= ts because the tiling depth +! of the outer construct does not reach the level of the inner construct +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile size= s\(3\)@0\n +for \(i =3D 0; i <=3D 100; i =3D i \+ 1\)\n} 1 "original" } } +! { dg-final { scan-tree-dump-times {\#pragma omp loop_transform tile size= s\(3, 3\)@0\n +for \(k =3D -300; k <=3D 100; k =3D k \+ 1\)\n +for \(l =3D = 0; l <=3D 100; l =3D l \+ 1\)} 1 "original" } } + + +! The loops should be lowered after the tiling transformations +! { dg-final { scan-tree-dump-not {\#pragma omp} "omp_transform_loops" } } +! { dg-final { scan-tree-dump-times {if \(.omp_tile_index} 3 "omp_transfor= m_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loop= s-5.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5= .f90 new file mode 100644 index 00000000000..670e14caa12 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90 @@ -0,0 +1,123 @@ +subroutine test1a + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp tile sizes(5) + do k=3D-300,100 + call dummy (k) + end do + end do + end do +end subroutine test1a + +subroutine test2a + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp tile sizes(5,5) + do k=3D-300,100 + do l=3D-300,100 + do m=3D-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test2a + +subroutine test3a + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TI= LE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=3D-300,100 + do l=3D-300,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3a + +subroutine test4a + !$omp parallel do + !$omp tile sizes(3,3,3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp tile sizes(5,5) ! { dg-error {loop nest depth after \!\$OMP= TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=3D-300,100 + do l=3D-300,100 + do m=3D-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test4a + +subroutine test1b + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp tile sizes(5) + do k=3D-300,100 + call dummy (k) + end do + end do + end do +end subroutine test1b + +subroutine test2b + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp tile sizes(5,5) + do k=3D-300,100 + do l=3D-300,100 + do m=3D-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test2b + +subroutine test3b + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp tile sizes(5) ! { dg-error {loop nest depth after \!\$OMP TI= LE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=3D-300,100 + do l=3D-300,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test3b + +subroutine test4b + !$omp parallel do + !$omp tile sizes(3,3,3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp tile sizes(5,5) ! { dg-error {loop nest depth after \!\$OMP= TILE at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=3D-300,100 + do l=3D-300,100 + do m=3D-300,100 + call dummy (m) + end do + end do + end do + end do + end do +end subroutine test4b diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectan= gular-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-recta= ngular-1.f90 new file mode 100644 index 00000000000..169c2b10e54 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-1= .f90 @@ -0,0 +1,71 @@ +subroutine test1 + !$omp tile sizes(1) + do i =3D 1,100 + do j =3D 1,i + do k =3D 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test1 + +subroutine test2 + !$omp tile sizes(1,2) ! { dg-error {'tile' loop transformation may not a= ppear on non-rectangular for} } + do i =3D 1,100 + do j =3D 1,i + do k =3D 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test2 + +subroutine test3 + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not= appear on non-rectangular for} } + do i =3D 1,100 + do j =3D 1,i + do k =3D 1,100 + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test3 + +subroutine test4 + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not= appear on non-rectangular for} } + do i =3D 1,100 + do j =3D 1,100 + do k =3D 1,i + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test4 + +subroutine test5 + !$omp tile sizes(1,2) + do i =3D 1,100 + do j =3D 1,100 + do k =3D 1,j + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test5 + +subroutine test6 + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not= appear on non-rectangular for} } + do i =3D 1,100 + do j =3D 1,100 + do k =3D 1,j + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test6 diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectan= gular-2.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-recta= ngular-2.f90 new file mode 100644 index 00000000000..d5352e5a117 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/tile-non-rectangular-2= .f90 @@ -0,0 +1,12 @@ +subroutine test + !$omp tile sizes(1,2,1) ! { dg-error {'tile' loop transformation may not= appear on non-rectangular for} } ! { dg-error {'tile' loop transformation = may not appear on non-rectangular for } } + do i =3D 1,100 + do j =3D 1,100 + do k =3D 1,i + call dummy(i) + end do + end do + end do + !$end omp tile +end subroutine test + diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 b/= gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 index 9b91e5c5f98..fd687890ee6 100644 --- a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-8.f90 @@ -16,7 +16,7 @@ end subroutine test1 ! Loop should be unrolled 1 * 2 * 3 * 4 =3D 24 times -! { dg-final { scan-tree-dump {#pragma omp for nowait collapse\(1\) unroll= _partial\(4\) unroll_partial\(3\) unroll_partial\(2\) unroll_partial\(1\)} = "original" } } +! { dg-final { scan-tree-dump {#pragma omp for nowait collapse\(1\) unroll= _partial\(4\).0 unroll_partial\(3\).0 unroll_partial\(2\).0 unroll_partial\= (1\)} "original" } } ! { dg-final { scan-tree-dump-not "#pragma omp loop_transform" "omp_transf= orm_loops" } } ! { dg-final { scan-tree-dump-times "dummy" 24 "omp_transform_loops" } } ! { dg-final { scan-tree-dump-times {#pragma omp for} 1 "omp_transform_loo= ps" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 b/= gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 index 849d4e77984..928ca44e811 100644 --- a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-9.f90 @@ -13,6 +13,6 @@ subroutine test1 end do end subroutine test1 -! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_full unro= ll_partial\(3\) unroll_partial\(2\) unroll_partial\(1\)} "original" } } +! { dg-final { scan-tree-dump {#pragma omp loop_transform unroll_full.0 un= roll_partial\(3\).0 unroll_partial\(2\).0 unroll_partial\(1\).0} "original"= } } ! { dg-final { scan-tree-dump-not "#pragma omp unroll" "omp_transform_loop= s" } } ! { dg-final { scan-tree-dump-times "dummy" 100 "omp_transform_loops" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-lo= op.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f= 90 new file mode 100644 index 00000000000..efcc691185d --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90 @@ -0,0 +1,57 @@ +subroutine test1a + !$omp parallel do + !$omp tile sizes(3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp unroll partial(5) + do k=3D-300,100 + do l=3D0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test1a + +subroutine test1b + !$omp tile sizes(3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp unroll partial(5) + do k=3D-300,100 + do l=3D0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test1b + +subroutine test2a + !$omp parallel do + !$omp tile sizes(3,3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp unroll partial(5) ! { dg-error {loop nest depth after \!\$O= MP UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=3D-300,100 + do l=3D0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test2a + +subroutine test2b + !$omp tile sizes(3,3,3,3) + do i=3D0,100 + do j=3D-300,100 + !$omp unroll partial(5) ! { dg-error {loop nest depth after \!\$OM= P UNROLL at \(1\) is insufficient for outer \!\$OMP TILE} } + do k=3D-300,100 + do l=3D0,100 + call dummy (l) + end do + end do + end do + end do +end subroutine test2b diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect= -1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f= 90 new file mode 100644 index 00000000000..3da99158cc0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-non-rect-1.f90 @@ -0,0 +1,31 @@ +subroutine test + implicit none + + integer :: i, j, k + !$omp target parallel do collapse(2) ! { dg-error {invalid OpenMP non-re= ctangular loop step; '\(2 - 1\) \* 1' is not a multiple of loop 2 step '5'}= } + do i =3D -300, 100 + !$omp unroll partial + do j =3D i,i*2 + call dummy (i) + end do + end do + + !$omp target parallel do collapse(3) ! { dg-error {invalid OpenMP non-re= ctangular loop step; '\(2 - 1\) \* 1' is not a multiple of loop 3 step '5'}= } + do i =3D -300, 100 + do j =3D 1,10 + !$omp unroll partial + do k =3D j,j*2 + 1 + call dummy (i) + end do + end do + end do + + !$omp unroll full + do i =3D -3, 5 + do j =3D 1,10 + do k =3D j,j*2 + 1 + call dummy (i) + end do + end do + end do +end subroutine diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f= 90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 index cda878f3037..20617e25105 100644 --- a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90 @@ -21,7 +21,7 @@ function mult (a, b) result (c) end do end function mult -! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial= \(1\) tile sizes\(8, 8\)} 1 "original" } } +! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial= \(1\)@0 tile sizes\(8, 8\)@0} 1 "original" } } ! { dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_parti= al" "omp_transform_loops" } } ! Tiling adds two floor and two tile loops. diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f= 90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 index 00615011856..c1e7f356a87 100644 --- a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90 @@ -22,7 +22,7 @@ function mult (a, b) result (c) !$omp end target end function mult -! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial= \(2\) tile sizes\(8, 8, 4\)} 1 "original" } } +! { dg-final { scan-tree-dump-times {#pragma omp for nowait unroll_partial= \(2\)@0 tile sizes\(8, 8, 4\)@0} 1 "original" } } ! { dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_parti= al" "omp_transform_loops" } } ! Check the number of loops diff --git a/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inn= er-1.f90 b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner= -1.f90 new file mode 100644 index 00000000000..bc7a890df17 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f9= 0 @@ -0,0 +1,25 @@ +! { dg-additional-options "-fdump-tree-original" } +! { dg-additional-options "-fdump-tree-omp_transform_loops" } + +function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + + !$omp parallel do collapse(2) + !$omp tile sizes (8,8) + do i =3D 1,m + do j =3D 1,n + inner =3D 0 + !$omp unroll partial(10) + do k =3D 1, n + inner =3D inner + a(k, i) * b(j, k) + end do + c(j, i) =3D inner + end do + end do +end function mult + +! { dg-final { scan-tree-dump-times "#pragma omp loop_transform unroll_par= tial" 1 "original" } } +! { dg-final { scan-tree-dump-not "#pragma omp loop_transform unroll_parti= al" "omp_transform_loops" } } diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc index 02c207d87a0..510f65311b5 100644 --- a/gcc/tree-pretty-print.cc +++ b/gcc/tree-pretty-print.cc @@ -507,9 +507,21 @@ dump_omp_clause (pretty_printer *pp, tree clause, int = spc, dump_flags_t flags) goto print_remap; case OMP_CLAUSE_UNROLL_FULL: pp_string (pp, "unroll_full"); + if (OMP_CLAUSE_TRANSFORM_LEVEL (clause)) + { + pp_string (pp, "@"); + dump_generic_node (pp, OMP_CLAUSE_TRANSFORM_LEVEL (clause), + spc, flags, false); + } break; case OMP_CLAUSE_UNROLL_NONE: pp_string (pp, "unroll_none"); + if (OMP_CLAUSE_TRANSFORM_LEVEL (clause)) + { + pp_string (pp, "@"); + dump_generic_node (pp, OMP_CLAUSE_TRANSFORM_LEVEL (clause), + spc, flags, false); + } break; case OMP_CLAUSE_UNROLL_PARTIAL: pp_string (pp, "unroll_partial"); @@ -520,6 +532,12 @@ dump_omp_clause (pretty_printer *pp, tree clause, int = spc, dump_flags_t flags) false); pp_right_paren (pp); } + if (OMP_CLAUSE_TRANSFORM_LEVEL (clause)) + { + pp_string (pp, "@"); + dump_generic_node (pp, OMP_CLAUSE_TRANSFORM_LEVEL (clause), + spc, flags, false); + } break; case OMP_CLAUSE_TILE: pp_string (pp, "tile sizes"); @@ -528,6 +546,12 @@ dump_omp_clause (pretty_printer *pp, tree clause, int = spc, dump_flags_t flags) dump_generic_node (pp, OMP_CLAUSE_TILE_SIZES (clause), spc, flags, false); pp_right_paren (pp); + if (OMP_CLAUSE_TRANSFORM_LEVEL (clause)) + { + pp_string (pp, "@"); + dump_generic_node (pp, OMP_CLAUSE_TRANSFORM_LEVEL (clause), + spc, flags, false); + } break; case OMP_CLAUSE__LOOPTEMP_: name =3D "_looptemp_"; diff --git a/gcc/tree.cc b/gcc/tree.cc index 893f509fa3a..38478a0ad46 100644 --- a/gcc/tree.cc +++ b/gcc/tree.cc @@ -326,11 +326,11 @@ unsigned const char omp_clause_num_ops[] =3D 0, /* OMP_CLAUSE_IF_PRESENT */ 0, /* OMP_CLAUSE_FINALIZE */ 0, /* OMP_CLAUSE_NOHOST */ - 0, /* OMP_CLAUSE_UNROLL_FULL */ + 1, /* OMP_CLAUSE_UNROLL_FULL */ - 0, /* OMP_CLAUSE_UNROLL_NONE */ - 1, /* OMP_CLAUSE_UNROLL_PARTIAL */ - 1 /* OMP_CLAUSE_TILE */ + 1, /* OMP_CLAUSE_UNROLL_NONE */ + 2, /* OMP_CLAUSE_UNROLL_PARTIAL */ + 2 /* OMP_CLAUSE_TILE */ }; const char * const omp_clause_code_name[] =3D diff --git a/gcc/tree.h b/gcc/tree.h index 8f4d2761d1a..0f8aebab89f 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -1787,11 +1787,16 @@ class auto_suppress_location_wrappers #define OMP_CLAUSE_USE_DEVICE_PTR_IF_PRESENT(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_USE_DEVICE_PTR)->base.public= _flag) +/* The level of a collapsed loop nest at which the tranformation represent= ed + by this clause should be applied. */ +#define OMP_CLAUSE_TRANSFORM_LEVEL(NODE) \ + OMP_CLAUSE_OPERAND (NODE, 0) + #define OMP_CLAUSE_UNROLL_PARTIAL_EXPR(NODE) \ - OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_UNROLL_PA= RTIAL), 0) + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_UNROLL_PA= RTIAL), 1) #define OMP_CLAUSE_TILE_SIZES(NODE) \ - OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 0) + OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 1) #define OMP_CLAUSE_PROC_BIND_KIND(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_PROC_BIND)->omp_clause.subco= de.proc_bind_kind) diff --git a/libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 = b/libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 new file mode 100644 index 00000000000..1db97feb34d --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/loop-transforms/inner-1.f90 @@ -0,0 +1,77 @@ +module matrix + implicit none + integer :: n =3D 10 + integer :: m =3D 10 + +contains + function mult (a, b) result (c) + integer, allocatable, dimension (:,:) :: a,b,c + integer :: i, j, k, inner + + allocate(c( n, m )) + !$omp target parallel do collapse(2) private(inner) map(to:a,b) map(fr= om:c) + !$omp tile sizes (8, 1) + do i =3D 1,m + !$omp tile sizes (8) + do j =3D 1,n + !$omp unroll partial(10) + do k =3D 1, n + if (k =3D=3D 1) then + inner =3D 0 + endif + inner =3D inner + a(k, i) * b(j, k) + if (k =3D=3D n) then + c(j, i) =3D inner + endif + end do + end do + end do + end function mult + + subroutine print_matrix (m) + integer, allocatable :: m(:,:) + integer :: i, j, n + + n =3D size (m, 1) + do i =3D 1,n + do j =3D 1,n + write (*, fmt=3D"(i4)", advance=3D'no') m(j, i) + end do + write (*, *) "" + end do + write (*, *) "" + end subroutine + +end module matrix + +program main + use matrix + implicit none + + integer, allocatable :: a(:,:),b(:,:),c(:,:) + integer :: i,j + + allocate(a( n, m )) + allocate(b( n, m )) + + do i =3D 1,n + do j =3D 1,m + a(j,i) =3D merge(1,0, i.eq.j) + b(j,i) =3D j + end do + end do + + c =3D mult (a, b) + + call print_matrix (a) + call print_matrix (b) + call print_matrix (c) + + do i =3D 1,n + do j =3D 1,m + if (b(i,j) .ne. c(i,j)) call abort () + end do + end do + + +end program main -- 2.36.1 ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstra=DFe 201, 8= 0634 M=FCnchen; Gesellschaft mit beschr=E4nkter Haftung; Gesch=E4ftsf=FChre= r: Thomas Heurung, Frank Th=FCrauf; Sitz der Gesellschaft: M=FCnchen; Regis= tergericht M=FCnchen, HRB 106955