From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [OG10] Fortran: delinearize multi-dimensional array accesses
To: Thomas Koenig, gcc-patches@gcc.gnu.org, fortran@gcc.gnu.org, Tobias Burnus
References: <5d972896-d5ef-df64-a663-a3a339b99e70@netcologne.de>
From: Sandra Loosemore
Message-ID: <3f8fc2fd-b164-d004-eda9-bd559a0b52be@codesourcery.com>
Date: Thu, 7 Jan 2021 12:35:36 -0700
MIME-Version: 1.0
In-Reply-To: <5d972896-d5ef-df64-a663-a3a339b99e70@netcologne.de>
Content-Type: multipart/mixed; boundary="------------D5E78FF1E0F3406BE059146C"
Content-Language: en-US
List-Id: Gcc-patches mailing list

--------------D5E78FF1E0F3406BE059146C
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 8bit

On 12/26/20 3:41 AM, Thomas Koenig wrote:
>
> Hi Sandra,
>
>> The attached patch implements delinearization of array accesses in the
>> Fortran front end, something that has been discussed for a long time.
>
> Definitely - among others, this is the subject of PR 14741, which is by
> now quite historic.
>
>> I've been asked to try to get this patch committed on the OG10 branch
>> since it is blocking some further optimization work with Graphite for
>> OpenACC kernels regions.  I have a mainline version of this patch as
>> well that I can send to anyone interested in trying it out, but TBH, I
>> don't think this is ready for mainline yet.
>
> That would be indeed interesting.  Could you post that to the list as
> well?

Attached to this mail.

>> The current status is that
>> there are still two gfortran tests that are regressing
>> (gfortran.dg/graphite/id-9.f and
>> gfortran.dg/vect/fast-math-mgrid-resid.f), and while it's been
>> confirmed that this helps with Graphite optimizations as intended, we
>> haven't yet run any benchmarks to confirm that it doesn't make other
>> things slower.
>
> It is probably too late; this could go in for the next stage 1.

Yes, I hope we can get this more polished by that time.

> Does the patch actually make loop interchange for matrix multiplication
> work (done with C for loops or Fortran DO loops), or is there additional
> work required?

There's some additional work required, and probably not by me since I
know next to nothing about Graphite or GCC's loop optimization framework
generally.  :-(  (The motivation for this patch at this time is that we
need the delinearization for some ongoing OpenACC parallelization work.)

> Regarding scalarized loops: We still do not collapse loops for
>
>   subroutine foo(a)
>     real, dimension(:,:), contiguous :: a
>     a = 5.
>   end subroutine foo
>
> so an extension to scalarized loops would be quite valuable.

Yes.  I guesstimated that would be a medium-sized project since the
scalarized references are generated by a completely different code path.

> It would be interesting to see if
>
>   subroutine foo(a,n,m)
>     real, dimension(n,m) :: a
>     do j=1,m
>       do i=1,n
>         a(i,j) = 5.
>       end do
>     end do
>   end subroutine foo
>
> is collapsed to a single loop with the patch and Graphite.

This one also requires some additional work.  At least Graphite
recognizes the SCoP around the loop nest with this patch, which it did
not do without delinearization.

-Sandra

--------------D5E78FF1E0F3406BE059146C
Content-Type: text/x-patch; charset="UTF-8";
 name="delinearize-mainline-dec17.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="delinearize-mainline-dec17.patch"

>From 2e714885f2aab9c24b6f3d37c995e7e4a5913ee5 Mon Sep 17 00:00:00 2001
From: Sandra Loosemore
Date: Thu, 7 Jan 2021 11:09:03 -0800
Subject: [PATCH] Fortran: delinearize multi-dimensional array accesses

The Fortran front end presently linearizes accesses to
multi-dimensional arrays by combining the indices for the various
dimensions into a series of explicit multiplies and adds with
refactoring to allow CSE of invariant parts of the computation.
Unfortunately this representation interferes with Graphite-based loop
optimizations.  It is difficult to recover the original
multi-dimensional form of the access by the time loop optimizations
run because parts of it have already been optimized away or into a
form that is not easily recognizable, so it seems better to have the
Fortran front end produce delinearized accesses to begin with, a set
of nested ARRAY_REFs similar to the existing behavior of the C and C++
front ends.  This is a long-standing problem that has previously been
discussed e.g. in PR 14741 and PR61000.

This patch is an initial implementation for explicit array accesses
only; it doesn't handle the accesses generated during scalarization of
whole-array or array-section operations, which follow a different code
path.

2020-12-17  Sandra Loosemore
	    Tobias Burnus

	gcc/
	* expr.c (get_inner_reference): Handle NOP_EXPR like
	VIEW_CONVERT_EXPR.

	gcc/fortran/
	* lang.opt (-param=delinearize=): New.
	* trans-array.c (get_class_array_vptr): New, split from...
	(build_array_ref): ...here.
	(get_array_lbound, get_array_ubound): New, split from...
	(gfc_conv_array_ref): ...here.  Additional code refactoring
	plus support for delinearization of the array access.

	gcc/testsuite/
	* gfortran.dg/assumed_type_2.f90: Adjust patterns.
	* gfortran.dg/goacc/kernels-loop-inner.f95: Likewise.
	* gfortran.dg/graphite/block-3.f90: Remove xfails.
	* gfortran.dg/graphite/block-4.f90: Likewise.
	* gfortran.dg/inline_matmul_24.f90: Adjust patterns.
	* gfortran.dg/no_arg_check_2.f90: Likewise.
	* gfortran.dg/pr32921.f: Likewise.
	* gfortran.dg/reassoc_4.f: Disable delinearization for this test.
---
 gcc/expr.c                                         |   1 +
 gcc/fortran/lang.opt                               |   4 +
 gcc/fortran/trans-array.c                          | 321 ++++++++++++++++-----
 gcc/testsuite/gfortran.dg/assumed_type_2.f90       |   6 +-
 .../gfortran.dg/goacc/kernels-loop-inner.f95       |   2 +-
 gcc/testsuite/gfortran.dg/graphite/block-3.f90     |   1 -
 gcc/testsuite/gfortran.dg/graphite/block-4.f90     |   1 -
 gcc/testsuite/gfortran.dg/inline_matmul_24.f90     |   2 +-
 gcc/testsuite/gfortran.dg/no_arg_check_2.f90       |   6 +-
 gcc/testsuite/gfortran.dg/pr32921.f                |   2 +-
 gcc/testsuite/gfortran.dg/reassoc_4.f              |   2 +-
 11 files changed, 256 insertions(+), 92 deletions(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index 33934d6..dc74b5f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -7530,6 +7530,7 @@ get_inner_reference (tree exp, poly_int64_pod *pbitsize,
 	  break;
 
 	case VIEW_CONVERT_EXPR:
+	case NOP_EXPR:
 	  break;
 
 	case MEM_REF:
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 57b0264..11cb086 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -513,6 +513,10 @@ fdefault-real-16
 Fortran Var(flag_default_real_16)
 Set the default real kind to an 16 byte wide type.
 
+-param=delinearize=
+Common Joined UInteger Var(flag_delinearize_aref) Init(1) IntegerRange(0,1) Param Optimization
+Delinearize array references.
+
 fdollar-ok
 Fortran Var(flag_dollar_ok)
 Allow dollar signs in entity names.
diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c index 33e05be..4f52ac9 100644 --- a/gcc/fortran/trans-array.c +++ b/gcc/fortran/trans-array.c @@ -3637,11 +3637,9 @@ add_to_offset (tree *cst_offset, tree *offset, tree t) } } - static tree -build_array_ref (tree desc, tree offset, tree decl, tree vptr) +get_class_array_vptr (tree desc, tree vptr) { - tree tmp; tree type; tree cdesc; @@ -3665,19 +3663,74 @@ build_array_ref (tree desc, tree offset, tree decl, tree vptr) && GFC_CLASS_TYPE_P (TYPE_CANONICAL (type))) vptr = gfc_class_vptr_get (TREE_OPERAND (cdesc, 0)); } + return vptr; +} +static tree +build_array_ref (tree desc, tree offset, tree decl, tree vptr) +{ + tree tmp; + vptr = get_class_array_vptr (desc, vptr); tmp = gfc_conv_array_data (desc); tmp = build_fold_indirect_ref_loc (input_location, tmp); tmp = gfc_build_array_ref (tmp, offset, decl, vptr); return tmp; } +/* Get the declared lower bound for rank N of array DECL which might + be either a bare array or a descriptor. This differs from + gfc_conv_array_lbound because it gets information for temporary array + objects from AR instead of the descriptor (they can differ). */ + +static tree +get_array_lbound (tree decl, int n, gfc_symbol *sym, + gfc_array_ref *ar, gfc_se *se) +{ + if (sym->attr.temporary) + { + gfc_se tmpse; + gfc_init_se (&tmpse, se); + gfc_conv_expr_type (&tmpse, ar->as->lower[n], gfc_array_index_type); + gfc_add_block_to_block (&se->pre, &tmpse.pre); + return tmpse.expr; + } + else + return gfc_conv_array_lbound (decl, n); +} + +/* Similarly for the upper bound. */ +static tree +get_array_ubound (tree decl, int n, gfc_symbol *sym, + gfc_array_ref *ar, gfc_se *se) +{ + if (sym->attr.temporary) + { + gfc_se tmpse; + gfc_init_se (&tmpse, se); + gfc_conv_expr_type (&tmpse, ar->as->upper[n], gfc_array_index_type); + gfc_add_block_to_block (&se->pre, &tmpse.pre); + return tmpse.expr; + } + else + return gfc_conv_array_ubound (decl, n); +} + /* Build an array reference. 
se->expr already holds the array descriptor. This should be either a variable, indirect variable reference or component reference. For arrays which do not have a descriptor, se->expr will be the data pointer. - a(i, j, k) = base[offset + i * stride[0] + j * stride[1] + k * stride[2]]*/ + + There are two strategies here. In the traditional case, multidimensional + arrays are explicitly linearized into a one-dimensional array, with the + index computed as if by + a(i, j, k) = base[offset + i * stride[0] + j * stride[1] + k * stride[2]] + + However, we can often get better code using the Graphite framework + and scalar evolutions in the middle end, which expects to see + multidimensional array accesses represented as nested ARRAY_REFs, similar + to what the C/C++ front ends produce. Delinearization is controlled + by flag_delinearize_aref. */ void gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr, @@ -3688,11 +3741,16 @@ gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr, tree tmp; tree stride; tree decl = NULL_TREE; + tree cooked_decl = NULL_TREE; + tree vptr = se->class_vptr; gfc_se indexse; gfc_se tmpse; gfc_symbol * sym = expr->symtree->n.sym; char *var_name = NULL; + tree aref = NULL_TREE; + tree atype = NULL_TREE; + /* Handle coarrays. */ if (ar->dimen == 0) { gcc_assert (ar->codimen || sym->attr.select_rank_temporary @@ -3752,15 +3810,160 @@ gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr, } } + /* Per comments above, DECL is not always a declaration. It may be + either a variable, indirect variable reference, or component + reference. It may have array or pointer type, or it may be a + descriptor with RECORD_TYPE. 
*/ decl = se->expr; if (IS_CLASS_ARRAY (sym) && sym->attr.dummy && ar->as->type != AS_DEFERRED) decl = sym->backend_decl; - cst_offset = offset = gfc_index_zero_node; - add_to_offset (&cst_offset, &offset, gfc_conv_array_offset (decl)); + /* A pointer array component can be detected from its field decl. Fix + the descriptor, mark the resulting variable decl and store it in + COOKED_DECL to pass to gfc_build_array_ref. */ + if (get_CFI_desc (sym, expr, &cooked_decl, ar)) + cooked_decl = build_fold_indirect_ref_loc (input_location, cooked_decl); + if (!expr->ts.deferred && !sym->attr.codimension + && is_pointer_array (se->expr)) + { + if (TREE_CODE (se->expr) == COMPONENT_REF) + cooked_decl = se->expr; + else if (TREE_CODE (se->expr) == INDIRECT_REF) + cooked_decl = TREE_OPERAND (se->expr, 0); + else + cooked_decl = se->expr; + } + else if (expr->ts.deferred + || (sym->ts.type == BT_CHARACTER + && sym->attr.select_type_temporary)) + { + if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se->expr))) + { + cooked_decl = se->expr; + if (TREE_CODE (cooked_decl) == INDIRECT_REF) + cooked_decl = TREE_OPERAND (cooked_decl, 0); + } + else + cooked_decl = sym->backend_decl; + } + else if (sym->ts.type == BT_CLASS) + { + if (UNLIMITED_POLY (sym)) + { + gfc_expr *class_expr = gfc_find_and_cut_at_last_class_ref (expr); + gfc_init_se (&tmpse, NULL); + gfc_conv_expr (&tmpse, class_expr); + if (!se->class_vptr) + vptr = gfc_class_vptr_get (tmpse.expr); + gfc_free_expr (class_expr); + cooked_decl = tmpse.expr; + } + else + cooked_decl = NULL_TREE; + } + + /* Find the base of the array; this normally has ARRAY_TYPE. */ + tree base = build_fold_indirect_ref_loc (input_location, + gfc_conv_array_data (se->expr)); + tree type = TREE_TYPE (base); + + /* Handle special cases, copied from gfc_build_array_ref. After we get + through this, we know TYPE definitely is an ARRAY_TYPE. 
*/ + if (GFC_ARRAY_TYPE_P (type) && GFC_TYPE_ARRAY_RANK (type) == 0) + { + gcc_assert (GFC_TYPE_ARRAY_CORANK (type) > 0); + se->expr = fold_convert (TYPE_MAIN_VARIANT (type), base); + return; + } + if (TREE_CODE (type) != ARRAY_TYPE) + { + gcc_assert (cooked_decl == NULL_TREE); + se->expr = base; + return; + } - /* Calculate the offsets from all the dimensions. Make sure to associate - the final offset so that we form a chain of loop invariant summands. */ + /* Check for cases where we cannot delinearize. */ + + bool delinearize = flag_delinearize_aref; + + /* There is no point in trying to delinearize 1-dimensional arrays. */ + if (ar->dimen == 1) + delinearize = false; + + if (delinearize + && (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se->expr)) + || (DECL_P (se->expr) + && DECL_LANG_SPECIFIC (se->expr) + && GFC_DECL_SAVED_DESCRIPTOR (se->expr)))) + { + /* Descriptor arrays that may not be contiguous cannot + be delinearized without using the stride in the descriptor, + which generally involves introducing a division operation. + That's unlikely to produce optimal code, so avoid doing it. */ + tree desc = se->expr; + if (!GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se->expr))) + desc = GFC_DECL_SAVED_DESCRIPTOR (se->expr); + tree tmptype = TREE_TYPE (desc); + if (POINTER_TYPE_P (tmptype)) + tmptype = TREE_TYPE (tmptype); + enum gfc_array_kind akind = GFC_TYPE_ARRAY_AKIND (tmptype); + if (akind != GFC_ARRAY_ASSUMED_SHAPE_CONT + && akind != GFC_ARRAY_ASSUMED_RANK_CONT + && akind != GFC_ARRAY_ALLOCATABLE + && akind != GFC_ARRAY_POINTER_CONT) + delinearize = false; + } + + /* See gfc_build_array_ref in trans.c. If we have a cooked_decl or + vptr, then we most likely have to do pointer arithmetic using a + linearized array offset. */ + if (delinearize && cooked_decl) + delinearize = false; + else if (delinearize && get_class_array_vptr (se->expr, vptr)) + delinearize = false; + + if (!delinearize) + { + /* Initialize the offset from the array descriptor. 
This accounts + for the array base being something other than zero. */ + cst_offset = offset = gfc_index_zero_node; + add_to_offset (&cst_offset, &offset, gfc_conv_array_offset (decl)); + } + else + { + /* If we are delinearizing, build up the nested array type using the + dimension information we have for each rank. */ + atype = TREE_TYPE (type); + for (n = 0; n < ar->dimen; n++) + { + /* We're working from the outermost nested array reference inward + in this step. ATYPE is the element type for the access in + this rank; build the new array type based on the bounds + information and store it back into ATYPE for the next rank's + processing. */ + tree lbound = get_array_lbound (decl, n, sym, ar, se); + tree ubound = get_array_ubound (decl, n, sym, ar, se); + tree dimen = build_range_type (TREE_TYPE (lbound), + lbound, ubound); + atype = build_array_type (atype, dimen); + + /* Emit a DECL_EXPR for the array type so the gimplification of + its type sizes works correctly. */ + if (! TYPE_NAME (atype)) + TYPE_NAME (atype) = build_decl (UNKNOWN_LOCATION, TYPE_DECL, + NULL_TREE, atype); + gfc_add_expr_to_block (&se->pre, + build1 (DECL_EXPR, atype, + TYPE_NAME (atype))); + } + + /* Cast base to the innermost array type. */ + if (DECL_P (base)) + TREE_ADDRESSABLE (base) = 1; + aref = build1 (NOP_EXPR, atype, base); + } + + /* Process indices in reverse order. */ for (n = ar->dimen - 1; n >= 0; n--) { /* Calculate the index for this dimension. */ @@ -3778,16 +3981,7 @@ gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr, indexse.expr = save_expr (indexse.expr); /* Lower bound. 
*/ - tmp = gfc_conv_array_lbound (decl, n); - if (sym->attr.temporary) - { - gfc_init_se (&tmpse, se); - gfc_conv_expr_type (&tmpse, ar->as->lower[n], - gfc_array_index_type); - gfc_add_block_to_block (&se->pre, &tmpse.pre); - tmp = tmpse.expr; - } - + tmp = get_array_lbound (decl, n, sym, ar, se); cond = fold_build2_loc (input_location, LT_EXPR, logical_type_node, indexse.expr, tmp); msg = xasprintf ("Index '%%ld' of dimension %d of array '%s' " @@ -3802,16 +3996,7 @@ gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr, arrays. */ if (n < ar->dimen - 1 || ar->as->type != AS_ASSUMED_SIZE) { - tmp = gfc_conv_array_ubound (decl, n); - if (sym->attr.temporary) - { - gfc_init_se (&tmpse, se); - gfc_conv_expr_type (&tmpse, ar->as->upper[n], - gfc_array_index_type); - gfc_add_block_to_block (&se->pre, &tmpse.pre); - tmp = tmpse.expr; - } - + tmp = get_array_ubound (decl, n, sym, ar, se); cond = fold_build2_loc (input_location, GT_EXPR, logical_type_node, indexse.expr, tmp); msg = xasprintf ("Index '%%ld' of dimension %d of array '%s' " @@ -3824,65 +4009,41 @@ gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr, } } - /* Multiply the index by the stride. */ - stride = gfc_conv_array_stride (decl, n); - tmp = fold_build2_loc (input_location, MULT_EXPR, gfc_array_index_type, - indexse.expr, stride); - - /* And add it to the total. */ - add_to_offset (&cst_offset, &offset, tmp); - } - - if (!integer_zerop (cst_offset)) - offset = fold_build2_loc (input_location, PLUS_EXPR, - gfc_array_index_type, offset, cst_offset); - - /* A pointer array component can be detected from its field decl. Fix - the descriptor, mark the resulting variable decl and pass it to - build_array_ref. 
*/ - decl = NULL_TREE; - if (get_CFI_desc (sym, expr, &decl, ar)) - decl = build_fold_indirect_ref_loc (input_location, decl); - if (!expr->ts.deferred && !sym->attr.codimension - && is_pointer_array (se->expr)) - { - if (TREE_CODE (se->expr) == COMPONENT_REF) - decl = se->expr; - else if (TREE_CODE (se->expr) == INDIRECT_REF) - decl = TREE_OPERAND (se->expr, 0); - else - decl = se->expr; - } - else if (expr->ts.deferred - || (sym->ts.type == BT_CHARACTER - && sym->attr.select_type_temporary)) - { - if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se->expr))) + if (!delinearize) { - decl = se->expr; - if (TREE_CODE (decl) == INDIRECT_REF) - decl = TREE_OPERAND (decl, 0); + /* Multiply the index by the stride. */ + stride = gfc_conv_array_stride (decl, n); + tmp = fold_build2_loc (input_location, MULT_EXPR, + gfc_array_index_type, + indexse.expr, stride); + + /* And add it to the total. */ + add_to_offset (&cst_offset, &offset, tmp); } else - decl = sym->backend_decl; - } - else if (sym->ts.type == BT_CLASS) - { - if (UNLIMITED_POLY (sym)) { - gfc_expr *class_expr = gfc_find_and_cut_at_last_class_ref (expr); - gfc_init_se (&tmpse, NULL); - gfc_conv_expr (&tmpse, class_expr); - if (!se->class_vptr) - se->class_vptr = gfc_class_vptr_get (tmpse.expr); - gfc_free_expr (class_expr); - decl = tmpse.expr; + /* Peel off a layer of array nesting from ATYPE + to get the result type of the new ARRAY_REF. */ + atype = TREE_TYPE (atype); + aref = build4 (ARRAY_REF, atype, aref, indexse.expr, + NULL_TREE, NULL_TREE); } - else - decl = NULL_TREE; } - se->expr = build_array_ref (se->expr, offset, decl, se->class_vptr); + if (!delinearize) + { + /* Build a linearized array reference using the offset from all + dimensions. 
*/ + if (!integer_zerop (cst_offset)) + offset = fold_build2_loc (input_location, PLUS_EXPR, + gfc_array_index_type, offset, cst_offset); + se->class_vptr = vptr; + vptr = get_class_array_vptr (se->expr, vptr); + se->expr = gfc_build_array_ref (base, offset, cooked_decl, vptr); + } + else + /* Return the outermost ARRAY_REF we already built. */ + se->expr = aref; } diff --git a/gcc/testsuite/gfortran.dg/assumed_type_2.f90 b/gcc/testsuite/gfortran.dg/assumed_type_2.f90 index 5d3cd7e..07be87e 100644 --- a/gcc/testsuite/gfortran.dg/assumed_type_2.f90 +++ b/gcc/testsuite/gfortran.dg/assumed_type_2.f90 @@ -147,12 +147,12 @@ end ! { dg-final { scan-tree-dump-times "sub_scalar .&scalar_int," 1 "original" } } ! { dg-final { scan-tree-dump-times "sub_scalar .&scalar_t1," 1 "original" } } -! { dg-final { scan-tree-dump-times "sub_scalar .&array_int.1.," 1 "original" } } +! { dg-final { scan-tree-dump-times "sub_scalar .&.*array_int" 1 "original" } } ! { dg-final { scan-tree-dump-times "sub_scalar .&scalar_t1," 1 "original" } } -! { dg-final { scan-tree-dump-times "sub_scalar .&\\(.\\(real.kind=4..0:. . restrict\\) array_real_alloc.data" 1 "original" } } +! { dg-final { scan-tree-dump-times "sub_scalar .&.*real.kind=4..0.*restrict.*array_real_alloc.data" 1 "original" } } ! { dg-final { scan-tree-dump-times "sub_scalar .\\(character.kind=1..1:1. .\\) .array_char_ptr.data" 1 "original" } } -! { dg-final { scan-tree-dump-times "sub_scalar .&\\(.\\(struct t2.0:. . restrict\\) array_t2_alloc.data" 1 "original" } } +! { dg-final { scan-tree-dump-times "sub_scalar .&.*struct t2.0:..*restrict.*array_t2_alloc.data" 1 "original" } } ! { dg-final { scan-tree-dump-times "sub_scalar .\\(struct t3 .\\) .array_t3_ptr.data" 1 "original" } } ! { dg-final { scan-tree-dump-times "sub_scalar .\\(struct t1 .\\) array_class_t1_alloc._data.data" 1 "original" } } ! 
{ dg-final { scan-tree-dump-times "sub_scalar .\\(struct t1 .\\) \\(array_class_t1_ptr._data.dat" 1 "original" } } diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 index a3ad591..d8d14c4 100644 --- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 @@ -7,7 +7,7 @@ program main integer :: a(100,100), b(100,100) integer :: i, j, d - !$acc kernels ! { dg-message "optimized: assigned OpenACC seq loop parallelism" } + !$acc kernels ! { dg-message "optimized: assigned OpenACC gang loop parallelism" } do i=1,100 do j=1,100 a(i,j) = 1 diff --git a/gcc/testsuite/gfortran.dg/graphite/block-3.f90 b/gcc/testsuite/gfortran.dg/graphite/block-3.f90 index 452de73..60c7952 100644 --- a/gcc/testsuite/gfortran.dg/graphite/block-3.f90 +++ b/gcc/testsuite/gfortran.dg/graphite/block-3.f90 @@ -12,6 +12,5 @@ enddo end subroutine matrix_multiply -! { dg-final { scan-tree-dump-times "number of SCoPs: 1" 1 "graphite" { xfail *-*-* } } } ! { dg-final { scan-tree-dump-times "will be loop blocked" 1 "graphite" { xfail *-*-* } } } diff --git a/gcc/testsuite/gfortran.dg/graphite/block-4.f90 b/gcc/testsuite/gfortran.dg/graphite/block-4.f90 index 42af5b6..1bc7a1b 100644 --- a/gcc/testsuite/gfortran.dg/graphite/block-4.f90 +++ b/gcc/testsuite/gfortran.dg/graphite/block-4.f90 @@ -15,6 +15,5 @@ enddo end subroutine matrix_multiply -! { dg-final { scan-tree-dump-times "number of SCoPs: 1" 1 "graphite" { xfail *-*-* } } } ! { dg-final { scan-tree-dump-times "will be loop blocked" 1 "graphite" { xfail *-*-* } } } diff --git a/gcc/testsuite/gfortran.dg/inline_matmul_24.f90 b/gcc/testsuite/gfortran.dg/inline_matmul_24.f90 index 3168d5f..8d84f3c 100644 --- a/gcc/testsuite/gfortran.dg/inline_matmul_24.f90 +++ b/gcc/testsuite/gfortran.dg/inline_matmul_24.f90 @@ -39,4 +39,4 @@ program testMATMUL call abort() end if end program testMATMUL -! 
{ dg-final { scan-tree-dump-times "gamma5\\\[__var_1_do \\* 4 \\+ __var_2_do\\\]|gamma5\\\[NON_LVALUE_EXPR <__var_1_do> \\* 4 \\+ NON_LVALUE_EXPR <__var_2_do>\\\]" 1 "original" } } +! { dg-final { scan-tree-dump-times "gamma5.*\\\[NON_LVALUE_EXPR <__var_1_do>\\\]\\\[NON_LVALUE_EXPR <__var_2_do>\\\]" 1 "original" } } diff --git a/gcc/testsuite/gfortran.dg/no_arg_check_2.f90 b/gcc/testsuite/gfortran.dg/no_arg_check_2.f90 index 3570b97..0900dd8 100644 --- a/gcc/testsuite/gfortran.dg/no_arg_check_2.f90 +++ b/gcc/testsuite/gfortran.dg/no_arg_check_2.f90 @@ -129,12 +129,12 @@ end ! { dg-final { scan-tree-dump-times "sub_scalar .&scalar_int," 1 "original" } } ! { dg-final { scan-tree-dump-times "sub_scalar .&scalar_t1," 1 "original" } } -! { dg-final { scan-tree-dump-times "sub_scalar .&array_int.1.," 1 "original" } } +! { dg-final { scan-tree-dump-times "sub_scalar .&.*array_int" 1 "original" } } ! { dg-final { scan-tree-dump-times "sub_scalar .&scalar_t1," 1 "original" } } -! { dg-final { scan-tree-dump-times "sub_scalar .&\\(.\\(real.kind=4..0:. . restrict\\) array_real_alloc.data" 1 "original" } } +! { dg-final { scan-tree-dump-times "sub_scalar .&.*real.kind=4..0.*restrict.*array_real_alloc.data" 1 "original" } } ! { dg-final { scan-tree-dump-times "sub_scalar .\\(character.kind=1..1:1. .\\) .array_char_ptr.data" 1 "original" } } -! { dg-final { scan-tree-dump-times "sub_scalar .&\\(.\\(struct t2.0:. . restrict\\) array_t2_alloc.data" 1 "original" } } +! { dg-final { scan-tree-dump-times "sub_scalar .&.*struct t2.0:..*restrict.*array_t2_alloc.data" 1 "original" } } ! { dg-final { scan-tree-dump-times "sub_scalar .\\(struct t3 .\\) .array_t3_ptr.data" 1 "original" } } ! { dg-final { scan-tree-dump-times "sub_scalar .\\(struct t1 .\\) array_class_t1_alloc._data.data" 1 "original" } } ! 
{ dg-final { scan-tree-dump-times "sub_scalar .\\(struct t1 .\\) \\(array_class_t1_ptr._data.dat" 1 "original" } } diff --git a/gcc/testsuite/gfortran.dg/pr32921.f b/gcc/testsuite/gfortran.dg/pr32921.f index 0661208..8534386 100644 --- a/gcc/testsuite/gfortran.dg/pr32921.f +++ b/gcc/testsuite/gfortran.dg/pr32921.f @@ -45,4 +45,4 @@ RETURN END -! { dg-final { scan-tree-dump-times "stride" 4 "lim2" } } +! { dg-final { scan-tree-dump-times "ubound" 4 "lim2" } } diff --git a/gcc/testsuite/gfortran.dg/reassoc_4.f b/gcc/testsuite/gfortran.dg/reassoc_4.f index fdcb46e..2368b76 100644 --- a/gcc/testsuite/gfortran.dg/reassoc_4.f +++ b/gcc/testsuite/gfortran.dg/reassoc_4.f @@ -1,5 +1,5 @@ ! { dg-do compile } -! { dg-options "-O3 -ffast-math -fdump-tree-reassoc1 --param max-completely-peeled-insns=200" } +! { dg-options "-O3 -ffast-math -fdump-tree-reassoc1 --param max-completely-peeled-insns=200 --param delinearize=0" } subroutine anisonl(w,vo,anisox,s,ii1,jj1,weight) integer ii1,jj1,i1,iii1,j1,jjj1,k1,l1,m1,n1 real*8 w(3,3),vo(3,3),anisox(3,3,3,3),s(60,60),weight -- 2.8.1 --------------D5E78FF1E0F3406BE059146C--
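
For readers comparing the two strategies described in the comment above gfc_conv_array_ref, here is a small standalone C sketch, not part of the patch, that contrasts the linearized form base[offset + i * stride[0] + j * stride[1]] with a nested, one-subscript-per-dimension access of the kind the C and C++ front ends produce. The extents N and M, the 1-based bounds, and the helper names linearized_ref, nested_ref and check_equivalence are all assumptions made for illustration; none of them exist in GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical extents for a Fortran array declared as a(1:N, 1:M).  */
enum { N = 3, M = 4 };

/* Linearized access: every index is multiplied by its stride and the
   products are summed.  Fortran storage is column-major, so the first
   dimension has stride 1 and the second has stride N.  The offset term
   folds in the lower bounds (both 1 here).  */
int *
linearized_ref (int *base, ptrdiff_t i, ptrdiff_t j)
{
  const ptrdiff_t stride[2] = { 1, N };
  const ptrdiff_t offset = -(1 * stride[0] + 1 * stride[1]);
  return &base[offset + i * stride[0] + j * stride[1]];
}

/* Delinearized access: the flat storage is viewed as an array of
   M columns of N elements each and indexed once per dimension, slowest
   dimension first.  C arrays are 0-based, so the lower bound of 1 is
   subtracted explicitly in this sketch.  */
int *
nested_ref (int (*a)[N], ptrdiff_t i, ptrdiff_t j)
{
  return &a[j - 1][i - 1];
}

/* Walk every valid (i, j) pair and verify that both forms name the same
   element.  Returns the number of pairs checked.  */
int
check_equivalence (void)
{
  static int storage[N * M];
  int checked = 0;

  for (ptrdiff_t j = 1; j <= M; j++)
    for (ptrdiff_t i = 1; i <= N; i++)
      {
        assert (linearized_ref (storage, i, j)
                == nested_ref ((int (*)[N]) storage, i, j));
        checked++;
      }
  return checked;
}
```

Calling check_equivalence () from any driver visits all N * M index pairs. Both helpers compute the same addresses; the difference is only that the nested form keeps the per-dimension subscripts visible, which is the property the commit message says Graphite needs.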