From mboxrd@z Thu Jan  1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
Subject: Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def
To: Richard Biener
References: <5640BD31.2060602@mentor.com> <5640FB07.6010008@mentor.com> <5649C41A.40403@mentor.com> <564DA4CA.3020506@mentor.com> <564F1F85.1000108@mentor.com> <564F4B72.8010605@mentor.com>
CC: "gcc-patches@gnu.org", Jakub Jelinek
From: Tom de Vries
Message-ID: <565455C9.7000206@mentor.com>
Date: Tue, 24 Nov 2015 12:22:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------070401010205090102020101"
X-SW-Source: 2015-11/txt/msg02886.txt.bz2

--------------070401010205090102020101
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Content-length: 1113

On 23/11/15 11:02, Richard Biener wrote:
> On Fri, 20 Nov 2015, Tom de Vries wrote:
>
>> On 20/11/15 14:29, Richard Biener wrote:
>>> I agree it's somewhat of an odd behavior but all passes should
>>> either be placed in a sub-pipeline with an outer
>>> loop_optimizer_init()/finalize () call or call both themselves.
>>
>> Hmm, but adding loop_optimizer_finalize at the end of pass_lim breaks the loop
>> pipeline.
>>
>> We could use the style used in pass_slp_vectorize::execute:
>> ...
>> pass_slp_vectorize::execute (function *fun)
>> {
>>    basic_block bb;
>>
>>    bool in_loop_pipeline = scev_initialized_p ();
>>    if (!in_loop_pipeline)
>>      {
>>        loop_optimizer_init (LOOPS_NORMAL);
>>        scev_initialize ();
>>      }
>>
>> ...
>>
>>    if (!in_loop_pipeline)
>>      {
>>        scev_finalize ();
>>        loop_optimizer_finalize ();
>>      }
>> ...
>>
>> Although that doesn't strike me as particularly clean.
>
> At least it would be a consistent "unclean" style.  So yes, the
> above would work for me.
>

Reposting using the in_loop_pipeline style in pass_lim.

Thanks,
- Tom

--------------070401010205090102020101
Content-Type: text/x-patch;
 name="0004-Add-pass_oacc_kernels-pass-group-in-passes.def.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="0004-Add-pass_oacc_kernels-pass-group-in-passes.def.patch"
Content-length: 3891

Add pass_oacc_kernels pass group in passes.def

2015-11-09  Tom de Vries

	* omp-low.c (pass_expand_omp_ssa::clone): New function.
	* passes.def: Add pass_oacc_kernels pass group.
	* tree-ssa-loop-ch.c (pass_ch::clone): New function.
	* tree-ssa-loop-im.c (tree_ssa_lim): Make static.
	(pass_lim::execute): Allow to run outside pass_tree_loop.

---
 gcc/omp-low.c          |  1 +
 gcc/passes.def         | 18 ++++++++++++++++++
 gcc/tree-ssa-loop-ch.c |  2 ++
 gcc/tree-ssa-loop-im.c | 12 ++++++++++--
 4 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index efe5d3a..7318b0e 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -13366,6 +13366,7 @@ public:
       return !(fun->curr_properties & PROP_gimple_eomp);
     }
   virtual unsigned int execute (function *) { return execute_expand_omp (); }
+  opt_pass * clone () { return new pass_expand_omp_ssa (m_ctxt); }
 
 }; // class pass_expand_omp_ssa
 
diff --git a/gcc/passes.def b/gcc/passes.def
index 17027786..f1969c0 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -88,7 +88,25 @@ along with GCC; see the file COPYING3.  If not see
 	  /* pass_build_ealias is a dummy pass that ensures that we
 	     execute TODO_rebuild_alias at this point.  */
 	  NEXT_PASS (pass_build_ealias);
+	  /* Pass group that runs when the function is an offloaded function
+	     containing oacc kernels loops.  Part 1.  */
+	  NEXT_PASS (pass_oacc_kernels);
+	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
+	      NEXT_PASS (pass_ch);
+	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_fre);
+	  /* Pass group that runs when the function is an offloaded function
+	     containing oacc kernels loops.  Part 2.  */
+	  NEXT_PASS (pass_oacc_kernels2);
+	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels2)
+	      /* We use pass_lim to rewrite in-memory iteration and reduction
+		 variable accesses in loops into local variables accesses.  */
+	      NEXT_PASS (pass_lim);
+	      NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
+	      NEXT_PASS (pass_dce);
+	      NEXT_PASS (pass_parallelize_loops_oacc_kernels);
+	      NEXT_PASS (pass_expand_omp_ssa);
+	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_merge_phi);
 	  NEXT_PASS (pass_dse);
 	  NEXT_PASS (pass_cd_dce);
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index 7e618bf..6493fcc 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -165,6 +165,8 @@ public:
   /* Initialize and finalize loop structures, copying headers inbetween.  */
   virtual unsigned int execute (function *);
 
+  opt_pass * clone () { return new pass_ch (m_ctxt); }
+
 protected:
   /* ch_base method: */
   virtual bool process_loop_p (struct loop *loop);
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 30b53ce..0d82d36 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-propagate.h"
 #include "trans-mem.h"
 #include "gimple-fold.h"
+#include "tree-scalar-evolution.h"
 
 /* TODO:  Support for predicated code motion.  I.e.
 
@@ -2496,7 +2497,7 @@ tree_ssa_lim_finalize (void)
 /* Moves invariants from loops.  Only "expensive" invariants are moved out --
    i.e. those that are likely to be win regardless of the register pressure.  */
 
-unsigned int
+static unsigned int
 tree_ssa_lim (void)
 {
   unsigned int todo;
@@ -2560,10 +2561,17 @@ public:
 unsigned int
 pass_lim::execute (function *fun)
 {
+  bool in_loop_pipeline = scev_initialized_p ();
+  if (!in_loop_pipeline)
+    loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
+
   if (number_of_loops (fun) <= 1)
     return 0;
+  unsigned int todo = tree_ssa_lim ();
 
-  return tree_ssa_lim ();
+  if (!in_loop_pipeline)
+    loop_optimizer_finalize ();
+  return todo;
 }
 
 } // anon namespace

--------------070401010205090102020101--
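
For readers who want to try the in_loop_pipeline idea outside of GCC, here is a minimal standalone sketch of the guard discussed in this message. It is only a model under stated assumptions: LoopState, g_state and every mock_* function are hypothetical stand-ins invented for illustration, not GCC APIs, and the "pass" does no real work. It shows only the control flow: initialize loop state when no enclosing pipeline has done so, and finalize only what this call initialized.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's global loop and SCEV state.
// None of these names exist in GCC; they only model the idea.
struct LoopState
{
  bool loops_ready = false;  // set by the mock loop_optimizer_init
  bool scev_ready = false;   // true only inside a (mock) loop pipeline
  int init_calls = 0;
  int finalize_calls = 0;
};

static LoopState g_state;

// Mock of the scev_initialized_p () query used as the pipeline test.
static bool mock_scev_initialized_p () { return g_state.scev_ready; }

static void mock_loop_optimizer_init ()
{
  g_state.loops_ready = true;
  ++g_state.init_calls;
}

static void mock_loop_optimizer_finalize ()
{
  g_state.loops_ready = false;
  ++g_state.finalize_calls;
}

// Models the guard: set up loop state only when no enclosing pipeline
// has done so, and tear down only what this call set up.
static unsigned int mock_pass_execute ()
{
  bool in_loop_pipeline = mock_scev_initialized_p ();
  if (!in_loop_pipeline)
    mock_loop_optimizer_init ();

  // The pass body may rely on loop structures being available here.
  assert (g_state.loops_ready);
  unsigned int todo = 0;  // placeholder for the pass's real work

  if (!in_loop_pipeline)
    mock_loop_optimizer_finalize ();
  return todo;
}
```

Called with a default LoopState, mock_pass_execute performs one init and one finalize and leaves the loop state torn down. Called with scev_ready and loops_ready already set, as an enclosing pipeline would leave them, it performs neither and leaves the state intact for the following passes.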