From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 72593 invoked by alias); 11 Nov 2015 11:03:12 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 72581 invoked by uid 89); 11 Nov 2015 11:03:11 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-3.2 required=5.0 tests=AWL,BAYES_00,RP_MATCHES_RCVD,SPF_PASS autolearn=ham version=3.3.2
X-HELO: fencepost.gnu.org
Received: from fencepost.gnu.org (HELO fencepost.gnu.org) (208.118.235.10)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS;
 Wed, 11 Nov 2015 11:03:05 +0000
Received: from eggs.gnu.org ([2001:4830:134:3::10]:35592)
 by fencepost.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.82)
 (envelope-from ) id 1ZwTB9-0002FD-Hs for gcc-patches@gnu.org;
 Wed, 11 Nov 2015 06:03:03 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1ZwTAi-0003e8-2L for gcc-patches@gnu.org;
 Wed, 11 Nov 2015 06:03:03 -0500
Received: from mx2.suse.de ([195.135.220.15]:59434)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from ) id 1ZwTAh-0003e0-Oj for gcc-patches@gnu.org;
 Wed, 11 Nov 2015 06:02:35 -0500
Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254])
 by mx2.suse.de (Postfix) with ESMTP id 11BC3AC6C;
 Wed, 11 Nov 2015 11:02:15 +0000 (UTC)
Date: Wed, 11 Nov 2015 11:03:00 -0000
From: Richard Biener
To: Tom de Vries
cc: "gcc-patches@gnu.org", Jakub Jelinek
Subject: Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def
In-Reply-To: <5640FB07.6010008@mentor.com>
Message-ID:
References: <5640BD31.2060602@mentor.com> <5640FB07.6010008@mentor.com>
User-Agent: Alpine 2.11 (LSU 23 2013-08-11)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x
 (no timestamps) [generic]
X-Received-From: 195.135.220.15
X-SW-Source: 2015-11/txt/msg01337.txt.bz2

On Mon, 9 Nov 2015, Tom de Vries wrote:

> On 09/11/15 16:35, Tom de Vries wrote:
> > Hi,
> >
> > this patch series for stage1 trunk adds support to:
> > - parallelize oacc kernels regions using parloops, and
> > - map the loops onto the oacc gang dimension.
> >
> > The patch series contains these patches:
> >
> >       1  Insert new exit block only when needed in
> >          transform_to_exit_first_loop_alt
> >       2  Make create_parallel_loop return void
> >       3  Ignore reduction clause on kernels directive
> >       4  Implement -foffload-alias
> >       5  Add in_oacc_kernels_region in struct loop
> >       6  Add pass_oacc_kernels
> >       7  Add pass_dominator_oacc_kernels
> >       8  Add pass_ch_oacc_kernels
> >       9  Add pass_parallelize_loops_oacc_kernels
> >      10  Add pass_oacc_kernels pass group in passes.def
> >      11  Update testcases after adding kernels pass group
> >      12  Handle acc loop directive
> >      13  Add c-c++-common/goacc/kernels-*.c
> >      14  Add gfortran.dg/goacc/kernels-*.f95
> >      15  Add libgomp.oacc-c-c++-common/kernels-*.c
> >      16  Add libgomp.oacc-fortran/kernels-*.f95
> >
> > The first 9 patches are more or less independent, but patches 10-16 are
> > intended to be committed at the same time.
> >
> > Bootstrapped and reg-tested on x86_64.
> >
> > Build and reg-tested with nvidia accelerator, in combination with a
> > patch that enables accelerator testing (which is submitted at
> > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
> >
> > I'll post the individual patches in reply to this message.
> >
>
> This patch adds the pass_oacc_kernels pass group to the pass list in
> passes.def.
>
> Note the repetition of pass_lim/pass_copy_prop. The first pair is for an inner
> loop in a loop nest, the second for an outer loop in a loop nest.

@@ -86,6 +86,27 @@ along with GCC; see the file COPYING3.  If not see
   /* pass_build_ealias is a dummy pass that ensures that we
      execute TODO_rebuild_alias at this point.
     */
   NEXT_PASS (pass_build_ealias);
+  /* Pass group that runs when there are oacc kernels in the
+     function.  */
+  NEXT_PASS (pass_oacc_kernels);
+  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
+      NEXT_PASS (pass_dominator_oacc_kernels);
+      NEXT_PASS (pass_ch_oacc_kernels);
+      NEXT_PASS (pass_dominator_oacc_kernels);
+      NEXT_PASS (pass_tree_loop_init);
+      NEXT_PASS (pass_lim);
+      NEXT_PASS (pass_copy_prop);
+      NEXT_PASS (pass_lim);
+      NEXT_PASS (pass_copy_prop);

Iterate lim/copyprop twice?!  Why's that needed?

+      NEXT_PASS (pass_scev_cprop);

What's that for?  It's supposed to help remove loops - I don't
expect kernels to vanish.

+      NEXT_PASS (pass_tree_loop_done);
+      NEXT_PASS (pass_dominator_oacc_kernels);

Three times DOM?  No, please.  I wonder why you don't run oacc_kernels
after FRE and drop the initial DOM(s).

+      NEXT_PASS (pass_dce);
+      NEXT_PASS (pass_tree_loop_init);
+      NEXT_PASS (pass_parallelize_loops_oacc_kernels);
+      NEXT_PASS (pass_expand_omp_ssa);
+      NEXT_PASS (pass_tree_loop_done);

The switches into/out of tree_loop also look odd to me, but well
(they'll be controlled by -ftree-loop-optimize).

+  POP_INSERT_PASSES ()

Please get some more sense into this pass pipeline.

Richard.

> Thanks,
> - Tom
>
>

-- 
Richard Biener
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)