Date: Thu, 18 Jun 2015 10:55:00 -0000
From: Richard Biener
To: Tom de Vries
Cc: GCC Patches, Jakub Jelinek
Subject: Re: [gomp4, committed] Fix parallelization for fortran oacc kernels tests
In-Reply-To: <55829E5A.4000205@mentor.com>

On Thu, 18 Jun 2015, Tom de Vries wrote:

> Hi,
>
> I ran into a problem with fortran loops in oacc kernels regions not being
> parallelized, after introducing transform_to_exit_first_loop_alt.
>
> For gfortran.dg/goacc/kernels-loop.f95, we get:
> ...
>   #pragma omp target oacc_parallel num_gangs(1)
> ...
> instead of the desired num_gangs (32).
>
> transform_to_exit_first_loop_alt fails because nit is _135, where nit is
> defined by:
> ...
>   *_105 = 0;
>   D__lsm.27_50 = *_105;
>   _32 = (unsigned int) D__lsm.27_50;
>   _135 = 1023 - _32;
> ...
>
> pass_fre would manage to propagate the '*_105 = 0' assignment. But in the
> current pass order, pass_fre is run before pass_lim, where this pattern is
> introduced:
> ...
>   NEXT_PASS (pass_ch_oacc_kernels);
>   NEXT_PASS (pass_fre);
>   NEXT_PASS (pass_tree_loop_init);
>   NEXT_PASS (pass_lim);
>   NEXT_PASS (pass_copy_prop);
>   NEXT_PASS (pass_scev_cprop);
>   NEXT_PASS (pass_parallelize_loops_oacc_kernels);
>   NEXT_PASS (pass_expand_omp_ssa);
>   NEXT_PASS (pass_tree_loop_done);
> ...
>
> The patch moves pass_fre to the location of pass_copy_prop, and replaces it.
> Furthermore, it adds scans to the fortran test-cases to make sure they get
> properly parallelized.

You may now figure out that LIM needs FRE to detect equal memory
references to apply store-motion. But maybe the issues oacc lowering
introduces are limited and under your control.

Richard.

> Bootstrapped and reg-tested on x86_64.
>
> Committed to gomp-4_0-branch.
>
> Thanks,
> - Tom
>

-- 
Richard Biener
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
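
For readers following the quoted dump, here is a minimal C sketch of the store-then-load pattern that defines nit. It is only an illustration, not code from the patch or from GCC: the names slot, nit_as_dumped and nit_after_forwarding are hypothetical, and slot merely stands in for the memory that _105 points to. The first function mirrors the four dumped statements; the second shows what they reduce to once the stored 0 is forwarded to the load, which is presumably the simplification Tom expects from running pass_fre after pass_lim.

```c
/* Hypothetical stand-in for the memory that _105 points to in the dump.  */
static int slot = 7;

/* Mirrors the dumped sequence: store 0, reload it, derive nit from the
   reloaded value.  Without forwarding, nit depends on a memory load.  */
unsigned int nit_as_dumped(void)
{
    int *p = &slot;
    *p = 0;                               /* *_105 = 0;                      */
    int lsm = *p;                         /* D__lsm.27_50 = *_105;           */
    unsigned int u = (unsigned int) lsm;  /* _32 = (unsigned int) D__lsm...; */
    return 1023u - u;                     /* _135 = 1023 - _32;              */
}

/* The same computation after the stored value has been forwarded to the
   load by hand: the load disappears and nit becomes a plain constant.  */
unsigned int nit_after_forwarding(void)
{
    int *p = &slot;
    *p = 0;          /* the store itself is kept */
    return 1023u;    /* 1023 - 0 */
}
```

Both functions return the same value; the difference is only whether the iteration count is visible as a constant or hidden behind a load from memory.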