From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <55AFCD5A.1080200@mentor.com>
Date: Wed, 22 Jul 2015 17:06:00 -0000
From: Tom de Vries
MIME-Version: 1.0
To: Jakub Jelinek
CC: Chung-Lin Tang, gcc-patches, Tom de Vries, Thomas Schwinge
Subject: [gomp4, committed] Set safelen to INT_MAX for oacc independent pragma
References: <55A4A21C.1070004@codesourcery.com> <20150714070010.GY1788@tucnak.redhat.com> <55A4D7E0.2020303@codesourcery.com> <20150714094859.GC1788@tucnak.redhat.com>
In-Reply-To: <20150714094859.GC1788@tucnak.redhat.com>
Content-Type: multipart/mixed; boundary="------------040904060601050004070505"

--------------040904060601050004070505
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit

[ was; Re: [PATCH,
gomp4] Propagate independent clause for OpenACC kernels pass ]

On 14/07/15 11:48, Jakub Jelinek wrote:
> On Tue, Jul 14, 2015 at 05:35:28PM +0800, Chung-Lin Tang wrote:
>> The wording of OpenACC independent is more simple:
>> "... the independent clause tells the implementation that the iterations of this loop
>> are data-independent with respect to each other." -- OpenACC spec 2.7.9
>>
>> I would say this implies even more relaxed conditions than OpenMP simd safelen,
>> essentially saying that the compiler doesn't even need dependence analysis; just
>> assume independence of iterations.
>
> safelen is also saying that the compiler doesn't even need dependence
> analysis. It is just that only some transformations of the loop are ok
> without dependence analysis, others need to be with dependence analysis.
> Classical vectorization optimizations (instead of doing one iteration
> at a time you can do up to safelen consecutive iterations together) for the
> first statement in the loop, then second statement, etc. are ok without
> dependence analysis, but e.g. reversing the loop and running first the last
> iteration and so on up to first, or running the iterations in random orders
> is not ok.
>
>>> So if OpenACC independent means there are no dependencies in between
>>> iterations, the OpenMP counterpart here is #pragma omp for simd schedule (auto)
>>> or #pragma omp distribute parallel for simd schedule (auto).
>>
>> schedule(auto) appears to correspond to the OpenACC 'auto' clause, or
>> what is implied in a kernels compute construct, but I'm not sure it implies
>> no dependencies between iterations?
>
> By the schedule(auto) I meant that the user tells the compiler it can
> parallelize the loop with whatever schedule it wants. Other schedules are
> quite well defined, if the team has that many threads, which of the thread
> gets which iteration, so user could rely on a particular parallelization and
> the loop iterations still could not be 100% independent.
> With schedule(auto) you say it is up to the compiler to schedule them, thus they
> really have to be all independent.
>
>> Putting aside the semantic issues, as of currently safelen>0 turns on a certain amount of
>> vectorization code that we are not currently using (and not likely at all for nvptx).
>> Right now, we're just trying to pass the new flag to a kernels tree-parloops based pass.
>
> In any case, when setting your flag you should also set safelen = INT_MAX,
> as the OpenACC independent implies that you can vectorize the loop with any
> vectorization factor without performing dependency analysis on the loop.
> OpenACC is (hopefully) not just about PTX and most other targets will want
> to vectorize such loops.
>

This patch sets safelen to INT_MAX for loops marked with the independent
clause on the OpenACC loop directive.

Built and reg-tested on x86_64 with an NVIDIA accelerator.

Committed to gomp-4_0-branch.

Thanks,
- Tom

--------------040904060601050004070505
Content-Type: text/x-patch;
 name="0001-Set-safelen-to-INT_MAX-for-oacc-independent-pragma.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename*0="0001-Set-safelen-to-INT_MAX-for-oacc-independent-pragma.patc";
 filename*1="h"

Set safelen to INT_MAX for oacc independent pragma

2015-07-22  Tom de Vries

	* omp-low.c (expand_omp_for): Set loop->safelen to INT_MAX if
	marked_independent.
---
 gcc/omp-low.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 0419dcd..65c6321 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -8286,6 +8286,7 @@ expand_omp_for (struct omp_region *region, gimple inner_stmt)
 	{
 	  struct loop *loop = region->cont->loop_father;
 	  loop->marked_independent = true;
+	  loop->safelen = INT_MAX;
 	}
     }
   else if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_SIMD)
-- 
1.9.1

--------------040904060601050004070505--