Subject: Re: [og7] vector_length extension part 2: Generalize state propagation and synchronization
To: Cesar Philippidis
CC: "gcc-patches@gcc.gnu.org", Thomas Schwinge
References: <823cc381-8752-14df-d6e2-0203de5da2fb@codesourcery.com> <0e1729fb-fd98-1f37-8de9-0a11e2aa9c7a@mentor.com> <7cc5faf7-255d-523d-144f-a5a3439ce25b@codesourcery.com> <44178739-74a6-0534-deb2-f14f1d6d2cf4@mentor.com> <63b73137-2190-ce03-2269-4a26f0d32431@codesourcery.com>
From: Tom de Vries
Message-ID: <2cc6a590-5fbe-99f8-b726-2abd3e109656@mentor.com>
Date: Thu, 22 Mar 2018 17:58:00 -0000
In-Reply-To: <63b73137-2190-ce03-2269-4a26f0d32431@codesourcery.com>

On 03/22/2018 06:24 PM, Cesar Philippidis wrote:
> On 03/22/2018 09:18 AM, Tom de Vries wrote:
>
>> That's obviously not good enough.
>>
>> When I compile this test-case:
>> ...
>> int
>> main (void)
>> {
>>   int a[10];
>> #pragma acc parallel num_workers (16)
>> #pragma acc loop worker
>>   for (int i = 0; i < 10; i++)
>>     a[i] = i;
>>
>>   return 0;
>> }
>> ...
>>
>> I get:
>> ...
>>  .maxntid 32, 16, 1
>> ...
>>
>> That's the change you need to isolate.
>
> I attached an updated patch which incorporates the
> cfun->machine->axis_dim changes. It now generates more precise arguments
> for maxntid.

I'll try this out.

Still, this doesn't address my request: "Also, list in the comment a JIT
driver version, and sm_ version and a testcase for which this is required"

Thanks,
- Tom

>
> Cesar
>
>
> 0001-emit-.maxntid-hint.patch
>
>
> From 11035dc92884146dc4d974156adcb260568db785 Mon Sep 17 00:00:00 2001
> From: Cesar Philippidis
> Date: Thu, 22 Mar 2018 08:05:53 -0700
> Subject: [PATCH] emit .maxntid hint
>
> ---
>  gcc/config/nvptx/nvptx.c | 19 +++++++++++++++++++
>  gcc/config/nvptx/nvptx.h |  2 ++
>  2 files changed, 21 insertions(+)
>
> diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
> index eff87732c4b..3958f71e995 100644
> --- a/gcc/config/nvptx/nvptx.c
> +++ b/gcc/config/nvptx/nvptx.c
> @@ -76,6 +76,7 @@
>  #include "target-def.h"
>
>  #define WORKAROUND_PTXJIT_BUG 1
> +#define WORKAROUND_PTXJIT_BUG_3 1
>
>  /* Define dimension sizes for known hardware. */
>  #define PTX_VECTOR_LENGTH 32
> @@ -1219,6 +1220,16 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
>       stream, in order to share the prototype writing code. */
>    std::stringstream s;
>    write_fn_proto (s, true, name, decl);
> +
> +#if WORKAROUND_PTXJIT_BUG_3
> +  /* Emitting a .maxntid seems to have the effect of encouraging the
> +     PTX JIT emit SYNC branches. */
> +  if (lookup_attribute ("omp target entrypoint", DECL_ATTRIBUTES (decl))
> +      && lookup_attribute ("oacc function", DECL_ATTRIBUTES (decl)))
> +    s << ".maxntid " << cfun->machine->axis_dim[0] << ", "
> +      << cfun->machine->axis_dim[1] << ", 1\n";
> +#endif
> +
>    s << "{\n";
>
>    bool return_in_mem = write_return_type (s, false, result_type);
> @@ -2831,6 +2842,11 @@ struct offload_attrs
>    int max_workers;
>  };
>
> +/* Define entries for cfun->machine->axis_dim. */
> +
> +#define MACH_VECTOR_LENGTH 0
> +#define MACH_MAX_WORKERS 1
> +
>  struct parallel
>  {
>    /* Parent parallel. */
> @@ -4525,6 +4541,9 @@ nvptx_reorg (void)
>
>    populate_offload_attrs (&oa);
>
> +  cfun->machine->axis_dim[MACH_VECTOR_LENGTH] = oa.vector_length;
> +  cfun->machine->axis_dim[MACH_MAX_WORKERS] = oa.max_workers;
> +
>    /* If there is worker neutering, there must be vector
>       neutering. Otherwise the hardware will fail. */
>    gcc_assert (!(oa.mask & GOMP_DIM_MASK (GOMP_DIM_WORKER))
> diff --git a/gcc/config/nvptx/nvptx.h b/gcc/config/nvptx/nvptx.h
> index 8a14507c88a..958516da604 100644
> --- a/gcc/config/nvptx/nvptx.h
> +++ b/gcc/config/nvptx/nvptx.h
> @@ -226,6 +226,8 @@ struct GTY(()) machine_function
>    int return_mode; /* Return mode of current fn.
>                        (machine_mode not defined yet.) */
>    rtx axis_predicate[2]; /* Neutering predicates. */
> +  int axis_dim[2]; /* Maximum number of threads on each axis, dim[0] is
> +                      vector_length, dim[1] is num_workers. */
>    rtx unisimt_master; /* 'Master lane index' for -muniform-simt. */
>    rtx unisimt_predicate; /* Predicate for -muniform-simt. */
>    rtx unisimt_location; /* Mask location for -muniform-simt. */
>
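
For reference, the string that the quoted hunk in nvptx_declare_function_name appends can be reproduced outside GCC. This is only a minimal standalone sketch of the formatting step: the struct `axis_dims` and the function `format_maxntid` are hypothetical names, not part of nvptx.c, and the two fields stand in for cfun->machine->axis_dim[0] and axis_dim[1]. It does not model the attribute checks or how populate_offload_attrs computes the values.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for cfun->machine->axis_dim[2] from the patch:
// entry 0 is the vector length, entry 1 is the maximum number of workers.
struct axis_dims
{
  int vector_length;
  int max_workers;
};

// Build the directive text the same way the quoted hunk does: x and y
// come from the two axis entries, z is hard-coded to 1.
std::string
format_maxntid (const axis_dims &d)
{
  std::ostringstream s;
  s << ".maxntid " << d.vector_length << ", " << d.max_workers << ", 1\n";
  return s.str ();
}
```

With a vector length of 32 and 16 workers, as in the num_workers (16) test-case quoted above, this yields the ".maxntid 32, 16, 1" line shown there.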