From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 48868 invoked by alias); 6 Aug 2015 16:33:40 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 48852 invoked by uid 89); 6 Aug 2015 16:33:39 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.0 required=5.0
 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2
X-HELO: relay1.mentorg.com
Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
 Thu, 06 Aug 2015 16:33:38 +0000
Received: from svr-orw-fem-06.mgc.mentorg.com ([147.34.97.120])
 by relay1.mentorg.com with esmtp id 1ZNO6o-00021A-BF
 from Cesar_Philippidis@mentor.com ; Thu, 06 Aug 2015 09:33:34 -0700
Received: from [127.0.0.1] (147.34.91.1)
 by SVR-ORW-FEM-06.mgc.mentorg.com (147.34.97.120)
 with Microsoft SMTP Server id 14.3.224.2; Thu, 6 Aug 2015 09:33:33 -0700
Message-ID: <55C38C5D.6040602@mentor.com>
Date: Thu, 06 Aug 2015 16:33:00 -0000
From: Cesar Philippidis
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.8.0
MIME-Version: 1.0
To: Nathan Sidwell, GCC Patches
CC: Jakub Jelinek, Thomas Schwinge
Subject: Re: [gomp4] Redesign oacc_parallel launch API
References: <55B7B332.1010603@acm.org>
In-Reply-To: <55B7B332.1010603@acm.org>
Content-Type: multipart/mixed; boundary="------------000106010900090902030601"
X-SW-Source: 2015-08/txt/msg00365.txt.bz2

--------------000106010900090902030601
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Content-length: 1891

On 07/28/2015 09:52 AM, Nathan Sidwell wrote:
> I've committed this patch to the gomp4 branch to redo the launch API.
> I'll post a version for trunk once the versioning patch gets approved &
> committed.
>
> This changes the API in a number of ways, allowing device-specific
> knowledge to be moved into the device compiler and out of the host
> compiler.
>
> Firstly, we attach a tuple of launch dimensions as an attribute to the
> offloaded function's 'oacc function' attribute. These are the constant
> launch dimensions. Dynamic dimensions get a zero for their slot in this
> list. Further this list can be extended in the future to an alist keyed
> by device_type.
>
> Dynamic dimensions are computed on the host. however they are passed
> via varadic args to the GOACC_parallel function (which is renamed). The
> varadic args are passed using key/value representation, and 3 keys are
> currently defined:
> END -- end of the varadic list
> DIM - set of runtime-computed dimensions. Only the dynamic ones are
> passed.
> ASYNC_WAIT - an async and a set of waits (possibly zero).
>
> I have arranged for the key to have a slot that can later be filled by
> device_type, and hence support multiple device types.
>
> The constant dimensions can be used in expansion of the GOACC_nid
> function in the device compiler. The device compiler could also process
> that list to select the device_type slot that is appropriate.
>
> For PTX the backend is augmented to emit the launch dimensions into the
> target data, from whence the ptx plugin can pick them up and overwrite
> with any dynamic ones passed in from the launch function.

Looking at set_oacc_fn_attrib, it appears that const values are also
considered dynamic. See the attached test case for more info. Is that
the expected behavior? If not, I could take a look at this after I
finish my reduction patch.

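As a rough illustration of the scheme described in the quoted text, the sketch below resolves launch dimensions from a constant tuple (zero marks a dynamic slot) plus a key/value variadic list. It is not the gomp4 code: the names resolve_launch, KEY_END, KEY_DIM, KEY_ASYNC_WAIT and the DIM_* indices are hypothetical, and the payload layout (a bitmask of passed slots after DIM, an async value plus a wait count after ASYNC_WAIT) is an assumption made only for the example.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical keys; the real encoding on the gomp4 branch may differ.  */
enum launch_key { KEY_END = 0, KEY_DIM = 1, KEY_ASYNC_WAIT = 2 };

/* Hypothetical slot indices for the launch dimension tuple.  */
enum { DIM_GANG, DIM_WORKER, DIM_VECTOR, DIM_MAX };

/* Start from the constant tuple (0 marks a dynamic slot), then overwrite
   slots with values found in the key/value list.  Assumed layout:
     KEY_DIM, unsigned mask, one int per set bit in mask
     KEY_ASYNC_WAIT, int async, int num_waits, num_waits ints
     KEY_END
   Returns the number of waits seen, or -1 on an unknown key.  */
static int
resolve_launch (int *dims, int *async, const int *const_dims, ...)
{
  va_list ap;
  int waits_seen = 0;

  assert (dims && async && const_dims);

  for (int i = 0; i < DIM_MAX; i++)
    dims[i] = const_dims[i];

  va_start (ap, const_dims);
  for (;;)
    {
      int key = va_arg (ap, int);

      if (key == KEY_END)
	break;
      else if (key == KEY_DIM)
	{
	  unsigned mask = va_arg (ap, unsigned);

	  /* Only the dynamic slots are passed.  */
	  for (int i = 0; i < DIM_MAX; i++)
	    if (mask & (1u << i))
	      dims[i] = va_arg (ap, int);
	}
      else if (key == KEY_ASYNC_WAIT)
	{
	  *async = va_arg (ap, int);
	  int num_waits = va_arg (ap, int);

	  /* The sketch only counts the waits; it does not act on them.  */
	  for (int i = 0; i < num_waits; i++)
	    (void) va_arg (ap, int);
	  waits_seen += num_waits;
	}
      else
	{
	  /* Unknown key: its payload length is unknown, so stop.  */
	  waits_seen = -1;
	  break;
	}
    }
  va_end (ap);

  return waits_seen;
}
```

Under these assumptions, the behavior asked about above would correspond to the vector slot being 0 in the constant tuple and the value 32 arriving at run time through a call such as resolve_launch (dims, &async, const_dims, KEY_DIM, 1u << DIM_VECTOR, 32, KEY_END), rather than 32 being recorded in the constant tuple itself.
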
Cesar

--------------000106010900090902030601
Content-Type: text/x-csrc;
 name="vlength.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="vlength.c"
Content-length: 250

#include <stdio.h>

const int vl = 32;

int
main ()
{
  unsigned int red = 0;

#pragma acc parallel loop vector_length (vl) vector reduction (+:red) copy (red)
  for (int i = 0; i < 100; i++)
    red ++;

  printf ("red = %d\n", red);

  return 0;
}

--------------000106010900090902030601--