Date: Thu, 19 Nov 2015 14:29:00 -0000
From: Julian Brown
To: Jakub Jelinek
CC: James Norris, GCC Patches, "Joseph S. Myers", Nathan Sidwell
Subject: Re: [OpenACC 0/7] host_data construct
Message-ID: <20151119142650.5a8842e4@octopus>
In-Reply-To: <20151119131345.GX5675@tucnak.redhat.com>

On Thu, 19 Nov 2015 14:13:45 +0100
Jakub Jelinek wrote:

> On Wed, Nov 18, 2015 at 12:47:47PM +0000, Julian Brown wrote:
>
> The FE/gimplifier part is okay, but I really don't like the
> omp-low.c changes, mostly the *lookup_decl_in_outer_ctx* changes.
> If I count well, we have right now 27 maybe_lookup_decl_in_outer_ctx
> callers and 7 lookup_decl_in_outer_ctx callers, you want to change
> behavior of 1 maybe_lookup_decl_in_outer_ctx and 1
> lookup_decl_in_outer_ctx. Why exactly those 2 and not the others?

The not-very-good reason is that those are merely the places that
allowed the supplied examples to work, and I'm wary of changing other
code that I don't understand very well.

> What are the exact rules (what does the standard say about it)?
> I'd expect that all phases (scan_sharing_clauses, lower_omp* and
> expand_omp*) should agree on the same behavior, otherwise I can't see
> how it can work properly.

OK, thanks -- as to what the standard says, it's so ill-specified in
this area that nothing can be learned about the behaviour of offloaded
regions within host_data constructs, and my question about that on the
technical mailing list is still unanswered. (Actually, Nathan suggested
in private mail that the conservative thing to do would be to disallow
offloaded regions entirely within host_data constructs, so maybe that's
the way to go.)

OpenMP 4.5 seems to *not* specify the skipping-over behaviour for
use_device_ptr variables (p105, lines 20-23):

  "The is_device_ptr clause is used to indicate that a list item is a
  device pointer already in the device data environment and that it
  should be used directly. Support for device pointers created outside
  of OpenMP, specifically outside of the omp_target_alloc routine and
  the use_device_ptr clause, is implementation defined."

That suggests that use_device_ptr is a valid way to create device
pointers for use in enclosed target regions, which is the behaviour I
had assumed was wrong for OpenACC. So I think my guess at the
"most-obvious" behaviour was probably misguided anyway.

It's maybe even more complicated. Consider the example:

  char x[1024];

  #pragma acc enter data copyin(x)

  #pragma acc host_data use_device(x)
  {
    target_primitive(x);

    #pragma acc parallel present(x)      [1]
    {
      x[5] = 0;                          [2]
    }
  }

Here, the "present" clause marked [1] will fail (because 'x' is a
target pointer now). If it's omitted, the array access [2] will cause
an implicit present_or_copy to be used for the 'x' pointer (which again
will fail, because now 'x' points to target data).

Maybe what we actually need is,

  #pragma acc host_data use_device(x)
  {
    target_primitive(x);

    #pragma acc parallel deviceptr(x)
    {
      ...
    }
  }

with the deviceptr(x) clause magically substituted in the parallel
construct, but I'm struggling to see how we could justify doing that
when that behaviour's not mentioned in the spec at all.

Aha, so: maybe an explicit deviceptr(x) is implicitly required in this
situation, and omitting it should be an error? That suddenly seems to
make most sense. I'll see about fixing the patch to do that.

Julian
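
For reference, the explicit-deviceptr variant discussed above can be
written out as a complete translation unit. This is only a sketch under
assumptions that are not in the thread: the body of target_primitive is
an invented stand-in, the function name host_data_deviceptr_sketch is
hypothetical, and the device address is first stored in a pointer
variable 'dx' because, as far as I understand the OpenACC spec,
deviceptr expects a pointer rather than an array. Whether a parallel
region may be nested inside host_data at all is exactly the open
question here, so the OpenACC behaviour should not be taken as settled.

```c
#include <stddef.h>

/* Hypothetical stand-in for a target-side library routine that expects
   a device address.  The name comes from the example above; the body is
   invented for illustration and only checks the pointer.  */
static int
target_primitive (const char *dev_ptr)
{
  return dev_ptr != NULL;
}

/* Returns 0 on success.  When built without OpenACC support the pragmas
   are ignored and everything below runs on the host.  */
int
host_data_deviceptr_sketch (void)
{
  static char x[1024];
  int ok = 0;

  x[5] = 42;

#pragma acc enter data copyin(x)

#pragma acc host_data use_device(x)
  {
    /* Inside host_data, 'x' is meant to denote the device copy.
       Assumption: take that address into a pointer for deviceptr.  */
    char *dx = x;

    ok = target_primitive (dx);

    /* Explicit deviceptr: 'dx' is used directly as a device address,
       with no present or implicit-copy lookup for it.  */
#pragma acc parallel deviceptr(dx)
    {
      dx[5] = 0;
    }
  }

#pragma acc exit data copyout(x)

  return (ok && x[5] == 0) ? 0 : 1;
}
```

Compiled as plain C (no -fopenacc), the pragmas are ignored and the
function simply clears x[5] on the host, which makes the control flow
easy to check before trying it against an offloading compiler.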