From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 52374 invoked by alias); 19 Nov 2015 15:57:33 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 52364 invoked by uid 89); 19 Nov 2015 15:57:32 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Thu, 19 Nov 2015 15:57:31 +0000 Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26]) by mx1.redhat.com (Postfix) with ESMTPS id 561138D3B4; Thu, 19 Nov 2015 15:57:30 +0000 (UTC) Received: from tucnak.zalov.cz (ovpn-116-34.ams2.redhat.com [10.36.116.34]) by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id tAJFvRIF031316 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Thu, 19 Nov 2015 10:57:28 -0500 Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.15.2/8.15.2) with ESMTP id tAJFvPpl007060; Thu, 19 Nov 2015 16:57:26 +0100 Received: (from jakub@localhost) by tucnak.zalov.cz (8.15.2/8.15.2/Submit) id tAJFvN81007059; Thu, 19 Nov 2015 16:57:23 +0100 Date: Thu, 19 Nov 2015 15:57:00 -0000 From: Jakub Jelinek To: Julian Brown Cc: James Norris , GCC Patches , "Joseph S. Myers" , Nathan Sidwell Subject: Re: [OpenACC 0/7] host_data construct Message-ID: <20151119155723.GA5675@tucnak.redhat.com> Reply-To: Jakub Jelinek References: <56293476.5020801@codesourcery.com> <562A578E.4080907@codesourcery.com> <20151026183422.GW478@tucnak.redhat.com> <20151102183339.365c3d33@octopus> <20151112111621.657650bc@octopus> <20151118124747.30a2ec5d@octopus> <20151119131345.GX5675@tucnak.redhat.com> <20151119142650.5a8842e4@octopus> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20151119142650.5a8842e4@octopus> User-Agent: Mutt/1.5.23 (2014-03-12) X-IsSubscribed: yes X-SW-Source: 2015-11/txt/msg02362.txt.bz2 On Thu, Nov 19, 2015 at 02:26:50PM +0000, Julian Brown wrote: > OK, thanks -- as to what the standard says, it's so ill-specified in > this area that nothing can be learned about the behaviour of offloaded > regions within host_data constructs, and my question about that on the > technical mailing list is still unanswered (actually Nathan suggested > in private mail that the conservative thing to do would be to disallow > offloaded regions entirely within host_data constructs, so maybe that's > the way to go). > > OpenMP 4.5 seems to *not* specify the skipping-over behaviour for > use_device_ptr variables (p105, lines 20-23): > > "The is_device_ptr clause is used to indicate that a list item is a > device pointer already in the device data environment and that it > should be used directly. Support for device pointers created outside > of OpenMP, specifically outside of the omp_target_alloc routine and the > use_device_ptr clause, is implementation defined." > > That suggests that use_device_ptr is a valid way to create device > pointers for use in enclosed target regions: the behaviour I assumed > was wrong for OpenACC. So I think my guess at the "most-obvious" > behaviour was probably misguided anyway. use_device_ptr kind of privatizes the variable, the private variable being the device pointer corresponding to the host pointer outside of the target data with use_device_ptr clause. And, if you want to use that device pointer in a target region, it should be on the is_device_ptr clause on the target construct. See e.g. libgomp.c/target-18.c testcase. int a[4]; ... #pragma omp target data map(to:a) #pragma omp target data use_device_ptr(a) map(from:err) #pragma omp target is_device_ptr(a) private(i) map(from:err) { err = 0; for (i = 0; i < 4; i++) if (a[i] != 23 + i) err = 1; } The implementation has this way a choice how to implement device pointers (what use_device_ptr gives you, or say omp_target_alloc returns) - either (GCC's choice at least for the XeonPhi and hopefully PTX, HSA does not care, as it shares address space) implement them as host pointer encoding the bits the target device wants to use, or some kind of descriptor. In the former case, is_device_ptr is essentially a firstprivate, you bitwise copy the device pointer from the host to target device, where you can dereference it etc. In the descriptor case you'd do some transformation of the host side representation of the device pointer to the device side. > > It's maybe even more complicated. Consider the example: > > char x[1024]; > > #pragma acc enter data copyin(x) > > #pragma acc host_data use_device(x) > { > target_primitive(x); > #pragma acc parallel present(x) [1] > { > x[5] = 0; [2] > } > } If it is unclear, I think disallowing acc {parallel,kernels} inside of acc host_data might be too big hammer, but perhaps just erroring out or warning during gimplification that if you (explicitly or implicitly) try to map a var that is in use_device clause in some outer context, it is either wrong, unsupported or will not do what users think? I will double check on omp-lang, but supposedly we could for OpenMP warn in similar cases (use_device_ptr clause instead of use_device), except when it is passed to is_device_ptr clause, because I think the behavior is just unspecified otherwise. > > Here, the "present" clause marked [1] will fail (because 'x' is a > target pointer now). If it's omitted, the array access [2] will cause an > implicit present_or_copy to be used for the 'x' pointer (which again > will fail, because now 'x' points to target data). Maybe what we > actually need is, > > #pragma acc host_data use_device(x) > { > target_primitive(x); > #pragma acc parallel deviceptr(x) > { > ... > } > } > > with the deviceptr(x) clause magically substituted in the parallel > construct, but I'm struggling to see how we could justify doing that > when that behaviour's not mentioned in the spec at all. Is deviceptr as above meant to work? That is the OpenACC counterpart of is_device_ptr, right? If yes, then I'd suggest just warning if you try to implicitly or explicitly map something use_device in outer contexts, and just make sure you don't ICE on the cases where you warn. If the standard does not say what it means, then it is unspecified behavior... Jakub