Date: Tue, 7 Jun 2022 14:40:28 +0200
From: Jakub Jelinek
To: Andrew Stubbs
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] libgomp, openmp: pinned memory
References: <20220104155558.GG2646553@tucnak>
 <48ee767a-0d90-53b4-ea54-9deba9edd805@codesourcery.com>
 <20220104182829.GK2646553@tucnak>
 <20220104184740.GL2646553@tucnak>

On Tue, Jun 07, 2022 at 01:28:33PM +0100, Andrew Stubbs wrote:
> > For performance boost of what kind of code?
> > I don't understand how the CUDA API could be useful (or could be used
> > at all) if offloading to NVPTX isn't involved.  The fact that somebody
> > asks for host memory allocation with omp_atk_pinned set to true doesn't
> > mean it will be in any way related to NVPTX offloading (unless it is in
> > an NVPTX target region, obviously, but then mlock isn't available, so
> > sure, if there is something CUDA can provide for that case, nice).
>
> This is specifically for NVPTX offload, of course, but then that's what
> our customer is paying for.
>
> The expectation, from users, is that memory pinning will give the
> benefits specific to the active device.  We can certainly make that
> happen when there is only one (flavour of) offload device present.  I
> had hoped it could be one way for all, but it looks like not.
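For reference, what is being asked for is something like the following
(just a minimal sketch, using nothing beyond the standard OpenMP 5.x
allocator API; whether the pinning underneath should then happen through
mlock or through something like cudaHostAlloc is exactly the open
question):

  #include <omp.h>

  int
  main (void)
  {
    /* Ask for a host allocator whose allocations are pinned
       (non-pageable) memory, via the standard omp_atk_pinned trait.  */
    omp_alloctrait_t traits[] = { { omp_atk_pinned, omp_atv_true } };
    omp_allocator_handle_t al
      = omp_init_allocator (omp_default_mem_space, 1, traits);
    if (al == omp_null_allocator)
      return 1;

    /* 1024 doubles of pinned host memory, e.g. to stage transfers.  */
    double *p = (double *) omp_alloc (1024 * sizeof (double), al);
    omp_free (p, al);
    omp_destroy_allocator (al);
    return 0;
  }
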
I think that is just an expectation that isn't backed by anything in the
standard.  If users need something like that, it would be good to describe
first what it actually is: memory that will be primarily used for
interfacing with offloading device 0 (or some specific device given by
some number)?  Memory that can be used without remapping on some
offloading device?  Something else?  Once we know what exactly that is
(e.g. what the CUDA or GCN APIs etc. can provide), we can discuss on
omp-lang whether there shouldn't be some standard way to ask for such an
allocator.

Or there is always the possibility of extensions.  I'm not sure whether
one can just define ompx_atv_whatever, pick some large value for it (the
spec doesn't reserve a vendor range which would be safe to use), and
support it that way (see the hypothetical sketch in the P.S. below).

A different thing is allocators in the offloading regions themselves.  I
think we should translate some omp_alloc etc. calls in such regions, when
they use constant-expression standard allocators, into doing the
allocation through other means; alternatively, allocators.c can be
overridden or amended for the needs or possibilities of the offloading
targets.

	Jakub
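
P.S. Purely for illustration, a hypothetical sketch of the ompx_atv_
extension idea above.  Both the ompx_atv_pinned_device name and its
numeric value are invented here, which is precisely the problem: the spec
reserves no vendor range, so any value we pick could clash with future
omp_atv_* values:

  #include <omp.h>

  /* Invented vendor-specific trait value meaning "pin this memory in
     whatever way benefits the offloading device", e.g. through the CUDA
     driver for NVPTX.  NOT part of any spec.  */
  #define ompx_atv_pinned_device ((omp_uintptr_t) 0xdead0001)

  omp_allocator_handle_t
  make_device_pinned_allocator (void)
  {
    omp_alloctrait_t traits[]
      = { { omp_atk_pinned, ompx_atv_pinned_device } };
    return omp_init_allocator (omp_default_mem_space, 1, traits);
  }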