From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 31 Jan 2023 13:21:58 +0100
From: Jakub Jelinek
To: Tobias Burnus
Cc: gcc-patches
Subject: Re: [Patch] libgomp.texi: Reverse-offload updates (was: [Patch] libgomp: Handle OpenMP's reverse offloads)
Reply-To: Jakub Jelinek
References: <0567b7c6-fede-72b8-63d1-1fc10dca36a0@codesourcery.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Dec 10, 2022 at 09:18:26AM +0100, Tobias Burnus wrote:
> libgomp.texi: Reverse-offload updates
>
> libgomp/
> 	* libgomp.texi (5.0 Impl. Status): Update 'requires' and 'ancestor'.
> 	(GCN): Add item about 'omp requires'.
> 	(nvptx): Likewise; add item about reverse offload.
>
> --- a/libgomp/libgomp.texi
> +++ b/libgomp/libgomp.texi
> @@ -192,8 +192,8 @@ The OpenMP 4.5 specification is fully supported.
>        env variable @tab Y @tab
>  @item Nested-parallel changes to @emph{max-active-levels-var} ICV @tab Y @tab
>  @item @code{requires} directive @tab P
> -      @tab complete but no non-host devices provides @code{unified_address},
> -      @code{unified_shared_memory} or @code{reverse_offload}
> +      @tab complete but no non-host devices provides @code{unified_address} or
> +      @code{unified_shared_memory}
>  @item @code{teams} construct outside an enclosing target region @tab Y @tab
>  @item Non-rectangular loop nests @tab Y @tab
>  @item @code{!=} as relational-op in canonical loop form for C/C++ @tab Y @tab
> @@ -228,7 +228,7 @@ The OpenMP 4.5 specification is fully supported.
>  @item @code{allocate} clause @tab P @tab Initial support
>  @item @code{use_device_addr} clause on @code{target data} @tab Y @tab
>  @item @code{ancestor} modifier on @code{device} clause
> -      @tab Y @tab See comment for @code{requires}
> +      @tab Y @tab Host fallback with GCN devices
>  @item Implicit declare target directive @tab Y @tab
>  @item Discontiguous array section with @code{target update} construct
>        @tab N @tab
> @@ -288,7 +288,7 @@ The OpenMP 4.5 specification is fully supported.
>        @code{append_args} @tab N @tab
>  @item @code{dispatch} construct @tab N @tab
>  @item device-specific ICV settings with environment variables @tab Y @tab
> -@item @code{assume} directive @tab Y @tab
> +@item @code{assume} and @code{assumes} directives @tab Y @tab
>  @item @code{nothing} directive @tab Y @tab
>  @item @code{error} directive @tab Y @tab
>  @item @code{masked} construct @tab Y @tab
> @@ -4456,6 +4456,9 @@ The implementation remark:
>  @item I/O within OpenMP target regions and OpenACC parallel/kernels is supported
>        using the C library @code{printf} functions and the Fortran
>        @code{print}/@code{write} statements.
> +@item OpenMP code that has a requires directive with @code{unified_address},
> +      @code{unified_shared_memory} or @code{reverse_offload} will remove
> +      any GCN device from the list of available devices (``host fallback'').
>  @end itemize
>
>
> @@ -4507,6 +4510,15 @@ The implementation remark:
>  @item Compilation OpenMP code that contains @code{requires reverse_offload}
>        requires at least @code{-march=sm_35}, compiling for @code{-march=sm_30}
>        is not supported.
> +@item For code containing reverse offload (i.e. @code{target} regions with
> +      @code{device(ancestor:1)}), there is a slight performance penality
> +      for @emph{all} target regions, consisting mostly of shutdown delay
> +      Per device, reverse offload regions are processed serial such that

s/serial/serially/ ?

> +      the next reverse offload region is only executed after the previous
> +      one returns.
> +@item OpenMP code that has a requires directive with @code{unified_address}
> +      or @code{unified_shared_memory} will remove any nvptx device from the
> +      list of available devices (``host fallback'').
>  @end itemize

Otherwise LGTM

	Jakub