From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from mail.ispras.ru (mail.ispras.ru [83.149.199.84])
	by sourceware.org (Postfix) with ESMTPS id 1DDA63858C20
	for ; Tue, 11 Oct 2022 11:12:06 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1DDA63858C20
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=ispras.ru
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=ispras.ru
Received: from [10.10.3.121] (unknown [10.10.3.121])
	by mail.ispras.ru (Postfix) with ESMTPS id 317844385197;
	Tue, 11 Oct 2022 11:12:04 +0000 (UTC)
DKIM-Filter: OpenDKIM Filter v2.11.0 mail.ispras.ru 317844385197
Date: Tue, 11 Oct 2022 14:12:04 +0300 (MSK)
From: Alexander Monakov
To: Jakub Jelinek
cc: Tobias Burnus , gcc-patches , Tom de Vries
Subject: Re: [Patch][v5] libgomp/nvptx: Prepare for reverse-offload callback handling
In-Reply-To:
Message-ID: <832946f-bb12-23d7-7d64-47b85c95125@ispras.ru>
References: <57b3ae5e-8f15-8bea-fa09-39bccbaa2414@codesourcery.com>
	<3f0fc49f-b07f-bee2-51a8-a5d03f1c33ed@codesourcery.com>
	<30e3ed49-0d14-8015-57ef-3d70b1dea69a@codesourcery.com>
	<3ebce406-46e4-8f98-8c53-83b61423644e@codesourcery.com>
	<798d7ee1-2ffa-a591-38cb-a9ad421265d0@codesourcery.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
X-Spam-Status: No, score=-3.1 required=5.0 tests=BAYES_00,KAM_DMARC_STATUS,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id:

On Tue, 11 Oct 2022, Jakub Jelinek wrote:

> So, does this mean one has to have gcc configured --with-arch=sm_70
> or later to make reverse offloading work (and then on the other
> side no support for older PTX arches at all)?
> If yes, I was kind of hoping we could arrange for it to be more
> user-friendly, build libgomp.a normally (sm_35 or what is the default),
> build the single TU in libgomp that needs the sm_70 stuff with -march=sm_70
> and arrange for mkoffload to link in the sm_70 stuff only if the user
> wants reverse offload (or has requires reverse_offload?).  In that case
> ignore sm_60 and older devices, if reverse offload isn't wanted, don't link
> in the part that needs sm_70 and make stuff working on sm_35 and later.
> Or perhaps have 2 versions of target.o, one sm_35 and one sm_70 and let
> mkoffload choose among them.

My understanding is such trickery should not be necessary with the
barrier-based approach, i.e. the sequence of PTX instructions

	st   % plain store
	membar.sys
	st.volatile

should be enough to guarantee that the former store is visible on the host
before the latter, and work all the way back to sm_20.

Alexander
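
For readers less familiar with the pattern, the store / barrier / flag-store
ordering described above can be sketched on the host side in C11. This is
only an analogy under stated assumptions, not the libgomp or nvptx code: the
struct and function names are hypothetical, atomic_thread_fence stands in
for membar.sys, and a relaxed atomic store stands in for st.volatile.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical mailbox shared between a producer and a consumer. */
struct mailbox {
    uint64_t payload;          /* written with a plain store */
    _Atomic uint32_t ready;    /* flag: written last, read first */
};

static void mailbox_init(struct mailbox *mb)
{
    mb->payload = 0;
    atomic_init(&mb->ready, 0);
}

/* Producer side: plain store, full fence, then the flag store. */
static void mailbox_publish(struct mailbox *mb, uint64_t value)
{
    mb->payload = value;                         /* analogous to: st */
    atomic_thread_fence(memory_order_seq_cst);   /* analogous to: membar.sys */
    atomic_store_explicit(&mb->ready, 1,
                          memory_order_relaxed); /* analogous to: st.volatile */
}

/* Consumer side: returns 1 and fills *out if a payload was published,
   0 otherwise.  The fence after reading the flag pairs with the
   producer's fence, so the payload read sees the published value. */
static int mailbox_try_consume(struct mailbox *mb, uint64_t *out)
{
    if (atomic_load_explicit(&mb->ready, memory_order_relaxed) == 0)
        return 0;
    atomic_thread_fence(memory_order_seq_cst);
    *out = mb->payload;
    atomic_store_explicit(&mb->ready, 0, memory_order_relaxed);
    return 1;
}
```

The point of the sketch is that the consumer polls only the flag; once it
observes the flag set, the fence pairing ensures the earlier plain store is
visible too, which is the property claimed for the PTX sequence.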