From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <tdevries@suse.de>
Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29])
 by sourceware.org (Postfix) with ESMTPS id 530BB385800E
 for <gcc-patches@gcc.gnu.org>; Wed,  6 Apr 2022 09:57:59 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 530BB385800E
Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512)
 (No client certificate requested)
 by smtp-out2.suse.de (Postfix) with ESMTPS id 25A6E1F7AD;
 Wed,  6 Apr 2022 09:57:58 +0000 (UTC)
Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512)
 (No client certificate requested)
 by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id EA7F613A8E;
 Wed,  6 Apr 2022 09:57:57 +0000 (UTC)
Received: from dovecot-director2.suse.de ([192.168.254.65])
 by imap2.suse-dmz.suse.de with ESMTPSA id NIC9NyVkTWI0FgAAMHmgww
 (envelope-from <tdevries@suse.de>); Wed, 06 Apr 2022 09:57:57 +0000
Message-ID: <93217117-2cb5-e3de-a3d7-0faed46f4311@suse.de>
Date: Wed, 6 Apr 2022 11:57:57 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
 Thunderbird/91.7.0
Subject: Re: Proposal to remove '--with-cuda-driver' (was: [wwwdocs][patch]
 gcc-12: Nvptx updates)
Content-Language: en-US
To: Thomas Schwinge <thomas@codesourcery.com>, Jakub Jelinek <jakub@redhat.com>
Cc: gcc-patches@gcc.gnu.org, Tobias Burnus <tobias@codesourcery.com>,
 Roger Sayle <roger@nextmovesoftware.com>
References: <1ffa4e66-af1a-4392-795a-31a8f0047c92@codesourcery.com>
 <aff67d31-d328-171d-b3b7-a2886ee2ccc1@suse.de>
 <87ee2bh8a4.fsf@euler.schwinge.homeip.net>
From: Tom de Vries <tdevries@suse.de>
In-Reply-To: <87ee2bh8a4.fsf@euler.schwinge.homeip.net>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-8.0 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, NICE_REPLY_A, SPF_HELO_NONE,
 SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Apr 2022 09:58:01 -0000

On 4/5/22 17:14, Thomas Schwinge wrote:
> Hi!
> 
> Still catching up with GCC/nvptx back end changes...  %-)
> 
> 
> In the following I'm not discussing the patch to document
> "gcc-12: Nvptx updates", but rather one aspect of the
> "gcc-12: Nvptx updates" themselves.  ;-)
> 
> On 2022-03-30T14:27:41+0200, Tom de Vries <tdevries@suse.de> wrote:
>> +  <li>The <code>-march</code> flag has been added.  The <code>-misa</code>
>> +    flag is now considered an alias of the <code>-march</code> flag.</li>
>> +  <li>Support for PTX ISA target architectures <code>sm_53</code>,
>> +    <code>sm_70</code>, <code>sm_75</code> and <code>sm_80</code> has been
>> +    added.  These can be specified using the <code>-march</code> flag.</li>
>> +  <li>The default PTX ISA target architecture has been set back
>> +    to <code>sm_30</code>, to fix support for <code>sm_30</code> boards.</li>
>> +  <li>The <code>-march-map</code> flag has been added.  The
>> +    <code>-march-map</code> value will be mapped to an valid
>> +    <code>-march</code> flag value.  For instance,
>> +    <code>-march-map=sm_50</code> maps to <code>-march=sm_35</code>.
>> +    This can be used to specify that generated code is to be executed on a
>> +    board with at least some specific compute capability, without having to
>> +    know the valid values for the <code>-march</code> flag.</li>
> 
> Regarding the following:
> 
>>     <li>The <code>-mptx</code> flag has been added to specify the PTX ISA version
>>         for the generated code; permitted values are <code>3.1</code>
>> -      (default, matches previous GCC versions) and <code>6.3</code>.
>> +      (matches previous GCC versions), <code>6.0</code>, <code>6.3</code>,
>> +      and <code>7.0</code>. If not specified, the used version is the minimal
>> +      version required for <code>-march</code> but at least <code>6.0</code>.
>>     </li>
> 
> For "the PTX ISA version [used is] at least '6.0'", per
> <https://docs.nvidia.com/cuda/parallel-thread-execution/#release-notes>,
> this means we now require "CUDA 9.0, driver r384" (or more recent).

Well, that would be the case if there were no -mptx=3.1.

> Per <https://developer.nvidia.com/cuda-toolkit-archive>:
> "CUDA Toolkit 9.0 (Sept 2017)", so ~4.5 years old.
> Per <https://download.nvidia.com/XFree86/Linux-x86_64/>, I'm guessing a

I just see a list of version numbers there; I'm not sure what 
information you're referring to.

> similar timeframe for the imprecise "r384" Driver version stated in that
> table.  That should all be fine (re not mandating use of all-too-recent
> versions).
> 

I don't know what an imprecise driver is.

> Now, consider doing a GCC/nvptx offloading build with
> '--with-cuda-driver' pointing to CUDA 9.0 (or more recent).  This means
> that the libgomp nvptx plugin may now use CUDA Driver features of the
> CUDA 9.0 distribution ("driver r384", etc.) -- because that's what it is
> being 'configure'd and linked against.  (I say "may now use", because
> we're currently not making a lot of effort to use "modern" CUDA Driver
> features -- but we could, and probably should.  That's a separate
> discussion, of course.)  It then follows that the libgomp nvptx plugin
> has a hard dependency on CUDA Driver features of the CUDA 9.0
> distribution ("driver r384", etc.).  That's dependency as in ABI: via
> '*.so' symbol versions as well as internal CUDA interface configuration;
> see <cuda.h> doing different '#define's for different
> '__CUDA_API_VERSION' etc.)
> 
> Now assume one such dependency on "modern" CUDA Driver were not
> implemented by:
> 

Thanks for reminding me, I forgot about this configure option.

>> +  <li>An <code>mptx-3.1</code> multilib was added.  This allows using older
>> +      drivers which do not support PTX ISA version 6.0.</li>
> 
> ... this "old" CUDA Driver.  Then you do have the '-mptx-3.1' multilib to
> use with "old" CUDA Driver -- but you cannot actually use the libgomp
> nvptx plugin, because that's been built against "modern" CUDA Driver.
> 

I remember the following problem: using --with-cuda-driver to specify 
which CUDA driver interface (version) to link the libgomp plugin 
against, and then using an older driver in combination with that libgomp 
plugin.  We may then run into trouble, typically at libgomp plugin load 
time, with an error mentioning an unresolved symbol or an insufficient 
ABI symbol version.

So, do I understand it correctly that your point is that using -mptx=3.1 
doesn't fix that problem?

> Same problem, generally, for 'nvptx-run' of the nvptx-tools, which has
> similar CUDA Driver dependencies.
> 
> Now, that may currently be a latent problem only, because we're not
> actually making use of "modern" CUDA Driver features.  But, I'd like to
> resolve this "impedance mismatch", before we actually run into such
> problems.
> 

It would help me if you could give an example of a modification to the 
libgomp plugin that would cause trouble in combination with -mptx=3.1.

> Already long ago Jakub put in changes to use '--without-cuda-driver' to
> "Allow building GCC with PTX offloading even without CUDA being installed
> (gcc and nvptx-tools patches)": "Especially for distributions it is
> undesirable to need to have proprietary CUDA libraries and headers
> installed when building GCC.", and I understand GNU/Linux distributions
> all use that.  That configuration uses the GCC-provided
> 'libgomp/plugin/cuda/cuda.h', 'libgomp/plugin/cuda-lib.def' to manually
> define the CUDA Driver ABI to use, and then 'dlopen("libcuda.so.1")'.
> (Similar to what the libgomp GCN (and before: HSA) plugin is doing, for
> example.)  Quite likely that our group (at work) are the only ones to
> actually use '--with-cuda-driver'?
> 

Right, I see in my scripts that I don't use --with-cuda-driver, possibly 
because I ran into issues years ago when switching drivers back and 
forth.

> My proposal now is: we remove '--with-cuda-driver' (make its use a no-op,
> per standard GNU Autoconf behavior), and offer '--without-cuda-driver'
> only.  This shouldn't cause any user-visible change in behavior, so safe
> without a prior deprecation phase.
> 

I think the dlopen use-case is the most flexible, and I don't see any 
user benefit from using --with-cuda-driver, so I don't see a problem 
with removing --with-cuda-driver for the user.
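
To illustrate why I consider the dlopen approach more flexible, here is 
a minimal sketch.  It is not the actual libgomp plugin code, and 
probe_symbol is a hypothetical helper I made up for this mail.  The idea 
is that a missing library or missing entry point becomes a condition the 
program can check at run time, instead of a failure when the plugin is 
loaded:

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper, not part of libgomp: report whether SYMBOL can
   be resolved in the shared library LIBNAME at run time.
   Returns 1 if the symbol is found, 0 if the library loads but does
   not provide the symbol, and -1 if the library cannot be loaded.  */
static int
probe_symbol (const char *libname, const char *symbol)
{
  /* Load the library on demand rather than linking against it.  */
  void *handle = dlopen (libname, RTLD_LAZY);
  if (handle == NULL)
    return -1;

  /* Look up the entry point by name; NULL means it is not exported.  */
  void *addr = dlsym (handle, symbol);
  int found = (addr != NULL);

  dlclose (handle);
  return found;
}
```

A caller could use, for instance, probe_symbol ("libcuda.so.1", "cuInit") 
and fall back to host execution on any result other than 1.  On older 
glibc versions this needs linking with -ldl.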

I did wonder about keeping it available in some form, say, renamed to 
--maintainer-mode-with-cuda-driver.  This could be useful for debugging 
/ comparison purposes.  But it would mean having to test it when making 
relevant changes, which is a maintenance burden for a feature not 
visible to the user, so I guess that's not worth it.

So, I'm fine with removing.

Thanks,
- Tom