From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29])
 by sourceware.org (Postfix) with ESMTPS id C072C3858405
 for ; Wed, 30 Mar 2022 11:46:01 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org C072C3858405
Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512)
 (No client certificate requested)
 by smtp-out2.suse.de (Postfix) with ESMTPS id 7D3541F37B;
 Wed, 30 Mar 2022 11:46:00 +0000 (UTC)
Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512)
 (No client certificate requested)
 by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 4AE9813A60;
 Wed, 30 Mar 2022 11:46:00 +0000 (UTC)
Received: from dovecot-director2.suse.de ([192.168.254.65])
 by imap2.suse-dmz.suse.de with ESMTPSA id GOdoEPhCRGLGIQAAMHmgww
 (envelope-from ); Wed, 30 Mar 2022 11:46:00 +0000
Content-Type: multipart/mixed; boundary="------------SHE5HDqyIwIhHCgtGfpcSaJi"
Message-ID: <1726a02d-c669-9a1c-22ed-6bcae68ef9d5@suse.de>
Date: Wed, 30 Mar 2022 13:45:59 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0
Subject: Re: [PATCH][nvptx, doc] Update misa and mptx, add march and march-map
Content-Language: en-US
To: Tobias Burnus , gcc-patches@gcc.gnu.org
Cc: Thomas Schwinge
References: <20220329133931.GA5489@delia.home> <385a766e-5eab-3af3-1d65-7115b60506ba@codesourcery.com> <4264f563-36b8-37ba-3ea8-291a0dac5921@suse.de>
From: Tom de Vries
In-Reply-To:
X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE,
 URIBL_BLACK autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 30 Mar 2022 11:46:05 -0000

This is a multi-part message in MIME format.
--------------SHE5HDqyIwIhHCgtGfpcSaJi
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 3/30/22 11:02, Tobias Burnus wrote:
> On 30.03.22 10:03, Tom de Vries wrote:
>
>> On 3/29/22 16:47, Tobias Burnus wrote:
>>> I think it would be useful to have additionally some wording for the
>>> (new in GCC 12/new since today) macros,
> [...]
>> The macro is defined also if the option is not specified, so I think
>> this formulation is not 100% clear in that aspect.  I've reformulated
>> to fix that.
> Fine. (It was a copy, paste + modify from elsewhere.)
>
>> Also, I took out the detail of how the value is determined, since
>> we're just following __CUDA_ARCH__ rather than defining our own policy.
>
> OK. While I am not sure that it is obvious, also the example makes clear
> what value to expect. Combining the two, I concur that the details
> aren't required.
>
>> Any comments?
>
> LGTM.
>
> Tobias
>
>
> PS: Regarding the sm_30 -> sm_35 change (before in this email thread).
> That was not meant to be in the .texi file, but just as an item to
> remember when updating the wwwdocs / gcc-12/changes.html document.
>

I see, I misunderstood then.  FWIW, it's already added to the version in
my sandbox.

> It was/is also not completely clear to me whether there is still this
> CUDA 11.x issue of not supporting sm_30 (only sm_35 and higher) or not.

Thanks for reminding me of this issue.

> I assume it still exists but is mitigated at
> compiler-usage/libgomp-runtime-usage time as PTX ISA now defaults to 6.0
> such that CUDA – but shouldn't it still see sm_30 instead of sm_35 in
> this case?
>
> If so, I think it will still show up when using either explicitly PTX
> ISA 3.1 or when building GCC itself and all of the following holds:
> nvptx-tools is installed, CUDA (in a too new version) is installed
> (ptxas in $PATH), and the pending nvptx-tools pull request has not
> been applied that ignores the non-explicit '--verify' when .target sm_xx
> or PTX ISA .version is not supported by ptxas.

I don't think the 6.0 default has any influence (and I'll be using
-mptx=3.1 below to make sure we run into the worst-case behaviour).

Anyway, in the absence of an nvptx-tools fix I committed a work-around in
the compiler:
...
#define ASM_SPEC "%{misa=*:-m %*; :-m sm_35}%{misa=sm_30:--no-verify}"
...

Note that this was before reverting the default back to sm_30, and I
probably forgot to update this spot when changing the default.

So now, there are effectively two workarounds in place.

This (implicitly using sm_30) passes:
...
$ ( PATH=$PATH:~/cuda/11.6.0/bin; ./gcc.sh ~/hello.c -c -save-temps -Wa,--verify -mptx=3.1 )
...
because, as we can see with -v, sm_35 is used to verify:
...
./build-gcc/gcc/as -m sm_35 --verify -o hello.o hello.s
...

This (explicitly using sm_30) passes:
...
$ ( PATH=$PATH:~/cuda/11.6.0/bin; ./gcc.sh ~/hello.c -c -save-temps -march=sm_30 -mptx=3.1 )
...
because, as we can see with -v, the --no-verify workaround is triggered:
...
./build-gcc/gcc/as -m sm_30 --no-verify -o hello.o hello.s
...

But that one stops working once we use an explicit -Wa,--verify:
...
$ ( PATH=$PATH:~/cuda/11.6.0/bin; ./gcc.sh ~/hello.c -c -save-temps -Wa,--verify -march=sm_30 -mptx=3.1 )
ptxas fatal : Value 'sm_30' is not defined for option 'gpu-name'
nvptx-as: ptxas returned 255 exit status
...

So, it seems using sm_35 to verify sm_30 is the most robust workaround.
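
To make the difference between the two workarounds concrete, here is a
rough model in plain C of which arguments end up on the assembler command
line.  This is only a sketch: it does not use GCC's spec machinery, the
function names old_spec_args and new_spec_args are made up for
illustration, and "misa" stands for the value of -misa= as seen by the
spec (NULL when the option is absent).  It also ignores anything passed
with -Wa.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the spec currently in place:
   "%{misa=*:-m %*; :-m sm_35}%{misa=sm_30:--no-verify}".  */
static void
old_spec_args (const char *misa, char *buf, size_t len)
{
  assert (buf != NULL && len > 0);
  if (misa == NULL)
    /* Option absent: fall back to -m sm_35.  */
    snprintf (buf, len, "-m sm_35");
  else if (strcmp (misa, "sm_30") == 0)
    /* Explicit sm_30: passed through, verification switched off.  */
    snprintf (buf, len, "-m %s --no-verify", misa);
  else
    snprintf (buf, len, "-m %s", misa);
}

/* Hypothetical model of the idea "verify sm_30 using sm_35":
   "%{misa=sm_30:-m sm_35; misa=*:-m %*; :-m sm_35}".  */
static void
new_spec_args (const char *misa, char *buf, size_t len)
{
  assert (buf != NULL && len > 0);
  if (misa != NULL && strcmp (misa, "sm_30") == 0)
    /* Explicit sm_30: assemble and verify as sm_35.  */
    snprintf (buf, len, "-m sm_35");
  else if (misa != NULL)
    /* Any other explicit value is passed through unchanged.  */
    snprintf (buf, len, "-m %s", misa);
  else
    /* Implicit sm_30: also assemble and verify as sm_35.  */
    snprintf (buf, len, "-m sm_35");
}
```

In this model the explicit sm_30 case no longer relies on --no-verify, so
an additional -Wa,--verify has nothing to override and ptxas is never
asked about sm_30.
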
I'm currently testing attached patch. Thanks, - Tom --------------SHE5HDqyIwIhHCgtGfpcSaJi Content-Type: text/x-patch; charset=UTF-8; name="0001-nvptx-Fix-ASM_SPEC-workaround-for-sm_30.patch" Content-Disposition: attachment; filename="0001-nvptx-Fix-ASM_SPEC-workaround-for-sm_30.patch" Content-Transfer-Encoding: base64 W252cHR4XSBGaXggQVNNX1NQRUMgd29ya2Fyb3VuZCBmb3Igc21fMzAKCk5ld2VyIHZlcnNp b25zIG9mIENVREEgbm8gbG9uZ2VyIHN1cHBvcnQgc21fMzAsIGFuZCBudnB0eC10b29scyBh cwpjdXJyZW50bHkgZG9lc24ndCBoYW5kbGUgdGhhdCBncmFjZWZ1bGx5IHdoZW4gdmVyaWZ5 aW5nCiggaHR0cHM6Ly9naXRodWIuY29tL01lbnRvckVtYmVkZGVkL252cHR4LXRvb2xzL2lz c3Vlcy8zMCApLgoKVGhlcmUncyBhIC0tbm8tdmVyaWZ5IHdvcmstYXJvdW5kIGluIHBsYWNl IGluIEFTTV9TUEVDLCBidXQgdGhhdCBvbmUgZG9lc24ndAp3b3JrIHdoZW4gdXNpbmcgLVdh LC0tdmVyaWZ5IG9uIHRoZSBjb21tYW5kIGxpbmUuCgpVc2UgYSBtb3JlIHJvYnVzdCB3b3Jr YXJvdW5kOiB2ZXJpZnkgdXNpbmcgc21fMzUgd2hlbiBtaXNhPXNtXzMwIGlzIHNwZWNpZmll ZAooZWl0aGVyIGltcGxpY2l0bHkgb3IgZXhwbGljaXRseSkuCgpUZXN0ZWQgb24gbnZwdHgu CgpnY2MvQ2hhbmdlTG9nOgoKMjAyMi0wMy0zMCAgVG9tIGRlIFZyaWVzICA8dGRldnJpZXNA c3VzZS5kZT4KCgkqIGNvbmZpZy9udnB0eC9udnB0eC5oIChBU01fU1BFQyk6IFVzZSAiLW0g c21fMzUiIGZvciAtbWlzYT1zbV8zMC4KCi0tLQogZ2NjL2NvbmZpZy9udnB0eC9udnB0eC5o IHwgMjIgKysrKysrKysrKysrKysrKysrLS0tLQogMSBmaWxlIGNoYW5nZWQsIDE4IGluc2Vy dGlvbnMoKyksIDQgZGVsZXRpb25zKC0pCgpkaWZmIC0tZ2l0IGEvZ2NjL2NvbmZpZy9udnB0 eC9udnB0eC5oIGIvZ2NjL2NvbmZpZy9udnB0eC9udnB0eC5oCmluZGV4IDc1YWM3YTY2NmIx My4uM2IwNmYzMzAzMmZkIDEwMDY0NAotLS0gYS9nY2MvY29uZmlnL252cHR4L252cHR4LmgK KysrIGIvZ2NjL2NvbmZpZy9udnB0eC9udnB0eC5oCkBAIC0yOSwxMCArMjksMjQgQEAKIAog I2RlZmluZSBTVEFSVEZJTEVfU1BFQyAiJXttbWFpbmtlcm5lbDpjcnQwLm99IgogCi0vKiBE ZWZhdWx0IG5lZWRzIHRvIGJlIGluIHN5bmMgd2l0aCBkZWZhdWx0IGZvciBtaXNhIGluIG52 cHR4Lm9wdC4KLSAgIFdlIGFkZCBhIGRlZmF1bHQgaGVyZSB0byB3b3JrIGFyb3VuZCBhIGhh cmQtY29kZWQgc21fMzAgZGVmYXVsdCBpbgotICAgbnZwdHgtYXMuICAqLwotI2RlZmluZSBB U01fU1BFQyAiJXttaXNhPSo6LW0gJSo7IDotbSBzbV8zNX0le21pc2E9c21fMzA6LS1uby12 ZXJpZnl9IgorLyogTmV3ZXIgdmVyc2lvbnMgb2YgQ1VEQSBubyBsb25nZXIgc3VwcG9ydCBz 
bV8zMCwgYW5kIG52cHR4LXRvb2xzIGFzCisgICBjdXJyZW50bHkgZG9lc24ndCBoYW5kbGUg dGhhdCBncmFjZWZ1bGx5IHdoZW4gdmVyaWZ5aW5nCisgICAoIGh0dHBzOi8vZ2l0aHViLmNv bS9NZW50b3JFbWJlZGRlZC9udnB0eC10b29scy9pc3N1ZXMvMzAgKS4gIFdvcmsgYXJvdW5k CisgICB0aGlzIGJ5IHZlcmlmeWluZyB3aXRoIHNtXzM1IHdoZW4gaGF2aW5nIG1pc2E9c21f MzAgKGVpdGhlciBpbXBsaWNpdGx5CisgICBvciBleHBsaWNpdGx5KS4gICovCisjZGVmaW5l IEFTTV9TUEVDCQkJCVwKKyAgIiV7IgkJCQkJCVwKKyAgLyogRXhwbGljdCBtaXNhPXNtXzMw LiAgKi8JCQlcCisgICJtaXNhPXNtXzMwOi1tIHNtXzM1IgkJCQlcCisgIC8qIFNlcGFyYXRv ci4JICovCQkJCVwKKyAgIjsgIgkJCQkJCVwKKyAgLyogQ2F0Y2gtYWxsLgkgKi8JCQkJXAor ICAibWlzYT0qOi1tICUqIgkJCQlcCisgIC8qIFNlcGFyYXRvci4JICovCQkJCVwKKyAgIjsg IgkJCQkJCVwKKyAgLyogSW1wbGljaXQgbWlzYT1zbV8zMC4gICovCQkJXAorICAiOi1tIHNt XzM1IgkJCQkJXAorICAifSIKIAogI2RlZmluZSBUQVJHRVRfQ1BVX0NQUF9CVUlMVElOUygp IG52cHR4X2NwdV9jcHBfYnVpbHRpbnMgKCkKIAo= --------------SHE5HDqyIwIhHCgtGfpcSaJi--