From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29])
 by sourceware.org (Postfix) with ESMTPS id 6AD3B3858427
 for ; Fri, 1 Apr 2022 15:34:52 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6AD3B3858427
Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512)
 (No client certificate requested)
 by smtp-out2.suse.de (Postfix) with ESMTPS id 3B5C71FD00;
 Fri, 1 Apr 2022 15:34:51 +0000 (UTC)
Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512)
 (No client certificate requested)
 by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 1FF24132C1;
 Fri, 1 Apr 2022 15:34:51 +0000 (UTC)
Received: from dovecot-director2.suse.de ([192.168.254.65])
 by imap2.suse-dmz.suse.de with ESMTPSA
 id tlDxBZsbR2JtHQAAMHmgww
 (envelope-from ); Fri, 01 Apr 2022 15:34:51 +0000
Content-Type: multipart/mixed; boundary="------------afY0oVk5tn5yA2HnulfHTsxm"
Message-ID: <37fabb8f-8273-f337-3e70-1795d957288e@suse.de>
Date: Fri, 1 Apr 2022 17:34:50 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0
Subject: Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90
Content-Language: en-US
To: Thomas Schwinge
Cc: Jakub Jelinek , gcc-patches@gcc.gnu.org
References: <20220401112438.GA19247@delia> <8735ixm1hr.fsf@euler.schwinge.homeip.net>
From: Tom de Vries
In-Reply-To: <8735ixm1hr.fsf@euler.schwinge.homeip.net>
X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_NUMSUBJECT,
 NICE_REPLY_A, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE
 autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 01 Apr 2022 15:34:55 -0000

This is a multi-part message in MIME format.
--------------afY0oVk5tn5yA2HnulfHTsxm
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 4/1/22 14:28, Thomas Schwinge wrote:
> Hi Tom!
>
> On 2022-04-01T13:24:40+0200, Tom de Vries wrote:
>> When running testcases libgomp.fortran/examples-4/declare_target-{1,2}.f90 on
>> an RTX A2000 (sm_86) with driver 510.60.02 and with GOMP_NVPTX_JIT=-O0 I run
>> into:
>> ...
>> FAIL: libgomp.fortran/examples-4/declare_target-1.f90 -O0 \
>> -DGOMP_NVPTX_JIT=-O0 execution test
>> FAIL: libgomp.fortran/examples-4/declare_target-2.f90 -O0 \
>> -DGOMP_NVPTX_JIT=-O0 execution test
>> ...
>>
>> Fix this by further limiting recursion depth in the test-cases for nvptx.
>>
>> Furthermore, make the recursion depth limiting nvptx-specific.
>
> Careful:
>
>> --- a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
>> +++ b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
>> @@ -1,4 +1,16 @@
>> ! { dg-do run }
>> +! { dg-additional-options "-cpp" }
>> +! Reduced from 25 to 23, otherwise execution runs out of thread stack on
>> +! Nvidia Titan V.
>> +! Reduced from 23 to 22, otherwise execution runs out of thread stack on
>> +! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
>> +! Reduced from 22 to 20, otherwise execution runs out of thread stack on
>> +! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
>> +! { dg-additional-options "-DREC_DEPTH=20" { target { offload_target_nvptx } } } */
>
> 'offload_target_nvptx' doesn't mean that offloading execution is done on
> nvptx, but rather that we're "*compiling* for offload target nvptx"
> (emphasis mine). That means, with such a change we're now getting
> different behavior in a system with an AMD GPU, when using a toolchain
> that only has GCN offloading configured vs. a toolchain that has GCN and
> nvptx offloading configured. This isn't going to cause any real
> problems, of course, but it's confusing, and a bad example of
> 'offload_target_nvptx'.
>
> 'offload_device_nvptx' ought to work: "using nvptx offload device".
>

Thanks for pointing that out.

I tried to understand this multiple-offloading configuration a bit, and
came up with the following mental model: it's possible to have a host
with, say, both an nvptx and an AMD offloading device, and then to
configure and build a toolchain that can generate a single executable
that can offload to either device, depending on the value of the
appropriate OpenACC/OpenMP environment variables.

So, in principle the libgomp testsuite could have a mode in which it
does that: run the same executable twice, once for each offloading
device.

In that case, even using offload_device_nvptx would not be accurate
enough, and we'd need to test for the offload device type at runtime, as
used to be done in libgomp/testsuite/libgomp.fortran/task-detach-6.f90.

I've tried to copy that setup to
libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90, but
that doesn't seem to work anymore. I've also tried copying that
test-case to
libgomp/testsuite/libgomp.fortran/copy-of-declare_target-1.f90 to rule
out any subdir-related problems, but no luck there either.

Attached is that copy approach. Could you try it out and see if it
works for you? Do you perhaps have an idea why it's failing?

I can make a patch using offload_device_nvptx, but I'd prefer to
understand first why the approach above isn't working.

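To make the runtime-selection idea concrete, here is a minimal host-only C sketch of picking the recursion depth when the program runs rather than through a dg- selector at compile time. It is an illustration under assumptions, not the attached test-case: pick_rec_depth and its device_is_nvptx argument are hypothetical names standing in for the result of a runtime query such as the on_device_arch_nvptx() call used in the attached copy, and no offloading is performed here. The depths 25 and 20 are the values from the patch.

```c
/* Hypothetical sketch: nothing here talks to a real offload device.
   device_is_nvptx stands in for the result of a runtime query such as
   on_device_arch_nvptx () from the libgomp testsuite helper.  */

/* Recursive Fibonacci, mirroring the shape of fib () in the test-case.  */
static int
fib (int n)
{
  if (n <= 0)
    return 0;
  if (n == 1)
    return 1;
  return fib (n - 1) + fib (n - 2);
}

/* Choose the recursion depth at run time: the reduced depth of 20 only
   when the device actually used is nvptx, the default of 25 otherwise.  */
static int
pick_rec_depth (int device_is_nvptx)
{
  return device_is_nvptx ? 20 : 25;
}
```

A caller would compute the depth once, for example as pick_rec_depth (on_device_arch_nvptx () != 0), and pass it to both the host and the offloaded computation, so that the same executable behaves correctly whichever device it ends up running on.
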
Thanks, - Tom --------------afY0oVk5tn5yA2HnulfHTsxm Content-Type: text/x-patch; charset=UTF-8; name="0001-libgomp-testsuite-Add-libgomp.fortran-copy-of-declare_target-1.f90.patch" Content-Disposition: attachment; filename*0="0001-libgomp-testsuite-Add-libgomp.fortran-copy-of-declare_t"; filename*1="arget-1.f90.patch" Content-Transfer-Encoding: base64 W2xpYmdvbXAvdGVzdHN1aXRlXSBBZGQgbGliZ29tcC5mb3J0cmFuL2NvcHktb2YtZGVjbGFy ZV90YXJnZXQtMS5mOTAKCi0tLQogLi4uL2xpYmdvbXAuZm9ydHJhbi9jb3B5LW9mLWRlY2xh cmVfdGFyZ2V0LTEuZjkwICAgfCA0OSArKysrKysrKysrKysrKysrKysrKysrCiAxIGZpbGUg Y2hhbmdlZCwgNDkgaW5zZXJ0aW9ucygrKQoKZGlmZiAtLWdpdCBhL2xpYmdvbXAvdGVzdHN1 aXRlL2xpYmdvbXAuZm9ydHJhbi9jb3B5LW9mLWRlY2xhcmVfdGFyZ2V0LTEuZjkwIGIvbGli Z29tcC90ZXN0c3VpdGUvbGliZ29tcC5mb3J0cmFuL2NvcHktb2YtZGVjbGFyZV90YXJnZXQt MS5mOTAKbmV3IGZpbGUgbW9kZSAxMDA2NDQKaW5kZXggMDAwMDAwMDAwMDAuLjZkY2Y1MzEy MDcwCi0tLSAvZGV2L251bGwKKysrIGIvbGliZ29tcC90ZXN0c3VpdGUvbGliZ29tcC5mb3J0 cmFuL2NvcHktb2YtZGVjbGFyZV90YXJnZXQtMS5mOTAKQEAgLTAsMCArMSw0OSBAQAorISB7 IGRnLWRvIHJ1biB9CishIHsgZGctYWRkaXRpb25hbC1zb3VyY2VzIG9uX2RldmljZV9hcmNo LmMgfQorCittb2R1bGUgZV81M18xX21vZAorICBpbnRlZ2VyIDo6IFRIUkVTSE9MRCA9IDIw Citjb250YWlucworICBpbnRlZ2VyIHJlY3Vyc2l2ZSBmdW5jdGlvbiBmaWIgKG4pIHJlc3Vs dCAoZikKKyAgICAhJG9tcCBkZWNsYXJlIHRhcmdldAorICAgIGludGVnZXIgOjogbgorICAg IGlmIChuIDw9IDApIHRoZW4KKyAgICAgIGYgPSAwCisgICAgZWxzZSBpZiAobiA9PSAxKSB0 aGVuCisgICAgICBmID0gMQorICAgIGVsc2UKKyAgICAgIGYgPSBmaWIgKG4gLSAxKSArIGZp YiAobiAtIDIpCisgICAgZW5kIGlmCisgIGVuZCBmdW5jdGlvbgorCisgIGludGVnZXIgZnVu Y3Rpb24gZmliX3dyYXBwZXIgKG4pCisgICAgaW50ZWdlciA6OiB4CisgICAgISRvbXAgdGFy Z2V0IG1hcCh0bzogbikgbWFwKGZyb206IHgpIGlmKG4gPiBUSFJFU0hPTEQpCisgICAgICB4 ID0gZmliIChuKQorICAgICEkb21wIGVuZCB0YXJnZXQKKyAgICBmaWJfd3JhcHBlciA9IHgK KyAgZW5kIGZ1bmN0aW9uCitlbmQgbW9kdWxlCisKK3Byb2dyYW0gZV81M18xCisgIHVzZSBl XzUzXzFfbW9kLCBvbmx5IDogZmliLCBmaWJfd3JhcHBlcgorICBpbnRlZ2VyIDo6IFJFQ19E RVBUSCA9IDI1CisKKyAgaW50ZXJmYWNlCisgICAgaW50ZWdlciBmdW5jdGlvbiBvbl9kZXZp 
Y2VfYXJjaF9udnB0eCgpIGJpbmQoQykKKyAgICBlbmQgZnVuY3Rpb24gb25fZGV2aWNlX2Fy Y2hfbnZwdHgKKyAgZW5kIGludGVyZmFjZQorCisgIGlmIChvbl9kZXZpY2VfYXJjaF9udnB0 eCAoKSAvPSAwKSB0aGVuCisgICAgICEgUmVkdWNlZCBmcm9tIDI1IHRvIDIzLCBvdGhlcndp c2UgZXhlY3V0aW9uIHJ1bnMgb3V0IG9mIHRocmVhZCBzdGFjayBvbgorICAgICAhIE52aWRp YSBUaXRhbiBWLgorICAgICAhIFJlZHVjZWQgZnJvbSAyMyB0byAyMiwgb3RoZXJ3aXNlIGV4 ZWN1dGlvbiBydW5zIG91dCBvZiB0aHJlYWQgc3RhY2sgb24KKyAgICAgISBOdmlkaWEgVDQw MCAoMkdCIHZhcmlhbnQpLCB3aGVuIHJ1biB3aXRoIEdPTVBfTlZQVFhfSklUPS1PMC4KKyAg ICAgISBSZWR1Y2VkIGZyb20gMjIgdG8gMjAsIG90aGVyd2lzZSBleGVjdXRpb24gcnVucyBv dXQgb2YgdGhyZWFkIHN0YWNrIG9uCisgICAgICEgTnZpZGlhIFJUWCBBMjAwMCAoNkdCIHZh cmlhbnQpLCB3aGVuIHJ1biB3aXRoIEdPTVBfTlZQVFhfSklUPS1PMC4KKyAgICAgUkVDX0RF UFRIID0gMjAKKyAgZW5kIGlmCisKKyAgaWYgKGZpYiAoMTUpIC89IGZpYl93cmFwcGVyICgx NSkpIHN0b3AgMQorICBpZiAoZmliIChSRUNfREVQVEgpIC89IGZpYl93cmFwcGVyIChSRUNf REVQVEgpKSBzdG9wIDIKK2VuZCBwcm9ncmFtCg== --------------afY0oVk5tn5yA2HnulfHTsxm--