public inbox for gcc-patches@gcc.gnu.org
From: Tom de Vries <tdevries@suse.de>
To: Thomas Schwinge <thomas@codesourcery.com>
Cc: Jakub Jelinek <jakub@redhat.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90
Date: Fri, 1 Apr 2022 17:34:50 +0200
Message-ID: <37fabb8f-8273-f337-3e70-1795d957288e@suse.de>
In-Reply-To: <8735ixm1hr.fsf@euler.schwinge.homeip.net>

[-- Attachment #1: Type: text/plain, Size: 3350 bytes --]

On 4/1/22 14:28, Thomas Schwinge wrote:
> Hi Tom!
> 
> On 2022-04-01T13:24:40+0200, Tom de Vries <tdevries@suse.de> wrote:
>> When running testcases libgomp.fortran/examples-4/declare_target-{1,2}.f90 on
>> an RTX A2000 (sm_86) with driver 510.60.02 and with GOMP_NVPTX_JIT=-O0 I run
>> into:
>> ...
>> FAIL: libgomp.fortran/examples-4/declare_target-1.f90 -O0 \
>>    -DGOMP_NVPTX_JIT=-O0 execution test
>> FAIL: libgomp.fortran/examples-4/declare_target-2.f90 -O0 \
>>    -DGOMP_NVPTX_JIT=-O0 execution test
>> ...
>>
>> Fix this by further limiting recursion depth in the test-cases for nvptx.
>>
>> Furthermore, make the recursion depth limiting nvptx-specific.
> 
> Careful:
> 
>> --- a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
>> +++ b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
>> @@ -1,4 +1,16 @@
>>   ! { dg-do run }
>> +! { dg-additional-options "-cpp" }
>> +! Reduced from 25 to 23, otherwise execution runs out of thread stack on
>> +! Nvidia Titan V.
>> +! Reduced from 23 to 22, otherwise execution runs out of thread stack on
>> +! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
>> +! Reduced from 22 to 20, otherwise execution runs out of thread stack on
>> +! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
>> +! { dg-additional-options "-DREC_DEPTH=20" { target { offload_target_nvptx } } } */
> 
> 'offload_target_nvptx' doesn't mean that offloading execution is done on
> nvptx, but rather that we're "*compiling* for offload target nvptx"
> (emphasis mine).  That means, with such a change we're now getting
> different behavior in a system with an AMD GPU, when using a toolchain
> that only has GCN offloading configured vs. a toolchain that has GCN and
> nvptx offloading configured.  This isn't going to cause any real
> problems, of course, but it's confusing, and a bad example of
> 'offload_target_nvptx'.
> 
> 'offload_device_nvptx' ought to work: "using nvptx offload device".
> 

Thanks for pointing that out.

I tried to understand this multiple-offloading configuration a bit, and
came up with the following mental model: it's possible to have a host
with, say, both an nvptx and an AMD offloading device, and then
configure and build a toolchain that can generate a single executable
that can offload to either device, depending on the value of the
appropriate OpenACC/OpenMP environment variables.

So, in principle the libgomp testsuite could have a mode in which it 
does that: run the same executable twice, once for each offloading 
device.  In that case, even using offload_device_nvptx would not be 
accurate enough, and we'd need to test for offload device type at 
runtime, as used to be done in 
libgomp/testsuite/libgomp.fortran/task-detach-6.f90.
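
For illustration, a runtime check of that kind could look roughly like
the C sketch below.  This is not the testsuite's actual
on_device_arch.c: the names is_nvptx, is_nvptx_variant,
running_on_nvptx and pick_rec_depth are made up here, and the sketch
assumes a compiler that supports the OpenMP 5.0 'declare variant'
directive.  Without -fopenmp the pragmas are ignored and everything
runs on the host.

```c
/* Hypothetical sketch of a runtime "which device am I on?" check.
   All function names are illustrative assumptions, not libgomp API.  */

/* Variant meant to be selected when the call sits inside a target
   region that is compiled for an nvptx device.  */
int
is_nvptx_variant (void)
{
  return 1;
}

/* Base version: used on the host and on any non-nvptx device.  */
#pragma omp declare variant (is_nvptx_variant) \
  match (construct={target}, device={arch(nvptx)})
int
is_nvptx (void)
{
  return 0;
}

/* Run a tiny target region and report what the executing device saw.
   If no offload device is available, the region falls back to the
   host and the result is 0.  */
int
running_on_nvptx (void)
{
  int res = 0;
#pragma omp target map(from: res)
  res = is_nvptx ();
  return res;
}

/* Pick the recursion depth at run time: 25 by default, 20 on nvptx,
   mirroring the values discussed in this thread.  */
int
pick_rec_depth (void)
{
  return running_on_nvptx () ? 20 : 25;
}
```

A test could then call pick_rec_depth () once at startup instead of
hard-coding the depth via a dg-additional-options selector, so the
same executable would adapt to whichever device it ends up running on.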

I've tried to copy that setup to 
libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90, but 
that doesn't seem to work anymore.  I've also tried copying that 
test-case to 
libgomp/testsuite/libgomp.fortran/copy-of-declare_target-1.f90 to rule 
out any subdir-related problems, but no luck there either.

Attached is that copy approach; could you try it out and see if it works
for you?

Do you perhaps have an idea why it's failing?

I can make a patch using offload_device_nvptx, but I'd prefer to 
understand first why the approach above isn't working.

Thanks,
- Tom

[-- Attachment #2: 0001-libgomp-testsuite-Add-libgomp.fortran-copy-of-declare_target-1.f90.patch --]
[-- Type: text/x-patch, Size: 1864 bytes --]

[libgomp/testsuite] Add libgomp.fortran/copy-of-declare_target-1.f90

---
 .../libgomp.fortran/copy-of-declare_target-1.f90   | 49 ++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/libgomp/testsuite/libgomp.fortran/copy-of-declare_target-1.f90 b/libgomp/testsuite/libgomp.fortran/copy-of-declare_target-1.f90
new file mode 100644
index 00000000000..6dcf5312070
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/copy-of-declare_target-1.f90
@@ -0,0 +1,49 @@
+! { dg-do run }
+! { dg-additional-sources on_device_arch.c }
+
+module e_53_1_mod
+  integer :: THRESHOLD = 20
+contains
+  integer recursive function fib (n) result (f)
+    !$omp declare target
+    integer :: n
+    if (n <= 0) then
+      f = 0
+    else if (n == 1) then
+      f = 1
+    else
+      f = fib (n - 1) + fib (n - 2)
+    end if
+  end function
+
+  integer function fib_wrapper (n)
+    integer :: x
+    !$omp target map(to: n) map(from: x) if(n > THRESHOLD)
+      x = fib (n)
+    !$omp end target
+    fib_wrapper = x
+  end function
+end module
+
+program e_53_1
+  use e_53_1_mod, only : fib, fib_wrapper
+  integer :: REC_DEPTH = 25
+
+  interface
+    integer function on_device_arch_nvptx() bind(C)
+    end function on_device_arch_nvptx
+  end interface
+
+  if (on_device_arch_nvptx () /= 0) then
+     ! Reduced from 25 to 23, otherwise execution runs out of thread stack on
+     ! Nvidia Titan V.
+     ! Reduced from 23 to 22, otherwise execution runs out of thread stack on
+     ! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
+     ! Reduced from 22 to 20, otherwise execution runs out of thread stack on
+     ! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
+     REC_DEPTH = 20
+  end if
+
+  if (fib (15) /= fib_wrapper (15)) stop 1
+  if (fib (REC_DEPTH) /= fib_wrapper (REC_DEPTH)) stop 2
+end program

