public inbox for fortran@gcc.gnu.org
From: Thomas Schwinge <thomas@codesourcery.com>
To: Hafiz Abid Qadeer <abidh@codesourcery.com>
Cc: Jakub Jelinek <jakub@redhat.com>,
	Tobias Burnus <tobias@codesourcery.com>,
	 <gcc-patches@gcc.gnu.org>, <fortran@gcc.gnu.org>
Subject: Re: [PATCH] [gfortran] Add support for allocate clause (OpenMP 5.0).
Date: Fri, 21 Jan 2022 18:15:58 +0100
Message-ID: <8735lh6mcx.fsf@euler.schwinge.homeip.net>
In-Reply-To: <fddcdfcf-3fab-1674-722e-2756a1d6aef8@mentor.com>

Hi Abid!

On 2022-01-11T22:31:54+0000, Hafiz Abid Qadeer <abid_qadeer@mentor.com> wrote:
> From d1fb55bff497a20e6feefa50bd03890e7a903c0e Mon Sep 17 00:00:00 2001
> From: Hafiz Abid Qadeer <abidh@codesourcery.com>
> Date: Fri, 24 Sep 2021 10:04:12 +0100
> Subject: [PATCH] [gfortran] Add support for allocate clause (OpenMP 5.0).
>
> This patch adds support for OpenMP 5.0 allocate clause for fortran. It does not
> yet support the allocator-modifier as specified in OpenMP 5.1. The allocate
> clause is already supported in C/C++.

> libgomp/ChangeLog:
>
>       * testsuite/libgomp.fortran/allocate-1.c: New test.
>       * testsuite/libgomp.fortran/allocate-1.f90: New test.

I'm seeing this test case randomly/non-deterministically FAIL to execute,
differently on different systems and runs, for example:

    libgomp:
    libgomp:
    libgomp: Out of memory allocating 4 bytesOut of memory allocating 4 bytes
    libgomp:
    libgomp:
    libgomp: Out of memory allocating 168 bytes

    libgomp: Out of memory allocating 4 bytes

    libgomp: Out of memory allocating 4 bytes

    libgomp: Out of memory allocating 4 bytes

I'd assume there's some concurrency issue: the problem disappears if I
manually specify a lowerish 'OMP_NUM_THREADS', and conversely, on a
system where I don't normally see the FAILs, I can trigger them with a
largish 'OMP_NUM_THREADS', such as 'OMP_NUM_THREADS=18' and higher.

For example:

    Thread 10 "a.out" hit Breakpoint 1, omp_aligned_alloc (alignment=4, size=4, allocator=6326576) at [...]/source-gcc/libgomp/allocator.c:318
    318       if (allocator_data)
    (gdb) print *allocator_data
    $1 = {memspace = omp_default_mem_space, alignment = 64, pool_size = 8192, used_pool_size = 8188, fb_data = omp_null_allocator, sync_hint = 3, access = 7, fallback = 12, pinned = 0, partition = 15}

Given the high 'used_pool_size', is that to be expected, meaning the test
case just shouldn't be requesting "so much" memory?  Or might the problem
actually be in 'libgomp/allocator.c' (not touched by your commit)?
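
A rough sanity check on that 'used_pool_size' value, as a sketch only.  It
assumes that each request is charged its size, plus a bookkeeping header of
four pointer-sized fields (32 bytes on x86_64), plus 'alignment' minus one
pointer of padding.  That overhead model is an assumption, not something
stated in this email, and the helper names are made up for illustration:

```c
#include <stddef.h>

/* ASSUMPTION: per-allocation bookkeeping header of four pointer-sized
   fields (32 bytes on x86_64).  Not taken from this email.  */
#define ASSUMED_HEADER_BYTES 32u
#define ASSUMED_POINTER_BYTES 8u

/* Hypothetical helper: bytes charged to the pool for one request.  */
static size_t
pool_cost (size_t size, size_t alignment)
{
  size_t cost = size + ASSUMED_HEADER_BYTES;
  /* Extra slack so the payload can be moved up to an aligned address.  */
  if (alignment > ASSUMED_POINTER_BYTES)
    cost += alignment - ASSUMED_POINTER_BYTES;
  return cost;
}

/* Hypothetical helper: largest thread count that fits in the pool if every
   thread holds 'vars_per_thread' live allocations of 'size' bytes at once.  */
static size_t
max_threads (size_t pool_size, size_t vars_per_thread,
             size_t size, size_t alignment)
{
  return pool_size / (vars_per_thread * pool_cost (size, alignment));
}
```

Under these assumptions, 'pool_cost (4, 64)' is 92 bytes, and 89 such
allocations add up to exactly the 8188 bytes shown above.  If the five
variables in 'allocate (h: x, y, r, l, n)' at line 122 of the test case are
each a separate, simultaneously live 4-byte allocation per thread, then
'max_threads (8192, 5, 4, 64)' is 17, which would be consistent with FAILs
starting at 'OMP_NUM_THREADS=18'.  This does not settle whether the test
case or the allocator should change.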

All but Thread 10 are in 'gomp_team_barrier_wait_end' -- should memory
have been released at that point?

    (gdb) thread apply 10 bt

    Thread 10 (Thread 0x7ffff32e2700 (LWP 1601318)):
    #0  omp_aligned_alloc (alignment=4, size=4, allocator=6326576) at [...]/source-gcc/libgomp/allocator.c:320
    #1  0x00007ffff790b4db in GOMP_alloc (alignment=4, size=4, allocator=6326576) at [...]/source-gcc/libgomp/allocator.c:364
    #2  0x0000000000401f3f in foo_._omp_fn.3 () at source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:136
    #3  0x00007ffff78f31e6 in gomp_thread_start (xdata=<optimized out>) at [...]/source-gcc/libgomp/team.c:129
    #4  0x00007ffff789e609 in start_thread (arg=<optimized out>) at pthread_create.c:477
    #5  0x00007ffff77c5293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
    (gdb) thread apply 1 bt

    Thread 1 (Thread 0x7ffff72ec1c0 (LWP 1601309)):
    #0  futex_wait (val=96, addr=<optimized out>) at [...]/source-gcc/libgomp/config/linux/x86/futex.h:97
    #1  do_wait (val=96, addr=<optimized out>) at [...]/source-gcc/libgomp/config/linux/wait.h:67
    #2  gomp_team_barrier_wait_end (bar=<optimized out>, state=96) at [...]/source-gcc/libgomp/config/linux/bar.c:112
    #3  0x0000000000401f53 in foo_._omp_fn.3 () at source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:136
    #4  0x00007ffff78ea4f2 in GOMP_parallel (fn=0x401e6b <foo_._omp_fn.3>, data=0x7fffffffd450, num_threads=18, flags=0) at [...]/source-gcc/libgomp/parallel.c:178
    #5  0x00000000004012ab in foo (x=42, p=..., q=..., px=2, h=6326576, fl=0) at source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:122
    #6  0x00000000004018e9 in MAIN__ () at source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:326

Manually compiling the test case, I see a lot of '-Wtabs' diagnostics
(can be ignored, I suppose), but also:

    source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:11:47:

       11 |     integer(c_int) function is_64bit_aligned (a) bind(C)
          |                                               1
    Warning: Variable ‘a’ at (1) is a dummy argument of the BIND(C) procedure ‘is_64bit_aligned’ but may not be C interoperable [-Wc-binding-type]

Is that something to worry about?

And:

    source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:31:19:

       31 |   integer  :: n, n1, n2, n3, n4
          |                   1
    Warning: Unused variable ‘n1’ declared at (1) [-Wunused-variable]
    source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:18:27:

       18 | subroutine foo (x, p, q, px, h, fl)
          |                           1
    Warning: Unused dummy argument ‘px’ at (1) [-Wunused-dummy-argument]

For reference, I'm quoting the new Fortran test case below.


Regards
 Thomas


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/allocate-1.c
> @@ -0,0 +1,7 @@
> +#include <stdint.h>
> +
> +int
> +is_64bit_aligned_ (uintptr_t a)
> +{
> +  return ( (a & 0x3f) == 0);
> +}

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/allocate-1.f90
> @@ -0,0 +1,333 @@
> +! { dg-do run }
> +! { dg-additional-sources allocate-1.c }
> +! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
> +
> +module m
> +  use omp_lib
> +  use iso_c_binding
> +  implicit none
> +
> +  interface
> +    integer(c_int) function is_64bit_aligned (a) bind(C)
> +      import :: c_int
> +      integer  :: a
> +    end
> +  end interface
> +end module m
> +
> +subroutine foo (x, p, q, px, h, fl)
> +  use omp_lib
> +  use iso_c_binding
> +  integer  :: x
> +  integer, dimension(4) :: p
> +  integer, dimension(4) :: q
> +  integer  :: px
> +  integer (kind=omp_allocator_handle_kind) :: h
> +  integer  :: fl
> +
> +  integer  :: y
> +  integer  :: r, i, i1, i2, i3, i4, i5
> +  integer  :: l, l3, l4, l5, l6
> +  integer  :: n, n1, n2, n3, n4
> +  integer  :: j2, j3, j4
> +  integer, dimension(4) :: l2
> +  integer, dimension(4) :: r2
> +  integer, target  :: xo
> +  integer, target  :: yo
> +  integer, dimension(x) :: v
> +  integer, dimension(x) :: w
> +
> +  type s_type
> +    integer      :: a
> +    integer      :: b
> +  end type
> +
> +  type (s_type) :: s
> +  s%a = 27
> +  s%b = 29
> +  y = 0
> +  r = 0
> +  n = 8
> +  n2 = 9
> +  n3 = 10
> +  n4 = 11
> +  xo = x
> +  yo = y
> +
> +  do i = 1, 4
> +    r2(i) = 0;
> +  end do
> +
> +  do i = 1, 4
> +    p(i) = 0;
> +  end do
> +
> +  do i = 1, 4
> +    q(i) = 0;
> +  end do
> +
> +  do i = 1, x
> +    w(i) = i
> +  end do
> +
> +  !$omp parallel private (y, v) firstprivate (x) allocate (x, y, v)
> +  if (x /= 42) then
> +    stop 1
> +  end if
> +  v(1) = 7
> +  if ( (and(fl, 2) /= 0) .and.          &
> +       ((is_64bit_aligned(x) == 0) .or. &
> +        (is_64bit_aligned(y) == 0) .or. &
> +        (is_64bit_aligned(v(1)) == 0))) then
> +      stop 2
> +  end if
> +
> +  !$omp barrier
> +  y = 1;
> +  x = x + 1
> +  v(1) = 7
> +  v(41) = 8
> +  !$omp barrier
> +  if (x /= 43 .or. y /= 1) then
> +    stop 3
> +  end if
> +  if (v(1) /= 7 .or. v(41) /= 8) then
> +    stop 4
> +  end if
> +  !$omp end parallel
> +
> +  !$omp teams
> +  !$omp parallel private (y) firstprivate (x, w) allocate (h: x, y, w)
> +
> +  if (x /= 42 .or. w(17) /= 17 .or. w(41) /= 41) then
> +    stop 5
> +  end if
> +  !$omp barrier
> +  y = 1;
> +  x = x + 1
> +  w(19) = w(19) + 1
> +  !$omp barrier
> +  if (x /= 43 .or. y /= 1 .or. w(19) /= 20) then
> +    stop 6
> +  end if
> +  if ( (and(fl, 1) /= 0) .and.          &
> +       ((is_64bit_aligned(x) == 0) .or. &
> +        (is_64bit_aligned(y) == 0) .or. &
> +        (is_64bit_aligned(w(1)) == 0))) then
> +    stop 7
> +  end if
> +  !$omp end parallel
> +  !$omp end teams
> +
> +  !$omp parallel do private (y) firstprivate (x)  reduction(+: r) allocate (h: x, y, r, l, n) lastprivate (l)  linear (n: 16)
> +  do i = 0, 63
> +    if (x /= 42) then
> +      stop 8
> +    end if
> +    y = 1;
> +    l = i;
> +    n = n + y + 15;
> +    r = r + i;
> +    if ( (and(fl, 1) /= 0) .and.          &
> +         ((is_64bit_aligned(x) == 0) .or. &
> +          (is_64bit_aligned(y) == 0) .or. &
> +          (is_64bit_aligned(r) == 0) .or. &
> +          (is_64bit_aligned(l) == 0) .or. &
> +          (is_64bit_aligned(n) == 0))) then
> +      stop 9
> +    end if
> +  end do
> +  !$omp end parallel do
> +
> +  !$omp parallel
> +    !$omp do lastprivate (l2) private (i1) allocate (h: l2, l3, i1) lastprivate (conditional: l3)
> +    do i1 = 0, 63
> +      l2(1) = i1
> +      l2(2) = i1 + 1
> +      l2(3) = i1 + 2
> +      l2(4) = i1 + 3
> +      if (i1 < 37) then
> +        l3 = i1
> +      end if
> +      if ( (and(fl, 1) /= 0) .and.          &
> +           ((is_64bit_aligned(l2(1)) == 0) .or. &
> +            (is_64bit_aligned(l3) == 0) .or. &
> +            (is_64bit_aligned(i1) == 0))) then
> +     stop 10
> +      end if
> +    end do
> +
> +    !$omp do collapse(2) lastprivate(l4, i2, j2) linear (n2:17) allocate (h: n2, l4, i2, j2)
> +    do i2 = 3, 4
> +      do j2 = 17, 22, 2
> +     n2 = n2 + 17
> +     l4 = i2 * 31 + j2
> +     if ( (and(fl, 1) /= 0) .and.          &
> +       ((is_64bit_aligned(l4) == 0) .or. &
> +       (is_64bit_aligned(n2) == 0) .or. &
> +       (is_64bit_aligned(i2) == 0) .or. &
> +       (is_64bit_aligned(j2) == 0))) then
> +       stop 11
> +     end if
> +      end do
> +    end do
> +
> +    !$omp do collapse(2) lastprivate(l5, i3, j3) linear (n3:17) schedule (static, 3) allocate (n3, l5, i3, j3)
> +    do i3 = 3, 4
> +      do j3 = 17, 22, 2
> +       n3 = n3 + 17
> +       l5 = i3 * 31 + j3
> +       if ( (and(fl, 2) /= 0) .and.      &
> +       ((is_64bit_aligned(l5) == 0) .or. &
> +       (is_64bit_aligned(n3) == 0) .or. &
> +       (is_64bit_aligned(i3) == 0) .or. &
> +       (is_64bit_aligned(j3) == 0))) then
> +       stop 12
> +     end if
> +      end do
> +    end do
> +
> +    !$omp do collapse(2) lastprivate(l6, i4, j4) linear (n4:17) schedule (dynamic) allocate (h: n4, l6, i4, j4)
> +    do i4 = 3, 4
> +      do j4 = 17, 22,2
> +       n4 = n4 + 17;
> +       l6 = i4 * 31 + j4;
> +     if ( (and(fl, 1) /= 0) .and.          &
> +       ((is_64bit_aligned(l6) == 0) .or. &
> +       (is_64bit_aligned(n4) == 0) .or. &
> +       (is_64bit_aligned(i4) == 0) .or. &
> +       (is_64bit_aligned(j4) == 0))) then
> +       stop 13
> +     end if
> +      end do
> +    end do
> +
> +    !$omp do lastprivate (i5) allocate (i5)
> +    do i5 = 1, 17, 3
> +      if ( (and(fl, 2) /= 0) .and.          &
> +        (is_64bit_aligned(i5) == 0)) then
> +     stop 14
> +      end if
> +    end do
> +
> +    !$omp do reduction(+:p, q, r2) allocate(h: p, q, r2)
> +    do i = 0, 31
> +     p(3) = p(3) +  i;
> +     p(4) = p(4) + (2 * i)
> +     q(1) = q(1) + (3 * i)
> +     q(3) = q(3) + (4 * i)
> +     r2(1) = r2(1) + (5 * i)
> +     r2(4) = r2(4) + (6 * i)
> +     if ( (and(fl, 1) /= 0) .and.          &
> +       ((is_64bit_aligned(q(1)) == 0) .or. &
> +       (is_64bit_aligned(p(1)) == 0) .or. &
> +       (is_64bit_aligned(r2(1)) == 0) )) then
> +       stop 15
> +     end if
> +    end do
> +
> +    !$omp task private(y) firstprivate(x) allocate(x, y)
> +    if (x /= 42) then
> +      stop 16
> +    end if
> +
> +    if ( (and(fl, 2) /= 0) .and.          &
> +      ((is_64bit_aligned(x) == 0) .or. &
> +      (is_64bit_aligned(y) == 0) )) then
> +      stop 17
> +    end if
> +    !$omp end task
> +
> +    !$omp task private(y) firstprivate(x) allocate(h: x, y)
> +    if (x /= 42) then
> +      stop 16
> +    end if
> +
> +    if ( (and(fl, 1) /= 0) .and.          &
> +      ((is_64bit_aligned(x) == 0) .or. &
> +      (is_64bit_aligned(y) == 0) )) then
> +      stop 17
> +    end if
> +    !$omp end task
> +
> +    !$omp task private(y) firstprivate(s) allocate(s, y)
> +    if (s%a /= 27 .or. s%b /= 29) then
> +      stop 18
> +    end if
> +
> +    if ( (and(fl, 2) /= 0) .and.          &
> +      ((is_64bit_aligned(s%a) == 0) .or. &
> +      (is_64bit_aligned(y) == 0) )) then
> +      stop 19
> +    end if
> +    !$omp end task
> +
> +    !$omp task private(y) firstprivate(s) allocate(h: s, y)
> +    if (s%a /= 27 .or. s%b /= 29) then
> +      stop 18
> +    end if
> +
> +    if ( (and(fl, 1) /= 0) .and.          &
> +      ((is_64bit_aligned(s%a) == 0) .or. &
> +      (is_64bit_aligned(y) == 0) )) then
> +      stop 19
> +    end if
> +    !$omp end task
> +
> +  !$omp end parallel
> +
> +  if (r /= ((64 * 63) / 2) .or. l /= 63 .or. n /= (8 + 16 * 64)) then
> +    stop 20
> +  end if
> +
> +  if (l2(1) /= 63 .or. l2(2) /= 64 .or. l2(3) /= 65 .or. l2(4) /= 66 .or. l3 /= 36) then
> +    stop 21
> +  end if
> +
> +  if (i2 /= 5 .or. j2 /= 23 .or. n2 /= (9 + (17 * 6)) .or. l4 /= (4 * 31 + 21)) then
> +    stop 22
> +  end if
> +
> +  if (i3 /= 5 .or. j3 /= 23 .or. n3 /= (10 + (17 * 6))  .or. l5 /= (4 * 31 + 21)) then
> +    stop 23
> +  end if
> +
> +  if (i4 /= 5 .or. j4 /= 23 .or. n4 /= (11 + (17 * 6))  .or. l6 /= (4 * 31 + 21)) then
> +    stop 24
> +  end if
> +
> +  if (i5 /= 19) then
> +    stop 24
> +  end if
> +
> +  if (p(3) /= ((32 * 31) / 2) .or. p(4) /= (2 * p(3))         &
> +      .or. q(1) /= (3 * p(3)) .or. q(3) /= (4 * p(3))         &
> +      .or. r2(1) /= (5 * p(3)) .or. r2(4) /= (6 * p(3))) then
> +    stop 25
> +  end if
> +
> +end subroutine
> +
> +program main
> +  use omp_lib
> +  integer, dimension(4) :: p
> +  integer, dimension(4) :: q
> +
> +  type (omp_alloctrait) :: traits(3)
> +  integer (omp_allocator_handle_kind) :: a
> +
> +  traits = [omp_alloctrait (omp_atk_alignment, 64), &
> +            omp_alloctrait (omp_atk_fallback, omp_atv_null_fb), &
> +            omp_alloctrait (omp_atk_pool_size, 8192)]
> +  a = omp_init_allocator (omp_default_mem_space, 3, traits)
> +  if (a == omp_null_allocator) stop 1
> +
> +  call omp_set_default_allocator (omp_default_mem_alloc);
> +  call foo (42, p, q, 2, a, 0);
> +  call foo (42, p, q, 2, omp_default_mem_alloc, 0);
> +  call foo (42, p, q, 2, a, 1);
> +  call omp_set_default_allocator (a);
> +  call foo (42, p, q, 2, omp_null_allocator, 3);
> +  call foo (42, p, q, 2, omp_default_mem_alloc, 2);
> +  call omp_destroy_allocator (a);
> +end
