From: Thomas Schwinge
To: Hafiz Abid Qadeer
CC: Jakub Jelinek, Tobias Burnus
Subject: Re: [PATCH] [gfortran] Add support for allocate clause (OpenMP 5.0).
References: <20211022130502.2211568-1-abidh@codesourcery.com> <20211102162714.GF304296@tucnak> <20211220200650.GN2646553@tucnak>
Date: Fri, 21 Jan 2022 18:15:58 +0100
Message-ID: <8735lh6mcx.fsf@euler.schwinge.homeip.net>
List-Id: Fortran mailing list

Hi Abid!

On 2022-01-11T22:31:54+0000, Hafiz Abid Qadeer wrote:
> From d1fb55bff497a20e6feefa50bd03890e7a903c0e Mon Sep 17 00:00:00 2001
> From: Hafiz Abid Qadeer
> Date: Fri, 24 Sep 2021 10:04:12 +0100
> Subject: [PATCH] [gfortran] Add support for allocate clause (OpenMP 5.0).
>
> This patch adds support for OpenMP 5.0 allocate clause for fortran. It does not
> yet support the allocator-modifier as specified in OpenMP 5.1. The allocate
> clause is already supported in C/C++.

> libgomp/ChangeLog:
>
> 	* testsuite/libgomp.fortran/allocate-1.c: New test.
> 	* testsuite/libgomp.fortran/allocate-1.f90: New test.

I'm seeing this test case randomly/non-deterministically FAIL to execute,
differently on different systems and runs, for example:

    libgomp: libgomp: libgomp: Out of memory allocating 4 bytesOut of memory allocating 4 bytes

    libgomp: libgomp: libgomp: Out of memory allocating 168 bytes

    libgomp: Out of memory allocating 4 bytes
    libgomp: Out of memory allocating 4 bytes
    libgomp: Out of memory allocating 4 bytes

I'd assume there's some concurrency issue: the problem disappears if I
manually specify a lowerish 'OMP_NUM_THREADS', and conversely, on a
system where I don't normally see the FAILs, I can trigger them with a
largish 'OMP_NUM_THREADS', such as 'OMP_NUM_THREADS=18' and higher.

For example:

    Thread 10 "a.out" hit Breakpoint 1, omp_aligned_alloc (alignment=4, size=4, allocator=6326576) at [...]/source-gcc/libgomp/allocator.c:318
    318       if (allocator_data)
    (gdb) print *allocator_data
    $1 = {memspace = omp_default_mem_space, alignment = 64, pool_size = 8192, used_pool_size = 8188, fb_data = omp_null_allocator, sync_hint = 3, access = 7, fallback = 12, pinned = 0, partition = 15}

Given the high 'used_pool_size', is that to be expected, and the test
case shouldn't be requesting "so much" memory?  Or might the problem
actually be in 'libgomp/allocator.c' (not touched by your commit)?

All but Thread 10 are in 'gomp_team_barrier_wait_end' -- should memory
have been released at that point?

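To make the question about 'used_pool_size' concrete, here is a toy
model in C of a capped pool with a null fallback.  It is a sketch under
assumptions, not the code in 'libgomp/allocator.c': the names
'toy_pool', 'toy_pool_charge' and 'toy_max_threads' are made up, and the
per-request 'overhead' is a free parameter, not a value taken from
libgomp.  The only point it illustrates is that if every thread keeps
its privatized copies live at the same time, demand on a fixed
8192-byte pool grows linearly with the thread count, so some
'OMP_NUM_THREADS' will always exhaust it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a capped allocator pool.  NOT libgomp's implementation;
   all names here are hypothetical.  */
struct toy_pool
{
  size_t pool_size; /* Cap in bytes, in the spirit of omp_atk_pool_size.  */
  size_t used;      /* Bytes currently charged against the cap.  */
  size_t overhead;  /* ASSUMED bookkeeping/alignment cost per request.  */
};

/* Charge one request against the pool.  Returns false when the request
   does not fit, which with a null fallback means the allocation fails.  */
bool
toy_pool_charge (struct toy_pool *p, size_t size)
{
  size_t need = size + p->overhead;
  if (need > p->pool_size - p->used)
    return false;
  p->used += need;
  return true;
}

/* Number of threads that can each hold PER_THREAD live requests of SIZE
   bytes at the same time before the pool runs dry.  */
size_t
toy_max_threads (size_t pool_size, size_t overhead, size_t size,
                 size_t per_thread)
{
  struct toy_pool p = { pool_size, 0, overhead };
  size_t threads = 0;

  assert (size + overhead > 0 && per_thread > 0);
  for (;;)
    {
      for (size_t i = 0; i < per_thread; i++)
        if (!toy_pool_charge (&p, size))
          return threads;
      threads++;
    }
}
```

With an assumed overhead of 72 bytes and five 4-byte copies per thread,
'toy_max_threads (8192, 72, 4, 5)' yields 21, and a pool that already
has 'used = 8188' rejects a further 4-byte request as soon as the
overhead is nonzero.  These figures are illustrative only and are not
meant to reproduce the threshold of 18 threads seen above.
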
    (gdb) thread apply 10 bt

    Thread 10 (Thread 0x7ffff32e2700 (LWP 1601318)):
    #0  omp_aligned_alloc (alignment=4, size=4, allocator=6326576) at [...]/source-gcc/libgomp/allocator.c:320
    #1  0x00007ffff790b4db in GOMP_alloc (alignment=4, size=4, allocator=6326576) at [...]/source-gcc/libgomp/allocator.c:364
    #2  0x0000000000401f3f in foo_._omp_fn.3 () at source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:136
    #3  0x00007ffff78f31e6 in gomp_thread_start (xdata=) at [...]/source-gcc/libgomp/team.c:129
    #4  0x00007ffff789e609 in start_thread (arg=) at pthread_create.c:477
    #5  0x00007ffff77c5293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

    (gdb) thread apply 1 bt

    Thread 1 (Thread 0x7ffff72ec1c0 (LWP 1601309)):
    #0  futex_wait (val=96, addr=) at [...]/source-gcc/libgomp/config/linux/x86/futex.h:97
    #1  do_wait (val=96, addr=) at [...]/source-gcc/libgomp/config/linux/wait.h:67
    #2  gomp_team_barrier_wait_end (bar=, state=96) at [...]/source-gcc/libgomp/config/linux/bar.c:112
    #3  0x0000000000401f53 in foo_._omp_fn.3 () at source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:136
    #4  0x00007ffff78ea4f2 in GOMP_parallel (fn=0x401e6b , data=0x7fffffffd450, num_threads=18, flags=0) at [...]/source-gcc/libgomp/parallel.c:178
    #5  0x00000000004012ab in foo (x=42, p=..., q=..., px=2, h=6326576, fl=0) at source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:122
    #6  0x00000000004018e9 in MAIN__ () at source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:326


Manually compiling the test case, I see a lot of '-Wtabs' diagnostics
(can be ignored, I suppose), but also:

    source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:11:47:

       11 |     integer(c_int) function is_64bit_aligned (a) bind(C)
          |                                               1
    Warning: Variable ‘a’ at (1) is a dummy argument of the BIND(C) procedure ‘is_64bit_aligned’ but may not be C interoperable [-Wc-binding-type]

Is that something to
worry about?  And:

    source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:31:19:

       31 |   integer :: n, n1, n2, n3, n4
          |                  1
    Warning: Unused variable ‘n1’ declared at (1) [-Wunused-variable]
    source-gcc/libgomp/testsuite/libgomp.fortran/allocate-1.f90:18:27:

       18 | subroutine foo (x, p, q, px, h, fl)
          |                           1
    Warning: Unused dummy argument ‘px’ at (1) [-Wunused-dummy-argument]

For reference, quoting below the new Fortran test case.


Regards,
 Thomas


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/allocate-1.c
> @@ -0,0 +1,7 @@
> +#include
> +
> +int
> +is_64bit_aligned_ (uintptr_t a)
> +{
> +  return ( (a & 0x3f) == 0);
> +}

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/allocate-1.f90
> @@ -0,0 +1,333 @@
> +! { dg-do run }
> +! { dg-additional-sources allocate-1.c }
> +! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
> +
> +module m
> +  use omp_lib
> +  use iso_c_binding
> +  implicit none
> +
> +  interface
> +    integer(c_int) function is_64bit_aligned (a) bind(C)
> +      import :: c_int
> +      integer :: a
> +    end
> +  end interface
> +end module m
> +
> +subroutine foo (x, p, q, px, h, fl)
> +  use omp_lib
> +  use iso_c_binding
> +  integer :: x
> +  integer, dimension(4) :: p
> +  integer, dimension(4) :: q
> +  integer :: px
> +  integer (kind=omp_allocator_handle_kind) :: h
> +  integer :: fl
> +
> +  integer :: y
> +  integer :: r, i, i1, i2, i3, i4, i5
> +  integer :: l, l3, l4, l5, l6
> +  integer :: n, n1, n2, n3, n4
> +  integer :: j2, j3, j4
> +  integer, dimension(4) :: l2
> +  integer, dimension(4) :: r2
> +  integer, target :: xo
> +  integer, target :: yo
> +  integer, dimension(x) :: v
> +  integer, dimension(x) :: w
> +
> +  type s_type
> +    integer :: a
> +    integer :: b
> +  end type
> +
> +  type (s_type) :: s
> +  s%a = 27
> +  s%b = 29
> +  y = 0
> +  r = 0
> +  n = 8
> +  n2 = 9
> +  n3 = 10
> +  n4 = 11
> +  xo = x
> +  yo = y
> +
> +  do i = 1, 4
> +    r2(i) = 0;
> +  end do
> +
> +  do i = 1, 4
> +    p(i) = 0;
> +  end do
> +
> +  do i = 1, 4
> +    q(i) = 0;
> +  end do
> +
> +  do i = 1, x
> +    w(i) = i
> +  end do
> +
> +  !$omp parallel private (y, v) firstprivate (x) allocate (x, y, v)
> +  if (x /= 42) then
> +    stop 1
> +  end if
> +  v(1) = 7
> +  if ( (and(fl, 2) /= 0) .and. &
> +       ((is_64bit_aligned(x) == 0) .or. &
> +        (is_64bit_aligned(y) == 0) .or. &
> +        (is_64bit_aligned(v(1)) == 0))) then
> +    stop 2
> +  end if
> +
> +  !$omp barrier
> +  y = 1;
> +  x = x + 1
> +  v(1) = 7
> +  v(41) = 8
> +  !$omp barrier
> +  if (x /= 43 .or. y /= 1) then
> +    stop 3
> +  end if
> +  if (v(1) /= 7 .or. v(41) /= 8) then
> +    stop 4
> +  end if
> +  !$omp end parallel
> +
> +  !$omp teams
> +  !$omp parallel private (y) firstprivate (x, w) allocate (h: x, y, w)
> +
> +  if (x /= 42 .or. w(17) /= 17 .or. w(41) /= 41) then
> +    stop 5
> +  end if
> +  !$omp barrier
> +  y = 1;
> +  x = x + 1
> +  w(19) = w(19) + 1
> +  !$omp barrier
> +  if (x /= 43 .or. y /= 1 .or. w(19) /= 20) then
> +    stop 6
> +  end if
> +  if ( (and(fl, 1) /= 0) .and. &
> +       ((is_64bit_aligned(x) == 0) .or. &
> +        (is_64bit_aligned(y) == 0) .or. &
> +        (is_64bit_aligned(w(1)) == 0))) then
> +    stop 7
> +  end if
> +  !$omp end parallel
> +  !$omp end teams
> +
> +  !$omp parallel do private (y) firstprivate (x) reduction(+: r) allocate (h: x, y, r, l, n) lastprivate (l) linear (n: 16)
> +  do i = 0, 63
> +    if (x /= 42) then
> +      stop 8
> +    end if
> +    y = 1;
> +    l = i;
> +    n = n + y + 15;
> +    r = r + i;
> +    if ( (and(fl, 1) /= 0) .and. &
> +         ((is_64bit_aligned(x) == 0) .or. &
> +          (is_64bit_aligned(y) == 0) .or. &
> +          (is_64bit_aligned(r) == 0) .or. &
> +          (is_64bit_aligned(l) == 0) .or. &
> +          (is_64bit_aligned(n) == 0))) then
> +      stop 9
> +    end if
> +  end do
> +  !$omp end parallel do
> +
> +  !$omp parallel
> +  !$omp do lastprivate (l2) private (i1) allocate (h: l2, l3, i1) lastprivate (conditional: l3)
> +  do i1 = 0, 63
> +    l2(1) = i1
> +    l2(2) = i1 + 1
> +    l2(3) = i1 + 2
> +    l2(4) = i1 + 3
> +    if (i1 < 37) then
> +      l3 = i1
> +    end if
> +    if ( (and(fl, 1) /= 0) .and. &
> +         ((is_64bit_aligned(l2(1)) == 0) .or. &
> +          (is_64bit_aligned(l3) == 0) .or. &
> +          (is_64bit_aligned(i1) == 0))) then
> +      stop 10
> +    end if
> +  end do
> +
> +  !$omp do collapse(2) lastprivate(l4, i2, j2) linear (n2:17) allocate (h: n2, l4, i2, j2)
> +  do i2 = 3, 4
> +    do j2 = 17, 22, 2
> +      n2 = n2 + 17
> +      l4 = i2 * 31 + j2
> +      if ( (and(fl, 1) /= 0) .and. &
> +           ((is_64bit_aligned(l4) == 0) .or. &
> +            (is_64bit_aligned(n2) == 0) .or. &
> +            (is_64bit_aligned(i2) == 0) .or. &
> +            (is_64bit_aligned(j2) == 0))) then
> +        stop 11
> +      end if
> +    end do
> +  end do
> +
> +  !$omp do collapse(2) lastprivate(l5, i3, j3) linear (n3:17) schedule (static, 3) allocate (n3, l5, i3, j3)
> +  do i3 = 3, 4
> +    do j3 = 17, 22, 2
> +      n3 = n3 + 17
> +      l5 = i3 * 31 + j3
> +      if ( (and(fl, 2) /= 0) .and. &
> +           ((is_64bit_aligned(l5) == 0) .or. &
> +            (is_64bit_aligned(n3) == 0) .or. &
> +            (is_64bit_aligned(i3) == 0) .or. &
> +            (is_64bit_aligned(j3) == 0))) then
> +        stop 12
> +      end if
> +    end do
> +  end do
> +
> +  !$omp do collapse(2) lastprivate(l6, i4, j4) linear (n4:17) schedule (dynamic) allocate (h: n4, l6, i4, j4)
> +  do i4 = 3, 4
> +    do j4 = 17, 22,2
> +      n4 = n4 + 17;
> +      l6 = i4 * 31 + j4;
> +      if ( (and(fl, 1) /= 0) .and. &
> +           ((is_64bit_aligned(l6) == 0) .or. &
> +            (is_64bit_aligned(n4) == 0) .or. &
> +            (is_64bit_aligned(i4) == 0) .or. &
> +            (is_64bit_aligned(j4) == 0))) then
> +        stop 13
> +      end if
> +    end do
> +  end do
> +
> +  !$omp do lastprivate (i5) allocate (i5)
> +  do i5 = 1, 17, 3
> +    if ( (and(fl, 2) /= 0) .and. &
> +         (is_64bit_aligned(i5) == 0)) then
> +      stop 14
> +    end if
> +  end do
> +
> +  !$omp do reduction(+:p, q, r2) allocate(h: p, q, r2)
> +  do i = 0, 31
> +    p(3) = p(3) + i;
> +    p(4) = p(4) + (2 * i)
> +    q(1) = q(1) + (3 * i)
> +    q(3) = q(3) + (4 * i)
> +    r2(1) = r2(1) + (5 * i)
> +    r2(4) = r2(4) + (6 * i)
> +    if ( (and(fl, 1) /= 0) .and. &
> +         ((is_64bit_aligned(q(1)) == 0) .or. &
> +          (is_64bit_aligned(p(1)) == 0) .or. &
> +          (is_64bit_aligned(r2(1)) == 0) )) then
> +      stop 15
> +    end if
> +  end do
> +
> +  !$omp task private(y) firstprivate(x) allocate(x, y)
> +  if (x /= 42) then
> +    stop 16
> +  end if
> +
> +  if ( (and(fl, 2) /= 0) .and. &
> +       ((is_64bit_aligned(x) == 0) .or. &
> +        (is_64bit_aligned(y) == 0) )) then
> +    stop 17
> +  end if
> +  !$omp end task
> +
> +  !$omp task private(y) firstprivate(x) allocate(h: x, y)
> +  if (x /= 42) then
> +    stop 16
> +  end if
> +
> +  if ( (and(fl, 1) /= 0) .and. &
> +       ((is_64bit_aligned(x) == 0) .or. &
> +        (is_64bit_aligned(y) == 0) )) then
> +    stop 17
> +  end if
> +  !$omp end task
> +
> +  !$omp task private(y) firstprivate(s) allocate(s, y)
> +  if (s%a /= 27 .or. s%b /= 29) then
> +    stop 18
> +  end if
> +
> +  if ( (and(fl, 2) /= 0) .and. &
> +       ((is_64bit_aligned(s%a) == 0) .or. &
> +        (is_64bit_aligned(y) == 0) )) then
> +    stop 19
> +  end if
> +  !$omp end task
> +
> +  !$omp task private(y) firstprivate(s) allocate(h: s, y)
> +  if (s%a /= 27 .or. s%b /= 29) then
> +    stop 18
> +  end if
> +
> +  if ( (and(fl, 1) /= 0) .and. &
> +       ((is_64bit_aligned(s%a) == 0) .or. &
> +        (is_64bit_aligned(y) == 0) )) then
> +    stop 19
> +  end if
> +  !$omp end task
> +
> +  !$omp end parallel
> +
> +  if (r /= ((64 * 63) / 2) .or. l /= 63 .or. n /= (8 + 16 * 64)) then
> +    stop 20
> +  end if
> +
> +  if (l2(1) /= 63 .or. l2(2) /= 64 .or. l2(3) /= 65 .or. l2(4) /= 66 .or. l3 /= 36) then
> +    stop 21
> +  end if
> +
> +  if (i2 /= 5 .or. j2 /= 23 .or. n2 /= (9 + (17 * 6)) .or. l4 /= (4 * 31 + 21)) then
> +    stop 22
> +  end if
> +
> +  if (i3 /= 5 .or. j3 /= 23 .or. n3 /= (10 + (17 * 6)) .or. l5 /= (4 * 31 + 21)) then
> +    stop 23
> +  end if
> +
> +  if (i4 /= 5 .or. j4 /= 23 .or. n4 /= (11 + (17 * 6)) .or. l6 /= (4 * 31 + 21)) then
> +    stop 24
> +  end if
> +
> +  if (i5 /= 19) then
> +    stop 24
> +  end if
> +
> +  if (p(3) /= ((32 * 31) / 2) .or. p(4) /= (2 * p(3)) &
> +      .or. q(1) /= (3 * p(3)) .or. q(3) /= (4 * p(3)) &
> +      .or. r2(1) /= (5 * p(3)) .or. r2(4) /= (6 * p(3))) then
> +    stop 25
> +  end if
> +
> +end subroutine
> +
> +program main
> +  use omp_lib
> +  integer, dimension(4) :: p
> +  integer, dimension(4) :: q
> +
> +  type (omp_alloctrait) :: traits(3)
> +  integer (omp_allocator_handle_kind) :: a
> +
> +  traits = [omp_alloctrait (omp_atk_alignment, 64), &
> +            omp_alloctrait (omp_atk_fallback, omp_atv_null_fb), &
> +            omp_alloctrait (omp_atk_pool_size, 8192)]
> +  a = omp_init_allocator (omp_default_mem_space, 3, traits)
> +  if (a == omp_null_allocator) stop 1
> +
> +  call omp_set_default_allocator (omp_default_mem_alloc);
> +  call foo (42, p, q, 2, a, 0);
> +  call foo (42, p, q, 2, omp_default_mem_alloc, 0);
> +  call foo (42, p, q, 2, a, 1);
> +  call omp_set_default_allocator (a);
> +  call foo (42, p, q, 2, omp_null_allocator, 3);
> +  call foo (42, p, q, 2, omp_default_mem_alloc, 2);
> +  call omp_destroy_allocator (a);
> +end
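
As a side note on the alignment checks in the quoted test: the helper in
'allocate-1.c' masks its argument with 0x3f, so it tests whether the low
six bits are clear, that is, whether the value is a multiple of 64 bytes
(despite the '64bit' in its name).  Since the Fortran interface declares
'a' without the VALUE attribute, what arrives on the C side is
presumably the variable's address.  Below is a minimal, self-contained
sketch of the same mask test; 'is_64byte_aligned' and
'demo_alignment_check' are hypothetical names, and the static buffer
exists only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same mask test as the quoted helper: nonzero when the low six bits of A
   are clear, i.e. A is a multiple of 64.  Hypothetical name.  */
int
is_64byte_aligned (uintptr_t a)
{
  return (a & 0x3f) == 0;
}

/* Illustration only: a buffer forced to 64-byte alignment passes the
   check, and an address 4 bytes into it does not.  */
void
demo_alignment_check (void)
{
  static _Alignas (64) unsigned char buf[128];

  assert (is_64byte_aligned ((uintptr_t) buf));
  assert (!is_64byte_aligned ((uintptr_t) (buf + 4)));
}
```

A 4-byte integer placed by an allocator with 'omp_atk_alignment' set to
64 should pass this check at its start address, which is what the
'stop' branches guarded by 'fl' in the test rely on.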