From mboxrd@z Thu Jan  1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 48) id BF03F3858CDA; Tue, 16 Aug 2022 15:37:19 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org BF03F3858CDA
From: "hberre3 at gatech dot edu"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libgomp/106643] New: [gfortran + OpenACC] Allocate in module causes refcount error
Date: Tue, 16 Aug 2022 15:37:19 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: libgomp
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: hberre3 at gatech dot edu
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter cc target_milestone
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
X-List-Received-Date: Tue, 16 Aug 2022 15:37:19 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106643

            Bug ID: 106643
           Summary: [gfortran + OpenACC] Allocate in module causes refcount
                    error
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hberre3 at gatech dot edu
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

I built GCC 13 from the default branch with offloading support for the new AMD
MI 210 GPUs by following the documented instructions.
I ran into the following runtime error when running our offloaded code written
in Fortran leveraging OpenACC:

```
/nethome/hberre3/USERSCRATCH/build-gcc-amdgpu//gcc/libgomp/oacc-mem.c:1153:
goacc_enter_data_internal: Assertion `n->refcount != REFCOUNT_INFINITY &&
n->refcount != REFCOUNT_LINK' failed.
```

I was able to create a minimal reproducible example for it:

p_main.f90
```
program p_main

    use m_macron

    ! ==========================================================================

    implicit none

    call s_vars_init()
    call s_vars_read()
    call s_macron_init()

end program p_main
```

m_macron.f90
```
module m_macron

    use m_vars

    implicit none

    real(kind(0d0)), allocatable, dimension(:) :: valls
    !$acc declare create(valls)

contains

    subroutine s_macron_init()

        integer :: i

        print*, "size=", size

        print*, "allocate(valls(1:size))"
        allocate(valls(1:size))

        print*, "acc enter data create(valls(1:size))"
        !$acc enter data create(valls(1:size))

        print*, "!$acc update device(valls(1:size))"
        valls(size) = size - 2
        !$acc update device(valls(1:size))

        print*, valls(1:size)

        print*, "acc exit data delete(valls)"
        !$acc exit data delete(valls)

    end subroutine s_macron_init

end module m_macron
```

m_vars.f90
```
module m_vars

    integer :: size

contains

    subroutine s_vars_init()

        size = -100

    end subroutine s_vars_init

    subroutine s_vars_read()

        ! Namelist of the global parameters which may be specified by user
        namelist /user_inputs/ size

        open (1, FILE=trim("in.inp"), FORM='formatted', ACTION='read', STATUS='old')
        read (1, NML=user_inputs); close (1)

    end subroutine s_vars_read

end module m_vars
```

in.inp
```
&user_inputs
size = 10
&end/
```

In order to generate this error, I had to create and dynamically allocate the
array in another module. I initially wrote this in a single F90 file but the
executable ran as expected.
The error gets produced when running:

```
!$acc enter data create(valls(1:size))
```

Here is the full output when running the executable with the `GOMP_DEBUG`
environment variable set:

```
GOACC_data_start: mapnum=3, hostaddrs=0x7fff23d2efd0, size=0x7fff23d2efb0, kinds=0x603102
GOACC_data_start: prepare mappings
GOACC_data_start: mappings prepared
 size= 10
 allocate(valls(1:size))
 acc enter data create(valls(1:size))
main: /nethome/hberre3/USERSCRATCH/build-gcc-amdgpu//gcc/libgomp/oacc-mem.c:1153: goacc_enter_data_internal: Assertion `n->refcount != REFCOUNT_INFINITY && n->refcount != REFCOUNT_LINK' failed.

Program received signal SIGABRT: Process abort signal.

Backtrace for this error:
#0  0x7f4aa4fcdb1f in ???
#1  0x7f4aa4fcda9f in ???
#2  0x7f4aa4fa0e04 in ???
#3  0x7f4aa4fa0cd8 in ???
#4  0x7f4aa4fc63f5 in ???
#5  0x7f4aa57b79e5 in goacc_enter_data_internal
        at /nethome/hberre3/USERSCRATCH/build-gcc-amdgpu//gcc/libgomp/oacc-mem.c:1153
#6  0x7f4aa57b79e5 in goacc_enter_exit_data_internal
        at /nethome/hberre3/USERSCRATCH/build-gcc-amdgpu//gcc/libgomp/oacc-mem.c:1405
#7  0x7f4aa57b8aab in GOACC_enter_data
        at /nethome/hberre3/USERSCRATCH/build-gcc-amdgpu//gcc/libgomp/oacc-mem.c:1478
#8  0x4013b7 in __m_macron_MOD_s_macron_init
        at /nethome/hberre3/bug/m_macron.f90:22
#9  0x40182a in p_main
        at /nethome/hberre3/bug/p_main.f90:11
#10  0x401866 in main
        at /nethome/hberre3/bug/p_main.f90:3
run.sh: line 17: 3052573 Aborted                 (core dumped) ./main
```

I am not sure if this error is from libgomp or another part of the GCC
codebase. I assume it is related to an issue with the scoping of the array, as
it should be available throughout the entire program, per my reading of the
OpenACC spec:

```
The associated region is the implicit region associated with the function,
subroutine, or program in which the directive appears.
If the directive appears in the declaration section of a Fortran module
subprogram or in a C or C++ global scope, the associated region is the
implicit region for the whole program.
```

Our main code currently doesn't call `!$acc enter data create` for dynamically
allocated arrays since it relies on NVIDIA (/PGI) hooking into the `allocate`
call on the CPU. I ran into the above error when converting our
allocation/deallocation routines.

Here is the output of `gfortran -v`:

```
[hberre3@8:instinct]:~ $ ~/tools/gcc/13/bin/gfortran -v
Using built-in specs.
COLLECT_GCC=/nethome/hberre3/tools/gcc/13/bin/gfortran
COLLECT_LTO_WRAPPER=/nethome/hberre3/tools/gcc/13/libexec/gcc/x86_64-pc-linux-gnu/13.0.0/lto-wrapper
OFFLOAD_TARGET_NAMES=amdgcn-amdhsa
Target: x86_64-pc-linux-gnu
Configured with: /nethome/hberre3/USERSCRATCH/build-gcc-amdgpu//gcc/configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --enable-offload-targets=amdgcn-amdhsa=/nethome/hberre3/tools/gcc/13/amdgcn-amdhsa --enable-languages=c,c++,fortran,lto --disable-multilib --prefix=/nethome/hberre3/tools/gcc/13
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.0.0 20220811 (experimental) (GCC)
```

The code is compiled with `/nethome/hberre3/tools/gcc/13/bin/gfortran -O0 -g
-fopenacc '-foffload-options=-lgfortran -lm'
-foffload-options=amdgcn-amdhsa=-march=gfx90a -fno-exceptions`.

This example runs with NVFortran on NVIDIA GPUs. Thank you for taking a look!