public inbox for fortran@gcc.gnu.org
From: Jakub Jelinek <jakub@redhat.com>
To: Tobias Burnus <tobias@codesourcery.com>
Cc: Hafiz Abid Qadeer <abidh@codesourcery.com>,
	gcc-patches@gcc.gnu.org, fortran@gcc.gnu.org
Subject: Re: [PATCH 2/5] [gfortran] Translate allocate directive (OpenMP 5.0).
Date: Tue, 11 Oct 2022 16:15:24 +0200
Message-ID: <Y0V6fCaU+AksopaH@tucnak>
In-Reply-To: <3683274e-33d7-d2a1-ffd8-d678cecba5d8@codesourcery.com>

On Tue, Oct 11, 2022 at 03:22:02PM +0200, Tobias Burnus wrote:
> Hi Jakub,
> 
> On 11.10.22 14:24, Jakub Jelinek wrote:
> 
> There is another issue besides what I wrote in my last review,
> and I'm afraid I don't know what to do about it, hoping Tobias
> has some ideas.
> The problem is that without the allocate-stmt associated allocate directive,
> Fortran allocatables are always allocated with malloc and freed with
> free.  The deallocation can be implicit through reallocation, or through an
> explicit deallocate statement etc.
> ...
> But when some allocatables are now allocated with a different
> allocator (when allocate-stmt associated allocate directive is used),
> some allocatables are allocated with malloc and others with GOMP_alloc
> but we need to free them with the corresponding allocator based on how
> they were allocated, what has been allocated with malloc should be
> deallocated with free, what has been allocated with GOMP_alloc should be
> deallocated with GOMP_free.
> 
> 
> 
> I think the most common case is:
> 
> integer, allocatable :: var(:)
> !$omp allocators allocator(my_alloc) ! must be in same scope as decl of 'var'
> ...
> ! optionally: deallocate(var)
> end ! of scope: block/subroutine/... - automatic deallocation

So are you talking here about the declarative directive, for which the patch
currently issues a sorry, or about the executable one above an allocate stmt?

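Anyway, even this simple case has the problem that one can have
subroutine foo (var)
  integer, allocatable :: var(:)
  var = [1, 2, 3] ! reallocate
end subroutine
and call foo (var) from the scope above, so the reallocation happens in a
different procedure from the one that saw the directive.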

> Those can be easily handled. It gets more complicated with control flow:
> 
> if (...) then
>  !$omp allocators allocator(...)
>  allocate(...)
> else
>  allocate (...)
> endif
> 
> 
> 
> However, the problem is really that there is no mandatory
> '!$omp deallocators' and also the wording like:
> 
> "If any operation of the base language causes a reallocation of
> an array that is allocated with a memory allocator then that
> memory allocator will be used to release the current memory
> and to allocate the new memory." (OpenMP 5.0 wording)
> 
> There has been some attempt to relax the rules a bit, e.g. by
> adding the wording:
> "For allocated allocatable components of such variables, the allocator that
> will be used for the deallocation and allocation is unspecified."
> 
> And some wording change (→issues 3189) to clarify related component issues.
> 
> But nonetheless, there is still the issue of:
> 
> (a) explicit DEALLOCATE in some other translation unit
> (b) some intrinsic operations which reallocate the memory, either via libgomp
> or in the source code:
>  a = [1,2,3]  ! possibly reallocates
>  str = trim(str) ! possibly reallocates
> where the first one calls 'realloc' directly in the code and the second one
> calls 'libgomp' for that.
> 
> * * *
> 
> I don't see a good solution – and there is in principle the same issue with
> unified-shared memory (USM) on hardware that does not support transparently
> accessing all host memory on the device.
> 
> Compilers support this case by allocating memory in some special memory,
> which is either accessible from both sides ('pinned') or migrates on the
> first access from the device side - but remains there until the accessing
> device kernel ends ('managed memory').
> 
> Newer hardware (+ associated Linux kernel support) permits accessing all
> memory in a somewhat fast way, avoiding this issue (and special handling
> is then left to the user.) For AMDGCN, my understanding is that all hardware
> supported by GCC supports this - but at glacial speed until the latest
> hardware architectures. For Nvidia, this is supported since Pascal (I think for Titan X,
> P100, i.e. sm_5.2/sm_60) - but I believe not for all Pascal/Kepler hardware.
> 
> I mention this because the USM implementation at
> https://gcc.gnu.org/pipermail/gcc-patches/2022-July/597976.html
> suffers from this.
> And https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601059.html
> tries to solve the 'trim' example issue above - i.e. the case where
> libgomp reallocates pinned/managed (pseudo-)USM memory.
> 
> * * *
> 
> The deallocation can be done in a completely different TU from where it has
> been allocated, in theory it could be also not compiled with -fopenmp, etc.
> So, I'm afraid we need to store somewhere whether we used malloc or
> GOMP_alloc for the allocation (say somewhere in the array descriptor and for
> other stuff somewhere on the side?) and slow down all code that needs
> deallocation to check that bit (or say we don't support
> deallocation/reallocation of OpenMP allocated allocatables without -fopenmp
> on the deallocation TU and only slow down -fopenmp compiled code)?
> 
> The problem with storing is that gfortran inserts the malloc/realloc/free calls directly, i.e. without library preloading, intercepting those libcalls, I do not see how it can work at all.

Well, it can use a weak symbol: if the program is not linked against libgomp,
the bit saying the memory is OpenMP allocated should never be set, and so
realloc/free will be used.  Otherwise do
  if (arrdescr.gomp_alloced_bit)
    GOMP_free (arrdescr.data, 0);
  else
    free (arrdescr.data);
and similar.  And I think we can just document that we do this only for
-fopenmp compiled code.
But do we have a place to store that bit?  I presume in array descriptors
there could be some bit for it, but what to do about scalar allocatables,
or allocatable components etc.?
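
To make the weak-symbol idea concrete, here is a minimal sketch in C.  It is
only an illustration under stated assumptions: hyp_descriptor, hyp_omp_free
and hyp_release are hypothetical names, the descriptor is not gfortran's real
layout, and the weak-symbol behaviour relies on the GCC attribute on an ELF
target.  When nothing defines the weak symbol its address is null, so the
plain free path is taken.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an OpenMP-aware free routine.  Declared weak,
   so its address is null when no linked library defines it.  */
extern void hyp_omp_free (void *ptr, uintptr_t allocator)
  __attribute__ ((weak));

/* Hypothetical, simplified descriptor; not gfortran's real layout.  */
struct hyp_descriptor
{
  void *data;
  unsigned omp_alloced : 1;  /* set only by the OpenMP allocation path */
};

/* Release the payload.  Returns 1 if the OpenMP path was taken, else 0.  */
int
hyp_release (struct hyp_descriptor *d)
{
  int used_omp = 0;

  assert (d != NULL);
  if (d->omp_alloced && hyp_omp_free)
    {
      hyp_omp_free (d->data, 0);
      used_omp = 1;
    }
  else
    free (d->data);

  d->data = NULL;
  d->omp_alloced = 0;
  return used_omp;
}
```

The extra cost on the deallocation side is one flag test, which matches the
idea of only paying it in -fopenmp compiled code.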
In theory we could do something ugly: if all the allocations were guaranteed
to have at least 2-byte alignment, use the least significant bit of the
pointer to mark GOMP_alloc allocated memory for scalar allocatables etc., but
then -fopenmp compiled code would need to strip that bit away before use.
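
The pointer-tagging trick can be sketched as follows.  This is a hedged
illustration, not a proposal for the actual implementation: the helper names
(hyp_tag_omp, hyp_is_omp, hyp_strip) are hypothetical, and it assumes every
allocation is at least 2-byte aligned so that bit 0 is free to carry the mark.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Bit 0 of the pointer marks "allocated by the OpenMP allocator".  */
#define HYP_OMP_TAG ((uintptr_t) 1)

/* Mark a pointer as OpenMP allocated.  Requires >= 2-byte alignment.  */
void *
hyp_tag_omp (void *p)
{
  assert (((uintptr_t) p & HYP_OMP_TAG) == 0);
  return (void *) ((uintptr_t) p | HYP_OMP_TAG);
}

/* Nonzero if the pointer carries the OpenMP mark.  */
int
hyp_is_omp (const void *p)
{
  return ((uintptr_t) p & HYP_OMP_TAG) != 0;
}

/* Recover the real address; must be done before every dereference.  */
void *
hyp_strip (void *p)
{
  return (void *) ((uintptr_t) p & ~HYP_OMP_TAG);
}
```

Every access in -fopenmp compiled code would have to go through something
like hyp_strip, and code compiled without -fopenmp would see a misaligned
pointer, which is what makes the approach ugly.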

As for pinned memory, if it is allocated through libgomp allocators, it
should just work as long as GOMP_free/GOMP_realloc is used; that is why we
store extra data in front of the allocations, holding everything we need.
But that extra data also makes the OpenMP allocations incompatible with
malloc/free allocations.
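
Why bookkeeping data in front of an allocation breaks plain free can be seen
in a small sketch.  The record layout and the names (hyp_header, hyp_alloc,
hyp_header_of, hyp_free) are hypothetical and do not reproduce libgomp's
actual data; the point is only that the pointer handed to the user is not
the pointer malloc returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record stored in front of the payload.  */
struct hyp_header
{
  _Alignas (max_align_t) void *base;  /* pointer malloc really returned */
  uintptr_t allocator;                /* allocator handle that was used */
  size_t size;                        /* payload size in bytes */
};

/* Allocate SIZE payload bytes preceded by a header.  */
void *
hyp_alloc (size_t size, uintptr_t allocator)
{
  struct hyp_header *h = malloc (sizeof *h + size);
  if (h == NULL)
    return NULL;
  h->base = h;
  h->allocator = allocator;
  h->size = size;
  return h + 1;  /* the user sees the address just past the header */
}

/* Step back from the payload to its header.  */
struct hyp_header *
hyp_header_of (void *payload)
{
  return (struct hyp_header *) payload - 1;
}

/* Matching release: frees the base pointer, not the payload pointer.  */
void
hyp_free (void *payload)
{
  if (payload == NULL)
    return;
  struct hyp_header *h = hyp_header_of (payload);
  assert (h->base == h);
  free (h->base);
}
```

Passing the payload pointer straight to free would be undefined behaviour,
since it is offset from the address malloc returned, and in the other
direction hyp_free on a plain malloc pointer would read a header that was
never written.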

	Jakub


