public inbox for fortran@gcc.gnu.org
 help / color / mirror / Atom feed
From: Thomas Schwinge <thomas@codesourcery.com>
To: Julian Brown <julian@codesourcery.com>,
	Tobias Burnus <tobias@codesourcery.com>
Cc: Jakub Jelinek <jakub@redhat.com>, <gcc-patches@gcc.gnu.org>,
	<fortran@gcc.gnu.org>
Subject: Re: [Patch] Fortran: %re/%im fixes for OpenMP/OpenACC + gfc_is_simplify_contiguous
Date: Tue, 9 Feb 2021 16:37:31 +0100	[thread overview]
Message-ID: <87sg65uu9w.fsf@dem-tschwing-1.ger.mentorg.com> (raw)
In-Reply-To: <20210209130522.280554b9@squid.athome>

Hi!

On 2021-02-09T13:05:22+0000, Julian Brown <julian@codesourcery.com> wrote:
> On Tue, 9 Feb 2021 13:45:36 +0100
> Tobias Burnus <tobias@codesourcery.com> wrote:
>
>> On 09.02.21 12:58, Thomas Schwinge wrote:
>> >> Granted. The array(:)%re access might update too much, but that's
>> >> not different to array with strides or with contiguous arrays
>> >> sections which contain component reference (and more than one
>> >> component).
>> > (Is that indeed allowed to "update too much"?)
>>
>> Yes - that's the general problem with strides or bit sets;
>> copying only a subset – and doing so atomically – is not
>> always possible or feasible.
>>
>> *OpenACC* 3.1 has for "2.14.4  Update Directive" the restriction:
>>
>> "Noncontiguous subarrays may appear. It is implementation-specific
>>   whether noncontiguous regions are updated by using one transfer
>>   for each contiguous subregion, or whether the noncontiguous data
>>   is packed, transferred once, and unpacked, or whether one or more
>>   larger subarrays (no larger than the smallest contiguous region
>>   that contains the specified subarray) are updated."
>>
>> For map, I saw that that's the case – but I think Julian's
>> patch does not handle this correctly for:
>>
>> type t
>>    integer :: i, j, k
>> end type t
>> type(t) :: A(100)
>>    ... host(A(:)%j)

So I understand the variants in the quoted OpenACC part as follows.
However I don't claim that I necessarily got all the fine detail right at
the language level (English as well as OpenACC)!  So, please verify.

"Using one transfer for each contiguous subregion":

    for n in 1..100: transfer A(n)%j individually

"Noncontiguous data is packed, transferred once, and unpacked":

    device:
      integer :: tmp(100)
      for n in 1..100: tmp(n) = A(n)%j
    transfer tmp
    host:
      for n in 1..100: A(n)%j = tmp(n)

In this example here, I understand "subarrays (no larger than the
smallest contiguous region that contains the specified subarray)" to
again resolve to 'A(n)%j', so doesn't add other another variant.

This -- per my reading! -- would be different here:

    type t
       integer :: i, j1, j2, k
    end type t
    type(t) :: A(100)
       ... host(A(:)%j1, A(:)%j2)

... where I understand this to mean that each 'A(n)%j1' plus 'A(:)%j2'
may be transferred together: either "using one transfer for each
contiguous subregion":

    for n in 1..100: transfer memory region of A(n)%j1..A(n)%j2 individually

..., or "packed, transferred once, and unpacked":

    device:
      integer :: tmp(2 * 100)
      for n in 1..100:
        tmp(2 * n) = A(n)%j1
        tmp(2 * n + 1) = A(n)%j2
    transfer tmp
    host:
      for n in 1..100:
        A(n)%j1 = tmp(2 * n)
        A(n)%j2 = tmp(2 * n + 1)

I do however not read into this that the following would be valid:

>> I think instead of transferring A(1)%j to A(100)%j, it transfers
>> all of A(:), i.e. also A(1)%i and A(100)%k :-(

I don't think it's appropriate for an 'update' to alter anything else
than the exact 'var's as specified.

> Yes it will -- but given that A(2)%i and A(99)%k (and all the in-between
> values) can legitimately be transferred according to the spec

I don't read it that way, I'm afraid.  :-O

> how much
> of a problem is that? In particular, are there situations where this
> "over-updating" can lead to incorrect results in a conforming program?

In your reading indeed it wouldn't, because the user couldn't expect the
following:

> Perhaps the question is, can a user legitimately expect the host and
> offloaded versions of some given memory block to hold different data,
> like maintaining different data in a cache than the storage backing
> that cache? One use-case for that might be double buffering a "single
> array" (i.e. the host and device versions of that array). I don't think
> that's something we'd want to encourage, though.

I find the wording in the spec rather explicitly, for example: OpenACC
3.1, 2.7 "Data Clauses":

| In all cases, the compiler will allocate and manage a copy of the 'var'
| in the memory of the current device, creating a *visible device copy*
| of that 'var', for data not in shared memory.

Emphasis mine; and I indeed understand this to mean that the user can
"legitimately expect the host and offloaded versions of some given memory
block to hold different data, [...]" (your words from above).

> I think, rather, that partial updates are an optimisation the user can
> use when they know that only part of an array has been updated, so
> slight over-copying is harmless.

Interesting -- and, again: that makes sense in your reading.


So.  Should everybody work through this again, trying to reach consensus?
Do we need to clarify that with OpenACC?


As for OpenMP, Tobias stated:

|On 2021-02-09T13:45:36+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
|> For OpenMP and map, I recall encountering a code which did do
|> this for OpenMP (i.e. contiguous subsection). I think it was
|> related to derived-type 'map', but I do not recall anymore.
|>
|>
|> Looking at the *OpenMP* 5.1 spec, I see that 'target update' also
|> allows: "The list items that appear in the to or from clauses
|> may include array sections with stride expressions."
|> While for the map clause, there is:
|> 'If a list item is an array section, it must specify contiguous
|> storage.'

(I suppose we all agree on that.)

|> But I did not see a more explicit description how that should be
|> handled, contrary to the rather explicit description for OpenACC.

Surprising.  Can some relevant wording (like OpenACC's "visible device
copy") be found elsewhere (in the hundreds of pages...)?

By default, I would've assumed "defensive", so again: "I don't think it's
appropriate for an 'update' to alter anything else than the exact 'var's
as specified." (my words from above)?


Grüße
 Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

  parent reply	other threads:[~2021-02-09 15:37 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-08 17:50 Tobias Burnus
2021-02-09  9:45 ` Thomas Schwinge
2021-02-09 11:41   ` Tobias Burnus
2021-02-09 11:58     ` Thomas Schwinge
2021-02-09 12:45       ` Tobias Burnus
2021-02-09 13:05         ` Julian Brown
2021-02-09 13:05           ` Julian Brown
2021-02-09 15:37           ` Thomas Schwinge [this message]
2021-02-09 16:08             ` Julian Brown
2021-02-16 14:37     ` *ping* – " Tobias Burnus
2021-02-16 15:57 ` Paul Richard Thomas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87sg65uu9w.fsf@dem-tschwing-1.ger.mentorg.com \
    --to=thomas@codesourcery.com \
    --cc=fortran@gcc.gnu.org \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=julian@codesourcery.com \
    --cc=tobias@codesourcery.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).