From: Thomas Schwinge <thomas@codesourcery.com>
To: Julian Brown <julian@codesourcery.com>,
Tobias Burnus <tobias@codesourcery.com>
Cc: Jakub Jelinek <jakub@redhat.com>, <gcc-patches@gcc.gnu.org>,
<fortran@gcc.gnu.org>
Subject: Re: [Patch] Fortran: %re/%im fixes for OpenMP/OpenACC + gfc_is_simplify_contiguous
Date: Tue, 9 Feb 2021 16:37:31 +0100 [thread overview]
Message-ID: <87sg65uu9w.fsf@dem-tschwing-1.ger.mentorg.com> (raw)
In-Reply-To: <20210209130522.280554b9@squid.athome>
Hi!
On 2021-02-09T13:05:22+0000, Julian Brown <julian@codesourcery.com> wrote:
> On Tue, 9 Feb 2021 13:45:36 +0100
> Tobias Burnus <tobias@codesourcery.com> wrote:
>
>> On 09.02.21 12:58, Thomas Schwinge wrote:
>> >> Granted. The array(:)%re access might update too much, but that's
>> >> not different to array with strides or with contiguous arrays
>> >> sections which contain component reference (and more than one
>> >> component).
>> > (Is that indeed allowed to "update too much"?)
>>
>> Yes - that's the general problem with strides or bit sets;
>> copying only a subset – and doing so atomically – is not
>> always possible or feasible.
>>
>> *OpenACC* 3.1 has for "2.14.4 Update Directive" the restriction:
>>
>> "Noncontiguous subarrays may appear. It is implementation-specific
>> whether noncontiguous regions are updated by using one transfer
>> for each contiguous subregion, or whether the noncontiguous data
>> is packed, transferred once, and unpacked, or whether one or more
>> larger subarrays (no larger than the smallest contiguous region
>> that contains the specified subarray) are updated."
>>
>> For map, I saw that that's the case – but I think Julian's
>> patch does not handle this correctly for:
>>
>> type t
>> integer :: i, j, k
>> end type t
>> type(t) :: A(100)
>> ... host(A(:)%j)
So I understand the variants in the quoted OpenACC part as follows.
However I don't claim that I necessarily got all the fine detail right at
the language level (English as well as OpenACC)! So, please verify.
"Using one transfer for each contiguous subregion":
for n in 1..100: transfer A(n)%j individually
"Noncontiguous data is packed, transferred once, and unpacked":
device:
integer :: tmp(100)
for n in 1..100: tmp(n) = A(n)%j
transfer tmp
host:
for n in 1..100: A(n)%j = tmp(n)
In this example here, I understand "subarrays (no larger than the
smallest contiguous region that contains the specified subarray)" to
again resolve to 'A(n)%j', so doesn't add other another variant.
This -- per my reading! -- would be different here:
type t
integer :: i, j1, j2, k
end type t
type(t) :: A(100)
... host(A(:)%j1, A(:)%j2)
... where I understand this to mean that each 'A(n)%j1' plus 'A(:)%j2'
may be transferred together: either "using one transfer for each
contiguous subregion":
for n in 1..100: transfer memory region of A(n)%j1..A(n)%j2 individually
..., or "packed, transferred once, and unpacked":
device:
integer :: tmp(2 * 100)
for n in 1..100:
tmp(2 * n) = A(n)%j1
tmp(2 * n + 1) = A(n)%j2
transfer tmp
host:
for n in 1..100:
A(n)%j1 = tmp(2 * n)
A(n)%j2 = tmp(2 * n + 1)
I do however not read into this that the following would be valid:
>> I think instead of transferring A(1)%j to A(100)%j, it transfers
>> all of A(:), i.e. also A(1)%i and A(100)%k :-(
I don't think it's appropriate for an 'update' to alter anything else
than the exact 'var's as specified.
> Yes it will -- but given that A(2)%i and A(99)%k (and all the in-between
> values) can legitimately be transferred according to the spec
I don't read it that way, I'm afraid. :-O
> how much
> of a problem is that? In particular, are there situations where this
> "over-updating" can lead to incorrect results in a conforming program?
In your reading indeed it wouldn't, because the user couldn't expect the
following:
> Perhaps the question is, can a user legitimately expect the host and
> offloaded versions of some given memory block to hold different data,
> like maintaining different data in a cache than the storage backing
> that cache? One use-case for that might be double buffering a "single
> array" (i.e. the host and device versions of that array). I don't think
> that's something we'd want to encourage, though.
I find the wording in the spec rather explicitly, for example: OpenACC
3.1, 2.7 "Data Clauses":
| In all cases, the compiler will allocate and manage a copy of the 'var'
| in the memory of the current device, creating a *visible device copy*
| of that 'var', for data not in shared memory.
Emphasis mine; and I indeed understand this to mean that the user can
"legitimately expect the host and offloaded versions of some given memory
block to hold different data, [...]" (your words from above).
> I think, rather, that partial updates are an optimisation the user can
> use when they know that only part of an array has been updated, so
> slight over-copying is harmless.
Interesting -- and, again: that makes sense in your reading.
So. Should everybody work through this again, trying to reach consensus?
Do we need to clarify that with OpenACC?
As for OpenMP, Tobias stated:
|On 2021-02-09T13:45:36+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
|> For OpenMP and map, I recall encountering a code which did do
|> this for OpenMP (i.e. contiguous subsection). I think it was
|> related to derived-type 'map', but I do not recall anymore.
|>
|>
|> Looking at the *OpenMP* 5.1 spec, I see that 'target update' also
|> allows: "The list items that appear in the to or from clauses
|> may include array sections with stride expressions."
|> While for the map clause, there is:
|> 'If a list item is an array section, it must specify contiguous
|> storage.'
(I suppose we all agree on that.)
|> But I did not see a more explicit description how that should be
|> handled, contrary to the rather explicit description for OpenACC.
Surprising. Can some relevant wording (like OpenACC's "visible device
copy") be found elsewhere (in the hundreds of pages...)?
By default, I would've assumed "defensive", so again: "I don't think it's
appropriate for an 'update' to alter anything else than the exact 'var's
as specified." (my words from above)?
Grüße
Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf
next prev parent reply other threads:[~2021-02-09 15:37 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-02-08 17:50 Tobias Burnus
2021-02-09 9:45 ` Thomas Schwinge
2021-02-09 11:41 ` Tobias Burnus
2021-02-09 11:58 ` Thomas Schwinge
2021-02-09 12:45 ` Tobias Burnus
2021-02-09 13:05 ` Julian Brown
2021-02-09 13:05 ` Julian Brown
2021-02-09 15:37 ` Thomas Schwinge [this message]
2021-02-09 16:08 ` Julian Brown
2021-02-16 14:37 ` *ping* – " Tobias Burnus
2021-02-16 15:57 ` Paul Richard Thomas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87sg65uu9w.fsf@dem-tschwing-1.ger.mentorg.com \
--to=thomas@codesourcery.com \
--cc=fortran@gcc.gnu.org \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
--cc=julian@codesourcery.com \
--cc=tobias@codesourcery.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).