* [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
From: Tobias Burnus @ 2021-08-09 13:55 UTC (permalink / raw)
To: gcc-patches, Gerald Pfeifer, Andrew Stubbs, Thomas Schwinge
Now that the GCN/OpenACC patches for this have been committed today,
I think it makes sense to add it to the documentation.
(I was told that some follow-up items are still pending, but as
the feature does work ...)
Cf. also Andrew's talk of last year,
https://linuxplumbersconf.org/event/7/contributions/749/attachments/560/988/AMD_GCN_Update_-_LPC_2020.pdf
which I utilized when writing the attached wwwdocs patch.
Comments and/or suggestions?
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company; Managing directors: Thomas Heurung, Frank Thürauf; Registered office: München; Commercial register: München, HRB 106955
[-- Attachment #2: gcn-worker.diff --]
[-- Type: text/x-patch, Size: 792 bytes --]
gcc-12/changes.html (GCN): >1 workers per gang
diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 9c2799cf..05e24737 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -119,10 +119,14 @@
<h3 id="amdgcn">AMD Radeon (GCN)</h3>
<ul>
<li>Debug experience with ROCGDB has been improved.</li>
<li>Support for the type <code>__int128_t</code>/<code>integer(kind=16)</code>
was added.</li>
+ <li>When used as OpenACC device: the limitation of 1 worker per gang, 2 gangs
+ per CU has been lifted; now up to 16 workers per gang and 40 gangs per CU
+ are supported. (Except that the hardware limit of 40 workers total may
+ not be exceeded.)</li>
</ul>
<!-- <h3 id="arc">ARC</h3> -->
<!-- <h3 id="arm">ARM</h3> -->
* Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
From: Thomas Schwinge @ 2021-08-09 14:27 UTC (permalink / raw)
To: Tobias Burnus, Julian Brown; +Cc: gcc-patches, Gerald Pfeifer, Andrew Stubbs
Hi!
On 2021-08-09T15:55:07+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> Now that the GCN/OpenACC patches for this have been committed today,
> I think it makes sense to add it to the documentation.
Thanks for thinking of this.
> (I was told that some follow-up items are still pending, but as
> the feature does work ...)
ACK.
> Cf. also Andrew's talk of last year,
> https://linuxplumbersconf.org/event/7/contributions/749/attachments/560/988/AMD_GCN_Update_-_LPC_2020.pdf
> which I utilized when writing the attached wwwdocs patch.
>
> Comments and/or suggestions?
> gcc-12/changes.html (GCN): >1 workers per gang
ACK for that aspect specifically, but:
> --- a/htdocs/gcc-12/changes.html
> +++ b/htdocs/gcc-12/changes.html
> + <li>When used as OpenACC device: the limitation of 1 worker per gang, 2 gangs
> + per CU has been lifted; now up to 16 workers per gang and 40 gangs per CU
> + are supported. (Except that the hardware limit of 40 workers total may
> + not be exceeded.)</li>
I haven't changed anything related to a "limitation of [...] 2 gangs per
CU has been lifted". Maybe that has already been done earlier, maybe
that still has to be done? I don't know -- Julian?
Regards
Thomas
* Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
From: Gerald Pfeifer @ 2021-08-09 18:53 UTC (permalink / raw)
To: Tobias Burnus; +Cc: gcc-patches, Andrew Stubbs, Thomas Schwinge
On Mon, 9 Aug 2021, Tobias Burnus wrote:
> Comments and/or suggestions?
Looks good from my perspective, with the feedback that Thomas
provided.
(Is "CU" a sufficiently established term, or might it make sense
to spell it out?)
Thanks,
Gerald
* Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
From: Tobias Burnus @ 2021-08-10 10:10 UTC (permalink / raw)
To: Thomas Schwinge, Gerald Pfeifer, Julian Brown, Andrew Stubbs; +Cc: gcc-patches
Hi all,
On 09.08.21 20:53, Gerald Pfeifer wrote:
> (Is "CU" a sufficiently established term, or might it make sense
> to spell it out?)
I don't know – but we could use "per compute unit (CU)".
On 09.08.21 16:27, Thomas Schwinge wrote:
> On 2021-08-09T15:55:07+0200, Tobias Burnus<tobias@codesourcery.com> wrote:
>> +++ b/htdocs/gcc-12/changes.html
>> + <li>When used as OpenACC device: the limitation of 1 worker per gang, 2 gangs
>> + per CU has been lifted; now up to 16 workers per gang and 40 gangs per CU
>> + are supported. (Except that the hardware limit of 40 workers total may
>> + not be exceeded.)</li>
> I haven't changed anything related to a "limitation of [...] 2 gangs per
> CU has been lifted". Maybe that has already been done earlier, maybe
> that still has to be done? I don't know -- Julian?
Looking at the current code, it has:
  if (dims[0] == 0) dims[0] = get_cu_count (kernel->agent); /* Gangs. */
Thus, at least when nothing else has been specified, the number of gangs
equals the number of CUs, i.e. 1 gang per CU.
What's needed (present on OG11, but not on mainline) is something like:
  dims[0] = get_cu_count (kernel->agent) * (32 / dims[1]);
Oddly, OG11 also contains code like:
  def->gdims[0] = get_cu_count (agent); // * (40 / gcn_threads);
In other words: For gangs > #CUs or >1 gang per CU, the following patch
is needed:
[OG11] https://gcc.gnu.org/g:4dcd1e1f4e6b451aac44f919b8eb3ac49292b308
[email] https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550102.html
"not suitable for mainline until the multiple-worker support is merged there"
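For illustration, the two defaults quoted above can be compared in a small standalone C sketch. It is not the plugin code: `sketch_cu_count` is a hypothetical stand-in for `get_cu_count (kernel->agent)` with an assumed value of 64 CUs, and the OG11 formula is assumed to sit under the same `dims[0] == 0` check as the mainline line.

```c
#include <assert.h>

/* Hypothetical stand-in for get_cu_count (kernel->agent); the real
   function queries the device.  64 is an assumed value.  */
static int
sketch_cu_count (void)
{
  return 64;
}

/* Mainline default as quoted above: if no gang count was requested
   (gangs == 0), use one gang per CU.  */
static int
default_gangs_mainline (int gangs)
{
  return gangs == 0 ? sketch_cu_count () : gangs;
}

/* OG11-style default as quoted above: scale the CU count by
   32 / workers (integer division) when no gang count was requested.  */
static int
default_gangs_og11 (int gangs, int workers)
{
  assert (workers >= 1 && workers <= 16);
  return gangs == 0 ? sketch_cu_count () * (32 / workers) : gangs;
}
```

Under these assumptions, 16 workers per gang yield 2 gangs per CU by default, and 4 workers per gang yield 8 gangs per CU, whereas the mainline line always yields exactly 1 gang per CU.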
@Andrew + @Julian: Do you intend to commit it relatively soon?
Regarding the wwwdocs patch, I can hold off until that commit or reword
it to only cover the workers part.
Thanks Thomas & Gerald for the comments!
Tobias
* RE: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
From: Stubbs, Andrew @ 2021-08-16 8:34 UTC (permalink / raw)
To: Burnus, Tobias, Schwinge, Thomas, Gerald Pfeifer, Brown, Julian
Cc: gcc-patches
> In other words: For gangs > #CUs or >1 gang per CU, the following patch
> is needed:
> [OG11] https://gcc.gnu.org/g:4dcd1e1f4e6b451aac44f919b8eb3ac49292b308
> [email] https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550102.html
> "not suitable for mainline until the multiple-worker support is merged
> there"
>
> @Andrew + @Julian: Do you intend to commit it relatively soon?
> Regarding the wwwdocs patch, I can hold off until that commit or reword
> it to only cover the workers part.
Were these not part of the patch set Thomas was working on?
Andrew
* RE: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
From: Thomas Schwinge @ 2021-08-16 9:28 UTC (permalink / raw)
To: andrew_stubbs, tobias_burnus; +Cc: Gerald Pfeifer, julian_brown, gcc-patches
Hi!
On 2021-08-16T10:34:34+0200, "Stubbs, Andrew" <Andrew_Stubbs@mentor.com> wrote:
>> In other words: For gangs > #CUs or >1 gang per CU, the following patch
>> is needed:
>> [OG11] https://gcc.gnu.org/g:4dcd1e1f4e6b451aac44f919b8eb3ac49292b308
>> [email] https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550102.html
>> "not suitable for mainline until the multiple-worker support is merged
>> there"
>>
>> @Andrew + @Julian: Do you intend to commit it relatively soon?
>> Regarding the wwwdocs patch, I can hold off until that commit or reword
>> it to only cover the workers part.
>
> Were these not part of the patch set Thomas was working on?
That one is "amdgcn: Tune default OpenMP/OpenACC GPU utilization", which
indeed I'm planning to handle in the next few weeks. So I suggest we
simply hold back Tobias' 'gcc-12/changes.html' patch until that time.
Regards
Thomas
* Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
From: Tobias Burnus @ 2022-02-02 15:39 UTC (permalink / raw)
To: gcc-patches, Gerald Pfeifer, Andrew Stubbs, Thomas Schwinge
On 09.08.21 15:55, Tobias Burnus wrote:
> Now that the GCN/OpenACC patches for this have been committed today,
> I think it makes sense to add it to the documentation.
> (I was told that some follow-up items are still pending, but as
> the feature does work ...)
I think the follow-up patches have now been committed.
How about the attached patch?
Tobias
[-- Attachment #2: gcn-v2.diff --]
[-- Type: text/x-patch, Size: 756 bytes --]
gcc-12/changes.html (GCN): >1 workers per gang
diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 479bd6c5..738f4c73 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -360,6 +360,10 @@ a work-in-progress.</p>
<li>Debug experience with ROCGDB has been improved.</li>
<li>Support for the type <code>__int128_t</code>/<code>integer(kind=16)</code>
was added.</li>
+ <li>For offloading, the limitation of using only one wavefront per compute
+ unit (CU) has been lifted; up to 40 workgroups per CU and 16 wavefronts
+ per workgroup are supported. Additionally, the number of used wavefronts
+ and workgroups was tuned for performance.</li>
</ul>
<!-- <h3 id="arc">ARC</h3> -->
* Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
From: Andrew Stubbs @ 2022-02-02 16:23 UTC (permalink / raw)
To: Tobias Burnus, gcc-patches, Gerald Pfeifer, Thomas Schwinge
On 02/02/2022 15:39, Tobias Burnus wrote:
> On 09.08.21 15:55, Tobias Burnus wrote:
>> Now that the GCN/OpenACC patches for this have been committed today,
>> I think it makes sense to add it to the documentation.
>> (I was told that some follow-up items are still pending, but as
>> the feature does work ...)
>
> I think the follow-up patches have now been committed.
> How about the attached patch?
We should probably add a qualification "(up to a limit of 40 wavefronts
in total, per CU)" or else we're suggesting that there can be 40
workgroups of 16 wavefronts, which the hardware will not do.
Andrew
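The interaction of the three limits mentioned in this message can be written out as a short C sketch. The constants come from the thread (16 wavefronts per workgroup, 40 workgroups per CU, 40 wavefronts in total per CU); the function name is hypothetical, and the sketch ignores any other resource (registers, local memory) that may reduce occupancy further.

```c
#include <assert.h>

enum
{
  MAX_WAVEFRONTS_PER_WORKGROUP = 16,
  MAX_WORKGROUPS_PER_CU = 40,
  MAX_WAVEFRONTS_PER_CU = 40
};

/* Hypothetical helper: how many workgroups of the given size fit on
   one CU when only the three limits above are considered.  */
static int
max_workgroups_per_cu (int wavefronts_per_workgroup)
{
  assert (wavefronts_per_workgroup >= 1
          && wavefronts_per_workgroup <= MAX_WAVEFRONTS_PER_WORKGROUP);

  /* The per-CU wavefront total caps the workgroup count first.  */
  int by_total = MAX_WAVEFRONTS_PER_CU / wavefronts_per_workgroup;

  return by_total < MAX_WORKGROUPS_PER_CU ? by_total : MAX_WORKGROUPS_PER_CU;
}
```

With these numbers, 40 workgroups per CU are only reachable with single-wavefront workgroups, while workgroups of 16 wavefronts are limited to 2 per CU, which is the case the proposed qualification is meant to rule out.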
* Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
From: Tobias Burnus @ 2022-02-02 17:54 UTC (permalink / raw)
To: Andrew Stubbs, Tobias Burnus, gcc-patches, Gerald Pfeifer,
Thomas Schwinge
Now committed, taking Andrew's comments into account:
https://gcc.gnu.org/gcc-12/changes.html#amdgcn
Tobias
[-- Attachment #2: wwwdocs.diff --]
[-- Type: text/x-patch, Size: 953 bytes --]
commit 4b4a85b696b3bb6700eea8f687bc45f1fd0b1f8c
Author: Tobias Burnus <tobias@codesourcery.com>
Date: Wed Feb 2 18:51:32 2022 +0100
gcc-12/changes.html (GCN): >1 workers per gang
diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 479bd6c5..b6341fda 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -360,6 +360,11 @@ a work-in-progress.</p>
<li>Debug experience with ROCGDB has been improved.</li>
<li>Support for the type <code>__int128_t</code>/<code>integer(kind=16)</code>
was added.</li>
+ <li>For offloading, the limitation of using only one wavefront per compute
+ unit (CU) has been lifted. Up to 40 workgroups per CU and 16 wavefronts
+ per workgroup are supported (up to a limit of 40 wavefronts in total,
+ per CU). Additionally, the number of used wavefronts and workgroups was
+ tuned for performance.</li>
</ul>
<!-- <h3 id="arc">ARC</h3> -->
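From the user's side, the committed change matters for OpenACC code that requests more than one worker per gang. The following is a minimal sketch, assuming that OpenACC gangs correspond to workgroups and workers to wavefronts (as the rewording between the patch versions in this thread suggests); the function name and the choice of 16 workers are illustrative only, and `num_workers` is a request that the implementation may adjust.

```c
#include <assert.h>

/* Sum N doubles with an OpenACC parallel loop that asks for
   16 workers per gang.  Without OpenACC support enabled, the pragma
   is ignored and the loop runs sequentially on the host.  */
static double
acc_sum (const double *x, int n)
{
  assert (n >= 0);
  double sum = 0.0;

#pragma acc parallel loop gang worker num_workers(16) \
  reduction(+:sum) copyin(x[0:n])
  for (int i = 0; i < n; i++)
    sum += x[i];

  return sum;
}
```

A build along the lines of `gcc -fopenacc -foffload=amdgcn-amdhsa` would be needed to actually offload this to a GCN device; that requires a GCC configured with the corresponding offload target.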
Thread overview: 9 messages
2021-08-09 13:55 [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang Tobias Burnus
2021-08-09 14:27 ` Thomas Schwinge
2021-08-10 10:10 ` Tobias Burnus
2021-08-16 8:34 ` Stubbs, Andrew
2021-08-16 9:28 ` Thomas Schwinge
2021-08-09 18:53 ` Gerald Pfeifer
2022-02-02 15:39 ` Tobias Burnus
2022-02-02 16:23 ` Andrew Stubbs
2022-02-02 17:54 ` Tobias Burnus