public inbox for gcc-patches@gcc.gnu.org
* [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
@ 2021-08-09 13:55 Tobias Burnus
  2021-08-09 14:27 ` Thomas Schwinge
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Tobias Burnus @ 2021-08-09 13:55 UTC (permalink / raw)
  To: gcc-patches, Gerald Pfeifer, Andrew Stubbs, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 739 bytes --]

Now that the GCN/OpenACC patches for this have been committed today,
I think it makes sense to add it to the documentation.
(I was told that some follow-up items are still pending, but as
the feature does work ...)

Cf. also Andrew's talk of last year,
https://linuxplumbersconf.org/event/7/contributions/749/attachments/560/988/AMD_GCN_Update_-_LPC_2020.pdf
which I utilized when writing the attached wwwdocs patch.

Comments and/or suggestions?

Tobias

-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Attachment #2: gcn-worker.diff --]
[-- Type: text/x-patch, Size: 792 bytes --]

gcc-12/changes.html (GCN): >1 workers per gang

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 9c2799cf..05e24737 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -119,10 +119,14 @@
 <h3 id="amdgcn">AMD Radeon (GCN)</h3>
 <ul>
   <li>Debug experience with ROCGDB has been improved.</li>
   <li>Support for the type <code>__int128_t</code>/<code>integer(kind=16)</code>
       was added.</li>
+  <li>When used as OpenACC device: the limitation of 1 worker per gang, 2 gangs
+      per CU has been lifted; now up to 16 workers per gang and 40 gangs per CU
+      are supported. (Except that the hardware limit of 40 workers total may
+      not be exceeded.)</li>
 </ul>
 
 <!-- <h3 id="arc">ARC</h3> -->
 
 <!-- <h3 id="arm">ARM</h3> -->

* Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
  2021-08-09 13:55 [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang Tobias Burnus
@ 2021-08-09 14:27 ` Thomas Schwinge
  2021-08-10 10:10   ` Tobias Burnus
  2021-08-09 18:53 ` Gerald Pfeifer
  2022-02-02 15:39 ` Tobias Burnus
  2 siblings, 1 reply; 9+ messages in thread
From: Thomas Schwinge @ 2021-08-09 14:27 UTC (permalink / raw)
  To: Tobias Burnus, Julian Brown; +Cc: gcc-patches, Gerald Pfeifer, Andrew Stubbs

Hi!

On 2021-08-09T15:55:07+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> Now that the GCN/OpenACC patches for this have been committed today,
> I think it makes sense to add it to the documentation.

Thanks for thinking of this.

> (I was told that some follow-up items are still pending, but as
> the feature does work ...)

ACK.

> Cf. also Andrew's talk of last year,
> https://linuxplumbersconf.org/event/7/contributions/749/attachments/560/988/AMD_GCN_Update_-_LPC_2020.pdf
> which I utilized when writing the attached wwwdocs patch.
>
> Comments and/or suggestions?

> gcc-12/changes.html (GCN): >1 workers per gang

ACK for that aspect specifically, but:

> --- a/htdocs/gcc-12/changes.html
> +++ b/htdocs/gcc-12/changes.html

> +  <li>When used as OpenACC device: the limitation of 1 worker per gang, 2 gangs
> +      per CU has been lifted; now up to 16 workers per gang and 40 gangs per CU
> +      are supported. (Except that the hardware limit of 40 workers total may
> +      not be exceeded.)</li>

I haven't changed anything related to a "limitation of [...] 2 gangs per
CU has been lifted".  Maybe that has already been done earlier, maybe
that still has to be done?  I don't know -- Julian?


Grüße
 Thomas

* Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
  2021-08-09 13:55 [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang Tobias Burnus
  2021-08-09 14:27 ` Thomas Schwinge
@ 2021-08-09 18:53 ` Gerald Pfeifer
  2022-02-02 15:39 ` Tobias Burnus
  2 siblings, 0 replies; 9+ messages in thread
From: Gerald Pfeifer @ 2021-08-09 18:53 UTC (permalink / raw)
  To: Tobias Burnus; +Cc: gcc-patches, Andrew Stubbs, Thomas Schwinge

On Mon, 9 Aug 2021, Tobias Burnus wrote:
> Comments and/or suggestions?

Looks good from my perspective, with the feedback that Thomas
provided.

(Is "CU" a sufficiently established term, or might it make sense
to spell it out?)

Thanks,
Gerald

* Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
  2021-08-09 14:27 ` Thomas Schwinge
@ 2021-08-10 10:10   ` Tobias Burnus
  2021-08-16  8:34     ` Stubbs, Andrew
  0 siblings, 1 reply; 9+ messages in thread
From: Tobias Burnus @ 2021-08-10 10:10 UTC (permalink / raw)
  To: Thomas Schwinge, Gerald Pfeifer, Julian Brown, Andrew Stubbs; +Cc: gcc-patches

Hi all,

On 09.08.21 20:53, Gerald Pfeifer wrote:

> (Is "CU" a sufficiently established term, or might it make sense
> to spell it out?)
I don't know – but we could use "per compute unit (CU)".

On 09.08.21 16:27, Thomas Schwinge wrote:
> On 2021-08-09T15:55:07+0200, Tobias Burnus<tobias@codesourcery.com>  wrote:
>> +++ b/htdocs/gcc-12/changes.html
>> +  <li>When used as OpenACC device: the limitation of 1 worker per gang, 2 gangs
>> +      per CU has been lifted; now up to 16 workers per gang and 40 gangs per CU
>> +      are supported. (Except that the hardware limit of 40 workers total may
>> +      not be exceeded.)</li>
> I haven't changed anything related to a "limitation of [...] 2 gangs per
> CU has been lifted".  Maybe that has already been done earlier, maybe
> that still has to be done?  I don't know -- Julian?

Looking at the current code, it has:
   if (dims[0] == 0) dims[0] = get_cu_count (kernel->agent); /* Gangs.  */

Thus, at least when nothing else has been specified, the number of gangs
equals the number of CUs, i.e. 1 gang per CU.

[OG11 – but not mainline] What's needed is something like:
       dims[0] = get_cu_count (kernel->agent) * (32 / dims[1]);
which I see in OG11 – oddly, I also see there code like:
           def->gdims[0] = get_cu_count (agent); // * (40 / gcn_threads);

In other words:  For gangs > #CUs or >1 gang per CU, the following patch
is needed:
   [OG11] https://gcc.gnu.org/g:4dcd1e1f4e6b451aac44f919b8eb3ac49292b308
   [email] https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550102.html
      "not suitable for mainline until the multiple-worker support is merged there"

@Andrew + @Julian:  Do you intend to commit it relatively soon?
Regarding the wwwdocs patch, I can hold off until that commit or reword
it to only cover the workers part.
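
For comparison, the two default-gang computations quoted above can be put side by side in a small sketch. This is only an illustration, not the plugin code: the function names are hypothetical, `cu_count` stands in for the result of `get_cu_count (kernel->agent)`, and guarding the scaled variant with `dims[0] == 0` is an assumption carried over from the mainline line.

```c
#include <assert.h>

/* dims[0] is the number of gangs, dims[1] the number of workers;
   0 in dims[0] means "not specified by the user".  */

/* Default as in the quoted mainline line: one gang per CU.
   cu_count stands in for get_cu_count (kernel->agent).  */
static void
default_gangs_one_per_cu (int dims[2], int cu_count)
{
  if (dims[0] == 0)
    dims[0] = cu_count;
}

/* Default as in the quoted OG11 line: scale the gang count by
   32 / workers.  The dims[0] == 0 guard is an assumption of this
   sketch, as is the requirement that dims[1] is already set.  */
static void
default_gangs_scaled (int dims[2], int cu_count)
{
  assert (dims[1] > 0 && dims[1] <= 32);
  if (dims[0] == 0)
    dims[0] = cu_count * (32 / dims[1]);
}
```

With 60 CUs and 4 workers per gang, the first variant would launch 60 gangs and the second 60 * (32 / 4) = 480 gangs; an explicitly requested gang count is left untouched by both.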

Thanks Thomas & Gerald for the comments!

Tobias


* RE: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
  2021-08-10 10:10   ` Tobias Burnus
@ 2021-08-16  8:34     ` Stubbs, Andrew
  2021-08-16  9:28       ` Thomas Schwinge
  0 siblings, 1 reply; 9+ messages in thread
From: Stubbs, Andrew @ 2021-08-16  8:34 UTC (permalink / raw)
  To: Burnus, Tobias, Schwinge, Thomas, Gerald Pfeifer, Brown, Julian
  Cc: gcc-patches

> In other words:  For gangs > #CUs or >1 gang per CU, the following patch
> is needed:
>    [OG11] https://gcc.gnu.org/g:4dcd1e1f4e6b451aac44f919b8eb3ac49292b308
>    [email] https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550102.html
>       "not suitable for mainline until the multiple-worker support is merged
> there"
> 
> @Andrew + @Julian:  Do you intent to commit it relatively soon?
> Regarding the wwwdocs patch, I can hold off until that commit or reword
> it to only cover the workers part.

Were these not part of the patch set Thomas was working on?

Andrew

* RE: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
  2021-08-16  8:34     ` Stubbs, Andrew
@ 2021-08-16  9:28       ` Thomas Schwinge
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Schwinge @ 2021-08-16  9:28 UTC (permalink / raw)
  To: andrew_stubbs, tobias_burnus; +Cc: Gerald Pfeifer, julian_brown, gcc-patches

Hi!

On 2021-08-16T10:34:34+0200, "Stubbs, Andrew" <Andrew_Stubbs@mentor.com> wrote:
>> In other words:  For gangs > #CUs or >1 gang per CU, the following patch
>> is needed:
>>    [OG11] https://gcc.gnu.org/g:4dcd1e1f4e6b451aac44f919b8eb3ac49292b308
>>    [email] https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550102.html
>>       "not suitable for mainline until the multiple-worker support is merged
>> there"
>>
>> @Andrew + @Julian:  Do you intent to commit it relatively soon?
>> Regarding the wwwdocs patch, I can hold off until that commit or reword
>> it to only cover the workers part.
>
> Were these not part of the patch set Thomas was working on?

That one is "amdgcn: Tune default OpenMP/OpenACC GPU utilization", which
indeed I'm planning to handle in the next few weeks.  So I suggest we
simply hold back Tobias' 'gcc-12/changes.html' patch until that time.


Grüße
 Thomas

* Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
  2021-08-09 13:55 [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang Tobias Burnus
  2021-08-09 14:27 ` Thomas Schwinge
  2021-08-09 18:53 ` Gerald Pfeifer
@ 2022-02-02 15:39 ` Tobias Burnus
  2022-02-02 16:23   ` Andrew Stubbs
  2 siblings, 1 reply; 9+ messages in thread
From: Tobias Burnus @ 2022-02-02 15:39 UTC (permalink / raw)
  To: gcc-patches, Gerald Pfeifer, Andrew Stubbs, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 637 bytes --]

On 09.08.21 15:55, Tobias Burnus wrote:
> Now that the GCN/OpenACC patches for this have been committed today,
> I think it makes sense to add it to the documentation.
> (I was told that some follow-up items are still pending, but as
> the feature does work ...)

I think the follow-up patches have now been committed.
How about the attached patch?

Tobias

[-- Attachment #2: gcn-v2.diff --]
[-- Type: text/x-patch, Size: 756 bytes --]

gcc-12/changes.html (GCN): >1 workers per gang

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 479bd6c5..738f4c73 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -360,6 +360,10 @@ a work-in-progress.</p>
   <li>Debug experience with ROCGDB has been improved.</li>
   <li>Support for the type <code>__int128_t</code>/<code>integer(kind=16)</code>
       was added.</li>
+  <li>For offloading, the limitation of using only one wavefront per compute
+      unit (CU) has been lifted; up to 40 workgroup per CU and 16 wavefronts
+      per workgroup are supported. Additionally, the number of used wavefronts
+      and workgroups was tuned for performance.</li>
 </ul>
 
 <!-- <h3 id="arc">ARC</h3> -->

* Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
  2022-02-02 15:39 ` Tobias Burnus
@ 2022-02-02 16:23   ` Andrew Stubbs
  2022-02-02 17:54     ` Tobias Burnus
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Stubbs @ 2022-02-02 16:23 UTC (permalink / raw)
  To: Tobias Burnus, gcc-patches, Gerald Pfeifer, Thomas Schwinge

On 02/02/2022 15:39, Tobias Burnus wrote:
> On 09.08.21 15:55, Tobias Burnus wrote:
>> Now that the GCN/OpenACC patches for this have been committed today,
>> I think it makes sense to add it to the documentation.
>> (I was told that some follow-up items are still pending, but as
>> the feature does work ...)
> 
> I think the follow-up patches have now been committed.
> How about the attached patch?

We should probably add a qualification "(up to a limit of 40 wavefronts
in total, per CU)" or else we're suggesting that there can be 40
workgroups of 16 wavefronts, which the hardware will not do.
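
The arithmetic behind that qualification can be sketched as follows. This is a minimal illustration using only the three limits named in this thread (40 wavefronts per CU, 16 wavefronts per workgroup, 40 workgroups per CU); the function and constant names are hypothetical, and resident workgroup counts on real hardware may be lower for other reasons such as register or local-memory usage.

```c
#include <assert.h>

enum
{
  MAX_WAVEFRONTS_PER_CU = 40,        /* Total wavefront limit per CU.  */
  MAX_WAVEFRONTS_PER_WORKGROUP = 16  /* Limit per workgroup.  */
};

/* Return how many workgroups of the given size fit on one CU under
   the total-wavefront limit, or 0 if the size itself is not allowed.
   The result never exceeds 40, the workgroup-per-CU limit.  */
static int
workgroups_per_cu (int wavefronts_per_workgroup)
{
  if (wavefronts_per_workgroup < 1
      || wavefronts_per_workgroup > MAX_WAVEFRONTS_PER_WORKGROUP)
    return 0;
  return MAX_WAVEFRONTS_PER_CU / wavefronts_per_workgroup;
}
```

Under these limits, workgroups of 16 wavefronts fit only twice per CU (32 wavefronts), while 40 workgroups per CU are reachable only with one wavefront each.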

Andrew

* Re: [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang
  2022-02-02 16:23   ` Andrew Stubbs
@ 2022-02-02 17:54     ` Tobias Burnus
  0 siblings, 0 replies; 9+ messages in thread
From: Tobias Burnus @ 2022-02-02 17:54 UTC (permalink / raw)
  To: Andrew Stubbs, Tobias Burnus, gcc-patches, Gerald Pfeifer,
	Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 385 bytes --]

Now committed, taking Andrew's comments into account:

https://gcc.gnu.org/gcc-12/changes.html#amdgcn

Tobias

[-- Attachment #2: wwwdocs.diff --]
[-- Type: text/x-patch, Size: 953 bytes --]

commit 4b4a85b696b3bb6700eea8f687bc45f1fd0b1f8c
Author: Tobias Burnus <tobias@codesourcery.com>
Date:   Wed Feb 2 18:51:32 2022 +0100

    gcc-12/changes.html (GCN): >1 workers per gang

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 479bd6c5..b6341fda 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -360,6 +360,11 @@ a work-in-progress.</p>
   <li>Debug experience with ROCGDB has been improved.</li>
   <li>Support for the type <code>__int128_t</code>/<code>integer(kind=16)</code>
       was added.</li>
+  <li>For offloading, the limitation of using only one wavefront per compute
+      unit (CU) has been lifted. Up to 40 workgroups per CU and 16 wavefronts
+      per workgroup are supported (up to a limit of 40 wavefronts in total,
+      per CU). Additionally, the number of used wavefronts and workgroups was
+      tuned for performance.</li>
 </ul>
 
 <!-- <h3 id="arc">ARC</h3> -->
