public inbox for gcc-bugs@sourceware.org
* [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine
@ 2021-09-09 20:43 ntukanov at cmu dot edu
  2021-09-09 21:12 ` [Bug inline-asm/102264] " pinskia at gcc dot gnu.org
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: ntukanov at cmu dot edu @ 2021-09-09 20:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

            Bug ID: 102264
           Summary: Macro Intrinsics fail to use all the registers on the
                    machine
           Product: gcc
           Version: 9.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: inline-asm
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ntukanov at cmu dot edu
  Target Milestone: ---

I am trying to use custom intrinsics in order to have more control over the
assembly that the compiler is generating. The concept of these custom
intrinsics comes from http://users.ece.cmu.edu/~franzf/papers/wpmvp16.pdf.

For performance reasons, my code requires me to use all the available SIMD
registers on the machine, but when I use my custom intrinsics, I am only
getting half of the SIMD registers, which leads to register spilling.

This is the code and generated assembly in question:
https://godbolt.org/z/fqn53G9qT

Any help would be greatly appreciated.


* [Bug inline-asm/102264] Macro Intrinsics fail to use all the registers on the machine
  2021-09-09 20:43 [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine ntukanov at cmu dot edu
@ 2021-09-09 21:12 ` pinskia at gcc dot gnu.org
  2021-09-10  3:27 ` ntukanov at cmu dot edu
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-09 21:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
There seem to be some extra moves the register allocator cannot remove and
that is causing some extra spilling.

Your loop has 32 live variables and that is just at the limit.


* [Bug inline-asm/102264] Macro Intrinsics fail to use all the registers on the machine
  2021-09-09 20:43 [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine ntukanov at cmu dot edu
  2021-09-09 21:12 ` [Bug inline-asm/102264] " pinskia at gcc dot gnu.org
@ 2021-09-10  3:27 ` ntukanov at cmu dot edu
  2021-09-10  7:49 ` [Bug target/102264] " rguenth at gcc dot gnu.org
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: ntukanov at cmu dot edu @ 2021-09-10  3:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

--- Comment #2 from Nicholai Tukanov <ntukanov at cmu dot edu> ---
(In reply to Andrew Pinski from comment #1)
> There seem to be some extra moves the register allocator cannot remove and
> that is causing some extra spilling.
>
> Your loop has 32 live variables and that is just at the limit.

Can the register allocator be modified to recognize the other registers? The
problem seems limited to the compute instruction (vpdpwssd in this case). 

I specifically chose 32 to max out the registers. Since the compute
instruction gets limited to half of that (zmm0-zmm15), the extra moves are
killing the performance.


* [Bug target/102264] Macro Intrinsics fail to use all the registers on the machine
  2021-09-09 20:43 [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine ntukanov at cmu dot edu
  2021-09-09 21:12 ` [Bug inline-asm/102264] " pinskia at gcc dot gnu.org
  2021-09-10  3:27 ` ntukanov at cmu dot edu
@ 2021-09-10  7:49 ` rguenth at gcc dot gnu.org
  2021-09-10  8:24 ` pinskia at gcc dot gnu.org
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-09-10  7:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*
          Component|inline-asm                  |target

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Maybe 'x' is simply the wrong constraint?  Try 'y'.


* [Bug target/102264] Macro Intrinsics fail to use all the registers on the machine
  2021-09-09 20:43 [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine ntukanov at cmu dot edu
                   ` (2 preceding siblings ...)
  2021-09-10  7:49 ` [Bug target/102264] " rguenth at gcc dot gnu.org
@ 2021-09-10  8:24 ` pinskia at gcc dot gnu.org
  2021-09-10  8:42 ` [Bug target/102264] [9/10/11/12 Regression] extra spilling when using inline-asm and all registers pinskia at gcc dot gnu.org
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-10  8:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> Maybe 'x' is simply the wrong constraint?  Try 'y'.

It is 'v':

v
Any EVEX encodable SSE register (%xmm0-%xmm31).


* [Bug target/102264] [9/10/11/12 Regression] extra spilling when using inline-asm and all registers
  2021-09-09 20:43 [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine ntukanov at cmu dot edu
                   ` (3 preceding siblings ...)
  2021-09-10  8:24 ` pinskia at gcc dot gnu.org
@ 2021-09-10  8:42 ` pinskia at gcc dot gnu.org
  2021-09-19 23:17 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-10  8:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
      Known to work|                            |4.9.4
   Last reconfirmed|                            |2021-09-10
            Summary|Macro Intrinsics fail to    |[9/10/11/12 Regression]
                   |use all the registers on    |extra spilling when using
                   |the machine                 |inline-asm and all
                   |                            |registers
      Known to fail|                            |5.1.0

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Using the 'v' constraint, there is still a register allocation issue but almost
all of the extra moves are gone.  The register allocation issue looks to be a
regression from GCC 4.9.


* [Bug target/102264] [9/10/11/12 Regression] extra spilling when using inline-asm and all registers
  2021-09-09 20:43 [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine ntukanov at cmu dot edu
                   ` (4 preceding siblings ...)
  2021-09-10  8:42 ` [Bug target/102264] [9/10/11/12 Regression] extra spilling when using inline-asm and all registers pinskia at gcc dot gnu.org
@ 2021-09-19 23:17 ` pinskia at gcc dot gnu.org
  2022-01-20 10:14 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-19 23:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |9.5


* [Bug target/102264] [9/10/11/12 Regression] extra spilling when using inline-asm and all registers
  2021-09-09 20:43 [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine ntukanov at cmu dot edu
                   ` (5 preceding siblings ...)
  2021-09-19 23:17 ` pinskia at gcc dot gnu.org
@ 2022-01-20 10:14 ` rguenth at gcc dot gnu.org
  2022-03-23  8:42 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-01-20 10:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 52239
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52239&action=edit
testcase with the constraints fixed

I don't see how 4.9.4 "works"; even that spills some regs.  It does seem to
spill a little less, but I am not sure whether AVX512 support, which was new in
GCC 4.9, is up to speed there.

I've attached the testcase with fixed asm constraints; GCC 11 produces
18 spills of zmm and 12 reloads.


* [Bug target/102264] [9/10/11/12 Regression] extra spilling when using inline-asm and all registers
  2021-09-09 20:43 [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine ntukanov at cmu dot edu
                   ` (6 preceding siblings ...)
  2022-01-20 10:14 ` rguenth at gcc dot gnu.org
@ 2022-03-23  8:42 ` rguenth at gcc dot gnu.org
  2022-05-27  9:46 ` [Bug target/102264] [10/11/12/13 " rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-03-23  8:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
So any comment?  I think the outcome is reasonable.


* [Bug target/102264] [10/11/12/13 Regression] extra spilling when using inline-asm and all registers
  2021-09-09 20:43 [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine ntukanov at cmu dot edu
                   ` (7 preceding siblings ...)
  2022-03-23  8:42 ` rguenth at gcc dot gnu.org
@ 2022-05-27  9:46 ` rguenth at gcc dot gnu.org
  2022-06-28 10:46 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-05-27  9:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.5                         |10.4

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 9 branch is being closed


* [Bug target/102264] [10/11/12/13 Regression] extra spilling when using inline-asm and all registers
  2021-09-09 20:43 [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine ntukanov at cmu dot edu
                   ` (8 preceding siblings ...)
  2022-05-27  9:46 ` [Bug target/102264] [10/11/12/13 " rguenth at gcc dot gnu.org
@ 2022-06-28 10:46 ` jakub at gcc dot gnu.org
  2023-07-07 10:40 ` [Bug target/102264] [11/12/13/14 " rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-06-28 10:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.4                        |10.5

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 10.4 is being released, retargeting bugs to GCC 10.5.


* [Bug target/102264] [11/12/13/14 Regression] extra spilling when using inline-asm and all registers
  2021-09-09 20:43 [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine ntukanov at cmu dot edu
                   ` (9 preceding siblings ...)
  2022-06-28 10:46 ` jakub at gcc dot gnu.org
@ 2023-07-07 10:40 ` rguenth at gcc dot gnu.org
  2024-03-10  3:20 ` law at gcc dot gnu.org
  2024-03-22 14:08 ` law at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-07-07 10:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.5                        |11.5

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10 branch is being closed.


* [Bug target/102264] [11/12/13/14 Regression] extra spilling when using inline-asm and all registers
  2021-09-09 20:43 [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine ntukanov at cmu dot edu
                   ` (10 preceding siblings ...)
  2023-07-07 10:40 ` [Bug target/102264] [11/12/13/14 " rguenth at gcc dot gnu.org
@ 2024-03-10  3:20 ` law at gcc dot gnu.org
  2024-03-22 14:08 ` law at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: law at gcc dot gnu.org @ 2024-03-10  3:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at gcc dot gnu.org

--- Comment #11 from Jeffrey A. Law <law at gcc dot gnu.org> ---
I agree with c#7.  If you use all the registers, then you're lucky it compiled
at all without throwing an error.


* [Bug target/102264] [11/12/13/14 Regression] extra spilling when using inline-asm and all registers
  2021-09-09 20:43 [Bug inline-asm/102264] New: Macro Intrinsics fail to use all the registers on the machine ntukanov at cmu dot edu
                   ` (11 preceding siblings ...)
  2024-03-10  3:20 ` law at gcc dot gnu.org
@ 2024-03-22 14:08 ` law at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: law at gcc dot gnu.org @ 2024-03-22 14:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102264

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |INVALID

--- Comment #12 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Per c#7.


