public inbox for gcc-bugs@sourceware.org
* [Bug target/104698] New: Inefficient code for DI to TI sign extend on power10
From: meissner at gcc dot gnu.org @ 2022-02-26 3:58 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104698
Bug ID: 104698
Summary: Inefficient code for DI to TI sign extend on power10
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
On power10, signed conversion from DImode to TImode is inefficient for GCC 11
and the current GCC 12. GCC 10 does not do this optimization.
On power10, GCC tries to generate the 'vextsd2q' instruction. However, to
generate this instruction, it typically generates a 'mtvsrdd' instruction
to get the value into the bottom 64 bits of an Altivec register, then it
does the 'vextsd2q' instruction, and finally it generates 'mfvsrd' and 'mfvsrld'
instructions to get the value back into the GPR registers.
For power9, it generates a move instruction and then an arithmetic shift right
63 bits to fill the upper word with the copy of the sign bit.
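As a rough illustration of the power9-style sequence, the following C sketch
builds the 128-bit result from two 64-bit halves: the low half is a copy of the
input and the high half is the input shifted right arithmetically by 63 bits.
The struct and function names are hypothetical, and the code assumes that
right-shifting a negative signed value is an arithmetic shift, which GCC
documents but ISO C leaves implementation-defined.

```c
#include <stdint.h>

/* Hypothetical model of a TImode value held in two 64-bit GPRs. */
struct ti_pair {
  uint64_t lo;  /* low doubleword: copy of the input */
  uint64_t hi;  /* high doubleword: 64 copies of the sign bit */
};

/* Sign-extend a 64-bit value to 128 bits with a move and a shift.
   Assumes '>>' on a negative int64_t is an arithmetic shift (true for GCC). */
static struct ti_pair
extend_di_to_ti (int64_t x)
{
  struct ti_pair r;
  r.lo = (uint64_t) x;          /* the "move register" step */
  r.hi = (uint64_t) (x >> 63);  /* the "shift right 63 bits" step */
  return r;
}
```

For a negative input every bit of the high half is set; for a non-negative
input the high half is zero.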
GCC should generate the following code sequences:
1) For GPR register to GPR register: move the register, then use 'sradi' to
create the sign bits in the upper word.
2) For GPR register to VSX register to Altivec register: splat the value to
fill the bottom 64 bits, and then do 'vextsd2q'.
3) For memory to GPR register: load the value into the low register, and fill
the high register with the sign bit.
4) For memory to Altivec register: load the value with load VSX vector
rightmost doubleword, and then do 'vextsd2q'.
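For reference, the source-level conversions behind cases 1 and 3 can be
sketched in C as below. The function names are hypothetical, and __int128 is a
GCC extension available on 64-bit targets. Which register class the value ends
up in is the compiler's choice, so the Altivec cases (2 and 4) are not forced
here; the sketch only pins down the result that every sequence has to produce.

```c
#include <stdint.h>

/* Case 1 (assumed): the source arrives in a register and the 128-bit
   result is returned by value. */
__int128
extend_from_reg (int64_t x)
{
  return (__int128) x;
}

/* Case 3 (assumed): the source is loaded from memory first. */
__int128
extend_from_mem (const int64_t *p)
{
  return (__int128) *p;
}

/* Hypothetical helpers that split a 128-bit value into its two halves. */
static uint64_t
ti_low (__int128 v)
{
  return (uint64_t) v;
}

static uint64_t
ti_high (__int128 v)
{
  return (uint64_t) ((unsigned __int128) v >> 64);
}
```

Compiling functions like these at -O2 for power10 and inspecting the assembly
is one way to see which of the sequences the compiler picked.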
* [Bug target/104698] Inefficient code for DI to TI sign extend on power10
From: segher at gcc dot gnu.org @ 2022-02-26 12:46 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104698
--- Comment #1 from Segher Boessenkool <segher at gcc dot gnu.org> ---
GCC should not use unspecs for any basic operations like this. *That* is
the problem.
* [Bug target/104698] Inefficient code for DI to TI sign extend on power10
From: segher at gcc dot gnu.org @ 2022-02-26 12:50 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104698
Segher Boessenkool <segher at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-02-26
     Ever confirmed|0                           |1
--- Comment #2 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Trying 6 -> 12:
    6: r120:TI=unspec[r121:DI] 190
      REG_DEAD r121:DI
   12: %3:TI=unspec[r120:TI] 189
      REG_DEAD r120:TI
Failed to match this instruction:
(set (reg/i:TI 3 3)
    (unspec:TI [
            (unspec:TI [
                    (reg:DI 121)
                ] UNSPEC_MTVSRD_DITI_W1)
        ] UNSPEC_EXTENDDITI2))
If this was expressed as RTL, it would just work.
* [Bug target/104698] Inefficient code for DI to TI sign extend on power10
From: meissner at gcc dot gnu.org @ 2022-02-28 21:16 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104698
--- Comment #3 from Michael Meissner <meissner at gcc dot gnu.org> ---
It goes beyond 'just use RTL'.
The problem is that the code only generates an Altivec instruction. So if the
__int128_t value is in a GPR, the compiler will need to do a move to the vector
registers (1 insn), the instruction, and then move back to the GPRs (2 insns).
What it needs is separate code paths: one for when the __int128_t is in a GPR
and one for when it is in an Altivec register.
I have patches that I'm testing that do this (i.e. handle both GPR and
Altivec registers) to avoid having to do direct moves.
* [Bug target/104698] Inefficient code for DI to TI sign extend on power10
From: cvs-commit at gcc dot gnu.org @ 2022-03-05 5:03 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104698
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Michael Meissner <meissner@gcc.gnu.org>:
https://gcc.gnu.org/g:1301d7f647c7ac40da7f910aa6e790205e34bb8b
commit r12-7501-g1301d7f647c7ac40da7f910aa6e790205e34bb8b
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Sat Mar 5 00:01:52 2022 -0500
Optimize signed DImode -> TImode on power10.
On power10, GCC tries to optimize the signed conversion from DImode to
TImode by using the vextsd2q instruction. However, to generate this
instruction, it would have to generate 3 direct moves (1 from the GPR
registers to the altivec registers, and 2 from the altivec registers to
the GPR register).
This patch generates the shift right immediate instruction to do the
conversion if the target/source registers are GPR registers, like it does
on earlier systems. If the target/source registers are Altivec registers,
it will generate the vextsd2q instruction.
2022-03-05 Michael Meissner <meissner@linux.ibm.com>
gcc/
PR target/104698
* config/rs6000/vsx.md (UNSPEC_MTVSRD_DITI_W1): Delete.
(mtvsrdd_diti_w1): Delete.
(extendditi2): Convert from define_expand to
define_insn_and_split. Replace with code to deal with both GPR
registers and with altivec registers.
gcc/testsuite/
PR target/104698
* gcc.target/powerpc/pr104698-1.c: New test.
* gcc.target/powerpc/pr104698-2.c: New test.
* [Bug target/104698] Inefficient code for DI to TI sign extend on power10
From: bergner at gcc dot gnu.org @ 2023-06-02 1:22 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104698
--- Comment #5 from Peter Bergner <bergner at gcc dot gnu.org> ---
Mike, are we doing backports of this? ...or can we mark this as FIXED?
* [Bug target/104698] Inefficient code for DI to TI sign extend on power10
From: meissner at gcc dot gnu.org @ 2023-10-13 22:30 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104698
Michael Meissner <meissner at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
--- Comment #6 from Michael Meissner <meissner at gcc dot gnu.org> ---
The patch was committed to the GCC 13 branch on March 5th, 2022 and later
backported to GCC 12.