public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/99293] New: Built-in vec_splat generates sub-optimal code for -mcpu=power10
@ 2021-02-26 17:02 munroesj at gcc dot gnu.org
2021-02-26 17:07 ` [Bug middle-end/99293] " munroesj at gcc dot gnu.org
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: munroesj at gcc dot gnu.org @ 2021-02-26 17:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99293
Bug ID: 99293
Summary: Built-in vec_splat generates sub-optimal code for
-mcpu=power10
Product: gcc
Version: 10.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
Created attachment 50263
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50263&action=edit
Simplified test case
While adding code to the Power Vector Library (PVECLIB) for the POWER10 target, I
see strange code generation for the Altivec built-in vec_splat with the vector long
long type. I would expect an xxpermdi (xxspltd), based on the "Power Vector
Intrinsic Programming Reference".
But I see the following generated:
0000000000000300 <test_vec_rlq_PWR10>:
300: 67 02 69 7c mfvsrld r9,vs35
304: 67 4b 09 7c mtvsrdd vs32,r9,r9
308: 05 00 42 10 vrlq v2,v2,v0
30c: 20 00 80 4e blr
While this seems to be functionally correct, the trip through the GPR seems
unnecessary. It requires two serially dependent instructions where a single
xxspltd would do. I expected:
0000000000000300 <test_vec_rlq_PWR10>:
300: 57 1b 63 f0 xxspltd vs35,vs35,1
304: 05 18 42 10 vrlq v2,v2,v3
308: 20 00 80 4e blr
The compiler was:
Compiler: gcc version 10.2.1 20210104 (Advance-Toolchain 14.0-2) [2093e873bb6c]
(GCC)
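
The equivalence claimed in the report (the mfvsrld/mtvsrdd pair gives the same result as a single xxspltd) can be checked with a small portable model. This is only a sketch: the vsr_t struct and the helper functions are hypothetical stand-ins written for illustration, named after the instructions they imitate. They model doubleword movement only, and ignore the special case where mtvsrdd is given RA=0.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 128-bit VSX register as two doublewords.
   dw[0] is ISA doubleword 0 (high-order), dw[1] is doubleword 1. */
typedef struct { uint64_t dw[2]; } vsr_t;

/* mfvsrld RA,XS: copy doubleword 1 of the VSR into a GPR. */
static uint64_t mfvsrld(vsr_t xs) { return xs.dw[1]; }

/* mtvsrdd XT,RA,RB: doubleword 0 = RA, doubleword 1 = RB
   (the RA=0 special case is not modelled here). */
static vsr_t mtvsrdd(uint64_t ra, uint64_t rb)
{
    vsr_t t = { { ra, rb } };
    return t;
}

/* xxspltd XT,XA,UIM: replicate doubleword UIM of XA into both halves. */
static vsr_t xxspltd(vsr_t xa, unsigned uim)
{
    vsr_t t = { { xa.dw[uim & 1], xa.dw[uim & 1] } };
    return t;
}

/* The two-instruction sequence from the observed code generation. */
static vsr_t splat_via_gpr(vsr_t v)
{
    uint64_t r9 = mfvsrld(v);   /* mfvsrld r9,vs35    */
    return mtvsrdd(r9, r9);     /* mtvsrdd vs32,r9,r9 */
}

static int vsr_equal(vsr_t a, vsr_t b)
{
    return a.dw[0] == b.dw[0] && a.dw[1] == b.dw[1];
}

/* Both paths should leave doubleword 1 in both halves of the result. */
static void check_splat_sequences(void)
{
    vsr_t v = { { 0x0123456789abcdefULL, 0xfedcba9876543210ULL } };
    assert(vsr_equal(splat_via_gpr(v), xxspltd(v, 1)));
}
```

The model says nothing about latency; it only shows that the results match, which is why the extra trip through the GPR looks avoidable.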
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug middle-end/99293] Built-in vec_splat generates sub-optimal code for -mcpu=power10
From: munroesj at gcc dot gnu.org @ 2021-02-26 17:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99293
--- Comment #1 from Steven Munroe <munroesj at gcc dot gnu.org> ---
Created attachment 50264
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50264&action=edit
Compile test for simplified test case
Download vec_dummy.c and vec_int128_ppc.h into a local directory and compile:
gcc -O3 -mcpu=power10 -m64 -c vec_dummy.c
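
For readers without the attachments, the following is a minimal stand-in that exercises vec_splat on a two-doubleword vector. It is not the attached vec_dummy.c, and it is an assumption that a case this small shows the same code generation. The names vui64_t, splat_elem1 and check_splat_elem1 are hypothetical. On hosts without VSX the sketch falls back to the GCC/Clang generic vector extension, so it still compiles there.

```c
#include <assert.h>

#if defined(__VSX__)
#include <altivec.h>
typedef __vector unsigned long long vui64_t;

/* vec_splat is the Altivec/VSX built-in discussed in the report. */
static vui64_t splat_elem1(vui64_t v)
{
    return vec_splat(v, 1);
}
#else
/* Fallback for non-POWER hosts: generic vector extension, same semantics. */
typedef unsigned long long vui64_t __attribute__((vector_size(16)));

static vui64_t splat_elem1(vui64_t v)
{
    vui64_t r = { v[1], v[1] };
    return r;
}
#endif

/* Element 1 should end up in both lanes of the result. */
static void check_splat_elem1(void)
{
    vui64_t v = { 1ULL, 2ULL };
    vui64_t r = splat_elem1(v);
    assert(r[0] == 2ULL && r[1] == 2ULL);
}
```

On a POWER target, compiling this with the same flags as above and disassembling the object file should show which instructions the compiler picks for the splat.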
* [Bug middle-end/99293] Built-in vec_splat generates sub-optimal code for -mcpu=power10
From: segher at gcc dot gnu.org @ 2021-02-27 0:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99293
Segher Boessenkool <segher at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2021-02-27
Ever confirmed|0 |1
--- Comment #2 from Segher Boessenkool <segher at gcc dot gnu.org> ---
It generates non-optimal code for older CPUs as well (it does two splats
instead of one):
xxpermdi 0,35,35,3 # 7 [c=4 l=4] vsx_extract_v2di/1
xxpermdi 35,0,0,0 # 9 [c=4 l=4] vsx_splat_v2di_reg/0
vrlq 2,2,3
This is because we get things like
Trying 7 -> 9:
7: r117:DI=vec_select(r127:V1TI#0,parallel)
REG_DEAD r127:V1TI
9: r124:V2DI=vec_duplicate(r117:DI)
REG_DEAD r117:DI
Failed to match this instruction:
(set (reg:V2DI 124)
(vec_duplicate:V2DI (vec_select:DI (subreg:V2DI (reg:V1TI 127) 0)
(parallel [
(const_int 0 [0])
]))))
(the patterns we do have use vec_concat instead).
Confirmed.
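
The redundancy of the second xxpermdi in the sequence quoted in this comment can be illustrated with a small portable model. This is a sketch under stated assumptions: vsr_t and the xxpermdi helper are hypothetical illustrations of the instruction's doubleword selection, not compiler or library code.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 128-bit VSX register as two doublewords.
   dw[0] is ISA doubleword 0 (high-order), dw[1] is doubleword 1. */
typedef struct { uint64_t dw[2]; } vsr_t;

/* xxpermdi XT,XA,XB,DM: the high bit of DM picks the doubleword taken
   from XA, the low bit of DM picks the doubleword taken from XB. */
static vsr_t xxpermdi(vsr_t xa, vsr_t xb, unsigned dm)
{
    vsr_t t;
    t.dw[0] = xa.dw[(dm >> 1) & 1];
    t.dw[1] = xb.dw[dm & 1];
    return t;
}

static int vsr_equal(vsr_t a, vsr_t b)
{
    return a.dw[0] == b.dw[0] && a.dw[1] == b.dw[1];
}

/* Two splats as generated, compared with the first splat alone. */
static void check_double_splat(void)
{
    vsr_t v35 = { { 0x1111111111111111ULL, 0x2222222222222222ULL } };

    vsr_t vs0    = xxpermdi(v35, v35, 3);   /* xxpermdi 0,35,35,3 */
    vsr_t result = xxpermdi(vs0, vs0, 0);   /* xxpermdi 35,0,0,0  */

    /* The first instruction already holds the splatted value. */
    assert(vsr_equal(result, vs0));
    assert(result.dw[0] == v35.dw[1] && result.dw[1] == v35.dw[1]);
}
```

In this model the second instruction changes nothing, which matches the observation that one splat would be enough.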
* [Bug target/99293] Built-in vec_splat generates sub-optimal code for -mcpu=power10
From: meissner at gcc dot gnu.org @ 2021-06-05 5:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99293
--- Comment #3 from Michael Meissner <meissner at gcc dot gnu.org> ---
Created attachment 50947
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50947&action=edit
Proposed patch
* [Bug target/99293] Built-in vec_splat generates sub-optimal code for -mcpu=power10
From: meissner at gcc dot gnu.org @ 2021-06-05 5:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99293
Michael Meissner <meissner at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |meissner at gcc dot gnu.org
Status|NEW |ASSIGNED