public inbox for gcc-bugs@sourceware.org
* [Bug target/100165] New: fmov could be used to zero out the upper bits instead of movi/zip or movi/ins with __builtin_shuffle and zero vector
@ 2021-04-21 0:47 pinskia at gcc dot gnu.org
2021-08-25 8:13 ` [Bug target/100165] " pinskia at gcc dot gnu.org
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-04-21 0:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100165
Bug ID: 100165
Summary: fmov could be used to zero out the upper bits instead
         of movi/zip or movi/ins with __builtin_shuffle and
         zero vector
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---
Target: aarch64-*-*
Take:
typedef double V __attribute__((vector_size(16)));
typedef long long VI __attribute__((vector_size(16)));
V
foo (V x)
{
  return __builtin_shuffle (x, (V) { 0, 0, }, (VI) {0, 3});
}
----- CUT ----
Or
typedef float V __attribute__((vector_size(16)));
typedef int VI __attribute__((vector_size(16)));
V
foo (V x)
{
  return __builtin_shuffle (x, (V) { 0, 0, 0, 0 }, (VI) {0, 1, 4, 5});
}
---- CUT ----
Both should just produce:
        fmov    d0, d0
        ret
---- CUT ----
The x86_64-specific version of this is PR 94680, which I just confirmed today.
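To make the expected semantics concrete, here is a minimal sketch in GNU C (it relies on GCC's vector extensions, as the testcases do). It pairs each shuffle from the report with a hand-written "keep the low 64 bits, zero the rest" function. The names V2, V4, shuffle_v2df, low_half_v2df and the same_bits_* helpers are illustrative and not part of the testcase; the only claim is that both forms compute the same vector, which is the operation the report expects a single fmov to perform.

```c
#include <string.h>

typedef double V2 __attribute__((vector_size(16)));
typedef long long VI2 __attribute__((vector_size(16)));
typedef float V4 __attribute__((vector_size(16)));
typedef int VI4 __attribute__((vector_size(16)));

/* First testcase: lane 0 comes from x, lane 1 from the zero vector
   (mask index 3 selects lane 1 of the second operand).  */
V2
shuffle_v2df (V2 x)
{
  return __builtin_shuffle (x, (V2) { 0, 0 }, (VI2) { 0, 3 });
}

/* Hand-written equivalent: keep the low lane, zero the high lane.  */
V2
low_half_v2df (V2 x)
{
  return (V2) { x[0], 0.0 };
}

/* Second testcase: lanes 0 and 1 come from x, lanes 2 and 3 are zero.  */
V4
shuffle_v4sf (V4 x)
{
  return __builtin_shuffle (x, (V4) { 0, 0, 0, 0 }, (VI4) { 0, 1, 4, 5 });
}

/* Hand-written equivalent: keep the low two lanes, zero the rest.  */
V4
low_half_v4sf (V4 x)
{
  return (V4) { x[0], x[1], 0.0f, 0.0f };
}

/* Bitwise comparison helpers (illustrative).  */
int
same_bits_v2df (V2 a, V2 b)
{
  return memcmp (&a, &b, sizeof a) == 0;
}

int
same_bits_v4sf (V4 a, V4 b)
{
  return memcmp (&a, &b, sizeof a) == 0;
}
```

Comparing shuffle_v2df against low_half_v2df (and likewise for the float variant) on any input should report identical bits.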
* [Bug target/100165] fmov could be used to zero out the upper bits instead of movi/zip or movi/ins with __builtin_shuffle and zero vector
From: pinskia at gcc dot gnu.org @ 2021-08-25 8:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100165
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This
V
foo (V x)
{
  return __builtin_shuffle (x, (V) { 0, 0, 0, 0, }, (VI) { 0, 1, 6, 7});
}
Produces:
        movi    v1.4s, 0
        ins     v0.d[1], v1.d[1]
That is better, but fmov would still be better :).
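The masks {0, 1, 4, 5} and {0, 1, 6, 7} describe the same result here because the second operand is all zeros: any mask index of 4 or above picks some lane of the zero vector. A plain-C scalar model of the two-operand shuffle shows this. This is only a sketch; shuffle_model is a made-up name, and the modulo-2*N treatment of mask values follows the documented behaviour of __builtin_shuffle.

```c
#include <stddef.h>

/* Scalar model of two-operand __builtin_shuffle on n-lane float vectors.
   Mask values are taken modulo 2*n: 0..n-1 select from a,
   n..2n-1 select from b.  Assumes n is a power of two.  */
void
shuffle_model (const float *a, const float *b, const int *mask,
               float *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      size_t idx = (size_t) (unsigned int) mask[i] % (2 * n);
      out[i] = idx < n ? a[idx] : b[idx - n];
    }
}
```

Running the model with both masks against a zero second operand yields the same four lanes, so the difference between the two testcases is only in how the compiler matches the permutation, not in what it computes.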
* [Bug target/100165] fmov could be used to zero out the upper bits instead of movi/zip or movi/ins with __builtin_shuffle and zero vector
From: pinskia at gcc dot gnu.org @ 2023-11-12 21:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100165
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 56564
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56564&action=edit
Full testcase
* [Bug target/100165] fmov could be used to zero out the upper bits instead of movi/zip or movi/ins with __builtin_shuffle and zero vector
From: pinskia at gcc dot gnu.org @ 2023-11-12 21:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100165
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-11-12
     Ever confirmed|0                           |1
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Currently the trunk produces:
```
foo:
        ins     v0.d[1], xzr
        ret
foo1:
        movi    v31.4s, 0
        zip1    v0.2d, v0.2d, v31.2d
        ret
foo2:
        ins     v0.d[1], xzr
        ret
```
That is better than 10.x, but it still does not use fmov.
* [Bug target/100165] fmov could be used to zero out the upper bits instead of movi/zip or movi/ins with __builtin_shuffle and zero vector
From: pinskia at gcc dot gnu.org @ 2023-11-12 21:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100165
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                          |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org    |pinskia at gcc dot gnu.org
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Mine, I will handle this. Most likely for GCC 15 though.
* [Bug target/100165] fmov could be used to zero out the upper bits instead of movi/zip or movi/ins with __builtin_shuffle and zero vector
From: pinskia at gcc dot gnu.org @ 2023-11-12 21:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100165
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
* [Bug target/100165] fmov could be used to zero out the upper bits instead of movi/zip or movi/ins with __builtin_shuffle and zero vector
From: pinskia at gcc dot gnu.org @ 2024-02-27 8:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100165
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
For the ones which produce ins, it should be easy to modify the pattern to emit
fmov in those cases, that is, when the zeroed lane is the upper one of a
two-lane vector (`elt == 1`):
(define_insn "aarch64_simd_vec_set_zero<mode>"
  [(set (match_operand:VALLS_F16 0 "register_operand" "=w")
        (vec_merge:VALLS_F16
            (match_operand:VALLS_F16 1 "aarch64_simd_imm_zero" "")
            (match_operand:VALLS_F16 3 "register_operand" "0")
            (match_operand:SI 2 "immediate_operand" "i")))]
  "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0"
  {
    int elt = ENDIAN_LANE_N (<nunits>, exact_log2 (INTVAL (operands[2])));
    operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt);
    return "ins\\t%0.<Vetype>[%p2], <vwcore>zr";
  }
)
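As a rough illustration of the choice such a modified pattern would have to make, here is a standalone C sketch. It is not GCC code: the enum and the function choose_zero_lane_insn are hypothetical, and elt is assumed to be the architectural lane number (that is, already passed through ENDIAN_LANE_N). It encodes only the fact that fmov keeps lane 0 and clears every bit above it, so it can replace a single-lane `ins ..., zr` only when the zeroed lane is the top lane of a two-lane vector.

```c
/* Hypothetical model of the instruction choice; not taken from GCC.  */
enum zero_lane_choice
{
  USE_INS,     /* ins  vN.<T>[elt], <w|x>zr */
  USE_FMOV_S,  /* fmov sN, sN: keeps bits 0..31, zeroes the rest */
  USE_FMOV_D   /* fmov dN, dN: keeps bits 0..63, zeroes the rest */
};

/* nunits:   number of lanes in the vector mode
   elt_bits: width of one lane in bits
   elt:      architectural index of the single lane being zeroed  */
enum zero_lane_choice
choose_zero_lane_insn (int nunits, int elt_bits, int elt)
{
  /* fmov only helps when lane 0 is the sole lane that survives.  */
  if (nunits == 2 && elt == 1)
    {
      if (elt_bits == 64)
        return USE_FMOV_D;
      if (elt_bits == 32)
        return USE_FMOV_S;
    }
  return USE_INS;
}
```

Under these assumptions, the foo and foo2 cases from comment #3 (two 64-bit lanes, lane 1 zeroed) map to USE_FMOV_D, while zeroing lane 0 or a single lane of a four-lane vector still needs ins.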