public inbox for gcc-bugs@sourceware.org
* [Bug target/107533] New: Inefficient code sequence for fp16 testcase on aarch64
From: ramana at gcc dot gnu.org @ 2022-11-05 7:42 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107533
Bug ID: 107533
Summary: Inefficient code sequence for fp16 testcase on aarch64
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ramana at gcc dot gnu.org
Target Milestone: ---
Derived from PR92999
struct phalf {
__fp16 first;
__fp16 second;
};
struct phalf phalf_copy(struct phalf* src) __attribute__((noinline));
struct phalf phalf_copy(struct phalf* src) {
return *src;
}
Compiling for AArch64 with a recent enough compiler produces:
phalf_copy:
ldr w0, [x0]
ubfx x1, x0, 0, 16
lsr w0, w0, 16
dup v0.4h, w1
dup v1.4h, w0
ret
Couldn't it just be:
ldr h0, [x0]
ldr h1, [x0, 2]
IIRC these half-precision loads are in base v8 rather than v8.2.
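To make the difference between the two sequences concrete, here is a rough C model of each. It is only a sketch: `struct phalf_bits`, `copy_via_word`, `copy_via_halves` and `check_models_agree` are hypothetical names, raw `uint16_t` bit patterns stand in for `__fp16` so that it builds on hosts without that type, and a little-endian layout is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct phalf: raw 16-bit patterns, not __fp16. */
struct phalf_bits {
    uint16_t first;
    uint16_t second;
};

/* Models the emitted sequence: one 32-bit load, then a split done in
   integer registers. Assumes a little-endian host. */
static struct phalf_bits copy_via_word(const struct phalf_bits *src) {
    uint32_t w;
    struct phalf_bits r;
    memcpy(&w, src, sizeof w);            /* ldr  w0, [x0]      */
    r.first  = (uint16_t)(w & 0xffffu);   /* ubfx x1, x0, 0, 16 */
    r.second = (uint16_t)(w >> 16);       /* lsr  w0, w0, 16    */
    return r;                             /* dup into v0 and v1 */
}

/* Models the suggested sequence: two independent 16-bit loads. */
static struct phalf_bits copy_via_halves(const struct phalf_bits *src) {
    const unsigned char *p = (const unsigned char *)src;
    struct phalf_bits r;
    memcpy(&r.first,  p,     sizeof r.first);   /* ldr h0, [x0]    */
    memcpy(&r.second, p + 2, sizeof r.second);  /* ldr h1, [x0, 2] */
    return r;
}

/* Both models must return the same pair of bit patterns. */
static void check_models_agree(struct phalf_bits in) {
    struct phalf_bits a = copy_via_word(&in);
    struct phalf_bits b = copy_via_halves(&in);
    assert(a.first == b.first && a.second == b.second);
}
```

Both models yield the same two values, so the question is purely one of code quality: the second form needs no integer-side extraction and no GPR-to-vector transfer.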
regards
Ramana
* [Bug target/107533] Inefficient code sequence for fp16 testcase on aarch64
From: pinskia at gcc dot gnu.org @ 2023-05-19 4:12 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107533
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2023-05-19
Ever confirmed|0 |1
Severity|normal |enhancement
Status|UNCONFIRMED |NEW
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Take:
```
struct phalf {
__fp16 first;
__fp16 second;
};
void phalf_copy_(struct phalf *d, struct phalf src) {
*d = src;
}
struct phalf phalf_copy0( __fp16 f, __fp16 s) {
struct phalf t = {f, s};
return t;
}
struct phalf phalf_copy(struct phalf* src) {
return *src;
}
struct phalf phalf_copy1(__fp16 *f, __fp16 *s) {
struct phalf t = {*f, *s};
return t;
}
void phalf_copy2(struct phalf *d, __fp16 *f, __fp16 *s) {
struct phalf t = {*f, *s};
*d = t;
}
void phalf_copy3(struct phalf *d, struct phalf* src) {
*d = *src;
}
void phalf_copy4(struct phalf *d, __fp16 f, __fp16 s) {
struct phalf t = {f, s};
*d = t;
}
```
phalf_copy2, phalf_copy3 and phalf_copy4 are all OK, while phalf_copy0,
phalf_copy, phalf_copy_ and phalf_copy1 are bad.
This points to return values and argument passing being the problem (which we
already knew had issues).
Confirmed.