public inbox for gcc-bugs@sourceware.org
* [Bug target/107533] New: Inefficient code sequence for fp16 testcase on aarch64
@ 2022-11-05  7:42 ramana at gcc dot gnu.org
  2023-05-19  4:12 ` [Bug target/107533] " pinskia at gcc dot gnu.org
  0 siblings, 1 reply; 2+ messages in thread
From: ramana at gcc dot gnu.org @ 2022-11-05  7:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107533

            Bug ID: 107533
           Summary: Inefficient code sequence for fp16 testcase on aarch64
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ramana at gcc dot gnu.org
  Target Milestone: ---

Derived from PR92999 



struct phalf {
    __fp16 first;
    __fp16 second;
};

struct phalf phalf_copy(struct phalf* src) __attribute__((noinline));
struct phalf phalf_copy(struct phalf* src) {
    return *src;
}

Compiling for AArch64 with a recent enough compiler produces:

phalf_copy:
        ldr     w0, [x0]
        ubfx    x1, x0, 0, 16
        lsr     w0, w0, 16
        dup     v0.4h, w1
        dup     v1.4h, w0
        ret


Couldn't it just be:

        ldr     h0, [x0]
        ldr     h1, [x0, 2]

IIRC the h-register form of ldr is in base v8 rather than v8.2, so this
should not need the fp16 extension.
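
To make the equivalence of the two sequences concrete, here is a minimal host-side sketch in C. It is not part of the original report. It assumes a little-endian target (as on typical AArch64 Linux) and uses uint16_t bit patterns in place of __fp16 so that it compiles on hosts without __fp16 support. The names phalf_bits, split_word, load_halves and sequences_agree are hypothetical, and the sketch only models the integer side of each sequence, not the move into v0/v1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw 16-bit patterns stand in for the two __fp16 members (assumption:
   this keeps the sketch portable to hosts without __fp16). */
struct phalf_bits {
    uint16_t first;
    uint16_t second;
};

/* Models the sequence GCC currently emits: one 32-bit load, then a
   bitfield extract and a shift to split it. Little-endian only. */
static struct phalf_bits split_word(const void *src) {
    uint32_t w;
    struct phalf_bits r;
    memcpy(&w, src, sizeof w);               /* ldr  w0, [x0]      */
    r.first  = (uint16_t)(w & 0xffffu);      /* ubfx x1, x0, 0, 16 */
    r.second = (uint16_t)(w >> 16);          /* lsr  w0, w0, 16    */
    return r;
}

/* Models the suggested sequence: two independent 16-bit loads. */
static struct phalf_bits load_halves(const void *src) {
    const unsigned char *p = src;
    struct phalf_bits r;
    memcpy(&r.first,  p,     sizeof r.first);   /* ldr h0, [x0]    */
    memcpy(&r.second, p + 2, sizeof r.second);  /* ldr h1, [x0, 2] */
    return r;
}

/* Returns 1 when both strategies yield the same pair of halves. */
static int sequences_agree(uint16_t a, uint16_t b) {
    struct phalf_bits in = { a, b };
    struct phalf_bits x = split_word(&in);
    struct phalf_bits y = load_halves(&in);
    return x.first == y.first && x.second == y.second;
}
```

On a little-endian host, sequences_agree(0x3c00, 0x4000) (the half-precision encodings of 1.0 and 2.0) returns 1, which is why the two direct halfword loads can stand in for the load, extract and shift.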


regards
Ramana


* [Bug target/107533] Inefficient code sequence for fp16 testcase on aarch64
  2022-11-05  7:42 [Bug target/107533] New: Inefficient code sequence for fp16 testcase on aarch64 ramana at gcc dot gnu.org
@ 2023-05-19  4:12 ` pinskia at gcc dot gnu.org
  0 siblings, 0 replies; 2+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-19  4:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107533

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-05-19
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Take:
```
struct phalf {
    __fp16 first;
    __fp16 second;
};

void phalf_copy_(struct phalf *d, struct phalf src) {
    *d = src;
}
struct phalf phalf_copy0( __fp16 f, __fp16 s) {
    struct phalf t = {f, s};
    return t;
}
struct phalf phalf_copy(struct phalf* src) {
    return *src;
}
struct phalf phalf_copy1(__fp16 *f, __fp16 *s) {
    struct phalf t = {*f, *s};
    return t;
}
void phalf_copy2(struct phalf *d, __fp16 *f, __fp16 *s) {
    struct phalf t = {*f, *s};
    *d = t;
}
void phalf_copy3(struct phalf *d, struct phalf* src) {
    *d = *src;
}
void phalf_copy4(struct phalf *d, __fp16 f, __fp16 s) {
    struct phalf t = {f, s};
    *d = t;
}
```
phalf_copy2, phalf_copy3 and phalf_copy4 are all OK, while phalf_copy0,
phalf_copy (no suffix), phalf_copy_ and phalf_copy1 are bad.
That points to return values and argument passing of the struct being bad
(which we already knew had issues).

Confirmed.
