public inbox for gcc-bugs@sourceware.org
* [Bug target/98477] New: aarch64: Unnecessary GPR -> FPR moves for conditional select
@ 2020-12-30  9:35 ktkachov at gcc dot gnu.org
  2020-12-30  9:40 ` [Bug target/98477] " ktkachov at gcc dot gnu.org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2020-12-30  9:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

            Bug ID: 98477
           Summary: aarch64: Unnecessary GPR -> FPR moves for conditional
                    select
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

Code like
void
foo (int a, double *b)
{
  *b = a ? 10000.0 : 200.0;
}

generates:
foo:
        cmp     w0, 0
        mov     x2, 149533581377536
        movk    x2, 0x40c3, lsl 48
        mov     x0, 4641240890982006784
        fmov    d0, x2
        fmov    d1, x0
        fcsel   d0, d0, d1, ne
        str     d0, [x1]
        ret

We don't need to do the FCSEL on the FPR side if we're just storing it to
memory. We can just do a GPR CSEL and avoid the FMOVs.
I've seen this pattern in the disassembly of some math library routines.
Maybe we should add a =w,w,w alternative to the CSEL patterns in the backend?
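
As a rough illustration of why the GPR route is valid, here is a minimal C sketch (not part of the original report). It assumes `double` is IEEE 754 binary64; `bits_of`, `foo_bits`, and the `BITS_*` macros are hypothetical names introduced only for this example. The immediates are the ones from the assembly above. Because the select only chooses between two fixed bit patterns, it can be done on integers and the result stored unchanged:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit patterns copied from the immediates in the assembly above
   (assumption: double is IEEE 754 binary64).  */
#define BITS_10000 (149533581377536ULL | (0x40c3ULL << 48)) /* mov + movk */
#define BITS_200   4641240890982006784ULL                   /* single mov */

static_assert (sizeof (uint64_t) == sizeof (double),
               "sketch assumes a 64-bit double");

/* Return the 64-bit pattern of a double; memcpy avoids aliasing problems.  */
uint64_t
bits_of (double d)
{
  uint64_t u;
  memcpy (&u, &d, sizeof u);
  return u;
}

/* The function from the report.  */
void
foo (int a, double *b)
{
  *b = a ? 10000.0 : 200.0;
}

/* Hypothetical integer-side equivalent: select the bit pattern with
   integer operations, then store those 64 bits.  */
void
foo_bits (int a, double *b)
{
  uint64_t u = a ? BITS_10000 : BITS_200;
  memcpy (b, &u, sizeof u);
}
```

Both functions leave the same 64 bits in `*b` for any `a`, which is the property a GPR CSEL followed by an integer store would rely on.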


* [Bug target/98477] aarch64: Unnecessary GPR -> FPR moves for conditional select
  2020-12-30  9:35 [Bug target/98477] New: aarch64: Unnecessary GPR -> FPR moves for conditional select ktkachov at gcc dot gnu.org
@ 2020-12-30  9:40 ` ktkachov at gcc dot gnu.org
  2020-12-30  9:47 ` pinskia at gcc dot gnu.org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2020-12-30  9:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #1 from ktkachov at gcc dot gnu.org ---
Or a =r,r,r alternative to the FCSEL pattern instead...


* [Bug target/98477] aarch64: Unnecessary GPR -> FPR moves for conditional select
  2020-12-30  9:35 [Bug target/98477] New: aarch64: Unnecessary GPR -> FPR moves for conditional select ktkachov at gcc dot gnu.org
  2020-12-30  9:40 ` [Bug target/98477] " ktkachov at gcc dot gnu.org
@ 2020-12-30  9:47 ` pinskia at gcc dot gnu.org
  2020-12-30  9:50 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2020-12-30  9:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
                 CC|                            |pinskia at gcc dot gnu.org


* [Bug target/98477] aarch64: Unnecessary GPR -> FPR moves for conditional select
  2020-12-30  9:35 [Bug target/98477] New: aarch64: Unnecessary GPR -> FPR moves for conditional select ktkachov at gcc dot gnu.org
  2020-12-30  9:40 ` [Bug target/98477] " ktkachov at gcc dot gnu.org
  2020-12-30  9:47 ` pinskia at gcc dot gnu.org
@ 2020-12-30  9:50 ` pinskia at gcc dot gnu.org
  2020-12-30  9:50 ` pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2020-12-30  9:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to ktkachov from comment #1)
> Or a =r,r,r alternative to the FCSEL pattern instead...

We should most likely add the r alternative to *cmov<mode>_insn (GPF) and the w
alternative to *cmov<mode>_insn (ALLI), so that moving back and forth is avoided
in general.


* [Bug target/98477] aarch64: Unnecessary GPR -> FPR moves for conditional select
  2020-12-30  9:35 [Bug target/98477] New: aarch64: Unnecessary GPR -> FPR moves for conditional select ktkachov at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2020-12-30  9:50 ` pinskia at gcc dot gnu.org
@ 2020-12-30  9:50 ` pinskia at gcc dot gnu.org
  2023-11-18  6:59 ` pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2020-12-30  9:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2020-12-30
            Version|unknown                     |11.0
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
.


* [Bug target/98477] aarch64: Unnecessary GPR -> FPR moves for conditional select
  2020-12-30  9:35 [Bug target/98477] New: aarch64: Unnecessary GPR -> FPR moves for conditional select ktkachov at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2020-12-30  9:50 ` pinskia at gcc dot gnu.org
@ 2023-11-18  6:59 ` pinskia at gcc dot gnu.org
  2024-04-24  0:02 ` pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-11-18  6:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |pinskia at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I am going to look into this ...


* [Bug target/98477] aarch64: Unnecessary GPR -> FPR moves for conditional select
  2020-12-30  9:35 [Bug target/98477] New: aarch64: Unnecessary GPR -> FPR moves for conditional select ktkachov at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2023-11-18  6:59 ` pinskia at gcc dot gnu.org
@ 2024-04-24  0:02 ` pinskia at gcc dot gnu.org
  2024-04-24  6:05 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-24  0:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So adding the `r` alternative to *cmov<mode>_insn (GPF) kind of works, but then
we seem to have a register allocation issue.

Even this still causes FPREGS to be chosen:
```
void
foo (int a, double *b)
{
  double t = a ? 10000.0 : 200.0;
  asm("":"+r"(t));
  *b = t;
}

```

Someone else will need to look into the register allocator issue later on.

I did find a testcase where we don't get the fmovs, though (it forces the use of
x0).
```
void
foo (int a, double *b)
{
  double t = a ? 10000.0 : 200.0;
  register double tt __asm__("x0");
  tt = t;
  asm("":"+r"(tt));
  *b = tt;
}

```

With that we now get:
```
        cmp     w0, 0
        mov     x0, 149533581377536
        mov     x2, 4641240890982006784
        movk    x0, 0x40c3, lsl 48
        csel    x0, x2, x0, eq
        str     x0, [x1]
        ret
```

So at least I can write up a testcase ...


* [Bug target/98477] aarch64: Unnecessary GPR -> FPR moves for conditional select
  2020-12-30  9:35 [Bug target/98477] New: aarch64: Unnecessary GPR -> FPR moves for conditional select ktkachov at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2024-04-24  0:02 ` pinskia at gcc dot gnu.org
@ 2024-04-24  6:05 ` pinskia at gcc dot gnu.org
  2024-04-24  6:09 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-24  6:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Here is a testcase for the csel usage for a floating-point mode:
```
void
foo (int a, double *b)
{
  double t = a ? 10000.0 : 200.0;
  register double tt __asm__("x0");
  tt = t;
  asm("":"+r"(tt));
  *b = tt;
}
```


* [Bug target/98477] aarch64: Unnecessary GPR -> FPR moves for conditional select
  2020-12-30  9:35 [Bug target/98477] New: aarch64: Unnecessary GPR -> FPR moves for conditional select ktkachov at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2024-04-24  6:05 ` pinskia at gcc dot gnu.org
@ 2024-04-24  6:09 ` pinskia at gcc dot gnu.org
  2024-04-24  6:22 ` pinskia at gcc dot gnu.org
  2024-04-24  6:28 ` pinskia at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-24  6:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
here is a testcase for the fcsel usage for integer cmov:

```
void
foo (int a, int *b)
{
  int t = a ? 11 : 22;
  register int tt __asm__("s0");
  tt = t;
  asm("":"+w"(tt));
  *b = tt;
}

```


* [Bug target/98477] aarch64: Unnecessary GPR -> FPR moves for conditional select
  2020-12-30  9:35 [Bug target/98477] New: aarch64: Unnecessary GPR -> FPR moves for conditional select ktkachov at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2024-04-24  6:09 ` pinskia at gcc dot gnu.org
@ 2024-04-24  6:22 ` pinskia at gcc dot gnu.org
  2024-04-24  6:28 ` pinskia at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-24  6:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #7)
> here is a testcase for the fcsel usage for integer cmov:

A slightly better example, with no inline asm and no forcing of values into
specific registers:
```
#define vector16 __attribute__((vector_size(16)))
void
foo (int a, int *b, vector16 int c, vector16 int d)
{
  int t = a ? c[0] : d[0];
  *b = t;
}

```
We should be able to produce:
```
foo:
        cmp     w0, 0
        fcsel   s1, s1, s0, eq
        str     s1, [x1]
        ret
```

And here is a decent one for float modes (-O2 -fno-ssa-phiopt is needed, though;
otherwise the tree level does the VCE (VIEW_CONVERT_EXPR) after the cmov):
```
#define vector8 __attribute__((vector_size(8)))
void
foo (int a, double *b, long long c, long long d)
{
  double ct;
  double dt;
  __builtin_memcpy(&ct, &c, sizeof(long long));
  __builtin_memcpy(&dt, &d, sizeof(long long));
  double t = a ? ct : dt;
  *b = t;
}
```
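
To make the two orderings mentioned above concrete, here is a source-level sketch (an illustration only, not GCC's internal representation and not part of the attached patch). The names `as_double`, `select_after_convert`, and `convert_after_select` are hypothetical, and the sketch assumes `long long` and `double` have the same size. The first function mirrors the testcase (reinterpret, then select); the second shows the select happening on the integer side with a single reinterpretation afterwards:

```c
#include <assert.h>
#include <string.h>

static_assert (sizeof (long long) == sizeof (double),
               "sketch assumes long long and double have the same size");

/* Reinterpret the bits of a long long as a double.  */
double
as_double (long long v)
{
  double d;
  memcpy (&d, &v, sizeof d);
  return d;
}

/* Order used in the testcase: reinterpret both inputs, then select
   between two doubles.  */
void
select_after_convert (int a, double *b, long long c, long long d)
{
  double ct = as_double (c);
  double dt = as_double (d);
  *b = a ? ct : dt;
}

/* Other order: select between the integers first, then reinterpret
   the chosen value once.  */
void
convert_after_select (int a, double *b, long long c, long long d)
{
  long long t = a ? c : d;
  *b = as_double (t);
}
```

Both functions store the same bits, so the choice between them only affects which register file the select naturally lands in.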


* [Bug target/98477] aarch64: Unnecessary GPR -> FPR moves for conditional select
  2020-12-30  9:35 [Bug target/98477] New: aarch64: Unnecessary GPR -> FPR moves for conditional select ktkachov at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2024-04-24  6:22 ` pinskia at gcc dot gnu.org
@ 2024-04-24  6:28 ` pinskia at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-24  6:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98477

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 58022
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58022&action=edit
Patch which I tested

I still need to add the testcases and finish up the commit message and
changelogs. I will do that tomorrow. Posting this here tonight so I don't lose
the patch.

