public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/94804] New: Failure to elide useless movs in 128-bit addition
@ 2020-04-27 16:42 gabravier at gmail dot com
  2020-04-27 20:50 ` [Bug rtl-optimization/94804] " gabravier at gmail dot com
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: gabravier at gmail dot com @ 2020-04-27 16:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94804

            Bug ID: 94804
           Summary: Failure to elide useless movs in 128-bit addition
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

using i128 = __int128;

i128 add128(i128 a, i128 b)
{
    return a + b;
}

This is how LLVM handles this code:

add128(__int128, __int128):
  mov rax, rdi
  add rax, rdx
  adc rsi, rcx
  mov rdx, rsi
  ret

GCC seems to insist on copying the operands out of their original registers
before actually doing the addition:

add128(__int128, __int128):
  mov r9, rdi ; useless
  mov rax, rdx
  mov r8, rsi ; useless
  mov rdx, rcx
  add rax, r9 ; could just have used rdi
  adc rdx, r8 ; could just have used rsi
  ret

This seems to be specific to x86_64: it does not occur on aarch64 or ppc64le.
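
For reference, the `add`/`adc` pair in both listings corresponds to adding the low 64-bit halves and then the high halves plus the carry. The following is only a sketch of that arithmetic: the `U128` struct and `add128_limbs` are hypothetical names introduced for illustration, not anything from GCC or LLVM, and unsigned limbs are used so that wraparound is well defined.

```cpp
#include <cstdint>

// Hypothetical two-limb view of a 128-bit value (illustration only).
struct U128 {
    std::uint64_t lo;
    std::uint64_t hi;
};

// Adds two 128-bit values limb by limb, the way an add/adc pair does.
U128 add128_limbs(U128 a, U128 b)
{
    U128 r;
    r.lo = a.lo + b.lo;                           // `add`: low halves, may wrap
    std::uint64_t carry = (r.lo < a.lo) ? 1 : 0;  // carry out of the low half
    r.hi = a.hi + b.hi + carry;                   // `adc`: high halves plus carry
    return r;
}
```

Only these two arithmetic instructions are essential. Since the result has to end up in rax:rdx, which overlaps the incoming `b` in rdx:rcx, some copying is unavoidable, but both sequences above show that two `mov`s are enough.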


* [Bug rtl-optimization/94804] Failure to elide useless movs in 128-bit addition
From: gabravier at gmail dot com @ 2020-04-27 20:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94804

--- Comment #1 from Gabriel Ravier <gabravier at gmail dot com> ---
For subtraction, it's even worse.

using i128 = __int128;

i128 sub128(i128 a, i128 b)
{
    return a - b;
}

results in 

sub128(__int128, __int128):
  mov rax, rdi
  sub rax, rdx
  sbb rsi, rcx
  mov rdx, rsi
  ret

with LLVM and 

sub128(__int128, __int128):
  mov r9, rdi
  mov r8, rsi
  mov rdi, r8
  mov rax, r9
  mov r8, rdx
  sub rax, r8
  mov rdx, rdi
  sbb rdx, rcx
  ret

with GCC.

The excess of `mov`s suggests to me that there is a bug in how the register
allocator handles 128-bit values, or something like that.


* [Bug rtl-optimization/94804] Failure to elide useless movs in 128-bit addition
From: glisse at gcc dot gnu.org @ 2020-04-28  6:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94804

Marc Glisse <glisse at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ra

--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> ---
GCC's register allocation is not well optimized for hard registers at function
boundaries, and there are several related bug reports. Inlining makes this case
not very important. It would be nice to improve it, but it is likely to get
lower priority than a similar issue found in the middle of a hot loop.
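
To make the inlining point concrete, here is a minimal sketch. `sum128` is a hypothetical caller that is not part of the report, and whether `add128` is actually inlined depends on the optimization level and the compiler's heuristics. Once it is inlined, the operands are no longer pinned to the argument registers, so the boundary copies discussed here should not appear inside the loop.

```cpp
#include <cstddef>

using i128 = __int128;

// Same body as the reported function; `static inline` is only a hint.
static inline i128 add128(i128 a, i128 b)
{
    return a + b;
}

// Hypothetical hot loop: after inlining, the register allocator is free to
// keep `acc` in any register pair across iterations.
i128 sum128(const i128 *v, std::size_t n)
{
    i128 acc = 0;
    for (std::size_t i = 0; i < n; ++i)
        acc = add128(acc, v[i]);
    return acc;
}
```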


* [Bug rtl-optimization/94804] Failure to elide useless movs in 128-bit addition
From: rguenth at gcc dot gnu.org @ 2020-04-28  7:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94804

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-04-28
             Status|UNCONFIRMED                 |NEW


* [Bug rtl-optimization/94804] Failure to elide useless movs in 128-bit addition
From: gabravier at gmail dot com @ 2020-04-28 12:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94804

--- Comment #3 from Gabriel Ravier <gabravier at gmail dot com> ---
So, do things like

#include <stdint.h>

uint64_t swap64(uint64_t x)
{
    uint64_t a = __builtin_bswap32(x);
    x >>= 32;
    a <<= 32;
    return __builtin_bswap32(x) | a;
}

have similar problems with useless movs because of the same poorly optimized
register allocation at function boundaries?


* [Bug rtl-optimization/94804] Failure to elide useless movs in 128-bit addition
From: glisse at gcc dot gnu.org @ 2020-04-28 17:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94804

--- Comment #4 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Gabriel Ravier from comment #3)
> Having similar problems with useless movs is from the same non
> well-optimized register allocation on function boundaries ?

I don't know, but possibly not. I'll shut up because I am not an RA
specialist...
(and if you expect to see it optimized to bswap64, then obviously it is
unrelated to register allocation)
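
The equivalence behind the bswap64 remark can be checked with a short sketch. `swap64_direct` is a hypothetical name added for comparison. `__builtin_bswap32` and `__builtin_bswap64` are GCC/Clang builtins.

```cpp
#include <cstdint>

// The function from comment #3, with the steps annotated.
std::uint64_t swap64(std::uint64_t x)
{
    std::uint64_t a = __builtin_bswap32(x);  // byte-swap the low 32 bits
    x >>= 32;                                // bring the high 32 bits down
    a <<= 32;                                // swapped low half becomes the high half
    return __builtin_bswap32(x) | a;         // swapped high half becomes the low half
}

// The single-builtin form that computes the same value.
std::uint64_t swap64_direct(std::uint64_t x)
{
    return __builtin_bswap64(x);
}
```

Recognizing the first form as the second is a pattern-matching transformation on the operations themselves, which is why it would be independent of how registers are assigned.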


* [Bug rtl-optimization/94804] Failure to elide useless movs in 128-bit addition
From: pinskia at gcc dot gnu.org @ 2023-01-19 23:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94804

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rl.alt.accnt at gmail dot com

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 108471 has been marked as a duplicate of this bug. ***


* [Bug rtl-optimization/94804] Failure to elide useless movs in 128-bit addition with __int128_t arguments
From: pinskia at gcc dot gnu.org @ 2023-01-19 23:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94804

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |101926
            Summary|Failure to elide useless    |Failure to elide useless
                   |movs in 128-bit addition    |movs in 128-bit addition
                   |                            |with __int128_t arguments

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This is really an RA issue with arguments.
We get good code with:
```
using i128 = __int128;
i128 sub128(i128 *a, i128 *b)
{
    return *a - *b;
}
```
```
        movq    (%rdi), %rax
        movq    8(%rdi), %rdx
        subq    (%rsi), %rax
        sbbq    8(%rsi), %rdx
```

With:
```
using i128 = __int128;

void sub128(i128 a, i128 b, i128 *c)
{
    *c =  a - b;
}
```
We get worse code (extra movs):
```
        movq    %rsi, %rax
        movq    %rdi, %rsi
        movq    %rax, %rdi
        subq    %rdx, %rsi
        sbbq    %rcx, %rdi
        movq    %rsi, (%r8)
        movq    %rdi, 8(%r8)
```
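
To compare the variants side by side, they can be placed in one translation unit. This is only a sketch: the `_val`, `_ptr` and `_out` suffixes are hypothetical renames added so the three functions can coexist, and the register names in the comments assume the x86-64 System V calling convention.

```cpp
using i128 = __int128;

// Operands and result by value: a in rdi:rsi, b in rdx:rcx, result in rax:rdx.
i128 sub128_val(i128 a, i128 b)
{
    return a - b;
}

// Operands loaded from memory: only the result is tied to fixed registers.
i128 sub128_ptr(i128 *a, i128 *b)
{
    return *a - *b;
}

// Operands by value, result stored through a pointer.
void sub128_out(i128 a, i128 b, i128 *c)
{
    *c = a - b;
}
```

All three compute the same value, so any difference in the generated code comes from where the operands and the result are required to live.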


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101926
[Bug 101926] [meta-bug] struct/complex argument passing and return should be
improved


* [Bug rtl-optimization/94804] Failure to elide useless movs in 128-bit addition with __int128_t arguments
From: pinskia at gcc dot gnu.org @ 2023-01-19 23:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94804

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |denis.campredon at gmail dot com

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 97961 has been marked as a duplicate of this bug. ***


Thread overview: 9+ messages
2020-04-27 16:42 [Bug rtl-optimization/94804] New: Failure to elide useless movs in 128-bit addition gabravier at gmail dot com
2020-04-27 20:50 ` [Bug rtl-optimization/94804] " gabravier at gmail dot com
2020-04-28  6:49 ` glisse at gcc dot gnu.org
2020-04-28  7:41 ` rguenth at gcc dot gnu.org
2020-04-28 12:38 ` gabravier at gmail dot com
2020-04-28 17:14 ` glisse at gcc dot gnu.org
2023-01-19 23:17 ` pinskia at gcc dot gnu.org
2023-01-19 23:20 ` [Bug rtl-optimization/94804] Failure to elide useless movs in 128-bit addition with __int128_t arguments pinskia at gcc dot gnu.org
2023-01-19 23:23 ` pinskia at gcc dot gnu.org
