public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/105137] New: Missed optimization 64-bit adds and shifts
@ 2022-04-02 14:01 andre.schackier at gmail dot com
2022-11-28 20:03 ` [Bug middle-end/105137] " pinskia at gcc dot gnu.org
2023-01-01 17:01 ` cvs-commit at gcc dot gnu.org
0 siblings, 2 replies; 3+ messages in thread
From: andre.schackier at gmail dot com @ 2022-04-02 14:01 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105137
Bug ID: 105137
Summary: Missed optimization 64-bit adds and shifts
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: andre.schackier at gmail dot com
Target Milestone: ---
Given the following source code (godbolt: https://godbolt.org/z/8KMMhefqY):

#include <stdint.h>

typedef __int128_t int128_t;

int64_t foo(int128_t a, int64_t b, int cond) {
    if (cond) {
        a += ((int128_t)b) << 64;
    }
    return a >> 64;
}

int64_t bar(int128_t a, int64_t b, int cond) {
    int64_t r = a >> 64;
    if (cond) {
        r += b;
    }
    return r;
}
Compiling with "-O3" we get:
foo:
        mov     rax, rsi
        mov     rsi, rdi
        mov     rdi, rax
        test    ecx, ecx
        je      .L2
        xor     r8d, r8d
        add     rsi, r8
        adc     rdi, rdx
.L2:
        mov     rax, rdi
        ret
bar:
        add     rdx, rsi
        mov     rax, rsi
        test    ecx, ecx
        cmovne  rax, rdx
        ret
Although both functions compute the same result, gcc generates worse code for foo.
Credit: this was found entirely by Trevor Spiteri, who reported it to the llvm-project
here: https://github.com/llvm/llvm-project/issues/54718
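
The claim that foo and bar return the same value can be spot-checked with a small harness. This is a minimal sketch and not part of the original report: make128 and check_equivalence are hypothetical helper names, and the inputs are deliberately kept small and non-negative so that neither the shift nor the additions run into undefined or implementation-defined behaviour.

```c
#include <assert.h>
#include <stdint.h>

typedef __int128_t int128_t;

/* foo and bar as given in the report. */
int64_t foo(int128_t a, int64_t b, int cond) {
    if (cond) {
        a += ((int128_t)b) << 64;
    }
    return a >> 64;
}

int64_t bar(int128_t a, int64_t b, int cond) {
    int64_t r = a >> 64;
    if (cond) {
        r += b;
    }
    return r;
}

/* Hypothetical helper: assemble a 128-bit value from a high and a low half. */
static int128_t make128(int64_t hi, uint64_t lo) {
    return (int128_t)(((__uint128_t)(uint64_t)hi << 64) | lo);
}

/* Hypothetical helper: compare foo and bar on a small grid of inputs.
   Returns the number of input combinations that were compared. */
int check_equivalence(void) {
    static const int64_t his[] = {0, 1, 42, 1000000007};
    static const uint64_t los[] = {0, 1, UINT64_MAX};
    static const int64_t bs[] = {0, 1, 7, 123456789};
    int checked = 0;

    for (unsigned i = 0; i < sizeof his / sizeof his[0]; i++) {
        for (unsigned j = 0; j < sizeof los / sizeof los[0]; j++) {
            for (unsigned k = 0; k < sizeof bs / sizeof bs[0]; k++) {
                for (int cond = 0; cond <= 1; cond++) {
                    int128_t a = make128(his[i], los[j]);
                    /* The low half of a never influences the result. */
                    assert(foo(a, bs[k], cond) == bar(a, bs[k], cond));
                    checked++;
                }
            }
        }
    }
    return checked;
}
```

Calling check_equivalence() from a main function exercises 4 * 3 * 4 * 2 input combinations. It says nothing about the generated code, which still has to be compared in the assembly output.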
* [Bug middle-end/105137] Missed optimization 64-bit adds and shifts
From: pinskia at gcc dot gnu.org @ 2022-11-28 20:03 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105137
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-linux-gnu
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
           Severity|normal                      |enhancement
   Last reconfirmed|                            |2022-11-28
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed. Interestingly, aarch64 is able to optimize both functions all the way
to the same code (the only difference is that the first add has its operands
swapped, which is equivalent).
* [Bug middle-end/105137] Missed optimization 64-bit adds and shifts
From: cvs-commit at gcc dot gnu.org @ 2023-01-01 17:01 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105137
--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:
https://gcc.gnu.org/g:4f1314f547f69d3a2b1f16ce301267e3bfb4e427
commit r13-4945-g4f1314f547f69d3a2b1f16ce301267e3bfb4e427
Author: Roger Sayle <roger@nextmovesoftware.com>
Date: Sun Jan 1 17:00:28 2023 +0000
Add post-reload splitter for extendditi2 on x86_64.
This is another step towards a possible solution for PR 105137.
This patch introduces a define_insn for extendditi2 that allows
DImode to TImode sign-extension to be represented in the early
RTL optimizers, before being split post-reload into the exact
same idiom as currently produced by RTL expansion.
Typically this produces identical code, so the first new
test case:
__int128 foo(long long x) { return (__int128)x; }
continues to generate:
foo:    movq    %rdi, %rax
        cqto
        ret
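
As an illustration of what the DImode to TImode sign-extension computes, the sketch below checks that the low half of the widened value is x itself and the high half is a copy of the sign bit, which is what movq followed by cqto leaves in rax and rdx. The helper names sign_fill and check_sign_extension are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the expected high half of a sign-extended value,
   i.e. all ones for a negative input and all zeros otherwise. */
static uint64_t sign_fill(int64_t x) {
    return x < 0 ? UINT64_MAX : 0;
}

/* Hypothetical helper: check both halves of (__int128)x for a few inputs.
   Returns the number of inputs that were checked. */
int check_sign_extension(void) {
    static const int64_t xs[] = {0, 1, -1, 42, -42, INT64_MAX, INT64_MIN};
    int checked = 0;

    for (unsigned i = 0; i < sizeof xs / sizeof xs[0]; i++) {
        __int128 wide = (__int128)xs[i];
        /* View the result as raw bits so that the halves can be extracted
           with well-defined unsigned shifts. */
        unsigned __int128 bits = (unsigned __int128)wide;

        assert((uint64_t)bits == (uint64_t)xs[i]);           /* low half  */
        assert((uint64_t)(bits >> 64) == sign_fill(xs[i]));  /* high half */
        checked++;
    }
    return checked;
}
```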
The "magic" is that this representation allows combine and the
other RTL optimizers to do a better job. Hence, the second
test case:
__int128 foo(__int128 a, long long b) {
  a += ((__int128)b) << 70;
  return a;
}
which mainline with -O2 currently generates as:
foo:    movq    %rsi, %rax
        movq    %rdx, %rcx
        movq    %rdi, %rsi
        salq    $6, %rcx
        movq    %rax, %rdi
        xorl    %eax, %eax
        movq    %rcx, %rdx
        addq    %rsi, %rax
        adcq    %rdi, %rdx
        ret
with this patch now becomes:
foo:    movl    $0, %eax
        salq    $6, %rdx
        addq    %rdi, %rax
        adcq    %rsi, %rdx
        ret
i.e. the same code for the signed and unsigned extension variants.
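
The shorter sequence relies on the fact that a value shifted left by 70 has an all-zero low half, so the 128-bit addition only changes the high half, by b << 6. The sketch below checks that identity in unsigned (modular) arithmetic to stay clear of signed-overflow issues. The names u128, add_shifted_wide, add_shifted_halves and check_shift70 are hypothetical and not taken from the patch or its test cases.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;

/* Reference: a + (sign_extend(b) << 70), computed modulo 2^128. */
static u128 add_shifted_wide(u128 a, int64_t b) {
    u128 wide_b = (u128)(__int128)b;   /* sign-extend, then view as bits */
    return a + (wide_b << 70);
}

/* Half-wise version: the low half is untouched and the high half gains
   b << 6 (modulo 2^64); no carry can come out of the low half. */
static u128 add_shifted_halves(u128 a, int64_t b) {
    uint64_t lo = (uint64_t)a;
    uint64_t hi = (uint64_t)(a >> 64);
    hi += (uint64_t)b << 6;
    return ((u128)hi << 64) | lo;
}

/* Hypothetical helper: compare both versions on a small grid of inputs.
   Returns the number of input combinations that were compared. */
int check_shift70(void) {
    const u128 as[] = {0, 1, (u128)UINT64_MAX, ((u128)5 << 64) | 9, ~(u128)0};
    const int64_t bs[] = {0, 1, -1, 42, -42, INT64_MAX, INT64_MIN};
    int checked = 0;

    for (unsigned i = 0; i < sizeof as / sizeof as[0]; i++) {
        for (unsigned j = 0; j < sizeof bs / sizeof bs[0]; j++) {
            assert(add_shifted_wide(as[i], bs[j]) ==
                   add_shifted_halves(as[i], bs[j]));
            checked++;
        }
    }
    return checked;
}
```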
2023-01-01 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
* config/i386/i386.md (extendditi2): New define_insn.
(define_split): Use DWIH mode iterator to treat new extendditi2
identically to existing extendsidi2_1.
(define_peephole2): Likewise.
(define_peephole2): Likewise.
(define_split): Likewise.
gcc/testsuite/ChangeLog
* gcc.target/i386/extendditi2-1.c: New test case.
* gcc.target/i386/extendditi2-2.c: Likewise.