public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/46514] New: 128-bit shifts on x86_64 generate silly code unless the shift amount is constant
@ 2010-11-17 1:41 luto at mit dot edu
From: luto at mit dot edu @ 2010-11-17 1:41 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46514
Summary: 128-bit shifts on x86_64 generate silly code unless
the shift amount is constant
Product: gcc
Version: 4.5.1
Status: UNCONFIRMED
Severity: minor
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: luto@mit.edu
Created attachment 22428
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22428
Preprocessed source
I'm using GCC 4.5.1 (Fedora 14) with -O3; -O2 produces the same code.
This really easy case:
#include <stdint.h>

uint64_t shift_test_31(__uint128_t x, uint32_t shift)
{
    if (shift != 31)
        __builtin_unreachable();
    return (uint64_t)(x >> shift);
}
generates:
0000000000000050 <shift_test_31>:
  50:   48 89 f8                mov    %rdi,%rax
  53:   48 0f ac f0 1f          shrd   $0x1f,%rsi,%rax
  58:   c3                      retq
  59:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
which is entirely sensible. But this:
uint64_t shift_test_le_31(__uint128_t x, uint32_t shift)
{
    if (shift >= 32)
        __builtin_unreachable();
    return (uint64_t)(x >> shift);
}
generates this:
0000000000000060 <shift_test_le_31>:
  60:   89 d1                   mov    %edx,%ecx
  62:   48 89 6c 24 f8          mov    %rbp,-0x8(%rsp)
  67:   48 89 f5                mov    %rsi,%rbp
  6a:   48 0f ad f7             shrd   %cl,%rsi,%rdi
  6e:   48 d3 ed                shr    %cl,%rbp
  71:   f6 c2 40                test   $0x40,%dl
  74:   48 89 5c 24 f0          mov    %rbx,-0x10(%rsp)
  79:   48 0f 45 fd             cmovne %rbp,%rdi
  7d:   48 8b 5c 24 f0          mov    -0x10(%rsp),%rbx
  82:   48 8b 6c 24 f8          mov    -0x8(%rsp),%rbp
  87:   48 89 f8                mov    %rdi,%rax
  8a:   c3                      retq
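which contains a pointless shr, test, and cmovne: the shift count is known to
be below 32, so bit 6 (0x40) of the count is never set and the cmovne can
never take effect. (Even if I change the __builtin_unreachable() into a real
branch, I get the same code.)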
* [Bug rtl-optimization/46514] 128-bit shifts on x86_64 generate silly code unless the shift amount is constant
@ 2010-11-17 20:08 ` ubizjak at gmail dot com
From: ubizjak at gmail dot com @ 2010-11-17 20:08 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46514
Uros Bizjak <ubizjak at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2010.11.17 19:51:24
     Ever Confirmed|0                           |1
--- Comment #1 from Uros Bizjak <ubizjak at gmail dot com> 2010-11-17 19:51:24 UTC ---
This is how doubleword shifts (TImode on x86_64 targets, DImode on x86_32
targets) are handled. Doubleword instructions are expanded to their final
instruction sequence late, after the register allocation pass, so that earlier
optimization passes know they are processing SHIFT expressions and can
optimize them as shifts.
The expansion detects a constant count operand and emits a special sequence
for it, but it cannot detect a limited range of possible count values, so it
emits the universal sequence in that case.
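As a rough illustration of what the universal sequence in the listing above computes, here is a C model of the shrd / shr / test $0x40 / cmovne steps. It is only a sketch: the function name universal_shr128_low is hypothetical, the two 64-bit halves are passed separately for clarity, and it mirrors the disassembly rather than GCC's actual expander code.

```c
#include <stdint.h>

/* Illustrative model only; the function name is hypothetical. */
static uint64_t universal_shr128_low(uint64_t lo, uint64_t hi, uint32_t count)
{
    /* shrd and shr with 64-bit operands use only the low 6 bits of %cl. */
    uint32_t c = count & 63;

    /* shrd %cl,hi,lo: shift lo right, filling from hi (c == 0 leaves lo). */
    uint64_t via_shrd = c ? (lo >> c) | (hi << (64 - c)) : lo;

    /* shr %cl,hi: plain logical shift of the high word. */
    uint64_t via_shr = hi >> c;

    /* test $0x40 / cmovne: counts 64..127 select the shifted high word. */
    return (count & 0x40) ? via_shr : via_shrd;
}
```

When the count is known to be below 64, the last line always picks via_shrd, which is why the shr, test, and cmovne are dead in the reported case.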
That said, doubleword operation code is not the most optimized code around, on
the grounds that it is usually not used in the performance-critical parts of
an application. Just try to avoid it as much as possible.
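One way to follow that advice for the reported case is to avoid the TImode shift entirely and combine the two 64-bit halves by hand. The helper below is a sketch under assumptions: the name shr128_low64 is hypothetical, the caller must guarantee shift < 64, and nothing here guarantees that a given GCC version compiles it to a single shrd.

```c
#include <stdint.h>

/* Hypothetical helper: low 64 bits of (hi:lo) >> shift.
   Precondition (assumed, not checked): shift < 64. */
static inline uint64_t shr128_low64(uint64_t lo, uint64_t hi, uint32_t shift)
{
    /* Shifting a 64-bit value by 64 is undefined in C, so handle 0 apart. */
    if (shift == 0)
        return lo;
    return (lo >> shift) | (hi << (64 - shift));
}
```

For a value x of type __uint128_t, a call would look like shr128_low64((uint64_t)x, (uint64_t)(x >> 64), shift); the shift by the constant 64 takes the constant-count path described above.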