public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/50339] New: suboptimal register allocation for abs(__int128_t)
@ 2011-09-09 12:12 wouter.vermaelen at scarlet dot be
2012-01-23 0:24 ` [Bug rtl-optimization/50339] " svfuerst at gmail dot com
` (10 more replies)
0 siblings, 11 replies; 12+ messages in thread
From: wouter.vermaelen at scarlet dot be @ 2011-09-09 12:12 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50339
Bug #: 50339
Summary: suboptimal register allocation for abs(__int128_t)
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: wouter.vermaelen@scarlet.be
This function:
__int128_t abs128(__int128_t a)
{
return (a >= 0) ? a : -a;
}
currently generates the following code at -O3
(Linux x86_64, g++-4.7.0, SVN revision 178692):
49 89 f9 mov %rdi,%r9
48 89 f7 mov %rsi,%rdi
49 89 f2 mov %rsi,%r10
48 c1 ff 3f sar $0x3f,%rdi
48 89 f8 mov %rdi,%rax
48 89 fa mov %rdi,%rdx
4c 31 c8 xor %r9,%rax
4c 31 d2 xor %r10,%rdx
48 29 f8 sub %rdi,%rax
48 19 fa sbb %rdi,%rdx
c3 retq
But the following has two fewer 'mov' instructions:
48 89 f8 mov %rdi,%rax
48 89 f2 mov %rsi,%rdx
48 89 d1 mov %rdx,%rcx
48 c1 f9 3f sar $0x3f,%rcx
48 31 c8 xor %rcx,%rax
48 31 ca xor %rcx,%rdx
48 29 c8 sub %rcx,%rax
48 19 ca sbb %rcx,%rdx
c3 retq
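For readers who want to see why the shorter sequence is possible, here is a minimal C sketch of the sign-mask idiom that the sar/xor/sub/sbb instructions implement. The function names abs128_branchless and check_abs128_branchless are made up for this illustration. The sketch assumes GCC or Clang on a 64-bit target (for __int128) and relies on the arithmetic right shift of negative values and modular unsigned-to-signed conversion that GCC documents. It is not claimed to be what the compiler does internally.

```c
#include <assert.h>

/* Hypothetical helper, not from the bug report.
   Assumes GCC/Clang __int128 and an arithmetic right shift of
   negative signed values (implementation-defined in ISO C). */
__int128 abs128_branchless(__int128 a)
{
    /* 0 when a >= 0, all ones (-1) when a < 0. */
    __int128 mask = a >> 127;

    /* Work in unsigned arithmetic so the subtraction cannot overflow. */
    unsigned __int128 u = (unsigned __int128)a;
    unsigned __int128 m = (unsigned __int128)mask;

    /* (a ^ mask) - mask: identity for mask == 0, negation for mask == -1. */
    return (__int128)((u ^ m) - m);
}

/* Small usage check for the helper above. */
void check_abs128_branchless(void)
{
    __int128 big = (__int128)1 << 100;

    assert(abs128_branchless(0) == 0);
    assert(abs128_branchless(7) == 7);
    assert(abs128_branchless(-7) == 7);
    assert(abs128_branchless(-big) == big);
}
```

As with the original abs128, the most negative value has no positive counterpart and comes back unchanged under these assumptions.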
* [Bug rtl-optimization/50339] suboptimal register allocation for abs(__int128_t)
From: svfuerst at gmail dot com @ 2012-01-23 0:24 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50339
Steven Fuerst <svfuerst at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |svfuerst at gmail dot com
--- Comment #1 from Steven Fuerst <svfuerst at gmail dot com> 2012-01-22 23:44:55 UTC ---
It is possible to remove yet another mov instruction:
mov %rdi,%rax
mov %rsi,%rdx
sar $0x3f,%rsi
xor %rsi,%rax
xor %rsi,%rdx
sub %rsi,%rax
sbb %rsi,%rdx
retq
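The instruction sequence above can be mirrored step by step in C on two 64-bit halves, which makes the role of the borrow between sub and sbb explicit. This is only a sketch: the type u128_parts and the functions abs128_parts and check_abs128_parts are hypothetical names introduced here, and the code assumes an arithmetic right shift and wrapping unsigned-to-signed conversion, as GCC provides.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of 64-bit halves standing in for %rax (lo) and %rdx (hi). */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_parts;

/* lo plays the role of %rdi, hi the role of %rsi on entry. */
u128_parts abs128_parts(uint64_t lo, uint64_t hi)
{
    /* sar $0x3f: broadcast the sign bit of the high half. */
    uint64_t mask = (uint64_t)((int64_t)hi >> 63);

    uint64_t x_lo = lo ^ mask;        /* xor on the low half  */
    uint64_t x_hi = hi ^ mask;        /* xor on the high half */

    u128_parts r;
    r.lo = x_lo - mask;               /* sub */
    uint64_t borrow = x_lo < mask;    /* carry flag left by sub */
    r.hi = x_hi - mask - borrow;      /* sbb */
    return r;
}

/* Small usage check for the helper above. */
void check_abs128_parts(void)
{
    /* -1 is all ones in both halves; its absolute value is 1. */
    u128_parts r = abs128_parts(UINT64_MAX, UINT64_MAX);
    assert(r.lo == 1 && r.hi == 0);

    /* -(2^64) has lo == 0 and hi == all ones; the result is 2^64. */
    r = abs128_parts(0, UINT64_MAX);
    assert(r.lo == 0 && r.hi == 1);
}
```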
* [Bug rtl-optimization/50339] suboptimal register allocation for abs(__int128_t)
From: glisse at gcc dot gnu.org @ 2012-11-02 14:34 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50339
--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> 2012-11-02 14:33:27 UTC ---
It looks even worse in 4.8:
movq %rdi, %r9
movq %rsi, %rdi
movq %rsi, %r10
sarq $63, %rdi
movq %rdi, %rcx
xorq %r9, %rcx
movq %rcx, %rax
movq %r10, %rcx
xorq %rdi, %rcx
subq %rdi, %rax
movq %rcx, %rdx
sbbq %rdi, %rdx
ret
* [Bug rtl-optimization/50339] [4.8 Regression] suboptimal register allocation for abs(__int128_t)
From: ubizjak at gmail dot com @ 2013-02-15 8:58 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50339
Uros Bizjak <ubizjak at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |ra
Status|UNCONFIRMED |NEW
Last reconfirmed| |2013-02-15
CC| |vmakarov at gcc dot gnu.org
Target Milestone|--- |4.8.0
Summary|suboptimal register allocation for abs(__int128_t) |[4.8 Regression] suboptimal register allocation for abs(__int128_t)
Ever Confirmed|0 |1
--- Comment #3 from Uros Bizjak <ubizjak at gmail dot com> 2013-02-15 08:57:41 UTC ---
(In reply to comment #2)
> It looks even worse in 4.8:
So, a regression from 4.7.
* [Bug rtl-optimization/50339] [4.8 Regression] suboptimal register allocation for abs(__int128_t)
From: rguenth at gcc dot gnu.org @ 2013-02-15 9:20 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50339
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |missed-optimization
Severity|enhancement |normal
* [Bug rtl-optimization/50339] [4.8 Regression] suboptimal register allocation for abs(__int128_t)
From: jakub at gcc dot gnu.org @ 2013-02-21 12:58 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50339
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-02-21 12:57:40 UTC ---
Created attachment 29517
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29517
gcc48-pr50339.patch
This patch improves the code generated by 4.8 to one insn better than what 4.7
produced, by lowering ASHIFTRT similarly to how lower-subreg already lowers
ASHIFT and LSHIFTRT.
On this testcase, the difference between unpatched trunk and patched trunk is:
- movq %rdi, %r9
- movq %rsi, %rdi
- movq %rsi, %r10
- sarq $63, %rdi
- movq %rdi, %rcx
- xorq %r9, %rcx
- movq %rcx, %rax
- movq %r10, %rcx
- xorq %rdi, %rcx
- subq %rdi, %rax
- movq %rcx, %rdx
- sbbq %rdi, %rdx
+ movq %rsi, %rax
+ sarq $63, %rax
+ movq %rax, %r9
+ xorq %rax, %rdi
+ xorq %r9, %rsi
+ movq %rdi, %rax
+ movq %rsi, %rdx
+ subq %r9, %rax
+ sbbq %r9, %rdx
i.e. 4 moves instead of the former 7. I have no idea why RA chooses to do this
shift on %rax (i.e. first move %rsi to %rax, then shift %rax, then move %rax to
%r9) instead of copying %rsi to %r9 and shifting %r9, which would mean one
fewer move.
Even smaller code would probably need a different expansion or much smarter
register allocation.
Anyway, also tested:
__int128_t
f1 (__int128_t a)
{
return a >> 67;
}
__int128_t
f2 (__int128_t a)
{
return a >> 64;
}
__int128_t
f3 (__int128_t a)
{
return a >> 127;
}
__uint128_t
f4 (__uint128_t a)
{
return a >> 67;
}
__uint128_t
f5 (__uint128_t a)
{
return a >> 64;
}
__uint128_t
f6 (__uint128_t a)
{
return a >> 127;
}
on x86_64 and the difference at -O2 is:
- movq %rsi, %rax
movq %rsi, %rdx
+ movq %rsi, %rax
sarq $63, %rdx
sarq $3, %rax
for f1,
- movq %rsi, %rdx
movq %rsi, %rax
- sarq $63, %rdx
+ cqto
for f2 and
+ sarq $63, %rsi
movq %rsi, %rdx
- sarq $63, %rdx
- movq %rdx, %rax
+ movq %rsi, %rax
for f3, so either no pessimization or a small improvement. On:
long long int
f1 (long long int a)
{
return a >> 35;
}
long long int
f2 (long long int a)
{
return a >> 32;
}
long long int
f3 (long long int a)
{
return a >> 63;
}
unsigned long long int
f4 (unsigned long long int a)
{
return a >> 35;
}
unsigned long long int
f5 (unsigned long long int a)
{
return a >> 32;
}
unsigned long long int
f6 (unsigned long long int a)
{
return a >> 63;
}
with -O2 -m32 the improvements are even larger; for f1:
- movl 8(%esp), %edx
- movl %edx, %eax
- movl %eax, %edx
- sarl $31, %edx
+ movl 8(%esp), %eax
+ cltd
sarl $3, %eax
and for f2:
- movl 8(%esp), %edx
- movl %edx, %eax
- movl %eax, %edx
- sarl $31, %edx
+ movl 8(%esp), %eax
+ cltd
(no difference for f3).
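As a rough illustration of what splitting a double-word arithmetic right shift means for counts of at least one word (the f1 to f3 cases for __int128_t), the result can be built from the high half alone: the new low word is the old high word shifted by the remaining count, and the new high word is just the sign. The names i128_parts, sar128_wide and check_sar128_wide below are hypothetical, and this is a sketch of the idea under the assumption of GCC's arithmetic right shift, not a rendering of the lower-subreg.c code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of 64-bit halves of a signed 128-bit value. */
typedef struct {
    uint64_t lo;
    int64_t hi;
} i128_parts;

/* Arithmetic right shift of a 128-bit value by n, for 64 <= n <= 127.
   Only the high half of the input contributes to the result. */
i128_parts sar128_wide(int64_t hi, unsigned n)
{
    assert(n >= 64 && n <= 127);

    i128_parts r;
    r.lo = (uint64_t)(hi >> (n - 64)); /* e.g. sarq $3 for n == 67 */
    r.hi = hi >> 63;                   /* sign word: sarq $63 (or cqto) */
    return r;
}

/* Compare against the compiler's own __int128 shift for the counts
   used by f1, f2 and f3. */
void check_sar128_wide(void)
{
    __int128 a = -(((__int128)1 << 100) + 12345);
    unsigned counts[] = { 67, 64, 127 };

    for (unsigned i = 0; i < 3; i++) {
        __int128 expected = a >> counts[i];
        i128_parts got = sar128_wide((int64_t)(a >> 64), counts[i]);
        assert(got.lo == (uint64_t)expected);
        assert(got.hi == (int64_t)(expected >> 64));
    }
}
```

For n == 64 the low word is simply a copy of the old high word, which is why the f2 diff reduces to a move plus a sign extension.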
* [Bug rtl-optimization/50339] [4.8 Regression] suboptimal register allocation for abs(__int128_t)
From: jakub at gcc dot gnu.org @ 2013-02-21 21:28 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50339
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-02-21 21:28:07 UTC ---
Author: jakub
Date: Thu Feb 21 21:28:03 2013
New Revision: 196214
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=196214
Log:
PR rtl-optimization/50339
* lower-subreg.h (struct lower_subreg_choices): Add splitting_ashiftrt
field.
* lower-subreg.c (compute_splitting_shift): Handle ASHIFTRT.
(compute_costs): Call compute_splitting_shift also for ASHIFTRT
into splitting_ashiftrt field.
(find_decomposable_shift_zext, resolve_shift_zext): Handle also
ASHIFTRT.
(dump_choices): Fix up printing LSHIFTRT choices, print ASHIFTRT
choices.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/lower-subreg.c
trunk/gcc/lower-subreg.h
* [Bug rtl-optimization/50339] suboptimal register allocation for abs(__int128_t)
From: jakub at gcc dot gnu.org @ 2013-02-21 21:36 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50339
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[4.8 Regression] suboptimal register allocation for abs(__int128_t) |suboptimal register allocation for abs(__int128_t)
--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-02-21 21:36:17 UTC ---
Removing the regression tag: while the RA should still be improved, this is no
longer a regression.
* [Bug rtl-optimization/50339] suboptimal register allocation for abs(__int128_t)
From: jakub at gcc dot gnu.org @ 2013-03-22 14:48 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50339
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.8.0 |4.8.1
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-03-22 14:45:36 UTC ---
GCC 4.8.0 is being released, adjusting target milestone.
* [Bug rtl-optimization/50339] suboptimal register allocation for abs(__int128_t)
From: jakub at gcc dot gnu.org @ 2013-05-31 11:00 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50339
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.8.1 |4.8.2
--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 4.8.1 has been released.
* [Bug rtl-optimization/50339] suboptimal register allocation for abs(__int128_t)
From: jakub at gcc dot gnu.org @ 2013-10-16 9:51 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50339
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.8.2 |4.8.3
--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 4.8.2 has been released.
* [Bug rtl-optimization/50339] suboptimal register allocation for abs(__int128_t)
From: rguenth at gcc dot gnu.org @ 2015-06-22 14:26 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50339
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.8.3 |---