public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/45685] New: GCC optimizer for Intel x64 generates inefficient code
@ 2010-09-16 1:17 ekuznetsov at divxcorp dot com
2010-09-16 1:19 ` [Bug rtl-optimization/45685] " ekuznetsov at divxcorp dot com
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: ekuznetsov at divxcorp dot com @ 2010-09-16 1:17 UTC (permalink / raw)
To: gcc-bugs
I've attached two copies of a simple function. They are identical except for
the type of the internal variable (one uses 'int64_t', the other uses 'int').
When compiled with GCC 4.4.3 on an x64 platform using -O3, the assembly code
for the first version contains a conditional move instruction ('cmov'), while
the second version contains a branch. Because mispredicted branches are
expensive, the second version ends up about twice as slow as the first.
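The attachment itself is not reproduced in this archive. As a rough idea of what the two copies might look like, here is a hypothetical reconstruction modeled on the reduced function quoted in comment #4; the function names and the exact loop body are assumptions, not the reporter's code. The only difference between the two variants is the type of 'val'.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variant A: the temporary is 64 bits wide
   (reported to compile to cmov with GCC 4.4.3 -O3). */
int summation_wide(int64_t *products, size_t count)
{
    int s = 0;
    for (size_t i = 0; i < count; i++) {
        int64_t val = (products[i] > 0) ? 1 : -1;
        products[i] *= val;                 /* absolute value of the element */
        if ((uint64_t)products[i] != i)
            val = -val;
        products[i] = val;
        s += (int)val;
    }
    return s;
}

/* Hypothetical variant B: identical, except the temporary is a plain int
   (reported to compile to a branch with GCC 4.4.3 -O3). */
int summation_narrow(int64_t *products, size_t count)
{
    int s = 0;
    for (size_t i = 0; i < count; i++) {
        int val = (products[i] > 0) ? 1 : -1;
        products[i] *= val;
        if ((uint64_t)products[i] != i)
            val = -val;
        products[i] = val;
        s += val;
    }
    return s;
}
```

Both variants compute the same result; only the generated code is expected to differ. Comparing the output of 'gcc -O3 -S' for the two is one way to check which form a given compiler version picks.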
--
Summary: GCC optimizer for Intel x64 generates inefficient code
Product: gcc
Version: 4.4.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ekuznetsov at divxcorp dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685
* [Bug rtl-optimization/45685] GCC optimizer for Intel x64 generates inefficient code
From: ekuznetsov at divxcorp dot com @ 2010-09-16 1:19 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from ekuznetsov at divxcorp dot com 2010-09-16 01:18 -------
Created an attachment (id=21807)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21807&action=view)
Sample code
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685
* [Bug rtl-optimization/45685] GCC optimizer for Intel x64 generates inefficient code
From: ekuznetsov at divxcorp dot com @ 2010-09-16 23:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from ekuznetsov at divxcorp dot com 2010-09-16 23:08 -------
Created an attachment (id=21813)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21813&action=view)
Output of gcc -v -O3 gcc-bug.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685
* [Bug rtl-optimization/45685] GCC optimizer for Intel x64 generates inefficient code
From: ubizjak at gmail dot com @ 2010-09-17 9:59 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from ubizjak at gmail dot com 2010-09-17 09:59 -------
This all happens in the if-conversion pass.
4.6 regresses in the sense that a branch is emitted instead of cmov for:
int
summation_helper_1 (long * products, unsigned long count)
{
  int s = 0;
  unsigned long i;
  for (i = 0; i < count; i++)
    {
      int val = (products[i] > 0) ? 1 : -1;
      products[i] *= val;
      if (products[i] != i)
        val = -val;
      products[i] = val;
      s += val;
    }
  return s;
}
gcc-4.4.4 -O3 produces:
.L16:
        movq    (%rdi,%rdx,8), %r10
        testq   %r10, %r10
        setg    %r8b
        xorl    %ecx, %ecx
        testq   %r10, %r10
        movzbl  %r8b, %r9d
        movzbl  %r8b, %r8d
        setle   %cl
        leaq    -1(%r8,%r8), %r8
        leal    -1(%rcx,%rcx), %ecx
        leal    -1(%r9,%r9), %r9d
        imulq   %r8, %r10
        movslq  %ecx,%r11
        cmpq    %r10, %rdx
        cmovne  %r11, %r8
        cmove   %r9d, %ecx
        movq    %r8, (%rdi,%rdx,8)
        addq    $1, %rdx
        addl    %ecx, %eax
        cmpq    %rdx, %rsi
        ja      .L16
and gcc-4.6 20100917 produces:
.L15:
        movq    (%rdi,%rdx,8), %r8
        testq   %r8, %r8
        movq    %r8, %r10
        setg    %cl
        xorl    %r9d, %r9d
        testq   %r8, %r8
        movzbl  %cl, %r11d
        movzbl  %cl, %ecx
        setle   %r9b
        leaq    -1(%rcx,%rcx), %rcx
        leaq    -1(%r9,%r9), %r9
        imulq   %rcx, %r10
        cmpq    %r10, %rdx
        cmove   %rcx, %r9
        leal    -1(%r11,%r11), %ecx
        movq    %r9, (%rdi,%rdx,8)
        je      .L12
        xorl    %ecx, %ecx
        testq   %r8, %r8
        setle   %cl
        leal    -1(%rcx,%rcx), %ecx
.L12:
        addq    $1, %rdx
        addl    %ecx, %eax
        cmpq    %rsi, %rdx
        jne     .L15
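To make the difference between the two listings concrete: in the 4.4.4 loop the 'if (products[i] != i) val = -val;' step is carried out by the cmovne/cmove pair, while the 4.6 snapshot keeps a 'je .L12' that skips a recomputation of the int value. The sketch below only illustrates the two shapes at the C level. The function names are hypothetical, and the mask arithmetic stands in for what cmov does in hardware; it is not what GCC emits.

```c
/* Branchy shape: the update happens only on one control-flow path,
   comparable to the 'je .L12' kept by the 4.6 snapshot. */
int negate_if_branchy(int val, int cond)
{
    if (cond)
        val = -val;
    return val;
}

/* Branch-free shape: both outcomes are expressed as straight-line data
   flow, which is the property if-conversion tries to establish.
   Assumes two's complement and val != INT_MIN. */
int negate_if_select(int val, int cond)
{
    int mask = -(cond != 0);      /* 0 when cond is false, all ones when true */
    return (val ^ mask) - mask;   /* val when mask == 0, -val when mask == -1 */
}
```

For the values that occur in the test case (val is always 1 or -1) the two functions are interchangeable, so the choice between them is purely a code-generation decision.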
--
ubizjak at gmail dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed|0000-00-00 00:00:00 |2010-09-17 09:59:36
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685
* [Bug rtl-optimization/45685] GCC optimizer for Intel x64 generates inefficient code
From: ubizjak at gmail dot com @ 2010-09-17 10:03 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from ubizjak at gmail dot com 2010-09-17 10:02 -------
Confirmed. Regression?
--
ubizjak at gmail dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|WAITING |NEW
Ever Confirmed|0 |1
Last reconfirmed|2010-09-17 09:59:36 |2010-09-17 10:02:53
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685
* [Bug rtl-optimization/45685] [4.6 Regression] GCC optimizer for Intel x64 generates inefficient code
From: hjl dot tools at gmail dot com @ 2010-09-17 13:04 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from hjl dot tools at gmail dot com 2010-09-17 13:04 -------
(In reply to comment #4)
> This all happens in IF conversion pass.
>
> 4.6 regresses in the sense that a branch is emitted instead of cmov for:
>
This is caused by revision 159106:
http://gcc.gnu.org/ml/gcc-cvs/2010-05/msg00156.html
--
hjl dot tools at gmail dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |matz at suse dot de
Summary|GCC optimizer for Intel x64 |[4.6 Regression] GCC
|generates inefficient code |optimizer for Intel x64
| |generates inefficient code
Target Milestone|--- |4.6.0
Version|4.4.3 |4.6.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685
* [Bug rtl-optimization/45685] [4.6 Regression] GCC optimizer for Intel x64 generates inefficient code
From: matz at gcc dot gnu dot org @ 2010-09-17 13:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from matz at gcc dot gnu dot org 2010-09-17 13:45 -------
It might have been exposed by that revision, but that merely points out
a deficiency in RTL if conversion. The final gimple code doesn't have
explicit jumps in the inner loop, but uses cond_expr:
<bb 3>:
  # s_22 = PHI <0(2), s_30(3)>
  # i_19 = PHI <0(2), i_31(3)>
  D.2693_11 = MEM[base: products_9(D), index: i_19, step: 8, offset: 0B];
  val_4 = [cond_expr] D.2693_11 <= 0 ? -1 : 1;
  prephitmp.9_39 = [cond_expr] D.2693_11 <= 0 ? -1 : 1;
  prephitmp.10_40 = [cond_expr] D.2693_11 <= 0 ? 1 : -1;
  prephitmp.11_41 = [cond_expr] D.2693_11 <= 0 ? 1 : -1;
  D.2698_21 = D.2693_11 * prephitmp.9_39;
  D.2699_25 = (long unsigned int) D.2698_21;
  val_3 = [cond_expr] i_19 != D.2699_25 ? prephitmp.10_40 : val_4;
  prephitmp.11_43 = [cond_expr] i_19 != D.2699_25 ? prephitmp.11_41 : prephitmp.9_39;
  MEM[base: products_9(D), index: i_19, step: 8, offset: 0B] = prephitmp.11_43;
  s_30 = val_3 + s_22;
  i_31 = i_19 + 1;
  if (i_31 != count_7(D))
    goto <bb 3>;
  else
    goto <bb 4>;
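For readers who do not read GIMPLE dumps daily, the block above can be transcribed into C roughly as follows. This is a hand translation, not compiler output: the function name is made up, the variable names are shortened from the SSA names in the dump, and the 'count == 0' guard is an assumption about what <bb 2> does, since that block is not shown.

```c
/* Hand-written C rendering of the <bb 3> loop body from the dump. */
int summation_helper_selects(long *products, unsigned long count)
{
    int s = 0;
    unsigned long i = 0;

    if (count == 0)              /* assumed guard from <bb 2>, not in the dump */
        return 0;

    do {
        long d   = products[i];            /* D.2693_11       */
        int  v4  = d <= 0 ? -1 : 1;        /* val_4           */
        long p9  = d <= 0 ? -1 : 1;        /* prephitmp.9_39  */
        int  p10 = d <= 0 ? 1 : -1;        /* prephitmp.10_40 */
        long p11 = d <= 0 ? 1 : -1;        /* prephitmp.11_41 */
        unsigned long u = (unsigned long)(d * p9);   /* D.2699_25 */

        int v3 = (i != u) ? p10 : v4;      /* val_3           */
        products[i] = (i != u) ? p11 : p9; /* prephitmp.11_43 */

        s += v3;                           /* s_30            */
        i++;                               /* i_31            */
    } while (i != count);

    return s;
}
```

Every conditional in the loop body is a select between two already-computed values, so nothing at this level requires a jump inside the loop; only the loop back-edge remains as control flow.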
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685