public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/45685] New: GCC optimizer for Intel x64 generates inefficient code
From: ekuznetsov at divxcorp dot com @ 2010-09-16 1:17 UTC
To: gcc-bugs

I've attached two copies of a simple function. They are identical except for
the type of the internal variable (one uses 'int64_t', the other uses 'int').
When compiled with GCC 4.4.3 on an x64 platform with -O3, the assembly code for
the first version contains a conditional move instruction ('cmov'), while the
second version contains a branch. Since mispredicted branches are expensive,
the second version ends up twice as slow as the first.

--
           Summary: GCC optimizer for Intel x64 generates inefficient code
           Product: gcc
           Version: 4.4.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: ekuznetsov at divxcorp dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685
* [Bug rtl-optimization/45685] GCC optimizer for Intel x64 generates inefficient code
From: ekuznetsov at divxcorp dot com @ 2010-09-16 1:19 UTC
To: gcc-bugs

------- Comment #1 from ekuznetsov at divxcorp dot com 2010-09-16 01:18 -------
Created an attachment (id=21807)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21807&action=view)
Sample code

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685
* [Bug rtl-optimization/45685] GCC optimizer for Intel x64 generates inefficient code
From: ekuznetsov at divxcorp dot com @ 2010-09-16 23:09 UTC
To: gcc-bugs

------- Comment #3 from ekuznetsov at divxcorp dot com 2010-09-16 23:08 -------
Created an attachment (id=21813)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21813&action=view)
Output of gcc -v -O3 gcc-bug.c

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685
* [Bug rtl-optimization/45685] GCC optimizer for Intel x64 generates inefficient code
From: ubizjak at gmail dot com @ 2010-09-17 9:59 UTC
To: gcc-bugs

------- Comment #4 from ubizjak at gmail dot com 2010-09-17 09:59 -------
This all happens in IF conversion pass.

4.6 regresses in the sense that a branch is emitted instead of cmov for:

int
summation_helper_1 (long * products, unsigned long count)
{
  int s = 0;
  unsigned long i;
  for (i = 0; i < count; i++)
    {
      int val = (products[i] > 0) ? 1 : -1;
      products[i] *= val;
      if (products[i] != i)
        val = -val;
      products[i] = val;
      s += val;
    }
  return s;
}

gcc-4.4.4 -O3 produces:

.L16:
        movq    (%rdi,%rdx,8), %r10
        testq   %r10, %r10
        setg    %r8b
        xorl    %ecx, %ecx
        testq   %r10, %r10
        movzbl  %r8b, %r9d
        movzbl  %r8b, %r8d
        setle   %cl
        leaq    -1(%r8,%r8), %r8
        leal    -1(%rcx,%rcx), %ecx
        leal    -1(%r9,%r9), %r9d
        imulq   %r8, %r10
        movslq  %ecx,%r11
        cmpq    %r10, %rdx
        cmovne  %r11, %r8
        cmove   %r9d, %ecx
        movq    %r8, (%rdi,%rdx,8)
        addq    $1, %rdx
        addl    %ecx, %eax
        cmpq    %rdx, %rsi
        ja      .L16

and gcc-4.6 20100917

.L15:
        movq    (%rdi,%rdx,8), %r8
        testq   %r8, %r8
        movq    %r8, %r10
        setg    %cl
        xorl    %r9d, %r9d
        testq   %r8, %r8
        movzbl  %cl, %r11d
        movzbl  %cl, %ecx
        setle   %r9b
        leaq    -1(%rcx,%rcx), %rcx
        leaq    -1(%r9,%r9), %r9
        imulq   %rcx, %r10
        cmpq    %r10, %rdx
        cmove   %rcx, %r9
        leal    -1(%r11,%r11), %ecx
        movq    %r9, (%rdi,%rdx,8)
        je      .L12
        xorl    %ecx, %ecx
        testq   %r8, %r8
        setle   %cl
        leal    -1(%rcx,%rcx), %ecx
.L12:
        addq    $1, %rdx
        addl    %ecx, %eax
        cmpq    %rsi, %rdx
        jne     .L15

--
ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|0000-00-00 00:00:00         |2010-09-17 09:59:36
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685
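Comment #4 quotes only the variant that declares `int val`. The original report says the two attached functions differ only in the type of that variable, so the pair might look roughly like the sketch below. This is an assumption: the attachment itself is not reproduced in this thread, and the names `summation_narrow` and `summation_wide` are hypothetical. The sketch only shows that the two variants compute the same result; it makes no claim about which instructions a given GCC version emits for either one.

```c
#include <stdint.h>

/* Narrow variant: same logic as the testcase in comment #4 (val is int).
   The explicit cast only makes the signed/unsigned comparison visible;
   the comparison already happens in unsigned long in the original. */
int summation_narrow(long *products, unsigned long count)
{
    int s = 0;
    unsigned long i;
    for (i = 0; i < count; i++) {
        int val = (products[i] > 0) ? 1 : -1;
        products[i] *= val;
        if ((unsigned long) products[i] != i)
            val = -val;
        products[i] = val;
        s += val;
    }
    return s;
}

/* Wide variant (assumed shape): identical except that val is int64_t. */
int summation_wide(long *products, unsigned long count)
{
    int s = 0;
    unsigned long i;
    for (i = 0; i < count; i++) {
        int64_t val = (products[i] > 0) ? 1 : -1;
        products[i] *= val;
        if ((unsigned long) products[i] != i)
            val = -val;
        products[i] = val;
        s += (int) val;
    }
    return s;
}
```

Called on two copies of the array `{3, -2, 2, 0}` with a count of 4, both functions return 2 and leave `{-1, 1, 1, 1}` in the array. Comparing the output of `gcc -O3 -S` for the two functions is one way to check whether a particular compiler version still treats them differently.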
* [Bug rtl-optimization/45685] GCC optimizer for Intel x64 generates inefficient code
From: ubizjak at gmail dot com @ 2010-09-17 10:03 UTC
To: gcc-bugs

------- Comment #5 from ubizjak at gmail dot com 2010-09-17 10:02 -------
Confirmed. Regression?

--
ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|2010-09-17 09:59:36         |2010-09-17 10:02:53
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685
* [Bug rtl-optimization/45685] [4.6 Regression] GCC optimizer for Intel x64 generates inefficient code
From: hjl dot tools at gmail dot com @ 2010-09-17 13:04 UTC
To: gcc-bugs

------- Comment #6 from hjl dot tools at gmail dot com 2010-09-17 13:04 -------
(In reply to comment #4)
> This all happens in IF conversion pass.
>
> 4.6 regresses in the sense that a branch is emitted instead of cmov for:
>

This is caused by revision 159106:

http://gcc.gnu.org/ml/gcc-cvs/2010-05/msg00156.html

--
hjl dot tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at suse dot de
            Summary|GCC optimizer for Intel x64 |[4.6 Regression] GCC
                   |generates inefficient code  |optimizer for Intel x64
                   |                            |generates inefficient code
   Target Milestone|---                         |4.6.0
            Version|4.4.3                       |4.6.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685
* [Bug rtl-optimization/45685] [4.6 Regression] GCC optimizer for Intel x64 generates inefficient code
From: matz at gcc dot gnu dot org @ 2010-09-17 13:45 UTC
To: gcc-bugs

------- Comment #7 from matz at gcc dot gnu dot org 2010-09-17 13:45 -------
It might have been exposed by that revision, but that merely points out a
deficiency in RTL if conversion. The final gimple code doesn't have explicit
jumps in the inner loop, but uses cond_expr:

<bb 3>:
  # s_22 = PHI <0(2), s_30(3)>
  # i_19 = PHI <0(2), i_31(3)>
  D.2693_11 = MEM[base: products_9(D), index: i_19, step: 8, offset: 0B];
  val_4 = [cond_expr] D.2693_11 <= 0 ? -1 : 1;
  prephitmp.9_39 = [cond_expr] D.2693_11 <= 0 ? -1 : 1;
  prephitmp.10_40 = [cond_expr] D.2693_11 <= 0 ? 1 : -1;
  prephitmp.11_41 = [cond_expr] D.2693_11 <= 0 ? 1 : -1;
  D.2698_21 = D.2693_11 * prephitmp.9_39;
  D.2699_25 = (long unsigned int) D.2698_21;
  val_3 = [cond_expr] i_19 != D.2699_25 ? prephitmp.10_40 : val_4;
  prephitmp.11_43 = [cond_expr] i_19 != D.2699_25 ? prephitmp.11_41 : prephitmp.9_39;
  MEM[base: products_9(D), index: i_19, step: 8, offset: 0B] = prephitmp.11_43;
  s_30 = val_3 + s_22;
  i_31 = i_19 + 1;
  if (i_31 != count_7(D))
    goto <bb 3>;
  else
    goto <bb 4>;

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685
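For readers less used to GIMPLE dumps, the loop body in comment #7 can be read back as straight-line C with four selects on the same condition and two more on the comparison with `i`. The sketch below is a hand translation, not compiler output. The function name `summation_ifconverted`, the local names, the `count == 0` guard standing in for the block before the loop, and the choice of `int` versus `long` for each temporary are all assumptions inferred from how the values are used in the dump.

```c
/* Hand translation of the if-converted loop body from comment #7.
   Every ?: below corresponds to one [cond_expr] in the dump. */
int summation_ifconverted(long *products, unsigned long count)
{
    int s = 0;               /* s_22 / s_30 */
    unsigned long i = 0;     /* i_19 / i_31 */

    if (count == 0)          /* assumed guard before the loop */
        return 0;

    do {
        long x = products[i];                 /* D.2693_11 */
        int  val_a  = (x <= 0) ? -1 : 1;      /* val_4, narrow */
        long wide_a = (x <= 0) ? -1 : 1;      /* prephitmp.9_39, wide */
        int  val_b  = (x <= 0) ? 1 : -1;      /* prephitmp.10_40, narrow */
        long wide_b = (x <= 0) ? 1 : -1;      /* prephitmp.11_41, wide */

        long prod = x * wide_a;               /* D.2698_21 */
        unsigned long uprod = (unsigned long) prod;   /* D.2699_25 */

        int  val   = (i != uprod) ? val_b : val_a;    /* val_3 */
        long store = (i != uprod) ? wide_b : wide_a;  /* prephitmp.11_43 */

        products[i] = store;
        s += val;
        i++;
    } while (i != count);

    return s;
}
```

On the array `{3, -2, 2, 0}` with a count of 4 this returns 2 and stores `{-1, 1, 1, 1}`, the same as the source-level testcase in comment #4. There is no control flow left inside the loop body at this level, so the branch seen in the 4.6 assembly is introduced later in the pipeline.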
* [Bug rtl-optimization/45685] [4.6 Regression] GCC optimizer for Intel x64 generates inefficient code
From: rguenth at gcc dot gnu.org @ 2010-09-29 18:34 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Priority|P3                          |P2
* [Bug rtl-optimization/45685] [4.6 Regression] GCC optimizer for Intel x64 generates inefficient code
From: jakub at gcc dot gnu.org @ 2010-11-04 17:41 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> 2010-11-04 17:41:45 UTC ---
With -fno-tree-loop-if-convert -O3 the generated code for #c4 testcase is
actually better, but still one insn longer than 4.4.x.
* [Bug rtl-optimization/45685] [4.6 Regression] GCC optimizer for Intel x64 generates inefficient code
From: ebotcazou at gcc dot gnu.org @ 2010-11-24 15:32 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |ebotcazou at gcc dot
                   |                            |gnu.org
         AssignedTo|unassigned at gcc dot       |ebotcazou at gcc dot
                   |gnu.org                     |gnu.org

--- Comment #9 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2010-11-24 15:15:20 UTC ---
Investigating.
* [Bug rtl-optimization/45685] [4.6 Regression] GCC optimizer for Intel x64 generates inefficient code
From: jakub at gcc dot gnu.org @ 2010-11-24 15:56 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> 2010-11-24 15:27:41 UTC ---
I guess it would be helpful if some tree pass figured out that computing
cond_expr is usually quite expensive, and that instead of computing 4 different
cond_exprs (always one in wider type, one in narrower type and one with
? -1 : 1 and one with ? 1 : -1 for the same condition) it is more efficient
to compute it just once (in wider mode) and then negate and/or cast to narrower
mode to get the other needed results.
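The equivalence that comment #10 relies on can be illustrated at the source level. The sketch below is not the proposed tree pass, only a check that one select in the wide type, followed by negation and a narrowing cast, yields the same four values as four independent selects on the same condition. The struct `signs` and the functions `four_selects`, `one_select`, and `same_signs` are hypothetical names introduced for this illustration.

```c
#include <stdint.h>

/* The four values the loop needs from the single condition x <= 0. */
struct signs {
    int64_t wide_pos;    /* x <= 0 ? -1 : 1, wide type   */
    int64_t wide_neg;    /* x <= 0 ? 1 : -1, wide type   */
    int     narrow_pos;  /* x <= 0 ? -1 : 1, narrow type */
    int     narrow_neg;  /* x <= 0 ? 1 : -1, narrow type */
};

/* Four independent conditional expressions on the same condition. */
static struct signs four_selects(int64_t x)
{
    struct signs r;
    r.wide_pos   = (x <= 0) ? -1 : 1;
    r.wide_neg   = (x <= 0) ? 1 : -1;
    r.narrow_pos = (x <= 0) ? -1 : 1;
    r.narrow_neg = (x <= 0) ? 1 : -1;
    return r;
}

/* One conditional expression in the wide type; the other three values
   are derived from it by negation and a narrowing cast. */
static struct signs one_select(int64_t x)
{
    struct signs r;
    r.wide_pos   = (x <= 0) ? -1 : 1;
    r.wide_neg   = -r.wide_pos;
    r.narrow_pos = (int) r.wide_pos;
    r.narrow_neg = -r.narrow_pos;
    return r;
}

/* Returns 1 when all four fields of a and b agree, 0 otherwise. */
static int same_signs(struct signs a, struct signs b)
{
    return a.wide_pos == b.wide_pos && a.wide_neg == b.wide_neg
        && a.narrow_pos == b.narrow_pos && a.narrow_neg == b.narrow_neg;
}
```

Because the selected values are only ever 1 or -1, negation cannot overflow and the narrowing cast cannot lose information, which is why the derived values always match. Whether the single-select form is actually cheaper on a given target is the cost question the comment raises, and this sketch does not answer it.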
* [Bug rtl-optimization/45685] [4.6 Regression] GCC optimizer for Intel x64 generates inefficient code
From: rguenth at gcc dot gnu.org @ 2010-11-24 16:10 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45685

--- Comment #11 from Richard Guenther <rguenth at gcc dot gnu.org> 2010-11-24 15:55:45 UTC ---
Separating predicate computation from predicate use should help this.