public inbox for gcc@gcc.gnu.org
* Useless conditional branches
@ 2010-03-02  8:56 Alain Ketterlin
  2010-03-02  9:40 ` Piotr Wyderski
  2010-03-02  9:56 ` Andrew Haley
  0 siblings, 2 replies; 5+ messages in thread
From: Alain Ketterlin @ 2010-03-02  8:56 UTC
  To: gcc


It looks like gcc sometimes produces "useless" conditional branches.
I've found code like this:

   xor    %edx,%edx
   ; code with no effect on edx (see full code below)
   test   %edx,%edx
   jne    <somewhere else>

The branch on the last line is never taken. Why does gcc generate such
code sequences? Is this patched at runtime, or something? Am I missing
something obvious here?

I append the function's complete code below. There is another
suspicious branch at 0xb31cd8 (never taken, for less obvious
reasons---edx is never zero at that point).

I have found hundreds of such occurrences across the CPU2006 suite. Does
anybody have any idea why this happens? Is there any specific
optimization to enable or disable to avoid such dead edges? Thanks in
advance for any remark/idea/...
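
For readers who want to look for the same pattern in their own
binaries, here is a minimal sketch of such a scan. It is not the tool
used to obtain the counts above. It assumes disassembly lines shaped
like the listing in this message (address, mnemonic, AT&T operands, no
raw bytes), and the names find_constant_branches, canon, base and
split_operands are illustrative. It is deliberately conservative: it
only tracks 32/64-bit general-purpose registers zeroed by
"xor %r,%r", forgets everything at any instruction it does not
recognize, and does not catch the second case (the "inc" followed by
"test"/"je").

```python
import re

# Assumed line shape, as in the listing in this message:
#    b31c61: xor    %edx,%edx
INSN = re.compile(r"^\s*([0-9a-f]+):\s+(\S+)\s*(.*)$")
# 32/64-bit general-purpose registers only; %edx and %rdx share the key "dx".
REG = re.compile(r"%(?:[er]([a-z]{2})|(r\d+)d?)$")
# Mnemonics assumed to write at most their last explicit operand.
WRITERS = {"mov", "lea", "add", "sub", "inc", "dec", "and", "or", "xor", "nop"}
FLAGS_ONLY = {"cmp", "test"}


def canon(reg):
    """Return a key shared by the 32- and 64-bit names of a register, else None."""
    m = REG.match(reg)
    return (m.group(1) or m.group(2)) if m else None


def base(op):
    """Drop an AT&T 'l' or 'q' size suffix when that yields a known mnemonic."""
    known = WRITERS | FLAGS_ONLY
    if op not in known and op[-1] in "lq" and op[:-1] in known:
        return op[:-1]
    return op


def split_operands(args):
    """Split on commas that are not inside a memory operand's parentheses."""
    parts, depth, cur = [], 0, ""
    for ch in args:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(cur.strip())
            cur = ""
        else:
            cur += ch
    if cur.strip():
        parts.append(cur.strip())
    return parts


def find_constant_branches(lines):
    """Return (address, mnemonic) of conditional jumps that directly follow
    a 'test %r,%r' on a register still known to be zero."""
    hits, zeroed, tested = [], set(), False
    for line in lines:
        m = INSN.match(line)
        if m is None:  # blank line or label: treat as a block boundary
            zeroed.clear()
            tested = False
            continue
        addr, op, args = m.groups()
        ops = split_operands(args)
        if tested and op.startswith("j") and not op.startswith("jmp"):
            hits.append((addr, op))
        tested = False
        same_reg = len(ops) == 2 and ops[0] == ops[1] and canon(ops[0]) is not None
        kind = base(op)
        if kind == "xor" and same_reg:
            zeroed.add(canon(ops[0]))  # zeroing idiom
        elif kind == "test":
            tested = same_reg and canon(ops[0]) in zeroed
        elif kind == "cmp":
            pass  # writes flags only
        elif kind in WRITERS:
            dest = ops[-1] if ops else ""
            if dest.startswith("%"):
                key = canon(dest)
                if key is None:
                    zeroed.clear()  # 8/16-bit or other register: give up
                else:
                    zeroed.discard(key)
        else:
            zeroed.clear()  # jumps, calls, and every unrecognized instruction
    return hits


SAMPLE = """
   b31c60: push   %rbx
   b31c61: xor    %edx,%edx
   b31c63: mov    %rsi,%rbx
   b31c66: mov    $0x2,%r9d
   b31c6c: mov    $0xa,%r8d
   b31c72: mov    $0x6,%esi
   b31c77: mov    $0xe,%ecx
   b31c7c: test   %edx,%edx
   b31c7e: jne    b31cda <formf_+0x7a>
"""
print(find_constant_branches(SAMPLE.splitlines()))
```

On the entry block of formf_ quoted below, this reports the jne at
b31c7e. Because unknown instructions reset the tracking, the scan
under-reports rather than over-reports.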


This code is from 416.gamess (from SPEC CPU2006), function "formf",
compiled with "gcc version 4.4.1 (Ubuntu 4.4.1-4ubuntu9)" (from a
stock ubuntu 9.10), with options "-g -O3 -march=native
-fno-optimize-sibling-calls". 416.gamess is compiled with gfortran,
but the same thing happens with C or C++ programs. The same also
happens at lower optimization levels (-O1), but less frequently.

uname -m gives "x86_64", and /proc/cpuinfo contains:

vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Core(TM)2 Duo CPU     P8700  @ 2.53GHz
[...]
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm ida
tpr_shadow vnmi flexpriority

Here is an objdump disassembly of the code, broken into basic-blocks:

0000000000b31c60 <formf_>:
   b31c60: push   %rbx
   b31c61: xor    %edx,%edx
   b31c63: mov    %rsi,%rbx
   b31c66: mov    $0x2,%r9d
   b31c6c: mov    $0xa,%r8d
   b31c72: mov    $0x6,%esi
   b31c77: mov    $0xe,%ecx
   b31c7c: test   %edx,%edx
   b31c7e: jne    b31cda <formf_+0x7a>

   b31c80: mov    $0x3ff8000000000000,%r11
   b31c8a: mov    $0xbfe0000000000000,%r10
   b31c94: mov    %r11,(%rdi)
   b31c97: movq   $0x0,0x40(%rdi)
   b31c9f: movq   $0x0,0x20(%rdi)
   b31ca7: mov    %r10,0x60(%rdi)
   b31cab: xor    %eax,%eax
   b31cad: nopl   (%rax)

   b31cb0: inc    %edx
   b31cb2: movq   $0x0,0x10(%rdi,%rax,8)
   b31cbb: movq   $0x0,0x50(%rdi,%rax,8)
   b31cc4: movq   $0x0,0x30(%rdi,%rax,8)
   b31ccd: movq   $0x0,0x70(%rdi,%rax,8)
   b31cd6: test   %edx,%edx
   b31cd8: je     b31c80 <formf_+0x20>

   b31cda: cmp    $0x2,%edx
   b31cdd: jne    b31d18 <formf_+0xb8>

   b31cdf: mov    $0xbfe0000000000000,%r11
   b31ce9: mov    $0xbfe0000000000000,%r10
   b31cf3: mov    %r11,(%rdi,%r9,8)
   b31cf7: mov    $0x2,%eax
   b31cfc: movq   $0x0,(%rdi,%r8,8)
   b31d04: movq   $0x0,(%rdi,%rsi,8)
   b31d0c: mov    %r10,(%rdi,%rcx,8)
   b31d10: jmp    b31cb0 <formf_+0x50>

   b31d12: nopw   0x0(%rax,%rax,1)

   b31d18: movslq %edx,%rax
   b31d1b: cmp    $0x1,%edx
   b31d1e: movq   $0x0,(%rdi,%rax,8)
   b31d26: movq   $0x0,0x40(%rdi,%rax,8)
   b31d2f: movq   $0x0,0x20(%rdi,%rax,8)
   b31d38: movq   $0x0,0x60(%rdi,%rax,8)
   b31d41: jne    b31cb0 <formf_+0x50>

   b31d47: movq   $0x0,0x50(%rdi,%rax,8)
   b31d50: movq   $0x0,0x30(%rdi,%rax,8)
   b31d59: mov    $0x3fe0000000000000,%r9
   b31d63: mov    $0xbff8000000000000,%r8
   b31d6d: mov    %r9,0x10(%rdi,%rax,8)
   b31d72: mov    %r8,0x70(%rdi,%rax,8)
   b31d77: mov    $0xd0f838,%edx
   b31d7c: mov    $0xd0f7b0,%esi
   b31d81: mov    %rbx,%rdi
   b31d84: xor    %eax,%eax
   b31d86: callq  977b20 <vclr_>
   b31d8b: mov    $0x3fe0000000000000,%rsi
   b31d95: mov    $0x3fe0000000000000,%rcx
   b31d9f: mov    %rsi,(%rbx)
   b31da2: mov    %rcx,0x60(%rbx)
   b31da6: mov    $0xbfe0000000000000,%rdx
   b31db0: mov    $0xbfe0000000000000,%rax
   b31dba: mov    %rdx,0x18(%rbx)
   b31dbe: mov    %rax,0x78(%rbx)
   b31dc2: pop    %rbx
   b31dc3: retq

Let me know if more detail is needed.

-- Alain.


* Re: Useless conditional branches
  2010-03-02  8:56 Useless conditional branches Alain Ketterlin
@ 2010-03-02  9:40 ` Piotr Wyderski
  2010-03-02  9:56 ` Andrew Haley
  1 sibling, 0 replies; 5+ messages in thread
From: Piotr Wyderski @ 2010-03-02  9:40 UTC
  To: Alain Ketterlin; +Cc: gcc

Alain Ketterlin wrote:

> I've found code like this:
>
>  xor    %edx,%edx
>  ; code with no effect on edx (see full code below)
>  test   %edx,%edx
>  jne    <somewhere else>

I have experienced similar sequences where your
"code with no effect" was a lot of SSE instructions,
so I can confirm that the problem exists.

Best regards
Piotr Wyderski


* Re: Useless conditional branches
  2010-03-02  8:56 Useless conditional branches Alain Ketterlin
  2010-03-02  9:40 ` Piotr Wyderski
@ 2010-03-02  9:56 ` Andrew Haley
  2010-03-02 10:35   ` Alain Ketterlin
  2010-03-02 10:42   ` Richard Guenther
  1 sibling, 2 replies; 5+ messages in thread
From: Andrew Haley @ 2010-03-02  9:56 UTC
  To: gcc

On 03/02/2010 08:55 AM, Alain Ketterlin wrote:
> 
> It looks like gcc sometimes produces "useless" conditional branches.
> I've found code like this:
> 
>   xor    %edx,%edx
>   ; code with no effect on edx (see full code below)
>   test   %edx,%edx
>   jne    <somewhere else>
> 
> The branch on the last line is never taken. Why does gcc generate such
> code sequences? Is this patched at runtime, or something? Am I missing
> something obvious here?

> Let me know if more detail is needed.

We really need a test case, with source, that illustrates the problem.
When we have that, we can treat it as a missed-optimization bug.

Andrew.


* Re: Useless conditional branches
  2010-03-02  9:56 ` Andrew Haley
@ 2010-03-02 10:35   ` Alain Ketterlin
  2010-03-02 10:42   ` Richard Guenther
  1 sibling, 0 replies; 5+ messages in thread
From: Alain Ketterlin @ 2010-03-02 10:35 UTC
  To: Andrew Haley; +Cc: gcc

Andrew Haley wrote:
> On 03/02/2010 08:55 AM, Alain Ketterlin wrote:
>> It looks like gcc sometimes produces "useless" conditional branches.
>> I've found code like this:
>>
>>   xor    %edx,%edx
>>   ; code with no effect on edx (see full code below)
>>   test   %edx,%edx
>>   jne    <somewhere else>
>>
>> The branch on the last line is never taken. Why does gcc generate such
>> code sequences? Is this patched at runtime, or something? Am I missing
>> something obvious here?

> We really need a test case, with source, that illustrates the problem.
> When we have that, we can treat it as a missed-optimization bug.

Sure. I can provide a list of functions from SPEC CPU2006 where it
happens. Here is the list for 403.gcc, at -O1 and -O3 (the CPU2006 gcc
is based on gcc-3.2, according to SPEC, so it probably differs from the
current code base).

=== 403.gcc (-O1)
FUN init_builtins 0x43582f BLOCK 0x43582f
FUN cpp_finish_options 0x435a39 BLOCK 0x435a52
FUN life_analysis 0x4be8f7 BLOCK 0x4beb79
FUN optimize_mode_switching 0x580b9a BLOCK 0x580ea4
=== 403.gcc (-O3)
FUN start_function 0x413af0 BLOCK 0x41410e
FUN init_builtins 0x442640 BLOCK 0x442640
FUN cpp_finish_options 0x442ec0 BLOCK 0x442f28
FUN can_combine_p 0x4770f0 BLOCK 0x477196
FUN gen_rtx 0x4bf440 BLOCK 0x4bf5a8
FUN expand_shift 0x4ca5a0 BLOCK 0x4ca6ce
FUN life_analysis 0x4edbc0 BLOCK 0x4ee03d
FUN htab_create 0x6a1940 BLOCK 0x6a1940
FUN htab_expand 0x6a1aa0 BLOCK 0x6a1aa0
FUN htab_try_create 0x6a1ee0 BLOCK 0x6a1ee0

(the addresses may be meaningless, but if both addresses are equal it 
means that the problem appears on the entry block). I can't guarantee 
this list is exhaustive.
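
The rule in the parenthetical above (equal FUN and BLOCK addresses
mean the branch sits in the entry block) can be applied mechanically.
A minimal sketch, assuming the report format shown in this message;
the function name entry_block_hits is illustrative, and the sample
holds only a few lines of the list:

```python
import re

# Assumed line shape: "FUN <name> <hex address> BLOCK <hex address>"
ENTRY = re.compile(r"^FUN\s+(\S+)\s+(0x[0-9a-f]+)\s+BLOCK\s+(0x[0-9a-f]+)$")


def entry_block_hits(report):
    """Map each '=== ...' section to the functions whose FUN and BLOCK addresses match."""
    result = {}
    section = None
    for line in report.splitlines():
        line = line.strip()
        if line.startswith("==="):
            section = line.lstrip("= ").strip()
            result[section] = []
            continue
        m = ENTRY.match(line)
        if m and int(m.group(2), 16) == int(m.group(3), 16):
            result.setdefault(section, []).append(m.group(1))
    return result


SAMPLE_REPORT = """
=== 403.gcc (-O1)
FUN init_builtins 0x43582f BLOCK 0x43582f
FUN cpp_finish_options 0x435a39 BLOCK 0x435a52
=== 403.gcc (-O3)
FUN start_function 0x413af0 BLOCK 0x41410e
FUN init_builtins 0x442640 BLOCK 0x442640
FUN htab_create 0x6a1940 BLOCK 0x6a1940
"""
print(entry_block_hits(SAMPLE_REPORT))
```

Applied to the full list above, the entry-block cases are
init_builtins at -O1, and init_builtins, htab_create, htab_expand and
htab_try_create at -O3.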

My full list for CPU2006 programs is a bit too long to post on the 
mailing list (around a thousand lines). I can send it by mail (is a list 
of function names enough?).

I'll try to extract a minimal example if you want and/or have no access 
to CPU2006. Give me one or two days.

-- Alain.


* Re: Useless conditional branches
  2010-03-02  9:56 ` Andrew Haley
  2010-03-02 10:35   ` Alain Ketterlin
@ 2010-03-02 10:42   ` Richard Guenther
  1 sibling, 0 replies; 5+ messages in thread
From: Richard Guenther @ 2010-03-02 10:42 UTC
  To: Andrew Haley; +Cc: gcc

On Tue, Mar 2, 2010 at 10:55 AM, Andrew Haley <aph@redhat.com> wrote:
> On 03/02/2010 08:55 AM, Alain Ketterlin wrote:
>>
>> It looks like gcc sometimes produces "useless" conditional branches.
>> I've found code like this:
>>
>>   xor    %edx,%edx
>>   ; code with no effect on edx (see full code below)
>>   test   %edx,%edx
>>   jne    <somewhere else>
>>
>> The branch on the last line is never taken. Why does gcc generate such
>> code sequences? Is this patched at runtime, or something? Am I missing
>> something obvious here?
>
>> Let me know if more detail is needed.
>
> We really need a test case, with source, that illustrates the problem.
> When we have that, we can treat it as a missed-optimization bug.

I can't reproduce it with 4.3 or 4.5, but indeed 4.4 has this interesting
code sequence.  It looks like a missed jump threading opportunity.

Richard.

> Andrew.
>
