public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/39284]  New: Computed gotos combined too aggressively
@ 2009-02-23 22:42 jyasskin at gmail dot com
  2009-02-23 22:44 ` [Bug c/39284] " jyasskin at gmail dot com
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: jyasskin at gmail dot com @ 2009-02-23 22:42 UTC (permalink / raw)
  To: gcc-bugs

The following uses a file "ceval.i" that I'll upload shortly and a nightly
build of gcc-4.4.0-20090223. The file is a version of Python's core
interpretation loop modified to use computed gotos in order to help the
processor's branch predictor do a better job than with a switch statement.
However, gcc combines the computed gotos into only a few instructions,
producing a result nearly as bad as the switch.

The computed gotos are all of the form "goto *next_label;". About half of the
instances counted are reachable, as measured by the asm volatile farther down.
$ grep -c -F 'goto *next_label' ceval.i
256

I compile with -fno-gcse because
http://gcc.gnu.org/onlinedocs/gcc-4.3.3/gcc/Optimize-Options.html#index-fgcse-646
says it helps and with "--param max-goto-duplication-insns=100000" because
Diego Novillo suggested tweaking that parameter.
$ gcc-4.4 -m32  -pthread  -fno-strict-aliasing -g -fwrapv -O3 -Wall
-Wstrict-prototypes -fno-gcse --param max-goto-duplication-insns=100000  -S -dA
ceval.i -o ceval.s
$ egrep -c 'jmp[[:space:]]*\*' ceval.s
4


I tried to make the common instruction sequence leading up to the indirection
jump shorter by putting an asm volatile before it, but that didn't work. You
can see the result by running:

$ sed 's!goto \*next_label!asm volatile ("/* Computed goto */":::"memory");goto
*next_label!' < ceval.i > ceval-asm.i
$ gcc-4.4 -m32  -pthread  -fno-strict-aliasing -g -fwrapv -O3 -fno-gcse --param
max-goto-duplication-insns=100000  -S -dA ceval-asm.i -o ceval-asm.s
$ egrep -c 'Computed goto' ceval-asm.s
130
$ egrep -c 'jmp[[:space:]]*\*' ceval-asm.s
4


One of the relevant bits of assembly (search for "Computed goto" in
ceval-asm.s) is:

.L1125:
        # basic block 66
.LBE835:
        # ../src/Python/ceval.c:1000
        .loc 1 1000 0
        jmp     *-72(%ebp)
.LVL558:
        .p2align 4,,7
        .p2align 3
.L1433:
        # basic block 67

...

        # basic block 114
#APP
# 11 "../src/Include/ceval-vm.i" 1
        /* Computed goto */
# 0 "" 2
#NO_APP
        movl    %esi, %ebx
.LVL599:
        .p2align 4,,7
        .p2align 3
.L484:
        # basic block 115
        # ../src/Python/ceval.c:1000
        .loc 1 1000 0
        movl    $1, %eax
.LVL600:
        movl    %ebx, %esi
.LVL601:
        jmp     .L1125

The documentation for max-goto-duplication-insns says:

    The maximum number of instructions to duplicate to a block that jumps to a
computed goto. To avoid O(N^2) behavior in a number of passes, GCC factors
computed gotos early in the compilation process, and unfactors them as late as
possible. Only computed jumps at the end of a basic blocks with no more than
max-goto-duplication-insns are unfactored. The default value is 8. 

Since .L1125 and basic block 66 have only a single instruction, the computed
goto should have been unfactored.

$ gcc-4.4 -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: /home/dnovillo/perf/sbox/gcc/local.x86_64/src/configure
--prefix=/home/dnovillo/perf/sbox/gcc/local.x86_64/inst
--srcdir=/home/dnovillo/perf/sbox/gcc/local.x86_64/src
--with-gmp=/home/dnovillo/i686 --with-mpfr=/home/dnovillo/i686
--build=i686-pc-linux-gnu --enable-languages=c++,fortran,objc,obj-c++
Thread model: posix
gcc version 4.4.0 20090223 (experimental) (GCC)


-- 
           Summary: Computed gotos combined too aggressively
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: jyasskin at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2009-06-09 15:19 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-23 22:42 [Bug c/39284] New: Computed gotos combined too aggressively jyasskin at gmail dot com
2009-02-23 22:44 ` [Bug c/39284] " jyasskin at gmail dot com
2009-02-23 22:46 ` [Bug middle-end/39284] " pinskia at gcc dot gnu dot org
2009-02-23 22:49 ` pinskia at gcc dot gnu dot org
2009-02-23 22:58 ` jyasskin at gmail dot com
2009-02-23 23:19 ` steven at gcc dot gnu dot org
2009-02-24 10:17 ` rguenth at gcc dot gnu dot org
2009-02-24 15:26 ` jyasskin at gmail dot com
2009-06-08 20:56 ` hubicka at gcc dot gnu dot org
2009-06-08 22:43 ` steven at gcc dot gnu dot org
2009-06-08 22:46 ` steven at gcc dot gnu dot org
2009-06-09 15:19 ` hubicka at gcc dot gnu dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).