From: "fredrik dot svahn at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/42621] New: 4.4/4.5 Regression, Computed gotos on AMD 800% slower
Date: Tue, 05 Jan 2010 11:44:00 -0000
Message-ID: <bug-42621-18630@http.gcc.gnu.org/bugzilla/>

When a program with computed gotos is compiled with gcc 4.4.2, it runs
significantly slower (up to a factor of 10) than when it is compiled with
e.g. gcc 4.1/4.3 with the same optimization flags (-O2 or -O3).

A small dummy test program without header file dependencies is attached.

I am compiling with a command line like "gcc -O3 test.c -o testp.4.4.2" and
running the generated executable without arguments, like "./testp.4.4.2".
Generating CPU-specific instructions, e.g. "-march=athlon64", seems to make
no difference. I have also tried "-fno-gcse" (as recommended in the docs), to
no avail. The results are the same with targets x86_64 and i686 on Novell
SLES 10 and Arch Linux.

Interestingly enough, I do not see this problem on any Intel processor I have
tried, but I have seen the slowdown on all AMD processors I have tried (e.g.
Dual-Core AMD Opteron Processor 2216 and AMD Turion 64 X2 Mobile Technology
TL-60). In fact, the exact same two binaries resulting from compilation with
gcc 4.4.2 and gcc 4.3 for i686, which show a significant performance
difference on an AMD, do not show any significant difference on an Intel
Core 2 Duo T7500.

Some observations:

1. On AMD there is a huge difference in the number of mispredicted branches
between the program compiled with gcc-4.4.2 and the program compiled with
earlier compilers.
See for instance the following output from oprofile:

---
Counted RETIRED_INDIRECT_BRANCHES_MISPREDICTED events (Retired Indirect Branches Mispredicted) with a unit mask of 0x00 (No unit mask) count 500
Counted RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS events (Retired Mispredicted Branch Instructions) with a unit mask of 0x00 (No unit mask) count 500
Counted RETIRED_TAKEN_BRANCH_INSTRUCTIONS events (Retired taken branch instructions) with a unit mask of 0x00 (No unit mask) count 500
RETIRED_INDIRE...|RETIRED_MISPRE...|RETIRED_TAKEN_...|
  samples|      %|  samples|      %|  samples|      %|
------------------------------------------------------
   185416 88.7799    186587 82.8723    381826 48.1913 testp.4.4.2
     5605  2.6838      6275  2.7870    157401 19.8660 testp.4.3

2. Gcc 4.3 generates the following assembly around the "eq:" label in the
attached program:

  4004c0:       48 81 fb 00 e1 f5 05    cmp    $0x5f5e100,%rbx
  4004c7:       74 21                   je     4004ea <main+0x6a>
  4004c9:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)
  4004d0:       48 63 c5                movslq %ebp,%rax
  4004d3:       48 8b 44 c4 b0          mov    -0x50(%rsp,%rax,8),%rax
  4004d8:       ff e0                   jmpq   *%rax

Gcc 4.4.2, by contrast, generates an additional jump instruction:

  4004c0:       ff e0                   jmpq   *%rax
  ...
  4004d8:       48 81 fb 00 e1 f5 05    cmp    $0x5f5e100,%rbx
  4004df:       74 21                   je     400502 <main+0x82>
  4004e1:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)
  4004e8:       48 63 c5                movslq %ebp,%rax
  4004eb:       48 8b 44 c4 88          mov    -0x78(%rsp,%rax,8),%rax
  4004f0:       eb ce                   jmp    4004c0 <main+0x40>

3. I see the same behaviour with a month-old snapshot of gcc 4.5.

Examples of compilers used (I have tried a number of different builds on
different targets):

Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --prefix=/usr --enable-shared --enable-languages=c,c++,fortran,objc,obj-c++,ada --enable-threads=posix --mandir=/usr/share/man --infodir=/usr/share/info --enable-__cxa_atexit --disable-multilib --libdir=/usr/lib --libexecdir=/usr/lib --enable-clocale=gnu --disable-libstdcxx-pch --with-tune=generic
Thread model: posix
gcc version 4.4.2 (GCC)

Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --prefix=/usr --enable-shared --enable-languages=c,c++ --enable-threads=posix --mandir=/usr/share/man --infodir=/usr/share/info --enable-__cxa_atexit --disable-multilib --libdir=/usr/lib --libexecdir=/usr/lib --enable-clocale=gnu --disable-libstdcxx-pch --with-tune=generic --disable-werror --enable-checking=release --program-suffix=-4.3 --enable-version-specific-runtime-libs
Thread model: posix
gcc version 4.3.3 (GCC)

Test program:
=============

#define VALUE 100000000

int main(int argc, char *argv[]) {
  void *ops[] = { &&inc, &&eq, &&gt, &&lt, &&gte, &&lte, &&zero,
                  &&not_implemented, &&exit };
  long i = 0;
  int next_op = argc; //unknown at compile time...
  int fail_op = 0; //inc

  goto *ops[0];

  inc:
    i++;
    goto *ops[next_op];

  eq:
    if (!(i == VALUE))
      goto handle_fail;
    return 0;

  gt:
    if (!(i > VALUE))
      goto handle_fail;
    return 0;

  lt:
    if (!(i < VALUE))
      goto handle_fail;
    return 0;

  gte:
    if (!(i >= VALUE))
      goto handle_fail;
    return 0;

  lte:
    if (!(i <= VALUE))
      goto handle_fail;
    return 0;

  zero:
    if (!(i == 0))
      goto handle_fail;
    return 0;

  not_implemented:
    fail_op = 8; //exit
    goto handle_fail;

  exit:
    return -1;

  handle_fail:
    goto *ops[fail_op];
}

--
           Summary: 4.4/4.5 Regression, Computed gotos on AMD 800% slower
           Product: gcc
           Version: 4.4.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: fredrik dot svahn at gmail dot com
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42621
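
The two assembly listings in observation 2 differ in where the indirect jump
lives: the gcc 4.3 code ends the "eq:" handler with its own "jmpq *%rax",
while the gcc 4.4.2 code jumps back to one shared "jmpq *%rax". The sketch
below restates those two shapes at source level so they are easier to compare.
It is only an illustration: the function names run_replicated and run_shared,
the two-entry ops table, and the limit parameter are hypothetical and not part
of the attached test program. It needs the GNU C labels-as-values extension.
That the shared jump is what drives the misprediction counts is an assumption
here, not something this report establishes, and the compiler is free to turn
either source shape into the other, so the generated code should be checked
with objdump -d.

```c
/* Shape A: each handler ends in its own computed goto, so after
 * compilation there may be one indirect jump site per handler
 * (similar to the gcc 4.3 listing).
 * limit must be >= 1, otherwise the loop never terminates. */
long run_replicated(int next_op, long limit)
{
    void *ops[] = { &&inc, &&eq };
    long i = 0;

    goto *ops[0];

inc:
    i++;
    goto *ops[next_op];     /* indirect jump owned by "inc" */

eq:
    if (i != limit)
        goto *ops[0];       /* indirect jump owned by "eq" */
    return i;
}

/* Shape B: every handler stores its target and branches to a single
 * shared dispatch label, so there is only one indirect jump site
 * (similar to the gcc 4.4.2 listing, which adds a direct jmp).
 * limit must be >= 1, otherwise the loop never terminates. */
long run_shared(int next_op, long limit)
{
    void *ops[] = { &&inc, &&eq };
    void *target = ops[0];
    long i = 0;

dispatch:
    goto *target;           /* the only indirect jump in the function */

inc:
    i++;
    target = ops[next_op];
    goto dispatch;

eq:
    if (i != limit) {
        target = ops[0];
        goto dispatch;
    }
    return i;
}
```

Both functions return the same value for the same arguments; calling each
with next_op set to 1 and a large limit from a small main, and timing the two
builds separately, is one way to compare the shapes on a given processor.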
Thread overview: 16+ messages
2010-01-05 11:44 fredrik dot svahn at gmail dot com [this message]
2010-01-05 12:46 ` [Bug rtl-optimization/42621] [4.4/4.5 Regression] " rguenth at gcc dot gnu dot org
2010-01-05 12:50 ` steven at gcc dot gnu dot org
2010-01-05 21:51 ` steven at gcc dot gnu dot org
2010-01-05 21:56 ` pinskia at gcc dot gnu dot org
2010-01-05 22:11 ` steven at gcc dot gnu dot org
2010-01-06 11:37 ` fredrik dot svahn at gmail dot com
2010-01-06 11:44 ` fredrik dot svahn at gmail dot com
2010-01-06 23:00 ` fredrik dot svahn at gmail dot com
2010-01-07 14:51 ` rguenth at gcc dot gnu dot org
2010-01-10 23:31 ` steven at gcc dot gnu dot org
2010-01-13 22:27 ` [Bug rtl-optimization/42621] [4.4 " rguenth at gcc dot gnu dot org
2010-01-18 13:14 ` carlr at freemail dot gr
2010-01-21 13:16 ` jakub at gcc dot gnu dot org
2010-04-30  9:25 ` jakub at gcc dot gnu dot org
2010-07-14 20:49 ` jyasskin at gmail dot com