From: "siarhei.siamashka at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/55623] New: [ARM] GCC should not prefer long dependency chains, they inhibit performance on superscalar processors
Date: Sun, 09 Dec 2012 10:00:00 -0000
Message-ID: <bug-55623-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55623

             Bug #: 55623
           Summary: [ARM] GCC should not prefer long dependency chains, they
                    inhibit performance on superscalar processors
    Classification: Unclassified
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: siarhei.siamashka@gmail.com

This is a missing optimization. Or in this particular case, it's more like GCC
is reversing an attempt of a programmer to optimize the code for superscalar
dual-issue processors.

$ arm-none-linux-gnueabi-gcc -O2 -mcpu=cortex-a8 -o badsched badsched.c
$ objdump -d badsched

00000000 <f1>:
   0:   e1a03120    lsr r3, r0, #2
   4:   e08330a0    add r3, r3, r0, lsr #1
   8:   e08331a0    add r3, r3, r0, lsr #3
   c:   e0833220    add r3, r3, r0, lsr #4
  10:   e08332a0    add r3, r3, r0, lsr #5
  14:   e0833320    add r3, r3, r0, lsr #6
  18:   e08333a0    add r3, r3, r0, lsr #7
  1c:   e0833420    add r3, r3, r0, lsr #8
  20:   e08334a0    add r3, r3, r0, lsr #9
  24:   e0833520    add r3, r3, r0, lsr #10
  28:   e08335a0    add r3, r3, r0, lsr #11
  2c:   e0833620    add r3, r3, r0, lsr #12
  30:   e08336a0    add r3, r3, r0, lsr #13
  34:   e0833720    add r3, r3, r0, lsr #14
  38:   e08337a0    add r3, r3, r0, lsr #15
  3c:   e0833820    add r3, r3, r0, lsr #16
  40:   e08338a0    add r3, r3, r0, lsr #17
  44:   e0833920    add r3, r3, r0, lsr #18
  48:   e08339a0    add r3, r3, r0, lsr #19
  4c:   e0833a20    add r3, r3, r0, lsr #20
  50:   e0833aa0    add r3, r3, r0, lsr #21
  54:   e0833b20    add r3, r3, r0, lsr #22
  58:   e0833ba0    add r3, r3, r0, lsr #23
  5c:   e0830c20    add r0, r3, r0, lsr #24
  60:   e12fff1e    bx  lr

00000064 <f2>:
  64:   e1a031a0    lsr r3, r0, #3
  68:   e1a02220    lsr r2, r0, #4
  6c:   e08330a0    add r3, r3, r0, lsr #1
  70:   e0822120    add r2, r2, r0, lsr #2
  74:   e08332a0    add r3, r3, r0, lsr #5
  78:   e0822320    add r2, r2, r0, lsr #6
  7c:   e08333a0    add r3, r3, r0, lsr #7
  80:   e0822420    add r2, r2, r0, lsr #8
  84:   e08334a0    add r3, r3, r0, lsr #9
  88:   e0822520    add r2, r2, r0, lsr #10
  8c:   e08335a0    add r3, r3, r0, lsr #11
  90:   e0822620    add r2, r2, r0, lsr #12
  94:   e08336a0    add r3, r3, r0, lsr #13
  98:   e0822720    add r2, r2, r0, lsr #14
  9c:   e08337a0    add r3, r3, r0, lsr #15
  a0:   e0822820    add r2, r2, r0, lsr #16
  a4:   e08338a0    add r3, r3, r0, lsr #17
  a8:   e0822920    add r2, r2, r0, lsr #18
  ac:   e08339a0    add r3, r3, r0, lsr #19
  b0:   e0822a20    add r2, r2, r0, lsr #20
  b4:   e0833aa0    add r3, r3, r0, lsr #21
  b8:   e0822b20    add r2, r2, r0, lsr #22
  bc:   e0833ba0    add r3, r3, r0, lsr #23
  c0:   e0820c20    add r0, r2, r0, lsr #24
  c4:   e0800003    add r0, r0, r3
  c8:   e12fff1e    bx  lr

Guess which one of these two functions will be faster?

=== Cortex-A8 @1000MHz ===

$ time ./badsched 1

real    0m2.512s
user    0m2.500s
sys     0m0.000s

$ time ./badsched 2

real    0m2.064s
user    0m2.008s
sys     0m0.008s

=== Cortex-A15 @1700MHz ===

real    0m2.786s
user    0m2.770s
sys     0m0.005s

real    0m1.451s
user    0m1.440s
sys     0m0.005s

Function call and loop overhead prevent Cortex-A8 from showing ~2x better
performance when the "f2" function is used. We can try to mark these functions
as static in order to get them inlined, but in this case the asm workaround
becomes ineffective in a rather interesting way, which also demonstrates
instruction scheduling issues.
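The badsched.c source is not part of this message, so the following is only a sketch reconstructed from the disassembly above, not the reporter's file. It assumes a 32-bit unsigned argument and writes the two functions as plain C loops: f1 sums x >> 1 through x >> 24 in one accumulator, while f2 splits the odd and even shift amounts across two independent accumulators, matching the r3/r2 pattern in the listing. The helper count_mismatches is hypothetical and exists only to show that both forms compute the same value.

```c
#include <assert.h>
#include <stdint.h>

/* Single accumulator: each add needs the result of the previous add,
 * so the 24 additions form one long dependency chain. */
uint32_t f1(uint32_t x)
{
    uint32_t sum = 0;
    for (int k = 1; k <= 24; k++)
        sum += x >> k;
    return sum;
}

/* Two accumulators: odd shift amounts go to one, even shift amounts to
 * the other. The two chains are independent of each other, so a
 * dual-issue core can work on both in the same cycle. */
uint32_t f2(uint32_t x)
{
    uint32_t odd = 0, even = 0;
    for (int k = 1; k <= 23; k += 2) {
        odd  += x >> k;        /* k     = 1, 3, ..., 23 */
        even += x >> (k + 1);  /* k + 1 = 2, 4, ..., 24 */
    }
    return odd + even;
}

/* Hypothetical helper: counts inputs where f1 and f2 disagree.
 * Unsigned addition is associative and commutative modulo 2^32,
 * so the count should always be zero. */
int count_mismatches(void)
{
    int bad = 0;
    for (uint32_t i = 0; i < 1000; i++) {
        uint32_t x = i * 2654435761u;  /* spread test inputs over 32 bits */
        if (f1(x) != f2(x))
            bad++;
    }
    return bad;
}
```

Because the two forms are arithmetically equivalent, the compiler is free to reassociate the second back into the first, which is the behaviour the report objects to. The report mentions an asm workaround that keeps the chains apart; its details are not given in this message and are not reproduced here. For reference, the timings above work out to roughly 2.512 / 2.064 ≈ 1.22x on Cortex-A8 and 2.786 / 1.451 ≈ 1.92x on Cortex-A15 in favour of f2.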