public inbox for gcc-bugs@sourceware.org
From: "jake.stine at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/54073] [4.7 Regression] SciMark Monte Carlo test performance has seriously decreased in recent GCC releases
Date: Sat, 16 Feb 2013 19:12:00 -0000
Message-ID: <bug-54073-4-8DirZzV8pz@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-54073-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54073

Jake Stine <jake.stine at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jake.stine at gmail dot com

--- Comment #16 from Jake Stine <jake.stine at gmail dot com> 2013-02-16 19:12:05 UTC ---

Hi,

I have done quite a bit of analysis on cmov performance across x86 architectures, so I will share here in case it helps:

Quick summary: Conditional moves on Intel Core/Xeon and AMD Bulldozer architectures should probably be avoided "as a rule."

History: Conditional moves were beneficial for the Intel Pentium 4, and also (but less so) for AMD Athlon/Phenom chips. In the AMD Athlon/Phenom case the performance of cmov vs cmp+branch is determined more by the alignment of the target of the branch than by the prediction rate of the branch. The instruction decoders would incur penalties on certain types of unaligned branch targets (when taken), or when decoding sequences of instructions that contained multiple branches within a 16-byte "fetch" window (taken or not). cmov was sometimes handy for avoiding those.

With regard to more current Intel Core and AMD Bulldozer/Bobcat architecture:

I have found that use of conditional moves (cmov) is only beneficial if the branch that the move is replacing is badly mis-predicted. In my tests, the cmov only became clearly "optimal" when the branch was predicted correctly less than 92% of the time, which is abysmal by modern branch predictor standards and rarely occurs in practice. Above 97% prediction rates, cmov is typically slower than cmp+branch. Inside loops that contain branches with prediction rates approaching 100% (as is the case presented by the OP), cmov becomes a severe performance bottleneck. This holds true for both Core and Bulldozer. Bulldozer has less efficient branching than the i7, but is also severely bottlenecked by its limited fetch/decode. Cmov requires executing more total instructions, and that makes Bulldozer very unhappy.

Note that my tests involved relatively simple loops that did not suffer from the added register pressure that cmov introduces. In practice, the prognosis for cmov being "optimal" is even worse than what I've observed in a controlled environment. Furthermore, to my knowledge the status of cmov vs. branch performance on x86 will not be changing anytime soon. cmov will continue to be a liability well into the next couple of architecture releases from Intel and AMD. Piledriver will have added fetch/decode resources but should also have a smaller mispredict penalty, so it's doubtful cmov will gain much advantage there either.

Therefore I would recommend setting -fno-tree-loop-if-convert for all -march matching Intel Core and AMD Bulldozer/Bobcat families.

There is one good use-case for cmov on x86: Mis-predicted conditions inside of loops. Currently there's no way to force that behavior in situations where I, the programmer, am fully aware that the condition is chaotic/random. A builtin cmov or condition hint would be nice. For now I'm forced to address those (fortunately infrequent) situations via inline asm.
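
[Editorial sketch, not part of the original comment.] The inline-asm workaround mentioned in the last paragraph could look roughly like the code below. This is not the author's code: the helper names select_cmov and count_below are hypothetical, and the sketch assumes a GCC-compatible compiler targeting x86-64 (AT&T syntax). On any other target it falls back to a plain ternary, which the compiler is free to turn into either a branch or a conditional move.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns if_true when cond is nonzero, else if_false.
 * On x86-64 with GCC-style extended asm this always emits a cmov; elsewhere
 * it is an ordinary ternary and the compiler decides how to lower it. */
static inline int64_t select_cmov(int64_t cond, int64_t if_false, int64_t if_true)
{
#if defined(__x86_64__) && defined(__GNUC__)
    int64_t result = if_false;
    __asm__ ("test %[c], %[c]\n\t"      /* set ZF from cond */
             "cmovne %[t], %[r]"        /* result = if_true when cond != 0 */
             : [r] "+r" (result)
             : [c] "r" (cond), [t] "r" (if_true)
             : "cc");
    return result;
#else
    return cond ? if_true : if_false;
#endif
}

/* Hypothetical use: count elements below a threshold when the comparison
 * outcome is expected to be essentially random, so a branch would be
 * predicted poorly. */
static size_t count_below(const int64_t *v, size_t n, int64_t threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)select_cmov(v[i] < threshold, 0, 1);
    return count;
}
```

Because the asm statement is opaque to the optimizer, a helper like this also blocks constant folding and vectorization around it, so it only makes sense for the rare chaotic conditions described above. For predictable branches, the comment's recommendation (-fno-tree-loop-if-convert, or simply leaving the branch alone) applies instead.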
Thread overview: 22+ messages
2012-07-23 15:25 [Bug tree-optimization/54073] New: " t.artem at mailcity dot com
2012-07-23 15:44 ` [Bug tree-optimization/54073] " t.artem at mailcity dot com
2012-07-24  9:23 ` [Bug tree-optimization/54073] [4.7/4.8 Regression] " rguenth at gcc dot gnu.org
2012-07-24 11:29 ` markus at trippelsdorf dot de
2012-07-24 13:21 ` rguenth at gcc dot gnu.org
2012-07-26 15:41 ` venkataramanan.kumar at amd dot com
2012-07-26 16:13 ` markus at trippelsdorf dot de
2012-08-16 11:06 ` rguenth at gcc dot gnu.org
2012-09-07 10:09 ` rguenth at gcc dot gnu.org
2012-09-20 10:28 ` jakub at gcc dot gnu.org
2012-11-13 13:05 ` jakub at gcc dot gnu.org
2012-11-13 15:07 ` t.artem at mailcity dot com
2012-11-13 15:14 ` ubizjak at gmail dot com
2012-11-13 15:24 ` jakub at gcc dot gnu.org
2012-11-13 15:55 ` hubicka at gcc dot gnu.org
2012-11-16 11:41 ` jakub at gcc dot gnu.org
2012-11-16 14:50 ` [Bug tree-optimization/54073] [4.7 " jakub at gcc dot gnu.org
2012-12-31  9:41 ` pinskia at gcc dot gnu.org
2013-02-16 19:12 ` jake.stine at gmail dot com [this message]
2013-02-17  8:41 ` ubizjak at gmail dot com
2013-04-11  7:59 ` rguenth at gcc dot gnu.org
2014-06-12 13:16 ` rguenth at gcc dot gnu.org