public inbox for gcc-bugs@sourceware.org help / color / mirror / Atom feed
From: "jiwang at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> To: gcc-bugs@gcc.gnu.org Subject: [Bug tree-optimization/62173] [5.0 regression] 64bit Arch can't ivopt while 32bit Arch can Date: Fri, 23 Jan 2015 17:33:00 -0000 [thread overview] Message-ID: <bug-62173-4-W8uB3Cwmg7@http.gcc.gnu.org/bugzilla/> (raw) In-Reply-To: <bug-62173-4@http.gcc.gnu.org/bugzilla/> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62173 --- Comment #16 from Jiong Wang <jiwang at gcc dot gnu.org> --- After some work on this issue, things have gone beyond my expectations. To summarize my understanding of this issue and what I have got: Reason of regression ====== the testcase contains a loop which is hotcode, and there is one critical loop invariant, actually base address of local array, if it can't be ivopted then there will be big performance regression on this testcase. because of one gcc tree level ivopt issue, this variable will not be ivopted on 64bit machine, mips, powerpc, aarch64 etc. mostly because there some type precision issues in gcc tree level code that gcc is doing too conservative decision on overflow checking and thus abort the ivopt in early stage. For details, please see above discussion. Then why there is regression on AArch64 since r213488 "[AArch64] Improve TARGET_LEGITIMIZE_ADDRESS_P hook" ? it's because except tree level ivopt pass, gcc also do extra ivopt on rtl level by the rtl pre pass. previously, we have inefficient implementation on aarch64 TARGET_LEGITIMIZE_ADDRESS_P while this inefficient implementation happen to generated some instruction pattern that rtl level ivopt could catch and that critical variable finally ivopted by rtl pass. r213488 fixed the inefficiency of TARGET_LEGITIMIZE_ADDRESS_P, and as a side-effect, we will not generate that specific instruction pattern again, thus no ivopt on rtl pre, and thus regression happen. before r213488, ivopt happen on AArch64 only, while after r213488, AArch64 act the same as all other 64bit targets, mips64, powerpc64, sparc64 etc. To Fix ====== * the idea way is fix this issue on tree-ssa-loop-ivopt pass. We need to correct type precision & overflow checking issue, then gcc could ivopt the critical variable as early as in tree level for all 64bit targets. * gcc rtl level ivopt could be improved also. we could re-shuffle rtl pattern, then rtl pre pass could catch more ivopt opportunities and act as a supplement to tree ivopt. What I have tried ================= I have tried to fix this issue on rtl level. We need two patches. * one patch to correct aarch64 TARGET_LEGITIMIZE_ADDRESS hook to generate more efficient rtl pattern for array indexed address. * one patch to re-shuffle rtl pattern to let rtl pre pass catch more ivopt opportunities. For targets except AArch64, patch 2 alone is enough to let ivopt happen in rtl pre pass. While for AArch64, we need to correct TARGET_LEGITIMIZE_ADDRESS first, to don't split valid "offset" into "magic-base + offset" which make the rtl sequences unnecessarily complexer and cause troubles for later rtl pass. but this fix on AArch64 TARGET_LEGITIMIZE_ADDRESS hook then exposed another issue on tree-ssa-loop-ivopt. When running benchmarks, I found some post-index addressing opportunities are missing after we correct TARGET_LEGITIMIZE_ADDRESS. The correct of TARGET_LEGITIMIZE_ADDRESS will actually correct address cost which tree-ssa-loop-ivopt will query, and looks like there are glitches in tree-ssa-loop-ivopt cost model that gcc ivopt some simple post-index addressable variable, and thus generated sub-optimal code. And the patch for re-shuffle rtl pattern alone, https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00355.html, has critical problem. It's unsafe to re-use those REG_NOTE info in those places. So the patch needs re-work, or we need to investigate a better solution to fix this on rtl level. Future work =========== I hope someone could have a looks at the tree level fix which is the ideal fix. For RTL level fix, now the issue dependent chain for my solution becomes: fix-on-rtl-level \ v re-shuffle insn for rtl pre \ v fix AArch64 TARGET_LEGITIMIZE_ADDRESS (B) \ v fix tree-ssa-loop-ivopt cost model to make better decision between ivopt and post-index. (A) I think task A and B are important as they are quite general improvements to AArch64 and gcc generic code besides this testcase issue. I'd prepare a testcase and create a seperate ticket for task A first. And, I don't think we can fix this issue in 5.0 release, so move the target to the next release? Thanks.
next prev parent reply other threads:[~2015-01-23 17:33 UTC|newest] Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top 2014-08-18 16:12 [Bug target/62173] New: [AArch64] Performance regression due to r213488 spop at gcc dot gnu.org 2014-08-18 16:39 ` [Bug target/62173] " pinskia at gcc dot gnu.org 2014-08-18 19:13 ` spop at gcc dot gnu.org 2014-08-19 1:37 ` amker at gcc dot gnu.org 2014-10-28 11:28 ` [Bug target/62173] [5.0 regression] " jiwang at gcc dot gnu.org 2014-11-14 9:37 ` jiwang at gcc dot gnu.org 2014-11-17 2:14 ` amker.cheng at gmail dot com 2014-11-24 12:15 ` jiwang at gcc dot gnu.org 2014-11-24 12:38 ` jiwang at gcc dot gnu.org 2014-11-24 13:06 ` rguenth at gcc dot gnu.org 2014-11-24 23:01 ` jiwang at gcc dot gnu.org 2014-11-26 10:54 ` [Bug target/62173] [5.0 regression] [AArch64] Can't ivopt array base address while ARM can jiwang at gcc dot gnu.org 2014-11-27 9:35 ` jiwang at gcc dot gnu.org 2014-11-27 12:00 ` [Bug tree-optimization/62173] [5.0 regression] 64bit Arch can't ivopt while 32bit Arch can rguenther at suse dot de 2014-11-27 12:16 ` rguenther at suse dot de 2014-11-27 13:34 ` jiwang at gcc dot gnu.org 2015-01-23 17:33 ` jiwang at gcc dot gnu.org [this message] 2015-01-26 10:30 ` rguenth at gcc dot gnu.org 2015-01-26 11:10 ` rguenth at gcc dot gnu.org 2015-01-26 13:48 ` ramana at gcc dot gnu.org 2015-01-26 14:19 ` amker at gcc dot gnu.org 2015-01-26 14:51 ` rguenther at suse dot de 2015-01-26 14:53 ` rguenther at suse dot de 2015-01-26 15:03 ` amker at gcc dot gnu.org 2015-01-26 15:38 ` jiwang at gcc dot gnu.org 2015-01-27 3:21 ` amker at gcc dot gnu.org 2015-01-27 7:56 ` amker at gcc dot gnu.org 2015-01-27 9:11 ` rguenther at suse dot de 2015-01-28 18:26 ` LpSolit at netscape dot net 2015-01-29 6:48 ` amker at gcc dot gnu.org 2015-01-30 6:42 ` amker at gcc dot gnu.org 2015-01-30 12:32 ` rguenth at gcc dot gnu.org 2015-02-05 7:27 ` amker at gcc dot gnu.org 2015-03-11 17:30 ` [Bug tree-optimization/62173] [5 Regression] " jiwang at gcc dot gnu.org 2015-03-11 17:46 ` jiwang at gcc dot gnu.org 2015-03-11 17:52 ` [Bug tree-optimization/62173] [5/6 " jakub at gcc dot gnu.org 2015-03-13 8:34 ` amker at gcc dot gnu.org 2015-06-02 3:34 ` amker at gcc dot gnu.org 2015-06-03 3:56 ` amker at gcc dot gnu.org 2015-07-22 11:44 ` jiwang at gcc dot gnu.org
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=bug-62173-4-W8uB3Cwmg7@http.gcc.gnu.org/bugzilla/ \ --to=gcc-bugzilla@gcc.gnu.org \ --cc=gcc-bugs@gcc.gnu.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).