[Bug tree-optimization/62173] [5.0 regression] 64bit Arch can't ivopt while 32bit Arch can

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

From: "jiwang at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/62173] [5.0 regression] 64bit Arch can't ivopt while 32bit Arch can
Date: Fri, 23 Jan 2015 17:33:00 -0000	[thread overview]
Message-ID: <bug-62173-4-W8uB3Cwmg7@http.gcc.gnu.org/bugzilla/> (raw)
In-Reply-To: <bug-62173-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62173

--- Comment #16 from Jiong Wang <jiwang at gcc dot gnu.org> ---
After some work on this issue, things have gone beyond my expectations. To
summarize my understanding of this issue and what I have got:

Reason of regression
======
  the testcase contains a loop which is hotcode, and there is one critical 
loop invariant, actually base address of local array, if it can't be ivopted
then there will be big performance regression on this testcase.

  because of one gcc tree level ivopt issue, this variable will not be 
ivopted on 64bit machine, mips, powerpc, aarch64 etc.

  mostly because there some type precision issues in gcc tree level code that 
gcc is doing too conservative decision on overflow checking and thus abort the
ivopt in early stage. For details, please see above discussion.

  Then why there is regression on AArch64 since

     r213488 "[AArch64] Improve TARGET_LEGITIMIZE_ADDRESS_P hook" ?

  it's because except tree level ivopt pass, gcc also do extra ivopt on
rtl level by the rtl pre pass. previously, we have inefficient implementation
on aarch64 TARGET_LEGITIMIZE_ADDRESS_P while this inefficient implementation
happen to generated some instruction pattern that rtl level ivopt could catch 
and that critical variable finally ivopted by rtl pass.

  r213488 fixed the inefficiency of TARGET_LEGITIMIZE_ADDRESS_P, and as a
side-effect, we will not generate that specific instruction pattern again, thus
no ivopt on rtl pre, and thus regression happen.

  before r213488, ivopt happen on AArch64 only, while after r213488, AArch64
act the same as all other 64bit targets, mips64, powerpc64, sparc64 etc.

To Fix
======
  * the idea way is fix this issue on tree-ssa-loop-ivopt pass.
    We need to correct type precision & overflow checking issue, then gcc could
    ivopt the critical variable as early as in tree level for all 64bit
targets.

  * gcc rtl level ivopt could be improved also. we could re-shuffle rtl
    pattern, then rtl pre pass could catch more ivopt opportunities and
    act as a supplement to tree ivopt.

What I have tried
=================
  I have tried to fix this issue on rtl level. We need two patches.

    * one patch to correct aarch64 TARGET_LEGITIMIZE_ADDRESS hook to generate
      more efficient rtl pattern for array indexed address.

    * one patch to re-shuffle rtl pattern to let rtl pre pass catch more ivopt
      opportunities.

  For targets except AArch64, patch 2 alone is enough to let ivopt happen in
rtl
pre pass.

  While for AArch64, we need to correct TARGET_LEGITIMIZE_ADDRESS first, to   
don't split valid "offset" into "magic-base + offset" which make the rtl
sequences unnecessarily complexer and cause troubles for later rtl pass.

  but this fix on AArch64 TARGET_LEGITIMIZE_ADDRESS hook then exposed another
issue on tree-ssa-loop-ivopt. When running benchmarks, I found some post-index
addressing opportunities are missing after we correct
TARGET_LEGITIMIZE_ADDRESS.
The correct of TARGET_LEGITIMIZE_ADDRESS will actually correct address cost
which tree-ssa-loop-ivopt will query, and looks like there are glitches in
tree-ssa-loop-ivopt cost model that gcc ivopt some simple post-index
addressable
variable, and thus generated sub-optimal code.

  And the patch for re-shuffle rtl pattern alone,
https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00355.html, has critical problem.
It's unsafe to re-use those REG_NOTE info in those places. So the patch needs
re-work, or we need to
investigate a better solution to fix this on rtl level.

Future work
===========
  I hope someone could have a looks at the tree level fix which is the ideal 
fix.

  For RTL level fix, now the issue dependent chain for my solution becomes:

    fix-on-rtl-level
      \
       v
      re-shuffle insn for rtl pre
        \
         v
        fix AArch64 TARGET_LEGITIMIZE_ADDRESS (B)
          \
           v
          fix tree-ssa-loop-ivopt cost model to
          make better decision between ivopt
          and post-index. (A)

I think task A and B are important as they are quite general improvements to
AArch64 and gcc generic code besides this testcase issue.

I'd prepare a testcase and create a seperate ticket for task A first.

And, I don't think we can fix this issue in 5.0 release, so move the target to
the next release?

Thanks.

next prev parent reply	other threads:[~2015-01-23 17:33 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-18 16:12 [Bug target/62173] New: [AArch64] Performance regression due to r213488 spop at gcc dot gnu.org
2014-08-18 16:39 ` [Bug target/62173] " pinskia at gcc dot gnu.org
2014-08-18 19:13 ` spop at gcc dot gnu.org
2014-08-19  1:37 ` amker at gcc dot gnu.org
2014-10-28 11:28 ` [Bug target/62173] [5.0 regression] " jiwang at gcc dot gnu.org
2014-11-14  9:37 ` jiwang at gcc dot gnu.org
2014-11-17  2:14 ` amker.cheng at gmail dot com
2014-11-24 12:15 ` jiwang at gcc dot gnu.org
2014-11-24 12:38 ` jiwang at gcc dot gnu.org
2014-11-24 13:06 ` rguenth at gcc dot gnu.org
2014-11-24 23:01 ` jiwang at gcc dot gnu.org
2014-11-26 10:54 ` [Bug target/62173] [5.0 regression] [AArch64] Can't ivopt array base address while ARM can jiwang at gcc dot gnu.org
2014-11-27  9:35 ` jiwang at gcc dot gnu.org
2014-11-27 12:00 ` [Bug tree-optimization/62173] [5.0 regression] 64bit Arch can't ivopt while 32bit Arch can rguenther at suse dot de
2014-11-27 12:16 ` rguenther at suse dot de
2014-11-27 13:34 ` jiwang at gcc dot gnu.org
2015-01-23 17:33 ` jiwang at gcc dot gnu.org [this message]
2015-01-26 10:30 ` rguenth at gcc dot gnu.org
2015-01-26 11:10 ` rguenth at gcc dot gnu.org
2015-01-26 13:48 ` ramana at gcc dot gnu.org
2015-01-26 14:19 ` amker at gcc dot gnu.org
2015-01-26 14:51 ` rguenther at suse dot de
2015-01-26 14:53 ` rguenther at suse dot de
2015-01-26 15:03 ` amker at gcc dot gnu.org
2015-01-26 15:38 ` jiwang at gcc dot gnu.org
2015-01-27  3:21 ` amker at gcc dot gnu.org
2015-01-27  7:56 ` amker at gcc dot gnu.org
2015-01-27  9:11 ` rguenther at suse dot de
2015-01-28 18:26 ` LpSolit at netscape dot net
2015-01-29  6:48 ` amker at gcc dot gnu.org
2015-01-30  6:42 ` amker at gcc dot gnu.org
2015-01-30 12:32 ` rguenth at gcc dot gnu.org
2015-02-05  7:27 ` amker at gcc dot gnu.org
2015-03-11 17:30 ` [Bug tree-optimization/62173] [5 Regression] " jiwang at gcc dot gnu.org
2015-03-11 17:46 ` jiwang at gcc dot gnu.org
2015-03-11 17:52 ` [Bug tree-optimization/62173] [5/6 " jakub at gcc dot gnu.org
2015-03-13  8:34 ` amker at gcc dot gnu.org
2015-06-02  3:34 ` amker at gcc dot gnu.org
2015-06-03  3:56 ` amker at gcc dot gnu.org
2015-07-22 11:44 ` jiwang at gcc dot gnu.org

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-62173-4-W8uB3Cwmg7@http.gcc.gnu.org/bugzilla/ \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).