public inbox for gcc-bugs@sourceware.org
From: "solar-gcc at openwall dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/51017] GCC 4.6 performance regression (vs. 4.4/4.5), PRE increases register pressure
Date: Wed, 18 Feb 2015 00:03:00 -0000
Message-ID: <bug-51017-4-h0lgJvXWAZ@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-51017-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017

--- Comment #17 from Alexander Peslyak <solar-gcc at openwall dot com> ---
(In reply to Richard Biener from comment #16)
> I'm completely confused now as to what the original regression was reported
> against.

I'm sorry, I should have re-read my original description of the regression
before I wrote comment 13.  Together, these are indeed confusing.

> I thought it was the default options in the Makefile, -O2
> -fomit-frame-pointer, which showed the regression and you found -Os would
> mitigate it somewhat (and I more specifically told you it is -fno-tree-pre
> that makes the actual difference).

That's one of the regressions I mentioned in the original description.  Yes,
you identified -fno-tree-pre as the component of -Os that makes the difference
- thank you!  However, I also mentioned in the original description that a
bigger regression with 4.6+ vs. 4.5 and 4.4 remained despite -Os, and I had
no similar workaround for it at the time (but enabling -fopenmp made it go
away, perhaps due to changes to declarations in the source code in #ifdef
_OPENMP blocks).  I think we can now say that this bigger 4.6+ regression was
primarily caused by the unaligned load instructions.  So two regressions are
now explained, and the remaining slowdown (not investigated yet) vs. 4.1 to 4.3
(which worked best) is only 6% to 10% in recent versions (9% in 4.9.2).

> So - what options give good results with old compilers but bad results with
> new compilers?

On CPUs where movups/movdqu are slower than their aligned counterparts (even
for addresses that happen to be aligned), any sane optimization options give
worse results with 4.6+ than with pre-4.6 and the same options.  As you say,
this can be fixed in the source code (and I most likely will fix it there), but
I think many other programs may experience similar slowdowns, so maybe GCC
should do something about this too.
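
For illustration only, here is a minimal sketch of what a source-level fix
could look like.  This is not the JtR code: the names vec4, vec4_u,
xor_aligned and xor_unaligned are hypothetical, and the sketch assumes GCC's
vector extensions.  The idea is that the alignment the compiler may assume
comes from the type used for the access.  On x86 with SSE2, I'd expect the
first function to be eligible for movdqa and the second to get movdqu, but
the actual instruction choice depends on the GCC version and options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types: a 16-byte vector of four 32-bit words.
 * vec4 has the natural 16-byte alignment; vec4_u lowers the assumed
 * alignment to 4 bytes, so accesses through it must be treated as
 * potentially unaligned.  may_alias permits access to uint32_t arrays. */
typedef uint32_t vec4 __attribute__((vector_size(16), may_alias));
typedef uint32_t vec4_u __attribute__((vector_size(16), aligned(4), may_alias));

/* XOR nvec 16-byte vectors of src into dst.
 * Both pointers must be 16-byte aligned (checked by the assert). */
void xor_aligned(uint32_t *dst, const uint32_t *src, size_t nvec)
{
    assert(((uintptr_t)dst & 15) == 0 && ((uintptr_t)src & 15) == 0);
    vec4 *d = (vec4 *)dst;
    const vec4 *s = (const vec4 *)src;
    for (size_t i = 0; i < nvec; i++)
        d[i] ^= s[i];
}

/* Same operation, but only 4-byte alignment is promised to the compiler. */
void xor_unaligned(uint32_t *dst, const uint32_t *src, size_t nvec)
{
    vec4_u *d = (vec4_u *)dst;
    const vec4_u *s = (const vec4_u *)src;
    for (size_t i = 0; i < nvec; i++)
        d[i] ^= s[i];
}
```

A caller would declare its buffers with __attribute__((aligned(16))) before
using xor_aligned; comparing the output of gcc -O2 -S for the two functions
shows which load instructions a given compiler version picks.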

Other than that, either -Os or -fno-tree-pre works around the second worst
slowdown seen in 4.6+.

To avoid confusion, maybe this bug should focus on one of the three
regressions?  Should we keep it for PRE only?

Should we create a new bug for the unnecessary and non-optional use of
unaligned load instructions for source code like this, or is this considered
the new intended behavior despite the major slowdown on such CPUs?
(Presumably not only for JtR.  I'd expect this to affect many programs.)

Should we also create a bug for investigating the remaining slowdown of 9% in
4.9.2 (vs. 4.1 to 4.3), or is it considered too minor to bother?

Thank you!

