public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/94373] New: 548.exchange2_r run time is 7-12% worse than GCC 9 at -O2 and generic march/mtune
@ 2020-03-27 21:49 jamborm at gcc dot gnu.org
  2020-03-27 22:06 ` [Bug target/94373] " pinskia at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: jamborm at gcc dot gnu.org @ 2020-03-27 21:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94373

            Bug ID: 94373
           Summary: 548.exchange2_r run time is 7-12% worse than GCC 9 at
                    -O2 and generic march/mtune
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
                CC: rguenth at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-linux
            Target: x86_64-linux

When compiled with GCC 10 at just -O2, the SPEC 2017 INTrate benchmark
548.exchange2_r runs slower than when compiled with GCC 9.2. It is:

-  8% slower on AMD Zen2-based server CPU (rev. 26b3e568a60)
- 12% slower on Intel Cascade Lake server CPU (rev. abe13e1847f)
-  7% slower on AMD Zen1-based server CPU (rev. 26b3e568a60)

During the GCC 10 development cycle the benchmark was relatively noisy
and the run time was increasing in many small steps, but between
October 7 and November 15 we were doing 3% better than GCC 9 (on Zen2).
Specifically, the following commit brought about the improvement:

  commit 806bdf4e40d31cf55744c876eb9f17654de36b99
  Author: Richard Biener <rguenther@suse.de>
  Date:   Mon Oct 7 07:53:45 2019 +0000

    re PR tree-optimization/91975 (worse code for small array copy using
    pointer arithmetic than array indexing)

    2019-10-07  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/91975
            * tree-ssa-loop-ivcanon.c (constant_after_peeling): Consistently
            handle invariants.

    From-SVN: r276645

But it was undone by its revert:

  commit f0af4848ac40d2342743c9b16416310d61db85b5
  Author: Richard Biener <rguenther@suse.de>
  Date:   Fri Nov 15 09:09:16 2019 +0000

    re PR tree-optimization/92039 (Spurious -Warray-bounds warnings building
    32-bit glibc)

    2019-11-15  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/92039
            PR tree-optimization/91975
            * tree-ssa-loop-ivcanon.c (constant_after_peeling): Revert
            previous change, treat invariants consistently as non-constant.
            (tree_estimate_loop_size): Ternary ops with just the first op
            constant are not optimized away.

            * gcc.dg/tree-ssa/cunroll-2.c: Revert to state previous to
            unroller adjustment.
            * g++.dg/tree-ssa/ivopts-3.C: Likewise.

    From-SVN: r278281

On the Intel machine, reverting the revert fixes the regression too.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)


* [Bug target/94373] 548.exchange2_r run time is 7-12% worse than GCC 9 at -O2 and generic march/mtune
From: pinskia at gcc dot gnu.org @ 2020-03-27 22:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94373

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
          Component|tree-optimization           |target

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Seems induction-variable (IV) related and most likely a target cost model
issue too.

* [Bug target/94373] 548.exchange2_r run time is 7-12% worse than GCC 9 at -O2 and generic march/mtune
From: crazylht at gmail dot com @ 2020-03-30  5:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94373

--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
I think changing lea_cost from 2 to 1 for skylake can fix this regression.

Since it's stage 4 now, I am holding my patch.
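
To illustrate why a single per-instruction cost can matter, here is a toy
sketch (not GCC's actual cost model): each candidate code sequence is scored
by summing per-instruction costs, and the cheapest one wins. The instruction
mixes, the cost of `add` and the function names are assumptions made up for
illustration; only the lea cost values 2 and 1 come from the comment above.

```python
# Toy model only: GCC's real induction-variable and instruction cost logic
# is far more involved. All instruction counts below are hypothetical.

def sequence_cost(counts, costs):
    """Sum per-instruction costs for one candidate sequence."""
    return sum(costs[insn] * n for insn, n in counts.items())


def pick_cheapest(candidates, costs):
    """Return the name of the candidate with the lowest total cost."""
    return min(candidates, key=lambda name: sequence_cost(candidates[name], costs))


# Two hypothetical ways to compute the same addresses in a loop body.
candidates = {
    "uses_lea": {"lea": 2, "add": 1},
    "avoids_lea": {"add": 4},
}

with_lea_cost_2 = pick_cheapest(candidates, {"lea": 2, "add": 1})
with_lea_cost_1 = pick_cheapest(candidates, {"lea": 1, "add": 1})
print(with_lea_cost_2, with_lea_cost_1)
```

In this made-up setup the choice flips from the lea-free sequence to the
lea-based one when the lea cost drops from 2 to 1, which is the kind of
effect a cost-table tweak can have on code selection.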

* [Bug target/94373] 548.exchange2_r run time is 7-12% worse than GCC 9 at -O2 and generic march/mtune
From: crazylht at gmail dot com @ 2020-03-30  6:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94373

--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #2)
> I think changing lea_cost from 2 to 1 for skylake can fix this regression.
> 
> Since it's stage 4 now, I am holding my patch.

To clarify: that is for -O2 -mtune=skylake-avx512.

I am not sure what causes the regression for -O2 -mtune=generic.

* [Bug target/94373] 548.exchange2_r run time is 7-12% worse than GCC 9 at -O2 and generic march/mtune
From: rguenth at gcc dot gnu.org @ 2020-03-30  7:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94373

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Note the cited commit simply caused more complete unrolling to happen.  Too
much, actually, which is why I reverted it.  Note that GCC 9.2 does not have
that extra unrolling, so the difference must be something else in the end.
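
As a reminder of what complete unrolling means, the sketch below shows a loop
with a small constant trip count next to its fully unrolled form. It is a
hand-written Python illustration with made-up function names, not GCC output,
and it says nothing about the heuristics in tree-ssa-loop-ivcanon.c.

```python
def sum_first_four_loop(values):
    """Loop whose trip count is known to be 4."""
    total = 0
    for i in range(4):
        total += values[i]
    return total


def sum_first_four_unrolled(values):
    """Same computation after complete unrolling: no loop, no counter."""
    total = 0
    total += values[0]
    total += values[1]
    total += values[2]
    total += values[3]
    return total


data = [3, 1, 4, 1, 5]
print(sum_first_four_loop(data), sum_first_four_unrolled(data))
```

The unrolled form drops the loop control at the price of a larger body, so
applying it too eagerly can grow code size without a matching speedup.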

* [Bug target/94373] 548.exchange2_r run time is 16-35% worse than GCC 9 at -O2 and generic march/mtune
From: jamborm at gcc dot gnu.org @ 2021-02-04 17:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94373

Martin Jambor <jamborm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782
            Summary|548.exchange2_r run time is |548.exchange2_r run time is
                   |7-12% worse than GCC 9 at   |16-35% worse than GCC 9 at
                   |-O2 and generic march/mtune |-O2 and generic march/mtune
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-02-04

--- Comment #5 from Martin Jambor <jamborm at gcc dot gnu.org> ---
The regression is confirmed. It is now 35% on Cascade Lake, 20% on Zen3
and 16% on Zen2 against GCC 9 (19%, 12% and 8% respectively against
GCC 10).
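
The two sets of percentages above can be combined to estimate how much slower
GCC 10 itself was than GCC 9 on each machine. This is a back-of-the-envelope
sketch, assuming that "N% worse" means the run time is multiplied by
(1 + N/100); the function and variable names are made up for illustration.

```python
def implied_slowdown(vs_old_pct, vs_new_pct):
    """Slowdown of the newer baseline relative to the older one, in percent.

    Assumes "N% worse" means run time multiplied by (1 + N/100).
    """
    ratio = (1 + vs_old_pct / 100) / (1 + vs_new_pct / 100)
    return (ratio - 1) * 100


# (against GCC 9, against GCC 10), as quoted in comment #5.
reported = {
    "Cascadelake": (35, 19),
    "Zen3": (20, 12),
    "Zen2": (16, 8),
}

for cpu, (vs_gcc9, vs_gcc10) in reported.items():
    pct = implied_slowdown(vs_gcc9, vs_gcc10)
    print(f"{cpu}: GCC 10 about {pct:.1f}% slower than GCC 9")
```

Under that assumption the Cascade Lake and Zen2 figures come out at roughly
13% and 7%, in the same range as the 12% and 8% from the original report.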

LNT sees it too:
  - on znver2 (though with LTO):
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=278.407.0&plot.1=298.407.0&
  - on znver1:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=303.407.0&plot.1=31.407.0&
  - on a Kabylake:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=206.407.0&plot.1=30.407.0&

I wonder whether the register allocation (RA) issue in PR 98782 might
also be related, although it mostly focuses on -O3.

* [Bug target/94373] 548.exchange2_r run time is 16-35% worse than GCC 9 at -O2 and generic march/mtune
From: jamborm at gcc dot gnu.org @ 2023-01-18 16:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94373

Martin Jambor <jamborm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #6 from Martin Jambor <jamborm at gcc dot gnu.org> ---
(In reply to Martin Jambor from comment #5)
> 
> LNT sees it too:
>   - on znver2 (though with LTO):
> https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=278.407.0&plot.1=298.407.0&

As can be seen in the above link, this was fixed about a year ago
and has even been improved on since.
