public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark
@ 2022-10-26  9:05 rvmallad at amazon dot com
  2022-10-26  9:20 ` [Bug tree-optimization/107409] " rvmallad at amazon dot com
                   ` (23 more replies)
  0 siblings, 24 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2022-10-26  9:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

            Bug ID: 107409
           Summary: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rvmallad at amazon dot com
  Target Milestone: ---

Created attachment 53773
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53773&action=edit
Input and source files.

Below is perf data from running the 519.lbm_r benchmark on aarch64 (Graviton 3
processor). I compare the baseline (mainline commit ID:
f56d48b2471c388401174029324e1f4c4b84fcdb) against a build with the code change
from commit ID a9a4edf0e71bbac9f1b5dcecdcf9250111d16889 reverted.

Steps to compile:
$ gcc -std=c99 -mabi=lp64 -g -Ofast -mcpu=native lbm.i main.i -lm -flto -o 519_lbm_r_base

$ time ./519_lbm_r_base 3000 reference.dat 0 0 100_100_130_ldc.of
real    2m50.946s

With the code change in commit ID a9a4edf0e71bbac9f1b5dcecdcf9250111d16889
reverted:
$ time ./519_lbm_r_fix 3000 reference.dat 0 0 100_100_130_ldc.of
real    2m42.091s
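
As a quick cross-check of the "~5%" in the summary, the two wall-clock times
above can be compared with a few lines of Python. This is only a sketch; the
helper name `to_seconds` is made up for illustration and is not part of the
report or of any tool used here.

```python
def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '2m50.946s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


baseline = to_seconds("2m50.946s")  # mainline build (519_lbm_r_base)
reverted = to_seconds("2m42.091s")  # build with the commit reverted (519_lbm_r_fix)

saved = baseline - reverted
loss_pct = 100.0 * saved / baseline  # share of baseline runtime saved by the revert

print(f"baseline: {baseline:.3f}s, reverted: {reverted:.3f}s")
print(f"saved by the revert: {saved:.3f}s ({loss_pct:.2f}% of baseline)")
```

The result is a little over 5% of the baseline runtime, which matches the
figure in the bug summary.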

The code change reverted was in the following file:
* tree-cfg.c (execute_fixup_cfg): Update also max_bb_count when scaling happen.

Author: Jan Hubicka <hubicka@ucw.cz>
Date:   Sat Nov 30 22:25:24 2019 +0100

Please find attached the files to reproduce this issue and the fix.


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
@ 2022-10-26  9:20 ` rvmallad at amazon dot com
  2022-10-27  8:08 ` marxin at gcc dot gnu.org
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2022-10-26  9:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #1 from Rama Malladi <rvmallad at amazon dot com> ---
$ /home/ubuntu/gccfixissue1/bin/gcc  -v
Using built-in specs.
COLLECT_GCC=/home/ubuntu/gccfixissue1/bin/gcc
COLLECT_LTO_WRAPPER=/home/ubuntu/gccfixissue1/libexec/gcc/aarch64-unknown-linux-gnu/13.0.0/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: ../configure --prefix=/home/ubuntu/gccfixissue1
--enable-languages=c,fortran
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.0.0 20221021 (experimental) (GCC)


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
  2022-10-26  9:20 ` [Bug tree-optimization/107409] " rvmallad at amazon dot com
@ 2022-10-27  8:08 ` marxin at gcc dot gnu.org
  2022-10-27  8:13 ` marxin at gcc dot gnu.org
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: marxin at gcc dot gnu.org @ 2022-10-27  8:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org
             Blocks|                            |26163

--- Comment #2 from Martin Liška <marxin at gcc dot gnu.org> ---
(In reply to Rama Malladi from comment #0)
> Created attachment 53773 [details]
> Input and source files.
> 

Note that you should not publicly share the source files of the entire SPEC
benchmark. That very likely violates the SPEC license rules!


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
  2022-10-26  9:20 ` [Bug tree-optimization/107409] " rvmallad at amazon dot com
  2022-10-27  8:08 ` marxin at gcc dot gnu.org
@ 2022-10-27  8:13 ` marxin at gcc dot gnu.org
  2022-10-27  8:15 ` rvmallad at amazon dot com
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: marxin at gcc dot gnu.org @ 2022-10-27  8:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #3 from Martin Liška <marxin at gcc dot gnu.org> ---
Can you please share a perf profile from before and after the revision?

Note that I can't see the regression on an Altra aarch64 CPU:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=633.477.0&plot.1=683.477.0&plot.2=664.477.0&plot.3=648.477.0&plot.4=618.477.0&plot.5=605.477.0&plot.6=759.477.0&plot.7=584.477.0&

However, there are huge changes between GCC 6/7 and newer releases. Note that
the benchmark is pretty small and very sensitive to instruction caches.


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (2 preceding siblings ...)
  2022-10-27  8:13 ` marxin at gcc dot gnu.org
@ 2022-10-27  8:15 ` rvmallad at amazon dot com
  2022-10-27  8:28 ` marxin at gcc dot gnu.org
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2022-10-27  8:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #4 from Rama Malladi <rvmallad at amazon dot com> ---
Hi Martin,
Thanks for the guidance. Can we delete the attachment from this bug report?

Regards,
Rama


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (3 preceding siblings ...)
  2022-10-27  8:15 ` rvmallad at amazon dot com
@ 2022-10-27  8:28 ` marxin at gcc dot gnu.org
  2022-10-27 12:08 ` rvmallad at amazon dot com
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: marxin at gcc dot gnu.org @ 2022-10-27  8:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #5 from Martin Liška <marxin at gcc dot gnu.org> ---
Please try writing here: overseers@sourceware.org


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (4 preceding siblings ...)
  2022-10-27  8:28 ` marxin at gcc dot gnu.org
@ 2022-10-27 12:08 ` rvmallad at amazon dot com
  2022-10-27 12:18 ` mark at gcc dot gnu.org
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2022-10-27 12:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #6 from Rama Malladi <rvmallad at amazon dot com> ---
(In reply to Martin Liška from comment #5)
> Please try writing here: overseers@sourceware.org

I have asked for deletion. Thanks


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (5 preceding siblings ...)
  2022-10-27 12:08 ` rvmallad at amazon dot com
@ 2022-10-27 12:18 ` mark at gcc dot gnu.org
  2022-10-27 12:21 ` rvmallad at amazon dot com
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mark at gcc dot gnu.org @ 2022-10-27 12:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #7 from Mark Wielaard <mark at gcc dot gnu.org> ---
The content of attachment 53773 has been deleted for the following reason:

https://sourceware.org/pipermail/overseers/2022q4/019048.html


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (6 preceding siblings ...)
  2022-10-27 12:18 ` mark at gcc dot gnu.org
@ 2022-10-27 12:21 ` rvmallad at amazon dot com
  2022-12-01  6:54 ` [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba rvmallad at amazon dot com
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2022-10-27 12:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #8 from Rama Malladi <rvmallad at amazon dot com> ---
(In reply to Mark Wielaard from comment #7)
> The content of attachment 53773 [details] has been deleted for the following
> reason:
> 
> https://sourceware.org/pipermail/overseers/2022q4/019048.html

Thank you.


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (7 preceding siblings ...)
  2022-10-27 12:21 ` rvmallad at amazon dot com
@ 2022-12-01  6:54 ` rvmallad at amazon dot com
  2022-12-01 10:09 ` marxin at gcc dot gnu.org
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2022-12-01  6:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #9 from Rama Malladi <rvmallad at amazon dot com> ---
(In reply to Martin Liška from comment #3)
> Can you please share perf-profile before and after the revision?
> 
> Note I can't see it for Altra aarch64 CPU:
> https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=633.477.0&plot.1=683.477.0&plot.2=664.477.0&plot.3=648.477.0&plot.4=618.477.0&plot.5=605.477.0&plot.6=759.477.0&plot.7=584.477.0&
> 
> However, there are huge changes in between GCC 6/7 and a newer releases.
> Note the benchmark is pretty small and very sensitive to instruction caches.

Hi, I collected IPC data for the baseline compiler and for the same compiler
with this patch reverted.

This is on a Graviton3 machine, running a 1-copy rate run of 519.lbm_r.

Baseline: Compiler commit ID: f896c13489d22b30d01257bc8316ab97b3359d1c
Cycles:            148,489,372,938
Instructions:      382,748,379,257
IPC:               2.58

Baseline with the code change in a9a4edf0e71bbac9f1b5dcecdcf9250111d16889
reverted:

$ git diff gcc/tree-cfg.cc
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index d982988048f..736432713fe 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -9984,7 +9984,7 @@ execute_fixup_cfg (void)
     }
   if (scale)
     {
-      update_max_bb_count ();
+//      update_max_bb_count ();
       compute_function_frequency ();
     }

Cycles:            140,937,228,769
Instructions:      368,881,714,982
IPC:               2.62

From the above, the code generated by the baseline compiler executes more
instructions than the code generated with the patch reverted. Can you please
look into the code-gen and let me know if you see a perf opportunity in
reverting this patch? Thank you.
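
For readers who want to re-derive the numbers, the IPC values and the relative
deltas follow directly from the counters quoted above. This is a minimal sketch
in Python; the dictionary layout and the helper names `ipc` and `pct_drop` are
assumptions made for illustration only.

```python
# Counter values copied from the two runs quoted above.
runs = {
    "baseline": {"cycles": 148_489_372_938, "instructions": 382_748_379_257},
    "reverted": {"cycles": 140_937_228_769, "instructions": 368_881_714_982},
}


def ipc(run: dict) -> float:
    """Instructions retired per cycle."""
    return run["instructions"] / run["cycles"]


def pct_drop(before: int, after: int) -> float:
    """Reduction from `before` to `after`, as a percentage of `before`."""
    return 100.0 * (before - after) / before


cycles_drop = pct_drop(runs["baseline"]["cycles"], runs["reverted"]["cycles"])
instr_drop = pct_drop(runs["baseline"]["instructions"],
                      runs["reverted"]["instructions"])

for name, run in runs.items():
    print(f"{name}: IPC = {ipc(run):.2f}")
print(f"cycles saved by the revert:       {cycles_drop:.2f}%")
print(f"instructions saved by the revert: {instr_drop:.2f}%")
```

The revert removes roughly 5.1% of the cycles but only about 3.6% of the
instructions, which is why the IPC rises slightly from 2.58 to 2.62.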


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (8 preceding siblings ...)
  2022-12-01  6:54 ` [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba rvmallad at amazon dot com
@ 2022-12-01 10:09 ` marxin at gcc dot gnu.org
  2022-12-08 10:32 ` rvmallad at amazon dot com
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: marxin at gcc dot gnu.org @ 2022-12-01 10:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-12-01
     Ever confirmed|0                           |1

--- Comment #10 from Martin Liška <marxin at gcc dot gnu.org> ---
@Honza ?


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (9 preceding siblings ...)
  2022-12-01 10:09 ` marxin at gcc dot gnu.org
@ 2022-12-08 10:32 ` rvmallad at amazon dot com
  2022-12-09  9:48 ` rvmallad at amazon dot com
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2022-12-08 10:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #11 from Rama Malladi <rvmallad at amazon dot com> ---
(In reply to Martin Liška from comment #10)
> @Honza ?

Just checking whether this can be fixed/implemented. Thanks.


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (10 preceding siblings ...)
  2022-12-08 10:32 ` rvmallad at amazon dot com
@ 2022-12-09  9:48 ` rvmallad at amazon dot com
  2022-12-09 10:05 ` marxin at gcc dot gnu.org
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2022-12-09  9:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #12 from Rama Malladi <rvmallad at amazon dot com> ---
I found differences in the dumps at various stages of compilation between
mainline GCC and a build with update_max_bb_count() commented out. Here are the
details:

Mainline: Commit ID: 63a42ffc0833553fbcb84b50cf0fd2d867b8a92f

The dumps differed for these 2 stages:
"einline" and "earlydebug"

Since we use LTO for this build of 519.lbm_r, I also found differences in
these stages of the link-time optimizer:
"vect", "slp1", "ivopts", "earlydebug", "debug"

Also, this perf drop of 5%-6% with update_max_bb_count() enabled was observed
only on ARM64 instances (Graviton3) and not on x86_64 instances (Intel Xeon).

I ran the other SPEC cpu2017_fprate benchmarks on ARM64 with this call
commented out on GCC mainline and haven't observed any perf regression. So,
this may be worth a fix.

Thank you.


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (11 preceding siblings ...)
  2022-12-09  9:48 ` rvmallad at amazon dot com
@ 2022-12-09 10:05 ` marxin at gcc dot gnu.org
  2022-12-12  9:48 ` rvmallad at amazon dot com
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: marxin at gcc dot gnu.org @ 2022-12-09 10:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #13 from Martin Liška <marxin at gcc dot gnu.org> ---
Note the mentioned revision is a fix and yes, sometimes these revisions can end
up with a regression as profile estimation is a complex guess.


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (12 preceding siblings ...)
  2022-12-09 10:05 ` marxin at gcc dot gnu.org
@ 2022-12-12  9:48 ` rvmallad at amazon dot com
  2023-01-09  4:38 ` rvmallad at amazon dot com
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2022-12-12  9:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #14 from Rama Malladi <rvmallad at amazon dot com> ---
(In reply to Martin Liška from comment #13)
> Note the mentioned revision is a fix and yes, sometimes these revisions can
> end up with a regression as profile estimation is a complex guess.

Yes, possibly. From my understanding, update_max_bb_count() tracks the max
basic block count, which then feeds the decision whether to inline in this
case/run. That is likely why we see a larger instruction count with this
function call enabled.


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (13 preceding siblings ...)
  2022-12-12  9:48 ` rvmallad at amazon dot com
@ 2023-01-09  4:38 ` rvmallad at amazon dot com
  2023-01-09  8:41 ` marxin at gcc dot gnu.org
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2023-01-09  4:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #15 from Rama Malladi <rvmallad at amazon dot com> ---
Hi, can we review this issue and suggest next steps/actions, please? Thanks.


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (14 preceding siblings ...)
  2023-01-09  4:38 ` rvmallad at amazon dot com
@ 2023-01-09  8:41 ` marxin at gcc dot gnu.org
  2023-01-30 18:15 ` jamborm at gcc dot gnu.org
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: marxin at gcc dot gnu.org @ 2023-01-09  8:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #16 from Martin Liška <marxin at gcc dot gnu.org> ---
@Honza: ???


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (15 preceding siblings ...)
  2023-01-09  8:41 ` marxin at gcc dot gnu.org
@ 2023-01-30 18:15 ` jamborm at gcc dot gnu.org
  2023-02-02 21:35 ` spop at gcc dot gnu.org
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: jamborm at gcc dot gnu.org @ 2023-01-30 18:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

Martin Jambor <jamborm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamborm at gcc dot gnu.org

--- Comment #17 from Martin Jambor <jamborm at gcc dot gnu.org> ---
(In reply to Martin Liška from comment #3)
> Note I can't see it for Altra aarch64 CPU:

I think LNT can see it very well and it has appeared around the reported time:

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=759.477.0&plot.1=584.477.0&

But I don't think the commit given in the bug summary is the culprit; this
clearly happened in the GCC 13 development cycle. Rama, what is your reasoning
for suggesting a revert of this particular commit? If you bisected the issue,
can you double check that you arrived at the correct one?


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (16 preceding siblings ...)
  2023-01-30 18:15 ` jamborm at gcc dot gnu.org
@ 2023-02-02 21:35 ` spop at gcc dot gnu.org
  2023-02-03  2:00 ` rvmallad at amazon dot com
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: spop at gcc dot gnu.org @ 2023-02-02 21:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

Sebastian Pop <spop at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |spop at gcc dot gnu.org

--- Comment #18 from Sebastian Pop <spop at gcc dot gnu.org> ---
A new 5% regression happened in gcc-trunk more recently and may be due to
another patch.

Rama was bisecting a 15% perf regression on lbm when updating gcc-7 to gcc-10.
The regression can be seen on the LNT graph link from comment#3 

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=633.477.0&plot.1=683.477.0&plot.2=664.477.0&plot.3=648.477.0&plot.4=618.477.0&plot.5=605.477.0&plot.6=759.477.0&plot.7=584.477.0

gcc-6 has execution time of 213 seconds
gcc-7 is at 215 seconds
gcc-8 is at 266
gcc-9 at 259
gcc-10 at 260

Honza's patch seems to be unrelated as it was committed to trunk before gcc-10
release on May 7, 2020:

commit a9a4edf0e71bbac9f1b5dcecdcf9250111d16889
Author: Jan Hubicka <hubicka@ucw.cz>
Date:   Sat Nov 30 22:25:24 2019 +0100

    Update max_bb_count in execute_fixup_cfg


We need to git-bisect between gcc-7 and gcc-8.
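
For readers unfamiliar with bisection, the search is a binary search for the
first "bad" commit in a linear history. The Python sketch below only
illustrates that idea: the commit names, the per-commit runtimes, and the 240 s
threshold are all hypothetical, and a real bisect would use `git bisect` with a
build-and-time step in place of the lookup table.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history: runtime jumps from 215 s to 266 s at commit "c5".
history = [f"c{i}" for i in range(8)]
runtime = {c: (215.0 if i < 5 else 266.0) for i, c in enumerate(history)}
threshold = 240.0  # hypothetical cut-off between "good" and "bad" runs

culprit = first_bad(history, lambda c: runtime[c] > threshold)
print("first bad commit:", culprit)
```

Each step halves the remaining range, so even a range of several thousand
commits needs only about a dozen build-and-run cycles.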


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (17 preceding siblings ...)
  2023-02-02 21:35 ` spop at gcc dot gnu.org
@ 2023-02-03  2:00 ` rvmallad at amazon dot com
  2023-02-20  3:57 ` rvmallad at amazon dot com
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2023-02-03  2:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #19 from Rama Malladi <rvmallad at amazon dot com> ---
Thanks @Sebastian and @Martin J. I will run another bisect between GCC 7 and
GCC 8.


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (18 preceding siblings ...)
  2023-02-03  2:00 ` rvmallad at amazon dot com
@ 2023-02-20  3:57 ` rvmallad at amazon dot com
  2023-02-24 10:26 ` rvmallad at amazon dot com
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2023-02-20  3:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #20 from Rama Malladi <rvmallad at amazon dot com> ---
@Martin J and @Sebastian P, let me walk you through the perf data and my
triage.

First, my triage has been on Graviton 3 (neoverse-v1) based instances. Next, I
was looking for the perf delta going from gcc-7 to gcc-10. I found 2 issues:
one was reported in 107413
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413) and fixed (the perf delta
between gcc-7 and gcc-8 -- 215s vs. 266s); the other is the issue reported
here.

I did another triage and landed at the same commit that I reported earlier.

# first bad commit: [a9a4edf0e71bbac9f1b5dcecdcf9250111d16889] Update
max_bb_count in execute_fixup_cfg

Please let me know of any further info/studies you would like to see for this
report.

Thank you.


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (19 preceding siblings ...)
  2023-02-20  3:57 ` rvmallad at amazon dot com
@ 2023-02-24 10:26 ` rvmallad at amazon dot com
  2023-03-30  4:56 ` rvmallad at amazon dot com
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2023-02-24 10:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #21 from Rama Malladi <rvmallad at amazon dot com> ---
I did another triage of the perf loss on a Graviton 2 (neoverse-n1) based
instance and found this commit, `a9a4edf0e71bbac9f1b5dcecdcf9250111d16889`, to
be the cause. As I indicated in my earlier reply, I was triaging the perf loss
going from gcc-7 to gcc-10.

The perf of a 519.lbm_r 1-copy run improved 1.08x with the revert of commit
`a9a4edf0e71bbac9f1b5dcecdcf9250111d16889` on gcc-mainline
(`2f1691be517fcdcabae9cd671ab511eb0e08b1d5`).

I am guessing that we don't see it on LNT/Altra CPUs.

So, please look into a fix for this issue. Let me know if you have any queries.
Thanks.


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (20 preceding siblings ...)
  2023-02-24 10:26 ` rvmallad at amazon dot com
@ 2023-03-30  4:56 ` rvmallad at amazon dot com
  2023-03-30  4:58 ` rvmallad at amazon dot com
  2023-03-30  7:52 ` marxin at gcc dot gnu.org
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2023-03-30  4:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #22 from Rama Malladi <rvmallad at amazon dot com> ---
I will close this issue as we were unable to reproduce the perf drop going from
gcc-7 to gcc-8 on a Graviton2 based instance. The performance of 519.lbm_r
built with gcc-7.4 was the same as that with gcc-8.5.


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (21 preceding siblings ...)
  2023-03-30  4:56 ` rvmallad at amazon dot com
@ 2023-03-30  4:58 ` rvmallad at amazon dot com
  2023-03-30  7:52 ` marxin at gcc dot gnu.org
  23 siblings, 0 replies; 25+ messages in thread
From: rvmallad at amazon dot com @ 2023-03-30  4:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #23 from Rama Malladi <rvmallad at amazon dot com> ---
(In reply to Rama Malladi from comment #22)
> I will close this issue as we were unable to reproduce the perf drop going
> from gcc-7 to gcc-8 on a Graviton2 based instance. The performance of
> 519.lbm_r built with gcc-7.4 was same as that with gcc-8.5.

Can someone from the GCC dev/regression team close this issue, as I am unable
to find an option to do so? Thanks


* [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
  2022-10-26  9:05 [Bug tree-optimization/107409] New: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark rvmallad at amazon dot com
                   ` (22 preceding siblings ...)
  2023-03-30  4:58 ` rvmallad at amazon dot com
@ 2023-03-30  7:52 ` marxin at gcc dot gnu.org
  23 siblings, 0 replies; 25+ messages in thread
From: marxin at gcc dot gnu.org @ 2023-03-30  7:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|NEW                         |RESOLVED

--- Comment #24 from Martin Liška <marxin at gcc dot gnu.org> ---
Sure, let's close it.

