public inbox for gcc-bugs@sourceware.org
* [Bug target/102473] New: 521.wrf_r 5% slower at -Ofast and generic x86_64 tuning after r12-3426-g8f323c712ea76c
@ 2021-09-23 16:45 jamborm at gcc dot gnu.org
  2021-09-24  6:52 ` [Bug target/102473] [12 Regression] " rguenth at gcc dot gnu.org
                   ` (17 more replies)
  0 siblings, 18 replies; 19+ messages in thread
From: jamborm at gcc dot gnu.org @ 2021-09-23 16:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102473

            Bug ID: 102473
           Summary: 521.wrf_r 5% slower at -Ofast and generic x86_64
                    tuning after r12-3426-g8f323c712ea76c
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
                CC: crazylht at gmail dot com
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-linux
            Target: x86_64-linux

All three x86_64 LNT machines have detected a 4.5-5.2% performance
regression in the SPEC 2017 FPrate benchmark 521.wrf_r when compiled
with -Ofast and the default (generic) -march and -mtune.

Zen2 based machine regressed by 5%:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=294.548.0
Zen1 based machine regressed by 5.2%:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=35.548.0
Kabylake based machine regressed by 4.5%:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=34.548.0

On an AMD zen2 based machine I have bisected the regression to commit
r12-3426-g8f323c712ea76c:

8f323c712ea76cc4506b03895e9b991e4e4b2baf is the first bad commit
commit 8f323c712ea76cc4506b03895e9b991e4e4b2baf
Author: liuhongt <hongtao.liu@intel.com>
Date:   Tue Sep 7 12:39:04 2021 +0800

    Optimize v4sf reduction.

    gcc/ChangeLog:

            PR target/101059
            * config/i386/sse.md (reduc_plus_scal_<mode>): Split to ..
            (reduc_plus_scal_v4sf): .. this, New define_expand.
            (reduc_plus_scal_v2df): .. and this, New define_expand.


I have confirmed that the commit causes a similar regression on
a separate Intel Skylake server.

On the Zen2 machine, these are the per-symbol differences in samples
collected by perf (before is commit 60eec23b5ed, after is commit
8f323c712ea):

| Symbol                                      | sys lib | Before | After |  diff |     % |
|---------------------------------------------+---------+--------+-------+-------+-------|
| __logf_fma                                  | yes     |  68882 | 68940 |   +58 | +0.08 |
| __atanf                                     | yes     |  66664 | 66196 |  -468 | -0.70 |
| __module_advect_em_MOD_advect_scalar_pd     | no      |  62286 | 62348 |   +62 | +0.10 |
| __powf_fma                                  | yes     |  56213 | 56127 |   -86 | -0.15 |
| __module_mp_wsm5_MOD_nislfv_rain_plm        | no      |  46990 | 48340 | +1350 | +2.87 |
| __module_mp_wsm5_MOD_wsm52d                 | no      |  41031 | 40968 |   -63 | -0.15 |
| __module_small_step_em_MOD_advance_uv       | no      |  30908 | 30909 |    +1 | +0.00 |
| __module_small_step_em_MOD_advance_w        | no      |  28738 | 28600 |  -138 | -0.48 |
| __module_advect_em_MOD_advect_scalar        | no      |  28400 | 28429 |   +29 | +0.10 |
| __expf_fma                                  | yes     |  26702 | 26516 |  -186 | -0.70 |
| __module_big_step_utilities_em_MOD_phy_prep | no      |  25878 | 25816 |   -62 | -0.24 |
| psim_unstable_                              | no      |  24994 | 25106 |  +112 | +0.45 |
| __module_bl_ysu_MOD_ysu2d                   | no      |  24799 | 25251 |  +452 | +1.82 |
| psih_unstable_                              | no      |  22600 | 23139 |  +539 | +2.38 |
| __module_small_step_em_MOD_advance_mu_t     | no      |  22250 | 22232 |   -18 | -0.08 |
| __memset_avx2_unaligned_erms                | yes     |  21748 | 21613 |  -135 | -0.62 |
| _ZGVbN4vv_powf_sse4                         | yes     |  21206 | 21355 |  +149 | +0.70 |
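
For anyone re-checking the table, the diff and % columns appear to be
plain arithmetic on the two sample counts (After minus Before, and that
difference relative to Before). The sketch below is only an
illustration under that assumption: the helper name `delta` and the
choice of four rows are mine, not part of the original perf tooling.

```python
# Sample counts copied from the perf table: symbol -> (before, after).
samples = {
    "__module_mp_wsm5_MOD_nislfv_rain_plm": (46990, 48340),
    "psih_unstable_": (22600, 23139),
    "__module_bl_ysu_MOD_ysu2d": (24799, 25251),
    "__atanf": (66664, 66196),
}


def delta(before, after):
    """Return (absolute diff, percent change relative to 'before')."""
    diff = after - before
    return diff, round(100.0 * diff / before, 2)


# Rank symbols by relative change, largest slowdown first.
rows = sorted(
    ((name, *delta(before, after)) for name, (before, after) in samples.items()),
    key=lambda row: row[2],
    reverse=True,
)

for name, diff, pct in rows:
    print(f"{name:40s} {diff:+6d} {pct:+6.2f}%")
```

Sorting by the relative change puts __module_mp_wsm5_MOD_nislfv_rain_plm,
psih_unstable_ and __module_bl_ysu_MOD_ysu2d at the top, which are the
symbols with the largest increases in the table.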


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

Thread overview: 19+ messages
2021-09-23 16:45 [Bug target/102473] New: 521.wrf_r 5% slower at -Ofast and generic x86_64 tuning after r12-3426-g8f323c712ea76c jamborm at gcc dot gnu.org
2021-09-24  6:52 ` [Bug target/102473] [12 Regression] " rguenth at gcc dot gnu.org
2021-09-24  7:39 ` crazylht at gmail dot com
2021-09-24  7:50 ` rguenth at gcc dot gnu.org
2021-09-24 10:43 ` crazylht at gmail dot com
2021-09-26  7:29 ` crazylht at gmail dot com
2021-09-27  2:01 ` crazylht at gmail dot com
2021-09-27  2:16 ` crazylht at gmail dot com
2021-09-27  7:49 ` crazylht at gmail dot com
2021-09-27  7:51 ` cvs-commit at gcc dot gnu.org
2021-09-27  8:10 ` jamborm at gcc dot gnu.org
2021-09-27  8:18 ` crazylht at gmail dot com
2021-09-27 14:27 ` hjl.tools at gmail dot com
2021-09-28  2:20 ` crazylht at gmail dot com
2021-09-28  2:24 ` hjl.tools at gmail dot com
2021-09-28  2:59 ` crazylht at gmail dot com
2022-01-20  9:53 ` rguenth at gcc dot gnu.org
2022-05-06  8:31 ` [Bug target/102473] [12/13 " jakub at gcc dot gnu.org
2022-07-26 13:27 ` rguenth at gcc dot gnu.org
