public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/108629] New: 549.fotonik3d_r regresses 15-24% at -O2 -flto -march=x86-64-v3 since r13-1203-g038b077689bb53
From: jamborm at gcc dot gnu.org @ 2023-02-01 13:01 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108629
Bug ID: 108629
Summary: 549.fotonik3d_r regresses 15-24% at -O2 -flto
-march=x86-64-v3 since r13-1203-g038b077689bb53
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: jamborm at gcc dot gnu.org
CC: rsandifo at gcc dot gnu.org
Blocks: 26163
Target Milestone: ---
Host: x86_64-linux
Target: x86_64-linux
When benchmarking trunk revision 99ea0d76116, I noticed that
549.fotonik3d_r from the SPEC 2017 FPrate suite, built with options
-O2 -g -march=x86-64-v3 -flto=32, runs 24% slower on Zen4 and Zen3
machines and 16% slower on a Zen2 and an Intel CascadeLake machine
than the binary produced by GCC 12.
On the Zen3 machine, the number of branches reported by perf stat is
90% higher for the aforementioned trunk revision than for GCC 12.
The symbol profile changed from:
Overhead  Samples  Shared object           Name
  33.23%    40078  fotonik3d_r_peak.gcc12  __upml_mod_MOD_upml_updatee_simple.lto_priv.0
  27.74%    33471  fotonik3d_r_peak.gcc12  __upml_mod_MOD_upml_updateh
  17.50%    21114  fotonik3d_r_peak.gcc12  __material_mod_MOD_mat_updatee
   9.52%    11493  fotonik3d_r_peak.gcc12  __update_mod_MOD_updateh
   9.49%    11445  fotonik3d_r_peak.gcc12  __power_mod_MOD_power_dft
To:
Overhead  Samples  Shared object           Name
  26.68%    39825  fotonik3d_r_peak.trunk  __upml_mod_MOD_upml_updatee_simple.lto_priv.0
  22.35%    33368  fotonik3d_r_peak.trunk  __upml_mod_MOD_upml_updateh
  13.99%    20892  fotonik3d_r_peak.trunk  __material_mod_MOD_mat_updatee
  13.96%    20816  fotonik3d_r_peak.trunk  __power_mod_MOD_power_dft
  11.51%    17164  libgcc_s.so.1           __muldc3
   8.60%    12840  fotonik3d_r_peak.trunk  __update_mod_MOD_updateh
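Because the total sample count differs between the two runs, the
overhead percentages are not directly comparable, so it can help to
compare the raw sample counts per symbol instead. The following is a
minimal sketch that does this arithmetic. The sample counts are copied
from the two profiles in this report; the helper name sample_deltas is
made up for illustration, and the sketch assumes both profiles were
recorded at the same sampling rate and that symbols missing from a
listing contributed a negligible number of samples.

```python
# Sample counts copied from the two perf profiles in this report.
gcc12 = {
    "__upml_mod_MOD_upml_updatee_simple.lto_priv.0": 40078,
    "__upml_mod_MOD_upml_updateh": 33471,
    "__material_mod_MOD_mat_updatee": 21114,
    "__update_mod_MOD_updateh": 11493,
    "__power_mod_MOD_power_dft": 11445,
}
trunk = {
    "__upml_mod_MOD_upml_updatee_simple.lto_priv.0": 39825,
    "__upml_mod_MOD_upml_updateh": 33368,
    "__material_mod_MOD_mat_updatee": 20892,
    "__power_mod_MOD_power_dft": 20816,
    "__muldc3": 17164,
    "__update_mod_MOD_updateh": 12840,
}


def sample_deltas(before, after):
    """Hypothetical helper: per-symbol change in sample count.

    A symbol absent from one listing is treated as 0 samples there,
    which is an assumption (it may simply be below the report cutoff).
    """
    symbols = set(before) | set(after)
    return {s: after.get(s, 0) - before.get(s, 0) for s in symbols}


deltas = sample_deltas(gcc12, trunk)

# Print the symbols with the largest increase first.
for symbol, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{delta:+7d}  {symbol}")
```

Under those assumptions, the three large hot loops are essentially
unchanged (each within a few hundred samples), while nearly all of the
additional samples land in __power_mod_MOD_power_dft and in __muldc3
from libgcc_s.so.1, which does not appear in the GCC 12 listing at all.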
On the Zen3 machine at least, I have bisected this to:
commit 038b077689bb5310386b04d40a2cea234f01e6aa
Author: Richard Sandiford <richard.sandiford@arm.com>
Date: Wed Jun 22 11:27:15 2022 +0100
data-ref: Improve non-loop disambiguation [PR106019]
When dr_may_alias_p is called without a loop context, it tries
to use the tree-affine interface to calculate the difference
between the two addresses and use that difference to check whether
the gap between the accesses is known at compile time. However, as the
example in the PR shows, this doesn't expand SSA_NAMEs and so can easily
be defeated by things like reassociation.
One fix would have been to use aff_combination_expand to expand the
SSA_NAMEs, but we'd then need some way of maintaining the associated
cache. This patch instead reuses the innermost_loop_behavior fields
(which exist even when no loop context is provided).
It might still be useful to do the aff_combination_expand thing too,
if an example turns out to need it.
gcc/
PR tree-optimization/106019
* tree-data-ref.cc (dr_may_alias_p): Try using the
innermost_loop_behavior to disambiguate non-loop queries.
gcc/testsuite/
PR tree-optimization/106019
* gcc.dg/vect/bb-slp-pr106019.c: New test.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)