From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 303A7385842A; Wed, 29 Nov 2023 13:52:38 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 303A7385842A
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1701265958;
	bh=VVAf5qzuQWJxf4o/4v2XldmIYxeOeoXumB7cMOKxRPI=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=niBVtR3QsIbybr2JOxP1+ntuN/ElNcIKuWuHdtP+p6H2DQxzpoPilz0UK42wNQZvq
	 5DkvqSg4GDYqFNFnLPgQtU7zpPnlCWfgZE45OiYRrzdD1SHFIYBx0YBsCcLwz6rvmu
	 tRVaEwRZrxC31M1uqGkN2kH3LxoZPQswuISxAA/c=
From: "jamborm at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/112697] [14 Regression] 30-40% exec time regression of 433.milc on zen2 since r14-4972-g8aa47713701b1f
Date: Wed, 29 Nov 2023 13:52:37 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization, needs-bisection
X-Bugzilla-Severity: normal
X-Bugzilla-Who: jamborm at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: attachments.created
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112697

--- Comment #6 from Martin Jambor ---
Created attachment 56719
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56719&action=edit
Perf annotate of milc built with r14-4971-g0beb1611754742

commit r14-4971-g0beb1611754742:

$ perf stat taskset -c 0 specinvoke

 Performance counter stats for 'taskset -c 0 specinvoke':

        216908.59 msec task-clock:u                 #    1.000 CPUs utilized
                0      context-switches:u           #    0.000 /sec
                0      cpu-migrations:u             #    0.000 /sec
           889694      page-faults:u                #    4.102 K/sec
     697007650237      cycles:u                     #    3.213 GHz                      (83.33%)
      31999772966      stalled-cycles-frontend:u    #    4.59% frontend cycles idle     (83.33%)
     540485725923      stalled-cycles-backend:u     #   77.54% backend cycles idle      (83.33%)
    1061256162815      instructions:u               #    1.52  insn per cycle
                                                    #    0.51  stalled cycles per insn  (83.33%)
      58760648879      branches:u                   #  270.901 M/sec                    (83.34%)
         11890202      branch-misses:u              #    0.02% of all branches          (83.33%)

    216.935387643 seconds time elapsed

    211.436079000 seconds user
      5.472459000 seconds sys

$ perf record taskset -c 0 specinvoke
[ perf record: Woken up 132 times to write data ]
[ perf record: Captured and wrote 32.901 MB perf.data (862286 samples) ]

$ perf report -n --percent-limit=1 --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 862K of event 'cycles:Pu'
# Event count (approx.): 695776598661
#
# Overhead       Samples  Command          Shared Object           Symbol
# ........  ............  ...............  ......................  ......................................
#
    22.68%        197003  milc_base.mine-  milc_base.mine-lto-gen  [.] mult_su3_na
    20.99%        177912  milc_base.mine-  milc_base.mine-lto-gen  [.] u_shift_fermion
    19.04%        163787  milc_base.mine-  milc_base.mine-lto-gen  [.] mult_su3_nn
     6.85%         58509  milc_base.mine-  milc_base.mine-lto-gen  [.] scalar_mult_add_su3_matrix
     5.51%         50953  milc_base.mine-  milc_base.mine-lto-gen  [.] path_product
     5.40%         46083  milc_base.mine-  milc_base.mine-lto-gen  [.] mult_su3_an
     4.22%         35853  milc_base.mine-  milc_base.mine-lto-gen  [.] add_force_to_mom
     3.77%         32446  milc_base.mine-  milc_base.mine-lto-gen  [.] imp_gauge_force.constprop.0
     1.98%         16848  milc_base.mine-  milc_base.mine-lto-gen  [.] compute_gen_staple
     1.94%         16462  milc_base.mine-  milc_base.mine-lto-gen  [.] make_anti_hermitian
     1.73%         14655  milc_base.mine-  milc_base.mine-lto-gen  [.] mult_su3_mat_vec_sum_4dir
     1.35%         11472  milc_base.mine-  milc_base.mine-lto-gen  [.] mult_adj_su3_mat_4vec
     1.27%         10801  milc_base.mine-  libc.so.6               [.] __memset_avx2_unaligned_erms

$ perf annotate -n --percent-limit=1 > ~/tmp/milc-perf-annotate-0beb1611754
(gzipped and attached)