From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 83B763858D3C; Wed, 29 Nov 2023 13:53:59 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 83B763858D3C
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1701266039;
	bh=mT6mjPiPPHU1RdIkKIMm4KGAjQTJzrs4KIi3q9sNPRc=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=sCTfEEV+g3jQRKj3nYyVRh49x1BywTG5ypOvD81kQLvnCIASSQ0E9CFVf9xCoc8lp
	 WBGaN6zVsuG0afBlyMjn1G2vEF20cnNC/9heNlbKzCYkszpHq88Oc2JssoxIz3n7Ip
	 6YBawSww0tTk/5h630ceTUx59QiX1ilYaIlbhezg=
From: "jamborm at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/112697] [14 Regression] 30-40% exec time regression of 433.milc on zen2 since r14-4972-g8aa47713701b1f
Date: Wed, 29 Nov 2023 13:53:59 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization, needs-bisection
X-Bugzilla-Severity: normal
X-Bugzilla-Who: jamborm at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: attachments.created
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112697

--- Comment #7 from Martin Jambor ---
Created attachment 56720
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56720&action=edit
Perf annotate of milc built with r14-4972-g8aa47713701b1f

commit r14-4972-g8aa47713701b1f:

$ perf stat taskset -c 0 specinvoke

 Performance counter stats for 'taskset -c 0 specinvoke':

         272931.43 msec task-clock:u                 #    1.000 CPUs utilized
                 0      context-switches:u           #    0.000 /sec
                 0      cpu-migrations:u             #    0.000 /sec
            472353      page-faults:u                #    1.731 K/sec
      886165387570      cycles:u                     #    3.247 GHz                      (83.33%)
       31546898034      stalled-cycles-frontend:u    #    3.56% frontend cycles idle     (83.33%)
      729878095777      stalled-cycles-backend:u     #   82.36% backend cycles idle      (83.33%)
     1061779557370      instructions:u               #    1.20  insn per cycle
                                                     #    0.69  stalled cycles per insn  (83.33%)
       58797121078      branches:u                   #  215.428 M/sec                    (83.33%)
           6960852      branch-misses:u              #    0.01% of all branches          (83.33%)

     272.967381843 seconds time elapsed

     268.718335000 seconds user
       4.212584000 seconds sys

$ perf record taskset -c 0 specinvoke
[ perf record: Woken up 167 times to write data ]
[ perf record: Captured and wrote 41.549 MB perf.data (1088982 samples) ]

$ perf report -n --percent-limit=1 --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 1M of event 'cycles:Pu'
# Event count (approx.): 883903400858
#
# Overhead       Samples  Command          Shared Object           Symbol
# ........  ............  ...............  ......................  ......................................
#
    24.34%        260907  milc_base.mine-  milc_base.mine-lto-gen  [.] add_force_to_mom
    18.01%        198287  milc_base.mine-  milc_base.mine-lto-gen  [.] mult_su3_na
    17.45%        187529  milc_base.mine-  milc_base.mine-lto-gen  [.] u_shift_fermion
    14.22%        155596  milc_base.mine-  milc_base.mine-lto-gen  [.] mult_su3_nn
     5.61%         60601  milc_base.mine-  milc_base.mine-lto-gen  [.] scalar_mult_add_su3_matrix
     4.35%         51034  milc_base.mine-  milc_base.mine-lto-gen  [.] path_product
     4.24%         46032  milc_base.mine-  milc_base.mine-lto-gen  [.] mult_su3_an
     2.99%         32624  milc_base.mine-  milc_base.mine-lto-gen  [.] imp_gauge_force.constprop.0
     1.50%         16242  milc_base.mine-  milc_base.mine-lto-gen  [.] compute_gen_staple
     1.35%         14580  milc_base.mine-  milc_base.mine-lto-gen  [.] mult_su3_mat_vec_sum_4dir
     1.21%         12922  milc_base.mine-  milc_base.mine-lto-gen  [.] make_anti_hermitian
     1.06%         11469  milc_base.mine-  milc_base.mine-lto-gen  [.] mult_adj_su3_mat_4vec
     1.03%         11111  milc_base.mine-  libc.so.6               [.] __memset_avx2_unaligned_erms

$ perf annotate -n --percent-limit=1 > ~/tmp/milc-perf-annotate-8aa47713701

(gzipped and attached)