From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 3394B3854556; Fri, 9 Dec 2022 09:48:46 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 3394B3854556
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1670579326;
	bh=WW0kOMceBvCmLx/oe7NL0M3QTUVgwY5qJTM/2/XdMBA=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=IZQ06Dg82I9ZWpRv2noaBsk8sAwEcyvpCbiqJwf5CBVt5EdCFcQlD/wfJW+jZk3pH
	 JuiBjPMh6IfrS7tzUx4BDxcFcfeoUhPlKodEhtRqxG6XEGqOUnZbzG6WUixYSVlS1g
	 Cu30G+WBLtaHaKjgdj4D5inY8h9JFjvtYUqpJCPY=
From: "rvmallad at amazon dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba
Date: Fri, 09 Dec 2022 09:48:41 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rvmallad at amazon dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #12 from Rama Malladi ---
I found differences in the dumps at various stages of compilation between
mainline GCC and a build with update_max_bb_count() commented out.
Here are the details:

Mainline: Commit ID: 63a42ffc0833553fbcb84b50cf0fd2d867b8a92f

The dumps differed in these 2 stages: "einline" and "earlydebug".

Since we use LTO for this build of 519.lbm_r, I also found differences in
these stages of the link-time optimizer: "vect", "slp1", "ivopts",
"earlydebug", "debug".

Also, this perf drop of 5%-6% with the update_max_bb_count() code was
observed only on ARM64 instances (Graviton3) and not on x86_64 instances
(Intel Xeon).

I ran the other SPEC cpu2017_fprate benchmarks on ARM64 with this code
commented out on GCC mainline and I haven't observed any perf regression.
So, maybe worth a fix.

Thank you.
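
For anyone who wants to repeat the dump comparison, here is a rough sketch of
one way to list the passes whose dumps differ between two builds. It is not
the procedure used above. It assumes each build's dump files sit in their own
directory and follow GCC's usual "<file>.<number><t|r|i>.<pass>" naming. The
function names are made up for illustration. Dumps are matched by pass name
rather than pass number, since the numbering may shift between builds.

```python
import filecmp
import re
from pathlib import Path

# GCC dump files end in ".<number><t|r|i>.<pass>", e.g. "lbm.c.172t.vect".
DUMP_SUFFIX = re.compile(r"\.(\d+)([tri])\.([A-Za-z0-9_-]+)$")


def index_dumps(dump_dir):
    """Map (file stem, pass name) -> path for every dump file in dump_dir."""
    index = {}
    for path in Path(dump_dir).iterdir():
        match = DUMP_SUFFIX.search(path.name)
        if path.is_file() and match:
            stem = path.name[:match.start()]
            index[(stem, match.group(3))] = path
    return index


def differing_passes(baseline_dir, patched_dir):
    """Return sorted pass names whose dumps differ between the two builds."""
    base = index_dumps(baseline_dir)
    patched = index_dumps(patched_dir)
    changed = set()
    # Only compare dumps that exist in both builds.
    for key in base.keys() & patched.keys():
        # shallow=False forces a byte-by-byte content comparison.
        if not filecmp.cmp(base[key], patched[key], shallow=False):
            changed.add(key[1])
    return sorted(changed)
```

Calling differing_passes("dumps-mainline", "dumps-patched") (hypothetical
directory names) on dumps produced with an option such as -fdump-tree-all
returns the pass names to look at first. A byte-level comparison can also flag
passes that differ only in incidental details, so the result is a starting
point for a manual diff, not a verdict.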