From: "lili.cui at intel dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU
Date: Tue, 29 Mar 2022 06:48:30 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271

--- Comment #7 from cuilili ---
Created attachment 52706
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52706&action=edit
Add a heuristic for eliminate redundant load and store in inline pass.

Hi Richard,

Could you help take a look? This is my first time adding code to the mid-end, and I hope you can give me some advice. Thank you!

I added an INLINE_HINT_eliminate_load_and_store hint to the inline pass.
When the callee's memory accesses go to memory that is local to the caller and passed in as a parameter, and the access size is greater than the target threshold, we enable the hint. With the hint set, inlining_insns_auto enlarges the bound. The target hook is only enabled for x86 now.

With the patch applied:

Icelake server: 538.imagick_r gets a 15.18% improvement for multicopy and a 40.78% improvement for single copy, with no measurable changes for other benchmarks.

Cascade Lake: 538.imagick_r gets a 12.4% improvement for multicopy, with code size increased by 0.4% and no measurable changes for other benchmarks.

Znver3 server: 538.imagick_r gets a 9.6% improvement for multicopy, with code size increased by 0.5% and no measurable changes for other benchmarks.
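
For readers who have not opened the attachment, the decision described above can be sketched roughly as below. This is only an illustration, not the code in the patch: MemAccess, TargetInfo, has_eliminate_load_store_hint and inline_insns_bound are hypothetical names, summing the access sizes is an assumption (the patch may measure size differently), and the scaling factor is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one memory access made by the callee.
struct MemAccess {
    // True if the access goes through a pointer parameter that the
    // caller points at its own local memory.
    bool hits_caller_local_param;
    std::size_t bytes;
};

// Hypothetical stand-in for the target hook: whether the target opts in,
// and the size threshold it reports.
struct TargetInfo {
    bool hook_enabled;
    std::size_t size_threshold;
};

// Returns true when the hint would be set under the assumptions above:
// the target opts in and the callee touches more than size_threshold
// bytes of caller-local memory through its parameters.
bool has_eliminate_load_store_hint(const std::vector<MemAccess>& accesses,
                                   const TargetInfo& target) {
    if (!target.hook_enabled)
        return false;
    std::size_t total = 0;  // assumption: sizes are summed per callee
    for (const MemAccess& a : accesses)
        if (a.hits_caller_local_param)
            total += a.bytes;
    return total > target.size_threshold;
}

// Enlarges the auto-inlining instruction bound when the hint is set.
// scale_percent is an illustrative knob, not a real GCC parameter.
int inline_insns_bound(int base_bound, bool hint, int scale_percent) {
    assert(scale_percent >= 100);
    return hint ? base_bound * scale_percent / 100 : base_bound;
}
```

Under these assumptions, a callee that reads 24 bytes of the caller's local memory through a parameter on a target with a 16-byte threshold gets the hint, so a base bound of 15 would become 30 with scale_percent set to 200. On a target that does not enable the hook, the bound stays at 15.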