From mboxrd@z Thu Jan 1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 48) id 0964F385842B; Thu, 26 Aug 2021 18:46:49 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0964F385842B
From: "meissner at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug ipa/102059] Incorrect always_inline diagnostic in LTO mode with #pragma GCC target("cpu=power10")
Date: Thu, 26 Aug 2021 18:46:49 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: ipa
X-Bugzilla-Version: 11.2.1
X-Bugzilla-Keywords: diagnostic, lto, rejects-valid
X-Bugzilla-Severity: normal
X-Bugzilla-Who: meissner at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: linkw at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields: cc
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
X-List-Received-Date: Thu, 26 Aug 2021 18:46:50 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059

Michael Meissner changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |meissner at gcc dot gnu.org

--- Comment #19 from Michael Meissner ---
The main power8 fusion that GCC does is combining:

        addis rtmp,r0,symbol@hi(r2)
        ld/lbz/lwz rx,symbol@lo(rtmp)

into:

        addis rx,symbol@hi(r2)
        ld/lbz/lwz rx,symbol@lo(rx)

This fusion is listed as one of the fusion types in the power10 documents. The fusion type is wideimmediate.
Note, when you are compiling for -mcpu=power10, this fusion case doesn't often get used because we use PC-relative loads. But the machine does support it.

In addition, power8 fusion combines a load to a traditional floating point register followed by a move to a traditional Altivec register. Similarly, it will combine a move from a traditional Altivec register to a traditional floating point register followed by a store:

        lfd fy,32(rx)                xxlor fy,vsrx
        xxlor vsrz,fy,fy             stfd fy,32(rz)

into:

        li rtmp,32                   li rtmp,32
        lxdx vsrz,2,rtmp             stxdx vsrx,rz,rtmp

Now on power9 and power10, this sequence is not generated because we have the lxsd and stxsd instructions (and plxsd/pstxsd in power10).

So I suspect we may want to move the p8 load fusion case support to fusion.md, and do it for power10 as well. Aaron Sawdey may have other thoughts, since he has been working on the power10 fusion support, and knows more about what is actually implemented in current hardware.

Then for inlining, we may want to exclude p8_fusion and p10_fusion in the comparison in rs6000_can_inline_p, since these are optimizations that don't affect the instructions generated.

Note, there was so-called power9 fusion code that was originally in the power9 spec, but was not implemented in the hardware. I removed support for it in November 2018.
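
To make the inlining suggestion concrete, here is a rough standalone sketch of what "exclude p8_fusion and p10_fusion in the comparison" could look like. It is not the actual rs6000_can_inline_p code. It assumes the check is a subset test on ISA flag bits (the callee may only use features the caller has enabled), and every name and bit value below (SKETCH_MASK_*, sketch_can_inline_strict, sketch_can_inline_p) is made up for illustration and does not correspond to GCC's real OPTION_MASK_* definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only.  The real masks live in
   GCC's rs6000 backend and have different names and values.  */
#define SKETCH_MASK_ALTIVEC    (UINT64_C(1) << 0)
#define SKETCH_MASK_VSX        (UINT64_C(1) << 1)
#define SKETCH_MASK_P8_FUSION  (UINT64_C(1) << 2)
#define SKETCH_MASK_P10_FUSION (UINT64_C(1) << 3)

/* Flags that only steer optimization (fusion) and do not change which
   instructions are legal, so they are ignored for the inline check.  */
#define SKETCH_IGNORED_FOR_INLINE \
  (SKETCH_MASK_P8_FUSION | SKETCH_MASK_P10_FUSION)

/* Assumed baseline: the callee's flags must be a subset of the caller's.  */
bool
sketch_can_inline_strict (uint64_t caller_flags, uint64_t callee_flags)
{
  return (caller_flags & callee_flags) == callee_flags;
}

/* Same subset test, but with the fusion bits masked out on both sides.  */
bool
sketch_can_inline_p (uint64_t caller_flags, uint64_t callee_flags)
{
  uint64_t caller = caller_flags & ~SKETCH_IGNORED_FOR_INLINE;
  uint64_t callee = callee_flags & ~SKETCH_IGNORED_FOR_INLINE;
  return (caller & callee) == callee;
}
```

With these assumed bits, a callee built with VSX plus p8_fusion and a caller built with VSX plus p10_fusion fail the strict test but pass the masked one, which is the behaviour the always_inline case in this bug would need. A callee that needs VSX is still rejected when the caller only has Altivec.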