From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id A094738582A4; Wed, 15 Feb 2023 09:13:38 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A094738582A4
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1676452418;
	bh=9P7a+v2GXW484nPX0mh9TciYgfoxMqGQImQ8BSOBEBY=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=L/R1FXizHptCrBcHXtWLjuh0GCjDu0FRIJiLcxuL8tBqVHEG8ZUycTSnSm6AL7aVD
	 3X5e7McKY6/BZRlQaRx3u/eU7JVFbobJ8zI14ACh7/dSrXG6mTeA3BaUMQ6xuI5E5d
	 t3WMDemHx8utxHBuzSMO6ET6Ypuj7FZ8EqrYx2a0=
From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/103109] madd not used for multiply add on POWER9
Date: Wed, 15 Feb 2023 09:13:35 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: RESOLVED
X-Bugzilla-Resolution: FIXED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103109

--- Comment #5 from CVS Commits ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:3f71b82596e992eb6e53fe9bbd70a4b52bc908e8

commit r13-5999-g3f71b82596e992eb6e53fe9bbd70a4b52bc908e8
Author: Jakub Jelinek
Date:   Wed Feb 15 09:56:47 2023 +0100

    powerpc: Fix up expansion for WIDEN_MULT_PLUS_EXPR [PR108787]

    WIDEN_MULT_PLUS_EXPR as documented has the factor operands with the same
    precision and the addend and result another one at least
    twice as wide.
    Similarly, {,u}maddMN4 is documented as
    'maddMN4'
         Multiply operands 1 and 2, sign-extend them to mode N, add operand
         3, and store the result in operand 0.  Operands 1 and 2 have mode M
         and operands 0 and 3 have mode N.  Both modes must be integer or
         fixed-point modes and N must be twice the size of M.

         In other words, 'maddMN4' is like 'mulMN3' except that it also adds
         operand 3.

         These instructions are not allowed to 'FAIL'.

    'umaddMN4'
         Like 'maddMN4', but zero-extend the multiplication operands instead
         of sign-extending them.
    The PR103109 addition of these expanders to rs6000 didn't handle this
    correctly though, it treated the last argument as also having mode M
    sign or zero extended into N.  Unfortunately this means incorrect code
    generation whenever the last operand isn't really sign or zero extended
    from DImode to TImode.

    The following patch removes maddditi4 expander altogether from rs6000.md,
    because we'd need
            maddhd 9,3,4,5
            sradi 10,5,63
            maddld 3,3,4,5
            sub 9,9,10
            add 4,9,6
    which is longer than
            mulld 9,3,4
            mulhd 4,3,4
            addc 3,9,5
            adde 4,4,6
    and nothing would be able to optimize the case of last operand already
    sign-extended from DImode to TImode into just
            mr 9,3
            maddld 3,3,4,5
            maddhd 4,9,4,5
    or so.  And fixes umaddditi4, so that it emits an add at the end to add
    the high half of the last operand, fortunately in this case if the high
    half of the last operand is known to be zero (i.e. last operand is zero
    extended from DImode to TImode) then combine will drop the useless add.

    If we wanted to get back the signed op1 * op2 + op3 all in the DImode
    into TImode op0, we'd need to introduce a new tree code next to
    WIDEN_MULT_PLUS_EXPR and maddMN4 expander, because I'm afraid it can't
    be done at expansion time in maddMN4 expander to detect whether the
    operand is sign extended especially because of SUBREGs and the awkwardness
    of looking at earlier emitted instructions, and combine would need 5
    instruction combination.
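To make the operand-3 problem described above concrete, here is a minimal sketch in C. It is not code from the patch or from GCC: the function names are hypothetical, and the modes are scaled down to M = 32 bits and N = 64 bits so that it builds with plain standard C (the rs6000 case is the same idea with DImode and TImode). The first function follows the documented umaddMN4 semantics, the second models an expander that assumes operand 3 is merely a zero-extended M-mode value.

```c
#include <assert.h>
#include <stdint.h>

/* Documented umaddMN4 semantics with M = 32 bits, N = 64 bits (an
   assumption made for portability): zero-extend both factors,
   multiply, then add the full N-mode operand 3. */
uint64_t umadd_documented(uint32_t op1, uint32_t op2, uint64_t op3)
{
    return (uint64_t)op1 * (uint64_t)op2 + op3;
}

/* Hypothetical model of the earlier mistake: operand 3 is treated as a
   zero-extended M-mode value, so its high half never reaches the sum. */
uint64_t umadd_narrow_addend(uint32_t op1, uint32_t op2, uint64_t op3)
{
    return (uint64_t)op1 * (uint64_t)op2 + (uint64_t)(uint32_t)op3;
}

/* The two models agree only when operand 3 really is zero-extended. */
void check_umadd_models(void)
{
    assert(umadd_documented(3, 5, 7) == umadd_narrow_addend(3, 5, 7));
    assert(umadd_documented(3, 5, 0x100000007ULL)
           != umadd_narrow_addend(3, 5, 0x100000007ULL));
}
```

Calling check_umadd_models() exercises both cases: an addend that fits in the narrow mode, where the two models agree, and one with a nonzero high half, where the narrow model loses it.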

    2023-02-15  Jakub Jelinek

            PR target/108787
            PR target/103109
            * config/rs6000/rs6000.md (maddditi4): Change into umaddditi4 only
            expander, change operand 3 to be TImode, emit maddlddi4 and
            umadddi4_highpart{,_le} with its low half and finally add the high
            half to the result.

            * gcc.dg/pr108787.c: New test.
            * gcc.target/powerpc/pr108787.c: New test.
            * gcc.target/powerpc/pr103109-1.c: Adjust expected instruction counts.
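
The shape of the fixed umaddditi4 expansion in the ChangeLog entry (a low-part multiply-add and a high-part multiply-add, both using the low half of operand 3, followed by an add of its high half) can be sketched in C as follows. This is only an illustration under assumptions: the halves are 32 bits wide instead of 64 so that the reference result fits in uint64_t, and wide_t, madd_low, umadd_high, umadd_by_halves and check_umadd_by_halves are hypothetical names, not GCC or rs6000 identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* An N-mode value held as two M-mode halves (M = 32 bits here, an
   assumption made so the sketch needs no 128-bit type). */
typedef struct { uint32_t lo, hi; } wide_t;

/* Low M bits of op1 * op2 + addend (the role played by maddlddi4). */
static uint32_t madd_low(uint32_t a, uint32_t b, uint32_t c)
{
    return (uint32_t)((uint64_t)a * b + c);
}

/* High M bits of the unsigned op1 * op2 + addend (the role played by
   umadddi4_highpart).  The sum cannot overflow 2*M bits. */
static uint32_t umadd_high(uint32_t a, uint32_t b, uint32_t c)
{
    return (uint32_t)(((uint64_t)a * b + c) >> 32);
}

/* op1 * op2 + op3 with a full-width op3, computed half by half. */
wide_t umadd_by_halves(uint32_t op1, uint32_t op2, wide_t op3)
{
    wide_t r;
    r.lo = madd_low(op1, op2, op3.lo);
    /* The final add: fold in the high half of operand 3. */
    r.hi = umadd_high(op1, op2, op3.lo) + op3.hi;
    return r;
}

/* Compare the half-by-half result with a direct 64-bit computation. */
void check_umadd_by_halves(uint32_t op1, uint32_t op2, wide_t op3)
{
    uint64_t ref = (uint64_t)op1 * op2 + (((uint64_t)op3.hi << 32) | op3.lo);
    wide_t r = umadd_by_halves(op1, op2, op3);
    assert(r.lo == (uint32_t)ref);
    assert(r.hi == (uint32_t)(ref >> 32));
}
```

When op3.hi is zero, the last addition adds nothing, which corresponds to the case where the commit message says combine will drop the useless add.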