From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id BB85E3858401; Sat, 6 Nov 2021 18:56:26 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org BB85E3858401
From: "tkoenig at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/103109] New: madd not used for multiply add on POWER9
Date: Sat, 06 Nov 2021 18:56:26 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: tkoenig at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 06 Nov 2021 18:56:26 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103109

            Bug ID: 103109
           Summary: madd not used for multiply add on POWER9
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

The following code

#include <stdint.h>

void
Long_multiplication( uint64_t multiplicand[],
                     uint64_t multiplier[],
                     uint64_t sum[],
                     uint64_t ilength, uint64_t jlength )
{
  uint64_t acarry, mcarry, product;

  for( uint64_t i = 0;
       i < (ilength + jlength);
       i++ )
    sum[i] = 0;

  acarry = 0;
  for( uint64_t j = 0; j < jlength; j++ )
    {
      mcarry = 0;
      for( uint64_t i = 0; i < ilength; i++ )
        {
          __uint128_t mcarry_prod;
          __uint128_t acarry_sum;

          mcarry_prod = ((__uint128_t) multiplicand[i]) * ((__uint128_t) multiplier[j])
                        + (__uint128_t) mcarry;
          mcarry = mcarry_prod >> 64;
          product = mcarry_prod;
          acarry_sum = ((__uint128_t) sum[i+j]) + ((__uint128_t) acarry) + product;
          sum[i+j] += acarry_sum;
          acarry = acarry_sum >> 64;
          // {mcarry, product} = multiplicand[i]*multiplier[j]
          //                     + mcarry;
          // {acarry,sum[i+j]} = {sum[i+j]+acarry} + product;
        }
    }
}

is translated by

$ gcc -mcpu=power9 -mtune=power9 -S -O3 big_int.c

to (assembler output of the loop)

.L4:
        mtctr 25
        mr 12,23
        add 3,24,4
        li 5,0
        .p2align 4,,15
.L5:
        ldu 10,8(12)
        ldx 11,29,4
        ldu 9,8(3)
        mulld 8,10,11
        mulhdu 10,10,11
        addc 30,8,5
        addze 31,10
        and 21,30,6
        and 22,31,7
        addc 10,21,9
        mr 5,31
        adde 8,22,28
        addc 10,10,0
        add 9,9,10
        addze 0,8
        std 9,0(3)
        bdnz .L5
        addi 27,27,1
        addi 4,4,8
        cmpld 0,26,27
        bne 0,.L4

For the idiom used to calculate mcarry_prod, I would have expected a
pair of maddhdu and maddld instructions instead of the separate
mulld/mulhdu multiply followed by addc/addze.

This is with gcc version 12.0.0 20211028 (experimental) (GCC).
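
A reduced form of the mcarry_prod idiom may make the missed optimization
easier to inspect in isolation. The sketch below is not part of the
original test case; the function name mul_add_u64 and its signature are
made up for illustration. It computes the full 128-bit value of a * b + c
and hands back the two 64-bit halves, which are the values that the
Power ISA 3.0 maddld (low doubleword) and maddhdu (high doubleword,
unsigned) instructions produce. Whether GCC selects those instructions
for this function has not been checked here.

```c
#include <stdint.h>

/* Hypothetical helper, not from the report: returns the low 64 bits of
   a * b + c and stores the high 64 bits in *hi.  The sum cannot overflow
   128 bits, since (2^64-1)^2 + (2^64-1) < 2^128.  */
uint64_t
mul_add_u64 (uint64_t a, uint64_t b, uint64_t c, uint64_t *hi)
{
  __uint128_t t = (__uint128_t) a * (__uint128_t) b + (__uint128_t) c;
  *hi = (uint64_t) (t >> 64);   /* value maddhdu would deliver */
  return (uint64_t) t;          /* value maddld would deliver */
}
```

Compiling this with the same options as above (-mcpu=power9
-mtune=power9 -S -O3) and reading the generated assembly should show
whether the mulld/mulhdu/addc/addze sequence also appears without the
surrounding loops.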