public inbox for gcc-bugs@sourceware.org
* [Bug target/103109] New: madd not used for multiply add on POWER9
@ 2021-11-06 18:56 tkoenig at gcc dot gnu.org
  2022-10-05 17:18 ` [Bug target/103109] " bergner at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: tkoenig at gcc dot gnu.org @ 2021-11-06 18:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103109

            Bug ID: 103109
           Summary: madd not used for multiply add on POWER9
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

The following code

#include <stdint.h>

void Long_multiplication( uint64_t multiplicand[],
                          uint64_t multiplier[],
                          uint64_t sum[],
                          uint64_t ilength, uint64_t jlength )
{
  uint64_t acarry, mcarry, product;

  for( uint64_t i = 0;
       i < (ilength + jlength);
       i++ )
    sum[i] = 0;

  acarry = 0;
  for( uint64_t j = 0; j < jlength; j++ )
    {
      mcarry = 0;
      for( uint64_t i = 0; i < ilength; i++ )
        {
          __uint128_t mcarry_prod;
          __uint128_t acarry_sum;
          mcarry_prod = ((__uint128_t) multiplicand[i])
            * ((__uint128_t) multiplier[j])
            + (__uint128_t) mcarry;
          mcarry = mcarry_prod >> 64;
          product = mcarry_prod;
          acarry_sum = ((__uint128_t) sum[i+j]) + ((__uint128_t) acarry)
            + product;
          sum[i+j] += acarry_sum;
          acarry = acarry_sum >> 64;
          //      {mcarry, product} = multiplicand[i]*multiplier[j]
          //                            + mcarry;
          //      {acarry,sum[i+j]} = {sum[i+j]+acarry} + product;

        }
    }
}

is translated by

$ gcc -mcpu=power9 -mtune=power9 -S -O3 big_int.c

to (assembler output of the loop)

.L4:
        mtctr 25
        mr 12,23
        add 3,24,4
        li 5,0
        .p2align 4,,15
.L5:
        ldu 10,8(12)
        ldx 11,29,4
        ldu 9,8(3)
        mulld 8,10,11
        mulhdu 10,10,11
        addc 30,8,5
        addze 31,10
        and 21,30,6
        and 22,31,7
        addc 10,21,9
        mr 5,31
        adde 8,22,28
        addc 10,10,0
        add 9,9,10
        addze 0,8
        std 9,0(3)
        bdnz .L5
        addi 27,27,1
        addi 4,4,8
        cmpld 0,26,27
        bne 0,.L4

For the idiom to calculate mcarry_prod, I would have expected
a pair of maddhdu and maddld instructions.
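
The idiom can be isolated as a minimal sketch. The helper names
mul_add_u64 and check_mul_add_u64 are hypothetical and not part of the
test case above; __uint128_t is the same GCC extension the test case
uses. Whether a given GCC version actually selects maddhdu and maddld
for this function is not something the sketch guarantees.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes a * b + c as a 128-bit value, with all
   three inputs zero-extended from 64 bits.  Returns the low 64 bits and
   stores the high 64 bits in *hi.  */
uint64_t mul_add_u64(uint64_t a, uint64_t b, uint64_t c, uint64_t *hi)
{
  __uint128_t r = (__uint128_t) a * (__uint128_t) b + (__uint128_t) c;
  *hi = (uint64_t) (r >> 64);   /* high half: candidate for maddhdu */
  return (uint64_t) r;          /* low half: candidate for maddld */
}

/* Self-check: (2^64 - 1)^2 + (2^64 - 1) = 2^128 - 2^64, so the low half
   is 0 and the high half is 2^64 - 1.  The sum never overflows 128 bits.  */
void check_mul_add_u64(void)
{
  uint64_t hi;
  uint64_t lo = mul_add_u64(UINT64_MAX, UINT64_MAX, UINT64_MAX, &hi);
  assert(lo == 0);
  assert(hi == UINT64_MAX);
}
```

Compiling this with the same command line as above and inspecting the
assembly for mul_add_u64 is a quick way to see which instructions get
chosen.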

This is with

gcc version 12.0.0 20211028 (experimental) (GCC)


* [Bug target/103109] madd not used for multiply add on POWER9
From: bergner at gcc dot gnu.org @ 2022-10-05 17:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103109

--- Comment #3 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to HaoChen Gui from comment #2)
> Fixed by r13-2107.

This is marked version = GCC 12.  Were you planning on backporting this?


* [Bug target/103109] madd not used for multiply add on POWER9
From: guihaoc at gcc dot gnu.org @ 2022-10-11  0:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103109

--- Comment #4 from HaoChen Gui <guihaoc at gcc dot gnu.org> ---
(In reply to Peter Bergner from comment #3)
> (In reply to HaoChen Gui from comment #2)
> > Fixed by r13-2107.
> 
> This is marked version = GCC 12.  Were you planning on backporting this?


I'm not sure the patch needs to be backported; it's not a functional issue.


* [Bug target/103109] madd not used for multiply add on POWER9
From: cvs-commit at gcc dot gnu.org @ 2023-02-15  9:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103109

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:3f71b82596e992eb6e53fe9bbd70a4b52bc908e8

commit r13-5999-g3f71b82596e992eb6e53fe9bbd70a4b52bc908e8
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Feb 15 09:56:47 2023 +0100

    powerpc: Fix up expansion for WIDEN_MULT_PLUS_EXPR [PR108787]

    WIDEN_MULT_PLUS_EXPR as documented has the factor operands with
    the same precision and the addend and result another one at least twice
    as wide.
    Similarly, {,u}maddMN4 is documented as
    'maddMN4'
         Multiply operands 1 and 2, sign-extend them to mode N, add operand
         3, and store the result in operand 0.  Operands 1 and 2 have mode M
         and operands 0 and 3 have mode N.  Both modes must be integer or
         fixed-point modes and N must be twice the size of M.

         In other words, 'maddMN4' is like 'mulMN3' except that it also adds
         operand 3.

         These instructions are not allowed to 'FAIL'.

    'umaddMN4'
         Like 'maddMN4', but zero-extend the multiplication operands instead
         of sign-extending them.
    The PR103109 addition of these expanders to rs6000 didn't handle this
    correctly, though: it treated the last argument as also having mode M,
    sign or zero extended into N.  Unfortunately, this means incorrect code
    generation whenever the last operand isn't really sign or zero extended
    from DImode to TImode.

    The following patch removes maddditi4 expander altogether from rs6000.md,
    because we'd need
            maddhd 9,3,4,5
            sradi 10,5,63
            maddld 3,3,4,5
            sub 9,9,10
            add 4,9,6
    which is longer than
            mulld 9,3,4
            mulhd 4,3,4
            addc 3,9,5
            adde 4,4,6
    and nothing would be able to optimize the case of last operand already
    sign-extended from DImode to TImode into just
            mr 9,3
            maddld 3,3,4,5
            maddhd 4,9,4,5
    or so.  And fixes umaddditi4, so that it emits an add at the end to add
    the high half of the last operand, fortunately in this case if the high
    half of the last operand is known to be zero (i.e. last operand is zero
    extended from DImode to TImode) then combine will drop the useless add.

    If we wanted to get back the signed op1 * op2 + op3 all in the DImode
    into TImode op0, we'd need to introduce a new tree code next to
    WIDEN_MULT_PLUS_EXPR and maddMN4 expander, because I'm afraid it can't
    be done at expansion time in maddMN4 expander to detect whether the
    operand is sign extended especially because of SUBREGs and the awkwardness
    of looking at earlier emitted instructions, and combine would need 5
    instruction combination.

    2023-02-15  Jakub Jelinek  <jakub@redhat.com>

            PR target/108787
            PR target/103109
            * config/rs6000/rs6000.md (<u>maddditi4): Change into umaddditi4
            only expander, change operand 3 to be TImode, emit maddlddi4 and
            umadddi4_highpart{,_le} with its low half and finally add the high
            half to the result.

            * gcc.dg/pr108787.c: New test.
            * gcc.target/powerpc/pr108787.c: New test.
            * gcc.target/powerpc/pr103109-1.c: Adjust expected instruction
            counts.
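
The operand-width issue described in the commit message can be modelled
in plain C. This is a sketch under stated assumptions, not the rs6000
expander itself: the function names below are hypothetical, and
unsigned __int128 is a GCC extension. umadd_reference follows the
documented umaddMN4 semantics (64-bit factors, 128-bit addend),
umadd_truncating models an expansion that treats the addend as
zero-extended from 64 bits, and umadd_fixed models a multiply-add on
the low half of the addend followed by an add of its high half.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;

/* Documented semantics: zero-extend a and b, add the full 128-bit c.  */
u128 umadd_reference(uint64_t a, uint64_t b, u128 c)
{
  return (u128) a * b + c;
}

/* Model of the mistaken expansion: the high half of c is dropped.  */
u128 umadd_truncating(uint64_t a, uint64_t b, u128 c)
{
  return (u128) a * b + (uint64_t) c;
}

/* Model of the corrected expansion: multiply-add with the low half of
   c, then add the high half of c to the high half of the result.  */
u128 umadd_fixed(uint64_t a, uint64_t b, u128 c)
{
  u128 t = (u128) a * b + (uint64_t) c;       /* maddld/maddhdu pair */
  uint64_t lo = (uint64_t) t;
  uint64_t hi = (uint64_t) (t >> 64) + (uint64_t) (c >> 64);  /* final add */
  return ((u128) hi << 64) | lo;
}

void check_umadd_models(void)
{
  u128 c = ((u128) 7 << 64) | 5;   /* addend with a nonzero high half */

  /* The corrected model agrees with the reference; the truncating one
     does not once the high half of c is nonzero.  */
  assert(umadd_fixed(3, 4, c) == umadd_reference(3, 4, c));
  assert(umadd_truncating(3, 4, c) != umadd_reference(3, 4, c));

  /* With a zero high half, the truncating model is also correct.  */
  assert(umadd_truncating(3, 4, 5) == umadd_reference(3, 4, 5));

  /* Wrap-around modulo 2^128 is handled identically.  */
  assert(umadd_fixed(UINT64_MAX, UINT64_MAX, c)
         == umadd_reference(UINT64_MAX, UINT64_MAX, c));
}
```

When the high half of c is zero, the final add in umadd_fixed adds
zero, which corresponds to the case where the commit message says
combine can drop the add.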

