public inbox for gcc-bugs@sourceware.org
From: "bergner at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/105556] New: RA assigns an MMA vector input operand to vs0-vs31 causing an MMA accumulator to be spilled
Date: Tue, 10 May 2022 20:37:38 +0000
Message-ID: <bug-105556-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105556

            Bug ID: 105556
           Summary: RA assigns an MMA vector input operand to vs0-vs31
                    causing an MMA accumulator to be spilled
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bergner at gcc dot gnu.org
  Target Milestone: ---

With current trunk and GCC 12, the MMA optimized dgemm kernel in OpenBLAS shows
a performance regression compared to GCC 11 and GCC 10.  The core loop in dgemm
uses 8 accumulator variables, which want all 8 accumulator registers.  Each
accumulator overlaps four VSRs (ACC0 overlaps vs0-vs3, ACC1 overlaps vs4-vs7,
and so on), so with all 8 accumulators live we should not use the vs0 through
vs31 vector registers for the MMA instructions' normal vector input operands.
However, with trunk and GCC 12, the register allocator assigns one vector input
to one of the vs0-vs31 registers, which forces us to spill one of the
accumulators and causes a significant performance loss.
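
The register overlap behind this constraint can be sketched in a few lines of
Python.  This is only an illustrative model, not GCC code: it assumes the
Power10 layout in which accumulator ACC[i] overlays VSRs 4*i through 4*i+3,
and the helper names overlapping_acc and free_input_vsrs are hypothetical.

```python
# Illustrative model only (not GCC code).
# Assumption: Power10 MMA layout, where ACC[i] overlays VSRs 4*i .. 4*i+3.
NUM_ACCS = 8
VSRS_PER_ACC = 4
NUM_VSRS = 64


def overlapping_acc(vsr):
    """Return the accumulator that overlays this VSR, or None for vs32-vs63."""
    if not 0 <= vsr < NUM_VSRS:
        raise ValueError(f"no such VSR: {vsr}")
    if vsr < NUM_ACCS * VSRS_PER_ACC:
        return vsr // VSRS_PER_ACC
    return None


def free_input_vsrs(live_accs):
    """VSRs usable as plain vector inputs without touching a live accumulator."""
    return [v for v in range(NUM_VSRS) if overlapping_acc(v) not in live_accs]


if __name__ == "__main__":
    all_live = set(range(NUM_ACCS))
    print(overlapping_acc(0))               # 0: vs0 is part of ACC0
    print(overlapping_acc(40))              # None: vs40 overlaps no accumulator
    print(free_input_vsrs(all_live)[0])     # 32: first safe input register
    print(len(free_input_vsrs(all_live)))   # 32: only vs32-vs63 remain
```

With all 8 accumulators live, the model leaves only vs32-vs63 for the vector
inputs, which is why any input placed in vs0-vs31 costs an accumulator.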

The trunk and GCC 12 asm for the core loop looks like:

.L5:
        lxvp 0,0(10)
        lxv 40,0(9)
        addi 10,10,64
        addi 9,9,64
        lxv 41,-48(9)
        lxv 42,-32(9)
        lxv 43,-16(9)
        lxvp 2,32(1)
        lxvp 32,-32(10)
        xvf64gerpp 4,0,40
        xvf64gerpp 6,0,41
        xvf64gerpp 3,0,42
        xvf64gerpp 2,0,43
        lxvp 0,64(1)
        xvf64gerpp 5,32,40
        xvf64gerpp 7,32,41
        xvf64gerpp 1,32,42
        xxmtacc 0
        xvf64gerpp 0,32,43
        xxmfacc 0
        stxvp 2,32(1)
        stxvp 0,64(1)
        bdnz .L5

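Note the use of vs0 as an input operand in the MMA instructions, which forces
ACC0 to be spilled: on every iteration it is reloaded from and stored back to
the stack with lxvp/stxvp at 32(1) and 64(1), plus an xxmtacc/xxmfacc pair.
The "better" GCC 11 and GCC 10 code looks like:
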
.L5:
        lxvp 44,0(10)
        lxvp 32,32(10)
        addi 9,9,64
        addi 10,10,64
        lxv 39,-64(9)
        lxv 40,-48(9)
        lxv 41,-32(9)
        lxv 42,-16(9)
        xvf64gerpp 4,44,39
        xvf64gerpp 5,32,39
        xvf64gerpp 6,44,40
        xvf64gerpp 7,32,40
        xvf64gerpp 3,44,41
        xvf64gerpp 1,32,41
        xvf64gerpp 2,44,42
        xvf64gerpp 0,32,42
        bdnz .L5
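
The difference between the two loops can be checked mechanically.  The Python
sketch below is a hypothetical helper, not part of GCC or OpenBLAS: it scans
the xvf64gerpp lines copied from the two listings above and reports which
accumulators have one of their VSRs read as a vector input.  It assumes the
ACC[i] = vs(4*i)..vs(4*i+3) overlap and that the second operand of xvf64gerpp
names an even/odd VSR pair.

```python
import re

# Matches "xvf64gerpp AT,XAp,XB" and captures the three register numbers.
GER = re.compile(r"xvf64gerpp\s+(\d+),(\d+),(\d+)")


def input_conflicts(asm):
    """Return the accumulators whose VSRs are also read as xvf64gerpp inputs."""
    accs, inputs = set(), set()
    for at, xap, xb in GER.findall(asm):
        accs.add(int(at))
        # Assumption: XAp is a VSR pair (XAp, XAp+1); XB is a single VSR.
        inputs.update({int(xap), int(xap) + 1, int(xb)})
    # Assumption: ACC[i] overlays vs(4*i)..vs(4*i+3), so only vs0-vs31 matter.
    clobbered = {v // 4 for v in inputs if v < 32}
    return sorted(clobbered & accs)


# xvf64gerpp lines from the trunk / GCC 12 loop.
GCC12_LOOP = """
        xvf64gerpp 4,0,40
        xvf64gerpp 6,0,41
        xvf64gerpp 3,0,42
        xvf64gerpp 2,0,43
        xvf64gerpp 5,32,40
        xvf64gerpp 7,32,41
        xvf64gerpp 1,32,42
        xvf64gerpp 0,32,43
"""

# xvf64gerpp lines from the GCC 11 / GCC 10 loop.
GCC11_LOOP = """
        xvf64gerpp 4,44,39
        xvf64gerpp 5,32,39
        xvf64gerpp 6,44,40
        xvf64gerpp 7,32,40
        xvf64gerpp 3,44,41
        xvf64gerpp 1,32,41
        xvf64gerpp 2,44,42
        xvf64gerpp 0,32,42
"""

if __name__ == "__main__":
    print(input_conflicts(GCC12_LOOP))  # [0]: the vs0/vs1 input pair hits ACC0
    print(input_conflicts(GCC11_LOOP))  # []: all inputs live in vs32-vs63
```

The check only looks at the MMA instructions themselves; it does not model the
spill code (lxvp/stxvp, xxmtacc/xxmfacc) that the conflict causes.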
