Re: [PATCH v4] xtensa: Eliminate the use of callee-saved register that saves and restores only once

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Max Filippov <jcmvbkbc@gmail.com>
To: "Takayuki 'January June' Suwa" <jjsuwa_sys3175@yahoo.co.jp>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH v4] xtensa: Eliminate the use of callee-saved register that saves and restores only once
Date: Fri, 20 Jan 2023 07:14:17 -0800	[thread overview]
Message-ID: <CAMo8BfLFFSHHUxPJ90Z=w8GnWYezfkPK1okQtdTmwtGgmqCNEw@mail.gmail.com> (raw)
In-Reply-To: <465b0cbe-73ca-f5a0-661d-d34217e29b4d@yahoo.co.jp>

Hi Suwa-san,

On Wed, Jan 18, 2023 at 7:50 PM Takayuki 'January June' Suwa
<jjsuwa_sys3175@yahoo.co.jp> wrote:
>
> In the previous patch, if insn is JUMP_INSN or CALL_INSN, it bypasses the reg check (possibly FAIL).
>
> =====
> In the case of the CALL0 ABI, values that must be retained before and
> after function calls are placed in the callee-saved registers (A12
> through A15) and referenced later.  However, it is often the case that
> the save and the reference are each only once and a simple register-
> register move (the frame pointer is needed to recover the stack pointer
> and must be excluded).
>
> e.g. in the following example, if there are no other occurrences of
> register A14:
>
> ;; before
>         ; prologue {
>   ...
>         s32i.n  a14, sp, 16
>   ...
>         ; } prologue
>   ...
>         mov.n   a14, a6
>   ...
>         call0   foo
>   ...
>         mov.n   a8, a14
>   ...
>         ; epilogue {
>   ...
>         l32i.n  a14, sp, 16
>   ...
>         ; } epilogue
>
> It can be possible like this:
>
> ;; after
>         ; prologue {
>   ...
>         (deleted)
>   ...
>         ; } prologue
>   ...
>         s32i.n  a6, sp, 16
>   ...
>         call0   foo
>   ...
>         l32i.n  a8, sp, 16
>   ...
>         ; epilogue {
>   ...
>         (deleted)
>   ...
>         ; } epilogue
>
> This patch introduces a new peephole2 pattern that implements the above.
>
> gcc/ChangeLog:
>
>         * config/xtensa/xtensa.md: New peephole2 pattern that eliminates
>         the use of callee-saved register that saves and restores only once
>         for other register, by using its stack slot directly.
> ---
>  gcc/config/xtensa/xtensa.md | 62 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 62 insertions(+)

There are still issues with this change in the libgomp:

FAIL: libgomp.c/examples-4/target-1.c execution test
FAIL: libgomp.c/examples-4/target-2.c execution test

They come from the following function:

code produced before the change:
       .literal_position
       .literal .LC8, init@PLT
       .literal .LC9, 400000
       .literal .LC10, 100000
       .literal .LC11, -800000
       .literal .LC12, 800000
       .align  4
       .global vec_mult_ref
       .type   vec_mult_ref, @function
vec_mult_ref:
       l32r    a9, .LC11
       addi    sp, sp, -16
       l32r    a10, .LC9
       s32i.n  a12, sp, 8
       s32i.n  a13, sp, 4
       s32i.n  a0, sp, 12
       add.n   sp, sp, a9
       add.n   a12, sp, a10
       l32r    a9, .LC8
       mov.n   a13, a2
       mov.n   a3, sp
       mov.n   a2, a12
       callx0  a9
       l32r    a7, .LC10
       mov.n   a10, a12
       mov.n   a11, sp
       mov.n   a2, a13
       loop    a7, .L17_LEND
.L17:
       l32i.n  a9, a10, 0
       l32i.n  a6, a11, 0
       addi.n  a10, a10, 4
       mull    a9, a9, a6
       addi.n  a11, a11, 4
       s32i.n  a9, a2, 0
       addi.n  a2, a2, 4
       .L17_LEND:
       l32r    a9, .LC12
       add.n   sp, sp, a9
       l32i.n  a0, sp, 12
       l32i.n  a12, sp, 8
       l32i.n  a13, sp, 4
       addi    sp, sp, 16
       ret.n

--------------------

with the change:
       .literal_position
       .literal .LC8, init@PLT
       .literal .LC9, 400000
       .literal .LC10, 100000
       .literal .LC11, -800000
       .literal .LC12, 800000
       .align  4
       .global vec_mult_ref
       .type   vec_mult_ref, @function
vec_mult_ref:
       l32r    a9, .LC11
       l32r    a10, .LC9
       addi    sp, sp, -16
       s32i.n  a12, sp, 8
       s32i.n  a0, sp, 12
       add.n   sp, sp, a9
       add.n   a12, sp, a10
       l32r    a9, .LC8
       s32i.n  a2, sp, 4
       mov.n   a3, sp
       mov.n   a2, a12
       callx0  a9
       l32r    a7, .LC10
       l32i.n  a2, sp, 4
       mov.n   a10, a12
       mov.n   a11, sp
       loop    a7, .L17_LEND
.L17:
       l32i.n  a9, a10, 0
       l32i.n  a6, a11, 0
       addi.n  a10, a10, 4
       mull    a9, a9, a6
       addi.n  a11, a11, 4
       s32i.n  a9, a2, 0
       addi.n  a2, a2, 4
       .L17_LEND:
       l32r    a9, .LC12
       add.n   sp, sp, a9
       l32i.n  a0, sp, 12
       l32i.n  a12, sp, 8
       addi    sp, sp, 16
       ret.n

the stack pointer is modified after saving callee-saved registers,
but the stack offset where a2 is stored and reloaded does not take
this into an account.

After having this many attempts and getting to the issues that are
really hard to detect I wonder if the target backend is the right place
for this optimization?

-- 
Thanks.
-- Max

next prev parent reply	other threads:[~2023-01-20 15:14 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <465b0cbe-73ca-f5a0-661d-d34217e29b4d.ref@yahoo.co.jp>
2023-01-19  3:50 ` Takayuki 'January June' Suwa
2023-01-20 15:14   ` Max Filippov [this message]
2023-01-21  4:39     ` Takayuki 'January June' Suwa
2023-01-22 15:45       ` Max Filippov
2023-01-23  3:33         ` Takayuki 'January June' Suwa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMo8BfLFFSHHUxPJ90Z=w8GnWYezfkPK1okQtdTmwtGgmqCNEw@mail.gmail.com' \
    --to=jcmvbkbc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jjsuwa_sys3175@yahoo.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).