Re: [PATCH] [x86] reenable dword MOVE_MAX for better memmove inlining


From: Richard Biener <richard.guenther@gmail.com>
To: Alexandre Oliva <oliva@adacore.com>
Cc: gcc-patches@gcc.gnu.org, "H.J. Lu" <hjl.tools@gmail.com>,
	 Jan Hubicka <hubicka@ucw.cz>, Uros Bizjak <ubizjak@gmail.com>
Subject: Re: [PATCH] [x86] reenable dword MOVE_MAX for better memmove inlining
Date: Thu, 25 May 2023 15:32:20 +0200
Message-ID: <CAFiYyc3G2DxJhArmsMtkScxPmeMrEdc1hxYA5HQ1R9NUkHau3A@mail.gmail.com>
In-Reply-To: <orttw0a6gv.fsf@lxoliva.fsfla.org>

On Thu, May 25, 2023 at 3:25 PM Alexandre Oliva <oliva@adacore.com> wrote:
>
> On May 25, 2023, Richard Biener <richard.guenther@gmail.com> wrote:
>
> > On Thu, May 25, 2023 at 1:10 PM Alexandre Oliva <oliva@adacore.com> wrote:
> >>
> >> On May 25, 2023, Richard Biener <richard.guenther@gmail.com> wrote:
> >>
> >> > I mean we could do what RTL expansion would do later and do
> >> > by-pieces, thus emit multiple loads/stores but not n loads and then
> >> > n stores but interleaved.
> >>
> >> That wouldn't help e.g. gcc.dg/memcpy-6.c's fold_move_8, because
> >> MOVE_MAX and MOVE_MAX_PIECES currently limit inline expansion to 4
> >> bytes on x86 without SSE, both in gimple and RTL, and interleaved loads
> >> and stores wouldn't help with memmove.  We can't fix that by changing
> >> code that uses MOVE_MAX and/or MOVE_MAX_PIECES, when these limits are
> >> set too low.
>
> > Btw, there was a short period where the MOVE_MAX limit was restricted
> > but that had fallout and we've reverted since then.
>
> Erhm...  Are we even talking about the same issue?
>
> i386/i386.h reduced the 32-bit non-SSE MOVE_MAX from 16 to 4, which
> broke this test; I'm proposing to bounce it back up to 8, so that we get
> a little more memmove inlining, enough for tests that expect that much
> to pass.
>
> You may be focusing on the gimple-fold bit, because I mentioned it, but
> even the rtl expander is failing to expand the memmove because of the
> setting, as evidenced by the test's failure in the scan for memmove in
> the final dump.

So indeed fold_move_8 expands to the following, even with
-minline-all-stringops:

fold_move_8:
.LFB5:
        .cfi_startproc
        pushl   %ebp
        .cfi_def_cfa_offset 8
        .cfi_offset 5, -8
        movl    %esp, %ebp
        .cfi_def_cfa_register 5
        subl    $8, %esp
        movl    $a+3, %eax
        subl    $4, %esp
        pushl   $8
        pushl   $a
        pushl   %eax
        call    memmove
        addl    $16, %esp
        nop

I do think it's still up to RTL expansion or the target to decide whether
it's worth spending two registers to handle the overlap, or maybe to
emit a compare & jump that selects between forward and backward variants.
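
To make the two options concrete, here is a minimal C sketch of an
8-byte overlapping move built from 4-byte pieces. Judging from the
pushes in the listing above, the call is equivalent to
memmove (a + 3, a, 8). The names move8_two_regs, move8_cmp_jump,
matches_memmove and self_check are made up for illustration; this only
models the two strategies and is not what GCC emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Strategy 1: load both 4-byte halves into temporaries (two registers
   on a 32-bit target) before storing either one.  Every load happens
   before any store, so this is correct for any overlap.  */
static void move8_two_regs(char *dst, const char *src)
{
    uint32_t lo, hi;
    memcpy(&lo, src, 4);
    memcpy(&hi, src + 4, 4);
    memcpy(dst, &lo, 4);
    memcpy(dst + 4, &hi, 4);
}

/* Strategy 2: a single temporary plus a compare & jump that picks the
   copy direction: low half first when dst is at or below src, high
   half first otherwise.  */
static void move8_cmp_jump(char *dst, const char *src)
{
    uint32_t t;
    if ((uintptr_t)dst <= (uintptr_t)src) {
        /* Forward variant.  */
        memcpy(&t, src, 4);     memcpy(dst, &t, 4);
        memcpy(&t, src + 4, 4); memcpy(dst + 4, &t, 4);
    } else {
        /* Backward variant.  */
        memcpy(&t, src + 4, 4); memcpy(dst + 4, &t, 4);
        memcpy(&t, src, 4);     memcpy(dst, &t, 4);
    }
}

/* Hypothetical helper: returns 1 if fn gives the same result as
   memmove for an 8-byte move inside a 16-byte buffer.  */
static int matches_memmove(void (*fn)(char *, const char *),
                           int dst_off, int src_off)
{
    char got[16], want[16];
    for (int i = 0; i < 16; i++)
        got[i] = want[i] = (char)('a' + i);
    fn(got + dst_off, got + src_off);
    memmove(want + dst_off, want + src_off, 8);
    return memcmp(got, want, sizeof got) == 0;
}

/* Checks the overlap from the listing (dst = src + 3) and its mirror.  */
void self_check(void)
{
    assert(matches_memmove(move8_two_regs, 3, 0));
    assert(matches_memmove(move8_two_regs, 0, 3));
    assert(matches_memmove(move8_cmp_jump, 3, 0));
    assert(matches_memmove(move8_cmp_jump, 0, 3));
}
```

The first variant costs an extra live register but no branch; the second
keeps register pressure low at the price of a compare and two code
paths. Which is cheaper depends on the target, which is why I'd leave
the choice to RTL expansion.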

Yes, increasing MOVE_MAX to 8 makes this expand at the GIMPLE
level already, which I believe is premature and difficult to undo.

> That MOVE_MAX change was a significant regression in codegen for 32-bit
> non-SSE x86, and I'm proposing to fix that.  Compensating for that
> regression elsewhere doesn't seem desirable to me: MOVE_MAX can be much
> higher even on other x86 variants, so the effects of such attempts may
> quite significantly harm more modern CPUs.
>
> Conversely, I don't expect the reduction of MOVE_MAX on SSE-less x86 a
> couple of years ago to have been measured for performance effects, given
> the little overall relevance of such CPUs, and the very visible and
> undesirable effects on codegen that change brought onto them.  And yet,
> I'm being very conservative in the proposed reversion, because
> benchmarking such targets in any meaningful way would be somewhat
> challenging for myself as well.
>
> So, could we please have this narrow fix of this limited regression at
> the spot where it was introduced accepted, rather than debating
> tangents?
>
> --
> Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
>    Free Software Activist                       GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about <https://stallmansupport.org>
