From: Richard Biener <richard.guenther@gmail.com>
To: Alexandre Oliva <oliva@adacore.com>
Cc: gcc-patches@gcc.gnu.org, "H.J. Lu" <hjl.tools@gmail.com>,
Jan Hubicka <hubicka@ucw.cz>, Uros Bizjak <ubizjak@gmail.com>
Subject: Re: [PATCH] [x86] reenable dword MOVE_MAX for better memmove inlining
Date: Thu, 25 May 2023 15:32:20 +0200 [thread overview]
Message-ID: <CAFiYyc3G2DxJhArmsMtkScxPmeMrEdc1hxYA5HQ1R9NUkHau3A@mail.gmail.com> (raw)
In-Reply-To: <orttw0a6gv.fsf@lxoliva.fsfla.org>
On Thu, May 25, 2023 at 3:25 PM Alexandre Oliva <oliva@adacore.com> wrote:
>
> On May 25, 2023, Richard Biener <richard.guenther@gmail.com> wrote:
>
> > On Thu, May 25, 2023 at 1:10 PM Alexandre Oliva <oliva@adacore.com> wrote:
> >>
> >> On May 25, 2023, Richard Biener <richard.guenther@gmail.com> wrote:
> >>
> >> > I mean we could do what RTL expansion would do later and do
> >> > by-pieces, thus emit multiple loads/stores but not n loads and then
> >> > n stores but interleaved.
> >>
> >> That wouldn't help e.g. gcc.dg/memcpy-6.c's fold_move_8, because
> >> MOVE_MAX and MOVE_MAX_PIECES currently limits inline expansion to 4
> >> bytes on x86 without SSE, both in gimple and RTL, and interleaved loads
> >> and stores wouldn't help with memmove. We can't fix that by changing
> >> code that uses MOVE_MAX and/or MOVE_MAX_PIECES, when these limits are
> >> set too low.
>
> > Btw, there was a short period where the MOVE_MAX limit was restricted
> > but that had fallout and we've reverted since then.
>
> Erhm... Are we even talking about the same issue?
>
> i386/i386.h reduced the 32-bit non-SSE MOVE_MAX from 16 to 4, which
> broke this test; I'm proposing to bounce it back up to 8, so that we get
> a little more memmove inlining, enough for tests that expect that much
> to pass.
>
> You may be focusing on the gimple-fold bit, because I mentioned it, but
> even the rtl expander is failing to expand the memmove because of the
> setting, as evidenced by the test's failure in the scan for memmove in
> the final dump.
So indeed fold_move_8 expands to the following, even with -minline-all-stringops
fold_move_8:
.LFB5:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
subl $8, %esp
movl $a+3, %eax
subl $4, %esp
pushl $8
pushl $a
pushl %eax
call memmove
addl $16, %esp
nop
I do think it's still up to RTL expansion or the target to decide whether
its worth spending two registers to handle the overlap or maybe
emit a compare & jump to do forward and backward variants.
Yes, increasing MOVE_MAX to 8 makes this expand at the GIMPLE
level already, which I belive is premature and difficult to undo.
> That MOVE_MAX change was a significant regression in codegen for 32-bit
> non-SSE x86, and I'm proposing to fix that. Compensating for that
> regression elsewhere doesn't seem desirable to me: MOVE_MAX can be much
> higher even on other x86 variants, so the effects of such attempts may
> harm quite significantly more modern CPUs.
>
> Conversely, I don't expect the reduction of MOVE_MAX on SSE-less x86 a
> couple of years ago to have been measured for performance effects, given
> the little overall relevance of such CPUs, and the very visible and
> undesirable effects on codegen that change brought onto them. And yet,
> I'm being very conservative in the proposed reversion, because
> benchmarking such targets in any meaningful way would be somewhat
> challenging for myself as well.
>
> So, could we please have this narrow fix of this limited regression at
> the spot where it was introduced accepted, rather than debating
> tangents?
>
> --
> Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
> Free Software Activist GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts. Ask me about <https://stallmansupport.org>
prev parent reply other threads:[~2023-05-25 13:34 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-24 5:47 Alexandre Oliva
2023-05-24 9:12 ` Richard Biener
2023-05-25 10:01 ` Alexandre Oliva
2023-05-25 10:49 ` Richard Biener
2023-05-25 11:10 ` Alexandre Oliva
2023-05-25 11:33 ` Richard Biener
2023-05-25 13:25 ` Alexandre Oliva
2023-05-25 13:32 ` Richard Biener [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAFiYyc3G2DxJhArmsMtkScxPmeMrEdc1hxYA5HQ1R9NUkHau3A@mail.gmail.com \
--to=richard.guenther@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=hjl.tools@gmail.com \
--cc=hubicka@ucw.cz \
--cc=oliva@adacore.com \
--cc=ubizjak@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).