From: Richard Sandiford <richard.sandiford@arm.com>
To: "H.J. Lu" <hjl.tools@gmail.com>
Cc: "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org>,
Jeff Law <jlaw@tachyum.com>
Subject: Re: [PATCH v4] Add QI vector mode support to by-pieces for memset
Date: Fri, 30 Jul 2021 10:05:18 +0100 [thread overview]
Message-ID: <mptr1fggoj5.fsf@arm.com> (raw)
In-Reply-To: <CAMe9rOpy8+h9qDyOR8q5LYAaKRthoetFYvzQky46BnVxiExhZw@mail.gmail.com> (H. J. Lu's message of "Tue, 27 Jul 2021 20:20:21 -0700")
"H.J. Lu" <hjl.tools@gmail.com> writes:
> On Tue, Jul 27, 2021 at 8:31 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>> On Mon, Jul 26, 2021 at 4:19 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>> >
>> > On Mon, Jul 26, 2021 at 3:56 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>> > >
>> > > On Mon, Jul 26, 2021 at 2:53 PM Richard Sandiford
>> > > <richard.sandiford@arm.com> wrote:
>> > > >
>> > > > "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
>> > > > > On Mon, Jul 26, 2021 at 11:42 AM Richard Sandiford
>> > > > > <richard.sandiford@arm.com> wrote:
>> > > > >>
>> > > > >> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
>> > > > >> > +to avoid stack realignment when expanding memset. The default is
>> > > > >> > +@code{gen_reg_rtx}.
>> > > > >> > +@end deftypefn
>> > > > >> > +
>> > > > >> > @deftypefn {Target Hook} unsigned TARGET_LOOP_UNROLL_ADJUST (unsigned @var{nunroll}, class loop *@var{loop})
>> > > > >> > This target hook returns a new value for the number of times @var{loop}
>> > > > >> > should be unrolled. The parameter @var{nunroll} is the number of times
>> > > > >> > […]
>> > > > >> > @@ -1446,7 +1511,10 @@ can_store_by_pieces (unsigned HOST_WIDE_INT len,
>> > > > >> > max_size = STORE_MAX_PIECES + 1;
>> > > > >> > while (max_size > 1 && l > 0)
>> > > > >> > {
>> > > > >> > - scalar_int_mode mode = widest_int_mode_for_size (max_size);
>> > > > >> > + /* Since this can be called before virtual registers are ready
>> > > > >> > + to use, avoid QI vector mode here. */
>> > > > >> > + fixed_size_mode mode
>> > > > >> > + = widest_fixed_size_mode_for_size (max_size, false);
>> > > > >>
>> > > > >> I think I might have asked this before, sorry, but: when is that true
>> > > > >> and why does it matter?
>> > > > >
>> > > > > can_store_by_pieces may be called:
>> > > > >
>> > > > > value-prof.c: if (!can_store_by_pieces (val, builtin_memset_read_str,
>> > > > > value-prof.c: if (!can_store_by_pieces (val, builtin_memset_read_str,
>> > > > >
>> > > > > before virtual registers can be used. When true is passed to
>> > > > > widest_fixed_size_mode_for_size, virtual registers may be used
>> > > > > to expand memset to broadcast, which leads to ICE. Since for the
>> > > > > purpose of can_store_by_pieces, we don't need to expand memset
>> > > > > to broadcast and pass false here can avoid ICE.
>> > > >
>> > > > Ah, I see, thanks.
>> > > >
>> > > > That sounds like a problem in the way that the memset const function is
>> > > > written though. can_store_by_pieces is just a query function, so I don't
>> > > > think it should be trying to create new registers for can_store_by_pieces,
>> > > > even if it could. At the same time, can_store_by_pieces should make the
>> > > > same choices as the real expander would.
>> > > >
>> > > > I think this means that:
>> > > >
>> > > > - gen_memset_broadcast should be inlined into its callers, with the
>> > > > builtin_memset_read_str getting the CONST_INT_P case and
>> > > > builtin_memset_gen_str getting the variable case.
>> > > >
>> > > > - builtin_memset_read_str should then stop at and return the
>> > > > gen_const_vec_duplicate when the prev argument is null.
>> >
>> > This doesn't work since can_store_by_pieces has
>> >
>> > cst = (*constfun) (constfundata, nullptr, offset, mode);
>> > if (!targetm.legitimate_constant_p (mode, cst))
>>
>> We can add a target hook, targetm.legitimate_memset_constant_p,
>> which defaults to targetm.legitimate_constant_p. Will it be acceptable?
>
> In the v5 patch, I changed it to
>
> cst = (*constfun) (constfundata, nullptr, offset, mode);
> /* All CONST_VECTORs are legitimate if vec_duplicate
> is supported. */
Maybe “can be loaded” rather than “are legitimate”, since they're
not necessarily legitimate in the sense of legitimate_constant_p
(hence the patch). Also, since we assume elsewhere that
vec_duplicate is a precondition for picking a vector mode,
I think we should do the same here (and note that in the comment).
So…
> if (!((memsetp
> && VECTOR_MODE_P (mode)
> && GET_MODE_INNER (mode) == QImode
> && (optab_handler (vec_duplicate_optab, mode)
> != CODE_FOR_nothing))
I think we need only the (memsetp && VECTOR_MODE_P (mode)) check.
This feels a bit of a hack TBH. I think the same principles apply
to vectors and integers here: forcing the constant to memory is
still likely to be an optimisation, but is an extra overhead that
we should probably account for.
However, I agree this is probably the most practical way forward
at the moment.
Thanks,
Richard
> || targetm.legitimate_constant_p (mode, cst)))
> return 0;
>
>> > ix86_legitimate_constant_p only allows 0 or -1 for CONST_VECTOR.
>> > can_store_by_pieces doesn't work well for vector modes.
>> >
>> > > > Only when prev is nonnull should it go on to call the hook
>> > > > and copy the constant to the register that the hook returns.
>> > >
>> > > How about keeping gen_memset_broadcast and passing PREV to it:
>> > >
>> > > rtx target;
>> > > if (CONST_INT_P (data))
>> > > {
>> > > rtx const_vec = gen_const_vec_duplicate (mode, data);
>> > > if (prev == NULL)
>> > > /* Return CONST_VECTOR when called by a query function. */
>> > > target = const_vec;
>> > > else
>> > > {
>> > > /* Use the move expander with CONST_VECTOR. */
>> > > target = targetm.gen_memset_scratch_rtx (mode);
>> > > emit_move_insn (target, const_vec);
>> > > }
>> > > }
>> > > else
>> > > {
>> > > target = targetm.gen_memset_scratch_rtx (mode);
>> > > class expand_operand ops[2];
>> > > create_output_operand (&ops[0], target, mode);
>> > > create_input_operand (&ops[1], data, QImode);
>> > > expand_insn (icode, 2, ops);
>> > > if (!rtx_equal_p (target, ops[0].value))
>> > > emit_move_insn (target, ops[0].value);
>> > > }
>> > >
>> > > > I admit that's uglier than the current version, but it looks like that's
>> > > > what the current interface expects.
>> > > >
>> > > > Thanks,
>> > > > Richard
>> > >
>> > >
>> > >
>> > > --
>> > > H.J.
>> >
>> >
>> >
>> > --
>> > H.J.
>>
>>
>>
>> --
>> H.J.
next prev parent reply other threads:[~2021-07-30 9:05 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-21 20:02 H.J. Lu
2021-07-26 18:42 ` Richard Sandiford
2021-07-26 21:04 ` H.J. Lu
2021-07-26 21:53 ` Richard Sandiford
2021-07-26 22:56 ` H.J. Lu
2021-07-26 23:19 ` H.J. Lu
2021-07-27 15:31 ` H.J. Lu
2021-07-28 3:20 ` H.J. Lu
2021-07-30 9:05 ` Richard Sandiford [this message]
2021-07-30 12:38 ` H.J. Lu
2021-07-30 9:06 ` Richard Sandiford
2021-07-30 12:39 ` H.J. Lu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=mptr1fggoj5.fsf@arm.com \
--to=richard.sandiford@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=hjl.tools@gmail.com \
--cc=jlaw@tachyum.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).