public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org>
Cc: Jeff Law <jlaw@tachyum.com>,  "H.J. Lu" <hjl.tools@gmail.com>
Subject: Re: [PATCH] Rewrite memset expanders with vec_duplicate
Date: Fri, 16 Jul 2021 14:23:58 +0100	[thread overview]
Message-ID: <mpt1r7y5r4h.fsf@arm.com> (raw)
In-Reply-To: <CAMe9rOr82vAMJRZFy76JUr0f=rrNrvSMirLSp-9zKKXKXH9YDA@mail.gmail.com> (H. J. Lu via Gcc-patches's message of "Fri, 16 Jul 2021 05:57:01 -0700")

"H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
> On Fri, Jul 16, 2021 at 4:38 AM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
>> > 1. Rewrite builtin_memset_read_str and builtin_memset_gen_str with
>> > vec_duplicate_optab to duplicate QI value to TI/OI/XI value.
>> > 2. Add TARGET_GEN_MEMSET_SCRATCH_RTX to allow the backend to use a hard
>> > scratch register to avoid stack realignment when expanding memset.
>> >
>> >       PR middle-end/90773
>> >       * builtins.c (gen_memset_value_from_prev): New function.
>> >       (gen_memset_broadcast): Likewise.
>> >       (builtin_memset_read_str): Use gen_memset_value_from_prev
>> >       and gen_memset_broadcast.
>> >       (builtin_memset_gen_str): Likewise.
>> >       * target.def (gen_memset_scratch_rtx): New hook.
>> >       * doc/tm.texi.in: Add TARGET_GEN_MEMSET_SCRATCH_RTX.
>> >       * doc/tm.texi: Regenerated.
>> > ---
>> >  gcc/builtins.c     | 123 +++++++++++++++++++++++++++++++++++++--------
>> >  gcc/doc/tm.texi    |   5 ++
>> >  gcc/doc/tm.texi.in |   2 +
>> >  gcc/target.def     |   7 +++
>> >  4 files changed, 116 insertions(+), 21 deletions(-)
>> >
>> > diff --git a/gcc/builtins.c b/gcc/builtins.c
>> > index 39ab139b7e1..c1758ae2efc 100644
>> > --- a/gcc/builtins.c
>> > +++ b/gcc/builtins.c
>> > @@ -6686,26 +6686,111 @@ expand_builtin_strncpy (tree exp, rtx target)
>> >    return NULL_RTX;
>> >  }
>> >
>> > -/* Callback routine for store_by_pieces.  Read GET_MODE_BITSIZE (MODE)
>> > -   bytes from constant string DATA + OFFSET and return it as target
>> > -   constant.  If PREV isn't nullptr, it has the RTL info from the
>> > +/* Return the RTL of a register in MODE generated from PREV in the
>> >     previous iteration.  */
>> >
>> > -rtx
>> > -builtin_memset_read_str (void *data, void *prevp,
>> > -                      HOST_WIDE_INT offset ATTRIBUTE_UNUSED,
>> > -                      scalar_int_mode mode)
>> > +static rtx
>> > +gen_memset_value_from_prev (void *prevp, scalar_int_mode mode)
>> >  {
>> > +  rtx target = nullptr;
>> >    by_pieces_prev *prev = (by_pieces_prev *) prevp;
>> >    if (prev != nullptr && prev->data != nullptr)
>> >      {
>> >        /* Use the previous data in the same mode.  */
>> >        if (prev->mode == mode)
>> >       return prev->data;
>> > +
>> > +      rtx prev_rtx = prev->data;
>> > +      machine_mode prev_mode = prev->mode;
>> > +      unsigned int word_size = GET_MODE_SIZE (word_mode);
>> > +      if (word_size < GET_MODE_SIZE (prev->mode)
>> > +       && word_size > GET_MODE_SIZE (mode))
>> > +     {
>> > +       /* First generate subreg of word mode if the previous mode is
>> > +          wider than word mode and word mode is wider than MODE.  */
>> > +       prev_rtx = simplify_gen_subreg (word_mode, prev_rtx,
>> > +                                       prev_mode, 0);
>> > +       prev_mode = word_mode;
>> > +     }
>> > +      if (prev_rtx != nullptr)
>> > +     target = simplify_gen_subreg (mode, prev_rtx, prev_mode, 0);
>> >      }
>> > +  return target;
>> > +}
>> > +
>> > +/* Return the RTL of a register in MODE broadcasted from DATA.  */
>> > +
>> > +static rtx
>> > +gen_memset_broadcast (rtx data, scalar_int_mode mode)
>> > +{
>> > +  /* Skip if regno_reg_rtx isn't initialized.  */
>> > +  if (!regno_reg_rtx)
>> > +    return nullptr;
>> > +
>> > +  rtx target = nullptr;
>> > +
>> > +  unsigned int nunits = GET_MODE_SIZE (mode) / GET_MODE_SIZE (QImode);
>> > +  machine_mode vector_mode;
>> > +  if (!mode_for_vector (QImode, nunits).exists (&vector_mode))
>> > +    gcc_unreachable ();
>>
>> Sorry, I realise it's a bit late to be raising this objection now,
>> but I don't think it's a good idea to use scalar integer modes as
>> a proxy for vector modes.  In principle there's no reason why a
>> target has to define an integer mode for every vector mode.
>
> A target always defines the largest integer mode.

Right.  But a target shouldn't *need* to define an integer mode
for every vector mode.

>> If we want the mode to be a vector then I think the by-pieces
>> infrastructure should be extended to support vectors directly,
>> rather than assuming that each piece can be represented as
>> a scalar_int_mode.
>>
>
> The current by-pieces infrastructure operates on scalar_int_mode.
> Only for memset, there is
>
> /* Callback routine for store_by_pieces.  Return the RTL of a register
>    containing GET_MODE_SIZE (MODE) consecutive copies of the unsigned
>    char value given in the RTL register data.  For example, if mode is
>    4 bytes wide, return the RTL for 0x01010101*data.  If PREV isn't
>    nullptr, it has the RTL info from the previous iteration.  */
>
> static rtx
> builtin_memset_gen_str (void *data, void *prevp,
>                         HOST_WIDE_INT offset ATTRIBUTE_UNUSED,
>                         scalar_int_mode mode)
>
> It is a broadcast.  If a target can broadcast a byte to a wider integer,
> can you suggest a way to use it in the current by-pieces infrastructure?

I meant that we should extend the by-pieces infrastructure so that
it no longer requires scalar_int_mode, rather than work within the
current limits.

IMO the fact that we're (legitimately) going through vec_duplicate_optab
shows that we really are dealing with vectors here, not integers.

Thanks,
Richard

  reply	other threads:[~2021-07-16 13:24 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-13 21:49 H.J. Lu
2021-07-16 11:38 ` Richard Sandiford
2021-07-16 12:57   ` H.J. Lu
2021-07-16 13:23     ` Richard Sandiford [this message]
2021-07-16 14:15       ` H.J. Lu
2021-07-16 23:57         ` [PATCH] Add QI vector mode support to by-pieces for memset H.J. Lu
2021-07-19 14:40           ` Richard Sandiford
2021-07-20  2:43             ` H.J. Lu
2021-07-20  6:38               ` Richard Sandiford
2021-07-20 12:48                 ` H.J. Lu
2021-07-20 13:10                   ` H.J. Lu
2021-07-20 14:26                   ` Richard Sandiford
2021-07-20 15:12                     ` Richard Sandiford
2021-07-20 18:52                       ` H.J. Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mpt1r7y5r4h.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hjl.tools@gmail.com \
    --cc=jlaw@tachyum.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).