public inbox for gcc-patches@gcc.gnu.org
From: "H.J. Lu" <hjl.tools@gmail.com>
To: "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org>,
	Uros Bizjak <ubizjak@gmail.com>,
	 Jakub Jelinek <jakub@redhat.com>,
	Hongtao Liu <crazylht@gmail.com>,
	 Richard Biener <richard.guenther@gmail.com>,
	Richard Sandiford <richard.sandiford@arm.com>
Subject: Re: [PATCH v5 2/2] x86: Add vec_duplicate<mode> expander
Date: Mon, 28 Jun 2021 12:38:18 -0700
Message-ID: <CAMe9rOqC1heG7Pf+zhHfV+AxwPnJQOnDBw3QGKpiqHEH91OSkw@mail.gmail.com>
In-Reply-To: <mptsg126u8v.fsf@arm.com>

On Mon, Jun 28, 2021 at 5:36 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> "H.J. Lu" <hjl.tools@gmail.com> writes:
> > On Sun, Jun 27, 2021 at 2:00 PM Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> >>
> >> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
> >> > On Sun, Jun 27, 2021 at 1:43 AM Richard Sandiford
> >> > <richard.sandiford@arm.com> wrote:
> >> >>
> >> >> "H.J. Lu" <hjl.tools@gmail.com> writes:
> >> >> > 1. Update vec_duplicate to allow to fail so that backend can only allow
> >> >> > broadcasting an integer constant to a vector when broadcast instruction
> >> >> > is available.  This can be used by memset expander to avoid vec_duplicate
> >> >> > when loading from constant pool is more efficient.
> >> >>
> >> >> I don't see any changes in target-independent code though, other than
> >> >> the doc update.  It's still the case that (existing) uses of
> >> >> vec_duplicate_optab do not allow it to fail.
> >> >
> >> > I have a followup patch set on
> >> >
> >> > https://gitlab.com/x86-gcc/gcc/-/commits/users/hjl/pieces/broadcast
> >> >
> >> > to use it to expand memset with vector broadcast:
> >> >
> >> > https://gitlab.com/x86-gcc/gcc/-/commit/991c87f8a83ca736ae9ed92baa3ebadca289f6e3
> >> >
> >> > For SSE2 which doesn't have vector broadcast, the constant vector broadcast
> >> > expander returns FAIL and load from constant pool will be used.
> >>
> >> Hmm, but as Jeff and I mentioned in the earlier replies,
> >> vec_duplicate_optab shouldn't be used for constants.  Constants
> >> should go via the move expanders instead.
> >>
> >> In a previous message I suggested:
> >>
> >>   … would it work to change:
> >>
> >>         /* Try using vec_duplicate_optab for uniform vectors.  */
> >>         if (!TREE_SIDE_EFFECTS (exp)
> >>             && VECTOR_MODE_P (mode)
> >>             && eltmode == GET_MODE_INNER (mode)
> >>             && ((icode = optab_handler (vec_duplicate_optab, mode))
> >>                 != CODE_FOR_nothing)
> >>             && (elt = uniform_vector_p (exp)))
> >>
> >>   to something like:
> >>
> >>         /* Try using vec_duplicate_optab for uniform vectors.  */
> >>         if (!TREE_SIDE_EFFECTS (exp)
> >>             && VECTOR_MODE_P (mode)
> >>             && eltmode == GET_MODE_INNER (mode)
> >>             && (elt = uniform_vector_p (exp)))
> >>           {
> >>             if (TREE_CODE (elt) == INTEGER_CST
> >>                 || TREE_CODE (elt) == POLY_INT_CST
> >>                 || TREE_CODE (elt) == REAL_CST
> >>                 || TREE_CODE (elt) == FIXED_CST)
> >>               {
> >>                 rtx src = gen_const_vec_duplicate (mode, expand_normal (node));
> >>                 emit_move_insn (target, src);
> >>                 break;
> >>               }
> >>             …
> >>           }
> >>
> >> if that code was the source of the constant operand.  If we're adding a
> >> new use of vec_duplicate_optab then that should be similarly protected
> >> against constant operands.
> >>
> >
> > Your comments apply to my initial vec_duplicate patch that caused the
> > gcc.dg/pr100239.c failure.  It has been fixed by
> >
> > commit ffe3a37f54ab866d85bdde48c2a32be5e09d8515
> > Author: Richard Biener <rguenther@suse.de>
> > Date:   Mon Jun 7 20:08:13 2021 +0200
> >
> >     middle-end/100951 - make sure to generate VECTOR_CST in lowering
> >
> >     When vector lowering creates piecewise ops make sure to create
> >     VECTOR_CSTs instead of CONSTRUCTORs when possible.
> >
> > The problem I am running into now is in my memset vector broadcast
> > patch.  In order to optimize vector broadcast for memset, I need to
> > generate a pseudo register for
> >
> >  __builtin_memset (ops, 3, 38);
> >
> > only when vector broadcast is available:
> >
> >   rtx target = nullptr;
> >
> >   unsigned int nunits = GET_MODE_SIZE (mode) / GET_MODE_SIZE (QImode);
> >   machine_mode vector_mode;
> >   if (!mode_for_vector (QImode, nunits).exists (&vector_mode))
> >     gcc_unreachable ();
> >
> >   enum insn_code icode = optab_handler (vec_duplicate_optab,
> >                                         vector_mode);
> >   if (icode != CODE_FOR_nothing)
> >     {
> >       rtx reg = targetm.gen_memset_scratch_rtx (vector_mode);
> >       class expand_operand ops[2];
> >       create_output_operand (&ops[0], reg, vector_mode);
> >       create_input_operand (&ops[1], data, QImode);
> >       if (maybe_expand_insn (icode, 2, ops))
> >         {
> >           if (!rtx_equal_p (reg, ops[0].value))
> >             emit_move_insn (reg, ops[0].value);
> >           target = lowpart_subreg (mode, reg, vector_mode);
> >         }
> >     }
> >
> >   return target;  <<< Return nullptr to load from constant pool.
>
> I don't think this is a correct use of vec_duplicate_optab.  If the
> scalar operand is a constant then the move should always go through
> the move expanders instead, as a move from a CONST_VECTOR.

Like this?

  enum insn_code icode = optab_handler (vec_duplicate_optab,
                                        vector_mode);
  if (icode != CODE_FOR_nothing)
    {
      rtx reg = targetm.gen_memset_scratch_rtx (vector_mode);
      if (CONST_INT_P (data))
        {
          /* Use the move expander with CONST_VECTOR.  */
          rtvec v = rtvec_alloc (nunits);
          for (unsigned int i = 0; i < nunits; i++)
            RTVEC_ELT (v, i) = data;
          rtx const_vec = gen_rtx_CONST_VECTOR (vector_mode, v);
          emit_move_insn (reg, const_vec);
        }
      else
        {
          class expand_operand ops[2];
          create_output_operand (&ops[0], reg, vector_mode);
          create_input_operand (&ops[1], data, QImode);
          expand_insn (icode, 2, ops);
          if (!rtx_equal_p (reg, ops[0].value))
            emit_move_insn (reg, ops[0].value);
        }
      target = lowpart_subreg (mode, reg, vector_mode);
    }
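
For readers following the thread, the decision being discussed can be modelled outside of GCC.  The following is only a sketch under stated assumptions, not GCC code: ConstByte, RegByte, Target, Strategy, choose_broadcast and make_const_vector are hypothetical stand-ins for a CONST_INT operand, a non-constant operand, the presence of a vec_duplicate handler, and the CONST_VECTOR built above.  It shows that a constant scalar takes the move-expander path, a non-constant scalar takes the vec_duplicate path, and no handler means the caller falls back (to the constant pool in the memset case).

```cpp
#include <cstdint>
#include <optional>
#include <variant>
#include <vector>

// Toy stand-ins for GCC concepts.  None of these are real GCC types or APIs.
struct ConstByte { std::uint8_t value; };  // models a CONST_INT scalar operand
struct RegByte   { int regno; };           // models a scalar held in a register
using Scalar = std::variant<ConstByte, RegByte>;

// Models whether the target provides a vec_duplicate pattern for the mode.
struct Target { bool has_vec_duplicate; };

enum class Strategy { ConstVectorMove, VecDuplicate };

// Pick how to fill a vector register with copies of DATA.  Returns nullopt
// when there is no vec_duplicate handler, which models leaving "target"
// null so that the caller uses another approach.
std::optional<Strategy> choose_broadcast (const Target &target,
                                          const Scalar &data)
{
  if (!target.has_vec_duplicate)
    return std::nullopt;
  // Constants go through the move path as a uniform constant vector.
  if (std::holds_alternative<ConstByte> (data))
    return Strategy::ConstVectorMove;
  // Only non-constant scalars use the vec_duplicate pattern.
  return Strategy::VecDuplicate;
}

// Models building the uniform constant vector: NUNITS copies of VALUE.
std::vector<std::uint8_t> make_const_vector (std::uint8_t value,
                                             unsigned int nunits)
{
  return std::vector<std::uint8_t> (nunits, value);
}
```

With this model, the byte 3 from the memset example above selects ConstVectorMove on a target with a handler, while a byte that is only known at run time selects VecDuplicate.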


-- 
H.J.

Thread overview: 12+ messages
2021-06-26 20:02 [PATCH v5 0/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast H.J. Lu
2021-06-26 20:02 ` [PATCH v5 1/2] " H.J. Lu
2021-06-28  1:48   ` Hongtao Liu
2021-06-29  0:40     ` H.J. Lu
2021-06-26 20:02 ` [PATCH v5 2/2] x86: Add vec_duplicate<mode> expander H.J. Lu
2021-06-27  8:43   ` Richard Sandiford
2021-06-27 11:29     ` H.J. Lu
2021-06-27 21:00       ` Richard Sandiford
2021-06-28 12:16         ` H.J. Lu
2021-06-28 12:36           ` Richard Sandiford
2021-06-28 19:38             ` H.J. Lu [this message]
2021-06-29  8:17               ` Richard Sandiford
