public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Uros Bizjak <ubizjak@gmail.com>
To: Hongtao Liu <crazylht@gmail.com>
Cc: Jan Hubicka <hubicka@ucw.cz>, liuhongt <hongtao.liu@intel.com>,
	gcc-patches@gcc.gnu.org,  hjl.tools@gmail.com,
	Richard Biener <richard.guenther@gmail.com>
Subject: Re: [PATCH] Take register pressure into account for vec_construct/scalar_to_vec when the components are not loaded from memory.
Date: Mon, 4 Dec 2023 08:51:47 +0100	[thread overview]
Message-ID: <CAFULd4YMD4dKBui33Wne+fh8Yje_woTGs6b+yd3w=deehW79VA@mail.gmail.com> (raw)
In-Reply-To: <CAMZc-bx9vNTxq6PnCVn4KvyxpXO_QsevSrdxq7x09PkEUBbFmw@mail.gmail.com>

On Mon, Dec 4, 2023 at 8:11 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Fri, Dec 1, 2023 at 10:26 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Fri, Dec 1, 2023 at 3:39 AM liuhongt <hongtao.liu@intel.com> wrote:
> > >
> > > > Hmm, I would suggest you put reg_needed into the class and accumulate
> > > > over all vec_construct, with your patch you pessimize a single v32qi
> > > > over two separate v16qi for example.  Also currently the whole block is
> > > > gated with INTEGRAL_TYPE_P but register pressure would be also
> > > > a concern for floating point vectors.  finish_cost would then apply an
> > > > adjustment.
> > >
> > > Changed.
> > >
> > > > 'target_avail_regs' is for GENERAL_REGS, does that include APX regs?
> > > > I don't see anything similar for FP regs, but I guess the target should know
> > > > or maybe there's a #regs in regclass query already.
> > > Haven't see any, use below setting.
> > >
> > > unsigned target_avail_sse = TARGET_64BIT ? (TARGET_AVX512F ? 32 : 16) : 8;
> > >
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > > No big impact on SPEC2017.
> > > Observe 1 big improvement from other benchmark by avoiding vectorization with
> > > vec_construct v32qi which caused lots of spills.
> > >
> > > Ok for trunk?
> >
> > LGTM, let's see what x86 maintainers think.
> +Honza and Uros.
> Any comments?

I have no comment on vector stuff, I think you are the most
experienced developer in this area.

Uros.

> >
> > Richard.
> >
> > > For vec_contruct, the components must be live at the same time if
> > > they're not loaded from memory, when the number of those components
> > > exceeds available registers, spill happens. Try to account that with a
> > > rough estimation.
> > > ??? Ideally, we should have an overall estimation of register pressure
> > > if we know the live range of all variables.
> > >
> > > gcc/ChangeLog:
> > >
> > >         * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
> > >         Count sse_reg/gpr_regs for components not loaded from memory.
> > >         (ix86_vector_costs:ix86_vector_costs): New constructor.
> > >         (ix86_vector_costs::m_num_gpr_needed[3]): New private memeber.
> > >         (ix86_vector_costs::m_num_sse_needed[3]): Ditto.
> > >         (ix86_vector_costs::finish_cost): Estimate overall register
> > >         pressure cost.
> > >         (ix86_vector_costs::ix86_vect_estimate_reg_pressure): New
> > >         function.
> > > ---
> > >  gcc/config/i386/i386.cc | 54 ++++++++++++++++++++++++++++++++++++++---
> > >  1 file changed, 50 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > index 9390f525b99..dcaea6c2096 100644
> > > --- a/gcc/config/i386/i386.cc
> > > +++ b/gcc/config/i386/i386.cc
> > > @@ -24562,15 +24562,34 @@ ix86_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info *if_info)
> > >  /* x86-specific vector costs.  */
> > >  class ix86_vector_costs : public vector_costs
> > >  {
> > > -  using vector_costs::vector_costs;
> > > +public:
> > > +  ix86_vector_costs (vec_info *, bool);
> > >
> > >    unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> > >                               stmt_vec_info stmt_info, slp_tree node,
> > >                               tree vectype, int misalign,
> > >                               vect_cost_model_location where) override;
> > >    void finish_cost (const vector_costs *) override;
> > > +
> > > +private:
> > > +
> > > +  /* Estimate register pressure of the vectorized code.  */
> > > +  void ix86_vect_estimate_reg_pressure ();
> > > +  /* Number of GENERAL_REGS/SSE_REGS used in the vectorizer, it's used for
> > > +     estimation of register pressure.
> > > +     ??? Currently it's only used by vec_construct/scalar_to_vec
> > > +     where we know it's not loaded from memory.  */
> > > +  unsigned m_num_gpr_needed[3];
> > > +  unsigned m_num_sse_needed[3];
> > >  };
> > >
> > > +ix86_vector_costs::ix86_vector_costs (vec_info* vinfo, bool costing_for_scalar)
> > > +  : vector_costs (vinfo, costing_for_scalar),
> > > +    m_num_gpr_needed (),
> > > +    m_num_sse_needed ()
> > > +{
> > > +}
> > > +
> > >  /* Implement targetm.vectorize.create_costs.  */
> > >
> > >  static vector_costs *
> > > @@ -24748,8 +24767,7 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> > >      }
> > >    else if ((kind == vec_construct || kind == scalar_to_vec)
> > >            && node
> > > -          && SLP_TREE_DEF_TYPE (node) == vect_external_def
> > > -          && INTEGRAL_TYPE_P (TREE_TYPE (vectype)))
> > > +          && SLP_TREE_DEF_TYPE (node) == vect_external_def)
> > >      {
> > >        stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
> > >        unsigned i;
> > > @@ -24785,7 +24803,15 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> > >                   && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
> > >                       || !VECTOR_TYPE_P (TREE_TYPE
> > >                                 (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
> > > -           stmt_cost += ix86_cost->sse_to_integer;
> > > +           {
> > > +             if (fp)
> > > +               m_num_sse_needed[where]++;
> > > +             else
> > > +               {
> > > +                 m_num_gpr_needed[where]++;
> > > +                 stmt_cost += ix86_cost->sse_to_integer;
> > > +               }
> > > +           }
> > >         }
> > >        FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op)
> > >         if (TREE_CODE (op) == SSA_NAME)
> > > @@ -24821,6 +24847,24 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> > >    return retval;
> > >  }
> > >
> > > +void
> > > +ix86_vector_costs::ix86_vect_estimate_reg_pressure ()
> > > +{
> > > +  unsigned gpr_spill_cost = COSTS_N_INSNS (ix86_cost->int_store [2]) / 2;
> > > +  unsigned sse_spill_cost = COSTS_N_INSNS (ix86_cost->sse_store[0]) / 2;
> > > +
> > > +  /* Any better way to have target available fp registers, currently use SSE_REGS.  */
> > > +  unsigned target_avail_sse = TARGET_64BIT ? (TARGET_AVX512F ? 32 : 16) : 8;
> > > +  for (unsigned i = 0; i != 3; i++)
> > > +    {
> > > +      if (m_num_gpr_needed[i] > target_avail_regs)
> > > +       m_costs[i] += gpr_spill_cost * (m_num_gpr_needed[i] - target_avail_regs);
> > > +      /* Only measure sse registers pressure.  */
> > > +      if (TARGET_SSE && (m_num_sse_needed[i] > target_avail_sse))
> > > +       m_costs[i] += sse_spill_cost * (m_num_sse_needed[i] - target_avail_sse);
> > > +    }
> > > +}
> > > +
> > >  void
> > >  ix86_vector_costs::finish_cost (const vector_costs *scalar_costs)
> > >  {
> > > @@ -24843,6 +24887,8 @@ ix86_vector_costs::finish_cost (const vector_costs *scalar_costs)
> > >         m_costs[vect_body] = INT_MAX;
> > >      }
> > >
> > > +  ix86_vect_estimate_reg_pressure ();
> > > +
> > >    vector_costs::finish_cost (scalar_costs);
> > >  }
> > >
> > > --
> > > 2.31.1
> > >
>
>
>
> --
> BR,
> Hongtao

  reply	other threads:[~2023-12-04  7:52 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-28  7:52 [PATCH] Take register pressure into account for vec_construct " liuhongt
2023-11-29  7:47 ` Richard Biener
2023-11-29  8:00   ` Hongtao Liu
2023-12-01  2:38   ` [PATCH] Take register pressure into account for vec_construct/scalar_to_vec " liuhongt
2023-12-01 14:26     ` Richard Biener
2023-12-04  7:10       ` Hongtao Liu
2023-12-04  7:51         ` Uros Bizjak [this message]
2023-12-05  2:47           ` Hongtao Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFULd4YMD4dKBui33Wne+fh8Yje_woTGs6b+yd3w=deehW79VA@mail.gmail.com' \
    --to=ubizjak@gmail.com \
    --cc=crazylht@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hjl.tools@gmail.com \
    --cc=hongtao.liu@intel.com \
    --cc=hubicka@ucw.cz \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).