public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Hongtao Liu <crazylht@gmail.com>
To: Jan Hubicka <hubicka@ucw.cz>, Uros Bizjak <ubizjak@gmail.com>
Cc: liuhongt <hongtao.liu@intel.com>,
	gcc-patches@gcc.gnu.org, hjl.tools@gmail.com,
	 Richard Biener <richard.guenther@gmail.com>
Subject: Re: [PATCH] Take register pressure into account for vec_construct/scalar_to_vec when the components are not loaded from memory.
Date: Mon, 4 Dec 2023 15:10:52 +0800	[thread overview]
Message-ID: <CAMZc-bx9vNTxq6PnCVn4KvyxpXO_QsevSrdxq7x09PkEUBbFmw@mail.gmail.com> (raw)
In-Reply-To: <CAFiYyc1dbF9EbMPZ_wOVjXzv-4qoRfS5=5GLK7VoCzD5kbHxBg@mail.gmail.com>

On Fri, Dec 1, 2023 at 10:26 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Fri, Dec 1, 2023 at 3:39 AM liuhongt <hongtao.liu@intel.com> wrote:
> >
> > > Hmm, I would suggest you put reg_needed into the class and accumulate
> > > over all vec_construct, with your patch you pessimize a single v32qi
> > > over two separate v16qi for example.  Also currently the whole block is
> > > gated with INTEGRAL_TYPE_P but register pressure would be also
> > > a concern for floating point vectors.  finish_cost would then apply an
> > > adjustment.
> >
> > Changed.
> >
> > > 'target_avail_regs' is for GENERAL_REGS, does that include APX regs?
> > > I don't see anything similar for FP regs, but I guess the target should know
> > > or maybe there's a #regs in regclass query already.
> > Haven't see any, use below setting.
> >
> > unsigned target_avail_sse = TARGET_64BIT ? (TARGET_AVX512F ? 32 : 16) : 8;
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > No big impact on SPEC2017.
> > Observe 1 big improvement from other benchmark by avoiding vectorization with
> > vec_construct v32qi which caused lots of spills.
> >
> > Ok for trunk?
>
> LGTM, let's see what x86 maintainers think.
+Honza and Uros.
Any comments?
>
> Richard.
>
> > For vec_contruct, the components must be live at the same time if
> > they're not loaded from memory, when the number of those components
> > exceeds available registers, spill happens. Try to account that with a
> > rough estimation.
> > ??? Ideally, we should have an overall estimation of register pressure
> > if we know the live range of all variables.
> >
> > gcc/ChangeLog:
> >
> >         * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
> >         Count sse_reg/gpr_regs for components not loaded from memory.
> >         (ix86_vector_costs:ix86_vector_costs): New constructor.
> >         (ix86_vector_costs::m_num_gpr_needed[3]): New private memeber.
> >         (ix86_vector_costs::m_num_sse_needed[3]): Ditto.
> >         (ix86_vector_costs::finish_cost): Estimate overall register
> >         pressure cost.
> >         (ix86_vector_costs::ix86_vect_estimate_reg_pressure): New
> >         function.
> > ---
> >  gcc/config/i386/i386.cc | 54 ++++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 50 insertions(+), 4 deletions(-)
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index 9390f525b99..dcaea6c2096 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -24562,15 +24562,34 @@ ix86_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info *if_info)
> >  /* x86-specific vector costs.  */
> >  class ix86_vector_costs : public vector_costs
> >  {
> > -  using vector_costs::vector_costs;
> > +public:
> > +  ix86_vector_costs (vec_info *, bool);
> >
> >    unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> >                               stmt_vec_info stmt_info, slp_tree node,
> >                               tree vectype, int misalign,
> >                               vect_cost_model_location where) override;
> >    void finish_cost (const vector_costs *) override;
> > +
> > +private:
> > +
> > +  /* Estimate register pressure of the vectorized code.  */
> > +  void ix86_vect_estimate_reg_pressure ();
> > +  /* Number of GENERAL_REGS/SSE_REGS used in the vectorizer, it's used for
> > +     estimation of register pressure.
> > +     ??? Currently it's only used by vec_construct/scalar_to_vec
> > +     where we know it's not loaded from memory.  */
> > +  unsigned m_num_gpr_needed[3];
> > +  unsigned m_num_sse_needed[3];
> >  };
> >
> > +ix86_vector_costs::ix86_vector_costs (vec_info* vinfo, bool costing_for_scalar)
> > +  : vector_costs (vinfo, costing_for_scalar),
> > +    m_num_gpr_needed (),
> > +    m_num_sse_needed ()
> > +{
> > +}
> > +
> >  /* Implement targetm.vectorize.create_costs.  */
> >
> >  static vector_costs *
> > @@ -24748,8 +24767,7 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> >      }
> >    else if ((kind == vec_construct || kind == scalar_to_vec)
> >            && node
> > -          && SLP_TREE_DEF_TYPE (node) == vect_external_def
> > -          && INTEGRAL_TYPE_P (TREE_TYPE (vectype)))
> > +          && SLP_TREE_DEF_TYPE (node) == vect_external_def)
> >      {
> >        stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
> >        unsigned i;
> > @@ -24785,7 +24803,15 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> >                   && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
> >                       || !VECTOR_TYPE_P (TREE_TYPE
> >                                 (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
> > -           stmt_cost += ix86_cost->sse_to_integer;
> > +           {
> > +             if (fp)
> > +               m_num_sse_needed[where]++;
> > +             else
> > +               {
> > +                 m_num_gpr_needed[where]++;
> > +                 stmt_cost += ix86_cost->sse_to_integer;
> > +               }
> > +           }
> >         }
> >        FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op)
> >         if (TREE_CODE (op) == SSA_NAME)
> > @@ -24821,6 +24847,24 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> >    return retval;
> >  }
> >
> > +void
> > +ix86_vector_costs::ix86_vect_estimate_reg_pressure ()
> > +{
> > +  unsigned gpr_spill_cost = COSTS_N_INSNS (ix86_cost->int_store [2]) / 2;
> > +  unsigned sse_spill_cost = COSTS_N_INSNS (ix86_cost->sse_store[0]) / 2;
> > +
> > +  /* Any better way to have target available fp registers, currently use SSE_REGS.  */
> > +  unsigned target_avail_sse = TARGET_64BIT ? (TARGET_AVX512F ? 32 : 16) : 8;
> > +  for (unsigned i = 0; i != 3; i++)
> > +    {
> > +      if (m_num_gpr_needed[i] > target_avail_regs)
> > +       m_costs[i] += gpr_spill_cost * (m_num_gpr_needed[i] - target_avail_regs);
> > +      /* Only measure sse registers pressure.  */
> > +      if (TARGET_SSE && (m_num_sse_needed[i] > target_avail_sse))
> > +       m_costs[i] += sse_spill_cost * (m_num_sse_needed[i] - target_avail_sse);
> > +    }
> > +}
> > +
> >  void
> >  ix86_vector_costs::finish_cost (const vector_costs *scalar_costs)
> >  {
> > @@ -24843,6 +24887,8 @@ ix86_vector_costs::finish_cost (const vector_costs *scalar_costs)
> >         m_costs[vect_body] = INT_MAX;
> >      }
> >
> > +  ix86_vect_estimate_reg_pressure ();
> > +
> >    vector_costs::finish_cost (scalar_costs);
> >  }
> >
> > --
> > 2.31.1
> >



-- 
BR,
Hongtao

  reply	other threads:[~2023-12-04  7:11 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-28  7:52 [PATCH] Take register pressure into account for vec_construct " liuhongt
2023-11-29  7:47 ` Richard Biener
2023-11-29  8:00   ` Hongtao Liu
2023-12-01  2:38   ` [PATCH] Take register pressure into account for vec_construct/scalar_to_vec " liuhongt
2023-12-01 14:26     ` Richard Biener
2023-12-04  7:10       ` Hongtao Liu [this message]
2023-12-04  7:51         ` Uros Bizjak
2023-12-05  2:47           ` Hongtao Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMZc-bx9vNTxq6PnCVn4KvyxpXO_QsevSrdxq7x09PkEUBbFmw@mail.gmail.com \
    --to=crazylht@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hjl.tools@gmail.com \
    --cc=hongtao.liu@intel.com \
    --cc=hubicka@ucw.cz \
    --cc=richard.guenther@gmail.com \
    --cc=ubizjak@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).