From: Hongtao Liu <crazylht@gmail.com>
To: Jan Hubicka <hubicka@ucw.cz>, Uros Bizjak <ubizjak@gmail.com>
Cc: liuhongt <hongtao.liu@intel.com>,
gcc-patches@gcc.gnu.org, hjl.tools@gmail.com,
Richard Biener <richard.guenther@gmail.com>
Subject: Re: [PATCH] Take register pressure into account for vec_construct/scalar_to_vec when the components are not loaded from memory.
Date: Mon, 4 Dec 2023 15:10:52 +0800 [thread overview]
Message-ID: <CAMZc-bx9vNTxq6PnCVn4KvyxpXO_QsevSrdxq7x09PkEUBbFmw@mail.gmail.com> (raw)
In-Reply-To: <CAFiYyc1dbF9EbMPZ_wOVjXzv-4qoRfS5=5GLK7VoCzD5kbHxBg@mail.gmail.com>
On Fri, Dec 1, 2023 at 10:26 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Fri, Dec 1, 2023 at 3:39 AM liuhongt <hongtao.liu@intel.com> wrote:
> >
> > > Hmm, I would suggest you put reg_needed into the class and accumulate
> > > over all vec_construct, with your patch you pessimize a single v32qi
> > > over two separate v16qi for example. Also currently the whole block is
> > > gated with INTEGRAL_TYPE_P but register pressure would be also
> > > a concern for floating point vectors. finish_cost would then apply an
> > > adjustment.
> >
> > Changed.
> >
> > > 'target_avail_regs' is for GENERAL_REGS, does that include APX regs?
> > > I don't see anything similar for FP regs, but I guess the target should know
> > > or maybe there's a #regs in regclass query already.
> > Haven't see any, use below setting.
> >
> > unsigned target_avail_sse = TARGET_64BIT ? (TARGET_AVX512F ? 32 : 16) : 8;
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > No big impact on SPEC2017.
> > Observe 1 big improvement from other benchmark by avoiding vectorization with
> > vec_construct v32qi which caused lots of spills.
> >
> > Ok for trunk?
>
> LGTM, let's see what x86 maintainers think.
+Honza and Uros.
Any comments?
>
> Richard.
>
> > For vec_contruct, the components must be live at the same time if
> > they're not loaded from memory, when the number of those components
> > exceeds available registers, spill happens. Try to account that with a
> > rough estimation.
> > ??? Ideally, we should have an overall estimation of register pressure
> > if we know the live range of all variables.
> >
> > gcc/ChangeLog:
> >
> > * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
> > Count sse_reg/gpr_regs for components not loaded from memory.
> > (ix86_vector_costs:ix86_vector_costs): New constructor.
> > (ix86_vector_costs::m_num_gpr_needed[3]): New private memeber.
> > (ix86_vector_costs::m_num_sse_needed[3]): Ditto.
> > (ix86_vector_costs::finish_cost): Estimate overall register
> > pressure cost.
> > (ix86_vector_costs::ix86_vect_estimate_reg_pressure): New
> > function.
> > ---
> > gcc/config/i386/i386.cc | 54 ++++++++++++++++++++++++++++++++++++++---
> > 1 file changed, 50 insertions(+), 4 deletions(-)
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index 9390f525b99..dcaea6c2096 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -24562,15 +24562,34 @@ ix86_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info *if_info)
> > /* x86-specific vector costs. */
> > class ix86_vector_costs : public vector_costs
> > {
> > - using vector_costs::vector_costs;
> > +public:
> > + ix86_vector_costs (vec_info *, bool);
> >
> > unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> > stmt_vec_info stmt_info, slp_tree node,
> > tree vectype, int misalign,
> > vect_cost_model_location where) override;
> > void finish_cost (const vector_costs *) override;
> > +
> > +private:
> > +
> > + /* Estimate register pressure of the vectorized code. */
> > + void ix86_vect_estimate_reg_pressure ();
> > + /* Number of GENERAL_REGS/SSE_REGS used in the vectorizer, it's used for
> > + estimation of register pressure.
> > + ??? Currently it's only used by vec_construct/scalar_to_vec
> > + where we know it's not loaded from memory. */
> > + unsigned m_num_gpr_needed[3];
> > + unsigned m_num_sse_needed[3];
> > };
> >
> > +ix86_vector_costs::ix86_vector_costs (vec_info* vinfo, bool costing_for_scalar)
> > + : vector_costs (vinfo, costing_for_scalar),
> > + m_num_gpr_needed (),
> > + m_num_sse_needed ()
> > +{
> > +}
> > +
> > /* Implement targetm.vectorize.create_costs. */
> >
> > static vector_costs *
> > @@ -24748,8 +24767,7 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> > }
> > else if ((kind == vec_construct || kind == scalar_to_vec)
> > && node
> > - && SLP_TREE_DEF_TYPE (node) == vect_external_def
> > - && INTEGRAL_TYPE_P (TREE_TYPE (vectype)))
> > + && SLP_TREE_DEF_TYPE (node) == vect_external_def)
> > {
> > stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
> > unsigned i;
> > @@ -24785,7 +24803,15 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> > && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
> > || !VECTOR_TYPE_P (TREE_TYPE
> > (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
> > - stmt_cost += ix86_cost->sse_to_integer;
> > + {
> > + if (fp)
> > + m_num_sse_needed[where]++;
> > + else
> > + {
> > + m_num_gpr_needed[where]++;
> > + stmt_cost += ix86_cost->sse_to_integer;
> > + }
> > + }
> > }
> > FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op)
> > if (TREE_CODE (op) == SSA_NAME)
> > @@ -24821,6 +24847,24 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> > return retval;
> > }
> >
> > +void
> > +ix86_vector_costs::ix86_vect_estimate_reg_pressure ()
> > +{
> > + unsigned gpr_spill_cost = COSTS_N_INSNS (ix86_cost->int_store [2]) / 2;
> > + unsigned sse_spill_cost = COSTS_N_INSNS (ix86_cost->sse_store[0]) / 2;
> > +
> > + /* Any better way to have target available fp registers, currently use SSE_REGS. */
> > + unsigned target_avail_sse = TARGET_64BIT ? (TARGET_AVX512F ? 32 : 16) : 8;
> > + for (unsigned i = 0; i != 3; i++)
> > + {
> > + if (m_num_gpr_needed[i] > target_avail_regs)
> > + m_costs[i] += gpr_spill_cost * (m_num_gpr_needed[i] - target_avail_regs);
> > + /* Only measure sse registers pressure. */
> > + if (TARGET_SSE && (m_num_sse_needed[i] > target_avail_sse))
> > + m_costs[i] += sse_spill_cost * (m_num_sse_needed[i] - target_avail_sse);
> > + }
> > +}
> > +
> > void
> > ix86_vector_costs::finish_cost (const vector_costs *scalar_costs)
> > {
> > @@ -24843,6 +24887,8 @@ ix86_vector_costs::finish_cost (const vector_costs *scalar_costs)
> > m_costs[vect_body] = INT_MAX;
> > }
> >
> > + ix86_vect_estimate_reg_pressure ();
> > +
> > vector_costs::finish_cost (scalar_costs);
> > }
> >
> > --
> > 2.31.1
> >
--
BR,
Hongtao
next prev parent reply other threads:[~2023-12-04 7:11 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-28 7:52 [PATCH] Take register pressure into account for vec_construct " liuhongt
2023-11-29 7:47 ` Richard Biener
2023-11-29 8:00 ` Hongtao Liu
2023-12-01 2:38 ` [PATCH] Take register pressure into account for vec_construct/scalar_to_vec " liuhongt
2023-12-01 14:26 ` Richard Biener
2023-12-04 7:10 ` Hongtao Liu [this message]
2023-12-04 7:51 ` Uros Bizjak
2023-12-05 2:47 ` Hongtao Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAMZc-bx9vNTxq6PnCVn4KvyxpXO_QsevSrdxq7x09PkEUBbFmw@mail.gmail.com \
--to=crazylht@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=hjl.tools@gmail.com \
--cc=hongtao.liu@intel.com \
--cc=hubicka@ucw.cz \
--cc=richard.guenther@gmail.com \
--cc=ubizjak@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).