Re: [PATCH] RISC-V: costs: support shift-and-add in strength-reduction

public inbox for gcc-patches@gcc.gnu.org

From: Philipp Tomsich <philipp.tomsich@vrull.eu>
To: Palmer Dabbelt <palmer@rivosinc.com>
Cc: gcc-patches@gcc.gnu.org, Kito Cheng <kito.cheng@gmail.com>,
	 Vineet Gupta <vineetg@rivosinc.com>,
	christoph.muellner@vrull.eu, jlaw@ventanamicro.com
Subject: Re: [PATCH] RISC-V: costs: support shift-and-add in strength-reduction
Date: Thu, 10 Nov 2022 16:09:35 +0100
Message-ID: <CAAeLtUBG_CBWLFpmcWgA+Tg4hVesovrtD_=_78gusn=-ChnTmA@mail.gmail.com>
In-Reply-To: <mhng-3a1b1869-3786-43ae-b543-e5e245ded6d4@palmer-ri-x1c9a>

On Thu, 10 Nov 2022 at 02:46, Palmer Dabbelt <palmer@rivosinc.com> wrote:
>
> On Tue, 08 Nov 2022 11:54:34 PST (-0800), philipp.tomsich@vrull.eu wrote:
> > The strength-reduction implementation in expmed.cc assesses the
> > profitability of shift-and-add using an RTL expression that wraps
> > a MULT (by a power of 2) in a PLUS.  Unless the RISC-V rtx_costs
> > function recognizes this as expressing a sh[123]add instruction, we
> > will return an inflated cost, thus defeating the optimization.
> >
> > This change adds the necessary idiom recognition to provide an
> > accurate cost for this form of expressing sh[123]add.
> >
> > Instead of expanding to
> >       li      a5,200
> >       mulw    a0,a5,a0
> > with this change, the expression 'a * 200' is synthesized as:
> >       sh2add  a0,a0,a0   // *5 = a + 4 * a
> >       sh2add  a0,a0,a0   // *5 = a + 4 * a
> >       slli    a0,a0,3    // *8
>
> That's more instructions, but multiplication is generally expensive.  At
> some point I remember the SiFive cores getting very fast integer
> multipliers, but I don't see that reflected in the cost model anywhere
> so maybe I'm just wrong?  Andrew or Kito might remember...
>
> If the mul-based sequences are still faster on the SiFive cores then we
> should probably find a way to keep emitting them, which may just be a
> matter of adjusting those multiply costs.  Moving to the shift-based
> sequences seems reasonable for a generic target, though.

The cost for a regular MULT is COSTS_N_INSNS(4) for the series-7 (see
the SImode and DImode entries in the int_mul line):
/* Costs to use when optimizing for Sifive 7 Series.  */
static const struct riscv_tune_param sifive_7_tune_info = {
  {COSTS_N_INSNS (4), COSTS_N_INSNS (5)},       /* fp_add */
  {COSTS_N_INSNS (4), COSTS_N_INSNS (5)},       /* fp_mul */
  {COSTS_N_INSNS (20), COSTS_N_INSNS (20)},     /* fp_div */
  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},       /* int_mul */
  {COSTS_N_INSNS (6), COSTS_N_INSNS (6)},       /* int_div */
  2,                                            /* issue_rate */
  4,                                            /* branch_cost */
  3,                                            /* memory_cost */
  8,                                            /* fmv_cost */
  true,                                         /* slow_unaligned_access */
};

So the break-even is at COSTS_N_INSNS(4) + rtx_cost(immediate).
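
For illustration only, the break-even arithmetic can be sketched in C.
COSTS_N_INSNS mirrors the definition in rtl.h; the helper names, the
assumption that the li costs exactly one instruction, and the tie going
to the shift/add sequence are simplifying assumptions, not how
expmed.cc actually decides:

```c
#include <stdbool.h>

/* Mirrors GCC's COSTS_N_INSNS from rtl.h: one fast instruction costs 4 units. */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical helper: cost of "li + mul", given the int_mul entry of the
   tuning table (in instructions) and an ASSUMED instruction count for
   materializing the immediate.  */
static int mul_sequence_cost(int int_mul_insns, int li_insns)
{
    return COSTS_N_INSNS(int_mul_insns) + COSTS_N_INSNS(li_insns);
}

/* Hypothetical helper: true when a shift/add sequence of n_insns
   single-cost instructions is no more expensive than "li + mul".
   ASSUMPTION: a tie goes to the shift/add sequence.  */
static bool prefer_shift_add(int n_insns, int int_mul_insns, int li_insns)
{
    return COSTS_N_INSNS(n_insns) <= mul_sequence_cost(int_mul_insns, li_insns);
}
```

With the series-7 value of 4 for int_mul and one instruction for the li,
prefer_shift_add(5, 4, 1) holds while prefer_shift_add(6, 4, 1) does not.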

Testing against series-7, we get sequences of up to 5 instructions
(4 for the mul + 1 for the li) from strength reduction:

val * 783
=>
sh1add a5,a0,a0
slli a5,a5,4
add a5,a5,a0
slli a5,a5,4
sub a0,a5,a0

but fall back to a mul once the cost exceeds this:

val * 1574
=>
li a5,1574
mul a0,a0,a5
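
As a sanity check on the shift/add sequences, here is a small C model
of the instructions involved.  The function names are made up for this
sketch, and everything is modelled as 64-bit unsigned arithmetic (so
the word-sized mulw of the 'a * 200' case is not modelled exactly):

```c
#include <stdint.h>

/* Model of Zba shNadd rd,rs1,rs2: rd = (rs1 << N) + rs2.  */
static uint64_t shnadd(unsigned n, uint64_t rs1, uint64_t rs2)
{
    return (rs1 << n) + rs2;
}

/* The 5-instruction sequence shown for val * 783.  */
static uint64_t mul783(uint64_t a0)
{
    uint64_t a5 = shnadd(1, a0, a0); /* sh1add a5,a0,a0 : 3 * a   */
    a5 <<= 4;                        /* slli   a5,a5,4  : 48 * a  */
    a5 += a0;                        /* add    a5,a5,a0 : 49 * a  */
    a5 <<= 4;                        /* slli   a5,a5,4  : 784 * a */
    return a5 - a0;                  /* sub    a0,a5,a0 : 783 * a */
}

/* The 3-instruction sequence from the commit message for a * 200.  */
static uint64_t mul200(uint64_t a0)
{
    a0 = shnadd(2, a0, a0);          /* sh2add a0,a0,a0 : 5 * a   */
    a0 = shnadd(2, a0, a0);          /* sh2add a0,a0,a0 : 25 * a  */
    return a0 << 3;                  /* slli   a0,a0,3  : 200 * a */
}
```

Both functions agree with a plain multiplication modulo 2^64, e.g.
mul783(7) == 7 * 783 and mul200(3) == 600.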

> Either way, it probably warrants a test case to make sure we don't
> regress in the future.

Ack. Will be added for v2.

>
> >
> > gcc/ChangeLog:
> >
> >       * config/riscv/riscv.c (riscv_rtx_costs): Recognize shNadd,
> >       if expressed as a plus and multiplication with a power-of-2.

This will still need to be regenerated (it still refers to the old
'.c' extension).

> >
> > ---
> >
> >  gcc/config/riscv/riscv.cc | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index ab6c745c722..0b2c4b3599d 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -2451,6 +2451,19 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
> >         *total = COSTS_N_INSNS (1);
> >         return true;
> >       }
> > +      /* Before strength-reduction, the shNadd can be expressed as the addition
> > +      of a multiplication with a power-of-two.  If this case is not handled,
> > +      the strength-reduction in expmed.cc will calculate an inflated cost. */
> > +      if (TARGET_ZBA
> > +       && mode == word_mode
> > +       && GET_CODE (XEXP (x, 0)) == MULT
> > +       && REG_P (XEXP (XEXP (x, 0), 0))
> > +       && CONST_INT_P (XEXP (XEXP (x, 0), 1))
> > +       && IN_RANGE (exact_log2 (INTVAL (XEXP (XEXP (x, 0), 1))), 1, 3))
>
> IIUC the fall-through is biting us here and this matches power-of-2 +1
> and power-of-2 -1.  That looks to be the case for the one below, though,
> so not sure if I'm just missing something?

The strength-reduction in expmed.cc uses "(PLUS (MULT (reg) <pow2>)
(reg))" to express a shift-then-add.
Here's one of the relevant snippets (from the internal costing in expmed.cc):
  all.shift_mult = gen_rtx_MULT (mode, all.reg, all.reg);
  all.shift_add = gen_rtx_PLUS (mode, all.shift_mult, all.reg);

So while we normally encounter a "(PLUS (ASHIFT (reg) <shamt>)
(reg))", for the strength-reduction we also need to provide the
cost for the equivalent expression with a MULT.
The other idioms (those matched above and below the new one) always
require an ASHIFT as the inner operation.
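
To make the two shapes concrete, here is a toy model in C.  None of
this is GCC code: the toy_rtx struct and all toy_* names are
hypothetical stand-ins for RTL, and toy_exact_log2 only follows the
same contract as GCC's exact_log2 (log2 for a power of two, -1
otherwise).  It shows a MULT by 2/4/8 and an ASHIFT by 1/2/3 being
treated as the same shNadd idiom:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for RTL expressions.  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_MULT, TOY_ASHIFT };

struct toy_rtx {
    enum toy_code code;
    long value;                  /* only meaningful for TOY_CONST_INT */
    const struct toy_rtx *op[2]; /* operands of binary codes */
};

/* log2 of v if v is a power of two, otherwise -1.  */
static int toy_exact_log2(long v)
{
    if (v <= 0 || (v & (v - 1)) != 0)
        return -1;
    int n = 0;
    while (v > 1) {
        v >>= 1;
        n++;
    }
    return n;
}

/* True if x looks like (PLUS (MULT reg 2|4|8) ...) or
   (PLUS (ASHIFT reg 1|2|3) ...), i.e. a single shNadd.  */
static bool toy_is_shnadd(const struct toy_rtx *x)
{
    if (x->code != TOY_PLUS)
        return false;
    const struct toy_rtx *inner = x->op[0];
    if (inner->code != TOY_MULT && inner->code != TOY_ASHIFT)
        return false;
    if (inner->op[0]->code != TOY_REG || inner->op[1]->code != TOY_CONST_INT)
        return false;
    long shamt = inner->code == TOY_MULT
                     ? toy_exact_log2(inner->op[1]->value)
                     : inner->op[1]->value;
    return shamt >= 1 && shamt <= 3;
}
```

In this model a MULT by 4 and an ASHIFT by 2 under a PLUS both match,
while a MULT by 16 or by 6 does not.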

>
> > +     {
> > +       *total = COSTS_N_INSNS (1);
> > +       return true;
> > +     }
> >        /* shNadd.uw pattern for zba.
> >        [(set (match_operand:DI 0 "register_operand" "=r")
> >              (plus:DI
