public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Philipp Tomsich <philipp.tomsich@vrull.eu>
To: Kito Cheng <kito.cheng@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
	Manolis Tsamis <manolis.tsamis@vrull.eu>,
	 Jim Wilson <jim.wilson.gcc@gmail.com>,
	Vineet Gupta <vineetg@rivosinc.com>,
	 Palmer Dabbelt <palmer@rivosinc.com>,
	Andrew Waterman <andrew@sifive.com>
Subject: Re: [PATCH v1 3/3] RISC-V: Replace zero_extendsidi2_shifted with generalized split
Date: Tue, 7 Jun 2022 12:50:01 +0200	[thread overview]
Message-ID: <CAAeLtUALEtY0+FfXp2w50JBaJ2LdDxh5zj34q8tUD1x9RX4SXg@mail.gmail.com> (raw)
In-Reply-To: <CA+yXCZDTgMh1o_fbnPZY+UGVG8Q9-Buupk7iiS7JiW=+fJHeqQ@mail.gmail.com>

On Tue, 7 Jun 2022 at 12:24, Kito Cheng <kito.cheng@gmail.com> wrote:
>
> On Wed, May 25, 2022 at 5:47 AM Philipp Tomsich
> <philipp.tomsich@vrull.eu> wrote:
> >
> > The current method of treating shifts of extended values on RISC-V
> > frequently causes sequences of 3 shifts, despite the presence of the
> > 'zero_extendsidi2_shifted' pattern.
> >
> > Consider:
> >     unsigned long f(unsigned int a, unsigned long b)
> >     {
> >             a = a << 1;
> >             unsigned long c = (unsigned long) a;
> >             c = b + (c<<4);
> >             return c;
> >     }
> > which will present at combine-time as:
> >     Trying 7, 8 -> 9:
> >         7: r78:SI=r81:DI#0<<0x1
> >           REG_DEAD r81:DI
> >         8: r79:DI=zero_extend(r78:SI)
> >           REG_DEAD r78:SI
> >         9: r72:DI=r79:DI<<0x4
> >           REG_DEAD r79:DI
> >     Failed to match this instruction:
> >     (set (reg:DI 72 [ _1 ])
> >         (and:DI (ashift:DI (reg:DI 81)
> >                 (const_int 5 [0x5]))
> >         (const_int 68719476704 [0xfffffffe0])))
> > and produce the following (optimized) assembly:
> >     f:
> >         slliw   a5,a0,1
> >         slli    a5,a5,32
> >         srli    a5,a5,28
> >         add     a0,a5,a1
> >         ret
> >
> > The current way of handling this (in 'zero_extendsidi2_shifted')
> > doesn't apply for two reasons:
> > - this is seen before reload, and
> > - (more importantly) the constant mask is not 0xfffffffful.
> >
> > To address this, we introduce a generalized version of shifting
> > zero-extended values that supports any mask of consecutive ones as
> > long as the number of training zeros is the inner shift-amount.
> >
> > With this new split, we generate the following assembly for the
> > aforementioned function:
> >     f:
> >         slli    a0,a0,33
> >         srli    a0,a0,28
> >         add     a0,a0,a1
> >         ret
> >
> > gcc/ChangeLog:
> >
> >         * config/riscv/riscv.md (zero_extendsidi2_shifted): Replace
> >           with a generalized split that requires no clobber, runs
> >           before reload and works for smaller masks.
> >
> > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > ---
> >
> >  gcc/config/riscv/riscv.md | 37 ++++++++++++++++++++-----------------
> >  1 file changed, 20 insertions(+), 17 deletions(-)
> >
> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > index b8ab0cf169a..cc10cd90a74 100644
> > --- a/gcc/config/riscv/riscv.md
> > +++ b/gcc/config/riscv/riscv.md
> > @@ -2119,23 +2119,26 @@ (define_split
> >  ;; occur when unsigned int is used for array indexing.  Split this into two
> >  ;; shifts.  Otherwise we can get 3 shifts.
> >
> > -(define_insn_and_split "zero_extendsidi2_shifted"
> > -  [(set (match_operand:DI 0 "register_operand" "=r")
> > -       (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
> > -                          (match_operand:QI 2 "immediate_operand" "I"))
> > -               (match_operand 3 "immediate_operand" "")))
> > -   (clobber (match_scratch:DI 4 "=&r"))]
> > -  "TARGET_64BIT && !TARGET_ZBA
> > -   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
> > -  "#"
> > -  "&& reload_completed"
> > -  [(set (match_dup 4)
> > -       (ashift:DI (match_dup 1) (const_int 32)))
> > -   (set (match_dup 0)
> > -       (lshiftrt:DI (match_dup 4) (match_dup 5)))]
> > -  "operands[5] = GEN_INT (32 - (INTVAL (operands [2])));"
> > -  [(set_attr "type" "shift")
> > -   (set_attr "mode" "DI")])
> > +(define_split
> > +  [(set (match_operand:DI 0 "register_operand")
> > +       (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
> > +                          (match_operand:QI 2 "immediate_operand"))
> > +               (match_operand:DI 3 "consecutive_bits_operand")))]
> > +  "TARGET_64BIT"
> > +  [(set (match_dup 0) (ashift:DI (match_dup 1) (match_dup 4)))
> > +   (set (match_dup 0) (lshiftrt:DI (match_dup 0) (match_dup 5)))]
>
> I would prefer to keep using another register if possible:
>
> like this:
> +  [(set (match_dup 6) (ashift:DI (match_dup 1) (match_dup 4)))
> +   (set (match_dup 0) (lshiftrt:DI (match_dup 6) (match_dup 5)))]
>
> if (can_create_pseudo_p)
>   operands[6] = gen_reg_rtx (DImode);
> else
>   operands[6] = operands[0];

I don't see the benefit to this (unless you expect opportunities for
CSE), as there will be a linear dependency chain anyway.  I'd like to
understand your reasoning behind this a bit better, as our style
currently generally tries to not avoid introducing temporaries if it
is avoidable.

Thanks,
Philipp.

>
> > +{
> > +       unsigned HOST_WIDE_INT mask = UINTVAL (operands[3]);
> > +       int leading = clz_hwi (mask);
> > +       int trailing = ctz_hwi (mask);
> > +
> > +       /* The shift-amount must match the number of trailing bits */
> > +       if (trailing != UINTVAL (operands[2]))
> > +          FAIL;
> > +
> > +       operands[4] = GEN_INT (leading + trailing);
> > +       operands[5] = GEN_INT (leading);
> > +})
> >
> >  ;;
> >  ;;  ....................
> > --
> > 2.34.1
> >

  reply	other threads:[~2022-06-07 10:50 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-24 21:47 [PATCH v1 0/3] RISC-V: Improve sequences with shifted zero-extended operands Philipp Tomsich
2022-05-24 21:47 ` [PATCH v1 1/3] RISC-V: add consecutive_bits_operand predicate Philipp Tomsich
2022-06-07 10:25   ` Kito Cheng
2022-06-14 11:38     ` Philipp Tomsich
2022-05-24 21:47 ` [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w Philipp Tomsich
2022-06-07 10:25   ` Kito Cheng
2022-06-14 11:39     ` Philipp Tomsich
2022-06-17  8:34   ` Andreas Schwab
2022-06-17 14:00     ` Kito Cheng
2022-06-17 14:19       ` Philipp Tomsich
2022-05-24 21:47 ` [PATCH v1 3/3] RISC-V: Replace zero_extendsidi2_shifted with generalized split Philipp Tomsich
2022-06-07 10:24   ` Kito Cheng
2022-06-07 10:50     ` Philipp Tomsich [this message]
2022-06-07 13:18       ` Kito Cheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAAeLtUALEtY0+FfXp2w50JBaJ2LdDxh5zj34q8tUD1x9RX4SXg@mail.gmail.com \
    --to=philipp.tomsich@vrull.eu \
    --cc=andrew@sifive.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jim.wilson.gcc@gmail.com \
    --cc=kito.cheng@gmail.com \
    --cc=manolis.tsamis@vrull.eu \
    --cc=palmer@rivosinc.com \
    --cc=vineetg@rivosinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).