From: Jeff Law <jeffreyalaw@gmail.com>
To: 钟居哲 <juzhe.zhong@rivai.ai>, gcc-patches <gcc-patches@gcc.gnu.org>
Cc: "kito.cheng" <kito.cheng@gmail.com>,
"kito.cheng" <kito.cheng@sifive.com>, palmer <palmer@dabbelt.com>,
palmer <palmer@rivosinc.com>, "rdapp.gcc" <rdapp.gcc@gmail.com>
Subject: Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
Date: Thu, 29 Jun 2023 17:39:22 -0600 [thread overview]
Message-ID: <62be1be8-280f-2989-6a31-32c87dcbadcd@gmail.com> (raw)
In-Reply-To: <70EE22A0D10CD0D8+2023062906005450585022@rivai.ai>
On 6/28/23 16:00, 钟居哲 wrote:
> You can see here:
>
> https://godbolt.org/z/d78646hWb <https://godbolt.org/z/d78646hWb>
So just to be explicit, I see no difference with that test before/after
your proposed change. Nor would I expect one based on my understanding
of the patch.
The explicit conversions I see are because we need the output of the
conversion in multiple vfmul instructions. That won't be helped by the
patch you've proposed.
To be more concrete:
> vsetvli t1,t5,e32,mf2,ta,ma # 99 [c=0 l=4] vsetvldi
> vle32.v v2,0(a4) # 23 [c=4 l=4] pred_movvnx2sf/1
> vle32.v v1,0(a5) # 25 [c=4 l=4] pred_movvnx2sf/1
> vsetvli t0,zero,e32,mf2,ta,ma # 101 [c=0 l=4] vsetvldi
> vfwcvt.f.f.v v3,v2 # 77 [c=4 l=4] pred_extendvnx2df/0
> vfwcvt.f.f.v v2,v1 # 79 [c=4 l=4] pred_extendvnx2df/0
> vsetvli zero,t1,e32,mf2,ta,ma # 102 [c=0 l=4] vsetvl_discard_resultdi
> vle32.v v5,0(a6) # 31 [c=4 l=4] pred_movvnx2sf/1
> vle32.v v4,0(a7) # 39 [c=4 l=4] pred_movvnx2sf/1
> vsetvli t0,zero,e32,mf2,ta,ma # 103 [c=0 l=4] vsetvldi
> vfwcvt.f.f.v v1,v5 # 81 [c=4 l=4] pred_extendvnx2df/0
> vsetvli zero,zero,e64,m1,ta,ma # 104 [c=16 l=4] vsetvl_vtype_change_only
> vfmul.vv v5,v2,v3 # 29 [c=4 l=4] pred_mulvnx2df/2
> vfmul.vv v2,v1,v2 # 34 [c=4 l=4] pred_mulvnx2df/2
> vsetvli zero,t1,e64,m1,ta,ma # 105 [c=0 l=4] vsetvl_discard_resultdi
> vse64.v v2,0(a1) # 35 [c=4 l=4] pred_storevnx2df
> vse64.v v5,0(a0) # 30 [c=4 l=4] pred_storevnx2df
> vsetvli t6,zero,e64,m1,ta,ma # 106 [c=0 l=4] vsetvldi
> vfmul.vv v1,v1,v3 # 37 [c=4 l=4] pred_mulvnx2df/2
> vsetvli zero,zero,e32,mf2,ta,ma # 107 [c=20 l=4] vsetvl_vtype_change_only
> vfwcvt.f.f.v v2,v4 # 83 [c=4 l=4] pred_extendvnx2df/0
> vsetvli zero,t1,e64,m1,ta,ma # 108 [c=0 l=4] vsetvl_discard_resultdi
> vse64.v v1,0(a2) # 38 [c=4 l=4] pred_storevnx2df
> vsetvli t6,zero,e64,m1,ta,ma # 109 [c=0 l=4] vsetvldi
> slli t4,t1,2 # 22 [c=4 l=4] ashldi3
> slli t3,t1,3 # 27 [c=4 l=4] ashldi3
> vfmul.vv v1,v2,v3 # 42 [c=4 l=4] pred_mulvnx2df/2
Note how the output of the explicit conversion done in insn 77 is used
by the vfmul in insns 29, 37 and 42. Similarly for the other explcit
conversions.
Your pattern isn't going to help that problem.
You could model this as a dependency height reduction. I think that
will get you were you want to go.
You'll need a pattern that matches this:
> (parallel [
> (set (reg:VNx2DF 160 [ vect__11.15 ])
> (if_then_else:VNx2DF (unspec:VNx2BI [
> (const_vector:VNx2BI repeat [
> (const_int 1 [0x1])
> ])
> (reg:DI 169)
> (const_int 2 [0x2]) repeated x2
> (const_int 1 [0x1])
> (const_int 7 [0x7])
> (reg:SI 66 vl)
> (reg:SI 67 vtype)
> (reg:SI 69 frm)
> ] UNSPEC_VPREDICATE)
> (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ]))
> (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
> (unspec:VNx2DF [
> (reg:SI 0 zero)
> ] UNSPEC_VUNDEF)))
> (set (reg:VNx2DF 143 [ vect__8.14 ])
> (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ])))
> (set (reg:VNx2DF 145 [ vect__5.10 ])
> (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
> ])
It'll need to be a define_insn_and_split as its a 3->3 splitter. The
split will emit the two extensions and the widening multiply as 3
distinct insns.
This has two positive effects. First the widening multiply is no longer
data dependent on the float_extend and so it can issue when ever r144
and r146 are ready rather than when r143 and r145 are ready.
The second effect is I think this pattern will end up matching all the
multiplies in this sample code. As a result all the float_extend insns
you generated when splitting become dead and should be removed by DCE.
Jeff
next prev parent reply other threads:[~2023-06-29 23:39 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-28 4:15 Juzhe-Zhong
2023-06-28 16:24 ` Jeff Law
2023-06-28 22:00 ` 钟居哲
2023-06-29 22:59 ` Jeff Law
2023-06-29 23:02 ` 钟居哲
2023-06-29 23:04 ` 钟居哲
2023-06-29 23:39 ` Jeff Law [this message]
2023-06-30 10:14 ` Robin Dapp
2023-06-30 22:35 ` Jeff Law
2023-07-01 11:45 ` Robin Dapp
[not found] ` <8D5801744511A6AD+6077E043-F267-4BC0-90B8-B2FCDCA10089@rivai.ai>
2023-07-03 7:49 ` Robin Dapp
2023-07-03 8:42 ` juzhe.zhong
2023-07-03 8:44 ` Robin Dapp
2023-07-03 8:45 ` juzhe.zhong
2023-07-03 8:49 ` Robin Dapp
2023-07-03 8:51 ` juzhe.zhong
2023-07-07 21:11 ` Jeff Law
2023-07-07 23:05 ` 钟居哲
2023-06-29 23:41 ` Jeff Law
[not found] ` <99D6E636A491D16D+F0E92F80-33DF-4109-912E-F9CAAD6F07B5@rivai.ai>
2023-06-29 23:48 ` Jeff Law
2023-06-30 0:44 ` juzhe.zhong
[not found] ` <2023062906005450585022@rivai.ai>
2023-06-28 22:59 ` 钟居哲
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=62be1be8-280f-2989-6a31-32c87dcbadcd@gmail.com \
--to=jeffreyalaw@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=juzhe.zhong@rivai.ai \
--cc=kito.cheng@gmail.com \
--cc=kito.cheng@sifive.com \
--cc=palmer@dabbelt.com \
--cc=palmer@rivosinc.com \
--cc=rdapp.gcc@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).