Thank you so much. Kito has already helped me fix it. RVV instruction patterns can now be optimized by CSE.

juzhe.zhong@rivai.ai

From: Richard Biener
Date: 2023-02-02 20:26
To: juzhe.zhong@rivai.ai
CC: gcc-patches; kito.cheng; richard.sandiford; jeffreyalaw; apinski
Subject: Re: Re: [PATCH] CPROP: Allow cprop optimization when the function has a single block

On Thu, 2 Feb 2023, juzhe.zhong@rivai.ai wrote:

> Yeah, thanks. You are right, CSE should do the job.
> Now I know why CSE failed to optimize: I include the VL_REGNUM(66)/VTYPE_REGNUM(67) hard regs
> as dependencies of pred_broadcast:
> (insn 19 18 20 4 (set (reg:VNx1DI 152)
>         (if_then_else:VNx1DI (unspec:VNx1BI [
>                     (const_vector:VNx1BI repeat [
>                             (const_int 1 [0x1])
>                         ])
>                     (const_int 4 [0x4])
>                     (const_int 2 [0x2]) repeated x2
>                     (const_int 0 [0])
>                     (reg:SI 66 vl)
>                     (reg:SI 67 vtype)
>                 ] UNSPEC_VPREDICATE)
>             (vec_duplicate:VNx1DI (reg/v:DI 148 [ x ]))
>             (unspec:VNx1DI [
>                     (const_int 0 [0])
>                 ] UNSPEC_VUNDEF))) "rvv.c":22:23 695 {pred_broadcastvnx1di}
>      (nil))
> Then CSE failed to treat reg 152 as a copy.
>
> VL_REGNUM(66)/VTYPE_REGNUM(67) are the global hard regs that I should make each RVV instruction depend on,
> since we use the vsetvl instruction (which sets the global VL_REGNUM(66)/VTYPE_REGNUM(67) status) to set the global status for
> each RVV instruction.
> Including the dependency here makes sure the global VL/VTYPE status is correct for each RVV instruction. (If we don't include
> such a dependency in the RVV instructions, instruction scheduling may move the RVV instructions and vsetvl instructions arbitrarily and then
> produce an incorrect vsetvl configuration.)
>
> The original reg_class of VL_REGNUM(66)/VTYPE_REGNUM(67) is set here:
> riscv_regno_to_class [VL_REGNUM] = VL_REGS;
> riscv_regno_to_class [VTYPE_REGNUM] = VTYPE_REGS;
> Such a configuration makes CSE fail.
>
> However, if I change the reg_class:
> riscv_regno_to_class [VL_REGNUM] = NO_REGS;
> riscv_regno_to_class [VTYPE_REGNUM] = NO_REGS;
> then CSE can do the optimization!
>
> 1) Would you mind telling me the difference between them?

No idea.  I think CSE avoids touching hard register references because
eliding them to copies can increase register pressure.

> 2) If I set these 2 global status registers as NO_REGS, will it create
> issues for the global status configuration of each RVV instruction?

No idea either.  Usually these kinds of dependences are introduced by
targets at the point the VL setting is introduced, to avoid pessimizing
optimizations earlier.  Often, for cases like a VL register, this
is done only after register allocation and is indeed necessary to keep
the second scheduling pass from breaking things.

Richard.
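
As a side illustration of the scheduling argument quoted above (an RVV
instruction lists vl/vtype so that it is not moved across a vsetvl), here
is a toy C model of a reordering check. It is only a sketch under stated
assumptions: toy_insn, overlaps and can_reorder are hypothetical names,
register 10 as the AVL input of vsetvl is made up for the example, and
none of this is how GCC's scheduler or CSE is actually implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Register numbers as quoted in the mail: 66 is vl, 67 is vtype. */
enum { VL_REG = 66, VTYPE_REG = 67 };

/* Hypothetical toy instruction: just the registers it reads and writes. */
struct toy_insn {
    const char *name;
    int reads[4];
    int n_reads;
    int writes[4];
    int n_writes;
};

/* True if the two register lists share at least one register. */
bool overlaps(const int *a, int na, const int *b, int nb)
{
    for (int i = 0; i < na; i++)
        for (int j = 0; j < nb; j++)
            if (a[i] == b[j])
                return true;
    return false;
}

/* 'first' precedes 'second' in program order.  The pair may be swapped
   only if there is no RAW, WAR or WAW dependence between them. */
bool can_reorder(const struct toy_insn *first, const struct toy_insn *second)
{
    assert(first && second);
    if (overlaps(first->writes, first->n_writes,
                 second->reads, second->n_reads))
        return false;   /* RAW: second reads what first writes */
    if (overlaps(first->reads, first->n_reads,
                 second->writes, second->n_writes))
        return false;   /* WAR: second overwrites what first reads */
    if (overlaps(first->writes, first->n_writes,
                 second->writes, second->n_writes))
        return false;   /* WAW: both write the same register */
    return true;
}
```

In this model, a vsetvl that writes 66 and 67 cannot be swapped with a
broadcast whose reads include 66 and 67, but it can be swapped with a
broadcast that only reads reg 148 and writes reg 152. That is the hazard
the explicit vl/vtype operands are meant to expose.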