Hi, Jeff.

That's odd. Maybe you should clean up your environment first? Or perhaps the toolchain was not built correctly with this patch applied?

Compile options: --param=riscv-autovec-preference=scalable -O3 -ffast-math

Before this patch: https://godbolt.org/z/Y5d44WMqs

fail.s:

        lw      t5,0(sp)
        ble     t5,zero,.L5
.L3:
        vsetvli t1,t5,e32,mf2,ta,ma
        vle32.v v2,0(a4)
        vle32.v v1,0(a5)
        vsetvli t0,zero,e32,mf2,ta,ma
        vfwcvt.f.f.v    v3,v2
        vfwcvt.f.f.v    v2,v1
        vsetvli zero,t1,e32,mf2,ta,ma
        vle32.v v5,0(a6)
        vle32.v v4,0(a7)
        vsetvli t0,zero,e32,mf2,ta,ma
        vfwcvt.f.f.v    v1,v5
        vsetvli zero,zero,e64,m1,ta,ma
        vfmul.vv        v5,v2,v3
        vfmul.vv        v2,v1,v2
        vsetvli zero,t1,e64,m1,ta,ma
        vse64.v v2,0(a1)
        vse64.v v5,0(a0)
        vsetvli t6,zero,e64,m1,ta,ma
        vfmul.vv        v1,v1,v3
        vsetvli zero,zero,e32,mf2,ta,ma
        vfwcvt.f.f.v    v2,v4
        vsetvli zero,t1,e64,m1,ta,ma
        vse64.v v1,0(a2)
        vsetvli t6,zero,e64,m1,ta,ma
        slli    t4,t1,2
        slli    t3,t1,3
        vfmul.vv        v1,v2,v3
        sub     t5,t5,t1
        vsetvli zero,t1,e64,m1,ta,ma
        vse64.v v1,0(a3)
        add     a4,a4,t4
        add     a5,a5,t4
        add     a0,a0,t3
        add     a6,a6,t4
        add     a1,a1,t3
        add     a2,a2,t3
        add     a7,a7,t4
        add     a3,a3,t3
        bne     t5,zero,.L3
.L5:
        ret

After this patch:

pass.s:

        lw      t5,0(sp)
        ble     t5,zero,.L5
.L3:
        vsetvli t1,t5,e32,mf2,ta,ma
        vle32.v v1,0(a4)
        vle32.v v3,0(a5)
        vle32.v v2,0(a6)
        vle32.v v4,0(a7)
        vsetvli t6,zero,e32,mf2,ta,ma
        vfwmul.vv       v5,v3,v2
        vfwmul.vv       v6,v1,v3
        vsetvli zero,t1,e64,m1,ta,ma
        vse64.v v6,0(a0)
        vse64.v v5,0(a1)
        vsetvli t6,zero,e32,mf2,ta,ma
        slli    t4,t1,2
        slli    t3,t1,3
        vfwmul.vv       v3,v2,v1
        sub     t5,t5,t1
        vfwmul.vv       v2,v1,v4
        vsetvli zero,t1,e64,m1,ta,ma
        vse64.v v3,0(a2)
        vse64.v v2,0(a3)
        add     a4,a4,t4
        add     a5,a5,t4
        add     a0,a0,t3
        add     a6,a6,t4
        add     a1,a1,t3
        add     a2,a2,t3
        add     a7,a7,t4
        add     a3,a3,t3
        bne     t5,zero,.L3
.L5:
        ret

The codegen with this patch is clearly what we want: the vfwcvt.f.f.v + vfmul.vv sequences are replaced by single vfwmul.vv instructions, and most of the vsetvli switches between e32 and e64 disappear. I have attached the .S files to this patch.

I am not claiming that this patch is the only solution. You are welcome to propose another one, as long as it produces the same codegen that this patch achieves.
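For anyone without the godbolt link at hand, a loop of roughly the following shape is the kind of code the listings above correspond to. This is only a sketch reconstructed from the assembly (four float inputs, four double outputs, each output a product of two widened inputs); the function name and parameter names are assumptions and this is not necessarily the actual testcase.

```c
/* Hypothetical reconstruction of the loop shape, NOT the real testcase.
   All names here are made up for illustration. Each float operand is
   widened to double before the multiply, which is the pattern a
   widening multiply (vfwmul.vv) can implement directly.  */
void
widen_mul (double *restrict d0, double *restrict d1,
           double *restrict d2, double *restrict d3,
           const float *restrict a, const float *restrict b,
           const float *restrict c, const float *restrict e,
           int n)
{
  for (int i = 0; i < n; i++)
    {
      d0[i] = (double) a[i] * (double) b[i];
      d1[i] = (double) b[i] * (double) c[i];
      d2[i] = (double) c[i] * (double) a[i];
      d3[i] = (double) a[i] * (double) e[i];
    }
}
```

Compiling something like this with the options above, before and after the patch, should be enough to check whether the float-to-double conversions get folded into the multiply.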
Please make sure you are using a toolchain that was actually built with the patch.

Thanks.

juzhe.zhong@rivai.ai

From: Jeff Law
Date: 2023-06-30 07:48
To: juzhe.zhong
CC: gcc-patches; kito.cheng; kito.cheng; palmer; palmer; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering

On 6/29/23 17:46, juzhe.zhong wrote:
> You should try the example and check the codegen before and after the patch.
> You will understand it.
I've already done that. It makes _no_ difference on the godbolt example.

Jeff