On 2/2/2024 11:10 PM, Li, Pan2 wrote:
> Hi Edwin
>
>> I believe the only problematic failures are the 5 vls calling convention
>> ones where only 24 ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) are found.
>
> Does this "only 24" comes from calling-convention-1.c?

Oops, sorry about that. I said I would include all the 7 failures and ended up not doing that. The failures are here:

FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c -O3 -ftree-vectorize --param riscv-autovec-preference=scalable scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 35
FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c -O3 -ftree-vectorize --param riscv-autovec-preference=scalable scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 33
FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c -O3 -ftree-vectorize --param riscv-autovec-preference=scalable scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 31
FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c -O3 -ftree-vectorize --param riscv-autovec-preference=scalable scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 29
FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c -O3 -ftree-vectorize --param riscv-autovec-preference=scalable scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 29

These all have the problem of only 24 ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) being found. So that is calling-conventions 1, 2, 3, 4, 7 with only 24 matching RE.
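For anyone who wants to reproduce the "only 24" count outside the testsuite, here is a minimal sketch. It assumes that scan-assembler-times amounts to counting non-overlapping regex matches in the generated .s file, and that Python's `re` treats this particular pattern the same way as Tcl does. The helper name `count_matches` and the sample assembly are made up for illustration; they are not taken from the real test output.

```python
import re

# Python spelling of the pattern from the FAIL lines. The doubled
# backslashes in the log are only escaping for \s and \( in the .c file.
LD_RE = re.compile(r"ld\s+a[0-1],\s*[0-9]+\(sp\)")


def count_matches(asm_text, pattern=LD_RE):
    """Return the number of non-overlapping matches in the assembly text."""
    return len(pattern.findall(asm_text))


# Hypothetical snippet: two loads into a0/a1, two into other registers.
sample = """\
\tld\ta0,8(sp)
\tld\ta1,16(sp)
\tld\ta5,24(sp)
\tld\tt4,32(sp)
"""

if __name__ == "__main__":
    print(count_matches(sample))
```

Only the loads into a0 and a1 are counted here; the loads into a5 and t4 are skipped by the pattern, which is the kind of shortfall the FAIL lines above report.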
FAIL: gcc.target/riscv/rvv/base/vcreate.c scan-assembler-times vmv1r.v\\s+v[0-9]+,\\s*v[0-9]+ 24 <-- found 36 times
FAIL: gcc.target/riscv/rvv/base/vcreate.c scan-assembler-times vmv2r.v\\s+v[0-9]+,\\s*v[0-9]+ 12 <-- found 28 times
FAIL: gcc.target/riscv/rvv/base/vcreate.c scan-assembler-times vmv4r.v\\s+v[0-9]+,\\s*v[0-9]+ 16 <-- found 19 times

These find more vmv's than expected.

FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-107.c -O2 scan-assembler-times vsetvli\\tzero,zero,e32,m1,t[au],m[au] 1 <-- found 0 times
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-107.c -O2 -flto -fno-use-linker-plugin -flto-partition=none scan-assembler-times vsetvli\\tzero,zero,e32,m1,t[au],m[au] 1 <-- found 0 times
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-107.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects scan-assembler-times vsetvli\\tzero,zero,e32,m1,t[au],m[au] 1 <-- found 0 times

These failures are from vsetvli zero,a0,e2,m1,ta,ma being found instead. I believe these should be fine.

>
>> This is what I'm getting locally (first instance of wrong match):
>> v32qi_RET1_ARG8:
>> .LFB109:
>
> V32qi will pass the args by reference instead of GPR(s), thus It is expected. I think we need to diff the asm code before and after the patch for the whole test-file.
> The RE "ld\\s+a[0-1],\\s*[0-9]+\\(sp\\)" would like to check vls mode values are returned by a[0-1].
>

I've been using this https://godbolt.org/z/vdxTY3rc7 (calling convention 1) as my comparison to what I have compiled locally (included as attachment). From what I see, the differences, aside from reordering due to latency, are that the ld insns use a5 (for 32-512) or t4 (for 1024-2048) or t5 (for 4096) for ARG8 and ARG9. Is there something else that I might be missing?

Edwin
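
One way to compare the two assembly outputs along the lines discussed above is to tally which register each stack load targets. The sketch below is only an illustration: the helper name `ld_dest_registers`, the widened regex, and the sample assembly are all assumptions, not part of the testsuite or the attachment.

```python
import re
from collections import Counter

# Any `ld` from a stack slot, capturing the destination register.
# The leading \b keeps `fld` and similar mnemonics from being counted.
LD_SP_RE = re.compile(r"\bld\s+([a-z][a-z0-9]*),\s*[0-9]+\(sp\)")


def ld_dest_registers(asm_text):
    """Count stack loads per destination register."""
    return Counter(LD_SP_RE.findall(asm_text))


# Hypothetical snippet mixing a0 with the a5/t4 loads mentioned above.
sample = """\
\tld\ta0,8(sp)
\tld\ta5,16(sp)
\tld\tt4,24(sp)
\tld\ta5,32(sp)
\tfld\tfa0,40(sp)
"""

if __name__ == "__main__":
    for reg, n in sorted(ld_dest_registers(sample).items()):
        print(reg, n)
```

Running this on both the godbolt output and the local .s file and comparing the tallies would show how many loads moved from a0/a1 to a5, t4, or t5, which the a[0-1] pattern no longer counts.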