Hi, Han.

It's great that someone wants to optimize the dynamic LMUL feature of GCC.

I know this feature is not stable yet, and I have not found the time to optimize it (I am still busy fixing bugs).

Could you give me more details on why this patch refines those 2 cases by picking a larger LMUL? (I am happy for those 2 cases to change to a larger LMUL.)

It seems this patch ignores the first vectorized statement during the live calculation?

Thanks.

juzhe.zhong@rivai.ai

From: demin.han
Date: 2023-12-20 16:15
To: gcc-patches@gcc.gnu.org
CC: juzhe.zhong@rivai.ai; pan2.li@intel.com
Subject: [PATCH] RISC-V: Fix calculation of max live vregs

For the stmt _1 = _2 + _3, assume that _2 or _3 is not used after this stmt.
_1 can use the same register as _2 or _3 if there is no early clobber.
Two registers are needed, but the current calculation gives three.

This patch reserves point 0 for bb entry and excludes a stmt's def
when calculating the live regs at a given point.

Signed-off-by: demin.han

gcc/ChangeLog:

	* config/riscv/riscv-vector-costs.cc (max_number_of_live_regs): Fix
	max live vregs calc
	(preferred_new_lmul_p): Ditto

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-7.c: Moved to...
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-10.c: ...here.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-4.c: Moved to...
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-11.c: ...here.
---
 gcc/config/riscv/riscv-vector-costs.cc                | 10 +++++-----
 .../rvv/{dynamic-lmul2-7.c => dynamic-lmul4-10.c}     |  6 +++---
 .../rvv/{dynamic-lmul4-4.c => dynamic-lmul8-11.c}     |  6 +++---
 3 files changed, 11 insertions(+), 11 deletions(-)
 rename gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/{dynamic-lmul2-7.c => dynamic-lmul4-10.c} (79%)
 rename gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/{dynamic-lmul4-4.c => dynamic-lmul8-11.c} (87%)

diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc
index e7bc9ed5233..a316603e207 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -123,7 +123,7 @@ compute_local_program_points (
   /* Collect the stmts that is vectorized and mark their program point. */
   for (i = 0; i < nbbs; i++)
     {
-      int point = 0;
+      int point = 1;
       basic_block bb = bbs[i];
       vec program_points = vNULL;
       if (dump_enabled_p ())
@@ -300,13 +300,13 @@ max_number_of_live_regs (const basic_block bb,
   unsigned int i;
   unsigned int live_point = 0;
   auto_vec live_vars_vec;
-  live_vars_vec.safe_grow_cleared (max_point + 1, true);
+  live_vars_vec.safe_grow_cleared (max_point, true);
   for (hash_map::iterator iter = live_ranges.begin ();
        iter != live_ranges.end (); ++iter)
     {
       tree var = (*iter).first;
       pair live_range = (*iter).second;
-      for (i = live_range.first; i <= live_range.second; i++)
+      for (i = live_range.first + 1; i <= live_range.second; i++)
         {
           machine_mode mode = TYPE_MODE (TREE_TYPE (var));
           unsigned int nregs
@@ -485,7 +485,7 @@ update_local_live_ranges (
               if (!program_points_per_bb.get (e->src))
                 continue;
               unsigned int max_point
-                = (*program_points_per_bb.get (e->src)).length () - 1;
+                = (*program_points_per_bb.get (e->src)).length ();
               live_range = live_ranges->get (def);
               if (!live_range)
                 continue;
@@ -571,7 +571,7 @@ preferred_new_lmul_p (loop_vec_info other_loop_vinfo)
         {
           basic_block bb = (*iter).first;
           unsigned int max_point
-            = (*program_points_per_bb.get (bb)).length () - 1;
+            = (*program_points_per_bb.get (bb)).length () + 1;
           if ((*iter).second.is_empty ())
             continue;
           /* We prefer larger LMUL unless it causes register spillings. */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-7.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-10.c
similarity index 79%
rename from gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-7.c
rename to gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-10.c
index 636332dbb62..74e629168f8 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-7.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-10.c
@@ -17,10 +17,10 @@ bar (int *x, int a, int b, int n)
   return sum1 + sum2;
 }
 
-/* { dg-final { scan-assembler {e32,m2} } } */
+/* { dg-final { scan-assembler {e32,m4} } } */
 /* { dg-final { scan-assembler-not {jr} } } */
 /* { dg-final { scan-assembler-times {ret} 2 } } *
 /* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */
-/* { dg-final { scan-tree-dump "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump "Maximum lmul = 4" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
 /* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-4.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-11.c
similarity index 87%
rename from gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-4.c
rename to gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-11.c
index 01a359bc7c8..01c976dd67b 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-4.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-11.c
@@ -39,9 +39,9 @@ void foo2 (int64_t *__restrict a,
     }
 }
 
-/* { dg-final { scan-assembler {e64,m4} } } */
+/* { dg-final { scan-assembler {e64,m8} } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
-/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
-/* { dg-final { scan-tree-dump "Maximum lmul = 4" "vect" } } */
+/* { dg-final { scan-tree-dump "Maximum lmul = 8" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 4" "vect" } } */
 /* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
 /* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
-- 
2.43.0
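
The counting change described in the commit message above (for _1 = _2 + _3, two registers rather than three) can be illustrated with a small standalone sketch. This is not the code in riscv-vector-costs.cc: the names live_range and max_live are hypothetical, and the model assumes every variable occupies exactly one register, whereas the real code scales by mode and LMUL.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical simplified model: a live range is
// [definition point, last use point], one register per variable.
using live_range = std::pair<unsigned, unsigned>;

// Return the maximum number of simultaneously live variables over
// program points 0..max_point.  When exclude_def is true, a variable
// is not counted at its own definition point, so a result may share
// a register with an operand that dies at the same statement.
unsigned
max_live (const std::vector<live_range> &ranges, unsigned max_point,
          bool exclude_def)
{
  std::vector<unsigned> live (max_point + 1, 0);
  for (const live_range &r : ranges)
    {
      assert (r.second <= max_point);
      for (unsigned i = r.first + (exclude_def ? 1 : 0); i <= r.second; i++)
        live[i]++;
    }
  return *std::max_element (live.begin (), live.end ());
}
```

As a usage example under the same assumptions, take point 0 as the block entry, point 1 as _1 = _2 + _3, and point 2 as a later use of _1. The ranges are then {0, 1} for _2, {0, 1} for _3, and {1, 2} for _1. Counting the definition point gives a peak of three live variables at point 1; excluding it gives two.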