>> As long as we're just looking for the maximum number of live registers, >> we can use a sliding-window approach: create a structure with all >> start and end points, sort it, and increase the current pressure >> if we start a new range or decrease. That's O(n log n). I failed to see it can help. Current approach is straightforward. for (hash_map::iterator iter = live_ranges.begin (); iter != live_ranges.end (); ++iter) { tree var = (*iter).first; pair live_range = (*iter).second; for (i = live_range.first; i <= live_range.second; i++) { machine_mode mode = TYPE_MODE (TREE_TYPE (var)); unsigned int nregs = compute_nregs_for_mode (mode, biggest_mode, lmul); live_vars_vec[i] += nregs; if (live_vars_vec[i] > max_nregs) max_nregs = live_vars_vec[i]; } } Could you revise this piece of codes ? Other comments has been addressed in V4: https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629959.html juzhe.zhong@rivai.ai From: Robin Dapp Date: 2023-09-12 04:31 To: Juzhe-Zhong; gcc-patches CC: rdapp.gcc; kito.cheng; kito.cheng; jeffreyalaw Subject: Re: [PATCH V3] RISC-V: Support Dynamic LMUL Cost model Hi Juzhe, glad that we can use the dominator info directly. Could we move the calculation of the info to the beginning (if it's not available)? That makes it clearer that it's a prerequisite. Function comments look good now. Some general remarks kind of similar to v1: - I would prefer a hash_map or similar to hold the end point for a range instead of looking through potentially all ranges in contrived cases. - As long as we're just looking for the maximum number of live registers, we can use a sliding-window approach: create a structure with all start and end points, sort it, and increase the current pressure if we start a new range or decrease. That's O(n log n). > + const ssa_use_operand_t *const head = &(SSA_NAME_IMM_USE_NODE (t)); > + const ssa_use_operand_t *ptr; > + > + for (ptr = head->next; ptr != head; ptr = ptr->next) > + { Why does FOR_EACH_IMM_USE not work here? > + unsigned int max_point > + = (*program_points_per_bb.get (e->src)).length () - 1; > + for (k = 0; k < (*live_ranges).length (); k++) > + { > + if ((*live_ranges)[i].var == def) Would also be nice not having to search through all ranges but just index/hash it via var (or similar). What about one test with global live ranges? Not a necessity IMHO we can still add it later. Regards Robin