This patch is my attempt to address the compile-time hog issue in
PR rtl-optimization/110587. Richard Biener's analysis shows that
compilation of pr28071.c with -O0 currently spends ~70% of its time in
the timer "LRA non-specific", because return_regno_p fails to filter out
a large number of calls to regno_in_use_p, resulting in quadratic
behaviour.

For this pathological test case, things can be improved significantly.
Although the return register (%rax) is indeed mentioned a large number of
times in this function, due to inlining, the inlined functions access the
return register in TImode, whereas the current function returns a DImode
value. Hence the check to see whether we're at the last SET of the return
register, which should be followed by a USE, can be improved by also
testing the mode.

Implementation-wise, rather than pass an additional mode parameter to
LRA's local return_regno_p function, which has only a single caller, it's
more convenient to pass the REG rtx itself, extract both the REGNO and the
mode from it in the callee, and rename the function to return_reg_p.

The good news is that with this change "LRA non-specific" drops from 70%
to 13%.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, with no new failures. Ok for mainline?


2023-07-22  Roger Sayle

gcc/ChangeLog
        PR middle-end/28071
        PR rtl-optimization/110587
        * lra-spills.cc (return_regno_p): Change argument and rename to...
        (return_reg_p): Check if the given register RTX has the same
        REGNO and machine mode as the function's return value.
        (lra_final_code_change): Update call to return_reg_p.


Thanks in advance,
Roger
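
For illustration only (this is not the patch itself): a minimal,
self-contained sketch of why also comparing the mode filters out most
candidates before the expensive in-use scan. The types Mode, Reg and
ReturnValue and the two counting helpers are hypothetical stand-ins, not
GCC's real rtx or machine_mode types, and the return value is assumed to
be a single hard register in a single mode.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for GCC's machine modes and REG rtx.
enum class Mode { DI, TI };

struct Reg {
  unsigned regno;  // hard register number
  Mode mode;       // mode in which the register is accessed
};

// Simplified model of the function's return value (assumption: one
// hard register in one mode).
struct ReturnValue {
  unsigned regno;
  Mode mode;
};

// Old-style filter: matches on the register number alone.
bool return_regno_p(const ReturnValue &ret, unsigned regno) {
  return ret.regno == regno;
}

// New-style filter: takes the whole register and also compares the mode.
bool return_reg_p(const ReturnValue &ret, const Reg &reg) {
  return ret.regno == reg.regno && ret.mode == reg.mode;
}

// Number of SET destinations that pass the regno-only filter; each one
// would go on to trigger the expensive in-use scan.
std::size_t count_scans_by_regno(const ReturnValue &ret,
                                 const std::vector<Reg> &sets) {
  std::size_t n = 0;
  for (const Reg &r : sets)
    if (return_regno_p(ret, r.regno))
      ++n;
  return n;
}

// Number of SET destinations that pass the regno-and-mode filter.
std::size_t count_scans_by_reg(const ReturnValue &ret,
                               const std::vector<Reg> &sets) {
  std::size_t n = 0;
  for (const Reg &r : sets)
    if (return_reg_p(ret, r))
      ++n;
  return n;
}
```

With many TImode sets of the return register coming from inlined callees
and a single DImode set for the current function's own return value, the
regno-only filter lets every set through, while the regno-and-mode filter
lets through only the DImode one.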