Hi Richard, I had pinged the community about this problem back in March, and I will be taking Dimitrije's place, considering he isn't working on these patches anymore. Your solution for 2/2 seems reasonable, I don't exactly know why target_reg_cost hasn't been accounted for in the first case, nor do I know why that particular case was specified at all. I will get back to you when I have researched 1/2 a bit more thoroughly. Regards, Jovan ________________________________ From: Richard Biener Sent: Monday, May 15, 2023 2:23 PM To: Dimitrije Milošević Cc: gcc-patches@gcc.gnu.org ; Djordje Todorovic ; jeffreyalaw@gmail.com Subject: Re: [PATCH 2/2] ivopts: Revert register pressure cost when there are enough registers. On Mon, May 15, 2023 at 12:44 PM Richard Biener wrote: > > On Wed, Dec 21, 2022 at 2:12 PM Dimitrije Milošević > wrote: > > > > When there are enough registers, the register pressure cost is > > unnecessarily bumped by adding another n_cands. > > > > This behavior may result in register pressure costs for the case > > when there are enough registers being higher than for other cases. > > > > When there are enough registers, the register pressure cost should be > > equal to n_invs + n_cands. > > > > This used to be the case before c18101f. > > > > gcc/ChangeLog: > > > > * tree-ssa-loop-ivopts.cc (ivopts_estimate_reg_pressure): Adjust. > > > > Signed-off-by: Dimitrije Milosevic > > --- > > gcc/tree-ssa-loop-ivopts.cc | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc > > index 60c61dc9e49..3176482d0d9 100644 > > --- a/gcc/tree-ssa-loop-ivopts.cc > > +++ b/gcc/tree-ssa-loop-ivopts.cc > > @@ -6092,7 +6092,7 @@ ivopts_estimate_reg_pressure (struct ivopts_data *data, unsigned n_invs, > > > > /* If we have enough registers. */ > > if (regs_needed + target_res_regs < available_regs) > > - cost = n_new; > > + return n_new; > > This still doesn't make much sense (before nor after). We're > comparing apples and oranges. > > I think it would make most sense to merge this case with the following > and thus do > the following. The distinction between the cases should be preserved > and attenuated > by the adding of n_cands at the end (as tie-breaker). > > Does this help the mips case? I'm going to throw it at x86_64-linux > bootstrap/regtest. > > Btw, I don't think using address complexity makes much sense for a port that > has only one addressing mode so I guess a better approach for 1/2 would be > to make sure it is consistently the same value (I suppose it is not, otherwise > you wouldn't have changed it). Oh, and we're adding the > reg-pressure cost to the same bucket as well, and there we don't really know > how many times we're going to spill. That said, I think ->complexity should > rather go away - we are asking for address-cost already and IVOPTs uses > built RTX to query the target. > > But yes, I agree ivopts_estimate_reg_pressure has an issue. > > Sorry for the very long delay, > Richard. The patch below bootstraps and regtests ok on x86_64-unknown-linux-gnu, but I guess that doesn't mean much. Richard. > diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc > index 6fbd2d59318..bc8493622de 100644 > --- a/gcc/tree-ssa-loop-ivopts.cc > +++ b/gcc/tree-ssa-loop-ivopts.cc > @@ -6077,8 +6077,9 @@ ivopts_estimate_reg_pressure (struct ivopts_data > *data, unsigned n_invs, > unsigned n_cands) > { > unsigned cost; > - unsigned n_old = data->regs_used, n_new = n_invs + n_cands; > - unsigned regs_needed = n_new + n_old, available_regs = target_avail_regs; > + unsigned n_old = data->regs_used; > + unsigned regs_needed = n_invs + n_cands + n_old; > + unsigned available_regs = target_avail_regs; > bool speed = data->speed; > > /* If there is a call in the loop body, the call-clobbered registers > @@ -6087,10 +6088,7 @@ ivopts_estimate_reg_pressure (struct > ivopts_data *data, unsigned n_invs, > available_regs = available_regs - target_clobbered_regs; > > /* If we have enough registers. */ > - if (regs_needed + target_res_regs < available_regs) > - cost = n_new; > - /* If close to running out of registers, try to preserve them. */ > - else if (regs_needed <= available_regs) > + if (regs_needed <= available_regs) > cost = target_reg_cost [speed] * regs_needed; > /* If we run out of available registers but the number of candidates > does not, we penalize extra registers using target_spill_cost. */ > > > > /* If close to running out of registers, try to preserve them. */ > > else if (regs_needed <= available_regs) > > cost = target_reg_cost [speed] * regs_needed; > > -- > > 2.25.1 > >