LLVM will try to find a scratch register even after RA to resolve the long
jump issue, so maybe we could consider a similar approach? I guess the
most complicated part would be the case where no scratch register is found
and a spill/reload is required after RA.
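
To make the constraint concrete, here is a rough sketch of the range check
and offset split involved. It is my own illustration, not code from GCC or
LLVM, and the names fits_jal and split_offset are made up. It assumes the
standard encodings: JAL carries a signed 21-bit, 2-byte-aligned offset
(about +/-1MiB), and anything farther needs an auipc+jalr pair, which is
where the scratch register comes in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a pc-relative offset can be encoded in a
   single JAL (signed 21-bit immediate, always a multiple of 2). */
bool fits_jal(int64_t offset)
{
    return offset >= -(1 << 20) && offset < (1 << 20) && (offset & 1) == 0;
}

/* Hypothetical helper: split a 32-bit pc-relative offset into the AUIPC
   part (hi20) and the JALR part (lo12).  JALR sign-extends its 12-bit
   immediate, so lo12 lies in [-2048, 2047] and hi20 absorbs the carry. */
void split_offset(int32_t offset, int32_t *hi20, int32_t *lo12)
{
    /* Sign-extend the low 12 bits. */
    int32_t lo = ((offset & 0xfff) ^ 0x800) - 0x800;

    /* offset - lo is an exact multiple of 4096, so the division is exact. */
    *hi20 = (int32_t)(((int64_t)offset - lo) / 4096);
    *lo12 = lo;
}
```

For example, an offset of 0x800 splits into hi20 = 1 and lo12 = -2048. Only
when fits_jal fails do we need "auipc tmp, hi20; jalr x0, lo12(tmp)", and
tmp is the register that has to be scavenged (or spilled) after RA.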

Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote on Mon, Jun 26, 2023 at 22:31:

>
>
> On 6/25/23 12:45, Stefan O'Rear wrote:
>
> >
> > To clarify: are you proposing to make ra (or t1 in the hypothetical) a fixed
> > register for all functions, or only those heuristically identified as potentially
> > larger than 1MiB?  And would this extend to forcing the creation of stack frames
> > for all functions, including very small functions?  I am concerned this would
> > result in a substantial performance regression.
> For the case Yanzhang is discussing (firmware and such), yes.  And
> that's simply the cost they're going to have to pay for wanting
> consistent backtraces without utilizing dwarf unwind info, sframe or orc.
>
> Normal builds won't be using those options and thus won't suffer from
> those performance penalties.
>
> >
> > Without seeing the patch I can't know if I'm missing something obvious but I
> > would say t1 has three advantages:
> >
> > 1. Consistency with tail, possibly simpler implementation.
> And as I've already stated, this sequence is defined by the assembler.
> While I do want to revisit a compiler only solution, it's way down on my
> list of things to improve if I do a cost/benefit analysis.  If someone
> wants to take a stab at it, I'm all for it.  But it's not a simple
> problem due to the phase ordering issues.
>
> >
> > 2. Very few functions use all seven t-registers.  qemu linux-user in 2016 had an
> > off-by-one bug that corrupted t6 in sigreturn and it took months for anyone to
> > notice.  By contrast, ra has live data in every non-_Noreturn function.
> That's a terrible way to evaluate the impact.  The right way is to use
> real benchmarks.  Not synthetic benchmarks.  Not indirect observations
> that require triggering a bug in a sigreturn path.  Build and run a real
> benchmark.
>
>
>
> >
> > 3. Any jalr instruction which has rs1=ra has a hint effect on the return address
> > stack (call, return, or coroutine swap); a jalr which is intended to be treated
> > as a plain jump must have rs1!=ra, rs1!=t0.
> I'm well aware of these concerns.  We support disambiguating various
> jump forms to facilitate different branch predictors.
>
> jeff
>
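
For reference, the return-address-stack hints raised in point 3 of the
quoted mail can be sketched as a small lookup. This is my reading of the
hint table in the RISC-V unprivileged spec, where x1 (ra) and x5 (t0) are
the link registers. The enum and function names are made up for
illustration.

```c
#include <stdbool.h>

/* Hypothetical names for the prediction hints a JALR encodes. */
enum ras_hint { RAS_NONE, RAS_PUSH, RAS_POP, RAS_POP_THEN_PUSH };

/* x1 (ra) and x5 (t0) are the registers treated as link registers. */
static bool is_link(unsigned reg)
{
    return reg == 1 || reg == 5;
}

/* Map the rd/rs1 register numbers of a JALR to its RAS hint. */
enum ras_hint jalr_ras_hint(unsigned rd, unsigned rs1)
{
    bool rd_link = is_link(rd);
    bool rs1_link = is_link(rs1);

    if (!rd_link)
        return rs1_link ? RAS_POP : RAS_NONE;   /* return vs. plain jump */
    if (!rs1_link)
        return RAS_PUSH;                        /* ordinary call */
    /* Both are link registers: same register is a call, otherwise a
       coroutine swap. */
    return rd == rs1 ? RAS_PUSH : RAS_POP_THEN_PUSH;
}
```

So "jalr x0, 0(ra)" is predicted as a return (pop), while "jalr x0, 0(t1)"
(rs1 = x6) carries no hint, which is why a long jump through t1 does not
disturb the return address stack.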