public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Andrew Waterman <andrew@sifive.com>
To: Jeff Law <jlaw@ventanamicro.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [committed] [PR target/93062] RISC-V: Handle long conditional branches for RISC-V
Date: Tue, 10 Oct 2023 17:24:33 -0700	[thread overview]
Message-ID: <CA++6G0CQ=fRQV0CSt-o8F_sSL_26rxfmyaGXHmpuSh7XZ31yLw@mail.gmail.com> (raw)
In-Reply-To: <001ae968-da60-4e3b-8909-d6b99980ea63@ventanamicro.com>

I remembered another concern since we discussed this patch privately.
Using ra for long calls results in a sequence that will corrupt the
return-address stack.  Corrupting the RAS is potentially more costly
than mispredicting a branch, since it can result in a cascading
sequence of mispredictions as the program returns up the stack.  Of
course, if these long calls are dynamically quite rare, this isn't the
end of the world.  But it's always preferable to use a register other
than ra or t0 to avoid this performance reduction.  I know nothing
about the complexity of register scavenging, but it would be nice to
opportunistically use a scratch register (other than t0), falling back
to ra only when necessary.

Tangentially, I noticed the patch uses `jump label, ra' for far
branches but uses `call label' for far jumps.  These corrupt the RAS
in opposite ways (the former pops the RAS and the latter pushes it.
Any reason for using a different sequence in one than the other?



On Tue, Oct 10, 2023 at 3:11 PM Jeff Law <jlaw@ventanamicro.com> wrote:
>
>
> Ventana has had a variant of this patch from Andrew W. in its tree for
> at least a year.   I'm dusting it off and submitting it on Andrew's behalf.
>
> There's multiple approaches we could be using here.
>
> First we could make $ra fixed and use it as the scratch register for the
> long branch sequences.
>
> Second, we could add a match_scratch to all the conditional branch
> patterns and allow the register allocator to assign the scratch register
> from the pool of GPRs.
>
> Third we could do register scavenging.  This can usually work, though it
> can get complex in some scenarios.
>
> Forth we could use trampolines for extended reach.
>
> Andrew's original patch did a bit of the first approach (make $ra fixed)
> and mostly the second approach.  The net is it was probably the worst in
> terms of impacting code generation -- we lost a register *and* forced
> every branch instruction to get a scratch register allocated.
>
> I had expected the second approach to produce better code than the
> first, but that wasn't actually the case in practice.  It's probably a
> combination of allocating a GPR at every branch point (even with a life
> of a single insn, there's a cost) and perhaps the additional operands on
> conditional branches spoiling simplistic pattern matching in one or more
> passes.
>
> In addition to performing better based on dynamic instruction counts,
> the first approach is significantly simpler to implement.  Given those
> two positives, that's what I've chosen to go with.  Yes it does remove
> $ra from the set of registers available, but the impact of that is *tiny*.
>
> If someone wanted to dive into one of the other approaches to address a
> real world impact, that's great.  If that happens I would strongly
> suggest also evaluating perlbench from spec2017.  It seems particularly
> sensitive to this issue in terms of approach #2's impact on code generation.
>
> I've built & regression tested this variant on the vt1 configuration
> without regressions.  Earlier versions have been bootstrapped as well.
>
> Pushed to the trunk,
>
> Jeff
>

  reply	other threads:[~2023-10-11  0:24 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-10 22:11 Jeff Law
2023-10-11  0:24 ` Andrew Waterman [this message]
2023-10-11  3:26   ` Jeff Law
2023-10-11 12:55     ` Andrew Waterman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA++6G0CQ=fRQV0CSt-o8F_sSL_26rxfmyaGXHmpuSh7XZ31yLw@mail.gmail.com' \
    --to=andrew@sifive.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jlaw@ventanamicro.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).