From: Palmer Dabbelt <palmer@dabbelt.com>
To: jlaw@ventanamicro.com
Cc: manolis.tsamis@vrull.eu, gcc-patches@gcc.gnu.org,
Vineet Gupta <vineetg@rivosinc.com>,
Kito Cheng <kito.cheng@gmail.com>,
philipp.tomsich@vrull.eu
Subject: Re: [PATCH] Enable shrink wrapping for the RISC-V target.
Date: Tue, 18 Oct 2022 10:35:42 -0700 (PDT) [thread overview]
Message-ID: <mhng-1ef6a2eb-a41e-4eb7-931a-04efc949db61@palmer-ri-x1c9a> (raw)
In-Reply-To: <2e2ca1dc-902e-23ca-babf-11a4978530ef@ventanamicro.com>
On Tue, 18 Oct 2022 08:57:37 PDT (-0700), jlaw@ventanamicro.com wrote:
>
> Just a couple more comments in-line.
>
> On 10/18/22 09:18, Manolis Tsamis wrote:
>>
>>>> +/* Implement TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS. */
>>>> +
>>>> +static sbitmap
>>>> +riscv_get_separate_components (void)
>>>> +{
>>>> + HOST_WIDE_INT offset;
>>>> + sbitmap components = sbitmap_alloc (FIRST_PSEUDO_REGISTER);
>>>> + bitmap_clear (components);
>>>> +
>>>> + if (riscv_use_save_libcall (&cfun->machine->frame)
>>>> + || cfun->machine->interrupt_handler_p)
>>> riscv_use_save_libcall() already checks interrupt_handler_p, so that's
>>> redundant. That said, I'm not sure riscv_use_save_libcall() is the
>>> right check here as unless I'm missing something we don't have all those
>>> other constraints when shrink-wrapping.
>>>
>> riscv_use_save_libcall returns false when interrupt_handler_p is true, so the
>> check for interrupt_handler_p in the branch is not redundant in this case.
>>
>> I encountered some issues when shrink wrapping and libcall was used in the same
>> function. Thinking that libcall replaces the prologue/epilogue I didn't see a
>> reason to have both at the same time and hence I opted to disable
>> shrink wrapping in that case. From my understanding this should be harmless?
>
> I would have expected things to work fine with libcalls, perhaps with
> the exception of the save/restore libcalls. So that needs deeper
> investigation.
The save/restore libcalls only support saving/restoring a handful of
register configurations (just the saved X registers in the order they're
usually saved in by GCC). It should be OK for correctness to over-save
registers, but it kind of just un-does the shrink wrapping so not sure
it's worth worrying about at that point.
There's also some oddness around the save/restore libcall ABI, it's not
the standard function ABI but instead a GCC-internal one. IIRC it just
uses the alternate link register (ie, t0 instead of ra) but I may have
forgotten something else.
>>> It seems kind of clunky to have two copies of all these loops (and we'll
>>> need a third to make this work with the V stuff), but we've got that
>>> issue elsewhere in the port so I don't think you need to fix it here
>>> (though the V stuff will be there for the v2, so you'll need the third
>>> copy of each loop).
>>>
>> Indeed, I was following the other ports here. Do you think it would be
>> better to refactor this when the code for the V extension is added?
>> By taking into account what code will be needed for V, a proper refactored
>> function could be made to handle all cases.
>
> I think refactoring when V gets added would be fine. While we could
> probably refactor it correctly now (it isn't terribly complex code after
> all), but we're more likely to get it right with the least amount of
> work if we do it when V is submitted.
Some of the V register blocks are already there, but ya I agree we can
just wait. There's going to be a bunch of V-related churn for a bit,
juggling those patches is already enough of a headache ;)
>>> Either way, this deserves a test case. I think it should be possible to
>>> write one by introducing some register pressure around a
>>> shrink-wrappable block that needs a long stack offset and making sure
>>> in-flight registers don't get trashed.
>>>
>> I tried to think of some way to introduce a test like that but couldn't and
>> I don't see how it would be done. Shrink wrapping only affects saved registers
>> so there are always available temporaries that are not affected by
>> shrink wrapping.
>> (Register pressure should be irrelevant in this case if I understand correctly).
>> Also the implementation checks for SMALL_OPERAND (offset) shrink wrapping
>> should be unaffected from long stack offsets. If you see some way to write
>> a test for that based on what I explained please explain how I could do that.
>
> I think the register pressure was just to ensure that some saves were
> needed to trigger an attempt to shrink wrap something. You'd also need
> something to eat stack space (local array which gets referenced as an
> asm operand, but where the asm doesn't generate any code perhaps)?
> Whether or not that works depends on stack layout though which I don't
> know well enough for riscv.
Sorry for being a bit vague, but it's because I always find it takes a
bit of time to write up tests like this. I think something like this
might do it, but that almost certainly won't work as-is:
// Some extern bits to try and trip up the optimizer.
extern long helper(long *sa, long a, long b, long c, ...);
extern long glob_array[1024];
// The function takes a bunch of arguments to fill up the A
// registers.
long func(long a, long b, long c, ...) {
long stack_array[1024]; // should be big enough to make SP offsets
// take a temporary
// Here I'm loading up all the T registers with something that
// must be saved to the stack, since those helper calls below may
// change the value of glob_array. We probably need to fill the S
// registers too?
long t0 = glob_array[arg + 0];
long t1 = glob_array[arg + 1];
long t2 = ...
// This should trigger shrink-wrapping, as the function calls will
// need a bunch of register state saved. The idea is to make sure
// all the other registers are somehow in flight at this point and
// could only be recovered via their saved copies, but due to the
// large stack we'll need a temporary to construct the slot
// addresses.
if (a) {
t0 = helper(stack_array, a, b, ...);
}
// Go re-use those temporaries, again indirect to avoid
// optimization.
a += glob_array[t0];
b += glob_array[t1];
c ...;
return helper(stack_array, a, b, c, ...);
}
At that point you should end up with the save/restore code in that if
block needing to do something like
lui TMP, IMM_HI
addi TMP, sp, IMM_LO
sd REG0, 0(t0)
sd REG1, 8(t0)
...
and if one of those registers it's trying to save is TMP then we're in
trouble.
> Generally looks pretty good though.
Ya, and a useful optimization so thanks for the help ;)
>
>
> Jeff
next prev parent reply other threads:[~2022-10-18 17:35 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-06 10:39 mtsamis
2022-10-02 20:32 ` Palmer Dabbelt
2022-10-18 15:18 ` Manolis Tsamis
2022-10-18 15:57 ` Jeff Law
2022-10-18 17:35 ` Palmer Dabbelt [this message]
2022-10-19 17:15 ` Jeff Law
2022-10-20 7:42 ` Manolis Tsamis
2022-11-02 14:12 ` Manolis Tsamis
2022-11-02 15:02 ` Jeff Law
2022-10-20 7:35 ` Manolis Tsamis
2022-11-02 13:54 ` Manolis Tsamis
2022-11-02 15:06 ` Jeff Law
2022-11-03 0:26 ` Palmer Dabbelt
2022-11-03 22:23 ` Jeff Law
2022-11-07 22:07 ` Palmer Dabbelt
2022-11-13 1:32 ` Jeff Law
2022-11-16 10:26 ` Manolis Tsamis
2022-11-17 2:09 ` Jeff Law
2022-11-17 10:54 ` Manolis Tsamis
2022-11-17 11:59 ` Philipp Tomsich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=mhng-1ef6a2eb-a41e-4eb7-931a-04efc949db61@palmer-ri-x1c9a \
--to=palmer@dabbelt.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jlaw@ventanamicro.com \
--cc=kito.cheng@gmail.com \
--cc=manolis.tsamis@vrull.eu \
--cc=philipp.tomsich@vrull.eu \
--cc=vineetg@rivosinc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).