public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Segher Boessenkool <segher@kernel.crashing.org>
To: Jeff Law <law@redhat.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH][RFA/RFC] Stack clash mitigation patch 06/08 - V3
Date: Tue, 05 Sep 2017 21:48:00 -0000	[thread overview]
Message-ID: <20170905214825.GC13471@gate.crashing.org> (raw)
In-Reply-To: <e184d5bb-e822-42a0-3d08-5f087a192035@redhat.com>

Hi!

On Sat, Sep 02, 2017 at 12:31:16AM -0600, Jeff Law wrote:
> On 08/29/2017 05:14 PM, Segher Boessenkool wrote:
> > Actually, everywhere it is used it has a Pmode == SImode wart before
> > it, to emit the right update_stack insn...  So fold that into this
> > function, name it rs6000_emit_allocate_stack_1?
> Agreed.  That seems like a nice cleanup.  Look at the call from
> rs6000_emit_allocate_stack.  Instead of a Pmode == SImode, it tests
> TARGET_32BIT instead.  But I think that one is still safe to convert
> based on how we set Pmode in rs6000_option_override_internal.

TARGET_32BIT is exactly the same as Pmode == SImode.  Sometimes the
latter is more expressive (if it describes more directly what is being
tested).

> > You don't describe what the return value here is (but neither does
> > rs6000_emit_allocate_stack); it is the single instruction that actually
> > changes the stack pointer?  For the split stack support?  (Does the stack
> > clash code actually work with split stack, has that been tested?)
> As far as I was able to ascertain (and essentially duplicate) we return
> the insn that allocates the stack, but only if the allocation was
> handled a single insn.  Otherwise we return NULL.
> 
> But that was determined largely by guesswork.  It interacts with split
> stack support, but I'm not entirely what it's meant to do.  If you have
> insights here, I'll happily add comments to the routines -- when I was
> poking at this stuff I was rather distressed to not have any real
> guidance on what the return value should be.
> 
> I have not tested stack clash and split stacks. ISTM they ought to be
> able to work together, but I haven't though real deep about it.

(To test, I think testing Go on 64-bit is the best you can do).

rs6000_emit_prologue has a comment:

  /* sp_adjust is the stack adjusting instruction, tracked so that the
     insn setting up the split-stack arg pointer can be emitted just
     prior to it, when r12 is not used here for other purposes.  */

> > Maybe we should keep track of that sp_adjust insn in a global, or in
> > machine_function, or even in the stack info struct.
> Maybe.  It's certainly not obvious to me what is does and how it does
> it.  THe physical distance between where it gets set and modified and
> its use point doesn't help.

Yes, and it complicates control flow.

> > I'm not sure r0 is always free to use here, it's not obvious.  You can't
> > use the START_USE etc. macros to check, those only work inside
> > rs6000_emit_prologue.  I'll figure something out.
> Should I just leave it as-is until you have a suggestion for the right
> way to find a temporary at this point?

Sure, that's fine.

> > The way more common case is no probes at all, but that is handled in the
> > caller; maybe move it to this function instead?
> The comment is correct for what's handled by that code.  As you note the
> most common case is no probing needed and in those cases we never call
> rs6000_emit_probe_stack_range_stack_clash.
> 
> The code was written that way so that the common case (no explicit
> probes) would just fall into the old code.  That minimized the
> possibility that I mucked anything up :-)

I think it's cleaner if you don't handle that check externally, but
you decide.

> > It seems the only thing preventing the probes from being elided is that
> > GCC does not realise the probe stores are not actually needed?  I guess
> > that should be enough, but maybe there should be a test for this.
> I'm not sure what you're asking here.  In the code above we're
> allocating a multiple of PROBE_INTERVAL bytes.  We must probe every
> PROBE_INTERVAL bytes.

Yes, but what prevents later passes from getting rid of the stores?
Maybe some volatile is needed?  The stack pointer updates cannot easily
be removed by anything, but relying on that to also protect the stores
is a bit fragile.

> >> +	{
> >> +	  emit_move_insn (end_addr, GEN_INT (-rounded_size));
> >> +	  insn = emit_insn (gen_add3_insn (end_addr, end_addr,
> >> +					   stack_pointer_rtx));
> >> +	  add_reg_note (insn, REG_FRAME_RELATED_EXPR,
> >> +			gen_rtx_SET (end_addr,
> >> +				     gen_rtx_PLUS (Pmode, stack_pointer_rtx,
> >> +						   rs)));
> >> +	}
> >> +      RTX_FRAME_RELATED_P (insn) = 1;
> > 
> > What are these notes for?  Not totally obvious...  could use a comment :-)
> At a high level we're describing to the CFI machinery how to compute the
> CFA.
> 
> ie we have
> 
> (set (endreg) (-rounded_size))
> (set (endreg) (endreg) (stack_pointer_rtx))
> 
> But only the second insn actually participates in the CFA computation
> because it's the only one marked as RTX_FRAME_RELATED.  But the CFI
> machinery isn't going to be able to figure out what that insn does.  So
> we describe it with a note.  Think of it as a REG_EQUAL note for the CFI
> machinery.

Is end_addr used for calculating the CFA in any way?  That's what I'm
not seeing.

> >> +/* Probe a range of stack addresses from REG1 to REG3 inclusive.  These are
> >> +   absolute addresses.  REG2 contains the outer stack pointer that must be
> >> +   stored into *sp at each allocation.
> > 
> > s/outer stack pointer/pointer to the previous frame/ ?  Or just say
> > "backchain"?
> I'll just use backchain here and in another instance I found.

Thanks.  That's the terminology the ABIs use.

> >> @@ -10272,6 +10272,15 @@
> >>    rtx neg_op0;
> >>    rtx insn, par, set, mem;
> >>  
> >> +  /* By allowing reg_or_cint_operand as the predicate we can get
> >> +     better code for stack-clash-protection because we do not lose
> >> +     size information.  But the rest of the code expects the operand
> >> +     to be reg_or_short_operand.  If it isn't, then force it into
> >> +     a register.  */
> >> +  rtx orig_op1 = operands[1];
> >> +  if (!reg_or_short_operand (operands[1], Pmode))
> >> +    operands[1] = force_reg (Pmode, operands[1]);
> > 
> > I don't understand the "we do not lose size information" bit.  Maybe
> > you can show an example?
> If you have a large constant alloca -- large enough that the size does
> not fit in a reg_or_cint_operand, then the generic code in explow.c will
> force the size into a register before calling this expander.
> 
> Thus we don't know the allocation is constant within this expander which
> limits our ability to rotate the loop, unroll the loop and/or elide any
> residual allocations.

Ah, I see.  Could you put that tidbit in the comment please?  Something
like "If we changed the operands[1] contraint to reg_or_short_operand
we would not notice in here we are asked for a constant size allocation
if it does not fit 16 bits (the caller would force the constant into a
register)."

> >> +      /* We do occasionally get in here with constant sizes, we might
> >> +	 as well do a reasonable job when we obviously can.  */
> >> +      if (rounded_size != CONST0_RTX (Pmode))
> > 
> > That's just const0_rtx.
> The canonical way is to test against CONST0_RTX (mode).  Testing
> "const0_rtx" may be safe here, but in general the right way is
> CONST0_RTX (mode).

But mode is an integer mode here, so const0_rtx is fine (and shorter
and easier to read, and more usual).

Thanks,


Segher

  reply	other threads:[~2017-09-05 21:48 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-31  5:47 Jeff Law
2017-08-30  8:34 ` Segher Boessenkool
2017-09-02  6:31   ` Jeff Law
2017-09-05 21:48     ` Segher Boessenkool [this message]
2017-09-07 17:27       ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170905214825.GC13471@gate.crashing.org \
    --to=segher@kernel.crashing.org \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).