Re: [PATCH][RFA/RFC] Stack clash mitigation patch 07/08

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Trevor Saunders <tbsaunde@tbsaunde.org>
To: Jeff Law <law@redhat.com>
Cc: Wilco Dijkstra <Wilco.Dijkstra@arm.com>,
	GCC Patches <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>
Subject: Re: [PATCH][RFA/RFC] Stack clash mitigation patch 07/08
Date: Sun, 16 Jul 2017 18:36:00 -0000	[thread overview]
Message-ID: <20170716183647.xemddebcpp562qfr@ball> (raw)
In-Reply-To: <96154a33-ab91-8b37-5b0e-ee0c019810c6@redhat.com>

On Wed, Jul 12, 2017 at 03:08:28PM -0600, Jeff Law wrote:
> On 07/12/2017 06:47 AM, Trevor Saunders wrote:
> > On Tue, Jul 11, 2017 at 08:02:26PM -0600, Jeff Law wrote:
> >> On 07/11/2017 06:20 PM, Wilco Dijkstra wrote:
> >>> Jeff Law wrote:
> >>>> aarch64 is the first target that does not have any implicit probes in
> >>>> the caller.  Thus at prologue entry it must make conservative
> >>>> assumptions about the offset of the most recent probed address relative
> >>>> to the stack pointer.
> >>>
> >>> No - like I mentioned before that's not correct nor acceptable as it would imply
> >>> that ~70% of functions need a probe at entry. I did a quick run across SPEC and
> >>> found the outgoing argument size is > 1024 in just 9 functions out of 147000!
> >>> Those 9 were odd special cases due to auto generated code to interface between
> >>> C and Fortran. This is extremely unlikely to occur anywhere else. So even assuming
> >>> an unchecked caller, large outgoing arguments are simply not a realistic threat.
> >> Whether or not such frames exist in SPEC isn't the question.   THere's
> >> nothing in the ABI or ISA that allows us to avoid those probes without
> >> compromising security.
> >>
> >> Mixed code scenarios are going to be a fact of life, probably forever
> >> unless we can convince every ISV providing software that works on top of
> >> RHEL/*BSD/whatever  to turn on probing (based on my experiences, that
> >> has exactly a zero chance of occurring).
> > 
> > On the other hand if probing is fast enough that it can be on by default
> > in gcc there should be much less of it.  Even if you did change the ABI
> > to require probing it seems unlikely that code violating that
> > requirement would hit problems other than this security concern, so I'd
> > expect there will be some non compliant asm out there.
> Certainly my goal is to enable it by default one day.  Even if that's
> ultimately done at the distro level or by a configure time switch.
> 
> Certainly if/when we reach that point the amount of unprotected code
> will drop dramatically, but based on my experience I fully expect some
> folks to turn it off.  They'll say something like "hey, we've audited
> our code and we don't have any large stack or allocas, so I'm turning
> this thing off to get some un-measurable performance gain."  It'd be an
> uber-dumb thing to do, but it's going to happen.

I agree with that, though we could consider not having an option for it.
Of course people can compile there own compiler, but I think its pretty
clear if you do that the security problem is your fault too.

> And once that happens, they're open to attack again, even if they were
> correct in their audit and their code only calls stack-clash protected
> libraries.
> 
> Tweaking the ABI to mandate touching *sp in the outgoing args area &
> alloca space is better because we likely wouldn't have an option to
> avoid that access.   So a well meaning, but clueless, developer couldn't
> turn the option off like they could stack checking.

However see the comment about assembly code, I can easily see someone
forgeting to touch *sp in hand written assembly.  Obviously much less of
that, but it kind of sounds like you want perfect here.

> >> In a mixed code scenario a caller may have a large alloca or large
> >> outgoing argument area that pushes the stack pointer into the guard page
> >> without actually accessing the guard page.  That requires a callee which
> >> is compiled with stack checking to make worst case assumptions at
> >> function entry to protect itself as much as possible from these attacks.
> > 
> > It seems to me pretty important to ask how many programs out there have
> > a caller that can push the stack into the guard page, but not past it.
> I've actually spent a lot of time thing about that precise problem. You
> don't need large frames to do that -- you just need controlled heap
> leaks and/or controlled recursion.  ie, even if a function has no
> allocas and a small frame, it can put the stack pointer into the guard.

There may not be room for it on 32 bit platforms, but I think another
thing we learned here is that its a mistake to allow the heap to grow
into space the stack might use.  That would require the use of recursion
here.

> THus in the callee we have to make worst case assumptions to be safe on
> targets where there is no implicit probe in the caller.
> 
> 
> 
> > I'd expect that's not the case for most allocas, either they are safe,
> > or can increase the stack arbitrarily.  I'd expect its more likely with
> > outgoing arg or large buffer allocations though.
> Unfortunately not.  A small alloca is not inherently safe unless you
> also touch the allocated space.  That's precisely why the generic code
> touches residuals after the loop that handles PROBE_INTERVAL chunks
> (consider a small alloca in a loop).
> 
> That's also why I suggested a small backwards compatible tweak to the
> aarch64 ABI.  Specifically there should be a mandated touch of *sp in
> any function where the outgoing args size is greater than some well
> defined value or if the function allocates any dynamic stack space.
> 
> That mandated touch of *sp at the ABI level would be enough to allow for
> a much less conservative assumed state at function start and would allow
> the vast majority of functions to need no probing at all.
> 
> 
> 
>   I think the largest
> > buffer Qualys found was less than 400k? So 1 256k guard page should
> > protect 95% of functions, and 1m or 2m  seems like enough to protect
> > against all non malicious programs.  I'm not sure, but this is a 64 bit
> > arch, so it seems like we should have the adress space for large guard
> > pages like that.
> I'm all for larger guards, particularly on 64 bit architectures.
> 
> We use 64k pages on aarch64 for RHEL which implicitly gives us a minimum
> guard of 64k.  That would be a huge factor in any analysis I would do if
> the aarch64 maintainers choose not to fully protect their architecture
> and I was forced to make a recommendation for Red Hat and its customers.
>  I hope I don't have to sit down and do the analysis on this and make
> any kind of recommendation.

Its certainly easier to say its not the compilers job to fully protect
when you don't have to make that recommendation ;)

> The fact that Qualys found nothing larger than X (for any X) in the code
> they scanned isn't relevant.  There could well be code out there they
> did not look at that uses > X or code that is yet to be written that
> uses > X.

If you want to protect all code someone could write I agree.  I've been
thinking more about protecting most reasonable code and saying if you
allocate a 50mb buffer on the stack that's your bug, and we don't need
to make that safe.

> > 
> > There's also the trade off that increasing the amount of code that
> > probes reduces the set of code that can either move the stack into or
> > past the guard page.
> Correct.
> 
> > 
> >> THe aarch64 maintainers can certain nix what I've done or modify it in
> >> ways that suit them.  That is their choice as port maintainers.
> >> However, Red Hat will have to evaluate the cost of reducing security for
> >> our customer base against the performance improvement of such changes.
> >> As I've said before, I do not know where that decision would fall.
> >>
> >>
> >>>
> >>> Therefore even when using a tiny 4K probe size we can safely adjust SP by 3KB
> >>> before needing an explicit probe - now only 0.6% of functions need a probe.
> >>> If we choose a proper minimum probe distance, say 64KB, explicit probes are
> >>> basically non-existent (just 35 functions, or ~0.02% of all functions are > 64KB).
> >>> Clearly inserting probes can be the default as the impact on code quality is negligible.
> >> Again, there's nothing that says 3k is safe.   You're picking an
> >> arbitrary point that is safe in a codebase you've looked at.  But I'm
> >> looking at this from a "what guarantees do I have from an ABI or ISA
> >> standpoint".  The former may be more performant, but it's inherently
> >> less secure than the latter.

It might well not be what you need, but it does kind of seem like what's
generally useful is protecting almost all if not all "reasonable"
programs with the least performance impact.  Looking at various real
programs to decide what is reasonable or not seems like a good way to
draw that line.

> > On the other hand making -fstack-check=clash the default seems to me
> > like a very significant security improvement.
> Agreed and it's where I'd like this to go.

yeah, I'm just wondering if it isn't possible to get there already on 64
bit machines with some kernel changes.

thanks / sorry about lag.

Trev

> 
> Jeff

next prev parent reply	other threads:[~2017-07-16 18:36 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-12  0:20 Wilco Dijkstra
2017-07-12  2:02 ` Jeff Law
2017-07-12 12:47   ` Trevor Saunders
2017-07-12 21:08     ` Jeff Law
2017-07-16 18:36       ` Trevor Saunders [this message]
2017-07-17 15:51         ` Jeff Law
2017-07-17 20:30           ` Trevor Saunders
2017-07-14  6:19 ` Jeff Law
2017-07-17 11:27   ` Wilco Dijkstra
2017-07-17 15:24     ` Jeff Law
2017-07-17 15:31       ` Jakub Jelinek
2017-07-17 22:42       ` Wilco Dijkstra
2017-07-17 23:16         ` Jeff Law
2017-07-14  8:17 ` Jakub Jelinek
2017-07-14 11:49   ` Wilco Dijkstra
  -- strict thread matches above, loose matches on Subject: below --
2017-07-11 21:21 Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170716183647.xemddebcpp562qfr@ball \
    --to=tbsaunde@tbsaunde.org \
    --cc=Wilco.Dijkstra@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    --cc=nd@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).