Does -fstack-protector really need to clear registers?

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* Does -fstack-protector really need to clear registers?
@ 2020-08-05 17:12 Richard Sandiford
  2020-08-25 20:49 ` Jeff Law
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Sandiford @ 2020-08-05 17:12 UTC (permalink / raw)
  To: gcc

PR96191 [https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96191]
was a bug raised against the aarch64 implementation of -fstack-protector.
As it happens, the same bug affected arm, but AFAICT those are the only
two affected targets.

-fstack-protector works by:

* creating a special canary slot in the stack frame,
* initialising it with a special canary value, and then
* checking at the end of the function whether the canary slot still has
  the correct value.

If the slot has changed value, the function calls a special stack-smash
handler that would typically abort the program.

On many targets, the code that sets up and tests the canary slot will
need to load the canary value into registers.  However, GCC tries
to guarantee that this value does not remain in registers beyond the
“test” and “set” operations.  For example, the documentation of the
stack_protect_test pattern says:

  This pattern, if defined, compares a @code{ptr_mode} value from the
  valid memory location in operand 1 with the memory in operand 0 without
  leaving the value in a register afterward and branches to operand 2 if
  the values were equal.

  If this pattern is not defined, then a plain compare pattern and
  conditional branch pattern is used.

The bug in the PR was that aarch64 (and arm) did this when setting up
the canary slot, but not when testing it.

However, it's not obvious (to me) whether this is really necessary and
what it's really protecting against.  Even if we zero out the registers
after these patterns, the canary value is still readily available by
other means:

(1) We don't make any effort to hide the address of the canary value
    (typically &__stack_chk_guard, although some targets support
    alternatives).  It's not obvious what “hiding” this address
    would actually mean in practice, since it would often be easily
    predictable from other non-secret addresses.

(2) The canary value is often available in stack locations, such as:

    (a) a canary in the current function, if the current function
        uses stack-smash protection

    (b) a “protected” caller further up the call stack

    (c) canary values left around by previous calls, if the canary value
        happens to occupy “dead space” in the current frame

    And being on the stack is of course fundamental to the whole scheme.

It's also not clear where the requirement came from.  It didn't seem to
be in IBM's original implementation:

  https://gcc.gnu.org/pipermail/gcc-patches/2004-September/148566.html

but was part of rth's initial reimplementation:

  https://gcc.gnu.org/pipermail/gcc-patches/2005-May/170338.html

LLVM has taken the deliberate decision *not* to zero out the registers
for set and test patterns, see:

  https://bugs.llvm.org/show_bug.cgi?id=46994

So if this zeroing isn't offering any significant protection, there's a
danger that GCC is being uncompetitive on code size for very little gain.

On the other side, I guess there's the argument that this is typically
“only” 2 extra instructions per function.  I think those instructions
still have to pay their way though.  E.g. for two more instructions we
could clear the address register, and for one more instruction we could
zero out the stack slot after testing it.  It isn't obvious to me why
the two extra instructions we currently have are worth it but those
five aren't.

Any thoughts?  Is there a specific reason for the current trade-off?
Is it something we should revisit?

Thanks,
Richard

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Does -fstack-protector really need to clear registers?
  2020-08-05 17:12 Does -fstack-protector really need to clear registers? Richard Sandiford
@ 2020-08-25 20:49 ` Jeff Law
  2020-08-26  9:29   ` Florian Weimer
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff Law @ 2020-08-25 20:49 UTC (permalink / raw)
  To: Richard Sandiford, gcc

On Wed, 2020-08-05 at 18:12 +0100, Richard Sandiford wrote:
> PR96191 [https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96191]
> was a bug raised against the aarch64 implementation of -fstack-protector.
> As it happens, the same bug affected arm, but AFAICT those are the only
> two affected targets.
> 
> -fstack-protector works by:
> 
> * creating a special canary slot in the stack frame,
> * initialising it with a special canary value, and then
> * checking at the end of the function whether the canary slot still has
>   the correct value.
> 
> If the slot has changed value, the function calls a special stack-smash
> handler that would typically abort the program.
> 
> On many targets, the code that sets up and tests the canary slot will
> need to load the canary value into registers.  However, GCC tries
> to guarantee that this value does not remain in registers beyond the
> “test” and “set” operations.  For example, the documentation of the
> stack_protect_test pattern says:
> 
>   This pattern, if defined, compares a @code{ptr_mode} value from the
>   valid memory location in operand 1 with the memory in operand 0 without
>   leaving the value in a register afterward and branches to operand 2 if
>   the values were equal.
> 
>   If this pattern is not defined, then a plain compare pattern and
>   conditional branch pattern is used.
> 
> The bug in the PR was that aarch64 (and arm) did this when setting up
> the canary slot, but not when testing it.
> 
> However, it's not obvious (to me) whether this is really necessary and
> what it's really protecting against.  Even if we zero out the registers
> after these patterns, the canary value is still readily available by
> other means:
I suspect folks were just being paranoid and trying to do the right thing. The
fact there's other way to get at the data doesn't necessarily mean we should
leave it lying around in a convenient register.


> 
> (1) We don't make any effort to hide the address of the canary value
>     (typically &__stack_chk_guard, although some targets support
>     alternatives).  It's not obvious what “hiding” this address
>     would actually mean in practice, since it would often be easily
>     predictable from other non-secret addresses.
Well, isn't that only when the target doesn't support storing the guard in thread
local storage??   ie, I don't think you can get it consistently this way.

Now the pointer guard is a completely different story and I've been trying to get
that fixed for years ;(  Though I must admit it was awful nice to have it
available in a global when I needed to debug a problem related to pointer
mangling in glibc...



> 
> (2) The canary value is often available in stack locations, such as:
> 
>     (a) a canary in the current function, if the current function
>         uses stack-smash protection
> 
>     (b) a “protected” caller further up the call stack
> 
>     (c) canary values left around by previous calls, if the canary value
>         happens to occupy “dead space” in the current frame
> 
>     And being on the stack is of course fundamental to the whole scheme.
Yea, but these you have to go and find ;-)

> 

Jeff


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Does -fstack-protector really need to clear registers?
  2020-08-25 20:49 ` Jeff Law
@ 2020-08-26  9:29   ` Florian Weimer
  0 siblings, 0 replies; 3+ messages in thread
From: Florian Weimer @ 2020-08-26  9:29 UTC (permalink / raw)
  To: Jeff Law via Gcc

* Jeff Law via Gcc:

>> (1) We don't make any effort to hide the address of the canary value
>>     (typically &__stack_chk_guard, although some targets support
>>     alternatives).  It's not obvious what “hiding” this address
>>     would actually mean in practice, since it would often be easily
>>     predictable from other non-secret addresses.

> Well, isn't that only when the target doesn't support storing the
> guard in thread local storage??  ie, I don't think you can get it
> consistently this way.

The downside of the reference canary in the TCB is that the TCB is
writable and (in glibc) on the stack.  So if you overflow goes far
enough, you can overwrite both the canary and the reference.

With a bit more toolchain support, we could have a randomness relocation
(processed by the dynamic loader) and put the reference value into
.data.relro.  Sure, its address maybe easier to guess, but you can
easily have several canaries, and the overflow-into-the-TCB trick will
no longer work.  I think this is what OpenBSD did at some point.

But targets that do not have convenient PC-relative addressing probably
want to keep using the TCB approach.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-08-26  9:29 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-05 17:12 Does -fstack-protector really need to clear registers? Richard Sandiford
2020-08-25 20:49 ` Jeff Law
2020-08-26  9:29   ` Florian Weimer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).