Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)
Date: Tue, 10 Jul 2018 13:43:00 -0000	[thread overview]
Message-ID: <f3bcd108-a10d-776d-d521-c06594d7155e@arm.com> (raw)
In-Reply-To: <CAFiYyc2YPu_2DsdMpYnKJqBWQFodBGfGUiXcdCEcx+mE9ZUWpw@mail.gmail.com>

On 10/07/18 12:21, Richard Biener wrote:
> On Tue, Jul 10, 2018 at 12:53 PM Richard Earnshaw (lists)
> <Richard.Earnshaw@arm.com> wrote:
>>
>> On 10/07/18 11:10, Richard Biener wrote:
>>> On Tue, Jul 10, 2018 at 10:39 AM Richard Earnshaw (lists)
>>> <Richard.Earnshaw@arm.com> wrote:
>>>>
>>>> On 10/07/18 08:19, Richard Biener wrote:
>>>>> On Mon, Jul 9, 2018 at 6:39 PM Richard Earnshaw
>>>>> <Richard.Earnshaw@arm.com> wrote:
>>>>>>
>>>>>>
>>>>>> The patches I posted earlier this year for mitigating against
>>>>>> CVE-2017-5753 (Spectre variant 1) attracted some useful feedback, from
>>>>>> which it became obvious that a rethink was needed.  This mail, and the
>>>>>> following patches attempt to address that feedback and present a new
>>>>>> approach to mitigating against this form of attack surface.
>>>>>>
>>>>>> There were two major issues with the original approach:
>>>>>>
>>>>>> - The speculation bounds were too tightly constrained - essentially
>>>>>>   they had to represent and upper and lower bound on a pointer, or a
>>>>>>   pointer offset.
>>>>>> - The speculation constraints could only cover the immediately preceding
>>>>>>   branch, which often did not fit well with the structure of the existing
>>>>>>   code.
>>>>>>
>>>>>> An additional criticism was that the shape of the intrinsic did not
>>>>>> fit particularly well with systems that used a single speculation
>>>>>> barrier that essentially had to wait until all preceding speculation
>>>>>> had to be resolved.
>>>>>>
>>>>>> To address all of the above, these patches adopt a new approach, based
>>>>>> in part on a posting by Chandler Carruth to the LLVM developers list
>>>>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>>>>>> but which we have extended to deal with inter-function speculation.
>>>>>> The patches divide the problem into two halves.
>>>>>>
>>>>>> The first half is some target-specific code to track the speculation
>>>>>> condition through the generated code to provide an internal variable
>>>>>> which can tell us whether or not the CPU's control flow speculation
>>>>>> matches the data flow calculations.  The idea is that the internal
>>>>>> variable starts with the value TRUE and if the CPU's control flow
>>>>>> speculation ever causes a jump to the wrong block of code the variable
>>>>>> becomes false until such time as the incorrect control flow
>>>>>> speculation gets unwound.
>>>>>>
>>>>>> The second half is that a new intrinsic function is introduced that is
>>>>>> much simpler than we had before.  The basic version of the intrinsic
>>>>>> is now simply:
>>>>>>
>>>>>>       T var = __builtin_speculation_safe_value (T unsafe_var);
>>>>>>
>>>>>> Full details of the syntax can be found in the documentation patch, in
>>>>>> patch 1.  In summary, when not speculating the intrinsic returns
>>>>>> unsafe_var; when speculating then if it can be shown that the
>>>>>> speculative flow has diverged from the intended control flow then zero
>>>>>> is returned.  An optional second argument can be used to return an
>>>>>> alternative value to zero.  The builtin may cause execution to pause
>>>>>> until the speculation state is resolved.
>>>>>
>>>>> So a trivial target implementation would be to emit a barrier and then
>>>>> it would always return unsafe_var and never zero.  What I don't understand
>>>>> fully is what users should do here, thus what the value of ever returning
>>>>> "unsafe" is.  Also I wonder why the API is forcing you to single-out a
>>>>> special value instead of doing
>>>>>
>>>>>  bool safe = __builtin_speculation_safe_value_p (T unsafe_value);
>>>>>  if (!safe)
>>>>>    /* what now? */
>>>>>
>>>>> I'm only guessing that the correct way to handle "unsafe" is basically
>>>>>
>>>>>  while (__builtin_speculation_safe_value (val) == 0)
>>>>>     ;
>>>>>
>>>>> use val, it's now safe
>>>>
>>>> No, a safe version of val is returned, not a bool telling you it is now
>>>> safe to use the original.
>>>
>>> OK, so making the old value dead is required to preserve the desired
>>> dataflow.
>>>
>>> But how should I use the special value that signaled "failure"?
>>>
>>> Obviously the user isn't supposed to simply replace 'val' with
>>>
>>>  val = __builtin_speculation_safe_value (val);
>>>
>>> to make it speculation-proof.  So - how should the user _use_ this
>>> builtin?  The docs do not say anything about this but says the
>>> very confusing
>>>
>>> +The function may use target-dependent speculation tracking state to cause
>>> +@var{failval} to be returned when it is known that speculative
>>> +execution has incorrectly predicted a conditional branch operation.
>>>
>>> because speculation is about executing instructions as if they were
>>> supposed to be executed.  Once it is known the prediciton was wrong
>>> no more "wrong" instructions will be executed but a previously
>>> speculated instruction cannot know it was "falsely" speculated.
>>>
>>> Does the above try to say that the function may return failval if the
>>> instruction is currently executed speculatively instead?  That would
>>> make sense to me.  And return failval independent of if the speculation
>>> later turns out to be correct or not.
>>>
>>>>  You must use the sanitized version in future,
>>>> not the unprotected version.
>>>>
>>>>
>>>> So the usage is going to be more like:
>>>>
>>>> val = __builtin_speculation_safe_value (val);  // Overwrite val with a
>>>> sanitized version.
>>>>
>>>> You have to use the cleaned up version, the unclean version is still
>>>> vulnerable to incorrect speculation.
>>>
>>> But then where does failval come into play?  The above cannot be correct
>>> if failval matters.
>>
>> failval only comes into play if the CPU is speculating incorrectly.
>> It's just a safe value that some CPUs might use until such time as the
>> speculation gets unwound.  In most cases it can be zero.  Perhaps a more
>> concrete example would be something like.
>>
>>
>> void **mem;
>>
>> void* f(unsigned untrusted)
>> {
>>   if (untrusted < 100)
>>     return mem[untrusted];
>>   return NULL;
>> }
>>
>> This can be rewritten as either:
>>
>> void *mem;
>>
>> void* f(unsigned untrusted)
>> {
>>   if (untrusted < 100)
>>     return mem[__builtin_speculation_safe_value (untrusted)];
>>   return NULL;
>> }
>>
>> if leaking mem[0] is not a problem; or if you're really paranoid:
>>
>> void* f(unsigned untrusted)
>> {
>>   if (untrusted < 100)
>>     return *__builtin_speculation_safe_value (mem + untrusted);
>>   return NULL;
>> }
>>
>> which becomes a NULL pointer dereference if the speculation is
>> incorrect.  The programmer doesn't need to test this code (any test
>> would likely be useless anyway as the CPU would predict the outcome and
>> speculate based on such a prediction).  They do need to think a bit
>> about what might leak if the CPU does miss-speculate, hence the two
>> examples above.
>>
>> A more complex case, where you might want a different failure value can
>> come up if you have a mask operation.  A slightly convoluted example
>> (derived from a real example we found in the Linux kernel) might be
>>
>> if (mode32)
>>   mask_from = 0x100000000;
>> else
>>   mask_from = 0;
>>
>> ... // some code
>>
>> return mem[untrusted_val & (mask_from - 1)];
>>
>> It would probably be better to place the speculation cleanup nearer the
>> initialization of mask_from, especially if mask_from could be used more
>> than once; but in that case 0 is less safe than any other (since 0-1 is
>> all bits 1).  In this case you might write:
>>
>> if (mode32)
>>   mask_from = 0x100000000;
>> else
>>   mask_from = 0;
>>
>> mask_from = __builtin_speculation_safe_value (mask_from, 1);
>>
>> ... // some code
>>
>> return mem[untrusted_val & (mask_from - 1)];
>>
>> And in this case if the speculation is incorrect the (mask_from - 1)
>> expression becomes 0 and the whole range is protected.
>>
>> Note that speculation is fine if it is correct, we don't want to hold
>> things up in that case since it will happen anyway.  It's only a problem
>> if the speculation is incorrect :-)
> 
> Ok, please amend at least the user-level documentation with one or two
> examples like above.
> 
>>
>>>
>>> Does the builtin try to say that iff the current instruction is speculated
>>> then it will return failval, otherwise it will return val and thus following
>>> code needs to handle 'failval' in a way that is safe?
>>
>> No.  From an abstract machine sense the builtin only ever returns its
>> first argument.  The fail value only appears if the CPU guesses the
>> direction of a branch and guesses *incorrectly* (incorrect speculation)
> 
> I understand how it works when the implementation issues a fence.
> How does the actual implementation (assembly) look like that makes
> the CPU know whether it speculated wrongly or not at the time the
> value is needed for further speculation given the *__bsfs (mem + offset)
> instruction _is_ speculated as well.
> 

If you've not already read it, then I suggest you read
https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability/download-the-whitepaper
and the link I posted in the original mail to Chandler's posting on this
subject.

In essence, we start with a register (tracker) initialized to ~0.  At
every conditional branch we change

	b.cond tgt
	...
tgt:
	...

into

	b.cond tgt
	// tracker == ~cond ? tracker, 0
	csel	tracker, tracker, 0, ~cond
	...
tgt:
	// tracker == cond ? tracker, 0
	csel	tracker, tracker, 0, cond


Now when we need to inhibit speculation we emit

	and	safe_val, unsafe_val, tracker
	CSDB

Assuming we want the default result of 0 if speculating unsafely.  CSDB
is a light-weight barrier that ensures that all preceding
/data-processing/ instructions are using real results from earlier
instructions: it will resolve any data-flow speculation, but not
necessarily any control-flow speculation.

Obviously the inserted instructions only appear on the edges between the
blocks, but we take care of that during the insertion process.

R.

> Richard.
> 
>>>
>>> Again, the wording "when it is known that speculative execution has
>>> incorrectly predicted a conditional branch operation" is very confusing...
>>>
>>> Richard.
>>>
>>>> R.
>>>>
>>>>>
>>>>> that is, the return value is only interesting in sofar as to whether it is equal
>>>>> to val or the special value?
>>>>>
>>>>> That said, I wonder why we don't hide that details from the user and
>>>>> provide a predicate instead.
>>>>>
>>>>> Richard.
>>>>>
>>>>>> There are seven patches in this set, as follows.
>>>>>>
>>>>>> 1) Introduces the new intrinsic __builtin_sepculation_safe_value.
>>>>>> 2) Adds a basic hard barrier implementation for AArch32 (arm) state.
>>>>>> 3) Adds a basic hard barrier implementation for AArch64 state.
>>>>>> 4) Adds a new command-line option -mtrack-speculation (currently a no-op).
>>>>>> 5) Disables CB[N]Z and TB[N]Z when -mtrack-speculation.
>>>>>> 6) Adds the new speculation tracking pass for AArch64
>>>>>> 7) Uses the new speculation tracking pass to generate CSDB-based barrier
>>>>>>    sequences
>>>>>>
>>>>>> I haven't added a speculation-tracking pass for AArch32 at this time.
>>>>>> It is possible to do this, but would require quite a lot of rework for
>>>>>> the arm backend due to the limited number of registers that are
>>>>>> available.
>>>>>>
>>>>>> Although patch 6 is AArch64 specific, I'd appreciate a review from
>>>>>> someone more familiar with the branch edge code than myself.  There
>>>>>> appear to be a number of tricky issues with more complex edges so I'd
>>>>>> like a second opinion on that code in case I've missed an important
>>>>>> case.
>>>>>>
>>>>>> R.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Richard Earnshaw (7):
>>>>>>   Add __builtin_speculation_safe_value
>>>>>>   Arm - add speculation_barrier pattern
>>>>>>   AArch64 - add speculation barrier
>>>>>>   AArch64 - Add new option -mtrack-speculation
>>>>>>   AArch64 - disable CB[N]Z TB[N]Z when tracking speculation
>>>>>>   AArch64 - new pass to add conditional-branch speculation tracking
>>>>>>   AArch64 - use CSDB based sequences if speculation tracking is enabled
>>>>>>
>>>>>>  gcc/builtin-types.def                     |   6 +
>>>>>>  gcc/builtins.c                            |  57 ++++
>>>>>>  gcc/builtins.def                          |  20 ++
>>>>>>  gcc/c-family/c-common.c                   | 143 +++++++++
>>>>>>  gcc/c-family/c-cppbuiltin.c               |   5 +-
>>>>>>  gcc/config.gcc                            |   2 +-
>>>>>>  gcc/config/aarch64/aarch64-passes.def     |   1 +
>>>>>>  gcc/config/aarch64/aarch64-protos.h       |   3 +-
>>>>>>  gcc/config/aarch64/aarch64-speculation.cc | 494 ++++++++++++++++++++++++++++++
>>>>>>  gcc/config/aarch64/aarch64.c              |  88 +++++-
>>>>>>  gcc/config/aarch64/aarch64.md             | 140 ++++++++-
>>>>>>  gcc/config/aarch64/aarch64.opt            |   4 +
>>>>>>  gcc/config/aarch64/iterators.md           |   3 +
>>>>>>  gcc/config/aarch64/t-aarch64              |  10 +
>>>>>>  gcc/config/arm/arm.md                     |  21 ++
>>>>>>  gcc/config/arm/unspecs.md                 |   1 +
>>>>>>  gcc/doc/cpp.texi                          |   4 +
>>>>>>  gcc/doc/extend.texi                       |  29 ++
>>>>>>  gcc/doc/invoke.texi                       |  10 +-
>>>>>>  gcc/doc/md.texi                           |  15 +
>>>>>>  gcc/doc/tm.texi                           |  20 ++
>>>>>>  gcc/doc/tm.texi.in                        |   2 +
>>>>>>  gcc/target.def                            |  23 ++
>>>>>>  gcc/targhooks.c                           |  27 ++
>>>>>>  gcc/targhooks.h                           |   2 +
>>>>>>  gcc/testsuite/gcc.dg/spec-barrier-1.c     |  40 +++
>>>>>>  gcc/testsuite/gcc.dg/spec-barrier-2.c     |  19 ++
>>>>>>  gcc/testsuite/gcc.dg/spec-barrier-3.c     |  13 +
>>>>>>  28 files changed, 1191 insertions(+), 11 deletions(-)
>>>>>>  create mode 100644 gcc/config/aarch64/aarch64-speculation.cc
>>>>>>  create mode 100644 gcc/testsuite/gcc.dg/spec-barrier-1.c
>>>>>>  create mode 100644 gcc/testsuite/gcc.dg/spec-barrier-2.c
>>>>>>  create mode 100644 gcc/testsuite/gcc.dg/spec-barrier-3.c
>>>>
>>

next prev parent reply	other threads:[~2018-07-10 13:43 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-09 16:39 Richard Earnshaw
2018-07-09 16:39 ` [PATCH 1/7] Add __builtin_speculation_safe_value Richard Earnshaw
2018-07-23 14:28   ` Richard Earnshaw (lists)
2018-07-24 17:26   ` Richard Biener
2018-07-25  9:49     ` Richard Earnshaw (lists)
2018-07-25 10:36       ` Richard Biener
2018-07-25 12:41         ` Richard Earnshaw (lists)
2018-07-25 13:47           ` Richard Biener
2018-07-26 10:03             ` Richard Earnshaw (lists)
2018-07-26 12:41               ` Richard Biener
2018-07-26 13:06                 ` Richard Earnshaw (lists)
2018-07-26 13:13                   ` Richard Biener
2018-07-26 23:34           ` Joseph Myers
2018-07-27  0:46             ` Paul Koning
2018-07-27  8:59               ` Richard Earnshaw (lists)
2018-07-27 10:59                 ` Joseph Myers
2018-07-25 18:03     ` Richard Earnshaw (lists)
2018-07-26  8:42       ` Richard Biener
2018-07-09 16:39 ` [PATCH 6/7] AArch64 - new pass to add conditional-branch speculation tracking Richard Earnshaw
2018-07-11 21:01   ` Jeff Law
2018-07-23 14:33     ` Richard Earnshaw (lists)
2018-07-24 21:31       ` Jeff Law
2018-07-09 16:39 ` [PATCH 4/7] AArch64 - Add new option -mtrack-speculation Richard Earnshaw
2018-07-09 16:39 ` [PATCH 2/7] Arm - add speculation_barrier pattern Richard Earnshaw
2018-07-09 16:39 ` [PATCH 5/7] AArch64 - disable CB[N]Z TB[N]Z when tracking speculation Richard Earnshaw
2018-07-09 16:39 ` [PATCH 7/7] AArch64 - use CSDB based sequences if speculation tracking is enabled Richard Earnshaw
2018-07-09 16:39 ` [PATCH 3/7] AArch64 - add speculation barrier Richard Earnshaw
2018-07-09 23:13 ` [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753) Jeff Law
2018-07-10  8:49   ` Richard Earnshaw (lists)
2018-07-10 13:48     ` Bill Schmidt
2018-07-10 14:14       ` Richard Earnshaw (lists)
2018-07-10 15:44         ` Jeff Law
2018-07-10 15:42     ` Jeff Law
2018-07-10 16:43       ` Richard Earnshaw (lists)
2018-07-11 20:47         ` Jeff Law
2018-07-11 22:31           ` Richard Earnshaw (lists)
2018-07-10  7:19 ` Richard Biener
2018-07-10  8:39   ` Richard Earnshaw (lists)
2018-07-10 10:10     ` Richard Biener
2018-07-10 10:53       ` Richard Earnshaw (lists)
2018-07-10 11:22         ` Richard Biener
2018-07-10 13:43           ` Richard Earnshaw (lists) [this message]
2018-07-10 15:56         ` Jeff Law
2018-07-27  9:38 ` [PATCH 00/11] (v2) " Richard Earnshaw
2018-07-27  9:38   ` [PATCH 01/11] Add __builtin_speculation_safe_value Richard Earnshaw
2018-07-27 12:11     ` Nathan Sidwell
2018-07-27 12:32       ` Richard Earnshaw (lists)
2018-07-27 12:49         ` Nathan Sidwell
2018-07-27 12:53       ` Richard Earnshaw (lists)
2018-07-30 13:16     ` Richard Biener
2018-07-31 19:25       ` H.J. Lu
2018-07-31 20:51         ` Ian Lance Taylor via gcc-patches
2018-08-01  8:50           ` Richard Earnshaw (lists)
2018-08-01  8:54             ` Jakub Jelinek
2018-08-01  9:25               ` Richard Earnshaw (lists)
2018-07-27  9:38   ` [PATCH 07/11] AArch64 - use CSDB based sequences if speculation tracking is enabled Richard Earnshaw
2018-07-27  9:38   ` [PATCH 10/11] x86 - add speculation_barrier pattern Richard Earnshaw
2018-07-28  8:25     ` Uros Bizjak
2018-07-31 23:15       ` H.J. Lu
2018-07-27  9:38   ` [PATCH 11/11] rs6000 " Richard Earnshaw
2018-07-31 22:01     ` Bill Schmidt
2018-07-31 23:31       ` Segher Boessenkool
2018-07-27  9:38   ` [PATCH 02/11] Arm " Richard Earnshaw
2018-08-06 14:01     ` Christophe Lyon
2018-08-06 15:59       ` Richard Earnshaw (lists)
2018-07-27  9:38   ` [PATCH 03/11] AArch64 - add speculation barrier Richard Earnshaw
2018-07-27  9:38   ` [PATCH 05/11] AArch64 - disable CB[N]Z TB[N]Z when tracking speculation Richard Earnshaw
2018-07-27  9:38   ` [PATCH 04/11] AArch64 - Add new option -mtrack-speculation Richard Earnshaw
2018-07-27  9:38   ` [PATCH 08/11] targhooks - provide an alternative hook for targets that never execute speculatively Richard Earnshaw
2018-07-30 13:17     ` Richard Biener
2018-07-27  9:38   ` [PATCH 09/11] pdp11 - example of a port not needing a speculation barrier Richard Earnshaw
2018-07-27 13:27     ` Paul Koning
2018-07-27 15:19       ` Richard Biener
2018-07-27  9:38   ` [PATCH 06/11] AArch64 - new pass to add conditional-branch speculation tracking Richard Earnshaw
2018-07-27 19:49   ` [PATCH 00/11] (v2) Mitigation against unsafe data speculation (CVE-2017-5753) John David Anglin
2018-08-02 18:40     ` Jeff Law
2018-08-02 20:19       ` John David Anglin
2018-08-03  9:06         ` Richard Earnshaw (lists)
2018-08-06 21:52           ` John David Anglin
2018-08-07 14:05             ` Richard Earnshaw (lists)
2018-08-07 14:56               ` John David Anglin
2018-08-03 17:26         ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f3bcd108-a10d-776d-d521-c06594d7155e@arm.com \
    --to=richard.earnshaw@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).