From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-481391-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 96797 invoked by alias); 11 Jul 2018 22:31:11 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 80649 invoked by uid 89); 11 Jul 2018 22:30:58 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_PASS autolearn=ham version=3.3.2 spammy=
X-HELO: foss.arm.com
Received: from usa-sjc-mx-foss1.foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 11 Jul 2018 22:30:55 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id DD0D07A9;	Wed, 11 Jul 2018 15:30:46 -0700 (PDT)
Received: from [192.168.1.19] (usa-sjc-mx-foss1.foss.arm.com [217.140.101.70])	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 3FCE13F318;	Wed, 11 Jul 2018 15:30:46 -0700 (PDT)
Subject: Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)
To: Jeff Law <law@redhat.com>, gcc-patches@gcc.gnu.org
References: <1531154299-28349-1-git-send-email-Richard.Earnshaw@arm.com> <5fd0b05c-722c-fb07-4423-6a3f81d17fc6@redhat.com> <f814fcaf-1578-35ce-7ce2-a87989039681@arm.com> <ec6576d4-096a-fbc9-9662-02eccebc2756@redhat.com> <27b267b6-1406-7467-6c15-d4a4390523de@arm.com> <34f08c5b-72b1-4009-e6f0-de019984d6a2@redhat.com>
From: "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>
Message-ID: <044a14f4-4e4e-3c7f-5675-7914695add09@arm.com>
Date: Wed, 11 Jul 2018 22:31:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0
MIME-Version: 1.0
In-Reply-To: <34f08c5b-72b1-4009-e6f0-de019984d6a2@redhat.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-SW-Source: 2018-07/txt/msg00588.txt.bz2

On 11/07/18 21:46, Jeff Law wrote:
> On 07/10/2018 10:43 AM, Richard Earnshaw (lists) wrote:
>> On 10/07/18 16:42, Jeff Law wrote:
>>> On 07/10/2018 02:49 AM, Richard Earnshaw (lists) wrote:
>>>> On 10/07/18 00:13, Jeff Law wrote:
>>>>> On 07/09/2018 10:38 AM, Richard Earnshaw wrote:
>>>>>>
>>>>>> To address all of the above, these patches adopt a new approach, based
>>>>>> in part on a posting by Chandler Carruth to the LLVM developers list
>>>>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>>>>>> but which we have extended to deal with inter-function speculation.
>>>>>> The patches divide the problem into two halves.
>>>>> We're essentially turning the control dependency into a value that we
>>>>> can then use to munge the pointer or the resultant data.
>>>>>
>>>>>>
>>>>>> The first half is some target-specific code to track the speculation
>>>>>> condition through the generated code to provide an internal variable
>>>>>> which can tell us whether or not the CPU's control flow speculation
>>>>>> matches the data flow calculations.  The idea is that the internal
>>>>>> variable starts with the value TRUE and if the CPU's control flow
>>>>>> speculation ever causes a jump to the wrong block of code the variable
>>>>>> becomes false until such time as the incorrect control flow
>>>>>> speculation gets unwound.
>>>>> Right.
>>>>>
>>>>> So one of the things that comes immediately to mind is you have to run
>>>>> this early enough that you can still get to all the control flow and
>>>>> build your predicates.  Otherwise you have to undo stuff like
>>>>> conditional move generation.
>>>>
>>>> No, the opposite, in fact.  We want to run this very late, at least on
>>>> Arm systems (AArch64 or AArch32).  Conditional move instructions are
>>>> fine - they're data-flow operations, not control flow (in fact, that's
>>>> exactly what the control flow tracker instructions are).  By running it
>>>> late we avoid disrupting any of the earlier optimization passes as well.
>>> Ack.  I looked at the aarch64 implementation after sending my message
>>> and it clearly runs very late.
>>>
>>> I haven't convinced myself that all the work generic parts of the
>>> compiler do to rewrite and eliminate conditionals is safe.  But even if it
>>> isn't, you're probably getting enough coverage to drastically reduce the
>>> attack surface.  I'm going to have to think about the early
>>> transformations we make and how they interact here harder.  But I think
>>> the general approach can dramatically reduce the attack surface.
>>
>> My argument here would be that we are concerned about speculation that
>> the CPU does with the generated program.  We're not particularly
>> bothered about the abstract machine description it's based upon.  As
>> long as the earlier transforms lead to a valid translation (it hasn't
>> removed a necessary bounds check) then running late is fine.
> I'm thinking about obfuscation of the bounds check or the pointer or
> turning branchy into straightline code, possibly doing some speculation
> in the process, if-conversion and the like.
> 
> For example hoist_adjacent_loads which results in speculative loads and
> likely a conditional move to select between the two loaded values.
> 
> Or what if we've done something like
> 
> if (x < maxval)
>    res = *p;
> 
> And we've turned that into
> 
> 
> t = *p;
> res = (x < maxval) ? t : res;

Hmm, interesting.  But for that to be safe, the compiler would have to
be able to prove that dereferencing p was safe even if x >= maxval,
otherwise the run-time code could fault (so if there's any chance that
it could point to something vulnerable, then there must also be a chance
that it points to unmapped memory).  Given that requirement, I don't
think this case can be a specific concern, since the requirement implies
that p must already be within some known bounds for the type of object
it points to.
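
To make the argument concrete, here is a minimal sketch of the two forms
under discussion.  The function names are made up for this illustration
and are not part of the patch series; the point is only that the
if-converted form reads *p unconditionally, so a compiler may only
produce it when it can already prove p is dereferenceable for every
value of x.

```c
#include <assert.h>

/* Hypothetical helper names; illustration only, not from the patches. */

/* Original form: *p is read only when the bounds check passes. */
int load_branchy(const int *p, unsigned x, unsigned maxval, int res)
{
    if (x < maxval)
        res = *p;
    return res;
}

/* If-converted form: *p is read unconditionally, then selected.
   Only a valid transformation if p is known to be dereferenceable
   whatever the value of x. */
int load_if_converted(const int *p, unsigned x, unsigned maxval, int res)
{
    int t = *p;
    return (x < maxval) ? t : res;
}

/* With p pointing at a valid object, both forms agree on either side
   of the bounds check. */
void check_forms_agree(void)
{
    int obj = 42;
    assert(load_branchy(&obj, 1, 4, 7) == load_if_converted(&obj, 1, 4, 7));
    assert(load_branchy(&obj, 9, 4, 7) == load_if_converted(&obj, 9, 4, 7));
}
```

Calling load_branchy with a null p and x >= maxval is harmless, because
the load is never reached; the if-converted form would fault on the same
arguments, which is why the transformation needs the proof described
above.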

R.

> 
> 
> That may be implemented as a conditional move at the RTL level, so
> protecting that may be nontrivial.
> 
> In those examples the compiler itself has introduced the speculation.
> 
> I can't find the conditional obfuscation I was looking for, so it's hard
> to rule it in or out as potentially problematical.
> 
> WRT pointer obfuscation, we no longer propagate conditional equivalences
> very aggressively, so it may be a non-issue in the end.
> 
> But again, even with these concerns I think what you're doing cuts down
> the attack surface in meaningful ways.
> 
> 
> 
>>
>> I can't currently conceive a situation where the compiler would be able
>> to remove a /necessary/ bounds check that could lead to unsafe
>> speculation later on.  A redundant bounds check removal shouldn't be a
>> problem as the non-redundant check should remain and that will still get
>> tracking code added.
> It's less about removal and more about either compiler-generated
> speculation or obfuscation of the patterns you're looking for.
> 
> 
> jeff
> 
> 
> 
>