From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-481383-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 86270 invoked by alias); 11 Jul 2018 20:47:06 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 86260 invoked by uid 89); 11 Jul 2018 20:47:06 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=cuts, H*f:sk:f814fca, munge, resultant
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 11 Jul 2018 20:47:04 +0000
Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16])	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))	(No client certificate requested)	by mx1.redhat.com (Postfix) with ESMTPS id C9D915F744;	Wed, 11 Jul 2018 20:47:02 +0000 (UTC)
Received: from localhost.localdomain (ovpn-112-9.rdu2.redhat.com [10.10.112.9])	by smtp.corp.redhat.com (Postfix) with ESMTP id 8125361B72;	Wed, 11 Jul 2018 20:46:46 +0000 (UTC)
Subject: Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)
To: "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>, gcc-patches@gcc.gnu.org
References: <1531154299-28349-1-git-send-email-Richard.Earnshaw@arm.com> <5fd0b05c-722c-fb07-4423-6a3f81d17fc6@redhat.com> <f814fcaf-1578-35ce-7ce2-a87989039681@arm.com> <ec6576d4-096a-fbc9-9662-02eccebc2756@redhat.com> <27b267b6-1406-7467-6c15-d4a4390523de@arm.com>
From: Jeff Law <law@redhat.com>
Openpgp: preference=signencrypt
Message-ID: <34f08c5b-72b1-4009-e6f0-de019984d6a2@redhat.com>
Date: Wed, 11 Jul 2018 20:47:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0
MIME-Version: 1.0
In-Reply-To: <27b267b6-1406-7467-6c15-d4a4390523de@arm.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-IsSubscribed: yes
X-SW-Source: 2018-07/txt/msg00580.txt.bz2

On 07/10/2018 10:43 AM, Richard Earnshaw (lists) wrote:
> On 10/07/18 16:42, Jeff Law wrote:
>> On 07/10/2018 02:49 AM, Richard Earnshaw (lists) wrote:
>>> On 10/07/18 00:13, Jeff Law wrote:
>>>> On 07/09/2018 10:38 AM, Richard Earnshaw wrote:
>>>>>
>>>>> To address all of the above, these patches adopt a new approach, based
>>>>> in part on a posting by Chandler Carruth to the LLVM developers list
>>>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>>>>> but which we have extended to deal with inter-function speculation.
>>>>> The patches divide the problem into two halves.
>>>> We're essentially turning the control dependency into a value that we
>>>> can then use to munge the pointer or the resultant data.
>>>>
>>>>>
>>>>> The first half is some target-specific code to track the speculation
>>>>> condition through the generated code to provide an internal variable
>>>>> which can tell us whether or not the CPU's control flow speculation
>>>>> matches the data flow calculations.  The idea is that the internal
>>>>> variable starts with the value TRUE and if the CPU's control flow
>>>>> speculation ever causes a jump to the wrong block of code the variable
>>>>> becomes false until such time as the incorrect control flow
>>>>> speculation gets unwound.
>>>> Right.
>>>>
>>>> So one of the things that comes immediately to mind is you have to run
>>>> this early enough that you can still get to all the control flow and
>>>> build your predicates.  Otherwise you have to undo stuff like
>>>> conditional move generation.
>>>
>>> No, the opposite, in fact.  We want to run this very late, at least on
>>> Arm systems (AArch64 or AArch32).  Conditional move instructions are
>>> fine - they're data-flow operations, not control flow (in fact, that's
>>> exactly what the control flow tracker instructions are).  By running it
>>> late we avoid disrupting any of the earlier optimization passes as well.
>> Ack.  I looked at the aarch64 implementation after sending my message
>> and it clearly runs very late.
>>
>> I haven't convinced myself that all the work the generic parts of the
>> compiler do to rewrite and eliminate conditionals is safe.  But even if it
>> isn't, you're probably getting enough coverage to drastically reduce the
>> attack surface.  I'm going to have to think about the early
>> transformations we make and how they interact here harder.  But I think
>> the general approach can dramatically reduce the attack surface.
> 
> My argument here would be that we are concerned about speculation that
> the CPU does with the generated program.  We're not particularly
> bothered about the abstract machine description it's based upon.  As
> long as the earlier transforms lead to a valid translation (it hasn't
> removed a necessary bounds check) then running late is fine.
I'm thinking about transformations that obfuscate the bounds check or the
pointer, or that turn branchy code into straight-line code (if-conversion
and the like), possibly introducing some speculation in the process.

For example, hoist_adjacent_loads results in speculative loads and
likely a conditional move to select between the two loaded values.
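
As a rough illustration of that kind of hoisting (a sketch only, not
GCC's actual output; struct pair and the function names are made up
for the example), the branchy form loads one field, while the hoisted
form loads both and then selects:

```c
#include <assert.h>

/* Hypothetical pair of adjacent fields; the names are illustrative only. */
struct pair {
    int a;
    int b;
};

/* Branchy source form: only one of the two fields is loaded. */
static int select_branchy(const struct pair *s, int cond)
{
    int x;
    if (cond)
        x = s->a;
    else
        x = s->b;
    return x;
}

/* Roughly the shape after both loads are hoisted above the condition:
   each field is loaded unconditionally and the result is picked by a
   select, which may end up as a conditional move. */
static int select_hoisted(const struct pair *s, int cond)
{
    int ta = s->a;   /* executed even when cond is zero */
    int tb = s->b;   /* executed even when cond is nonzero */
    return cond ? ta : tb;
}

/* Both forms compute the same architectural result. */
static void check_select_equivalence(void)
{
    struct pair s = { 10, 20 };
    assert(select_branchy(&s, 1) == select_hoisted(&s, 1));
    assert(select_branchy(&s, 0) == select_hoisted(&s, 0));
}
```

The results are identical; what changes is that one of the two loads
is now performed on a path where the source never asked for it.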

Or what if we've done something like

if (x < maxval)
   res = *p;

And we've turned that into


t = *p;
res = (x < maxval) ? t : res;


That may be implemented as a conditional move at the RTL level, so
protecting that may be nontrivial.

In those examples the compiler itself has introduced the speculation.
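
To make the x < maxval case concrete, here is a minimal C sketch of
the two shapes side by side.  The function names and the unsigned
types are assumptions made for the example, and the second form is
only a valid transformation when the compiler knows the load of *p
cannot trap:

```c
#include <assert.h>

/* Source form: the load is guarded by the bounds check, so there is a
   conditional branch that a late tracking pass could instrument. */
static int load_branchy(const int *p, unsigned x, unsigned maxval, int res)
{
    if (x < maxval)
        res = *p;
    return res;
}

/* If-converted form: the load is unconditional and the comparison only
   feeds a select.  No branch is left around the load. */
static int load_if_converted(const int *p, unsigned x, unsigned maxval,
                             int res)
{
    int t = *p;
    res = (x < maxval) ? t : res;
    return res;
}

/* The two forms agree for in-range and out-of-range x. */
static void check_load_equivalence(void)
{
    int v = 7;
    assert(load_branchy(&v, 1, 4, -1) == load_if_converted(&v, 1, 4, -1));
    assert(load_branchy(&v, 9, 4, -1) == load_if_converted(&v, 9, 4, -1));
}
```

In the second function the load happens whether or not the check
passes, which is the compiler-introduced speculation I mean.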

I can't find the conditional obfuscation I was looking for, so it's hard
to rule it in or out as potentially problematic.

WRT pointer obfuscation, we no longer propagate conditional equivalences
very aggressively, so it may be a non-issue in the end.

But again, even with these concerns I think what you're doing cuts down
the attack surface in meaningful ways.


> 
> I can't currently conceive a situation where the compiler would be able
> to remove a /necessary/ bounds check that could lead to unsafe
> speculation later on.  A redundant bounds check removal shouldn't be a
> problem as the non-redundant check should remain and that will still get
> tracking code added.
It's less about removal and more about either compiler-generated
speculation or obfuscation of the patterns you're looking for.


jeff