From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <libc-alpha-return-75908-listarch-libc-alpha=sources.redhat.com@sourceware.org>
Received: (qmail 69872 invoked by alias); 15 Dec 2016 05:25:49 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Received: (qmail 69847 invoked by uid 89); 15 Dec 2016 05:25:47 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-5.0 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=awhile, __SIZE_TYPE__, __size_type__, suspect
X-HELO: mx1.redhat.com
Subject: Re: [PATCH] explicit_bzero final
To: Zack Weinberg <zackw@panix.com>
References: <20161212230622.14045-1-zackw@panix.com>
 <96232743-345c-5126-e526-9a8aadc4fdb7@redhat.com>
 <CAKCAbMgRrFKsisTu1wMYtLnc44Zwt5-GPFpt8GPWu2oXv6JN=Q@mail.gmail.com>
 <2e51eedf-874e-cb81-79a6-0a0c667371db@redhat.com>
 <CAKCAbMgxfviybrcRtSO-=eEc0YQc-TfwP+fvtKqQsodBTr_isg@mail.gmail.com>
 <ad37af4c-f3a5-00ec-b7aa-cee930a0f08b@redhat.com>
 <CAKCAbMgeK23rcGA-uQUjC8ctXTcYcSpYquyO8kwku7Kn0Q=NQQ@mail.gmail.com>
Cc: Florian Weimer <fweimer@redhat.com>,
        GNU C Library <libc-alpha@sourceware.org>,
        Joseph Myers <joseph@codesourcery.com>,
        Adhemerval Zanella <adhemerval.zanella@linaro.org>,
        Wilco Dijkstra <Wilco.Dijkstra@arm.com>
From: Jeff Law <law@redhat.com>
Message-ID: <af2c6326-a763-0dd0-a8d5-f818a9678987@redhat.com>
Date: Thu, 15 Dec 2016 05:25:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.5.1
MIME-Version: 1.0
In-Reply-To: <CAKCAbMgeK23rcGA-uQUjC8ctXTcYcSpYquyO8kwku7Kn0Q=NQQ@mail.gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2016-12/txt/msg00536.txt.bz2

On 12/14/2016 04:11 PM, Zack Weinberg wrote:
> On Wed, Dec 14, 2016 at 11:58 AM, Jeff Law <law@redhat.com> wrote:
>>
>> The concept of tainting variables and tracking that property seems distinct
>> from making sure that the compiler properly handles explicit_bzero.  And
>> just to be clear, I see that property as having value.
>
> The thing is, if you do the taint tracking, that's _all_ you need to
> do, I think.
Perhaps.  This is the first time I've really thought about it.

>
> Let's recap the motivating example for the inline, since it's been
> awhile and Jeff may not have seen it:
Thanks.  I certainly haven't followed the discussion prior to yesterday.

>
> ```
> typedef __SIZE_TYPE__ size_t;
>
> extern void *memset (void *, int, size_t);
> extern void explicit_bzero (void *, size_t);
> extern void __explicit_bzero_fence (const void *, size_t);
>
> #ifdef INLINE_EXPLICIT_BZERO
> extern inline __attribute__ ((gnu_inline)) void
> explicit_bzero(void *p, size_t n)
> {
>   memset (p, 0, n);
>   __explicit_bzero_fence (p, n);
> }
> #endif
>
> // 128 bits of key material
> struct k { unsigned long long lo, hi; };
>
> extern struct k get_key ();
> extern void do_encrypt (struct k, char *, const char *, size_t);
>
> void
> encrypt (char *out, const char *in, size_t n)
> {
>   struct k key = get_key ();
>   do_encrypt (key, out, in, n);
> #ifndef NO_CLEAR
>   explicit_bzero (&key, sizeof (struct k));
> #endif
> }
> ```
>
> If you compile this on x86-64 with gcc 6.2 at -DNO_CLEAR -O2 -S, you
> will see that the `key` variable lives entirely in registers.  If you
> compile it with neither `-DNO_CLEAR` nor `-DINLINE_EXPLICIT_BZERO`,
> `key` is spilled to the stack immediately after the call to `get_key`
> (it's more obvious that this is what's happening if you turn off RTL
> scheduling). This is what we want to avoid.  With
> `-DINLINE_EXPLICIT_BZERO` ... the spill still happens,
> disappointingly, but it _could have_ been eliminated, because those
> stack slots are not read again before they are overwritten with zeroes
> by the memset.  (LLVM 3.8 does eliminate the spill.)
Right.  Just to correct a bit on terminology, it's not spilled to the 
stack, but the object has to be addressable due to the call to 
explicit_bzero.  That forces the object into the stack.  It might still 
wind up in the stack due to other reasons (it's an aggregate and thus 
not subject to live in registers).

Once you inline the call to explicit_bzero, then the compiler has the 
opportunity to remove the addressable property, but it probably can't 
due to the use in the memset and bzero_fence calls.

As it happens I'm looking at a slew of DSE issues right now and can give 
some insights.  Note the following assumes my improvements and does not 
necessarily represent what you can do with the trunk right now.

DSE fails to track the assignment key = get_key().  It just doesn't even 
try.  If that's hacked around (easy) and we start tracking the memory 
for KEY.  Then we have the use in the call to do_encrypt.  That's a 
read, plain and simple.  But let's assume we can finesse past that.  We 
then see the call to memset and that kills the memory for KEY.  So we 
can actually track the memory with a little work.  There's no stores 
that can actually be removed, but we can track the memory.

Note this is all at the higher gimple level.  Tracking it in the RTL DSE 
pass would be tougher, but in theory possible and perhaps more useful 
since the argument passing artifacts are exposed.

But the fact remains that as long as KEY is addressable you're going to 
have this class of problem and with the explicit_bzero proposal anything 
passed to it will be considered addressable and is likely going to drop 
into memory, which is precisely what you do not want.

Rather than explicit_bzero what you may want is something more like a 
clobbering assignment.


> Anyway, right before lowering to RTL, the IR with -DINLINE_EXPLICIT_BZERO is
>
> encrypt (char * out, const char * in, size_t n)
> {
>   struct k key;
>
>   <bb 2>:
>   key = get_key ();
>   do_encrypt (key, out_3(D), in_4(D), n_5(D));
>   memset (&key, 0, 16);
>   __explicit_bzero_fence (&key, 16);
>   key ={v} {CLOBBER};
>   return;
> }
Right, that's about what I'd expect.

>
> With a compiler that implemented the taint analysis I am imagining, we
> would have instead
>
> encrypt (char * out, const char * in, size_t n)
> {
>   struct k key __attribute__ ((tainted));
>
>   <bb 2>:
>   key = get_key ();
>   do_encrypt (key, out_3(D), in_4(D), n_5(D));
>   memset (&key, 0, 16);
>   key = {0, 0};
>   USE (key);
>   key ={v} {CLOBBER};
>   return;
> }
So if you get to this point, in particular the key = {0, 0} assignment, 
then what's the point of the explicit_bzero/memset call?  It's dead. 
What you're describing with key = {0, 0} is an empty CONSTRUCTOR 
assignment.  And I *think* those won't force your object to be addressable.

FWIW, if you had this, there's a good chance the compiler would pick up 
the fact that the memset call is dead, but I really don't see the point 
in having the memset call at that point.

>
> (Alas, looking at the assembly here has made me realize that this is
> maybe harder to implement than I thought; it probably can't be done
> just by slapping in some writes and USEs at tree level; it needs to
> work out which of the _hard registers_ and _stack slots_ are tainted,
> and generate the appropriate clears, _after_ register allocation.  And
> it needs to assume that function calls _don't_ clobber call-clobbered
> registers.  Oh my aching head.)
Getting it 100% is tough for precisely those reasons.  But if we just 
have a clobbering assignment rather than messing with explicit_bzero, 
then what you get out of the gimple side will be as close to correct at 
that point as you can get and the object is much more likely to end up 
in registers than memory.

So why not drop explicit_bzero and instead we define something to get 
you an empty CONSTRUCTOR assignment.  I think those have the semantics 
you want without forcing your object to be addressable.


Then you just have to find a way to deal with the argument passing and 
register allocation artifacts, and that's going to be hard I suspect.

jeff