From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 69872 invoked by alias); 15 Dec 2016 05:25:49 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Received: (qmail 69847 invoked by uid 89); 15 Dec 2016 05:25:47 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-5.0 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=awhile, __SIZE_TYPE__, __size_type__, suspect X-HELO: mx1.redhat.com Subject: Re: [PATCH] explicit_bzero final To: Zack Weinberg References: <20161212230622.14045-1-zackw@panix.com> <96232743-345c-5126-e526-9a8aadc4fdb7@redhat.com> <2e51eedf-874e-cb81-79a6-0a0c667371db@redhat.com> Cc: Florian Weimer , GNU C Library , Joseph Myers , Adhemerval Zanella , Wilco Dijkstra From: Jeff Law Message-ID: Date: Thu, 15 Dec 2016 05:25:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-SW-Source: 2016-12/txt/msg00536.txt.bz2 On 12/14/2016 04:11 PM, Zack Weinberg wrote: > On Wed, Dec 14, 2016 at 11:58 AM, Jeff Law wrote: >> >> The concept of tainting variables and tracking that property seems distinct >> from making sure that the compiler properly handles explicit_bzero. And >> just to be clear, I see that property as having value. > > The thing is, if you do the taint tracking, that's _all_ you need to > do, I think. Perhaps. This is the first time I've really thought about it. > > Let's recap the motivating example for the inline, since it's been > awhile and Jeff may not have seen it: Thanks. I certainly haven't followed the discussion prior to yesterday. > > ``` > typedef __SIZE_TYPE__ size_t; > > extern void *memset (void *, int, size_t); > extern void explicit_bzero (void *, size_t); > extern void __explicit_bzero_fence (const void *, size_t); > > #ifdef INLINE_EXPLICIT_BZERO > extern inline __attribute__ ((gnu_inline)) void > explicit_bzero(void *p, size_t n) > { > memset (p, 0, n); > __explicit_bzero_fence (p, n); > } > #endif > > // 128 bits of key material > struct k { unsigned long long lo, hi; }; > > extern struct k get_key (); > extern void do_encrypt (struct k, char *, const char *, size_t); > > void > encrypt (char *out, const char *in, size_t n) > { > struct k key = get_key (); > do_encrypt (key, out, in, n); > #ifndef NO_CLEAR > explicit_bzero (&key, sizeof (struct k)); > #endif > } > ``` > > If you compile this on x86-64 with gcc 6.2 at -DNO_CLEAR -O2 -S, you > will see that the `key` variable lives entirely in registers. If you > compile it with neither `-DNO_CLEAR` nor `-DINLINE_EXPLICIT_BZERO`, > `key` is spilled to the stack immediately after the call to `get_key` > (it's more obvious that this is what's happening if you turn off RTL > scheduling). This is what we want to avoid. With > `-DINLINE_EXPLICIT_BZERO` ... the spill still happens, > disappointingly, but it _could have_ been eliminated, because those > stack slots are not read again before they are overwritten with zeroes > by the memset. (LLVM 3.8 does eliminate the spill.) Right. Just to correct a bit on terminology, it's not spilled to the stack, but the object has to be addressable due to the call to explicit_bzero. That forces the object into the stack. It might still wind up in the stack due to other reasons (it's an aggregate and thus not subject to live in registers). Once you inline the call to explicit_bzero, then the compiler has the opportunity to remove the addressable property, but it probably can't due to the use in the memset and bzero_fence calls. As it happens I'm looking at a slew of DSE issues right now and can give some insights. Note the following assumes my improvements and does not necessarily represent what you can do with the trunk right now. DSE fails to track the assignment key = get_key(). It just doesn't even try. If that's hacked around (easy) and we start tracking the memory for KEY. Then we have the use in the call to do_encrypt. That's a read, plain and simple. But let's assume we can finesse past that. We then see the call to memset and that kills the memory for KEY. So we can actually track the memory with a little work. There's no stores that can actually be removed, but we can track the memory. Note this is all at the higher gimple level. Tracking it in the RTL DSE pass would be tougher, but in theory possible and perhaps more useful since the argument passing artifacts are exposed. But the fact remains that as long as KEY is addressable you're going to have this class of problem and with the explicit_bzero proposal anything passed to it will be considered addressable and is likely going to drop into memory, which is precisely what you do not want. Rather than explicit_bzero what you may want is something more like a clobbering assignment. > Anyway, right before lowering to RTL, the IR with -DINLINE_EXPLICIT_BZERO is > > encrypt (char * out, const char * in, size_t n) > { > struct k key; > > : > key = get_key (); > do_encrypt (key, out_3(D), in_4(D), n_5(D)); > memset (&key, 0, 16); > __explicit_bzero_fence (&key, 16); > key ={v} {CLOBBER}; > return; > } Right, that's about what I'd expect. > > With a compiler that implemented the taint analysis I am imagining, we > would have instead > > encrypt (char * out, const char * in, size_t n) > { > struct k key __attribute__ ((tainted)); > > : > key = get_key (); > do_encrypt (key, out_3(D), in_4(D), n_5(D)); > memset (&key, 0, 16); > key = {0, 0}; > USE (key); > key ={v} {CLOBBER}; > return; > } So if you get to this point, in particular the key = {0, 0} assignment, then what's the point of the explicit_bzero/memset call? It's dead. What you're describing with key = {0, 0} is an empty CONSTRUCTOR assignment. And I *think* those won't force your object to be addressable. FWIW, if you had this, there's a good chance the compiler would pick up the fact that the memset call is dead, but I really don't see the point in having the memset call at that point. > > (Alas, looking at the assembly here has made me realize that this is > maybe harder to implement than I thought; it probably can't be done > just by slapping in some writes and USEs at tree level; it needs to > work out which of the _hard registers_ and _stack slots_ are tainted, > and generate the appropriate clears, _after_ register allocation. And > it needs to assume that function calls _don't_ clobber call-clobbered > registers. Oh my aching head.) Getting it 100% is tough for precisely those reasons. But if we just have a clobbering assignment rather than messing with explicit_bzero, then what you get out of the gimple side will be as close to correct at that point as you can get and the object is much more likely to end up in registers than memory. So why not drop explicit_bzero and instead we define something to get you an empty CONSTRUCTOR assignment. I think those have the semantics you want without forcing your object to be addressable. Then you just have to find a way to deal with the argument passing and register allocation artifacts, and that's going to be hard I suspect. jeff