From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 69977 invoked by alias); 12 Aug 2016 12:42:07 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 69844 invoked by uid 89); 12 Aug 2016 12:42:06 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_PASS autolearn=ham version=3.3.2 spammy=Automatic, hash_set, 30PM, inkscape X-HELO: mx2.suse.de Received: from mx2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 12 Aug 2016 12:42:04 +0000 Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 2B968AD9D; Fri, 12 Aug 2016 12:42:02 +0000 (UTC) Subject: Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope To: Jakub Jelinek References: <572C7A3E.4000905@suse.cz> <20160506122225.GH26501@tucnak.zalov.cz> <57332B69.4040001@suse.cz> <20160512104156.GY28550@tucnak.redhat.com> <57348F45.5020700@suse.cz> Cc: GCC Patches From: =?UTF-8?Q?Martin_Li=c5=a1ka?= Message-ID: <14d5949e-7aa9-d811-f009-a74362254251@suse.cz> Date: Fri, 12 Aug 2016 12:42:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2 MIME-Version: 1.0 In-Reply-To: <57348F45.5020700@suse.cz> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 8bit X-IsSubscribed: yes X-SW-Source: 2016-08/txt/msg01018.txt.bz2 PING^1 > On 05/12/2016 12:41 PM, Jakub Jelinek wrote: >> On Wed, May 11, 2016 at 02:54:01PM +0200, Martin Liška wrote: >>> On 05/06/2016 02:22 PM, Jakub Jelinek wrote: >>>> On Fri, May 06, 2016 at 01:04:30PM +0200, Martin Liška wrote: >>>>> I've started working on the patch couple of month go, basically after >>>>> a brief discussion with Jakub on IRC. >>>>> >>>>> I'm sending the initial version which can successfully run instrumented >>>>> tramp3d, postgresql server and Inkscape. It catches the basic set of >>>>> examples which are added in following patch. >>>>> >>>>> The implementation is quite straightforward as works in following steps: >>>>> >>>>> 1) Every local variable stack slot is poisoned at the very beginning of a function (RTL emission) >>>>> 2) In gimplifier, once we spot a DECL_EXPR, a variable is unpoisoned (by emitting ASAN_MARK builtin) >>>>> and the variable is marked as addressable >>>> >>>> Not all vars have DECL_EXPRs though. >> >> Just random comments from quick skim, need to find enough spare time to >> actually try it and see how it works. >> >>> Yeah, I've spotted one interesting example which is part of LLVM's testsuite: >>> >>> struct IntHolder { >>> int val; >>> }; >>> >>> const IntHolder *saved; >>> >>> void save(const IntHolder &holder) { >>> saved = &holder; >>> } >>> >>> int main(int argc, char *argv[]) { >>> save({10}); >>> int x = saved->val; // BOOM >>> return x; >>> } >>> >>> It would be also good to handle such temporaries. Any suggestions how to handle that in gimplifier? >> >> Dunno, guess you need to do something in the FE for it already (talk to >> Jason?). At least in *.original dump there is already: >> <> save ((const struct IntHolder &) &TARGET_EXPR ) >>>>>; >> int x = (int) saved->val; >> return = x; >> and the info on where the D.2263 temporary goes out of scope is lost. > > Thanks for sample, I will ask Jason to help me with that. > >> >>> Apart from that, second version of the patch changes: >>> + fixed issues with missing stack unpoisoning; currently, I mark all VAR_DECLs that >>> are in ASAN_MARK internal fns and stack prologue/epilogue is emitted just for these vars >>> + removed unneeded hunks (tree-vect-patterns.c and asan_poisoning.cc) >>> + LABEL unpoisoning code makes stable sort for variables that were already used in the context >>> + stack poisoning hasn't worked for -O1+ due to following guard in asan.c >>> /* Automatic vars in the current function will be always accessible. */ >>> + direct shadow memory poisoning/unpoisoning code is introduced - in both scenarios (RTL and GIMPLE), >>> I would appreciate feedback if storing multiple bytes is fine? What is the maximum memory wide >>> store mode supported by a target? How can I get such information? >>> + the maximum object size handled by a direct emission is guarded by use-after-scope-direct-emission-threshold >>> parameter; initial value (256B) should maximally emit store of 32B >> >> Would be better if user visible param was in bytes rather than bits IMHO. >> > > Well, the size of an object is in bytes, but as we map every 8 (yeah, that's configurable, I'm quite curious about > real respecting of ASAN_SHADOW_SHIFT) bytes of real memory to > a single byte in shadow memory, thus the division by 8 is needed. > >>> Yeah, depends because of: >>> >>> static inline bool >>> asan_sanitize_use_after_scope (void) >>> { >>> return ((flag_sanitize & SANITIZE_ADDRESS_USE_AFTER_SCOPE) >>> == SANITIZE_ADDRESS_USE_AFTER_SCOPE >>> && flag_stack_reuse == SR_NONE >>> && ASAN_STACK); >>> } >>> >>> Where ASAN_STACK comes from params.h. >> >> I'd prefer just prototype the function in the header and define in asan.c >> or some other source file. Or maybe split it, do the important case >> (flag_sanitize check) inline and call out of line function for the rest. >> Why do you check flag_stack_reuse? I thought you'd arrange for it to be >> different when -fsanitize=use-after-scope? > > Right, the sanitization does not relate to flag_stack, thus removing the dependency, > we can remove need of including params.h in various places. > >> >>> @@ -243,6 +243,11 @@ static unsigned HOST_WIDE_INT asan_shadow_offset_value; >>> static bool asan_shadow_offset_computed; >>> static vec sanitized_sections; >>> >>> +/* Set of variable declarations that are going to be guarded by >>> + use-after-scope sanitizer. */ >>> + >>> +static hash_set asan_handled_variables(13); >> >> Not sure about the formatting here, don't we use xxx instead of xxx >> ? And I'd expect space before (. > > Yeah, done. > >>> @@ -1020,6 +1020,91 @@ asan_function_start (void) >>> current_function_funcdef_no); >>> } >>> >>> +/* Return number of shadow bytes that are occupied by a local variable >>> + of SIZE bytes. */ >>> + >>> +static unsigned HOST_WIDE_INT >>> +get_shadow_memory_size (unsigned HOST_WIDE_INT size) >>> +{ >>> + /* Round up size of object. */ >>> + unsigned HOST_WIDE_INT r; >>> + if ((r = size % BITS_PER_UNIT) != 0) >>> + size += BITS_PER_UNIT - r; >> >> Isn't there a ROUND_UP macro? > > Thanks. I was looking for the macro. > > I'm sending v3 where all of aforementioned notes were handled. > > Thanks, > Martin > >> >> Jakub >> >