From: "Evgenii Stepanov via gcc-patches"
Reply-To: Evgenii Stepanov
Date: Wed, 11 Sep 2019 18:34:00 -0000
Subject: Re: [Patch 0/X] [WIP][RFC][libsanitizer] Introduce HWASAN to GCC
To: Matthew Malcomson
Cc: Martin Liška, "gcc-patches@gcc.gnu.org", "dodji@redhat.com", nd, "kcc@google.com", "jakub@redhat.com", "dvyukov@google.com"

On Wed, Sep 11, 2019 at 9:37 AM Matthew Malcomson wrote:
>
> On 11/09/19 12:53, Martin Liška wrote:
> > On 9/9/19 5:54 PM, Matthew Malcomson wrote:
> >> On 09/09/19 11:47, Martin Liška wrote:
> >>> On 9/6/19 4:46 PM, Matthew Malcomson wrote:
> >>>> Hello,
> >>>>
> >> As I understand it, `hwasan-abi=interceptor` vs `platform` is about
> >> adding such MTE emulation for "application code" or "platform code (e.g.
> >> kernel)" respectively.
> >
> > Hm, are you sure? Clang also uses -fsanitize=kernel-hwaddress which should
> > be equivalent to kernel-address for -fsanitize=address.
> >
>
> I'm not at all sure it's to do with the kernel ;-}
>
> Here's the commit that adds the flag.
> https://reviews.llvm.org/D56038
>
> From the commit message it seems the point is to distinguish between
> running on runtimes that natively support HWASAN (named the "platform"
> abi) and those where functions like malloc and pthread_create have to be
> intercepted (named the "interceptor" abi).
>
> I had assumed that targeting the kernel would be in the "platform"
> group, but it could easily not be the case.
>
> Considering the message from the below commit it seems that this is more
> targeted at instrumenting things like libc https://reviews.llvm.org/D50922.

With hwasan we tried a different approach from asan: instead of
intercepting libc, we build it with sanitizer instrumentation and rely
on a few hooks to update the internal state of the tool on interesting
events, such as process startup, thread creation and destruction, and
stack unwind (longjmp, vfork).
This effectively puts hwasan _below_ libc (as in, libc depends on
libhwasan). It has worked amazingly well for Android, where we aim to
sanitize most of the platform code at once. For example, ASan requires
that the main executable be built with ASan before any of the libraries
can be; otherwise the tool will not be able to interpose the malloc/free
symbols. As a consequence, when there are binaries that cannot be
sanitized for any reason, we need to keep unsanitized copies of all
their transitive dependencies, and that turns into a huge
build/deployment mess. The hwasan approach avoids this problem by making
sure that the allocator is always there (because everything depends on
libc).

The downside, of course, is that this cannot be used to sanitize a
single binary without a specially built libc. Hence the "interceptor"
ABI, which was an attempt to support running hwasan-instrumented
applications on regular, non-hwasan devices. We are not developing this
mode any longer, but it is used to run compiler-rt tests on
aarch64-android.

> I'm currently working on writing down the questions I plan to ask the
> developers of HWASAN in LLVM, I'll put this on the list :-)
>
> >>
> >>>
> >> There's an even more fundamental problem of accesses within the
> >> instrumented binary -- I haven't yet figured out how to remove the tag
> >> before accesses on architectures without the AArch64 TBI feature.
> >
> > Which should be platforms like x86_64, right?
>
> Yes.
> As yet I haven't gotten anything working for architectures without TBI
> (everything except AArch64).
> This particular problem was one I was hoping for suggestions around (my
> first of the questions in my cover letter).

We have support for hwasan on x86_64 in LLVM (by removing tags before
accesses), but it is not really practical because any library built
without instrumentation is a big source of false positives. Even, say,
libc++/libstdc++. We use it exclusively for tests.
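
To make "removing tags before accesses" concrete, here is a minimal C
sketch, not LLVM's actual code. It assumes 64-bit pointers with an 8-bit
tag in the top byte (bits 56-63, the AArch64 TBI layout) and untagged
user-space addresses whose top byte is zero. The names `tag_pointer`,
`pointer_tag`, `untag_pointer`, and `load_int` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64-bit pointers, tag stored in the top byte (bits 56-63). */
_Static_assert(sizeof(uintptr_t) == 8, "sketch assumes 64-bit pointers");

#define TAG_SHIFT 56
#define TAG_MASK  ((uintptr_t)0xff << TAG_SHIFT)

/* Place a tag in the top byte of an untagged address. */
static inline uintptr_t tag_pointer(uintptr_t addr, uint8_t tag)
{
    /* The untagged address must not already use the tag byte. */
    assert((addr & TAG_MASK) == 0);
    return addr | ((uintptr_t)tag << TAG_SHIFT);
}

/* Read the tag back out of a tagged pointer. */
static inline uint8_t pointer_tag(uintptr_t tagged)
{
    return (uint8_t)(tagged >> TAG_SHIFT);
}

/* Clear the tag byte, giving back the address the hardware understands. */
static inline uintptr_t untag_pointer(uintptr_t tagged)
{
    return tagged & ~TAG_MASK;
}

/* Without TBI the access itself must use the untagged address, so an
   instrumented load masks the tag off before dereferencing. */
static inline int load_int(uintptr_t tagged)
{
    return *(const int *)untag_pointer(tagged);
}
```

In this sketch every instrumented load and store pays for the extra mask,
and a tagged pointer that escapes into code built without instrumentation
still carries its tag, because nothing there strips it before the access.
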
> >>>>
> >>>> The current patch series is far from complete, but I'm posting the current state
> >>>> to provide something to discuss at the Cauldron next week.
> >>>>
> >>>> In its current state, this sanitizer only works on AArch64 with a custom kernel
> >>>> to allow tagged pointers in system calls. This is discussed in the below link
> >>>> https://source.android.com/devices/tech/debug/hwasan -- the custom kernel allows
> >>>> tagged pointers in syscalls.
> >>>
> >>> Can you please be more specific? Is the MTE in upstream linux kernel? If so,
> >>> starting from which version?
> >>
> >> I find I can only make complicated statements remotely clear in bullet
> >> points ;-)
> >>
> >> What I was trying to say was:
> >> - HWASAN from this patch series requires AArch64 TBI.
> >>   (I have not handled architectures without TBI)
> >> - The upstream kernel does not accept tagged pointers in syscalls.
> >>   (programs that use TBI must currently clear tags before passing
> >>   pointers to the kernel)
> >
> > I know that in case of ASAN, the libasan provides wrappers (interceptors) for various glibc
> > functions that are often system calls. Similar wrappers are probably used in HWASAN
> > and so that one can create the memory pointer tags.
> >
> >> - This patch series doesn't include any way to avoid passing tagged
> >>   pointers to syscalls.
> >
> > I bet LLVM has the same problem so I would expect a handling in the interceptors.
> >
>
> I'm pretty sure this problem hasn't been solved with interceptors.
>
> The android page describing hwasan specifically mentions the requirement
> of a Linux kernel accepting tagged pointers, and I believe this is the
> most supported environment.
>
> https://source.android.com/devices/tech/debug/hwasan
> "HWASan requires the Linux kernel to accept tagged pointers in system
> call arguments."
>
> Also, there are surprisingly few interceptors defined in libhwasan.
>
> Thanks,
> Matthew
>
> >> - Hence in order to test the sanitizer I'm using a kernel that has been
> >>   patched to accept tagged pointers in many syscalls.
> >> - The link to the android.com site is just another source describing the
> >>   same requirement.
> >>
> >>
> >> The support for the relaxed ABI (of accepting tagged pointers in various
> >> syscalls in the kernel) is being discussed on the kernel mailing list,
> >> the latest patchset I know of is here:
> >> https://lkml.org/lkml/2019/7/25/725

The main patchset is this one:
https://patchwork.kernel.org/cover/11055001/
AFAIK, it's expected to go in during the next merge window.

> >
> > Thanks for the pointer.
> >