From: Matthew Malcomson
To: Martin Liška, "gcc-patches@gcc.gnu.org"
CC: "dodji@redhat.com", nd, "kcc@google.com", "jakub@redhat.com", "dvyukov@google.com"
Subject: Re: [Patch 0/X] [WIP][RFC][libsanitizer] Introduce HWASAN to GCC
Date: Mon, 09 Sep 2019 15:55:00 -0000
Message-ID: <8fc78139-481e-6dbc-0996-2cae58627c25@arm.com>
References: <156778058239.16148.17480879484406897649.scripted-patch-series@arm.com> <936e0222-0b05-b4de-7a68-9b91e79a6f76@suse.cz>
In-Reply-To: <936e0222-0b05-b4de-7a68-9b91e79a6f76@suse.cz>

On 09/09/19 11:47, Martin Liška wrote:
> On 9/6/19 4:46 PM, Matthew Malcomson wrote:
>> Hello,
>>
>> This patch series is a WORK-IN-PROGRESS towards porting the LLVM hardware
>> address sanitizer (HWASAN) in GCC.  The document describing HWASAN can be found
>> here http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html.
>
> Hello.
>
> I'm happy that you are working on the functionality for GCC and I can provide
> my knowledge that I have with ASAN. I briefly read the patch series and I have
> multiple questions (and observations):
>
> 1) Is the ambition of the patchset to be a software emulation of MTE that can
>    work targets that do not support MTE? Is it something what clang
>    names hwasan-abi=interceptor?

The ambition is to provide a software emulation of MTE for AArch64
targets that don't support MTE.
I also hope to have the framework set up so that enabling it for other
architectures is relatively easy and can be done by those interested.

As I understand it, `hwasan-abi=interceptor` vs `platform` is about
adding such MTE emulation for "application code" or "platform code (e.g.
kernel)" respectively.

>
> 2) Do you have a real aarch64 hardware that has MTE support? Would it be possible
>    for the future to give such a machine to GCC Compile Farm for testing purpose?

No, our team doesn't have real MTE hardware.  I have been testing on an
AArch64 machine that has TBI; other work in the team that requires MTE
support is being tested on the Arm "Fast Models" emulator.

>
> 3) I like the idea of sharing of internal functions like ASAN_CHECK/HWASAN_CHECK.
>    We should benefit from that in the future.
>
> 4) Am I correct that due to escape of "tagged" pointers, one needs to have an entire
>    DSO (dynamic shared object) built with hwasan enabled? Otherwise, a dereference of
>    a tagged pointer will lead to a segfault (except TBI feature on aarch64)?

Yes, one needs to take pains to avoid the escape of tagged pointers on
architectures other than AArch64.
I don't believe that compiling the entire DSO with HWASAN enabled is
enough, since pointers can be passed across DSO boundaries.
I haven't yet looked into how to handle this.

There's an even more fundamental problem of accesses within the
instrumented binary -- I haven't yet figured out how to remove the tag
before accesses on architectures without the AArch64 TBI feature.

>
> 5) Is there a documentation/definition of how shadow memory for memory tagging looks like?
>    Is it similar to ASAN, where one can get to tag with:
>    u8 memory_tag = *((PTR >> TG) + SHADOW_OFFSET) & 0xf?
>

Yes, it's similar.
From the libhwasan code, the function to fetch a pointer to the shadow
memory byte corresponding to a memory address is MemToShadow.

constexpr uptr kShadowScale = 4;
inline uptr MemToShadow(uptr untagged_addr) {
  return (untagged_addr >> kShadowScale) +
         __hwasan_shadow_memory_dynamic_address;
}

https://github.com/llvm-mirror/compiler-rt/blob/99ce9876124e910475c627829bf14326b8073a9d/lib/hwasan/hwasan_mapping.h#L42

> 6) Note that thing like memtag_tag_size, memtag_granule_size define an ABI of libsanitizer
>

Yes, the sizes of these values define an ABI.

Those particular hooks are added as a demonstration for how something
like MTE would be implemented on top of this framework (where the
backend would specify the tag and granule size to match their target's
architecture).
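
To make it a bit more concrete why the tag size and granule size are part
of the ABI, here is a rough sketch of the check that both the instrumented
code and the runtime library have to agree on.  It is not code from the
patch series or from libhwasan: the names (`ToyShadow`, `PointerTag`,
`Untag`, `WithTag`) are made up for illustration, and I'm assuming an
8-bit tag in the top byte of a 64-bit pointer and 16-byte granules (the
latter matching kShadowScale = 4 above).  The shadow here is a small
array rather than a real mapping.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative values only (assumptions, not identifiers from GCC or
// libhwasan): 8-bit tag in the top byte, 16-byte granules.
constexpr unsigned kTagShift = 56;
constexpr unsigned kGranuleShift = 4;
constexpr std::uint64_t kGranuleSize = std::uint64_t{1} << kGranuleShift;
constexpr std::uint64_t kTagMask = std::uint64_t{0xff} << kTagShift;

// Extract the tag stored in the top byte of a pointer value.
constexpr std::uint8_t PointerTag(std::uint64_t ptr) {
  return static_cast<std::uint8_t>(ptr >> kTagShift);
}

// Clear the tag, leaving the plain address.
constexpr std::uint64_t Untag(std::uint64_t ptr) { return ptr & ~kTagMask; }

// Replace whatever tag the pointer has with `tag`.
constexpr std::uint64_t WithTag(std::uint64_t ptr, std::uint8_t tag) {
  return Untag(ptr) | (static_cast<std::uint64_t>(tag) << kTagShift);
}

// Toy shadow covering 16 granules (256 bytes) starting at `base`.
// There are no bounds checks: callers must stay inside that range.
struct ToyShadow {
  std::uint64_t base;        // untagged start address of the covered region
  std::uint8_t colours[16];  // one colour byte per granule

  // Index of the shadow byte for the granule that `ptr` points into.
  std::size_t GranuleIndex(std::uint64_t ptr) const {
    return static_cast<std::size_t>((Untag(ptr) - base) >> kGranuleShift);
  }

  // Colour `bytes` bytes starting at `ptr` (assumed granule-aligned)
  // with the tag carried in `ptr`.
  void Colour(std::uint64_t ptr, std::uint64_t bytes) {
    const std::size_t first = GranuleIndex(ptr);
    const std::size_t count =
        static_cast<std::size_t>((bytes + kGranuleSize - 1) >> kGranuleShift);
    for (std::size_t i = 0; i < count; ++i)
      colours[first + i] = PointerTag(ptr);
  }

  // The check itself: the pointer tag must equal the granule's colour.
  bool AccessOk(std::uint64_t ptr) const {
    return colours[GranuleIndex(ptr)] == PointerTag(ptr);
  }
};
```

If one side of this used a different shift for either the tag or the
granule, `AccessOk` would compare against the wrong shadow byte or the
wrong bits, which is the sense in which these sizes are ABI.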

HWASAN itself would use the hard-coded tag and granule size that matches
what libsanitizer uses.
https://github.com/llvm-mirror/compiler-rt/blob/99ce9876124e910475c627829bf14326b8073a9d/lib/hwasan/hwasan_mapping.h#L36

I define these as `HWASAN_TAG_SIZE` and `HWASAN_TAG_GRANULE_SIZE` in
asan.h, and when using the sanitizer library the macro
`HARDWARE_MEMORY_TAGGING` would be false so their values would be constant.

>>
>> The current patch series is far from complete, but I'm posting the current state
>> to provide something to discuss at the Cauldron next week.
>>
>> In its current state, this sanitizer only works on AArch64 with a custom kernel
>> to allow tagged pointers in system calls.  This is discussed in the below link
>> https://source.android.com/devices/tech/debug/hwasan -- the custom kernel allows
>> tagged pointers in syscalls.
>
> Can you be please more specific. Is the MTE in upstream linux kernel? If so,
> starting from which version?

I find I can only make complicated statements remotely clear in bullet
points ;-)
What I was trying to say was:
- HWASAN from this patch series requires AArch64 TBI.
  (I have not handled architectures without TBI)
- The upstream kernel does not accept tagged pointers in syscalls.
  (programs that use TBI must currently clear tags before passing
  pointers to the kernel)
- This patch series doesn't include any way to avoid passing tagged
  pointers to syscalls.
- Hence in order to test the sanitizer I'm using a kernel that has been
  patched to accept tagged pointers in many syscalls.
- The link to the android.com site is just another source describing the
  same requirement.
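
For illustration, "clearing tags before passing pointers to the kernel"
on an unpatched kernel would look roughly like the sketch below.  This is
not something the patch series does; `UntagPointer`, `TagPointer` and
`WriteUntagged` are hypothetical names, and I'm again assuming the tag
sits in the top byte of a 64-bit pointer.

```cpp
#include <cstddef>
#include <cstdint>
#include <unistd.h>  // POSIX write() and ssize_t

static_assert(sizeof(void*) == 8, "sketch assumes 64-bit pointers");

// Assumption: the tag occupies the top byte of the pointer.
constexpr std::uintptr_t kTagMask = std::uintptr_t{0xff} << 56;

// Return the same pointer with the tag bits cleared.
template <typename T>
T* UntagPointer(T* ptr) {
  return reinterpret_cast<T*>(reinterpret_cast<std::uintptr_t>(ptr) &
                              ~kTagMask);
}

// Return the same pointer carrying `tag` in its top byte.  Without TBI
// the result must not be dereferenced until it has been untagged again.
template <typename T>
T* TagPointer(T* ptr, std::uint8_t tag) {
  const std::uintptr_t bits =
      reinterpret_cast<std::uintptr_t>(ptr) & ~kTagMask;
  return reinterpret_cast<T*>(bits |
                              (static_cast<std::uintptr_t>(tag) << 56));
}

// Hypothetical wrapper: strip the tag at the syscall boundary so the
// kernel only ever sees an untagged address.
ssize_t WriteUntagged(int fd, const void* buf, std::size_t count) {
  return write(fd, UntagPointer(buf), count);
}
```

Every syscall that takes a pointer would need a wrapper of this kind,
which is why I'm testing on a patched kernel rather than doing this.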

The support for the relaxed ABI (of accepting tagged pointers in various
syscalls in the kernel) is being discussed on the kernel mailing list;
the latest patchset I know of is here:
https://lkml.org/lkml/2019/7/25/725

I wasn't trying to say anything about MTE in that paragraph, but kernel
support for MTE is not in the upstream Linux kernel and is currently
being worked on.

>
>> I have also not yet put tests into the DejaGNU framework, but instead have a
>> simple test file from which the tests will eventually come.  That test file is
>> attached to this email despite not being in the patch series.
>>
>> Something close to this patch series bootstraps and passes most regression
>> tests when ~--with-build-config=bootstrap-hwasan~ is used.  The regressions it
>> doesn't pass are all the other sanitizer tests and all linker plugin tests.
>> The linker plugin tests fail due to a configuration problem where the library
>> path is not correctly set.
>> (I say "something close to this patch series" because I recently made a change
>> that breaks bootstrap but I believe is the best approach once I've fixed it,
>> hence for an RFC I'm leaving it in).
>>
>> HWASAN works by storing a tag in the top bits of every pointer and a colour in
>> a shadow memory region corresponding to every area of memory.  On every memory
>> access through a pointer the tag in the pointer is checked against the colour in
>> shadow memory corresponding to the memory the pointer is accessing.  If the tag
>> and colour do not match then a fault is signalled.
>>
>> The instrumentation required for this sanitizer has a large overlap with the
>> instrumentation required for implementing MTE (which has similar functionality
>> but checks are automatically done in the hardware and instructions for colouring
>> shadow memory and for managing tags are provided by the architecture).
>> https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-a-profile-architecture-2018-developments-armv85a
>>
>> We hope to use the HWASAN framework to implement MTE tagging on the stack, and
>> hence I have a "dummy" patch demonstrating the approach envisaged for this.
>
> What's the situation with heap allocated memory and global variables?

For the heap, whatever library function allocates memory should return a
tagged pointer and colour the shadow memory accordingly.  This pointer
can then be treated exactly the same as all other pointers in
instrumented code.
On freeing of memory the shadow memory is uncoloured in order to detect
use-after-free.

For HWASAN this means malloc and friends need to be intercepted, and
this is done by the runtime library.
For MTE there will need to be some updates in the system libraries.
A discussion on the way this will be done in glibc has been started here:
https://www.sourceware.org/ml/libc-alpha/2019-09/msg00114.html

Global variables are untagged.

For MTE we are planning on having these untagged.
This is in order to allow uninstrumented object files to be statically
linked into MTE aware object files.
Since global object accesses are directly generated into the code, there
would be no way to tag global objects and still use the code from that
static object.

Since global objects will not be coloured for MTE, I am not planning on
colouring them for HWASAN.  Colouring them would take a reasonable
amount of work, including a new mechanism for associating objects with
tags.

Having all global variables untagged means that nothing need be done:
all pointers to global variables will have a tag of zero and the shadow
memory will correspondingly be left coloured as zero.

>
>>
>> Though there is still much to implement here, the general approach should be
>> clear.
>> Any feedback is welcomed, but I have three main points that I'm
>> particularly hoping for external opinions.
>>
>> 1) The current approach stores a tag on the RTL representing a given variable,
>>    in order to implement HWASAN for x86_64 the tag needs to be removed before
>>    every memory access but not on things like function calls.
>>    Is there any obvious way to handle removing the tag in these places?
>>    Maybe something with legitimize_address?
>
> Not being a target expect, but I bet you'll need to store the tag with a RTL
> representation of a stack variable.
>
> Thanks,
> Martin
>
>> 2) The first draft presented here introduces a new RTL expression called
>>    ADDTAG.  I now believe that a hook would be neater here but haven't yet
>>    looked into it.  Do people agree?
>>    (addtag is introduced in the patch titled "Put tags into each stack variable
>>    pointer", but the reason it's introduced is so the backend can define how
>>    this gets implemented with a ~define_expand~ and that's only needed for the
>>    MTE handling as introduced in "Add in MTE stubs")
>> 3) This patch series has not yet had much thought go towards it around command
>>    line arguments.  I personally quite like the idea of having
>>    ~-fsanitize=hwaddress~ turn on "checking memory tags against shadow memory
>>    colour", and MTE being just a hardware acceleration of this ability.
>>    I suspect this idea wouldn't be liked by all and would like to hear some
>>    opinions.
>>
>> Thanks,
>> Matthew
>>
>