From: "Kostya Serebryany via gcc-patches" <gcc-patches@gcc.gnu.org>
To: Matthew Malcomson <Matthew.Malcomson@arm.com>,
Peter Collingbourne <pcc@google.com>,
Evgeniy Stepanov <eugenis@google.com>
Cc: "Martin Liška" <mliska@suse.cz>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
"dodji@redhat.com" <dodji@redhat.com>, nd <nd@arm.com>,
"jakub@redhat.com" <jakub@redhat.com>,
"dvyukov@google.com" <dvyukov@google.com>
Subject: Re: [Patch 0/X] [WIP][RFC][libsanitizer] Introduce HWASAN to GCC
Date: Tue, 10 Sep 2019 01:06:00 -0000
Message-ID: <CAN=P9phAe+GDvEt3gP_6r=MRexgzdMODHjfRtApJ3QX=5TNtFA@mail.gmail.com>
In-Reply-To: <8fc78139-481e-6dbc-0996-2cae58627c25@arm.com>
+Peter Collingbourne +Evgeniy Stepanov (the main developers of HWASAN
in LLVM, FYI)
Please note that Peter has recently implemented support for globals in
LLVM's HWASAN.
--kcc
On Mon, Sep 9, 2019 at 8:55 AM Matthew Malcomson
<Matthew.Malcomson@arm.com> wrote:
>
> On 09/09/19 11:47, Martin Liška wrote:
> > On 9/6/19 4:46 PM, Matthew Malcomson wrote:
> >> Hello,
> >>
> >> This patch series is a WORK-IN-PROGRESS towards porting the LLVM hardware
> >> address sanitizer (HWASAN) in GCC. The document describing HWASAN can be found
> >> here http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html.
> >
> > Hello.
> >
> > I'm happy that you are working on the functionality for GCC and I can provide
> > my knowledge that I have with ASAN. I briefly read the patch series and I have
> > multiple questions (and observations):
> >
> > 1) Is the ambition of the patchset to be a software emulation of MTE that can
> > work on targets that do not support MTE? Is it what clang
> > calls hwasan-abi=interceptor?
>
> The ambition is to provide a software emulation of MTE for AArch64
> targets that don't support MTE.
> I also hope to have the framework set up so that enabling for other
> architectures is relatively easy and can be done by those interested.
>
> As I understand it, `hwasan-abi=interceptor` vs `platform` is about
> adding such MTE emulation for "application code" or "platform code (e.g.
> kernel)" respectively.
>
> >
> > 2) Do you have a real aarch64 hardware that has MTE support? Would it be possible
> > for the future to give such a machine to GCC Compile Farm for testing purpose?
>
> No, our team doesn't have real MTE hardware. I have been testing on an
> AArch64 machine that has TBI; other work in the team that requires MTE
> support is being tested on the Arm "Fast Models" emulator.
>
> >
> > 3) I like the idea of sharing of internal functions like ASAN_CHECK/HWASAN_CHECK.
> > We should benefit from that in the future.
> >
> > 4) Am I correct that due to escape of "tagged" pointers, one needs to have an entire
> > DSO (dynamic shared object) built with hwasan enabled? Otherwise, a dereference of
> > a tagged pointer will lead to a segfault (except TBI feature on aarch64)?
>
>
> Yes, one needs to take pains to avoid the escape of tagged pointers on
> architectures other than AArch64.
>
> I don't believe that compiling the entire DSO with HWASAN enabled is
> enough, since pointers can be passed across DSO boundaries.
> I haven't yet looked into how to handle this.
>
> There's an even more fundamental problem of accesses within the
> instrumented binary -- I haven't yet figured out how to remove the tag
> before accesses on architectures without the AArch64 TBI feature.
>
>
> >
> > 5) Is there documentation or a definition of what shadow memory for memory tagging looks like?
> > Is it similar to ASAN, where one can get to tag with:
> > u8 memory_tag = *((PTR >> TG) + SHADOW_OFFSET) & 0xf?
> >
>
> Yes, it's similar.
>
> From the libhwasan code, the function to fetch a pointer to the shadow
> memory byte corresponding to a memory address is MemToShadow.
>
>   constexpr uptr kShadowScale = 4;
>   inline uptr MemToShadow(uptr untagged_addr) {
>     return (untagged_addr >> kShadowScale) +
>            __hwasan_shadow_memory_dynamic_address;
>   }
>
> https://github.com/llvm-mirror/compiler-rt/blob/99ce9876124e910475c627829bf14326b8073a9d/lib/hwasan/hwasan_mapping.h#L42
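
To make the mapping concrete, here is a minimal sketch of the same arithmetic. It is not the libhwasan code: the explicit `shadow_base` parameter, the `uptr` alias and the `LastShadowByte` helper are hypothetical names used only for illustration, and the address passed in is assumed to have had its tag removed already.

```cpp
#include <cassert>
#include <cstdint>

using uptr = std::uint64_t;  // stand-in for the sanitizer's uptr on a 64-bit target

constexpr uptr kShadowScale = 4;  // as quoted: one shadow byte per 2^4 = 16 bytes

// Size of the aligned block of memory that shares one shadow byte.
constexpr uptr kGranuleSize = uptr{1} << kShadowScale;
static_assert(kGranuleSize == 16, "a shadow scale of 4 implies 16-byte granules");

// Same arithmetic as the quoted MemToShadow, but the shadow base is passed in
// explicitly (hypothetical parameter) instead of being read from
// __hwasan_shadow_memory_dynamic_address.
constexpr uptr MemToShadow(uptr untagged_addr, uptr shadow_base) {
  return (untagged_addr >> kShadowScale) + shadow_base;
}

// Hypothetical helper: shadow address covering the last byte of an access,
// useful when an access may straddle a granule boundary.
inline uptr LastShadowByte(uptr untagged_addr, uptr size, uptr shadow_base) {
  assert(size > 0);
  return MemToShadow(untagged_addr + size - 1, shadow_base);
}
```

All addresses inside one 16-byte granule map to the same shadow byte, so an 8-byte-aligned 16-byte access can touch two shadow bytes.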
>
>
> > 6) Note that things like memtag_tag_size, memtag_granule_size define an ABI of libsanitizer
> >
>
> Yes, the sizes of these values define an ABI.
>
> Those particular hooks are added as a demonstration for how something
> like MTE would be implemented on top of this framework (where the
> backend would specify the tag and granule size to match their target's
> architecture).
>
> HWASAN itself would use the hard-coded tag and granule size that matches
> what libsanitizer uses.
> https://github.com/llvm-mirror/compiler-rt/blob/99ce9876124e910475c627829bf14326b8073a9d/lib/hwasan/hwasan_mapping.h#L36
>
> I define these as `HWASAN_TAG_SIZE` and `HWASAN_TAG_GRANULE_SIZE` in
> asan.h, and when using the sanitizer library the macro
> `HARDWARE_MEMORY_TAGGING` would be false so their values would be constant.
>
>
> >>
> >> The current patch series is far from complete, but I'm posting the current state
> >> to provide something to discuss at the Cauldron next week.
> >>
> >> In its current state, this sanitizer only works on AArch64 with a custom kernel
> >> to allow tagged pointers in system calls. This is discussed in the below link
> >> https://source.android.com/devices/tech/debug/hwasan -- the custom kernel allows
> >> tagged pointers in syscalls.
> >
> > Can you please be more specific? Is MTE in the upstream Linux kernel? If so,
> > starting from which version?
>
> I find I can only make complicated statements remotely clear in bullet
> points ;-)
>
> What I was trying to say was:
> - HWASAN from this patch series requires AArch64 TBI.
> (I have not handled architectures without TBI)
> - The upstream kernel does not accept tagged pointers in syscalls.
> (programs that use TBI must currently clear tags before passing
> pointers to the kernel)
> - This patch series doesn't include any way to avoid passing tagged
> pointers to syscalls.
> - Hence in order to test the sanitizer I'm using a kernel that has been
> patched to accept tagged pointers in many syscalls.
> - The link to the android.com site is just another source describing the
> same requirement.
>
>
> The support for the relaxed ABI (of accepting tagged pointers in various
> syscalls in the kernel) is being discussed on the kernel mailing list,
> the latest patchset I know of is here:
> https://lkml.org/lkml/2019/7/25/725
>
> I wasn't trying to say anything about MTE in that paragraph, but kernel
> support for MTE is not yet in the upstream Linux kernel; it is currently
> being worked on.
>
> >
> >> I have also not yet put tests into the DejaGNU framework, but instead have a
> >> simple test file from which the tests will eventually come. That test file is
> >> attached to this email despite not being in the patch series.
> >>
> >> Something close to this patch series bootstraps and passes most regression
> >> tests when ~--with-build-config=bootstrap-hwasan~ is used. The regressions it
> >> doesn't pass are all the other sanitizer tests and all linker plugin tests.
> >> The linker plugin tests fail due to a configuration problem where the library
> >> path is not correctly set.
> >> (I say "something close to this patch series" because I recently made a change
> >> that breaks bootstrap but I believe is the best approach once I've fixed it,
> >> hence for an RFC I'm leaving it in).
> >>
> >> HWASAN works by storing a tag in the top bits of every pointer and a colour in
> >> a shadow memory region corresponding to every area of memory. On every memory
> >> access through a pointer the tag in the pointer is checked against the colour in
> >> shadow memory corresponding to the memory the pointer is accessing. If the tag
> >> and colour do not match then a fault is signalled.
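
As a rough illustration of the check described above, here is a toy model in which the "memory" is just a range of integer addresses and the shadow is a vector with one colour byte per granule. It is a sketch under stated assumptions, not the instrumentation the patch series emits: the 8-bit tag in the top byte, the 16-byte granule, and the names `ToyShadow`, `WithTag` and `AccessOk` are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr unsigned kTagShift = 56;     // assumption: 8-bit tag in the top byte
constexpr unsigned kGranuleShift = 4;  // assumption: 16-byte granules
constexpr std::uint64_t kAddrMask = (std::uint64_t{1} << kTagShift) - 1;

// Place `tag` in the top byte of an address (hypothetical helper).
inline std::uint64_t WithTag(std::uint64_t addr, std::uint8_t tag) {
  return (addr & kAddrMask) | (std::uint64_t{tag} << kTagShift);
}

// Toy shadow memory covering addresses [0, mem_bytes); all colours start at 0.
class ToyShadow {
 public:
  explicit ToyShadow(std::size_t mem_bytes)
      : colours_((mem_bytes + 15) >> kGranuleShift, 0) {}

  // Colour every granule touched by [addr, addr + size).
  void Colour(std::uint64_t addr, std::size_t size, std::uint8_t colour) {
    assert(size > 0);
    for (std::uint64_t g = addr >> kGranuleShift;
         g <= (addr + size - 1) >> kGranuleShift; ++g)
      colours_.at(g) = colour;
  }

  // The check itself: the pointer's tag must equal the colour of the granule
  // it points into. A real sanitizer would signal a fault on mismatch.
  bool AccessOk(std::uint64_t ptr) const {
    const std::uint8_t tag = static_cast<std::uint8_t>(ptr >> kTagShift);
    return colours_.at((ptr & kAddrMask) >> kGranuleShift) == tag;
  }

 private:
  std::vector<std::uint8_t> colours_;
};
```

For example, after `Colour(32, 32, 0x2a)` a pointer built with `WithTag(32, 0x2a)` passes the check for offsets 0 to 31 and fails at offset 32, where the neighbouring granule still has colour zero. An untagged pointer into uncoloured memory passes because both tag and colour are zero.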
> >>
> >> The instrumentation required for this sanitizer has a large overlap with the
> >> instrumentation required for implementing MTE (which has similar functionality
> >> but checks are automatically done in the hardware and instructions for colouring
> >> shadow memory and for managing tags are provided by the architecture).
> >> https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-a-profile-architecture-2018-developments-armv85a
> >>
> >> We hope to use the HWASAN framework to implement MTE tagging on the stack, and
> >> hence I have a "dummy" patch demonstrating the approach envisaged for this.
> >
> > What's the situation with heap allocated memory and global variables?
>
> For the heap, whatever library function allocates memory should return a
> tagged pointer and colour the shadow memory accordingly. This pointer
> can then be treated exactly the same as all other pointers in
> instrumented code.
> On freeing of memory the shadow memory is uncoloured in order to detect
> use-after-free.
>
> For HWASAN this means malloc and friends need to be intercepted, and
> this is done by the runtime library.
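
The allocate/free behaviour described above can be sketched with a toy bump allocator. This is only an illustration and makes several assumptions that are not in the patch series or in libhwasan: tags are handed out sequentially rather than chosen by the runtime, memory is never reused, the caller passes the size to `Free`, and `ToyHeap` and its members are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr unsigned kTagShift = 56;     // assumption: 8-bit tag in the top byte
constexpr std::uint64_t kGranule = 16; // assumption: 16-byte granules
constexpr std::uint64_t kAddrMask = (std::uint64_t{1} << kTagShift) - 1;

class ToyHeap {
 public:
  // Returns a tagged "pointer" (an offset into a toy arena) and colours the
  // shadow for the allocation with the same tag.
  std::uint64_t Alloc(std::uint64_t size) {
    const std::uint64_t granules = (size + kGranule - 1) / kGranule;
    assert(granules > 0 && next_ + granules <= shadow_.size());
    const std::uint8_t tag = next_tag_;
    // Sequential tags, skipping 0 so that freed memory never matches.
    next_tag_ = static_cast<std::uint8_t>(next_tag_ == 0xff ? 1 : next_tag_ + 1);
    for (std::uint64_t i = 0; i < granules; ++i) shadow_[next_ + i] = tag;
    const std::uint64_t addr = next_ * kGranule;
    next_ += granules;
    return addr | (std::uint64_t{tag} << kTagShift);
  }

  // Uncolours the shadow so that stale tagged pointers no longer match.
  void Free(std::uint64_t ptr, std::uint64_t size) {
    const std::uint64_t first = (ptr & kAddrMask) / kGranule;
    const std::uint64_t granules = (size + kGranule - 1) / kGranule;
    assert(first + granules <= shadow_.size());
    for (std::uint64_t i = 0; i < granules; ++i) shadow_[first + i] = 0;
  }

  // Tag check as performed on every instrumented access.
  bool AccessOk(std::uint64_t ptr) const {
    const std::uint64_t g = (ptr & kAddrMask) / kGranule;
    return g < shadow_.size() &&
           shadow_[g] == static_cast<std::uint8_t>(ptr >> kTagShift);
  }

 private:
  std::array<std::uint8_t, 64> shadow_{};  // one colour per granule, initially 0
  std::uint64_t next_ = 0;                 // next free granule
  std::uint8_t next_tag_ = 1;
};
```

With this model, a pointer returned by `Alloc` passes `AccessOk` until `Free` is called on it; afterwards the stale pointer still carries its old tag while the shadow is zero, so the use-after-free is caught.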
>
> For MTE there will need to be some updates in the system libraries.
> A discussion on the way this will be done in glibc has been started here:
> https://www.sourceware.org/ml/libc-alpha/2019-09/msg00114.html
>
>
>
> Global variables are untagged.
>
> For MTE we are planning on having these untagged.
> This is in order to allow uninstrumented object files to be statically
> linked into MTE aware object files.
> Since global object accesses are directly generated into the code, there
> would be no way to tag global objects and still use the code from that
> static object.
>
>
> Since global objects will not be coloured for MTE, I am not planning on
> colouring them for HWASAN. There would be a reasonable amount of work,
> including a new mechanism for associating objects with tags.
>
> Having all global variables untagged means that nothing need be done:
> all pointers to global variables will have a tag of zero and the shadow
> memory will correspondingly be left coloured as zero.
>
> >
> >>
> >> Though there is still much to implement here, the general approach should be
> >> clear. Any feedback is welcomed, but I have three main points that I'm
> >> particularly hoping for external opinions.
> >>
> >> 1) The current approach stores a tag on the RTL representing a given variable.
> >> In order to implement HWASAN for x86_64, the tag needs to be removed before
> >> every memory access, but not on things like function calls.
> >> Is there any obvious way to handle removing the tag in these places?
> >> Maybe something with legitimize_address?
> >
> > Not being a target expert, but I bet you'll need to store the tag with a RTL
> > representation of a stack variable.
> >
> > Thanks,
> > Martin
> >
> >> 2) The first draft presented here introduces a new RTL expression called
> >> ADDTAG. I now believe that a hook would be neater here but haven't yet
> >> looked into it. Do people agree?
> >> (addtag is introduced in the patch titled "Put tags into each stack variable
> >> pointer", but the reason it's introduced is so the backend can define how
> >> this gets implemented with a ~define_expand~ and that's only needed for the
> >> MTE handling as introduced in "Add in MTE stubs")
> >> 3) Not much thought has yet gone into the command line arguments for this
> >> patch series. I personally quite like the idea of having
> >> ~-fsanitize=hwaddress~ turn on "checking memory tags against shadow memory
> >> colour", and MTE being just a hardware acceleration of this ability.
> >> I suspect this idea wouldn't be liked by all and would like to hear some
> >> opinions.
> >>
> >> Thanks,
> >> Matthew
> >>
> >
>