public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 00/16] memory tagging improvements
@ 2021-03-04 16:30 Szabolcs Nagy
  2021-03-04 16:30 ` [PATCH 01/16] malloc: Fix a realloc crash with heap tagging [BZ 27468] Szabolcs Nagy
                   ` (15 more replies)
  0 siblings, 16 replies; 42+ messages in thread
From: Szabolcs Nagy @ 2021-03-04 16:30 UTC (permalink / raw)
  To: libc-alpha, Richard.Earnshaw, DJ Delorie

This set tries to reduce the overhead of memory tagging support: when
glibc is built with --enable-memory-tagging there are runtime checks
that affect performance even if heap tagging is not enabled at runtime.

I refactored the code to reduce internal abstractions and ifdefs. And
a number of patches optimize heap tagging on aarch64 with MTE.

Also available in the nsz/mtag branch.

Issues found and fixed:
- realloc failed to untag the old memory in some cases.
- incorrect but useless macro definition in public sys/prctl.h.
- incorrect but harmless internal MTAG_MMAP_FLAGS declaration.

Issues found but not fixed:
- large granule is not supported: both chunk header and user allocation
  must be tag granule aligned, but the code assumes that the chunk
  header is 2*SIZE_SZ so larger granule is not supported.
- request2size and checked_request2size are not consistent which looks
  wrong (likely works in practice other than some sizes don't end up in
  fastbin or tcache that one might expected to, but i think this can
  break easily). Fixing it is not obvious because of assumptions about
  the chunk layout that is no longer true with mtag enabled.

Other possible optimizations:
- ifunc malloc apis
  good: near zero overhead for runtime dispatch.
  bad: malloc has to be built twice or larger refactoring is needed.
- unconditional top byte zero
  internal data is tagged 0, only user allocation is tagged non-zero
  (we rely on this e.g. where MADV_DONTNEED is used: then the kernel
  zeros the tags so with non-zero tags internal pointers would not be
  able to access recycled memory), so tag_at(p) for internal data
  could be unconditional top byte clear.
  good: faster than the branch in tag_at, may improve security (?).
  bad: design change.
- unconditional layout
  padding up to tag granule can be done unconditionally
  good: no runtime check, same layout with and without heap tagging.
  bad: wasting space with small allocations, design change.

This patchset tried to avoid intrusive changes that affect behaviour.

Benchmarked using bench-malloc-thread glibc ubenchmark on Neoverse N1.
For size=16 the runtime check overhead is
  ~ 32% before the patch set
  ~ 16% after the patch set.
On a practical workload or for large sizes the overhead should be tiny.

Szabolcs Nagy (16):
  malloc: Fix a realloc crash with heap tagging [BZ 27468]
  Remove PR_TAGGED_ADDR_ENABLE from sys/prctl.h
  malloc: Move MTAG_MMAP_FLAGS definition
  malloc: Simplify __mtag_tag_new_usable
  malloc: Avoid taggig mmaped memory on free
  malloc: Ensure the generic mtag hooks are not used
  malloc: Refactor TAG_ macros to avoid indirection
  malloc: Use global flag instead of function pointer dispatch for mtag
  malloc: Only support zeroing and not arbitrary memset with mtag
  malloc: Change calloc when tagging is disabled
  malloc: Use branches instead of mtag_granule_mask
  malloc: Use mtag_enabled instead of USE_MTAG
  aarch64: inline __libc_mtag_address_get_tag
  aarch64: inline __libc_mtag_new_tag
  aarch64: Optimize __libc_mtag_tag_region
  aarch64: Optimize __libc_mtag_tag_zero_region

 include/malloc.h                         |   7 -
 malloc/arena.c                           |  45 +-----
 malloc/hooks.c                           |  20 ++-
 malloc/malloc.c                          | 177 ++++++++++++-----------
 sysdeps/aarch64/Makefile                 |   4 +-
 sysdeps/aarch64/__mtag_address_get_tag.S |  32 ----
 sysdeps/aarch64/__mtag_memset_tag.S      |  53 -------
 sysdeps/aarch64/__mtag_new_tag.S         |  37 -----
 sysdeps/aarch64/__mtag_tag_region.S      |  98 ++++++++++---
 sysdeps/aarch64/__mtag_tag_zero_region.S | 113 +++++++++++++++
 sysdeps/aarch64/libc-mtag.h              |  32 ++--
 sysdeps/generic/libc-mtag.h              |  43 ++++--
 sysdeps/unix/sysv/linux/sys/prctl.h      |   4 -
 13 files changed, 353 insertions(+), 312 deletions(-)
 delete mode 100644 sysdeps/aarch64/__mtag_address_get_tag.S
 delete mode 100644 sysdeps/aarch64/__mtag_memset_tag.S
 delete mode 100644 sysdeps/aarch64/__mtag_new_tag.S
 create mode 100644 sysdeps/aarch64/__mtag_tag_zero_region.S

-- 
2.17.1


^ permalink raw reply	[flat|nested] 42+ messages in thread
* [PATCH 06/16] malloc: Ensure the generic mtag hooks are not used
@ 2021-03-05 15:16 Wilco Dijkstra
  2021-03-05 18:15 ` DJ Delorie
  0 siblings, 1 reply; 42+ messages in thread
From: Wilco Dijkstra @ 2021-03-05 15:16 UTC (permalink / raw)
  To: Szabolcs Nagy, dj; +Cc: 'GNU C Library'

Hi,

>> This may make it impossible to compile individual glibc modules with
>> -O0, say, for debugging purposes.  I do this quite often with malloc.c
>> and do not want to lose that ability.
>> 
>> Do we trust the compiler's default optimizations (at -O0) to do code
>> elimination based on constant conditions?
>
> yes, this breaks -O0, i haven't thought of that.

No this works fine. -O0 optimizes if (0) like you'd expect. You do need a bit
of hacking to do it, I tried:

CPPFLAGS-malloc.c += -O0 -D__OPTIMIZE__

> i thought it's better to guard everything with
> one flag in the code than ifdefs (that can be
> runtime flag with tagging enabled or constant 0
> when disabled) since ifdefed code is not checked
> by the compiler.

Agreed, generic C code is better than preprocessed code that is never
checked for all possible configurations.

And to reduce the overhead further without ifuncs we will likely need to
move most of the tagging code into a separate if (tagging_enabled) block
similar to the single-threaded fast path (instead of checking the boolean
many times).

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 42+ messages in thread
* [PATCH 06/16] malloc: Ensure the generic mtag hooks are not used
@ 2021-03-05 19:01 Wilco Dijkstra
  0 siblings, 0 replies; 42+ messages in thread
From: Wilco Dijkstra @ 2021-03-05 19:01 UTC (permalink / raw)
  To: dj; +Cc: 'GNU C Library'

Hi DJ,

>> No this works fine. -O0 optimizes if (0) like you'd expect. You do need a bit
>> of hacking to do it, I tried:
>
> Did you try it on the oldest version of gcc that glibc still supports?

No, but since this is not an officially supported option (-O0 always gives an error),
I don't see how that matters until -O0 becomes officially supported.

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2021-04-13 21:32 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-04 16:30 [PATCH 00/16] memory tagging improvements Szabolcs Nagy
2021-03-04 16:30 ` [PATCH 01/16] malloc: Fix a realloc crash with heap tagging [BZ 27468] Szabolcs Nagy
2021-03-05  0:15   ` DJ Delorie
2021-03-05 12:01     ` Szabolcs Nagy
2021-03-05 18:42       ` DJ Delorie
2021-03-05 20:51   ` DJ Delorie
2021-03-04 16:30 ` [PATCH 02/16] Remove PR_TAGGED_ADDR_ENABLE from sys/prctl.h Szabolcs Nagy
2021-03-26 11:29   ` Szabolcs Nagy
2021-04-13  8:37     ` Szabolcs Nagy
2021-04-13 21:32       ` DJ Delorie
2021-03-04 16:31 ` [PATCH 03/16] malloc: Move MTAG_MMAP_FLAGS definition Szabolcs Nagy
2021-03-05  1:07   ` DJ Delorie
2021-03-04 16:31 ` [PATCH 04/16] malloc: Simplify __mtag_tag_new_usable Szabolcs Nagy
2021-03-05  0:20   ` DJ Delorie
2021-03-05 12:24     ` Szabolcs Nagy
2021-03-05 18:52   ` DJ Delorie
2021-03-04 16:31 ` [PATCH 05/16] malloc: Avoid taggig mmaped memory on free Szabolcs Nagy
2021-03-05  1:01   ` DJ Delorie
2021-03-04 16:31 ` [PATCH 06/16] malloc: Ensure the generic mtag hooks are not used Szabolcs Nagy
2021-03-05  1:05   ` DJ Delorie
2021-03-05 12:44     ` Szabolcs Nagy
2021-03-05 20:30   ` DJ Delorie
2021-03-04 16:32 ` [PATCH 07/16] malloc: Refactor TAG_ macros to avoid indirection Szabolcs Nagy
2021-03-05  0:28   ` DJ Delorie
2021-03-04 16:32 ` [PATCH 08/16] malloc: Use global flag instead of function pointer dispatch for mtag Szabolcs Nagy
2021-03-05  0:46   ` DJ Delorie
2021-03-05 12:53     ` Szabolcs Nagy
2021-03-04 16:32 ` [PATCH 09/16] malloc: Only support zeroing and not arbitrary memset with mtag Szabolcs Nagy
2021-03-05  0:49   ` DJ Delorie
2021-03-04 16:33 ` [PATCH 10/16] malloc: Change calloc when tagging is disabled Szabolcs Nagy
2021-03-05  1:06   ` DJ Delorie
2021-03-04 16:33 ` [PATCH 11/16] malloc: Use branches instead of mtag_granule_mask Szabolcs Nagy
2021-03-05 21:00   ` DJ Delorie
2021-03-04 16:33 ` [PATCH 12/16] malloc: Use mtag_enabled instead of USE_MTAG Szabolcs Nagy
2021-03-05  0:56   ` DJ Delorie
2021-03-04 16:34 ` [PATCH 13/16] aarch64: inline __libc_mtag_address_get_tag Szabolcs Nagy
2021-03-04 16:34 ` [PATCH 14/16] aarch64: inline __libc_mtag_new_tag Szabolcs Nagy
2021-03-04 16:34 ` [PATCH 15/16] aarch64: Optimize __libc_mtag_tag_region Szabolcs Nagy
2021-03-04 16:34 ` [PATCH 16/16] aarch64: Optimize __libc_mtag_tag_zero_region Szabolcs Nagy
2021-03-05 15:16 [PATCH 06/16] malloc: Ensure the generic mtag hooks are not used Wilco Dijkstra
2021-03-05 18:15 ` DJ Delorie
2021-03-05 19:01 Wilco Dijkstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).