public inbox for libc-alpha@sourceware.org
* [PATCH 0/7] RFC Memory tagging support
@ 2020-06-15 14:40 Richard Earnshaw
  2020-06-15 14:40 ` [PATCH 1/7] config: Allow memory tagging to be enabled when configuring glibc Richard Earnshaw
                   ` (12 more replies)
  0 siblings, 13 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-15 14:40 UTC (permalink / raw)
  To: libc-alpha; +Cc: Richard Earnshaw

Last year I posted a preliminary set of patches for discussion
purposes on adding support for tagged memory to glibc.  This version
polishes that to the point where I believe it is now deployable.

The first four patches are generic changes, the final three add the
aarch64 specific code.

The first patch simply adds a new configuration option to the build
system which can be turned on with the option --enable-memory-tagging.
The default at present is 'no'.

The second patch adds a glibc tunable that can be used at run-time to
enable the feature (the default, again, is disabled).  This tunable
would be always present, but have no effect on systems lacking support
for memory tagging.  I've structured the tunable as a bit-mask of
features that can be used with memory tagging, though at present only
two bits have defined uses.
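
As a rough illustration of the bit-mask layout (not code from the
series), the tunable value could be decoded along these lines.  The
macro, struct and function names are hypothetical; the bit meanings are
the ones given in the description of patch 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the two bits that currently have a meaning.  */
#define MEMTAG_MALLOC   (1u << 0)  /* bit 0: malloc returns tagged memory */
#define MEMTAG_PRECISE  (1u << 1)  /* bit 1: precise tag-fault reporting */

/* Hypothetical decoded view of the tunable.  */
struct memtag_config
{
  bool kernel_tagging;   /* any non-zero value turns tag checking on */
  bool tag_malloc;
  bool precise_faults;
};

struct memtag_config
memtag_decode (int32_t tunable)
{
  /* The tunable definition limits the value to 0..255.  */
  assert (tunable >= 0 && tunable <= 255);

  struct memtag_config cfg;
  cfg.kernel_tagging = tunable != 0;
  cfg.tag_malloc = ((uint32_t) tunable & MEMTAG_MALLOC) != 0;
  cfg.precise_faults = ((uint32_t) tunable & MEMTAG_PRECISE) != 0;
  return cfg;
}
```

With this reading, a value of 3 asks for tagged malloc with precise
faults, while a value of 4 only enables tag checking in the kernel.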

The third patch is the meat of the changes; it adds the changes to the
malloc APIs.  I've tried as far as possible to ensure that when memory
tagging is disabled, there is no change in behaviour, even when the
memory tagging is configured into the library, but there are
inevitably a small number of changes needed in the optimizations that
calloc performs since tagging would require that all the tags were
correctly set, even if the memory does not strictly have to be zeroed.
I've made use of function pointers in the code, much the same way as
the morecore hook is used, so that when tagging is disabled, the
functions called are the same as the traditional operations; this also
ensures that glibc does not require any internal ifunc resolution in
order to work.
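
To make the function-pointer idea concrete, here is a minimal
stand-alone sketch.  It is not the code from patch 3: all names are
made up for illustration, and the "tagging" routine is a stand-in that
only counts calls, since real tagging needs MTE hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Default hook: tagging is off, so hand the pointer back unchanged.  */
static void *
tag_nop (void *ptr)
{
  return ptr;
}

/* The hooks start out as the traditional operations.  */
static void *(*tag_new_usable) (void *) = tag_nop;
static void *(*tag_new_memset) (void *, int, size_t) = memset;

/* Stand-in for a real tagging routine; it only records that it ran.  */
static int tagged_calls;

static void *
fake_tag_new_usable (void *ptr)
{
  tagged_calls++;
  return ptr;
}

/* Called once at start-up when the tunable asks for tagged malloc.  */
void
enable_tagging_hooks (void)
{
  tag_new_usable = fake_tag_new_usable;
}

/* What a malloc-like entry point does just before returning.  */
void *
hooked_alloc_finish (void *ptr)
{
  return tag_new_usable (ptr);
}

/* What a calloc-like entry point does just before returning.  */
void *
hooked_calloc_finish (void *ptr, size_t size)
{
  assert (ptr != NULL);
  return tag_new_memset (ptr, 0, size);
}
```

Until enable_tagging_hooks runs, every call goes through tag_nop or
plain memset, so the untagged path behaves as it did before.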

The fourth patch adds support for the new prctl operations that are
being proposed to the linux kernel.  The kernel changes are to a
generic header and this patch mirrors that design decision in glibc.

The fifth patch is a place-holder, so that this series of changes is
stand-alone.  Work is already underway to change the string operations
to be MTE safe without losing too much in the way of performance.  I
expect this patch to be removed entirely before the series is
committed.

The final two patches add the remaining aarch64 support.  The first
adds the support code to examine the tunable and HW caps; and enable
memory tagging in the kernel as needed.  The second adds the final
pieces needed to support memory tagging in glibc.

Testing
=======

I've run all the malloc tests.  While not all of them pass at present,
I have examined all the failing tests and I'm confident that none of the
failures represent real bugs in the code; but I have not yet decided how
best to address these failures.  The failures fall into three categories:

1) Tests that expect a particular failure mode on double-free operations.
I've added a quick tag-checking access in the free entry path that
essentially asserts that the tag colour of a free'd block of memory
matches the colour of the pointer - this leads to the kernel delivering
a different signal that the test code does not expect.

2) Tests that assume that malloc_usable_size will return a specific
amount of free space.  The assumptions are not correct, because the
tag colouring boundaries needed for MTE mean that the 8 bytes in the
block containing the back pointer can no longer be used by users when
we have MTE (they have a different colour that belongs to the malloc
data structures).

3) Tests that construct a fake internal malloc data structure and then
try to perform operations on them.  I haven't looked at these in too
much detail, but the first issue is that the fake header is only
8-byte aligned and for MTE to work it requires a 16-byte aligned
structure (the tag read/write operations require the address be
granule aligned, and the real glibc data structure has this property).
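
The alignment constraint behind categories 2 and 3 can be sketched as
below.  This is an illustration only, assuming the 16-byte granule
mentioned above; the helper names are hypothetical and do not appear in
the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE 16   /* tag granule in bytes, as described above */

/* True if P could be the start of a tagged region.  */
bool
granule_aligned (const void *p)
{
  return ((uintptr_t) p & (GRANULE_SIZE - 1)) == 0;
}

/* Number of whole granules needed to cover BYTES of user data.  A
   granule cannot be shared between two owners with different tags.  */
size_t
granules_needed (size_t bytes)
{
  return bytes / GRANULE_SIZE + (bytes % GRANULE_SIZE != 0);
}

/* Precondition check for a hypothetical tag-setting routine.  */
void
check_tag_region (const void *p, size_t size)
{
  assert (granule_aligned (p));
  assert (size % GRANULE_SIZE == 0);
}
```

A fake chunk header placed at an address that is only 8-byte aligned
fails granule_aligned, which is the first problem the tests in
category 3 run into.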

In addition to the above I've run the code on a traditional aarch64
machine that lacks MTE (to confirm that the code behaves as expected
on existing machines) and on a model that has support for MTE.

Richard.

Richard Earnshaw (7):
  config: Allow memory tagging to be enabled when configuring glibc
  elf: Add a tunable to control use of tagged memory
  malloc: Basic support for memory tagging in the malloc() family
  linux: Add compatibility definitions to sys/prctl.h for MTE
  aarch64: Mitigations for string functions when MTE is enabled.
  aarch64: Add sysv specific enabling code for memory tagging
  aarch64: Add aarch64-specific files for memory tagging support

 config.h.in                                   |   3 +
 config.make.in                                |   2 +
 configure                                     |  17 ++
 configure.ac                                  |  10 +
 elf/dl-tunables.list                          |   9 +
 malloc/arena.c                                |  42 ++++-
 malloc/malloc.c                               | 171 ++++++++++++++----
 malloc/malloc.h                               |   7 +
 manual/install.texi                           |  13 ++
 manual/tunables.texi                          |  31 ++++
 sysdeps/aarch64/Makefile                      |   5 +
 sysdeps/aarch64/__mtag_address_get_tag.S      |  31 ++++
 sysdeps/aarch64/__mtag_memset_tag.S           |  46 +++++
 sysdeps/aarch64/__mtag_new_tag.S              |  38 ++++
 sysdeps/aarch64/__mtag_tag_region.S           |  44 +++++
 sysdeps/aarch64/libc-mtag.h                   |  57 ++++++
 sysdeps/aarch64/memchr.S                      |  21 ++-
 sysdeps/aarch64/multiarch/strlen_asimd.S      |   2 +-
 sysdeps/aarch64/strchr.S                      |  15 ++
 sysdeps/aarch64/strchrnul.S                   |  14 +-
 sysdeps/aarch64/strcmp.S                      |  12 +-
 sysdeps/aarch64/strcpy.S                      |   2 +-
 sysdeps/aarch64/strlen.S                      |   2 +-
 sysdeps/aarch64/strncmp.S                     |  10 +-
 sysdeps/aarch64/strrchr.S                     |  15 +-
 sysdeps/generic/libc-mtag.h                   |  52 ++++++
 sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h  |   2 +
 sysdeps/unix/sysv/linux/aarch64/bits/mman.h   |  32 ++++
 .../unix/sysv/linux/aarch64/cpu-features.c    |  22 +++
 .../unix/sysv/linux/aarch64/cpu-features.h    |   1 +
 sysdeps/unix/sysv/linux/sys/prctl.h           |  18 ++
 31 files changed, 696 insertions(+), 50 deletions(-)
 create mode 100644 sysdeps/aarch64/__mtag_address_get_tag.S
 create mode 100644 sysdeps/aarch64/__mtag_memset_tag.S
 create mode 100644 sysdeps/aarch64/__mtag_new_tag.S
 create mode 100644 sysdeps/aarch64/__mtag_tag_region.S
 create mode 100644 sysdeps/aarch64/libc-mtag.h
 create mode 100644 sysdeps/generic/libc-mtag.h
 create mode 100644 sysdeps/unix/sysv/linux/aarch64/bits/mman.h

-- 
2.26.2



* [PATCH 1/7] config: Allow memory tagging to be enabled when configuring glibc
  2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
@ 2020-06-15 14:40 ` Richard Earnshaw
  2020-06-15 14:40 ` [PATCH 2/7] elf: Add a tunable to control use of tagged memory Richard Earnshaw
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-15 14:40 UTC (permalink / raw)
  To: libc-alpha; +Cc: Richard Earnshaw


This patch adds the configuration machinery to allow memory tagging to be
enabled from the command line via the configure option --enable-memory-tagging.

The current default is off, though in time we may change that once the API
is more stable.
---
 config.h.in         |  3 +++
 config.make.in      |  2 ++
 configure           | 17 +++++++++++++++++
 configure.ac        | 10 ++++++++++
 manual/install.texi | 13 +++++++++++++
 5 files changed, 45 insertions(+)



diff --git a/config.h.in b/config.h.in
index dea43df438..b7f37e8e2a 100644
--- a/config.h.in
+++ b/config.h.in
@@ -148,6 +148,9 @@
 /* Define if __stack_chk_guard canary should be randomized at program startup.  */
 #undef ENABLE_STACKGUARD_RANDOMIZE
 
+/* Define if memory tagging support should be enabled.  */
+#undef _LIBC_MTAG
+
 /* Package description.  */
 #undef PKGVERSION
 
diff --git a/config.make.in b/config.make.in
index 2fed3da773..b3be8e0381 100644
--- a/config.make.in
+++ b/config.make.in
@@ -84,6 +84,8 @@ mach-interface-list = @mach_interface_list@
 
 experimental-malloc = @experimental_malloc@
 
+memory-tagging = @memory_tagging@
+
 nss-crypt = @libc_cv_nss_crypt@
 static-nss-crypt = @libc_cv_static_nss_crypt@
 
diff --git a/configure b/configure
index 8df47d61f8..92854715e0 100755
--- a/configure
+++ b/configure
@@ -678,6 +678,7 @@ link_obsolete_rpc
 libc_cv_static_nss_crypt
 libc_cv_nss_crypt
 build_crypt
+memory_tagging
 experimental_malloc
 enable_werror
 all_warnings
@@ -783,6 +784,7 @@ enable_all_warnings
 enable_werror
 enable_multi_arch
 enable_experimental_malloc
+enable_memory_tagging
 enable_crypt
 enable_nss_crypt
 enable_obsolete_rpc
@@ -1454,6 +1456,8 @@ Optional Features:
                           architectures
   --disable-experimental-malloc
                           disable experimental malloc features
+  --enable-memory-tagging enable memory tagging if supported by the
+                          architecture [default=no]
   --disable-crypt         do not build nor install the passphrase hashing
                           library, libcrypt
   --enable-nss-crypt      enable libcrypt to use nss
@@ -3527,6 +3531,19 @@ fi
 
 
 
+# Check whether --enable-memory-tagging was given.
+if test "${enable_memory_tagging+set}" = set; then :
+  enableval=$enable_memory_tagging; memory_tagging=$enableval
+else
+  memory_tagging=no
+fi
+
+if test "$memory_tagging" = yes; then
+  $as_echo "#define _LIBC_MTAG 1" >>confdefs.h
+
+fi
+
+
 # Check whether --enable-crypt was given.
 if test "${enable_crypt+set}" = set; then :
   enableval=$enable_crypt; build_crypt=$enableval
diff --git a/configure.ac b/configure.ac
index 5f229679a9..307a32d94b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -311,6 +311,16 @@ AC_ARG_ENABLE([experimental-malloc],
 	      [experimental_malloc=yes])
 AC_SUBST(experimental_malloc)
 
+AC_ARG_ENABLE([memory-tagging],
+	      AC_HELP_STRING([--enable-memory-tagging],
+			     [enable memory tagging if supported by the architecture @<:@default=no@:>@]),
+	      [memory_tagging=$enableval],
+	      [memory_tagging=no])
+if test "$memory_tagging" = yes; then
+  AC_DEFINE(_LIBC_MTAG)
+fi
+AC_SUBST(memory_tagging)
+
 AC_ARG_ENABLE([crypt],
               AC_HELP_STRING([--disable-crypt],
                              [do not build nor install the passphrase hashing library, libcrypt]),
diff --git a/manual/install.texi b/manual/install.texi
index 71bf47cac6..abf7197090 100644
--- a/manual/install.texi
+++ b/manual/install.texi
@@ -167,6 +167,19 @@ on non-CET processors.  @option{--enable-cet} has been tested for
 x86_64 and x32 on CET SDVs, but Intel CET support hasn't been validated
 for i686.
 
+@item --enable-memory-tagging
+Enable memory tagging support on architectures that support it.  When
+@theglibc{} is built with this option then the resulting library will
+be able to control the use of tagged memory when hardware support is
+present by use of the tunable @samp{glibc.memtag.enable}.  This includes
+the generation of tagged memory when using the @code{malloc} APIs.
+
+At present only AArch64 platforms with MTE provide this functionality,
+although the library will still operate (without memory tagging) on
+older versions of the architecture.
+
+The default is to disable support for memory tagging.
+
 @item --disable-profile
 Don't build libraries with profiling information.  You may want to use
 this option if you don't plan to do profiling.


* [PATCH 2/7] elf: Add a tunable to control use of tagged memory
  2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
  2020-06-15 14:40 ` [PATCH 1/7] config: Allow memory tagging to be enabled when configuring glibc Richard Earnshaw
@ 2020-06-15 14:40 ` Richard Earnshaw
  2020-06-15 14:40 ` [PATCH 3/7] malloc: Basic support for memory tagging in the malloc() family Richard Earnshaw
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-15 14:40 UTC (permalink / raw)
  To: libc-alpha; +Cc: Richard Earnshaw


Add a new glibc tunable: glibc.memtag.enable, bound to the environment
variable _MTAG_ENABLE.  This is a decimal constant in the range 0-255
but used as a bit-field.

Bit 0 enables use of tagged memory in the malloc family of functions.
Bit 1 enables precise faulting of tag failure on platforms where this
can be controlled.
Other bits are currently unused, but if set will cause memory tag
checking for the current process to be enabled in the kernel.
---
 elf/dl-tunables.list |  9 +++++++++
 manual/tunables.texi | 31 +++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)



diff --git a/elf/dl-tunables.list b/elf/dl-tunables.list
index 0d398dd251..92b6f21b44 100644
--- a/elf/dl-tunables.list
+++ b/elf/dl-tunables.list
@@ -126,4 +126,13 @@ glibc {
       default: 3
     }
   }
+
+  memtag {
+    enable {
+      type: INT_32
+      minval: 0
+      maxval: 255
+      env_alias: _MTAG_ENABLE
+    }
+  }
 }
diff --git a/manual/tunables.texi b/manual/tunables.texi
index ec18b10834..428a0918e6 100644
--- a/manual/tunables.texi
+++ b/manual/tunables.texi
@@ -35,6 +35,8 @@ their own namespace.
 * POSIX Thread Tunables:: Tunables in the POSIX thread subsystem
 * Hardware Capability Tunables::  Tunables that modify the hardware
 				  capabilities seen by @theglibc{}
+* Memory Tagging Tunables::  Tunables that control the use of hardware
+			     memory tagging
 @end menu
 
 @node Tunable names
@@ -423,3 +425,32 @@ instead.
 
 This tunable is specific to i386 and x86-64.
 @end deftp
+
+@node Memory Tagging Tunables
+@section Memory Tagging Tunables
+@cindex memory tagging tunables
+
+@deftp {Tunable namespace} glibc.memtag
+If the hardware supports memory tagging, these tunables can be used to
+control the way @theglibc{} uses this feature.  Currently, only AArch64
+supports this feature.
+@end deftp
+
+@deftp Tunable glibc.memtag.enable
+This tunable takes a value between 0 and 255 and acts as a bitmask
+that enables various capabilities.
+
+Bit 0 (the least significant bit) causes the malloc subsystem to allocate
+tagged memory, with each allocation being assigned a random tag.
+
+Bit 1 enables precise faulting mode for tag violations on systems that
+support deferred tag violation reporting.  This may cause programs
+to run more slowly.
+
+Other bits are currently reserved.
+
+@Theglibc{} startup code will automatically enable memory tagging
+support in the kernel if this tunable has any non-zero value.
+
+The default value is @samp{0}, which disables all memory tagging.
+@end deftp


* [PATCH 3/7] malloc: Basic support for memory tagging in the malloc() family
  2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
  2020-06-15 14:40 ` [PATCH 1/7] config: Allow memory tagging to be enabled when configuring glibc Richard Earnshaw
  2020-06-15 14:40 ` [PATCH 2/7] elf: Add a tunable to control use of tagged memory Richard Earnshaw
@ 2020-06-15 14:40 ` Richard Earnshaw
  2020-06-15 14:40 ` [PATCH 4/7] linux: Add compatibility definitions to sys/prctl.h for MTE Richard Earnshaw
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-15 14:40 UTC (permalink / raw)
  To: libc-alpha; +Cc: Richard Earnshaw


This patch adds the basic support for memory tagging.

Various flavours are supported, particularly being able to turn on
tagged memory at run-time: this allows the same code to be used on
systems where memory tagging support is not present without needing
a separate build of glibc.  Also, depending on whether the kernel
supports it, the code will use mmap for the default arena if morecore
does not, or cannot, support tagged memory (on AArch64 it is not
available).

All the hooks use function pointers to allow this to work without
needing ifuncs.
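
The mmap fallback can be pictured with the following stand-alone
sketch.  It is only an illustration under stated assumptions: the names
are hypothetical, and the brk-style grower is a stand-in backed by a
static buffer rather than a real system call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GROW_FAILURE ((void *) -1)   /* hypothetical failure sentinel */

/* Stand-in for a brk-style grower that can only supply untagged
   memory.  */
static char brk_pool[256];

static void *
brk_grow (ptrdiff_t increment)
{
  (void) increment;
  return brk_pool;
}

/* Replacement that always fails, so the caller takes its mmap path.  */
static void *
failing_grow (ptrdiff_t increment)
{
  (void) increment;
  return GROW_FAILURE;
}

static void *(*grow_hook) (ptrdiff_t) = brk_grow;

/* Start-up decision: only swap the hook when tags are wanted and the
   brk-style grower cannot provide them.  */
void
select_grower (bool want_tags, bool brk_is_untagged)
{
  if (want_tags && brk_is_untagged)
    grow_hook = failing_grow;
}

/* True if the allocator has to fall back to mmap for new memory.  */
bool
must_use_mmap (ptrdiff_t increment)
{
  assert (grow_hook != NULL);
  return grow_hook (increment) == GROW_FAILURE;
}
```

Because the choice is made by swapping a pointer once at start-up, the
growth path itself needs no extra run-time test for tagging.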

malloc: allow memory tagging to be controlled from a tunable
---
 malloc/arena.c              |  42 ++++++++-
 malloc/malloc.c             | 171 ++++++++++++++++++++++++++++--------
 malloc/malloc.h             |   7 ++
 sysdeps/generic/libc-mtag.h |  52 +++++++++++
 4 files changed, 233 insertions(+), 39 deletions(-)
 create mode 100644 sysdeps/generic/libc-mtag.h



diff --git a/malloc/arena.c b/malloc/arena.c
index cecdb7f4c4..e306028c9a 100644
--- a/malloc/arena.c
+++ b/malloc/arena.c
@@ -274,17 +274,36 @@ next_env_entry (char ***position)
 #endif
 
 
-#ifdef SHARED
+#if defined(SHARED) || defined(_LIBC_MTAG)
 static void *
 __failing_morecore (ptrdiff_t d)
 {
   return (void *) MORECORE_FAILURE;
 }
+#endif
 
+#ifdef SHARED
 extern struct dl_open_hook *_dl_open_hook;
 libc_hidden_proto (_dl_open_hook);
 #endif
 
+#ifdef _LIBC_MTAG
+static void *
+__mtag_tag_new_usable (void *ptr)
+{
+  if (ptr)
+    ptr = __libc_mtag_tag_region (__libc_mtag_new_tag (ptr),
+				  __malloc_usable_size (ptr));
+  return ptr;
+}
+
+static void *
+__mtag_tag_new_memset (void *ptr, int val, size_t size)
+{
+  return __libc_mtag_memset_with_tag (__libc_mtag_new_tag (ptr), val, size);
+}
+#endif
+
 static void
 ptmalloc_init (void)
 {
@@ -293,6 +312,23 @@ ptmalloc_init (void)
 
   __malloc_initialized = 0;
 
+#ifdef _LIBC_MTAG
+  if ((TUNABLE_GET_FULL (glibc, memtag, enable, int32_t, NULL) & 1) != 0)
+    {
+      /* If the environment says that we should be using tagged memory
+	 and that morecore does not support tagged regions, then
+	 disable it.  */
+      if (__MTAG_SBRK_UNTAGGED)
+	__morecore = __failing_morecore;
+
+      __mtag_mmap_flags = __MTAG_MMAP_FLAGS;
+      __tag_new_memset = __mtag_tag_new_memset;
+      __tag_region = __libc_mtag_tag_region;
+      __tag_new_usable = __mtag_tag_new_usable;
+      __tag_at = __libc_mtag_address_get_tag;
+    }
+#endif
+
 #ifdef SHARED
   /* In case this libc copy is in a non-default namespace, never use brk.
      Likewise if dlopened from statically linked program.  */
@@ -512,7 +548,7 @@ new_heap (size_t size, size_t top_pad)
             }
         }
     }
-  if (__mprotect (p2, size, PROT_READ | PROT_WRITE) != 0)
+  if (__mprotect (p2, size, MTAG_MMAP_FLAGS | PROT_READ | PROT_WRITE) != 0)
     {
       __munmap (p2, HEAP_MAX_SIZE);
       return 0;
@@ -542,7 +578,7 @@ grow_heap (heap_info *h, long diff)
     {
       if (__mprotect ((char *) h + h->mprotect_size,
                       (unsigned long) new_size - h->mprotect_size,
-                      PROT_READ | PROT_WRITE) != 0)
+                      MTAG_MMAP_FLAGS | PROT_READ | PROT_WRITE) != 0)
         return -2;
 
       h->mprotect_size = new_size;
diff --git a/malloc/malloc.c b/malloc/malloc.c
index 1282863681..2e6668f9bd 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -242,6 +242,9 @@
 /* For DIAG_PUSH/POP_NEEDS_COMMENT et al.  */
 #include <libc-diag.h>
 
+/* For memory tagging.  */
+#include <libc-mtag.h>
+
 #include <malloc/malloc-internal.h>
 
 /* For SINGLE_THREAD_P.  */
@@ -279,6 +282,18 @@
 #define MALLOC_DEBUG 0
 #endif
 
+/* When using tagged memory, we cannot share the end of the user block
+   with the header for the next chunk, so ensure that we allocate
+   blocks that are rounded up to the granule size.  Take care not to
+   overflow from close to MAX_SIZE_T to a small number.  */
+static inline size_t
+ROUND_UP_ALLOCATION_SIZE(size_t bytes)
+{
+  if ((bytes & (__MTAG_GRANULE_SIZE - 1)) != 0)
+    return bytes | (__MTAG_GRANULE_SIZE - 1);
+  return bytes;
+}
+
 #ifndef NDEBUG
 # define __assert_fail(assertion, file, line, function)			\
 	 __malloc_assert(assertion, file, line, function)
@@ -378,6 +393,36 @@ __malloc_assert (const char *assertion, const char *file, unsigned int line,
 void * __default_morecore (ptrdiff_t);
 void *(*__morecore)(ptrdiff_t) = __default_morecore;
 
+#ifdef _LIBC_MTAG
+static void *
+__default_tag_region (void *ptr, size_t size)
+{
+  return ptr;
+}
+
+static void *
+__default_tag_nop (void *ptr)
+{
+  return ptr;
+}
+
+int __mtag_mmap_flags = 0;
+
+void *(*__tag_new_memset)(void *, int, size_t) = memset;
+void *(*__tag_region)(void *, size_t) = __default_tag_region;
+void *(*__tag_new_usable)(void *) = __default_tag_nop;
+void *(*__tag_at)(void *) = __default_tag_nop;
+
+# define TAG_NEW_MEMSET(ptr, val, size) __tag_new_memset (ptr, val, size)
+# define TAG_REGION(ptr, size) __tag_region (ptr, size)
+# define TAG_NEW_USABLE(ptr) __tag_new_usable (ptr)
+# define TAG_AT(ptr) __tag_at (ptr)
+#else
+# define TAG_NEW_MEMSET(ptr, val, size) memset (ptr, val, size)
+# define TAG_REGION(ptr, size) (ptr)
+# define TAG_NEW_USABLE(ptr) (ptr)
+# define TAG_AT(ptr) (ptr)
+#endif
 
 #include <string.h>
 
@@ -1184,8 +1229,9 @@ nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 
 /* conversion from malloc headers to user pointers, and back */
 
-#define chunk2mem(p)   ((void*)((char*)(p) + 2*SIZE_SZ))
-#define mem2chunk(mem) ((mchunkptr)((char*)(mem) - 2*SIZE_SZ))
+#define chunk2mem(p) ((void*)TAG_AT (((char*)(p) + 2*SIZE_SZ)))
+#define chunk2rawmem(p) ((void*)((char*)(p) + 2*SIZE_SZ))
+#define mem2chunk(mem) ((mchunkptr)TAG_AT (((char*)(mem) - 2*SIZE_SZ)))
 
 /* The smallest possible chunk */
 #define MIN_CHUNK_SIZE        (offsetof(struct malloc_chunk, fd_nextsize))
@@ -1964,7 +2010,7 @@ do_check_chunk (mstate av, mchunkptr p)
       /* chunk is page-aligned */
       assert (((prev_size (p) + sz) & (GLRO (dl_pagesize) - 1)) == 0);
       /* mem is aligned */
-      assert (aligned_OK (chunk2mem (p)));
+      assert (aligned_OK (chunk2rawmem (p)));
     }
 }
 
@@ -1988,7 +2034,7 @@ do_check_free_chunk (mstate av, mchunkptr p)
   if ((unsigned long) (sz) >= MINSIZE)
     {
       assert ((sz & MALLOC_ALIGN_MASK) == 0);
-      assert (aligned_OK (chunk2mem (p)));
+      assert (aligned_OK (chunk2rawmem (p)));
       /* ... matching footer field */
       assert (prev_size (next_chunk (p)) == sz);
       /* ... and is fully consolidated */
@@ -2067,7 +2113,7 @@ do_check_remalloced_chunk (mstate av, mchunkptr p, INTERNAL_SIZE_T s)
   assert ((sz & MALLOC_ALIGN_MASK) == 0);
   assert ((unsigned long) (sz) >= MINSIZE);
   /* ... and alignment */
-  assert (aligned_OK (chunk2mem (p)));
+  assert (aligned_OK (chunk2rawmem (p)));
   /* chunk is less than MINSIZE more than request */
   assert ((long) (sz) - (long) (s) >= 0);
   assert ((long) (sz) - (long) (s + MINSIZE) < 0);
@@ -2322,7 +2368,8 @@ sysmalloc (INTERNAL_SIZE_T nb, mstate av)
       /* Don't try if size wraps around 0 */
       if ((unsigned long) (size) > (unsigned long) (nb))
         {
-          mm = (char *) (MMAP (0, size, PROT_READ | PROT_WRITE, 0));
+          mm = (char *) (MMAP (0, size,
+			       MTAG_MMAP_FLAGS | PROT_READ | PROT_WRITE, 0));
 
           if (mm != MAP_FAILED)
             {
@@ -2336,14 +2383,14 @@ sysmalloc (INTERNAL_SIZE_T nb, mstate av)
 
               if (MALLOC_ALIGNMENT == 2 * SIZE_SZ)
                 {
-                  /* For glibc, chunk2mem increases the address by 2*SIZE_SZ and
+                  /* For glibc, chunk2rawmem increases the address by 2*SIZE_SZ and
                      MALLOC_ALIGN_MASK is 2*SIZE_SZ-1.  Each mmap'ed area is page
                      aligned and therefore definitely MALLOC_ALIGN_MASK-aligned.  */
-                  assert (((INTERNAL_SIZE_T) chunk2mem (mm) & MALLOC_ALIGN_MASK) == 0);
+                  assert (((INTERNAL_SIZE_T) chunk2rawmem (mm) & MALLOC_ALIGN_MASK) == 0);
                   front_misalign = 0;
                 }
               else
-                front_misalign = (INTERNAL_SIZE_T) chunk2mem (mm) & MALLOC_ALIGN_MASK;
+                front_misalign = (INTERNAL_SIZE_T) chunk2rawmem (mm) & MALLOC_ALIGN_MASK;
               if (front_misalign > 0)
                 {
                   correction = MALLOC_ALIGNMENT - front_misalign;
@@ -2515,7 +2562,9 @@ sysmalloc (INTERNAL_SIZE_T nb, mstate av)
           /* Don't try if size wraps around 0 */
           if ((unsigned long) (size) > (unsigned long) (nb))
             {
-              char *mbrk = (char *) (MMAP (0, size, PROT_READ | PROT_WRITE, 0));
+              char *mbrk = (char *) (MMAP (0, size,
+					   MTAG_MMAP_FLAGS | PROT_READ | PROT_WRITE,
+					   0));
 
               if (mbrk != MAP_FAILED)
                 {
@@ -2586,7 +2635,7 @@ sysmalloc (INTERNAL_SIZE_T nb, mstate av)
 
                   /* Guarantee alignment of first new chunk made from this space */
 
-                  front_misalign = (INTERNAL_SIZE_T) chunk2mem (brk) & MALLOC_ALIGN_MASK;
+                  front_misalign = (INTERNAL_SIZE_T) chunk2rawmem (brk) & MALLOC_ALIGN_MASK;
                   if (front_misalign > 0)
                     {
                       /*
@@ -2644,10 +2693,10 @@ sysmalloc (INTERNAL_SIZE_T nb, mstate av)
                 {
                   if (MALLOC_ALIGNMENT == 2 * SIZE_SZ)
                     /* MORECORE/mmap must correctly align */
-                    assert (((unsigned long) chunk2mem (brk) & MALLOC_ALIGN_MASK) == 0);
+                    assert (((unsigned long) chunk2rawmem (brk) & MALLOC_ALIGN_MASK) == 0);
                   else
                     {
-                      front_misalign = (INTERNAL_SIZE_T) chunk2mem (brk) & MALLOC_ALIGN_MASK;
+                      front_misalign = (INTERNAL_SIZE_T) chunk2rawmem (brk) & MALLOC_ALIGN_MASK;
                       if (front_misalign > 0)
                         {
                           /*
@@ -2832,7 +2881,7 @@ munmap_chunk (mchunkptr p)
   if (DUMPED_MAIN_ARENA_CHUNK (p))
     return;
 
-  uintptr_t mem = (uintptr_t) chunk2mem (p);
+  uintptr_t mem = (uintptr_t) chunk2rawmem (p);
   uintptr_t block = (uintptr_t) p - prev_size (p);
   size_t total_size = prev_size (p) + size;
   /* Unfortunately we have to do the compilers job by hand here.  Normally
@@ -2887,7 +2936,7 @@ mremap_chunk (mchunkptr p, size_t new_size)
 
   p = (mchunkptr) (cp + offset);
 
-  assert (aligned_OK (chunk2mem (p)));
+  assert (aligned_OK (chunk2rawmem (p)));
 
   assert (prev_size (p) == offset);
   set_head (p, (new_size - offset) | IS_MMAPPED);
@@ -3051,6 +3100,7 @@ __libc_malloc (size_t bytes)
     = atomic_forced_read (__malloc_hook);
   if (__builtin_expect (hook != NULL, 0))
     return (*hook)(bytes, RETURN_ADDRESS (0));
+  bytes = ROUND_UP_ALLOCATION_SIZE (bytes);
 #if USE_TCACHE
   /* int_free also calls request2size, be careful to not pad twice.  */
   size_t tbytes;
@@ -3068,14 +3118,15 @@ __libc_malloc (size_t bytes)
       && tcache
       && tcache->counts[tc_idx] > 0)
     {
-      return tcache_get (tc_idx);
+      victim = tcache_get (tc_idx);
+      return TAG_NEW_USABLE (victim);
     }
   DIAG_POP_NEEDS_COMMENT;
 #endif
 
   if (SINGLE_THREAD_P)
     {
-      victim = _int_malloc (&main_arena, bytes);
+      victim = TAG_NEW_USABLE (_int_malloc (&main_arena, bytes));
       assert (!victim || chunk_is_mmapped (mem2chunk (victim)) ||
 	      &main_arena == arena_for_chunk (mem2chunk (victim)));
       return victim;
@@ -3096,6 +3147,8 @@ __libc_malloc (size_t bytes)
   if (ar_ptr != NULL)
     __libc_lock_unlock (ar_ptr->mutex);
 
+  victim = TAG_NEW_USABLE (victim);
+
   assert (!victim || chunk_is_mmapped (mem2chunk (victim)) ||
           ar_ptr == arena_for_chunk (mem2chunk (victim)));
   return victim;
@@ -3119,8 +3172,17 @@ __libc_free (void *mem)
   if (mem == 0)                              /* free(0) has no effect */
     return;
 
+#ifdef _LIBC_MTAG
+  /* Quickly check that the freed pointer matches the tag for the memory.
+     This gives a useful double-free detection.  */
+  *(volatile char *)mem;
+#endif
+
   p = mem2chunk (mem);
 
+  /* Mark the chunk as belonging to the library again.  */
+  (void)TAG_REGION (chunk2rawmem (p), __malloc_usable_size (mem));
+
   if (chunk_is_mmapped (p))                       /* release mmapped memory. */
     {
       /* See if the dynamic brk/mmap threshold needs adjusting.
@@ -3170,6 +3232,13 @@ __libc_realloc (void *oldmem, size_t bytes)
   if (oldmem == 0)
     return __libc_malloc (bytes);
 
+  bytes = ROUND_UP_ALLOCATION_SIZE (bytes);
+#ifdef _LIBC_MTAG
+  /* Perform a quick check to ensure that the pointer's tag matches the
+     memory's tag.  */
+  *(volatile char*) oldmem;
+#endif
+
   /* chunk corresponding to oldmem */
   const mchunkptr oldp = mem2chunk (oldmem);
   /* its size */
@@ -3225,7 +3294,15 @@ __libc_realloc (void *oldmem, size_t bytes)
 #if HAVE_MREMAP
       newp = mremap_chunk (oldp, nb);
       if (newp)
-        return chunk2mem (newp);
+	{
+	  void *newmem = chunk2rawmem (newp);
+	  /* Give the new block a different tag.  This helps to ensure
+	     that stale handles to the previous mapping are not
+	     reused.  There's a performance hit for both us and the
+	     caller for doing this, so we might want to
+	     reconsider.  */
+	  return TAG_NEW_USABLE (newmem);
+	}
 #endif
       /* Note the extra SIZE_SZ overhead. */
       if (oldsize - SIZE_SZ >= nb)
@@ -3308,7 +3385,7 @@ _mid_memalign (size_t alignment, size_t bytes, void *address)
       return 0;
     }
 
-
+  bytes = ROUND_UP_ALLOCATION_SIZE (bytes);
   /* Make sure alignment is power of 2.  */
   if (!powerof2 (alignment))
     {
@@ -3323,8 +3400,7 @@ _mid_memalign (size_t alignment, size_t bytes, void *address)
       p = _int_memalign (&main_arena, alignment, bytes);
       assert (!p || chunk_is_mmapped (mem2chunk (p)) ||
 	      &main_arena == arena_for_chunk (mem2chunk (p)));
-
-      return p;
+      return TAG_NEW_USABLE (p);
     }
 
   arena_get (ar_ptr, bytes + alignment + MINSIZE);
@@ -3342,7 +3418,7 @@ _mid_memalign (size_t alignment, size_t bytes, void *address)
 
   assert (!p || chunk_is_mmapped (mem2chunk (p)) ||
           ar_ptr == arena_for_chunk (mem2chunk (p)));
-  return p;
+  return TAG_NEW_USABLE (p);
 }
 /* For ISO C11.  */
 weak_alias (__libc_memalign, aligned_alloc)
@@ -3351,20 +3427,27 @@ libc_hidden_def (__libc_memalign)
 void *
 __libc_valloc (size_t bytes)
 {
+  void *p;
+
   if (__malloc_initialized < 0)
     ptmalloc_init ();
 
+  bytes = ROUND_UP_ALLOCATION_SIZE (bytes);
   void *address = RETURN_ADDRESS (0);
   size_t pagesize = GLRO (dl_pagesize);
-  return _mid_memalign (pagesize, bytes, address);
+  p = _mid_memalign (pagesize, bytes, address);
+  return TAG_NEW_USABLE (p);
 }
 
 void *
 __libc_pvalloc (size_t bytes)
 {
+  void *p;
+
   if (__malloc_initialized < 0)
     ptmalloc_init ();
 
+  bytes = ROUND_UP_ALLOCATION_SIZE (bytes);
   void *address = RETURN_ADDRESS (0);
   size_t pagesize = GLRO (dl_pagesize);
   size_t rounded_bytes;
@@ -3378,19 +3461,22 @@ __libc_pvalloc (size_t bytes)
     }
   rounded_bytes = rounded_bytes & -(pagesize - 1);
 
-  return _mid_memalign (pagesize, rounded_bytes, address);
+  p = _mid_memalign (pagesize, rounded_bytes, address);
+  return TAG_NEW_USABLE (p);
 }
 
 void *
 __libc_calloc (size_t n, size_t elem_size)
 {
   mstate av;
-  mchunkptr oldtop, p;
-  INTERNAL_SIZE_T sz, csz, oldtopsize;
+  mchunkptr oldtop;
+  INTERNAL_SIZE_T sz, oldtopsize;
   void *mem;
+#ifndef _LIBC_MTAG
   unsigned long clearsize;
   unsigned long nclears;
   INTERNAL_SIZE_T *d;
+#endif
   ptrdiff_t bytes;
 
   if (__glibc_unlikely (__builtin_mul_overflow (n, elem_size, &bytes)))
@@ -3398,6 +3484,7 @@ __libc_calloc (size_t n, size_t elem_size)
        __set_errno (ENOMEM);
        return NULL;
     }
+
   sz = bytes;
 
   void *(*hook) (size_t, const void *) =
@@ -3413,6 +3500,7 @@ __libc_calloc (size_t n, size_t elem_size)
 
   MAYBE_INIT_TCACHE ();
 
+  sz = ROUND_UP_ALLOCATION_SIZE (sz);
   if (SINGLE_THREAD_P)
     av = &main_arena;
   else
@@ -3467,7 +3555,14 @@ __libc_calloc (size_t n, size_t elem_size)
   if (mem == 0)
     return 0;
 
-  p = mem2chunk (mem);
+  /* If we are using memory tagging, then we need to set the tags
+     regardless of MORECORE_CLEARS, so we zero the whole block while
+     doing so.  */
+#ifdef _LIBC_MTAG
+  return TAG_NEW_MEMSET (mem, 0, __malloc_usable_size (mem));
+#else
+  mchunkptr p = mem2chunk (mem);
+  INTERNAL_SIZE_T csz = chunksize (p);
 
   /* Two optional cases in which clearing not necessary */
   if (chunk_is_mmapped (p))
@@ -3478,8 +3573,6 @@ __libc_calloc (size_t n, size_t elem_size)
       return mem;
     }
 
-  csz = chunksize (p);
-
 #if MORECORE_CLEARS
   if (perturb_byte == 0 && (p == oldtop && csz > oldtopsize))
     {
@@ -3522,6 +3615,7 @@ __libc_calloc (size_t n, size_t elem_size)
     }
 
   return mem;
+#endif
 }
 
 /*
@@ -4618,7 +4712,7 @@ _int_realloc(mstate av, mchunkptr oldp, INTERNAL_SIZE_T oldsize,
           av->top = chunk_at_offset (oldp, nb);
           set_head (av->top, (newsize - nb) | PREV_INUSE);
           check_inuse_chunk (av, oldp);
-          return chunk2mem (oldp);
+          return TAG_NEW_USABLE (chunk2rawmem (oldp));
         }
 
       /* Try to expand forward into next chunk;  split off remainder below */
@@ -4651,7 +4745,10 @@ _int_realloc(mstate av, mchunkptr oldp, INTERNAL_SIZE_T oldsize,
             }
           else
             {
-	      memcpy (newmem, chunk2mem (oldp), oldsize - SIZE_SZ);
+	      void *oldmem = chunk2mem (oldp);
+	      newmem = TAG_NEW_USABLE (newmem);
+	      memcpy (newmem, oldmem, __malloc_usable_size (oldmem));
+	      (void) TAG_REGION (chunk2rawmem (oldp), oldsize);
               _int_free (av, oldp, 1);
               check_inuse_chunk (av, newp);
               return chunk2mem (newp);
@@ -4673,6 +4770,8 @@ _int_realloc(mstate av, mchunkptr oldp, INTERNAL_SIZE_T oldsize,
   else   /* split remainder */
     {
       remainder = chunk_at_offset (newp, nb);
+      /* Clear any user-space tags before writing the header.  */
+      remainder = TAG_REGION (remainder, remainder_size);
       set_head_size (newp, nb | (av != &main_arena ? NON_MAIN_ARENA : 0));
       set_head (remainder, remainder_size | PREV_INUSE |
                 (av != &main_arena ? NON_MAIN_ARENA : 0));
@@ -4682,8 +4781,8 @@ _int_realloc(mstate av, mchunkptr oldp, INTERNAL_SIZE_T oldsize,
     }
 
   check_inuse_chunk (av, newp);
-  return chunk2mem (newp);
-}
+  return TAG_NEW_USABLE (chunk2rawmem (newp));
+    }
 
 /*
    ------------------------------ memalign ------------------------------
@@ -4760,7 +4859,7 @@ _int_memalign (mstate av, size_t alignment, size_t bytes)
       p = newp;
 
       assert (newsize >= nb &&
-              (((unsigned long) (chunk2mem (p))) % alignment) == 0);
+              (((unsigned long) (chunk2rawmem (p))) % alignment) == 0);
     }
 
   /* Also give back spare room at the end */
@@ -4814,7 +4913,7 @@ mtrim (mstate av, size_t pad)
                                                 + sizeof (struct malloc_chunk)
                                                 + psm1) & ~psm1);
 
-                assert ((char *) chunk2mem (p) + 4 * SIZE_SZ <= paligned_mem);
+                assert ((char *) chunk2rawmem (p) + 4 * SIZE_SZ <= paligned_mem);
                 assert ((char *) p + size > paligned_mem);
 
                 /* This is the size we could potentially free.  */
@@ -4902,7 +5001,7 @@ __malloc_usable_size (void *m)
   size_t result;
 
   result = musable (m);
-  return result;
+  return (size_t) (((INTERNAL_SIZE_T)result) & ~(__MTAG_GRANULE_SIZE - 1));
 }
 
 /*
diff --git a/malloc/malloc.h b/malloc/malloc.h
index a6903fdd54..d012da9a9f 100644
--- a/malloc/malloc.h
+++ b/malloc/malloc.h
@@ -77,6 +77,13 @@ extern void *pvalloc (size_t __size) __THROW __attribute_malloc__ __wur;
    contiguous pieces of memory.  */
 extern void *(*__morecore) (ptrdiff_t __size);
 
+#ifdef _LIBC_MTAG
+extern int __mtag_mmap_flags;
+#define MTAG_MMAP_FLAGS __mtag_mmap_flags
+#else
+#define MTAG_MMAP_FLAGS 0
+#endif
+
 /* Default value of `__morecore'.  */
 extern void *__default_morecore (ptrdiff_t __size)
 __THROW __attribute_malloc__;
diff --git a/sysdeps/generic/libc-mtag.h b/sysdeps/generic/libc-mtag.h
new file mode 100644
index 0000000000..3e9885451c
--- /dev/null
+++ b/sysdeps/generic/libc-mtag.h
@@ -0,0 +1,52 @@
+/* libc-internal interface for tagged (colored) memory support.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _GENERIC_LIBC_MTAG_H
+#define _GENERIC_LIBC_MTAG_H 1
+
+/* Generic bindings for systems that do not support memory tagging.  */
+
+/* Used to ensure additional alignment when objects need to have distinct
+   tags.  */
+#define __MTAG_GRANULE_SIZE 1
+
+/* Non-zero if memory obtained via morecore (sbrk) is not tagged.  */
+#define __MTAG_SBRK_UNTAGGED 0
+
+/* Extra flags to pass to mmap() to request a tagged region of memory.  */
+#define __MTAG_MMAP_FLAGS 0
+
+/* Set the tags for a region of memory, which must have size and alignment
+   that are multiples of __MTAG_GRANULE_SIZE.  Size cannot be zero.
+   void *__libc_mtag_tag_region (const void *, size_t)  */
+#define __libc_mtag_tag_region(p, s) (p)
+
+/* Optimized equivalent to __libc_mtag_tag_region followed by memset.  */
+#define __libc_mtag_memset_with_tag memset
+
+/* Convert address P to a pointer that is tagged correctly for that
+   location.
+   void *__libc_mtag_address_get_tag (void*)  */
+#define __libc_mtag_address_get_tag(p) (p)
+
+/* Assign a new (random) tag to a pointer P (does not adjust the tag on
+   the memory addressed).
+   void *__libc_mtag_new_tag (void*)  */
+#define __libc_mtag_new_tag(p) (p)
+
+#endif /* _GENERIC_LIBC_MTAG_H */
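
The effect of __MTAG_GRANULE_SIZE on sizes can be sketched in a few lines of C.  This is only an illustration: the helper names below are hypothetical, and the round-up helper is an assumption about what an allocation-size round-up looks like (the definition of ROUND_UP_ALLOCATION_SIZE is not shown in this part of the patch).  The round-down helper mirrors the mask applied in __malloc_usable_size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round N down to a multiple of GRANULE, which
   must be a power of two.  Same mask as in __malloc_usable_size.  */
size_t
round_down_to_granule (size_t n, size_t granule)
{
  return n & ~(granule - 1);
}

/* Hypothetical helper: round N up to a multiple of GRANULE, which
   must be a power of two.  Overflow is not handled in this sketch.  */
size_t
round_up_to_granule (size_t n, size_t granule)
{
  return (n + granule - 1) & ~(granule - 1);
}
```

With the generic granule size of 1 both helpers return their argument unchanged, which is why the generic bindings leave behaviour as it was; with a 16-byte granule a 37-byte request rounds up to 48 and a 37-byte usable size rounds down to 32.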


* [PATCH 4/7] linux: Add compatibility definitions to sys/prctl.h for MTE
  2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
                   ` (2 preceding siblings ...)
  2020-06-15 14:40 ` [PATCH 3/7] malloc: Basic support for memory tagging in the malloc() family Richard Earnshaw
@ 2020-06-15 14:40 ` Richard Earnshaw
  2020-06-15 14:40 ` [PATCH 5/7] aarch64: Mitigations for string functions when MTE is enabled Richard Earnshaw
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-15 14:40 UTC (permalink / raw)
  To: libc-alpha; +Cc: Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 432 bytes --]


Older versions of the Linux kernel headers lack the definitions for
memory tagging, but we still want to be able to build in support when
using those headers (it still cannot be enabled when running on a
kernel that lacks the feature).

The Linux kernel adds these extensions to the platform-independent
header (linux/prctl.h), so this patch takes a similar approach.
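
As a rough illustration of the fallback pattern, the sketch below guards the PR_MTE_* values from this patch with #ifndef and shows how the two fields pack into one control word.  The helper names mte_tcf_mode and mte_tag_mask are hypothetical; they are not part of glibc or of the kernel headers.

```c
#include <assert.h>

/* Fallback definitions, used only when the headers already included
   did not provide them.  The values are those given in this patch.  */
#ifndef PR_MTE_TCF_SHIFT
# define PR_MTE_TCF_SHIFT	1
# define PR_MTE_TCF_NONE	(0UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TCF_SYNC	(1UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TCF_ASYNC	(2UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TCF_MASK	(3UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TAG_SHIFT	3
# define PR_MTE_TAG_MASK	(0xffffUL << PR_MTE_TAG_SHIFT)
#endif

/* Hypothetical helper: extract the tag-check-fault mode field.  */
unsigned long
mte_tcf_mode (unsigned long ctrl)
{
  return (ctrl & PR_MTE_TCF_MASK) >> PR_MTE_TCF_SHIFT;
}

/* Hypothetical helper: extract the 16-bit tag mask field.  */
unsigned long
mte_tag_mask (unsigned long ctrl)
{
  return (ctrl & PR_MTE_TAG_MASK) >> PR_MTE_TAG_SHIFT;
}
```

Because each group of definitions is guarded, newer kernel headers that already define them take precedence and the fallback values are never seen.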
---
 sysdeps/unix/sysv/linux/sys/prctl.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)


[-- Attachment #2: 0004-linux-Add-compatibility-definitions-to-sys-prctl.h-f.patch --]
[-- Type: text/x-patch; name="0004-linux-Add-compatibility-definitions-to-sys-prctl.h-f.patch", Size: 971 bytes --]

diff --git a/sysdeps/unix/sysv/linux/sys/prctl.h b/sysdeps/unix/sysv/linux/sys/prctl.h
index 7f748ebeeb..4d01379c23 100644
--- a/sysdeps/unix/sysv/linux/sys/prctl.h
+++ b/sysdeps/unix/sysv/linux/sys/prctl.h
@@ -21,6 +21,24 @@
 #include <features.h>
 #include <linux/prctl.h>  /*  The magic values come from here  */
 
+/* Recent extensions to linux which may post-date the kernel headers
+   we're picking up...  */
+
+/* Memory tagging control operations (for AArch64).  */
+#ifndef PR_TAGGED_ADDR_ENABLE
+# define PR_TAGGED_ADDR_ENABLE	(1UL << 8)
+#endif
+
+#ifndef PR_MTE_TCF_SHIFT
+# define PR_MTE_TCF_SHIFT	1
+# define PR_MTE_TCF_NONE	(0UL << PR_MTE_TCF_SHIFT)
+# define PR_MTE_TCF_SYNC	(1UL << PR_MTE_TCF_SHIFT)
+# define PR_MTE_TCF_ASYNC	(2UL << PR_MTE_TCF_SHIFT)
+# define PR_MTE_TCF_MASK	(3UL << PR_MTE_TCF_SHIFT)
+# define PR_MTE_TAG_SHIFT	3
+# define PR_MTE_TAG_MASK	(0xffffUL << PR_MTE_TAG_SHIFT)
+#endif
+
 __BEGIN_DECLS
 
 /* Control process execution.  */


* [PATCH 5/7] aarch64: Mitigations for string functions when MTE is enabled.
  2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
                   ` (3 preceding siblings ...)
  2020-06-15 14:40 ` [PATCH 4/7] linux: Add compatibility definitions to sys/prctl.h for MTE Richard Earnshaw
@ 2020-06-15 14:40 ` Richard Earnshaw
  2020-06-15 14:40 ` [PATCH 6/7] aarch64: Add sysv specific enabling code for memory tagging Richard Earnshaw
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-15 14:40 UTC (permalink / raw)
  To: libc-alpha; +Cc: Richard Earnshaw

[-- Attachment #2: Type: text/plain, Size: 1260 bytes --]


This is a place-holder patch for the changes needed to the string
functions to make them safe when using memory tagging.  It is expected
that this patch will be replaced before the final series is committed.

When memory tagging is enabled, functions must not fetch data beyond a
granule boundary.  Unfortunately, this affects a number of the
optimized string operations for aarch64, which assume that, provided a
page boundary is not crossed, any amount of data within the page
may be accessed.  This patch replaces the existing string functions
with variants that respect the granule-size limit.
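
To make the granule constraint concrete, here is a minimal C sketch, assuming a 16-byte granule as stated in the patch.  The function names are hypothetical.  The second helper is a C rendering of the and/eor/cbz test used in the misaligned loops of strcmp.S and strncmp.S, and the third is a C equivalent of the byte-at-a-time approach taken by the quick-and-dirty memchr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed MTE granule size on AArch64.  */
#define GRANULE_SIZE 16u

/* Number of bytes from ADDR up to the end of the granule containing
   it.  A wider load than this would touch the next granule.  */
size_t
bytes_to_granule_end (uintptr_t addr)
{
  return GRANULE_SIZE - (addr & (GRANULE_SIZE - 1));
}

/* Non-zero when ADDR lies in the last 8 bytes of a block of
   PAGE_SIZE bytes (PAGE_SIZE a power of two, at least 16).  */
int
in_last_dword (uintptr_t addr, uintptr_t page_size)
{
  uintptr_t mask = page_size - 8;
  return ((addr & mask) ^ mask) == 0;
}

/* Byte-at-a-time search: never reads beyond the first match or
   beyond N bytes, so it cannot stray into another granule.  */
void *
bytewise_memchr (const void *s, int c, size_t n)
{
  const unsigned char *p = s;
  unsigned char ch = (unsigned char) c;

  for (; n != 0; p++, n--)
    if (*p == ch)
      return (void *) p;
  return NULL;
}
```

With a page size of 4096 the test fires for offsets 0xff8 to 0xfff; with MIN_PAGE_SIZE set to 16 it fires for offsets 8 to 15 of every granule, which is what sends the misaligned loops back to their byte-wise path far more often.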

This patch has not been tuned for performance.
---
 sysdeps/aarch64/memchr.S                 | 21 ++++++++++++++++++++-
 sysdeps/aarch64/multiarch/strlen_asimd.S |  2 +-
 sysdeps/aarch64/strchr.S                 | 15 +++++++++++++++
 sysdeps/aarch64/strchrnul.S              | 14 +++++++++++++-
 sysdeps/aarch64/strcmp.S                 | 12 +++++++++---
 sysdeps/aarch64/strcpy.S                 |  2 +-
 sysdeps/aarch64/strlen.S                 |  2 +-
 sysdeps/aarch64/strncmp.S                | 10 ++++++++--
 sysdeps/aarch64/strrchr.S                | 15 ++++++++++++++-
 9 files changed, 82 insertions(+), 11 deletions(-)


[-- Attachment #3: 0005-aarch64-Mitigations-for-string-functions-when-MTE-is.patch --]
[-- Type: text/x-patch; name="0005-aarch64-Mitigations-for-string-functions-when-MTE-is.patch", Size: 6450 bytes --]

diff --git a/sysdeps/aarch64/memchr.S b/sysdeps/aarch64/memchr.S
index 85c65cbfca..6e01a0a0a9 100644
--- a/sysdeps/aarch64/memchr.S
+++ b/sysdeps/aarch64/memchr.S
@@ -64,6 +64,25 @@
  */
 
 ENTRY (MEMCHR)
+#ifdef _LIBC_MTAG
+	/* Quick-and-dirty implementation for MTE.  Needs a rewrite as
+	   granules are only 16 bytes in size.  */
+	/* Do not dereference srcin if no bytes to compare.  */
+	cbz	cntin, L(zero_length)
+	and	chrin, chrin, #255
+L(next_byte):	
+	ldrb	wtmp2, [srcin], #1
+	cmp	wtmp2, chrin
+	b.eq	L(found)
+	subs	cntin, cntin, #1
+	b.ne	L(next_byte)
+L(zero_length):
+	mov	result, #0
+	ret
+L(found):
+	sub	result, srcin, #1
+	ret
+#else	
 	/* Do not dereference srcin if no bytes to compare.  */
 	cbz	cntin, L(zero_length)
 	/*
@@ -152,10 +171,10 @@ L(tail):
 	/* Select result or NULL */
 	csel	result, xzr, result, eq
 	ret
-
 L(zero_length):
 	mov	result, #0
 	ret
+#endif /* _LIBC_MTAG */
 END (MEMCHR)
 weak_alias (MEMCHR, memchr)
 libc_hidden_builtin_def (memchr)
diff --git a/sysdeps/aarch64/multiarch/strlen_asimd.S b/sysdeps/aarch64/multiarch/strlen_asimd.S
index 236a2c96a6..c2c718e493 100644
--- a/sysdeps/aarch64/multiarch/strlen_asimd.S
+++ b/sysdeps/aarch64/multiarch/strlen_asimd.S
@@ -51,7 +51,7 @@
 #define REP8_01 0x0101010101010101
 #define REP8_7f 0x7f7f7f7f7f7f7f7f
 
-#ifdef TEST_PAGE_CROSS
+#if defined _LIBC_MTAG || defined TEST_PAGE_CROSS
 # define MIN_PAGE_SIZE 16
 #else
 # define MIN_PAGE_SIZE 4096
diff --git a/sysdeps/aarch64/strchr.S b/sysdeps/aarch64/strchr.S
index 4a75e73945..32c500609e 100644
--- a/sysdeps/aarch64/strchr.S
+++ b/sysdeps/aarch64/strchr.S
@@ -63,6 +63,20 @@
 
 ENTRY (strchr)
 	DELOUSE (0)
+#ifdef _LIBC_MTAG
+	/* Quick and dirty implementation for MTE  */
+	and	chrin, chrin, #255
+L(next_byte):
+	ldrb	wtmp2, [srcin], #1
+	cbz	wtmp2, L(end)
+	cmp	wtmp2, chrin
+	b.ne	L(next_byte)
+	sub	result, srcin, #1
+	ret
+L(end):
+	mov	result, #0
+	ret
+#else
 	mov	wtmp2, #0x0401
 	movk	wtmp2, #0x4010, lsl #16
 	dup	vrepchr.16b, chrin
@@ -134,6 +148,7 @@ L(tail):
 	add	result, src, tmp1, lsr #1
 	csel	result, result, xzr, eq
 	ret
+#endif
 END (strchr)
 libc_hidden_builtin_def (strchr)
 weak_alias (strchr, index)
diff --git a/sysdeps/aarch64/strchrnul.S b/sysdeps/aarch64/strchrnul.S
index a65be6cba8..78a9252eb8 100644
--- a/sysdeps/aarch64/strchrnul.S
+++ b/sysdeps/aarch64/strchrnul.S
@@ -61,6 +61,18 @@
 
 ENTRY (__strchrnul)
 	DELOUSE (0)
+#ifdef _LIBC_MTAG
+	/* Quick and dirty implementation for MTE  */
+	and	chrin, chrin, #255
+L(next_byte):
+	ldrb	wtmp2, [srcin], #1
+	cmp	wtmp2, #0
+	ccmp	wtmp2, chrin, #4, ne	/* NZCV = 0x0100  */
+	b.ne	L(next_byte)
+	
+	sub	result, srcin, #1
+	ret
+#else
 	/* Magic constant 0x40100401 to allow us to identify which lane
 	   matches the termination condition.  */
 	mov	wtmp2, #0x0401
@@ -126,6 +138,6 @@ L(tail):
 	/* tmp1 is twice the offset into the fragment.  */
 	add	result, src, tmp1, lsr #1
 	ret
-
+#endif /* _LIBC_MTAG */
 END(__strchrnul)
 weak_alias (__strchrnul, strchrnul)
diff --git a/sysdeps/aarch64/strcmp.S b/sysdeps/aarch64/strcmp.S
index d044c29e9b..d01b199ab3 100644
--- a/sysdeps/aarch64/strcmp.S
+++ b/sysdeps/aarch64/strcmp.S
@@ -46,6 +46,12 @@
 #define zeroones	x10
 #define pos		x11
 
+#if defined _LIBC_MTAG || defined TEST_PAGE_CROSS
+# define MIN_PAGE_SIZE 16
+#else
+# define MIN_PAGE_SIZE 4096
+#endif
+
 	/* Start of performance-critical section  -- one 64B cache line.  */
 ENTRY_ALIGN(strcmp, 6)
 
@@ -161,10 +167,10 @@ L(do_misaligned):
 	b.ne	L(do_misaligned)
 
 L(loop_misaligned):
-	/* Test if we are within the last dword of the end of a 4K page.  If
+	/* Test if we are within the last dword of the end of a page.  If
 	   yes then jump back to the misaligned loop to copy a byte at a time.  */
-	and	tmp1, src2, #0xff8
-	eor	tmp1, tmp1, #0xff8
+	and	tmp1, src2, #(MIN_PAGE_SIZE - 8)
+	eor	tmp1, tmp1, #(MIN_PAGE_SIZE - 8)
 	cbz	tmp1, L(do_misaligned)
 	ldr	data1, [src1], #8
 	ldr	data2, [src2], #8
diff --git a/sysdeps/aarch64/strcpy.S b/sysdeps/aarch64/strcpy.S
index 548130e413..82548f3d53 100644
--- a/sysdeps/aarch64/strcpy.S
+++ b/sysdeps/aarch64/strcpy.S
@@ -87,7 +87,7 @@
 	   misaligned, crosses a page boundary - after that we move to aligned
 	   fetches for the remainder of the string.  */
 
-#ifdef STRCPY_TEST_PAGE_CROSS
+#if defined _LIBC_MTAG || defined STRCPY_TEST_PAGE_CROSS
 	/* Make everything that isn't Qword aligned look like a page cross.  */
 #define MIN_PAGE_P2 4
 #else
diff --git a/sysdeps/aarch64/strlen.S b/sysdeps/aarch64/strlen.S
index e01fab7c2a..7455a668bb 100644
--- a/sysdeps/aarch64/strlen.S
+++ b/sysdeps/aarch64/strlen.S
@@ -57,7 +57,7 @@
 #define REP8_7f 0x7f7f7f7f7f7f7f7f
 #define REP8_80 0x8080808080808080
 
-#ifdef TEST_PAGE_CROSS
+#if defined _LIBC_MTAG || defined TEST_PAGE_CROSS
 # define MIN_PAGE_SIZE 16
 #else
 # define MIN_PAGE_SIZE 4096
diff --git a/sysdeps/aarch64/strncmp.S b/sysdeps/aarch64/strncmp.S
index c5141fab8a..40c805f609 100644
--- a/sysdeps/aarch64/strncmp.S
+++ b/sysdeps/aarch64/strncmp.S
@@ -51,6 +51,12 @@
 #define endloop		x15
 #define count		mask
 
+#if defined _LIBC_MTAG || defined TEST_PAGE_CROSS
+# define MIN_PAGE_SIZE 16
+#else
+# define MIN_PAGE_SIZE 4096
+#endif
+
 ENTRY_ALIGN_AND_PAD (strncmp, 6, 7)
 	DELOUSE (0)
 	DELOUSE (1)
@@ -233,8 +239,8 @@ L(do_misaligned):
 	subs	limit_wd, limit_wd, #1
 	b.lo	L(done_loop)
 L(loop_misaligned):
-	and	tmp2, src2, #0xff8
-	eor	tmp2, tmp2, #0xff8
+	and	tmp2, src2, #(MIN_PAGE_SIZE - 8)
+	eor	tmp2, tmp2, #(MIN_PAGE_SIZE - 8)
 	cbz	tmp2, L(page_end_loop)
 
 	ldr	data1, [src1], #8
diff --git a/sysdeps/aarch64/strrchr.S b/sysdeps/aarch64/strrchr.S
index 94da08d351..ef00e969d9 100644
--- a/sysdeps/aarch64/strrchr.S
+++ b/sysdeps/aarch64/strrchr.S
@@ -70,6 +70,19 @@
 ENTRY(strrchr)
 	DELOUSE (0)
 	cbz	x1, L(null_search)
+#ifdef _LIBC_MTAG
+	/* Quick and dirty version for MTE.  */
+	and	chrin, chrin, #255
+	mov	src_match, #0
+L(next_byte):	
+	ldrb	wtmp2, [srcin]
+	cmp	wtmp2, chrin
+	csel	src_match, src_match, srcin, ne
+	add	srcin, srcin, #1
+	cbnz	wtmp2, L(next_byte)
+	mov 	result, src_match
+	ret
+#else
 	/* Magic constant 0x40100401 to allow us to identify which lane
 	   matches the requested byte.  Magic constant 0x80200802 used
 	   similarly for NUL termination.  */
@@ -158,9 +171,9 @@ L(tail):
 	csel	result, result, xzr, ne
 
 	ret
+#endif
 L(null_search):
 	b	__strchrnul
-
 END(strrchr)
 weak_alias (strrchr, rindex)
 libc_hidden_builtin_def (strrchr)


* [PATCH 6/7] aarch64: Add sysv specific enabling code for memory tagging
  2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
                   ` (4 preceding siblings ...)
  2020-06-15 14:40 ` [PATCH 5/7] aarch64: Mitigations for string functions when MTE is enabled Richard Earnshaw
@ 2020-06-15 14:40 ` Richard Earnshaw
  2020-06-15 14:40 ` [PATCH 7/7] aarch64: Add aarch64-specific files for memory tagging support Richard Earnshaw
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-15 14:40 UTC (permalink / raw)
  To: libc-alpha; +Cc: Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 929 bytes --]


Add various defines and stubs for enabling MTE on AArch64 sysv-like
systems such as Linux.  The HWCAP feature bit is copied over in the
same way as other feature bits.  Similarly we add a new wrapper header
for mman.h to define the PROT_MTE flag that can be used with mmap and
related functions.

We add a new field to struct cpu_features that can be used, for
example, to check whether or not certain ifunc'd routines should be
bound to MTE-safe versions.

Finally, if we detect that MTE should be enabled (i.e. via the glibc
tunable), we enable MTE during startup as required.
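
The decision made at startup can be summarised by the following sketch, which follows the logic of the init_cpu_features hunk in this patch but leaves out the prctl call itself.  The enum, the function name and the DEMO_ prefix are hypothetical; the HWCAP2 bit value is the one defined in this patch.

```c
#include <assert.h>

/* Value of HWCAP2_MTE as defined in this patch.  */
#define DEMO_HWCAP2_MTE (1UL << 18)

enum mte_mode { MTE_OFF, MTE_ASYNC, MTE_SYNC };

/* Hypothetical helper: pick the tag-check mode from the hwcap2 word
   and the value of the memory tagging tunable.  */
enum mte_mode
choose_mte_mode (unsigned long hwcap2, unsigned int tunable)
{
  /* The tunable is ignored unless MTE is reported in hwcap2.  */
  unsigned int state = (hwcap2 & DEMO_HWCAP2_MTE) ? tunable : 0;

  if (state & 2)
    return MTE_SYNC;	/* Bit 1 set: synchronous tag checks.  */
  if (state)
    return MTE_ASYNC;	/* Any other non-zero value: asynchronous.  */
  return MTE_OFF;
}
```

A tunable value of 1 therefore selects asynchronous checking, while 2 or 3 select synchronous checking; on hardware without MTE every value leaves tagging off.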
---
 sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h  |  2 ++
 sysdeps/unix/sysv/linux/aarch64/bits/mman.h   | 32 +++++++++++++++++++
 .../unix/sysv/linux/aarch64/cpu-features.c    | 22 +++++++++++++
 .../unix/sysv/linux/aarch64/cpu-features.h    |  1 +
 4 files changed, 57 insertions(+)
 create mode 100644 sysdeps/unix/sysv/linux/aarch64/bits/mman.h


[-- Attachment #2: 0006-aarch64-Add-sysv-specific-enabling-code-for-memory-t.patch --]
[-- Type: text/x-patch; name="0006-aarch64-Add-sysv-specific-enabling-code-for-memory-t.patch", Size: 3739 bytes --]

diff --git a/sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h b/sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h
index f52840c2c4..4092603fd7 100644
--- a/sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h
+++ b/sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h
@@ -54,3 +54,5 @@
 #define HWCAP_SB		(1 << 29)
 #define HWCAP_PACA		(1 << 30)
 #define HWCAP_PACG		(1UL << 31)
+
+#define HWCAP2_MTE		(1 << 18)
diff --git a/sysdeps/unix/sysv/linux/aarch64/bits/mman.h b/sysdeps/unix/sysv/linux/aarch64/bits/mman.h
new file mode 100644
index 0000000000..fa3f3a31f4
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/aarch64/bits/mman.h
@@ -0,0 +1,32 @@
+/* Definitions for POSIX memory map interface.  Linux/aarch64 version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_MMAN_H
+# error "Never use <bits/mman.h> directly; include <sys/mman.h> instead."
+#endif
+
+/* The following definitions basically come from the kernel headers.
+   But the kernel header is not namespace clean.  */
+
+/* Other flags.  */
+#define PROT_MTE	0x20		/* Normal Tagged mapping.  */
+
+#include <bits/mman-map-flags-generic.h>
+
+/* Include generic Linux declarations.  */
+#include <bits/mman-linux.h>
diff --git a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
index 896c588fee..a8554f3e5d 100644
--- a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
+++ b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
@@ -19,6 +19,7 @@
 #include <cpu-features.h>
 #include <sys/auxv.h>
 #include <elf/dl-hwcaps.h>
+#include <sys/prctl.h>
 
 #define DCZID_DZP_MASK (1 << 4)
 #define DCZID_BS_MASK (0xf)
@@ -83,4 +84,25 @@ init_cpu_features (struct cpu_features *cpu_features)
 
   if ((dczid & DCZID_DZP_MASK) == 0)
     cpu_features->zva_size = 4 << (dczid & DCZID_BS_MASK);
+
+  /* Setup memory tagging support if the HW and kernel support it, and if
+     the user has requested it.  */
+#if HAVE_TUNABLES
+  int mte_state = TUNABLE_GET (glibc, memtag, enable, unsigned, 0);
+  cpu_features->mte_state = (GLRO (dl_hwcap2) & HWCAP2_MTE) ? mte_state : 0;
+#else
+  cpu_features->mte_state = 0;
+#endif
+  /* For now, disallow tag 0, so that we can clearly see when tagged
+     addresses are being allocated.  */
+  if (cpu_features->mte_state & 2)
+    __prctl (PR_SET_TAGGED_ADDR_CTRL,
+	     (PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC
+	      | (0xfffe << PR_MTE_TAG_SHIFT)),
+	     0, 0, 0);
+  else if (cpu_features->mte_state)
+    __prctl (PR_SET_TAGGED_ADDR_CTRL,
+	     (PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_ASYNC
+	      | (0xfffe << PR_MTE_TAG_SHIFT)),
+	     0, 0, 0);
 }
diff --git a/sysdeps/unix/sysv/linux/aarch64/cpu-features.h b/sysdeps/unix/sysv/linux/aarch64/cpu-features.h
index 1389cea1b3..604de27c88 100644
--- a/sysdeps/unix/sysv/linux/aarch64/cpu-features.h
+++ b/sysdeps/unix/sysv/linux/aarch64/cpu-features.h
@@ -64,6 +64,7 @@ struct cpu_features
 {
   uint64_t midr_el1;
   unsigned zva_size;
+  unsigned mte_state;
 };
 
 #endif /* _CPU_FEATURES_AARCH64_H  */


* [PATCH 7/7] aarch64: Add aarch64-specific files for memory tagging support
  2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
                   ` (5 preceding siblings ...)
  2020-06-15 14:40 ` [PATCH 6/7] aarch64: Add sysv specific enabling code for memory tagging Richard Earnshaw
@ 2020-06-15 14:40 ` Richard Earnshaw
  2020-06-15 15:03 ` [PATCH 0/7] RFC Memory " H.J. Lu
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-15 14:40 UTC (permalink / raw)
  To: libc-alpha; +Cc: Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 813 bytes --]


This final patch provides the architecture-specific implementation of
the memory-tagging support hooks for aarch64.
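
For readers without MTE hardware, the intended behaviour of three of the hooks can be modelled in portable C.  This is a toy model only, not the implementation: the real routines use the irg, gmi, stg and ldg instructions, and the real new-tag hook picks a random tag, whereas the model simply increments the tag so that the result is deterministic.  All names and the 64-granule pretend address space are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE 16u
#define TAG_SHIFT 56		/* Logical tag is in address bits 59:56.  */
#define TAG_MASK ((uint64_t) 0xf << TAG_SHIFT)
#define MODEL_GRANULES 64

/* One allocation tag per granule of a small pretend address space.  */
uint8_t model_tags[MODEL_GRANULES];

/* Model of the new-tag hook: return PTR with a tag that differs from
   its current one.  The memory tags are left untouched.  */
uint64_t
model_new_tag (uint64_t ptr)
{
  uint64_t old = (ptr & TAG_MASK) >> TAG_SHIFT;
  uint64_t tag = (old + 1) & 0xf;	/* Stand-in for a random choice.  */
  return (ptr & ~TAG_MASK) | (tag << TAG_SHIFT);
}

/* Model of the tag-region hook: give SIZE bytes at PTR the tag carried
   by PTR.  SIZE must be a non-zero multiple of the granule size.  */
uint64_t
model_tag_region (uint64_t ptr, size_t size)
{
  uint64_t addr = ptr & ~TAG_MASK;
  uint8_t tag = (uint8_t) ((ptr & TAG_MASK) >> TAG_SHIFT);

  assert (size != 0 && size % GRANULE == 0 && addr % GRANULE == 0);
  assert ((addr + size) / GRANULE <= MODEL_GRANULES);
  for (; size != 0; addr += GRANULE, size -= GRANULE)
    model_tags[addr / GRANULE] = tag;
  return ptr;
}

/* Model of the get-tag hook: return PTR retagged to match the tag
   currently stored for the memory it addresses.  */
uint64_t
model_address_get_tag (uint64_t ptr)
{
  uint64_t addr = ptr & ~TAG_MASK;

  assert (addr / GRANULE < MODEL_GRANULES);
  return addr | ((uint64_t) model_tags[addr / GRANULE] << TAG_SHIFT);
}
```

The model makes the division of labour visible: the new-tag hook changes only the pointer, the tag-region hook changes only the memory tags, and the get-tag hook recovers a correctly tagged pointer from an address.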
---
 sysdeps/aarch64/Makefile                 |  5 +++
 sysdeps/aarch64/__mtag_address_get_tag.S | 31 +++++++++++++
 sysdeps/aarch64/__mtag_memset_tag.S      | 46 +++++++++++++++++++
 sysdeps/aarch64/__mtag_new_tag.S         | 38 ++++++++++++++++
 sysdeps/aarch64/__mtag_tag_region.S      | 44 ++++++++++++++++++
 sysdeps/aarch64/libc-mtag.h              | 57 ++++++++++++++++++++++++
 6 files changed, 221 insertions(+)
 create mode 100644 sysdeps/aarch64/__mtag_address_get_tag.S
 create mode 100644 sysdeps/aarch64/__mtag_memset_tag.S
 create mode 100644 sysdeps/aarch64/__mtag_new_tag.S
 create mode 100644 sysdeps/aarch64/__mtag_tag_region.S
 create mode 100644 sysdeps/aarch64/libc-mtag.h


[-- Attachment #2: 0007-aarch64-Add-aarch64-specific-files-for-memory-taggin.patch --]
[-- Type: text/x-patch; name="0007-aarch64-Add-aarch64-specific-files-for-memory-taggin.patch", Size: 8964 bytes --]

diff --git a/sysdeps/aarch64/Makefile b/sysdeps/aarch64/Makefile
index 9cb141004d..34b5aa7f6e 100644
--- a/sysdeps/aarch64/Makefile
+++ b/sysdeps/aarch64/Makefile
@@ -21,4 +21,9 @@ endif
 
 ifeq ($(subdir),misc)
 sysdep_headers += sys/ifunc.h
+sysdep_routines += __mtag_tag_region __mtag_new_tag __mtag_address_get_tag
+endif
+
+ifeq ($(subdir),string)
+sysdep_routines += __mtag_memset_tag
 endif
diff --git a/sysdeps/aarch64/__mtag_address_get_tag.S b/sysdeps/aarch64/__mtag_address_get_tag.S
new file mode 100644
index 0000000000..654c9d660c
--- /dev/null
+++ b/sysdeps/aarch64/__mtag_address_get_tag.S
@@ -0,0 +1,31 @@
+/* Copyright (C) 2020 Free Software Foundation, Inc.
+
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+
+#define ptr x0
+
+	.arch armv8.5-a
+	.arch_extension memtag
+
+ENTRY (__libc_mtag_address_get_tag)
+
+	ldg	ptr, [ptr]
+	ret
+END (__libc_mtag_address_get_tag)
+libc_hidden_builtin_def (__libc_mtag_address_get_tag)
diff --git a/sysdeps/aarch64/__mtag_memset_tag.S b/sysdeps/aarch64/__mtag_memset_tag.S
new file mode 100644
index 0000000000..bc98dc49d2
--- /dev/null
+++ b/sysdeps/aarch64/__mtag_memset_tag.S
@@ -0,0 +1,46 @@
+/* Copyright (C) 2020 Free Software Foundation, Inc.
+
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+/* Use the same register names and assignments as memset.  */
+#include "memset-reg.h"
+
+	.arch armv8.5-a
+	.arch_extension memtag
+
+/* NB, only supported on variants with 64-bit pointers.  */
+
+/* FIXME: This is a minimal implementation.  We could do much better than
+   this for large values of COUNT.  */
+
+ENTRY_ALIGN(__libc_mtag_memset_with_tag, 6)
+
+	and	valw, valw, 255
+	orr	valw, valw, valw, lsl 8
+	orr	valw, valw, valw, lsl 16
+	orr	val, val, val, lsl 32
+	mov	dst, dstin
+
+L(loop):
+	stgp	val, val, [dst], #16
+	subs	count, count, 16
+	bne	L(loop)
+	ldg	dstin, [dstin] // Recover the tag created (might be untagged).
+	ret
+END (__libc_mtag_memset_with_tag)
+libc_hidden_builtin_def (__libc_mtag_memset_with_tag)
diff --git a/sysdeps/aarch64/__mtag_new_tag.S b/sysdeps/aarch64/__mtag_new_tag.S
new file mode 100644
index 0000000000..3a22995e9f
--- /dev/null
+++ b/sysdeps/aarch64/__mtag_new_tag.S
@@ -0,0 +1,38 @@
+/* Copyright (C) 2020 Free Software Foundation, Inc.
+
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+
+	.arch armv8.5-a
+	.arch_extension memtag
+
+/* NB, only supported on variants with 64-bit pointers.  */
+
+/* FIXME: This is a minimal implementation.  We could do better than
+   this for larger values of COUNT.  */
+
+#define ptr x0
+#define xset x1
+
+ENTRY(__libc_mtag_new_tag)
+	// Guarantee that the new tag is not the same as now.
+	gmi	xset, ptr, xzr
+	irg	ptr, ptr, xset
+	ret
+END (__libc_mtag_new_tag)
+libc_hidden_builtin_def (__libc_mtag_new_tag)
diff --git a/sysdeps/aarch64/__mtag_tag_region.S b/sysdeps/aarch64/__mtag_tag_region.S
new file mode 100644
index 0000000000..41019781d0
--- /dev/null
+++ b/sysdeps/aarch64/__mtag_tag_region.S
@@ -0,0 +1,44 @@
+/* Copyright (C) 2020 Free Software Foundation, Inc.
+
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+/* Use the same register names and assignments as memset.  */
+
+	.arch armv8.5-a
+	.arch_extension memtag
+
+/* NB, only supported on variants with 64-bit pointers.  */
+
+/* FIXME: This is a minimal implementation.  We could do better than
+   this for larger values of COUNT.  */
+
+#define dstin x0
+#define count x1
+#define dst   x2
+
+ENTRY_ALIGN(__libc_mtag_tag_region, 6)
+
+	mov	dst, dstin
+L(loop):
+	stg	dst, [dst], #16
+	subs	count, count, 16
+	bne	L(loop)
+	ldg	dstin, [dstin] // Recover the tag created (might be untagged).
+	ret
+END (__libc_mtag_tag_region)
+libc_hidden_builtin_def (__libc_mtag_tag_region)
diff --git a/sysdeps/aarch64/libc-mtag.h b/sysdeps/aarch64/libc-mtag.h
new file mode 100644
index 0000000000..9c7d00c541
--- /dev/null
+++ b/sysdeps/aarch64/libc-mtag.h
@@ -0,0 +1,57 @@
+/* libc-internal interface for tagged (colored) memory support.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _AARCH64_LIBC_MTAG_H
+#define _AARCH64_LIBC_MTAG_H 1
+
+#ifndef _LIBC_MTAG
+/* Generic bindings for systems that do not support memory tagging.  */
+#include_next "libc-mtag.h"
+#else
+
+/* Used to ensure additional alignment when objects need to have distinct
+   tags.  */
+#define __MTAG_GRANULE_SIZE 16
+
+/* Non-zero if memory obtained via morecore (sbrk) is not tagged.  */
+#define __MTAG_SBRK_UNTAGGED 1
+
+/* Extra flags to pass to mmap to get tagged pages.  */
+#define __MTAG_MMAP_FLAGS PROT_MTE
+
+/* Set the tags for a region of memory, which must have size and alignment
+   that are multiples of __MTAG_GRANULE_SIZE.  Size cannot be zero.
+   void *__libc_mtag_tag_region (void *, size_t)  */
+void *__libc_mtag_tag_region (void *, size_t);
+
+/* Optimized equivalent to __libc_mtag_tag_region followed by memset.  */
+void *__libc_mtag_memset_with_tag(void *, int, size_t);
+
+/* Convert address P to a pointer that is tagged correctly for that
+   location.
+   void *__libc_mtag_address_get_tag (void*)  */
+void *__libc_mtag_address_get_tag(void *);
+
+/* Assign a new (random) tag to a pointer P (does not adjust the tag on
+   the memory addressed).
+   void *__libc_mtag_new_tag (void*)  */
+void *__libc_mtag_new_tag(void *);
+
+#endif /* _LIBC_MTAG */
+
+#endif /* _AARCH64_LIBC_MTAG_H */
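
The header above requires that regions passed to __libc_mtag_tag_region
have a size that is a non-zero multiple of __MTAG_GRANULE_SIZE.  A minimal
sketch of how a caller might round a request up to that granule follows.
GRANULE_SIZE and round_up_to_granule are hypothetical names used only for
illustration and are not part of the patch; overflow for sizes close to
SIZE_MAX is not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: mirrors the value of __MTAG_GRANULE_SIZE in the header.  */
#define GRANULE_SIZE ((size_t) 16)

/* Hypothetical helper: round SIZE up to a whole number of granules.
   SIZE must be non-zero, matching the documented precondition.  */
static size_t
round_up_to_granule (size_t size)
{
  assert (size != 0);
  return (size + GRANULE_SIZE - 1) & ~(GRANULE_SIZE - 1);
}
```

Under these assumptions a 17-byte request would occupy two granules
(32 bytes), which is the kind of overhead small allocations would see.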

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
                   ` (6 preceding siblings ...)
  2020-06-15 14:40 ` [PATCH 7/7] aarch64: Add aarch64-specific files for memory tagging support Richard Earnshaw
@ 2020-06-15 15:03 ` H.J. Lu
  2020-06-15 15:11   ` Richard Earnshaw (lists)
  2020-06-15 15:08 ` Paul Eggert
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 47+ messages in thread
From: H.J. Lu @ 2020-06-15 15:03 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: GNU C Library

On Mon, Jun 15, 2020 at 7:43 AM Richard Earnshaw <rearnsha@arm.com> wrote:
>
> Last year I posted a preliminary set of patches for discussion
> purposes on adding support for tagged memory to glibc.  This version
> polishes that to the point where I believe it is now deployable.
>
> The first four patches are generic changes, the final three add the
> aarch64 specific code.
>
> The first patch simply adds a new configuration option to the build
> system which can be turned on with the option --enable-memory-tagging.
> The default at present is 'no'.
>
> The second patch adds a glibc tunable that can be used at run-time to
> enable the feature (the default again, is disabled).  This tunable
> would be always present, but have no effect on systems lacking support
> for memory tagging.  I've structured the tunable as a bit-mask of
> features that can be used with memory tagging, though at present only
> two bits have defined uses.
>
> The third patch is the meat of the changes; it adds the changes to the
> malloc APIs.  I've tried as far as possible to ensure that when memory
> tagging is disabled, there is no change in behaviour, even when the
> memory tagging is configured into the library, but there are
> inevitably a small number of changes needed in the optimizations that
> calloc performs since tagging would require that all the tags were
> correctly set, even if the memory does not strictly have to be zeroed.
> I've made use of function pointers in the code, much the same way as
> the morecore hook is used, so that when tagging is disabled, the
> functions called are the same as the traditional operations; this also
> ensures that glibc does not require any internal ifunc resolution in
> order to work.
>
> The fourth patch adds support for the new prctl operations that are
> being proposed to the linux kernel.  The kernel changes are to a
> generic header and this patch mirrors that design decision in glibc.
>
> The fifth patch is a place-holder, so that this series of changes is
> stand-alone.  Work is already underway to change the string operations
> to be MTE safe without losing too much in the way of performance.  I
> expect this patch to be removed entirely before the series is
> committed.
>
> The final two patches add the remaining aarch64 support.  The first
> adds the support code to examine the tunable and HW caps; and enable
> memory tagging in the kernel as needed.  The second adds the final
> pieces needed to support memory tagging in glibc.
>

Obviously, pointer comparison and algorithm will be impacted by MTE.
From what you are proposing, only parts of glibc will be MTE compatible.
Is this correct?

-- 
H.J.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
                   ` (7 preceding siblings ...)
  2020-06-15 15:03 ` [PATCH 0/7] RFC Memory " H.J. Lu
@ 2020-06-15 15:08 ` Paul Eggert
  2020-06-15 16:37   ` Richard Earnshaw
  2020-06-15 16:37 ` Joseph Myers
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 47+ messages in thread
From: Paul Eggert @ 2020-06-15 15:08 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: libc-alpha

On 6/15/20 7:40 AM, Richard Earnshaw wrote:
> 3) Tests that construct a fake internal malloc data structure and then
> try to perform operations on them.  I haven't looked at these in too
> much detail, but the first issue is that the fake header is only
> 8-byte aligned and for MTE to work it requires a 16-byte aligned
> structure

Is the problem here that the tests fail quickly due to easy-to-check alignment
issues, and so no longer test the more-interesting defenses?

Would it make sense for these tests to align the fake internal data structure to
16 bytes using _Alignas? I.e., should the tests be trying to fail due to easy
alignment issues, or should they be trying to skip the easy alignment issues and
go on to the harder parts of the tests?
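
A minimal sketch of the _Alignas idea, assuming a C11 compiler.  The
struct fake_chunk and its fields are hypothetical stand-ins for whatever
the tests build, not glibc's real chunk layout; 16 is the MTE granule
size mentioned in the quoted text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a test's fake malloc header; the field
   names are illustrative only.  */
struct fake_chunk
{
  size_t prev_size;
  size_t size;
  void *fd;
  void *bk;
};

/* Force 16-byte (one granule) alignment on the static object.  */
static _Alignas (16) struct fake_chunk fake;

/* Return nonzero if P sits on a 16-byte boundary.  */
static int
is_granule_aligned (const void *p)
{
  return ((uintptr_t) p % 16) == 0;
}

/* Abort if the fake header is not granule aligned.  */
static void
check_fake_alignment (void)
{
  assert (is_granule_aligned (&fake));
}
```

With the alignment forced this way, a test would presumably get past the
alignment check and reach the defenses it is meant to exercise.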

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 15:03 ` [PATCH 0/7] RFC Memory " H.J. Lu
@ 2020-06-15 15:11   ` Richard Earnshaw (lists)
  2020-06-15 15:37     ` H.J. Lu
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Earnshaw (lists) @ 2020-06-15 15:11 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GNU C Library

On 15/06/2020 16:03, H.J. Lu wrote:
> On Mon, Jun 15, 2020 at 7:43 AM Richard Earnshaw <rearnsha@arm.com> wrote:
>>
>> [...]
>>
> 
> Obviously, pointer comparison and algorithm will be impacted by MTE.
> From what you are proposing, only parts of glibc will be MTE compatible.
> Is this correct?
> 

Only *undefined* pointer comparisons will be impacted, such as comparing
objects from different allocations.

Within an allocation comparisons are fine.  And also, pointer
equivalence is fine as well.

R.

> -- 
> H.J.
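
The distinction above can be modelled in portable C.  This is only a
sketch: it assumes a 4-bit tag held in bits 59:56 of a 64-bit address,
and with_tag and raw_less are hypothetical helpers, not glibc or kernel
interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Model only: a 64-bit address with a 4-bit tag in bits 59:56.  */
#define TAG_SHIFT 56
#define TAG_MASK (UINT64_C (0xf) << TAG_SHIFT)

/* Hypothetical helper: combine an untagged address with a tag.  */
static uint64_t
with_tag (uint64_t addr, unsigned tag)
{
  assert (tag < 16);
  return (addr & ~TAG_MASK) | ((uint64_t) tag << TAG_SHIFT);
}

/* Relational compare on the raw bit pattern, tag included.  */
static int
raw_less (uint64_t a, uint64_t b)
{
  return a < b;
}
```

Two pointers into one allocation share a tag, so their ordering matches
the ordering of the underlying addresses.  Pointers into different
allocations may carry different tags, and then the tag bits, not the
addresses, decide the result.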


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 15:11   ` Richard Earnshaw (lists)
@ 2020-06-15 15:37     ` H.J. Lu
  2020-06-15 16:30       ` Szabolcs Nagy
  0 siblings, 1 reply; 47+ messages in thread
From: H.J. Lu @ 2020-06-15 15:37 UTC (permalink / raw)
  To: Richard Earnshaw (lists); +Cc: GNU C Library

On Mon, Jun 15, 2020 at 8:11 AM Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
>
> On 15/06/2020 16:03, H.J. Lu wrote:
> > On Mon, Jun 15, 2020 at 7:43 AM Richard Earnshaw <rearnsha@arm.com> wrote:
> >>
> >> [...]
> >>
> >
> > Obviously, pointer comparison and algorithm will be impacted by MTE.
> > From what you are proposing, only parts of glibc will be MTE compatible.
> > Is this correct?
> >
>
> Only *undefined* pointer comparisons will be impacted, such as comparing
> objects from different allocations.
>
> Within an allocation comparisons are fine.  And also, pointer
> equivalence is fine as well.
>

Is "ptr1 - ptr2" valid to compute the distance between 2 pointers?

-- 
H.J.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 15:37     ` H.J. Lu
@ 2020-06-15 16:30       ` Szabolcs Nagy
  2020-06-15 16:40         ` H.J. Lu
  0 siblings, 1 reply; 47+ messages in thread
From: Szabolcs Nagy @ 2020-06-15 16:30 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Richard Earnshaw (lists), GNU C Library

The 06/15/2020 08:37, H.J. Lu via Libc-alpha wrote:
> On Mon, Jun 15, 2020 at 8:11 AM Richard Earnshaw (lists)
> <Richard.Earnshaw@arm.com> wrote:
> >
> > On 15/06/2020 16:03, H.J. Lu wrote:
> > > On Mon, Jun 15, 2020 at 7:43 AM Richard Earnshaw <rearnsha@arm.com> wrote:
> > >>
> > >> [...]
> > >>
> > >
> > > Obviously, pointer comparison and algorithm will be impacted by MTE.
> > > From what you are proposing, only parts of glibc will be MTE compatible.
> > > Is this correct?
> > >
> >
> > Only *undefined* pointer comparisons will be impacted, such as comparing
> > objects from different allocations.
> >
> > Within an allocation comparisons are fine.  And also, pointer
> > equivalence is fine as well.
> >
> 
> Is "ptr1 - ptr2" valid to compute the distance between 2 pointers?

in iso c that's only valid within the same object.

(but e.g. gcc tries to detect which way the stack grows
by comparing stack pointers across stack frames: that's
not legal in c, and does not work if stack objects are
tagged with mte, this patch set is for heap tagging though)
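
A short sketch of the case that ISO C does define: subtracting two
pointers into the same array.  index_of and sample are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sample object so that both pointers refer to the same array.  */
static int sample[8];

/* Return the index of ELEM within the array starting at BASE.
   Both pointers must point into (or one past the end of) the same
   array; otherwise the subtraction is undefined in ISO C.  */
static ptrdiff_t
index_of (const int *base, const int *elem)
{
  assert (base != NULL && elem != NULL);
  return elem - base;
}
```

Because both operands come from one object they carry the same tag, so
tagging does not change the result of a subtraction like this.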

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 15:08 ` Paul Eggert
@ 2020-06-15 16:37   ` Richard Earnshaw
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-15 16:37 UTC (permalink / raw)
  To: Paul Eggert, Richard Earnshaw; +Cc: libc-alpha

On 15/06/2020 16:08, Paul Eggert wrote:
> On 6/15/20 7:40 AM, Richard Earnshaw wrote:
>> 3) Tests that construct a fake internal malloc data structure and then
>> try to perform operations on them.  I haven't looked at these in too
>> much detail, but the first issue is that the fake header is only
>> 8-byte aligned and for MTE to work it requires a 16-byte aligned
>> structure
> 
> Is the problem here that the tests fail quickly due to easy-to-check alignment
> issues, and so no longer test the more-interesting defenses?
> 
> Would it make sense for these tests to align the fake internal data structure to
> 16 bytes using _Alignas? I.e., should the tests be trying to fail due to easy
> alignment issues, or should they be trying to skip the easy alignment issues and
> go on to the harder parts of the tests?
> 

It might.  Of course, there's no (easy) way for the test to fake up the
tag colouring, but that's not a major issue (at least on aarch64)
because tags are ignored on non-tagged memory and static data is just that.

This is something I need to look into some more, but I wanted to get the
review started.

R.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
                   ` (8 preceding siblings ...)
  2020-06-15 15:08 ` Paul Eggert
@ 2020-06-15 16:37 ` Joseph Myers
  2020-06-15 16:53   ` Richard Earnshaw
  2020-06-15 17:04 ` DJ Delorie
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 47+ messages in thread
From: Joseph Myers @ 2020-06-15 16:37 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: libc-alpha

On Mon, 15 Jun 2020, Richard Earnshaw wrote:

> The fourth patch adds support for the new prctl operations that are
> being proposed to the linux kernel.  The kernel changes are to a
> generic header and this patch mirrors that design decision in glibc.

Are these values in Linus's git tree?  (I think we should generally avoid 
adding kernel interfaces until they are at least in upstream git.)

>  manual/install.texi                           |  13 ++

INSTALL should be regenerated when updating install.texi.  Also, new 
features such as this should get an entry in NEWS.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 16:30       ` Szabolcs Nagy
@ 2020-06-15 16:40         ` H.J. Lu
  2020-06-15 16:51           ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 47+ messages in thread
From: H.J. Lu @ 2020-06-15 16:40 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: Richard Earnshaw (lists), GNU C Library

On Mon, Jun 15, 2020 at 9:30 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>
> The 06/15/2020 08:37, H.J. Lu via Libc-alpha wrote:
> > On Mon, Jun 15, 2020 at 8:11 AM Richard Earnshaw (lists)
> > <Richard.Earnshaw@arm.com> wrote:
> > >
> > > On 15/06/2020 16:03, H.J. Lu wrote:
> > > > On Mon, Jun 15, 2020 at 7:43 AM Richard Earnshaw <rearnsha@arm.com> wrote:
> > > >>
> > > > >> [...]
> > > >>
> > > >
> > > > Obviously, pointer comparison and algorithm will be impacted by MTE.
> > > > From what you are proposing, only parts of glibc will be MTE compatible.
> > > > Is this correct?
> > > >
> > >
> > > Only *undefined* pointer comparisons will be impacted, such as comparing
> > > objects from different allocations.
> > >
> > > Within an allocation comparisons are fine.  And also, pointer
> > > equivalence is fine as well.
> > >
> >
> > Is "ptr1 - ptr2" valid to compute the distance between 2 pointers?
>
> in iso c that's only valid within the same object.
>
> (but e.g. gcc tries to detect which way the stack grows
> by comparing stack pointers across stack frames: that's
> not legal in c, and does not work if stack objects are
> tagged with mte, this patch set is for heap tagging though)

memmove in C has

rettype
inhibit_loop_to_libcall
MEMMOVE (a1const void *a1, a2const void *a2, size_t len)
{
  unsigned long int dstp = (long int) dest;
  unsigned long int srcp = (long int) src;

  /* This test makes the forward copying code be used whenever possible.
     Reduces the working set.  */
  if (dstp - srcp >= len) /* *Unsigned* compare!  */

How does it work?

-- 
H.J.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 16:40         ` H.J. Lu
@ 2020-06-15 16:51           ` Richard Earnshaw (lists)
  2020-06-15 17:46             ` H.J. Lu
                               ` (2 more replies)
  0 siblings, 3 replies; 47+ messages in thread
From: Richard Earnshaw (lists) @ 2020-06-15 16:51 UTC (permalink / raw)
  To: H.J. Lu, Szabolcs Nagy; +Cc: GNU C Library

On 15/06/2020 17:40, H.J. Lu via Libc-alpha wrote:
> On Mon, Jun 15, 2020 at 9:30 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>>
>> The 06/15/2020 08:37, H.J. Lu via Libc-alpha wrote:
>>> On Mon, Jun 15, 2020 at 8:11 AM Richard Earnshaw (lists)
>>> <Richard.Earnshaw@arm.com> wrote:
>>>>
>>>> On 15/06/2020 16:03, H.J. Lu wrote:
>>>>> On Mon, Jun 15, 2020 at 7:43 AM Richard Earnshaw <rearnsha@arm.com> wrote:
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>
>>>>> Obviously, pointer comparison and algorithm will be impacted by MTE.
>>>>> From what you are proposing, only parts of glibc will be MTE compatible.
>>>>> Is this correct?
>>>>>
>>>>
>>>> Only *undefined* pointer comparisons will be impacted, such as comparing
>>>> objects from different allocations.
>>>>
>>>> Within an allocation comparisons are fine.  And also, pointer
>>>> equivalence is fine as well.
>>>>
>>>
>>> Is "ptr1 - ptr2" valid to compute the distance between 2 pointers?
>>
>> in iso c that's only valid within the same object.
>>
>> (but e.g. gcc tries to detect which way the stack grows
>> by comparing stack pointers across stack frames: that's
>> not legal in c, and does not work if stack objects are
>> tagged with mte, this patch set is for heap tagging though)
> 
> memmove in C has
> 
> rettype
> inhibit_loop_to_libcall
> MEMMOVE (a1const void *a1, a2const void *a2, size_t len)
> {
>   unsigned long int dstp = (long int) dest;
>   unsigned long int srcp = (long int) src;
> 
>   /* This test makes the forward copying code be used whenever possible.
>      Reduces the working set.  */
>   if (dstp - srcp >= len) /* *Unsigned* compare!  */
> 
> How does it work?
> 

Well the code is technically undefined!

In practice it will work because objects passed to memmove will have to
have a single colour, so the test will work correctly, though not for
the reason the programmer thought.

:-)

R.
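
The behaviour of the quoted test can be modelled in portable C.  This is
a sketch under stated assumptions: tags are taken to occupy bits 59:56
of a 64-bit address, and tagged and use_forward_copy are hypothetical
helpers that only mirror the unsigned comparison shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_SHIFT 56

/* Model only: place a 4-bit tag in bits 59:56 of an address.  */
static uint64_t
tagged (uint64_t addr, unsigned tag)
{
  assert (tag < 16);
  return addr | ((uint64_t) tag << TAG_SHIFT);
}

/* The quoted memmove test: nonzero means the forward copy is chosen.  */
static int
use_forward_copy (uint64_t dstp, uint64_t srcp, size_t len)
{
  return dstp - srcp >= len; /* Unsigned compare, wraps modulo 2^64.  */
}
```

When both pointers belong to one object the tags cancel in the
subtraction, so overlap is detected exactly as before.  When the tags
differ, the difference wraps to a large value and the forward copy is
chosen, which is harmless because distinct objects cannot overlap.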

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 16:37 ` Joseph Myers
@ 2020-06-15 16:53   ` Richard Earnshaw
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-15 16:53 UTC (permalink / raw)
  To: Joseph Myers, Richard Earnshaw; +Cc: libc-alpha

On 15/06/2020 17:37, Joseph Myers wrote:
> On Mon, 15 Jun 2020, Richard Earnshaw wrote:
> 
>> The fourth patch adds support for the new prctl operations that are
>> being proposed to the linux kernel.  The kernel changes are to a
>> generic header and this patch mirrors that design decision in glibc.
> 
> Are these values in Linus's git tree?  (I think we should generally avoid 
> adding kernel interfaces until they are at least in upstream git.)
> 
>>  manual/install.texi                           |  13 ++
> 
> INSTALL should be regenerated when updating install.texi.  Also, new 
> features such as this should get an entry in NEWS.
> 

This is only an RFC at this point.  Kernel patches are in review.  But
we need to agree on the API before they can be committed, and that can
only happen if we agree on the libc side of it as well.

We can't require the kernel patches be pushed before we even consider
this or we would have deadlock.

R.


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
                   ` (9 preceding siblings ...)
  2020-06-15 16:37 ` Joseph Myers
@ 2020-06-15 17:04 ` DJ Delorie
  2020-06-15 17:09   ` Richard Earnshaw
  2020-06-16 10:16 ` Szabolcs Nagy
  2020-06-23 13:22 ` Szabolcs Nagy
  12 siblings, 1 reply; 47+ messages in thread
From: DJ Delorie @ 2020-06-15 17:04 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: libc-alpha, rearnsha


Two immediate thoughts...

1. Do we really want to add more environment variables as aliases for
   new tunables?  I thought env support was for pre-tunable variable
   support (compatibility) only.

2. Do we really need to lose the back pointer's word in allocated
   memory?  Historically, the back pointer is *not* part of the malloc
   internal data when the chunk is in 'allocated' state, and losing that
   memory will make small allocations much less efficient.



* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 17:04 ` DJ Delorie
@ 2020-06-15 17:09   ` Richard Earnshaw
  2020-06-15 17:22     ` DJ Delorie
  2020-06-16 10:17     ` Richard Earnshaw
  0 siblings, 2 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-15 17:09 UTC (permalink / raw)
  To: DJ Delorie, Richard Earnshaw; +Cc: libc-alpha

On 15/06/2020 18:04, DJ Delorie via Libc-alpha wrote:
> 
> Two immediate thoughts...
> 
> 1. Do we really want to add more environment variables as aliases for
>    new tunables?  I thought env support was for pre-tunable variable
>    support (compatibility) only.

That might depend on whether we want to try to share how this is enabled
with other C libraries - we can't expect them to copy all of glibc's
tunable API here.

That being said, this is easy enough to change if needed.

> 
> 2. Do we really need to lose the back pointer's word in allocated
>    memory?  Historically, the back pointer is *not* part of the malloc
>    internal data when the chunk is in 'allocated' state, and losing that
>    memory will make small allocations much less efficient.
> 

Yes, if you want to protect the back pointer against being trampled by
programs - it has to have a different tag colour to memory given to the
application.

R.


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 17:09   ` Richard Earnshaw
@ 2020-06-15 17:22     ` DJ Delorie
  2020-06-16 10:17     ` Richard Earnshaw
  1 sibling, 0 replies; 47+ messages in thread
From: DJ Delorie @ 2020-06-15 17:22 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: rearnsha, libc-alpha

Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
>> 2. Do we really need to lose the back pointer's word in allocated
>>    memory?  Historically, the back pointer is *not* part of the malloc
>>    internal data when the chunk is in 'allocated' state, and losing that
>>    memory will make small allocations much less efficient.
>
> Yes, if you want to protect the back pointer against being trampled by
> programs - it has to have a different tag colour to memory given to the
> application.

But is there a way to recolor it when allocated?  It only need be
protected when the chunk is in malloc's control.  In fact, when the
application has it, it should be colored differently than malloc's data,
to prevent malloc from trying to use it.



* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 16:51           ` Richard Earnshaw (lists)
@ 2020-06-15 17:46             ` H.J. Lu
  2020-06-15 18:05             ` Paul Eggert
  2020-06-15 18:10             ` Andreas Schwab
  2 siblings, 0 replies; 47+ messages in thread
From: H.J. Lu @ 2020-06-15 17:46 UTC (permalink / raw)
  To: Richard Earnshaw (lists); +Cc: Szabolcs Nagy, GNU C Library

On Mon, Jun 15, 2020 at 9:51 AM Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
>
> On 15/06/2020 17:40, H.J. Lu via Libc-alpha wrote:
> > On Mon, Jun 15, 2020 at 9:30 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
> >>
> >> The 06/15/2020 08:37, H.J. Lu via Libc-alpha wrote:
> >>> On Mon, Jun 15, 2020 at 8:11 AM Richard Earnshaw (lists)
> >>> <Richard.Earnshaw@arm.com> wrote:
> >>>>
> >>>> On 15/06/2020 16:03, H.J. Lu wrote:
> >>>>> On Mon, Jun 15, 2020 at 7:43 AM Richard Earnshaw <rearnsha@arm.com> wrote:
> >>>>>>
> >>>>>> Last year I posted a preliminary set of patches for discussion
> >>>>>> purposes on adding support for tagged memory to glibc.  This version
> >>>>>> polishes that to the point where I believe it is now deployable.
> >>>>>>
> >>>>>> The first four patches are generic changes, the final three add the
> >>>>>> aarch64 specific code.
> >>>>>>
> >>>>>> The first patch simply adds a new configuration option to the build
> >>>>>> system which can be turned on with the option --enable-memory-tagging.
> >>>>>> The default at present is 'no'.
> >>>>>>
> >>>>>> The second patch adds a glibc tunable that can be used at run-time to
> >>>>>> enable the feature (the default again, is disabled).  This tunable
> >>>>>> would be always present, but have no effect on systems lacking support
> >>>>>> for memory tagging.  I've structured the tunable as a bit-mask of
> >>>>>> features that can be used with memory tagging, though at present only
> >>>>>> two bits have defined uses.
> >>>>>>
> >>>>>> The third patch is the meat of the changes; it adds the changes to the
> >>>>>> malloc APIs.  I've tried as far as possible to ensure that when memory
> >>>>>> tagging is disabled, there is no change in behaviour, even when the
> >>>>>> memory tagging is configured into the library, but there are
> >>>>>> inevitably a small number of changes needed in the optimizations that
> >>>>>> calloc performs since tagging would require that all the tags were
> >>>>>> correctly set, even if the memory does not strictly have to be zeroed.
> >>>>>> I've made use of function pointers in the code, much the same way as
> >>>>>> the morecore hook is used, so that when tagging is disabled, the
> >>>>>> functions called are the same as the traditional operations; this also
> >>>>>> ensures that glibc does not require any internal ifunc resolution in
> >>>>>> order to work.
> >>>>>>
> >>>>>> The fourth patch adds support for the new prctl operations that are
> >>>>>> being proposed to the linux kernel.  The kernel changes are to a
> >>>>>> generic header and this patch mirrors that design decision in glibc.
> >>>>>>
> >>>>>> The fifth patch is a place-holder, so that this series of changes is
> >>>>>> stand-alone.  Work is already underway to change the string operations
> >>>>>> to be MTE safe without losing too much in the way of performance.  I
> >>>>>> expect this patch to be removed entirely before the series is
> >>>>>> committed.
> >>>>>>
> >>>>>> The final two patches add the remaining aarch64 support.  The first
> >>>>>> adds the support code to examine the tunable and HW caps; and enable
> >>>>>> memory tagging in the kernel as needed.  The second adds the final
> >>>>>> pieces needed to support memory tagging in glibc.
> >>>>>>
> >>>>>
> >>>>> Obviously, pointer comparison and algorithm will be impacted by MTE.
> >>>>> From what you are proposing, only parts of glibc will be MTE compatible.
> >>>>> Is this correct?
> >>>>>
> >>>>
> >>>> Only *undefined* pointer comparisons will be impacted, such as comparing
> >>>> objects from different allocations.
> >>>>
> >>>> Within an allocation comparisons are fine.  And also, pointer
> >>>> equivalence is fine as well.
> >>>>
> >>>
> >>> Is "ptr1 - ptr2" valid to compute the distance between 2 pointers?
> >>
> >> in iso c that's only valid within the same object.
> >>
> >> (but e.g. gcc tries to detect which way the stack grows
> >> by comparing stack pointers across stack frames: that's
> >> not legal in c, and does not work if stack objects are
> >> tagged with mte, this patch set is for heap tagging though)
> >
> > memmove in C has
> >
> > rettype
> > inhibit_loop_to_libcall
> > MEMMOVE (a1const void *a1, a2const void *a2, size_t len)
> > {
> >   unsigned long int dstp = (long int) dest;
> >   unsigned long int srcp = (long int) src;
> >
> >   /* This test makes the forward copying code be used whenever possible.
> >      Reduces the working set.  */
> >   if (dstp - srcp >= len) /* *Unsigned* compare!  */
> >
> > How does it work?
> >
>
> Well the code is technically undefined!

What does it mean to glibc?  Have you done an audit on glibc for this
issue?

> In practice it will work because objects passed to memmove will have to
> have a single colour, so the test will work correctly, though not for
> the reason the programmer thought.

Do you have a list of functions in glibc which allow more than one color?

-- 
H.J.


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 16:51           ` Richard Earnshaw (lists)
  2020-06-15 17:46             ` H.J. Lu
@ 2020-06-15 18:05             ` Paul Eggert
  2020-06-15 18:14               ` Richard Earnshaw (lists)
  2020-06-15 18:41               ` Szabolcs Nagy
  2020-06-15 18:10             ` Andreas Schwab
  2 siblings, 2 replies; 47+ messages in thread
From: Paul Eggert @ 2020-06-15 18:05 UTC (permalink / raw)
  To: Richard Earnshaw (lists), H.J. Lu, Szabolcs Nagy; +Cc: GNU C Library

On 6/15/20 9:51 AM, Richard Earnshaw (lists) wrote:
> In practice it will work because objects passed to memmove will have to
> have a single colour,

Does this mean all stack and heap objects visible to the C programmer must have
the same tag? This surprises me, as I thought part of the idea was to assign
tags randomly.


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 16:51           ` Richard Earnshaw (lists)
  2020-06-15 17:46             ` H.J. Lu
  2020-06-15 18:05             ` Paul Eggert
@ 2020-06-15 18:10             ` Andreas Schwab
  2 siblings, 0 replies; 47+ messages in thread
From: Andreas Schwab @ 2020-06-15 18:10 UTC (permalink / raw)
  To: Richard Earnshaw (lists); +Cc: H.J. Lu, Szabolcs Nagy, GNU C Library

On Jun 15 2020, Richard Earnshaw (lists) wrote:

>> rettype
>> inhibit_loop_to_libcall
>> MEMMOVE (a1const void *a1, a2const void *a2, size_t len)
>> {
>>   unsigned long int dstp = (long int) dest;
>>   unsigned long int srcp = (long int) src;
>> 
>>   /* This test makes the forward copying code be used whenever possible.
>>      Reduces the working set.  */
>>   if (dstp - srcp >= len) /* *Unsigned* compare!  */
>> 
>> How does it work?
>> 
>
> Well the code is technically undefined!

Not undefined, but implementation-defined.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 18:05             ` Paul Eggert
@ 2020-06-15 18:14               ` Richard Earnshaw (lists)
  2020-06-15 18:41               ` Szabolcs Nagy
  1 sibling, 0 replies; 47+ messages in thread
From: Richard Earnshaw (lists) @ 2020-06-15 18:14 UTC (permalink / raw)
  To: Paul Eggert, H.J. Lu, Szabolcs Nagy; +Cc: GNU C Library

On 15/06/2020 19:05, Paul Eggert wrote:
> On 6/15/20 9:51 AM, Richard Earnshaw (lists) wrote:
>> In practice it will work because objects passed to memmove will have to
>> have a single colour,
> 
> Does this mean all stack and heap objects visible to the C programmer must have
> the same tag? This surprises me, as I thought part of the idea was to assign
> tags randomly.
> 

No, I mean that a single block of memory must have a single tag, so if
the regions overlap, they must have the same tag.  If they don't
overlap, then the tags can be different, but then the length check would
indicate that as well.

memcpy/memmove aren't expected to copy the tag, since you might be
copying some data into the middle of another block.  All hell would
likely break out if they tried to do that.

R.


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 18:05             ` Paul Eggert
  2020-06-15 18:14               ` Richard Earnshaw (lists)
@ 2020-06-15 18:41               ` Szabolcs Nagy
  2020-06-15 19:18                 ` H.J. Lu
  1 sibling, 1 reply; 47+ messages in thread
From: Szabolcs Nagy @ 2020-06-15 18:41 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Richard Earnshaw (lists), H.J. Lu, GNU C Library

The 06/15/2020 11:05, Paul Eggert wrote:
> On 6/15/20 9:51 AM, Richard Earnshaw (lists) wrote:
> > In practice it will work because objects passed to memmove will have to
> > have a single colour,
> 
> Does this mean all stack and heap objects visible to the C programmer must have
> the same tag? This surprises me, as I thought part of the idea was to assign
> tags randomly.

the check works for non-overlapping objects with
or without tagging the same way, so different
heap allocations can have different color.

for overlapping objects the pointers must have
the same tag for the check to work, so single
heap allocations must have a single color.
this is guaranteed by the proposed design.

(in case of heap allocation it would be
difficult to do otherwise, but e.g. stack
tagging could try to color different fields
in a struct differently and then memmove
would fail to detect an overlapping copy).
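
The three cases above can be sketched with plain integers standing in for
pointer values. This is only an illustration: with_tag and
picks_forward_copy are hypothetical helpers, not glibc functions, and the
tag is assumed to sit in bits 56..59 of the address as it does with MTE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: put a 4-bit tag in bits 56..59 of an address,
   which is where MTE keeps the logical tag of a pointer.  */
static uint64_t
with_tag (uint64_t addr, unsigned tag)
{
  assert (tag < 16);
  return (addr & ~(UINT64_C (0xf) << 56)) | ((uint64_t) tag << 56);
}

/* The test quoted from memmove: non-zero means the forward copy is
   chosen, zero means the backward copy is chosen.  */
static int
picks_forward_copy (uint64_t dstp, uint64_t srcp, size_t len)
{
  return dstp - srcp >= len;	/* *Unsigned* compare!  */
}
```

With these helpers, an overlapping copy inside one allocation (same tag,
dst 8 bytes above src, len 16) selects the backward copy as intended. Two
separate allocations with different tags select the forward copy, which is
harmless because they cannot overlap. If two overlapping regions carried
different tags, the difference would be dominated by the tag bits and the
forward copy would be selected even though the regions overlap.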


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 18:41               ` Szabolcs Nagy
@ 2020-06-15 19:18                 ` H.J. Lu
  2020-06-16  8:14                   ` Szabolcs Nagy
  2020-06-16  9:36                   ` Richard Earnshaw (lists)
  0 siblings, 2 replies; 47+ messages in thread
From: H.J. Lu @ 2020-06-15 19:18 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: Paul Eggert, Richard Earnshaw (lists), GNU C Library

On Mon, Jun 15, 2020 at 11:41 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>
> The 06/15/2020 11:05, Paul Eggert wrote:
> > On 6/15/20 9:51 AM, Richard Earnshaw (lists) wrote:
> > > In practice it will work because objects passed to memmove will have to
> > > have a single colour,
> >
> > Does this mean all stack and heap objects visible to the C programmer must have
> > the same tag? This surprises me, as I thought part of the idea was to assign
> > tags randomly.
>
> the check works for non-overlapping objects with
> or without tagging the same way, so different
> heap allocations can have different color.
>
> for overlapping objects the pointers must have
> the same tag for the check to work, so single
> heap allocations must have a single color.
> this is guaranteed by the proposed design.
>
> (in case of heap allocation it would be
> difficult to do otherwise, but e.g. stack
> tagging could try to color different fields
> in a struct differently and then memmove
> would fail to detect an overlapping copy).

I think we need a marker to indicate an object is MTE compatible.

-- 
H.J.


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 19:18                 ` H.J. Lu
@ 2020-06-16  8:14                   ` Szabolcs Nagy
  2020-06-16 13:31                     ` H.J. Lu
  2020-06-16  9:36                   ` Richard Earnshaw (lists)
  1 sibling, 1 reply; 47+ messages in thread
From: Szabolcs Nagy @ 2020-06-16  8:14 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Paul Eggert, Richard Earnshaw (lists), GNU C Library

The 06/15/2020 12:18, H.J. Lu wrote:
> I think we need a marker to indicate an object is MTE compatible.

i expect users to only be able to discover tag-unsafety
issues in applications at runtime and even if there
is static information in binaries that's usually too
late to do anything useful about it (other than fail).

so i think an initial implementation that is off by
default but can be turned on with an env var makes
sense, but i agree that to turn tagging on by default
some markings will be needed such that tag-unsafe
applications continue to work (if possible).

i think a marking design for heap tagging has to at
least consider:

1 malloc can be interposed and then tagging is not a
  libc internal decision. so i think we would have to
  expose the markings to user code (e.g. like hwcaps)
  or have an opt-in libc api for malloc implementors
  to request tagging which can fail in the presence
  of unmarked modules.

2 software components other than malloc may choose
  to use tagging on memory that they manage: tagging
  can be controlled per page on aarch64, the only
  global settings are: enable syscall abi to take
  tagged pointers, select the tag checking behaviour
  (e.g. SIGSEGV on mismatch) and select the behaviour
  of the random tag generator instruction. The global
  settings may need coordination, but the ability to
  use tagging (locally) should not be disabled based
  on markings (of other modules).

3 users will want to force the tagging on at runtime
  even if their code is not tagging safe: this can
  be used for debugging unmodified binaries, and
  hitting a tagging issue is a dynamic property not
  statically determined, i.e. using tagging with
  code that's not generally tagging safe may work.
  the nice thing about heap tagging is that it's a
  runtime decision, no recompilation is needed, we
  should not ruin this.

4 there is tag-unsafe code that makes assumptions
  about pointer representation (e.g. stores things in
  the top bits of pointers) passing tagged pointers
  to such a module does not work with whatever tag
  checking setting, but there is tag-unsafe code that
  makes assumptions about the granularity of memory
  protection (e.g. assumes out of bound read is ok if
  it does not cross a page boundary) which is fine
  with tagged pointers, it just does not want to fault
  (e.g. tag checking can have a monitoring mode where
  tag mismatch is counted by the kernel per process
  and not cause runtime behaviour changes), so if we
  have markings then we need different marking for
  "compatible with tagged pointers" and "compatible
  with faulting tag checking mode".
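
as an illustration of the first kind of tag-unsafe code in point 4, here is
a minimal sketch of the "store things in the top bits" idiom, using plain
integers in place of pointers. stash_flag and strip_flag are hypothetical
names made up for this example, and the top byte is assumed to be bits
56..63.

```c
#include <assert.h>
#include <stdint.h>

#define TOP_BYTE_SHIFT 56
#define ADDR_MASK ((UINT64_C (1) << TOP_BYTE_SHIFT) - 1)

/* Tag-unsafe idiom: hide an 8-bit flag in the top byte of a pointer
   value, overwriting whatever was there before.  */
static uint64_t
stash_flag (uint64_t ptr_bits, unsigned flag)
{
  assert (flag < 256);
  return (ptr_bits & ADDR_MASK) | ((uint64_t) flag << TOP_BYTE_SHIFT);
}

/* Recover the "pointer" by clearing the top byte.  Any allocation tag
   that was stored in those bits is lost.  */
static uint64_t
strip_flag (uint64_t stashed)
{
  return stashed & ADDR_MASK;
}
```

if the value handed in already carries a tag in its top byte, the round
trip through stash_flag and strip_flag returns an untagged address, which
no longer matches the tag of the memory it points to.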

do you think such static marking design can be
introduced later? what should the compiler do?
(in principle a compiler can generate tag-unsafe code
for conforming c code currently, but that's unlikely,
it's much more likely that the source code is doing
something non-portable, but that's hard to find out
in the compiler)


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 19:18                 ` H.J. Lu
  2020-06-16  8:14                   ` Szabolcs Nagy
@ 2020-06-16  9:36                   ` Richard Earnshaw (lists)
  2020-06-16 13:37                     ` H.J. Lu
  1 sibling, 1 reply; 47+ messages in thread
From: Richard Earnshaw (lists) @ 2020-06-16  9:36 UTC (permalink / raw)
  To: H.J. Lu, Szabolcs Nagy; +Cc: GNU C Library

On 15/06/2020 20:18, H.J. Lu via Libc-alpha wrote:
> On Mon, Jun 15, 2020 at 11:41 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>>
>> The 06/15/2020 11:05, Paul Eggert wrote:
>>> On 6/15/20 9:51 AM, Richard Earnshaw (lists) wrote:
>>>> In practice it will work because objects passed to memmove will have to
>>>> have a single colour,
>>>
>>> Does this mean all stack and heap objects visible to the C programmer must have
>>> the same tag? This surprises me, as I thought part of the idea was to assign
>>> tags randomly.
>>
>> the check works for non-overlapping objects with
>> or without tagging the same way, so different
>> heap allocations can have different color.
>>
>> for overlapping objects the pointers must have
>> the same tag for the check to work, so single
>> heap allocations must have a single color.
>> this is guaranteed by the proposed design.
>>
>> (in case of heap allocation it would be
>> difficult to do otherwise, but e.g. stack
>> tagging could try to color different fields
>> in a struct differently and then memmove
>> would fail to detect an overlapping copy).
> 
> I think we need a marker to indicate an object is MTE compatible.
> 

I think that's inverted.  Firstly, we would then need to recompile
everything in order to use MTE; that's highly undesirable, especially
when it's largely unnecessary.  Secondly, how would a compiler know?  It
generally can't see when a programmer isn't conforming fully to the
standard - so you'd have to trust the user and just put in the tag.
Quite frankly, that's just silly.

There are a small number of programs that do violate the principles that
underlie tagging - sadly Python's memory management code is one of
them - it has objects that are a mix of malloc'd objects and objects
it's doled out from its own heap, but it assumes it can just grub around
in a memory page to find out which - yuck.  Ideally, these would be
marked in some way so that we never enabled MTE in those cases.  But at
present, I don't see that as urgent.

Remember, MTE is primarily a *debugging* aid, intended to help identify
issues such as buffer overruns and other allocation issues such as
use-after-free or double free.  It doesn't have to be on all the time,
so if some programs won't work with it that's not a fatal issue - just
don't turn it on in that case.

R.


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
                   ` (10 preceding siblings ...)
  2020-06-15 17:04 ` DJ Delorie
@ 2020-06-16 10:16 ` Szabolcs Nagy
  2020-06-16 13:44   ` Florian Weimer
  2020-06-23 13:22 ` Szabolcs Nagy
  12 siblings, 1 reply; 47+ messages in thread
From: Szabolcs Nagy @ 2020-06-16 10:16 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: libc-alpha

The 06/15/2020 15:40, Richard Earnshaw wrote:
> 2) Tests that assume that malloc_usable_size will return a specific
> amount of free space.  The assumptions are not correct, because the
> tag colouring boundaries needed for MTE means that the 8 bytes in the
> block containing the back pointer can no-longer be used by users when
> we have MTE (they have a different colour that belongs to the malloc
> data structures).

with --enable-memory-tagging i see

FAIL: malloc/tst-malloc-usable
FAIL: malloc/tst-malloc-usable-static
FAIL: malloc/tst-malloc-usable-static-tunables
FAIL: malloc/tst-malloc-usable-tunables

malloc_usable_size(malloc(7)) is 16 with
MALLOC_CHECK_=0 and it's 0 with MALLOC_CHECK_=3.

i think this breaks existing usage, so either
malloc check should be disabled if memory tagging
is enabled or fixed to be compatible.
(or at least the issue should be documented)


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 17:09   ` Richard Earnshaw
  2020-06-15 17:22     ` DJ Delorie
@ 2020-06-16 10:17     ` Richard Earnshaw
  2020-06-16 10:31       ` Szabolcs Nagy
  2020-06-16 15:31       ` DJ Delorie
  1 sibling, 2 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-16 10:17 UTC (permalink / raw)
  To: DJ Delorie, Richard Earnshaw; +Cc: libc-alpha

On 15/06/2020 18:09, Richard Earnshaw wrote:
> On 15/06/2020 18:04, DJ Delorie via Libc-alpha wrote:
>>
>> Two immediate thoughts...
>>
>> 1. Do we really want to add more environment variables as aliases for
>>    new tunables?  I thought env support was for pre-tunable variable
>>    support (compatibility) only.
> 
> That might depend on whether we want to try to share how this is enabled
> with other C libraries - we can't expect them to copy all of glibcs
> tunable API here.
> 
> That being said, this is easy enough to change if needed.
> 
>>
>> 2. Do we really need to lose the back pointer's word in allocated
>>    memory?  Historically, the back pointer is *not* part of the malloc
>>    internal data when the chunk is in 'allocated' state, and losing that
>>    memory will make small allocations much less efficient.
>>
> 
> Yes, if you want to protect the back pointer against being trampled by
> programs - it has to have a different tag colour to memory given to the
> application.
> 
> R.
> 

Your second comment made me go back and look again at the assumptions
I've made.  I'm pretty sure they hold.

Taking the comment from the malloc code (the labels on the right are
mine to clarify the following text)...


    An allocated chunk looks like this:


    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
	    | Size of previous chunk, if unallocated (P clear)  | (1)
	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
	    | Size of chunk, in bytes                     |A|M|P| (2)
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
	    | User data starts here...                          . (3)
	    .                                                   .
	    . (malloc_usable_size() bytes)                      .
	    .                                                   |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
	    | (size of chunk, but used for application data)    | (4)
	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
	    | Size of next chunk, in bytes                |A|0|1| (5)
	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Now, when we map MTE onto that we have to look at the granules, which
are the minimal units of memory that must have a single colour.  In
aarch64 that's a 16 byte chunk and maps quite happily onto the chunk
header.  So, (assuming I've understood all this correctly :) the
chunk header (labels 1 & 2 and 4 & 5 above) is 16 bytes long and 16-byte
aligned - perfect!  But what we can't do is allow the 8 bytes at the
start of nextchunk (4) to be used in the previous allocation block,
since to do that we'd have to assign it the colour of the user data (3)
and it can't have that colour unless the next chunk size (5) also has
that colour - and if we did that then malloc's own data structures would
no longer be coloured differently to the user data.

A complete rewrite of malloc to use an out-of-band chunk list would
probably address the wastage, but I really wanted to avoid that... :)
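
The size arithmetic behind this can be sketched as follows.  This is a
simplified model and not the code in the patch: the names and constants
are assumptions for a 64-bit target with 8-byte size words, 16-byte chunk
alignment, a 16-byte tag granule and a 32-byte minimum chunk.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed 64-bit parameters; hypothetical names for this sketch.  */
#define SIZE_SZ   ((size_t) 8)	/* one size word, e.g. (1) or (2) */
#define ALIGN     ((size_t) 16)	/* chunk alignment */
#define GRANULE   ((size_t) 16)	/* MTE tag granule */
#define MIN_CHUNK ((size_t) 32)	/* smallest chunk */

static size_t
round_up (size_t n, size_t to)
{
  assert (to != 0 && (to & (to - 1)) == 0);	/* power of two only */
  return (n + to - 1) & ~(to - 1);
}

/* Without tagging, word (4) of the next chunk is lent to the user, so
   only one size word of the chunk is overhead.  */
static size_t
usable_untagged (size_t request)
{
  size_t chunk = round_up (request + SIZE_SZ, ALIGN);
  if (chunk < MIN_CHUNK)
    chunk = MIN_CHUNK;
  return chunk - SIZE_SZ;
}

/* With tagging, user data must end on a granule boundary and words
   (4) and (5) both keep malloc's colour, so two size words are
   overhead.  */
static size_t
usable_tagged (size_t request)
{
  size_t chunk = round_up (request, GRANULE) + 2 * SIZE_SZ;
  if (chunk < MIN_CHUNK)
    chunk = MIN_CHUNK;
  return chunk - 2 * SIZE_SZ;
}
```

Under these assumptions a 7-byte request has 24 usable bytes without
tagging and 16 with it, and a 24-byte request grows from a 32-byte chunk
to a 48-byte chunk.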

R.


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16 10:17     ` Richard Earnshaw
@ 2020-06-16 10:31       ` Szabolcs Nagy
  2020-06-16 11:07         ` Richard Earnshaw
  2020-06-16 15:31       ` DJ Delorie
  1 sibling, 1 reply; 47+ messages in thread
From: Szabolcs Nagy @ 2020-06-16 10:31 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: DJ Delorie, Richard Earnshaw, libc-alpha

The 06/16/2020 11:17, Richard Earnshaw wrote:
> On 15/06/2020 18:09, Richard Earnshaw wrote:
> > On 15/06/2020 18:04, DJ Delorie via Libc-alpha wrote:
> >>
> >> Two immediate thoughts...
> >>
> >> 1. Do we really want to add more environment variables as aliases for
> >>    new tunables?  I thought env support was for pre-tunable variable
> >>    support (compatibility) only.
> > 
> > That might depend on whether we want to try to share how this is enabled
> > with other C libraries - we can't expect them to copy all of glibcs
> > tunable API here.
> > 
> > That being said, this is easy enough to change if needed.
> > 
> >>
> >> 2. Do we really need to lose the back pointer's word in allocated
> >>    memory?  Historically, the back pointer is *not* part of the malloc
> >>    internal data when the chunk is in 'allocated' state, and losing that
> >>    memory will make small allocations much less efficient.
> >>
> > 
> > Yes, if you want to protect the back pointer against being trampled by
> > programs - it has to have a different tag colour to memory given to the
> > application.
> > 
> > R.
> > 
> 
> Your second comment made me go back and look again at the assumptions
> I've made.  I'm pretty sure they hold.
> 
> Taking the comment from the malloc code (the labels on the right are
> mine to clarify the following text)...
> 
> 
>     An allocated chunk looks like this:
> 
> 
>     chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 	    | Size of previous chunk, if unallocated (P clear)  | (1)
> 	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 	    | Size of chunk, in bytes                     |A|M|P| (2)
>       mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 	    | User data starts here...                          . (3)
> 	    .                                                   .
> 	    . (malloc_usable_size() bytes)                      .
> 	    .                                                   |
> nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 	    | (size of chunk, but used for application data)    | (4)
> 	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 	    | Size of next chunk, in bytes                |A|0|1| (5)
> 	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 
> Now, when we map MTE onto that we have to look at the granules, which
> are the minimal units of memory that must have a single colour.  In
> aarch64 that's a 16 byte chunk and maps quite happily onto the chunk
> header.  So, (assuming I've understood all this correctly :) the
> chunk header (labels 1 & 2 and 4 & 5 above) is 16 bytes long and 16-byte
> aligned - perfect!  But what we can't do is allow the 8 bytes at the
> start of nextchunk (4) to be used in the previous allocation block,
> since to do that we'd have to assign it the colour of the user data (3)
> and it can't have that colour unless the next chunk size (5) also has
> that colour - and if we did that then malloc's own data structures would
> no-longer be coloured differently to the user data.

it is also possible to always get the right tag
when accessing data in user allocation. e.g.
instead of

size = *p;

use

size = *get_correctly_tagged_ptr(p);

but this is a bit awkward and can be unsafe (the
meta data inside user allocation is not protected
via tagging) and does not work in case tags can
change concurrently (there is no 'unchecked load'
in the architecture only separate 'get correct tag'
and 'checked load') e.g. a free running concurrently
with an operation on the next chunk that for some
reason looks at this field of the prev chunk. or if
we want to allow users to retag their allocated
memory (sub allocators).
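
a rough software model of that "get correct tag, then checked load"
sequence is sketched below. it does not use real tagged memory: the tag
table, sim_ptr and checked_load are stand-ins invented for this example,
and get_correctly_tagged_ptr is only modelled on the name used above. on
real hardware the tag lookup would be a single instruction rather than a
table read.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE    16u
#define N_GRANULES 4u

/* Simulated memory: one 4-bit allocation tag per 16-byte granule.  */
static uint8_t granule_tag[N_GRANULES];
static uint64_t words[N_GRANULES * 2];	/* two 8-byte words per granule */

/* Simulated pointer: byte offset in the low bits, logical tag in
   bits 56..59.  */
typedef uint64_t sim_ptr;

static unsigned ptr_tag (sim_ptr p) { return (p >> 56) & 0xf; }
static uint64_t ptr_off (sim_ptr p) { return p & ((UINT64_C (1) << 56) - 1); }

/* Stand-in for 'get correct tag': re-tag P to match the memory.  */
static sim_ptr
get_correctly_tagged_ptr (sim_ptr p)
{
  uint64_t off = ptr_off (p);
  assert (off < GRANULE * N_GRANULES);
  return off | ((uint64_t) granule_tag[off / GRANULE] << 56);
}

/* Stand-in for a checked load: sets *FAULT and returns 0 on a tag
   mismatch, otherwise returns the 8-byte word at P.  */
static uint64_t
checked_load (sim_ptr p, int *fault)
{
  uint64_t off = ptr_off (p);
  assert (off < GRANULE * N_GRANULES);
  *fault = ptr_tag (p) != granule_tag[off / GRANULE];
  return *fault ? 0 : words[off / 8];
}
```

in this model the two steps are separate calls, so another thread could
change granule_tag between them. that is the concurrency problem
described above.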

> 
> A complete rewrite of malloc to use an out-of-band chunk list would
> probably address the wastage, but I really wanted to avoid that... :)
> 
> R.

-- 


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16 10:31       ` Szabolcs Nagy
@ 2020-06-16 11:07         ` Richard Earnshaw
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-16 11:07 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: DJ Delorie, Richard Earnshaw, libc-alpha

On 16/06/2020 11:31, Szabolcs Nagy wrote:
> The 06/16/2020 11:17, Richard Earnshaw wrote:
>> On 15/06/2020 18:09, Richard Earnshaw wrote:
>>> On 15/06/2020 18:04, DJ Delorie via Libc-alpha wrote:
>>>>
>>>> Two immediate thoughts...
>>>>
>>>> 1. Do we really want to add more environment variables as aliases for
>>>>    new tunables?  I thought env support was for pre-tunable variable
>>>>    support (compatibility) only.
>>>
>>> That might depend on whether we want to try to share how this is enabled
>>> with other C libraries - we can't expect them to copy all of glibc's
>>> tunable API here.
>>>
>>> That being said, this is easy enough to change if needed.
>>>
>>>>
>>>> 2. Do we really need to lose the back pointer's word in allocated
>>>>    memory?  Historically, the back pointer is *not* part of the malloc
>>>>    internal data when the chunk is in 'allocated' state, and losing that
>>>>    memory will make small allocations much less efficient.
>>>>
>>>
>>> Yes, if you want to protect the back pointer against being trampled by
>>> programs - it has to have a different tag colour to memory given to the
>>> application.
>>>
>>> R.
>>>
>>
>> Your second comment made me go back and look again at the assumptions
>> I've made.  I'm pretty sure they hold.
>>
>> Taking the comment from the malloc code (the labels on the right are
>> mine to clarify the following text)...
>>
>>
>>     An allocated chunk looks like this:
>>
>>
>>     chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> 	    | Size of previous chunk, if unallocated (P clear)  | (1)
>> 	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> 	    | Size of chunk, in bytes                     |A|M|P| (2)
>>       mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> 	    | User data starts here...                          . (3)
>> 	    .                                                   .
>> 	    . (malloc_usable_size() bytes)                      .
>> 	    .                                                   |
>> nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> 	    | (size of chunk, but used for application data)    | (4)
>> 	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> 	    | Size of next chunk, in bytes                |A|0|1| (5)
>> 	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>
>> Now, when we map MTE onto that we have to look at the granules, which
>> are the minimal units of memory that must have a single colour.  In
>> aarch64 that's a 16 byte chunk and maps quite happily onto the chunk
>> header.  So, (assuming I've understood all this correctly :) the
>> chunk header (labels 1 & 2 and 4 & 5 above) is 16 bytes long and 16-byte
>> aligned - perfect!  But what we can't do is allow the 8 bytes at the
>> start of nextchunk (4) to be used in the previous allocation block,
>> since to do that we'd have to assign it the colour of the user data (3)
>> and it can't have that colour unless the next chunk size (5) also has
>> that colour - and if we did that then malloc's own data structures would
>> no-longer be coloured differently to the user data.
> 
> it is also possible to always get the right tag
> when accessing data in user allocation. e.g.
> instead of
> 
> size = *p;
> 
> use
> 
> size = *get_correctly_tagged_ptr(p);
> 
> but this is a bit awkward and can be unsafe (the
> meta data inside user allocation is not protected
> via tagging) and does not work in case tags can
> change concurrently (there is no 'unchecked load'
> in the architecture only separate 'get correct tag'
> and 'checked load') e.g. a free running concurrently
> with an operation on the next chunk that for some
> reason looks at this field of the prev chunk. or if
> we want to allow users to retag their allocated
> memory (sub allocators).
> 

I currently haven't exported any of the memory tagging operations from
glibc (the new functions are internal only for now).  That's a
deliberate design choice at this time to avoid exposing an API until we
are confident that such operations are correct and desirable.

In terms of colouring the meta-data with the user's colour - it might be
possible, but it might have race issues as you say, and it would
certainly weaken the protection given by MTE, since a small buffer
overrun would corrupt the meta-data and might go undetected.  It's a
trade-off between protection and efficiency.

R.

>>
>> A complete rewrite of malloc to use an out-of-band chunk list would
>> probably address the wastage, but I really wanted to avoid that... :)
>>
>> R.
> 


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16  8:14                   ` Szabolcs Nagy
@ 2020-06-16 13:31                     ` H.J. Lu
  2020-06-16 14:14                       ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 47+ messages in thread
From: H.J. Lu @ 2020-06-16 13:31 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: Paul Eggert, Richard Earnshaw (lists), GNU C Library

On Tue, Jun 16, 2020 at 1:14 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>
> The 06/15/2020 12:18, H.J. Lu wrote:
> > I think we need a marker to indicate an object is MTE compatible.
>
> i expect users to only be able to discover tag-unsafety
> issues in applications at runtime and even if there
> is static information in binaries that's usually too
> late to do anything useful about it (other than fail).
>
> so i think an initial implementation that is off by
> default but can be turned on with an env var makes
> sense, but i agree that to turn tagging on by default
> some markings will be needed such that tag-unsafe
> applications continue to work (if possible).

MTE isn't just another debugging tool.  Otherwise, we wouldn't
want it in glibc.   Since we are enabling MTE in some object files,
at least we should mark them as MTE enabled.

> i think a marking design for heap tagging has to at
> least consider:
>
> 1 malloc can be interposed and then tagging is not a
>   libc internal decision. so i think we would have to
>   expose the markings to user code (e.g. like hwcaps)
>   or have an opt-in libc api for malloc implementors
>   to request tagging which can fail in the presence
>   of unmarked modules.

MTE marker can be in the GNU_PROPERTY segment.

> 2 software components other than malloc may choose
>   to use tagging on memory that they manage: tagging
>   can be controlled per page on aarch64, the only
>   global settings are: enable syscall abi to take
>   tagged pointers, select the tag checking behaviour
>   (e.g. SIGSEGV on mismatch) and select the behaviour
>   of the random tag generator instruction. The global
>   settings may need coordination, but the ability to
>   use tagging (locally) should not be disabled based
>   on markings (of other modules).

How to use the MTE marker info is up to the runtime.

> 3 users will want to force the tagging on at runtime
>   even if their code is not tagging safe: this can
>   be used for debugging unmodified binaries, and
>   hitting a tagging issue is a dynamic property not
>   statically determined, i.e. using tagging with
>   code that's not generally tagging safe may work.
>   the nice thing about heap tagging is that it's a
>   runtime decision, no recompilation is needed, we
>   should not ruin this.

This sounds like

GLIBC_TUNABLES=glibc.cpu.x86_shstk=on

> 4 there is tag-unsafe code that makes assumptions
>   about pointer representation (e.g. stores things in
>   the top bits of pointers) passing tagged pointers
>   to such a module does not work with whatever tag
>   checking setting, but there is tag-unsafe code that
>   makes assumptions about the granularity of memory
>   protection (e.g. assumes out of bound read is ok if
>   it does not cross a page boundary) which is fine
>   with tagged pointers, it just does not want to fault
>   (e.g. tag checking can have a monitoring mode where
>   tag mismatch is counted by the kernel per process
>   and not cause runtime behaviour changes), so if we
>   have markings then we need different marking for
>   "compatible with tagged pointers" and "compatible
>   with faulting tag checking mode".

It sounds like IBT and SHSTK in CET.

> do you think such static marking design can be

I think we should do it now.

> introduced later? what should the compiler do?
> (in principle a compiler can generate tag-unsafe code
> for conforming c code currently, but that's unlikely,
> it's much more likely that the source code is doing
> something non-portable, but that's hard to find out
> in the compiler)

Compiler should try to detect it, like -Wstringop-overflow.
Initially compilers can add a marker only via a command-line
option.

-- 
H.J.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16  9:36                   ` Richard Earnshaw (lists)
@ 2020-06-16 13:37                     ` H.J. Lu
  0 siblings, 0 replies; 47+ messages in thread
From: H.J. Lu @ 2020-06-16 13:37 UTC (permalink / raw)
  To: Richard Earnshaw (lists); +Cc: Szabolcs Nagy, GNU C Library

On Tue, Jun 16, 2020 at 2:36 AM Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
>
> On 15/06/2020 20:18, H.J. Lu via Libc-alpha wrote:
> > On Mon, Jun 15, 2020 at 11:41 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
> >>
> >> The 06/15/2020 11:05, Paul Eggert wrote:
> >>> On 6/15/20 9:51 AM, Richard Earnshaw (lists) wrote:
> >>>> In practice it will work because objects passed to memmove will have to
> >>>> have a single colour,
> >>>
> >>> Does this mean all stack and heap objects visible to the C programmer must have
> >>> the same tag? This surprises me, as I thought part of the idea was to assign
> >>> tags randomly.
> >>
> >> the check works for non-overlapping objects with
> >> or without tagging the same way, so different
> >> heap allocations can have different color.
> >>
> >> for overlapping objects the pointers must have
> >> the same tag for the check to work, so single
> >> heap allocations must have a single color.
> >> this is guaranteed by the proposed design.
> >>
> >> (in case of heap allocation it would be
> >> difficult to do otherwise, but e.g. stack
> >> tagging could try to color different fields
> >> in a struct differently and then memmove
> >> would fail to detect an overlapping copy).
> >
> > I think we need a marker to indicate an object is MTE compatible.
> >
>
> I think that's inverted.  Firstly, we would then need to recompile
> everything in order to use MTE; that's highly undesirable, especially

Without the marker, MTE is just another debug tool which I don't think
glibc needs.  With the marker, MTE can be used to protect glibc.

> when it's largely unnecessary.  Secondly, how would a compiler know?  It
> generally can't see when a programmer isn't conforming fully to the
> standard - so you'd have to trust the user and just put in the tag.
> Quite frankly, that's just silly.

Initially the marker needs to be added explicitly by programmers, just
like we are enabling MTE in some glibc files.

> There are a small number of programs that do violate the principles that
> underlie tagging - sadly Python's memory management code is one of
> them - it has objects that are a mix of malloc'd objects and objects
> it's doled out from its own heap, but it assumes it can just grub around
> in a memory page to find out which - yuck.  Ideally, these would be
> marked in some way so that we never enabled MTE in those cases.  But at
> present, I don't see that as urgent.
>
> Remember, MTE is primarily a *debugging* aid, intended to help identify
> issues such as buffer overruns and other allocation issues such as
> use-after free or double free.  It doesn't have to be on all the time,
> so if some programs won't work with it that's not a fatal issue - just
> don't turn it on in that case.
>

But, MTE can be used to protect glibc. We may not be able to do it
today.  But an MTE marker can help us get there.

-- 
H.J.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16 10:16 ` Szabolcs Nagy
@ 2020-06-16 13:44   ` Florian Weimer
  2020-06-16 14:06     ` Richard Earnshaw
  0 siblings, 1 reply; 47+ messages in thread
From: Florian Weimer @ 2020-06-16 13:44 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: Richard Earnshaw, libc-alpha

* Szabolcs Nagy:

> The 06/15/2020 15:40, Richard Earnshaw wrote:
>> 2) Tests that assume that malloc_usable_size will return a specific
>> amount of free space.  The assumptions are not correct, because the
>> tag colouring boundaries needed for MTE means that the 8 bytes in the
>> block containing the back pointer can no-longer be used by users when
>> we have MTE (they have a different colour that belongs to the malloc
>> data structures).
>
> with --enable-memory-tagging i see
>
> FAIL: malloc/tst-malloc-usable
> FAIL: malloc/tst-malloc-usable-static
> FAIL: malloc/tst-malloc-usable-static-tunables
> FAIL: malloc/tst-malloc-usable-tunables
>
> malloc_usable_size(malloc(7)) is 16 with
> MALLOC_CHECK_=0 and it's 0 with MALLOC_CHECK_=3.
>
> i think this breaks existing usage, so either
> malloc check should be disabled if memory tagging
> is enabled or fixed to be compatible.
> (or at least the issue should be documented)

I'm with Richard here—this is an incorrect test expectation, not a bug
in the implementation.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16 13:44   ` Florian Weimer
@ 2020-06-16 14:06     ` Richard Earnshaw
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Earnshaw @ 2020-06-16 14:06 UTC (permalink / raw)
  To: Florian Weimer, Szabolcs Nagy; +Cc: libc-alpha, Richard Earnshaw

On 16/06/2020 14:44, Florian Weimer via Libc-alpha wrote:
> * Szabolcs Nagy:
> 
>> The 06/15/2020 15:40, Richard Earnshaw wrote:
>>> 2) Tests that assume that malloc_usable_size will return a specific
>>> amount of free space.  The assumptions are not correct, because the
>>> tag colouring boundaries needed for MTE means that the 8 bytes in the
>>> block containing the back pointer can no-longer be used by users when
>>> we have MTE (they have a different colour that belongs to the malloc
>>> data structures).
>>
>> with --enable-memory-tagging i see
>>
>> FAIL: malloc/tst-malloc-usable
>> FAIL: malloc/tst-malloc-usable-static
>> FAIL: malloc/tst-malloc-usable-static-tunables
>> FAIL: malloc/tst-malloc-usable-tunables
>>
>> malloc_usable_size(malloc(7)) is 16 with
>> MALLOC_CHECK_=0 and it's 0 with MALLOC_CHECK_=3.
>>
>> i think this breaks existing usage, so either
>> malloc check should be disabled if memory tagging
>> is enabled or fixed to be compatible.
>> (or at least the issue should be documented)
> 
> I'm with Richard here—this is an incorrect test expectation, not a bug
> in the implementation.
> 
> Thanks,
> Florian
> 

Actually, I think there is a real issue that I have to solve: the usable
size should never be less than the allocation request.

The problem is that I round down the allocation aggressively in
malloc_usable_size and that does not account for the MALLOC_CHECK case
where the overall size is reduced by one to allow for the magic cookie
marker.

When MTE is enabled, I'm not sure it makes too much sense to also enable
MALLOC_CHECK: it does essentially the same thing, but less well.  But
when it is off (and the library is configured to support MTE), it does
need to work as expected.

I'm still thinking about this case.  One option is to put the cookie
inside the word that we no-longer hand out to the user.  It's harmless
to put it there and it avoids wasting even more space, even if the user
normally can't overrun it when MTE is on.
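
The arithmetic behind the 16 and 0 reported above can be sketched as
follows.  This is a simplified model, not the code from the patch: it
assumes a 64-bit target with 8-byte header words, a 32-byte chunk for
malloc(7), and that the cookie costs one byte before the size is rounded
down to a granule.  The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SIZE_SZ 8u    /* one header word on a 64-bit target (assumed) */
#define GRANULE 16u   /* MTE granule size on aarch64 */

/* Usable bytes when tagging is configured in: the two-word header is
   excluded and the first word of the next chunk is not lent out.  */
size_t
usable_plain (size_t chunk_size)
{
  return chunk_size - 2 * SIZE_SZ;
}

/* MALLOC_CHECK_ modelled as taking one user byte for the cookie, with
   the result then rounded down to a whole granule.  */
size_t
usable_cookie_in_user_area (size_t chunk_size)
{
  size_t n = usable_plain (chunk_size) - 1;
  return n & ~(size_t) (GRANULE - 1);
}

/* The option under consideration: keep the cookie in the spare word at
   the start of the next chunk, so no user byte is lost.  */
size_t
usable_cookie_in_spare_word (size_t chunk_size)
{
  return usable_plain (chunk_size);
}
```

Under these assumptions a 32-byte chunk gives 16 usable bytes without
the cookie, 0 with the cookie in the user area (less than the 7 bytes
requested), and 16 again with the cookie moved into the spare word.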

R.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16 13:31                     ` H.J. Lu
@ 2020-06-16 14:14                       ` Richard Earnshaw (lists)
  2020-06-16 14:27                         ` H.J. Lu
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Earnshaw (lists) @ 2020-06-16 14:14 UTC (permalink / raw)
  To: H.J. Lu, Szabolcs Nagy; +Cc: GNU C Library

On 16/06/2020 14:31, H.J. Lu via Libc-alpha wrote:
> On Tue, Jun 16, 2020 at 1:14 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>>
>> The 06/15/2020 12:18, H.J. Lu wrote:
>>> I think we need a marker to indicate an object is MTE compatible.
>>
>> i expect users to only be able to discover tag-unsafety
>> issues in applications at runtime and even if there
>> is static information in binaries that's usually too
>> late to do anything useful about it (other than fail).
>>
>> so i think an initial implementation that is off by
>> default but can be turned on with an env var makes
>> sense, but i agree that to turn tagging on by default
>> some markings will be needed such that tag-unsafe
>> applications continue to work (if possible).
> 
> MTE isn't just another debugging tool.  Otherwise, we wouldn't
> want it in glibc.   

That doesn't follow from your claim.  eg we have MALLOC_CHECK - that's a
debugging tool and we have it in glibc.

> Since we are enabling MTE in some object files,
> at least we should mark them as MTE enabled.

When we start using MTE to colour stack objects, yes, we'll need to mark
object files that use it (they require longjmp to clean up the stack
colouring, for example).  Until then, I disagree.  You do not need to
mark every object file as compatible with *heap* objects that have been
coloured.  If you disagree please provide a concrete example.

> 
>> i think a marking design for heap tagging has to at
>> least consider:
>>
>> 1 malloc can be interposed and then tagging is not a
>>   libc internal decision. so i think we would have to
>>   expose the markings to user code (e.g. like hwcaps)
>>   or have an opt-in libc api for malloc implementors
>>   to request tagging which can fail in the presence
>>   of unmarked modules.
> 
> MTE marker can be in the GNU_PROPERTY segment.
> 
>> 2 software components other than malloc may choose
>>   to use tagging on memory that they manage: tagging
>>   can be controlled per page on aarch64, the only
>>   global settings are: enable syscall abi to take
>>   tagged pointers, select the tag checking behaviour
>>   (e.g. SIGSEGV on mismatch) and select the behaviour
>>   of the random tag generator instruction. The global
>>   settings may need coordination, but the ability to
>>   use tagging (locally) should not be disabled based
>>   on markings (of other modules).
> 
> How to use the MTE marker info is up to the runtime.
> 
>> 3 users will want to force the tagging on at runtime
>>   even if their code is not tagging safe: this can
>>   be used for debugging unmodified binaries, and
>>   hitting a tagging issue is a dynamic property not
>>   statically determined, i.e. using tagging with
>>   code that's not generally tagging safe may work.
>>   the nice thing about heap tagging is that it's a
>>   runtime decision, no recompilation is needed, we
>>   should not ruin this.
> 
> This sounds like
> 
> GLIBC_TUNABLES=glibc.cpu.x86_shstk=on
> 
>> 4 there is tag-unsafe code that makes assumptions
>>   about pointer representation (e.g. stores things in
>>   the top bits of pointers) passing tagged pointers
>>   to such a module does not work with whatever tag
>>   checking setting, but there is tag-unsafe code that
>>   makes assumptions about the granularity of memory
>>   protection (e.g. assumes out of bound read is ok if
>>   it does not cross a page boundary) which is fine
>>   with tagged pointers, it just does not want to fault
>>   (e.g. tag checking can have a monitoring mode where
>>   tag mismatch is counted by the kernel per process
>>   and not cause runtime behaviour changes), so if we
>>   have markings then we need different marking for
>>   "compatible with tagged pointers" and "compatible
>>   with faulting tag checking mode".
> 
> It sounds like IBT and SHSTK in CET.
> 
>> do you think such static marking design can be
> 
> I think we should do it now.
> 
>> introduced later? what should the compiler do?
>> (in principle a compiler can generate tag-unsafe code
>> for conforming c code currently, but that's unlikely,
>> it's much more likely that the source code is doing
>> something non-portable, but that's hard to find out
>> in the compiler)
> 
> Compiler should try to detect it, like -Wstringop-overflow.
> Initially compilers can add a marker only via a command-line
> option.
> 


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16 14:14                       ` Richard Earnshaw (lists)
@ 2020-06-16 14:27                         ` H.J. Lu
  2020-06-16 14:34                           ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 47+ messages in thread
From: H.J. Lu @ 2020-06-16 14:27 UTC (permalink / raw)
  To: Richard Earnshaw (lists); +Cc: Szabolcs Nagy, GNU C Library

On Tue, Jun 16, 2020 at 7:14 AM Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
>
> On 16/06/2020 14:31, H.J. Lu via Libc-alpha wrote:
> > On Tue, Jun 16, 2020 at 1:14 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
> >>
> >> The 06/15/2020 12:18, H.J. Lu wrote:
> >>> I think we need a marker to indicate an object is MTE compatible.
> >>
> >> i expect users to only be able to discover tag-unsafety
> >> issues in applications at runtime and even if there
> >> is static information in binaries that's usually too
> >> late to do anything useful about it (other than fail).
> >>
> >> so i think an initial implementation that is off by
> >> default but can be turned on with an env var makes
> >> sense, but i agree that to turn tagging on by default
> >> some markings will be needed such that tag-unsafe
> >> applications continue to work (if possible).
> >
> > MTE isn't just another debugging tool.  Otherwise, we wouldn't
> > want it in glibc.
>
> That doesn't follow from your claim.  eg we have MALLOC_CHECK - that's a
> debugging tool and we have it in glibc.
>
> > Since we are enabling MTE in some object files,
> > at least we should mark them as MTE enabled.
>
> When we start using MTE to colour stack objects, yes, we'll need to mark
> object files that use it (they require longjmp to clean up the stack
> colouring, for example).  Until then, I disagree.  You do not need to
> mark every object file as compatible with *heap* objects that have been
> coloured.  If you disagree please provide a concrete example.
>

I can imagine MTE is always on by default, starting with glibc.


-- 
H.J.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16 14:27                         ` H.J. Lu
@ 2020-06-16 14:34                           ` Richard Earnshaw (lists)
  2020-06-16 14:52                             ` H.J. Lu
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Earnshaw (lists) @ 2020-06-16 14:34 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Szabolcs Nagy, GNU C Library

On 16/06/2020 15:27, H.J. Lu wrote:
> On Tue, Jun 16, 2020 at 7:14 AM Richard Earnshaw (lists)
> <Richard.Earnshaw@arm.com> wrote:
>>
>> On 16/06/2020 14:31, H.J. Lu via Libc-alpha wrote:
>> > On Tue, Jun 16, 2020 at 1:14 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>> >>
>> >> The 06/15/2020 12:18, H.J. Lu wrote:
>> >>> I think we need a marker to indicate an object is MTE compatible.
>> >>
>> >> i expect users to only be able to discover tag-unsafety
>> >> issues in applications at runtime and even if there
>> >> is static information in binaries that's usually too
>> >> late to do anything useful about it (other than fail).
>> >>
>> >> so i think an initial implementation that is off by
>> >> default but can be turned on with an env var makes
>> >> sense, but i agree that to turn tagging on by default
>> >> some markings will be needed such that tag-unsafe
>> >> applications continue to work (if possible).
>> >
>> > MTE isn't just another debugging tool.  Otherwise, we wouldn't
>> > want it in glibc.
>>
>> That doesn't follow from your claim.  eg we have MALLOC_CHECK - that's a
>> debugging tool and we have it in glibc.
>>
>> > Since we are enabling MTE in some object files,
>> > at least we should mark them as MTE enabled.
>>
>> When we start using MTE to colour stack objects, yes, we'll need to mark
>> object files that use it (they require longjmp to clean up the stack
>> colouring, for example).  Until then, I disagree.  You do not need to
>> mark every object file as compatible with *heap* objects that have been
>> coloured.  If you disagree please provide a concrete example.
>>
> 
> I can imagine MTE is always on by default, starting with glibc.

I think that's unlikely, even with lazy faulting because the HW overhead
is not zero.  At some point, it might be that we'd want to enable it
semi-randomly when running some programs (I've heard of some OSes
planning to do something like that for telemetry type monitoring), but
we're a long way from that; and what's more, we're a long way from
really having a list of what's safe and what's not.  So until then,
arbitrarily marking objects to say they're safe when we don't know that
will just create a marker that has zero use, because we won't be able to
rely on it.

If you're going to have markers, you'd better be 100% sure what it's
telling you is right, or it's not worth the bits you've spent on it.

R.

> 
> 
> -- 
> H.J.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16 14:34                           ` Richard Earnshaw (lists)
@ 2020-06-16 14:52                             ` H.J. Lu
  2020-06-16 15:10                               ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 47+ messages in thread
From: H.J. Lu @ 2020-06-16 14:52 UTC (permalink / raw)
  To: Richard Earnshaw (lists); +Cc: Szabolcs Nagy, GNU C Library

On Tue, Jun 16, 2020 at 7:34 AM Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
>
> On 16/06/2020 15:27, H.J. Lu wrote:
> > On Tue, Jun 16, 2020 at 7:14 AM Richard Earnshaw (lists)
> > <Richard.Earnshaw@arm.com> wrote:
> >>
> >> On 16/06/2020 14:31, H.J. Lu via Libc-alpha wrote:
> >> > On Tue, Jun 16, 2020 at 1:14 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
> >> >>
> >> >> The 06/15/2020 12:18, H.J. Lu wrote:
> >> >>> I think we need a marker to indicate an object is MTE compatible.
> >> >>
> >> >> i expect users to only be able to discover tag-unsafety
> >> >> issues in applications at runtime and even if there
> >> >> is static information in binaries that's usually too
> >> >> late to do anything useful about it (other than fail).
> >> >>
> >> >> so i think an initial implementation that is off by
> >> >> default but can be turned on with an env var makes
> >> >> sense, but i agree that to turn tagging on by default
> >> >> some markings will be needed such that tag-unsafe
> >> >> applications continue to work (if possible).
> >> >
> >> > MTE isn't just another debugging tool.  Otherwise, we wouldn't
> >> > want it in glibc.
> >>
> >> That doesn't follow from your claim.  eg we have MALLOC_CHECK - that's a
> >> debugging tool and we have it in glibc.
> >>
> >> > Since we are enabling MTE in some object files,
> >> > at least we should mark them as MTE enabled.
> >>
> >> When we start using MTE to colour stack objects, yes, we'll need to mark
> >> object files that use it (they require longjmp to clean up the stack
> >> colouring, for example).  Until then, I disagree.  You do not need to
> >> mark every object file as compatible with *heap* objects that have been
> >> coloured.  If you disagree please provide a concrete example.
> >>
> >
> > I can imagine MTE is always on by default, starting with glibc.
>
> I think that's unlikely, even with lazy faulting because the HW overhead
> is not zero.  At some point, it might be that we'd want to enable it
> semi-randomly when running some programs (I've heard of some OSes
> planning to do something like that for telemetry type monitoring), but
> we're a long way from that; and what's more, we're a long way from
> really having a list of what's safe and what's not.  So until then,
> arbitrarily marking objects to say they're safe when we don't know that
> will just create a marker that has zero use, because we won't be able to
> rely on it.

Yes, there will be overhead.  But some users will be OK with it.

> If you're going to have markers, you'd better be 100% sure what it's
> telling you is right, or it's not worth the bits you've spent on it.
>

There will be mistakes.  It doesn't mean that we shouldn't do it.


-- 
H.J.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16 14:52                             ` H.J. Lu
@ 2020-06-16 15:10                               ` Richard Earnshaw (lists)
  2020-06-16 16:33                                 ` H.J. Lu
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Earnshaw (lists) @ 2020-06-16 15:10 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GNU C Library

On 16/06/2020 15:52, H.J. Lu via Libc-alpha wrote:
> On Tue, Jun 16, 2020 at 7:34 AM Richard Earnshaw (lists)
> <Richard.Earnshaw@arm.com> wrote:
>>
>> On 16/06/2020 15:27, H.J. Lu wrote:
>>> On Tue, Jun 16, 2020 at 7:14 AM Richard Earnshaw (lists)
>>> <Richard.Earnshaw@arm.com> wrote:
>>>>
>>>> On 16/06/2020 14:31, H.J. Lu via Libc-alpha wrote:
>>>>> On Tue, Jun 16, 2020 at 1:14 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>>>>>>
>>>>>> The 06/15/2020 12:18, H.J. Lu wrote:
>>>>>>> I think we need a marker to indicate an object is MTE compatible.
>>>>>>
>>>>>> i expect users to only be able to discover tag-unsafety
>>>>>> issues in applications at runtime and even if there
>>>>>> is static information in binaries that's usually too
>>>>>> late to do anything useful about it (other than fail).
>>>>>>
>>>>>> so i think an initial implementation that is off by
>>>>>> default but can be turned on with an env var makes
>>>>>> sense, but i agree that to turn tagging on by default
>>>>>> some markings will be needed such that tag-unsafe
>>>>>> applications continue to work (if possible).
>>>>>
>>>>> MTE isn't just another debugging tool.  Otherwise, we wouldn't
>>>>> want it in glibc.
>>>>
>>>> That doesn't follow from your claim.  eg we have MALLOC_CHECK - that's a
>>>> debugging tool and we have it in glibc.
>>>>
>>>>> Since we are enabling MTE in some object files,
>>>>> at least we should mark them as MTE enabled.
>>>>
>>>> When we start using MTE to colour stack objects, yes, we'll need to mark
>>>> object files that use it (they require longjmp to clean up the stack
>>>> colouring, for example).  Until then, I disagree.  You do not need to
>>>> mark every object file as compatible with *heap* objects that have been
>>>> coloured.  If you disagree please provide a concrete example.
>>>>
>>>
>>> I can imagine MTE is always on by default, starting with glibc.
>>
>> I think that's unlikely, even with lazy faulting because the HW overhead
>> is not zero.  At some point, it might be that we'd want to enable it
>> semi-randomly when running some programs (I've heard of some OSes
>> planning to do something like that for telemetry type monitoring), but
>> we're a long way from that; and what's more, we're a long way from
>> really having a list of what's safe and what's not.  So until then,
>> arbitrarily marking objects to say they're safe when we don't know that
>> will just create a marker that has zero use, because we won't be able to
>> rely on it.
> 
> Yes, there will be overhead.  But some users will be OK with it.
> 
>> If you're going to have markers, you'd better be 100% sure what it's
>> telling you is right, or it's not worth the bits you've spent on it.
>>
> 
> There will be mistakes.  It doesn't mean that we shouldn't do it.
> 
> 

But we shouldn't be doing it NOW, at least, not until we have a MUCH
better idea of what constitutes a safe vs an unsafe program and can give
clear advice to developers (or automate the tools) to do this sensibly.
Anything else would be a random guess and create more problems than it
solves.

R.



* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16 10:17     ` Richard Earnshaw
  2020-06-16 10:31       ` Szabolcs Nagy
@ 2020-06-16 15:31       ` DJ Delorie
  1 sibling, 0 replies; 47+ messages in thread
From: DJ Delorie @ 2020-06-16 15:31 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: libc-alpha


Ok, if it's the alignment that limits us, so be it.  We probably should
revisit small allocation wastage in the future, but that might mean a
redesign of malloc itself :-(



* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16 15:10                               ` Richard Earnshaw (lists)
@ 2020-06-16 16:33                                 ` H.J. Lu
  2020-06-17 10:53                                   ` Szabolcs Nagy
  0 siblings, 1 reply; 47+ messages in thread
From: H.J. Lu @ 2020-06-16 16:33 UTC (permalink / raw)
  To: Richard Earnshaw (lists); +Cc: GNU C Library

On Tue, Jun 16, 2020 at 8:10 AM Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
> But we shouldn't be doing it NOW, at least, not until we have a MUCH
> better idea of what constitutes a safe vs an unsafe program and can give
> clear advice to developers (or automate the tools) to do this sensibly.
>  Anything else would be a random guess and create more problems than it
> solves.

If we don't plan it from the very beginning, it may never happen.

-- 
H.J.


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-16 16:33                                 ` H.J. Lu
@ 2020-06-17 10:53                                   ` Szabolcs Nagy
  2020-06-18 20:30                                     ` H.J. Lu
  0 siblings, 1 reply; 47+ messages in thread
From: Szabolcs Nagy @ 2020-06-17 10:53 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Richard Earnshaw (lists), GNU C Library

The 06/16/2020 09:33, H.J. Lu via Libc-alpha wrote:
> On Tue, Jun 16, 2020 at 8:10 AM Richard Earnshaw (lists)
> <Richard.Earnshaw@arm.com> wrote:
> > But we shouldn't be doing it NOW, at least, not until we have a MUCH
> > better idea of what constitutes a safe vs an unsafe program and can give
> > clear advice to developers (or automate the tools) to do this sensibly.
> >  Anything else would be a random guess and create more problems than it
> > solves.
> 
> If we don't plan it from the very beginning, it may never happen.

the two main uses of heap tagging are debugging
and security hardening.

for debugging the env var is fine.

for security if we tag things as mte unsafe
we get a system where all the relevant
applications are marked unsafe and mte is not
enabled (the main culprits for mte unsafety are
the complex runtimes that we want to protect:
python, browsers, ..., they are also the cases
where marking does not really work: they dlopen
their dependencies, so they fail at runtime
with or without a marking)

i think heap tagging is different from shadow
stack in that 99% of heap tagging failures are
not some obscure hack that is valid but happens
to be incompatible with the feature, but very
likely a bug in the code that is undefined by
the language standard and thus the compiler
can already break the code if it was smarter.
(and often it may be exploitable and then we
definitely don't want to use marking, but
fix the actual bug)

i don't know how the shadow stack marking
works: none of the asm files in glibc are
marked, do you force the marking on in the
linker? or the assembler? either way i don't
think it is a very reliable or easy to
maintain mechanism. but i agree that we will
need some mechanism to keep existing binaries
working (so users don't need to make a choice
between secure but broken or insecure but
working setup)



* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-17 10:53                                   ` Szabolcs Nagy
@ 2020-06-18 20:30                                     ` H.J. Lu
  0 siblings, 0 replies; 47+ messages in thread
From: H.J. Lu @ 2020-06-18 20:30 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: Richard Earnshaw (lists), GNU C Library

On Wed, Jun 17, 2020 at 3:53 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>
> The 06/16/2020 09:33, H.J. Lu via Libc-alpha wrote:
> > If we don't plan it from the very beginning, it may never happen.
>
> the two main usage of heap tagging is debugging
> and security hardening.

Yes, marking is intended for security hardening.  We need to plan
ahead.

> for debugging the env var is fine.

Agree.

> for security if we tag things as mte unsafe
> we get a system where all the relevant
> applications are marked unsafe and mte is not
> enabled (the main culprits for mte unsafety are
> the complex runtimes that we want to protect:
> python, browsers, ..., they are also the cases
> where marking does not really work: they dlopen
> their dependencies, so they fail at runtime
> with or without a marking)
>
> i think heap tagging is different from shadow
> stack in that 99% of heap tagging failures are
> not some obscure hack that is valid but happens
> to be incompatible with the feature, but very
> likely a bug in the code that is undefined by
> the language standard and thus the compiler
> can already break the code if it was smarter.
> (and often it may be exploitable and then we
> definitely don't want to use marking, but
> fix the actual bug)

We need to classify different taggings so that
some or all taggings can be enabled.

> i don't know how the shadow stack marking
> works: none of the asm files in glibc are
> marked, do you force the marking on in the
> linker? or the assembler? either way i don't

We do it in the compiler, and in the assembler with help from
the compiler.  Since shadow stack is compatible with
most code, we simply add the shadow stack
marker with -fcf-protection.  The main problem is
JIT, which needs to unwind the shadow stack when
stack frames are skipped.

> think it is a very reliable or easy to
> maintain mechanism. but i agree that we will
> need some mechanism to keep existing binaries
> working (so users don't need to make a choice
> between secure but broken or insecure but
> working setup)
>

The marker should be accurate.  Otherwise it won't work.

-- 
H.J.


* Re: [PATCH 0/7] RFC Memory tagging support
  2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
                   ` (11 preceding siblings ...)
  2020-06-16 10:16 ` Szabolcs Nagy
@ 2020-06-23 13:22 ` Szabolcs Nagy
  12 siblings, 0 replies; 47+ messages in thread
From: Szabolcs Nagy @ 2020-06-23 13:22 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: libc-alpha

The 06/15/2020 15:40, Richard Earnshaw wrote:
> Testing
> =======
> 
> I've run all the malloc tests.  While not all of them pass at present,
> I have examined all the failing tests and I'm confident that none of the
> failures represent real bugs in the code; but I have not yet decided how
> best to address these failures.  The failures fall into three categories
> 
> 1) Tests that expect a particular failure mode on double-free operations.
> I've added a quick tag-checking access in the free entry path that
> essentially asserts that the tag colour of a free'd block of memory
> matches the colour of the pointer - this leads to the kernel delivering
> a different signal that the test code does not expect.
> 
> 2) Tests that assume that malloc_usable_size will return a specific
> amount of free space.  The assumptions are not correct, because the
> tag colouring boundaries needed for MTE means that the 8 bytes in the
> block containing the back pointer can no-longer be used by users when
> we have MTE (they have a different colour that belongs to the malloc
> data structures).
> 
> 3) Tests that construct a fake internal malloc data structure and then
> try to perform operations on them.  I haven't looked at these in too
> much detail, but the first issue is that the fake header is only
> 8-byte aligned and for MTE to work it requires a 16-byte aligned
> structure (the tag read/write operations require the address be
> granule aligned, and the real glibc data structure has this property).
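To illustrate the alignment point in (3): MTE keeps one tag per 16-byte
granule, so an 8-byte aligned fake header can straddle or share a granule.
A minimal sketch of the arithmetic follows; the helper names are
hypothetical and are not taken from the patch set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MTE assigns one tag to each 16-byte granule of memory. */
#define MTE_GRANULE_SIZE 16u

/* Hypothetical helper: nonzero if addr lies on a granule boundary. */
static inline int is_granule_aligned(uintptr_t addr)
{
    return (addr & (MTE_GRANULE_SIZE - 1)) == 0;
}

/* Hypothetical helper: round size up to a whole number of granules. */
static inline size_t round_up_to_granule(size_t size)
{
    /* Guard against wrap-around before adding the rounding term. */
    assert(size <= SIZE_MAX - (MTE_GRANULE_SIZE - 1));
    return (size + (MTE_GRANULE_SIZE - 1)) & ~(size_t)(MTE_GRANULE_SIZE - 1);
}
```

An address ending in 0x8 fails is_granule_aligned() even though it is
8-byte aligned, which is the situation the fake headers in those tests
are in.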

i have now run the glibc tests with the latest linux patches;
the new failures are

FAIL: malloc/tst-malloc-backtrace
FAIL: malloc/tst-malloc-usable
FAIL: malloc/tst-malloc-usable-static
FAIL: malloc/tst-malloc-usable-static-tunables
FAIL: malloc/tst-malloc-usable-tunables
FAIL: malloc/tst-mallocstate
FAIL: malloc/tst-safe-linking
FAIL: malloc/tst-tcfree1
FAIL: malloc/tst-tcfree2
FAIL: malloc/tst-tcfree3
FAIL: nptl/tst-stack3
FAIL: nptl/tst-stack3-mem
FAIL: posix/tst-mmap

all malloc failures are use after free or poking
at malloc internals, except malloc/tst-malloc-usable,
where malloc_usable_size can now return a smaller amount
than the original allocation size (with MALLOC_CHECK_=3).

posix/tst-mmap calls mmap on malloced memory and mmap
fails with ENOMEM because the tagged-address syscall
abi rejects tagged addresses in mmap. the test expects
EINVAL for an address that is not page aligned. in strace i see
[pid  3505] mmap(0xf00fffff7d544a1, 1000, PROT_READ, MAP_SHARED|MAP_FIXED, 3, 0x1000) = -1 ENOMEM (Cannot allocate memory)
[pid  3505] mmap(0xf00fffff7d544a1, 1000, PROT_READ, MAP_PRIVATE|MAP_FIXED, 3, 0x1000) = -1 ENOMEM (Cannot allocate memory)
...
the output is
wrong error value for mapping at address with mod pagesize != 0: %m (should be EINVAL)
wrong error value for mapping at address with mod pagesize != 0: %m (should be EINVAL)
wrong error value for mapping at address with mod pagesize != 0: %m (should be EINVAL)
wrong error value for mapping at address with mod pagesize != 0: %m (should be EINVAL)
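The address in the strace output can be picked apart to see both
properties at once. This is only a sketch: the helper names are
hypothetical, a 4 KiB page size is assumed, and the only architectural
facts relied on are that the MTE logical tag sits in address bits 59:56
and that the top byte (bits 63:56) is the ignored byte.

```c
#include <stdint.h>

/* Assumption for illustration: 4 KiB pages. */
#define ASSUMED_PAGE_SIZE 4096ULL
/* Top-byte-ignore covers bits 63:56 of a user pointer. */
#define TOP_BYTE_MASK (0xffULL << 56)

/* Hypothetical helper: extract the MTE logical tag (bits 59:56). */
static inline unsigned logical_tag(uint64_t addr)
{
    return (unsigned)((addr >> 56) & 0xf);
}

/* Hypothetical helper: clear the top byte to get the untagged address. */
static inline uint64_t strip_top_byte(uint64_t addr)
{
    return addr & ~TOP_BYTE_MASK;
}

/* Hypothetical helper: offset of the untagged address within its page. */
static inline uint64_t page_offset(uint64_t addr)
{
    return strip_top_byte(addr) & (ASSUMED_PAGE_SIZE - 1);
}
```

For 0xf00fffff7d544a1 the tag is 0xf and the page offset is 0x4a1. The
test wants EINVAL because of the non-zero offset, but the non-zero tag
is what the kernel reports, as ENOMEM.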

nptl/tst-stack3 and nptl/tst-stack3-mem are real
issues: glibc calls madvise MADV_DONTNEED on the
stack on thread exit, even if it was user allocated
with malloc. the effect is that tags on the memory
may get zeroed, so when the user actually tries
to free the memory the integrity tag check crashes.
(i think this should be fixed by not doing madvise
on user allocated thread stacks. user code is
not allowed to do madvise on malloced memory either,
but i assume that's rare)
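
For reference, the usage pattern at issue looks roughly like the sketch
below. It is not the code of nptl/tst-stack3; the function name, the
1 MiB stack size and the use of posix_memalign (to get a page-aligned
heap block) are assumptions made for illustration. The free() at the end
is the point where the tag check described above would trip if the tags
had been cleared on thread exit.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Trivial thread body: record that the thread actually ran. */
static void *worker(void *arg)
{
    *(int *)arg = 1;
    return NULL;
}

/* Hypothetical example: run a thread on a heap-allocated stack, then
   free that stack.  Returns 0 on success, -1 on any failure. */
int run_thread_on_heap_stack(void)
{
    const size_t stack_size = (size_t)1 << 20;  /* assumed size: 1 MiB */
    void *stack = NULL;
    pthread_attr_t attr;
    pthread_t tid;
    int ran = 0;

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || posix_memalign(&stack, (size_t)page, stack_size) != 0)
        return -1;
    if (pthread_attr_init(&attr) != 0) {
        free(stack);
        return -1;
    }

    int rc = pthread_attr_setstack(&attr, stack, stack_size);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, worker, &ran);
    if (rc == 0)
        rc = pthread_join(tid, NULL);   /* thread has exited after this */

    pthread_attr_destroy(&attr);
    free(stack);                        /* user frees the stack memory */
    return (rc == 0 && ran) ? 0 : -1;
}
```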


end of thread, other threads:[~2020-06-23 13:22 UTC | newest]

Thread overview: 47+ messages
2020-06-15 14:40 [PATCH 0/7] RFC Memory tagging support Richard Earnshaw
2020-06-15 14:40 ` [PATCH 1/7] config: Allow memory tagging to be enabled when configuring glibc Richard Earnshaw
2020-06-15 14:40 ` [PATCH 2/7] elf: Add a tunable to control use of tagged memory Richard Earnshaw
2020-06-15 14:40 ` [PATCH 3/7] malloc: Basic support for memory tagging in the malloc() family Richard Earnshaw
2020-06-15 14:40 ` [PATCH 4/7] linux: Add compatibility definitions to sys/prctl.h for MTE Richard Earnshaw
2020-06-15 14:40 ` [PATCH 5/7] aarch64: Mitigations for string functions when MTE is enabled Richard Earnshaw
2020-06-15 14:40 ` [PATCH 6/7] aarch64: Add sysv specific enabling code for memory tagging Richard Earnshaw
2020-06-15 14:40 ` [PATCH 7/7] aarch64: Add aarch64-specific files for memory tagging support Richard Earnshaw
2020-06-15 15:03 ` [PATCH 0/7] RFC Memory " H.J. Lu
2020-06-15 15:11   ` Richard Earnshaw (lists)
2020-06-15 15:37     ` H.J. Lu
2020-06-15 16:30       ` Szabolcs Nagy
2020-06-15 16:40         ` H.J. Lu
2020-06-15 16:51           ` Richard Earnshaw (lists)
2020-06-15 17:46             ` H.J. Lu
2020-06-15 18:05             ` Paul Eggert
2020-06-15 18:14               ` Richard Earnshaw (lists)
2020-06-15 18:41               ` Szabolcs Nagy
2020-06-15 19:18                 ` H.J. Lu
2020-06-16  8:14                   ` Szabolcs Nagy
2020-06-16 13:31                     ` H.J. Lu
2020-06-16 14:14                       ` Richard Earnshaw (lists)
2020-06-16 14:27                         ` H.J. Lu
2020-06-16 14:34                           ` Richard Earnshaw (lists)
2020-06-16 14:52                             ` H.J. Lu
2020-06-16 15:10                               ` Richard Earnshaw (lists)
2020-06-16 16:33                                 ` H.J. Lu
2020-06-17 10:53                                   ` Szabolcs Nagy
2020-06-18 20:30                                     ` H.J. Lu
2020-06-16  9:36                   ` Richard Earnshaw (lists)
2020-06-16 13:37                     ` H.J. Lu
2020-06-15 18:10             ` Andreas Schwab
2020-06-15 15:08 ` Paul Eggert
2020-06-15 16:37   ` Richard Earnshaw
2020-06-15 16:37 ` Joseph Myers
2020-06-15 16:53   ` Richard Earnshaw
2020-06-15 17:04 ` DJ Delorie
2020-06-15 17:09   ` Richard Earnshaw
2020-06-15 17:22     ` DJ Delorie
2020-06-16 10:17     ` Richard Earnshaw
2020-06-16 10:31       ` Szabolcs Nagy
2020-06-16 11:07         ` Richard Earnshaw
2020-06-16 15:31       ` DJ Delorie
2020-06-16 10:16 ` Szabolcs Nagy
2020-06-16 13:44   ` Florian Weimer
2020-06-16 14:06     ` Richard Earnshaw
2020-06-23 13:22 ` Szabolcs Nagy
