From: Szabolcs Nagy <szabolcs.nagy@arm.com>
To: Florian Weimer <fweimer@redhat.com>
Cc: libc-alpha@sourceware.org
Subject: Re: Programming model for tagged addresses
Date: Fri, 7 May 2021 11:38:00 +0100
Message-ID: <20210507103758.GB9028@arm.com>
In-Reply-To: <874kffeysx.fsf@oldenburg.str.redhat.com>

The 05/07/2021 10:24, Florian Weimer via Libc-alpha wrote:
> This is related to this bug:
> 
>   memmove doesn't work with tagged address
>   <https://sourceware.org/bugzilla/show_bug.cgi?id=27828>
> 
> The bug is about detecting memory region overlap in the presence of
> tagged addresses.  This problem exists also with address tagging
> emulation using alias mappings.
> 
> If tags are fixed at allocation, I do not think these comparisons are a
> problem.  The argument goes like this: Backwards vs forwards copy only
> matters in case of overlap.  All pointers within the same top-level
> object have the same tag, so the existing comparisons are fine.
> Overlapping memmove between different top-level objects cannot happen
> because top-level objects do not overlap.  So you have to copy multiple
> objects to get an overlap, but that copies data between the objects as
> well, which is necessarily undefined.
> 
> Things change when applications are expected to flip tag bits as they
> see fit, including for pointers to subobjects.  This leads to the question
> whether it's valid to pass such tag-altered pointers to glibc functions
> and system calls.  Many objects have significant addresses (mutex and
> other synchronization objects, stdio streams), so the answer to that
> isn't immediately obvious.

thanks for bringing this up.

on aarch64 we also need to work out a heap tagging abi,
which necessarily relies on an address tagging abi.

we were already asked how suballocators can use tagging,
i.e. fine grained memory tagging within a big malloced
chunk, and our answer so far was that this is not allowed.
(our original concerns:
- libc internals assume one tag per malloc allocation,
  e.g. free can scan the entire range to check the tags.
- user code may use the malloc returned allocation as
  a whole as well as the suballocated objects separately
  and those two layers can't be mixed.
- we don't want to guarantee that tagging works on all
  malloc returned allocations, e.g. it makes sense to
  optimize large allocations to not use tagging, just
  guard pages. without PROT_MTE, munmap can be faster.
- if user code wants to tag, it should use a separate mmap,
  which implies munmap/madvise/.. are special: they need
  to cope with mixed tags. exact abi is TODO)

more generally the heap tagging abi so far relies on the
tags never changing during the lifetime of an object:
there is only one valid user pointer to an object and it
never changes.

for plain address tagging this may be too restrictive:
user code wants to tag pointers to existing objects,
even when pointers with different tags may already have escaped.
this breaks c language semantics: pointer compares no
longer work (multiple different pointers may access the
same object and they compare unequal).

i think we need to either
- design a c language subset for tagged pointers and then
  ensure the libc follows that subset and supports user
  code that does so too,
- or only allow limited use of pointer tagging, with
  requirements like one pointer tag escaped per object.

> 
> The next question is whether tag bits coming from glibc and the kernel are
> always zero initially.  For example, for malloc, we currently use two
> bits in the heap to classify chunks (main arena, non-main arena, mmap).
> These bits do not change after allocation, so it is tempting to put them
> into the pointer itself.  But this means that some of the tag bits are
> lost for application use.

i think reserving tag bits/values for implementation use
is a reasonable abi choice. so far we did not do that for
heap tagging because of the limited tag space and no
pressing need.
