public inbox for libc-alpha@sourceware.org
From: "H.J. Lu" <hjl.tools@gmail.com>
To: Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: Florian Weimer <fweimer@redhat.com>,
	GNU C Library <libc-alpha@sourceware.org>
Subject: Re: Programming model for tagged addresses
Date: Fri, 7 May 2021 07:24:00 -0700
Message-ID: <CAMe9rOqAB2NUBKE6vutYkBNmANPeMydUEUpLxMXGxx-YmotcFg@mail.gmail.com>
In-Reply-To: <20210507103758.GB9028@arm.com>

On Fri, May 7, 2021 at 5:16 AM Szabolcs Nagy via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> The 05/07/2021 10:24, Florian Weimer via Libc-alpha wrote:
> > This is related to this bug:
> >
> >   memmove doesn't work with tagged address
> >   <https://sourceware.org/bugzilla/show_bug.cgi?id=27828>
> >
> > The bug is about detecting memory region overlap in the presence of
> > tagged addresses.  This problem exists also with address tagging
> > emulation using alias mappings.
> >
> > If tags are fixed at allocation, I do not think these comparisons are a
> > problem.  The argument goes like this: Backwards vs forwards copy only
> > matters in case of overlap.  All pointers within the same top-level
> > object have the same tag, so the existing comparisons are fine.
> > Overlapping memmove between different top-level objects cannot happen
> > because top-level objects do not overlap.  So you have to copy multiple
> > objects to get an overlap, but that copies data between the objects as
> > well, which is necessarily undefined.
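
A minimal sketch of the overlap argument above, for illustration only. It
assumes the tag sits in the top 8 bits of a 64-bit address (as with AArch64
top-byte-ignore; LAM uses a different bit range) and works on plain integer
address values so it runs without tagging hardware. The helper names are
hypothetical and are not glibc code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: tag in bits 56..63 of the address.  */
#define TAG_SHIFT 56
#define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

/* Hypothetical helper: drop the tag bits from an address value.  */
static uint64_t strip_tag(uint64_t addr)
{
    return addr & ADDR_MASK;
}

/* Hypothetical helper: build a tagged address value.  */
static uint64_t with_tag(uint64_t addr, unsigned tag)
{
    assert(tag < 256);
    return strip_tag(addr) | ((uint64_t)tag << TAG_SHIFT);
}

/* Return 1 if dst lies inside [src, src + n), i.e. a forward copy would
   overwrite source bytes before reading them.  Unsigned wraparound turns
   the range check into a single comparison.  */
static int needs_backward_copy(uint64_t dst, uint64_t src, size_t n)
{
    return dst - src < (uint64_t)n;
}

/* Same check, performed on the untagged addresses.  */
static int needs_backward_copy_untagged(uint64_t dst, uint64_t src, size_t n)
{
    return strip_tag(dst) - strip_tag(src) < (uint64_t)n;
}
```

With equal tags both checks agree. With tags 3 and 5 on overlapping
addresses the raw comparison reports no overlap, while the untagged one
still detects it, which is the failure mode in the bug report.
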
> >
> > Things change when applications are expected to flip tag bits as they
> > see fit, including for pointers to subobjects.  This leads to the question
> > whether it's valid to pass such tag-altered pointers to glibc functions
> > and system calls.  Many objects have significant addresses (mutex and
> > other synchronization objects, stdio streams), so the answer to that
> > isn't immediately obvious.
>
> thanks for bringing this up.
>
> on aarch64 we also need to work out a heap tagging abi,
> which necessarily relies on an address tagging abi.
>
> we were already asked how suballocators can use tagging
> i.e. fine grained memory tagging within a big malloced
> chunk, and our answer so far was that this is not allowed.
> (our original concerns:
> - libc internals assume one tag per malloc allocation,
>   e.g. free can scan the entire range to check the tags.
> - user code may use the malloc returned allocation as
>   a whole as well as the suballocated objects separately
>   and those two layers can't be mixed.
> - we don't want to guarantee that tagging works on all
>   malloc returned allocations, e.g. it makes sense to
>   optimize large allocations to not use tagging, just
>   guard pages. without PROT_MTE, munmap can be faster.
> - if user code wants to tag, it should use a separate mmap,
>   which implies munmap/madvise/.. are special: they need
>   to cope with mixed tags. exact abi is TODO)
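
To make the first concern in the list above concrete, here is a small
simulation of a free-side tag scan. It is only a sketch: the tag storage is
an ordinary array standing in for hardware tag memory, the "pointers" are
integer offsets into a pretend heap, and all function names are
hypothetical. The 16-byte granule and the 4-bit tag in pointer bits 56..59
follow MTE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE 16u               /* MTE tags memory per 16-byte granule */
#define TAG_SHIFT 56              /* MTE logical tag: pointer bits 56..59 */
#define TAG_MASK UINT64_C(0xf)
#define HEAP_GRANULES 64          /* size of the simulated heap */

/* Stand-in for hardware tag memory: one tag per granule.  */
static uint8_t granule_tag[HEAP_GRANULES];

static unsigned pointer_tag(uint64_t p)
{
    return (unsigned)((p >> TAG_SHIFT) & TAG_MASK);
}

static size_t pointer_offset(uint64_t p)
{
    return (size_t)(p & ((UINT64_C(1) << TAG_SHIFT) - 1));
}

/* Colour [off, off + size) with TAG and return a matching tagged pointer.
   OFF and SIZE must be multiples of GRANULE.  */
static uint64_t tag_region(size_t off, size_t size, unsigned tag)
{
    assert(off % GRANULE == 0 && size % GRANULE == 0);
    assert((off + size) / GRANULE <= HEAP_GRANULES);
    for (size_t g = off / GRANULE; g < (off + size) / GRANULE; g++)
        granule_tag[g] = (uint8_t)(tag & TAG_MASK);
    return (uint64_t)off | ((uint64_t)(tag & TAG_MASK) << TAG_SHIFT);
}

/* What a free-time scan could look like: every granule of the chunk
   must carry the tag of the pointer being freed.  */
static int chunk_tags_consistent(uint64_t p, size_t size)
{
    size_t off = pointer_offset(p);
    unsigned tag = pointer_tag(p);
    for (size_t g = off / GRANULE; g < (off + size) / GRANULE; g++)
        if (granule_tag[g] != tag)
            return 0;
    return 1;
}
```

If a suballocator retags part of the chunk, for example
tag_region(32, 32, 9) inside a 128-byte chunk tagged 5, the scan over the
whole chunk fails, which is why the two layers cannot be mixed under the
one-tag-per-allocation assumption.
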
>
> more generally the heap tagging abi so far relies on the
> tags never changing during the lifetime of an object:
> there is only one valid user pointer to an object and it
> never changes.
>
> for plain address tagging this may be too restrictive:
> user code wants to tag pointers of existing objects,
> when there may be pointers escaped with different tags.
> this breaks c language semantics: pointer compares no
> longer work (multiple different pointers may access the
> same object and they compare unequal).
>
> i think we need to either
> - design a c language subset for tagged pointers and then
>   ensure the libc follows that subset and supports user
>   code that does so too,
> - or only allow limited use of pointer tagging, with
>   requirements like one pointer tag escaped per object.
>
> >
> > The next question is whether tag bits coming from glibc and the kernel
> > are always zero initially.  For example, for malloc, we currently use two
> > bits in the heap to classify chunks (main arena, non-main arena, mmap).
> > These bits do not change after allocation, so it is tempting to put them
> > into the pointer itself.  But this means that some of the tag bits are
> > lost for application use.
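
A sketch of the trade-off described above, assuming an 8-bit tag in address
bits 56..63 of which the implementation reserves the top two bits for the
chunk class. The layout, the enum values and the helper names are all
hypothetical; glibc malloc does not encode chunk classes this way today.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding of the three chunk classes.  */
enum chunk_class {
    CHUNK_MAIN_ARENA = 0,
    CHUNK_NON_MAIN_ARENA = 1,
    CHUNK_MMAPPED = 2
};

#define TAG_SHIFT 56                                     /* assumed tag position */
#define TAG_BITS 8
#define CLASS_BITS 2
#define CLASS_SHIFT (TAG_SHIFT + TAG_BITS - CLASS_BITS)  /* bits 62..63 */
#define CLASS_MASK (UINT64_C(3) << CLASS_SHIFT)
#define APP_TAG_BITS (TAG_BITS - CLASS_BITS)             /* 6 bits remain */
#define APP_TAG_MASK (((UINT64_C(1) << APP_TAG_BITS) - 1) << TAG_SHIFT)

/* Store the chunk class in the reserved tag bits.  */
static uint64_t set_chunk_class(uint64_t p, enum chunk_class c)
{
    return (p & ~CLASS_MASK) | ((uint64_t)c << CLASS_SHIFT);
}

static enum chunk_class get_chunk_class(uint64_t p)
{
    return (enum chunk_class)((p & CLASS_MASK) >> CLASS_SHIFT);
}

/* The application may only change the remaining tag bits.  */
static uint64_t set_app_tag(uint64_t p, unsigned tag)
{
    assert(tag < (1u << APP_TAG_BITS));
    return (p & ~APP_TAG_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

static unsigned get_app_tag(uint64_t p)
{
    return (unsigned)((p & APP_TAG_MASK) >> TAG_SHIFT);
}
```

Under this layout the application keeps 6 of the 8 tag bits, and free could
read the class from the pointer itself as long as set_app_tag is the only
way the tag gets modified.
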
>
> i think reserving tag bits/values for implementation use
> is a reasonable abi choice. so far we did not do that for
> heap tagging because of the limited tag space and no
> pressing need.

Our LAM work is blocked by the API issue.  That is why I proposed
<sys/tagged-address.h>:

https://sourceware.org/pipermail/libc-alpha/2021-April/125249.html

I'd like to see a solution in glibc 2.35.

Thanks.

-- 
H.J.
