From: "H.J. Lu" <hjl.tools@gmail.com>
To: Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: Florian Weimer <fweimer@redhat.com>,
GNU C Library <libc-alpha@sourceware.org>
Subject: Re: Programming model for tagged addresses
Date: Fri, 7 May 2021 07:24:00 -0700
Message-ID: <CAMe9rOqAB2NUBKE6vutYkBNmANPeMydUEUpLxMXGxx-YmotcFg@mail.gmail.com>
In-Reply-To: <20210507103758.GB9028@arm.com>
On Fri, May 7, 2021 at 5:16 AM Szabolcs Nagy via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> The 05/07/2021 10:24, Florian Weimer via Libc-alpha wrote:
> > This is related to this bug:
> >
> > memmove doesn't work with tagged address
> > <https://sourceware.org/bugzilla/show_bug.cgi?id=27828>
> >
> > The bug is about detecting memory region overlap in the presence of
> > tagged addresses. This problem exists also with address tagging
> > emulation using alias mappings.
> >
> > If tags are fixed at allocation, I do not think these comparisons are a
> > problem. The argument goes like this: Backwards vs forwards copy only
> > matters in case of overlap. All pointers within the same top-level
> > object have the same tag, so the existing comparisons are fine.
> > Overlapping memmove between different top-level objects cannot happen
> > because top-level objects do not overlap. So you have to copy multiple
> > objects to get an overlap, but that copies data between the objects as
> > well, which is necessarily undefined.
> >
> > Things change when applications are expected to flip tag bits as they
> > see fit, including for pointers to subobjects. This leads to the question
> > whether it's valid to pass such tag-altered pointers to glibc functions
> > and system calls. Many objects have significant addresses (mutex and
> > other synchronization objects, stdio streams), so the answer to that
> > isn't immediately obvious.
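
For illustration only (this is not glibc's memmove code): a minimal C
sketch of the overlap check discussed above. It assumes an 8-bit tag in
bits 56..63 of a 64-bit address; real schemes (AArch64 TBI/MTE, Intel LAM)
differ in width and position. All names here (TAG_SHIFT, untag, with_tag,
needs_backward_copy, needs_backward_copy_untagged) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration: an 8-bit tag in bits 56..63.  */
#define TAG_SHIFT 56
#define ADDR_MASK ((UINT64_C (1) << TAG_SHIFT) - 1)

/* Clear the tag bits, leaving only the address bits.  */
static uint64_t
untag (uint64_t addr)
{
  return addr & ADDR_MASK;
}

/* Return ADDR with its tag bits replaced by TAG.  */
static uint64_t
with_tag (uint64_t addr, unsigned int tag)
{
  assert (tag <= 0xff);
  return untag (addr) | ((uint64_t) tag << TAG_SHIFT);
}

/* Nonzero if DST lies inside [SRC, SRC + N), so a forward copy would
   overwrite source bytes before reading them.  Compares the raw
   pointer values and relies on unsigned wraparound.  */
static int
needs_backward_copy (uint64_t dst, uint64_t src, size_t n)
{
  return dst - src < n;
}

/* Same check, but with the tag bits stripped from both addresses.  */
static int
needs_backward_copy_untagged (uint64_t dst, uint64_t src, size_t n)
{
  return untag (dst) - untag (src) < n;
}
```

With dst = 0x1008, src = 0x1000 and n = 16, the raw check reports an
overlap as long as both addresses carry the same tag. If the same two
addresses carry different tags (say 1 and 2), the raw difference wraps to
a huge value and the overlap is missed, while the untagged check still
finds it. Whether library code should strip tags like this is exactly the
open question.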
>
> thanks for bringing this up.
>
> on aarch64 we also need to work out a heap tagging abi,
> which necessarily relies on an address tagging abi.
>
> we were already asked how suballocators can use tagging
> i.e. fine grained memory tagging within a big malloced
> chunk, and our answer so far was that is not allowed.
> (our original concerns:
> - libc internals assume one tag per malloc allocation,
> e.g. free can scan the entire range to check the tags.
> - user code may use the malloc returned allocation as
> a whole as well as the suballocated objects separately
> and those two layers can't be mixed.
> - we don't want to guarantee that tagging works on all
> malloc returned allocations, e.g. it makes sense to
> optimize large allocations to not use tagging just
> guard pages. without PROT_MTE, munmap can be faster.
> - if user code wants to tag, it should use separate mmap.
> which implies munmap/madvise/.. are special: they need
> to cope with mixed tags. exact abi is TODO)
>
> more generally the heap tagging abi so far relies on the
> tags never changing during the lifetime of an object:
> there is only one valid user pointer to an object and it
> never changes.
>
> for plain address tagging this may be too restrictive:
> user code wants to tag pointers of existing objects,
> when there may be pointers escaped with different tags.
> this breaks c language semantics: pointer compares no
> longer work (multiple different pointers may access the
> same object and they compare unequal).
>
> i think we need to either
> - design a c language subset for tagged pointers and then
> ensure the libc follows that subset and supports user
> code that does so too,
> - or only allow limited use of pointer tagging, with
> requirements like one pointer tag escaped per object.
>
> >
> > The next question is whether tag bits coming from glibc and the kernel are
> > always zero initially. For example, for malloc, we currently use two
> > bits in the heap to classify chunks (main arena, non-main arena, mmap).
> > These bits do not change after allocation, so it is tempting to put them
> > into the pointer itself. But this means that some of the tag bits are
> > lost for application use.
>
> i think reserving tag bits/values for implementation use
> is reasonable abi choice. so far we did not do that for
> heap tagging because of the limited tag space and no
> pressing need.
Our LAM work is blocked by the API issue. That is why I proposed
<sys/tagged-address.h>:
https://sourceware.org/pipermail/libc-alpha/2021-April/125249.html
I'd like to see a solution in glibc 2.35.
Thanks.
--
H.J.