public inbox for libc-alpha@sourceware.org
From: Florian Weimer <fweimer@redhat.com>
To: "H.J. Lu" <hjl.tools@gmail.com>
Cc: libc-alpha@sourceware.org,
	 Joseph Myers <joseph@codesourcery.com>,
	Vedvyas Shanbhogue <vedvyas.shanbhogue@intel.com>,
	 "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	 Szabolcs Nagy <szabolcs.nagy@arm.com>
Subject: Re: [PATCH v4 1/2] <sys/tagged-address.h>: An API for tagged address
Date: Wed, 21 Apr 2021 08:36:43 +0200
Message-ID: <8735vkqh38.fsf@oldenburg.str.redhat.com>
In-Reply-To: <20210420021819.765779-2-hjl.tools@gmail.com> (H. J. Lu's message of "Mon, 19 Apr 2021 19:18:18 -0700")

* H. J. Lu:

> diff --git a/manual/tagged-address.texi b/manual/tagged-address.texi
> new file mode 100644
> index 0000000000..ce10f7e752
> --- /dev/null
> +++ b/manual/tagged-address.texi
> @@ -0,0 +1,59 @@
> +@node Tagged Address, Character Handling, Memory, Top
> +@c %MENU% Tagged address functions and macros
> +@chapter Tagged Address
> +
> +By default, the number of the address bits used in address translation
> +is the number of address bits.  But it can be changed by ARM Top-byte
> +Ignore (TBI) or Intel Linear Address Masking (LAM).
> +
> +@Theglibc{} provides several functions and macros in the header file
> +@file{sys/tagged-address.h} to manipulate tagged address bits, which is
> +the number of the address bits used in address translation.
> +@pindex sys/tagged-address.h

I don't understand the “which is the number of address bits” part.

This section needs to describe under which circumstances it is valid to
alter the tag bits in pointers returned from glibc functions (including
system call wrappers).  I think at least historically, the kernel
required masking tag bits in user space for TBI.

> +@deftypefun {unsigned int} get_tagged_address_bits (void)
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Get the current address bits used in address translation.  The return
> +value is @code{0} if tag bits are not the highest bits in address.
> +@end deftypefun

“in addresses”?

What is the return value if there are no tag bits available?  The word
width?

> +@deftypefun uintptr_t get_tagged_address_mask (void)
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Get the current mask for address bits used in address translation.
> +@end deftypefun

Mask is ambiguous in this context.  If a bit is set in the return value,
will this bit take part in address translation or not?  Please be
explicit here.
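
To illustrate the ambiguity, here is a sketch of the two possible
conventions.  It assumes 64-bit addresses with the top 8 bits ignored
during translation (TBI-like); that width and all the example_* names
are made up for illustration and are not taken from the patch.

```c
#include <stdint.h>

/* Assumption for illustration only: 64-bit addresses whose top
   8 bits are ignored by address translation.  */
#define EXAMPLE_TAG_BITS 8

/* Convention A: a set bit takes part in address translation.  */
static inline uint64_t
example_mask_translated (void)
{
  return UINT64_MAX >> EXAMPLE_TAG_BITS;
}

/* Convention B: a set bit is ignored by address translation.  */
static inline uint64_t
example_mask_ignored (void)
{
  return ~(UINT64_MAX >> EXAMPLE_TAG_BITS);
}
```

Both values are plausible readings of “mask for address bits used in
address translation”, and they are exact complements of each other.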

> +@deftypefun int set_tagged_address_mask (uintptr_t @var{mask})
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Set the mask for address bits used in address translation to @var{mask}.
> +The return value is @code{0} on success and @code{-1} on failure.  This
> +function can be called only once before @code{main}.  The possible
> +@code{errno} error conditions are @code{ENODEV}, @code{EPERM},
> +@code{EINVAL}, and @code{ENOSYS}.
> +@end deftypefun

Likewise, please clarify if bits set in MASK participate in address
translation or not.

Why before main?  Do you mean it can only be called once per process?

I think this limitation suggests we should use ELF markup for this.
There are definitely compatibility issues to work out here.

Historically, the x86-64 psABI supplement implied that the top 16 bits
are available for application use (without hardware masking obviously).
If e.g. malloc starts returning tagged addresses, that assumption
breaks.

Should glibc allocate tag bits to different libraries within the same
process?  For example, so that malloc could get 2 tag bits, the main
program 3 and some other library 1 bit?
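
As a rough sketch of what such a split could look like: the layout
below assumes six tag bits in bits 57..62 (an assumption, not something
the patch specifies), and the example_* helpers are hypothetical, not a
proposed interface.

```c
#include <stdint.h>

/* Hypothetical layout: six tag bits in bits 57..62, split as
   malloc 2 bits, main program 3 bits, one other library 1 bit.  */
enum
{
  EXAMPLE_MALLOC_SHIFT = 57, EXAMPLE_MALLOC_WIDTH = 2, /* bits 57..58 */
  EXAMPLE_MAIN_SHIFT = 59,   EXAMPLE_MAIN_WIDTH = 3,   /* bits 59..61 */
  EXAMPLE_LIB_SHIFT = 62,    EXAMPLE_LIB_WIDTH = 1     /* bit 62 */
};

/* Mask covering WIDTH bits starting at bit SHIFT.  */
static inline uint64_t
example_field_mask (unsigned int shift, unsigned int width)
{
  return ((UINT64_C (1) << width) - 1) << shift;
}

/* Store VALUE in one owner's field without touching the others.
   Bits of VALUE that do not fit in the field are dropped.  */
static inline uint64_t
example_set_field (uint64_t addr, unsigned int shift, unsigned int width,
                   uint64_t value)
{
  uint64_t mask = example_field_mask (shift, width);
  return (addr & ~mask) | ((value << shift) & mask);
}

/* Read back one owner's field.  */
static inline uint64_t
example_get_field (uint64_t addr, unsigned int shift, unsigned int width)
{
  return (addr & example_field_mask (shift, width)) >> shift;
}
```

Each owner would only ever modify its own field, so the allocation has
to be agreed on before any of them starts tagging pointers.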

For glibc malloc, it would be a simple enhancement to move the
IS_MMAPPED flag to a tag bit, and eliminate the malloc header for mmap'ed
chunks, replacing it with a separate data structure.  This would allow
us to preserve page alignment for mmap'ed chunks without wasting an
entire page for each allocation, just to store the malloc header.
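
A minimal sketch of the idea, assuming bit 62 is ignored during address
translation (an assumption; the actual bit depends on the architecture
and configuration).  The example_* names are hypothetical and this is
not how glibc malloc is implemented today.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: bit 62 is assumed to be an untranslated tag bit.  */
#define EXAMPLE_MMAPPED_TAG (UINT64_C (1) << 62)

/* Record "this chunk came from mmap" in the pointer itself.  */
static inline uint64_t
example_mark_mmapped (uint64_t addr)
{
  return addr | EXAMPLE_MMAPPED_TAG;
}

/* Test the flag without reading any header from memory.  */
static inline bool
example_is_mmapped (uint64_t addr)
{
  return (addr & EXAMPLE_MMAPPED_TAG) != 0;
}

/* Recover the address that mmap returned, e.g. for munmap.  */
static inline uint64_t
example_strip_tag (uint64_t addr)
{
  return addr & ~EXAMPLE_MMAPPED_TAG;
}
```

Because the flag lives in the pointer, the low bits stay untouched and
the page alignment of the mapping is preserved.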

> +@deftypefun {void *} tag_address (void *@var{addr}, unsigned int @var{tag})
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Return the address of @var{addr} with the tag value @var{tag} stored
> +in the untranslated bits.  Overflow of @var{tag} in the untranslated
> +bits are ignored.
> +@end deftypefun
> +
> +@deftypefun {void *} untag_address (void *@var{addr})
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Return the address of @var{addr} with all zero untranslated bits.
> +@end deftypefun

This should reference the earlier discussion about when it is safe to
tag and untag addresses.
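
For reference, one reading of the two descriptions above, as a sketch
only.  It assumes the untranslated bits are the top TAG_BITS bits of a
64-bit address; the extra tag_bits parameter and the example_* names are
mine and do not match the proposed signatures.

```c
#include <assert.h>
#include <stdint.h>

/* Place TAG in the top TAG_BITS bits of ADDR.  Bits of TAG that do
   not fit are shifted out, i.e. overflow is ignored.  */
static inline uint64_t
example_tag_address (uint64_t addr, unsigned int tag, unsigned int tag_bits)
{
  assert (tag_bits > 0 && tag_bits < 64);
  uint64_t translated = UINT64_MAX >> tag_bits;
  return (addr & translated) | ((uint64_t) tag << (64 - tag_bits));
}

/* Clear the top TAG_BITS bits of ADDR.  */
static inline uint64_t
example_untag_address (uint64_t addr, unsigned int tag_bits)
{
  assert (tag_bits > 0 && tag_bits < 64);
  return addr & (UINT64_MAX >> tag_bits);
}
```

Note that nothing in such an implementation can tell whether the caller
is allowed to change the tag, which is why the manual has to say so.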

> +@deftypefn Macro int TAGGED_ADDRESS_VALID_BITS (@var{bits})
> +This macro returns a nonzero value (true) if @var{bits} a valid tagged
> +address bits.
> +@end deftypefn

“are valid tagged”?

Does “valid” mean in this context that “the CPU can be configured to
ignore bits set in BITS during address translation using
set_tagged_address_mask”?

> +@deftypefn Macro {const uintptr_t} TAGGED_ADDRESS_MASK (@var{bits})
> +This macro returns a nonzero value if it can be used as mask for constant
> +address @var{bits} used in address translation.
> +@end deftypefn

I do not understand the description.

Thanks,
Florian

Thread overview: 6+ messages
2021-04-20  2:18 RFC [PATCH v4 0/2] " H.J. Lu
2021-04-20  2:18 ` [PATCH v4 1/2] " H.J. Lu
2021-04-21  6:36   ` Florian Weimer [this message]
2021-04-21 23:21     ` H.J. Lu
2021-04-22  9:43       ` Szabolcs Nagy
2021-04-20  2:18 ` [PATCH v4 2/2] <sys/tagged-address.h>: Update libc.abilist files H.J. Lu