From: Florian Weimer
To: "H.J. Lu"
Cc: libc-alpha@sourceware.org, Joseph Myers, Vedvyas Shanbhogue, "Kirill A . Shutemov", Szabolcs Nagy
Subject: Re: [PATCH v4 1/2] : An API for tagged address
References: <20210420021819.765779-1-hjl.tools@gmail.com> <20210420021819.765779-2-hjl.tools@gmail.com>
Date: Wed, 21 Apr 2021 08:36:43 +0200
In-Reply-To: <20210420021819.765779-2-hjl.tools@gmail.com> (H. J. Lu's message of "Mon, 19 Apr 2021 19:18:18 -0700")
Message-ID: <8735vkqh38.fsf@oldenburg.str.redhat.com>

* H. J. Lu:

> diff --git a/manual/tagged-address.texi b/manual/tagged-address.texi
> new file mode 100644
> index 0000000000..ce10f7e752
> --- /dev/null
> +++ b/manual/tagged-address.texi
> @@ -0,0 +1,59 @@
> +@node Tagged Address, Character Handling, Memory, Top
> +@c %MENU% Tagged address functions and macros
> +@chapter Tagged Address
> +
> +By default, the number of the address bits used in address translation
> +is the number of address bits. But it can be changed by ARM Top-byte
> +Ignore (TBI) or Intel Linear Address Masking (LAM).
> +
> +@Theglibc{} provides several functions and macros in the header file
> +@file{sys/tagged-address.h} to manipulate tagged address bits, which is
> +the number of the address bits used in address translation.
> +@pindex sys/tagged-address.h

I don't understand the “which is the number of address bits” part.
This section needs to describe under which circumstances it is valid to
alter the tag bits in pointers returned from glibc functions (including
system call wrappers).  I think that, at least historically, the kernel
required user space to mask out tag bits for TBI.

> +@deftypefun {unsigned int} get_tagged_address_bits (void)
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Get the current address bits used in address translation. The return
> +value is @code{0} if tag bits are not the highest bits in address.
> +@end deftypefun

“in addresses”?

What is the return value if there are no tag bits available?  The word
width?

> +@deftypefun uintptr_t get_tagged_address_mask (void)
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Get the current mask for address bits used in address translation.
> +@end deftypefun

“Mask” is ambiguous in this context.  If a bit is set in the return
value, will this bit take part in address translation or not?  Please
be explicit here.

> +@deftypefun int set_tagged_address_mask (uintptr_t @var{mask})
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Set the mask for address bits used in address translation to @var{mask}.
> +The return value is @code{0} on success and @code{-1} on failure. This
> +function can be called only once before @code{main}. The possible
> +@code{errno} error conditions are @code{ENODEV}, @code{EPERM},
> +@code{EINVAL}, and @code{ENOSYS}.
> +@end deftypefun

Likewise, please clarify whether bits set in MASK participate in address
translation or not.

Why before main?  Do you mean it can only be called once per process?
I think this limitation suggests we should use ELF markup for this.

There are definitely compatibility issues to work out here.
Historically, the x86-64 psABI supplement implied that the top 16 bits
are available for application use (without hardware masking,
obviously).  If e.g.
malloc starts returning tagged addresses, that assumption breaks.

Should glibc allocate tag bits to different libraries within the same
process?  For example, so that malloc could get 2 tag bits, the main
program 3, and some other library 1 bit?

For glibc malloc, it would be a simple enhancement to move the
IS_MMAPPED flag to a tag bit and eliminate the malloc header for
mmap'ed chunks, replacing it with a separate data structure.  This
would allow us to preserve page alignment for mmap'ed chunks without
wasting an entire page per allocation just to store the malloc header.

> +@deftypefun {void *} tag_address (void *@var{addr}, unsigned int @var{tag})
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Return the address of @var{addr} with the tag value @var{tag} stored
> +in the untranslated bits. Overflow of @var{tag} in the untranslated
> +bits are ignored.
> +@end deftypefun
> +
> +@deftypefun {void *} untag_address (void *@var{addr})
> +@standards{GNU, sys/tagged-address.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +Return the address of @var{addr} with all zero untranslated bits.
> +@end deftypefun

This should reference the earlier discussion about when it is safe to
tag and untag addresses.

> +@deftypefn Macro int TAGGED_ADDRESS_VALID_BITS (@var{bits})
> +This macro returns a nonzero value (true) if @var{bits} a valid tagged
> +address bits.
> +@end deftypefn

“are valid tagged”?

Does “valid” mean in this context that “the CPU can be configured to
ignore bits set in BITS during address translation using
set_tagged_address_mask”?

> +@deftypefn Macro {const uintptr_t} TAGGED_ADDRESS_MASK (@var{bits})
> +This macro returns a nonzero value if it can be used as mask for constant
> +address @var{bits} used in address translation.
> +@end deftypefn

I do not understand the description.

Thanks,
Florian