public inbox for libc-alpha@sourceware.org
From: Carlos O'Donell <carlos@redhat.com>
To: Lirong Yuan <yuanzi@google.com>,
	libc-alpha@sourceware.org, Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: scw@google.com
Subject: Re: [PATCH] locale: align _nl_C_LC_CTYPE_class and _nl_C_LC_CTYPE_class32 arrays to uint16_t and uint32_t respectively
Date: Mon, 15 Mar 2021 21:44:59 -0400
Message-ID: <2003e08c-55e2-80fa-89a6-fb8d59cc0e77@redhat.com>
In-Reply-To: <20210315184211.4124573-1-yuanzi@google.com>

On 3/15/21 2:42 PM, Lirong Yuan via Libc-alpha wrote:
> steps to reproduce the problem: compile a program that uses ctype functions such as “isspace” for aarch64 with UBSan flag “-fsanitize=undefined” and run it on x86_64 machines with qemu user mode emulation.

Szabolcs,

Do you have any input on this?
 
> observed behavior: UndefinedBehaviorSanitizer reports misaligned-pointer-use in the program.

Yes, the char array could be misaligned with respect to a 16-bit value,
and should be aligned to the type that is expected from the interface e.g.

ctype/ctype.h:

# define __isctype_f(type) \
  __extern_inline int                                                         \
  is##type (int __c) __THROW                                                  \
  {                                                                           \
    return (*__ctype_b_loc ())[(int) (__c)] & (unsigned short int) _IS##type; \
  }
#endif

include/ctype.h:

CTYPE_EXTERN_INLINE const uint16_t ** __attribute__ ((const))
__ctype_b_loc (void)
{
  return __libc_tsd_address (const uint16_t *, CTYPE_B);
}

So we expect a uint16_t type and the respective alignment.

My expectation is that normally aarch64 simply handles the unaligned load without any problems,
but that it would be "better" if it were 16-bit aligned?
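
For context on why UBSan complains even when the hardware copes: at the C
level a load through a misaligned uint16_t pointer is undefined, regardless of
what the CPU does with it. A minimal sketch of an alignment-independent load
follows. This is not what the patch does (the patch aligns the table instead),
and load_u16 is a hypothetical helper name, not a glibc function.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, for illustration only: read a 16-bit value in
   host byte order from P, which may have any alignment.  memcpy has no
   alignment requirement on its arguments, so no misaligned uint16_t
   pointer is ever dereferenced.  */
uint16_t
load_u16 (const void *p)
{
  uint16_t v;
  memcpy (&v, p, sizeof v);
  return v;
}
```

Aligning the array, as the patch proposes, keeps the existing typed access in
the ctype macros unchanged, which is why it is the less invasive fix here.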

Is this the *only* case of misaligned pointers?

> solution: align the arrays defined in locale/C-ctype.c with correct data types as defined in ctype/ctype.h.
> 
> test suite regressions: none.

This looks technically correct, and the C locale is builtin so the layout
itself should be able to change without any problems.

I'd like to hear comments from Arm about this before accepting.

> Signed-off-by: Lirong Yuan <yuanzi@google.com>

We don't use the DCO (Developer Certificate of Origin) in glibc; we assign
copyright to the FSF, so this line would normally be removed, and you remain
the git author.

See:
https://sourceware.org/glibc/wiki/Contribution%20checklist

You are covered by the Google copyright assignment so everything is accepted.

> ---
>  locale/C-ctype.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/locale/C-ctype.c b/locale/C-ctype.c
> index bffdbedad0..da2c8cc33c 100644
> --- a/locale/C-ctype.c
> +++ b/locale/C-ctype.c
> @@ -18,6 +18,7 @@
>  
>  #include "localeinfo.h"
>  #include <endian.h>
> +#include <stdalign.h>

OK.

>  #include <stdint.h>
>  
>  #include "C-translit.h"
> @@ -30,7 +31,7 @@
>     In the `_nl_C_LC_CTYPE_class' array the value for EOF (== -1)
>     is set to always return 0 and the conversion arrays return EOF.  */
>  
> -const char _nl_C_LC_CTYPE_class[768] attribute_hidden =
> +alignas(uint16_t) const char _nl_C_LC_CTYPE_class[768] attribute_hidden =

OK. Used directly by __ctype_b_loc.
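
As a rough illustration of what the alignas buys us, here is a small
standalone sketch. The names demo_class and demo_is_u16_aligned are
hypothetical, and the array is zero-filled rather than holding the real C
locale data; only the declaration shape mirrors the patch.

```c
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical stand-in for the table: a byte array that is later read
   as uint16_t entries.  Without alignas, a char array only has to meet
   the alignment of char, so the linker may place it at an odd address.  */
alignas (uint16_t) const char demo_class[768] = { 0 };

/* Return nonzero if P meets the alignment requirement of uint16_t.  */
int
demo_is_u16_aligned (const void *p)
{
  return (uintptr_t) p % alignof (uint16_t) == 0;
}
```

With the alignas in place, demo_is_u16_aligned (demo_class) is guaranteed to
be nonzero; without it the result depends on where the object happens to land.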

>    /* 0x80 */ "\000\000" "\000\000" "\000\000" "\000\000" "\000\000" "\000\000"
>    /* 0x86 */ "\000\000" "\000\000" "\000\000" "\000\000" "\000\000" "\000\000"
>    /* 0x8c */ "\000\000" "\000\000" "\000\000" "\000\000" "\000\000" "\000\000"
> @@ -96,7 +97,7 @@ const char _nl_C_LC_CTYPE_class[768] attribute_hidden =
>    /* 0xf4 */ "\000\000" "\000\000" "\000\000" "\000\000" "\000\000" "\000\000"
>    /* 0xfa */ "\000\000" "\000\000" "\000\000" "\000\000" "\000\000" "\000\000"
>  ;
> -const char _nl_C_LC_CTYPE_class32[1024] attribute_hidden =
> +alignas(uint32_t) const char _nl_C_LC_CTYPE_class32[1024] attribute_hidden =

OK. Might be exposed via nl_langinfo and is internally uint32_t (though the
directly exposed __ctype32_b should not exist for aarch64; verified not on abilist).

>    /* 0x00 */ "\000\000\002\000" "\000\000\002\000" "\000\000\002\000"
>    /* 0x03 */ "\000\000\002\000" "\000\000\002\000" "\000\000\002\000"
>    /* 0x06 */ "\000\000\002\000" "\000\000\002\000" "\000\000\002\000"
> 
-- 
Cheers,
Carlos.



Thread overview: 10+ messages
2021-03-15 18:42 Lirong Yuan
2021-03-16  1:44 ` Carlos O'Donell [this message]
2021-03-16 14:28   ` Szabolcs Nagy
2021-03-16 19:05     ` Lirong Yuan
2021-03-16 19:47       ` Adhemerval Zanella
2021-03-16 20:49         ` Andreas Schwab
2021-03-16 21:05         ` Carlos O'Donell
2021-03-17 11:34           ` Adhemerval Zanella
2021-03-19 18:31             ` Lirong Yuan
2021-03-30 17:33               ` Lirong Yuan
