public inbox for gcc-bugs@sourceware.org
* [Bug c++/115658] New: char16_t and char32_t aliasing is conservative
@ 2024-06-26  4:13 pinskia at gcc dot gnu.org
  2024-06-26  4:15 ` [Bug c++/115658] " pinskia at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-06-26  4:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115658

            Bug ID: 115658
           Summary: char16_t and char32_t aliasing is conservative
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: alias, missed-optimization
          Severity: enhancement
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

When char8_t was added, a new alias set was created for it:
r9-5405-g2d91f79dc990f8

But when char16_t and char32_t were added (for GCC 4.4/C++11):
r0-88474-gc466b2cd136139

the same was not done for them.

Maybe it should be done now.

Noticed from https://github.com/sg16-unicode/sg16/issues/67 .


* [Bug c++/115658] char16_t and char32_t aliasing is conservative
  2024-06-26  4:13 [Bug c++/115658] New: char16_t and char32_t aliasing is conservative pinskia at gcc dot gnu.org
@ 2024-06-26  4:15 ` pinskia at gcc dot gnu.org
  2024-06-26 20:12 ` tom at honermann dot net
  2024-06-28 19:52 ` tom at honermann dot net
  2 siblings, 0 replies; 4+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-06-26  4:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115658

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Though I should note
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2626r0.pdf and
https://github.com/sg16-unicode/sg16-meetings/tree/master#may-22nd-2024


So maybe we really should keep on treating them the same,
and maybe change char8_t back to being treated like unsigned char ...


* [Bug c++/115658] char16_t and char32_t aliasing is conservative
  2024-06-26  4:13 [Bug c++/115658] New: char16_t and char32_t aliasing is conservative pinskia at gcc dot gnu.org
  2024-06-26  4:15 ` [Bug c++/115658] " pinskia at gcc dot gnu.org
@ 2024-06-26 20:12 ` tom at honermann dot net
  2024-06-28 19:52 ` tom at honermann dot net
  2 siblings, 0 replies; 4+ messages in thread
From: tom at honermann dot net @ 2024-06-26 20:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115658

Tom Honermann <tom at honermann dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tom at honermann dot net

--- Comment #2 from Tom Honermann <tom at honermann dot net> ---
I think the prior comments regarding the aliasing sets refer to portions of
c_common_get_alias_set() in gcc/gcc/c-family/c-common.cc, specifically the
lines at
https://github.com/gcc-mirror/gcc/blob/629257bcb81434117f1e9c68479032563176dc0c/gcc/c-family/c-common.cc#L3892-L3895.

I think the alias sets are right; they are consistent with what the C++
standard states in [basic.lval]p11 (http://eel.is/c++draft/basic.lval#11).

> So maybe we really should keep on treating them the same,
> and maybe change char8_t back to being treated like unsigned char ...

I don't think that should be necessary. P0482R6 was explicit in its intent that
char8_t not be an aliasing type (http://wg21.link/p0482r6#proposal).


* [Bug c++/115658] char16_t and char32_t aliasing is conservative
  2024-06-26  4:13 [Bug c++/115658] New: char16_t and char32_t aliasing is conservative pinskia at gcc dot gnu.org
  2024-06-26  4:15 ` [Bug c++/115658] " pinskia at gcc dot gnu.org
  2024-06-26 20:12 ` tom at honermann dot net
@ 2024-06-28 19:52 ` tom at honermann dot net
  2 siblings, 0 replies; 4+ messages in thread
From: tom at honermann dot net @ 2024-06-28 19:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115658

--- Comment #3 from Tom Honermann <tom at honermann dot net> ---
In retrospect, I think I misunderstood Andrew's motivation for filing this
issue.

gcc and clang differ in how they treat aliasing of `char16_t` and `char32_t`
with respect to other types. The following example illustrates the difference
and is demonstrated at
https://www.godbolt.org/z/PsMsPMa73.

Please note that (at least) `-O2` is necessary to reliably demonstrate
differences in behavior. Additionally, the use of `-fshort-wchar` to influence
the size of `wchar_t` affects behavior.

```
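// Store through p, load through q, then overwrite the first store.
template<typename T, typename U>
U f(T *p, U *q) {
  *p = 1;      // dead unless the load from *q may observe it
  U u = *q;
  *p = 2;
  return u;
}
// Explicit instantiations for each pair of types under test.
template wchar_t f(char16_t*, wchar_t*);
template unsigned short f(char16_t*, unsigned short*);
template wchar_t f(char32_t*, wchar_t*);
template unsigned int f(char32_t*, unsigned int*);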
```

The test case exercises dead store elimination in the presence of potentially
aliasing types. If `T` may alias `U`, then the write of `1` to `*p` is
observable through the read of `*q` and must be preserved. Otherwise, it may be
eliminated because of the later write of `2` to `*p`.
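To make "observable" concrete, here is a minimal sketch that calls the same
function template with both pointers referring to one object, using only type
pairs that are certainly allowed to alias (identical types, and `unsigned char`
accessing a `char`). The helper names `first_store_is_observable` and
`distinct_objects_see_old_value` are hypothetical and not part of the original
test case.

```cpp
#include <cassert>

// Same shape as the test case: store, load through the other pointer,
// then overwrite the first store.
template<typename T, typename U>
U f(T *p, U *q) {
  *p = 1;
  U u = *q;
  *p = 2;
  return u;
}

// Hypothetical helper: both pointers refer to the same object, and the
// types involved are permitted to alias, so the load must see the 1.
inline bool first_store_is_observable() {
  int same = 0;
  int seen_same = f(&same, &same);  // T == U == int

  char c = 0;
  // unsigned char may be used to access an object of any type.
  char seen_char = f(reinterpret_cast<unsigned char *>(&c), &c);

  return seen_same == 1 && same == 2 && seen_char == 1 && c == 2;
}

// Hypothetical helper: with two distinct objects the load is unaffected
// by either store, whatever the compiler does with the first one.
inline bool distinct_objects_see_old_value() {
  int a = 0;
  long b = 7;
  return f(&a, &b) == 7 && a == 2;
}
```

In the aliasing calls a compiler that dropped the store of `1` would return the
wrong value, which is why the store can only be removed when the two types are
known not to alias.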

For Clang, there is no aliasing between any of these types and the store of `1`
to `*p` is always eliminated.

For MSVC, it appears that either dead store elimination is not performed at
all, or aliasing is permitted across all of these types (even when the size
differs).

For gcc with `-fshort-wchar`, there appear to be two alias sets:
- `wchar_t`, `char16_t`, and `unsigned short`.
- `char32_t` and `unsigned int`.

For gcc without `-fshort-wchar`, there are also two alias sets, but `wchar_t`
belongs to neither of them:
- `char16_t` and `unsigned short`.
- `char32_t` and `unsigned int`.

The two configurations are therefore not symmetric: `char32_t` never aliases
`wchar_t`, even when they have the same size, whereas `char16_t` does alias a
16-bit `wchar_t`. This asymmetry can be explained by compatibility with MSVC
(where `wchar_t` is always 16-bit).
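Since the three compilers disagree on these alias sets, code that needs the
same result everywhere can avoid depending on them. A minimal sketch, assuming
a platform where `unsigned short` is 16 bits wide; `copy_code_units` is a
hypothetical helper and not something from the bug report:

```cpp
#include <cstddef>
#include <cstring>

// Assumption of this sketch: unsigned short and char16_t have the same size.
static_assert(sizeof(char16_t) == sizeof(unsigned short),
              "sketch assumes a 16-bit unsigned short");

// Hypothetical helper: copies UTF-16 code units into an unsigned short
// buffer bytewise, so no char16_t object is ever read through an
// unsigned short lvalue (or the other way around).
inline void copy_code_units(unsigned short *dst, const char16_t *src,
                            std::size_t n) {
  std::memcpy(dst, src, n * sizeof(char16_t));
}
```

Because `std::memcpy` copies the object representation, the result does not
depend on which alias sets the compiler assigns to `char16_t` and
`unsigned short`.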

Adding the following explicit template instantiations demonstrates that all of
gcc, clang, and MSVC permit aliasing among `char`, `unsigned char`, and
`char8_t` (because `char` and `unsigned char` are permitted to alias all
types). https://www.godbolt.org/z/Pjxb661Y7.

```
template char f(char8_t*, char*);
template unsigned char f(char8_t*, unsigned char*);
```

To reiterate, I think the current gcc behavior is correct and defensible given
two goals:
- A desire to match MSVC behavior in the limited context of a 16-bit `wchar_t`
type.
- A desire to match C behavior with respect to `char16_t` and `char32_t`
aliasing the underlying types of `uint_least16_t` and `uint_least32_t` (the
former are typedefs in C).
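The second goal rests on the fact that, in C++, `char16_t` and `char32_t` are
distinct types that nevertheless share size, signedness, and alignment with
`uint_least16_t` and `uint_least32_t`. A small sketch of that relationship
follows; the overload set named `which` is hypothetical and only there to show
that the types are distinct.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Distinct types in C++ (in C, char16_t is a typedef for uint_least16_t)...
static_assert(!std::is_same<char16_t, std::uint_least16_t>::value,
              "char16_t is its own type in C++");
static_assert(!std::is_same<char32_t, std::uint_least32_t>::value,
              "char32_t is its own type in C++");

// ...but with the same size, alignment, and signedness.
static_assert(sizeof(char16_t) == sizeof(std::uint_least16_t), "same size");
static_assert(alignof(char16_t) == alignof(std::uint_least16_t), "same alignment");
static_assert(std::is_unsigned<char16_t>::value, "char16_t is unsigned");
static_assert(sizeof(char32_t) == sizeof(std::uint_least32_t), "same size");
static_assert(alignof(char32_t) == alignof(std::uint_least32_t), "same alignment");
static_assert(std::is_unsigned<char32_t>::value, "char32_t is unsigned");

// Hypothetical overload set: both overloads can coexist only because
// the two parameter types are different types.
inline int which(char16_t) { return 16; }
inline int which(std::uint_least16_t) { return 0; }
```

Calling `which(u'x')` selects the `char16_t` overload, while passing a
`std::uint_least16_t` value selects the other one.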

