public inbox for gcc-bugs@sourceware.org
* [Bug c++/102615] New: [C++23] P2316R2 - Consistent character literal encoding
@ 2021-10-05 16:47 mpolacek at gcc dot gnu.org
  2021-10-05 16:48 ` [Bug c++/102615] " jakub at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: mpolacek at gcc dot gnu.org @ 2021-10-05 16:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102615

            Bug ID: 102615
           Summary: [C++23] P2316R2 - Consistent character literal
                    encoding
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mpolacek at gcc dot gnu.org
  Target Milestone: ---

See https://wg21.link/p2316r2


* [Bug c++/102615] [C++23] P2316R2 - Consistent character literal encoding
From: jakub at gcc dot gnu.org @ 2021-10-05 16:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102615

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Also likely a no-op, but we might want a testcase.


* [Bug c++/102615] [C++23] P2316R2 - Consistent character literal encoding
From: cvs-commit at gcc dot gnu.org @ 2021-10-07 13:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102615

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:348b426be3fc99453b42e79a18331c7bf24ee0dc

commit r12-4226-g348b426be3fc99453b42e79a18331c7bf24ee0dc
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Oct 7 15:16:13 2021 +0200

    c++: Add testcase for C++23 P2316R2 - consistent character literal encoding [PR102615]

    I believe we need no changes to the compiler for P2316R2; it seems we
    already treat character literals the same in the preprocessor and in C++
    expressions.  Here is a testcase that should verify it.

    Note, it seems the internal charset for GCC can be either UTF-8 or
    UTF-EBCDIC, but I bet it is very hard (at least for me) to actually test
    the latter.  I'd guess one needs all system headers to be in EBCDIC, and
    the gcc sources too.
    But looking around the source, I'm a little bit worried about the
    UTF-EBCDIC case.
    One is:
     #if  '\n' == 0x0A && ' ' == 0x20 && '0' == 0x30 \
        && 'A' == 0x41 && 'a' == 0x61 && '!' == 0x21
     #  define HOST_CHARSET HOST_CHARSET_ASCII
     #else
     # if '\n' == 0x15 && ' ' == 0x40 && '0' == 0xF0 \
        && 'A' == 0xC1 && 'a' == 0x81 && '!' == 0x5A
     #  define HOST_CHARSET HOST_CHARSET_EBCDIC
     # else
     #  define HOST_CHARSET HOST_CHARSET_UNKNOWN
     # endif
     #endif
    in include/safe-ctype.h.  Does that mean we only support EBCDIC with
    -funsigned-char and otherwise fail to build gcc?  With -fsigned-char,
    '0' is -0x10 rather than 0xF0, 'A' is -0x3F rather than 0xC1 and 'a' is
    -0x7F rather than 0x81.
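
The arithmetic behind those negative values can be checked with a minimal sketch. This is not GCC code: `as_signed_char` is a hypothetical helper that assumes an 8-bit, two's-complement plain `char`, and the byte values are the EBCDIC ones from the `#if` quoted above.

```cpp
// Hypothetical helper (not part of GCC): the value an 8-bit code unit has
// when stored in a signed 8-bit char, as with -fsigned-char.
constexpr int as_signed_char(unsigned byte) {
  return byte >= 0x80 ? static_cast<int>(byte) - 0x100
                      : static_cast<int>(byte);
}

// EBCDIC code points that have the top bit set turn negative.
static_assert(as_signed_char(0xF0) == -0x10, "EBCDIC '0'");
static_assert(as_signed_char(0xC1) == -0x3F, "EBCDIC 'A'");
static_assert(as_signed_char(0x81) == -0x7F, "EBCDIC 'a'");

// Code points below 0x80 are unaffected.
static_assert(as_signed_char(0x40) == 0x40, "EBCDIC ' '");
static_assert(as_signed_char(0x5A) == 0x5A, "EBCDIC '!'");
```

So with a signed plain `char`, a comparison such as `'0' == 0xF0` would be false under these assumptions, and the EBCDIC branch of the check would not be selected.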
    And another thing, if HOST_CHARSET == HOST_CHARSET_EBCDIC, how does the
    libcpp/lex.c
    static const cppchar_t utf8_signifier = 0xC0;
    ...
          if (*buffer->cur >= utf8_signifier)
            {
              if (_cpp_valid_utf8 (pfile, &buffer->cur, buffer->rlimit,
                                   1 + !first, state, &s))
                return true;
            }
    work?  In UTF-EBCDIC, >= 0xC0 isn't the right test for the start of a
    multi-byte character; the real test is more complicated, and
    _cpp_valid_utf8 seems to assume UTF-8 as the host charset.
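
To make the concern concrete, here is a small sketch of the comparison on its own. It is not libcpp code: `is_utf8_lead_candidate` is a hypothetical name that only mirrors the `>= utf8_signifier` test quoted above, and the EBCDIC byte values are taken from the safe-ctype.h check.

```cpp
// Same threshold as the utf8_signifier constant quoted from libcpp/lex.c.
constexpr unsigned utf8_signifier = 0xC0;

// Hypothetical stand-in for the `*buffer->cur >= utf8_signifier` comparison.
constexpr bool is_utf8_lead_candidate(unsigned char byte) {
  return byte >= utf8_signifier;
}

// With an ASCII/UTF-8 host charset, basic letters and digits are below 0x80,
// so they never reach the UTF-8 validation path.
static_assert(!is_utf8_lead_candidate(0x41), "ASCII 'A'");
static_assert(!is_utf8_lead_candidate(0x30), "ASCII '0'");

// With the EBCDIC values ('A' == 0xC1, '0' == 0xF0), the same comparison is
// true for ordinary single-byte characters.
static_assert(is_utf8_lead_candidate(0xC1), "EBCDIC 'A'");
static_assert(is_utf8_lead_candidate(0xF0), "EBCDIC '0'");
```

This only shows that the threshold comparison alone cannot separate single-byte from multi-byte characters for those byte values; it says nothing about what a correct UTF-EBCDIC test would look like.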

    2021-10-07  Jakub Jelinek  <jakub@redhat.com>

            PR c++/102615
            * g++.dg/cpp23/charlit-encoding1.C: New testcase for C++23 P2316R2.
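
For readers who want to see the kind of property P2316R2 asks for, here is a minimal sketch. It is an assumption about what such a check could look like, not the contents of g++.dg/cpp23/charlit-encoding1.C, and the constant names are made up for illustration. Each literal is evaluated once in an `#if` and once in a C++ constant expression, and the two results must agree.

```cpp
// Record what the preprocessor thinks of each comparison.
#if 'a' == 0x61
constexpr bool pp_lower_a_is_0x61 = true;
#else
constexpr bool pp_lower_a_is_0x61 = false;
#endif

#if '\xff' < 0
constexpr bool pp_xff_is_negative = true;
#else
constexpr bool pp_xff_is_negative = false;
#endif

// The same comparisons as ordinary C++ expressions must give the same
// answers, whatever the execution charset or the signedness of char is.
static_assert(pp_lower_a_is_0x61 == ('a' == 0x61),
              "encoding differs between #if and expression");
static_assert(pp_xff_is_negative == ('\xff' < 0),
              "signedness differs between #if and expression");
```

The checks deliberately compare the two evaluations with each other rather than with fixed values, so the sketch does not depend on the host using ASCII or on `-fsigned-char` versus `-funsigned-char`.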


* [Bug c++/102615] [C++23] P2316R2 - Consistent character literal encoding
From: jakub at gcc dot gnu.org @ 2021-10-07 13:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102615

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I think any gcc in the last decade or two satisfies this.

