From: Jason Merrill <jason@redhat.com>
To: Jakub Jelinek <jakub@redhat.com>,
"Joseph S. Myers" <joseph@codesourcery.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] c++: Add testcase for C++23 P2316R2 - consistent character literal encoding [PR102615]
Date: Thu, 7 Oct 2021 09:12:15 -0400
Message-ID: <364eac8d-92d1-eadf-ad8e-565712f463fe@redhat.com>
In-Reply-To: <20211007130049.GT304296@tucnak>
On 10/7/21 09:00, Jakub Jelinek wrote:
> Hi!
>
> I believe we need no changes to the compiler for P2316R2; it seems we already
> treat character literals the same way in the preprocessor and in C++
> expressions. Here is a testcase that should verify it.
>
> Tested on x86_64-linux, ok for trunk?
>
> Note, it seems the internal charset for GCC can be either UTF-8 or UTF-EBCDIC,
> but I bet it is very hard (at least for me) to actually test the latter.
> I'd guess one needs all system headers to be in EBCDIC and the gcc sources too.
> But looking around the source, I'm a little bit worried about the UTF-EBCDIC
> case.
> One is:
> #if '\n' == 0x0A && ' ' == 0x20 && '0' == 0x30 \
>    && 'A' == 0x41 && 'a' == 0x61 && '!' == 0x21
> #  define HOST_CHARSET HOST_CHARSET_ASCII
> #else
> # if '\n' == 0x15 && ' ' == 0x40 && '0' == 0xF0 \
>    && 'A' == 0xC1 && 'a' == 0x81 && '!' == 0x5A
> #  define HOST_CHARSET HOST_CHARSET_EBCDIC
> # else
> #  define HOST_CHARSET HOST_CHARSET_UNKNOWN
> # endif
> #endif
> in include/safe-ctype.h, does that mean we only support EBCDIC if -funsigned-char
> and otherwise fail to build gcc? Because with -fsigned-char, '0' is -0x10
> rather than 0xF0, 'A' is -0x3F rather than 0xC1 and 'a' is -0x7F rather than
> 0x81.
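To make the signedness arithmetic above concrete, here is a minimal sketch
that models an 8-bit signed plain char. It is not GCC code; as_signed_char is
a hypothetical helper introduced only for illustration, and it assumes the
usual two's-complement wrap-around that GCC implements.

```cpp
// Sketch only: models an 8-bit plain char that is signed (as with
// -fsigned-char).  as_signed_char is a hypothetical helper, not part of
// GCC or libiberty.
constexpr int as_signed_char (unsigned char byte)
{
  // Conversion of an out-of-range value to signed char wraps modulo 256
  // on GCC (implementation-defined before C++20, guaranteed since C++20).
  return static_cast<signed char> (byte);
}

// EBCDIC code points at or above 0x80 from the safe-ctype.h check turn
// negative, so comparisons against 0xF0, 0xC1 and 0x81 would fail.
static_assert (as_signed_char (0xF0) == -0x10, "'0' becomes -0x10");
static_assert (as_signed_char (0xC1) == -0x3F, "'A' becomes -0x3F");
static_assert (as_signed_char (0x81) == -0x7F, "'a' becomes -0x7F");

// Code points below 0x80 are unaffected by the signedness of char.
static_assert (as_signed_char (0x15) == 0x15, "'\\n' is unchanged");
static_assert (as_signed_char (0x40) == 0x40, "' ' is unchanged");
static_assert (as_signed_char (0x5A) == 0x5A, "'!' is unchanged");
```

The file has no main; compiling it (for example with -std=c++11
-fsyntax-only) is enough, since the static_asserts do the checking.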
> And another thing, if HOST_CHARSET == HOST_CHARSET_EBCDIC, how does the libcpp/lex.c
>   static const cppchar_t utf8_signifier = 0xC0;
>   ...
>   if (*buffer->cur >= utf8_signifier)
>     {
>       if (_cpp_valid_utf8 (pfile, &buffer->cur, buffer->rlimit, 1 + !first,
>                            state, &s))
>         return true;
>     }
> work? Because in UTF-EBCDIC, >= 0xC0 isn't the right test for the start of
> a multi-byte character; it is more complicated, and it seems _cpp_valid_utf8
> assumes UTF-8 as the host charset.
Are there any supported platforms that use UTF-EBCDIC?
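
For reference, a minimal sketch of why the >= 0xC0 test is tied to UTF-8.
The two helper names are hypothetical and this is not the libcpp code; it
only encodes the standard UTF-8 byte ranges and the EBCDIC values quoted
from safe-ctype.h above.

```cpp
// Sketch only; is_utf8_lead and is_utf8_continuation are hypothetical
// helpers, not libcpp functions.

// In UTF-8, continuation bytes have the bit pattern 10xxxxxx (0x80..0xBF).
constexpr bool is_utf8_continuation (unsigned char b)
{
  return (b & 0xC0) == 0x80;
}

// Bytes >= 0xC0 can only appear at the start of a multi-byte sequence
// (some of them, such as 0xC0 and 0xC1, are never valid and are left for
// a full validator to reject).
constexpr bool is_utf8_lead (unsigned char b)
{
  return b >= 0xC0;
}

// U+00E9 is encoded as 0xC3 0xA9 in UTF-8.
static_assert (is_utf8_lead (0xC3), "0xC3 starts a sequence");
static_assert (is_utf8_continuation (0xA9), "0xA9 continues a sequence");
static_assert (!is_utf8_lead (0x61) && !is_utf8_continuation (0x61),
               "ASCII 'a' is a single byte");

// With an EBCDIC host charset, plain 'A' is 0xC1 and '0' is 0xF0, so the
// same test would classify ordinary single-byte characters as lead bytes.
static_assert (is_utf8_lead (0xC1), "EBCDIC 'A' passes the >= 0xC0 test");
static_assert (is_utf8_lead (0xF0), "EBCDIC '0' passes the >= 0xC0 test");
```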
> 2021-10-07 Jakub Jelinek <jakub@redhat.com>
>
> PR c++/102615
> * g++.dg/cpp23/charlit-encoding1.C: New testcase for C++23 P2316R2.
>
> --- gcc/testsuite/g++.dg/cpp23/charlit-encoding1.C.jj 2021-10-07 14:34:35.182132411 +0200
> +++ gcc/testsuite/g++.dg/cpp23/charlit-encoding1.C 2021-10-07 14:34:02.902583774 +0200
> @@ -0,0 +1,33 @@
> +// PR c++/102615 - P2316R2 - Consistent character literal encoding
> +// { dg-do compile }
Doesn't this need to be { dg-do run } rather than { dg-do compile }, so that
the abort () checks actually execute? OK with that change.
> +extern "C" void abort ();
> +
> +int
> +main ()
> +{
> +#if ' ' == 0x20
> +  if (' ' != 0x20)
> +    abort ();
> +#elif ' ' == 0x40
> +  if (' ' != 0x40)
> +    abort ();
> +#else
> +  if (' ' == 0x20 || ' ' == 0x40)
> +    abort ();
> +#endif
> +#if 'a' == 0x61
> +  if ('a' != 0x61)
> +    abort ();
> +#elif 'a' == 0x81
> +  if ('a' != 0x81)
> +    abort ();
> +#elif 'a' == -0x7F
> +  if ('a' != -0x7F)
> +    abort ();
> +#else
> +  if ('a' == 0x61 || 'a' == 0x81 || 'a' == -0x7F)
> +    abort ();
> +#endif
> +  return 0;
> +}
>
> Jakub
>
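
For comparison, the same preprocessor-versus-expression check can be phrased
with static_assert, in which case a compile-only test would be sufficient.
This is only a sketch of that alternative, not the testcase that was posted;
space_agrees and a_agrees are hypothetical names introduced for illustration.

```cpp
// Sketch only, not the posted testcase.  Each #if branch records whether
// the C++ expression agrees with what the preprocessor just evaluated.
#if ' ' == 0x20
constexpr bool space_agrees = (' ' == 0x20);
#elif ' ' == 0x40
constexpr bool space_agrees = (' ' == 0x40);
#else
constexpr bool space_agrees = (' ' != 0x20 && ' ' != 0x40);
#endif

#if 'a' == 0x61
constexpr bool a_agrees = ('a' == 0x61);
#elif 'a' == 0x81
constexpr bool a_agrees = ('a' == 0x81);
#elif 'a' == -0x7F
constexpr bool a_agrees = ('a' == -0x7F);
#else
constexpr bool a_agrees = ('a' != 0x61 && 'a' != 0x81 && 'a' != -0x7F);
#endif

// A disagreement between the two evaluations becomes a compile-time error.
static_assert (space_agrees, "' ' differs between #if and C++ expressions");
static_assert (a_agrees, "'a' differs between #if and C++ expressions");
```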