On 8/31/22 10:15, Jakub Jelinek wrote:
> On Wed, Aug 31, 2022 at 09:55:29AM -0400, Jason Merrill wrote:
>> On 8/31/22 07:14, Jakub Jelinek wrote:
>>> On Tue, Aug 30, 2022 at 05:51:26PM -0400, Jason Merrill wrote:
>>>> This hunk now seems worth factoring out of the four places it occurs.
>>>>
>>>> It also seems the comment for _cpp_valid_utf8 needs to be updated: it
>>>> currently says it's not called when parsing a string.
>>>
>>> Ok, so like this?
>>
>> OK, thanks.
>
> Actually, it isn't enough to diagnose this in comments and character/string
> literals, sorry for finding that out only today.
>
> We don't accept invalid UTF-8 in identifiers, it fails the checking in there
> (most of the times without errors), what we do is create CPP_OTHER tokens
> out of those and then typically diagnose it when it is used somewhere.
> Except it doesn't have to be used anywhere, it can be omitted.
>
> So if we have say
> #define I(x)
> I(���)
> like in the Winvalid-utf8-3.c test, we silently accept it.
>
> This updated version extends the diagnostics even to those cases.
> I can't use _cpp_handle_multibyte_utf8 in that case because it needs
> different treatment (no bidi stuff which is emitted already from
> forms_identifier_p etc.).
>
> Tested so far on the new tests, ok for trunk if it passes full
> bootstrap/regtest?

OK.

> 2022-08-31  Jakub Jelinek
>
> 	PR c++/106655
> libcpp/
> 	* include/cpplib.h (struct cpp_options): Implement C++23
> 	P2295R6 - Support for UTF-8 as a portable source file encoding.
> 	Add cpp_warn_invalid_utf8 and cpp_input_charset_explicit fields.
> 	(enum cpp_warning_reason): Add CPP_W_INVALID_UTF8 enumerator.
> 	* init.cc (cpp_create_reader): Initialize cpp_warn_invalid_utf8
> 	and cpp_input_charset_explicit.
> 	* charset.cc (_cpp_valid_utf8): Adjust function comment.
> 	* lex.cc (UCS_LIMIT): Define.
> 	(utf8_continuation): New const variable.
> 	(utf8_signifier): Move earlier in the file.
> 	(_cpp_warn_invalid_utf8, _cpp_handle_multibyte_utf8): New functions.
> 	(_cpp_skip_block_comment): Handle -Winvalid-utf8 warning.
> 	(skip_line_comment): Likewise.
> 	(lex_raw_string, lex_string): Likewise.
> 	(_cpp_lex_direct): Likewise.
> gcc/
> 	* doc/invoke.texi (-Winvalid-utf8): Document it.
> gcc/c-family/
> 	* c.opt (-Winvalid-utf8): New warning.
> 	* c-opts.cc (c_common_handle_option) :
> 	Set cpp_opts->cpp_input_charset_explicit.
> 	(c_common_post_options): If -finput-charset=UTF-8 is explicit
> 	in C++23, enable -Winvalid-utf8 by default and if -pedantic
> 	or -pedantic-errors, make it a pedwarn.
> gcc/testsuite/
> 	* c-c++-common/cpp/Winvalid-utf8-1.c: New test.
> 	* c-c++-common/cpp/Winvalid-utf8-2.c: New test.
> 	* c-c++-common/cpp/Winvalid-utf8-3.c: New test.
> 	* g++.dg/cpp23/Winvalid-utf8-1.C: New test.
> 	* g++.dg/cpp23/Winvalid-utf8-2.C: New test.
> 	* g++.dg/cpp23/Winvalid-utf8-3.C: New test.
> 	* g++.dg/cpp23/Winvalid-utf8-4.C: New test.
> 	* g++.dg/cpp23/Winvalid-utf8-5.C: New test.
> 	* g++.dg/cpp23/Winvalid-utf8-6.C: New test.
> 	* g++.dg/cpp23/Winvalid-utf8-7.C: New test.
> 	* g++.dg/cpp23/Winvalid-utf8-8.C: New test.
> 	* g++.dg/cpp23/Winvalid-utf8-9.C: New test.
> 	* g++.dg/cpp23/Winvalid-utf8-10.C: New test.
> 	* g++.dg/cpp23/Winvalid-utf8-11.C: New test.
> 	* g++.dg/cpp23/Winvalid-utf8-12.C: New test.
>
> --- libcpp/include/cpplib.h.jj	2022-08-31 10:19:45.226452609 +0200
> +++ libcpp/include/cpplib.h	2022-08-31 12:25:42.451125755 +0200
> @@ -560,6 +560,13 @@ struct cpp_options
>       cpp_bidirectional_level.  */
>    unsigned char cpp_warn_bidirectional;
>
> +  /* True if libcpp should warn about invalid UTF-8 characters in comments.
> +     2 if it should be a pedwarn.  */
> +  unsigned char cpp_warn_invalid_utf8;
> +
> +  /* True if -finput-charset= option has been used explicitly.  */
> +  bool cpp_input_charset_explicit;
> +
>    /* Dependency generation.
*/ > struct > { > @@ -666,7 +673,8 @@ enum cpp_warning_reason { > CPP_W_CXX11_COMPAT, > CPP_W_CXX20_COMPAT, > CPP_W_EXPANSION_TO_DEFINED, > - CPP_W_BIDIRECTIONAL > + CPP_W_BIDIRECTIONAL, > + CPP_W_INVALID_UTF8 > }; > > /* Callback for header lookup for HEADER, which is the name of a > --- libcpp/init.cc.jj 2022-08-31 10:19:45.260452148 +0200 > +++ libcpp/init.cc 2022-08-31 12:25:42.451125755 +0200 > @@ -227,6 +227,8 @@ cpp_create_reader (enum c_lang lang, cpp > CPP_OPTION (pfile, ext_numeric_literals) = 1; > CPP_OPTION (pfile, warn_date_time) = 0; > CPP_OPTION (pfile, cpp_warn_bidirectional) = bidirectional_unpaired; > + CPP_OPTION (pfile, cpp_warn_invalid_utf8) = 0; > + CPP_OPTION (pfile, cpp_input_charset_explicit) = 0; > > /* Default CPP arithmetic to something sensible for the host for the > benefit of dumb users like fix-header. */ > --- libcpp/charset.cc.jj 2022-08-26 16:06:10.578493272 +0200 > +++ libcpp/charset.cc 2022-08-31 12:34:18.921176118 +0200 > @@ -1742,9 +1742,9 @@ convert_ucn (cpp_reader *pfile, const uc > case, no diagnostic is emitted, and the return value of FALSE should cause > a new token to be formed. > > - Unlike _cpp_valid_ucn, this will never be called when lexing a string; only > - a potential identifier, or a CPP_OTHER token. NST is unused in the latter > - case. > + _cpp_valid_utf8 can be called when lexing a potential identifier, or a > + CPP_OTHER token or for the purposes of -Winvalid-utf8 warning in string or > + character literals. NST is unused when not in a potential identifier. > > As in _cpp_valid_ucn, IDENTIFIER_POS is 0 when not in an identifier, 1 for > the start of an identifier, or 2 otherwise. 
*/ > --- libcpp/lex.cc.jj 2022-08-31 10:19:45.327451236 +0200 > +++ libcpp/lex.cc 2022-08-31 15:23:53.753556178 +0200 > @@ -50,6 +50,9 @@ static const struct token_spelling token > #define TOKEN_SPELL(token) (token_spellings[(token)->type].category) > #define TOKEN_NAME(token) (token_spellings[(token)->type].name) > > +/* ISO 10646 defines the UCS codespace as the range 0-0x10FFFF inclusive. */ > +#define UCS_LIMIT 0x10FFFF > + > static void add_line_note (cpp_buffer *, const uchar *, unsigned int); > static int skip_line_comment (cpp_reader *); > static void skip_whitespace (cpp_reader *, cppchar_t); > @@ -1704,6 +1707,120 @@ maybe_warn_bidi_on_char (cpp_reader *pfi > bidi::on_char (kind, ucn_p, loc); > } > > +static const cppchar_t utf8_continuation = 0x80; > +static const cppchar_t utf8_signifier = 0xC0; > + > +/* Emit -Winvalid-utf8 warning on invalid UTF-8 character starting > + at PFILE->buffer->cur. Return a pointer after the diagnosed > + invalid character. */ > + > +static const uchar * > +_cpp_warn_invalid_utf8 (cpp_reader *pfile) > +{ > + cpp_buffer *buffer = pfile->buffer; > + const uchar *cur = buffer->cur; > + bool pedantic = (CPP_PEDANTIC (pfile) > + && CPP_OPTION (pfile, cpp_warn_invalid_utf8) == 2); > + > + if (cur[0] < utf8_signifier > + || cur[1] < utf8_continuation || cur[1] >= utf8_signifier) > + { > + if (pedantic) > + cpp_error_with_line (pfile, CPP_DL_PEDWARN, > + pfile->line_table->highest_line, > + CPP_BUF_COL (buffer), > + "invalid UTF-8 character <%x>", > + cur[0]); > + else > + cpp_warning_with_line (pfile, CPP_W_INVALID_UTF8, > + pfile->line_table->highest_line, > + CPP_BUF_COL (buffer), > + "invalid UTF-8 character <%x>", > + cur[0]); > + return cur + 1; > + } > + else if (cur[2] < utf8_continuation || cur[2] >= utf8_signifier) > + { > + if (pedantic) > + cpp_error_with_line (pfile, CPP_DL_PEDWARN, > + pfile->line_table->highest_line, > + CPP_BUF_COL (buffer), > + "invalid UTF-8 character <%x><%x>", > + cur[0], cur[1]); > + else > + 
cpp_warning_with_line (pfile, CPP_W_INVALID_UTF8, > + pfile->line_table->highest_line, > + CPP_BUF_COL (buffer), > + "invalid UTF-8 character <%x><%x>", > + cur[0], cur[1]); > + return cur + 2; > + } > + else if (cur[3] < utf8_continuation || cur[3] >= utf8_signifier) > + { > + if (pedantic) > + cpp_error_with_line (pfile, CPP_DL_PEDWARN, > + pfile->line_table->highest_line, > + CPP_BUF_COL (buffer), > + "invalid UTF-8 character <%x><%x><%x>", > + cur[0], cur[1], cur[2]); > + else > + cpp_warning_with_line (pfile, CPP_W_INVALID_UTF8, > + pfile->line_table->highest_line, > + CPP_BUF_COL (buffer), > + "invalid UTF-8 character <%x><%x><%x>", > + cur[0], cur[1], cur[2]); > + return cur + 3; > + } > + else > + { > + if (pedantic) > + cpp_error_with_line (pfile, CPP_DL_PEDWARN, > + pfile->line_table->highest_line, > + CPP_BUF_COL (buffer), > + "invalid UTF-8 character <%x><%x><%x><%x>", > + cur[0], cur[1], cur[2], cur[3]); > + else > + cpp_warning_with_line (pfile, CPP_W_INVALID_UTF8, > + pfile->line_table->highest_line, > + CPP_BUF_COL (buffer), > + "invalid UTF-8 character <%x><%x><%x><%x>", > + cur[0], cur[1], cur[2], cur[3]); > + return cur + 4; > + } > +} > + > +/* Helper function of *skip_*_comment and lex*_string. For C, > + character at CUR[-1] with MSB set handle -Wbidi-chars* and > + -Winvalid-utf8 diagnostics and return pointer to first character > + that should be processed next. */ > + > +static inline const uchar * > +_cpp_handle_multibyte_utf8 (cpp_reader *pfile, uchar c, > + const uchar *cur, bool warn_bidi_p, > + bool warn_invalid_utf8_p) > +{ > + /* If this is a beginning of a UTF-8 encoding, it might be > + a bidirectional control character. 
*/ > + if (c == bidi::utf8_start && warn_bidi_p) > + { > + location_t loc; > + bidi::kind kind = get_bidi_utf8 (pfile, cur - 1, &loc); > + maybe_warn_bidi_on_char (pfile, kind, /*ucn_p=*/false, loc); > + } > + if (!warn_invalid_utf8_p) > + return cur; > + if (c >= utf8_signifier) > + { > + cppchar_t s; > + const uchar *pstr = cur - 1; > + if (_cpp_valid_utf8 (pfile, &pstr, pfile->buffer->rlimit, 0, NULL, &s) > + && s <= UCS_LIMIT) > + return pstr; > + } > + pfile->buffer->cur = cur - 1; > + return _cpp_warn_invalid_utf8 (pfile); > +} > + > /* Skip a C-style block comment. We find the end of the comment by > seeing if an asterisk is before every '/' we encounter. Returns > nonzero if comment terminated by EOF, zero otherwise. > @@ -1716,6 +1833,8 @@ _cpp_skip_block_comment (cpp_reader *pfi > const uchar *cur = buffer->cur; > uchar c; > const bool warn_bidi_p = pfile->warn_bidi_p (); > + const bool warn_invalid_utf8_p = CPP_OPTION (pfile, cpp_warn_invalid_utf8); > + const bool warn_bidi_or_invalid_utf8_p = warn_bidi_p | warn_invalid_utf8_p; > > cur++; > if (*cur == '/') > @@ -1765,14 +1884,10 @@ _cpp_skip_block_comment (cpp_reader *pfi > > cur = buffer->cur; > } > - /* If this is a beginning of a UTF-8 encoding, it might be > - a bidirectional control character. 
*/ > - else if (__builtin_expect (c == bidi::utf8_start, 0) && warn_bidi_p) > - { > - location_t loc; > - bidi::kind kind = get_bidi_utf8 (pfile, cur - 1, &loc); > - maybe_warn_bidi_on_char (pfile, kind, /*ucn_p=*/false, loc); > - } > + else if (__builtin_expect (c >= utf8_continuation, 0) > + && warn_bidi_or_invalid_utf8_p) > + cur = _cpp_handle_multibyte_utf8 (pfile, c, cur, warn_bidi_p, > + warn_invalid_utf8_p); > } > > buffer->cur = cur; > @@ -1789,11 +1904,13 @@ skip_line_comment (cpp_reader *pfile) > cpp_buffer *buffer = pfile->buffer; > location_t orig_line = pfile->line_table->highest_line; > const bool warn_bidi_p = pfile->warn_bidi_p (); > + const bool warn_invalid_utf8_p = CPP_OPTION (pfile, cpp_warn_invalid_utf8); > + const bool warn_bidi_or_invalid_utf8_p = warn_bidi_p | warn_invalid_utf8_p; > > - if (!warn_bidi_p) > + if (!warn_bidi_or_invalid_utf8_p) > while (*buffer->cur != '\n') > buffer->cur++; > - else > + else if (!warn_invalid_utf8_p) > { > while (*buffer->cur != '\n' > && *buffer->cur != bidi::utf8_start) > @@ -1813,6 +1930,22 @@ skip_line_comment (cpp_reader *pfile) > maybe_warn_bidi_on_close (pfile, buffer->cur); > } > } > + else > + { > + while (*buffer->cur != '\n') > + { > + if (*buffer->cur < utf8_continuation) > + { > + buffer->cur++; > + continue; > + } > + buffer->cur > + = _cpp_handle_multibyte_utf8 (pfile, *buffer->cur, buffer->cur + 1, > + warn_bidi_p, warn_invalid_utf8_p); > + } > + if (warn_bidi_p) > + maybe_warn_bidi_on_close (pfile, buffer->cur); > + } > > _cpp_process_line_notes (pfile, true); > return orig_line != pfile->line_table->highest_line; > @@ -1919,8 +2052,6 @@ warn_about_normalization (cpp_reader *pf > } > } > > -static const cppchar_t utf8_signifier = 0xC0; > - > /* Returns TRUE if the sequence starting at buffer->cur is valid in > an identifier. FIRST is TRUE if this starts an identifier. 
*/ > > @@ -2361,6 +2492,8 @@ lex_raw_string (cpp_reader *pfile, cpp_t > { > const uchar *pos = base; > const bool warn_bidi_p = pfile->warn_bidi_p (); > + const bool warn_invalid_utf8_p = CPP_OPTION (pfile, cpp_warn_invalid_utf8); > + const bool warn_bidi_or_invalid_utf8_p = warn_bidi_p | warn_invalid_utf8_p; > > /* 'tis a pity this information isn't passed down from the lexer's > initial categorization of the token. */ > @@ -2597,13 +2730,10 @@ lex_raw_string (cpp_reader *pfile, cpp_t > pos = base = pfile->buffer->cur; > note = &pfile->buffer->notes[pfile->buffer->cur_note]; > } > - else if (__builtin_expect ((unsigned char) c == bidi::utf8_start, 0) > - && warn_bidi_p) > - { > - location_t loc; > - bidi::kind kind = get_bidi_utf8 (pfile, pos - 1, &loc); > - maybe_warn_bidi_on_char (pfile, kind, /*ucn_p=*/false, loc); > - } > + else if (__builtin_expect ((unsigned char) c >= utf8_continuation, 0) > + && warn_bidi_or_invalid_utf8_p) > + pos = _cpp_handle_multibyte_utf8 (pfile, c, pos, warn_bidi_p, > + warn_invalid_utf8_p); > } > > if (warn_bidi_p) > @@ -2704,6 +2834,8 @@ lex_string (cpp_reader *pfile, cpp_token > terminator = '>', type = CPP_HEADER_NAME; > > const bool warn_bidi_p = pfile->warn_bidi_p (); > + const bool warn_invalid_utf8_p = CPP_OPTION (pfile, cpp_warn_invalid_utf8); > + const bool warn_bidi_or_invalid_utf8_p = warn_bidi_p | warn_invalid_utf8_p; > for (;;) > { > cppchar_t c = *cur++; > @@ -2745,12 +2877,10 @@ lex_string (cpp_reader *pfile, cpp_token > } > else if (c == '\0') > saw_NUL = true; > - else if (__builtin_expect (c == bidi::utf8_start, 0) && warn_bidi_p) > - { > - location_t loc; > - bidi::kind kind = get_bidi_utf8 (pfile, cur - 1, &loc); > - maybe_warn_bidi_on_char (pfile, kind, /*ucn_p=*/false, loc); > - } > + else if (__builtin_expect (c >= utf8_continuation, 0) > + && warn_bidi_or_invalid_utf8_p) > + cur = _cpp_handle_multibyte_utf8 (pfile, c, cur, warn_bidi_p, > + warn_invalid_utf8_p); > } > > if (saw_NUL && !pfile->state.skipping) > 
@@ -4052,6 +4182,7 @@ _cpp_lex_direct (cpp_reader *pfile) > default: > { > const uchar *base = --buffer->cur; > + static int no_warn_cnt; > > /* Check for an extended identifier ($ or UCN or UTF-8). */ > struct normalize_state nst = INITIAL_NORMALIZE_STATE; > @@ -4072,7 +4203,33 @@ _cpp_lex_direct (cpp_reader *pfile) > const uchar *pstr = base; > cppchar_t s; > if (_cpp_valid_utf8 (pfile, &pstr, buffer->rlimit, 0, NULL, &s)) > - buffer->cur = pstr; > + { > + if (s > UCS_LIMIT && CPP_OPTION (pfile, cpp_warn_invalid_utf8)) > + { > + buffer->cur = base; > + _cpp_warn_invalid_utf8 (pfile); > + } > + buffer->cur = pstr; > + } > + else if (CPP_OPTION (pfile, cpp_warn_invalid_utf8)) > + { > + buffer->cur = base; > + const uchar *end = _cpp_warn_invalid_utf8 (pfile); > + buffer->cur = base + 1; > + no_warn_cnt = end - buffer->cur; > + } > + } > + else if (c >= utf8_continuation > + && CPP_OPTION (pfile, cpp_warn_invalid_utf8)) > + { > + if (no_warn_cnt) > + --no_warn_cnt; > + else > + { > + buffer->cur = base; > + _cpp_warn_invalid_utf8 (pfile); > + buffer->cur = base + 1; > + } > } > create_literal (pfile, result, base, buffer->cur - base, CPP_OTHER); > break; > --- gcc/doc/invoke.texi.jj 2022-08-31 10:20:20.224976860 +0200 > +++ gcc/doc/invoke.texi 2022-08-31 14:58:17.654138549 +0200 > @@ -365,9 +365,9 @@ Objective-C and Objective-C++ Dialects}. 
> -Winfinite-recursion @gol > -Winit-self -Winline -Wno-int-conversion -Wint-in-bool-context @gol > -Wno-int-to-pointer-cast -Wno-invalid-memory-model @gol > --Winvalid-pch -Wjump-misses-init -Wlarger-than=@var{byte-size} @gol > --Wlogical-not-parentheses -Wlogical-op -Wlong-long @gol > --Wno-lto-type-mismatch -Wmain -Wmaybe-uninitialized @gol > +-Winvalid-pch -Winvalid-utf8 -Wjump-misses-init @gol > +-Wlarger-than=@var{byte-size} -Wlogical-not-parentheses -Wlogical-op @gol > +-Wlong-long -Wno-lto-type-mismatch -Wmain -Wmaybe-uninitialized @gol > -Wmemset-elt-size -Wmemset-transposed-args @gol > -Wmisleading-indentation -Wmissing-attributes -Wmissing-braces @gol > -Wmissing-field-initializers -Wmissing-format-attribute @gol > @@ -9569,6 +9569,13 @@ different size. > Warn if a precompiled header (@pxref{Precompiled Headers}) is found in > the search path but cannot be used. > > +@item -Winvalid-utf8 > +@opindex Winvalid-utf8 > +@opindex Wno-invalid-utf8 > +Warn if an invalid UTF-8 character is found. > +This warning is on by default for C++23 if @option{-finput-charset=UTF-8} > +is used and turned into error with @option{-pedantic-errors}. > + > @item -Wlong-long > @opindex Wlong-long > @opindex Wno-long-long > --- gcc/c-family/c.opt.jj 2022-08-31 10:19:45.145453711 +0200 > +++ gcc/c-family/c.opt 2022-08-31 12:25:42.457125674 +0200 > @@ -821,6 +821,10 @@ Winvalid-pch > C ObjC C++ ObjC++ CPP(warn_invalid_pch) CppReason(CPP_W_INVALID_PCH) Var(cpp_warn_invalid_pch) Init(0) Warning > Warn about PCH files that are found but not used. > > +Winvalid-utf8 > +C objC C++ ObjC++ CPP(cpp_warn_invalid_utf8) CppReason(CPP_W_INVALID_UTF8) Var(warn_invalid_utf8) Init(0) Warning > +Warn about invalid UTF-8 characters in comments. > + > Wjump-misses-init > C ObjC Var(warn_jump_misses_init) Warning LangEnabledby(C ObjC,Wc++-compat) > Warn when a jump misses a variable initialization. 
> --- gcc/c-family/c-opts.cc.jj 2022-08-31 10:19:45.080454594 +0200 > +++ gcc/c-family/c-opts.cc 2022-08-31 12:25:42.457125674 +0200 > @@ -534,6 +534,7 @@ c_common_handle_option (size_t scode, co > > case OPT_finput_charset_: > cpp_opts->input_charset = arg; > + cpp_opts->cpp_input_charset_explicit = 1; > break; > > case OPT_ftemplate_depth_: > @@ -1152,6 +1153,17 @@ c_common_post_options (const char **pfil > lang_hooks.preprocess_options (parse_in); > cpp_post_options (parse_in); > init_global_opts_from_cpp (&global_options, cpp_get_options (parse_in)); > + /* For C++23 and explicit -finput-charset=UTF-8, turn on -Winvalid-utf8 > + by default and make it a pedwarn unless -Wno-invalid-utf8. */ > + if (cxx_dialect >= cxx23 > + && cpp_opts->cpp_input_charset_explicit > + && strcmp (cpp_opts->input_charset, "UTF-8") == 0 > + && (cpp_opts->cpp_warn_invalid_utf8 > + || !global_options_set.x_warn_invalid_utf8)) > + { > + global_options.x_warn_invalid_utf8 = 1; > + cpp_opts->cpp_warn_invalid_utf8 = cpp_opts->cpp_pedantic ? 
2 : 1; > + } > > /* Let diagnostics infrastructure know how to convert input files the same > way libcpp will do it, namely using the configured input charset and > --- gcc/testsuite/c-c++-common/cpp/Winvalid-utf8-1.c.jj 2022-08-31 12:25:42.458125660 +0200 > +++ gcc/testsuite/c-c++-common/cpp/Winvalid-utf8-1.c 2022-08-31 14:58:28.977986756 +0200 > @@ -0,0 +1,43 @@ > +// P2295R6 - Support for UTF-8 as a portable source file encoding > +// This test intentionally contains various byte sequences which are not valid UTF-8 > +// { dg-do preprocess } > +// { dg-options "-finput-charset=UTF-8 -Winvalid-utf8" } > + > +// a€߿ࠀ퟿𐀀􏿿a { dg-bogus "invalid UTF-8 character" } > +// a�a { dg-warning "invalid UTF-8 character <80>" } > +// a�a { dg-warning "invalid UTF-8 character " } > +// a�a { dg-warning "invalid UTF-8 character " } > +// a�a { dg-warning "invalid UTF-8 character " } > +// a�a { dg-warning "invalid UTF-8 character " } > +// a�a { dg-warning "invalid UTF-8 character " } > +// a�a { dg-warning "invalid UTF-8 character " } > +// a�a { dg-warning "invalid UTF-8 character " } > +// a���a { dg-warning "invalid UTF-8 character <80>" } > +// a���a { dg-warning "invalid UTF-8 character <9f><80>" } > +// a��a { dg-warning "invalid UTF-8 character " } > +// a��a { dg-warning "invalid UTF-8 character <80>" } > +// a���a { dg-warning "invalid UTF-8 character <80>" } > +// a����a { dg-warning "invalid UTF-8 character <80><80><80>" } > +// a����a { dg-warning "invalid UTF-8 character <8f>" } > +// a����a { dg-warning "invalid UTF-8 character <90><80><80>" } > +// a������a { dg-warning "invalid UTF-8 character " } > +// { dg-warning "invalid UTF-8 character " "" { target *-*-* } .-1 } > +/* a€߿ࠀ퟿𐀀􏿿a { dg-bogus "invalid UTF-8 character" } */ > +/* a�a { dg-warning "invalid UTF-8 character <80>" } */ > +/* a�a { dg-warning "invalid UTF-8 character " } */ > +/* a�a { dg-warning "invalid UTF-8 character " } */ > +/* a�a { dg-warning "invalid UTF-8 character " } */ > +/* a�a { 
dg-warning "invalid UTF-8 character " } */ > +/* a�a { dg-warning "invalid UTF-8 character " } */ > +/* a�a { dg-warning "invalid UTF-8 character " } */ > +/* a�a { dg-warning "invalid UTF-8 character " } */ > +/* a���a { dg-warning "invalid UTF-8 character <80>" } */ > +/* a���a { dg-warning "invalid UTF-8 character <9f><80>" } */ > +/* a��a { dg-warning "invalid UTF-8 character " } */ > +/* a��a { dg-warning "invalid UTF-8 character <80>" } */ > +/* a���a { dg-warning "invalid UTF-8 character <80>" } */ > +/* a����a { dg-warning "invalid UTF-8 character <80><80><80>" } */ > +/* a����a { dg-warning "invalid UTF-8 character <8f>" } */ > +/* a����a { dg-warning "invalid UTF-8 character <90><80><80>" } */ > +/* a������a { dg-warning "invalid UTF-8 character " } */ > +/* { dg-warning "invalid UTF-8 character " "" { target *-*-* } .-1 } */ > --- gcc/testsuite/c-c++-common/cpp/Winvalid-utf8-2.c.jj 2022-08-31 12:25:42.458125660 +0200 > +++ gcc/testsuite/c-c++-common/cpp/Winvalid-utf8-2.c 2022-08-31 14:58:38.091864588 +0200 > @@ -0,0 +1,88 @@ > +// P2295R6 - Support for UTF-8 as a portable source file encoding > +// This test intentionally contains various byte sequences which are not valid UTF-8 > +// { dg-do preprocess { target { c || c++11 } } } > +// { dg-require-effective-target wchar } > +// { dg-options "-finput-charset=UTF-8 -Winvalid-utf8" } > +// { dg-additional-options "-std=gnu99" { target c } } > + > +#ifndef __cplusplus > +#include > +typedef __CHAR16_TYPE__ char16_t; > +typedef __CHAR32_TYPE__ char32_t; > +#endif > + > +char32_t a = U'�'; // { dg-warning "invalid UTF-8 character <80>" } > +char32_t b = U'�'; // { dg-warning "invalid UTF-8 character " } > +char32_t c = U'�'; // { dg-warning "invalid UTF-8 character " } > +char32_t d = U'�'; // { dg-warning "invalid UTF-8 character " } > +char32_t e = U'�'; // { dg-warning "invalid UTF-8 character " } > +char32_t f = U'�'; // { dg-warning "invalid UTF-8 character " } > +char32_t g = U'�'; // { dg-warning 
"invalid UTF-8 character " } > +char32_t h = U'�'; // { dg-warning "invalid UTF-8 character " } > +char32_t i = U'���'; // { dg-warning "invalid UTF-8 character <80>" } > +char32_t j = U'���'; // { dg-warning "invalid UTF-8 character <9f><80>" } > +char32_t k = U'��'; // { dg-warning "invalid UTF-8 character " } > +char32_t l = U'��'; // { dg-warning "invalid UTF-8 character <80>" } > +char32_t m = U'���'; // { dg-warning "invalid UTF-8 character <80>" } > +char32_t n = U'����'; // { dg-warning "invalid UTF-8 character <80><80><80>" } > +char32_t o = U'����'; // { dg-warning "invalid UTF-8 character <8f>" } > +char32_t p = U'����'; // { dg-warning "invalid UTF-8 character <90><80><80>" } > +char32_t q = U'������'; // { dg-warning "invalid UTF-8 character " } > + // { dg-warning "invalid UTF-8 character " "" { target *-*-* } .-1 } > +const char32_t *A = U"€߿ࠀ퟿𐀀􏿿"; // { dg-bogus "invalid UTF-8 character" } > +const char32_t *B = U"�"; // { dg-warning "invalid UTF-8 character <80>" } > +const char32_t *C = U"�"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *D = U"�"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *E = U"�"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *F = U"�"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *G = U"�"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *H = U"�"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *I = U"�"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *J = U"���"; // { dg-warning "invalid UTF-8 character <80>" } > +const char32_t *K = U"���"; // { dg-warning "invalid UTF-8 character <9f><80>" } > +const char32_t *L = U"��"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *M = U"��"; // { dg-warning "invalid UTF-8 character <80>" } > +const char32_t *N = U"���"; // { dg-warning "invalid UTF-8 character <80>" } > +const char32_t *O = U"����"; // { dg-warning "invalid UTF-8 character <80><80><80>" 
} > +const char32_t *P = U"����"; // { dg-warning "invalid UTF-8 character <8f>" } > +const char32_t *Q = U"����"; // { dg-warning "invalid UTF-8 character <90><80><80>" } > +const char32_t *R = U"������"; // { dg-warning "invalid UTF-8 character " } > + // { dg-warning "invalid UTF-8 character " "" { target *-*-* } .-1 } > +const char32_t *A1 = UR"(€߿ࠀ퟿𐀀􏿿)"; // { dg-bogus "invalid UTF-8 character" } > +const char32_t *B1 = UR"(�)"; // { dg-warning "invalid UTF-8 character <80>" } > +const char32_t *C1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *D1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *E1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *F1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *G1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *H1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *I1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *J1 = UR"(���)"; // { dg-warning "invalid UTF-8 character <80>" } > +const char32_t *K1 = UR"(���)"; // { dg-warning "invalid UTF-8 character <9f><80>" } > +const char32_t *L1 = UR"(��)"; // { dg-warning "invalid UTF-8 character " } > +const char32_t *M1 = UR"(��)"; // { dg-warning "invalid UTF-8 character <80>" } > +const char32_t *N1 = UR"(���)"; // { dg-warning "invalid UTF-8 character <80>" } > +const char32_t *O1 = UR"(����)"; // { dg-warning "invalid UTF-8 character <80><80><80>" } > +const char32_t *P1 = UR"(����)"; // { dg-warning "invalid UTF-8 character <8f>" } > +const char32_t *Q1 = UR"(����)"; // { dg-warning "invalid UTF-8 character <90><80><80>" } > +const char32_t *R1 = UR"(������)"; // { dg-warning "invalid UTF-8 character " } > + // { dg-warning "invalid UTF-8 character " "" { target *-*-* } .-1 } > +const char *A2 = u8"€߿ࠀ퟿𐀀􏿿"; // { dg-bogus "invalid UTF-8 character" } > +const char *B2 = u8"�"; // { dg-warning 
"invalid UTF-8 character <80>" } > +const char *C2 = u8"�"; // { dg-warning "invalid UTF-8 character " } > +const char *D2 = u8"�"; // { dg-warning "invalid UTF-8 character " } > +const char *E2 = u8"�"; // { dg-warning "invalid UTF-8 character " } > +const char *F2 = u8"�"; // { dg-warning "invalid UTF-8 character " } > +const char *G2 = u8"�"; // { dg-warning "invalid UTF-8 character " } > +const char *H2 = u8"�"; // { dg-warning "invalid UTF-8 character " } > +const char *I2 = u8"�"; // { dg-warning "invalid UTF-8 character " } > +const char *J2 = u8"���"; // { dg-warning "invalid UTF-8 character <80>" } > +const char *K2 = u8"���"; // { dg-warning "invalid UTF-8 character <9f><80>" } > +const char *L2 = u8"��"; // { dg-warning "invalid UTF-8 character " } > +const char *M2 = u8"��"; // { dg-warning "invalid UTF-8 character <80>" } > +const char *N2 = u8"���"; // { dg-warning "invalid UTF-8 character <80>" } > +const char *O2 = u8"����"; // { dg-warning "invalid UTF-8 character <80><80><80>" } > +const char *P2 = u8"����"; // { dg-warning "invalid UTF-8 character <8f>" } > +const char *Q2 = u8"����"; // { dg-warning "invalid UTF-8 character <90><80><80>" } > +const char *R2 = u8"������"; // { dg-warning "invalid UTF-8 character " } > + // { dg-warning "invalid UTF-8 character " "" { target *-*-* } .-1 } > --- gcc/testsuite/c-c++-common/cpp/Winvalid-utf8-3.c.jj 2022-08-31 13:15:18.456085432 +0200 > +++ gcc/testsuite/c-c++-common/cpp/Winvalid-utf8-3.c 2022-08-31 15:55:45.469984253 +0200 > @@ -0,0 +1,27 @@ > +// P2295R6 - Support for UTF-8 as a portable source file encoding > +// This test intentionally contains various byte sequences which are not valid UTF-8 > +// { dg-do preprocess } > +// { dg-options "-finput-charset=UTF-8 -Winvalid-utf8" } > + > +#define I(x) > +I(€߿ࠀ퟿𐀀􏿿) // { dg-bogus "invalid UTF-8 character" } > + // { dg-error "is not valid in an identifier" "" { target c++ } .-1 } > +I(�) // { dg-warning "invalid UTF-8 character <80>" } > +I(�) // { 
dg-warning "invalid UTF-8 character " } > +I(�) // { dg-warning "invalid UTF-8 character " } > +I(�) // { dg-warning "invalid UTF-8 character " } > +I(�) // { dg-warning "invalid UTF-8 character " } > +I(�) // { dg-warning "invalid UTF-8 character " } > +I(�) // { dg-warning "invalid UTF-8 character " } > +I(�) // { dg-warning "invalid UTF-8 character " } > +I(���) // { dg-warning "invalid UTF-8 character <80>" } > +I(���) // { dg-warning "invalid UTF-8 character <9f><80>" } > +I(��) // { dg-warning "invalid UTF-8 character " } > +I(��) // { dg-warning "invalid UTF-8 character <80>" } > +I(���) // { dg-warning "invalid UTF-8 character <80>" } > +I(����) // { dg-warning "invalid UTF-8 character <80><80><80>" } > +I(����) // { dg-warning "invalid UTF-8 character <8f>" } > +I(����) // { dg-warning "invalid UTF-8 character <90><80><80>" "" { target c } } > + // { dg-error "is not valid in an identifier" "" { target c++ } .-1 } > +I(������) // { dg-warning "invalid UTF-8 character " "" { target c } } > + // { dg-error "is not valid in an identifier" "" { target c++ } .-1 } > --- gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-1.C.jj 2022-08-31 12:25:42.458125660 +0200 > +++ gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-1.C 2022-08-31 14:59:22.296272041 +0200 > @@ -0,0 +1,43 @@ > +// P2295R6 - Support for UTF-8 as a portable source file encoding > +// This test intentionally contains various byte sequences which are not valid UTF-8 > +// { dg-do preprocess } > +// { dg-options "-finput-charset=UTF-8" } > + > +// a€߿ࠀ퟿𐀀􏿿a { dg-bogus "invalid UTF-8 character" } > +// a�a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +// a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-warning "invalid UTF-8 character " "" { target 
c++23 } } > +// a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a���a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +// a���a { dg-warning "invalid UTF-8 character <9f><80>" "" { target c++23 } } > +// a��a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a��a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +// a���a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +// a����a { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +// a����a { dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } } > +// a����a { dg-warning "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +// a������a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// { dg-warning "invalid UTF-8 character " "" { target c++23 } .-1 } > +/* a€߿ࠀ퟿𐀀􏿿a { dg-bogus "invalid UTF-8 character" } */ > +/* a�a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a���a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } */ > +/* a���a { dg-warning "invalid UTF-8 character <9f><80>" "" { target c++23 } } */ > +/* a?�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a��a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } */ > +/* a���a { dg-warning "invalid UTF-8 character <80>" "" { target 
c++23 } } */ > +/* a����a { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } } */ > +/* a����a { dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } } */ > +/* a����a { dg-warning "invalid UTF-8 character <90><80><80>" "" { target c++23 } } */ > +/* a������a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* { dg-warning "invalid UTF-8 character " "" { target c++23 } .-1 } */ > --- gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-2.C.jj 2022-08-31 12:25:42.458125660 +0200 > +++ gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-2.C 2022-08-31 14:59:27.314204777 +0200 > @@ -0,0 +1,43 @@ > +// P2295R6 - Support for UTF-8 as a portable source file encoding > +// This test intentionally contains various byte sequences which are not valid UTF-8 > +// { dg-do preprocess } > +// { dg-options "-finput-charset=UTF-8 -pedantic" } > + > +// a€߿ࠀ퟿𐀀􏿿a { dg-bogus "invalid UTF-8 character" } > +// a�a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +// a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a?a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a���a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +// a���a { dg-warning "invalid UTF-8 character <9f><80>" "" { target c++23 } } > +// a��a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// a��a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +// a���a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +// a����a { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +// a����a { 
dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } } > +// a����a { dg-warning "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +// a������a { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +// { dg-warning "invalid UTF-8 character " "" { target c++23 } .-1 } > +/* a€߿ࠀ퟿𐀀􏿿a { dg-bogus "invalid UTF-8 character" } */ > +/* a�a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a���a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } */ > +/* a���a { dg-warning "invalid UTF-8 character <9f><80>" "" { target c++23 } } */ > +/* a��a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* a��a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } */ > +/* a���a { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } */ > +/* a����a { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } } */ > +/* a����a { dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } } */ > +/* a����a { dg-warning "invalid UTF-8 character <90><80><80>" "" { target c++23 } } */ > +/* a������a { dg-warning "invalid UTF-8 character " "" { target c++23 } } */ > +/* { dg-warning "invalid UTF-8 character " "" { target c++23 } .-1 } */ > --- gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-3.C.jj 2022-08-31 12:25:42.458125660 +0200 > +++ gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-3.C 2022-08-31 14:59:33.624120194 +0200 > @@ -0,0 +1,43 @@ > +// P2295R6 - 
Support for UTF-8 as a portable source file encoding > +// This test intentionally contains various byte sequences which are not valid UTF-8 > +// { dg-do preprocess } > +// { dg-options "-finput-charset=UTF-8 -pedantic-errors" } > + > +// a€߿ࠀ퟿𐀀􏿿a { dg-bogus "invalid UTF-8 character" } > +// a�a { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +// a�a { dg-error "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-error "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-error "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-error "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-error "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-error "invalid UTF-8 character " "" { target c++23 } } > +// a�a { dg-error "invalid UTF-8 character " "" { target c++23 } } > +// a���a { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +// a���a { dg-error "invalid UTF-8 character <9f><80>" "" { target c++23 } } > +// a��a { dg-error "invalid UTF-8 character " "" { target c++23 } } > +// a��a { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +// a���a { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +// a����a { dg-error "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +// a����a { dg-error "invalid UTF-8 character <8f>" "" { target c++23 } } > +// a����a { dg-error "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +// a������a { dg-error "invalid UTF-8 character " "" { target c++23 } } > +// { dg-error "invalid UTF-8 character " "" { target c++23 } .-1 } > +/* a€߿ࠀ퟿𐀀􏿿a { dg-bogus "invalid UTF-8 character" } */ > +/* a�a { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } */ > +/* a�a { dg-error "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-error "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-error "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-error 
"invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-error "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-error "invalid UTF-8 character " "" { target c++23 } } */ > +/* a�a { dg-error "invalid UTF-8 character " "" { target c++23 } } */ > +/* a���a { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } */ > +/* a���a { dg-error "invalid UTF-8 character <9f><80>" "" { target c++23 } } */ > +/* a��a { dg-error "invalid UTF-8 character " "" { target c++23 } } */ > +/* a��a { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } */ > +/* a���a { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } */ > +/* a����a { dg-error "invalid UTF-8 character <80><80><80>" "" { target c++23 } } */ > +/* a����a { dg-error "invalid UTF-8 character <8f>" "" { target c++23 } } */ > +/* a����a { dg-error "invalid UTF-8 character <90><80><80>" "" { target c++23 } } */ > +/* a������a { dg-error "invalid UTF-8 character " "" { target c++23 } } */ > +/* { dg-error "invalid UTF-8 character " "" { target c++23 } .-1 } */ > --- gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-4.C.jj 2022-08-31 12:25:42.458125660 +0200 > +++ gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-4.C 2022-08-31 14:59:37.965062005 +0200 > @@ -0,0 +1,43 @@ > +// P2295R6 - Support for UTF-8 as a portable source file encoding > +// This test intentionally contains various byte sequences which are not valid UTF-8 > +// { dg-do preprocess } > +// { dg-options "-finput-charset=UTF-8 -pedantic-errors -Wno-invalid-utf8" } > + > +// a€߿ࠀ퟿𐀀􏿿a { dg-bogus "invalid UTF-8 character" } > +// a�a { dg-bogus "invalid UTF-8 character <80>" } > +// a�a { dg-bogus "invalid UTF-8 character " } > +// a�a { dg-bogus "invalid UTF-8 character " } > +// a�a { dg-bogus "invalid UTF-8 character " } > +// a�a { dg-bogus "invalid UTF-8 character " } > +// a�a { dg-bogus "invalid UTF-8 character " } > +// a�a { dg-bogus "invalid UTF-8 character " } > +// a�a { dg-bogus "invalid UTF-8 character " } > +// 
a���a { dg-bogus "invalid UTF-8 character <80>" } > +// a���a { dg-bogus "invalid UTF-8 character <9f><80>" } > +// a��a { dg-bogus "invalid UTF-8 character " } > +// a��a { dg-bogus "invalid UTF-8 character <80>" } > +// a���a { dg-bogus "invalid UTF-8 character <80>" } > +// a����a { dg-bogus "invalid UTF-8 character <80><80><80>" } > +// a����a { dg-bogus "invalid UTF-8 character <8f>" } > +// a����a { dg-bogus "invalid UTF-8 character <90><80><80>" } > +// a������a { dg-bogus "invalid UTF-8 character " } > +// { dg-bogus "invalid UTF-8 character " "" { target *-*-* } .-1 } > +/* a€߿ࠀ퟿𐀀􏿿a { dg-bogus "invalid UTF-8 character" } */ > +/* a�a { dg-bogus "invalid UTF-8 character <80>" } */ > +/* a�a { dg-bogus "invalid UTF-8 character " } */ > +/* a�a { dg-bogus "invalid UTF-8 character " } */ > +/* a�a { dg-bogus "invalid UTF-8 character " } */ > +/* a�a { dg-bogus "invalid UTF-8 character " } */ > +/* a�a { dg-bogus "invalid UTF-8 character " } */ > +/* a�a { dg-bogus "invalid UTF-8 character " } */ > +/* a�a { dg-bogus "invalid UTF-8 character " } */ > +/* a���a { dg-bogus "invalid UTF-8 character <80>" } */ > +/* a���a { dg-bogus "invalid UTF-8 character <9f><80>" } */ > +/* a��a { dg-bogus "invalid UTF-8 character " } */ > +/* a��a { dg-bogus "invalid UTF-8 character <80>" } */ > +/* a���a { dg-bogus "invalid UTF-8 character <80>" } */ > +/* a����a { dg-bogus "invalid UTF-8 character <80><80><80>" } */ > +/* a����a { dg-bogus "invalid UTF-8 character <8f>" } */ > +/* a����a { dg-bogus "invalid UTF-8 character <90><80><80>" } */ > +/* a������a { dg-bogus "invalid UTF-8 character " } */ > +/* { dg-bogus "invalid UTF-8 character " "" { target *-*-* } .-1 } */ > --- gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-5.C.jj 2022-08-31 12:25:42.458125660 +0200 > +++ gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-5.C 2022-08-31 14:59:49.089912880 +0200 > @@ -0,0 +1,80 @@ > +// P2295R6 - Support for UTF-8 as a portable source file encoding > +// This test intentionally contains 
various byte sequences which are not valid UTF-8 > +// { dg-do preprocess { target c++11 } } > +// { dg-options "-finput-charset=UTF-8" } > + > +char32_t a = U'?'; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +char32_t b = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t c = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t d = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t e = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t f = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t g = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t h = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t i = U'���'; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +char32_t j = U'���'; // { dg-warning "invalid UTF-8 character <9f><80>" "" { target c++23 } } > +char32_t k = U'��'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t l = U'��'; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +char32_t m = U'���'; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +char32_t n = U'����'; // { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +char32_t o = U'����'; // { dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } } > +char32_t p = U'����'; // { dg-warning "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +char32_t q = U'������'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > + // { dg-warning "invalid UTF-8 character " "" { target c++23 } .-1 } > +auto A = U"€߿ࠀ퟿𐀀􏿿"; // { dg-bogus "invalid UTF-8 character" } > +auto B = U"�"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto C = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 
} } > +auto D = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto E = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto F = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto G = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto H = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto I = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto J = U"���"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto K = U"���"; // { dg-warning "invalid UTF-8 character <9f><80>" "" { target c++23 } } > +auto L = U"��"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto M = U"��"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto N = U"���"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto O = U"����"; // { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +auto P = U"����"; // { dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } } > +auto Q = U"����"; // { dg-warning "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +auto R = U"������"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > + // { dg-warning "invalid UTF-8 character " "" { target c++23 } .-1 } > +auto A1 = UR"(€߿ࠀ퟿𐀀􏿿)"; // { dg-bogus "invalid UTF-8 character" } > +auto B1 = UR"(�)"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto C1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto D1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto E1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto F1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto G1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto H1 
= UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto I1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto J1 = UR"(���)"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto K1 = UR"(���)"; // { dg-warning "invalid UTF-8 character <9f><80>" "" { target c++23 } } > +auto L1 = UR"(��)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto M1 = UR"(��)"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto N1 = UR"(���)"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto O1 = UR"(����)"; // { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +auto P1 = UR"(����)"; // { dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } } > +auto Q1 = UR"(����)"; // { dg-warning "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +auto R1 = UR"(������)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > + // { dg-warning "invalid UTF-8 character " "" { target c++23 } .-1 } > +auto A2 = u8"€߿ࠀ퟿𐀀􏿿"; // { dg-bogus "invalid UTF-8 character" } > +auto B2 = u8"�"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto C2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto D2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto E2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto F2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto G2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto H2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto I2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto J2 = u8"���"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto K2 = u8"���"; // { dg-warning "invalid UTF-8 character 
<9f><80>" "" { target c++23 } } > +auto L2 = u8"��"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto M2 = u8"��"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto N2 = u8"���"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto O2 = u8"����"; // { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +auto P2 = u8"����"; // { dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } } > +auto Q2 = u8"����"; // { dg-warning "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +auto R2 = u8"������"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > + // { dg-warning "invalid UTF-8 character " "" { target c++23 } .-1 } > --- gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-6.C.jj 2022-08-31 12:25:42.458125660 +0200 > +++ gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-6.C 2022-08-31 14:59:58.128791717 +0200 > @@ -0,0 +1,80 @@ > +// P2295R6 - Support for UTF-8 as a portable source file encoding > +// This test intentionally contains various byte sequences which are not valid UTF-8 > +// { dg-do preprocess { target c++11 } } > +// { dg-options "-finput-charset=UTF-8 -pedantic" } > + > +char32_t a = U'�'; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +char32_t b = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t c = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t d = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t e = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t f = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t g = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t h = U'�'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t i = U'���'; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 
} } > +char32_t j = U'���'; // { dg-warning "invalid UTF-8 character <9f><80>" "" { target c++23 } } > +char32_t k = U'��'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +char32_t l = U'��'; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +char32_t m = U'���'; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +char32_t n = U'����'; // { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +char32_t o = U'����'; // { dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } } > +char32_t p = U'����'; // { dg-warning "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +char32_t q = U'������'; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > + // { dg-warning "invalid UTF-8 character " "" { target c++23 } .-1 } > +auto A = U"€߿ࠀ퟿𐀀􏿿"; // { dg-bogus "invalid UTF-8 character" } > +auto B = U"�"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto C = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto D = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto E = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto F = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto G = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto H = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto I = U"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto J = U"���"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto K = U"���"; // { dg-warning "invalid UTF-8 character <9f><80>" "" { target c++23 } } > +auto L = U"��"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto M = U"��"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto N = U"���"; // { dg-warning "invalid UTF-8 character <80>" "" { 
target c++23 } } > +auto O = U"����"; // { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +auto P = U"����"; // { dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } } > +auto Q = U"����"; // { dg-warning "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +auto R = U"������"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > + // { dg-warning "invalid UTF-8 character " "" { target c++23 } .-1 } > +auto A1 = UR"(€߿ࠀ퟿𐀀􏿿)"; // { dg-bogus "invalid UTF-8 character" } > +auto B1 = UR"(�)"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto C1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto D1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto E1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto F1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto G1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto H1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto I1 = UR"(�)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto J1 = UR"(���)"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto K1 = UR"(���)"; // { dg-warning "invalid UTF-8 character <9f><80>" "" { target c++23 } } > +auto L1 = UR"(��)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto M1 = UR"(??)"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto N1 = UR"(���)"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto O1 = UR"(����)"; // { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +auto P1 = UR"(����)"; // { dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } } > +auto Q1 = UR"(����)"; // { dg-warning "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > 
+auto R1 = UR"(������)"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > + // { dg-warning "invalid UTF-8 character " "" { target c++23 } .-1 } > +auto A2 = u8"€߿ࠀ퟿𐀀􏿿"; // { dg-bogus "invalid UTF-8 character" } > +auto B2 = u8"�"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto C2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto D2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto E2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto F2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto G2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto H2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto I2 = u8"�"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto J2 = u8"���"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto K2 = u8"���"; // { dg-warning "invalid UTF-8 character <9f><80>" "" { target c++23 } } > +auto L2 = u8"��"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > +auto M2 = u8"��"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto N2 = u8"���"; // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } } > +auto O2 = u8"����"; // { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +auto P2 = u8"����"; // { dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } } > +auto Q2 = u8"����"; // { dg-warning "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +auto R2 = u8"������"; // { dg-warning "invalid UTF-8 character " "" { target c++23 } } > + // { dg-warning "invalid UTF-8 character " "" { target c++23 } .-1 } > --- gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-7.C.jj 2022-08-31 12:25:42.459125647 +0200 > +++ gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-7.C 2022-08-31 15:00:05.749689562 +0200 
> @@ -0,0 +1,80 @@ > +// P2295R6 - Support for UTF-8 as a portable source file encoding > +// This test intentionally contains various byte sequences which are not valid UTF-8 > +// { dg-do preprocess { target c++11 } } > +// { dg-options "-finput-charset=UTF-8 -pedantic-errors" } > + > +char32_t a = U'�'; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +char32_t b = U'�'; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +char32_t c = U'�'; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +char32_t d = U'�'; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +char32_t e = U'�'; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +char32_t f = U'�'; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +char32_t g = U'�'; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +char32_t h = U'�'; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +char32_t i = U'���'; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +char32_t j = U'���'; // { dg-error "invalid UTF-8 character <9f><80>" "" { target c++23 } } > +char32_t k = U'��'; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +char32_t l = U'��'; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +char32_t m = U'���'; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +char32_t n = U'����'; // { dg-error "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +char32_t o = U'����'; // { dg-error "invalid UTF-8 character <8f>" "" { target c++23 } } > +char32_t p = U'����'; // { dg-error "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +char32_t q = U'������'; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > + // { dg-error "invalid UTF-8 character " "" { target c++23 } .-1 } > +auto A = U"€߿ࠀ퟿𐀀􏿿"; // { dg-bogus "invalid UTF-8 character" } > +auto B = U"�"; // { dg-error "invalid UTF-8 character 
<80>" "" { target c++23 } } > +auto C = U"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto D = U"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto E = U"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto F = U"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto G = U"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto H = U"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto I = U"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto J = U"���"; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +auto K = U"���"; // { dg-error "invalid UTF-8 character <9f><80>" "" { target c++23 } } > +auto L = U"��"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto M = U"��"; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +auto N = U"���"; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +auto O = U"����"; // { dg-error "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +auto P = U"����"; // { dg-error "invalid UTF-8 character <8f>" "" { target c++23 } } > +auto Q = U"����"; // { dg-error "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +auto R = U"������"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > + // { dg-error "invalid UTF-8 character " "" { target c++23 } .-1 } > +auto A1 = UR"(€߿ࠀ퟿𐀀􏿿)"; // { dg-bogus "invalid UTF-8 character" } > +auto B1 = UR"(�)"; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +auto C1 = UR"(�)"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto D1 = UR"(�)"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto E1 = UR"(�)"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto F1 = UR"(�)"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto G1 = UR"(�)"; // { dg-error 
"invalid UTF-8 character " "" { target c++23 } } > +auto H1 = UR"(�)"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto I1 = UR"(�)"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto J1 = UR"(���)"; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +auto K1 = UR"(���)"; // { dg-error "invalid UTF-8 character <9f><80>" "" { target c++23 } } > +auto L1 = UR"(��)"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto M1 = UR"(��)"; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +auto N1 = UR"(���)"; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +auto O1 = UR"(����)"; // { dg-error "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +auto P1 = UR"(����)"; // { dg-error "invalid UTF-8 character <8f>" "" { target c++23 } } > +auto Q1 = UR"(����)"; // { dg-error "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +auto R1 = UR"(������)"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > + // { dg-error "invalid UTF-8 character " "" { target c++23 } .-1 } > +auto A2 = u8"€߿ࠀ퟿𐀀􏿿"; // { dg-bogus "invalid UTF-8 character" } > +auto B2 = u8"�"; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +auto C2 = u8"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto D2 = u8"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto E2 = u8"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto F2 = u8"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto G2 = u8"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto H2 = u8"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto I2 = u8"�"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto J2 = u8"���"; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +auto K2 = u8"���"; // { dg-error "invalid 
UTF-8 character <9f><80>" "" { target c++23 } } > +auto L2 = u8"��"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > +auto M2 = u8"��"; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +auto N2 = u8"���"; // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } } > +auto O2 = u8"����"; // { dg-error "invalid UTF-8 character <80><80><80>" "" { target c++23 } } > +auto P2 = u8"����"; // { dg-error "invalid UTF-8 character <8f>" "" { target c++23 } } > +auto Q2 = u8"����"; // { dg-error "invalid UTF-8 character <90><80><80>" "" { target c++23 } } > +auto R2 = u8"������"; // { dg-error "invalid UTF-8 character " "" { target c++23 } } > + // { dg-error "invalid UTF-8 character " "" { target c++23 } .-1 } > --- gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-8.C.jj 2022-08-31 12:25:42.459125647 +0200 > +++ gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-8.C 2022-08-31 15:00:11.378614108 +0200 > @@ -0,0 +1,80 @@ > +// P2295R6 - Support for UTF-8 as a portable source file encoding > +// This test intentionally contains various byte sequences which are not valid UTF-8 > +// { dg-do preprocess { target c++11 } } > +// { dg-options "-finput-charset=UTF-8 -pedantic-errors -Wno-invalid-utf8" } > + > +char32_t a = U'�'; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } } > +char32_t b = U'�'; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } } > +char32_t c = U'�'; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } } > +char32_t d = U'�'; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } } > +char32_t e = U'�'; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } } > +char32_t f = U'�'; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } } > +char32_t g = U'�'; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } } > +char32_t h = U'�'; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } } > +char32_t i = U'���'; // { dg-bogus "invalid UTF-8 character <80>" "" { target 
c++23 } }
> +char32_t j = U'���'; // { dg-bogus "invalid UTF-8 character <9f><80>" "" { target c++23 } }
> +char32_t k = U'��'; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +char32_t l = U'��'; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +char32_t m = U'���'; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +char32_t n = U'����'; // { dg-bogus "invalid UTF-8 character <80><80><80>" "" { target c++23 } }
> +char32_t o = U'����'; // { dg-bogus "invalid UTF-8 character <8f>" "" { target c++23 } }
> +char32_t p = U'����'; // { dg-bogus "invalid UTF-8 character <90><80><80>" "" { target c++23 } }
> +char32_t q = U'������'; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> + // { dg-bogus "invalid UTF-8 character " "" { target c++23 } .-1 }
> +auto A = U"€߿ࠀ퟿𐀀􏿿"; // { dg-bogus "invalid UTF-8 character" }
> +auto B = U"�"; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +auto C = U"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto D = U"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto E = U"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto F = U"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto G = U"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto H = U"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto I = U"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto J = U"���"; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +auto K = U"���"; // { dg-bogus "invalid UTF-8 character <9f><80>" "" { target c++23 } }
> +auto L = U"��"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto M = U"��"; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +auto N = U"���"; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +auto O = U"����"; // { dg-bogus "invalid UTF-8 character <80><80><80>" "" { target c++23 } }
> +auto P = U"����"; // { dg-bogus "invalid UTF-8 character <8f>" "" { target c++23 } }
> +auto Q = U"����"; // { dg-bogus "invalid UTF-8 character <90><80><80>" "" { target c++23 } }
> +auto R = U"������"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> + // { dg-bogus "invalid UTF-8 character " "" { target c++23 } .-1 }
> +auto A1 = UR"(€߿ࠀ퟿𐀀􏿿)"; // { dg-bogus "invalid UTF-8 character" }
> +auto B1 = UR"(�)"; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +auto C1 = UR"(�)"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto D1 = UR"(�)"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto E1 = UR"(�)"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto F1 = UR"(�)"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto G1 = UR"(�)"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto H1 = UR"(�)"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto I1 = UR"(�)"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto J1 = UR"(���)"; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +auto K1 = UR"(���)"; // { dg-bogus "invalid UTF-8 character <9f><80>" "" { target c++23 } }
> +auto L1 = UR"(��)"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto M1 = UR"(��)"; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +auto N1 = UR"(���)"; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +auto O1 = UR"(����)"; // { dg-bogus "invalid UTF-8 character <80><80><80>" "" { target c++23 } }
> +auto P1 = UR"(����)"; // { dg-bogus "invalid UTF-8 character <8f>" "" { target c++23 } }
> +auto Q1 = UR"(����)"; // { dg-bogus "invalid UTF-8 character <90><80><80>" "" { target c++23 } }
> +auto R1 = UR"(������)"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> + // { dg-bogus "invalid UTF-8 character " "" { target c++23 } .-1 }
> +auto A2 = u8"€߿ࠀ퟿𐀀􏿿"; // { dg-bogus "invalid UTF-8 character" }
> +auto B2 = u8"�"; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +auto C2 = u8"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto D2 = u8"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto E2 = u8"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto F2 = u8"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto G2 = u8"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto H2 = u8"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto I2 = u8"�"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto J2 = u8"���"; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +auto K2 = u8"���"; // { dg-bogus "invalid UTF-8 character <9f><80>" "" { target c++23 } }
> +auto L2 = u8"��"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> +auto M2 = u8"��"; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +auto N2 = u8"���"; // { dg-bogus "invalid UTF-8 character <80>" "" { target c++23 } }
> +auto O2 = u8"����"; // { dg-bogus "invalid UTF-8 character <80><80><80>" "" { target c++23 } }
> +auto P2 = u8"����"; // { dg-bogus "invalid UTF-8 character <8f>" "" { target c++23 } }
> +auto Q2 = u8"����"; // { dg-bogus "invalid UTF-8 character <90><80><80>" "" { target c++23 } }
> +auto R2 = u8"������"; // { dg-bogus "invalid UTF-8 character " "" { target c++23 } }
> + // { dg-bogus "invalid UTF-8 character " "" { target c++23 } .-1 }
> --- gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-9.C.jj 2022-08-31 15:56:45.801177822 +0200
> +++ gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-9.C 2022-08-31 15:57:42.281422873 +0200
> @@ -0,0 +1,25 @@
> +// P2295R6 - Support for UTF-8 as a portable source file encoding
> +// This test intentionally contains various byte sequences which are not valid UTF-8
> +// { dg-do preprocess }
> +// { dg-options "-finput-charset=UTF-8" }
> +
> +#define I(x)
> +I(€߿ࠀ퟿𐀀􏿿) // { dg-bogus "invalid UTF-8 character" }
> + // { dg-error "is not valid in an identifier" "" { target *-*-* } .-1 }
> +I(�) // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(���) // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } }
> +I(���) // { dg-warning "invalid UTF-8 character <9f><80>" "" { target c++23 } }
> +I(��) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(��) // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } }
> +I(���) // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } }
> +I(����) // { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } }
> +I(����) // { dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } }
> +I(����) // { dg-error "is not valid in an identifier" }
> +I(������) // { dg-error "is not valid in an identifier" }
> --- gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-10.C.jj 2022-08-31 15:58:04.029132184 +0200
> +++ gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-10.C 2022-08-31 15:58:09.934053256 +0200
> @@ -0,0 +1,25 @@
> +// P2295R6 - Support for UTF-8 as a portable source file encoding
> +// This test intentionally contains various byte sequences which are not valid UTF-8
> +// { dg-do preprocess }
> +// { dg-options "-finput-charset=UTF-8 -pedantic" }
> +
> +#define I(x)
> +I(€߿ࠀ퟿𐀀􏿿) // { dg-bogus "invalid UTF-8 character" }
> + // { dg-error "is not valid in an identifier" "" { target *-*-* } .-1 }
> +I(�) // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(���) // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } }
> +I(���) // { dg-warning "invalid UTF-8 character <9f><80>" "" { target c++23 } }
> +I(��) // { dg-warning "invalid UTF-8 character " "" { target c++23 } }
> +I(��) // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } }
> +I(���) // { dg-warning "invalid UTF-8 character <80>" "" { target c++23 } }
> +I(����) // { dg-warning "invalid UTF-8 character <80><80><80>" "" { target c++23 } }
> +I(����) // { dg-warning "invalid UTF-8 character <8f>" "" { target c++23 } }
> +I(����) // { dg-error "is not valid in an identifier" }
> +I(������) // { dg-error "is not valid in an identifier" }
> --- gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-11.C.jj 2022-08-31 15:58:18.825934408 +0200
> +++ gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-11.C 2022-08-31 15:58:30.929772616 +0200
> @@ -0,0 +1,25 @@
> +// P2295R6 - Support for UTF-8 as a portable source file encoding
> +// This test intentionally contains various byte sequences which are not valid UTF-8
> +// { dg-do preprocess }
> +// { dg-options "-finput-charset=UTF-8 -pedantic-errors" }
> +
> +#define I(x)
> +I(€߿ࠀ퟿𐀀􏿿) // { dg-bogus "invalid UTF-8 character" }
> + // { dg-error "is not valid in an identifier" "" { target *-*-* } .-1 }
> +I(�) // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } }
> +I(�) // { dg-error "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-error "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-error "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-error "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-error "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-error "invalid UTF-8 character " "" { target c++23 } }
> +I(�) // { dg-error "invalid UTF-8 character " "" { target c++23 } }
> +I(���) // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } }
> +I(���) // { dg-error "invalid UTF-8 character <9f><80>" "" { target c++23 } }
> +I(��) // { dg-error "invalid UTF-8 character " "" { target c++23 } }
> +I(��) // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } }
> +I(���) // { dg-error "invalid UTF-8 character <80>" "" { target c++23 } }
> +I(����) // { dg-error "invalid UTF-8 character <80><80><80>" "" { target c++23 } }
> +I(����) // { dg-error "invalid UTF-8 character <8f>" "" { target c++23 } }
> +I(����) // { dg-error "is not valid in an identifier" }
> +I(������) // { dg-error "is not valid in an identifier" }
> --- gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-12.C.jj 2022-08-31 15:58:52.404485572 +0200
> +++ gcc/testsuite/g++.dg/cpp23/Winvalid-utf8-12.C 2022-08-31 15:59:19.735120251 +0200
> @@ -0,0 +1,25 @@
> +// P2295R6 - Support for UTF-8 as a portable source file encoding
> +// This test intentionally contains various byte sequences which are not valid UTF-8
> +// { dg-do preprocess }
> +// { dg-options "-finput-charset=UTF-8 -pedantic-errors -Wno-invalid-utf8" }
> +
> +#define I(x)
> +I(€߿ࠀ퟿𐀀􏿿) // { dg-bogus "invalid UTF-8 character" }
> + // { dg-error "is not valid in an identifier" "" { target *-*-* } .-1 }
> +I(�) // { dg-bogus "invalid UTF-8 character <80>" }
> +I(�) // { dg-bogus "invalid UTF-8 character " }
> +I(�) // { dg-bogus "invalid UTF-8 character " }
> +I(�) // { dg-bogus "invalid UTF-8 character " }
> +I(�) // { dg-bogus "invalid UTF-8 character " }
> +I(�) // { dg-bogus "invalid UTF-8 character " }
> +I(�) // { dg-bogus "invalid UTF-8 character " }
> +I(�) // { dg-bogus "invalid UTF-8 character " }
> +I(���) // { dg-bogus "invalid UTF-8 character <80>" }
> +I(���) // { dg-bogus "invalid UTF-8 character <9f><80>" }
> +I(��) // { dg-bogus "invalid UTF-8 character " }
> +I(��) // { dg-bogus "invalid UTF-8 character <80>" }
> +I(���) // { dg-bogus "invalid UTF-8 character <80>" }
> +I(����) // { dg-bogus "invalid UTF-8 character <80><80><80>" }
> +I(����) // { dg-bogus "invalid UTF-8 character <8f>" }
> +I(����) // { dg-error "is not valid in an identifier" }
> +I(������) // { dg-error "is not valid in an identifier" }
>
>
> Jakub
>