public inbox for gcc-bugs@sourceware.org
From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/100977] [C++23] Implement C++ Identifier Syntax using Unicode Standard Annex 31
Date: Wed, 01 Sep 2021 20:37:29 +0000
Message-ID: <bug-100977-4-W7IkLLHap2@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-100977-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100977

--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:c4d6dcacfca1b804504515496e6d9de176d7f51e

commit r12-3302-gc4d6dcacfca1b804504515496e6d9de176d7f51e
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Sep 1 22:33:06 2021 +0200

    libcpp: Implement C++23 P1949R7 - C++ Identifier Syntax using Unicode Standard Annex 31

    The following patch implements the
    P1949R7 - C++ Identifier Syntax using Unicode Standard Annex 31
    paper.  We already allow UTF-8 characters in the source, so that part
    is already implemented.  IMHO all we need to do is pedwarn instead of
    just warn for the (default) -Wnormalize=nfc (or for -Wnormalize={id,nfkc})
    if the character is not in NFC, and use the Unicode XID_Start and
    XID_Continue derived core properties to find out what characters are
    allowed (the standard actually adds U+005F to XID_Start, but we already
    handle the ASCII compatible characters differently and they aren't
    allowed in UCNs in identifiers).  Instead of hardcoding the large tables
    in ucnid.tab, this patch makes makeucnid.c read them from the Unicode
    tables (13.0.0 version at this point).
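
To make the two checks above concrete, here is a minimal Python sketch, not the libcpp implementation. It parses a few lines written in the DerivedCoreProperties.txt format (a small hand-picked excerpt, not the full 13.0.0 data), looks up XID_Start and XID_Continue for a code point, and tests whether a string is in NFC. The names `SAMPLE`, `parse_ranges`, `has_property` and `is_nfc` are hypothetical helpers introduced only for this illustration.

```python
import unicodedata

# Small illustrative excerpt in the DerivedCoreProperties.txt line format.
# The real file lists many more ranges for each property.
SAMPLE = """\
0041..005A    ; XID_Start # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
00C0..00D6    ; XID_Start # L&  [23] LATIN CAPITAL LETTER A WITH GRAVE..LATIN CAPITAL LETTER O WITH DIAERESIS
0030..0039    ; XID_Continue # Nd  [10] DIGIT ZERO..DIGIT NINE
0300..036F    ; XID_Continue # Mn [112] COMBINING GRAVE ACCENT..COMBINING LATIN SMALL LETTER X
"""


def parse_ranges(text):
    """Map each property name to a list of inclusive (lo, hi) code point ranges."""
    props = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        span, prop = (field.strip() for field in line.split(";"))
        lo, _, hi = span.partition("..")      # a single code point has no ".."
        props.setdefault(prop, []).append((int(lo, 16), int(hi or lo, 16)))
    return props


def has_property(props, prop, cp):
    """True if code point cp falls in any range recorded for prop."""
    return any(lo <= cp <= hi for lo, hi in props.get(prop, []))


def is_nfc(s):
    """True if s is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", s) == s


PROPS = parse_ranges(SAMPLE)
```

With this excerpt, U+00C5 is found as XID_Start, U+030A (COMBINING RING ABOVE) is found only as XID_Continue, and the sequence "A" followed by U+030A is reported as not being in NFC because it composes to U+00C5.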

    In non-pedantic mode, we accept as the 2nd and later characters of an
    identifier the union of characters valid in all supported modes, but
    for the 1st character the code pedantically required that it not be one
    of the characters that may not appear first in the currently chosen
    standard.
    This patch changes that, so that what is allowed at the start of an
    identifier is likewise the union of characters valid at the start of an
    identifier in any of the pedantic modes.
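
The change to the start-character rule can be sketched as follows. This is a simplified Python model under stated assumptions, not the code in charset.c: the `Mode` flags, the `CharInfo` record and the three placeholder characters "X", "Y" and "Z" are all hypothetical, and the "old" rule is one reading of the ChangeLog entry below.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Mode(Flag):
    """Hypothetical language modes; the real code tracks more of them."""
    C99 = auto()
    C11 = auto()
    CXX23 = auto()


@dataclass(frozen=True)
class CharInfo:
    valid: Mode  # modes in which the character may appear in an identifier
    start: Mode  # modes in which it may also be the first character


# Hypothetical table entries, chosen only to exercise each case.
TABLE = {
    "X": CharInfo(valid=Mode.C99 | Mode.C11 | Mode.CXX23,
                  start=Mode.C99 | Mode.C11 | Mode.CXX23),
    "Y": CharInfo(valid=Mode.C11 | Mode.CXX23, start=Mode(0)),  # never a start char
    "Z": CharInfo(valid=Mode.C99 | Mode.C11, start=Mode.C99),   # start char in C99 only
}


def start_ok_old(info, current):
    """Old non-pedantic rule, as read from the ChangeLog: apply the current
    mode's start restriction, but wave through characters that are only
    valid in other modes."""
    if info.valid & current:
        return bool(info.start & current)
    return bool(info.valid)


def start_ok_new(info):
    """New non-pedantic rule: valid at the start if any mode allows it there."""
    return bool(info.start)
```

In this model, "Z" under C11 is rejected by the old rule but accepted by the new one, while "Y" under C99 is accepted by the old rule but rejected by the new one, because no mode allows it at the start.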

    2021-09-01  Jakub Jelinek  <jakub@redhat.com>

            PR c++/100977
    libcpp/
            * include/cpplib.h (struct cpp_options): Add cxx23_identifiers.
            * charset.c (CXX23, NXX23): New enumerators.
            (CID, NFC, NKC, CTX): Renumber.
            (ucn_valid_in_identifier): Implement P1949R7 - use CXX23 and
            NXX23 flags for cxx23_identifiers.  For start character in
            non-pedantic mode, allow characters that are allowed as start
            characters in any of the supported language modes, rather than
            disallowing characters allowed only as non-start characters in
            current mode but for characters from other language modes allowing
            them even if they are never allowed at start.
            * init.c (struct lang_flags): Add cxx23_identifiers.
            (lang_defaults): Add cxx23_identifiers column.
            (cpp_set_lang): Initialize CPP_OPTION (pfile, cxx23_identifiers).
            * lex.c (warn_about_normalization): If cxx23_identifiers, use
            cpp_pedwarning_with_line instead of cpp_warning_with_line for
            "is not in NFC" diagnostics.
            * makeucnid.c: Adjust usage comment.
            (CXX23, NXX23): New enumerators.
            (all_languages): Add CXX23.
            (not_NFC, not_NFKC, maybe_not_NFC): Renumber.
            (read_derivedcore): New function.
            (write_table): Print also CXX23 and NXX23 columns.
            (main): Require 5 arguments instead of 4, call read_derivedcore.
            * ucnid.h: Regenerated using Unicode 13.0.0 files.
    gcc/testsuite/
            * g++.dg/cpp23/normalize1.C: New test.
            * g++.dg/cpp23/normalize2.C: New test.
            * g++.dg/cpp23/normalize3.C: New test.
            * g++.dg/cpp23/normalize4.C: New test.
            * g++.dg/cpp23/normalize5.C: New test.
            * g++.dg/cpp23/normalize6.C: New test.
            * g++.dg/cpp23/normalize7.C: New test.
            * g++.dg/cpp23/ucnid-1-utf8.C: New test.
            * g++.dg/cpp23/ucnid-2-utf8.C: New test.
            * gcc.dg/cpp/ucnid-4.c: Don't expect
            "not valid at the start of an identifier" errors.
            * gcc.dg/cpp/ucnid-4-utf8.c: Likewise.
            * gcc.dg/cpp/ucnid-5-utf8.c: New test.


