From: Patrick Palka
Date: Mon, 15 Jan 2024 19:09:56 -0500 (EST)
To: Jonathan Wakely
cc: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH v3] libstdc++: Implement C++26 std::text_encoding (P1885R12) [PR113318]
In-Reply-To: <20240115204803.1550804-1-jwakely@redhat.com>
References: <20240113124834.1296437-1-jwakely@redhat.com> <20240115204803.1550804-1-jwakely@redhat.com>

On Mon, 15 Jan 2024, Jonathan Wakely wrote:

> I think I'm happy with this now. It has tests for all the new functions,
> and the performance of the charset alias match algorithm is improved by
> reusing part of .
>
> Tested x86_64-linux.
>
> -- >8 --
>
> This is another C++26 change, approved in Varna 2022. We require a new

2023?

> static array of data that is extracted from the IANA Character Sets
> database. A new Python script to generate a header from the IANA CSV
> file is added.
>
> libstdc++-v3/ChangeLog:
>
> 	PR libstdc++/113318
> 	* acinclude.m4 (GLIBCXX_CONFIGURE): Add c++26 directory.
> 	(GLIBCXX_CHECK_TEXT_ENCODING): Define.
> 	* config.h.in: Regenerate.
> 	* configure: Regenerate.
> 	* configure.ac: Use GLIBCXX_CHECK_TEXT_ENCODING.
> 	* include/Makefile.am: Add new headers.
> 	* include/Makefile.in: Regenerate.
> 	* include/bits/locale_classes.h (locale::encoding): Declare new
> 	member function.
> 	* include/bits/unicode.h (__charset_alias_match): New function.
> 	* include/bits/text_encoding-data.h: New file.
> 	* include/bits/version.def (text_encoding): Define.
> 	* include/bits/version.h: Regenerate.
> 	* include/std/text_encoding: New file.
> 	* src/Makefile.am: Add new subdirectory.
> 	* src/Makefile.in: Regenerate.
> 	* src/c++26/Makefile.am: New file.
> 	* src/c++26/Makefile.in: New file.
> 	* src/c++26/text_encoding.cc: New file.
> 	* src/experimental/Makefile.am: Include c++26 convenience
> 	library.
> 	* src/experimental/Makefile.in: Regenerate.
> 	* python/libstdcxx/v6/printers.py (StdTextEncodingPrinter): New
> 	printer.
> 	* scripts/gen_text_encoding_data.py: New file.
> 	* testsuite/22_locale/locale/encoding.cc: New test.
> 	* testsuite/ext/unicode/charset_alias_match.cc: New test.
> 	* testsuite/std/text_encoding/cons.cc: New test.
> 	* testsuite/std/text_encoding/members.cc: New test.
> 	* testsuite/std/text_encoding/requirements.cc: New test.
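For anyone reading along without the header at hand, the kind of comparison
the __charset_alias_match entry refers to can be sketched roughly as below.
This is not the code from the patch. It is a minimal illustration of the
charset alias matching rule from Unicode Technical Standard #22 as I
understand it: drop everything except ASCII letters and digits, ignore case,
and ignore zeros that begin a run of digits. The names normalize_alias and
alias_match are made up for this sketch, and it assumes an ASCII-compatible
execution character set.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, NOT the libstdc++ implementation.
// Reduces a charset name to the characters that matter for comparison.
std::string normalize_alias(std::string_view name)
{
  std::string out;
  bool in_number = false; // true once a digit of the current number was kept
  for (char c : name)
    {
      if (c >= '0' && c <= '9')
	{
	  if (c == '0' && !in_number)
	    continue; // zero at the start of a number: ignored
	  in_number = true;
	  out += c;
	}
      else if (c >= 'a' && c <= 'z')
	{
	  in_number = false;
	  out += c;
	}
      else if (c >= 'A' && c <= 'Z')
	{
	  in_number = false;
	  out += char(c - 'A' + 'a'); // fold to lower case (ASCII assumed)
	}
      // Anything else (punctuation, spaces) is dropped and does not
      // end the current number, so "1-0" still normalizes to "10".
    }
  return out;
}

// Hypothetical wrapper: two names match if their normalized forms are equal.
bool alias_match(std::string_view a, std::string_view b)
{ return normalize_alias(a) == normalize_alias(b); }
```

With this sketch, alias_match("UTF-8", "utf8") and
alias_match("IBM037", "ibm37") are true, while
alias_match("latin1", "latin10") is false. Building two temporary strings is
the simple way to write it; a real implementation would more likely compare
the two inputs in a single pass without allocating.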
> ---
>  libstdc++-v3/acinclude.m4                    |  30 +-
>  libstdc++-v3/config.h.in                     |   3 +
>  libstdc++-v3/configure                       |  70 +-
>  libstdc++-v3/configure.ac                    |   3 +
>  libstdc++-v3/include/Makefile.am             |   2 +
>  libstdc++-v3/include/Makefile.in             |   2 +
>  libstdc++-v3/include/bits/locale_classes.h   |  14 +
>  .../include/bits/text_encoding-data.h        | 902 ++++++++++++++++++
>  libstdc++-v3/include/bits/unicode.h          |  53 +-
>  libstdc++-v3/include/bits/version.def        |  10 +
>  libstdc++-v3/include/bits/version.h          |  13 +-
>  libstdc++-v3/include/std/text_encoding       | 704 ++++++++++++++
>  libstdc++-v3/python/libstdcxx/v6/printers.py |  17 +
>  .../scripts/gen_text_encoding_data.py        |  70 ++
>  libstdc++-v3/src/Makefile.am                 |   3 +-
>  libstdc++-v3/src/Makefile.in                 |   7 +-
>  libstdc++-v3/src/c++26/Makefile.am           | 109 +++
>  libstdc++-v3/src/c++26/Makefile.in           | 747 +++++++++++++++
>  libstdc++-v3/src/c++26/text_encoding.cc      |  91 ++
>  libstdc++-v3/src/experimental/Makefile.am    |   2 +
>  libstdc++-v3/src/experimental/Makefile.in    |   2 +
>  .../testsuite/22_locale/locale/encoding.cc   |  36 +
>  .../ext/unicode/charset_alias_match.cc       |  18 +
>  .../testsuite/std/text_encoding/cons.cc      | 113 +++
>  .../testsuite/std/text_encoding/members.cc   |  41 +
>  .../std/text_encoding/requirements.cc        |  31 +
>  26 files changed, 3083 insertions(+), 10 deletions(-)
>  create mode 100644 libstdc++-v3/include/bits/text_encoding-data.h
>  create mode 100644 libstdc++-v3/include/std/text_encoding
>  create mode 100755 libstdc++-v3/scripts/gen_text_encoding_data.py
>  create mode 100644 libstdc++-v3/src/c++26/Makefile.am
>  create mode 100644 libstdc++-v3/src/c++26/Makefile.in
>  create mode 100644 libstdc++-v3/src/c++26/text_encoding.cc
>  create mode 100644 libstdc++-v3/testsuite/22_locale/locale/encoding.cc
>  create mode 100644 libstdc++-v3/testsuite/ext/unicode/charset_alias_match.cc
>  create mode 100644 libstdc++-v3/testsuite/std/text_encoding/cons.cc
>  create mode 100644 libstdc++-v3/testsuite/std/text_encoding/members.cc
>  create mode 100644 libstdc++-v3/testsuite/std/text_encoding/requirements.cc
>
> diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> index e7cbf0fcf96..f9ba7ef744b 100644
> --- a/libstdc++-v3/acinclude.m4
> +++ b/libstdc++-v3/acinclude.m4
> @@ -49,7 +49,7 @@ AC_DEFUN([GLIBCXX_CONFIGURE], [
>    # Keep these sync'd with the list in Makefile.am. The first provides an
>    # expandable list at autoconf time; the second provides an expandable list
>    # (i.e., shell variable) at configure time.
> -  m4_define([glibcxx_SUBDIRS],[include libsupc++ src src/c++98 src/c++11 src/c++17 src/c++20 src/c++23 src/filesystem src/libbacktrace src/experimental doc po testsuite python])
> +  m4_define([glibcxx_SUBDIRS],[include libsupc++ src src/c++98 src/c++11 src/c++17 src/c++20 src/c++23 src/c++26 src/filesystem src/libbacktrace src/experimental doc po testsuite python])
>    SUBDIRS='glibcxx_SUBDIRS'
>
>    # These need to be absolute paths, yet at the same time need to
> @@ -5821,6 +5821,34 @@ AC_LANG_SAVE
>    AC_LANG_RESTORE
>  ])
>
> +dnl
> +dnl Check whether the dependencies for std::text_encoding are available.
> +dnl
> +dnl Defines:
> +dnl   _GLIBCXX_USE_NL_LANGINFO_L if nl_langinfo_l is in <langinfo.h>.
> +dnl
> +AC_DEFUN([GLIBCXX_CHECK_TEXT_ENCODING], [
> +AC_LANG_SAVE
> +  AC_LANG_CPLUSPLUS
> +
> +  AC_MSG_CHECKING([whether nl_langinfo_l is defined in <langinfo.h>])
> +  AC_TRY_COMPILE([
> +  #include <locale.h>
> +  #include <langinfo.h>
> +  ],[
> +    locale_t loc = newlocale(LC_ALL_MASK, "", (locale_t)0);
> +    const char* enc = nl_langinfo_l(CODESET, loc);
> +    freelocale(loc);
> +  ], [ac_nl_langinfo_l=yes], [ac_nl_langinfo_l=no])
> +  AC_MSG_RESULT($ac_nl_langinfo_l)
> +  if test "$ac_nl_langinfo_l" = yes; then
> +    AC_DEFINE_UNQUOTED(_GLIBCXX_USE_NL_LANGINFO_L, 1,
> +      [Define if nl_langinfo_l should be used for std::text_encoding.])
> +  fi
> +
> +  AC_LANG_RESTORE
> +])
> +
>  # Macros from the top-level gcc directory.
>  m4_include([../config/gc++filt.m4])
>  m4_include([../config/tls.m4])
> diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
> index c8b36333019..c68cac4f345 100644
> --- a/libstdc++-v3/configure.ac
> +++ b/libstdc++-v3/configure.ac
> @@ -557,6 +557,9 @@ GLIBCXX_CHECK_INIT_PRIORITY
>  # For __basic_file::native_handle()
>  GLIBCXX_CHECK_FILEBUF_NATIVE_HANDLES
>
> +# For std::text_encoding
> +GLIBCXX_CHECK_TEXT_ENCODING
> +
>  # Define documentation rules conditionally.
>
>  # See if makeinfo has been installed and is modern enough
> diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
> index c6d6a24eb9e..64152351ed0 100644
> --- a/libstdc++-v3/include/Makefile.am
> +++ b/libstdc++-v3/include/Makefile.am
> @@ -104,6 +104,7 @@ std_headers = \
>  	${std_srcdir}/streambuf \
>  	${std_srcdir}/string \
>  	${std_srcdir}/system_error \
> +	${std_srcdir}/text_encoding \
>  	${std_srcdir}/thread \
>  	${std_srcdir}/unordered_map \
>  	${std_srcdir}/unordered_set \
> @@ -159,6 +160,7 @@ bits_freestanding = \
>  	${bits_srcdir}/stl_raw_storage_iter.h \
>  	${bits_srcdir}/stl_relops.h \
>  	${bits_srcdir}/stl_uninitialized.h \
> +	${bits_srcdir}/text_encoding-data.h \
>  	${bits_srcdir}/version.h \
>  	${bits_srcdir}/string_view.tcc \
>  	${bits_srcdir}/unicode.h \
> diff --git a/libstdc++-v3/include/bits/locale_classes.h b/libstdc++-v3/include/bits/locale_classes.h
> index 621f2a29f50..a2e94217006 100644
> --- a/libstdc++-v3/include/bits/locale_classes.h
> +++ b/libstdc++-v3/include/bits/locale_classes.h
> @@ -40,6 +40,10 @@
>  #include <string>
>  #include <ext/atomicity.h>
>
> +#ifdef __glibcxx_text_encoding
> +#include <text_encoding>
> +#endif
> +
>  namespace std _GLIBCXX_VISIBILITY(default)
>  {
>  _GLIBCXX_BEGIN_NAMESPACE_VERSION
> @@ -248,6 +252,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>      string
>      name() const;
>
> +#ifdef __glibcxx_text_encoding
> +# if __CHAR_BIT__ == 8
> +    text_encoding
> +    encoding() const;
> +# else
> +    text_encoding
> +    encoding() const = delete;
> +# endif
> +#endif
> +
>      /**
>
* @brief Locale equality. > * > diff --git a/libstdc++-v3/include/bits/text_encoding-data.h b/libstdc++-v3/include/bits/text_encoding-data.h > new file mode 100644 > index 00000000000..7ac2e9dc3d9 > --- /dev/null > +++ b/libstdc++-v3/include/bits/text_encoding-data.h > @@ -0,0 +1,902 @@ > +// Generated by gen_text_encoding_data.py, do not edit. > + > +#ifndef _GLIBCXX_GET_ENCODING_DATA > +# error "This is not a public header, do not include it directly" > +#endif > + > + { 3, "US-ASCII" }, > + { 3, "iso-ir-6" }, > + { 3, "ANSI_X3.4-1968" }, > + { 3, "ANSI_X3.4-1986" }, > + { 3, "ISO_646.irv:1991" }, > + { 3, "ISO646-US" }, > + { 3, "us" }, > + { 3, "IBM367" }, > + { 3, "cp367" }, > + { 3, "csASCII" }, > + { 4, "ISO_8859-1:1987" }, > + { 4, "iso-ir-100" }, > + { 4, "ISO_8859-1" }, > + { 4, "ISO-8859-1" }, > + { 4, "latin1" }, > + { 4, "l1" }, > + { 4, "IBM819" }, > + { 4, "CP819" }, > + { 4, "csISOLatin1" }, > + { 5, "ISO_8859-2:1987" }, > + { 5, "iso-ir-101" }, > + { 5, "ISO_8859-2" }, > + { 5, "ISO-8859-2" }, > + { 5, "latin2" }, > + { 5, "l2" }, > + { 5, "csISOLatin2" }, > + { 6, "ISO_8859-3:1988" }, > + { 6, "iso-ir-109" }, > + { 6, "ISO_8859-3" }, > + { 6, "ISO-8859-3" }, > + { 6, "latin3" }, > + { 6, "l3" }, > + { 6, "csISOLatin3" }, > + { 7, "ISO_8859-4:1988" }, > + { 7, "iso-ir-110" }, > + { 7, "ISO_8859-4" }, > + { 7, "ISO-8859-4" }, > + { 7, "latin4" }, > + { 7, "l4" }, > + { 7, "csISOLatin4" }, > + { 8, "ISO_8859-5:1988" }, > + { 8, "iso-ir-144" }, > + { 8, "ISO_8859-5" }, > + { 8, "ISO-8859-5" }, > + { 8, "cyrillic" }, > + { 8, "csISOLatinCyrillic" }, > + { 9, "ISO_8859-6:1987" }, > + { 9, "iso-ir-127" }, > + { 9, "ISO_8859-6" }, > + { 9, "ISO-8859-6" }, > + { 9, "ECMA-114" }, > + { 9, "ASMO-708" }, > + { 9, "arabic" }, > + { 9, "csISOLatinArabic" }, > + { 10, "ISO_8859-7:1987" }, > + { 10, "iso-ir-126" }, > + { 10, "ISO_8859-7" }, > + { 10, "ISO-8859-7" }, > + { 10, "ELOT_928" }, > + { 10, "ECMA-118" }, > + { 10, "greek" }, > + { 10, "greek8" }, > + { 
10, "csISOLatinGreek" }, > + { 11, "ISO_8859-8:1988" }, > + { 11, "iso-ir-138" }, > + { 11, "ISO_8859-8" }, > + { 11, "ISO-8859-8" }, > + { 11, "hebrew" }, > + { 11, "csISOLatinHebrew" }, > + { 12, "ISO_8859-9:1989" }, > + { 12, "iso-ir-148" }, > + { 12, "ISO_8859-9" }, > + { 12, "ISO-8859-9" }, > + { 12, "latin5" }, > + { 12, "l5" }, > + { 12, "csISOLatin5" }, > + { 13, "ISO-8859-10" }, > + { 13, "iso-ir-157" }, > + { 13, "l6" }, > + { 13, "ISO_8859-10:1992" }, > + { 13, "csISOLatin6" }, > + { 13, "latin6" }, > + { 14, "ISO_6937-2-add" }, > + { 14, "iso-ir-142" }, > + { 14, "csISOTextComm" }, > + { 15, "JIS_X0201" }, > + { 15, "X0201" }, > + { 15, "csHalfWidthKatakana" }, > + { 16, "JIS_Encoding" }, > + { 16, "csJISEncoding" }, > + { 17, "Shift_JIS" }, > + { 17, "MS_Kanji" }, > + { 17, "csShiftJIS" }, > + { 18, "Extended_UNIX_Code_Packed_Format_for_Japanese" }, > + { 18, "csEUCPkdFmtJapanese" }, > + { 18, "EUC-JP" }, > + { 19, "Extended_UNIX_Code_Fixed_Width_for_Japanese" }, > + { 19, "csEUCFixWidJapanese" }, > + { 20, "BS_4730" }, > + { 20, "iso-ir-4" }, > + { 20, "ISO646-GB" }, > + { 20, "gb" }, > + { 20, "uk" }, > + { 20, "csISO4UnitedKingdom" }, > + { 21, "SEN_850200_C" }, > + { 21, "iso-ir-11" }, > + { 21, "ISO646-SE2" }, > + { 21, "se2" }, > + { 21, "csISO11SwedishForNames" }, > + { 22, "IT" }, > + { 22, "iso-ir-15" }, > + { 22, "ISO646-IT" }, > + { 22, "csISO15Italian" }, > + { 23, "ES" }, > + { 23, "iso-ir-17" }, > + { 23, "ISO646-ES" }, > + { 23, "csISO17Spanish" }, > + { 24, "DIN_66003" }, > + { 24, "iso-ir-21" }, > + { 24, "de" }, > + { 24, "ISO646-DE" }, > + { 24, "csISO21German" }, > + { 25, "NS_4551-1" }, > + { 25, "iso-ir-60" }, > + { 25, "ISO646-NO" }, > + { 25, "no" }, > + { 25, "csISO60DanishNorwegian" }, > + { 25, "csISO60Norwegian1" }, > + { 26, "NF_Z_62-010" }, > + { 26, "iso-ir-69" }, > + { 26, "ISO646-FR" }, > + { 26, "fr" }, > + { 26, "csISO69French" }, > + { 27, "ISO-10646-UTF-1" }, > + { 27, "csISO10646UTF1" }, > + { 28, 
"ISO_646.basic:1983" }, > + { 28, "ref" }, > + { 28, "csISO646basic1983" }, > + { 29, "INVARIANT" }, > + { 29, "csINVARIANT" }, > + { 30, "ISO_646.irv:1983" }, > + { 30, "iso-ir-2" }, > + { 30, "irv" }, > + { 30, "csISO2IntlRefVersion" }, > + { 31, "NATS-SEFI" }, > + { 31, "iso-ir-8-1" }, > + { 31, "csNATSSEFI" }, > + { 32, "NATS-SEFI-ADD" }, > + { 32, "iso-ir-8-2" }, > + { 32, "csNATSSEFIADD" }, > + { 35, "SEN_850200_B" }, > + { 35, "iso-ir-10" }, > + { 35, "FI" }, > + { 35, "ISO646-FI" }, > + { 35, "ISO646-SE" }, > + { 35, "se" }, > + { 35, "csISO10Swedish" }, > + { 36, "KS_C_5601-1987" }, > + { 36, "iso-ir-149" }, > + { 36, "KS_C_5601-1989" }, > + { 36, "KSC_5601" }, > + { 36, "korean" }, > + { 36, "csKSC56011987" }, > + { 37, "ISO-2022-KR" }, > + { 37, "csISO2022KR" }, > + { 38, "EUC-KR" }, > + { 38, "csEUCKR" }, > + { 39, "ISO-2022-JP" }, > + { 39, "csISO2022JP" }, > + { 40, "ISO-2022-JP-2" }, > + { 40, "csISO2022JP2" }, > + { 41, "JIS_C6220-1969-jp" }, > + { 41, "JIS_C6220-1969" }, > + { 41, "iso-ir-13" }, > + { 41, "katakana" }, > + { 41, "x0201-7" }, > + { 41, "csISO13JISC6220jp" }, > + { 42, "JIS_C6220-1969-ro" }, > + { 42, "iso-ir-14" }, > + { 42, "jp" }, > + { 42, "ISO646-JP" }, > + { 42, "csISO14JISC6220ro" }, > + { 43, "PT" }, > + { 43, "iso-ir-16" }, > + { 43, "ISO646-PT" }, > + { 43, "csISO16Portuguese" }, > + { 44, "greek7-old" }, > + { 44, "iso-ir-18" }, > + { 44, "csISO18Greek7Old" }, > + { 45, "latin-greek" }, > + { 45, "iso-ir-19" }, > + { 45, "csISO19LatinGreek" }, > + { 46, "NF_Z_62-010_(1973)" }, > + { 46, "iso-ir-25" }, > + { 46, "ISO646-FR1" }, > + { 46, "csISO25French" }, > + { 47, "Latin-greek-1" }, > + { 47, "iso-ir-27" }, > + { 47, "csISO27LatinGreek1" }, > + { 48, "ISO_5427" }, > + { 48, "iso-ir-37" }, > + { 48, "csISO5427Cyrillic" }, > + { 49, "JIS_C6226-1978" }, > + { 49, "iso-ir-42" }, > + { 49, "csISO42JISC62261978" }, > + { 50, "BS_viewdata" }, > + { 50, "iso-ir-47" }, > + { 50, "csISO47BSViewdata" }, > + { 51, "INIS" }, > + { 51, 
"iso-ir-49" }, > + { 51, "csISO49INIS" }, > + { 52, "INIS-8" }, > + { 52, "iso-ir-50" }, > + { 52, "csISO50INIS8" }, > + { 53, "INIS-cyrillic" }, > + { 53, "iso-ir-51" }, > + { 53, "csISO51INISCyrillic" }, > + { 54, "ISO_5427:1981" }, > + { 54, "iso-ir-54" }, > + { 54, "ISO5427Cyrillic1981" }, > + { 54, "csISO54271981" }, > + { 55, "ISO_5428:1980" }, > + { 55, "iso-ir-55" }, > + { 55, "csISO5428Greek" }, > + { 56, "GB_1988-80" }, > + { 56, "iso-ir-57" }, > + { 56, "cn" }, > + { 56, "ISO646-CN" }, > + { 56, "csISO57GB1988" }, > + { 57, "GB_2312-80" }, > + { 57, "iso-ir-58" }, > + { 57, "chinese" }, > + { 57, "csISO58GB231280" }, > + { 58, "NS_4551-2" }, > + { 58, "ISO646-NO2" }, > + { 58, "iso-ir-61" }, > + { 58, "no2" }, > + { 58, "csISO61Norwegian2" }, > + { 59, "videotex-suppl" }, > + { 59, "iso-ir-70" }, > + { 59, "csISO70VideotexSupp1" }, > + { 60, "PT2" }, > + { 60, "iso-ir-84" }, > + { 60, "ISO646-PT2" }, > + { 60, "csISO84Portuguese2" }, > + { 61, "ES2" }, > + { 61, "iso-ir-85" }, > + { 61, "ISO646-ES2" }, > + { 61, "csISO85Spanish2" }, > + { 62, "MSZ_7795.3" }, > + { 62, "iso-ir-86" }, > + { 62, "ISO646-HU" }, > + { 62, "hu" }, > + { 62, "csISO86Hungarian" }, > + { 63, "JIS_C6226-1983" }, > + { 63, "iso-ir-87" }, > + { 63, "x0208" }, > + { 63, "JIS_X0208-1983" }, > + { 63, "csISO87JISX0208" }, > + { 64, "greek7" }, > + { 64, "iso-ir-88" }, > + { 64, "csISO88Greek7" }, > + { 65, "ASMO_449" }, > + { 65, "ISO_9036" }, > + { 65, "arabic7" }, > + { 65, "iso-ir-89" }, > + { 65, "csISO89ASMO449" }, > + { 66, "iso-ir-90" }, > + { 66, "csISO90" }, > + { 67, "JIS_C6229-1984-a" }, > + { 67, "iso-ir-91" }, > + { 67, "jp-ocr-a" }, > + { 67, "csISO91JISC62291984a" }, > + { 68, "JIS_C6229-1984-b" }, > + { 68, "iso-ir-92" }, > + { 68, "ISO646-JP-OCR-B" }, > + { 68, "jp-ocr-b" }, > + { 68, "csISO92JISC62991984b" }, > + { 69, "JIS_C6229-1984-b-add" }, > + { 69, "iso-ir-93" }, > + { 69, "jp-ocr-b-add" }, > + { 69, "csISO93JIS62291984badd" }, > + { 70, "JIS_C6229-1984-hand" }, 
> + { 70, "iso-ir-94" }, > + { 70, "jp-ocr-hand" }, > + { 70, "csISO94JIS62291984hand" }, > + { 71, "JIS_C6229-1984-hand-add" }, > + { 71, "iso-ir-95" }, > + { 71, "jp-ocr-hand-add" }, > + { 71, "csISO95JIS62291984handadd" }, > + { 72, "JIS_C6229-1984-kana" }, > + { 72, "iso-ir-96" }, > + { 72, "csISO96JISC62291984kana" }, > + { 73, "ISO_2033-1983" }, > + { 73, "iso-ir-98" }, > + { 73, "e13b" }, > + { 73, "csISO2033" }, > + { 74, "ANSI_X3.110-1983" }, > + { 74, "iso-ir-99" }, > + { 74, "CSA_T500-1983" }, > + { 74, "NAPLPS" }, > + { 74, "csISO99NAPLPS" }, > + { 75, "T.61-7bit" }, > + { 75, "iso-ir-102" }, > + { 75, "csISO102T617bit" }, > + { 76, "T.61-8bit" }, > + { 76, "T.61" }, > + { 76, "iso-ir-103" }, > + { 76, "csISO103T618bit" }, > + { 77, "ECMA-cyrillic" }, > + { 77, "iso-ir-111" }, > + { 77, "KOI8-E" }, > + { 77, "csISO111ECMACyrillic" }, > + { 78, "CSA_Z243.4-1985-1" }, > + { 78, "iso-ir-121" }, > + { 78, "ISO646-CA" }, > + { 78, "csa7-1" }, > + { 78, "csa71" }, > + { 78, "ca" }, > + { 78, "csISO121Canadian1" }, > + { 79, "CSA_Z243.4-1985-2" }, > + { 79, "iso-ir-122" }, > + { 79, "ISO646-CA2" }, > + { 79, "csa7-2" }, > + { 79, "csa72" }, > + { 79, "csISO122Canadian2" }, > + { 80, "CSA_Z243.4-1985-gr" }, > + { 80, "iso-ir-123" }, > + { 80, "csISO123CSAZ24341985gr" }, > + { 81, "ISO_8859-6-E" }, > + { 81, "csISO88596E" }, > + { 81, "ISO-8859-6-E" }, > + { 82, "ISO_8859-6-I" }, > + { 82, "csISO88596I" }, > + { 82, "ISO-8859-6-I" }, > + { 83, "T.101-G2" }, > + { 83, "iso-ir-128" }, > + { 83, "csISO128T101G2" }, > + { 84, "ISO_8859-8-E" }, > + { 84, "csISO88598E" }, > + { 84, "ISO-8859-8-E" }, > + { 85, "ISO_8859-8-I" }, > + { 85, "csISO88598I" }, > + { 85, "ISO-8859-8-I" }, > + { 86, "CSN_369103" }, > + { 86, "iso-ir-139" }, > + { 86, "csISO139CSN369103" }, > + { 87, "JUS_I.B1.002" }, > + { 87, "iso-ir-141" }, > + { 87, "ISO646-YU" }, > + { 87, "js" }, > + { 87, "yu" }, > + { 87, "csISO141JUSIB1002" }, > + { 88, "IEC_P27-1" }, > + { 88, "iso-ir-143" }, > + { 
88, "csISO143IECP271" }, > + { 89, "JUS_I.B1.003-serb" }, > + { 89, "iso-ir-146" }, > + { 89, "serbian" }, > + { 89, "csISO146Serbian" }, > + { 90, "JUS_I.B1.003-mac" }, > + { 90, "macedonian" }, > + { 90, "iso-ir-147" }, > + { 90, "csISO147Macedonian" }, > + { 91, "greek-ccitt" }, > + { 91, "iso-ir-150" }, > + { 91, "csISO150" }, > + { 91, "csISO150GreekCCITT" }, > + { 92, "NC_NC00-10:81" }, > + { 92, "cuba" }, > + { 92, "iso-ir-151" }, > + { 92, "ISO646-CU" }, > + { 92, "csISO151Cuba" }, > + { 93, "ISO_6937-2-25" }, > + { 93, "iso-ir-152" }, > + { 93, "csISO6937Add" }, > + { 94, "GOST_19768-74" }, > + { 94, "ST_SEV_358-88" }, > + { 94, "iso-ir-153" }, > + { 94, "csISO153GOST1976874" }, > + { 95, "ISO_8859-supp" }, > + { 95, "iso-ir-154" }, > + { 95, "latin1-2-5" }, > + { 95, "csISO8859Supp" }, > + { 96, "ISO_10367-box" }, > + { 96, "iso-ir-155" }, > + { 96, "csISO10367Box" }, > + { 97, "latin-lap" }, > + { 97, "lap" }, > + { 97, "iso-ir-158" }, > + { 97, "csISO158Lap" }, > + { 98, "JIS_X0212-1990" }, > + { 98, "x0212" }, > + { 98, "iso-ir-159" }, > + { 98, "csISO159JISX02121990" }, > + { 99, "DS_2089" }, > + { 99, "DS2089" }, > + { 99, "ISO646-DK" }, > + { 99, "dk" }, > + { 99, "csISO646Danish" }, > + { 100, "us-dk" }, > + { 100, "csUSDK" }, > + { 101, "dk-us" }, > + { 101, "csDKUS" }, > + { 102, "KSC5636" }, > + { 102, "ISO646-KR" }, > + { 102, "csKSC5636" }, > + { 103, "UNICODE-1-1-UTF-7" }, > + { 103, "csUnicode11UTF7" }, > + { 104, "ISO-2022-CN" }, > + { 104, "csISO2022CN" }, > + { 105, "ISO-2022-CN-EXT" }, > + { 105, "csISO2022CNEXT" }, > +#define _GLIBCXX_TEXT_ENCODING_UTF8_OFFSET 413 > + { 106, "UTF-8" }, > + { 106, "csUTF8" }, > + { 109, "ISO-8859-13" }, > + { 109, "csISO885913" }, > + { 110, "ISO-8859-14" }, > + { 110, "iso-ir-199" }, > + { 110, "ISO_8859-14:1998" }, > + { 110, "ISO_8859-14" }, > + { 110, "latin8" }, > + { 110, "iso-celtic" }, > + { 110, "l8" }, > + { 110, "csISO885914" }, > + { 111, "ISO-8859-15" }, > + { 111, "ISO_8859-15" }, > + { 
111, "Latin-9" }, > + { 111, "csISO885915" }, > + { 112, "ISO-8859-16" }, > + { 112, "iso-ir-226" }, > + { 112, "ISO_8859-16:2001" }, > + { 112, "ISO_8859-16" }, > + { 112, "latin10" }, > + { 112, "l10" }, > + { 112, "csISO885916" }, > + { 113, "GBK" }, > + { 113, "CP936" }, > + { 113, "MS936" }, > + { 113, "windows-936" }, > + { 113, "csGBK" }, > + { 114, "GB18030" }, > + { 114, "csGB18030" }, > + { 115, "OSD_EBCDIC_DF04_15" }, > + { 115, "csOSDEBCDICDF0415" }, > + { 116, "OSD_EBCDIC_DF03_IRV" }, > + { 116, "csOSDEBCDICDF03IRV" }, > + { 117, "OSD_EBCDIC_DF04_1" }, > + { 117, "csOSDEBCDICDF041" }, > + { 118, "ISO-11548-1" }, > + { 118, "ISO_11548-1" }, > + { 118, "ISO_TR_11548-1" }, > + { 118, "csISO115481" }, > + { 119, "KZ-1048" }, > + { 119, "STRK1048-2002" }, > + { 119, "RK1048" }, > + { 119, "csKZ1048" }, > + { 1000, "ISO-10646-UCS-2" }, > + { 1000, "csUnicode" }, > + { 1001, "ISO-10646-UCS-4" }, > + { 1001, "csUCS4" }, > + { 1002, "ISO-10646-UCS-Basic" }, > + { 1002, "csUnicodeASCII" }, > + { 1003, "ISO-10646-Unicode-Latin1" }, > + { 1003, "csUnicodeLatin1" }, > + { 1003, "ISO-10646" }, > + { 1004, "ISO-10646-J-1" }, > + { 1004, "csUnicodeJapanese" }, > + { 1005, "ISO-Unicode-IBM-1261" }, > + { 1005, "csUnicodeIBM1261" }, > + { 1006, "ISO-Unicode-IBM-1268" }, > + { 1006, "csUnicodeIBM1268" }, > + { 1007, "ISO-Unicode-IBM-1276" }, > + { 1007, "csUnicodeIBM1276" }, > + { 1008, "ISO-Unicode-IBM-1264" }, > + { 1008, "csUnicodeIBM1264" }, > + { 1009, "ISO-Unicode-IBM-1265" }, > + { 1009, "csUnicodeIBM1265" }, > + { 1010, "UNICODE-1-1" }, > + { 1010, "csUnicode11" }, > + { 1011, "SCSU" }, > + { 1011, "csSCSU" }, > + { 1012, "UTF-7" }, > + { 1012, "csUTF7" }, > + { 1013, "UTF-16BE" }, > + { 1013, "csUTF16BE" }, > + { 1014, "UTF-16LE" }, > + { 1014, "csUTF16LE" }, > + { 1015, "UTF-16" }, > + { 1015, "csUTF16" }, > + { 1016, "CESU-8" }, > + { 1016, "csCESU8" }, > + { 1016, "csCESU-8" }, > + { 1017, "UTF-32" }, > + { 1017, "csUTF32" }, > + { 1018, "UTF-32BE" }, > + { 
1018, "csUTF32BE" }, > + { 1019, "UTF-32LE" }, > + { 1019, "csUTF32LE" }, > + { 1020, "BOCU-1" }, > + { 1020, "csBOCU1" }, > + { 1020, "csBOCU-1" }, > + { 1021, "UTF-7-IMAP" }, > + { 1021, "csUTF7IMAP" }, > + { 2000, "ISO-8859-1-Windows-3.0-Latin-1" }, > + { 2000, "csWindows30Latin1" }, > + { 2001, "ISO-8859-1-Windows-3.1-Latin-1" }, > + { 2001, "csWindows31Latin1" }, > + { 2002, "ISO-8859-2-Windows-Latin-2" }, > + { 2002, "csWindows31Latin2" }, > + { 2003, "ISO-8859-9-Windows-Latin-5" }, > + { 2003, "csWindows31Latin5" }, > + { 2004, "hp-roman8" }, > + { 2004, "roman8" }, > + { 2004, "r8" }, > + { 2004, "csHPRoman8" }, > + { 2005, "Adobe-Standard-Encoding" }, > + { 2005, "csAdobeStandardEncoding" }, > + { 2006, "Ventura-US" }, > + { 2006, "csVenturaUS" }, > + { 2007, "Ventura-International" }, > + { 2007, "csVenturaInternational" }, > + { 2008, "DEC-MCS" }, > + { 2008, "dec" }, > + { 2008, "csDECMCS" }, > + { 2009, "IBM850" }, > + { 2009, "cp850" }, > + { 2009, "850" }, > + { 2009, "csPC850Multilingual" }, > + { 2010, "IBM852" }, > + { 2010, "cp852" }, > + { 2010, "852" }, > + { 2010, "csPCp852" }, > + { 2011, "IBM437" }, > + { 2011, "cp437" }, > + { 2011, "437" }, > + { 2011, "csPC8CodePage437" }, > + { 2012, "PC8-Danish-Norwegian" }, > + { 2012, "csPC8DanishNorwegian" }, > + { 2013, "IBM862" }, > + { 2013, "cp862" }, > + { 2013, "862" }, > + { 2013, "csPC862LatinHebrew" }, > + { 2014, "PC8-Turkish" }, > + { 2014, "csPC8Turkish" }, > + { 2015, "IBM-Symbols" }, > + { 2015, "csIBMSymbols" }, > + { 2016, "IBM-Thai" }, > + { 2016, "csIBMThai" }, > + { 2017, "HP-Legal" }, > + { 2017, "csHPLegal" }, > + { 2018, "HP-Pi-font" }, > + { 2018, "csHPPiFont" }, > + { 2019, "HP-Math8" }, > + { 2019, "csHPMath8" }, > + { 2020, "Adobe-Symbol-Encoding" }, > + { 2020, "csHPPSMath" }, > + { 2021, "HP-DeskTop" }, > + { 2021, "csHPDesktop" }, > + { 2022, "Ventura-Math" }, > + { 2022, "csVenturaMath" }, > + { 2023, "Microsoft-Publishing" }, > + { 2023, "csMicrosoftPublishing" }, > + { 
2024, "Windows-31J" }, > + { 2024, "csWindows31J" }, > + { 2025, "GB2312" }, > + { 2025, "csGB2312" }, > + { 2026, "Big5" }, > + { 2026, "csBig5" }, > + { 2027, "macintosh" }, > + { 2027, "mac" }, > + { 2027, "csMacintosh" }, > + { 2028, "IBM037" }, > + { 2028, "cp037" }, > + { 2028, "ebcdic-cp-us" }, > + { 2028, "ebcdic-cp-ca" }, > + { 2028, "ebcdic-cp-wt" }, > + { 2028, "ebcdic-cp-nl" }, > + { 2028, "csIBM037" }, > + { 2029, "IBM038" }, > + { 2029, "EBCDIC-INT" }, > + { 2029, "cp038" }, > + { 2029, "csIBM038" }, > + { 2030, "IBM273" }, > + { 2030, "CP273" }, > + { 2030, "csIBM273" }, > + { 2031, "IBM274" }, > + { 2031, "EBCDIC-BE" }, > + { 2031, "CP274" }, > + { 2031, "csIBM274" }, > + { 2032, "IBM275" }, > + { 2032, "EBCDIC-BR" }, > + { 2032, "cp275" }, > + { 2032, "csIBM275" }, > + { 2033, "IBM277" }, > + { 2033, "EBCDIC-CP-DK" }, > + { 2033, "EBCDIC-CP-NO" }, > + { 2033, "csIBM277" }, > + { 2034, "IBM278" }, > + { 2034, "CP278" }, > + { 2034, "ebcdic-cp-fi" }, > + { 2034, "ebcdic-cp-se" }, > + { 2034, "csIBM278" }, > + { 2035, "IBM280" }, > + { 2035, "CP280" }, > + { 2035, "ebcdic-cp-it" }, > + { 2035, "csIBM280" }, > + { 2036, "IBM281" }, > + { 2036, "EBCDIC-JP-E" }, > + { 2036, "cp281" }, > + { 2036, "csIBM281" }, > + { 2037, "IBM284" }, > + { 2037, "CP284" }, > + { 2037, "ebcdic-cp-es" }, > + { 2037, "csIBM284" }, > + { 2038, "IBM285" }, > + { 2038, "CP285" }, > + { 2038, "ebcdic-cp-gb" }, > + { 2038, "csIBM285" }, > + { 2039, "IBM290" }, > + { 2039, "cp290" }, > + { 2039, "EBCDIC-JP-kana" }, > + { 2039, "csIBM290" }, > + { 2040, "IBM297" }, > + { 2040, "cp297" }, > + { 2040, "ebcdic-cp-fr" }, > + { 2040, "csIBM297" }, > + { 2041, "IBM420" }, > + { 2041, "cp420" }, > + { 2041, "ebcdic-cp-ar1" }, > + { 2041, "csIBM420" }, > + { 2042, "IBM423" }, > + { 2042, "cp423" }, > + { 2042, "ebcdic-cp-gr" }, > + { 2042, "csIBM423" }, > + { 2043, "IBM424" }, > + { 2043, "cp424" }, > + { 2043, "ebcdic-cp-he" }, > + { 2043, "csIBM424" }, > + { 2044, "IBM500" }, > + { 
2044, "CP500" }, > + { 2044, "ebcdic-cp-be" }, > + { 2044, "ebcdic-cp-ch" }, > + { 2044, "csIBM500" }, > + { 2045, "IBM851" }, > + { 2045, "cp851" }, > + { 2045, "851" }, > + { 2045, "csIBM851" }, > + { 2046, "IBM855" }, > + { 2046, "cp855" }, > + { 2046, "855" }, > + { 2046, "csIBM855" }, > + { 2047, "IBM857" }, > + { 2047, "cp857" }, > + { 2047, "857" }, > + { 2047, "csIBM857" }, > + { 2048, "IBM860" }, > + { 2048, "cp860" }, > + { 2048, "860" }, > + { 2048, "csIBM860" }, > + { 2049, "IBM861" }, > + { 2049, "cp861" }, > + { 2049, "861" }, > + { 2049, "cp-is" }, > + { 2049, "csIBM861" }, > + { 2050, "IBM863" }, > + { 2050, "cp863" }, > + { 2050, "863" }, > + { 2050, "csIBM863" }, > + { 2051, "IBM864" }, > + { 2051, "cp864" }, > + { 2051, "csIBM864" }, > + { 2052, "IBM865" }, > + { 2052, "cp865" }, > + { 2052, "865" }, > + { 2052, "csIBM865" }, > + { 2053, "IBM868" }, > + { 2053, "CP868" }, > + { 2053, "cp-ar" }, > + { 2053, "csIBM868" }, > + { 2054, "IBM869" }, > + { 2054, "cp869" }, > + { 2054, "869" }, > + { 2054, "cp-gr" }, > + { 2054, "csIBM869" }, > + { 2055, "IBM870" }, > + { 2055, "CP870" }, > + { 2055, "ebcdic-cp-roece" }, > + { 2055, "ebcdic-cp-yu" }, > + { 2055, "csIBM870" }, > + { 2056, "IBM871" }, > + { 2056, "CP871" }, > + { 2056, "ebcdic-cp-is" }, > + { 2056, "csIBM871" }, > + { 2057, "IBM880" }, > + { 2057, "cp880" }, > + { 2057, "EBCDIC-Cyrillic" }, > + { 2057, "csIBM880" }, > + { 2058, "IBM891" }, > + { 2058, "cp891" }, > + { 2058, "csIBM891" }, > + { 2059, "IBM903" }, > + { 2059, "cp903" }, > + { 2059, "csIBM903" }, > + { 2060, "IBM904" }, > + { 2060, "cp904" }, > + { 2060, "904" }, > + { 2060, "csIBBM904" }, > + { 2061, "IBM905" }, > + { 2061, "CP905" }, > + { 2061, "ebcdic-cp-tr" }, > + { 2061, "csIBM905" }, > + { 2062, "IBM918" }, > + { 2062, "CP918" }, > + { 2062, "ebcdic-cp-ar2" }, > + { 2062, "csIBM918" }, > + { 2063, "IBM1026" }, > + { 2063, "CP1026" }, > + { 2063, "csIBM1026" }, > + { 2064, "EBCDIC-AT-DE" }, > + { 2064, "csIBMEBCDICATDE" 
}, > + { 2065, "EBCDIC-AT-DE-A" }, > + { 2065, "csEBCDICATDEA" }, > + { 2066, "EBCDIC-CA-FR" }, > + { 2066, "csEBCDICCAFR" }, > + { 2067, "EBCDIC-DK-NO" }, > + { 2067, "csEBCDICDKNO" }, > + { 2068, "EBCDIC-DK-NO-A" }, > + { 2068, "csEBCDICDKNOA" }, > + { 2069, "EBCDIC-FI-SE" }, > + { 2069, "csEBCDICFISE" }, > + { 2070, "EBCDIC-FI-SE-A" }, > + { 2070, "csEBCDICFISEA" }, > + { 2071, "EBCDIC-FR" }, > + { 2071, "csEBCDICFR" }, > + { 2072, "EBCDIC-IT" }, > + { 2072, "csEBCDICIT" }, > + { 2073, "EBCDIC-PT" }, > + { 2073, "csEBCDICPT" }, > + { 2074, "EBCDIC-ES" }, > + { 2074, "csEBCDICES" }, > + { 2075, "EBCDIC-ES-A" }, > + { 2075, "csEBCDICESA" }, > + { 2076, "EBCDIC-ES-S" }, > + { 2076, "csEBCDICESS" }, > + { 2077, "EBCDIC-UK" }, > + { 2077, "csEBCDICUK" }, > + { 2078, "EBCDIC-US" }, > + { 2078, "csEBCDICUS" }, > + { 2079, "UNKNOWN-8BIT" }, > + { 2079, "csUnknown8BiT" }, > + { 2080, "MNEMONIC" }, > + { 2080, "csMnemonic" }, > + { 2081, "MNEM" }, > + { 2081, "csMnem" }, > + { 2082, "VISCII" }, > + { 2082, "csVISCII" }, > + { 2083, "VIQR" }, > + { 2083, "csVIQR" }, > + { 2084, "KOI8-R" }, > + { 2084, "csKOI8R" }, > + { 2085, "HZ-GB-2312" }, > + { 2086, "IBM866" }, > + { 2086, "cp866" }, > + { 2086, "866" }, > + { 2086, "csIBM866" }, > + { 2087, "IBM775" }, > + { 2087, "cp775" }, > + { 2087, "csPC775Baltic" }, > + { 2088, "KOI8-U" }, > + { 2088, "csKOI8U" }, > + { 2089, "IBM00858" }, > + { 2089, "CCSID00858" }, > + { 2089, "CP00858" }, > + { 2089, "PC-Multilingual-850+euro" }, > + { 2089, "csIBM00858" }, > + { 2090, "IBM00924" }, > + { 2090, "CCSID00924" }, > + { 2090, "CP00924" }, > + { 2090, "ebcdic-Latin9--euro" }, > + { 2090, "csIBM00924" }, > + { 2091, "IBM01140" }, > + { 2091, "CCSID01140" }, > + { 2091, "CP01140" }, > + { 2091, "ebcdic-us-37+euro" }, > + { 2091, "csIBM01140" }, > + { 2092, "IBM01141" }, > + { 2092, "CCSID01141" }, > + { 2092, "CP01141" }, > + { 2092, "ebcdic-de-273+euro" }, > + { 2092, "csIBM01141" }, > + { 2093, "IBM01142" }, > + { 2093, 
"CCSID01142" }, > + { 2093, "CP01142" }, > + { 2093, "ebcdic-dk-277+euro" }, > + { 2093, "ebcdic-no-277+euro" }, > + { 2093, "csIBM01142" }, > + { 2094, "IBM01143" }, > + { 2094, "CCSID01143" }, > + { 2094, "CP01143" }, > + { 2094, "ebcdic-fi-278+euro" }, > + { 2094, "ebcdic-se-278+euro" }, > + { 2094, "csIBM01143" }, > + { 2095, "IBM01144" }, > + { 2095, "CCSID01144" }, > + { 2095, "CP01144" }, > + { 2095, "ebcdic-it-280+euro" }, > + { 2095, "csIBM01144" }, > + { 2096, "IBM01145" }, > + { 2096, "CCSID01145" }, > + { 2096, "CP01145" }, > + { 2096, "ebcdic-es-284+euro" }, > + { 2096, "csIBM01145" }, > + { 2097, "IBM01146" }, > + { 2097, "CCSID01146" }, > + { 2097, "CP01146" }, > + { 2097, "ebcdic-gb-285+euro" }, > + { 2097, "csIBM01146" }, > + { 2098, "IBM01147" }, > + { 2098, "CCSID01147" }, > + { 2098, "CP01147" }, > + { 2098, "ebcdic-fr-297+euro" }, > + { 2098, "csIBM01147" }, > + { 2099, "IBM01148" }, > + { 2099, "CCSID01148" }, > + { 2099, "CP01148" }, > + { 2099, "ebcdic-international-500+euro" }, > + { 2099, "csIBM01148" }, > + { 2100, "IBM01149" }, > + { 2100, "CCSID01149" }, > + { 2100, "CP01149" }, > + { 2100, "ebcdic-is-871+euro" }, > + { 2100, "csIBM01149" }, > + { 2101, "Big5-HKSCS" }, > + { 2101, "csBig5HKSCS" }, > + { 2102, "IBM1047" }, > + { 2102, "IBM-1047" }, > + { 2102, "csIBM1047" }, > + { 2103, "PTCP154" }, > + { 2103, "csPTCP154" }, > + { 2103, "PT154" }, > + { 2103, "CP154" }, > + { 2103, "Cyrillic-Asian" }, > + { 2104, "Amiga-1251" }, > + { 2104, "Ami1251" }, > + { 2104, "Amiga1251" }, > + { 2104, "Ami-1251" }, > + { 2104, "csAmiga1251" }, > + { 2104, "(Aliases" }, > + { 2104, "are" }, > + { 2104, "provided" }, > + { 2104, "for" }, > + { 2104, "historical" }, > + { 2104, "reasons" }, > + { 2104, "and" }, > + { 2104, "should" }, > + { 2104, "not" }, > + { 2104, "be" }, > + { 2104, "used)" }, > + { 2104, "[Malyshev]" }, > + { 2105, "KOI7-switched" }, > + { 2105, "csKOI7switched" }, > + { 2106, "BRF" }, > + { 2106, "csBRF" }, > + { 2107, "TSCII" 
},
> +  { 2107, "csTSCII" },
> +  { 2108, "CP51932" },
> +  { 2108, "csCP51932" },
> +  { 2109, "windows-874" },
> +  { 2109, "cswindows874" },
> +  { 2250, "windows-1250" },
> +  { 2250, "cswindows1250" },
> +  { 2251, "windows-1251" },
> +  { 2251, "cswindows1251" },
> +  { 2252, "windows-1252" },
> +  { 2252, "cswindows1252" },
> +  { 2253, "windows-1253" },
> +  { 2253, "cswindows1253" },
> +  { 2254, "windows-1254" },
> +  { 2254, "cswindows1254" },
> +  { 2255, "windows-1255" },
> +  { 2255, "cswindows1255" },
> +  { 2256, "windows-1256" },
> +  { 2256, "cswindows1256" },
> +  { 2257, "windows-1257" },
> +  { 2257, "cswindows1257" },
> +  { 2258, "windows-1258" },
> +  { 2258, "cswindows1258" },
> +  { 2259, "TIS-620" },
> +  { 2259, "csTIS620" },
> +  { 2259, "ISO-8859-11" },
> +  { 2260, "CP50220" },
> +  { 2260, "csCP50220" },
> +
> +#undef _GLIBCXX_GET_ENCODING_DATA
> diff --git a/libstdc++-v3/include/bits/unicode.h b/libstdc++-v3/include/bits/unicode.h
> index f1b2b359bdf..8bc55e9c136 100644
> --- a/libstdc++-v3/include/bits/unicode.h
> +++ b/libstdc++-v3/include/bits/unicode.h
> @@ -32,7 +32,8 @@
>
>  #if __cplusplus >= 202002L
>  #include
> -#include
> +#include // bit_width
> +#include // __detail::__from_chars_alnum_to_val_table
>  #include
>  #include
>  #include
> @@ -986,7 +987,7 @@ inline namespace __v15_1_0
>      return __n;
>    }
>
> -  template
> +  template
>    consteval bool
>    __literal_encoding_is_unicode()
>    {
> @@ -1056,6 +1057,54 @@ inline namespace __v15_1_0
>    __literal_encoding_is_utf8()
>    { return __literal_encoding_is_unicode(); }
>
> +  consteval bool
> +  __literal_encoding_is_extended_ascii()
> +  {
> +    return '0' == 0x30 && 'A' == 0x41 && 'Z' == 0x5a
> +	     && 'a' == 0x61 && 'z' == 0x7a;
> +  }

This function seems unused now.

> +
> +  // https://www.unicode.org/reports/tr22/tr22-8.html#Charset_Alias_Matching
> +  constexpr bool
> +  __charset_alias_match(string_view __a, string_view __b)
> +  {
> +    // Map alphanumeric chars to their base 64 value, everything else to 127.
> +    auto __map = [](char __c, bool& __num) -> unsigned char {
> +      using __detail::__from_chars_alnum_to_val_table;
> +      if (__c == '0') [[unlikely]]
> +	return __num ? 0 : 127;
> +      auto __v = __from_chars_alnum_to_val_table::value.__data[__c];

Maybe it'd be more concise to use the accessor function
__from_chars_alnum_to_val here (so e.g. the caller doesn't have to pass
_DecOnly=false explicitly)?

> +      __num = __v < 10;
> +      return __v;
> +    };
> +
> +    auto __ptr_a = __a.begin(), __end_a = __a.end();
> +    auto __ptr_b = __b.begin(), __end_b = __b.end();
> +    bool __num_a = false, __num_b = false;
> +
> +    while (true)
> +      {
> +	// Find the value of the next alphanumeric character in each string.
> +	unsigned char __val_a, __val_b;
> +	while (__ptr_a != __end_a
> +		 && (__val_a = __map(*__ptr_a, __num_a)) == 127)
> +	  ++__ptr_a;
> +	while (__ptr_b != __end_b
> +		 && (__val_b = __map(*__ptr_b, __num_b)) == 127)
> +	  ++__ptr_b;
> +	// Stop when we reach the end of a string, or get a mismatch.
> +	if (__ptr_a == __end_a)
> +	  return __ptr_b == __end_b;
> +	else if (__ptr_b == __end_b)
> +	  return false;
> +	else if (__val_a != __val_b)
> +	  return false; // Found non-matching characters.
> + ++__ptr_a; > + ++__ptr_b; > + } > + return true; > + } > + > } // namespace __unicode > > _GLIBCXX_END_NAMESPACE_VERSION > diff --git a/libstdc++-v3/include/bits/version.def b/libstdc++-v3/include/bits/version.def > index afbec6c3e6a..8fb8a2877ee 100644 > --- a/libstdc++-v3/include/bits/version.def > +++ b/libstdc++-v3/include/bits/version.def > @@ -1751,6 +1751,16 @@ ftms = { > }; > }; > > +ftms = { > + name = text_encoding; > + values = { > + v = 202306; > + cxxmin = 26; > + hosted = yes; > + extra_cond = "_GLIBCXX_USE_NL_LANGINFO_L"; > + }; > +}; > + > ftms = { > name = to_string; > values = { > diff --git a/libstdc++-v3/include/bits/version.h b/libstdc++-v3/include/bits/version.h > index 9688b246ef4..9ba99deeda6 100644 > --- a/libstdc++-v3/include/bits/version.h > +++ b/libstdc++-v3/include/bits/version.h > @@ -2137,6 +2137,17 @@ > #undef __glibcxx_want_saturation_arithmetic > > // from version.def line 1755 > +#if !defined(__cpp_lib_text_encoding) > +# if (__cplusplus > 202302L) && _GLIBCXX_HOSTED && (_GLIBCXX_USE_NL_LANGINFO_L) > +# define __glibcxx_text_encoding 202306L > +# if defined(__glibcxx_want_all) || defined(__glibcxx_want_text_encoding) > +# define __cpp_lib_text_encoding 202306L > +# endif > +# endif > +#endif /* !defined(__cpp_lib_text_encoding) && defined(__glibcxx_want_text_encoding) */ > +#undef __glibcxx_want_text_encoding > + > +// from version.def line 1765 > #if !defined(__cpp_lib_to_string) > # if (__cplusplus > 202302L) && _GLIBCXX_HOSTED && (__glibcxx_to_chars) > # define __glibcxx_to_string 202306L > @@ -2147,7 +2158,7 @@ > #endif /* !defined(__cpp_lib_to_string) && defined(__glibcxx_want_to_string) */ > #undef __glibcxx_want_to_string > > -// from version.def line 1765 > +// from version.def line 1775 > #if !defined(__cpp_lib_generator) > # if (__cplusplus >= 202100L) && (__glibcxx_coroutine) > # define __glibcxx_generator 202207L > diff --git a/libstdc++-v3/include/std/text_encoding b/libstdc++-v3/include/std/text_encoding > new 
file mode 100644 > index 00000000000..df8a09c5810 > --- /dev/null > +++ b/libstdc++-v3/include/std/text_encoding > @@ -0,0 +1,704 @@ > +// -*- C++ -*- > + > +// Copyright The GNU Toolchain Authors. > +// > +// This file is part of the GNU ISO C++ Library. This library is free > +// software; you can redistribute it and/or modify it under the > +// terms of the GNU General Public License as published by the > +// Free Software Foundation; either version 3, or (at your option) > +// any later version. > + > +// This library is distributed in the hope that it will be useful, > +// but WITHOUT ANY WARRANTY; without even the implied warranty of > +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +// GNU General Public License for more details. > + > +// Under Section 7 of GPL version 3, you are granted additional > +// permissions described in the GCC Runtime Library Exception, version > +// 3.1, as published by the Free Software Foundation. > + > +// You should have received a copy of the GNU General Public License and > +// a copy of the GCC Runtime Library Exception along with this program; > +// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > +// . > + > +/** @file include/text_encoding > + * This is a Standard C++ Library header. > + */ > + > +#ifndef _GLIBCXX_TEXT_ENCODING > +#define _GLIBCXX_TEXT_ENCODING > + > +#pragma GCC system_header > + > +#include > + > +#define __glibcxx_want_text_encoding > +#include > + > +#ifdef __cpp_lib_text_encoding > +#include > +#include > +#include // hash > +#include // view_interface > +#include // __charset_alias_match > +#include // __int_traits > + > +namespace std _GLIBCXX_VISIBILITY(default) > +{ > +_GLIBCXX_BEGIN_NAMESPACE_VERSION > + > + /** > + * @brief An interface for accessing the IANA Character Sets registry. 
> + * @ingroup locales > + * @since C++23 > + */ > + struct text_encoding > + { > + private: > + struct _Rep > + { > + using id = __INT_LEAST32_TYPE__; > + id _M_id; > + const char* _M_name; > + > + friend constexpr bool > + operator<(const _Rep& __r, id __m) noexcept > + { return __r._M_id < __m; } > + > + friend constexpr bool > + operator==(const _Rep& __r, string_view __name) noexcept > + { return __r._M_name == __name; } > + }; > + > + public: > + static constexpr size_t max_name_length = 63; > + > + enum class id : _Rep::id > + { > + other = 1, > + unknown = 2, > + ASCII = 3, > + ISOLatin1 = 4, > + ISOLatin2 = 5, > + ISOLatin3 = 6, > + ISOLatin4 = 7, > + ISOLatinCyrillic = 8, > + ISOLatinArabic = 9, > + ISOLatinGreek = 10, > + ISOLatinHebrew = 11, > + ISOLatin5 = 12, > + ISOLatin6 = 13, > + ISOTextComm = 14, > + HalfWidthKatakana = 15, > + JISEncoding = 16, > + ShiftJIS = 17, > + EUCPkdFmtJapanese = 18, > + EUCFixWidJapanese = 19, > + ISO4UnitedKingdom = 20, > + ISO11SwedishForNames = 21, > + ISO15Italian = 22, > + ISO17Spanish = 23, > + ISO21German = 24, > + ISO60DanishNorwegian = 25, > + ISO69French = 26, > + ISO10646UTF1 = 27, > + ISO646basic1983 = 28, > + INVARIANT = 29, > + ISO2IntlRefVersion = 30, > + NATSSEFI = 31, > + NATSSEFIADD = 32, > + ISO10Swedish = 35, > + KSC56011987 = 36, > + ISO2022KR = 37, > + EUCKR = 38, > + ISO2022JP = 39, > + ISO2022JP2 = 40, > + ISO13JISC6220jp = 41, > + ISO14JISC6220ro = 42, > + ISO16Portuguese = 43, > + ISO18Greek7Old = 44, > + ISO19LatinGreek = 45, > + ISO25French = 46, > + ISO27LatinGreek1 = 47, > + ISO5427Cyrillic = 48, > + ISO42JISC62261978 = 49, > + ISO47BSViewdata = 50, > + ISO49INIS = 51, > + ISO50INIS8 = 52, > + ISO51INISCyrillic = 53, > + ISO54271981 = 54, > + ISO5428Greek = 55, > + ISO57GB1988 = 56, > + ISO58GB231280 = 57, > + ISO61Norwegian2 = 58, > + ISO70VideotexSupp1 = 59, > + ISO84Portuguese2 = 60, > + ISO85Spanish2 = 61, > + ISO86Hungarian = 62, > + ISO87JISX0208 = 63, > + ISO88Greek7 = 64, > + 
ISO89ASMO449 = 65, > + ISO90 = 66, > + ISO91JISC62291984a = 67, > + ISO92JISC62991984b = 68, > + ISO93JIS62291984badd = 69, > + ISO94JIS62291984hand = 70, > + ISO95JIS62291984handadd = 71, > + ISO96JISC62291984kana = 72, > + ISO2033 = 73, > + ISO99NAPLPS = 74, > + ISO102T617bit = 75, > + ISO103T618bit = 76, > + ISO111ECMACyrillic = 77, > + ISO121Canadian1 = 78, > + ISO122Canadian2 = 79, > + ISO123CSAZ24341985gr = 80, > + ISO88596E = 81, > + ISO88596I = 82, > + ISO128T101G2 = 83, > + ISO88598E = 84, > + ISO88598I = 85, > + ISO139CSN369103 = 86, > + ISO141JUSIB1002 = 87, > + ISO143IECP271 = 88, > + ISO146Serbian = 89, > + ISO147Macedonian = 90, > + ISO150 = 91, > + ISO151Cuba = 92, > + ISO6937Add = 93, > + ISO153GOST1976874 = 94, > + ISO8859Supp = 95, > + ISO10367Box = 96, > + ISO158Lap = 97, > + ISO159JISX02121990 = 98, > + ISO646Danish = 99, > + USDK = 100, > + DKUS = 101, > + KSC5636 = 102, > + Unicode11UTF7 = 103, > + ISO2022CN = 104, > + ISO2022CNEXT = 105, > + UTF8 = 106, > + ISO885913 = 109, > + ISO885914 = 110, > + ISO885915 = 111, > + ISO885916 = 112, > + GBK = 113, > + GB18030 = 114, > + OSDEBCDICDF0415 = 115, > + OSDEBCDICDF03IRV = 116, > + OSDEBCDICDF041 = 117, > + ISO115481 = 118, > + KZ1048 = 119, > + UCS2 = 1000, > + UCS4 = 1001, > + UnicodeASCII = 1002, > + UnicodeLatin1 = 1003, > + UnicodeJapanese = 1004, > + UnicodeIBM1261 = 1005, > + UnicodeIBM1268 = 1006, > + UnicodeIBM1276 = 1007, > + UnicodeIBM1264 = 1008, > + UnicodeIBM1265 = 1009, > + Unicode11 = 1010, > + SCSU = 1011, > + UTF7 = 1012, > + UTF16BE = 1013, > + UTF16LE = 1014, > + UTF16 = 1015, > + CESU8 = 1016, > + UTF32 = 1017, > + UTF32BE = 1018, > + UTF32LE = 1019, > + BOCU1 = 1020, > + UTF7IMAP = 1021, > + Windows30Latin1 = 2000, > + Windows31Latin1 = 2001, > + Windows31Latin2 = 2002, > + Windows31Latin5 = 2003, > + HPRoman8 = 2004, > + AdobeStandardEncoding = 2005, > + VenturaUS = 2006, > + VenturaInternational = 2007, > + DECMCS = 2008, > + PC850Multilingual = 2009, > + PC8DanishNorwegian 
= 2012, > + PC862LatinHebrew = 2013, > + PC8Turkish = 2014, > + IBMSymbols = 2015, > + IBMThai = 2016, > + HPLegal = 2017, > + HPPiFont = 2018, > + HPMath8 = 2019, > + HPPSMath = 2020, > + HPDesktop = 2021, > + VenturaMath = 2022, > + MicrosoftPublishing = 2023, > + Windows31J = 2024, > + GB2312 = 2025, > + Big5 = 2026, > + Macintosh = 2027, > + IBM037 = 2028, > + IBM038 = 2029, > + IBM273 = 2030, > + IBM274 = 2031, > + IBM275 = 2032, > + IBM277 = 2033, > + IBM278 = 2034, > + IBM280 = 2035, > + IBM281 = 2036, > + IBM284 = 2037, > + IBM285 = 2038, > + IBM290 = 2039, > + IBM297 = 2040, > + IBM420 = 2041, > + IBM423 = 2042, > + IBM424 = 2043, > + PC8CodePage437 = 2011, > + IBM500 = 2044, > + IBM851 = 2045, > + PCp852 = 2010, > + IBM855 = 2046, > + IBM857 = 2047, > + IBM860 = 2048, > + IBM861 = 2049, > + IBM863 = 2050, > + IBM864 = 2051, > + IBM865 = 2052, > + IBM868 = 2053, > + IBM869 = 2054, > + IBM870 = 2055, > + IBM871 = 2056, > + IBM880 = 2057, > + IBM891 = 2058, > + IBM903 = 2059, > + IBM904 = 2060, > + IBM905 = 2061, > + IBM918 = 2062, > + IBM1026 = 2063, > + IBMEBCDICATDE = 2064, > + EBCDICATDEA = 2065, > + EBCDICCAFR = 2066, > + EBCDICDKNO = 2067, > + EBCDICDKNOA = 2068, > + EBCDICFISE = 2069, > + EBCDICFISEA = 2070, > + EBCDICFR = 2071, > + EBCDICIT = 2072, > + EBCDICPT = 2073, > + EBCDICES = 2074, > + EBCDICESA = 2075, > + EBCDICESS = 2076, > + EBCDICUK = 2077, > + EBCDICUS = 2078, > + Unknown8BiT = 2079, > + Mnemonic = 2080, > + Mnem = 2081, > + VISCII = 2082, > + VIQR = 2083, > + KOI8R = 2084, > + HZGB2312 = 2085, > + IBM866 = 2086, > + PC775Baltic = 2087, > + KOI8U = 2088, > + IBM00858 = 2089, > + IBM00924 = 2090, > + IBM01140 = 2091, > + IBM01141 = 2092, > + IBM01142 = 2093, > + IBM01143 = 2094, > + IBM01144 = 2095, > + IBM01145 = 2096, > + IBM01146 = 2097, > + IBM01147 = 2098, > + IBM01148 = 2099, > + IBM01149 = 2100, > + Big5HKSCS = 2101, > + IBM1047 = 2102, > + PTCP154 = 2103, > + Amiga1251 = 2104, > + KOI7switched = 2105, > + BRF = 2106, > + TSCII = 
2107,
> +      CP51932 = 2108,
> +      windows874 = 2109,
> +      windows1250 = 2250,
> +      windows1251 = 2251,
> +      windows1252 = 2252,
> +      windows1253 = 2253,
> +      windows1254 = 2254,
> +      windows1255 = 2255,
> +      windows1256 = 2256,
> +      windows1257 = 2257,
> +      windows1258 = 2258,
> +      TIS620 = 2259,
> +      CP50220 = 2260
> +    };
> +    using enum id;
> +
> +    constexpr text_encoding() = default;
> +
> +    constexpr explicit
> +    text_encoding(string_view __enc) noexcept
> +    : _M_rep(_S_find_name(__enc))
> +    {
> +      __enc.copy(_M_name, max_name_length);
> +    }
> +
> +    // @pre i has the value of one of the enumerators of id.
> +    constexpr
> +    text_encoding(id __i) noexcept
> +    : _M_rep(_S_find_id(__i))
> +    {
> +      if (string_view __name(_M_rep->_M_name); !__name.empty())
> +	__name.copy(_M_name, max_name_length);
> +    }
> +
> +    constexpr id mib() const noexcept { return id(_M_rep->_M_id); }
> +
> +    constexpr const char* name() const noexcept { return _M_name; }
> +
> +    struct aliases_view : ranges::view_interface
> +    {
> +    private:
> +      class _Iterator;
> +      struct _Sentinel { };
> +
> +    public:
> +      constexpr _Iterator begin() const noexcept { return _Iterator(_M_begin); }
> +      constexpr _Sentinel end() const noexcept { return _Sentinel{}; }
> +
> +    private:
> +      friend struct text_encoding;
> +
> +      constexpr explicit aliases_view(const _Rep* __r) : _M_begin(__r) { }
> +
> +      class _Iterator
> +      {
> +      public:
> +	using value_type = const char*;
> +	using reference = const char*;
> +	using difference_type = int;
> +	constexpr value_type operator*() const;

Are these defined out-of-line to avoid excessive indentation? Perhaps
we could define _Iterator (and aliases_view::begin) out-of-line instead?
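
To make that suggestion concrete, here is a rough sketch of the layout I have in mind. It is not taken from the patch: names_view, iterator, sentinel, table and count_names are all made-up stand-ins, and the real _Iterator would keep its full set of operators. The nested class is only declared inside the enclosing class and then defined at namespace scope, with begin() defined after it.

```cpp
#include <cstddef>

// Hypothetical stand-in for text_encoding::aliases_view. None of these
// names come from the patch; they only illustrate the layout.
struct names_view
{
  class iterator;        // declared here, defined after the enclosing class
  struct sentinel { };

  constexpr iterator begin() const noexcept;   // defined out-of-line below
  constexpr sentinel end() const noexcept { return {}; }

  const char* const* first = nullptr;  // null-terminated array of names
};

// Out-of-line definition of the nested class, at namespace scope and
// without the extra levels of indentation.
class names_view::iterator
{
public:
  constexpr explicit iterator(const char* const* p) noexcept : ptr(p) { }

  constexpr const char* operator*() const { return *ptr; }
  constexpr iterator& operator++() { ++ptr; return *this; }

  // Reaching a null entry (or having no array at all) means "at the end".
  constexpr bool operator==(sentinel) const noexcept
  { return ptr == nullptr || *ptr == nullptr; }

private:
  const char* const* ptr;
};

// begin() needs the complete iterator type, so it is defined after it.
constexpr names_view::iterator
names_view::begin() const noexcept
{ return iterator(first); }

// Hypothetical data: two names followed by a null terminator.
inline constexpr const char* table[] = { "UTF-8", "csUTF8", nullptr };

// Counts the names in a view by walking from begin() to the sentinel.
constexpr std::size_t
count_names(names_view v)
{
  std::size_t n = 0;
  for (auto it = v.begin(); !(it == v.end()); ++it)
    ++n;
  return n;
}

static_assert(count_names(names_view{table}) == 2, "two names in table");
```

With this layout only begin() has to move below the iterator, because it is the only member that needs the complete type. Everything else in the enclosing class can stay where it is.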
> +	constexpr _Iterator& operator++();
> +	constexpr _Iterator& operator--();
> +	constexpr _Iterator operator++(int);
> +	constexpr _Iterator operator--(int);
> +	constexpr value_type operator[](difference_type) const;
> +	constexpr _Iterator& operator+=(difference_type);
> +	constexpr _Iterator& operator-=(difference_type);
> +	constexpr difference_type operator-(const _Iterator&) const;
> +	constexpr bool operator==(const _Iterator&) const = default;
> +	constexpr bool operator==(_Sentinel) const noexcept;
> +	constexpr strong_ordering operator<=>(const _Iterator&) const;
> +
> +	friend _Iterator
> +	operator+(_Iterator __i, difference_type __n)

constexpr?

> +	{
> +	  __i += __n;
> +	  return __i;
> +	}
> +
> +	friend _Iterator
> +	operator+(difference_type __n, _Iterator __i)
> +	{
> +	  __i += __n;
> +	  return __i;
> +	}
> +
> +	friend _Iterator
> +	operator-(_Iterator __i, difference_type __n)
> +	{
> +	  __i -= __n;
> +	  return __i;
> +	}
> +
> +      private:
> +	friend class text_encoding;
> +
> +	constexpr explicit
> +	_Iterator(const _Rep* __r) noexcept
> +	: _M_rep(__r), _M_id(__r ? __r->_M_id : 0)
> +	{ }
> +
> +	constexpr bool _M_dereferenceable() const noexcept;
> +	static constexpr difference_type _S_neg(difference_type) noexcept;
> +
> +	const _Rep* _M_rep = nullptr;
> +	_Rep::id _M_id = 0;
> +      };
> +
> +      const _Rep* _M_begin = nullptr;
> +    };
> +
> +    constexpr aliases_view
> +    aliases() const noexcept
> +    {
> +      return _M_rep->_M_name[0] ?
aliases_view(_M_rep) : aliases_view{nullptr}; > + } > + > + friend constexpr bool > + operator==(const text_encoding& __a, > + const text_encoding& __b) noexcept > + { > + if (__a.mib() == id::other && __b.mib() == id::other) [[unlikely]] > + return _S_comp(__a._M_name, __b._M_name); > + else > + return __a.mib() == __b.mib(); > + } > + > + friend constexpr bool > + operator==(const text_encoding& __encoding, id __i) noexcept > + { return __encoding.mib() == __i; } > + > +#if __CHAR_BIT__ == 8 > + static consteval text_encoding > + literal() noexcept > + { > +#ifdef __GNUC_EXECUTION_CHARSET_NAME > + return text_encoding(__GNUC_EXECUTION_CHARSET_NAME); > +#elif defined __clang_literal_encoding__ > + return text_encoding(__clang_literal_encoding__); > +#else > + return text_encoding(); > +#endif > + } > + > + static text_encoding > + environment(); > + > + template > + static bool > + environment_is() > + { return text_encoding(_Id)._M_is_environment(); } > +#else > + static text_encoding literal() = delete; > + static text_encoding environment() = delete; > + template static bool environment_is() = delete; > +#endif > + > + private: > + const _Rep* _M_rep = _S_reps + 1; // id::unknown > + char _M_name[max_name_length + 1] = {0}; > + > + bool > + _M_is_environment() const; > + > + static inline constexpr _Rep _S_reps[] = { > + { 1, "" }, { 2, "" }, > +#define _GLIBCXX_GET_ENCODING_DATA > +#include > +#ifdef _GLIBCXX_GET_ENCODING_DATA > +# error "Invalid text_encoding data" > +#endif > + { 9999, nullptr }, // sentinel > + }; > + > + static constexpr bool > + _S_comp(string_view __a, string_view __b) > + { return __unicode::__charset_alias_match(__a, __b); } > + > + static constexpr const _Rep* > + _S_find_name(string_view __name) noexcept > + { > +#ifdef _GLIBCXX_TEXT_ENCODING_UTF8_OFFSET > + // Optimize the common UTF-8 case to avoid a linear search through all > + // strings in the table using the _S_comp function. 
> + if (__name == "UTF-8") > + return _S_reps + 2 + _GLIBCXX_TEXT_ENCODING_UTF8_OFFSET; > +#endif > + > + // The first two array elements (other and unknown) don't have names. > + // The last element is a sentinel that can never match anything. > + const auto __first = _S_reps + 2, __end = std::end(_S_reps) - 1; > + for (auto __r = __first; __r != __end; ++__r) > + if (_S_comp(__r->_M_name, __name)) > + { > + // Might have matched an alias. Find the first entry for this ID. > + const auto __id = __r->_M_id; > + while (__r[-1]._M_id == __id) > + --__r; > + return __r; > + } > + return _S_reps; // id::other > + } > + > + static constexpr const _Rep* > + _S_find_id(id __id) noexcept > + { > + const auto __i = (_Rep::id)__id; > + const auto __r = std::lower_bound(_S_reps, std::end(_S_reps) - 1, __i); > + if (__r->_M_id == __i) [[likely]] > + return __r; > + else > + { > + // Preconditions: i has the value of one of the enumerators of id. > + __glibcxx_assert(__r->_M_id == __i); > + return _S_reps + 1; // id::unknown > + } > + } > + }; > + > + template<> > + struct hash > + { > + size_t > + operator()(const text_encoding& __enc) const noexcept > + { return std::hash()(__enc.mib()); } > + }; > + > + constexpr auto > + text_encoding::aliases_view:: > + _Iterator::operator*() const > + -> value_type > + { > + if (_M_dereferenceable()) [[likely]] > + return _M_rep->_M_name; > + else > + { > + __glibcxx_assert(_M_dereferenceable()); > + return ""; > + } > + } > + > + constexpr auto > + text_encoding::aliases_view:: > + _Iterator::operator++() > + -> _Iterator& > + { > + if (_M_dereferenceable()) [[likely]] > + ++_M_rep; > + else > + { > + __glibcxx_assert(_M_dereferenceable()); > + *this = _Iterator{}; > + } > + return *this; > + } > + > + constexpr auto > + text_encoding::aliases_view:: > + _Iterator::operator--() > + -> _Iterator& > + { > + const bool __decrementable = _M_rep != nullptr && _M_rep[-1]._M_id == _M_id; > + if (__decrementable) [[likely]] > + --_M_rep; > + 
else > + { > + __glibcxx_assert(__decrementable); > + *this = _Iterator{}; > + } > + return *this; > + } > + > + constexpr auto > + text_encoding::aliases_view:: > + _Iterator::operator++(int) > + -> _Iterator > + { > + auto __it = *this; > + ++*this; > + return __it; > + } > + > + constexpr auto > + text_encoding::aliases_view:: > + _Iterator::operator--(int) > + -> _Iterator > + { > + auto __it = *this; > + --*this; > + return __it; > + } > + > + constexpr auto > + text_encoding::aliases_view:: > + _Iterator::operator[](difference_type __n) const > + -> value_type > + { return *(*this + __n); } > + > + constexpr auto > + text_encoding::aliases_view:: > + _Iterator::operator+=(difference_type __n) > + -> _Iterator& > + { > + if (_M_rep != nullptr) > + { > + if ((__n > 0 && __n < (std::end(_S_reps) - _M_rep)) > + || (__n < 0 && __n > (_S_reps - _M_rep))) > + { > + if (_M_rep[__n]._M_id == _M_id) > + _M_rep += __n; > + else > + *this = _Iterator{}; > + } > + else if (__n != 0) > + *this = _Iterator{}; > + } > + __glibcxx_assert(_M_rep != nullptr); > + return *this; > + } > + > + constexpr auto > + text_encoding::aliases_view:: > + _Iterator::operator-=(difference_type __n) > + -> _Iterator& > + { return operator+=(_S_neg(__n)); } > + > + constexpr auto > + text_encoding::aliases_view:: > + _Iterator::operator-(const _Iterator& __i) const noexcept > + -> difference_type > + { > + if (_M_id == __i._M_id) > + return _M_rep - __i._M_rep; > + __glibcxx_assert(_M_id == __i._M_id); > + return __gnu_cxx::__int_traits::__max; > + } > + > + constexpr bool > + text_encoding::aliases_view:: > + _Iterator::operator==(_Sentinel) const noexcept > + { return !_M_dereferenceable(); } > + > + constexpr strong_ordering > + text_encoding::aliases_view:: > + _Iterator::operator<=>(const _Iterator& __i) const > + { > + __glibcxx_assert(_M_id == __i._M_id); > + return _M_rep <=> __i._M_rep; > + } > + > + constexpr bool > + text_encoding::aliases_view:: > + _Iterator::_M_dereferenceable() 
const noexcept > + { return _M_rep != nullptr && _M_rep->_M_id == _M_id; } > + > + constexpr auto > + text_encoding::aliases_view:: > + _Iterator::_S_neg(difference_type __n) noexcept > + -> difference_type > + { > + using _Traits = __gnu_cxx::__int_traits; > + if (__n == _Traits::__min) [[unlikely]] > + return _Traits::__max; > + return -__n; > + } > + > +namespace ranges > +{ > + // Opt-in to borrowed_range concept > + template<> > + inline constexpr bool > + enable_borrowed_range = true; > +} > + > +_GLIBCXX_END_NAMESPACE_VERSION > +} // namespace std > + > +#endif // __cpp_lib_text_encoding > +#endif // _GLIBCXX_TEXT_ENCODING > diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py > index 032a7aa58a2..a6c2ed4599f 100644 > --- a/libstdc++-v3/python/libstdcxx/v6/printers.py > +++ b/libstdc++-v3/python/libstdcxx/v6/printers.py > @@ -2324,6 +2324,21 @@ class StdIntegralConstantPrinter(printer_base): > typename = strip_versioned_namespace(self._typename) > return "{}<{}, {}>".format(typename, value_type, value) > > +class StdTextEncodingPrinter(printer_base): > + """Print a std::text_encoding.""" > + > + def __init__(self, typename, val): > + self._val = val > + self._typename = typename > + > + def to_string(self): > + rep = self._val['_M_rep'].dereference() > + if rep['_M_id'] == 1: > + return self._val['_M_name'] > + if rep['_M_id'] == 2: > + return 'unknown' > + return rep['_M_name'] > + > # A "regular expression" printer which conforms to the > # "SubPrettyPrinter" protocol from gdb.printing. 
> class RxPrinter(object): > @@ -2807,6 +2822,8 @@ def build_libstdcxx_dictionary(): > > libstdcxx_printer.add_version('std::', 'integral_constant', > StdIntegralConstantPrinter) > + libstdcxx_printer.add_version('std::', 'text_encoding', > + StdTextEncodingPrinter) > > if hasattr(gdb.Value, 'dynamic_type'): > libstdcxx_printer.add_version('std::', 'error_code', > diff --git a/libstdc++-v3/scripts/gen_text_encoding_data.py b/libstdc++-v3/scripts/gen_text_encoding_data.py > new file mode 100755 > index 00000000000..2d6f3e4077a > --- /dev/null > +++ b/libstdc++-v3/scripts/gen_text_encoding_data.py > @@ -0,0 +1,70 @@ > +#!/usr/bin/env python3 > +# > +# Script to generate tables for libstdc++ std::text_encoding. > +# > +# This file is part of GCC. > +# > +# GCC is free software; you can redistribute it and/or modify it under > +# the terms of the GNU General Public License as published by the Free > +# Software Foundation; either version 3, or (at your option) any later > +# version. > +# > +# GCC is distributed in the hope that it will be useful, but WITHOUT ANY > +# WARRANTY; without even the implied warranty of MERCHANTABILITY or > +# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License > +# for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with GCC; see the file COPYING3. If not see > +# . 
> + > +# To update the Libstdc++ static data in download > +# the latest: > +# https://www.iana.org/assignments/character-sets/character-sets-1.csv > +# Then run this script and save the output to > +# include/bits/text_encoding-data.h > + > +import sys > +import csv > + > +if len(sys.argv) != 2: > + print("Usage: %s " % sys.argv[0], file=sys.stderr) > + sys.exit(1) > + > +print("// Generated by gen_text_encoding_data.py, do not edit.\n") > +print("#ifndef _GLIBCXX_GET_ENCODING_DATA") > +print('# error "This is not a public header, do not include it directly"') > +print("#endif\n") > + > + > +charsets = {} > +with open(sys.argv[1], newline='') as f: > + reader = csv.reader(f) > + next(reader) # skip header row > + for row in reader: > + mib = int(row[2]) > + if mib in charsets: > + raise ValueError("Multiple rows for mibEnum={}".format(mib)) > + name = row[1] > + aliases = row[5].split() > + # Ensure primary name comes first > + if name in aliases: > + aliases.remove(name) > + charsets[mib] = [name] + aliases > + > +# Remove "NATS-DANO" and "NATS-DANO-ADD" > +charsets.pop(33, None) > +charsets.pop(34, None) > + > +count = 0 > +for mib in sorted(charsets.keys()): > + names = charsets[mib] > + if names[0] == "UTF-8": > + print("#define _GLIBCXX_TEXT_ENCODING_UTF8_OFFSET {}".format(count)) > + for name in names: > + print(' {{ {:4}, "{}" }},'.format(mib, name)) > + count += len(names) > + > +# gives an error if this macro is left defined. > +# Do this last, so that the generated output is not usable unless we reach here. > +print("\n#undef _GLIBCXX_GET_ENCODING_DATA") > diff --git a/libstdc++-v3/src/Makefile.am b/libstdc++-v3/src/Makefile.am > index 7292ae70f81..37ba1491dea 100644 > --- a/libstdc++-v3/src/Makefile.am > +++ b/libstdc++-v3/src/Makefile.am > @@ -43,7 +43,7 @@ experimental_dir = > endif > > ## Keep this list sync'd with acinclude.m4:GLIBCXX_CONFIGURE. 
> -SUBDIRS = c++98 c++11 c++17 c++20 c++23 \ > +SUBDIRS = c++98 c++11 c++17 c++20 c++23 c++26 \ > $(filesystem_dir) $(backtrace_dir) $(experimental_dir) > > # Cross compiler support. > @@ -77,6 +77,7 @@ vpath % $(top_srcdir)/src/c++11 > vpath % $(top_srcdir)/src/c++17 > vpath % $(top_srcdir)/src/c++20 > vpath % $(top_srcdir)/src/c++23 > +vpath % $(top_srcdir)/src/c++26 > if ENABLE_FILESYSTEM_TS > vpath % $(top_srcdir)/src/filesystem > endif > diff --git a/libstdc++-v3/src/c++26/Makefile.am b/libstdc++-v3/src/c++26/Makefile.am > new file mode 100644 > index 00000000000..000ced1f501 > --- /dev/null > +++ b/libstdc++-v3/src/c++26/Makefile.am > @@ -0,0 +1,109 @@ > +## Makefile for the C++26 sources of the GNU C++ Standard library. > +## > +## Copyright (C) 1997-2023 Free Software Foundation, Inc. > +## > +## This file is part of the libstdc++ version 3 distribution. > +## Process this file with automake to produce Makefile.in. > + > +## This file is part of the GNU ISO C++ Library. This library is free > +## software; you can redistribute it and/or modify it under the > +## terms of the GNU General Public License as published by the > +## Free Software Foundation; either version 3, or (at your option) > +## any later version. > + > +## This library is distributed in the hope that it will be useful, > +## but WITHOUT ANY WARRANTY; without even the implied warranty of > +## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +## GNU General Public License for more details. > + > +## You should have received a copy of the GNU General Public License along > +## with this library; see the file COPYING3. If not see > +## . > + > +include $(top_srcdir)/fragment.am > + > +# Convenience library for C++26 runtime. 
> +noinst_LTLIBRARIES = libc++26convenience.la > + > +headers = > + > +if ENABLE_EXTERN_TEMPLATE > +# XTEMPLATE_FLAGS = -fno-implicit-templates > +inst_sources = > +else > +# XTEMPLATE_FLAGS = > +inst_sources = > +endif > + > +sources = text_encoding.cc > + > +vpath % $(top_srcdir)/src/c++26 > + > + > +if GLIBCXX_HOSTED > +libc__26convenience_la_SOURCES = $(sources) $(inst_sources) > +else > +libc__26convenience_la_SOURCES = > +endif > + > +# AM_CXXFLAGS needs to be in each subdirectory so that it can be > +# modified in a per-library or per-sub-library way. Need to manually > +# set this option because CONFIG_CXXFLAGS has to be after > +# OPTIMIZE_CXXFLAGS on the compile line so that -O2 can be overridden > +# as the occasion calls for it. > +AM_CXXFLAGS = \ > + -std=gnu++26 \ > + $(glibcxx_lt_pic_flag) $(glibcxx_compiler_shared_flag) \ > + $(XTEMPLATE_FLAGS) $(VTV_CXXFLAGS) \ > + $(WARN_CXXFLAGS) $(OPTIMIZE_CXXFLAGS) $(CONFIG_CXXFLAGS) \ > + -fimplicit-templates > + > +AM_MAKEFLAGS = \ > + "gxx_include_dir=$(gxx_include_dir)" > + > +# Libtool notes > + > +# 1) In general, libtool expects an argument such as `--tag=CXX' when > +# using the C++ compiler, because that will enable the settings > +# detected when C++ support was being configured. However, when no > +# such flag is given in the command line, libtool attempts to figure > +# it out by matching the compiler name in each configuration section > +# against a prefix of the command line. The problem is that, if the > +# compiler name and its initial flags stored in the libtool > +# configuration file don't match those in the command line, libtool > +# can't decide which configuration to use, and it gives up. The > +# correct solution is to add `--tag CXX' to LTCXXCOMPILE and maybe > +# CXXLINK, just after $(LIBTOOL), so that libtool doesn't have to > +# attempt to infer which configuration to use. 
> +# > +# The second tag argument, `--tag disable-shared` means that libtool > +# only compiles each source once, for static objects. In actuality, > +# glibcxx_lt_pic_flag and glibcxx_compiler_shared_flag are added to > +# the libtool command that is used to create the object, which is > +# suitable for shared libraries. The `--tag disable-shared` must be > +# placed after --tag CXX lest the CXX tag undo the effect of > +# disable-shared. > + > +# 2) Need to explicitly set LTCXXCOMPILE so that EXTRA_CXX_FLAGS is > +# last. (That way, things like -O2 passed down from the toplevel can > +# be overridden by --enable-debug.) > +LTCXXCOMPILE = \ > + $(LIBTOOL) --tag CXX --tag disable-shared \ > + $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \ > + --mode=compile $(CXX) $(TOPLEVEL_INCLUDES) \ > + $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CXXFLAGS) $(CXXFLAGS) $(EXTRA_CXX_FLAGS) > + > +LTLDFLAGS = $(shell $(SHELL) $(top_srcdir)/../libtool-ldflags $(LDFLAGS)) > + > +# 3) We'd have a problem when building the shared libstdc++ object if > +# the rules automake generates would be used. We cannot allow g++ to > +# be used since this would add -lstdc++ to the link line which of > +# course is problematic at this point. So, we get the top-level > +# directory to configure libstdc++-v3 to use gcc as the C++ > +# compilation driver. > +CXXLINK = \ > + $(LIBTOOL) --tag CXX --tag disable-shared \ > + $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \ > + --mode=link $(CXX) \ > + $(VTV_CXXLINKFLAGS) \ > + $(OPT_LDFLAGS) $(SECTION_LDFLAGS) $(AM_CXXFLAGS) $(LTLDFLAGS) -o $@ > diff --git a/libstdc++-v3/src/c++26/text_encoding.cc b/libstdc++-v3/src/c++26/text_encoding.cc > new file mode 100644 > index 00000000000..9a7df07db29 > --- /dev/null > +++ b/libstdc++-v3/src/c++26/text_encoding.cc > @@ -0,0 +1,91 @@ > +// Definitions for -*- C++ -*- > + > +// Copyright The GNU Toolchain Authors. > +// > +// This file is part of the GNU ISO C++ Library.
This library is free > +// software; you can redistribute it and/or modify it under the > +// terms of the GNU General Public License as published by the > +// Free Software Foundation; either version 3, or (at your option) > +// any later version. > + > +// This library is distributed in the hope that it will be useful, > +// but WITHOUT ANY WARRANTY; without even the implied warranty of > +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +// GNU General Public License for more details. > + > +// Under Section 7 of GPL version 3, you are granted additional > +// permissions described in the GCC Runtime Library Exception, version > +// 3.1, as published by the Free Software Foundation. > + > +// You should have received a copy of the GNU General Public License and > +// a copy of the GCC Runtime Library Exception along with this program; > +// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > +// . > + > +#include > +#include > + > +#ifdef _GLIBCXX_USE_NL_LANGINFO_L > +#include > +#include > + > +#if __CHAR_BIT__ == 8 > +namespace std > +{ > +_GLIBCXX_BEGIN_NAMESPACE_VERSION > + > +text_encoding > +__locale_encoding(const char* name) > +{ > + text_encoding enc; > + if (locale_t loc = ::newlocale(LC_ALL_MASK, name, (locale_t)0)) > + { > + if (const char* codeset = ::nl_langinfo_l(CODESET, loc)) > + { > + string_view s(codeset); > + if (s.size() < text_encoding::max_name_length) > + enc = text_encoding(s); > + } > + ::freelocale(loc); > + } > + return enc; > +} > + > +_GLIBCXX_END_NAMESPACE_VERSION > +} // namespace std > + > +std::text_encoding > +std::text_encoding::environment() > +{ > + return std::__locale_encoding(""); > +} > + > +bool > +std::text_encoding::_M_is_environment() const > +{ > + bool matched = false; > + if (locale_t loc = ::newlocale(LC_ALL_MASK, "", (locale_t)0)) > + { > + if (const char* codeset = ::nl_langinfo_l(CODESET, loc)) > + { > + string_view sv(codeset); > + for (auto alias : aliases()) > + if 
(__unicode::__charset_alias_match(alias, sv)) > + { > + matched = true; > + break; > + } > + } > + ::freelocale(loc); > + } > + return matched; > +} > + > +std::text_encoding > +std::locale::encoding() const > +{ > + return std::__locale_encoding(name().c_str()); > +} > +#endif // CHAR_BIT == 8 > + > +#endif // _GLIBCXX_USE_NL_LANGINFO_L > diff --git a/libstdc++-v3/src/experimental/Makefile.am b/libstdc++-v3/src/experimental/Makefile.am > index 8259f986d95..6241430988e 100644 > --- a/libstdc++-v3/src/experimental/Makefile.am > +++ b/libstdc++-v3/src/experimental/Makefile.am > @@ -47,10 +47,12 @@ libstdc__exp_la_SOURCES = $(sources) > > libstdc__exp_la_LIBADD = \ > $(top_builddir)/src/c++23/libc++23convenience.la \ > + $(top_builddir)/src/c++26/libc++26convenience.la \ > $(filesystem_lib) $(backtrace_lib) > > libstdc__exp_la_DEPENDENCIES = \ > $(top_builddir)/src/c++23/libc++23convenience.la \ > + $(top_builddir)/src/c++26/libc++26convenience.la \ > $(filesystem_lib) $(backtrace_lib) > > # AM_CXXFLAGS needs to be in each subdirectory so that it can be > diff --git a/libstdc++-v3/testsuite/22_locale/locale/encoding.cc b/libstdc++-v3/testsuite/22_locale/locale/encoding.cc > new file mode 100644 > index 00000000000..18825fb88b9 > --- /dev/null > +++ b/libstdc++-v3/testsuite/22_locale/locale/encoding.cc > @@ -0,0 +1,36 @@ > +// { dg-options "-lstdc++exp" } > +// { dg-do run { target c++26 } } > +// { dg-require-namedlocale "en_US.ISO8859-1" } > +// { dg-require-namedlocale "fr_FR.ISO8859-15" } > + > +#include > +#include > + > +void > +test_encoding() > +{ > + const std::locale c = std::locale::classic(); > + std::text_encoding c_enc = c.encoding(); > + VERIFY( c_enc == std::text_encoding::ASCII ); > + > + const std::locale fr = std::locale(ISO_8859(15, fr_FR)); > + std::text_encoding fr_enc = fr.encoding(); > + VERIFY( fr_enc == std::text_encoding::ISO885915 ); > + > + const std::locale en = std::locale(ISO_8859(1, en_US)); > + std::text_encoding en_enc = 
en.encoding(); > + VERIFY( en_enc == std::text_encoding::ISOLatin1 ); > + > +#if __cpp_exceptions > + try { > + const std::locale c_utf8 = std::locale("C.UTF-8"); > + VERIFY( c_utf8.encoding() == std::text_encoding::UTF8 ); > + } catch (...) { > + } > +#endif > +} > + > +int main() > +{ > + test_encoding(); > +} > diff --git a/libstdc++-v3/testsuite/ext/unicode/charset_alias_match.cc b/libstdc++-v3/testsuite/ext/unicode/charset_alias_match.cc > new file mode 100644 > index 00000000000..f6272ae998b > --- /dev/null > +++ b/libstdc++-v3/testsuite/ext/unicode/charset_alias_match.cc > @@ -0,0 +1,18 @@ > +// { dg-do compile { target c++20 } } > +#include > + > +using std::__unicode::__charset_alias_match; > +static_assert( __charset_alias_match("UTF-8", "utf8") == true ); > +static_assert( __charset_alias_match("UTF-8", "u.t.f-008") == true ); > +static_assert( __charset_alias_match("UTF-8", "utf-80") == false ); > +static_assert( __charset_alias_match("UTF-8", "ut8") == false ); > + > +static_assert( __charset_alias_match("iso8859_1", "ISO-8859-1") == true ); > + > +static_assert( __charset_alias_match("", "") == true ); > +static_assert( __charset_alias_match("", ".") == true ); > +static_assert( __charset_alias_match("--", "...") == true ); > +static_assert( __charset_alias_match("--a", "a...") == true ); > +static_assert( __charset_alias_match("--a010", "a..10.") == true ); > +static_assert( __charset_alias_match("--a010", "a..1.0") == false ); > +static_assert( __charset_alias_match("aaaa", "000.00.0a0a)0aa...") == true ); > diff --git a/libstdc++-v3/testsuite/std/text_encoding/cons.cc b/libstdc++-v3/testsuite/std/text_encoding/cons.cc > new file mode 100644 > index 00000000000..b9d93641de4 > --- /dev/null > +++ b/libstdc++-v3/testsuite/std/text_encoding/cons.cc > @@ -0,0 +1,113 @@ > +// { dg-do run { target c++26 } } > + > +#include > +#include > +#include > + > +using namespace std::string_view_literals; > + > +constexpr void > +test_default_construct() > +{ > + 
std::text_encoding e0; > + VERIFY( e0.mib() == std::text_encoding::unknown ); > + VERIFY( e0.name()[0] == '\0' ); // P2862R1 name() should never return null > + VERIFY( e0.aliases().empty() ); > +} > + > +constexpr void > +test_construct_by_name() > +{ > + std::string_view s; > + std::text_encoding e0(s); > + VERIFY( e0.mib() == std::text_encoding::other ); > + VERIFY( e0.name() == s ); > + VERIFY( e0.aliases().empty() ); > + > + s = "not a real encoding"; > + std::text_encoding e1(s); > + VERIFY( e1.mib() == std::text_encoding::other ); > + VERIFY( e1.name() == s ); > + VERIFY( e1.aliases().empty() ); > + > + VERIFY( e1 != e0 ); > + VERIFY( e1 == e0.mib() ); > + > + s = "utf8"; > + std::text_encoding e2(s); > + VERIFY( e2.mib() == std::text_encoding::UTF8 ); > + VERIFY( e2.name() == s ); > + VERIFY( ! e2.aliases().empty() ); > + VERIFY( e2.aliases().front() == "UTF-8"sv ); > + > + s = "Latin-1"; // matches "latin1" > + std::text_encoding e3(s); > + VERIFY( e3.mib() == std::text_encoding::ISOLatin1 ); > + VERIFY( e3.name() == s ); > + VERIFY( ! e3.aliases().empty() ); > + VERIFY( e3.aliases().front() == "ISO_8859-1:1987"sv ); // primary name > + > + s = "U.S."; // matches "us" > + std::text_encoding e4(s); > + VERIFY( e4.mib() == std::text_encoding::ASCII ); > + VERIFY( e4.name() == s ); > + VERIFY( ! e4.aliases().empty() ); > + VERIFY( e4.aliases().front() == "US-ASCII"sv ); // primary name > +} > + > +constexpr void > +test_construct_by_id() > +{ > + std::text_encoding e0(std::text_encoding::other); > + VERIFY( e0.mib() == std::text_encoding::other ); > + VERIFY( e0.name() == ""sv ); > + VERIFY( e0.aliases().empty() ); > + > + std::text_encoding e1(std::text_encoding::unknown); > + VERIFY( e1.mib() == std::text_encoding::unknown ); > + VERIFY( e1.name() == ""sv ); > + VERIFY( e1.aliases().empty() ); > + > + std::text_encoding e2(std::text_encoding::UTF8); > + VERIFY( e2.mib() == std::text_encoding::UTF8 ); > + VERIFY( e2.name() == "UTF-8"sv ); > + VERIFY( ! 
e2.aliases().empty() ); > + VERIFY( e2.aliases().front() == std::string_view(e2.name()) ); > + bool found = false; > + for (auto alias : e2.aliases()) > + if (alias == "csUTF8"sv) > + { > + found = true; > + break; > + } > + VERIFY( found ); > +} > + > +constexpr void > +test_copy_construct() > +{ > + std::text_encoding e0; > + std::text_encoding e1 = e0; > + VERIFY( e1 == e0 ); > + > + std::text_encoding e2(std::text_encoding::UTF8); > + auto e3 = e2; > + VERIFY( e3 == e2 ); > + > + e1 = e3; > + VERIFY( e1 == e2 ); > +} > + > +int main() > +{ > + auto run_tests = [] { > + test_default_construct(); > + test_construct_by_name(); > + test_construct_by_id(); > + test_copy_construct(); > + return true; > + }; > + > + run_tests(); > + static_assert( run_tests() ); > +} > diff --git a/libstdc++-v3/testsuite/std/text_encoding/members.cc b/libstdc++-v3/testsuite/std/text_encoding/members.cc > new file mode 100644 > index 00000000000..0b0d6bd0c96 > --- /dev/null > +++ b/libstdc++-v3/testsuite/std/text_encoding/members.cc > @@ -0,0 +1,41 @@ > +// { dg-options "-lstdc++exp" } > +// { dg-do run { target c++26 } } > +// { dg-require-namedlocale "en_US.ISO8859-1" } > +// { dg-require-namedlocale "fr_FR.ISO8859-15" } > + > +#include > +#include > +#include > +#include > + > +using namespace std::string_view_literals; > + > +void > +test_literal() > +{ > + const std::text_encoding lit = std::text_encoding::literal(); > + VERIFY( lit.name() == std::string_view(__GNUC_EXECUTION_CHARSET_NAME) ); > +} > + > +void > +test_env() > +{ > + const std::text_encoding env = std::text_encoding::environment(); > + > + if (env.mib() == std::text_encoding::UTF8) > + VERIFY( std::text_encoding::environment_is() ); > + > + ::setlocale(LC_ALL, ISO_8859(1, en_US)); > + const std::text_encoding env1 = std::text_encoding::environment(); > + VERIFY( env1 == env ); > + > + ::setlocale(LC_ALL, ISO_8859(15, fr_FR)); > + const std::text_encoding env2 = std::text_encoding::environment(); > + VERIFY( env2 == 
env ); > +} > + > +int main() > +{ > + test_literal(); > + test_env(); > +} > diff --git a/libstdc++-v3/testsuite/std/text_encoding/requirements.cc b/libstdc++-v3/testsuite/std/text_encoding/requirements.cc > new file mode 100644 > index 00000000000..d62d93dcda4 > --- /dev/null > +++ b/libstdc++-v3/testsuite/std/text_encoding/requirements.cc > @@ -0,0 +1,31 @@ > +// { dg-do compile { target c++26 } } > +// { dg-add-options no_pch } > + > +#include > +#ifndef __cpp_lib_text_encoding > +# error "Feature-test macro for text_encoding missing in " > +#elif __cpp_lib_text_encoding != 202306L > +# error "Feature-test macro for text_encoding has wrong value in " > +#endif > + > +#undef __cpp_lib_text_encoding > +#include > +#ifndef __cpp_lib_text_encoding > +# error "Feature-test macro for text_encoding missing in " > +#elif __cpp_lib_text_encoding != 202306L > +# error "Feature-test macro for text_encoding has wrong value in " > +#endif > + > +#include > +#include > +static_assert( std::is_trivially_copyable_v ); > + > +using aliases_view = std::text_encoding::aliases_view; > +static_assert( std::copyable ); > +static_assert( std::ranges::view ); > +static_assert( std::ranges::random_access_range ); > +static_assert( std::ranges::borrowed_range ); > +static_assert( std::same_as, > + const char*> ); > +static_assert( std::same_as, > + const char*> ); > -- > 2.43.0 > >
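
For anyone reading the static_asserts in testsuite/ext/unicode/charset_alias_match.cc above, here is a rough Python sketch of the loose matching rule as I read it from those test cases. This is not the libstdc++ implementation, and the names `normalize` and `charset_alias_match` are made up for illustration. The assumption is: non-alphanumeric characters are ignored, letters compare case-insensitively, and a '0' is kept only when it directly continues a number (the previous raw character was a nonzero digit or an already-kept zero).

```python
def normalize(name: str) -> str:
    """Reduce a charset name to the characters that take part in the comparison.

    Hypothetical helper, inferred from the test cases in the patch.
    """
    kept = []
    in_number = False  # True while the previous raw character continued a number
    for ch in name:
        if ch == "0":
            # A zero only counts when it continues a number, so "008" -> "8"
            # but "80" stays "80". A zero never changes in_number itself.
            if in_number:
                kept.append(ch)
            continue
        if ch.isascii() and ch.isalnum():
            kept.append(ch.lower())
            in_number = ch.isdigit()
        else:
            # Punctuation is dropped, and it also ends any number in progress,
            # which is why "a..1.0" reduces to "a1" rather than "a10".
            in_number = False
    return "".join(kept)


def charset_alias_match(a: str, b: str) -> bool:
    """Return True if the two names are equal after loose normalization."""
    return normalize(a) == normalize(b)
```

Under those assumptions, `charset_alias_match("UTF-8", "u.t.f-008")` holds while `charset_alias_match("UTF-8", "utf-80")` does not, which agrees with the expectations in the test file. Building two normalized strings keeps the sketch short; a real implementation would more likely walk both inputs in lockstep and avoid the allocation.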