From: Tom Honermann
Date: Sun, 27 Feb 2022 11:53:12 -0500
Subject: [PATCH 0/3]: C++20 P0482R6 and C2X N2653: support for char8_t, mbrtoc8(), and c8rtomb().
To: libc-alpha

This series of patches provides the following:
- A fix for bug 25744 [1].
- Implementations of the mbrtoc8 and c8rtomb functions adopted for
  C++20 via WG21 P0482R6 [2] and for C2X via WG14 N2653 [3].
- A char8_t typedef as adopted for C2X via WG14 N2653 [3].

These patches address feedback provided in response to a previous
submission [4].

Patch 1: A fix and test for bug 25744 [1].
Patch 2: Definitions of the mbrtoc8 and c8rtomb functions and the
         char8_t typedef.
Patch 3: Tests for the mbrtoc8 and c8rtomb functions and the char8_t
         typedef.

The fix for bug 25744 [1] is included in this patch series because the
tests for mbrtoc8 and c8rtomb depend on it for exercising the special
case where a pair of Unicode code points is converted to/from a single
double byte sequence. Such conversion cases exist for Big5-HKSCS.

N2653 was adopted by WG14 for C2X during their recent meeting. This
patch series enables the new declarations in C2X mode and when
_GNU_SOURCE is defined.

Thank you to Joseph Myers and Carlos O'Donell for their prior reviews of
this patch series.

Tom.

[1]: Bug 25744
     "mbrtowc with Big5-HKSCS returns 2 instead of 1 when consuming the
     second byte of certain double byte characters"
     https://sourceware.org/bugzilla/show_bug.cgi?id=25744

[2]: WG21 P0482R6
     "char8_t: A type for UTF-8 characters and strings (Revision 6)"
     https://wg21.link/p0482r6

[3]: WG14 N2653
     "char8_t: A type for UTF-8 characters and strings (Revision 1)"
     http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

[4]: "[PATCH 2/3]: C++20 P0482R6 and C2X N2653: Implement mbrtoc8,
     c8rtomb, char8_t"
     https://sourceware.org/pipermail/libc-alpha/2022-February/136558.html
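
For readers unfamiliar with the calling convention: N2653 specifies
mbrtoc8 with the same restartable interface and return value
conventions as C11's mbrtoc16. Since mbrtoc8 is only declared by a libc
that includes this series, the sketch below uses mbrtoc16 as a stand-in.
count_c16_units is a hypothetical helper written for illustration and is
not part of the patches. Assuming mbrtoc8 is available, the same loop
should apply with char8_t and mbrtoc8 substituted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper (not part of the patch series): count the UTF-16
   code units produced when converting the multibyte string S in the
   current locale.  Returns (size_t) -1 on an encoding error or an
   incomplete trailing sequence.  */
size_t
count_c16_units (const char *s)
{
  assert (s != NULL);

  mbstate_t state;
  memset (&state, 0, sizeof state);   /* Initial conversion state.  */

  /* Include the terminating null so that any code unit still pending
     after the last character is delivered before conversion ends.  */
  size_t remaining = strlen (s) + 1;
  size_t units = 0;

  for (;;)
    {
      char16_t unit;
      size_t rc = mbrtoc16 (&unit, s, remaining, &state);

      if (rc == 0)
        break;                        /* Terminating null converted.  */
      if (rc == (size_t) -1 || rc == (size_t) -2)
        return (size_t) -1;           /* Invalid or incomplete input.  */
      if (rc == (size_t) -3)
        {
          /* A further code unit of the previous character was stored;
             no input bytes were consumed by this call.  */
          units++;
          continue;
        }

      /* RC bytes were consumed and the first code unit was stored.  */
      s += rc;
      remaining -= rc;
      units++;
    }

  return units;
}
```

The (size_t) -3 branch is the part most relevant to mbrtoc8: a single
multibyte character can yield several code units, and each additional
unit is returned by a separate call that consumes no input.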