From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp81.iad3a.emailsrvr.com (smtp81.iad3a.emailsrvr.com [173.203.187.81]) by sourceware.org (Postfix) with ESMTPS id 0B50B3854169 for ; Wed, 6 Jul 2022 03:27:47 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 0B50B3854169 X-Auth-ID: tom@honermann.net Received: by smtp19.relay.iad3a.emailsrvr.com (Authenticated sender: tom-AT-honermann.net) with ESMTPSA id 8DBCD4E06; Tue, 5 Jul 2022 23:27:45 -0400 (EDT) Message-ID: Date: Tue, 5 Jul 2022 23:27:45 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1 Subject: Re: [PATCH v4 0/3] C++20 P0482R6 and C2X N2653: char8_t, mbrtoc8(), and c8rtomb() Content-Language: en-US To: Adhemerval Zanella Cc: libc-alpha@sourceware.org References: <20220630125215.6052-1-tom@honermann.net> From: Tom Honermann In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Classification-ID: b128e499-c54b-48d1-9c71-220610d37fa6-1-1 X-Spam-Status: No, score=-3.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_SHORT, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2022 03:27:48 -0000 On 7/4/22 3:08 PM, Adhemerval Zanella wrote: > >> On 30 Jun 2022, at 09:52, Tom Honermann via Libc-alpha wrote: >> >> This series of patches provides the following: >> - A fix for BZ #25744 [1]. >> - Implementations of the mbrtoc8 and c8rtomb functions adopted for >> C++20 via WG21 P0482R6 [2] and for C2X via WG14 N2653 [3]. >> - A char8_t typedef as adopted for C2X via WG14 N2653 [3]. >> >> The fix for BZ #25744 [1] is included in this patch series because the tests >> for mbrtoc8 and c8rtomb depend on it for exercising the special case where a >> pair of Unicode code points is converted to/from a single double byte sequence. >> Such conversion cases exist for Big5-HKSCS. >> >> N2653 was adopted by WG14 for C2X and wording is present in the N2912 revision >> of the C2X working draft [4]. This patch series enables the new declarations >> in C2X mode and when _GNU_SOURCE is defined. >> >> This patch series addresses feedback provided in response to the prior >> submission series starting at [5]. All feedback was addressed except as noted >> below: >> - I removed the previously proposed wcsmbs/test-char8-type.c test as redundant >> per feedback. I prototyped extending the c++-types.data test as suggested, >> but that would have introduced a C++20 dependency. The test demonstrated that >> the char16_t and char32_t types were mapped to 'Ds' and 'Di' as expected, but >> char8_t got mapped to 'h' rather than 'Du'. It might be worth extending this >> test in the future if/when C++20 support becomes a minimal requirement. >> - I did not switch the iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c test to use >> TEST_COMPARE in favor of the existing custom error handling; the current >> error messages are customized to provide a more detailed explanation of the >> error, so removing that seemed like an arguable regression. My changes retain >> consistency with the existing code. I think it is reasonable to switch to >> TEST_COMPARE, but I think that should be done separately from this change and >> separately motivated. >> - I did not change the c8rtomb() and mbrtoc8() implementations to use thread >> local storage when PS is null; the implementations continue to use static >> storage consistent with c16rtomb(), mbrtoc16(), c32rtomb(), and mbrtoc32(). >> I think it is reasonable to switch all of these functions to use thread >> local storage, but I think such a change should be done consistently across >> all of them and should be separately motivated. >> >> Thank you to Joseph Myers, Carlos O'Donell, Adhemerval Zanella, and Florian >> Weimer for prior reviews of this patch series. >> >> Tested on Linux x86_64. >> >> [1]: Bug 25744 >> "mbrtowc with Big5-HKSCS returns 2 instead of 1 when consuming the >> second byte of certain double byte characters" >> https://sourceware.org/bugzilla/show_bug.cgi?id=25744 >> >> [2]: WG21 P0482R6 >> "char8_t: A type for UTF-8 characters and strings (Revision 6)" >> https://wg21.link/p0482r6 >> >> [3]: WG14 N2653 >> "char8_t: A type for UTF-8 characters and strings (Revision 1)" >> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm >> >> [4]: WG14 N2912 >> "Programming languages — C working draft — June 8, 2022" >> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf >> >> [5]: libc-alpha mailing list >> "[PATCH 0/3]: C++20 P0482R6 and C2X N2653: support for char8_t, mbrtoc8(), and c8rtomb()." >> https://sourceware.org/pipermail/libc-alpha/2022-February/136723.html > Hi Tom, > > Patches look good in general, just some really minor style issues I > found in last review. I have fixed them and if you are ok I can > install them, I have create a branch so you can check [1]. Excellent, thank you! I verified that your branch differs from the patches I posted just by the style issues mentioned in your review and the additional removal of the (empty) sysdeps/mach/hurd/libhurduser.abilist and sysdeps/mach/libmachuser.abilist files (which I presume to be intentional). Looks good to me! I very much appreciate you making the additional style corrections! Tom. > > [1] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/mbrtoc8-c8rtomb >