From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-vs1-xe31.google.com (mail-vs1-xe31.google.com [IPv6:2607:f8b0:4864:20::e31]) by sourceware.org (Postfix) with ESMTPS id 1D1223858D39 for ; Wed, 6 Jul 2022 12:23:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1D1223858D39 Received: by mail-vs1-xe31.google.com with SMTP id 126so14849032vsq.13 for ; Wed, 06 Jul 2022 05:23:46 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=VDha6xR7KNkCIaIcwPlh2Ewsu5SN6LxNCQNeTc17Urw=; b=xfHJQ2zQ6Qfg5VqPdnX93VJwrq4lhJ9bMiXtQJ872jsFQa/a57MBwUjmXDhf+bwGz4 jhCdb3O+bZLDo1OkPCSEx0qvCIBa088/YlpD0kprmAahsRxYlPCAFA++jP5kLIpl4qJG VGierzuaOZBqdHXgtXDTKQQh1IMTyuFuNP9V+Db+791kwL+GNVw1iViMJ8VNcJjYAp48 /xLupHvbozKaCcV/+tOTNvtUavwysvhYxq9gI3IaZ5LmAeR/QDaIYL8lbEtXRvCMiBvq 03jKVEBFrX24Xts3+Al58QR3M8K2lNDQPrHXakdoz4qqzHpDuCo8KWfbdl7AloM235VL 8KPw== X-Gm-Message-State: AJIora9mJD0Bw7U64DFooNAWn6CYZDguQ+HjfeOQRrCOpxkgF4ZyRNjc NDmDTZWKCT6p0rmUm1RmkQ3yqiHwarFfOhyP X-Google-Smtp-Source: AGRyM1uZ/UBf0LIgzqd+mLwyU0ITn9H2mz60oT3wD8HZ0JN9hEmTm+v5U2dhtKoxoIMOeZ/RjbrKfw== X-Received: by 2002:a05:6102:53b:b0:354:23f9:a4db with SMTP id m27-20020a056102053b00b0035423f9a4dbmr21483659vsa.34.1657110225385; Wed, 06 Jul 2022 05:23:45 -0700 (PDT) Received: from smtpclient.apple ([2804:431:c7cb:fef6:4d79:ffc6:5481:dcec]) by smtp.gmail.com with ESMTPSA id 66-20020a1f1745000000b0036bfe74006esm8797621vkx.31.2022.07.06.05.23.44 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 Jul 2022 05:23:45 -0700 (PDT) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3696.100.31\)) Subject: Re: [PATCH v4 0/3] C++20 P0482R6 and C2X N2653: char8_t, mbrtoc8(), and c8rtomb() From: Adhemerval Zanella In-Reply-To: Date: Wed, 6 Jul 2022 09:23:42 -0300 Cc: libc-alpha@sourceware.org Content-Transfer-Encoding: quoted-printable Message-Id: References: <20220630125215.6052-1-tom@honermann.net> To: Tom Honermann X-Mailer: Apple Mail (2.3696.100.31) X-Spam-Status: No, score=-5.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2022 12:23:47 -0000 > On 6 Jul 2022, at 00:27, Tom Honermann wrote: >=20 > On 7/4/22 3:08 PM, Adhemerval Zanella wrote: >>=20 >>> On 30 Jun 2022, at 09:52, Tom Honermann via Libc-alpha = wrote: >>>=20 >>> This series of patches provides the following: >>> - A fix for BZ #25744 [1]. >>> - Implementations of the mbrtoc8 and c8rtomb functions adopted for >>> C++20 via WG21 P0482R6 [2] and for C2X via WG14 N2653 [3]. >>> - A char8_t typedef as adopted for C2X via WG14 N2653 [3]. >>>=20 >>> The fix for BZ #25744 [1] is included in this patch series because = the tests >>> for mbrtoc8 and c8rtomb depend on it for exercising the special case = where a >>> pair of Unicode code points is converted to/from a single double = byte sequence. >>> Such conversion cases exist for Big5-HKSCS. >>>=20 >>> N2653 was adopted by WG14 for C2X and wording is present in the = N2912 revision >>> of the C2X working draft [4]. This patch series enables the new = declarations >>> in C2X mode and when _GNU_SOURCE is defined. >>>=20 >>> This patch series addresses feedback provided in response to the = prior >>> submission series starting at [5]. All feedback was addressed = except as noted >>> below: >>> - I removed the previously proposed wcsmbs/test-char8-type.c test as = redundant >>> per feedback. I prototyped extending the c++-types.data test as = suggested, >>> but that would have introduced a C++20 dependency. The test = demonstrated that >>> the char16_t and char32_t types were mapped to 'Ds' and 'Di' as = expected, but >>> char8_t got mapped to 'h' rather than 'Du'. It might be worth = extending this >>> test in the future if/when C++20 support becomes a minimal = requirement. >>> - I did not switch the iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c = test to use >>> TEST_COMPARE in favor of the existing custom error handling; the = current >>> error messages are customized to provide a more detailed = explanation of the >>> error, so removing that seemed like an arguable regression. My = changes retain >>> consistency with the existing code. I think it is reasonable to = switch to >>> TEST_COMPARE, but I think that should be done separately from this = change and >>> separately motivated. >>> - I did not change the c8rtomb() and mbrtoc8() implementations to = use thread >>> local storage when PS is null; the implementations continue to use = static >>> storage consistent with c16rtomb(), mbrtoc16(), c32rtomb(), and = mbrtoc32(). >>> I think it is reasonable to switch all of these functions to use = thread >>> local storage, but I think such a change should be done = consistently across >>> all of them and should be separately motivated. >>>=20 >>> Thank you to Joseph Myers, Carlos O'Donell, Adhemerval Zanella, and = Florian >>> Weimer for prior reviews of this patch series. >>>=20 >>> Tested on Linux x86_64. >>>=20 >>> [1]: Bug 25744 >>> "mbrtowc with Big5-HKSCS returns 2 instead of 1 when consuming = the >>> second byte of certain double byte characters" >>> https://sourceware.org/bugzilla/show_bug.cgi?id=3D25744 >>>=20 >>> [2]: WG21 P0482R6 >>> "char8_t: A type for UTF-8 characters and strings (Revision 6)" >>> https://wg21.link/p0482r6 >>>=20 >>> [3]: WG14 N2653 >>> "char8_t: A type for UTF-8 characters and strings (Revision 1)" >>> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm >>>=20 >>> [4]: WG14 N2912 >>> "Programming languages =E2=80=94 C working draft =E2=80=94 June = 8, 2022" >>> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf >>>=20 >>> [5]: libc-alpha mailing list >>> "[PATCH 0/3]: C++20 P0482R6 and C2X N2653: support for char8_t, = mbrtoc8(), and c8rtomb()." >>> = https://sourceware.org/pipermail/libc-alpha/2022-February/136723.html >> Hi Tom, >>=20 >> Patches look good in general, just some really minor style issues I >> found in last review. I have fixed them and if you are ok I can >> install them, I have create a branch so you can check [1]. >=20 > Excellent, thank you! I verified that your branch differs from the = patches I posted just by the style issues mentioned in your review and = the additional removal of the (empty) = sysdeps/mach/hurd/libhurduser.abilist and = sysdeps/mach/libmachuser.abilist files (which I presume to be = intentional). Looks good to me! I very much appreciate you making the = additional style corrections! >=20 > Tom. The hurd files removal is a mistake from my part (we should really fix make update-abi/check-abi on hurd...). I will fix it and install the patches. >=20 >>=20 >> [1] = https://sourceware.org/git/?p=3Dglibc.git;a=3Dshortlog;h=3Drefs/heads/azan= ella/mbrtoc8-c8rtomb >>=20