From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp99.iad3a.emailsrvr.com (smtp99.iad3a.emailsrvr.com [173.203.187.99]) by sourceware.org (Postfix) with ESMTPS id 143013848019 for ; Sun, 13 Jun 2021 15:35:12 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 143013848019 X-Auth-ID: tom@honermann.net Received: by smtp37.relay.iad3a.emailsrvr.com (Authenticated sender: tom-AT-honermann.net) with ESMTPSA id 889A818B4; Sun, 13 Jun 2021 11:35:11 -0400 (EDT) Subject: Re: [PATCH 2/3]: C++20 P0482R6 and C2X N2653: Implement mbrtoc8, c8rtomb, char8_t To: Joseph Myers Cc: libc-alpha References: <40868945-29d9-3099-fb11-260796a2a022@honermann.net> <1c33efce-b4ed-a238-aa3a-7b15fa3dd495@honermann.net> From: Tom Honermann Message-ID: Date: Sun, 13 Jun 2021 11:35:11 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US X-Classification-ID: 35204c2d-7eac-49ef-93a1-92632712d2d6-1-1 X-Spam-Status: No, score=-3.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, HTML_MESSAGE, KAM_SHORT, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 13 Jun 2021 15:35:14 -0000 On 6/11/21 12:28 PM, Joseph Myers wrote: > On Fri, 11 Jun 2021, Tom Honermann via Libc-alpha wrote: > >> 1) Enabling pre-C2x compatibility with C++20. > Where a feature in a header should be enabled for a particular C++ > version, we can have an internal __GLIBC_USE (CXX20) which is enabled > based on the value of __cplusplus, just like the existing __USE_ISOCXX11. > (Though that doesn't make much practical difference until we provide a way > for C++ library headers to get all the features they need without defining > _GNU_SOURCE.) > > You'd then have > > #if defined __USE_GNU || __GLIBC_USE (CXX20) > > in uchar.h, with appropriate !defined __cpp_char8_t for defining the > typedef itself. And the defined __USE_GNU would become __GLIBC_USE > (ISOC2X) if the feature is accepted for C2x. Sounds good for C. For C++ though, the declarations of mbrtoc8() and c8rtomb() need to be aligned with libstdc++ expectations since they are redeclared in the std namespace if available.  The corresponding libstdc++ patch probes whether these functions are available through an autoconf probe that runs g++ -fchar8_t; the redeclarations are then dependent only on whether char8_t support is enabled (for C++, the corresponding library support is dependent on the core language changes).  Restricting the declarations as you suggest would result in compilation errors for invocations like g++ -std=c++17 -fchar8_t and g++ -std=c++2x -fchar8_t.  The char8_t dialect already exists for C++, and I think the glibc changes should align with that for C++.  I think it is fine if the new functions are declared for modes like g++ -std=gnu++17; there is no need to disable them with -fno-char8_t.  I therefore suggest: #if defined __USE_GNU || /* __GLIBC_USE (ISOC2X) || */ defined __cpp_char8_t > >> 2) Enabling conditional code sensitive to the change of type of u8 string >> literals. > That might be a reason for a compiler option (I don't think it's a > sufficient reason, I expect -std=c11 or -Wno-pointer-sign or pointer casts > to suffice in practice), but it has nothing to do with the use of feature > test macros in glibc since the library headers don't care at all what the > type of u8 string literals is (there are no u8 string literals anywhere in > glibc). That's fair.  I think Jakub's suggestion on gcc-patches to use __CHAR8_TYPE__ to conditionalize sensitive code suffices. > >> 3) Avoiding clashes with existing uses of the char8_t identifier in code that >> may be compiled with _GNU_SOURCE. > It's a universal feature in glibc that all non-obsolescent, > non-conflicting features are enabled with _GNU_SOURCE, including features > from future standard versions - all feature test macros are subsets of > _GNU_SOURCE, except where the features actually conflict in some way (a > few cases where a POSIX function has semantics incompatible with a GNU > function of the same name), and features from future standards with no > feature test macro support in glibc (e.g. some features proposed for the > next POSIX version) are also enabled by _GNU_SOURCE. > > So char8_t and these functions should be enabled by _GNU_SOURCE, > regardless of what other options enable them. > > Users of _GNU_SOURCE are expected to deal with possible naming conflicts > with features added to _GNU_SOURCE. Likewise people building with > -std=gnu2x / -std=c2x, which would enable these declarations if char8_t > and these functions are added to C2x, or users of _ISOC2X_SOURCE (which > isn't expected to be used much). Ok, great.  I wasn't aware that _GNU_SOURCE was intended to be this inconclusive. > >> I agree that this is a narrow feature, but I think the feature test macro is >> beneficial for compatibility reasons.  Assuming the proposal is adopted for >> C23, some programmers may need to opt-out of the feature by compiling with >> -fno-char8_t during a migration period; as some C++20 projects have done. > We don't provide a way to opt out of any other library additions when > compiling in C2x mode; compiling with -std=c2x or -std=gnu2x or > -D_GNU_SOURCE unconditionally enables all other C2x additions in headers > (including e.g. the iszero macro, although the issues seen with that were > mainly for C++ code and we addressed that by making it a template instead > of a macro for C++), compiling with e.g. -std=c11 or -std=gnu11 (and not > using _GNU_SOURCE) is how you opt out of all such additions at the same > time. Likewise for previous C standard versions. And _GNU_SOURCE is the > universal way to opt in to features that might or might not be in a future > standard version. If such a feature has its own TS, we may support the > __STDC_WANT_* macros from such a TS, but that's not a basis for creating > our own feature test macro for a handful of new identifiers. Perfect. Per this and prior messages, I'll review the patch series as follows: 1. Remove uses of the _CHAR8_T_SOURCE macro. 2. Remove the comments added for __GLIBC_USE_ISOC2X, __GLIBC_USE_DEPRECATED_GETS, and __GLIBC_USE_DEPRECATED_SCANF. 3. Remove the __GLIBC_USE_CHAR8_T macro. 4. Conditionalize the char8_t, mbrtoc8() and c8rtomb() declarations on __USE_GNU and __cpp_char8_t (and perhaps __GLIBC_USE (ISOC2X) eventually). 5. Transition the new tests to the support/test-driver.c driver and use the support/check.h interfaces. 6. Correct the copyright statements. 7. Remove the "contributed by" comments. If I've forgotten anything, please let me know. Thank you for the thorough review! Tom.