* [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation @ 2019-09-15 19:39 Tom Honermann 2019-11-29 17:47 ` Jonathan Wakely 0 siblings, 1 reply; 5+ messages in thread From: Tom Honermann @ 2019-09-15 19:39 UTC (permalink / raw) To: libstdc++, gcc-patches This series of patches provides an implementation of the changes for C++ proposal P1423R3 [1]. These changes do not impact default libstdc++ behavior for C++17 and earlier; they are only active for C++2a or when the -fchar8_t option is specified. Tested x86_64-linux. Patch 1: Decouple constraints for u8path from path constructors. Patch 2: Update __cpp_lib_char8_t feature test macro value, add deleted operators, update u8path. Patch 3: Updates to existing tests. Patch 4: New tests. Tom. [1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation 2019-09-15 19:39 [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation Tom Honermann @ 2019-11-29 17:47 ` Jonathan Wakely 2019-11-29 19:53 ` Jonathan Wakely 0 siblings, 1 reply; 5+ messages in thread From: Jonathan Wakely @ 2019-11-29 17:47 UTC (permalink / raw) To: Tom Honermann; +Cc: libstdc++, gcc-patches [-- Attachment #1: Type: text/plain, Size: 980 bytes --] On 15/09/19 15:39 -0400, Tom Honermann wrote: >This series of patches provides an implementation of the changes for >C++ proposal P1423R3 [1]. > >These changes do not impact default libstdc++ behavior for C++17 and >earlier; they are only active for C++2a or when the -fchar8_t option >is specified. > >Tested x86_64-linux. > >Patch 1: Decouple constraints for u8path from path constructors. >Patch 2: Update __cpp_lib_char8_t feature test macro value, add >deleted operators, update u8path. >Patch 3: Updates to existing tests. >Patch 4: New tests. > >Tom. > >[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html It took a while, but I've committed these four patches, with just some minor whitespace changes and changelog tweaks. I'm also following it up with this patch, which corrects some pre-existing problems that got worse with the new deleted operator<< overloads. Tested powerpc64le-linux, committed to trunk. [-- Attachment #2: patch.txt --] [-- Type: text/x-patch, Size: 6482 bytes --] commit 81c954850d8422eaeefe3c9e833b75f8dbeb905f Author: Jonathan Wakely <jwakely@redhat.com> Date: Fri Nov 29 17:15:25 2019 +0000 libstdc++: Adjust some function templates for coding conventions * include/bits/fs_path.h (path::operator/=): Change template-head to use typename instead of class. * include/experimental/bits/fs_path.h (path::operator/=): Likewise. * include/std/ostream (operator<<): Likewise. diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h index 643478292cd..b129372447b 100644 --- a/libstdc++-v3/include/bits/fs_path.h +++ b/libstdc++-v3/include/bits/fs_path.h @@ -272,7 +272,7 @@ namespace __detail path& operator/=(const path& __p); - template <class _Source> + template<typename _Source> __detail::_Path<_Source>& operator/=(_Source const& __source) { diff --git a/libstdc++-v3/include/experimental/bits/fs_path.h b/libstdc++-v3/include/experimental/bits/fs_path.h index b924fbfd5f6..91202e5b008 100644 --- a/libstdc++-v3/include/experimental/bits/fs_path.h +++ b/libstdc++-v3/include/experimental/bits/fs_path.h @@ -278,7 +278,7 @@ namespace __detail path& operator/=(const path& __p) { return _M_append(__p._M_pathname); } - template <class _Source> + template<typename _Source> __detail::_Path<_Source>& operator/=(_Source const& __source) { return append(__source); } diff --git a/libstdc++-v3/include/std/ostream b/libstdc++-v3/include/std/ostream index 771c28db7b7..895e4d7ab4e 100644 --- a/libstdc++-v3/include/std/ostream +++ b/libstdc++-v3/include/std/ostream @@ -512,18 +512,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { return (__out << __out.widen(__c)); } // Specialization - template <class _Traits> + template<typename _Traits> inline basic_ostream<char, _Traits>& operator<<(basic_ostream<char, _Traits>& __out, char __c) { return __ostream_insert(__out, &__c, 1); } // Signed and unsigned - template<class _Traits> + template<typename _Traits> inline basic_ostream<char, _Traits>& operator<<(basic_ostream<char, _Traits>& __out, signed char __c) { return (__out << static_cast<char>(__c)); } - template<class _Traits> + template<typename _Traits> inline basic_ostream<char, _Traits>& operator<<(basic_ostream<char, _Traits>& __out, unsigned char __c) { return (__out << static_cast<char>(__c)); } @@ -533,37 +533,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // numeric values. #ifdef _GLIBCXX_USE_WCHAR_T - template<class _Traits> + template<typename _Traits> basic_ostream<char, _Traits>& operator<<(basic_ostream<char, _Traits>&, wchar_t) = delete; #endif // _GLIBCXX_USE_WCHAR_T #ifdef _GLIBCXX_USE_CHAR8_T - template<class _Traits> + template<typename _Traits> basic_ostream<char, _Traits>& operator<<(basic_ostream<char, _Traits>&, char8_t) = delete; #endif - template<class _Traits> + template<typename _Traits> basic_ostream<char, _Traits>& operator<<(basic_ostream<char, _Traits>&, char16_t) = delete; - template<class _Traits> + template<typename _Traits> basic_ostream<char, _Traits>& operator<<(basic_ostream<char, _Traits>&, char32_t) = delete; #ifdef _GLIBCXX_USE_WCHAR_T #ifdef _GLIBCXX_USE_CHAR8_T - template<class _Traits> + template<typename _Traits> basic_ostream<wchar_t, _Traits>& operator<<(basic_ostream<wchar_t, _Traits>&, char8_t) = delete; #endif // _GLIBCXX_USE_CHAR8_T - template<class _Traits> + template<typename _Traits> basic_ostream<wchar_t, _Traits>& operator<<(basic_ostream<wchar_t, _Traits>&, char16_t) = delete; - template<class _Traits> + template<typename _Traits> basic_ostream<wchar_t, _Traits>& operator<<(basic_ostream<wchar_t, _Traits>&, char32_t) = delete; #endif // _GLIBCXX_USE_WCHAR_T @@ -601,7 +601,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION operator<<(basic_ostream<_CharT, _Traits>& __out, const char* __s); // Partial specializations - template<class _Traits> + template<typename _Traits> inline basic_ostream<char, _Traits>& operator<<(basic_ostream<char, _Traits>& __out, const char* __s) { @@ -614,12 +614,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } // Signed and unsigned - template<class _Traits> + template<typename _Traits> inline basic_ostream<char, _Traits>& operator<<(basic_ostream<char, _Traits>& __out, const signed char* __s) { return (__out << reinterpret_cast<const char*>(__s)); } - template<class _Traits> + template<typename _Traits> inline basic_ostream<char, _Traits> & operator<<(basic_ostream<char, _Traits>& __out, const unsigned char* __s) { return (__out << reinterpret_cast<const char*>(__s)); } @@ -629,37 +629,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // pointer values. #ifdef _GLIBCXX_USE_WCHAR_T - template<class _Traits> + template<typename _Traits> basic_ostream<char, _Traits>& operator<<(basic_ostream<char, _Traits>&, const wchar_t*) = delete; #endif // _GLIBCXX_USE_WCHAR_T #ifdef _GLIBCXX_USE_CHAR8_T - template<class _Traits> + template<typename _Traits> basic_ostream<char, _Traits>& operator<<(basic_ostream<char, _Traits>&, const char8_t*) = delete; #endif // _GLIBCXX_USE_CHAR8_T - template<class _Traits> + template<typename _Traits> basic_ostream<char, _Traits>& operator<<(basic_ostream<char, _Traits>&, const char16_t*) = delete; - template<class _Traits> + template<typename _Traits> basic_ostream<char, _Traits>& operator<<(basic_ostream<char, _Traits>&, const char32_t*) = delete; #ifdef _GLIBCXX_USE_WCHAR_T #ifdef _GLIBCXX_USE_CHAR8_T - template<class _Traits> + template<typename _Traits> basic_ostream<wchar_t, _Traits>& operator<<(basic_ostream<wchar_t, _Traits>&, const char8_t*) = delete; #endif - template<class _Traits> + template<typename _Traits> basic_ostream<wchar_t, _Traits>& operator<<(basic_ostream<wchar_t, _Traits>&, const char16_t*) = delete; - template<class _Traits> + template<typename _Traits> basic_ostream<wchar_t, _Traits>& operator<<(basic_ostream<wchar_t, _Traits>&, const char32_t*) = delete; #endif // _GLIBCXX_USE_WCHAR_T ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation 2019-11-29 17:47 ` Jonathan Wakely @ 2019-11-29 19:53 ` Jonathan Wakely 2019-11-29 22:11 ` Jonathan Wakely 0 siblings, 1 reply; 5+ messages in thread From: Jonathan Wakely @ 2019-11-29 19:53 UTC (permalink / raw) To: Tom Honermann; +Cc: libstdc++, gcc-patches [-- Attachment #1: Type: text/plain, Size: 1708 bytes --] On 29/11/19 17:45 +0000, Jonathan Wakely wrote: >On 15/09/19 15:39 -0400, Tom Honermann wrote: >>This series of patches provides an implementation of the changes for >>C++ proposal P1423R3 [1]. >> >>These changes do not impact default libstdc++ behavior for C++17 and >>earlier; they are only active for C++2a or when the -fchar8_t option >>is specified. >> >>Tested x86_64-linux. >> >>Patch 1: Decouple constraints for u8path from path constructors. >>Patch 2: Update __cpp_lib_char8_t feature test macro value, add >>deleted operators, update u8path. >>Patch 3: Updates to existing tests. >>Patch 4: New tests. >> >>Tom. >> >>[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html > >It took a while, but I've committed these four patches, with just some >minor whitespace changes and changelog tweaks. Running the new tests revealed a latent bug on Windows, where experimental::filesystem::u8path(const Source&) assumed the input was an iterator over a NTCTS. That worked for a const char* but not a std::string or experimental::string_view. The attached patch fixes that (and simplifies the #if and if-constexpr conditions for Windows) but there's a remaining bug. Constructing a experimental::filesystem::path from a char8_t string doesn't do the right thing on Windows, so these cases fails: fs::path p(u8"\xf0\x9d\x84\x9e"); VERIFY( p.u8string() == u8"\U0001D11E" ); p = fs::u8path(u8"\xf0\x9d\x84\x9e"); VERIFY( p.u8string() == u8"\U0001D11E" ); It works correctly for std::filesystem::path, just not the TS version. I plan to commit the attached patch after some more testing. I'll backport something like it too. [-- Attachment #2: patch.txt --] [-- Type: text/x-patch, Size: 3957 bytes --] commit f9ff56e4624c34be0615aab62ed4a2ce559dd567 Author: Jonathan Wakely <jwakely@redhat.com> Date: Fri Nov 29 19:36:39 2019 +0000 libstdc++: Fix experimental::filesystem::u8path(const Source&) for Windows This function failed to compile when called with a std::string. * include/bits/fs_path.h (u8path(InputIterator, InputIterator)) (u8path(const Source&)) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Simplify conditions. * include/experimental/bits/fs_path.h (__u8path(const Source&, char)): Add overloads for std::string and types convertible to std::string. diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h index b129372447b..20ec42da57d 100644 --- a/libstdc++-v3/include/bits/fs_path.h +++ b/libstdc++-v3/include/bits/fs_path.h @@ -691,14 +691,8 @@ namespace __detail u8path(_InputIterator __first, _InputIterator __last) { #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS -#ifdef _GLIBCXX_USE_CHAR8_T - if constexpr (is_same_v<_CharT, char8_t>) + if constexpr (is_same_v<_CharT, char>) { - return path{ __first, __last }; - } - else - { -#endif // XXX This assumes native wide encoding is UTF-16. std::codecvt_utf8_utf16<path::value_type> __cvt; path::string_type __tmp; @@ -710,16 +704,16 @@ namespace __detail else { const std::string __u8str{__first, __last}; - const char* const __ptr = __u8str.data(); - if (__str_codecvt_in_all(__ptr, __ptr + __u8str.size(), __tmp, __cvt)) + const char* const __p = __u8str.data(); + if (__str_codecvt_in_all(__p, __p + __u8str.size(), __tmp, __cvt)) return path{ __tmp }; } _GLIBCXX_THROW_OR_ABORT(filesystem_error( "Cannot convert character sequence", std::make_error_code(errc::illegal_byte_sequence))); -#ifdef _GLIBCXX_USE_CHAR8_T } -#endif + else + return path{ __first, __last }; #else // This assumes native normal encoding is UTF-8. return path{ __first, __last }; @@ -737,14 +731,8 @@ namespace __detail u8path(const _Source& __source) { #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS -#ifdef _GLIBCXX_USE_CHAR8_T - if constexpr (is_same_v<_CharT, char8_t>) + if constexpr (is_same_v<_CharT, char>) { - return path{ __source }; - } - else - { -#endif if constexpr (is_convertible_v<const _Source&, std::string_view>) { const std::string_view __s = __source; @@ -755,9 +743,9 @@ namespace __detail std::string __s = path::_S_string_from_iter(__source); return filesystem::u8path(__s.data(), __s.data() + __s.size()); } -#ifdef _GLIBCXX_USE_CHAR8_T } -#endif + else + return path{ __source }; #else return path{ __source }; #endif diff --git a/libstdc++-v3/include/experimental/bits/fs_path.h b/libstdc++-v3/include/experimental/bits/fs_path.h index 91202e5b008..796458b37b0 100644 --- a/libstdc++-v3/include/experimental/bits/fs_path.h +++ b/libstdc++-v3/include/experimental/bits/fs_path.h @@ -644,8 +644,22 @@ namespace __detail /// Create a path from a UTF-8-encoded sequence of char #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS + inline path + __u8path(const string& __s, char) + { + return filesystem::u8path(__s.data(), __s.data() + __s.size()); + } + template<typename _Source> - inline path + inline __enable_if_t<is_convertible<const _Source&, string>::value, path> + __u8path(const _Source& __source, char) + { + std::string __s = __source; + return filesystem::u8path(__s.data(), __s.data() + __s.size()); + } + + template<typename _Source> + inline __enable_if_t<!is_convertible<const _Source&, string>::value, path> __u8path(const _Source& __source, char) { std::string __s = path::_S_string_from_iter(__source); ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation 2019-11-29 19:53 ` Jonathan Wakely @ 2019-11-29 22:11 ` Jonathan Wakely 2019-11-30 1:04 ` Jonathan Wakely 0 siblings, 1 reply; 5+ messages in thread From: Jonathan Wakely @ 2019-11-29 22:11 UTC (permalink / raw) To: Tom Honermann; +Cc: libstdc++, gcc-patches On 29/11/19 19:48 +0000, Jonathan Wakely wrote: >On 29/11/19 17:45 +0000, Jonathan Wakely wrote: >>On 15/09/19 15:39 -0400, Tom Honermann wrote: >>>This series of patches provides an implementation of the changes >>>for C++ proposal P1423R3 [1]. >>> >>>These changes do not impact default libstdc++ behavior for C++17 >>>and earlier; they are only active for C++2a or when the -fchar8_t >>>option is specified. >>> >>>Tested x86_64-linux. >>> >>>Patch 1: Decouple constraints for u8path from path constructors. >>>Patch 2: Update __cpp_lib_char8_t feature test macro value, add >>>deleted operators, update u8path. >>>Patch 3: Updates to existing tests. >>>Patch 4: New tests. >>> >>>Tom. >>> >>>[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html >> >>It took a while, but I've committed these four patches, with just some >>minor whitespace changes and changelog tweaks. > >Running the new tests revealed a latent bug on Windows, where >experimental::filesystem::u8path(const Source&) assumed the input >was an iterator over a NTCTS. That worked for a const char* but not a >std::string or experimental::string_view. > >The attached patch fixes that (and simplifies the #if and if-constexpr >conditions for Windows) but there's a remaining bug. Constructing a >experimental::filesystem::path from a char8_t string doesn't do the >right thing on Windows, so these cases fails: > > fs::path p(u8"\xf0\x9d\x84\x9e"); > VERIFY( p.u8string() == u8"\U0001D11E" ); > > p = fs::u8path(u8"\xf0\x9d\x84\x9e"); > VERIFY( p.u8string() == u8"\U0001D11E" ); > >It works correctly for std::filesystem::path, just not the TS version. I think this is the fix needed for the TS code: --- a/libstdc++-v3/include/experimental/bits/fs_path.h +++ b/libstdc++-v3/include/experimental/bits/fs_path.h @@ -765,7 +765,14 @@ namespace __detail { #ifdef _GLIBCXX_USE_CHAR8_T if constexpr (is_same<_CharT, char8_t>::value) - return _S_wconvert((const char*)__f, (const char*)__l, true_type()); + { + const char* __f2 = (const char*)__f; + const char* __l2 = (const char*)__l; + std::wstring __wstr; + std::codecvt_utf8_utf16<wchar_t> __wcvt; + if (__str_codecvt_in_all(__f2, __l2, __wstr, __wcvt)) + return __wstr; + } else #endif { The current code uses std::codecvt<wchar_t, char, mbstate_t> but when we know the input is UTF-8 encoded we should use codecvt_utf8_utf16 (which is what the C++17 code already does for char8_t input). I'll add that the patch I'm testing. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation 2019-11-29 22:11 ` Jonathan Wakely @ 2019-11-30 1:04 ` Jonathan Wakely 0 siblings, 0 replies; 5+ messages in thread From: Jonathan Wakely @ 2019-11-30 1:04 UTC (permalink / raw) To: Tom Honermann; +Cc: libstdc++, gcc-patches [-- Attachment #1: Type: text/plain, Size: 2885 bytes --] On 29/11/19 21:45 +0000, Jonathan Wakely wrote: >On 29/11/19 19:48 +0000, Jonathan Wakely wrote: >>On 29/11/19 17:45 +0000, Jonathan Wakely wrote: >>>On 15/09/19 15:39 -0400, Tom Honermann wrote: >>>>This series of patches provides an implementation of the changes >>>>for C++ proposal P1423R3 [1]. >>>> >>>>These changes do not impact default libstdc++ behavior for C++17 >>>>and earlier; they are only active for C++2a or when the >>>>-fchar8_t option is specified. >>>> >>>>Tested x86_64-linux. >>>> >>>>Patch 1: Decouple constraints for u8path from path constructors. >>>>Patch 2: Update __cpp_lib_char8_t feature test macro value, add >>>>deleted operators, update u8path. >>>>Patch 3: Updates to existing tests. >>>>Patch 4: New tests. >>>> >>>>Tom. >>>> >>>>[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html >>> >>>It took a while, but I've committed these four patches, with just some >>>minor whitespace changes and changelog tweaks. >> >>Running the new tests revealed a latent bug on Windows, where >>experimental::filesystem::u8path(const Source&) assumed the input >>was an iterator over a NTCTS. That worked for a const char* but not a >>std::string or experimental::string_view. >> >>The attached patch fixes that (and simplifies the #if and if-constexpr >>conditions for Windows) but there's a remaining bug. Constructing a >>experimental::filesystem::path from a char8_t string doesn't do the >>right thing on Windows, so these cases fails: >> >>fs::path p(u8"\xf0\x9d\x84\x9e"); >>VERIFY( p.u8string() == u8"\U0001D11E" ); >> >>p = fs::u8path(u8"\xf0\x9d\x84\x9e"); >>VERIFY( p.u8string() == u8"\U0001D11E" ); >> >>It works correctly for std::filesystem::path, just not the TS version. > >I think this is the fix needed for the TS code: > >--- a/libstdc++-v3/include/experimental/bits/fs_path.h >+++ b/libstdc++-v3/include/experimental/bits/fs_path.h >@@ -765,7 +765,14 @@ namespace __detail > { >#ifdef _GLIBCXX_USE_CHAR8_T > if constexpr (is_same<_CharT, char8_t>::value) >- return _S_wconvert((const char*)__f, (const char*)__l, true_type()); >+ { >+ const char* __f2 = (const char*)__f; >+ const char* __l2 = (const char*)__l; >+ std::wstring __wstr; >+ std::codecvt_utf8_utf16<wchar_t> __wcvt; >+ if (__str_codecvt_in_all(__f2, __l2, __wstr, __wcvt)) >+ return __wstr; >+ } > else >#endif > { > >The current code uses std::codecvt<wchar_t, char, mbstate_t> but when >we know the input is UTF-8 encoded we should use codecvt_utf8_utf16 >(which is what the C++17 code already does for char8_t input). > >I'll add that the patch I'm testing. Here's the final patch I'm committing. Tested powerpc64le-linux and x86_64-w64-mingw, committed to trunk. [-- Attachment #2: patch.txt --] [-- Type: text/x-patch, Size: 7586 bytes --] commit cb232cf3d475d51fe5e550680bff64dfd32f57a5 Author: Jonathan Wakely <jwakely@redhat.com> Date: Fri Nov 29 19:36:39 2019 +0000 libstdc++: Fix experimental::filesystem::u8path(const Source&) for Windows This function failed to compile when called with a std::string. Also, constructing a path with a char8_t string did not correctly treat the string as already UTF-8 encoded. * include/bits/fs_path.h (u8path(InputIterator, InputIterator)) (u8path(const Source&)) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Simplify conditions. * include/experimental/bits/fs_path.h [_GLIBCXX_FILESYSTEM_IS_WINDOWS] (__u8path(const Source&, char)): Add overloads for std::string and types convertible to std::string. (_Cvt::_S_wconvert): Add a new overload for char8_t strings and use codecvt_utf8_utf16 to do the correct conversion. diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h index b129372447b..20ec42da57d 100644 --- a/libstdc++-v3/include/bits/fs_path.h +++ b/libstdc++-v3/include/bits/fs_path.h @@ -691,14 +691,8 @@ namespace __detail u8path(_InputIterator __first, _InputIterator __last) { #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS -#ifdef _GLIBCXX_USE_CHAR8_T - if constexpr (is_same_v<_CharT, char8_t>) + if constexpr (is_same_v<_CharT, char>) { - return path{ __first, __last }; - } - else - { -#endif // XXX This assumes native wide encoding is UTF-16. std::codecvt_utf8_utf16<path::value_type> __cvt; path::string_type __tmp; @@ -710,16 +704,16 @@ namespace __detail else { const std::string __u8str{__first, __last}; - const char* const __ptr = __u8str.data(); - if (__str_codecvt_in_all(__ptr, __ptr + __u8str.size(), __tmp, __cvt)) + const char* const __p = __u8str.data(); + if (__str_codecvt_in_all(__p, __p + __u8str.size(), __tmp, __cvt)) return path{ __tmp }; } _GLIBCXX_THROW_OR_ABORT(filesystem_error( "Cannot convert character sequence", std::make_error_code(errc::illegal_byte_sequence))); -#ifdef _GLIBCXX_USE_CHAR8_T } -#endif + else + return path{ __first, __last }; #else // This assumes native normal encoding is UTF-8. return path{ __first, __last }; @@ -737,14 +731,8 @@ namespace __detail u8path(const _Source& __source) { #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS -#ifdef _GLIBCXX_USE_CHAR8_T - if constexpr (is_same_v<_CharT, char8_t>) + if constexpr (is_same_v<_CharT, char>) { - return path{ __source }; - } - else - { -#endif if constexpr (is_convertible_v<const _Source&, std::string_view>) { const std::string_view __s = __source; @@ -755,9 +743,9 @@ namespace __detail std::string __s = path::_S_string_from_iter(__source); return filesystem::u8path(__s.data(), __s.data() + __s.size()); } -#ifdef _GLIBCXX_USE_CHAR8_T } -#endif + else + return path{ __source }; #else return path{ __source }; #endif diff --git a/libstdc++-v3/include/experimental/bits/fs_path.h b/libstdc++-v3/include/experimental/bits/fs_path.h index 91202e5b008..5ce012eec81 100644 --- a/libstdc++-v3/include/experimental/bits/fs_path.h +++ b/libstdc++-v3/include/experimental/bits/fs_path.h @@ -644,8 +644,22 @@ namespace __detail /// Create a path from a UTF-8-encoded sequence of char #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS + inline path + __u8path(const string& __s, char) + { + return filesystem::u8path(__s.data(), __s.data() + __s.size()); + } + template<typename _Source> - inline path + inline __enable_if_t<is_convertible<const _Source&, string>::value, path> + __u8path(const _Source& __source, char) + { + std::string __s = __source; + return filesystem::u8path(__s.data(), __s.data() + __s.size()); + } + + template<typename _Source> + inline __enable_if_t<!is_convertible<const _Source&, string>::value, path> __u8path(const _Source& __source, char) { std::string __s = path::_S_string_from_iter(__source); @@ -733,8 +747,21 @@ namespace __detail struct path::_Cvt { #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS +#ifdef _GLIBCXX_USE_CHAR8_T static string_type - _S_wconvert(const char* __f, const char* __l, true_type) + _S_wconvert(const char8_t* __f, const char8_t* __l, const char8_t*) + { + const char* __f2 = (const char*)__f; + const char* __l2 = (const char*)__l; + std::wstring __wstr; + std::codecvt_utf8_utf16<wchar_t> __wcvt; + if (__str_codecvt_in_all(__f2, __l2, __wstr, __wcvt)) + return __wstr; + } +#endif + + static string_type + _S_wconvert(const char* __f, const char* __l, const char*) { using _Cvt = std::codecvt<wchar_t, char, mbstate_t>; const auto& __cvt = std::use_facet<_Cvt>(std::locale{}); @@ -747,36 +774,29 @@ namespace __detail } static string_type - _S_wconvert(const _CharT* __f, const _CharT* __l, false_type) + _S_wconvert(const _CharT* __f, const _CharT* __l, const void*) { -#ifdef _GLIBCXX_USE_CHAR8_T - if constexpr (is_same<_CharT, char8_t>::value) - return _S_wconvert((const char*)__f, (const char*)__l, true_type()); - else -#endif + struct _UCvt : std::codecvt<_CharT, char, std::mbstate_t> + { } __cvt; + std::string __str; + if (__str_codecvt_out_all(__f, __l, __str, __cvt)) { - struct _UCvt : std::codecvt<_CharT, char, std::mbstate_t> - { } __cvt; - std::string __str; - if (__str_codecvt_out_all(__f, __l, __str, __cvt)) - { - const char* __f2 = __str.data(); - const char* __l2 = __f2 + __str.size(); - std::codecvt_utf8_utf16<wchar_t> __wcvt; - std::wstring __wstr; - if (__str_codecvt_in_all(__f2, __l2, __wstr, __wcvt)) - return __wstr; - } - _GLIBCXX_THROW_OR_ABORT(filesystem_error( - "Cannot convert character sequence", - std::make_error_code(errc::illegal_byte_sequence))); + const char* __f2 = __str.data(); + const char* __l2 = __f2 + __str.size(); + std::codecvt_utf8_utf16<wchar_t> __wcvt; + std::wstring __wstr; + if (__str_codecvt_in_all(__f2, __l2, __wstr, __wcvt)) + return __wstr; } + _GLIBCXX_THROW_OR_ABORT(filesystem_error( + "Cannot convert character sequence", + std::make_error_code(errc::illegal_byte_sequence))); } static string_type _S_convert(const _CharT* __f, const _CharT* __l) { - return _S_wconvert(__f, __l, is_same<_CharT, char>{}); + return _S_wconvert(__f, __l, (const _CharT*)nullptr); } #else static string_type @@ -786,19 +806,17 @@ namespace __detail if constexpr (is_same<_CharT, char8_t>::value) return string_type(__f, __l); else - { #endif + { struct _UCvt : std::codecvt<_CharT, char, std::mbstate_t> { } __cvt; std::string __str; if (__str_codecvt_out_all(__f, __l, __str, __cvt)) return __str; -#ifdef _GLIBCXX_USE_CHAR8_T + _GLIBCXX_THROW_OR_ABORT(filesystem_error( + "Cannot convert character sequence", + std::make_error_code(errc::illegal_byte_sequence))); } -#endif - _GLIBCXX_THROW_OR_ABORT(filesystem_error( - "Cannot convert character sequence", - std::make_error_code(errc::illegal_byte_sequence))); } #endif ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2019-11-30 1:03 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-09-15 19:39 [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation Tom Honermann 2019-11-29 17:47 ` Jonathan Wakely 2019-11-29 19:53 ` Jonathan Wakely 2019-11-29 22:11 ` Jonathan Wakely 2019-11-30 1:04 ` Jonathan Wakely
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).