public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation
@ 2019-09-15 19:39 Tom Honermann
  2019-11-29 17:47 ` Jonathan Wakely
  0 siblings, 1 reply; 5+ messages in thread
From: Tom Honermann @ 2019-09-15 19:39 UTC (permalink / raw)
  To: libstdc++, gcc-patches

This series of patches provides an implementation of the changes for C++ 
proposal P1423R3 [1].

These changes do not impact default libstdc++ behavior for C++17 and 
earlier; they are only active for C++2a or when the -fchar8_t option is 
specified.

Tested x86_64-linux.

Patch 1: Decouple constraints for u8path from path constructors.
Patch 2: Update __cpp_lib_char8_t feature test macro value, add deleted 
operators, update u8path.
Patch 3: Updates to existing tests.
Patch 4: New tests.

Tom.

[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation
  2019-09-15 19:39 [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation Tom Honermann
@ 2019-11-29 17:47 ` Jonathan Wakely
  2019-11-29 19:53   ` Jonathan Wakely
  0 siblings, 1 reply; 5+ messages in thread
From: Jonathan Wakely @ 2019-11-29 17:47 UTC (permalink / raw)
  To: Tom Honermann; +Cc: libstdc++, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 980 bytes --]

On 15/09/19 15:39 -0400, Tom Honermann wrote:
>This series of patches provides an implementation of the changes for 
>C++ proposal P1423R3 [1].
>
>These changes do not impact default libstdc++ behavior for C++17 and 
>earlier; they are only active for C++2a or when the -fchar8_t option 
>is specified.
>
>Tested x86_64-linux.
>
>Patch 1: Decouple constraints for u8path from path constructors.
>Patch 2: Update __cpp_lib_char8_t feature test macro value, add 
>deleted operators, update u8path.
>Patch 3: Updates to existing tests.
>Patch 4: New tests.
>
>Tom.
>
>[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html

It took a while, but I've committed these four patches, with just some
minor whitespace changes and changelog tweaks.

I'm also following it up with this patch, which corrects some
pre-existing problems that got worse with the new deleted operator<<
overloads.

Tested powerpc64le-linux, committed to trunk.



[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 6482 bytes --]

commit 81c954850d8422eaeefe3c9e833b75f8dbeb905f
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Fri Nov 29 17:15:25 2019 +0000

    libstdc++: Adjust some function templates for coding conventions
    
            * include/bits/fs_path.h (path::operator/=): Change template-head to
            use typename instead of class.
            * include/experimental/bits/fs_path.h (path::operator/=): Likewise.
            * include/std/ostream (operator<<): Likewise.

diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h
index 643478292cd..b129372447b 100644
--- a/libstdc++-v3/include/bits/fs_path.h
+++ b/libstdc++-v3/include/bits/fs_path.h
@@ -272,7 +272,7 @@ namespace __detail
 
     path& operator/=(const path& __p);
 
-    template <class _Source>
+    template<typename _Source>
       __detail::_Path<_Source>&
       operator/=(_Source const& __source)
       {
diff --git a/libstdc++-v3/include/experimental/bits/fs_path.h b/libstdc++-v3/include/experimental/bits/fs_path.h
index b924fbfd5f6..91202e5b008 100644
--- a/libstdc++-v3/include/experimental/bits/fs_path.h
+++ b/libstdc++-v3/include/experimental/bits/fs_path.h
@@ -278,7 +278,7 @@ namespace __detail
 
     path& operator/=(const path& __p) { return _M_append(__p._M_pathname); }
 
-    template <class _Source>
+    template<typename _Source>
       __detail::_Path<_Source>&
       operator/=(_Source const& __source)
       { return append(__source); }
diff --git a/libstdc++-v3/include/std/ostream b/libstdc++-v3/include/std/ostream
index 771c28db7b7..895e4d7ab4e 100644
--- a/libstdc++-v3/include/std/ostream
+++ b/libstdc++-v3/include/std/ostream
@@ -512,18 +512,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     { return (__out << __out.widen(__c)); }
 
   // Specialization
-  template <class _Traits>
+  template<typename _Traits>
     inline basic_ostream<char, _Traits>&
     operator<<(basic_ostream<char, _Traits>& __out, char __c)
     { return __ostream_insert(__out, &__c, 1); }
 
   // Signed and unsigned
-  template<class _Traits>
+  template<typename _Traits>
     inline basic_ostream<char, _Traits>&
     operator<<(basic_ostream<char, _Traits>& __out, signed char __c)
     { return (__out << static_cast<char>(__c)); }
 
-  template<class _Traits>
+  template<typename _Traits>
     inline basic_ostream<char, _Traits>&
     operator<<(basic_ostream<char, _Traits>& __out, unsigned char __c)
     { return (__out << static_cast<char>(__c)); }
@@ -533,37 +533,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // numeric values.
 
 #ifdef _GLIBCXX_USE_WCHAR_T
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<char, _Traits>&
     operator<<(basic_ostream<char, _Traits>&, wchar_t) = delete;
 #endif // _GLIBCXX_USE_WCHAR_T
 
 #ifdef _GLIBCXX_USE_CHAR8_T
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<char, _Traits>&
     operator<<(basic_ostream<char, _Traits>&, char8_t) = delete;
 #endif
 
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<char, _Traits>&
     operator<<(basic_ostream<char, _Traits>&, char16_t) = delete;
 
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<char, _Traits>&
     operator<<(basic_ostream<char, _Traits>&, char32_t) = delete;
 
 #ifdef _GLIBCXX_USE_WCHAR_T
 #ifdef _GLIBCXX_USE_CHAR8_T
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<wchar_t, _Traits>&
     operator<<(basic_ostream<wchar_t, _Traits>&, char8_t) = delete;
 #endif // _GLIBCXX_USE_CHAR8_T
 
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<wchar_t, _Traits>&
     operator<<(basic_ostream<wchar_t, _Traits>&, char16_t) = delete;
 
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<wchar_t, _Traits>&
     operator<<(basic_ostream<wchar_t, _Traits>&, char32_t) = delete;
 #endif // _GLIBCXX_USE_WCHAR_T
@@ -601,7 +601,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     operator<<(basic_ostream<_CharT, _Traits>& __out, const char* __s);
 
   // Partial specializations
-  template<class _Traits>
+  template<typename _Traits>
     inline basic_ostream<char, _Traits>&
     operator<<(basic_ostream<char, _Traits>& __out, const char* __s)
     {
@@ -614,12 +614,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     }
 
   // Signed and unsigned
-  template<class _Traits>
+  template<typename _Traits>
     inline basic_ostream<char, _Traits>&
     operator<<(basic_ostream<char, _Traits>& __out, const signed char* __s)
     { return (__out << reinterpret_cast<const char*>(__s)); }
 
-  template<class _Traits>
+  template<typename _Traits>
     inline basic_ostream<char, _Traits> &
     operator<<(basic_ostream<char, _Traits>& __out, const unsigned char* __s)
     { return (__out << reinterpret_cast<const char*>(__s)); }
@@ -629,37 +629,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    // pointer values.
 
 #ifdef _GLIBCXX_USE_WCHAR_T
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<char, _Traits>&
     operator<<(basic_ostream<char, _Traits>&, const wchar_t*) = delete;
 #endif // _GLIBCXX_USE_WCHAR_T
 
 #ifdef _GLIBCXX_USE_CHAR8_T
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<char, _Traits>&
     operator<<(basic_ostream<char, _Traits>&, const char8_t*) = delete;
 #endif // _GLIBCXX_USE_CHAR8_T
 
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<char, _Traits>&
     operator<<(basic_ostream<char, _Traits>&, const char16_t*) = delete;
 
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<char, _Traits>&
     operator<<(basic_ostream<char, _Traits>&, const char32_t*) = delete;
 
 #ifdef _GLIBCXX_USE_WCHAR_T
 #ifdef _GLIBCXX_USE_CHAR8_T
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<wchar_t, _Traits>&
     operator<<(basic_ostream<wchar_t, _Traits>&, const char8_t*) = delete;
 #endif
 
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<wchar_t, _Traits>&
     operator<<(basic_ostream<wchar_t, _Traits>&, const char16_t*) = delete;
 
-  template<class _Traits>
+  template<typename _Traits>
     basic_ostream<wchar_t, _Traits>&
     operator<<(basic_ostream<wchar_t, _Traits>&, const char32_t*) = delete;
 #endif // _GLIBCXX_USE_WCHAR_T

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation
  2019-11-29 17:47 ` Jonathan Wakely
@ 2019-11-29 19:53   ` Jonathan Wakely
  2019-11-29 22:11     ` Jonathan Wakely
  0 siblings, 1 reply; 5+ messages in thread
From: Jonathan Wakely @ 2019-11-29 19:53 UTC (permalink / raw)
  To: Tom Honermann; +Cc: libstdc++, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1708 bytes --]

On 29/11/19 17:45 +0000, Jonathan Wakely wrote:
>On 15/09/19 15:39 -0400, Tom Honermann wrote:
>>This series of patches provides an implementation of the changes for 
>>C++ proposal P1423R3 [1].
>>
>>These changes do not impact default libstdc++ behavior for C++17 and 
>>earlier; they are only active for C++2a or when the -fchar8_t option 
>>is specified.
>>
>>Tested x86_64-linux.
>>
>>Patch 1: Decouple constraints for u8path from path constructors.
>>Patch 2: Update __cpp_lib_char8_t feature test macro value, add 
>>deleted operators, update u8path.
>>Patch 3: Updates to existing tests.
>>Patch 4: New tests.
>>
>>Tom.
>>
>>[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html
>
>It took a while, but I've committed these four patches, with just some
>minor whitespace changes and changelog tweaks.

Running the new tests revealed a latent bug on Windows, where
experimental::filesystem::u8path(const Source&) assumed the input
was an iterator over a NTCTS. That worked for a const char* but not a
std::string or experimental::string_view.

The attached patch fixes that (and simplifies the #if and if-constexpr
conditions for Windows) but there's a remaining bug. Constructing a
experimental::filesystem::path from a char8_t string doesn't do the
right thing on Windows, so these cases fails:

  fs::path p(u8"\xf0\x9d\x84\x9e");
  VERIFY( p.u8string() == u8"\U0001D11E" );

  p = fs::u8path(u8"\xf0\x9d\x84\x9e");
  VERIFY( p.u8string() == u8"\U0001D11E" );

It works correctly for std::filesystem::path, just not the TS version.

I plan to commit the attached patch after some more testing. I'll
backport something like it too.



[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 3957 bytes --]

commit f9ff56e4624c34be0615aab62ed4a2ce559dd567
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Fri Nov 29 19:36:39 2019 +0000

    libstdc++: Fix experimental::filesystem::u8path(const Source&) for Windows
    
    This function failed to compile when called with a std::string.
    
            * include/bits/fs_path.h (u8path(InputIterator, InputIterator))
            (u8path(const Source&)) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Simplify
            conditions.
            * include/experimental/bits/fs_path.h (__u8path(const Source&, char)):
            Add overloads for std::string and types convertible to std::string.

diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h
index b129372447b..20ec42da57d 100644
--- a/libstdc++-v3/include/bits/fs_path.h
+++ b/libstdc++-v3/include/bits/fs_path.h
@@ -691,14 +691,8 @@ namespace __detail
     u8path(_InputIterator __first, _InputIterator __last)
     {
 #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
-#ifdef _GLIBCXX_USE_CHAR8_T
-      if constexpr (is_same_v<_CharT, char8_t>)
+      if constexpr (is_same_v<_CharT, char>)
 	{
-	  return path{ __first, __last };
-	}
-      else
-	{
-#endif
 	  // XXX This assumes native wide encoding is UTF-16.
 	  std::codecvt_utf8_utf16<path::value_type> __cvt;
 	  path::string_type __tmp;
@@ -710,16 +704,16 @@ namespace __detail
 	  else
 	    {
 	      const std::string __u8str{__first, __last};
-	      const char* const __ptr = __u8str.data();
-	      if (__str_codecvt_in_all(__ptr, __ptr + __u8str.size(), __tmp, __cvt))
+	      const char* const __p = __u8str.data();
+	      if (__str_codecvt_in_all(__p, __p + __u8str.size(), __tmp, __cvt))
 		return path{ __tmp };
 	    }
 	  _GLIBCXX_THROW_OR_ABORT(filesystem_error(
 	      "Cannot convert character sequence",
 	      std::make_error_code(errc::illegal_byte_sequence)));
-#ifdef _GLIBCXX_USE_CHAR8_T
 	}
-#endif
+      else
+	return path{ __first, __last };
 #else
       // This assumes native normal encoding is UTF-8.
       return path{ __first, __last };
@@ -737,14 +731,8 @@ namespace __detail
     u8path(const _Source& __source)
     {
 #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
-#ifdef _GLIBCXX_USE_CHAR8_T
-      if constexpr (is_same_v<_CharT, char8_t>)
+      if constexpr (is_same_v<_CharT, char>)
 	{
-	  return path{ __source };
-	}
-      else
-	{
-#endif
 	  if constexpr (is_convertible_v<const _Source&, std::string_view>)
 	    {
 	      const std::string_view __s = __source;
@@ -755,9 +743,9 @@ namespace __detail
 	      std::string __s = path::_S_string_from_iter(__source);
 	      return filesystem::u8path(__s.data(), __s.data() + __s.size());
 	    }
-#ifdef _GLIBCXX_USE_CHAR8_T
 	}
-#endif
+      else
+	return path{ __source };
 #else
       return path{ __source };
 #endif
diff --git a/libstdc++-v3/include/experimental/bits/fs_path.h b/libstdc++-v3/include/experimental/bits/fs_path.h
index 91202e5b008..796458b37b0 100644
--- a/libstdc++-v3/include/experimental/bits/fs_path.h
+++ b/libstdc++-v3/include/experimental/bits/fs_path.h
@@ -644,8 +644,22 @@ namespace __detail
 
   /// Create a path from a UTF-8-encoded sequence of char
 #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
+  inline path
+  __u8path(const string& __s, char)
+  {
+    return filesystem::u8path(__s.data(), __s.data() + __s.size());
+  }
+
   template<typename _Source>
-    inline path
+    inline __enable_if_t<is_convertible<const _Source&, string>::value, path>
+    __u8path(const _Source& __source, char)
+    {
+      std::string __s = __source;
+      return filesystem::u8path(__s.data(), __s.data() + __s.size());
+    }
+
+  template<typename _Source>
+    inline __enable_if_t<!is_convertible<const _Source&, string>::value, path>
     __u8path(const _Source& __source, char)
     {
       std::string __s = path::_S_string_from_iter(__source);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation
  2019-11-29 19:53   ` Jonathan Wakely
@ 2019-11-29 22:11     ` Jonathan Wakely
  2019-11-30  1:04       ` Jonathan Wakely
  0 siblings, 1 reply; 5+ messages in thread
From: Jonathan Wakely @ 2019-11-29 22:11 UTC (permalink / raw)
  To: Tom Honermann; +Cc: libstdc++, gcc-patches

On 29/11/19 19:48 +0000, Jonathan Wakely wrote:
>On 29/11/19 17:45 +0000, Jonathan Wakely wrote:
>>On 15/09/19 15:39 -0400, Tom Honermann wrote:
>>>This series of patches provides an implementation of the changes 
>>>for C++ proposal P1423R3 [1].
>>>
>>>These changes do not impact default libstdc++ behavior for C++17 
>>>and earlier; they are only active for C++2a or when the -fchar8_t 
>>>option is specified.
>>>
>>>Tested x86_64-linux.
>>>
>>>Patch 1: Decouple constraints for u8path from path constructors.
>>>Patch 2: Update __cpp_lib_char8_t feature test macro value, add 
>>>deleted operators, update u8path.
>>>Patch 3: Updates to existing tests.
>>>Patch 4: New tests.
>>>
>>>Tom.
>>>
>>>[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html
>>
>>It took a while, but I've committed these four patches, with just some
>>minor whitespace changes and changelog tweaks.
>
>Running the new tests revealed a latent bug on Windows, where
>experimental::filesystem::u8path(const Source&) assumed the input
>was an iterator over a NTCTS. That worked for a const char* but not a
>std::string or experimental::string_view.
>
>The attached patch fixes that (and simplifies the #if and if-constexpr
>conditions for Windows) but there's a remaining bug. Constructing a
>experimental::filesystem::path from a char8_t string doesn't do the
>right thing on Windows, so these cases fails:
>
> fs::path p(u8"\xf0\x9d\x84\x9e");
> VERIFY( p.u8string() == u8"\U0001D11E" );
>
> p = fs::u8path(u8"\xf0\x9d\x84\x9e");
> VERIFY( p.u8string() == u8"\U0001D11E" );
>
>It works correctly for std::filesystem::path, just not the TS version.

I think this is the fix needed for the TS code:

--- a/libstdc++-v3/include/experimental/bits/fs_path.h
+++ b/libstdc++-v3/include/experimental/bits/fs_path.h
@@ -765,7 +765,14 @@ namespace __detail
       {
 #ifdef _GLIBCXX_USE_CHAR8_T
        if constexpr (is_same<_CharT, char8_t>::value)
-         return _S_wconvert((const char*)__f, (const char*)__l, true_type());
+         {
+           const char* __f2 = (const char*)__f;
+           const char* __l2 = (const char*)__l;
+           std::wstring __wstr;
+           std::codecvt_utf8_utf16<wchar_t> __wcvt;
+           if (__str_codecvt_in_all(__f2, __l2, __wstr, __wcvt))
+             return __wstr;
+         }
        else
 #endif
          {

The current code uses std::codecvt<wchar_t, char, mbstate_t> but when
we know the input is UTF-8 encoded we should use codecvt_utf8_utf16
(which is what the C++17 code already does for char8_t input).

I'll add that the patch I'm testing.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation
  2019-11-29 22:11     ` Jonathan Wakely
@ 2019-11-30  1:04       ` Jonathan Wakely
  0 siblings, 0 replies; 5+ messages in thread
From: Jonathan Wakely @ 2019-11-30  1:04 UTC (permalink / raw)
  To: Tom Honermann; +Cc: libstdc++, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2885 bytes --]

On 29/11/19 21:45 +0000, Jonathan Wakely wrote:
>On 29/11/19 19:48 +0000, Jonathan Wakely wrote:
>>On 29/11/19 17:45 +0000, Jonathan Wakely wrote:
>>>On 15/09/19 15:39 -0400, Tom Honermann wrote:
>>>>This series of patches provides an implementation of the changes 
>>>>for C++ proposal P1423R3 [1].
>>>>
>>>>These changes do not impact default libstdc++ behavior for C++17 
>>>>and earlier; they are only active for C++2a or when the 
>>>>-fchar8_t option is specified.
>>>>
>>>>Tested x86_64-linux.
>>>>
>>>>Patch 1: Decouple constraints for u8path from path constructors.
>>>>Patch 2: Update __cpp_lib_char8_t feature test macro value, add 
>>>>deleted operators, update u8path.
>>>>Patch 3: Updates to existing tests.
>>>>Patch 4: New tests.
>>>>
>>>>Tom.
>>>>
>>>>[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html
>>>
>>>It took a while, but I've committed these four patches, with just some
>>>minor whitespace changes and changelog tweaks.
>>
>>Running the new tests revealed a latent bug on Windows, where
>>experimental::filesystem::u8path(const Source&) assumed the input
>>was an iterator over a NTCTS. That worked for a const char* but not a
>>std::string or experimental::string_view.
>>
>>The attached patch fixes that (and simplifies the #if and if-constexpr
>>conditions for Windows) but there's a remaining bug. Constructing a
>>experimental::filesystem::path from a char8_t string doesn't do the
>>right thing on Windows, so these cases fails:
>>
>>fs::path p(u8"\xf0\x9d\x84\x9e");
>>VERIFY( p.u8string() == u8"\U0001D11E" );
>>
>>p = fs::u8path(u8"\xf0\x9d\x84\x9e");
>>VERIFY( p.u8string() == u8"\U0001D11E" );
>>
>>It works correctly for std::filesystem::path, just not the TS version.
>
>I think this is the fix needed for the TS code:
>
>--- a/libstdc++-v3/include/experimental/bits/fs_path.h
>+++ b/libstdc++-v3/include/experimental/bits/fs_path.h
>@@ -765,7 +765,14 @@ namespace __detail
>      {
>#ifdef _GLIBCXX_USE_CHAR8_T
>       if constexpr (is_same<_CharT, char8_t>::value)
>-         return _S_wconvert((const char*)__f, (const char*)__l, true_type());
>+         {
>+           const char* __f2 = (const char*)__f;
>+           const char* __l2 = (const char*)__l;
>+           std::wstring __wstr;
>+           std::codecvt_utf8_utf16<wchar_t> __wcvt;
>+           if (__str_codecvt_in_all(__f2, __l2, __wstr, __wcvt))
>+             return __wstr;
>+         }
>       else
>#endif
>         {
>
>The current code uses std::codecvt<wchar_t, char, mbstate_t> but when
>we know the input is UTF-8 encoded we should use codecvt_utf8_utf16
>(which is what the C++17 code already does for char8_t input).
>
>I'll add that the patch I'm testing.

Here's the final patch I'm committing. Tested powerpc64le-linux and
x86_64-w64-mingw, committed to trunk.



[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 7586 bytes --]

commit cb232cf3d475d51fe5e550680bff64dfd32f57a5
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Fri Nov 29 19:36:39 2019 +0000

    libstdc++: Fix experimental::filesystem::u8path(const Source&) for Windows
    
    This function failed to compile when called with a std::string.
    
    Also, constructing a path with a char8_t string did not correctly treat
    the string as already UTF-8 encoded.
    
            * include/bits/fs_path.h (u8path(InputIterator, InputIterator))
            (u8path(const Source&)) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Simplify
            conditions.
            * include/experimental/bits/fs_path.h [_GLIBCXX_FILESYSTEM_IS_WINDOWS]
            (__u8path(const Source&, char)): Add overloads for std::string and
            types convertible to std::string.
            (_Cvt::_S_wconvert): Add a new overload for char8_t strings and use
            codecvt_utf8_utf16 to do the correct conversion.

diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h
index b129372447b..20ec42da57d 100644
--- a/libstdc++-v3/include/bits/fs_path.h
+++ b/libstdc++-v3/include/bits/fs_path.h
@@ -691,14 +691,8 @@ namespace __detail
     u8path(_InputIterator __first, _InputIterator __last)
     {
 #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
-#ifdef _GLIBCXX_USE_CHAR8_T
-      if constexpr (is_same_v<_CharT, char8_t>)
+      if constexpr (is_same_v<_CharT, char>)
 	{
-	  return path{ __first, __last };
-	}
-      else
-	{
-#endif
 	  // XXX This assumes native wide encoding is UTF-16.
 	  std::codecvt_utf8_utf16<path::value_type> __cvt;
 	  path::string_type __tmp;
@@ -710,16 +704,16 @@ namespace __detail
 	  else
 	    {
 	      const std::string __u8str{__first, __last};
-	      const char* const __ptr = __u8str.data();
-	      if (__str_codecvt_in_all(__ptr, __ptr + __u8str.size(), __tmp, __cvt))
+	      const char* const __p = __u8str.data();
+	      if (__str_codecvt_in_all(__p, __p + __u8str.size(), __tmp, __cvt))
 		return path{ __tmp };
 	    }
 	  _GLIBCXX_THROW_OR_ABORT(filesystem_error(
 	      "Cannot convert character sequence",
 	      std::make_error_code(errc::illegal_byte_sequence)));
-#ifdef _GLIBCXX_USE_CHAR8_T
 	}
-#endif
+      else
+	return path{ __first, __last };
 #else
       // This assumes native normal encoding is UTF-8.
       return path{ __first, __last };
@@ -737,14 +731,8 @@ namespace __detail
     u8path(const _Source& __source)
     {
 #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
-#ifdef _GLIBCXX_USE_CHAR8_T
-      if constexpr (is_same_v<_CharT, char8_t>)
+      if constexpr (is_same_v<_CharT, char>)
 	{
-	  return path{ __source };
-	}
-      else
-	{
-#endif
 	  if constexpr (is_convertible_v<const _Source&, std::string_view>)
 	    {
 	      const std::string_view __s = __source;
@@ -755,9 +743,9 @@ namespace __detail
 	      std::string __s = path::_S_string_from_iter(__source);
 	      return filesystem::u8path(__s.data(), __s.data() + __s.size());
 	    }
-#ifdef _GLIBCXX_USE_CHAR8_T
 	}
-#endif
+      else
+	return path{ __source };
 #else
       return path{ __source };
 #endif
diff --git a/libstdc++-v3/include/experimental/bits/fs_path.h b/libstdc++-v3/include/experimental/bits/fs_path.h
index 91202e5b008..5ce012eec81 100644
--- a/libstdc++-v3/include/experimental/bits/fs_path.h
+++ b/libstdc++-v3/include/experimental/bits/fs_path.h
@@ -644,8 +644,22 @@ namespace __detail
 
   /// Create a path from a UTF-8-encoded sequence of char
 #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
+  inline path
+  __u8path(const string& __s, char)
+  {
+    return filesystem::u8path(__s.data(), __s.data() + __s.size());
+  }
+
   template<typename _Source>
-    inline path
+    inline __enable_if_t<is_convertible<const _Source&, string>::value, path>
+    __u8path(const _Source& __source, char)
+    {
+      std::string __s = __source;
+      return filesystem::u8path(__s.data(), __s.data() + __s.size());
+    }
+
+  template<typename _Source>
+    inline __enable_if_t<!is_convertible<const _Source&, string>::value, path>
     __u8path(const _Source& __source, char)
     {
       std::string __s = path::_S_string_from_iter(__source);
@@ -733,8 +747,21 @@ namespace __detail
     struct path::_Cvt
     {
 #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
+#ifdef _GLIBCXX_USE_CHAR8_T
       static string_type
-      _S_wconvert(const char* __f, const char* __l, true_type)
+      _S_wconvert(const char8_t* __f, const char8_t* __l, const char8_t*)
+      {
+	const char* __f2 = (const char*)__f;
+	const char* __l2 = (const char*)__l;
+	std::wstring __wstr;
+	std::codecvt_utf8_utf16<wchar_t> __wcvt;
+	if (__str_codecvt_in_all(__f2, __l2, __wstr, __wcvt))
+	  return __wstr;
+      }
+#endif
+
+      static string_type
+      _S_wconvert(const char* __f, const char* __l, const char*)
       {
 	using _Cvt = std::codecvt<wchar_t, char, mbstate_t>;
 	const auto& __cvt = std::use_facet<_Cvt>(std::locale{});
@@ -747,36 +774,29 @@ namespace __detail
       }
 
       static string_type
-      _S_wconvert(const _CharT* __f, const _CharT* __l, false_type)
+      _S_wconvert(const _CharT* __f, const _CharT* __l, const void*)
       {
-#ifdef _GLIBCXX_USE_CHAR8_T
-	if constexpr (is_same<_CharT, char8_t>::value)
-	  return _S_wconvert((const char*)__f, (const char*)__l, true_type());
-	else
-#endif
+	struct _UCvt : std::codecvt<_CharT, char, std::mbstate_t>
+	{ } __cvt;
+	std::string __str;
+	if (__str_codecvt_out_all(__f, __l, __str, __cvt))
 	  {
-	    struct _UCvt : std::codecvt<_CharT, char, std::mbstate_t>
-	    { } __cvt;
-	    std::string __str;
-	    if (__str_codecvt_out_all(__f, __l, __str, __cvt))
-	      {
-		const char* __f2 = __str.data();
-		const char* __l2 = __f2 + __str.size();
-		std::codecvt_utf8_utf16<wchar_t> __wcvt;
-		std::wstring __wstr;
-		if (__str_codecvt_in_all(__f2, __l2, __wstr, __wcvt))
-		  return __wstr;
-	      }
-	    _GLIBCXX_THROW_OR_ABORT(filesystem_error(
-		  "Cannot convert character sequence",
-		  std::make_error_code(errc::illegal_byte_sequence)));
+	    const char* __f2 = __str.data();
+	    const char* __l2 = __f2 + __str.size();
+	    std::codecvt_utf8_utf16<wchar_t> __wcvt;
+	    std::wstring __wstr;
+	    if (__str_codecvt_in_all(__f2, __l2, __wstr, __wcvt))
+	      return __wstr;
 	  }
+	_GLIBCXX_THROW_OR_ABORT(filesystem_error(
+	      "Cannot convert character sequence",
+	      std::make_error_code(errc::illegal_byte_sequence)));
       }
 
       static string_type
       _S_convert(const _CharT* __f, const _CharT* __l)
       {
-	return _S_wconvert(__f, __l, is_same<_CharT, char>{});
+	return _S_wconvert(__f, __l, (const _CharT*)nullptr);
       }
 #else
       static string_type
@@ -786,19 +806,17 @@ namespace __detail
 	if constexpr (is_same<_CharT, char8_t>::value)
 	  return string_type(__f, __l);
 	else
-	  {
 #endif
+	  {
 	    struct _UCvt : std::codecvt<_CharT, char, std::mbstate_t>
 	    { } __cvt;
 	    std::string __str;
 	    if (__str_codecvt_out_all(__f, __l, __str, __cvt))
 	      return __str;
-#ifdef _GLIBCXX_USE_CHAR8_T
+	    _GLIBCXX_THROW_OR_ABORT(filesystem_error(
+		  "Cannot convert character sequence",
+		  std::make_error_code(errc::illegal_byte_sequence)));
 	  }
-#endif
-	_GLIBCXX_THROW_OR_ABORT(filesystem_error(
-	      "Cannot convert character sequence",
-	      std::make_error_code(errc::illegal_byte_sequence)));
       }
 #endif
 

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-11-30  1:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-15 19:39 [PATCH 0/4]: C++ P1423R3 char8_t remediation implementation Tom Honermann
2019-11-29 17:47 ` Jonathan Wakely
2019-11-29 19:53   ` Jonathan Wakely
2019-11-29 22:11     ` Jonathan Wakely
2019-11-30  1:04       ` Jonathan Wakely

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).