From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20231215000350.2069578-1-jwakely@redhat.com>
In-Reply-To: <20231215000350.2069578-1-jwakely@redhat.com>
From: Tim Song
Date: Thu, 14 Dec 2023 19:16:46 -0600
Subject: Re: [committed] libstdc++: Implement C++23 header [PR107760]
To: Jonathan Wakely
Cc: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Content-Type: multipart/alternative; boundary="000000000000a224f2060c8229c5"

--000000000000a224f2060c8229c5
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Thu, Dec 14, 2023 at 6:05 PM Jonathan Wakely wrote:

> Tested x86_64-linux. Pushed to trunk.
>
> -- >8 --
>
> This adds the C++23 std::print functions, which use std::format to write
> to a FILE stream or std::ostream (defaulting to stdout).
>
> The new extern symbols are in the libstdc++exp.a archive, so we aren't
> committing to stable symbols in the DSO yet. There's a UTF-8 validating
> and transcoding function added by this change. That can certainly be
> optimized, but it's internal to libstdc++exp.a so can be tweaked later
> at leisure.
>
> Currently the external symbols work for all targets, but are only
> actually used for Windows, where it's necessary to transcode to UTF-16
> to write to the console. The standard seems to encourage us to also
> diagnose invalid UTF-8 for non-Windows targets when writing to a
> terminal (and only when writing to a terminal), but I'm reliably
> informed that that wasn't the intent of the wording. Checking for
> invalid UTF-8 sequences only needs to happen for Windows, which is good
> as checking for a terminal requires a call to isatty, and on Linux that
> uses an ioctl syscall, which would make std::print ten times slower!
>
> Testing the std::print behaviour is difficult if it depends on whether
> the output stream is connected to a Windows console or not, as we can't
> (as far as I know) do that non-interactively in DejaGNU. One of the new
> tests uses the internal __write_to_terminal function directly. That
> allows us to verify its UTF-8 error handling on POSIX targets, even
> though that's not actually used by std::print. For Windows, that
> __write_to_terminal function transcodes to UTF-16 but then uses
> WriteConsoleW which fails unless it really is writing to the console.
> That means the 27_io/print/2.cc test FAILs on Windows. The UTF-16
> transcoding has been manually tested using mingw-w64 and Wine, and
> appears to work.
>
> libstdc++-v3/ChangeLog:
>
>         PR libstdc++/107760
>         * include/Makefile.am: Add new header.
>         * include/Makefile.in: Regenerate.
>         * include/bits/version.def (__cpp_lib_print): Define.
>         * include/bits/version.h: Regenerate.
>         * include/std/format (__literal_encoding_is_utf8): New function.
>         (_Seq_sink::view()): New member function.
>         * include/std/ostream (vprintf_nonunicode, vprintf_unicode)
>         (print, println): New functions.
>         * include/std/print: New file.
>         * src/c++23/Makefile.am: Add new source file.
>         * src/c++23/Makefile.in: Regenerate.
>         * src/c++23/print.cc: New file.
>         * testsuite/27_io/basic_ostream/print/1.cc: New test.
>         * testsuite/27_io/print/1.cc: New test.
>         * testsuite/27_io/print/2.cc: New test.
> ---
>  libstdc++-v3/include/Makefile.am              |   1 +
>  libstdc++-v3/include/Makefile.in              |   1 +
>  libstdc++-v3/include/bits/version.def         |   9 +
>  libstdc++-v3/include/bits/version.h           |  29 +-
>  libstdc++-v3/include/std/format               |  53 +++
>  libstdc++-v3/include/std/ostream              | 152 ++++++++
>  libstdc++-v3/include/std/print                | 138 +++++++
>  libstdc++-v3/src/c++23/Makefile.am            |   8 +-
>  libstdc++-v3/src/c++23/Makefile.in            |  10 +-
>  libstdc++-v3/src/c++23/print.cc               | 348 ++++++++++++++++++
>  .../testsuite/27_io/basic_ostream/print/1.cc  | 112 ++++++
>  libstdc++-v3/testsuite/27_io/print/1.cc       |  85 +++++
>  libstdc++-v3/testsuite/27_io/print/2.cc       | 151 ++++++++
>  13 files changed, 1085 insertions(+), 12 deletions(-)
>  create mode 100644 libstdc++-v3/include/std/print
>  create mode 100644 libstdc++-v3/src/c++23/print.cc
>  create mode 100644 libstdc++-v3/testsuite/27_io/basic_ostream/print/1.cc
>  create mode 100644 libstdc++-v3/testsuite/27_io/print/1.cc
>  create mode 100644 libstdc++-v3/testsuite/27_io/print/2.cc
>
> diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
> index 17d9d9cec31..368b92eafbc 100644
> --- a/libstdc++-v3/include/Makefile.am
> +++
b/libstdc++-v3/include/Makefile.am
> @@ -85,6 +85,7 @@ std_headers = \
>   ${std_srcdir}/memory_resource \
>   ${std_srcdir}/mutex \
>   ${std_srcdir}/ostream \
> + ${std_srcdir}/print \
>   ${std_srcdir}/queue \
>   ${std_srcdir}/random \
>   ${std_srcdir}/regex \
> diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
> index f038af709cc..a31588c0100 100644
> --- a/libstdc++-v3/include/Makefile.in
> +++ b/libstdc++-v3/include/Makefile.in
> @@ -441,6 +441,7 @@ std_freestanding = \
>  @GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/memory_resource \
>  @GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/mutex \
>  @GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/ostream \
> +@GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/print \
>  @GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/queue \
>  @GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/random \
>  @GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/regex \
> diff --git a/libstdc++-v3/include/bits/version.def b/libstdc++-v3/include/bits/version.def
> index 38b73ec9b5d..0134a71b3ab 100644
> --- a/libstdc++-v3/include/bits/version.def
> +++ b/libstdc++-v3/include/bits/version.def
> @@ -1645,6 +1645,15 @@ ftms = {
>    };
>  };
>
> +ftms = {
> +  name = print;
> +  values = {
> +    v = 202211;
> +    cxxmin = 23;
> +    hosted = yes;
> +  };
> +};
> +
>  ftms = {
>    name = spanstream;
>    values = {
> diff --git a/libstdc++-v3/include/bits/version.h b/libstdc++-v3/include/bits/version.h
> index a201a444925..c28fbe14c15 100644
> --- a/libstdc++-v3/include/bits/version.h
> +++ b/libstdc++-v3/include/bits/version.h
> @@ -2005,6 +2005,17 @@
>  #undef __glibcxx_want_out_ptr
>
>  // from version.def line 1649
> +#if !defined(__cpp_lib_print)
> +# if (__cplusplus >= 202100L) && _GLIBCXX_HOSTED
> +#  define __glibcxx_print 202211L
> +#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_print)
> +#   define __cpp_lib_print 202211L
> +#  endif
> +# endif
> +#endif /* !defined(__cpp_lib_print) && defined(__glibcxx_want_print) */
> +#undef __glibcxx_want_print
> +
> +// from version.def line 1658
>  #if !defined(__cpp_lib_spanstream)
>  # if (__cplusplus >= 202100L) && _GLIBCXX_HOSTED && (__glibcxx_span)
>  #  define __glibcxx_spanstream 202106L
> diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
> index 6204fd0e3c1..1110ba4ab16 100644
> --- a/libstdc++-v3/include/std/format
> +++ b/libstdc++-v3/include/std/format
> @@ -346,6 +346,44 @@ namespace __format
>      _WP_from_arg // Use a formatting argument for width/prec.
>    };
>
> +  consteval bool
> +  __literal_encoding_is_utf8()
> +  {
> +#ifdef __GNUC_EXECUTION_CHARSET_NAME
> +    const char* __enc = __GNUC_EXECUTION_CHARSET_NAME;
> +    // GNU iconv allows "ISO-10646/" prefix (case-insensitive).
> +    if (__enc[0] == 'I' || __enc[0] == 'i')
> +      {
> +        if ((__enc[1] == 'S' || __enc[1] == 's')
> +              && (__enc[2] == 'O' || __enc[2] == 'o'))
> +          {
> +            __enc += 3;
> +            if (string_view(__enc).starts_with("-10646/"))
> +              __enc += 7;
> +            else
> +              return false;
> +          }
> +        else
> +          return false;
> +      }
> +
> +    if ((__enc[0] == 'U' || __enc[0] == 'u')
> +          && (__enc[1] == 'T' || __enc[1] == 't')
> +          && (__enc[2] == 'F' || __enc[2] == 'f'))
> +      {
> +        __enc += 3;
> +        if (__enc[0] == '-')
> +          ++__enc;
> +        if (__enc[0] == '8')
> +          return __enc[1] == '\0' || string_view(__enc + 1) == "//";
> +      }
> +#elif defined __clang_literal_encoding__
> +    // Clang accepts "-fexec-charset=utf-8" but the macro is still uppercase.
> +    return string_view(__clang_literal_encoding__) == "UTF-8";
> +#endif
> +    return false;
> +  }
> +
>    template
>      size_t
>      __int_from_arg(const basic_format_arg<_Context>& __arg);
> @@ -2754,6 +2792,21 @@ namespace __format
>            _Seq_sink::_M_overflow();
>          return std::move(_M_seq);
>        }
> +
> +      // A writable span that views everything written to the sink.
> +      // Will be either a view over _M_seq or the used part of _M_buf.
> +      span<_CharT>
> +      view()
> +      {
> +        auto __s = this->_M_used();
> +        if (_M_seq.size())
> +          {
> +            if (__s.size() != 0)
> +              _Seq_sink::_M_overflow();
> +            return _M_seq;
> +          }
> +        return __s;
> +      }
>      };
>
>    template>
> diff --git a/libstdc++-v3/include/std/ostream b/libstdc++-v3/include/std/ostream
> index 1de1c1bd359..4f1cdc281a3 100644
> --- a/libstdc++-v3/include/std/ostream
> +++ b/libstdc++-v3/include/std/ostream
> @@ -39,6 +39,11 @@
>
>  #include
>  #include
> +#if __cplusplus > 202002L
> +# include
> +#endif
> +
> +# define __glibcxx_want_print
>  #include // __glibcxx_syncbuf
>
>  namespace std _GLIBCXX_VISIBILITY(default)
> @@ -872,6 +877,153 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>    }
>  #endif // __glibcxx_syncbuf
>
> +#if __cpp_lib_print // C++ >= 23
> +
> +  inline void
> +  vprint_nonunicode(ostream& __os, string_view __fmt, format_args __args)
> +  {
> +    ostream::sentry __cerb(__os);
> +    if (__cerb)
> +      {
> +        __format::_Str_sink __buf;
> +        std::vformat_to(__buf.out(), __os.getloc(), __fmt, __args);
> +        auto __out = __buf.view();
> +
> +        __try
> +          {
> +            const streamsize __w = __os.width();
> +            const streamsize __n = __out.size();
> +            if (__w > __n)
> +              {
> +                const bool __left
> +                  = (__os.flags() & ios_base::adjustfield) == ios_base::left;
> +                if (!__left)
> +                  std::__ostream_fill(__os, __w - __n);
> +                if (__os.good())
> +                  std::__ostream_write(__os, __out.data(), __n);
> +                if (__left && __os.good())
> +                  std::__ostream_fill(__os, __w - __n);
> +              }
> +            else
> +              std::__ostream_write(__os, __out.data(), __n);
> +          }
> +        __catch(const __cxxabiv1::__forced_unwind&)
> +          {
> +            __os._M_setstate(ios_base::badbit);
> +            __throw_exception_again;
> +          }
> +        __catch(...)
> +          { __os._M_setstate(ios_base::badbit); }
> +      }
> +  }
> +
> +  inline void
> +  vprint_unicode(ostream& __os, string_view __fmt, format_args __args)
> +  {
> +    ostream::sentry __cerb(__os);
> +    if (__cerb)
> +      {
> +
> +        const streamsize __w = __os.width();
> +        const bool __left
> +          = (__os.flags() & ios_base::adjustfield) == ios_base::left;
>

I'm pretty sure - when I wrote this wording anyway - that the intent was
that it was just an unformatted write at the end. The wording in
[ostream.formatted.print] doesn't use the "determines padding" words of
power that would invoke [ostream.formatted.reqmts]/3.

> +
> +        __format::_Str_sink __buf;
> +        std::vformat_to(__buf.out(), __os.getloc(), __fmt, __args);
> +        auto __out = __buf.view();
> +
> +#ifdef _WIN32
> +        void* __open_terminal(streambuf*);
> +        error_code __write_to_terminal(void*, span);
> +        // If stream refers to a terminal, write a Unicode string to it.
> +        if (auto __term = __open_terminal(__os.rdbuf()))
> +          {
> +            __format::_Str_sink __buf2;
> +            if (__w != 0)
> +              {
> +                char __fmt[] = "{0:..{1}}";
> +                __fmt[3] == __os.fill();
> +                __fmt[4] == __left ? '<' : '>';
> +                string_view __str(__out);
> +                std::vformat_to(__buf2.out(), // N.B. no need to use getloc()
> +                                __fmt, std::make_format_args(__str, __w));
> +                __out = __buf2.view();
> +              }
> +
> +            ios_base::iostate __err = ios_base::goodbit;
> +            __try
> +              {
> +                if (__os.rdbuf()->pubsync() == -1)
> +                  __err = ios::badbit;
> +                else if (auto __e = __write_to_terminal(__term, __out))
> +                  if (__e != std::make_error_code(errc::illegal_byte_sequence))
> +                    __err = ios::badbit;
> +#ifndef _WIN32
> +                // __open_terminal(streambuf*) opens a new FILE with fdopen,
> +                // so we need to close it here.
> +                std::fclose((FILE*)__term);
> +#endif
> +              }
> +            __catch(const __cxxabiv1::__forced_unwind&)
> +              {
> +                __os._M_setstate(ios_base::badbit);
> +                __throw_exception_again;
> +              }
> +            __catch(...)
> +              { __os._M_setstate(ios_base::badbit); }
> +
> +            if (__err)
> +              __os.setstate(__err);
> +            return;
> +          }
> +#endif
> +
> +        // Otherwise just insert the string as normal.
> +        __try
> +          {
> +            const streamsize __n = __out.size();
> +            if (__w > __n)
> +              {
> +                if (!__left)
> +                  std::__ostream_fill(__os, __w - __n);
> +                if (__os.good())
> +                  std::__ostream_write(__os, __out.data(), __n);
> +                if (__left && __os.good())
> +                  std::__ostream_fill(__os, __w - __n);
> +              }
> +            else
> +              std::__ostream_write(__os, __out.data(), __n);
> +          }
> +        __catch(const __cxxabiv1::__forced_unwind&)
> +          {
> +            __os._M_setstate(ios_base::badbit);
> +            __throw_exception_again;
> +          }
> +        __catch(...)
> +          { __os._M_setstate(ios_base::badbit); }
> +      }
> +  }
> +
> +  template
> +    inline void
> +    print(ostream& __os, format_string<_Args...> __fmt, _Args&&... __args)
> +    {
> +      auto __fmtargs = std::make_format_args(std::forward<_Args>(__args)...);
> +      if constexpr (__format::__literal_encoding_is_utf8())
> +        std::vprint_unicode(__os, __fmt.get(), __fmtargs);
> +      else
> +        std::vprint_nonunicode(__os, __fmt.get(), __fmtargs);
> +    }
> +
> +  template
> +    inline void
> +    println(ostream& __os, format_string<_Args...> __fmt, _Args&&... __args)
> +    {
> +      std::print(__os, "{}\n",
> +                 std::format(__fmt, std::forward<_Args>(__args)...));
> +    }
> +#endif // __cpp_lib_print
> +
>  #endif // C++11
>
>  _GLIBCXX_END_NAMESPACE_VERSION
> diff --git a/libstdc++-v3/include/std/print b/libstdc++-v3/include/std/print
> new file mode 100644
> index 00000000000..e7099ab6fe3
> --- /dev/null
> +++ b/libstdc++-v3/include/std/print
> @@ -0,0 +1,138 @@
> +// Print functions -*- C++ -*-
> +
> +// Copyright The GNU Toolchain Authors.
> +//
> +// This file is part of the GNU ISO C++ Library.
This library is free
> +// software; you can redistribute it and/or modify it under the
> +// terms of the GNU General Public License as published by the
> +// Free Software Foundation; either version 3, or (at your option)
> +// any later version.
> +
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +// GNU General Public License for more details.
> +
> +// Under Section 7 of GPL version 3, you are granted additional
> +// permissions described in the GCC Runtime Library Exception, version
> +// 3.1, as published by the Free Software Foundation.
> +
> +// You should have received a copy of the GNU General Public License and
> +// a copy of the GCC Runtime Library Exception along with this program;
> +// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
> +// .
> +
> +/** @file include/print
> + *  This is a Standard C++ Library header.
> + */
> +
> +#ifndef _GLIBCXX_PRINT
> +#define _GLIBCXX_PRINT 1
> +
> +#pragma GCC system_header
> +
> +#include // for std::format
> +
> +#define __glibcxx_want_print
> +#include
> +
> +#ifdef __cpp_lib_print // C++ >= 23
> +
> +#include
> +#include
> +#include
> +#include
> +
> +#ifdef _WIN32
> +# include
> +#endif
> +
> +namespace std _GLIBCXX_VISIBILITY(default)
> +{
> +_GLIBCXX_BEGIN_NAMESPACE_VERSION
> +
> +  inline void
> +  vprint_nonunicode(FILE* __stream, string_view __fmt, format_args __args)
> +  {
> +    __format::_Str_sink __buf;
> +    std::vformat_to(__buf.out(), __fmt, __args);
> +    auto __out = __buf.view();
> +    if (std::fwrite(__out.data(), 1, __out.size(), __stream) != __out.size())
> +      __throw_system_error(EIO);
> +  }
> +
> +  inline void
> +  vprint_unicode(FILE* __stream, string_view __fmt, format_args __args)
> +  {
> +    __format::_Str_sink __buf;
> +    std::vformat_to(__buf.out(), __fmt, __args);
> +    auto __out = __buf.view();
> +
> +#ifdef _WIN32
> +    void* __open_terminal(FILE*);
> +    error_code __write_to_terminal(void*, span);
> +    // If stream refers to a terminal, write a native Unicode string to it.
> +    if (auto __term = __open_terminal(__stream))
> +      {
> +        string __out = std::vformat(__fmt, __args);
> +        error_code __e;
> +        if (!std::fflush(__stream))
> +          {
> +            __e = __write_to_terminal(__term, __out);
> +            if (!__e)
> +              return;
> +            if (__e == std::make_error_code(errc::illegal_byte_sequence))
> +              return;
> +          }
> +        else
> +          __e = error_code(errno, generic_category());
> +        _GLIBCXX_THROW_OR_ABORT(system_error(__e, "std::vprint_unicode"));
> +      }
> +#endif
> +
> +    // Otherwise just write the string to the file.
> +    if (std::fwrite(__out.data(), 1, __out.size(), __stream) != __out.size())
> +      __throw_system_error(EIO);
> +  }
> +
> +  template
> +    inline void
> +    print(FILE* __stream, format_string<_Args...> __fmt, _Args&&... __args)
> +    {
> +      auto __fmtargs = std::make_format_args(std::forward<_Args>(__args)...);
> +      if constexpr (__format::__literal_encoding_is_utf8())
> +        std::vprint_unicode(__stream, __fmt.get(), __fmtargs);
> +      else
> +        std::vprint_nonunicode(__stream, __fmt.get(), __fmtargs);
> +    }
> +
> +  template
> +    inline void
> +    print(format_string<_Args...> __fmt, _Args&&... __args)
> +    { std::print(stdout, __fmt, std::forward<_Args>(__args)...); }
> +
> +  template
> +    inline void
> +    println(FILE* __stream, format_string<_Args...> __fmt, _Args&&... __args)
> +    {
> +      std::print(__stream, "{}\n",
> +                 std::format(__fmt, std::forward<_Args>(__args)...));
> +    }
> +
> +  template
> +    inline void
> +    println(format_string<_Args...> __fmt, _Args&&... __args)
> +    { std::println(stdout, __fmt, std::forward<_Args>(__args)...); }
> +
> +  inline void
> +  vprint_unicode(string_view __fmt, format_args __args)
> +  { std::vprint_unicode(stdout, __fmt, __args); }
> +
> +  inline void
> +  vprint_nonunicode(string_view __fmt, format_args __args)
> +  { std::vprint_nonunicode(stdout, __fmt, __args); }
> +
> +_GLIBCXX_END_NAMESPACE_VERSION
> +} // namespace std
> +#endif // __cpp_lib_print
> +#endif // _GLIBCXX_PRINT
> diff --git a/libstdc++-v3/src/c++23/Makefile.am b/libstdc++-v3/src/c++23/Makefile.am
> index da988c352f8..76938755f58 100644
> --- a/libstdc++-v3/src/c++23/Makefile.am
> +++ b/libstdc++-v3/src/c++23/Makefile.am
> @@ -35,7 +35,7 @@ else
>  inst_sources =
>  endif
>
> -sources = stacktrace.cc
> +sources = stacktrace.cc print.cc
>
>  vpath % $(top_srcdir)/src/c++23
>
> @@ -46,6 +46,12 @@ else
>  libc__23convenience_la_SOURCES =
>  endif
>
> +# Use C++26 so that std::filebuf::native_handle() is available.
> +print.lo: print.cc
> +	$(LTCXXCOMPILE) -std=gnu++26 -c $<
> +print.o: print.cc
> +	$(CXXCOMPILE) -std=gnu++26 -c $<
> +
>  # AM_CXXFLAGS needs to be in each subdirectory so that it can be
>  # modified in a per-library or per-sub-library way.
Need to manually
>  # set this option because CONFIG_CXXFLAGS has to be after
> diff --git a/libstdc++-v3/src/c++23/Makefile.in b/libstdc++-v3/src/c++23/Makefile.in
> index 1121749d84b..ce609688025 100644
> --- a/libstdc++-v3/src/c++23/Makefile.in
> +++ b/libstdc++-v3/src/c++23/Makefile.in
> @@ -121,7 +121,7 @@ CONFIG_CLEAN_FILES =
>  CONFIG_CLEAN_VPATH_FILES =
>  LTLIBRARIES = $(noinst_LTLIBRARIES)
>  libc__23convenience_la_LIBADD =
> -am__objects_1 = stacktrace.lo
> +am__objects_1 = stacktrace.lo print.lo
>  am__objects_2 =
>  @GLIBCXX_HOSTED_TRUE@am_libc__23convenience_la_OBJECTS = \
>  @GLIBCXX_HOSTED_TRUE@ $(am__objects_1) $(am__objects_2)
> @@ -430,7 +430,7 @@ headers =
>
>  # XTEMPLATE_FLAGS = -fno-implicit-templates
>  @ENABLE_EXTERN_TEMPLATE_TRUE@inst_sources =
> -sources = stacktrace.cc
> +sources = stacktrace.cc print.cc
>  @GLIBCXX_HOSTED_FALSE@libc__23convenience_la_SOURCES =
>  @GLIBCXX_HOSTED_TRUE@libc__23convenience_la_SOURCES = $(sources) $(inst_sources)
>
> @@ -742,6 +742,12 @@ uninstall-am:
>
>  vpath % $(top_srcdir)/src/c++23
>
> +# Use C++26 so that std::filebuf::native_handle() is available.
> +print.lo: print.cc
> +	$(LTCXXCOMPILE) -std=gnu++26 -c $<
> +print.o: print.cc
> +	$(CXXCOMPILE) -std=gnu++26 -c $<
> +
>  # Tell versions [3.59,3.63) of GNU make to not export all variables.
>  # Otherwise a system limit (for SysV at least) may be exceeded.
>  .NOEXPORT:
> diff --git a/libstdc++-v3/src/c++23/print.cc b/libstdc++-v3/src/c++23/print.cc
> new file mode 100644
> index 00000000000..2fe7a2e3565
> --- /dev/null
> +++ b/libstdc++-v3/src/c++23/print.cc
> @@ -0,0 +1,348 @@
> +// std::print -*- C++ -*-
> +
> +// Copyright The GNU Toolchain Authors.
> +//
> +// This file is part of the GNU ISO C++ Library. This library is free
> +// software; you can redistribute it and/or modify it under the
> +// terms of the GNU General Public License as published by the
> +// Free Software Foundation; either version 3, or (at your option)
> +// any later version.
> +
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +// GNU General Public License for more details.
> +
> +// Under Section 7 of GPL version 3, you are granted additional
> +// permissions described in the GCC Runtime Library Exception, version
> +// 3.1, as published by the Free Software Foundation.
> +
> +// You should have received a copy of the GNU General Public License and
> +// a copy of the GCC Runtime Library Exception along with this program;
> +// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
> +// .
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include // uint32_t
> +#include
> +#include
> +#include
> +#include
> +
> +#ifdef _WIN32
> +# include // _fileno
> +# include // _get_osfhandle
> +# include // GetLastError, WriteConsoleW
> +#elifdef _GLIBCXX_HAVE_UNISTD_H
> +# include // fileno
> +# include // isatty
> +#endif
> +
> +namespace std _GLIBCXX_VISIBILITY(default)
> +{
> +_GLIBCXX_BEGIN_NAMESPACE_VERSION
> +
> +#ifdef _WIN32
> +namespace
> +{
> +  void*
> +  check_for_console(void* handle)
> +  {
> +    if (handle != nullptr && handle != INVALID_HANDLE_VALUE)
> +      {
> +        unsigned long mode; // unused
> +        if (::GetConsoleMode(handle, &mode))
> +          return handle;
> +      }
> +    return nullptr;
> +  }
> +} // namespace
> +#endif
> +
> +  // This returns intptr_t that is either a Windows HANDLE
> +  // or 1 + a POSIX file descriptor. A zero return indicates failure.
> +  void*
> +  __open_terminal(FILE* f)
> +  {
> +#ifndef _GLIBCXX_USE_STDIO_PURE
> +    if (f)
> +      {
> +#ifdef _WIN32
> +        if (int fd = ::_fileno(f); fd >= 0)
> +          return check_for_console((void*)_get_osfhandle(fd));
> +#elifdef _GLIBCXX_HAVE_UNISTD_H
> +        if (int fd = ::fileno(f); fd >= 0 && ::isatty(fd))
> +          return f;
> +#endif
> +      }
> +#endif
> +    return nullptr;
> +  }
> +
> +  void*
> +  __open_terminal(std::streambuf* sb)
> +  {
> +#ifndef _GLIBCXX_USE_STDIO_PURE
> +    using namespace __gnu_cxx;
> +
> +    if (auto fb = dynamic_cast*>(sb))
> +      return __open_terminal(fb->file());
> +
> +    if (auto fb = dynamic_cast*>(sb))
> +      return __open_terminal(fb->file());
> +
> +#ifdef __glibcxx_fstream_native_handle
> +#ifdef _WIN32
> +    if (auto fb = dynamic_cast(sb))
> +      return check_for_console(fb->native_handle());
> +#elifdef _GLIBCXX_HAVE_UNISTD_H
> +    if (auto fb = dynamic_cast(sb))
> +      if (int fd = fb->native_handle(); fd >= 0 && ::isatty(fd))
> +        return ::fdopen(::dup(fd), "w"); // Caller must call fclose.
> +#endif
> +#endif
> +#endif // ! _GLIBCXX_USE_STDIO_PURE
> +
> +    return nullptr;
> +  }
> +
> +namespace
> +{
> +  // Validate UTF-8 string, replacing invalid sequences with U+FFFD.
> +  //
> +  // Return true if the input is valid UTF-8, false otherwise.
> +  //
> +  // If sizeof(_CharT) > 1, then transcode a valid string into out,
> +  // using either UTF-16 or UTF-32 as determined by sizeof(_CharT).
> +  //
> +  // If sizeof(_CharT) == 1 and the input is valid UTF-8, both s and out will
> +  // be unchanged. Otherwise, each invalid sequence in s will be overwritten
> +  // with a single 0xFF byte followed by zero or more 0xFE bytes, and then
> +  // a valid UTF-8 string will be produced in out (replacing invalid
> +  // sequences with U+FFFD).
> +  template
> +    bool
> +    to_valid_unicode(span s, basic_string<_CharT>& out)
> +    {
> +      constexpr bool transcode = sizeof(_CharT) > 1;
> +
> +      unsigned seen = 0, needed = 0;
> +      unsigned char lo_bound = 0x80, hi_bound = 0xBF;
> +      size_t errors = 0;
> +
> +      [[maybe_unused]] uint32_t code_point{};
> +      if constexpr (transcode)
> +        {
> +          out.clear();
> +          // XXX: count code points in s instead of bytes?
> +          out.reserve(s.size());
> +        }
> +
> +      auto q = s.data(), eoq = q + s.size();
> +      while (q != eoq)
> +        {
> +          unsigned char byte = *q;
> +          if (needed == 0)
> +            {
> +              if (byte <= 0x7F) [[likely]] // 0x00 to 0x7F
> +                {
> +                  if constexpr (transcode)
> +                    out.push_back(_CharT(byte));
> +
> +                  // Fast forward to the next non-ASCII character.
> +                  while (++q != eoq && (unsigned char)*q <= 0x7F)
> +                    {
> +                      if constexpr (transcode)
> +                        out.push_back(*q);
> +                    }
> +                  continue;
> +                }
> +              else if (byte < 0xC2)
> +                {
> +                  if constexpr (transcode)
> +                    out.push_back(0xFFFD);
> +                  else
> +                    *q = 0xFF;
> +                  ++errors;
> +                }
> +              else if (byte <= 0xDF) // 0xC2 to 0xDF
> +                {
> +                  needed = 1;
> +                  if constexpr (transcode)
> +                    code_point = byte & 0x1F;
> +                }
> +              else if (byte <= 0xEF) // 0xE0 to 0xEF
> +                {
> +                  if (byte == 0xE0)
> +                    lo_bound = 0xA0;
> +                  else if (byte == 0xED)
> +                    hi_bound = 0x9F;
> +
> +                  needed = 2;
> +                  if constexpr (transcode)
> +                    code_point = byte & 0x0F;
> +                }
> +              else if (byte <= 0xF4) // 0xF0 to 0xF4
> +                {
> +                  if (byte == 0xF0)
> +                    lo_bound = 0x90;
> +                  else if (byte == 0xF4)
> +                    hi_bound = 0x8F;
> +
> +                  needed = 3;
> +                  if constexpr (transcode)
> +                    code_point = byte & 0x07;
> +                }
> +              else [[unlikely]]
> +                {
> +                  if constexpr (transcode)
> +                    out.push_back(0xFFFD);
> +                  else
> +                    *q = 0xFF;
> +                  ++errors;
> +                }
> +            }
> +          else
> +            {
> +              if (byte < lo_bound || byte > hi_bound) [[unlikely]]
> +                {
> +                  if constexpr (transcode)
> +                    out.push_back(0xFFFD);
> +                  else
> +                    {
> +                      *(q - seen - 1) = 0xFF;
> +                      __builtin_memset(q - seen, 0xFE, seen);
> +                    }
> +                  ++errors;
> +                  needed = seen = 0;
> +                  lo_bound = 0x80;
> +                  hi_bound = 0xBF;
> +                  continue; // Reprocess the current character.
> +                }
> +
> +              if constexpr (transcode)
> +                code_point = (code_point << 6) | (byte & 0x3f);
> +
> +              lo_bound = 0x80;
> +              hi_bound = 0xBF;
> +              ++seen;
> +              if (seen == needed) [[likely]]
> +                {
> +                  if constexpr (transcode)
> +                    {
> +                      if (code_point <= __gnu_cxx::__int_traits<_CharT>::__max)
> +                        out.push_back(code_point);
> +                      else
> +                        {
> +                          // Algorithm from
> +                          // http://www.unicode.org/faq/utf_bom.html#utf16-4
> +                          const char32_t LEAD_OFFSET = 0xD800 - (0x10000 >> 10);
> +                          char16_t lead = LEAD_OFFSET + (code_point >> 10);
> +                          char16_t trail = 0xDC00 + (code_point & 0x3FF);
> +                          out.push_back(lead);
> +                          out.push_back(trail);
> +                        }
> +                    }
> +                  needed = seen = 0;
> +                }
> +            }
> +          ++q;
> +        }
> +
> +      if (needed) [[unlikely]]
> +        {
> +          // The string ends with an incomplete multibyte sequence.
> +          if constexpr (transcode)
> +            out.push_back(0xFFFD);
> +          else
> +            {
> +              // Truncate the incomplete sequence to a single byte.
> +              if (seen)
> +                s = s.first(s.size() - seen);
> +              s.back() = 0xFF;
> +            }
> +          ++errors;
> +        }
> +
> +      if (errors == 0) [[likely]]
> +        return true;
> +      else if constexpr (!transcode)
> +        {
> +          out.reserve(s.size() + errors * 2);
> +          for (unsigned char byte : s)
> +            {
> +              if (byte < 0xFE) [[likely]]
> +                out += (char)byte;
> +              else if (byte == 0xFF)
> +                out += "\xef\xbf\xbd"; // U+FFFD in UTF-8
> +            }
> +        }
> +      return false;
> +    }
> +
> +  // Validate UTF-8 string.
> +  // Returns true if s is valid UTF-8, otherwise returns false and stores
> +  // a valid UTF-8 string in err.
> +  [[__gnu__::__always_inline__]]
> +  inline bool
> +  to_valid_utf8(span s, string& err)
> +  {
> +    return to_valid_unicode(s, err);
> +  }
> +
> +  // Transcode UTF-8 string to UTF-16.
> +  // Returns true if s is valid UTF-8, otherwise returns false.
> + // In either case, a valid UTF-16 string is stored in u16. > + [[__gnu__::__always_inline__]] > + inline bool > + to_valid_utf16(span s, u16string& u16) > + { > + return to_valid_unicode(s, u16); > + } > +} // namespace > + > + // Write a UTF-8 string to a file descriptor/handle. > + // Ill-formed sequences in the string will be substituted with U+FFFD. > + error_code > + __write_to_terminal(void* term, span str) > + { > + if (term =3D=3D nullptr) [[unlikely]] > + return std::make_error_code(std::errc::invalid_argument); > + > + error_code ec; > + > +#ifdef _WIN32 > + // We could use std::wstring here instead of std::u16string. In > general > + // char_traits is more optimized than char_traits > but > + // for the purposes of to_valid_unicode only char_traits::copy > matters, > + // and char_traits::copy uses memcpy so is OK. > + u16string wstr; > + if (!to_valid_utf16(str, wstr)) > + ec =3D std::make_error_code(errc::illegal_byte_sequence); > + > + unsigned long nchars =3D 0; > + WriteConsoleW(term, wstr.data(), wstr.size(), &nchars, nullptr); > + if (nchars !=3D wstr.size()) > + return {(int)GetLastError(), system_category()}; > +#elifdef _GLIBCXX_HAVE_UNISTD_H > + string out; > + if (!to_valid_utf8(str, out)) > + { > + str =3D out; > + ec =3D std::make_error_code(errc::illegal_byte_sequence); > + } > + > + auto n =3D std::fwrite(str.data(), 1, str.size(), (FILE*)term); > + if (n !=3D str.size()) > + ec =3D std::make_error_code(errc::io_error); > +#else > + ec =3D std::make_error_code(std::errc::function_not_supported); > +#endif > + return ec; > + } > +_GLIBCXX_END_NAMESPACE_VERSION > +} // namespace std > diff --git a/libstdc++-v3/testsuite/27_io/basic_ostream/print/1.cc > b/libstdc++-v3/testsuite/27_io/basic_ostream/print/1.cc > new file mode 100644 > index 00000000000..28dc8af33e6 > --- /dev/null > +++ b/libstdc++-v3/testsuite/27_io/basic_ostream/print/1.cc > @@ -0,0 +1,112 @@ > +// { dg-options "-lstdc++exp" } > +// { dg-do run { target c++23 } } > +// { 
dg-require-fileio "" }
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +void
> +test_print_ostream()
> +{
> +  char buf[64];
> +  std::spanstream os(buf);
> +  std::print(os, "File under '{}' for {}", 'O', "OUT OF FILE");
> +  std::string_view txt(os.span());
> +  VERIFY( txt == "File under 'O' for OUT OF FILE" );
> +}
> +
> +void
> +test_println_ostream()
> +{
> +  char buf[64];
> +  std::spanstream os(buf);
> +  std::println(os, "{} Lineman was a song I once heard", "Wichita");
> +  std::string_view txt(os.span());
> +  VERIFY( txt == "Wichita Lineman was a song I once heard\n" );
> +}
> +
> +void
> +test_print_raw()
> +{
> +  char buf[64];
> +  std::spanstream os(buf);
> +  std::print(os, "{}", '\xa3'); // Not a valid UTF-8 string.
> +  std::string_view txt(os.span());
> +  // Invalid UTF-8 should be written out unchanged if the ostream is not
> +  // connected to a tty:
> +  VERIFY( txt == "\xa3" );
> +}
> +
> +void
> +test_print_formatted()
> +{
> +  char buf[64];
> +  std::spanstream os(buf);
> +  os << std::setw(20) << std::setfill('*') << std::right;
> +  std::print(os, "{} Luftballons", 99);
> +  std::string_view txt(os.span());
> +  VERIFY( txt == "******99 Luftballons" );
> +}
> +
> +void
> +test_vprint_nonunicode()
> +{
> +  std::ostream out(std::cout.rdbuf());
> +  std::vprint_nonunicode(out, "{0} in \xc0 {0} out\n",
> +                         std::make_format_args("garbage"));
> +  // { dg-output "garbage in .
garbage out" }
> +}
> +
> +struct brit_punc : std::numpunct
> +{
> +  std::string do_grouping() const override { return "\3\3"; }
> +  char do_thousands_sep() const override { return ','; }
> +  std::string do_truename() const override { return "yes mate"; }
> +  std::string do_falsename() const override { return "nah bruv"; }
> +};
> +
> +void
> +test_locale()
> +{
> +  struct stream_punc : std::numpunct
> +  {
> +    std::string do_grouping() const override { return "\2\2"; }
> +    char do_thousands_sep() const override { return '~'; }
> +  };
> +
> +  // The default C locale.
> +  std::locale cloc = std::locale::classic();
> +  // A custom locale using comma digit separators.
> +  std::locale bloc(cloc, new stream_punc);
> +
> +  {
> +    std::ostringstream os;
> +    std::print(os, "{:L} {}", 12345, 6789);
> +    VERIFY(os.str() == "12345 6789");
> +  }
> +  {
> +    std::ostringstream os;
> +    std::print(os, "{}", 42);
> +    VERIFY(os.str() == "42");
> +  }
> +  {
> +    std::ostringstream os;
> +    os.imbue(bloc);
> +    std::print(os, "{:L} {}", 12345, 6789);
> +    VERIFY(os.str() == "1~23~45 6789");
> +  }
> +}
> +
> +int main()
> +{
> +  test_print_ostream();
> +  test_println_ostream();
> +  test_print_raw();
> +  test_print_formatted();
> +  test_vprint_nonunicode();
> +  test_locale();
> +}
> diff --git a/libstdc++-v3/testsuite/27_io/print/1.cc b/libstdc++-v3/testsuite/27_io/print/1.cc
> new file mode 100644
> index 00000000000..3cfdac1bb74
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/27_io/print/1.cc
> @@ -0,0 +1,85 @@
> +// { dg-options "-lstdc++exp" }
> +// { dg-do run { target c++23 } }
> +// { dg-require-fileio "" }
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +void
> +test_print_default()
> +{
> +  std::print("H{}ll{}, {}!", 3, 0, "world");
> +  // { dg-output "H3ll0, world!"
}
> +}
> +
> +void
> +test_println_default()
> +{
> +  std::println("I walk the line");
> +  // { dg-output "I walk the line\n" }
> +}
> +
> +void
> +test_print_file()
> +{
> +  __gnu_test::scoped_file f;
> +  FILE* strm = std::fopen(f.path.string().c_str(), "w");
> +  VERIFY( strm );
> +  std::print(strm, "File under '{}' for {}", 'O', "OUT OF FILE");
> +  std::fclose(strm);
> +
> +  std::ifstream in(f.path);
> +  std::string txt(std::istreambuf_iterator(in), {});
> +  VERIFY( txt == "File under 'O' for OUT OF FILE" );
> +}
> +
> +void
> +test_println_file()
> +{
> +  __gnu_test::scoped_file f;
> +  FILE* strm = std::fopen(f.path.string().c_str(), "w");
> +  VERIFY( strm );
> +  std::println(strm, "{} Lineman was a song I once heard", "Wichita");
> +  std::fclose(strm);
> +
> +  std::ifstream in(f.path);
> +  std::string txt(std::istreambuf_iterator(in), {});
> +  VERIFY( txt == "Wichita Lineman was a song I once heard\n" );
> +}
> +
> +void
> +test_print_raw()
> +{
> +  __gnu_test::scoped_file f;
> +  FILE* strm = std::fopen(f.path.string().c_str(), "w");
> +  VERIFY( strm );
> +  std::print(strm, "{}", '\xa3'); // Not a valid UTF-8 string.
> +  std::fclose(strm);
> +
> +  std::ifstream in(f.path);
> +  std::string txt(std::istreambuf_iterator(in), {});
> +  // Invalid UTF-8 should be written out unchanged if the stream is not
> +  // connected to a tty:
> +  VERIFY( txt == "\xa3" );
> +}
> +
> +void
> +test_vprint_nonunicode()
> +{
> +  std::vprint_nonunicode("{0} in \xc0 {0} out\n",
> +                         std::make_format_args("garbage"));
> +  // { dg-output "garbage in .
garbage out" }
> +}
> +
> +int main()
> +{
> +  test_print_default();
> +  test_println_default();
> +  test_print_file();
> +  test_println_file();
> +  test_print_raw();
> +  test_vprint_nonunicode();
> +}
> diff --git a/libstdc++-v3/testsuite/27_io/print/2.cc b/libstdc++-v3/testsuite/27_io/print/2.cc
> new file mode 100644
> index 00000000000..e101201f109
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/27_io/print/2.cc
> @@ -0,0 +1,151 @@
> +// { dg-options "-lstdc++exp" }
> +// { dg-do run { target c++23 } }
> +// { dg-require-fileio "" }
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#ifdef _WIN32
> +#include
> +#endif
> +
> +namespace std
> +{
> +_GLIBCXX_BEGIN_NAMESPACE_VERSION
> +  // This is an internal implementation detail that must not be used directly.
> +  // We need to use it here to test the behaviour
> +  error_code __write_to_terminal(void*, span);
> +_GLIBCXX_END_NAMESPACE_VERSION
> +}
> +
> +// Test the internal __write_to_terminal function that vprintf_unicode uses.
> +// The string parameter will be written to a file, then the bytes of the file
> +// will be read back again. On Windows those bytes will be a UTF-16 string.
> +// Returns true if the string was valid UTF-8.
> +bool
> +as_printed_to_terminal(std::string& s)
> +{
> +  __gnu_test::scoped_file f;
> +  FILE* strm = std::fopen(f.path.string().c_str(), "w");
> +  VERIFY( strm );
> +#ifdef _WIN32
> +  void* handle = (void*)_get_osfhandle(_fileno(strm));
> +  const auto ec = std::__write_to_terminal(handle, s);
> +#else
> +  const auto ec = std::__write_to_terminal(strm, s);
> +#endif
> +  VERIFY( !ec || ec == std::make_error_code(std::errc::illegal_byte_sequence) );
> +  std::fclose(strm);
> +  std::ifstream in(f.path);
> +  s.assign(std::istreambuf_iterator(in), {});
> +  return !ec;
> +}
> +
> +void
> +test_utf8_validation()
> +{
> +#ifndef _WIN32
> +  std::string s = (const char*)u8"£🇬🇧 €🇪🇺";
> +  const std::string s2 = s;
> +  VERIFY( as_printed_to_terminal(s) );
> +  VERIFY( s == s2 );
> +
> +  s += " \xa3 10.99 \xee \xdd";
> +  const std::string s3 = s;
> +  VERIFY( ! as_printed_to_terminal(s) );
> +  VERIFY( s != s3 );
> +  std::string repl = (const char*)u8"\uFFFD";
> +  const std::string s4 = s2 + " " + repl + " 10.99 " + repl + " " + repl;
> +  VERIFY( s == s4 );
> +
> +  s = "\xc0\x80";
> +  VERIFY( ! as_printed_to_terminal(s) );
> +  VERIFY( s == repl + repl );
> +  s = "\xc0\xae";
> +  VERIFY( ! as_printed_to_terminal(s) );
> +  VERIFY( s == repl + repl );
> +
> +  // Examples of U+FFFD substitution from Unicode standard.
> +  std::string r4 = repl + repl + repl + repl;
> +  s = "\xc0\xaf\xe0\x80\xbf\xf0\x81\x82\x41"; // Table 3-8
> +  VERIFY( ! as_printed_to_terminal(s) );
> +  VERIFY( s == r4 + r4 + "\x41" );
> +  s = "\xed\xa0\x80\xed\xbf\xbf\xed\xaf\x41"; // Table 3-9
> +  VERIFY( ! as_printed_to_terminal(s) );
> +  VERIFY( s == r4 + r4 + "\x41" );
> +  s = "\xf4\x91\x92\x93\xff\x41\x80\xbf\x42"; // Table 3-10
> +  VERIFY( !
as_printed_to_terminal(s) );
> +  VERIFY( s == r4 + repl + "\x41" + repl + repl + "\x42" );
> +  s = "\xe1\x80\xe2\xf0\x91\x92\xf1\xbf\x41"; // Table 3-11
> +  VERIFY( ! as_printed_to_terminal(s) );
> +  VERIFY( s == r4 + "\x41" );
> +#endif
> +}
> +
> +// Create a std::u16string from the bytes in a std::string.
> +std::u16string
> +utf16_from_bytes(const std::string& s)
> +{
> +  std::u16string u16;
> +  // s should have an even number of bytes. If it doesn't, we'll copy its
> +  // null terminator into the result, which will not match the expected value.
> +  const auto len = (s.size() + 1) / 2;
> +  u16.resize_and_overwrite(len, [&s](char16_t* p, size_t n) {
> +    std::memcpy(p, s.data(), n * sizeof(char16_t));
> +    return n;
> +  });
> +  return u16;
> +}
> +
> +void
> +test_utf16_transcoding()
> +{
> +#ifdef _WIN32
> +  // FIXME: We can't test __write_to_terminal for Windows, because it
> +  // returns an INVALID_HANDLE Windows error when writing to a normal file.
> +
> +  std::string s = (const char*)u8"£🇬🇧 €🇪🇺";
> +  const std::u16string s2 = u"£🇬🇧 €🇪🇺";
> +  VERIFY( as_printed_to_terminal(s) );
> +  VERIFY( utf16_from_bytes(s) == s2 );
> +
> +  s += " \xa3 10.99 \xee\xdd";
> +  VERIFY( ! as_printed_to_terminal(s) );
> +  std::u16string repl = u"\uFFFD";
> +  const std::u16string s3 = s2 + u" " + repl + u" 10.99 " + repl + repl;
> +  VERIFY( utf16_from_bytes(s) == s3 );
> +
> +  s = "\xc0\x80";
> +  VERIFY( ! as_printed_to_terminal(s) );
> +  VERIFY( utf16_from_bytes(s) == repl + repl );
> +  s = "\xc0\xae";
> +  VERIFY( ! as_printed_to_terminal(s) );
> +  VERIFY( utf16_from_bytes(s) == repl + repl );
> +
> +  // Examples of U+FFFD substitution from Unicode standard.
> +  std::u16string r4 = repl + repl + repl + repl;
> +  s = "\xc0\xaf\xe0\x80\xbf\xf0\x81\x82\x41"; // Table 3-8
> +  VERIFY( !
as_printed_to_terminal(s) );
> +  VERIFY( utf16_from_bytes(s) == r4 + r4 + u"\x41" );
> +  s = "\xed\xa0\x80\xed\xbf\xbf\xed\xaf\x41"; // Table 3-9
> +  VERIFY( ! as_printed_to_terminal(s) );
> +  VERIFY( utf16_from_bytes(s) == r4 + r4 + u"\x41" );
> +  s = "\xf4\x91\x92\x93\xff\x41\x80\xbf\x42"; // Table 3-10
> +  VERIFY( ! as_printed_to_terminal(s) );
> +  VERIFY( utf16_from_bytes(s) == r4 + repl + u"\x41" + repl + repl + u"\x42" );
> +  s = "\xe1\x80\xe2\xf0\x91\x92\xf1\xbf\x41"; // Table 3-11
> +  VERIFY( ! as_printed_to_terminal(s) );
> +  VERIFY( utf16_from_bytes(s) == r4 + u"\x41" );
> +#endif
> +}
> +
> +int main()
> +{
> +  test_utf8_validation();
> +  test_utf16_transcoding();
> +}
> --
> 2.43.0
>
>
--000000000000a224f2060c8229c5--