public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/88508] std::bad_cast in std::basic_ostringstream<char16_t>.oss.fill()
       [not found] <bug-88508-4@http.gcc.gnu.org/bugzilla/>
@ 2022-10-17 23:50 ` redi at gcc dot gnu.org
  2023-04-05 18:35 ` f.heckenbach@fh-soft.de
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 6+ messages in thread
From: redi at gcc dot gnu.org @ 2022-10-17 23:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88508

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
The standard does not require facets for char16_t.

*** This bug has been marked as a duplicate of bug 78486 ***


* [Bug libstdc++/88508] std::bad_cast in std::basic_ostringstream<char16_t>.oss.fill()
From: f.heckenbach@fh-soft.de @ 2023-04-05 18:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88508

Frank Heckenbach <f.heckenbach@fh-soft.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |f.heckenbach@fh-soft.de

--- Comment #2 from Frank Heckenbach <f.heckenbach@fh-soft.de> ---
According to
https://stackoverflow.com/questions/57406448/canot-read-char8-t-from-basic-stringstreamchar8-t
this seems to be the same issue, so I'm not filing a new bug, just adding a
comment.

Apparently this is not a GCC bug but behaviour that conforms to the standard.
IMHO this shows how ridiculous the current UTF-8 support is. A common I/O
manipulator (setw) fails with an inscrutable error (bad cast, what cast?), which
is even suppressed by default unless enabled with o.exceptions, so by default
the stream just mysteriously stops working. And all that because the library
doesn't know what the space character is in UTF-8. Just ranting, I know, but
it's silly.

% cat test.cpp
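#include <sstream>
#include <iomanip>

int main ()
{
  std::basic_ostringstream <char8_t> o;
  // Without this, the failure only sets badbit and no exception is thrown.
  o.exceptions (std::ios_base::badbit);
  // setw forces padding, which needs the fill character and thus ctype<char8_t>.
  o << std::setw (1) << u8"";
}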
% g++ -std=c++20 -Wall -Wextra -o test test.cpp  
% ./test
terminate called after throwing an instance of 'std::bad_cast'
  what():  std::bad_cast
Aborted
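
The two behaviours described in this comment (a silent badbit by default, and std::bad_cast once exceptions are enabled) can be put side by side. This is only a minimal sketch, assuming GCC's libstdc++; it has not been checked against other standard library implementations. The helper names are hypothetical, and char16_t is used instead of char8_t so that the code also builds before C++20.

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <typeinfo>

// Hypothetical helper: padded insertion with the default exception mask.
// Returns true if the stream ended up with badbit set.
bool padded_insert_sets_badbit()
{
  std::basic_ostringstream<char16_t> o;
  o << std::setw(1) << u"";   // padding asks for fill(), which needs ctype<char16_t>
  return o.bad();
}

// Hypothetical helper: the same insertion with badbit exceptions enabled.
// Returns true if std::bad_cast escapes the insertion.
bool padded_insert_throws_bad_cast()
{
  std::basic_ostringstream<char16_t> o;
  o.exceptions(std::ios_base::badbit);
  try {
    o << std::setw(1) << u"";
  } catch (const std::bad_cast&) {
    return true;
  }
  return false;
}
```

Under the stated assumption, both helpers should return true: the first shows the stream quietly going bad, the second shows the exception that the default mask hides.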


* [Bug libstdc++/88508] std::bad_cast in std::basic_ostringstream<char16_t>.oss.fill()
From: redi at gcc dot gnu.org @ 2023-04-06  9:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88508

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Frank Heckenbach from comment #2)
> And all that because the library
> doesn't know what the space character is in UTF-8.

That's a completely wrong description of the problem.

The standard library does not come with support for using char8_t as the
character type for streams. That's it. Period.

The problem has nothing to do with the setw manipulator or the space character
or even UTF-8; it's that the char8_t type is not supported by streams out of the
box. You would need to imbue the stream with the relevant facets, such as
ctype<char8_t>.


* [Bug libstdc++/88508] std::bad_cast in std::basic_ostringstream<char16_t>.oss.fill()
From: f.heckenbach@fh-soft.de @ 2023-04-08  0:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88508

--- Comment #4 from Frank Heckenbach <f.heckenbach@fh-soft.de> ---
(In reply to Jonathan Wakely from comment #3)

I don't think my description is "completely wrong". I'm basically saying the
same as you, in plain English.

char8_t was introduced as the preferred type for holding UTF-8 text, so this
clearly has to do with UTF-8. (And I say "text" intentionally -- single
characters are usually better represented as char32_t code points while
encodings such as UTF-8 are used for text.) Streams are the main tool for input
and output and formatting in the standard library, so a text type which does
not support input, output and formatting is indeed ridiculous.

And "imbue the stream with the relevant facets" is just techspeak for telling
the stream what the space character is (and possibly other things), like I
said.

Moreover, if streams don't support char8_t by default, they should clearly say
so (which should be easy these days with concepts, otherwise even with SFINAE)
instead of giving obscure errors about a bad cast where there is no cast.

Again, I'm not blaming gcc if the standard says so, so this discussion here is
probably a waste of time, but it might serve as a warning to other users, to
avoid falling into this trap like I did -- which is easy to fall into when u8""
literals are char8_t[] by default, and char8_t's stated purpose is to hold
UTF-8.


* [Bug libstdc++/88508] std::bad_cast in std::basic_ostringstream<char16_t>.oss.fill()
From: redi at gcc dot gnu.org @ 2023-04-09 14:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88508

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
No, you can't disable it with SFINAE, because it's a runtime property. If you
define ctype<char8_t> yourself and add it to a locale at runtime, and use that
locale with the stream, then it works. We can't disable things at compile time
if the program can make them work at runtime.

You get the same behaviour trying to use a stream of char16_t or unsigned char
or std::byte or myprog::char_type. Like I said, it's not actually a problem
with UTF-8. Streams just don't support any of those types out of the box,
irrespective of the encoding they happen to use.
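
The point that facet availability is a runtime property of the locale can be illustrated with std::has_facet. This is a minimal sketch, assuming GCC's libstdc++; the helper name has_ctype_facet is hypothetical, and char16_t stands in for the other unsupported character types mentioned in this comment.

```cpp
#include <locale>

// Hypothetical helper: reports whether `loc` carries a ctype facet for CharT.
// The answer depends on the locale object passed in, so it is only known at
// run time and cannot be turned into a compile-time constraint on the stream.
template <typename CharT>
bool has_ctype_facet(const std::locale& loc)
{
  return std::has_facet<std::ctype<CharT>>(loc);
}
```

Under that assumption, has_ctype_facet<char>(std::locale::classic()) should be true, while the char16_t query should be false until a program installs its own facet in the locale it passes in.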


* [Bug libstdc++/88508] std::bad_cast in std::basic_ostringstream<char16_t>.oss.fill()
From: f.heckenbach@fh-soft.de @ 2023-04-09 20:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88508

--- Comment #6 from Frank Heckenbach <f.heckenbach@fh-soft.de> ---
Yet ironically, char8_t and char16_t are meant to be used with a certain
encoding (UTF-8 and UTF-16, respectively) which is locale-independent, whereas
char is very much locale-dependent (with even EBCDIC still supported), and it's
the latter that works out of the box without setting a locale.
