public inbox for libstdc++@gcc.gnu.org
* operator<< and int8_t and uint8_t types....
From: Théo Papadopoulo @ 2023-10-23 11:36 UTC
  To: libstdc++

     Hi,

I suspect that this will probably receive the answer "this is the
intended/normal/expected behavior", but from what I saw on the internet
this is a kind of gray area in the standard, and I did not find any
(even closed) report on Bugzilla, so I thought of asking here.

Currently, printing an int8_t or uint8_t value with operator<< basically
prints it as a (u)char. I think this is very unfortunate and somewhat
deceptive to users who expect int-like behavior.

#include <iostream>
#include <cstdint>

namespace std {

     std::ostream& operator<<(std::ostream& os,const int8_t v) {
         return os << static_cast<int>(v);
     }

     std::ostream& operator<<(std::ostream& os,const uint8_t v) {
         return os << static_cast<int>(v);
     }
}

int main() {
     std::int8_t a  = 65;
     std::uint8_t b = 66;
     char         c = 'c';
     std::cerr << a << ' ' << b << ' ' << c << std::endl;
}

Without the operator<< overloads, this code prints:

A B c

whereas I think a programmer would expect to see:

65 66 c

which is what is obtained with the overloads.

There is an old and extended discussion on Stack Overflow on the
subject, in which, according to some comments, it seems that
it is the promotion to int vs the conversion to char that plays a role
here (and that GCC, like many other compilers, favors the char
conversion over the int promotion, but I'm not so sure about the
validity of this discussion).

Certainly from the user's perspective, one would expect to see an
integer value and not a char... and it does not look very difficult to do.

     Thank you in advance,

         Theo.







* Re: operator<< and int8_t and uint8_t types....
From: Jonathan Wakely @ 2023-10-23 11:45 UTC
  To: Théo Papadopoulo; +Cc: libstdc++

On Mon, 23 Oct 2023 at 12:36, Théo Papadopoulo <papadopoulo@gmail.com> wrote:
>
>      Hi,
>
> I suspect that this will probably receive an answer "this is the
> intended/normal/expected  behavior",


Correct.

> but from what I saw on the internet
> this is a kind of gray area in the standard

Not really.

> and I did not find any
> (even closed) report on Bugzilla, so I thought of asking here.
>
> Currently, printing an int8_t or uint8_t value with operator<< basically
> prints it as a (u)char. I think this is very unfortunate and somewhat
> deceptive to users who expect int-like behavior.

But int8_t is not an int, it's a typedef for an unspecified 8-bit
integral type, typically signed char (or potentially, but unlikely,
char, if that happens to be signed). The behaviour when printing
signed char is 100% specified, there is no wriggle room for doing
anything else.
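
Which type the typedefs name on a given implementation can be checked with a type trait. The following is only a sketch: the helper names are made up for illustration, and the result depends on the platform (on typical implementations both functions are expected to return true).

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helpers (not library functions): report whether the
// fixed-width typedefs are simply other names for the character types.
constexpr bool int8_is_char_type()
{
    return std::is_same<std::int8_t, signed char>::value
        || std::is_same<std::int8_t, char>::value;
}

constexpr bool uint8_is_char_type()
{
    return std::is_same<std::uint8_t, unsigned char>::value
        || std::is_same<std::uint8_t, char>::value;
}
```

If these return true, streaming an int8_t or uint8_t necessarily selects the same operator<< as streaming the underlying character type, because a typedef does not create a new type.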

> #include <iostream>
> #include <cstdint>
>
> namespace std {
>
>      std::ostream& operator<<(std::ostream& os,const int8_t v) {
>          return os << static_cast<int>(v);
>      }
>
>      std::ostream& operator<<(std::ostream& os,const uint8_t v) {
>          return os << static_cast<int>(v);
>      }
> }
>
> int main() {
>      std::int8_t a  = 65;
>      std::uint8_t b = 66;
>      char         c = 'c';
>      std::cerr << a << ' ' << b << ' ' << c << std::endl;
> }
>
> Without the operator<< overloads, this code prints:
>
> A B c
>
> whereas I think a programmer would expect to see:
>
> 65 66 c
>
> which is what is obtained with the overloads.
>
> There is an old and extended discussion on Stack Overflow on the
> subject, in which, according to some comments, it seems that
> it is the promotion to int vs the conversion to char that plays a role
> here (and that GCC, like many other compilers, favors the char
> conversion over the int promotion, but I'm not so sure about the
> validity of this discussion).

The standard specifies exactly how integer promotions work. There is
nothing any compiler can do to favour one promotion or another, it's
dictated by the standard.

But that's irrelevant here, there is no promotion. The standard
requires the following overloads which define how signed char and
unsigned char are printed:

template<class traits>
  basic_ostream<char, traits>&
  operator<<(basic_ostream<char, traits>&, signed char);
template<class traits>
  basic_ostream<char, traits>&
  operator<<(basic_ostream<char, traits>&, unsigned char);
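
Why no promotion is involved can be illustrated with ordinary functions: when an overload taking the argument's exact type exists, overload resolution prefers it to one that would need an integral promotion. The function name which_overload below is hypothetical and exists only for this sketch.

```cpp
#include <cstdint>
#include <string>

// Hypothetical overload set mirroring the choices operator<< has to make.
std::string which_overload(int)           { return "int"; }
std::string which_overload(char)          { return "char"; }
std::string which_overload(signed char)   { return "signed char"; }
std::string which_overload(unsigned char) { return "unsigned char"; }
```

Assuming int8_t is a typedef for signed char, which_overload(std::int8_t{1}) returns "signed char" because that is an exact match. Only when the caller promotes explicitly, as in which_overload(+std::int8_t{1}), is the int overload chosen.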

The standard requires int8_t and uint8_t to be typedefs, not distinct
types. On most platforms, there are no integral types with exactly 8
bits except for char, signed char and unsigned char, and since C++20,
char8_t. So int8_t and uint8_t MUST be typedefs for one of those. How
those are printed with iostreams is clearly specified by the standard.
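
A short sketch of the consequence for user code, assuming int8_t is a typedef for signed char and an ASCII-compatible character set. The two helper functions are hypothetical and only serve to capture the stream output as a string.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: stream the value unchanged, so the character
// overload of operator<< is selected.
std::string stream_unchanged(std::int8_t v)
{
    std::ostringstream os;
    os << v;
    return os.str();
}

// Hypothetical helper: unary plus promotes the value to int first,
// so the integer overload of operator<< is selected.
std::string stream_promoted(std::int8_t v)
{
    std::ostringstream os;
    os << +v;
    return os.str();
}
```

Under those assumptions, stream_unchanged(65) yields "A" and stream_promoted(65) yields "65". Writing +v or static_cast<int>(v) at the call site is therefore enough to get the numeric output without adding any overloads.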


> Certainly from the user's perspective, one would expect to see an
> integer value and not a char... and it does not look very difficult to do.

The standard already says how to print those types, adding a more
specialized overload that only works for std::ostream (and not
std::basic_ostream<char, acme::traits<char>>) would be a breaking
change, and very surprising to all the users who do actually know how
this works.
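
For code that wants numeric output without touching namespace std, one possible approach is a small wrapper type in the user's own namespace. This is a sketch under stated assumptions, not something libstdc++ or the standard provides: the namespace demo and the names as_number_t, as_number and to_text are all hypothetical.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

namespace demo {  // hypothetical namespace, not part of any library

// Tag wrapper meaning "print this integer as a number".
template<class T>
struct as_number_t { T value; };

template<class T>
as_number_t<T> as_number(T v)
{
    static_assert(std::is_integral<T>::value,
                  "as_number expects an integer type");
    return as_number_t<T>{v};
}

// Works with any basic_ostream, including ones using custom traits.
template<class CharT, class Traits, class T>
std::basic_ostream<CharT, Traits>&
operator<<(std::basic_ostream<CharT, Traits>& os, as_number_t<T> n)
{
    return os << +n.value;  // unary plus applies the integral promotions
}

// Small convenience for the narrow-character case.
template<class T>
std::string to_text(T v)
{
    std::ostringstream os;
    os << as_number(v);
    return os.str();
}

} // namespace demo
```

With this in place, std::cerr << demo::as_number(a) writes 65 for an int8_t a holding 65, while plain std::cerr << a keeps the behaviour the standard specifies. Because the operator is a template over the stream's character and traits types, it does not have the std::ostream-only limitation mentioned above.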



* Re: operator<< and int8_t and uint8_t types....
From: Jonathan Wakely @ 2023-10-23 11:53 UTC
  To: Théo Papadopoulo; +Cc: libstdc++

P.S. the behaviour of printing signed char and unsigned char has been
the same for 25 years, since the first C++ standard. Therefore the
behaviour of printing int8_t and uint8_t has been the same for as long
as those typedefs have existed (they were added to C99 in 1999, and
then inherited by C++11 in 2011, but were already in common use before
then). So this behaviour has been the same for a very long time.
Plenty of users expect exactly what libstdc++ (and other
implementations) do, because that's what it has always done.

There are many things which are surprising to some people when they
first encounter them. That doesn't mean we should or can change them
now. Even if this were to change, it would have to happen in the C++
standard, not just in libstdc++. We implement the standard.



* Re: operator<< and int8_t and uint8_t types....
From: Jonathan Wakely @ 2023-10-23 12:01 UTC
  To: Théo Papadopoulo; +Cc: libstdc++

On Mon, 23 Oct 2023 at 12:53, Jonathan Wakely <jwakely@redhat.com> wrote:
> P.S. the behaviour of printing signed char and unsigned char has been
> the same for 25 years, since the first C++ standard. Therefore the
> behaviour of printing int8_t and uint8_t has been the same for as long
> as those typedefs have existed (they were added to C99 in 1999, and
> then inherited by C++11 in 2011, but were already in common use before
> then). So this behaviour has been the same for a very long time.

P.P.S. The Single UNIX Specification specified those typedefs in
<inttypes.h> in 1997.

> Plenty of users expect exactly what libstdc++ (and other
> implementations) do, because that's what it has always done.
>
> There are many things which are surprising to some people when they
> first encounter them. That doesn't mean we should or can change them
> now. Even if this were to change, it would have to happen in the C++
> standard, not just in libstdc++. We implement the standard.

