From: Jonathan Wakely
Date: Mon, 23 Oct 2023 13:01:31 +0100
Subject: Re: operator<< and int8_t and uint8_t types....
References: <32b868eb-2569-471c-8f19-aac7fc362c42@gmail.com>
To: Théo Papadopoulo
Cc: "libstdc++"

On Mon, 23 Oct 2023 at 12:53, Jonathan Wakely wrote:
>
> On Mon, 23 Oct 2023 at 12:45, Jonathan Wakely wrote:
> >
> > On Mon, 23 Oct 2023 at 12:36, Théo Papadopoulo wrote:
> > >
> > > Hi,
> > >
> > > I suspect that this will probably receive an answer "this is the
> > > intended/normal/expected behavior",
> >
> > Correct.
> >
> > > but from what I saw on the internet
> > > this is a kind of gray area in the standard
> >
> > Not really.
> >
> > > and I did not find any
> > > (even closed) report on bugzilla, so I thought of asking here.
> > >
> > > Currently, printing an int8_t or uint8_t number with operator<< basically
> > > prints it as a (u)char. I think this is very unfortunate and is somewhat
> > > deceptive to
> > > users who expect an "int"-like behavior.
> >
> > But int8_t is not an int, it's a typedef for an unspecified 8-bit
> > integral type, typically signed char (or potentially, but unlikely,
> > char, if that happens to be signed). The behaviour when printing
> > signed char is 100% specified; there is no wriggle room for doing
> > anything else.
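The typedef point above can be checked at compile time. The following is
only a sketch: the two constant names are illustrative and do not come from
the thread or from any library, and the result depends on the platform's
<cstdint>.

```cpp
#include <cstdint>
#include <type_traits>

// Illustrative names (assumptions, not library API): true when the 8-bit
// typedefs are aliases for character types rather than distinct types.
constexpr bool int8_is_character_type =
    std::is_same<std::int8_t, signed char>::value ||
    std::is_same<std::int8_t, char>::value;

constexpr bool uint8_is_character_type =
    std::is_same<std::uint8_t, unsigned char>::value ||
    std::is_same<std::uint8_t, char>::value;
```

On common platforms both constants are expected to be true, which is why
streaming these types selects the character overloads.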
> >
> > > #include <iostream>
> > > #include <cstdint>
> > >
> > > namespace std {
> > >
> > >     std::ostream& operator<<(std::ostream& os,const int8_t v) {
> > >         return os << static_cast<int>(v);
> > >     }
> > >
> > >     std::ostream& operator<<(std::ostream& os,const uint8_t v) {
> > >         return os << static_cast<int>(v);
> > >     }
> > > }
> > >
> > > int main() {
> > >     std::int8_t a = 65;
> > >     std::uint8_t b = 66;
> > >     char c = 'c';
> > >     std::cerr << a << ' ' << b << ' ' << c << std::endl;
> > > }
> > >
> > > Without the operator<< overloads, this code prints:
> > >
> > > A B c
> > >
> > > whereas I think a programmer would expect to see:
> > >
> > > 65 66 c
> > >
> > > which is what is obtained with the overloads.
> > >
> > > There is an old and extended discussion on stack overflow on the
> > > subject, in which from some comment it seems that
> > > it is the promotion to int vs the conversion to char that plays a role
> > > here (and that gcc, like many other compilers, favors the char
> > > conversion over the int promotion, but I'm not so sure about the
> > > validity of this discussion).
> >
> > The standard specifies exactly how integer promotions work. There is
> > nothing any compiler can do to favour one promotion or another, it's
> > dictated by the standard.
> >
> > But that's irrelevant here, there is no promotion. The standard
> > requires the following overloads which define how signed char and
> > unsigned char are printed:
> >
> > template<class traits>
> >   basic_ostream<char, traits>& operator<<(basic_ostream<char, traits>&,
> >                                           signed char);
> > template<class traits>
> >   basic_ostream<char, traits>& operator<<(basic_ostream<char, traits>&,
> >                                           unsigned char);
> >
> > The standard requires int8_t and uint8_t to be typedefs, not distinct
> > types. On most platforms, there are no integral types with exactly 8
> > bits except for char, signed char and unsigned char, and since C++20,
> > char8_t. So int8_t and uint8_t MUST be typedefs for one of those. How
> > those are printed with iostreams is clearly specified by the standard.
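To see the overload selection described above without any promotion
involved, a small sketch follows. The helper streamed_text is a
hypothetical name introduced for illustration; it just applies the ordinary
operator<< to a std::ostringstream and returns the text.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: insert a value with the ordinary operator<< and
// return whatever text the stream produced.
template <typename T>
std::string streamed_text(T value) {
    std::ostringstream out;
    out << value;  // overload resolution picks the inserter for T
    return out.str();
}
```

Assuming int8_t is a typedef for signed char and the execution character
set is ASCII-compatible, streamed_text(std::int8_t{65}) gives "A" (the
character inserter is an exact match), while streamed_text(65) gives "65".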
> >
> > >
> > > Certainly from the user's perspective, one would expect to see an
> > > integer value and not a char... and it does not look very difficult to do.
> >
> > The standard already says how to print those types. Adding a more
> > specialized overload that only works for std::ostream (and not for
> > other std::basic_ostream specializations) would be a breaking
> > change, and very surprising to all the users who do actually know how
> > this works.
>
> P.S. the behaviour of printing signed char and unsigned char has been
> the same for 25 years, since the first C++ standard. Therefore the
> behaviour of printing int8_t and uint8_t has been the same for as long
> as those typedefs have existed (they were added to C99 in 1999, and
> then inherited by C++11 in 2011, but were already in common use before
> then). So this behaviour has been the same for a very long time.

P.P.S. The Single UNIX Specification specified those typedefs in 1997.

> Plenty of users expect exactly what libstdc++ (and other
> implementations) do, because that's what it has always done.
>
> There are many things which are surprising to some people when they
> first encounter them. That doesn't mean we should or can change them
> now. Even if this were to change, it would have to happen in the C++
> standard, not just in libstdc++. We implement the standard.
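For users who want the numeric value, one common approach in user code is
to convert explicitly before streaming, with no change to the library. A
minimal sketch, assuming the usual integer promotions apply to the 8-bit
types; as_number and numeric_text are hypothetical names, not part of
libstdc++ or the standard.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Unary plus applies the integer promotions, so an int8_t or uint8_t
// becomes an int and the arithmetic inserter is used instead of the
// character inserter.
template <typename T>
constexpr auto as_number(T value) -> decltype(+value) {
    return +value;
}

// Hypothetical helper: stream the promoted value and return the text.
template <typename T>
std::string numeric_text(T value) {
    std::ostringstream out;
    out << as_number(value);
    return out.str();
}
```

In the example from the original question this would be written as
std::cerr << as_number(a) << ' ' << as_number(b), or equivalently with
static_cast<int>(a) at the call site.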