From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-help-return-31836-listarch-gcc-help=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 13851 invoked by alias); 5 Mar 2008 14:11:59 -0000
Received: (qmail 13843 invoked by uid 22791); 5 Mar 2008 14:11:58 -0000
X-Spam-Check-By: sourceware.org
Received: from exprod6og101.obsmtp.com (HELO exprod6og101.obsmtp.com) (64.18.1.181)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Wed, 05 Mar 2008 14:11:34 +0000
Received: from source ([192.150.20.142]) by exprod6ob101.postini.com ([64.18.5.12]) with SMTP; 	Wed, 05 Mar 2008 06:11:29 PST
Received: from inner-relay-1.corp.adobe.com ([153.32.1.51]) 	by outbound-smtp-2.corp.adobe.com (8.12.10/8.12.10) with ESMTP id m25EBQGb008022; 	Wed, 5 Mar 2008 06:11:27 -0800 (PST)
Received: from fe2.corp.adobe.com (fe2.corp.adobe.com [10.8.192.72]) 	by inner-relay-1.corp.adobe.com (8.12.10/8.12.10) with ESMTP id m25EBPRC012605; 	Wed, 5 Mar 2008 06:11:25 -0800 (PST)
Received: from namailgen.corp.adobe.com ([10.8.192.91]) by fe2.corp.adobe.com with Microsoft SMTPSVC(6.0.3790.1830); 	 Wed, 5 Mar 2008 06:11:25 -0800
Received: from 10.32.16.88 ([10.32.16.88]) by namailgen.corp.adobe.com ([10.8.192.91]) via Exchange Front-End Server namailhost.corp.adobe.com ([10.8.192.70]) with Microsoft Exchange Server HTTP-DAV ;  Wed,  5 Mar 2008 14:11:25 +0000
User-Agent: Microsoft-Entourage/12.0.0.071130
Date: Wed, 05 Mar 2008 14:11:00 -0000
Subject: Re: how gcc thinks `char' as signed char or unsigned char ?
From: John Love-Jensen <eljay@adobe.com>
To: Tom St Denis <tstdenis@ellipticsemi.com>
CC: PRC <panruochen@gmail.com>, GCC-help <gcc-help@gcc.gnu.org>
Message-ID: <C3F4062A.2D88E%eljay@adobe.com>
In-Reply-To: <47CEA121.5090500@ellipticsemi.com>
Mime-version: 1.0
Content-type: text/plain; 	charset="US-ASCII"
Content-transfer-encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact gcc-help-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-help.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-help/>
List-Post: <mailto:gcc-help@gcc.gnu.org>
List-Help: <mailto:gcc-help-help@gcc.gnu.org>
Sender: gcc-help-owner@gcc.gnu.org
X-SW-Source: 2008-03/txt/msg00042.txt.bz2

Hi Tom,

> Presumably GetByte() would be responsible for ensuring its range is
> limited to 0..255 or -128..127, so the &0xFF is not required.

GetByte returns a byte (a char), which does not specify whether the range is
0..255 or -128..127, so the &0xFF is required.

> b < 200 would work just fine if GetByte() were spec'd properly.

GetByte is spec'd properly.

> It's more apt to think of "char" as a small integer type, not a
> "character type."  You could have a platform where char and int are the
> same size for instance.

I have worked on a platform where a char is 32 bits, and on another
platform where a char is 13 bits.  But that was quite a while ago.

I find it more apt to think of a char as holding character data, and of a
byte as holding a byte (not a small integer).  The (b & 0xFF) then converts
the byte into an int with a constrained range of 0..255.
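
A minimal sketch of that conversion, assuming 8-bit chars and GCC's usual
behaviour when an out-of-range value is stored in a signed char.  The helper
names byte_to_int and below_200 are hypothetical and exist only for
illustration:

```c
typedef char byte;   /* signedness of plain char is implementation-defined */

/* Hypothetical helper: map a byte onto an int in the range 0..255.
   b is promoted to int first, then the mask keeps the low 8 bits. */
static int byte_to_int(byte b)
{
    return b & 0xFF;
}

/* Hypothetical helper: the comparison from the thread, made independent
   of whether plain char is signed or unsigned. */
static int below_200(byte b)
{
    return byte_to_int(b) < 200;
}
```

With byte b = (byte)0xC8, a bare b < 200 is true where char is signed (b is
then typically -56) and false where char is unsigned (b is 200).  Going
through the mask gives 200 either way, so below_200(b) is false on both.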

> In two's complement it doesn't really matter, unless you multiply the
> type.  You'd get a signed multiplication instead of unsigned.  Which
> probably won't matter, but it's good to be explicit.  Also shifting
> works differently.  unsigned right shifts fill with zeros.  signed
> shifts may fill with zeros OR the sign bit.

The (b & 0xFF) will work correctly on 1's complement machines, and 2's
complement machines.

If the desire is to have a byte (the typedef'd identifier) represent an
octet, it will also work correctly on machines with greater than 8-bit
bytes.  (In which case it may be more appropriate to use typedef char octet;
as the identifier.)

On platforms where a byte is narrower than 8 bits, and the desire is for a
byte to represent an octet, a char is not sufficient to hold an octet.
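
The width cases above can be sketched with CHAR_BIT from <limits.h>.  This is
only an illustration: the helper name octet_to_int is hypothetical, and note
that ISO C requires CHAR_BIT to be at least 8, so the #error branch can only
fire on a non-conforming implementation:

```c
#include <limits.h>

#if CHAR_BIT < 8
#error "char is too narrow to hold an octet on this platform"
#endif

typedef char octet;   /* the identifier suggested for wider-than-8-bit bytes */

/* Hypothetical helper: keep only the low 8 bits, so the result is in
   0..255 whether CHAR_BIT is 8, 13, or 32. */
static int octet_to_int(octet o)
{
    return o & 0xFF;
}
```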

On all the platforms I work on these days, a byte is 8-bit.  I don't expect
that to change any time soon.

> It's IMO a better idea to use the unsigned type (that's why it exists)
> and write your functions so that their domains and co-domains are well
> understood.  If you can't figure out what the inputs to a function
> should be, it's clearly not clearly clearly documented.  ;-)

For a type that holds small integers, an unsigned char (or uint8_t from
<stdint.h>) is appropriate.

For a type that holds a byte, a typedef char byte; without regard to
signed-ness or unsigned-ness is appropriate.

For conversion of a byte to an unsigned char (or uint8_t) which is a small
integer, a (b & 0xFF) is appropriate.
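
As a rough sketch of the three cases, assuming a platform that provides
uint8_t (it is optional in C99, but present wherever char is 8 bits).  The
names byte_to_u8 and high_nibble are hypothetical:

```c
#include <stdint.h>

typedef char byte;   /* raw data; signedness deliberately left unspecified */

/* Hypothetical helper: convert a byte to a small unsigned integer. */
static uint8_t byte_to_u8(byte b)
{
    return (uint8_t)(b & 0xFF);
}

/* Hypothetical helper: arithmetic is done on the unsigned value, so the
   right shift always fills with zeros. */
static unsigned high_nibble(byte b)
{
    return (unsigned)byte_to_u8(b) >> 4;
}
```

Shifting the byte directly could fill with the sign bit where char is signed;
converting first keeps the result in 0..15 regardless.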

IMO.  YMMV.

Sincerely,
--Eljay