* how gcc thinks `char' as signed char or unsigned char ?
@ 2008-03-05  9:20 PRC
  2008-03-05  9:58 ` Andrew Haley
  2008-03-05 13:08 ` John Love-Jensen
  0 siblings, 2 replies; 5+ messages in thread

From: PRC @ 2008-03-05 9:20 UTC
To: gcc-help

---------------------------------
int main()
{
    char a = -7;

    if( a < -9 )
        printf("a");
    else
        printf("b");
}
---------------------------------
sde-gcc -c a2.c
c:/a2.c: In function `main':
c:/a2.c:6: warning: comparison is always false due to limited range of data type

The reason for this warning may be that gcc treats `char' as `unsigned char' by default.
Can I change this default by modifying some configuration file?
Or is this behavior fixed once gcc has been built?
* Re: how gcc thinks `char' as signed char or unsigned char ?
  2008-03-05  9:58 ` Andrew Haley

From: Andrew Haley @ 2008-03-05 9:58 UTC
To: PRC; +Cc: gcc-help

PRC wrote:
> ---------------------------------
> int main()
> {
>     char a = -7;
>
>     if( a < -9 )
>         printf("a");
>     else
>         printf("b");
> }
> ---------------------------------
> sde-gcc -c a2.c
> c:/a2.c: In function `main':
> c:/a2.c:6: warning: comparison is always false due to limited range of data type
>
> It may be the reason for this warning that gcc thinks `char' as 'unsigned char' by default.
> Can I change the default configuration by modifying some configuration file?
> Or this feature can't be changed after gcc has been built?

-fsigned-char

The signedness of characters is part of a machine's ABI.  If you really
need to have unsigned chars, declare them as such.

Andrew.
* Re: how gcc thinks `char' as signed char or unsigned char ?
  2008-03-05 13:08 ` John Love-Jensen

From: John Love-Jensen @ 2008-03-05 13:08 UTC
To: PRC, GCC-help

Hi PRC,

In C++ (and C too), it is really best to think of char as non-signed.

If you need to manipulate bytes of data, take a page from the Java
programming manual and do this:

typedef char byte;
byte b = GetByte();
if ((b & 0xFF) < 200)
{
  std::cout << "byte is under 200" << std::endl;
}

The explicit (b & 0xFF) converts the byte to an int, and ensures that it
is between 0 and 255.

Also, the explicit (b & 0xFF) tells any other programmer who follows in
your footsteps that the value is guaranteed to be between 0 and 255.
It's safe from char being signed or unsigned, which is not reliable from
platform to platform.

Also, using 'byte' instead of 'char' is a way to convey, in code (rather
than in a comment), that you are working with byte information and not
character information.

(The above assumes that the size of a byte is an octet.  If you are on a
platform that has a byte of a different bit-size, you may need to
accommodate accordingly.)

And on my platforms, (b & 0xFF) optimizes very well.  Apparently most
CPUs are pretty good at bit twiddling. :-)

Some may advocate using 'unsigned char' for byte.  I used to advocate
that, too (via: typedef unsigned char byte;).  After a stint doing Java
development, I've changed my mind and now much prefer using an
unspecified 'char' for byte (via: typedef char byte;), and employ the
(b & 0xFF) paradigm when/where needed.  More typing, but -- in my
opinion -- much better code clarity, better self-documenting code, and
less obfuscation.

In the end, it's a matter of coding style and personal preference.  The
above is my recommendation, for your consideration.

HTH,
--Eljay
* Re: how gcc thinks `char' as signed char or unsigned char ?
  2008-03-05 13:32 ` Tom St Denis

From: Tom St Denis @ 2008-03-05 13:32 UTC
To: John Love-Jensen; +Cc: PRC, GCC-help

John Love-Jensen wrote:
> typedef char byte;
> byte b = GetByte();
> if ((b & 0xFF) < 200)
> {
>   std::cout << "byte is under 200" << std::endl;
> }

Presumably GetByte() would be responsible for ensuring its range is
limited to 0..255 or -128..127, so the &0xFF is not required.

> The explicit (b & 0xFF) converts the byte to an int, and ensures that
> it is between 0 and 255.

b < 200 would work just fine if GetByte() were spec'd properly.

> Also, using 'byte' instead of 'char' is a way to convey, in code
> (rather than in comment) that you are working with byte information
> and not character information.

It's more apt to think of "char" as a small integer type, not a
"character type."  You could have a platform where char and int are the
same size, for instance.

> Some may advocate using 'unsigned char' for byte.  I used to advocate
> that, too (via: typedef unsigned char byte;).  After a stint doing
> Java development, I've changed my mind and now much prefer using an
> unspecified 'char' for byte (via: typedef char byte;), and employ the
> (b & 0xFF) paradigm when/where needed.  More typing, but -- in my
> opinion -- much better code clarity, better self-documenting code, and
> less obfuscation.

In two's complement it doesn't really matter, unless you multiply the
type.  You'd get a signed multiplication instead of unsigned, which
probably won't matter, but it's good to be explicit.  Also, shifting
works differently: unsigned right shifts fill with zeros; signed right
shifts may fill with zeros OR the sign bit.

It's IMO a better idea to use the unsigned type (that's why it exists)
and write your functions so that their domains and co-domains are well
understood.  If you can't figure out what the inputs to a function
should be, it's clearly not documented clearly enough. ;-)

Tom
* Re: how gcc thinks `char' as signed char or unsigned char ?
  2008-03-05 14:11 ` John Love-Jensen

From: John Love-Jensen @ 2008-03-05 14:11 UTC
To: Tom St Denis; +Cc: PRC, GCC-help

Hi Tom,

> Presumably GetByte() would be responsible for ensuring its range is
> limited to 0..255 or -128..127, so the &0xFF is not required.

GetByte returns a byte (a char), which does not specify whether the
range is 0..255 or -128..127, so the &0xFF is required.

> b < 200 would work just fine if GetByte() were spec'd properly.

GetByte is spec'd properly.

> It's more apt to think of "char" as a small integer type, not a
> "character type."  You could have a platform where char and int are
> the same size, for instance.

I have worked on a platform where char was 32-bit, and another platform
where char was 13-bit.  But that was quite a while ago.

It's more apt to think of char as holding character data, and of a byte
as holding a byte (not a small integer).  And the (b & 0xFF) converts
the byte into an int with a constrained range of 0..255.

> In two's complement it doesn't really matter, unless you multiply the
> type.  You'd get a signed multiplication instead of unsigned.  Which
> probably won't matter, but it's good to be explicit.  Also shifting
> works differently.  Unsigned right shifts fill with zeros.  Signed
> shifts may fill with zeros OR the sign bit.

The (b & 0xFF) will work correctly on 1's complement machines and 2's
complement machines.

If the desire is to have a byte (the typedef'd identifier) represent an
octet, it will also work correctly on machines with greater than 8-bit
bytes.  (In which case it may be more appropriate to use typedef char
octet; as the identifier.)

On platforms where a byte is less than 8-bit, and the desire is for a
byte to represent an octet, a char is not sufficient to hold an octet.

On all the platforms I work on these days, a byte is 8-bit.  I don't
expect that to change any time soon.

> It's IMO a better idea to use the unsigned type (that's why it exists)
> and write your functions so that their domains and co-domains are well
> understood.  If you can't figure out what the inputs to a function
> should be, it's clearly not documented clearly enough. ;-)

For a type that holds small integers, an unsigned char (or uint8_t from
<stdint.h>) is appropriate.

For a type that holds a byte, a typedef char byte; without regard to
signedness or unsignedness is appropriate.

For conversion of a byte to an unsigned char (or uint8_t), which is a
small integer, a (b & 0xFF) is appropriate.

IMO.  YMMV.

Sincerely,
--Eljay
Thread overview: 5+ messages
2008-03-05  9:20 how gcc thinks `char' as signed char or unsigned char ? PRC
2008-03-05  9:58 ` Andrew Haley
2008-03-05 13:08 ` John Love-Jensen
2008-03-05 13:32 ` Tom St Denis
2008-03-05 14:11 ` John Love-Jensen