* RE: Crazy compiler optimization
@ 2013-10-09 9:36 vijay nag
2013-10-09 9:54 ` Jonathan Wakely
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: vijay nag @ 2013-10-09 9:36 UTC (permalink / raw)
To: gcc-help
Hello GCC,
I'm facing a weird compiler optimization problem. Consider the code
snippet below:
#include <stdio.h>

int printChar(unsigned long cur_col, unsigned char c)
{
    char buf[256];
    char* bufp = buf;
    char cnt = sizeof(buf) - 2; /* overflow in implicit type conversion */
    unsigned long terminal_width = 500;

    while ((cur_col++ < terminal_width) && cnt) {
        *bufp++ = c;
        cnt--;
    }

    *bufp++ = '\n';
    *bufp = 0;

    printf("%c\n", buf[0]);
    return 1;
}

int main()
{
    printChar(80, '-');
    return 1;
}
While compiler optimization should guarantee that the result of
execution is the same at all optimization levels, I'm observing a
difference in the result of the above program at different
optimization levels. There is admittedly a fundamental problem with
the statement "char cnt = sizeof(buf) - 2", but GCC only warns about
the overflow (and only when the -pedantic flag is used) while silently
discarding all code related to cnt, i.e. the check "&& cnt" in the
while condition is silently removed by the compiler at -O2.
$]gcc -g char.c -o char.c.unoptimized -m32 -O0 -Wall -Wextra -pedantic
char.c: In function ‘printChar’:
char.c:8: warning: overflow in implicit constant conversion
$]gcc -g char.c -o char.c.optimized -m32 -O2 -Wall -Wextra -pedantic
char.c: In function ‘printChar’:
char.c:8: warning: overflow in implicit constant conversion
$]./char.c.unoptimized
-
$]./char.c.optimized
-
Segmentation fault (core dumped)
Basically, the crash happens because the check "&& cnt" in the while
condition is eliminated, which causes a stack overrun and thereby
SIGSEGV. While the standards may say that the behaviour is undefined
when an unsigned value is stored in a signed variable, can a language
lawyer explain to me why GCC chose to treat the code pertaining to cnt
as dead code and eliminate it?
Below is the objdump -S output of optimized and unoptimized binaries.
A) Optimized Binary
int printChar(unsigned long cur_col, unsigned char c)
{
80483b0: 55 push %ebp
80483b1: 89 e5 mov %esp,%ebp
80483b3: 81 ec 08 01 00 00 sub $0x108,%esp
80483b9: 8b 45 08 mov 0x8(%ebp),%eax
char buf[256];
char* bufp = buf;
char cnt = sizeof(buf) - 2;
unsigned long terminal_width = 500;
while ((cur_col++ < terminal_width) && cnt) {
80483bc: 8d 8d 00 ff ff ff lea 0xffffff00(%ebp),%ecx
80483c2: 8b 55 0c mov 0xc(%ebp),%edx
80483c5: 3d f3 01 00 00 cmp $0x1f3,%eax
80483ca: 77 18 ja 80483e4 <printChar+0x34>
80483cc: 83 c0 01 add $0x1,%eax
80483cf: 8d 8d 00 ff ff ff lea 0xffffff00(%ebp),%ecx
80483d5: 83 c0 01 add $0x1,%eax
*bufp++ = c;
80483d8: 88 11 mov %dl,(%ecx)
80483da: 83 c1 01 add $0x1,%ecx
80483dd: 3d f5 01 00 00 cmp $0x1f5,%eax
80483e2: 75 f1 jne 80483d5 <printChar+0x25>
cnt--;
}
*bufp++ = '\n';
80483e4: c6 01 0a movb $0xa,(%ecx)
*bufp = 0;
80483e7: c6 41 01 00 movb $0x0,0x1(%ecx)
printf("%c\n", buf[0]);
80483eb: 0f be 85 00 ff ff ff movsbl 0xffffff00(%ebp),%eax
80483f2: c7 04 24 20 85 04 08 movl $0x8048520,(%esp)
80483f9: 89 44 24 04 mov %eax,0x4(%esp)
80483fd: e8 b6 fe ff ff call 80482b8 <printf@plt>
return 1;
}
8048402: b8 01 00 00 00 mov $0x1,%eax
8048407: c9 leave
8048408: c3 ret
8048409: 8d b4 26 00 00 00 00 lea 0x0(%esi),%esi
B) Unoptimized binary
int printChar(unsigned long cur_col, unsigned char c)
{
80483a4: 55 push %ebp
80483a5: 89 e5 mov %esp,%ebp
80483a7: 81 ec 28 01 00 00 sub $0x128,%esp
80483ad: 8b 45 0c mov 0xc(%ebp),%eax
80483b0: 88 85 ec fe ff ff mov %al,0xfffffeec(%ebp)
char buf[256];
char* bufp = buf;
80483b6: 8d 85 f4 fe ff ff lea 0xfffffef4(%ebp),%eax
80483bc: 89 45 f4 mov %eax,0xfffffff4(%ebp)
char cnt = sizeof(buf) - 2;
80483bf: c6 45 fb fe movb $0xfe,0xfffffffb(%ebp)
unsigned long terminal_width = 500;
80483c3: c7 45 fc f4 01 00 00 movl $0x1f4,0xfffffffc(%ebp)
while ((cur_col++ < terminal_width) && cnt) {
80483ca: eb 14 jmp 80483e0 <printChar+0x3c>
*bufp++ = c;
80483cc: 0f b6 95 ec fe ff ff movzbl 0xfffffeec(%ebp),%edx
80483d3: 8b 45 f4 mov 0xfffffff4(%ebp),%eax
80483d6: 88 10 mov %dl,(%eax)
80483d8: 83 45 f4 01 addl $0x1,0xfffffff4(%ebp)
cnt--;
80483dc: 80 6d fb 01 subb $0x1,0xfffffffb(%ebp)
80483e0: 8b 45 08 mov 0x8(%ebp),%eax
80483e3: 3b 45 fc cmp 0xfffffffc(%ebp),%eax
80483e6: 0f 92 c0 setb %al
80483e9: 83 45 08 01 addl $0x1,0x8(%ebp)
80483ed: 83 f0 01 xor $0x1,%eax
80483f0: 84 c0 test %al,%al
80483f2: 75 06 jne 80483fa <printChar+0x56>
80483f4: 80 7d fb 00 cmpb $0x0,0xfffffffb(%ebp)
80483f8: 75 d2 jne 80483cc <printChar+0x28>
}
*bufp++ = '\n';
80483fa: 8b 45 f4 mov 0xfffffff4(%ebp),%eax
80483fd: c6 00 0a movb $0xa,(%eax)
8048400: 83 45 f4 01 addl $0x1,0xfffffff4(%ebp)
*bufp = 0;
8048404: 8b 45 f4 mov 0xfffffff4(%ebp),%eax
8048407: c6 00 00 movb $0x0,(%eax)
printf("%c\n", buf[0]);
804840a: 0f b6 85 f4 fe ff ff movzbl 0xfffffef4(%ebp),%eax
8048411: 0f be c0 movsbl %al,%eax
8048414: 89 44 24 04 mov %eax,0x4(%esp)
8048418: c7 04 24 30 85 04 08 movl $0x8048530,(%esp)
804841f: e8 94 fe ff ff call 80482b8 <printf@plt>
return 1;
8048424: b8 01 00 00 00 mov $0x1,%eax
}
8048429: c9 leave
804842a: c3 ret
g++ -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-libgcj-multifile
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --disable-plugin
--with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre
--with-cpu=generic --host=x86_64-redhat-linux
Thread model: posix
gcc version 4.1.2 20080704 (Red Hat 4.1.2-52)
* Re: Crazy compiler optimization
2013-10-09 9:36 Crazy compiler optimization vijay nag
@ 2013-10-09 9:54 ` Jonathan Wakely
2013-10-09 10:02 ` vijay nag
2013-10-09 10:18 ` Nicholas Mc Guire
2013-10-09 17:48 ` Ian Lance Taylor
2 siblings, 1 reply; 7+ messages in thread
From: Jonathan Wakely @ 2013-10-09 9:54 UTC (permalink / raw)
To: vijay nag; +Cc: gcc-help
On 9 October 2013 10:36, vijay nag wrote:
> Hello GCC,
>
> I'm facing a weird compiler optimization problem. Consider the code
> snippet below
>
> #include <stdio.h>
>
> int printChar(unsigned long cur_col, unsigned char c)
> {
> char buf[256];
> char* bufp = buf;
> char cnt = sizeof(buf) - 2; /* overflow in implicit type conversion */
> unsigned long terminal_width = 500;
>
> while ((cur_col++ < terminal_width) && cnt) {
> *bufp++ = c;
> cnt--;
> }
> Basically the crash here is because of elimination of the check in the
> if-clause "&& cnt" which is causing stack overrun and thereby SIGSEGV.
> While standards may say that the behaviour is
> undefined when an unsigned value is stored in a signed value,
Standards do not say that. 254 cannot be represented in a char if char
is a signed type, so it's an overflow, which is undefined behaviour.
Storing an unsigned value that doesn't overflow is OK.
> can a
> language lawyer explain to me why GCC chose to eliminate code
> pertaining to cnt considering it as dead-code ?
cnt is initialized to -2 (after an overflow) and then you decrement it
so it gets more negative. The "&& cnt" condition will never be false,
because cnt starts non-zero and gets further from zero, so will never
reach zero.
* Re: Crazy compiler optimization
2013-10-09 9:54 ` Jonathan Wakely
@ 2013-10-09 10:02 ` vijay nag
2013-10-09 10:16 ` Jonathan Wakely
2013-10-09 15:40 ` David Brown
0 siblings, 2 replies; 7+ messages in thread
From: vijay nag @ 2013-10-09 10:02 UTC (permalink / raw)
To: Jonathan Wakely; +Cc: gcc-help
On Wed, Oct 9, 2013 at 3:24 PM, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> On 9 October 2013 10:36, vijay nag wrote:
>> Hello GCC,
>>
>> I'm facing a weird compiler optimization problem. Consider the code
>> snippet below
>>
>> #include <stdio.h>
>>
>> int printChar(unsigned long cur_col, unsigned char c)
>> {
>> char buf[256];
>> char* bufp = buf;
>> char cnt = sizeof(buf) - 2; /* overflow in implicit type conversion */
>> unsigned long terminal_width = 500;
>>
>> while ((cur_col++ < terminal_width) && cnt) {
>> *bufp++ = c;
>> cnt--;
>> }
>
>
>> Basically the crash here is because of elimination of the check in the
>> if-clause "&& cnt" which is causing stack overrun and thereby SIGSEGV.
>> While standards may say that the behaviour is
>> undefined when an unsigned value is stored in a signed value,
>
> Standards do not say that. 254 cannot be represented in a char if char
> is a signed type, so it's an overflow, which is undefined behaviour.
> Storing an unsigned value that doesn't overflow is OK.
>
>> can a
>> language lawyer explain to me why GCC chose to eliminate code
>> pertaining to cnt considering it as dead-code ?
>
> cnt is initialized to -2 (after an overflow) and then you decrement it
> so it gets more negative. The "&& cnt" condition will never be false,
> because cnt starts non-zero and gets further from zero, so will never
> reach zero.
Alright, that is perfectly valid behaviour. Why does the compiler
treat it as an unsigned type at optimization level zero? I.e., I see a
wrap-around from -128 to 128.
* Re: Crazy compiler optimization
2013-10-09 10:02 ` vijay nag
@ 2013-10-09 10:16 ` Jonathan Wakely
2013-10-09 15:40 ` David Brown
1 sibling, 0 replies; 7+ messages in thread
From: Jonathan Wakely @ 2013-10-09 10:16 UTC (permalink / raw)
To: vijay nag; +Cc: gcc-help
On 9 October 2013 11:02, vijay nag wrote:
> Alright that is perfectly valid behaviour. Why does compiler consider
> it to be a unsigned type at optimization level zero ?
It doesn't.
> i.e. I see a
> wrap around after
> -128 to 128 ?
Because undefined behaviour.
* Re: Crazy compiler optimization
2013-10-09 9:36 Crazy compiler optimization vijay nag
2013-10-09 9:54 ` Jonathan Wakely
@ 2013-10-09 10:18 ` Nicholas Mc Guire
2013-10-09 17:48 ` Ian Lance Taylor
2 siblings, 0 replies; 7+ messages in thread
From: Nicholas Mc Guire @ 2013-10-09 10:18 UTC (permalink / raw)
To: vijay nag; +Cc: gcc-help
On Wed, 09 Oct 2013, vijay nag wrote:
> Hello GCC,
>
> I'm facing a weird compiler optimization problem. Consider the code
> snippet below
>
> #include <stdio.h>
>
> int printChar(unsigned long cur_col, unsigned char c)
> {
> char buf[256];
> char* bufp = buf;
> char cnt = sizeof(buf) - 2; /* overflow in implicit type conversion */
> unsigned long terminal_width = 500;
>
> while ((cur_col++ < terminal_width) && cnt) {
> *bufp++ = c;
> cnt--;
> }
>
> *bufp++ = '\n';
> *bufp = 0;
>
> printf("%c\n", buf[0]);
> return 1;
> }
>
> int main()
> {
> printChar(80, '-');
> return 1;
> }
>
> While compiler optimization should guarantee that the result of
> execution is same at all optimization levels, I'm observing difference
> in the result of execution of the above program when optimized to
> different levels. Although there is fundamental problem with the
> statement "char cnt = sizeof(buf) - 2", GCC seems to be warning(that
> too only when -pedantic flag is used) about overflow error while
> silently discarding any code related to cnt i.e. the check "&& cnt" in
> the if-clause is silently discarded by the compiler at -O2.
>
> $]gcc -g char.c -o char.c.unoptimized -m32 -O0 -Wall -Wextra -pedantic
> char.c: In function ‘printChar’:
> char.c:8: warning: overflow in implicit constant conversion
>
This dependence on the optimization level shows up in quite a few code
examples that violate the C standard.
Integer overflow/underflow results in undefined behavior. You are in the
wild lands, basically, and you should not expect C-standard violations to
result in "reliably undefined" code.
See C99 Annex J.2 for details of undefined behaviors.
thx!
hofrat
* Re: Crazy compiler optimization
2013-10-09 10:02 ` vijay nag
2013-10-09 10:16 ` Jonathan Wakely
@ 2013-10-09 15:40 ` David Brown
1 sibling, 0 replies; 7+ messages in thread
From: David Brown @ 2013-10-09 15:40 UTC (permalink / raw)
To: vijay nag; +Cc: Jonathan Wakely, gcc-help
On 09/10/13 12:02, vijay nag wrote:
> On Wed, Oct 9, 2013 at 3:24 PM, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>> On 9 October 2013 10:36, vijay nag wrote:
>>> Hello GCC,
>>>
>>> I'm facing a weird compiler optimization problem. Consider the code
>>> snippet below
>>>
>>> #include <stdio.h>
>>>
>>> int printChar(unsigned long cur_col, unsigned char c)
>>> {
>>> char buf[256];
>>> char* bufp = buf;
>>> char cnt = sizeof(buf) - 2; /* overflow in implicit type conversion */
>>> unsigned long terminal_width = 500;
>>>
>>> while ((cur_col++ < terminal_width) && cnt) {
>>> *bufp++ = c;
>>> cnt--;
>>> }
>>
>>
>>> Basically the crash here is because of elimination of the check in the
>>> if-clause "&& cnt" which is causing stack overrun and thereby SIGSEGV.
>>> While standards may say that the behaviour is
>>> undefined when an unsigned value is stored in a signed value,
>>
>> Standards do not say that. 254 cannot be represented in a char if char
>> is a signed type, so it's an overflow, which is undefined behaviour.
>> Storing an unsigned value that doesn't overflow is OK.
>>
>>> can a
>>> language lawyer explain to me why GCC chose to eliminate code
>>> pertaining to cnt considering it as dead-code ?
>>
>> cnt is initialized to -2 (after an overflow) and then you decrement it
>> so it gets more negative. The "&& cnt" condition will never be false,
>> because cnt starts non-zero and gets further from zero, so will never
>> reach zero.
>
> Alright that is perfectly valid behaviour. Why does compiler consider
> it to be a unsigned type at optimization level zero ? i.e. I see a
> wrap around after
> -128 to 128 ?
>
Without optimisation, the compiler generates simpler code without trying
to save time and space. Thus it actually generates code to do the tests
and the decrement operation, rather than spending the effort "thinking"
about whether or not they are necessary. It turns out that on your
system (and almost all modern systems), the simplistic machine code thus
generated has the effect you were looking for.
It's easy to correct the code by picking a more appropriate type for "cnt".
Incidentally, you should never assume that plain "char" is signed or
unsigned. It varies between platforms, and can be changed by compiler
flags. If you mean a "signed char", call it "signed char". Far
better, of course, is to use <stdint.h> and call it "int8_t" or
"uint8_t" as appropriate, if you really want an 8-bit integer.
* Re: Crazy compiler optimization
2013-10-09 9:36 Crazy compiler optimization vijay nag
2013-10-09 9:54 ` Jonathan Wakely
2013-10-09 10:18 ` Nicholas Mc Guire
@ 2013-10-09 17:48 ` Ian Lance Taylor
2 siblings, 0 replies; 7+ messages in thread
From: Ian Lance Taylor @ 2013-10-09 17:48 UTC (permalink / raw)
To: vijay nag; +Cc: gcc-help
On Wed, Oct 9, 2013 at 2:36 AM, vijay nag <vijunag@gmail.com> wrote:
>
> While compiler optimization should guarantee that the result of
> execution is same at all optimization levels,
That guarantee only holds for programs that fully conform to the
language standard and avoid all cases of undefined behaviour.
Ian