detailed comparison of generated code size for GCC and other compilers

* detailed comparison of generated code size for GCC and other compilers
@ 2009-12-14 15:35 John Regehr
  2009-12-14 16:15 ` Andi Kleen
  0 siblings, 1 reply; 23+ messages in thread
From: John Regehr @ 2009-12-14 15:35 UTC (permalink / raw)
  To: gcc

See here:

   http://embed.cs.utah.edu/embarrassing/

There is a lot of data there.  Please excuse bugs and other problems. 
Feedback would be appreciated.

John Regehr

* Re: detailed comparison of generated code size for GCC and other compilers
  2009-12-14 15:35 detailed comparison of generated code size for GCC and other compilers John Regehr
@ 2009-12-14 16:15 ` Andi Kleen
  2009-12-14 16:31   ` John Regehr
  0 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2009-12-14 16:15 UTC (permalink / raw)
  To: John Regehr; +Cc: gcc

John Regehr <regehr@cs.utah.edu> writes:

> See here:
>
>   http://embed.cs.utah.edu/embarrassing/
>
> There is a lot of data there.  Please excuse bugs and other
> problems. Feedback would be appreciated.
>

I was a bit surprised by the icc results, because traditionally icc doesn't
have a reputation for small code size (and in my own testing
icc output is usually larger). So I took a look at some of these.

Some of the test cases seem very broken. For example the first 1000%
entry here

http://embed.cs.utah.edu/embarrassing/dec_09/harvest/gcc-head_icc-11.1/

int
fetchBlock (INDATA * in, char *where, int many)
{
  int copy;
  int advance;
  int i;
  int tmp;

  {
    advance = 1;
    i = 0;
    while (i < copy)

"copy" is clearly uninitialized. So how can this function ever have
worked? Depending on what's on the stack or in a register, it'll corrupt
random memory.

icc wins because it assumes uninitialized variables are 0 and optimizes
away everything.

I wonder if the original program was already broken or was this 
something your conversion introduced?

It might be a good idea to check for uninitialized variable warnings
and remove completely broken examples (unfortunately the gcc uninitialized
variable warnings are not 100% accurate).

Looking further down the table a lot of the differences on
empty-after-optimization functions (lots of 5 vs 2 bytes) seem to be
that gcc-head uses frame pointers and the other compiler
doesn't. Clearly for a fair comparison these settings should be the
same.

-Andi

* Re: detailed comparison of generated code size for GCC and other  compilers
  2009-12-14 16:15 ` Andi Kleen
@ 2009-12-14 16:31   ` John Regehr
  2009-12-14 17:17     ` Andi Kleen
                       ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: John Regehr @ 2009-12-14 16:31 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc

> I wonder if the original program was already broken or was this
> something your conversion introduced?

Not sure about this specific case but I'm sure there's some of each.

I also noticed these testcases but decided to leave them in for now. 
Obviously the code is useless, but it can still be interpreted according 
to the C standard, and code can be generated.  Once you start going down 
the road of exploiting undefined behavior to create better code -- and gcc 
already does this pretty aggressively -- why not keep going?

That said, if there's a clear sentiment that this kind of test case is 
undesirable, I'll make an effort to get rid of these for subsequent runs. 
The bottom line is that these results are supposed to provide you folks 
with useful directions for improvement.

> Looking further down the table a lot of the differences on 
> empty-after-optimization functions (lots of 5 vs 2 bytes) seem to be 
> that gcc-head uses frame pointers and the other compiler doesn't. 
> Clearly for a fair comparison these settings should be the same.

I wanted to avoid playing flag games and go with -Os (or nearest 
equivalent) for all compilers.  Maybe that isn't right.

John Regehr

* Re: detailed comparison of generated code size for GCC and other  compilers
  2009-12-14 16:31   ` John Regehr
@ 2009-12-14 17:17     ` Andi Kleen
  2009-12-14 17:46       ` Daniel Jacobowitz
  2009-12-14 20:31       ` John Regehr
  2009-12-14 17:49     ` Steven Bosscher
  2009-12-15  6:30     ` Robert Dewar
  2 siblings, 2 replies; 23+ messages in thread
From: Andi Kleen @ 2009-12-14 17:17 UTC (permalink / raw)
  To: John Regehr; +Cc: Andi Kleen, gcc

On Mon, Dec 14, 2009 at 09:30:57AM -0700, John Regehr wrote:
> I also noticed these testcases but decided to leave them in for now. 
> Obviously the code is useless, but it can still be interpreted according to 
> the C standard, and code can be generated.  Once you start going down the 
> road of exploiting undefined behavior to create better code -- and gcc 
> already does this pretty aggressively -- why not keep going?

I'm not sure relying on uninitialized variables for optimization is 
a good idea.

> That said, if there's a clear sentiment that this kind of test case is 
> undesirable, I'll make an effort to get rid of these for subsequent runs. 
> The bottom line is that these results are supposed to provide you folks 
> with useful directions for improvement.

I personally feel that test cases that get optimized away are not 
very interesting.

>
>> Looking further down the table a lot of the differences on 
>> empty-after-optimization functions (lots of 5 vs 2 bytes) seem to be that 
>> gcc-head uses frame pointers and the other compiler doesn't. Clearly for a 
>> fair comparison these settings should be the same.
>
> I wanted to avoid playing flag games and go with -Os (or nearest 
> equivalent) for all compilers.  Maybe that isn't right.

At least for small functions this skews the results badly, I think.
On larger functions it would be less of a problem.

On my gccs on linux actually no frame pointer is the default, but 
that might depend on the actual target configuration it was built for.
I think that default was changed at some point and it also
depends on the OS.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

* Re: detailed comparison of generated code size for GCC and other   compilers
  2009-12-14 17:17     ` Andi Kleen
@ 2009-12-14 17:46       ` Daniel Jacobowitz
  2009-12-14 20:36         ` John Regehr
  2009-12-14 20:31       ` John Regehr
  1 sibling, 1 reply; 23+ messages in thread
From: Daniel Jacobowitz @ 2009-12-14 17:46 UTC (permalink / raw)
  To: Andi Kleen; +Cc: John Regehr, gcc

On Mon, Dec 14, 2009 at 06:17:45PM +0100, Andi Kleen wrote:
> I personally feel that test cases that get optimized away are not 
> very interesting.

Actually, I think they're very interesting - especially if they are
valid code, and one compiler optimizes them away, but the other
doesn't.  You may have heard of a commercial testsuite built on this
principle :-)

-- 
Daniel Jacobowitz
CodeSourcery

* Re: detailed comparison of generated code size for GCC and other   compilers
  2009-12-14 17:46       ` Daniel Jacobowitz
@ 2009-12-14 20:36         ` John Regehr
  2009-12-14 21:46           ` Joe Buck
  0 siblings, 1 reply; 23+ messages in thread
From: John Regehr @ 2009-12-14 20:36 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Andi Kleen, gcc

My opinion is that code containing undefined behaviors is definitely 
interesting, but probably it is interesting in a different way than 
functions that are more meaningful.

If I have time I'll just separate out the testcases into two groups: one 
containing functions that are more or less sensible code, the other 
containing functions that can be automatically categorized as bogus.

Thanks,

John Regehr

> Actually, I think they're very interesting - especially if they are
> valid code, and one compiler optimizes them away, but the other
> doesn't.  You may have heard of a commercial testsuite built on this
> principle :-)
>
> --
> Daniel Jacobowitz
> CodeSourcery
>

* Re: detailed comparison of generated code size for GCC and other compilers
  2009-12-14 20:36         ` John Regehr
@ 2009-12-14 21:46           ` Joe Buck
  2009-12-14 21:53             ` John Regehr
  0 siblings, 1 reply; 23+ messages in thread
From: Joe Buck @ 2009-12-14 21:46 UTC (permalink / raw)
  To: John Regehr; +Cc: Daniel Jacobowitz, Andi Kleen, gcc

On Mon, Dec 14, 2009 at 12:36:00PM -0800, John Regehr wrote:
> My opinion is that code containing undefined behaviors is definitely 
> interesting, but probably it is interesting in a different way than 
> functions that are more meaningful.

Optimizations based on uninitialized variables make me very nervous.
If uninitialized memory reads are transformed into don't-cares, then
checking tools like valgrind will no longer see the UMR (assuming that
the lack of initialization is a bug).

Did I understand that icc does this?  It seems like a dangerous practice.

* Re: detailed comparison of generated code size for GCC and other  compilers
  2009-12-14 21:46           ` Joe Buck
@ 2009-12-14 21:53             ` John Regehr
  2009-12-14 22:06               ` Joe Buck
  2009-12-15  6:33               ` Robert Dewar
  0 siblings, 2 replies; 23+ messages in thread
From: John Regehr @ 2009-12-14 21:53 UTC (permalink / raw)
  To: Joe Buck; +Cc: Daniel Jacobowitz, Andi Kleen, gcc

> Optimizations based on uninitialized variables make me very nervous.
> If uninitialized memory reads are transformed into don't-cares, then
> checking tools like valgrind will no longer see the UMR (assuming that
> the lack of initialization is a bug).
>
> Did I understand that icc does this?  It seems like a dangerous practice.

Yes, it looks like icc does this.  But so does gcc, see below.  There is 
no "add" in the generated code.

John Regehr


[regehr@babel ~]$ cat undef.c
int foo (int x)
{
   int y;
   return x+y;
}
[regehr@babel ~]$ current-gcc -O3 -S -o - undef.c -fomit-frame-pointer
         .file   "undef.c"
         .text
         .p2align 4,,15
.globl foo
         .type   foo, @function
foo:
         movl    4(%esp), %eax
         ret
         .size   foo, .-foo
         .ident  "GCC: (GNU) 4.5.0 20091117 (experimental)"
         .section        .note.GNU-stack,"",@progbits

* Re: detailed comparison of generated code size for GCC and other compilers
  2009-12-14 21:53             ` John Regehr
@ 2009-12-14 22:06               ` Joe Buck
  2009-12-15  0:38                 ` John Regehr
  2009-12-15  6:33               ` Robert Dewar
  1 sibling, 1 reply; 23+ messages in thread
From: Joe Buck @ 2009-12-14 22:06 UTC (permalink / raw)
  To: John Regehr; +Cc: Daniel Jacobowitz, Andi Kleen, gcc

On Mon, Dec 14, 2009 at 01:53:30PM -0800, John Regehr wrote:
> > Optimizations based on uninitialized variables make me very nervous.
> > If uninitialized memory reads are transformed into don't-cares, then
> > checking tools like valgrind will no longer see the UMR (assuming that
> > the lack of initialization is a bug).
> >
> > Did I understand that icc does this?  It seems like a dangerous practice.
> 
> Yes, it looks like icc does this.  But so does gcc, see below.  There is 
> no "add" in the generated code.
> 
> John Regehr
> 
> 
> [regehr@babel ~]$ cat undef.c
> int foo (int x)
> {
>    int y;
>    return x+y;
> }

I'm less concerned about cases like this, because the compiler will
issue a warning for the uninitialized variable (if -Wall is included).

I would only be worried for cases where no warning is issued *and*
uninitialized accesses are eliminated.

* Re: detailed comparison of generated code size for GCC and other  compilers
  2009-12-14 22:06               ` Joe Buck
@ 2009-12-15  0:38                 ` John Regehr
  2009-12-15  6:35                   ` Robert Dewar
                                     ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: John Regehr @ 2009-12-15  0:38 UTC (permalink / raw)
  To: Joe Buck; +Cc: gcc

> I would only be worried for cases where no warning is issued *and*
> uninitialized accesses are eliminated.

Yeah, it would be excellent if GCC maintained the invariant that for all 
uses of uninitialized storage, either the compiler or else valgrind will 
issue a warning.

We could test for violations of this.  Several times I've thought about 
cross-testing various compilers and versions of compilers for consistency 
of warnings.  But I never managed to convince myself that developers would 
care enough to make it worth the trouble.

John

* Re: detailed comparison of generated code size for GCC and other   compilers
  2009-12-15  0:38                 ` John Regehr
@ 2009-12-15  6:35                   ` Robert Dewar
  2009-12-15 10:24                   ` Andi Kleen
  2009-12-15 18:56                   ` Andreas Schwab
  2 siblings, 0 replies; 23+ messages in thread
From: Robert Dewar @ 2009-12-15  6:35 UTC (permalink / raw)
  To: John Regehr; +Cc: Joe Buck, gcc

John Regehr wrote:
>> I would only be worried for cases where no warning is issued *and*
>> uninitialized accesses are eliminated.
> 
> Yeah, it would be excellent if GCC maintained the invariant that for all 
> uses of uninitialized storage, either the compiler or else valgrind will 
> issue a warning.

I find that reasonable at -O0, but an intolerable restriction if
optimization is active, since it would force inefficient code in
some cases.

BTW, the Ada front end has a very nice feature for dealing with
uninitialized storage. Pragma Initialize_Scalars forces everything
to be initialized, and you can change the initializing pattern at
link time or at run time with an environment variable. Then if the
program behavior changes when you change the initialization pattern,
you know something is wrong.
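
The same testing idea can be approximated in C, which has no equivalent
pragma. The following is only a sketch under stated assumptions:
sum_scratch, depends_on_fill and SCRATCH_SIZE are hypothetical names, and
the harness fills the storage explicitly with memset rather than relying on
any compiler feature. It runs a routine twice with two different fill
patterns and reports whether the result changes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCRATCH_SIZE 8   /* hypothetical size of the storage under test */

/* Hypothetical routine under test: writes scratch[0..n_written-1], then
   sums scratch[0..n_read-1]. If n_read > n_written it reads bytes that the
   routine never wrote, i.e. whatever pattern the harness put there. */
static unsigned sum_scratch(unsigned char *scratch,
                            size_t n_written, size_t n_read)
{
    unsigned total = 0;

    assert(n_written <= SCRATCH_SIZE && n_read <= SCRATCH_SIZE);
    for (size_t i = 0; i < n_written; i++)
        scratch[i] = (unsigned char)(i + 1);
    for (size_t i = 0; i < n_read; i++)
        total += scratch[i];
    return total;
}

/* Runs the routine with two different fill patterns. Returns 1 if the
   result depends on the pattern (a sign that unwritten storage is being
   read), 0 otherwise. */
int depends_on_fill(size_t n_written, size_t n_read)
{
    unsigned char scratch[SCRATCH_SIZE];
    unsigned a, b;

    memset(scratch, 0x00, sizeof scratch);
    a = sum_scratch(scratch, n_written, n_read);

    memset(scratch, 0xFF, sizeof scratch);
    b = sum_scratch(scratch, n_written, n_read);

    return a != b;
}
```

Here depends_on_fill(4, 4) reports no dependence, while
depends_on_fill(4, 6) does, because the last two bytes it reads were never
written by the routine.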
> 
> We could test for violations of this.  Several times I've thought about 
> cross-testing various compilers and versions of compilers for consistency 
> of warnings.  But I never managed to convince myself that developers would 
> care enough to make it worth the trouble.

It's impossible in practice to be 100% precise about when warnings
are issued and when they are not.
> 
> John

* Re: detailed comparison of generated code size for GCC and other  compilers
  2009-12-15  0:38                 ` John Regehr
  2009-12-15  6:35                   ` Robert Dewar
@ 2009-12-15 10:24                   ` Andi Kleen
  2009-12-15 11:46                     ` Mathieu Lacage
  2009-12-15 18:56                   ` Andreas Schwab
  2 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2009-12-15 10:24 UTC (permalink / raw)
  To: John Regehr; +Cc: Joe Buck, gcc

John Regehr <regehr@cs.utah.edu> writes:

>> I would only be worried for cases where no warning is issued *and*
>> uninitialized accesses are eliminated.
>
> Yeah, it would be excellent if GCC maintained the invariant that for
> all uses of uninitialized storage, either the compiler or else
> valgrind will issue a warning.

My understanding was that valgrind's detection of uninitialized
local variables is not 100% reliable because it cannot track
all updates of the frames (it's difficult to distinguish stack
reuse from uninitialized stack).

e.g. 

int f1() { int x; return x; } 
int f2() { int x; return x; } 

int main(void)
{
        f1();
        f2();
        return 0;
}

compiled without optimization so that the variables stay around
still gives no warning in valgrind:

==22573== Memcheck, a memory error detector
==22573== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==22573== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==22573== Command: ./a.out
==22573== 
==22573== 
==22573== HEAP SUMMARY:
==22573==     in use at exit: 0 bytes in 0 blocks
==22573==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==22573== 
==22573== All heap blocks were freed -- no leaks are possible
==22573== 
==22573== For counts of detected and suppressed errors, rerun with: -v
==22573== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 5 from 5)

On the other hand the compiler tends to warn too much for
uninitialized variables, typically because it cannot handle something
like this:

void f(int flag)
{
        int local;
        if (flag)
                ... initialize local ....
        ...

        if (flag)
                ... use local ....
}
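
A compilable version of this pattern might look like the sketch below.
guarded_sum and its parameters are hypothetical names chosen for
illustration. The variable total is written and read only on the use_data
path, so the program is well defined, yet proving that requires correlating
the two tests of the same flag. Whether a given compiler version warns here
depends on its analysis and optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sums data[0..n-1] when use_data is nonzero,
   otherwise returns -1 without ever touching total. */
int guarded_sum(int use_data, const int *data, size_t n)
{
    int total;                  /* deliberately not initialized here */

    if (use_data) {
        assert(data != NULL);   /* caller must supply data on this path */
        total = 0;
        for (size_t i = 0; i < n; i++)
            total += data[i];
    }

    /* ... unrelated work that does not modify use_data or total ... */

    if (use_data)
        return total;           /* read only on the path that wrote it */
    return -1;
}
```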

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

* Re: detailed comparison of generated code size for GCC and other   compilers
  2009-12-15 10:24                   ` Andi Kleen
@ 2009-12-15 11:46                     ` Mathieu Lacage
  2009-12-15 13:45                       ` Andi Kleen
  0 siblings, 1 reply; 23+ messages in thread
From: Mathieu Lacage @ 2009-12-15 11:46 UTC (permalink / raw)
  To: Andi Kleen; +Cc: John Regehr, Joe Buck, gcc

On Tue, 2009-12-15 at 11:24 +0100, Andi Kleen wrote:
> John Regehr <regehr@cs.utah.edu> writes:
> 
> >> I would only be worried for cases where no warning is issued *and*
> >> uninitialized accesses are eliminated.
> >
> > Yeah, it would be excellent if GCC maintained the invariant that for
> > all uses of uninitialized storage, either the compiler or else
> > valgrind will issue a warning.
> 
> My understanding was that valgrind's detection of uninitialized
> local variables is not 100% reliable because it cannot track
> all updates of the frames (it's difficult to distinguish stack
> reuse from uninitialized stack)

I am not a valgrind expert, so take the following with a grain of salt,
but I think that the above statement is wrong: valgrind reliably detects
use of uninitialized variables if you define 'use' as meaning 'affects
control flow of your program'.

i.e., try this:

[mlacage@diese ~]$ cat > test.c
int f(void)
{
int x;
return x;
}
int main (int argc, char *argv[])
{
if (f())
{
printf ("something\n"); 
}
return 0;
}
^C
[mlacage@diese ~]$ gcc ./test.c
./test.c: In function ‘main’:
./test.c:10: warning: incompatible implicit declaration of built-in
function ‘printf’
[mlacage@diese ~]$ valgrind ./a.out 
==18933== Memcheck, a memory error detector.
==18933== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et
al.
==18933== Using LibVEX rev 1804, a library for dynamic binary
translation.
==18933== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==18933== Using valgrind-3.3.0, a dynamic binary instrumentation
framework.
==18933== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et
al.
==18933== For more details, rerun with: -v
==18933== 
==18933== Conditional jump or move depends on uninitialised value(s)
==18933==    at 0x80483D7: main (in /home/mlacage/a.out)
something
==18933== 
==18933== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 12 from
1)
==18933== malloc/free: in use at exit: 0 bytes in 0 blocks.
==18933== malloc/free: 0 allocs, 0 frees, 0 bytes allocated.
==18933== For counts of detected errors, rerun with: -v
==18933== All heap blocks were freed -- no leaks are possible.
[mlacage@diese ~]$

* Re: detailed comparison of generated code size for GCC and other  compilers
  2009-12-15 11:46                     ` Mathieu Lacage
@ 2009-12-15 13:45                       ` Andi Kleen
  0 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2009-12-15 13:45 UTC (permalink / raw)
  To: Mathieu Lacage; +Cc: Andi Kleen, John Regehr, Joe Buck, gcc

> I am not a valgrind expert so, take the following with a grain of salt
> but I think that the above statement is wrong: valgrind reliably detects
> use of uninitialized variables if you define 'use' as meaning 'affects
> control flow of your program' in valgrind.

It works in some cases for the stack, but not in all. Consider the redzone
on the x86-64 ABI. How should valgrind distinguish an uninitialized redzone
variable from an initialized one if the stack has been used before? I didn't
even think it worked in all cases for variables in the real frame.

You're right, my example was bogus because it didn't test the control flow.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

* Re: detailed comparison of generated code size for GCC and other  compilers
  2009-12-15  0:38                 ` John Regehr
  2009-12-15  6:35                   ` Robert Dewar
  2009-12-15 10:24                   ` Andi Kleen
@ 2009-12-15 18:56                   ` Andreas Schwab
  2 siblings, 0 replies; 23+ messages in thread
From: Andreas Schwab @ 2009-12-15 18:56 UTC (permalink / raw)
  To: John Regehr; +Cc: Joe Buck, gcc

John Regehr <regehr@cs.utah.edu> writes:

>> I would only be worried for cases where no warning is issued *and*
>> uninitialized accesses are eliminated.
>
> Yeah, it would be excellent if GCC maintained the invariant that for all
> uses of uninitialized storage, either the compiler or else valgrind will
> issue a warning.

If GCC cannot prove that an object is uninitialized it cannot optimize
based on that assumption, meaning that the access is likely to happen
unless it is dead for other reasons.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: detailed comparison of generated code size for GCC and other   compilers
  2009-12-14 21:53             ` John Regehr
  2009-12-14 22:06               ` Joe Buck
@ 2009-12-15  6:33               ` Robert Dewar
  1 sibling, 0 replies; 23+ messages in thread
From: Robert Dewar @ 2009-12-15  6:33 UTC (permalink / raw)
  To: John Regehr; +Cc: Joe Buck, Daniel Jacobowitz, Andi Kleen, gcc

John Regehr wrote:
>> Optimizations based on uninitialized variables make me very nervous.
>> If uninitialized memory reads are transformed into don't-cares, then
>> checking tools like valgrind will no longer see the UMR (assuming that
>> the lack of initialization is a bug).

Well that's the way things are, you cannot count on any specific
behavior of the compiler if there are references to uninitialized
variables, and if valgrind depends on such specific behavior, it
is wrong to do so. Now of course in practice valgrind HAS to rely
on this behavior, but realistically, it cannot be expected to be
reliable if optimization is turned on.
>>
>> Did I understand that icc does this?  It seems like a dangerous practice.

On the contrary, to me, trying to define undefined is what is dangerous!
Sure, in cases where there is a strong expectation of a particular 
behavior, and programs expect a certain behavior, you can have a debate,
but no program has deliberate use of references to uninitialized 
variables legitimately expecting some particular behavior.
> 
> Yes, it looks like icc does this.  But so does gcc, see below.  There is 
> no "add" in the generated code.
> 
> John Regehr
> 
> 
> [regehr@babel ~]$ cat undef.c
> int foo (int x)
> {
>    int y;
>    return x+y;
> }
> [regehr@babel ~]$ current-gcc -O3 -S -o - undef.c -fomit-frame-pointer
>          .file   "undef.c"
>          .text
>          .p2align 4,,15
> .globl foo
>          .type   foo, @function
> foo:
>          movl    4(%esp), %eax
>          ret
>          .size   foo, .-foo
>          .ident  "GCC: (GNU) 4.5.0 20091117 (experimental)"
>          .section        .note.GNU-stack,"",@progbits

* Re: detailed comparison of generated code size for GCC and other  compilers
  2009-12-14 17:17     ` Andi Kleen
  2009-12-14 17:46       ` Daniel Jacobowitz
@ 2009-12-14 20:31       ` John Regehr
  2009-12-15  8:28         ` Paolo Bonzini
  1 sibling, 1 reply; 23+ messages in thread
From: John Regehr @ 2009-12-14 20:31 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc

Ok, thanks for the feedback Andi.  Incidentally, the LLVM folks seem to 
agree with both of your suggestions.  I'll re-run everything w/o frame 
pointers and ignoring testcases where some compiler warns about use of 
uninitialized local.  I hate the way these warnings are not totally 
reliable, but realistically if GCC catches most cases (which it almost 
certainly will) the ones that slip past won't be too much of a problem.

No doubt there are plenty more improvements to make but hopefully this is 
a good start.

John Regehr

* Re: detailed comparison of generated code size for GCC and other   compilers
  2009-12-14 20:31       ` John Regehr
@ 2009-12-15  8:28         ` Paolo Bonzini
  2009-12-15  8:44           ` Chris Lattner
  0 siblings, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2009-12-15  8:28 UTC (permalink / raw)
  To: gcc

On 12/14/2009 09:31 PM, John Regehr wrote:
> Ok, thanks for the feedback Andi.  Incidentally, the LLVM folks seem to
> agree with both of your suggestions. I'll re-run everything w/o frame
> pointers and ignoring testcases where some compiler warns about use of
> uninitialized local. I hate the way these warnings are not totally
> reliable, but realistically if GCC catches most cases (which it almost
> certainly will) the ones that slip past won't be too much of a problem.

I also wonder if you have something like LTO enabled.  This function 
produces completely bogus code in LLVM, presumably because some kind of 
LTO proves that CC1000SendReceiveP is never written.  Of course, this 
assumption would be wrong at runtime in a real program.

http://embed.cs.utah.edu/embarrassing/src_harvested_dec_09/015306.c

Of course the answer is not to disable LTO, but rather to add an 
"initializer" function that does

volatile void *p;
memcpy (CC1000SendReceiveP__f, p, sizeof (CC1000SendReceiveP__f));
memcpy (CC1000SendReceiveP__count, p, sizeof (CC1000SendReceiveP__count));
memcpy (CC1000SendReceiveP__rxBuf, p, sizeof (CC1000SendReceiveP__rxBuf));

... and to make all variables non-static (otherwise the initializer 
would have to be in the same file, but that would perturb your results).

I also agree with others that the frame pointer default is special 
enough to warrant adding a special -f option to compilers that generate 
it, if some other compilers do not generate it.

Paolo

* Re: detailed comparison of generated code size for GCC and other compilers
  2009-12-15  8:28         ` Paolo Bonzini
@ 2009-12-15  8:44           ` Chris Lattner
  2009-12-15  9:00             ` Paolo Bonzini
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Lattner @ 2009-12-15  8:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc


On Dec 15, 2009, at 12:28 AM, Paolo Bonzini wrote:

> On 12/14/2009 09:31 PM, John Regehr wrote:
>> Ok, thanks for the feedback Andi.  Incidentally, the LLVM folks seem to
>> agree with both of your suggestions. I'll re-run everything w/o frame
>> pointers and ignoring testcases where some compiler warns about use of
>> uninitialized local. I hate the way these warnings are not totally
>> reliable, but realistically if GCC catches most cases (which it almost
>> certainly will) the ones that slip past won't be too much of a problem.
> 
> I also wonder if you have something like LTO enabled.

No, he doesn't enable LLVM LTO.  Even if it did, LTO wouldn't touch the 'CC1000SendReceiveP*' definitions because they are not static (unless he explicitly built with an export map).

I haven't analyzed what is going on in this example though.  The code is probably using some undefined behavior and getting zapped.

-Chris

>  This function produces completely bogus code in LLVM, presumably because some kind of LTO proves that CC1000SendReceiveP is never written.  Of course, this assumption would be wrong at runtime in a real program.
> 
> http://embed.cs.utah.edu/embarrassing/src_harvested_dec_09/015306.c
> 
> Of course the answer is not to disable LTO, but rather to add an "initializer" function that does
> 
> volatile void *p;
> memcpy (CC1000SendReceiveP__f, p, sizeof (CC1000SendReceiveP__f));
> memcpy (CC1000SendReceiveP__count, p, sizeof (CC1000SendReceiveP__count));
> memcpy (CC1000SendReceiveP__rxBuf, p, sizeof (CC1000SendReceiveP__rxBuf));
> 
> ... and to make all variables non-static (otherwise the initializer would have to be in the same file, but that would perturb your results).
> 
> I also agree with others that the frame pointer default is special enough to warrant adding a special -f option to compilers that generate it, if some other compilers do not generate it.
> 
> Paolo
> 

* Re: detailed comparison of generated code size for GCC and other  compilers
  2009-12-15  8:44           ` Chris Lattner
@ 2009-12-15  9:00             ` Paolo Bonzini
  2009-12-15 17:17               ` John Regehr
  0 siblings, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2009-12-15  9:00 UTC (permalink / raw)
  To: Chris Lattner; +Cc: gcc


>> I also wonder if you have something like LTO enabled.
>
> No, he doesn't enable LLVM LTO.  Even if it did, LTO wouldn't touch
> the 'CC1000SendReceiveP*' definitions because they are not static
> (unless he explicitly built with an export map).

Interesting.

> I haven't analyzed what is going on in this example though.  The
> code is probably using some undefined behavior and getting zapped.

This access is being eliminated:

  _cil_inline_tmp_23 =
    ((unsigned int)
     *((uint8_t const *) ((void const *) ((unsigned char *) 0U)) +
      1) << 8) | (unsigned int) *((uint8_t const *) ((void const *) 	

GCC generates

	movzbl	1, %eax
	movzbl	0, %edx
	sall	$8, %eax
	orl	%edx, %eax
	ret

so probably LLVM is eliminating NULL pointer accesses, or something like 
that.  This is the undefined behavior.
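
For comparison, the same shape of computation is well defined when the base
address is a valid object rather than address 0. This is a sketch that
assumes the truncated expression above combines the byte at offset 1
(shifted left by 8) with the byte at offset 0, as GCC's assembly suggests.
load_le16 is a hypothetical helper, not part of the harvested source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: builds a 16-bit little-endian value from the two
   bytes at base[0] and base[1]. Unlike the harvested code, the base
   address is a parameter that must point to at least two readable bytes. */
unsigned int load_le16(const uint8_t *base)
{
    assert(base != NULL);
    return ((unsigned int)base[1] << 8) | (unsigned int)base[0];
}
```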

Thanks for following up.

Paolo

* Re: detailed comparison of generated code size for GCC and other   compilers
  2009-12-15  9:00             ` Paolo Bonzini
@ 2009-12-15 17:17               ` John Regehr
  0 siblings, 0 replies; 23+ messages in thread
From: John Regehr @ 2009-12-15 17:17 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Chris Lattner, gcc

Also, we're not running LTO in any compiler and we removed all "static" 
declarations from the code to keep compilers from making closed-world 
assumptions.

John Regehr

* Re: detailed comparison of generated code size for GCC and other   compilers
  2009-12-14 16:31   ` John Regehr
  2009-12-14 17:17     ` Andi Kleen
@ 2009-12-14 17:49     ` Steven Bosscher
  2009-12-15  6:30     ` Robert Dewar
  2 siblings, 0 replies; 23+ messages in thread
From: Steven Bosscher @ 2009-12-14 17:49 UTC (permalink / raw)
  To: John Regehr; +Cc: Andi Kleen, gcc

On Mon, Dec 14, 2009 at 5:30 PM, John Regehr <regehr@cs.utah.edu> wrote:
>> I wonder if the original program was already broken or was this
>> something your conversion introduced?
>
> Not sure about this specific case but I'm sure there's some of each.
>
> I also noticed these testcases but decided to leave them in for now.
> Obviously the code is useless, but it can still be interpreted according to
> the C standard, and code can be generated.  Once you start going down the
> road of exploiting undefined behavior to create better code -- and gcc
> already does this pretty aggressively -- why not keep going?
>
> That said, if there's a clear sentiment that this kind of test case is
> undesirable, I'll make an effort to get rid of these for subsequent runs.

+1 for undesirable. Benchmarks are already always artificial, but
benchmarks of undefined code are not going to give useful comparisons.

Ciao!
Steven

* Re: detailed comparison of generated code size for GCC and other   compilers
  2009-12-14 16:31   ` John Regehr
  2009-12-14 17:17     ` Andi Kleen
  2009-12-14 17:49     ` Steven Bosscher
@ 2009-12-15  6:30     ` Robert Dewar
  2 siblings, 0 replies; 23+ messages in thread
From: Robert Dewar @ 2009-12-15  6:30 UTC (permalink / raw)
  To: John Regehr; +Cc: Andi Kleen, gcc

John Regehr wrote:
>> I wonder if the original program was already broken or was this
>> something your conversion introduced?
> 
> Not sure about this specific case but I'm sure there's some of each.
> 
> I also noticed these testcases but decided to leave them in for now. 
> Obviously the code is useless, but it can still be interpreted according 
> to the C standard, and code can be generated.  Once you start going down 
> the road of exploiting undefined behavior to create better code -- and gcc 
> already does this pretty aggressively -- why not keep going?

gcc does not "exploit undefined behavior", it simply takes advantage
of knowing that the program does not contain undefined behavior. If
this "knowledge" is wrong, it's OK, since the result is undefined
anyway.
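
A commonly cited illustration of this distinction (not taken from this
thread) is signed integer overflow. In the sketch below, next_is_greater is
a hypothetical function. For every input on which x + 1 does not overflow
the result is 1, so a compiler that assumes the program is free of
undefined behavior may fold the comparison to a constant. The assert
documents the precondition that makes that assumption true.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical example. Precondition: x < INT_MAX, so x + 1 cannot
   overflow. Under that precondition the comparison is always true, which
   is the fact an optimizer is allowed to rely on. */
int next_is_greater(int x)
{
    assert(x < INT_MAX);
    return x + 1 > x;
}
```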

But a test which actually contains undefined behavior is complete
nonsense and serves no purpose whatsoever. A C test suite is
supposed to contain C, not junk!
> 
> That said, if there's a clear sentiment that this kind of test case is 
> undesirable, I'll make an effort to get rid of these for subsequent runs. 
> The bottom line is that these results are supposed to provide you folks 
> with useful directions for improvement.
> 
>> Looking further down the table a lot of the differences on 
>> empty-after-optimization functions (lots of 5 vs 2 bytes) seem to be 
>> that gcc-head uses frame pointers and the other compiler doesn't. 
>> Clearly for a fair comparison these settings should be the same.
> 
> I wanted to avoid playing flag games and go with -Os (or nearest 
> equivalent) for all compilers.  Maybe that isn't right.
> 
> John Regehr
