Some optimization thoughts (and thanks!)

public inbox for gcc@gcc.gnu.org

* Some optimization thoughts (and thanks!)
@ 2001-05-01  8:09 At150bogomips
  2001-05-01  9:15 ` Carlo Wood
  2001-05-02  8:48 ` Scott A Crosby
  0 siblings, 2 replies; 11+ messages in thread
From: At150bogomips @ 2001-05-01  8:09 UTC (permalink / raw)
  To: gcc


Here are some optimization suggestions:

* Inlining of all functions with a single caller: This would presumably
  be done after short-function inlining (to allow wrapper functions to
  be inlined into their callers; such wrappers are likely to have condition
  checks and such which can be simplified if inlined into the more basic
  caller, but if the function that a wrapper calls is inlined first, the
  resulting function may be too large to inline for multiple basic
  callers).  This removes the performance penalty of breaking code into
  graspable portions.  Unlike regular inlining, this reduces code size
  (by removing call/return overhead).  This should not be that difficult
  to implement, nor add much to compile time, particularly if the
  call-tree for a program is retained between compilations and partially
  updated after source modification.

* Consider moving a static (loop-invariant) condition check outside of a
  simple loop.  Although branch prediction hardware would correctly predict
  the direction of the branch (possibly after one or two warm-up cycles),
  for a simple loop the extra instruction could be an excessive
  cost.  This would bloat code somewhat, but might be a notable win
  for simple enough, long loops.  (E.g., "for (x = 0; x < limit; x++)
  { if (direction==forward) { A[x] = A[x]<<1; } else
  { A[x] = A[x]>>1;} }" --OR-- "for (x = 0; x < limit; x++) { if
  ( (test1==true) && (A[x] > min_A) ) { A[x]--; } }")

* Provide vector-like operations using standard registers, if conditions
  allow such.  (E.g., summing two arrays of 16-bit unsigned integers can
  be done in pairs with 32-bit registers if overflow is known not
  to occur.)  64-bit machines would seem to offer even more benefit in
  this, allowing a pair of 16-bit multiplies and divides by a constant at
  the cost of some shifting and logical operations.  If gcc will include
  support for detecting when vector-like operations can be done, it
  might be useful to provide this optimization for chips that do not
  have special vector instructions.

* When branches are expensive, it might be reasonable to translate if
  statements with multiple conditions into a merged condition and a single
  branch rather than multiple branches.  E.g., "if ((A==N) && (B==M))"
  might be translated into "XOR Temp1 A N; XOR Temp2 B M; OR Temp1 Temp2
  Temp1; BNEZ Temp1 L3; #branch over conditional code" rather than "XOR
  Temp1 A N; BNEZ Temp1 L3; #branch over second condition test and
  conditional code\\ XOR Temp1 B M; BNEZ Temp1 L3;"  Removing a branch
  might increase modality of the branch and decrease branch history
  table aliasing, improving branch prediction.  Such prediction
  improvement might compensate for the overhead of performing both tests
  in all cases.  (This is what happened in a test case using a pair of
  comparisons of a random number and a constant; at least, performance
  improved on an x86-based system.)
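
The first loop in the second suggestion (moving a static condition check
out of a simple loop) can be written out as a small C sketch.  Compilers
usually call this transformation loop unswitching.  The function names are
illustrative only, the int parameter `forward` stands in for the
`direction==forward` test, and the array is assumed to hold unsigned values
so that both shifts are well defined.

```c
#include <stddef.h>

/* As written in the suggestion: the loop-invariant test on `forward`
 * is evaluated on every iteration. */
void shift_checked_in_loop(unsigned *a, size_t limit, int forward)
{
    for (size_t x = 0; x < limit; x++) {
        if (forward)
            a[x] = a[x] << 1;
        else
            a[x] = a[x] >> 1;
    }
}

/* Unswitched by hand: the test runs once and selects one of two
 * specialised loops, at the cost of duplicating the loop body. */
void shift_unswitched(unsigned *a, size_t limit, int forward)
{
    if (forward) {
        for (size_t x = 0; x < limit; x++)
            a[x] = a[x] << 1;
    } else {
        for (size_t x = 0; x < limit; x++)
            a[x] = a[x] >> 1;
    }
}
```

Both functions compute the same result; whether the second is measurably
faster depends on the target and on how long the loop runs.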

(I am disappointed that gcc does not surpass all other compilers in
optimization--gcc should be the best in everything (portability,
optimization, compile speed, standards conformance, etc.).  It is
particularly disappointing that gcc seems to miss some common
optimizations (e.g., loop arrangement to optimize memory access
patterns).  I am, of course, very happy that gcc is Free--a compiler is
a system component even if some vendors do not think so.)

Thank you for providing gcc!
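
For the vector-like suggestion above (summing 16-bit values in pairs with a
32-bit register), a minimal hand-written C sketch could look like the
following.  The function name is hypothetical, and the code relies on the
same assumption the suggestion states: no individual 16-bit sum overflows,
so no carry crosses from one lane into the other.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Adds b[] into a[], handling two 16-bit lanes per 32-bit addition.
 * Assumption: every a[i] + b[i] fits in 16 bits. */
void add_u16_pairs(uint16_t *a, const uint16_t *b, size_t n)
{
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        uint32_t va, vb;
        /* memcpy keeps the loads and stores free of alignment and
         * aliasing problems; compilers typically turn it into a
         * plain 32-bit move. */
        memcpy(&va, &a[i], sizeof va);
        memcpy(&vb, &b[i], sizeof vb);
        va += vb;                      /* one add updates both lanes */
        memcpy(&a[i], &va, sizeof va);
    }

    if (i < n)                         /* odd trailing element */
        a[i] = (uint16_t)(a[i] + b[i]);
}
```

Because the lanes never interact when no sum overflows, the result does not
depend on the byte order of the machine.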

* Re: Some optimization thoughts (and thanks!)
  2001-05-01  8:09 Some optimization thoughts (and thanks!) At150bogomips
@ 2001-05-01  9:15 ` Carlo Wood
  2001-05-01  9:35   ` Jean Francois Martinez
  2001-06-05 14:25   ` Joern Rennecke
  2001-05-02  8:48 ` Scott A Crosby
  1 sibling, 2 replies; 11+ messages in thread
From: Carlo Wood @ 2001-05-01  9:15 UTC (permalink / raw)
  To: At150bogomips; +Cc: gcc

On Tue, May 01, 2001 at 11:08:39AM -0400, At150bogomips@aol.com wrote:
> Here are some optimization suggestions:
> 
> * Inlining of all functions with a single caller: This would presumably
>   be done after short-function inlining (to allow wrapper functions to
>   be inlined into their callers, such wrappers are likely to have condition
>   checks and such which can be simplified if inlined into the more basic
>   caller; but if the function that a wrapper calls is inlined first, the
>   resulting function may be too large to inline for multiple basic
>   callers).  This removes the performance penalty of breaking code into
>   graspable portions.  Unlike regular inlining, this reduces code size
>   (by removing call/return overhead).  This should not be that difficult
>   to implement, nor add much to compile time, particularly if the
>   call-tree for a program is retained between compilings and partially
>   updated after source modification.

The compiler can't know how often a function is called, only the linker
can.  This would be possible for static functions, but I'd be highly
surprised if static functions with one caller weren't already inlined :/

[...]
> (I am disappointed that gcc does not surpass all other compilers in
> optimization--gcc should be the best in everything (portability,
> optimization, compile speed, standards conformance, etc.).  It is
> particularly disappointing that gcc seems to miss some common
> optimizations (e.g., loop arrangement to optimize memory access
> patterns).  I am, of course, very happy that gcc is Free--a compiler is
> a system component even if some vendors do not think so.)

I am not a developer of gcc, just a beta tester (or rather someone who
needs the compiler to work for his own work and therefore reports all
bugs he runs into) - but I understood that the priority of version 3.0
is to make the compiler conform 100% to the new standard; only after
it compiles everything will attention be turned to optimisation issues.
There are still many years to go :)

-- 
Carlo Wood <carlo@alinoe.com>

* Re: Some optimization thoughts (and thanks!)
  2001-05-01  9:15 ` Carlo Wood
@ 2001-05-01  9:35   ` Jean Francois Martinez
  2001-06-05 14:25   ` Joern Rennecke
  1 sibling, 0 replies; 11+ messages in thread
From: Jean Francois Martinez @ 2001-05-01  9:35 UTC (permalink / raw)
  To: Carlo Wood; +Cc: At150bogomips, gcc


Carlo Wood wrote:

> On Tue, May 01, 2001 at 11:08:39AM -0400, At150bogomips@aol.com wrote:
> > Here are some optimization suggestions:
> >
> > * Inlining of all functions with a single caller: This would presumably
> >   be done after short-function inlining (to allow wrapper functions to
> >   be inlined into their callers, such wrappers are likely to have condition
> >   checks and such which can be simplified if inlined into the more basic
> >   caller; but if the function that a wrapper calls is inlined first, the
> >   resulting function may be too large to inline for multiple basic
> >   callers).  This removes the performance penalty of breaking code into
> >   graspable portions.  Unlike regular inlining, this reduces code size
> >   (by removing call/return overhead).  This should not be that difficult
> >   to implement, nor add much to compile time, particularly if the
> >   call-tree for a program is retained between compilings and partially
> >   updated after source modification.
>
> The compiler can't know how often a function is called, only the linker
> can.  This would be possible for static functions, but I'd be highly
> surprised if static functions with one caller weren't already inlined :/
>

The linker knows whether a function is called from many places; it does not
know whether a function is called often.  Think of a cleanup function that
is called just before exit, after we have detected that something went wrong.
Only the programmer knows whether a function will be called often.  That is
why, when the programmer has already inlined functions by hand, using -O3
instead of -O2 will just make the program bigger (the compiler inlines some
"nearly dead" functions) and slower: there is no speed benefit from inlining
nearly dead functions, and you still pay a penalty for additional TLB and
cache misses.


                                                JFM


> [...]
> > (I am disappointed that gcc does not surpass all other compilers in
> > optimization--gcc should be the best in everything (portability,
> > optimization, compile speed, standards conformance, etc.).  It is
> > particularly disappointing that gcc seems to miss some common
> > optimizations (e.g., loop arrangement to optimize memory access
> > patterns).  I am, of course, very happy that gcc is Free--a compiler is
> > a system component even if some vendors do not think so.)
>
> I am not a developer of gcc, just a beta tester (or rather someone who
> needs the compiler to work for his own work and therefore reports all
> bugs he runs into) - but I understood that the priority of version 3.0
> is to make the compiler conform 100% to the new standard; only after
> it compiles everything will attention be turned to optimisation issues.
> There are still many years to go :)
>
> --
> Carlo Wood <carlo@alinoe.com>

* Re: Some optimization thoughts (and thanks!)
  2001-05-01  8:09 Some optimization thoughts (and thanks!) At150bogomips
  2001-05-01  9:15 ` Carlo Wood
@ 2001-05-02  8:48 ` Scott A Crosby
  1 sibling, 0 replies; 11+ messages in thread
From: Scott A Crosby @ 2001-05-02  8:48 UTC (permalink / raw)
  To: At150bogomips; +Cc: gcc

On Tue, 1 May 2001 At150bogomips@aol.com wrote:

> Here are some optimization suggestions:
>
> * Inlining of all functions with a single caller: This would presumably
>   be done after short-function inlining (to allow wrapper functions to
>   be inlined into their callers, such wrappers are likely to have condition
>   checks and such which can be simplified if inlined into the more basic
>   caller; but if the function that a wrapper calls is inlined first, the
>   resulting function may be too large to inline for multiple basic
>   callers).  This removes the performance penalty of breaking code into
>   graspable portions.  Unlike regular inlining, this reduces code size
>   (by removing call/return overhead).  This should not be that difficult
>   to implement, nor add much to compile time, particularly if the
>   call-tree for a program is retained between compilings and partially
>   updated after source modification.
>

And for all rules, there is an exception. I seem to remember Linus
Torvalds flaming here a couple of years ago when GCC did start to do this
for static functions with only a single caller.

In the Linux kernel, certain rare (exceptional) cases were factored into
separate static functions, in order to make the mainline code more
compact.
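
As a rough illustration of that pattern, here is a minimal sketch of a rare
error path factored into its own static function.  The names are made up,
and the `noinline` attribute is a GCC extension from releases later than
the compilers discussed in this thread; it is used here only to show how a
programmer can ask for the cold code to stay out of line even though it has
a single caller.

```c
#include <stdio.h>

/* Rare error path, deliberately kept out of line so that it does not
 * enlarge the hot caller.  `noinline` is a GCC extension. */
static __attribute__((noinline)) int report_bad_length(long len)
{
    fprintf(stderr, "bad length: %ld\n", len);
    return -1;
}

/* Hot path: stays compact because the error handling is just a call. */
int checked_sum(const int *v, long len, long *out)
{
    if (len < 0)
        return report_bad_length(len);  /* single caller, not inlined */

    long sum = 0;
    for (long i = 0; i < len; i++)
        sum += v[i];
    *out = sum;
    return 0;
}
```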


Scott

* Re: Some optimization thoughts (and thanks!)
  2001-05-01  9:15 ` Carlo Wood
  2001-05-01  9:35   ` Jean Francois Martinez
@ 2001-06-05 14:25   ` Joern Rennecke
  2001-06-05 14:38     ` Daniel Berlin
  1 sibling, 1 reply; 11+ messages in thread
From: Joern Rennecke @ 2001-06-05 14:25 UTC (permalink / raw)
  To: Carlo Wood; +Cc: At150bogomips, gcc

> The compiler can't know how often a function is called, only the linker
> can.  This would be possible for static functions, but I'd be highly
> surprised if static functions with one caller weren't already inlined :/

They are only inlined if they precede the caller, and are below the
inlining threshold.  Most functions (those above the inlining threshold)
are emitted immediately after optimization, before gcc tackles the next one,
so gcc doesn't know whether a function is called just once when it has to
decide whether or not to inline it.

But even if it knew, it would not be desirable to indiscriminately inline
functions, not with the way the register allocator works right now.
For a large inlined function, the caller with the callee inlined tends to
be larger and slower than the caller and the out-of-line callee together.
This is because the register allocator is rather clumsy when faced with
a large overall demand of registers in a large function.
Two optimization techniques - live range splitting and shrink wrapping -
promise to overcome these problems, but they create variable /
register/memory assignments that are hard or impossible to describe with
some (most?) debugging information formats.
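
The ordering constraint described above (a static function is only a
candidate if it precedes its caller) can be pictured with the following
sketch.  The function names are invented, and the comments restate the
behaviour described in this message for gcc at that time; compilers that
analyse a whole translation unit before generating code are not bound by
source order in this way.

```c
/* Defined before its caller: the body is already known when
 * use_scale() is compiled, so inlining can be considered. */
static int scale(int x)
{
    return x * 3;
}

int use_scale(int x)
{
    return scale(x) + 1;
}

/* Only declared before its caller: when use_offset() is compiled
 * function by function, the body of offset() has not been seen yet. */
static int offset(int x);

int use_offset(int x)
{
    return offset(x) + 1;
}

static int offset(int x)
{
    return x - 7;
}
```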

* Re: Some optimization thoughts (and thanks!)
  2001-06-05 14:25   ` Joern Rennecke
@ 2001-06-05 14:38     ` Daniel Berlin
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Berlin @ 2001-06-05 14:38 UTC (permalink / raw)
  To: Joern Rennecke; +Cc: Carlo Wood, At150bogomips, gcc

Joern Rennecke <amylaar@redhat.com> writes:

>> The compiler can't know how often a function is called, only the linker
>> can.  This would be possible for static functions, but I'd be highly
>> surprised if static functions with one caller weren't already inlined :/
> 
> They are only inlined if they precede the caller, and are below the
> inlining threshold.  Most functions (those above the inlining threshold)
> are emitted immediately after optimization, before gcc tackles the next one,
> so gcc doesn't know whether a function is called just once when it has to
> decide whether or not to inline it.
> 
> But even if it knew, it would not be desirable to indiscriminately inline
> functions, not with the way the register allocator works right now.
> For a large inlined function, the caller with the callee inlined tends to
> be larger and slower than the caller and the out-of-line callee together.
> This is because the register allocator is rather clumsy when faced with
> a large overall demand of registers in a large function.
> Two optimization techniques - live range splitting and shrink wrapping -
> promise to overcome these problems, but they create variable /
> register/memory assignments that are hard or impossible to describe with
> some (most?) debugging information formats.

Errr, depends on your context of "some" and "most".
If you mean by sheer number of debug formats, you are correct to say
most.
If you mean by what people use, you have to realize that a large
majority of compilers these days use DWARF2, which can describe it
just fine, and in fact, it's not even difficult to do. I'd classify it
as quite trivial, in fact, having done it for the graph coloring
register allocator (which does some live range splitting automatically).

I'd guess by the time all the various issues around the new allocator are
resolved, we could just turn off the new allocator for non-dwarf2
targets, and be done with it.

-- 
"The other day, I was walking my dog around my building...  on
the ledge.  Some people are afraid of heights.  Not me, I'm
afraid of widths."
-Steven Wright

* Re: Some optimization thoughts (and thanks!)
  2001-05-02 14:56 ` Tom Leete
@ 2001-05-02 19:23   ` Carlo Wood
  0 siblings, 0 replies; 11+ messages in thread
From: Carlo Wood @ 2001-05-02 19:23 UTC (permalink / raw)
  To: Tom Leete; +Cc: gcc

On Wed, May 02, 2001 at 05:55:47PM -0400, Tom Leete wrote:
> Operator && is a sequence point, and it short-circuits. If (A==N) evaluates
> false, (B==M) must not be evaluated and none of its side-effects occur. IMO
> that includes clobbering the processor flags.

I don't think that clobbering the processor flags is a problem here;
the short-circuit of operator && is a high-level language requirement,
and the proposed evaluation is allowed when evaluating 'B' and 'M'
has no side effects (i.e., no volatile variable is involved and no
function is called that is not marked with __attribute__((const))).

However, the proposed evaluation should only be done when 'B' and 'M'
are registers or constants; otherwise no speed gain can be expected
anyway.

As far as I understand it, the proposal is therefore that:

if ((expression) && register/constant == register/constant) ...

is NOT short-circuited; because in the case of comparing two registers
it is faster to actually *do* the compare and get rid of an extra conditional
jump that way.
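
Expressed in C rather than in pseudo-assembly, the two evaluation
strategies look roughly like this.  The function names are illustrative,
and the operands are plain int parameters, so reading them has no side
effects and evaluating both comparisons unconditionally is permitted.

```c
#include <stdbool.h>

/* Short-circuit form: (b == m) is only evaluated if (a == n) holds,
 * which naively costs two compares and two conditional branches. */
bool both_equal_branchy(int a, int n, int b, int m)
{
    return (a == n) && (b == m);
}

/* Merged form from the proposal: XOR each pair, OR the results and
 * test once.  The result is zero only if both pairs are equal. */
bool both_equal_merged(int a, int n, int b, int m)
{
    return ((a ^ n) | (b ^ m)) == 0;
}
```

The two functions always return the same value; an optimizing compiler may
well generate identical code for both, so the source-level rewrite is only
a way to see what the proposed transformation does.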

-- 
Carlo Wood <carlo@alinoe.com>

* Re: Some optimization thoughts (and thanks!)
@ 2001-05-02 15:10 At150bogomips
  0 siblings, 0 replies; 11+ messages in thread
From: At150bogomips @ 2001-05-02 15:10 UTC (permalink / raw)
  To: carlo; +Cc: gcc


Carlo Wood <carlo@alinoe.com> wrote (May 1, 2001, 12:15:46EDT):
>The compiler can't know how often a function is called, only the linker
>can.  This would be possible for static functions, but I'd be highly
>surprised if static functions with one caller weren't already inlined :/

1) The -fprofile-arcs option generates a program flow graph, so gcc
_can_ determine if a function has a single call point (this need not be
done at the level of the linker).

2) gcc (2.96, at least) does not inline _static_ functions that are only
called from one location.

Here is some test code that demonstrates this failing:
/* inline_test.c */
/* a test of inlining single call point functions */

#include <stdlib.h>
#include <stdio.h>

static int file_local_variable;

static void do_that(void)
{
    int temp;
    int count;
    int limit;
    temp = 0;

    /* random number between 0 and 255 */
    limit = rand() & 0xff;

    for (count = 0; count < limit; count++)
    {
        temp += (count & 0xf1);
    }

    if (temp > 4096)
    {
        temp = temp/512;

        for (count = 0; count < limit; count++)
        {
            temp += (count & 0x22);
        }
        if (temp > 4096)
        {
            file_local_variable = 0;
        }
        else
        {
            file_local_variable = 512;
        }
    }
    else
    {
        temp = temp*16;

        for (count = 0; count < limit; count++)
        {
            temp += (count & 0x1f);
        }
        if (temp > 4096)
        {
            file_local_variable = 4;
        }
        else
        {
            temp *= 4;
            for (count = 0; count < limit; count++)
            {
                temp += (count & 0xf2);
            }
            if (temp > 4096)
            {
                file_local_variable = 8;
            }
            else
            {
                file_local_variable = 16;
            }
        }
    }
}

void do_this (void)
{
    do_that();
    if (file_local_variable==0)
    {
        printf("This is done.");
    }
    else
    {
        printf("Ooops!");
    }
}

/* END */

And here is the assembly resulting from 'gcc -O3 -S inline_test.c':

    .file   "inline_test.c"
    .version    "01.01"
gcc2_compiled.:
.text
    .align 4
    .type    do_that,@function
do_that:
    pushl   %ebp
    movl    %esp, %ebp
    pushl   %ebx
    pushl   %eax
    call    rand
    xorl    %ebx, %ebx
    movzbl  %al,%ecx
    xorl    %edx, %edx
    cmpl    %ecx, %ebx
    jge .L62
    .p2align 2
.L36:
    movl    %edx, %eax
    andl    $241, %eax
    incl    %edx
    addl    %eax, %ebx
    cmpl    %ecx, %edx
    jl  .L36
.L62:
    cmpl    $4096, %ebx
    jle .L38
    testl   %ebx, %ebx
    movl    %ebx, %eax
    jns .L39
    leal    511(%ebx), %eax
.L39:
    movl    %eax, %ebx
    xorl    %edx, %edx
    sarl    $9, %ebx
    cmpl    %ecx, %edx
    jge .L63
    .p2align 2
.L43:
    movl    %edx, %eax
    andl    $34, %eax
    incl    %edx
    addl    %eax, %ebx
    cmpl    %ecx, %edx
    jl  .L43
.L63:
    xorl    %eax, %eax
    cmpl    $4097, %ebx
    setl    %al
    decl    %eax
    andl    $-512, %eax
    addl    $512, %eax
    jmp .L66
    .p2align 2
.L38:
    xorl    %edx, %edx
    sall    $4, %ebx
    cmpl    %ecx, %edx
    jge .L64
    .p2align 2
.L51:
    movl    %edx, %eax
    andl    $31, %eax
    incl    %edx
    addl    %eax, %ebx
    cmpl    %ecx, %edx
    jl  .L51
.L64:
    cmpl    $4096, %ebx
    jle .L53
    movl    $4, file_local_variable
    jmp .L47
    .p2align 2
.L53:
    xorl    %edx, %edx
    sall    $2, %ebx
    cmpl    %ecx, %edx
    jge .L65
    .p2align 2
.L58:
    movl    %edx, %eax
    andl    $242, %eax
    incl    %edx
    addl    %eax, %ebx
    cmpl    %ecx, %edx
    jl  .L58
.L65:
    xorl    %eax, %eax
    cmpl    $4097, %ebx
    setl    %al
    leal    8(,%eax,8), %eax
.L66:
    movl    %eax, file_local_variable
.L47:
    movl    -4(%ebp), %ebx
    leave
    ret
.Lfe1:
    .size    do_that,.Lfe1-do_that
        .section    .rodata
.LC0:
    .string "This is done."
.LC1:
    .string "Ooops!"
    .local  file_local_variable
    .comm   file_local_variable,4,4
.text
    .align 4
.globl do_this
    .type    do_this,@function
do_this:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $8, %esp
    call    do_that
    movl    file_local_variable, %edx
    testl   %edx, %edx
    jne .L68
    subl    $12, %esp
    pushl   $.LC0
    jmp .L70
    .p2align 2
.L68:
    subl    $12, %esp
    pushl   $.LC1
.L70:
    call    printf
    addl    $16, %esp
    leave
    ret
.Lfe2:
    .size    do_this,.Lfe2-do_this
    .ident  "GCC: (GNU) 2.96 20000731 (Red Hat Linux 7.0)"

P.S. I am not a gcc developer, only a hobbyist user.  I wish gcc to be
the best in the world because it is Open Source.

* Re: Some optimization thoughts (and thanks!)
  2001-05-02  7:32 Jan Hubicka
@ 2001-05-02 14:56 ` Tom Leete
  2001-05-02 19:23   ` Carlo Wood
  0 siblings, 1 reply; 11+ messages in thread
From: Tom Leete @ 2001-05-02 14:56 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: At150bogomips, gcc

Jan Hubicka wrote:
> 
> > * When branches are expensive, it might be reasonable to translate if
> >  statements with multiple conditions into a merged condition and a single
> >  branch rather than multiple branches.  E.g., "if ((A==N) && (B==M))"
> >  might be translated into "XOR Temp1 A N; XOR Temp2 B M; OR Temp1 Temp2
> >  Temp1; BNEZ Temp1 L3; #branch over conditional code" rather than "XOR
> >  Temp1 A N; BNEZ Temp1 L3; #branch over second condition test and
> >  conditional code\\ XOR Temp1 B M; BNEZ Temp1 L3;"  Removing a branch
> >  might increase modality of the branch and decrease branch history
> >  table aliasing, improving branch prediction.  Such prediction
> >  improvement might compensate for the overhead of performing both tests
> >  in all cases.  (This is what happened in a test case using a pair of
> >  comparisons of a random number and a constant, at least performance
> >  improved on an x86-based system.)
> I implemented this last month, so I will try to contribute it
> once I find time to polish things and test the code.
> 
> Honza

Operator && is a sequence point, and it short-circuits. If (A==N) evaluates
false, (B==M) must not be evaluated and none of its side-effects occur. IMO
that includes clobbering the processor flags.

Other constructions can give the effect you want.

Cheers
Tom
-- 
The Daemons lurk and are dumb. -- Emerson

* Re: Some optimization thoughts (and thanks!)
@ 2001-05-02  7:32 Jan Hubicka
  2001-05-02 14:56 ` Tom Leete
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Hubicka @ 2001-05-02  7:32 UTC (permalink / raw)
  To: At150bogomips, gcc

> * When branches are expensive, it might be reasonable to translate if 
>  statements with multiple conditions into a merged condition and a single
>  branch rather than multiple branches.  E.g., "if ((A==N) && (B==M))" 
>  might be translated into "XOR Temp1 A N; XOR Temp2 B M; OR Temp1 Temp2 
>  Temp1; BNEZ Temp1 L3; #branch over conditional code" rather than "XOR 
>  Temp1 A N; BNEZ Temp1 L3; #branch over second condition test and 
>  conditional code\\ XOR Temp1 B M; BNEZ Temp1 L3;"  Removing a branch 
>  might increase modality of the branch and decrease branch history 
>  table aliasing, improving branch prediction.  Such prediction 
>  improvement might compensate for the overhead of performing both tests 
>  in all cases.  (This is what happened in a test case using a pair of 
>  comparisons of a random number and a constant, at least performance 
>  improved on an x86-based system.) 
I implemented this last month, so I will try to contribute it
once I find time to polish things and test the code.

Honza

* Re: Some optimization thoughts (and thanks!)
@ 2001-05-01 12:34 Mike Stump
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Stump @ 2001-05-01 12:34 UTC (permalink / raw)
  To: At150bogomips, carlo; +Cc: gcc

> Date: Tue, 1 May 2001 18:15:10 +0200
> From: Carlo Wood <carlo@alinoe.com>
> To: At150bogomips@aol.com
> Cc: gcc@gcc.gnu.org

> The compiler can't know how often a function is called, only the linker
> can.

On systems with a dynamic/runtime linker, of course, he means only the
dynamic/runtime linker can know.  :-)
