Code gen question

public inbox for gcc@gcc.gnu.org

* Code gen question
@ 1999-02-12 15:06 Paul Derbyshire
       [not found] ` < 3.0.6.32.19990212180551.00841100@pop.netaddress.com >
  1999-02-28 22:53 ` Paul Derbyshire
  0 siblings, 2 replies; 10+ messages in thread
From: Paul Derbyshire @ 1999-02-12 15:06 UTC (permalink / raw)
  To: djgpp, egcs

Which will cause cc1plus to generate better code?

inline int myclass::myfunc (int j) { return j*j*j; }

inline int myclass::myfunc (const int &j) { return j*j*j; }

My guess would be the latter, since the latter when inlined won't make a
copy of the argument passed. However, it might be that at high -O settings
cc1plus will spot that the first version doesn't modify j and silently
compile it like the second version.
If so, this leads me to ask: under what circumstances will the compiler be
smart enough to detect that an inline function passed an argument of a
builtin type doesn't modify it and avoid making an unnecessary copy?

This leads me to ask: when writing short inline functions, is it better for
code optimization to pass builtin data types (bool, int, double, etc.) and
pointers by value or by reference? (Yuck, passing pointers by reference,
well I'll do it if it means real speed gains in tiny inline functions that
get invoked a great deal.)

-- 
   .*.  "Clouds are not spheres, mountains are not cones, coastlines are not
-()  <  circles, and bark is not smooth, nor does lightning travel in a
   `*'  straight line."    -------------------------------------------------
        -- B. Mandelbrot  | http://surf.to/pgd.net
_____________________ ____|________     Paul Derbyshire     pderbysh@usa.net
Programmer & Humanist|ICQ: 10423848|

* Re: Code gen question
       [not found] ` < 3.0.6.32.19990212180551.00841100@pop.netaddress.com >
@ 1999-02-12 15:29   ` Joe Buck
  1999-02-28 22:53     ` Joe Buck
  0 siblings, 1 reply; 10+ messages in thread
From: Joe Buck @ 1999-02-12 15:29 UTC (permalink / raw)
  To: Paul Derbyshire; +Cc: djgpp, egcs

> Which will cause cc1plus to generate better code?
> 
> inline int myclass::myfunc (int j) { return j*j*j; }
> 
> 
> inline int myclass::myfunc (const int &j) { return j*j*j; }

It depends on the code at the call site, and whether the object passed
to myfunc is in a register or in memory, plus whether the compiler can
optimize away unneeded write-to-memory, read-back-from-memory code.

In many cases, you'll get the exact same code for either of the above
two functions.  In other cases, either one or the other turns out a
bit better, usually because of some missed optimization.

If you really want to know, use -S and look at the assembly.

> My guess would be the latter, since the latter when inlined won't make a
> copy of the argument passed.

In principle the two should come out about the same, now that we have
the ADDRESSOF optimization.  Before we had ADDRESSOF the first one
was always better on most processors, since j gets passed in a register
while the second one forces j to be in memory.

> However, it might be that at high -O settings
> cc1plus will spot that the first version doesn't modify j and silently
> compile it like the second version.

That would be bad: what if j is already in a register?  Why would you want
to force it to memory?

> This leads me to ask: when writing short inline functions, is it better for
> code optimization to pass builtin data types (bool, int, double, etc.) and
> pointers by value or by reference?

If it fits in a register, use by-value, though in many cases the
difference is not significant.
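
For anyone wanting to try the -S suggestion, here is a minimal sketch.
The names cube_by_value, cube_by_ref and the two callers are made up for
illustration (the original myclass is not shown in the thread).  The two
variants get distinct names because overloading on int versus const int &
would make a call with an int lvalue ambiguous.

```cpp
// cube.cc: hypothetical test file, not taken from the original posts.
// Compile with something like:  g++ -O2 -S cube.cc
// then compare the code generated for the two callers in cube.s.

// By-value variant: j can simply live in a register.
inline int cube_by_value (int j) { return j*j*j; }

// By-reference variant: j is accessed through its address.
inline int cube_by_ref (const int &j) { return j*j*j; }

// Non-inline callers, so the assembly has something to show even
// after the inline functions have been expanded into them.
int call_by_value (int x) { return cube_by_value (x); }
int call_by_ref (int x) { return cube_by_ref (x); }
```

If both callers come out the same at your optimization level, the choice
makes no difference for that call site; results may vary with compiler
version and target.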

* Re: Code gen question
  1999-02-12 17:34                     ` Paul Derbyshire
       [not found]                       ` < 3.0.6.32.19990212203311.0083e6d0@pop.netaddress.com >
@ 1999-02-28 22:53                       ` Paul Derbyshire
  1 sibling, 0 replies; 10+ messages in thread
From: Paul Derbyshire @ 1999-02-28 22:53 UTC (permalink / raw)
  To: egcs

At 03:29 PM 2/12/99 -0800, you wrote:
>>>>>> Paul Derbyshire <pderbysh@usa.net> writes:
>
> > Which will cause cc1plus to generate better code?
> > inline int myclass::myfunc (int j) { return j*j*j; }
>
> > inline int myclass::myfunc (const int &j) { return j*j*j; }
>
> > My guess would be the latter, since the latter when inlined won't make a
> > copy of the argument passed.
>
>Neither will the former.

It won't? So it will observe that j is never modified making copying
unnecessary?

>The difference is that the latter refers to its
>argument's address, which impairs optimization (though not as much as it
>used to).

It does? If the address isn't used except to dereference, I'd expect the
compiler to turn

int j, k;
j = compute_something();
k = myclass::myfunc(j);

into something that resembles:

call  compute_something
# j is in eax.
movl  %eax, %ebx  # k is in ebx now. Hmm, it is copied anyways in a
                  # sense.
imull %eax, %ebx  # k == j*j
imull %eax, %ebx  # k == j*j*j



>Absolutely pass scalars by value.

Does this also apply to stock GCC? PGCC? Most of the differences among
these three gccs, except for namespace support (and extern inline behavior
:-)), are optimization differences.

-- 
   .*.  "Clouds are not spheres, mountains are not cones, coastlines are not
-()  <  circles, and bark is not smooth, nor does lightning travel in a
   `*'  straight line."    -------------------------------------------------
        -- B. Mandelbrot  | http://surf.to/pgd.net
_____________________ ____|________     Paul Derbyshire     pderbysh@usa.net
Programmer & Humanist|ICQ: 10423848|

* Re: Code gen question
  1999-02-12 15:29                 ` Jason Merrill
       [not found]                   ` < u990e3kxq8.fsf@yorick.cygnus.com >
@ 1999-02-28 22:53                   ` Jason Merrill
  1 sibling, 0 replies; 10+ messages in thread
From: Jason Merrill @ 1999-02-28 22:53 UTC (permalink / raw)
  To: Paul Derbyshire; +Cc: egcs

>>>>> Paul Derbyshire <pderbysh@usa.net> writes:

 > Which will cause cc1plus to generate better code?
 > inline int myclass::myfunc (int j) { return j*j*j; }

 > inline int myclass::myfunc (const int &j) { return j*j*j; }

 > My guess would be the latter, since the latter when inlined won't make a
 > copy of the argument passed.

Neither will the former.  The difference is that the latter refers to its
argument's address, which impairs optimization (though not as much as it
used to).  Absolutely pass scalars by value.

Jason

* Re: Code gen question
  1999-02-12 17:39                         ` Jeffrey A Law
@ 1999-02-28 22:53                           ` Jeffrey A Law
  0 siblings, 0 replies; 10+ messages in thread
From: Jeffrey A Law @ 1999-02-28 22:53 UTC (permalink / raw)
  To: Paul Derbyshire; +Cc: egcs

  In message < 3.0.6.32.19990212203311.0083e6d0@pop.netaddress.com > you write:
  > It won't? So it will observe that j is never modified making copying
  > unnecessary?
The copy will initially appear, then be optimized away if at all possible
by local and global copy propagation.
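
As a rough illustration of what copy propagation does here (a
hand-written sketch, not actual compiler output; both function names are
made up):

```cpp
// Hand-written illustration only; real compiler output will differ.

// Roughly what inlining the by-value function looks like at first:
// the parameter shows up as a fresh copy of the caller's variable.
int cube_inlined_naive (int x)
{
  int j = x;          // the copy made for the by-value parameter
  return j*j*j;
}

// After copy propagation, uses of j are replaced by x and the
// now-dead copy is deleted.
int cube_after_copy_prop (int x)
{
  return x*x*x;
}
```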


  > >The difference is that the latter refers to its
  > >argument's address, which impairs optimization (though not as much as it
  > >used to).
  > 
  > It does? If the address isn't used except to dereference, I'd expect the
  > compiler to turn
It will try, but it may not always succeed.  When you take the address of an
object you generally make analysis more difficult for the compiler, and
sometimes it will be unable to decipher the result.

In general you are better off writing the code in the most natural and
straightforward way instead of trying to micro-optimize too much.  Spend
your time writing good algorithms instead.


  > >Absolutely pass scalars by value.
  > 
  > Does this also apply to stock GCC? PGCC? Most of the differences among
  > these three gccs, except for namespace support (and extern inline behavior
  > :-)), are optimization differences.
Yes.  This generally applies to any compiler.  As a general rule compilers are
a lot better at optimizing scalars than pointers to scalars.

jeff
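
One way to see why scalars are easier to optimize than references or
pointers to scalars is a loop that also stores through a pointer.  This
is a sketch with made-up names (scale_by_ref, scale_by_value), not code
from the thread:

```cpp
// Hypothetical example; names are not from the thread.

// factor is passed by reference, so the compiler must assume that a
// store to out[i] might change it, and reload it on each iteration.
void scale_by_ref (int *out, int n, const int &factor)
{
  for (int i = 0; i < n; ++i)
    out[i] *= factor;
}

// factor is a local scalar here; no store through out can touch it,
// so it can stay in a register for the whole loop.
void scale_by_value (int *out, int n, int factor)
{
  for (int i = 0; i < n; ++i)
    out[i] *= factor;
}
```

The two are not even equivalent when the arguments overlap: calling
scale_by_ref (a, 3, a[0]) on {2, 3, 4} gives {4, 12, 16}, while
scale_by_value (a, 3, a[0]) gives {4, 6, 8}.  That possibility is what
the compiler has to allow for in the by-reference version.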

end of thread, other threads:[~1999-02-28 22:53 UTC | newest]

Thread overview: 10+ messages
1999-02-12 15:06 Code gen question Paul Derbyshire
     [not found] ` < 3.0.6.32.19990212180551.00841100@pop.netaddress.com >
1999-02-12 15:29   ` Joe Buck
1999-02-28 22:53     ` Joe Buck
1999-02-28 22:53 ` Paul Derbyshire
     [not found] <pderbysh@usa.net's>
     [not found] ` <message>
     [not found]   ` <of>
     [not found]     ` <12>
     [not found]       ` <Feb>
     [not found]         ` <1999>
     [not found]           ` <15:07:44>
     [not found]             ` <-0800>
     [not found]               ` <3.0.6.32.19990212180551.00841100.cygnus.egcs@pop.netaddress.com>
1999-02-12 15:29                 ` Jason Merrill
     [not found]                   ` < u990e3kxq8.fsf@yorick.cygnus.com >
1999-02-12 17:34                     ` Paul Derbyshire
     [not found]                       ` < 3.0.6.32.19990212203311.0083e6d0@pop.netaddress.com >
1999-02-12 17:39                         ` Jeffrey A Law
1999-02-28 22:53                           ` Jeffrey A Law
1999-02-28 22:53                       ` Paul Derbyshire
1999-02-28 22:53                   ` Jason Merrill
