Re: std::pow implementation

public inbox for gcc@gcc.gnu.org

* Re: std::pow implementation
@ 2003-07-30 13:13 Martin Reinecke
  2003-07-30 13:30 ` Gabriel Dos Reis
  2003-07-30 15:56 ` std::pow implementation Scott Robert Ladd
  0 siblings, 2 replies; 21+ messages in thread
From: Martin Reinecke @ 2003-07-30 13:13 UTC (permalink / raw)
  To: gcc

Gabriel Dos Reis wrote:

 > Do trust the programmer.

This is certainly a valid point of view, but I think that C++, as it exists
at the moment, just _doesn't give_ the programmer the possibility of specifying
what exactly (s)he wants.

I think it is not generally possible for a programmer to decide whether a function
should be inlined or not (see below).

The only exceptions are functions like std::string::end()
that probably even *save space* when they are inlined.

But imagine, for example, a function foo() with about 50 lines of code, somewhere in a C++
library. If this function is going to be used in the following context

amax = 100000000;
[...]
for (int a=0; a<=amax; ++a)
   foo();

then it should most definitely be inlined, because not much space is wasted, and the
calling overhead is gone, which pays off if amax is large.
The problem is that gcc usually cannot determine how large amax is going to be, so it
doesn't know what the right solution is, no matter how good the inlining heuristics are.

On the other hand, if foo() is called from many different places but only once, like this:

[...]
foo();
[...]
foo();
and so on,

then inlining probably doesn't make much sense.

In other words: the author of foo() _cannot_ decide whether foo() should be inlined or not,
because he doesn't know to what kind of uses it will be put by other people (unless he is the
only user of foo(), of course). Whatever the author of foo() specifies (inline or not),
he's going to get it wrong in some cases.

In most cases, the user who calls foo() knows much better if foo() should be inlined.
Wouldn't it therefore be much more sensible to give the compiler an inlining hint at the place
where foo() is called, e.g.

for (int a=0; a<=amax; ++a)
#pragma inline
   foo();

or

#pragma noinline
{
[...]
foo();
[...]
foo();
}

Of course it would be much nicer if C++ supported this natively ...
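
A rough approximation of such call-site control is possible today, at the
cost of an extra entry point. The sketch below is an illustration only:
foo_call, hot_loop and cold_path are hypothetical names, the body of foo()
is a stand-in, and __attribute__((noinline)) is a GCC extension rather than
standard C++.

```cpp
#include <cassert>

// Stand-in for the library function; defined inline so every caller can
// see its body and the compiler is free to expand it in place.
inline int foo(int x) { return 2 * x + 1; }

// Hypothetical out-of-line entry point for call sites where expansion is
// not wanted. The attribute is a GCC extension; other compilers simply see
// an ordinary function here.
#if defined(__GNUC__)
__attribute__((noinline))
#endif
int foo_call(int x) { return foo(x); }

// Hot call site: calls foo() directly, so it may be inlined into the loop.
long hot_loop(int amax)
{
   assert(amax >= 0);
   long sum = 0;
   for (int a = 0; a <= amax; ++a)
      sum += foo(a);
   return sum;
}

// Cold call sites: go through the out-of-line entry point instead.
long cold_path(int x)
{
   long sum = foo_call(x);
   sum += foo_call(x + 1);
   return sum;
}
```

The choice is made by picking an entry point rather than by annotating the
call itself, so it is coarser than the pragmas proposed above, and the
compiler remains free to ignore the plain "inline" on foo().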

Please ignore me if I didn't make any sense.

Cheers,
   Martin

* Re: std::pow implementation
  2003-07-30 13:13 std::pow implementation Martin Reinecke
@ 2003-07-30 13:30 ` Gabriel Dos Reis
  2003-07-30 13:40   ` Martin Reinecke
  2003-07-30 15:56 ` std::pow implementation Scott Robert Ladd
  1 sibling, 1 reply; 21+ messages in thread
From: Gabriel Dos Reis @ 2003-07-30 13:30 UTC (permalink / raw)
  To: Martin Reinecke; +Cc: gcc

Martin Reinecke <martin@MPA-Garching.MPG.DE> writes:

| Gabriel Dos Reis wrote:
| 
|  > Do trust the programmer.
| 
| This is certainly a valid point of view, but I think that C++, as it exists
| at the moment, just _doesn't give_ the programmer the possibility of specifying
| what exactly (s)he wants.

But taking out the only lever the programmer currently has is even far
worse. 

| I think it is not generally possible for a programmer to decide
| whether a function should be inlined or not (see below).

I reckon that there are corner cases that "inline" does not cover.
But, I'm worrying about the common situations -- and those that have
been the subject of PRs.

However, in lieu of pragmas, I will point out that current C++
gives the programmer means to express, in many corner cases, the
intention of not inlining, by combining inline and noninline
definitions and object functions. 

[...]

| Of course it would be much nicer if C++ supported this natively ...
| 
| Please ignore me if I didn't make any sense.

Your input is meaningful, I don't see why it should be ignored :-)

-- Gaby

* Re: std::pow implementation
  2003-07-30 13:30 ` Gabriel Dos Reis
@ 2003-07-30 13:40   ` Martin Reinecke
  2003-07-30 13:46     ` Andrew Pinski
  2003-07-30 13:53     ` Gabriel Dos Reis
  0 siblings, 2 replies; 21+ messages in thread
From: Martin Reinecke @ 2003-07-30 13:40 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: gcc

Gabriel Dos Reis wrote:
> Martin Reinecke <martin@MPA-Garching.MPG.DE> writes:
> 
> | This is certainly a valid point of view, but I think that C++, as it exists
> | at the moment, just _doesn't give_ the programmer the possibility of specifying
> | what exactly (s)he wants.
> 
> But taking out the only lever the programmer currently has is even far
> worse. 

Right, but that's exactly the dilemma with C++'s "implicit" inline:
if you want the compiler to _be able_ to inline a function, you should put
it in the class definition. But this implies an "inline" directive
which would more or less _force_ the compiler to always inline the code.
So, if inline was as strict as you'd like to see it, there would be no
way to tell gcc "here's a function that might be worthwhile to
inline". I could only say "inline this" or "don't inline".

This could change if gcc starts to inline functions across translation units,
but currently it doesn't (I believe).

> | I think it is not generally possible for a programmer to decide
> | whether a function should be inlined or not (see below).
> 
> I reckon that there are corner cases that "inline" does not cover.
> But, I'm worrying about the common situations -- and those that have
> been the subject of PRs.

I've encountered the example I gave a few times in real-life code,
but in the frame of standard C++ nothing can be done about it, so
I cannot file a PR.

> However, in lieu of pragmas, I will point out that current C++
> gives the programmer means to express, in many corner cases, the
> intention of not inlining, by combining inline and noninline
> definitions and object functions. 

You're right. It's not too beautiful and makes maintenance a bit harder,
but on the other hand it shouldn't happen too often.

Martin

* Re: std::pow implementation
  2003-07-30 13:40   ` Martin Reinecke
@ 2003-07-30 13:46     ` Andrew Pinski
  2003-07-30 13:47       ` Steven Bosscher
  2003-07-30 13:53     ` Gabriel Dos Reis
  1 sibling, 1 reply; 21+ messages in thread
From: Andrew Pinski @ 2003-07-30 13:46 UTC (permalink / raw)
  To: Martin Reinecke; +Cc: Andrew Pinski, Gabriel Dos Reis, gcc

On Wednesday, Jul 30, 2003, at 09:19 US/Eastern, Martin Reinecke wrote:
> This could change if gcc starts to inline functions across translation 
> units,
> but currently it doesn't (I believe).

It does in the mainline if you put all the files as arguments to gcc.
Example:

gcc temp.cc temp1.cc temp2.cc -o temp.o
gcc temp.o -o temp

This will cause intermodular optimizations.

This was put on the mainline on July 11, 2003.

Thanks,
Andrew Pinski

* Re: std::pow implementation
  2003-07-30 13:46     ` Andrew Pinski
@ 2003-07-30 13:47       ` Steven Bosscher
  2003-07-30 14:32         ` Martin Reinecke
  0 siblings, 1 reply; 21+ messages in thread
From: Steven Bosscher @ 2003-07-30 13:47 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: Martin Reinecke, Gabriel Dos Reis, gcc

On Wed, 2003-07-30 at 15:24, Andrew Pinski wrote:
> On Wednesday, Jul 30, 2003, at 09:19 US/Eastern, Martin Reinecke wrote:
> > This could change if gcc starts to inline functions across translation 
> > units,
> > but currently it doesn't (I believe).
> 
> It does in the mainline if you put all the files as arguments to gcc.
> Example:
> 
> gcc temp.cc temp1.cc temp2.cc -o temp.o
> gcc temp.o -o temp
> 
> This will cause intermodular optimizations.

But not yet for C++

* Re: std::pow implementation
  2003-07-30 13:40   ` Martin Reinecke
  2003-07-30 13:46     ` Andrew Pinski
@ 2003-07-30 13:53     ` Gabriel Dos Reis
  2003-07-30 14:14       ` Martin Reinecke
  1 sibling, 1 reply; 21+ messages in thread
From: Gabriel Dos Reis @ 2003-07-30 13:53 UTC (permalink / raw)
  To: Martin Reinecke; +Cc: gcc

Martin Reinecke <martin@MPA-Garching.MPG.DE> writes:

| Gabriel Dos Reis wrote:
| > Martin Reinecke <martin@MPA-Garching.MPG.DE> writes:
| > | This is certainly a valid point of view, but I think that C++, as
| > it exists
| > | at the moment, just _doesn't give_ the programmer the possibility of specifying
| > | what exactly (s)he wants.
| > But taking out the only lever the programmer currently has is even
| > far
| > worse.
| 
| Right, but that's exactly the dilemma with C++'s "implicit" inline:
| if you want the compiler to _be able_ to inline a function, you should put
| it in the class definition.

Please note that there is no "implicit" inline in C++.  The only thing
that is present is that a function definition within a class declaration
is an inline definition, because that was the original syntax before
inlining of nonmember functions was implemented.  A function defined
within a class is as inline as a function defined with the "inline"
keyword outside a class.

| But this implies an "inline" directive
| which would more or less _force_ the compiler to always inline the code.
| So, if inline was as strict as you'd like to see it, there would be no
| way to tell gcc "here's a function that might be worthwhile to
| inline". I could only say "inline this" or "don't inline".
| 
| This could change if gcc starts to inline functions across translation units,
| but currently it doesn't (I believe).

That "worthwhile inlining" could be subject to higher optimizations
("-O3") if it is so that we've reached the level of sophistication
where automatic inline supersedes traditional "inline".

But I see on concrete real world codes that the "inline" meaning
transmutation GCC operates does cause more grief than it does good.

| > | I think it is not generally possible for a programmer to decide
| > | whether a function should be inlined or not (see below).
| > I reckon that there are corner cases that "inline" does not cover.
| > But, I'm worrying about the common situations -- and those that have
| > been the subject of PRs.
| 
| I've encountered the example I gave a few times in real-life code,
| but in the frame of standard C++ nothing can be done about it, so
| I cannot file a PR.
| 
| > However, in lieu of pragmas, I will point out that current C++
| > gives the programmer means to express, in many corner cases, the
| > intention of not inlining, by combining inline and noninline
| > definitions and object functions.
| 
| You're right. It's not too beautiful and makes maintenance a bit harder,
| but on the other hand it shouldn't happen too often.

and is expressed with standard constructs.

-- Gaby

* Re: std::pow implementation
  2003-07-30 13:53     ` Gabriel Dos Reis
@ 2003-07-30 14:14       ` Martin Reinecke
  2003-07-30 14:33         ` Gabriel Dos Reis
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Reinecke @ 2003-07-30 14:14 UTC (permalink / raw)
  Cc: gcc

Gabriel Dos Reis wrote:

> Please note that there is no "implicit" inline in C++.  The only thing
> that is present is that a function definition within a class declaration
> is an inline definition, because that was the original syntax before
> inlining of nonmember functions was implemented.  A function defined
> within a class is as inline as a function defined with the "inline"
> keyword outside a class.

I know; this is why I put "implicit" in quotes. But this doesn't help, I think.

Say I write the following

==========foo.h

class foo
   {
   inline void bar();
   };

==========foo.cc

inline void foo::bar() {...}

No current g++ will inline foo::bar() (and most other compilers won't either) when I
call it from, say, baz.cc. So there is a practical difference between defining
a function in the class body, and declaring it as inline and defining it in a
separate .cc file.

If we assume for now that intermodule optimization is not done (and that's
the current situation), there is no way to tell the C++ compiler
"this function might be useful to inline", because it has to see the function
body before it can decide whether to inline it or not. And it can't see
it because it's in a different translation unit.

AFAIK, it's ill-formed to write

=========foo.h

class foo
   {
   void bar();
   };

void foo::bar() {...}

=========end of foo.h

I think this is only OK if I explicitly declare bar() as inline.
If not, I run into trouble with the ODR.
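
The three placements under discussion can be put side by side in one
sketch. It is squeezed into a single file so that it compiles on its own;
the comments mark which part would live in foo.h and which in foo.cc. The
members baz() and qux(), the return values, and use_foo() are additions
made for the illustration.

```cpp
#include <cassert>

// ---- contents that would live in foo.h ----
class foo
   {
   public:
   // Defined in the class body: an inline function, keyword or not.
   int bar() const { return 1; }
   // Declared only; defined further down.
   int baz() const;
   int qux() const;
   };

// Out-of-class definition that stays in the header: needs "inline" so that
// it may appear in every translation unit that includes foo.h.
inline int foo::baz() const { return 2; }

// ---- contents that would live in exactly one foo.cc ----
// No "inline": fine in a single translation unit, but moving this line
// into foo.h would give one definition per including file.
int foo::qux() const { return 3; }

// Hypothetical caller, standing in for code in baz.cc.
int use_foo()
   {
   foo f;
   assert(f.bar() == 1);
   return f.bar() + f.baz() + f.qux();
   }
```

With a header included from two or more .cc files, the non-inline
definition of qux() typically shows up as a duplicate-definition error at
link time, which is the ODR trouble referred to above.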

> | But this implies an "inline" directive
> | which would more or less _force_ the compiler to always inline the code.
> | So, if inline was as strict as you'd like to see it, there would be no
> | way to tell gcc "here's a function that might be worthwhile to
> | inline". I could only say "inline this" or "don't inline".
> | 
> | This could change if gcc starts to inline functions across translation units,
> | but currently it doesn't (I believe).
> 
> That "worthwhile inlining" could be subject to higher optimizations
> ("-O3") if it is so that we've reached the level of sophistication
> where automatic inline supersedes traditional "inline".

I'll be happy with this as soon as intermodule optimization works
well. Without it, we have not gained anything, as the example shows.

> But I see on concrete real world codes that the "inline" meaning
> transmutation GCC operates does cause more grief than it does good.

Probably. I just think it was caused by the fact that C++ compilers
have been imperfect (i.e. lacking intermodule inlining) for a long time.
And because so many codes exist that were written with the "transmuted"
meaning in mind, there will be a terrible lot of breakage if the meaning
is reset abruptly.

> | You're right. It's not too beautiful and makes maintenance a bit harder,
> | but on the other hand it shouldn't happen too often.
> 
> and is expressed with standard constructs.

Agreed. I was arguing for a possible extension of the standard itself, not
for introducing new pragmas.

Martin

* Re: std::pow implementation
  2003-07-30 13:47       ` Steven Bosscher
@ 2003-07-30 14:32         ` Martin Reinecke
  0 siblings, 0 replies; 21+ messages in thread
From: Martin Reinecke @ 2003-07-30 14:32 UTC (permalink / raw)
  Cc: gcc

Steven Bosscher wrote:
> On Wed, 2003-07-30 at 15:24, Andrew Pinski wrote:
> 
>>On Wednesday, Jul 30, 2003, at 09:19 US/Eastern, Martin Reinecke wrote:
>>
>>>This could change if gcc starts to inline functions across translation 
>>>units,
>>>but currently it doesn't (I believe).
>>
>>It does in the mainline if you put all the files as arguments to gcc.
>>Example:
>>
>>gcc temp.cc temp1.cc temp2.cc -o temp.o
>>gcc temp.o -o temp
>>
>>This will cause intermodular optimizations.
> 
> 
> But not yet for C++

Even if we had it for C++ it wouldn't help when I link
against libraries, i.e. the optimizations are only done
at the source code level. Once I have only object files
and libraries, I'm out of luck. So it means that for a large
application I'd need the full source code of all used
libraries and compile them with a single call to gcc.
This is, for various reasons, rather unrealistic.

Cheers,
   Martin

* Re: std::pow implementation
  2003-07-30 14:14       ` Martin Reinecke
@ 2003-07-30 14:33         ` Gabriel Dos Reis
  2003-07-30 15:27           ` Martin Reinecke
  2003-08-04 12:55           ` Theodore Papadopoulo
  0 siblings, 2 replies; 21+ messages in thread
From: Gabriel Dos Reis @ 2003-07-30 14:33 UTC (permalink / raw)
  To: Martin Reinecke; +Cc: gcc

Martin Reinecke <martin@MPA-Garching.MPG.DE> writes:

[...]

| Say I write the following
| 
| ==========foo.h
| 
| class foo
|    {
|    inline void bar();
|    };
| 
| ==========foo.cc
| 
| inline void foo::bar() {...}
| 
| No current g++ will inline foo::bar() (and most other compilers won't either) when I
| call it from, say, baz.cc.

The standard does mandate that the definition of an inline function
be available in every translation unit that uses it.

[...]

| > But I see on concrete real world codes that the "inline" meaning
| > transmutation GCC operates does cause more grief than it does good.
| 
| Probably. I just think it was caused by the fact that C++ compilers
| have been imperfect (i.e. lacking intermodule inlining) for a long time.

KCC did better, even with no inter translation-unit optimization, on
real world codes.

| And because so many codes exist that were written with the "transmuted"
| meaning in mind, there will be a terrible lot of breakage if the meaning
| is reset abruptly.
| 
| > | You're right. It's not too beautiful and makes maintenance a bit harder,
| > | but on the other hand it shouldn't happen too often.
| > and is expressed with standard constructs.
| 
| Agreed. I was arguing for a possible extension of the standard itself, not
| for introducing new pragmas.

I see, but that tends to confuse the debate.

-- Gaby

* Re: std::pow implementation
  2003-07-30 14:33         ` Gabriel Dos Reis
@ 2003-07-30 15:27           ` Martin Reinecke
  2003-07-30 15:42             ` Gabriel Dos Reis
  2003-08-04 12:55           ` Theodore Papadopoulo
  1 sibling, 1 reply; 21+ messages in thread
From: Martin Reinecke @ 2003-07-30 15:27 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: gcc

Gabriel Dos Reis wrote:

> The standard does mandate that the definition of an inline function
> be available in every translation unit that uses it.

Thanks for the clarification. But I think there is still no way to expose
the definition of a not-explicitly-inline function to all
translation units where it is used.

The fundamental problem is that there are three classes of functions:
  - functions that should always be inlined
  - functions where the compiler should decide
  - functions that should never be inlined

The first category must have its definition in a header file.
The second category should have its definition in a header file also,
to give the compiler a chance of inlining it. But C++ won't let us,
because all functions with their definition in a header automatically
belong to the first category.

Something seems fundamentally wrong here, and I think it is C++'s property
to automatically put the "inline" tag on all functions defined in a class
body, even if they don't have the "inline" keyword. But I see no chance
to get this changed.

Maybe the discussion is so intense because there is no real solution?

Martin

* Re: std::pow implementation
  2003-07-30 15:27           ` Martin Reinecke
@ 2003-07-30 15:42             ` Gabriel Dos Reis
  2003-07-30 17:38               ` Martin Reinecke
  0 siblings, 1 reply; 21+ messages in thread
From: Gabriel Dos Reis @ 2003-07-30 15:42 UTC (permalink / raw)
  To: Martin Reinecke; +Cc: gcc

Martin Reinecke <martin@mpa-garching.mpg.de> writes:

| Gabriel Dos Reis wrote:
| 
| > The standard does mandate that the definition of an inline function
| > be available in every translation unit that uses it.
| 
| Thanks for the clarification. But I think there is still no way to expose
| the definition of a not-explicitly-inline function to all
| translation units where it is used.

Yes.

I do not see those as being as pressing as getting the inlining of
functions declared inline right.

| The fundamental problem is that there are three classes of functions:
|   - functions that should always be inlined
|   - functions where the compiler should decide
|   - functions that should never be inlined

I can agree with that categorization.

| The first category must have its definition in a header file.
| The second category should have its definition in a header file also,
| to give the compiler a chance of inlining it. But C++ won't let us,
| because all functions with their definition in a header automatically
| belong to the first category.

Not actually.  Just because a function definition is put in a
header file -- or more accurately, is available in a translation unit --
does not mean that inlining is requested for that function.

(I may suggest "mutable inline" for the second category and ~inline
for the third :-)

| Something seems fundamentally wrong here, and I think it is C++'s property
| to automatically put the "inline" tag on all functions defined in a class
| body, even if they don't have the "inline" keyword. But I see no chance
| to get this changed.

That is the way inlining is introduced in C++.  Fundamentally, I think
that decision is not wrong.

| Maybe the discussion is so intense because there is no real solution?

I'm confident that we can reach a useful point, if we're careful enough
not to turn it into a flame war -- we're not going to take the habit
of turning every discussion about "inline" into a flame war, right? :-)

-- Gaby

* Re: std::pow implementation
  2003-07-30 13:13 std::pow implementation Martin Reinecke
  2003-07-30 13:30 ` Gabriel Dos Reis
@ 2003-07-30 15:56 ` Scott Robert Ladd
  2003-07-30 16:16   ` Steven Bosscher
  1 sibling, 1 reply; 21+ messages in thread
From: Scott Robert Ladd @ 2003-07-30 15:56 UTC (permalink / raw)
  To: Martin Reinecke; +Cc: gcc

Martin Reinecke wrote:
> Gabriel Dos Reis wrote:
> 
>  > Do trust the programmer.
> 
> This is certainly a valid point of view, but I think that C++, as it exists
> at the moment, just _doesn't give_ the programmer the possibility of 
> specifying what exactly (s)he wants.

No high-level language "gives a programmer what they want." Which is one 
reason some of us still use assembler languages from time to time.

Any compiler worth its weight in bits will translate written code based 
on analysis; I vigorously support intelligent compilation. What I do 
*not* support is trying to be smarter than the programmer. A fine line, 
to be sure, but a line nonetheless.

If I explicitly define a variable type or a structure packing, I expect 
the compiler to honor my wishes; if I say "inline", I mean "inline", 
within the physical limits of the compiler (i.e., memory use during 
compilation, etc.)

At the very least, a compiler should report when it has made a choice 
counter to the programmer's wishes. If I declare a function inline, and 
the compiler decides otherwise, it should inform me of its rebellion. 
Perhaps I'll even agree with it -- but I want to know when my tools make 
choices for me. I am writing the program, not the compiler.

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing

* Re: std::pow implementation
  2003-07-30 15:56 ` std::pow implementation Scott Robert Ladd
@ 2003-07-30 16:16   ` Steven Bosscher
  2003-07-30 16:47     ` Scott Robert Ladd
  0 siblings, 1 reply; 21+ messages in thread
From: Steven Bosscher @ 2003-07-30 16:16 UTC (permalink / raw)
  To: Scott Robert Ladd; +Cc: Martin Reinecke, gcc

On Wed, 2003-07-30 at 17:43, Scott Robert Ladd wrote:
> At the very least, a compiler should report when it has made a choice 
> counter to the programmer's wishes. If I declare a function inline, and 
> the compiler decides otherwise, it should inform me of its rebellion. 
> Perhaps I'll even agree with it -- but I want to know when my tools make 
> choices for me. I am writing the program, not the compiler.

I plan to split the function body size estimate code out in a separate
function (if no-one does so before me), that would be a first step
towards improving the messages you get from -Winline. Right now it just
tells you that it didn't inline something, but not why, so it's not very
helpful.
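
For readers who have not used the option: -Winline asks GCC to warn when
a function declared inline is not inlined. A small test case along the
following lines can be used to try it. depth_sum and call_it are made-up
names, and whether a warning actually appears depends on the GCC version
and the optimization level.

```cpp
#include <cassert>

// Declared inline, but recursive with a depth known only at run time, so
// the compiler cannot expand every call in place.
inline unsigned long depth_sum(unsigned n)
{
   assert(n < 100000);   // keep the recursion depth modest
   return n == 0 ? 0 : n + depth_sum(n - 1);
}

// Ordinary caller; this is where an inlining decision is made.
unsigned long call_it(unsigned n) { return depth_sum(n); }
```

Compiling this with something like "g++ -O2 -Winline -c test.cc" (the file
name is arbitrary) is one way to see what the current diagnostics look
like.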

Gr.
Steven

* Re: std::pow implementation
  2003-07-30 16:16   ` Steven Bosscher
@ 2003-07-30 16:47     ` Scott Robert Ladd
  0 siblings, 0 replies; 21+ messages in thread
From: Scott Robert Ladd @ 2003-07-30 16:47 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Martin Reinecke, gcc

Steven Bosscher wrote:
> I plan to split the function body size estimate code out in a separate
> function (if no-one does so before me), that would be a first step
> towards improving the messages you get from -Winline. Right now it just
> tells you that it didn't inline something, but not why, so it's not very
> helpful.

*That* would be lovely.

..Scott

* Re: std::pow implementation
  2003-07-30 15:42             ` Gabriel Dos Reis
@ 2003-07-30 17:38               ` Martin Reinecke
  0 siblings, 0 replies; 21+ messages in thread
From: Martin Reinecke @ 2003-07-30 17:38 UTC (permalink / raw)
  To: Gabriel Dos Reis, gcc

On Wed, Jul 30, 2003 at 04:51:01PM +0200, Gabriel Dos Reis wrote:

> Not actually.  Just because a function definition is put in a
> header file -- or more accurately, is available in a translation unit --
> does not mean that inlining is requested for that function.

I just can't think of a way how to do that without breaking some
C++ rule; could you please give a short example?
If the definition can be made available in more than one translation unit
without requesting inlining, I'm perfectly happy.
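
One construct that appears to fit this request is a template. It is
offered here as an illustration and is not necessarily what Gabriel had
in mind. The definition of a function template, or of a member of a class
template, may appear in every translation unit that uses it without the
"inline" keyword. All names below are invented for the example.

```cpp
#include <cassert>

// Would live in a header. No "inline" keyword, yet the definition may be
// included in any number of translation units.
template <typename T>
T clamp_to_zero(T x) { return x < T(0) ? T(0) : x; }

// Class template whose member is defined outside the class body, again
// in the header and again without "inline".
template <typename T>
struct accumulator
   {
   T total;
   void add(T x);
   };

template <typename T>
void accumulator<T>::add(T x) { total += x; }

// Hypothetical user code in some .cc file.
int sum_positive(int a, int b)
   {
   accumulator<int> acc = {0};
   acc.add(clamp_to_zero(a));
   acc.add(clamp_to_zero(b));
   assert(acc.total >= 0);
   return acc.total;
   }
```

The price is that the function has to be written as a template, which is
not always natural for code that has no type parameter to begin with.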

> (I may suggest "mutable inline" for the second category and ~inline
> for the third :-)

Nice ;)

Martin

* Re: std::pow implementation
  2003-07-30 14:33         ` Gabriel Dos Reis
  2003-07-30 15:27           ` Martin Reinecke
@ 2003-08-04 12:55           ` Theodore Papadopoulo
  2003-08-04 13:11             ` Gabriel Dos Reis
  1 sibling, 1 reply; 21+ messages in thread
From: Theodore Papadopoulo @ 2003-08-04 12:55 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Martin Reinecke, gcc

	Sorry, I'm certainly replying a bit late...

gdr@integrable-solutions.net said:
> | > It suffices to point out that (defunct) KCC did outperform GCC on most
> | > real world code. 
> |
> | But surely not due to honouring the inline keyword. 

> Its honouring the inline keyword was most certainly part of that. 

Well to be honest at the time the inline-limit was put into gcc, gcc 
was much better than KCC (in terms of inlining not in terms of 
overall efficiency of the compiled code). A simple program like:

template <int N>
struct A {
   // static, so that A<N>::f() can be called without an object
   static void f() {
      A<N-1>::f();
      A<N-1>::f();
   }
};

template <>
struct A<0> {
   static void f() { }
};

int main()
{
   A<32>::f();
}

was a real memory hog with gcc (which inlined everything without 
realizing that the functions were empty !!), 
while KCC stopped after a few inlining levels. At that time, 
inlining was done on RTL, was quite fast, and the inlining limit was
very high...
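
To put a number on the blow-up: if every call is expanded in place, the
body of A<N>::f() contains two copies of A<N-1>::f(), each of which
contains two copies of A<N-2>::f(), and so on. The helper below counts the
call sites a naive top-down inliner would have to expand. expanded_calls
and check_growth are names made up for this sketch, and the count is a
simple model, not a measurement of any real compiler.

```cpp
#include <cassert>
#include <cstdint>

// Number of call sites expanded when A<n>::f() is inlined completely:
// two direct calls, plus everything inside each of the two expanded bodies.
constexpr std::uint64_t expanded_calls(unsigned n)
{
   return n == 0 ? 0 : 2 + 2 * expanded_calls(n - 1);
}

// The recurrence has the closed form 2^(n+1) - 2.
static_assert(expanded_calls(1) == 2, "A<1>::f() makes two calls");
static_assert(expanded_calls(3) == 14, "2^4 - 2");

void check_growth()
{
   assert(expanded_calls(10) == 2046);
}
```

For N = 32 the model gives 2^33 - 2 = 8589934590 expansions, all of them
of functions that turn out to do nothing.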

Then a lot of things changed, garbage collection was introduced, tree 
based inlining was introduced, function-at-a-time and now even 
unit-at-a-time strategy, a new standard C++ library came in, ...
(this list is certainly not intended to be used as a list of guilty
changes) ... and inlining (presumably among other things) started to
be really slow. 

Then someone noticed that one way to reduce the time the compiler 
was spending on inlining was simply to reduce the inlining 
limit (and even to introduce new ones). And there were some 
convincing examples showing that the overall performance of the 
programs was not decreased (at least not too much). Unfortunately, with 
more testing, this has proved to be not entirely true.

So saying that KCC was better than gcc because it inlined more was 
simply not true (at the time where this story begins): KCC was still 
often better than gcc, and gcc was inlining everything it could (i.e. 
everything marked as inline, either explicitly or implicitly).

The main conclusions this leads to, are (IMHO):

- The simple example above shows clearly that some limit has to exist 
and has to be imposed by the compiler (of course here the function is empty 
and the compiler could have figured it out, but imagine if it were 
not).

- At some point (egcs times, around 1998), even while inlining much, 
much more than today, gcc was faster and consumed less memory (and 
certainly also had difficult memory-management-related bugs...). So 
the scheme that Gaby describes worked (with some very high limits) at 
some point in the past and people were not complaining...

- The current limiting strategy comes from a very practical point of 
view (restricting the compilation time of the released 
compiler) and might not be considered the definitive answer to the 
problem, all the more so since users have reported that the consequences 
of the inlining limits vary a lot depending on the language in use,
the type of the code, etc.

- The inlining strategy plays an important role. In the function 
above, gcc which (at that time) was doing top-down inlining (ie from 
A<32> down to A<0>) had to inline everything before being able to 
realize that the function was empty. I think Nathan or Zack proposed 
a bottom-up experimental patch around that time. From what I read here, 
it seems that the ideal inlining strategy still has to be found (or 
more likely approximated).

- Context (in conjunction with optimization and inlining strategy) is
also certainly important (imagine if you could prove that f() is empty
in the example above). I hope that tree based optimisation will allow
partial optimization of functions after inlining, so that the metrics of
the costs will be more meaningful...

Final point: it has been reported here (and it is also my 
experience) that for some C++ code, -Os (which I believe 
restricts the amount of inlining) sometimes results in more efficient code 
than all the other -O[0123] choices, with timings that can go from 
about 12s at -O0 down to 1.2s for -Os and about 2.5s for -O2 (numbers 
are approximate and pre-date the unit-at-a-time patch, but I'm not 
sure how this interferes; all functions were small). This tends to 
show that 

- it is the metrics that are used to determine the efficiency 
of the code that need some work.

- certainly, at some point the compiler will know better what to 
inline (sorry Gaby) and what to keep as a function. And I also buy 
all the portability arguments that have been raised during this 
discussion.

It really seems that this inlining problem is much more difficult to 
cope with than was expected... I wish I had a good idea...

I just hope that SSA based optimisations and the work Jan just did on 
metrics and unit-at-a-time will really make effective some of the 
infrastructure work (like function-at-a-time and tree based inlining) 
that might need tree based optimisation. Here I certainly show how 
little I know about how inlining interferes with other optimizations
(now and in the past RTL based scheme).

--------------------------------------------------------------------
Theodore Papadopoulo
Email: Theodore.Papadopoulo@sophia.inria.fr Tel: (33) 04 92 38 76 01
 --------------------------------------------------------------------

* Re: std::pow implementation
  2003-08-04 12:55           ` Theodore Papadopoulo
@ 2003-08-04 13:11             ` Gabriel Dos Reis
  2003-08-04 14:32               ` Theodore Papadopoulo
  0 siblings, 1 reply; 21+ messages in thread
From: Gabriel Dos Reis @ 2003-08-04 13:11 UTC (permalink / raw)
  To: Theodore Papadopoulo; +Cc: Martin Reinecke, gcc

Theodore Papadopoulo <Theodore.Papadopoulo@sophia.inria.fr> writes:

| 	Sorry, I'm certainly replying a bit late...
| 
| 
| gdr@integrable-solutions.net said:
| > | > It suffices to point out that (defunct) KCC did outperform GCC on most
| > | > real world code. 
| > |
| > | But surely not due to honouring the inline keyword. 
| 
| > Its honouring the inline keyword was most certainly part of that. 
| 
| Well to be honest at the time the inline-limit was put into gcc, gcc 
| was much better than KCC (in terms of inlining not in terms of 
| overall efficiency of the compiled code).

As I noted elsewhere, just inlining does not solve all problems.  There
is no dispute on that.  But KCC's honouring the inline keyword was
most certainly part of the efficacy of the code produced by KCC.
Inlining can expose opportunities for better optimization, like
constant propagation, dead code elimination or reference shrinking.
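
A minimal sketch of that point, using hypothetical names (`scale` and
`area` are not from any code in this thread).  Once `scale` is inlined
into `area`, the compiler can see that `factor` is the constant 1, so
the branch is dead and the multiplication can be folded away.  Whether
a given GCC version actually does this depends on the version and the
optimization level.

```cpp
// Hypothetical helper: general-purpose, with a branch on its argument.
inline double scale(double x, int factor)
{
    if (factor == 0)      // dead code once factor is known to be 1
        return 0.0;
    return x * factor;    // can fold to plain x when factor == 1
}

// Hypothetical caller passing a compile-time constant.  After inlining
// and constant propagation the body can reduce to w * h.
double area(double w, double h)
{
    return scale(w, 1) * scale(h, 1);
}
```

Without inlining, each call to `scale` has to evaluate the branch and
the multiplication at run time, since the callee cannot know what its
callers pass.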

[...]

| Then a lot of things changed, garbage collection was introduced, tree 
| based inlining was introduced, function-at-a-time and now even 
| unit-at-a-time strategy, a new standard C++ library came in, ...

listing the (new) standard library in this discussion will create more
confusion than it will help sort out the issue.

[...]

| So saying that KCC was better than gcc because it inlined more was 
| simply not true (at the time where this story begins), KCC was still 
| often better than gcc and gcc was inlining everything it could (ie 
| marked as inline either explicit or implicit).

Please do be careful in your rephrasing.  I'm specifically concerned
about honouring the inline request and -not- "inlining more".  And I'm
not saying GCC should inline everything.  Part of the confusion in
this debate came from the fact that people are considering that
honouring the inline request means inlining everything.

[...]

| - The inlining strategy plays some important role. In the function 

Equally important is considering the language-specific semantics.

[...]

| Final point: it has been reported here (and it is also my 
| experience), that for some C++ code, sometimes -Os (which I believe 
| restricts the amount of inlining) results in more efficient code 
| than all other -O[0123] choices.

You seem to be confusing "inline" with "optimize".

[...]

| - certainly, at some point the compiler will know better what to 
| inline

at that point, "inline" could go the way of "register", but we're
not at that level yet.

I'll send in a moment a text I intended to send some time ago.

-- Gaby


* Re: std::pow implementation
  2003-08-04 13:11             ` Gabriel Dos Reis
@ 2003-08-04 14:32               ` Theodore Papadopoulo
  2003-08-04 14:50                 ` Gabriel Dos Reis
  0 siblings, 1 reply; 21+ messages in thread
From: Theodore Papadopoulo @ 2003-08-04 14:32 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Martin Reinecke, gcc

gdr@integrable-solutions.net said:
> listing the (new) standard library in this discussion will create more
> confusion than helps sort out the issue.

That was not my point. I listed some of the big (necessary and/or good) changes
in the C++ context. The bigger library was often associated with longer compile times, 
but as I said, the changes in that list are certainly not the ones to 
blame for the compile times.

> | So saying that KCC was better than gcc because it inlined more was
> | simply not true (at the time where this story begins), KCC was still
> | often better than gcc and gcc was inlining everything it could (ie
> | marked as inline either explicit or implicit).

> Please do be careful in your rephrasing.  I'm specifically concerned
> about honouring the inline request and -not- "inlining more".  And I'm
> not saying GCC should inline everything.  Part of the confusion in
> this debate came from the fact that people are considering that
> honouring the inline request means inlining everything.

Agreed. But at that time gcc was "honouring the inline request" even 
better than KCC was. That is what the "more" refers to... When I say 
that gcc inlined everything it could, I mean that it was "honouring the
inline request in all places where it could".

> | - The inlining strategy plays some important role. In the function 
> Equally important is considering the language-specific semantics.

Totally agreed. I think I basically say the same thing somewhere 
else in the text.

> | Final point: it has been reported here (and it is also my
> | experience), that for some C++ code, sometimes -Os (which I believe
> | restricts the amount of inlining) results in more efficient code
> | than all other -O[0123] choices.
>
> You seem to be confusing "inline" with "optimize".

Well, I agree that the example is not as clear as I would have wished...
Certainly -Os has consequences for the inlining limits, but I agree 
that this effect might be drowned out by many other program transformations.

> I'll send in a moment a text I intended to send some time ago. 

I'll read it.

Note that my point was not to discuss how inline should be honoured. 
I just gave several contradicting facts:

- when gcc honored inline the way you would like it to, it worked and 
was faster than it is now (not sure about the compiled code speed but 
it was not that bad in my recollection).

- Some limits on the "honoring" just have to exist.

- on the other hand, as mentioned elsewhere in the thread, just 
saying that the programmer knows what he or she is doing is simply not quite 
true... And portability makes things even worse.

My guess (but this is just that) is that the situation is bad for 
C++ because:

- inlining is now tree-based but most other optimizations are 
RTL-based.

- general speed has decreased (at least up to recently).

- most inlining limits were set based on too narrow testing (I 
believe it was mostly C programs). I agree that C and C++ need 
different treatment.

Now if I have to say something about the topic: unfortunately, 
because of the compilation times, I am not quite sure that gcc is 
ready for the old "honor nearly every inline up to some extremely 
high limit" policy... That is why I said that the current situation is just 
a pragmatic move rather than a definitive answer.

As you said, to really realize the full benefits of inlining, one has 
to somehow integrate inlining with other optimizations (cprop, dce, ...). 
Currently (correct me if I'm wrong), the situation is doomed because 
inlining decisions are made on trees and most of these optimisations 
come later on RTL, so the metrics used to decide whether inlining is 
a profitable choice cannot take into account the further optimizations
the inlining may allow. As I said, my hope is that the SSA stuff will 
change the overall picture and will make the "compiler made" 
decisions a better alternative than they are now. Will the decrease in 
code size (and hopefully the gains in memory consumption and speed) be enough to 
"honour" all inline hints as you argue (provided that is better than 
the choices made by the compiler)? I do not know.

My main conclusion is that inlining limits are just a trick to make 
the compiler fast enough to convince people to use it. They must be 
reconsidered regularly and IMHO made specific to each language (to 
better cope with the language expectations).
This is unfortunate and just a pragmatic conclusion and I place my 
hopes in SSA for a better answer to the problem.

Please correct me if I'm wrong in my expectations !!!

--------------------------------------------------------------------
Theodore Papadopoulo
Email: Theodore.Papadopoulo@sophia.inria.fr Tel: (33) 04 92 38 76 01
 --------------------------------------------------------------------


* Re: std::pow implementation
  2003-08-04 14:32               ` Theodore Papadopoulo
@ 2003-08-04 14:50                 ` Gabriel Dos Reis
  2003-08-04 14:58                   ` Daniel Berlin
  0 siblings, 1 reply; 21+ messages in thread
From: Gabriel Dos Reis @ 2003-08-04 14:50 UTC (permalink / raw)
  To: Theodore Papadopoulo; +Cc: Martin Reinecke, gcc

Theodore Papadopoulo <Theodore.Papadopoulo@sophia.inria.fr> writes:

| > I'll send in a moment a text I intended to send some time ago. 
| 
| I'll read it.
| 
| 
| Note that my point was not to discuss how inline should be honoured. 
| I gave just many contradicting facts:

and I appreciate them.

| - when gcc honored inline like you would like it did, it worked and 
| was faster than it is now (not sure about the compiled code speed but 
| it was not that bad in my recollections).
| 
| - Some limits on the "honoring" just have to exist.

I explicitly acknowledge "pathological" cases, but surely no simple,
short expression is pathological.

| - on the other hand, as mentioned elsewhere in the thread, just 
| saying that the programmer knows what it does is simply not quite 
| true... And the portability makes things even worse.

We're talking about C++ where we already trust the programmer in many
many more circumstances.  

| My guess (but this is just that) is that the situation is bad for 
| C++ because:
| 
| - inlining is now tree-based but most other optimizations are 
| RTL-based.
| 
| - general speed has decreased (at least up to recently).
| 
| - most inlining limits were set based on too narrow testing (I 
| believe it was mostly C programs). I agree that C and C++ need 
| different treatment.

If we go by semantics alone, I would say C needs different
treatment.  However I'm not arguing about how C should be treated. 
For the time being, I'm concerned about C++.

I think many notions of "inlining" are being confused and C++'s is
lost in the process.  At least we have: 

  (1) language specific (C++).  This, ideally, should happen very early
      in the program translation.

  (2) more-or-less language-neutral, e.g. middle end or back ends,
      meaning "optimize for whatever" 

Currently, what we're doing is much closer to (2) than (1).  I do not
believe they should be mutually exclusive.  The compiler is free to
inline any non-inline function if that helps "optimize for whatever".
I don't think we should ban an RTL inliner just because we have a
tree inliner.  Both can be useful.  Of course, a tree-based inliner
has a better chance of understanding language-specific semantics
than an RTL inliner. 

If you take something like

    template<class T>
      struct vector_iterator {
         T* ptr;
         // ...
      };

    template<class T>
      inline bool
      operator==(vector_iterator<T> p, vector_iterator<T> q)
      { return p.ptr == q.ptr; }

we should make it so that a use like p == q is as efficient as a
comparison between scalars p_ptr == q_ptr.  If you don't inline, then
you will be forced to use the usual function call ABI dictated by the
target and you may lose performance, especially if the comparison is
done in a tight loop and the target ABI requires using the stack for
structures. 
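
A sketch of the tight-loop case, assuming the iterator above is filled
out with a hypothetical dereference and increment, plus an `operator!=`
and a `sum` function (none of which are in the original snippet).  With
the comparison inlined, the loop test can reduce to a comparison of two
raw pointers, with no structure passed through the call ABI.

```cpp
template<class T>
  struct vector_iterator {
     T* ptr;
     // Hypothetical members, added only for this illustration.
     T& operator*() const { return *ptr; }
     vector_iterator& operator++() { ++ptr; return *this; }
  };

template<class T>
  inline bool
  operator==(vector_iterator<T> p, vector_iterator<T> q)
  { return p.ptr == q.ptr; }

// Hypothetical companion operator, defined in terms of operator==.
template<class T>
  inline bool
  operator!=(vector_iterator<T> p, vector_iterator<T> q)
  { return !(p == q); }

// Sums the range [first, last).  The loop condition calls operator!=
// once per iteration, which is where a non-inlined call would hurt.
inline long
sum(vector_iterator<const int> first, vector_iterator<const int> last)
{
    long total = 0;
    for (; first != last; ++first)
        total += *first;
    return total;
}
```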

Similarly, inlining something like

   template<class T>
     const T& min(const T& a, const T& b)
     { return b > a ? a : b; }

helps to identify and remove spurious references.
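
To sketch what "spurious references" means here, take a hypothetical
caller `clamp_to` (not from the thread).  If `min` is called out of
line, `x` and `y` must live in memory so that their addresses can be
passed.  Once `min` is inlined, the references can disappear and the
body can reduce to a comparison and select on plain values.

```cpp
template<class T>
  const T& min(const T& a, const T& b)
  { return b > a ? a : b; }

// Hypothetical caller.  The result is copied out by value, so no
// reference to the locals escapes the function.
int clamp_to(int x, int limit)
{
    int y = limit;
    return min(x, y);   // after inlining: y > x ? x : y, on values
}
```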

I would like to emphasize that I'm not claiming that inlining
everything will solve all optimization problems we have.

[...]

| My main conclusion is that inlining limits are just a trick to make 
| the compiler fast enough to convince people to use it. They must be 

If the compiler is "fast enough" but does not produce "acceptable"
code, people won't use it.  Some people are still sticking with 2.95
because they think it is doing a better (inlining) job.

Just because I'm arguing for a language-specific meaning for C++ inline
does not mean I'm not pragmatic or that I do not understand what purpose
heuristics might serve.

-- Gaby


* Re: std::pow implementation
  2003-08-04 14:50                 ` Gabriel Dos Reis
@ 2003-08-04 14:58                   ` Daniel Berlin
  2003-08-04 15:13                     ` The Zen of Optimizing C++ - What is the sound of one compiler inlining? Gabriel Dos Reis
  0 siblings, 1 reply; 21+ messages in thread
From: Daniel Berlin @ 2003-08-04 14:58 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Theodore Papadopoulo, Martin Reinecke, gcc

Hey guys, maybe we can rename this thread to "The Zen of Optimizing C++ - What
is the sound of one compiler inlining?" or something, rather than "std::pow
implementation", since it  stopped being about std::pow maybe 90 messages
ago?

> Theodore Papadopoulo <Theodore.Papadopoulo@sophia.inria.fr> writes:
>
> | > I'll send in a moment a text I intended to send some time ago.
> |
> | I'll read it.
> |
> |
> | Note that my point was not to discuss how inline should be honoured.
> | I gave just many contradicting facts:
>
> and I appreciate them.


* The Zen of Optimizing C++ - What is the sound of one compiler inlining?
  2003-08-04 14:58                   ` Daniel Berlin
@ 2003-08-04 15:13                     ` Gabriel Dos Reis
  0 siblings, 0 replies; 21+ messages in thread
From: Gabriel Dos Reis @ 2003-08-04 15:13 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Theodore Papadopoulo, Martin Reinecke, gcc

Daniel Berlin <dberlin@dberlin.org> writes:

| Hey guys, maybe we can rename this thread to "The Zen of Optimizing C++ - What
| is the sound of one compiler inlining?" or something, rather than "std::pow
| implementation", since it  stopped being about std::pow maybe 90 messages
| ago?

Fixed thusly :-)

-- Gaby


end of thread, other threads:[~2003-08-04 14:47 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-07-30 13:13 std::pow implementation Martin Reinecke
2003-07-30 13:30 ` Gabriel Dos Reis
2003-07-30 13:40   ` Martin Reinecke
2003-07-30 13:46     ` Andrew Pinski
2003-07-30 13:47       ` Steven Bosscher
2003-07-30 14:32         ` Martin Reinecke
2003-07-30 13:53     ` Gabriel Dos Reis
2003-07-30 14:14       ` Martin Reinecke
2003-07-30 14:33         ` Gabriel Dos Reis
2003-07-30 15:27           ` Martin Reinecke
2003-07-30 15:42             ` Gabriel Dos Reis
2003-07-30 17:38               ` Martin Reinecke
2003-08-04 12:55           ` Theodore Papadopoulo
2003-08-04 13:11             ` Gabriel Dos Reis
2003-08-04 14:32               ` Theodore Papadopoulo
2003-08-04 14:50                 ` Gabriel Dos Reis
2003-08-04 14:58                   ` Daniel Berlin
2003-08-04 15:13                     ` The Zen of Optimizing C++ - What is the sound of one compiler inlining? Gabriel Dos Reis
2003-07-30 15:56 ` std::pow implementation Scott Robert Ladd
2003-07-30 16:16   ` Steven Bosscher
2003-07-30 16:47     ` Scott Robert Ladd
