LTO and the inlining of functions only called once.

public inbox for gcc@gcc.gnu.org

* LTO and the inlining of functions only called once.
@ 2009-10-10 11:00 Toon Moene
  2009-10-10 11:34 ` Richard Guenther
  0 siblings, 1 reply; 33+ messages in thread
From: Toon Moene @ 2009-10-10 11:00 UTC (permalink / raw)
  To: Jan Hubicka, gcc mailing list

Gcc's man page says:

        -finline-functions-called-once
            Consider all "static" functions called once for inlining into
            their caller even if they are not marked "inline".  If a call
            to a given function is integrated, then the function is not
            output as assembler code in its own right.

            Enabled at levels -O1, -O2, -O3 and -Os.

Now, when using -flto -fwhole-program, *all* functions (that the user 
provided) will be "static inline", no ? - so *all* functions only called 
once in that program will be inlined ?

I am asking because our most important programs often consist of a chain 
of "routines-that-do-all-the-work" which are all only called once. 
However, in general they are (certainly when viewed from the perspective 
of a C programmer) *huge*.

E.g., our forecast program:

      HLPROG (small "main" program)
         |
         | calls
         V
      GEMINI (read input files, write output files, and:)
         |
         V
      SL2TIM (time stepping, and:)
         |
         V
      PHCALL (subgrid scale computations)
         |
         V
      PHTASK (split them into tasks over model domain)
         |
         V
      PHYS   (actually hand out the work to:)
         |
      ----------------------
      |    |    |     |    |
      LSP  CVP  RADIA TURB SOIL

      (large scale precipitation, convective precipitation,
       radiation, turbulence, soil processes)

      The last five are each around 2,000 lines of Fortran, the P 
routines are each several hundreds of lines, as is SL2TIM.  GEMINI is 
more than 2,000 lines.

My question is: How can I be sure that all of them are integrated (note 
that the man page says they are "considered") ?  Does -Winline help here 
?  Perhaps I should scan the assembler output (HAH!).
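Scanning the assembler output need not be done by eye.  The following is a minimal sketch, not a definitive tool: it collects the labels defined in GNU-style assembler text and reports which routines still have a label of their own.  It assumes gfortran's default naming of external procedures (lower case plus one trailing underscore, so GEMINI becomes gemini_); the helper names and the sample text are made up for illustration.

```python
import re

# Matches a label definition at the start of a line, e.g. "gemini_:".
LABEL_RE = re.compile(r"^([A-Za-z_.$][\w.$]*):", re.MULTILINE)


def defined_labels(asm_text):
    """Return the set of labels defined in GNU-style assembler text."""
    return set(LABEL_RE.findall(asm_text))


def still_emitted(asm_text, routines):
    """Map each routine name to True if a label for it is still present.

    Assumption: external Fortran procedures are named in lower case with
    one trailing underscore (GEMINI -> gemini_).
    """
    labels = defined_labels(asm_text)
    return {name: (name.lower() + "_") in labels for name in routines}


# Hypothetical assembler fragment, for illustration only.
sample = """\
	.text
	.globl	main
main:
	call	gemini_
	ret
gemini_:
	ret
"""

print(still_emitted(sample, ["GEMINI", "SL2TIM"]))
```

A routine that no longer has its own label was not output "as assembler code in its own right", which is what the man page promises for an integrated call; a routine that still has one was either not inlined or had to be kept for another reason.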

Kind regards,

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 11:00 LTO and the inlining of functions only called once Toon Moene
@ 2009-10-10 11:34 ` Richard Guenther
  2009-10-10 11:47   ` Toon Moene
                     ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Richard Guenther @ 2009-10-10 11:34 UTC (permalink / raw)
  To: Toon Moene; +Cc: Jan Hubicka, gcc mailing list

On Sat, Oct 10, 2009 at 12:55 PM, Toon Moene <toon@moene.org> wrote:
> Gcc's man page says:
>
>       -finline-functions-called-once
>           Consider all "static" functions called once for inlining into
>           their caller even if they are not marked "inline".  If a call
>           to a given function is integrated, then the function is not
>           output as assembler code in its own right.
>
>           Enabled at levels -O1, -O2, -O3 and -Os.
>
> Now, when using -flto -fwhole-program, *all* functions (that the user
> provided) will be "static inline", no ? - so *all* functions only called
> once in that program will be inlined ?

Well, I think that we should try to not do this across the whole program.
Simply for the reason that a gigantic main function will hit several
non-linear complexity algorithms in GCC.

> I am asking because our most important programs often consist of a chain of
> "routines-that-do-all-the-work" which are all only called once. However, in
> general they are (certainly when viewed from the perspective of a C
> programmer) *huge*.
>
> E.g., our forecast program:
>
>     HLPROG (small "main" program)
>        |
>        | calls
>        V
>     GEMINI (read input files, write output files, and:)
>        |
>        V
>     SL2TIM (time stepping, and:)
>        |
>        V
>     PHCALL (subgrid scale computations)
>        |
>        V
>     PHTASK (split them into tasks over model domain)
>        |
>        V
>     PHYS   (actually hand out the work to:)
>        |
>     ----------------------
>     |    |    |     |    |
>     LSP  CVP  RADIA TURB SOIL
>
>     (large scale precipitation, convective precipitation,
>      radiation, turbulence, soil processes)
>
>     The last five are each around 2,000 lines of Fortran, the P routines are
> each several hundreds of lines, as is SL2TIM.  GEMINI is more than 2,000
> lines.
>
> My question is: How can I be sure that all of them are integrated (note that
> the man page says they are "considered") ?  Does -Winline help here ?
>  Perhaps I should scan the assembler output (HAH!).

-Winline doesn't help here.  Scanning the assembler output does (obviously!).

Note that I wouldn't expect such aggressive inlining to have any positive
performance impact - were it not for the fact that we still lack a properly
operating IPA points-to analysis pass (yes, it's on my todo list - but
not for 4.5).

Richard.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 11:34 ` Richard Guenther
@ 2009-10-10 11:47   ` Toon Moene
  2009-10-10 11:50     ` Richard Guenther
  2009-10-10 13:40   ` Jan Hubicka
  2009-10-14 18:58   ` Paolo Bonzini
  2 siblings, 1 reply; 33+ messages in thread
From: Toon Moene @ 2009-10-10 11:47 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Jan Hubicka, gcc mailing list

Richard Guenther wrote:

> On Sat, Oct 10, 2009 at 12:55 PM, Toon Moene <toon@moene.org> wrote:

> Well, I think that we should try to not do this across the whole program.
> Simply for the reason that a gigantic main function will hit several
> non-linear complexity algorithms in GCC.

But, but ... other people are talking about 30 minute "compiles" while 
the lto1 executable doesn't even register on the radar (top) over here.

Besides, inlining these subroutines will get rid of 200+ item argument 
lists, a benefit on its own (think I-cache).

In the meantime I found the --param I'll tweak:

  --param max-inline-insns-single=100000

I'd like to see some fireworks, too !

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 11:47   ` Toon Moene
@ 2009-10-10 11:50     ` Richard Guenther
  2009-10-10 12:18       ` Richard Guenther
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Guenther @ 2009-10-10 11:50 UTC (permalink / raw)
  To: Toon Moene; +Cc: Jan Hubicka, gcc mailing list

On Sat, Oct 10, 2009 at 1:34 PM, Toon Moene <toon@moene.org> wrote:
> Richard Guenther wrote:
>
>> On Sat, Oct 10, 2009 at 12:55 PM, Toon Moene <toon@moene.org> wrote:
>
>> Well, I think that we should try to not do this across the whole program.
>> Simply for the reason that a gigantic main function will hit several
>> non-linear complexity algorithms in GCC.
>
> But, but ... other people are talking about 30 minute "compiles" while the
> lto1 executable doesn't even register on the radar (top) over here.
>
> Besides, inlining these subroutines will get rid of 200+ item argument
> lists, a benefit on its own (think I-cache).
>
> In the mean time I found the --param I'll tweak:
>
>  --param max-inline-insns-single=100000
>
> I'd like to see some fireworks, too !

That's not the parameter you want to tweak ;)  You want

--param large-function-growth=10000 --param large-function-insns=1000000
--param large-stack-frame-growth=10000 --param large-stack-frame=100000

Richard.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 11:50     ` Richard Guenther
@ 2009-10-10 12:18       ` Richard Guenther
  2009-10-10 12:31         ` Toon Moene
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Guenther @ 2009-10-10 12:18 UTC (permalink / raw)
  To: Toon Moene; +Cc: Jan Hubicka, gcc mailing list

On Sat, Oct 10, 2009 at 1:46 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Sat, Oct 10, 2009 at 1:34 PM, Toon Moene <toon@moene.org> wrote:
>> Richard Guenther wrote:
>>
>>> On Sat, Oct 10, 2009 at 12:55 PM, Toon Moene <toon@moene.org> wrote:
>>
>>> Well, I think that we should try to not do this across the whole program.
>>> Simply for the reason that a gigantic main function will hit several
>>> non-linear complexity algorithms in GCC.
>>
>> But, but ... other people are talking about 30 minute "compiles" while the
>> lto1 executable doesn't even register on the radar (top) over here.
>>
>> Besides, inlining these subroutines will get rid of 200+ item argument
>> lists, a benefit on its own (think I-cache).
>>
>> In the mean time I found the --param I'll tweak:
>>
>>  --param max-inline-insns-single=100000
>>
>> I'd like to see some fireworks, too !
>
> That's not the parameter you want to tweak ;)  You want
>
> --param large-function-growth=10000 --param large-function-insns=1000000
> --param large-stack-frame-growth=10000 --param large-stack-frame=100000

Or rather for testing the effect of inlining all functions called once
use the following
patch:

Index: ipa-inline.c
===================================================================
--- ipa-inline.c	(revision 152615)
+++ ipa-inline.c	(working copy)
@@ -1249,8 +1249,8 @@ cgraph_decide_inlining (void)
 			   node->callers->caller->global.size);
 		}

-	      if (cgraph_check_inline_limits (node->callers->caller, node,
-					      NULL, false))
+	      if (1 || cgraph_check_inline_limits (node->callers->caller, node,
+						   NULL, false))
 		{
 		  cgraph_mark_inline (node->callers);
 		  if (dump_file)


tuning params will affect other inlining decisions as well.

Richard.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 12:18       ` Richard Guenther
@ 2009-10-10 12:31         ` Toon Moene
  2009-10-10 14:25           ` Toon Moene
  0 siblings, 1 reply; 33+ messages in thread
From: Toon Moene @ 2009-10-10 12:31 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Jan Hubicka, gcc mailing list

Richard Guenther wrote:

> On Sat, Oct 10, 2009 at 1:46 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Sat, Oct 10, 2009 at 1:34 PM, Toon Moene <toon@moene.org> wrote:

[ Inlining all functions called once ]

>>> I'd like to see some fireworks, too !
>> That's not the parameter you want to tweak ;)  You want
>>
>> --param large-function-growth=10000 --param large-function-insns=1000000
>> --param large-stack-frame-growth=10000 --param large-stack-frame=100000
> 
> Or rather for testing the effect of inlining all functions called once
> use the following
> patch:
> 
> Index: ipa-inline.c
> ===================================================================
> --- ipa-inline.c	(revision 152615)
> +++ ipa-inline.c	(working copy)
> @@ -1249,8 +1249,8 @@ cgraph_decide_inlining (void)
>  			   node->callers->caller->global.size);
>  		}
> 
> -	      if (cgraph_check_inline_limits (node->callers->caller, node,
> -					      NULL, false))
> +	      if (1 || cgraph_check_inline_limits (node->callers->caller, node,
> +						   NULL, false))
>  		{
>  		  cgraph_mark_inline (node->callers);
>  		  if (dump_file)

Going this route, thanks !

> tuning params will affect other inlining decisions as well.

Yep, I was afraid of that too, but think it is inconsequential for our code.

Thanks !

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 11:34 ` Richard Guenther
  2009-10-10 11:47   ` Toon Moene
@ 2009-10-10 13:40   ` Jan Hubicka
  2009-10-10 15:27     ` Daniel Jacobowitz
  2009-10-14 18:58   ` Paolo Bonzini
  2 siblings, 1 reply; 33+ messages in thread
From: Jan Hubicka @ 2009-10-10 13:40 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Toon Moene, Jan Hubicka, gcc mailing list

> On Sat, Oct 10, 2009 at 12:55 PM, Toon Moene <toon@moene.org> wrote:
> > Gcc's man page says:
> >
> >       -finline-functions-called-once
> >           Consider all "static" functions called once for inlining into
> >           their caller even if they are not marked "inline".  If a call
> >           to a given function is integrated, then the function is not
> >           output as assembler code in its own right.
> >
> >           Enabled at levels -O1, -O2, -O3 and -Os.
> >
> > Now, when using -flto -fwhole-program, *all* functions (that the user
> > provided) will be "static inline", no ? - so *all* functions only called
> > once in that program will be inlined ?
> 
> Well, I think that we should try to not do this across the whole program.
> Simply for the reason that a gigantic main function will hit several
> non-linear complexity algorithms in GCC.

We do inline functions called once until we hit the large-function-growth
parameter.  The parameter is there to avoid hitting nonlinearity in the
compiler and the fact that some passes simply give up on large
functions.
I have not experimented much with tuning these limits; it would be
interesting to see whether they make a difference in your testcase.
> 
> -Winline doesn't help here.  Scanning the assembler output does (obviously!).

My solution would probably be to pass the -fdump-ipa-inline option to the
LTO compilation and read the log.  It lists the inlining decisions, and if
something is not inlined, you get a dump of the reason why.
The dump from the LTO compilation appears in /tmp for some reason.
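Reading the log for one routine can be narrowed down with a small filter.  This is only a sketch: it makes no assumption about the dump's layout beyond it being line-oriented text, and the function name, the routine name and the sample lines are placeholders rather than real dump output.

```python
def lines_mentioning(dump_text, name, context=1):
    """Return the dump lines that mention `name`, plus `context` lines
    before and after each hit, in their original order."""
    lines = dump_text.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if name in line:
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            keep.update(range(lo, hi))
    return [lines[i] for i in sorted(keep)]


# Placeholder text, NOT real -fdump-ipa-inline output.
sample = (
    "header\n"
    "note about gemini_\n"
    "reason line\n"
    "unrelated\n"
    "note about sl2tim_\n"
)

for line in lines_mentioning(sample, "gemini_"):
    print(line)
```

Keeping a line of context on either side matters because the reason for a decision may sit on the line after the one that names the function.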

Honza
> 
> Note that I wouldn't expect such aggressive inlining to have any positive
> performance impact - were it not for the fact that we still lack a properly
> operating IPA points-to analysis pass (yes, it's on my todo list - but
> not for 4.5).
> 
> Richard.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 12:31         ` Toon Moene
@ 2009-10-10 14:25           ` Toon Moene
  2009-10-10 15:10             ` Richard Guenther
  0 siblings, 1 reply; 33+ messages in thread
From: Toon Moene @ 2009-10-10 14:25 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Jan Hubicka, gcc mailing list

Toon Moene wrote:

> Richard Guenther wrote:

>> Or rather for testing the effect of inlining all functions called once
>> use the following
>> patch:
>>
>> Index: ipa-inline.c
>> ===================================================================
>> --- ipa-inline.c    (revision 152615)
>> +++ ipa-inline.c    (working copy)
>> @@ -1249,8 +1249,8 @@ cgraph_decide_inlining (void)
>>                 node->callers->caller->global.size);
>>          }
>>
>> -          if (cgraph_check_inline_limits (node->callers->caller, node,
>> -                          NULL, false))
>> +          if (1 || cgraph_check_inline_limits (node->callers->caller, node,
>> +                           NULL, false))
>>          {
>>            cgraph_mark_inline (node->callers);
>>            if (dump_file)
> 
> Going this route, thanks !

Well, this definitely wasn't the right approach - it made timing for a 
single integration time step in the model vary between 63 and 94 seconds.

Without an explanation for that, this is not the way to go.

However, that doesn't explain the (small) difference in size of the 
executables to me (given the large differences in code generation that 
should ensue):

The two largest binaries without the above change to ipa-inline.c:

-rwxr-xr-x 1 hirlam hirlam 9943616 2009-10-10 09:06 hirvda.x
-rwxr-xr-x 1 hirlam hirlam 2306291 2009-10-10 09:04 hlprog.x

their counterparts, after the above change:

-rwxr-xr-x 1 hirlam hirlam 9943673 2009-10-10 14:40 hirvda.x
-rwxr-xr-x 1 hirlam hirlam 2306219 2009-10-10 14:37 hlprog.x

This doesn't scan.  In hlprog.x alone, we're removing several hundred
arguments from argument lists, both in the calling routine and the
callee.  This cannot just result in ~ 70 bytes of difference in code size.

OK, for those sceptical, here are the results of size:

Before the change:

    text    data     bss     dec     hex filename
2254145    3592 186344080       188601817       b3dd5d9 ../../../../../bin/hlprog.x

After the change:

    text    data     bss     dec     hex filename
2254113    3592 186344080       188601785       b3dd5b9 hlprog.x

Something's rotten in the State of Denmark :-)
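For reference, the differences in the figures quoted above can be worked out directly.  The sketch below uses only the numbers from the listings in this message; the variable names are arbitrary.

```python
# File sizes in bytes, from the two "ls -l" listings above.
before = {"hirvda.x": 9943616, "hlprog.x": 2306291}
after = {"hirvda.x": 9943673, "hlprog.x": 2306219}

# Positive means the binary grew after the change, negative means it shrank.
deltas = {name: after[name] - before[name] for name in before}

# "text" column of size(1) for hlprog.x, after minus before.
text_delta = 2254113 - 2254145

# size(1) reports dec = text + data + bss, and hex is the same total.
total_before = 2254145 + 3592 + 186344080
total_after = 2254113 + 3592 + 186344080

print(deltas)                      # {'hirvda.x': 57, 'hlprog.x': -72}
print(text_delta)                  # -32
print(total_before, hex(total_before))
print(total_after, hex(total_after))
```

So hlprog.x shrinks by 72 bytes on disk and by 32 bytes of text, while hirvda.x grows by 57 bytes: changes of tens of bytes in binaries of 2 MB and 10 MB.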

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 14:25           ` Toon Moene
@ 2009-10-10 15:10             ` Richard Guenther
  0 siblings, 0 replies; 33+ messages in thread
From: Richard Guenther @ 2009-10-10 15:10 UTC (permalink / raw)
  To: Toon Moene; +Cc: Jan Hubicka, gcc mailing list

On Sat, Oct 10, 2009 at 3:40 PM, Toon Moene <toon@moene.org> wrote:
> Toon Moene wrote:
>
>> Richard Guenther wrote:
>
>>> Or rather for testing the effect of inlining all functions called once
>>> use the following
>>> patch:
>>>
>>> Index: ipa-inline.c
>>> ===================================================================
>>> --- ipa-inline.c    (revision 152615)
>>> +++ ipa-inline.c    (working copy)
>>> @@ -1249,8 +1249,8 @@ cgraph_decide_inlining (void)
>>>                node->callers->caller->global.size);
>>>         }
>>>
>>> -          if (cgraph_check_inline_limits (node->callers->caller, node,
>>> -                          NULL, false))
>>> +          if (1 || cgraph_check_inline_limits (node->callers->caller, node,
>>> +                           NULL, false))
>>>         {
>>>           cgraph_mark_inline (node->callers);
>>>           if (dump_file)
>>
>> Going this route, thanks !
>
> Well, this definitely wasn't the right approach - it made timing for a
> single integration time step in the model vary between 63 and 94 seconds.
>
> Without an explanation for that, this is not the way to go.
>
> However, that doesn't explain the (small) difference in size of the
> executables to me (given the large differences in code generation that
> should ensue):
>
> The two largest binaries without the above change to ipa-inline.c:
>
> -rwxr-xr-x 1 hirlam hirlam 9943616 2009-10-10 09:06 hirvda.x
> -rwxr-xr-x 1 hirlam hirlam 2306291 2009-10-10 09:04 hlprog.x
>
> their counterparts, after the above change:
>
> -rwxr-xr-x 1 hirlam hirlam 9943673 2009-10-10 14:40 hirvda.x
> -rwxr-xr-x 1 hirlam hirlam 2306219 2009-10-10 14:37 hlprog.x
>
> This doesn't scan.  In hlprog.x only, we're removing several hundreds of
>  arguments in argument lists, both in the calling routine and the callee.
>  This cannot just result in ~ 70 bytes of difference in code size.
>
> OK, for those sceptical, here are the results of size:
>
> Before the change:
>
>   text    data     bss     dec     hex filename
> 2254145    3592 186344080       188601817       b3dd5d9 ../../../../../bin/hlprog.x
>
> After the change:
>
>   text    data     bss     dec     hex filename
> 2254113    3592 186344080       188601785       b3dd5b9 hlprog.x
>
> Something's rotten in the State of Denmark :-)

Well, it might simply be that even with our standard parameters full
inlining of functions called once already happens for you.

As Honza said - the only real way to know is to look at the dumps,
-fdump-ipa-inline in this case, there should be a section titled
'Deciding on functions called once:'
and you can compare the differences with/without the patch.
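The comparison can be scripted.  The following is a rough sketch under stated assumptions: the only thing taken from the dump format is the section title quoted above; where the section ends is not known here, so by default everything from the title to the end of the text is treated as the section (an optional end_marker can cut it short).  The function names and the sample text are invented for illustration.

```python
import difflib

HEADER = "Deciding on functions called once:"


def called_once_section(dump_text, end_marker=None):
    """Return the lines from the HEADER line onwards.

    Assumption: the section runs to the end of the text unless a line
    containing `end_marker` is found after the header.
    """
    lines = dump_text.splitlines()
    start = next((i for i, line in enumerate(lines) if HEADER in line), None)
    if start is None:
        return []
    section = lines[start:]
    if end_marker is not None:
        for j, line in enumerate(section[1:], start=1):
            if end_marker in line:
                section = section[:j]
                break
    return section


def compare_sections(dump_without, dump_with):
    """Return only the lines that differ between the two sections,
    prefixed with '- ' (only without the patch) or '+ ' (only with it)."""
    a = called_once_section(dump_without)
    b = called_once_section(dump_with)
    return [line for line in difflib.ndiff(a, b) if line[:2] in ("- ", "+ ")]


# Placeholder dumps, NOT real compiler output.
without_patch = "preamble\n" + HEADER + "\nalpha kept\nbeta rejected\n"
with_patch = "preamble\n" + HEADER + "\nalpha kept\nbeta accepted\n"

for line in compare_sections(without_patch, with_patch):
    print(line)
```

An empty result would support the guess that the standard parameters already allow every called-once function to be inlined in this program.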

Richard.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 13:40   ` Jan Hubicka
@ 2009-10-10 15:27     ` Daniel Jacobowitz
  2009-10-10 15:28       ` Jan Hubicka
                         ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Daniel Jacobowitz @ 2009-10-10 15:27 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Guenther, Toon Moene, Jan Hubicka, gcc mailing list

On Sat, Oct 10, 2009 at 02:31:25PM +0200, Jan Hubicka wrote:
> My solution would be probably to pass -fdump-ipa-inline parameter to lto
> compilation and read the log.  It lists the inlining decisions and if
> something is not inlined, you get dump of reason why.

GCC's dumps are really aimed at compiler developers.  I think we would
benefit from more "what is the compiler doing to my code" options
(producing "note:"); things like which functions were inlined, which
loops unrolled.  We do already have this for vectorization.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 15:27     ` Daniel Jacobowitz
@ 2009-10-10 15:28       ` Jan Hubicka
  2009-10-11 13:30         ` Toon Moene
  2009-10-10 15:33       ` Diego Novillo
  2009-10-10 16:40       ` Jeff Law
  2 siblings, 1 reply; 33+ messages in thread
From: Jan Hubicka @ 2009-10-10 15:28 UTC (permalink / raw)
  To: Jan Hubicka, Richard Guenther, Toon Moene, Jan Hubicka, gcc mailing list

> On Sat, Oct 10, 2009 at 02:31:25PM +0200, Jan Hubicka wrote:
> > My solution would be probably to pass -fdump-ipa-inline parameter to lto
> > compilation and read the log.  It lists the inlining decisions and if
> > something is not inlined, you get dump of reason why.
> 
> GCC's dumps are really aimed at compiler developers.  I think we would
> benefit from more "what is the compiler doing to my code" options
> (producing "note:"); things like which functions were inlined, which
> loops unrolled.  We do already have this for vectorization.

We already have -Winline, which reports the same info for all functions
marked inline.  I am not sure whether warning about all functions would
help much, but I might be C-centric: if I wanted to get something
inlined, I would use the inline keyword :)

Honza
> 
> -- 
> Daniel Jacobowitz
> CodeSourcery

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 15:27     ` Daniel Jacobowitz
  2009-10-10 15:28       ` Jan Hubicka
@ 2009-10-10 15:33       ` Diego Novillo
  2009-10-10 17:26         ` Joseph S. Myers
  2009-10-10 16:40       ` Jeff Law
  2 siblings, 1 reply; 33+ messages in thread
From: Diego Novillo @ 2009-10-10 15:33 UTC (permalink / raw)
  To: Jan Hubicka, Richard Guenther, Toon Moene, Jan Hubicka, gcc mailing list

On Sat, Oct 10, 2009 at 11:17, Daniel Jacobowitz <drow@false.org> wrote:
> On Sat, Oct 10, 2009 at 02:31:25PM +0200, Jan Hubicka wrote:
>> My solution would be probably to pass -fdump-ipa-inline parameter to lto
>> compilation and read the log.  It lists the inlining decisions and if
>> something is not inlined, you get dump of reason why.
>
> GCC's dumps are really aimed at compiler developers.  I think we would
> benefit from more "what is the compiler doing to my code" options
> (producing "note:"); things like which functions were inlined, which
> loops unrolled.  We do already have this for vectorization.

Agreed.  We've had some discussions on this and there's been some
efforts started (http://gcc.gnu.org/wiki/Pass%20Activity%20Log), but
nothing concrete so far.

We should evolve a generic reporting facility for passes to produce
human readable logs on what they did, what they couldn't do, reasons,
etc.


Diego.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 15:27     ` Daniel Jacobowitz
  2009-10-10 15:28       ` Jan Hubicka
  2009-10-10 15:33       ` Diego Novillo
@ 2009-10-10 16:40       ` Jeff Law
  2009-10-10 17:16         ` Richard Guenther
  2009-10-13 18:43         ` Toon Moene
  2 siblings, 2 replies; 33+ messages in thread
From: Jeff Law @ 2009-10-10 16:40 UTC (permalink / raw)
  To: Jan Hubicka, Richard Guenther, Toon Moene, Jan Hubicka, gcc mailing list

On 10/10/09 09:17, Daniel Jacobowitz wrote:
> On Sat, Oct 10, 2009 at 02:31:25PM +0200, Jan Hubicka wrote:
>    
>> My solution would be probably to pass -fdump-ipa-inline parameter to lto
>> compilation and read the log.  It lists the inlining decisions and if
>> something is not inlined, you get dump of reason why.
>>      
> GCC's dumps are really aimed at compiler developers.  I think we would
> benefit from more "what is the compiler doing to my code" options
> (producing "note:"); things like which functions were inlined, which
> loops unrolled.  We do already have this for vectorization.
>    
Interestingly enough, some of the Red Hat folks were having this exact 
discussion re: inlining with a customer last week.

What I was starting to think was to include not only functions which were
inlined, but also functions which were not inlined and the heuristic
limits that caused the function not to be inlined.

Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 16:40       ` Jeff Law
@ 2009-10-10 17:16         ` Richard Guenther
  2009-10-13  1:05           ` Jeff Law
  2009-10-13 18:43         ` Toon Moene
  1 sibling, 1 reply; 33+ messages in thread
From: Richard Guenther @ 2009-10-10 17:16 UTC (permalink / raw)
  To: Jeff Law; +Cc: Jan Hubicka, Toon Moene, Jan Hubicka, gcc mailing list

On Sat, Oct 10, 2009 at 6:24 PM, Jeff Law <law@redhat.com> wrote:
> On 10/10/09 09:17, Daniel Jacobowitz wrote:
>>
>> On Sat, Oct 10, 2009 at 02:31:25PM +0200, Jan Hubicka wrote:
>>
>>>
>>> My solution would be probably to pass -fdump-ipa-inline parameter to lto
>>> compilation and read the log.  It lists the inlining decisions and if
>>> something is not inlined, you get dump of reason why.
>>>
>>
>> GCC's dumps are really aimed at compiler developers.  I think we would
>> benefit from more "what is the compiler doing to my code" options
>> (producing "note:"); things like which functions were inlined, which
>> loops unrolled.  We do already have this for vectorization.
>>
>
> Interestingly enough, some of the Red Hat folks were having this exact
> discussion re: inlining with a customer last week.
>
> What I was starting to think was to include both functions which were
> inlined, but also functions which were not inlined and the heuristic limits
> that caused the function not to be inlined.

Well - that will print one diagnostic per callgraph edge.  A bit too much, no?

Richard.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 15:33       ` Diego Novillo
@ 2009-10-10 17:26         ` Joseph S. Myers
  2009-10-14 20:07           ` Paolo Bonzini
  0 siblings, 1 reply; 33+ messages in thread
From: Joseph S. Myers @ 2009-10-10 17:26 UTC (permalink / raw)
  To: Diego Novillo
  Cc: Jan Hubicka, Richard Guenther, Toon Moene, Jan Hubicka, gcc mailing list

On Sat, 10 Oct 2009, Diego Novillo wrote:

> On Sat, Oct 10, 2009 at 11:17, Daniel Jacobowitz <drow@false.org> wrote:
> > On Sat, Oct 10, 2009 at 02:31:25PM +0200, Jan Hubicka wrote:
> >> My solution would be probably to pass -fdump-ipa-inline parameter to lto
> >> compilation and read the log.  It lists the inlining decisions and if
> >> something is not inlined, you get dump of reason why.
> >
> > GCC's dumps are really aimed at compiler developers.  I think we would
> > benefit from more "what is the compiler doing to my code" options
> > (producing "note:"); things like which functions were inlined, which
> > loops unrolled.  We do already have this for vectorization.
> 
> Agreed.  We've had some discussions on this and there's been some
> efforts started (http://gcc.gnu.org/wiki/Pass%20Activity%20Log), but
> nothing concrete so far.
> 
> We should evolve a generic reporting facility for passes to produce
> human readable logs on what they did, what they couldn't do, reasons,
> etc.

We should also keep in mind that such logs aimed at users should support 
i18n - unlike the existing dumps for compiler developers, which are quite 
properly English only, and most calls to internal_error which should only 
appear if there is a compiler bug and are also only meant to be useful for 
compiler developers (so represent useless work for translators at present 
- though it does seem possible some internal_error calls could actually 
appear with invalid input rather than compiler bugs and so should be 
normal errors).

If you're generating "note:" via inform(), the i18n support is automatic.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 15:28       ` Jan Hubicka
@ 2009-10-11 13:30         ` Toon Moene
  0 siblings, 0 replies; 33+ messages in thread
From: Toon Moene @ 2009-10-11 13:30 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Guenther, Jan Hubicka, gcc mailing list

Jan Hubicka wrote:

> We already have -Winline that dumps same info for all functions marked
> inline.  I am not sure if warning about all functions would help too
> much, but I might be C centric where if I wanted to get something
> inlined, I would use inline keyword :)

I must say I learned *a lot* about vectorization by having the 
vectorizer report on *every loop* whether it could vectorize it or not.

A couple of weeks ago we handed off our code to an external reviewer.
He (using ifort), as well as we (using gfortran), reported ~ 100,000
loops, of which ~ 51 % were vectorized (he also categorized the
non-vectorized loops according to reason-not-vectorized (5 classes)).

If I were afraid of big numbers, I wouldn't work in meteorology :-)

[ PS, I finally constructed a dumbed down setup with the routines
   GEMINI, HLPROG, PHCALL, PHTASK and SL2TIM.  Amazingly, now
   -flto -fwhole-program starts to *do* something: lto1 spends a
   whole 25 seconds on processing this. ]

Line counts:
   3298 gemini.f
    830 hlprog.f
      2 main.f
    584 phcall.f
    882 phtask.f
   2182 sl2tim.f
   7778 total

Cheers,

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: LTO and the inlining of functions only called once.
  2009-10-10 17:16         ` Richard Guenther
@ 2009-10-13  1:05           ` Jeff Law
  2009-10-13  1:29             ` Michael Matz
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff Law @ 2009-10-13  1:05 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Jan Hubicka, Toon Moene, Jan Hubicka, gcc mailing list

On 10/10/09 10:40, Richard Guenther wrote:
>
> Well - that will print one diagnostic per callgraph edge.  A bit too much, no?
>    
Possibly -- it's not yet clear (to me) how to present this data to 
users, but it's clearly something they're interested in.

To put things in perspective, the particular person I spoke with spent 
many days trying to understand why a particular function wasn't being 
inlined -- presumably they'd see "grep <ugly function> logfile" as a 
huge improvement over the days and days of twiddling sources, tuning 
options, etc, even if that presented them with a large amount of data to 
analyze.

jeff


* Re: LTO and the inlining of functions only called once.
  2009-10-13  1:05           ` Jeff Law
@ 2009-10-13  1:29             ` Michael Matz
  2009-10-13  6:39               ` Jeff Law
  0 siblings, 1 reply; 33+ messages in thread
From: Michael Matz @ 2009-10-13  1:29 UTC (permalink / raw)
  To: Jeff Law
  Cc: Richard Guenther, Jan Hubicka, Toon Moene, Jan Hubicka, gcc mailing list

Hi,

On Mon, 12 Oct 2009, Jeff Law wrote:

> To put things in perspective, the particular person I spoke with spent 
> many days trying to understand why a particular function wasn't being 
> inlined -- presumably they'd see "grep <ugly function> logfile" as a 
> huge improvement over the days and days of twiddling sources, tuning 
> options, etc, even if that presented them with a large amount of data to 
> analyze.

If we would listen to such requests by providing the requested 
information, nothing stops users from asking to have something like that 
also for other transformations.  Like "I've spent days and days with 
analyzing why this loop isn't unrolled, I'd like to have -Winfo-unroll to 
tell me exactly when a loop is unrolled, and when it isn't for which 
reason".  Make "loop is unrolled" be $TRANSFORMATION and it becomes silly.  
I don't think this is reductio ad absurdum.  We have dump files for 
exactly such information.  Maybe the latter could be molded (via a new 
flag) into something less detailed than now, but still containing the 
larger decisions.

Ciao,
Michael.


* Re: LTO and the inlining of functions only called once.
  2009-10-13  1:29             ` Michael Matz
@ 2009-10-13  6:39               ` Jeff Law
  2009-10-13 12:05                 ` Paul Brook
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff Law @ 2009-10-13  6:39 UTC (permalink / raw)
  To: Michael Matz
  Cc: Richard Guenther, Jan Hubicka, Toon Moene, Jan Hubicka, gcc mailing list

On 10/12/09 19:18, Michael Matz wrote:
> Hi,
>
> On Mon, 12 Oct 2009, Jeff Law wrote:
>
>    
>> To put things in perspective, the particular person I spoke with spent
>> many days trying to understand why a particular function wasn't being
>> inlined -- presumably they'd see "grep <ugly function> logfile" as a
>> huge improvement over the days and days of twiddling sources, tuning
>> options, etc, even if that presented them with a large amount of data to
>> analyze.
>>      
> If we would listen to such requests by providing the requested
> information, nothing stops users from asking to have something like that
> also for other transformations.  Like "I've spent days and days with
> analyzing why this loop isn't unrolled, I'd like to have -Winfo-unroll to
> tell me exactly when a loop is unrolled, and when it isn't for which
> reason".  Make "loop is unrolled" be $TRANSFORMATION and it becomes silly.
> I don't think this is reductio ad absurdum.  We have dump files for
> exactly such information.  Maybe the latter could be molded (via a new
> flag) into something less detailed than now, but still containing the
> larger decisions.
>    
I'm virtually certain this customer would ask for that precise 
information about unrolling once they can get it for inline functions :-)

Nothing you've said changes the fact that there is a class of users 
for whom that information is vital and we ought to spend some time 
thinking about how to provide the information in a form they can 
digest.  GCC dumps as they exist today are largely useless for a non-GCC 
developer to use to understand why a particular transformation did or 
did not occur in their code.  This has come up time and time again and 
will continue to do so unless we find a way to provide visibility into 
the optimizer's decision making.

jeff


* Re: LTO and the inlining of functions only called once.
  2009-10-13  6:39               ` Jeff Law
@ 2009-10-13 12:05                 ` Paul Brook
  2009-10-13 13:30                   ` Daniel Jacobowitz
                                     ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Paul Brook @ 2009-10-13 12:05 UTC (permalink / raw)
  To: gcc
  Cc: Jeff Law, Michael Matz, Richard Guenther, Jan Hubicka,
	Toon Moene, Jan Hubicka

> Nothing you've said changes the fact that there is a class of users
> for whom that information is vital and we ought to spend some time
> thinking about how to provide the information in a form they can
> digest.  GCC dumps as they exist today are largely useless for a non-GCC
> developer to use to understand why a particular transformation did or
> did not occur in their code.  This has come up time and time again and
> will continue to do so unless we find a way to provide visibility into
> the optimizer's decision making.

My guess is anyone inspecting the code and optimizer decisions at this level 
is also going to want to strongarm the result they want when the compiler 
makes the "wrong" decision. i.e. detailed unroller diagnostics are only of 
limited use without (say) #pragma unroll.

Paul


* Re: LTO and the inlining of functions only called once.
  2009-10-13 12:05                 ` Paul Brook
@ 2009-10-13 13:30                   ` Daniel Jacobowitz
  2009-10-13 15:15                   ` Jeff Law
  2009-10-13 18:31                   ` Adam Nemet
  2 siblings, 0 replies; 33+ messages in thread
From: Daniel Jacobowitz @ 2009-10-13 13:30 UTC (permalink / raw)
  To: Paul Brook
  Cc: gcc, Jeff Law, Michael Matz, Richard Guenther, Jan Hubicka,
	Toon Moene, Jan Hubicka

On Tue, Oct 13, 2009 at 10:41:39AM +0100, Paul Brook wrote:
> My guess is anyone inspecting the code and optimizer decisions at this level 
> is also going to want to strongarm the result they want when the compiler 
> makes the "wrong" decision. i.e. detailed unroller diagnostics are only of 
> limited use without (say) #pragma unroll.

Not too limited, I'd say.  I've seen a lot of developers willing to
mutilate their critical loops to accommodate the compiler.

-- 
Daniel Jacobowitz
CodeSourcery


* Re: LTO and the inlining of functions only called once.
  2009-10-13 12:05                 ` Paul Brook
  2009-10-13 13:30                   ` Daniel Jacobowitz
@ 2009-10-13 15:15                   ` Jeff Law
  2009-10-13 18:31                   ` Adam Nemet
  2 siblings, 0 replies; 33+ messages in thread
From: Jeff Law @ 2009-10-13 15:15 UTC (permalink / raw)
  To: Paul Brook
  Cc: gcc, Michael Matz, Richard Guenther, Jan Hubicka, Toon Moene,
	Jan Hubicka

On 10/13/09 03:41, Paul Brook wrote:
>> Nothing you've said changes the fact that there is a class of users
>> for whom that information is vital and we ought to spend some time
>> thinking about how to provide the information in a form they can
>> digest.  GCC dumps as they exist today are largely useless for a non-GCC
>> developer to use to understand why a particular transformation did or
>> did not occur in their code.  This has come up time and time again and
>> will continue to do so unless we find a way to provide visibility into
>> the optimizer's decision making.
>>      
> My guess is anyone inspecting the code and optimizer decisions at this level
> is also going to want to strongarm the result they want when the compiler
> makes the "wrong" decision. i.e. detailed unroller diagnostics are only of
> limited use without (say) #pragma unroll.
>    
Perhaps.  Of course in this customer's case they're looking at 20%+ 
hits, so a "wrong" decision is quite costly.  At least with the inliner 
they have a fair number of knobs they can turn once they know which 
heuristic is preventing inlining the key function(s) -- for other passes 
we don't have nearly as many knobs.
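
For the inliner, the "knobs" are mostly `--param name=value` settings. The following is only a sketch of how one might script trial builds with different values: the command line is the one used elsewhere in this thread, `max-inline-insns-auto` and `large-function-growth` are existing GCC parameters, but the values 400 and 200 are arbitrary placeholders, and nothing in this thread establishes which parameter governs which decision or whether the settings reach the link-time step in every GCC version. The helper name `with_inline_params` is made up for illustration.

```python
def with_inline_params(base_cmd, params):
    """Return a copy of base_cmd with one '--param name=value' pair per entry."""
    cmd = list(base_cmd)  # copy, so the caller's list is left untouched
    for name, value in params.items():
        cmd += ["--param", f"{name}={value}"]
    return cmd


# Command line as posted in this thread.
base = ["gfortran", "-o", "exe", "-O3", "-flto", "-fwhole-program",
        "-fdump-ipa-inline", "a.f", "lib.a"]

# Placeholder values, chosen only to show the syntax.
trial = with_inline_params(base, {"max-inline-insns-auto": 400,
                                  "large-function-growth": 200})

if __name__ == "__main__":
    print(" ".join(trial))
```

The resulting list can be handed to `subprocess.run` to perform the build, after which the inline dump can be compared between trials.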

jeff


* Re: LTO and the inlining of functions only called once.
  2009-10-13 12:05                 ` Paul Brook
  2009-10-13 13:30                   ` Daniel Jacobowitz
  2009-10-13 15:15                   ` Jeff Law
@ 2009-10-13 18:31                   ` Adam Nemet
  2 siblings, 0 replies; 33+ messages in thread
From: Adam Nemet @ 2009-10-13 18:31 UTC (permalink / raw)
  To: Paul Brook
  Cc: gcc, Jeff Law, Michael Matz, Richard Guenther, Jan Hubicka,
	Toon Moene, Jan Hubicka

Paul Brook <paul@codesourcery.com> writes:

>> Nothing you've said changes the fact that there is a class of users
>> for whom that information is vital and we ought to spend some time
>> thinking about how to provide the information in a form they can
>> digest.  GCC dumps as they exist today are largely useless for a non-GCC
>> developer to use to understand why a particular transformation did or
>> did not occur in their code.  This has come up time and time again and
>> will continue to do so unless we find a way to provide visibility into
>> the optimizer's decision making.
>
> My guess is anyone inspecting the code and optimizer decisions at this level 
> is also going to want to strongarm the result they want when the compiler 
> makes the "wrong" decision. i.e. detailed unroller diagnostics are only of 
> limited use without (say) #pragma unroll.

We would also increase the chances of getting more precise bug reports
rather than "my code is slower with GCC 4.5 than it was with GCC 4.4".
IOW, we could push some of the initial investigation work over to the
user :).

Also with VTA we will hopefully be in better shape referencing
source-level constructs.

Adam


* Re: LTO and the inlining of functions only called once.
  2009-10-10 16:40       ` Jeff Law
  2009-10-10 17:16         ` Richard Guenther
@ 2009-10-13 18:43         ` Toon Moene
  2009-10-14 10:05           ` Richard Guenther
  1 sibling, 1 reply; 33+ messages in thread
From: Toon Moene @ 2009-10-13 18:43 UTC (permalink / raw)
  To: Jeff Law; +Cc: Jan Hubicka, Richard Guenther, Jan Hubicka, gcc mailing list

Jeff Law wrote:

> On 10/10/09 09:17, Daniel Jacobowitz wrote:

>> On Sat, Oct 10, 2009 at 02:31:25PM +0200, Jan Hubicka wrote:
>>   
>>> My solution would be probably to pass -fdump-ipa-inline parameter to lto
>>> compilation and read the log.  It lists the inlining decisions and if
>>> something is not inlined, you get dump of reason why.

OK, I did just that (of course, because I'm only interested in inlining 
during Link-Time-Optimization, I only passed the compiler option to the 
link phase of the build).

Now where does the resulting dump end up - and how is it named ?

I.e., in case of:

gfortran -o exe -O3 -flto -fwhole-program -fdump-ipa-inline a.f lib.a

?

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html


* Re: LTO and the inlining of functions only called once.
  2009-10-13 18:43         ` Toon Moene
@ 2009-10-14 10:05           ` Richard Guenther
  2009-10-14 13:48             ` Toon Moene
  2009-10-14 14:04             ` Toon Moene
  0 siblings, 2 replies; 33+ messages in thread
From: Richard Guenther @ 2009-10-14 10:05 UTC (permalink / raw)
  To: Toon Moene; +Cc: Jeff Law, Jan Hubicka, Jan Hubicka, gcc mailing list

On Tue, Oct 13, 2009 at 8:31 PM, Toon Moene <toon@moene.org> wrote:
> Jeff Law wrote:
>
>> On 10/10/09 09:17, Daniel Jacobowitz wrote:
>
>>> On Sat, Oct 10, 2009 at 02:31:25PM +0200, Jan Hubicka wrote:
>>>
>>>>
>>>> My solution would be probably to pass -fdump-ipa-inline parameter to lto
>>>> compilation and read the log.  It lists the inlining decisions and if
>>>> something is not inlined, you get dump of reason why.
>
> OK, I did just that (of course, because I'm only interested in inlining
> during Link-Time-Optimization, I only passed the compiler option to the link
> phase of the build).
>
> Now where does the resulting dump end up - and how is it named ?
>
> I.e., in case of:
>
> gfortran -o exe -O3 -flto -fwhole-program -fdump-ipa-inline a.f lib.a
>
> ?

It'll be in /tmp and named after the first object file, in your case it will
be ccGGS24.o.047i.inline (because the first object file will be a
tempfile).  A minor inconvenience that maybe is going to be fixed.
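
Since the temporary file name changes on every run, one way to locate the dump is to pick the most recently modified `*.inline` file in the temporary directory. This is a minimal sketch under that assumption; the helper name `newest_inline_dump` is made up, and the pass number in the file name (047 above) should not be relied on, as it can differ between GCC versions.

```python
import glob
import os
import tempfile


def newest_inline_dump(directory=None):
    """Return the most recently modified *.inline dump in directory, or None."""
    directory = directory or tempfile.gettempdir()
    # Match any base name and pass number, e.g. ccGGS24.o.047i.inline.
    candidates = glob.glob(os.path.join(directory, "*.inline"))
    return max(candidates, key=os.path.getmtime) if candidates else None


if __name__ == "__main__":
    print(newest_inline_dump())
```

Run right after the link step, this prints the path of the dump just written, provided no other compilation wrote a newer one in between.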

Richard.

>
> --
> Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
> At home: http://moene.org/~toon/
> Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html
>


* Re: LTO and the inlining of functions only called once.
  2009-10-14 10:05           ` Richard Guenther
@ 2009-10-14 13:48             ` Toon Moene
  2009-10-14 14:04             ` Toon Moene
  1 sibling, 0 replies; 33+ messages in thread
From: Toon Moene @ 2009-10-14 13:48 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc mailing list

Richard Guenther wrote:

> On Tue, Oct 13, 2009 at 8:31 PM, Toon Moene <toon@moene.org> wrote:

>> gfortran -o exe -O3 -flto -fwhole-program -fdump-ipa-inline a.f lib.a
>>
>> ?
> 
> It'll be in /tmp and named after the first object file, in your case it will
> be ccGGS24.o.047i.inline (because the first object file will be a
> tempfile).  A minor inconvenience that maybe is going to be fixed.

Found it.  That surely is counter-intuitive, though ....

Thanks !

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html


* Re: LTO and the inlining of functions only called once.
  2009-10-14 10:05           ` Richard Guenther
  2009-10-14 13:48             ` Toon Moene
@ 2009-10-14 14:04             ` Toon Moene
  2009-10-14 14:43               ` Jan Hubicka
  1 sibling, 1 reply; 33+ messages in thread
From: Toon Moene @ 2009-10-14 14:04 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Jan Hubicka, Jan Hubicka, gcc mailing list

Richard Guenther wrote:

> It'll be in /tmp and named after the first object file, in your case it will
> be ccGGS24.o.047i.inline (because the first object file will be a
> tempfile).  A minor inconvenience that maybe is going to be fixed.

Now that Richard has pointed out to me where the info is, I can post it 
here.  These are the inlining decisions on my mini-example (just 5 
subroutines and a "main"):

gemini.f  hlprog.f  main.f  phcall.f  phtask.f  sl2tim.f

Reclaiming functions:
Deciding on inlining.  Starting with size 45477.

Inlining always_inline functions:

Deciding on smaller functions:
Considering inline candidate phcall_.clone.3.
Inlining failed: --param max-inline-insns-auto limit reached
Considering inline candidate phtask_.clone.2.
Inlining failed: --param max-inline-insns-auto limit reached
Considering inline candidate gemini_.clone.1.
Inlining failed: --param max-inline-insns-auto limit reached
Considering inline candidate sl2tim_.clone.0.
Inlining failed: --param max-inline-insns-auto limit reached
Considering inline candidate hlprog.
Inlining failed: --param max-inline-insns-auto limit reached

Deciding on functions called once:

Considering gemini_.clone.1 size 11443.
  Called once from hlprog 462 insns.
  Inlined into hlprog which now has 10728 size for a net change of -12620 size.

Considering hlprog size 10728.
  Called once from main 7 insns.
  Inline limit reached, not inlined.

Inlined 1 calls, eliminated 1 functions, size 45477 turned to 32857 size.

The mistake made here is that *all* the above functions are "called 
once", but only GEMINI is considered for some reason (probably simply 
because it's the first one ?).

Jan, if you're interested, I can send you the mini-example so that you 
can see for yourself.

HLPROG calls GEMINI, which calls SL2TIM, which calls PHCALL, which calls 
PHTASK (all "only-once calls").
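
A dump like the one above can be summarized mechanically. The sketch below assumes the dump format is exactly as posted here (it may differ in other GCC versions); the function name `parse_inline_dump` is made up for illustration, and the line that the mail client wrapped is joined back into one line in the sample.

```python
import re

DUMP = """\
Deciding on inlining.  Starting with size 45477.

Deciding on smaller functions:
Considering inline candidate phcall_.clone.3.
Inlining failed: --param max-inline-insns-auto limit reached
Considering inline candidate phtask_.clone.2.
Inlining failed: --param max-inline-insns-auto limit reached
Considering inline candidate gemini_.clone.1.
Inlining failed: --param max-inline-insns-auto limit reached
Considering inline candidate sl2tim_.clone.0.
Inlining failed: --param max-inline-insns-auto limit reached
Considering inline candidate hlprog.
Inlining failed: --param max-inline-insns-auto limit reached

Deciding on functions called once:

Considering gemini_.clone.1 size 11443.
  Called once from hlprog 462 insns.
  Inlined into hlprog which now has 10728 size for a net change of -12620 size.

Considering hlprog size 10728.
  Called once from main 7 insns.
  Inline limit reached, not inlined.

Inlined 1 calls, eliminated 1 functions, size 45477 turned to 32857 size.
"""


def parse_inline_dump(text):
    """Return (small, once): failure reason per candidate, outcome per called-once function."""
    small, once = {}, {}
    current = None  # (section, function name) of the entry being read
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Considering inline candidate (\S+)\.$", line)
        if m:
            current = ("small", m.group(1))
            continue
        m = re.match(r"Considering (\S+) size \d+\.$", line)
        if m:
            current = ("once", m.group(1))
            continue
        if current is None:
            continue
        kind, name = current
        if kind == "small" and line.startswith("Inlining failed: "):
            small[name] = line[len("Inlining failed: "):]
        elif kind == "once" and line.startswith(("Inlined into", "Inline limit reached")):
            once[name] = line
    return small, once


if __name__ == "__main__":
    small, once = parse_inline_dump(DUMP)
    for name, reason in small.items():
        print(f"{name}: {reason}")
    for name, outcome in once.items():
        print(f"{name}: {outcome}")
```

On this dump the summary shows five candidates rejected for the same `--param max-inline-insns-auto` reason, and only two functions reaching the "called once" stage. The reported totals are consistent with each other: 45477 - 12620 = 32857.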

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html


* Re: LTO and the inlining of functions only called once.
  2009-10-14 14:04             ` Toon Moene
@ 2009-10-14 14:43               ` Jan Hubicka
  2009-10-14 16:49                 ` Daniel Jacobowitz
  2009-10-14 18:20                 ` Toon Moene
  0 siblings, 2 replies; 33+ messages in thread
From: Jan Hubicka @ 2009-10-14 14:43 UTC (permalink / raw)
  To: Toon Moene; +Cc: Richard Guenther, Jan Hubicka, Jan Hubicka, gcc mailing list

> Richard Guenther wrote:
> 
> >It'll be in /tmp and named after the first object file, in your case it 
> >will
> >be ccGGS24.o.047i.inline (because the first object file will be a
> >tempfile).  A minor inconvenience that maybe is going to be fixed.
> 
> Now that Richard has pointed out to me where the info is, I can post it 
> here.  These are the inlining decisions on my mini-example (just 5 
> subroutines and a "main"):
> 
> gemini.f  hlprog.f  main.f  phcall.f  phtask.f  sl2tim.f
> 
> 
> Reclaiming functions:
> Deciding on inlining.  Starting with size 45477.
> 
> Inlining always_inline functions:
> 
> Deciding on smaller functions:
> Considering inline candidate phcall_.clone.3.
> Inlining failed: --param max-inline-insns-auto limit reached
> Considering inline candidate phtask_.clone.2.
> Inlining failed: --param max-inline-insns-auto limit reached
> Considering inline candidate gemini_.clone.1.
> Inlining failed: --param max-inline-insns-auto limit reached
> Considering inline candidate sl2tim_.clone.0.
> Inlining failed: --param max-inline-insns-auto limit reached
> Considering inline candidate hlprog.
> Inlining failed: --param max-inline-insns-auto limit reached
> 
> Deciding on functions called once:
> 
> Considering gemini_.clone.1 size 11443.
>  Called once from hlprog 462 insns.
>  Inlined into hlprog which now has 10728 size for a net change of 
> -12620 size.
> 
> Considering hlprog size 10728.
>  Called once from main 7 insns.
>  Inline limit reached, not inlined.
> 
> Inlined 1 calls, eliminated 1 functions, size 45477 turned to 32857 size.
> 
> 
> The mistake made here is that *all* the above functions are "called 
> once", but only GEMINI is considered for some reason (probably simply 
> because it's the first one ?).
> 
> Jan, if you're interested, I can send you the mini-example so that you 
> can see for yourself.

Yes, I would be interested.  It seems that for some reason the other
functions are not considered to be called once, perhaps a visibility
issue.  We also should say what limit was reached on inlining hlprog.

Honza
> 
> HLPROG calls GEMINI, which calls SL2TIM, which calls PHCALL, which calls 
> PHTASK (all "only-once calls").
> 
> -- 
> Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
> At home: http://moene.org/~toon/
> Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html


* Re: LTO and the inlining of functions only called once.
  2009-10-14 14:43               ` Jan Hubicka
@ 2009-10-14 16:49                 ` Daniel Jacobowitz
  2009-10-14 18:20                 ` Toon Moene
  1 sibling, 0 replies; 33+ messages in thread
From: Daniel Jacobowitz @ 2009-10-14 16:49 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Toon Moene, Richard Guenther, Jan Hubicka, gcc mailing list

On Wed, Oct 14, 2009 at 04:33:35PM +0200, Jan Hubicka wrote:
> > Deciding on smaller functions:
> > Considering inline candidate phcall_.clone.3.
> > Inlining failed: --param max-inline-insns-auto limit reached

> Yes, I would be interested.  It seems that for some reason the other
> functions are not considered to be called once, perhaps a visibility
> issue.  We also should say what limit was reached on inlining hlprog.

Maybe because of whatever did that cloning?

-- 
Daniel Jacobowitz
CodeSourcery


* Re: LTO and the inlining of functions only called once.
  2009-10-14 14:43               ` Jan Hubicka
  2009-10-14 16:49                 ` Daniel Jacobowitz
@ 2009-10-14 18:20                 ` Toon Moene
  1 sibling, 0 replies; 33+ messages in thread
From: Toon Moene @ 2009-10-14 18:20 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Guenther, Jan Hubicka, gcc mailing list

Jan Hubicka wrote:

> Yes, I would be interested.  It seems that for some reason the other
> functions are not considered to be called once, perhaps a visibility
> issue.  We also should say what limit was reached on inlining hlprog.

Sent off bzip2'd tar file.

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html


* Re: LTO and the inlining of functions only called once.
  2009-10-10 11:34 ` Richard Guenther
  2009-10-10 11:47   ` Toon Moene
  2009-10-10 13:40   ` Jan Hubicka
@ 2009-10-14 18:58   ` Paolo Bonzini
  2009-10-14 19:09     ` Paolo Bonzini
  2 siblings, 1 reply; 33+ messages in thread
From: Paolo Bonzini @ 2009-10-14 18:58 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Toon Moene, Jan Hubicka, gcc mailing list


> -Winline doesn't help here.  Scanning the assembler output does (obviously!).

nm also does.
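
The idea is that a function that was inlined and eliminated no longer shows up as a defined symbol in the executable. A minimal sketch of that check follows; the helper name `defined_symbols` and the sample text (symbol names and addresses) are invented for illustration and are not output from the program discussed in this thread.

```python
def defined_symbols(nm_output):
    """Names of symbols that nm reports as defined in the text (code) section."""
    names = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols print as "<address> <type> <name>";
        # undefined ones (type U) have no address column.
        if len(parts) == 3 and parts[1] in ("T", "t"):
            names.add(parts[2])
    return names


# Hypothetical nm output, for illustration only.
SAMPLE = """\
0000000000400a10 t kept_routine_
0000000000400650 T main
                 U _gfortran_st_write
"""

if __name__ == "__main__":
    syms = defined_symbols(SAMPLE)
    for name in ("kept_routine_", "inlined_routine_"):
        print(name, "still present" if name in syms else "gone (inlined or removed)")
```

To check a real binary, the text can be obtained with `subprocess.run(["nm", "exe"], capture_output=True, text=True).stdout`. Note that a missing symbol shows the function has no out-of-line copy, which is also the case when it was removed as unused.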

Paolo


* Re: LTO and the inlining of functions only called once.
  2009-10-14 18:58   ` Paolo Bonzini
@ 2009-10-14 19:09     ` Paolo Bonzini
  0 siblings, 0 replies; 33+ messages in thread
From: Paolo Bonzini @ 2009-10-14 19:09 UTC (permalink / raw)
  To: gcc; +Cc: Toon Moene, Jan Hubicka, gcc mailing list


> -Winline doesn't help here.  Scanning the assembler output does (obviously!).

nm also does.

Paolo


* Re: LTO and the inlining of functions only called once.
  2009-10-10 17:26         ` Joseph S. Myers
@ 2009-10-14 20:07           ` Paolo Bonzini
  0 siblings, 0 replies; 33+ messages in thread
From: Paolo Bonzini @ 2009-10-14 20:07 UTC (permalink / raw)
  To: Joseph S. Myers
  Cc: Diego Novillo, Jan Hubicka, Richard Guenther, Toon Moene,
	Jan Hubicka, gcc mailing list


> We should also keep in mind that such logs aimed at users should support
> i18n - unlike the existing dumps for compiler developers, which are quite
> properly English only, and most calls to internal_error which should only
> appear if there is a compiler bug and are also only meant to be useful for
> compiler developers (so represent useless work for translators at present
> - though it does seem possible some internal_error calls could actually
> appear with invalid input rather than compiler bugs and so should be
> normal errors).

We should first support i18n of C++ error messages, which is totally 
broken for languages that have more than one case or more than one 
article form (e.g. singular and plural of "the").

Paolo


Thread overview: 33+ messages
2009-10-10 11:00 LTO and the inlining of functions only called once Toon Moene
2009-10-10 11:34 ` Richard Guenther
2009-10-10 11:47   ` Toon Moene
2009-10-10 11:50     ` Richard Guenther
2009-10-10 12:18       ` Richard Guenther
2009-10-10 12:31         ` Toon Moene
2009-10-10 14:25           ` Toon Moene
2009-10-10 15:10             ` Richard Guenther
2009-10-10 13:40   ` Jan Hubicka
2009-10-10 15:27     ` Daniel Jacobowitz
2009-10-10 15:28       ` Jan Hubicka
2009-10-11 13:30         ` Toon Moene
2009-10-10 15:33       ` Diego Novillo
2009-10-10 17:26         ` Joseph S. Myers
2009-10-14 20:07           ` Paolo Bonzini
2009-10-10 16:40       ` Jeff Law
2009-10-10 17:16         ` Richard Guenther
2009-10-13  1:05           ` Jeff Law
2009-10-13  1:29             ` Michael Matz
2009-10-13  6:39               ` Jeff Law
2009-10-13 12:05                 ` Paul Brook
2009-10-13 13:30                   ` Daniel Jacobowitz
2009-10-13 15:15                   ` Jeff Law
2009-10-13 18:31                   ` Adam Nemet
2009-10-13 18:43         ` Toon Moene
2009-10-14 10:05           ` Richard Guenther
2009-10-14 13:48             ` Toon Moene
2009-10-14 14:04             ` Toon Moene
2009-10-14 14:43               ` Jan Hubicka
2009-10-14 16:49                 ` Daniel Jacobowitz
2009-10-14 18:20                 ` Toon Moene
2009-10-14 18:58   ` Paolo Bonzini
2009-10-14 19:09     ` Paolo Bonzini
