From: Qing Zhao <>
To: Jan Hubicka <>
Cc: "Richard Biener" <>,
	"Martin Liška" <>,
	"" <>
Subject: Re: [PATCH] IPA: support -flto + -flive-patching=inline-clone
Date: Fri, 7 Oct 2022 15:36:50 +0000

> On Oct 7, 2022, at 10:43 AM, Jan Hubicka <> wrote:
>>>> Probably not hard, and the IPA pass adjusting visibility could as well
>>>> mark the functions
>>>> as not to be inlined with -flive-patching=inline-only-static.
>>>>>> OTOH inline-only-static could disable WPA inlining and do all inlining early ...
>>>>> Inline-only-static ONLY inlines static functions, how can it disable WPA inlining? Don’t quite understand here.
>>>> it's a flag so it can be used to control other things
>>> GCC has two inliners:
>>> 1) the early inliner, which runs at compile time and is quite
>>> restricted, handling only obvious cases (always_inline, flatten and very small
>>> functions)
>>> 2) the IPA inliner, which runs at link time (WPA), uses a greedy
>>> algorithm and makes more complicated code size/speed tradeoffs
>>> Indeed, between 1 and 2 previously global functions may become static by
>>> resolution info (they currently won't with the kernel, since we do
>>> incremental linking).  We could easily keep track of originally static
>>> functions and promoted-to-static functions and make IPA inlining
>>> honor the patch.
> 	       ^^^^ I mean -flive-patching=inline-only-static flag
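For readers less familiar with the "obvious cases" listed for the early inliner, here is a minimal C sketch of the three kinds of functions named above. The function names are hypothetical and chosen only for illustration; `always_inline` and `flatten` are GCC function attributes, while whether the small static helper actually gets inlined early is up to the compiler and is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>

/* always_inline: GCC inlines this even when not optimizing. */
static inline __attribute__((always_inline)) int square(int x)
{
    return x * x;
}

/* A very small static function: a plausible early-inlining candidate
   when optimizing, but the compiler is free to decide otherwise. */
static int add_one(int x)
{
    return x + 1;
}

/* flatten: ask GCC to inline the calls made inside this function
   body where it can. */
__attribute__((flatten)) int sum_of_squares_plus_one(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += square(v[i]);
    return add_one(total);
}
```

Everything here lives in one translation unit, so none of it depends on link-time (WPA) inlining.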
>> Yes, this is similar to the Studio compiler (an early inliner and an IPA inliner).
>> But I still don’t quite understand why, during IPA inlining, extern functions need to be changed to static functions.
>> (In the Studio compiler it’s the opposite: static functions are all promoted to extern functions to enable inter-procedural inlining.)
>> Is there a file I can read to understand more details on this?
> In GCC, the WPA stage takes all compilation units and creates a new combined
> translation unit.  This has the same kind of symbol table as if it all came
> from a single source file.

Okay, I see now. (I should take some time to get more familiar with the GCC LTO framework, I guess :-)

>  So if the resolution file tells us that a given
> symbol is not used outside the LTO bytecode (when one links the final binary,
> the linker knows whether there was another use, and similarly for shared
> libraries if the symbol is hidden), we promote the symbol to static.

Okay, that’s reasonable. 
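The promotion rule described in the quoted paragraph can be sketched as a toy model in C. This is not GCC's implementation: the struct, the enum and the function name are hypothetical, and the `used_outside_lto` field merely stands in for whatever the resolution file reports about a symbol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy visibility states; real symbol tables carry far more detail. */
enum sym_visibility { SYM_EXTERNAL, SYM_STATIC };

struct toy_symbol {
    const char *name;
    enum sym_visibility vis;
    bool used_outside_lto;  /* hypothetical stand-in for resolution info */
};

/* Promote every external symbol that nothing outside the LTO bytecode
   uses to static. Returns the number of symbols changed. */
size_t promote_unused_to_static(struct toy_symbol *syms, size_t n)
{
    assert(syms != NULL || n == 0);
    size_t promoted = 0;
    for (size_t i = 0; i < n; i++) {
        if (syms[i].vis == SYM_EXTERNAL && !syms[i].used_outside_lto) {
            syms[i].vis = SYM_STATIC;
            promoted++;
        }
    }
    return promoted;
}
```

In this model a symbol that is still referenced from outside the LTO bytecode keeps its external visibility, and symbols that were static in the source are left alone.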

>  This lets us optimize
> it better: change calling conventions, remove the offline copy if all calls
> were inlined, propagate various information about it.
> Since from this moment on the whole translation unit behaves as a single
> source file, it is not a problem to inline static functions cross-module.
Yes, makes sense.

> Later we partition the combined unit to do the rest of the compilation in
> parallel, and at this stage some static symbols can be changed to hidden
> symbols if they are used by multiple partitions.
So, original “extern” functions will be kept as “static” if they are not used by multiple partitions?
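As a reading aid, the partitioning step as described in the quoted text can be modeled roughly like this in C. It is only a sketch of the stated rule (a static symbol used by more than one partition becomes hidden), not a description of what GCC actually does internally; all names and the fixed partition limit are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PARTITIONS 8  /* arbitrary limit for this sketch */

enum part_linkage { PART_STATIC, PART_HIDDEN };

struct part_symbol {
    const char *name;
    enum part_linkage linkage;
    bool used_in[TOY_MAX_PARTITIONS];  /* used_in[p]: partition p references it */
};

/* Count how many of the first nparts partitions reference the symbol. */
static size_t partitions_using(const struct part_symbol *s, size_t nparts)
{
    size_t count = 0;
    for (size_t p = 0; p < nparts; p++)
        if (s->used_in[p])
            count++;
    return count;
}

/* Turn static symbols referenced from more than one partition into
   hidden symbols. Returns the number of symbols changed. */
size_t make_shared_symbols_hidden(struct part_symbol *syms, size_t nsyms,
                                  size_t nparts)
{
    assert(nparts <= TOY_MAX_PARTITIONS);
    size_t changed = 0;
    for (size_t i = 0; i < nsyms; i++) {
        if (syms[i].linkage == PART_STATIC &&
            partitions_using(&syms[i], nparts) > 1) {
            syms[i].linkage = PART_HIDDEN;
            changed++;
        }
    }
    return changed;
}
```

Under this model, a symbol referenced from a single partition stays static, which is the reading the question above asks about.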

>>> I however wonder how much LTO optimization would remain. If we disable
>>> all inter-module inlining
>> Oh, wait, so demoting “extern” functions to “static” functions in GCC’s IPA inlining is to disable inter-module inlining?
>> Why? Is there any technical issue with inter-module inlining in GCC?
> No, I suppose it is just a different organization of LTO from the Studio
> compiler.  Perhaps the Studio compiler still combines according to the original
> source-level compilation units, and then cross-module inlining of
> function A may cause function B to be promoted to public in the case where
> originally both A and B were in the same compilation unit and B was static?
I think the biggest difference between IPO (inter-procedural optimization) in the Studio compiler
and LTO in GCC is that GCC merges all IR files together into one single IR file, while Studio still
keeps multiple IR files. So, for GCC, LTO inlining is similar to in-module inlining, and
external functions are better demoted to static functions to enable better optimizations.

For the Studio compiler, IPO analysis and optimization are still based on multiple IRs. So in order to
enable cross-file inlining between different modules, static functions need to be promoted to
external functions, for exactly the reason you mentioned above. (Though the Studio compiler could
do better and promote only some of the static functions to external, on demand.)

>>> and with live patching we also disable most of
>>> other optimization,
>> Yes, with live-patching, most of the IPA optimizations need to be disabled. But this functionality is needed, right? When users request live-patching support,
>> they should know that most IPA optimizations will be disabled.
> The question is how useful it is to build with -flto then.  I would say that
> close to 90% of the performance benefits of LTO originate from cross-module
> inlining.
> Most of the code size benefits are due to
> - removing unreachable code,
> - inlining functions called once (with whole program knowledge),
> - removing offline function bodies if all calls have been inlined, and
> - identical code folding


> Other IPA optimizations we implement (function flags discovery, mod-ref
> etc.) account for a smaller portion of performance & size.
Yes, this was also the case for the Studio compiler.
> If you block cross-module inlining and all kinds of inter-procedural
> optimizations you lose, I believe, all of the goodies above except for
> removal of unreachable code.
> While this is important for some C++ code with a lot of template
> instantiations, I am not sure how many completely dead functions the kernel has.
> So what use cases do you expect for -flto -flive-patching=inline-only-static?

Okay, so your major concern about the combination “-flto -flive-patching=inline-only-static” is that it is not very useful.
If so, yes, I agree with this.

My main point was that, if it is technically simple and doable, it’s better to support it for feature completeness.
It’s the user’s choice to use it or not. (We can provide more details in the documentation to warn the user about the
performance impact.)  On the other hand, if it’s too complicated to implement, I agree that not doing it is better.

Thanks a lot for your patience and detailed explanation.


> Honza
>> Qing
>>> I think basically only unreachable code removal will
>>> remain and possibly some propagation of "coldness" across the code.
>>> I can implement this incrementally.
>>> Martin, if live patching is happy about some symbols being promoted
>>> to static, the patch is OK.
>>> Honza
