public inbox for gcc-help@gcc.gnu.org
* GCC 10 LTO documentation
@ 2021-06-21 13:41 Chris S
  2021-06-22 13:02 ` Jonathan Wakely
  2021-06-23 10:04 ` Kewen.Lin
  0 siblings, 2 replies; 9+ messages in thread
From: Chris S @ 2021-06-21 13:41 UTC (permalink / raw)
  To: gcc-help

Are the capabilities and/or limitations of GCC10 LTO documented anywhere?
I understand it only at a high level but not with details, and am having
trouble finding any current information that describes it very clearly.
Some questions came up recently that I'd like to be able to answer, but it
boils down to this:

Is it possible to build static libraries that have LTO optimizations
applied to the object code they contain (that is, all the code in that
library is optimized together with LTO), but when built together into a
final binary, no additional LTO is performed?  We have several large,
static libraries that are mostly unrelated, and are looking for ways to
reduce a massive increase in build times after moving to g++10, where
almost 80% of the time is spent in LTO.  Having optimized individual
libraries but not a global optimized binary might be a reasonable
tradeoff.  Is this possible?

I don't have a good mental model of when "extra information" (GIMPLE) is
merely included in the code for later use, and when that GIMPLE information
is actually used to perform LTO optimizations.  (My suspicion is that it's
only when building the final binary.)  However, if we can build static
libraries that are already optimized within themselves, a hint about which
command-line options to use would also be much appreciated.

Thanks.
Chris


* Re: GCC 10 LTO documentation
  2021-06-21 13:41 GCC 10 LTO documentation Chris S
@ 2021-06-22 13:02 ` Jonathan Wakely
  2021-06-22 15:05   ` Chris S
  2021-06-22 15:14   ` Xi Ruoyao
  2021-06-23 10:04 ` Kewen.Lin
  1 sibling, 2 replies; 9+ messages in thread
From: Jonathan Wakely @ 2021-06-22 13:02 UTC (permalink / raw)
  To: Chris S; +Cc: gcc-help

On Mon, 21 Jun 2021 at 14:42, Chris S wrote:
> Is it possible to build static libraries that have LTO optimizations
> applied to the object code they contain (that is, all the code in that
> library is optimized together with LTO), but when built together into a
> final binary, no additional LTO is performed?  We have several large,
> static libraries that are mostly unrelated, and are looking for ways to
> reduce a massive increase in build times after moving to g++10, where
> almost 80% of the time is spent in LTO.  Having optimized individual
> libraries but not a global optimized binary might be a reasonable
> tradeoff.  Is this possible?

A static library is just an archive file containing object files. No
LTO is done "between" those files, they're just added to the archive.
That's because creating a static library is not "linking".

If you do not use -flto when doing the final link, you might as well
not use -flto when compiling the objects that go into your static
library, because otherwise you're adding all the extra LTO information
to the objects and then ignoring it when linking.

If you use dynamic libraries, they can be LTO-optimized internally,
and you get partial benefits of LTO even if you don't use LTO when
linking the final binary.


* Re: GCC 10 LTO documentation
  2021-06-22 13:02 ` Jonathan Wakely
@ 2021-06-22 15:05   ` Chris S
  2021-06-22 15:14   ` Xi Ruoyao
  1 sibling, 0 replies; 9+ messages in thread
From: Chris S @ 2021-06-22 15:05 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: gcc-help

Thank you.  As a follow-up: without turning it off completely, is there any
way to limit the scope of LTO so that when it brings those static
libraries together, they are not fully mixed?  That is, building the
executable could do less work by considering only certain groups of files
instead of trying to "LTO All The Things".  After upgrading from g++7 to
g++10, the LTO step for one application (the time spent after compiling)
went from 20 minutes to 70 minutes with no change in code, and we're
looking for ways to reduce this while keeping a statically linked
executable.  In our case, each static library can be thought of as a
closely knit set of files that does not need full LTO with all the
code from the other libraries.

Thanks,
Chris


* Re: GCC 10 LTO documentation
  2021-06-22 13:02 ` Jonathan Wakely
  2021-06-22 15:05   ` Chris S
@ 2021-06-22 15:14   ` Xi Ruoyao
  2021-06-22 15:52     ` Jonathan Wakely
  1 sibling, 1 reply; 9+ messages in thread
From: Xi Ruoyao @ 2021-06-22 15:14 UTC (permalink / raw)
  To: Jonathan Wakely, Chris S; +Cc: gcc-help

On Tue, 2021-06-22 at 14:02 +0100, Jonathan Wakely via Gcc-help wrote:
> On Mon, 21 Jun 2021 at 14:42, Chris S wrote:
> > Is it possible to build static libraries that have LTO optimizations
> > applied to the object code they contain (that is, all the code in
> > that
> > library is optimized together with LTO), but when built together
> > into a
> > final binary, no additional LTO is performed?  We have several
> > large,
> > static libraries that are mostly unrelated, and are looking for ways
> > to
> > reduce a massive increase in build times after moving to g++10,
> > where
> > almost 80% of the time is spent in LTO.  Having optimized individual
> > libraries but not a global optimized binary might be a reasonable
> > tradeoff.  Is this possible?

I tried

gcc -Wl,--whole-archive lib.a -Wl,-r -nostdlib -flinker-output=nolto-rel -o lib.o

which seems to work for a very simple testcase.  But I'm not sure if it's
really correct.

> A static library is just an archive file containing object files. No
> LTO is done "between" those files, they're just added to the archive.
> That's because creating a static library is not "linking".
> 
> If you do not use -flto when doing the final link, you might as well
> not use -flto when compiling the objects that go into your static
> library, because otherwise you're adding all the extra LTO information
> to the objects and then ignoring it when linking.

It's not "extra" information.  Object files created with -flto contain
only GIMPLE, which can only be linked with LTO enabled; the "normal"
object code is not there.  For example, if lib.a contains several object
files compiled with -flto:

cc main.c lib.a -flto      # main.c is compiled with LTO, and LTO will
                           # run for main.o and object files in lib.a

cc main.c lib.a            # main.c is not compiled with LTO, but LTO
                           # will still run for object files in lib.a

cc main.c lib.a -fno-lto   # error, linker will say
                           # "plugin needed to handle lto object"

(Unless -ffat-lto-objects is used.)
-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University



* Re: GCC 10 LTO documentation
  2021-06-22 15:14   ` Xi Ruoyao
@ 2021-06-22 15:52     ` Jonathan Wakely
  2021-06-22 16:08       ` Xi Ruoyao
  0 siblings, 1 reply; 9+ messages in thread
From: Jonathan Wakely @ 2021-06-22 15:52 UTC (permalink / raw)
  To: Xi Ruoyao; +Cc: Chris S, gcc-help

On Tue, 22 Jun 2021 at 16:14, Xi Ruoyao <xry111@mengyan1223.wang> wrote:
>
> On Tue, 2021-06-22 at 14:02 +0100, Jonathan Wakely via Gcc-help wrote:
> > On Mon, 21 Jun 2021 at 14:42, Chris S wrote:
> > > Is it possible to build static libraries that have LTO optimizations
> > > applied to the object code they contain (that is, all the code in
> > > that
> > > library is optimized together with LTO), but when built together
> > > into a
> > > final binary, no additional LTO is performed?  We have several
> > > large,
> > > static libraries that are mostly unrelated, and are looking for ways
> > > to
> > > reduce a massive increase in build times after moving to g++10,
> > > where
> > > almost 80% of the time is spent in LTO.  Having optimized individual
> > > libraries but not a global optimized binary might be a reasonable
> > > tradeoff.  Is this possible?
>
> I tried
>
> gcc -Wl,--whole-archive lib.a -Wl,-r -nostdlib -flinker-output=nolto-rel -o lib.o
>
> which seems to work for a very simple testcase.  But I'm not sure if it's
> really correct.
>
> > A static library is just an archive file containing object files. No
> > LTO is done "between" those files, they're just added to the archive.
> > That's because creating a static library is not "linking".
> >
> > If you do not use -flto when doing the final link, you might as well
> > not use -flto when compiling the objects that go into your static
> > library, because otherwise you're adding all the extra LTO information
> > to the objects and then ignoring it when linking.
>
> It's not "extra" information.  Object files created with -flto only
> contains GIMPLE which can only be linked with LTO enabled, the "normal"
> object code is not there.  For example, if lib.a contains several object
> files compiled with -flto:
>
> cc main.c lib.a -flto      # main.c is compiled with LTO, and LTO will
>                            # run for main.o and object files in lib.a
>
> cc main.c lib.a            # main.c is not compiled with LTO, but LTO
>                            # will still run for object files in lib.a
>
> cc main.c lib.a -fno-lto   # error, linker will say
>                            # "plugin needed to handle lto object"
>
> (Unless -ffat-lto-objects is used.)

Thanks for the correction. So then maybe that's what Chris wants: the
objects in the static libs can be LTO'd but the objects in the main
executable won't be. And if it's still too slow, only enable LTO for
some static libs, where it gives significant performance gain.


* Re: GCC 10 LTO documentation
  2021-06-22 15:52     ` Jonathan Wakely
@ 2021-06-22 16:08       ` Xi Ruoyao
  2021-06-22 17:02         ` Jonathan Wakely
  0 siblings, 1 reply; 9+ messages in thread
From: Xi Ruoyao @ 2021-06-22 16:08 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Chris S, gcc-help

On Tue, 2021-06-22 at 16:52 +0100, Jonathan Wakely wrote:
> On Tue, 22 Jun 2021 at 16:14, Xi Ruoyao <xry111@mengyan1223.wang>
> wrote:
> > 
> > On Tue, 2021-06-22 at 14:02 +0100, Jonathan Wakely via Gcc-help
> > wrote:
> > > On Mon, 21 Jun 2021 at 14:42, Chris S wrote:
> > > > Is it possible to build static libraries that have LTO
> > > > optimizations
> > > > applied to the object code they contain (that is, all the code
> > > > in
> > > > that
> > > > library is optimized together with LTO), but when built together
> > > > into a
> > > > final binary, no additional LTO is performed?  We have several
> > > > large,
> > > > static libraries that are mostly unrelated, and are looking for
> > > > ways
> > > > to
> > > > reduce a massive increase in build times after moving to g++10,
> > > > where
> > > > almost 80% of the time is spent in LTO.  Having optimized
> > > > individual
> > > > libraries but not a global optimized binary might be a
> > > > reasonable
> > > > tradeoff.  Is this possible?
> > 
> > I tried
> >
> > gcc -Wl,--whole-archive lib.a -Wl,-r -nostdlib -flinker-output=nolto-rel -o lib.o
> >
> > which seems to work for a very simple testcase.  But I'm not sure if it's
> > really correct.
> > 
> > > A static library is just an archive file containing object files.
> > > No
> > > LTO is done "between" those files, they're just added to the
> > > archive.
> > > That's because creating a static library is not "linking".
> > > 
> > > If you do not use -flto when doing the final link, you might as
> > > well
> > > not use -flto when compiling the objects that go into your static
> > > library, because otherwise you're adding all the extra LTO
> > > information
> > > to the objects and then ignoring it when linking.
> > 
> > It's not "extra" information.  Object files created with -flto only
> > contains GIMPLE which can only be linked with LTO enabled, the
> > "normal"
> > object code is not there.  For example, if lib.a contains several
> > object
> > files compiled with -flto:
> > 
> > cc main.c lib.a -flto      # main.c is compiled with LTO, and LTO will
> >                            # run for main.o and object files in lib.a
> >
> > cc main.c lib.a            # main.c is not compiled with LTO, but LTO
> >                            # will still run for object files in lib.a
> >
> > cc main.c lib.a -fno-lto   # error, linker will say
> >                            # "plugin needed to handle lto object"
> > 
> > (Unless -ffat-lto-objects is used.)
> 
> Thanks for the correction. So then maybe that's what Chris wants: the
> objects in the static libs can be LTO'd but the objects in the main
> executable won't be. And if it's still too slow, only enable LTO for
> some static libs, where it gives significant performance gain.

I think he means:

cc main.c lib1.a lib2.a lib3.a lib4.a

lib[1-4].a are all built with -flto.  In this case, unfortunately, gcc
will still run one whole LTO pass over all object files in lib[1-4].a.  But
Chris wants 4 LTO passes, one per static library, abandoning the
optimization opportunities that cross library boundaries.
-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University



* Re: GCC 10 LTO documentation
  2021-06-22 16:08       ` Xi Ruoyao
@ 2021-06-22 17:02         ` Jonathan Wakely
  0 siblings, 0 replies; 9+ messages in thread
From: Jonathan Wakely @ 2021-06-22 17:02 UTC (permalink / raw)
  To: Xi Ruoyao; +Cc: Chris S, gcc-help

On Tue, 22 Jun 2021 at 17:08, Xi Ruoyao <xry111@mengyan1223.wang> wrote:
>
> On Tue, 2021-06-22 at 16:52 +0100, Jonathan Wakely wrote:
> > On Tue, 22 Jun 2021 at 16:14, Xi Ruoyao <xry111@mengyan1223.wang>
> > wrote:
> > >
> > > On Tue, 2021-06-22 at 14:02 +0100, Jonathan Wakely via Gcc-help
> > > wrote:
> > > > On Mon, 21 Jun 2021 at 14:42, Chris S wrote:
> > > > > Is it possible to build static libraries that have LTO
> > > > > optimizations
> > > > > applied to the object code they contain (that is, all the code
> > > > > in
> > > > > that
> > > > > library is optimized together with LTO), but when built together
> > > > > into a
> > > > > final binary, no additional LTO is performed?  We have several
> > > > > large,
> > > > > static libraries that are mostly unrelated, and are looking for
> > > > > ways
> > > > > to
> > > > > reduce a massive increase in build times after moving to g++10,
> > > > > where
> > > > > almost 80% of the time is spent in LTO.  Having optimized
> > > > > individual
> > > > > libraries but not a global optimized binary might be a
> > > > > reasonable
> > > > > tradeoff.  Is this possible?
> > >
> > > I tried
> > >
> > > gcc -Wl,--whole-archive lib.a -Wl,-r -nostdlib -flinker-output=nolto-rel -o lib.o
> > >
> > > which seems to work for a very simple testcase.  But I'm not sure if it's
> > > really correct.
> > >
> > > > A static library is just an archive file containing object files.
> > > > No
> > > > LTO is done "between" those files, they're just added to the
> > > > archive.
> > > > That's because creating a static library is not "linking".
> > > >
> > > > If you do not use -flto when doing the final link, you might as
> > > > well
> > > > not use -flto when compiling the objects that go into your static
> > > > library, because otherwise you're adding all the extra LTO
> > > > information
> > > > to the objects and then ignoring it when linking.
> > >
> > > It's not "extra" information.  Object files created with -flto only
> > > contains GIMPLE which can only be linked with LTO enabled, the
> > > "normal"
> > > object code is not there.  For example, if lib.a contains several
> > > object
> > > files compiled with -flto:
> > >
> > > cc main.c lib.a -flto      # main.c is compiled with LTO, and LTO will
> > >                            # run for main.o and object files in lib.a
> > >
> > > cc main.c lib.a            # main.c is not compiled with LTO, but LTO
> > >                            # will still run for object files in lib.a
> > >
> > > cc main.c lib.a -fno-lto   # error, linker will say
> > >                            # "plugin needed to handle lto object"
> > >
> > > (Unless -ffat-lto-objects is used.)
> >
> > Thanks for the correction. So then maybe that's what Chris wants: the
> > objects in the static libs can be LTO'd but the objects in the main
> > executable won't be. And if it's still too slow, only enable LTO for
> > some static libs, where it gives significant performance gain.
>
> I think he means:
>
> cc main.c lib1.a lib2.a lib3.a lib4.a
>
> lib[1-4].a are all built with -flto.  In this case, unfortunately, gcc
> will still run one whole LTO pass over all object files in lib[1-4].a.  But
> Chris wants 4 LTO passes, one per static library, abandoning the
> optimization opportunities that cross library boundaries.

Yes, but depending on the form of the libraries' APIs, there may not
*be* much opportunity for optimizing across them, and so the final
link wouldn't spend much time trying to do it. Maybe.


* Re: GCC 10 LTO documentation
  2021-06-21 13:41 GCC 10 LTO documentation Chris S
  2021-06-22 13:02 ` Jonathan Wakely
@ 2021-06-23 10:04 ` Kewen.Lin
  2021-07-03 18:39   ` Chris S
  1 sibling, 1 reply; 9+ messages in thread
From: Kewen.Lin @ 2021-06-23 10:04 UTC (permalink / raw)
  To: Chris S; +Cc: gcc-help

Hi,

on 2021/6/21 9:41 PM, Chris S via Gcc-help wrote:
> Are the capabilities and/or limitations of GCC10 LTO documented anywhere?
> I understand it only at a high level but not with details, and am having
> trouble finding any current information that describes it very clearly.
> Some questions came up recently that I'd like to be able to answer, but it
> boils down to this:
> 
> Is it possible to build static libraries that have LTO optimizations
> applied to the object code they contain (that is, all the code in that
> library is optimized together with LTO), but when built together into a
> final binary, no additional LTO is performed?  We have several large,
> static libraries that are mostly unrelated, and are looking for ways to
> reduce a massive increase in build times after moving to g++10, where
> almost 80% of the time is spent in LTO.  

If compile time is the concern, it may be worth trying a parallel LTO
build, such as -flto=auto or -flto=n (where n is the number of jobs).

https://gcc.gnu.org/onlinedocs/gcc-10.3.0/gcc/Optimize-Options.html#Optimize-Options

BR,
Kewen

> Having optimized individual
> libraries but not a global optimized binary might be a reasonable
> tradeoff.  Is this possible?
> 
> I don't have a good mental model of when "extra information" (GIMPLE) is
> merely included in the code for later use, and when that GIMPLE information
> is actually used to perform LTO optimizations.  (My suspicion is that it's
> only when building the final binary.)  However, if we can build static
> libraries that are already optimized within themselves, a  hint of what
> command line options to use would also be very appreciated.
> 
> Thanks.
> Chris
> 



* Re: GCC 10 LTO documentation
  2021-06-23 10:04 ` Kewen.Lin
@ 2021-07-03 18:39   ` Chris S
  0 siblings, 0 replies; 9+ messages in thread
From: Chris S @ 2021-07-03 18:39 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: gcc-help

Many thanks! This was extremely helpful, and reduced link time for that
binary by an hour. :)

Chris

