public inbox for gcc@gcc.gnu.org
* LTO vs GCC 8
@ 2018-05-10 21:32 Freddie Chopin
  2018-05-11  9:19 ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Freddie Chopin @ 2018-05-10 21:32 UTC (permalink / raw)
  To: gcc

Hi!

In one of my embedded projects I have an option to enable LTO. This was
working more or less fine for GCC 6 and GCC 7, however for GCC 8.1.0
(and binutils 2.30) - with the same set of options - I see something
like this

-- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 --

$ arm-none-eabi-g++ -Wall -Wextra -Wshadow -std=gnu++11 -mcpu=cortex-m4 \
    -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16 -g -ggdb3 -O2 -flto \
    -ffat-lto-objects -fno-use-cxa-atexit -ffunction-sections \
    -fdata-sections -fno-rtti -fno-exceptions ... [include paths] ... \
    -MD -MP -c test/TestCase.cpp -o output/test/TestCase.o

$ arm-none-eabi-g++ -mcpu=cortex-m4 -mthumb -mfloat-abi=hard \
    -mfpu=fpv4-sp-d16 -g -O2 -flto -fuse-linker-plugin \
    -Wl,-Map=output/test/distortosTest.map,--cref,--gc-sections \
    -Toutput/ST_STM32F4DISCOVERY.preprocessed.ld ... [a lot of objects] ... \
    -Wl,--whole-archive -l:output/libdistortos.a -Wl,--no-whole-archive \
    -o output/test/distortosTest.elf

$ arm-none-eabi-objdump --demangle -S output/test/distortosTest.elf > \
    output/test/distortosTest.lss
arm-none-eabi-objdump: Dwarf Error: Could not find abbrev number 167.
arm-none-eabi-objdump: Dwarf Error: found dwarf version '37', this
reader only handles version 2, 3, 4 and 5 information.
arm-none-eabi-objdump: Dwarf Error: found dwarf version '6144', this
reader only handles version 2, 3, 4 and 5 information.
arm-none-eabi-objdump: Dwarf Error: found dwarf version '4864', this
reader only handles version 2, 3, 4 and 5 information.
...
... (a lot more)
...

-- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 --

As you can see, the errors appear only when I try to generate an
assembly dump. I'm not sure whether the problem is in GCC or in objdump,
but when I have an .elf file produced (with the same options) by GCC
7.3.0, this new version of objdump doesn't produce any errors. What is
also interesting is that the errors are not fatal - the exit code of
the process is 0.

Another interesting point is that this problem doesn't appear in a
trivial test case, so I suspect this is something more subtle. I did
not try to narrow it down into a shareable test case, but if you have
no hints then maybe I'll try to do that.

Any ideas what may be the problem here? In particular, do you know
whether I should be asking this question here or maybe on the binutils
mailing list?

Thanks in advance!

Regards,
FCh

* Re: LTO vs GCC 8
  2018-05-10 21:32 LTO vs GCC 8 Freddie Chopin
@ 2018-05-11  9:19 ` Richard Biener
  2018-05-11 11:06   ` David Brown
  2018-05-11 15:33   ` Freddie Chopin
  0 siblings, 2 replies; 12+ messages in thread
From: Richard Biener @ 2018-05-11  9:19 UTC (permalink / raw)
  To: Freddie Chopin; +Cc: GCC Development

On Thu, May 10, 2018 at 11:32 PM, Freddie Chopin <freddie_chopin@op.pl> wrote:
> Hi!
>
> In one of my embedded projects I have an option to enable LTO. This was
> working more or less fine for GCC 6 and GCC 7, however for GCC 8.1.0
> (and binutils 2.30) - with the same set of options - I see something
> like this
>
> -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 --
>
> $ arm-none-eabi-g++ -Wall -Wextra -Wshadow -std=gnu++11 -mcpu=cortex-m4
> -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16 -g -ggdb3 -O2 -flto -ffat-
> lto-objects -fno-use-cxa-atexit -ffunction-sections -fdata-sections
> -fno-rtti -fno-exceptions ... [include paths] ... -MD -MP -c
> test/TestCase.cpp -o output/test/TestCase.o
>
> $ arm-none-eabi-g++  -mcpu=cortex-m4 -mthumb -mfloat-abi=hard
> -mfpu=fpv4-sp-d16 -g -O2 -flto -fuse-linker-plugin -Wl,-
> Map=output/test/distortosTest.map,--cref,--gc-sections
> -Toutput/ST_STM32F4DISCOVERY.preprocessed.ld ... [a lot of objects] ...
> -Wl,--whole-archive -l:output/libdistortos.a -Wl,--no-whole-archive -o
> output/test/distortosTest.elf
>
> $ arm-none-eabi-objdump --demangle -S output/test/distortosTest.elf >
> output/test/distortosTest.lss
> arm-none-eabi-objdump: Dwarf Error: Could not find abbrev number 167.
> arm-none-eabi-objdump: Dwarf Error: found dwarf version '37', this
> reader only handles version 2, 3, 4 and 5 information.
> arm-none-eabi-objdump: Dwarf Error: found dwarf version '6144', this
> reader only handles version 2, 3, 4 and 5 information.
> arm-none-eabi-objdump: Dwarf Error: found dwarf version '4864', this
> reader only handles version 2, 3, 4 and 5 information.
> ...
> ... (a lot more)
> ...
>
> -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 --
>
> As you see, the errors appear only when I try to generate an assembly
> dump. I'm not sure whether the problem is in GCC or in objdump, but
> when I have an .elf file produced (with the same options) by gcc 7.3.0,
> then this new version of objdump doesn't produce any errors. What is
> also interesting is that the errors are not fatal - the exit code of
> the process is 0.
>
> What is also interesting is that this problem doesn't appear in a
> trivial test case, so I suspect this is something more subtle. I did
> not try to narrow it down into a shareable test case, but if you have
> no hints then maybe I'll try to do that.
>
> Any ideas what may be the problem here? Especially do you know whether
> I should be asking this question here or maybe on binutils mailing
> list?

Hmm, can you try without --gc-sections?  "Old" GNU ld versions have
a bug that wrecks debug info (sourceware PR20882).

Richard.

> Thanks in advance!
>
> Regards,
> FCh

* Re: LTO vs GCC 8
  2018-05-11  9:19 ` Richard Biener
@ 2018-05-11 11:06   ` David Brown
  2018-05-11 15:50     ` Freddie Chopin
  2018-05-11 15:33   ` Freddie Chopin
  1 sibling, 1 reply; 12+ messages in thread
From: David Brown @ 2018-05-11 11:06 UTC (permalink / raw)
  To: Richard Biener, Freddie Chopin; +Cc: GCC Development

On 11/05/18 11:19, Richard Biener wrote:
> On Thu, May 10, 2018 at 11:32 PM, Freddie Chopin <freddie_chopin@op.pl> wrote:
>> Hi!
>>
>> In one of my embedded projects I have an option to enable LTO. This was
>> working more or less fine for GCC 6 and GCC 7, however for GCC 8.1.0
>> (and binutils 2.30) - with the same set of options - I see something
>> like this
>>
>> -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 -- >8 --
>>
>> $ arm-none-eabi-g++ -Wall -Wextra -Wshadow -std=gnu++11 -mcpu=cortex-m4
>> -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16 -g -ggdb3 -O2 -flto -ffat-
>> lto-objects -fno-use-cxa-atexit -ffunction-sections -fdata-sections
>> -fno-rtti -fno-exceptions ... [include paths] ... -MD -MP -c
>> test/TestCase.cpp -o output/test/TestCase.o
>>
<snip>
> Hmm, can you try without --gc-sections?  "Old" GNU ld versions have
> a bug that wrecks debug info (sourceware PR20882).
> 

For the Cortex-M devices (and probably many other RISC targets),
-fdata-sections comes at a big cost - it effectively blocks
-fsection-anchors and makes access to file-static data a lot bigger.
People often use -fdata-sections and -ffunction-sections along with
-Wl,--gc-sections with the aim of removing unused code and data (and
thus saving space, useful on small devices) - I would expect LTO would
manage that anyway.  The other purpose of these is to improve locality
of reference - again LTO should do that for you.  But even without LTO,
I find the cost of -fdata-sections high compared to -fsection-anchors.


* Re: LTO vs GCC 8
  2018-05-11  9:19 ` Richard Biener
  2018-05-11 11:06   ` David Brown
@ 2018-05-11 15:33   ` Freddie Chopin
  1 sibling, 0 replies; 12+ messages in thread
From: Freddie Chopin @ 2018-05-11 15:33 UTC (permalink / raw)
  To: gcc

On Fri, 2018-05-11 at 11:19 +0200, Richard Biener wrote:
> Hmm, can you try without --gc-sections?  "Old" GNU ld versions have
> a bug that wrecks debug info (sourceware PR20882).

Yes - you are right. Without --gc-sections the errors are gone. The bug
was marked as resolved and fixed a year ago, however from the comments
I presume that it was only a partial fix, so possibly 2.31 will
work fine for arm-none-eabi, right?

What is also interesting is that there was no problem for gcc 7.3 with
binutils 2.29.1 and for gcc 6.3 with binutils 2.28 - only 8.1 + 2.30
behave like this.

Is there a workaround for the problem? Maybe I could mark some sections
as KEEP() in the linker script while waiting for binutils 2.31?

Regards,
FCh

* Re: LTO vs GCC 8
  2018-05-11 11:06   ` David Brown
@ 2018-05-11 15:50     ` Freddie Chopin
  2018-05-11 16:51       ` Richard Biener
  2018-05-14 14:34       ` David Brown
  0 siblings, 2 replies; 12+ messages in thread
From: Freddie Chopin @ 2018-05-11 15:50 UTC (permalink / raw)
  To: GCC Development

On Fri, 2018-05-11 at 13:06 +0200, David Brown wrote:
> For the Cortex-M devices (and probably many other RISC targets),
> -fdata-sections comes at a big cost - it effectively blocks
> -fsection-anchors and makes access to file-static data a lot bigger.
> People often use -fdata-sections and -ffunction-sections along with
> -Wl,--gc-sections with the aim of removing unused code and data (and
> thus saving space, useful on small devices) - I would expect LTO would
> manage that anyway.  The other purpose of these is to improve locality
> of reference - again LTO should do that for you.  But even without LTO,
> I find the cost of -fdata-sections high compared to -fsection-anchors.

Unfortunately, having LTO doesn't make -ffunction-sections +
-fdata-sections + --gc-sections useless.

My test project compiled:
- without LTO and without these attributes - 150824 B ROM + 4240 B RAM
- with LTO and without these attributes - 133812 B ROM + 4208 B RAM
- without LTO and with these attributes - 124456 B ROM + 3484 B RAM
- with LTO and with these attributes - 120280 B ROM + 3680 B RAM

As you see these attributes give much more than LTO in terms of size.
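
To put the four ROM figures above on a common scale, here is a minimal C
sketch that computes the saving of each configuration relative to the
baseline build (no LTO, no section options). The helper names
saved_bytes and saved_permille are made up for this illustration; only
the byte counts come from the measurements above.

```c
#include <stdint.h>

/* ROM sizes in bytes, copied from the measurements above. */
enum {
    ROM_BASELINE     = 150824, /* no LTO, no section options */
    ROM_LTO_ONLY     = 133812, /* LTO only */
    ROM_SECTIONS     = 124456, /* section options + --gc-sections only */
    ROM_LTO_SECTIONS = 120280  /* LTO plus the section options */
};

/* Hypothetical helper: bytes saved relative to a baseline size. */
uint32_t saved_bytes(uint32_t baseline, uint32_t size)
{
    return baseline - size;
}

/* Hypothetical helper: saving in tenths of a percent, rounded down. */
uint32_t saved_permille(uint32_t baseline, uint32_t size)
{
    uint64_t scaled = (uint64_t)saved_bytes(baseline, size) * 1000u;
    return (uint32_t)(scaled / baseline);
}
```

With these inputs, LTO alone saves 17012 bytes of ROM, the section
options alone save 26368 bytes, and both together save 30544 bytes.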

As for the -fsection-anchors I guess this has no use for non-PIC code
for arm-none-eabi. Whether I use it or not, the sizes are identical.

Regards,
FCh

* Re: LTO vs GCC 8
  2018-05-11 15:50     ` Freddie Chopin
@ 2018-05-11 16:51       ` Richard Biener
  2018-05-15 19:39         ` Freddie Chopin
  2018-05-14 14:34       ` David Brown
  1 sibling, 1 reply; 12+ messages in thread
From: Richard Biener @ 2018-05-11 16:51 UTC (permalink / raw)
  To: gcc, Freddie Chopin, GCC Development

On May 11, 2018 5:49:44 PM GMT+02:00, Freddie Chopin <freddie_chopin@op.pl> wrote:
>On Fri, 2018-05-11 at 13:06 +0200, David Brown wrote:
>> For the Cortex-M devices (and probably many other RISC targets),
>> -fdata-sections comes at a big cost - it effectively blocks
>> -fsection-anchors and makes access to file-static data a lot bigger.
>> People often use -fdata-sections and -ffunction-sections along with
>> -Wl,--gc-sections with the aim of removing unused code and data (and
>> thus saving space, useful on small devices) - I would expect LTO would
>> manage that anyway.  The other purpose of these is to improve locality
>> of reference - again LTO should do that for you.  But even without LTO,
>> I find the cost of -fdata-sections high compared to -fsection-anchors.
>
>Unfortunately, having LTO doesn't make -ffunction-sections +
>-fdata-sections + --gc-sections useless.
>
>My test project compiled:
>- without LTO and without these attributes - 150824 B ROM + 4240 B RAM
>- with LTO and without these attributes - 133812 B ROM + 4208 B RAM
>- without LTO and with these attributes - 124456 B ROM + 3484 B RAM
>- with LTO and with these attributes - 120280 B ROM + 3680 B RAM
>
>As you see these attributes give much more than LTO in terms of size.
>
>As for the -fsection-anchors I guess this has no use for non-PIC code
>for arm-none-eabi. Whether I use it or not, the sizes are identical.

That's an interesting result. Do you have any non-LTO objects? Basically I'm curious what ld eliminates that gcc with LTO doesn't. 

As to a workaround for the ld bug you can try keeping all .debug_* sections. IIRC 2.30 has the bug fixed (on the branch). 

Richard. 

>Regards,
>FCh

* Re: LTO vs GCC 8
  2018-05-11 15:50     ` Freddie Chopin
  2018-05-11 16:51       ` Richard Biener
@ 2018-05-14 14:34       ` David Brown
  2018-05-15 20:04         ` Freddie Chopin
  1 sibling, 1 reply; 12+ messages in thread
From: David Brown @ 2018-05-14 14:34 UTC (permalink / raw)
  To: Freddie Chopin, GCC Development

On 11/05/18 17:49, Freddie Chopin wrote:
> On Fri, 2018-05-11 at 13:06 +0200, David Brown wrote:
>> For the Cortex-M devices (and probably many other RISC targets),
>> -fdata-sections comes at a big cost - it effectively blocks
>> -fsection-anchors and makes access to file-static data a lot bigger.
>> People often use -fdata-sections and -ffunction-sections along with
>> -Wl,--gc-sections with the aim of removing unused code and data (and
>> thus saving space, useful on small devices) - I would expect LTO would
>> manage that anyway.  The other purpose of these is to improve locality
>> of reference - again LTO should do that for you.  But even without LTO,
>> I find the cost of -fdata-sections high compared to -fsection-anchors.
> 
> Unfortunately, having LTO doesn't make -ffunction-sections +
> -fdata-sections + --gc-sections useless.
> 
> My test project compiled:
> - without LTO and without these attributes - 150824 B ROM + 4240 B RAM
> - with LTO and without these attributes - 133812 B ROM + 4208 B RAM
> - without LTO and with these attributes - 124456 B ROM + 3484 B RAM
> - with LTO and with these attributes - 120280 B ROM + 3680 B RAM
> 
> As you see these attributes give much more than LTO in terms of size.
> 

Interesting.  Making these sections and then using gc-sections should
only remove code that is not used - LTO should do that anyway.

Have you tried with -ffunction-sections and not -fdata-sections?  It is
the -fdata-sections that ruins -fsection-anchors - the
-ffunction-sections doesn't have the same kind of cost.

>
> As for the -fsection-anchors I guess this has no use for non-PIC code
> for arm-none-eabi. Whether I use it or not, the sizes are identical.
> 

No, -fsection-anchors has plenty of use for fixed-position eabi code.

Take this little example code:

static int x;
static int y;
static int z;

void foo(void) {
	int t = x;
	x = y;
	y = z;
	z = t;
}

Compiled with gcc (4.8, as that's what I had convenient) with -O2
-mcpu=cortex-m4 -mthumb and -fsection-anchors (enabled automatically
with -O2, I believe), this gives:

  21                    foo:
  22                            @ args = 0, pretend = 0, frame = 0
  23                            @ frame_needed = 0, uses_anonymous_args = 0
  24                            @ link register save eliminated.
  25 0000 034B                  ldr     r3, .L2
  26 0002 93E80500              ldmia   r3, {r0, r2}
  27 0006 9968                  ldr     r1, [r3, #8]
  28 0008 1A60                  str     r2, [r3]
  29 000a 9860                  str     r0, [r3, #8]
  30 000c 5960                  str     r1, [r3, #4]
  31 000e 7047                  bx      lr
  32                    .L3:
  33                            .align  2
  34                    .L2:
  35 0010 00000000              .word   .LANCHOR0
  37                            .bss
  38                            .align  2
  39                            .set    .LANCHOR0,. + 0
  42                    x:
  43 0000 00000000              .space  4
  46                    y:
  47 0004 00000000              .space  4
  50                    z:
  51 0008 00000000              .space  4


With -fdata-sections, I get:

  21                    foo:
  22                            @ args = 0, pretend = 0, frame = 0
  23                            @ frame_needed = 0, uses_anonymous_args = 0
  24                            @ link register save eliminated.
  25 0000 30B4                  push    {r4, r5}
  26 0002 0549                  ldr     r1, .L2
  27 0004 054B                  ldr     r3, .L2+4
  28 0006 064A                  ldr     r2, .L2+8
  29 0008 0D68                  ldr     r5, [r1]
  30 000a 1468                  ldr     r4, [r2]
  31 000c 1868                  ldr     r0, [r3]
  32 000e 1560                  str     r5, [r2]
  33 0010 1C60                  str     r4, [r3]
  34 0012 0860                  str     r0, [r1]
  35 0014 30BC                  pop     {r4, r5}
  36 0016 7047                  bx      lr
  37                    .L3:
  38                            .align  2
  39                    .L2:
  40 0018 00000000              .word   .LANCHOR0
  41 001c 00000000              .word   .LANCHOR1
  42 0020 00000000              .word   .LANCHOR2
  44                            .section        .bss.x,"aw",%nobits
  45                            .align  2
  46                            .set    .LANCHOR0,. + 0
  49                    x:
  50 0000 00000000              .space  4
  51                            .section        .bss.y,"aw",%nobits
  52                            .align  2
  53                            .set    .LANCHOR1,. + 0
  56                    y:
  57 0000 00000000              .space  4
  58                            .section        .bss.z,"aw",%nobits
  59                            .align  2
  60                            .set    .LANCHOR2,. + 0
  63                    z:
  64 0000 00000000              .space  4


The code is clearly bigger and slower, and uses more anchors in the code
section.
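
A common way to keep single-base addressing regardless of
-fdata-sections, which is not something proposed in this thread, is to
put related file-static variables into one struct object. The whole
struct then lives in a single section, so the compiler can reach every
member as an offset from one address. The names state and foo_grouped
are invented for this sketch, and the generated code has not been
checked against any particular GCC version.

```c
/* Hypothetical grouping of the x, y, z statics from the example above. */
static struct {
    int x;
    int y;
    int z;
} state;

/* Same rotation as foo(), but every access is an offset from &state. */
void foo_grouped(void)
{
    int t = state.x;
    state.x = state.y;
    state.y = state.z;
    state.z = t;
}
```

The trade-off is that the linker can then only discard the struct as a
whole, not its individual members.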


Note that to get similar improvements with non-static data, you need
"-fno-common" - a flag that I believe should be the default for the
compiler.
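
As a sketch of the non-static case: the variables below are
uninitialized globals with external linkage. With -fcommon (the default
in GCC releases of this era) they are emitted as common symbols rather
than placed in .bss, so they cannot share a section anchor; with
-fno-common they become ordinary .bss definitions. The names gx, gy, gz
and rotate_globals are invented for this illustration, and the resulting
assembly is not shown because it has not been verified here.

```c
/* Tentative definitions: common symbols with -fcommon, .bss with -fno-common. */
int gx;
int gy;
int gz;

/* Same rotation as foo() above, on globals with external linkage. */
void rotate_globals(void)
{
    int t = gx;
    gx = gy;
    gy = gz;
    gz = t;
}
```

Compiling this file with and without -fno-common (using the same -O2
-mcpu=cortex-m4 -mthumb options as above) and comparing the listings
should show whether a single anchor is used.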


* Re: LTO vs GCC 8
  2018-05-11 16:51       ` Richard Biener
@ 2018-05-15 19:39         ` Freddie Chopin
  2018-05-15 20:13           ` Freddie Chopin
  0 siblings, 1 reply; 12+ messages in thread
From: Freddie Chopin @ 2018-05-15 19:39 UTC (permalink / raw)
  To: gcc

On Fri, 2018-05-11 at 18:51 +0200, Richard Biener wrote:
> That's an interesting result. Do you have any non-LTO objects?
> Basically I'm curious what ld eliminates that gcc with LTO doesn't. 

The whole project is compiled with LTO; part of it is provided as a
library (which is archived with arm-none-eabi-gcc-ar). The only non-LTO
parts of the final executable are objects from the standard toolchain
libraries, and I suppose they are the culprit here - the toolchain is
compiled with -ffunction-sections -fdata-sections, but without -flto.
Maybe I should actually compile the whole toolchain with -flto
-ffat-lto-objects? Is this a sane idea?

> As to a workaround for the ld bug you can try keeping all .debug_*
> sections. IIRC 2.30 has the bug fixed (on the branch). 

Indeed - "keeping" all the debug sections is a viable alternative. I've
found out that it is enough to "keep" just these:

	/* DWARF 2 */
	.debug_info 0 : { KEEP(*(.debug_info .gnu.linkonce.wi.*)); }
	 ...
	.debug_frame 0 : { KEEP(*(.debug_frame)); }

I have to check whether debugging something like that is actually
possible (; Thanks for the workaround!

Regards,
FCh

* Re: LTO vs GCC 8
  2018-05-14 14:34       ` David Brown
@ 2018-05-15 20:04         ` Freddie Chopin
  2018-05-16  7:37           ` David Brown
  0 siblings, 1 reply; 12+ messages in thread
From: Freddie Chopin @ 2018-05-15 20:04 UTC (permalink / raw)
  To: gcc

On Mon, 2018-05-14 at 16:34 +0200, David Brown wrote:
> Interesting.  Making these sections and then using gc-sections should
> only remove code that is not used - LTO should do that anyway.

My guess - expressed in the other e-mail to the list - is that the
things LTO cannot remove but --gc-sections can are objects from the
toolchain libraries.

> Have you tried with -ffunction-sections and not -fdata-sections?  It
> is
> the -fdata-sections that ruins -fsection-anchors - the
> -ffunction-sections doesn't have the same kind of cost.

Results:
- -ffunction-sections + -fdata-sections = 124396 ROM + 3484 RAM
- -ffunction-sections = 125168 ROM + 3676 RAM
- -ffunction-sections + -fsection-anchors = 125168 ROM + 3676 RAM
- -ffunction-sections + -fsection-anchors + -fno-common = 125168 ROM +
3676 RAM 

Generated executables for the second, third and fourth case are
identical - assembly listings for these three cases have no differences
at all.

I've also tried with -fno-section-anchors, and this makes a minor
(negative) difference - 125352 ROM + 3676 RAM.

> No, -fsection-anchors has plenty of use for fixed-position eabi code.
> ...
> The code is clearly bigger and slower, and uses more anchors in the
> code
> section.
> 
> Note that to get similar improvements with non-static data, you need
> "-fno-common" - a flag that I believe should be the default for the
> compiler.

I cannot reproduce this here ); Don't get me wrong - if there's a
"free" way to improve code size/speed with some compiler flags which I
did not use previously, then I'm very much interested, however in my
particular case the best result (size-wise) I get is with just
-ffunction-sections + -fdata-sections. The difference is not huge, but
it's also not negligible. Maybe this has to do with different compiler
versions we are comparing (4.8 vs 8.1)? I guess this is not LTO (which
I did not enable for these measurements), as you did not mention it in
your flags...

Regards,
FCh

* Re: LTO vs GCC 8
  2018-05-15 19:39         ` Freddie Chopin
@ 2018-05-15 20:13           ` Freddie Chopin
  2018-05-16  5:26             ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Freddie Chopin @ 2018-05-15 20:13 UTC (permalink / raw)
  To: gcc

On Tue, 2018-05-15 at 21:39 +0200, Freddie Chopin wrote:
> On Fri, 2018-05-11 at 18:51 +0200, Richard Biener wrote:
> > As to a workaround for the ld bug you can try keeping all .debug_*
> > sections. IIRC 2.30 has the bug fixed (on the branch). 
> 
> Indeed - "keeping" all the debug sections is a viable alternative.
> I've
> found out that it is enough to "keep" just these:
> 
> 	/* DWARF 2 */
> 	.debug_info 0 : { KEEP(*(.debug_info .gnu.linkonce.wi.*)); }
> 	 ...
> 	.debug_frame 0 : { KEEP(*(.debug_frame)); }
> 
> I have to check whether debugging something like that is actually
> possible (; Thanks for the workaround!

Nope, sent it too fast... With these two (three) sections "kept",
--gc-sections stops working and the executable I get is almost
identical to the case when I have no --gc-sections at all:
- lto + --gc-sections, sections "kept" - 133504 ROM + 4196 RAM
- lto + --gc-sections, sections not "kept" (causes previously mentioned
errors) - 120288 ROM + 3676 RAM
- lto, sections not "kept" - 133812 ROM + 4220 RAM

So it seems I have to patiently wait for new binutils if I would like
to use LTO (;

Regards,
FCh

* Re: LTO vs GCC 8
  2018-05-15 20:13           ` Freddie Chopin
@ 2018-05-16  5:26             ` Richard Biener
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Biener @ 2018-05-16  5:26 UTC (permalink / raw)
  To: gcc, Freddie Chopin

On May 15, 2018 10:12:45 PM GMT+02:00, Freddie Chopin <freddie_chopin@op.pl> wrote:
>On Tue, 2018-05-15 at 21:39 +0200, Freddie Chopin wrote:
>> On Fri, 2018-05-11 at 18:51 +0200, Richard Biener wrote:
>> > As to a workaround for the ld bug you can try keeping all .debug_*
>> > sections. IIRC 2.30 has the bug fixed (on the branch). 
>> 
>> Indeed - "keeping" all the debug sections is a viable alternative.
>> I've
>> found out that it is enough to "keep" just these:
>> 
>> 	/* DWARF 2 */
>> 	.debug_info 0 : { KEEP(*(.debug_info .gnu.linkonce.wi.*)); }
>> 	 ...
>> 	.debug_frame 0 : { KEEP(*(.debug_frame)); }
>> 
>> I have to check whether debugging something like that is actually
>> possible (; Thanks for the workaround!
>
>Nope, sent it too fast... With these two (three) sections "kept" --gc-
>sections stops working and the executable I get is almost identical to
>the case when I have no --gc-sections at all:
>- lto + --gc-sections, sections "kept" - 133504 ROM + 4196 RAM
>- lto + --gc-sections, sections not "kept" (causes previously mentioned
>errors) - 120288 ROM + 3676 RAM
>- lto, sections not "kept" - 133812 ROM + 4220 RAM
>
>So it seems I have to patiently wait for new binutils if I would like
>to use LTO (;

Build your own (patched) binutils :) 

Richard. 

>Regards,
>FCh

* Re: LTO vs GCC 8
  2018-05-15 20:04         ` Freddie Chopin
@ 2018-05-16  7:37           ` David Brown
  0 siblings, 0 replies; 12+ messages in thread
From: David Brown @ 2018-05-16  7:37 UTC (permalink / raw)
  To: Freddie Chopin, gcc

On 15/05/18 22:03, Freddie Chopin wrote:

> 
> I cannot reproduce this here ); Don't get me wrong - if there's a
> "free" way to improve code size/speed with some compiler flags which I
> did not use previously, then I'm very much interested, however in my
> particular case the best result (size-wise) I get is with just
> -ffunction-sections + -fdata-sections. The difference is not huge, but
> it's also not negligible. Maybe this has to do with different compiler
> versions we are comparing (4.8 vs 8.1)? I guess this is not LTO (which
> I did not enable for these measurements), as you did not mention it in
> your flags...
> 

It is quite possible that the difference is from the gcc versions -
there have been many improvements since 4.8, and it is entirely possible
that gcc now gives the benefits of -fsection-anchors even with
-fdata-sections.  And I was looking here at the differences for short
code sections, rather than the whole program.

I will try a few more tests when I have the chance.  This computer has
such an old Linux installation that I can no longer use modern pre-built
versions of the gnu arm embedded toolchain - I can't even use
godbolt.org because the browser version is too old.  I'll test from
home, where I have newer arm gcc versions (including a "bleeding edge"
toolchain or two, provided by some nice chap off the internet :-) ).



Thread overview: 12+ messages
2018-05-10 21:32 LTO vs GCC 8 Freddie Chopin
2018-05-11  9:19 ` Richard Biener
2018-05-11 11:06   ` David Brown
2018-05-11 15:50     ` Freddie Chopin
2018-05-11 16:51       ` Richard Biener
2018-05-15 19:39         ` Freddie Chopin
2018-05-15 20:13           ` Freddie Chopin
2018-05-16  5:26             ` Richard Biener
2018-05-14 14:34       ` David Brown
2018-05-15 20:04         ` Freddie Chopin
2018-05-16  7:37           ` David Brown
2018-05-11 15:33   ` Freddie Chopin
