public inbox for gcc@gcc.gnu.org
* How big (and fast) is going to be GCC 8?
@ 2018-03-06 10:14 Martin Liška
  2018-03-06 15:13 ` David Malcolm
                   ` (4 more replies)
  0 siblings, 5 replies; 30+ messages in thread
From: Martin Liška @ 2018-03-06 10:14 UTC (permalink / raw)
  To: GCC Development; +Cc: Jan Hubicka, Richard Biener, Michael Matz, Martin Jambor

[-- Attachment #1: Type: text/plain, Size: 520 bytes --]

Hello.

Many significant changes have landed in mainline and will be released as GCC 8.1.
I decided to take the various GCC configurations we have and test how they differ
in speed and also binary size.

This is the first part, where I measured binary size; a speed comparison will follow.
Configuration names should be self-explanatory; the 'system-*' builds are done
without bootstrap with my system compiler (GCC 7.3.0). All builds are done
on my Intel Haswell machine.
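
For readers who want to reproduce this kind of size measurement, here is a
minimal sketch. It is not the script used for the attached numbers; the helper
names are made up for illustration, and it assumes binutils' strip is on PATH
and that "MB" means 10**6 bytes.

```python
import os
import shutil
import subprocess
import tempfile


def to_mb(n_bytes):
    """Convert a byte count to decimal megabytes (assumption: 1 MB = 10**6 bytes)."""
    return n_bytes / 1_000_000


def binary_sizes(path):
    """Return (unstripped, stripped) sizes in bytes for the binary at `path`.

    The binary is copied to a temporary directory before strip(1) runs,
    so the original file is left untouched.
    """
    unstripped = os.path.getsize(path)
    with tempfile.TemporaryDirectory() as tmp:
        copy = os.path.join(tmp, os.path.basename(path))
        shutil.copy2(path, copy)
        subprocess.run(["strip", copy], check=True)
        stripped = os.path.getsize(copy)
    return unstripped, stripped
```

Calling binary_sizes() on a cc1 or cc1plus from each build directory and
passing both values through to_mb() gives one table row per configuration.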

Feel free to reply if you need any explanation.
Martin

[-- Attachment #2: gcc-8-build-stats.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 24473 bytes --]

[-- Attachment #3: gcc-8-build-stats.pdf.bz2 --]
[-- Type: application/x-bzip, Size: 21081 bytes --]

* Re: How big (and fast) is going to be GCC 8?
  2018-03-06 10:14 How big (and fast) is going to be GCC 8? Martin Liška
@ 2018-03-06 15:13 ` David Malcolm
  2018-03-06 16:18   ` Martin Liška
  2018-03-06 17:50 ` How big (and fast) is going to be GCC 8? [part 2] Martin Liška
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 30+ messages in thread
From: David Malcolm @ 2018-03-06 15:13 UTC (permalink / raw)
  To: Martin Liška, GCC Development
  Cc: Jan Hubicka, Richard Biener, Michael Matz, Martin Jambor

On Tue, 2018-03-06 at 11:14 +0100, Martin Liška wrote:
> Hello.
> 
> Many significant changes has landed in mainline and will be released
> as GCC 8.1.
> I decided to use various GCC configs we have and test how there
> configuration differ
> in size and also binary size.
> 
> This is first part where I measured binary size, speed comparison
> will follow.
> Configuration names should be self-explaining, the 'system-*' is
> built done
> without bootstrap with my system compiler (GCC 7.3.0). All builds are
> done
> on my Intel Haswell machine.
> 
> Feel free to reply if you need any explanation.
> Martin

Some possibly silly questions:

(a) was this done with:
  --enable-checking=release ?

(b) is this measuring cc1 ?

(c) are the units bytes?  (so ~183MB for the unstripped system-O2-
native cc1, ~25MB after stripping?)

(d) do you have comparable data for gcc 7?

Thanks
Dave

* Re: How big (and fast) is going to be GCC 8?
  2018-03-06 15:13 ` David Malcolm
@ 2018-03-06 16:18   ` Martin Liška
  2018-03-06 18:35     ` Martin Liška
  0 siblings, 1 reply; 30+ messages in thread
From: Martin Liška @ 2018-03-06 16:18 UTC (permalink / raw)
  To: David Malcolm, GCC Development
  Cc: Jan Hubicka, Richard Biener, Michael Matz, Martin Jambor

On 03/06/2018 04:13 PM, David Malcolm wrote:
> On Tue, 2018-03-06 at 11:14 +0100, Martin Liška wrote:
>> Hello.
>>
>> Many significant changes has landed in mainline and will be released
>> as GCC 8.1.
>> I decided to use various GCC configs we have and test how there
>> configuration differ
>> in size and also binary size.
>>
>> This is first part where I measured binary size, speed comparison
>> will follow.
>> Configuration names should be self-explaining, the 'system-*' is
>> built done
>> without bootstrap with my system compiler (GCC 7.3.0). All builds are
>> done
>> on my Intel Haswell machine.
>>
>> Feel free to reply if you need any explanation.
>> Martin
> 
> Some possibly silly questions:

Hi David.

All of them are valid questions!

> 
> (a) was this done with:
>    --enable-checking=release ?

Yes.

> 
> (b) is this measuring cc1 ?

cc1plus. I'll also add cc1 once I have the run-time numbers.

> 
> (c) are the units bytes?  (so ~183MB for the unstripped system-O2-
> native cc1, ~25MB after stripping?)

Yes, in bytes. It would be nicer to have it in MB ;) That would be easier
to read. I'll fix that.

> 
> (d) do you have comparable data for gcc 7?

Will build corresponding builds for GCC 7 tonight.

Martin

> 
> Thanks
> Dave
> 

* Re: How big (and fast) is going to be GCC 8? [part 2]
  2018-03-06 10:14 How big (and fast) is going to be GCC 8? Martin Liška
  2018-03-06 15:13 ` David Malcolm
@ 2018-03-06 17:50 ` Martin Liška
  2018-03-06 18:16   ` Bin.Cheng
  2018-03-07  9:26 ` Size and speed comparison of GCC 7 & 8 Martin Liška
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 30+ messages in thread
From: Martin Liška @ 2018-03-06 17:50 UTC (permalink / raw)
  To: GCC Development; +Cc: Jan Hubicka, Richard Biener, Michael Matz, Martin Jambor

[-- Attachment #1: Type: text/plain, Size: 326 bytes --]

Hi.

This is a speed comparison of GCC 8 builds against my system GCC 7.3.0,
which is built with PGO bootstrap.

I compiled an empty C and an empty C++ source file and tramp3d; the rest are
some big beasts from the GCC sources. Feel free to suggest other test
candidates. Note that the first column gives how many times each test was run.
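
A timing loop of this kind could be sketched as follows. This is only an
illustration under stated assumptions, not the harness behind the attached
numbers: the function names are hypothetical, and wall-clock time is used for
simplicity.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs):
    """Run `cmd` `runs` times; return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce a list of timings to run count, mean and best time."""
    return {"runs": len(timings),
            "mean": statistics.mean(timings),
            "min": min(timings)}
```

For example, time_command(["g++", "-O2", "-c", "tramp3d.ii", "-o",
"/dev/null"], runs=10) would time ten compilations; the file name here is a
placeholder.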

Martin

[-- Attachment #2: gcc-8-perf-stats.pdf.bz2 --]
[-- Type: application/x-bzip, Size: 24793 bytes --]

[-- Attachment #3: gcc-8-perf-stats.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 24141 bytes --]

* Re: How big (and fast) is going to be GCC 8? [part 2]
  2018-03-06 17:50 ` How big (and fast) is going to be GCC 8? [part 2] Martin Liška
@ 2018-03-06 18:16   ` Bin.Cheng
  2018-03-06 18:36     ` Martin Liška
  0 siblings, 1 reply; 30+ messages in thread
From: Bin.Cheng @ 2018-03-06 18:16 UTC (permalink / raw)
  To: Martin Liška
  Cc: GCC Development, Jan Hubicka, Richard Biener, Michael Matz,
	Martin Jambor

On Tue, Mar 6, 2018 at 5:50 PM, Martin Liška <mliska@suse.cz> wrote:
> Hi.
>
> This is speed comparison of GCC 8 builds compared to my system GCC 7.3.0
> which is built with PGO bootstrap.
>
> I run empty C and C++ source file, tramp3d and the rest are some big beasts
> from GCC source file. Feel free to suggest another test candidates? Note
> that first column defines how many times was test run.
First, thanks very much for collecting the data.
Since we enabled several loop passes at -O3 and above, some data
for -Ofast might be interesting?

Thanks,
bin
>
> Martin

* Re: How big (and fast) is going to be GCC 8?
  2018-03-06 16:18   ` Martin Liška
@ 2018-03-06 18:35     ` Martin Liška
  0 siblings, 0 replies; 30+ messages in thread
From: Martin Liška @ 2018-03-06 18:35 UTC (permalink / raw)
  To: David Malcolm, GCC Development
  Cc: Jan Hubicka, Richard Biener, Michael Matz, Martin Jambor

[-- Attachment #1: Type: text/plain, Size: 285 bytes --]

On 03/06/2018 05:18 PM, Martin Liška wrote:
> Yes, in bytes. Would be nicer to have it in MB ;) It would be easily
> readable. I'll fix that.

Hi.

I'm sending updated binary size statistics for both cc1 and cc1plus
in MB.

Martin

[-- Attachment #2: gcc-8-build-stats-v2.pdf.bz2 --]
[-- Type: application/x-bzip, Size: 27525 bytes --]

[-- Attachment #3: gcc-8-build-stats-v2.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 25633 bytes --]

* Re: How big (and fast) is going to be GCC 8? [part 2]
  2018-03-06 18:16   ` Bin.Cheng
@ 2018-03-06 18:36     ` Martin Liška
  0 siblings, 0 replies; 30+ messages in thread
From: Martin Liška @ 2018-03-06 18:36 UTC (permalink / raw)
  To: Bin.Cheng
  Cc: GCC Development, Jan Hubicka, Richard Biener, Michael Matz,
	Martin Jambor

On 03/06/2018 07:16 PM, Bin.Cheng wrote:
> On Tue, Mar 6, 2018 at 5:50 PM, Martin Liška <mliska@suse.cz> wrote:
>> Hi.
>>
>> This is speed comparison of GCC 8 builds compared to my system GCC 7.3.0
>> which is built with PGO bootstrap.
>>
>> I run empty C and C++ source file, tramp3d and the rest are some big beasts
>> from GCC source file. Feel free to suggest another test candidates? Note
>> that first column defines how many times was test run.
> First thanks very much for collecting the data.
> Since we enabled several loop passes at O3 and above levels, some data
> for Ofast might be interesting?

Do you have a nice source file full of loop nests that would test that
properly?

Martin

> 
> Thanks,
> bin
>>
>> Martin

* Re: Size and speed comparison of GCC 7 & 8
  2018-03-06 10:14 How big (and fast) is going to be GCC 8? Martin Liška
  2018-03-06 15:13 ` David Malcolm
  2018-03-06 17:50 ` How big (and fast) is going to be GCC 8? [part 2] Martin Liška
@ 2018-03-07  9:26 ` Martin Liška
  2018-03-07 10:13   ` Martin Liška
  2018-03-20 19:57 ` How can compiler speed-up postgresql database? Martin Liška
  2019-04-15 11:44 ` GCC 8 vs. GCC 9 speed and size comparison Martin Liška
  4 siblings, 1 reply; 30+ messages in thread
From: Martin Liška @ 2018-03-07  9:26 UTC (permalink / raw)
  To: GCC Development; +Cc: Jan Hubicka, Richard Biener, Michael Matz, Martin Jambor

[-- Attachment #1: Type: text/plain, Size: 172 bytes --]

Hi.

I now also have numbers for GCC 7, so the attached PDF and Calc file contain
all the numbers I have. It looks like we've got both speed and size limits compared to GCC 7.

Martin

[-- Attachment #2: gcc-build-stats-all-in-one.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 53989 bytes --]

[-- Attachment #3: gcc-build-stats-all-in-one.pdf.bz2 --]
[-- Type: application/x-bzip, Size: 80048 bytes --]

* Re: Size and speed comparison of GCC 7 & 8
  2018-03-07  9:26 ` Size and speed comparison of GCC 7 & 8 Martin Liška
@ 2018-03-07 10:13   ` Martin Liška
  2018-03-07 11:12     ` Martin Liška
  2018-03-13 13:31     ` Martin Liška
  0 siblings, 2 replies; 30+ messages in thread
From: Martin Liška @ 2018-03-07 10:13 UTC (permalink / raw)
  To: GCC Development; +Cc: Jan Hubicka, Richard Biener, Michael Matz, Martin Jambor

[-- Attachment #1: Type: text/plain, Size: 56 bytes --]

V2: fixed headers in the last table of the PDF.

Martin

[-- Attachment #2: gcc-build-stats-all-in-one.pdf.bz2 --]
[-- Type: application/x-bzip, Size: 80057 bytes --]

[-- Attachment #3: gcc-build-stats-all-in-one.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 54043 bytes --]

* Re: Size and speed comparison of GCC 7 & 8
  2018-03-07 10:13   ` Martin Liška
@ 2018-03-07 11:12     ` Martin Liška
  2018-03-13 13:31     ` Martin Liška
  1 sibling, 0 replies; 30+ messages in thread
From: Martin Liška @ 2018-03-07 11:12 UTC (permalink / raw)
  To: GCC Development; +Cc: Jan Hubicka, Richard Biener, Michael Matz, Martin Jambor

On 03/07/2018 11:13 AM, Martin Liška wrote:
> V2: fixed headers in the last table of the PDF.
> 
> Martin
> 

For i386.ii at -O2 -g, here is the perf diff between GCC 7 (baseline) and GCC 8:

# Baseline  Delta Abs  Shared Object         Symbol
# ........  .........  ....................  ..............................
#
               +0.65%  cc1plus               [.] hash_table<hash_map<int_hash<int, 0, -1>, ipa_call_summary*, simple_hashmap_traits<default_hash_traits<int_hash<int, 0, -1> >, ipa_call_summary*> >::hash_entry, xcallocator>::find_slot_with_hash
     0.18%     +0.43%  cc1plus               [.] sreal::operator*
               +0.41%  cc1plus               [.] hash_table<hash_map<int_hash<int, 0, -1>, ipa_fn_summary*, simple_hashmap_traits<default_hash_traits<int_hash<int, 0, -1> >, ipa_fn_summary*> >::hash_entry, xcallocator>::find_slot_with_hash
     0.07%     +0.35%  cc1plus               [.] cgraph_node::find_replacement
               +0.33%  cc1plus               [.] profile_count::to_sreal_scale
               +0.33%  cc1plus               [.] predicate::probability
               +0.27%  cc1plus               [.] call_summary<ipa_call_summary*>::get
     0.04%     +0.25%  cc1plus               [.] sreal::operator/
               +0.24%  cc1plus               [.] sreal::normalize
     0.70%     -0.23%  [kernel]              [.] 0xffffffff9c80019f
               +0.23%  cc1plus               [.] wide_int_to_tree_1
     0.09%     +0.22%  cc1plus               [.] sreal::operator+
               +0.21%  cc1plus               [.] analyze_function_body
     0.04%     +0.19%  cc1plus               [.] dwarf2out_var_location
     0.19%     -0.19%  cc1plus               [.] compute_inlined_call_time
               +0.19%  cc1plus               [.] function_summary<ipa_fn_summary*>::get
     0.30%     -0.18%  cc1plus               [.] can_inline_edge_p
     1.91%     -0.16%  cc1plus               [.] bitmap_set_bit
     0.74%     -0.15%  cc1plus               [.] pre_and_rev_post_order_compute_fn
     0.80%     +0.15%  [unknown]             [.] 0xffffffff9c80019f
               +0.14%  cc1plus               [.] cleanup_control_flow_pre
     0.81%     -0.14%  cc1plus               [.] ggc_set_mark
     0.13%     +0.13%  cc1plus               [.] variably_modified_type_p
     0.81%     -0.13%  cc1plus               [.] et_splay
     0.17%     -0.13%  cc1plus               [.] curr_insn_transform
               +0.12%  cc1plus               [.] profile_count::from_gcov_type
               +0.12%  cc1plus               [.] process_alt_operands
               +0.12%  cc1plus               [.] can_inline_edge_by_limits_p
     0.60%     -0.11%  cc1plus               [.] estimate_calls_size_and_time
     1.36%     -0.11%  libc-2.26.so          [.] _int_malloc
     0.27%     +0.11%  cc1plus               [.] constrain_operands
               +0.11%  cc1plus               [.] bitmap_alloc
     0.60%     +0.11%  cc1plus               [.] hash_table<variable_hasher, xcallocator>::find_slot_with_hash
               +0.11%  cc1plus               [.] predicate::evaluate
               +0.10%  cc1plus               [.] vr_values::get_value_range
     0.22%     -0.10%  cc1plus               [.] nonzero_bits1
     0.24%     +0.10%  cc1plus               [.] big_speedup_p
               +0.09%  cc1plus               [.] get_class_binding_direct
               +0.09%  cc1plus               [.] maybe_hot_count_p
     0.58%     -0.09%  cc1plus               [.] walk_tree_1
     0.06%     +0.09%  cc1plus               [.] estimate_size_after_inlining
               +0.09%  cc1plus               [.] mark_use
               +0.09%  cc1plus               [.] ix86_hard_regno_call_part_clobbered
     0.59%     -0.09%  cc1plus               [.] bitmap_bit_p
     0.23%     -0.09%  cc1plus               [.] delete_trivially_dead_insns
     0.41%     -0.09%  cc1plus               [.] cse_insn
     0.47%     -0.09%  cc1plus               [.] (anonymous namespace)::dom_info::calc_idoms
               +0.08%  cc1plus               [.] hash_table<named_decl_hash, xcallocator>::find_slot_with_hash
               +0.08%  cc1plus               [.] profile_count::to_frequency
     0.28%     -0.08%  libc-2.26.so          [.] msort_with_tmp.part.0
     0.90%     -0.08%  libc-2.26.so          [.] _int_free
               +0.08%  cc1plus               [.] substitute_and_fold_engine::replace_uses_in
     0.20%     -0.08%  cc1plus               [.] rtx_equal_for_memref_p
               +0.08%  cc1plus               [.] predicate::add_clause
     0.18%     -0.07%  cc1plus               [.] update_callee_keys
     0.67%     -0.07%  cc1plus               [.] gt_ggc_mx_lang_tree_node

Martin

* Re: Size and speed comparison of GCC 7 & 8
  2018-03-07 10:13   ` Martin Liška
  2018-03-07 11:12     ` Martin Liška
@ 2018-03-13 13:31     ` Martin Liška
  1 sibling, 0 replies; 30+ messages in thread
From: Martin Liška @ 2018-03-13 13:31 UTC (permalink / raw)
  To: GCC Development; +Cc: Jan Hubicka, Richard Biener, Michael Matz, Martin Jambor

[-- Attachment #1: Type: text/plain, Size: 190 bytes --]

On 03/07/2018 11:13 AM, Martin Liška wrote:
> V2: fixed headers in the last table of the PDF.
> 
> Martin
> 

Hi.

V3: added kdecore library source file provided by Michael Matz.

Martin

[-- Attachment #2: gcc-build-stats-all-in-one.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 57583 bytes --]

[-- Attachment #3: gcc-build-stats-all-in-one.pdf.bz2 --]
[-- Type: application/x-bzip, Size: 81619 bytes --]

* How can compiler speed-up postgresql database?
  2018-03-06 10:14 How big (and fast) is going to be GCC 8? Martin Liška
                   ` (2 preceding siblings ...)
  2018-03-07  9:26 ` Size and speed comparison of GCC 7 & 8 Martin Liška
@ 2018-03-20 19:57 ` Martin Liška
  2018-03-21  9:26   ` Richard Biener
  2019-04-15 11:44 ` GCC 8 vs. GCC 9 speed and size comparison Martin Liška
  4 siblings, 1 reply; 30+ messages in thread
From: Martin Liška @ 2018-03-20 19:57 UTC (permalink / raw)
  To: GCC Development; +Cc: Jan Hubicka, Richard Biener, Michael Matz, Martin Jambor

[-- Attachment #1: Type: text/plain, Size: 130 bytes --]

Hi.

I did similar stats for the PostgreSQL server, more precisely for pgbench:
pgbench -s100 & 10 runs of pgbench -t10000 -v
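
Averaging the ten runs could be sketched like this. It assumes each run's
pgbench report has been captured as text and contains lines starting with
"tps = "; the exact report wording differs between PostgreSQL versions, and
the helper names are hypothetical.

```python
import re
import statistics

# Assumption: pgbench prints one or more lines of the form "tps = <number> (...)".
TPS = re.compile(r"^tps = (\d+(?:\.\d+)?)", re.MULTILINE)


def parse_tps(report):
    """Return every 'tps = ...' value found in one pgbench report."""
    return [float(value) for value in TPS.findall(report)]


def mean_tps(reports):
    """Average the first tps figure of each captured report."""
    return statistics.mean([parse_tps(report)[0] for report in reports])
```

Comparing mean_tps() across server builds then gives one number per compiler
configuration.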

Martin

[-- Attachment #2: pgbench-gcc-test.pdf.bz2 --]
[-- Type: application/x-bzip, Size: 17078 bytes --]

[-- Attachment #3: pgbench-gcc-test.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 14161 bytes --]

* Re: How can compiler speed-up postgresql database?
  2018-03-20 19:57 ` How can compiler speed-up postgresql database? Martin Liška
@ 2018-03-21  9:26   ` Richard Biener
  2018-03-21  9:34     ` Martin Liška
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Biener @ 2018-03-21  9:26 UTC (permalink / raw)
  To: Martin Liška
  Cc: GCC Development, Jan Hubicka, Michael Matz, Martin Jambor

On Tue, Mar 20, 2018 at 8:57 PM, Martin Liška <mliska@suse.cz> wrote:
> Hi.
>
> I did similar stats for postgresql server, more precisely for pgbench:
> pgbench -s100 & 10 runs of pgbench -t10000 -v

Without looking at the benchmark probably only because it is flawed
(aka not I/O or memory bandwidth limited).  It might have some
actual operations on data (regex code?) that we can speed up though.

Richard.

> Martin

* Re: How can compiler speed-up postgresql database?
  2018-03-21  9:26   ` Richard Biener
@ 2018-03-21  9:34     ` Martin Liška
  2018-03-21 11:47       ` Jan Hubicka
  0 siblings, 1 reply; 30+ messages in thread
From: Martin Liška @ 2018-03-21  9:34 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Development, Jan Hubicka, Michael Matz, Martin Jambor

On 03/21/2018 10:26 AM, Richard Biener wrote:
> On Tue, Mar 20, 2018 at 8:57 PM, Martin Liška <mliska@suse.cz> wrote:
>> Hi.
>>
>> I did similar stats for postgresql server, more precisely for pgbench:
>> pgbench -s100 & 10 runs of pgbench -t10000 -v
> 
> Without looking at the benchmark probably only because it is flawed
> (aka not I/O or memory bandwidth limited).  It might have some
> actual operations on data (regex code?) that we can speed up though.

Well, it's not ideal, as it tests quite a simple DB with just a couple of tables:
```
By default, pgbench tests a scenario that is loosely based on TPC-B, involving five SELECT, UPDATE, and INSERT commands per transaction.
```

Note that I had pg_data in /dev/shm and I verified that CPU utilization was 100% on a single core.
That said, it should not be so misleading ;)

Martin

> 
> Richard.
> 
>> Martin

* Re: How can compiler speed-up postgresql database?
  2018-03-21  9:34     ` Martin Liška
@ 2018-03-21 11:47       ` Jan Hubicka
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Hubicka @ 2018-03-21 11:47 UTC (permalink / raw)
  To: Martin Liška
  Cc: Richard Biener, GCC Development, Michael Matz, Martin Jambor

> On 03/21/2018 10:26 AM, Richard Biener wrote:
> >On Tue, Mar 20, 2018 at 8:57 PM, Martin Liška <mliska@suse.cz> wrote:
> >>Hi.
> >>
> >>I did similar stats for postgresql server, more precisely for pgbench:
> >>pgbench -s100 & 10 runs of pgbench -t10000 -v
> >
> >Without looking at the benchmark probably only because it is flawed
> >(aka not I/O or memory bandwidth limited).  It might have some
> >actual operations on data (regex code?) that we can speed up though.
> 
> Well, it's not ideal as it tests quite simple DB with just couple of tables:
> ```
> By default, pgbench tests a scenario that is loosely based on TPC-B, involving five SELECT, UPDATE, and INSERT commands per transaction.
> ```
> 
> Note that I had pg_data in /dev/shm and I verified that CPU utilization was 100% on a single core.
> That said, it should not be so misleading ;)

Well, it is usually easy to run perf and look at what the hot spots look like.
I see similar speedups for common page loading in Firefox and similar benchmarks,
which are quite good.  Not everything needs to be designed to be memory bound
like SPEC.

Honza
> 
> Martin
> 
> >
> >Richard.
> >
> >>Martin
> 

* GCC 8 vs. GCC 9 speed and size comparison
  2018-03-06 10:14 How big (and fast) is going to be GCC 8? Martin Liška
                   ` (3 preceding siblings ...)
  2018-03-20 19:57 ` How can compiler speed-up postgresql database? Martin Liška
@ 2019-04-15 11:44 ` Martin Liška
  2019-04-15 12:12   ` Michael Matz
  4 siblings, 1 reply; 30+ messages in thread
From: Martin Liška @ 2019-04-15 11:44 UTC (permalink / raw)
  To: GCC Development; +Cc: Jan Hubicka, Richard Biener, Michael Matz, Martin Jambor

[-- Attachment #1: Type: text/plain, Size: 305 bytes --]

Hi.

There's a similar comparison that I did for the official openSUSE gcc packages.
gcc8 is built with PGO, while the gcc9 package is built in 3 different configurations:
PGO, LTO, PGO+LTO (LTO used for FE in stage4, for generators in stage3 as well).

Please take a look at attached statistics.

Martin

[-- Attachment #2: gcc8-gcc9-comparison.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 17320 bytes --]

* Re: GCC 8 vs. GCC 9 speed and size comparison
  2019-04-15 11:44 ` GCC 8 vs. GCC 9 speed and size comparison Martin Liška
@ 2019-04-15 12:12   ` Michael Matz
  2019-04-15 13:20     ` Jan Hubicka
  2019-04-15 13:33     ` Jakub Jelinek
  0 siblings, 2 replies; 30+ messages in thread
From: Michael Matz @ 2019-04-15 12:12 UTC (permalink / raw)
  To: Martin Liška
  Cc: GCC Development, Jan Hubicka, Richard Biener, Martin Jambor

[-- Attachment #1: Type: text/plain, Size: 682 bytes --]

Hi,

On Mon, 15 Apr 2019, Martin Liška wrote:

> There's a similar comparison that I did for the official openSUSE gcc 
> packages. gcc8 is built with PGO, while the gcc9 package is built in 2 
> different configurations: PGO, LTO, PGO+LTO (LTO used for FE in stage4, 
> for generators in stage3 as well).
> 
> Please take a look at attached statistics.

It seems the C++ parser got quite a bit slower with gcc 9 :-( Most visible 
in the compile time for tramp-3d (24%) and kdecore.cc (18% slower with 
just PGO); it seems that the other .ii files are C-like enough to not 
trigger this behaviour, so it's probably something to do with templates 
and/or classes.


Ciao,
Michael.

* Re: GCC 8 vs. GCC 9 speed and size comparison
  2019-04-15 12:12   ` Michael Matz
@ 2019-04-15 13:20     ` Jan Hubicka
  2019-04-15 13:33     ` Jakub Jelinek
  1 sibling, 0 replies; 30+ messages in thread
From: Jan Hubicka @ 2019-04-15 13:20 UTC (permalink / raw)
  To: Michael Matz
  Cc: Martin Liška, GCC Development, Richard Biener, Martin Jambor

> Hi,
> 
> On Mon, 15 Apr 2019, Martin Liška wrote:
> 
> > There's a similar comparison that I did for the official openSUSE gcc 
> > packages. gcc8 is built with PGO, while the gcc9 package is built in 2 
> > different configurations: PGO, LTO, PGO+LTO (LTO used for FE in stage4, 
> > for generators in stage3 as well).
> > 
> > Please take a look at attached statistics.
> 
> It seems the C++ parser got quite a bit slower with gcc 9 :-( Most visible 
> in the compile time for tramp-3d (24%) and kdecore.cc (18% slower with 
> just PGO); it seems that the other .ii files are C-like enough to not 
> trigger this behaviour, so it's probably something to do with templates 
> and/or classes.

Would it be possible to get -ftime-report output for your tramp3d build with
the four compilers?  It may be a consequence of training changes, i.e.
template instantiation code, which is exercised by tramp3d a lot
more than by GCC bootstrap.
But it still seems a bit too much for simple PGO divergence. It also
may be caused by changes in inliner decisions or some other pass being
overly active.

Honza
> 
> 
> Ciao,
> Michael.

* Re: GCC 8 vs. GCC 9 speed and size comparison
  2019-04-15 12:12   ` Michael Matz
  2019-04-15 13:20     ` Jan Hubicka
@ 2019-04-15 13:33     ` Jakub Jelinek
  2019-04-15 15:08       ` Michael Matz
  1 sibling, 1 reply; 30+ messages in thread
From: Jakub Jelinek @ 2019-04-15 13:33 UTC (permalink / raw)
  To: Michael Matz
  Cc: Martin Liška, GCC Development, Jan Hubicka, Richard Biener,
	Martin Jambor

On Mon, Apr 15, 2019 at 12:12:13PM +0000, Michael Matz wrote:
> Hi,
> 
> On Mon, 15 Apr 2019, Martin Liška wrote:
> 
> > There's a similar comparison that I did for the official openSUSE gcc 
> > packages. gcc8 is built with PGO, while the gcc9 package is built in 2 
> > different configurations: PGO, LTO, PGO+LTO (LTO used for FE in stage4, 
> > for generators in stage3 as well).
> > 
> > Please take a look at attached statistics.
> 
> It seems the C++ parser got quite a bit slower with gcc 9 :-( Most visible 
> in the compile time for tramp-3d (24%) and kdecore.cc (18% slower with 
> just PGO); it seems that the other .ii files are C-like enough to not 

Is that with the same libstdc++ headers (i.e. identical *.ii files) or with
the corresponding libstdc++ headers?  Those do change a lot every release as
well.

	Jakub

* Re: GCC 8 vs. GCC 9 speed and size comparison
  2019-04-15 13:33     ` Jakub Jelinek
@ 2019-04-15 15:08       ` Michael Matz
  2019-04-16  7:48         ` Martin Liška
  0 siblings, 1 reply; 30+ messages in thread
From: Michael Matz @ 2019-04-15 15:08 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Martin Liška, GCC Development, Jan Hubicka, Richard Biener,
	Martin Jambor

Hi,

On Mon, 15 Apr 2019, Jakub Jelinek wrote:

> > It seems the C++ parser got quite a bit slower with gcc 9 :-( Most 
> > visible in the compile time for tramp-3d (24%) and kdecore.cc (18% 
> > slower with just PGO); it seems that the other .ii files are C-like 
> > enough to not
> 
> Is that with the same libstdc++ headers (i.e. identical *.ii files) or 
> with the corresponding libstdc++ headers?  Those do change a lot every 
> release as well.

The tramp3d and kdecore testcases are preprocessed files from a 
collection of benchmark sources we use, i.e. the same 
input for all compilers.  I think the {gimple,generic}-match.ii are in the 
same league.


Ciao,
Michael.

* Re: GCC 8 vs. GCC 9 speed and size comparison
  2019-04-15 15:08       ` Michael Matz
@ 2019-04-16  7:48         ` Martin Liška
  2019-04-16  8:17           ` Martin Liška
  2019-04-16  8:53           ` Michael Matz
  0 siblings, 2 replies; 30+ messages in thread
From: Martin Liška @ 2019-04-16  7:48 UTC (permalink / raw)
  To: Michael Matz, Jakub Jelinek
  Cc: GCC Development, Jan Hubicka, Richard Biener, Martin Jambor

[-- Attachment #1: Type: text/plain, Size: 1153 bytes --]

On 4/15/19 5:07 PM, Michael Matz wrote:
> Hi,
> 
> On Mon, 15 Apr 2019, Jakub Jelinek wrote:
> 
>>> It seems the C++ parser got quite a bit slower with gcc 9 :-( Most 
>>> visible in the compile time for tramp-3d (24%) and kdecore.cc (18% 
>>> slower with just PGO); it seems that the other .ii files are C-like 
>>> enough to not
>>
>> Is that with the same libstdc++ headers (i.e. identical *.ii files) or 
>> with the corresponding libstdc++ headers?  Those do change a lot every 
>> release as well.
> 
> The tramp3d and kdecore testcases are preprocessed files from a 
> collection of benchmark sources we use, i.e. the same 
> input for all compilers.  I think the {gimple,generic}-match.ii are in the 
> same league.

Yes, except for kdecore.cc I used preprocessed .ii files in all cases. I'm going
to start using kdecore.ii as well.

As Honza pointed out in an email that hasn't reached this mailing list
due to file size, there's a significant change in inline-unit-growth. The param
has changed from 20 to 40 for GCC 9. Using --param inline-unit-growth=20 for all
benchmarks, I see green numbers for GCC 9!
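
For anyone wanting to repeat the comparison, the two command lines and the
relative slowdown could be sketched as below. The helper names, the -O2
default and the file name are illustrative assumptions; only the
--param inline-unit-growth=20 setting comes from this thread.

```python
def compile_cmd(compiler, source, opt="-O2", params=None):
    """Build a compile-only command line; `params` maps --param names to values."""
    cmd = [compiler, opt, "-c", source, "-o", "/dev/null"]
    for name, value in (params or {}).items():
        cmd += ["--param", f"{name}={value}"]
    return cmd


def percent_change(baseline, new):
    """Relative change of `new` against `baseline` in percent (positive = slower)."""
    return (new - baseline) / baseline * 100.0


# Same input, once with the compiler's default and once with the older limit.
default_cmd = compile_cmd("g++", "tramp3d.ii")
limited_cmd = compile_cmd("g++", "tramp3d.ii",
                          params={"inline-unit-growth": 20})
```

Timing both commands with the same compiler and feeding the results to
percent_change() shows how much of the difference the parameter accounts for.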

Martin

> 
> 
> Ciao,
> Michael.
> 


[-- Attachment #2: gcc8-gcc9-comparison.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 17331 bytes --]

* Re: GCC 8 vs. GCC 9 speed and size comparison
  2019-04-16  7:48         ` Martin Liška
@ 2019-04-16  8:17           ` Martin Liška
  2019-04-16  8:53           ` Michael Matz
  1 sibling, 0 replies; 30+ messages in thread
From: Martin Liška @ 2019-04-16  8:17 UTC (permalink / raw)
  To: Michael Matz, Jakub Jelinek
  Cc: GCC Development, Jan Hubicka, Richard Biener, Martin Jambor

[-- Attachment #1: Type: text/plain, Size: 1292 bytes --]

On 4/16/19 9:48 AM, Martin Liška wrote:
> On 4/15/19 5:07 PM, Michael Matz wrote:
>> Hi,
>>
>> On Mon, 15 Apr 2019, Jakub Jelinek wrote:
>>
>>>> It seems the C++ parser got quite a bit slower with gcc 9 :-( Most 
>>>> visible in the compile time for tramp-3d (24%) and kdecore.cc (18% 
>>>> slower with just PGO); it seems that the other .ii files are C-like 
>>>> enough to not
>>>
>>> Is that with the same libstdc++ headers (i.e. identical *.ii files) or 
>>> with the corresponding libstdc++ headers?  Those do change a lot every 
>>> release as well.
>>
>> The tramp3d and kdecore testcases are preprocessed files from a 
>> collection of benchmark sources we use, i.e. the same 
>> input for all compilers.  I think the {gimple,generic}-match.ii are in the 
>> same league.
> 
> Yes, except kdecore.cc I used in all cases .ii pre-processed files. I'm going
> to start using kdecore.ii as well.
> 
> As Honza pointed out in the email that hasn't reached this mailing list
> due to file size, there's a significant change in inline-unit-growth. The param
> has changed from 20 to 40 for GCC 9. Using --param inline-unit-growth=20 for all
> benchmarks, I see green numbres for GCC 9!
> 
> Martin
> 
>>
>>
>> Ciao,
>> Michael.
>>
> 

Sending updated cell conditional formatting.

Martin

[-- Attachment #2: gcc8-gcc9-comparison.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 17471 bytes --]

* Re: GCC 8 vs. GCC 9 speed and size comparison
  2019-04-16  7:48         ` Martin Liška
  2019-04-16  8:17           ` Martin Liška
@ 2019-04-16  8:53           ` Michael Matz
  2019-04-16  9:56             ` Richard Biener
  1 sibling, 1 reply; 30+ messages in thread
From: Michael Matz @ 2019-04-16  8:53 UTC (permalink / raw)
  To: Martin Liška
  Cc: Jakub Jelinek, GCC Development, Jan Hubicka, Richard Biener,
	Martin Jambor

[-- Attachment #1: Type: text/plain, Size: 720 bytes --]

Hello Martin,

On Tue, 16 Apr 2019, Martin Liška wrote:

> Yes, except kdecore.cc I used in all cases .ii pre-processed files. I'm 
> going to start using kdecore.ii as well.

If the kdecore.cc is the one from me, it's also preprocessed and doesn't 
contain any #include directives; I just edited it somewhat to be 
compilable for different architectures.


Ciao,
Michael.

> 
> As Honza pointed out in the email that hasn't reached this mailing list
> due to file size, there's a significant change in inline-unit-growth. The param
> has changed from 20 to 40 for GCC 9. Using --param inline-unit-growth=20 for all
> benchmarks, I see green numbers for GCC 9!
> 
> Martin
> 
> > 
> > 
> > Ciao,
> > Michael.
> > 
> 
> 


* Re: GCC 8 vs. GCC 9 speed and size comparison
  2019-04-16  8:53           ` Michael Matz
@ 2019-04-16  9:56             ` Richard Biener
  2019-04-16 11:25               ` Richard Biener
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Biener @ 2019-04-16  9:56 UTC (permalink / raw)
  To: Michael Matz
  Cc: Martin Liška, Jakub Jelinek, GCC Development, Jan Hubicka,
	Martin Jambor

On Tue, Apr 16, 2019 at 10:53 AM Michael Matz <matz@suse.de> wrote:
>
> Hello Martin,
>
> On Tue, 16 Apr 2019, Martin Liška wrote:
>
> > Yes, except kdecore.cc I used in all cases .ii pre-processed files. I'm
> > going to start using kdecore.ii as well.
>
> If the kdecore.cc is the one from me it's also preprocessed and doesn't
> contain any #include directives, I just edited it somewhat to be
> compilable for different architecture.

Btw, the tramp3d sources on our testers _do_ contain #include directives.

Richard.

>
> Ciao,
> Michael.
>
> >
> > As Honza pointed out in the email that hasn't reached this mailing list
> > due to file size, there's a significant change in inline-unit-growth. The param
> > has changed from 20 to 40 for GCC 9. Using --param inline-unit-growth=20 for all
> > benchmarks, I see green numbers for GCC 9!
> >
> > Martin
> >
> > >
> > >
> > > Ciao,
> > > Michael.
> > >
> >
> >


* Re: GCC 8 vs. GCC 9 speed and size comparison
  2019-04-16  9:56             ` Richard Biener
@ 2019-04-16 11:25               ` Richard Biener
  2019-04-16 11:39                 ` Jakub Jelinek
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Biener @ 2019-04-16 11:25 UTC (permalink / raw)
  To: Michael Matz
  Cc: Martin Liška, Jakub Jelinek, GCC Development, Jan Hubicka,
	Martin Jambor

On Tue, Apr 16, 2019 at 11:56 AM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Tue, Apr 16, 2019 at 10:53 AM Michael Matz <matz@suse.de> wrote:
> >
> > Hello Martin,
> >
> > On Tue, 16 Apr 2019, Martin Liška wrote:
> >
> > > Yes, except kdecore.cc I used in all cases .ii pre-processed files. I'm
> > > going to start using kdecore.ii as well.
> >
> > If the kdecore.cc is the one from me it's also preprocessed and doesn't
> > contain any #include directives, I just edited it somewhat to be
> > compilable for different architecture.
>
> Btw, the tramp3d sources on our testers _do_ contain #include directives.

So for the parser it's small differences that accumulate, for example
a lot more comptype calls via null_ptr_cst_p (via char_type_p) via the new
conversion_null_warnings which is called even without any warning option.

Possible speedup to null_ptr_cst_p is to avoid the expensive char_type_p
(called 50000 times in GCC 9 vs. only 2000 times in GCC 8):

Index: gcc/cp/call.c
===================================================================
--- gcc/cp/call.c       (revision 270387)
+++ gcc/cp/call.c       (working copy)
@@ -541,11 +541,11 @@ null_ptr_cst_p (tree t)
       STRIP_ANY_LOCATION_WRAPPER (t);

       /* Core issue 903 says only literal 0 is a null pointer constant.  */
-      if (TREE_CODE (type) == INTEGER_TYPE
-         && !char_type_p (type)
-         && TREE_CODE (t) == INTEGER_CST
+      if (TREE_CODE (t) == INTEGER_CST
+         && !TREE_OVERFLOW (t)
+         && TREE_CODE (type) == INTEGER_TYPE
          && integer_zerop (t)
-         && !TREE_OVERFLOW (t))
+         && !char_type_p (type))
        return true;
     }
   else if (CP_INTEGRAL_TYPE_P (type))

brings down the number of char_type_p calls to ~5000.  Still null_ptr_cst_p
calls are 150000 vs. 17000, caused by the conversion_null_warnings code
doing

  /* Handle zero as null pointer warnings for cases other
     than EQ_EXPR and NE_EXPR */
  else if (null_ptr_cst_p (expr) &&
           (TYPE_PTR_OR_PTRMEM_P (totype) || NULLPTR_TYPE_P (totype)))
    {

similarly "easy" to short-cut most of them:

@@ -6882,8 +6882,8 @@ conversion_null_warnings (tree totype, t
     }
   /* Handle zero as null pointer warnings for cases other
      than EQ_EXPR and NE_EXPR */
-  else if (null_ptr_cst_p (expr) &&
-          (TYPE_PTR_OR_PTRMEM_P (totype) || NULLPTR_TYPE_P (totype)))
+  else if ((TYPE_PTR_OR_PTRMEM_P (totype) || NULLPTR_TYPE_P (totype))
+          && null_ptr_cst_p (expr))
     {
       location_t loc = get_location_for_expr_unwinding_for_system_header (expr);
       maybe_warn_zero_as_null_pointer_constant (expr, loc);

brings them down to 25000.

All this looks like there's plenty of low-hanging micro-optimization possible in
the C++ frontend.

I'm going to test the above two hunks; the overall savings are of course
small (and the changes are possibly applicable to branches as well).
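
The reordering idea in the two hunks above can be shown outside of GCC with a
minimal C++ sketch. Everything in it (Node, is_constant, passes_costly_check,
the every-tenth-node distribution) is a hypothetical stand-in and not GCC code;
it only demonstrates that, thanks to short-circuit evaluation of &&, putting
the cheap and selective test first cuts the number of costly calls without
changing the result.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a tree node; none of this is GCC code.
struct Node {
    int code;   // pretend 1 means "integer constant"
    int value;
};

// Counts calls to the costly predicate so the two orderings can be compared.
static long expensive_calls = 0;

// Cheap test: a single field comparison.
inline bool is_constant(const Node& n) { return n.code == 1; }

// Costly test (placeholder body); only the call count matters here.
bool passes_costly_check(const Node& n) {
    ++expensive_calls;
    return n.value % 7 != 0;
}

// Same logical condition, two evaluation orders.
bool costly_first(const Node& n) { return passes_costly_check(n) && is_constant(n); }
bool cheap_first(const Node& n)  { return is_constant(n) && passes_costly_check(n); }

// Builds `count` nodes of which every tenth is a "constant".
std::vector<Node> make_nodes(int count) {
    assert(count >= 0);
    std::vector<Node> nodes;
    nodes.reserve(count);
    for (int i = 0; i < count; ++i)
        nodes.push_back(Node{i % 10 == 0 ? 1 : 0, i});
    return nodes;
}

// Applies one ordering to all nodes; returns how many costly calls were made
// and stores the number of nodes that satisfied the condition in `matches`.
long costly_calls_for(const std::vector<Node>& nodes, bool use_cheap_first, int& matches) {
    expensive_calls = 0;
    matches = 0;
    for (const Node& n : nodes)
        if (use_cheap_first ? cheap_first(n) : costly_first(n))
            ++matches;
    return expensive_calls;
}
```

With make_nodes(1000), the costly-first ordering calls the costly predicate for
every node, while the cheap-first ordering calls it only for the nodes that pass
the cheap test; both orderings accept the same set of nodes.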

Richard.


> Richard.
>
> >
> > Ciao,
> > Michael.
> >
> > >
> > > As Honza pointed out in the email that hasn't reached this mailing list
> > > due to file size, there's a significant change in inline-unit-growth. The param
> > > has changed from 20 to 40 for GCC 9. Using --param inline-unit-growth=20 for all
> > > benchmarks, I see green numbers for GCC 9!
> > >
> > > Martin
> > >
> > > >
> > > >
> > > > Ciao,
> > > > Michael.
> > > >
> > >
> > >


* Re: GCC 8 vs. GCC 9 speed and size comparison
  2019-04-16 11:25               ` Richard Biener
@ 2019-04-16 11:39                 ` Jakub Jelinek
  2019-04-16 11:54                   ` Richard Biener
  0 siblings, 1 reply; 30+ messages in thread
From: Jakub Jelinek @ 2019-04-16 11:39 UTC (permalink / raw)
  To: Richard Biener
  Cc: Michael Matz, Martin Liška, GCC Development, Jan Hubicka,
	Martin Jambor

On Tue, Apr 16, 2019 at 01:25:38PM +0200, Richard Biener wrote:
> So for the parser it's small differences that accumulate, for example
> a lot more comptype calls via null_ptr_cst_p (via char_type_p) via the new
> conversion_null_warnings which is called even without any warning option.
> 
> Possible speedup to null_ptr_cst_p is to avoid the expensive char_type_p
> (called 50000 times in GCC 9 vs. only 2000 times in GCC 8):

If we do this (looks like a good idea to me), perhaps we should also do the
following: the first part just does what you've done in yet another spot,
moving the less expensive checks first, because null_node_p strips location
wrappers etc.; the second avoids calling conversion_null_warnings at all
if we don't want to warn (though, admittedly, while
warn_zero_as_null_pointer_constant defaults to 0, warn_conversion_null
defaults to 1).

--- gcc/cp/call.c	2019-04-12 21:47:06.301924378 +0200
+++ gcc/cp/call.c	2019-04-16 13:35:59.779977641 +0200
@@ -6844,8 +6844,9 @@ static void
 conversion_null_warnings (tree totype, tree expr, tree fn, int argnum)
 {
   /* Issue warnings about peculiar, but valid, uses of NULL.  */
-  if (null_node_p (expr) && TREE_CODE (totype) != BOOLEAN_TYPE
-      && ARITHMETIC_TYPE_P (totype))
+  if (TREE_CODE (totype) != BOOLEAN_TYPE
+      && ARITHMETIC_TYPE_P (totype)
+      && null_node_p (expr))
     {
       location_t loc = get_location_for_expr_unwinding_for_system_header (expr);
       if (fn)
@@ -7059,7 +7060,9 @@ convert_like_real (conversion *convs, tr
       return cp_convert (totype, expr, complain);
     }
 
-  if (issue_conversion_warnings && (complain & tf_warning))
+  if (issue_conversion_warnings
+      && (complain & tf_warning)
+      && (warn_conversion_null || warn_zero_as_null_pointer_constant))
     conversion_null_warnings (totype, expr, fn, argnum);
 
   switch (convs->kind)


	Jakub


* Re: GCC 8 vs. GCC 9 speed and size comparison
  2019-04-16 11:39                 ` Jakub Jelinek
@ 2019-04-16 11:54                   ` Richard Biener
  0 siblings, 0 replies; 30+ messages in thread
From: Richard Biener @ 2019-04-16 11:54 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Michael Matz, Martin Liška, GCC Development, Jan Hubicka,
	Martin Jambor

On Tue, Apr 16, 2019 at 1:39 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Tue, Apr 16, 2019 at 01:25:38PM +0200, Richard Biener wrote:
> > So for the parser it's small differences that accumulate, for example
> > a lot more comptype calls via null_ptr_cst_p (via char_type_p) via the new
> > conversion_null_warnings which is called even without any warning option.
> >
> > Possible speedup to null_ptr_cst_p is to avoid the expensive char_type_p
> > (called 50000 times in GCC 9 vs. only 2000 times in GCC 8):
>
> If we do this (looks like a good idea to me), perhaps we should do also
> following (first part just doing what you've done in yet another spot,
> moving the less expensive checks first, because null_node_p strips location
> wrappers etc.) and the second not to call conversion_null_warnings at all
> if we don't want to warn (though, admittedly while
> warn_zero_as_null_pointer_constant defaults to 0, warn_conversion_null
> defaults to 1).
>
> --- gcc/cp/call.c       2019-04-12 21:47:06.301924378 +0200
> +++ gcc/cp/call.c       2019-04-16 13:35:59.779977641 +0200
> @@ -6844,8 +6844,9 @@ static void
>  conversion_null_warnings (tree totype, tree expr, tree fn, int argnum)
>  {
>    /* Issue warnings about peculiar, but valid, uses of NULL.  */
> -  if (null_node_p (expr) && TREE_CODE (totype) != BOOLEAN_TYPE
> -      && ARITHMETIC_TYPE_P (totype))
> +  if (TREE_CODE (totype) != BOOLEAN_TYPE
> +      && ARITHMETIC_TYPE_P (totype)
> +      && null_node_p (expr))
>      {
>        location_t loc = get_location_for_expr_unwinding_for_system_header (expr);
>        if (fn)
> @@ -7059,7 +7060,9 @@ convert_like_real (conversion *convs, tr
>        return cp_convert (totype, expr, complain);
>      }
>
> -  if (issue_conversion_warnings && (complain & tf_warning))
> +  if (issue_conversion_warnings
> +      && (complain & tf_warning)
> +      && (warn_conversion_null || warn_zero_as_null_pointer_constant))
>      conversion_null_warnings (totype, expr, fn, argnum);
>
>    switch (convs->kind)

Yes, that looks good to me as well.

Btw, I noticed the C++ FE calls build_qualified_type a _lot_, in 99% of cases
picking up an existing variant from the list, and those list walks visit ~20 types
_on average_!  A simple LRU cache (just put the found variant first) manages
to improve compile-time to be even better than GCC 8 (~1% improvement).
It reduces the number of types checked to ~2.5 (from those 20).  It also improves
-fsyntax-only compile-time from 2.9s to 2.75s (consistently).

Index: gcc/tree.c
===================================================================
--- gcc/tree.c  (revision 270387)
+++ gcc/tree.c  (working copy)
@@ -6459,9 +6459,22 @@ get_qualified_type (tree type, int type_
   /* Search the chain of variants to see if there is already one there just
      like the one we need to have.  If so, use that existing one.  We must
      preserve the TYPE_NAME, since there is code that depends on this.  */
-  for (t = TYPE_MAIN_VARIANT (type); t; t = TYPE_NEXT_VARIANT (t))
-    if (check_qualified_type (t, type, type_quals))
-      return t;
+  for (tree *t = &TYPE_MAIN_VARIANT (type); *t; t = &TYPE_NEXT_VARIANT (*t))
+    {
+      if (check_qualified_type (*t, type, type_quals))
+       {
+         tree mv = TYPE_MAIN_VARIANT (type);
+         tree x = *t;
+         if (x != mv)
+           {
+             /* LRU.  */
+             *t = TYPE_NEXT_VARIANT (*t);
+             TYPE_NEXT_VARIANT (x) = TYPE_NEXT_VARIANT (mv);
+             TYPE_NEXT_VARIANT (mv) = x;
+           }
+         return x;
+       }
+    }

   return NULL_TREE;
 }

Peeling the main-variant case above might make the code a little bit prettier.
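
For readers who want to try the move-to-front idea in isolation, here is a
minimal, self-contained C++ sketch. It is not the GCC implementation: Variant,
quals and find_qualified are hypothetical names, a plain integer comparison
stands in for check_qualified_type, and the `steps` out-parameter exists only
so the effect can be observed. As in the patch, the head of the chain (the main
variant) stays in place and a hit is relinked directly behind it.

```cpp
#include <cassert>

// Hypothetical node standing in for one type variant; not GCC's tree.
struct Variant {
    int quals;       // stand-in for the qualifier set being matched
    Variant* next;   // next variant in the singly linked chain
};

// Searches the chain starting at `head` for a node with matching `quals`.
// On a hit that is neither the head nor already right behind it, the node is
// unlinked and relinked directly after the head (move-to-front, head pinned).
// If `steps` is non-null it receives the number of nodes inspected.
Variant* find_qualified(Variant* head, int quals, int* steps = nullptr) {
    assert(head != nullptr);
    int inspected = 0;
    Variant* prev = nullptr;
    for (Variant* cur = head; cur; prev = cur, cur = cur->next) {
        ++inspected;
        if (cur->quals != quals)
            continue;
        if (cur != head && prev != head) {
            prev->next = cur->next;   // unlink the found node
            cur->next = head->next;   // splice it in right behind the head
            head->next = cur;
        }
        if (steps)
            *steps = inspected;
        return cur;
    }
    if (steps)
        *steps = inspected;
    return nullptr;
}
```

In a chain of four nodes where the wanted variant is last, the first lookup
inspects all four nodes and the repeated lookup inspects only two, which is the
kind of reduction in list walking described above.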

Richard.

>
>         Jakub


* Re: How big (and fast) is going to be GCC 8?
       [not found] <5ec1f1c1-e0f6-5681-b6c6-cf8b076bc02a@suse.cz>
  2018-03-06 10:32 ` How big (and fast) is going to be GCC 8? Richard Biener
@ 2018-03-12 15:05 ` Jan Hubicka
  1 sibling, 0 replies; 30+ messages in thread
From: Jan Hubicka @ 2018-03-12 15:05 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Development, Richard Biener, Michael Matz, tom

Hello,
I have also redone most of my firefox testing, similar to the tests I published at
http://hubicka.blogspot.cz/2014/04/linktime-optimization-in-gcc-2-firefox.html
(thanks to Martin Liska who got LTO builds to work again)

I am attaching statistics on binary sizes.  Interestingly, for firefox LTO is quite a
good size optimization (16% on text), and similarly FDO reduces text size and combines well
with LTO, which is a bit different from Martin's gcc stats. I have looked into this very
briefly and one issue seems to be with the way we determine the hot/cold threshold.

binary size             text            relocations     data            EH              rest
gcc6 -O3                90448658        12887358        13720073        13035704        257839
gcc6 -O3 -flto          75810786        12145211        12390185        8422776         240002
gcc6 -O3 + FDO          67087824        13008294        13655305        13719944        259585
gcc6 -O3 -flto + FDO    60206898        12169803        12334113        9083088         240050
gcc7 -O3                93233440        12928831        13780313        13578224        257408
gcc7 -O3 -flto          76764274        12128031        12405369        8420448         240662
gcc7 -O3 + FDO          67500688        12994279        13650185        13661760        263400
gcc7 -O3 -flto + FDO    59776994        12151360        12325217        8971344         239501
gcc8 -O2                80311120        12939568        13763033        12948752        258711
gcc8 -O2 -flto          69156752        12109236        12475801        8501152         240163
gcc8 -O3                89913648        12924468        13790393        13374328        256867
gcc8 -O3 -flto          75971122        12138528        12426649        8593024         239861
gcc8 -O3 + FDO          67047632        12996890        13707017        13146232        263413
gcc8 -O3 -flto + FDO    58951410        12146008        12377161        8634152         241765
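
The percentages quoted for the table can be recomputed with a small helper. The
following C++ sketch is only an illustration; reduction_percent is a
hypothetical name and the inputs are the text-size values from the rows above.

```cpp
#include <cassert>

// Relative size reduction of `optimized` versus `baseline`, in percent.
// A positive result means the optimized build is smaller.
double reduction_percent(long long baseline, long long optimized) {
    assert(baseline > 0);
    return 100.0 * static_cast<double>(baseline - optimized)
                 / static_cast<double>(baseline);
}
```

For example, reduction_percent(90448658, 75810786) for gcc6 -O3 versus
gcc6 -O3 -flto gives roughly 16.2, matching the "16% on text" figure.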

I also did some builds with clang. The observation is that clang's -O3 binary is
smaller than ours, while our LTO/FDO builds are smaller than clang's (the LTO+FDO
build quite substantially).
Our EH is bigger than clang's, which is probably something to look into.  One problem I am
aware of is that our nothrow pass is not type sensitive and thus won't figure out if
a program throws an exception of a specific type and catches it later.

clang6 -O3              84754848        13032018        13597433        10791528        371429
clang6 -O3 -flto        90757024        12273574        12258521        6841424         350585
clang6 -O3 -flto=thin   92940576        12376724        12479233        7974856         353171
clang6 -O3 + FDO        81776880        13136428        13574489        11501344        385123
clang6 -O3 -flto=thin+FDO 88374432      12405075        12434297        9574416         356508
clang6 -O3 -flto + FDO  90637168        12288433        12244265        9023304         349078

I also did some benchmarking and found at least one issue: -flto -O3 hits
--param inline-unit-growth a bit too early, so we do not get much benefit (while
clang does, but it also does not reduce binary size). -O3 -flto + FDO or -O2
-flto seems to work well. Will summarize the results later.

Firefox developer Tom Ritter has tested LTO with and without FDO here (it is
a rather nice interface - I like that one can click on the graph and see the
results in the context of other tests done recently).  This is done with gcc6.

Tracking bug:
https://bugzilla.mozilla.org/show_bug.cgi?format=default&id=521435

non-FDO build:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=12ce14a5bcac9975b41a1f901bfc3a8dcb2d791b&framework=1&showOnlyImportant=1&selectedTimeRange=172800

FDO build:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=7e5bd52e36fcc1703ced01fe87e831a716677295&framework=1&showOnlyImportant=1&selectedTimeRange=172800

Honza


* Re: How big (and fast) is going to be GCC 8?
  2018-03-06 10:32 ` How big (and fast) is going to be GCC 8? Richard Biener
@ 2018-03-06 14:59   ` Jan Hubicka
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Hubicka @ 2018-03-06 14:59 UTC (permalink / raw)
  To: Richard Biener; +Cc: Martin Liška, GCC Development, Michael Matz

> On Tue, Mar 6, 2018 at 11:12 AM, Martin Liška <mliska@suse.cz> wrote:
> > Hello.
> >
> > Many significant changes has landed in mainline and will be released as GCC 8.1.
> > I decided to use various GCC configs we have and test how there configuration differ
> > in size and also binary size.
> >
> > This is first part where I measured binary size, speed comparison will follow.
> > Configuration names should be self-explaining, the 'system-*' is built done
> > without bootstrap with my system compiler (GCC 7.3.0). All builds are done
> > on my Intel Haswell machine.
> 
> So from the numbers I see that bootstrap causes an 8% bigger binary compared
> to non-bootstrap using GCC 7.3 at -O2 when including debug info and 1.2%
> larger stripped.  That means trunk generates larger code.

It is a bit odd indeed because size stats from specs seem to imply otherwise.
It would be nice to work that out.  Also I am surprised that LTO increases text
size even for non-plugin build. It should not happen.
These issues are generally hard to debug though.  I will try to take a look.

I will send similar stats for my firefox experiments. If you have scripts to collect
them, they would be welcome.

Thanks for looking into this!
Honza
> 
> What is missing is a speed comparison of the various binaries -- you could
> try measuring this by doing a make all-gcc for a non-bootstrap config
> (so it uses -O2 -g and doesn't build target libs with the built compiler).
> 
> Richard.
> 
> > Feel free to reply if you need any explanation.
> > Martin


* Re: How big (and fast) is going to be GCC 8?
       [not found] <5ec1f1c1-e0f6-5681-b6c6-cf8b076bc02a@suse.cz>
@ 2018-03-06 10:32 ` Richard Biener
  2018-03-06 14:59   ` Jan Hubicka
  2018-03-12 15:05 ` Jan Hubicka
  1 sibling, 1 reply; 30+ messages in thread
From: Richard Biener @ 2018-03-06 10:32 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Development, Jan Hubicka, Michael Matz

On Tue, Mar 6, 2018 at 11:12 AM, Martin Liška <mliska@suse.cz> wrote:
> Hello.
>
> Many significant changes has landed in mainline and will be released as GCC 8.1.
> I decided to use various GCC configs we have and test how there configuration differ
> in size and also binary size.
>
> This is first part where I measured binary size, speed comparison will follow.
> Configuration names should be self-explaining, the 'system-*' is built done
> without bootstrap with my system compiler (GCC 7.3.0). All builds are done
> on my Intel Haswell machine.

So from the numbers I see that bootstrap causes an 8% bigger binary compared
to non-bootstrap using GCC 7.3 at -O2 when including debug info and 1.2%
larger stripped.  That means trunk generates larger code.

What is missing is a speed comparison of the various binaries -- you could
try measuring this by doing a make all-gcc for a non-bootstrap config
(so it uses -O2 -g and doesn't build target libs with the built compiler).

Richard.

> Feel free to reply if you need any explanation.
> Martin


Thread overview: 30+ messages
2018-03-06 10:14 How big (and fast) is going to be GCC 8? Martin Liška
2018-03-06 15:13 ` David Malcolm
2018-03-06 16:18   ` Martin Liška
2018-03-06 18:35     ` Martin Liška
2018-03-06 17:50 ` How big (and fast) is going to be GCC 8? [part 2] Martin Liška
2018-03-06 18:16   ` Bin.Cheng
2018-03-06 18:36     ` Martin Liška
2018-03-07  9:26 ` Size and speed comparison of GCC 7 & 8 Martin Liška
2018-03-07 10:13   ` Martin Liška
2018-03-07 11:12     ` Martin Liška
2018-03-13 13:31     ` Martin Liška
2018-03-20 19:57 ` How can compiler speed-up postgresql database? Martin Liška
2018-03-21  9:26   ` Richard Biener
2018-03-21  9:34     ` Martin Liška
2018-03-21 11:47       ` Jan Hubicka
2019-04-15 11:44 ` GCC 8 vs. GCC 9 speed and size comparison Martin Liška
2019-04-15 12:12   ` Michael Matz
2019-04-15 13:20     ` Jan Hubicka
2019-04-15 13:33     ` Jakub Jelinek
2019-04-15 15:08       ` Michael Matz
2019-04-16  7:48         ` Martin Liška
2019-04-16  8:17           ` Martin Liška
2019-04-16  8:53           ` Michael Matz
2019-04-16  9:56             ` Richard Biener
2019-04-16 11:25               ` Richard Biener
2019-04-16 11:39                 ` Jakub Jelinek
2019-04-16 11:54                   ` Richard Biener
     [not found] <5ec1f1c1-e0f6-5681-b6c6-cf8b076bc02a@suse.cz>
2018-03-06 10:32 ` How big (and fast) is going to be GCC 8? Richard Biener
2018-03-06 14:59   ` Jan Hubicka
2018-03-12 15:05 ` Jan Hubicka
