* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
@ 2012-12-13 13:29 ` rguenth at gcc dot gnu.org
2012-12-13 14:45 ` tejohnson at google dot com
` (21 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-12-13 13:29 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |hubicka at gcc dot gnu.org,
| |tejohnson at google dot com
Target Milestone|--- |4.8.0
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
2012-12-13 13:29 ` [Bug gcov-profile/55674] " rguenth at gcc dot gnu.org
@ 2012-12-13 14:45 ` tejohnson at google dot com
2012-12-13 14:46 ` markus at trippelsdorf dot de
` (20 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: tejohnson at google dot com @ 2012-12-13 14:45 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #1 from Teresa Johnson <tejohnson at google dot com> 2012-12-13 14:45:01 UTC ---
I'm really surprised that using --param hot-bb-count-ws-permille=950 didn't
help, since even fewer things should look hot enough to inline than before the
revision.
Would you send me the new -fdump-ipa-inline output?
Thanks,
Teresa
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
2012-12-13 13:29 ` [Bug gcov-profile/55674] " rguenth at gcc dot gnu.org
2012-12-13 14:45 ` tejohnson at google dot com
@ 2012-12-13 14:46 ` markus at trippelsdorf dot de
2012-12-13 14:49 ` tejohnson at google dot com
` (19 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: markus at trippelsdorf dot de @ 2012-12-13 14:46 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #2 from Markus Trippelsdorf <markus at trippelsdorf dot de> 2012-12-13 14:46:31 UTC ---
In the tramp3d-v4 case, when I run with this simple debug patch:
diff --git a/gcc/predict.c b/gcc/predict.c
index 5d3de29..bf3a259 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -147,6 +147,7 @@ maybe_hot_count_p (struct function *fun, gcov_type count)
gcc_assert (ws);
min_count = ws->min_counter;
}
+ fprintf (stderr, "count=%i min_count=%i profile_info->sum_max=%i\n", count,
min_count, profile_info->sum_max);
return (count >= min_count);
}
it shows that in the middle of the output, min_count changes from
8585 to 0:
... ~600000 similar lines with min_count=8585
count=3 min_count=8585 profile_info->sum_max=257406300
count=3 min_count=8585 profile_info->sum_max=257406300
count=3 min_count=8585 profile_info->sum_max=257406300
count=3 min_count=8585 profile_info->sum_max=257406300
count=40 min_count=0 profile_info->sum_max=257406300
count=61 min_count=0 profile_info->sum_max=257406300
count=40 min_count=0 profile_info->sum_max=257406300
count=40 min_count=0 profile_info->sum_max=257406300
... ~600000 similar lines with min_count=0
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (2 preceding siblings ...)
2012-12-13 14:46 ` markus at trippelsdorf dot de
@ 2012-12-13 14:49 ` tejohnson at google dot com
2012-12-13 14:53 ` markus at trippelsdorf dot de
` (18 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: tejohnson at google dot com @ 2012-12-13 14:49 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #3 from Teresa Johnson <tejohnson at google dot com> 2012-12-13 14:49:19 UTC ---
Hi Markus,
Are you sure you have my subsequent fixes patched in, to make sure the
histogram is getting streamed through the LTO files? This was the
behavior I saw when I was debugging the original issue that I fixed
with those patches. Basically, the hotness checks before we went into
LTO were good, and after LTO the min count was 0.
Teresa
On Thu, Dec 13, 2012 at 6:46 AM, markus at trippelsdorf dot de
<gcc-bugzilla@gcc.gnu.org> wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
>
> --- Comment #2 from Markus Trippelsdorf <markus at trippelsdorf dot de> 2012-12-13 14:46:31 UTC ---
> In the tramp3d-v4 case, when I run with this simple debug patch:
>
> diff --git a/gcc/predict.c b/gcc/predict.c
> index 5d3de29..bf3a259 100644
> --- a/gcc/predict.c
> +++ b/gcc/predict.c
> @@ -147,6 +147,7 @@ maybe_hot_count_p (struct function *fun, gcov_type count)
> gcc_assert (ws);
> min_count = ws->min_counter;
> }
> + fprintf (stderr, "count=%i min_count=%i profile_info->sum_max=%i\n", count,
> min_count, profile_info->sum_max);
> return (count >= min_count);
> }
>
> it shows that in the middle of the output, min_count changes from
> 8585 to 0:
>
> ... ~600000 similar lines with min_count=8585
> count=3 min_count=8585 profile_info->sum_max=257406300
> count=3 min_count=8585 profile_info->sum_max=257406300
> count=3 min_count=8585 profile_info->sum_max=257406300
> count=3 min_count=8585 profile_info->sum_max=257406300
> count=40 min_count=0 profile_info->sum_max=257406300
> count=61 min_count=0 profile_info->sum_max=257406300
> count=40 min_count=0 profile_info->sum_max=257406300
> count=40 min_count=0 profile_info->sum_max=257406300
> ... ~600000 similar lines with min_count=0
>
> --
> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (3 preceding siblings ...)
2012-12-13 14:49 ` tejohnson at google dot com
@ 2012-12-13 14:53 ` markus at trippelsdorf dot de
2012-12-13 15:03 ` tejohnson at google dot com
` (17 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: markus at trippelsdorf dot de @ 2012-12-13 14:53 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #4 from Markus Trippelsdorf <markus at trippelsdorf dot de> 2012-12-13 14:52:37 UTC ---
(In reply to comment #3)
> Hi Markus,
>
> Are you sure you have my subsequent fixes patched in, to make sure the
> histogram is getting streamed through the LTO files? This was the
> behavior I saw when I was debugging the original issue that I fixed
> with those patches. Basically, the hotness checks before we went into
> LTO were good, and after LTO the min count was 0.
>
Hi Teresa,
yes my gcc is up-to-date (from todays git).
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (4 preceding siblings ...)
2012-12-13 14:53 ` markus at trippelsdorf dot de
@ 2012-12-13 15:03 ` tejohnson at google dot com
2012-12-13 15:06 ` markus at trippelsdorf dot de
` (16 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: tejohnson at google dot com @ 2012-12-13 15:03 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #5 from Teresa Johnson <tejohnson at google dot com> 2012-12-13 15:02:55 UTC ---
Ok, I will download tramp3d-v4 right now and see what is going on. Can
you send me the full set of options you are using to compile it?
Teresa
On Thu, Dec 13, 2012 at 6:52 AM, markus at trippelsdorf dot de
<gcc-bugzilla@gcc.gnu.org> wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
>
> --- Comment #4 from Markus Trippelsdorf <markus at trippelsdorf dot de> 2012-12-13 14:52:37 UTC ---
> (In reply to comment #3)
>> Hi Markus,
>>
>> Are you sure you have my subsequent fixes patched in, to make sure the
>> histogram is getting streamed through the LTO files? This was the
>> behavior I saw when I was debugging the original issue that I fixed
>> with those patches. Basically, the hotness checks before we went into
>> LTO were good, and after LTO the min count was 0.
>>
> Hi Teresa,
>
> yes my gcc is up-to-date (from todays git).
>
> --
> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (5 preceding siblings ...)
2012-12-13 15:03 ` tejohnson at google dot com
@ 2012-12-13 15:06 ` markus at trippelsdorf dot de
2012-12-13 15:50 ` tejohnson at google dot com
` (15 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: markus at trippelsdorf dot de @ 2012-12-13 15:06 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #6 from Markus Trippelsdorf <markus at trippelsdorf dot de> 2012-12-13 15:06:12 UTC ---
(In reply to comment #5)
> Ok, I will download tramp3d-v4 right now and see what is going on. Can
> you send me the full set of options you are using to compile it?
>
g++ -w -O3 -fprofile-generate -march=native tramp3d-v4.cpp
./a.out --cartvis 1.0 0.0 --rhomin 1e-8 -n 20
g++ -w -O3 -fprofile-use -flto=4 -march=native tramp3d-v4.cpp
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (6 preceding siblings ...)
2012-12-13 15:06 ` markus at trippelsdorf dot de
@ 2012-12-13 15:50 ` tejohnson at google dot com
2012-12-13 18:23 ` tejohnson at google dot com
` (14 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: tejohnson at google dot com @ 2012-12-13 15:50 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #7 from Teresa Johnson <tejohnson at google dot com> 2012-12-13 15:50:05 UTC ---
Reproduced. Looks like somehow my fix to stream this through LTO is
not working properly. I see that the min count is valid when
generating the .o file, but goes to zero when we do the link(lto)
phase. Hopefully this means that the heuristic itself is not broken.
=) I'll see if I can root cause this and have a fix today.
Teresa
On Thu, Dec 13, 2012 at 7:06 AM, markus at trippelsdorf dot de
<gcc-bugzilla@gcc.gnu.org> wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
>
> --- Comment #6 from Markus Trippelsdorf <markus at trippelsdorf dot de> 2012-12-13 15:06:12 UTC ---
> (In reply to comment #5)
>> Ok, I will download tramp3d-v4 right now and see what is going on. Can
>> you send me the full set of options you are using to compile it?
>>
>
> g++ -w -O3 -fprofile-generate -march=native tramp3d-v4.cpp
> ./a.out --cartvis 1.0 0.0 --rhomin 1e-8 -n 20
> g++ -w -O3 -fprofile-use -flto=4 -march=native tramp3d-v4.cpp
>
> --
> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (7 preceding siblings ...)
2012-12-13 15:50 ` tejohnson at google dot com
@ 2012-12-13 18:23 ` tejohnson at google dot com
2012-12-13 19:11 ` markus at trippelsdorf dot de
` (13 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: tejohnson at google dot com @ 2012-12-13 18:23 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #8 from Teresa Johnson <tejohnson at google dot com> 2012-12-13 18:23:08 UTC ---
Dumb mistake in my previous fix to the lto support. Here is the patch
that fixes it, I will submit for review after regression testing
completes:
Index: lto-cgraph.c
===================================================================
--- lto-cgraph.c (revision 194404)
+++ lto-cgraph.c (working copy)
@@ -1368,7 +1368,9 @@
so we need to account for a non-zero histogram entry at new_ix. */
unsigned new_ix = gcov_histo_index (scaled_min);
lto_gcov_summary.histogram[new_ix].min_value
- = MIN (lto_gcov_summary.histogram[new_ix].min_value, scaled_min);
+ = (lto_gcov_summary.histogram[new_ix].num_counters
+ ? MIN (lto_gcov_summary.histogram[new_ix].min_value, scaled_min)
+ : scaled_min);
/* Some of the scaled counter values would ostensibly need to be placed
into different (larger) histogram buckets, but we keep things simple
here and place the scaled cumulative counter value in the bucket
Please let me know how this affects the mozilla size.
Thanks,
Teresa
On Thu, Dec 13, 2012 at 7:49 AM, Teresa Johnson <tejohnson@google.com> wrote:
> Reproduced. Looks like somehow my fix to stream this through LTO is
> not working properly. I see that the min count is valid when
> generating the .o file, but goes to zero when we do the link(lto)
> phase. Hopefully this means that the heuristic itself is not broken.
> =) I'll see if I can root cause this and have a fix today.
>
> Teresa
>
> On Thu, Dec 13, 2012 at 7:06 AM, markus at trippelsdorf dot de
> <gcc-bugzilla@gcc.gnu.org> wrote:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
>>
>> --- Comment #6 from Markus Trippelsdorf <markus at trippelsdorf dot de> 2012-12-13 15:06:12 UTC ---
>> (In reply to comment #5)
>>> Ok, I will download tramp3d-v4 right now and see what is going on. Can
>>> you send me the full set of options you are using to compile it?
>>>
>>
>> g++ -w -O3 -fprofile-generate -march=native tramp3d-v4.cpp
>> ./a.out --cartvis 1.0 0.0 --rhomin 1e-8 -n 20
>> g++ -w -O3 -fprofile-use -flto=4 -march=native tramp3d-v4.cpp
>>
>> --
>> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
>> ------- You are receiving this mail because: -------
>> You are on the CC list for the bug.
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (8 preceding siblings ...)
2012-12-13 18:23 ` tejohnson at google dot com
@ 2012-12-13 19:11 ` markus at trippelsdorf dot de
2012-12-13 22:14 ` markus at trippelsdorf dot de
` (12 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: markus at trippelsdorf dot de @ 2012-12-13 19:11 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #9 from Markus Trippelsdorf <markus at trippelsdorf dot de> 2012-12-13 19:10:40 UTC ---
*** Bug 55669 has been marked as a duplicate of this bug. ***
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (9 preceding siblings ...)
2012-12-13 19:11 ` markus at trippelsdorf dot de
@ 2012-12-13 22:14 ` markus at trippelsdorf dot de
2012-12-13 22:16 ` tejohnson at google dot com
` (11 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: markus at trippelsdorf dot de @ 2012-12-13 22:14 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #10 from Markus Trippelsdorf <markus at trippelsdorf dot de> 2012-12-13 22:14:01 UTC ---
(In reply to comment #8)
> Please let me know how this affects the mozilla size.
Looks much better now:
39748288 hot-bb-count-ws-permille=999 (default)
34573408 hot-bb-count-ws-permille=890
34072808 without lto/pgo
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (10 preceding siblings ...)
2012-12-13 22:14 ` markus at trippelsdorf dot de
@ 2012-12-13 22:16 ` tejohnson at google dot com
2012-12-13 22:35 ` markus at trippelsdorf dot de
` (10 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: tejohnson at google dot com @ 2012-12-13 22:16 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #11 from Teresa Johnson <tejohnson at google dot com> 2012-12-13 22:16:19 UTC ---
Do you happen to know what it was with lto/pgo before the change? Should be
roughly equivalent to hot-bb-count-ws-permille=970 from what I saw in your
profiles. What size increase is acceptable?
Thanks,
Teresa
On Thu, Dec 13, 2012 at 2:14 PM, markus at trippelsdorf dot de <
gcc-bugzilla@gcc.gnu.org> wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
>
> --- Comment #10 from Markus Trippelsdorf <markus at trippelsdorf dot de>
> 2012-12-13 22:14:01 UTC ---
> (In reply to comment #8)
> > Please let me know how this affects the mozilla size.
>
> Looks much better now:
> 39748288 hot-bb-count-ws-permille=999 (default)
> 34573408 hot-bb-count-ws-permille=890
> 34072808 without lto/pgo
>
> --
> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (11 preceding siblings ...)
2012-12-13 22:16 ` tejohnson at google dot com
@ 2012-12-13 22:35 ` markus at trippelsdorf dot de
2012-12-14 15:11 ` tejohnson at gcc dot gnu.org
` (9 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: markus at trippelsdorf dot de @ 2012-12-13 22:35 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #12 from Markus Trippelsdorf <markus at trippelsdorf dot de> 2012-12-13 22:35:33 UTC ---
(In reply to comment #11)
> Do you happen to know what it was with lto/pgo before the change? Should be
> roughly equivalent to hot-bb-count-ws-permille=970 from what I saw in your
> profiles
Yes, it was ~34MB before the change.
> What size increase is acceptable?
It's hard to say in case of Firefox, because the only thing
that one can reliably measure is the JavaScript performance.
And this varies only very slightly with different compiler options.
So you have no way to measure up to which point more inlining
is still beneficial.
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (12 preceding siblings ...)
2012-12-13 22:35 ` markus at trippelsdorf dot de
@ 2012-12-14 15:11 ` tejohnson at gcc dot gnu.org
2012-12-14 16:31 ` markus at trippelsdorf dot de
` (8 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: tejohnson at gcc dot gnu.org @ 2012-12-14 15:11 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #13 from tejohnson at gcc dot gnu.org 2012-12-14 15:11:00 UTC ---
Author: tejohnson
Date: Fri Dec 14 15:10:45 2012
New Revision: 194502
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=194502
Log:
2012-12-14 Teresa Johnson <tejohnson@google.com>
PR gcov-profile/55674
* lto-cgraph.c (merge_profile_summaries): Set min correctly the
first time we merge into a histogram entry.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/lto-cgraph.c
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (13 preceding siblings ...)
2012-12-14 15:11 ` tejohnson at gcc dot gnu.org
@ 2012-12-14 16:31 ` markus at trippelsdorf dot de
2012-12-18 15:41 ` hubicka at ucw dot cz
` (7 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: markus at trippelsdorf dot de @ 2012-12-14 16:31 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
Markus Trippelsdorf <markus at trippelsdorf dot de> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution| |FIXED
--- Comment #14 from Markus Trippelsdorf <markus at trippelsdorf dot de> 2012-12-14 16:31:21 UTC ---
fixed
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (14 preceding siblings ...)
2012-12-14 16:31 ` markus at trippelsdorf dot de
@ 2012-12-18 15:41 ` hubicka at ucw dot cz
2012-12-18 17:04 ` markus at trippelsdorf dot de
` (6 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: hubicka at ucw dot cz @ 2012-12-18 15:41 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #15 from Jan Hubicka <hubicka at ucw dot cz> 2012-12-18 15:40:34 UTC ---
> It's hard to say in case of Firefox, because the only thing
> that one can reliably measure is the JavaScript performance.
> And this varies only very slightly with different compiler options.
One way to get better differences is to disable the JIT :)
Doesn't snappy allow us to measure things like startup time and
other stuff where not much of JITed code is involved?
> So you have no way to measure up to which point more inlining
> is still beneficial.
I will do some testing on SPEC on when the performance starts dropping.
Setting limit to 890 seems bit dangerous. -Os can slow down the code
considerably, so if 10% of time program will be running -Os code, we probably
get measurable slowdowns overall.
Thanks!
Honza
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (15 preceding siblings ...)
2012-12-18 15:41 ` hubicka at ucw dot cz
@ 2012-12-18 17:04 ` markus at trippelsdorf dot de
2012-12-18 17:25 ` hubicka at ucw dot cz
` (5 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: markus at trippelsdorf dot de @ 2012-12-18 17:04 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #16 from Markus Trippelsdorf <markus at trippelsdorf dot de> 2012-12-18 17:03:52 UTC ---
I did some measurements with tramp3d and in this case
the default (999) gives the best performance:
par. size time
--------------------
999 955859 3.71752
990 933390 3.73969
980 904718 3.84547
... " "
750 904718 3.84769
740 837654 7.67177
600 836024 8.80879
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (16 preceding siblings ...)
2012-12-18 17:04 ` markus at trippelsdorf dot de
@ 2012-12-18 17:25 ` hubicka at ucw dot cz
2012-12-19 16:45 ` tejohnson at google dot com
` (4 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: hubicka at ucw dot cz @ 2012-12-18 17:25 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #17 from Jan Hubicka <hubicka at ucw dot cz> 2012-12-18 17:25:37 UTC ---
> I did some measurements with tramp3d and in this case
> the default (999) gives the best performance:
>
> par. size time
> --------------------
> 999 955859 3.71752
> 990 933390 3.73969
> 980 904718 3.84547
> ... " "
> 750 904718 3.84769
> 740 837654 7.67177
> 600 836024 8.80879
Yep, tramp3d is unforutnately quite special case: we measure the number of
instructions prior
late optimization, while in tramp3d over 90% of code disappear as a result of
inlining and further
simplification, so we get GIGO problem...
I am not sure how to handle these things in any resonable way....
I will test couple of values on spec2k this week and lets see how things scale
elsewhere.
Honza
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (17 preceding siblings ...)
2012-12-18 17:25 ` hubicka at ucw dot cz
@ 2012-12-19 16:45 ` tejohnson at google dot com
2012-12-21 16:16 ` hubicka at ucw dot cz
` (3 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: tejohnson at google dot com @ 2012-12-19 16:45 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #18 from Teresa Johnson <tejohnson at google dot com> 2012-12-19 16:44:21 UTC ---
On Tue, Dec 18, 2012 at 9:25 AM, hubicka at ucw dot cz
<gcc-bugzilla@gcc.gnu.org> wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
>
> --- Comment #17 from Jan Hubicka <hubicka at ucw dot cz> 2012-12-18 17:25:37 UTC ---
>> I did some measurements with tramp3d and in this case
>> the default (999) gives the best performance:
>>
>> par. size time
>> --------------------
>> 999 955859 3.71752
>> 990 933390 3.73969
>> 980 904718 3.84547
>> ... " "
>> 750 904718 3.84769
>> 740 837654 7.67177
>> 600 836024 8.80879
>
> Yep, tramp3d is unforutnately quite special case: we measure the number of
> instructions prior
> late optimization, while in tramp3d over 90% of code disappear as a result of
> inlining and further
> simplification, so we get GIGO problem...
>
> I am not sure how to handle these things in any resonable way....
>
> I will test couple of values on spec2k this week and lets see how things scale
> elsewhere.
As another data point, in our internal benchmarks I had tried a few
values and 99.9% gave the best performance. Just going down to 99.0%
reduced the inlining too much, even compared to the old static cutoff
count, missing some key inlines and reducing performance.
Thanks,
Teresa
>
> Honza
>
> --
> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.
--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (18 preceding siblings ...)
2012-12-19 16:45 ` tejohnson at google dot com
@ 2012-12-21 16:16 ` hubicka at ucw dot cz
2012-12-21 16:26 ` tejohnson at google dot com
` (2 subsequent siblings)
22 siblings, 0 replies; 24+ messages in thread
From: hubicka at ucw dot cz @ 2012-12-21 16:16 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #19 from Jan Hubicka <hubicka at ucw dot cz> 2012-12-21 16:15:34 UTC ---
> As another data point, in our internal benchmarks I had tried a few
> values and 99.9% gave the best performance. Just going down to 99.0%
> reduced the inlining too much, even compared to the old static cutoff
> count, missing some key inlines and reducing performance.
this really should not happen too much. I still think something along the
following lines
is desirable. Does it helps setting more resonable threshold?
Honza
Index: predict.c
===================================================================
*** predict.c (revision 194655)
--- predict.c (working copy)
*************** maybe_hot_count_p (struct function *fun,
*** 145,151 ****
{
ws = find_working_set (PARAM_VALUE (HOT_BB_COUNT_WS_PERMILLE));
gcc_assert (ws);
! min_count = ws->min_counter;
}
return (count >= min_count);
}
--- 145,156 ----
{
ws = find_working_set (PARAM_VALUE (HOT_BB_COUNT_WS_PERMILLE));
gcc_assert (ws);
!
! /* We want all counters above ws->min_counter * profile_info->runs
! to be safely identified as hot regions. This may be spoiled
! by optimizations such as unrolling that reduce counts of the
! body, thus divide by 32. */
! min_count = ws->min_counter / 32;
}
return (count >= min_count);
}
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (19 preceding siblings ...)
2012-12-21 16:16 ` hubicka at ucw dot cz
@ 2012-12-21 16:26 ` tejohnson at google dot com
2012-12-22 23:20 ` hubicka at ucw dot cz
2013-01-11 18:19 ` tejohnson at google dot com
22 siblings, 0 replies; 24+ messages in thread
From: tejohnson at google dot com @ 2012-12-21 16:26 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #20 from Teresa Johnson <tejohnson at google dot com> 2012-12-21 16:26:17 UTC ---
On Fri, Dec 21, 2012 at 8:15 AM, hubicka at ucw dot cz
<gcc-bugzilla@gcc.gnu.org> wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
>
> --- Comment #19 from Jan Hubicka <hubicka at ucw dot cz> 2012-12-21 16:15:34 UTC ---
>> As another data point, in our internal benchmarks I had tried a few
>> values and 99.9% gave the best performance. Just going down to 99.0%
>> reduced the inlining too much, even compared to the old static cutoff
>> count, missing some key inlines and reducing performance.
> this really should not happen too much. I still think something along the
> following lines
> is desirable. Does it helps setting more resonable threshold?
I'll give this patch a try and let you know how it affects the
performance I see. But unrolling shouldn't affect inlining, since all
unrolling is after inlining, right?
Thanks,
Teresa
>
> Honza
>
> Index: predict.c
> ===================================================================
> *** predict.c (revision 194655)
> --- predict.c (working copy)
> *************** maybe_hot_count_p (struct function *fun,
> *** 145,151 ****
> {
> ws = find_working_set (PARAM_VALUE (HOT_BB_COUNT_WS_PERMILLE));
> gcc_assert (ws);
> ! min_count = ws->min_counter;
> }
> return (count >= min_count);
> }
> --- 145,156 ----
> {
> ws = find_working_set (PARAM_VALUE (HOT_BB_COUNT_WS_PERMILLE));
> gcc_assert (ws);
> !
> ! /* We want all counters above ws->min_counter * profile_info->runs
> ! to be safely identified as hot regions. This may be spoiled
> ! by optimizations such as unrolling that reduce counts of the
> ! body, thus divide by 32. */
> ! min_count = ws->min_counter / 32;
> }
> return (count >= min_count);
> }
>
> --
> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (20 preceding siblings ...)
2012-12-21 16:26 ` tejohnson at google dot com
@ 2012-12-22 23:20 ` hubicka at ucw dot cz
2013-01-11 18:19 ` tejohnson at google dot com
22 siblings, 0 replies; 24+ messages in thread
From: hubicka at ucw dot cz @ 2012-12-22 23:20 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #21 from Jan Hubicka <hubicka at ucw dot cz> 2012-12-22 23:20:37 UTC ---
> I'll give this patch a try and let you know how it affects the
> performance I see. But unrolling shouldn't affect inlining, since all
> unrolling is after inlining, right?
Well, it is, but we do other transformations that affect profile prior inlining
+ we use hot/cold code all across compilation, not just to drive inlining.
Honza
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Bug gcov-profile/55674] [4.8 Regression] >20% size increase of lto/pgo binaries since r193747
2012-12-13 12:56 [Bug gcov-profile/55674] New: [4.8 Regression] >20% size increase of lto/pgo binaries since r193747 markus at trippelsdorf dot de
` (21 preceding siblings ...)
2012-12-22 23:20 ` hubicka at ucw dot cz
@ 2013-01-11 18:19 ` tejohnson at google dot com
22 siblings, 0 replies; 24+ messages in thread
From: tejohnson at google dot com @ 2013-01-11 18:19 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
--- Comment #22 from Teresa Johnson <tejohnson at google dot com> 2013-01-11 18:18:48 UTC ---
Hi Honza,
I ran a number of experiments at different thresholds, and found that
performance starts dropping pretty quickly as the working set
threshold is dropped, even to 98.5% (2 working sets lower than the
99.9% threshold). This appears to be due to a reduction in key
inlines. If I use your divide by 32 heuristic, the performance is
recovered for most of the apps I looked at (with the 98.5% threshold),
but is still reduced for others.
When I looked at the ratio of the min counter values between the
98.28% bucket (selected when I use the 98.5% threshold) and the 99.9%
bucket, I can see that the apps that still degrade with 98.5% + your
divide by 32 heuristic are those with the largest ratio (100-300). For
the apps that no longer degrade from the 98.5% thresholds when
dividing by 32, the ratio is small and often less than 32 (so in
effect the resulting min counter is even smaller than with the current
99.9% threshold).
What threshold were you thinking of changing it to, based on your
experiments? For the google branches, based on the data I collected,
we plan to leave it at 99.9%.
Thanks,
Teresa
On Sat, Dec 22, 2012 at 3:20 PM, hubicka at ucw dot cz
<gcc-bugzilla@gcc.gnu.org> wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674
>
> --- Comment #21 from Jan Hubicka <hubicka at ucw dot cz> 2012-12-22 23:20:37 UTC ---
>> I'll give this patch a try and let you know how it affects the
>> performance I see. But unrolling shouldn't affect inlining, since all
>> unrolling is after inlining, right?
>
> Well, it is, but we do other transformations that affect profile prior inlining
> + we use hot/cold code all across compilation, not just to drive inlining.
>
> Honza
>
> --
> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 24+ messages in thread