* Set inline-unit-growth to 40 @ 2019-01-12 18:32 Jan Hubicka 2019-01-14 16:28 ` Christophe Lyon 2019-01-14 16:53 ` Qing Zhao 0 siblings, 2 replies; 7+ messages in thread From: Jan Hubicka @ 2019-01-12 18:32 UTC (permalink / raw) To: gcc-patches Hello, this patch sets inline-unit-growth to 40. The performance changes are - Firefox, LTO https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=f7bd026e1a931b9a284d1c85c2577a72dd592820&newProject=try&newRevision=74889968abcc688b8d161863566ed273c0401ee4&framework=1&filter=opt&showOnlyComparable=1&showOnlyImportant=1 After fixes to inlining priorities this makes difference without profile feedback only. Code size growth is about 9.15% with LTO and 3.95 with LTO and profile feedback. - Firefox noLTO https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=c902b72340a3dca3114f58578c1c8f3e6a1cd89c&newProject=try&newRevision=4974da6f92c144a9c09765b56a564a640069ddb9&framework=1&showOnlyComparable=1&showOnlyImportant=1 With about 7% code size growth - SPEC https://lnt.opensuse.org/db_default/v4/CPP/latest_runs_report?num_runs=10&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f - C++ benchmarks https://lnt.opensuse.org/db_default/v4/SPEC/latest_runs_report?num_runs=10&all_changes=on&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f I am not entirely happy about the code-size/performance tradeoffs but it is concerned only for programs built with -O3 or having too many inline keywords. I have looked into inlining decisions for Firefox, HHVM and Clang and inliner gets out of growt bounds way too early and some of more performance aware projects already sets the limit up. I will tune other metrics down to handle some of the code size problems. Honza Index: ChangeLog =================================================================== --- ChangeLog (revision 267882) +++ ChangeLog (working copy) @@ -1,3 +1,7 @@ +2019-01-05 Jan Hubicka <hubicka@ucw.cz> + + * params.def (inline-unit-growth): Set to 40. + 2019-01-12 Jakub Jelinek <jakub@redhat.com> * tree-ssa-loop-ivopts.c (find_inv_vars): Fix a comment typo. Index: params.def =================================================================== --- params.def (revision 267882) +++ params.def (working copy) @@ -227,7 +227,7 @@ DEFPARAM(PARAM_LARGE_UNIT_INSNS, DEFPARAM(PARAM_INLINE_UNIT_GROWTH, "inline-unit-growth", "How much can given compilation unit grow because of the inlining (in percent).", - 20, 0, 0) + 40, 0, 0) DEFPARAM(PARAM_IPCP_UNIT_GROWTH, "ipcp-unit-growth", "How much can given compilation unit grow because of the interprocedural constant propagation (in percent).", ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Set inline-unit-growth to 40 2019-01-12 18:32 Set inline-unit-growth to 40 Jan Hubicka @ 2019-01-14 16:28 ` Christophe Lyon 2019-01-14 16:40 ` Jan Hubicka 2019-01-14 16:53 ` Qing Zhao 1 sibling, 1 reply; 7+ messages in thread From: Christophe Lyon @ 2019-01-14 16:28 UTC (permalink / raw) To: Jan Hubicka; +Cc: gcc Patches Hi Honza, On Sat, 12 Jan 2019 at 19:32, Jan Hubicka <hubicka@ucw.cz> wrote: > > Hello, > this patch sets inline-unit-growth to 40. The performance changes are > - Firefox, LTO > https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=f7bd026e1a931b9a284d1c85c2577a72dd592820&newProject=try&newRevision=74889968abcc688b8d161863566ed273c0401ee4&framework=1&filter=opt&showOnlyComparable=1&showOnlyImportant=1 > After fixes to inlining priorities this makes difference without > profile feedback only. > > Code size growth is about 9.15% with LTO and 3.95 with LTO and profile > feedback. > - Firefox noLTO > https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=c902b72340a3dca3114f58578c1c8f3e6a1cd89c&newProject=try&newRevision=4974da6f92c144a9c09765b56a564a640069ddb9&framework=1&showOnlyComparable=1&showOnlyImportant=1 > With about 7% code size growth > - SPEC > https://lnt.opensuse.org/db_default/v4/CPP/latest_runs_report?num_runs=10&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f > - C++ benchmarks > https://lnt.opensuse.org/db_default/v4/SPEC/latest_runs_report?num_runs=10&all_changes=on&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f > > I am not entirely happy about the code-size/performance tradeoffs but it > is concerned only for programs built with -O3 or having too many inline > keywords. I have looked into inlining decisions for Firefox, HHVM and > Clang and inliner gets out of growt bounds way too early and some of > more performance aware projects already sets the limit up. > > I will tune other metrics down to handle some of the code size problems. > > Honza > > Index: ChangeLog > =================================================================== > --- ChangeLog (revision 267882) > +++ ChangeLog (working copy) > @@ -1,3 +1,7 @@ > +2019-01-05 Jan Hubicka <hubicka@ucw.cz> > + > + * params.def (inline-unit-growth): Set to 40. > + > 2019-01-12 Jakub Jelinek <jakub@redhat.com> > > * tree-ssa-loop-ivopts.c (find_inv_vars): Fix a comment typo. > Index: params.def > =================================================================== > --- params.def (revision 267882) > +++ params.def (working copy) > @@ -227,7 +227,7 @@ DEFPARAM(PARAM_LARGE_UNIT_INSNS, > DEFPARAM(PARAM_INLINE_UNIT_GROWTH, > "inline-unit-growth", > "How much can given compilation unit grow because of the inlining (in percent).", > - 20, 0, 0) > + 40, 0, 0) > DEFPARAM(PARAM_IPCP_UNIT_GROWTH, > "ipcp-unit-growth", > "How much can given compilation unit grow because of the interprocedural constant propagation (in percent).", This patch introduces a regression in libstdc++: FAIL: ext/pb_ds/regression/list_update_map_rand.cc execution test on a few arm targets. For instance: arm-none-linux-gnueabihf --with-mode arm --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16 Using --with-mode thumb and the same other configure options makes the test pass. I'm seeing this with other configurations --with-mode arm and --with-fpu vfp* (as opposed to neon*) The .log file has: <?xml version = "1.0"?> <test> <sd value = "1547347280"> </sd> <n value = "50"> </n> <m value = "10"> </m> <tp value = "0.2"> </tp> <ip value = "0.6"> </ip> <ep value = "0.2"> </ep> <cp value = "0.001"> </cp> <mp value = "0.25"> </mp> </test> <cntnr name = "lu_mtf_map"> <desc> <type value = "list_update"> <Update_Policy value = "lu_move_to_front_policy"> </Update_Policy> </type> </desc> <progress> ---------------------------------------- ************qemu: uncaught target signal 11 (Segmentation fault) - core dumped The (incomplete?) qemu execution trace ends with: ---------------- IN: 0x40ada6b8: e5910008 ldr r0, [r1, #8] 0x40ada6bc: e1560000 cmp r6, r0 0x40ada6c0: 1a00004f bne #0x40ada804 ---------------- IN: 0x40ada6c4: e5960004 ldr r0, [r6, #4] 0x40ada6c8: e582100c str r1, [r2, #0xc] 0x40ada6cc: e3500c02 cmp r0, #0x200 0x40ada6d0: e5812008 str r2, [r1, #8] 0x40ada6d4: 3a000002 blo #0x40ada6e4 ---------------- IN: 0x40adb880: eaffff3e b #0x40adb580 ---------------- IN: _ZN10__gnu_pbds4test6detail30container_rand_regression_testINS_11list_updateINS0_10basic_typeES4_St8equal_toIS4_ENS0_26lu_move_to_front_policy_t_EN9__gnu_cxx22throw_allocator_randomIS4_EEEEE13subscript_impENSt3tr117integral_constantIiLi0EEE 0x0001ffc4: e3a06000 mov r6, #0 0x0001ffc8: eaffff88 b #0x1fdf0 ---------------- IN: _ZN10__gnu_pbds4test6detail30container_rand_regression_testINS_11list_updateINS0_10basic_typeES4_St8equal_toIS4_ENS0_26lu_move_to_front_policy_t_EN9__gnu_cxx22throw_allocator_randomIS4_EEEEE13subscript_impENSt3tr117integral_constantIiLi0EEE 0x0001fdf0: ee180a10 vmov r0, s16 0x0001fdf4: ebffcd12 bl #0x13244 Christophe ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Set inline-unit-growth to 40 2019-01-14 16:28 ` Christophe Lyon @ 2019-01-14 16:40 ` Jan Hubicka 2019-01-17 13:22 ` Christophe Lyon 0 siblings, 1 reply; 7+ messages in thread From: Jan Hubicka @ 2019-01-14 16:40 UTC (permalink / raw) To: Christophe Lyon; +Cc: gcc Patches Hello, > > Index: params.def > > =================================================================== > > --- params.def (revision 267882) > > +++ params.def (working copy) > > @@ -227,7 +227,7 @@ DEFPARAM(PARAM_LARGE_UNIT_INSNS, > > DEFPARAM(PARAM_INLINE_UNIT_GROWTH, > > "inline-unit-growth", > > "How much can given compilation unit grow because of the inlining (in percent).", > > - 20, 0, 0) > > + 40, 0, 0) > > DEFPARAM(PARAM_IPCP_UNIT_GROWTH, > > "ipcp-unit-growth", > > "How much can given compilation unit grow because of the interprocedural constant propagation (in percent).", > > This patch introduces a regression in libstdc++: > FAIL: ext/pb_ds/regression/list_update_map_rand.cc execution test > on a few arm targets. > > For instance: > arm-none-linux-gnueabihf > --with-mode arm > --with-cpu cortex-a5 > --with-fpu vfpv3-d16-fp16 Adjusting inliner heuiristics should not trigger correcness issues, so this seems like a bug that was previously latent. I guess it may legally break correct code only if stack usage gets too large. Do you have any idea what breaks in this testcase? Honza > > Using --with-mode thumb and the same other configure options makes the > test pass. > I'm seeing this with other configurations --with-mode arm and > --with-fpu vfp* (as opposed to neon*) > > The .log file has: > <?xml version = "1.0"?> > <test> > <sd value = "1547347280"> > </sd> > <n value = "50"> > </n> > <m value = "10"> > </m> > <tp value = "0.2"> > </tp> > <ip value = "0.6"> > </ip> > <ep value = "0.2"> > </ep> > <cp value = "0.001"> > </cp> > <mp value = "0.25"> > </mp> > </test> > <cntnr name = "lu_mtf_map"> > <desc> > <type value = "list_update"> > <Update_Policy value = "lu_move_to_front_policy"> > </Update_Policy> > </type> > </desc> > <progress> > ---------------------------------------- > ************qemu: uncaught target signal 11 (Segmentation fault) - core dumped > > The (incomplete?) qemu execution trace ends with: > ---------------- > IN: > 0x40ada6b8: e5910008 ldr r0, [r1, #8] > 0x40ada6bc: e1560000 cmp r6, r0 > 0x40ada6c0: 1a00004f bne #0x40ada804 > ---------------- > IN: > 0x40ada6c4: e5960004 ldr r0, [r6, #4] > 0x40ada6c8: e582100c str r1, [r2, #0xc] > 0x40ada6cc: e3500c02 cmp r0, #0x200 > 0x40ada6d0: e5812008 str r2, [r1, #8] > 0x40ada6d4: 3a000002 blo #0x40ada6e4 > ---------------- > IN: > 0x40adb880: eaffff3e b #0x40adb580 > ---------------- > IN: _ZN10__gnu_pbds4test6detail30container_rand_regression_testINS_11list_updateINS0_10basic_typeES4_St8equal_toIS4_ENS0_26lu_move_to_front_policy_t_EN9__gnu_cxx22throw_allocator_randomIS4_EEEEE13subscript_impENSt3tr117integral_constantIiLi0EEE > 0x0001ffc4: e3a06000 mov r6, #0 > 0x0001ffc8: eaffff88 b #0x1fdf0 > ---------------- > IN: _ZN10__gnu_pbds4test6detail30container_rand_regression_testINS_11list_updateINS0_10basic_typeES4_St8equal_toIS4_ENS0_26lu_move_to_front_policy_t_EN9__gnu_cxx22throw_allocator_randomIS4_EEEEE13subscript_impENSt3tr117integral_constantIiLi0EEE > 0x0001fdf0: ee180a10 vmov r0, s16 > 0x0001fdf4: ebffcd12 bl #0x13244 > > Christophe ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Set inline-unit-growth to 40 2019-01-14 16:40 ` Jan Hubicka @ 2019-01-17 13:22 ` Christophe Lyon 0 siblings, 0 replies; 7+ messages in thread From: Christophe Lyon @ 2019-01-17 13:22 UTC (permalink / raw) To: Jan Hubicka; +Cc: gcc Patches On Mon, 14 Jan 2019 at 17:40, Jan Hubicka <hubicka@ucw.cz> wrote: > > Hello, > > > Index: params.def > > > =================================================================== > > > --- params.def (revision 267882) > > > +++ params.def (working copy) > > > @@ -227,7 +227,7 @@ DEFPARAM(PARAM_LARGE_UNIT_INSNS, > > > DEFPARAM(PARAM_INLINE_UNIT_GROWTH, > > > "inline-unit-growth", > > > "How much can given compilation unit grow because of the inlining (in percent).", > > > - 20, 0, 0) > > > + 40, 0, 0) > > > DEFPARAM(PARAM_IPCP_UNIT_GROWTH, > > > "ipcp-unit-growth", > > > "How much can given compilation unit grow because of the interprocedural constant propagation (in percent).", > > > > This patch introduces a regression in libstdc++: > > FAIL: ext/pb_ds/regression/list_update_map_rand.cc execution test > > on a few arm targets. > > > > For instance: > > arm-none-linux-gnueabihf > > --with-mode arm > > --with-cpu cortex-a5 > > --with-fpu vfpv3-d16-fp16 > > Adjusting inliner heuiristics should not trigger correcness issues, so > this seems like a bug that was previously latent. I guess it may > legally break correct code only if stack usage gets too large. > Indeed I didn't expect such a change to involve correctness issues. > Do you have any idea what breaks in this testcase? > Hmm, I've just seen this test fail then pass again in validations that did not involve any change related to it. So there's some "randomness", and the logs are not very helpful (no message saying memory exhaustion or similar) > Honza > > > > Using --with-mode thumb and the same other configure options makes the > > test pass. > > I'm seeing this with other configurations --with-mode arm and > > --with-fpu vfp* (as opposed to neon*) > > > > The .log file has: > > <?xml version = "1.0"?> > > <test> > > <sd value = "1547347280"> > > </sd> > > <n value = "50"> > > </n> > > <m value = "10"> > > </m> > > <tp value = "0.2"> > > </tp> > > <ip value = "0.6"> > > </ip> > > <ep value = "0.2"> > > </ep> > > <cp value = "0.001"> > > </cp> > > <mp value = "0.25"> > > </mp> > > </test> > > <cntnr name = "lu_mtf_map"> > > <desc> > > <type value = "list_update"> > > <Update_Policy value = "lu_move_to_front_policy"> > > </Update_Policy> > > </type> > > </desc> > > <progress> > > ---------------------------------------- > > ************qemu: uncaught target signal 11 (Segmentation fault) - core dumped > > > > The (incomplete?) qemu execution trace ends with: > > ---------------- > > IN: > > 0x40ada6b8: e5910008 ldr r0, [r1, #8] > > 0x40ada6bc: e1560000 cmp r6, r0 > > 0x40ada6c0: 1a00004f bne #0x40ada804 > > ---------------- > > IN: > > 0x40ada6c4: e5960004 ldr r0, [r6, #4] > > 0x40ada6c8: e582100c str r1, [r2, #0xc] > > 0x40ada6cc: e3500c02 cmp r0, #0x200 > > 0x40ada6d0: e5812008 str r2, [r1, #8] > > 0x40ada6d4: 3a000002 blo #0x40ada6e4 > > ---------------- > > IN: > > 0x40adb880: eaffff3e b #0x40adb580 > > ---------------- > > IN: _ZN10__gnu_pbds4test6detail30container_rand_regression_testINS_11list_updateINS0_10basic_typeES4_St8equal_toIS4_ENS0_26lu_move_to_front_policy_t_EN9__gnu_cxx22throw_allocator_randomIS4_EEEEE13subscript_impENSt3tr117integral_constantIiLi0EEE > > 0x0001ffc4: e3a06000 mov r6, #0 > > 0x0001ffc8: eaffff88 b #0x1fdf0 > > ---------------- > > IN: _ZN10__gnu_pbds4test6detail30container_rand_regression_testINS_11list_updateINS0_10basic_typeES4_St8equal_toIS4_ENS0_26lu_move_to_front_policy_t_EN9__gnu_cxx22throw_allocator_randomIS4_EEEEE13subscript_impENSt3tr117integral_constantIiLi0EEE > > 0x0001fdf0: ee180a10 vmov r0, s16 > > 0x0001fdf4: ebffcd12 bl #0x13244 > > > > Christophe ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Set inline-unit-growth to 40 2019-01-12 18:32 Set inline-unit-growth to 40 Jan Hubicka 2019-01-14 16:28 ` Christophe Lyon @ 2019-01-14 16:53 ` Qing Zhao 2019-01-15 15:14 ` Jan Hubicka 1 sibling, 1 reply; 7+ messages in thread From: Qing Zhao @ 2019-01-14 16:53 UTC (permalink / raw) To: Jan Hubicka; +Cc: gcc-patches Hi, Honza, in addition to the code size problems, there are several runtime regression for the SPEC: (If I read the table correctly, if not, let me know) SPEC/SPEC2006/INT/483.xalancbmk <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=183.290.0> 146.131 4.89% SPEC/SPEC2006/FP/436.cactusADM <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=171.100.0> 130.967 8.07% SPEC/SPEC2006/FP/435.gromacs <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=177.90.0> 182.555 11.73% SPEC/SPEC2017/INT/541.leela_r <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=174.397.0> 452.333 4.17% SPEC/SPEC2017/INT/520.omnetpp_r <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=172.357.0> 395.582 4.98% do we have plan to study and fix these run-time regression? thanks. Qing > On Jan 12, 2019, at 12:32 PM, Jan Hubicka <hubicka@ucw.cz> wrote: > > Hello, > this patch sets inline-unit-growth to 40. The performance changes are > - Firefox, LTO > https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=f7bd026e1a931b9a284d1c85c2577a72dd592820&newProject=try&newRevision=74889968abcc688b8d161863566ed273c0401ee4&framework=1&filter=opt&showOnlyComparable=1&showOnlyImportant=1 > After fixes to inlining priorities this makes difference without > profile feedback only. > > Code size growth is about 9.15% with LTO and 3.95 with LTO and profile > feedback. > - Firefox noLTO > https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=c902b72340a3dca3114f58578c1c8f3e6a1cd89c&newProject=try&newRevision=4974da6f92c144a9c09765b56a564a640069ddb9&framework=1&showOnlyComparable=1&showOnlyImportant=1 > With about 7% code size growth > - SPEC > https://lnt.opensuse.org/db_default/v4/CPP/latest_runs_report?num_runs=10&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f > - C++ benchmarks > https://lnt.opensuse.org/db_default/v4/SPEC/latest_runs_report?num_runs=10&all_changes=on&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f > > I am not entirely happy about the code-size/performance tradeoffs but it > is concerned only for programs built with -O3 or having too many inline > keywords. I have looked into inlining decisions for Firefox, HHVM and > Clang and inliner gets out of growt bounds way too early and some of > more performance aware projects already sets the limit up. > > I will tune other metrics down to handle some of the code size problems. > > Honza > > Index: ChangeLog > =================================================================== > --- ChangeLog (revision 267882) > +++ ChangeLog (working copy) > @@ -1,3 +1,7 @@ > +2019-01-05 Jan Hubicka <hubicka@ucw.cz> > + > + * params.def (inline-unit-growth): Set to 40. > + > 2019-01-12 Jakub Jelinek <jakub@redhat.com> > > * tree-ssa-loop-ivopts.c (find_inv_vars): Fix a comment typo. > Index: params.def > =================================================================== > --- params.def (revision 267882) > +++ params.def (working copy) > @@ -227,7 +227,7 @@ DEFPARAM(PARAM_LARGE_UNIT_INSNS, > DEFPARAM(PARAM_INLINE_UNIT_GROWTH, > "inline-unit-growth", > "How much can given compilation unit grow because of the inlining (in percent).", > - 20, 0, 0) > + 40, 0, 0) > DEFPARAM(PARAM_IPCP_UNIT_GROWTH, > "ipcp-unit-growth", > "How much can given compilation unit grow because of the interprocedural constant propagation (in percent).", ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Set inline-unit-growth to 40 2019-01-14 16:53 ` Qing Zhao @ 2019-01-15 15:14 ` Jan Hubicka 2019-01-15 16:18 ` Qing Zhao 0 siblings, 1 reply; 7+ messages in thread From: Jan Hubicka @ 2019-01-15 15:14 UTC (permalink / raw) To: Qing Zhao; +Cc: gcc-patches > Hi, Honza, > > in addition to the code size problems, there are several runtime regression for the SPEC: (If I read the table correctly, if not, let me know) > > SPEC/SPEC2006/INT/483.xalancbmk <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=183.290.0> 146.131 4.89% This test seems to be just a noise, if you look on the mainline plots there is no noticeable regression and you can see differences +- 4% in last 10 runs. > > SPEC/SPEC2006/FP/436.cactusADM <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=171.100.0> 130.967 8.07% > > SPEC/SPEC2017/INT/520.omnetpp_r <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=172.357.0> 395.582 4.98% Here you can see in the graph both boxes to be yellow which means that binary did not changed nor the size changed. It is just measurement error it seems. > > SPEC/SPEC2006/FP/435.gromacs <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=177.90.0> 182.555 11.73% I plan to check on gromacs This is LTO+PGO which will be rerun only in day or two so I will see if the same regression stays on mainline. > > SPEC/SPEC2017/INT/541.leela_r <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=174.397.0> 452.333 4.17% This was already taken by daily testers and regression does not reproduce, so it seems to be noise too. Honza > > do we have plan to study and fix these run-time regression? > > thanks. > > Qing > > > On Jan 12, 2019, at 12:32 PM, Jan Hubicka <hubicka@ucw.cz> wrote: > > > > Hello, > > this patch sets inline-unit-growth to 40. The performance changes are > > - Firefox, LTO > > https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=f7bd026e1a931b9a284d1c85c2577a72dd592820&newProject=try&newRevision=74889968abcc688b8d161863566ed273c0401ee4&framework=1&filter=opt&showOnlyComparable=1&showOnlyImportant=1 > > After fixes to inlining priorities this makes difference without > > profile feedback only. > > > > Code size growth is about 9.15% with LTO and 3.95 with LTO and profile > > feedback. > > - Firefox noLTO > > https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=c902b72340a3dca3114f58578c1c8f3e6a1cd89c&newProject=try&newRevision=4974da6f92c144a9c09765b56a564a640069ddb9&framework=1&showOnlyComparable=1&showOnlyImportant=1 > > With about 7% code size growth > > - SPEC > > https://lnt.opensuse.org/db_default/v4/CPP/latest_runs_report?num_runs=10&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f > > - C++ benchmarks > > https://lnt.opensuse.org/db_default/v4/SPEC/latest_runs_report?num_runs=10&all_changes=on&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f > > > > I am not entirely happy about the code-size/performance tradeoffs but it > > is concerned only for programs built with -O3 or having too many inline > > keywords. I have looked into inlining decisions for Firefox, HHVM and > > Clang and inliner gets out of growt bounds way too early and some of > > more performance aware projects already sets the limit up. > > > > I will tune other metrics down to handle some of the code size problems. > > > > Honza > > > > Index: ChangeLog > > =================================================================== > > --- ChangeLog (revision 267882) > > +++ ChangeLog (working copy) > > @@ -1,3 +1,7 @@ > > +2019-01-05 Jan Hubicka <hubicka@ucw.cz> > > + > > + * params.def (inline-unit-growth): Set to 40. > > + > > 2019-01-12 Jakub Jelinek <jakub@redhat.com> > > > > * tree-ssa-loop-ivopts.c (find_inv_vars): Fix a comment typo. > > Index: params.def > > =================================================================== > > --- params.def (revision 267882) > > +++ params.def (working copy) > > @@ -227,7 +227,7 @@ DEFPARAM(PARAM_LARGE_UNIT_INSNS, > > DEFPARAM(PARAM_INLINE_UNIT_GROWTH, > > "inline-unit-growth", > > "How much can given compilation unit grow because of the inlining (in percent).", > > - 20, 0, 0) > > + 40, 0, 0) > > DEFPARAM(PARAM_IPCP_UNIT_GROWTH, > > "ipcp-unit-growth", > > "How much can given compilation unit grow because of the interprocedural constant propagation (in percent).", > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Set inline-unit-growth to 40 2019-01-15 15:14 ` Jan Hubicka @ 2019-01-15 16:18 ` Qing Zhao 0 siblings, 0 replies; 7+ messages in thread From: Qing Zhao @ 2019-01-15 16:18 UTC (permalink / raw) To: Jan Hubicka; +Cc: gcc-patches Okay, I see. thanks. Qing > On Jan 15, 2019, at 9:14 AM, Jan Hubicka <hubicka@ucw.cz> wrote: > >> Hi, Honza, >> >> in addition to the code size problems, there are several runtime regression for the SPEC: (If I read the table correctly, if not, let me know) >> >> SPEC/SPEC2006/INT/483.xalancbmk <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=183.290.0> 146.131 4.89% > > This test seems to be just a noise, if you look on the mainline plots > there is no noticeable regression and you can see differences +- 4% in > last 10 runs. >> >> SPEC/SPEC2006/FP/436.cactusADM <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=171.100.0> 130.967 8.07% >> >> SPEC/SPEC2017/INT/520.omnetpp_r <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=172.357.0> 395.582 4.98% > > Here you can see in the graph both boxes to be yellow which means that > binary did not changed nor the size changed. It is just measurement error it seems. >> >> SPEC/SPEC2006/FP/435.gromacs <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=177.90.0> 182.555 11.73% > > I plan to check on gromacs > This is LTO+PGO which will be rerun only in day or two so I will see if > the same regression stays on mainline. >> >> SPEC/SPEC2017/INT/541.leela_r <https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=174.397.0> 452.333 4.17% > > This was already taken by daily testers and regression does not > reproduce, so it seems to be noise too. > > Honza >> >> do we have plan to study and fix these run-time regression? > >> >> thanks. >> >> Qing >> >>> On Jan 12, 2019, at 12:32 PM, Jan Hubicka <hubicka@ucw.cz> wrote: >>> >>> Hello, >>> this patch sets inline-unit-growth to 40. The performance changes are >>> - Firefox, LTO >>> https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=f7bd026e1a931b9a284d1c85c2577a72dd592820&newProject=try&newRevision=74889968abcc688b8d161863566ed273c0401ee4&framework=1&filter=opt&showOnlyComparable=1&showOnlyImportant=1 >>> After fixes to inlining priorities this makes difference without >>> profile feedback only. >>> >>> Code size growth is about 9.15% with LTO and 3.95 with LTO and profile >>> feedback. >>> - Firefox noLTO >>> https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=c902b72340a3dca3114f58578c1c8f3e6a1cd89c&newProject=try&newRevision=4974da6f92c144a9c09765b56a564a640069ddb9&framework=1&showOnlyComparable=1&showOnlyImportant=1 >>> With about 7% code size growth >>> - SPEC >>> https://lnt.opensuse.org/db_default/v4/CPP/latest_runs_report?num_runs=10&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f >>> - C++ benchmarks >>> https://lnt.opensuse.org/db_default/v4/SPEC/latest_runs_report?num_runs=10&all_changes=on&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f >>> >>> I am not entirely happy about the code-size/performance tradeoffs but it >>> is concerned only for programs built with -O3 or having too many inline >>> keywords. I have looked into inlining decisions for Firefox, HHVM and >>> Clang and inliner gets out of growt bounds way too early and some of >>> more performance aware projects already sets the limit up. >>> >>> I will tune other metrics down to handle some of the code size problems. >>> >>> Honza >>> >>> Index: ChangeLog >>> =================================================================== >>> --- ChangeLog (revision 267882) >>> +++ ChangeLog (working copy) >>> @@ -1,3 +1,7 @@ >>> +2019-01-05 Jan Hubicka <hubicka@ucw.cz> >>> + >>> + * params.def (inline-unit-growth): Set to 40. >>> + >>> 2019-01-12 Jakub Jelinek <jakub@redhat.com> >>> >>> * tree-ssa-loop-ivopts.c (find_inv_vars): Fix a comment typo. >>> Index: params.def >>> =================================================================== >>> --- params.def (revision 267882) >>> +++ params.def (working copy) >>> @@ -227,7 +227,7 @@ DEFPARAM(PARAM_LARGE_UNIT_INSNS, >>> DEFPARAM(PARAM_INLINE_UNIT_GROWTH, >>> "inline-unit-growth", >>> "How much can given compilation unit grow because of the inlining (in percent).", >>> - 20, 0, 0) >>> + 40, 0, 0) >>> DEFPARAM(PARAM_IPCP_UNIT_GROWTH, >>> "ipcp-unit-growth", >>> "How much can given compilation unit grow because of the interprocedural constant propagation (in percent).", >> ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2019-01-17 13:22 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-01-12 18:32 Set inline-unit-growth to 40 Jan Hubicka 2019-01-14 16:28 ` Christophe Lyon 2019-01-14 16:40 ` Jan Hubicka 2019-01-17 13:22 ` Christophe Lyon 2019-01-14 16:53 ` Qing Zhao 2019-01-15 15:14 ` Jan Hubicka 2019-01-15 16:18 ` Qing Zhao
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).