public inbox for gcc-bugs@sourceware.org help / color / mirror / Atom feed
* [Bug middle-end/102253] New: scalability issues with large loop depth @ 2021-09-09 9:45 rguenth at gcc dot gnu.org 2021-09-09 10:04 ` [Bug middle-end/102253] " rguenth at gcc dot gnu.org ` (4 more replies) 0 siblings, 5 replies; 6+ messages in thread From: rguenth at gcc dot gnu.org @ 2021-09-09 9:45 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102253 Bug ID: 102253 Summary: scalability issues with large loop depth Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: rguenth at gcc dot gnu.org Target Milestone: --- When investigating an improvement to LIMs fill_always_executed_in I created the following testcase which creates a loop nest of depth N with conditionally executed subloops. extern void foobar (int); template <int a> struct bar { static void baz(int b, int) { if (b & (1 << (a % 32))) for (int i = 0; i < 1024; ++i) bar<a-1>::baz (b, i); } }; template <> struct bar<0> { static void baz (int, int i) { foobar (i); } }; void __attribute__((flatten)) foo(int b) { #ifndef N #define N 10 #endif bar<N>::baz (b, 0); } For N == 900 (the maximum unless you also specify -ftemplate-depth) and -O1 we see tree canonical iv : 1.42 ( 13%) 0.00 ( 0%) 1.42 ( 13%) 28M ( 13%) complete unrolling : 2.80 ( 27%) 0.00 ( 0%) 2.81 ( 26%) 42M ( 19%) integrated RA : 3.41 ( 32%) 0.32 ( 80%) 3.72 ( 34%) 640k ( 0%) TOTAL : 10.54 0.40 10.96 224M For N == 1800 and -O1 it is already tree canonical iv : 30.43 ( 28%) 0.05 ( 14%) 30.50 ( 28%) 116M ( 15%) complete unrolling : 63.96 ( 59%) 0.06 ( 17%) 64.04 ( 59%) 175M ( 22%) tree iv optimization : 5.75 ( 5%) 0.00 ( 0%) 5.77 ( 5%) 126M ( 16%) integrated RA : 1.40 ( 1%) 0.12 ( 34%) 1.53 ( 1%) 1754k ( 0%) TOTAL : 108.35 0.35 108.75 796M For reference compile-time with N == 450 is 2.5s with tree canonical iv : 0.18 ( 7%) 0.00 ( 0%) 0.19 ( 7%) 6904k ( 10%) complete unrolling : 0.34 ( 14%) 0.00 ( 0%) 0.34 ( 13%) 8412k ( 13%) ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug middle-end/102253] scalability issues with large loop depth 2021-09-09 9:45 [Bug middle-end/102253] New: scalability issues with large loop depth rguenth at gcc dot gnu.org @ 2021-09-09 10:04 ` rguenth at gcc dot gnu.org 2021-09-09 10:53 ` rguenth at gcc dot gnu.org ` (3 subsequent siblings) 4 siblings, 0 replies; 6+ messages in thread From: rguenth at gcc dot gnu.org @ 2021-09-09 10:04 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102253 --- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> --- replacing the loop bound 1024 with 'b' moves the two offenders off the radar leaving, with N == 10000 ipa function summary : 12.91 ( 11%) 0.00 ( 0%) 12.92 ( 11%) 9800k ( 0%) loop init : 11.08 ( 9%) 0.12 ( 15%) 11.19 ( 9%) 1677M ( 65%) branch prediction : 81.17 ( 68%) 0.05 ( 6%) 81.12 ( 67%) 51M ( 2%) TOTAL : 119.93 0.78 120.77 2567M where IPA function summary might be because of the artificial testcase construction relying on inlining. ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug middle-end/102253] scalability issues with large loop depth 2021-09-09 9:45 [Bug middle-end/102253] New: scalability issues with large loop depth rguenth at gcc dot gnu.org 2021-09-09 10:04 ` [Bug middle-end/102253] " rguenth at gcc dot gnu.org @ 2021-09-09 10:53 ` rguenth at gcc dot gnu.org 2023-06-24 19:40 ` pinskia at gcc dot gnu.org ` (2 subsequent siblings) 4 siblings, 0 replies; 6+ messages in thread From: rguenth at gcc dot gnu.org @ 2021-09-09 10:53 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102253 Richard Biener <rguenth at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |hubicka at gcc dot gnu.org Keywords| |compile-time-hog --- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> --- The unroll and ivcanon issues are all from invoking estimate_numbers_of_iterations which for each loop walks the whole loop body (including conditionally executed parts) and on the signed IV increment invokes infer_loop_bounds_from_signedness which performs SCEV analysis. It's highly unlikely that there's useful info from stmts in inner loop given we're looking for an evolution in the outer loop. The code is also calling analyze_scalar_evolution with a loop that's not the loop the stmt is in which might be a correctness issue. idx_infer_loop_bounds seems to do the correct thing here. Possibly all SCEV analysis could be performed once and instantiated in the outermost loop we're interested in. That is, we'd walk the outermost loop and fill in bounds for the whole subnest with a single IL walk. While scalar evolutions are cached, the instantiations are not. ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug middle-end/102253] scalability issues with large loop depth 2021-09-09 9:45 [Bug middle-end/102253] New: scalability issues with large loop depth rguenth at gcc dot gnu.org 2021-09-09 10:04 ` [Bug middle-end/102253] " rguenth at gcc dot gnu.org 2021-09-09 10:53 ` rguenth at gcc dot gnu.org @ 2023-06-24 19:40 ` pinskia at gcc dot gnu.org 2023-06-24 19:41 ` pinskia at gcc dot gnu.org 2023-06-24 20:13 ` pinskia at gcc dot gnu.org 4 siblings, 0 replies; 6+ messages in thread From: pinskia at gcc dot gnu.org @ 2023-06-24 19:40 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102253 Andrew Pinski <pinskia at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |luydorarko at vusra dot com --- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> --- *** Bug 110396 has been marked as a duplicate of this bug. *** ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug middle-end/102253] scalability issues with large loop depth 2021-09-09 9:45 [Bug middle-end/102253] New: scalability issues with large loop depth rguenth at gcc dot gnu.org ` (2 preceding siblings ...) 2023-06-24 19:40 ` pinskia at gcc dot gnu.org @ 2023-06-24 19:41 ` pinskia at gcc dot gnu.org 2023-06-24 20:13 ` pinskia at gcc dot gnu.org 4 siblings, 0 replies; 6+ messages in thread From: pinskia at gcc dot gnu.org @ 2023-06-24 19:41 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102253 --- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> --- VRP/ranger uses SCEV now so it might even be worse, the testcase from PR 110396 has that behavior too. ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug middle-end/102253] scalability issues with large loop depth 2021-09-09 9:45 [Bug middle-end/102253] New: scalability issues with large loop depth rguenth at gcc dot gnu.org ` (3 preceding siblings ...) 2023-06-24 19:41 ` pinskia at gcc dot gnu.org @ 2023-06-24 20:13 ` pinskia at gcc dot gnu.org 4 siblings, 0 replies; 6+ messages in thread From: pinskia at gcc dot gnu.org @ 2023-06-24 20:13 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102253 --- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> --- On the trunk with the original testcase here we get: tree copy headers : 12.16 ( 19%) 0.01 ( 2%) 21.57 ( 28%) 771k ( 0%) (I suspect the rest is due to not setting release checking ...) ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2023-06-24 20:13 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-09-09 9:45 [Bug middle-end/102253] New: scalability issues with large loop depth rguenth at gcc dot gnu.org 2021-09-09 10:04 ` [Bug middle-end/102253] " rguenth at gcc dot gnu.org 2021-09-09 10:53 ` rguenth at gcc dot gnu.org 2023-06-24 19:40 ` pinskia at gcc dot gnu.org 2023-06-24 19:41 ` pinskia at gcc dot gnu.org 2023-06-24 20:13 ` pinskia at gcc dot gnu.org
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).