public inbox for gcc-bugs@sourceware.org
From: "hubicka at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug ipa/103227] 58% exchange2 regression with -Ofast -march=native on zen3 between g:1ae8edf5f73ca5c3 and g:2af63f0f53a12a72
Date: Sat, 13 Nov 2021 22:11:16 +0000
Message-ID: <bug-103227-4-dXZGkwgmen@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-103227-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103227

--- Comment #2 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
There is a difference in the inliner's decisions. Since all clones are of the
same size, the outcome depends on the order in which the inliner picks them and
combines them together before hitting large-function-growth. It seems that with
the isra ordering the inliner is simply less lucky. Instead of the inline stack:

IPA function summary for digits_2.constprop/143 inlinable
  global time:     22960.500916
  self size:       1277
  global size:     2534
  min size:        513
  self stack:      261
  global stack:    783
  estimated growth:-488
    size:513.000000, time:6690.410500
    size:3.000000, time:2.000001, executed if:(not inlined)
    size:0.500000, time:0.500000, executed if:(not inlined), nonconst if:(op0[ref offset: 0] changed) && (not inlined)
    size:138.500000, time:217.532556, nonconst if:(op0[ref offset: 0] changed)
    size:36.000000, time:34.793911, executed if:(op0[ref offset: 0],(# % 3) == 2), nonconst if:(op0[ref offset: 0] changed) && (op0[ref offset: 0],(# % 3) == 2)
    size:198.000000, time:574.099545, executed if:(op0[ref offset: 0],(# % 3) == 2)
    size:36.000000, time:34.793911, executed if:(op0[ref offset: 0],(# % 3) == 1), nonconst if:(op0[ref offset: 0] changed) && (op0[ref offset: 0],(# % 3) == 1)
    size:270.000000, time:1357.103458, executed if:(op0[ref offset: 0],(# % 3) == 1)
    size:21.000000, time:375.971570, executed if:(op0[ref offset: 0] == 5)
    size:1263.000000, time:12359.502960, executed if:(op0[ref offset: 0] != 8)
    size:1.000000, time:0.900000, executed if:(op0[ref offset: 0] != 8), nonconst if:(op0[ref offset: 0] changed) && (op0[ref offset: 0] != 8)
    size:48.000000, time:1300.920311, executed if:(op0[ref offset: 0] == 8)
  loop iterations:
    0.68 for (op0[ref offset: 0] changed)
    0.76 for (op0[ref offset: 0] changed)
    0.88 for (op0[ref offset: 0] changed)
    1.08 for (op0[ref offset: 0] changed)
    1.40 for (op0[ref offset: 0] changed)
    1.93 for (op0[ref offset: 0] changed)
    2.80 for (op0[ref offset: 0] changed)
    4.23 for (op0[ref offset: 0] changed)
    11.88 for (op0[ref offset: 0] changed)
    4.59 for (op0[ref offset: 0] changed)
    3.16 for (op0[ref offset: 0] changed)
    2.29 for (op0[ref offset: 0] changed)
    1.76 for (op0[ref offset: 0] changed)
    1.44 for (op0[ref offset: 0] changed)
    1.24 for (op0[ref offset: 0] changed)
    1.12 for (op0[ref offset: 0] changed)
  calls:
    covered.constprop/148 --param max-inline-insns-auto limit reached freq:0.30 loop depth: 9 size: 4 time: 13 callee size:262 stack:1472 predicate: (op0[ref offset: 0] == 8)
      op0 is compile time invariant
      op0 points to local or readonly memory
      op1 is compile time invariant
      op1 points to local or readonly memory
    digits_2.constprop/144 inlined freq:0.90
      Stack frame offset 261, callee self size 261
      __builtin_unreachable/156 unreachable freq:0.00 cross module loop depth:18 size: 0 time: 0 predicate: (false)
        op0 is compile time invariant
        op0 points to local or readonly memory
        op1 is compile time invariant
        op1 points to local or readonly memory
      digits_2.constprop/145 inlined freq:0.81
        Stack frame offset 522, callee self size 261
        __builtin_unreachable/156 unreachable freq:0.00 cross module loop depth:27 size: 0 time: 0 predicate: (false)
          op0 points to local or readonly memory
          op1 is compile time invariant
          op1 points to local or readonly memory
        digits_2.constprop/146 --param large-function-growth limit reached freq:0.73 loop depth:27 size: 2 time: 11 callee size:1019 stack:522 predicate: (op0[ref offset: 0] != 8)
          op0 is compile time invariant
          op0 points to local or readonly memory

where inlining fails only at recursion depth 4, we get:

IPA function summary for digits_2.constprop.isra/163 inlinable
  global time:     17184.704285
  self size:       1277
  global size:     1994
  min size:        513
  self stack:      261
  global stack:    522
  estimated growth:301
    size:513.000000, time:6690.410500
    size:3.000000, time:2.000001, executed if:(not inlined)
    size:0.500000, time:0.500000, executed if:(not inlined), nonconst if:(op0[ref offset: 0] changed) && (not inlined)
    size:138.500000, time:217.532556, nonconst if:(op0[ref offset: 0] changed)
    size:36.000000, time:34.793911, executed if:(op0[ref offset: 0],(# % 3) == 2), nonconst if:(op0[ref offset: 0] changed) && (op0[ref offset: 0],(# % 3) == 2)
    size:198.000000, time:574.099545, executed if:(op0[ref offset: 0],(# % 3) == 2)
    size:36.000000, time:34.793911, executed if:(op0[ref offset: 0],(# % 3) == 1), nonconst if:(op0[ref offset: 0] changed) && (op0[ref offset: 0],(# % 3) == 1)
    size:270.000000, time:1357.103458, executed if:(op0[ref offset: 0],(# % 3) == 1)
    size:21.000000, time:375.971570, executed if:(op0[ref offset: 0] == 5)
    size:723.000000, time:6582.815331, executed if:(op0[ref offset: 0] != 8)
    size:1.000000, time:0.900000, executed if:(op0[ref offset: 0] != 8), nonconst if:(op0[ref offset: 0] changed) && (op0[ref offset: 0] != 8)
    size:48.000000, time:1300.920311, executed if:(op0[ref offset: 0] == 8)
  loop iterations:
    0.68 for (op0[ref offset: 0] changed)
    0.76 for (op0[ref offset: 0] changed)
    0.88 for (op0[ref offset: 0] changed)
    1.08 for (op0[ref offset: 0] changed)
    1.40 for (op0[ref offset: 0] changed)
    1.93 for (op0[ref offset: 0] changed)
    2.80 for (op0[ref offset: 0] changed)
    4.23 for (op0[ref offset: 0] changed)
    11.88 for (op0[ref offset: 0] changed)
    4.59 for (op0[ref offset: 0] changed)
    3.16 for (op0[ref offset: 0] changed)
    2.29 for (op0[ref offset: 0] changed)
    1.76 for (op0[ref offset: 0] changed)
    1.44 for (op0[ref offset: 0] changed)
    1.24 for (op0[ref offset: 0] changed)
    1.12 for (op0[ref offset: 0] changed)
  calls:
    digits_2.constprop.isra/162 inlined freq:0.90
      Stack frame offset 261, callee self size 261
      digits_2.constprop.isra/161 --param large-function-growth limit reached freq:0.81 loop depth:18 size: 2 time: 11 callee size:1033 stack:522 predicate: (op0[ref offset: 0] != 8)
        op0 is compile time invariant
        op0 points to local or readonly memory
      __builtin_unreachable/168 unreachable freq:0.00 cross module loop depth:18 size: 0 time: 0 predicate: (false)
        op0 is compile time invariant
        op0 points to local or readonly memory
        op1 is compile time invariant
        op1 points to local or readonly memory
    covered.constprop/148 --param max-inline-insns-auto limit reached freq:0.30 loop depth: 9 size: 4 time: 13 callee size:262 stack:1472 predicate: (op0[ref offset: 0] == 8)
      op0 is compile time invariant
      op0 points to local or readonly memory
      op1 is compile time invariant
      op1 points to local or readonly memory

where we fail at depth 2.
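As a rough illustration of why the order matters, the large-function-growth check can be modelled as below. This is a simplified sketch, not GCC's code. The function names are hypothetical, the defaults of 2700 for large-function-insns and 100 for large-function-growth are assumed from the GCC manual, and the size after inlining is approximated as caller size plus callee size minus the size of the call, which ignores the predicates listed in the summaries.

```python
# Simplified, hypothetical model of the large-function-growth check.
# Not GCC's implementation; the parameter defaults are assumptions
# taken from the GCC manual.


def growth_limit(largest_self_size, large_function_growth=100):
    """Largest self size plus the allowed growth percentage."""
    return largest_self_size + largest_self_size * large_function_growth // 100


def hits_large_function_growth(caller_size, largest_self_size, callee_size,
                               call_size, large_function_insns=2700,
                               large_function_growth=100):
    """Return True if inlining would be refused under this simplified model."""
    # Crude estimate: the callee body replaces the call statement.
    new_size = caller_size + callee_size - call_size
    limit = growth_limit(largest_self_size, large_function_growth)
    return new_size > large_function_insns and new_size > limit


# Sizes taken from the two dumps (self size is 1277 in both cases).
without_isra = hits_large_function_growth(2534, 1277, 1019, 2)
with_isra = hits_large_function_growth(1994, 1277, 1033, 2)
print(growth_limit(1277), without_isra, with_isra)
```

Under these assumptions both remaining calls are refused, but in the first dump that happens only after two clones were already merged into the caller (global size 2534), while in the isra dump it happens after one (global size 1994).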