public inbox for gcc-bugs@sourceware.org
From: "rsandifo at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/98782] [11/12 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies
Date: Wed, 08 Dec 2021 09:33:42 +0000
Message-ID: <bug-98782-4-Q1RNgnBXVX@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-98782-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782

--- Comment #16 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #15)
> > From that point of view, it doesn't look like the memory and register
> > costs of R are too wrong here.  The things being costed are the store
> > and load around the call (which do exist if we allocate a call-clobbered
> > register) and the loads at each use site (which do exist if we spill R).
>
> Indeed, I don't think the heuristics are wrong, but one frequency,
> CALL_FREQ, grows much quicker than BB_FREQ, and at the smaller values
> they are a bit sensitive to any changes.  The edge probabilities can
> barely change while the BB_FREQ can change dramatically.

On "CALL_FREQ grows much quicker than BB_FREQ": for r104, the ALLOCNO_FREQ
ought in principle to be fixed for a given loop iteration count.  It
shouldn't grow or shrink based on the value of SPILLED.  That's because
every execution of the loop body involves exactly one reference to r104.
SPILLED specifies the probability that that single reference is the "call"
use rather than the "non-call" use, but it doesn't change the total number
of references per iteration.

So I think the only reason we see the different ALLOCNO_FREQs in:

  ALLOCNO_FREQ 989, …

vs:

  ALLOCNO_FREQ 990, …

is round-off error.  If the values had more precision, I think we'd have a
fixed ALLOCNO_FREQ and a varying ALLOCNO_CALL_FREQ.
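The fixed-ALLOCNO_FREQ argument can be sanity-checked with a small model. This is only a sketch, not IRA's actual bookkeeping: `SCALE` is a hypothetical integer scaling factor standing in for the one IRA uses, and `allocno_freqs` is an illustrative helper, not a GCC function.

```python
# Sketch of the frequency argument above: every loop iteration references
# r104 exactly once, so the allocno's total frequency is independent of
# SPILLED; only the "call use" share of it moves with SPILLED.
# SCALE is a hypothetical integer scaling factor (IRA scales frequencies
# to integers, which is where round-off like 989 vs. 990 can creep in).
SCALE = 1000

def allocno_freqs(spilled_prob, refs_per_iter=1):
    """Return (ALLOCNO_FREQ, ALLOCNO_CALL_FREQ) per loop iteration."""
    freq = refs_per_iter                 # total references: fixed
    call_freq = spilled_prob * freq      # the "call" use, probability SPILLED
    return round(freq * SCALE), round(call_freq * SCALE)

for p in (0.25, 0.50, 0.75):
    total, at_call = allocno_freqs(p)
    print(f"SPILLED={p}: ALLOCNO_FREQ={total}, ALLOCNO_CALL_FREQ={at_call}")
```

Whatever SPILLED is, the first component stays constant; only the call-frequency component moves, matching the round-off explanation above.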
At one extreme, if SPILLED is very low (but we nevertheless continue to
duplicate the common part of the loop body) then storing and loading
around the call becomes very cheap (ALLOCNO_CALL_FREQ is much lower than
the conceptually fixed ALLOCNO_FREQ).  If SPILLED is very high (but again
we continue to duplicate the common part of the loop body) then storing
and loading around the call becomes very expensive.

So I guess the question is: where is the cut-off?  Given that:

- the loop iterates 1024 times
- there is a single store outside the loop
- the target claims that loads and stores have equal cost

ISTM that the cut-off really is in the range [0.5, 0.51].  If the value of
SPILLED reflects the real probability at runtime then I think IRA is doing
the right thing, given the claimed costs of storing and loading.

I guess the problems are:

- SPILLED is only a guess
- the aarch64 costs of stores and loads don't reflect reality

> > Like Feng Xue says in comment 1, I think the main missed optimisation
> > opportunity here is that foo + 1024 is invariant, so if we allocate
> > a call-clobbered register, we could save R once outside the loop
> > and reload it after each call.  That would give:
> >
> > - a store of R outside the loop (low execution count)
> > - a load of R inside the loop (after the call) with freq 0.51 * loop iters
>
> Yes, that is the ideal solution, but also requires more changes to RA.

True :-)

> Instead I've chosen a middle ground here (same as yours but done in
> ira_tune_allocno_costs instead), which is to store and load only inside
> the loop, but to do so only in the BB which contains the call.

I don't think you were saying otherwise, but just FTR: I wasn't proposing
a solution, I was just describing a hack.  It seemed to me like IRA was
making the right decision for r104 in isolation, for the given SPILLED
value and target costs.  My hack to force an allocation for r104 made
things worse.

> > which is cheaper than both the current approaches.
> > We don't do that optimisation yet though, so the current costing seems
> > to reflect what we currently generate.
>
> In many (if not most) Arches stores are significantly cheaper than the
> loads though.  So the store before the call doesn't end up making that
> much of a difference, but yes it adds up if you have many of them.

Yeah.  Could we fix the problem that way instead?  The only reason IRA is
treating loads and stores as equal cost is because aarch64 asked it to :-)

At least for the reduced testcase, ISTM that IRA is making the right choice
for the "loads and stores are equally expensive" assumption.  It also seems
to make the right choice for a "loads are more expensive than stores"
assumption.  It's up to the target to choose which is right.  So I think
the reduced testcase is showing a target bug.
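The break-even arithmetic behind both the [0.5, 0.51] cut-off and the load/store-cost point can be checked with a short sketch. This is a reading of the costs described in the thread, not IRA's actual cost function; the assumptions are the ones stated above (1024 iterations, one store outside the loop when spilled, save/restore around the call otherwise), and the 4:1 load/store cost ratio below is purely hypothetical.

```python
# Break-even SPILLED value: the point where keeping r104 in a
# call-clobbered register (save + restore around the call, executed with
# probability SPILLED per iteration) costs the same as spilling it
# (one store outside the loop, one load per use inside it).
from fractions import Fraction

ITERS = 1024  # the loop in the testcase iterates 1024 times

def break_even_spilled(load_cost, store_cost):
    in_reg_per_unit = ITERS * (store_cost + load_cost)  # scaled by SPILLED
    spill = store_cost + ITERS * load_cost
    return Fraction(spill, in_reg_per_unit)

# With the equal costs aarch64 currently claims, the cut-off is just
# above 0.5, i.e. inside the [0.5, 0.51] range derived above:
print(break_even_spilled(1, 1))        # 1025/2048, ~0.5005

# With loads costlier than stores (hypothetical 4:1 ratio), the cut-off
# moves well past 0.51, so for the testcase's SPILLED value keeping r104
# in a call-clobbered register would win:
print(break_even_spilled(4, 1))        # 4097/5120, ~0.8002
```

So under either cost assumption the sketch agrees with IRA's choice; which assumption is right is exactly the target's call, as the comment concludes.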