public inbox for gcc-bugs@sourceware.org
From: "jskumari at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/109009] Shrink Wrap missed opportunity
Date: Fri, 14 Apr 2023 17:44:47 +0000
Message-ID: <bug-109009-4-xRSMQckCLs@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-109009-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109009

--- Comment #5 from Surya Kumari Jangala <jskumari at gcc dot gnu.org> ---
I was analysing and comparing the following test cases:

Test1 (shrink wrapped)

long
foo (long i, long cond)
{
  i = i + 1;
  if (cond)
    bar ();
  return i;
}


Test2 (not shrink wrapped)

long
foo (long i, long cond)
{
  if (cond)
    bar ();
  return i+1;
}
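
Both tests compute the same value and call bar() under the same condition, so
any difference in shrink wrapping comes from code generation, not from
semantics. A minimal harness to check this is sketched below. The names
foo_test1, foo_test2 and bar_calls, and the stub body of bar(), are assumptions
added for illustration; the original tests leave bar() external.

```c
/* Hypothetical stub: the original tests only call bar() and never define it.  */
static int bar_calls;

void
bar (void)
{
  bar_calls++;
}

/* Test1 from above, renamed so both variants fit in one file.  */
long
foo_test1 (long i, long cond)
{
  i = i + 1;
  if (cond)
    bar ();
  return i;
}

/* Test2 from above, renamed likewise.  */
long
foo_test2 (long i, long cond)
{
  if (cond)
    bar ();
  return i + 1;
}
```

With bar() defined in the same file the compiler may analyze or inline it, so
this harness only checks equivalence. To look at the shrink-wrapping behaviour
itself, bar() should stay external as in the original tests.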


There is a difference in register allocation by IRA in the two cases.

Input RTL to IRA (Test1: passing case)
BB2:
  set r123, r4
  set r122, r3
  set r120, compare(r123, 0)
  set r117, r122 + 1
  if r120 jump BB4 else jump BB3
BB3:
  call bar()
BB4:
  set r3, r117
  return r3


Input RTL to IRA (Test2: failing case)

BB2:
  set r123, r4
  set r122, r3
  set r120, compare(r123, 0)
  set r118, r122
  if r120 jump BB4 else jump BB3
BB3:
  call bar()
BB4:
  set r3, r118+1
  return r3


There is a difference in registers allocated for r117 (passing case) and r118
(failing case) by IRA.
r117 is allocated r3 while r118 is allocated r31.
Since r117 is allocated r3, r3 is spilled across the call to bar() by LRA. And
so only BB3 requires a prolog and shrink wrap is successful.
In the failing case, since r31 is assigned to r118, BB2 requires a prolog and
shrink wrap fails.

In the IRA pass, after graph coloring, both r117 and r118 are initially
assigned r3. The routine improve_allocation() is called after graph coloring.
In this routine, IRA checks, for each allocno, whether spilling any conflicting
allocnos can improve the allocation of that allocno.

Going into more detail, improve_allocation() does the following:
1. We first compute the cost improvement for usage of each profitable hard
register for a given allocno A. The cost improvement is computed as follows:

costs[regno] = A->hard_reg_costs[regno];  // 'hard_reg_costs' is an array of
                                          // usage costs for each hard register
costs[regno] -= allocno_copy_cost_saving (A, regno);
costs[regno] -= base_cost;  // Say 'reg' is assigned to A. Then 'base_cost' is
                            // the usage cost of 'reg' for A.

2. Then we process each conflicting allocno of A and update the cost
improvement for the profitable hard registers of A. Basically, we compute the
spill costs of the conflicting allocnos and update the cost (for A) of the
register that was assigned to the conflicting allocno. 
3. We then find the best register among the profitable registers, spill the
conflicting allocno that uses this best register and assign the best register
to A.


However, the initial hard register costs for some of the profitable hard
registers are different in the passing and failing cases. More specifically,
the costs in the hard_reg_costs[] array are 0 for regs 14-31 in the failing
case. A zero cost seems incorrect. If using a reg in the set [14..31] has zero
cost, then why wasn’t such a reg chosen for r118?
In the passing case, the costs in hard_reg_costs[] for regs 14-31 are 2000.
At the end of step 1, costs[r31] is -390 in the failing case (for allocno r118)
and 1610 in the passing case (for allocno r117).
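
The step 1 arithmetic for r31 can be checked with a small sketch. The figures
in this comment fix only the sum of the two subtracted terms: 0 - 390 = -390
and 2000 - 390 = 1610, so the copy cost saving and the base cost together come
to 390 in both cases. The split used below (0 and 390) is an arbitrary
assumption, and step1_cost is a hypothetical helper, not a GCC function.

```c
/* Assumed split of the combined 390; only the sum is implied by the
   figures quoted in the comment.  */
enum { ASSUMED_COPY_COST_SAVING = 0, ASSUMED_BASE_COST = 390 };

/* Step 1 cost for one hard register, following the three statements
   listed in step 1: start from the hard register cost, then subtract
   the copy cost saving and the base cost.  */
int
step1_cost (int hard_reg_cost, int copy_cost_saving, int base_cost)
{
  int cost = hard_reg_cost;
  cost -= copy_cost_saving;
  cost -= base_cost;
  return cost;
}

/* costs[r31] for allocno r118 (failing case): hard_reg_costs[r31] is 0.  */
int
failing_case_r31 (void)
{
  return step1_cost (0, ASSUMED_COPY_COST_SAVING, ASSUMED_BASE_COST);
}

/* costs[r31] for allocno r117 (passing case): hard_reg_costs[r31] is 2000.  */
int
passing_case_r31 (void)
{
  return step1_cost (2000, ASSUMED_COPY_COST_SAVING, ASSUMED_BASE_COST);
}
```

Under these assumptions the whole difference of 2000 in costs[r31] between the
two cases comes from hard_reg_costs[], not from the other two terms.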

Another issue(?) is that in step 2, the only conflicting allocno for r118 is
the allocno for r120, which is used to hold the value of the condition check.
The pseudo r120 has been assigned to r100 by the graph coloring step. But r100
is not in the set of profitable hard registers for r118. (The profitable hard
regs are: [0, 3-12, 14-31].) So the allocno for r120 is not considered for
spilling.
And finally in step 3, r31 is assigned to r118, though r31 has not been
assigned to any conflicting allocno. Perhaps improve_allocation() should only
consider registers that have been assigned to conflicting allocnos, and not
other registers, since its stated aim is to see if spilling conflicting
allocnos can result in a better allocation.
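
The suggested restriction can be illustrated with a toy model of step 3.
Everything here is a simplified sketch, not GCC's actual code: pick_best_reg,
setup_r118_case and NUM_HARD_REGS are hypothetical names, a negative cost is
assumed to mean an improvement, and every cost other than costs[31] = -390 is a
placeholder 0 because the comment does not give the other values.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_HARD_REGS 128   /* hypothetical size, large enough for reg 100 */

/* Return the profitable hard register with the lowest negative cost,
   or -1 if no profitable register has a negative cost.  When
   ONLY_CONFLICT_REGS is true, consider only registers currently held
   by a conflicting allocno (the restriction suggested above).  */
int
pick_best_reg (const int *costs, const bool *profitable,
               const int *conflict_regs, size_t n_conflicts,
               bool only_conflict_regs)
{
  int best = -1;
  int min_cost = 0;

  for (int regno = 0; regno < NUM_HARD_REGS; regno++)
    {
      if (!profitable[regno])
        continue;
      if (only_conflict_regs)
        {
          bool held = false;
          for (size_t i = 0; i < n_conflicts; i++)
            if (conflict_regs[i] == regno)
              held = true;
          if (!held)
            continue;
        }
      if (costs[regno] < min_cost)
        {
          min_cost = costs[regno];
          best = regno;
        }
    }
  return best;
}

/* Fill in the data reported for allocno r118: profitable hard regs are
   [0, 3-12, 14-31] and costs[31] is -390 at the end of step 1.  */
void
setup_r118_case (int *costs, bool *profitable)
{
  for (int regno = 0; regno < NUM_HARD_REGS; regno++)
    {
      costs[regno] = 0;   /* placeholder: value not given in the comment */
      profitable[regno] = (regno == 0
                           || (regno >= 3 && regno <= 12)
                           || (regno >= 14 && regno <= 31));
    }
  costs[31] = -390;
}
```

With the r118 data and the single conflicting allocno (r120, held in hard reg
100), the unrestricted search in this model returns 31, while the restricted
search returns -1, meaning r118 would keep the register it got from graph
coloring.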

I am investigating why hard_reg_costs[] has 0 cost for r14..r31.
