Re: [PATCH RFA] Implement register pressure directed hoist pass

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Steven Bosscher <stevenb.gcc@gmail.com>
To: Bin Cheng <bin.cheng@arm.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH RFA] Implement register pressure directed hoist pass
Date: Mon, 01 Oct 2012 09:40:00 -0000	[thread overview]
Message-ID: <CABu31nOUBDXQiS+of79cgrY2Fcq_f5A6iTHkn=Oo-sPbOCPxug@mail.gmail.com> (raw)
In-Reply-To: <50669835.e988440a.4421.ffffa17cSMTPIN_ADDED@mx.google.com>

On Sat, Sep 29, 2012 at 8:37 AM, Bin Cheng <bin.cheng@arm.com> wrote:
> This is the updated patch according to your comments. Please review.
> I also re-collected code size data and found it is improved by about 0.24%
> for mips, which is better than previous data. I believe this should be
> caused by recent changes in trunk, rather than by using DF caches to
> calculate register pressure.

Hello,

Thanks for the update. The first look wasn't a very thorough review,
so I have more comments now. Sorry for that, I should have taken the
time for this the first time round...

First, as a general note: Please add a fat comment somewhere before
the code of the pass to explain how the register pressure driven
hoisting pass works. The existing implementation used to be an almost
one-to-one copy from Muchnick's book, but lately the code has moved
away from that (e.g. with Maxim's patches from last year) and it gets
a bit hard to follow without some explanation. You should at least
document your own algorithm, but I'd really appreciate if you could
also spend some time writing down how things work in general and/or
how the GCC implementation differs from Muchnick's original algorithm.

> +Use IRA to evaluate register pressure in hoist pass for decisions to hoist

"the code hoisting pass" (brownie points for an xref to the flag for
code hoisting).

> -   - do rough calc of how many regs are needed in each block, and a rough
> -     calc of how many regs are available in each class and use that to
> -     throttle back the code in cases where RTX_COST is minimal.

This comment still applies to LCM.

> +/* Record all regs that are set in any one insn.  Communication from
> +   mark_reg_{store,clobber} and global_conflicts.  Asm can refer to
> +   all hard-registers.  */
> +static rtx regs_set[(FIRST_PSEUDO_REGISTER > MAX_RECOG_OPERANDS
> +		     ? FIRST_PSEUDO_REGISTER : MAX_RECOG_OPERANDS) * 2];
> +/* Number of regs stored in the previous array.  */
> +static int n_regs_set;

You can use DF_INSN_DEFS for this in calculate_bb_reg_pressure.

But then again...

> +	  while (n_regs_set-- > 0)
> +	    {
> +	      rtx note = find_regno_note (insn, REG_UNUSED,
> +					  REGNO (regs_set[n_regs_set]));
> +	      if (! note)
> +		continue;
> +
> +	      mark_reg_death (XEXP (note, 0));
> +	    }

Why not just mark all registers mentioned in REG_UNUSED notes as death, i.e.:

for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
  if ((REG_NOTE_KIND (note) == REG_UNUSED)
    mark_reg_death (XEXP (note, 0));

?

> +static int hoist_expr_reaches_here_p (basic_block, struct expr*, basic_block,

space between "struct expr" and "*".

> +				      int *, bitmap_head *);

"bitmap_head *" => "bitmap".
Likewise here:

> 	      /* Basic blocks that have occurrences reachable from BB.  */
> 	      bitmap_head _from_bbs, *from_bbs = &_from_bbs;
>+	      /* Basic blocks through which expr is hoisted.  */
>+	      bitmap_head _hoisted_bbs, *hoisted_bbs = &_hoisted_bbs;

And using BITMAP_ALLOC here:
> 	      bitmap_initialize (from_bbs, 0);
> +	      if (flag_ira_hoist_pressure == 1)
> +		bitmap_initialize (hoisted_bbs, 0);

(Some older code uses the "bitmap_head *" form, but that pre-dates
coretypes.h. Newer code that uses the "old form" should really be
fixed also.)

Please also use an obstack for the hoisted_bbs and from_bbs bitmaps.
They're currently allocated in GC space but that's completely
unnecessary for objects with such a clearly identifiable life time.
You can even allocate these bitmaps outside the loop, you're already
calling bitmap_clear on them.

> EXPR_BB. Stop

Two spaces after a period in comments.

> @@ -2863,7 +2909,8 @@ static int
>    if (visited == NULL)
>      {
>        visited_allocated_locally = 1;
> -      visited = XCNEWVEC (char, last_basic_block);
> +      visited = sbitmap_alloc (last_basic_block);
> +      sbitmap_zero (visited);
>      }

Can you please submit this as a separate patch? I'd go ahead and
commit it as obvious, but it's always good to have one change per
patch and this bit is independent of the reg-pressure dependent
hoisting.

> +fira-hoist-pressure
> +Common Report Var(flag_ira_hoist_pressure) Init(-1)
> +Use IRA based register pressure calculation
> +in hoist optimizations.

Please add the "Optimization" marker. Why initialize to -1? The
default initialization is 0, and if the flag is set it takes a value
1. If you follow that common behavior, you can replace all occurrences
of "if (flag_ira_hoist_pressure == 1)" with just ""if
(flag_ira_hoist_pressure)".

> +  /* Enable register pressure hoist when optimizing for size on Thumb1 set.  */
> +  if (TARGET_THUMB1 && optimize_function_for_size_p (cfun)
> +      && flag_ira_hoist_pressure == -1)
> +    flag_ira_hoist_pressure = 1;

One would expect this to be a win on all targets, but you probably
looked at this (at least Thumb2 and plain ARM). Did your patch not
help there? Otherwise, I'd be in favor of enabling this option by
default for all targets.

A few comments about the data flow bits:

> +#ifdef AUTO_INC_DEC
> +	  for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
> +	    if (REG_NOTE_KIND (link) == REG_INC)
> +	      mark_reg_store (XEXP (link, 0), NULL_RTX, NULL);
> +#endif

AFAIR, you can't have REG_INC notes at this stage except for stack ops
(push/pop). But in any case, DF handles auto-increments insns fine
without looking at REG_INC notes so this piece shouldn't be needed.

> +
> +/* Mark setting register REG.  */
> +static void
> +mark_reg_store (rtx reg, const_rtx setter ATTRIBUTE_UNUSED,
> +		void *data ATTRIBUTE_UNUSED)
> +{
> +  int regno;
> +
> +  if (GET_CODE (reg) == SUBREG)
> +    reg = SUBREG_REG (reg);

Please use DF_REF_REAL_REG in  the caller instead.

> +  if (regno >= FIRST_PSEUDO_REGISTER)
> +    mark_regno_live (regno);
> +  else
> +    {
> +      int last = regno + hard_regno_nregs[regno][GET_MODE (reg)];
> +
> +      while (regno < last)
> +	{
> +	  mark_regno_live (regno);
> +	  regno++;
> +	}
> +    }

With the DF caches, all "sub-registers" of a multi-word hard register
have a DEF (or USE) in the cache, so you don't have to do this while
loop.

With removing regs_set, using DF_REF_REAL_REG and not looping,
mark_reg_store will be just a wrapper around mark_regno_live so you
can eliminate mark_reg_store entirely :-)

> +calculate_bb_reg_pressure (void)
...
> +      FOR_BB_INSNS (bb, insn)

Ideally, you'd be walking insns from last to first, so that you don't
need REG_DEAD notes and use DF_INSN_DEFS to find dieing registers
(eliminating the need for the loop on multi-word hard registers in
mark_reg_death).

Even better would be if you can use df_simulate_one_insn_backwards and
friends, but I don't immediately see how to hook that up with your
change_pressure() function.

> +  reg = SET_DEST (set);
> +  if (GET_CODE (reg) == SUBREG)
> +    reg = SUBREG_REG (reg);
> +  if (MEM_P (reg))
> +    {
> +      *nregs = 0;
> +      pressure_class = NO_REGS;
> +    }
> +  else
> +    {
> +      if (! REG_P (reg))
> +	reg = NULL_RTX;
> +      if (reg == NULL_RTX)
> +	pressure_class = GENERAL_REGS;

gcc_assert (REG_P (reg))? If reg is not a MEM, it must be a REG or
it's not recorded in compute_hash_table. In fact, AFAIR reg must be a
REG unless flag_gcse_las is set, and I don't know if that even works
with code hoisting. It's not a particularly well tested flag (although
it's a very useful one for most RISC targets, maybe you should have a
look at that, too, for ARM variants).

Ciao!
Steven

next prev parent reply	other threads:[~2012-10-01  9:40 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <50655073.e54c420a.651e.ffffac0fSMTPIN_ADDED@mx.google.com>
2012-09-28  9:24 ` Steven Bosscher
2012-09-28 11:05   ` Bin Cheng
2012-09-28 11:26   ` Pedro Alves
2012-09-29 11:54   ` Bin Cheng
2012-10-02 12:58     ` Jeff Law
2012-10-08  3:53       ` Bin Cheng
2012-10-11  7:10       ` Bin Cheng
     [not found]       ` <50766d69.a853420a.2475.fffffbafSMTPIN_ADDED@mx.google.com>
2012-10-11 23:19         ` Steven Bosscher
2012-10-12  3:34           ` Bin Cheng
2012-10-12  8:20           ` Bin Cheng
2012-10-16  8:50             ` Bin Cheng
2012-10-16 17:08               ` Jeff Law
2012-10-19  6:54                 ` Bin Cheng
     [not found]   ` <50669835.e988440a.4421.ffffa17cSMTPIN_ADDED@mx.google.com>
2012-10-01  9:40     ` Steven Bosscher [this message]
2012-10-08  2:59       ` Bin Cheng
2012-09-28  8:21 Bin Cheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABu31nOUBDXQiS+of79cgrY2Fcq_f5A6iTHkn=Oo-sPbOCPxug@mail.gmail.com' \
    --to=stevenb.gcc@gmail.com \
    --cc=bin.cheng@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).