Re: [PATCH v5] Implement new RTL optimizations pass: fold-mem-offsets.

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Jeff Law <jeffreyalaw@gmail.com>
To: Manolis Tsamis <manolis.tsamis@vrull.eu>
Cc: gcc-patches@gcc.gnu.org,
	Philipp Tomsich <philipp.tomsich@vrull.eu>,
	Vineet Gupta <vineetg@rivosinc.com>,
	Richard Biener <richard.guenther@gmail.com>
Subject: Re: [PATCH v5] Implement new RTL optimizations pass: fold-mem-offsets.
Date: Fri, 29 Sep 2023 13:22:04 -0600	[thread overview]
Message-ID: <e55f37a6-b20d-4127-919a-d407a552920c@gmail.com> (raw)
In-Reply-To: <CAM3yNXoEvbRWRna4ztL86r4f4CP1NLG+vQOfS_5zWx2m+EnQSw@mail.gmail.com>



On 9/12/23 04:13, Manolis Tsamis wrote:

>>> +
>>> +/* Get the single reaching definition of an instruction inside a BB.
>>> +   The definition is desired for REG used in INSN.
>>> +   Return the definition insn or NULL if there's no definition with
>>> +   the desired criteria.  */
>>> +static rtx_insn*
>>> +get_single_def_in_bb (rtx_insn *insn, rtx reg)
>>> +{
>>> +  df_ref use;
>>> +  struct df_link *ref_chain, *ref_link;
>>> +
>>> +  FOR_EACH_INSN_USE (use, insn)
>>> +    {
>>> +      if (GET_CODE (DF_REF_REG (use)) == SUBREG)
>>> +     return NULL;
>>> +      if (REGNO (DF_REF_REG (use)) == REGNO (reg))
>>> +     break;
>>> +    }
>>> +
>>> +  if (!use)
>>> +    return NULL;
>>> +
>>> +  ref_chain = DF_REF_CHAIN (use);
>> So what if there's two uses of REG in INSN?  I don't think it's be
>> common at all, but probably better safe and reject than sorry, right? Or
>> is that case filtered out earlier?
>>
> If the REG is the same won't the definitions be the same even if that
> REG appears multiple times in INSN?
Yes.

> fold_offsets_1 should be able to handle the folding with multiple uses
> of REG just fine, for example add R1, R1 or add (ashift R1, 1), R1.
> If there's no other issue here I assume we want to keep that as-is in
> order to not reduce the propagation power (Which I assume is similar
> to ree which uses the same logic).
OK.  I was primarily concerned about the folding and rewriting aspects. 
It probably can only show up on targets with LEA like instructions, and 
even on such targets it's probably rate.




>>> +/* Test if INSN is a memory load / store that can have an offset folded to it.
>>> +   Return true iff INSN is such an instruction and return through MEM_OUT,
>>> +   REG_OUT and OFFSET_OUT the RTX that has a MEM code, the register that is
>>> +   used as a base address and the offset accordingly.
>>> +   All of the out pointers may be NULL in which case they will be ignored.  */
>>> +bool
>>> +get_fold_mem_root (rtx_insn* insn, rtx *mem_out, rtx *reg_out,
>>> +                HOST_WIDE_INT *offset_out)
>>> +{
>>> +  rtx set = single_set (insn);
>>> +  rtx mem = NULL_RTX;
>>> +
>>> +  if (set != NULL_RTX)
>>> +    {
>>> +      rtx src = SET_SRC (set);
>>> +      rtx dest = SET_DEST (set);
>>> +
>>> +      /* Don't fold when we have unspec / volatile.  */
>>> +      if (GET_CODE (src) == UNSPEC
>>> +       || GET_CODE (src) == UNSPEC_VOLATILE
>>> +       || GET_CODE (dest) == UNSPEC
>>> +       || GET_CODE (dest) == UNSPEC_VOLATILE)
>>> +     return false;
>>> +
>>> +      if (MEM_P (src))
>>> +     mem = src;
>>> +      else if (MEM_P (dest))
>>> +     mem = dest;
>>> +      else if ((GET_CODE (src) == SIGN_EXTEND
>>> +             || GET_CODE (src) == ZERO_EXTEND)
>>> +            && MEM_P (XEXP (src, 0)))
>> Note some architectures allow both a source and destination memory.  It
>> looks like your code will prefer the source operand in that case.
>> That's fine, just pointing it out.
>>
> Thanks for pointing that possibility out. I thought for a moment that
> this would be a bug with multiple mentions of the address register.
> but it should be fine due to:
> /* Special case: A foldable memory store is not foldable if it
> mentions DEST outside of the address calculation. */
ACK.

The other thing I keep pondering is autoincrement style addressing. 
Though I think at some point I convinced myself they weren't a problem. 
I think your checks only allow specific kinds of expressions for the 
memory address and I don't think {PRE,POST}_{INC,DEC.MODIFY} were in the 
list of valid ops.


>>> +
>>> +  int max_iters = 5;
>>> +  for (int i = 0; i < max_iters; i++)
>>> +    {
>>> +      bool made_changes = false;
>>> +      for (fold_info_map::iterator iter = fold_info->begin ();
>>> +        iter != fold_info->end (); ++iter)
>>> +     {
>>> +       fold_mem_info *info = (*iter).second;
>>> +       if (bitmap_intersect_p (&cannot_fold_insns, info->fold_insns))
>>> +         made_changes |= bitmap_ior_into (&cannot_fold_insns,
>>> +                                          info->fold_insns);
>>> +     }
>>> +
>>> +      if (!made_changes)
>>> +     return true;
>>> +    }
>>> +
>>> +  return false;
>> So how was the magic value of "5" determined here?  In general we try
>> not to have magic #s like that and instead find a better way to control
>> iterations, falling back to a PARAM when all else fails.
>>
> It's indeed just a magic value :)
> In general even two iterations of this should be quite rare. One
> iteration will handle the majority of interesting cases. I chose 5 to
> limit the worst case execution time for degenerate cases.
> 
> I'll be happy to change that to something better but I couldn't
> figure out how "falling back to a PARAM when all else fails"  should
> work here. Is there a similar example somewhere else in the code that
> I can look at?
If we expect two iterations to be quite rare, then I doubt a param is 
really worth the effort.  I might suggest using a loop like

for (pass = 0; pass < flag_expensive_optimizations + 1; pass++)

No magic numbers :-)   We use similar loops elsewhere for this kind of 
scenario.


If you ever need to create a param, the procedure is to add the option 
to params.opt.  If you look in that file you'll see that you define a 
variable name in the specification that you can then reference.

So for a simple example look for param_avg_loop_niter.

Jeff

next prev parent reply	other threads:[~2023-09-29 19:22 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-09  8:46 Manolis Tsamis
2023-09-09  8:54 ` Manolis Tsamis
2023-09-12  0:47 ` Jeff Law
2023-09-12 10:13   ` Manolis Tsamis
2023-09-29 19:22     ` Jeff Law [this message]
2023-10-03 11:49       ` Manolis Tsamis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e55f37a6-b20d-4127-919a-d407a552920c@gmail.com \
    --to=jeffreyalaw@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=manolis.tsamis@vrull.eu \
    --cc=philipp.tomsich@vrull.eu \
    --cc=richard.guenther@gmail.com \
    --cc=vineetg@rivosinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).