public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jeff Law <jeffreyalaw@gmail.com>
To: gcc-patches@gcc.gnu.org, Robin Dapp <rdapp.gcc@gmail.com>,
	richard.sandiford@arm.com
Subject: Re: [PATCH] RFC: Add late-combine pass [PR106594]
Date: Thu, 28 Sep 2023 06:37:17 -0600	[thread overview]
Message-ID: <cc90c4dc-6a0a-4995-8cb7-8b29065aa981@gmail.com> (raw)
In-Reply-To: <mpto7hoykfe.fsf@arm.com>



On 9/26/23 10:21, Richard Sandiford wrote:
> This patch adds a combine pass that runs late in the pipeline.
> There are two instances: one between combine and split1, and one
> after postreload.
> 
> The pass currently has a single objective: remove definitions by
> substituting into all uses.  The pre-RA version tries to restrict
> itself to cases that are likely to have a neutral or beneficial
> effect on register pressure.
> 
> The patch fixes PR106594.  It also fixes a few FAILs and XFAILs
> in the aarch64 test results, mostly due to making proper use of
> MOVPRFX in cases where we didn't previously.  I hope it would
> also help with Robin's vec_duplicate testcase, although the
> pressure heuristic might need tweaking for that case.
> 
> This is just a first step..  I'm hoping that the pass could be
> used for other combine-related optimisations in future.  In particular,
> the post-RA version doesn't need to restrict itself to cases where all
> uses are substitutitable, since it doesn't have to worry about register
> pressure.  If we did that, and if we extended it to handle multi-register
> REGs, the pass might be a viable replacement for regcprop, which in
> turn might reduce the cost of having a post-RA instance of the new pass.
> 
> I've run an assembly comparison with one target per CPU directory,
> and it seems to be a win for all targets except nvptx (which is hard
> to measure, being a higher-level asm).  The biggest winner seemed
> to be AVR.
> 
> However, if a version of the pass does go in, it might be better
> to enable it by default only on targets where the extra compile
> time seems to be worth it.  IMO, fixing PR106594 and the MOVPRFX
> issues makes it worthwhile for AArch64.
> 
> The patch contains various bug fixes and new helper routines.
> I'd submit those separately in the final version.  Because of
> that, there's no GNU changelog yet.
> 
> Bootstrapped & regression tested on aarch64-linux-gnu so far.
Very interesting.

I would generally expect it to be a win on most targets and might allow 
us to reduce the number of post-reload hacks we do.  So I'd lean towards 
enabling it everywhere.


With that in mind, I briefly threw it into my tester.  The first thing 
that popped out was rl78-elf regresses on compile/20021008-1.c.

In the pre-RA version we've taken these insns:

> (insn 22 21 7 2 (set (reg/v/f:HI 44 [ buf ])
>         (const_int 0 [0])) "k.c":9:9 -1
>      (nil))
> (insn 7 22 8 2 (set (subreg:SI (reg:DF 43 [ _1 ]) 0)
>         (mem:SI (plus:HI (reg/v/f:HI 44 [ buf ])
>                 (const_int 1 [0x1])) [1 MEM[(long double *)buf_4(D) + 1B]+0 S4 A16])) "k.c":9:9 2 {movsi}
>      (nil))
> (insn 8 7 9 2 (set (subreg:SI (reg:DF 43 [ _1 ]) 4)
>         (mem:SI (plus:HI (reg/v/f:HI 44 [ buf ])
>                 (const_int 5 [0x5])) [1 MEM[(long double *)buf_4(D) + 1B]+4 S4 A16])) "k.c":9:9 2 {movsi}
>      (expr_list:REG_DEAD (reg/v/f:HI 44 [ buf ])
>         (nil)))


We combine insn 22 with insn 7 and 8 resulting in:

> (insn 7 22 8 2 (set (subreg:SI (reg:DF 43 [ _1 ]) 0)
>         (mem:SI (const_int 1 [0x1]) [1 MEM[(long double *)buf_4(D) + 1B]+0 S4 A16])) "k.c":9:9 2 {movsi}
>      (nil))
> (insn 8 7 9 2 (set (subreg:SI (reg:DF 43 [ _1 ]) 4)
>         (mem:SI (const_int 5 [0x5]) [1 MEM[(long double *)buf_4(D) + 1B]+4 S4 A16])) "k.c":9:9 2 {movsi}
>      (nil))


Which ultimately triggers an assembler error:

> k.s: Assembler messages:
> k.s:41: Error: movw ax,!1
> k.s:41: Error:          ^ Expression not word-aligned
> k.s:43: Error: movw ax,!3
> k.s:43: Error:          ^ Expression not word-aligned
[ ... ]


This seems more likely than not to be a target issue.  I suspect combine 
didn't trip over this because of the multiple uses of (reg 44).


Jeff


  reply	other threads:[~2023-09-28 12:37 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-26 16:21 Richard Sandiford
2023-09-28 12:37 ` Jeff Law [this message]
2023-10-02 15:18 ` Robin Dapp
2023-10-07 12:58   ` Richard Sandiford
2023-10-10 18:01     ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cc90c4dc-6a0a-4995-8cb7-8b29065aa981@gmail.com \
    --to=jeffreyalaw@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=rdapp.gcc@gmail.com \
    --cc=richard.sandiford@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).