public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: Takayuki 'January June' Suwa <jjsuwa_sys3175@yahoo.co.jp>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] lower-subreg, expr: Mitigate inefficiencies derived from "(clobber (reg X))" followed by "(set (subreg (reg X)) (...))"
Date: Thu, 04 Aug 2022 10:49:32 +0100	[thread overview]
Message-ID: <mptzggkmj83.fsf@arm.com> (raw)
In-Reply-To: <7e3fe210-6dbc-fc29-dbb8-b951e89cf7e9@yahoo.co.jp> (Takayuki Suwa's message of "Wed, 3 Aug 2022 20:17:23 +0900")

Takayuki 'January June' Suwa <jjsuwa_sys3175@yahoo.co.jp> writes:
> Thanks for your response.
>
> On 2022/08/03 16:52, Richard Sandiford wrote:
>> Takayuki 'January June' Suwa via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>>> Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps
>>> data flow consistent, but it also increases register allocation pressure
>>> and thus often creates many unwanted register-to-register moves that
>>> cannot be optimized away.
>> 
>> There are two things here:
>> 
>> - If emit_move_complex_parts emits a clobber of a hard register,
>>   then that's probably a bug/misfeature.  The point of the clobber is
>>   to indicate that the register has no useful contents.  That's useful
>>   for wide pseudos that are written to in parts, since it avoids the
>>   need to track the liveness of each part of the pseudo individually.
>>   But it shouldn't be necessary for hard registers, since subregs of
>>   hard registers are simplified to hard registers wherever possible
>>   (which on most targets is "always").
>> 
>>   So I think the emit_move_complex_parts clobber should be restricted
>>   to !HARD_REGISTER_P, like the lower-subreg clobber is.  If that helps
>>   (if only partly) then it would be worth doing as its own patch.
>> 
>> - I think it'd be worth looking into more detail why a clobber makes
>>   a difference to register pressure.  A clobber of a pseudo register R
>>   shouldn't make R conflict with things that are live at the point of
>>   the clobber.
>
> I agree with its worth.
> In fact, aside from other ports, on the xtensa one, RA in code with frequent D[FC]mode pseudos is terribly bad.
> For example, in __muldc3 on libgcc2, the size of the stack frame reserved will almost double depending on whether or not this patch is applied.

Yeah, that's a lot.

>>>  It seems just analogous to partial register
>>> stall which is a famous problem on processors that do register renaming.
>>>
>>> In my opinion, when the register to be clobbered is a composite of hard
>>> ones, we should clobber the individual elements separetely, otherwise
>>> clear the entire to zero prior to use as the "init-regs" pass does (like
>>> partial register stall workarounds on x86 CPUs).  Such redundant zero
>>> constant assignments will be removed later in the "cprop_hardreg" pass.
>> 
>> I don't think we should rely on the zero being optimised away later.
>> 
>> Emitting the zero also makes it harder for the register allocator
>> to elide the move.  For example, if we have:
>> 
>>   (set (subreg:SI (reg:DI P) 0) (reg:SI R0))
>>   (set (subreg:SI (reg:DI P) 4) (reg:SI R1))
>> 
>> then there is at least a chance that the RA could assign hard registers
>> R0:R1 to P, which would turn the moves into nops.  If we emit:
>> 
>>   (set (reg:DI P) (const_int 0))
>> 
>> beforehand then that becomes impossible, since R0 and R1 would then
>> conflict with P.
>
> Ah, surely, as you pointed out for targets where "(reg: DI)" corresponds to one hard register.

I was thinking here about the case where (reg:DI …) corresponds to
2 hard registers.  Each subreg move is then a single hard register
copy, but assigning P to the combination R0:R1 can remove both of
the subreg moves.

>> TBH I'm surprised we still run init_regs for LRA.  I thought there was
>> a plan to stop doing that, but perhaps I misremember.
>
> Sorry I am not sure about the status of LRA... because the xtensa port is still using reload.

Ah, hadn't realised that.  If you have time to work on it, it would be
really good to move over to LRA.  There are plans to remove old reload.

It might be that old reload *does* treat a pseudo clobber as a conflict.
I can't remember now.  If so, then zeroing the register wouldn't be
too bad (for old reload only).

> As conclusion, trying to tweak the common code side may have been a bit premature.
> I'll consider if I can deal with those issues on the side of the target-specific code.

It's likely to be at least partly a target-independent issue, so tweaking
the common code makes sense in principle.

Does adding !HARD_REGISTER_P (x) to:

  /* Show the output dies here.  This is necessary for SUBREGs
     of pseudos since we cannot track their lifetimes correctly;
     hard regs shouldn't appear here except as return values.  */
  if (!reload_completed && !reload_in_progress
      && REG_P (x) && !reg_overlap_mentioned_p (x, y))
    emit_clobber (x);

in emit_move_complex_parts help?  If so, I think we should do at
least that much.

Thanks,
Richard

  reply	other threads:[~2022-08-04  9:49 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-03  1:35 Takayuki 'January June' Suwa
2022-08-03  7:52 ` Richard Sandiford
2022-08-03 11:17   ` Takayuki 'January June' Suwa
2022-08-04  9:49     ` Richard Sandiford [this message]
2022-08-04 12:35       ` Takayuki 'January June' Suwa
2022-08-05 16:20         ` Jeff Law
2022-10-14 11:06           ` [PATCH] xtensa: Prepare the transition from Reload to LRA Takayuki 'January June' Suwa
2022-10-16  5:03             ` Max Filippov
2022-10-18  2:57               ` [PATCH v2] " Takayuki 'January June' Suwa
2022-10-18  3:14                 ` Max Filippov
2022-10-18 12:16                   ` Max Filippov
2022-10-19  8:16                     ` [PATCH v3] " Takayuki 'January June' Suwa
2022-10-19 11:31                       ` Max Filippov
2022-10-25 20:09                       ` Jan-Benedict Glaw
2022-10-26  3:23                         ` Takayuki 'January June' Suwa
2022-10-26  6:27                         ` [PATCH] xtensa: Fix out-of-bounds array access Takayuki 'January June' Suwa
2022-10-26 17:05                           ` Max Filippov
2022-08-05 16:12       ` [PATCH] lower-subreg, expr: Mitigate inefficiencies derived from "(clobber (reg X))" followed by "(set (subreg (reg X)) (...))" Jeff Law
2022-08-03 17:23   ` Jeff Law
2022-08-04  9:39     ` Richard Sandiford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mptzggkmj83.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jjsuwa_sys3175@yahoo.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).