public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Roger Sayle <roger@nextmovesoftware.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] PR rtl-optimization/110587: Reduce useless moves in compile-time hog.
Date: Thu, 27 Jul 2023 21:05:25 +0200	[thread overview]
Message-ID: <D5E0D28B-98E0-4422-ADD2-9E789B7D3FA3@gmail.com> (raw)
In-Reply-To: <001601d9c0ad$93934e00$bab9ea00$@nextmovesoftware.com>



> Am 27.07.2023 um 19:12 schrieb Roger Sayle <roger@nextmovesoftware.com>:
> 
> 
> Hi Richard,
> 
> You're 100% right.  It’s possible to significantly clean-up this code, replacing
> the body of the conditional with a call to force_reg and simplifying the conditions
> under which it is called.  These improvements are implemented in the patch
> below, which has been tested on x86_64-pc-linux-gnu, with a bootstrap and
> make -k check, both with and without -m32, as usual.
> 
> Interestingly, the CONCAT clause afterwards is still required (I've learned something
> new),  as calling force_reg (or gen_reg_rtx) with HCmode, actually returns a CONCAT
> instead of a REG,

Heh, interesting.

> so although the code looks dead, it's required to build libgcc during
> a bootstrap.  But the remaining clean-up is good, reducing the number of source lines
> and making the logic easier to understand.
> 
> Ok for mainline?

Ok.

Thanks,
Richard 

> 2023-07-27  Roger Sayle  <roger@nextmovesoftware.com>
>            Richard Biener  <rguenther@suse.de>
> 
> gcc/ChangeLog
>        PR middle-end/28071
>        PR rtl-optimization/110587
>        * expr.cc (emit_group_load_1): Simplify logic for calling
>        force_reg on ORIG_SRC, to avoid making a copy if the source
>        is already in a pseudo register.
> 
> Roger
> --
> 
>> -----Original Message-----
>> From: Richard Biener <richard.guenther@gmail.com>
>> Sent: 25 July 2023 12:50
>> 
>>> On Tue, Jul 25, 2023 at 1:31 PM Roger Sayle <roger@nextmovesoftware.com>
>>> wrote:
>>> 
>>> This patch is the third in series of fixes for PR
>>> rtl-optimization/110587, a compile-time regression with -O0, that
>>> attempts to address the underlying cause.  As noted previously, the
>>> pathological test case pr28071.c contains a large number of useless
>>> register-to-register moves that can produce quadratic behaviour (in
>>> LRA).  These move are generated during RTL expansion in
>>> emit_group_load_1, where the middle-end attempts to simplify the
>>> source before calling extract_bit_field.  This is reasonable if the
>>> source is a complex expression (from before the tree-ssa optimizers),
>>> or a SUBREG, or a hard register, but it's not particularly useful to
>>> copy a pseudo register into a new pseudo register.  This patch eliminates that
>> redundancy.
>>> 
>>> The -fdump-tree-expand for pr28071.c compiled with -O0 currently
>>> contains 777K lines, with this patch it contains 717K lines, i.e.
>>> saving about 60K lines (admittedly of debugging text output, but it makes the
>> point).
>>> 
>>> 
>>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>>> and make -k check, both with and without --target_board=unix{-m32}
>>> with no new failures.  Ok for mainline?
>>> 
>>> As always, I'm happy to revert this change quickly if there's a
>>> problem, and investigate why this additional copy might (still) be
>>> needed on other
>>> non-x86 targets.
>> 
>> @@ -2622,6 +2622,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src,
>> tree type,
>>         be loaded directly into the destination.  */
>>       src = orig_src;
>>       if (!MEM_P (orig_src)
>> +         && (!REG_P (orig_src) || HARD_REGISTER_P (orig_src))
>>          && (!CONSTANT_P (orig_src)
>>              || (GET_MODE (orig_src) != mode
>>                  && GET_MODE (orig_src) != VOIDmode)))
>> 
>> so that means the code guarded by the conditional could instead be transformed
>> to
>> 
>>   src = force_reg (mode, orig_src);
>> 
>> ?  Btw, the || (GET_MODE (orig_src) != mode && GET_MODE (orig_src) !=
>> VOIDmode) case looks odd as in that case we'd use GET_MODE (orig_src) for the
>> move ... that might also mean we have to use force_reg (GET_MODE (orig_src) ==
>> VOIDmode ? mode : GET_MODE (orig_src), orig_src))
>> 
>> Otherwise I think this is OK, as said, using force_reg somehow would improve
>> readability here I think.
>> 
>> I also wonder how the
>> 
>>      else if (GET_CODE (src) == CONCAT)
>> 
>> case will ever trigger with the current code.
>> 
>> Richard.
>> 
>>> 
>>> 2023-07-25  Roger Sayle  <roger@nextmovesoftware.com>
>>> 
>>> gcc/ChangeLog
>>>        PR middle-end/28071
>>>        PR rtl-optimization/110587
>>>        * expr.cc (emit_group_load_1): Avoid copying a pseudo register into
>>>        a new pseudo register, i.e. only copy hard regs into a new pseudo.
>>> 
>>> 
> 
> <patchhg4.txt>

      reply	other threads:[~2023-07-27 19:05 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-25 11:31 Roger Sayle
2023-07-25 11:50 ` Richard Biener
2023-07-27 17:12   ` Roger Sayle
2023-07-27 19:05     ` Richard Biener [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D5E0D28B-98E0-4422-ADD2-9E789B7D3FA3@gmail.com \
    --to=richard.guenther@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=roger@nextmovesoftware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).