public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
From: "linkw at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/100328] IRA doesn't model matching constraint well
Date: Thu, 24 Jun 2021 12:11:36 +0000	[thread overview]
Message-ID: <bug-100328-4-3dr1JMtAk2@http.gcc.gnu.org/bugzilla/> (raw)
In-Reply-To: <bug-100328-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100328

--- Comment #3 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Vladimir Makarov from comment #2)
> (In reply to Kewen Lin from comment #1)
> > Created attachment 50715 [details]
> > ira:consider matching cstr in all alternatives
> > 
> > With little understanding on ira, I am not quite sure this patch is on the
> > reasonable direction. It aims to check the matching constraint in all
> > alternatives, if there is one alternative with matching constraint and
> > matches the current preferred regclass, it will record the output operand
> > number and further create one copy for it. Normally it can get the priority
> > against shuffle copies and the matching constraint will get satisfied with
> > higher possibility, reload doesn't create extra copies to meet the matching
> > constraint or the desirable register class when it has to.
> > 
> > For FMA A,B,C,D, I think ideally copies A/B, A/C, A/D can firstly stay as
> > shuffle copies, and later any of A,B,C,D gets assigned by one hardware
> > register which is a VSX register but not a FP register, which means it has
> > to pay costs once we can NOT go with VSX alternatives, so at that time we
> > can increase the freq for the remaining copies related to this, once the
> > matching constraint gets satisfied further, there aren't any extra costs to
> > pay. This idea seems a bit complicated in the current framework, so the
> > proposed patch aggressively emphasizes the matching constraint at the time
> > of creating copies.
> > 
> > FWIW bootstrapped/regtested on powerpc64le-linux-gnu P9. The evaluation with
> > Power9 SPEC2017 all run shows 505.mcf_r +2.98%, 508.namd_r +3.37%, 519.lbm_r
> > +2.51%, no remarkable degradation is observed.
> 
> Thank you for working on this issue.
> 
> The current implementation of ira_get_dup_out_num was specifically tuned for
> better register allocation for x86-64 div insns.
> 
> Your patch definitely improves code for power9 and I would say significantly
> (congratulations!).  The patch you proposed makes me think that it might
> work for major targets as well.
> 
> I would prefer to avoid introducing new parameter because there are too many
> of them already and its description is cryptic.
> 

Thanks for your comments, Vladimir!  Yeah, Segher also thought it can benefit
other targets and suggested making it on by default, I've made this parameter
on by default in v2, if it's fine on x86-64 and aarch64 with some testing and
benchmarking later, I think we can simply get rid of the parameter as you
suggested. 

> It would be nice if you benchmark the patch on x86-64 too, If there is no
> overall degradation with new behaviour we could remove the parameter and
> make the new behaviour as a default. If it is not, well we will keep the
> parameter.
> 

Sorry that I don't have a x86-64 or aarch64 performance machine at hand, the
new version v2 was bootstrapped/regtested on powerpc64le-linux-gnu P9 and
x86_64-redhat-linux, but had some failures on aarch64, I was still
investigating it.  Once it got root-caused and fixed, I would ask around folks
to help to benchmark this.

> As for the patch itself, I don't like some variable names.  Sorry.  Could
> you use op_regno, out_regno, and present_alt instead of op_no, out_no, tot. 
> Please, in general use longer variable names reflecting their purpose as GCC
> developers reads code in many times more than writing it.

Got it, thanks for the suggestion! This part has been simplified with
recog_op_alt, hope it looks better.

  parent reply	other threads:[~2021-06-24 12:11 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-29  6:41 [Bug rtl-optimization/100328] New: IRA doesn't model dup num " linkw at gcc dot gnu.org
2021-04-30  8:30 ` [Bug rtl-optimization/100328] " linkw at gcc dot gnu.org
2021-06-23 18:28 ` [Bug rtl-optimization/100328] IRA doesn't model matching " vmakarov at gcc dot gnu.org
2021-06-24 12:11 ` linkw at gcc dot gnu.org [this message]
2021-06-24 12:36 ` linkw at gcc dot gnu.org
2021-06-28  3:27 ` linkw at gcc dot gnu.org
2021-06-28  5:25 ` linkw at gcc dot gnu.org
2021-06-29 16:01 ` rsandifo at gcc dot gnu.org
2021-07-01  6:18 ` linkw at gcc dot gnu.org
2021-07-06  2:35 ` cvs-commit at gcc dot gnu.org
2021-07-06  2:35 ` cvs-commit at gcc dot gnu.org
2021-07-07  3:07 ` linkw at gcc dot gnu.org

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-100328-4-3dr1JMtAk2@http.gcc.gnu.org/bugzilla/ \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).