public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Hongtao Liu <crazylht@gmail.com>
Cc: Uros Bizjak <ubizjak@gmail.com>, Jakub Jelinek <jakub@redhat.com>,
	Alan Modra <amodra@gmail.com>,
		"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.
Date: Tue, 03 Sep 2019 11:24:00 -0000	[thread overview]
Message-ID: <CAFiYyc1_aVbNA-GrL=RZCMHz3gB0GO5H7Ac_SW6ko_1vu_pzHw@mail.gmail.com> (raw)
In-Reply-To: <CAMZc-bwwsGRHgUzYjZLju+ZrPx6X1bfK31nqR2RaBgK1vcFmAg@mail.gmail.com>

On Tue, Sep 3, 2019 at 9:57 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Mon, Sep 2, 2019 at 4:41 PM Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > On Mon, Sep 2, 2019 at 10:13 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > > > which is not the case with core_cost (and similar with skylake_cost):
> > > >
> > > >   2, 2, 4,                /* cost of moving XMM,YMM,ZMM register */
> > > >   {6, 6, 6, 6, 12},            /* cost of loading SSE registers
> > > >                        in 32,64,128,256 and 512-bit */
> > > >   {6, 6, 6, 6, 12},            /* cost of storing SSE registers
> > > >                        in 32,64,128,256 and 512-bit */
> > > >   2, 2,                    /* SSE->integer and integer->SSE moves */
> > > >
> > > > We have the same cost of moving between integer registers (by default
> > > > set to 2), between SSE registers and between integer and SSE register
> > > > sets. I think that at least the cost of moves between regsets should
> > > > be substantially higher, rs6000 uses 3x cost of intra-regset moves;
> > > > that would translate to the value of 6. The value should be low enough
> > > > to keep the cost below the value that forces move through the memory.
> > > > Changing core register allocation cost of SSE <-> integer to:
> > > >
> > > > --cut here--
> > > > Index: config/i386/x86-tune-costs.h
> > > > ===================================================================
> > > > --- config/i386/x86-tune-costs.h        (revision 275281)
> > > > +++ config/i386/x86-tune-costs.h        (working copy)
> > > > @@ -2555,7 +2555,7 @@ struct processor_costs core_cost = {
> > > >                                            in 32,64,128,256 and 512-bit */
> > > >    {6, 6, 6, 6, 12},                    /* cost of storing SSE registers
> > > >                                            in 32,64,128,256 and 512-bit */
> > > > -  2, 2,                                        /* SSE->integer and
> > > > integer->SSE moves */
> > > > +  6, 6,                                        /* SSE->integer and
> > > > integer->SSE moves */
> > > >    /* End of register allocator costs.  */
> > > >    },
> > > >
> > > > --cut here--
> > > >
> > > > still produces direct move in gcc.target/i386/minmax-6.c
> > > >
> > > > I think that in addition to attached patch, values between 2 and 6
> > > > should be considered in benchmarking. Unfortunately, without access to
> > > > regressed SPEC tests, I can't analyse these changes by myself.
> > > >
> > > > Uros.
> > >
> > > Apply similar change to skylake_cost, on skylake workstation we got
> > > performance like:
> > > ---------------------------
> > > version                                                            |
> > > 548_exchange_r score
> > > gcc10_20180822:                                           |   10
> > > apply remove_max8                                       |   8.9
> > > also apply increase integer_tofrom_sse cost |   9.69
> > > -----------------------------
> > > Still 3% regression which is related to _gfortran_mminloc0_4_i4 in
> > > libgfortran.so.5.0.0.
> > >
> > > I found suspicious code as bellow, does it affect?
> >
> > Hard to say without access to the test, but I'm glad that changing the
> > knob has noticeable effect. I think that (as said by Alan) a fine-tune
> > of register pressure calculation will be needed to push this forward.
> >
> > Uros.
> >
> > > ------------------
> > > modified   gcc/config/i386/i386-features.c
> > > @@ -590,7 +590,7 @@ general_scalar_chain::compute_convert_gain ()
> > >    if (dump_file)
> > >      fprintf (dump_file, "  Instruction conversion gain: %d\n", gain);
> > >
> > > -  /* ???  What about integer to SSE?  */
> > > +  /* ???  What about integer to SSE?  */???
> > >    EXECUTE_IF_SET_IN_BITMAP (defs_conv, 0, insn_uid, bi)
> > >      cost += DF_REG_DEF_COUNT (insn_uid) * ix86_cost->sse_to_integer;
> > > ------------------
> > > --
> > > BR,
> > > Hongtao
>
> Note:
> Removing limit of cost would introduce lots of regressions in SPEC2017 as follow
> --------------------------------
> 531.deepsjeng_r  -7.18%
> 548.exchange_r  -6.70%
> 557.xz_r -6.74%
> 508.namd_r -2.81%
> 527.cam4_r -6.48%
> 544.nab_r -3.99%
>
> Tested on skylake server.
> -------------------------------------
> How about  changing cost from 2 to 8 until we figure out a better number.

Certainly works for me.  Note the STV code uses the "other" sse_to_integer
number and the numbers in question here are those for the RA.  There's
a multitude of values used in the tables here, including some a lot larger.
So the overall bumping to 8 certainly was the wrong thing to do and instead
individual numbers should have been adjusted (didn't look at the history
of that bumping).  For example Pentium4 has quite high bases for move
costs, like xmm <-> xmm move costing 12 and SSE->integer costing 20
while the opposite 12.

So yes, we want to revert the patch by applying its effect to the
individual cost tables so we can revisit this for the still interesting
micro-architectures.

Richard.

> --
> BR,
> Hongtao

  reply	other threads:[~2019-09-03 11:24 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-28 16:12 [PATCH, i386]: Improve STV conversion of shifts Uros Bizjak
2019-08-29  8:49 ` Uros Bizjak
2019-08-29 10:11   ` [PATCH, i386]: Fix secondary_reload_needed (was: Re: [PATCH, i386]: Improve STV conversion of shifts) Uros Bizjak
2019-08-29 10:17     ` Uros Bizjak
2019-08-30  7:40   ` [PATCH, i386]: Improve STV conversion of shifts Richard Biener
2019-08-30  8:29     ` [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8 Uros Bizjak
2019-08-31  3:01       ` Alan Modra
2019-08-31 16:12         ` Richard Biener
2019-09-01 19:49           ` Uros Bizjak
2019-09-02  7:16             ` Alan Modra
2019-09-02  8:13             ` Hongtao Liu
2019-09-02  8:41               ` Uros Bizjak
2019-09-03  7:57                 ` Hongtao Liu
2019-09-03 11:24                   ` Richard Biener [this message]
2019-09-03 11:33                     ` Richard Biener
2019-09-03 16:50                       ` Uros Bizjak
2019-09-04  1:42                         ` Hongtao Liu
2019-09-05  5:47                           ` Hongtao Liu
2019-09-05  8:54                             ` Uros Bizjak
2019-09-06 19:32                               ` [PATCH, i386]: Fix PR 91654: Runtime SPEC regression on Haswell Uros Bizjak
2019-09-02 10:23               ` [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8 Richard Biener
2019-09-03  1:37                 ` Hongtao Liu
2019-08-29 20:01 Uros Bizjak
2019-08-30  7:12 ` Hongtao Liu
2019-08-30  7:16   ` Hongtao Liu
2019-08-30  7:22   ` Uros Bizjak
2019-08-30  7:30     ` Hongtao Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFiYyc1_aVbNA-GrL=RZCMHz3gB0GO5H7Ac_SW6ko_1vu_pzHw@mail.gmail.com' \
    --to=richard.guenther@gmail.com \
    --cc=amodra@gmail.com \
    --cc=crazylht@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=ubizjak@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).