From: Hongtao Liu <crazylht@gmail.com>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: Richard Biener <richard.guenther@gmail.com>,
Jakub Jelinek <jakub@redhat.com>, Alan Modra <amodra@gmail.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.
Date: Thu, 05 Sep 2019 05:47:00 -0000 [thread overview]
Message-ID: <CAMZc-bz1Mb-w45c-SQQfHXFe-AVptuk2juXzFdNNQH_4dGSv7w@mail.gmail.com> (raw)
In-Reply-To: <CAMZc-byKUUgD6GUcGkHiKdNk-wXijG97A7jeNpYeEuCeK-LXaw@mail.gmail.com>
On Wed, Sep 4, 2019 at 9:44 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Wed, Sep 4, 2019 at 12:50 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > On Tue, Sep 3, 2019 at 1:33 PM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> >
> > > > > Note:
> > > > > Removing limit of cost would introduce lots of regressions in SPEC2017 as follow
> > > > > --------------------------------
> > > > > 531.deepsjeng_r -7.18%
> > > > > 548.exchange_r -6.70%
> > > > > 557.xz_r -6.74%
> > > > > 508.namd_r -2.81%
> > > > > 527.cam4_r -6.48%
> > > > > 544.nab_r -3.99%
> > > > >
> > > > > Tested on skylake server.
> > > > > -------------------------------------
> > > > > How about changing cost from 2 to 8 until we figure out a better number.
> > > >
> > > > Certainly works for me. Note the STV code uses the "other" sse_to_integer
> > > > number and the numbers in question here are those for the RA. There's
> > > > a multitude of values used in the tables here, including some a lot larger.
> > > > So the overall bumping to 8 certainly was the wrong thing to do and instead
> > > > individual numbers should have been adjusted (didn't look at the history
> > > > of that bumping).
> > >
> > > For reference:
> > >
> > > r125951 | uros | 2007-06-22 19:51:06 +0200 (Fri, 22 Jun 2007) | 6 lines
> > >
> > > PR target/32413
> > > * config/i386/i386.c (ix86_register_move_cost): Rise the cost of
> > > moves between MMX/SSE registers to at least 8 units to prevent
> > > ICE caused by non-tieable SI/HI/QImodes in SSE registers.
> > >
> > > should probably have been "twice the cost of X" or something like that
> > > instead where X be some reg-reg move cost.
> >
> > Thanks for the reference. It looks that the patch fixes the issue in
> > the wrong place, this should be solved in
> > inline_secondary_memory_needed:
> >
> > /* Between SSE and general, we have moves no larger than word size. */
> > if (!(INTEGER_CLASS_P (class1) || INTEGER_CLASS_P (class2))
> > || GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode)
> > || GET_MODE_SIZE (mode) > UNITS_PER_WORD)
> > return true;
> >
> > as an alternative to implement QI and HImode moves as a SImode move
> > between SSE and int<->SSE registers. We have
> > ix86_secondary_memory_needed_mode that extends QI and HImode secondary
> > memory to SImode, so this should solve PR32413.
> >
> > Other than that, what to do with the bizzare property of direct moves
> > that benchmark far worse than indirect moves? I was expecting that
> > keeping the cost of direct inter-regset moves just a bit below the
> > cost of int<->mem<->xmm, but (much ?) higher than itra-regset moves
> > would prevent unwanted wandering of values between register sets,
> > while still generating the direct move when needed. While this almost
>
> I've not tested it yet.
> So i'll start a test about this patch(change cost from 2-->6) with
> Richard's change.
> I'll keep you informed when finishing test.
>
> > fixes the runtime regression, it is not clear to me from Hongtao Liu's
> > message if Richard's 2019-08-27 fixes the remaining regression or
> > not). Liu, can you please clarify?
> >
> --------------------------------
> 531.deepsjeng_r -7.18%
> 548.exchange_r -6.70%
> 557.xz_r -6.74%
> 508.namd_r -2.81%
> 527.cam4_r -6.48%
> 544.nab_r -3.99%
>
> Tested on skylake server.
> -------------------------------------
> Those regressions are comparing gcc10_20190830 to gcc10_20190824 which
> are mainly caused by removing limit of 8.
>
> > > > For example Pentium4 has quite high bases for move
> > > > costs, like xmm <-> xmm move costing 12 and SSE->integer costing 20
> > > > while the opposite 12.
> > > >
> > > > So yes, we want to revert the patch by applying its effect to the
> > > > individual cost tables so we can revisit this for the still interesting
> > > > micro-architectures.
> >
> > Uros.
>
>
>
> --
> BR,
> Hongtao
Change cost from 2->6 got
-------------
531.deepsjeng_r 9.64%
548.exchange_r 10.24%
557.xc_r 7.99%
508.namd_r 1.08%
527.cam4_r 6.91%
553.nab_r 3.06%
------------
for 531,548,557,527, even better comparing to version before regression.
for 508,533, still little regressions comparing to version before regression.
--
BR,
Hongtao
next prev parent reply other threads:[~2019-09-05 5:47 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-28 16:12 [PATCH, i386]: Improve STV conversion of shifts Uros Bizjak
2019-08-29 8:49 ` Uros Bizjak
2019-08-29 10:11 ` [PATCH, i386]: Fix secondary_reload_needed (was: Re: [PATCH, i386]: Improve STV conversion of shifts) Uros Bizjak
2019-08-29 10:17 ` Uros Bizjak
2019-08-30 7:40 ` [PATCH, i386]: Improve STV conversion of shifts Richard Biener
2019-08-30 8:29 ` [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8 Uros Bizjak
2019-08-31 3:01 ` Alan Modra
2019-08-31 16:12 ` Richard Biener
2019-09-01 19:49 ` Uros Bizjak
2019-09-02 7:16 ` Alan Modra
2019-09-02 8:13 ` Hongtao Liu
2019-09-02 8:41 ` Uros Bizjak
2019-09-03 7:57 ` Hongtao Liu
2019-09-03 11:24 ` Richard Biener
2019-09-03 11:33 ` Richard Biener
2019-09-03 16:50 ` Uros Bizjak
2019-09-04 1:42 ` Hongtao Liu
2019-09-05 5:47 ` Hongtao Liu [this message]
2019-09-05 8:54 ` Uros Bizjak
2019-09-06 19:32 ` [PATCH, i386]: Fix PR 91654: Runtime SPEC regression on Haswell Uros Bizjak
2019-09-02 10:23 ` [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8 Richard Biener
2019-09-03 1:37 ` Hongtao Liu
2019-08-29 20:01 Uros Bizjak
2019-08-30 7:12 ` Hongtao Liu
2019-08-30 7:16 ` Hongtao Liu
2019-08-30 7:22 ` Uros Bizjak
2019-08-30 7:30 ` Hongtao Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAMZc-bz1Mb-w45c-SQQfHXFe-AVptuk2juXzFdNNQH_4dGSv7w@mail.gmail.com \
--to=crazylht@gmail.com \
--cc=amodra@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
--cc=richard.guenther@gmail.com \
--cc=ubizjak@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).