From: Richard Biener <richard.guenther@gmail.com>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: xionghu luo <luoxhu@linux.ibm.com>,
GCC Patches <gcc-patches@gcc.gnu.org>,
David Edelsohn <dje.gcc@gmail.com>,
Bill Schmidt <wschmidt@linux.ibm.com>,
Jiufu Guo <guojiufu@linux.ibm.com>,
linkw@gcc.gnu.org, Richard Sandiford <richard.sandiford@arm.com>
Subject: Re: [PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]
Date: Mon, 28 Sep 2020 10:00:56 +0200
Message-ID: <CAFiYyc2GOKYKNA1DfwS4RyjyL-jLmdzE5f0OhvzPg2ecVREnsg@mail.gmail.com>
In-Reply-To: <20200925223908.GK28786@gate.crashing.org>
On Sat, Sep 26, 2020 at 12:41 AM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> On Fri, Sep 25, 2020 at 08:58:35AM +0200, Richard Biener wrote:
> > On Thu, Sep 24, 2020 at 9:38 PM Segher Boessenkool
> > <segher@kernel.crashing.org> wrote:
> > > after which I get (-march=znver2)
> > >
> > > setg:
> > > vmovd %edi, %xmm1
> > > vmovd %esi, %xmm2
> > > vpbroadcastd %xmm1, %ymm1
> > > vpbroadcastd %xmm2, %ymm2
> > > vpcmpeqd .LC0(%rip), %ymm1, %ymm1
> > > vpandn %ymm0, %ymm1, %ymm0
> > > vpand %ymm2, %ymm1, %ymm1
> > > vpor %ymm0, %ymm1, %ymm0
> > > ret
> >
> > I get with -march=znver2 -O2
> >
> > vmovd %edi, %xmm1
> > vmovd %esi, %xmm2
> > vpbroadcastd %xmm1, %ymm1
> > vpbroadcastd %xmm2, %ymm2
> > vpcmpeqd .LC0(%rip), %ymm1, %ymm1
> > vpblendvb %ymm1, %ymm2, %ymm0, %ymm0
>
> Ah, maybe my x86 compiler is too old...
> x86_64-linux-gcc (GCC) 10.0.0 20190919 (experimental)
> not exactly old, huh. I wonder what I do wrong then.
>
> > Now, with SSE4.2 the 16byte case compiles to
> >
> > setg:
> > .LFB0:
> > .cfi_startproc
> > movd %edi, %xmm3
> > movdqa %xmm0, %xmm1
> > movd %esi, %xmm4
> > pshufd $0, %xmm3, %xmm0
> > pcmpeqd .LC0(%rip), %xmm0
> > movdqa %xmm0, %xmm2
> > pandn %xmm1, %xmm2
> > pshufd $0, %xmm4, %xmm1
> > pand %xmm1, %xmm0
> > por %xmm2, %xmm0
> > ret
> >
> > since there's no blend with a variable mask IIRC.
>
> PowerPC got at least *that* right since time immemorial :-)
>
> > with aarch64 and SVE it doesn't handle the 32byte case at all,
> > the 16byte case compiles to
> >
> > setg:
> > .LFB0:
> > .cfi_startproc
> > adrp x2, .LC0
> > dup v1.4s, w0
> > dup v2.4s, w1
> > ldr q3, [x2, #:lo12:.LC0]
> > cmeq v1.4s, v1.4s, v3.4s
> > bit v0.16b, v2.16b, v1.16b
> >
> > which looks equivalent to the AVX2 code.
>
> Yes, and we can do pretty much the same on Power, too.
>
> > For all of those varying the vector element type may also
> > cause "issues" I guess.
>
> For us, as long as it stays 16B vectors, all should be fine. There may
> be issues in the compiler, but at least the hardware has no problem with
> it ;-)
>
> > > and for powerpc (changing it to 16B vectors, -mcpu=power9) it is
> > >
> > > setg:
> > > addis 9,2,.LC0@toc@ha
> > > mtvsrws 32,5
> > > mtvsrws 33,6
> > > addi 9,9,.LC0@toc@l
> > > lxv 45,0(9)
> > > vcmpequw 0,0,13
> > > xxsel 34,34,33,32
> > > blr
>
> The -mcpu=power10 code right now is just
>
> plxv 45,.LC0@pcrel
> mtvsrws 32,5
> mtvsrws 33,6
> vcmpequw 0,0,13
> xxsel 34,34,33,32
> blr
>
> (exactly the same, but less memory address setup cost), so doing
> something like this as a generic version would work quite well pretty
> much everywhere I think!
Given we don't have a good way to query for variable blend support,
the only change would be to use the bitwise and/or sequence, which
would probably be fine (and eventually be combined). I guess that
could be done in ISEL as well (given support for generating the mask,
of course). Of course, that makes it difficult for targets to opt out
(and generate the in-memory variant instead)...
Richard.
>
> Segher
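For readers following the listings above, here is a minimal C sketch of the
compare-and-select idiom they implement, written with GCC's generic vector
extensions. It is an illustration only, not the code from the patch: the
names v4si and set_elem are hypothetical, and the 16-byte vector of 32-bit
elements is an assumption chosen to match the 16B cases discussed.

```c
#include <stdint.h>

/* Hypothetical 16-byte vector of four 32-bit integers (GCC vector extension). */
typedef int32_t v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: return VEC with element IDX replaced by VAL,
   where IDX is only known at run time.  No memory round trip is needed.  */
static v4si
set_elem (v4si vec, int32_t val, int32_t idx)
{
  const v4si lanes = { 0, 1, 2, 3 };    /* lane numbers, like .LC0 in the listings */
  v4si vidx = { idx, idx, idx, idx };   /* broadcast the index */
  v4si vval = { val, val, val, val };   /* broadcast the new value */
  v4si mask = (vidx == lanes);          /* -1 in the matching lane, 0 elsewhere */

  /* The and/or form of the select: take VAL where the mask is set,
     keep the old element everywhere else.  */
  return (vval & mask) | (vec & ~mask);
}
```

In this sketch an out-of-range index matches no lane, so the vector is
returned unchanged. Whether the and/or sequence ends up as a single blend or
select instruction depends on the target and compiler version, as the
listings in this thread show.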
Thread overview: 24+ messages
2020-09-18 6:17 [PATCH v2 1/2] IFN: Implement IFN_VEC_SET for ARRAY_REF with VIEW_CONVERT_EXPR Xiong Hu Luo
2020-09-18 6:17 ` [PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple [PR79251] Xiong Hu Luo
2020-09-18 20:19 ` Segher Boessenkool
2020-09-24 8:21 ` xionghu luo
2020-09-24 13:27 ` Richard Biener
2020-09-24 14:55 ` Richard Biener
2020-09-24 19:36 ` Segher Boessenkool
2020-09-25 6:58 ` Richard Biener
2020-09-25 12:21 ` Richard Sandiford
2020-09-25 22:39 ` Segher Boessenkool
2020-09-28 8:00 ` Richard Biener [this message]
2020-09-24 19:02 ` Segher Boessenkool
2020-09-25 3:50 ` xionghu luo
2020-09-25 5:00 ` Richard Biener
2020-09-18 18:20 ` [PATCH v2 1/2] IFN: Implement IFN_VEC_SET for ARRAY_REF with VIEW_CONVERT_EXPR Segher Boessenkool
2020-09-21 8:31 ` Richard Biener
2020-09-22 3:55 ` [PATCH v3 " xionghu luo
2020-09-23 11:33 ` Richard Biener
2020-09-24 3:24 ` xionghu luo
2020-09-24 12:39 ` Richard Sandiford
2020-09-25 6:51 ` [PATCH v4 1/3] " xionghu luo
2020-09-25 11:32 ` Richard Biener
2020-09-25 13:28 ` Richard Sandiford
2020-09-27 5:45 ` xionghu luo