From: Richard Biener <richard.guenther@gmail.com>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: xionghu luo <luoxhu@linux.ibm.com>,
GCC Patches <gcc-patches@gcc.gnu.org>,
David Edelsohn <dje.gcc@gmail.com>,
Bill Schmidt <wschmidt@linux.ibm.com>,
Jiufu Guo <guojiufu@linux.ibm.com>,
linkw@gcc.gnu.org, Richard Sandiford <richard.sandiford@arm.com>
Subject: Re: [PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]
Date: Fri, 25 Sep 2020 08:58:35 +0200
Message-ID: <CAFiYyc3fdkkeHKSC6a6KHRj4hGXUJZJb4xdmrVVB689KsGae0Q@mail.gmail.com>
In-Reply-To: <20200924193628.GC28786@gate.crashing.org>

On Thu, Sep 24, 2020 at 9:38 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> Hi!
>
> On Thu, Sep 24, 2020 at 04:55:21PM +0200, Richard Biener wrote:
> > Btw, on x86_64 the following produces sth reasonable:
> >
> > #define N 32
> > typedef int T;
> > typedef T V __attribute__((vector_size(N)));
> > V setg (V v, int idx, T val)
> > {
> > V valv = (V){idx, idx, idx, idx, idx, idx, idx, idx};
> > V mask = ((V){0, 1, 2, 3, 4, 5, 6, 7} == valv);
> > v = (v & ~mask) | (valv & mask);
> > return v;
> > }
> >
> > vmovd %edi, %xmm1
> > vpbroadcastd %xmm1, %ymm1
> > vpcmpeqd .LC0(%rip), %ymm1, %ymm2
> > vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
> > ret
> >
> > I'm quite sure you could do sth similar on power?
>
> This only allows inserting aligned elements. Which is probably fine
> of course (we don't allow elements that straddle vector boundaries
> either, anyway).
>
> And yes, we can do that :-)
>
> That should be
> #define N 32
> typedef int T;
> typedef T V __attribute__((vector_size(N)));
> V setg (V v, int idx, T val)
> {
> V valv = (V){val, val, val, val, val, val, val, val};
> V idxv = (V){idx, idx, idx, idx, idx, idx, idx, idx};
> V mask = ((V){0, 1, 2, 3, 4, 5, 6, 7} == idxv);
> v = (v & ~mask) | (valv & mask);
> return v;
> }
Whoops yeah, simplified it a bit too much ;)
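As a quick sanity check of the corrected version, here is a minimal sketch that compares the mask-and-select insert against a plain element store. It assumes 16-byte vectors (four int lanes) rather than the 32-byte ones above, simply to keep it portable without AVX; set_ref and same are hypothetical helper names that do not appear in the thread.

```c
#include <assert.h>

#define N 16                      /* assumption: 16-byte vectors, 4 int lanes */
typedef int T;
typedef T V __attribute__((vector_size(N)));
enum { NELT = N / sizeof(T) };

/* Mask-and-select insert, same shape as the corrected setg. */
static V setg(V v, int idx, T val)
{
    V valv = (V){val, val, val, val};
    V idxv = (V){idx, idx, idx, idx};
    V mask = ((V){0, 1, 2, 3} == idxv);   /* -1 in the matching lane, else 0 */
    return (v & ~mask) | (valv & mask);
}

/* Reference: ordinary variable-index element store (hypothetical helper). */
static V set_ref(V v, int idx, T val)
{
    v[idx] = val;
    return v;
}

/* Lane-by-lane equality (hypothetical helper). */
static int same(V a, V b)
{
    for (int i = 0; i < NELT; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

For every in-range idx the two should agree. For an out-of-range idx the mask is all zeros, so setg returns v unchanged, whereas the plain store would be undefined.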
> after which I get (-march=znver2)
>
> setg:
> vmovd %edi, %xmm1
> vmovd %esi, %xmm2
> vpbroadcastd %xmm1, %ymm1
> vpbroadcastd %xmm2, %ymm2
> vpcmpeqd .LC0(%rip), %ymm1, %ymm1
> vpandn %ymm0, %ymm1, %ymm0
> vpand %ymm2, %ymm1, %ymm1
> vpor %ymm0, %ymm1, %ymm0
> ret
With -march=znver2 -O2 I get
vmovd %edi, %xmm1
vmovd %esi, %xmm2
vpbroadcastd %xmm1, %ymm1
vpbroadcastd %xmm2, %ymm2
vpcmpeqd .LC0(%rip), %ymm1, %ymm1
vpblendvb %ymm1, %ymm2, %ymm0, %ymm0
and with -mavx512vl
vpbroadcastd %edi, %ymm1
vpcmpd $0, .LC0(%rip), %ymm1, %k1
vpbroadcastd %esi, %ymm0{%k1}
A broadcast with mask, heh. It would be interesting to see whether we
manage to combine v[idx1] = val; v[idx2] = val; ;)
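At the source level, the combined form of two such inserts could look like the sketch below: the two lane masks are ORed and a single select does both stores. This only illustrates the idea; it makes no claim about what GCC actually emits for it. set2 is a hypothetical name, and 16-byte vectors are assumed for brevity.

```c
#include <assert.h>

typedef int T;
typedef T V __attribute__((vector_size(16)));   /* assumption: 4 int lanes */

/* Equivalent of v[idx1] = val; v[idx2] = val; with one select. */
static V set2(V v, int idx1, int idx2, T val)
{
    V lanes = {0, 1, 2, 3};
    V valv  = {val, val, val, val};
    V m1    = (lanes == (V){idx1, idx1, idx1, idx1});
    V m2    = (lanes == (V){idx2, idx2, idx2, idx2});
    V mask  = m1 | m2;                           /* lanes hit by either index */
    return (v & ~mask) | (valv & mask);
}
```

With idx1 == idx2 this degenerates to the single-insert case, since both masks are identical.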
Now, with SSE4.2 the 16-byte case compiles to
setg:
.LFB0:
.cfi_startproc
movd %edi, %xmm3
movdqa %xmm0, %xmm1
movd %esi, %xmm4
pshufd $0, %xmm3, %xmm0
pcmpeqd .LC0(%rip), %xmm0
movdqa %xmm0, %xmm2
pandn %xmm1, %xmm2
pshufd $0, %xmm4, %xmm1
pand %xmm1, %xmm0
por %xmm2, %xmm0
ret
since there's no blend with a variable mask IIRC.
With aarch64 and SVE it doesn't handle the 32-byte case at all;
the 16-byte case compiles to
setg:
.LFB0:
.cfi_startproc
adrp x2, .LC0
dup v1.4s, w0
dup v2.4s, w1
ldr q3, [x2, #:lo12:.LC0]
cmeq v1.4s, v1.4s, v3.4s
bit v0.16b, v2.16b, v1.16b
which looks equivalent to the AVX2 code.
For all of those, varying the vector element type may also
cause "issues", I guess.
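One source-level wrinkle with other element types (possibly not the one meant here): GNU C does not accept the bitwise operators on floating-point vectors, so the float variant has to do the select on an integer view of the data. A minimal sketch, assuming 16-byte vectors; setg_f, VF and VI are hypothetical names.

```c
#include <assert.h>

typedef float F;
typedef F   VF __attribute__((vector_size(16)));   /* 4 float lanes */
typedef int VI __attribute__((vector_size(16)));   /* same-size integer view */

/* Float insert: compare on int lanes, select on the bit patterns. */
static VF setg_f(VF v, int idx, F val)
{
    VF valv = {val, val, val, val};
    VI idxv = {idx, idx, idx, idx};
    VI mask = ((VI){0, 1, 2, 3} == idxv);
    /* Casts between same-size vector types reinterpret the bits. */
    VI r = ((VI)v & ~mask) | ((VI)valv & mask);
    return (VF)r;
}
```

The generated code may well differ per target and element width, which this sketch does not try to predict.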
> .LC0:
> .long 0
> .long 1
> .long 2
> .long 3
> .long 4
> .long 5
> .long 6
> .long 7
>
> and for powerpc (changing it to 16B vectors, -mcpu=power9) it is
>
> setg:
> addis 9,2,.LC0@toc@ha
> mtvsrws 32,5
> mtvsrws 33,6
> addi 9,9,.LC0@toc@l
> lxv 45,0(9)
> vcmpequw 0,0,13
> xxsel 34,34,33,32
> blr
>
> .LC0:
> .long 0
> .long 1
> .long 2
> .long 3
>
> (We can generate that 0..3 vector without doing loads; I guess x86 can
> do that as well? But it takes more than one insn to do (of course we
> have to set up the memory address first *with* the load, heh).)
>
> For power8 it becomes (we need to splat in separate insns):
>
> setg:
> addis 9,2,.LC0@toc@ha
> mtvsrwz 32,5
> mtvsrwz 33,6
> addi 9,9,.LC0@toc@l
> lxvw4x 45,0,9
> xxspltw 32,32,1
> xxspltw 33,33,1
> vcmpequw 0,0,13
> xxsel 34,34,33,32
> blr
>
>
> Segher