public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/91246] vectorization failure for a small loop to search array element
From: avieira at gcc dot gnu.org @ 2020-03-18 12:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91246

--- Comment #5 from avieira at gcc dot gnu.org ---
I have posted a prototype on the mailing list:
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/541908.html

This is really just a prototype to investigate the code-gen impact and whether
it makes sense to do something like this; I don't expect to commit it as is.


* [Bug tree-optimization/91246] vectorization failure for a small loop to search array element
From: d_vampile at 163 dot com @ 2022-03-14 13:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91246

d_vampile <d_vampile at 163 dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |d_vampile at 163 dot com

--- Comment #6 from d_vampile <d_vampile at 163 dot com> ---
(In reply to Jiangning Liu from comment #3)
> Expect to vectorize the inner loop by generating the code below for x86,
> 
> vpbroadcastd [mem], ymm0
> vpaddd [mem], ymm0, ymm1
> vpbroadcastd reg, ymm2
> vpcmpeqd ymm2, ymm1, k0
> kortestw k0, k0
> cmovne ...
> 
> AArch64 should have vectorization instructions counterpart to implement the
> same functionality.

I see that on x86, the result of a vpcmpeqb comparison can be collected into a
general-purpose register with the vpmovmskb instruction. I wonder if there is a
similar instruction for efficiently recording the result of a vectorized
comparison on NEON?

For example, on x86:
..
vpcmpeqb %ymm0, %ymm1, %ymm0
vpmovmskb %ymm0, %ebx
cmp $0xffffffff, %ebx
..
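
To make clear what result is being asked about, here is a minimal scalar
sketch in portable C of what the three x86 instructions above compute. It is
only a model of the semantics, not vectorized code and not a NEON answer. The
function names (cmpeq_bytes, movemask_bytes, all_bytes_equal) are hypothetical
and chosen for illustration; the 32-byte width is assumed from the use of ymm
registers.

```c
#include <stdint.h>

/* Scalar model of vpcmpeqb on 32-byte vectors (hypothetical helper name):
 * each output byte is 0xFF where the inputs match, 0x00 otherwise. */
static void cmpeq_bytes(const uint8_t a[32], const uint8_t b[32],
                        uint8_t out[32])
{
    for (int i = 0; i < 32; i++)
        out[i] = (a[i] == b[i]) ? 0xFF : 0x00;
}

/* Scalar model of vpmovmskb (hypothetical helper name):
 * bit i of the result is the most significant bit of byte i. */
static uint32_t movemask_bytes(const uint8_t v[32])
{
    uint32_t mask = 0;
    for (int i = 0; i < 32; i++)
        mask |= (uint32_t)(v[i] >> 7) << i;
    return mask;
}

/* Models the final cmp against 0xffffffff: returns 1 only when
 * every one of the 32 byte lanes compared equal. */
static int all_bytes_equal(const uint8_t a[32], const uint8_t b[32])
{
    uint8_t eq[32];
    cmpeq_bytes(a, b, eq);
    return movemask_bytes(eq) == 0xffffffffu;
}
```

The point of the mask is that one scalar compare then decides the branch for
all 32 lanes at once. A clear bit in the mask also gives the index of a
mismatching lane.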
