From: Dorit Nuzman <DORIT@il.ibm.com>
To: Ira Rosen <IRAR@il.ibm.com>
Cc: gcc@gcc.gnu.org, Richard Henderson <rth@redhat.com>
Subject: Re: targetm.vectorize.builtin_vec_perm
Date: Tue, 17 Nov 2009 12:51:00 -0000
Message-ID: <OFC18A9720.ACBE47C9-ONC2257671.003A0FF6-C2257671.0046D3AE@il.ibm.com>
In-Reply-To: <OF0FC5895C.056A1233-ONC2257671.0028281D-C2257671.00312D6F@il.ibm.com>
...
>
> >
> > I'm contemplating adding a tree- and gimple-level VEC_PERMUTE_EXPR of
> > the form:
> >
> > VEC_PERMUTE_EXPR (vlow, vhigh, vperm)
> >
> > which would be exactly equal to
> >
> > (vec_select
> > (vec_concat vlow vhigh)
> > vperm)
> >
> > at the rtl level. I.e. vperm is an integral vector of the same number
> > of elements as vlow.
> >
> > Truly variable permutation is something that's only supported by ppc
> > and spu.
>
> Also Altivec and SPU support byte permutation (and not only element
> permutation), however, the vectorizer does not make use of this at
> present.
>
Yes. I was trying to think whether it would be useful to express
byte permutations instead of element permutations, but the only two useful
cases that came to mind are already covered by other, probably more
appropriate, idioms.
[One is realignment (for which we use builtin_mask_for_load +
REALIGN_LOAD). The other is the VEC_PACK_TRUNC idiom (where 'vperm' would
have twice as many elements as 'vlow'), but the other VEC_PACK variants
are a little more than just a special case of permute.]
So (unless we want VEC_PERMUTE to cover these cases, which I think we
don't), element-wise permutation should suffice, so this sounds like a
good suggestion to me.
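
For concreteness, here is a minimal scalar sketch of the element-wise
semantics discussed above: out[i] = concat(vlow, vhigh)[vperm[i]]. The
name vec_permute_ref is made up for illustration, and reducing each
selector modulo 2*n is an assumption of this sketch (the thread does not
say what out-of-range selectors should do).

```c
#include <assert.h>
#include <stddef.h>

/* Reference semantics for an element-wise two-input permute:
   out[i] = concat (vlow, vhigh)[vperm[i]].
   vlow, vhigh, vperm and out all have n elements.
   ASSUMPTION: selectors are reduced modulo 2*n; the thread does not
   define out-of-range behaviour.  */
static void
vec_permute_ref (const int *vlow, const int *vhigh,
                 const unsigned *vperm, int *out, size_t n)
{
  assert (n > 0);
  for (size_t i = 0; i < n; i++)
    {
      size_t sel = vperm[i] % (2 * n);
      /* Selectors 0..n-1 pick from vlow, n..2n-1 pick from vhigh.  */
      out[i] = sel < n ? vlow[sel] : vhigh[sel - n];
    }
}
```

With n = 4, a selector of {0, 4, 1, 5} would pick the first two elements
of each input, alternating between them.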
> > Intel AVX has a limited variable permutation -- 64-bit or 32-bit
> > elements can be rearranged but only within a 128-bit subvector.
> > So if you're working with 128-bit vectors, it's fully variable, but if
> > you're working with 256-bit vectors, it's like doing 2 128-bit permute
> > operations in parallel. Intel before AVX has no variable permute.
> >
> > HOWEVER! Most of the useful permutations that I can think of for the
> > optimizers to generate are actually constant. And these can be
> > implemented everywhere (with varying degrees of efficiency).
> >
That's true for the moment, but there are cases where a variable permute
would be useful for vectorization, e.g. where vectors are used as a lookup
table. One example I know of is finding delimiters (e.g. for XML
processing): a lookup table of 256 bits holds one bit per ASCII character
to indicate whether the character is a delimiter, and the scalar code
looks something like this:
  table[256] = {1,0,0,....};
  for (i...)
    if (table[data[i]] == 1)
      {found delimiter}
...and this is vectorized with 2 vector registers that hold the lookup
table and a shift on the input data vector to create the permutation mask
that accesses the table. I think there should be other examples of lookup
tables like that used for vectorization. I have also seen variable permutes
used for sorting
(http://www.dia.eui.upm.es/asignatu/pro_par/articulos/AASort.pdf).
Indeed there are some serious challenges to overcome in order to do all
that automatically in the compiler... but some pattern-matching based
vectorization approach could conceptually do this.
Also, if one day someone were to introduce platform-independent vector
intrinsics, then such a generic permute would allow programmers to take
advantage of it, even in cases that would otherwise be too complicated
for the compiler to auto-vectorize.
So I think it would be nice to allow the more general form, but since it
will probably take a while before we actually make use of it, it's probably
not critical for the short term...
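
To make the delimiter lookup above a bit more concrete, here is a rough
scalar model of one vector step. It assumes 16-byte vectors, so the
256-bit table fits in two of them, and it models the byte-granular permute
with plain array indexing. The names table_set, find_delims and VEC_BYTES
are invented for this sketch, and the way the final bit is tested is my
guess at one workable scheme, not necessarily how the real code does it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { VEC_BYTES = 16 };

/* Mark character C as a delimiter in a 256-bit table stored as two
   16-byte halves (standing in for the two vector registers).  */
static void
table_set (uint8_t lo[VEC_BYTES], uint8_t hi[VEC_BYTES], uint8_t c)
{
  unsigned byte = c >> 3;                   /* 0..31: which table byte */
  uint8_t bit = (uint8_t) (1u << (c & 7));  /* which bit in that byte */
  if (byte < VEC_BYTES)
    lo[byte] |= bit;
  else
    hi[byte - VEC_BYTES] |= bit;
}

/* Scalar model of one vector step over up to 16 input bytes.  */
static void
find_delims (const uint8_t lo[VEC_BYTES], const uint8_t hi[VEC_BYTES],
             const uint8_t *data, uint8_t *is_delim, size_t n)
{
  uint8_t perm[VEC_BYTES], picked[VEC_BYTES];
  assert (n <= VEC_BYTES);

  /* 1. Shift the input to form the permutation mask.  */
  for (size_t i = 0; i < n; i++)
    perm[i] = data[i] >> 3;

  /* 2. Variable permute across the two table registers.  */
  for (size_t i = 0; i < n; i++)
    picked[i] = perm[i] < VEC_BYTES ? lo[perm[i]] : hi[perm[i] - VEC_BYTES];

  /* 3. Test the bit for each character.  ASSUMPTION: done here with a
     per-element shift; real hardware might use another small lookup.  */
  for (size_t i = 0; i < n; i++)
    is_delim[i] = (picked[i] >> (data[i] & 7)) & 1;
}
```

Step 2 is the part that needs a truly variable permute, since the
selector depends on the input data rather than on a compile-time constant.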
> > Anyway, I'm thinking that it might be better to add such a general
> > operation instead of continuing to add things like
> >
> > VEC_EXTRACT_EVEN_EXPR,
> > VEC_EXTRACT_ODD_EXPR,
> > VEC_INTERLEAVE_HIGH_EXPR,
> > VEC_INTERLEAVE_LOW_EXPR,
> >
> > and other obvious patterns like broadcast, duplicate even to odd,
> > duplicate odd to even, etc.
>
agreed
> If the back end will be able to identify specific masks, e.g., {0,2,4,6}
> as extract even operation, then we can certainly remove those codes.
>
agreed
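
As a rough illustration of what recognizing such masks could look like,
here is a small sketch that classifies a constant selector. Everything in
it (enum perm_kind, classify_perm) is hypothetical and not an existing GCC
interface. The interleave cases are named after which halves of the inputs
they combine, since I am leaving open here which one corresponds to HIGH
and which to LOW on a given target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a constant two-input selector.  */
enum perm_kind
{
  PERM_GENERIC,
  PERM_EXTRACT_EVEN,
  PERM_EXTRACT_ODD,
  PERM_INTERLEAVE_FIRST_HALVES,
  PERM_INTERLEAVE_SECOND_HALVES
};

/* SEL has N elements, each indexing into the 2*N-element concatenation
   of the two inputs.  N must be even.  */
static enum perm_kind
classify_perm (const unsigned *sel, size_t n)
{
  int even = 1, odd = 1, first = 1, second = 1;
  assert (n >= 2 && n % 2 == 0);
  for (size_t i = 0; i < n; i++)
    {
      /* Interleaving alternates inputs: element i/2 of the first
         input, then element i/2 of the second input.  */
      size_t il = i / 2 + (i % 2 ? n : 0);
      even &= (sel[i] == 2 * i);
      odd &= (sel[i] == 2 * i + 1);
      first &= (sel[i] == il);
      second &= (sel[i] == il + n / 2);
    }
  if (even)
    return PERM_EXTRACT_EVEN;
  if (odd)
    return PERM_EXTRACT_ODD;
  if (first)
    return PERM_INTERLEAVE_FIRST_HALVES;
  if (second)
    return PERM_INTERLEAVE_SECOND_HALVES;
  return PERM_GENERIC;
}
```

For 4-element vectors, {0,2,4,6} would be classified as extract even and
{0,4,1,5} as an interleave of the first halves; anything unrecognized
falls back to the generic permute.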
dorit
> >
> > I can imagine having some sort of target hook that computed a cost
> > metric for a given constant permutation pattern. For instance, I'd
> > imagine that the interleave patterns are half as expensive as a full
> > permute for altivec, due to not having to load a mask. This hook would
> > be fairly complicated for x86, given all of the permuting insns that
> > were incrementally added in various ISA revisions, but such is life.
> >
> > In any case, would a VEC_PERMUTE_EXPR, as described above, work for the
> > uses of builtin_vec_perm within the vectorizer at present?
>
> Yes.
>
> Ira
>
> >
> >
> > r~
>