public inbox for gcc@gcc.gnu.org
From: Dorit Nuzman <DORIT@il.ibm.com>
To: Ira Rosen <IRAR@il.ibm.com>
Cc: gcc@gcc.gnu.org, Richard Henderson <rth@redhat.com>
Subject: Re: targetm.vectorize.builtin_vec_perm
Date: Tue, 17 Nov 2009 12:51:00 -0000
Message-ID: <OFC18A9720.ACBE47C9-ONC2257671.003A0FF6-C2257671.0046D3AE@il.ibm.com>
In-Reply-To: <OF0FC5895C.056A1233-ONC2257671.0028281D-C2257671.00312D6F@il.ibm.com>

...
>
> >
> > I'm contemplating adding a tree- and gimple-level VEC_PERMUTE_EXPR of
> > the form:
> >
> >    VEC_PERMUTE_EXPR (vlow, vhigh, vperm)
> >
> > which would be exactly equal to
> >
> >    (vec_select
> >      (vec_concat vlow vhigh)
> >      vperm)
> >
> > at the rtl level.  I.e. vperm is an integral vector of the same number
> > of elements as vlow.
> >
> > Truly variable permutation is something that's only supported by ppc and
> > spu.
>
> Also Altivec and SPU support byte permutation (and not only element
> permutation), however, the vectorizer does not make use of this at present.
>

Yes. I was trying to think if it would be useful to express
byte-permutations instead of element-permutations, but the only two useful
cases that came to mind are things we have covered by other, probably more
appropriate, idioms.

[One is realignment (for which we use builtin_mask_for_load +
REALIGN_LOAD). The other is the VEC_PACK_TRUNC idiom (where the number of
elements in 'vperm' would be twice the number of elements in 'vlow'), but
the other VEC_PACK variants are a little more than just a special case of
permute.]

So (unless we want VEC_PERMUTE to cover these cases, which I think we
don't), element-wise permutation should suffice, so this sounds like a good
suggestion to me.
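
To make the element-wise semantics concrete, here is a minimal scalar
sketch in C of what VEC_PERMUTE_EXPR (vlow, vhigh, vperm) would compute.
The name vec_permute_model, the 4-element int32_t vectors and the in-range
assertion are assumptions of this sketch; the proposal does not say how
out-of-range selectors should behave.

```c
#include <assert.h>
#include <stdint.h>

#define NELT 4   /* elements per vector; an assumption for this sketch */

/* Scalar model of VEC_PERMUTE_EXPR (vlow, vhigh, vperm):
   result[i] = concat (vlow, vhigh)[vperm[i]].
   Selectors must index the 2*NELT-element concatenation; treating
   anything else as a precondition violation is an assumption.  */
static void
vec_permute_model (const int32_t *vlow, const int32_t *vhigh,
                   const uint32_t *vperm, int32_t *result)
{
  for (int i = 0; i < NELT; i++)
    {
      uint32_t sel = vperm[i];
      assert (sel < 2 * NELT);
      result[i] = sel < NELT ? vlow[sel] : vhigh[sel - NELT];
    }
}
```

With vlow = {10,11,12,13}, vhigh = {20,21,22,23} and vperm = {0,2,4,6}
this yields {10,12,20,22}, i.e. the extract-even pattern.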

> > Intel AVX has a limited variable permutation -- 64-bit or 32-bit
> > elements can be rearranged but only within a 128-bit subvector.
> > So if you're working with 128-bit vectors, it's fully variable, but if
> > you're working with 256-bit vectors, it's like doing 2 128-bit permute
> > operations in parallel.  Intel before AVX has no variable permute.
> >
> > HOWEVER!  Most of the useful permutations that I can think of for the
> > optimizers to generate are actually constant.  And these can be
> > implemented everywhere (with varying degrees of efficiency).
> >

That's true for the moment, but there are cases where a variable permute
would be useful for vectorization. E.g. where vectors are used as a lookup
table. One example I know of is finding delimiters (e.g. for XML
processing): a lookup table of 256 bits holds one bit per ASCII character
to indicate whether that character is a delimiter, and the scalar code
looks something like this:
table[256]={1,0,0,....};
for (i...)
   if (table[data[i]] == 1)
     {found delimiter}
...and this is vectorized with 2 vector registers that hold the lookup
table and a shift on the input data vector to create the permutation mask
that accesses the table. I expect there are other examples of lookup
tables used like that in vectorized code. I have also seen variable
permutes used for sorting
(http://www.dia.eui.upm.es/asignatu/pro_par/articulos/AASort.pdf).
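
A rough scalar sketch of that lookup scheme in C, assuming 16-byte
registers (as on Altivec/SPU) so that the 256-bit table fills exactly two
of them. All names here (byte_permute_model, pack_table,
find_delims_block) and the bit order within each table byte are
assumptions for illustration; byte_permute_model only stands in for a
real variable byte-permute instruction.

```c
#include <assert.h>
#include <stdint.h>

#define VBYTES 16   /* bytes per vector register; assumed */

/* Stand-in for a variable byte permute: out[i] is byte sel[i] of the
   32-byte concatenation of lo and hi.  */
static void
byte_permute_model (const uint8_t *lo, const uint8_t *hi,
                    const uint8_t *sel, uint8_t *out)
{
  for (int i = 0; i < VBYTES; i++)
    {
      assert (sel[i] < 2 * VBYTES);
      out[i] = sel[i] < VBYTES ? lo[sel[i]] : hi[sel[i] - VBYTES];
    }
}

/* Pack a 256-entry 0/1 table into 256 bits held in two "registers".
   Character c lands in bit (c & 7) of byte (c >> 3).  */
static void
pack_table (const uint8_t table[256], uint8_t *lo, uint8_t *hi)
{
  for (int i = 0; i < VBYTES; i++)
    lo[i] = hi[i] = 0;
  for (int c = 0; c < 256; c++)
    if (table[c])
      {
        uint8_t *reg = (c >> 3) < VBYTES ? lo : hi;
        reg[(c >> 3) % VBYTES] |= (uint8_t) (1u << (c & 7));
      }
}

/* Handle one vector's worth of input: is_delim[i] is the table bit
   for data[i].  */
static void
find_delims_block (const uint8_t *lo, const uint8_t *hi,
                   const uint8_t *data, uint8_t *is_delim)
{
  uint8_t sel[VBYTES], bytes[VBYTES];

  /* The shift on the input data forms the permutation mask.  */
  for (int i = 0; i < VBYTES; i++)
    sel[i] = data[i] >> 3;

  /* One variable permute fetches the table byte for every input byte.  */
  byte_permute_model (lo, hi, sel, bytes);

  /* The low three bits of each input byte pick the bit within it.  */
  for (int i = 0; i < VBYTES; i++)
    is_delim[i] = (bytes[i] >> (data[i] & 7)) & 1;
}
```

The point of the sketch is that the selector is computed from the input
data at run time, so it cannot be expressed as a constant permutation.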

Indeed there are some serious challenges to overcome in order to do all
that automatically in the compiler... but some pattern-matching based
vectorization approach could conceptually do this.

Also, if one day someone were to introduce platform-independent vector
intrinsics, then such a generic permute would allow programmers to take
advantage of it, even in cases that would otherwise be too complicated
for the compiler to auto-vectorize.

So I think it would be nice to allow the more general form, but since it
will probably take a while before we actually make use of it, it's probably
not critical for the short term...

> > Anyway, I'm thinking that it might be better to add such a general
> > operation instead of continuing to add things like
> >
> >    VEC_EXTRACT_EVEN_EXPR,
> >    VEC_EXTRACT_ODD_EXPR,
> >    VEC_INTERLEAVE_HIGH_EXPR,
> >    VEC_INTERLEAVE_LOW_EXPR,
> >
> > and other obvious patterns like broadcast, duplicate even to odd,
> > duplicate odd to even, etc.
>

agreed

> If the back end will be able to identify specific masks, e.g., {0,2,4,6} as
> extract even operation, then we can certainly remove those codes.
>

agreed
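
As a sketch of what such mask recognition could look like, the C function
below classifies a constant selector over the concatenation of two
nelt-element vectors. The enum, the function name classify_perm and the
"first half" / "second half" naming for the two interleaves are
hypothetical (which interleave counts as "high" or "low" depends on the
target's element numbering), and nelt is assumed to be even and at
least 2. This is not existing back-end code.

```c
#include <stdbool.h>

enum perm_kind
{
  PERM_EXTRACT_EVEN,            /* {0, 2, 4, ...} */
  PERM_EXTRACT_ODD,             /* {1, 3, 5, ...} */
  PERM_INTERLEAVE_FIRST_HALF,   /* {0, nelt, 1, nelt+1, ...} */
  PERM_INTERLEAVE_SECOND_HALF,  /* {nelt/2, nelt+nelt/2, ...} */
  PERM_GENERIC                  /* needs a full permute */
};

/* Classify a constant selector SEL of NELT indices into the
   2*NELT-element concatenation of the two input vectors.  */
static enum perm_kind
classify_perm (const unsigned *sel, unsigned nelt)
{
  bool even = true, odd = true, first = true, second = true;

  for (unsigned i = 0; i < nelt; i++)
    {
      /* Interleaving takes element i/2 alternately from the first
         and from the second input vector.  */
      unsigned il = i / 2 + (i % 2 ? nelt : 0);

      even = even && sel[i] == 2 * i;
      odd = odd && sel[i] == 2 * i + 1;
      first = first && sel[i] == il;
      second = second && sel[i] == il + nelt / 2;
    }

  if (even)
    return PERM_EXTRACT_EVEN;
  if (odd)
    return PERM_EXTRACT_ODD;
  if (first)
    return PERM_INTERLEAVE_FIRST_HALF;
  if (second)
    return PERM_INTERLEAVE_SECOND_HALF;
  return PERM_GENERIC;
}
```

For nelt = 4, {0,2,4,6} is classified as extract even and {0,4,1,5} as the
first-half interleave; anything unrecognized falls back to the generic
permute.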

dorit

> >
> > I can imagine having some sort of target hook that computed a cost
> > metric for a given constant permutation pattern.  For instance, I'd
> > imagine that the interleave patterns are half as expensive as a full
> > permute for altivec, due to not having to load a mask.  This hook would
> > be fairly complicated for x86, given all of the permuting insns that
> > were incrementally added in various ISA revisions, but such is life.
> >
> > In any case, would a VEC_PERMUTE_EXPR, as described above, work for the
> > uses of builtin_vec_perm within the vectorizer at present?
>
> Yes.
>
> Ira
>
> >
> >
> > r~
>


Thread overview: 6+ messages
2009-11-17  1:40 targetm.vectorize.builtin_vec_perm Richard Henderson
2009-11-17  2:39 ` targetm.vectorize.builtin_vec_perm Joern Rennecke
2009-11-17  9:10   ` targetm.vectorize.builtin_vec_perm Ira Rosen
2009-11-17 19:21   ` targetm.vectorize.builtin_vec_perm Richard Henderson
2009-11-17  8:59 ` targetm.vectorize.builtin_vec_perm Ira Rosen
2009-11-17 12:51   ` Dorit Nuzman [this message]
