public inbox for gcc@gcc.gnu.org
* targetm.vectorize.builtin_vec_perm
@ 2009-11-17  1:40 Richard Henderson
From: Richard Henderson @ 2009-11-17  1:40 UTC
  To: irar; +Cc: gcc

What is this hook supposed to do?  There is no description of its arguments.

What is the theory of operation of permute within the vectorizer?  Do 
you actually need variable permute, or would constants be ok?

I'm contemplating adding a tree- and gimple-level VEC_PERMUTE_EXPR of 
the form:

   VEC_PERMUTE_EXPR (vlow, vhigh, vperm)

which would be exactly equal to

   (vec_select
     (vec_concat vlow vhigh)
     vperm)

at the rtl level.  I.e. vperm is an integer vector with the same number 
of elements as vlow, and each of its elements is an index into the 
concatenation of vlow and vhigh.
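
To make the intended semantics concrete, here is a scalar model in C of the 
select-from-concatenation behaviour described above.  This is only a sketch: 
the function name vec_permute_model is made up for illustration, the element 
type is fixed to int for brevity, and the handling of out-of-range selectors 
(here an assert) is an assumption, since the proposal does not say what 
should happen to them.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of VEC_PERMUTE_EXPR (vlow, vhigh, vperm).
   Selector values 0..n-1 pick from vlow, n..2n-1 pick from vhigh,
   mirroring (vec_select (vec_concat vlow vhigh) vperm).
   Out-of-range selectors are rejected here; that choice is an
   assumption of this sketch, not part of the proposal.  */
static void
vec_permute_model (int *out, const int *vlow, const int *vhigh,
                   const unsigned *vperm, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      unsigned sel = vperm[i];
      assert (sel < 2 * n);
      out[i] = sel < n ? vlow[sel] : vhigh[sel - n];
    }
}
```

With vlow = {10, 11, 12, 13}, vhigh = {20, 21, 22, 23} and 
vperm = {0, 4, 1, 5}, the model yields {10, 20, 11, 21}.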

Truly variable permutation is something that's only supported by ppc and 
spu.  Intel AVX has a limited variable permutation -- 64-bit or 32-bit 
elements can be rearranged but only within a 128-bit subvector.
So if you're working with 128-bit vectors, it's fully variable, but if 
you're working with 256-bit vectors, it's like doing 2 128-bit permute 
operations in parallel.  Intel before AVX has no variable permute.
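
The "only within a 128-bit subvector" restriction can be modelled in scalar C 
as below.  This is a sketch of the 32-bit-element case only, loosely modelled 
on the variable form of VPERMILPS; the function name inlane_permute_model is 
hypothetical.  Note that it is a single-input permute, unlike the two-input 
operation proposed above.

```c
#include <assert.h>
#include <stddef.h>

#define ELTS_PER_LANE 4   /* four 32-bit elements per 128-bit subvector */

/* Each output element may only select from the 128-bit lane it lives
   in: the control value is reduced modulo the lane size and added to
   the base index of that lane.  */
static void
inlane_permute_model (float *out, const float *src,
                      const unsigned *ctrl, size_t n)
{
  assert (n % ELTS_PER_LANE == 0);
  for (size_t i = 0; i < n; i++)
    {
      size_t lane_base = i - i % ELTS_PER_LANE;
      out[i] = src[lane_base + ctrl[i] % ELTS_PER_LANE];
    }
}
```

For n == 4 (one 128-bit vector) any arrangement of the input is reachable.  
For n == 8 (a 256-bit vector) element 5 can never land in position 0, which 
is the "two 128-bit permutes in parallel" behaviour.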

HOWEVER!  Most of the useful permutations that I can think of for the 
optimizers to generate are actually constant.  And these can be 
implemented everywhere (with varying degrees of efficiency).

Anyway, I'm thinking that it might be better to add such a general 
operation instead of continuing to add things like

	VEC_EXTRACT_EVEN_EXPR,
	VEC_EXTRACT_ODD_EXPR,
	VEC_INTERLEAVE_HIGH_EXPR,
	VEC_INTERLEAVE_LOW_EXPR,

and other obvious patterns like broadcast, duplicate even to odd, 
duplicate odd to even, etc.
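
To illustrate that these are all just constant selectors for the general 
operation, here is a sketch in C that builds the selector for each pattern, 
numbering the concatenated input 0..2n-1.  The enum and function names are 
invented for the example.  Which interleave corresponds to HIGH and which to 
LOW depends on the target's element numbering, so the sketch speaks of the 
first and second halves of the inputs instead.

```c
#include <assert.h>

/* Hypothetical names, for illustration only.  */
enum perm_kind
{
  PERM_EXTRACT_EVEN,       /* elements 0, 2, 4, ... of the concatenation */
  PERM_EXTRACT_ODD,        /* elements 1, 3, 5, ... of the concatenation */
  PERM_INTERLEAVE_FIRST,   /* interleave the first halves of both inputs */
  PERM_INTERLEAVE_SECOND   /* interleave the second halves of both inputs */
};

/* Fill sel[0..n-1] with indices into the 2n-element concatenation.  */
static void
build_selector (unsigned *sel, enum perm_kind kind, unsigned n)
{
  assert (n >= 2 && n % 2 == 0);
  for (unsigned i = 0; i < n; i++)
    switch (kind)
      {
      case PERM_EXTRACT_EVEN:
        sel[i] = 2 * i;
        break;
      case PERM_EXTRACT_ODD:
        sel[i] = 2 * i + 1;
        break;
      case PERM_INTERLEAVE_FIRST:
        /* Even positions come from the first input, odd from the second.  */
        sel[i] = i / 2 + (i % 2 ? n : 0);
        break;
      case PERM_INTERLEAVE_SECOND:
        sel[i] = n / 2 + i / 2 + (i % 2 ? n : 0);
        break;
      }
}
```

For n == 4 this gives {0, 2, 4, 6}, {1, 3, 5, 7}, {0, 4, 1, 5} and 
{2, 6, 3, 7} respectively.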

I can imagine having some sort of target hook that computed a cost 
metric for a given constant permutation pattern.  For instance, I'd 
imagine that the interleave patterns are half as expensive as a full 
permute for altivec, due to not having to load a mask.  This hook would 
be fairly complicated for x86, given all of the permuting insns that 
were incrementally added in various ISA revisions, but such is life.
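
As a rough idea of what such a cost query might look like for an 
altivec-like target, here is a toy model in C.  Everything in it is 
hypothetical: the names are invented, this is not an existing GCC hook, and 
the costs 1 and 2 merely encode the "half as expensive" guess above in 
arbitrary units.

```c
#include <assert.h>
#include <stdbool.h>

/* True if sel is one of the two interleave selectors over the
   2n-element concatenation, i.e. {b, n+b, b+1, n+b+1, ...} with
   b == 0 or b == n/2.  */
static bool
is_interleave (const unsigned *sel, unsigned n)
{
  unsigned base = sel[0];
  if (base != 0 && base != n / 2)
    return false;
  for (unsigned i = 0; i < n; i++)
    if (sel[i] != base + i / 2 + (i % 2 ? n : 0))
      return false;
  return true;
}

/* Toy cost metric: an interleave needs no mask load, so it is
   assumed to cost half of a general permute.  */
static unsigned
const_perm_cost_model (const unsigned *sel, unsigned n)
{
  assert (n >= 2 && n % 2 == 0);
  return is_interleave (sel, n) ? 1 : 2;
}
```

A real x86 version would need many more cases, one for each family of 
permuting insns available at the selected ISA level.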

In any case, would a VEC_PERMUTE_EXPR, as described above, work for the 
uses of builtin_vec_perm within the vectorizer at present?


r~


