* [rs6000, spu] Add vec_perm named pattern
@ 2011-10-12 22:52 Richard Henderson
  2011-10-14 20:48 ` Michael Meissner
  2011-10-21  3:37 ` [commit, spu] Fix vec_perm pattern (Re: [rs6000, spu] Add vec_perm named pattern) Ulrich Weigand
  0 siblings, 2 replies; 4+ messages in thread
From: Richard Henderson @ 2011-10-12 22:52 UTC (permalink / raw)
  To: dje.gcc, uweigand; +Cc: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 690 bytes --]

The generic support for vector permutation will allow for automatic
lowering to V*QImode, so all we need to add to support these targets
is the single V16QI pattern that represents the base permutation insn.
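
To illustrate the lowering idea, here is a minimal scalar sketch of how a
selector for four 32-bit elements could be turned into the 16-byte selector
that a V16QI permute consumes. The function name and the layout assumption
(element i occupies bytes 4*i .. 4*i+3) are illustrative only and are not
taken from the patch or from the generic expander.

```c
#include <stdint.h>

/* Illustrative only: expand a 4-element selector into a 16-byte selector.
   Assumes element i of the concatenated inputs occupies bytes
   4*i .. 4*i+3; this is a sketch, not the code GCC uses. */
void lower_sel_v4si_to_v16qi(const uint32_t sel[4], uint8_t bytesel[16])
{
    for (int i = 0; i < 4; i++) {
        /* Two 4-element inputs give 8 selectable elements. */
        uint32_t elt = sel[i] & 7;
        /* Each selected element contributes its 4 consecutive bytes. */
        for (int b = 0; b < 4; b++)
            bytesel[4 * i + b] = (uint8_t)(elt * 4 + b);
    }
}
```

With sel = {0, 5, 2, 7}, the byte selector picks bytes 0..3, 20..23, 8..11
and 28..31 of the two concatenated inputs.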

I'm not touching any of the other ways that the permutation insn 
could be generated.  After the generic support is added, I'll leave
it to the port maintainers to determine what they want to keep.  I
suspect in many cases using the generic __builtin_shuffle plus some
casting in the target-specific header files would be sufficient,
eliminating several dozen builtins.
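
As a rough sketch of what "generic __builtin_shuffle plus some casting"
might look like in a header, the wrapper below permutes two vectors of
32-bit elements using a byte selector. The typedefs and the name
my_perm_v4si are hypothetical and are not taken from any target header. The
scalar fallback is there only so the example also builds with compilers
that lack __builtin_shuffle.

```c
#include <stdint.h>

typedef uint8_t  v16qi __attribute__((vector_size(16)));
typedef uint32_t v4si  __attribute__((vector_size(16)));

/* Hypothetical header-style wrapper: cast to bytes, shuffle, cast back. */
static inline v4si my_perm_v4si(v4si a, v4si b, v16qi sel)
{
#if defined(__GNUC__) && !defined(__clang__)
    /* GCC's generic shuffle; selector elements are taken modulo 32. */
    v16qi r = __builtin_shuffle((v16qi)a, (v16qi)b, sel);
#else
    /* Scalar fallback with the same modulo-32 behaviour. */
    v16qi x = (v16qi)a, y = (v16qi)b, r = x;
    for (int i = 0; i < 16; i++) {
        unsigned k = sel[i] & 31;
        r[i] = k < 16 ? x[k] : y[k - 16];
    }
#endif
    return (v4si)r;
}
```

For example, with a = {1,2,3,4}, b = {5,6,7,8} and a selector that names
bytes 0..3, 20..23, 8..11 and 28..31, the result takes elements 0 and 2
from a and elements 1 and 3 from b.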


Ok?


r~


	* config/rs6000/altivec.md (vec_permv16qi): New.

	* config/spu/spu.md (vec_permv16qi): New.

[-- Attachment #2: d-ppc-vec-perm --]
[-- Type: text/plain, Size: 906 bytes --]

commit f2d8929afb989a09d7e287dc171607440bbbbc1a
Author: Richard Henderson <rth@twiddle.net>
Date:   Mon Oct 10 12:35:25 2011 -0700

    rs6000: Implement vec_permv16qi.

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 9e7437e..84c5444 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1357,6 +1357,15 @@
   "vperm %0,%1,%2,%3"
   [(set_attr "type" "vecperm")])
 
+(define_expand "vec_permv16qi"
+  [(set (match_operand:V16QI 0 "register_operand" "")
+	(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "")
+		       (match_operand:V16QI 2 "register_operand" "")
+		       (match_operand:V16QI 3 "register_operand" "")]
+		      UNSPEC_VPERM))]
+  "TARGET_ALTIVEC"
+  "")
+
 (define_insn "altivec_vrfip"		; ceil
   [(set (match_operand:V4SF 0 "register_operand" "=v")
         (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "v")]

[-- Attachment #3: d-spu-vec-perm --]
[-- Type: text/plain, Size: 834 bytes --]

commit a67ea08189a4399d6ade00c15e69447304f85f96
Author: Richard Henderson <rth@twiddle.net>
Date:   Mon Oct 10 12:35:50 2011 -0700

    spu: Implement vec_permv16qi.

diff --git a/gcc/config/spu/spu.md b/gcc/config/spu/spu.md
index 676d54e..00cfaa4 100644
--- a/gcc/config/spu/spu.md
+++ b/gcc/config/spu/spu.md
@@ -4395,6 +4395,18 @@ selb\t%0,%4,%0,%3"
   "shufb\t%0,%1,%2,%3"
   [(set_attr "type" "shuf")])
 
+(define_expand "vec_permv16qi"
+  [(set (match_operand:V16QI 0 "spu_reg_operand" "")
+	(unspec:V16QI
+	  [(match_operand:V16QI 1 "spu_reg_operand" "")
+	   (match_operand:V16QI 2 "spu_reg_operand" "")
+	   (match_operand:V16QI 3 "spu_reg_operand" "")]
+	  UNSPEC_SHUFB))]
+  ""
+  {
+    operands[3] = gen_lowpart (TImode, operands[3]);
+  })
+
 (define_insn "nop"
   [(unspec_volatile [(const_int 0)] UNSPECV_NOP)]
   ""


* Re: [rs6000, spu] Add vec_perm named pattern
  2011-10-12 22:52 [rs6000, spu] Add vec_perm named pattern Richard Henderson
@ 2011-10-14 20:48 ` Michael Meissner
  2011-10-14 22:39   ` Richard Henderson
  2011-10-21  3:37 ` [commit, spu] Fix vec_perm pattern (Re: [rs6000, spu] Add vec_perm named pattern) Ulrich Weigand
  1 sibling, 1 reply; 4+ messages in thread
From: Michael Meissner @ 2011-10-14 20:48 UTC (permalink / raw)
  To: Richard Henderson; +Cc: dje.gcc, uweigand, GCC Patches

On Wed, Oct 12, 2011 at 03:42:12PM -0700, Richard Henderson wrote:
> The generic support for vector permutation will allow for automatic
> lowering to V*QImode, so all we need to add to support these targets
> is the single V16QI pattern that represents the base permutation insn.
> 
> I'm not touching any of the other ways that the permutation insn 
> could be generated.  After the generic support is added, I'll leave
> it to the port maintainers to determine what they want to keep.  I
> suspect in many cases using the generic __builtin_shuffle plus some
> casting in the target-specific header files would be sufficient,
> eliminating several dozen builtins.
> 
> 
> Ok?

I would rather change altivec_vperm_<mode> to use the new name (and also
altivec_vperm_<mode>_uns).  But I can live with a wrapper function for now.

If we are adding permute options, can we please get the vectorizer to use
optabs instead of using the targetm.vectorize.builtin_vec_perm hook?  It has
always struck me as a sore thumb that we have a hook that needs to return a
builtin function decl (targetm.vectorize.builtin_mask_for_load also).

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meissner@linux.vnet.ibm.com	fax +1 (978) 399-6899


* Re: [rs6000, spu] Add vec_perm named pattern
  2011-10-14 20:48 ` Michael Meissner
@ 2011-10-14 22:39   ` Richard Henderson
  0 siblings, 0 replies; 4+ messages in thread
From: Richard Henderson @ 2011-10-14 22:39 UTC (permalink / raw)
  To: Michael Meissner, dje.gcc, uweigand, GCC Patches

On 10/14/2011 12:53 PM, Michael Meissner wrote:
> I would rather change altivec_vperm_<mode> to use the new name (and also
> altivec_vperm_<mode>_uns).  But I can live with a wrapper function for now.

As I said, I'm leaving the cleanup of the old patterns to port maintainers.

> If we are adding permute options, can we please get the vectorizer to use
> optabs instead of using the targetm.vectorize.builtin_vec_perm hook?  It has
> always struck me as a sore thumb that we have a hook that needs to return a
> builtin function decl (targetm.vectorize.builtin_mask_for_load also).

That's exactly what I've been working on today.


r~


* [commit, spu] Fix vec_perm pattern (Re: [rs6000, spu] Add vec_perm named pattern)
  2011-10-12 22:52 [rs6000, spu] Add vec_perm named pattern Richard Henderson
  2011-10-14 20:48 ` Michael Meissner
@ 2011-10-21  3:37 ` Ulrich Weigand
  1 sibling, 0 replies; 4+ messages in thread
From: Ulrich Weigand @ 2011-10-21  3:37 UTC (permalink / raw)
  To: Richard Henderson; +Cc: dje.gcc, GCC Patches

Richard Henderson wrote:

> The generic support for vector permutation will allow for automatic
> lowering to V*QImode, so all we need to add to support these targets
> is the single V16QI pattern that represents the base permutation insn.
> 
> I'm not touching any of the other ways that the permutation insn 
> could be generated.  After the generic support is added, I'll leave
> it to the port maintainers to determine what they want to keep.  I
> suspect in many cases using the generic __builtin_shuffle plus some
> casting in the target-specific header files would be sufficient,
> eliminating several dozen builtins.


Sorry I didn't get to this earlier, I got side-tracked by a number
of independent regressions on SPU ...

Unfortunately, the semantics of vec_perm do not match 100% those of the
SPU Shuffle Bytes instruction.  vec_perm assumes the selector elements
apply modulo 32, but shufb uses values >= 128 for special purposes.
See the ISA:

  Value in Register RC
  (Expressed in Binary)  Result Byte

  10xxxxxx               0x00
  110xxxxx               0xFF
  111xxxxx               0x80
  Otherwise              The byte of the concatenated register addressed by
                         the rightmost 5 bits of register RC


To implement the vec_perm semantics fully, we therefore need to reduce the
selector modulo 32 explicitly before using shufb.
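
The mismatch can be seen in a small scalar model, written for one result
byte at a time. The function names are made up for this illustration, and
the model covers only the byte-selection rule from the ISA table above, not
the real instruction or the expander code.

```c
#include <stdint.h>

/* Model of one shufb result byte, following the ISA table above.
   concat[] holds the 32 bytes of the two concatenated source registers. */
uint8_t shufb_byte(const uint8_t concat[32], uint8_t sel)
{
    if ((sel & 0xC0) == 0x80) return 0x00;   /* 10xxxxxx */
    if ((sel & 0xE0) == 0xC0) return 0xFF;   /* 110xxxxx */
    if ((sel & 0xE0) == 0xE0) return 0x80;   /* 111xxxxx */
    return concat[sel & 0x1F];               /* rightmost 5 bits */
}

/* vec_perm semantics: the selector is simply taken modulo 32. */
uint8_t vec_perm_byte(const uint8_t concat[32], uint8_t sel)
{
    return concat[sel % 32];
}

/* The fix: AND the selector with 31 before handing it to shufb. */
uint8_t shufb_masked_byte(const uint8_t concat[32], uint8_t sel)
{
    return shufb_byte(concat, (uint8_t)(sel & 31));
}
```

In this model a selector such as 0x85 makes shufb_byte return 0x00, while
vec_perm_byte returns byte 5 of the concatenated inputs. With the selector
masked first, the two agree for every selector value.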

Tested on spu-elf, fixes various vshuf test cases.
Committed to mainline.

Bye,
Ulrich


ChangeLog:

	* config/spu/spu.md ("vec_permv16qi"): Reduce selector modulo 32
	before using the shufb instruction.

Index: gcc/config/spu/spu.md
===================================================================
*** gcc/config/spu/spu.md	(revision 180240)
--- gcc/config/spu/spu.md	(working copy)
*************** selb\t%0,%4,%0,%3"
*** 4395,4410 ****
    "shufb\t%0,%1,%2,%3"
    [(set_attr "type" "shuf")])
  
  (define_expand "vec_permv16qi"
!   [(set (match_operand:V16QI 0 "spu_reg_operand" "")
  	(unspec:V16QI
  	  [(match_operand:V16QI 1 "spu_reg_operand" "")
  	   (match_operand:V16QI 2 "spu_reg_operand" "")
! 	   (match_operand:V16QI 3 "spu_reg_operand" "")]
  	  UNSPEC_SHUFB))]
    ""
    {
!     operands[3] = gen_lowpart (TImode, operands[3]);
    })
  
  (define_insn "nop"
--- 4395,4416 ----
    "shufb\t%0,%1,%2,%3"
    [(set_attr "type" "shuf")])
  
+ ; The semantics of vec_permv16qi are nearly identical to those of the SPU
+ ; shufb instruction, except that we need to reduce the selector modulo 32.
  (define_expand "vec_permv16qi"
!   [(set (match_dup 4) (and:V16QI (match_operand:V16QI 3 "spu_reg_operand" "")
!                                  (match_dup 6)))
!    (set (match_operand:V16QI 0 "spu_reg_operand" "")
  	(unspec:V16QI
  	  [(match_operand:V16QI 1 "spu_reg_operand" "")
  	   (match_operand:V16QI 2 "spu_reg_operand" "")
! 	   (match_dup 5)]
  	  UNSPEC_SHUFB))]
    ""
    {
!     operands[4] = gen_reg_rtx (V16QImode);
!     operands[5] = gen_lowpart (TImode, operands[4]);
!     operands[6] = spu_const (V16QImode, 31);
    })
  
  (define_insn "nop"

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com
