public inbox for gcc-patches@gcc.gnu.org
* [Patch AArch64] Implement Vector Permute Support
@ 2012-12-04 10:31 James Greenhalgh
  2012-12-04 10:36 ` [Patch AArch64] Add zip{1, 2}, uzp{1, 2}, trn{1, 2} support for vector permute James Greenhalgh
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: James Greenhalgh @ 2012-12-04 10:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: marcus.shawcroft

[-- Attachment #1: Type: text/plain, Size: 1980 bytes --]


Hi,

This patch adds support for Vector Shuffle style operations
through support for TARGET_VECTORIZE_VEC_PERM_CONST_OK and
the vec_perm and vec_perm_const standard patterns.

In this patch we add the framework and support for the
generic tbl instruction. This can be used to handle any
vector permute operation, but we can do a better job for
some special cases. The second patch of this series does
that better job for the ZIP, UZP and TRN instructions.
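
As a rough illustration (not part of the patch; the function and mask
below are invented, and a little-endian target with this series applied
is assumed), a constant shuffle written with GCC's __builtin_shuffle is
the kind of operation the new vec_perm_const<mode> pattern handles. A
mask with no special structure would be expected to come out as a
single tbl:

  /* Illustrative only.  */
  typedef unsigned char v16qi __attribute__ ((vector_size (16)));

  v16qi
  reverse_bytes (v16qi x)
  {
    const v16qi sel = { 15, 14, 13, 12, 11, 10, 9, 8,
                        7, 6, 5, 4, 3, 2, 1, 0 };
    return __builtin_shuffle (x, sel);
  }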

Is this OK to commit?

Thanks,
James Greenhalgh

---
gcc/

2012-12-04  James Greenhalgh  <james.greenhalgh@arm.com>

	* config/aarch64/aarch64-protos.h
	(aarch64_split_combinev16qi): New.
	(aarch64_expand_vec_perm): Likewise.
	(aarch64_expand_vec_perm_const): Likewise.
	* config/aarch64/aarch64-simd.md (vec_perm_const<mode>): New.
	(vec_perm<mode>): Likewise.
	(aarch64_tbl1<mode>): Likewise.
	(aarch64_tbl2v16qi): Likewise.
	(aarch64_combinev16qi): New.
	* config/aarch64/aarch64.c
	(aarch64_vectorize_vec_perm_const_ok): New.
	(aarch64_split_combinev16qi): Likewise.
	(MAX_VECT_LEN): Define.
	(expand_vec_perm_d): New.
	(aarch64_expand_vec_perm_1): Likewise.
	(aarch64_expand_vec_perm): Likewise.
	(aarch64_evpc_tbl): Likewise.
	(aarch64_expand_vec_perm_const_1): Likewise.
	(aarch64_expand_vec_perm_const): Likewise.
	(aarch64_vectorize_vec_perm_const_ok): Likewise.
	(TARGET_VECTORIZE_VEC_PERM_CONST_OK): Likewise.
	* config/aarch64/iterators.md
	(unspec): Add UNSPEC_TBL, UNSPEC_CONCAT.
	(V_cmp_result): Add mapping for V2DF.

gcc/testsuite/

2012-12-04  James Greenhalgh  <james.greenhalgh@arm.com>

	* lib/target-supports.exp
	(check_effective_target_vect_perm): Allow aarch64*-*-*.
	(check_effective_target_vect_perm_byte): Likewise.
	(check_effective_target_vect_perm_short): Likewise.
	(check_effective_target_vect_char_mult): Likewise.
	(check_effective_target_vect_extract_even_odd): Likewise.
	(check_effective_target_vect_interleave): Likewise.

[-- Attachment #2: 0001-Patch-AArch64-Implement-Vector-Permute-Support.patch --]
[-- Type: text/x-patch;  name=0001-Patch-AArch64-Implement-Vector-Permute-Support.patch, Size: 16004 bytes --]

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index ab84257..7b72ead 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -236,4 +236,9 @@ rtx aarch64_expand_builtin (tree exp,
 			    int ignore ATTRIBUTE_UNUSED);
 tree aarch64_builtin_decl (unsigned, bool ATTRIBUTE_UNUSED);
 
+extern void aarch64_split_combinev16qi (rtx operands[3]);
+extern void aarch64_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel);
+extern bool
+aarch64_expand_vec_perm_const (rtx target, rtx op0, rtx op1, rtx sel);
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b3d01c1..2b0c8d6 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3298,6 +3298,74 @@
 
 ;; Permuted-store expanders for neon intrinsics.
 
+;; Permute instructions
+
+;; vec_perm support
+
+(define_expand "vec_perm_const<mode>"
+  [(match_operand:VALL 0 "register_operand")
+   (match_operand:VALL 1 "register_operand")
+   (match_operand:VALL 2 "register_operand")
+   (match_operand:<V_cmp_result> 3)]
+  "TARGET_SIMD"
+{
+  if (aarch64_expand_vec_perm_const (operands[0], operands[1],
+				     operands[2], operands[3]))
+    DONE;
+  else
+    FAIL;
+})
+
+(define_expand "vec_perm<mode>"
+  [(match_operand:VB 0 "register_operand")
+   (match_operand:VB 1 "register_operand")
+   (match_operand:VB 2 "register_operand")
+   (match_operand:VB 3 "register_operand")]
+  "TARGET_SIMD"
+{
+  aarch64_expand_vec_perm (operands[0], operands[1],
+			   operands[2], operands[3]);
+  DONE;
+})
+
+(define_insn "aarch64_tbl1<mode>"
+  [(set (match_operand:VB 0 "register_operand" "=w")
+	(unspec:VB [(match_operand:V16QI 1 "register_operand" "w")
+		    (match_operand:VB 2 "register_operand" "w")]
+		   UNSPEC_TBL))]
+  "TARGET_SIMD"
+  "tbl\\t%0.<Vtype>, {%1.16b}, %2.<Vtype>"
+  [(set_attr "simd_type" "simd_tbl")
+   (set_attr "simd_mode" "<MODE>")]
+)
+
+;; Two source registers.
+
+(define_insn "aarch64_tbl2v16qi"
+  [(set (match_operand:V16QI 0 "register_operand" "=w")
+	(unspec:V16QI [(match_operand:OI 1 "register_operand" "w")
+		       (match_operand:V16QI 2 "register_operand" "w")]
+		      UNSPEC_TBL))]
+  "TARGET_SIMD"
+  "tbl\\t%0.16b, {%S1.16b - %T1.16b}, %2.16b"
+  [(set_attr "simd_type" "simd_tbl")
+   (set_attr "simd_mode" "V16QI")]
+)
+
+(define_insn_and_split "aarch64_combinev16qi"
+  [(set (match_operand:OI 0 "register_operand" "=w")
+	(unspec:OI [(match_operand:V16QI 1 "register_operand" "w")
+		    (match_operand:V16QI 2 "register_operand" "w")]
+		   UNSPEC_CONCAT))]
+  "TARGET_SIMD"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  aarch64_split_combinev16qi (operands);
+  DONE;
+})
+
 (define_insn "aarch64_st2<mode>_dreg"
   [(set (match_operand:TI 0 "aarch64_simd_struct_operand" "=Utv")
 	(unspec:TI [(match_operand:OI 1 "register_operand" "w")
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f262ef9..cebc8cb 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -110,6 +110,9 @@ static unsigned bit_count (unsigned HOST_WIDE_INT);
 static bool aarch64_const_vec_all_same_int_p (rtx,
 					      HOST_WIDE_INT, HOST_WIDE_INT);
 
+static bool aarch64_vectorize_vec_perm_const_ok (enum machine_mode vmode,
+						 const unsigned char *sel);
+
 /* The processor for which instructions should be scheduled.  */
 enum aarch64_processor aarch64_tune = generic;
 
@@ -6678,6 +6681,292 @@ aarch64_c_mode_for_suffix (char suffix)
   return VOIDmode;
 }
 
+/* Split operands into moves from op[1] + op[2] into op[0].  */
+
+void
+aarch64_split_combinev16qi (rtx operands[3])
+{
+  unsigned int dest = REGNO (operands[0]);
+  unsigned int src1 = REGNO (operands[1]);
+  unsigned int src2 = REGNO (operands[2]);
+  enum machine_mode halfmode = GET_MODE (operands[1]);
+  unsigned int halfregs = HARD_REGNO_NREGS (src1, halfmode);
+  rtx destlo, desthi;
+
+  gcc_assert (halfmode == V16QImode);
+
+  if (src1 == dest && src2 == dest + halfregs)
+    {
+      /* No-op move.  Can't split to nothing; emit something.  */
+      emit_note (NOTE_INSN_DELETED);
+      return;
+    }
+
+  /* Preserve register attributes for variable tracking.  */
+  destlo = gen_rtx_REG_offset (operands[0], halfmode, dest, 0);
+  desthi = gen_rtx_REG_offset (operands[0], halfmode, dest + halfregs,
+			       GET_MODE_SIZE (halfmode));
+
+  /* Special case of reversed high/low parts.  */
+  if (reg_overlap_mentioned_p (operands[2], destlo)
+      && reg_overlap_mentioned_p (operands[1], desthi))
+    {
+      emit_insn (gen_xorv16qi3 (operands[1], operands[1], operands[2]));
+      emit_insn (gen_xorv16qi3 (operands[2], operands[1], operands[2]));
+      emit_insn (gen_xorv16qi3 (operands[1], operands[1], operands[2]));
+    }
+  else if (!reg_overlap_mentioned_p (operands[2], destlo))
+    {
+      /* Try to avoid unnecessary moves if part of the result
+	 is in the right place already.  */
+      if (src1 != dest)
+	emit_move_insn (destlo, operands[1]);
+      if (src2 != dest + halfregs)
+	emit_move_insn (desthi, operands[2]);
+    }
+  else
+    {
+      if (src2 != dest + halfregs)
+	emit_move_insn (desthi, operands[2]);
+      if (src1 != dest)
+	emit_move_insn (destlo, operands[1]);
+    }
+}
+
+/* vec_perm support.  */
+
+#define MAX_VECT_LEN 16
+
+struct expand_vec_perm_d
+{
+  rtx target, op0, op1;
+  unsigned char perm[MAX_VECT_LEN];
+  enum machine_mode vmode;
+  unsigned char nelt;
+  bool one_vector_p;
+  bool testing_p;
+};
+
+/* Generate a variable permutation.  */
+
+static void
+aarch64_expand_vec_perm_1 (rtx target, rtx op0, rtx op1, rtx sel)
+{
+  enum machine_mode vmode = GET_MODE (target);
+  bool one_vector_p = rtx_equal_p (op0, op1);
+
+  gcc_checking_assert (vmode == V8QImode || vmode == V16QImode);
+  gcc_checking_assert (GET_MODE (op0) == vmode);
+  gcc_checking_assert (GET_MODE (op1) == vmode);
+  gcc_checking_assert (GET_MODE (sel) == vmode);
+  gcc_checking_assert (TARGET_SIMD);
+
+  if (one_vector_p)
+    {
+      if (vmode == V8QImode)
+	{
+	  /* Expand the argument to a V16QI mode by duplicating it.  */
+	  rtx pair = gen_reg_rtx (V16QImode);
+	  emit_insn (gen_aarch64_combinev8qi (pair, op0, op0));
+	  emit_insn (gen_aarch64_tbl1v8qi (target, pair, sel));
+	}
+      else
+	{
+	  emit_insn (gen_aarch64_tbl1v16qi (target, op0, sel));
+	}
+    }
+  else
+    {
+      rtx pair;
+
+      if (vmode == V8QImode)
+	{
+	  pair = gen_reg_rtx (V16QImode);
+	  emit_insn (gen_aarch64_combinev8qi (pair, op0, op1));
+	  emit_insn (gen_aarch64_tbl1v8qi (target, pair, sel));
+	}
+      else
+	{
+	  pair = gen_reg_rtx (OImode);
+	  emit_insn (gen_aarch64_combinev16qi (pair, op0, op1));
+	  emit_insn (gen_aarch64_tbl2v16qi (target, pair, sel));
+	}
+    }
+}
+
+void
+aarch64_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
+{
+  enum machine_mode vmode = GET_MODE (target);
+  unsigned int i, nelt = GET_MODE_NUNITS (vmode);
+  bool one_vector_p = rtx_equal_p (op0, op1);
+  rtx rmask[MAX_VECT_LEN], mask;
+
+  gcc_checking_assert (!BYTES_BIG_ENDIAN);
+
+  /* The TBL instruction does not use a modulo index, so we must take care
+     of that ourselves.  */
+  mask = GEN_INT (one_vector_p ? nelt - 1 : 2 * nelt - 1);
+  for (i = 0; i < nelt; ++i)
+    rmask[i] = mask;
+  mask = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (nelt, rmask));
+  sel = expand_simple_binop (vmode, AND, sel, mask, NULL, 0, OPTAB_LIB_WIDEN);
+
+  aarch64_expand_vec_perm_1 (target, op0, op1, sel);
+}
+
+static bool
+aarch64_evpc_tbl (struct expand_vec_perm_d *d)
+{
+  rtx rperm[MAX_VECT_LEN], sel;
+  enum machine_mode vmode = d->vmode;
+  unsigned int i, nelt = d->nelt;
+
+  /* TODO: ARM's TBL indexing is little-endian.  In order to handle GCC's
+     numbering of elements for big-endian, we must reverse the order.  */
+  if (BYTES_BIG_ENDIAN)
+    return false;
+
+  if (d->testing_p)
+    return true;
+
+  /* Generic code will try constant permutation twice.  Once with the
+     original mode and again with the elements lowered to QImode.
+     So wait and don't do the selector expansion ourselves.  */
+  if (vmode != V8QImode && vmode != V16QImode)
+    return false;
+
+  for (i = 0; i < nelt; ++i)
+    rperm[i] = GEN_INT (d->perm[i]);
+  sel = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (nelt, rperm));
+  sel = force_reg (vmode, sel);
+
+  aarch64_expand_vec_perm_1 (d->target, d->op0, d->op1, sel);
+  return true;
+}
+
+static bool
+aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
+{
+  /* The pattern matching functions above are written to look for a small
+     number to begin the sequence (0, 1, N/2).  If we begin with an index
+     from the second operand, we can swap the operands.  */
+  if (d->perm[0] >= d->nelt)
+    {
+      unsigned i, nelt = d->nelt;
+      rtx x;
+
+      for (i = 0; i < nelt; ++i)
+	d->perm[i] = (d->perm[i] + nelt) & (2 * nelt - 1);
+
+      x = d->op0;
+      d->op0 = d->op1;
+      d->op1 = x;
+    }
+
+  if (TARGET_SIMD)
+    return aarch64_evpc_tbl (d);
+  return false;
+}
+
+/* Expand a vec_perm_const pattern.  */
+
+bool
+aarch64_expand_vec_perm_const (rtx target, rtx op0, rtx op1, rtx sel)
+{
+  struct expand_vec_perm_d d;
+  int i, nelt, which;
+
+  d.target = target;
+  d.op0 = op0;
+  d.op1 = op1;
+
+  d.vmode = GET_MODE (target);
+  gcc_assert (VECTOR_MODE_P (d.vmode));
+  d.nelt = nelt = GET_MODE_NUNITS (d.vmode);
+  d.testing_p = false;
+
+  for (i = which = 0; i < nelt; ++i)
+    {
+      rtx e = XVECEXP (sel, 0, i);
+      int ei = INTVAL (e) & (2 * nelt - 1);
+      which |= (ei < nelt ? 1 : 2);
+      d.perm[i] = ei;
+    }
+
+  switch (which)
+    {
+    default:
+      gcc_unreachable ();
+
+    case 3:
+      d.one_vector_p = false;
+      if (!rtx_equal_p (op0, op1))
+	break;
+
+      /* The elements of PERM do not suggest that only the first operand
+	 is used, but both operands are identical.  Allow easier matching
+	 of the permutation by folding the permutation into the single
+	 input vector.  */
+      /* Fall Through.  */
+    case 2:
+      for (i = 0; i < nelt; ++i)
+	d.perm[i] &= nelt - 1;
+      d.op0 = op1;
+      d.one_vector_p = true;
+      break;
+
+    case 1:
+      d.op1 = op0;
+      d.one_vector_p = true;
+      break;
+    }
+
+  return aarch64_expand_vec_perm_const_1 (&d);
+}
+
+static bool
+aarch64_vectorize_vec_perm_const_ok (enum machine_mode vmode,
+				     const unsigned char *sel)
+{
+  struct expand_vec_perm_d d;
+  unsigned int i, nelt, which;
+  bool ret;
+
+  d.vmode = vmode;
+  d.nelt = nelt = GET_MODE_NUNITS (d.vmode);
+  d.testing_p = true;
+  memcpy (d.perm, sel, nelt);
+
+  /* Calculate whether all elements are in one vector.  */
+  for (i = which = 0; i < nelt; ++i)
+    {
+      unsigned char e = d.perm[i];
+      gcc_assert (e < 2 * nelt);
+      which |= (e < nelt ? 1 : 2);
+    }
+
+  /* If all elements are from the second vector, reindex as if from the
+     first vector.  */
+  if (which == 2)
+    for (i = 0; i < nelt; ++i)
+      d.perm[i] -= nelt;
+
+  /* Check whether the mask can be applied to a single vector.  */
+  d.one_vector_p = (which != 3);
+
+  d.target = gen_raw_REG (d.vmode, LAST_VIRTUAL_REGISTER + 1);
+  d.op1 = d.op0 = gen_raw_REG (d.vmode, LAST_VIRTUAL_REGISTER + 2);
+  if (!d.one_vector_p)
+    d.op1 = gen_raw_REG (d.vmode, LAST_VIRTUAL_REGISTER + 3);
+
+  start_sequence ();
+  ret = aarch64_expand_vec_perm_const_1 (&d);
+  end_sequence ();
+
+  return ret;
+}
+
 #undef TARGET_ADDRESS_COST
 #define TARGET_ADDRESS_COST aarch64_address_cost
 
@@ -6864,6 +7153,12 @@ aarch64_c_mode_for_suffix (char suffix)
 #undef TARGET_MAX_ANCHOR_OFFSET
 #define TARGET_MAX_ANCHOR_OFFSET 4095
 
+/* vec_perm support.  */
+
+#undef TARGET_VECTORIZE_VEC_PERM_CONST_OK
+#define TARGET_VECTORIZE_VEC_PERM_CONST_OK \
+  aarch64_vectorize_vec_perm_const_ok
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-aarch64.h"
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 7a1cdc8..9ea5e0c 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -228,6 +228,8 @@
     UNSPEC_FMAX		; Used in aarch64-simd.md.
     UNSPEC_FMIN		; Used in aarch64-simd.md.
     UNSPEC_BSL		; Used in aarch64-simd.md.
+    UNSPEC_TBL		; Used in vector permute patterns.
+    UNSPEC_CONCAT	; Used in vector permute patterns.
 ])
 
 ;; -------------------------------------------------------------------
@@ -415,8 +417,9 @@
 (define_mode_attr V_cmp_result [(V8QI "V8QI") (V16QI "V16QI")
 				(V4HI "V4HI") (V8HI  "V8HI")
 				(V2SI "V2SI") (V4SI  "V4SI")
+				(DI   "DI")   (V2DI  "V2DI")
 				(V2SF "V2SI") (V4SF  "V4SI")
-				(DI   "DI")   (V2DI  "V2DI")])
+				(V2DF "V2DI")])
 
 ;; Vm for lane instructions is restricted to FP_LO_REGS.
 (define_mode_attr vwx [(V4HI "x") (V8HI "x") (HI "x")
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 5935346..bce98d0 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3014,6 +3014,7 @@ proc check_effective_target_vect_perm { } {
     } else {
         set et_vect_perm_saved 0
         if { [is-effective-target arm_neon_ok]
+	     || [istarget aarch64*-*-*]
 	     || [istarget powerpc*-*-*]
              || [istarget spu-*-*]
 	     || [istarget i?86-*-*]
@@ -3040,6 +3041,7 @@ proc check_effective_target_vect_perm_byte { } {
     } else {
         set et_vect_perm_byte_saved 0
         if { [is-effective-target arm_neon_ok]
+	     || [istarget aarch64*-*-*]
 	     || [istarget powerpc*-*-*]
              || [istarget spu-*-*] } {
             set et_vect_perm_byte_saved 1
@@ -3062,6 +3064,7 @@ proc check_effective_target_vect_perm_short { } {
     } else {
         set et_vect_perm_short_saved 0
         if { [is-effective-target arm_neon_ok]
+	     || [istarget aarch64*-*-*]
 	     || [istarget powerpc*-*-*]
              || [istarget spu-*-*] } {
             set et_vect_perm_short_saved 1
@@ -3697,7 +3700,8 @@ proc check_effective_target_vect_char_mult { } {
 	verbose "check_effective_target_vect_char_mult: using cached result" 2
     } else {
 	set et_vect_char_mult_saved 0
-	if { [istarget ia64-*-*]
+	if { [istarget aarch64*-*-*]
+	     || [istarget ia64-*-*]
 	     || [istarget i?86-*-*]
 	     || [istarget x86_64-*-*]
             || [check_effective_target_arm32] } {
@@ -3768,8 +3772,9 @@ proc check_effective_target_vect_extract_even_odd { } {
         verbose "check_effective_target_vect_extract_even_odd: using cached result" 2
     } else {
         set et_vect_extract_even_odd_saved 0 
-        if { [istarget powerpc*-*-*] 
-            || [is-effective-target arm_neon_ok]
+	if { [istarget aarch64*-*-*]
+	     || [istarget powerpc*-*-*]
+	     || [is-effective-target arm_neon_ok]
              || [istarget i?86-*-*]
              || [istarget x86_64-*-*]
              || [istarget ia64-*-*]
@@ -3793,8 +3798,9 @@ proc check_effective_target_vect_interleave { } {
         verbose "check_effective_target_vect_interleave: using cached result" 2
     } else {
         set et_vect_interleave_saved 0
-        if { [istarget powerpc*-*-*]
-            || [is-effective-target arm_neon_ok]
+	if { [istarget aarch64*-*-*]
+	     || [istarget powerpc*-*-*]
+	     || [is-effective-target arm_neon_ok]
              || [istarget i?86-*-*]
              || [istarget x86_64-*-*]
              || [istarget ia64-*-*]


* [Patch AArch64] Add zip{1, 2}, uzp{1, 2}, trn{1, 2} support for vector permute.
  2012-12-04 10:31 [Patch AArch64] Implement Vector Permute Support James Greenhalgh
@ 2012-12-04 10:36 ` James Greenhalgh
  2012-12-04 22:45   ` Marcus Shawcroft
  2012-12-04 22:44 ` [Patch AArch64] Implement Vector Permute Support Marcus Shawcroft
  2014-01-07 23:10 ` Andrew Pinski
  2 siblings, 1 reply; 16+ messages in thread
From: James Greenhalgh @ 2012-12-04 10:36 UTC (permalink / raw)
  To: gcc-patches; +Cc: marcus.shawcroft

[-- Attachment #1: Type: text/plain, Size: 967 bytes --]


Hi,

This patch improves our code generation for some cases of
constant vector permutation. In particular, we are able to
generate better code for patterns which match the output
of the zip, uzp and trn instructions.

This patch adds support for these cases.
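
For illustration (not part of the patch; the function and mask below
are invented, and a little-endian target is assumed), a constant
shuffle whose mask has the interleave-low shape would now be expected
to match the new zip1 pattern rather than falling back to tbl with a
literal mask vector:

  /* Illustrative only.  */
  typedef unsigned char v16qi __attribute__ ((vector_size (16)));

  v16qi
  interleave_lo (v16qi a, v16qi b)
  {
    const v16qi sel = { 0, 16, 1, 17, 2, 18, 3, 19,
                        4, 20, 5, 21, 6, 22, 7, 23 };
    return __builtin_shuffle (a, b, sel);
  }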

This patch has been tested with no regressions on
aarch64-none-elf.

OK to commit?

Thanks,
James Greenhalgh

---
gcc/
2012-12-04  James Greenhalgh  <james.greenhalgh@arm.com>

	* config/aarch64/aarch64-simd-builtins.def: Add new builtins.
	* config/aarch64/aarch64-simd.md (simd_type): Add uzp.
	(aarch64_<PERMUTE:perm_insn><PERMUTE:perm_hilo><mode>): New.
	* config/aarch64/aarch64.c (aarch64_evpc_trn): New.
	(aarch64_evpc_uzp): Likewise.
	(aarch64_evpc_zip): Likewise.
	(aarch64_expand_vec_perm_const_1): Check for trn, zip, uzp patterns.
	* config/aarch64/iterators.md (unspec): Add necessary unspecs.
	(PERMUTE): New.
	(perm_insn): Likewise.
	(perm_hilo): Likewise.

[-- Attachment #2: 0001-Patch-AArch64-Add-zip-1-2-uzp-1-2-trn-1-2-support-fo.patch --]
[-- Type: text/x-patch;  name=0001-Patch-AArch64-Add-zip-1-2-uzp-1-2-trn-1-2-support-fo.patch, Size: 11386 bytes --]

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 2e3c4e1..8730c56 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -206,3 +206,12 @@
   BUILTIN_VDQ_BHSI (BINOP, smin)
   BUILTIN_VDQ_BHSI (BINOP, umax)
   BUILTIN_VDQ_BHSI (BINOP, umin)
+
+  /* Implemented by
+     aarch64_<PERMUTE:perm_insn><PERMUTE:perm_hilo><mode>.  */
+  BUILTIN_VALL (BINOP, zip1)
+  BUILTIN_VALL (BINOP, zip2)
+  BUILTIN_VALL (BINOP, uzp1)
+  BUILTIN_VALL (BINOP, uzp2)
+  BUILTIN_VALL (BINOP, trn1)
+  BUILTIN_VALL (BINOP, trn2)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 2b0c8d6..df88ef4 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -128,7 +128,8 @@
 ; simd_store4s          store single structure from one lane for four registers (ST4 [index]).
 ; simd_tbl              table lookup.
 ; simd_trn              transpose.
-; simd_zip              zip/unzip.
+; simd_uzp              unzip.
+; simd_zip              zip.
 
 (define_attr "simd_type"
    "simd_abd,\
@@ -230,6 +231,7 @@
    simd_store4s,\
    simd_tbl,\
    simd_trn,\
+   simd_uzp,\
    simd_zip,\
    none"
   (const_string "none"))
@@ -3366,6 +3368,17 @@
   DONE;
 })
 
+(define_insn "aarch64_<PERMUTE:perm_insn><PERMUTE:perm_hilo><mode>"
+  [(set (match_operand:VALL 0 "register_operand" "=w")
+	(unspec:VALL [(match_operand:VALL 1 "register_operand" "w")
+		      (match_operand:VALL 2 "register_operand" "w")]
+		       PERMUTE))]
+  "TARGET_SIMD"
+  "<PERMUTE:perm_insn><PERMUTE:perm_hilo>\\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
+  [(set_attr "simd_type" "simd_<PERMUTE:perm_insn>")
+   (set_attr "simd_mode" "<MODE>")]
+)
+
 (define_insn "aarch64_st2<mode>_dreg"
   [(set (match_operand:TI 0 "aarch64_simd_struct_operand" "=Utv")
 	(unspec:TI [(match_operand:OI 1 "register_operand" "w")
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cebc8cb..0eac0b7 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6815,6 +6815,261 @@ aarch64_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
   aarch64_expand_vec_perm_1 (target, op0, op1, sel);
 }
 
+/* Recognize patterns suitable for the TRN instructions.  */
+static bool
+aarch64_evpc_trn (struct expand_vec_perm_d *d)
+{
+  unsigned int i, odd, mask, nelt = d->nelt;
+  rtx out, in0, in1, x;
+  rtx (*gen) (rtx, rtx, rtx);
+  enum machine_mode vmode = d->vmode;
+
+  if (GET_MODE_UNIT_SIZE (vmode) > 8)
+    return false;
+
+  /* Note that these are little-endian tests.
+     We correct for big-endian later.  */
+  if (d->perm[0] == 0)
+    odd = 0;
+  else if (d->perm[0] == 1)
+    odd = 1;
+  else
+    return false;
+  mask = (d->one_vector_p ? nelt - 1 : 2 * nelt - 1);
+
+  for (i = 0; i < nelt; i += 2)
+    {
+      if (d->perm[i] != i + odd)
+	return false;
+      if (d->perm[i + 1] != ((i + nelt + odd) & mask))
+	return false;
+    }
+
+  /* Success!  */
+  if (d->testing_p)
+    return true;
+
+  in0 = d->op0;
+  in1 = d->op1;
+  if (BYTES_BIG_ENDIAN)
+    {
+      x = in0, in0 = in1, in1 = x;
+      odd = !odd;
+    }
+  out = d->target;
+
+  if (odd)
+    {
+      switch (vmode)
+	{
+	case V16QImode: gen = gen_aarch64_trn2v16qi; break;
+	case V8QImode: gen = gen_aarch64_trn2v8qi; break;
+	case V8HImode: gen = gen_aarch64_trn2v8hi; break;
+	case V4HImode: gen = gen_aarch64_trn2v4hi; break;
+	case V4SImode: gen = gen_aarch64_trn2v4si; break;
+	case V2SImode: gen = gen_aarch64_trn2v2si; break;
+	case V2DImode: gen = gen_aarch64_trn2v2di; break;
+	case V4SFmode: gen = gen_aarch64_trn2v4sf; break;
+	case V2SFmode: gen = gen_aarch64_trn2v2sf; break;
+	case V2DFmode: gen = gen_aarch64_trn2v2df; break;
+	default:
+	  return false;
+	}
+    }
+  else
+    {
+      switch (vmode)
+	{
+	case V16QImode: gen = gen_aarch64_trn1v16qi; break;
+	case V8QImode: gen = gen_aarch64_trn1v8qi; break;
+	case V8HImode: gen = gen_aarch64_trn1v8hi; break;
+	case V4HImode: gen = gen_aarch64_trn1v4hi; break;
+	case V4SImode: gen = gen_aarch64_trn1v4si; break;
+	case V2SImode: gen = gen_aarch64_trn1v2si; break;
+	case V2DImode: gen = gen_aarch64_trn1v2di; break;
+	case V4SFmode: gen = gen_aarch64_trn1v4sf; break;
+	case V2SFmode: gen = gen_aarch64_trn1v2sf; break;
+	case V2DFmode: gen = gen_aarch64_trn1v2df; break;
+	default:
+	  return false;
+	}
+    }
+
+  emit_insn (gen (out, in0, in1));
+  return true;
+}
+
+/* Recognize patterns suitable for the UZP instructions.  */
+static bool
+aarch64_evpc_uzp (struct expand_vec_perm_d *d)
+{
+  unsigned int i, odd, mask, nelt = d->nelt;
+  rtx out, in0, in1, x;
+  rtx (*gen) (rtx, rtx, rtx);
+  enum machine_mode vmode = d->vmode;
+
+  if (GET_MODE_UNIT_SIZE (vmode) > 8)
+    return false;
+
+  /* Note that these are little-endian tests.
+     We correct for big-endian later.  */
+  if (d->perm[0] == 0)
+    odd = 0;
+  else if (d->perm[0] == 1)
+    odd = 1;
+  else
+    return false;
+  mask = (d->one_vector_p ? nelt - 1 : 2 * nelt - 1);
+
+  for (i = 0; i < nelt; i++)
+    {
+      unsigned elt = (i * 2 + odd) & mask;
+      if (d->perm[i] != elt)
+	return false;
+    }
+
+  /* Success!  */
+  if (d->testing_p)
+    return true;
+
+  in0 = d->op0;
+  in1 = d->op1;
+  if (BYTES_BIG_ENDIAN)
+    {
+      x = in0, in0 = in1, in1 = x;
+      odd = !odd;
+    }
+  out = d->target;
+
+  if (odd)
+    {
+      switch (vmode)
+	{
+	case V16QImode: gen = gen_aarch64_uzp2v16qi; break;
+	case V8QImode: gen = gen_aarch64_uzp2v8qi; break;
+	case V8HImode: gen = gen_aarch64_uzp2v8hi; break;
+	case V4HImode: gen = gen_aarch64_uzp2v4hi; break;
+	case V4SImode: gen = gen_aarch64_uzp2v4si; break;
+	case V2SImode: gen = gen_aarch64_uzp2v2si; break;
+	case V2DImode: gen = gen_aarch64_uzp2v2di; break;
+	case V4SFmode: gen = gen_aarch64_uzp2v4sf; break;
+	case V2SFmode: gen = gen_aarch64_uzp2v2sf; break;
+	case V2DFmode: gen = gen_aarch64_uzp2v2df; break;
+	default:
+	  return false;
+	}
+    }
+  else
+    {
+      switch (vmode)
+	{
+	case V16QImode: gen = gen_aarch64_uzp1v16qi; break;
+	case V8QImode: gen = gen_aarch64_uzp1v8qi; break;
+	case V8HImode: gen = gen_aarch64_uzp1v8hi; break;
+	case V4HImode: gen = gen_aarch64_uzp1v4hi; break;
+	case V4SImode: gen = gen_aarch64_uzp1v4si; break;
+	case V2SImode: gen = gen_aarch64_uzp1v2si; break;
+	case V2DImode: gen = gen_aarch64_uzp1v2di; break;
+	case V4SFmode: gen = gen_aarch64_uzp1v4sf; break;
+	case V2SFmode: gen = gen_aarch64_uzp1v2sf; break;
+	case V2DFmode: gen = gen_aarch64_uzp1v2df; break;
+	default:
+	  return false;
+	}
+    }
+
+  emit_insn (gen (out, in0, in1));
+  return true;
+}
+
+/* Recognize patterns suitable for the ZIP instructions.  */
+static bool
+aarch64_evpc_zip (struct expand_vec_perm_d *d)
+{
+  unsigned int i, high, mask, nelt = d->nelt;
+  rtx out, in0, in1, x;
+  rtx (*gen) (rtx, rtx, rtx);
+  enum machine_mode vmode = d->vmode;
+
+  if (GET_MODE_UNIT_SIZE (vmode) > 8)
+    return false;
+
+  /* Note that these are little-endian tests.
+     We correct for big-endian later.  */
+  high = nelt / 2;
+  if (d->perm[0] == high)
+    /* Do Nothing.  */
+    ;
+  else if (d->perm[0] == 0)
+    high = 0;
+  else
+    return false;
+  mask = (d->one_vector_p ? nelt - 1 : 2 * nelt - 1);
+
+  for (i = 0; i < nelt / 2; i++)
+    {
+      unsigned elt = (i + high) & mask;
+      if (d->perm[i * 2] != elt)
+	return false;
+      elt = (elt + nelt) & mask;
+      if (d->perm[i * 2 + 1] != elt)
+	return false;
+    }
+
+  /* Success!  */
+  if (d->testing_p)
+    return true;
+
+  in0 = d->op0;
+  in1 = d->op1;
+  if (BYTES_BIG_ENDIAN)
+    {
+      x = in0, in0 = in1, in1 = x;
+      high = !high;
+    }
+  out = d->target;
+
+  if (high)
+    {
+      switch (vmode)
+	{
+	case V16QImode: gen = gen_aarch64_zip2v16qi; break;
+	case V8QImode: gen = gen_aarch64_zip2v8qi; break;
+	case V8HImode: gen = gen_aarch64_zip2v8hi; break;
+	case V4HImode: gen = gen_aarch64_zip2v4hi; break;
+	case V4SImode: gen = gen_aarch64_zip2v4si; break;
+	case V2SImode: gen = gen_aarch64_zip2v2si; break;
+	case V2DImode: gen = gen_aarch64_zip2v2di; break;
+	case V4SFmode: gen = gen_aarch64_zip2v4sf; break;
+	case V2SFmode: gen = gen_aarch64_zip2v2sf; break;
+	case V2DFmode: gen = gen_aarch64_zip2v2df; break;
+	default:
+	  return false;
+	}
+    }
+  else
+    {
+      switch (vmode)
+	{
+	case V16QImode: gen = gen_aarch64_zip1v16qi; break;
+	case V8QImode: gen = gen_aarch64_zip1v8qi; break;
+	case V8HImode: gen = gen_aarch64_zip1v8hi; break;
+	case V4HImode: gen = gen_aarch64_zip1v4hi; break;
+	case V4SImode: gen = gen_aarch64_zip1v4si; break;
+	case V2SImode: gen = gen_aarch64_zip1v2si; break;
+	case V2DImode: gen = gen_aarch64_zip1v2di; break;
+	case V4SFmode: gen = gen_aarch64_zip1v4sf; break;
+	case V2SFmode: gen = gen_aarch64_zip1v2sf; break;
+	case V2DFmode: gen = gen_aarch64_zip1v2df; break;
+	default:
+	  return false;
+	}
+    }
+
+  emit_insn (gen (out, in0, in1));
+  return true;
+}
+
 static bool
 aarch64_evpc_tbl (struct expand_vec_perm_d *d)
 {
@@ -6865,7 +7120,15 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
     }
 
   if (TARGET_SIMD)
-    return aarch64_evpc_tbl (d);
+    {
+      if (aarch64_evpc_zip (d))
+	return true;
+      else if (aarch64_evpc_uzp (d))
+	return true;
+      else if (aarch64_evpc_trn (d))
+	return true;
+      return aarch64_evpc_tbl (d);
+    }
   return false;
 }
 
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 9ea5e0c..d710ea0 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -230,6 +230,12 @@
     UNSPEC_BSL		; Used in aarch64-simd.md.
     UNSPEC_TBL		; Used in vector permute patterns.
     UNSPEC_CONCAT	; Used in vector permute patterns.
+    UNSPEC_ZIP1		; Used in vector permute patterns.
+    UNSPEC_ZIP2		; Used in vector permute patterns.
+    UNSPEC_UZP1		; Used in vector permute patterns.
+    UNSPEC_UZP2		; Used in vector permute patterns.
+    UNSPEC_TRN1		; Used in vector permute patterns.
+    UNSPEC_TRN2		; Used in vector permute patterns.
 ])
 
 ;; -------------------------------------------------------------------
@@ -649,6 +655,9 @@
 
 (define_int_iterator VCMP_U [UNSPEC_CMHS UNSPEC_CMHI UNSPEC_CMTST])
 
+(define_int_iterator PERMUTE [UNSPEC_ZIP1 UNSPEC_ZIP2
+			      UNSPEC_TRN1 UNSPEC_TRN2
+			      UNSPEC_UZP1 UNSPEC_UZP2])
 
 ;; -------------------------------------------------------------------
 ;; Int Iterators Attributes.
@@ -732,3 +741,10 @@
 (define_int_attr offsetlr [(UNSPEC_SSLI	"1") (UNSPEC_USLI "1")
 			   (UNSPEC_SSRI	"0") (UNSPEC_USRI "0")])
 
+(define_int_attr perm_insn [(UNSPEC_ZIP1 "zip") (UNSPEC_ZIP2 "zip")
+			    (UNSPEC_TRN1 "trn") (UNSPEC_TRN2 "trn")
+			    (UNSPEC_UZP1 "uzp") (UNSPEC_UZP2 "uzp")])
+
+(define_int_attr perm_hilo [(UNSPEC_ZIP1 "1") (UNSPEC_ZIP2 "2")
+			    (UNSPEC_TRN1 "1") (UNSPEC_TRN2 "2")
+			    (UNSPEC_UZP1 "1") (UNSPEC_UZP2 "2")])


* Re: [Patch AArch64] Implement Vector Permute Support
  2012-12-04 10:31 [Patch AArch64] Implement Vector Permute Support James Greenhalgh
  2012-12-04 10:36 ` [Patch AArch64] Add zip{1, 2}, uzp{1, 2}, trn{1, 2} support for vector permute James Greenhalgh
@ 2012-12-04 22:44 ` Marcus Shawcroft
  2012-12-06 16:25   ` James Greenhalgh
  2014-01-07 23:10 ` Andrew Pinski
  2 siblings, 1 reply; 16+ messages in thread
From: Marcus Shawcroft @ 2012-12-04 22:44 UTC (permalink / raw)
  To: James Greenhalgh; +Cc: gcc-patches, marcus.shawcroft

OK

/Marcus

On 4 December 2012 10:31, James Greenhalgh <james.greenhalgh@arm.com> wrote:
>
> Hi,
>
> This patch adds support for Vector Shuffle style operations
> through support for TARGET_VECTORIZE_VEC_PERM_CONST_OK and
> the vec_perm and vec_perm_const standard patterns.
>
> In this patch we add the framework and support for the
> generic tbl instruction. This can be used to handle any
> vector permute operation, but we can do a better job for
> some special cases. The second patch of this series does
> that better job for the ZIP, UZP and TRN instructions.
>
> Is this OK to commit?
>
> Thanks,
> James Greenhalgh
>
> ---
> gcc/
>
> 2012-12-04  James Greenhalgh  <james.greenhalgh@arm.com>
>
>         * config/aarch64/aarch64-protos.h
>         (aarch64_split_combinev16qi): New.
>         (aarch64_expand_vec_perm): Likewise.
>         (aarch64_expand_vec_perm_const): Likewise.
>         * config/aarch64/aarch64-simd.md (vec_perm_const<mode>): New.
>         (vec_perm<mode>): Likewise.
>         (aarch64_tbl1<mode>): Likewise.
>         (aarch64_tbl2v16qi): Likewise.
>         (aarch64_combinev16qi): New.
>         * config/aarch64/aarch64.c
>         (aarch64_vectorize_vec_perm_const_ok): New.
>         (aarch64_split_combinev16qi): Likewise.
>         (MAX_VECT_LEN): Define.
>         (expand_vec_perm_d): New.
>         (aarch64_expand_vec_perm_1): Likewise.
>         (aarch64_expand_vec_perm): Likewise.
>         (aarch64_evpc_tbl): Likewise.
>         (aarch64_expand_vec_perm_const_1): Likewise.
>         (aarch64_expand_vec_perm_const): Likewise.
>         (aarch64_vectorize_vec_perm_const_ok): Likewise.
>         (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Likewise.
>         * config/aarch64/iterators.md
>         (unspec): Add UNSPEC_TBL, UNSPEC_CONCAT.
>         (V_cmp_result): Add mapping for V2DF.
>
> gcc/testsuite/
>
> 2012-12-04  James Greenhalgh  <james.greenhalgh@arm.com>
>
>         * lib/target-supports.exp
>         (check_effective_target_vect_perm): Allow aarch64*-*-*.
>         (check_effective_target_vect_perm_byte): Likewise.
>         (check_effective_target_vect_perm_short): Likewise.
>         (check_effective_target_vect_char_mult): Likewise.
>         (check_effective_target_vect_extract_even_odd): Likewise.
>         (check_effective_target_vect_interleave): Likewise.


* Re: [Patch AArch64] Add zip{1, 2}, uzp{1, 2}, trn{1, 2} support for vector permute.
  2012-12-04 10:36 ` [Patch AArch64] Add zip{1, 2}, uzp{1, 2}, trn{1, 2} support for vector permute James Greenhalgh
@ 2012-12-04 22:45   ` Marcus Shawcroft
  0 siblings, 0 replies; 16+ messages in thread
From: Marcus Shawcroft @ 2012-12-04 22:45 UTC (permalink / raw)
  To: James Greenhalgh; +Cc: gcc-patches, marcus.shawcroft

OK
/Marcus

On 4 December 2012 10:36, James Greenhalgh <james.greenhalgh@arm.com> wrote:
>
> Hi,
>
> This patch improves our code generation for some cases of
> constant vector permutation. In particular, we are able to
> generate better code for patterns which match the output
> of the zip, uzp and trn instructions.
>
> This patch adds support for these cases.
>
> This patch has been tested with no regressions on
> aarch64-none-elf.
>
> OK to commit?
>
> Thanks,
> James Greenhalgh
>
> ---
> gcc/
> 2012-12-04  James Greenhalgh  <james.greenhalgh@arm.com>
>
>         * config/aarch64/aarch64-simd-builtins.def: Add new builtins.
>         * config/aarch64/aarch64-simd.md (simd_type): Add uzp.
>         (aarch64_<PERMUTE:perm_insn><PERMUTE:perm_hilo><mode>): New.
>         * config/aarch64/aarch64.c (aarch64_evpc_trn): New.
>         (aarch64_evpc_uzp): Likewise.
>         (aarch64_evpc_zip): Likewise.
>         (aarch64_expand_vec_perm_const_1): Check for trn, zip, uzp patterns.
>         * config/aarch64/iterators.md (unspec): Add neccessary unspecs.
>         (PERMUTE): New.
>         (perm_insn): Likewise.
>         (perm_hilo): Likewise.


* RE: [Patch AArch64] Implement Vector Permute Support
  2012-12-04 22:44 ` [Patch AArch64] Implement Vector Permute Support Marcus Shawcroft
@ 2012-12-06 16:25   ` James Greenhalgh
  0 siblings, 0 replies; 16+ messages in thread
From: James Greenhalgh @ 2012-12-06 16:25 UTC (permalink / raw)
  To: Marcus Shawcroft; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 860 bytes --]

> OK
> 
> /Marcus

Thanks Marcus,

I've committed this and the follow-up patch to trunk and
back-ported them to AArch64-4.7-branch.

The back-port required back-porting the attached patch,
which fixes up the expected behaviour of
gcc/testsuite/gcc.dg/vect/slp-perm-8.c.

After committing this as a prerequisite, the patch series
tests clean with no regressions on aarch64-none-elf.

Thanks,
James Greenhalgh

---
gcc/testsuite

2012-12-06  James Greenhalgh  <james.greenhalgh@arm.com>

	Backport from mainline.
	2012-05-31  Greta Yorsh  <Greta.Yorsh@arm.com>

	* lib/target-supports.exp (check_effective_target_vect_char_mult):
	Add arm32 to targets.
	* gcc.dg/vect/slp-perm-8.c (main): Prevent vectorization
	of the initialization loop.
	(dg-final): Adjust the expected number of vectorized loops depending
	on vect_char_mult target selector.

[-- Attachment #2: 0001-aarch64-4.7-Backport-fix-to-gcc.dg-vect-slp-perm-8.c.patch --]
[-- Type: application/octet-stream, Size: 1605 bytes --]

diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-8.c b/gcc/testsuite/gcc.dg/vect/slp-perm-8.c
index d211ef9..c4854d5 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-8.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-8.c
@@ -32,8 +32,7 @@ int main (int argc, const char* argv[])
     {
       input[i] = i;
       output[i] = 0;
-      if (input[i] > 256)
-        abort ();
+      __asm__ volatile ("");
     }
 
   for (i = 0; i < N / 3; i++)
@@ -52,7 +51,8 @@ int main (int argc, const char* argv[])
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_perm_byte } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target { vect_perm_byte && vect_char_mult } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_perm_byte && {! vect_char_mult } } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_perm_byte } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index ccd3966..d7836eb 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3555,7 +3555,8 @@ proc check_effective_target_vect_char_mult { } {
 	set et_vect_char_mult_saved 0
 	if { [istarget ia64-*-*]
 	     || [istarget i?86-*-*]
-	     || [istarget x86_64-*-*] } {
+	     || [istarget x86_64-*-*]
+            || [check_effective_target_arm32] } {
 	   set et_vect_char_mult_saved 1
 	}
     }





* Re: [Patch AArch64] Implement Vector Permute Support
  2012-12-04 10:31 [Patch AArch64] Implement Vector Permute Support James Greenhalgh
  2012-12-04 10:36 ` [Patch AArch64] Add zip{1, 2}, uzp{1, 2}, trn{1, 2} support for vector permute James Greenhalgh
  2012-12-04 22:44 ` [Patch AArch64] Implement Vector Permute Support Marcus Shawcroft
@ 2014-01-07 23:10 ` Andrew Pinski
       [not found]   ` <72A61951-68B2-4776-A2B8-05DC4E1F53A7@arm.com>
  2 siblings, 1 reply; 16+ messages in thread
From: Andrew Pinski @ 2014-01-07 23:10 UTC (permalink / raw)
  To: James Greenhalgh; +Cc: GCC Patches, Marcus Shawcroft

On Tue, Dec 4, 2012 at 2:31 AM, James Greenhalgh
<james.greenhalgh@arm.com> wrote:
>
> Hi,
>
> This patch adds support for Vector Shuffle style operations
> through support for TARGET_VECTORIZE_VEC_PERM_CONST_OK and
> the vec_perm and vec_perm_const standard patterns.
>
> In this patch we add the framework and support for the
> generic tbl instruction. This can be used to handle any
> vector permute operation, but we can do a better job for
> some special cases. The second patch of this series does
> that better job for the ZIP, UZP and TRN instructions.
>
> Is this OK to commit?

This breaks big-endian aarch64 in a very bad way.  vec_perm<mode> is
enabled for big-endian but aarch64_expand_vec_perm will ICE right
away.  Can you please test big-endian also next time?
Here is the shortest testcase which fails at -O3:

void fill_window(unsigned short *p, unsigned wsize)
{
    unsigned n, m;
    do {
       m = *--p;
       *p = (unsigned short)(m >= wsize ? m-wsize : 0);
    } while (--n);
}

This comes from zlib and it blocks my building of the trunk.

Thanks,
Andrew Pinski



>
> Thanks,
> James Greenhalgh
>
> ---
> gcc/
>
> 2012-12-04  James Greenhalgh  <james.greenhalgh@arm.com>
>
>         * config/aarch64/aarch64-protos.h
>         (aarch64_split_combinev16qi): New.
>         (aarch64_expand_vec_perm): Likewise.
>         (aarch64_expand_vec_perm_const): Likewise.
>         * config/aarch64/aarch64-simd.md (vec_perm_const<mode>): New.
>         (vec_perm<mode>): Likewise.
>         (aarch64_tbl1<mode>): Likewise.
>         (aarch64_tbl2v16qi): Likewise.
>         (aarch64_combinev16qi): New.
>         * config/aarch64/aarch64.c
>         (aarch64_vectorize_vec_perm_const_ok): New.
>         (aarch64_split_combinev16qi): Likewise.
>         (MAX_VECT_LEN): Define.
>         (expand_vec_perm_d): New.
>         (aarch64_expand_vec_perm_1): Likewise.
>         (aarch64_expand_vec_perm): Likewise.
>         (aarch64_evpc_tbl): Likewise.
>         (aarch64_expand_vec_perm_const_1): Likewise.
>         (aarch64_expand_vec_perm_const): Likewise.
>         (aarch64_vectorize_vec_perm_const_ok): Likewise.
>         (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Likewise.
>         * config/aarch64/iterators.md
>         (unspec): Add UNSPEC_TBL, UNSPEC_CONCAT.
>         (V_cmp_result): Add mapping for V2DF.
>
> gcc/testsuite/
>
> 2012-12-04  James Greenhalgh  <james.greenhalgh@arm.com>
>
>         * lib/target-supports.exp
>         (check_effective_target_vect_perm): Allow aarch64*-*-*.
>         (check_effective_target_vect_perm_byte): Likewise.
>         (check_effective_target_vect_perm_short): Likewise.
>         (check_effective_target_vect_char_mult): Likewise.
>         (check_effective_target_vect_extract_even_odd): Likewise.
>         (check_effective_target_vect_interleave): Likewise.


* Re: [Patch AArch64] Implement Vector Permute Support
       [not found]   ` <72A61951-68B2-4776-A2B8-05DC4E1F53A7@arm.com>
@ 2014-01-08  0:10     ` Andrew Pinski
  2014-01-08 11:00       ` James Greenhalgh
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Pinski @ 2014-01-08  0:10 UTC (permalink / raw)
  To: Marcus Shawcroft; +Cc: James Greenhalgh, GCC Patches, Richard Earnshaw

On Tue, Jan 7, 2014 at 4:05 PM, Marcus Shawcroft
<Marcus.Shawcroft@arm.com> wrote:
>
> On 7 Jan 2014, at 23:10, Andrew Pinski <pinskia@gmail.com> wrote:
>
>> On Tue, Dec 4, 2012 at 2:31 AM, James Greenhalgh
>> <james.greenhalgh@arm.com> wrote:
>>>
>>> Hi,
>>>
>>> This patch adds support for Vector Shuffle style operations
>>> through support for TARGET_VECTORIZE_VEC_PERM_CONST_OK and
>>> the vec_perm and vec_perm_const standard patterns.
>>>
>>> In this patch we add the framework and support for the
>>> generic tbl instruction. This can be used to handle any
>>> vector permute operation, but we can do a better job for
>>> some special cases. The second patch of this series does
>>> that better job for the ZIP, UZP and TRN instructions.
>>>
>>> Is this OK to commit?
>>
>> This breaks big-endian aarch64 in a very bad way.  vec_perm<mode> is
>> enabled for big-endian but aarch64_expand_vec_perm will ICE right
>> away.  Can you please test big-endian also next time?
>> Here is the shortest testcase which fails at -O3:
>>
>> void fill_window(unsigned short *p, unsigned wsize)
>> {
>>    unsigned n, m;
>>    do {
>>       m = *--p;
>>       *p = (unsigned short)(m >= wsize ? m-wsize : 0);
>>    } while (--n);
>> }
>>
>> This comes from zlib and it blocks my building of the trunk.
>>
>> Thanks,
>> Andrew Pinski
>>
>
> Andrew, We know that there are numerous issues with aarch64 BE advsimd support in GCC.  The aarch64_be support is very much a work in progress.  Tejas sorted out a number of fundamentals with a series of patches in November, notably in PCS conformance.  There is more to come.  However, aarch64_be-* support in gcc 4.9 is not going to match the level of quality for the aarch64-* port.


Yes, but it should not introduce an ICE while GCC is in stage 3.  This
was working before because there was no vec_perm pattern.  I am going to
request that this be reverted soon if it is not fixed (the GCC rules are
clear here).

Thanks,
Andrew Pinski

P.S. Sorry if you received this message twice; I had to remove your
company's stupid message.

>
> Cheers
> /Marcus


* Re: [Patch AArch64] Implement Vector Permute Support
  2014-01-08  0:10     ` Andrew Pinski
@ 2014-01-08 11:00       ` James Greenhalgh
  2014-01-14 15:19         ` Alex Velenko
  0 siblings, 1 reply; 16+ messages in thread
From: James Greenhalgh @ 2014-01-08 11:00 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Marcus Shawcroft, GCC Patches, Richard Earnshaw, alex.velenko,
	tejas.belagod

On Wed, Jan 08, 2014 at 12:10:13AM +0000, Andrew Pinski wrote:
> On Tue, Jan 7, 2014 at 4:05 PM, Marcus Shawcroft
> <Marcus.Shawcroft@arm.com> wrote:
> >
> > Andrew, We know that there are numerous issues with aarch64 BE advsimd support in GCC.  The aarch64_be support is very much a work in progress.  Tejas sorted out a number of fundamentals with a series of patches in November, notably in PCS conformance.  There is more to come.  However, aarch64_be-* support in gcc 4.9 is not going to match the level of quality for the aarch64-* port.
> 
> 
> Yes but should not introduce an ICE while GCC is in stage3.  This was
> working before due not having a vec_perm before.  I am going to
> request this to be reverted soon if it is not fixed (the GCC rules are
> clear here).

Hi Andrew,

I am confused: are you also proposing to revert this patch on the 4.8
branch? The code has been sitting with that assert in place on trunk
for well over a year (note that December 2012 was during 4.8's
stage 3, not 4.9), so there is no regression here.

But, that doesn't absolve me of the fact that this is broken in
a stupid way for big-endian AArch64.

The band-aid, which I can prepare, would be to turn off
vec_perm for BYTES_BIG_ENDIAN targets on the 4.9 and
4.8 branches. This is the most sensible thing to do in the short
term. Naturally, you will lose vectorization of permute operations,
but at least you won't get the ICE or wrong code generation. This
is what the ARM back-end (from which I ported the vec_perm code)
does.

In the longer term you would want to audit the lane-numbering
discrepancies between GCC and our architectural lane-numbers.
We are some way towards that after Tejas' PCS conformance fix,
but as Marcus has said, there is more to come. I should imagine
that in this case you will need to provide a run-time transformation
between the permute mask and an appropriate mask for tbl.
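
As a purely hypothetical sketch of the kind of remap being described
(the helper name and index formula below are invented, and the correct
mapping between GCC's element numbering and the architectural lane
numbering is exactly the open question), the constant-mask path might
apply something like this per element; the variable-mask vec_perm path
would need the equivalent emitted as instructions at run time:

  /* Hypothetical and correctness undetermined: remap a constant
     selector from GCC element numbering to a lane numbering suitable
     for tbl on big-endian.  */
  static void
  remap_sel_for_big_endian (unsigned char *sel, unsigned int nelt)
  {
    unsigned int i;
    for (i = 0; i < nelt; i++)
      sel[i] = (sel[i] < nelt
                ? nelt - 1 - sel[i]           /* Element from op0.  */
                : 3 * nelt - 1 - sel[i]);     /* Element from op1.  */
  }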

To reiterate, this does not need to be reverted; we'll get a fix out
disabling vec_perm for BYTES_BIG_ENDIAN on the 4.8 branch and 4.9.

Thanks,
James


* Re: [Patch AArch64] Implement Vector Permute Support
  2014-01-08 11:00       ` James Greenhalgh
@ 2014-01-14 15:19         ` Alex Velenko
  2014-01-14 15:51           ` pinskia
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Velenko @ 2014-01-14 15:19 UTC (permalink / raw)
  To: James Greenhalgh
  Cc: Andrew Pinski, Marcus Shawcroft, GCC Patches, Richard Earnshaw,
	Tejas Belagod

[-- Attachment #1: Type: text/plain, Size: 1048 bytes --]

Hi,

This patch turns off the vec_perm patterns for aarch64_be; this should
resolve the issue highlighted here:
http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00321.html
With this patch applied, the test case provided in that link compiles
without an ICE.

However, the Big-Endian port is still in development. This patch exposes
another known but unrelated issue with Big-Endian Large-Int modes.

The patch has been tested on aarch64-none-elf and aarch64_be-none-elf,
resulting in five further regressions due to the broken implementation
of Big-Endian Large-Int modes.

Kind regards,
Alex Velenko

gcc/

2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>

	* config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
	* config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.

gcc/testsuite/

2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>

	* lib/target-supports.exp
	(check_effective_target_vect_perm): Exclude aarch64_be.
	(check_effective_target_vect_perm_byte): Likewise.
	(check_effective_target_vect_perm_short): Likewise.


[-- Attachment #2: vec-perm.patch --]
[-- Type: text/x-patch, Size: 2075 bytes --]

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index bc47a291de4b9b24d829e4dbf060fff7a321558f..43a9c5b27d78a47cf965636a03232005a4c8e7c3 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3840,7 +3840,7 @@
    (match_operand:VB 1 "register_operand")
    (match_operand:VB 2 "register_operand")
    (match_operand:VB 3 "register_operand")]
-  "TARGET_SIMD"
+  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
 {
   aarch64_expand_vec_perm (operands[0], operands[1],
 			   operands[2], operands[3]);
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 95360089b89d5fef2997dc6dbe7f47a6864143ea..084668af5124aa1c4a7f25495cf44b52811d0e62 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3417,7 +3417,8 @@ proc check_effective_target_vect_perm { } {
     } else {
         set et_vect_perm_saved 0
         if { [is-effective-target arm_neon_ok]
-	     || [istarget aarch64*-*-*]
+	     || ([istarget aarch64*-*-*]
+		 && ![istarget aarch64_be*-*-*])
 	     || [istarget powerpc*-*-*]
              || [istarget spu-*-*]
 	     || [istarget i?86-*-*]
@@ -3445,7 +3446,8 @@ proc check_effective_target_vect_perm_byte { } {
         set et_vect_perm_byte_saved 0
         if { ([is-effective-target arm_neon_ok]
 	      && [is-effective-target arm_little_endian])
-	     || [istarget aarch64*-*-*]
+	     || ([istarget aarch64*-*-*]
+		 && ![istarget aarch64_be*-*-*])
 	     || [istarget powerpc*-*-*]
              || [istarget spu-*-*] } {
             set et_vect_perm_byte_saved 1
@@ -3469,7 +3471,8 @@ proc check_effective_target_vect_perm_short { } {
         set et_vect_perm_short_saved 0
         if { ([is-effective-target arm_neon_ok]
 	      && [is-effective-target arm_little_endian])
-	     || [istarget aarch64*-*-*]
+	     || ([istarget aarch64*-*-*]
+		 && ![istarget aarch64_be*-*-*])
 	     || [istarget powerpc*-*-*]
              || [istarget spu-*-*] } {
             set et_vect_perm_short_saved 1


* Re: [Patch AArch64] Implement Vector Permute Support
  2014-01-14 15:19         ` Alex Velenko
@ 2014-01-14 15:51           ` pinskia
  2014-01-16 14:43             ` Alex Velenko
  0 siblings, 1 reply; 16+ messages in thread
From: pinskia @ 2014-01-14 15:51 UTC (permalink / raw)
  To: Alex Velenko
  Cc: James Greenhalgh, Marcus Shawcroft, GCC Patches,
	Richard Earnshaw, Tejas Belagod



> On Jan 14, 2014, at 7:19 AM, Alex Velenko <Alex.Velenko@arm.com> wrote:
> 
> Hi,
> 
> This patch turns off the vec_perm patterns for aarch64_be, this should resolve
> the issue  highlighted here http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00321.html
> With this patch applied, the test case provided in that link compiles without an ICE.
> 
> However, the Big-Endian port is still in development. This patch exposes
> another known but unrelated issue with Big-Endian Large-Int modes.
> 
> The patch has been tested on aarch64-none-elf and aarch64_be-none-elf resulting in five
> further regression due to the broken implementation of Big-Endian Large-Int modes.
> 
> Kind regards,
> Alex Velenko
> 
> gcc/
> 
> 2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>
> 
>    * config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
>    * config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.
> 
> gcc/testsuite/
> 
> 2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>
> 
>    * lib/target-supports.exp
>    (check_effective_target_vect_perm): Exclude aarch64_be.
>    (check_effective_target_vect_perm_byte): Likewise.
>    (check_effective_target_vect_perm_short): Likewise.

I think you want to use a function to check if the target is effectively big-endian instead.  Internally at Cavium, our elf compiler has big-endian multi-lib. 

Thanks,
Andrew

> 
> <vec-perm.patch>


* Re: [Patch AArch64] Implement Vector Permute Support
  2014-01-14 15:51           ` pinskia
@ 2014-01-16 14:43             ` Alex Velenko
  2014-01-17 15:55               ` Richard Earnshaw
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Velenko @ 2014-01-16 14:43 UTC (permalink / raw)
  To: pinskia
  Cc: James Greenhalgh, Marcus Shawcroft, GCC Patches,
	Richard Earnshaw, Tejas Belagod

[-- Attachment #1: Type: text/plain, Size: 2074 bytes --]

On 14/01/14 15:51, pinskia@gmail.com wrote:
>
>
>> On Jan 14, 2014, at 7:19 AM, Alex Velenko <Alex.Velenko@arm.com> wrote:
>>
>> Hi,
>>
>> This patch turns off the vec_perm patterns for aarch64_be, this should resolve
>> the issue  highlighted here http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00321.html
>> With this patch applied, the test case provided in that link compiles without an ICE.
>>
>> However, the Big-Endian port is still in development. This patch exposes
>> another known but unrelated issue with Big-Endian Large-Int modes.
>>
>> The patch has been tested on aarch64-none-elf and aarch64_be-none-elf resulting in five
>> further regression due to the broken implementation of Big-Endian Large-Int modes.
>>
>> Kind regards,
>> Alex Velenko
>>
>> gcc/
>>
>> 2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>
>>
>>     * config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
>>     * config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.
>>
>> gcc/testsuite/
>>
>> 2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>
>>
>>     * lib/target-supports.exp
>>     (check_effective_target_vect_perm): Exclude aarch64_be.
>>     (check_effective_target_vect_perm_byte): Likewise.
>>     (check_effective_target_vect_perm_short): Likewise.
>
> I think you want to use a function to check if the target is effectively big-endian instead.  Internally at Cavium, our elf compiler has big-endian multi-lib.
>
> Thanks,
> Andrew
>
>>
>> <vec-perm.patch>
>

Hi,
Here is a vec-perm patch with the changes proposed previously.
It has been tested on little- and big-endian with no additional issues
appearing.

Kind regards,
Alex

gcc/

2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>

	* config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
	* config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.

gcc/testsuite/

2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>

	* lib/target-supports.exp
	(check_effective_target_vect_perm): Exclude aarch64_be.
	(check_effective_target_vect_perm_byte): Likewise.
	(check_effective_target_vect_perm_short): Likewise.

[-- Attachment #2: Vect-perm2.patch --]
[-- Type: text/x-patch, Size: 2123 bytes --]

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index bc47a291de4b9b24d829e4dbf060fff7a321558f..43a9c5b27d78a47cf965636a03232005a4c8e7c3 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3840,7 +3840,7 @@
    (match_operand:VB 1 "register_operand")
    (match_operand:VB 2 "register_operand")
    (match_operand:VB 3 "register_operand")]
-  "TARGET_SIMD"
+  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
 {
   aarch64_expand_vec_perm (operands[0], operands[1],
 			   operands[2], operands[3]);
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 159f88f28dd838d4aee6d75f8d21897695609c49..b425183c1e893c6511ba575a0cd416563c9510be 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3436,7 +3436,8 @@ proc check_effective_target_vect_perm { } {
     } else {
         set et_vect_perm_saved 0
         if { [is-effective-target arm_neon_ok]
-	     || [istarget aarch64*-*-*]
+	     || ([istarget aarch64*-*-*]
+		 && [is-effective-target aarch64_little_endian])
 	     || [istarget powerpc*-*-*]
              || [istarget spu-*-*]
 	     || [istarget i?86-*-*]
@@ -3464,7 +3465,8 @@ proc check_effective_target_vect_perm_byte { } {
         set et_vect_perm_byte_saved 0
         if { ([is-effective-target arm_neon_ok]
 	      && [is-effective-target arm_little_endian])
-	     || [istarget aarch64*-*-*]
+	     || ([istarget aarch64*-*-*]
+		 && [is-effective-target aarch64_little_endian])
 	     || [istarget powerpc*-*-*]
              || [istarget spu-*-*] } {
             set et_vect_perm_byte_saved 1
@@ -3488,7 +3490,8 @@ proc check_effective_target_vect_perm_short { } {
         set et_vect_perm_short_saved 0
         if { ([is-effective-target arm_neon_ok]
 	      && [is-effective-target arm_little_endian])
-	     || [istarget aarch64*-*-*]
+	     || ([istarget aarch64*-*-*]
+		 && [is-effective-target aarch64_little_endian])
 	     || [istarget powerpc*-*-*]
              || [istarget spu-*-*] } {
             set et_vect_perm_short_saved 1
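
The target-supports.exp hunks above rely on an aarch64_little_endian
effective-target check that is not part of this diff. As a rough
illustration only: a minimal sketch of such a check, modelled on the
existing arm_little_endian test and assuming the standard
check_no_compiler_messages helper and the __AARCH64EB__ predefined
macro, might look like:

proc check_effective_target_aarch64_little_endian { } {
    # Succeeds only when the compiler under test targets little-endian AArch64.
    return [check_no_compiler_messages aarch64_little_endian assembly {
	#if !defined(__aarch64__) || defined(__AARCH64EB__)
	#error not little-endian AArch64
	#endif
    }]
}

Probing the compiler's predefines, rather than matching the target
triplet, also covers the big-endian multilib case raised earlier in
the thread: [is-effective-target aarch64_little_endian] then reports
false whenever the multilib in use is big-endian.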

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Patch AArch64] Implement Vector Permute Support
  2014-01-16 14:43             ` Alex Velenko
@ 2014-01-17 15:55               ` Richard Earnshaw
  2014-01-20 11:15                 ` Alex Velenko
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Earnshaw @ 2014-01-17 15:55 UTC (permalink / raw)
  To: Alex Velenko
  Cc: pinskia, James Greenhalgh, Marcus Shawcroft, GCC Patches, Tejas Belagod

On 16/01/14 14:43, Alex Velenko wrote:
> On 14/01/14 15:51, pinskia@gmail.com wrote:
>>
>>
>>> On Jan 14, 2014, at 7:19 AM, Alex Velenko <Alex.Velenko@arm.com> wrote:
>>>
>>> Hi,
>>>
>>> This patch turns off the vec_perm patterns for aarch64_be, which should resolve
>>> the issue highlighted here: http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00321.html
>>> With this patch applied, the test case provided in that link compiles without an ICE.
>>>
>>> However, the Big-Endian port is still in development. This patch exposes
>>> another known but unrelated issue with Big-Endian Large-Int modes.
>>>
>>> The patch has been tested on aarch64-none-elf and aarch64_be-none-elf, resulting in five
>>> further regressions due to the broken implementation of Big-Endian Large-Int modes.
>>>
>>> Kind regards,
>>> Alex Velenko
>>>
>>> gcc/
>>>
>>> 2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>
>>>
>>>     * config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
>>>     * config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.
>>>
>>> gcc/testsuite/
>>>
>>> 2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>
>>>
>>>     * lib/target-supports.exp
>>>     (check_effective_target_vect_perm): Exclude aarch64_be.
>>>     (check_effective_target_vect_perm_byte): Likewise.
>>>     (check_effective_target_vect_perm_short): Likewise.
>>
>> I think you want to use a function to check whether the target is effectively big-endian instead.  Internally at Cavium, our ELF compiler has a big-endian multilib.
>>
>> Thanks,
>> Andrew
>>
>>>
>>> <vec-perm.patch>
>>
> 
> Hi,
> Here is a vec-perm patch with the changes proposed previously.
> It has been tested in both little-endian and big-endian configurations with no additional issues appearing.
> 
> Kind regards,
> Alex
> 
> gcc/
> 
> 2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>
> 
> 	* config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
> 	* config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.
> 
> gcc/testsuite/
> 
> 2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>
> 
> 	* lib/target-supports.exp
> 	(check_effective_target_vect_perm): Exclude aarch64_be.
> 	(check_effective_target_vect_perm_byte): Likewise.
> 	(check_effective_target_vect_perm_short): Likewise.
> 

The patch is missing the hunk for aarch64.c.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Patch AArch64] Implement Vector Permute Support
  2014-01-17 15:55               ` Richard Earnshaw
@ 2014-01-20 11:15                 ` Alex Velenko
  2014-01-20 11:17                   ` Richard Earnshaw
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Velenko @ 2014-01-20 11:15 UTC (permalink / raw)
  To: Richard Earnshaw
  Cc: pinskia, James Greenhalgh, Marcus Shawcroft, GCC Patches, Tejas Belagod

On 17/01/14 15:55, Richard Earnshaw wrote:
> On 16/01/14 14:43, Alex Velenko wrote:
>> On 14/01/14 15:51, pinskia@gmail.com wrote:
>>>
>>>
>>>> On Jan 14, 2014, at 7:19 AM, Alex Velenko <Alex.Velenko@arm.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> This patch turns off the vec_perm patterns for aarch64_be, which should resolve
>>>> the issue highlighted here: http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00321.html
>>>> With this patch applied, the test case provided in that link compiles without an ICE.
>>>>
>>>> However, the Big-Endian port is still in development. This patch exposes
>>>> another known but unrelated issue with Big-Endian Large-Int modes.
>>>>
>>>> The patch has been tested on aarch64-none-elf and aarch64_be-none-elf, resulting in five
>>>> further regressions due to the broken implementation of Big-Endian Large-Int modes.
>>>>
>>>> Kind regards,
>>>> Alex Velenko
>>>>
>>>> gcc/
>>>>
>>>> 2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>
>>>>
>>>>      * config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
>>>>      * config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.
>>>>
>>>> gcc/testsuite/
>>>>
>>>> 2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>
>>>>
>>>>      * lib/target-supports.exp
>>>>      (check_effective_target_vect_perm): Exclude aarch64_be.
>>>>      (check_effective_target_vect_perm_byte): Likewise.
>>>>      (check_effective_target_vect_perm_short): Likewise.
>>>
>>> I think you want to use a function to check whether the target is effectively big-endian instead.  Internally at Cavium, our ELF compiler has a big-endian multilib.
>>>
>>> Thanks,
>>> Andrew
>>>
>>>>
>>>> <vec-perm.patch>
>>>
>>
>> Hi,
>> Here is a vec-perm patch with the changes proposed previously.
>> It has been tested in both little-endian and big-endian configurations with no additional issues appearing.
>>
>> Kind regards,
>> Alex
>>
>> gcc/
>>
>> 2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>
>>
>> 	* config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
>> 	* config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.
>>
>> gcc/testsuite/
>>
>> 2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>
>>
>> 	* lib/target-supports.exp
>> 	(check_effective_target_vect_perm): Exclude aarch64_be.
>> 	(check_effective_target_vect_perm_byte): Likewise.
>> 	(check_effective_target_vect_perm_short): Likewise.
>>
>
> The patch is missing the hunk for aarch64.c.
>
>

Hi,
It is a faulty ChangeLog entry, not the patch.
It should be:

gcc/

2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>

     * config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.

gcc/testsuite/

2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>

     * lib/target-supports.exp
     (check_effective_target_vect_perm): Exclude aarch64_be.
     (check_effective_target_vect_perm_byte): Likewise.
     (check_effective_target_vect_perm_short): Likewise.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Patch AArch64] Implement Vector Permute Support
  2014-01-20 11:15                 ` Alex Velenko
@ 2014-01-20 11:17                   ` Richard Earnshaw
  2014-01-20 17:33                     ` Alex Velenko
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Earnshaw @ 2014-01-20 11:17 UTC (permalink / raw)
  To: Alex Velenko
  Cc: pinskia, James Greenhalgh, Marcus Shawcroft, GCC Patches, Tejas Belagod

On 20/01/14 11:15, Alex Velenko wrote:
> On 17/01/14 15:55, Richard Earnshaw wrote:
>> On 16/01/14 14:43, Alex Velenko wrote:
>>> On 14/01/14 15:51, pinskia@gmail.com wrote:
>>>>
>>>>
>>>>> On Jan 14, 2014, at 7:19 AM, Alex Velenko <Alex.Velenko@arm.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> This patch turns off the vec_perm patterns for aarch64_be, which should resolve
>>>>> the issue highlighted here: http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00321.html
>>>>> With this patch applied, the test case provided in that link compiles without an ICE.
>>>>>
>>>>> However, the Big-Endian port is still in development. This patch exposes
>>>>> another known but unrelated issue with Big-Endian Large-Int modes.
>>>>>
>>>>> The patch has been tested on aarch64-none-elf and aarch64_be-none-elf, resulting in five
>>>>> further regressions due to the broken implementation of Big-Endian Large-Int modes.
>>>>>
>>>>> Kind regards,
>>>>> Alex Velenko
>>>>>
>>>>> gcc/
>>>>>
>>>>> 2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>
>>>>>
>>>>>      * config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
>>>>>      * config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.
>>>>>
>>>>> gcc/testsuite/
>>>>>
>>>>> 2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>
>>>>>
>>>>>      * lib/target-supports.exp
>>>>>      (check_effective_target_vect_perm): Exclude aarch64_be.
>>>>>      (check_effective_target_vect_perm_byte): Likewise.
>>>>>      (check_effective_target_vect_perm_short): Likewise.
>>>>
>>>> I think you want to use a function to check whether the target is effectively big-endian instead.  Internally at Cavium, our ELF compiler has a big-endian multilib.
>>>>
>>>> Thanks,
>>>> Andrew
>>>>
>>>>>
>>>>> <vec-perm.patch>
>>>>
>>>
>>> Hi,
>>> Here is a vec-perm patch with the changes proposed previously.
>>> It has been tested in both little-endian and big-endian configurations with no additional issues appearing.
>>>
>>> Kind regards,
>>> Alex
>>>
>>> gcc/
>>>
>>> 2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>
>>>
>>> 	* config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
>>> 	* config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.
>>>
>>> gcc/testsuite/
>>>
>>> 2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>
>>>
>>> 	* lib/target-supports.exp
>>> 	(check_effective_target_vect_perm): Exclude aarch64_be.
>>> 	(check_effective_target_vect_perm_byte): Likewise.
>>> 	(check_effective_target_vect_perm_short): Likewise.
>>>
>>
>> The patch is missing the hunk for aarch64.c.
>>
>>
> 
> Hi,
> It is a faulty ChangeLog entry, not the patch.
> It should be:
> 
> gcc/
> 
> 2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>
> 
>      * config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
> 
> gcc/testsuite/
> 
> 2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>
> 
>      * lib/target-supports.exp
>      (check_effective_target_vect_perm): Exclude aarch64_be.
>      (check_effective_target_vect_perm_byte): Likewise.
>      (check_effective_target_vect_perm_short): Likewise.
> 

On that basis, OK.

R.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Patch AArch64] Implement Vector Permute Support
  2014-01-20 11:17                   ` Richard Earnshaw
@ 2014-01-20 17:33                     ` Alex Velenko
  2014-01-20 18:36                       ` Marcus Shawcroft
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Velenko @ 2014-01-20 17:33 UTC (permalink / raw)
  To: Richard Earnshaw
  Cc: pinskia, James Greenhalgh, Marcus Shawcroft, GCC Patches, Tejas Belagod

On 20/01/14 11:16, Richard Earnshaw wrote:
> On 20/01/14 11:15, Alex Velenko wrote:
>> On 17/01/14 15:55, Richard Earnshaw wrote:
>>> On 16/01/14 14:43, Alex Velenko wrote:
>>>> On 14/01/14 15:51, pinskia@gmail.com wrote:
>>>>>
>>>>>
>>>>>> On Jan 14, 2014, at 7:19 AM, Alex Velenko <Alex.Velenko@arm.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This patch turns off the vec_perm patterns for aarch64_be, which should resolve
>>>>>> the issue highlighted here: http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00321.html
>>>>>> With this patch applied, the test case provided in that link compiles without an ICE.
>>>>>>
>>>>>> However, the Big-Endian port is still in development. This patch exposes
>>>>>> another known but unrelated issue with Big-Endian Large-Int modes.
>>>>>>
>>>>>> The patch has been tested on aarch64-none-elf and aarch64_be-none-elf, resulting in five
>>>>>> further regressions due to the broken implementation of Big-Endian Large-Int modes.
>>>>>>
>>>>>> Kind regards,
>>>>>> Alex Velenko
>>>>>>
>>>>>> gcc/
>>>>>>
>>>>>> 2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>
>>>>>>
>>>>>>       * config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
>>>>>>       * config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.
>>>>>>
>>>>>> gcc/testsuite/
>>>>>>
>>>>>> 2014-01-14  Alex Velenko  <Alex.Velenko@arm.com>
>>>>>>
>>>>>>       * lib/target-supports.exp
>>>>>>       (check_effective_target_vect_perm): Exclude aarch64_be.
>>>>>>       (check_effective_target_vect_perm_byte): Likewise.
>>>>>>       (check_effective_target_vect_perm_short): Likewise.
>>>>>
>>>>> I think you want to use a function to check whether the target is effectively big-endian instead.  Internally at Cavium, our ELF compiler has a big-endian multilib.
>>>>>
>>>>> Thanks,
>>>>> Andrew
>>>>>
>>>>>>
>>>>>> <vec-perm.patch>
>>>>>
>>>>
>>>> Hi,
>>>> Here is a vec-perm patch with the changes proposed previously.
>>>> It has been tested in both little-endian and big-endian configurations with no additional issues appearing.
>>>>
>>>> Kind regards,
>>>> Alex
>>>>
>>>> gcc/
>>>>
>>>> 2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>
>>>>
>>>> 	* config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
>>>> 	* config/aarch64/aarch64.c (aarch64_expand_vec_perm): Add comment.
>>>>
>>>> gcc/testsuite/
>>>>
>>>> 2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>
>>>>
>>>> 	* lib/target-supports.exp
>>>> 	(check_effective_target_vect_perm): Exclude aarch64_be.
>>>> 	(check_effective_target_vect_perm_byte): Likewise.
>>>> 	(check_effective_target_vect_perm_short): Likewise.
>>>>
>>>
>>> The patch is missing the hunk for aarch64.c.
>>>
>>>
>>
>> Hi,
>> It is a faulty ChangeLog entry, not the patch.
>> It should be:
>>
>> gcc/
>>
>> 2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>
>>
>>       * config/aarch64/aarch64-simd.md (vec_perm<mode>): Add BE check.
>>
>> gcc/testsuite/
>>
>> 2014-01-16  Alex Velenko  <Alex.Velenko@arm.com>
>>
>>       * lib/target-supports.exp
>>       (check_effective_target_vect_perm): Exclude aarch64_be.
>>       (check_effective_target_vect_perm_byte): Likewise.
>>       (check_effective_target_vect_perm_short): Likewise.
>>
>
> On that basis, OK.
>
> R.
>

Could someone please commit this patch, as I do not have commit permissions?
Kind regards,
Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Patch AArch64] Implement Vector Permute Support
  2014-01-20 17:33                     ` Alex Velenko
@ 2014-01-20 18:36                       ` Marcus Shawcroft
  0 siblings, 0 replies; 16+ messages in thread
From: Marcus Shawcroft @ 2014-01-20 18:36 UTC (permalink / raw)
  To: Alex Velenko
  Cc: Richard Earnshaw, pinskia, James Greenhalgh, Marcus Shawcroft,
	GCC Patches, Tejas Belagod

On 20 January 2014 17:33, Alex Velenko <Alex.Velenko@arm.com> wrote:

> Can someone, please, commit this patch, as I do not have permissions?
> Kind regards,
> Alex

Committed.

/Marcus

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-01-20 18:36 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-12-04 10:31 [Patch AArch64] Implement Vector Permute Support James Greenhalgh
2012-12-04 10:36 ` [Patch AArch64] Add zip{1, 2}, uzp{1, 2}, trn{1, 2} support for vector permute James Greenhalgh
2012-12-04 22:45   ` Marcus Shawcroft
2012-12-04 22:44 ` [Patch AArch64] Implement Vector Permute Support Marcus Shawcroft
2012-12-06 16:25   ` James Greenhalgh
2014-01-07 23:10 ` Andrew Pinski
     [not found]   ` <72A61951-68B2-4776-A2B8-05DC4E1F53A7@arm.com>
2014-01-08  0:10     ` Andrew Pinski
2014-01-08 11:00       ` James Greenhalgh
2014-01-14 15:19         ` Alex Velenko
2014-01-14 15:51           ` pinskia
2014-01-16 14:43             ` Alex Velenko
2014-01-17 15:55               ` Richard Earnshaw
2014-01-20 11:15                 ` Alex Velenko
2014-01-20 11:17                   ` Richard Earnshaw
2014-01-20 17:33                     ` Alex Velenko
2014-01-20 18:36                       ` Marcus Shawcroft

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).