public inbox for gcc-patches@gcc.gnu.org
* [PATCH 1/2][AArch64 Testsuite] Add execution test of vset(q?)_lane intrinsics.
@ 2014-09-08 16:31 Alan Lawrence
  2014-09-08 16:39 ` [PATCH 2/2][AArch64] Replace temporary inline assembler for vset_lane Alan Lawrence
  2014-09-09 10:52 ` [PATCH 1/2][AArch64 Testsuite] Add execution test of vset(q?)_lane intrinsics Marcus Shawcroft
  0 siblings, 2 replies; 4+ messages in thread
From: Alan Lawrence @ 2014-09-08 16:31 UTC (permalink / raw)
  To: gcc-patches

This adds a test that checks that the result of a vset_lane intrinsic is
identical to the input apart from the one value being changed.

The test checks only one index per vset_lane_xxx variant, in a somewhat ad hoc
fashion, as the index has to be a compile-time immediate and I felt that
looping over all lanes with macros did not add enough to justify the
complexity.

Passing on aarch64-none-elf and aarch64_be-none-elf (cross-tested).
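
As a minimal illustration (my own sketch, not part of the patch) of the
immediate-index restriction that rules out a plain runtime loop over lanes:

    #include <arm_neon.h>

    int32x2_t
    set_lane_ok (int32x2_t v)
    {
      /* Accepted: the lane index is a compile-time constant.  */
      return vset_lane_s32 (42, v, 1);
    }

    /* Rejected at compile time, as the lane index must be an immediate:
    int32x2_t
    set_lane_bad (int32x2_t v, int i)
    {
      return vset_lane_s32 (42, v, i);
    }
    */

Hence the test below uses one fixed index per variant.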

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vset_lane_1.c: New test.

[-- Attachment #2: test_vset_lane.patch --]
[-- Type: text/x-patch; name=test_vset_lane.patch, Size: 3623 bytes --]

diff --git a/gcc/testsuite/gcc.target/aarch64/vset_lane_1.c b/gcc/testsuite/gcc.target/aarch64/vset_lane_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..5fb11399f202df7bc9a67c3d8ffb78f71c87e5c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vset_lane_1.c
@@ -0,0 +1,85 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-inline" } */
+
+#include <arm_neon.h>
+
+extern void abort (void);
+
+#define VARIANTS(VARIANT)			\
+VARIANT (uint8_t, , 8, uint8x8_t, _u8, 5)	\
+VARIANT (uint16_t, , 4, uint16x4_t, _u16, 3)	\
+VARIANT (uint32_t, , 2, uint32x2_t, _u32, 1)	\
+VARIANT (uint64_t, , 1, uint64x1_t, _u64, 0)	\
+VARIANT (int8_t, , 8, int8x8_t, _s8, 6)		\
+VARIANT (int16_t, , 4, int16x4_t, _s16, 2)	\
+VARIANT (int32_t, , 2, int32x2_t, _s32, 0)	\
+VARIANT (int64_t, , 1, int64x1_t, _s64, 0)	\
+VARIANT (poly8_t, , 8, poly8x8_t, _p8, 6)	\
+VARIANT (poly16_t, , 4, poly16x4_t, _p16, 2)	\
+VARIANT (float32_t, , 2, float32x2_t, _f32, 1)	\
+VARIANT (float64_t, , 1, float64x1_t, _f64, 0)	\
+VARIANT (uint8_t, q, 16, uint8x16_t, _u8, 11)	\
+VARIANT (uint16_t, q, 8, uint16x8_t, _u16, 7)	\
+VARIANT (uint32_t, q, 4, uint32x4_t, _u32, 2)	\
+VARIANT (uint64_t, q, 2, uint64x2_t, _u64, 1)	\
+VARIANT (int8_t, q, 16, int8x16_t, _s8, 13)	\
+VARIANT (int16_t, q, 8, int16x8_t, _s16, 5)	\
+VARIANT (int32_t, q, 4, int32x4_t, _s32, 3)	\
+VARIANT (int64_t, q, 2, int64x2_t, _s64, 0)	\
+VARIANT (poly8_t, q, 16, poly8x16_t, _p8, 14)	\
+VARIANT (poly16_t, q, 8, poly16x8_t, _p16, 6)	\
+VARIANT (float32_t, q, 4, float32x4_t, _f32, 2) \
+VARIANT (float64_t, q, 2, float64x2_t, _f64, 1)
+
+#define TESTMETH(BASETYPE, Q, NUM, TYPE, SUFFIX, INDEX)	\
+int							\
+test_vset_lane ##Q##SUFFIX (BASETYPE *data)		\
+{							\
+  BASETYPE temp [NUM];					\
+  TYPE vec = vld1##Q##SUFFIX (data);			\
+  TYPE vec2;						\
+  BASETYPE changed = data[INDEX] - INDEX;		\
+  int check;						\
+  vec = vset##Q##_lane##SUFFIX (changed, vec, INDEX);	\
+  asm volatile ("orr %0.16b, %1.16b, %1.16b"		\
+		: "=w"(vec2) : "w" (vec) : );		\
+  vst1##Q##SUFFIX (temp, vec2);				\
+  for (check = 0; check < NUM; check++)			\
+    {							\
+      BASETYPE desired = data[check];			\
+      if (check==INDEX) desired = changed;		\
+      if (temp[check] != desired)			\
+        return 1;					\
+    }							\
+  return 0;						\
+}
+
+VARIANTS (TESTMETH)
+
+#define CHECK(BASETYPE, Q, NUM, TYPE, SUFFIX, INDEX)		\
+  if (test_vset_lane##Q##SUFFIX (BASETYPE ## _ ## data) != 0)	\
+    abort ();
+
+int
+main (int argc, char **argv)
+{
+  uint8_t uint8_t_data[16] =
+      { 1, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47 };
+  uint16_t uint16_t_data[8] = { 1, 22, 333, 4444, 55555, 6666, 777, 88 };
+  uint32_t uint32_t_data[4] = { 65537, 11, 70000, 23 };
+  uint64_t uint64_t_data[2] = { 0xdeadbeefcafebabeULL, 0x0123456789abcdefULL };
+  int8_t int8_t_data[16] =
+      { -1, -3, -5, -7, 9, -11, -13, 15, -17, -19, 21, -23, 25, 27, -29, -31 };
+  int16_t int16_t_data[8] = { -17, 19, 3, -999, 44048, 505, 9999, 1000};
+  int32_t int32_t_data[4] = { 123456789, -987654321, -135792468, 975318642 };
+  int64_t int64_t_data[2] = {0xfedcba9876543210LL, 0xdeadbabecafebeefLL };
+  poly8_t poly8_t_data[16] =
+      { 0, 7, 13, 18, 22, 25, 27, 28, 29, 31, 34, 38, 43, 49, 56, 64 };
+  poly16_t poly16_t_data[8] = { 11111, 2222, 333, 44, 5, 65432, 54321, 43210 };
+  float32_t float32_t_data[4] = { 3.14159, 2.718, 1.414, 100.0 };
+  float64_t float64_t_data[2] = { 1.01001000100001, 12345.6789 };
+
+  VARIANTS (CHECK);
+
+  return 0;
+}

* [PATCH 2/2][AArch64] Replace temporary inline assembler for vset_lane
  2014-09-08 16:31 [PATCH 1/2][AArch64 Testsuite] Add execution test of vset(q?)_lane intrinsics Alan Lawrence
@ 2014-09-08 16:39 ` Alan Lawrence
  2014-09-09 10:53   ` Marcus Shawcroft
  2014-09-09 10:52 ` [PATCH 1/2][AArch64 Testsuite] Add execution test of vset(q?)_lane intrinsics Marcus Shawcroft
  1 sibling, 1 reply; 4+ messages in thread
From: Alan Lawrence @ 2014-09-08 16:39 UTC (permalink / raw)
  To: gcc-patches

The vset(q?)_lane_XXX intrinsics are presently implemented using inline asm
blocks containing "ins" instructions, which are opaque to the mid-end. This
patch replaces them with simple lane writes using GCC's vector-extension
subscripting, with a lane-number flip on big-endian (where ARM intrinsic lanes
are indexed the opposite way around to GCC vector extensions). Being
transparent to the mid-end, this should enable more optimization.
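
As a sketch of the scheme (illustration only, under a hypothetical name; the
real patch below additionally bounds-checks the lane number via
__builtin_aarch64_im_lane_boundsi), e.g. for the two-lane s32 case:

    __extension__ static __inline int32x2_t __attribute__ ((__always_inline__))
    example_vset_lane_s32 (int32_t __elem, int32x2_t __vec, const int __index)
    {
    #ifdef __AARCH64EB__
      /* Big-endian: GCC vector index 0 is architectural lane N-1.  */
      __vec[2 - 1 - __index] = __elem;
    #else
      __vec[__index] = __elem;
    #endif
      return __vec;
    }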

There are no significant changes to the assembly output for the vset_lane_1.c
test from the previous patch.

Tested with check-gcc on aarch64-none-elf and aarch64_be-none-elf, including
the vset_lane_1.c test from the previous patch.
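
For anyone wanting to reproduce this from a configured build tree, a
single-test run along the lines of
'make check-gcc RUNTESTFLAGS="aarch64.exp=vset_lane_1.c"' (exact invocation
depends on your setup) exercises just the new test.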

gcc/ChangeLog:

	* config/aarch64/arm_neon.h (__aarch64_vset_lane_any): New macro
	(defined twice, for big- and little-endian).
	(vset_lane_f32, vset_lane_f64, vset_lane_p8, vset_lane_p16,
	vset_lane_s8, vset_lane_s16, vset_lane_s32, vset_lane_s64,
	vset_lane_u8, vset_lane_u16, vset_lane_u32, vset_lane_u64,
	vsetq_lane_f32, vsetq_lane_f64, vsetq_lane_p8, vsetq_lane_p16,
	vsetq_lane_s8, vsetq_lane_s16, vsetq_lane_s32, vsetq_lane_s64,
	vsetq_lane_u8, vsetq_lane_u16, vsetq_lane_u32, vsetq_lane_u64):
	Replace inline assembler with __aarch64_vset_lane_any.

OK for trunk?

Alan Lawrence wrote:
> This adds a test that checks that the result of a vset_lane intrinsic is 
> identical to the input apart from the one value being changed.
> 
> The test checks only one index per vset_lane_xxx variant, in a somewhat ad 
> hoc fashion, as the index has to be a compile-time immediate and I felt 
> that looping over all lanes with macros did not add enough to justify the 
> complexity.
> 
> Passing on aarch64-none-elf and aarch64_be-none-elf (cross-tested).
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/aarch64/vset_lane_1.c: New test.
> 

[-- Attachment #2: vset_lane.patch --]
[-- Type: text/x-patch; name=vset_lane.patch, Size: 27336 bytes --]

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 0e087a86b3307e36fb2854a2c1d878c12aadff74..a30556d04ff30d6061249037ad016858af182286 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -673,6 +673,174 @@ typedef struct poly16x8x4_t
 #define __aarch64_vdupq_laneq_u64(__a, __b) \
    __aarch64_vdup_lane_any (u64, q, q, __a, __b)
 
+/* vset_lane internal macro.  */
+
+#ifdef __AARCH64EB__
+/* For big-endian, GCC's vector indices are the opposite way around
+   to the architectural lane indices used by Neon intrinsics.  */
+#define __aarch64_vset_lane_any(__vec, __index, __val, __lanes) \
+  __extension__							\
+  ({								\
+    __builtin_aarch64_im_lane_boundsi (__index, __lanes);	\
+    __vec[__lanes - 1 - __index] = __val;			\
+    __vec;							\
+  })
+#else
+#define __aarch64_vset_lane_any(__vec, __index, __val, __lanes) \
+  __extension__							\
+  ({								\
+    __builtin_aarch64_im_lane_boundsi (__index, __lanes);	\
+    __vec[__index] = __val;					\
+    __vec;							\
+  })
+#endif
+
+/* vset_lane  */
+
+__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
+vset_lane_f32 (float32_t __elem, float32x2_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 2);
+}
+
+__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
+vset_lane_f64 (float64_t __elem, float64x1_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 1);
+}
+
+__extension__ static __inline poly8x8_t __attribute__ ((__always_inline__))
+vset_lane_p8 (poly8_t __elem, poly8x8_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 8);
+}
+
+__extension__ static __inline poly16x4_t __attribute__ ((__always_inline__))
+vset_lane_p16 (poly16_t __elem, poly16x4_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 4);
+}
+
+__extension__ static __inline int8x8_t __attribute__ ((__always_inline__))
+vset_lane_s8 (int8_t __elem, int8x8_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 8);
+}
+
+__extension__ static __inline int16x4_t __attribute__ ((__always_inline__))
+vset_lane_s16 (int16_t __elem, int16x4_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 4);
+}
+
+__extension__ static __inline int32x2_t __attribute__ ((__always_inline__))
+vset_lane_s32 (int32_t __elem, int32x2_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 2);
+}
+
+__extension__ static __inline int64x1_t __attribute__ ((__always_inline__))
+vset_lane_s64 (int64_t __elem, int64x1_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 1);
+}
+
+__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
+vset_lane_u8 (uint8_t __elem, uint8x8_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 8);
+}
+
+__extension__ static __inline uint16x4_t __attribute__ ((__always_inline__))
+vset_lane_u16 (uint16_t __elem, uint16x4_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 4);
+}
+
+__extension__ static __inline uint32x2_t __attribute__ ((__always_inline__))
+vset_lane_u32 (uint32_t __elem, uint32x2_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 2);
+}
+
+__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
+vset_lane_u64 (uint64_t __elem, uint64x1_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 1);
+}
+
+__extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
+vsetq_lane_f32 (float32_t __elem, float32x4_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 4);
+}
+
+__extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
+vsetq_lane_f64 (float64_t __elem, float64x2_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 2);
+}
+
+__extension__ static __inline poly8x16_t __attribute__ ((__always_inline__))
+vsetq_lane_p8 (poly8_t __elem, poly8x16_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 16);
+}
+
+__extension__ static __inline poly16x8_t __attribute__ ((__always_inline__))
+vsetq_lane_p16 (poly16_t __elem, poly16x8_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 8);
+}
+
+__extension__ static __inline int8x16_t __attribute__ ((__always_inline__))
+vsetq_lane_s8 (int8_t __elem, int8x16_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 16);
+}
+
+__extension__ static __inline int16x8_t __attribute__ ((__always_inline__))
+vsetq_lane_s16 (int16_t __elem, int16x8_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 8);
+}
+
+__extension__ static __inline int32x4_t __attribute__ ((__always_inline__))
+vsetq_lane_s32 (int32_t __elem, int32x4_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 4);
+}
+
+__extension__ static __inline int64x2_t __attribute__ ((__always_inline__))
+vsetq_lane_s64 (int64_t __elem, int64x2_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 2);
+}
+
+__extension__ static __inline uint8x16_t __attribute__ ((__always_inline__))
+vsetq_lane_u8 (uint8_t __elem, uint8x16_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 16);
+}
+
+__extension__ static __inline uint16x8_t __attribute__ ((__always_inline__))
+vsetq_lane_u16 (uint16_t __elem, uint16x8_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 8);
+}
+
+__extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
+vsetq_lane_u32 (uint32_t __elem, uint32x4_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 4);
+}
+
+__extension__ static __inline uint64x2_t __attribute__ ((__always_inline__))
+vsetq_lane_u64 (uint64_t __elem, uint64x2_t __vec, const int __index)
+{
+  return __aarch64_vset_lane_any (__vec, __index, __elem, 2);
+}
+
 /* vadd  */
 __extension__ static __inline int8x8_t __attribute__ ((__always_inline__))
 vadd_s8 (int8x8_t __a, int8x8_t __b)
@@ -11156,318 +11324,6 @@ vrsubhn_u64 (uint64x2_t a, uint64x2_t b)
   return result;
 }
 
-#define vset_lane_f32(a, b, c)                                          \
-  __extension__                                                         \
-    ({                                                                  \
-       float32x2_t b_ = (b);                                            \
-       float32_t a_ = (a);                                              \
-       float32x2_t result;                                              \
-       __asm__ ("ins %0.s[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vset_lane_f64(a, b, c)                                          \
-  __extension__                                                         \
-    ({                                                                  \
-       float64x1_t b_ = (b);                                            \
-       float64_t a_ = (a);                                              \
-       float64x1_t result;                                              \
-       __asm__ ("ins %0.d[%3], %x1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vset_lane_p8(a, b, c)                                           \
-  __extension__                                                         \
-    ({                                                                  \
-       poly8x8_t b_ = (b);                                              \
-       poly8_t a_ = (a);                                                \
-       poly8x8_t result;                                                \
-       __asm__ ("ins %0.b[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vset_lane_p16(a, b, c)                                          \
-  __extension__                                                         \
-    ({                                                                  \
-       poly16x4_t b_ = (b);                                             \
-       poly16_t a_ = (a);                                               \
-       poly16x4_t result;                                               \
-       __asm__ ("ins %0.h[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vset_lane_s8(a, b, c)                                           \
-  __extension__                                                         \
-    ({                                                                  \
-       int8x8_t b_ = (b);                                               \
-       int8_t a_ = (a);                                                 \
-       int8x8_t result;                                                 \
-       __asm__ ("ins %0.b[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vset_lane_s16(a, b, c)                                          \
-  __extension__                                                         \
-    ({                                                                  \
-       int16x4_t b_ = (b);                                              \
-       int16_t a_ = (a);                                                \
-       int16x4_t result;                                                \
-       __asm__ ("ins %0.h[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vset_lane_s32(a, b, c)                                          \
-  __extension__                                                         \
-    ({                                                                  \
-       int32x2_t b_ = (b);                                              \
-       int32_t a_ = (a);                                                \
-       int32x2_t result;                                                \
-       __asm__ ("ins %0.s[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vset_lane_s64(a, b, c)                                          \
-  __extension__                                                         \
-    ({                                                                  \
-       int64x1_t b_ = (b);                                              \
-       int64_t a_ = (a);                                                \
-       int64x1_t result;                                                \
-       __asm__ ("ins %0.d[%3], %x1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vset_lane_u8(a, b, c)                                           \
-  __extension__                                                         \
-    ({                                                                  \
-       uint8x8_t b_ = (b);                                              \
-       uint8_t a_ = (a);                                                \
-       uint8x8_t result;                                                \
-       __asm__ ("ins %0.b[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vset_lane_u16(a, b, c)                                          \
-  __extension__                                                         \
-    ({                                                                  \
-       uint16x4_t b_ = (b);                                             \
-       uint16_t a_ = (a);                                               \
-       uint16x4_t result;                                               \
-       __asm__ ("ins %0.h[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vset_lane_u32(a, b, c)                                          \
-  __extension__                                                         \
-    ({                                                                  \
-       uint32x2_t b_ = (b);                                             \
-       uint32_t a_ = (a);                                               \
-       uint32x2_t result;                                               \
-       __asm__ ("ins %0.s[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vset_lane_u64(a, b, c)                                          \
-  __extension__                                                         \
-    ({                                                                  \
-       uint64x1_t b_ = (b);                                             \
-       uint64_t a_ = (a);                                               \
-       uint64x1_t result;                                               \
-       __asm__ ("ins %0.d[%3], %x1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vsetq_lane_f32(a, b, c)                                         \
-  __extension__                                                         \
-    ({                                                                  \
-       float32x4_t b_ = (b);                                            \
-       float32_t a_ = (a);                                              \
-       float32x4_t result;                                              \
-       __asm__ ("ins %0.s[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vsetq_lane_f64(a, b, c)                                         \
-  __extension__                                                         \
-    ({                                                                  \
-       float64x2_t b_ = (b);                                            \
-       float64_t a_ = (a);                                              \
-       float64x2_t result;                                              \
-       __asm__ ("ins %0.d[%3], %x1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vsetq_lane_p8(a, b, c)                                          \
-  __extension__                                                         \
-    ({                                                                  \
-       poly8x16_t b_ = (b);                                             \
-       poly8_t a_ = (a);                                                \
-       poly8x16_t result;                                               \
-       __asm__ ("ins %0.b[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vsetq_lane_p16(a, b, c)                                         \
-  __extension__                                                         \
-    ({                                                                  \
-       poly16x8_t b_ = (b);                                             \
-       poly16_t a_ = (a);                                               \
-       poly16x8_t result;                                               \
-       __asm__ ("ins %0.h[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vsetq_lane_s8(a, b, c)                                          \
-  __extension__                                                         \
-    ({                                                                  \
-       int8x16_t b_ = (b);                                              \
-       int8_t a_ = (a);                                                 \
-       int8x16_t result;                                                \
-       __asm__ ("ins %0.b[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vsetq_lane_s16(a, b, c)                                         \
-  __extension__                                                         \
-    ({                                                                  \
-       int16x8_t b_ = (b);                                              \
-       int16_t a_ = (a);                                                \
-       int16x8_t result;                                                \
-       __asm__ ("ins %0.h[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vsetq_lane_s32(a, b, c)                                         \
-  __extension__                                                         \
-    ({                                                                  \
-       int32x4_t b_ = (b);                                              \
-       int32_t a_ = (a);                                                \
-       int32x4_t result;                                                \
-       __asm__ ("ins %0.s[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vsetq_lane_s64(a, b, c)                                         \
-  __extension__                                                         \
-    ({                                                                  \
-       int64x2_t b_ = (b);                                              \
-       int64_t a_ = (a);                                                \
-       int64x2_t result;                                                \
-       __asm__ ("ins %0.d[%3], %x1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vsetq_lane_u8(a, b, c)                                          \
-  __extension__                                                         \
-    ({                                                                  \
-       uint8x16_t b_ = (b);                                             \
-       uint8_t a_ = (a);                                                \
-       uint8x16_t result;                                               \
-       __asm__ ("ins %0.b[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vsetq_lane_u16(a, b, c)                                         \
-  __extension__                                                         \
-    ({                                                                  \
-       uint16x8_t b_ = (b);                                             \
-       uint16_t a_ = (a);                                               \
-       uint16x8_t result;                                               \
-       __asm__ ("ins %0.h[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vsetq_lane_u32(a, b, c)                                         \
-  __extension__                                                         \
-    ({                                                                  \
-       uint32x4_t b_ = (b);                                             \
-       uint32_t a_ = (a);                                               \
-       uint32x4_t result;                                               \
-       __asm__ ("ins %0.s[%3], %w1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
-#define vsetq_lane_u64(a, b, c)                                         \
-  __extension__                                                         \
-    ({                                                                  \
-       uint64x2_t b_ = (b);                                             \
-       uint64_t a_ = (a);                                               \
-       uint64x2_t result;                                               \
-       __asm__ ("ins %0.d[%3], %x1"                                     \
-                : "=w"(result)                                          \
-                : "r"(a_), "0"(b_), "i"(c)                              \
-                : /* No clobbers */);                                   \
-       result;                                                          \
-     })
-
 #define vshrn_high_n_s16(a, b, c)                                       \
   __extension__                                                         \
     ({                                                                  \

* Re: [PATCH 1/2][AArch64 Testsuite] Add execution test of vset(q?)_lane intrinsics.
  2014-09-08 16:31 [PATCH 1/2][AArch64 Testsuite] Add execution test of vset(q?)_lane intrinsics Alan Lawrence
  2014-09-08 16:39 ` [PATCH 2/2][AArch64] Replace temporary inline assembler for vset_lane Alan Lawrence
@ 2014-09-09 10:52 ` Marcus Shawcroft
  1 sibling, 0 replies; 4+ messages in thread
From: Marcus Shawcroft @ 2014-09-09 10:52 UTC (permalink / raw)
  To: Alan Lawrence; +Cc: gcc-patches

On 8 September 2014 17:31, Alan Lawrence <alan.lawrence@arm.com> wrote:
> This adds a test that checks that the result of a vset_lane intrinsic is
> identical to the input apart from the one value being changed.
>
> The test checks only one index per vset_lane_xxx variant, in a somewhat ad
> hoc fashion, as the index has to be a compile-time immediate and I felt
> that looping over all lanes with macros did not add enough to justify the
> complexity.
>
> Passing on aarch64-none-elf and aarch64_be-none-elf (cross-tested).
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/aarch64/vset_lane_1.c: New test.

OK /Marcus

* Re: [PATCH 2/2][AArch64] Replace temporary inline assembler for vset_lane
  2014-09-08 16:39 ` [PATCH 2/2][AArch64] Replace temporary inline assembler for vset_lane Alan Lawrence
@ 2014-09-09 10:53   ` Marcus Shawcroft
  0 siblings, 0 replies; 4+ messages in thread
From: Marcus Shawcroft @ 2014-09-09 10:53 UTC (permalink / raw)
  To: Alan Lawrence; +Cc: gcc-patches

On 8 September 2014 17:39, Alan Lawrence <alan.lawrence@arm.com> wrote:

> gcc/ChangeLog:
>
>         * config/aarch64/arm_neon.h (__aarch64_vset_lane_any): New macro
>         (defined twice, for big- and little-endian).
>         (vset_lane_f32, vset_lane_f64, vset_lane_p8, vset_lane_p16,
>         vset_lane_s8, vset_lane_s16, vset_lane_s32, vset_lane_s64,
>         vset_lane_u8, vset_lane_u16, vset_lane_u32, vset_lane_u64,
>         vsetq_lane_f32, vsetq_lane_f64, vsetq_lane_p8, vsetq_lane_p16,
>         vsetq_lane_s8, vsetq_lane_s16, vsetq_lane_s32, vsetq_lane_s64,
>         vsetq_lane_u8, vsetq_lane_u16, vsetq_lane_u32, vsetq_lane_u64):
>         Replace inline assembler with __aarch64_vset_lane_any.
>
> OK for trunk?

OK /Marcus
