From: Tamar Christina <Tamar.Christina@arm.com>
To: Christophe Lyon <christophe.lyon@linaro.org>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
"christophe.lyon@st.com" <christophe.lyon@st.com>,
Marcus Shawcroft <Marcus.Shawcroft@arm.com>,
Richard Earnshaw <Richard.Earnshaw@arm.com>,
James Greenhalgh <James.Greenhalgh@arm.com>,
Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>, nd <nd@arm.com>
Subject: Re: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing Poly64_t intrinsics to GCC
Date: Tue, 29 Nov 2016 09:50:00 -0000 [thread overview]
Message-ID: <VI1PR0801MB20314851CA0B552AE1AB5015FF8D0@VI1PR0801MB2031.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <VI1PR0801MB2031E928CFEB2B0053CA41E2FF890@VI1PR0801MB2031.eurprd08.prod.outlook.com>
[-- Attachment #1: Type: text/plain, Size: 4464 bytes --]
Hi All,
The new patch uses the proper types for the intrinsics that should return uint64x1_t
and addresses the rest of Christophe's comments.
Kind Regards,
Tamar
________________________________________
From: Tamar Christina
Sent: Friday, November 25, 2016 4:01:30 PM
To: Christophe Lyon
Cc: GCC Patches; christophe.lyon@st.com; Marcus Shawcroft; Richard Earnshaw; James Greenhalgh; Kyrylo Tkachov; nd
Subject: RE: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing Poly64_t intrinsics to GCC
>
> > A few comments about this new version:
> > * arm-neon-ref.h: why do you create
> CHECK_RESULTS_NAMED_NO_FP16_NO_POLY64?
> > Can't you just add calls to CHECK_CRYPTO in the existing
> > CHECK_RESULTS_NAMED_NO_FP16?
Yes, that should be fine. I didn't have CHECK_CRYPTO before, and when I added it
I didn't remove the split. I'll do that now.
> >
> > * p64_p128:
> > From what I can see ARM and AArch64 differ on the vceq variants
> > available with poly64.
> > For ARM, arm_neon.h contains: uint64x1_t vceq_p64 (poly64x1_t __a,
> > poly64x1_t __b) For AArch64, I can't see vceq_p64 in arm_neon.h? ...
> > Actually I've just noticed the other patch you submitted while I was writing
> > this, where you add vceq_p64 for aarch64, but it still returns
> > uint64_t.
> > Why do you change the vceq_p64 test to return poly64_t instead of
> uint64_t?
This patch is slightly outdated. The correct type is `uint64_t`, but by the time that was
noticed this patch had already been sent. A new one is coming soon.
> >
> > Why do you add #ifdef __aarch64__ before vldX_p64 tests and until vsli_p64?
> >
This is wrong; I'll remove them. The guard was supposed to be around the vldX_lane_p64 tests.
> > The comment /* vget_lane_p64 tests. */ is wrong before VLDX_LANE
> > tests
> >
> > You need to protect the new vmov, vget_high and vget_lane tests with
> > #ifdef __aarch64__.
> >
vget_lane is already in an #ifdef. You're right about vmov, but I also noticed that the
test calls VDUP instead of VMOV, which explains why I didn't get a test failure.
Thanks for the feedback,
I'll get these updated.
>
> Actually, vget_high_p64 exists on arm, so no need for the #ifdef for it.
>
>
> > Christophe
> >
> >> Kind regards,
> >> Tamar
> >> ________________________________________
> >> From: Tamar Christina
> >> Sent: Tuesday, November 8, 2016 11:58:46 AM
> >> To: Christophe Lyon
> >> Cc: GCC Patches; christophe.lyon@st.com; Marcus Shawcroft; Richard
> >> Earnshaw; James Greenhalgh; Kyrylo Tkachov; nd
> >> Subject: RE: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing
> >> Poly64_t intrinsics to GCC
> >>
> >> Hi Christophe,
> >>
> >> Thanks for the review!
> >>
> >>>
> >>> A while ago I added p64_p128.c, to contain all the poly64/128 tests
> >>> except for vreinterpret.
> >>> Why do you need to create p64.c ?
> >>
> >> I originally created it because I had a much smaller set of
> >> intrinsics that I wanted to add initially; this grew, and it hadn't occurred to
> >> me that I could use the existing file now.
> >>
> >> Another reason was the effective-target arm_crypto_ok as you
> mentioned below.
> >>
> >>>
> >>> Similarly, adding tests for vcreate_p64 etc... in p64.c or
> >>> p64_p128.c might be easier to maintain than adding them to vcreate.c
> >>> etc with several #ifdef conditions.
> >>
> >> Fair enough, I'll move them to p64_p128.c.
> >>
> >>> For vdup-vmov.c, why do you add the "&& defined(__aarch64__)"
> >>> condition? These intrinsics are defined in arm/arm_neon.h, right?
> >>> They are tested in p64_p128.c
> >>
> >> I should have looked for them; they weren't being tested before, so I
> >> had mistakenly assumed that they weren't available. Now I realize I
> >> just need to add the proper test option to the file to enable crypto. I'll
> >> update this as well.
> >>
> >>> Looking at your patch, it seems some tests are currently missing for arm:
> >>> vget_high_p64. I'm not sure why I missed it when I removed
> >>> neon-testgen...
> >>
> >> I'll adjust the test conditions so they run for ARM as well.
> >>
> >>>
> >>> Regarding vreinterpret_p128.c, doesn't the existing effective-target
> >>> arm_crypto_ok prevent the tests from running on aarch64?
> >>
> >> Yes, it does. I was comparing the output against a clean version and
> >> hadn't noticed that they weren't running. Thanks!
> >>
> >>>
> >>> Thanks,
> >>>
> >>> Christophe
[-- Attachment #2: aarch64-test-ver3.patch --]
[-- Type: text/x-patch; name="aarch64-test-ver3.patch", Size: 24846 bytes --]
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
index 462141586b3db7c5256c74b08fa0449210634226..beaf6ac31d5c5affe3702a505ad0df8679229e32 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
@@ -32,6 +32,13 @@ extern size_t strlen(const char *);
VECT_VAR(expected, int, 16, 4) -> expected_int16x4
VECT_VAR_DECL(expected, int, 16, 4) -> int16x4_t expected_int16x4
*/
+/* Some instructions don't exist on ARM.
+ Use this macro to guard against them. */
+#ifdef __aarch64__
+#define AARCH64_ONLY(X) X
+#else
+#define AARCH64_ONLY(X)
+#endif
#define xSTR(X) #X
#define STR(X) xSTR(X)
@@ -92,6 +99,13 @@ extern size_t strlen(const char *);
fprintf(stderr, "CHECKED %s %s\n", STR(VECT_TYPE(T, W, N)), MSG); \
}
+#if defined (__ARM_FEATURE_CRYPTO)
+#define CHECK_CRYPTO(MSG,T,W,N,FMT,EXPECTED,COMMENT) \
+ CHECK(MSG,T,W,N,FMT,EXPECTED,COMMENT)
+#else
+#define CHECK_CRYPTO(MSG,T,W,N,FMT,EXPECTED,COMMENT)
+#endif
+
/* Floating-point variant. */
#define CHECK_FP(MSG,T,W,N,FMT,EXPECTED,COMMENT) \
{ \
@@ -184,6 +198,9 @@ extern ARRAY(expected, uint, 32, 2);
extern ARRAY(expected, uint, 64, 1);
extern ARRAY(expected, poly, 8, 8);
extern ARRAY(expected, poly, 16, 4);
+#if defined (__ARM_FEATURE_CRYPTO)
+extern ARRAY(expected, poly, 64, 1);
+#endif
extern ARRAY(expected, hfloat, 16, 4);
extern ARRAY(expected, hfloat, 32, 2);
extern ARRAY(expected, hfloat, 64, 1);
@@ -197,6 +214,9 @@ extern ARRAY(expected, uint, 32, 4);
extern ARRAY(expected, uint, 64, 2);
extern ARRAY(expected, poly, 8, 16);
extern ARRAY(expected, poly, 16, 8);
+#if defined (__ARM_FEATURE_CRYPTO)
+extern ARRAY(expected, poly, 64, 2);
+#endif
extern ARRAY(expected, hfloat, 16, 8);
extern ARRAY(expected, hfloat, 32, 4);
extern ARRAY(expected, hfloat, 64, 2);
@@ -213,6 +233,7 @@ extern ARRAY(expected, hfloat, 64, 2);
CHECK(test_name, uint, 64, 1, PRIx64, EXPECTED, comment); \
CHECK(test_name, poly, 8, 8, PRIx8, EXPECTED, comment); \
CHECK(test_name, poly, 16, 4, PRIx16, EXPECTED, comment); \
+ CHECK_CRYPTO(test_name, poly, 64, 1, PRIx64, EXPECTED, comment); \
CHECK_FP(test_name, float, 32, 2, PRIx32, EXPECTED, comment); \
\
CHECK(test_name, int, 8, 16, PRIx8, EXPECTED, comment); \
@@ -225,6 +246,7 @@ extern ARRAY(expected, hfloat, 64, 2);
CHECK(test_name, uint, 64, 2, PRIx64, EXPECTED, comment); \
CHECK(test_name, poly, 8, 16, PRIx8, EXPECTED, comment); \
CHECK(test_name, poly, 16, 8, PRIx16, EXPECTED, comment); \
+ CHECK_CRYPTO(test_name, poly, 64, 2, PRIx64, EXPECTED, comment); \
CHECK_FP(test_name, float, 32, 4, PRIx32, EXPECTED, comment); \
} \
@@ -398,6 +420,9 @@ static void clean_results (void)
CLEAN(result, uint, 64, 1);
CLEAN(result, poly, 8, 8);
CLEAN(result, poly, 16, 4);
+#if defined (__ARM_FEATURE_CRYPTO)
+ CLEAN(result, poly, 64, 1);
+#endif
#if defined (__ARM_FP16_FORMAT_IEEE) || defined (__ARM_FP16_FORMAT_ALTERNATIVE)
CLEAN(result, float, 16, 4);
#endif
@@ -413,6 +438,9 @@ static void clean_results (void)
CLEAN(result, uint, 64, 2);
CLEAN(result, poly, 8, 16);
CLEAN(result, poly, 16, 8);
+#if defined (__ARM_FEATURE_CRYPTO)
+ CLEAN(result, poly, 64, 2);
+#endif
#if defined (__ARM_FP16_FORMAT_IEEE) || defined (__ARM_FP16_FORMAT_ALTERNATIVE)
CLEAN(result, float, 16, 8);
#endif
@@ -438,6 +466,13 @@ static void clean_results (void)
#define DECL_VARIABLE(VAR, T1, W, N) \
VECT_TYPE(T1, W, N) VECT_VAR(VAR, T1, W, N)
+#if defined (__ARM_FEATURE_CRYPTO)
+#define DECL_VARIABLE_CRYPTO(VAR, T1, W, N) \
+ DECL_VARIABLE(VAR, T1, W, N)
+#else
+#define DECL_VARIABLE_CRYPTO(VAR, T1, W, N)
+#endif
+
/* Declare only 64 bits signed variants. */
#define DECL_VARIABLE_64BITS_SIGNED_VARIANTS(VAR) \
DECL_VARIABLE(VAR, int, 8, 8); \
@@ -473,6 +508,7 @@ static void clean_results (void)
DECL_VARIABLE_64BITS_UNSIGNED_VARIANTS(VAR); \
DECL_VARIABLE(VAR, poly, 8, 8); \
DECL_VARIABLE(VAR, poly, 16, 4); \
+ DECL_VARIABLE_CRYPTO(VAR, poly, 64, 1); \
DECL_VARIABLE(VAR, float, 16, 4); \
DECL_VARIABLE(VAR, float, 32, 2)
#else
@@ -481,6 +517,7 @@ static void clean_results (void)
DECL_VARIABLE_64BITS_UNSIGNED_VARIANTS(VAR); \
DECL_VARIABLE(VAR, poly, 8, 8); \
DECL_VARIABLE(VAR, poly, 16, 4); \
+ DECL_VARIABLE_CRYPTO(VAR, poly, 64, 1); \
DECL_VARIABLE(VAR, float, 32, 2)
#endif
@@ -491,6 +528,7 @@ static void clean_results (void)
DECL_VARIABLE_128BITS_UNSIGNED_VARIANTS(VAR); \
DECL_VARIABLE(VAR, poly, 8, 16); \
DECL_VARIABLE(VAR, poly, 16, 8); \
+ DECL_VARIABLE_CRYPTO(VAR, poly, 64, 2); \
DECL_VARIABLE(VAR, float, 16, 8); \
DECL_VARIABLE(VAR, float, 32, 4)
#else
@@ -499,6 +537,7 @@ static void clean_results (void)
DECL_VARIABLE_128BITS_UNSIGNED_VARIANTS(VAR); \
DECL_VARIABLE(VAR, poly, 8, 16); \
DECL_VARIABLE(VAR, poly, 16, 8); \
+ DECL_VARIABLE_CRYPTO(VAR, poly, 64, 2); \
DECL_VARIABLE(VAR, float, 32, 4)
#endif
/* Declare all variants. */
@@ -531,6 +570,13 @@ static void clean_results (void)
/* Helpers to call macros with 1 constant and 5 variable
arguments. */
+#if defined (__ARM_FEATURE_CRYPTO)
+#define MACRO_CRYPTO(MACRO, VAR1, VAR2, T1, T2, T3, W, N) \
+ MACRO(VAR1, VAR2, T1, T2, T3, W, N)
+#else
+#define MACRO_CRYPTO(MACRO, VAR1, VAR2, T1, T2, T3, W, N)
+#endif
+
#define TEST_MACRO_64BITS_SIGNED_VARIANTS_1_5(MACRO, VAR) \
MACRO(VAR, , int, s, 8, 8); \
MACRO(VAR, , int, s, 16, 4); \
@@ -601,13 +647,15 @@ static void clean_results (void)
TEST_MACRO_64BITS_SIGNED_VARIANTS_2_5(MACRO, VAR1, VAR2); \
TEST_MACRO_64BITS_UNSIGNED_VARIANTS_2_5(MACRO, VAR1, VAR2); \
MACRO(VAR1, VAR2, , poly, p, 8, 8); \
- MACRO(VAR1, VAR2, , poly, p, 16, 4)
+ MACRO(VAR1, VAR2, , poly, p, 16, 4); \
+ MACRO_CRYPTO(MACRO, VAR1, VAR2, , poly, p, 64, 1)
#define TEST_MACRO_128BITS_VARIANTS_2_5(MACRO, VAR1, VAR2) \
TEST_MACRO_128BITS_SIGNED_VARIANTS_2_5(MACRO, VAR1, VAR2); \
TEST_MACRO_128BITS_UNSIGNED_VARIANTS_2_5(MACRO, VAR1, VAR2); \
MACRO(VAR1, VAR2, q, poly, p, 8, 16); \
- MACRO(VAR1, VAR2, q, poly, p, 16, 8)
+ MACRO(VAR1, VAR2, q, poly, p, 16, 8); \
+ MACRO_CRYPTO(MACRO, VAR1, VAR2, q, poly, p, 64, 2)
#define TEST_MACRO_ALL_VARIANTS_2_5(MACRO, VAR1, VAR2) \
TEST_MACRO_64BITS_VARIANTS_2_5(MACRO, VAR1, VAR2); \
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
index 519cffb0125079022e7ba876c1ca657d9e37cac2..8907b38cde90b44a8f1501f72b2c4e812cba5707 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
@@ -1,8 +1,9 @@
/* This file contains tests for all the *p64 intrinsics, except for
vreinterpret which have their own testcase. */
-/* { dg-require-effective-target arm_crypto_ok } */
+/* { dg-require-effective-target arm_crypto_ok { target { arm*-*-* } } } */
/* { dg-add-options arm_crypto } */
+/* { dg-additional-options "-march=armv8-a+crypto" { target { aarch64*-*-* } } }*/
#include <arm_neon.h>
#include "arm-neon-ref.h"
@@ -38,6 +39,17 @@ VECT_VAR_DECL(vdup_n_expected2,poly,64,1) [] = { 0xfffffffffffffff2 };
VECT_VAR_DECL(vdup_n_expected2,poly,64,2) [] = { 0xfffffffffffffff2,
0xfffffffffffffff2 };
+/* Expected results: vmov_n. */
+VECT_VAR_DECL(vmov_n_expected0,poly,64,1) [] = { 0xfffffffffffffff0 };
+VECT_VAR_DECL(vmov_n_expected0,poly,64,2) [] = { 0xfffffffffffffff0,
+ 0xfffffffffffffff0 };
+VECT_VAR_DECL(vmov_n_expected1,poly,64,1) [] = { 0xfffffffffffffff1 };
+VECT_VAR_DECL(vmov_n_expected1,poly,64,2) [] = { 0xfffffffffffffff1,
+ 0xfffffffffffffff1 };
+VECT_VAR_DECL(vmov_n_expected2,poly,64,1) [] = { 0xfffffffffffffff2 };
+VECT_VAR_DECL(vmov_n_expected2,poly,64,2) [] = { 0xfffffffffffffff2,
+ 0xfffffffffffffff2 };
+
/* Expected results: vext. */
VECT_VAR_DECL(vext_expected,poly,64,1) [] = { 0xfffffffffffffff0 };
VECT_VAR_DECL(vext_expected,poly,64,2) [] = { 0xfffffffffffffff1, 0x88 };
@@ -45,6 +57,9 @@ VECT_VAR_DECL(vext_expected,poly,64,2) [] = { 0xfffffffffffffff1, 0x88 };
/* Expected results: vget_low. */
VECT_VAR_DECL(vget_low_expected,poly,64,1) [] = { 0xfffffffffffffff0 };
+/* Expected results: vget_high. */
+VECT_VAR_DECL(vget_high_expected,poly,64,1) [] = { 0xfffffffffffffff1 };
+
/* Expected results: vld1. */
VECT_VAR_DECL(vld1_expected,poly,64,1) [] = { 0xfffffffffffffff0 };
VECT_VAR_DECL(vld1_expected,poly,64,2) [] = { 0xfffffffffffffff0,
@@ -109,6 +124,39 @@ VECT_VAR_DECL(vst1_lane_expected,poly,64,1) [] = { 0xfffffffffffffff0 };
VECT_VAR_DECL(vst1_lane_expected,poly,64,2) [] = { 0xfffffffffffffff0,
0x3333333333333333 };
+/* Expected results: vldX_lane. */
+VECT_VAR_DECL(expected_vld_st2_0,poly,64,1) [] = { 0xfffffffffffffff0 };
+VECT_VAR_DECL(expected_vld_st2_0,poly,64,2) [] = { 0xfffffffffffffff0,
+ 0xfffffffffffffff1 };
+VECT_VAR_DECL(expected_vld_st2_1,poly,64,1) [] = { 0xfffffffffffffff1 };
+VECT_VAR_DECL(expected_vld_st2_1,poly,64,2) [] = { 0xaaaaaaaaaaaaaaaa,
+ 0xaaaaaaaaaaaaaaaa };
+VECT_VAR_DECL(expected_vld_st3_0,poly,64,1) [] = { 0xfffffffffffffff0 };
+VECT_VAR_DECL(expected_vld_st3_0,poly,64,2) [] = { 0xfffffffffffffff0,
+ 0xfffffffffffffff1 };
+VECT_VAR_DECL(expected_vld_st3_1,poly,64,1) [] = { 0xfffffffffffffff1 };
+VECT_VAR_DECL(expected_vld_st3_1,poly,64,2) [] = { 0xfffffffffffffff2,
+ 0xaaaaaaaaaaaaaaaa };
+VECT_VAR_DECL(expected_vld_st3_2,poly,64,1) [] = { 0xfffffffffffffff2 };
+VECT_VAR_DECL(expected_vld_st3_2,poly,64,2) [] = { 0xaaaaaaaaaaaaaaaa,
+ 0xaaaaaaaaaaaaaaaa };
+VECT_VAR_DECL(expected_vld_st4_0,poly,64,1) [] = { 0xfffffffffffffff0 };
+VECT_VAR_DECL(expected_vld_st4_0,poly,64,2) [] = { 0xfffffffffffffff0,
+ 0xfffffffffffffff1 };
+VECT_VAR_DECL(expected_vld_st4_1,poly,64,1) [] = { 0xfffffffffffffff1 };
+VECT_VAR_DECL(expected_vld_st4_1,poly,64,2) [] = { 0xfffffffffffffff2,
+ 0xfffffffffffffff3 };
+VECT_VAR_DECL(expected_vld_st4_2,poly,64,1) [] = { 0xfffffffffffffff2 };
+VECT_VAR_DECL(expected_vld_st4_2,poly,64,2) [] = { 0xaaaaaaaaaaaaaaaa,
+ 0xaaaaaaaaaaaaaaaa };
+VECT_VAR_DECL(expected_vld_st4_3,poly,64,1) [] = { 0xfffffffffffffff3 };
+VECT_VAR_DECL(expected_vld_st4_3,poly,64,2) [] = { 0xaaaaaaaaaaaaaaaa,
+ 0xaaaaaaaaaaaaaaaa };
+
+/* Expected results: vget_lane. */
+VECT_VAR_DECL(vget_lane_expected,poly,64,1) = 0xfffffffffffffff0;
+VECT_VAR_DECL(vget_lane_expected,poly,64,2) = 0xfffffffffffffff0;
+
int main (void)
{
int i;
@@ -341,6 +389,26 @@ int main (void)
CHECK(TEST_MSG, poly, 64, 1, PRIx64, vget_low_expected, "");
+ /* vget_high_p64 tests. */
+#undef TEST_MSG
+#define TEST_MSG "VGET_HIGH"
+
+#define TEST_VGET_HIGH(T1, T2, W, N, N2) \
+ VECT_VAR(vget_high_vector64, T1, W, N) = \
+ vget_high_##T2##W(VECT_VAR(vget_high_vector128, T1, W, N2)); \
+ vst1_##T2##W(VECT_VAR(result, T1, W, N), VECT_VAR(vget_high_vector64, T1, W, N))
+
+ DECL_VARIABLE(vget_high_vector64, poly, 64, 1);
+ DECL_VARIABLE(vget_high_vector128, poly, 64, 2);
+
+ CLEAN(result, poly, 64, 1);
+
+ VLOAD(vget_high_vector128, buffer, q, poly, p, 64, 2);
+
+ TEST_VGET_HIGH(poly, p, 64, 1, 2);
+
+ CHECK(TEST_MSG, poly, 64, 1, PRIx64, vget_high_expected, "");
+
/* vld1_p64 tests. */
#undef TEST_MSG
#define TEST_MSG "VLD1/VLD1Q"
@@ -645,7 +713,7 @@ int main (void)
VECT_VAR(vst1_lane_vector, T1, W, N) = \
vld1##Q##_##T2##W(VECT_VAR(buffer, T1, W, N)); \
vst1##Q##_lane_##T2##W(VECT_VAR(result, T1, W, N), \
- VECT_VAR(vst1_lane_vector, T1, W, N), L)
+ VECT_VAR(vst1_lane_vector, T1, W, N), L);
DECL_VARIABLE(vst1_lane_vector, poly, 64, 1);
DECL_VARIABLE(vst1_lane_vector, poly, 64, 2);
@@ -659,5 +727,298 @@ int main (void)
CHECK(TEST_MSG, poly, 64, 1, PRIx64, vst1_lane_expected, "");
CHECK(TEST_MSG, poly, 64, 2, PRIx64, vst1_lane_expected, "");
+#ifdef __aarch64__
+
+ /* vmov_n_p64 tests. */
+#undef TEST_MSG
+#define TEST_MSG "VMOV/VMOVQ"
+
+#define TEST_VMOV(Q, T1, T2, W, N) \
+ VECT_VAR(vmov_n_vector, T1, W, N) = \
+ vmov##Q##_n_##T2##W(VECT_VAR(buffer_dup, T1, W, N)[i]); \
+ vst1##Q##_##T2##W(VECT_VAR(result, T1, W, N), VECT_VAR(vmov_n_vector, T1, W, N))
+
+ DECL_VARIABLE(vmov_n_vector, poly, 64, 1);
+ DECL_VARIABLE(vmov_n_vector, poly, 64, 2);
+
+ /* Try to read different places from the input buffer. */
+ for (i=0; i< 3; i++) {
+ CLEAN(result, poly, 64, 1);
+ CLEAN(result, poly, 64, 2);
+
+ TEST_VMOV(, poly, p, 64, 1);
+ TEST_VMOV(q, poly, p, 64, 2);
+
+ switch (i) {
+ case 0:
+ CHECK(TEST_MSG, poly, 64, 1, PRIx64, vmov_n_expected0, "");
+ CHECK(TEST_MSG, poly, 64, 2, PRIx64, vmov_n_expected0, "");
+ break;
+ case 1:
+ CHECK(TEST_MSG, poly, 64, 1, PRIx64, vmov_n_expected1, "");
+ CHECK(TEST_MSG, poly, 64, 2, PRIx64, vmov_n_expected1, "");
+ break;
+ case 2:
+ CHECK(TEST_MSG, poly, 64, 1, PRIx64, vmov_n_expected2, "");
+ CHECK(TEST_MSG, poly, 64, 2, PRIx64, vmov_n_expected2, "");
+ break;
+ default:
+ abort();
+ }
+ }
+
+ /* vget_lane_p64 tests. */
+#undef TEST_MSG
+#define TEST_MSG "VGET_LANE/VGETQ_LANE"
+
+#define TEST_VGET_LANE(Q, T1, T2, W, N, L) \
+ VECT_VAR(vget_lane_vector, T1, W, N) = vget##Q##_lane_##T2##W(VECT_VAR(vector, T1, W, N), L); \
+ if (VECT_VAR(vget_lane_vector, T1, W, N) != VECT_VAR(vget_lane_expected, T1, W, N)) { \
+ fprintf(stderr, \
+ "ERROR in %s (%s line %d in result '%s') at type %s " \
+ "got 0x%" PRIx##W " != 0x%" PRIx##W "\n", \
+ TEST_MSG, __FILE__, __LINE__, \
+ STR(VECT_VAR(vget_lane_expected, T1, W, N)), \
+ STR(VECT_NAME(T1, W, N)), \
+ VECT_VAR(vget_lane_vector, T1, W, N), \
+ VECT_VAR(vget_lane_expected, T1, W, N)); \
+ abort (); \
+ }
+
+ /* Initialize input values. */
+ DECL_VARIABLE(vector, poly, 64, 1);
+ DECL_VARIABLE(vector, poly, 64, 2);
+
+ VLOAD(vector, buffer, , poly, p, 64, 1);
+ VLOAD(vector, buffer, q, poly, p, 64, 2);
+
+ VECT_VAR_DECL(vget_lane_vector, poly, 64, 1);
+ VECT_VAR_DECL(vget_lane_vector, poly, 64, 2);
+
+ TEST_VGET_LANE( , poly, p, 64, 1, 0);
+ TEST_VGET_LANE(q, poly, p, 64, 2, 0);
+
+ /* vldx_lane_p64 tests. */
+#undef TEST_MSG
+#define TEST_MSG "VLDX_LANE/VLDXQ_LANE"
+
+VECT_VAR_DECL_INIT(buffer_vld2_lane, poly, 64, 2);
+VECT_VAR_DECL_INIT(buffer_vld3_lane, poly, 64, 3);
+VECT_VAR_DECL_INIT(buffer_vld4_lane, poly, 64, 4);
+
+ /* In this case, input variables are arrays of vectors. */
+#define DECL_VLD_STX_LANE(T1, W, N, X) \
+ VECT_ARRAY_TYPE(T1, W, N, X) VECT_ARRAY_VAR(vector, T1, W, N, X); \
+ VECT_ARRAY_TYPE(T1, W, N, X) VECT_ARRAY_VAR(vector_src, T1, W, N, X); \
+ VECT_VAR_DECL(result_bis_##X, T1, W, N)[X * N]
+
+ /* We need to use a temporary result buffer (result_bis), because
+ the one used for other tests is not large enough. A subset of the
+ result data is moved from result_bis to result, and it is this
+ subset which is used to check the actual behavior. The next
+ macro enables to move another chunk of data from result_bis to
+ result. */
+ /* We also use another extra input buffer (buffer_src), which we
+ fill with 0xAA, and which is used to load a vector from which we
+ read a given lane. */
+
+#define TEST_VLDX_LANE(Q, T1, T2, W, N, X, L) \
+ memset (VECT_VAR(buffer_src, T1, W, N), 0xAA, \
+ sizeof(VECT_VAR(buffer_src, T1, W, N))); \
+ \
+ VECT_ARRAY_VAR(vector_src, T1, W, N, X) = \
+ vld##X##Q##_##T2##W(VECT_VAR(buffer_src, T1, W, N)); \
+ \
+ VECT_ARRAY_VAR(vector, T1, W, N, X) = \
+ /* Use dedicated init buffer, of size X. */ \
+ vld##X##Q##_lane_##T2##W(VECT_VAR(buffer_vld##X##_lane, T1, W, X), \
+ VECT_ARRAY_VAR(vector_src, T1, W, N, X), \
+ L); \
+ vst##X##Q##_##T2##W(VECT_VAR(result_bis_##X, T1, W, N), \
+ VECT_ARRAY_VAR(vector, T1, W, N, X)); \
+ memcpy(VECT_VAR(result, T1, W, N), VECT_VAR(result_bis_##X, T1, W, N), \
+ sizeof(VECT_VAR(result, T1, W, N)))
+
+ /* Overwrite "result" with the contents of "result_bis"[Y]. */
+#undef TEST_EXTRA_CHUNK
+#define TEST_EXTRA_CHUNK(T1, W, N, X, Y) \
+ memcpy(VECT_VAR(result, T1, W, N), \
+ &(VECT_VAR(result_bis_##X, T1, W, N)[Y*N]), \
+ sizeof(VECT_VAR(result, T1, W, N)));
+
+ /* Add some padding to try to catch out of bound accesses. */
+#define ARRAY1(V, T, W, N) VECT_VAR_DECL(V,T,W,N)[1]={42}
+#define DUMMY_ARRAY(V, T, W, N, L) \
+ VECT_VAR_DECL(V,T,W,N)[N*L]={0}; \
+ ARRAY1(V##_pad,T,W,N)
+
+#define DECL_ALL_VLD_STX_LANE(X) \
+ DECL_VLD_STX_LANE(poly, 64, 1, X); \
+ DECL_VLD_STX_LANE(poly, 64, 2, X);
+
+#define TEST_ALL_VLDX_LANE(X) \
+ TEST_VLDX_LANE(, poly, p, 64, 1, X, 0); \
+ TEST_VLDX_LANE(q, poly, p, 64, 2, X, 0);
+
+#define TEST_ALL_EXTRA_CHUNKS(X,Y) \
+ TEST_EXTRA_CHUNK(poly, 64, 1, X, Y) \
+ TEST_EXTRA_CHUNK(poly, 64, 2, X, Y)
+
+#define CHECK_RESULTS_VLD_STX_LANE(test_name,EXPECTED,comment) \
+ CHECK(test_name, poly, 64, 1, PRIx64, EXPECTED, comment); \
+ CHECK(test_name, poly, 64, 2, PRIx64, EXPECTED, comment);
+
+ /* Declare the temporary buffers / variables. */
+ DECL_ALL_VLD_STX_LANE(2);
+ DECL_ALL_VLD_STX_LANE(3);
+ DECL_ALL_VLD_STX_LANE(4);
+
+ DUMMY_ARRAY(buffer_src, poly, 64, 1, 4);
+ DUMMY_ARRAY(buffer_src, poly, 64, 2, 4);
+
+ /* Check vld2_lane/vld2q_lane. */
+ clean_results ();
+#undef TEST_MSG
+#define TEST_MSG "VLD2_LANE/VLD2Q_LANE"
+ TEST_ALL_VLDX_LANE(2);
+ CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st2_0, " chunk 0");
+
+ TEST_ALL_EXTRA_CHUNKS(2, 1);
+ CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st2_1, " chunk 1");
+
+ /* Check vld3_lane/vld3q_lane. */
+ clean_results ();
+#undef TEST_MSG
+#define TEST_MSG "VLD3_LANE/VLD3Q_LANE"
+ TEST_ALL_VLDX_LANE(3);
+ CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st3_0, " chunk 0");
+
+ TEST_ALL_EXTRA_CHUNKS(3, 1);
+ CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st3_1, " chunk 1");
+
+ TEST_ALL_EXTRA_CHUNKS(3, 2);
+ CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st3_2, " chunk 2");
+
+ /* Check vld4_lane/vld4q_lane. */
+ clean_results ();
+#undef TEST_MSG
+#define TEST_MSG "VLD4_LANE/VLD4Q_LANE"
+ TEST_ALL_VLDX_LANE(4);
+ CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st4_0, " chunk 0");
+
+ TEST_ALL_EXTRA_CHUNKS(4, 1);
+ CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st4_1, " chunk 1");
+ TEST_ALL_EXTRA_CHUNKS(4, 2);
+
+ CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st4_2, " chunk 2");
+
+ TEST_ALL_EXTRA_CHUNKS(4, 3);
+ CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st4_3, " chunk 3");
+
+ /* In this case, input variables are arrays of vectors. */
+#define DECL_VSTX_LANE(T1, W, N, X) \
+ VECT_ARRAY_TYPE(T1, W, N, X) VECT_ARRAY_VAR(vector, T1, W, N, X); \
+ VECT_ARRAY_TYPE(T1, W, N, X) VECT_ARRAY_VAR(vector_src, T1, W, N, X); \
+ VECT_VAR_DECL(result_bis_##X, T1, W, N)[X * N]
+
+ /* We need to use a temporary result buffer (result_bis), because
+ the one used for other tests is not large enough. A subset of the
+ result data is moved from result_bis to result, and it is this
+ subset which is used to check the actual behavior. The next
+ macro enables to move another chunk of data from result_bis to
+ result. */
+ /* We also use another extra input buffer (buffer_src), which we
+ fill with 0xAA, and which is used to load a vector from which we
+ read a given lane. */
+#define TEST_VSTX_LANE(Q, T1, T2, W, N, X, L) \
+ memset (VECT_VAR(buffer_src, T1, W, N), 0xAA, \
+ sizeof(VECT_VAR(buffer_src, T1, W, N))); \
+ memset (VECT_VAR(result_bis_##X, T1, W, N), 0, \
+ sizeof(VECT_VAR(result_bis_##X, T1, W, N))); \
+ \
+ VECT_ARRAY_VAR(vector_src, T1, W, N, X) = \
+ vld##X##Q##_##T2##W(VECT_VAR(buffer_src, T1, W, N)); \
+ \
+ VECT_ARRAY_VAR(vector, T1, W, N, X) = \
+ /* Use dedicated init buffer, of size X. */ \
+ vld##X##Q##_lane_##T2##W(VECT_VAR(buffer_vld##X##_lane, T1, W, X), \
+ VECT_ARRAY_VAR(vector_src, T1, W, N, X), \
+ L); \
+ vst##X##Q##_lane_##T2##W(VECT_VAR(result_bis_##X, T1, W, N), \
+ VECT_ARRAY_VAR(vector, T1, W, N, X), \
+ L); \
+ memcpy(VECT_VAR(result, T1, W, N), VECT_VAR(result_bis_##X, T1, W, N), \
+ sizeof(VECT_VAR(result, T1, W, N)));
+
+#define TEST_ALL_VSTX_LANE(X) \
+ TEST_VSTX_LANE(, poly, p, 64, 1, X, 0); \
+ TEST_VSTX_LANE(q, poly, p, 64, 2, X, 0);
+
+ /* Check vst2_lane/vst2q_lane. */
+ clean_results ();
+#undef TEST_MSG
+#define TEST_MSG "VST2_LANE/VST2Q_LANE"
+ TEST_ALL_VSTX_LANE(2);
+
+#define CMT " (chunk 0)"
+ CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st2_0, CMT);
+
+ TEST_ALL_EXTRA_CHUNKS(2, 1);
+#undef CMT
+#define CMT " chunk 1"
+ CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st2_1, CMT);
+
+ /* Check vst3_lane/vst3q_lane. */
+ clean_results ();
+#undef TEST_MSG
+#define TEST_MSG "VST3_LANE/VST3Q_LANE"
+ TEST_ALL_VSTX_LANE(3);
+
+#undef CMT
+#define CMT " (chunk 0)"
+ CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st3_0, CMT);
+
+ TEST_ALL_EXTRA_CHUNKS(3, 1);
+
+#undef CMT
+#define CMT " (chunk 1)"
+ CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st3_1, CMT);
+
+ TEST_ALL_EXTRA_CHUNKS(3, 2);
+
+#undef CMT
+#define CMT " (chunk 2)"
+ CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st3_2, CMT);
+
+ /* Check vst4_lane/vst4q_lane. */
+ clean_results ();
+#undef TEST_MSG
+#define TEST_MSG "VST4_LANE/VST4Q_LANE"
+ TEST_ALL_VSTX_LANE(4);
+
+#undef CMT
+#define CMT " (chunk 0)"
+ CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st4_0, CMT);
+
+ TEST_ALL_EXTRA_CHUNKS(4, 1);
+
+#undef CMT
+#define CMT " (chunk 1)"
+ CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st4_1, CMT);
+
+ TEST_ALL_EXTRA_CHUNKS(4, 2);
+
+#undef CMT
+#define CMT " (chunk 2)"
+ CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st4_2, CMT);
+
+ TEST_ALL_EXTRA_CHUNKS(4, 3);
+
+#undef CMT
+#define CMT " (chunk 3)"
+ CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st4_3, CMT);
+
+#endif /* __aarch64__. */
+
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c
index 808641524c47b2c245ee2f10e74a784a7bccefc9..f192d4dda514287c8417e7fc922bc580b209b163 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c
@@ -1,7 +1,8 @@
/* This file contains tests for the vreinterpret *p128 intrinsics. */
-/* { dg-require-effective-target arm_crypto_ok } */
+/* { dg-require-effective-target arm_crypto_ok { target { arm*-*-* } } } */
/* { dg-add-options arm_crypto } */
+/* { dg-additional-options "-march=armv8-a+crypto" { target { aarch64*-*-* } } }*/
#include <arm_neon.h>
#include "arm-neon-ref.h"
@@ -78,9 +79,7 @@ VECT_VAR_DECL(vreint_expected_q_f16_p128,hfloat,16,8) [] = { 0xfff0, 0xffff,
int main (void)
{
DECL_VARIABLE_128BITS_VARIANTS(vreint_vector);
- DECL_VARIABLE(vreint_vector, poly, 64, 2);
DECL_VARIABLE_128BITS_VARIANTS(vreint_vector_res);
- DECL_VARIABLE(vreint_vector_res, poly, 64, 2);
clean_results ();
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c
index 1d8cf9aa69f0b5b0717e98de613e3c350d6395d4..c915fd2fea6b4d8770c9a4aab88caad391105d89 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c
@@ -1,7 +1,8 @@
/* This file contains tests for the vreinterpret *p64 intrinsics. */
-/* { dg-require-effective-target arm_crypto_ok } */
+/* { dg-require-effective-target arm_crypto_ok { target { arm*-*-* } } } */
/* { dg-add-options arm_crypto } */
+/* { dg-additional-options "-march=armv8-a+crypto" { target { aarch64*-*-* } } }*/
#include <arm_neon.h>
#include "arm-neon-ref.h"
@@ -121,11 +122,7 @@ int main (void)
CHECK_FP(TEST_MSG, T1, W, N, PRIx##W, EXPECTED, "");
DECL_VARIABLE_ALL_VARIANTS(vreint_vector);
- DECL_VARIABLE(vreint_vector, poly, 64, 1);
- DECL_VARIABLE(vreint_vector, poly, 64, 2);
DECL_VARIABLE_ALL_VARIANTS(vreint_vector_res);
- DECL_VARIABLE(vreint_vector_res, poly, 64, 1);
- DECL_VARIABLE(vreint_vector_res, poly, 64, 2);
clean_results ();
Thread overview: 17+ messages
2016-11-07 13:55 Tamar Christina
2016-11-08 11:21 ` Christophe Lyon
2016-11-08 11:59 ` Tamar Christina
2016-11-24 11:45 ` Tamar Christina
2016-11-25 14:54 ` Christophe Lyon
2016-11-25 15:03 ` Christophe Lyon
2016-11-25 16:01 ` Tamar Christina
2016-11-29 9:50 ` Tamar Christina [this message]
2016-11-29 10:12 ` Christophe Lyon
2016-11-29 12:58 ` Christophe Lyon
2016-11-29 13:48 ` Kyrill Tkachov
2016-11-29 13:55 ` James Greenhalgh
2016-11-30 9:05 ` Christophe Lyon
2016-11-30 9:40 ` Tamar Christina
2016-12-07 4:34 ` Andrew Pinski
2016-12-12 11:29 ` Tamar Christina
2016-12-12 23:53 ` Andrew Pinski