* [PATCH 1/3] optabs.c: Make vector_compare_rtx cope with VOIDmode constants (e.g. const0_rtx)
2015-04-17 15:37 [PATCH 0/3][AArch64] DImode vector compares Alan Lawrence
@ 2015-04-17 15:39 ` Alan Lawrence
2015-05-01 15:12 ` Alan Lawrence
2015-04-17 15:40 ` [PATCH 2/3][AArch64] Add vcond(u?)didi pattern Alan Lawrence
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Alan Lawrence @ 2015-04-17 15:39 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 228 bytes --]
As per introduction, this allows vector_compare_rtx to work on DImode vectors.
Bootstrapped + check-gcc on x86-unknown-linux-gnu.
gcc/ChangeLog:
* optabs.c (vector_compare_rtx): Handle RTL operands having VOIDmode.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: vec_cmp_rtx.patch --]
[-- Type: text/x-patch; name=vec_cmp_rtx.patch, Size: 1497 bytes --]
diff --git a/gcc/optabs.c b/gcc/optabs.c
index f8d584eeeb11a2c19d8c8d887a0ff18aed5f56b4..135c88938f8bc03eed4dc7f1b5adcb0bb0606b1e 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6530,18 +6530,28 @@ vector_compare_rtx (enum tree_code tcode, tree t_op0, tree t_op1,
{
struct expand_operand ops[2];
rtx rtx_op0, rtx_op1;
+ machine_mode m0, m1;
enum rtx_code rcode = get_rtx_code (tcode, unsignedp);
gcc_assert (TREE_CODE_CLASS (tcode) == tcc_comparison);
- /* Expand operands. */
+ /* Expand operands. For vector types with scalar modes, e.g. where int64x1_t
+ has mode DImode, this can produce a constant RTX of mode VOIDmode; in such
+ cases, use the original mode. */
rtx_op0 = expand_expr (t_op0, NULL_RTX, TYPE_MODE (TREE_TYPE (t_op0)),
EXPAND_STACK_PARM);
+ m0 = GET_MODE (rtx_op0);
+ if (m0 == VOIDmode)
+ m0 = TYPE_MODE (TREE_TYPE (t_op0));
+
rtx_op1 = expand_expr (t_op1, NULL_RTX, TYPE_MODE (TREE_TYPE (t_op1)),
EXPAND_STACK_PARM);
+ m1 = GET_MODE (rtx_op1);
+ if (m1 == VOIDmode)
+ m1 = TYPE_MODE (TREE_TYPE (t_op1));
- create_input_operand (&ops[0], rtx_op0, GET_MODE (rtx_op0));
- create_input_operand (&ops[1], rtx_op1, GET_MODE (rtx_op1));
+ create_input_operand (&ops[0], rtx_op0, m0);
+ create_input_operand (&ops[1], rtx_op1, m1);
if (!maybe_legitimize_operands (icode, 4, 2, ops))
gcc_unreachable ();
return gen_rtx_fmt_ee (rcode, VOIDmode, ops[0].value, ops[1].value);
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/3][AArch64] Add vcond(u?)didi pattern
2015-04-17 15:37 [PATCH 0/3][AArch64] DImode vector compares Alan Lawrence
2015-04-17 15:39 ` [PATCH 1/3] optabs.c: Make vector_compare_rtx cope with VOIDmode constants (e.g. const0_rtx) Alan Lawrence
@ 2015-04-17 15:40 ` Alan Lawrence
2015-05-05 11:13 ` Marcus Shawcroft
2015-04-17 15:41 ` [PATCH 3/3][AArch64] Idiomatic 64x1 comparisons in arm_neon.h Alan Lawrence
2015-05-05 9:43 ` [PATCH 0/3][AArch64] DImode vector compares Alan Lawrence
3 siblings, 1 reply; 9+ messages in thread
From: Alan Lawrence @ 2015-04-17 15:40 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 316 bytes --]
This just adds the necessary patterns used for comparisons of DImode vectors.
Used as part of arm_neon.h, in next/final patch.
Tested on aarch64-none-elf.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (aarch64_vcond_internal<mode><mode>,
vcond<mode><mode>, vcondu<mode>,<mode>): Add DImode variant.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: vcond_pats.patch --]
[-- Type: text/x-patch; name=vcond_pats.patch, Size: 2625 bytes --]
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 52a1c3ba792adcaeaec9be4d8ada0f81bfa4714a..591740f5809d95f6f5502feda8599fd2958327bd 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2031,13 +2031,13 @@
})
(define_expand "aarch64_vcond_internal<mode><mode>"
- [(set (match_operand:VDQ_I 0 "register_operand")
- (if_then_else:VDQ_I
+ [(set (match_operand:VSDQ_I_DI 0 "register_operand")
+ (if_then_else:VSDQ_I_DI
(match_operator 3 "comparison_operator"
- [(match_operand:VDQ_I 4 "register_operand")
- (match_operand:VDQ_I 5 "nonmemory_operand")])
- (match_operand:VDQ_I 1 "nonmemory_operand")
- (match_operand:VDQ_I 2 "nonmemory_operand")))]
+ [(match_operand:VSDQ_I_DI 4 "register_operand")
+ (match_operand:VSDQ_I_DI 5 "nonmemory_operand")])
+ (match_operand:VSDQ_I_DI 1 "nonmemory_operand")
+ (match_operand:VSDQ_I_DI 2 "nonmemory_operand")))]
"TARGET_SIMD"
{
rtx op1 = operands[1];
@@ -2339,13 +2339,13 @@
})
(define_expand "vcond<mode><mode>"
- [(set (match_operand:VALL 0 "register_operand")
- (if_then_else:VALL
+ [(set (match_operand:VALLDI 0 "register_operand")
+ (if_then_else:VALLDI
(match_operator 3 "comparison_operator"
- [(match_operand:VALL 4 "register_operand")
- (match_operand:VALL 5 "nonmemory_operand")])
- (match_operand:VALL 1 "nonmemory_operand")
- (match_operand:VALL 2 "nonmemory_operand")))]
+ [(match_operand:VALLDI 4 "register_operand")
+ (match_operand:VALLDI 5 "nonmemory_operand")])
+ (match_operand:VALLDI 1 "nonmemory_operand")
+ (match_operand:VALLDI 2 "nonmemory_operand")))]
"TARGET_SIMD"
{
emit_insn (gen_aarch64_vcond_internal<mode><mode> (operands[0], operands[1],
@@ -2372,13 +2372,13 @@
})
(define_expand "vcondu<mode><mode>"
- [(set (match_operand:VDQ_I 0 "register_operand")
- (if_then_else:VDQ_I
+ [(set (match_operand:VSDQ_I_DI 0 "register_operand")
+ (if_then_else:VSDQ_I_DI
(match_operator 3 "comparison_operator"
- [(match_operand:VDQ_I 4 "register_operand")
- (match_operand:VDQ_I 5 "nonmemory_operand")])
- (match_operand:VDQ_I 1 "nonmemory_operand")
- (match_operand:VDQ_I 2 "nonmemory_operand")))]
+ [(match_operand:VSDQ_I_DI 4 "register_operand")
+ (match_operand:VSDQ_I_DI 5 "nonmemory_operand")])
+ (match_operand:VSDQ_I_DI 1 "nonmemory_operand")
+ (match_operand:VSDQ_I_DI 2 "nonmemory_operand")))]
"TARGET_SIMD"
{
emit_insn (gen_aarch64_vcond_internal<mode><mode> (operands[0], operands[1],
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 3/3][AArch64] Idiomatic 64x1 comparisons in arm_neon.h
2015-04-17 15:37 [PATCH 0/3][AArch64] DImode vector compares Alan Lawrence
2015-04-17 15:39 ` [PATCH 1/3] optabs.c: Make vector_compare_rtx cope with VOIDmode constants (e.g. const0_rtx) Alan Lawrence
2015-04-17 15:40 ` [PATCH 2/3][AArch64] Add vcond(u?)didi pattern Alan Lawrence
@ 2015-04-17 15:41 ` Alan Lawrence
2015-05-05 11:14 ` Marcus Shawcroft
2015-05-05 9:43 ` [PATCH 0/3][AArch64] DImode vector compares Alan Lawrence
3 siblings, 1 reply; 9+ messages in thread
From: Alan Lawrence @ 2015-04-17 15:41 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 507 bytes --]
This also makes the existing intrinsics tests apply to the new patterns.
Tested on aarch64-none-elf.
gcc/ChangeLog:
* config/aarch64/arm_neon.h (vceq_s64, vceq_u64, vceqz_s64, vceqz_u64,
vcge_s64, vcge_u64, vcgez_s64, vcgt_s64, vcgt_u64, vcgtz_s64, vcle_s64,
vcle_u64, vclez_s64, vclt_s64, vclt_u64, vcltz_s64, vtst_s64,
vtst_u64): Rewrite using gcc vector extensions.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/singleton_intrinsics_1.c: Generalize regex to
allow cmlt or sshr.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: cmp64x1_neon.patch --]
[-- Type: text/x-patch; name=cmp64x1_neon.patch, Size: 7619 bytes --]
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 319cd8c1a0a441831a037e9c063badce7565f97c..02cdc7852d92e30e38c9c62ed09137b0d96cf6a6 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -12367,7 +12367,7 @@ vceq_s32 (int32x2_t __a, int32x2_t __b)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vceq_s64 (int64x1_t __a, int64x1_t __b)
{
- return (uint64x1_t) {__a[0] == __b[0] ? -1ll : 0ll};
+ return (uint64x1_t) (__a == __b);
}
__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
@@ -12391,7 +12391,7 @@ vceq_u32 (uint32x2_t __a, uint32x2_t __b)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vceq_u64 (uint64x1_t __a, uint64x1_t __b)
{
- return (uint64x1_t) {__a[0] == __b[0] ? -1ll : 0ll};
+ return (__a == __b);
}
__extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
@@ -12527,7 +12527,7 @@ vceqz_s32 (int32x2_t __a)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vceqz_s64 (int64x1_t __a)
{
- return (uint64x1_t) {__a[0] == 0ll ? -1ll : 0ll};
+ return (uint64x1_t) (__a == __AARCH64_INT64_C (0));
}
__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
@@ -12551,7 +12551,7 @@ vceqz_u32 (uint32x2_t __a)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vceqz_u64 (uint64x1_t __a)
{
- return (uint64x1_t) {__a[0] == 0ll ? -1ll : 0ll};
+ return (__a == __AARCH64_UINT64_C (0));
}
__extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
@@ -12681,7 +12681,7 @@ vcge_s32 (int32x2_t __a, int32x2_t __b)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vcge_s64 (int64x1_t __a, int64x1_t __b)
{
- return (uint64x1_t) {__a[0] >= __b[0] ? -1ll : 0ll};
+ return (uint64x1_t) (__a >= __b);
}
__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
@@ -12705,7 +12705,7 @@ vcge_u32 (uint32x2_t __a, uint32x2_t __b)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vcge_u64 (uint64x1_t __a, uint64x1_t __b)
{
- return (uint64x1_t) {__a[0] >= __b[0] ? -1ll : 0ll};
+ return (__a >= __b);
}
__extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
@@ -12829,7 +12829,7 @@ vcgez_s32 (int32x2_t __a)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vcgez_s64 (int64x1_t __a)
{
- return (uint64x1_t) {__a[0] >= 0ll ? -1ll : 0ll};
+ return (uint64x1_t) (__a >= __AARCH64_INT64_C (0));
}
__extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
@@ -12923,7 +12923,7 @@ vcgt_s32 (int32x2_t __a, int32x2_t __b)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vcgt_s64 (int64x1_t __a, int64x1_t __b)
{
- return (uint64x1_t) (__a[0] > __b[0] ? -1ll : 0ll);
+ return (uint64x1_t) (__a > __b);
}
__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
@@ -12947,7 +12947,7 @@ vcgt_u32 (uint32x2_t __a, uint32x2_t __b)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vcgt_u64 (uint64x1_t __a, uint64x1_t __b)
{
- return (uint64x1_t) (__a[0] > __b[0] ? -1ll : 0ll);
+ return (__a > __b);
}
__extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
@@ -13071,7 +13071,7 @@ vcgtz_s32 (int32x2_t __a)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vcgtz_s64 (int64x1_t __a)
{
- return (uint64x1_t) {__a[0] > 0ll ? -1ll : 0ll};
+ return (uint64x1_t) (__a > __AARCH64_INT64_C (0));
}
__extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
@@ -13165,7 +13165,7 @@ vcle_s32 (int32x2_t __a, int32x2_t __b)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vcle_s64 (int64x1_t __a, int64x1_t __b)
{
- return (uint64x1_t) {__a[0] <= __b[0] ? -1ll : 0ll};
+ return (uint64x1_t) (__a <= __b);
}
__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
@@ -13189,7 +13189,7 @@ vcle_u32 (uint32x2_t __a, uint32x2_t __b)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vcle_u64 (uint64x1_t __a, uint64x1_t __b)
{
- return (uint64x1_t) {__a[0] <= __b[0] ? -1ll : 0ll};
+ return (__a <= __b);
}
__extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
@@ -13313,7 +13313,7 @@ vclez_s32 (int32x2_t __a)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vclez_s64 (int64x1_t __a)
{
- return (uint64x1_t) {__a[0] <= 0ll ? -1ll : 0ll};
+ return (uint64x1_t) (__a <= __AARCH64_INT64_C (0));
}
__extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
@@ -13407,7 +13407,7 @@ vclt_s32 (int32x2_t __a, int32x2_t __b)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vclt_s64 (int64x1_t __a, int64x1_t __b)
{
- return (uint64x1_t) {__a[0] < __b[0] ? -1ll : 0ll};
+ return (uint64x1_t) (__a < __b);
}
__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
@@ -13431,7 +13431,7 @@ vclt_u32 (uint32x2_t __a, uint32x2_t __b)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vclt_u64 (uint64x1_t __a, uint64x1_t __b)
{
- return (uint64x1_t) {__a[0] < __b[0] ? -1ll : 0ll};
+ return (__a < __b);
}
__extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
@@ -13555,7 +13555,7 @@ vcltz_s32 (int32x2_t __a)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vcltz_s64 (int64x1_t __a)
{
- return (uint64x1_t) {__a[0] < 0ll ? -1ll : 0ll};
+ return (uint64x1_t) (__a < __AARCH64_INT64_C (0));
}
__extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
@@ -24083,7 +24083,7 @@ vtst_s32 (int32x2_t __a, int32x2_t __b)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vtst_s64 (int64x1_t __a, int64x1_t __b)
{
- return (uint64x1_t) {(__a[0] & __b[0]) ? -1ll : 0ll};
+ return (uint64x1_t) ((__a & __b) != __AARCH64_INT64_C (0));
}
__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
@@ -24107,7 +24107,7 @@ vtst_u32 (uint32x2_t __a, uint32x2_t __b)
__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
vtst_u64 (uint64x1_t __a, uint64x1_t __b)
{
- return (uint64x1_t) {(__a[0] & __b[0]) ? -1ll : 0ll};
+ return ((__a & __b) != __AARCH64_UINT64_C (0));
}
__extension__ static __inline uint8x16_t __attribute__ ((__always_inline__))
diff --git a/gcc/testsuite/gcc.target/aarch64/singleton_intrinsics_1.c b/gcc/testsuite/gcc.target/aarch64/singleton_intrinsics_1.c
index 4a0934b01f9442b7f1324a1f4528d45022daf9b8..633a0d24eade982181d972b915f303b06e5087c4 100644
--- a/gcc/testsuite/gcc.target/aarch64/singleton_intrinsics_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/singleton_intrinsics_1.c
@@ -235,8 +235,8 @@ test_vrshl_u64 (uint64x1_t a, int64x1_t b)
return vrshl_u64 (a, b);
}
-/* For int64x1_t, sshr...#63 is output instead of the equivalent cmlt...#0. */
-/* { dg-final { scan-assembler-times "\\tsshr\\td\[0-9\]+" 2 } } */
+/* For int64x1_t, sshr...#63 is equivalent to cmlt...#0. */
+/* { dg-final { scan-assembler-times "\\t(?:sshr|cmlt)\\td\[0-9\]+" 2 } } */
int64x1_t
test_vshr_n_s64 (int64x1_t a)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 0/3][AArch64] DImode vector compares
2015-04-17 15:37 [PATCH 0/3][AArch64] DImode vector compares Alan Lawrence
` (2 preceding siblings ...)
2015-04-17 15:41 ` [PATCH 3/3][AArch64] Idiomatic 64x1 comparisons in arm_neon.h Alan Lawrence
@ 2015-05-05 9:43 ` Alan Lawrence
3 siblings, 0 replies; 9+ messages in thread
From: Alan Lawrence @ 2015-05-05 9:43 UTC (permalink / raw)
To: Alan Lawrence; +Cc: gcc-patches, Marcus Shawcroft
Alan Lawrence wrote:
> Hi,
>
> Comparing 64x1 vector types (defined by hand or from arm_neon.h) using GCC
> vector extensions currently generates very poor assembly code, for example
> "uint64x1_t foo (uint64x1_t a, uint64x1_t b) { return a >= b; }" generates (at -O3):
>
> fmov x0, d0 // 22 movdi_aarch64/12 [length = 4]
> fmov x1, d1 // 23 movdi_aarch64/12 [length = 4]
> cmp x0, x1 // 10 cmpdi/1 [length = 4]
> csinv x0, xzr, xzr, cc // 17 cmovdi_insn/3 [length = 4]
> fmov d0, x0 // 24 *movdi_aarch64/11 [length = 4]
> ret // 27 simple_return [length = 4]
>
> Meaning that arm_neon.h instead has to use rather awkward forms like "return
> (uint64x1_t) {__a[0] >= __b[0] ? -1ll : 0ll};" to produce the desired assembly
>
> cmhs d0, d0, d1
> ret
>
> This series adds vcond(u?)didi patterns for AArch64, to generate appropriate RTL
> from direct comparisons of 64x1 vectors (which are of DImode). However, as
> things stand, adding a vconddidi pattern causes an ICE in vector_compare_rtx
> (maybe_legitimize_operands), because a DImode constant-zero (vector or
> otherwise) is expanded as const0_rtx, which has mode VOIDmode. I tried quite a
> bit to generate an RTL const_vector, or even just something with mode DImode,
> but without success, hence the first patch fixes vector_compare_rtx to use the
> mode from the tree if necessary. (DImode vectors are specifically allowed by
> stor-layout.c, but no other platform defines vconddidi.)
Can I ping the AArch64 parts of this (patches 2+3)? These then provide the
testcases requested by Jeff Law in his approval of the first patch
(https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00076.html).
Thanks, Alan
^ permalink raw reply [flat|nested] 9+ messages in thread