* [PATCH][AArch64]Add vec_shr pattern for 64-bit vectors using ush{l,r}; enable tests.
@ 2014-11-14 15:43 Alan Lawrence
2014-11-14 15:57 ` Alan Lawrence
0 siblings, 1 reply; 4+ messages in thread
From: Alan Lawrence @ 2014-11-14 15:43 UTC (permalink / raw)
To: gcc-patches
Following recent vectorizer changes to reductions via shifts, AArch64 will now
reduce loops such as this
unsigned char in[8] = {1, 3, 5, 7, 9, 11, 13, 15};

int
main (unsigned char argc, char **argv)
{
  unsigned char prod = 1;

  /* Prevent constant propagation of the entire loop below.  */
  asm volatile ("" : : : "memory");

  for (unsigned char i = 0; i < 8; i++)
    prod *= in[i];

  if (prod != 17)
    __builtin_printf ("Failed %d\n", prod);

  return 0;
}
using an 'ext' instruction from aarch64_expand_vec_perm_const:
main:
        adrp    x0, .LANCHOR0
        movi    v2.2s, 0                        <=== note reg used here
        ldr     d1, [x0, #:lo12:.LANCHOR0]
        ext     v0.8b, v1.8b, v2.8b, #4
        mul     v1.8b, v1.8b, v0.8b
        ext     v0.8b, v1.8b, v2.8b, #2
        mul     v0.8b, v1.8b, v0.8b
        ext     v2.8b, v0.8b, v2.8b, #1
        mul     v0.8b, v0.8b, v2.8b
        umov    w1, v0.b[0]
The 'ext' works for both 64-bit and 128-bit vectors, but for 64-bit vectors
we can do slightly better using ushr; this patch improves the above to:
main:
        adrp    x0, .LANCHOR0
        ldr     d0, [x0, #:lo12:.LANCHOR0]
        ushr    d1, d0, 32
        mul     v0.8b, v0.8b, v1.8b
        ushr    d1, d0, 16
        mul     v0.8b, v0.8b, v1.8b
        ushr    d1, d0, 8
        mul     v0.8b, v0.8b, v1.8b
        umov    w1, v0.b[0]
...
Tested with bootstrap + check-gcc on aarch64-none-linux-gnu.
Cross-testing of check-gcc on aarch64_be-none-elf in progress.
Ok if no regressions on big-endian?
Cheers,
--Alan
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (vec_shr<mode>): New.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp
(check_effective_target_whole_vector_shift): Add aarch64{,_be}.
* Re: [PATCH][AArch64]Add vec_shr pattern for 64-bit vectors using ush{l,r}; enable tests.
2014-11-14 15:43 [PATCH][AArch64]Add vec_shr pattern for 64-bit vectors using ush{l,r}; enable tests Alan Lawrence
@ 2014-11-14 15:57 ` Alan Lawrence
2014-11-17 12:23 ` Alan Lawrence
2014-11-21 16:54 ` Marcus Shawcroft
0 siblings, 2 replies; 4+ messages in thread
From: Alan Lawrence @ 2014-11-14 15:57 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 2032 bytes --]
...Patch attached...
Alan Lawrence wrote:
> [full original message quoted; snipped]
[-- Attachment #2: aarch64_vec_shr.patch --]
[-- Type: text/x-patch; name=aarch64_vec_shr.patch, Size: 1541 bytes --]
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index ef196e4b6fb39c0d2fd9ebfee76abab8369b1e92..397cb5186dd4ff000307f3b14bb4964d84c79469 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -779,6 +779,21 @@
 }
 )
 
+;; For 64-bit modes we use ushl/r, as this does not require a SIMD zero.
+(define_insn "vec_shr_<mode>"
+  [(set (match_operand:VD 0 "register_operand" "=w")
+	(lshiftrt:VD (match_operand:VD 1 "register_operand" "w")
+		     (match_operand:SI 2 "immediate_operand" "i")))]
+  "TARGET_SIMD"
+  {
+    if (BYTES_BIG_ENDIAN)
+      return "ushl %d0, %d1, %2";
+    else
+      return "ushr %d0, %d1, %2";
+  }
+  [(set_attr "type" "neon_shift_imm")]
+)
+
 (define_insn "aarch64_simd_vec_setv2di"
   [(set (match_operand:V2DI 0 "register_operand" "=w,w")
	(vec_merge:V2DI
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 3361c2f9e8d98c5d1cc194617db6281127db2277..464c910777a53867110b462f121c02525d8dd140 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3335,6 +3335,7 @@ proc check_effective_target_vect_shift { } {
 proc check_effective_target_whole_vector_shift { } {
     if { [istarget i?86-*-*] || [istarget x86_64-*-*]
	 || [istarget ia64-*-*]
+	 || [istarget aarch64*-*-*]
	 || ([check_effective_target_arm32]
	     && [check_effective_target_arm_little_endian])
	 || ([istarget mips*-*-*]
* Re: [PATCH][AArch64]Add vec_shr pattern for 64-bit vectors using ush{l,r}; enable tests.
2014-11-14 15:57 ` Alan Lawrence
@ 2014-11-17 12:23 ` Alan Lawrence
2014-11-21 16:54 ` Marcus Shawcroft
1 sibling, 0 replies; 4+ messages in thread
From: Alan Lawrence @ 2014-11-17 12:23 UTC (permalink / raw)
To: gcc-patches; +Cc: Marcus Shawcroft
I confirm no regressions on aarch64_be-none-elf.
--Alan
Alan Lawrence wrote:
> ...Patch attached...
>
> [full original message quoted; snipped]
* Re: [PATCH][AArch64]Add vec_shr pattern for 64-bit vectors using ush{l,r}; enable tests.
2014-11-14 15:57 ` Alan Lawrence
2014-11-17 12:23 ` Alan Lawrence
@ 2014-11-21 16:54 ` Marcus Shawcroft
1 sibling, 0 replies; 4+ messages in thread
From: Marcus Shawcroft @ 2014-11-21 16:54 UTC (permalink / raw)
To: Alan Lawrence; +Cc: gcc-patches
On 14 November 2014 15:46, Alan Lawrence <alan.lawrence@arm.com> wrote:
>> gcc/ChangeLog:
>>
>> * config/aarch64/aarch64-simd.md (vec_shr<mode>): New.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * lib/target-supports.exp
>> (check_effective_target_whole_vector_shift): Add aarch64{,_be}.
OK /Marcus