* [3/3][aarch64] Add support for vec_widen_shift pattern
@ 2020-11-12 19:34 Joel Hutton
2020-11-13 8:05 ` Richard Biener
2020-11-13 10:14 ` Richard Sandiford
0 siblings, 2 replies; 6+ messages in thread
From: Joel Hutton @ 2020-11-12 19:34 UTC (permalink / raw)
To: GCC Patches; +Cc: Kyrylo Tkachov, Richard Biener
[-- Attachment #1: Type: text/plain, Size: 617 bytes --]
Hi all,
This patch adds support in the aarch64 backend for the vec_widen_shift vect-pattern and makes a minor mid-end fix to support it.
All 3 patches together bootstrapped and regression tested on aarch64.
Ok for stage 1?
gcc/ChangeLog:
2020-11-12 Joel Hutton <joel.hutton@arm.com>
* config/aarch64/aarch64-simd.md: vec_widen_lshift_hi/lo<mode> patterns
* tree-vect-stmts.c
(vectorizable_conversion): Fix for widen_lshift case
gcc/testsuite/ChangeLog:
2020-11-12 Joel Hutton <joel.hutton@arm.com>
* gcc.target/aarch64/vect-widen-lshift.c: New test.
[-- Attachment #2: 0003-AArch64-vect-vec_widen_lshift-pattern.patch --]
[-- Type: text/x-patch; name="0003-AArch64-vect-vec_widen_lshift-pattern.patch", Size: 5870 bytes --]
From 97af35b2d2a505dcefd8474cbd4bc3441b83ab02 Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Thu, 12 Nov 2020 11:48:25 +0000
Subject: [PATCH 3/3] [AArch64][vect] vec_widen_lshift pattern
Add aarch64 vec_widen_lshift_lo/hi patterns and fix a bug they trigger
in the mid-end.
---
gcc/config/aarch64/aarch64-simd.md | 66 +++++++++++++++++++
.../gcc.target/aarch64/vect-widen-lshift.c | 60 +++++++++++++++++
gcc/tree-vect-stmts.c | 9 ++-
3 files changed, 133 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b4f56a2295926f027bd53e7456eec729af0cd6df..2bb39c530a1a861cb9bd3df0c2943f62bd6153d7 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4711,8 +4711,74 @@
[(set_attr "type" "neon_sat_shift_reg<q>")]
)
+(define_expand "vec_widen_<sur>shiftl_lo_<mode>"
+ [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+ (unspec:<VWIDE> [(match_operand:VQW 1 "register_operand" "w")
+ (match_operand:SI 2
+ "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
+ VSHLL))]
+ "TARGET_SIMD"
+ {
+ rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, <nunits>, false);
+ emit_insn (gen_aarch64_<sur>shll<mode>_internal (operands[0], operands[1],
+ p, operands[2]));
+ DONE;
+ }
+)
+
+(define_expand "vec_widen_<sur>shiftl_hi_<mode>"
+ [(set (match_operand:<VWIDE> 0 "register_operand")
+ (unspec:<VWIDE> [(match_operand:VQW 1 "register_operand" "w")
+ (match_operand:SI 2
+ "immediate_operand" "i")]
+ VSHLL))]
+ "TARGET_SIMD"
+ {
+ rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, <nunits>, true);
+ emit_insn (gen_aarch64_<sur>shll2<mode>_internal (operands[0], operands[1],
+ p, operands[2]));
+ DONE;
+ }
+)
+
;; vshll_n
+(define_insn "aarch64_<sur>shll<mode>_internal"
+ [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+ (unspec:<VWIDE> [(vec_select:<VHALF>
+ (match_operand:VQW 1 "register_operand" "w")
+ (match_operand:VQW 2 "vect_par_cnst_lo_half" ""))
+ (match_operand:SI 3
+ "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
+ VSHLL))]
+ "TARGET_SIMD"
+ {
+ if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (<MODE>mode))
+ return "shll\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
+ else
+ return "<sur>shll\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
+ }
+ [(set_attr "type" "neon_shift_imm_long")]
+)
+
+(define_insn "aarch64_<sur>shll2<mode>_internal"
+ [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+ (unspec:<VWIDE> [(vec_select:<VHALF>
+ (match_operand:VQW 1 "register_operand" "w")
+ (match_operand:VQW 2 "vect_par_cnst_hi_half" ""))
+ (match_operand:SI 3
+ "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
+ VSHLL))]
+ "TARGET_SIMD"
+ {
+ if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (<MODE>mode))
+ return "shll2\\t%0.<Vwtype>, %1.<Vtype>, %3";
+ else
+ return "<sur>shll2\\t%0.<Vwtype>, %1.<Vtype>, %3";
+ }
+ [(set_attr "type" "neon_shift_imm_long")]
+)
+
(define_insn "aarch64_<sur>shll_n<mode>"
[(set (match_operand:<VWIDE> 0 "register_operand" "=w")
(unspec:<VWIDE> [(match_operand:VD_BHSI 1 "register_operand" "w")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
new file mode 100644
index 0000000000000000000000000000000000000000..23ed93d1dcbc3ca559efa6708b4ed5855fb6a050
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
@@ -0,0 +1,60 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps" } */
+#include <stdint.h>
+#include <string.h>
+
+#define ARR_SIZE 1024
+
+/* Should produce a shll, shll2 pair.  */
+void sshll_opt (int32_t *foo, int16_t *a, int16_t *b)
+{
+ for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+ {
+ foo[i] = a[i] << 16;
+ foo[i+1] = a[i+1] << 16;
+ foo[i+2] = a[i+2] << 16;
+ foo[i+3] = a[i+3] << 16;
+ }
+}
+
+__attribute__((optimize (0)))
+void sshll_nonopt (int32_t *foo, int16_t *a, int16_t *b)
+{
+ for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+ {
+ foo[i] = a[i] << 16;
+ foo[i+1] = a[i+1] << 16;
+ foo[i+2] = a[i+2] << 16;
+ foo[i+3] = a[i+3] << 16;
+ }
+}
+
+
+void __attribute__((optimize (0)))
+init(uint16_t *a, uint16_t *b)
+{
+ for( int i = 0; i < ARR_SIZE;i++)
+ {
+ a[i] = i;
+ b[i] = 2*i;
+ }
+}
+
+int __attribute__((optimize (0)))
+main()
+{
+ uint32_t foo_arr[ARR_SIZE];
+ uint32_t bar_arr[ARR_SIZE];
+ uint16_t a[ARR_SIZE];
+ uint16_t b[ARR_SIZE];
+
+ init(a, b);
+ sshll_opt(foo_arr, a, b);
+ sshll_nonopt(bar_arr, a, b);
+ if (memcmp(foo_arr, bar_arr, ARR_SIZE) != 0)
+ return 1;
+ return 0;
+}
+
+/* { dg-final { scan-assembler-times "shll\t" 1} } */
+/* { dg-final { scan-assembler-times "shll2\t" 1} } */
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index f12fd158b13656ee24022ec7e445c53444be6554..1f40b59c0560eec675af1d9a0e3e818d47589de6 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -4934,8 +4934,13 @@ vectorizable_conversion (vec_info *vinfo,
&vec_oprnds1);
if (code == WIDEN_LSHIFT_EXPR)
{
- vec_oprnds1.create (ncopies * ninputs);
- for (i = 0; i < ncopies * ninputs; ++i)
+ int oprnds_size = ncopies * ninputs;
+ /* In the case of SLP ncopies = 1, so the size of vec_oprnds1 here
+ * should be obtained by the the size of vec_oprnds0. */
+ if (slp_node)
+ oprnds_size = vec_oprnds0.length ();
+ vec_oprnds1.create (oprnds_size);
+ for (i = 0; i < oprnds_size; ++i)
vec_oprnds1.quick_push (op1);
}
/* Arguments are ready. Create the new vector stmts. */
--
2.17.1
* Re: [3/3][aarch64] Add support for vec_widen_shift pattern
2020-11-12 19:34 [3/3][aarch64] Add support for vec_widen_shift pattern Joel Hutton
@ 2020-11-13 8:05 ` Richard Biener
2020-11-13 10:14 ` Richard Sandiford
1 sibling, 0 replies; 6+ messages in thread
From: Richard Biener @ 2020-11-13 8:05 UTC (permalink / raw)
To: Joel Hutton; +Cc: GCC Patches, Kyrylo Tkachov
On Thu, 12 Nov 2020, Joel Hutton wrote:
> Hi all,
>
> This patch adds support in the aarch64 backend for the vec_widen_shift vect-pattern and makes a minor mid-end fix to support it.
>
> All 3 patches together bootstrapped and regression tested on aarch64.
>
> Ok for stage 1?
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index f12fd158b13656ee24022ec7e445c53444be6554..1f40b59c0560eec675af1d9a0e3e818d47589de6 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -4934,8 +4934,13 @@ vectorizable_conversion (vec_info *vinfo,
&vec_oprnds1);
if (code == WIDEN_LSHIFT_EXPR)
{
- vec_oprnds1.create (ncopies * ninputs);
- for (i = 0; i < ncopies * ninputs; ++i)
+ int oprnds_size = ncopies * ninputs;
+ /* In the case of SLP ncopies = 1, so the size of vec_oprnds1 here
+ * should be obtained by the the size of vec_oprnds0. */
You should be able to always use vec_oprnds0.length ()
This hunk is OK with that change.
+ if (slp_node)
+ oprnds_size = vec_oprnds0.length ();
+ vec_oprnds1.create (oprnds_size);
+ for (i = 0; i < oprnds_size; ++i)
vec_oprnds1.quick_push (op1);
}
/* Arguments are ready. Create the new vector stmts. */
>
> gcc/ChangeLog:
>
> 2020-11-12  Joel Hutton  <joel.hutton@arm.com>
>
>         * config/aarch64/aarch64-simd.md: vec_widen_lshift_hi/lo<mode> patterns
>         * tree-vect-stmts.c
>         (vectorizable_conversion): Fix for widen_lshift case
>
> gcc/testsuite/ChangeLog:
>
> 2020-11-12  Joel Hutton  <joel.hutton@arm.com>
>
>         * gcc.target/aarch64/vect-widen-lshift.c: New test.
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer
* Re: [3/3][aarch64] Add support for vec_widen_shift pattern
2020-11-12 19:34 [3/3][aarch64] Add support for vec_widen_shift pattern Joel Hutton
2020-11-13 8:05 ` Richard Biener
@ 2020-11-13 10:14 ` Richard Sandiford
2020-11-13 16:50 ` Joel Hutton
1 sibling, 1 reply; 6+ messages in thread
From: Richard Sandiford @ 2020-11-13 10:14 UTC (permalink / raw)
To: Joel Hutton via Gcc-patches; +Cc: Joel Hutton, Richard Biener
Joel Hutton via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> Hi all,
>
> This patch adds support in the aarch64 backend for the vec_widen_shift vect-pattern and makes a minor mid-end fix to support it.
>
> All 3 patches together bootstrapped and regression tested on aarch64.
>
> Ok for stage 1?
>
> gcc/ChangeLog:
>
> 2020-11-12 Joel Hutton <joel.hutton@arm.com>
>
> * config/aarch64/aarch64-simd.md: vec_widen_lshift_hi/lo<mode> patterns
> * tree-vect-stmts.c
> (vectorizable_conversion): Fix for widen_lshift case
>
> gcc/testsuite/ChangeLog:
>
> 2020-11-12 Joel Hutton <joel.hutton@arm.com>
>
> * gcc.target/aarch64/vect-widen-lshift.c: New test.
>
> From 97af35b2d2a505dcefd8474cbd4bc3441b83ab02 Mon Sep 17 00:00:00 2001
> From: Joel Hutton <joel.hutton@arm.com>
> Date: Thu, 12 Nov 2020 11:48:25 +0000
> Subject: [PATCH 3/3] [AArch64][vect] vec_widen_lshift pattern
>
> Add aarch64 vec_widen_lshift_lo/hi patterns and fix bug it triggers in
> mid-end.
> ---
> gcc/config/aarch64/aarch64-simd.md | 66 +++++++++++++++++++
> .../gcc.target/aarch64/vect-widen-lshift.c | 60 +++++++++++++++++
> gcc/tree-vect-stmts.c | 9 ++-
> 3 files changed, 133 insertions(+), 2 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index b4f56a2295926f027bd53e7456eec729af0cd6df..2bb39c530a1a861cb9bd3df0c2943f62bd6153d7 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -4711,8 +4711,74 @@
> [(set_attr "type" "neon_sat_shift_reg<q>")]
> )
>
> +(define_expand "vec_widen_<sur>shiftl_lo_<mode>"
> + [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
> + (unspec:<VWIDE> [(match_operand:VQW 1 "register_operand" "w")
> + (match_operand:SI 2
> + "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
> + VSHLL))]
> + "TARGET_SIMD"
> + {
> + rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, <nunits>, false);
> + emit_insn (gen_aarch64_<sur>shll<mode>_internal (operands[0], operands[1],
> + p, operands[2]));
> + DONE;
> + }
> +)
> +
> +(define_expand "vec_widen_<sur>shiftl_hi_<mode>"
> + [(set (match_operand:<VWIDE> 0 "register_operand")
> + (unspec:<VWIDE> [(match_operand:VQW 1 "register_operand" "w")
> + (match_operand:SI 2
> + "immediate_operand" "i")]
> + VSHLL))]
> + "TARGET_SIMD"
> + {
> + rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, <nunits>, true);
> + emit_insn (gen_aarch64_<sur>shll2<mode>_internal (operands[0], operands[1],
> + p, operands[2]));
> + DONE;
> + }
> +)
> +
> ;; vshll_n
>
> +(define_insn "aarch64_<sur>shll<mode>_internal"
> + [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
> + (unspec:<VWIDE> [(vec_select:<VHALF>
> + (match_operand:VQW 1 "register_operand" "w")
> + (match_operand:VQW 2 "vect_par_cnst_lo_half" ""))
> + (match_operand:SI 3
> + "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
> + VSHLL))]
> + "TARGET_SIMD"
> + {
> + if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (<MODE>mode))
> + return "shll\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
> + else
> + return "<sur>shll\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
> + }
> + [(set_attr "type" "neon_shift_imm_long")]
> +)
> +
> +(define_insn "aarch64_<sur>shll2<mode>_internal"
> + [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
> + (unspec:<VWIDE> [(vec_select:<VHALF>
> + (match_operand:VQW 1 "register_operand" "w")
> + (match_operand:VQW 2 "vect_par_cnst_hi_half" ""))
> + (match_operand:SI 3
> + "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
> + VSHLL))]
> + "TARGET_SIMD"
> + {
> + if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (<MODE>mode))
> + return "shll2\\t%0.<Vwtype>, %1.<Vtype>, %3";
> + else
> + return "<sur>shll2\\t%0.<Vwtype>, %1.<Vtype>, %3";
> + }
> + [(set_attr "type" "neon_shift_imm_long")]
> +)
> +
> (define_insn "aarch64_<sur>shll_n<mode>"
> [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
> (unspec:<VWIDE> [(match_operand:VD_BHSI 1 "register_operand" "w")
> diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..23ed93d1dcbc3ca559efa6708b4ed5855fb6a050
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
> @@ -0,0 +1,60 @@
> +/* { dg-do run } */
> +/* { dg-options "-O3 -save-temps" } */
> +#include <stdint.h>
> +#include <string.h>
> +
SVE targets will need a:
#pragma GCC target "+nosve"
here, since we'll generate different code for SVE.
> +#define ARR_SIZE 1024
> +
> +/* Should produce a shll, shll2 pair.  */
> +void sshll_opt (int32_t *foo, int16_t *a, int16_t *b)
> +{
> + for( int i = 0; i < ARR_SIZE - 3;i=i+4)
> + {
> + foo[i] = a[i] << 16;
> + foo[i+1] = a[i+1] << 16;
> + foo[i+2] = a[i+2] << 16;
> + foo[i+3] = a[i+3] << 16;
> + }
> +}
> +
> +__attribute__((optimize (0)))
> +void sshll_nonopt (int32_t *foo, int16_t *a, int16_t *b)
> +{
> + for( int i = 0; i < ARR_SIZE - 3;i=i+4)
> + {
> + foo[i] = a[i] << 16;
> + foo[i+1] = a[i+1] << 16;
> + foo[i+2] = a[i+2] << 16;
> + foo[i+3] = a[i+3] << 16;
> + }
> +}
> +
> +
> +void __attribute__((optimize (0)))
> +init(uint16_t *a, uint16_t *b)
> +{
> + for( int i = 0; i < ARR_SIZE;i++)
> + {
> + a[i] = i;
> + b[i] = 2*i;
> + }
> +}
> +
> +int __attribute__((optimize (0)))
> +main()
> +{
> + uint32_t foo_arr[ARR_SIZE];
> + uint32_t bar_arr[ARR_SIZE];
> + uint16_t a[ARR_SIZE];
> + uint16_t b[ARR_SIZE];
> +
> + init(a, b);
> + sshll_opt(foo_arr, a, b);
> + sshll_nonopt(bar_arr, a, b);
> + if (memcmp(foo_arr, bar_arr, ARR_SIZE) != 0)
> + return 1;
> + return 0;
> +}
> +
> +/* { dg-final { scan-assembler-times "shll\t" 1} } */
> +/* { dg-final { scan-assembler-times "shll2\t" 1} } */
Very minor nit, sorry, but I think:
/* { dg-final { scan-assembler-times {\tshll\t} 1 } } */
would be better. Using "…\t" works, but IIRC it shows up as a tab
character in the testsuite result summary too.
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index f12fd158b13656ee24022ec7e445c53444be6554..1f40b59c0560eec675af1d9a0e3e818d47589de6 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -4934,8 +4934,13 @@ vectorizable_conversion (vec_info *vinfo,
> &vec_oprnds1);
> if (code == WIDEN_LSHIFT_EXPR)
> {
> - vec_oprnds1.create (ncopies * ninputs);
> - for (i = 0; i < ncopies * ninputs; ++i)
> + int oprnds_size = ncopies * ninputs;
> + /* In the case of SLP ncopies = 1, so the size of vec_oprnds1 here
> + * should be obtained by the the size of vec_oprnds0. */
This is redundant given Richard's comment, but FWIW, GCC style
is to indent without “*”s, so:
/* In the case of SLP ncopies = 1, so the size of vec_oprnds1 here
should be obtained by the the size of vec_oprnds0. */
OK for the aarch64 bits with the testsuite changes above.
Thanks,
Richard
> + if (slp_node)
> + oprnds_size = vec_oprnds0.length ();
> + vec_oprnds1.create (oprnds_size);
> + for (i = 0; i < oprnds_size; ++i)
> vec_oprnds1.quick_push (op1);
> }
> /* Arguments are ready. Create the new vector stmts. */
* Re: [3/3][aarch64] Add support for vec_widen_shift pattern
2020-11-13 10:14 ` Richard Sandiford
@ 2020-11-13 16:50 ` Joel Hutton
2020-11-16 14:00 ` Richard Biener
0 siblings, 1 reply; 6+ messages in thread
From: Joel Hutton @ 2020-11-13 16:50 UTC (permalink / raw)
To: Richard Sandiford, Joel Hutton via Gcc-patches; +Cc: Richard Biener
[-- Attachment #1: Type: text/plain, Size: 1093 bytes --]
Tests are still running, but I believe I've addressed all the comments.
> > +#include <string.h>
> > +
>
> SVE targets will need a:
>
> #pragma GCC target "+nosve"
>
> here, since we'll generate different code for SVE.
Fixed.
> > +/* { dg-final { scan-assembler-times "shll\t" 1} } */
> > +/* { dg-final { scan-assembler-times "shll2\t" 1} } */
>
> Very minor nit, sorry, but I think:
>
> /* { dg-final { scan-assembler-times {\tshll\t} 1 } } */
>
> would be better. Using "…\t" works, but IIRC it shows up as a tab
> character in the testsuite result summary too.
Fixed. Minor nits welcome. :)
> OK for the aarch64 bits with the testsuite changes above.
ok?
gcc/ChangeLog:
2020-11-13 Joel Hutton <joel.hutton@arm.com>
* config/aarch64/aarch64-simd.md: Add vec_widen_lshift_hi/lo<mode>
patterns.
* tree-vect-stmts.c
(vectorizable_conversion): Fix for widen_lshift case.
gcc/testsuite/ChangeLog:
2020-11-13 Joel Hutton <joel.hutton@arm.com>
* gcc.target/aarch64/vect-widen-lshift.c: New test.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0003-AArch64-vect-vec_widen_lshift-pattern.patch --]
[-- Type: text/x-patch; name="0003-AArch64-vect-vec_widen_lshift-pattern.patch", Size: 5788 bytes --]
From e8d3ed6fa739850eb649b97c250f1f2c650c34c1 Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Thu, 12 Nov 2020 11:48:25 +0000
Subject: [PATCH 3/3] [AArch64][vect] vec_widen_lshift pattern
Add aarch64 vec_widen_lshift_lo/hi patterns and fix a bug they trigger
in the mid-end. This pattern takes one vector with N elements of size S,
shifts each element left by the element width, and stores the results as
N elements of size 2*S (in two result vectors). The aarch64 backend
implements this with the shll, shll2 instruction pair.
---
gcc/config/aarch64/aarch64-simd.md | 66 +++++++++++++++++++
.../gcc.target/aarch64/vect-widen-lshift.c | 62 +++++++++++++++++
gcc/tree-vect-stmts.c | 5 +-
3 files changed, 131 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 30299610635..4ba799a27c9 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4664,8 +4664,74 @@
[(set_attr "type" "neon_sat_shift_reg<q>")]
)
+(define_expand "vec_widen_<sur>shiftl_lo_<mode>"
+ [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+ (unspec:<VWIDE> [(match_operand:VQW 1 "register_operand" "w")
+ (match_operand:SI 2
+ "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
+ VSHLL))]
+ "TARGET_SIMD"
+ {
+ rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, <nunits>, false);
+ emit_insn (gen_aarch64_<sur>shll<mode>_internal (operands[0], operands[1],
+ p, operands[2]));
+ DONE;
+ }
+)
+
+(define_expand "vec_widen_<sur>shiftl_hi_<mode>"
+ [(set (match_operand:<VWIDE> 0 "register_operand")
+ (unspec:<VWIDE> [(match_operand:VQW 1 "register_operand" "w")
+ (match_operand:SI 2
+ "immediate_operand" "i")]
+ VSHLL))]
+ "TARGET_SIMD"
+ {
+ rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, <nunits>, true);
+ emit_insn (gen_aarch64_<sur>shll2<mode>_internal (operands[0], operands[1],
+ p, operands[2]));
+ DONE;
+ }
+)
+
;; vshll_n
+(define_insn "aarch64_<sur>shll<mode>_internal"
+ [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+ (unspec:<VWIDE> [(vec_select:<VHALF>
+ (match_operand:VQW 1 "register_operand" "w")
+ (match_operand:VQW 2 "vect_par_cnst_lo_half" ""))
+ (match_operand:SI 3
+ "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
+ VSHLL))]
+ "TARGET_SIMD"
+ {
+ if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (<MODE>mode))
+ return "shll\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
+ else
+ return "<sur>shll\\t%0.<Vwtype>, %1.<Vhalftype>, %3";
+ }
+ [(set_attr "type" "neon_shift_imm_long")]
+)
+
+(define_insn "aarch64_<sur>shll2<mode>_internal"
+ [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+ (unspec:<VWIDE> [(vec_select:<VHALF>
+ (match_operand:VQW 1 "register_operand" "w")
+ (match_operand:VQW 2 "vect_par_cnst_hi_half" ""))
+ (match_operand:SI 3
+ "aarch64_simd_shift_imm_bitsize_<ve_mode>" "i")]
+ VSHLL))]
+ "TARGET_SIMD"
+ {
+ if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (<MODE>mode))
+ return "shll2\\t%0.<Vwtype>, %1.<Vtype>, %3";
+ else
+ return "<sur>shll2\\t%0.<Vwtype>, %1.<Vtype>, %3";
+ }
+ [(set_attr "type" "neon_shift_imm_long")]
+)
+
(define_insn "aarch64_<sur>shll_n<mode>"
[(set (match_operand:<VWIDE> 0 "register_operand" "=w")
(unspec:<VWIDE> [(match_operand:VD_BHSI 1 "register_operand" "w")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
new file mode 100644
index 00000000000..48a3719d4ba
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
@@ -0,0 +1,62 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps" } */
+#include <stdint.h>
+#include <string.h>
+
+#pragma GCC target "+nosve"
+
+#define ARR_SIZE 1024
+
+/* Should produce a shll, shll2 pair.  */
+void sshll_opt (int32_t *foo, int16_t *a, int16_t *b)
+{
+ for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+ {
+ foo[i] = a[i] << 16;
+ foo[i+1] = a[i+1] << 16;
+ foo[i+2] = a[i+2] << 16;
+ foo[i+3] = a[i+3] << 16;
+ }
+}
+
+__attribute__((optimize (0)))
+void sshll_nonopt (int32_t *foo, int16_t *a, int16_t *b)
+{
+ for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+ {
+ foo[i] = a[i] << 16;
+ foo[i+1] = a[i+1] << 16;
+ foo[i+2] = a[i+2] << 16;
+ foo[i+3] = a[i+3] << 16;
+ }
+}
+
+
+void __attribute__((optimize (0)))
+init(uint16_t *a, uint16_t *b)
+{
+ for( int i = 0; i < ARR_SIZE;i++)
+ {
+ a[i] = i;
+ b[i] = 2*i;
+ }
+}
+
+int __attribute__((optimize (0)))
+main()
+{
+ uint32_t foo_arr[ARR_SIZE];
+ uint32_t bar_arr[ARR_SIZE];
+ uint16_t a[ARR_SIZE];
+ uint16_t b[ARR_SIZE];
+
+ init(a, b);
+ sshll_opt(foo_arr, a, b);
+ sshll_nonopt(bar_arr, a, b);
+ if (memcmp(foo_arr, bar_arr, ARR_SIZE) != 0)
+ return 1;
+ return 0;
+}
+
+/* { dg-final { scan-assembler-times {\tshll\t} 1} } */
+/* { dg-final { scan-assembler-times {\tshll2\t} 1} } */
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 25a8474c774..5c676c7d5ef 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -4934,8 +4934,9 @@ vectorizable_conversion (vec_info *vinfo,
&vec_oprnds1);
if (code == WIDEN_LSHIFT_EXPR)
{
- vec_oprnds1.create (ncopies * ninputs);
- for (i = 0; i < ncopies * ninputs; ++i)
+ int oprnds_size = vec_oprnds0.length ();
+ vec_oprnds1.create (oprnds_size);
+ for (i = 0; i < oprnds_size; ++i)
vec_oprnds1.quick_push (op1);
}
/* Arguments are ready. Create the new vector stmts. */
--
2.17.1
* Re: [3/3][aarch64] Add support for vec_widen_shift pattern
2020-11-13 16:50 ` Joel Hutton
@ 2020-11-16 14:00 ` Richard Biener
2020-11-17 13:41 ` Richard Sandiford
0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2020-11-16 14:00 UTC (permalink / raw)
To: Joel Hutton; +Cc: Richard Sandiford, Joel Hutton via Gcc-patches
On Fri, 13 Nov 2020, Joel Hutton wrote:
> Tests are still running, but I believe I've addressed all the comments.
>
> > > +#include <string.h>
> > > +
> >
> > SVE targets will need a:
> >
> > #pragma GCC target "+nosve"
> >
> > here, since we'll generate different code for SVE.
>
> Fixed.
>
> > > +/* { dg-final { scan-assembler-times "shll\t" 1} } */
> > > +/* { dg-final { scan-assembler-times "shll2\t" 1} } */
> >
> > Very minor nit, sorry, but I think:
> >
> > /* { dg-final { scan-assembler-times {\tshll\t} 1 } } */
> >
> > would be better. Using "…\t" works, but IIRC it shows up as a tab
> > character in the testsuite result summary too.
>
> Fixed. Minor nits welcome. :)
>
>
> > OK for the aarch64 bits with the testsuite changes above.
> ok?
The gcc/tree-vect-stmts.c parts are OK.
Richard.
> gcc/ChangeLog:
>
> 2020-11-13 Joel Hutton <joel.hutton@arm.com>
>
> * config/aarch64/aarch64-simd.md: Add vec_widen_lshift_hi/lo<mode>
> patterns.
> * tree-vect-stmts.c
> (vectorizable_conversion): Fix for widen_lshift case.
>
> gcc/testsuite/ChangeLog:
>
> 2020-11-13 Joel Hutton <joel.hutton@arm.com>
>
> * gcc.target/aarch64/vect-widen-lshift.c: New test.
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer
* Re: [3/3][aarch64] Add support for vec_widen_shift pattern
2020-11-16 14:00 ` Richard Biener
@ 2020-11-17 13:41 ` Richard Sandiford
0 siblings, 0 replies; 6+ messages in thread
From: Richard Sandiford @ 2020-11-17 13:41 UTC (permalink / raw)
To: Richard Biener; +Cc: Joel Hutton, Joel Hutton via Gcc-patches
Richard Biener <rguenther@suse.de> writes:
> On Fri, 13 Nov 2020, Joel Hutton wrote:
>
>> Tests are still running, but I believe I've addressed all the comments.
>>
>> > > +#include <string.h>
>> > > +
>> >
>> > SVE targets will need a:
>> >
>> > #pragma GCC target "+nosve"
>> >
>> > here, since we'll generate different code for SVE.
>>
>> Fixed.
>>
>> > > +/* { dg-final { scan-assembler-times "shll\t" 1} } */
>> > > +/* { dg-final { scan-assembler-times "shll2\t" 1} } */
>> >
>> > Very minor nit, sorry, but I think:
>> >
>> > /* { dg-final { scan-assembler-times {\tshll\t} 1 } } */
>> >
>> > would be better. Using "…\t" works, but IIRC it shows up as a tab
>> > character in the testsuite result summary too.
>>
>> Fixed. Minor nits welcome. :)
>>
>>
>> > OK for the aarch64 bits with the testsuite changes above.
>> ok?
>
> The gcc/tree-vect-stmts.c parts are OK.
Same for the AArch64 stuff.
Thanks,
Richard
end of thread, other threads:[~2020-11-17 13:41 UTC | newest]
Thread overview: 6+ messages
2020-11-12 19:34 [3/3][aarch64] Add support for vec_widen_shift pattern Joel Hutton
2020-11-13 8:05 ` Richard Biener
2020-11-13 10:14 ` Richard Sandiford
2020-11-13 16:50 ` Joel Hutton
2020-11-16 14:00 ` Richard Biener
2020-11-17 13:41 ` Richard Sandiford