* [AArch64] Improve SVE handling of single-vector permutes
@ 2018-08-23 8:58 Richard Sandiford
From: Richard Sandiford @ 2018-08-23 8:58 UTC
To: gcc-patches
aarch64_vectorize_vec_perm_const was failing to set one_vector_p
if the permute had only a single input. This in turn was hiding
a problem in the SVE TBL handling: it accepted single-vector
variable-length permutes, but sent them through the general
two-vector aarch64_expand_sve_vec_perm, which is only set up
to handle constant-length permutes.
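
To illustrate the dispatch described above, here is a minimal scalar sketch in C++. It is not GCC code: the names Vec, tbl1, perm2, is_one_vector and expand are hypothetical, plain std::vector stands in for vector registers, and pointer identity stands in for rtx_equal_p. It assumes that a single-vector permute has its selector indices reduced to the range of one input before the TBL is emitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of the permute dispatch; none of these
// names exist in GCC.
using Vec = std::vector<std::uint32_t>;

// Single-vector TBL: element i of the result is src[sel[i]], or 0 when
// sel[i] is out of range (the behaviour of the SVE TBL instruction).
Vec tbl1(const Vec &src, const Vec &sel) {
  Vec out(sel.size());
  for (std::size_t i = 0; i < sel.size(); ++i)
    out[i] = sel[i] < src.size() ? src[sel[i]] : 0;
  return out;
}

// General two-vector permute: indices select from the concatenation
// of op0 and op1.
Vec perm2(const Vec &op0, const Vec &op1, const Vec &sel) {
  assert(op0.size() == op1.size());
  const std::size_t n = op0.size();
  Vec out(sel.size());
  for (std::size_t i = 0; i < sel.size(); ++i) {
    const std::size_t idx = sel[i] % (2 * n);
    out[i] = idx < n ? op0[idx] : op1[idx - n];
  }
  return out;
}

// Mirrors the patched condition: the permute is single-vector if the
// selector has only one input or if both operands are the same.
// Pointer identity is an assumption standing in for rtx_equal_p.
bool is_one_vector(std::size_t ninputs, const Vec *op0, const Vec *op1) {
  return ninputs == 1 || (op0 != nullptr && op0 == op1);
}

// Dispatch: single-vector permutes go straight to TBL, everything else
// goes through the general two-vector path.
Vec expand(const Vec *op0, const Vec *op1, std::size_t ninputs,
           const Vec &sel) {
  assert(op0 != nullptr);
  if (is_one_vector(ninputs, op0, op1)) {
    // Assumed: indices are wrapped into the range of a single input.
    Vec wrapped(sel.size());
    for (std::size_t i = 0; i < sel.size(); ++i)
      wrapped[i] = sel[i] % op0->size();
    return tbl1(*op0, wrapped);
  }
  assert(op1 != nullptr);
  return perm2(*op0, *op1, sel);
}
```

In this model a vector reverse, the kind of permute the no-vfa-vect-depend tests rely on, is expand (&a, nullptr, 1, {3, 2, 1, 0}) for a four-element a: it takes the TBL path and never touches a second operand.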

Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf.
Applied to trunk, counting the aarch64_vectorize_vec_perm_const change
as obvious. Might be worth backporting to GCC 8 at some point.

Richard

2018-08-23  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/aarch64.c (aarch64_evpc_sve_tbl): Fix handling
	of single-vector TBLs.
	(aarch64_vectorize_vec_perm_const): Set one_vector_p when only
	one input is given.

gcc/testsuite/
	* gcc.dg/vect/no-vfa-vect-depend-2.c: Remove XFAIL.
	* gcc.dg/vect/no-vfa-vect-depend-3.c: Likewise.
	* gcc.dg/vect/pr65947-13.c: Update for vect_fold_extract_last.
	* gcc.dg/vect/pr80631-2.c: Likewise.

Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c 2018-08-23 09:49:50.426715321 +0100
+++ gcc/config/aarch64/aarch64.c 2018-08-23 09:53:58.416580848 +0100
@@ -15423,7 +15423,10 @@ aarch64_evpc_sve_tbl (struct expand_vec_
machine_mode sel_mode = mode_for_int_vector (d->vmode).require ();
rtx sel = vec_perm_indices_to_rtx (sel_mode, d->perm);
- aarch64_expand_sve_vec_perm (d->target, d->op0, d->op1, sel);
+ if (d->one_vector_p)
+ emit_unspec2 (d->target, UNSPEC_TBL, d->op0, force_reg (sel_mode, sel));
+ else
+ aarch64_expand_sve_vec_perm (d->target, d->op0, d->op1, sel);
return true;
}
@@ -15476,7 +15479,8 @@ aarch64_vectorize_vec_perm_const (machin
struct expand_vec_perm_d d;
/* Check whether the mask can be applied to a single vector. */
- if (op0 && rtx_equal_p (op0, op1))
+ if (sel.ninputs () == 1
+ || (op0 && rtx_equal_p (op0, op1)))
d.one_vector_p = true;
else if (sel.all_from_input_p (0))
{
Index: gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c 2018-05-02 08:37:48.981604753 +0100
+++ gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c 2018-08-23 09:53:58.416580848 +0100
@@ -51,7 +51,4 @@ int main (void)
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
-/* Requires reverse for variable-length SVE, which is implemented for
- by a later patch. Until then we report it twice, once for SVE and
- once for 128-bit Advanced SIMD. */
-/* { dg-final { scan-tree-dump-times "dependence distance negative" 1 "vect" { xfail { aarch64_sve && vect_variable_length } } } } */
+/* { dg-final { scan-tree-dump-times "dependence distance negative" 1 "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-3.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-3.c 2018-05-02 08:37:49.021604375 +0100
+++ gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-3.c 2018-08-23 09:53:58.416580848 +0100
@@ -183,7 +183,4 @@ int main ()
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" {xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
-/* f4 requires reverse for SVE, which is implemented by a later patch.
- Until then we report it twice, once for SVE and once for 128-bit
- Advanced SIMD. */
-/* { dg-final { scan-tree-dump-times "dependence distance negative" 4 "vect" { xfail { aarch64_sve && vect_variable_length } } } } */
+/* { dg-final { scan-tree-dump-times "dependence distance negative" 4 "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/pr65947-13.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/pr65947-13.c 2018-05-02 08:37:49.041604185 +0100
+++ gcc/testsuite/gcc.dg/vect/pr65947-13.c 2018-08-23 09:53:58.416580848 +0100
@@ -41,4 +41,5 @@ main (void)
}
/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" } } */
-/* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 4 "vect" } } */
+/* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 4 "vect" { xfail vect_fold_extract_last } } } */
+/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 4 "vect" { target vect_fold_extract_last } } } */
Index: gcc/testsuite/gcc.dg/vect/pr80631-2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/pr80631-2.c 2018-05-02 08:37:48.977604791 +0100
+++ gcc/testsuite/gcc.dg/vect/pr80631-2.c 2018-08-23 09:53:58.416580848 +0100
@@ -72,4 +72,5 @@ main ()
}
/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 5 "vect" { target vect_condition } } } */
-/* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 10 "vect" { target vect_condition } } } */
+/* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 10 "vect" { target vect_condition xfail vect_fold_extract_last } } } */
+/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 10 "vect" { target vect_fold_extract_last } } } */