public inbox for gcc-cvs@sourceware.org
* [gcc r12-5407] middle-end: Handle FMA_CONJ correctly after SLP layout update.
From: Tamar Christina @ 2021-11-19 15:13 UTC
  To: gcc-cvs

https://gcc.gnu.org/g:487d604b6fa0f0a981eadc216d9e481d08ed7e7b

commit r12-5407-g487d604b6fa0f0a981eadc216d9e481d08ed7e7b
Author: Tamar Christina <tamar.christina@arm.com>
Date:   Fri Nov 19 15:12:38 2021 +0000

    middle-end: Handle FMA_CONJ correctly after SLP layout update.
    
    Apologies, I got dinged by the i386 regressions bot for a test I didn't have in
    my tree at the time I made the previous patch.  The bot was telling me that FMA
    stopped working after I strengthened the FMA check in the previous patch.
    
    The reason is that the check is slightly too early.  The first check can only
    exit early when neither node is a MULT.  However, we need to delay until we know
    whether the node is a MUL or an FMA before enforcing that both nodes must be a
    MULT, since the node to inspect differs between the MUL and FMA cases.
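
The ordering of the checks can be sketched with a small self-contained model.
This is an illustration only, not the code in tree-vect-slp-patterns.c: the
toy_node type, the toy_op and toy_result enums and the classify function are
hypothetical names, and the linearity and permutation checks of the real
matcher are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for SLP nodes; not GCC's real data structures.  */
enum toy_op { TOY_MULT, TOY_PLUS, TOY_OTHER };
enum toy_result { TOY_NONE, TOY_MUL, TOY_FMA };

struct toy_node
{
  enum toy_op code;
  const struct toy_node *child1;  /* Second operand; only read for TOY_PLUS.  */
};

static bool
is_mult (const struct toy_node *n)
{
  return n != NULL && n->code == TOY_MULT;
}

enum toy_result
classify (const struct toy_node *n0, const struct toy_node *n1)
{
  bool mul0 = is_mult (n0);
  bool mul1 = is_mult (n1);

  /* Early exit only when neither node is a multiply.  */
  if (!mul0 && !mul1)
    return TOY_NONE;

  /* For an FMA the multiply to inspect sits below the addition.  */
  bool add0 = n0 != NULL && n0->code == TOY_PLUS;
  if (add0)
    {
      mul0 = is_mult (n0->child1);
      if (!mul0)
        return TOY_NONE;
    }

  /* Only now is it known which nodes must both be multiplies.  */
  if (!mul0 || !mul1)
    return TOY_NONE;

  return add0 ? TOY_FMA : TOY_MUL;
}
```

In this model, requiring both nodes to be multiplies in the first check would
reject the FMA shape, because its first node is the addition rather than the
multiply.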
    
    Also, when updating the patch from the GCC 11 tree layout to the new GCC 12 one,
    I had missed that the difference in which node is conjugated is not symmetrical.
    
    So the test for it can just test the inverse order.  It was not detecting
    the case where the first node was conjugated instead of the second one.
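
As a source-level reminder of why the two cases are distinct, here is a minimal
sketch showing that conjugating the first operand and conjugating the second
give different products (each is the conjugate of the other).  The function
names are hypothetical and say nothing about how the SLP matcher represents
the operation internally.

```c
#include <complex.h>

/* Conjugate the first operand before multiplying.  */
double complex
mul_conj_first (double complex a, double complex b)
{
  return conj (a) * b;
}

/* Conjugate the second operand before multiplying.  */
double complex
mul_conj_second (double complex a, double complex b)
{
  return a * conj (b);
}
```

For a = 1 + 2i and b = 3 + 4i the first form gives 11 - 2i and the second
gives 11 + 2i, so a matcher that only recognises one placement would miss the
other.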
    
    This also made me wonder why my own test didn't detect this.  It turns out that
    the tests, being copied from the _Float16 ones, were incorrectly marked as
    xfail.  The _Float16 ones are marked as xfail since C doesn't have a conj
    operation for _Float16, which means you get extra type-casts in between.
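
For contrast with the _Float16 case, the float tests can call conjf directly,
with no intermediate casts.  The kernels below are a hypothetical sketch of
that kind of loop, assuming a multiply-accumulate shape; they are not copied
from complex-mla-template.c, and the names fma_plain and fma_conj are made up
for illustration.

```c
#include <complex.h>

#define N 16

/* Multiply-accumulate: c[i] += a[i] * b[i].  */
void
fma_plain (const float complex *restrict a, const float complex *restrict b,
           float complex *restrict c)
{
  for (int i = 0; i < N; i++)
    c[i] += a[i] * b[i];
}

/* Multiply-accumulate with the second operand conjugated via conjf.  */
void
fma_conj (const float complex *restrict a, const float complex *restrict b,
          float complex *restrict c)
{
  for (int i = 0; i < N; i++)
    c[i] += a[i] * conjf (b[i]);
}
```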
    
    While you could use the GCC _Complex extension here I opted to mark them xfail
    since I wanted to include detection over the widenings next year.
    
    Secondly the double tests were being skipped because Adv. SIMD was missing from
    targets supporting Complex Double vectorization.
    
    With these changes all other tests run and pass, and the only XFAIL ones are,
    correctly, the _Float16 ones.  Sorry for missing this before; testing should now
    cover all cases.
    
    gcc/ChangeLog:
    
            PR tree-optimization/103311
            PR target/103330
            * tree-vect-slp-patterns.c (vect_validate_multiplication): Fix CONJ
            test to new codegen.
            (complex_mul_pattern::matches): Move check downwards.
    
    gcc/testsuite/ChangeLog:
    
            PR tree-optimization/103311
            PR target/103330
            * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c: Fix it.
            * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: Likewise.
            * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c: Likewise.
            * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c: Likewise.
            * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c: Likewise.
            * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c: Likewise.
            * lib/target-supports.exp
            (check_effective_target_vect_complex_add_double): Add Adv. SIMD.

Diff:
---
 .../complex/fast-math-bb-slp-complex-mla-double.c  |  5 +++--
 .../complex/fast-math-bb-slp-complex-mla-float.c   |  6 +++---
 .../complex/fast-math-bb-slp-complex-mls-double.c  |  7 +++----
 .../complex/fast-math-bb-slp-complex-mls-float.c   |  6 +++---
 .../complex/fast-math-bb-slp-complex-mul-double.c  |  5 +++--
 .../complex/fast-math-bb-slp-complex-mul-float.c   |  4 ++--
 gcc/testsuite/lib/target-supports.exp              |  6 ++++--
 gcc/tree-vect-slp-patterns.c                       | 24 ++++++++++++++--------
 8 files changed, 37 insertions(+), 26 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c
index 462063abc30..b77c847403d 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c
@@ -1,10 +1,11 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_complex_add_double } */
 /* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-fdump-tree-vect-details" } */
 
 #define TYPE double
 #define N 16
 #include "complex-mla-template.c"
 
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMA_CONJ" "slp1" } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "slp1" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMA_CONJ" "vect" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c
index a88adc8184e..cd68fd19008 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c
@@ -1,10 +1,10 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_complex_add_float } */
-/* { dg-additional-options "-fno-tree-loop-vectorize" } */
+/* { dg-additional-options "-fdump-tree-vect-details" } */
 /* { dg-add-options arm_v8_3a_fp16_complex_neon } */
 
 #define TYPE float
 #define N 16
 #include "complex-mla-template.c"
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMA_CONJ" "slp1" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "slp1" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMA_CONJ" "vect" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c
index a434fd1f1d3..9d9839417a2 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c
@@ -1,12 +1,11 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_complex_add_double } */
 /* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-fdump-tree-vect-details" } */
 
 #define TYPE double
 #define N 16
 #include "complex-mls-template.c"
 
-/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT270" "slp1" } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "slp1" } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMS_CONJ" "slp1" } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMS" "slp1" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMS_CONJ" "vect" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMS" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c
index b7ccbbdb757..cf540a08acd 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c
@@ -1,11 +1,11 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_complex_add_float } */
-/* { dg-additional-options "-fno-tree-loop-vectorize" } */
+/* { dg-additional-options "-fdump-tree-vect-details" } */
 /* { dg-add-options arm_v8_3a_complex_neon } */
 
 #define TYPE float
 #define N 16
 #include "complex-mls-template.c"
 
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMS_CONJ" "slp1"  { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMS" "slp1"  { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMS_CONJ" "vect" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMS" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c
index f7e9386334e..dcac519cd98 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c
@@ -1,10 +1,11 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_complex_add_double } */
 /* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-fdump-tree-vect-details" } */
 
 #define TYPE double
 #define N 16
 #include "complex-mul-template.c"
 
-/* { dg-final { scan-tree-dump "Found COMPLEX_MUL_CONJ" "slp1" } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_MUL" slp1" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_MUL_CONJ" "vect" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_MUL" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c
index 0dc9c525556..27280ae2ba4 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c
@@ -7,5 +7,5 @@
 #define N 16
 #include "complex-mul-template.c"
 
-/* { dg-final { scan-tree-dump "Found COMPLEX_MUL_CONJ" "slp1"  { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_MUL" "slp1"  { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_MUL_CONJ" "slp1" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_MUL" "slp1" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index c928d99a14b..155034c9ca4 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3632,8 +3632,10 @@ proc check_effective_target_vect_complex_add_float { } {
 proc check_effective_target_vect_complex_add_double { } {
     return [check_cached_effective_target_indexed vect_complex_add_double {
       expr {
-	 ([check_effective_target_aarch64_sve2]
-	      && [check_effective_target_aarch64_little_endian])
+	 (([check_effective_target_arm_v8_3a_complex_neon_ok]
+	  && [check_effective_target_aarch64_little_endian])
+	 || ([check_effective_target_aarch64_sve2]
+	      && [check_effective_target_aarch64_little_endian]))
 	}}]
 }
 
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index d916fc9cef9..0350441fad9 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -809,14 +809,20 @@ vect_validate_multiplication (slp_tree_to_load_perm_map_t *perm_cache,
       if (linear_loads_p (perm_cache, left_op[index2]) == PERM_EVENODD)
 	return true;
     }
-  else if (kind == PERM_EVENODD)
+  else if (kind == PERM_EVENODD && !neg_first)
     {
-      if ((kind = linear_loads_p (perm_cache, left_op[index2])) == PERM_EVENODD)
+      if ((kind = linear_loads_p (perm_cache, left_op[index2])) != PERM_EVENEVEN)
 	return false;
       return true;
     }
-  else if (!neg_first)
-    *conj_first_operand = true;
+  else if (kind == PERM_EVENEVEN && neg_first)
+    {
+      if ((kind = linear_loads_p (perm_cache, left_op[index2])) != PERM_EVENODD)
+	return false;
+
+      *conj_first_operand = true;
+      return true;
+    }
   else
     return false;
 
@@ -949,7 +955,7 @@ complex_mul_pattern::matches (complex_operation_t op,
 
   bool mul0 = vect_match_expression_p (l0node[0], MULT_EXPR);
   bool mul1 = vect_match_expression_p (l0node[1], MULT_EXPR);
-  if (!mul0 || !mul1)
+  if (!mul0 && !mul1)
     return IFN_LAST;
 
   /* Now operand2+4 may lead to another expression.  */
@@ -962,7 +968,7 @@ complex_mul_pattern::matches (complex_operation_t op,
     {
       auto vals = SLP_TREE_CHILDREN (l0node[0]);
       /* Check if it's a multiply, otherwise no idea what this is.  */
-      if (!vect_match_expression_p (vals[1], MULT_EXPR))
+      if (!(mul0 = vect_match_expression_p (vals[1], MULT_EXPR)))
 	return IFN_LAST;
 
       /* Check if the ADD is linear, otherwise it's not valid complex FMA.  */
@@ -979,6 +985,8 @@ complex_mul_pattern::matches (complex_operation_t op,
 
   if (left_op.length () != 2
       || right_op.length () != 2
+      || !mul0
+      || !mul1
       || linear_loads_p (perm_cache, left_op[1]) == PERM_ODDEVEN)
     return IFN_LAST;
 
@@ -993,7 +1001,7 @@ complex_mul_pattern::matches (complex_operation_t op,
       if (!vect_validate_multiplication (perm_cache, left_op, PERM_EVENEVEN)
 	  || vect_normalize_conj_loc (left_op))
 	return IFN_LAST;
-      if (!mul0)
+      if (add0)
 	ifn = IFN_COMPLEX_FMA;
       else
 	ifn = IFN_COMPLEX_MUL;
@@ -1005,7 +1013,7 @@ complex_mul_pattern::matches (complex_operation_t op,
 					 false))
 	return IFN_LAST;
 
-      if(!mul0)
+      if(add0)
 	ifn = IFN_COMPLEX_FMA_CONJ;
       else
 	ifn = IFN_COMPLEX_MUL_CONJ;

