From: Tamar Christina <Tamar.Christina@arm.com>
To: Richard Sandiford <Richard.Sandiford@arm.com>,
	Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>
Cc: nd <nd@arm.com>, "rguenther@suse.de" <rguenther@suse.de>
Subject: RE: [1/3 PATCH]middle-end vect: Simplify and extend the complex numbers validation routines.
Date: Mon, 20 Dec 2021 16:18:40 +0000
Message-ID: <VI1PR08MB5325E4E7FC25F1ECF90FC1E1FF7B9@VI1PR08MB5325.eurprd08.prod.outlook.com>
In-Reply-To: <mpth7b76w83.fsf@arm.com>

> -----Original Message-----
> From: Richard Sandiford <richard.sandiford@arm.com>
> Sent: Friday, December 17, 2021 4:19 PM
> To: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>
> Cc: Tamar Christina <Tamar.Christina@arm.com>; nd <nd@arm.com>;
> rguenther@suse.de
> Subject: Re: [1/3 PATCH]middle-end vect: Simplify and extend the complex
> numbers validation routines.
> 
> Just a comment on the documentation:
> 
> Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > index 9ec051e94e10cca9eec2773e1b8c01b74b6ea4db..60dc5b3ea6087c2824ad1467bc66e9cfebe9dcfc 100644
> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -6325,12 +6325,12 @@ Perform a vector multiply and accumulate that
> > is semantically the same as  a multiply and accumulate of complex numbers.
> >
> >  @smallexample
> > -  complex TYPE c[N];
> > -  complex TYPE a[N];
> > -  complex TYPE b[N];
> > +  complex TYPE op0[N];
> > +  complex TYPE op1[N];
> > +  complex TYPE op2[N];
> >    for (int i = 0; i < N; i += 1)
> >      @{
> > -      c[i] += a[i] * b[i];
> > +      op2[i] += op1[i] * op2[i];
> >      @}
> 
> I think this should be:
> 
>   op0[i] = op1[i] * op2[i] + op3[i];
> 
> since operand 0 is the output and operand 3 is the accumulator input.
> 
> Same idea for the others.  For:
> 
> > @@ -6415,12 +6415,12 @@ Perform a vector multiply that is semantically
> > the same as multiply of  complex numbers.
> >
> >  @smallexample
> > -  complex TYPE c[N];
> > -  complex TYPE a[N];
> > -  complex TYPE b[N];
> > +  complex TYPE op0[N];
> > +  complex TYPE op1[N];
> > +  complex TYPE op2[N];
> >    for (int i = 0; i < N; i += 1)
> >      @{
> > -      c[i] = a[i] * b[i];
> > +      op2[i] = op0[i] * op1[i];
> 
> …this I think it should be:
> 
>   op0[i] = op1[i] * op2[i];

Updated patch attached.
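
For reference, the operand numbering agreed above for the multiply and
accumulate case can be read as the scalar loop below.  This is only an
illustrative sketch in plain C99: the function name cmla_reference, the
choice of float and the value of N are assumptions made for the example
and are not part of the patch.

```c
#include <complex.h>

#define N 4

/* Hypothetical reference loop: operand 0 is the output, operands 1 and 2
   are the multiplicands and operand 3 is the accumulator input.  */
static void
cmla_reference (float complex op0[N], const float complex op1[N],
		const float complex op2[N], const float complex op3[N])
{
  for (int i = 0; i < N; i += 1)
    op0[i] = op1[i] * op2[i] + op3[i];
}
```

Because the output is a separate operand, the accumulator array is left
unmodified, which is the difference from the old "c[i] += a[i] * b[i]"
wording.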

Bootstrapped and regtested on aarch64-none-linux-gnu and
x86_64-pc-linux-gnu with no regressions.

OK for master, and for a backport to GCC 11 after some time to stew?

Thanks,
Tamar

gcc/ChangeLog:

	PR tree-optimization/102819
	PR tree-optimization/103169
	* doc/md.texi: Update docs for cfms, cfma.
	* tree-data-ref.h (same_data_refs): Accept optional offset.
	* tree-vect-slp-patterns.c (is_linear_load_p): Fix issue with repeating
	patterns.
	(vect_normalize_conj_loc): Remove.
	(is_eq_or_top): Change to take two nodes.
	(enum _conj_status, compatible_complex_nodes_p,
	vect_validate_multiplication): New.
	(class complex_add_pattern, complex_add_pattern::matches,
	complex_add_pattern::recognize, class complex_mul_pattern,
	complex_mul_pattern::recognize, class complex_fms_pattern,
	complex_fms_pattern::recognize, class complex_operations_pattern,
	complex_operations_pattern::recognize, addsub_pattern::recognize): Pass
	new cache.
	(complex_fms_pattern::matches, complex_mul_pattern::matches): Pass new
	cache and use new validation code.
	* tree-vect-slp.c (vect_match_slp_patterns_2, vect_match_slp_patterns,
	vect_analyze_slp): Pass along cache.
	(compatible_calls_p): Expose.
	* tree-vectorizer.h (compatible_calls_p, slp_node_hash,
	slp_compat_nodes_map_t): New.
	(class vect_pattern): Update signatures to include new cache.

gcc/testsuite/ChangeLog:

	PR tree-optimization/102819
	PR tree-optimization/103169
	* g++.dg/vect/pr99149.cc: xfail for now.
	* gcc.dg/vect/complex/pr102819-1.c: New test.
	* gcc.dg/vect/complex/pr102819-2.c: New test.
	* gcc.dg/vect/complex/pr102819-3.c: New test.
	* gcc.dg/vect/complex/pr102819-4.c: New test.
	* gcc.dg/vect/complex/pr102819-5.c: New test.
	* gcc.dg/vect/complex/pr102819-6.c: New test.
	* gcc.dg/vect/complex/pr102819-7.c: New test.
	* gcc.dg/vect/complex/pr102819-8.c: New test.
	* gcc.dg/vect/complex/pr102819-9.c: New test.
	* gcc.dg/vect/complex/pr103169.c: New test.
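
The is_linear_load_p change listed above reduces each load index modulo 2
before comparing it, so that a repeating permutation such as { 1, 1, 3, 3 }
is still treated as selecting only odd lanes.  A minimal sketch of that idea
follows.  The helper all_lanes_have_parity is hypothetical and written only
for illustration; it is not the vectorizer's code, which tracks several
candidate permute kinds at once.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if every index in LOADS selects a lane
   of the given PARITY (0 for even, 1 for odd).  Reducing each index modulo
   2 lets repeated patterns such as { 1, 1, 3, 3 } match as all-odd.  */
static bool
all_lanes_have_parity (const unsigned *loads, size_t n, unsigned parity)
{
  for (size_t i = 0; i < n; i++)
    if (loads[i] % 2 != parity)
      return false;
  return true;
}
```

Without the modulo, a comparison against the literal values 0 and 1 would
reject the second pair of lanes in such a permutation.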

--- inline copy of patch ---

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 9ec051e94e10cca9eec2773e1b8c01b74b6ea4db..ad06b02d36876082afe4c3f3fb51887f7a522b23 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -6325,12 +6325,13 @@ Perform a vector multiply and accumulate that is semantically the same as
 a multiply and accumulate of complex numbers.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
+  complex TYPE op3[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] += a[i] * b[i];
+      op0[i] = op1[i] * op2[i] + op3[i];
     @}
 @end smallexample
 
@@ -6348,12 +6349,13 @@ the same as a multiply and accumulate of complex numbers where the second
 multiply arguments is conjugated.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
+  complex TYPE op3[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] += a[i] * conj (b[i]);
+      op0[i] = op1[i] * conj (op2[i]) + op3[i];
     @}
 @end smallexample
 
@@ -6370,12 +6372,13 @@ Perform a vector multiply and subtract that is semantically the same as
 a multiply and subtract of complex numbers.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
+  complex TYPE op3[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] -= a[i] * b[i];
+      op0[i] = op1[i] * op2[i] - op3[i];
     @}
 @end smallexample
 
@@ -6393,12 +6396,13 @@ the same as a multiply and subtract of complex numbers where the second
 multiply arguments is conjugated.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
+  complex TYPE op3[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] -= a[i] * conj (b[i]);
+      op0[i] = op1[i] * conj (op2[i]) - op3[i];
     @}
 @end smallexample
 
@@ -6415,12 +6419,12 @@ Perform a vector multiply that is semantically the same as multiply of
 complex numbers.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] = a[i] * b[i];
+      op0[i] = op1[i] * op2[i];
     @}
 @end smallexample
 
@@ -6437,12 +6441,12 @@ Perform a vector multiply by conjugate that is semantically the same as a
 multiply of complex numbers where the second multiply arguments is conjugated.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] = a[i] * conj (b[i]);
+      op0[i] = op1[i] * conj (op2[i]);
     @}
 @end smallexample
 
diff --git a/gcc/testsuite/g++.dg/vect/pr99149.cc b/gcc/testsuite/g++.dg/vect/pr99149.cc
index e6e0594a336fa053ffba64a12e2de43a4e373f49..bb9f5fa89f12b184368bf5488d6e9432c2166463 100755
--- a/gcc/testsuite/g++.dg/vect/pr99149.cc
+++ b/gcc/testsuite/g++.dg/vect/pr99149.cc
@@ -24,4 +24,4 @@ public:
 } n;
 main() { n.j(); }
 
-/* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_MUL" 1 "slp2" } } */
+/* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_MUL" 1 "slp2" { xfail { vect_float } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-1.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..46b9a55f05279d732fa1418e02f779cf693ede07
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void bad1(float v1, float v2)
+{
+  for (int r = 0; r < 100; r += 4)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * (f[2][r] + v2) - f[1][i] * (f[2][i] + v1);
+      f[0][i] = f[1][r] * (f[2][i] + v1) + f[1][i] * (f[2][r] + v2);
+      f[0][r+2] = f[1][r+2] * (f[2][r+2] + v2) - f[1][i+2] * (f[2][i+2] + v1);
+      f[0][i+2] = f[1][r+2] * (f[2][i+2] + v1) + f[1][i+2] * (f[2][r+2] + v2);
+      //                  ^^^^^^^             ^^^^^^^
+    }
+}
+
+/* { dg-final { scan-tree-dump "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-2.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-2.c
new file mode 100644
index 0000000000000000000000000000000000000000..ffe646efe57f7ad07541b0fb96601596f46dc5f8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void bad1(float v1, float v2)
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * (f[2][r] + v1) - f[1][i] * (f[2][i] + v2);
+      f[0][i] = f[1][r] * (f[2][i] + v1) + f[1][i] * (f[2][r] + v2);
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-3.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-3.c
new file mode 100644
index 0000000000000000000000000000000000000000..5f98aa204d8b11b0cb433f8965dbb72cf8940de1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void good1(float v1, float v2)
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * (f[2][r] + v2) - f[1][i] * (f[2][i] + v1);
+      f[0][i] = f[1][r] * (f[2][i] + v1) + f[1][i] * (f[2][r] + v2);
+    }
+}
+
+/* { dg-final { scan-tree-dump "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-4.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-4.c
new file mode 100644
index 0000000000000000000000000000000000000000..882851789c5085e734000609114be480d3b08bd0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-4.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void good1()
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * f[2][r] - f[1][i] * f[2][i];
+      f[0][i] = f[1][r] * f[2][i] + f[1][i] * f[2][r];
+    }
+}
+
+/* { dg-final { scan-tree-dump "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-5.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-5.c
new file mode 100644
index 0000000000000000000000000000000000000000..6a2d549d65f3f27d407fb0bd469473e6a5c333ae
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-5.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void good2()
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * (f[2][r] + 1) - f[1][i] * (f[2][i] + 1);
+      f[0][i] = f[1][r] * (f[2][i] + 1) + f[1][i] * (f[2][r] + 1);
+    }
+}
+
+/* { dg-final { scan-tree-dump "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-6.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-6.c
new file mode 100644
index 0000000000000000000000000000000000000000..71e66dbe3b29eec1fffb8df9b216022fdc0af54e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-6.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void bad1()
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * f[2][r] - f[1][i] * f[3][i];
+      f[0][i] = f[1][r] * f[2][i] + f[1][i] * f[3][r];
+      //                  ^^^^^^^             ^^^^^^^
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-7.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-7.c
new file mode 100644
index 0000000000000000000000000000000000000000..536672f3c8bb474ad5fa4bb61b3a36b555acf3cf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-7.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void bad2()
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * (f[2][r] + 1) - f[1][i] * f[2][i];
+      f[0][i] = f[1][r] * (f[2][i] + 1) + f[1][i] * f[2][r];
+      //                          ^^^^
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-8.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-8.c
new file mode 100644
index 0000000000000000000000000000000000000000..07b48148688b7d530e5891d023d558b58a485c23
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-8.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void bad3()
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * f[2][r] - f[1][r] * f[2][i];
+      f[0][i] = f[1][r] * f[2][i] + f[1][i] * f[2][r];
+      //                            ^^^^^^^
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-9.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-9.c
new file mode 100644
index 0000000000000000000000000000000000000000..7655852434b21b381fe7ee316e8caf3d485b8ee1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-9.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+#include <stdio.h>
+#include <complex.h>
+
+#define N 200
+#define TYPE float
+#define TYPE2 float
+
+void g (TYPE2 complex a[restrict N], TYPE complex b[restrict N], TYPE complex c[restrict N])
+{
+  for (int i=0; i < N; i++)
+    {
+      c[i] -=  a[i] * b[0];
+    }
+}
+
+/* The pattern overlaps with COMPLEX_ADD so we need to support consuming ADDs in COMPLEX_FMS.  */
+
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMS" "vect" { xfail { vect_float } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr103169.c b/gcc/testsuite/gcc.dg/vect/complex/pr103169.c
new file mode 100644
index 0000000000000000000000000000000000000000..1bfabbd85a0eedfb4156a82574324126e9083fc5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr103169.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { vect_double } } } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-O2 -fvect-cost-model=unlimited" } */
+
+_Complex double b_0, c_0;
+
+void
+mul270snd (void)
+{
+  c_0 = b_0 * 1.0iF * 1.0iF;
+}
+
diff --git a/gcc/tree-data-ref.h b/gcc/tree-data-ref.h
index 74f579c9f3f23bac25d21546068c2ab43209aa2b..8ad5fa521279b20fa5e63eecf442d5dc5c16e7ee 100644
--- a/gcc/tree-data-ref.h
+++ b/gcc/tree-data-ref.h
@@ -600,10 +600,11 @@ same_data_refs_base_objects (data_reference_p a, data_reference_p b)
 }
 
 /* Return true when the data references A and B are accessing the same
-   memory object with the same access functions.  */
+   memory object with the same access functions.  Optionally skip the
+   last OFFSET dimensions in the data reference.  */
 
 static inline bool
-same_data_refs (data_reference_p a, data_reference_p b)
+same_data_refs (data_reference_p a, data_reference_p b, int offset = 0)
 {
   unsigned int i;
 
@@ -614,7 +615,7 @@ same_data_refs (data_reference_p a, data_reference_p b)
   if (!same_data_refs_base_objects (a, b))
     return false;
 
-  for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
+  for (i = offset; i < DR_NUM_DIMENSIONS (a); i++)
     if (!eq_evolutions_p (DR_ACCESS_FN (a, i), DR_ACCESS_FN (b, i)))
       return false;
 
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index 0350441fad9690cd5d04337171ca3470a064a571..020c29bba08c5bd80503a2dbc04292f8fd310b3c 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -149,12 +149,13 @@ is_linear_load_p (load_permutation_t loads)
   int valid_patterns = 4;
   FOR_EACH_VEC_ELT (loads, i, load)
     {
-      if (candidates[0] != PERM_UNKNOWN && load != 1)
+      unsigned adj_load = load % 2;
+      if (candidates[0] != PERM_UNKNOWN && adj_load != 1)
 	{
 	  candidates[0] = PERM_UNKNOWN;
 	  valid_patterns--;
 	}
-      if (candidates[1] != PERM_UNKNOWN && load != 0)
+      if (candidates[1] != PERM_UNKNOWN && adj_load != 0)
 	{
 	  candidates[1] = PERM_UNKNOWN;
 	  valid_patterns--;
@@ -596,11 +597,12 @@ class complex_add_pattern : public complex_pattern
   public:
     void build (vec_info *);
     static internal_fn
-    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *, slp_tree *,
-	     vec<slp_tree> *);
+    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *,
+	     slp_compat_nodes_map_t *, slp_tree *, vec<slp_tree> *);
 
     static vect_pattern*
-    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+    recognize (slp_tree_to_load_perm_map_t *, slp_compat_nodes_map_t *,
+	       slp_tree *);
 
     static vect_pattern*
     mkInstance (slp_tree *node, vec<slp_tree> *m_ops, internal_fn ifn)
@@ -647,6 +649,7 @@ complex_add_pattern::build (vec_info *vinfo)
 internal_fn
 complex_add_pattern::matches (complex_operation_t op,
 			      slp_tree_to_load_perm_map_t *perm_cache,
+			      slp_compat_nodes_map_t * /* compat_cache */,
 			      slp_tree *node, vec<slp_tree> *ops)
 {
   internal_fn ifn = IFN_LAST;
@@ -692,13 +695,14 @@ complex_add_pattern::matches (complex_operation_t op,
 
 vect_pattern*
 complex_add_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
+				slp_compat_nodes_map_t *compat_cache,
 				slp_tree *node)
 {
   auto_vec<slp_tree> ops;
   complex_operation_t op
     = vect_detect_pair_op (*node, true, &ops);
   internal_fn ifn
-    = complex_add_pattern::matches (op, perm_cache, node, &ops);
+    = complex_add_pattern::matches (op, perm_cache, compat_cache, node, &ops);
   if (ifn == IFN_LAST)
     return NULL;
 
@@ -709,147 +713,214 @@ complex_add_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
  * complex_mul_pattern
  ******************************************************************************/
 
-/* Check to see if either of the trees in ARGS are a NEGATE_EXPR.  If the first
-   child (args[0]) is a NEGATE_EXPR then NEG_FIRST_P is set to TRUE.
-
-   If a negate is found then the values in ARGS are reordered such that the
-   negate node is always the second one and the entry is replaced by the child
-   of the negate node.  */
+/* Helper function to check if PERM is KIND or PERM_TOP.  */
 
 static inline bool
-vect_normalize_conj_loc (vec<slp_tree> &args, bool *neg_first_p = NULL)
+is_eq_or_top (slp_tree_to_load_perm_map_t *perm_cache,
+	      slp_tree op1, complex_perm_kinds_t kind1,
+	      slp_tree op2, complex_perm_kinds_t kind2)
 {
-  gcc_assert (args.length () == 2);
-  bool neg_found = false;
-
-  if (vect_match_expression_p (args[0], NEGATE_EXPR))
-    {
-      std::swap (args[0], args[1]);
-      neg_found = true;
-      if (neg_first_p)
-	*neg_first_p = true;
-    }
-  else if (vect_match_expression_p (args[1], NEGATE_EXPR))
-    {
-      neg_found = true;
-      if (neg_first_p)
-	*neg_first_p = false;
-    }
+  complex_perm_kinds_t perm1 = linear_loads_p (perm_cache, op1);
+  if (perm1 != kind1 && perm1 != PERM_TOP)
+    return false;
 
-  if (neg_found)
-    args[1] = SLP_TREE_CHILDREN (args[1])[0];
+  complex_perm_kinds_t perm2 = linear_loads_p (perm_cache, op2);
+  if (perm2 != kind2 && perm2 != PERM_TOP)
+    return false;
 
-  return neg_found;
+  return true;
 }
 
-/* Helper function to check if PERM is KIND or PERM_TOP.  */
+enum _conj_status { CONJ_NONE, CONJ_FST, CONJ_SND };
 
 static inline bool
-is_eq_or_top (complex_perm_kinds_t perm, complex_perm_kinds_t kind)
+compatible_complex_nodes_p (slp_compat_nodes_map_t *compat_cache,
+			    slp_tree a, int *pa, slp_tree b, int *pb)
 {
-  return perm == kind || perm == PERM_TOP;
-}
+  bool *tmp;
+  std::pair<slp_tree, slp_tree> key = std::make_pair(a, b);
+  if ((tmp = compat_cache->get (key)) != NULL)
+    return *tmp;
 
-/* Helper function that checks to see if LEFT_OP and RIGHT_OP are both MULT_EXPR
-   nodes but also that they represent an operation that is either a complex
-   multiplication or a complex multiplication by conjugated value.
+  compat_cache->put (key, false);
 
-   Of the negation is expected to be in the first half of the tree (As required
-   by an FMS pattern) then NEG_FIRST is true.  If the operation is a conjugate
-   operation then CONJ_FIRST_OPERAND is set to indicate whether the first or
-   second operand contains the conjugate operation.  */
+  if (SLP_TREE_CHILDREN (a).length () != SLP_TREE_CHILDREN (b).length ())
+    return false;
 
-static inline bool
-vect_validate_multiplication (slp_tree_to_load_perm_map_t *perm_cache,
-			      const vec<slp_tree> &left_op,
-			      const vec<slp_tree> &right_op,
-			     bool neg_first, bool *conj_first_operand,
-			     bool fms)
-{
-  /* The presence of a negation indicates that we have either a conjugate or a
-     rotation.  We need to distinguish which one.  */
-  *conj_first_operand = false;
-  complex_perm_kinds_t kind;
-
-  /* Complex conjugates have the negation on the imaginary part of the
-     number where rotations affect the real component.  So check if the
-     negation is on a dup of lane 1.  */
-  if (fms)
+  if (SLP_TREE_DEF_TYPE (a) != SLP_TREE_DEF_TYPE (b))
+    return false;
+
+  /* Only internal nodes can be loads, as such we can't check further if they
+     are externals.  */
+  if (SLP_TREE_DEF_TYPE (a) != vect_internal_def)
     {
-      /* Canonicalization for fms is not consistent. So have to test both
-	 variants to be sure.  This needs to be fixed in the mid-end so
-	 this part can be simpler.  */
-      kind = linear_loads_p (perm_cache, right_op[0]);
-      if (!((is_eq_or_top (linear_loads_p (perm_cache, right_op[0]), PERM_ODDODD)
-	   && is_eq_or_top (linear_loads_p (perm_cache, right_op[1]),
-			     PERM_ODDEVEN))
-	  || (kind == PERM_ODDEVEN
-	      && is_eq_or_top (linear_loads_p (perm_cache, right_op[1]),
-			     PERM_ODDODD))))
-	return false;
+      for (unsigned i = 0; i < SLP_TREE_SCALAR_OPS (a).length (); i++)
+	{
+	  tree op1 = SLP_TREE_SCALAR_OPS (a)[pa[i % 2]];
+	  tree op2 = SLP_TREE_SCALAR_OPS (b)[pb[i % 2]];
+	  if (!operand_equal_p (op1, op2, 0))
+	    return false;
+	}
+
+      compat_cache->put (key, true);
+      return true;
     }
+
+  auto a_stmt = STMT_VINFO_STMT (SLP_TREE_REPRESENTATIVE (a));
+  auto b_stmt = STMT_VINFO_STMT (SLP_TREE_REPRESENTATIVE (b));
+
+  if (gimple_code (a_stmt) != gimple_code (b_stmt))
+    return false;
+
+  /* code, children, type, externals, loads, constants  */
+  if (gimple_num_args (a_stmt) != gimple_num_args (b_stmt))
+    return false;
+
+  /* At this point, a and b are known to be the same gimple operations.  */
+  if (is_gimple_call (a_stmt))
+    {
+	if (!compatible_calls_p (dyn_cast <gcall *> (a_stmt),
+				 dyn_cast <gcall *> (b_stmt)))
+	  return false;
+    }
+  else if (!is_gimple_assign (a_stmt))
+    return false;
   else
     {
-      if (linear_loads_p (perm_cache, right_op[1]) != PERM_ODDODD
-	  && !is_eq_or_top (linear_loads_p (perm_cache, right_op[0]),
-			    PERM_ODDEVEN))
+      tree_code acode = gimple_assign_rhs_code (a_stmt);
+      tree_code bcode = gimple_assign_rhs_code (b_stmt);
+      if ((acode == REALPART_EXPR || acode == IMAGPART_EXPR)
+	  && (bcode == REALPART_EXPR || bcode == IMAGPART_EXPR))
+	return true;
+
+      if (acode != bcode)
 	return false;
     }
 
-  /* Deal with differences in indexes.  */
-  int index1 = fms ? 1 : 0;
-  int index2 = fms ? 0 : 1;
-
-  /* Check if the conjugate is on the second first or second operand.  The
-     order of the node with the conjugate value determines this, and the dup
-     node must be one of lane 0 of the same DR as the neg node.  */
-  kind = linear_loads_p (perm_cache, left_op[index1]);
-  if (kind == PERM_TOP)
+  if (!SLP_TREE_LOAD_PERMUTATION (a).exists ()
+      || !SLP_TREE_LOAD_PERMUTATION (b).exists ())
     {
-      if (linear_loads_p (perm_cache, left_op[index2]) == PERM_EVENODD)
-	return true;
+      for (unsigned i = 0; i < gimple_num_args (a_stmt); i++)
+	{
+	  tree t1 = gimple_arg (a_stmt, i);
+	  tree t2 = gimple_arg (b_stmt, i);
+	  if (TREE_CODE (t1) != TREE_CODE (t2))
+	    return false;
+
+	  /* If SSA name then we will need to inspect the children
+	     so we can punt here.  */
+	  if (TREE_CODE (t1) == SSA_NAME)
+	    continue;
+
+	  if (!operand_equal_p (t1, t2, 0))
+	    return false;
+	}
     }
-  else if (kind == PERM_EVENODD && !neg_first)
+  else
     {
-      if ((kind = linear_loads_p (perm_cache, left_op[index2])) != PERM_EVENEVEN)
+      auto dr1 = STMT_VINFO_DATA_REF (SLP_TREE_REPRESENTATIVE (a));
+      auto dr2 = STMT_VINFO_DATA_REF (SLP_TREE_REPRESENTATIVE (b));
+      /* Don't check the last dimension as that's checked by the linearity
+	 checks.  This check is also much stricter than what we need
+	 because it doesn't consider loading from adjacent elements
+	 in the same struct as loading from the same base object.
+	 But for now, I'll play it safe.  */
+      if (!same_data_refs (dr1, dr2, 1))
 	return false;
-      return true;
     }
-  else if (kind == PERM_EVENEVEN && neg_first)
+
+  for (unsigned i = 0; i < SLP_TREE_CHILDREN (a).length (); i++)
     {
-      if ((kind = linear_loads_p (perm_cache, left_op[index2])) != PERM_EVENODD)
+      if (!compatible_complex_nodes_p (compat_cache,
+				       SLP_TREE_CHILDREN (a)[i], pa,
+				       SLP_TREE_CHILDREN (b)[i], pb))
 	return false;
-
-      *conj_first_operand = true;
-      return true;
     }
-  else
-    return false;
-
-  if (kind != PERM_EVENEVEN)
-    return false;
 
+  compat_cache->put (key, true);
   return true;
 }
 
-/* Helper function to help distinguish between a conjugate and a rotation in a
-   complex multiplication.  The operations have similar shapes but the order of
-   the load permutes are different.  This function returns TRUE when the order
-   is consistent with a multiplication or multiplication by conjugated
-   operand but returns FALSE if it's a multiplication by rotated operand.  */
-
 static inline bool
 vect_validate_multiplication (slp_tree_to_load_perm_map_t *perm_cache,
-			      const vec<slp_tree> &op,
-			      complex_perm_kinds_t permKind)
+			      slp_compat_nodes_map_t *compat_cache,
+			      vec<slp_tree> &left_op,
+			      vec<slp_tree> &right_op,
+			      bool subtract,
+			      enum _conj_status *_status)
 {
-  /* The left node is the more common case, test it first.  */
-  if (!is_eq_or_top (linear_loads_p (perm_cache, op[0]), permKind))
+  auto_vec<slp_tree> ops;
+  enum _conj_status stats = CONJ_NONE;
+
+  /* The complex operations can occur in two layouts and two permute sequences
+     so declare them and re-use them.  */
+  int styles[][4] = { { 0, 2, 1, 3} /* {L1, R1} + {L2, R2}.  */
+		    , { 0, 3, 1, 2} /* {L1, R2} + {L2, R1}.  */
+		    };
+
+  /* Now for the corresponding permutes that go with these values.  */
+  complex_perm_kinds_t perms[][4]
+    = { { PERM_EVENEVEN, PERM_ODDODD, PERM_EVENODD, PERM_ODDEVEN }
+      , { PERM_EVENODD, PERM_ODDEVEN, PERM_EVENEVEN, PERM_ODDODD }
+      };
+
+  /* These permutes are used during comparisons of externals on which
+     we require strict equality.  */
+  int cq[][4][2]
+    = { { { 0, 0 }, { 1, 1 }, { 0, 1 }, { 1, 0 } }
+      , { { 0, 1 }, { 1, 0 }, { 0, 0 }, { 1, 1 } }
+      };
+
+  /* Default to style and perm 0, most operations use this one.  */
+  int style = 0;
+  int perm = subtract ? 1 : 0;
+
+  /* Check if we have a negate operation, if so absorb the node and continue
+     looking.  */
+  bool neg0 = vect_match_expression_p (right_op[0], NEGATE_EXPR);
+  bool neg1 = vect_match_expression_p (right_op[1], NEGATE_EXPR);
+
+  /* Determine which style we're looking at.  We only have different ones
+     whenever a conjugate is involved.  */
+  if (neg0 && neg1)
+    ;
+  else if (neg0)
     {
-      if (!is_eq_or_top (linear_loads_p (perm_cache, op[1]), permKind))
-	return false;
+      right_op[0] = SLP_TREE_CHILDREN (right_op[0])[0];
+      stats = CONJ_FST;
+      if (subtract)
+	perm = 0;
     }
-  return true;
+  else if (neg1)
+    {
+      right_op[1] = SLP_TREE_CHILDREN (right_op[1])[0];
+      stats = CONJ_SND;
+      perm = 1;
+    }
+
+  *_status = stats;
+
+  /* Flatten the inputs after we've remapped them.  */
+  ops.create (4);
+  ops.safe_splice (left_op);
+  ops.safe_splice (right_op);
+
+  /* Extract out the elements to check.  */
+  slp_tree op0 = ops[styles[style][0]];
+  slp_tree op1 = ops[styles[style][1]];
+  slp_tree op2 = ops[styles[style][2]];
+  slp_tree op3 = ops[styles[style][3]];
+
+  /* Do cheapest test first.  If failed no need to analyze further.  */
+  if (linear_loads_p (perm_cache, op0) != perms[perm][0]
+      || linear_loads_p (perm_cache, op1) != perms[perm][1]
+      || !is_eq_or_top (perm_cache, op2, perms[perm][2], op3, perms[perm][3]))
+    return false;
+
+  return compatible_complex_nodes_p (compat_cache, op0, cq[perm][0], op1,
+				     cq[perm][1])
+	 && compatible_complex_nodes_p (compat_cache, op2, cq[perm][2], op3,
+					cq[perm][3]);
 }
 
 /* This function combines two nodes containing only even and only odd lanes
@@ -908,11 +979,12 @@ class complex_mul_pattern : public complex_pattern
   public:
     void build (vec_info *);
     static internal_fn
-    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *, slp_tree *,
-	     vec<slp_tree> *);
+    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *,
+	     slp_compat_nodes_map_t *, slp_tree *, vec<slp_tree> *);
 
     static vect_pattern*
-    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+    recognize (slp_tree_to_load_perm_map_t *, slp_compat_nodes_map_t *,
+	       slp_tree *);
 
     static vect_pattern*
     mkInstance (slp_tree *node, vec<slp_tree> *m_ops, internal_fn ifn)
@@ -943,6 +1015,7 @@ class complex_mul_pattern : public complex_pattern
 internal_fn
 complex_mul_pattern::matches (complex_operation_t op,
 			      slp_tree_to_load_perm_map_t *perm_cache,
+			      slp_compat_nodes_map_t *compat_cache,
 			      slp_tree *node, vec<slp_tree> *ops)
 {
   internal_fn ifn = IFN_LAST;
@@ -990,17 +1063,13 @@ complex_mul_pattern::matches (complex_operation_t op,
       || linear_loads_p (perm_cache, left_op[1]) == PERM_ODDEVEN)
     return IFN_LAST;
 
-  bool neg_first = false;
-  bool conj_first_operand = false;
-  bool is_neg = vect_normalize_conj_loc (right_op, &neg_first);
+  enum _conj_status status;
+  if (!vect_validate_multiplication (perm_cache, compat_cache, left_op,
+				     right_op, false, &status))
+    return IFN_LAST;
 
-  if (!is_neg)
+  if (status == CONJ_NONE)
     {
-      /* A multiplication needs to multiply agains the real pair, otherwise
-	 the pattern matches that of FMS.   */
-      if (!vect_validate_multiplication (perm_cache, left_op, PERM_EVENEVEN)
-	  || vect_normalize_conj_loc (left_op))
-	return IFN_LAST;
       if (add0)
 	ifn = IFN_COMPLEX_FMA;
       else
@@ -1008,11 +1077,6 @@ complex_mul_pattern::matches (complex_operation_t op,
     }
   else
     {
-      if (!vect_validate_multiplication (perm_cache, left_op, right_op,
-					 neg_first, &conj_first_operand,
-					 false))
-	return IFN_LAST;
-
       if(add0)
 	ifn = IFN_COMPLEX_FMA_CONJ;
       else
@@ -1029,19 +1093,13 @@ complex_mul_pattern::matches (complex_operation_t op,
     ops->quick_push (add0);
 
   complex_perm_kinds_t kind = linear_loads_p (perm_cache, left_op[0]);
-  if (kind == PERM_EVENODD)
+  if (kind == PERM_EVENODD || kind == PERM_TOP)
     {
       ops->quick_push (left_op[1]);
       ops->quick_push (right_op[1]);
       ops->quick_push (left_op[0]);
     }
-  else if (kind == PERM_TOP)
-    {
-      ops->quick_push (left_op[1]);
-      ops->quick_push (right_op[1]);
-      ops->quick_push (left_op[0]);
-    }
-  else if (kind == PERM_EVENEVEN && !conj_first_operand)
+  else if (kind == PERM_EVENEVEN && status != CONJ_SND)
     {
       ops->quick_push (left_op[0]);
       ops->quick_push (right_op[0]);
@@ -1061,13 +1119,14 @@ complex_mul_pattern::matches (complex_operation_t op,
 
 vect_pattern*
 complex_mul_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
+				slp_compat_nodes_map_t *compat_cache,
 				slp_tree *node)
 {
   auto_vec<slp_tree> ops;
   complex_operation_t op
     = vect_detect_pair_op (*node, true, &ops);
   internal_fn ifn
-    = complex_mul_pattern::matches (op, perm_cache, node, &ops);
+    = complex_mul_pattern::matches (op, perm_cache, compat_cache, node, &ops);
   if (ifn == IFN_LAST)
     return NULL;
 
@@ -1115,9 +1174,9 @@ complex_mul_pattern::build (vec_info *vinfo)
 
 	/* First re-arrange the children.  */
 	SLP_TREE_CHILDREN (*this->m_node).safe_grow (3);
-	SLP_TREE_CHILDREN (*this->m_node)[0] = this->m_ops[0];
-	SLP_TREE_CHILDREN (*this->m_node)[1] = this->m_ops[3];
-	SLP_TREE_CHILDREN (*this->m_node)[2] = newnode;
+	SLP_TREE_CHILDREN (*this->m_node)[0] = this->m_ops[3];
+	SLP_TREE_CHILDREN (*this->m_node)[1] = newnode;
+	SLP_TREE_CHILDREN (*this->m_node)[2] = this->m_ops[0];
 
 	/* Tell the builder to expect an extra argument.  */
 	this->m_num_args++;
@@ -1147,11 +1206,12 @@ class complex_fms_pattern : public complex_pattern
   public:
     void build (vec_info *);
     static internal_fn
-    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *, slp_tree *,
-	     vec<slp_tree> *);
+    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *,
+	     slp_compat_nodes_map_t *, slp_tree *, vec<slp_tree> *);
 
     static vect_pattern*
-    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+    recognize (slp_tree_to_load_perm_map_t *, slp_compat_nodes_map_t *,
+	       slp_tree *);
 
     static vect_pattern*
     mkInstance (slp_tree *node, vec<slp_tree> *m_ops, internal_fn ifn)
@@ -1182,6 +1242,7 @@ class complex_fms_pattern : public complex_pattern
 internal_fn
 complex_fms_pattern::matches (complex_operation_t op,
 			      slp_tree_to_load_perm_map_t *perm_cache,
+			      slp_compat_nodes_map_t *compat_cache,
 			      slp_tree * ref_node, vec<slp_tree> *ops)
 {
   internal_fn ifn = IFN_LAST;
@@ -1197,6 +1258,8 @@ complex_fms_pattern::matches (complex_operation_t op,
   if (!vect_match_expression_p (root, MINUS_EXPR))
     return IFN_LAST;
 
+  /* TODO: Support invariants here.  With the new layout, CADD can now
+     match before we get a chance to try CFMS.  */
   auto nodes = SLP_TREE_CHILDREN (root);
   if (!vect_match_expression_p (nodes[1], MULT_EXPR)
       || vect_detect_pair_op (nodes[0]) != PLUS_MINUS)
@@ -1217,16 +1280,14 @@ complex_fms_pattern::matches (complex_operation_t op,
       || !vect_match_expression_p (l0node[1], MULT_EXPR))
     return IFN_LAST;
 
-  bool is_neg = vect_normalize_conj_loc (left_op);
-
-  bool conj_first_operand = false;
-  if (!vect_validate_multiplication (perm_cache, right_op, left_op, false,
-				     &conj_first_operand, true))
+  enum _conj_status status;
+  if (!vect_validate_multiplication (perm_cache, compat_cache, right_op,
+				     left_op, true, &status))
     return IFN_LAST;
 
-  if (!is_neg)
+  if (status == CONJ_NONE)
     ifn = IFN_COMPLEX_FMS;
-  else if (is_neg)
+  else
     ifn = IFN_COMPLEX_FMS_CONJ;
 
   if (!vect_pattern_validate_optab (ifn, *ref_node))
@@ -1243,26 +1304,12 @@ complex_fms_pattern::matches (complex_operation_t op,
       ops->quick_push (right_op[1]);
       ops->quick_push (left_op[1]);
     }
-  else if (kind == PERM_TOP)
-    {
-      ops->quick_push (l0node[0]);
-      ops->quick_push (right_op[1]);
-      ops->quick_push (right_op[0]);
-      ops->quick_push (left_op[0]);
-    }
-  else if (kind == PERM_EVENEVEN && !is_neg)
-    {
-      ops->quick_push (l0node[0]);
-      ops->quick_push (right_op[1]);
-      ops->quick_push (right_op[0]);
-      ops->quick_push (left_op[0]);
-    }
   else
     {
       ops->quick_push (l0node[0]);
       ops->quick_push (right_op[1]);
       ops->quick_push (right_op[0]);
-      ops->quick_push (left_op[1]);
+      ops->quick_push (left_op[0]);
     }
 
   return ifn;
@@ -1272,13 +1319,14 @@ complex_fms_pattern::matches (complex_operation_t op,
 
 vect_pattern*
 complex_fms_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
+				slp_compat_nodes_map_t *compat_cache,
 				slp_tree *node)
 {
   auto_vec<slp_tree> ops;
   complex_operation_t op
     = vect_detect_pair_op (*node, true, &ops);
   internal_fn ifn
-    = complex_fms_pattern::matches (op, perm_cache, node, &ops);
+    = complex_fms_pattern::matches (op, perm_cache, compat_cache, node, &ops);
   if (ifn == IFN_LAST)
     return NULL;
 
@@ -1305,9 +1353,9 @@ complex_fms_pattern::build (vec_info *vinfo)
   SLP_TREE_CHILDREN (*this->m_node).create (3);
 
   /* First re-arrange the children.  */
-  SLP_TREE_CHILDREN (*this->m_node).quick_push (this->m_ops[0]);
   SLP_TREE_CHILDREN (*this->m_node).quick_push (this->m_ops[1]);
   SLP_TREE_CHILDREN (*this->m_node).quick_push (newnode);
+  SLP_TREE_CHILDREN (*this->m_node).quick_push (this->m_ops[0]);
 
   /* And then rewrite the node itself.  */
   complex_pattern::build (vinfo);
@@ -1334,11 +1382,12 @@ class complex_operations_pattern : public complex_pattern
   public:
     void build (vec_info *);
     static internal_fn
-    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *, slp_tree *,
-	     vec<slp_tree> *);
+    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *,
+	     slp_compat_nodes_map_t *, slp_tree *, vec<slp_tree> *);
 
     static vect_pattern*
-    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+    recognize (slp_tree_to_load_perm_map_t *, slp_compat_nodes_map_t *,
+	       slp_tree *);
 };
 
 /* Dummy matches implementation for proxy object.  */
@@ -1347,6 +1396,7 @@ internal_fn
 complex_operations_pattern::
 matches (complex_operation_t /* op */,
 	 slp_tree_to_load_perm_map_t * /* perm_cache */,
+	 slp_compat_nodes_map_t * /* compat_cache */,
 	 slp_tree * /* ref_node */, vec<slp_tree> * /* ops */)
 {
   return IFN_LAST;
@@ -1356,6 +1406,7 @@ matches (complex_operation_t /* op */,
 
 vect_pattern*
 complex_operations_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
+				       slp_compat_nodes_map_t *ccache,
 				       slp_tree *node)
 {
   auto_vec<slp_tree> ops;
@@ -1363,15 +1414,15 @@ complex_operations_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
     = vect_detect_pair_op (*node, true, &ops);
   internal_fn ifn = IFN_LAST;
 
-  ifn  = complex_fms_pattern::matches (op, perm_cache, node, &ops);
+  ifn  = complex_fms_pattern::matches (op, perm_cache, ccache, node, &ops);
   if (ifn != IFN_LAST)
     return complex_fms_pattern::mkInstance (node, &ops, ifn);
 
-  ifn  = complex_mul_pattern::matches (op, perm_cache, node, &ops);
+  ifn  = complex_mul_pattern::matches (op, perm_cache, ccache, node, &ops);
   if (ifn != IFN_LAST)
     return complex_mul_pattern::mkInstance (node, &ops, ifn);
 
-  ifn  = complex_add_pattern::matches (op, perm_cache, node, &ops);
+  ifn  = complex_add_pattern::matches (op, perm_cache, ccache, node, &ops);
   if (ifn != IFN_LAST)
     return complex_add_pattern::mkInstance (node, &ops, ifn);
 
@@ -1398,11 +1449,13 @@ class addsub_pattern : public vect_pattern
     void build (vec_info *);
 
     static vect_pattern*
-    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+    recognize (slp_tree_to_load_perm_map_t *, slp_compat_nodes_map_t *,
+	       slp_tree *);
 };
 
 vect_pattern *
-addsub_pattern::recognize (slp_tree_to_load_perm_map_t *, slp_tree *node_)
+addsub_pattern::recognize (slp_tree_to_load_perm_map_t *,
+			   slp_compat_nodes_map_t *, slp_tree *node_)
 {
   slp_tree node = *node_;
   if (SLP_TREE_CODE (node) != VEC_PERM_EXPR
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index b912c3577df61a694d5bb9e22c5303fe6a48ab6e..cb577f8a612d583254e42bb06a6d7a0875de5e75 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -804,7 +804,7 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char swap,
 /* Return true if call statements CALL1 and CALL2 are similar enough
    to be combined into the same SLP group.  */
 
-static bool
+bool
 compatible_calls_p (gcall *call1, gcall *call2)
 {
   unsigned int nargs = gimple_call_num_args (call1);
@@ -2907,6 +2907,7 @@ optimize_load_redistribution (scalar_stmts_to_slp_tree_map_t *bst_map,
 static bool
 vect_match_slp_patterns_2 (slp_tree *ref_node, vec_info *vinfo,
 			   slp_tree_to_load_perm_map_t *perm_cache,
+			   slp_compat_nodes_map_t *compat_cache,
 			   hash_set<slp_tree> *visited)
 {
   unsigned i;
@@ -2918,11 +2919,13 @@ vect_match_slp_patterns_2 (slp_tree *ref_node, vec_info *vinfo,
   slp_tree child;
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
     found_p |= vect_match_slp_patterns_2 (&SLP_TREE_CHILDREN (node)[i],
-					  vinfo, perm_cache, visited);
+					  vinfo, perm_cache, compat_cache,
+					  visited);
 
   for (unsigned x = 0; x < num__slp_patterns; x++)
     {
-      vect_pattern *pattern = slp_patterns[x] (perm_cache, ref_node);
+      vect_pattern *pattern
+	= slp_patterns[x] (perm_cache, compat_cache, ref_node);
       if (pattern)
 	{
 	  pattern->build (vinfo);
@@ -2943,7 +2946,8 @@ vect_match_slp_patterns_2 (slp_tree *ref_node, vec_info *vinfo,
 static bool
 vect_match_slp_patterns (slp_instance instance, vec_info *vinfo,
 			 hash_set<slp_tree> *visited,
-			 slp_tree_to_load_perm_map_t *perm_cache)
+			 slp_tree_to_load_perm_map_t *perm_cache,
+			 slp_compat_nodes_map_t *compat_cache)
 {
   DUMP_VECT_SCOPE ("vect_match_slp_patterns");
   slp_tree *ref_node = &SLP_INSTANCE_TREE (instance);
@@ -2953,7 +2957,8 @@ vect_match_slp_patterns (slp_instance instance, vec_info *vinfo,
 		     "Analyzing SLP tree %p for patterns\n",
 		     SLP_INSTANCE_TREE (instance));
 
-  return vect_match_slp_patterns_2 (ref_node, vinfo, perm_cache, visited);
+  return vect_match_slp_patterns_2 (ref_node, vinfo, perm_cache, compat_cache,
+				    visited);
 }
 
 /* STMT_INFO is a store group of size GROUP_SIZE that we are considering
@@ -3437,12 +3442,14 @@ vect_analyze_slp (vec_info *vinfo, unsigned max_tree_size)
 
   hash_set<slp_tree> visited_patterns;
   slp_tree_to_load_perm_map_t perm_cache;
+  slp_compat_nodes_map_t compat_cache;
 
   /* See if any patterns can be found in the SLP tree.  */
   bool pattern_found = false;
   FOR_EACH_VEC_ELT (LOOP_VINFO_SLP_INSTANCES (vinfo), i, instance)
     pattern_found |= vect_match_slp_patterns (instance, vinfo,
-					      &visited_patterns, &perm_cache);
+					      &visited_patterns, &perm_cache,
+					      &compat_cache);
 
   /* If any were found optimize permutations of loads.  */
   if (pattern_found)
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 2f6e1e268fb07e9de065ff9c45af87546e565d66..83cd0919c7838c65576e1debd881e0ec636a605a 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2268,6 +2268,7 @@ extern void duplicate_and_interleave (vec_info *, gimple_seq *, tree,
 extern int vect_get_place_in_interleaving_chain (stmt_vec_info, stmt_vec_info);
 extern slp_tree vect_create_new_slp_node (unsigned, tree_code);
 extern void vect_free_slp_tree (slp_tree);
+extern bool compatible_calls_p (gcall *, gcall *);
 
 /* In tree-vect-patterns.c.  */
 extern void
@@ -2306,6 +2307,12 @@ typedef enum _complex_perm_kinds {
 typedef hash_map <slp_tree, complex_perm_kinds_t>
   slp_tree_to_load_perm_map_t;
 
+/* Cache mapping a pair of nodes to whether they are compatible.  */
+typedef pair_hash <nofree_ptr_hash <_slp_tree>,
+		   nofree_ptr_hash <_slp_tree>> slp_node_hash;
+typedef hash_map <slp_node_hash, bool> slp_compat_nodes_map_t;
+
+
 /* Vector pattern matcher base class.  All SLP pattern matchers must inherit
    from this type.  */
 
@@ -2338,7 +2345,8 @@ class vect_pattern
   public:
 
     /* Create a new instance of the pattern matcher class of the given type.  */
-    static vect_pattern* recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+    static vect_pattern* recognize (slp_tree_to_load_perm_map_t *,
+				    slp_compat_nodes_map_t *, slp_tree *);
 
     /* Build the pattern from the data collected so far.  */
     virtual void build (vec_info *) = 0;
@@ -2352,6 +2360,7 @@ class vect_pattern
 
 /* Function pointer to create a new pattern matcher from a generic type.  */
 typedef vect_pattern* (*vect_pattern_decl_t) (slp_tree_to_load_perm_map_t *,
+					      slp_compat_nodes_map_t *,
 					      slp_tree *);
 
 /* List of supported pattern matchers.  */
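
As a reading aid for the compat_cache plumbing above, here is a minimal
standalone sketch of the caching idea: a map keyed on a pair of nodes that
is seeded with "false" before the comparison starts and flipped to "true"
only once every check has passed.  Node, CompatCache, compatible_p and the
walks counter are hypothetical names for illustration, and the recursion
over children is an assumption of the sketch; this is not the GCC
implementation.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an SLP node: an operation code plus children.
struct Node
{
  int code;
  std::vector<const Node *> children;
};

// Hypothetical analogue of slp_compat_nodes_map_t: (node, node) -> bool.
using CompatCache = std::map<std::pair<const Node *, const Node *>, bool>;

// Counts comparisons that were not answered from the cache.
static int walks = 0;

// Return true if A and B have the same shape and operation codes.
bool
compatible_p (CompatCache &cache, const Node *a, const Node *b)
{
  auto key = std::make_pair (a, b);
  auto it = cache.find (key);
  if (it != cache.end ())
    return it->second;

  ++walks;

  // Seed a pessimistic answer so that every early exit below is cached.
  cache[key] = false;

  if (a->code != b->code || a->children.size () != b->children.size ())
    return false;

  for (std::size_t i = 0; i < a->children.size (); i++)
    if (!compatible_p (cache, a->children[i], b->children[i]))
      return false;

  // All checks passed; overwrite the pessimistic entry.
  cache[key] = true;
  return true;
}
```

Seeding the entry up front means each "return false" needs no extra
bookkeeping, and a second query for the same pair never re-walks the
subtree.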

[-- Attachment #2: rb15145.patch --]
[-- Type: application/octet-stream, Size: 41026 bytes --]

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 9ec051e94e10cca9eec2773e1b8c01b74b6ea4db..ad06b02d36876082afe4c3f3fb51887f7a522b23 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -6325,12 +6325,13 @@ Perform a vector multiply and accumulate that is semantically the same as
 a multiply and accumulate of complex numbers.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
+  complex TYPE op3[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] += a[i] * b[i];
+      op0[i] = op1[i] * op2[i] + op3[i];
     @}
 @end smallexample
 
@@ -6348,12 +6349,13 @@ the same as a multiply and accumulate of complex numbers where the second
 multiply arguments is conjugated.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
+  complex TYPE op3[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] += a[i] * conj (b[i]);
+      op0[i] = op1[i] * conj (op2[i]) + op3[i];
     @}
 @end smallexample
 
@@ -6370,12 +6372,13 @@ Perform a vector multiply and subtract that is semantically the same as
 a multiply and subtract of complex numbers.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
+  complex TYPE op3[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] -= a[i] * b[i];
+      op0[i] = op1[i] * op2[i] - op3[i];
     @}
 @end smallexample
 
@@ -6393,12 +6396,13 @@ the same as a multiply and subtract of complex numbers where the second
 multiply arguments is conjugated.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
+  complex TYPE op3[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] -= a[i] * conj (b[i]);
+      op0[i] = op1[i] * conj (op2[i]) - op3[i];
     @}
 @end smallexample
 
@@ -6415,12 +6419,12 @@ Perform a vector multiply that is semantically the same as multiply of
 complex numbers.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] = a[i] * b[i];
+      op0[i] = op1[i] * op2[i];
     @}
 @end smallexample
 
@@ -6437,12 +6441,12 @@ Perform a vector multiply by conjugate that is semantically the same as a
 multiply of complex numbers where the second multiply arguments is conjugated.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] = a[i] * conj (b[i]);
+      op0[i] = op1[i] * conj (op2[i]);
     @}
 @end smallexample
 
diff --git a/gcc/testsuite/g++.dg/vect/pr99149.cc b/gcc/testsuite/g++.dg/vect/pr99149.cc
index e6e0594a336fa053ffba64a12e2de43a4e373f49..bb9f5fa89f12b184368bf5488d6e9432c2166463 100755
--- a/gcc/testsuite/g++.dg/vect/pr99149.cc
+++ b/gcc/testsuite/g++.dg/vect/pr99149.cc
@@ -24,4 +24,4 @@ public:
 } n;
 main() { n.j(); }
 
-/* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_MUL" 1 "slp2" } } */
+/* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_MUL" 1 "slp2" { xfail { vect_float } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-1.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..46b9a55f05279d732fa1418e02f779cf693ede07
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void bad1(float v1, float v2)
+{
+  for (int r = 0; r < 100; r += 4)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * (f[2][r] + v2) - f[1][i] * (f[2][i] + v1);
+      f[0][i] = f[1][r] * (f[2][i] + v1) + f[1][i] * (f[2][r] + v2);
+      f[0][r+2] = f[1][r+2] * (f[2][r+2] + v2) - f[1][i+2] * (f[2][i+2] + v1);
+      f[0][i+2] = f[1][r+2] * (f[2][i+2] + v1) + f[1][i+2] * (f[2][r+2] + v2);
+      //                  ^^^^^^^             ^^^^^^^
+    }
+}
+
+/* { dg-final { scan-tree-dump "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-2.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-2.c
new file mode 100644
index 0000000000000000000000000000000000000000..ffe646efe57f7ad07541b0fb96601596f46dc5f8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void bad1(float v1, float v2)
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * (f[2][r] + v1) - f[1][i] * (f[2][i] + v2);
+      f[0][i] = f[1][r] * (f[2][i] + v1) + f[1][i] * (f[2][r] + v2);
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-3.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-3.c
new file mode 100644
index 0000000000000000000000000000000000000000..5f98aa204d8b11b0cb433f8965dbb72cf8940de1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void good1(float v1, float v2)
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * (f[2][r] + v2) - f[1][i] * (f[2][i] + v1);
+      f[0][i] = f[1][r] * (f[2][i] + v1) + f[1][i] * (f[2][r] + v2);
+    }
+}
+
+/* { dg-final { scan-tree-dump "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-4.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-4.c
new file mode 100644
index 0000000000000000000000000000000000000000..882851789c5085e734000609114be480d3b08bd0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-4.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void good1()
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * f[2][r] - f[1][i] * f[2][i];
+      f[0][i] = f[1][r] * f[2][i] + f[1][i] * f[2][r];
+    }
+}
+
+/* { dg-final { scan-tree-dump "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-5.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-5.c
new file mode 100644
index 0000000000000000000000000000000000000000..6a2d549d65f3f27d407fb0bd469473e6a5c333ae
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-5.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void good2()
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * (f[2][r] + 1) - f[1][i] * (f[2][i] + 1);
+      f[0][i] = f[1][r] * (f[2][i] + 1) + f[1][i] * (f[2][r] + 1);
+    }
+}
+
+/* { dg-final { scan-tree-dump "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-6.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-6.c
new file mode 100644
index 0000000000000000000000000000000000000000..71e66dbe3b29eec1fffb8df9b216022fdc0af54e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-6.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void bad1()
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * f[2][r] - f[1][i] * f[3][i];
+      f[0][i] = f[1][r] * f[2][i] + f[1][i] * f[3][r];
+      //                  ^^^^^^^             ^^^^^^^
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-7.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-7.c
new file mode 100644
index 0000000000000000000000000000000000000000..536672f3c8bb474ad5fa4bb61b3a36b555acf3cf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-7.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void bad2()
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * (f[2][r] + 1) - f[1][i] * f[2][i];
+      f[0][i] = f[1][r] * (f[2][i] + 1) + f[1][i] * f[2][r];
+      //                          ^^^^
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-8.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-8.c
new file mode 100644
index 0000000000000000000000000000000000000000..07b48148688b7d530e5891d023d558b58a485c23
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-8.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+float f[12][100];
+
+void bad3()
+{
+  for (int r = 0; r < 100; r += 2)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * f[2][r] - f[1][r] * f[2][i];
+      f[0][i] = f[1][r] * f[2][i] + f[1][i] * f[2][r];
+      //                            ^^^^^^^
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "Found COMPLEX_MUL" "vect" { target { vect_float } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr102819-9.c b/gcc/testsuite/gcc.dg/vect/complex/pr102819-9.c
new file mode 100644
index 0000000000000000000000000000000000000000..7655852434b21b381fe7ee316e8caf3d485b8ee1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr102819-9.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+
+#include <stdio.h>
+#include <complex.h>
+
+#define N 200
+#define TYPE float
+#define TYPE2 float
+
+void g (TYPE2 complex a[restrict N], TYPE complex b[restrict N], TYPE complex c[restrict N])
+{
+  for (int i=0; i < N; i++)
+    {
+      c[i] -=  a[i] * b[0];
+    }
+}
+
+/* The pattern overlaps with COMPLEX_ADD so we need to support consuming ADDs in COMPLEX_FMS.  */
+
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMS" "vect" { xfail { vect_float } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/complex/pr103169.c b/gcc/testsuite/gcc.dg/vect/complex/pr103169.c
new file mode 100644
index 0000000000000000000000000000000000000000..1bfabbd85a0eedfb4156a82574324126e9083fc5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/complex/pr103169.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { vect_double } } } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-O2 -fvect-cost-model=unlimited" } */
+
+_Complex double b_0, c_0;
+
+void
+mul270snd (void)
+{
+  c_0 = b_0 * 1.0iF * 1.0iF;
+}
+
diff --git a/gcc/tree-data-ref.h b/gcc/tree-data-ref.h
index 74f579c9f3f23bac25d21546068c2ab43209aa2b..8ad5fa521279b20fa5e63eecf442d5dc5c16e7ee 100644
--- a/gcc/tree-data-ref.h
+++ b/gcc/tree-data-ref.h
@@ -600,10 +600,11 @@ same_data_refs_base_objects (data_reference_p a, data_reference_p b)
 }
 
 /* Return true when the data references A and B are accessing the same
-   memory object with the same access functions.  */
+   memory object with the same access functions.  Optionally skip the
+   last OFFSET dimensions in the data reference.  */
 
 static inline bool
-same_data_refs (data_reference_p a, data_reference_p b)
+same_data_refs (data_reference_p a, data_reference_p b, int offset = 0)
 {
   unsigned int i;
 
@@ -614,7 +615,7 @@ same_data_refs (data_reference_p a, data_reference_p b)
   if (!same_data_refs_base_objects (a, b))
     return false;
 
-  for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
+  for (i = offset; i < DR_NUM_DIMENSIONS (a); i++)
     if (!eq_evolutions_p (DR_ACCESS_FN (a, i), DR_ACCESS_FN (b, i)))
       return false;
 
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index 0350441fad9690cd5d04337171ca3470a064a571..020c29bba08c5bd80503a2dbc04292f8fd310b3c 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -149,12 +149,13 @@ is_linear_load_p (load_permutation_t loads)
   int valid_patterns = 4;
   FOR_EACH_VEC_ELT (loads, i, load)
     {
-      if (candidates[0] != PERM_UNKNOWN && load != 1)
+      unsigned adj_load = load % 2;
+      if (candidates[0] != PERM_UNKNOWN && adj_load != 1)
 	{
 	  candidates[0] = PERM_UNKNOWN;
 	  valid_patterns--;
 	}
-      if (candidates[1] != PERM_UNKNOWN && load != 0)
+      if (candidates[1] != PERM_UNKNOWN && adj_load != 0)
 	{
 	  candidates[1] = PERM_UNKNOWN;
 	  valid_patterns--;
@@ -596,11 +597,12 @@ class complex_add_pattern : public complex_pattern
   public:
     void build (vec_info *);
     static internal_fn
-    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *, slp_tree *,
-	     vec<slp_tree> *);
+    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *,
+	     slp_compat_nodes_map_t *, slp_tree *, vec<slp_tree> *);
 
     static vect_pattern*
-    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+    recognize (slp_tree_to_load_perm_map_t *, slp_compat_nodes_map_t *,
+	       slp_tree *);
 
     static vect_pattern*
     mkInstance (slp_tree *node, vec<slp_tree> *m_ops, internal_fn ifn)
@@ -647,6 +649,7 @@ complex_add_pattern::build (vec_info *vinfo)
 internal_fn
 complex_add_pattern::matches (complex_operation_t op,
 			      slp_tree_to_load_perm_map_t *perm_cache,
+			      slp_compat_nodes_map_t * /* compat_cache */,
 			      slp_tree *node, vec<slp_tree> *ops)
 {
   internal_fn ifn = IFN_LAST;
@@ -692,13 +695,14 @@ complex_add_pattern::matches (complex_operation_t op,
 
 vect_pattern*
 complex_add_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
+				slp_compat_nodes_map_t *compat_cache,
 				slp_tree *node)
 {
   auto_vec<slp_tree> ops;
   complex_operation_t op
     = vect_detect_pair_op (*node, true, &ops);
   internal_fn ifn
-    = complex_add_pattern::matches (op, perm_cache, node, &ops);
+    = complex_add_pattern::matches (op, perm_cache, compat_cache, node, &ops);
   if (ifn == IFN_LAST)
     return NULL;
 
@@ -709,147 +713,214 @@ complex_add_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
  * complex_mul_pattern
  ******************************************************************************/
 
-/* Check to see if either of the trees in ARGS are a NEGATE_EXPR.  If the first
-   child (args[0]) is a NEGATE_EXPR then NEG_FIRST_P is set to TRUE.
-
-   If a negate is found then the values in ARGS are reordered such that the
-   negate node is always the second one and the entry is replaced by the child
-   of the negate node.  */
+/* Helper function to check if PERM is KIND or PERM_TOP.  */
 
 static inline bool
-vect_normalize_conj_loc (vec<slp_tree> &args, bool *neg_first_p = NULL)
+is_eq_or_top (slp_tree_to_load_perm_map_t *perm_cache,
+	      slp_tree op1, complex_perm_kinds_t kind1,
+	      slp_tree op2, complex_perm_kinds_t kind2)
 {
-  gcc_assert (args.length () == 2);
-  bool neg_found = false;
-
-  if (vect_match_expression_p (args[0], NEGATE_EXPR))
-    {
-      std::swap (args[0], args[1]);
-      neg_found = true;
-      if (neg_first_p)
-	*neg_first_p = true;
-    }
-  else if (vect_match_expression_p (args[1], NEGATE_EXPR))
-    {
-      neg_found = true;
-      if (neg_first_p)
-	*neg_first_p = false;
-    }
+  complex_perm_kinds_t perm1 = linear_loads_p (perm_cache, op1);
+  if (perm1 != kind1 && perm1 != PERM_TOP)
+    return false;
 
-  if (neg_found)
-    args[1] = SLP_TREE_CHILDREN (args[1])[0];
+  complex_perm_kinds_t perm2 = linear_loads_p (perm_cache, op2);
+  if (perm2 != kind2 && perm2 != PERM_TOP)
+    return false;
 
-  return neg_found;
+  return true;
 }
 
-/* Helper function to check if PERM is KIND or PERM_TOP.  */
+enum _conj_status { CONJ_NONE, CONJ_FST, CONJ_SND };
 
 static inline bool
-is_eq_or_top (complex_perm_kinds_t perm, complex_perm_kinds_t kind)
+compatible_complex_nodes_p (slp_compat_nodes_map_t *compat_cache,
+			    slp_tree a, int *pa, slp_tree b, int *pb)
 {
-  return perm == kind || perm == PERM_TOP;
-}
+  bool *tmp;
+  std::pair<slp_tree, slp_tree> key = std::make_pair (a, b);
+  if ((tmp = compat_cache->get (key)) != NULL)
+    return *tmp;
 
-/* Helper function that checks to see if LEFT_OP and RIGHT_OP are both MULT_EXPR
-   nodes but also that they represent an operation that is either a complex
-   multiplication or a complex multiplication by conjugated value.
+  compat_cache->put (key, false);
 
-   Of the negation is expected to be in the first half of the tree (As required
-   by an FMS pattern) then NEG_FIRST is true.  If the operation is a conjugate
-   operation then CONJ_FIRST_OPERAND is set to indicate whether the first or
-   second operand contains the conjugate operation.  */
+  if (SLP_TREE_CHILDREN (a).length () != SLP_TREE_CHILDREN (b).length ())
+    return false;
 
-static inline bool
-vect_validate_multiplication (slp_tree_to_load_perm_map_t *perm_cache,
-			      const vec<slp_tree> &left_op,
-			      const vec<slp_tree> &right_op,
-			     bool neg_first, bool *conj_first_operand,
-			     bool fms)
-{
-  /* The presence of a negation indicates that we have either a conjugate or a
-     rotation.  We need to distinguish which one.  */
-  *conj_first_operand = false;
-  complex_perm_kinds_t kind;
-
-  /* Complex conjugates have the negation on the imaginary part of the
-     number where rotations affect the real component.  So check if the
-     negation is on a dup of lane 1.  */
-  if (fms)
+  if (SLP_TREE_DEF_TYPE (a) != SLP_TREE_DEF_TYPE (b))
+    return false;
+
+  /* Only internal nodes can be loads, so we cannot check any further if the
+     nodes are externals.  */
+  if (SLP_TREE_DEF_TYPE (a) != vect_internal_def)
     {
-      /* Canonicalization for fms is not consistent. So have to test both
-	 variants to be sure.  This needs to be fixed in the mid-end so
-	 this part can be simpler.  */
-      kind = linear_loads_p (perm_cache, right_op[0]);
-      if (!((is_eq_or_top (linear_loads_p (perm_cache, right_op[0]), PERM_ODDODD)
-	   && is_eq_or_top (linear_loads_p (perm_cache, right_op[1]),
-			     PERM_ODDEVEN))
-	  || (kind == PERM_ODDEVEN
-	      && is_eq_or_top (linear_loads_p (perm_cache, right_op[1]),
-			     PERM_ODDODD))))
-	return false;
+      for (unsigned i = 0; i < SLP_TREE_SCALAR_OPS (a).length (); i++)
+	{
+	  tree op1 = SLP_TREE_SCALAR_OPS (a)[pa[i % 2]];
+	  tree op2 = SLP_TREE_SCALAR_OPS (b)[pb[i % 2]];
+	  if (!operand_equal_p (op1, op2, 0))
+	    return false;
+	}
+
+      compat_cache->put (key, true);
+      return true;
     }
+
+  auto a_stmt = STMT_VINFO_STMT (SLP_TREE_REPRESENTATIVE (a));
+  auto b_stmt = STMT_VINFO_STMT (SLP_TREE_REPRESENTATIVE (b));
+
+  if (gimple_code (a_stmt) != gimple_code (b_stmt))
+    return false;
+
+  /* code, children, type, externals, loads, constants  */
+  if (gimple_num_args (a_stmt) != gimple_num_args (b_stmt))
+    return false;
+
+  /* At this point, a and b are known to be the same gimple operations.  */
+  if (is_gimple_call (a_stmt))
+    {
+	if (!compatible_calls_p (dyn_cast <gcall *> (a_stmt),
+				 dyn_cast <gcall *> (b_stmt)))
+	  return false;
+    }
+  else if (!is_gimple_assign (a_stmt))
+    return false;
   else
     {
-      if (linear_loads_p (perm_cache, right_op[1]) != PERM_ODDODD
-	  && !is_eq_or_top (linear_loads_p (perm_cache, right_op[0]),
-			    PERM_ODDEVEN))
+      tree_code acode = gimple_assign_rhs_code (a_stmt);
+      tree_code bcode = gimple_assign_rhs_code (b_stmt);
+      if ((acode == REALPART_EXPR || acode == IMAGPART_EXPR)
+	  && (bcode == REALPART_EXPR || bcode == IMAGPART_EXPR))
+	return true;
+
+      if (acode != bcode)
 	return false;
     }
 
-  /* Deal with differences in indexes.  */
-  int index1 = fms ? 1 : 0;
-  int index2 = fms ? 0 : 1;
-
-  /* Check if the conjugate is on the second first or second operand.  The
-     order of the node with the conjugate value determines this, and the dup
-     node must be one of lane 0 of the same DR as the neg node.  */
-  kind = linear_loads_p (perm_cache, left_op[index1]);
-  if (kind == PERM_TOP)
+  if (!SLP_TREE_LOAD_PERMUTATION (a).exists ()
+      || !SLP_TREE_LOAD_PERMUTATION (b).exists ())
     {
-      if (linear_loads_p (perm_cache, left_op[index2]) == PERM_EVENODD)
-	return true;
+      for (unsigned i = 0; i < gimple_num_args (a_stmt); i++)
+	{
+	  tree t1 = gimple_arg (a_stmt, i);
+	  tree t2 = gimple_arg (b_stmt, i);
+	  if (TREE_CODE (t1) != TREE_CODE (t2))
+	    return false;
+
+	  /* If SSA name then we will need to inspect the children
+	     so we can punt here.  */
+	  if (TREE_CODE (t1) == SSA_NAME)
+	    continue;
+
+	  if (!operand_equal_p (t1, t2, 0))
+	    return false;
+	}
     }
-  else if (kind == PERM_EVENODD && !neg_first)
+  else
     {
-      if ((kind = linear_loads_p (perm_cache, left_op[index2])) != PERM_EVENEVEN)
+      auto dr1 = STMT_VINFO_DATA_REF (SLP_TREE_REPRESENTATIVE (a));
+      auto dr2 = STMT_VINFO_DATA_REF (SLP_TREE_REPRESENTATIVE (b));
+      /* Don't check the last dimension as that's checked by the linearity
+	 checks.  This check is also much stricter than what we need
+	 because it doesn't consider loading from adjacent elements
+	 in the same struct as loading from the same base object.
+	 But for now, I'll play it safe.  */
+      if (!same_data_refs (dr1, dr2, 1))
 	return false;
-      return true;
     }
-  else if (kind == PERM_EVENEVEN && neg_first)
+
+  for (unsigned i = 0; i < SLP_TREE_CHILDREN (a).length (); i++)
     {
-      if ((kind = linear_loads_p (perm_cache, left_op[index2])) != PERM_EVENODD)
+      if (!compatible_complex_nodes_p (compat_cache,
+				       SLP_TREE_CHILDREN (a)[i], pa,
+				       SLP_TREE_CHILDREN (b)[i], pb))
 	return false;
-
-      *conj_first_operand = true;
-      return true;
     }
-  else
-    return false;
-
-  if (kind != PERM_EVENEVEN)
-    return false;
 
+  compat_cache->put (key, true);
   return true;
 }
 
-/* Helper function to help distinguish between a conjugate and a rotation in a
-   complex multiplication.  The operations have similar shapes but the order of
-   the load permutes are different.  This function returns TRUE when the order
-   is consistent with a multiplication or multiplication by conjugated
-   operand but returns FALSE if it's a multiplication by rotated operand.  */
-
 static inline bool
 vect_validate_multiplication (slp_tree_to_load_perm_map_t *perm_cache,
-			      const vec<slp_tree> &op,
-			      complex_perm_kinds_t permKind)
+			      slp_compat_nodes_map_t *compat_cache,
+			      vec<slp_tree> &left_op,
+			      vec<slp_tree> &right_op,
+			      bool subtract,
+			      enum _conj_status *_status)
 {
-  /* The left node is the more common case, test it first.  */
-  if (!is_eq_or_top (linear_loads_p (perm_cache, op[0]), permKind))
+  auto_vec<slp_tree> ops;
+  enum _conj_status stats = CONJ_NONE;
+
+  /* The complex operations can occur in two layouts and two permute sequences
+     so declare them and re-use them.  */
+  int styles[][4] = { { 0, 2, 1, 3} /* {L1, R1} + {L2, R2}.  */
+		    , { 0, 3, 1, 2} /* {L1, R2} + {L2, R1}.  */
+		    };
+
+  /* Now for the corresponding permutes that go with these values.  */
+  complex_perm_kinds_t perms[][4]
+    = { { PERM_EVENEVEN, PERM_ODDODD, PERM_EVENODD, PERM_ODDEVEN }
+      , { PERM_EVENODD, PERM_ODDEVEN, PERM_EVENEVEN, PERM_ODDODD }
+      };
+
+  /* These permutes are used during comparisons of externals on which
+     we require strict equality.  */
+  int cq[][4][2]
+    = { { { 0, 0 }, { 1, 1 }, { 0, 1 }, { 1, 0 } }
+      , { { 0, 1 }, { 1, 0 }, { 0, 0 }, { 1, 1 } }
+      };
+
+  /* Default to style and perm 0, most operations use this one.  */
+  int style = 0;
+  int perm = subtract ? 1 : 0;
+
+  /* Check if we have a negate operation, if so absorb the node and continue
+     looking.  */
+  bool neg0 = vect_match_expression_p (right_op[0], NEGATE_EXPR);
+  bool neg1 = vect_match_expression_p (right_op[1], NEGATE_EXPR);
+
+  /* Determine which style we're looking at.  We only have different ones
+     whenever a conjugate is involved.  */
+  if (neg0 && neg1)
+    ;
+  else if (neg0)
     {
-      if (!is_eq_or_top (linear_loads_p (perm_cache, op[1]), permKind))
-	return false;
+      right_op[0] = SLP_TREE_CHILDREN (right_op[0])[0];
+      stats = CONJ_FST;
+      if (subtract)
+	perm = 0;
     }
-  return true;
+  else if (neg1)
+    {
+      right_op[1] = SLP_TREE_CHILDREN (right_op[1])[0];
+      stats = CONJ_SND;
+      perm = 1;
+    }
+
+  *_status = stats;
+
+  /* Flatten the inputs after we've remapped them.  */
+  ops.create (4);
+  ops.safe_splice (left_op);
+  ops.safe_splice (right_op);
+
+  /* Extract out the elements to check.  */
+  slp_tree op0 = ops[styles[style][0]];
+  slp_tree op1 = ops[styles[style][1]];
+  slp_tree op2 = ops[styles[style][2]];
+  slp_tree op3 = ops[styles[style][3]];
+
+  /* Do cheapest test first.  If failed no need to analyze further.  */
+  if (linear_loads_p (perm_cache, op0) != perms[perm][0]
+      || linear_loads_p (perm_cache, op1) != perms[perm][1]
+      || !is_eq_or_top (perm_cache, op2, perms[perm][2], op3, perms[perm][3]))
+    return false;
+
+  return compatible_complex_nodes_p (compat_cache, op0, cq[perm][0], op1,
+				     cq[perm][1])
+	 && compatible_complex_nodes_p (compat_cache, op2, cq[perm][2], op3,
+					cq[perm][3]);
 }
 
 /* This function combines two nodes containing only even and only odd lanes
@@ -908,11 +979,12 @@ class complex_mul_pattern : public complex_pattern
   public:
     void build (vec_info *);
     static internal_fn
-    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *, slp_tree *,
-	     vec<slp_tree> *);
+    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *,
+	     slp_compat_nodes_map_t *, slp_tree *, vec<slp_tree> *);
 
     static vect_pattern*
-    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+    recognize (slp_tree_to_load_perm_map_t *, slp_compat_nodes_map_t *,
+	       slp_tree *);
 
     static vect_pattern*
     mkInstance (slp_tree *node, vec<slp_tree> *m_ops, internal_fn ifn)
@@ -943,6 +1015,7 @@ class complex_mul_pattern : public complex_pattern
 internal_fn
 complex_mul_pattern::matches (complex_operation_t op,
 			      slp_tree_to_load_perm_map_t *perm_cache,
+			      slp_compat_nodes_map_t *compat_cache,
 			      slp_tree *node, vec<slp_tree> *ops)
 {
   internal_fn ifn = IFN_LAST;
@@ -990,17 +1063,13 @@ complex_mul_pattern::matches (complex_operation_t op,
       || linear_loads_p (perm_cache, left_op[1]) == PERM_ODDEVEN)
     return IFN_LAST;
 
-  bool neg_first = false;
-  bool conj_first_operand = false;
-  bool is_neg = vect_normalize_conj_loc (right_op, &neg_first);
+  enum _conj_status status;
+  if (!vect_validate_multiplication (perm_cache, compat_cache, left_op,
+				     right_op, false, &status))
+    return IFN_LAST;
 
-  if (!is_neg)
+  if (status == CONJ_NONE)
     {
-      /* A multiplication needs to multiply agains the real pair, otherwise
-	 the pattern matches that of FMS.   */
-      if (!vect_validate_multiplication (perm_cache, left_op, PERM_EVENEVEN)
-	  || vect_normalize_conj_loc (left_op))
-	return IFN_LAST;
       if (add0)
 	ifn = IFN_COMPLEX_FMA;
       else
@@ -1008,11 +1077,6 @@ complex_mul_pattern::matches (complex_operation_t op,
     }
   else
     {
-      if (!vect_validate_multiplication (perm_cache, left_op, right_op,
-					 neg_first, &conj_first_operand,
-					 false))
-	return IFN_LAST;
-
       if(add0)
 	ifn = IFN_COMPLEX_FMA_CONJ;
       else
@@ -1029,19 +1093,13 @@ complex_mul_pattern::matches (complex_operation_t op,
     ops->quick_push (add0);
 
   complex_perm_kinds_t kind = linear_loads_p (perm_cache, left_op[0]);
-  if (kind == PERM_EVENODD)
+  if (kind == PERM_EVENODD || kind == PERM_TOP)
     {
       ops->quick_push (left_op[1]);
       ops->quick_push (right_op[1]);
       ops->quick_push (left_op[0]);
     }
-  else if (kind == PERM_TOP)
-    {
-      ops->quick_push (left_op[1]);
-      ops->quick_push (right_op[1]);
-      ops->quick_push (left_op[0]);
-    }
-  else if (kind == PERM_EVENEVEN && !conj_first_operand)
+  else if (kind == PERM_EVENEVEN && status != CONJ_SND)
     {
       ops->quick_push (left_op[0]);
       ops->quick_push (right_op[0]);
@@ -1061,13 +1119,14 @@ complex_mul_pattern::matches (complex_operation_t op,
 
 vect_pattern*
 complex_mul_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
+				slp_compat_nodes_map_t *compat_cache,
 				slp_tree *node)
 {
   auto_vec<slp_tree> ops;
   complex_operation_t op
     = vect_detect_pair_op (*node, true, &ops);
   internal_fn ifn
-    = complex_mul_pattern::matches (op, perm_cache, node, &ops);
+    = complex_mul_pattern::matches (op, perm_cache, compat_cache, node, &ops);
   if (ifn == IFN_LAST)
     return NULL;
 
@@ -1115,9 +1174,9 @@ complex_mul_pattern::build (vec_info *vinfo)
 
 	/* First re-arrange the children.  */
 	SLP_TREE_CHILDREN (*this->m_node).safe_grow (3);
-	SLP_TREE_CHILDREN (*this->m_node)[0] = this->m_ops[0];
-	SLP_TREE_CHILDREN (*this->m_node)[1] = this->m_ops[3];
-	SLP_TREE_CHILDREN (*this->m_node)[2] = newnode;
+	SLP_TREE_CHILDREN (*this->m_node)[0] = this->m_ops[3];
+	SLP_TREE_CHILDREN (*this->m_node)[1] = newnode;
+	SLP_TREE_CHILDREN (*this->m_node)[2] = this->m_ops[0];
 
 	/* Tell the builder to expect an extra argument.  */
 	this->m_num_args++;
@@ -1147,11 +1206,12 @@ class complex_fms_pattern : public complex_pattern
   public:
     void build (vec_info *);
     static internal_fn
-    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *, slp_tree *,
-	     vec<slp_tree> *);
+    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *,
+	     slp_compat_nodes_map_t *, slp_tree *, vec<slp_tree> *);
 
     static vect_pattern*
-    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+    recognize (slp_tree_to_load_perm_map_t *, slp_compat_nodes_map_t *,
+	       slp_tree *);
 
     static vect_pattern*
     mkInstance (slp_tree *node, vec<slp_tree> *m_ops, internal_fn ifn)
@@ -1182,6 +1242,7 @@ class complex_fms_pattern : public complex_pattern
 internal_fn
 complex_fms_pattern::matches (complex_operation_t op,
 			      slp_tree_to_load_perm_map_t *perm_cache,
+			      slp_compat_nodes_map_t *compat_cache,
 			      slp_tree * ref_node, vec<slp_tree> *ops)
 {
   internal_fn ifn = IFN_LAST;
@@ -1197,6 +1258,8 @@ complex_fms_pattern::matches (complex_operation_t op,
   if (!vect_match_expression_p (root, MINUS_EXPR))
     return IFN_LAST;
 
+  /* TODO: Support invariants here.  With the new layout, CADD can now
+	   match before we get a chance to try CFMS.  */
   auto nodes = SLP_TREE_CHILDREN (root);
   if (!vect_match_expression_p (nodes[1], MULT_EXPR)
       || vect_detect_pair_op (nodes[0]) != PLUS_MINUS)
@@ -1217,16 +1280,14 @@ complex_fms_pattern::matches (complex_operation_t op,
       || !vect_match_expression_p (l0node[1], MULT_EXPR))
     return IFN_LAST;
 
-  bool is_neg = vect_normalize_conj_loc (left_op);
-
-  bool conj_first_operand = false;
-  if (!vect_validate_multiplication (perm_cache, right_op, left_op, false,
-				     &conj_first_operand, true))
+  enum _conj_status status;
+  if (!vect_validate_multiplication (perm_cache, compat_cache, right_op,
+				     left_op, true, &status))
     return IFN_LAST;
 
-  if (!is_neg)
+  if (status == CONJ_NONE)
     ifn = IFN_COMPLEX_FMS;
-  else if (is_neg)
+  else
     ifn = IFN_COMPLEX_FMS_CONJ;
 
   if (!vect_pattern_validate_optab (ifn, *ref_node))
@@ -1243,26 +1304,12 @@ complex_fms_pattern::matches (complex_operation_t op,
       ops->quick_push (right_op[1]);
       ops->quick_push (left_op[1]);
     }
-  else if (kind == PERM_TOP)
-    {
-      ops->quick_push (l0node[0]);
-      ops->quick_push (right_op[1]);
-      ops->quick_push (right_op[0]);
-      ops->quick_push (left_op[0]);
-    }
-  else if (kind == PERM_EVENEVEN && !is_neg)
-    {
-      ops->quick_push (l0node[0]);
-      ops->quick_push (right_op[1]);
-      ops->quick_push (right_op[0]);
-      ops->quick_push (left_op[0]);
-    }
   else
     {
       ops->quick_push (l0node[0]);
       ops->quick_push (right_op[1]);
       ops->quick_push (right_op[0]);
-      ops->quick_push (left_op[1]);
+      ops->quick_push (left_op[0]);
     }
 
   return ifn;
@@ -1272,13 +1319,14 @@ complex_fms_pattern::matches (complex_operation_t op,
 
 vect_pattern*
 complex_fms_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
+				slp_compat_nodes_map_t *compat_cache,
 				slp_tree *node)
 {
   auto_vec<slp_tree> ops;
   complex_operation_t op
     = vect_detect_pair_op (*node, true, &ops);
   internal_fn ifn
-    = complex_fms_pattern::matches (op, perm_cache, node, &ops);
+    = complex_fms_pattern::matches (op, perm_cache, compat_cache, node, &ops);
   if (ifn == IFN_LAST)
     return NULL;
 
@@ -1305,9 +1353,9 @@ complex_fms_pattern::build (vec_info *vinfo)
   SLP_TREE_CHILDREN (*this->m_node).create (3);
 
   /* First re-arrange the children.  */
-  SLP_TREE_CHILDREN (*this->m_node).quick_push (this->m_ops[0]);
   SLP_TREE_CHILDREN (*this->m_node).quick_push (this->m_ops[1]);
   SLP_TREE_CHILDREN (*this->m_node).quick_push (newnode);
+  SLP_TREE_CHILDREN (*this->m_node).quick_push (this->m_ops[0]);
 
   /* And then rewrite the node itself.  */
   complex_pattern::build (vinfo);
@@ -1334,11 +1382,12 @@ class complex_operations_pattern : public complex_pattern
   public:
     void build (vec_info *);
     static internal_fn
-    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *, slp_tree *,
-	     vec<slp_tree> *);
+    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *,
+	     slp_compat_nodes_map_t *, slp_tree *, vec<slp_tree> *);
 
     static vect_pattern*
-    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+    recognize (slp_tree_to_load_perm_map_t *, slp_compat_nodes_map_t *,
+	       slp_tree *);
 };
 
 /* Dummy matches implementation for proxy object.  */
@@ -1347,6 +1396,7 @@ internal_fn
 complex_operations_pattern::
 matches (complex_operation_t /* op */,
 	 slp_tree_to_load_perm_map_t * /* perm_cache */,
+	 slp_compat_nodes_map_t * /* compat_cache */,
 	 slp_tree * /* ref_node */, vec<slp_tree> * /* ops */)
 {
   return IFN_LAST;
@@ -1356,6 +1406,7 @@ matches (complex_operation_t /* op */,
 
 vect_pattern*
 complex_operations_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
+				       slp_compat_nodes_map_t *ccache,
 				       slp_tree *node)
 {
   auto_vec<slp_tree> ops;
@@ -1363,15 +1414,15 @@ complex_operations_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
     = vect_detect_pair_op (*node, true, &ops);
   internal_fn ifn = IFN_LAST;
 
-  ifn  = complex_fms_pattern::matches (op, perm_cache, node, &ops);
+  ifn  = complex_fms_pattern::matches (op, perm_cache, ccache, node, &ops);
   if (ifn != IFN_LAST)
     return complex_fms_pattern::mkInstance (node, &ops, ifn);
 
-  ifn  = complex_mul_pattern::matches (op, perm_cache, node, &ops);
+  ifn  = complex_mul_pattern::matches (op, perm_cache, ccache, node, &ops);
   if (ifn != IFN_LAST)
     return complex_mul_pattern::mkInstance (node, &ops, ifn);
 
-  ifn  = complex_add_pattern::matches (op, perm_cache, node, &ops);
+  ifn  = complex_add_pattern::matches (op, perm_cache, ccache, node, &ops);
   if (ifn != IFN_LAST)
     return complex_add_pattern::mkInstance (node, &ops, ifn);
 
@@ -1398,11 +1449,13 @@ class addsub_pattern : public vect_pattern
     void build (vec_info *);
 
     static vect_pattern*
-    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+    recognize (slp_tree_to_load_perm_map_t *, slp_compat_nodes_map_t *,
+	       slp_tree *);
 };
 
 vect_pattern *
-addsub_pattern::recognize (slp_tree_to_load_perm_map_t *, slp_tree *node_)
+addsub_pattern::recognize (slp_tree_to_load_perm_map_t *,
+			   slp_compat_nodes_map_t *, slp_tree *node_)
 {
   slp_tree node = *node_;
   if (SLP_TREE_CODE (node) != VEC_PERM_EXPR
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index b912c3577df61a694d5bb9e22c5303fe6a48ab6e..cb577f8a612d583254e42bb06a6d7a0875de5e75 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -804,7 +804,7 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char swap,
 /* Return true if call statements CALL1 and CALL2 are similar enough
    to be combined into the same SLP group.  */
 
-static bool
+bool
 compatible_calls_p (gcall *call1, gcall *call2)
 {
   unsigned int nargs = gimple_call_num_args (call1);
@@ -2907,6 +2907,7 @@ optimize_load_redistribution (scalar_stmts_to_slp_tree_map_t *bst_map,
 static bool
 vect_match_slp_patterns_2 (slp_tree *ref_node, vec_info *vinfo,
 			   slp_tree_to_load_perm_map_t *perm_cache,
+			   slp_compat_nodes_map_t *compat_cache,
 			   hash_set<slp_tree> *visited)
 {
   unsigned i;
@@ -2918,11 +2919,13 @@ vect_match_slp_patterns_2 (slp_tree *ref_node, vec_info *vinfo,
   slp_tree child;
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
     found_p |= vect_match_slp_patterns_2 (&SLP_TREE_CHILDREN (node)[i],
-					  vinfo, perm_cache, visited);
+					  vinfo, perm_cache, compat_cache,
+					  visited);
 
   for (unsigned x = 0; x < num__slp_patterns; x++)
     {
-      vect_pattern *pattern = slp_patterns[x] (perm_cache, ref_node);
+      vect_pattern *pattern
+	= slp_patterns[x] (perm_cache, compat_cache, ref_node);
       if (pattern)
 	{
 	  pattern->build (vinfo);
@@ -2943,7 +2946,8 @@ vect_match_slp_patterns_2 (slp_tree *ref_node, vec_info *vinfo,
 static bool
 vect_match_slp_patterns (slp_instance instance, vec_info *vinfo,
 			 hash_set<slp_tree> *visited,
-			 slp_tree_to_load_perm_map_t *perm_cache)
+			 slp_tree_to_load_perm_map_t *perm_cache,
+			 slp_compat_nodes_map_t *compat_cache)
 {
   DUMP_VECT_SCOPE ("vect_match_slp_patterns");
   slp_tree *ref_node = &SLP_INSTANCE_TREE (instance);
@@ -2953,7 +2957,8 @@ vect_match_slp_patterns (slp_instance instance, vec_info *vinfo,
 		     "Analyzing SLP tree %p for patterns\n",
 		     SLP_INSTANCE_TREE (instance));
 
-  return vect_match_slp_patterns_2 (ref_node, vinfo, perm_cache, visited);
+  return vect_match_slp_patterns_2 (ref_node, vinfo, perm_cache, compat_cache,
+				    visited);
 }
 
 /* STMT_INFO is a store group of size GROUP_SIZE that we are considering
@@ -3437,12 +3442,14 @@ vect_analyze_slp (vec_info *vinfo, unsigned max_tree_size)
 
   hash_set<slp_tree> visited_patterns;
   slp_tree_to_load_perm_map_t perm_cache;
+  slp_compat_nodes_map_t compat_cache;
 
   /* See if any patterns can be found in the SLP tree.  */
   bool pattern_found = false;
   FOR_EACH_VEC_ELT (LOOP_VINFO_SLP_INSTANCES (vinfo), i, instance)
     pattern_found |= vect_match_slp_patterns (instance, vinfo,
-					      &visited_patterns, &perm_cache);
+					      &visited_patterns, &perm_cache,
+					      &compat_cache);
 
   /* If any were found optimize permutations of loads.  */
   if (pattern_found)
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 2f6e1e268fb07e9de065ff9c45af87546e565d66..83cd0919c7838c65576e1debd881e0ec636a605a 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2268,6 +2268,7 @@ extern void duplicate_and_interleave (vec_info *, gimple_seq *, tree,
 extern int vect_get_place_in_interleaving_chain (stmt_vec_info, stmt_vec_info);
 extern slp_tree vect_create_new_slp_node (unsigned, tree_code);
 extern void vect_free_slp_tree (slp_tree);
+extern bool compatible_calls_p (gcall *, gcall *);
 
 /* In tree-vect-patterns.c.  */
 extern void
@@ -2306,6 +2307,12 @@ typedef enum _complex_perm_kinds {
 typedef hash_map <slp_tree, complex_perm_kinds_t>
   slp_tree_to_load_perm_map_t;
 
+/* Cache from nodes pair to being compatible or not.  */
+typedef pair_hash <nofree_ptr_hash <_slp_tree>,
+		   nofree_ptr_hash <_slp_tree>> slp_node_hash;
+typedef hash_map <slp_node_hash, bool> slp_compat_nodes_map_t;
+
+
 /* Vector pattern matcher base class.  All SLP pattern matchers must inherit
    from this type.  */
 
@@ -2338,7 +2345,8 @@ class vect_pattern
   public:
 
     /* Create a new instance of the pattern matcher class of the given type.  */
-    static vect_pattern* recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+    static vect_pattern* recognize (slp_tree_to_load_perm_map_t *,
+				    slp_compat_nodes_map_t *, slp_tree *);
 
     /* Build the pattern from the data collected so far.  */
     virtual void build (vec_info *) = 0;
@@ -2352,6 +2360,7 @@ class vect_pattern
 
 /* Function pointer to create a new pattern matcher from a generic type.  */
 typedef vect_pattern* (*vect_pattern_decl_t) (slp_tree_to_load_perm_map_t *,
+					      slp_compat_nodes_map_t *,
 					      slp_tree *);
 
 /* List of supported pattern matchers.  */
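
The operand handling at the top of the new vect_validate_multiplication can be read as a small decision table: the subtract flag and the two NEGATE_EXPR checks pick a row of perms[][] / cq[][] and a conjugate status. A minimal standalone sketch of just that selection follows. The names perm_choice and select_perm are hypothetical and not part of the patch, and the enum is a stand-in for _conj_status.

```cpp
#include <cassert>

// Stand-in for the patch's _conj_status values (assumed, not the GCC enum).
enum conj_status { CONJ_NONE, CONJ_FST, CONJ_SND };

// Hypothetical result type: which table row to use and what was negated.
struct perm_choice
{
  int perm;            // Row index into the perms[][] and cq[][] tables.
  conj_status status;  // Which right-hand operand, if any, carried a negate.
};

// Mirror the branch structure of the patch: default to row 0 (row 1 when
// subtracting), and only change the choice when exactly one of the two
// right-hand operands is a negation.
perm_choice
select_perm (bool subtract, bool neg0, bool neg1)
{
  perm_choice c = { subtract ? 1 : 0, CONJ_NONE };
  if (neg0 != neg1)
    {
      if (neg0)
	{
	  c.status = CONJ_FST;
	  if (subtract)
	    c.perm = 0;
	}
      else
	{
	  c.status = CONJ_SND;
	  c.perm = 1;
	}
    }
  return c;
}

// Small self-check of the two most common cases.
void
select_perm_self_check ()
{
  assert (select_perm (false, false, false).perm == 0);
  assert (select_perm (true, false, false).perm == 1);
}
```

When both operands are negated, the sketch leaves the defaults untouched, matching the empty statement in the patch.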

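The slp_compat_nodes_map_t cache added in tree-vectorizer.h memoizes a pairwise answer: the visible part of compatible_complex_nodes_p records false for the key up front and overwrites it with true only once every check has passed. A rough standalone sketch of that pattern is below. It uses std::map and a toy node type in place of GCC's hash_map, pair_hash and slp_tree, and the lookup at the top is an assumption, since that part of the function is not shown in this hunk.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Toy stand-in for an SLP node: an operation code plus child pointers.
struct node
{
  int code;
  std::vector<const node *> children;
};

// Cache from an ordered pair of nodes to "compatible or not".
using compat_cache_t
  = std::map<std::pair<const node *, const node *>, bool>;

bool
compatible_p (compat_cache_t &cache, const node *a, const node *b)
{
  assert (a != nullptr && b != nullptr);
  const auto key = std::make_pair (a, b);

  // Assumed: reuse an earlier answer for this pair if there is one.
  auto it = cache.find (key);
  if (it != cache.end ())
    return it->second;

  // Pessimistically record failure, as the patch does with put (key, false).
  cache[key] = false;

  if (a->code != b->code || a->children.size () != b->children.size ())
    return false;

  // Children must be pairwise compatible, position by position.
  for (std::size_t i = 0; i < a->children.size (); i++)
    if (!compatible_p (cache, a->children[i], b->children[i]))
      return false;

  cache[key] = true;
  return true;
}
```

Recording false before recursing means every early return leaves a valid cached answer behind without further bookkeeping.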
Thread overview: 18+ messages
2021-12-17 15:42 Tamar Christina
2021-12-17 15:42 ` [2/3 PATCH]AArch64 use canonical ordering for complex mul, fma and fms Tamar Christina
2021-12-17 16:24   ` Richard Sandiford
2021-12-17 16:48     ` Richard Sandiford
2021-12-20 16:20       ` Tamar Christina
2022-01-11  7:10         ` Tamar Christina
2022-02-01  9:55           ` Tamar Christina
2022-02-01 11:04         ` Richard Sandiford
2021-12-17 15:43 ` [3/3 PATCH][AArch32] " Tamar Christina
2021-12-20 16:22   ` Tamar Christina
2022-01-11  7:10     ` Tamar Christina
2022-02-01  9:54       ` Tamar Christina
2022-02-01  9:56     ` Kyrylo Tkachov
2021-12-17 16:18 ` [1/3 PATCH]middle-end vect: Simplify and extend the complex numbers validation routines Richard Sandiford
2021-12-20 16:18   ` Tamar Christina [this message]
2022-01-10 10:16     ` Tamar Christina
2022-01-10 13:00 ` Richard Biener
2022-01-11  7:31   ` Tamar Christina
