[PATCH] RISC-V: Support vfwmul.vv combine lowering

public inbox for gcc-patches@gcc.gnu.org

* [PATCH] RISC-V: Support vfwmul.vv combine lowering
@ 2023-06-28  4:15 Juzhe-Zhong
  2023-06-28 16:24 ` Jeff Law
  0 siblings, 1 reply; 22+ messages in thread
From: Juzhe-Zhong @ 2023-06-28  4:15 UTC (permalink / raw)
  To: gcc-patches
  Cc: kito.cheng, kito.cheng, palmer, palmer, jeffreyalaw, rdapp.gcc,
	Juzhe-Zhong

Consider the following complicated case:
#define TEST_TYPE(TYPE1, TYPE2)                                                \
  __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 (                         \
    TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3,     \
    TYPE1 *__restrict dst4, TYPE2 *__restrict a, TYPE2 *__restrict b,          \
    TYPE2 *__restrict a2, TYPE2 *__restrict b2, int n)                         \
  {                                                                            \
    for (int i = 0; i < n; i++)                                                \
      {                                                                        \
	dst[i] = (TYPE1) a[i] * (TYPE1) b[i];                                  \
	dst2[i] = (TYPE1) a2[i] * (TYPE1) b[i];                                \
	dst3[i] = (TYPE1) a2[i] * (TYPE1) a[i];                                \
	dst4[i] = (TYPE1) a[i] * (TYPE1) b2[i];                                \
      }                                                                        \
  }

TEST_TYPE (double, float)

In such a complicated case, the combine pass cannot fold the extensions of both operands into
the multiply in a single step. Instead, it first combines one of the extensions and then
combines the other. The combine flow is as follows:

Original IR:
(set (reg 0) (float_extend (reg 1)))
(set (reg 3) (float_extend (reg 2)))
(set (reg 4) (mult (reg 0) (reg 3)))

First step of combine:
(set (reg 3) (float_extend (reg 2)))
(set (reg 4) (mult (float_extend (reg 1)) (reg 3)))

Second step of combine:
(set (reg 4) (mult (float_extend (reg 1)) (float_extend (reg 2))))

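So, to enhance the combine optimization, we add a "pseudo vfwmul.wv" RTL pattern in autovec-opt.md,
which is (set (reg 0) (mult (float_extend (reg 1)) (reg 2))). It matches the intermediate form
produced by the first step, so combine can go on to the second step.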
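
For reference, here is a hand-expanded scalar sketch of what TEST_TYPE (double, float)
produces. It is only an illustration of the source-level computation, not of what the
combine pass does internally. Because the macro pastes the literal token TYPE1_, the
generated function is named vwadd_TYPE1_float. The NOIPA macro and the helper
check_widen_mul are hypothetical names added for this sketch and are not part of the patch.

```c
#include <assert.h>

/* noipa is GCC-specific; fall back to noinline elsewhere (assumption made
   for this sketch only).  */
#if defined(__GNUC__) && !defined(__clang__)
#define NOIPA __attribute__ ((noipa))
#else
#define NOIPA __attribute__ ((noinline))
#endif

/* Hand expansion of TEST_TYPE (double, float).  Each statement widens both
   float operands to double before multiplying, which is the shape that
   vfwmul.vv implements for vectors.  */
NOIPA void
vwadd_TYPE1_float (double *__restrict dst, double *__restrict dst2,
		   double *__restrict dst3, double *__restrict dst4,
		   float *__restrict a, float *__restrict b,
		   float *__restrict a2, float *__restrict b2, int n)
{
  for (int i = 0; i < n; i++)
    {
      dst[i] = (double) a[i] * (double) b[i];
      dst2[i] = (double) a2[i] * (double) b[i];
      dst3[i] = (double) a2[i] * (double) a[i];
      dst4[i] = (double) a[i] * (double) b2[i];
    }
}

/* Hypothetical helper: fills small inputs, runs the kernel and checks every
   product.  The product of two floats is exact in double, so == is safe.  */
int
check_widen_mul (void)
{
  enum { N = 16 };
  float a[N], b[N], a2[N], b2[N];
  double dst[N], dst2[N], dst3[N], dst4[N];

  for (int i = 0; i < N; i++)
    {
      a[i] = i + 0.5f;
      b[i] = 2.0f - i;
      a2[i] = i * 0.25f;
      b2[i] = -1.5f * i;
    }

  vwadd_TYPE1_float (dst, dst2, dst3, dst4, a, b, a2, b2, N);

  for (int i = 0; i < N; i++)
    {
      assert (dst[i] == (double) a[i] * (double) b[i]);
      assert (dst2[i] == (double) a2[i] * (double) b[i]);
      assert (dst3[i] == (double) a2[i] * (double) a[i]);
      assert (dst4[i] == (double) a[i] * (double) b2[i]);
    }
  return 1;
}
```

Note that a[i] and b[i] each feed several multiplies, so their extensions have more
than one use. That sharing is what makes this case harder than a single
dst[i] = (TYPE1) a[i] * (TYPE1) b[i] statement.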

gcc/ChangeLog:

        * config/riscv/autovec-opt.md (@pred_single_widen_mul<any_extend:su><mode>): Change "@" into "*" in pattern name which simplifies build files.
        (*pred_single_widen_mul<any_extend:su><mode>): Ditto.
        (*pred_single_widen_mul<mode>): New pattern.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/widen/widen-3.c: Add floating-point.
        * gcc.target/riscv/rvv/autovec/widen/widen-7.c: Ditto.
        * gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c: Ditto.
        * gcc.target/riscv/rvv/autovec/widen/widen_run-3.c: Ditto.
        * gcc.target/riscv/rvv/autovec/widen/widen_run-7.c: Ditto.
        * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c: New test.

---
 gcc/config/riscv/autovec-opt.md               | 41 ++++++++++++++++++-
 .../riscv/rvv/autovec/widen/widen-3.c         |  7 +++-
 .../riscv/rvv/autovec/widen/widen-7.c         |  7 +++-
 .../rvv/autovec/widen/widen-complicate-3.c    |  7 +++-
 .../riscv/rvv/autovec/widen/widen_run-3.c     |  5 ++-
 .../riscv/rvv/autovec/widen/widen_run-7.c     |  5 ++-
 .../rvv/autovec/widen/widen_run_zvfh-3.c      | 28 +++++++++++++
 .../rvv/autovec/widen/widen_run_zvfh-7.c      | 28 +++++++++++++
 8 files changed, 117 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 28040805b23..1fcd55ac2a0 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -21,7 +21,7 @@
 ;; We don't have vwmul.wv instruction like vwadd.wv in RVV.
 ;; This pattern is an intermediate RTL IR as a pseudo vwmul.wv to enhance
 ;; optimization of instructions combine.
-(define_insn_and_split "@pred_single_widen_mul<any_extend:su><mode>"
+(define_insn_and_split "*pred_single_widen_mul<any_extend:su><mode>"
   [(set (match_operand:VWEXTI 0 "register_operand"                  "=&vr,&vr")
 	(if_then_else:VWEXTI
 	  (unspec:<VM>
@@ -405,3 +405,42 @@
   "vmv.x.s\t%0,%1"
   [(set_attr "type" "vimovvx")
    (set_attr "mode" "<MODE>")])
+
+;; We don't have vfwmul.wv instruction like vfwadd.wv in RVV.
+;; This pattern is an intermediate RTL IR as a pseudo vfwmul.wv to enhance
+;; optimization of instructions combine.
+(define_insn_and_split "*pred_single_widen_mul<mode>"
+  [(set (match_operand:VWEXTF 0 "register_operand"                  "=&vr,  &vr")
+	(if_then_else:VWEXTF
+	  (unspec:<VM>
+	    [(match_operand:<VM> 1 "vector_mask_operand"           "vmWc1,vmWc1")
+	     (match_operand 5 "vector_length_operand"              "   rK,   rK")
+	     (match_operand 6 "const_int_operand"                  "    i,    i")
+	     (match_operand 7 "const_int_operand"                  "    i,    i")
+	     (match_operand 8 "const_int_operand"                  "    i,    i")
+	     (match_operand 9 "const_int_operand"                  "    i,    i")
+	     (reg:SI VL_REGNUM)
+	     (reg:SI VTYPE_REGNUM)
+	     (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
+	  (mult:VWEXTF
+	    (float_extend:VWEXTF
+	      (match_operand:<V_DOUBLE_TRUNC> 4 "register_operand" "   vr,   vr"))
+	    (match_operand:VWEXTF 3 "register_operand"             "   vr,   vr"))
+	  (match_operand:VWEXTF 2 "vector_merge_operand"           "   vu,    0")))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    insn_code icode = code_for_pred_extend (<MODE>mode);
+    rtx tmp = gen_reg_rtx (<MODE>mode);
+    rtx ops[] = {tmp, operands[4]};
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, ops);
+
+    emit_insn (gen_pred (MULT, <MODE>mode, operands[0], operands[1], operands[2],
+			 operands[3], tmp, operands[5], operands[6],
+			 operands[7], operands[8], operands[9]));
+    DONE;
+  }
+  [(set_attr "type" "vfwmul")
+   (set_attr "mode" "<MODE>")])
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-3.c
index 609a5c09f70..b2b14405902 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math" } */
 
 #include <stdint-gcc.h>
 
@@ -19,9 +19,12 @@
   TEST_TYPE (int32_t, int16_t)                                                 \
   TEST_TYPE (uint32_t, uint16_t)                                               \
   TEST_TYPE (int64_t, int32_t)                                                 \
-  TEST_TYPE (uint64_t, uint32_t)
+  TEST_TYPE (uint64_t, uint32_t)                                               \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
 
 TEST_ALL ()
 
 /* { dg-final { scan-assembler-times {\tvwmul\.vv} 3 } } */
 /* { dg-final { scan-assembler-times {\tvwmulu\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfwmul\.vv} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c
index cc43d9ba3fe..3806e8b98ee 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math" } */
 
 #include <stdint-gcc.h>
 
@@ -19,9 +19,12 @@
   TEST_TYPE (int32_t, int16_t)                                                 \
   TEST_TYPE (uint32_t, uint16_t)                                               \
   TEST_TYPE (int64_t, int32_t)                                                 \
-  TEST_TYPE (uint64_t, uint32_t)
+  TEST_TYPE (uint64_t, uint32_t)                                               \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
 
 TEST_ALL ()
 
 /* { dg-final { scan-assembler-times {\tvsext\.vf2} 3 } } */
 /* { dg-final { scan-assembler-times {\tvzext\.vf2} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfwcvt} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c
index e1fd79430c3..1515374890d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math" } */
 
 #include <stdint-gcc.h>
 
@@ -24,9 +24,12 @@
   TEST_TYPE (int32_t, int16_t)                                                 \
   TEST_TYPE (uint32_t, uint16_t)                                               \
   TEST_TYPE (int64_t, int32_t)                                                 \
-  TEST_TYPE (uint64_t, uint32_t)
+  TEST_TYPE (uint64_t, uint32_t)                                               \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
 
 TEST_ALL ()
 
 /* { dg-final { scan-assembler-times {\tvwmul\.vv} 12 } } */
 /* { dg-final { scan-assembler-times {\tvwmulu\.vv} 12 } } */
+/* { dg-final { scan-assembler-times {\tvfwmul\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-3.c
index beb0cc2b58b..b7dd60fa8e8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-3.c
@@ -1,5 +1,5 @@
 /* { dg-do run { target { riscv_vector } } } */
-/* { dg-additional-options "--param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
 
 #include <assert.h>
 #include "widen-3.c"
@@ -25,7 +25,8 @@
   RUN (int32_t, int16_t, -32768)                                               \
   RUN (uint32_t, uint16_t, 65535)                                              \
   RUN (int64_t, int32_t, -2147483648)                                          \
-  RUN (uint64_t, uint32_t, 4294967295)
+  RUN (uint64_t, uint32_t, 4294967295)                                         \
+  RUN (double, float, -2147483648)
 
 int
 main ()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c
index 4abddd5d718..ab29f4a0f70 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c
@@ -1,5 +1,5 @@
 /* { dg-do run { target { riscv_vector } } } */
-/* { dg-additional-options "--param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
 
 #include <assert.h>
 #include "widen-7.c"
@@ -25,7 +25,8 @@
   RUN (int32_t, int16_t, -32768)                                               \
   RUN (uint32_t, uint16_t, 65535)                                              \
   RUN (int64_t, int32_t, -2147483648)                                          \
-  RUN (uint64_t, uint32_t, 4294967295)
+  RUN (uint64_t, uint32_t, 4294967295)                                         \
+  RUN (double, float, -2147483648)
 
 int
 main ()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c
new file mode 100644
index 00000000000..c3efd0b97bf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c
@@ -0,0 +1,28 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <assert.h>
+#include "widen-3.c"
+
+#define SZ 512
+
+#define RUN(TYPE1, TYPE2, LIMIT)                                               \
+  TYPE2 a##TYPE2[SZ];                                                          \
+  TYPE2 b##TYPE2[SZ];                                                          \
+  TYPE1 dst##TYPE1[SZ];                                                        \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      a##TYPE2[i] = LIMIT + i % 8723;                                          \
+      b##TYPE2[i] = LIMIT + i & 1964;                                          \
+    }                                                                          \
+  vwmul_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE2, b##TYPE2, SZ);                  \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst##TYPE1[i] == ((TYPE1) a##TYPE2[i] * (TYPE1) b##TYPE2[i]));
+
+#define RUN_ALL() RUN (float, _Float16, -32768)
+
+int
+main ()
+{
+  RUN_ALL ()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c
new file mode 100644
index 00000000000..60e2401c088
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c
@@ -0,0 +1,28 @@
+/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <assert.h>
+#include "widen-7.c"
+
+#define SZ 512
+
+#define RUN(TYPE1, TYPE2, LIMIT)                                               \
+  TYPE2 a##TYPE2[SZ];                                                          \
+  TYPE1 b##TYPE1[SZ];                                                          \
+  TYPE1 dst##TYPE1[SZ];                                                        \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      a##TYPE2[i] = LIMIT + i % LIMIT;                                         \
+      b##TYPE1[i] = LIMIT + i & LIMIT;                                         \
+    }                                                                          \
+  vwmul_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE2, b##TYPE1, SZ);                  \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst##TYPE1[i] == (((TYPE1) a##TYPE2[i]) * b##TYPE1[i]));
+
+#define RUN_ALL() RUN (float, _Float16, -32768)
+
+int
+main ()
+{
+  RUN_ALL ()
+}
-- 
2.36.1



* Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-06-28  4:15 [PATCH] RISC-V: Support vfwmul.vv combine lowering Juzhe-Zhong
@ 2023-06-28 16:24 ` Jeff Law
  2023-06-28 22:00   ` 钟居哲
       [not found]   ` <2023062906005450585022@rivai.ai>
  0 siblings, 2 replies; 22+ messages in thread
From: Jeff Law @ 2023-06-28 16:24 UTC (permalink / raw)
  To: Juzhe-Zhong, gcc-patches
  Cc: kito.cheng, kito.cheng, palmer, palmer, rdapp.gcc



On 6/27/23 22:15, Juzhe-Zhong wrote:
> Consider the following complicate case:
> #define TEST_TYPE(TYPE1, TYPE2)                                                \
>    __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 (                         \
>      TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3,     \
>      TYPE1 *__restrict dst4, TYPE2 *__restrict a, TYPE2 *__restrict b,          \
>      TYPE2 *__restrict a2, TYPE2 *__restrict b2, int n)                         \
>    {                                                                            \
>      for (int i = 0; i < n; i++)                                                \
>        {                                                                        \
> 	dst[i] = (TYPE1) a[i] * (TYPE1) b[i];                                  \
> 	dst2[i] = (TYPE1) a2[i] * (TYPE1) b[i];                                \
> 	dst3[i] = (TYPE1) a2[i] * (TYPE1) a[i];                                \
> 	dst4[i] = (TYPE1) a[i] * (TYPE1) b2[i];                                \
>        }                                                                        \
>    }
> 
> TEST_TYPE (double, float)
> 
> Such complicate situation, Combine PASS can not combine extension of both operands on the fly.
> So the combine PASS will first try to combine one of the combine extension, and then combine
> the other. The combine flow is as follows:
> 
> Original IR:
> (set (reg 0) (float_extend: (reg 1))
> (set (reg 3) (float_extend: (reg 2))
> (set (reg 4) (mult: (reg 0) (reg 3))
> 
> First step of combine:
> (set (reg 3) (float_extend: (reg 2))
> (set (reg 4) (mult: (float_extend: (reg 1) (reg 3))
> 
> Second step of combine:
> (set (reg 4) (mult: (float_extend: (reg 1) (float_extend: (reg 2))
> 
> So, to enhance the combine optimization, we add a "pseudo vwfmul.wv" RTL pattern in autovec-opt.md
> which is (set (reg 0) (mult (float_extend (reg 1) (reg 2)))).
Hmm, something doesn't make sense here.  Combine knows how to do a 3->1 
combination.  I would expect to see the first step fail (substituting 
just one operand), then a later step try to combine all three 
instructions, substituting the extension for both input operands.

Can you pass along the .combine dump from the failing case?

Jeff


* Re: Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-06-28 16:24 ` Jeff Law
@ 2023-06-28 22:00   ` 钟居哲
  2023-06-29 22:59     ` Jeff Law
                       ` (3 more replies)
       [not found]   ` <2023062906005450585022@rivai.ai>
  1 sibling, 4 replies; 22+ messages in thread
From: 钟居哲 @ 2023-06-28 22:00 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: kito.cheng, kito.cheng, palmer, palmer, rdapp.gcc


You can see here:

https://godbolt.org/z/d78646hWb 

The first case can't generate vfwmul.vv, but the second case succeeds.

Failed to match this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 153)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 149 [ vect__5.45 ]))
            (reg:VNx2DF 148 [ vect__8.49 ]))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))


This patch adds a pattern that matches this instruction.


juzhe.zhong@rivai.ai
 


* Re: Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
       [not found]   ` <2023062906005450585022@rivai.ai>
@ 2023-06-28 22:59     ` 钟居哲
  0 siblings, 0 replies; 22+ messages in thread
From: 钟居哲 @ 2023-06-28 22:59 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: kito.cheng, kito.cheng, palmer, palmer, rdapp.gcc



The dump is attached.



juzhe.zhong@rivai.ai
 

[-- Attachment #2: dump.txt --]
[-- Type: application/octet-stream, Size: 170917 bytes --]

;; Function vwadd_TYPE1_float (_Z17vwadd_TYPE1_floatPdS_S_S_PfS0_S0_S0_i, funcdef_no=0, decl_uid=2849, cgraph_uid=1, symbol_order=0)

scanning new insn with uid = 79.
rescanning insn with uid = 2.
scanning new insn with uid = 80.
rescanning insn with uid = 3.
scanning new insn with uid = 81.
rescanning insn with uid = 4.
scanning new insn with uid = 82.
rescanning insn with uid = 5.
scanning new insn with uid = 83.
rescanning insn with uid = 6.
scanning new insn with uid = 84.
rescanning insn with uid = 7.
scanning new insn with uid = 85.
rescanning insn with uid = 8.
scanning new insn with uid = 86.
rescanning insn with uid = 9.
starting the processing of deferred insns
ending the processing of deferred insns
df_analyze called
df_worklist_dataflow_doublequeue: n_basic_blocks 6 n_edges 7 count 7 (  1.2)


vwadd_TYPE1_float

Dataflow summary:
def_info->table_size = 56, use_info->table_size = 0
;;  fully invalidated by EH 	 0 [zero] 3 [gp] 4 [tp] 5 [t0] 6 [t1] 7 [t2] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 28 [t3] 29 [t4] 30 [t5] 31 [t6] 32 [ft0] 33 [ft1] 34 [ft2] 35 [ft3] 36 [ft4] 37 [ft5] 38 [ft6] 39 [ft7] 42 [fa0] 43 [fa1] 44 [fa2] 45 [fa3] 46 [fa4] 47 [fa5] 48 [fa6] 49 [fa7] 60 [ft8] 61 [ft9] 62 [ft10] 63 [ft11] 66 [vl] 67 [vtype] 68 [vxrm] 69 [N/A] 70 [N/A] 71 [N/A] 72 [N/A] 73 [N/A] 74 [N/A] 75 [N/A] 76 [N/A] 77 [N/A] 78 [N/A] 79 [N/A] 80 [N/A] 81 [N/A] 82 [N/A] 83 [N/A] 84 [N/A] 85 [N/A] 86 [N/A] 87 [N/A] 88 [N/A] 89 [N/A] 90 [N/A] 91 [N/A] 92 [N/A] 93 [N/A] 94 [N/A] 95 [N/A] 96 [v0] 97 [v1] 98 [v2] 99 [v3] 100 [v4] 101 [v5] 102 [v6] 103 [v7] 104 [v8] 105 [v9] 106 [v10] 107 [v11] 108 [v12] 109 [v13] 110 [v14] 111 [v15] 112 [v16] 113 [v17] 114 [v18] 115 [v19] 116 [v20] 117 [v21] 118 [v22] 119 [v23] 120 [v24] 121 [v25] 122 [v26] 123 [v27] 124 [v28] 125 [v29] 126 [v30] 127 [v31]
;;  hardware regs used 	 2 [sp] 64 [arg] 65 [frame]
;;  regular block artificial uses 	 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;;  eh block artificial uses 	 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;;  entry block defs 	 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 42 [fa0] 43 [fa1] 44 [fa2] 45 [fa3] 46 [fa4] 47 [fa5] 48 [fa6] 49 [fa7] 64 [arg] 65 [frame]
;;  exit block uses 	 1 [ra] 2 [sp] 8 [s0] 65 [frame]
;;  regs ever live 	 0 [zero] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 66 [vl] 67 [vtype] 69 [N/A]
;;  ref usage 	r0={8u} r1={1d,1u} r2={1d,5u} r8={1d,5u} r10={1d,1u} r11={1d,1u} r12={1d,1u} r13={1d,1u} r14={1d,1u} r15={1d,1u} r16={1d,1u} r17={1d,1u} r42={1d} r43={1d} r44={1d} r45={1d} r46={1d} r47={1d} r48={1d} r49={1d} r64={1d,5u,1e} r65={1d,5u} r66={12u} r67={12u} r69={4u} r136={1d,1u} r139={1d,2u,2e} r140={1d,1u} r141={1d,4u} r143={1d,2u,2e} r144={1d,1u} r145={1d,3u,3e} r146={1d,1u} r147={1d,4u} r148={2d,3u} r149={1d,11u} r150={2d,3u} r151={2d,3u} r152={2d,3u} r153={2d,3u} r154={2d,2u} r155={2d,2u} r156={2d,2u} r157={2d,2u} r158={1d,2u} r160={1d,1u} r162={1d,1u} r164={1d,1u} r166={1d,1u,1e} r167={1d,1u} r169={1d,4u} r170={1d,1u} r171={1d,1u} r172={1d,1u} r173={1d,1u} r174={1d,1u} r175={1d,1u} r176={1d,1u} r177={1d,1u} 
;;    total ref usage 210{64d,137u,9e} in 59{59 regular + 0 call} insns.

( )->[0]->( 2 )
;; bb 0 artificial_defs: { d0(1){ }d1(2){ }d2(8){ }d3(10){ }d4(11){ }d5(12){ }d6(13){ }d7(14){ }d8(15){ }d9(16){ }d10(17){ }d11(42){ }d12(43){ }d13(44){ }d14(45){ }d15(46){ }d16(47){ }d17(48){ }d18(49){ }d19(64){ }d20(65){ }}
;; bb 0 artificial_uses: { }
;; lr  in  	 0 [zero] 66 [vl] 67 [vtype] 69 [N/A]
;; lr  use 	
;; lr  def 	 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 42 [fa0] 43 [fa1] 44 [fa2] 45 [fa3] 46 [fa4] 47 [fa5] 48 [fa6] 49 [fa7] 64 [arg] 65 [frame]
;; live  in  	
;; live  gen 	 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 42 [fa0] 43 [fa1] 44 [fa2] 45 [fa3] 46 [fa4] 47 [fa5] 48 [fa6] 49 [fa7] 64 [arg] 65 [frame]
;; live  kill	
;; lr  out 	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A]
;; live  out 	 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 64 [arg] 65 [frame]

( 0 )->[2]->( 3 5 )
;; bb 2 artificial_defs: { }
;; bb 2 artificial_uses: { u-1(2){ }u-1(8){ }u-1(64){ }u-1(65){ }}
;; lr  in  	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A]
;; lr  use 	 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 64 [arg] 65 [frame]
;; lr  def 	 150 151 152 153 154 155 156 157 158 170 171 172 173 174 175 176 177
;; live  in  	 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 64 [arg] 65 [frame]
;; live  gen 	 150 151 152 153 154 155 156 157 158
;; live  kill	
;; lr  out 	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 150 151 152 153 154 155 156 157 158
;; live  out 	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 150 151 152 153 154 155 156 157 158

( 2 )->[3]->( 4 )
;; bb 3 artificial_defs: { }
;; bb 3 artificial_uses: { u-1(2){ }u-1(8){ }u-1(64){ }u-1(65){ }}
;; lr  in  	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 150 151 152 153 154 155 156 157 158
;; lr  use 	 2 [sp] 8 [s0] 64 [arg] 65 [frame] 158
;; lr  def 	 148 169
;; live  in  	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 150 151 152 153 154 155 156 157 158
;; live  gen 	 148 169
;; live  kill	
;; lr  out 	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 148 150 151 152 153 154 155 156 157 169
;; live  out 	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 148 150 151 152 153 154 155 156 157 169

( 4 3 )->[4]->( 4 5 )
;; bb 4 artificial_defs: { }
;; bb 4 artificial_uses: { u-1(2){ }u-1(8){ }u-1(64){ }u-1(65){ }}
;; lr  in  	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 148 150 151 152 153 154 155 156 157 169
;; lr  use 	 0 [zero] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 148 150 151 152 153 154 155 156 157 169
;; lr  def 	 136 139 140 141 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 160 162 164 166 167
;; live  in  	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 148 150 151 152 153 154 155 156 157 169
;; live  gen 	 136 139 140 141 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 160 162 164 166 167
;; live  kill	
;; lr  out 	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 148 150 151 152 153 154 155 156 157 169
;; live  out 	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 148 150 151 152 153 154 155 156 157 169

( 4 2 )->[5]->( 1 )
;; bb 5 artificial_defs: { }
;; bb 5 artificial_uses: { u-1(2){ }u-1(8){ }u-1(64){ }u-1(65){ }}
;; lr  in  	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;; lr  use 	 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;; lr  def 	
;; live  in  	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;; live  gen 	
;; live  kill	
;; lr  out 	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;; live  out 	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame]

( 5 )->[1]->( )
;; bb 1 artificial_defs: { }
;; bb 1 artificial_uses: { u-1(1){ }u-1(2){ }u-1(8){ }u-1(65){ }}
;; lr  in  	 1 [ra] 2 [sp] 8 [s0] 65 [frame]
;; lr  use 	 1 [ra] 2 [sp] 8 [s0] 65 [frame]
;; lr  def 	
;; live  in  	 1 [ra] 2 [sp] 8 [s0] 65 [frame]
;; live  gen 	
;; live  kill	
;; lr  out 	
;; live  out 	

Finding needed instructions:
  Adding insn 20 to worklist
  Adding insn 64 to worklist
  Adding insn 50 to worklist
  Adding insn 44 to worklist
  Adding insn 40 to worklist
  Adding insn 34 to worklist
Finished finding needed instructions:
processing block 5 lr out =  1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame]
processing block 4 lr out =  0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 148 150 151 152 153 154 155 156 157 169
  Adding insn 62 to worklist
  Adding insn 61 to worklist
  Adding insn 60 to worklist
  Adding insn 59 to worklist
  Adding insn 58 to worklist
  Adding insn 57 to worklist
  Adding insn 56 to worklist
  Adding insn 55 to worklist
  Adding insn 54 to worklist
  Adding insn 49 to worklist
  Adding insn 47 to worklist
  Adding insn 46 to worklist
  Adding insn 43 to worklist
  Adding insn 39 to worklist
  Adding insn 37 to worklist
  Adding insn 36 to worklist
  Adding insn 33 to worklist
  Adding insn 31 to worklist
  Adding insn 30 to worklist
  Adding insn 29 to worklist
  Adding insn 28 to worklist
  Adding insn 27 to worklist
  Adding insn 26 to worklist
  Adding insn 24 to worklist
processing block 3 lr out =  0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 148 150 151 152 153 154 155 156 157 169
  Adding insn 74 to worklist
  Adding insn 22 to worklist
processing block 2 lr out =  0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 150 151 152 153 154 155 156 157 158
  Adding insn 10 to worklist
  Adding insn 9 to worklist
  Adding insn 86 to worklist
  Adding insn 8 to worklist
  Adding insn 85 to worklist
  Adding insn 7 to worklist
  Adding insn 84 to worklist
  Adding insn 6 to worklist
  Adding insn 83 to worklist
  Adding insn 5 to worklist
  Adding insn 82 to worklist
  Adding insn 4 to worklist
  Adding insn 81 to worklist
  Adding insn 3 to worklist
  Adding insn 80 to worklist
  Adding insn 2 to worklist
  Adding insn 79 to worklist
df_worklist_dataflow_doublequeue: n_basic_blocks 6 n_edges 7 count 7 (  1.2)
insn_cost 4 for    79: r170:DI=a0:DI
      REG_DEAD a0:DI
insn_cost 4 for     2: r150:DI=r170:DI
      REG_DEAD r170:DI
insn_cost 4 for    80: r171:DI=a1:DI
      REG_DEAD a1:DI
insn_cost 4 for     3: r151:DI=r171:DI
      REG_DEAD r171:DI
insn_cost 4 for    81: r172:DI=a2:DI
      REG_DEAD a2:DI
insn_cost 4 for     4: r152:DI=r172:DI
      REG_DEAD r172:DI
insn_cost 4 for    82: r173:DI=a3:DI
      REG_DEAD a3:DI
insn_cost 4 for     5: r153:DI=r173:DI
      REG_DEAD r173:DI
insn_cost 4 for    83: r174:DI=a4:DI
      REG_DEAD a4:DI
insn_cost 4 for     6: r154:DI=r174:DI
      REG_DEAD r174:DI
insn_cost 4 for    84: r175:DI=a5:DI
      REG_DEAD a5:DI
insn_cost 4 for     7: r155:DI=r175:DI
      REG_DEAD r175:DI
insn_cost 4 for    85: r176:DI=a6:DI
      REG_DEAD a6:DI
insn_cost 4 for     8: r156:DI=r176:DI
      REG_DEAD r176:DI
insn_cost 4 for    86: r177:DI=a7:DI
      REG_DEAD a7:DI
insn_cost 4 for     9: r157:DI=r177:DI
      REG_DEAD r177:DI
insn_cost 28 for    10: r158:DI=sign_extend([arg:DI])
      REG_EQUIV sign_extend([arg:DI])
insn_cost 0 for    14: debug begin stmt marker
insn_cost 0 for    15: debug i => 0
insn_cost 0 for    16: debug begin stmt marker
insn_cost 4 for    20: pc={(r158:DI<=0)?L68:pc}
      REG_BR_PROB 118111604
insn_cost 4 for    22: r148:DI=r158:DI
      REG_DEAD r158:DI
insn_cost 8 for    74: r169:DI=unspec[0x40] 70
insn_cost 12 for    24: r149:DI=unspec[r148:DI,0x8,0x5,0,0] 67
insn_cost 0 for    25: debug begin stmt marker
insn_cost 4 for    26: r147:DI=r149:DI<<0x2
insn_cost 4 for    27: r146:VNx2SF={(unspec[const_vector,r149:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r154:DI]:unspec[zero:SI] 68}
insn_cost 4 for    28: r145:VNx2DF=float_extend(r146:VNx2SF)
      REG_DEAD r146:VNx2SF
insn_cost 4 for    29: r144:VNx2SF={(unspec[const_vector,r149:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r155:DI]:unspec[zero:SI] 68}
insn_cost 4 for    30: r143:VNx2DF=float_extend(r144:VNx2SF)
      REG_DEAD r144:VNx2SF
insn_cost 4 for    31: r141:DI=r149:DI<<0x3
insn_cost 4 for    33: r160:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r143:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_EQUAL r143:VNx2DF*r145:VNx2DF
insn_cost 4 for    34: [r150:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r160:VNx2DF:[r150:DI]}
      REG_DEAD r160:VNx2DF
insn_cost 0 for    35: debug begin stmt marker
insn_cost 4 for    36: r140:VNx2SF={(unspec[const_vector,r149:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r156:DI]:unspec[zero:SI] 68}
insn_cost 4 for    37: r139:VNx2DF=float_extend(r140:VNx2SF)
      REG_DEAD r140:VNx2SF
insn_cost 4 for    39: r162:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r139:VNx2DF*r143:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r143:VNx2DF
      REG_EQUAL r139:VNx2DF*r143:VNx2DF
insn_cost 4 for    40: [r151:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r162:VNx2DF:[r151:DI]}
      REG_DEAD r162:VNx2DF
insn_cost 0 for    41: debug begin stmt marker
insn_cost 4 for    43: r164:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r139:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r139:VNx2DF
      REG_EQUAL r139:VNx2DF*r145:VNx2DF
insn_cost 4 for    44: [r152:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r164:VNx2DF:[r152:DI]}
      REG_DEAD r164:VNx2DF
insn_cost 0 for    45: debug begin stmt marker
insn_cost 4 for    46: r136:VNx2SF={(unspec[const_vector,r149:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r157:DI]:unspec[zero:SI] 68}
insn_cost 4 for    47: r166:VNx2DF=float_extend(r136:VNx2SF)
      REG_DEAD r136:VNx2SF
insn_cost 4 for    49: r167:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r166:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r166:VNx2DF
      REG_DEAD r145:VNx2DF
      REG_DEAD N/A:SI
      REG_DEAD zero:SI
      REG_EQUAL r166:VNx2DF*r145:VNx2DF
insn_cost 4 for    50: [r153:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r167:VNx2DF:[r153:DI]}
      REG_DEAD r167:VNx2DF
      REG_DEAD vtype:SI
      REG_DEAD vl:SI
insn_cost 0 for    51: debug begin stmt marker
insn_cost 0 for    52: debug i => optimized away
insn_cost 0 for    53: debug begin stmt marker
insn_cost 4 for    54: r154:DI=r154:DI+r147:DI
insn_cost 4 for    55: r155:DI=r155:DI+r147:DI
insn_cost 4 for    56: r150:DI=r150:DI+r141:DI
insn_cost 4 for    57: r156:DI=r156:DI+r147:DI
insn_cost 4 for    58: r151:DI=r151:DI+r141:DI
insn_cost 4 for    59: r152:DI=r152:DI+r141:DI
insn_cost 4 for    60: r157:DI=r157:DI+r147:DI
      REG_DEAD r147:DI
insn_cost 4 for    61: r153:DI=r153:DI+r141:DI
      REG_DEAD r141:DI
insn_cost 4 for    62: r148:DI=r148:DI-r149:DI
      REG_DEAD r149:DI
insn_cost 4 for    64: pc={(r148:DI!=0)?L63:pc}
      REG_BR_PROB 894784862

Trying 10 -> 20:
   10: r158:DI=sign_extend([arg:DI])
      REG_EQUIV sign_extend([arg:DI])
   20: pc={(r158:DI<=0)?L68:pc}
      REG_BR_PROB 118111604
Failed to match this instruction:
(parallel [
        (set (pc)
            (if_then_else (le (subreg:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128]) 0)
                    (const_int 0 [0]))
                (label_ref:DI 68)
                (pc)))
        (set (reg/v:DI 158 [ n ])
            (sign_extend:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128])))
    ])
Failed to match this instruction:
(parallel [
        (set (pc)
            (if_then_else (le (subreg:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128]) 0)
                    (const_int 0 [0]))
                (label_ref:DI 68)
                (pc)))
        (set (reg/v:DI 158 [ n ])
            (sign_extend:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128])))
    ])
Successfully matched this instruction:
(set (reg/v:DI 158 [ n ])
    (sign_extend:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128])))
Failed to match this instruction:
(set (pc)
    (if_then_else (le (subreg:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128]) 0)
            (const_int 0 [0]))
        (label_ref:DI 68)
        (pc)))

Trying 10 -> 20:
   10: r158:DI=sign_extend([arg:DI])
      REG_EQUIV sign_extend([arg:DI])
   20: pc={(r158:DI<=0)?L68:pc}
      REG_BR_PROB 118111604
Failed to match this instruction:
(parallel [
        (set (pc)
            (if_then_else (le (subreg:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128]) 0)
                    (const_int 0 [0]))
                (label_ref:DI 68)
                (pc)))
        (set (reg/v:DI 158 [ n ])
            (sign_extend:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128])))
    ])
Failed to match this instruction:
(parallel [
        (set (pc)
            (if_then_else (le (subreg:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128]) 0)
                    (const_int 0 [0]))
                (label_ref:DI 68)
                (pc)))
        (set (reg/v:DI 158 [ n ])
            (sign_extend:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128])))
    ])
Successfully matched this instruction:
(set (reg/v:DI 158 [ n ])
    (sign_extend:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128])))
Failed to match this instruction:
(set (pc)
    (if_then_else (le (subreg:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128]) 0)
            (const_int 0 [0]))
        (label_ref:DI 68)
        (pc)))

Trying 24 -> 26:
   24: r149:DI=unspec[r148:DI,0x8,0x5,0,0] 67
   26: r147:DI=r149:DI<<0x2
Failed to match this instruction:
(parallel [
        (set (reg:DI 147 [ ivtmp_77 ])
            (ashift:DI (unspec:DI [
                        (reg:DI 148 [ ivtmp_87 ])
                        (const_int 8 [0x8])
                        (const_int 5 [0x5])
                        (const_int 0 [0]) repeated x2
                    ] UNSPEC_VSETVL)
                (const_int 2 [0x2])))
        (set (reg:DI 149 [ _89 ])
            (unspec:DI [
                    (reg:DI 148 [ ivtmp_87 ])
                    (const_int 8 [0x8])
                    (const_int 5 [0x5])
                    (const_int 0 [0]) repeated x2
                ] UNSPEC_VSETVL))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:DI 147 [ ivtmp_77 ])
            (ashift:DI (unspec:DI [
                        (reg:DI 148 [ ivtmp_87 ])
                        (const_int 8 [0x8])
                        (const_int 5 [0x5])
                        (const_int 0 [0]) repeated x2
                    ] UNSPEC_VSETVL)
                (const_int 2 [0x2])))
        (set (reg:DI 149 [ _89 ])
            (unspec:DI [
                    (reg:DI 148 [ ivtmp_87 ])
                    (const_int 8 [0x8])
                    (const_int 5 [0x5])
                    (const_int 0 [0]) repeated x2
                ] UNSPEC_VSETVL))
    ])
Successfully matched this instruction:
(set (reg:DI 149 [ _89 ])
    (unspec:DI [
            (reg:DI 148 [ ivtmp_87 ])
            (const_int 8 [0x8])
            (const_int 5 [0x5])
            (const_int 0 [0]) repeated x2
        ] UNSPEC_VSETVL))
Failed to match this instruction:
(set (reg:DI 147 [ ivtmp_77 ])
    (ashift:DI (unspec:DI [
                (reg:DI 148 [ ivtmp_87 ])
                (const_int 8 [0x8])
                (const_int 5 [0x5])
                (const_int 0 [0]) repeated x2
            ] UNSPEC_VSETVL)
        (const_int 2 [0x2])))

Trying 27 -> 28:
   27: r146:VNx2SF={(unspec[const_vector,r149:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r154:DI]:unspec[zero:SI] 68}
   28: r145:VNx2DF=float_extend(r146:VNx2SF)
      REG_DEAD r146:VNx2SF
Failed to match this instruction:
(set (reg:VNx2DF 145 [ vect__5.10 ])
    (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 154 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.7_76]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))))

Trying 29 -> 30:
   29: r144:VNx2SF={(unspec[const_vector,r149:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r155:DI]:unspec[zero:SI] 68}
   30: r143:VNx2DF=float_extend(r144:VNx2SF)
      REG_DEAD r144:VNx2SF
Failed to match this instruction:
(set (reg:VNx2DF 143 [ vect__8.14 ])
    (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 155 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.11_71]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))))

Trying 28 -> 33:
   28: r145:VNx2DF=float_extend(r146:VNx2SF)
      REG_DEAD r146:VNx2SF
   33: r160:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r143:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_EQUAL r143:VNx2DF*r145:VNx2DF
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 160 [ vect__11.15 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ]))
                    (reg:VNx2DF 143 [ vect__8.14 ]))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 145 [ vect__5.10 ])
            (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 160 [ vect__11.15 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ]))
                    (reg:VNx2DF 143 [ vect__8.14 ]))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 145 [ vect__5.10 ])
            (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
    ])
Successfully matched this instruction:
(set (reg:VNx2DF 145 [ vect__5.10 ])
    (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
Failed to match this instruction:
(set (reg:VNx2DF 160 [ vect__11.15 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 169)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ]))
            (reg:VNx2DF 143 [ vect__8.14 ]))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))

Trying 30 -> 33:
   30: r143:VNx2DF=float_extend(r144:VNx2SF)
      REG_DEAD r144:VNx2SF
   33: r160:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r143:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_EQUAL r143:VNx2DF*r145:VNx2DF
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 160 [ vect__11.15 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ]))
                    (reg:VNx2DF 145 [ vect__5.10 ]))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 143 [ vect__8.14 ])
            (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ])))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 160 [ vect__11.15 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ]))
                    (reg:VNx2DF 145 [ vect__5.10 ]))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 143 [ vect__8.14 ])
            (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ])))
    ])
Successfully matched this instruction:
(set (reg:VNx2DF 143 [ vect__8.14 ])
    (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ])))
Failed to match this instruction:
(set (reg:VNx2DF 160 [ vect__11.15 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 169)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ]))
            (reg:VNx2DF 145 [ vect__5.10 ]))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))

Trying 27, 28 -> 33:
   27: r146:VNx2SF={(unspec[const_vector,r149:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r154:DI]:unspec[zero:SI] 68}
   28: r145:VNx2DF=float_extend(r146:VNx2SF)
      REG_DEAD r146:VNx2SF
   33: r160:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r143:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_EQUAL r143:VNx2DF*r145:VNx2DF
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 160 [ vect__11.15 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                                    (const_vector:VNx2BI repeat [
                                            (const_int 1 [0x1])
                                        ])
                                    (reg:DI 149 [ _89 ])
                                    (const_int 2 [0x2]) repeated x2
                                    (const_int 0 [0])
                                    (reg:SI 66 vl)
                                    (reg:SI 67 vtype)
                                ] UNSPEC_VPREDICATE)
                            (mem:VNx2SF (reg/v/f:DI 154 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.7_76]+0 S[8, 8] A32])
                            (unspec:VNx2SF [
                                    (reg:SI 0 zero)
                                ] UNSPEC_VUNDEF)))
                    (reg:VNx2DF 143 [ vect__8.14 ]))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 145 [ vect__5.10 ])
            (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 149 [ _89 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 154 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.7_76]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 160 [ vect__11.15 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                                    (const_vector:VNx2BI repeat [
                                            (const_int 1 [0x1])
                                        ])
                                    (reg:DI 149 [ _89 ])
                                    (const_int 2 [0x2]) repeated x2
                                    (const_int 0 [0])
                                    (reg:SI 66 vl)
                                    (reg:SI 67 vtype)
                                ] UNSPEC_VPREDICATE)
                            (mem:VNx2SF (reg/v/f:DI 154 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.7_76]+0 S[8, 8] A32])
                            (unspec:VNx2SF [
                                    (reg:SI 0 zero)
                                ] UNSPEC_VUNDEF)))
                    (reg:VNx2DF 143 [ vect__8.14 ]))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 145 [ vect__5.10 ])
            (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 149 [ _89 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 154 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.7_76]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))))
    ])
Failed to match this instruction:
(set (reg:VNx2DF 145 [ vect__5.10 ])
    (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 154 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.7_76]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))))

Trying 29, 30 -> 33:
   29: r144:VNx2SF={(unspec[const_vector,r149:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r155:DI]:unspec[zero:SI] 68}
   30: r143:VNx2DF=float_extend(r144:VNx2SF)
      REG_DEAD r144:VNx2SF
   33: r160:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r143:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_EQUAL r143:VNx2DF*r145:VNx2DF
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 160 [ vect__11.15 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                                    (const_vector:VNx2BI repeat [
                                            (const_int 1 [0x1])
                                        ])
                                    (reg:DI 149 [ _89 ])
                                    (const_int 2 [0x2]) repeated x2
                                    (const_int 0 [0])
                                    (reg:SI 66 vl)
                                    (reg:SI 67 vtype)
                                ] UNSPEC_VPREDICATE)
                            (mem:VNx2SF (reg/v/f:DI 155 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.11_71]+0 S[8, 8] A32])
                            (unspec:VNx2SF [
                                    (reg:SI 0 zero)
                                ] UNSPEC_VUNDEF)))
                    (reg:VNx2DF 145 [ vect__5.10 ]))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 143 [ vect__8.14 ])
            (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 149 [ _89 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 155 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.11_71]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 160 [ vect__11.15 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                                    (const_vector:VNx2BI repeat [
                                            (const_int 1 [0x1])
                                        ])
                                    (reg:DI 149 [ _89 ])
                                    (const_int 2 [0x2]) repeated x2
                                    (const_int 0 [0])
                                    (reg:SI 66 vl)
                                    (reg:SI 67 vtype)
                                ] UNSPEC_VPREDICATE)
                            (mem:VNx2SF (reg/v/f:DI 155 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.11_71]+0 S[8, 8] A32])
                            (unspec:VNx2SF [
                                    (reg:SI 0 zero)
                                ] UNSPEC_VUNDEF)))
                    (reg:VNx2DF 145 [ vect__5.10 ]))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 143 [ vect__8.14 ])
            (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 149 [ _89 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 155 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.11_71]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))))
    ])
Failed to match this instruction:
(set (reg:VNx2DF 143 [ vect__8.14 ])
    (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 155 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.11_71]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))))

Trying 30, 28 -> 33:
   30: r143:VNx2DF=float_extend(r144:VNx2SF)
      REG_DEAD r144:VNx2SF
   28: r145:VNx2DF=float_extend(r146:VNx2SF)
      REG_DEAD r146:VNx2SF
   33: r160:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r143:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_EQUAL r143:VNx2DF*r145:VNx2DF
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 160 [ vect__11.15 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ]))
                    (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 143 [ vect__8.14 ])
            (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ])))
        (set (reg:VNx2DF 145 [ vect__5.10 ])
            (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 160 [ vect__11.15 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ]))
                    (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 143 [ vect__8.14 ])
            (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ])))
        (set (reg:VNx2DF 145 [ vect__5.10 ])
            (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
    ])

Trying 33 -> 34:
   33: r160:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r143:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_EQUAL r143:VNx2DF*r145:VNx2DF
   34: [r150:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r160:VNx2DF:[r150:DI]}
      REG_DEAD r160:VNx2DF
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 169)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (reg:VNx2DF 143 [ vect__8.14 ])
                (reg:VNx2DF 145 [ vect__5.10 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])))

Trying 28, 33 -> 34:
   28: r145:VNx2DF=float_extend(r146:VNx2SF)
      REG_DEAD r146:VNx2SF
   33: r160:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r143:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_EQUAL r143:VNx2DF*r145:VNx2DF
   34: [r150:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r160:VNx2DF:[r150:DI]}
      REG_DEAD r160:VNx2DF
Failed to match this instruction:
(parallel [
        (set (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 149 [ _89 ])
                        (const_int 0 [0])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                    ] UNSPEC_VPREDICATE)
                (if_then_else:VNx2DF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 169)
                            (const_int 2 [0x2]) repeated x2
                            (const_int 1 [0x1])
                            (const_int 7 [0x7])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                            (reg:SI 69 N/A)
                        ] UNSPEC_VPREDICATE)
                    (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ]))
                        (reg:VNx2DF 143 [ vect__8.14 ]))
                    (unspec:VNx2DF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))
                (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])))
        (set (reg:VNx2DF 145 [ vect__5.10 ])
            (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
    ])
Failed to match this instruction:
(parallel [
        (set (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 149 [ _89 ])
                        (const_int 0 [0])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                    ] UNSPEC_VPREDICATE)
                (if_then_else:VNx2DF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 169)
                            (const_int 2 [0x2]) repeated x2
                            (const_int 1 [0x1])
                            (const_int 7 [0x7])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                            (reg:SI 69 N/A)
                        ] UNSPEC_VPREDICATE)
                    (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ]))
                        (reg:VNx2DF 143 [ vect__8.14 ]))
                    (unspec:VNx2DF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))
                (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])))
        (set (reg:VNx2DF 145 [ vect__5.10 ])
            (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
    ])
Successfully matched this instruction:
(set (reg:VNx2DF 145 [ vect__5.10 ])
    (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 169)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ]))
                (reg:VNx2DF 143 [ vect__8.14 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])))

Trying 30, 33 -> 34:
   30: r143:VNx2DF=float_extend(r144:VNx2SF)
      REG_DEAD r144:VNx2SF
   33: r160:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r143:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_EQUAL r143:VNx2DF*r145:VNx2DF
   34: [r150:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r160:VNx2DF:[r150:DI]}
      REG_DEAD r160:VNx2DF
Failed to match this instruction:
(parallel [
        (set (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 149 [ _89 ])
                        (const_int 0 [0])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                    ] UNSPEC_VPREDICATE)
                (if_then_else:VNx2DF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 169)
                            (const_int 2 [0x2]) repeated x2
                            (const_int 1 [0x1])
                            (const_int 7 [0x7])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                            (reg:SI 69 N/A)
                        ] UNSPEC_VPREDICATE)
                    (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ]))
                        (reg:VNx2DF 145 [ vect__5.10 ]))
                    (unspec:VNx2DF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))
                (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])))
        (set (reg:VNx2DF 143 [ vect__8.14 ])
            (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ])))
    ])
Failed to match this instruction:
(parallel [
        (set (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 149 [ _89 ])
                        (const_int 0 [0])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                    ] UNSPEC_VPREDICATE)
                (if_then_else:VNx2DF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 169)
                            (const_int 2 [0x2]) repeated x2
                            (const_int 1 [0x1])
                            (const_int 7 [0x7])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                            (reg:SI 69 N/A)
                        ] UNSPEC_VPREDICATE)
                    (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ]))
                        (reg:VNx2DF 145 [ vect__5.10 ]))
                    (unspec:VNx2DF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))
                (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])))
        (set (reg:VNx2DF 143 [ vect__8.14 ])
            (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ])))
    ])
Successfully matched this instruction:
(set (reg:VNx2DF 143 [ vect__8.14 ])
    (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ])))
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 169)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ]))
                (reg:VNx2DF 145 [ vect__5.10 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])))

Trying 33 -> 34:
   33: r160:VNx2DF=r143:VNx2DF*r145:VNx2DF
      REG_EQUAL r143:VNx2DF*r145:VNx2DF
   34: [r150:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r160:VNx2DF:[r150:DI]}
      REG_DEAD r160:VNx2DF
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (reg:VNx2DF 143 [ vect__8.14 ])
            (reg:VNx2DF 145 [ vect__5.10 ]))
        (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])))

Trying 36 -> 37:
   36: r140:VNx2SF={(unspec[const_vector,r149:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r156:DI]:unspec[zero:SI] 68}
   37: r139:VNx2DF=float_extend(r140:VNx2SF)
      REG_DEAD r140:VNx2SF
Failed to match this instruction:
(set (reg:VNx2DF 139 [ vect__14.21 ])
    (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 156 [ a2 ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a2.18_61]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))))

Trying 37 -> 39:
   37: r139:VNx2DF=float_extend(r140:VNx2SF)
      REG_DEAD r140:VNx2SF
   39: r162:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r139:VNx2DF*r143:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r143:VNx2DF
      REG_EQUAL r139:VNx2DF*r143:VNx2DF
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 162 [ vect__16.22 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 140 [ vect__13.20 ]))
                    (reg:VNx2DF 143 [ vect__8.14 ]))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 139 [ vect__14.21 ])
            (float_extend:VNx2DF (reg:VNx2SF 140 [ vect__13.20 ])))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 162 [ vect__16.22 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 140 [ vect__13.20 ]))
                    (reg:VNx2DF 143 [ vect__8.14 ]))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 139 [ vect__14.21 ])
            (float_extend:VNx2DF (reg:VNx2SF 140 [ vect__13.20 ])))
    ])
Successfully matched this instruction:
(set (reg:VNx2DF 139 [ vect__14.21 ])
    (float_extend:VNx2DF (reg:VNx2SF 140 [ vect__13.20 ])))
Failed to match this instruction:
(set (reg:VNx2DF 162 [ vect__16.22 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 169)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 140 [ vect__13.20 ]))
            (reg:VNx2DF 143 [ vect__8.14 ]))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))

Trying 36, 37 -> 39:
   36: r140:VNx2SF={(unspec[const_vector,r149:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r156:DI]:unspec[zero:SI] 68}
   37: r139:VNx2DF=float_extend(r140:VNx2SF)
      REG_DEAD r140:VNx2SF
   39: r162:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r139:VNx2DF*r143:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r143:VNx2DF
      REG_EQUAL r139:VNx2DF*r143:VNx2DF
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 162 [ vect__16.22 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                                    (const_vector:VNx2BI repeat [
                                            (const_int 1 [0x1])
                                        ])
                                    (reg:DI 149 [ _89 ])
                                    (const_int 2 [0x2]) repeated x2
                                    (const_int 0 [0])
                                    (reg:SI 66 vl)
                                    (reg:SI 67 vtype)
                                ] UNSPEC_VPREDICATE)
                            (mem:VNx2SF (reg/v/f:DI 156 [ a2 ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a2.18_61]+0 S[8, 8] A32])
                            (unspec:VNx2SF [
                                    (reg:SI 0 zero)
                                ] UNSPEC_VUNDEF)))
                    (reg:VNx2DF 143 [ vect__8.14 ]))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 139 [ vect__14.21 ])
            (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 149 [ _89 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 156 [ a2 ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a2.18_61]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:VNx2DF 162 [ vect__16.22 ])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 169)
                        (const_int 2 [0x2]) repeated x2
                        (const_int 1 [0x1])
                        (const_int 7 [0x7])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                        (reg:SI 69 N/A)
                    ] UNSPEC_VPREDICATE)
                (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                                    (const_vector:VNx2BI repeat [
                                            (const_int 1 [0x1])
                                        ])
                                    (reg:DI 149 [ _89 ])
                                    (const_int 2 [0x2]) repeated x2
                                    (const_int 0 [0])
                                    (reg:SI 66 vl)
                                    (reg:SI 67 vtype)
                                ] UNSPEC_VPREDICATE)
                            (mem:VNx2SF (reg/v/f:DI 156 [ a2 ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a2.18_61]+0 S[8, 8] A32])
                            (unspec:VNx2SF [
                                    (reg:SI 0 zero)
                                ] UNSPEC_VUNDEF)))
                    (reg:VNx2DF 143 [ vect__8.14 ]))
                (unspec:VNx2DF [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)))
        (set (reg:VNx2DF 139 [ vect__14.21 ])
            (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 149 [ _89 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 156 [ a2 ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a2.18_61]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))))
    ])
Failed to match this instruction:
(set (reg:VNx2DF 139 [ vect__14.21 ])
    (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 156 [ a2 ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a2.18_61]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))))

Trying 39 -> 40:
   39: r162:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r139:VNx2DF*r143:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r143:VNx2DF
      REG_EQUAL r139:VNx2DF*r143:VNx2DF
   40: [r151:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r162:VNx2DF:[r151:DI]}
      REG_DEAD r162:VNx2DF
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 151 [ dst2 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst2.23_55]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 169)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (reg:VNx2DF 139 [ vect__14.21 ])
                (reg:VNx2DF 143 [ vect__8.14 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 151 [ dst2 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst2.23_55]+0 S[16, 16] A64])))

Trying 37, 39 -> 40:
   37: r139:VNx2DF=float_extend(r140:VNx2SF)
      REG_DEAD r140:VNx2SF
   39: r162:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r139:VNx2DF*r143:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r143:VNx2DF
      REG_EQUAL r139:VNx2DF*r143:VNx2DF
   40: [r151:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r162:VNx2DF:[r151:DI]}
      REG_DEAD r162:VNx2DF
Failed to match this instruction:
(parallel [
        (set (mem:VNx2DF (reg/v/f:DI 151 [ dst2 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst2.23_55]+0 S[16, 16] A64])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 149 [ _89 ])
                        (const_int 0 [0])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                    ] UNSPEC_VPREDICATE)
                (if_then_else:VNx2DF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 169)
                            (const_int 2 [0x2]) repeated x2
                            (const_int 1 [0x1])
                            (const_int 7 [0x7])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                            (reg:SI 69 N/A)
                        ] UNSPEC_VPREDICATE)
                    (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 140 [ vect__13.20 ]))
                        (reg:VNx2DF 143 [ vect__8.14 ]))
                    (unspec:VNx2DF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))
                (mem:VNx2DF (reg/v/f:DI 151 [ dst2 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst2.23_55]+0 S[16, 16] A64])))
        (set (reg:VNx2DF 139 [ vect__14.21 ])
            (float_extend:VNx2DF (reg:VNx2SF 140 [ vect__13.20 ])))
    ])
Failed to match this instruction:
(parallel [
        (set (mem:VNx2DF (reg/v/f:DI 151 [ dst2 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst2.23_55]+0 S[16, 16] A64])
            (if_then_else:VNx2DF (unspec:VNx2BI [
                        (const_vector:VNx2BI repeat [
                                (const_int 1 [0x1])
                            ])
                        (reg:DI 149 [ _89 ])
                        (const_int 0 [0])
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                    ] UNSPEC_VPREDICATE)
                (if_then_else:VNx2DF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 169)
                            (const_int 2 [0x2]) repeated x2
                            (const_int 1 [0x1])
                            (const_int 7 [0x7])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                            (reg:SI 69 N/A)
                        ] UNSPEC_VPREDICATE)
                    (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 140 [ vect__13.20 ]))
                        (reg:VNx2DF 143 [ vect__8.14 ]))
                    (unspec:VNx2DF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))
                (mem:VNx2DF (reg/v/f:DI 151 [ dst2 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst2.23_55]+0 S[16, 16] A64])))
        (set (reg:VNx2DF 139 [ vect__14.21 ])
            (float_extend:VNx2DF (reg:VNx2SF 140 [ vect__13.20 ])))
    ])
Successfully matched this instruction:
(set (reg:VNx2DF 139 [ vect__14.21 ])
    (float_extend:VNx2DF (reg:VNx2SF 140 [ vect__13.20 ])))
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 151 [ dst2 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst2.23_55]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 169)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 140 [ vect__13.20 ]))
                (reg:VNx2DF 143 [ vect__8.14 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 151 [ dst2 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst2.23_55]+0 S[16, 16] A64])))

Trying 39 -> 40:
   39: r162:VNx2DF=r139:VNx2DF*r143:VNx2DF
      REG_DEAD r143:VNx2DF
      REG_EQUAL r139:VNx2DF*r143:VNx2DF
   40: [r151:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r162:VNx2DF:[r151:DI]}
      REG_DEAD r162:VNx2DF
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 151 [ dst2 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst2.23_55]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (reg:VNx2DF 139 [ vect__14.21 ])
            (reg:VNx2DF 143 [ vect__8.14 ]))
        (mem:VNx2DF (reg/v/f:DI 151 [ dst2 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst2.23_55]+0 S[16, 16] A64])))

Trying 43 -> 44:
   43: r164:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r139:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r139:VNx2DF
      REG_EQUAL r139:VNx2DF*r145:VNx2DF
   44: [r152:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r164:VNx2DF:[r152:DI]}
      REG_DEAD r164:VNx2DF
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 152 [ dst3 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst3.26_50]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 169)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (reg:VNx2DF 139 [ vect__14.21 ])
                (reg:VNx2DF 145 [ vect__5.10 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 152 [ dst3 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst3.26_50]+0 S[16, 16] A64])))

Trying 43 -> 44:
   43: r164:VNx2DF=r139:VNx2DF*r145:VNx2DF
      REG_DEAD r139:VNx2DF
      REG_EQUAL r139:VNx2DF*r145:VNx2DF
   44: [r152:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r164:VNx2DF:[r152:DI]}
      REG_DEAD r164:VNx2DF
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 152 [ dst3 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst3.26_50]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (reg:VNx2DF 139 [ vect__14.21 ])
            (reg:VNx2DF 145 [ vect__5.10 ]))
        (mem:VNx2DF (reg/v/f:DI 152 [ dst3 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst3.26_50]+0 S[16, 16] A64])))

Trying 46 -> 47:
   46: r136:VNx2SF={(unspec[const_vector,r149:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r157:DI]:unspec[zero:SI] 68}
   47: r166:VNx2DF=float_extend(r136:VNx2SF)
      REG_DEAD r136:VNx2SF
Failed to match this instruction:
(set (reg:VNx2DF 166 [ vect__21.31 ])
    (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 157 [ b2 ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b2.28_46]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))))

Trying 47 -> 49:
   47: r166:VNx2DF=float_extend(r136:VNx2SF)
      REG_DEAD r136:VNx2SF
   49: r167:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r166:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r166:VNx2DF
      REG_DEAD r145:VNx2DF
      REG_DEAD N/A:SI
      REG_DEAD zero:SI
      REG_EQUAL r166:VNx2DF*r145:VNx2DF
Failed to match this instruction:
(set (reg:VNx2DF 167 [ vect__23.32 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 169)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 136 [ vect__20.30 ]))
            (reg:VNx2DF 145 [ vect__5.10 ]))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))

Trying 46, 47 -> 49:
   46: r136:VNx2SF={(unspec[const_vector,r149:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r157:DI]:unspec[zero:SI] 68}
   47: r166:VNx2DF=float_extend(r136:VNx2SF)
      REG_DEAD r136:VNx2SF
   49: r167:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r166:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r166:VNx2DF
      REG_DEAD r145:VNx2DF
      REG_DEAD N/A:SI
      REG_DEAD zero:SI
      REG_EQUAL r166:VNx2DF*r145:VNx2DF
Failed to match this instruction:
(set (reg:VNx2DF 167 [ vect__23.32 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 169)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 149 [ _89 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 157 [ b2 ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b2.28_46]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF)))
            (reg:VNx2DF 145 [ vect__5.10 ]))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))
Successfully matched this instruction:
(set (reg:VNx2SF 166 [ vect__21.31 ])
    (if_then_else:VNx2SF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 2 [0x2]) repeated x2
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (mem:VNx2SF (reg/v/f:DI 157 [ b2 ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b2.28_46]+0 S[8, 8] A32])
        (unspec:VNx2SF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))
Failed to match this instruction:
(set (reg:VNx2DF 167 [ vect__23.32 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 169)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 166 [ vect__21.31 ]))
            (reg:VNx2DF 145 [ vect__5.10 ]))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))

Trying 49 -> 50:
   49: r167:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r166:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r166:VNx2DF
      REG_DEAD r145:VNx2DF
      REG_DEAD N/A:SI
      REG_DEAD zero:SI
      REG_EQUAL r166:VNx2DF*r145:VNx2DF
   50: [r153:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r167:VNx2DF:[r153:DI]}
      REG_DEAD r167:VNx2DF
      REG_DEAD vtype:SI
      REG_DEAD vl:SI
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 153 [ dst4 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst4.33_83]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 169)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (reg:VNx2DF 166 [ vect__21.31 ])
                (reg:VNx2DF 145 [ vect__5.10 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 153 [ dst4 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst4.33_83]+0 S[16, 16] A64])))

Trying 47, 49 -> 50:
   47: r166:VNx2DF=float_extend(r136:VNx2SF)
      REG_DEAD r136:VNx2SF
   49: r167:VNx2DF={(unspec[const_vector,r169:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r166:VNx2DF*r145:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r166:VNx2DF
      REG_DEAD r145:VNx2DF
      REG_DEAD N/A:SI
      REG_DEAD zero:SI
      REG_EQUAL r166:VNx2DF*r145:VNx2DF
   50: [r153:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r167:VNx2DF:[r153:DI]}
      REG_DEAD r167:VNx2DF
      REG_DEAD vtype:SI
      REG_DEAD vl:SI
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 153 [ dst4 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst4.33_83]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 169)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 136 [ vect__20.30 ]))
                (reg:VNx2DF 145 [ vect__5.10 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 153 [ dst4 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst4.33_83]+0 S[16, 16] A64])))
Successfully matched this instruction:
(set (reg:VNx2DF 167 [ vect__23.32 ])
    (float_extend:VNx2DF (reg:VNx2SF 136 [ vect__20.30 ])))
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 153 [ dst4 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst4.33_83]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 169)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (reg:VNx2DF 167 [ vect__23.32 ])
                (reg:VNx2DF 145 [ vect__5.10 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 153 [ dst4 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst4.33_83]+0 S[16, 16] A64])))

Trying 49 -> 50:
   49: r167:VNx2DF=r166:VNx2DF*r145:VNx2DF
      REG_DEAD r166:VNx2DF
      REG_DEAD r145:VNx2DF
      REG_DEAD N/A:SI
      REG_DEAD zero:SI
      REG_EQUAL r166:VNx2DF*r145:VNx2DF
   50: [r153:DI]={(unspec[const_vector,r149:DI,0,vl:SI,vtype:SI] 69)?r167:VNx2DF:[r153:DI]}
      REG_DEAD r167:VNx2DF
      REG_DEAD vtype:SI
      REG_DEAD vl:SI
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 153 [ dst4 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst4.33_83]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 149 [ _89 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (reg:VNx2DF 166 [ vect__21.31 ])
            (reg:VNx2DF 145 [ vect__5.10 ]))
        (mem:VNx2DF (reg/v/f:DI 153 [ dst4 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst4.33_83]+0 S[16, 16] A64])))

Trying 26 -> 54:
   26: r147:DI=r149:DI<<0x2
   54: r154:DI=r154:DI+r147:DI
Failed to match this instruction:
(parallel [
        (set (reg/v/f:DI 154 [ a ])
            (plus:DI (ashift:DI (reg:DI 149 [ _89 ])
                    (const_int 2 [0x2]))
                (reg/v/f:DI 154 [ a ])))
        (set (reg:DI 147 [ ivtmp_77 ])
            (ashift:DI (reg:DI 149 [ _89 ])
                (const_int 2 [0x2])))
    ])
Failed to match this instruction:
(parallel [
        (set (reg/v/f:DI 154 [ a ])
            (plus:DI (ashift:DI (reg:DI 149 [ _89 ])
                    (const_int 2 [0x2]))
                (reg/v/f:DI 154 [ a ])))
        (set (reg:DI 147 [ ivtmp_77 ])
            (ashift:DI (reg:DI 149 [ _89 ])
                (const_int 2 [0x2])))
    ])
Successfully matched this instruction:
(set (reg:DI 147 [ ivtmp_77 ])
    (ashift:DI (reg:DI 149 [ _89 ])
        (const_int 2 [0x2])))
Failed to match this instruction:
(set (reg/v/f:DI 154 [ a ])
    (plus:DI (ashift:DI (reg:DI 149 [ _89 ])
            (const_int 2 [0x2]))
        (reg/v/f:DI 154 [ a ])))

Trying 24, 26 -> 54:
   24: r149:DI=unspec[r148:DI,0x8,0x5,0,0] 67
   26: r147:DI=r149:DI<<0x2
   54: r154:DI=r154:DI+r147:DI
Can't combine i1 into i3

Trying 31 -> 56:
   31: r141:DI=r149:DI<<0x3
   56: r150:DI=r150:DI+r141:DI
Failed to match this instruction:
(parallel [
        (set (reg/v/f:DI 150 [ dst ])
            (plus:DI (ashift:DI (reg:DI 149 [ _89 ])
                    (const_int 3 [0x3]))
                (reg/v/f:DI 150 [ dst ])))
        (set (reg:DI 141 [ ivtmp_66 ])
            (ashift:DI (reg:DI 149 [ _89 ])
                (const_int 3 [0x3])))
    ])
Failed to match this instruction:
(parallel [
        (set (reg/v/f:DI 150 [ dst ])
            (plus:DI (ashift:DI (reg:DI 149 [ _89 ])
                    (const_int 3 [0x3]))
                (reg/v/f:DI 150 [ dst ])))
        (set (reg:DI 141 [ ivtmp_66 ])
            (ashift:DI (reg:DI 149 [ _89 ])
                (const_int 3 [0x3])))
    ])
Successfully matched this instruction:
(set (reg:DI 141 [ ivtmp_66 ])
    (ashift:DI (reg:DI 149 [ _89 ])
        (const_int 3 [0x3])))
Failed to match this instruction:
(set (reg/v/f:DI 150 [ dst ])
    (plus:DI (ashift:DI (reg:DI 149 [ _89 ])
            (const_int 3 [0x3]))
        (reg/v/f:DI 150 [ dst ])))

Trying 62 -> 64:
   62: r148:DI=r148:DI-r149:DI
      REG_DEAD r149:DI
   64: pc={(r148:DI!=0)?L63:pc}
      REG_BR_PROB 894784862
Failed to match this instruction:
(parallel [
        (set (pc)
            (if_then_else (ne (reg:DI 148 [ ivtmp_87 ])
                    (reg:DI 149 [ _89 ]))
                (label_ref:DI 63)
                (pc)))
        (set (reg:DI 148 [ ivtmp_87 ])
            (minus:DI (reg:DI 148 [ ivtmp_87 ])
                (reg:DI 149 [ _89 ])))
    ])
Failed to match this instruction:
(parallel [
        (set (pc)
            (if_then_else (ne (reg:DI 148 [ ivtmp_87 ])
                    (reg:DI 149 [ _89 ]))
                (label_ref:DI 63)
                (pc)))
        (set (reg:DI 148 [ ivtmp_87 ])
            (minus:DI (reg:DI 148 [ ivtmp_87 ])
                (reg:DI 149 [ _89 ])))
    ])
starting the processing of deferred insns
ending the processing of deferred insns


vwadd_TYPE1_float

Dataflow summary:
;;  fully invalidated by EH 	 0 [zero] 3 [gp] 4 [tp] 5 [t0] 6 [t1] 7 [t2] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 28 [t3] 29 [t4] 30 [t5] 31 [t6] 32 [ft0] 33 [ft1] 34 [ft2] 35 [ft3] 36 [ft4] 37 [ft5] 38 [ft6] 39 [ft7] 42 [fa0] 43 [fa1] 44 [fa2] 45 [fa3] 46 [fa4] 47 [fa5] 48 [fa6] 49 [fa7] 60 [ft8] 61 [ft9] 62 [ft10] 63 [ft11] 66 [vl] 67 [vtype] 68 [vxrm] 69 [N/A] 70 [N/A] 71 [N/A] 72 [N/A] 73 [N/A] 74 [N/A] 75 [N/A] 76 [N/A] 77 [N/A] 78 [N/A] 79 [N/A] 80 [N/A] 81 [N/A] 82 [N/A] 83 [N/A] 84 [N/A] 85 [N/A] 86 [N/A] 87 [N/A] 88 [N/A] 89 [N/A] 90 [N/A] 91 [N/A] 92 [N/A] 93 [N/A] 94 [N/A] 95 [N/A] 96 [v0] 97 [v1] 98 [v2] 99 [v3] 100 [v4] 101 [v5] 102 [v6] 103 [v7] 104 [v8] 105 [v9] 106 [v10] 107 [v11] 108 [v12] 109 [v13] 110 [v14] 111 [v15] 112 [v16] 113 [v17] 114 [v18] 115 [v19] 116 [v20] 117 [v21] 118 [v22] 119 [v23] 120 [v24] 121 [v25] 122 [v26] 123 [v27] 124 [v28] 125 [v29] 126 [v30] 127 [v31]
;;  hardware regs used 	 2 [sp] 64 [arg] 65 [frame]
;;  regular block artificial uses 	 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;;  eh block artificial uses 	 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;;  entry block defs 	 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 42 [fa0] 43 [fa1] 44 [fa2] 45 [fa3] 46 [fa4] 47 [fa5] 48 [fa6] 49 [fa7] 64 [arg] 65 [frame]
;;  exit block uses 	 1 [ra] 2 [sp] 8 [s0] 65 [frame]
;;  regs ever live 	 0 [zero] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 66 [vl] 67 [vtype] 69 [N/A]
;;  ref usage 	r0={8u} r1={1d,1u} r2={1d,5u} r8={1d,5u} r10={1d,1u} r11={1d,1u} r12={1d,1u} r13={1d,1u} r14={1d,1u} r15={1d,1u} r16={1d,1u} r17={1d,1u} r42={1d} r43={1d} r44={1d} r45={1d} r46={1d} r47={1d} r48={1d} r49={1d} r64={1d,5u,1e} r65={1d,5u} r66={12u} r67={12u} r69={4u} r136={1d,1u} r139={1d,2u,2e} r140={1d,1u} r141={1d,4u} r143={1d,2u,2e} r144={1d,1u} r145={1d,3u,3e} r146={1d,1u} r147={1d,4u} r148={2d,3u} r149={1d,11u} r150={2d,3u} r151={2d,3u} r152={2d,3u} r153={2d,3u} r154={2d,2u} r155={2d,2u} r156={2d,2u} r157={2d,2u} r158={1d,2u} r160={1d,1u} r162={1d,1u} r164={1d,1u} r166={1d,1u,1e} r167={1d,1u} r169={1d,4u} r170={1d,1u} r171={1d,1u} r172={1d,1u} r173={1d,1u} r174={1d,1u} r175={1d,1u} r176={1d,1u} r177={1d,1u} 
;;    total ref usage 210{64d,137u,9e} in 59{59 regular + 0 call} insns.
(note 12 0 79 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 79 12 2 2 (set (reg:DI 170)
        (reg:DI 10 a0 [ dst ])) "/app/example.cpp":18:1 -1
     (expr_list:REG_DEAD (reg:DI 10 a0 [ dst ])
        (nil)))
(insn 2 79 80 2 (set (reg/v/f:DI 150 [ dst ])
        (reg:DI 170)) "/app/example.cpp":18:1 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 170)
        (nil)))
(insn 80 2 3 2 (set (reg:DI 171)
        (reg:DI 11 a1 [ dst2 ])) "/app/example.cpp":18:1 -1
     (expr_list:REG_DEAD (reg:DI 11 a1 [ dst2 ])
        (nil)))
(insn 3 80 81 2 (set (reg/v/f:DI 151 [ dst2 ])
        (reg:DI 171)) "/app/example.cpp":18:1 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 171)
        (nil)))
(insn 81 3 4 2 (set (reg:DI 172)
        (reg:DI 12 a2 [ dst3 ])) "/app/example.cpp":18:1 -1
     (expr_list:REG_DEAD (reg:DI 12 a2 [ dst3 ])
        (nil)))
(insn 4 81 82 2 (set (reg/v/f:DI 152 [ dst3 ])
        (reg:DI 172)) "/app/example.cpp":18:1 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 172)
        (nil)))
(insn 82 4 5 2 (set (reg:DI 173)
        (reg:DI 13 a3 [ dst4 ])) "/app/example.cpp":18:1 -1
     (expr_list:REG_DEAD (reg:DI 13 a3 [ dst4 ])
        (nil)))
(insn 5 82 83 2 (set (reg/v/f:DI 153 [ dst4 ])
        (reg:DI 173)) "/app/example.cpp":18:1 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 173)
        (nil)))
(insn 83 5 6 2 (set (reg:DI 174)
        (reg:DI 14 a4 [ a ])) "/app/example.cpp":18:1 -1
     (expr_list:REG_DEAD (reg:DI 14 a4 [ a ])
        (nil)))
(insn 6 83 84 2 (set (reg/v/f:DI 154 [ a ])
        (reg:DI 174)) "/app/example.cpp":18:1 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 174)
        (nil)))
(insn 84 6 7 2 (set (reg:DI 175)
        (reg:DI 15 a5 [ b ])) "/app/example.cpp":18:1 -1
     (expr_list:REG_DEAD (reg:DI 15 a5 [ b ])
        (nil)))
(insn 7 84 85 2 (set (reg/v/f:DI 155 [ b ])
        (reg:DI 175)) "/app/example.cpp":18:1 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 175)
        (nil)))
(insn 85 7 8 2 (set (reg:DI 176)
        (reg:DI 16 a6 [ a2 ])) "/app/example.cpp":18:1 -1
     (expr_list:REG_DEAD (reg:DI 16 a6 [ a2 ])
        (nil)))
(insn 8 85 86 2 (set (reg/v/f:DI 156 [ a2 ])
        (reg:DI 176)) "/app/example.cpp":18:1 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 176)
        (nil)))
(insn 86 8 9 2 (set (reg:DI 177)
        (reg:DI 17 a7 [ b2 ])) "/app/example.cpp":18:1 -1
     (expr_list:REG_DEAD (reg:DI 17 a7 [ b2 ])
        (nil)))
(insn 9 86 10 2 (set (reg/v/f:DI 157 [ b2 ])
        (reg:DI 177)) "/app/example.cpp":18:1 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 177)
        (nil)))
(insn 10 9 11 2 (set (reg/v:DI 158 [ n ])
        (sign_extend:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128]))) "/app/example.cpp":18:1 -1
     (expr_list:REG_EQUIV (sign_extend:DI (mem/c:SI (reg/f:DI 64 arg) [3 n+0 S4 A128]))
        (nil)))
(note 11 10 14 2 NOTE_INSN_FUNCTION_BEG)
(debug_insn 14 11 15 2 (debug_marker) "/app/example.cpp":18:1 -1
     (nil))
(debug_insn 15 14 16 2 (var_location:SI i (const_int 0 [0])) -1
     (nil))
(debug_insn 16 15 20 2 (debug_marker) "/app/example.cpp":18:1 discrim 1 -1
     (nil))
(jump_insn 20 16 21 2 (set (pc)
        (if_then_else (le (reg/v:DI 158 [ n ])
                (const_int 0 [0]))
            (label_ref:DI 68)
            (pc))) "/app/example.cpp":18:1 discrim 1 242 {*branchdi}
     (int_list:REG_BR_PROB 118111604 (nil))
 -> 68)
(note 21 20 22 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 22 21 74 3 (set (reg:DI 148 [ ivtmp_87 ])
        (reg/v:DI 158 [ n ])) 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg/v:DI 158 [ n ])
        (nil)))
(insn 74 22 63 3 (set (reg:DI 169)
        (unspec:DI [
                (const_int 64 [0x40])
            ] UNSPEC_VLMAX)) 688 {vlmax_avldi}
     (nil))
(code_label 63 74 23 4 3 (nil) [1 uses])
(note 23 63 24 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 24 23 25 4 (set (reg:DI 149 [ _89 ])
        (unspec:DI [
                (reg:DI 148 [ ivtmp_87 ])
                (const_int 8 [0x8])
                (const_int 5 [0x5])
                (const_int 0 [0]) repeated x2
            ] UNSPEC_VSETVL)) 1116 {vsetvldi_no_side_effects}
     (nil))
(debug_insn 25 24 26 4 (debug_marker) "/app/example.cpp":18:1 discrim 3 -1
     (nil))
(insn 26 25 27 4 (set (reg:DI 147 [ ivtmp_77 ])
        (ashift:DI (reg:DI 149 [ _89 ])
            (const_int 2 [0x2]))) 198 {ashldi3}
     (nil))
(insn 27 26 28 4 (set (reg:VNx2SF 146 [ vect__4.9 ])
        (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 154 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.7_76]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.cpp":18:1 discrim 3 1151 {pred_movvnx2sf}
     (nil))
(insn 28 27 29 4 (set (reg:VNx2DF 145 [ vect__5.10 ])
        (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ]))) "/app/example.cpp":18:1 discrim 3 12395 {extendvnx2sfvnx2df2}
     (expr_list:REG_DEAD (reg:VNx2SF 146 [ vect__4.9 ])
        (nil)))
(insn 29 28 30 4 (set (reg:VNx2SF 144 [ vect__7.13 ])
        (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 155 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.11_71]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.cpp":18:1 discrim 3 1151 {pred_movvnx2sf}
     (nil))
(insn 30 29 31 4 (set (reg:VNx2DF 143 [ vect__8.14 ])
        (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ]))) "/app/example.cpp":18:1 discrim 3 12395 {extendvnx2sfvnx2df2}
     (expr_list:REG_DEAD (reg:VNx2SF 144 [ vect__7.13 ])
        (nil)))
(insn 31 30 33 4 (set (reg:DI 141 [ ivtmp_66 ])
        (ashift:DI (reg:DI 149 [ _89 ])
            (const_int 3 [0x3]))) 198 {ashldi3}
     (nil))
(insn 33 31 34 4 (set (reg:VNx2DF 160 [ vect__11.15 ])
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 169)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (reg:VNx2DF 143 [ vect__8.14 ])
                (reg:VNx2DF 145 [ vect__5.10 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.cpp":18:1 discrim 3 -1
     (expr_list:REG_EQUAL (mult:VNx2DF (reg:VNx2DF 143 [ vect__8.14 ])
            (reg:VNx2DF 145 [ vect__5.10 ]))
        (nil)))
(insn 34 33 35 4 (set (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64])
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (reg:VNx2DF 160 [ vect__11.15 ])
            (mem:VNx2DF (reg/v/f:DI 150 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.16_65]+0 S[16, 16] A64]))) "/app/example.cpp":18:1 discrim 3 1201 {pred_storevnx2df}
     (expr_list:REG_DEAD (reg:VNx2DF 160 [ vect__11.15 ])
        (nil)))
(debug_insn 35 34 36 4 (debug_marker) "/app/example.cpp":18:1 discrim 3 -1
     (nil))
(insn 36 35 37 4 (set (reg:VNx2SF 140 [ vect__13.20 ])
        (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 156 [ a2 ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a2.18_61]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.cpp":18:1 discrim 3 1151 {pred_movvnx2sf}
     (nil))
(insn 37 36 39 4 (set (reg:VNx2DF 139 [ vect__14.21 ])
        (float_extend:VNx2DF (reg:VNx2SF 140 [ vect__13.20 ]))) "/app/example.cpp":18:1 discrim 3 12395 {extendvnx2sfvnx2df2}
     (expr_list:REG_DEAD (reg:VNx2SF 140 [ vect__13.20 ])
        (nil)))
(insn 39 37 40 4 (set (reg:VNx2DF 162 [ vect__16.22 ])
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 169)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (reg:VNx2DF 139 [ vect__14.21 ])
                (reg:VNx2DF 143 [ vect__8.14 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.cpp":18:1 discrim 3 -1
     (expr_list:REG_DEAD (reg:VNx2DF 143 [ vect__8.14 ])
        (expr_list:REG_EQUAL (mult:VNx2DF (reg:VNx2DF 139 [ vect__14.21 ])
                (reg:VNx2DF 143 [ vect__8.14 ]))
            (nil))))
(insn 40 39 41 4 (set (mem:VNx2DF (reg/v/f:DI 151 [ dst2 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst2.23_55]+0 S[16, 16] A64])
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (reg:VNx2DF 162 [ vect__16.22 ])
            (mem:VNx2DF (reg/v/f:DI 151 [ dst2 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst2.23_55]+0 S[16, 16] A64]))) "/app/example.cpp":18:1 discrim 3 1201 {pred_storevnx2df}
     (expr_list:REG_DEAD (reg:VNx2DF 162 [ vect__16.22 ])
        (nil)))
(debug_insn 41 40 43 4 (debug_marker) "/app/example.cpp":18:1 discrim 3 -1
     (nil))
(insn 43 41 44 4 (set (reg:VNx2DF 164 [ vect__18.25 ])
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 169)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (reg:VNx2DF 139 [ vect__14.21 ])
                (reg:VNx2DF 145 [ vect__5.10 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.cpp":18:1 discrim 3 -1
     (expr_list:REG_DEAD (reg:VNx2DF 139 [ vect__14.21 ])
        (expr_list:REG_EQUAL (mult:VNx2DF (reg:VNx2DF 139 [ vect__14.21 ])
                (reg:VNx2DF 145 [ vect__5.10 ]))
            (nil))))
(insn 44 43 45 4 (set (mem:VNx2DF (reg/v/f:DI 152 [ dst3 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst3.26_50]+0 S[16, 16] A64])
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (reg:VNx2DF 164 [ vect__18.25 ])
            (mem:VNx2DF (reg/v/f:DI 152 [ dst3 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst3.26_50]+0 S[16, 16] A64]))) "/app/example.cpp":18:1 discrim 3 1201 {pred_storevnx2df}
     (expr_list:REG_DEAD (reg:VNx2DF 164 [ vect__18.25 ])
        (nil)))
(debug_insn 45 44 46 4 (debug_marker) "/app/example.cpp":18:1 discrim 3 -1
     (nil))
(insn 46 45 47 4 (set (reg:VNx2SF 136 [ vect__20.30 ])
        (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 157 [ b2 ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b2.28_46]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.cpp":18:1 discrim 3 1151 {pred_movvnx2sf}
     (nil))
(insn 47 46 49 4 (set (reg:VNx2DF 166 [ vect__21.31 ])
        (float_extend:VNx2DF (reg:VNx2SF 136 [ vect__20.30 ]))) "/app/example.cpp":18:1 discrim 3 12395 {extendvnx2sfvnx2df2}
     (expr_list:REG_DEAD (reg:VNx2SF 136 [ vect__20.30 ])
        (nil)))
(insn 49 47 50 4 (set (reg:VNx2DF 167 [ vect__23.32 ])
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 169)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (reg:VNx2DF 166 [ vect__21.31 ])
                (reg:VNx2DF 145 [ vect__5.10 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.cpp":18:1 discrim 3 -1
     (expr_list:REG_DEAD (reg:VNx2DF 166 [ vect__21.31 ])
        (expr_list:REG_DEAD (reg:VNx2DF 145 [ vect__5.10 ])
            (expr_list:REG_DEAD (reg:SI 69 N/A)
                (expr_list:REG_DEAD (reg:SI 0 zero)
                    (expr_list:REG_EQUAL (mult:VNx2DF (reg:VNx2DF 166 [ vect__21.31 ])
                            (reg:VNx2DF 145 [ vect__5.10 ]))
                        (nil)))))))
(insn 50 49 51 4 (set (mem:VNx2DF (reg/v/f:DI 153 [ dst4 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst4.33_83]+0 S[16, 16] A64])
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 149 [ _89 ])
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (reg:VNx2DF 167 [ vect__23.32 ])
            (mem:VNx2DF (reg/v/f:DI 153 [ dst4 ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst4.33_83]+0 S[16, 16] A64]))) "/app/example.cpp":18:1 discrim 3 1201 {pred_storevnx2df}
     (expr_list:REG_DEAD (reg:VNx2DF 167 [ vect__23.32 ])
        (expr_list:REG_DEAD (reg:SI 67 vtype)
            (expr_list:REG_DEAD (reg:SI 66 vl)
                (nil)))))
(debug_insn 51 50 52 4 (debug_marker) "/app/example.cpp":18:1 discrim 3 -1
     (nil))
(debug_insn 52 51 53 4 (var_location:SI i (clobber (const_int 0 [0]))) -1
     (nil))
(debug_insn 53 52 54 4 (debug_marker) "/app/example.cpp":18:1 discrim 1 -1
     (nil))
(insn 54 53 55 4 (set (reg/v/f:DI 154 [ a ])
        (plus:DI (reg/v/f:DI 154 [ a ])
            (reg:DI 147 [ ivtmp_77 ]))) "/app/example.cpp":18:1 discrim 1 5 {adddi3}
     (nil))
(insn 55 54 56 4 (set (reg/v/f:DI 155 [ b ])
        (plus:DI (reg/v/f:DI 155 [ b ])
            (reg:DI 147 [ ivtmp_77 ]))) "/app/example.cpp":18:1 discrim 1 5 {adddi3}
     (nil))
(insn 56 55 57 4 (set (reg/v/f:DI 150 [ dst ])
        (plus:DI (reg/v/f:DI 150 [ dst ])
            (reg:DI 141 [ ivtmp_66 ]))) "/app/example.cpp":18:1 discrim 1 5 {adddi3}
     (nil))
(insn 57 56 58 4 (set (reg/v/f:DI 156 [ a2 ])
        (plus:DI (reg/v/f:DI 156 [ a2 ])
            (reg:DI 147 [ ivtmp_77 ]))) "/app/example.cpp":18:1 discrim 1 5 {adddi3}
     (nil))
(insn 58 57 59 4 (set (reg/v/f:DI 151 [ dst2 ])
        (plus:DI (reg/v/f:DI 151 [ dst2 ])
            (reg:DI 141 [ ivtmp_66 ]))) "/app/example.cpp":18:1 discrim 1 5 {adddi3}
     (nil))
(insn 59 58 60 4 (set (reg/v/f:DI 152 [ dst3 ])
        (plus:DI (reg/v/f:DI 152 [ dst3 ])
            (reg:DI 141 [ ivtmp_66 ]))) "/app/example.cpp":18:1 discrim 1 5 {adddi3}
     (nil))
(insn 60 59 61 4 (set (reg/v/f:DI 157 [ b2 ])
        (plus:DI (reg/v/f:DI 157 [ b2 ])
            (reg:DI 147 [ ivtmp_77 ]))) "/app/example.cpp":18:1 discrim 1 5 {adddi3}
     (expr_list:REG_DEAD (reg:DI 147 [ ivtmp_77 ])
        (nil)))
(insn 61 60 62 4 (set (reg/v/f:DI 153 [ dst4 ])
        (plus:DI (reg/v/f:DI 153 [ dst4 ])
            (reg:DI 141 [ ivtmp_66 ]))) "/app/example.cpp":18:1 discrim 1 5 {adddi3}
     (expr_list:REG_DEAD (reg:DI 141 [ ivtmp_66 ])
        (nil)))
(insn 62 61 64 4 (set (reg:DI 148 [ ivtmp_87 ])
        (minus:DI (reg:DI 148 [ ivtmp_87 ])
            (reg:DI 149 [ _89 ]))) "/app/example.cpp":18:1 discrim 1 11 {subdi3}
     (expr_list:REG_DEAD (reg:DI 149 [ _89 ])
        (nil)))
(jump_insn 64 62 68 4 (set (pc)
        (if_then_else (ne (reg:DI 148 [ ivtmp_87 ])
                (const_int 0 [0]))
            (label_ref:DI 63)
            (pc))) 242 {*branchdi}
     (int_list:REG_BR_PROB 894784862 (nil))
 -> 63)
(code_label 68 64 69 5 1 (nil) [1 uses])
(note 69 68 0 5 [bb 5] NOTE_INSN_BASIC_BLOCK)

;; Function vwmul_TYPE1_float (_Z17vwmul_TYPE1_floatPdPfS0_i, funcdef_no=1, decl_uid=2860, cgraph_uid=2, symbol_order=1)
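Judging from the mangled name (`PdPfS0_i`, i.e. `double *, float *, float *, int`) and from the loads, extends, multiply and store in the dump that follows, this function appears to be a plain widening-multiply loop. The sketch below is a reconstruction under that assumption, not the test case from the patch; the name `widening_mul_sketch` and the parameter names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical scalar equivalent of the vectorized loop in the dump:
   each float operand is extended to double before the multiply, which
   is the shape a widening multiply (vfwmul.vv) can implement directly.  */
void
widening_mul_sketch (double *__restrict dst, const float *__restrict a,
                     const float *__restrict b, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = (double) a[i] * (double) b[i];
}
```

With a single product per iteration both `float_extend` results have exactly one use, which is why the three-insn combination later in this dump can absorb both extensions at once.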

scanning new insn with uid = 51.
rescanning insn with uid = 2.
scanning new insn with uid = 52.
rescanning insn with uid = 3.
scanning new insn with uid = 53.
rescanning insn with uid = 4.
scanning new insn with uid = 54.
rescanning insn with uid = 5.
starting the processing of deferred insns
ending the processing of deferred insns
df_analyze called
df_worklist_dataflow_doublequeue: n_basic_blocks 6 n_edges 7 count 7 (  1.2)


vwmul_TYPE1_float

Dataflow summary:
def_info->table_size = 39, use_info->table_size = 0
;;  fully invalidated by EH 	 0 [zero] 3 [gp] 4 [tp] 5 [t0] 6 [t1] 7 [t2] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 28 [t3] 29 [t4] 30 [t5] 31 [t6] 32 [ft0] 33 [ft1] 34 [ft2] 35 [ft3] 36 [ft4] 37 [ft5] 38 [ft6] 39 [ft7] 42 [fa0] 43 [fa1] 44 [fa2] 45 [fa3] 46 [fa4] 47 [fa5] 48 [fa6] 49 [fa7] 60 [ft8] 61 [ft9] 62 [ft10] 63 [ft11] 66 [vl] 67 [vtype] 68 [vxrm] 69 [N/A] 70 [N/A] 71 [N/A] 72 [N/A] 73 [N/A] 74 [N/A] 75 [N/A] 76 [N/A] 77 [N/A] 78 [N/A] 79 [N/A] 80 [N/A] 81 [N/A] 82 [N/A] 83 [N/A] 84 [N/A] 85 [N/A] 86 [N/A] 87 [N/A] 88 [N/A] 89 [N/A] 90 [N/A] 91 [N/A] 92 [N/A] 93 [N/A] 94 [N/A] 95 [N/A] 96 [v0] 97 [v1] 98 [v2] 99 [v3] 100 [v4] 101 [v5] 102 [v6] 103 [v7] 104 [v8] 105 [v9] 106 [v10] 107 [v11] 108 [v12] 109 [v13] 110 [v14] 111 [v15] 112 [v16] 113 [v17] 114 [v18] 115 [v19] 116 [v20] 117 [v21] 118 [v22] 119 [v23] 120 [v24] 121 [v25] 122 [v26] 123 [v27] 124 [v28] 125 [v29] 126 [v30] 127 [v31]
;;  hardware regs used 	 2 [sp] 64 [arg] 65 [frame]
;;  regular block artificial uses 	 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;;  eh block artificial uses 	 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;;  entry block defs 	 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 42 [fa0] 43 [fa1] 44 [fa2] 45 [fa3] 46 [fa4] 47 [fa5] 48 [fa6] 49 [fa7] 64 [arg] 65 [frame]
;;  exit block uses 	 1 [ra] 2 [sp] 8 [s0] 65 [frame]
;;  regs ever live 	 0 [zero] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 66 [vl] 67 [vtype] 69 [N/A]
;;  ref usage 	r0={3u} r1={1d,1u} r2={1d,5u} r8={1d,5u} r10={1d,1u} r11={1d,1u} r12={1d,1u} r13={1d,1u} r14={1d} r15={1d} r16={1d} r17={1d} r42={1d} r43={1d} r44={1d} r45={1d} r46={1d} r47={1d} r48={1d} r49={1d} r64={1d,4u} r65={1d,5u} r66={4u} r67={4u} r69={1u} r134={1d,6u} r135={2d,3u} r139={1d,1u} r141={1d,1u} r142={1d,2u} r143={2d,3u} r144={2d,2u} r145={2d,2u} r146={1d,2u} r148={1d,1u,1e} r149={1d,1u,1e} r150={1d,1u} r152={1d,1u} r153={1d,1u} r154={1d,1u} r155={1d,1u} r156={1d,1u} r157={1d,1u} 
;;    total ref usage 112{43d,67u,2e} in 32{32 regular + 0 call} insns.

( )->[0]->( 2 )
;; bb 0 artificial_defs: { d0(1){ }d1(2){ }d2(8){ }d3(10){ }d4(11){ }d5(12){ }d6(13){ }d7(14){ }d8(15){ }d9(16){ }d10(17){ }d11(42){ }d12(43){ }d13(44){ }d14(45){ }d15(46){ }d16(47){ }d17(48){ }d18(49){ }d19(64){ }d20(65){ }}
;; bb 0 artificial_uses: { }
;; lr  in  	 0 [zero] 66 [vl] 67 [vtype] 69 [N/A]
;; lr  use 	
;; lr  def 	 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 42 [fa0] 43 [fa1] 44 [fa2] 45 [fa3] 46 [fa4] 47 [fa5] 48 [fa6] 49 [fa7] 64 [arg] 65 [frame]
;; live  in  	
;; live  gen 	 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 42 [fa0] 43 [fa1] 44 [fa2] 45 [fa3] 46 [fa4] 47 [fa5] 48 [fa6] 49 [fa7] 64 [arg] 65 [frame]
;; live  kill	
;; lr  out 	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A]
;; live  out 	 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 64 [arg] 65 [frame]

( 0 )->[2]->( 3 5 )
;; bb 2 artificial_defs: { }
;; bb 2 artificial_uses: { u-1(2){ }u-1(8){ }u-1(64){ }u-1(65){ }}
;; lr  in  	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A]
;; lr  use 	 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 64 [arg] 65 [frame]
;; lr  def 	 143 144 145 146 154 155 156 157
;; live  in  	 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 64 [arg] 65 [frame]
;; live  gen 	 143 144 145 146
;; live  kill	
;; lr  out 	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 143 144 145 146
;; live  out 	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 143 144 145 146

( 2 )->[3]->( 4 )
;; bb 3 artificial_defs: { }
;; bb 3 artificial_uses: { u-1(2){ }u-1(8){ }u-1(64){ }u-1(65){ }}
;; lr  in  	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 143 144 145 146
;; lr  use 	 2 [sp] 8 [s0] 64 [arg] 65 [frame] 146
;; lr  def 	 135 153
;; live  in  	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 143 144 145 146
;; live  gen 	 135 153
;; live  kill	
;; lr  out 	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 135 143 144 145 153
;; live  out 	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 135 143 144 145 153

( 4 3 )->[4]->( 4 5 )
;; bb 4 artificial_defs: { }
;; bb 4 artificial_uses: { u-1(2){ }u-1(8){ }u-1(64){ }u-1(65){ }}
;; lr  in  	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 135 143 144 145 153
;; lr  use 	 0 [zero] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 135 143 144 145 153
;; lr  def 	 134 135 139 141 142 143 144 145 148 149 150 152
;; live  in  	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 135 143 144 145 153
;; live  gen 	 134 135 139 141 142 143 144 145 148 149 150 152
;; live  kill	
;; lr  out 	 0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 135 143 144 145 153
;; live  out 	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 135 143 144 145 153

( 4 2 )->[5]->( 1 )
;; bb 5 artificial_defs: { }
;; bb 5 artificial_uses: { u-1(2){ }u-1(8){ }u-1(64){ }u-1(65){ }}
;; lr  in  	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;; lr  use 	 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;; lr  def 	
;; live  in  	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;; live  gen 	
;; live  kill	
;; lr  out 	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;; live  out 	 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame]

( 5 )->[1]->( )
;; bb 1 artificial_defs: { }
;; bb 1 artificial_uses: { u-1(1){ }u-1(2){ }u-1(8){ }u-1(65){ }}
;; lr  in  	 1 [ra] 2 [sp] 8 [s0] 65 [frame]
;; lr  use 	 1 [ra] 2 [sp] 8 [s0] 65 [frame]
;; lr  def 	
;; live  in  	 1 [ra] 2 [sp] 8 [s0] 65 [frame]
;; live  gen 	
;; live  kill	
;; lr  out 	
;; live  out 	

Finding needed instructions:
  Adding insn 15 to worklist
  Adding insn 38 to worklist
  Adding insn 28 to worklist
Finished finding needed instructions:
processing block 5 lr out =  1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame]
processing block 4 lr out =  0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 135 143 144 145 153
  Adding insn 36 to worklist
  Adding insn 35 to worklist
  Adding insn 34 to worklist
  Adding insn 33 to worklist
  Adding insn 32 to worklist
  Adding insn 27 to worklist
  Adding insn 25 to worklist
  Adding insn 24 to worklist
  Adding insn 23 to worklist
  Adding insn 22 to worklist
  Adding insn 21 to worklist
  Adding insn 19 to worklist
processing block 3 lr out =  0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 135 143 144 145 153
  Adding insn 46 to worklist
  Adding insn 17 to worklist
processing block 2 lr out =  0 [zero] 1 [ra] 2 [sp] 8 [s0] 64 [arg] 65 [frame] 66 [vl] 67 [vtype] 69 [N/A] 143 144 145 146
  Adding insn 5 to worklist
  Adding insn 54 to worklist
  Adding insn 4 to worklist
  Adding insn 53 to worklist
  Adding insn 3 to worklist
  Adding insn 52 to worklist
  Adding insn 2 to worklist
  Adding insn 51 to worklist
df_worklist_dataflow_doublequeue: n_basic_blocks 6 n_edges 7 count 7 (  1.2)
insn_cost 4 for    51: r154:DI=a0:DI
      REG_DEAD a0:DI
insn_cost 4 for     2: r143:DI=r154:DI
      REG_DEAD r154:DI
insn_cost 4 for    52: r155:DI=a1:DI
      REG_DEAD a1:DI
insn_cost 4 for     3: r144:DI=r155:DI
      REG_DEAD r155:DI
insn_cost 4 for    53: r156:DI=a2:DI
      REG_DEAD a2:DI
insn_cost 4 for     4: r145:DI=r156:DI
      REG_DEAD r156:DI
insn_cost 4 for    54: r157:DI=a3:DI
      REG_DEAD a3:DI
insn_cost 4 for     5: r146:DI=r157:DI
      REG_DEAD r157:DI
insn_cost 0 for     9: debug begin stmt marker
insn_cost 0 for    10: debug i => 0
insn_cost 0 for    11: debug begin stmt marker
insn_cost 4 for    15: pc={(r146:DI<=0)?L42:pc}
      REG_BR_PROB 118111604
insn_cost 4 for    17: r135:DI=r146:DI
      REG_DEAD r146:DI
insn_cost 8 for    46: r153:DI=unspec[0x40] 70
insn_cost 12 for    19: r134:DI=unspec[r135:DI,0x8,0x5,0,0] 67
insn_cost 0 for    20: debug begin stmt marker
insn_cost 4 for    21: r142:DI=r134:DI<<0x2
insn_cost 4 for    22: r141:VNx2SF={(unspec[const_vector,r134:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r144:DI]:unspec[zero:SI] 68}
insn_cost 4 for    23: r139:VNx2SF={(unspec[const_vector,r134:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r145:DI]:unspec[zero:SI] 68}
insn_cost 4 for    24: r148:VNx2DF=float_extend(r139:VNx2SF)
      REG_DEAD r139:VNx2SF
insn_cost 4 for    25: r149:VNx2DF=float_extend(r141:VNx2SF)
      REG_DEAD r141:VNx2SF
insn_cost 4 for    27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r148:VNx2DF*r149:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r149:VNx2DF
      REG_DEAD r148:VNx2DF
      REG_DEAD N/A:SI
      REG_DEAD zero:SI
      REG_EQUAL r148:VNx2DF*r149:VNx2DF
insn_cost 4 for    28: [r143:DI]={(unspec[const_vector,r134:DI,0,vl:SI,vtype:SI] 69)?r150:VNx2DF:[r143:DI]}
      REG_DEAD r150:VNx2DF
      REG_DEAD vtype:SI
      REG_DEAD vl:SI
insn_cost 0 for    29: debug begin stmt marker
insn_cost 0 for    30: debug i => optimized away
insn_cost 0 for    31: debug begin stmt marker
insn_cost 4 for    32: r144:DI=r144:DI+r142:DI
insn_cost 4 for    33: r145:DI=r145:DI+r142:DI
      REG_DEAD r142:DI
insn_cost 4 for    34: r152:DI=r134:DI<<0x3
insn_cost 4 for    35: r143:DI=r143:DI+r152:DI
      REG_DEAD r152:DI
insn_cost 4 for    36: r135:DI=r135:DI-r134:DI
      REG_DEAD r134:DI
insn_cost 4 for    38: pc={(r135:DI!=0)?L37:pc}
      REG_BR_PROB 894784862

Trying 5 -> 15:
    5: r146:DI=r157:DI
      REG_DEAD r157:DI
   15: pc={(r146:DI<=0)?L42:pc}
      REG_BR_PROB 118111604
Failed to match this instruction:
(parallel [
        (set (pc)
            (if_then_else (le (reg:DI 157)
                    (const_int 0 [0]))
                (label_ref:DI 42)
                (pc)))
        (set (reg/v:DI 146 [ n ])
            (reg:DI 157))
    ])
Failed to match this instruction:
(parallel [
        (set (pc)
            (if_then_else (le (reg:DI 157)
                    (const_int 0 [0]))
                (label_ref:DI 42)
                (pc)))
        (set (reg/v:DI 146 [ n ])
            (reg:DI 157))
    ])

Trying 19 -> 21:
   19: r134:DI=unspec[r135:DI,0x8,0x5,0,0] 67
   21: r142:DI=r134:DI<<0x2
Failed to match this instruction:
(parallel [
        (set (reg:DI 142 [ ivtmp_41 ])
            (ashift:DI (unspec:DI [
                        (reg:DI 135 [ ivtmp_21 ])
                        (const_int 8 [0x8])
                        (const_int 5 [0x5])
                        (const_int 0 [0]) repeated x2
                    ] UNSPEC_VSETVL)
                (const_int 2 [0x2])))
        (set (reg:DI 134 [ _13 ])
            (unspec:DI [
                    (reg:DI 135 [ ivtmp_21 ])
                    (const_int 8 [0x8])
                    (const_int 5 [0x5])
                    (const_int 0 [0]) repeated x2
                ] UNSPEC_VSETVL))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:DI 142 [ ivtmp_41 ])
            (ashift:DI (unspec:DI [
                        (reg:DI 135 [ ivtmp_21 ])
                        (const_int 8 [0x8])
                        (const_int 5 [0x5])
                        (const_int 0 [0]) repeated x2
                    ] UNSPEC_VSETVL)
                (const_int 2 [0x2])))
        (set (reg:DI 134 [ _13 ])
            (unspec:DI [
                    (reg:DI 135 [ ivtmp_21 ])
                    (const_int 8 [0x8])
                    (const_int 5 [0x5])
                    (const_int 0 [0]) repeated x2
                ] UNSPEC_VSETVL))
    ])
Successfully matched this instruction:
(set (reg:DI 134 [ _13 ])
    (unspec:DI [
            (reg:DI 135 [ ivtmp_21 ])
            (const_int 8 [0x8])
            (const_int 5 [0x5])
            (const_int 0 [0]) repeated x2
        ] UNSPEC_VSETVL))
Failed to match this instruction:
(set (reg:DI 142 [ ivtmp_41 ])
    (ashift:DI (unspec:DI [
                (reg:DI 135 [ ivtmp_21 ])
                (const_int 8 [0x8])
                (const_int 5 [0x5])
                (const_int 0 [0]) repeated x2
            ] UNSPEC_VSETVL)
        (const_int 2 [0x2])))

Trying 23 -> 24:
   23: r139:VNx2SF={(unspec[const_vector,r134:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r145:DI]:unspec[zero:SI] 68}
   24: r148:VNx2DF=float_extend(r139:VNx2SF)
      REG_DEAD r139:VNx2SF
Failed to match this instruction:
(set (reg:VNx2DF 148 [ vect__8.49 ])
    (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 134 [ _13 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 145 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.46_35]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))))

Trying 22 -> 25:
   22: r141:VNx2SF={(unspec[const_vector,r134:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r144:DI]:unspec[zero:SI] 68}
   25: r149:VNx2DF=float_extend(r141:VNx2SF)
      REG_DEAD r141:VNx2SF
Failed to match this instruction:
(set (reg:VNx2DF 149 [ vect__5.45 ])
    (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 134 [ _13 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 144 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.42_40]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))))

Trying 24 -> 27:
   24: r148:VNx2DF=float_extend(r139:VNx2SF)
      REG_DEAD r139:VNx2SF
   27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r148:VNx2DF*r149:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r149:VNx2DF
      REG_DEAD r148:VNx2DF
      REG_DEAD N/A:SI
      REG_DEAD zero:SI
      REG_EQUAL r148:VNx2DF*r149:VNx2DF
Failed to match this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 153)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 139 [ vect__7.48 ]))
            (reg:VNx2DF 149 [ vect__5.45 ]))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))

Trying 25 -> 27:
   25: r149:VNx2DF=float_extend(r141:VNx2SF)
      REG_DEAD r141:VNx2SF
   27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r148:VNx2DF*r149:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r149:VNx2DF
      REG_DEAD r148:VNx2DF
      REG_DEAD N/A:SI
      REG_DEAD zero:SI
      REG_EQUAL r148:VNx2DF*r149:VNx2DF
Failed to match this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 153)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 141 [ vect__4.44 ]))
            (reg:VNx2DF 148 [ vect__8.49 ]))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))

Trying 23, 24 -> 27:
   23: r139:VNx2SF={(unspec[const_vector,r134:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r145:DI]:unspec[zero:SI] 68}
   24: r148:VNx2DF=float_extend(r139:VNx2SF)
      REG_DEAD r139:VNx2SF
   27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r148:VNx2DF*r149:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r149:VNx2DF
      REG_DEAD r148:VNx2DF
      REG_DEAD N/A:SI
      REG_DEAD zero:SI
      REG_EQUAL r148:VNx2DF*r149:VNx2DF
Failed to match this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 153)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 134 [ _13 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 145 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.46_35]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF)))
            (reg:VNx2DF 149 [ vect__5.45 ]))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))
Successfully matched this instruction:
(set (reg:VNx2SF 148 [ vect__8.49 ])
    (if_then_else:VNx2SF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 134 [ _13 ])
                (const_int 2 [0x2]) repeated x2
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (mem:VNx2SF (reg/v/f:DI 145 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.46_35]+0 S[8, 8] A32])
        (unspec:VNx2SF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))
Failed to match this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 153)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 148 [ vect__8.49 ]))
            (reg:VNx2DF 149 [ vect__5.45 ]))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))

Trying 22, 25 -> 27:
   22: r141:VNx2SF={(unspec[const_vector,r134:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r144:DI]:unspec[zero:SI] 68}
   25: r149:VNx2DF=float_extend(r141:VNx2SF)
      REG_DEAD r141:VNx2SF
   27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r148:VNx2DF*r149:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r149:VNx2DF
      REG_DEAD r148:VNx2DF
      REG_DEAD N/A:SI
      REG_DEAD zero:SI
      REG_EQUAL r148:VNx2DF*r149:VNx2DF
Failed to match this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 153)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 134 [ _13 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 144 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.42_40]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF)))
            (reg:VNx2DF 148 [ vect__8.49 ]))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))
Successfully matched this instruction:
(set (reg:VNx2SF 149 [ vect__5.45 ])
    (if_then_else:VNx2SF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 134 [ _13 ])
                (const_int 2 [0x2]) repeated x2
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (mem:VNx2SF (reg/v/f:DI 144 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.42_40]+0 S[8, 8] A32])
        (unspec:VNx2SF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))
Failed to match this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 153)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 149 [ vect__5.45 ]))
            (reg:VNx2DF 148 [ vect__8.49 ]))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))

Trying 25, 24 -> 27:
   25: r149:VNx2DF=float_extend(r141:VNx2SF)
      REG_DEAD r141:VNx2SF
   24: r148:VNx2DF=float_extend(r139:VNx2SF)
      REG_DEAD r139:VNx2SF
   27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r148:VNx2DF*r149:VNx2DF:unspec[zero:SI] 68}
      REG_DEAD r149:VNx2DF
      REG_DEAD r148:VNx2DF
      REG_DEAD N/A:SI
      REG_DEAD zero:SI
      REG_EQUAL r148:VNx2DF*r149:VNx2DF
Successfully matched this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 153)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 141 [ vect__4.44 ]))
            (float_extend:VNx2DF (reg:VNx2SF 139 [ vect__7.48 ])))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))
allowing combination of insns 24, 25 and 27
original costs 4 + 4 + 4 = 12
replacement cost 4
deferring deletion of insn with uid = 25.
deferring deletion of insn with uid = 24.
modifying insn i3    27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?float_extend(r141:VNx2SF)*float_extend(r139:VNx2SF):unspec[zero:SI] 68}
      REG_DEAD r139:VNx2SF
      REG_DEAD r141:VNx2SF
      REG_DEAD zero:SI
      REG_DEAD N/A:SI
deferring rescan insn with uid = 27.

Trying 23 -> 27:
   23: r139:VNx2SF={(unspec[const_vector,r134:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r145:DI]:unspec[zero:SI] 68}
   27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?float_extend(r141:VNx2SF)*float_extend(r139:VNx2SF):unspec[zero:SI] 68}
      REG_DEAD r139:VNx2SF
      REG_DEAD r141:VNx2SF
      REG_DEAD zero:SI
      REG_DEAD N/A:SI
Failed to match this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 153)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 141 [ vect__4.44 ]))
            (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 134 [ _13 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 145 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.46_35]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))

Trying 22 -> 27:
   22: r141:VNx2SF={(unspec[const_vector,r134:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r144:DI]:unspec[zero:SI] 68}
   27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?float_extend(r141:VNx2SF)*float_extend(r139:VNx2SF):unspec[zero:SI] 68}
      REG_DEAD r139:VNx2SF
      REG_DEAD r141:VNx2SF
      REG_DEAD zero:SI
      REG_DEAD N/A:SI
Failed to match this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 153)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 134 [ _13 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 144 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.42_40]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF)))
            (float_extend:VNx2DF (reg:VNx2SF 139 [ vect__7.48 ])))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))

Trying 22, 23 -> 27:
   22: r141:VNx2SF={(unspec[const_vector,r134:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r144:DI]:unspec[zero:SI] 68}
   23: r139:VNx2SF={(unspec[const_vector,r134:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r145:DI]:unspec[zero:SI] 68}
   27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?float_extend(r141:VNx2SF)*float_extend(r139:VNx2SF):unspec[zero:SI] 68}
      REG_DEAD r139:VNx2SF
      REG_DEAD r141:VNx2SF
      REG_DEAD zero:SI
      REG_DEAD N/A:SI
Failed to match this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 153)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 134 [ _13 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 144 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.42_40]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF)))
            (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 134 [ _13 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 145 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.46_35]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF))))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))
Successfully matched this instruction:
(set (reg:VNx2SF 139 [ vect__7.48 ])
    (if_then_else:VNx2SF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 134 [ _13 ])
                (const_int 2 [0x2]) repeated x2
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (mem:VNx2SF (reg/v/f:DI 145 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.46_35]+0 S[8, 8] A32])
        (unspec:VNx2SF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))
Failed to match this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 153)
                (const_int 2 [0x2]) repeated x2
                (const_int 1 [0x1])
                (const_int 7 [0x7])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
                (reg:SI 69 N/A)
            ] UNSPEC_VPREDICATE)
        (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                            (const_vector:VNx2BI repeat [
                                    (const_int 1 [0x1])
                                ])
                            (reg:DI 134 [ _13 ])
                            (const_int 2 [0x2]) repeated x2
                            (const_int 0 [0])
                            (reg:SI 66 vl)
                            (reg:SI 67 vtype)
                        ] UNSPEC_VPREDICATE)
                    (mem:VNx2SF (reg/v/f:DI 144 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.42_40]+0 S[8, 8] A32])
                    (unspec:VNx2SF [
                            (reg:SI 0 zero)
                        ] UNSPEC_VUNDEF)))
            (float_extend:VNx2DF (reg:VNx2SF 139 [ vect__7.48 ])))
        (unspec:VNx2DF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))

Trying 27 -> 28:
   27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?float_extend(r141:VNx2SF)*float_extend(r139:VNx2SF):unspec[zero:SI] 68}
      REG_DEAD r139:VNx2SF
      REG_DEAD r141:VNx2SF
      REG_DEAD zero:SI
      REG_DEAD N/A:SI
   28: [r143:DI]={(unspec[const_vector,r134:DI,0,vl:SI,vtype:SI] 69)?r150:VNx2DF:[r143:DI]}
      REG_DEAD r150:VNx2DF
      REG_DEAD vtype:SI
      REG_DEAD vl:SI
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 143 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.51_29]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 134 [ _13 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 153)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 141 [ vect__4.44 ]))
                (float_extend:VNx2DF (reg:VNx2SF 139 [ vect__7.48 ])))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 143 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.51_29]+0 S[16, 16] A64])))

Trying 23, 27 -> 28:
   23: r139:VNx2SF={(unspec[const_vector,r134:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r145:DI]:unspec[zero:SI] 68}
   27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?float_extend(r141:VNx2SF)*float_extend(r139:VNx2SF):unspec[zero:SI] 68}
      REG_DEAD r139:VNx2SF
      REG_DEAD r141:VNx2SF
      REG_DEAD zero:SI
      REG_DEAD N/A:SI
   28: [r143:DI]={(unspec[const_vector,r134:DI,0,vl:SI,vtype:SI] 69)?r150:VNx2DF:[r143:DI]}
      REG_DEAD r150:VNx2DF
      REG_DEAD vtype:SI
      REG_DEAD vl:SI
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 143 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.51_29]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 134 [ _13 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 153)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 141 [ vect__4.44 ]))
                (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                                (const_vector:VNx2BI repeat [
                                        (const_int 1 [0x1])
                                    ])
                                (reg:DI 134 [ _13 ])
                                (const_int 2 [0x2]) repeated x2
                                (const_int 0 [0])
                                (reg:SI 66 vl)
                                (reg:SI 67 vtype)
                            ] UNSPEC_VPREDICATE)
                        (mem:VNx2SF (reg/v/f:DI 145 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.46_35]+0 S[8, 8] A32])
                        (unspec:VNx2SF [
                                (reg:SI 0 zero)
                            ] UNSPEC_VUNDEF))))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 143 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.51_29]+0 S[16, 16] A64])))
Successfully matched this instruction:
(set (reg:VNx2SF 150 [ vect__11.50 ])
    (if_then_else:VNx2SF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 134 [ _13 ])
                (const_int 2 [0x2]) repeated x2
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (mem:VNx2SF (reg/v/f:DI 145 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.46_35]+0 S[8, 8] A32])
        (unspec:VNx2SF [
                (reg:SI 0 zero)
            ] UNSPEC_VUNDEF)))
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 143 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.51_29]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 134 [ _13 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 153)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 141 [ vect__4.44 ]))
                (float_extend:VNx2DF (reg:VNx2SF 150 [ vect__11.50 ])))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 143 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.51_29]+0 S[16, 16] A64])))

Trying 22, 27 -> 28:
   22: r141:VNx2SF={(unspec[const_vector,r134:DI,0x2,0x2,0,vl:SI,vtype:SI] 69)?[r144:DI]:unspec[zero:SI] 68}
   27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?float_extend(r141:VNx2SF)*float_extend(r139:VNx2SF):unspec[zero:SI] 68}
      REG_DEAD r139:VNx2SF
      REG_DEAD r141:VNx2SF
      REG_DEAD zero:SI
      REG_DEAD N/A:SI
   28: [r143:DI]={(unspec[const_vector,r134:DI,0,vl:SI,vtype:SI] 69)?r150:VNx2DF:[r143:DI]}
      REG_DEAD r150:VNx2DF
      REG_DEAD vtype:SI
      REG_DEAD vl:SI
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 143 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.51_29]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 134 [ _13 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 153)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                                (const_vector:VNx2BI repeat [
                                        (const_int 1 [0x1])
                                    ])
                                (reg:DI 134 [ _13 ])
                                (const_int 2 [0x2]) repeated x2
                                (const_int 0 [0])
                                (reg:SI 66 vl)
                                (reg:SI 67 vtype)
                            ] UNSPEC_VPREDICATE)
                        (mem:VNx2SF (reg/v/f:DI 144 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.42_40]+0 S[8, 8] A32])
                        (unspec:VNx2SF [
                                (reg:SI 0 zero)
                            ] UNSPEC_VUNDEF)))
                (float_extend:VNx2DF (reg:VNx2SF 139 [ vect__7.48 ])))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 143 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.51_29]+0 S[16, 16] A64])))
Successfully matched this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
    (float_extend:VNx2DF (reg:VNx2SF 139 [ vect__7.48 ])))
Failed to match this instruction:
(set (mem:VNx2DF (reg/v/f:DI 143 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.51_29]+0 S[16, 16] A64])
    (if_then_else:VNx2DF (unspec:VNx2BI [
                (const_vector:VNx2BI repeat [
                        (const_int 1 [0x1])
                    ])
                (reg:DI 134 [ _13 ])
                (const_int 0 [0])
                (reg:SI 66 vl)
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 153)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (float_extend:VNx2DF (if_then_else:VNx2SF (unspec:VNx2BI [
                                (const_vector:VNx2BI repeat [
                                        (const_int 1 [0x1])
                                    ])
                                (reg:DI 134 [ _13 ])
                                (const_int 2 [0x2]) repeated x2
                                (const_int 0 [0])
                                (reg:SI 66 vl)
                                (reg:SI 67 vtype)
                            ] UNSPEC_VPREDICATE)
                        (mem:VNx2SF (reg/v/f:DI 144 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.42_40]+0 S[8, 8] A32])
                        (unspec:VNx2SF [
                                (reg:SI 0 zero)
                            ] UNSPEC_VUNDEF)))
                (reg:VNx2DF 150 [ vect__11.50 ]))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))
        (mem:VNx2DF (reg/v/f:DI 143 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.51_29]+0 S[16, 16] A64])))

Trying 21 -> 32:
   21: r142:DI=r134:DI<<0x2
   32: r144:DI=r144:DI+r142:DI
Failed to match this instruction:
(parallel [
        (set (reg/v/f:DI 144 [ a ])
            (plus:DI (ashift:DI (reg:DI 134 [ _13 ])
                    (const_int 2 [0x2]))
                (reg/v/f:DI 144 [ a ])))
        (set (reg:DI 142 [ ivtmp_41 ])
            (ashift:DI (reg:DI 134 [ _13 ])
                (const_int 2 [0x2])))
    ])
Failed to match this instruction:
(parallel [
        (set (reg/v/f:DI 144 [ a ])
            (plus:DI (ashift:DI (reg:DI 134 [ _13 ])
                    (const_int 2 [0x2]))
                (reg/v/f:DI 144 [ a ])))
        (set (reg:DI 142 [ ivtmp_41 ])
            (ashift:DI (reg:DI 134 [ _13 ])
                (const_int 2 [0x2])))
    ])
Successfully matched this instruction:
(set (reg:DI 142 [ ivtmp_41 ])
    (ashift:DI (reg:DI 134 [ _13 ])
        (const_int 2 [0x2])))
Failed to match this instruction:
(set (reg/v/f:DI 144 [ a ])
    (plus:DI (ashift:DI (reg:DI 134 [ _13 ])
            (const_int 2 [0x2]))
        (reg/v/f:DI 144 [ a ])))

Trying 19, 21 -> 32:
   19: r134:DI=unspec[r135:DI,0x8,0x5,0,0] 67
   21: r142:DI=r134:DI<<0x2
   32: r144:DI=r144:DI+r142:DI
Can't combine i1 into i3

Trying 34 -> 35:
   34: r152:DI=r134:DI<<0x3
   35: r143:DI=r143:DI+r152:DI
      REG_DEAD r152:DI
Failed to match this instruction:
(set (reg/v/f:DI 143 [ dst ])
    (plus:DI (ashift:DI (reg:DI 134 [ _13 ])
            (const_int 3 [0x3]))
        (reg/v/f:DI 143 [ dst ])))

Trying 36 -> 38:
   36: r135:DI=r135:DI-r134:DI
      REG_DEAD r134:DI
   38: pc={(r135:DI!=0)?L37:pc}
      REG_BR_PROB 894784862
Failed to match this instruction:
(parallel [
        (set (pc)
            (if_then_else (ne (reg:DI 135 [ ivtmp_21 ])
                    (reg:DI 134 [ _13 ]))
                (label_ref:DI 37)
                (pc)))
        (set (reg:DI 135 [ ivtmp_21 ])
            (minus:DI (reg:DI 135 [ ivtmp_21 ])
                (reg:DI 134 [ _13 ])))
    ])
Failed to match this instruction:
(parallel [
        (set (pc)
            (if_then_else (ne (reg:DI 135 [ ivtmp_21 ])
                    (reg:DI 134 [ _13 ]))
                (label_ref:DI 37)
                (pc)))
        (set (reg:DI 135 [ ivtmp_21 ])
            (minus:DI (reg:DI 135 [ ivtmp_21 ])
                (reg:DI 134 [ _13 ])))
    ])
starting the processing of deferred insns
rescanning insn with uid = 27.
ending the processing of deferred insns


vwmul_TYPE1_float

Dataflow summary:
;;  fully invalidated by EH 	 0 [zero] 3 [gp] 4 [tp] 5 [t0] 6 [t1] 7 [t2] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 28 [t3] 29 [t4] 30 [t5] 31 [t6] 32 [ft0] 33 [ft1] 34 [ft2] 35 [ft3] 36 [ft4] 37 [ft5] 38 [ft6] 39 [ft7] 42 [fa0] 43 [fa1] 44 [fa2] 45 [fa3] 46 [fa4] 47 [fa5] 48 [fa6] 49 [fa7] 60 [ft8] 61 [ft9] 62 [ft10] 63 [ft11] 66 [vl] 67 [vtype] 68 [vxrm] 69 [N/A] 70 [N/A] 71 [N/A] 72 [N/A] 73 [N/A] 74 [N/A] 75 [N/A] 76 [N/A] 77 [N/A] 78 [N/A] 79 [N/A] 80 [N/A] 81 [N/A] 82 [N/A] 83 [N/A] 84 [N/A] 85 [N/A] 86 [N/A] 87 [N/A] 88 [N/A] 89 [N/A] 90 [N/A] 91 [N/A] 92 [N/A] 93 [N/A] 94 [N/A] 95 [N/A] 96 [v0] 97 [v1] 98 [v2] 99 [v3] 100 [v4] 101 [v5] 102 [v6] 103 [v7] 104 [v8] 105 [v9] 106 [v10] 107 [v11] 108 [v12] 109 [v13] 110 [v14] 111 [v15] 112 [v16] 113 [v17] 114 [v18] 115 [v19] 116 [v20] 117 [v21] 118 [v22] 119 [v23] 120 [v24] 121 [v25] 122 [v26] 123 [v27] 124 [v28] 125 [v29] 126 [v30] 127 [v31]
;;  hardware regs used 	 2 [sp] 64 [arg] 65 [frame]
;;  regular block artificial uses 	 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;;  eh block artificial uses 	 2 [sp] 8 [s0] 64 [arg] 65 [frame]
;;  entry block defs 	 1 [ra] 2 [sp] 8 [s0] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 14 [a4] 15 [a5] 16 [a6] 17 [a7] 42 [fa0] 43 [fa1] 44 [fa2] 45 [fa3] 46 [fa4] 47 [fa5] 48 [fa6] 49 [fa7] 64 [arg] 65 [frame]
;;  exit block uses 	 1 [ra] 2 [sp] 8 [s0] 65 [frame]
;;  regs ever live 	 0 [zero] 10 [a0] 11 [a1] 12 [a2] 13 [a3] 66 [vl] 67 [vtype] 69 [N/A]
;;  ref usage 	r0={3u} r1={1d,1u} r2={1d,5u} r8={1d,5u} r10={1d,1u} r11={1d,1u} r12={1d,1u} r13={1d,1u} r14={1d} r15={1d} r16={1d} r17={1d} r42={1d} r43={1d} r44={1d} r45={1d} r46={1d} r47={1d} r48={1d} r49={1d} r64={1d,4u} r65={1d,5u} r66={4u} r67={4u} r69={1u} r134={1d,6u} r135={2d,3u} r139={1d,1u} r141={1d,1u} r142={1d,2u} r143={2d,3u} r144={2d,2u} r145={2d,2u} r146={1d,2u} r150={1d,1u} r152={1d,1u} r153={1d,1u} r154={1d,1u} r155={1d,1u} r156={1d,1u} r157={1d,1u} 
;;    total ref usage 106{41d,65u,0e} in 30{30 regular + 0 call} insns.
(note 7 0 51 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 51 7 2 2 (set (reg:DI 154)
        (reg:DI 10 a0 [ dst ])) "/app/example.cpp":30:1 -1
     (expr_list:REG_DEAD (reg:DI 10 a0 [ dst ])
        (nil)))
(insn 2 51 52 2 (set (reg/v/f:DI 143 [ dst ])
        (reg:DI 154)) "/app/example.cpp":30:1 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 154)
        (nil)))
(insn 52 2 3 2 (set (reg:DI 155)
        (reg:DI 11 a1 [ a ])) "/app/example.cpp":30:1 -1
     (expr_list:REG_DEAD (reg:DI 11 a1 [ a ])
        (nil)))
(insn 3 52 53 2 (set (reg/v/f:DI 144 [ a ])
        (reg:DI 155)) "/app/example.cpp":30:1 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 155)
        (nil)))
(insn 53 3 4 2 (set (reg:DI 156)
        (reg:DI 12 a2 [ b ])) "/app/example.cpp":30:1 -1
     (expr_list:REG_DEAD (reg:DI 12 a2 [ b ])
        (nil)))
(insn 4 53 54 2 (set (reg/v/f:DI 145 [ b ])
        (reg:DI 156)) "/app/example.cpp":30:1 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 156)
        (nil)))
(insn 54 4 5 2 (set (reg:DI 157)
        (reg:DI 13 a3 [ n ])) "/app/example.cpp":30:1 -1
     (expr_list:REG_DEAD (reg:DI 13 a3 [ n ])
        (nil)))
(insn 5 54 6 2 (set (reg/v:DI 146 [ n ])
        (reg:DI 157)) "/app/example.cpp":30:1 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 157)
        (nil)))
(note 6 5 9 2 NOTE_INSN_FUNCTION_BEG)
(debug_insn 9 6 10 2 (debug_marker) "/app/example.cpp":30:1 -1
     (nil))
(debug_insn 10 9 11 2 (var_location:SI i (const_int 0 [0])) -1
     (nil))
(debug_insn 11 10 15 2 (debug_marker) "/app/example.cpp":30:1 discrim 1 -1
     (nil))
(jump_insn 15 11 16 2 (set (pc)
        (if_then_else (le (reg/v:DI 146 [ n ])
                (const_int 0 [0]))
            (label_ref:DI 42)
            (pc))) "/app/example.cpp":30:1 discrim 1 242 {*branchdi}
     (int_list:REG_BR_PROB 118111604 (nil))
 -> 42)
(note 16 15 17 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 17 16 46 3 (set (reg:DI 135 [ ivtmp_21 ])
        (reg/v:DI 146 [ n ])) 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg/v:DI 146 [ n ])
        (nil)))
(insn 46 17 37 3 (set (reg:DI 153)
        (unspec:DI [
                (const_int 64 [0x40])
            ] UNSPEC_VLMAX)) 688 {vlmax_avldi}
     (nil))
(code_label 37 46 18 4 9 (nil) [1 uses])
(note 18 37 19 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 19 18 20 4 (set (reg:DI 134 [ _13 ])
        (unspec:DI [
                (reg:DI 135 [ ivtmp_21 ])
                (const_int 8 [0x8])
                (const_int 5 [0x5])
                (const_int 0 [0]) repeated x2
            ] UNSPEC_VSETVL)) 1116 {vsetvldi_no_side_effects}
     (nil))
(debug_insn 20 19 21 4 (debug_marker) "/app/example.cpp":30:1 discrim 3 -1
     (nil))
(insn 21 20 22 4 (set (reg:DI 142 [ ivtmp_41 ])
        (ashift:DI (reg:DI 134 [ _13 ])
            (const_int 2 [0x2]))) 198 {ashldi3}
     (nil))
(insn 22 21 23 4 (set (reg:VNx2SF 141 [ vect__4.44 ])
        (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 134 [ _13 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 144 [ a ]) [1 MEM <vector([2,2]) float> [(float *)vectp_a.42_40]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.cpp":30:1 discrim 3 1151 {pred_movvnx2sf}
     (nil))
(insn 23 22 24 4 (set (reg:VNx2SF 139 [ vect__7.48 ])
        (if_then_else:VNx2SF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 134 [ _13 ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (mem:VNx2SF (reg/v/f:DI 145 [ b ]) [1 MEM <vector([2,2]) float> [(float *)vectp_b.46_35]+0 S[8, 8] A32])
            (unspec:VNx2SF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.cpp":30:1 discrim 3 1151 {pred_movvnx2sf}
     (nil))
(note 24 23 25 4 NOTE_INSN_DELETED)
(note 25 24 27 4 NOTE_INSN_DELETED)
(insn 27 25 28 4 (set (reg:VNx2DF 150 [ vect__11.50 ])
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 153)
                    (const_int 2 [0x2]) repeated x2
                    (const_int 1 [0x1])
                    (const_int 7 [0x7])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                    (reg:SI 69 N/A)
                ] UNSPEC_VPREDICATE)
            (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 141 [ vect__4.44 ]))
                (float_extend:VNx2DF (reg:VNx2SF 139 [ vect__7.48 ])))
            (unspec:VNx2DF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.cpp":30:1 discrim 3 6728 {pred_dual_widen_mulvnx2df}
     (expr_list:REG_DEAD (reg:VNx2SF 139 [ vect__7.48 ])
        (expr_list:REG_DEAD (reg:VNx2SF 141 [ vect__4.44 ])
            (expr_list:REG_DEAD (reg:SI 0 zero)
                (expr_list:REG_DEAD (reg:SI 69 N/A)
                    (nil))))))
(insn 28 27 29 4 (set (mem:VNx2DF (reg/v/f:DI 143 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.51_29]+0 S[16, 16] A64])
        (if_then_else:VNx2DF (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 134 [ _13 ])
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (reg:VNx2DF 150 [ vect__11.50 ])
            (mem:VNx2DF (reg/v/f:DI 143 [ dst ]) [2 MEM <vector([2,2]) double> [(double *)vectp_dst.51_29]+0 S[16, 16] A64]))) "/app/example.cpp":30:1 discrim 3 1201 {pred_storevnx2df}
     (expr_list:REG_DEAD (reg:VNx2DF 150 [ vect__11.50 ])
        (expr_list:REG_DEAD (reg:SI 67 vtype)
            (expr_list:REG_DEAD (reg:SI 66 vl)
                (nil)))))
(debug_insn 29 28 30 4 (debug_marker) "/app/example.cpp":30:1 discrim 3 -1
     (nil))
(debug_insn 30 29 31 4 (var_location:SI i (clobber (const_int 0 [0]))) -1
     (nil))
(debug_insn 31 30 32 4 (debug_marker) "/app/example.cpp":30:1 discrim 1 -1
     (nil))
(insn 32 31 33 4 (set (reg/v/f:DI 144 [ a ])
        (plus:DI (reg/v/f:DI 144 [ a ])
            (reg:DI 142 [ ivtmp_41 ]))) "/app/example.cpp":30:1 discrim 1 5 {adddi3}
     (nil))
(insn 33 32 34 4 (set (reg/v/f:DI 145 [ b ])
        (plus:DI (reg/v/f:DI 145 [ b ])
            (reg:DI 142 [ ivtmp_41 ]))) "/app/example.cpp":30:1 discrim 1 5 {adddi3}
     (expr_list:REG_DEAD (reg:DI 142 [ ivtmp_41 ])
        (nil)))
(insn 34 33 35 4 (set (reg:DI 152)
        (ashift:DI (reg:DI 134 [ _13 ])
            (const_int 3 [0x3]))) "/app/example.cpp":30:1 discrim 1 198 {ashldi3}
     (nil))
(insn 35 34 36 4 (set (reg/v/f:DI 143 [ dst ])
        (plus:DI (reg/v/f:DI 143 [ dst ])
            (reg:DI 152))) "/app/example.cpp":30:1 discrim 1 5 {adddi3}
     (expr_list:REG_DEAD (reg:DI 152)
        (nil)))
(insn 36 35 38 4 (set (reg:DI 135 [ ivtmp_21 ])
        (minus:DI (reg:DI 135 [ ivtmp_21 ])
            (reg:DI 134 [ _13 ]))) "/app/example.cpp":30:1 discrim 1 11 {subdi3}
     (expr_list:REG_DEAD (reg:DI 134 [ _13 ])
        (nil)))
(jump_insn 38 36 42 4 (set (pc)
        (if_then_else (ne (reg:DI 135 [ ivtmp_21 ])
                (const_int 0 [0]))
            (label_ref:DI 37)
            (pc))) 242 {*branchdi}
     (int_list:REG_BR_PROB 894784862 (nil))
 -> 37)
(code_label 42 38 43 5 7 (nil) [1 uses])
(note 43 42 0 5 [bb 5] NOTE_INSN_BASIC_BLOCK)

;; Combiner totals: 51 attempts, 49 substitutions (20 requiring new space),
;; 1 successes.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-06-28 22:00   ` 钟居哲
@ 2023-06-29 22:59     ` Jeff Law
  2023-06-29 23:02       ` 钟居哲
  2023-06-29 23:04       ` 钟居哲
  2023-06-29 23:39     ` Jeff Law
                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 22+ messages in thread
From: Jeff Law @ 2023-06-29 22:59 UTC (permalink / raw)
  To: 钟居哲, gcc-patches
  Cc: kito.cheng, kito.cheng, palmer, palmer, rdapp.gcc



On 6/28/23 16:00, 钟居哲 wrote:
> You can see here:
> 
> https://godbolt.org/z/d78646hWb
> 
> The first case can't generate vfwmul.vv, but the second case succeeds.
> 
> Failed to match this instruction:
> (set (reg:VNx2DF 150 [ vect__11.50 ])
>      (if_then_else:VNx2DF (unspec:VNx2BI [
>                  (const_vector:VNx2BI repeat [
>                          (const_int 1 [0x1])
>                      ])
>                  (reg:DI 153)
>                  (const_int 2 [0x2]) repeated x2
>                  (const_int 1 [0x1])
>                  (const_int 7 [0x7])
>                  (reg:SI 66 vl)
>                  (reg:SI 67 vtype)
>                  (reg:SI 69 N/A)
>              ] UNSPEC_VPREDICATE)
>          (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 149 [ vect__5.45 ]))
>              (reg:VNx2DF 148 [ vect__8.49 ]))
>          (unspec:VNx2DF [
>                  (reg:SI 0 zero)
>              ] UNSPEC_VUNDEF)))
Right.  We try combining:
   24 -> 27
   25 -> 27
   23, 24 -> 27
   22, 25 -> 27

All of which fail, as expected.  24 -> 27 and 25-> 27 only put an 
extension on one operand of the mult.  The other two try to substitute a 
float extend of an if-then-else which I fully expect to fail.  All as 
expected.

The next one that gets tried is:

> Trying 25, 24 -> 27:
>    25: r149:VNx2DF=float_extend(r141:VNx2SF)
>       REG_DEAD r141:VNx2SF
>    24: r148:VNx2DF=float_extend(r139:VNx2SF)
>       REG_DEAD r139:VNx2SF
>    27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r148:VNx2DF*r149:VNx2DF:unspec[zero:SI] 68}
>       REG_DEAD r149:VNx2DF
>       REG_DEAD r148:VNx2DF
>       REG_DEAD N/A:SI
>       REG_DEAD zero:SI
>       REG_EQUAL r148:VNx2DF*r149:VNx2DF
> Successfully matched this instruction:
> (set (reg:VNx2DF 150 [ vect__11.50 ])
>     (if_then_else:VNx2DF (unspec:VNx2BI [
>                 (const_vector:VNx2BI repeat [
>                         (const_int 1 [0x1])
>                     ])
>                 (reg:DI 153)
>                 (const_int 2 [0x2]) repeated x2
>                 (const_int 1 [0x1])
>                 (const_int 7 [0x7])
>                 (reg:SI 66 vl)
>                 (reg:SI 67 vtype)
>                 (reg:SI 69 N/A)
>             ] UNSPEC_VPREDICATE)
>         (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 141 [ vect__4.44 ]))
>             (float_extend:VNx2DF (reg:VNx2SF 139 [ vect__7.48 ])))
>         (unspec:VNx2DF [
>                 (reg:SI 0 zero)
>             ] UNSPEC_VUNDEF)))
> allowing combination of insns 24, 25 and 27
> original costs 4 + 4 + 4 = 12
> replacement cost 4

Note how it replaced both operands of the mult with extended versions 
and the pattern matches, as expected.

The point being that I don't think those helper patterns are needed to 
handle the problem you suggested they were there to handle.  Combine 
knows how to handle multiple substitutions just fine.
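
For reference, the log above shows a case where each float_extend result has exactly one use: both r148 and r149 carry REG_DEAD notes on insn 27. A minimal C sketch of a loop with that shape follows. The function name is made up for illustration, the assert is only an illustrative precondition, and the code is an assumption about the shape rather than a copy of what is behind the godbolt link.

```c
#include <assert.h>

/* Each converted value feeds exactly one multiply, so substituting both
   float_extends into the mult leaves no other user behind.  */
void widen_mul_single_use (double *__restrict dst, const float *__restrict a,
                           const float *__restrict b, int n)
{
  assert (n >= 0);  /* Illustrative precondition only.  */
  for (int i = 0; i < n; i++)
    dst[i] = (double) a[i] * (double) b[i];
}
```

In a loop like this the two conversions die at the multiply, which is the situation the "Trying 25, 24 -> 27" attempt handles.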

Right now I don't see a need for this patch.



Jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-06-29 22:59     ` Jeff Law
@ 2023-06-29 23:02       ` 钟居哲
  2023-06-29 23:04       ` 钟居哲
  1 sibling, 0 replies; 22+ messages in thread
From: 钟居哲 @ 2023-06-29 23:02 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: kito.cheng, kito.cheng, palmer, palmer, rdapp.gcc

[-- Attachment #1: Type: text/plain, Size: 4585 bytes --]

>> Right now I don't see a need for this patch.
No, we need this patch.

With this patch, the following case can be combined into vfwmul.vv:
#define TEST_TYPE(TYPE1, TYPE2)                                                \
  __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 (                         \
    TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3,     \
    TYPE1 *__restrict dst4, TYPE2 *__restrict a, TYPE2 *__restrict b,          \
    TYPE2 *__restrict a2, TYPE2 *__restrict b2, int n)                         \
  {                                                                            \
    for (int i = 0; i < n; i++)                                                \
      {                                                                        \
	dst[i] = (TYPE1) a[i] * (TYPE1) b[i];                                  \
	dst2[i] = (TYPE1) a2[i] * (TYPE1) b[i];                                \
	dst3[i] = (TYPE1) a2[i] * (TYPE1) a[i];                                \
	dst4[i] = (TYPE1) a[i] * (TYPE1) b2[i];                                \
      }                                                                        \
  }
TEST_TYPE (double, float)
If you try this, you will see what I mean.
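
A side note on the macro above: in `vwadd_##TYPE1_##TYPE2`, `TYPE1_` is a single identifier token rather than the parameter `TYPE1`, so it is not substituted. The assembly attached later in this thread accordingly names the function `vwadd_TYPE1_float`. The sketch below pastes the name as `vwadd_##TYPE1##_##TYPE2` instead. It drops the `noipa` attribute for portability, and `check_kernel` is a hypothetical helper added only to exercise the kernel; neither change is part of the patch.

```c
#include <assert.h>

/* Same kernel as the thread's test case; only the name pasting differs.  */
#define TEST_TYPE(TYPE1, TYPE2)                                              \
  void vwadd_##TYPE1##_##TYPE2 (                                             \
    TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3,   \
    TYPE1 *__restrict dst4, TYPE2 *__restrict a, TYPE2 *__restrict b,        \
    TYPE2 *__restrict a2, TYPE2 *__restrict b2, int n)                       \
  {                                                                          \
    for (int i = 0; i < n; i++)                                              \
      {                                                                      \
        dst[i] = (TYPE1) a[i] * (TYPE1) b[i];                                \
        dst2[i] = (TYPE1) a2[i] * (TYPE1) b[i];                              \
        dst3[i] = (TYPE1) a2[i] * (TYPE1) a[i];                              \
        dst4[i] = (TYPE1) a[i] * (TYPE1) b2[i];                              \
      }                                                                      \
  }

TEST_TYPE (double, float)   /* Defines vwadd_double_float.  */

/* Hypothetical helper: runs the kernel on two elements and checks all
   four outputs.  The inputs are chosen so every product is exact.  */
void check_kernel (void)
{
  float a[2] = {1.5f, 2.0f}, b[2] = {2.0f, 4.0f};
  float a2[2] = {3.0f, 0.5f}, b2[2] = {4.0f, 8.0f};
  double d1[2], d2[2], d3[2], d4[2];

  vwadd_double_float (d1, d2, d3, d4, a, b, a2, b2, 2);
  assert (d1[0] == 3.0 && d1[1] == 8.0);   /* a  * b   */
  assert (d2[0] == 6.0 && d2[1] == 2.0);   /* a2 * b   */
  assert (d3[0] == 4.5 && d3[1] == 1.0);   /* a2 * a   */
  assert (d4[0] == 6.0 && d4[1] == 16.0);  /* a  * b2  */
}
```

Note that `a[i]` feeds three of the four products, so its widened value has several users in the loop body.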


juzhe.zhong@rivai.ai
 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-06-29 22:59     ` Jeff Law
  2023-06-29 23:02       ` 钟居哲
@ 2023-06-29 23:04       ` 钟居哲
  1 sibling, 0 replies; 22+ messages in thread
From: 钟居哲 @ 2023-06-29 23:04 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: kito.cheng, kito.cheng, palmer, palmer, rdapp.gcc

[-- Attachment #1: Type: text/plain, Size: 3440 bytes --]

Or do you have a better solution that lets this case combine into vfwmul?
I am OK with any solution.



juzhe.zhong@rivai.ai
 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-06-28 22:00   ` 钟居哲
  2023-06-29 22:59     ` Jeff Law
@ 2023-06-29 23:39     ` Jeff Law
  2023-06-30 10:14       ` Robin Dapp
  2023-06-29 23:41     ` Jeff Law
       [not found]     ` <99D6E636A491D16D+F0E92F80-33DF-4109-912E-F9CAAD6F07B5@rivai.ai>
  3 siblings, 1 reply; 22+ messages in thread
From: Jeff Law @ 2023-06-29 23:39 UTC (permalink / raw)
  To: 钟居哲, gcc-patches
  Cc: kito.cheng, kito.cheng, palmer, palmer, rdapp.gcc



On 6/28/23 16:00, 钟居哲 wrote:
> You can see here:
> 
> https://godbolt.org/z/d78646hWb
So just to be explicit, I see no difference with that test before/after 
your proposed change.  Nor would I expect one based on my understanding 
of the patch.

The explicit conversions I see are because we need the output of the 
conversion in multiple vfmul instructions.  That won't be helped by the 
patch you've proposed.

To be more concrete:

>        vsetvli t1,t5,e32,mf2,ta,ma     # 99    [c=0 l=4]  vsetvldi
>         vle32.v v2,0(a4)        # 23    [c=4 l=4]  pred_movvnx2sf/1
>         vle32.v v1,0(a5)        # 25    [c=4 l=4]  pred_movvnx2sf/1
>         vsetvli t0,zero,e32,mf2,ta,ma   # 101   [c=0 l=4]  vsetvldi
>         vfwcvt.f.f.v    v3,v2   # 77    [c=4 l=4]  pred_extendvnx2df/0
>         vfwcvt.f.f.v    v2,v1   # 79    [c=4 l=4]  pred_extendvnx2df/0
>         vsetvli zero,t1,e32,mf2,ta,ma   # 102   [c=0 l=4]  vsetvl_discard_resultdi
>         vle32.v v5,0(a6)        # 31    [c=4 l=4]  pred_movvnx2sf/1
>         vle32.v v4,0(a7)        # 39    [c=4 l=4]  pred_movvnx2sf/1
>         vsetvli t0,zero,e32,mf2,ta,ma   # 103   [c=0 l=4]  vsetvldi
>         vfwcvt.f.f.v    v1,v5   # 81    [c=4 l=4]  pred_extendvnx2df/0
>         vsetvli zero,zero,e64,m1,ta,ma  # 104   [c=16 l=4]  vsetvl_vtype_change_only
>         vfmul.vv        v5,v2,v3        # 29    [c=4 l=4]  pred_mulvnx2df/2
>         vfmul.vv        v2,v1,v2        # 34    [c=4 l=4]  pred_mulvnx2df/2
>         vsetvli zero,t1,e64,m1,ta,ma    # 105   [c=0 l=4]  vsetvl_discard_resultdi
>         vse64.v v2,0(a1)        # 35    [c=4 l=4]  pred_storevnx2df
>         vse64.v v5,0(a0)        # 30    [c=4 l=4]  pred_storevnx2df
>         vsetvli t6,zero,e64,m1,ta,ma    # 106   [c=0 l=4]  vsetvldi
>         vfmul.vv        v1,v1,v3        # 37    [c=4 l=4]  pred_mulvnx2df/2
>         vsetvli zero,zero,e32,mf2,ta,ma # 107   [c=20 l=4]  vsetvl_vtype_change_only
>         vfwcvt.f.f.v    v2,v4   # 83    [c=4 l=4]  pred_extendvnx2df/0
>         vsetvli zero,t1,e64,m1,ta,ma    # 108   [c=0 l=4]  vsetvl_discard_resultdi
>         vse64.v v1,0(a2)        # 38    [c=4 l=4]  pred_storevnx2df
>         vsetvli t6,zero,e64,m1,ta,ma    # 109   [c=0 l=4]  vsetvldi
>         slli    t4,t1,2 # 22    [c=4 l=4]  ashldi3
>         slli    t3,t1,3 # 27    [c=4 l=4]  ashldi3
>         vfmul.vv        v1,v2,v3        # 42    [c=4 l=4]  pred_mulvnx2df/2


Note how the output of the explicit conversion done in insn 77 is used
by the vfmul in insns 29, 37 and 42.  Similarly for the other explicit
conversions.

Your pattern isn't going to help that problem.

You could model this as a dependency height reduction.  I think that
will get you where you want to go.

You'll need a pattern that matches this:

> (parallel [     
>         (set (reg:VNx2DF 160 [ vect__11.15 ])
>             (if_then_else:VNx2DF (unspec:VNx2BI [
>                         (const_vector:VNx2BI repeat [
>                                 (const_int 1 [0x1])
>                             ])      
>                         (reg:DI 169)
>                         (const_int 2 [0x2]) repeated x2
>                         (const_int 1 [0x1]) 
>                         (const_int 7 [0x7])
>                         (reg:SI 66 vl)
>                         (reg:SI 67 vtype)
>                         (reg:SI 69 frm)
>                     ] UNSPEC_VPREDICATE)
>                 (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ]))
>                     (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
>                 (unspec:VNx2DF [
>                         (reg:SI 0 zero)
>                     ] UNSPEC_VUNDEF)))
>         (set (reg:VNx2DF 143 [ vect__8.14 ])
>             (float_extend:VNx2DF (reg:VNx2SF 144 [ vect__7.13 ])))
>         (set (reg:VNx2DF 145 [ vect__5.10 ])
>             (float_extend:VNx2DF (reg:VNx2SF 146 [ vect__4.9 ])))
>     ])

It'll need to be a define_insn_and_split as it's a 3->3 splitter.  The
split will emit the two extensions and the widening multiply as 3
distinct insns.

This has two positive effects.  First the widening multiply is no longer
data dependent on the float_extend and so it can issue whenever r144
and r146 are ready rather than when r143 and r145 are ready.

The second effect is I think this pattern will end up matching all the 
multiplies in this sample code.  As a result all the float_extend insns 
you generated when splitting become dead and should be removed by DCE.
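
To make the intent of the 3->3 split concrete, here is a scalar C model of one loop iteration. It is a sketch of the data flow only, not GCC machine-description code: `ext` and `widen_mul` are hypothetical stand-ins for vfwcvt.f.f.v and vfwmul.vv, and the function names are invented for illustration.

```c
#include <assert.h>

/* Hypothetical scalar stand-in for vfwcvt.f.f.v (float -> double).  */
static double ext (float x) { return (double) x; }

/* Hypothetical scalar stand-in for vfwmul.vv: widen both inputs, multiply.  */
static double widen_mul (float x, float y) { return (double) x * (double) y; }

/* Before: every product waits for the extended values.
   in[] holds a, b, a2, b2 for one loop iteration.  */
void products_via_extend (const float in[4], double out[4])
{
  double a = ext (in[0]), b = ext (in[1]);
  double a2 = ext (in[2]), b2 = ext (in[3]);
  out[0] = a * b;    /* dst  */
  out[1] = a2 * b;   /* dst2 */
  out[2] = a2 * a;   /* dst3 */
  out[3] = a * b2;   /* dst4 */
}

/* After: each product reads the narrow inputs directly, so no product
   depends on an ext() result and the ext() values have no users left.  */
void products_via_widen_mul (const float in[4], double out[4])
{
  out[0] = widen_mul (in[0], in[1]);
  out[1] = widen_mul (in[2], in[1]);
  out[2] = widen_mul (in[2], in[0]);
  out[3] = widen_mul (in[0], in[3]);
}

/* Both forms must produce identical results.  */
void check_models_agree (const float in[4])
{
  double x[4], y[4];
  products_via_extend (in, x);
  products_via_widen_mul (in, y);
  for (int i = 0; i < 4; i++)
    assert (x[i] == y[i]);
}
```

The two forms agree because float-to-double conversion is exact. In the second form nothing consumes a separately extended value, which corresponds to the float_extend insns becoming dead.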


Jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-06-28 22:00   ` 钟居哲
  2023-06-29 22:59     ` Jeff Law
  2023-06-29 23:39     ` Jeff Law
@ 2023-06-29 23:41     ` Jeff Law
       [not found]     ` <99D6E636A491D16D+F0E92F80-33DF-4109-912E-F9CAAD6F07B5@rivai.ai>
  3 siblings, 0 replies; 22+ messages in thread
From: Jeff Law @ 2023-06-29 23:41 UTC (permalink / raw)
  To: 钟居哲, gcc-patches
  Cc: kito.cheng, kito.cheng, palmer, palmer, rdapp.gcc



On 6/28/23 16:00, 钟居哲 wrote:
> You can see here:
> 
> https://godbolt.org/z/d78646hWb
Your patch doesn't help that code, and AFAICT it is the result of a
fundamental misunderstanding of combine's capabilities.

Jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
       [not found]     ` <99D6E636A491D16D+F0E92F80-33DF-4109-912E-F9CAAD6F07B5@rivai.ai>
@ 2023-06-29 23:48       ` Jeff Law
  2023-06-30  0:44         ` juzhe.zhong
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff Law @ 2023-06-29 23:48 UTC (permalink / raw)
  To: juzhe.zhong
  Cc: gcc-patches, kito.cheng, kito.cheng, palmer, palmer, rdapp.gcc



On 6/29/23 17:46, juzhe.zhong wrote:
> You should try the example and check the codegen before and after the patch.
> You will understand it.
I've already done that.  It makes _no_ difference on the godbolt example.

Jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-06-29 23:48       ` Jeff Law
@ 2023-06-30  0:44         ` juzhe.zhong
  0 siblings, 0 replies; 22+ messages in thread
From: juzhe.zhong @ 2023-06-30  0:44 UTC (permalink / raw)
  To: jeffreyalaw
  Cc: gcc-patches, kito.cheng, Kito.cheng, palmer, palmer, Robin Dapp


[-- Attachment #1.1: Type: text/plain, Size: 2645 bytes --]

Hi, Jeff.

That's odd. Maybe you should clean up your environment first?
Or perhaps the toolchain wasn't built correctly with this patch?

Compile options: --param=riscv-autovec-preference=scalable -O3 -ffast-math
Before this patch:
https://godbolt.org/z/Y5d44WMqs 

fail.s:

lw t5,0(sp)
ble t5,zero,.L5
.L3:
vsetvli t1,t5,e32,mf2,ta,ma
vle32.v v2,0(a4)
vle32.v v1,0(a5)
vsetvli t0,zero,e32,mf2,ta,ma
vfwcvt.f.f.v v3,v2
vfwcvt.f.f.v v2,v1
vsetvli zero,t1,e32,mf2,ta,ma
vle32.v v5,0(a6)
vle32.v v4,0(a7)
vsetvli t0,zero,e32,mf2,ta,ma
vfwcvt.f.f.v v1,v5
vsetvli zero,zero,e64,m1,ta,ma
vfmul.vv v5,v2,v3
vfmul.vv v2,v1,v2
vsetvli zero,t1,e64,m1,ta,ma
vse64.v v2,0(a1)
vse64.v v5,0(a0)
vsetvli t6,zero,e64,m1,ta,ma
vfmul.vv v1,v1,v3
vsetvli zero,zero,e32,mf2,ta,ma
vfwcvt.f.f.v v2,v4
vsetvli zero,t1,e64,m1,ta,ma
vse64.v v1,0(a2)
vsetvli t6,zero,e64,m1,ta,ma
slli t4,t1,2
slli t3,t1,3
vfmul.vv v1,v2,v3
sub t5,t5,t1
vsetvli zero,t1,e64,m1,ta,ma
vse64.v v1,0(a3)
add a4,a4,t4
add a5,a5,t4
add a0,a0,t3
add a6,a6,t4
add a1,a1,t3
add a2,a2,t3
add a7,a7,t4
add a3,a3,t3
bne t5,zero,.L3
.L5:
ret

After this patch:
pass.s:

lw t5,0(sp)
ble t5,zero,.L5
.L3:
vsetvli t1,t5,e32,mf2,ta,ma
vle32.v v1,0(a4)
vle32.v v3,0(a5)
vle32.v v2,0(a6)
vle32.v v4,0(a7)
vsetvli t6,zero,e32,mf2,ta,ma
vfwmul.vv v5,v3,v2
vfwmul.vv v6,v1,v3
vsetvli zero,t1,e64,m1,ta,ma
vse64.v v6,0(a0)
vse64.v v5,0(a1)
vsetvli t6,zero,e32,mf2,ta,ma
slli t4,t1,2
slli t3,t1,3
vfwmul.vv v3,v2,v1
sub t5,t5,t1
vfwmul.vv v2,v1,v4
vsetvli zero,t1,e64,m1,ta,ma
vse64.v v3,0(a2)
vse64.v v2,0(a3)
add a4,a4,t4
add a5,a5,t4
add a0,a0,t3
add a6,a6,t4
add a1,a1,t3
add a2,a2,t3
add a7,a7,t4
add a3,a3,t3
bne t5,zero,.L3
.L5:
ret

Clearly, the codegen with this patch is what we want: every vfmul.vv becomes a vfwmul.vv and the separate vfwcvt.f.f.v conversions disappear.

I have attached both .s files to this mail.

I am not claiming that this patch is the only solution.

You are welcome to propose another solution, as long as it produces the same codegen that this patch achieves.

Please make sure you are using a toolchain that was built with the patch.

Thanks.


juzhe.zhong@rivai.ai
 

[-- Attachment #2: fail.s --]
[-- Type: application/octet-stream, Size: 1318 bytes --]

	.file	"auto.c"
	.option nopic
	.attribute arch, "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_zifencei2p0_zfh1p0_zfhmin1p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0"
	.attribute unaligned_access, 0
	.attribute stack_align, 16
	.text
	.align	1
	.globl	vwadd_TYPE1_float
	.type	vwadd_TYPE1_float, @function
vwadd_TYPE1_float:
	lw	t5,0(sp)
	ble	t5,zero,.L5
.L3:
	vsetvli	t1,t5,e32,mf2,ta,ma
	vle32.v	v2,0(a4)
	vle32.v	v1,0(a5)
	vsetvli	t0,zero,e32,mf2,ta,ma
	vfwcvt.f.f.v	v3,v2
	vfwcvt.f.f.v	v2,v1
	vsetvli	zero,t1,e32,mf2,ta,ma
	vle32.v	v5,0(a6)
	vle32.v	v4,0(a7)
	vsetvli	t0,zero,e32,mf2,ta,ma
	vfwcvt.f.f.v	v1,v5
	vsetvli	zero,zero,e64,m1,ta,ma
	vfmul.vv	v5,v2,v3
	vfmul.vv	v2,v1,v2
	vsetvli	zero,t1,e64,m1,ta,ma
	vse64.v	v2,0(a1)
	vse64.v	v5,0(a0)
	vsetvli	t6,zero,e64,m1,ta,ma
	vfmul.vv	v1,v1,v3
	vsetvli	zero,zero,e32,mf2,ta,ma
	vfwcvt.f.f.v	v2,v4
	vsetvli	zero,t1,e64,m1,ta,ma
	vse64.v	v1,0(a2)
	vsetvli	t6,zero,e64,m1,ta,ma
	slli	t4,t1,2
	slli	t3,t1,3
	vfmul.vv	v1,v2,v3
	sub	t5,t5,t1
	vsetvli	zero,t1,e64,m1,ta,ma
	vse64.v	v1,0(a3)
	add	a4,a4,t4
	add	a5,a5,t4
	add	a0,a0,t3
	add	a6,a6,t4
	add	a1,a1,t3
	add	a2,a2,t3
	add	a7,a7,t4
	add	a3,a3,t3
	bne	t5,zero,.L3
.L5:
	ret
	.size	vwadd_TYPE1_float, .-vwadd_TYPE1_float
	.ident	"GCC: (GNU) 14.0.0 20230629 (experimental)"

[-- Attachment #3: pass.s --]
[-- Type: application/octet-stream, Size: 1056 bytes --]

	.file	"auto.c"
	.option nopic
	.attribute arch, "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_zifencei2p0_zfh1p0_zfhmin1p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0"
	.attribute unaligned_access, 0
	.attribute stack_align, 16
	.text
	.align	1
	.globl	vwadd_TYPE1_float
	.type	vwadd_TYPE1_float, @function
vwadd_TYPE1_float:
	lw	t5,0(sp)
	ble	t5,zero,.L5
.L3:
	vsetvli	t1,t5,e32,mf2,ta,ma
	vle32.v	v1,0(a4)
	vle32.v	v3,0(a5)
	vle32.v	v2,0(a6)
	vle32.v	v4,0(a7)
	vsetvli	t6,zero,e32,mf2,ta,ma
	vfwmul.vv	v5,v3,v2
	vfwmul.vv	v6,v1,v3
	vsetvli	zero,t1,e64,m1,ta,ma
	vse64.v	v6,0(a0)
	vse64.v	v5,0(a1)
	vsetvli	t6,zero,e32,mf2,ta,ma
	slli	t4,t1,2
	slli	t3,t1,3
	vfwmul.vv	v3,v2,v1
	sub	t5,t5,t1
	vfwmul.vv	v2,v1,v4
	vsetvli	zero,t1,e64,m1,ta,ma
	vse64.v	v3,0(a2)
	vse64.v	v2,0(a3)
	add	a4,a4,t4
	add	a5,a5,t4
	add	a0,a0,t3
	add	a6,a6,t4
	add	a1,a1,t3
	add	a2,a2,t3
	add	a7,a7,t4
	add	a3,a3,t3
	bne	t5,zero,.L3
.L5:
	ret
	.size	vwadd_TYPE1_float, .-vwadd_TYPE1_float
	.ident	"GCC: (GNU) 14.0.0 20230629 (experimental)"

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-06-29 23:39     ` Jeff Law
@ 2023-06-30 10:14       ` Robin Dapp
  2023-06-30 22:35         ` Jeff Law
  0 siblings, 1 reply; 22+ messages in thread
From: Robin Dapp @ 2023-06-30 10:14 UTC (permalink / raw)
  To: Jeff Law, 钟居哲, gcc-patches
  Cc: rdapp.gcc, kito.cheng, kito.cheng, palmer, palmer

> The explicit conversions I see are because we need the output of the
> conversion in multiple vfmul instructions.  That won't be helped by
> the patch you've proposed.

FWIW, on my local branch with the patch applied, I see that the vfwmuls
are being generated (all of the vfmuls are replaced).

> It'll need to be a define_insn_and_split as it's a 3->3 splitter.  The
> split will emit the two extensions and the widening multiply as 3
> distinct insns.

I tried this and while it worked for the first vfwmul the subsequent
ones are not being combined/optimized.  Now I'm not a combine expert
at all but it looks as if the source float_extends are being deleted

 deferring deletion of insn with uid = 39.
 deferring deletion of insn with uid = 37.

with that pattern successfully matched, while they are only "rescanned"
with the synthetic "single widen" one.  Them being deleted (or rather
absorbed by the vfwmul) no further combination is possible (until after
split?)

This seems to be a fundamental difference between the two approaches.
Maybe the "double widen" pattern can be adjusted to also handle this
or I did something wrong when writing the splitter?

With the "single widen" pattern, however, it works more or less
naturally therefore I'd still suggest going for it.

Regards
 Robin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-06-30 10:14       ` Robin Dapp
@ 2023-06-30 22:35         ` Jeff Law
  2023-07-01 11:45           ` Robin Dapp
       [not found]           ` <8D5801744511A6AD+6077E043-F267-4BC0-90B8-B2FCDCA10089@rivai.ai>
  0 siblings, 2 replies; 22+ messages in thread
From: Jeff Law @ 2023-06-30 22:35 UTC (permalink / raw)
  To: Robin Dapp, 钟居哲, gcc-patches
  Cc: kito.cheng, kito.cheng, palmer, palmer



On 6/30/23 04:14, Robin Dapp wrote:
>> The explicit conversions I see are because we need the output of the
>> conversion in multiple vfmul instructions.  That won't be helped by
>> the patch you've proposed.
> 
> FWIW, on my local branch with the patch applied, I see that the vfwmuls
> are being generated (all of the vfmuls are replaced).
> 
>> It'll need to be a define_insn_and_split as it's a 3->3 splitter.  The
>> split will emit the two extensions and the widening multiply as 3
>> distinct insns.
> 
> I tried this and while it worked for the first vfwmul the subsequent
> ones are not being combined/optimized.  Now I'm not a combine expert
> at all but it looks as if the source float_extends are being deleted
> 
>   deferring deletion of insn with uid = 39.
>   deferring deletion of insn with uid = 37.
> 
> with that pattern successfully matched, while they are only "rescanned"
> with the synthetic "single widen" one.  Them being deleted (or rather
> absorbed by the vfwmul) no further combination is possible (until after
> split?)
> 
> This seems to be a fundamental difference between the two approaches.
> Maybe the "double widen" pattern can be adjusted to also handle this
> or I did something wrong when writing the splitter?
> 
> With the "single widen" pattern, however, it works more or less
> naturally therefore I'd still suggest going for it.
I'd hoped to have time to revisit all of this today, but I'm quickly 
running out of time.

There has to be some kind of mismatch between the patch or testcase or 
what we're looking at to judge success.

Monday and Tuesday are holidays in the US.  Naturally that means the 
rest of my work week is going to be busier than normal.  I don't want to 
hold things up unnecessarily.

While I really don't see the need to have the bridge pattern, I'm still 
willing to believe that I've missed something, which is why I wanted to 
dive into it myself.  For example, we have heuristics to avoid trying 
too many 4->n combine patterns and we might be tripping over that or who 
knows what.

So my suggestion is that if both of you are getting the desired code, 
then Robin handle the review side of the two patches that introduce the 
helper patterns.

Jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-06-30 22:35         ` Jeff Law
@ 2023-07-01 11:45           ` Robin Dapp
       [not found]           ` <8D5801744511A6AD+6077E043-F267-4BC0-90B8-B2FCDCA10089@rivai.ai>
  1 sibling, 0 replies; 22+ messages in thread
From: Robin Dapp @ 2023-07-01 11:45 UTC (permalink / raw)
  To: Jeff Law, 钟居哲, gcc-patches
  Cc: rdapp.gcc, kito.cheng, kito.cheng, palmer, palmer

> There has to be some kind of mismatch between the patch or testcase
> or what we're looking at to judge success.

Yeah I think the initially posted example was misleading because it
contained an already working example.

> While I really don't see the need to have the bridge pattern, I'm
> still willing to believe that I've missed something, which is why I
> wanted to dive into it myself.  For example, we have heuristics to
> avoid trying too many 4->n combine patterns and we might be tripping
> over that or who knows what.
> 
> So my suggestion is that if both of you are getting the desired code,
> then Robin handle the review side of the two patches that introduce
> the helper patterns.

I went over both patches again and, given the context, they seem
reasonable to me.  I'd propose going with both of them for now.  In
the meantime, I'm going to brush up on my combine knowledge some
time in the next weeks and get back to this then, hopefully with a
better explanation than my last one.

Regards
 Robin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
       [not found]           ` <8D5801744511A6AD+6077E043-F267-4BC0-90B8-B2FCDCA10089@rivai.ai>
@ 2023-07-03  7:49             ` Robin Dapp
  2023-07-03  8:42               ` juzhe.zhong
  0 siblings, 1 reply; 22+ messages in thread
From: Robin Dapp @ 2023-07-03  7:49 UTC (permalink / raw)
  To: juzhe.zhong
  Cc: rdapp.gcc, Jeff Law, gcc-patches, kito.cheng, kito.cheng, palmer, palmer

> Thanks. Ok for trunk?

OK from my side.  As agreed with Jeff, I'm going to get back to this
and revisit/change if needed in the future.

Regards
 Robin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-07-03  7:49             ` Robin Dapp
@ 2023-07-03  8:42               ` juzhe.zhong
  2023-07-03  8:44                 ` Robin Dapp
  2023-07-07 21:11                 ` Jeff Law
  0 siblings, 2 replies; 22+ messages in thread
From: juzhe.zhong @ 2023-07-03  8:42 UTC (permalink / raw)
  To: Robin Dapp
  Cc: Robin Dapp, jeffreyalaw, gcc-patches, kito.cheng, Kito.cheng,
	palmer, palmer

[-- Attachment #1: Type: text/plain, Size: 550 bytes --]

We failed to merge it since it's been rejected.
https://patchwork.sourceware.org/project/gcc/patch/20230628041512.188243-1-juzhe.zhong@rivai.ai/ 




juzhe.zhong@rivai.ai
 
 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-07-03  8:42               ` juzhe.zhong
@ 2023-07-03  8:44                 ` Robin Dapp
  2023-07-03  8:45                   ` juzhe.zhong
  2023-07-07 21:11                 ` Jeff Law
  1 sibling, 1 reply; 22+ messages in thread
From: Robin Dapp @ 2023-07-03  8:44 UTC (permalink / raw)
  To: juzhe.zhong
  Cc: rdapp.gcc, jeffreyalaw, gcc-patches, kito.cheng, Kito.cheng,
	palmer, palmer

> We failed to merge it since it's been rejected.
> https://patchwork.sourceware.org/project/gcc/patch/20230628041512.188243-1-juzhe.zhong@rivai.ai/

Err, who rejected?  Or is this about the patch itself
that doesn't apply cleanly anymore?

Regards
 Robin


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-07-03  8:44                 ` Robin Dapp
@ 2023-07-03  8:45                   ` juzhe.zhong
  2023-07-03  8:49                     ` Robin Dapp
  0 siblings, 1 reply; 22+ messages in thread
From: juzhe.zhong @ 2023-07-03  8:45 UTC (permalink / raw)
  To: Robin Dapp
  Cc: Robin Dapp, jeffreyalaw, gcc-patches, kito.cheng, Kito.cheng,
	palmer, palmer

[-- Attachment #1: Type: text/plain, Size: 685 bytes --]

We can apply it, but I'm not sure why Patchwork shows it as rejected.



juzhe.zhong@rivai.ai
 
 
 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-07-03  8:45                   ` juzhe.zhong
@ 2023-07-03  8:49                     ` Robin Dapp
  2023-07-03  8:51                       ` juzhe.zhong
  0 siblings, 1 reply; 22+ messages in thread
From: Robin Dapp @ 2023-07-03  8:49 UTC (permalink / raw)
  To: juzhe.zhong
  Cc: rdapp.gcc, jeffreyalaw, gcc-patches, kito.cheng, Kito.cheng,
	palmer, palmer

On 7/3/23 10:45, juzhe.zhong@rivai.ai wrote:
> We can apply it but not sure why the patchwork shows it's rejected.

I believe it also failed for me locally because the order of
patterns in autovec-opt.md was somehow different.  The one attached
worked for me though after some minor merge adjustments on my branch.

Regards
 Robin

From 29b12a473a31b2caa64fa2d1d97920a460ced0a2 Mon Sep 17 00:00:00 2001
From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Date: Wed, 28 Jun 2023 12:15:12 +0800
Subject: [PATCH] RISC-V: Support vfwmul.vv combine lowering

Consider the following complicated case:
#define TEST_TYPE(TYPE1, TYPE2)                                                \
  __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 (                         \
    TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3,     \
    TYPE1 *__restrict dst4, TYPE2 *__restrict a, TYPE2 *__restrict b,          \
    TYPE2 *__restrict a2, TYPE2 *__restrict b2, int n)                         \
  {                                                                            \
    for (int i = 0; i < n; i++)                                                \
      {                                                                        \
	dst[i] = (TYPE1) a[i] * (TYPE1) b[i];                                  \
	dst2[i] = (TYPE1) a2[i] * (TYPE1) b[i];                                \
	dst3[i] = (TYPE1) a2[i] * (TYPE1) a[i];                                \
	dst4[i] = (TYPE1) a[i] * (TYPE1) b2[i];                                \
      }                                                                        \
  }

TEST_TYPE (double, float)

In such a complicated situation, the combine pass cannot combine the extensions of both
operands in one go.  Instead, it first combines one of the extensions and then combines
the other. The combine flow is as follows:

Original IR:
(set (reg 0) (float_extend: (reg 1)))
(set (reg 3) (float_extend: (reg 2)))
(set (reg 4) (mult: (reg 0) (reg 3)))

First step of combine:
(set (reg 3) (float_extend: (reg 2)))
(set (reg 4) (mult: (float_extend: (reg 1)) (reg 3)))

Second step of combine:
(set (reg 4) (mult: (float_extend: (reg 1)) (float_extend: (reg 2))))

So, to let combine take the intermediate step, we add a "pseudo vfwmul.wv" RTL pattern in
autovec-opt.md, which is (set (reg 0) (mult (float_extend (reg 1)) (reg 2))).
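
As a rough scalar illustration of the two shapes involved (not part of the
patch; the function names widen_mul_vv, widen_mul_wv and shapes_agree are
hypothetical), the sketch below contrasts the "both operands narrow" form
that vfwmul.vv implements with the "one operand already widened" form that
the intermediate pattern stands for.  It assumes plain float/double inputs
without NaNs and models only the per-element arithmetic, not masking, VL or
rounding-mode operands.

```c
#include <assert.h>
#include <stddef.h>

/* Both operands are narrow and are extended as part of the multiply.
   This is the per-element shape of a widening vv multiply.  */
static void
widen_mul_vv (double *dst, const float *a, const float *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (double) a[i] * (double) b[i];
}

/* One operand is already wide; only the other is extended.
   This is the intermediate "pseudo wv" shape seen after the first
   combine step.  */
static void
widen_mul_wv (double *dst, const double *wide, const float *narrow, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = wide[i] * (double) narrow[i];
}

/* Extend one operand by hand, then check that both shapes give the same
   result.  Returns 1 on agreement, 0 otherwise (or if n is too large for
   the fixed scratch buffers used in this sketch).  */
static int
shapes_agree (const float *a, const float *b, size_t n)
{
  enum { MAX_N = 64 };
  double wide[MAX_N], r_vv[MAX_N], r_wv[MAX_N];

  if (n > MAX_N)
    return 0;

  for (size_t i = 0; i < n; i++)
    wide[i] = (double) a[i];	/* The separate float_extend.  */

  widen_mul_vv (r_vv, a, b, n);
  widen_mul_wv (r_wv, wide, b, n);

  for (size_t i = 0; i < n; i++)
    if (r_vv[i] != r_wv[i])
      return 0;
  return 1;
}
```

Calling shapes_agree on two small float arrays should return 1, which is the
property that makes it safe for the split to fall back to an explicit extend
followed by an ordinary multiply when the second extension is not combined.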

gcc/ChangeLog:

        * config/riscv/autovec-opt.md (@pred_single_widen_mul<any_extend:su><mode>): Change "@" into "*" in pattern name which simplifies build files.
        (*pred_single_widen_mul<any_extend:su><mode>): Ditto.
        (*pred_single_widen_mul<mode>): New pattern.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/widen/widen-3.c: Add floating-point.
        * gcc.target/riscv/rvv/autovec/widen/widen-7.c: Ditto.
        * gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c: Ditto.
        * gcc.target/riscv/rvv/autovec/widen/widen_run-3.c: Ditto.
        * gcc.target/riscv/rvv/autovec/widen/widen_run-7.c: Ditto.
        * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c: New test.
---
 gcc/config/riscv/autovec-opt.md               | 39 +++++++++++++++++++
 .../riscv/rvv/autovec/widen/widen-3.c         |  7 +++-
 .../riscv/rvv/autovec/widen/widen-7.c         |  7 +++-
 .../rvv/autovec/widen/widen-complicate-3.c    |  7 +++-
 .../riscv/rvv/autovec/widen/widen_run-3.c     |  5 ++-
 .../riscv/rvv/autovec/widen/widen_run-7.c     |  5 ++-
 .../rvv/autovec/widen/widen_run_zvfh-3.c      | 28 +++++++++++++
 .../rvv/autovec/widen/widen_run_zvfh-7.c      | 28 +++++++++++++
 8 files changed, 116 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index fd9cd27f50a..99b609a99d9 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -406,6 +406,45 @@ (define_insn "*pred_extract_first_sextsi<mode>"
   [(set_attr "type" "vimovvx")
    (set_attr "mode" "<MODE>")])
 
+;; We don't have vfwmul.wv instruction like vfwadd.wv in RVV.
+;; This pattern is an intermediate RTL IR as a pseudo vfwmul.wv to enhance
+;; optimization of instructions combine.
+(define_insn_and_split "*pred_single_widen_mul<mode>"
+  [(set (match_operand:VWEXTF 0 "register_operand"                  "=&vr,  &vr")
+       (if_then_else:VWEXTF
+         (unspec:<VM>
+           [(match_operand:<VM> 1 "vector_mask_operand"           "vmWc1,vmWc1")
+            (match_operand 5 "vector_length_operand"              "   rK,   rK")
+            (match_operand 6 "const_int_operand"                  "    i,    i")
+            (match_operand 7 "const_int_operand"                  "    i,    i")
+            (match_operand 8 "const_int_operand"                  "    i,    i")
+            (match_operand 9 "const_int_operand"                  "    i,    i")
+            (reg:SI VL_REGNUM)
+            (reg:SI VTYPE_REGNUM)
+            (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
+         (mult:VWEXTF
+           (float_extend:VWEXTF
+             (match_operand:<V_DOUBLE_TRUNC> 4 "register_operand" "   vr,   vr"))
+           (match_operand:VWEXTF 3 "register_operand"             "   vr,   vr"))
+         (match_operand:VWEXTF 2 "vector_merge_operand"           "   vu,    0")))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    insn_code icode = code_for_pred_extend (<MODE>mode);
+    rtx tmp = gen_reg_rtx (<MODE>mode);
+    rtx ops[] = {tmp, operands[4]};
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, ops);
+
+    emit_insn (gen_pred (MULT, <MODE>mode, operands[0], operands[1], operands[2],
+                        operands[3], tmp, operands[5], operands[6],
+                        operands[7], operands[8], operands[9]));
+    DONE;
+  }
+  [(set_attr "type" "vfwmul")
+   (set_attr "mode" "<MODE>")])
+
 ;; -------------------------------------------------------------------------
 ;; ---- [FP] VFWMACC
 ;; -------------------------------------------------------------------------
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-3.c
index 609a5c09f70..b2b14405902 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math" } */
 
 #include <stdint-gcc.h>
 
@@ -19,9 +19,12 @@
   TEST_TYPE (int32_t, int16_t)                                                 \
   TEST_TYPE (uint32_t, uint16_t)                                               \
   TEST_TYPE (int64_t, int32_t)                                                 \
-  TEST_TYPE (uint64_t, uint32_t)
+  TEST_TYPE (uint64_t, uint32_t)                                               \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
 
 TEST_ALL ()
 
 /* { dg-final { scan-assembler-times {\tvwmul\.vv} 3 } } */
 /* { dg-final { scan-assembler-times {\tvwmulu\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfwmul\.vv} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c
index cc43d9ba3fe..3806e8b98ee 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math" } */
 
 #include <stdint-gcc.h>
 
@@ -19,9 +19,12 @@
   TEST_TYPE (int32_t, int16_t)                                                 \
   TEST_TYPE (uint32_t, uint16_t)                                               \
   TEST_TYPE (int64_t, int32_t)                                                 \
-  TEST_TYPE (uint64_t, uint32_t)
+  TEST_TYPE (uint64_t, uint32_t)                                               \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
 
 TEST_ALL ()
 
 /* { dg-final { scan-assembler-times {\tvsext\.vf2} 3 } } */
 /* { dg-final { scan-assembler-times {\tvzext\.vf2} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfwcvt} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c
index e1fd79430c3..1515374890d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math" } */
 
 #include <stdint-gcc.h>
 
@@ -24,9 +24,12 @@
   TEST_TYPE (int32_t, int16_t)                                                 \
   TEST_TYPE (uint32_t, uint16_t)                                               \
   TEST_TYPE (int64_t, int32_t)                                                 \
-  TEST_TYPE (uint64_t, uint32_t)
+  TEST_TYPE (uint64_t, uint32_t)                                               \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
 
 TEST_ALL ()
 
 /* { dg-final { scan-assembler-times {\tvwmul\.vv} 12 } } */
 /* { dg-final { scan-assembler-times {\tvwmulu\.vv} 12 } } */
+/* { dg-final { scan-assembler-times {\tvfwmul\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-3.c
index beb0cc2b58b..b7dd60fa8e8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-3.c
@@ -1,5 +1,5 @@
 /* { dg-do run { target { riscv_vector } } } */
-/* { dg-additional-options "--param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
 
 #include <assert.h>
 #include "widen-3.c"
@@ -25,7 +25,8 @@
   RUN (int32_t, int16_t, -32768)                                               \
   RUN (uint32_t, uint16_t, 65535)                                              \
   RUN (int64_t, int32_t, -2147483648)                                          \
-  RUN (uint64_t, uint32_t, 4294967295)
+  RUN (uint64_t, uint32_t, 4294967295)                                         \
+  RUN (double, float, -2147483648)
 
 int
 main ()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c
index 4abddd5d718..ab29f4a0f70 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c
@@ -1,5 +1,5 @@
 /* { dg-do run { target { riscv_vector } } } */
-/* { dg-additional-options "--param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
 
 #include <assert.h>
 #include "widen-7.c"
@@ -25,7 +25,8 @@
   RUN (int32_t, int16_t, -32768)                                               \
   RUN (uint32_t, uint16_t, 65535)                                              \
   RUN (int64_t, int32_t, -2147483648)                                          \
-  RUN (uint64_t, uint32_t, 4294967295)
+  RUN (uint64_t, uint32_t, 4294967295)                                         \
+  RUN (double, float, -2147483648)
 
 int
 main ()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c
new file mode 100644
index 00000000000..c3efd0b97bf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c
@@ -0,0 +1,28 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <assert.h>
+#include "widen-3.c"
+
+#define SZ 512
+
+#define RUN(TYPE1, TYPE2, LIMIT)                                               \
+  TYPE2 a##TYPE2[SZ];                                                          \
+  TYPE2 b##TYPE2[SZ];                                                          \
+  TYPE1 dst##TYPE1[SZ];                                                        \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      a##TYPE2[i] = LIMIT + i % 8723;                                          \
+      b##TYPE2[i] = LIMIT + i & 1964;                                          \
+    }                                                                          \
+  vwmul_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE2, b##TYPE2, SZ);                  \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst##TYPE1[i] == ((TYPE1) a##TYPE2[i] * (TYPE1) b##TYPE2[i]));
+
+#define RUN_ALL() RUN (float, _Float16, -32768)
+
+int
+main ()
+{
+  RUN_ALL ()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c
new file mode 100644
index 00000000000..60e2401c088
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c
@@ -0,0 +1,28 @@
+/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <assert.h>
+#include "widen-7.c"
+
+#define SZ 512
+
+#define RUN(TYPE1, TYPE2, LIMIT)                                               \
+  TYPE2 a##TYPE2[SZ];                                                          \
+  TYPE1 b##TYPE1[SZ];                                                          \
+  TYPE1 dst##TYPE1[SZ];                                                        \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      a##TYPE2[i] = LIMIT + i % LIMIT;                                         \
+      b##TYPE1[i] = LIMIT + i & LIMIT;                                         \
+    }                                                                          \
+  vwmul_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE2, b##TYPE1, SZ);                  \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst##TYPE1[i] == (((TYPE1) a##TYPE2[i]) * b##TYPE1[i]));
+
+#define RUN_ALL() RUN (float, _Float16, -32768)
+
+int
+main ()
+{
+  RUN_ALL ()
+}
-- 
2.41.0
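
A side note on the RUN macros in the new run tests above: in C, binary +
binds tighter than &, so an initializer written as LIMIT + i & 1964 groups
as (LIMIT + i) & 1964.  The small sketch below shows the difference between
the two possible groupings; the function names are illustrative only and do
not appear in the patch.  Compilers may emit a -Wparentheses warning for the
first function, which is expected here.

```c
#include <assert.h>

/* Same token order as the test initializer: parsed as (limit + i) & 1964.  */
static int
init_as_written (int limit, int i)
{
  return limit + i & 1964;
}

/* The grouping the compiler actually uses, spelled out.  */
static int
init_explicit (int limit, int i)
{
  return (limit + i) & 1964;
}

/* The other grouping, which the expression does NOT mean.  */
static int
init_other_grouping (int limit, int i)
{
  return limit + (i & 1964);
}
```

With limit = -32768 and i = 5, the first two functions both return 4, while
the third returns -32764.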



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-07-03  8:49                     ` Robin Dapp
@ 2023-07-03  8:51                       ` juzhe.zhong
  0 siblings, 0 replies; 22+ messages in thread
From: juzhe.zhong @ 2023-07-03  8:51 UTC (permalink / raw)
  To: Robin Dapp
  Cc: Robin Dapp, jeffreyalaw, gcc-patches, kito.cheng, Kito.cheng,
	palmer, palmer

[-- Attachment #1: Type: text/plain, Size: 15981 bytes --]

OK. Thanks. Will commit with your cleanup patch.



juzhe.zhong@rivai.ai
 
From: Robin Dapp
Date: 2023-07-03 16:49
To: juzhe.zhong@rivai.ai
CC: rdapp.gcc; jeffreyalaw; gcc-patches; kito.cheng; Kito.cheng; palmer; palmer
Subject: Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
On 7/3/23 10:45, juzhe.zhong@rivai.ai wrote:
> We can apply it but not sure why the patchwork shows it's rejected.
 
I believe it also failed for me locally because the order of
patterns in autovec-opt.md was somehow different.  The one attached
worked for me though after some minor merge adjustments on my branch.
 
Regards
Robin
 
From 29b12a473a31b2caa64fa2d1d97920a460ced0a2 Mon Sep 17 00:00:00 2001
From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Date: Wed, 28 Jun 2023 12:15:12 +0800
Subject: [PATCH] RISC-V: Support vfwmul.vv combine lowering
 
Consider the following complicate case:
#define TEST_TYPE(TYPE1, TYPE2)                                                \
  __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 (                         \
    TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3,     \
    TYPE1 *__restrict dst4, TYPE2 *__restrict a, TYPE2 *__restrict b,          \
    TYPE2 *__restrict a2, TYPE2 *__restrict b2, int n)                         \
  {                                                                            \
    for (int i = 0; i < n; i++)                                                \
      {                                                                        \
dst[i] = (TYPE1) a[i] * (TYPE1) b[i];                                  \
dst2[i] = (TYPE1) a2[i] * (TYPE1) b[i];                                \
dst3[i] = (TYPE1) a2[i] * (TYPE1) a[i];                                \
dst4[i] = (TYPE1) a[i] * (TYPE1) b2[i];                                \
      }                                                                        \
  }
 
TEST_TYPE (double, float)
 
Such complicate situation, Combine PASS can not combine extension of both operands on the fly.
So the combine PASS will first try to combine one of the combine extension, and then combine
the other. The combine flow is as follows:
 
Original IR:
(set (reg 0) (float_extend: (reg 1))
(set (reg 3) (float_extend: (reg 2))
(set (reg 4) (mult: (reg 0) (reg 3))
 
First step of combine:
(set (reg 3) (float_extend: (reg 2))
(set (reg 4) (mult: (float_extend: (reg 1) (reg 3))
 
Second step of combine:
(set (reg 4) (mult: (float_extend: (reg 1) (float_extend: (reg 2))
 
So, to enhance the combine optimization, we add a "pseudo vwfmul.wv" RTL pattern in autovec-opt.md
which is (set (reg 0) (mult (float_extend (reg 1) (reg 2)))).
 
gcc/ChangeLog:
 
        * config/riscv/autovec-opt.md (@pred_single_widen_mul<any_extend:su><mode>): Change "@" into "*" in pattern name which simplifies build files.
        (*pred_single_widen_mul<any_extend:su><mode>): Ditto.
        (*pred_single_widen_mul<mode>): New pattern.
 
gcc/testsuite/ChangeLog:
 
        * gcc.target/riscv/rvv/autovec/widen/widen-3.c: Add floating-point.
        * gcc.target/riscv/rvv/autovec/widen/widen-7.c: Ditto.
        * gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c: Ditto.
        * gcc.target/riscv/rvv/autovec/widen/widen_run-3.c: Ditto.
        * gcc.target/riscv/rvv/autovec/widen/widen_run-7.c: Ditto.
        * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c: New test.
---
gcc/config/riscv/autovec-opt.md               | 39 +++++++++++++++++++
.../riscv/rvv/autovec/widen/widen-3.c         |  7 +++-
.../riscv/rvv/autovec/widen/widen-7.c         |  7 +++-
.../rvv/autovec/widen/widen-complicate-3.c    |  7 +++-
.../riscv/rvv/autovec/widen/widen_run-3.c     |  5 ++-
.../riscv/rvv/autovec/widen/widen_run-7.c     |  5 ++-
.../rvv/autovec/widen/widen_run_zvfh-3.c      | 28 +++++++++++++
.../rvv/autovec/widen/widen_run_zvfh-7.c      | 28 +++++++++++++
8 files changed, 116 insertions(+), 10 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c
 
diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index fd9cd27f50a..99b609a99d9 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -406,6 +406,45 @@ (define_insn "*pred_extract_first_sextsi<mode>"
   [(set_attr "type" "vimovvx")
    (set_attr "mode" "<MODE>")])
+;; We don't have vfwmul.wv instruction like vfwadd.wv in RVV.
+;; This pattern is an intermediate RTL IR as a pseudo vfwmul.wv to enhance
+;; optimization of instructions combine.
+(define_insn_and_split "*pred_single_widen_mul<mode>"
+  [(set (match_operand:VWEXTF 0 "register_operand"                  "=&vr,  &vr")
+       (if_then_else:VWEXTF
+         (unspec:<VM>
+           [(match_operand:<VM> 1 "vector_mask_operand"           "vmWc1,vmWc1")
+            (match_operand 5 "vector_length_operand"              "   rK,   rK")
+            (match_operand 6 "const_int_operand"                  "    i,    i")
+            (match_operand 7 "const_int_operand"                  "    i,    i")
+            (match_operand 8 "const_int_operand"                  "    i,    i")
+            (match_operand 9 "const_int_operand"                  "    i,    i")
+            (reg:SI VL_REGNUM)
+            (reg:SI VTYPE_REGNUM)
+            (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
+         (mult:VWEXTF
+           (float_extend:VWEXTF
+             (match_operand:<V_DOUBLE_TRUNC> 4 "register_operand" "   vr,   vr"))
+           (match_operand:VWEXTF 3 "register_operand"             "   vr,   vr"))
+         (match_operand:VWEXTF 2 "vector_merge_operand"           "   vu,    0")))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    insn_code icode = code_for_pred_extend (<MODE>mode);
+    rtx tmp = gen_reg_rtx (<MODE>mode);
+    rtx ops[] = {tmp, operands[4]};
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, ops);
+
+    emit_insn (gen_pred (MULT, <MODE>mode, operands[0], operands[1], operands[2],
+                        operands[3], tmp, operands[5], operands[6],
+                        operands[7], operands[8], operands[9]));
+    DONE;
+  }
+  [(set_attr "type" "vfwmul")
+   (set_attr "mode" "<MODE>")])
+
;; -------------------------------------------------------------------------
;; ---- [FP] VFWMACC
;; -------------------------------------------------------------------------
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-3.c
index 609a5c09f70..b2b14405902 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-3.c
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math" } */
#include <stdint-gcc.h>
@@ -19,9 +19,12 @@
   TEST_TYPE (int32_t, int16_t)                                                 \
   TEST_TYPE (uint32_t, uint16_t)                                               \
   TEST_TYPE (int64_t, int32_t)                                                 \
-  TEST_TYPE (uint64_t, uint32_t)
+  TEST_TYPE (uint64_t, uint32_t)                                               \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
TEST_ALL ()
/* { dg-final { scan-assembler-times {\tvwmul\.vv} 3 } } */
/* { dg-final { scan-assembler-times {\tvwmulu\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfwmul\.vv} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c
index cc43d9ba3fe..3806e8b98ee 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math" } */
#include <stdint-gcc.h>
@@ -19,9 +19,12 @@
   TEST_TYPE (int32_t, int16_t)                                                 \
   TEST_TYPE (uint32_t, uint16_t)                                               \
   TEST_TYPE (int64_t, int32_t)                                                 \
-  TEST_TYPE (uint64_t, uint32_t)
+  TEST_TYPE (uint64_t, uint32_t)                                               \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
TEST_ALL ()
/* { dg-final { scan-assembler-times {\tvsext\.vf2} 3 } } */
/* { dg-final { scan-assembler-times {\tvzext\.vf2} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfwcvt} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c
index e1fd79430c3..1515374890d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math" } */
#include <stdint-gcc.h>
@@ -24,9 +24,12 @@
   TEST_TYPE (int32_t, int16_t)                                                 \
   TEST_TYPE (uint32_t, uint16_t)                                               \
   TEST_TYPE (int64_t, int32_t)                                                 \
-  TEST_TYPE (uint64_t, uint32_t)
+  TEST_TYPE (uint64_t, uint32_t)                                               \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
TEST_ALL ()
/* { dg-final { scan-assembler-times {\tvwmul\.vv} 12 } } */
/* { dg-final { scan-assembler-times {\tvwmulu\.vv} 12 } } */
+/* { dg-final { scan-assembler-times {\tvfwmul\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-3.c
index beb0cc2b58b..b7dd60fa8e8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-3.c
@@ -1,5 +1,5 @@
/* { dg-do run { target { riscv_vector } } } */
-/* { dg-additional-options "--param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
#include <assert.h>
#include "widen-3.c"
@@ -25,7 +25,8 @@
   RUN (int32_t, int16_t, -32768)                                               \
   RUN (uint32_t, uint16_t, 65535)                                              \
   RUN (int64_t, int32_t, -2147483648)                                          \
-  RUN (uint64_t, uint32_t, 4294967295)
+  RUN (uint64_t, uint32_t, 4294967295)                                         \
+  RUN (double, float, -2147483648)
int
main ()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c
index 4abddd5d718..ab29f4a0f70 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c
@@ -1,5 +1,5 @@
/* { dg-do run { target { riscv_vector } } } */
-/* { dg-additional-options "--param=riscv-autovec-preference=scalable" } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
#include <assert.h>
#include "widen-7.c"
@@ -25,7 +25,8 @@
   RUN (int32_t, int16_t, -32768)                                               \
   RUN (uint32_t, uint16_t, 65535)                                              \
   RUN (int64_t, int32_t, -2147483648)                                          \
-  RUN (uint64_t, uint32_t, 4294967295)
+  RUN (uint64_t, uint32_t, 4294967295)                                         \
+  RUN (double, float, -2147483648)
int
main ()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c
new file mode 100644
index 00000000000..c3efd0b97bf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c
@@ -0,0 +1,28 @@
+/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <assert.h>
+#include "widen-3.c"
+
+#define SZ 512
+
+#define RUN(TYPE1, TYPE2, LIMIT)                                               \
+  TYPE2 a##TYPE2[SZ];                                                          \
+  TYPE2 b##TYPE2[SZ];                                                          \
+  TYPE1 dst##TYPE1[SZ];                                                        \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      a##TYPE2[i] = LIMIT + i % 8723;                                          \
+      b##TYPE2[i] = LIMIT + i & 1964;                                          \
+    }                                                                          \
+  vwmul_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE2, b##TYPE2, SZ);                  \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst##TYPE1[i] == ((TYPE1) a##TYPE2[i] * (TYPE1) b##TYPE2[i]));
+
+#define RUN_ALL() RUN (float, _Float16, -32768)
+
+int
+main ()
+{
+  RUN_ALL ()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c
new file mode 100644
index 00000000000..60e2401c088
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c
@@ -0,0 +1,28 @@
+/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <assert.h>
+#include "widen-7.c"
+
+#define SZ 512
+
+#define RUN(TYPE1, TYPE2, LIMIT)                                               \
+  TYPE2 a##TYPE2[SZ];                                                          \
+  TYPE1 b##TYPE1[SZ];                                                          \
+  TYPE1 dst##TYPE1[SZ];                                                        \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      a##TYPE2[i] = LIMIT + i % LIMIT;                                         \
+      b##TYPE1[i] = LIMIT + i & LIMIT;                                         \
+    }                                                                          \
+  vwmul_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE2, b##TYPE1, SZ);                  \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst##TYPE1[i] == (((TYPE1) a##TYPE2[i]) * b##TYPE1[i]));
+
+#define RUN_ALL() RUN (float, _Float16, -32768)
+
+int
+main ()
+{
+  RUN_ALL ()
+}
-- 
2.41.0
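
For readers following the run tests above, here is a minimal scalar sketch of the property they assert for the (double, float) case. It is not part of the patch: the names widen_mul_ref and count_mismatches are hypothetical, and the initialisation only mirrors the LIMIT-based pattern used in the zvfh run test. The loop body is the source form the patch aims to lower to vfwmul.vv: both operands are extended to the wider type before the multiply.

```c
#include <assert.h>
#include <stddef.h>

#define SZ 512

/* Hypothetical scalar reference (not from the patch): extend both
   float operands to double, then multiply.  */
void
widen_mul_ref (double *restrict dst, const float *restrict a,
	       const float *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (double) a[i] * (double) b[i];
}

/* Fill the inputs in the style of the RUN macro, compute the widened
   products, and return how many elements differ from a recomputed
   reference value.  */
int
count_mismatches (void)
{
  float a[SZ], b[SZ];
  double dst[SZ];

  for (int i = 0; i < SZ; i++)
    {
      a[i] = (float) (-32768 + i % 8723);
      /* Parenthesised explicitly; `+` binds tighter than `&`.  */
      b[i] = (float) ((-32768 + i) & 1964);
    }

  widen_mul_ref (dst, a, b, SZ);

  int bad = 0;
  for (int i = 0; i < SZ; i++)
    if (dst[i] != (double) a[i] * (double) b[i])
      bad++;
  return bad;
}
```

The exact `==` comparison is reasonable here because the product of two float values (24-bit significands) is exactly representable in double (53-bit significand), so the widened multiply introduces no rounding.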
 
 
 


* Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-07-03  8:42               ` juzhe.zhong
  2023-07-03  8:44                 ` Robin Dapp
@ 2023-07-07 21:11                 ` Jeff Law
  2023-07-07 23:05                   ` 钟居哲
  1 sibling, 1 reply; 22+ messages in thread
From: Jeff Law @ 2023-07-07 21:11 UTC (permalink / raw)
  To: juzhe.zhong, Robin Dapp
  Cc: gcc-patches, kito.cheng, Kito.cheng, palmer, palmer



On 7/3/23 02:42, juzhe.zhong@rivai.ai wrote:
> We failed to merge it since it's been rejected.
> https://patchwork.sourceware.org/project/gcc/patch/20230628041512.188243-1-juzhe.zhong@rivai.ai/
That was based on the belief that the bridging patterns should not be 
needed.  With the decision to move forward with those patterns this 
patch should be reconsidered.

jeff


* Re: Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering
  2023-07-07 21:11                 ` Jeff Law
@ 2023-07-07 23:05                   ` 钟居哲
  0 siblings, 0 replies; 22+ messages in thread
From: 钟居哲 @ 2023-07-07 23:05 UTC (permalink / raw)
  To: Jeff Law, rdapp.gcc; +Cc: gcc-patches, kito.cheng, kito.cheng, palmer, palmer


Sure. 

We can revisit this in the future and look for an approach that does not change the codegen quality shown here:
https://godbolt.org/z/d6rWPTWeW 



juzhe.zhong@rivai.ai

