public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] i386: Extend cvtps2pd to memory
@ 2022-06-30  5:59 Haochen Jiang
  2022-06-30  6:20 ` Uros Bizjak
  0 siblings, 1 reply; 9+ messages in thread
From: Haochen Jiang @ 2022-06-30  5:59 UTC (permalink / raw)
  To: gcc-patches; +Cc: ubizjak, hongtao.liu

Hi all,

This patch aims to fix the cvtps2pd insn, which should also work on
memory operand but currently does not. After this fix, when loop == 2,
it will eliminate movq instruction.

Regtested on x86_64-pc-linux-gnu. Ok for trunk?

BRs,
Haochen

gcc/ChangeLog:

	PR target/43618
	* config/i386/sse.md (extendv2sfv2df2): New define_expand.
	(sse2_cvtps2pd_load<mask_name>): Rename extendvsdfv2df2.

gcc/testsuite/ChangeLog:

	PR target/43618
	* gcc.target/i386/pr43618-1.c: New test.
---
 gcc/config/i386/sse.md                    | 24 ++++++++++++++++++-----
 gcc/testsuite/gcc.target/i386/pr43618-1.c | 13 ++++++++++++
 2 files changed, 32 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr43618-1.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 8b2602bfa79..f96bb3dc6c3 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -9175,11 +9175,25 @@
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
+(define_expand "extendv2sfv2df2"
+  [(set (match_operand:V2DF 0 "register_operand")
+        (float_extend:V2DF
+          (match_operand:V2SF 1 "nonimmediate_operand")))]
+  "TARGET_MMX_WITH_SSE"
+{
+  if (!MEM_P (operands[1]))
+    {
+      operands[1] = lowpart_subreg (V4SFmode, operands[1], V2SFmode);
+      emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1]));
+      DONE;
+    }
+})
+
 (define_insn "sse2_cvtps2pd<mask_name>"
   [(set (match_operand:V2DF 0 "register_operand" "=v")
 	(float_extend:V2DF
 	  (vec_select:V2SF
-	    (match_operand:V4SF 1 "vector_operand" "vm")
+	    (match_operand:V4SF 1 "register_operand" "v")
 	    (parallel [(const_int 0) (const_int 1)]))))]
   "TARGET_SSE2 && <mask_avx512vl_condition>"
   "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}"
@@ -9191,12 +9205,12 @@
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "V2DF")])
 
-(define_insn "extendv2sfv2df2"
+(define_insn "sse2_cvtps2pd_load<mask_name>"
   [(set (match_operand:V2DF 0 "register_operand" "=v")
 	(float_extend:V2DF
-	  (match_operand:V2SF 1 "register_operand" "v")))]
-  "TARGET_MMX_WITH_SSE"
-  "%vcvtps2pd\t{%1, %0|%0, %1}"
+	  (match_operand:V2SF 1 "memory_operand" "m")))]
+  "TARGET_MMX_WITH_SSE && <mask_avx512vl_condition>"
+  "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}"
   [(set_attr "type" "ssecvt")
    (set_attr "amdfam10_decode" "direct")
    (set_attr "athlon_decode" "double")
diff --git a/gcc/testsuite/gcc.target/i386/pr43618-1.c b/gcc/testsuite/gcc.target/i386/pr43618-1.c
new file mode 100644
index 00000000000..3c84ea444aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr43618-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "movq" } } */
+/* { dg-final { scan-assembler "cvtps2pd" } } */
+
+void
+foo (float a[2], double b[2])
+{
+    int i;
+    for (i = 0; i < 2; i++)
+      b[i] = a[i];
+}
+
-- 
2.18.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] i386: Extend cvtps2pd to memory
  2022-06-30  5:59 [PATCH] i386: Extend cvtps2pd to memory Haochen Jiang
@ 2022-06-30  6:20 ` Uros Bizjak
  2022-06-30  7:24   ` Jiang, Haochen
  0 siblings, 1 reply; 9+ messages in thread
From: Uros Bizjak @ 2022-06-30  6:20 UTC (permalink / raw)
  To: Haochen Jiang; +Cc: gcc-patches, Hongtao Liu

On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang <haochen.jiang@intel.com> wrote:
>
> Hi all,
>
> This patch aims to fix the cvtps2pd insn, which should also work on
> memory operand but currently does not. After this fix, when loop == 2,
> it will eliminate movq instruction.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?
>
> BRs,
> Haochen
>
> gcc/ChangeLog:
>
>         PR target/43618
>         * config/i386/sse.md (extendv2sfv2df2): New define_expand.
>         (sse2_cvtps2pd_load<mask_name>): Rename extendvsdfv2df2.
>
> gcc/testsuite/ChangeLog:
>
>         PR target/43618
>         * gcc.target/i386/pr43618-1.c: New test.

This patch could be as simple as:

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 8cd0f617bf3..c331445cb2d 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -9195,7 +9195,7 @@
(define_insn "extendv2sfv2df2"
  [(set (match_operand:V2DF 0 "register_operand" "=v")
       (float_extend:V2DF
-         (match_operand:V2SF 1 "register_operand" "v")))]
+         (match_operand:V2SF 1 "nonimmediate_operand" "vm")))]
  "TARGET_MMX_WITH_SSE"
  "%vcvtps2pd\t{%1, %0|%0, %1}"
  [(set_attr "type" "ssecvt")

Uros.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] i386: Extend cvtps2pd to memory
  2022-06-30  6:20 ` Uros Bizjak
@ 2022-06-30  7:24   ` Jiang, Haochen
  2022-06-30  7:41     ` Uros Bizjak
  0 siblings, 1 reply; 9+ messages in thread
From: Jiang, Haochen @ 2022-06-30  7:24 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Liu, Hongtao

> -----Original Message-----
> From: Uros Bizjak <ubizjak@gmail.com>
> Sent: Thursday, June 30, 2022 2:20 PM
> To: Jiang, Haochen <haochen.jiang@intel.com>
> Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com>
> Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> 
> On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang <haochen.jiang@intel.com>
> wrote:
> >
> > Hi all,
> >
> > This patch aims to fix the cvtps2pd insn, which should also work on
> > memory operand but currently does not. After this fix, when loop == 2,
> > it will eliminate movq instruction.
> >
> > Regtested on x86_64-pc-linux-gnu. Ok for trunk?
> >
> > BRs,
> > Haochen
> >
> > gcc/ChangeLog:
> >
> >         PR target/43618
> >         * config/i386/sse.md (extendv2sfv2df2): New define_expand.
> >         (sse2_cvtps2pd_load<mask_name>): Rename extendvsdfv2df2.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         PR target/43618
> >         * gcc.target/i386/pr43618-1.c: New test.
> 
> This patch could be as simple as:
> 
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index
> 8cd0f617bf3..c331445cb2d 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -9195,7 +9195,7 @@
> (define_insn "extendv2sfv2df2"
>   [(set (match_operand:V2DF 0 "register_operand" "=v")
>        (float_extend:V2DF
> -         (match_operand:V2SF 1 "register_operand" "v")))]
> +         (match_operand:V2SF 1 "nonimmediate_operand" "vm")))]
>   "TARGET_MMX_WITH_SSE"
>   "%vcvtps2pd\t{%1, %0|%0, %1}"
>   [(set_attr "type" "ssecvt")

We also tested on this version, it is ok.

The reason why the patch looks like this is because in the previous insn
sse2_cvtps2pd<mask_name>, the constraint vm and vector_operand
actually does not match the actual instruction. Memory operand is V2SF,
not V4SF.

Therefore, we changed the constraint in that insn. Then it caused another issue.
For memory operand, it seems that we cannot generate those mask instructions.
So I change the pattern to how extendv2hfv2df2 works.

Haochen

> Uros.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] i386: Extend cvtps2pd to memory
  2022-06-30  7:24   ` Jiang, Haochen
@ 2022-06-30  7:41     ` Uros Bizjak
  2022-06-30  8:45       ` Uros Bizjak
  0 siblings, 1 reply; 9+ messages in thread
From: Uros Bizjak @ 2022-06-30  7:41 UTC (permalink / raw)
  To: Jiang, Haochen; +Cc: gcc-patches, Liu, Hongtao

On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen <haochen.jiang@intel.com> wrote:
>
> > -----Original Message-----
> > From: Uros Bizjak <ubizjak@gmail.com>
> > Sent: Thursday, June 30, 2022 2:20 PM
> > To: Jiang, Haochen <haochen.jiang@intel.com>
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com>
> > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> >
> > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang <haochen.jiang@intel.com>
> > wrote:
> > >
> > > Hi all,
> > >
> > > This patch aims to fix the cvtps2pd insn, which should also work on
> > > memory operand but currently does not. After this fix, when loop == 2,
> > > it will eliminate movq instruction.
> > >
> > > Regtested on x86_64-pc-linux-gnu. Ok for trunk?
> > >
> > > BRs,
> > > Haochen
> > >
> > > gcc/ChangeLog:
> > >
> > >         PR target/43618
> > >         * config/i386/sse.md (extendv2sfv2df2): New define_expand.
> > >         (sse2_cvtps2pd_load<mask_name>): Rename extendvsdfv2df2.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         PR target/43618
> > >         * gcc.target/i386/pr43618-1.c: New test.
> >
> > This patch could be as simple as:
> >
> > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index
> > 8cd0f617bf3..c331445cb2d 100644
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -9195,7 +9195,7 @@
> > (define_insn "extendv2sfv2df2"
> >   [(set (match_operand:V2DF 0 "register_operand" "=v")
> >        (float_extend:V2DF
> > -         (match_operand:V2SF 1 "register_operand" "v")))]
> > +         (match_operand:V2SF 1 "nonimmediate_operand" "vm")))]
> >   "TARGET_MMX_WITH_SSE"
> >   "%vcvtps2pd\t{%1, %0|%0, %1}"
> >   [(set_attr "type" "ssecvt")
>
> We also tested on this version, it is ok.
>
> The reason why the patch looks like this is because in the previous insn
> sse2_cvtps2pd<mask_name>, the constraint vm and vector_operand
> actually does not match the actual instruction. Memory operand is V2SF,
> not V4SF.
>
> Therefore, we changed the constraint in that insn. Then it caused another issue.
> For memory operand, it seems that we cannot generate those mask instructions.
> So I change the pattern to how extendv2hfv2df2 works.

If you want to change the memory access in sse2_cvtps2pd<mask_name>,
then please see how e.g. <insn>v2hiv2di is handled in sse.md. In
addition to two instructions, you will need one define_insn_and_split
with a pre-reload splitter.

Uros,

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] i386: Extend cvtps2pd to memory
  2022-06-30  7:41     ` Uros Bizjak
@ 2022-06-30  8:45       ` Uros Bizjak
  2022-06-30  8:53         ` Uros Bizjak
  0 siblings, 1 reply; 9+ messages in thread
From: Uros Bizjak @ 2022-06-30  8:45 UTC (permalink / raw)
  To: Jiang, Haochen; +Cc: gcc-patches, Liu, Hongtao

On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen <haochen.jiang@intel.com> wrote:
> >
> > > -----Original Message-----
> > > From: Uros Bizjak <ubizjak@gmail.com>
> > > Sent: Thursday, June 30, 2022 2:20 PM
> > > To: Jiang, Haochen <haochen.jiang@intel.com>
> > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com>
> > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> > >
> > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang <haochen.jiang@intel.com>
> > > wrote:
> > > >
> > > > Hi all,
> > > >
> > > > This patch aims to fix the cvtps2pd insn, which should also work on
> > > > memory operand but currently does not. After this fix, when loop == 2,
> > > > it will eliminate movq instruction.
> > > >
> > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk?
> > > >
> > > > BRs,
> > > > Haochen
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         PR target/43618
> > > >         * config/i386/sse.md (extendv2sfv2df2): New define_expand.
> > > >         (sse2_cvtps2pd_load<mask_name>): Rename extendvsdfv2df2.

Rename FROM ...

Please also mention change to sse2_cvtps2pd<mask_name>.

> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >         PR target/43618
> > > >         * gcc.target/i386/pr43618-1.c: New test.
> > >
> > > This patch could be as simple as:
> > >
> > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index
> > > 8cd0f617bf3..c331445cb2d 100644
> > > --- a/gcc/config/i386/sse.md
> > > +++ b/gcc/config/i386/sse.md
> > > @@ -9195,7 +9195,7 @@
> > > (define_insn "extendv2sfv2df2"
> > >   [(set (match_operand:V2DF 0 "register_operand" "=v")
> > >        (float_extend:V2DF
> > > -         (match_operand:V2SF 1 "register_operand" "v")))]
> > > +         (match_operand:V2SF 1 "nonimmediate_operand" "vm")))]
> > >   "TARGET_MMX_WITH_SSE"
> > >   "%vcvtps2pd\t{%1, %0|%0, %1}"
> > >   [(set_attr "type" "ssecvt")
> >
> > We also tested on this version, it is ok.
> >
> > The reason why the patch looks like this is because in the previous insn
> > sse2_cvtps2pd<mask_name>, the constraint vm and vector_operand
> > actually does not match the actual instruction. Memory operand is V2SF,
> > not V4SF.
> >
> > Therefore, we changed the constraint in that insn. Then it caused another issue.
> > For memory operand, it seems that we cannot generate those mask instructions.
> > So I change the pattern to how extendv2hfv2df2 works.
>
> If you want to change the memory access in sse2_cvtps2pd<mask_name>,
> then please see how e.g. <insn>v2hiv2di is handled in sse.md. In
> addition to two instructions, you will need one define_insn_and_split
> with a pre-reload splitter.

Oh, nowadays combine does vec_select from a paradoxical subreg on its own.

+(define_expand "extendv2sfv2df2"
+  [(set (match_operand:V2DF 0 "register_operand")
+        (float_extend:V2DF
+          (match_operand:V2SF 1 "nonimmediate_operand")))]
+  "TARGET_MMX_WITH_SSE"
+{
+  if (!MEM_P (operands[1]))
+    {

You will need force reg here:

        rtx op1 = force_reg (V2SFmode, operands[1]);
+      operands[1] = lowpart_subreg (V4SFmode, op1, V2SFmode);
+      emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1]));
+      DONE;
+    }
+})


-(define_insn "extendv2sfv2df2"
+(define_insn "sse2_cvtps2pd_load<mask_name>"

Please name this insn "*sse2_cvtps2pd<mask_name>_1". Please note the
star at the beginning, You don't have to make the name public.

OK with the above changes.

Thanks,
Uros,

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] i386: Extend cvtps2pd to memory
  2022-06-30  8:45       ` Uros Bizjak
@ 2022-06-30  8:53         ` Uros Bizjak
  2022-06-30  8:57           ` Liu, Hongtao
  0 siblings, 1 reply; 9+ messages in thread
From: Uros Bizjak @ 2022-06-30  8:53 UTC (permalink / raw)
  To: Jiang, Haochen; +Cc: gcc-patches, Liu, Hongtao

On Thu, Jun 30, 2022 at 10:45 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen <haochen.jiang@intel.com> wrote:
> > >
> > > > -----Original Message-----
> > > > From: Uros Bizjak <ubizjak@gmail.com>
> > > > Sent: Thursday, June 30, 2022 2:20 PM
> > > > To: Jiang, Haochen <haochen.jiang@intel.com>
> > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com>
> > > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> > > >
> > > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang <haochen.jiang@intel.com>
> > > > wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > This patch aims to fix the cvtps2pd insn, which should also work on
> > > > > memory operand but currently does not. After this fix, when loop == 2,
> > > > > it will eliminate movq instruction.
> > > > >
> > > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk?
> > > > >
> > > > > BRs,
> > > > > Haochen
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >         PR target/43618
> > > > >         * config/i386/sse.md (extendv2sfv2df2): New define_expand.
> > > > >         (sse2_cvtps2pd_load<mask_name>): Rename extendvsdfv2df2.
>
> Rename FROM ...
>
> Please also mention change to sse2_cvtps2pd<mask_name>.
>
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > >         PR target/43618
> > > > >         * gcc.target/i386/pr43618-1.c: New test.
> > > >
> > > > This patch could be as simple as:
> > > >
> > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index
> > > > 8cd0f617bf3..c331445cb2d 100644
> > > > --- a/gcc/config/i386/sse.md
> > > > +++ b/gcc/config/i386/sse.md
> > > > @@ -9195,7 +9195,7 @@
> > > > (define_insn "extendv2sfv2df2"
> > > >   [(set (match_operand:V2DF 0 "register_operand" "=v")
> > > >        (float_extend:V2DF
> > > > -         (match_operand:V2SF 1 "register_operand" "v")))]
> > > > +         (match_operand:V2SF 1 "nonimmediate_operand" "vm")))]
> > > >   "TARGET_MMX_WITH_SSE"
> > > >   "%vcvtps2pd\t{%1, %0|%0, %1}"
> > > >   [(set_attr "type" "ssecvt")
> > >
> > > We also tested on this version, it is ok.
> > >
> > > The reason why the patch looks like this is because in the previous insn
> > > sse2_cvtps2pd<mask_name>, the constraint vm and vector_operand
> > > actually does not match the actual instruction. Memory operand is V2SF,
> > > not V4SF.
> > >
> > > Therefore, we changed the constraint in that insn. Then it caused another issue.
> > > For memory operand, it seems that we cannot generate those mask instructions.
> > > So I change the pattern to how extendv2hfv2df2 works.
> >
> > If you want to change the memory access in sse2_cvtps2pd<mask_name>,
> > then please see how e.g. <insn>v2hiv2di is handled in sse.md. In
> > addition to two instructions, you will need one define_insn_and_split
> > with a pre-reload splitter.
>
> Oh, nowadays combine does vec_select from a paradoxical subreg on its own.
>
> +(define_expand "extendv2sfv2df2"
> +  [(set (match_operand:V2DF 0 "register_operand")
> +        (float_extend:V2DF
> +          (match_operand:V2SF 1 "nonimmediate_operand")))]
> +  "TARGET_MMX_WITH_SSE"
> +{
> +  if (!MEM_P (operands[1]))
> +    {
>
> You will need force reg here:
>
>         rtx op1 = force_reg (V2SFmode, operands[1]);
> +      operands[1] = lowpart_subreg (V4SFmode, op1, V2SFmode);
> +      emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1]));
> +      DONE;
> +    }
> +})
>
>
> -(define_insn "extendv2sfv2df2"
> +(define_insn "sse2_cvtps2pd_load<mask_name>"
>
> Please name this insn "*sse2_cvtps2pd<mask_name>_1". Please note the
> star at the beginning, You don't have to make the name public.
>
> OK with the above changes.

Forgot to mention:


-         (match_operand:V2SF 1 "register_operand" "v")))]
-  "TARGET_MMX_WITH_SSE"
-  "%vcvtps2pd\t{%1, %0|%0, %1}"
+         (match_operand:V2SF 1 "memory_operand" "m")))]
+  "TARGET_MMX_WITH_SSE && <mask_avx512vl_condition>"
+  "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_oper
and2>, %q1}"
   [(set_attr "type" "ssecvt")

The new insn does not need to be limited to TARGET_MMX_WITH_SSE, so we
can use TARGET_SSE2 here.

Which opens the question if the expander could also be TARGET_SSE2
only. There are no MMX registers involved in any of the patterns
anymore.

Uros.
>
> Thanks,
> Uros,

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] i386: Extend cvtps2pd to memory
  2022-06-30  8:53         ` Uros Bizjak
@ 2022-06-30  8:57           ` Liu, Hongtao
  2022-07-04  5:09             ` Jiang, Haochen
  0 siblings, 1 reply; 9+ messages in thread
From: Liu, Hongtao @ 2022-06-30  8:57 UTC (permalink / raw)
  To: Uros Bizjak, Jiang, Haochen; +Cc: gcc-patches



> -----Original Message-----
> From: Uros Bizjak <ubizjak@gmail.com>
> Sent: Thursday, June 30, 2022 4:53 PM
> To: Jiang, Haochen <haochen.jiang@intel.com>
> Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com>
> Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> 
> On Thu, Jun 30, 2022 at 10:45 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> > >
> > > On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen <haochen.jiang@intel.com>
> wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Uros Bizjak <ubizjak@gmail.com>
> > > > > Sent: Thursday, June 30, 2022 2:20 PM
> > > > > To: Jiang, Haochen <haochen.jiang@intel.com>
> > > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao
> > > > > <hongtao.liu@intel.com>
> > > > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> > > > >
> > > > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang
> > > > > <haochen.jiang@intel.com>
> > > > > wrote:
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > This patch aims to fix the cvtps2pd insn, which should also
> > > > > > work on memory operand but currently does not. After this fix,
> > > > > > when loop == 2, it will eliminate movq instruction.
> > > > > >
> > > > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk?
> > > > > >
> > > > > > BRs,
> > > > > > Haochen
> > > > > >
> > > > > > gcc/ChangeLog:
> > > > > >
> > > > > >         PR target/43618
> > > > > >         * config/i386/sse.md (extendv2sfv2df2): New define_expand.
> > > > > >         (sse2_cvtps2pd_load<mask_name>): Rename extendvsdfv2df2.
> >
> > Rename FROM ...
> >
> > Please also mention change to sse2_cvtps2pd<mask_name>.
> >
> > > > > >
> > > > > > gcc/testsuite/ChangeLog:
> > > > > >
> > > > > >         PR target/43618
> > > > > >         * gcc.target/i386/pr43618-1.c: New test.
> > > > >
> > > > > This patch could be as simple as:
> > > > >
> > > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > > > > index 8cd0f617bf3..c331445cb2d 100644
> > > > > --- a/gcc/config/i386/sse.md
> > > > > +++ b/gcc/config/i386/sse.md
> > > > > @@ -9195,7 +9195,7 @@
> > > > > (define_insn "extendv2sfv2df2"
> > > > >   [(set (match_operand:V2DF 0 "register_operand" "=v")
> > > > >        (float_extend:V2DF
> > > > > -         (match_operand:V2SF 1 "register_operand" "v")))]
> > > > > +         (match_operand:V2SF 1 "nonimmediate_operand" "vm")))]
> > > > >   "TARGET_MMX_WITH_SSE"
> > > > >   "%vcvtps2pd\t{%1, %0|%0, %1}"
> > > > >   [(set_attr "type" "ssecvt")
> > > >
> > > > We also tested on this version, it is ok.
> > > >
> > > > The reason why the patch looks like this is because in the
> > > > previous insn sse2_cvtps2pd<mask_name>, the constraint vm and
> > > > vector_operand actually does not match the actual instruction.
> > > > Memory operand is V2SF, not V4SF.
> > > >
> > > > Therefore, we changed the constraint in that insn. Then it caused another
> issue.
> > > > For memory operand, it seems that we cannot generate those mask
> instructions.
> > > > So I change the pattern to how extendv2hfv2df2 works.
> > >
> > > If you want to change the memory access in sse2_cvtps2pd<mask_name>,
> > > then please see how e.g. <insn>v2hiv2di is handled in sse.md. In
> > > addition to two instructions, you will need one
> > > define_insn_and_split with a pre-reload splitter.
> >
> > Oh, nowadays combine does vec_select from a paradoxical subreg on its own.
> >
> > +(define_expand "extendv2sfv2df2"
> > +  [(set (match_operand:V2DF 0 "register_operand")
> > +        (float_extend:V2DF
> > +          (match_operand:V2SF 1 "nonimmediate_operand")))]
> > +  "TARGET_MMX_WITH_SSE"
> > +{
> > +  if (!MEM_P (operands[1]))
> > +    {
> >
> > You will need force reg here:
> >
> >         rtx op1 = force_reg (V2SFmode, operands[1]);
> > +      operands[1] = lowpart_subreg (V4SFmode, op1, V2SFmode);
> > +      emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1]));
> > +      DONE;
> > +    }
> > +})
> >
> >
> > -(define_insn "extendv2sfv2df2"
> > +(define_insn "sse2_cvtps2pd_load<mask_name>"
> >
> > Please name this insn "*sse2_cvtps2pd<mask_name>_1". Please note the
> > star at the beginning, You don't have to make the name public.
> >
> > OK with the above changes.
> 
> Forgot to mention:
> 
> 
> -         (match_operand:V2SF 1 "register_operand" "v")))]
> -  "TARGET_MMX_WITH_SSE"
> -  "%vcvtps2pd\t{%1, %0|%0, %1}"
> +         (match_operand:V2SF 1 "memory_operand" "m")))]
> + "TARGET_MMX_WITH_SSE && <mask_avx512vl_condition>"
> +  "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_oper
> and2>, %q1}"
>    [(set_attr "type" "ssecvt")
> 
> The new insn does not need to be limited to TARGET_MMX_WITH_SSE, so we
> can use TARGET_SSE2 here.
> 
> Which opens the question if the expander could also be TARGET_SSE2 only.
> There are no MMX registers involved in any of the patterns anymore.
Yes.
> 
> Uros.
> >
> > Thanks,
> > Uros,

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] i386: Extend cvtps2pd to memory
  2022-06-30  8:57           ` Liu, Hongtao
@ 2022-07-04  5:09             ` Jiang, Haochen
  2022-07-04  6:17               ` Uros Bizjak
  0 siblings, 1 reply; 9+ messages in thread
From: Jiang, Haochen @ 2022-07-04  5:09 UTC (permalink / raw)
  To: Liu, Hongtao, Uros Bizjak; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5795 bytes --]

Hi all,

I revised my patch according to all your reviews.

Regtested on x86_64-pc-linux-gnu.

BRs,
Haochen

> -----Original Message-----
> From: Liu, Hongtao <hongtao.liu@intel.com>
> Sent: Thursday, June 30, 2022 4:57 PM
> To: Uros Bizjak <ubizjak@gmail.com>; Jiang, Haochen
> <haochen.jiang@intel.com>
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH] i386: Extend cvtps2pd to memory
> 
> 
> 
> > -----Original Message-----
> > From: Uros Bizjak <ubizjak@gmail.com>
> > Sent: Thursday, June 30, 2022 4:53 PM
> > To: Jiang, Haochen <haochen.jiang@intel.com>
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com>
> > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> >
> > On Thu, Jun 30, 2022 at 10:45 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> > >
> > > On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> > > >
> > > > On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen
> <haochen.jiang@intel.com>
> > wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Uros Bizjak <ubizjak@gmail.com>
> > > > > > Sent: Thursday, June 30, 2022 2:20 PM
> > > > > > To: Jiang, Haochen <haochen.jiang@intel.com>
> > > > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao
> > > > > > <hongtao.liu@intel.com>
> > > > > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> > > > > >
> > > > > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang
> > > > > > <haochen.jiang@intel.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > This patch aims to fix the cvtps2pd insn, which should also
> > > > > > > work on memory operand but currently does not. After this fix,
> > > > > > > when loop == 2, it will eliminate movq instruction.
> > > > > > >
> > > > > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk?
> > > > > > >
> > > > > > > BRs,
> > > > > > > Haochen
> > > > > > >
> > > > > > > gcc/ChangeLog:
> > > > > > >
> > > > > > >         PR target/43618
> > > > > > >         * config/i386/sse.md (extendv2sfv2df2): New define_expand.
> > > > > > >         (sse2_cvtps2pd_load<mask_name>): Rename
> extendvsdfv2df2.
> > >
> > > Rename FROM ...
> > >
> > > Please also mention change to sse2_cvtps2pd<mask_name>.
> > >
> > > > > > >
> > > > > > > gcc/testsuite/ChangeLog:
> > > > > > >
> > > > > > >         PR target/43618
> > > > > > >         * gcc.target/i386/pr43618-1.c: New test.
> > > > > >
> > > > > > This patch could be as simple as:
> > > > > >
> > > > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > > > > > index 8cd0f617bf3..c331445cb2d 100644
> > > > > > --- a/gcc/config/i386/sse.md
> > > > > > +++ b/gcc/config/i386/sse.md
> > > > > > @@ -9195,7 +9195,7 @@
> > > > > > (define_insn "extendv2sfv2df2"
> > > > > >   [(set (match_operand:V2DF 0 "register_operand" "=v")
> > > > > >        (float_extend:V2DF
> > > > > > -         (match_operand:V2SF 1 "register_operand" "v")))]
> > > > > > +         (match_operand:V2SF 1 "nonimmediate_operand" "vm")))]
> > > > > >   "TARGET_MMX_WITH_SSE"
> > > > > >   "%vcvtps2pd\t{%1, %0|%0, %1}"
> > > > > >   [(set_attr "type" "ssecvt")
> > > > >
> > > > > We also tested on this version, it is ok.
> > > > >
> > > > > The reason why the patch looks like this is because in the
> > > > > previous insn sse2_cvtps2pd<mask_name>, the constraint vm and
> > > > > vector_operand actually does not match the actual instruction.
> > > > > Memory operand is V2SF, not V4SF.
> > > > >
> > > > > Therefore, we changed the constraint in that insn. Then it caused
> another
> > issue.
> > > > > For memory operand, it seems that we cannot generate those mask
> > instructions.
> > > > > So I change the pattern to how extendv2hfv2df2 works.
> > > >
> > > > If you want to change the memory access in
> sse2_cvtps2pd<mask_name>,
> > > > then please see how e.g. <insn>v2hiv2di is handled in sse.md. In
> > > > addition to two instructions, you will need one
> > > > define_insn_and_split with a pre-reload splitter.
> > >
> > > Oh, nowadays combine does vec_select from a paradoxical subreg on its
> own.
> > >
> > > +(define_expand "extendv2sfv2df2"
> > > +  [(set (match_operand:V2DF 0 "register_operand")
> > > +        (float_extend:V2DF
> > > +          (match_operand:V2SF 1 "nonimmediate_operand")))]
> > > +  "TARGET_MMX_WITH_SSE"
> > > +{
> > > +  if (!MEM_P (operands[1]))
> > > +    {
> > >
> > > You will need force reg here:
> > >
> > >         rtx op1 = force_reg (V2SFmode, operands[1]);
> > > +      operands[1] = lowpart_subreg (V4SFmode, op1, V2SFmode);
> > > +      emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1]));
> > > +      DONE;
> > > +    }
> > > +})
> > >
> > >
> > > -(define_insn "extendv2sfv2df2"
> > > +(define_insn "sse2_cvtps2pd_load<mask_name>"
> > >
> > > Please name this insn "*sse2_cvtps2pd<mask_name>_1". Please note
> the
> > > star at the beginning, You don't have to make the name public.
> > >
> > > OK with the above changes.
> >
> > Forgot to mention:
> >
> >
> > -         (match_operand:V2SF 1 "register_operand" "v")))]
> > -  "TARGET_MMX_WITH_SSE"
> > -  "%vcvtps2pd\t{%1, %0|%0, %1}"
> > +         (match_operand:V2SF 1 "memory_operand" "m")))]
> > + "TARGET_MMX_WITH_SSE && <mask_avx512vl_condition>"
> > +  "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_oper
> > and2>, %q1}"
> >    [(set_attr "type" "ssecvt")
> >
> > The new insn does not need to be limited to TARGET_MMX_WITH_SSE, so
> we
> > can use TARGET_SSE2 here.
> >
> > Which opens the question if the expander could also be TARGET_SSE2 only.
> > There are no MMX registers involved in any of the patterns anymore.
> Yes.
> >
> > Uros.
> > >
> > > Thanks,
> > > Uros,

[-- Attachment #2: 0001-i386-Extend-cvtps2pd-to-memory.patch --]
[-- Type: application/octet-stream, Size: 2996 bytes --]

From 842ab824d565d5ab823bb6da55ad7adb36f87bf3 Mon Sep 17 00:00:00 2001
From: Haochen Jiang <haochen.jiang@intel.com>
Date: Mon, 30 May 2022 17:12:31 +0800
Subject: [PATCH] i386: Extend cvtps2pd to memory

gcc/ChangeLog:

	PR target/43618
	* config/i386/sse.md (extendv2sfv2df2): New define_expand.
	(sse2_cvtps2pd_<mask_name>): Change constraint of operands[1].
	(*sse2_cvtps2pd_<mask_name>_1): Rename from extendvsdfv2df2.

gcc/testsuite/ChangeLog:

	PR target/43618
	* gcc.target/i386/pr43618-1.c: New test.
---
 gcc/config/i386/sse.md                    | 26 ++++++++++++++++++-----
 gcc/testsuite/gcc.target/i386/pr43618-1.c | 12 +++++++++++
 2 files changed, 33 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr43618-1.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 8b2602bfa79..742bb6c222c 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -9175,11 +9175,27 @@
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
+(define_expand "extendv2sfv2df2"
+  [(set (match_operand:V2DF 0 "register_operand")
+	(float_extend:V2DF
+	  (match_operand:V2SF 1 "nonimmediate_operand")))]
+  "TARGET_MMX_WITH_SSE"
+{
+  if (!MEM_P (operands[1]))
+    {
+      operands[1] = lowpart_subreg (V4SFmode,
+				    force_reg (V2SFmode, operands[1]),
+				    V2SFmode);
+      emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1]));
+      DONE;
+    }
+})
+
 (define_insn "sse2_cvtps2pd<mask_name>"
   [(set (match_operand:V2DF 0 "register_operand" "=v")
 	(float_extend:V2DF
 	  (vec_select:V2SF
-	    (match_operand:V4SF 1 "vector_operand" "vm")
+	    (match_operand:V4SF 1 "register_operand" "v")
 	    (parallel [(const_int 0) (const_int 1)]))))]
   "TARGET_SSE2 && <mask_avx512vl_condition>"
   "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}"
@@ -9191,12 +9207,12 @@
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "V2DF")])
 
-(define_insn "extendv2sfv2df2"
+(define_insn "*sse2_cvtps2pd<mask_name>_1"
   [(set (match_operand:V2DF 0 "register_operand" "=v")
 	(float_extend:V2DF
-	  (match_operand:V2SF 1 "register_operand" "v")))]
-  "TARGET_MMX_WITH_SSE"
-  "%vcvtps2pd\t{%1, %0|%0, %1}"
+	  (match_operand:V2SF 1 "memory_operand" "m")))]
+  "TARGET_SSE2 && <mask_avx512vl_condition>"
+  "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}"
   [(set_attr "type" "ssecvt")
    (set_attr "amdfam10_decode" "direct")
    (set_attr "athlon_decode" "double")
diff --git a/gcc/testsuite/gcc.target/i386/pr43618-1.c b/gcc/testsuite/gcc.target/i386/pr43618-1.c
new file mode 100644
index 00000000000..7d16f0e0cf9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr43618-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "movq" } } */
+/* { dg-final { scan-assembler "cvtps2pd" } } */
+
+void
+foo (float a[2], double b[2])
+{
+    int i;
+    for (i = 0; i < 2; i++)
+      b[i] = a[i];
+}
-- 
2.18.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] i386: Extend cvtps2pd to memory
  2022-07-04  5:09             ` Jiang, Haochen
@ 2022-07-04  6:17               ` Uros Bizjak
  0 siblings, 0 replies; 9+ messages in thread
From: Uros Bizjak @ 2022-07-04  6:17 UTC (permalink / raw)
  To: Jiang, Haochen; +Cc: Liu, Hongtao, gcc-patches

On Mon, Jul 4, 2022 at 7:10 AM Jiang, Haochen <haochen.jiang@intel.com> wrote:
>
> Hi all,
>
> I revised my patch according to all your reviews.
>
> Regtested on x86_64-pc-linux-gnu.

OK.

Thanks,
Uros.

>
> BRs,
> Haochen
>
> > -----Original Message-----
> > From: Liu, Hongtao <hongtao.liu@intel.com>
> > Sent: Thursday, June 30, 2022 4:57 PM
> > To: Uros Bizjak <ubizjak@gmail.com>; Jiang, Haochen
> > <haochen.jiang@intel.com>
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: RE: [PATCH] i386: Extend cvtps2pd to memory
> >
> >
> >
> > > -----Original Message-----
> > > From: Uros Bizjak <ubizjak@gmail.com>
> > > Sent: Thursday, June 30, 2022 4:53 PM
> > > To: Jiang, Haochen <haochen.jiang@intel.com>
> > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com>
> > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> > >
> > > On Thu, Jun 30, 2022 at 10:45 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> > > >
> > > > On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> > > > >
> > > > > On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen
> > <haochen.jiang@intel.com>
> > > wrote:
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Uros Bizjak <ubizjak@gmail.com>
> > > > > > > Sent: Thursday, June 30, 2022 2:20 PM
> > > > > > > To: Jiang, Haochen <haochen.jiang@intel.com>
> > > > > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao
> > > > > > > <hongtao.liu@intel.com>
> > > > > > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> > > > > > >
> > > > > > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang
> > > > > > > <haochen.jiang@intel.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > This patch aims to fix the cvtps2pd insn, which should also
> > > > > > > > work on memory operand but currently does not. After this fix,
> > > > > > > > when loop == 2, it will eliminate movq instruction.
> > > > > > > >
> > > > > > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk?
> > > > > > > >
> > > > > > > > BRs,
> > > > > > > > Haochen
> > > > > > > >
> > > > > > > > gcc/ChangeLog:
> > > > > > > >
> > > > > > > >         PR target/43618
> > > > > > > >         * config/i386/sse.md (extendv2sfv2df2): New define_expand.
> > > > > > > >         (sse2_cvtps2pd_load<mask_name>): Rename
> > extendvsdfv2df2.
> > > >
> > > > Rename FROM ...
> > > >
> > > > Please also mention change to sse2_cvtps2pd<mask_name>.
> > > >
> > > > > > > >
> > > > > > > > gcc/testsuite/ChangeLog:
> > > > > > > >
> > > > > > > >         PR target/43618
> > > > > > > >         * gcc.target/i386/pr43618-1.c: New test.
> > > > > > >
> > > > > > > This patch could be as simple as:
> > > > > > >
> > > > > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > > > > > > index 8cd0f617bf3..c331445cb2d 100644
> > > > > > > --- a/gcc/config/i386/sse.md
> > > > > > > +++ b/gcc/config/i386/sse.md
> > > > > > > @@ -9195,7 +9195,7 @@
> > > > > > > (define_insn "extendv2sfv2df2"
> > > > > > >   [(set (match_operand:V2DF 0 "register_operand" "=v")
> > > > > > >        (float_extend:V2DF
> > > > > > > -         (match_operand:V2SF 1 "register_operand" "v")))]
> > > > > > > +         (match_operand:V2SF 1 "nonimmediate_operand" "vm")))]
> > > > > > >   "TARGET_MMX_WITH_SSE"
> > > > > > >   "%vcvtps2pd\t{%1, %0|%0, %1}"
> > > > > > >   [(set_attr "type" "ssecvt")
> > > > > >
> > > > > > We also tested on this version, it is ok.
> > > > > >
> > > > > > The reason why the patch looks like this is because in the
> > > > > > previous insn sse2_cvtps2pd<mask_name>, the constraint vm and
> > > > > > vector_operand actually does not match the actual instruction.
> > > > > > Memory operand is V2SF, not V4SF.
> > > > > >
> > > > > > Therefore, we changed the constraint in that insn. Then it caused
> > another
> > > issue.
> > > > > > For memory operand, it seems that we cannot generate those mask
> > > instructions.
> > > > > > So I change the pattern to how extendv2hfv2df2 works.
> > > > >
> > > > > If you want to change the memory access in
> > sse2_cvtps2pd<mask_name>,
> > > > > then please see how e.g. <insn>v2hiv2di is handled in sse.md. In
> > > > > addition to two instructions, you will need one
> > > > > define_insn_and_split with a pre-reload splitter.
> > > >
> > > > Oh, nowadays combine does vec_select from a paradoxical subreg on its
> > own.
> > > >
> > > > +(define_expand "extendv2sfv2df2"
> > > > +  [(set (match_operand:V2DF 0 "register_operand")
> > > > +        (float_extend:V2DF
> > > > +          (match_operand:V2SF 1 "nonimmediate_operand")))]
> > > > +  "TARGET_MMX_WITH_SSE"
> > > > +{
> > > > +  if (!MEM_P (operands[1]))
> > > > +    {
> > > >
> > > > You will need force reg here:
> > > >
> > > >         rtx op1 = force_reg (V2SFmode, operands[1]);
> > > > +      operands[1] = lowpart_subreg (V4SFmode, op1, V2SFmode);
> > > > +      emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1]));
> > > > +      DONE;
> > > > +    }
> > > > +})
> > > >
> > > >
> > > > -(define_insn "extendv2sfv2df2"
> > > > +(define_insn "sse2_cvtps2pd_load<mask_name>"
> > > >
> > > > Please name this insn "*sse2_cvtps2pd<mask_name>_1". Please note
> > the
> > > > star at the beginning, You don't have to make the name public.
> > > >
> > > > OK with the above changes.
> > >
> > > Forgot to mention:
> > >
> > >
> > > -         (match_operand:V2SF 1 "register_operand" "v")))]
> > > -  "TARGET_MMX_WITH_SSE"
> > > -  "%vcvtps2pd\t{%1, %0|%0, %1}"
> > > +         (match_operand:V2SF 1 "memory_operand" "m")))]
> > > + "TARGET_MMX_WITH_SSE && <mask_avx512vl_condition>"
> > > +  "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_oper
> > > and2>, %q1}"
> > >    [(set_attr "type" "ssecvt")
> > >
> > > The new insn does not need to be limited to TARGET_MMX_WITH_SSE, so
> > we
> > > can use TARGET_SSE2 here.
> > >
> > > Which opens the question if the expander could also be TARGET_SSE2 only.
> > > There are no MMX registers involved in any of the patterns anymore.
> > Yes.
> > >
> > > Uros.
> > > >
> > > > Thanks,
> > > > Uros,

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-07-04  6:17 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-30  5:59 [PATCH] i386: Extend cvtps2pd to memory Haochen Jiang
2022-06-30  6:20 ` Uros Bizjak
2022-06-30  7:24   ` Jiang, Haochen
2022-06-30  7:41     ` Uros Bizjak
2022-06-30  8:45       ` Uros Bizjak
2022-06-30  8:53         ` Uros Bizjak
2022-06-30  8:57           ` Liu, Hongtao
2022-07-04  5:09             ` Jiang, Haochen
2022-07-04  6:17               ` Uros Bizjak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).