* [PATCH 6/6] i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV
2020-02-29 14:16 V2 [PATCH 0/6] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move H.J. Lu
2020-02-29 14:16 ` [PATCH 2/6] i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV H.J. Lu
@ 2020-02-29 14:16 ` H.J. Lu
2020-03-12 3:53 ` Jeff Law
2020-02-29 14:16 ` [PATCH 1/6] i386: Properly encode vector registers in vector move H.J. Lu
` (3 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2020-02-29 14:16 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek, Jeffrey Law, Jan Hubicka, Uros Bizjak
There is no need to set mode attribute to XImode since ix86_output_ssemov
can properly encode xmm16-xmm31 registers with and without AVX512VL.
Remove ext_sse_reg_operand since it is no longer needed.
PR target/89229
* config/i386/i386.c (ix86_output_ssemov): Handle MODE_V1DF and
MODE_V2SF.
* config/i386/mmx.md (MMXMODE:*mov<mode>_internal): Call
ix86_output_ssemov for TYPE_SSEMOV. Remove ext_sse_reg_operand
check.
* config/i386/predicates.md (ext_sse_reg_operand): Removed.
---
gcc/config/i386/i386.c | 10 ++++++++++
gcc/config/i386/mmx.md | 29 ++---------------------------
gcc/config/i386/predicates.md | 5 -----
3 files changed, 12 insertions(+), 32 deletions(-)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1d3b784532b..f34a708cdc3 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5142,6 +5142,16 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands)
else
return "%vmovss\t{%1, %0|%0, %1}";
+ case MODE_V1DF:
+ gcc_assert (!TARGET_AVX);
+ return "movlpd\t{%1, %0|%0, %1}";
+
+ case MODE_V2SF:
+ if (TARGET_AVX && REG_P (operands[0]))
+ return "vmovlps\t{%1, %d0|%d0, %1}";
+ else
+ return "%vmovlps\t{%1, %0|%0, %1}";
+
default:
gcc_unreachable ();
}
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index e1c8b0af4c7..c3f195bb34a 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -118,29 +118,7 @@ (define_insn "*mov<mode>_internal"
return standard_sse_constant_opcode (insn, operands);
case TYPE_SSEMOV:
- switch (get_attr_mode (insn))
- {
- case MODE_DI:
- /* Handle broken assemblers that require movd instead of movq. */
- if (!HAVE_AS_IX86_INTERUNIT_MOVQ
- && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
- return "%vmovd\t{%1, %0|%0, %1}";
- return "%vmovq\t{%1, %0|%0, %1}";
- case MODE_TI:
- return "%vmovdqa\t{%1, %0|%0, %1}";
- case MODE_XI:
- return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
-
- case MODE_V2SF:
- if (TARGET_AVX && REG_P (operands[0]))
- return "vmovlps\t{%1, %0, %0|%0, %0, %1}";
- return "%vmovlps\t{%1, %0|%0, %1}";
- case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
-
- default:
- gcc_unreachable ();
- }
+ return ix86_output_ssemov (insn, operands);
default:
gcc_unreachable ();
@@ -189,10 +167,7 @@ (define_insn "*mov<mode>_internal"
(cond [(eq_attr "alternative" "2")
(const_string "SI")
(eq_attr "alternative" "11,12")
- (cond [(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
- (const_string "XI")
- (match_test "<MODE>mode == V2SFmode")
+ (cond [(match_test "<MODE>mode == V2SFmode")
(const_string "V4SF")
(ior (not (match_test "TARGET_SSE2"))
(match_test "optimize_function_for_size_p (cfun)"))
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 1119366d54e..71f4cb1193c 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -61,11 +61,6 @@ (define_predicate "sse_reg_operand"
(and (match_code "reg")
(match_test "SSE_REGNO_P (REGNO (op))")))
-;; True if the operand is an AVX-512 new register.
-(define_predicate "ext_sse_reg_operand"
- (and (match_code "reg")
- (match_test "EXT_REX_SSE_REGNO_P (REGNO (op))")))
-
;; Return true if op is a QImode register.
(define_predicate "any_QIreg_operand"
(and (match_code "reg")
--
2.24.1
^ permalink raw reply [flat|nested] 16+ messages in thread
* V2 [PATCH 0/6] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move
@ 2020-02-29 14:16 H.J. Lu
2020-02-29 14:16 ` [PATCH 2/6] i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV H.J. Lu
` (5 more replies)
0 siblings, 6 replies; 16+ messages in thread
From: H.J. Lu @ 2020-02-29 14:16 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek, Jeffrey Law, Jan Hubicka, Uros Bizjak
This patch set was originally submitted in Feb 2019:
https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01841.html
I broke it into 6 smaller patches for easy review.
On x86, when AVX and AVX512 are enabled, vector move instructions can
be encoded with either 2-byte/3-byte VEX (AVX) or 4-byte EVEX (AVX512):
0: c5 f9 6f d1 vmovdqa %xmm1,%xmm2
4: 62 f1 fd 08 6f d1 vmovdqa64 %xmm1,%xmm2
We prefer VEX encoding over EVEX since VEX is shorter. Also AVX512F
only supports 512-bit vector moves. AVX512F + AVX512VL supports 128-bit
and 256-bit vector moves. xmm16-xmm31 and ymm16-ymm31 are disallowed in
128-bit and 256-bit modes when AVX512VL is disabled. Mode attributes on
x86 vector move patterns indicate target preferences of vector move
encoding. For scalar register to register move, we can use 512-bit
vector move instructions to move 32-bit/64-bit scalar if AVX512VL isn't
available. With AVX512F and AVX512VL, we should use VEX encoding for
128-bit/256-bit vector moves if upper 16 vector registers aren't used.
This patch adds a function, ix86_output_ssemov, to generate vector moves:
1. If zmm registers are used, use EVEX encoding.
2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX encoding
will be generated.
3. If xmm16-xmm31/ymm16-ymm31 registers are used:
a. With AVX512VL, AVX512VL vector moves will be generated.
b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
move will be done with zmm register move.
There is no need to set mode attribute to XImode explicitly since
ix86_output_ssemov can properly encode xmm16-xmm31/ymm16-ymm31 registers
with and without AVX512VL.
Tested on AVX2 and AVX512 with and without --with-arch=native.
H.J. Lu (6):
i386: Properly encode vector registers in vector move
i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV
i386: Use ix86_output_ssemov for SImode TYPE_SSEMOV
i386: Use ix86_output_ssemov for DFmode TYPE_SSEMOV
i386: Use ix86_output_ssemov for SFmode TYPE_SSEMOV
i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV
gcc/config/i386/i386-protos.h | 2 +
gcc/config/i386/i386.c | 242 ++++++++++++++++++
gcc/config/i386/i386.md | 212 +--------------
gcc/config/i386/mmx.md | 29 +--
gcc/config/i386/predicates.md | 5 -
gcc/config/i386/sse.md | 98 +------
.../gcc.target/i386/avx512vl-vmovdqa64-1.c | 7 +-
gcc/testsuite/gcc.target/i386/pr89229-2a.c | 15 ++
gcc/testsuite/gcc.target/i386/pr89229-2b.c | 13 +
gcc/testsuite/gcc.target/i386/pr89229-2c.c | 6 +
gcc/testsuite/gcc.target/i386/pr89229-3a.c | 16 ++
gcc/testsuite/gcc.target/i386/pr89229-3b.c | 12 +
gcc/testsuite/gcc.target/i386/pr89229-3c.c | 6 +
gcc/testsuite/gcc.target/i386/pr89229-4a.c | 17 ++
gcc/testsuite/gcc.target/i386/pr89229-4b.c | 6 +
gcc/testsuite/gcc.target/i386/pr89229-4c.c | 7 +
gcc/testsuite/gcc.target/i386/pr89229-5a.c | 17 ++
gcc/testsuite/gcc.target/i386/pr89229-5b.c | 6 +
gcc/testsuite/gcc.target/i386/pr89229-5c.c | 7 +
gcc/testsuite/gcc.target/i386/pr89229-6a.c | 16 ++
gcc/testsuite/gcc.target/i386/pr89229-6b.c | 7 +
gcc/testsuite/gcc.target/i386/pr89229-6c.c | 6 +
gcc/testsuite/gcc.target/i386/pr89229-7a.c | 16 ++
gcc/testsuite/gcc.target/i386/pr89229-7b.c | 6 +
gcc/testsuite/gcc.target/i386/pr89229-7c.c | 6 +
gcc/testsuite/gcc.target/i386/pr89346.c | 15 ++
26 files changed, 465 insertions(+), 330 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89346.c
--
2.24.1
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 4/6] i386: Use ix86_output_ssemov for DFmode TYPE_SSEMOV
2020-02-29 14:16 V2 [PATCH 0/6] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move H.J. Lu
` (2 preceding siblings ...)
2020-02-29 14:16 ` [PATCH 1/6] i386: Properly encode vector registers in vector move H.J. Lu
@ 2020-02-29 14:16 ` H.J. Lu
2020-03-12 3:41 ` Jeff Law
2020-02-29 14:16 ` [PATCH 5/6] i386: Use ix86_output_ssemov for SFmode TYPE_SSEMOV H.J. Lu
2020-02-29 15:30 ` [PATCH 3/6] i386: Use ix86_output_ssemov for SImode TYPE_SSEMOV H.J. Lu
5 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2020-02-29 14:16 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek, Jeffrey Law, Jan Hubicka, Uros Bizjak
There is no need to set mode attribute to XImode nor V8DFmode since
ix86_output_ssemov can properly encode xmm16-xmm31 registers with and
without AVX512VL.
gcc/
PR target/89229
* config/i386/i386.c (ix86_output_ssemov): Handle MODE_DF.
* config/i386/i386.md (*movdf_internal): Call ix86_output_ssemov
for TYPE_SSEMOV. Remove TARGET_AVX512F, TARGET_PREFER_AVX256,
TARGET_AVX512VL and ext_sse_reg_operand check.
gcc/testsuite/
PR target/89229
* gcc.target/i386/pr89229-6a.c: New test.
* gcc.target/i386/pr89229-6b.c: Likewise.
* gcc.target/i386/pr89229-6c.c: Likewise.
---
gcc/config/i386/i386.c | 6 +++
gcc/config/i386/i386.md | 44 ++--------------------
gcc/testsuite/gcc.target/i386/pr89229-6a.c | 16 ++++++++
gcc/testsuite/gcc.target/i386/pr89229-6b.c | 7 ++++
gcc/testsuite/gcc.target/i386/pr89229-6c.c | 6 +++
5 files changed, 38 insertions(+), 41 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6c.c
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c28c162282a..a6fe9894ab8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5130,6 +5130,12 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands)
case MODE_SI:
return "%vmovd\t{%1, %0|%0, %1}";
+ case MODE_DF:
+ if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
+ return "vmovsd\t{%d1, %0|%0, %d1}";
+ else
+ return "%vmovsd\t{%1, %0|%0, %1}";
+
default:
gcc_unreachable ();
}
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index e9537fadfe8..060a34c4bd4 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3307,37 +3307,7 @@ (define_insn "*movdf_internal"
return standard_sse_constant_opcode (insn, operands);
case TYPE_SSEMOV:
- switch (get_attr_mode (insn))
- {
- case MODE_DF:
- if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
- return "vmovsd\t{%d1, %0|%0, %d1}";
- return "%vmovsd\t{%1, %0|%0, %1}";
-
- case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
- case MODE_V8DF:
- return "vmovapd\t{%g1, %g0|%g0, %g1}";
- case MODE_V2DF:
- return "%vmovapd\t{%1, %0|%0, %1}";
-
- case MODE_V2SF:
- gcc_assert (!TARGET_AVX);
- return "movlps\t{%1, %0|%0, %1}";
- case MODE_V1DF:
- gcc_assert (!TARGET_AVX);
- return "movlpd\t{%1, %0|%0, %1}";
-
- case MODE_DI:
- /* Handle broken assemblers that require movd instead of movq. */
- if (!HAVE_AS_IX86_INTERUNIT_MOVQ
- && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
- return "%vmovd\t{%1, %0|%0, %1}";
- return "%vmovq\t{%1, %0|%0, %1}";
-
- default:
- gcc_unreachable ();
- }
+ return ix86_output_ssemov (insn, operands);
default:
gcc_unreachable ();
@@ -3391,10 +3361,7 @@ (define_insn "*movdf_internal"
/* xorps is one byte shorter for non-AVX targets. */
(eq_attr "alternative" "12,16")
- (cond [(and (match_test "TARGET_AVX512F")
- (not (match_test "TARGET_PREFER_AVX256")))
- (const_string "XI")
- (match_test "TARGET_AVX")
+ (cond [(match_test "TARGET_AVX")
(const_string "V2DF")
(ior (not (match_test "TARGET_SSE2"))
(match_test "optimize_function_for_size_p (cfun)"))
@@ -3410,12 +3377,7 @@ (define_insn "*movdf_internal"
/* movaps is one byte shorter for non-AVX targets. */
(eq_attr "alternative" "13,17")
- (cond [(and (ior (not (match_test "TARGET_PREFER_AVX256"))
- (not (match_test "TARGET_AVX512VL")))
- (ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand")))
- (const_string "V8DF")
- (match_test "TARGET_AVX")
+ (cond [(match_test "TARGET_AVX")
(const_string "DF")
(ior (not (match_test "TARGET_SSE2"))
(match_test "optimize_function_for_size_p (cfun)"))
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-6a.c b/gcc/testsuite/gcc.target/i386/pr89229-6a.c
new file mode 100644
index 00000000000..5bc10d25619
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-6a.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+extern double d;
+
+void
+foo1 (double x)
+{
+ register double xmm16 __asm ("xmm16") = x;
+ asm volatile ("" : "+v" (xmm16));
+ register double xmm17 __asm ("xmm17") = xmm16;
+ asm volatile ("" : "+v" (xmm17));
+ d = xmm17;
+}
+
+/* { dg-final { scan-assembler-not "vmovapd" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-6b.c b/gcc/testsuite/gcc.target/i386/pr89229-6b.c
new file mode 100644
index 00000000000..b248a3726f4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-6b.c
@@ -0,0 +1,7 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
+
+#include "pr89229-6a.c"
+
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
+/* { dg-final { scan-assembler-not "vmovapd" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-6c.c b/gcc/testsuite/gcc.target/i386/pr89229-6c.c
new file mode 100644
index 00000000000..7a4d254670c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-6c.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+#include "pr89229-6a.c"
+
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
--
2.24.1
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 2/6] i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV
2020-02-29 14:16 V2 [PATCH 0/6] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move H.J. Lu
@ 2020-02-29 14:16 ` H.J. Lu
2020-03-12 3:32 ` Jeff Law
2020-02-29 14:16 ` [PATCH 6/6] i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV H.J. Lu
` (4 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2020-02-29 14:16 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek, Jeffrey Law, Jan Hubicka, Uros Bizjak
There is no need to set mode attribute to XImode since ix86_output_ssemov
can properly encode xmm16-xmm31 registers with and without AVX512VL.
gcc/
PR target/89229
* config/i386/i386.c (ix86_output_ssemov): Handle MODE_DI.
* config/i386/i386.md (*movdi_internal): Call ix86_output_ssemov
for TYPE_SSEMOV. Remove ext_sse_reg_operand and TARGET_AVX512VL
check.
gcc/testsuite/
PR target/89229
* gcc.target/i386/pr89229-4a.c: New test.
* gcc.target/i386/pr89229-4b.c: Likewise.
* gcc.target/i386/pr89229-4c.c: Likewise.
---
gcc/config/i386/i386.c | 9 +++++++
gcc/config/i386/i386.md | 31 ++--------------------
gcc/testsuite/gcc.target/i386/pr89229-4a.c | 17 ++++++++++++
gcc/testsuite/gcc.target/i386/pr89229-4b.c | 6 +++++
gcc/testsuite/gcc.target/i386/pr89229-4c.c | 7 +++++
5 files changed, 41 insertions(+), 29 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4c.c
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7bbfbb4c5a7..baf70a64193 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5118,6 +5118,15 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands)
case MODE_V4SF:
return ix86_get_ssemov (operands, 16, insn_mode, mode);
+ case MODE_DI:
+ /* Handle broken assemblers that require movd instead of movq. */
+ if (!HAVE_AS_IX86_INTERUNIT_MOVQ
+ && (GENERAL_REG_P (operands[0])
+ || GENERAL_REG_P (operands[1])))
+ return "%vmovd\t{%1, %0|%0, %1}";
+ else
+ return "%vmovq\t{%1, %0|%0, %1}";
+
default:
gcc_unreachable ();
}
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index cea831b6086..d8462b3de37 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2054,31 +2054,7 @@ (define_insn "*movdi_internal"
return standard_sse_constant_opcode (insn, operands);
case TYPE_SSEMOV:
- switch (get_attr_mode (insn))
- {
- case MODE_DI:
- /* Handle broken assemblers that require movd instead of movq. */
- if (!HAVE_AS_IX86_INTERUNIT_MOVQ
- && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
- return "%vmovd\t{%1, %0|%0, %1}";
- return "%vmovq\t{%1, %0|%0, %1}";
-
- case MODE_TI:
- /* Handle AVX512 registers set. */
- if (EXT_REX_SSE_REG_P (operands[0])
- || EXT_REX_SSE_REG_P (operands[1]))
- return "vmovdqa64\t{%1, %0|%0, %1}";
- return "%vmovdqa\t{%1, %0|%0, %1}";
-
- case MODE_V2SF:
- gcc_assert (!TARGET_AVX);
- return "movlps\t{%1, %0|%0, %1}";
- case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
-
- default:
- gcc_unreachable ();
- }
+ return ix86_output_ssemov (insn, operands);
case TYPE_SSECVT:
if (SSE_REG_P (operands[0]))
@@ -2164,10 +2140,7 @@ (define_insn "*movdi_internal"
(cond [(eq_attr "alternative" "2")
(const_string "SI")
(eq_attr "alternative" "12,13")
- (cond [(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
- (const_string "TI")
- (match_test "TARGET_AVX")
+ (cond [(match_test "TARGET_AVX")
(const_string "TI")
(ior (not (match_test "TARGET_SSE2"))
(match_test "optimize_function_for_size_p (cfun)"))
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4a.c b/gcc/testsuite/gcc.target/i386/pr89229-4a.c
new file mode 100644
index 00000000000..cb9b071e873
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-4a.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+extern long long i;
+
+long long
+foo1 (void)
+{
+ register long long xmm16 __asm ("xmm16") = i;
+ asm volatile ("" : "+v" (xmm16));
+ register long long xmm17 __asm ("xmm17") = xmm16;
+ asm volatile ("" : "+v" (xmm17));
+ return xmm17;
+}
+
+/* { dg-final { scan-assembler-times "vmovdqa64\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4b.c b/gcc/testsuite/gcc.target/i386/pr89229-4b.c
new file mode 100644
index 00000000000..023e81253a0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-4b.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
+
+#include "pr89229-4a.c"
+
+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4c.c b/gcc/testsuite/gcc.target/i386/pr89229-4c.c
new file mode 100644
index 00000000000..e02eb37c16d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-4c.c
@@ -0,0 +1,7 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+#include "pr89229-4a.c"
+
+/* { dg-final { scan-assembler-times "vmovdqa64\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
--
2.24.1
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/6] i386: Properly encode vector registers in vector move
2020-02-29 14:16 V2 [PATCH 0/6] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move H.J. Lu
2020-02-29 14:16 ` [PATCH 2/6] i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV H.J. Lu
2020-02-29 14:16 ` [PATCH 6/6] i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV H.J. Lu
@ 2020-02-29 14:16 ` H.J. Lu
2020-03-05 23:47 ` Jeff Law
2020-02-29 14:16 ` [PATCH 4/6] i386: Use ix86_output_ssemov for DFmode TYPE_SSEMOV H.J. Lu
` (2 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2020-02-29 14:16 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek, Jeffrey Law, Jan Hubicka, Uros Bizjak
On x86, when AVX and AVX512 are enabled, vector move instructions can
be encoded with either 2-byte/3-byte VEX (AVX) or 4-byte EVEX (AVX512):
0: c5 f9 6f d1 vmovdqa %xmm1,%xmm2
4: 62 f1 fd 08 6f d1 vmovdqa64 %xmm1,%xmm2
We prefer VEX encoding over EVEX since VEX is shorter. Also AVX512F
only supports 512-bit vector moves. AVX512F + AVX512VL supports 128-bit
and 256-bit vector moves. xmm16-xmm31 and ymm16-ymm31 are disallowed in
128-bit and 256-bit modes when AVX512VL is disabled. Mode attributes on
x86 vector move patterns indicate target preferences of vector move
encoding. For scalar register to register move, we can use 512-bit
vector move instructions to move 32-bit/64-bit scalar if AVX512VL isn't
available. With AVX512F and AVX512VL, we should use VEX encoding for
128-bit/256-bit vector moves if upper 16 vector registers aren't used.
This patch adds a function, ix86_output_ssemov, to generate vector moves:
1. If zmm registers are used, use EVEX encoding.
2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX encoding
will be generated.
3. If xmm16-xmm31/ymm16-ymm31 registers are used:
a. With AVX512VL, AVX512VL vector moves will be generated.
b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
move will be done with zmm register move.
There is no need to set mode attribute to XImode explicitly since
ix86_output_ssemov can properly encode xmm16-xmm31/ymm16-ymm31 registers
with and without AVX512VL.
Tested on AVX2 and AVX512 with and without --with-arch=native.
gcc/
PR target/89229
PR target/89346
* config/i386/i386-protos.h (ix86_output_ssemov): New prototype.
* config/i386/i386.c (ix86_get_ssemov): New function.
(ix86_output_ssemov): Likewise.
* config/i386/sse.md (VMOVE:mov<mode>_internal): Call
ix86_output_ssemov for TYPE_SSEMOV. Remove TARGET_AVX512VL
check.
(*movxi_internal_avx512f): Call ix86_output_ssemov for TYPE_SSEMOV.
(*movoi_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
Remove ext_sse_reg_operand and TARGET_AVX512VL check.
(*movti_internal): Likewise.
(*movtf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
gcc/testsuite/
PR target/89229
PR target/89346
* gcc.target/i386/avx512vl-vmovdqa64-1.c: Updated.
* gcc.target/i386/pr89346.c: New test.
gcc/testsuite/
PR target/89229
* gcc.target/i386/pr89229-2a.c: New test.
* gcc.target/i386/pr89229-2b.c: Likewise.
* gcc.target/i386/pr89229-2c.c: Likewise.
* gcc.target/i386/pr89229-3a.c: Likewise.
* gcc.target/i386/pr89229-3b.c: Likewise.
* gcc.target/i386/pr89229-3c.c: Likewise.
---
gcc/config/i386/i386-protos.h | 2 +
gcc/config/i386/i386.c | 208 ++++++++++++++++++
gcc/config/i386/i386.md | 86 +-------
gcc/config/i386/sse.md | 98 +--------
.../gcc.target/i386/avx512vl-vmovdqa64-1.c | 7 +-
gcc/testsuite/gcc.target/i386/pr89229-2a.c | 15 ++
gcc/testsuite/gcc.target/i386/pr89229-2b.c | 13 ++
gcc/testsuite/gcc.target/i386/pr89229-2c.c | 6 +
gcc/testsuite/gcc.target/i386/pr89229-3a.c | 16 ++
gcc/testsuite/gcc.target/i386/pr89229-3b.c | 12 +
gcc/testsuite/gcc.target/i386/pr89229-3c.c | 6 +
gcc/testsuite/gcc.target/i386/pr89346.c | 15 ++
12 files changed, 303 insertions(+), 181 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3c.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89346.c
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 266381ca5a6..39fcaa0ad5f 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -38,6 +38,8 @@ extern void ix86_expand_split_stack_prologue (void);
extern void ix86_output_addr_vec_elt (FILE *, int);
extern void ix86_output_addr_diff_elt (FILE *, int, int);
+extern const char *ix86_output_ssemov (rtx_insn *, rtx *);
+
extern enum calling_abi ix86_cfun_abi (void);
extern enum calling_abi ix86_function_type_abi (const_tree);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index dac7a3fc5fd..7bbfbb4c5a7 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4915,6 +4915,214 @@ ix86_pre_reload_split (void)
&& !(cfun->curr_properties & PROP_rtl_split_insns));
}
+/* Return the opcode of the TYPE_SSEMOV instruction. To move from
+ or to xmm16-xmm31/ymm16-ymm31 registers, we either require
+ TARGET_AVX512VL or it is a register to register move which can
+ be done with zmm register move. */
+
+static const char *
+ix86_get_ssemov (rtx *operands, unsigned size,
+ enum attr_mode insn_mode, machine_mode mode)
+{
+ char buf[128];
+ bool misaligned_p = (misaligned_operand (operands[0], mode)
+ || misaligned_operand (operands[1], mode));
+ bool evex_reg_p = (size == 64
+ || EXT_REX_SSE_REG_P (operands[0])
+ || EXT_REX_SSE_REG_P (operands[1]));
+ machine_mode scalar_mode;
+
+ const char *opcode = NULL;
+ enum
+ {
+ opcode_int,
+ opcode_float,
+ opcode_double
+ } type = opcode_int;
+
+ switch (insn_mode)
+ {
+ case MODE_V16SF:
+ case MODE_V8SF:
+ case MODE_V4SF:
+ scalar_mode = E_SFmode;
+ type = opcode_float;
+ break;
+ case MODE_V8DF:
+ case MODE_V4DF:
+ case MODE_V2DF:
+ scalar_mode = E_DFmode;
+ type = opcode_double;
+ break;
+ case MODE_XI:
+ case MODE_OI:
+ case MODE_TI:
+ scalar_mode = GET_MODE_INNER (mode);
+ break;
+ default:
+ gcc_unreachable ();
+ }
+
+ /* NB: To move xmm16-xmm31/ymm16-ymm31 registers without AVX512VL,
+ we can only use zmm register move without memory operand. */
+ if (evex_reg_p
+ && !TARGET_AVX512VL
+ && GET_MODE_SIZE (mode) < 64)
+ {
+ /* NB: Since ix86_hard_regno_mode_ok only allows xmm16-xmm31 or
+ ymm16-ymm31 in 128/256 bit modes when AVX512VL is enabled,
+ we get here only for xmm16-xmm31 or ymm16-ymm31 in 32/64 bit
+ modes. */
+ if (GET_MODE_SIZE (mode) >= 16
+ || memory_operand (operands[0], mode)
+ || memory_operand (operands[1], mode))
+ gcc_unreachable ();
+ size = 64;
+ switch (type)
+ {
+ case opcode_int:
+ opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32";
+ break;
+ case opcode_float:
+ opcode = misaligned_p ? "vmovups" : "vmovaps";
+ break;
+ case opcode_double:
+ opcode = misaligned_p ? "vmovupd" : "vmovapd";
+ break;
+ }
+ }
+ else if (SCALAR_FLOAT_MODE_P (scalar_mode))
+ {
+ switch (scalar_mode)
+ {
+ case E_SFmode:
+ opcode = misaligned_p ? "%vmovups" : "%vmovaps";
+ break;
+ case E_DFmode:
+ opcode = misaligned_p ? "%vmovupd" : "%vmovapd";
+ break;
+ case E_TFmode:
+ if (evex_reg_p)
+ opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
+ else
+ opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
+ break;
+ default:
+ gcc_unreachable ();
+ }
+ }
+ else if (SCALAR_INT_MODE_P (scalar_mode))
+ {
+ switch (scalar_mode)
+ {
+ case E_QImode:
+ if (evex_reg_p)
+ opcode = (misaligned_p
+ ? (TARGET_AVX512BW
+ ? "vmovdqu8"
+ : "vmovdqu64")
+ : "vmovdqa64");
+ else
+ opcode = (misaligned_p
+ ? (TARGET_AVX512BW
+ ? "vmovdqu8"
+ : "%vmovdqu")
+ : "%vmovdqa");
+ break;
+ case E_HImode:
+ if (evex_reg_p)
+ opcode = (misaligned_p
+ ? (TARGET_AVX512BW
+ ? "vmovdqu16"
+ : "vmovdqu64")
+ : "vmovdqa64");
+ else
+ opcode = (misaligned_p
+ ? (TARGET_AVX512BW
+ ? "vmovdqu16"
+ : "%vmovdqu")
+ : "%vmovdqa");
+ break;
+ case E_SImode:
+ if (evex_reg_p)
+ opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32";
+ else
+ opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
+ break;
+ case E_DImode:
+ case E_TImode:
+ case E_OImode:
+ if (evex_reg_p)
+ opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
+ else
+ opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
+ break;
+ case E_XImode:
+ opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
+ break;
+ default:
+ gcc_unreachable ();
+ }
+ }
+ else
+ gcc_unreachable ();
+
+ switch (size)
+ {
+ case 64:
+ snprintf (buf, sizeof (buf), "%s\t{%%g1, %%g0|%%g0, %%g1}",
+ opcode);
+ break;
+ case 32:
+ snprintf (buf, sizeof (buf), "%s\t{%%t1, %%t0|%%t0, %%t1}",
+ opcode);
+ break;
+ case 16:
+ snprintf (buf, sizeof (buf), "%s\t{%%x1, %%x0|%%x0, %%x1}",
+ opcode);
+ break;
+ default:
+ gcc_unreachable ();
+ }
+ output_asm_insn (buf, operands);
+ return "";
+}
+
+/* Return the template of the TYPE_SSEMOV instruction to move
+ operands[1] into operands[0]. */
+
+const char *
+ix86_output_ssemov (rtx_insn *insn, rtx *operands)
+{
+ machine_mode mode = GET_MODE (operands[0]);
+ if (get_attr_type (insn) != TYPE_SSEMOV
+ || mode != GET_MODE (operands[1]))
+ gcc_unreachable ();
+
+ enum attr_mode insn_mode = get_attr_mode (insn);
+
+ switch (insn_mode)
+ {
+ case MODE_XI:
+ case MODE_V8DF:
+ case MODE_V16SF:
+ return ix86_get_ssemov (operands, 64, insn_mode, mode);
+
+ case MODE_OI:
+ case MODE_V4DF:
+ case MODE_V8SF:
+ return ix86_get_ssemov (operands, 32, insn_mode, mode);
+
+ case MODE_TI:
+ case MODE_V2DF:
+ case MODE_V4SF:
+ return ix86_get_ssemov (operands, 16, insn_mode, mode);
+
+ default:
+ gcc_unreachable ();
+ }
+}
+
/* Returns true if OP contains a symbol reference */
bool
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 6c57500ae8e..cea831b6086 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1902,11 +1902,7 @@ (define_insn "*movxi_internal_avx512f"
return standard_sse_constant_opcode (insn, operands);
case TYPE_SSEMOV:
- if (misaligned_operand (operands[0], XImode)
- || misaligned_operand (operands[1], XImode))
- return "vmovdqu32\t{%1, %0|%0, %1}";
- else
- return "vmovdqa32\t{%1, %0|%0, %1}";
+ return ix86_output_ssemov (insn, operands);
default:
gcc_unreachable ();
@@ -1929,21 +1925,7 @@ (define_insn "*movoi_internal_avx"
return standard_sse_constant_opcode (insn, operands);
case TYPE_SSEMOV:
- if (misaligned_operand (operands[0], OImode)
- || misaligned_operand (operands[1], OImode))
- {
- if (get_attr_mode (insn) == MODE_XI)
- return "vmovdqu32\t{%1, %0|%0, %1}";
- else
- return "vmovdqu\t{%1, %0|%0, %1}";
- }
- else
- {
- if (get_attr_mode (insn) == MODE_XI)
- return "vmovdqa32\t{%1, %0|%0, %1}";
- else
- return "vmovdqa\t{%1, %0|%0, %1}";
- }
+ return ix86_output_ssemov (insn, operands);
default:
gcc_unreachable ();
@@ -1952,15 +1934,7 @@ (define_insn "*movoi_internal_avx"
[(set_attr "isa" "*,avx2,*,*")
(set_attr "type" "sselog1,sselog1,ssemov,ssemov")
(set_attr "prefix" "vex")
- (set (attr "mode")
- (cond [(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
- (const_string "XI")
- (and (eq_attr "alternative" "1")
- (match_test "TARGET_AVX512VL"))
- (const_string "XI")
- ]
- (const_string "OI")))])
+ (set_attr "mode" "OI")])
(define_insn "*movti_internal"
[(set (match_operand:TI 0 "nonimmediate_operand" "=!r ,o ,v,v ,v ,m,?r,?Yd")
@@ -1981,27 +1955,7 @@ (define_insn "*movti_internal"
return standard_sse_constant_opcode (insn, operands);
case TYPE_SSEMOV:
- /* TDmode values are passed as TImode on the stack. Moving them
- to stack may result in unaligned memory access. */
- if (misaligned_operand (operands[0], TImode)
- || misaligned_operand (operands[1], TImode))
- {
- if (get_attr_mode (insn) == MODE_V4SF)
- return "%vmovups\t{%1, %0|%0, %1}";
- else if (get_attr_mode (insn) == MODE_XI)
- return "vmovdqu32\t{%1, %0|%0, %1}";
- else
- return "%vmovdqu\t{%1, %0|%0, %1}";
- }
- else
- {
- if (get_attr_mode (insn) == MODE_V4SF)
- return "%vmovaps\t{%1, %0|%0, %1}";
- else if (get_attr_mode (insn) == MODE_XI)
- return "vmovdqa32\t{%1, %0|%0, %1}";
- else
- return "%vmovdqa\t{%1, %0|%0, %1}";
- }
+ return ix86_output_ssemov (insn, operands);
default:
gcc_unreachable ();
@@ -2028,12 +1982,6 @@ (define_insn "*movti_internal"
(set (attr "mode")
(cond [(eq_attr "alternative" "0,1")
(const_string "DI")
- (ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
- (const_string "XI")
- (and (eq_attr "alternative" "3")
- (match_test "TARGET_AVX512VL"))
- (const_string "XI")
(match_test "TARGET_AVX")
(const_string "TI")
(ior (not (match_test "TARGET_SSE2"))
@@ -3254,31 +3202,7 @@ (define_insn "*movtf_internal"
return standard_sse_constant_opcode (insn, operands);
case TYPE_SSEMOV:
- /* Handle misaligned load/store since we
- don't have movmisaligntf pattern. */
- if (misaligned_operand (operands[0], TFmode)
- || misaligned_operand (operands[1], TFmode))
- {
- if (get_attr_mode (insn) == MODE_V4SF)
- return "%vmovups\t{%1, %0|%0, %1}";
- else if (TARGET_AVX512VL
- && (EXT_REX_SSE_REG_P (operands[0])
- || EXT_REX_SSE_REG_P (operands[1])))
- return "vmovdqu64\t{%1, %0|%0, %1}";
- else
- return "%vmovdqu\t{%1, %0|%0, %1}";
- }
- else
- {
- if (get_attr_mode (insn) == MODE_V4SF)
- return "%vmovaps\t{%1, %0|%0, %1}";
- else if (TARGET_AVX512VL
- && (EXT_REX_SSE_REG_P (operands[0])
- || EXT_REX_SSE_REG_P (operands[1])))
- return "vmovdqa64\t{%1, %0|%0, %1}";
- else
- return "%vmovdqa\t{%1, %0|%0, %1}";
- }
+ return ix86_output_ssemov (insn, operands);
case TYPE_MULTI:
return "#";
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index ee1f138d1af..8f5902292c6 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1013,98 +1013,7 @@ (define_insn "mov<mode>_internal"
return standard_sse_constant_opcode (insn, operands);
case TYPE_SSEMOV:
- /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
- in avx512f, so we need to use workarounds, to access sse registers
- 16-31, which are evex-only. In avx512vl we don't need workarounds. */
- if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
- && (EXT_REX_SSE_REG_P (operands[0])
- || EXT_REX_SSE_REG_P (operands[1])))
- {
- if (memory_operand (operands[0], <MODE>mode))
- {
- if (<MODE_SIZE> == 32)
- return "vextract<shuffletype>64x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
- else if (<MODE_SIZE> == 16)
- return "vextract<shuffletype>32x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
- else
- gcc_unreachable ();
- }
- else if (memory_operand (operands[1], <MODE>mode))
- {
- if (<MODE_SIZE> == 32)
- return "vbroadcast<shuffletype>64x4\t{%1, %g0|%g0, %1}";
- else if (<MODE_SIZE> == 16)
- return "vbroadcast<shuffletype>32x4\t{%1, %g0|%g0, %1}";
- else
- gcc_unreachable ();
- }
- else
- /* Reg -> reg move is always aligned. Just use wider move. */
- switch (get_attr_mode (insn))
- {
- case MODE_V8SF:
- case MODE_V4SF:
- return "vmovaps\t{%g1, %g0|%g0, %g1}";
- case MODE_V4DF:
- case MODE_V2DF:
- return "vmovapd\t{%g1, %g0|%g0, %g1}";
- case MODE_OI:
- case MODE_TI:
- return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
- default:
- gcc_unreachable ();
- }
- }
-
- switch (get_attr_mode (insn))
- {
- case MODE_V16SF:
- case MODE_V8SF:
- case MODE_V4SF:
- if (misaligned_operand (operands[0], <MODE>mode)
- || misaligned_operand (operands[1], <MODE>mode))
- return "%vmovups\t{%1, %0|%0, %1}";
- else
- return "%vmovaps\t{%1, %0|%0, %1}";
-
- case MODE_V8DF:
- case MODE_V4DF:
- case MODE_V2DF:
- if (misaligned_operand (operands[0], <MODE>mode)
- || misaligned_operand (operands[1], <MODE>mode))
- return "%vmovupd\t{%1, %0|%0, %1}";
- else
- return "%vmovapd\t{%1, %0|%0, %1}";
-
- case MODE_OI:
- case MODE_TI:
- if (misaligned_operand (operands[0], <MODE>mode)
- || misaligned_operand (operands[1], <MODE>mode))
- return TARGET_AVX512VL
- && (<MODE>mode == V4SImode
- || <MODE>mode == V2DImode
- || <MODE>mode == V8SImode
- || <MODE>mode == V4DImode
- || TARGET_AVX512BW)
- ? "vmovdqu<ssescalarsize>\t{%1, %0|%0, %1}"
- : "%vmovdqu\t{%1, %0|%0, %1}";
- else
- return TARGET_AVX512VL ? "vmovdqa64\t{%1, %0|%0, %1}"
- : "%vmovdqa\t{%1, %0|%0, %1}";
- case MODE_XI:
- if (misaligned_operand (operands[0], <MODE>mode)
- || misaligned_operand (operands[1], <MODE>mode))
- return (<MODE>mode == V16SImode
- || <MODE>mode == V8DImode
- || TARGET_AVX512BW)
- ? "vmovdqu<ssescalarsize>\t{%1, %0|%0, %1}"
- : "vmovdqu64\t{%1, %0|%0, %1}";
- else
- return "vmovdqa64\t{%1, %0|%0, %1}";
-
- default:
- gcc_unreachable ();
- }
+ return ix86_output_ssemov (insn, operands);
default:
gcc_unreachable ();
@@ -1113,10 +1022,7 @@ (define_insn "mov<mode>_internal"
[(set_attr "type" "sselog1,sselog1,ssemov,ssemov")
(set_attr "prefix" "maybe_vex")
(set (attr "mode")
- (cond [(and (eq_attr "alternative" "1")
- (match_test "TARGET_AVX512VL"))
- (const_string "<sseinsnmode>")
- (match_test "TARGET_AVX")
+ (cond [(match_test "TARGET_AVX")
(const_string "<sseinsnmode>")
(ior (not (match_test "TARGET_SSE2"))
(match_test "optimize_function_for_size_p (cfun)"))
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vmovdqa64-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-vmovdqa64-1.c
index 14fe4b84544..db4d9d14875 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-vmovdqa64-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-vmovdqa64-1.c
@@ -4,14 +4,13 @@
/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\\(\[^\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
-/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\\(\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
+/* { dg-final { scan-assembler-times "vmovdqa\[ \\t\]+\\(\[^\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
+/* { dg-final { scan-assembler-times "vmovdqa\[ \\t\]+\\(\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*\\)\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*\\)\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*\\)\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*\\)\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\nxy\]*\\(.{5,6}(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
-/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\nxy\]*\\((?:\n|\[ \\t\]+#)" 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times "vmovdqa\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\nxy\]*\\(.{5,6}(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*\\)\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*\\)\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-2a.c b/gcc/testsuite/gcc.target/i386/pr89229-2a.c
new file mode 100644
index 00000000000..0cf78039481
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-2a.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+typedef __int128 __m128t __attribute__ ((__vector_size__ (16),
+ __may_alias__));
+
+__m128t
+foo1 (void)
+{
+ register __int128 xmm16 __asm ("xmm16") = (__int128) -1;
+ asm volatile ("" : "+v" (xmm16));
+ return (__m128t) xmm16;
+}
+
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-2b.c b/gcc/testsuite/gcc.target/i386/pr89229-2b.c
new file mode 100644
index 00000000000..8d5d6c41d30
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-2b.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
+
+typedef __int128 __m128t __attribute__ ((__vector_size__ (16),
+ __may_alias__));
+
+__m128t
+foo1 (void)
+{
+ register __int128 xmm16 __asm ("xmm16") = (__int128) -1; /* { dg-error "register specified for 'xmm16'" } */
+ asm volatile ("" : "+v" (xmm16));
+ return (__m128t) xmm16;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-2c.c b/gcc/testsuite/gcc.target/i386/pr89229-2c.c
new file mode 100644
index 00000000000..218da46dcd0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-2c.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+#include "pr89229-2a.c"
+
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-3a.c b/gcc/testsuite/gcc.target/i386/pr89229-3a.c
new file mode 100644
index 00000000000..fcb85c366b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-3a.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+extern __float128 d;
+
+void
+foo1 (__float128 x)
+{
+ register __float128 xmm16 __asm ("xmm16") = x;
+ asm volatile ("" : "+v" (xmm16));
+ register __float128 xmm17 __asm ("xmm17") = xmm16;
+ asm volatile ("" : "+v" (xmm17));
+ d = xmm17;
+}
+
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-3b.c b/gcc/testsuite/gcc.target/i386/pr89229-3b.c
new file mode 100644
index 00000000000..37eb83c783b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-3b.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
+
+extern __float128 d;
+
+void
+foo1 (__float128 x)
+{
+ register __float128 xmm16 __asm ("xmm16") = x; /* { dg-error "register specified for 'xmm16'" } */
+ asm volatile ("" : "+v" (xmm16));
+ d = xmm16;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-3c.c b/gcc/testsuite/gcc.target/i386/pr89229-3c.c
new file mode 100644
index 00000000000..529a520133c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-3c.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+#include "pr89229-5a.c"
+
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89346.c b/gcc/testsuite/gcc.target/i386/pr89346.c
new file mode 100644
index 00000000000..cdc9accf521
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89346.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+#include <immintrin.h>
+
+long long *p;
+volatile __m256i y;
+
+void
+foo (void)
+{
+ _mm256_store_epi64 (p, y);
+}
+
+/* { dg-final { scan-assembler-not "vmovdqa64" } } */
--
2.24.1
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 5/6] i386: Use ix86_output_ssemov for SFmode TYPE_SSEMOV
2020-02-29 14:16 V2 [PATCH 0/6] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move H.J. Lu
` (3 preceding siblings ...)
2020-02-29 14:16 ` [PATCH 4/6] i386: Use ix86_output_ssemov for DFmode TYPE_SSEMOV H.J. Lu
@ 2020-02-29 14:16 ` H.J. Lu
2020-03-12 3:46 ` Jeff Law
2020-02-29 15:30 ` [PATCH 3/6] i386: Use ix86_output_ssemov for SImode TYPE_SSEMOV H.J. Lu
5 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2020-02-29 14:16 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek, Jeffrey Law, Jan Hubicka, Uros Bizjak
There is no need to set mode attribute to V16SFmode since ix86_output_ssemov
can properly encode xmm16-xmm31 registers with and without AVX512VL.
gcc/
PR target/89229
* config/i386/i386.c (ix86_output_ssemov): Handle MODE_SF.
* config/i386/i386.md (*movdf_internal): Call ix86_output_ssemov
for TYPE_SSEMOV. Remove TARGET_PREFER_AVX256, TARGET_AVX512VL
and ext_sse_reg_operand check.
gcc/testsuite/
PR target/89229
* gcc.target/i386/pr89229-7a.c: New test.
* gcc.target/i386/pr89229-7b.c: Likewise.
* gcc.target/i386/pr89229-7c.c: Likewise.
---
gcc/config/i386/i386.c | 6 +++++
gcc/config/i386/i386.md | 26 ++--------------------
gcc/testsuite/gcc.target/i386/pr89229-7a.c | 16 +++++++++++++
gcc/testsuite/gcc.target/i386/pr89229-7b.c | 6 +++++
gcc/testsuite/gcc.target/i386/pr89229-7c.c | 6 +++++
5 files changed, 36 insertions(+), 24 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7c.c
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a6fe9894ab8..1d3b784532b 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5136,6 +5136,12 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands)
else
return "%vmovsd\t{%1, %0|%0, %1}";
+ case MODE_SF:
+ if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
+ return "vmovss\t{%d1, %0|%0, %d1}";
+ else
+ return "%vmovss\t{%1, %0|%0, %1}";
+
default:
gcc_unreachable ();
}
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 060a34c4bd4..b837c345f4e 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3469,24 +3469,7 @@ (define_insn "*movsf_internal"
return standard_sse_constant_opcode (insn, operands);
case TYPE_SSEMOV:
- switch (get_attr_mode (insn))
- {
- case MODE_SF:
- if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
- return "vmovss\t{%d1, %0|%0, %d1}";
- return "%vmovss\t{%1, %0|%0, %1}";
-
- case MODE_V16SF:
- return "vmovaps\t{%g1, %g0|%g0, %g1}";
- case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
-
- case MODE_SI:
- return "%vmovd\t{%1, %0|%0, %1}";
-
- default:
- gcc_unreachable ();
- }
+ return ix86_output_ssemov (insn, operands);
case TYPE_MMXMOV:
switch (get_attr_mode (insn))
@@ -3558,12 +3541,7 @@ (define_insn "*movsf_internal"
better to maintain the whole registers in single format
to avoid problems on using packed logical operations. */
(eq_attr "alternative" "6")
- (cond [(and (ior (not (match_test "TARGET_PREFER_AVX256"))
- (not (match_test "TARGET_AVX512VL")))
- (ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand")))
- (const_string "V16SF")
- (ior (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
+ (cond [(ior (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
(match_test "TARGET_SSE_SPLIT_REGS"))
(const_string "V4SF")
]
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7a.c b/gcc/testsuite/gcc.target/i386/pr89229-7a.c
new file mode 100644
index 00000000000..856115b2f5a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-7a.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+extern float d;
+
+void
+foo1 (float x)
+{
+ register float xmm16 __asm ("xmm16") = x;
+ asm volatile ("" : "+v" (xmm16));
+ register float xmm17 __asm ("xmm17") = xmm16;
+ asm volatile ("" : "+v" (xmm17));
+ d = xmm17;
+}
+
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7b.c b/gcc/testsuite/gcc.target/i386/pr89229-7b.c
new file mode 100644
index 00000000000..93d1e43770c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-7b.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
+
+#include "pr89229-7a.c"
+
+/* { dg-final { scan-assembler-times "vmovaps\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7c.c b/gcc/testsuite/gcc.target/i386/pr89229-7c.c
new file mode 100644
index 00000000000..e37ff2bf5bd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-7c.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+#include "pr89229-7a.c"
+
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
--
2.24.1
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 3/6] i386: Use ix86_output_ssemov for SImode TYPE_SSEMOV
2020-02-29 14:16 V2 [PATCH 0/6] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move H.J. Lu
` (4 preceding siblings ...)
2020-02-29 14:16 ` [PATCH 5/6] i386: Use ix86_output_ssemov for SFmode TYPE_SSEMOV H.J. Lu
@ 2020-02-29 15:30 ` H.J. Lu
2020-03-12 3:39 ` Jeff Law
5 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2020-02-29 15:30 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek, Jeffrey Law, Jan Hubicka, Uros Bizjak
There is no need to set mode attribute to XImode since ix86_output_ssemov
can properly encode xmm16-xmm31 registers with and without AVX512VL.
gcc/
PR target/89229
* config/i386/i386.c (ix86_output_ssemov): Handle MODE_SI.
* config/i386/i386.md (*movsi_internal): Call ix86_output_ssemov
for TYPE_SSEMOV. Remove ext_sse_reg_operand and TARGET_AVX512VL
check.
gcc/testsuite/
PR target/89229
* gcc.target/i386/pr89229-5a.c: New test.
* gcc.target/i386/pr89229-5b.c: Likewise.
* gcc.target/i386/pr89229-5c.c: Likewise.
---
gcc/config/i386/i386.c | 3 +++
gcc/config/i386/i386.md | 25 ++--------------------
gcc/testsuite/gcc.target/i386/pr89229-5a.c | 17 +++++++++++++++
gcc/testsuite/gcc.target/i386/pr89229-5b.c | 6 ++++++
gcc/testsuite/gcc.target/i386/pr89229-5c.c | 7 ++++++
5 files changed, 35 insertions(+), 23 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5c.c
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index baf70a64193..c28c162282a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5127,6 +5127,9 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands)
else
return "%vmovq\t{%1, %0|%0, %1}";
+ case MODE_SI:
+ return "%vmovd\t{%1, %0|%0, %1}";
+
default:
gcc_unreachable ();
}
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index d8462b3de37..e9537fadfe8 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2261,25 +2261,7 @@ (define_insn "*movsi_internal"
gcc_unreachable ();
case TYPE_SSEMOV:
- switch (get_attr_mode (insn))
- {
- case MODE_SI:
- return "%vmovd\t{%1, %0|%0, %1}";
- case MODE_TI:
- return "%vmovdqa\t{%1, %0|%0, %1}";
- case MODE_XI:
- return "vmovdqa32\t{%g1, %g0|%g0, %g1}";
-
- case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
-
- case MODE_SF:
- gcc_assert (!TARGET_AVX);
- return "movss\t{%1, %0|%0, %1}";
-
- default:
- gcc_unreachable ();
- }
+ return ix86_output_ssemov (insn, operands);
case TYPE_MMX:
return "pxor\t%0, %0";
@@ -2345,10 +2327,7 @@ (define_insn "*movsi_internal"
(cond [(eq_attr "alternative" "2,3")
(const_string "DI")
(eq_attr "alternative" "8,9")
- (cond [(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
- (const_string "XI")
- (match_test "TARGET_AVX")
+ (cond [(match_test "TARGET_AVX")
(const_string "TI")
(ior (not (match_test "TARGET_SSE2"))
(match_test "optimize_function_for_size_p (cfun)"))
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5a.c b/gcc/testsuite/gcc.target/i386/pr89229-5a.c
new file mode 100644
index 00000000000..fd56f447016
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-5a.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+extern int i;
+
+int
+foo1 (void)
+{
+ register int xmm16 __asm ("xmm16") = i;
+ asm volatile ("" : "+v" (xmm16));
+ register int xmm17 __asm ("xmm17") = xmm16;
+ asm volatile ("" : "+v" (xmm17));
+ return xmm17;
+}
+
+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5b.c b/gcc/testsuite/gcc.target/i386/pr89229-5b.c
new file mode 100644
index 00000000000..261f2e12e8d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-5b.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
+
+#include "pr89229-5a.c"
+
+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5c.c b/gcc/testsuite/gcc.target/i386/pr89229-5c.c
new file mode 100644
index 00000000000..16fad809385
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-5c.c
@@ -0,0 +1,7 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+#include "pr89229-5a.c"
+
+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
--
2.24.1
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/6] i386: Properly encode vector registers in vector move
2020-02-29 14:16 ` [PATCH 1/6] i386: Properly encode vector registers in vector move H.J. Lu
@ 2020-03-05 23:47 ` Jeff Law
2020-03-08 12:04 ` [COMMITTED, PATCH] gcc.target/i386/pr89229-3c.c: Include "pr89229-3a.c" H.J. Lu
2020-03-10 12:35 ` [PATCH 1/6] i386: Properly encode vector registers in vector move H.J. Lu
0 siblings, 2 replies; 16+ messages in thread
From: Jeff Law @ 2020-03-05 23:47 UTC (permalink / raw)
To: H.J. Lu, gcc-patches; +Cc: Jakub Jelinek, Jan Hubicka, Uros Bizjak
On Sat, 2020-02-29 at 06:16 -0800, H.J. Lu wrote:
> On x86, when AVX and AVX512 are enabled, vector move instructions can
> be encoded with either 2-byte/3-byte VEX (AVX) or 4-byte EVEX (AVX512):
>
> 0: c5 f9 6f d1 vmovdqa %xmm1,%xmm2
> 4: 62 f1 fd 08 6f d1 vmovdqa64 %xmm1,%xmm2
>
> We prefer VEX encoding over EVEX since VEX is shorter. Also AVX512F
> only supports 512-bit vector moves. AVX512F + AVX512VL supports 128-bit
> and 256-bit vector moves. xmm16-xmm31 and ymm16-ymm31 are disallowed in
> 128-bit and 256-bit modes when AVX512VL is disabled. Mode attributes on
> x86 vector move patterns indicate target preferences of vector move
> encoding. For scalar register to register move, we can use 512-bit
> vector move instructions to move 32-bit/64-bit scalar if AVX512VL isn't
> available. With AVX512F and AVX512VL, we should use VEX encoding for
> 128-bit/256-bit vector moves if upper 16 vector registers aren't used.
> This patch adds a function, ix86_output_ssemov, to generate vector moves:
>
> 1. If zmm registers are used, use EVEX encoding.
> 2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX encoding
> will be generated.
> 3. If xmm16-xmm31/ymm16-ymm31 registers are used:
> a. With AVX512VL, AVX512VL vector moves will be generated.
> b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
> move will be done with zmm register move.
>
> There is no need to set mode attribute to XImode explicitly since
> ix86_output_ssemov can properly encode xmm16-xmm31/ymm16-ymm31 registers
> with and without AVX512VL.
>
> Tested on AVX2 and AVX512 with and without --with-arch=native.
>
> gcc/
>
> PR target/89229
> PR target/89346
> * config/i386/i386-protos.h (ix86_output_ssemov): New prototype.
> * config/i386/i386.c (ix86_get_ssemov): New function.
> (ix86_output_ssemov): Likewise.
> * config/i386/sse.md (VMOVE:mov<mode>_internal): Call
> ix86_output_ssemov for TYPE_SSEMOV. Remove TARGET_AVX512VL
> check.
> (*movxi_internal_avx512f): Call ix86_output_ssemov for TYPE_SSEMOV.
> (*movoi_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
> Remove ext_sse_reg_operand and TARGET_AVX512VL check.
> (*movti_internal): Likewise.
> (*movtf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
>
> gcc/testsuite/
>
> PR target/89229
> PR target/89346
> * gcc.target/i386/avx512vl-vmovdqa64-1.c: Updated.
> * gcc.target/i386/pr89346.c: New test.
>
> gcc/testsuite/
>
> PR target/89229
> * gcc.target/i386/pr89229-2a.c: New test.
> * gcc.target/i386/pr89229-2b.c: Likewise.
> * gcc.target/i386/pr89229-2c.c: Likewise.
> * gcc.target/i386/pr89229-3a.c: Likewise.
> * gcc.target/i386/pr89229-3b.c: Likewise.
> * gcc.target/i386/pr89229-3c.c: Likewise.
OK. Let's get this one installed, let the various testers out there chew on it
for a day, then we'll iterate through the rest.
Thanks again for your patience.
jeff
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* [COMMITTED, PATCH] gcc.target/i386/pr89229-3c.c: Include "pr89229-3a.c"
2020-03-05 23:47 ` Jeff Law
@ 2020-03-08 12:04 ` H.J. Lu
2020-03-10 12:35 ` [PATCH 1/6] i386: Properly encode vector registers in vector move H.J. Lu
1 sibling, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2020-03-08 12:04 UTC (permalink / raw)
To: Jeffrey Law; +Cc: GCC Patches, Jakub Jelinek, Jan Hubicka, Uros Bizjak
On Thu, Mar 5, 2020 at 3:47 PM Jeff Law <law@redhat.com> wrote:
>
> On Sat, 2020-02-29 at 06:16 -0800, H.J. Lu wrote:
> > On x86, when AVX and AVX512 are enabled, vector move instructions can
> > be encoded with either 2-byte/3-byte VEX (AVX) or 4-byte EVEX (AVX512):
> >
> > 0: c5 f9 6f d1 vmovdqa %xmm1,%xmm2
> > 4: 62 f1 fd 08 6f d1 vmovdqa64 %xmm1,%xmm2
> >
> > We prefer VEX encoding over EVEX since VEX is shorter. Also AVX512F
> > only supports 512-bit vector moves. AVX512F + AVX512VL supports 128-bit
> > and 256-bit vector moves. xmm16-xmm31 and ymm16-ymm31 are disallowed in
> > 128-bit and 256-bit modes when AVX512VL is disabled. Mode attributes on
> > x86 vector move patterns indicate target preferences of vector move
> > encoding. For scalar register to register move, we can use 512-bit
> > vector move instructions to move 32-bit/64-bit scalar if AVX512VL isn't
> > available. With AVX512F and AVX512VL, we should use VEX encoding for
> > 128-bit/256-bit vector moves if upper 16 vector registers aren't used.
> > This patch adds a function, ix86_output_ssemov, to generate vector moves:
> >
> > 1. If zmm registers are used, use EVEX encoding.
> > 2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX encoding
> > will be generated.
> > 3. If xmm16-xmm31/ymm16-ymm31 registers are used:
> > a. With AVX512VL, AVX512VL vector moves will be generated.
> > b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
> > move will be done with zmm register move.
> >
> > There is no need to set mode attribute to XImode explicitly since
> > ix86_output_ssemov can properly encode xmm16-xmm31/ymm16-ymm31 registers
> > with and without AVX512VL.
> >
> > Tested on AVX2 and AVX512 with and without --with-arch=native.
> >
> > gcc/
> >
> > PR target/89229
> > PR target/89346
> > * config/i386/i386-protos.h (ix86_output_ssemov): New prototype.
> > * config/i386/i386.c (ix86_get_ssemov): New function.
> > (ix86_output_ssemov): Likewise.
> > * config/i386/sse.md (VMOVE:mov<mode>_internal): Call
> > ix86_output_ssemov for TYPE_SSEMOV. Remove TARGET_AVX512VL
> > check.
> > (*movxi_internal_avx512f): Call ix86_output_ssemov for TYPE_SSEMOV.
> > (*movoi_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
> > Remove ext_sse_reg_operand and TARGET_AVX512VL check.
> > (*movti_internal): Likewise.
> > (*movtf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
> >
> > gcc/testsuite/
> >
> > PR target/89229
> > PR target/89346
> > * gcc.target/i386/avx512vl-vmovdqa64-1.c: Updated.
> > * gcc.target/i386/pr89346.c: New test.
> >
> > gcc/testsuite/
> >
> > PR target/89229
> > * gcc.target/i386/pr89229-2a.c: New test.
> > * gcc.target/i386/pr89229-2b.c: Likewise.
> > * gcc.target/i386/pr89229-2c.c: Likewise.
> > * gcc.target/i386/pr89229-3a.c: Likewise.
> > * gcc.target/i386/pr89229-3b.c: Likewise.
> > * gcc.target/i386/pr89229-3c.c: Likewise.
> OK. Let's get this one installed, let the various testers out there chew on it
> for a day, then we'll iterate through the rest.
>
> Thanks again for your patience.
>
I checked in this patch to fix
FAIL: gcc.target/i386/pr89229-3c.c (test for excess errors)
Thanks.
--
H.J.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/6] i386: Properly encode vector registers in vector move
2020-03-05 23:47 ` Jeff Law
2020-03-08 12:04 ` [COMMITTED, PATCH] gcc.target/i386/pr89229-3c.c: Include "pr89229-3a.c" H.J. Lu
@ 2020-03-10 12:35 ` H.J. Lu
1 sibling, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2020-03-10 12:35 UTC (permalink / raw)
To: Jeffrey Law; +Cc: GCC Patches, Jakub Jelinek, Jan Hubicka, Uros Bizjak
On Thu, Mar 5, 2020 at 3:47 PM Jeff Law <law@redhat.com> wrote:
>
> On Sat, 2020-02-29 at 06:16 -0800, H.J. Lu wrote:
> > On x86, when AVX and AVX512 are enabled, vector move instructions can
> > be encoded with either 2-byte/3-byte VEX (AVX) or 4-byte EVEX (AVX512):
> >
> > 0: c5 f9 6f d1 vmovdqa %xmm1,%xmm2
> > 4: 62 f1 fd 08 6f d1 vmovdqa64 %xmm1,%xmm2
> >
> > We prefer VEX encoding over EVEX since VEX is shorter. Also AVX512F
> > only supports 512-bit vector moves. AVX512F + AVX512VL supports 128-bit
> > and 256-bit vector moves. xmm16-xmm31 and ymm16-ymm31 are disallowed in
> > 128-bit and 256-bit modes when AVX512VL is disabled. Mode attributes on
> > x86 vector move patterns indicate target preferences of vector move
> > encoding. For scalar register to register move, we can use 512-bit
> > vector move instructions to move 32-bit/64-bit scalar if AVX512VL isn't
> > available. With AVX512F and AVX512VL, we should use VEX encoding for
> > 128-bit/256-bit vector moves if upper 16 vector registers aren't used.
> > This patch adds a function, ix86_output_ssemov, to generate vector moves:
> >
> > 1. If zmm registers are used, use EVEX encoding.
> > 2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX encoding
> > will be generated.
> > 3. If xmm16-xmm31/ymm16-ymm31 registers are used:
> > a. With AVX512VL, AVX512VL vector moves will be generated.
> > b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
> > move will be done with zmm register move.
> >
> > There is no need to set mode attribute to XImode explicitly since
> > ix86_output_ssemov can properly encode xmm16-xmm31/ymm16-ymm31 registers
> > with and without AVX512VL.
> >
> > Tested on AVX2 and AVX512 with and without --with-arch=native.
> >
> > gcc/
> >
> > PR target/89229
> > PR target/89346
> > * config/i386/i386-protos.h (ix86_output_ssemov): New prototype.
> > * config/i386/i386.c (ix86_get_ssemov): New function.
> > (ix86_output_ssemov): Likewise.
> > * config/i386/sse.md (VMOVE:mov<mode>_internal): Call
> > ix86_output_ssemov for TYPE_SSEMOV. Remove TARGET_AVX512VL
> > check.
> > (*movxi_internal_avx512f): Call ix86_output_ssemov for TYPE_SSEMOV.
> > (*movoi_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
> > Remove ext_sse_reg_operand and TARGET_AVX512VL check.
> > (*movti_internal): Likewise.
> > (*movtf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
> >
> > gcc/testsuite/
> >
> > PR target/89229
> > PR target/89346
> > * gcc.target/i386/avx512vl-vmovdqa64-1.c: Updated.
> > * gcc.target/i386/pr89346.c: New test.
> >
> > gcc/testsuite/
> >
> > PR target/89229
> > * gcc.target/i386/pr89229-2a.c: New test.
> > * gcc.target/i386/pr89229-2b.c: Likewise.
> > * gcc.target/i386/pr89229-2c.c: Likewise.
> > * gcc.target/i386/pr89229-3a.c: Likewise.
> > * gcc.target/i386/pr89229-3b.c: Likewise.
> > * gcc.target/i386/pr89229-3c.c: Likewise.
> OK. Let's get this one installed, let the various testers out there chew on it
> for a day, then we'll iterate through the rest.
>
> Thanks again for your patience.
Hi, Jeff,
My first patch has been installed for 5 days without problems. Can you
review the rest?
Thanks.
--
H.J.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/6] i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV
2020-02-29 14:16 ` [PATCH 2/6] i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV H.J. Lu
@ 2020-03-12 3:32 ` Jeff Law
0 siblings, 0 replies; 16+ messages in thread
From: Jeff Law @ 2020-03-12 3:32 UTC (permalink / raw)
To: H.J. Lu, gcc-patches; +Cc: Jakub Jelinek, Jan Hubicka, Uros Bizjak
On Sat, 2020-02-29 at 06:16 -0800, H.J. Lu wrote:
> There is no need to set mode attribute to XImode since ix86_output_ssemov
> can properly encode xmm16-xmm31 registers with and without AVX512VL.
>
> gcc/
>
> PR target/89229
> * config/i386/i386.c (ix86_output_ssemov): Handle MODE_DI.
> * config/i386/i386.md (*movdi_internal): Call ix86_output_ssemov
> for TYPE_SSEMOV. Remove ext_sse_reg_operand and TARGET_AVX512VL
> check.
>
> gcc/testsuite/
>
> PR target/89229
> * gcc.target/i386/pr89229-4a.c: New test.
> * gcc.target/i386/pr89229-4b.c: Likewise.
> * gcc.target/i386/pr89229-4c.c: Likewise.
So for alternatives 14, 15, 16 and !TARGET_SSE2 can't the insn_mode be V2SF?
Isn't that going to trigger the gcc_unreachable in ix86_output_ssemov?
Jeff
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/6] i386: Use ix86_output_ssemov for SImode TYPE_SSEMOV
2020-02-29 15:30 ` [PATCH 3/6] i386: Use ix86_output_ssemov for SImode TYPE_SSEMOV H.J. Lu
@ 2020-03-12 3:39 ` Jeff Law
0 siblings, 0 replies; 16+ messages in thread
From: Jeff Law @ 2020-03-12 3:39 UTC (permalink / raw)
To: H.J. Lu, gcc-patches; +Cc: Jakub Jelinek, Jan Hubicka, Uros Bizjak
On Sat, 2020-02-29 at 06:16 -0800, H.J. Lu wrote:
> There is no need to set mode attribute to XImode since ix86_output_ssemov
> can properly encode xmm16-xmm31 registers with and without AVX512VL.
>
> gcc/
>
> PR target/89229
> * config/i386/i386.c (ix86_output_ssemov): Handle MODE_SI.
> * config/i386/i386.md (*movsi_internal): Call ix86_output_ssemov
> for TYPE_SSEMOV. Remove ext_sse_reg_operand and TARGET_AVX512VL
> check.
>
> gcc/testsuite/
>
> PR target/89229
> * gcc.target/i386/pr89229-5a.c: New test.
> * gcc.target/i386/pr89229-5b.c: Likewise.
> * gcc.target/i386/pr89229-5c.c: Likewise.
Similar to #2, can't we get insn_mode to be SFmode for alternatives 10,11 and
!TARGET_SSE2? Won't that cause us to hit the gcc_unreachable in
ix86_output_ssemov?
jeff
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/6] i386: Use ix86_output_ssemov for DFmode TYPE_SSEMOV
2020-02-29 14:16 ` [PATCH 4/6] i386: Use ix86_output_ssemov for DFmode TYPE_SSEMOV H.J. Lu
@ 2020-03-12 3:41 ` Jeff Law
0 siblings, 0 replies; 16+ messages in thread
From: Jeff Law @ 2020-03-12 3:41 UTC (permalink / raw)
To: H.J. Lu, gcc-patches; +Cc: Jakub Jelinek, Jan Hubicka, Uros Bizjak
On Sat, 2020-02-29 at 06:16 -0800, H.J. Lu wrote:
> There is no need to set mode attribute to XImode nor V8DFmode since
> ix86_output_ssemov can properly encode xmm16-xmm31 registers with and
> without AVX512VL.
>
> gcc/
>
> PR target/89229
> * config/i386/i386.c (ix86_output_ssemov): Handle MODE_DF.
> * config/i386/i386.md (*movdf_internal): Call ix86_output_ssemov
> for TYPE_SSEMOV. Remove TARGET_AVX512F, TARGET_PREFER_AVX256,
> TARGET_AVX512VL and ext_sse_reg_operand check.
And I worry about V1DF for alternatives 14,18 and SSE_SPLIT_REGS as well as
alternative 15,19 and !TARGET_SSE2 which has insn_mode of V2SF.
jeff
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 5/6] i386: Use ix86_output_ssemov for SFmode TYPE_SSEMOV
2020-02-29 14:16 ` [PATCH 5/6] i386: Use ix86_output_ssemov for SFmode TYPE_SSEMOV H.J. Lu
@ 2020-03-12 3:46 ` Jeff Law
0 siblings, 0 replies; 16+ messages in thread
From: Jeff Law @ 2020-03-12 3:46 UTC (permalink / raw)
To: H.J. Lu, gcc-patches; +Cc: Jakub Jelinek, Jan Hubicka, Uros Bizjak
On Sat, 2020-02-29 at 06:16 -0800, H.J. Lu wrote:
> There is no need to set mode attribute to V16SFmode since ix86_output_ssemov
> can properly encode xmm16-xmm31 registers with and without AVX512VL.
>
> gcc/
>
> PR target/89229
> * config/i386/i386.c (ix86_output_ssemov): Handle MODE_SF.
> * config/i386/i386.md (*movdf_internal): Call ix86_output_ssemov
> for TYPE_SSEMOV. Remove TARGET_PREFER_AVX256, TARGET_AVX512VL
> and ext_sse_reg_operand check.
>
> gcc/testsuite/
>
> PR target/89229
> * gcc.target/i386/pr89229-7a.c: New test.
> * gcc.target/i386/pr89229-7b.c: Likewise.
> * gcc.target/i386/pr89229-7c.c: Likewise.
I believe this as a dependency on patch #3. It's OK once patch #3 is approved.
Alternately, you could break out the MODE_SI hunk in ix86_output_ssemov from
patch #3, add it to this patch and that would be approved for immediate
integration.
Jeff
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 6/6] i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV
2020-02-29 14:16 ` [PATCH 6/6] i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV H.J. Lu
@ 2020-03-12 3:53 ` Jeff Law
2020-03-12 10:52 ` H.J. Lu
0 siblings, 1 reply; 16+ messages in thread
From: Jeff Law @ 2020-03-12 3:53 UTC (permalink / raw)
To: H.J. Lu, gcc-patches; +Cc: Jakub Jelinek, Jan Hubicka, Uros Bizjak
On Sat, 2020-02-29 at 06:16 -0800, H.J. Lu wrote:
> There is no need to set mode attribute to XImode since ix86_output_ssemov
> can properly encode xmm16-xmm31 registers with and without AVX512VL.
>
> Remove ext_sse_reg_operand since it is no longer needed.
>
> PR target/89229
> * config/i386/i386.c (ix86_output_ssemov): Handle MODE_V1DF and
> MODE_V2SF.
> * config/i386/mmx.md (MMXMODE:*mov<mode>_internal): Call
> ix86_output_ssemov for TYPE_SSEMOV. Remove ext_sse_reg_operand
> check.
> * config/i386/predicates.md (ext_sse_reg_operand): Removed.
This is OK. I think once this is in, patch #2 becomes OK because this patch
adds V2SF handling in ix86_output_ssemov.
Similarly I think patch #4 is OK once this one goes in since it adds V1DF as
well.
So perhaps an integration plan would be to immediately install #6, followed 24hrs
later by patch #4, then 24hrs after patch #2.
Then we can work on patch #5 and patch #3 where I think we go with patch #5 plus
the MODE_SI hunk from patch #3. THen 24hrs after that the remaining bits of
patch #3.
I think that covers the whole series.
jeff
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 6/6] i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV
2020-03-12 3:53 ` Jeff Law
@ 2020-03-12 10:52 ` H.J. Lu
0 siblings, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2020-03-12 10:52 UTC (permalink / raw)
To: Jeffrey Law; +Cc: GCC Patches, Jakub Jelinek, Jan Hubicka, Uros Bizjak
[-- Attachment #1: Type: text/plain, Size: 1522 bytes --]
On Wed, Mar 11, 2020 at 8:53 PM Jeff Law <law@redhat.com> wrote:
>
> On Sat, 2020-02-29 at 06:16 -0800, H.J. Lu wrote:
> > There is no need to set mode attribute to XImode since ix86_output_ssemov
> > can properly encode xmm16-xmm31 registers with and without AVX512VL.
> >
> > Remove ext_sse_reg_operand since it is no longer needed.
> >
> > PR target/89229
> > * config/i386/i386.c (ix86_output_ssemov): Handle MODE_V1DF and
> > MODE_V2SF.
> > * config/i386/mmx.md (MMXMODE:*mov<mode>_internal): Call
> > ix86_output_ssemov for TYPE_SSEMOV. Remove ext_sse_reg_operand
> > check.
> > * config/i386/predicates.md (ext_sse_reg_operand): Removed.
> This is OK. I think once this is in, patch #2 becomes OK because this patch
> adds V2SF handling in ix86_output_ssemov.
I need to take out the ext_sse_reg_operand removal since it is still being
used. I added MODE_DI to to ix86_output_ssemov.
> Similarly I think patch #4 is OK once this one goes in since it adds V1DF as
> well.
>
> So perhaps an integration plan would be to immediately install #6, followed 24hrs
> later by patch #4, then 24hrs after patch #2.
>
> Then we can work on patch #5 and patch #3 where I think we go with patch #5 plus
> the MODE_SI hunk from patch #3. THen 24hrs after that the remaining bits of
> patch #3.
I am enclosing the updated 5 remaining patches. I will check in the
first one and
check in the rest one patch every 24hrs.
> I think that covers the whole series.
>
Thanks.
--
H.J.
[-- Attachment #2: 0001-i386-Use-ix86_output_ssemov-for-MMX-TYPE_SSEMOV.patch --]
[-- Type: text/x-patch, Size: 3294 bytes --]
From 555880dad82a9b511945250c0436ee05c4962f65 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Fri, 14 Feb 2020 11:07:34 -0800
Subject: [PATCH 1/5] i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV
There is no need to set mode attribute to XImode since ix86_output_ssemov
can properly encode xmm16-xmm31 registers with and without AVX512VL.
PR target/89229
* config/i386/i386.c (ix86_output_ssemov): Handle MODE_DI,
MODE_V1DF and MODE_V2SF.
* config/i386/mmx.md (MMXMODE:*mov<mode>_internal): Call
ix86_output_ssemov for TYPE_SSEMOV. Remove ext_sse_reg_operand
check.
---
gcc/config/i386/i386.c | 19 +++++++++++++++++++
gcc/config/i386/mmx.md | 29 ++---------------------------
2 files changed, 21 insertions(+), 27 deletions(-)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7bbfbb4c5a7..6d83855692f 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5118,6 +5118,25 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands)
case MODE_V4SF:
return ix86_get_ssemov (operands, 16, insn_mode, mode);
+ case MODE_DI:
+ /* Handle broken assemblers that require movd instead of movq. */
+ if (!HAVE_AS_IX86_INTERUNIT_MOVQ
+ && (GENERAL_REG_P (operands[0])
+ || GENERAL_REG_P (operands[1])))
+ return "%vmovd\t{%1, %0|%0, %1}";
+ else
+ return "%vmovq\t{%1, %0|%0, %1}";
+
+ case MODE_V1DF:
+ gcc_assert (!TARGET_AVX);
+ return "movlpd\t{%1, %0|%0, %1}";
+
+ case MODE_V2SF:
+ if (TARGET_AVX && REG_P (operands[0]))
+ return "vmovlps\t{%1, %d0|%d0, %1}";
+ else
+ return "%vmovlps\t{%1, %0|%0, %1}";
+
default:
gcc_unreachable ();
}
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index e1c8b0af4c7..c3f195bb34a 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -118,29 +118,7 @@ (define_insn "*mov<mode>_internal"
return standard_sse_constant_opcode (insn, operands);
case TYPE_SSEMOV:
- switch (get_attr_mode (insn))
- {
- case MODE_DI:
- /* Handle broken assemblers that require movd instead of movq. */
- if (!HAVE_AS_IX86_INTERUNIT_MOVQ
- && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
- return "%vmovd\t{%1, %0|%0, %1}";
- return "%vmovq\t{%1, %0|%0, %1}";
- case MODE_TI:
- return "%vmovdqa\t{%1, %0|%0, %1}";
- case MODE_XI:
- return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
-
- case MODE_V2SF:
- if (TARGET_AVX && REG_P (operands[0]))
- return "vmovlps\t{%1, %0, %0|%0, %0, %1}";
- return "%vmovlps\t{%1, %0|%0, %1}";
- case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
-
- default:
- gcc_unreachable ();
- }
+ return ix86_output_ssemov (insn, operands);
default:
gcc_unreachable ();
@@ -189,10 +167,7 @@ (define_insn "*mov<mode>_internal"
(cond [(eq_attr "alternative" "2")
(const_string "SI")
(eq_attr "alternative" "11,12")
- (cond [(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
- (const_string "XI")
- (match_test "<MODE>mode == V2SFmode")
+ (cond [(match_test "<MODE>mode == V2SFmode")
(const_string "V4SF")
(ior (not (match_test "TARGET_SSE2"))
(match_test "optimize_function_for_size_p (cfun)"))
--
2.24.1
[-- Attachment #3: 0002-i386-Use-ix86_output_ssemov-for-DFmode-TYPE_SSEMOV.patch --]
[-- Type: text/x-patch, Size: 5796 bytes --]
From d02ae1b84bb6dcc30230808a57e12d49d6f4a853 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Fri, 14 Feb 2020 10:32:06 -0800
Subject: [PATCH 2/5] i386: Use ix86_output_ssemov for DFmode TYPE_SSEMOV
There is no need to set mode attribute to XImode nor V8DFmode since
ix86_output_ssemov can properly encode xmm16-xmm31 registers with and
without AVX512VL.
gcc/
PR target/89229
* config/i386/i386.c (ix86_output_ssemov): Handle MODE_DF.
* config/i386/i386.md (*movdf_internal): Call ix86_output_ssemov
for TYPE_SSEMOV. Remove TARGET_AVX512F, TARGET_PREFER_AVX256,
TARGET_AVX512VL and ext_sse_reg_operand check.
gcc/testsuite/
PR target/89229
* gcc.target/i386/pr89229-4a.c: New test.
* gcc.target/i386/pr89229-4b.c: Likewise.
* gcc.target/i386/pr89229-4c.c: Likewise.
---
gcc/config/i386/i386.c | 6 +++
gcc/config/i386/i386.md | 44 ++--------------------
gcc/testsuite/gcc.target/i386/pr89229-4a.c | 16 ++++++++
gcc/testsuite/gcc.target/i386/pr89229-4b.c | 7 ++++
gcc/testsuite/gcc.target/i386/pr89229-4c.c | 6 +++
5 files changed, 38 insertions(+), 41 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4c.c
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 6d83855692f..924f9558b24 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5127,6 +5127,12 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands)
else
return "%vmovq\t{%1, %0|%0, %1}";
+ case MODE_DF:
+ if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
+ return "vmovsd\t{%d1, %0|%0, %d1}";
+ else
+ return "%vmovsd\t{%1, %0|%0, %1}";
+
case MODE_V1DF:
gcc_assert (!TARGET_AVX);
return "movlpd\t{%1, %0|%0, %1}";
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 8b5ae34ee11..0f57f939cc3 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3355,37 +3355,7 @@ (define_insn "*movdf_internal"
return standard_sse_constant_opcode (insn, operands);
case TYPE_SSEMOV:
- switch (get_attr_mode (insn))
- {
- case MODE_DF:
- if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
- return "vmovsd\t{%d1, %0|%0, %d1}";
- return "%vmovsd\t{%1, %0|%0, %1}";
-
- case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
- case MODE_V8DF:
- return "vmovapd\t{%g1, %g0|%g0, %g1}";
- case MODE_V2DF:
- return "%vmovapd\t{%1, %0|%0, %1}";
-
- case MODE_V2SF:
- gcc_assert (!TARGET_AVX);
- return "movlps\t{%1, %0|%0, %1}";
- case MODE_V1DF:
- gcc_assert (!TARGET_AVX);
- return "movlpd\t{%1, %0|%0, %1}";
-
- case MODE_DI:
- /* Handle broken assemblers that require movd instead of movq. */
- if (!HAVE_AS_IX86_INTERUNIT_MOVQ
- && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
- return "%vmovd\t{%1, %0|%0, %1}";
- return "%vmovq\t{%1, %0|%0, %1}";
-
- default:
- gcc_unreachable ();
- }
+ return ix86_output_ssemov (insn, operands);
default:
gcc_unreachable ();
@@ -3439,10 +3409,7 @@ (define_insn "*movdf_internal"
/* xorps is one byte shorter for non-AVX targets. */
(eq_attr "alternative" "12,16")
- (cond [(and (match_test "TARGET_AVX512F")
- (not (match_test "TARGET_PREFER_AVX256")))
- (const_string "XI")
- (match_test "TARGET_AVX")
+ (cond [(match_test "TARGET_AVX")
(const_string "V2DF")
(ior (not (match_test "TARGET_SSE2"))
(match_test "optimize_function_for_size_p (cfun)"))
@@ -3458,12 +3425,7 @@ (define_insn "*movdf_internal"
/* movaps is one byte shorter for non-AVX targets. */
(eq_attr "alternative" "13,17")
- (cond [(and (ior (not (match_test "TARGET_PREFER_AVX256"))
- (not (match_test "TARGET_AVX512VL")))
- (ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand")))
- (const_string "V8DF")
- (match_test "TARGET_AVX")
+ (cond [(match_test "TARGET_AVX")
(const_string "DF")
(ior (not (match_test "TARGET_SSE2"))
(match_test "optimize_function_for_size_p (cfun)"))
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4a.c b/gcc/testsuite/gcc.target/i386/pr89229-4a.c
new file mode 100644
index 00000000000..5bc10d25619
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-4a.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+extern double d;
+
+void
+foo1 (double x)
+{
+ register double xmm16 __asm ("xmm16") = x;
+ asm volatile ("" : "+v" (xmm16));
+ register double xmm17 __asm ("xmm17") = xmm16;
+ asm volatile ("" : "+v" (xmm17));
+ d = xmm17;
+}
+
+/* { dg-final { scan-assembler-not "vmovapd" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4b.c b/gcc/testsuite/gcc.target/i386/pr89229-4b.c
new file mode 100644
index 00000000000..228aeb7b580
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-4b.c
@@ -0,0 +1,7 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
+
+#include "pr89229-4a.c"
+
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
+/* { dg-final { scan-assembler-not "vmovapd" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4c.c b/gcc/testsuite/gcc.target/i386/pr89229-4c.c
new file mode 100644
index 00000000000..537c82fbc54
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-4c.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+#include "pr89229-4a.c"
+
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
--
2.24.1
[-- Attachment #4: 0003-i386-Use-ix86_output_ssemov-for-DImode-TYPE_SSEMOV.patch --]
[-- Type: text/x-patch, Size: 4604 bytes --]
From 762e781167e1f1584e35087c301b1decc6794d13 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Fri, 14 Feb 2020 10:16:34 -0800
Subject: [PATCH 3/5] i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV
There is no need to set mode attribute to XImode since ix86_output_ssemov
can properly encode xmm16-xmm31 registers with and without AVX512VL.
gcc/
PR target/89229
* config/i386/i386.md (*movdi_internal): Call ix86_output_ssemov
for TYPE_SSEMOV. Remove ext_sse_reg_operand and TARGET_AVX512VL
check.
gcc/testsuite/
PR target/89229
* gcc.target/i386/pr89229-5a.c: New test.
* gcc.target/i386/pr89229-5b.c: Likewise.
* gcc.target/i386/pr89229-5c.c: Likewise.
---
gcc/config/i386/i386.md | 31 ++--------------------
gcc/testsuite/gcc.target/i386/pr89229-5a.c | 17 ++++++++++++
gcc/testsuite/gcc.target/i386/pr89229-5b.c | 6 +++++
gcc/testsuite/gcc.target/i386/pr89229-5c.c | 7 +++++
4 files changed, 32 insertions(+), 29 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5c.c
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 0f57f939cc3..6fa5db0a452 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2054,31 +2054,7 @@ (define_insn "*movdi_internal"
return standard_sse_constant_opcode (insn, operands);
case TYPE_SSEMOV:
- switch (get_attr_mode (insn))
- {
- case MODE_DI:
- /* Handle broken assemblers that require movd instead of movq. */
- if (!HAVE_AS_IX86_INTERUNIT_MOVQ
- && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
- return "%vmovd\t{%1, %0|%0, %1}";
- return "%vmovq\t{%1, %0|%0, %1}";
-
- case MODE_TI:
- /* Handle AVX512 registers set. */
- if (EXT_REX_SSE_REG_P (operands[0])
- || EXT_REX_SSE_REG_P (operands[1]))
- return "vmovdqa64\t{%1, %0|%0, %1}";
- return "%vmovdqa\t{%1, %0|%0, %1}";
-
- case MODE_V2SF:
- gcc_assert (!TARGET_AVX);
- return "movlps\t{%1, %0|%0, %1}";
- case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
-
- default:
- gcc_unreachable ();
- }
+ return ix86_output_ssemov (insn, operands);
case TYPE_SSECVT:
if (SSE_REG_P (operands[0]))
@@ -2164,10 +2140,7 @@ (define_insn "*movdi_internal"
(cond [(eq_attr "alternative" "2")
(const_string "SI")
(eq_attr "alternative" "12,13")
- (cond [(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
- (const_string "TI")
- (match_test "TARGET_AVX")
+ (cond [(match_test "TARGET_AVX")
(const_string "TI")
(ior (not (match_test "TARGET_SSE2"))
(match_test "optimize_function_for_size_p (cfun)"))
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5a.c b/gcc/testsuite/gcc.target/i386/pr89229-5a.c
new file mode 100644
index 00000000000..cb9b071e873
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-5a.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+extern long long i;
+
+long long
+foo1 (void)
+{
+ register long long xmm16 __asm ("xmm16") = i;
+ asm volatile ("" : "+v" (xmm16));
+ register long long xmm17 __asm ("xmm17") = xmm16;
+ asm volatile ("" : "+v" (xmm17));
+ return xmm17;
+}
+
+/* { dg-final { scan-assembler-times "vmovdqa64\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5b.c b/gcc/testsuite/gcc.target/i386/pr89229-5b.c
new file mode 100644
index 00000000000..261f2e12e8d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-5b.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
+
+#include "pr89229-5a.c"
+
+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5c.c b/gcc/testsuite/gcc.target/i386/pr89229-5c.c
new file mode 100644
index 00000000000..5fe537f47cd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-5c.c
@@ -0,0 +1,7 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+#include "pr89229-5a.c"
+
+/* { dg-final { scan-assembler-times "vmovdqa64\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
--
2.24.1
[-- Attachment #5: 0004-i386-Use-ix86_output_ssemov-for-SFmode-TYPE_SSEMOV.patch --]
[-- Type: text/x-patch, Size: 5173 bytes --]
From 9bff24ba58a91fac044582bc03d3e0ab121b8067 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Fri, 14 Feb 2020 10:38:47 -0800
Subject: [PATCH 4/5] i386: Use ix86_output_ssemov for SFmode TYPE_SSEMOV
There is no need to set mode attribute to V16SFmode since ix86_output_ssemov
can properly encode xmm16-xmm31 registers with and without AVX512VL.
gcc/
PR target/89229
* config/i386/i386.c (ix86_output_ssemov): Handle MODE_SI and
MODE_SF.
* config/i386/i386.md (*movdf_internal): Call ix86_output_ssemov
for TYPE_SSEMOV. Remove TARGET_PREFER_AVX256, TARGET_AVX512VL
and ext_sse_reg_operand check.
gcc/testsuite/
PR target/89229
* gcc.target/i386/pr89229-6a.c: New test.
* gcc.target/i386/pr89229-6b.c: Likewise.
* gcc.target/i386/pr89229-6c.c: Likewise.
---
gcc/config/i386/i386.c | 9 ++++++++
gcc/config/i386/i386.md | 26 ++--------------------
gcc/testsuite/gcc.target/i386/pr89229-6a.c | 16 +++++++++++++
gcc/testsuite/gcc.target/i386/pr89229-6b.c | 6 +++++
gcc/testsuite/gcc.target/i386/pr89229-6c.c | 6 +++++
5 files changed, 39 insertions(+), 24 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6c.c
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 924f9558b24..d1910b42b1b 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5127,12 +5127,21 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands)
else
return "%vmovq\t{%1, %0|%0, %1}";
+ case MODE_SI:
+ return "%vmovd\t{%1, %0|%0, %1}";
+
case MODE_DF:
if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
return "vmovsd\t{%d1, %0|%0, %d1}";
else
return "%vmovsd\t{%1, %0|%0, %1}";
+ case MODE_SF:
+ if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
+ return "vmovss\t{%d1, %0|%0, %d1}";
+ else
+ return "%vmovss\t{%1, %0|%0, %1}";
+
case MODE_V1DF:
gcc_assert (!TARGET_AVX);
return "movlpd\t{%1, %0|%0, %1}";
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 6fa5db0a452..af39f90c68e 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3490,24 +3490,7 @@ (define_insn "*movsf_internal"
return standard_sse_constant_opcode (insn, operands);
case TYPE_SSEMOV:
- switch (get_attr_mode (insn))
- {
- case MODE_SF:
- if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
- return "vmovss\t{%d1, %0|%0, %d1}";
- return "%vmovss\t{%1, %0|%0, %1}";
-
- case MODE_V16SF:
- return "vmovaps\t{%g1, %g0|%g0, %g1}";
- case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
-
- case MODE_SI:
- return "%vmovd\t{%1, %0|%0, %1}";
-
- default:
- gcc_unreachable ();
- }
+ return ix86_output_ssemov (insn, operands);
case TYPE_MMXMOV:
switch (get_attr_mode (insn))
@@ -3579,12 +3562,7 @@ (define_insn "*movsf_internal"
better to maintain the whole registers in single format
to avoid problems on using packed logical operations. */
(eq_attr "alternative" "6")
- (cond [(and (ior (not (match_test "TARGET_PREFER_AVX256"))
- (not (match_test "TARGET_AVX512VL")))
- (ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand")))
- (const_string "V16SF")
- (ior (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
+ (cond [(ior (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
(match_test "TARGET_SSE_SPLIT_REGS"))
(const_string "V4SF")
]
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-6a.c b/gcc/testsuite/gcc.target/i386/pr89229-6a.c
new file mode 100644
index 00000000000..856115b2f5a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-6a.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+extern float d;
+
+void
+foo1 (float x)
+{
+ register float xmm16 __asm ("xmm16") = x;
+ asm volatile ("" : "+v" (xmm16));
+ register float xmm17 __asm ("xmm17") = xmm16;
+ asm volatile ("" : "+v" (xmm17));
+ d = xmm17;
+}
+
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-6b.c b/gcc/testsuite/gcc.target/i386/pr89229-6b.c
new file mode 100644
index 00000000000..a74f7169e6e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-6b.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
+
+#include "pr89229-6a.c"
+
+/* { dg-final { scan-assembler-times "vmovaps\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-6c.c b/gcc/testsuite/gcc.target/i386/pr89229-6c.c
new file mode 100644
index 00000000000..7a4d254670c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-6c.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+#include "pr89229-6a.c"
+
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
--
2.24.1
[-- Attachment #6: 0005-i386-Use-ix86_output_ssemov-for-SImode-TYPE_SSEMOV.patch --]
[-- Type: text/x-patch, Size: 4983 bytes --]
From f3151bd92c342ddddf95f54c2c1a2bad57ea56b1 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Fri, 14 Feb 2020 10:21:17 -0800
Subject: [PATCH 5/5] i386: Use ix86_output_ssemov for SImode TYPE_SSEMOV
There is no need to set mode attribute to XImode since ix86_output_ssemov
can properly encode xmm16-xmm31 registers with and without AVX512VL.
Remove ext_sse_reg_operand since it is no longer needed.
gcc/
PR target/89229
* config/i386/i386.md (*movsi_internal): Call ix86_output_ssemov
for TYPE_SSEMOV. Remove ext_sse_reg_operand and TARGET_AVX512VL
check.
* config/i386/predicates.md (ext_sse_reg_operand): Removed.
gcc/testsuite/
PR target/89229
* gcc.target/i386/pr89229-7a.c: New test.
* gcc.target/i386/pr89229-7b.c: Likewise.
* gcc.target/i386/pr89229-7c.c: Likewise.
---
gcc/config/i386/i386.md | 25 ++--------------------
gcc/config/i386/predicates.md | 5 -----
gcc/testsuite/gcc.target/i386/pr89229-7a.c | 17 +++++++++++++++
gcc/testsuite/gcc.target/i386/pr89229-7b.c | 6 ++++++
gcc/testsuite/gcc.target/i386/pr89229-7c.c | 7 ++++++
5 files changed, 32 insertions(+), 28 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7a.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7b.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7c.c
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index af39f90c68e..3051624d89f 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2261,25 +2261,7 @@ (define_insn "*movsi_internal"
gcc_unreachable ();
case TYPE_SSEMOV:
- switch (get_attr_mode (insn))
- {
- case MODE_SI:
- return "%vmovd\t{%1, %0|%0, %1}";
- case MODE_TI:
- return "%vmovdqa\t{%1, %0|%0, %1}";
- case MODE_XI:
- return "vmovdqa32\t{%g1, %g0|%g0, %g1}";
-
- case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
-
- case MODE_SF:
- gcc_assert (!TARGET_AVX);
- return "movss\t{%1, %0|%0, %1}";
-
- default:
- gcc_unreachable ();
- }
+ return ix86_output_ssemov (insn, operands);
case TYPE_MMX:
return "pxor\t%0, %0";
@@ -2345,10 +2327,7 @@ (define_insn "*movsi_internal"
(cond [(eq_attr "alternative" "2,3")
(const_string "DI")
(eq_attr "alternative" "8,9")
- (cond [(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
- (const_string "XI")
- (match_test "TARGET_AVX")
+ (cond [(match_test "TARGET_AVX")
(const_string "TI")
(ior (not (match_test "TARGET_SSE2"))
(match_test "optimize_function_for_size_p (cfun)"))
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 1119366d54e..71f4cb1193c 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -61,11 +61,6 @@ (define_predicate "sse_reg_operand"
(and (match_code "reg")
(match_test "SSE_REGNO_P (REGNO (op))")))
-;; True if the operand is an AVX-512 new register.
-(define_predicate "ext_sse_reg_operand"
- (and (match_code "reg")
- (match_test "EXT_REX_SSE_REGNO_P (REGNO (op))")))
-
;; Return true if op is a QImode register.
(define_predicate "any_QIreg_operand"
(and (match_code "reg")
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7a.c b/gcc/testsuite/gcc.target/i386/pr89229-7a.c
new file mode 100644
index 00000000000..fd56f447016
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-7a.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+extern int i;
+
+int
+foo1 (void)
+{
+ register int xmm16 __asm ("xmm16") = i;
+ asm volatile ("" : "+v" (xmm16));
+ register int xmm17 __asm ("xmm17") = xmm16;
+ asm volatile ("" : "+v" (xmm17));
+ return xmm17;
+}
+
+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7b.c b/gcc/testsuite/gcc.target/i386/pr89229-7b.c
new file mode 100644
index 00000000000..d3a56e6e2b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-7b.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
+
+#include "pr89229-7a.c"
+
+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7c.c b/gcc/testsuite/gcc.target/i386/pr89229-7c.c
new file mode 100644
index 00000000000..e14634e1edd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-7c.c
@@ -0,0 +1,7 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+#include "pr89229-7a.c"
+
+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
--
2.24.1
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2020-03-12 10:53 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-29 14:16 V2 [PATCH 0/6] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move H.J. Lu
2020-02-29 14:16 ` [PATCH 2/6] i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV H.J. Lu
2020-03-12 3:32 ` Jeff Law
2020-02-29 14:16 ` [PATCH 6/6] i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV H.J. Lu
2020-03-12 3:53 ` Jeff Law
2020-03-12 10:52 ` H.J. Lu
2020-02-29 14:16 ` [PATCH 1/6] i386: Properly encode vector registers in vector move H.J. Lu
2020-03-05 23:47 ` Jeff Law
2020-03-08 12:04 ` [COMMITTED, PATCH] gcc.target/i386/pr89229-3c.c: Include "pr89229-3a.c" H.J. Lu
2020-03-10 12:35 ` [PATCH 1/6] i386: Properly encode vector registers in vector move H.J. Lu
2020-02-29 14:16 ` [PATCH 4/6] i386: Use ix86_output_ssemov for DFmode TYPE_SSEMOV H.J. Lu
2020-03-12 3:41 ` Jeff Law
2020-02-29 14:16 ` [PATCH 5/6] i386: Use ix86_output_ssemov for SFmode TYPE_SSEMOV H.J. Lu
2020-03-12 3:46 ` Jeff Law
2020-02-29 15:30 ` [PATCH 3/6] i386: Use ix86_output_ssemov for SImode TYPE_SSEMOV H.J. Lu
2020-03-12 3:39 ` Jeff Law
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).