public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed
* [gcc/devel/autopar_devel] x86: Fix vextract* masked patterns [PR93069]
@ 2020-08-22 20:54 Giuliano Belinassi
  0 siblings, 0 replies; only message in thread
From: Giuliano Belinassi @ 2020-08-22 20:54 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:0ae0c5a988fe996833d122beee1b3f9db53a04f0

commit 0ae0c5a988fe996833d122beee1b3f9db53a04f0
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed May 6 20:05:02 2020 +0200

    x86: Fix vextract* masked patterns [PR93069]
    
    The AVX512F documentation clearly states that in instructions where the
    destination is a memory only merging-masking is possible, not zero-masking,
    and the assembler enforces that.
    
    The testcase in this patch fails to assemble because of
    Error: unsupported masking for `vextracti32x8'
    on
            vextracti32x8   $0x0, %zmm1, -64(%rsp){%k1}{z}
    For the vector extraction patterns, we apparently have 7 *_maskm patterns
    that only accept memory destinations and rtx_equal_p merge-masking source
    for it, 7 *<mask_name> corresponding patterns that allow memory destination
    only for the non-masked cases (through <store_mask_constraint>), then 2
    *<mask_name> patterns (lo ssehalf V16FI and lo ssehalf VI8F_256 ones) which
    do allow memory destination even for masked cases and are the cause of the
    testsuite failure, because we must not allow C constraint if the destination
    is m, and finally one pair of patterns (separate * and *_mask, hi ssehalf
    VI4F_256), which has another issue (for which I don't have a testcase
    though), where if it would match zero-masking with register destination,
    it wouldn't emit the needed {z} into assembly.
    The attached patch fixes those 3 issues only, perhaps more suitable for
    backporting.
    But, even with that fixed, we are missing 3 further *_maskm patterns and
    more importantly, I find the split into 3 separate patterns after subst,
    *_maskm for masking with memory destination, *_mask for masking with
    register destination and * for non-masking unnecessarily complex and harder
    for reload, so the included patch below (non-attached) instead kills all
    *_maskm patterns and splits the *<mask_name> patterns into * and *_mask
    by hand instead of subst, where the *_mask ones make sure that with v
    destination they use 0C, while with m destination they use 0 and as
    condition enforce that either destination is not MEM, or rtx_equal_p between
    the destination and corresponding merging-masking operand source.
    If we had those 3 missing *_maskm patterns, this patch would actually result
    in both shorter sse.md and shorter machine description after subst (e.g.
    length of tmp-mddump.md), as we don't have them, the patch is actually 16
    lines longer sse.md, but still shorter tmp-mddump.md.
    
    2020-05-06  Jakub Jelinek  <jakub@redhat.com>
    
            PR target/93069
            * config/i386/subst.md (store_mask_constraint, store_mask_predicate):
            Remove.
            (avx512dq_vextract<shuffletype>64x2_1_maskm,
            avx512f_vextract<shuffletype>32x4_1_maskm,
            vec_extract_lo_<mode>_maskm, vec_extract_hi_<mode>_maskm): Remove.
            (<mask_codefor>avx512dq_vextract<shuffletype>64x2_1<mask_name>): Split
            into ...
            (*avx512dq_vextract<shuffletype>64x2_1,
            avx512dq_vextract<shuffletype>64x2_1_mask): ... these new
            define_insns.  Even in the masked variant allow memory output but in
            that case use 0 rather than 0C constraint on the source of masked-out
            elts.
            (<mask_codefor>avx512f_vextract<shuffletype>32x4_1<mask_name>): Split
            into ...
            (*avx512f_vextract<shuffletype>32x4_1,
            avx512f_vextract<shuffletype>32x4_1_mask): ... these new define_insns.
            Even in the masked variant allow memory output but in that case use
            0 rather than 0C constraint on the source of masked-out elts.
            (vec_extract_lo_<mode><mask_name>): Split into ...
            (vec_extract_lo_<mode>, vec_extract_lo_<mode>_mask): ... these new
            define_insns.  Even in the masked variant allow memory output but in
            that case use 0 rather than 0C constraint on the source of masked-out
            elts.
            (vec_extract_hi_<mode><mask_name>): Split into ...
            (vec_extract_hi_<mode>, vec_extract_hi_<mode>_mask): ... these new
            define_insns.  Even in the masked variant allow memory output but in
            that case use 0 rather than 0C constraint on the source of masked-out
            elts.

Diff:
---
 gcc/ChangeLog            |  78 +++++++---
 gcc/config/i386/sse.md   | 389 ++++++++++++++++++++++++-----------------------
 gcc/config/i386/subst.md |   2 -
 3 files changed, 256 insertions(+), 213 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 18800ec605a..1afb7824fa5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,35 @@
+2020-05-06  Jakub Jelinek  <jakub@redhat.com>
+
+	PR target/93069
+	* config/i386/subst.md (store_mask_constraint, store_mask_predicate):
+	Remove.
+	(avx512dq_vextract<shuffletype>64x2_1_maskm,
+	avx512f_vextract<shuffletype>32x4_1_maskm,
+	vec_extract_lo_<mode>_maskm, vec_extract_hi_<mode>_maskm): Remove.
+	(<mask_codefor>avx512dq_vextract<shuffletype>64x2_1<mask_name>): Split
+	into ...
+	(*avx512dq_vextract<shuffletype>64x2_1,
+	avx512dq_vextract<shuffletype>64x2_1_mask): ... these new
+	define_insns.  Even in the masked variant allow memory output but in
+	that case use 0 rather than 0C constraint on the source of masked-out
+	elts.
+	(<mask_codefor>avx512f_vextract<shuffletype>32x4_1<mask_name>): Split
+	into ...
+	(*avx512f_vextract<shuffletype>32x4_1,
+	avx512f_vextract<shuffletype>32x4_1_mask): ... these new define_insns.
+	Even in the masked variant allow memory output but in that case use
+	0 rather than 0C constraint on the source of masked-out elts.
+	(vec_extract_lo_<mode><mask_name>): Split into ...
+	(vec_extract_lo_<mode>, vec_extract_lo_<mode>_mask): ... these new
+	define_insns.  Even in the masked variant allow memory output but in
+	that case use 0 rather than 0C constraint on the source of masked-out
+	elts.
+	(vec_extract_hi_<mode><mask_name>): Split into ...
+	(vec_extract_hi_<mode>, vec_extract_hi_<mode>_mask): ... these new
+	define_insns.  Even in the masked variant allow memory output but in
+	that case use 0 rather than 0C constraint on the source of masked-out
+	elts.
+
 2020-05-06  qing zhao  <qing.zhao@oracle.com>
 
 	PR c/94230
@@ -111,27 +143,27 @@
 2020-05-06  Hongtao Liu  <hongtao.liu@intel.com>
 	    Wei Xiao  <wei3.xiao@intel.com>
 
-	* gcc/common/config/i386/i386-common.c (OPTION_MASK_ISA2_SERIALIZE_SET,
+	* common/config/i386/i386-common.c (OPTION_MASK_ISA2_SERIALIZE_SET,
 	OPTION_MASK_ISA2_SERIALIZE_UNSET): New macros.
 	(ix86_handle_option): Handle -mserialize.
-	* gcc/config.gcc (serializeintrin.h): New header file.
-	* gcc/config/i386/cpuid.h (bit_SERIALIZE): New bit.
-	* gcc/config/i386/driver-i386.c (host_detect_local_cpu): Detect
+	* config.gcc (serializeintrin.h): New header file.
+	* config/i386/cpuid.h (bit_SERIALIZE): New bit.
+	* config/i386/driver-i386.c (host_detect_local_cpu): Detect
 	-mserialize.
-	* gcc/config/i386/i386-builtin.def: Add new builtin.
-	* gcc/config/i386/i386-c.c (__SERIALIZE__): New macro.
-	* gcc/config/i386/i386-options.c (ix86_target_opts_isa2_opts):
+	* config/i386/i386-builtin.def: Add new builtin.
+	* config/i386/i386-c.c (__SERIALIZE__): New macro.
+	* config/i386/i386-options.c (ix86_target_opts_isa2_opts):
 	  Add -mserialize.
 	* (ix86_valid_target_attribute_inner_p): Add target attribute
 	* for serialize.
-	* gcc/config/i386/i386.h (TARGET_SERIALIZE, TARGET_SERIALIZE_P):
+	* config/i386/i386.h (TARGET_SERIALIZE, TARGET_SERIALIZE_P):
 	  New macros.
-	* gcc/config/i386/i386.md (UNSPECV_SERIALIZE): New unspec.
+	* config/i386/i386.md (UNSPECV_SERIALIZE): New unspec.
 	  (serialize): New define_insn.
-	* gcc/config/i386/i386.opt (mserialize): New option
-	* gcc/config/i386/immintrin.h: Include serailizeintrin.h.
-	* gcc/config/i386/serializeintrin.h: New header file.
-	* gcc/doc/invoke.texi: Add documents for -mserialize.
+	* config/i386/i386.opt (mserialize): New option
+	* config/i386/immintrin.h: Include serailizeintrin.h.
+	* config/i386/serializeintrin.h: New header file.
+	* doc/invoke.texi: Add documents for -mserialize.
 
 2020-05-06  Richard Biener  <rguenther@suse.de>
 
@@ -144,7 +176,7 @@
 	private branch.
 	* config/rs6000/rs6000-c.c: Likewise.
 	* config/rs6000/rs6000-call.c: Likewise.
-	* gcc/config/rs6000/rs6000.c: Likewise.
+	* config/rs6000/rs6000.c: Likewise.
 
 2020-05-05  Sebastian Huber  <sebastian.huber@embedded-brains.de>
 
@@ -865,7 +897,7 @@
 2020-04-28  Alexandre Oliva <oliva@adacore.com>
 
 	PR target/94812
-	* gcc/config/rs6000/rs6000.md (rs6000_mffsl): Copy result to
+	* config/rs6000/rs6000.md (rs6000_mffsl): Copy result to
 	output operand in emulation.  Don't overwrite pseudos.
 
 2020-04-28  Jeff Law  <law@redhat.com>
@@ -1120,7 +1152,7 @@
 
 2020-04-23  Bill Schmidt  <wschmidt@linux.ibm.com>
 
-	* gcc/doc/extend.texi (PowerPC AltiVec/VSX Built-in Functions):
+	* doc/extend.texi (PowerPC AltiVec/VSX Built-in Functions):
 	Replace outdated link to ELFv2 ABI.
 
 2020-04-23  Jakub Jelinek  <jakub@redhat.com>
@@ -2700,7 +2732,7 @@
 
 2020-03-29  John David Anglin  <dave.anglin@bell.net>
 
-	* gcc/config/pa/pa.c (pa_asm_output_aligned_bss): Delete duplicate
+	* config/pa/pa.c (pa_asm_output_aligned_bss): Delete duplicate
 	.align output.
 
 2020-03-28  Jakub Jelinek  <jakub@redhat.com>
@@ -3192,7 +3224,7 @@
 2020-03-21 Iain Sandoe <iain@sandoe.co.uk>
 
 	PR target/93694
-	* gcc/config/darwin.opt: Amend options descriptions.
+	* config/darwin.opt: Amend options descriptions.
 
 2020-03-21  Richard Sandiford  <richard.sandiford@arm.com>
 
@@ -3214,7 +3246,7 @@
 2020-03-20  Carl Love  <cel@us.ibm.com>
 
 	PR/target 87583
-	* gcc/config/rs6000/rs6000.c (rs6000_option_override_internal):
+	* config/rs6000/rs6000.c (rs6000_option_override_internal):
 	Add check for TARGET_FPRND for Power 7 or newer.
 
 2020-03-20  Jan Hubicka  <hubicka@ucw.cz>
@@ -10798,7 +10830,7 @@
 2020-03-10  Jiufu Guo  <guojiufu@linux.ibm.com>
 
 	PR target/93709
-	* gcc/config/rs6000/rs6000.c (rs6000_emit_p9_fp_minmax): Check
+	* config/rs6000/rs6000.c (rs6000_emit_p9_fp_minmax): Check
 	NAN and SIGNED_ZEROR for smax/smin.
 
 2020-03-10  Will Schmidt  <will_schmidt@vnet.ibm.com>
@@ -11856,9 +11888,9 @@
 
 2020-02-21  John David Anglin  <danglin@gcc.gnu.org>
 
-	* gcc/config/pa/pa.c (pa_function_value): Fix check for word and
+	* config/pa/pa.c (pa_function_value): Fix check for word and
 	double-word size when handling aggregate return values.
-	* gcc/config/pa/som.h (ASM_DECLARE_FUNCTION_NAME): Fix to indicate
+	* config/pa/som.h (ASM_DECLARE_FUNCTION_NAME): Fix to indicate
 	that homogeneous SFmode and DFmode aggregates are passed and returned
 	in general registers.
 
@@ -13983,7 +14015,7 @@
 
 2020-01-21  Mihail-Calin Ionescu  <mihail.ionescu@arm.com>
 
-	* gcc/config/arm/arm.c (clear_operation_p):
+	* config/arm/arm.c (clear_operation_p):
 	Initialise last_regno, skip first iteration
 	based on the first_set value and use ints instead
 	of the unnecessary HOST_WIDE_INTs.
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 0d69c9eb903..7a7ecd4be87 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -8323,60 +8323,31 @@
   DONE;
 })
 
-(define_insn "avx512dq_vextract<shuffletype>64x2_1_maskm"
-  [(set (match_operand:<ssequartermode> 0 "memory_operand" "=m")
+(define_insn "avx512dq_vextract<shuffletype>64x2_1_mask"
+  [(set (match_operand:<ssequartermode> 0 "nonimmediate_operand" "=v,m")
 	(vec_merge:<ssequartermode>
 	  (vec_select:<ssequartermode>
-	    (match_operand:V8FI 1 "register_operand" "v")
-	    (parallel [(match_operand 2  "const_0_to_7_operand")
-	      (match_operand 3  "const_0_to_7_operand")]))
-	  (match_operand:<ssequartermode> 4 "memory_operand" "0")
-	  (match_operand:QI 5 "register_operand" "Yk")))]
+	    (match_operand:V8FI 1 "register_operand" "v,v")
+	    (parallel [(match_operand 2 "const_0_to_7_operand")
+		       (match_operand 3 "const_0_to_7_operand")]))
+	  (match_operand:<ssequartermode> 4 "nonimm_or_0_operand" "0C,0")
+	  (match_operand:QI 5 "register_operand" "Yk,Yk")))]
   "TARGET_AVX512DQ
    && INTVAL (operands[2]) % 2 == 0
    && INTVAL (operands[2]) == INTVAL (operands[3]) - 1
-   && rtx_equal_p (operands[4], operands[0])"
-{
-  operands[2] = GEN_INT ((INTVAL (operands[2])) >> 1);
-  return "vextract<shuffletype>64x2\t{%2, %1, %0%{%5%}|%0%{%5%}, %1, %2}";
-}
-  [(set_attr "type" "sselog")
-   (set_attr "prefix_extra" "1")
-   (set_attr "length_immediate" "1")
-   (set_attr "memory" "store")
-   (set_attr "prefix" "evex")
-   (set_attr "mode" "<sseinsnmode>")])
-
-(define_insn "avx512f_vextract<shuffletype>32x4_1_maskm"
-  [(set (match_operand:<ssequartermode> 0 "memory_operand" "=m")
-	(vec_merge:<ssequartermode>
-	  (vec_select:<ssequartermode>
-	    (match_operand:V16FI 1 "register_operand" "v")
-	    (parallel [(match_operand 2  "const_0_to_15_operand")
-	      (match_operand 3  "const_0_to_15_operand")
-	      (match_operand 4  "const_0_to_15_operand")
-	      (match_operand 5  "const_0_to_15_operand")]))
-	  (match_operand:<ssequartermode> 6 "memory_operand" "0")
-	  (match_operand:QI 7 "register_operand" "Yk")))]
-  "TARGET_AVX512F
-   && INTVAL (operands[2]) % 4 == 0
-   && INTVAL (operands[2]) == INTVAL (operands[3]) - 1
-   && INTVAL (operands[3]) == INTVAL (operands[4]) - 1
-   && INTVAL (operands[4]) == INTVAL (operands[5]) - 1
-   && rtx_equal_p (operands[6], operands[0])"
+   && (!MEM_P (operands[0]) || rtx_equal_p (operands[0], operands[4]))"
 {
-  operands[2] = GEN_INT (INTVAL (operands[2]) >> 2);
-  return "vextract<shuffletype>32x4\t{%2, %1, %0%{%7%}|%0%{%7%}, %1, %2}";
+  operands[2] = GEN_INT (INTVAL (operands[2]) >> 1);
+  return "vextract<shuffletype>64x2\t{%2, %1, %0%{%5%}%N4|%0%{%5%}%N4, %1, %2}";
 }
-  [(set_attr "type" "sselog")
+  [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
-   (set_attr "memory" "store")
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
-(define_insn "<mask_codefor>avx512dq_vextract<shuffletype>64x2_1<mask_name>"
-  [(set (match_operand:<ssequartermode> 0 "<store_mask_predicate>" "=<store_mask_constraint>")
+(define_insn "*avx512dq_vextract<shuffletype>64x2_1"
+  [(set (match_operand:<ssequartermode> 0 "nonimmediate_operand" "=vm")
 	(vec_select:<ssequartermode>
 	  (match_operand:V8FI 1 "register_operand" "v")
 	  (parallel [(match_operand 2  "const_0_to_7_operand")
@@ -8386,7 +8357,7 @@
    && INTVAL (operands[2]) == INTVAL (operands[3]) - 1"
 {
   operands[2] = GEN_INT (INTVAL (operands[2]) >> 1);
-  return "vextract<shuffletype>64x2\t{%2, %1, %0<mask_operand4>|%0<mask_operand4>, %1, %2}";
+  return "vextract<shuffletype>64x2\t{%2, %1, %0|%0, %1, %2}";
 }
   [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
@@ -8415,14 +8386,41 @@
     operands[1] = gen_lowpart (<ssequartermode>mode, operands[1]);
 })
 
-(define_insn "<mask_codefor>avx512f_vextract<shuffletype>32x4_1<mask_name>"
-  [(set (match_operand:<ssequartermode> 0 "<store_mask_predicate>" "=<store_mask_constraint>")
+(define_insn "avx512f_vextract<shuffletype>32x4_1_mask"
+  [(set (match_operand:<ssequartermode> 0 "nonimmediate_operand" "=v,m")
+	(vec_merge:<ssequartermode>
+	  (vec_select:<ssequartermode>
+	    (match_operand:V16FI 1 "register_operand" "v,v")
+	    (parallel [(match_operand 2 "const_0_to_15_operand")
+		       (match_operand 3 "const_0_to_15_operand")
+		       (match_operand 4 "const_0_to_15_operand")
+		       (match_operand 5 "const_0_to_15_operand")]))
+	  (match_operand:<ssequartermode> 6 "nonimm_or_0_operand" "0C,0")
+	  (match_operand:QI 7 "register_operand" "Yk,Yk")))]
+  "TARGET_AVX512F
+   && INTVAL (operands[2]) % 4 == 0
+   && INTVAL (operands[2]) == INTVAL (operands[3]) - 1
+   && INTVAL (operands[3]) == INTVAL (operands[4]) - 1
+   && INTVAL (operands[4]) == INTVAL (operands[5]) - 1
+   && (!MEM_P (operands[0]) || rtx_equal_p (operands[0], operands[6]))"
+{
+  operands[2] = GEN_INT (INTVAL (operands[2]) >> 2);
+  return "vextract<shuffletype>32x4\t{%2, %1, %0%{%7%}%N6|%0%{%7%}%N6, %1, %2}";
+}
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "<sseinsnmode>")])
+
+(define_insn "*avx512f_vextract<shuffletype>32x4_1"
+  [(set (match_operand:<ssequartermode> 0 "nonimmediate_operand" "=vm")
 	(vec_select:<ssequartermode>
 	  (match_operand:V16FI 1 "register_operand" "v")
-	  (parallel [(match_operand 2  "const_0_to_15_operand")
-            (match_operand 3  "const_0_to_15_operand")
-            (match_operand 4  "const_0_to_15_operand")
-            (match_operand 5  "const_0_to_15_operand")])))]
+	  (parallel [(match_operand 2 "const_0_to_15_operand")
+		     (match_operand 3 "const_0_to_15_operand")
+		     (match_operand 4 "const_0_to_15_operand")
+		     (match_operand 5 "const_0_to_15_operand")])))]
   "TARGET_AVX512F
    && INTVAL (operands[2]) % 4 == 0
    && INTVAL (operands[2]) == INTVAL (operands[3]) - 1
@@ -8430,7 +8428,7 @@
    && INTVAL (operands[4]) == INTVAL (operands[5]) - 1"
 {
   operands[2] = GEN_INT (INTVAL (operands[2]) >> 2);
-  return "vextract<shuffletype>32x4\t{%2, %1, %0<mask_operand6>|%0<mask_operand6>, %1, %2}";
+  return "vextract<shuffletype>32x4\t{%2, %1, %0|%0, %1, %2}";
 }
   [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
@@ -8514,35 +8512,35 @@
   [(set (match_dup 0) (match_dup 1))]
   "operands[1] = gen_lowpart (<ssehalfvecmode>mode, operands[1]);")
 
-(define_insn "vec_extract_lo_<mode>_maskm"
-  [(set (match_operand:<ssehalfvecmode> 0 "memory_operand" "=m")
+(define_insn "vec_extract_lo_<mode>_mask"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=v,m")
 	(vec_merge:<ssehalfvecmode>
 	  (vec_select:<ssehalfvecmode>
-	    (match_operand:V8FI 1 "register_operand" "v")
+	    (match_operand:V8FI 1 "register_operand" "v,v")
 	    (parallel [(const_int 0) (const_int 1)
-	      (const_int 2) (const_int 3)]))
-	  (match_operand:<ssehalfvecmode> 2 "memory_operand" "0")
-	  (match_operand:QI 3 "register_operand" "Yk")))]
+		       (const_int 2) (const_int 3)]))
+	  (match_operand:<ssehalfvecmode> 2 "nonimm_or_0_operand" "0C,0")
+	  (match_operand:QI 3 "register_operand" "Yk,Yk")))]
   "TARGET_AVX512F
-   && rtx_equal_p (operands[2], operands[0])"
-  "vextract<shuffletype>64x4\t{$0x0, %1, %0%{%3%}|%0%{%3%}, %1, 0x0}"
+   && (!MEM_P (operands[0]) || rtx_equal_p (operands[0], operands[2]))"
+  "vextract<shuffletype>64x4\t{$0x0, %1, %0%{%3%}%N2|%0%{%3%}%N2, %1, 0x0}"
   [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
+   (set_attr "memory" "none,store")
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
-(define_insn "vec_extract_lo_<mode><mask_name>"
-  [(set (match_operand:<ssehalfvecmode> 0 "<store_mask_predicate>" "=v,<store_mask_constraint>,v")
+(define_insn "vec_extract_lo_<mode>"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=v,vm,v")
 	(vec_select:<ssehalfvecmode>
-	  (match_operand:V8FI 1 "<store_mask_predicate>" "v,v,<store_mask_constraint>")
+	  (match_operand:V8FI 1 "nonimmediate_operand" "v,v,vm")
 	  (parallel [(const_int 0) (const_int 1)
-            (const_int 2) (const_int 3)])))]
-  "TARGET_AVX512F
-   && (<mask_applied> || !(MEM_P (operands[0]) && MEM_P (operands[1])))"
+		     (const_int 2) (const_int 3)])))]
+  "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
-  if (<mask_applied> || (!TARGET_AVX512VL && !MEM_P (operands[1])))
-    return "vextract<shuffletype>64x4\t{$0x0, %1, %0<mask_operand2>|%0<mask_operand2>, %1, 0x0}";
+  if (!TARGET_AVX512VL && !MEM_P (operands[1]))
+    return "vextract<shuffletype>64x4\t{$0x0, %1, %0|%0, %1, 0x0}";
   else
     return "#";
 }
@@ -8553,70 +8551,69 @@
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
-(define_insn "vec_extract_hi_<mode>_maskm"
-  [(set (match_operand:<ssehalfvecmode> 0 "memory_operand" "=m")
+(define_insn "vec_extract_hi_<mode>_mask"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=v,m")
 	(vec_merge:<ssehalfvecmode>
 	  (vec_select:<ssehalfvecmode>
-	    (match_operand:V8FI 1 "register_operand" "v")
+	    (match_operand:V8FI 1 "register_operand" "v,v")
 	    (parallel [(const_int 4) (const_int 5)
-	      (const_int 6) (const_int 7)]))
-	  (match_operand:<ssehalfvecmode> 2 "memory_operand" "0")
-	  (match_operand:QI 3 "register_operand" "Yk")))]
+		       (const_int 6) (const_int 7)]))
+	  (match_operand:<ssehalfvecmode> 2 "nonimm_or_0_operand" "0C,0")
+	  (match_operand:QI 3 "register_operand" "Yk,Yk")))]
   "TARGET_AVX512F
-   && rtx_equal_p (operands[2], operands[0])"
-  "vextract<shuffletype>64x4\t{$0x1, %1, %0%{%3%}|%0%{%3%}, %1, 0x1}"
-  [(set_attr "type" "sselog")
+   && (!MEM_P (operands[0]) || rtx_equal_p (operands[0], operands[2]))"
+  "vextract<shuffletype>64x4\t{$0x1, %1, %0%{%3%}%N2|%0%{%3%}%N2, %1, 0x1}"
+  [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
-   (set_attr "memory" "store")
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
-(define_insn "vec_extract_hi_<mode><mask_name>"
-  [(set (match_operand:<ssehalfvecmode> 0 "<store_mask_predicate>" "=<store_mask_constraint>")
+(define_insn "vec_extract_hi_<mode>"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=vm")
 	(vec_select:<ssehalfvecmode>
 	  (match_operand:V8FI 1 "register_operand" "v")
 	  (parallel [(const_int 4) (const_int 5)
-            (const_int 6) (const_int 7)])))]
+		     (const_int 6) (const_int 7)])))]
   "TARGET_AVX512F"
-  "vextract<shuffletype>64x4\t{$0x1, %1, %0<mask_operand2>|%0<mask_operand2>, %1, 0x1}"
+  "vextract<shuffletype>64x4\t{$0x1, %1, %0|%0, %1, 0x1}"
   [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
-(define_insn "vec_extract_hi_<mode>_maskm"
-   [(set (match_operand:<ssehalfvecmode> 0 "memory_operand" "=m")
+(define_insn "vec_extract_hi_<mode>_mask"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=v,m")
 	(vec_merge:<ssehalfvecmode>
 	  (vec_select:<ssehalfvecmode>
-	    (match_operand:V16FI 1 "register_operand" "v")
+	    (match_operand:V16FI 1 "register_operand" "v,v")
 	    (parallel [(const_int 8) (const_int 9)
-	      (const_int 10) (const_int 11)
-	      (const_int 12) (const_int 13)
-	      (const_int 14) (const_int 15)]))
-	  (match_operand:<ssehalfvecmode> 2 "memory_operand" "0")
-	  (match_operand:QI 3 "register_operand" "Yk")))]
+		       (const_int 10) (const_int 11)
+		       (const_int 12) (const_int 13)
+		       (const_int 14) (const_int 15)]))
+	  (match_operand:<ssehalfvecmode> 2 "nonimm_or_0_operand" "0C,0")
+	  (match_operand:QI 3 "register_operand" "Yk,Yk")))]
   "TARGET_AVX512DQ
-   && rtx_equal_p (operands[2], operands[0])"
-  "vextract<shuffletype>32x8\t{$0x1, %1, %0%{%3%}|%0%{%3%}, %1, 0x1}"
+   && (!MEM_P (operands[0]) || rtx_equal_p (operands[0], operands[2]))"
+  "vextract<shuffletype>32x8\t{$0x1, %1, %0%{%3%}%N2|%0%{%3%}%N2, %1, 0x1}"
   [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
-(define_insn "vec_extract_hi_<mode><mask_name>"
-  [(set (match_operand:<ssehalfvecmode> 0 "<store_mask_predicate>" "=<store_mask_constraint>,vm")
+(define_insn "vec_extract_hi_<mode>"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=vm,vm")
 	(vec_select:<ssehalfvecmode>
 	  (match_operand:V16FI 1 "register_operand" "v,v")
 	  (parallel [(const_int 8) (const_int 9)
-            (const_int 10) (const_int 11)
-	    (const_int 12) (const_int 13)
-	    (const_int 14) (const_int 15)])))]
-  "TARGET_AVX512F && <mask_avx512dq_condition>"
+		     (const_int 10) (const_int 11)
+		     (const_int 12) (const_int 13)
+		     (const_int 14) (const_int 15)])))]
+  "TARGET_AVX512F"
   "@
-   vextract<shuffletype>32x8\t{$0x1, %1, %0<mask_operand2>|%0<mask_operand2>, %1, 0x1}
+   vextract<shuffletype>32x8\t{$0x1, %1, %0|%0, %1, 0x1}
    vextracti64x4\t{$0x1, %1, %0|%0, %1, 0x1}"
   [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
@@ -8692,27 +8689,44 @@
   DONE;
 })
 
-(define_insn "vec_extract_lo_<mode><mask_name>"
-  [(set (match_operand:<ssehalfvecmode> 0 "<store_mask_predicate>"
-					  "=v,v,<store_mask_constraint>")
+(define_insn "vec_extract_lo_<mode>_mask"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=v,m")
+	(vec_merge:<ssehalfvecmode>
+	  (vec_select:<ssehalfvecmode>
+	    (match_operand:V16FI 1 "register_operand" "v,v")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)
+		       (const_int 4) (const_int 5)
+		       (const_int 6) (const_int 7)]))
+	  (match_operand:<ssehalfvecmode> 2 "nonimm_or_0_operand" "0C,0")
+	  (match_operand:QI 3 "register_operand" "Yk,Yk")))]
+  "TARGET_AVX512DQ
+   && (!MEM_P (operands[0]) || rtx_equal_p (operands[0], operands[2]))"
+  "vextract<shuffletype>32x8\t{$0x0, %1, %0%{%3%}%N2|%0%{%3%}%N2, %1, 0x0}"
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "memory" "none,store")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "<sseinsnmode>")])
+
+(define_insn "vec_extract_lo_<mode>"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=v,v,m")
 	(vec_select:<ssehalfvecmode>
-	  (match_operand:V16FI 1 "<store_mask_predicate>"
-				 "v,<store_mask_constraint>,v")
+	  (match_operand:V16FI 1 "nonimmediate_operand" "v,m,v")
 	  (parallel [(const_int 0) (const_int 1)
-                     (const_int 2) (const_int 3)
-                     (const_int 4) (const_int 5)
-                     (const_int 6) (const_int 7)])))]
+		     (const_int 2) (const_int 3)
+		     (const_int 4) (const_int 5)
+		     (const_int 6) (const_int 7)])))]
   "TARGET_AVX512F
-   && <mask_avx512dq_condition>
-   && (<mask_applied> || !(MEM_P (operands[0]) && MEM_P (operands[1])))"
+   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
-  if (<mask_applied>
-      || (!TARGET_AVX512VL
-	  && !REG_P (operands[0])
-	  && EXT_REX_SSE_REG_P (operands[1])))
+  if (!TARGET_AVX512VL
+      && !REG_P (operands[0])
+      && EXT_REX_SSE_REG_P (operands[1]))
     {
       if (TARGET_AVX512DQ)
-	return "vextract<shuffletype>32x8\t{$0x0, %1, %0<mask_operand2>|%0<mask_operand2>, %1, 0x0}";
+	return "vextract<shuffletype>32x8\t{$0x0, %1, %0|%0, %1, 0x0}";
       else
 	return "vextract<shuffletype>64x4\t{$0x0, %1, %0|%0, %1, 0x0}";
     }
@@ -8750,29 +8764,34 @@
     operands[1] = gen_lowpart (<ssehalfvecmode>mode, operands[1]);
 })
 
-(define_insn "vec_extract_lo_<mode><mask_name>"
-  [(set (match_operand:<ssehalfvecmode> 0 "<store_mask_predicate>"
-					  "=v,v,<store_mask_constraint>")
-	(vec_select:<ssehalfvecmode>
-	  (match_operand:VI8F_256 1 "<store_mask_predicate>"
-				    "v,<store_mask_constraint>,v")
-	  (parallel [(const_int 0) (const_int 1)])))]
-  "TARGET_AVX
-   && <mask_avx512vl_condition> && <mask_avx512dq_condition>
-   && (<mask_applied> || !(MEM_P (operands[0]) && MEM_P (operands[1])))"
-{
-  if (<mask_applied>)
-    return "vextract<shuffletype>64x2\t{$0x0, %1, %0<mask_operand2>|%0<mask_operand2>, %1, 0x0}";
-  else
-    return "#";
-}
+(define_insn "vec_extract_lo_<mode>_mask"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=v,m")
+	(vec_merge:<ssehalfvecmode>
+	  (vec_select:<ssehalfvecmode>
+	    (match_operand:VI8F_256 1 "register_operand" "v,v")
+	    (parallel [(const_int 0) (const_int 1)]))
+	  (match_operand:<ssehalfvecmode> 2 "nonimm_or_0_operand" "0C,0")
+	  (match_operand:QI 3 "register_operand" "Yk,Yk")))]
+  "TARGET_AVX512DQ
+   && TARGET_AVX512VL
+   && (!MEM_P (operands[0]) || rtx_equal_p (operands[0], operands[2]))"
+  "vextract<shuffletype>64x2\t{$0x0, %1, %0%{%3%}%N2|%0%{%3%}%N2, %1, 0x0}"
    [(set_attr "type" "sselog1")
     (set_attr "prefix_extra" "1")
     (set_attr "length_immediate" "1")
-    (set_attr "memory" "none,load,store")
+    (set_attr "memory" "none,store")
     (set_attr "prefix" "evex")
     (set_attr "mode" "XI")])
 
+(define_insn "vec_extract_lo_<mode>"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=vm,v")
+	(vec_select:<ssehalfvecmode>
+	  (match_operand:VI8F_256 1 "nonimmediate_operand" "v,vm")
+	  (parallel [(const_int 0) (const_int 1)])))]
+  "TARGET_AVX
+   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
+  "#")
+
 (define_split
   [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand")
 	(vec_select:<ssehalfvecmode>
@@ -8783,20 +8802,38 @@
   [(set (match_dup 0) (match_dup 1))]
   "operands[1] = gen_lowpart (<ssehalfvecmode>mode, operands[1]);")
 
-(define_insn "vec_extract_hi_<mode><mask_name>"
-  [(set (match_operand:<ssehalfvecmode> 0 "<store_mask_predicate>" "=v,<store_mask_constraint>")
+(define_insn "vec_extract_hi_<mode>_mask"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=v,m")
+	(vec_merge:<ssehalfvecmode>
+	  (vec_select:<ssehalfvecmode>
+	    (match_operand:VI8F_256 1 "register_operand" "v,v")
+	    (parallel [(const_int 2) (const_int 3)]))
+	  (match_operand:<ssehalfvecmode> 2 "nonimm_or_0_operand" "0C,0")
+	  (match_operand:QI 3 "register_operand" "Yk,Yk")))]
+  "TARGET_AVX512DQ
+   && TARGET_AVX512VL
+   && (!MEM_P (operands[0]) || rtx_equal_p (operands[0], operands[2]))"
+  "vextract<shuffletype>64x2\t{$0x1, %1, %0%{%3%}%N2|%0%{%3%}%N2, %1, 0x1}"
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "vex")
+   (set_attr "mode" "<sseinsnmode>")])
+
+(define_insn "vec_extract_hi_<mode>"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=vm")
 	(vec_select:<ssehalfvecmode>
-	  (match_operand:VI8F_256 1 "register_operand" "v,v")
+	  (match_operand:VI8F_256 1 "register_operand" "v")
 	  (parallel [(const_int 2) (const_int 3)])))]
-  "TARGET_AVX && <mask_avx512vl_condition> && <mask_avx512dq_condition>"
+  "TARGET_AVX"
 {
   if (TARGET_AVX512VL)
-  {
-    if (TARGET_AVX512DQ)
-      return "vextract<shuffletype>64x2\t{$0x1, %1, %0<mask_operand2>|%0<mask_operand2>, %1, 0x1}";
-    else
-      return "vextract<shuffletype>32x4\t{$0x1, %1, %0|%0, %1, 0x1}";
-  }
+    {
+      if (TARGET_AVX512DQ)
+	return "vextract<shuffletype>64x2\t{$0x1, %1, %0|%0, %1, 0x1}";
+      else
+	return "vextract<shuffletype>32x4\t{$0x1, %1, %0|%0, %1, 0x1}";
+    }
   else
     return "vextract<i128>\t{$0x1, %1, %0|%0, %1, 0x1}";
 }
@@ -8817,74 +8854,50 @@
   [(set (match_dup 0) (match_dup 1))]
   "operands[1] = gen_lowpart (<ssehalfvecmode>mode, operands[1]);")
 
-(define_insn "vec_extract_lo_<mode><mask_name>"
-  [(set (match_operand:<ssehalfvecmode> 0 "<store_mask_predicate>"
-					  "=<store_mask_constraint>,v")
-	(vec_select:<ssehalfvecmode>
-	  (match_operand:VI4F_256 1 "<store_mask_predicate>"
-				    "v,<store_mask_constraint>")
-	  (parallel [(const_int 0) (const_int 1)
-		     (const_int 2) (const_int 3)])))]
-  "TARGET_AVX
-   && <mask_avx512vl_condition>
-   && (<mask_applied> || !(MEM_P (operands[0]) && MEM_P (operands[1])))"
-{
-  if (<mask_applied>)
-    return "vextract<shuffletype>32x4\t{$0x0, %1, %0<mask_operand2>|%0<mask_operand2>, %1, 0x0}";
-  else
-    return "#";
-}
-  [(set_attr "type" "sselog1")
-   (set_attr "prefix_extra" "1")
-   (set_attr "length_immediate" "1")
-   (set_attr "prefix" "evex")
-   (set_attr "mode" "<sseinsnmode>")])
-
-(define_insn "vec_extract_lo_<mode>_maskm"
-  [(set (match_operand:<ssehalfvecmode> 0 "memory_operand" "=m")
+(define_insn "vec_extract_lo_<mode>_mask"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=v,m")
 	(vec_merge:<ssehalfvecmode>
 	  (vec_select:<ssehalfvecmode>
-	    (match_operand:VI4F_256 1 "register_operand" "v")
+	    (match_operand:VI4F_256 1 "register_operand" "v,v")
 	    (parallel [(const_int 0) (const_int 1)
-		      (const_int 2) (const_int 3)]))
-	  (match_operand:<ssehalfvecmode> 2 "memory_operand" "0")
-	  (match_operand:QI 3 "register_operand" "Yk")))]
-  "TARGET_AVX512VL && TARGET_AVX512F
-   && rtx_equal_p (operands[2], operands[0])"
-  "vextract<shuffletype>32x4\t{$0x0, %1, %0%{%3%}|%0%{%3%}, %1, 0x0}"
+		       (const_int 2) (const_int 3)]))
+	  (match_operand:<ssehalfvecmode> 2 "nonimm_or_0_operand" "0C,0")
+	  (match_operand:QI 3 "register_operand" "Yk,Yk")))]
+  "TARGET_AVX512VL
+   && (!MEM_P (operands[0]) || rtx_equal_p (operands[0], operands[2]))"
+  "vextract<shuffletype>32x4\t{$0x0, %1, %0%{%3%}%N2|%0%{%3%}%N2, %1, 0x0}"
   [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
-(define_insn "vec_extract_hi_<mode>_maskm"
-  [(set (match_operand:<ssehalfvecmode> 0 "memory_operand" "=m")
-	(vec_merge:<ssehalfvecmode>
-	  (vec_select:<ssehalfvecmode>
-	    (match_operand:VI4F_256 1 "register_operand" "v")
-	    (parallel [(const_int 4) (const_int 5)
-		      (const_int 6) (const_int 7)]))
-	  (match_operand:<ssehalfvecmode> 2 "memory_operand" "0")
-	  (match_operand:<ssehalfvecmode> 3 "register_operand" "Yk")))]
-  "TARGET_AVX512F && TARGET_AVX512VL
-   && rtx_equal_p (operands[2], operands[0])"
-  "vextract<shuffletype>32x4\t{$0x1, %1, %0%{%3%}|%0%{%3%}, %1, 0x1}"
+(define_insn "vec_extract_lo_<mode>"
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=vm,v")
+	(vec_select:<ssehalfvecmode>
+	  (match_operand:VI4F_256 1 "nonimmediate_operand" "v,vm")
+	  (parallel [(const_int 0) (const_int 1)
+		     (const_int 2) (const_int 3)])))]
+  "TARGET_AVX
+   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
+  "#"
   [(set_attr "type" "sselog1")
+   (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
 (define_insn "vec_extract_hi_<mode>_mask"
-  [(set (match_operand:<ssehalfvecmode> 0 "register_operand" "=v")
+  [(set (match_operand:<ssehalfvecmode> 0 "register_operand" "=v,m")
 	(vec_merge:<ssehalfvecmode>
 	  (vec_select:<ssehalfvecmode>
-	    (match_operand:VI4F_256 1 "register_operand" "v")
+	    (match_operand:VI4F_256 1 "register_operand" "v,v")
 	    (parallel [(const_int 4) (const_int 5)
 		       (const_int 6) (const_int 7)]))
-	  (match_operand:<ssehalfvecmode> 2 "nonimm_or_0_operand" "0C")
-	  (match_operand:<avx512fmaskmode> 3 "register_operand" "Yk")))]
-  "TARGET_AVX512VL"
+	  (match_operand:<ssehalfvecmode> 2 "nonimm_or_0_operand" "0C,0")
+	  (match_operand:<avx512fmaskmode> 3 "register_operand" "Yk,Yk")))]
+  "TARGET_AVX512VL
+   && (!MEM_P (operands[0]) || rtx_equal_p (operands[0], operands[2]))"
   "vextract<shuffletype>32x4\t{$0x1, %1, %0%{%3%}%N2|%0%{%3%}%N2, %1, 0x1}"
   [(set_attr "type" "sselog1")
    (set_attr "length_immediate" "1")
diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md
index 4a1c9b0801e..a5ca144c7f7 100644
--- a/gcc/config/i386/subst.md
+++ b/gcc/config/i386/subst.md
@@ -57,8 +57,6 @@
 (define_subst_attr "mask_avx512vl_condition" "mask" "1" "TARGET_AVX512VL")
 (define_subst_attr "mask_avx512bw_condition" "mask" "1" "TARGET_AVX512BW")
 (define_subst_attr "mask_avx512dq_condition" "mask" "1" "TARGET_AVX512DQ")
-(define_subst_attr "store_mask_constraint" "mask" "vm" "v")
-(define_subst_attr "store_mask_predicate" "mask" "nonimmediate_operand" "register_operand")
 (define_subst_attr "mask_prefix" "mask" "vex" "evex")
 (define_subst_attr "mask_prefix2" "mask" "maybe_vex" "evex")
 (define_subst_attr "mask_prefix3" "mask" "orig,vex" "evex,evex")


^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2020-08-22 20:54 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-22 20:54 [gcc/devel/autopar_devel] x86: Fix vextract* masked patterns [PR93069] Giuliano Belinassi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).