public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Hongyu Wang <hongyu.wang@intel.com>
To: gcc-patches@gcc.gnu.org
Cc: hongtao.liu@intel.com, ubizjak@gmail.com, hubicka@ucw.cz,
	vmakarov@redhat.com, jakub@redhat.com,
	Kong Lingling <lingling.kong@intel.com>
Subject: [PATCH 10/13] [APX EGPR] Handle legacy insns that only support GPR16 (2/5)
Date: Thu, 31 Aug 2023 16:20:21 +0800	[thread overview]
Message-ID: <20230831082024.314097-11-hongyu.wang@intel.com> (raw)
In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com>

From: Kong Lingling <lingling.kong@intel.com>

These legacy insns in opcode map2/3 have vex but no evex
counterpart, disable EGPR for them by adjusting alternatives and
attr_gpr32.

insn list:
1. phaddw/vphaddw, phaddd/vphaddd, phaddsw/vphaddsw
2. phsubw/vphsubw, phsubd/vphsubd, phsubsw/vphsubsw
3. psignb/vpsginb, psignw/vpsignw, psignd/vpsignd
4. blendps/vblendps, blendpd/vblendpd
5. blendvps/vblendvps, blendvpd/vblendvpd
6. pblendvb/vpblendvb, pblendw/vpblendw
7. mpsadbw/vmpsadbw
8. dpps/vddps, dppd/vdppd
9. pcmpeqq/vpcmpeqq, pcmpgtq/vpcmpgtq

gcc/ChangeLog:

	* config/i386/sse.md (avx2_ph<plusminus_mnemonic>wv16hi3): Set
	attr gpr32 0 and constraint Bt/BM to all mem alternatives.
	(ssse3_ph<plusminus_mnemonic>wv8hi3): Likewise.
	(ssse3_ph<plusminus_mnemonic>wv4hi3): Likewise.
	(avx2_ph<plusminus_mnemonic>dv8si3): Likewise.
	(ssse3_ph<plusminus_mnemonic>dv4si3): Likewise.
	(ssse3_ph<plusminus_mnemonic>dv2si3): Likewise.
	(<ssse3_avx2>_psign<mode>3): Likewise.
	(ssse3_psign<mode>3): Likewise.
	(<sse4_1>_blend<ssemodesuffix><avxsizesuffix): Likewise.
	(<sse4_1>_blendv<ssemodesuffix><avxsizesuffix): Likewise.
	(*<sse4_1>_blendv<ssemodesuffix><avxsizesuffix>_lt): Likewise.
	(*<sse4_1>_blendv<ssefltmodesuff)ix><avxsizesuffix>_not_ltint: Likewise.
	(<sse4_1>_dp<ssemodesuffix><avxsizesuffix>): Likewise.
	(<sse4_1_avx2>_mpsadbw): Likewise.
	(<sse4_1_avx2>_pblendvb): Likewise.
	(*<sse4_1_avx2>_pblendvb_lt): Likewise.
	(sse4_1_pblend<ssemodesuffix>): Likewise.
	(*avx2_pblend<ssemodesuffix>): Likewise.
	(avx2_permv2ti): Likewise.
	(*avx_vperm2f128<mode>_nozero): Likewise.
	(*avx2_eq<mode>3): Likewise.
	(*sse4_1_eqv2di3): Likewise.
	(sse4_2_gtv2di3): Likewise.
	(avx2_gt<mode>3): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-legacy-insn-check-norex2.c: Add
	sse/vex intrinsic tests.
---
 gcc/config/i386/sse.md                        |  80 ++++++++-----
 .../i386/apx-legacy-insn-check-norex2.c       | 106 ++++++++++++++++++
 2 files changed, 159 insertions(+), 27 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index bd6674d34f9..05963de9219 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -16837,7 +16837,7 @@ (define_insn "*avx2_eq<mode>3"
   [(set (match_operand:VI_256 0 "register_operand" "=x")
 	(eq:VI_256
 	  (match_operand:VI_256 1 "nonimmediate_operand" "%x")
-	  (match_operand:VI_256 2 "nonimmediate_operand" "xm")))]
+	  (match_operand:VI_256 2 "nonimmediate_operand" "xBt")))]
   "TARGET_AVX2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
   "vpcmpeq<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ssecmp")
@@ -16845,6 +16845,7 @@ (define_insn "*avx2_eq<mode>3"
      (if_then_else (eq (const_string "<MODE>mode") (const_string "V4DImode"))
 		   (const_string "1")
 		   (const_string "*")))
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "OI")])
 
@@ -17027,7 +17028,7 @@ (define_insn "*sse4_1_eqv2di3"
   [(set (match_operand:V2DI 0 "register_operand" "=Yr,*x,x")
 	(eq:V2DI
 	  (match_operand:V2DI 1 "vector_operand" "%0,0,x")
-	  (match_operand:V2DI 2 "vector_operand" "YrBm,*xBm,xm")))]
+	  (match_operand:V2DI 2 "vector_operand" "YrBT,*xBT,xBt")))]
   "TARGET_SSE4_1 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
   "@
    pcmpeqq\t{%2, %0|%0, %2}
@@ -17035,6 +17036,7 @@ (define_insn "*sse4_1_eqv2di3"
    vpcmpeqq\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,noavx,avx")
    (set_attr "type" "ssecmp")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "orig,orig,vex")
    (set_attr "mode" "TI")])
@@ -17043,7 +17045,7 @@ (define_insn "*sse2_eq<mode>3"
   [(set (match_operand:VI124_128 0 "register_operand" "=x,x")
 	(eq:VI124_128
 	  (match_operand:VI124_128 1 "vector_operand" "%0,x")
-	  (match_operand:VI124_128 2 "vector_operand" "xBm,xm")))]
+	  (match_operand:VI124_128 2 "vector_operand" "xBm,xBt")))]
   "TARGET_SSE2
    && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
   "@
@@ -17058,7 +17060,7 @@ (define_insn "sse4_2_gtv2di3"
   [(set (match_operand:V2DI 0 "register_operand" "=Yr,*x,x")
 	(gt:V2DI
 	  (match_operand:V2DI 1 "register_operand" "0,0,x")
-	  (match_operand:V2DI 2 "vector_operand" "YrBm,*xBm,xm")))]
+	  (match_operand:V2DI 2 "vector_operand" "YrBT,*xBT,xBt")))]
   "TARGET_SSE4_2"
   "@
    pcmpgtq\t{%2, %0|%0, %2}
@@ -17066,6 +17068,7 @@ (define_insn "sse4_2_gtv2di3"
    vpcmpgtq\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,noavx,avx")
    (set_attr "type" "ssecmp")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "orig,orig,vex")
    (set_attr "mode" "TI")])
@@ -17074,7 +17077,7 @@ (define_insn "avx2_gt<mode>3"
   [(set (match_operand:VI_256 0 "register_operand" "=x")
 	(gt:VI_256
 	  (match_operand:VI_256 1 "register_operand" "x")
-	  (match_operand:VI_256 2 "nonimmediate_operand" "xm")))]
+	  (match_operand:VI_256 2 "nonimmediate_operand" "xBt")))]
   "TARGET_AVX2"
   "vpcmpgt<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ssecmp")
@@ -17082,6 +17085,7 @@ (define_insn "avx2_gt<mode>3"
      (if_then_else (eq (const_string "<MODE>mode") (const_string "V4DImode"))
 		   (const_string "1")
 		   (const_string "*")))
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "OI")])
 
@@ -17105,7 +17109,7 @@ (define_insn "*sse2_gt<mode>3"
   [(set (match_operand:VI124_128 0 "register_operand" "=x,x")
 	(gt:VI124_128
 	  (match_operand:VI124_128 1 "register_operand" "0,x")
-	  (match_operand:VI124_128 2 "vector_operand" "xBm,xm")))]
+	  (match_operand:VI124_128 2 "vector_operand" "xBm,xBt")))]
   "TARGET_SSE2"
   "@
    pcmpgt<ssemodesuffix>\t{%2, %0|%0, %2}
@@ -21228,7 +21232,7 @@ (define_insn "avx2_ph<plusminus_mnemonic>wv16hi3"
 	  (vec_select:V16HI
 	    (vec_concat:V32HI
 	      (match_operand:V16HI 1 "register_operand" "x")
-	      (match_operand:V16HI 2 "nonimmediate_operand" "xm"))
+	      (match_operand:V16HI 2 "nonimmediate_operand" "xBt"))
 	    (parallel
 	      [(const_int 0) (const_int 2) (const_int 4) (const_int 6)
 	       (const_int 16) (const_int 18) (const_int 20) (const_int 22)
@@ -21244,6 +21248,7 @@ (define_insn "avx2_ph<plusminus_mnemonic>wv16hi3"
   "TARGET_AVX2"
   "vph<plusminus_mnemonic>w\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseiadd")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "vex")
    (set_attr "mode" "OI")])
@@ -21254,7 +21259,7 @@ (define_insn "ssse3_ph<plusminus_mnemonic>wv8hi3"
 	  (vec_select:V8HI
 	    (vec_concat:V16HI
 	      (match_operand:V8HI 1 "register_operand" "0,x")
-	      (match_operand:V8HI 2 "vector_operand" "xBm,xm"))
+	      (match_operand:V8HI 2 "vector_operand" "xBT,xBt"))
 	    (parallel
 	      [(const_int 0) (const_int 2) (const_int 4) (const_int 6)
 	       (const_int 8) (const_int 10) (const_int 12) (const_int 14)]))
@@ -21269,6 +21274,7 @@ (define_insn "ssse3_ph<plusminus_mnemonic>wv8hi3"
    vph<plusminus_mnemonic>w\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sseiadd")
+   (set_attr "gpr32" "0")
    (set_attr "atom_unit" "complex")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "orig,vex")
@@ -21280,7 +21286,7 @@ (define_insn_and_split "ssse3_ph<plusminus_mnemonic>wv4hi3"
 	  (vec_select:V4HI
 	    (vec_concat:V8HI
 	      (match_operand:V4HI 1 "register_operand" "0,0,x")
-	      (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,x"))
+	      (match_operand:V4HI 2 "register_mmxmem_operand" "yBt,x,x"))
 	    (parallel
 	      [(const_int 0) (const_int 2) (const_int 4) (const_int 6)]))
 	  (vec_select:V4HI
@@ -21309,6 +21315,7 @@ (define_insn_and_split "ssse3_ph<plusminus_mnemonic>wv4hi3"
 }
   [(set_attr "mmx_isa" "native,sse_noavx,avx")
    (set_attr "type" "sseiadd")
+   (set_attr "gpr32" "0")
    (set_attr "atom_unit" "complex")
    (set_attr "prefix_extra" "1")
    (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
@@ -21320,7 +21327,7 @@ (define_insn "avx2_ph<plusminus_mnemonic>dv8si3"
 	  (vec_select:V8SI
 	    (vec_concat:V16SI
 	      (match_operand:V8SI 1 "register_operand" "x")
-	      (match_operand:V8SI 2 "nonimmediate_operand" "xm"))
+	      (match_operand:V8SI 2 "nonimmediate_operand" "xBt"))
 	    (parallel
 	      [(const_int 0) (const_int 2) (const_int 8) (const_int 10)
 	       (const_int 4) (const_int 6) (const_int 12) (const_int 14)]))
@@ -21332,6 +21339,7 @@ (define_insn "avx2_ph<plusminus_mnemonic>dv8si3"
   "TARGET_AVX2"
   "vph<plusminus_mnemonic>d\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseiadd")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "vex")
    (set_attr "mode" "OI")])
@@ -21342,7 +21350,7 @@ (define_insn "ssse3_ph<plusminus_mnemonic>dv4si3"
 	  (vec_select:V4SI
 	    (vec_concat:V8SI
 	      (match_operand:V4SI 1 "register_operand" "0,x")
-	      (match_operand:V4SI 2 "vector_operand" "xBm,xm"))
+	      (match_operand:V4SI 2 "vector_operand" "xBT,xBt"))
 	    (parallel
 	      [(const_int 0) (const_int 2) (const_int 4) (const_int 6)]))
 	  (vec_select:V4SI
@@ -21355,6 +21363,7 @@ (define_insn "ssse3_ph<plusminus_mnemonic>dv4si3"
    vph<plusminus_mnemonic>d\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sseiadd")
+   (set_attr "gpr32" "0")
    (set_attr "atom_unit" "complex")
    (set_attr "prefix_data16" "1,*")
    (set_attr "prefix_extra" "1")
@@ -21367,7 +21376,7 @@ (define_insn_and_split "ssse3_ph<plusminus_mnemonic>dv2si3"
 	  (vec_select:V2SI
 	    (vec_concat:V4SI
 	      (match_operand:V2SI 1 "register_operand" "0,0,x")
-	      (match_operand:V2SI 2 "register_mmxmem_operand" "ym,x,x"))
+	      (match_operand:V2SI 2 "register_mmxmem_operand" "yBt,x,x"))
 	    (parallel [(const_int 0) (const_int 2)]))
 	  (vec_select:V2SI
 	    (vec_concat:V4SI (match_dup 1) (match_dup 2))
@@ -21394,6 +21403,7 @@ (define_insn_and_split "ssse3_ph<plusminus_mnemonic>dv2si3"
 }
   [(set_attr "mmx_isa" "native,sse_noavx,avx")
    (set_attr "type" "sseiadd")
+   (set_attr "gpr32" "0")
    (set_attr "atom_unit" "complex")
    (set_attr "prefix_extra" "1")
    (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
@@ -21848,7 +21858,7 @@ (define_insn "<ssse3_avx2>_psign<mode>3"
   [(set (match_operand:VI124_AVX2 0 "register_operand" "=x,x")
 	(unspec:VI124_AVX2
 	  [(match_operand:VI124_AVX2 1 "register_operand" "0,x")
-	   (match_operand:VI124_AVX2 2 "vector_operand" "xBm,xm")]
+	   (match_operand:VI124_AVX2 2 "vector_operand" "xBT,xBt")]
 	  UNSPEC_PSIGN))]
   "TARGET_SSSE3"
   "@
@@ -21856,6 +21866,7 @@ (define_insn "<ssse3_avx2>_psign<mode>3"
    vpsign<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sselog1")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "<sseinsnmode>")])
@@ -21864,7 +21875,7 @@ (define_insn "ssse3_psign<mode>3"
   [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,x")
 	(unspec:MMXMODEI
 	  [(match_operand:MMXMODEI 1 "register_operand" "0,0,x")
-	   (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,x")]
+	   (match_operand:MMXMODEI 2 "register_mmxmem_operand" "yBt,x,x")]
 	  UNSPEC_PSIGN))]
   "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
   "@
@@ -21874,6 +21885,7 @@ (define_insn "ssse3_psign<mode>3"
   [(set_attr "isa" "*,noavx,avx")
    (set_attr "mmx_isa" "native,*,*")
    (set_attr "type" "sselog1")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
    (set_attr "mode" "DI,TI,TI")])
@@ -22153,7 +22165,7 @@ (define_mode_attr blendbits
 (define_insn "<sse4_1>_blend<ssemodesuffix><avxsizesuffix>"
   [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x")
 	(vec_merge:VF_128_256
-	  (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm")
+	  (match_operand:VF_128_256 2 "vector_operand" "YrBT,*xBT,xBt")
 	  (match_operand:VF_128_256 1 "register_operand" "0,0,x")
 	  (match_operand:SI 3 "const_0_to_<blendbits>_operand")))]
   "TARGET_SSE4_1"
@@ -22163,6 +22175,7 @@ (define_insn "<sse4_1>_blend<ssemodesuffix><avxsizesuffix>"
    vblend<ssemodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "isa" "noavx,noavx,avx")
    (set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "length_immediate" "1")
    (set_attr "prefix_data16" "1,1,*")
    (set_attr "prefix_extra" "1")
@@ -22173,7 +22186,7 @@ (define_insn "<sse4_1>_blendv<ssemodesuffix><avxsizesuffix>"
   [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x")
 	(unspec:VF_128_256
 	  [(match_operand:VF_128_256 1 "register_operand" "0,0,x")
-	   (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm")
+	   (match_operand:VF_128_256 2 "vector_operand" "YrBT,*xBT,xBt")
 	   (match_operand:VF_128_256 3 "register_operand" "Yz,Yz,x")]
 	  UNSPEC_BLENDV))]
   "TARGET_SSE4_1"
@@ -22183,6 +22196,7 @@ (define_insn "<sse4_1>_blendv<ssemodesuffix><avxsizesuffix>"
    vblendv<ssemodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "isa" "noavx,noavx,avx")
    (set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "length_immediate" "1")
    (set_attr "prefix_data16" "1,1,*")
    (set_attr "prefix_extra" "1")
@@ -22234,7 +22248,7 @@ (define_insn_and_split "*<sse4_1>_blendv<ssemodesuffix><avxsizesuffix>_lt"
   [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x")
 	(unspec:VF_128_256
 	  [(match_operand:VF_128_256 1 "register_operand" "0,0,x")
-	   (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm")
+	   (match_operand:VF_128_256 2 "vector_operand" "YrBT,*xBT,xBt")
 	   (lt:VF_128_256
 	     (match_operand:<sseintvecmode> 3 "register_operand" "Yz,Yz,x")
 	     (match_operand:<sseintvecmode> 4 "const0_operand"))]
@@ -22248,6 +22262,7 @@ (define_insn_and_split "*<sse4_1>_blendv<ssemodesuffix><avxsizesuffix>_lt"
   "operands[3] = gen_lowpart (<MODE>mode, operands[3]);"
   [(set_attr "isa" "noavx,noavx,avx")
    (set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "length_immediate" "1")
    (set_attr "prefix_data16" "1,1,*")
    (set_attr "prefix_extra" "1")
@@ -22266,7 +22281,7 @@ (define_insn_and_split "*<sse4_1>_blendv<ssefltmodesuffix><avxsizesuffix>_ltint"
   [(set (match_operand:<ssebytemode> 0 "register_operand" "=Yr,*x,x")
 	(unspec:<ssebytemode>
 	  [(match_operand:<ssebytemode> 1 "register_operand" "0,0,x")
-	   (match_operand:<ssebytemode> 2 "vector_operand" "YrBm,*xBm,xm")
+	   (match_operand:<ssebytemode> 2 "vector_operand" "YrBT,*xBT,xBt")
 	   (subreg:<ssebytemode>
 	     (lt:VI48_AVX
 	       (match_operand:VI48_AVX 3 "register_operand" "Yz,Yz,x")
@@ -22286,6 +22301,7 @@ (define_insn_and_split "*<sse4_1>_blendv<ssefltmodesuffix><avxsizesuffix>_ltint"
 }
   [(set_attr "isa" "noavx,noavx,avx")
    (set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "length_immediate" "1")
    (set_attr "prefix_data16" "1,1,*")
    (set_attr "prefix_extra" "1")
@@ -22324,7 +22340,7 @@ (define_insn "<sse4_1>_dp<ssemodesuffix><avxsizesuffix>"
   [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x")
 	(unspec:VF_128_256
 	  [(match_operand:VF_128_256 1 "vector_operand" "%0,0,x")
-	   (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm")
+	   (match_operand:VF_128_256 2 "vector_operand" "YrBT,*xBT,xBt")
 	   (match_operand:SI 3 "const_0_to_255_operand")]
 	  UNSPEC_DP))]
   "TARGET_SSE4_1"
@@ -22334,6 +22350,7 @@ (define_insn "<sse4_1>_dp<ssemodesuffix><avxsizesuffix>"
    vdp<ssemodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "isa" "noavx,noavx,avx")
    (set_attr "type" "ssemul")
+   (set_attr "gpr32" "0")
    (set_attr "length_immediate" "1")
    (set_attr "prefix_data16" "1,1,*")
    (set_attr "prefix_extra" "1")
@@ -22362,7 +22379,7 @@ (define_insn "<sse4_1_avx2>_mpsadbw"
   [(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x")
 	(unspec:VI1_AVX2
 	  [(match_operand:VI1_AVX2 1 "register_operand" "0,0,x")
-	   (match_operand:VI1_AVX2 2 "vector_operand" "YrBm,*xBm,xm")
+	   (match_operand:VI1_AVX2 2 "vector_operand" "YrBT,*xBT,xBt")
 	   (match_operand:SI 3 "const_0_to_255_operand")]
 	  UNSPEC_MPSADBW))]
   "TARGET_SSE4_1"
@@ -22372,6 +22389,7 @@ (define_insn "<sse4_1_avx2>_mpsadbw"
    vmpsadbw\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "isa" "noavx,noavx,avx")
    (set_attr "type" "sselog1")
+   (set_attr "gpr32" "0")
    (set_attr "length_immediate" "1")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "orig,orig,vex")
@@ -22400,7 +22418,7 @@ (define_insn "<sse4_1_avx2>_pblendvb"
   [(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x")
 	(unspec:VI1_AVX2
 	  [(match_operand:VI1_AVX2 1 "register_operand"  "0,0,x")
-	   (match_operand:VI1_AVX2 2 "vector_operand" "YrBm,*xBm,xm")
+	   (match_operand:VI1_AVX2 2 "vector_operand" "YrBT,*xBT,xBt")
 	   (match_operand:VI1_AVX2 3 "register_operand" "Yz,Yz,x")]
 	  UNSPEC_BLENDV))]
   "TARGET_SSE4_1"
@@ -22410,6 +22428,7 @@ (define_insn "<sse4_1_avx2>_pblendvb"
    vpblendvb\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "isa" "noavx,noavx,avx")
    (set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "*,*,1")
    (set_attr "prefix" "orig,orig,vex")
@@ -22449,7 +22468,7 @@ (define_insn_and_split "*<sse4_1_avx2>_pblendvb_lt"
   [(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x")
 	(unspec:VI1_AVX2
 	  [(match_operand:VI1_AVX2 1 "register_operand"  "0,0,x")
-	   (match_operand:VI1_AVX2 2 "vector_operand" "YrBm,*xBm,xm")
+	   (match_operand:VI1_AVX2 2 "vector_operand" "YrBT,*xBT,xBt")
 	   (lt:VI1_AVX2 (match_operand:VI1_AVX2 3 "register_operand" "Yz,Yz,x")
 			(match_operand:VI1_AVX2 4 "const0_operand"))]
 	  UNSPEC_BLENDV))]
@@ -22462,6 +22481,7 @@ (define_insn_and_split "*<sse4_1_avx2>_pblendvb_lt"
   ""
   [(set_attr "isa" "noavx,noavx,avx")
    (set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "*,*,1")
    (set_attr "prefix" "orig,orig,vex")
@@ -22493,7 +22513,7 @@ (define_insn_and_split "*<sse4_1_avx2>_pblendvb_lt_subreg_not"
 (define_insn "sse4_1_pblend<ssemodesuffix>"
   [(set (match_operand:V8_128 0 "register_operand" "=Yr,*x,x")
 	(vec_merge:V8_128
-	  (match_operand:V8_128 2 "vector_operand" "YrBm,*xBm,xm")
+	  (match_operand:V8_128 2 "vector_operand" "YrBT,*xBT,xBt")
 	  (match_operand:V8_128 1 "register_operand" "0,0,x")
 	  (match_operand:SI 3 "const_0_to_255_operand")))]
   "TARGET_SSE4_1"
@@ -22503,6 +22523,7 @@ (define_insn "sse4_1_pblend<ssemodesuffix>"
    vpblendw\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "isa" "noavx,noavx,avx")
    (set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "orig,orig,vex")
@@ -22565,7 +22586,7 @@ (define_expand "avx2_pblend<ssemodesuffix>_1"
 (define_insn "*avx2_pblend<ssemodesuffix>"
   [(set (match_operand:V16_256 0 "register_operand" "=x")
 	(vec_merge:V16_256
-	  (match_operand:V16_256 2 "nonimmediate_operand" "xm")
+	  (match_operand:V16_256 2 "nonimmediate_operand" "xBt")
 	  (match_operand:V16_256 1 "register_operand" "x")
 	  (match_operand:SI 3 "avx2_pblendw_operand")))]
   "TARGET_AVX2"
@@ -22574,6 +22595,7 @@ (define_insn "*avx2_pblend<ssemodesuffix>"
   return "vpblendw\t{%3, %2, %1, %0|%0, %1, %2, %3}";
 }
   [(set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex")
@@ -22582,7 +22604,7 @@ (define_insn "*avx2_pblend<ssemodesuffix>"
 (define_insn "avx2_pblendd<mode>"
   [(set (match_operand:VI4_AVX2 0 "register_operand" "=x")
 	(vec_merge:VI4_AVX2
-	  (match_operand:VI4_AVX2 2 "nonimmediate_operand" "xm")
+	  (match_operand:VI4_AVX2 2 "nonimmediate_operand" "xBt")
 	  (match_operand:VI4_AVX2 1 "register_operand" "x")
 	  (match_operand:SI 3 "const_0_to_255_operand")))]
   "TARGET_AVX2"
@@ -26443,11 +26465,13 @@ (define_insn "avx512f_perm<mode>_1<mask_name>"
    (set_attr "prefix" "<mask_prefix2>")
    (set_attr "mode" "<sseinsnmode>")])
 
+;; TODO (APX): vmovaps supports EGPR but not others, could split
+;; pattern to enable gpr32 for this one.
 (define_insn "avx2_permv2ti"
   [(set (match_operand:V4DI 0 "register_operand" "=x")
 	(unspec:V4DI
 	  [(match_operand:V4DI 1 "register_operand" "x")
-	   (match_operand:V4DI 2 "nonimmediate_operand" "xm")
+	   (match_operand:V4DI 2 "nonimmediate_operand" "xBt")
 	   (match_operand:SI 3 "const_0_to_255_operand")]
 	  UNSPEC_VPERMTI))]
   "TARGET_AVX2"
@@ -26474,6 +26498,7 @@ (define_insn "avx2_permv2ti"
     return "vperm2i128\t{%3, %2, %1, %0|%0, %1, %2, %3}";
   }
   [(set_attr "type" "sselog")
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "OI")])
 
@@ -27089,7 +27114,7 @@ (define_insn "*avx_vperm2f128<mode>_nozero"
 	(vec_select:AVX256MODE2P
 	  (vec_concat:<ssedoublevecmode>
 	    (match_operand:AVX256MODE2P 1 "register_operand" "x")
-	    (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xm"))
+	    (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xBt"))
 	  (match_parallel 3 ""
 	    [(match_operand 4 "const_int_operand")])))]
   "TARGET_AVX
@@ -27106,6 +27131,7 @@ (define_insn "*avx_vperm2f128<mode>_nozero"
   return "vperm2<i128>\t{%3, %2, %1, %0|%0, %1, %2, %3}";
 }
   [(set_attr "type" "sselog")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex")
diff --git a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c
index 1e5450dfb73..510213a6ca7 100644
--- a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c
+++ b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c
@@ -28,3 +28,109 @@ void legacy_test ()
 /* { dg-final { scan-assembler-not "xrstor64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
 /* { dg-final { scan-assembler-not "fxsave64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
 /* { dg-final { scan-assembler-not "fxrstor64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+
+#ifdef DTYPE
+#undef DTYPE
+#define DTYPE u64
+#endif
+
+typedef union
+{
+  __m128i xi[8];
+  __m128 xf[8];
+  __m128d xd[8];
+  __m256i yi[4];
+  __m256 yf[4];
+  __m256d yd[4];
+  DTYPE a[16];
+} tmp_u;
+
+__attribute__((target("sse4.2")))
+void sse_test ()
+{
+  register tmp_u *tdst __asm__("%r16");
+  register tmp_u *src1 __asm__("%r17");
+  register tmp_u *src2 __asm__("%r18");
+ 
+  src1->xi[0] = _mm_hadd_epi16 (tdst->xi[2], src2->xi[3]);
+  src1->xi[1] = _mm_hadd_epi32 (tdst->xi[0], src2->xi[1]);
+  tdst->xi[2] = _mm_hadds_epi16 (src1->xi[4], src2->xi[5]);
+  tdst->xi[3] = _mm_hsub_epi16 (src1->xi[6], src2->xi[7]);
+  tdst->xi[4] = _mm_hsub_epi32 (src1->xi[0], src2->xi[1]);
+  tdst->xi[5] = _mm_hsubs_epi16 (src1->xi[2], src2->xi[3]);
+
+  src1->xi[6] = _mm_cmpeq_epi64 (tdst->xi[4], src2->xi[5]);
+  src1->xi[7] = _mm_cmpgt_epi64 (tdst->xi[6], src2->xi[7]);
+
+  tdst->xf[0] = _mm_dp_ps (src1->xf[0], src2->xf[1], 0xbf);
+  tdst->xd[1] = _mm_dp_pd (src1->xd[2], src2->xd[3], 0xae);
+
+  tdst->xi[2] = _mm_mpsadbw_epu8 (src1->xi[4], src2->xi[5], 0xc1);
+
+  tdst->xi[3] = _mm_blend_epi16 (src1->xi[6], src2->xi[7], 0xc);
+  tdst->xi[4] = _mm_blendv_epi8 (src1->xi[0], src2->xi[1], tdst->xi[2]);
+  tdst->xf[5] = _mm_blend_ps (src1->xf[3], src2->xf[4], 0x4);
+  tdst->xf[6] = _mm_blendv_ps (src1->xf[5], src2->xf[6], tdst->xf[7]);
+  tdst->xd[7] = _mm_blend_pd (tdst->xd[0], src1->xd[1], 0x1);
+  tdst->xd[0] = _mm_blendv_pd (src1->xd[2], src2->xd[3], tdst->xd[4]);
+
+  tdst->xi[1] = _mm_sign_epi8 (src1->xi[5], src2->xi[6]);
+  tdst->xi[2] = _mm_sign_epi16 (src1->xi[7], src2->xi[0]);
+  tdst->xi[3] = _mm_sign_epi32 (src1->xi[1], src2->xi[2]);
+}
+
+__attribute__((target("avx2")))
+void vex_test ()
+{
+
+  register tmp_u *tdst __asm__("%r16");
+  register tmp_u *src1 __asm__("%r17");
+  register tmp_u *src2 __asm__("%r18");
+  
+  src1->yi[1] = _mm256_hadd_epi16 (tdst->yi[2], src2->yi[3]);
+  src1->yi[2] = _mm256_hadd_epi32 (tdst->yi[0], src2->yi[1]);
+  tdst->yi[3] = _mm256_hadds_epi16 (src1->yi[1], src2->yi[2]);
+  tdst->yi[0] = _mm256_hsub_epi16 (src1->yi[3], src2->yi[0]);
+  tdst->yi[1] = _mm256_hsub_epi32 (src1->yi[0], src2->yi[1]);
+  tdst->yi[2] = _mm256_hsubs_epi16 (src1->yi[2], src2->yi[3]);
+
+  src1->yi[2] = _mm256_cmpeq_epi64 (tdst->yi[1], src2->yi[2]);
+  src1->yi[1] = _mm256_cmpgt_epi64 (tdst->yi[3], src2->yi[0]);
+
+  tdst->yf[2] = _mm256_dp_ps (src1->yf[0], src2->yf[1], 0xbf);
+  tdst->xd[3] = _mm_dp_pd (src1->xd[0], src2->xd[1], 0xbf);
+
+  tdst->yi[3] = _mm256_mpsadbw_epu8 (src1->yi[1], src2->yi[1], 0xc1);
+
+  tdst->yi[0] = _mm256_blend_epi16 (src1->yi[1], src2->yi[2], 0xc);
+  tdst->yi[1] = _mm256_blendv_epi8 (src1->yi[1], src2->yi[2], tdst->yi[0]);
+  tdst->yf[2] = _mm256_blend_ps (src1->yf[0], src2->yf[1], 0x4);
+  tdst->yf[3] = _mm256_blendv_ps (src1->yf[2], src2->yf[3], tdst->yf[1]);
+  tdst->yd[3] = _mm256_blend_pd (tdst->yd[1], src1->yd[0], 0x1);
+  tdst->yd[1] = _mm256_blendv_pd (src1->yd[2], src2->yd[3], tdst->yd[2]);
+
+  tdst->yi[2] = _mm256_sign_epi8 (src1->yi[0], src2->yi[1]);
+  tdst->yi[3] = _mm256_sign_epi16 (src1->yi[2], src2->yi[3]);
+  tdst->yi[0] = _mm256_sign_epi32 (src1->yi[0], src2->yi[1]);
+}
+
+/* { dg-final { scan-assembler-not "v?pcmpeqq\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?pcmpgtq\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?phaddw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?phaddd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?phaddsw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?phsubw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?phsubd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?phsubsw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?dpps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?dppd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?psadbw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?pblendw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?pblendvb\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?blendps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?blendvps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?blendpd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?blendvpd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?psignb\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?psignw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
+/* { dg-final { scan-assembler-not "v?psignd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
-- 
2.31.1


  parent reply	other threads:[~2023-08-31  8:20 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-31  8:20 [PATCH 00/13] [RFC] Support Intel APX EGPR Hongyu Wang
2023-08-31  8:20 ` [PATCH 01/13] [APX EGPR] middle-end: Add insn argument to base_reg_class Hongyu Wang
2023-08-31 10:15   ` Uros Bizjak
2023-09-01  9:07     ` Hongyu Wang
2023-09-06 19:43       ` Vladimir Makarov
2023-09-07  6:23         ` Uros Bizjak
2023-09-07 12:13           ` Vladimir Makarov
2023-09-08 17:03   ` Vladimir Makarov
2023-09-10  4:49     ` Hongyu Wang
2023-09-14 12:09       ` Vladimir Makarov
2023-08-31  8:20 ` [PATCH 02/13] [APX EGPR] middle-end: Add index_reg_class with insn argument Hongyu Wang
2023-08-31  8:20 ` [PATCH 03/13] [APX_EGPR] Initial support for APX_F Hongyu Wang
2023-08-31  8:20 ` [PATCH 04/13] [APX EGPR] Add 16 new integer general purpose registers Hongyu Wang
2023-08-31  8:20 ` [PATCH 05/13] [APX EGPR] Add register and memory constraints that disallow EGPR Hongyu Wang
2023-08-31  8:20 ` [PATCH 06/13] [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint Hongyu Wang
2023-08-31  9:17   ` Jakub Jelinek
2023-08-31 10:00     ` Uros Bizjak
2023-09-01  9:04       ` Hongyu Wang
2023-09-01  9:38         ` Uros Bizjak
2023-09-01 10:35           ` Hongtao Liu
2023-09-01 11:27             ` Uros Bizjak
2023-09-04  0:28               ` Hongtao Liu
2023-09-04  8:57                 ` Uros Bizjak
2023-09-04  9:10                   ` Hongtao Liu
2023-09-01 11:03       ` Richard Sandiford
2023-09-04  1:03         ` Hongtao Liu
2023-09-01  9:04     ` Hongyu Wang
2023-08-31  8:20 ` [PATCH 07/13] [APX EGPR] Add backend hook for base_reg_class/index_reg_class Hongyu Wang
2023-08-31  8:20 ` [PATCH 08/13] [APX EGPR] Handle GPR16 only vector move insns Hongyu Wang
2023-08-31  9:43   ` Jakub Jelinek
2023-09-01  9:07     ` Hongyu Wang
2023-09-01  9:20       ` Jakub Jelinek
2023-09-01 11:34         ` Hongyu Wang
2023-09-01 11:41           ` Jakub Jelinek
2023-08-31  8:20 ` [PATCH 09/13] [APX EGPR] Handle legacy insn that only support GPR16 (1/5) Hongyu Wang
2023-08-31 10:06   ` Uros Bizjak
2023-08-31  8:20 ` Hongyu Wang [this message]
2023-08-31  8:20 ` [PATCH 11/13] [APX EGPR] Handle legacy insns that only support GPR16 (3/5) Hongyu Wang
2023-08-31  9:26   ` Richard Biener
2023-08-31  9:28     ` Richard Biener
2023-09-01  9:03       ` Hongyu Wang
2023-09-01 10:38       ` Hongtao Liu
2023-08-31  9:31     ` Jakub Jelinek
2023-08-31  8:20 ` [PATCH 12/13] [APX_EGPR] Handle legacy insns that only support GPR16 (4/5) Hongyu Wang
2023-08-31  8:20 ` [PATCH 13/13] [APX EGPR] Handle vex insns that only support GPR16 (5/5) Hongyu Wang
2023-08-31  9:19 ` [PATCH 00/13] [RFC] Support Intel APX EGPR Richard Biener
2023-09-01  8:55   ` Hongyu Wang
2023-09-22 10:56 [PATCH v2 00/13] " Hongyu Wang
2023-09-22 10:56 ` [PATCH 10/13] [APX EGPR] Handle legacy insns that only support GPR16 (2/5) Hongyu Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230831082024.314097-11-hongyu.wang@intel.com \
    --to=hongyu.wang@intel.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hongtao.liu@intel.com \
    --cc=hubicka@ucw.cz \
    --cc=jakub@redhat.com \
    --cc=lingling.kong@intel.com \
    --cc=ubizjak@gmail.com \
    --cc=vmakarov@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).