From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jakub@sourceware.org>
Received: by sourceware.org (Postfix, from userid 2153)
	id 646113858C39; Tue,  9 Apr 2024 10:41:25 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 646113858C39
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1712659285;
	bh=60h/ioCv1Fr85WZJu9/fiTk1YI8LZ+R0xCVAvk9u/Jo=;
	h=From:To:Subject:Date:From;
	b=L07etlZEPD6rC7hbsB2vzDjcnHUQeUuNWxMy/WXdjq2FY1GKv1kE/sNOSP+TxMGUB
	 f24rq6+PxAfRjOiJBUa0cbomGNNFpt8r/iUJbutu9ngCOuU/GRLQkysGnCP+Rl/p14
	 vDZ2dJzINSed6g88f7+IVoQp5Jxrmajw76BUjmtY=
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: Jakub Jelinek <jakub@gcc.gnu.org>
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r14-9869] i386: Fix aes/vaes patterns [PR114576]
X-Act-Checkin: gcc
X-Git-Author: Jakub Jelinek <jakub@redhat.com>
X-Git-Refname: refs/heads/master
X-Git-Oldrev: 897a241ddde53eae912c612a1623c84ff4dfe339
X-Git-Newrev: a79d13a01f8cbb99fb45bf3f3ffc62c99ee0b05e
Message-Id: <20240409104125.646113858C39@sourceware.org>
Date: Tue,  9 Apr 2024 10:41:25 +0000 (GMT)
List-Id: <gcc-cvs.sourceware.org>

https://gcc.gnu.org/g:a79d13a01f8cbb99fb45bf3f3ffc62c99ee0b05e

commit r14-9869-ga79d13a01f8cbb99fb45bf3f3ffc62c99ee0b05e
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Apr 9 12:35:18 2024 +0200

    i386: Fix aes/vaes patterns [PR114576]
    
    On Wed, Apr 19, 2023 at 02:40:59AM +0000, Jiang, Haochen via Gcc-patches wrote:
    > > >  (define_insn "aesenc"
    > > > -  [(set (match_operand:V2DI 0 "register_operand" "=x,x")
    > > > -       (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x")
    > > > -                      (match_operand:V2DI 2 "vector_operand" "xBm,xm")]
    > > > +  [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
    > > > +       (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
    > > > +                      (match_operand:V2DI 2 "vector_operand"
    > > > + "xBm,xm,vm")]
    > > >                       UNSPEC_AESENC))]
    > > > -  "TARGET_AES"
    > > > +  "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
    > > >    "@
    > > >     aesenc\t{%2, %0|%0, %2}
    > > > +   vaesenc\t{%2, %1, %0|%0, %1, %2}
    > > >     vaesenc\t{%2, %1, %0|%0, %1, %2}"
    > > > -  [(set_attr "isa" "noavx,avx")
    > > > +  [(set_attr "isa" "noavx,aes,avx512vl")
    > > Shouldn't it be vaes_avx512vl and then remove " || (TARGET_VAES &&
    > > TARGET_AVX512VL)" from condition.
    >
    > Since VAES should not imply AES, we need that "|| (TARGET_VAES &&
    > TARGET_AVX512VL)"
    >
    > And there is no need to add vaes_avx512vl since the last alternative will only
    > be hit when there is no aes. When there is no aes, the pattern will need vaes
    > and avx512vl both or we could not use this pattern. avx512vl here is just like
    > a placeholder.
    
    As the following testcase shows, the above change was incorrect.
    
    Using aes isa for the second alternative is obviously wrong, aes is enabled
    whenever -maes is, regardless of -mavx or -mno-avx, so the above change
    means that for -maes -mno-avx RA can choose, either it matches the first
    alternative with the dup operand, or it matches the second one (but that
    is of course wrong because vaesenc VEX encoded insn needs AES & AVX CPUID).
    
    The big question is if "Since VAES should not imply AES" is the case or not.
    Looking around at what LLVM does on godbolt, seems since clang 6 which added
    -mvaes support -mvaes there implies -maes, but GCC treats those two
    independent.
    
    Now, if we'd take the LLVM path of making -mvaes imply -maes and -mno-aes
    imply -mno-vaes, then we should probably just revert the above patch and
    tweak common/config/i386/ to do the implications (+ add the testcase from
    this patch).
    
    If we keep the current behavior, where AES and VAES are completely
    independent extensions, then we need to do more changes as the following
    patch attempts to do.
    We should use the aesenc etc. insns for noavx as before, we know at that
    point that TARGET_AES must be true because (TARGET_VAES && TARGET_AVX512VL)
    won't be true when !TARGET_AVX - TARGET_AVX512VL implies TARGET_AVX.
    For the second alternative, i.e. the AVX AES VEX or VAES AVX512F EVEX case
    without using %xmm16+/EGPR regs, the patch uses avx isa, but we need to
    emit {evex} prefix in the assembly if AES ISA is not enabled.
    For the last alternative, we need to use a new vaes_avx512vl isa attribute,
    because the %xmm16+/EGPR support is there only if both VAES and AVX512VL
    is enabled, not just AVX and AES.
    Still, I wonder if -mvaes shouldn't imply at least -mavx512f and
    -mno-avx512f shouldn't imply -mno-vaes, because otherwise can't see how
    it could use 512-bit registers (this part not done in the patch).
    
    2024-04-09  Jakub Jelinek  <jakub@redhat.com>
    
            PR target/114576
            * config/i386/i386.md (isa): Remove aes, add vaes_avx512vl.
            (enabled): Remove aes isa check, add vaes_avx512vl.
            * config/i386/sse.md (aesenc, aesenclast, aesdec, aesdeclast): Use
            jm instead of m for second alternative and emit {evex} prefix
            for it if !TARGET_AES.  Use noavx,avx,vaes_avx512vl isa attribute.
            (vaesdec_<mode>, vaesdeclast_<mode>, vaesenc_<mode>,
            vaesenclast_<mode>): Add second alternative with x instead of v
            and jm instead of m.
    
            * gcc.target/i386/aes-pr114576.c: New test.

Diff:
---
 gcc/config/i386/i386.md                      |  8 ++-
 gcc/config/i386/sse.md                       | 90 ++++++++++++++++------------
 gcc/testsuite/gcc.target/i386/aes-pr114576.c | 63 +++++++++++++++++++
 3 files changed, 121 insertions(+), 40 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 10ae3113ae8..d4ce3809e6d 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -568,13 +568,14 @@
 
 ;; Used to control the "enabled" attribute on a per-instruction basis.
 (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx,
-		    x64_avx,x64_avx512bw,x64_avx512dq,aes,apx_ndd,
+		    x64_avx,x64_avx512bw,x64_avx512dq,apx_ndd,
 		    sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx,
 		    avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,avx512f_512,
 		    noavx512f,avx512bw,avx512bw_512,noavx512bw,avx512dq,
 		    noavx512dq,fma_or_avx512vl,avx512vl,noavx512vl,avxvnni,
 		    avx512vnnivl,avx512fp16,avxifma,avx512ifmavl,avxneconvert,
-		    avx512bf16vl,vpclmulqdqvl,avx_noavx512f,avx_noavx512vl"
+		    avx512bf16vl,vpclmulqdqvl,avx_noavx512f,avx_noavx512vl,
+		    vaes_avx512vl"
   (const_string "base"))
 
 ;; The (bounding maximum) length of an instruction immediate.
@@ -915,7 +916,6 @@
 	   (symbol_ref "TARGET_64BIT && TARGET_AVX512BW")
 	 (eq_attr "isa" "x64_avx512dq")
 	   (symbol_ref "TARGET_64BIT && TARGET_AVX512DQ")
-	 (eq_attr "isa" "aes") (symbol_ref "TARGET_AES")
 	 (eq_attr "isa" "sse_noavx")
 	   (symbol_ref "TARGET_SSE && !TARGET_AVX")
 	 (eq_attr "isa" "sse2") (symbol_ref "TARGET_SSE2")
@@ -968,6 +968,8 @@
 	   (symbol_ref "TARGET_VPCLMULQDQ && TARGET_AVX512VL")
 	 (eq_attr "isa" "apx_ndd")
 	   (symbol_ref "TARGET_APX_NDD")
+	 (eq_attr "isa" "vaes_avx512vl")
+	   (symbol_ref "TARGET_VAES && TARGET_AVX512VL")
 
 	 (eq_attr "mmx_isa" "native")
 	   (symbol_ref "!TARGET_MMX_WITH_SSE")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 3286d3a4fac..03aa249b842 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -26279,72 +26279,72 @@
 (define_insn "aesenc"
   [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
 	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
-		       (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")]
+		       (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")]
 		      UNSPEC_AESENC))]
   "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
   "@
    aesenc\t{%2, %0|%0, %2}
-   vaesenc\t{%2, %1, %0|%0, %1, %2}
+   * return TARGET_AES ? \"vaesenc\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}\";
    vaesenc\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "noavx,aes,avx512vl")
+  [(set_attr "isa" "noavx,avx,vaes_avx512vl")
    (set_attr "type" "sselog1")
    (set_attr "addr" "gpr16,*,*")
    (set_attr "prefix_extra" "1")
-   (set_attr "prefix" "orig,vex,evex")
+   (set_attr "prefix" "orig,maybe_evex,evex")
    (set_attr "btver2_decode" "double,double,double")
    (set_attr "mode" "TI")])
 
 (define_insn "aesenclast"
   [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
 	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
-		       (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")]
+		       (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")]
 		      UNSPEC_AESENCLAST))]
   "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
   "@
    aesenclast\t{%2, %0|%0, %2}
-   vaesenclast\t{%2, %1, %0|%0, %1, %2}
+   * return TARGET_AES ? \"vaesenclast\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesenclast\t{%2, %1, %0|%0, %1, %2}\";
    vaesenclast\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "noavx,aes,avx512vl")
+  [(set_attr "isa" "noavx,avx,vaes_avx512vl")
    (set_attr "type" "sselog1")
    (set_attr "addr" "gpr16,*,*")
    (set_attr "prefix_extra" "1")
-   (set_attr "prefix" "orig,vex,evex")
-   (set_attr "btver2_decode" "double,double,double") 
+   (set_attr "prefix" "orig,maybe_evex,evex")
+   (set_attr "btver2_decode" "double,double,double")
    (set_attr "mode" "TI")])
 
 (define_insn "aesdec"
   [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
 	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
-		       (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")]
+		       (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")]
 		      UNSPEC_AESDEC))]
   "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
   "@
    aesdec\t{%2, %0|%0, %2}
-   vaesdec\t{%2, %1, %0|%0, %1, %2}
+   * return TARGET_AES ? \"vaesdec\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesdec\t{%2, %1, %0|%0, %1, %2}\";
    vaesdec\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "noavx,aes,avx512vl")
+  [(set_attr "isa" "noavx,avx,vaes_avx512vl")
    (set_attr "type" "sselog1")
    (set_attr "addr" "gpr16,*,*")
    (set_attr "prefix_extra" "1")
-   (set_attr "prefix" "orig,vex,evex")
+   (set_attr "prefix" "orig,maybe_evex,evex")
    (set_attr "btver2_decode" "double,double,double") 
    (set_attr "mode" "TI")])
 
 (define_insn "aesdeclast"
   [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
 	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
-		       (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")]
+		       (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")]
 		      UNSPEC_AESDECLAST))]
   "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
   "@
    aesdeclast\t{%2, %0|%0, %2}
-   vaesdeclast\t{%2, %1, %0|%0, %1, %2}
+   * return TARGET_AES ? \"vaesdeclast\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesdeclast\t{%2, %1, %0|%0, %1, %2}\";
    vaesdeclast\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "noavx,aes,avx512vl")
+  [(set_attr "isa" "noavx,avx,vaes_avx512vl")
    (set_attr "addr" "gpr16,*,*")
    (set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
-   (set_attr "prefix" "orig,vex,evex")
+   (set_attr "prefix" "orig,maybe_evex,evex")
    (set_attr "btver2_decode" "double,double,double")
    (set_attr "mode" "TI")])
 
@@ -30246,44 +30246,60 @@
    [(set_attr ("prefix") ("evex"))])
 
 (define_insn "vaesdec_<mode>"
-  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v")
+  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v")
 	(unspec:VI1_AVX512VL_F
-	  [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v")
-	   (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")]
+	  [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v")
+	   (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")]
 	  UNSPEC_VAESDEC))]
   "TARGET_VAES"
-  "vaesdec\t{%2, %1, %0|%0, %1, %2}"
-)
+{
+  if (which_alternative == 0 && <MODE>mode == V16QImode)
+    return "%{evex%} vaesdec\t{%2, %1, %0|%0, %1, %2}";
+  else
+    return "vaesdec\t{%2, %1, %0|%0, %1, %2}";
+})
 
 (define_insn "vaesdeclast_<mode>"
-  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v")
+  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v")
 	(unspec:VI1_AVX512VL_F
-	  [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v")
-	   (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")]
+	  [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v")
+	   (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")]
 	  UNSPEC_VAESDECLAST))]
   "TARGET_VAES"
-  "vaesdeclast\t{%2, %1, %0|%0, %1, %2}"
-)
+{
+  if (which_alternative == 0 && <MODE>mode == V16QImode)
+    return "%{evex%} vaesdeclast\t{%2, %1, %0|%0, %1, %2}";
+  else
+    return "vaesdeclast\t{%2, %1, %0|%0, %1, %2}";
+})
 
 (define_insn "vaesenc_<mode>"
-  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v")
+  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v")
 	(unspec:VI1_AVX512VL_F
-	  [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v")
-	   (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")]
+	  [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v")
+	   (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")]
 	  UNSPEC_VAESENC))]
   "TARGET_VAES"
-  "vaesenc\t{%2, %1, %0|%0, %1, %2}"
-)
+{
+  if (which_alternative == 0 && <MODE>mode == V16QImode)
+    return "%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}";
+  else
+    return "vaesenc\t{%2, %1, %0|%0, %1, %2}";
+})
 
 (define_insn "vaesenclast_<mode>"
-  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v")
+  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v")
 	(unspec:VI1_AVX512VL_F
-	  [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v")
-	   (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")]
+	  [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v")
+	   (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")]
 	  UNSPEC_VAESENCLAST))]
   "TARGET_VAES"
-  "vaesenclast\t{%2, %1, %0|%0, %1, %2}"
-)
+{
+  if (which_alternative == 0 && <MODE>mode == V16QImode)
+    return "%{evex%} vaesenclast\t{%2, %1, %0|%0, %1, %2}";
+  else
+    return "vaesenclast\t{%2, %1, %0|%0, %1, %2}";
+})
 
 (define_insn "vpclmulqdq_<mode>"
   [(set (match_operand:VI8_FVL 0 "register_operand" "=v")
diff --git a/gcc/testsuite/gcc.target/i386/aes-pr114576.c b/gcc/testsuite/gcc.target/i386/aes-pr114576.c
new file mode 100644
index 00000000000..423125aff19
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/aes-pr114576.c
@@ -0,0 +1,63 @@
+/* PR target/114576 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -maes -mno-avx" } */
+/* { dg-final { scan-assembler-times "\taesenc\t" 2 } } */
+/* { dg-final { scan-assembler-times "\taesdec\t" 2 } } */
+/* { dg-final { scan-assembler-times "\taesenclast\t" 2 } } */
+/* { dg-final { scan-assembler-times "\taesdeclast\t" 2 } } */
+/* { dg-final { scan-assembler-not "\tvaesenc" } } */
+/* { dg-final { scan-assembler-not "\tvaesdec" } } */
+
+#include <immintrin.h>
+
+__m128i
+f1 (__m128i x, __m128i y)
+{
+  return _mm_aesenc_si128 (x, y);
+}
+
+__m128i
+f2 (__m128i x, __m128i y)
+{
+  __m128i z = _mm_aesenc_si128 (x, y);
+  return z + x + y;
+}
+
+__m128i
+f3 (__m128i x, __m128i y)
+{
+  return _mm_aesdec_si128 (x, y);
+}
+
+__m128i
+f4 (__m128i x, __m128i y)
+{
+  __m128i z = _mm_aesdec_si128 (x, y);
+  return z + x + y;
+}
+
+__m128i
+f5 (__m128i x, __m128i y)
+{
+  return _mm_aesenclast_si128 (x, y);
+}
+
+__m128i
+f6 (__m128i x, __m128i y)
+{
+  __m128i z = _mm_aesenclast_si128 (x, y);
+  return z + x + y;
+}
+
+__m128i
+f7 (__m128i x, __m128i y)
+{
+  return _mm_aesdeclast_si128 (x, y);
+}
+
+__m128i
+f8 (__m128i x, __m128i y)
+{
+  __m128i z = _mm_aesdeclast_si128 (x, y);
+  return z + x + y;
+}