From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 2153) id 646113858C39; Tue, 9 Apr 2024 10:41:25 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 646113858C39 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1712659285; bh=60h/ioCv1Fr85WZJu9/fiTk1YI8LZ+R0xCVAvk9u/Jo=; h=From:To:Subject:Date:From; b=L07etlZEPD6rC7hbsB2vzDjcnHUQeUuNWxMy/WXdjq2FY1GKv1kE/sNOSP+TxMGUB f24rq6+PxAfRjOiJBUa0cbomGNNFpt8r/iUJbutu9ngCOuU/GRLQkysGnCP+Rl/p14 vDZ2dJzINSed6g88f7+IVoQp5Jxrmajw76BUjmtY= MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="utf-8" From: Jakub Jelinek To: gcc-cvs@gcc.gnu.org Subject: [gcc r14-9869] i386: Fix aes/vaes patterns [PR114576] X-Act-Checkin: gcc X-Git-Author: Jakub Jelinek X-Git-Refname: refs/heads/master X-Git-Oldrev: 897a241ddde53eae912c612a1623c84ff4dfe339 X-Git-Newrev: a79d13a01f8cbb99fb45bf3f3ffc62c99ee0b05e Message-Id: <20240409104125.646113858C39@sourceware.org> Date: Tue, 9 Apr 2024 10:41:25 +0000 (GMT) List-Id: https://gcc.gnu.org/g:a79d13a01f8cbb99fb45bf3f3ffc62c99ee0b05e commit r14-9869-ga79d13a01f8cbb99fb45bf3f3ffc62c99ee0b05e Author: Jakub Jelinek Date: Tue Apr 9 12:35:18 2024 +0200 i386: Fix aes/vaes patterns [PR114576] On Wed, Apr 19, 2023 at 02:40:59AM +0000, Jiang, Haochen via Gcc-patches wrote: > > > (define_insn "aesenc" > > > - [(set (match_operand:V2DI 0 "register_operand" "=x,x") > > > - (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x") > > > - (match_operand:V2DI 2 "vector_operand" "xBm,xm")] > > > + [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") > > > + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") > > > + (match_operand:V2DI 2 "vector_operand" > > > + "xBm,xm,vm")] > > > UNSPEC_AESENC))] > > > - "TARGET_AES" > > > + "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" > > > "@ > > > aesenc\t{%2, %0|%0, %2} > > > + vaesenc\t{%2, %1, %0|%0, %1, %2} > > > vaesenc\t{%2, %1, %0|%0, %1, %2}" > > > - [(set_attr "isa" "noavx,avx") > > > + [(set_attr "isa" "noavx,aes,avx512vl") > > Shouldn't it be vaes_avx512vl and then remove " || (TARGET_VAES && > > TARGET_AVX512VL)" from condition. > > Since VAES should not imply AES, we need that "|| (TARGET_VAES && > TARGET_AVX512VL)" > > And there is no need to add vaes_avx512vl since the last alternative will only > be hit when there is no aes. When there is no aes, the pattern will need vaes > and avx512vl both or we could not use this pattern. avx512vl here is just like > a placeholder. As the following testcase shows, the above change was incorrect. Using aes isa for the second alternative is obviously wrong, aes is enabled whenever -maes is, regardless of -mavx or -mno-avx, so the above change means that for -maes -mno-avx RA can choose, either it matches the first alternative with the dup operand, or it matches the second one (but that is of course wrong because vaesenc VEX encoded insn needs AES & AVX CPUID). The big question is if "Since VAES should not imply AES" is the case or not. Looking around at what LLVM does on godbolt, seems since clang 6 which added -mvaes support -mvaes there implies -maes, but GCC treats those two independent. Now, if we'd take the LLVM path of making -mvaes imply -maes and -mno-aes imply -mno-vaes, then we should probably just revert the above patch and tweak common/config/i386/ to do the implications (+ add the testcase from this patch). If we keep the current behavior, where AES and VAES are completely independent extensions, then we need to do more changes as the following patch attempts to do. We should use the aesenc etc. insns for noavx as before, we know at that point that TARGET_AES must be true because (TARGET_VAES && TARGET_AVX512VL) won't be true when !TARGET_AVX - TARGET_AVX512VL implies TARGET_AVX. For the second alternative, i.e. the AVX AES VEX or VAES AVX512F EVEX case without using %xmm16+/EGPR regs, the patch uses avx isa, but we need to emit {evex} prefix in the assembly if AES ISA is not enabled. For the last alternative, we need to use a new vaes_avx512vl isa attribute, because the %xmm16+/EGPR support is there only if both VAES and AVX512VL is enabled, not just AVX and AES. Still, I wonder if -mvaes shouldn't imply at least -mavx512f and -mno-avx512f shouldn't imply -mno-vaes, because otherwise can't see how it could use 512-bit registers (this part not done in the patch). 2024-04-09 Jakub Jelinek PR target/114576 * config/i386/i386.md (isa): Remove aes, add vaes_avx512vl. (enabled): Remove aes isa check, add vaes_avx512vl. * config/i386/sse.md (aesenc, aesenclast, aesdec, aesdeclast): Use jm instead of m for second alternative and emit {evex} prefix for it if !TARGET_AES. Use noavx,avx,vaes_avx512vl isa attribute. (vaesdec_, vaesdeclast_, vaesenc_, vaesenclast_): Add second alternative with x instead of v and jm instead of m. * gcc.target/i386/aes-pr114576.c: New test. Diff: --- gcc/config/i386/i386.md | 8 ++- gcc/config/i386/sse.md | 90 ++++++++++++++++------------ gcc/testsuite/gcc.target/i386/aes-pr114576.c | 63 +++++++++++++++++++ 3 files changed, 121 insertions(+), 40 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 10ae3113ae8..d4ce3809e6d 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -568,13 +568,14 @@ ;; Used to control the "enabled" attribute on a per-instruction basis. (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx, - x64_avx,x64_avx512bw,x64_avx512dq,aes,apx_ndd, + x64_avx,x64_avx512bw,x64_avx512dq,apx_ndd, sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx, avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,avx512f_512, noavx512f,avx512bw,avx512bw_512,noavx512bw,avx512dq, noavx512dq,fma_or_avx512vl,avx512vl,noavx512vl,avxvnni, avx512vnnivl,avx512fp16,avxifma,avx512ifmavl,avxneconvert, - avx512bf16vl,vpclmulqdqvl,avx_noavx512f,avx_noavx512vl" + avx512bf16vl,vpclmulqdqvl,avx_noavx512f,avx_noavx512vl, + vaes_avx512vl" (const_string "base")) ;; The (bounding maximum) length of an instruction immediate. @@ -915,7 +916,6 @@ (symbol_ref "TARGET_64BIT && TARGET_AVX512BW") (eq_attr "isa" "x64_avx512dq") (symbol_ref "TARGET_64BIT && TARGET_AVX512DQ") - (eq_attr "isa" "aes") (symbol_ref "TARGET_AES") (eq_attr "isa" "sse_noavx") (symbol_ref "TARGET_SSE && !TARGET_AVX") (eq_attr "isa" "sse2") (symbol_ref "TARGET_SSE2") @@ -968,6 +968,8 @@ (symbol_ref "TARGET_VPCLMULQDQ && TARGET_AVX512VL") (eq_attr "isa" "apx_ndd") (symbol_ref "TARGET_APX_NDD") + (eq_attr "isa" "vaes_avx512vl") + (symbol_ref "TARGET_VAES && TARGET_AVX512VL") (eq_attr "mmx_isa" "native") (symbol_ref "!TARGET_MMX_WITH_SSE") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 3286d3a4fac..03aa249b842 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -26279,72 +26279,72 @@ (define_insn "aesenc" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")] UNSPEC_AESENC))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ aesenc\t{%2, %0|%0, %2} - vaesenc\t{%2, %1, %0|%0, %1, %2} + * return TARGET_AES ? \"vaesenc\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}\"; vaesenc\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,aes,avx512vl") + [(set_attr "isa" "noavx,avx,vaes_avx512vl") (set_attr "type" "sselog1") (set_attr "addr" "gpr16,*,*") (set_attr "prefix_extra" "1") - (set_attr "prefix" "orig,vex,evex") + (set_attr "prefix" "orig,maybe_evex,evex") (set_attr "btver2_decode" "double,double,double") (set_attr "mode" "TI")]) (define_insn "aesenclast" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")] UNSPEC_AESENCLAST))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ aesenclast\t{%2, %0|%0, %2} - vaesenclast\t{%2, %1, %0|%0, %1, %2} + * return TARGET_AES ? \"vaesenclast\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesenclast\t{%2, %1, %0|%0, %1, %2}\"; vaesenclast\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,aes,avx512vl") + [(set_attr "isa" "noavx,avx,vaes_avx512vl") (set_attr "type" "sselog1") (set_attr "addr" "gpr16,*,*") (set_attr "prefix_extra" "1") - (set_attr "prefix" "orig,vex,evex") - (set_attr "btver2_decode" "double,double,double") + (set_attr "prefix" "orig,maybe_evex,evex") + (set_attr "btver2_decode" "double,double,double") (set_attr "mode" "TI")]) (define_insn "aesdec" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")] UNSPEC_AESDEC))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ aesdec\t{%2, %0|%0, %2} - vaesdec\t{%2, %1, %0|%0, %1, %2} + * return TARGET_AES ? \"vaesdec\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesdec\t{%2, %1, %0|%0, %1, %2}\"; vaesdec\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,aes,avx512vl") + [(set_attr "isa" "noavx,avx,vaes_avx512vl") (set_attr "type" "sselog1") (set_attr "addr" "gpr16,*,*") (set_attr "prefix_extra" "1") - (set_attr "prefix" "orig,vex,evex") + (set_attr "prefix" "orig,maybe_evex,evex") (set_attr "btver2_decode" "double,double,double") (set_attr "mode" "TI")]) (define_insn "aesdeclast" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")] UNSPEC_AESDECLAST))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ aesdeclast\t{%2, %0|%0, %2} - vaesdeclast\t{%2, %1, %0|%0, %1, %2} + * return TARGET_AES ? \"vaesdeclast\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesdeclast\t{%2, %1, %0|%0, %1, %2}\"; vaesdeclast\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,aes,avx512vl") + [(set_attr "isa" "noavx,avx,vaes_avx512vl") (set_attr "addr" "gpr16,*,*") (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") - (set_attr "prefix" "orig,vex,evex") + (set_attr "prefix" "orig,maybe_evex,evex") (set_attr "btver2_decode" "double,double,double") (set_attr "mode" "TI")]) @@ -30246,44 +30246,60 @@ [(set_attr ("prefix") ("evex"))]) (define_insn "vaesdec_" - [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v") + [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v") (unspec:VI1_AVX512VL_F - [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v") - (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")] + [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v") + (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")] UNSPEC_VAESDEC))] "TARGET_VAES" - "vaesdec\t{%2, %1, %0|%0, %1, %2}" -) +{ + if (which_alternative == 0 && mode == V16QImode) + return "%{evex%} vaesdec\t{%2, %1, %0|%0, %1, %2}"; + else + return "vaesdec\t{%2, %1, %0|%0, %1, %2}"; +}) (define_insn "vaesdeclast_" - [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v") + [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v") (unspec:VI1_AVX512VL_F - [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v") - (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")] + [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v") + (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")] UNSPEC_VAESDECLAST))] "TARGET_VAES" - "vaesdeclast\t{%2, %1, %0|%0, %1, %2}" -) +{ + if (which_alternative == 0 && mode == V16QImode) + return "%{evex%} vaesdeclast\t{%2, %1, %0|%0, %1, %2}"; + else + return "vaesdeclast\t{%2, %1, %0|%0, %1, %2}"; +}) (define_insn "vaesenc_" - [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v") + [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v") (unspec:VI1_AVX512VL_F - [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v") - (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")] + [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v") + (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")] UNSPEC_VAESENC))] "TARGET_VAES" - "vaesenc\t{%2, %1, %0|%0, %1, %2}" -) +{ + if (which_alternative == 0 && mode == V16QImode) + return "%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}"; + else + return "vaesenc\t{%2, %1, %0|%0, %1, %2}"; +}) (define_insn "vaesenclast_" - [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v") + [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v") (unspec:VI1_AVX512VL_F - [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v") - (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")] + [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v") + (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")] UNSPEC_VAESENCLAST))] "TARGET_VAES" - "vaesenclast\t{%2, %1, %0|%0, %1, %2}" -) +{ + if (which_alternative == 0 && mode == V16QImode) + return "%{evex%} vaesenclast\t{%2, %1, %0|%0, %1, %2}"; + else + return "vaesenclast\t{%2, %1, %0|%0, %1, %2}"; +}) (define_insn "vpclmulqdq_" [(set (match_operand:VI8_FVL 0 "register_operand" "=v") diff --git a/gcc/testsuite/gcc.target/i386/aes-pr114576.c b/gcc/testsuite/gcc.target/i386/aes-pr114576.c new file mode 100644 index 00000000000..423125aff19 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/aes-pr114576.c @@ -0,0 +1,63 @@ +/* PR target/114576 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -maes -mno-avx" } */ +/* { dg-final { scan-assembler-times "\taesenc\t" 2 } } */ +/* { dg-final { scan-assembler-times "\taesdec\t" 2 } } */ +/* { dg-final { scan-assembler-times "\taesenclast\t" 2 } } */ +/* { dg-final { scan-assembler-times "\taesdeclast\t" 2 } } */ +/* { dg-final { scan-assembler-not "\tvaesenc" } } */ +/* { dg-final { scan-assembler-not "\tvaesdec" } } */ + +#include + +__m128i +f1 (__m128i x, __m128i y) +{ + return _mm_aesenc_si128 (x, y); +} + +__m128i +f2 (__m128i x, __m128i y) +{ + __m128i z = _mm_aesenc_si128 (x, y); + return z + x + y; +} + +__m128i +f3 (__m128i x, __m128i y) +{ + return _mm_aesdec_si128 (x, y); +} + +__m128i +f4 (__m128i x, __m128i y) +{ + __m128i z = _mm_aesdec_si128 (x, y); + return z + x + y; +} + +__m128i +f5 (__m128i x, __m128i y) +{ + return _mm_aesenclast_si128 (x, y); +} + +__m128i +f6 (__m128i x, __m128i y) +{ + __m128i z = _mm_aesenclast_si128 (x, y); + return z + x + y; +} + +__m128i +f7 (__m128i x, __m128i y) +{ + return _mm_aesdeclast_si128 (x, y); +} + +__m128i +f8 (__m128i x, __m128i y) +{ + __m128i z = _mm_aesdeclast_si128 (x, y); + return z + x + y; +}