From mboxrd@z Thu Jan 1 00:00:00 1970
References: <20230418071851.4192579-1-haochen.jiang@intel.com>
In-Reply-To: <20230418071851.4192579-1-haochen.jiang@intel.com>
From: Hongtao Liu
Date: Wed, 19 Apr 2023 10:31:33 +0800
Subject: Re: [PATCH] i386: Share AES xmm intrin with VAES
To: Haochen Jiang
Cc: gcc-patches@gcc.gnu.org, hongtao.liu@intel.com, ubizjak@gmail.com

On Tue, Apr 18, 2023 at 3:19 PM Haochen Jiang via Gcc-patches wrote:
>
> Hi all,
>
> Currently in GCC, the 128 bit intrin for instruction vaes{enc,dec}{last,}
> is under AES ISA. Because there is no dependency between ISA set AES
> and VAES, the 128 bit intrin is not available when we use compiler flag
> -mvaes -mavx512vl and there is no other way to use that intrin. But it
> should be, according to the Intel SDM.
>
> Although VAES aims to be a VEX/EVEX promotion for AES, it is only part
> of it. Therefore, we share the AES xmm intrin with VAES.
>
> Also, since -mvaes indicates that we could use VEX encoding for ymm, we
> should imply AVX for VAES.
>
> Tested on x86_64-pc-linux-gnu. Ok for trunk?
>
> BRs,
> Haochen
>
> gcc/ChangeLog:
>
>         * common/config/i386/i386-common.cc
>         (OPTION_MASK_ISA2_AVX_UNSET): Add OPTION_MASK_ISA2_VAES_UNSET.
>         (ix86_handle_option): Set AVX flag for VAES.
>         * config/i386/i386-builtins.cc (ix86_init_mmx_sse_builtins):
>         Add OPTION_MASK_ISA2_VAES_UNSET.
>         (def_builtin): Share builtin between AES and VAES.
>         * config/i386/i386-expand.cc (ix86_check_builtin_isa_match):
>         Ditto.
>         * config/i386/i386.md (aes): New isa attribute.
>         * config/i386/sse.md (aesenc): Add pattern for VAES with xmm.
>         (aesenclast): Ditto.
>         (aesdec): Ditto.
>         (aesdeclast): Ditto.
>         * config/i386/vaesintrin.h: Remove redundant avx target push.
>         * config/i386/wmmintrin.h (_mm_aesdec_si128): Change to macro.
>         (_mm_aesdeclast_si128): Ditto.
>         (_mm_aesenc_si128): Ditto.
>         (_mm_aesenclast_si128): Ditto.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/i386/avx512fvl-vaes-1.c: Add VAES xmm test.
>         * gcc.target/i386/pr84335.c: Modify error message.
> ---
>  gcc/common/config/i386/i386-common.cc         |  5 +-
>  gcc/config/i386/i386-builtins.cc              | 21 ++++---
>  gcc/config/i386/i386-expand.cc                |  1 +
>  gcc/config/i386/i386.md                       |  3 +-
>  gcc/config/i386/sse.md                        | 60 ++++++++++---------
>  gcc/config/i386/vaesintrin.h                  |  4 +-
>  gcc/config/i386/wmmintrin.h                   | 29 +++------
>  .../gcc.target/i386/avx512fvl-vaes-1.c        | 11 ++++
>  gcc/testsuite/gcc.target/i386/pr84335.c       |  4 +-
>  9 files changed, 75 insertions(+), 63 deletions(-)
>
> diff --git a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc
> index c7954da8e34..bf126f14073 100644
> --- a/gcc/common/config/i386/i386-common.cc
> +++ b/gcc/common/config/i386/i386-common.cc
> @@ -348,7 +348,8 @@ along with GCC; see the file COPYING3.  If not see
>     | OPTION_MASK_ISA2_AVX512VP2INTERSECT_UNSET)
>  #define OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET \
>    OPTION_MASK_ISA2_SSE_UNSET
> -#define OPTION_MASK_ISA2_AVX_UNSET OPTION_MASK_ISA2_AVX2_UNSET
> +#define OPTION_MASK_ISA2_AVX_UNSET \
> +  (OPTION_MASK_ISA2_AVX2_UNSET | OPTION_MASK_ISA2_VAES_UNSET)
>  #define OPTION_MASK_ISA2_SSE4_2_UNSET OPTION_MASK_ISA2_AVX_UNSET
>  #define OPTION_MASK_ISA2_SSE4_1_UNSET OPTION_MASK_ISA2_SSE4_2_UNSET
>  #define OPTION_MASK_ISA2_SSE4_UNSET OPTION_MASK_ISA2_SSE4_1_UNSET
> @@ -685,6 +686,8 @@ ix86_handle_option (struct gcc_options *opts,
>         {
>           opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_VAES_SET;
>           opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_VAES_SET;
> +         opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX_SET;
> +         opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX_SET;
>         }
>        else
>         {
> diff --git a/gcc/config/i386/i386-builtins.cc b/gcc/config/i386/i386-builtins.cc
> index fc0c82b156e..28f404da288 100644
> --- a/gcc/config/i386/i386-builtins.cc
> +++ b/gcc/config/i386/i386-builtins.cc
> @@ -279,14 +279,15 @@ def_builtin (HOST_WIDE_INT mask, HOST_WIDE_INT mask2,
>        if (((mask2 == 0 || (mask2 & ix86_isa_flags2) != 0)
>            && (mask == 0 || (mask & ix86_isa_flags) != 0))
>           || ((mask & OPTION_MASK_ISA_MMX) != 0 && TARGET_MMX_WITH_SSE)
> -         /* "Unified" builtin used by either AVXVNNI/AVXIFMA intrinsics
> -            or AVX512VNNIVL/AVX512IFMAVL non-mask intrinsics should be
> -            defined whenever avxvnni/avxifma or avx512vnni/avxifma &&
> -            avx512vl exist.  */
> +         /* "Unified" builtin used by either AVXVNNI/AVXIFMA/AES intrinsics
> +            or AVX512VNNIVL/AVX512IFMAVL/VAESVL non-mask intrinsics should be
> +            defined whenever avxvnni/avxifma/aes or avx512vnni/avx512ifma/vaes
> +            && avx512vl exist.  */
>           || (mask2 == OPTION_MASK_ISA2_AVXVNNI)
>           || (mask2 == OPTION_MASK_ISA2_AVXIFMA)
>           || (mask2 == (OPTION_MASK_ISA2_AVXNECONVERT
>                         | OPTION_MASK_ISA2_AVX512BF16))
> +         || ((mask2 & OPTION_MASK_ISA2_VAES) != 0)
>           || (lang_hooks.builtin_function
>               == lang_hooks.builtin_function_ext_scope))
>         {
> @@ -661,16 +662,20 @@ ix86_init_mmx_sse_builtins (void)
>                VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT);
>
>    /* AES */
> -  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2, 0,
> +  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2,
> +                    OPTION_MASK_ISA2_VAES,
>                      "__builtin_ia32_aesenc128",
>                      V2DI_FTYPE_V2DI_V2DI, IX86_BUILTIN_AESENC128);
> -  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2, 0,
> +  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2,
> +                    OPTION_MASK_ISA2_VAES,
>                      "__builtin_ia32_aesenclast128",
>                      V2DI_FTYPE_V2DI_V2DI, IX86_BUILTIN_AESENCLAST128);
> -  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2, 0,
> +  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2,
> +                    OPTION_MASK_ISA2_VAES,
>                      "__builtin_ia32_aesdec128",
>                      V2DI_FTYPE_V2DI_V2DI, IX86_BUILTIN_AESDEC128);
> -  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2, 0,
> +  def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2,
> +                    OPTION_MASK_ISA2_VAES,
>                      "__builtin_ia32_aesdeclast128",
>                      V2DI_FTYPE_V2DI_V2DI, IX86_BUILTIN_AESDECLAST128);
>    def_builtin_const (OPTION_MASK_ISA_AES | OPTION_MASK_ISA_SSE2, 0,
> diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> index 54d5dfae677..28574a5809b 100644
> --- a/gcc/config/i386/i386-expand.cc
> +++ b/gcc/config/i386/i386-expand.cc
> @@ -12624,6 +12624,7 @@ ix86_check_builtin_isa_match (unsigned int fcode,
>                  OPTION_MASK_ISA2_AVXIFMA);
>    SHARE_BUILTIN (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512BF16, 0,
>                  OPTION_MASK_ISA2_AVXNECONVERT);
> +  SHARE_BUILTIN (OPTION_MASK_ISA_AES, 0, 0, OPTION_MASK_ISA2_VAES);
>    isa = tmp_isa;
>    isa2 = tmp_isa2;
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index acc994226e7..15c366cb595 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -836,7 +836,7 @@
>
>  ;; Used to control the "enabled" attribute on a per-instruction basis.
>  (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx,
> -                   x64_avx,x64_avx512bw,x64_avx512dq,
> +                   x64_avx,x64_avx512bw,x64_avx512dq,aes,
>                     sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx,
>                     avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f,
>                     avx512bw,noavx512bw,avx512dq,noavx512dq,fma_or_avx512vl,
> @@ -863,6 +863,7 @@
>            (symbol_ref "TARGET_64BIT && TARGET_AVX512BW")
>          (eq_attr "isa" "x64_avx512dq")
>            (symbol_ref "TARGET_64BIT && TARGET_AVX512DQ")
> +        (eq_attr "isa" "aes") (symbol_ref "TARGET_AES")
>          (eq_attr "isa" "sse_noavx")
>            (symbol_ref "TARGET_SSE && !TARGET_AVX")
>          (eq_attr "isa" "sse2") (symbol_ref "TARGET_SSE2")
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 33e281901cf..e7d565a8389 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -25107,67 +25107,71 @@
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>
>  (define_insn "aesenc"
> -  [(set (match_operand:V2DI 0 "register_operand" "=x,x")
> -       (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x")
> -                      (match_operand:V2DI 2 "vector_operand" "xBm,xm")]
> +  [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
> +       (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
> +                      (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm")]
>                       UNSPEC_AESENC))]
> -  "TARGET_AES"
> +  "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
>    "@
>     aesenc\t{%2, %0|%0, %2}
> +   vaesenc\t{%2, %1, %0|%0, %1, %2}
>     vaesenc\t{%2, %1, %0|%0, %1, %2}"
> -  [(set_attr "isa" "noavx,avx")
> +  [(set_attr "isa" "noavx,aes,avx512vl")
Shouldn't it be vaes_avx512vl, and then remove
" || (TARGET_VAES && TARGET_AVX512VL)" from the condition?
Similarly for the patterns below.
Others LGTM.
>     (set_attr "type" "sselog1")
>     (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,vex")
> -   (set_attr "btver2_decode" "double,double")
> +   (set_attr "prefix" "orig,vex,evex")
> +   (set_attr "btver2_decode" "double,double,double")
>     (set_attr "mode" "TI")])
>
>  (define_insn "aesenclast"
> -  [(set (match_operand:V2DI 0 "register_operand" "=x,x")
> -       (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x")
> -                      (match_operand:V2DI 2 "vector_operand" "xBm,xm")]
> +  [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
> +       (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
> +                      (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm")]
>                       UNSPEC_AESENCLAST))]
> -  "TARGET_AES"
> +  "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
>    "@
>     aesenclast\t{%2, %0|%0, %2}
> +   vaesenclast\t{%2, %1, %0|%0, %1, %2}
>     vaesenclast\t{%2, %1, %0|%0, %1, %2}"
> -  [(set_attr "isa" "noavx,avx")
> +  [(set_attr "isa" "noavx,aes,avx512vl")
>     (set_attr "type" "sselog1")
>     (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,vex")
> -   (set_attr "btver2_decode" "double,double")
> +   (set_attr "prefix" "orig,vex,evex")
> +   (set_attr "btver2_decode" "double,double,double")
>     (set_attr "mode" "TI")])
>
>  (define_insn "aesdec"
> -  [(set (match_operand:V2DI 0 "register_operand" "=x,x")
> -       (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x")
> -                      (match_operand:V2DI 2 "vector_operand" "xBm,xm")]
> +  [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
> +       (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
> +                      (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm")]
>                       UNSPEC_AESDEC))]
> -  "TARGET_AES"
> +  "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
>    "@
>     aesdec\t{%2, %0|%0, %2}
> +   vaesdec\t{%2, %1, %0|%0, %1, %2}
>     vaesdec\t{%2, %1, %0|%0, %1, %2}"
> -  [(set_attr "isa" "noavx,avx")
> +  [(set_attr "isa" "noavx,aes,avx512vl")
>     (set_attr "type" "sselog1")
>     (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,vex")
> -   (set_attr "btver2_decode" "double,double")
> +   (set_attr "prefix" "orig,vex,evex")
> +   (set_attr "btver2_decode" "double,double,double")
>     (set_attr "mode" "TI")])
>
>  (define_insn "aesdeclast"
> -  [(set (match_operand:V2DI 0 "register_operand" "=x,x")
> -       (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x")
> -                      (match_operand:V2DI 2 "vector_operand" "xBm,xm")]
> +  [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
> +       (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
> +                      (match_operand:V2DI 2 "vector_operand" "xBm,xm,vm")]
>                       UNSPEC_AESDECLAST))]
> -  "TARGET_AES"
> +  "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
>    "@
>     aesdeclast\t{%2, %0|%0, %2}
> +   vaesdeclast\t{%2, %1, %0|%0, %1, %2}
>     vaesdeclast\t{%2, %1, %0|%0, %1, %2}"
> -  [(set_attr "isa" "noavx,avx")
> +  [(set_attr "isa" "noavx,aes,avx512vl")
>     (set_attr "type" "sselog1")
>     (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,vex")
> -   (set_attr "btver2_decode" "double,double")
> +   (set_attr "prefix" "orig,vex,evex")
> +   (set_attr "btver2_decode" "double,double,double")
>     (set_attr "mode" "TI")])
>
>  (define_insn "aesimc"
> diff --git a/gcc/config/i386/vaesintrin.h b/gcc/config/i386/vaesintrin.h
> index 0f1cffe71e9..58fc19c9eb3 100644
> --- a/gcc/config/i386/vaesintrin.h
> +++ b/gcc/config/i386/vaesintrin.h
> @@ -24,9 +24,9 @@
>  #ifndef __VAESINTRIN_H_INCLUDED
>  #define __VAESINTRIN_H_INCLUDED
>
> -#if !defined(__VAES__) || !defined(__AVX__)
> +#if !defined(__VAES__)
>  #pragma GCC push_options
> -#pragma GCC target("vaes,avx")
> +#pragma GCC target("vaes")
>  #define __DISABLE_VAES__
>  #endif /* __VAES__ */
>
> diff --git a/gcc/config/i386/wmmintrin.h b/gcc/config/i386/wmmintrin.h
> index ae15cea429e..da314dbd44d 100644
> --- a/gcc/config/i386/wmmintrin.h
> +++ b/gcc/config/i386/wmmintrin.h
> @@ -40,36 +40,23 @@
>
>  /* Performs 1 round of AES decryption of the first m128i using
>     the second m128i as a round key.  */
> -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_mm_aesdec_si128 (__m128i __X, __m128i __Y)
> -{
> -  return (__m128i) __builtin_ia32_aesdec128 ((__v2di)__X, (__v2di)__Y);
> -}
> +#define _mm_aesdec_si128(X, Y) \
> +  (__m128i) __builtin_ia32_aesdec128 ((__v2di) (X), (__v2di) (Y))
>
>  /* Performs the last round of AES decryption of the first m128i
>     using the second m128i as a round key.  */
> -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_mm_aesdeclast_si128 (__m128i __X, __m128i __Y)
> -{
> -  return (__m128i) __builtin_ia32_aesdeclast128 ((__v2di)__X,
> -                                                (__v2di)__Y);
> -}
> +#define _mm_aesdeclast_si128(X, Y) \
> +  (__m128i) __builtin_ia32_aesdeclast128 ((__v2di) (X), (__v2di) (Y))
>
>  /* Performs 1 round of AES encryption of the first m128i using
>     the second m128i as a round key.  */
> -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_mm_aesenc_si128 (__m128i __X, __m128i __Y)
> -{
> -  return (__m128i) __builtin_ia32_aesenc128 ((__v2di)__X, (__v2di)__Y);
> -}
> +#define _mm_aesenc_si128(X, Y) \
> +  (__m128i) __builtin_ia32_aesenc128 ((__v2di) (X), (__v2di) (Y))
>
>  /* Performs the last round of AES encryption of the first m128i
>     using the second m128i as a round key.  */
> -extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_mm_aesenclast_si128 (__m128i __X, __m128i __Y)
> -{
> -  return (__m128i) __builtin_ia32_aesenclast128 ((__v2di)__X, (__v2di)__Y);
> -}
> +#define _mm_aesenclast_si128(X, Y) \
> +  (__m128i) __builtin_ia32_aesenclast128 ((__v2di) (X), (__v2di) (Y))
>
>  /* Performs the InverseMixColumn operation on the source m128i
>     and stores the result into m128i destination.  */
> diff --git a/gcc/testsuite/gcc.target/i386/avx512fvl-vaes-1.c b/gcc/testsuite/gcc.target/i386/avx512fvl-vaes-1.c
> index c65b570cd47..f35742ec98b 100644
> --- a/gcc/testsuite/gcc.target/i386/avx512fvl-vaes-1.c
> +++ b/gcc/testsuite/gcc.target/i386/avx512fvl-vaes-1.c
> @@ -10,10 +10,16 @@
>  /* { dg-final { scan-assembler-times "vaesenc\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\{\n\]*%ymm\[0-9\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
>  /* { dg-final { scan-assembler-times "vaesenclast\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\{\n\]*%ymm\[0-9\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
>
> +/* { dg-final { scan-assembler-times "vaesdec\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\{\n\]*%xmm\[0-9\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
> +/* { dg-final { scan-assembler-times "vaesdeclast\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\{\n\]*%xmm\[0-9\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
> +/* { dg-final { scan-assembler-times "vaesenc\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\{\n\]*%xmm\[0-9\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
> +/* { dg-final { scan-assembler-times "vaesenclast\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\{\n\]*%xmm\[0-9\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
> +
>  #include
>
>  volatile __m512i x,y;
>  volatile __m256i x256, y256;
> +volatile __m128i x128, y128;
>
>  void extern
>  avx512f_test (void)
> @@ -27,4 +33,9 @@ avx512f_test (void)
>    x256 = _mm256_aesdeclast_epi128 (x256, y256);
>    x256 = _mm256_aesenc_epi128 (x256, y256);
>    x256 = _mm256_aesenclast_epi128 (x256, y256);
> +
> +  x128 = _mm_aesdec_si128 (x128, y128);
> +  x128 = _mm_aesdeclast_si128 (x128, y128);
> +  x128 = _mm_aesenc_si128 (x128, y128);
> +  x128 = _mm_aesenclast_si128 (x128, y128);
>  }
> diff --git a/gcc/testsuite/gcc.target/i386/pr84335.c b/gcc/testsuite/gcc.target/i386/pr84335.c
> index c8d2a712f1f..5e45e2b322a 100644
> --- a/gcc/testsuite/gcc.target/i386/pr84335.c
> +++ b/gcc/testsuite/gcc.target/i386/pr84335.c
> @@ -6,5 +6,5 @@ typedef long long V __attribute__ ((__vector_size__ (16)));
>  V
>  foo (V *a, V *b)
>  {
> -  return __builtin_ia32_aesenc128 (*a, *b); /* { dg-error "needs isa option" } */
> -}
> +  return __builtin_ia32_aesenc128 (*a, *b); /* { dg-warning "implicit declaration of function" } */
> +}                                          /* { dg-error "incompatible types when returning type" "" { target *-*-* } .-1 } */
> --
> 2.31.1
>

--
BR,
Hongtao
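
For readers following the i386-expand.cc hunk: the added
SHARE_BUILTIN (OPTION_MASK_ISA_AES, 0, 0, OPTION_MASK_ISA2_VAES) line can
be read as an either-or ISA requirement for the shared 128-bit builtins.
The following is only a minimal sketch of that idea in plain C, not GCC's
code. The TOY_* bit values and toy_builtin_isa_match are hypothetical
names, and the either-or semantics are an assumption drawn from the
"Unified" builtin comment in the quoted patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments for illustration only; GCC generates the
   real OPTION_MASK_ISA_* and OPTION_MASK_ISA2_* values.  */
#define TOY_ISA_SSE2   (UINT64_C (1) << 0)
#define TOY_ISA_AES    (UINT64_C (1) << 1)
#define TOY_ISA2_VAES  (UINT64_C (1) << 0)

/* Toy model of an "either AES or VAES" requirement.  BISA/BISA2 are the
   masks a builtin was registered with; ISA/ISA2 are the enabled ISAs.  */
static bool
toy_builtin_isa_match (uint64_t bisa, uint64_t bisa2,
		       uint64_t isa, uint64_t isa2)
{
  /* Does the builtin list both AES and VAES, and is either one enabled?  */
  bool lists_both = (bisa & TOY_ISA_AES) != 0 && (bisa2 & TOY_ISA2_VAES) != 0;
  bool either_on = (isa & TOY_ISA_AES) != 0 || (isa2 & TOY_ISA2_VAES) != 0;

  /* If so, treat both requirements as satisfied.  */
  if (lists_both && either_on)
    {
      isa |= TOY_ISA_AES;
      isa2 |= TOY_ISA2_VAES;
    }

  /* Every remaining requirement (SSE2 here) must still be enabled.  */
  return (bisa & isa) == bisa && (bisa2 & isa2) == bisa2;
}
```

With a builtin registered as AES | SSE2 plus VAES, this toy check passes
when either AES or VAES is enabled alongside SSE2, and fails when neither
is enabled or when SSE2 is missing.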