From: Hongtao Liu
Date: Tue, 9 Apr 2024 18:32:14 +0800
Subject: Re: [PATCH] i386, v2: Fix aes/vaes patterns [PR114576]
To: Jakub Jelinek
Cc: "Jiang, Haochen", "gcc-patches@gcc.gnu.org", "Liu, Hongtao", "ubizjak@gmail.com"
References: <20230418071851.4192579-1-haochen.jiang@intel.com>

On Tue, Apr 9, 2024 at 5:18 PM Jakub Jelinek wrote:
>
> On Tue, Apr 09, 2024 at 11:23:40AM +0800, Hongtao Liu wrote:
> > I think we can merge alternative 2 with 3 to
> > * return TARGET_AES ? \"vaesenc\t{%2, %1, %0|%0, %1, %2}"\" :
> > \"%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}\";
> > Then it can handle vaes_avx512vl + -mno-aes case.
>
> Ok, done in the patch below.
>
> > > @@ -30246,44 +30250,60 @@ (define_insn "vpdpwssds__maskz_1"
> > >     [(set_attr ("prefix") ("evex"))])
> > >
> > >  (define_insn "vaesdec_"
> > > -  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v")
> > > +  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v")
> > >         (unspec:VI1_AVX512VL_F
> > > -         [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v")
> > > -          (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")]
> > > +         [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v")
> > > +          (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")]
> > >           UNSPEC_VAESDEC))]
> > >    "TARGET_VAES"
> > > -  "vaesdec\t{%2, %1, %0|%0, %1, %2}"
> > > -)
> > > +{
> > > +  if (which_alternative == 0 && mode == V16QImode)
> > > +    return "%{evex%} vaesdec\t{%2, %1, %0|%0, %1, %2}";
> > Similar, but something like
> > * return TARGET_AES || mode != V16QImode ? \"vaesenc\t{%2, %1,
> > %0|%0, %1, %2}"\" : \"%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}\";
>
> For a single alternative, it would need to be
> {
>   return x86_evex_reg_mentioned_p (operands, 3)
>          ? \"vaesenc\t{%2, %1, %0|%0, %1, %2}\"
>          : \"%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}\";
> }
> (* return would just mean uselessly too long line).
> Is that what you want instead?  I thought the 2 separate alternatives
> where only the latter covers those cases is more readable...
>
> The following patch just changes the aes* patterns, not the vaes* ones.
Patch LGTM.
>
> 2024-04-09  Jakub Jelinek
>
>         PR target/114576
>         * config/i386/i386.md (isa): Remove aes, add vaes_avx512vl.
>         (enabled): Remove aes isa check, add vaes_avx512vl.
>         * config/i386/sse.md (aesenc, aesenclast, aesdec, aesdeclast): Use
>         jm instead of m for second alternative and emit {evex} prefix
>         for it if !TARGET_AES.  Use noavx,avx,vaes_avx512vl isa attribute.
>         (vaesdec_, vaesdeclast_, vaesenc_,
>         vaesenclast_): Add second alternative with x instead of v
>         and jm instead of m.
>
>         * gcc.target/i386/aes-pr114576.c: New test.
>
> --- gcc/config/i386/i386.md.jj  2024-04-09 08:12:29.259451422 +0200
> +++ gcc/config/i386/i386.md     2024-04-09 10:53:24.965516804 +0200
> @@ -568,13 +568,14 @@ (define_attr "unit" "integer,i387,sse,mm
>
>  ;; Used to control the "enabled" attribute on a per-instruction basis.
>  (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx,
> -                   x64_avx,x64_avx512bw,x64_avx512dq,aes,apx_ndd,
> +                   x64_avx,x64_avx512bw,x64_avx512dq,apx_ndd,
>                     sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx,
>                     avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,avx512f_512,
>                     noavx512f,avx512bw,avx512bw_512,noavx512bw,avx512dq,
>                     noavx512dq,fma_or_avx512vl,avx512vl,noavx512vl,avxvnni,
>                     avx512vnnivl,avx512fp16,avxifma,avx512ifmavl,avxneconvert,
> -                   avx512bf16vl,vpclmulqdqvl,avx_noavx512f,avx_noavx512vl"
> +                   avx512bf16vl,vpclmulqdqvl,avx_noavx512f,avx_noavx512vl,
> +                   vaes_avx512vl"
>    (const_string "base"))
>
>  ;; The (bounding maximum) length of an instruction immediate.
> @@ -915,7 +916,6 @@ (define_attr "enabled" ""
>            (symbol_ref "TARGET_64BIT && TARGET_AVX512BW")
>          (eq_attr "isa" "x64_avx512dq")
>            (symbol_ref "TARGET_64BIT && TARGET_AVX512DQ")
> -        (eq_attr "isa" "aes") (symbol_ref "TARGET_AES")
>          (eq_attr "isa" "sse_noavx")
>            (symbol_ref "TARGET_SSE && !TARGET_AVX")
>          (eq_attr "isa" "sse2") (symbol_ref "TARGET_SSE2")
> @@ -968,6 +968,8 @@ (define_attr "enabled" ""
>            (symbol_ref "TARGET_VPCLMULQDQ && TARGET_AVX512VL")
>          (eq_attr "isa" "apx_ndd")
>            (symbol_ref "TARGET_APX_NDD")
> +        (eq_attr "isa" "vaes_avx512vl")
> +          (symbol_ref "TARGET_VAES && TARGET_AVX512VL")
>
>          (eq_attr "mmx_isa" "native")
>            (symbol_ref "!TARGET_MMX_WITH_SSE")
> --- gcc/config/i386/sse.md.jj   2024-04-04 10:43:32.107789627 +0200
> +++ gcc/config/i386/sse.md      2024-04-09 10:53:06.138772957 +0200
> @@ -26279,72 +26279,72 @@ (define_insn "xop_vpermil23"
>  (define_insn "aesenc"
>    [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
>         (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
> -                      (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")]
> +                      (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")]
>                       UNSPEC_AESENC))]
>    "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
>    "@
>     aesenc\t{%2, %0|%0, %2}
> -   vaesenc\t{%2, %1, %0|%0, %1, %2}
> +   * return TARGET_AES ? \"vaesenc\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}\";
>     vaesenc\t{%2, %1, %0|%0, %1, %2}"
> -  [(set_attr "isa" "noavx,aes,avx512vl")
> +  [(set_attr "isa" "noavx,avx,vaes_avx512vl")
>     (set_attr "type" "sselog1")
>     (set_attr "addr" "gpr16,*,*")
>     (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,vex,evex")
> +   (set_attr "prefix" "orig,maybe_evex,evex")
>     (set_attr "btver2_decode" "double,double,double")
>     (set_attr "mode" "TI")])
>
>  (define_insn "aesenclast"
>    [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
>         (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
> -                      (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")]
> +                      (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")]
>                       UNSPEC_AESENCLAST))]
>    "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
>    "@
>     aesenclast\t{%2, %0|%0, %2}
> -   vaesenclast\t{%2, %1, %0|%0, %1, %2}
> +   * return TARGET_AES ? \"vaesenclast\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesenclast\t{%2, %1, %0|%0, %1, %2}\";
>     vaesenclast\t{%2, %1, %0|%0, %1, %2}"
> -  [(set_attr "isa" "noavx,aes,avx512vl")
> +  [(set_attr "isa" "noavx,avx,vaes_avx512vl")
>     (set_attr "type" "sselog1")
>     (set_attr "addr" "gpr16,*,*")
>     (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,vex,evex")
> -   (set_attr "btver2_decode" "double,double,double")
> +   (set_attr "prefix" "orig,maybe_evex,evex")
> +   (set_attr "btver2_decode" "double,double,double")
>     (set_attr "mode" "TI")])
>
>  (define_insn "aesdec"
>    [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
>         (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
> -                      (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")]
> +                      (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")]
>                       UNSPEC_AESDEC))]
>    "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
>    "@
>     aesdec\t{%2, %0|%0, %2}
> -   vaesdec\t{%2, %1, %0|%0, %1, %2}
> +   * return TARGET_AES ? \"vaesdec\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesdec\t{%2, %1, %0|%0, %1, %2}\";
>     vaesdec\t{%2, %1, %0|%0, %1, %2}"
> -  [(set_attr "isa" "noavx,aes,avx512vl")
> +  [(set_attr "isa" "noavx,avx,vaes_avx512vl")
>     (set_attr "type" "sselog1")
>     (set_attr "addr" "gpr16,*,*")
>     (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,vex,evex")
> +   (set_attr "prefix" "orig,maybe_evex,evex")
>     (set_attr "btver2_decode" "double,double,double")
>     (set_attr "mode" "TI")])
>
>  (define_insn "aesdeclast"
>    [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
>         (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
> -                      (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")]
> +                      (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")]
>                       UNSPEC_AESDECLAST))]
>    "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
>    "@
>     aesdeclast\t{%2, %0|%0, %2}
> -   vaesdeclast\t{%2, %1, %0|%0, %1, %2}
> +   * return TARGET_AES ? \"vaesdeclast\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesdeclast\t{%2, %1, %0|%0, %1, %2}\";
>     vaesdeclast\t{%2, %1, %0|%0, %1, %2}"
> -  [(set_attr "isa" "noavx,aes,avx512vl")
> +  [(set_attr "isa" "noavx,avx,vaes_avx512vl")
>     (set_attr "addr" "gpr16,*,*")
>     (set_attr "type" "sselog1")
>     (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,vex,evex")
> +   (set_attr "prefix" "orig,maybe_evex,evex")
>     (set_attr "btver2_decode" "double,double,double")
>     (set_attr "mode" "TI")])
>
> @@ -30246,44 +30246,60 @@ (define_insn "vpdpwssds__maskz_1"
>     [(set_attr ("prefix") ("evex"))])
>
>  (define_insn "vaesdec_"
> -  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v")
> +  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v")
>         (unspec:VI1_AVX512VL_F
> -         [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v")
> -          (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")]
> +         [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v")
> +          (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")]
>           UNSPEC_VAESDEC))]
>    "TARGET_VAES"
> -  "vaesdec\t{%2, %1, %0|%0, %1, %2}"
> -)
> +{
> +  if (which_alternative == 0 && mode == V16QImode)
> +    return "%{evex%} vaesdec\t{%2, %1, %0|%0, %1, %2}";
> +  else
> +    return "vaesdec\t{%2, %1, %0|%0, %1, %2}";
> +})
>
>  (define_insn "vaesdeclast_"
> -  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v")
> +  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v")
>         (unspec:VI1_AVX512VL_F
> -         [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v")
> -          (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")]
> +         [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v")
> +          (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")]
>           UNSPEC_VAESDECLAST))]
>    "TARGET_VAES"
> -  "vaesdeclast\t{%2, %1, %0|%0, %1, %2}"
> -)
> +{
> +  if (which_alternative == 0 && mode == V16QImode)
> +    return "%{evex%} vaesdeclast\t{%2, %1, %0|%0, %1, %2}";
> +  else
> +    return "vaesdeclast\t{%2, %1, %0|%0, %1, %2}";
> +})
>
>  (define_insn "vaesenc_"
> -  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v")
> +  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v")
>         (unspec:VI1_AVX512VL_F
> -         [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v")
> -          (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")]
> +         [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v")
> +          (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")]
>           UNSPEC_VAESENC))]
>    "TARGET_VAES"
> -  "vaesenc\t{%2, %1, %0|%0, %1, %2}"
> -)
> +{
> +  if (which_alternative == 0 && mode == V16QImode)
> +    return "%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}";
> +  else
> +    return "vaesenc\t{%2, %1, %0|%0, %1, %2}";
> +})
>
>  (define_insn "vaesenclast_"
> -  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v")
> +  [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v")
>         (unspec:VI1_AVX512VL_F
> -         [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v")
> -          (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")]
> +         [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v")
> +          (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")]
>           UNSPEC_VAESENCLAST))]
>    "TARGET_VAES"
> -  "vaesenclast\t{%2, %1, %0|%0, %1, %2}"
> -)
> +{
> +  if (which_alternative == 0 && mode == V16QImode)
> +    return "%{evex%} vaesenclast\t{%2, %1, %0|%0, %1, %2}";
> +  else
> +    return "vaesenclast\t{%2, %1, %0|%0, %1, %2}";
> +})
>
>  (define_insn "vpclmulqdq_"
>    [(set (match_operand:VI8_FVL 0 "register_operand" "=v")
> --- gcc/testsuite/gcc.target/i386/aes-pr114576.c.jj     2024-04-09 10:27:32.782646751 +0200
> +++ gcc/testsuite/gcc.target/i386/aes-pr114576.c        2024-04-09 10:27:32.782646751 +0200
> @@ -0,0 +1,63 @@
> +/* PR target/114576 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -maes -mno-avx" } */
> +/* { dg-final { scan-assembler-times "\taesenc\t" 2 } } */
> +/* { dg-final { scan-assembler-times "\taesdec\t" 2 } } */
> +/* { dg-final { scan-assembler-times "\taesenclast\t" 2 } } */
> +/* { dg-final { scan-assembler-times "\taesdeclast\t" 2 } } */
> +/* { dg-final { scan-assembler-not "\tvaesenc" } } */
> +/* { dg-final { scan-assembler-not "\tvaesdec" } } */
> +
> +#include
> +
> +__m128i
> +f1 (__m128i x, __m128i y)
> +{
> +  return _mm_aesenc_si128 (x, y);
> +}
> +
> +__m128i
> +f2 (__m128i x, __m128i y)
> +{
> +  __m128i z = _mm_aesenc_si128 (x, y);
> +  return z + x + y;
> +}
> +
> +__m128i
> +f3 (__m128i x, __m128i y)
> +{
> +  return _mm_aesdec_si128 (x, y);
> +}
> +
> +__m128i
> +f4 (__m128i x, __m128i y)
> +{
> +  __m128i z = _mm_aesdec_si128 (x, y);
> +  return z + x + y;
> +}
> +
> +__m128i
> +f5 (__m128i x, __m128i y)
> +{
> +  return _mm_aesenclast_si128 (x, y);
> +}
> +
> +__m128i
> +f6 (__m128i x, __m128i y)
> +{
> +  __m128i z = _mm_aesenclast_si128 (x, y);
> +  return z + x + y;
> +}
> +
> +__m128i
> +f7 (__m128i x, __m128i y)
> +{
> +  return _mm_aesdeclast_si128 (x, y);
> +}
> +
> +__m128i
> +f8 (__m128i x, __m128i y)
> +{
> +  __m128i z = _mm_aesdeclast_si128 (x, y);
> +  return z + x + y;
> +}
>
>
>         Jakub
>

-- 
BR,
Hongtao
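
For readers following the thread, the way the patched aesenc pattern picks an alternative and a prefix can be sketched as a small C model. This is only an illustration under stated assumptions, not GCC code: struct target_flags, alternative_enabled and aesenc_template are hypothetical names standing in for the TARGET_* macros, the "isa"/"enabled" attributes and the output template, and the operand part of each template is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's TARGET_AES, TARGET_AVX, TARGET_VAES
   and TARGET_AVX512VL macros; not part of GCC.  */
struct target_flags
{
  bool aes;
  bool avx;
  bool vaes;
  bool avx512vl;
};

/* Models the "enabled" attribute for the patched isa string
   "noavx,avx,vaes_avx512vl" (alternatives 0, 1 and 2).  */
bool
alternative_enabled (int alt, const struct target_flags *t)
{
  switch (alt)
    {
    case 0: return !t->avx;                 /* isa "noavx" */
    case 1: return t->avx;                  /* isa "avx" */
    case 2: return t->vaes && t->avx512vl;  /* isa "vaes_avx512vl" */
    default: return false;
    }
}

/* Models the mnemonic (with optional {evex} prefix) the aesenc pattern
   would emit for one alternative.  Returns NULL when the insn condition
   or the alternative is disabled.  Operand strings are omitted.  */
const char *
aesenc_template (int alt, const struct target_flags *t)
{
  /* Insn condition: TARGET_AES || (TARGET_VAES && TARGET_AVX512VL).  */
  bool insn_ok = t->aes || (t->vaes && t->avx512vl);

  if (!insn_ok || !alternative_enabled (alt, t))
    return NULL;

  switch (alt)
    {
    case 0:
      return "aesenc";
    case 1:
      /* Without AES only the EVEX encoding is available, so force it.  */
      return t->aes ? "vaesenc" : "{evex} vaesenc";
    default:
      return "vaesenc";
    }
}
```

With flags corresponding to -mvaes -mavx512vl -mno-aes, alternative 1 of this model yields the {evex}-prefixed form, which is the vaes_avx512vl + -mno-aes case discussed in the thread; with -maes -mno-avx only alternative 0 (plain aesenc) is enabled, matching what the new test scans for.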