From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 7C0AB385840B for ; Tue, 9 Apr 2024 09:18:13 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7C0AB385840B Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 7C0AB385840B Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712654296; cv=none; b=vE0nI4q9YwHo2XTiiuqPVPNPKL3D9KBQ6eoxfWM+zXUHYZ96W6A9Q/CC4DPqL8U6hQ1gHyNCUKST+ctE+t5+22gX/K2gD/m+YCwdk1g2P6J98M6nXLUryd3iInyYUB1GyykiXHelmgH0h5hwfNvePsnfK4WZypa8AVcT2YEWwBM= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712654296; c=relaxed/simple; bh=3TsTFrJl7rcNfTLKOzYReNtb8wP1Rpt9QbFrwXUATWs=; h=DKIM-Signature:Date:From:To:Subject:Message-ID:MIME-Version; b=Hv25ucfiRCbjqZZ62YKBGO+iHRa3z69szk/8O7IA/PMl5OWnSYkzjE5Pfvb+pEXa+kQBZSN5oqy2c81AvCz54eiOpENBhewsYxjMNLBOhlBe6Awqz7CXcrW0TAVo5jmDBq+pjT3++dUsvjS6dOFHcxox/Cxw2wjV0dP4xqTQNXM= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1712654293; h=from:from:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:in-reply-to:in-reply-to: references:references; bh=8a/WOTffhUULdllXfexZBThAb+GQ78HYJx21quUVsoU=; b=ShsNZdJqU1D2MUMtFRGkcIluMJtLJQj4PI1CB9tuZpoyeMyNeFk7Mnf6O5CEFZHnjXNWAX /s3e+KzjBhnnll7kv6zA8onzj8UW7xAUAvxotBZJS3NNcDvvgYgN8lw4CgKzhlZqWFydge wDUSL8a3czdFe+B73zz3RUqV6a8I2ls= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-164-q-_KUprBOJKQpl0FQfY7sQ-1; Tue, 09 Apr 2024 05:18:09 -0400 X-MC-Unique: q-_KUprBOJKQpl0FQfY7sQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 7BDEC3806736; Tue, 9 Apr 2024 09:18:09 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.45.224.14]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 140AF1C060A6; Tue, 9 Apr 2024 09:18:08 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 4399I3cs1308809 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Tue, 9 Apr 2024 11:18:03 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 4399I2w81308808; Tue, 9 Apr 2024 11:18:02 +0200 Date: Tue, 9 Apr 2024 11:18:02 +0200 From: Jakub Jelinek To: Hongtao Liu Cc: "Jiang, Haochen" , "gcc-patches@gcc.gnu.org" , "Liu, Hongtao" , "ubizjak@gmail.com" Subject: [PATCH] i386, v2: Fix aes/vaes patterns [PR114576] Message-ID: Reply-To: Jakub Jelinek References: <20230418071851.4192579-1-haochen.jiang@intel.com> MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-4.5 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,KAM_SHORT,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H4,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Tue, Apr 09, 2024 at 11:23:40AM +0800, Hongtao Liu wrote: > I think we can merge alternative 2 with 3 to > * return TARGET_AES ? \"vaesenc\t{%2, %1, %0|%0, %1, %2}"\" : > \"%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}\"; > Then it can handle vaes_avx512vl + -mno-aes case. Ok, done in the patch below. > > @@ -30246,44 +30250,60 @@ (define_insn "vpdpwssds__maskz_1" > > [(set_attr ("prefix") ("evex"))]) > > > > (define_insn "vaesdec_" > > - [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v") > > + [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v") > > (unspec:VI1_AVX512VL_F > > - [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v") > > - (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")] > > + [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v") > > + (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")] > > UNSPEC_VAESDEC))] > > "TARGET_VAES" > > - "vaesdec\t{%2, %1, %0|%0, %1, %2}" > > -) > > +{ > > + if (which_alternative == 0 && mode == V16QImode) > > + return "%{evex%} vaesdec\t{%2, %1, %0|%0, %1, %2}"; > Similar, but something like > * return TARGET_AES || mode != V16QImode ? \"vaesenc\t{%2, %1, > %0|%0, %1, %2}"\" : \"%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}\"; For a single alternative, it would need to be { return x86_evex_reg_mentioned_p (operands, 3) ? \"vaesenc\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}\"; } (* return would just mean uselessly too long line). Is that what you want instead? I thought the 2 separate alternatives where only the latter covers those cases is more readable... The following patch just changes the aes* patterns, not the vaes* ones. 2024-04-09 Jakub Jelinek PR target/114576 * config/i386/i386.md (isa): Remove aes, add vaes_avx512vl. (enabled): Remove aes isa check, add vaes_avx512vl. * config/i386/sse.md (aesenc, aesenclast, aesdec, aesdeclast): Use jm instead of m for second alternative and emit {evex} prefix for it if !TARGET_AES. Use noavx,avx,vaes_avx512vl isa attribute. (vaesdec_, vaesdeclast_, vaesenc_, vaesenclast_): Add second alternative with x instead of v and jm instead of m. * gcc.target/i386/aes-pr114576.c: New test. --- gcc/config/i386/i386.md.jj 2024-04-09 08:12:29.259451422 +0200 +++ gcc/config/i386/i386.md 2024-04-09 10:53:24.965516804 +0200 @@ -568,13 +568,14 @@ (define_attr "unit" "integer,i387,sse,mm ;; Used to control the "enabled" attribute on a per-instruction basis. (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx, - x64_avx,x64_avx512bw,x64_avx512dq,aes,apx_ndd, + x64_avx,x64_avx512bw,x64_avx512dq,apx_ndd, sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx, avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,avx512f_512, noavx512f,avx512bw,avx512bw_512,noavx512bw,avx512dq, noavx512dq,fma_or_avx512vl,avx512vl,noavx512vl,avxvnni, avx512vnnivl,avx512fp16,avxifma,avx512ifmavl,avxneconvert, - avx512bf16vl,vpclmulqdqvl,avx_noavx512f,avx_noavx512vl" + avx512bf16vl,vpclmulqdqvl,avx_noavx512f,avx_noavx512vl, + vaes_avx512vl" (const_string "base")) ;; The (bounding maximum) length of an instruction immediate. @@ -915,7 +916,6 @@ (define_attr "enabled" "" (symbol_ref "TARGET_64BIT && TARGET_AVX512BW") (eq_attr "isa" "x64_avx512dq") (symbol_ref "TARGET_64BIT && TARGET_AVX512DQ") - (eq_attr "isa" "aes") (symbol_ref "TARGET_AES") (eq_attr "isa" "sse_noavx") (symbol_ref "TARGET_SSE && !TARGET_AVX") (eq_attr "isa" "sse2") (symbol_ref "TARGET_SSE2") @@ -968,6 +968,8 @@ (define_attr "enabled" "" (symbol_ref "TARGET_VPCLMULQDQ && TARGET_AVX512VL") (eq_attr "isa" "apx_ndd") (symbol_ref "TARGET_APX_NDD") + (eq_attr "isa" "vaes_avx512vl") + (symbol_ref "TARGET_VAES && TARGET_AVX512VL") (eq_attr "mmx_isa" "native") (symbol_ref "!TARGET_MMX_WITH_SSE") --- gcc/config/i386/sse.md.jj 2024-04-04 10:43:32.107789627 +0200 +++ gcc/config/i386/sse.md 2024-04-09 10:53:06.138772957 +0200 @@ -26279,72 +26279,72 @@ (define_insn "xop_vpermil23" (define_insn "aesenc" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")] UNSPEC_AESENC))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ aesenc\t{%2, %0|%0, %2} - vaesenc\t{%2, %1, %0|%0, %1, %2} + * return TARGET_AES ? \"vaesenc\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}\"; vaesenc\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,aes,avx512vl") + [(set_attr "isa" "noavx,avx,vaes_avx512vl") (set_attr "type" "sselog1") (set_attr "addr" "gpr16,*,*") (set_attr "prefix_extra" "1") - (set_attr "prefix" "orig,vex,evex") + (set_attr "prefix" "orig,maybe_evex,evex") (set_attr "btver2_decode" "double,double,double") (set_attr "mode" "TI")]) (define_insn "aesenclast" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")] UNSPEC_AESENCLAST))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ aesenclast\t{%2, %0|%0, %2} - vaesenclast\t{%2, %1, %0|%0, %1, %2} + * return TARGET_AES ? \"vaesenclast\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesenclast\t{%2, %1, %0|%0, %1, %2}\"; vaesenclast\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,aes,avx512vl") + [(set_attr "isa" "noavx,avx,vaes_avx512vl") (set_attr "type" "sselog1") (set_attr "addr" "gpr16,*,*") (set_attr "prefix_extra" "1") - (set_attr "prefix" "orig,vex,evex") - (set_attr "btver2_decode" "double,double,double") + (set_attr "prefix" "orig,maybe_evex,evex") + (set_attr "btver2_decode" "double,double,double") (set_attr "mode" "TI")]) (define_insn "aesdec" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")] UNSPEC_AESDEC))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ aesdec\t{%2, %0|%0, %2} - vaesdec\t{%2, %1, %0|%0, %1, %2} + * return TARGET_AES ? \"vaesdec\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesdec\t{%2, %1, %0|%0, %1, %2}\"; vaesdec\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,aes,avx512vl") + [(set_attr "isa" "noavx,avx,vaes_avx512vl") (set_attr "type" "sselog1") (set_attr "addr" "gpr16,*,*") (set_attr "prefix_extra" "1") - (set_attr "prefix" "orig,vex,evex") + (set_attr "prefix" "orig,maybe_evex,evex") (set_attr "btver2_decode" "double,double,double") (set_attr "mode" "TI")]) (define_insn "aesdeclast" [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") - (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")] + (match_operand:V2DI 2 "vector_operand" "xja,xjm,vm")] UNSPEC_AESDECLAST))] "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" "@ aesdeclast\t{%2, %0|%0, %2} - vaesdeclast\t{%2, %1, %0|%0, %1, %2} + * return TARGET_AES ? \"vaesdeclast\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} vaesdeclast\t{%2, %1, %0|%0, %1, %2}\"; vaesdeclast\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,aes,avx512vl") + [(set_attr "isa" "noavx,avx,vaes_avx512vl") (set_attr "addr" "gpr16,*,*") (set_attr "type" "sselog1") (set_attr "prefix_extra" "1") - (set_attr "prefix" "orig,vex,evex") + (set_attr "prefix" "orig,maybe_evex,evex") (set_attr "btver2_decode" "double,double,double") (set_attr "mode" "TI")]) @@ -30246,44 +30246,60 @@ (define_insn "vpdpwssds__maskz_1" [(set_attr ("prefix") ("evex"))]) (define_insn "vaesdec_" - [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v") + [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v") (unspec:VI1_AVX512VL_F - [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v") - (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")] + [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v") + (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")] UNSPEC_VAESDEC))] "TARGET_VAES" - "vaesdec\t{%2, %1, %0|%0, %1, %2}" -) +{ + if (which_alternative == 0 && mode == V16QImode) + return "%{evex%} vaesdec\t{%2, %1, %0|%0, %1, %2}"; + else + return "vaesdec\t{%2, %1, %0|%0, %1, %2}"; +}) (define_insn "vaesdeclast_" - [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v") + [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v") (unspec:VI1_AVX512VL_F - [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v") - (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")] + [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v") + (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")] UNSPEC_VAESDECLAST))] "TARGET_VAES" - "vaesdeclast\t{%2, %1, %0|%0, %1, %2}" -) +{ + if (which_alternative == 0 && mode == V16QImode) + return "%{evex%} vaesdeclast\t{%2, %1, %0|%0, %1, %2}"; + else + return "vaesdeclast\t{%2, %1, %0|%0, %1, %2}"; +}) (define_insn "vaesenc_" - [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v") + [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v") (unspec:VI1_AVX512VL_F - [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v") - (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")] + [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v") + (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")] UNSPEC_VAESENC))] "TARGET_VAES" - "vaesenc\t{%2, %1, %0|%0, %1, %2}" -) +{ + if (which_alternative == 0 && mode == V16QImode) + return "%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}"; + else + return "vaesenc\t{%2, %1, %0|%0, %1, %2}"; +}) (define_insn "vaesenclast_" - [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=v") + [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v") (unspec:VI1_AVX512VL_F - [(match_operand:VI1_AVX512VL_F 1 "register_operand" "v") - (match_operand:VI1_AVX512VL_F 2 "vector_operand" "vm")] + [(match_operand:VI1_AVX512VL_F 1 "register_operand" "x,v") + (match_operand:VI1_AVX512VL_F 2 "vector_operand" "xjm,vm")] UNSPEC_VAESENCLAST))] "TARGET_VAES" - "vaesenclast\t{%2, %1, %0|%0, %1, %2}" -) +{ + if (which_alternative == 0 && mode == V16QImode) + return "%{evex%} vaesenclast\t{%2, %1, %0|%0, %1, %2}"; + else + return "vaesenclast\t{%2, %1, %0|%0, %1, %2}"; +}) (define_insn "vpclmulqdq_" [(set (match_operand:VI8_FVL 0 "register_operand" "=v") --- gcc/testsuite/gcc.target/i386/aes-pr114576.c.jj 2024-04-09 10:27:32.782646751 +0200 +++ gcc/testsuite/gcc.target/i386/aes-pr114576.c 2024-04-09 10:27:32.782646751 +0200 @@ -0,0 +1,63 @@ +/* PR target/114576 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -maes -mno-avx" } */ +/* { dg-final { scan-assembler-times "\taesenc\t" 2 } } */ +/* { dg-final { scan-assembler-times "\taesdec\t" 2 } } */ +/* { dg-final { scan-assembler-times "\taesenclast\t" 2 } } */ +/* { dg-final { scan-assembler-times "\taesdeclast\t" 2 } } */ +/* { dg-final { scan-assembler-not "\tvaesenc" } } */ +/* { dg-final { scan-assembler-not "\tvaesdec" } } */ + +#include + +__m128i +f1 (__m128i x, __m128i y) +{ + return _mm_aesenc_si128 (x, y); +} + +__m128i +f2 (__m128i x, __m128i y) +{ + __m128i z = _mm_aesenc_si128 (x, y); + return z + x + y; +} + +__m128i +f3 (__m128i x, __m128i y) +{ + return _mm_aesdec_si128 (x, y); +} + +__m128i +f4 (__m128i x, __m128i y) +{ + __m128i z = _mm_aesdec_si128 (x, y); + return z + x + y; +} + +__m128i +f5 (__m128i x, __m128i y) +{ + return _mm_aesenclast_si128 (x, y); +} + +__m128i +f6 (__m128i x, __m128i y) +{ + __m128i z = _mm_aesenclast_si128 (x, y); + return z + x + y; +} + +__m128i +f7 (__m128i x, __m128i y) +{ + return _mm_aesdeclast_si128 (x, y); +} + +__m128i +f8 (__m128i x, __m128i y) +{ + __m128i z = _mm_aesdeclast_si128 (x, y); + return z + x + y; +} Jakub