References: <20221014075445.7938-2-haochen.jiang@intel.com> <20221019060321.61112-1-hongyu.wang@intel.com>
In-Reply-To: <20221019060321.61112-1-hongyu.wang@intel.com>
From: Hongtao Liu
Date: Fri, 21 Oct 2022 08:52:15 +0800
Subject: Re: [PATCH] Support Intel AVX-IFMA
To: Hongyu Wang
Cc: gcc-patches@gcc.gnu.org, hongtao.liu@intel.com

On Wed, Oct 19, 2022 at 2:04 PM Hongyu Wang via Gcc-patches wrote:
>
> Hi,
>
> Here is the updated patch that aligns the implementation with AVX-VNNI
> and corrects some spelling errors in the AVX512IFMA patterns.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu and sde. Ok for trunk?

Ok for this one.

>
> gcc/
>
> * common/config/i386/i386-common.cc
> (OPTION_MASK_ISA2_AVXIFMA_SET, OPTION_MASK_ISA2_AVXIFMA_UNSET,
> OPTION_MASK_ISA2_AVX2_UNSET): New macro.
> (ix86_handle_option): Handle -mavxifma.
> * common/config/i386/i386-cpuinfo.h (processor_features): Add
> FEATURE_AVXIFMA.
> * common/config/i386/i386-isas.h: Add ISA_NAMES_TABLE_ENTRY for
> avxifma.
> * common/config/i386/cpuinfo.h (get_available_features):
> Detect avxifma.
> * config.gcc: Add avxifmaintrin.h.
> * config/i386/avx512ifmavlintrin.h (_mm_madd52lo_epu64): Change
> to macro.
> (_mm_madd52hi_epu64): Likewise.
> (_mm256_madd52lo_epu64): Likewise.
> (_mm256_madd52hi_epu64): Likewise.
> * config/i386/avxifmaintrin.h: New header.
> * config/i386/cpuid.h (bit_AVXIFMA): New.
> * config/i386/i386-builtin.def: Add new builtins, and correct
> pattern names for AVX512IFMA.
> * config/i386/i386-builtins.cc (def_builtin): Handle AVX-IFMA
> builtins like AVX-VNNI.
> * config/i386/i386-c.cc (ix86_target_macros_internal): Define
> __AVXIFMA__.
> * config/i386/i386-expand.cc (ix86_check_builtin_isa_match):
> Relax ISA masks for AVXIFMA.
> * config/i386/i386-isa.def: Add AVXIFMA.
> * config/i386/i386-options.cc (isa2_opts): Add -mavxifma.
> (ix86_valid_target_attribute_inner_p): Handle avxifma.
> * config/i386/i386.md (isa): Add attr avxifma and avx512ifmavl.
> * config/i386/i386.opt: Add option -mavxifma.
> * config/i386/immintrin.h: Include avxifmaintrin.h.
> * config/i386/sse.md (avx_vpmadd52_):
> Remove.
> (vpamdd52): Remove.
> (vpamdd52huq_maskz): Rename to ...
> (vpmadd52huq_maskz): ... this.
> (vpamdd52luq_maskz): Rename to ...
> (vpmadd52luq_maskz): ... this.
> (vpmadd52): New define_insn.
> (vpmadd52v8di): Likewise.
> (vpmadd52_maskz_1): Likewise.
> (vpamdd52_mask): Rename to ...
> (vpmadd52_mask): ... this.
> * doc/invoke.texi: Document -mavxifma.
> * doc/extend.texi: Document avxifma.
> * doc/sourcebuild.texi: Document target avxifma.
>
> gcc/testsuite/
>
> * gcc.target/i386/avx-check.h: Add avxifma check.
> * gcc.target/i386/avx512ifma-vpmaddhuq-1.c: Rename...
> * gcc.target/i386/avx512ifma-vpmaddhuq-1a.c: To this.
> * gcc.target/i386/avx512ifma-vpmaddluq-1.c: Ditto.
> * gcc.target/i386/avx512ifma-vpmaddluq-1a.c: Ditto.
> * gcc.target/i386/avx512ifma-vpmaddhuq-1b.c: New test.
> * gcc.target/i386/avx512ifma-vpmaddluq-1b.c: Ditto.
> * gcc.target/i386/avx-ifma-1.c: Ditto.
> * gcc.target/i386/avx-ifma-2.c: Ditto.
> * gcc.target/i386/avx-ifma-3.c: Ditto.
> * gcc.target/i386/avx-ifma-4.c: Ditto.
> * gcc.target/i386/avx-ifma-5.c: Ditto.
> * gcc.target/i386/avx-ifma-6.c: Ditto.
> * gcc.target/i386/avx-ifma-vpmaddhuq-2.c: Ditto.
> * gcc.target/i386/avx-ifma-vpmaddluq-2.c: Ditto.
> * gcc.target/i386/sse-12.c: Add -mavxifma.
> * gcc.target/i386/sse-13.c: Ditto.
> * gcc.target/i386/sse-14.c: Ditto.
> * gcc.target/i386/sse-22.c: Ditto.
> * gcc.target/i386/sse-23.c: Ditto.
> * g++.dg/other/i386-2.C: Ditto.
> * g++.dg/other/i386-3.C: Ditto.
> * gcc.target/i386/funcspec-56.inc: Add new target attribute.
> * lib/target-supports.exp
> (check_effective_target_avxifma): New.
> ---
> gcc/common/config/i386/cpuinfo.h | 2 +
> gcc/common/config/i386/i386-common.cc | 20 ++++-
> gcc/common/config/i386/i386-cpuinfo.h | 1 +
> gcc/common/config/i386/i386-isas.h | 1 +
> gcc/config.gcc | 3 +-
> gcc/config/i386/avx512ifmavlintrin.h | 59 +++++---------
> gcc/config/i386/avxifmaintrin.h | 78 +++++++++++++++++++
> gcc/config/i386/cpuid.h | 1 +
> gcc/config/i386/i386-builtin.def | 28 ++++---
> gcc/config/i386/i386-builtins.cc | 8 +-
> gcc/config/i386/i386-c.cc | 2 +
> gcc/config/i386/i386-expand.cc | 13 ++++
> gcc/config/i386/i386-isa.def | 1 +
> gcc/config/i386/i386-options.cc | 4 +-
> gcc/config/i386/i386.md | 6 +-
> gcc/config/i386/i386.opt | 5 ++
> gcc/config/i386/immintrin.h | 2 +
> gcc/config/i386/sse.md | 56 ++++++++++---
> gcc/doc/extend.texi | 5 ++
> gcc/doc/invoke.texi | 9 ++-
> gcc/doc/sourcebuild.texi | 3 +
> gcc/testsuite/g++.dg/other/i386-2.C | 2 +-
> gcc/testsuite/g++.dg/other/i386-3.C | 2 +-
> gcc/testsuite/gcc.target/i386/avx-check.h | 6 +-
> gcc/testsuite/gcc.target/i386/avx-ifma-1.c | 20 +++++
> gcc/testsuite/gcc.target/i386/avx-ifma-2.c | 21 +++++
> gcc/testsuite/gcc.target/i386/avx-ifma-3.c | 16 ++++
> gcc/testsuite/gcc.target/i386/avx-ifma-4.c | 16 ++++
> gcc/testsuite/gcc.target/i386/avx-ifma-5.c | 10 +++
> gcc/testsuite/gcc.target/i386/avx-ifma-6.c | 20 +++++
> .../gcc.target/i386/avx-ifma-vpmaddhuq-2.c | 72 +++++++++++++++++
> .../gcc.target/i386/avx-ifma-vpmaddluq-2.c | 61 +++++++++++++++
> ...pmaddhuq-1.c => avx512ifma-vpmaddhuq-1a.c} |
0
> .../gcc.target/i386/avx512ifma-vpmaddhuq-1b.c | 33 ++++++++
> ...pmaddluq-1.c => avx512ifma-vpmaddluq-1a.c} | 0
> .../gcc.target/i386/avx512ifma-vpmaddluq-1b.c | 33 ++++++++
> gcc/testsuite/gcc.target/i386/funcspec-56.inc | 2 +
> gcc/testsuite/gcc.target/i386/sse-12.c | 2 +-
> gcc/testsuite/gcc.target/i386/sse-13.c | 2 +-
> gcc/testsuite/gcc.target/i386/sse-14.c | 2 +-
> gcc/testsuite/gcc.target/i386/sse-22.c | 4 +-
> gcc/testsuite/gcc.target/i386/sse-23.c | 2 +-
> gcc/testsuite/lib/target-supports.exp | 12 +++
> 43 files changed, 563 insertions(+), 82 deletions(-)
> create mode 100644 gcc/config/i386/avxifmaintrin.h
> create mode 100644 gcc/testsuite/gcc.target/i386/avx-ifma-1.c
> create mode 100644 gcc/testsuite/gcc.target/i386/avx-ifma-2.c
> create mode 100644 gcc/testsuite/gcc.target/i386/avx-ifma-3.c
> create mode 100644 gcc/testsuite/gcc.target/i386/avx-ifma-4.c
> create mode 100644 gcc/testsuite/gcc.target/i386/avx-ifma-5.c
> create mode 100644 gcc/testsuite/gcc.target/i386/avx-ifma-6.c
> create mode 100644 gcc/testsuite/gcc.target/i386/avx-ifma-vpmaddhuq-2.c
> create mode 100644 gcc/testsuite/gcc.target/i386/avx-ifma-vpmaddluq-2.c
> rename gcc/testsuite/gcc.target/i386/{avx512ifma-vpmaddhuq-1.c => avx512ifma-vpmaddhuq-1a.c} (100%)
> create mode 100644 gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddhuq-1b.c
> rename gcc/testsuite/gcc.target/i386/{avx512ifma-vpmaddluq-1.c => avx512ifma-vpmaddluq-1a.c} (100%)
> create mode 100644 gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddluq-1b.c
>
> diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
> index b5c1b21e554..9bb21c6cacc 100644
> --- a/gcc/common/config/i386/cpuinfo.h
> +++ b/gcc/common/config/i386/cpuinfo.h
> @@ -793,6 +793,8 @@ get_available_features (struct __processor_model *cpu_model,
>      {
>        if (eax & bit_AVXVNNI)
>          set_feature (FEATURE_AVXVNNI);
> +      if (eax & bit_AVXIFMA)
> +        set_feature (FEATURE_AVXIFMA);
>      }
>    if (avx512_usable)
>      {
> diff --git
a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc
> index d6a68dc9b1d..4de7906b247 100644
> --- a/gcc/common/config/i386/i386-common.cc
> +++ b/gcc/common/config/i386/i386-common.cc
> @@ -76,6 +76,7 @@ along with GCC; see the file COPYING3. If not see
>    (OPTION_MASK_ISA_AVX512VL | OPTION_MASK_ISA_AVX512F_SET)
>  #define OPTION_MASK_ISA_AVX512IFMA_SET \
>    (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512F_SET)
> +#define OPTION_MASK_ISA2_AVXIFMA_SET OPTION_MASK_ISA2_AVXIFMA
>  #define OPTION_MASK_ISA_AVX512VBMI_SET \
>    (OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512BW_SET)
>  #define OPTION_MASK_ISA2_AVX5124FMAPS_SET OPTION_MASK_ISA2_AVX5124FMAPS
> @@ -212,7 +213,8 @@ along with GCC; see the file COPYING3. If not see
>  #define OPTION_MASK_ISA_AVX2_UNSET \
>    (OPTION_MASK_ISA_AVX2 | OPTION_MASK_ISA_AVX512F_UNSET)
>  #define OPTION_MASK_ISA2_AVX2_UNSET \
> -  (OPTION_MASK_ISA2_AVXVNNI_UNSET | OPTION_MASK_ISA2_AVX512F_UNSET)
> +  (OPTION_MASK_ISA2_AVXIFMA_UNSET | OPTION_MASK_ISA2_AVXVNNI_UNSET \
> +   | OPTION_MASK_ISA2_AVX512F_UNSET)
>  #define OPTION_MASK_ISA_AVX512F_UNSET \
>    (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_AVX512CD_UNSET \
>     | OPTION_MASK_ISA_AVX512PF_UNSET | OPTION_MASK_ISA_AVX512ER_UNSET \
> @@ -230,6 +232,7 @@ along with GCC; see the file COPYING3.
If not see
>    (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VBMI_UNSET)
>  #define OPTION_MASK_ISA_AVX512VL_UNSET OPTION_MASK_ISA_AVX512VL
>  #define OPTION_MASK_ISA_AVX512IFMA_UNSET OPTION_MASK_ISA_AVX512IFMA
> +#define OPTION_MASK_ISA2_AVXIFMA_UNSET OPTION_MASK_ISA2_AVXIFMA
>  #define OPTION_MASK_ISA_AVX512VBMI_UNSET OPTION_MASK_ISA_AVX512VBMI
>  #define OPTION_MASK_ISA2_AVX5124FMAPS_UNSET OPTION_MASK_ISA2_AVX5124FMAPS
>  #define OPTION_MASK_ISA2_AVX5124VNNIW_UNSET OPTION_MASK_ISA2_AVX5124VNNIW
> @@ -1124,6 +1127,21 @@ ix86_handle_option (struct gcc_options *opts,
>          }
>        return true;
>
> +    case OPT_mavxifma:
> +      if (value)
> +        {
> +          opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_AVXIFMA_SET;
> +          opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVXIFMA_SET;
> +          opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX2_SET;
> +          opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX2_SET;
> +        }
> +      else
> +        {
> +          opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVXIFMA_UNSET;
> +          opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVXIFMA_UNSET;
> +        }
> +      return true;
> +
>      case OPT_mfma:
>        if (value)
>          {
> diff --git a/gcc/common/config/i386/i386-cpuinfo.h b/gcc/common/config/i386/i386-cpuinfo.h
> index 643fbd97378..968f9a56a6c 100644
> --- a/gcc/common/config/i386/i386-cpuinfo.h
> +++ b/gcc/common/config/i386/i386-cpuinfo.h
> @@ -240,6 +240,7 @@ enum processor_features
>    FEATURE_X86_64_V2,
>    FEATURE_X86_64_V3,
>    FEATURE_X86_64_V4,
> +  FEATURE_AVXIFMA,
>    CPU_FEATURE_MAX
>  };
>
> diff --git a/gcc/common/config/i386/i386-isas.h b/gcc/common/config/i386/i386-isas.h
> index 2d0646a68f8..b05b4bb8f0d 100644
> --- a/gcc/common/config/i386/i386-isas.h
> +++ b/gcc/common/config/i386/i386-isas.h
> @@ -175,4 +175,5 @@ ISA_NAMES_TABLE_START
>    ISA_NAMES_TABLE_ENTRY("x86-64-v2", FEATURE_X86_64_V2, P_X86_64_V2, NULL)
>    ISA_NAMES_TABLE_ENTRY("x86-64-v3", FEATURE_X86_64_V3, P_X86_64_V3, NULL)
>    ISA_NAMES_TABLE_ENTRY("x86-64-v4", FEATURE_X86_64_V4, P_X86_64_V4, NULL)
>
+ ISA_NAMES_TABLE_ENTRY("avxifma", FEATURE_AVXIFMA, P_NONE, "-mavxifma") > ISA_NAMES_TABLE_END > diff --git a/gcc/config.gcc b/gcc/config.gcc > index 2af30b4a6ec..d086dbdf8fb 100644 > --- a/gcc/config.gcc > +++ b/gcc/config.gcc > @@ -421,7 +421,8 @@ i[34567]86-*-* | x86_64-*-*) > tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h > amxbf16intrin.h x86gprintrin.h uintrintrin.h > hresetintrin.h keylockerintrin.h avxvnniintrin.h > - mwaitintrin.h avx512fp16intrin.h avx512fp16vlintri= n.h" > + mwaitintrin.h avx512fp16intrin.h avx512fp16vlintri= n.h > + avxifmaintrin.h" > ;; > ia64-*-*) > extra_headers=3Dia64intrin.h > diff --git a/gcc/config/i386/avx512ifmavlintrin.h b/gcc/config/i386/avx51= 2ifmavlintrin.h > index a7a50d89df4..506dce8e477 100644 > --- a/gcc/config/i386/avx512ifmavlintrin.h > +++ b/gcc/config/i386/avx512ifmavlintrin.h > @@ -34,45 +34,26 @@ > #define __DISABLE_AVX512IFMAVL__ > #endif /* __AVX512IFMAVL__ */ > > -extern __inline __m128i > -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) > -_mm_madd52lo_epu64 (__m128i __X, __m128i __Y, __m128i __Z) > -{ > - return (__m128i) __builtin_ia32_vpmadd52luq128_mask ((__v2di) __X, > - (__v2di) __Y, > - (__v2di) __Z, > - (__mmask8) -1); > -} > - > -extern __inline __m128i > -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) > -_mm_madd52hi_epu64 (__m128i __X, __m128i __Y, __m128i __Z) > -{ > - return (__m128i) __builtin_ia32_vpmadd52huq128_mask ((__v2di) __X, > - (__v2di) __Y, > - (__v2di) __Z, > - (__mmask8) -1); > -} > - > -extern __inline __m256i > -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) > -_mm256_madd52lo_epu64 (__m256i __X, __m256i __Y, __m256i __Z) > -{ > - return (__m256i) __builtin_ia32_vpmadd52luq256_mask ((__v4di) __X, > - (__v4di) __Y, > - (__v4di) __Z, > - (__mmask8) -1); > -} > - > -extern __inline __m256i > -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) > -_mm256_madd52hi_epu64 (__m256i __X, __m256i __Y, __m256i 
__Z) > -{ > - return (__m256i) __builtin_ia32_vpmadd52huq256_mask ((__v4di) __X, > - (__v4di) __Y, > - (__v4di) __Z, > - (__mmask8) -1); > -} > +#define _mm_madd52lo_epu64(A, B, C) \ > + ((__m128i) __builtin_ia32_vpmadd52luq128 ((__v2di) (A), \ > + (__v2di) (B), \ > + (__v2di) (C))) > + > +#define _mm_madd52hi_epu64(A, B, C) \ > + ((__m128i) __builtin_ia32_vpmadd52huq128 ((__v2di) (A), \ > + (__v2di) (B), \ > + (__v2di) (C))) > + > +#define _mm256_madd52lo_epu64(A, B, C) \ > + ((__m256i) __builtin_ia32_vpmadd52luq256 ((__v4di) (A), \ > + (__v4di) (B), \ > + (__v4di) (C))) > + > + > +#define _mm256_madd52hi_epu64(A, B, C) \ > + ((__m256i) __builtin_ia32_vpmadd52huq256 ((__v4di) (A), \ > + (__v4di) (B), \ > + (__v4di) (C))) > > extern __inline __m128i > __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) > diff --git a/gcc/config/i386/avxifmaintrin.h b/gcc/config/i386/avxifmaint= rin.h > new file mode 100644 > index 00000000000..3878d10f991 > --- /dev/null > +++ b/gcc/config/i386/avxifmaintrin.h > @@ -0,0 +1,78 @@ > +/* Copyright (C) 2020 Free Software Foundation, Inc. > + > + This file is part of GCC. > + > + GCC is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3, or (at your option) > + any later version. > + > + GCC is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + Under Section 7 of GPL version 3, you are granted additional > + permissions described in the GCC Runtime Library Exception, version > + 3.1, as published by the Free Software Foundation. 
> + > + You should have received a copy of the GNU General Public License and > + a copy of the GCC Runtime Library Exception along with this program; > + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > + . */ > + > +#ifndef _IMMINTRIN_H_INCLUDED > +#error "Never use directly; include inst= ead." > +#endif > + > +#ifndef _AVXIFMAINTRIN_H_INCLUDED > +#define _AVXIFMAINTRIN_H_INCLUDED > + > +#ifndef __AVXIFMA__ > +#pragma GCC push_options > +#pragma GCC target("avxifma") > +#define __DISABLE_AVXIFMA__ > +#endif /* __AVXIFMA__ */ > + > +extern __inline __m128i > +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) > +_mm_madd52lo_avx_epu64 (__m128i __X, __m128i __Y, __m128i __Z) > +{ > + return (__m128i) __builtin_ia32_vpmadd52luq128 ((__v2di) __X, > + (__v2di) __Y, > + (__v2di) __Z); > +} > + > +extern __inline __m128i > +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) > +_mm_madd52hi_avx_epu64 (__m128i __X, __m128i __Y, __m128i __Z) > +{ > + return (__m128i) __builtin_ia32_vpmadd52huq128 ((__v2di) __X, > + (__v2di) __Y, > + (__v2di) __Z); > +} > + > +extern __inline __m256i > +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) > +_mm256_madd52lo_avx_epu64 (__m256i __X, __m256i __Y, __m256i __Z) > +{ > + return (__m256i) __builtin_ia32_vpmadd52luq256 ((__v4di) __X, > + (__v4di) __Y, > + (__v4di) __Z); > +} > + > +extern __inline __m256i > +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) > +_mm256_madd52hi_avx_epu64 (__m256i __X, __m256i __Y, __m256i __Z) > +{ > + return (__m256i) __builtin_ia32_vpmadd52huq256 ((__v4di) __X, > + (__v4di) __Y, > + (__v4di) __Z); > +} > + > +#ifdef __DISABLE_AVXIFMA__ > +#undef __DISABLE_AVXIFMA__ > +#pragma GCC pop_options > +#endif /* __DISABLE_AVXIFMA__ */ > + > +#endif /* _AVXIFMAINTRIN_H_INCLUDED */ > diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h > index a4c2fed7eda..9885699efd5 100644 > --- a/gcc/config/i386/cpuid.h > 
+++ b/gcc/config/i386/cpuid.h > @@ -28,6 +28,7 @@ > #define bit_AVXVNNI (1 << 4) > #define bit_AVX512BF16 (1 << 5) > #define bit_HRESET (1 << 22) > +#define bit_AVXIFMA (1 << 23) > > /* %ecx */ > #define bit_SSE3 (1 << 0) > diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-buil= tin.def > index dea52a28d28..d22d79df054 100644 > --- a/gcc/config/i386/i386-builtin.def > +++ b/gcc/config/i386/i386-builtin.def > @@ -2486,18 +2486,22 @@ BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_avx5= 12bw_ucmpv64qi3_mask, "__builti > BDESC (OPTION_MASK_ISA_AVX512BW, 0, CODE_FOR_avx512bw_ucmpv32hi3_mask, "= __builtin_ia32_ucmpw512_mask", IX86_BUILTIN_UCMPW512, UNKNOWN, (int) USI_FT= YPE_V32HI_V32HI_INT_USI) > > /* AVX512IFMA */ > -BDESC (OPTION_MASK_ISA_AVX512IFMA, 0, CODE_FOR_vpamdd52luqv8di_mask, "__= builtin_ia32_vpmadd52luq512_mask", IX86_BUILTIN_VPMADD52LUQ512, UNKNOWN, (i= nt) V8DI_FTYPE_V8DI_V8DI_V8DI_UQI) > -BDESC (OPTION_MASK_ISA_AVX512IFMA, 0, CODE_FOR_vpamdd52luqv8di_maskz, "_= _builtin_ia32_vpmadd52luq512_maskz", IX86_BUILTIN_VPMADD52LUQ512_MASKZ, UNK= NOWN, (int) V8DI_FTYPE_V8DI_V8DI_V8DI_UQI) > -BDESC (OPTION_MASK_ISA_AVX512IFMA, 0, CODE_FOR_vpamdd52huqv8di_mask, "__= builtin_ia32_vpmadd52huq512_mask", IX86_BUILTIN_VPMADD52HUQ512, UNKNOWN, (i= nt) V8DI_FTYPE_V8DI_V8DI_V8DI_UQI) > -BDESC (OPTION_MASK_ISA_AVX512IFMA, 0, CODE_FOR_vpamdd52huqv8di_maskz, "_= _builtin_ia32_vpmadd52huq512_maskz", IX86_BUILTIN_VPMADD52HUQ512_MASKZ, UNK= NOWN, (int) V8DI_FTYPE_V8DI_V8DI_V8DI_UQI) > -BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpamdd52luqv4di_mask, "__builtin_ia32_vpmadd52luq256_mask", IX86_BUILTIN_= VPMADD52LUQ256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI_V4DI_UQI) > -BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpamdd52luqv4di_maskz, "__builtin_ia32_vpmadd52luq256_maskz", IX86_BUILTI= N_VPMADD52LUQ256_MASKZ, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI_V4DI_UQI) > -BDESC (OPTION_MASK_ISA_AVX512IFMA | 
OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpamdd52huqv4di_mask, "__builtin_ia32_vpmadd52huq256_mask", IX86_BUILTIN_= VPMADD52HUQ256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI_V4DI_UQI) > -BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpamdd52huqv4di_maskz, "__builtin_ia32_vpmadd52huq256_maskz", IX86_BUILTI= N_VPMADD52HUQ256_MASKZ, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI_V4DI_UQI) > -BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpamdd52luqv2di_mask, "__builtin_ia32_vpmadd52luq128_mask", IX86_BUILTIN_= VPMADD52LUQ128, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI_V2DI_UQI) > -BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpamdd52luqv2di_maskz, "__builtin_ia32_vpmadd52luq128_maskz", IX86_BUILTI= N_VPMADD52LUQ128_MASKZ, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI_V2DI_UQI) > -BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpamdd52huqv2di_mask, "__builtin_ia32_vpmadd52huq128_mask", IX86_BUILTIN_= VPMADD52HUQ128, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI_V2DI_UQI) > -BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpamdd52huqv2di_maskz, "__builtin_ia32_vpmadd52huq128_maskz", IX86_BUILTI= N_VPMADD52HUQ128_MASKZ, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI_V2DI_UQI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA, 0, CODE_FOR_vpmadd52luqv8di_mask, "__= builtin_ia32_vpmadd52luq512_mask", IX86_BUILTIN_VPMADD52LUQ512, UNKNOWN, (i= nt) V8DI_FTYPE_V8DI_V8DI_V8DI_UQI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA, 0, CODE_FOR_vpmadd52luqv8di_maskz, "_= _builtin_ia32_vpmadd52luq512_maskz", IX86_BUILTIN_VPMADD52LUQ512_MASKZ, UNK= NOWN, (int) V8DI_FTYPE_V8DI_V8DI_V8DI_UQI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA, 0, CODE_FOR_vpmadd52huqv8di_mask, "__= builtin_ia32_vpmadd52huq512_mask", IX86_BUILTIN_VPMADD52HUQ512, UNKNOWN, (i= nt) V8DI_FTYPE_V8DI_V8DI_V8DI_UQI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA, 0, CODE_FOR_vpmadd52huqv8di_maskz, "_= _builtin_ia32_vpmadd52huq512_maskz", 
IX86_BUILTIN_VPMADD52HUQ512_MASKZ, UNK= NOWN, (int) V8DI_FTYPE_V8DI_V8DI_V8DI_UQI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpmadd52luqv4di_mask, "__builtin_ia32_vpmadd52luq256_mask", IX86_BUILTIN_= VPMADD52LUQ256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI_V4DI_UQI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpmadd52luqv4di_maskz, "__builtin_ia32_vpmadd52luq256_maskz", IX86_BUILTI= N_VPMADD52LUQ256_MASKZ, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI_V4DI_UQI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpmadd52huqv4di_mask, "__builtin_ia32_vpmadd52huq256_mask", IX86_BUILTIN_= VPMADD52HUQ256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI_V4DI_UQI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpmadd52huqv4di_maskz, "__builtin_ia32_vpmadd52huq256_maskz", IX86_BUILTI= N_VPMADD52HUQ256_MASKZ, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI_V4DI_UQI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpmadd52luqv2di_mask, "__builtin_ia32_vpmadd52luq128_mask", IX86_BUILTIN_= VPMADD52LUQ128, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI_V2DI_UQI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpmadd52luqv2di_maskz, "__builtin_ia32_vpmadd52luq128_maskz", IX86_BUILTI= N_VPMADD52LUQ128_MASKZ, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI_V2DI_UQI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpmadd52huqv2di_mask, "__builtin_ia32_vpmadd52huq128_mask", IX86_BUILTIN_= VPMADD52HUQ128, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI_V2DI_UQI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, 0, CODE_FO= R_vpmadd52huqv2di_maskz, "__builtin_ia32_vpmadd52huq128_maskz", IX86_BUILTI= N_VPMADD52HUQ128_MASKZ, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI_V2DI_UQI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, OPTION_MAS= K_ISA2_AVXIFMA, CODE_FOR_vpmadd52luqv4di, "__builtin_ia32_vpmadd52luq256", = 
IX86_BUINTIN_VPMADD52LUQ256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI_V4DI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, OPTION_MAS= K_ISA2_AVXIFMA, CODE_FOR_vpmadd52huqv4di, "__builtin_ia32_vpmadd52huq256", = IX86_BUINTIN_VPMADD52HUQ256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI_V4DI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, OPTION_MAS= K_ISA2_AVXIFMA, CODE_FOR_vpmadd52luqv2di, "__builtin_ia32_vpmadd52luq128", = IX86_BUINTIN_VPMADD52LUQ128, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI_V2DI) > +BDESC (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL, OPTION_MAS= K_ISA2_AVXIFMA, CODE_FOR_vpmadd52huqv2di, "__builtin_ia32_vpmadd52huq128", = IX86_BUINTIN_VPMADD52HUQ128, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI_V2DI) > > /* AVX512VBMI */ > BDESC (OPTION_MASK_ISA_AVX512VBMI, 0, CODE_FOR_vpmultishiftqbv64qi_mask,= "__builtin_ia32_vpmultishiftqb512_mask", IX86_BUILTIN_VPMULTISHIFTQB512, U= NKNOWN, (int) V64QI_FTYPE_V64QI_V64QI_V64QI_UDI) > diff --git a/gcc/config/i386/i386-builtins.cc b/gcc/config/i386/i386-buil= tins.cc > index b5c651a1cab..d1c31bb9235 100644 > --- a/gcc/config/i386/i386-builtins.cc > +++ b/gcc/config/i386/i386-builtins.cc > @@ -279,10 +279,12 @@ def_builtin (HOST_WIDE_INT mask, HOST_WIDE_INT mask= 2, > if (((mask2 =3D=3D 0 || (mask2 & ix86_isa_flags2) !=3D 0) > && (mask =3D=3D 0 || (mask & ix86_isa_flags) !=3D 0)) > || ((mask & OPTION_MASK_ISA_MMX) !=3D 0 && TARGET_MMX_WITH_SSE) > - /* "Unified" builtin used by either AVXVNNI intrinsics or AVX51= 2VNNIVL > - non-mask intrinsics should be defined whenever avxvnni > - or avx512vnni && avx512vl exist. */ > + /* "Unified" builtin used by either AVXVNNI/AVXIFMA intrinsics > + or AVX512VNNIVL/AVX512IFMAVL non-mask intrinsics should be > + defined whenever avxvnni/avxifma or avx512vnni/avxifma && > + avx512vl exist. 
*/
>        || (mask2 == OPTION_MASK_ISA2_AVXVNNI)
> +      || (mask2 == OPTION_MASK_ISA2_AVXIFMA)
>        || (lang_hooks.builtin_function
>            == lang_hooks.builtin_function_ext_scope))
>      {
> diff --git a/gcc/config/i386/i386-c.cc b/gcc/config/i386/i386-c.cc
> index eb0e3b36a76..3494ec035d5 100644
> --- a/gcc/config/i386/i386-c.cc
> +++ b/gcc/config/i386/i386-c.cc
> @@ -633,6 +633,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
>      def_or_undef (parse_in, "__WIDEKL__");
>    if (isa_flag2 & OPTION_MASK_ISA2_AVXVNNI)
>      def_or_undef (parse_in, "__AVXVNNI__");
> +  if (isa_flag2 & OPTION_MASK_ISA2_AVXIFMA)
> +    def_or_undef (parse_in, "__AVXIFMA__");
>    if (TARGET_IAMCU)
>      {
>        def_or_undef (parse_in, "__iamcu");
> diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> index a0f8a98986e..621199be07a 100644
> --- a/gcc/config/i386/i386-expand.cc
> +++ b/gcc/config/i386/i386-expand.cc
> @@ -12367,6 +12367,8 @@ ix86_check_builtin_isa_match (unsigned int fcode,
>         OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4
>         (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VL) or
>         OPTION_MASK_ISA2_AVXVNNI
> +       (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL) or
> +       OPTION_MASK_ISA2_AVXIFMA
>       where for each such pair it is sufficient if either of the ISAs is
>       enabled, plus if it is ored with other options also those others.
>       OPTION_MASK_ISA_MMX in bisa is satisfied also if TARGET_MMX_WITH_SSE.
*/
> @@ -12396,6 +12398,17 @@ ix86_check_builtin_isa_match (unsigned int fcode,
>        isa2 |= OPTION_MASK_ISA2_AVXVNNI;
>      }
>
> +  if ((((bisa & (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL))
> +        == (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL))
> +       || (bisa2 & OPTION_MASK_ISA2_AVXIFMA) != 0)
> +      && (((isa & (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL))
> +           == (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL))
> +          || (isa2 & OPTION_MASK_ISA2_AVXIFMA) != 0))
> +    {
> +      isa |= OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512VL;
> +      isa2 |= OPTION_MASK_ISA2_AVXIFMA;
> +    }
> +
>    if ((bisa & OPTION_MASK_ISA_MMX) && !TARGET_MMX && TARGET_MMX_WITH_SSE
>        /* __builtin_ia32_maskmovq requires MMX registers. */
>        && fcode != IX86_BUILTIN_MASKMOVQ)
> diff --git a/gcc/config/i386/i386-isa.def b/gcc/config/i386/i386-isa.def
> index 83659d0bea4..6e0254ce418 100644
> --- a/gcc/config/i386/i386-isa.def
> +++ b/gcc/config/i386/i386-isa.def
> @@ -109,3 +109,4 @@ DEF_PTA(KL)
>  DEF_PTA(WIDEKL)
>  DEF_PTA(AVXVNNI)
>  DEF_PTA(AVX512FP16)
> +DEF_PTA(AVXIFMA)
> diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
> index acb2291e70f..5facb64c2a8 100644
> --- a/gcc/config/i386/i386-options.cc
> +++ b/gcc/config/i386/i386-options.cc
> @@ -226,7 +226,8 @@ static struct ix86_target_opts isa2_opts[] =
>    { "-mkl", OPTION_MASK_ISA2_KL },
>    { "-mwidekl", OPTION_MASK_ISA2_WIDEKL },
>    { "-mavxvnni", OPTION_MASK_ISA2_AVXVNNI },
> -  { "-mavx512fp16", OPTION_MASK_ISA2_AVX512FP16 }
> +  { "-mavx512fp16", OPTION_MASK_ISA2_AVX512FP16 },
> +  { "-mavxifma", OPTION_MASK_ISA2_AVXIFMA }
>  };
>  static struct ix86_target_opts isa_opts[] =
>  {
> @@ -1072,6 +1073,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
>      IX86_ATTR_ISA ("hreset", OPT_mhreset),
>      IX86_ATTR_ISA ("avxvnni", OPT_mavxvnni),
>      IX86_ATTR_ISA ("avx512fp16", OPT_mavx512fp16),
> +    IX86_ATTR_ISA ("avxifma", OPT_mavxifma),
>
>
/* enum options */
>      IX86_ATTR_ENUM ("fpmath=", OPT_mfpmath_),
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 6688d92b63c..93538c5b3c6 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -835,7 +835,8 @@ (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx,
>                      sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx,
>                      avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f,
>                      avx512bw,noavx512bw,avx512dq,noavx512dq,fma_or_avx512vl,
> -                    avx512vl,noavx512vl,avxvnni,avx512vnnivl,avx512fp16"
> +                    avx512vl,noavx512vl,avxvnni,avx512vnnivl,avx512fp16,avxifma,
> +                    avx512ifmavl"
>    (const_string "base"))
>
>  ;; Define instruction set of MMX instructions
> @@ -891,6 +892,9 @@ (define_attr "enabled" ""
>            (symbol_ref "TARGET_AVX512VNNI && TARGET_AVX512VL")
>          (eq_attr "isa" "avx512fp16")
>            (symbol_ref "TARGET_AVX512FP16")
> +        (eq_attr "isa" "avxifma") (symbol_ref "TARGET_AVXIFMA")
> +        (eq_attr "isa" "avx512ifmavl")
> +          (symbol_ref "TARGET_AVX512IFMA && TARGET_AVX512VL")
>
>          (eq_attr "mmx_isa" "native")
>            (symbol_ref "!TARGET_MMX_WITH_SSE")
> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> index 0dbaacb57ed..36e28b7063d 100644
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -1214,3 +1214,8 @@ Do not use GOT to access external symbols.
>  -param=x86-stlf-window-ninsns=
>  Target Joined UInteger Var(x86_stlf_window_ninsns) Init(64) Param
>  Instructions number above which STFL stall penalty can be compensated.
> +
> +mavxifma
> +Target Mask(ISA2_AVXIFMA) Var(ix86_isa_flags2) Save
> +Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, and
> +AVXIFMA built-in functions and code generation.
> diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h > index 6afd78c2b6f..e9d4e975243 100644 > --- a/gcc/config/i386/immintrin.h > +++ b/gcc/config/i386/immintrin.h > @@ -44,6 +44,8 @@ > > #include > > +#include > + > #include > > #include > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > index 076064f97e6..33f306a0c75 100644 > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -27867,7 +27867,7 @@ (define_int_iterator VPMADD52 > (define_int_attr vpmadd52type > [(UNSPEC_VPMADD52LUQ "luq") (UNSPEC_VPMADD52HUQ "huq")]) > > -(define_expand "vpamdd52huq_maskz" > +(define_expand "vpmadd52huq_maskz" > [(match_operand:VI8_AVX512VL 0 "register_operand") > (match_operand:VI8_AVX512VL 1 "register_operand") > (match_operand:VI8_AVX512VL 2 "register_operand") > @@ -27875,13 +27875,13 @@ (define_expand "vpamdd52huq_maskz" > (match_operand: 4 "register_operand")] > "TARGET_AVX512IFMA" > { > - emit_insn (gen_vpamdd52huq_maskz_1 ( > + emit_insn (gen_vpmadd52huq_maskz_1 ( > operands[0], operands[1], operands[2], operands[3], > CONST0_RTX (mode), operands[4])); > DONE; > }) > > -(define_expand "vpamdd52luq_maskz" > +(define_expand "vpmadd52luq_maskz" > [(match_operand:VI8_AVX512VL 0 "register_operand") > (match_operand:VI8_AVX512VL 1 "register_operand") > (match_operand:VI8_AVX512VL 2 "register_operand") > @@ -27889,26 +27889,58 @@ (define_expand "vpamdd52luq_maskz" > (match_operand: 4 "register_operand")] > "TARGET_AVX512IFMA" > { > - emit_insn (gen_vpamdd52luq_maskz_1 ( > + emit_insn (gen_vpmadd52luq_maskz_1 ( > operands[0], operands[1], operands[2], operands[3], > CONST0_RTX (mode), operands[4])); > DONE; > }) > > -(define_insn "vpamdd52" > - [(set (match_operand:VI8_AVX512VL 0 "register_operand" "=v") > - (unspec:VI8_AVX512VL > - [(match_operand:VI8_AVX512VL 1 "register_operand" "0") > - (match_operand:VI8_AVX512VL 2 "register_operand" "v") > - (match_operand:VI8_AVX512VL 3 "nonimmediate_operand" "vm")] > +(define_insn
"vpmadd52v8di" > + [(set (match_operand:V8DI 0 "register_operand" "=v") > + (unspec:V8DI > + [(match_operand:V8DI 1 "register_operand" "0") > + (match_operand:V8DI 2 "register_operand" "v") > + (match_operand:V8DI 3 "nonimmediate_operand" "vm")] > VPMADD52))] > "TARGET_AVX512IFMA" > - "vpmadd52\t{%3, %2, %0|%0, %2, %3}" > + "vpmadd52\t{%3, %2, %0|%0, %2, %3}" > + [(set_attr "type" "ssemuladd") > + (set_attr "prefix" "evex") > + (set_attr "mode" "XI")]) > + > +(define_insn "vpmadd52" > + [(set (match_operand:VI8_AVX2 0 "register_operand" "=x,v") > + (unspec:VI8_AVX2 > + [(match_operand:VI8_AVX2 1 "register_operand" "0,0") > + (match_operand:VI8_AVX2 2 "register_operand" "x,v") > + (match_operand:VI8_AVX2 3 "nonimmediate_operand" "xm,vm")] > + VPMADD52))] > + "TARGET_AVXIFMA || (TARGET_AVX512IFMA && TARGET_AVX512VL)" > + "@ > + %{vex%} vpmadd52\t{%3, %2, %0|%0, %2, %3} > + vpmadd52\t{%3, %2, %0|%0, %2, %3}" > + [(set_attr "isa" "avxifma,avx512ifmavl") > + (set_attr "type" "ssemuladd") > + (set_attr "prefix" "vex,evex") > + (set_attr "mode" "")]) > + > +(define_insn "vpmadd52_maskz_1" > + [(set (match_operand:VI8_AVX512VL 0 "register_operand" "=v") > + (vec_merge:VI8_AVX512VL > + (unspec:VI8_AVX512VL > + [(match_operand:VI8_AVX512VL 1 "register_operand" "0") > + (match_operand:VI8_AVX512VL 2 "register_operand" "v") > + (match_operand:VI8_AVX512VL 3 "nonimmediate_operand" "vm")] > + VPMADD52) > + (match_operand:VI8_AVX512VL 4 "const0_operand" "C") > + (match_operand: 5 "register_operand" "Yk")))] > + "TARGET_AVX512IFMA" > + "vpmadd52\t{%3, %2, %0%{%5%}%{z%}|%0%{%5%}%{z%}, %2, %3}" > [(set_attr "type" "ssemuladd") > (set_attr "prefix" "evex") > (set_attr "mode" "")]) > > -(define_insn "vpamdd52_mask" > +(define_insn "vpmadd52_mask" > [(set (match_operand:VI8_AVX512VL 0 "register_operand" "=v") > (vec_merge:VI8_AVX512VL > (unspec:VI8_AVX512VL > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > index cfbe32afce9..edecf5c0070 100644 > ---
a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -7060,6 +7060,11 @@ Enable/disable the generation of the WIDEKL instructions. > @cindex @code{target("avxvnni")} function attribute, x86 > Enable/disable the generation of the AVXVNNI instructions. > > +@item avxifma > +@itemx no-avxifma > +@cindex @code{target("avxifma")} function attribute, x86 > +Enable/disable the generation of the AVXIFMA instructions. > + > @item cld > @itemx no-cld > @cindex @code{target("cld")} function attribute, x86 > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index c176e2dc646..2cd617a9d44 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -1436,7 +1436,7 @@ See RS/6000 and PowerPC Options. > -mavx5124fmaps -mavx512vnni -mavx5124vnniw -mprfchw -mrdpid @gol > -mrdseed -msgx -mavx512vp2intersect -mserialize -mtsxldtrk@gol > -mamx-tile -mamx-int8 -mamx-bf16 -muintr -mhreset -mavxvnni@gol > --mavx512fp16 @gol > +-mavx512fp16 -mavxifma @gol > -mcldemote -mms-bitfields -mno-align-stringops -minline-all-stringops @gol > -minline-stringops-dynamically -mstringop-strategy=@var{alg} @gol > -mkl -mwidekl @gol > @@ -32893,6 +32893,9 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
> @need 200 > @itemx -mwidekl > @opindex mwidekl > +@need 200 > +@itemx -mavxifma > +@opindex mavxifma > These switches enable the use of instructions in the MMX, SSE, > SSE2, SSE3, SSSE3, SSE4, SSE4A, SSE4.1, SSE4.2, AVX, AVX2, AVX512F, AVX512PF, > AVX512ER, AVX512CD, AVX512VL, AVX512BW, AVX512DQ, AVX512IFMA, AVX512VBMI, SHA, > @@ -32902,8 +32905,8 @@ WBNOINVD, FMA4, PREFETCHW, RDPID, PREFETCHWT1, RDSEED, SGX, XOP, LWP, > XSAVEOPT, XSAVEC, XSAVES, RTM, HLE, TBM, MWAITX, CLZERO, PKU, AVX512VBMI2, > GFNI, VAES, WAITPKG, VPCLMULQDQ, AVX512BITALG, MOVDIRI, MOVDIR64B, AVX512BF16, > ENQCMD, AVX512VPOPCNTDQ, AVX5124FMAPS, AVX512VNNI, AVX5124VNNIW, SERIALIZE, > -UINTR, HRESET, AMXTILE, AMXINT8, AMXBF16, KL, WIDEKL, AVXVNNI, AVX512FP16 > -or CLDEMOTE extended instruction sets. Each has a corresponding > +UINTR, HRESET, AMXTILE, AMXINT8, AMXBF16, KL, WIDEKL, AVXVNNI, AVX512FP16, > +AVXIFMA or CLDEMOTE extended instruction sets. Each has a corresponding > @option{-mno-} option to disable use of these instructions. > > These extensions are also available as built-in functions: see > diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi > index c81e2ffd43a..0173acf4a65 100644 > --- a/gcc/doc/sourcebuild.texi > +++ b/gcc/doc/sourcebuild.texi > @@ -2490,6 +2490,9 @@ Target supports the execution of @code{avx512f} instructions. > @item avx512vp2intersect > Target supports the execution of @code{avx512vp2intersect} instructions. > > +@item avxifma > +Target supports the execution of @code{avxifma} instructions. > + > @item amx_tile > Target supports the execution of @code{amx-tile} instructions.
> > diff --git a/gcc/testsuite/g++.dg/other/i386-2.C b/gcc/testsuite/g++.dg/other/i386-2.C > index fba3d1ac684..5388606779b 100644 > --- a/gcc/testsuite/g++.dg/other/i386-2.C > +++ b/gcc/testsuite/g++.dg/other/i386-2.C > @@ -1,5 +1,5 @@ > /* { dg-do compile { target i?86-*-* x86_64-*-* } } */ > -/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16" } */ > +/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma" } */ > > /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h, > xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h, > diff --git a/gcc/testsuite/g++.dg/other/i386-3.C b/gcc/testsuite/g++.dg/other/i386-3.C > index 5cc0fa83457..86cedd3d32f 100644 > ---
a/gcc/testsuite/g++.dg/other/i386-3.C > +++ b/gcc/testsuite/g++.dg/other/i386-3.C > @@ -1,5 +1,5 @@ > /* { dg-do compile { target i?86-*-* x86_64-*-* } } */ > -/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16" } */ > +/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma" } */ > > /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h, > xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h, > diff --git a/gcc/testsuite/gcc.target/i386/avx-check.h b/gcc/testsuite/gcc.target/i386/avx-check.h > index 7ddca9d7b80..24ee6ab4efd 100644 > --- a/gcc/testsuite/gcc.target/i386/avx-check.h > +++ b/gcc/testsuite/gcc.target/i386/avx-check.h > @@ -22,7 +22,11 @@ main () > > /*
Run AVX test only if host has AVX support. */ > if (((ecx & (bit_AVX | bit_OSXSAVE)) == (bit_AVX | bit_OSXSAVE)) > - && avx_os_support ()) > + && avx_os_support () > +#ifdef AVXIFMA > + && __builtin_cpu_supports ("avxifma") > +#endif > + ) > { > do_test (); > #ifdef DEBUG > diff --git a/gcc/testsuite/gcc.target/i386/avx-ifma-1.c b/gcc/testsuite/gcc.target/i386/avx-ifma-1.c > new file mode 100644 > index 00000000000..a0cfc446e4d > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx-ifma-1.c > @@ -0,0 +1,20 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mavxifma -O2" } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52huq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52luq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52huq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52luq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+" 1 } } */ > + > +#include > + > +volatile __m256i x,y,z; > +volatile __m128i x_,y_,z_; > + > +void extern > +avxifma_test (void) > +{ > + x = _mm256_madd52hi_epu64 (x, y, z); > + x = _mm256_madd52lo_epu64 (x, y, z); > + x_ = _mm_madd52hi_epu64 (x_, y_, z_); > + x_ = _mm_madd52lo_epu64 (x_, y_, z_); > +} > diff --git a/gcc/testsuite/gcc.target/i386/avx-ifma-2.c b/gcc/testsuite/gcc.target/i386/avx-ifma-2.c > new file mode 100644 > index 00000000000..5f82ffec3e2 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx-ifma-2.c > @@ -0,0 +1,21 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52huq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52luq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+" 1 } } */
> +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52huq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52luq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+" 1 } } */ > + > +#include > + > +volatile __m256i x,y,z; > +volatile __m128i x_,y_,z_; > + > +__attribute__((target("avxifma"))) > +void > +avxifma_test (void) > +{ > + x = _mm256_madd52hi_epu64 (x, y, z); > + x = _mm256_madd52lo_epu64 (x, y, z); > + x_ = _mm_madd52hi_epu64 (x_, y_, z_); > + x_ = _mm_madd52lo_epu64 (x_, y_, z_); > +} > diff --git a/gcc/testsuite/gcc.target/i386/avx-ifma-3.c b/gcc/testsuite/gcc.target/i386/avx-ifma-3.c > new file mode 100644 > index 00000000000..536c1de96c5 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx-ifma-3.c > @@ -0,0 +1,16 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -march=x86-64" } */ > + > +__attribute__ ((__gnu_inline__, __always_inline__, target("avxifma"))) > +inline int > +foo (void) /* { dg-error "inlining failed in call to 'always_inline' .* target specific option mismatch" } */ > +{ > + return 0; > +} > + > +__attribute__ ((target("avx512ifma,avx512vl"))) > +int > +bar (void) > +{ > + return foo (); /* { dg-message "called from here" } */ > +} > diff --git a/gcc/testsuite/gcc.target/i386/avx-ifma-4.c b/gcc/testsuite/gcc.target/i386/avx-ifma-4.c > new file mode 100644 > index 00000000000..62d26497510 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx-ifma-4.c > @@ -0,0 +1,16 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -march=x86-64" } */ > + > +__attribute__ ((__gnu_inline__, __always_inline__, target("avx512ifma,avx512vl"))) > +inline int > +foo (void) /* { dg-error "inlining failed in call to 'always_inline' .* target specific option mismatch" } */ > +{ > + return 0; > +} > + > +__attribute__ ((target("avxifma"))) > +int >
+bar (void) > +{ > + return foo (); /* { dg-message "called from here" } */ > +} > diff --git a/gcc/testsuite/gcc.target/i386/avx-ifma-5.c b/gcc/testsuite/gcc.target/i386/avx-ifma-5.c > new file mode 100644 > index 00000000000..b6110e5f7f0 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx-ifma-5.c > @@ -0,0 +1,10 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mavxifma -mavx512ifma -mavx512vl" } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52huq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52luq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52huq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52luq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+" 1 } } */ > + > +#include > + > +#include "avx-ifma-1.c" > diff --git a/gcc/testsuite/gcc.target/i386/avx-ifma-6.c b/gcc/testsuite/gcc.target/i386/avx-ifma-6.c > new file mode 100644 > index 00000000000..6388373123c > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx-ifma-6.c > @@ -0,0 +1,20 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mavxifma -O2" } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52huq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52luq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52huq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52luq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+" 1 } } */ > + > +#include > + > +volatile __m256i x,y,z; > +volatile __m128i x_,y_,z_; > + >
+void extern > +avxifma_test (void) > +{ > + x = _mm256_madd52hi_avx_epu64 (x, y, z); > + x = _mm256_madd52lo_avx_epu64 (x, y, z); > + x_ = _mm_madd52hi_avx_epu64 (x_, y_, z_); > + x_ = _mm_madd52lo_avx_epu64 (x_, y_, z_); > +} > diff --git a/gcc/testsuite/gcc.target/i386/avx-ifma-vpmaddhuq-2.c b/gcc/testsuite/gcc.target/i386/avx-ifma-vpmaddhuq-2.c > new file mode 100644 > index 00000000000..c9efee33091 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx-ifma-vpmaddhuq-2.c > @@ -0,0 +1,72 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2 -mavxifma" } */ > +/* { dg-require-effective-target avxifma } */ > +#define AVXIFMA > +#ifndef CHECK > +#define CHECK "avx-check.h" > +#endif > + > +#ifndef TEST > +#define TEST avx_test > +#endif > + > +#include CHECK > + > +void > +CALC (long long *r, long long *s1, long long *s2, long long *s3, int size) > +{ > + int i; > + long long a,b; > + > + for (i = 0; i < size; i++) > + { > + /* Simulate higher 52 bits out of 104 bit, > + by shifting operands with 0 in lower 26 bits.
*/ > + a = s2[i] >> 26; > + b = s3[i] >> 26; > + r[i] = a * b + s1[i]; > + } > +} > + > +void > +TEST (void) > +{ > + union256i_q src1_256, src2_256, dst_256; > + union128i_q src1_128, src2_128, dst_128; > + long long dst_ref_256[4], dst_ref_128[2]; > + int i; > + > + for (i = 0; i < 4; i++) > + { > + src1_256.a[i] = 15 + 3467 * i; > + src2_256.a[i] = 9217 + i; > + src1_256.a[i] = src1_256.a[i] << 26; > + src2_256.a[i] = src2_256.a[i] << 26; > + src1_256.a[i] &= ((1LL << 52) - 1); > + src2_256.a[i] &= ((1LL << 52) - 1); > + dst_256.a[i] = -1; > + } > + > + for (i = 0; i < 2; i++) > + { > + src1_128.a[i] = 16 + 3467 * i; > + src2_128.a[i] = 9127 + i; > + src1_128.a[i] = src1_128.a[i] << 26; > + src2_128.a[i] = src2_128.a[i] << 26; > + src1_128.a[i] &= ((1LL << 52) - 1); > + src2_128.a[i] &= ((1LL << 52) - 1); > + dst_128.a[i] = -1; > + } > + > + CALC (dst_ref_256, dst_256.a, src1_256.a, src2_256.a, 4); > + dst_256.x = _mm256_madd52hi_avx_epu64 (dst_256.x, src1_256.x, src2_256.x); > + if (check_union256i_q (dst_256, dst_ref_256)) > + abort (); > + > + CALC (dst_ref_128, dst_128.a, src1_128.a, src2_128.a, 2); > + dst_128.x = _mm_madd52hi_avx_epu64 (dst_128.x, src1_128.x, src2_128.x); > + if (check_union128i_q (dst_128, dst_ref_128)) > + abort (); > + > +} > + > diff --git a/gcc/testsuite/gcc.target/i386/avx-ifma-vpmaddluq-2.c b/gcc/testsuite/gcc.target/i386/avx-ifma-vpmaddluq-2.c > new file mode 100644 > index 00000000000..600978ea9ad > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx-ifma-vpmaddluq-2.c > @@ -0,0 +1,61 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2 -mavxifma" } */ > +/* { dg-require-effective-target avxifma } */ > +#define AVXIFMA > +#ifndef CHECK > +#define CHECK "avx-check.h" > +#endif > + > +#ifndef TEST > +#define TEST avx_test > +#endif > + > +#include CHECK > + > +void > +CALC (unsigned long long *r, unsigned long long *s1, > + unsigned long long *s2, unsigned long long *s3, > + int size)
> +{ > + int i; > + > + for (i = 0; i < size; i++) > + { > + r[i] = s2[i] * s3[i] + s1[i]; > + } > +} > + > +void > +TEST (void) > +{ > + union256i_q src1_256, src2_256, dst_256; > + union128i_q src1_128, src2_128, dst_128; > + unsigned long long dst_ref_256[4], dst_ref_128[2]; > + int i; > + > + for (i = 0; i < 4; i++) > + { > + src1_256.a[i] = 3450 * i; > + src2_256.a[i] = 7863 * i; > + dst_256.a[i] = 117; > + } > + > + for (i = 0; i < 2; i++) > + { > + src1_128.a[i] = 3540 * i; > + src2_128.a[i] = 7683 * i; > + dst_128.a[i] = 117; > + } > + > + CALC (dst_ref_256, dst_256.a, src1_256.a, src2_256.a, 4); > + dst_256.x = _mm256_madd52lo_avx_epu64 (dst_256.x, src1_256.x, src2_256.x); > + if (check_union256i_q (dst_256, dst_ref_256)) > + abort (); > + > + CALC (dst_ref_128, dst_128.a, src1_128.a, src2_128.a, 2); > + dst_128.x = _mm_madd52lo_avx_epu64 (dst_128.x, src1_128.x, src2_128.x); > + if (check_union128i_q (dst_128, dst_ref_128)) > + abort (); > + > +} > + > diff --git a/gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddhuq-1.c b/gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddhuq-1a.c > similarity index 100% > rename from gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddhuq-1.c > rename to gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddhuq-1a.c > diff --git a/gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddhuq-1b.c b/gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddhuq-1b.c > new file mode 100644 > index 00000000000..67e94baa01b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddhuq-1b.c > @@ -0,0 +1,33 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mavx512ifma -mavx512vl -mavxifma -O2" } */ > +/* { dg-final { scan-assembler-times "vpmadd52huq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+" 3 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52huq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times
"vpmadd52huq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\[^\{\]" 1 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52huq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}" 1 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52huq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+" 3 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52huq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52huq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\[^\{\]" 1 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52huq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}" 1 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52huq\[ \\t\]+\[^\n\]*%zmm\[0-9\]+\[^\n\]*%zmm\[0-9\]+\[^\n\]*%zmm\[0-9\]+" 3 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52huq\[ \\t\]+\[^\n\]*%zmm\[0-9\]+\[^\n\]*%zmm\[0-9\]+\[^\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\[^\{\]" 1 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52huq\[ \\t\]+\[^\n\]*%zmm\[0-9\]+\[^\n\]*%zmm\[0-9\]+\[^\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}" 1 } } */ > + > +#include > + > +volatile __m512i _x1, _y1, _z1; > +volatile __m256i _x2, _y2, _z2; > +volatile __m128i _x3, _y3, _z3; > + > +void extern > +avx512ifma_test (void) > +{ > + _x3 = _mm_madd52hi_epu64 (_x3, _y3, _z3); > + _x3 = _mm_mask_madd52hi_epu64 (_x3, 2, _y3, _z3); > + _x3 = _mm_maskz_madd52hi_epu64 (2, _x3, _y3, _z3); > + _x2 = _mm256_madd52hi_epu64 (_x2, _y2, _z2); > + _x2 = _mm256_mask_madd52hi_epu64 (_x2, 3, _y2, _z2); > + _x2 = _mm256_maskz_madd52hi_epu64 (3, _x2, _y2, _z2); > + _x1 = _mm512_madd52hi_epu64 (_x1, _y1, _z1); > + _x1 = _mm512_mask_madd52hi_epu64 (_x1, 3, _y1, _z1); > + _x1 = _mm512_maskz_madd52hi_epu64 (3, _x1, _y1, _z1); > +} > diff --git
a/gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddluq-1.c b/gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddluq-1a.c > similarity index 100% > rename from gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddluq-1.c > rename to gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddluq-1a.c > diff --git a/gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddluq-1b.c b/gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddluq-1b.c > new file mode 100644 > index 00000000000..4b8ea27f403 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx512ifma-vpmaddluq-1b.c > @@ -0,0 +1,33 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mavx512ifma -mavx512vl -mavxifma -O2" } */ > +/* { dg-final { scan-assembler-times "vpmadd52luq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+" 3 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52luq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52luq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\[^\{\]" 1 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52luq\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}" 1 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52luq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+" 3 } } */ > +/* { dg-final { scan-assembler-times "\{vex\} vpmadd52luq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52luq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\[^\{\]" 1 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52luq\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}" 1 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52luq\[ \\t\]+\[^\n\]*%zmm\[0-9\]+\[^\n\]*%zmm\[0-9\]+\[^\n\]*%zmm\[0-9\]+" 3 } } */ > +/* { dg-final { scan-assembler-times
"vpmadd52luq\[ \\t\]+\[^\n\]*%zmm\[0-9\]+\[^\n\]*%zmm\[0-9\]+\[^\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\[^\{\]" 1 } } */ > +/* { dg-final { scan-assembler-times "vpmadd52luq\[ \\t\]+\[^\n\]*%zmm\[0-9\]+\[^\n\]*%zmm\[0-9\]+\[^\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}" 1 } } */ > + > +#include > + > +volatile __m512i _x1, _y1, _z1; > +volatile __m256i _x2, _y2, _z2; > +volatile __m128i _x3, _y3, _z3; > + > +void extern > +avx512ifma_test (void) > +{ > + _x3 = _mm_madd52lo_epu64 (_x3, _y3, _z3); > + _x3 = _mm_mask_madd52lo_epu64 (_x3, 2, _y3, _z3); > + _x3 = _mm_maskz_madd52lo_epu64 (2, _x3, _y3, _z3); > + _x2 = _mm256_madd52lo_epu64 (_x2, _y2, _z2); > + _x2 = _mm256_mask_madd52lo_epu64 (_x2, 3, _y2, _z2); > + _x2 = _mm256_maskz_madd52lo_epu64 (3, _x2, _y2, _z2); > + _x1 = _mm512_madd52lo_epu64 (_x1, _y1, _z1); > + _x1 = _mm512_mask_madd52lo_epu64 (_x1, 3, _y1, _z1); > + _x1 = _mm512_maskz_madd52lo_epu64 (3, _x1, _y1, _z1); > +} > diff --git a/gcc/testsuite/gcc.target/i386/funcspec-56.inc b/gcc/testsuite/gcc.target/i386/funcspec-56.inc > index b76dddb86a2..466555c0d06 100644 > --- a/gcc/testsuite/gcc.target/i386/funcspec-56.inc > +++ b/gcc/testsuite/gcc.target/i386/funcspec-56.inc > @@ -80,6 +80,7 @@ extern void test_keylocker (void) __attribute__((__target__("kl"))); > extern void test_widekl (void) __attribute__((__target__("widekl"))); > extern void test_avxvnni (void) __attribute__((__target__("avxvnni"))); > extern void test_avx512fp16 (void) __attribute__((__target__("avx512fp16"))); > +extern void test_avxifma (void) __attribute__((__target__("avxifma"))); > > extern void test_no_sgx (void) __attribute__((__target__("no-sgx"))); > extern void test_no_avx5124fmaps(void) __attribute__((__target__("no-avx5124fmaps"))); > @@ -161,6 +162,7 @@ extern void test_no_keylocker (void) __attribute__((__target__("no-kl"))); > extern void test_no_widekl (void) __attribute__((__target__("no-widekl"))); > extern void test_no_avxvnni (void)
__attribute__((__target__("no-avxvnni"))); > extern void test_no_avx512fp16 (void) __attribute__((__target__("no-avx512fp16"))); > +extern void test_no_avxifma (void) __attribute__((__target__("no-avxifma"))); > > extern void test_arch_nocona (void) __attribute__((__target__("arch=nocona"))); > extern void test_arch_core2 (void) __attribute__((__target__("arch=core2"))); > diff --git a/gcc/testsuite/gcc.target/i386/sse-12.c b/gcc/testsuite/gcc.target/i386/sse-12.c > index 375d4d1b4de..fde56261d8f 100644 > --- a/gcc/testsuite/gcc.target/i386/sse-12.c > +++ b/gcc/testsuite/gcc.target/i386/sse-12.c > @@ -3,7 +3,7 @@ > popcntintrin.h gfniintrin.h and mm_malloc.h are usable > with -O -std=c89 -pedantic-errors. */ > /* { dg-do compile } */ > -/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi -mavx512vbmi2 -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni" } */ > +/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi -mavx512vbmi2 -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavxifma" } */ > > #include >
> diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c > index e285c307d00..bb29555babe 100644 > --- a/gcc/testsuite/gcc.target/i386/sse-13.c > +++ b/gcc/testsuite/gcc.target/i386/sse-13.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vbmi2 -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mavx512vp2intersect -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16" } */ > +/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vbmi2 -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mavx512vp2intersect -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma" } */ > /* { dg-add-options bind_pic_locally } */ > > #include > diff --git
a/gcc/testsuite/gcc.target/i386/sse-14.c b/gcc/testsuite/gcc.target/i386/sse-14.c > index f41493b93f3..f2701ddaaf9 100644 > --- a/gcc/testsuite/gcc.target/i386/sse-14.c > +++ b/gcc/testsuite/gcc.target/i386/sse-14.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mpconfig -mwbnoinvd -mavx512vl -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16" } */ > +/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mpconfig -mwbnoinvd -mavx512vl -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma" } */ > /* { dg-add-options bind_pic_locally } */ > > #include > diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c b/gcc/testsuite/gcc.target/i386/sse-22.c > index 31492ef3697..3d196975b1e 100644 > --- a/gcc/testsuite/gcc.target/i386/sse-22.c > +++ b/gcc/testsuite/gcc.target/i386/sse-22.c > @@
-103,7 +103,7 @@ > > > #ifndef DIFFERENT_PRAGMAS > -#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512vbmi2,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni,avx512bitalg,avx512bf16,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512fp16") > +#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512vbmi2,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni,avx512bitalg,avx512bf16,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512fp16,avxifma") > #endif > > /* Following intrinsics require immediate arguments.
They > @@ -220,7 +220,7 @@ test_4 (_mm_cmpestrz, int, __m128i, int, __m128i, int= , 1) > > /* immintrin.h (AVX/AVX2/RDRND/FSGSBASE/F16C/RTM/AVX512F/SHA) */ > #ifdef DIFFERENT_PRAGMAS > -#pragma GCC target ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,a= vx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx51= 2vbmi2,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni,avx512bitalg,avx512bf= 16,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,wide= kl,avxvnni,avx512fp16") > +#pragma GCC target ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,a= vx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx51= 2vbmi2,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni,avx512bitalg,avx512bf= 16,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,wide= kl,avxvnni,avx512fp16,avxifma") > #endif > #include > test_1 (_cvtss_sh, unsigned short, float, 1) > diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.t= arget/i386/sse-23.c > index f71a7b29157..d3a233f90fc 100644 > --- a/gcc/testsuite/gcc.target/i386/sse-23.c > +++ b/gcc/testsuite/gcc.target/i386/sse-23.c > @@ -843,6 +843,6 @@ > #define __builtin_ia32_vpclmulqdq_v2di(A, B, C) __builtin_ia32_vpclmulq= dq_v2di(A, B, 1) > #define __builtin_ia32_vpclmulqdq_v8di(A, B, C) __builtin_ia32_vpclmulq= dq_v8di(A, B, 1) > > -#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm= ,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,= xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,c= lflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx= 5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku,sgx,rdpid,gfni,avx512vbmi2= ,vpclmulqdq,avx512bitalg,pconfig,wbnoinvd,avx512bf16,enqcmd,avx512vp2inters= ect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512f= p16") > +#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm= 
,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,= xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,c= lflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx= 5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku,sgx,rdpid,gfni,avx512vbmi2= ,vpclmulqdq,avx512bitalg,pconfig,wbnoinvd,avx512bf16,enqcmd,avx512vp2inters= ect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512f= p16,avxifma") > > #include > diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/ta= rget-supports.exp > index 8d45bc2427f..f9f5423398b 100644 > --- a/gcc/testsuite/lib/target-supports.exp > +++ b/gcc/testsuite/lib/target-supports.exp > @@ -9522,6 +9522,18 @@ proc check_effective_target_avxvnni { } { > } "-mavxvnni" ] > } > > +# Return 1 if avxifma instructions can be compiled. > +proc check_effective_target_avxifma { } { > + return [check_no_compiler_messages avxifma object { > + typedef long long __v4di __attribute__ ((__vector_size__ (32))); > + __v4di > + _mm256_maddlo_epu64 (__v4di __A, __v4di __B, __v4di __C) > + { > + return __builtin_ia32_vpmadd52luq256 (__A, __B, __C); > + } > + } "-O0 -mavxifma" ] > +} > + > # Return 1 if sse instructions can be compiled. > proc check_effective_target_sse { } { > return [check_no_compiler_messages sse object { > -- > 2.18.1 > --=20 BR, Hongtao