From: Hongtao Liu
Date: Mon, 7 Nov 2022 09:28:11 +0800
Subject: Re: [PATCH] Support Intel prefetchit0/t1
To: Haochen Jiang
Cc: gcc-patches@gcc.gnu.org, hongtao.liu@intel.com, Richard.Earnshaw@foss.arm.com, segher@kernel.crashing.org
In-Reply-To: <20221104074632.19951-1-haochen.jiang@intel.com>

On Fri, Nov 4, 2022 at 3:47 PM Haochen Jiang via Gcc-patches wrote:
>
> Hi all,
>
> We will take back the patches which add a new parameter on original
> builtin_prefetch and implement instruction prefetch on that.
>
> Also we consider that since we will only do that on specific backend,
> no need to add a new rtl for that.
>
> This patch will only support instructions prefetch for x86 backend.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?
Ok.
>
> BRs,
> Haochen
>
> gcc/ChangeLog:
>
>         * common/config/i386/cpuinfo.h (get_available_features):
>         Detect PREFETCHI.
>         * common/config/i386/i386-common.cc
>         (OPTION_MASK_ISA2_PREFETCHI_SET,
>         OPTION_MASK_ISA2_PREFETCHI_UNSET): New.
>         (ix86_handle_option): Handle -mprefetchi.
>         * common/config/i386/i386-cpuinfo.h
>         (enum processor_features): Add FEATURE_PREFETCHI.
>         * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY
>         for prefetchi.
>         * config.gcc: Add prfchiintrin.h.
>         * config/i386/cpuid.h (bit_PREFETCHI): New.
>         * config/i386/i386-builtin-types.def:
>         Add DEF_FUNCTION_TYPE (VOID, PCVOID, INT)
>         and DEF_FUNCTION_TYPE (VOID, PCVOID, INT, INT, INT).
>         * config/i386/i386-builtin.def (BDESC): Add new builtins.
>         * config/i386/i386-c.cc (ix86_target_macros_internal):
>         Define __PREFETCHI__.
>         * config/i386/i386-expand.cc: Handle new builtins.
>         * config/i386/i386-isa.def (PREFETCHI):
>         Add DEF_PTA(PREFETCHI).
>         * config/i386/i386-options.cc
>         (ix86_valid_target_attribute_inner_p): Handle prefetchi.
>         * config/i386/i386.md (prefetchi): New define_insn.
>         * config/i386/i386.opt: Add option -mprefetchi.
>         * config/i386/predicates.md (local_func_symbolic_operand):
>         New predicates.
>         * config/i386/x86gprintrin.h: Include prfchiintrin.h.
>         * config/i386/xmmintrin.h (enum _mm_hint): New enum for
>         prefetchi.
>         (_mm_prefetch): Handle the highest bit of enum.
>         * doc/extend.texi: Document prefetchi.
>         * doc/invoke.texi: Document -mprefetchi.
>         * doc/sourcebuild.texi: Document target prefetchi.
>         * config/i386/prfchiintrin.h: New file.
>
> gcc/testsuite/ChangeLog:
>
>         * g++.dg/other/i386-2.C: Add -mprefetchi.
>         * g++.dg/other/i386-3.C: Ditto.
>         * gcc.target/i386/avx-1.c: Ditto.
>         * gcc.target/i386/funcspec-56.inc: Add new target attribute.
>         * gcc.target/i386/sse-13.c: Add -mprefetchi.
>         * gcc.target/i386/sse-23.c: Ditto.
>         * gcc.target/i386/x86gprintrin-1.c: Ditto.
>         * gcc.target/i386/x86gprintrin-2.c: Ditto.
>         * gcc.target/i386/x86gprintrin-3.c: Ditto.
>         * gcc.target/i386/x86gprintrin-4.c: Ditto.
>         * gcc.target/i386/x86gprintrin-5.c: Ditto.
>         * gcc.target/i386/prefetchi-1.c: New test.
>         * gcc.target/i386/prefetchi-2.c: Ditto.
>         * gcc.target/i386/prefetchi-3.c: Ditto.
>         * gcc.target/i386/prefetchi-4.c: Ditto.
>
> Co-authored-by: Hongtao Liu
> ---
>  gcc/common/config/i386/cpuinfo.h              |  2 +
>  gcc/common/config/i386/i386-common.cc         | 15 ++++
>  gcc/common/config/i386/i386-cpuinfo.h         |  1 +
>  gcc/common/config/i386/i386-isas.h            |  1 +
>  gcc/config.gcc                                |  2 +-
>  gcc/config/i386/cpuid.h                       |  1 +
>  gcc/config/i386/i386-builtin-types.def        |  4 +
>  gcc/config/i386/i386-builtin.def              |  4 +
>  gcc/config/i386/i386-c.cc                     |  2 +
>  gcc/config/i386/i386-expand.cc                | 77 +++++++++++++++++++
>  gcc/config/i386/i386-isa.def                  |  1 +
>  gcc/config/i386/i386-options.cc               |  4 +-
>  gcc/config/i386/i386.md                       | 23 ++++++
>  gcc/config/i386/i386.opt                      |  4 +
>  gcc/config/i386/predicates.md                 | 15 ++++
>  gcc/config/i386/prfchiintrin.h                | 49 ++++++++++++
>  gcc/config/i386/x86gprintrin.h                |  2 +
>  gcc/config/i386/xmmintrin.h                   |  7 +-
>  gcc/doc/extend.texi                           |  5 ++
>  gcc/doc/invoke.texi                           | 10 ++-
>  gcc/doc/sourcebuild.texi                      |  3 +
>  gcc/testsuite/g++.dg/other/i386-2.C           |  2 +-
>  gcc/testsuite/g++.dg/other/i386-3.C           |  2 +-
>  gcc/testsuite/gcc.target/i386/avx-1.c         |  4 +-
>  gcc/testsuite/gcc.target/i386/funcspec-56.inc |  2 +
>  gcc/testsuite/gcc.target/i386/prefetchi-1.c   | 40 ++++++++++
>  gcc/testsuite/gcc.target/i386/prefetchi-2.c   | 26 +++++++
>  gcc/testsuite/gcc.target/i386/prefetchi-3.c   | 20 +++++
>  gcc/testsuite/gcc.target/i386/prefetchi-4.c   | 19 +++++
>  gcc/testsuite/gcc.target/i386/sse-13.c        |  4 +-
>  gcc/testsuite/gcc.target/i386/sse-23.c        |  4 +-
>  .../gcc.target/i386/x86gprintrin-1.c          |  2 +-
>  .../gcc.target/i386/x86gprintrin-2.c          |  2 +-
>  .../gcc.target/i386/x86gprintrin-3.c          |  2 +-
>  .../gcc.target/i386/x86gprintrin-4.c          |  2 +-
>  .../gcc.target/i386/x86gprintrin-5.c          |  2 +-
>  36 files changed, 345 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/config/i386/prfchiintrin.h
>  create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-4.c
>
> diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
> index a38c1b65602..ac7761699af 100644
> --- a/gcc/common/config/i386/cpuinfo.h
> +++ b/gcc/common/config/i386/cpuinfo.h
> @@ -839,6 +839,8 @@ get_available_features (struct __processor_model *cpu_model,
>          set_feature (FEATURE_HRESET);
>        if (eax & bit_CMPCCXADD)
>          set_feature(FEATURE_CMPCCXADD);
> +      if (edx & bit_PREFETCHI)
> +        set_feature (FEATURE_PREFETCHI);
>        if (avx_usable)
>          {
>            if (eax & bit_AVXVNNI)
> diff --git a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc
> index a044e28d25f..9bcae020a00 100644
> --- a/gcc/common/config/i386/i386-common.cc
> +++ b/gcc/common/config/i386/i386-common.cc
> @@ -112,6 +112,7 @@ along with GCC; see the file COPYING3. If not see
>  #define OPTION_MASK_ISA2_AVXNECONVERT_SET OPTION_MASK_ISA2_AVXNECONVERT
>  #define OPTION_MASK_ISA2_CMPCCXADD_SET OPTION_MASK_ISA2_CMPCCXADD
>  #define OPTION_MASK_ISA2_AMX_FP16_SET OPTION_MASK_ISA2_AMX_FP16
> +#define OPTION_MASK_ISA2_PREFETCHI_SET OPTION_MASK_ISA2_PREFETCHI
>
>  /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
>     as -msse4.2. */
> @@ -287,6 +288,7 @@ along with GCC; see the file COPYING3. If not see
>  #define OPTION_MASK_ISA2_AVXNECONVERT_UNSET OPTION_MASK_ISA2_AVXNECONVERT
>  #define OPTION_MASK_ISA2_CMPCCXADD_UNSET OPTION_MASK_ISA2_CMPCCXADD
>  #define OPTION_MASK_ISA2_AMX_FP16_UNSET OPTION_MASK_ISA2_AMX_FP16
> +#define OPTION_MASK_ISA2_PREFETCHI_UNSET OPTION_MASK_ISA2_PREFETCHI
>
>  /* SSE4 includes both SSE4.1 and SSE4.2. -mno-sse4 should the same
>     as -mno-sse4.1. */
> @@ -1211,6 +1213,19 @@ ix86_handle_option (struct gcc_options *opts,
>          }
>        return true;
>
> +    case OPT_mprefetchi:
> +      if (value)
> +        {
> +          opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_PREFETCHI_SET;
> +          opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_PREFETCHI_SET;
> +        }
> +      else
> +        {
> +          opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_PREFETCHI_UNSET;
> +          opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_PREFETCHI_UNSET;
> +        }
> +      return true;
> +
>      case OPT_mfma:
>        if (value)
>          {
> diff --git a/gcc/common/config/i386/i386-cpuinfo.h b/gcc/common/config/i386/i386-cpuinfo.h
> index 014174e1856..68eda7a8696 100644
> --- a/gcc/common/config/i386/i386-cpuinfo.h
> +++ b/gcc/common/config/i386/i386-cpuinfo.h
> @@ -249,6 +249,7 @@ enum processor_features
>    FEATURE_AVXNECONVERT,
>    FEATURE_CMPCCXADD,
>    FEATURE_AMX_FP16,
> +  FEATURE_PREFETCHI,
>    CPU_FEATURE_MAX
>  };
>
> diff --git a/gcc/common/config/i386/i386-isas.h b/gcc/common/config/i386/i386-isas.h
> index 7c4a71413b5..8648ea6903c 100644
> --- a/gcc/common/config/i386/i386-isas.h
> +++ b/gcc/common/config/i386/i386-isas.h
> @@ -182,4 +182,5 @@ ISA_NAMES_TABLE_START
>                          P_NONE, "-mavxneconvert")
>    ISA_NAMES_TABLE_ENTRY("cmpccxadd", FEATURE_CMPCCXADD, P_NONE, "-mcmpccxadd")
>    ISA_NAMES_TABLE_ENTRY("amx-fp16", FEATURE_AMX_FP16, P_NONE, "-mamx-fp16")
> +  ISA_NAMES_TABLE_ENTRY("prefetchi", FEATURE_PREFETCHI, P_NONE, "-mprefetchi")
>  ISA_NAMES_TABLE_END
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 1191a0df7b0..5c782b2f298 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -423,7 +423,7 @@ i[34567]86-*-* | x86_64-*-*)
>                         hresetintrin.h keylockerintrin.h avxvnniintrin.h
>                         mwaitintrin.h avx512fp16intrin.h avx512fp16vlintrin.h
>                         avxifmaintrin.h avxvnniint8intrin.h avxneconvertintrin.h
> -                       cmpccxaddintrin.h amxfp16intrin.h"
> +                       cmpccxaddintrin.h amxfp16intrin.h prfchiintrin.h"
>         ;;
>  ia64-*-*)
>         extra_headers=ia64intrin.h
> diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
> index 229c15c5950..92583261883 100644
> --- a/gcc/config/i386/cpuid.h
> +++ b/gcc/config/i386/cpuid.h
> @@ -54,6 +54,7 @@
>  #define bit_AVXVNNIINT8 (1 << 4)
>  #define bit_AVXNECONVERT (1 << 5)
>  #define bit_CMPXCHG8B (1 << 8)
> +#define bit_PREFETCHI (1 << 14)
>  #define bit_CMOV (1 << 15)
>  #define bit_MMX (1 << 23)
>  #define bit_FXSAVE (1 << 24)
> diff --git a/gcc/config/i386/i386-builtin-types.def b/gcc/config/i386/i386-builtin-types.def
> index 2af66145d4b..d10de32643f 100644
> --- a/gcc/config/i386/i386-builtin-types.def
> +++ b/gcc/config/i386/i386-builtin-types.def
> @@ -1411,3 +1411,7 @@ DEF_FUNCTION_TYPE (V8SF, PCV16BF)
>  # CMPccXADD builtins
>  DEF_FUNCTION_TYPE (INT, PINT, INT, INT, INT)
>  DEF_FUNCTION_TYPE (LONGLONG, PLONGLONG, LONGLONG, LONGLONG, INT)
> +
> +# PREFETCHI builtins
> +DEF_FUNCTION_TYPE (VOID, PCVOID, INT)
> +DEF_FUNCTION_TYPE (VOID, PCVOID, INT, INT, INT)
> diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
> index c272c392d03..837007ada8e 100644
> --- a/gcc/config/i386/i386-builtin.def
> +++ b/gcc/config/i386/i386-builtin.def
> @@ -487,6 +487,10 @@ BDESC (0, OPTION_MASK_ISA2_WIDEKL, CODE_FOR_nothing, "__builtin_ia32_aesdecwide2
>  BDESC (0, OPTION_MASK_ISA2_WIDEKL, CODE_FOR_nothing, "__builtin_ia32_aesencwide128kl_u8", IX86_BUILTIN_AESENCWIDE128KLU8, UNKNOWN, (int) UINT8_FTYPE_PV2DI_PCV2DI_PCVOID)
>  BDESC (0, OPTION_MASK_ISA2_WIDEKL, CODE_FOR_nothing, "__builtin_ia32_aesencwide256kl_u8", IX86_BUILTIN_AESENCWIDE256KLU8, UNKNOWN, (int) UINT8_FTYPE_PV2DI_PCV2DI_PCVOID)
>
> +/* PREFETCHI */
> +BDESC (0, 0, CODE_FOR_prefetchi, "__builtin_ia32_prefetchi", IX86_BUILTIN_PREFETCHI, UNKNOWN, (int) VOID_FTYPE_PCVOID_INT)
> +BDESC (0, 0, CODE_FOR_nothing, "__builtin_ia32_prefetch", IX86_BUILTIN_PREFETCH, UNKNOWN, (int) VOID_FTYPE_PCVOID_INT_INT_INT)
> +
>  BDESC_END (SPECIAL_ARGS, PURE_ARGS)
>
>  /* AVX */
> diff --git a/gcc/config/i386/i386-c.cc b/gcc/config/i386/i386-c.cc
> index ac0087a4653..07ce0f8a5a7 100644
> --- a/gcc/config/i386/i386-c.cc
> +++ b/gcc/config/i386/i386-c.cc
> @@ -657,6 +657,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
>      def_or_undef (parse_in, "__CMPCCXADD__");
>    if (isa_flag2 & OPTION_MASK_ISA2_AMX_FP16)
>      def_or_undef (parse_in, "__AMX_FP16__");
> +  if (isa_flag2 & OPTION_MASK_ISA2_PREFETCHI)
> +    def_or_undef (parse_in, "__PREFETCHI__");
>    if (TARGET_IAMCU)
>      {
>        def_or_undef (parse_in, "__iamcu");
> diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> index a37fde307d1..2e0d12c0108 100644
> --- a/gcc/config/i386/i386-expand.cc
> +++ b/gcc/config/i386/i386-expand.cc
> @@ -13035,6 +13035,83 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
>          return target;
>        }
>
> +    case IX86_BUILTIN_PREFETCH:
> +      {
> +        arg0 = CALL_EXPR_ARG (exp, 0); // const void *
> +        arg1 = CALL_EXPR_ARG (exp, 1); // const int
> +        arg2 = CALL_EXPR_ARG (exp, 2); // const int
> +        arg3 = CALL_EXPR_ARG (exp, 3); // const int
> +
> +        op0 = expand_normal (arg0);
> +        op1 = expand_normal (arg1);
> +        op2 = expand_normal (arg2);
> +        op3 = expand_normal (arg3);
> +
> +        if (!CONST_INT_P (op1) || !CONST_INT_P (op2) || !CONST_INT_P (op3))
> +          {
> +            error ("second, third and fourth argument must be a const");
> +            return const0_rtx;
> +          }
> +
> +        if (INTVAL (op3) == 1)
> +          {
> +            if (TARGET_64BIT
> +                && local_func_symbolic_operand (op0, GET_MODE (op0)))
> +              emit_insn (gen_prefetchi (op0, op2));
> +            else
> +              {
> +                warning (0, "instruction prefetch applies when in 64-bit mode"
> +                            " with RIP-relative addressing and"
> +                            " option %<-mprefetchi%>;"
> +                            " they stay NOPs otherwise");
> +                emit_insn (gen_nop ());
> +              }
> +          }
> +        else
> +          {
> +            if (!address_operand (op0, VOIDmode))
> +              {
> +                op0 = convert_memory_address (Pmode, op0);
> +                op0 = copy_addr_to_reg (op0);
> +              }
> +            emit_insn (gen_prefetch (op0, op1, op2));
> +          }
> +
> +        return 0;
> +      }
> +
> +    case IX86_BUILTIN_PREFETCHI:
> +      {
> +        arg0 = CALL_EXPR_ARG (exp, 0); // const void *
> +        arg1 = CALL_EXPR_ARG (exp, 1); // const int
> +
> +        op0 = expand_normal (arg0);
> +        op1 = expand_normal (arg1);
> +
> +        if (!CONST_INT_P (op1))
> +          {
> +            error ("second argument must be a const");
> +            return const0_rtx;
> +          }
> +
> +        /* GOT/PLT_PIC should not be available for instruction prefetch.
> +           It must be real instruction address. */
> +        if (TARGET_64BIT
> +            && local_func_symbolic_operand (op0, GET_MODE (op0)))
> +          emit_insn (gen_prefetchi (op0, op1));
> +        else
> +          {
> +            /* Ignore the hint. */
> +            warning (0, "instruction prefetch applies when in 64-bit mode"
> +                        " with RIP-relative addressing and"
> +                        " option %<-mprefetchi%>;"
> +                        " they stay NOPs otherwise");
> +            emit_insn (gen_nop ());
> +          }
> +
> +        return 0;
> +      }
> +
>      case IX86_BUILTIN_VEC_INIT_V2SI:
>      case IX86_BUILTIN_VEC_INIT_V4HI:
>      case IX86_BUILTIN_VEC_INIT_V8QI:
> diff --git a/gcc/config/i386/i386-isa.def b/gcc/config/i386/i386-isa.def
> index 55b25763957..f234dcc37d7 100644
> --- a/gcc/config/i386/i386-isa.def
> +++ b/gcc/config/i386/i386-isa.def
> @@ -114,3 +114,4 @@ DEF_PTA(AVXVNNIINT8)
>  DEF_PTA(AVXNECONVERT)
>  DEF_PTA(CMPCCXADD)
>  DEF_PTA(AMX_FP16)
> +DEF_PTA(PREFETCHI)
> diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
> index bbb8307d0b0..625739658c9 100644
> --- a/gcc/config/i386/i386-options.cc
> +++ b/gcc/config/i386/i386-options.cc
> @@ -233,7 +233,8 @@ static struct ix86_target_opts isa2_opts[] =
>    { "-mavxvnniint8", OPTION_MASK_ISA2_AVXVNNIINT8 },
>    { "-mavxneconvert", OPTION_MASK_ISA2_AVXNECONVERT },
>    { "-mcmpccxadd", OPTION_MASK_ISA2_CMPCCXADD },
> -  { "-mamx-fp16", OPTION_MASK_ISA2_AMX_FP16 }
> +  { "-mamx-fp16", OPTION_MASK_ISA2_AMX_FP16 },
> +  { "-mprefetchi", OPTION_MASK_ISA2_PREFETCHI }
>  };
>  static struct ix86_target_opts isa_opts[] =
>  {
> @@ -1086,6 +1087,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
>      IX86_ATTR_ISA ("avxneconvert", OPT_mavxneconvert),
>      IX86_ATTR_ISA ("cmpccxadd", OPT_mcmpccxadd),
>      IX86_ATTR_ISA ("amx-fp16", OPT_mamx_fp16),
> +    IX86_ATTR_ISA ("prefetchi", OPT_mprefetchi),
>
>      /* enum options */
>      IX86_ATTR_ENUM ("fpmath=", OPT_mfpmath_),
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 436eabb691a..a2b8f26714a 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -330,6 +330,9 @@
>
>    ;; For HRESET support
>    UNSPECV_HRESET
> +
> +  ;; For PREFETCHI support
> +  UNSPECV_PREFETCHI
>  ])
>
>  ;; Constants to represent rounding modes in the ROUND instruction
> @@ -23961,6 +23964,26 @@
>          (symbol_ref "memory_address_length (operands[0], false)"))
>     (set_attr "memory" "none")])
>
> +(define_insn "prefetchi"
> +  [(unspec_volatile [(match_operand 0 "local_func_symbolic_operand" "p")
> +                     (match_operand:SI 1 "const_int_operand")]
> +                    UNSPECV_PREFETCHI)]
> +  "TARGET_PREFETCHI && TARGET_64BIT"
> +{
> +  static const char * const patterns[2] = {
> +    "prefetchit1\t%0", "prefetchit0\t%0"
> +  };
> +
> +  int locality = INTVAL (operands[1]);
> +  gcc_assert (IN_RANGE (locality, 2, 3));
> +
> +  return patterns[locality - 2];
> +}
> +  [(set_attr "type" "sse")
> +   (set (attr "length_address")
> +        (symbol_ref "memory_address_length (operands[0], false)"))
> +   (set_attr "memory" "none")])
> +
>  (define_expand "stack_protect_set"
>    [(match_operand 0 "memory_operand")
>     (match_operand 1 "memory_operand")]
> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> index eaa43946341..1d91103cd54 100644
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -1238,3 +1238,7 @@ CMPCCXADD build-in functions and code generation.
>  mamx-fp16
>  Target Mask(ISA2_AMX_FP16) Var(ix86_isa_flags2) Save
>  Support AMX-FP16 built-in functions and code generation.
> +
> +mprefetchi
> +Target Mask(ISA2_PREFETCHI) Var(ix86_isa_flags2) Save
> +Support PREFETCHI built-in functions and code generation.
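As a reading aid for the "prefetchi" define_insn quoted above, here is a minimal standalone sketch of how the locality operand picks the mnemonic. It is not part of the patch: the helper name prefetchi_mnemonic is hypothetical, and where the insn would trip gcc_assert on an out-of-range locality, the sketch simply returns NULL.

```c
#include <stddef.h>

/* Hypothetical helper, not in the patch: mirrors the table lookup in the
   "prefetchi" define_insn, where locality 3 selects prefetchit0 and
   locality 2 selects prefetchit1.  */
static const char *
prefetchi_mnemonic (int locality)
{
  static const char *const patterns[2] = { "prefetchit1", "prefetchit0" };

  /* The insn asserts IN_RANGE (locality, 2, 3); this sketch returns NULL
     for anything else instead of aborting.  */
  if (locality < 2 || locality > 3)
    return NULL;

  return patterns[locality - 2];
}
```

This matches the intrinsics later in the patch, where _m_prefetchit0 passes 3 and _m_prefetchit1 passes 2 as the second builtin argument.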
> diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
> index c4141a96735..2a3f07224cc 100644
> --- a/gcc/config/i386/predicates.md
> +++ b/gcc/config/i386/predicates.md
> @@ -610,6 +610,21 @@
>    return false;
>  })
>
> +(define_predicate "local_func_symbolic_operand"
> +  (match_operand 0 "local_symbolic_operand")
> +{
> +  if (GET_CODE (op) == CONST
> +      && GET_CODE (XEXP (op, 0)) == PLUS
> +      && CONST_INT_P (XEXP (XEXP (op, 0), 1)))
> +    op = XEXP (XEXP (op, 0), 0);
> +
> +  if (GET_CODE (op) == SYMBOL_REF
> +      && !SYMBOL_REF_FUNCTION_P (op))
> +    return false;
> +
> +  return true;
> +})
> +
>  ;; Test for a legitimate @GOTOFF operand.
>  ;;
>  ;; VxWorks does not impose a fixed gap between segments; the run-time
> diff --git a/gcc/config/i386/prfchiintrin.h b/gcc/config/i386/prfchiintrin.h
> new file mode 100644
> index 00000000000..06deef488ba
> --- /dev/null
> +++ b/gcc/config/i386/prfchiintrin.h
> @@ -0,0 +1,49 @@
> +/* Copyright (C) 2022 Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3, or (at your option)
> +   any later version.
> +
> +   GCC is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +   GNU General Public License for more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
> +   <http://www.gnu.org/licenses/>. */
> +
> +#if !defined _X86GPRINTRIN_H_INCLUDED
> +# error "Never use <prfchiintrin.h> directly; include <x86gprintrin.h> instead."
> +#endif
> +
> +#ifndef _PRFCHIINTRIN_H_INCLUDED
> +#define _PRFCHIINTRIN_H_INCLUDED
> +
> +#ifdef __x86_64__
> +
> +extern __inline void
> +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_m_prefetchit0 (void* __P)
> +{
> +  __builtin_ia32_prefetchi (__P, 3);
> +}
> +
> +extern __inline void
> +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_m_prefetchit1 (void* __P)
> +{
> +  __builtin_ia32_prefetchi (__P, 2);
> +}
> +
> +#endif
> +
> +#endif /* _PRFCHIINTRIN_H_INCLUDED */
> diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
> index a84fbe9137d..abe8f487f03 100644
> --- a/gcc/config/i386/x86gprintrin.h
> +++ b/gcc/config/i386/x86gprintrin.h
> @@ -74,6 +74,8 @@
>
>  #include
>
> +#include <prfchiintrin.h>
> +
>  #include
>
>  #include
> diff --git a/gcc/config/i386/xmmintrin.h b/gcc/config/i386/xmmintrin.h
> index 62659080601..ab65c430a97 100644
> --- a/gcc/config/i386/xmmintrin.h
> +++ b/gcc/config/i386/xmmintrin.h
> @@ -36,6 +36,8 @@
>  /* Constants for use with _mm_prefetch. */
>  enum _mm_hint
>  {
> +  _MM_HINT_IT0 = 19,
> +  _MM_HINT_IT1 = 18,
>    /* _MM_HINT_ET is _MM_HINT_T with set 3rd bit. */
>    _MM_HINT_ET0 = 7,
>    _MM_HINT_ET1 = 6,
> @@ -51,11 +53,12 @@ enum _mm_hint
>  extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>  _mm_prefetch (const void *__P, enum _mm_hint __I)
>  {
> -  __builtin_prefetch (__P, (__I & 0x4) >> 2, __I & 0x3);
> +  __builtin_ia32_prefetch (__P, (__I & 0x4) >> 2,
> +                           __I & 0x3, (__I & 0x10) >> 4);
>  }
>  #else
>  #define _mm_prefetch(P, I) \
> -  __builtin_prefetch ((P), ((I & 0x4) >> 2), (I & 0x3))
> +  __builtin_ia32_prefetch ((P), ((I) & 0x4) >> 2, ((I) & 0x3), ((I) & 0x10) >> 4)
>  #endif
>
>  #ifndef __SSE__
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 8d4475fc615..ba1e12b4fa9 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -7085,6 +7085,11 @@ Enable/disable the generation of the CMPccXADD instructions.
>  @cindex @code{target("amx-fp16")} function attribute, x86
>  Enable/disable the generation of the AMX-FP16 instructions.
>
> +@item prefetchi
> +@itemx no-prefetchi
> +@cindex @code{target("prefetchi")} function attribute, x86
> +Enable/disable the generation of the PREFETCHI instructions.
> +
>  @item cld
>  @itemx no-cld
>  @cindex @code{target("cld")} function attribute, x86
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e9207a3a255..bb908f81ba9 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1438,6 +1438,7 @@ See RS/6000 and PowerPC Options.
>  -mrdseed -msgx -mavx512vp2intersect -mserialize -mtsxldtrk@gol
>  -mamx-tile -mamx-int8 -mamx-bf16 -muintr -mhreset -mavxvnni@gol
>  -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16 @gol
> +-mprefetchi @gol
>  -mcldemote -mms-bitfields -mno-align-stringops -minline-all-stringops @gol
>  -minline-stringops-dynamically -mstringop-strategy=@var{alg} @gol
>  -mkl -mwidekl @gol
> @@ -32984,6 +32985,9 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
>  @need 200
>  @itemx -mamx-fp16
>  @opindex mamx-fp16
> +@need 200
> +@itemx -mprefetchi
> +@opindex mprefetchi
>  These switches enable the use of instructions in the MMX, SSE,
>  SSE2, SSE3, SSSE3, SSE4, SSE4A, SSE4.1, SSE4.2, AVX, AVX2, AVX512F, AVX512PF,
>  AVX512ER, AVX512CD, AVX512VL, AVX512BW, AVX512DQ, AVX512IFMA, AVX512VBMI, SHA,
> @@ -32994,9 +32998,9 @@ XSAVEOPT, XSAVEC, XSAVES, RTM, HLE, TBM, MWAITX, CLZERO, PKU, AVX512VBMI2,
>  GFNI, VAES, WAITPKG, VPCLMULQDQ, AVX512BITALG, MOVDIRI, MOVDIR64B, AVX512BF16,
>  ENQCMD, AVX512VPOPCNTDQ, AVX5124FMAPS, AVX512VNNI, AVX5124VNNIW, SERIALIZE,
>  UINTR, HRESET, AMXTILE, AMXINT8, AMXBF16, KL, WIDEKL, AVXVNNI, AVX512FP16,
> -AVXIFMA, AVXVNNIINT8, AVXNECONVERT, CMPCCXADD, AMX-FP16 or CLDEMOTE extended
> -instruction sets. Each has a corresponding @option{-mno-} option to disable
> -use of these instructions.
> +AVXIFMA, AVXVNNIINT8, AVXNECONVERT, CMPCCXADD, AMX-FP16, PREFETCHI or CLDEMOTE
> +extended instruction sets. Each has a corresponding @option{-mno-} option to
> +disable use of these instructions.
>
>  These extensions are also available as built-in functions: see
>  @ref{x86 Built-in Functions}, for details of the functions enabled and
> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> index 5de5e9576d5..58adb6516ed 100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -2535,6 +2535,9 @@ Target does not require strict alignment.
>  @item pie_copyreloc
>  The x86-64 target linker supports PIE with copy reloc.
>
> +@item prefetchi
> +Target supports the execution of @code{prefetchi} instructions.
> +
>  @item rdrand
>  Target supports x86 @code{rdrand} instruction.
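To make the hint encoding in the xmmintrin.h hunk above easier to follow, here is a small sketch of the bit split the patched _mm_prefetch performs before calling __builtin_ia32_prefetch. The struct and function names (prefetch_args, decode_mm_hint) are hypothetical and do not appear in the patch; only the masks, shifts, and the hint values 19, 18, 7 and 6 are taken from it.

```c
/* Hypothetical struct, not in the patch: the three integer arguments that
   the patched _mm_prefetch passes to __builtin_ia32_prefetch.  */
struct prefetch_args
{
  int rw;          /* (hint & 0x4) >> 2: 1 for the _MM_HINT_ET* hints */
  int locality;    /* hint & 0x3 */
  int instruction; /* (hint & 0x10) >> 4: 1 for _MM_HINT_IT0/_MM_HINT_IT1 */
};

/* Hypothetical helper: split an _mm_hint value using the same masks and
   shifts as the patched header.  */
static struct prefetch_args
decode_mm_hint (int hint)
{
  struct prefetch_args args;

  args.rw = (hint & 0x4) >> 2;
  args.locality = hint & 0x3;
  args.instruction = (hint & 0x10) >> 4;
  return args;
}
```

With these masks, _MM_HINT_IT0 (19) decodes to instruction 1 and locality 3, and _MM_HINT_IT1 (18) to instruction 1 and locality 2. When the instruction field is 1, the expander in i386-expand.cc passes the locality on to gen_prefetchi; otherwise it falls back to the existing data prefetch path.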
> > diff --git a/gcc/testsuite/g++.dg/other/i386-2.C b/gcc/testsuite/g++.dg/o= ther/i386-2.C > index 79b84af0a75..ec3b1864ec0 100644 > --- a/gcc/testsuite/g++.dg/other/i386-2.C > +++ b/gcc/testsuite/g++.dg/other/i386-2.C > @@ -1,5 +1,5 @@ > /* { dg-do compile { target i?86-*-* x86_64-*-* } } */ > -/* { dg-options "-O -pedantic-errors -march=3Dk8 -msse4a -m3dnow -mavx -= mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm= -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr= -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 = -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512if= ma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcnt= dq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpco= nfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mts= xldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp1= 6 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16" } */ > +/* { dg-options "-O -pedantic-errors -march=3Dk8 -msse4a -m3dnow -mavx -= mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm= -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr= -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 = -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512if= ma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcnt= dq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpco= nfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mts= xldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp1= 6 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16 -mprefetchi= " } */ > > /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h, > xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h, > diff --git 
a/gcc/testsuite/g++.dg/other/i386-3.C b/gcc/testsuite/g++.dg/o= ther/i386-3.C > index c811a4454bf..542275ca057 100644 > --- a/gcc/testsuite/g++.dg/other/i386-3.C > +++ b/gcc/testsuite/g++.dg/other/i386-3.C > @@ -1,5 +1,5 @@ > /* { dg-do compile { target i?86-*-* x86_64-*-* } } */ > -/* { dg-options "-O -fkeep-inline-functions -march=3Dk8 -msse4a -m3dnow = -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi= 2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx= -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefe= tchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mav= x512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512v= popcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg= -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserializ= e -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx= 512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16" } */ > +/* { dg-options "-O -fkeep-inline-functions -march=3Dk8 -msse4a -m3dnow = -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi= 2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx= -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefe= tchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mav= x512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512v= popcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg= -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserializ= e -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx= 512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16 -mpre= fetchi" } */ > > /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h, > xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h, > diff --git 
a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.ta= rget/i386/avx-1.c > index 051a1b59b5b..0b2b68b678d 100644 > --- a/gcc/testsuite/gcc.target/i386/avx-1.c > +++ b/gcc/testsuite/gcc.target/i386/avx-1.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -Werror-implicit-function-declaration -march=3Dk8 -= m3dnow -mavx -mavx2 -maes -mpclmul -mgfni -mavx512bw -mavx512fp16 -mavx512v= l" } */ > +/* { dg-options "-O2 -Werror-implicit-function-declaration -march=3Dk8 -= m3dnow -mavx -mavx2 -maes -mpclmul -mgfni -mavx512bw -mavx512fp16 -mavx512v= l -mprefetchi" } */ > /* { dg-add-options bind_pic_locally } */ > > #include > @@ -153,7 +153,7 @@ > #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0) > > /* xmmintrin.h */ > -#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NT= A) > +#define __builtin_ia32_prefetch(A, B, C, D) __builtin_ia32_prefetch(A, 0= , 3, 0) > #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0) > #define __builtin_ia32_vec_set_v4hi(A, D, N) \ > __builtin_ia32_vec_set_v4hi(A, D, 0) > diff --git a/gcc/testsuite/gcc.target/i386/funcspec-56.inc b/gcc/testsuit= e/gcc.target/i386/funcspec-56.inc > index 71063235374..631d5c2b950 100644 > --- a/gcc/testsuite/gcc.target/i386/funcspec-56.inc > +++ b/gcc/testsuite/gcc.target/i386/funcspec-56.inc > @@ -85,6 +85,7 @@ extern void test_avxvnniint8 (void) __attribu= te__((__target__("avxvnniint8"))); > extern void test_avxneconvert (void) __attribute__((__target__= ("avxneconvert"))); > extern void test_cmpccxadd (void) __attribute__((__target__= ("cmpccxadd"))); > extern void test_amx_fp16 (void) __attribute__((__target__= ("amx-fp16"))); > +extern void test_prefetchi (void) __attribute__((__target_= _("prefetchi"))); > > extern void test_no_sgx (void) __attribute__((__target__= ("no-sgx"))); > extern void test_no_avx5124fmaps(void) __attribute__((__target__= ("no-avx5124fmaps"))); > @@ -171,6 +172,7 @@ extern void test_no_avxvnniint8 (void) 
__attribute__((__target__("no-avxvnniint
>  extern void test_no_avxneconvert (void) __attribute__((__target__("no-avxneconvert")));
>  extern void test_no_cmpccxadd (void) __attribute__((__target__("no-cmpccxadd")));
>  extern void test_no_amx_fp16 (void) __attribute__((__target__("no-amx-fp16")));
> +extern void test_no_prefetchi (void) __attribute__((__target__("no-prefetchi")));
>
>  extern void test_arch_nocona (void) __attribute__((__target__("arch=nocona")));
>  extern void test_arch_core2 (void) __attribute__((__target__("arch=core2")));
> diff --git a/gcc/testsuite/gcc.target/i386/prefetchi-1.c b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
> new file mode 100644
> index 00000000000..80f25e70e8e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
> @@ -0,0 +1,40 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-mprefetchi -O2" } */
> +/* { dg-final { scan-assembler-times "\[ \\t\]+prefetchit0\[ \\t\]+" 2 } } */
> +/* { dg-final { scan-assembler-times "\[ \\t\]+prefetchit1\[ \\t\]+" 2 } } */
> +
> +#include
> +
> +int
> +bar (int a)
> +{
> +  return a + 1;
> +}
> +
> +int
> +foo1 (int b)
> +{
> +  _mm_prefetch (bar, _MM_HINT_IT0);
> +  return bar (b) + 1;
> +}
> +
> +int
> +foo2 (int b)
> +{
> +  _mm_prefetch (bar, _MM_HINT_IT1);
> +  return bar (b) + 1;
> +}
> +
> +int
> +foo3 (int b)
> +{
> +  _m_prefetchit0 (bar);
> +  return bar (b) + 1;
> +}
> +
> +int
> +foo4 (int b)
> +{
> +  _m_prefetchit1 (bar);
> +  return bar (b) + 1;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/prefetchi-2.c b/gcc/testsuite/gcc.target/i386/prefetchi-2.c
> new file mode 100644
> index 00000000000..e05ce9c733d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/prefetchi-2.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile { target { ia32 } } } */
> +/* { dg-options "-mprefetchi -O2" } */
> +/* { dg-final { scan-assembler-not "\[ \\t\]+prefetchit0" } } */
> +/* { dg-final { scan-assembler-not "\[ \\t\]+prefetchit1" } } */
> +
>
+#include
> +
> +int
> +bar (int a)
> +{
> +  return a + 1;
> +}
> +
> +int
> +foo1 (int b)
> +{
> +  __builtin_ia32_prefetch (bar, 0, 3, 1); /* { dg-warning "instruction prefetch applies when in 64-bit mode with RIP-relative addressing and option '-mprefetchi'; they stay NOPs otherwise" } */
> +  return bar (b) + 1;
> +}
> +
> +int
> +foo2 (int b)
> +{
> +  __builtin_ia32_prefetchi (bar, 2); /* { dg-warning "instruction prefetch applies when in 64-bit mode with RIP-relative addressing and option '-mprefetchi'; they stay NOPs otherwise" } */
> +  return bar (b) + 1;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/prefetchi-3.c b/gcc/testsuite/gcc.target/i386/prefetchi-3.c
> new file mode 100644
> index 00000000000..f0a4173d2a6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/prefetchi-3.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mprefetchi -O2" } */
> +/* { dg-final { scan-assembler-not "prefetchit0" } } */
> +/* { dg-final { scan-assembler-not "prefetchit1" } } */
> +
> +#include
> +
> +void* p;
> +
> +void extern
> +prefetchi_test1 (void)
> +{
> +  __builtin_ia32_prefetchi (p, 2); /* { dg-warning "instruction prefetch applies when in 64-bit mode with RIP-relative addressing and option '-mprefetchi'; they stay NOPs otherwise" } */
> +}
> +
> +void extern
> +prefetchi_test2 (void)
> +{
> +  __builtin_ia32_prefetch (p, 0, 3, 1); /* { dg-warning "instruction prefetch applies when in 64-bit mode with RIP-relative addressing and option '-mprefetchi'; they stay NOPs otherwise" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/prefetchi-4.c b/gcc/testsuite/gcc.target/i386/prefetchi-4.c
> new file mode 100644
> index 00000000000..73ae596d147
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/prefetchi-4.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O0" } */
> +
> +#include
> +
> +void* p;
> +
> +void extern
> +prefetch_test (void)
> +{
> +  __builtin_ia32_prefetch (p, 0, 3, 0);
> +
__builtin_ia32_prefetch (p, 0, 2, 0);
> +  __builtin_ia32_prefetch (p, 0, 1, 0);
> +  __builtin_ia32_prefetch (p, 0, 0, 0);
> +  __builtin_ia32_prefetch (p, 1, 3, 0);
> +  __builtin_ia32_prefetch (p, 1, 2, 0);
> +  __builtin_ia32_prefetch (p, 1, 1, 0);
> +  __builtin_ia32_prefetch (p, 1, 0, 0);
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
> index ca662f7bd47..f0d2d5b4975 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-13.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-13.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vbmi2 -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mavx512vp2intersect -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16" } */
> +/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vbmi2 -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mavx512vp2intersect -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mserialize -mtsxldtrk
-mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16 -mprefetchi" } */
>  /* { dg-add-options bind_pic_locally } */
>
>  #include
> @@ -125,7 +125,7 @@
>  #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
>
>  /* xmmintrin.h */
> -#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
> +#define __builtin_ia32_prefetch(A, B, C, D) __builtin_ia32_prefetch(A, 0, 3, 0)
>  #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
>  #define __builtin_ia32_vec_set_v4hi(A, D, N) \
>    __builtin_ia32_vec_set_v4hi(A, D, 0)
> diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
> index ba1310f9f89..547e441c986 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-23.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-23.c
> @@ -94,7 +94,7 @@
>  #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
>
>  /* xmmintrin.h */
> -#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
> +#define __builtin_ia32_prefetch(A, B, C, D) __builtin_ia32_prefetch(A, 0, 3, 0)
>  #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
>  #define __builtin_ia32_vec_set_v4hi(A, D, N) \
>    __builtin_ia32_vec_set_v4hi(A, D, 0)
> @@ -847,6 +847,6 @@
>  #define __builtin_ia32_cmpccxadd(A, B, C, D) __builtin_ia32_cmpccxadd(A, B, C, 1)
>  #define __builtin_ia32_cmpccxadd64(A, B, C, D) __builtin_ia32_cmpccxadd64(A, B, C, 1)
>
> -#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku,sgx,rdpid,gfni,avx512vbmi2,vpclmulqdq,avx512bitalg,pconfig,wbnoinvd,avx512bf16,enqcmd,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512fp16,avxifma,avxvnniint8,avxneconvert,cmpccxadd,amx-fp16")
> +#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku,sgx,rdpid,gfni,avx512vbmi2,vpclmulqdq,avx512bitalg,pconfig,wbnoinvd,avx512bf16,enqcmd,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512fp16,avxifma,avxvnniint8,avxneconvert,cmpccxadd,amx-fp16,prefetchi")
>
>  #include
> diff --git a/gcc/testsuite/gcc.target/i386/x86gprintrin-1.c b/gcc/testsuite/gcc.target/i386/x86gprintrin-1.c
> index 76de89d0cb7..3be40d41b37 100644
> --- a/gcc/testsuite/gcc.target/i386/x86gprintrin-1.c
> +++ b/gcc/testsuite/gcc.target/i386/x86gprintrin-1.c
> @@ -1,7 +1,7 @@
>  /* Test that is usable with -O -std=c89 -pedantic-errors. */
>  /* { dg-do compile } */
>  /* { dg-options "-O -std=c89 -pedantic-errors -march=x86-64 -madx -mbmi -mbmi2 -mcldemote -mclflushopt -mclwb -mclzero -menqcmd -mfsgsbase -mfxsr -mhreset -mlzcnt -mlwp -mmovdiri -mmwaitx -mpconfig -mpopcnt -mpku -mptwrite -mrdpid -mrdrnd -mrdseed -mrtm -mserialize -msgx -mshstk -mtbm -mtsxldtrk -mwaitpkg -mwbnoinvd -mxsave -mxsavec -mxsaveopt -mxsaves -mno-sse -mno-mmx" } */
> -/* { dg-additional-options "-mcmpccxadd -muintr" { target { ! ia32 } } } */
> +/* { dg-additional-options "-mcmpccxadd -mprefetchi -muintr" { target { !
ia32 } } } */
>
>  #include
>
> diff --git a/gcc/testsuite/gcc.target/i386/x86gprintrin-2.c b/gcc/testsuite/gcc.target/i386/x86gprintrin-2.c
> index aefad77f864..5eaeab6edf0 100644
> --- a/gcc/testsuite/gcc.target/i386/x86gprintrin-2.c
> +++ b/gcc/testsuite/gcc.target/i386/x86gprintrin-2.c
> @@ -1,7 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -Werror-implicit-function-declaration -march=x86-64 -madx -mbmi -mbmi2 -mcldemote -mclflushopt -mclwb -mclzero -menqcmd -mfsgsbase -mfxsr -mhreset -mlzcnt -mlwp -mmovdiri -mmwaitx -mpconfig -mpopcnt -mpku -mptwrite -mrdpid -mrdrnd -mrdseed -mrtm -mserialize -msgx -mshstk -mtbm -mtsxldtrk -mwaitpkg -mwbnoinvd -mxsave -mxsavec -mxsaveopt -mxsaves -mno-sse -mno-mmx" } */
>  /* { dg-add-options bind_pic_locally } */
> -/* { dg-additional-options "-mcmpccxadd -muintr" { target { ! ia32 } } } */
> +/* { dg-additional-options "-mcmpccxadd -mprefetchi -muintr" { target { ! ia32 } } } */
>
>  /* Test that the intrinsics in compile with optimization.
>     All of them are defined as inline functions that reference the proper
> diff --git a/gcc/testsuite/gcc.target/i386/x86gprintrin-3.c b/gcc/testsuite/gcc.target/i386/x86gprintrin-3.c
> index 261c9180aa0..03967f80445 100644
> --- a/gcc/testsuite/gcc.target/i386/x86gprintrin-3.c
> +++ b/gcc/testsuite/gcc.target/i386/x86gprintrin-3.c
> @@ -1,7 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O0 -Werror-implicit-function-declaration -march=x86-64 -madx -mbmi -mbmi2 -mcldemote -mclflushopt -mclwb -mclzero -menqcmd -mfsgsbase -mfxsr -mhreset -mlzcnt -mlwp -mmovdiri -mmwaitx -mpconfig -mpopcnt -mpku -mptwrite -mrdpid -mrdrnd -mrdseed -mrtm -mserialize -msgx -mshstk -mtbm -mtsxldtrk -mwaitpkg -mwbnoinvd -mxsave -mxsavec -mxsaveopt -mxsaves -mno-sse -mno-mmx" } */
>  /* { dg-add-options bind_pic_locally } */
> -/* { dg-additional-options "-mcmpccxadd -muintr" { target { !
ia32 } } } */
> +/* { dg-additional-options "-mcmpccxadd -mprefetchi -muintr" { target { ! ia32 } } } */
>
>  /* Test that the intrinsics in compile without optimization.
>     All of them are defined as inline functions that reference the proper
> diff --git a/gcc/testsuite/gcc.target/i386/x86gprintrin-4.c b/gcc/testsuite/gcc.target/i386/x86gprintrin-4.c
> index 7f76b870934..64fc337da60 100644
> --- a/gcc/testsuite/gcc.target/i386/x86gprintrin-4.c
> +++ b/gcc/testsuite/gcc.target/i386/x86gprintrin-4.c
> @@ -15,7 +15,7 @@
>
>  #ifndef DIFFERENT_PRAGMAS
>  #ifdef __x86_64__
> -#pragma GCC target ("adx,bmi,bmi2,cmpccxadd,fsgsbase,fxsr,hreset,lwp,lzcnt,popcnt,rdrnd,rdseed,tbm,rtm,serialize,tsxldtrk,uintr,xsaveopt")
> +#pragma GCC target ("adx,bmi,bmi2,cmpccxadd,fsgsbase,fxsr,hreset,lwp,lzcnt,popcnt,prefetchi,rdrnd,rdseed,tbm,rtm,serialize,tsxldtrk,uintr,xsaveopt")
>  #else
>  #pragma GCC target ("adx,bmi,bmi2,fsgsbase,fxsr,hreset,lwp,lzcnt,popcnt,rdrnd,rdseed,tbm,rtm,serialize,tsxldtrk,xsaveopt")
>  #endif
> diff --git a/gcc/testsuite/gcc.target/i386/x86gprintrin-5.c b/gcc/testsuite/gcc.target/i386/x86gprintrin-5.c
> index 54d826c4f46..8937f55e7ee 100644
> --- a/gcc/testsuite/gcc.target/i386/x86gprintrin-5.c
> +++ b/gcc/testsuite/gcc.target/i386/x86gprintrin-5.c
> @@ -32,7 +32,7 @@
>  #define __builtin_ia32_cmpccxadd64(A, B, C, D) __builtin_ia32_cmpccxadd64(A, B, C, 1)
>
>  #ifdef __x86_64__
> -#pragma GCC target ("adx,bmi,bmi2,clflushopt,clwb,clzero,cmpccxadd,enqcmd,fsgsbase,fxsr,hreset,lwp,lzcnt,mwaitx,pconfig,pku,popcnt,rdpid,rdrnd,rdseed,tbm,rtm,serialize,sgx,tsxldtrk,uintr,xsavec,xsaveopt,xsaves,wbnoinvd")
> +#pragma GCC target ("adx,bmi,bmi2,clflushopt,clwb,clzero,cmpccxadd,enqcmd,fsgsbase,fxsr,hreset,lwp,lzcnt,mwaitx,pconfig,pku,popcnt,prefetchi,rdpid,rdrnd,rdseed,tbm,rtm,serialize,sgx,tsxldtrk,uintr,xsavec,xsaveopt,xsaves,wbnoinvd")
>  #else
>  #pragma GCC target ("adx,bmi,bmi2,clflushopt,clwb,clzero,enqcmd,fsgsbase,fxsr,hreset,lwp,lzcnt,mwaitx,pconfig,pku,popcnt,rdpid,rdrnd,rdseed,tbm,rtm,serialize,sgx,tsxldtrk,xsavec,xsaveopt,xsaves,wbnoinvd")
>  #endif
> --
> 2.18.1
>

-- 
BR,
Hongtao