* [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only @ 2021-08-13 13:50 H.J. Lu 2021-08-13 13:50 ` [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only H.J. Lu ` (5 more replies) 0 siblings, 6 replies; 16+ messages in thread From: H.J. Lu @ 2021-08-13 13:50 UTC (permalink / raw) To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener <x86gprintrin.h> and target("general-regs-only") function attribute were added to GCC 11. But their implementations are incomplete. I'd like to backport the following patches to GCC 11 branch to finish them. H.J. Lu (5): x86: Add -mmwait for -mgeneral-regs-only x86: Use crc32 target option for CRC32 intrinsics x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions x86: Enable the GPR only instructions for -mgeneral-regs-only <x86gprintrin.h>: Add pragma GCC target("general-regs-only") gcc/common/config/i386/i386-common.c | 45 ++- gcc/config.gcc | 6 +- gcc/config/i386/i386-builtin.def | 8 +- gcc/config/i386/i386-builtins.c | 4 +- gcc/config/i386/i386-c.c | 2 + gcc/config/i386/i386-options.c | 12 + gcc/config/i386/i386.c | 6 +- gcc/config/i386/i386.h | 2 + gcc/config/i386/i386.md | 4 +- gcc/config/i386/i386.opt | 4 + gcc/config/i386/ia32intrin.h | 42 ++- gcc/config/i386/mwaitintrin.h | 52 +++ gcc/config/i386/pmmintrin.h | 13 +- gcc/config/i386/serializeintrin.h | 7 +- gcc/config/i386/sse.md | 4 +- gcc/config/i386/x86gprintrin.h | 13 + gcc/doc/extend.texi | 5 + gcc/doc/invoke.texi | 8 +- gcc/testsuite/gcc.target/i386/crc32-6.c | 13 + gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++ gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 + gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 + gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 + gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 + gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 + gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 +++++++++++++++++++++ gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++ gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++ gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 + gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 + 30 files changed, 717 insertions(+), 45 deletions(-) create mode 100644 gcc/config/i386/mwaitintrin.h create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c -- 2.31.1 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only 2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu @ 2021-08-13 13:50 ` H.J. Lu 2021-08-16 6:11 ` Richard Biener 2021-08-13 13:51 ` [PATCH 2/5] x86: Use crc32 target option for CRC32 intrinsics H.J. Lu ` (4 subsequent siblings) 5 siblings, 1 reply; 16+ messages in thread From: H.J. Lu @ 2021-08-13 13:50 UTC (permalink / raw) To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with -mgeneral-regs-only and make -msse3 to imply -mmwait. gcc/ * config.gcc: Install mwaitintrin.h for i[34567]86-*-* and x86_64-*-* targets. * common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET): New. (OPTION_MASK_ISA2_MWAIT_UNSET): Likewise. (ix86_handle_option): Handle -mmwait. * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins): Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on __builtin_ia32_monitor and __builtin_ia32_mwait. * config/i386/i386-options.c (isa2_opts): Add -mmwait. (ix86_valid_target_attribute_inner_p): Likewise. (ix86_option_override_internal): Enable mwait/monitor instructions for -msse3. * config/i386/i386.h (TARGET_MWAIT): New. (TARGET_MWAIT_P): Likewise. * config/i386/i386.opt: Add -mmwait. * config/i386/mwaitintrin.h: New file. * config/i386/pmmintrin.h: Include <mwaitintrin.h>. * config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with TARGET_MWAIT. (@sse3_monitor_<mode>): Likewise. * config/i386/x86gprintrin.h: Include <mwaitintrin.h>. * doc/extend.texi: Document mwait target attribute. * doc/invoke.texi: Document -mmwait. gcc/testsuite/ * gcc.target/i386/monitor-2.c: New test. (cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922) --- gcc/common/config/i386/i386-common.c | 15 +++++++ gcc/config.gcc | 6 ++- gcc/config/i386/i386-builtins.c | 4 +- gcc/config/i386/i386-options.c | 7 +++ gcc/config/i386/i386.h | 2 + gcc/config/i386/i386.opt | 4 ++ gcc/config/i386/mwaitintrin.h | 52 +++++++++++++++++++++++ gcc/config/i386/pmmintrin.h | 13 +----- gcc/config/i386/sse.md | 4 +- gcc/config/i386/x86gprintrin.h | 2 + gcc/doc/extend.texi | 5 +++ gcc/doc/invoke.texi | 8 +++- gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++ 13 files changed, 130 insertions(+), 19 deletions(-) create mode 100644 gcc/config/i386/mwaitintrin.h create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c index 6a7b5c8312f..e156cc34584 100644 --- a/gcc/common/config/i386/i386-common.c +++ b/gcc/common/config/i386/i386-common.c @@ -150,6 +150,7 @@ along with GCC; see the file COPYING3. If not see #define OPTION_MASK_ISA_F16C_SET \ (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET) #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX +#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID @@ -245,6 +246,7 @@ along with GCC; see the file COPYING3. If not see #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX +#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID @@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts, } return true; + case OPT_mmwait: + if (value) + { + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET; + } + else + { + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET; + } + return true; + case OPT_mclzero: if (value) { diff --git a/gcc/config.gcc b/gcc/config.gcc index 357b0bed067..a020e0808c9 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -414,7 +414,8 @@ i[34567]86-*-*) avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h amxbf16intrin.h x86gprintrin.h uintrintrin.h - hresetintrin.h keylockerintrin.h avxvnniintrin.h" + hresetintrin.h keylockerintrin.h avxvnniintrin.h + mwaitintrin.h" ;; x86_64-*-*) cpu_type=i386 @@ -451,7 +452,8 @@ x86_64-*-*) avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h amxbf16intrin.h x86gprintrin.h uintrintrin.h - hresetintrin.h keylockerintrin.h avxvnniintrin.h" + hresetintrin.h keylockerintrin.h avxvnniintrin.h + mwaitintrin.h" ;; ia64-*-*) extra_headers=ia64intrin.h diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c index 4fcdf4b89ee..128bd39816c 100644 --- a/gcc/config/i386/i386-builtins.c +++ b/gcc/config/i386/i386-builtins.c @@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void) VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE); /* SSE3. */ - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor", + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor", VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR); - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait", + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait", VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT); /* AES */ diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c index 18d2c0b9f99..7ecd0cf8b8c 100644 --- a/gcc/config/i386/i386-options.c +++ b/gcc/config/i386/i386-options.c @@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] = { "-mmovbe", OPTION_MASK_ISA2_MOVBE }, { "-mclzero", OPTION_MASK_ISA2_CLZERO }, { "-mmwaitx", OPTION_MASK_ISA2_MWAITX }, + { "-mmwait", OPTION_MASK_ISA2_MWAIT }, { "-mmovdir64b", OPTION_MASK_ISA2_MOVDIR64B }, { "-mwaitpkg", OPTION_MASK_ISA2_WAITPKG }, { "-mcldemote", OPTION_MASK_ISA2_CLDEMOTE }, @@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[], IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase), IX86_ATTR_ISA ("rdrnd", OPT_mrdrnd), IX86_ATTR_ISA ("mwaitx", OPT_mmwaitx), + IX86_ATTR_ISA ("mwait", OPT_mmwait), IX86_ATTR_ISA ("clzero", OPT_mclzero), IX86_ATTR_ISA ("pku", OPT_mpku), IX86_ATTR_ISA ("lwp", OPT_mlwp), @@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p, || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags)) ix86_prefetch_sse = true; + /* Enable mwait/monitor instructions for -msse3. */ + if (TARGET_SSE3_P (opts->x_ix86_isa_flags)) + opts->x_ix86_isa_flags2 + |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit; + /* Enable popcnt instruction for -msse4.2 or -mabm. */ if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags) || TARGET_ABM_P (opts->x_ix86_isa_flags)) diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 5583ec6881a..73e118900f7 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #define TARGET_CLWB_P(x) TARGET_ISA_CLWB_P(x) #define TARGET_MWAITX TARGET_ISA2_MWAITX #define TARGET_MWAITX_P(x) TARGET_ISA2_MWAITX_P(x) +#define TARGET_MWAIT TARGET_ISA2_MWAIT +#define TARGET_MWAIT_P(x) TARGET_ISA2_MWAIT_P(x) #define TARGET_PKU TARGET_ISA_PKU #define TARGET_PKU_P(x) TARGET_ISA_PKU_P(x) #define TARGET_SHSTK TARGET_ISA_SHSTK diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt index c781fdc8278..7b8547bb1c3 100644 --- a/gcc/config/i386/i386.opt +++ b/gcc/config/i386/i386.opt @@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation. mneeded Target Var(ix86_needed) Save Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property. + +mmwait +Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save +Support MWAIT and MONITOR built-in functions and code generation. diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h new file mode 100644 index 00000000000..1ecbc4abb69 --- /dev/null +++ b/gcc/config/i386/mwaitintrin.h @@ -0,0 +1,52 @@ +/* Copyright (C) 2021 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + <http://www.gnu.org/licenses/>. */ + +#ifndef _MWAITINTRIN_H_INCLUDED +#define _MWAITINTRIN_H_INCLUDED + +#ifndef __MWAIT__ +#pragma GCC push_options +#pragma GCC target("mwait") +#define __DISABLE_MWAIT__ +#endif /* __MWAIT__ */ + +extern __inline void +__attribute__((__gnu_inline__, __always_inline__, __artificial__)) +_mm_monitor (void const * __P, unsigned int __E, unsigned int __H) +{ + __builtin_ia32_monitor (__P, __E, __H); +} + +extern __inline void +__attribute__((__gnu_inline__, __always_inline__, __artificial__)) +_mm_mwait (unsigned int __E, unsigned int __H) +{ + __builtin_ia32_mwait (__E, __H); +} + +#ifdef __DISABLE_MWAIT__ +#undef __DISABLE_MWAIT__ +#pragma GCC pop_options +#endif /* __DISABLE_MWAIT__ */ + +#endif /* _MWAITINTRIN_H_INCLUDED */ diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h index fa9c5bb8b9f..f8102d2be23 100644 --- a/gcc/config/i386/pmmintrin.h +++ b/gcc/config/i386/pmmintrin.h @@ -29,6 +29,7 @@ /* We need definitions from the SSE2 and SSE header files*/ #include <emmintrin.h> +#include <mwaitintrin.h> #ifndef __SSE3__ #pragma GCC push_options @@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P) return (__m128i) __builtin_ia32_lddqu ((char const *)__P); } -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_monitor (void const * __P, unsigned int __E, unsigned int __H) -{ - __builtin_ia32_monitor (__P, __E, __H); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mwait (unsigned int __E, unsigned int __H) -{ - __builtin_ia32_mwait (__E, __H); -} - #ifdef __DISABLE_SSE3__ #undef __DISABLE_SSE3__ #pragma GCC pop_options diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 3f81abc7804..43afe3dabed 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait" [(unspec_volatile [(match_operand:SI 0 "register_operand" "c") (match_operand:SI 1 "register_operand" "a")] UNSPECV_MWAIT)] - "TARGET_SSE3" + "TARGET_MWAIT" ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used. ;; Since 32bit register operands are implicitly zero extended to 64bit, ;; we only need to set up 32bit registers. @@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>" (match_operand:SI 1 "register_operand" "c") (match_operand:SI 2 "register_operand" "d")] UNSPECV_MONITOR)] - "TARGET_SSE3" + "TARGET_MWAIT" ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in ;; RCX and RDX are used. Since 32bit register operands are implicitly ;; zero extended to 64bit, we only need to set up 32bit registers. diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h index ceda501252c..7793032ba90 100644 --- a/gcc/config/i386/x86gprintrin.h +++ b/gcc/config/i386/x86gprintrin.h @@ -56,6 +56,8 @@ #include <movdirintrin.h> +#include <mwaitintrin.h> + #include <mwaitxintrin.h> #include <pconfigintrin.h> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 1bc66cce2b8..1acfaf1d345 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions. @cindex @code{target("movdiri")} function attribute, x86 Enable/disable the generation of the MOVDIRI instructions. +@item mwait +@itemx no-mwait +@cindex @code{target("mwait")} function attribute, x86 +Enable/disable the generation of the MWAIT and MONITOR instructions. + @item mwaitx @itemx no-mwaitx @cindex @code{target("mwaitx")} function attribute, x86 diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 7f13ffb79e1..3e1f0bc8fad 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options. -mno-wide-multiply -mrtd -malign-double @gol -mpreferred-stack-boundary=@var{num} @gol -mincoming-stack-boundary=@var{num} @gol --mcld -mcx16 -msahf -mmovbe -mcrc32 @gol +-mcld -mcx16 -msahf -mmovbe -mcrc32 -mmwait @gol -mrecip -mrecip=@var{opt} @gol -mvzeroupper -mprefer-avx128 -mprefer-vector-width=@var{opt} @gol -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol @@ -31159,6 +31159,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi}, @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction. +@item -mmwait +@opindex mmwait +This option enables built-in functions @code{__builtin_ia32_monitor}, +and @code{__builtin_ia32_mwait} to generate the @code{monitor} and +@code{mwait} machine instructions. + @item -mrecip @opindex mrecip This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c new file mode 100644 index 00000000000..96eeec070f0 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/monitor-2.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */ + +/* Verify that they work in both 32bit and 64bit. */ + +#include <x86gprintrin.h> + +void +foo (char *p, int x, int y, int z) +{ + _mm_monitor (p, y, x); + _mm_mwait (z, y); +} + +void +bar (char *p, long x, long y, long z) +{ + _mm_monitor (p, y, x); + _mm_mwait (z, y); +} + +void +foo1 (char *p) +{ + _mm_monitor (p, 0, 0); + _mm_mwait (0, 0); +} -- 2.31.1 ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only 2021-08-13 13:50 ` [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only H.J. Lu @ 2021-08-16 6:11 ` Richard Biener 2021-08-16 12:25 ` H.J. Lu 0 siblings, 1 reply; 16+ messages in thread From: Richard Biener @ 2021-08-16 6:11 UTC (permalink / raw) To: H.J. Lu; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with > -mgeneral-regs-only and make -msse3 to imply -mmwait. Adding new options requires to bump the LTO streaming minor version (I know we forgot it once on the branch already when adding a new --param). Please take care of this when backporting. Richard. > gcc/ > > * config.gcc: Install mwaitintrin.h for i[34567]86-*-* and > x86_64-*-* targets. > * common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET): > New. > (OPTION_MASK_ISA2_MWAIT_UNSET): Likewise. > (ix86_handle_option): Handle -mmwait. > * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins): > Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on > __builtin_ia32_monitor and __builtin_ia32_mwait. > * config/i386/i386-options.c (isa2_opts): Add -mmwait. > (ix86_valid_target_attribute_inner_p): Likewise. > (ix86_option_override_internal): Enable mwait/monitor > instructions for -msse3. > * config/i386/i386.h (TARGET_MWAIT): New. > (TARGET_MWAIT_P): Likewise. > * config/i386/i386.opt: Add -mmwait. > * config/i386/mwaitintrin.h: New file. > * config/i386/pmmintrin.h: Include <mwaitintrin.h>. > * config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with > TARGET_MWAIT. > (@sse3_monitor_<mode>): Likewise. > * config/i386/x86gprintrin.h: Include <mwaitintrin.h>. > * doc/extend.texi: Document mwait target attribute. > * doc/invoke.texi: Document -mmwait. > > gcc/testsuite/ > > * gcc.target/i386/monitor-2.c: New test. > > (cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922) > --- > gcc/common/config/i386/i386-common.c | 15 +++++++ > gcc/config.gcc | 6 ++- > gcc/config/i386/i386-builtins.c | 4 +- > gcc/config/i386/i386-options.c | 7 +++ > gcc/config/i386/i386.h | 2 + > gcc/config/i386/i386.opt | 4 ++ > gcc/config/i386/mwaitintrin.h | 52 +++++++++++++++++++++++ > gcc/config/i386/pmmintrin.h | 13 +----- > gcc/config/i386/sse.md | 4 +- > gcc/config/i386/x86gprintrin.h | 2 + > gcc/doc/extend.texi | 5 +++ > gcc/doc/invoke.texi | 8 +++- > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++ > 13 files changed, 130 insertions(+), 19 deletions(-) > create mode 100644 gcc/config/i386/mwaitintrin.h > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c > > diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c > index 6a7b5c8312f..e156cc34584 100644 > --- a/gcc/common/config/i386/i386-common.c > +++ b/gcc/common/config/i386/i386-common.c > @@ -150,6 +150,7 @@ along with GCC; see the file COPYING3. If not see > #define OPTION_MASK_ISA_F16C_SET \ > (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET) > #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX > +#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT > #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO > #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU > #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID > @@ -245,6 +246,7 @@ along with GCC; see the file COPYING3. If not see > #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES > #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB > #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX > +#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT > #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO > #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU > #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID > @@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts, > } > return true; > > + case OPT_mmwait: > + if (value) > + { > + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET; > + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET; > + } > + else > + { > + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET; > + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET; > + } > + return true; > + > case OPT_mclzero: > if (value) > { > diff --git a/gcc/config.gcc b/gcc/config.gcc > index 357b0bed067..a020e0808c9 100644 > --- a/gcc/config.gcc > +++ b/gcc/config.gcc > @@ -414,7 +414,8 @@ i[34567]86-*-*) > avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h > tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h > amxbf16intrin.h x86gprintrin.h uintrintrin.h > - hresetintrin.h keylockerintrin.h avxvnniintrin.h" > + hresetintrin.h keylockerintrin.h avxvnniintrin.h > + mwaitintrin.h" > ;; > x86_64-*-*) > cpu_type=i386 > @@ -451,7 +452,8 @@ x86_64-*-*) > avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h > tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h > amxbf16intrin.h x86gprintrin.h uintrintrin.h > - hresetintrin.h keylockerintrin.h avxvnniintrin.h" > + hresetintrin.h keylockerintrin.h avxvnniintrin.h > + mwaitintrin.h" > ;; > ia64-*-*) > extra_headers=ia64intrin.h > diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c > index 4fcdf4b89ee..128bd39816c 100644 > --- a/gcc/config/i386/i386-builtins.c > +++ b/gcc/config/i386/i386-builtins.c > @@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void) > VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE); > > /* SSE3. */ > - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor", > + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor", > VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR); > - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait", > + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait", > VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT); > > /* AES */ > diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c > index 18d2c0b9f99..7ecd0cf8b8c 100644 > --- a/gcc/config/i386/i386-options.c > +++ b/gcc/config/i386/i386-options.c > @@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] = > { "-mmovbe", OPTION_MASK_ISA2_MOVBE }, > { "-mclzero", OPTION_MASK_ISA2_CLZERO }, > { "-mmwaitx", OPTION_MASK_ISA2_MWAITX }, > + { "-mmwait", OPTION_MASK_ISA2_MWAIT }, > { "-mmovdir64b", OPTION_MASK_ISA2_MOVDIR64B }, > { "-mwaitpkg", OPTION_MASK_ISA2_WAITPKG }, > { "-mcldemote", OPTION_MASK_ISA2_CLDEMOTE }, > @@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[], > IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase), > IX86_ATTR_ISA ("rdrnd", OPT_mrdrnd), > IX86_ATTR_ISA ("mwaitx", OPT_mmwaitx), > + IX86_ATTR_ISA ("mwait", OPT_mmwait), > IX86_ATTR_ISA ("clzero", OPT_mclzero), > IX86_ATTR_ISA ("pku", OPT_mpku), > IX86_ATTR_ISA ("lwp", OPT_mlwp), > @@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p, > || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags)) > ix86_prefetch_sse = true; > > + /* Enable mwait/monitor instructions for -msse3. */ > + if (TARGET_SSE3_P (opts->x_ix86_isa_flags)) > + opts->x_ix86_isa_flags2 > + |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit; > + > /* Enable popcnt instruction for -msse4.2 or -mabm. */ > if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags) > || TARGET_ABM_P (opts->x_ix86_isa_flags)) > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h > index 5583ec6881a..73e118900f7 100644 > --- a/gcc/config/i386/i386.h > +++ b/gcc/config/i386/i386.h > @@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > #define TARGET_CLWB_P(x) TARGET_ISA_CLWB_P(x) > #define TARGET_MWAITX TARGET_ISA2_MWAITX > #define TARGET_MWAITX_P(x) TARGET_ISA2_MWAITX_P(x) > +#define TARGET_MWAIT TARGET_ISA2_MWAIT > +#define TARGET_MWAIT_P(x) TARGET_ISA2_MWAIT_P(x) > #define TARGET_PKU TARGET_ISA_PKU > #define TARGET_PKU_P(x) TARGET_ISA_PKU_P(x) > #define TARGET_SHSTK TARGET_ISA_SHSTK > diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt > index c781fdc8278..7b8547bb1c3 100644 > --- a/gcc/config/i386/i386.opt > +++ b/gcc/config/i386/i386.opt > @@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation. > mneeded > Target Var(ix86_needed) Save > Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property. > + > +mmwait > +Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save > +Support MWAIT and MONITOR built-in functions and code generation. > diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h > new file mode 100644 > index 00000000000..1ecbc4abb69 > --- /dev/null > +++ b/gcc/config/i386/mwaitintrin.h > @@ -0,0 +1,52 @@ > +/* Copyright (C) 2021 Free Software Foundation, Inc. > + > + This file is part of GCC. > + > + GCC is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3, or (at your option) > + any later version. > + > + GCC is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + Under Section 7 of GPL version 3, you are granted additional > + permissions described in the GCC Runtime Library Exception, version > + 3.1, as published by the Free Software Foundation. > + > + You should have received a copy of the GNU General Public License and > + a copy of the GCC Runtime Library Exception along with this program; > + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > + <http://www.gnu.org/licenses/>. */ > + > +#ifndef _MWAITINTRIN_H_INCLUDED > +#define _MWAITINTRIN_H_INCLUDED > + > +#ifndef __MWAIT__ > +#pragma GCC push_options > +#pragma GCC target("mwait") > +#define __DISABLE_MWAIT__ > +#endif /* __MWAIT__ */ > + > +extern __inline void > +__attribute__((__gnu_inline__, __always_inline__, __artificial__)) > +_mm_monitor (void const * __P, unsigned int __E, unsigned int __H) > +{ > + __builtin_ia32_monitor (__P, __E, __H); > +} > + > +extern __inline void > +__attribute__((__gnu_inline__, __always_inline__, __artificial__)) > +_mm_mwait (unsigned int __E, unsigned int __H) > +{ > + __builtin_ia32_mwait (__E, __H); > +} > + > +#ifdef __DISABLE_MWAIT__ > +#undef __DISABLE_MWAIT__ > +#pragma GCC pop_options > +#endif /* __DISABLE_MWAIT__ */ > + > +#endif /* _MWAITINTRIN_H_INCLUDED */ > diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h > index fa9c5bb8b9f..f8102d2be23 100644 > --- a/gcc/config/i386/pmmintrin.h > +++ b/gcc/config/i386/pmmintrin.h > @@ -29,6 +29,7 @@ > > /* We need definitions from the SSE2 and SSE header files*/ > #include <emmintrin.h> > +#include <mwaitintrin.h> > > #ifndef __SSE3__ > #pragma GCC push_options > @@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P) > return (__m128i) __builtin_ia32_lddqu ((char const *)__P); > } > > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) > -_mm_monitor (void const * __P, unsigned int __E, unsigned int __H) > -{ > - __builtin_ia32_monitor (__P, __E, __H); > -} > - > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) > -_mm_mwait (unsigned int __E, unsigned int __H) > -{ > - __builtin_ia32_mwait (__E, __H); > -} > - > #ifdef __DISABLE_SSE3__ > #undef __DISABLE_SSE3__ > #pragma GCC pop_options > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > index 3f81abc7804..43afe3dabed 100644 > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait" > [(unspec_volatile [(match_operand:SI 0 "register_operand" "c") > (match_operand:SI 1 "register_operand" "a")] > UNSPECV_MWAIT)] > - "TARGET_SSE3" > + "TARGET_MWAIT" > ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used. > ;; Since 32bit register operands are implicitly zero extended to 64bit, > ;; we only need to set up 32bit registers. > @@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>" > (match_operand:SI 1 "register_operand" "c") > (match_operand:SI 2 "register_operand" "d")] > UNSPECV_MONITOR)] > - "TARGET_SSE3" > + "TARGET_MWAIT" > ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in > ;; RCX and RDX are used. Since 32bit register operands are implicitly > ;; zero extended to 64bit, we only need to set up 32bit registers. > diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h > index ceda501252c..7793032ba90 100644 > --- a/gcc/config/i386/x86gprintrin.h > +++ b/gcc/config/i386/x86gprintrin.h > @@ -56,6 +56,8 @@ > > #include <movdirintrin.h> > > +#include <mwaitintrin.h> > + > #include <mwaitxintrin.h> > > #include <pconfigintrin.h> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > index 1bc66cce2b8..1acfaf1d345 100644 > --- a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions. > @cindex @code{target("movdiri")} function attribute, x86 > Enable/disable the generation of the MOVDIRI instructions. > > +@item mwait > +@itemx no-mwait > +@cindex @code{target("mwait")} function attribute, x86 > +Enable/disable the generation of the MWAIT and MONITOR instructions. > + > @item mwaitx > @itemx no-mwaitx > @cindex @code{target("mwaitx")} function attribute, x86 > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 7f13ffb79e1..3e1f0bc8fad 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options. > -mno-wide-multiply -mrtd -malign-double @gol > -mpreferred-stack-boundary=@var{num} @gol > -mincoming-stack-boundary=@var{num} @gol > --mcld -mcx16 -msahf -mmovbe -mcrc32 @gol > +-mcld -mcx16 -msahf -mmovbe -mcrc32 -mmwait @gol > -mrecip -mrecip=@var{opt} @gol > -mvzeroupper -mprefer-avx128 -mprefer-vector-width=@var{opt} @gol > -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol > @@ -31159,6 +31159,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi}, > @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and > @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction. > > +@item -mmwait > +@opindex mmwait > +This option enables built-in functions @code{__builtin_ia32_monitor}, > +and @code{__builtin_ia32_mwait} to generate the @code{monitor} and > +@code{mwait} machine instructions. > + > @item -mrecip > @opindex mrecip > This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions > diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c > new file mode 100644 > index 00000000000..96eeec070f0 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/monitor-2.c > @@ -0,0 +1,27 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */ > + > +/* Verify that they work in both 32bit and 64bit. */ > + > +#include <x86gprintrin.h> > + > +void > +foo (char *p, int x, int y, int z) > +{ > + _mm_monitor (p, y, x); > + _mm_mwait (z, y); > +} > + > +void > +bar (char *p, long x, long y, long z) > +{ > + _mm_monitor (p, y, x); > + _mm_mwait (z, y); > +} > + > +void > +foo1 (char *p) > +{ > + _mm_monitor (p, 0, 0); > + _mm_mwait (0, 0); > +} > -- > 2.31.1 > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only 2021-08-16 6:11 ` Richard Biener @ 2021-08-16 12:25 ` H.J. Lu 2021-08-16 12:28 ` Richard Biener 0 siblings, 1 reply; 16+ messages in thread From: H.J. Lu @ 2021-08-16 12:25 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek On Sun, Aug 15, 2021 at 11:11 PM Richard Biener <richard.guenther@gmail.com> wrote: > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with > > -mgeneral-regs-only and make -msse3 to imply -mmwait. > > Adding new options requires to bump the LTO streaming minor version > (I know we forgot it once on the branch already when adding a new --param). > > Please take care of this when backporting. It was updated today: commit dce5367eecfb0729cad0325240d614721afb39e3 Author: Martin Liska <mliska@suse.cz> Date: Mon Aug 16 13:02:54 2021 +0200 LTO: bump minor version Bump the LTO_minor_version due to changes in 52f0aa4dee8401ef3958dbf789780b0ee877beab PR c/100150 gcc/ChangeLog: * lto-streamer.h (LTO_minor_version): Bump. Do I need to do it again if I can check in my patches this week? Thanks. > Richard. > > > gcc/ > > > > * config.gcc: Install mwaitintrin.h for i[34567]86-*-* and > > x86_64-*-* targets. > > * common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET): > > New. > > (OPTION_MASK_ISA2_MWAIT_UNSET): Likewise. > > (ix86_handle_option): Handle -mmwait. > > * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins): > > Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on > > __builtin_ia32_monitor and __builtin_ia32_mwait. > > * config/i386/i386-options.c (isa2_opts): Add -mmwait. > > (ix86_valid_target_attribute_inner_p): Likewise. > > (ix86_option_override_internal): Enable mwait/monitor > > instructions for -msse3. > > * config/i386/i386.h (TARGET_MWAIT): New. > > (TARGET_MWAIT_P): Likewise. > > * config/i386/i386.opt: Add -mmwait. > > * config/i386/mwaitintrin.h: New file. > > * config/i386/pmmintrin.h: Include <mwaitintrin.h>. > > * config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with > > TARGET_MWAIT. > > (@sse3_monitor_<mode>): Likewise. > > * config/i386/x86gprintrin.h: Include <mwaitintrin.h>. > > * doc/extend.texi: Document mwait target attribute. > > * doc/invoke.texi: Document -mmwait. > > > > gcc/testsuite/ > > > > * gcc.target/i386/monitor-2.c: New test. > > > > (cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922) > > --- > > gcc/common/config/i386/i386-common.c | 15 +++++++ > > gcc/config.gcc | 6 ++- > > gcc/config/i386/i386-builtins.c | 4 +- > > gcc/config/i386/i386-options.c | 7 +++ > > gcc/config/i386/i386.h | 2 + > > gcc/config/i386/i386.opt | 4 ++ > > gcc/config/i386/mwaitintrin.h | 52 +++++++++++++++++++++++ > > gcc/config/i386/pmmintrin.h | 13 +----- > > gcc/config/i386/sse.md | 4 +- > > gcc/config/i386/x86gprintrin.h | 2 + > > gcc/doc/extend.texi | 5 +++ > > gcc/doc/invoke.texi | 8 +++- > > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++ > > 13 files changed, 130 insertions(+), 19 deletions(-) > > create mode 100644 gcc/config/i386/mwaitintrin.h > > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c > > > > diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c > > index 6a7b5c8312f..e156cc34584 100644 > > --- a/gcc/common/config/i386/i386-common.c > > +++ b/gcc/common/config/i386/i386-common.c > > @@ -150,6 +150,7 @@ along with GCC; see the file COPYING3. If not see > > #define OPTION_MASK_ISA_F16C_SET \ > > (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET) > > #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX > > +#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT > > #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO > > #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU > > #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID > > @@ -245,6 +246,7 @@ along with GCC; see the file COPYING3. If not see > > #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES > > #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB > > #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX > > +#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT > > #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO > > #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU > > #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID > > @@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts, > > } > > return true; > > > > + case OPT_mmwait: > > + if (value) > > + { > > + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET; > > + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET; > > + } > > + else > > + { > > + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET; > > + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET; > > + } > > + return true; > > + > > case OPT_mclzero: > > if (value) > > { > > diff --git a/gcc/config.gcc b/gcc/config.gcc > > index 357b0bed067..a020e0808c9 100644 > > --- a/gcc/config.gcc > > +++ b/gcc/config.gcc > > @@ -414,7 +414,8 @@ i[34567]86-*-*) > > avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h > > tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h > > amxbf16intrin.h x86gprintrin.h uintrintrin.h > > - hresetintrin.h keylockerintrin.h avxvnniintrin.h" > > + hresetintrin.h keylockerintrin.h avxvnniintrin.h > > + mwaitintrin.h" > > ;; > > x86_64-*-*) > > cpu_type=i386 > > @@ -451,7 +452,8 @@ x86_64-*-*) > > avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h > > tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h > > amxbf16intrin.h x86gprintrin.h uintrintrin.h > > - hresetintrin.h keylockerintrin.h avxvnniintrin.h" > > + hresetintrin.h keylockerintrin.h avxvnniintrin.h > > + mwaitintrin.h" > > ;; > > ia64-*-*) > > extra_headers=ia64intrin.h > > diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c > > index 4fcdf4b89ee..128bd39816c 100644 > > --- a/gcc/config/i386/i386-builtins.c > > +++ b/gcc/config/i386/i386-builtins.c > > @@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void) > > VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE); > > > > /* SSE3. */ > > - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor", > > + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor", > > VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR); > > - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait", > > + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait", > > VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT); > > > > /* AES */ > > diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c > > index 18d2c0b9f99..7ecd0cf8b8c 100644 > > --- a/gcc/config/i386/i386-options.c > > +++ b/gcc/config/i386/i386-options.c > > @@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] = > > { "-mmovbe", OPTION_MASK_ISA2_MOVBE }, > > { "-mclzero", OPTION_MASK_ISA2_CLZERO }, > > { "-mmwaitx", OPTION_MASK_ISA2_MWAITX }, > > + { "-mmwait", OPTION_MASK_ISA2_MWAIT }, > > { "-mmovdir64b", OPTION_MASK_ISA2_MOVDIR64B }, > > { "-mwaitpkg", OPTION_MASK_ISA2_WAITPKG }, > > { "-mcldemote", OPTION_MASK_ISA2_CLDEMOTE }, > > @@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[], > > IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase), > > IX86_ATTR_ISA ("rdrnd", OPT_mrdrnd), > > IX86_ATTR_ISA ("mwaitx", OPT_mmwaitx), > > + IX86_ATTR_ISA ("mwait", OPT_mmwait), > > IX86_ATTR_ISA ("clzero", OPT_mclzero), > > IX86_ATTR_ISA ("pku", OPT_mpku), > > IX86_ATTR_ISA ("lwp", OPT_mlwp), > > @@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p, > > || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags)) > > ix86_prefetch_sse = true; > > > > + /* Enable mwait/monitor instructions for -msse3. */ > > + if (TARGET_SSE3_P (opts->x_ix86_isa_flags)) > > + opts->x_ix86_isa_flags2 > > + |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit; > > + > > /* Enable popcnt instruction for -msse4.2 or -mabm. */ > > if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags) > > || TARGET_ABM_P (opts->x_ix86_isa_flags)) > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h > > index 5583ec6881a..73e118900f7 100644 > > --- a/gcc/config/i386/i386.h > > +++ b/gcc/config/i386/i386.h > > @@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > > #define TARGET_CLWB_P(x) TARGET_ISA_CLWB_P(x) > > #define TARGET_MWAITX TARGET_ISA2_MWAITX > > #define TARGET_MWAITX_P(x) TARGET_ISA2_MWAITX_P(x) > > +#define TARGET_MWAIT TARGET_ISA2_MWAIT > > +#define TARGET_MWAIT_P(x) TARGET_ISA2_MWAIT_P(x) > > #define TARGET_PKU TARGET_ISA_PKU > > #define TARGET_PKU_P(x) TARGET_ISA_PKU_P(x) > > #define TARGET_SHSTK TARGET_ISA_SHSTK > > diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt > > index c781fdc8278..7b8547bb1c3 100644 > > --- a/gcc/config/i386/i386.opt > > +++ b/gcc/config/i386/i386.opt > > @@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation. > > mneeded > > Target Var(ix86_needed) Save > > Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property. > > + > > +mmwait > > +Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save > > +Support MWAIT and MONITOR built-in functions and code generation. > > diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h > > new file mode 100644 > > index 00000000000..1ecbc4abb69 > > --- /dev/null > > +++ b/gcc/config/i386/mwaitintrin.h > > @@ -0,0 +1,52 @@ > > +/* Copyright (C) 2021 Free Software Foundation, Inc. > > + > > + This file is part of GCC. > > + > > + GCC is free software; you can redistribute it and/or modify > > + it under the terms of the GNU General Public License as published by > > + the Free Software Foundation; either version 3, or (at your option) > > + any later version. > > + > > + GCC is distributed in the hope that it will be useful, > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > > + GNU General Public License for more details. > > + > > + Under Section 7 of GPL version 3, you are granted additional > > + permissions described in the GCC Runtime Library Exception, version > > + 3.1, as published by the Free Software Foundation. > > + > > + You should have received a copy of the GNU General Public License and > > + a copy of the GCC Runtime Library Exception along with this program; > > + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > > + <http://www.gnu.org/licenses/>. */ > > + > > +#ifndef _MWAITINTRIN_H_INCLUDED > > +#define _MWAITINTRIN_H_INCLUDED > > + > > +#ifndef __MWAIT__ > > +#pragma GCC push_options > > +#pragma GCC target("mwait") > > +#define __DISABLE_MWAIT__ > > +#endif /* __MWAIT__ */ > > + > > +extern __inline void > > +__attribute__((__gnu_inline__, __always_inline__, __artificial__)) > > +_mm_monitor (void const * __P, unsigned int __E, unsigned int __H) > > +{ > > + __builtin_ia32_monitor (__P, __E, __H); > > +} > > + > > +extern __inline void > > +__attribute__((__gnu_inline__, __always_inline__, __artificial__)) > > +_mm_mwait (unsigned int __E, unsigned int __H) > > +{ > > + __builtin_ia32_mwait (__E, __H); > > +} > > + > > +#ifdef __DISABLE_MWAIT__ > > +#undef __DISABLE_MWAIT__ > > +#pragma GCC pop_options > > +#endif /* __DISABLE_MWAIT__ */ > > + > > +#endif /* _MWAITINTRIN_H_INCLUDED */ > > diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h > > index fa9c5bb8b9f..f8102d2be23 100644 > > --- a/gcc/config/i386/pmmintrin.h > > +++ b/gcc/config/i386/pmmintrin.h > > @@ -29,6 +29,7 @@ > > > > /* We need definitions from the SSE2 and SSE header files*/ > > #include <emmintrin.h> > > +#include <mwaitintrin.h> > > > > #ifndef __SSE3__ > > #pragma GCC push_options > > @@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P) > > return (__m128i) __builtin_ia32_lddqu ((char const *)__P); > > } > > > > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) > > -_mm_monitor (void const * __P, unsigned int __E, unsigned int __H) > > -{ > > - __builtin_ia32_monitor (__P, __E, __H); > > -} > > - > > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) > > -_mm_mwait (unsigned int __E, unsigned int __H) > > -{ > > - __builtin_ia32_mwait (__E, __H); > > -} > > - > > #ifdef __DISABLE_SSE3__ > > #undef __DISABLE_SSE3__ > > #pragma GCC pop_options > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > > index 3f81abc7804..43afe3dabed 100644 > > --- a/gcc/config/i386/sse.md > > +++ b/gcc/config/i386/sse.md > > @@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait" > > [(unspec_volatile [(match_operand:SI 0 "register_operand" "c") > > (match_operand:SI 1 "register_operand" "a")] > > UNSPECV_MWAIT)] > > - "TARGET_SSE3" > > + "TARGET_MWAIT" > > ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used. > > ;; Since 32bit register operands are implicitly zero extended to 64bit, > > ;; we only need to set up 32bit registers. > > @@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>" > > (match_operand:SI 1 "register_operand" "c") > > (match_operand:SI 2 "register_operand" "d")] > > UNSPECV_MONITOR)] > > - "TARGET_SSE3" > > + "TARGET_MWAIT" > > ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in > > ;; RCX and RDX are used. Since 32bit register operands are implicitly > > ;; zero extended to 64bit, we only need to set up 32bit registers. > > diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h > > index ceda501252c..7793032ba90 100644 > > --- a/gcc/config/i386/x86gprintrin.h > > +++ b/gcc/config/i386/x86gprintrin.h > > @@ -56,6 +56,8 @@ > > > > #include <movdirintrin.h> > > > > +#include <mwaitintrin.h> > > + > > #include <mwaitxintrin.h> > > > > #include <pconfigintrin.h> > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > > index 1bc66cce2b8..1acfaf1d345 100644 > > --- a/gcc/doc/extend.texi > > +++ b/gcc/doc/extend.texi > > @@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions. > > @cindex @code{target("movdiri")} function attribute, x86 > > Enable/disable the generation of the MOVDIRI instructions. > > > > +@item mwait > > +@itemx no-mwait > > +@cindex @code{target("mwait")} function attribute, x86 > > +Enable/disable the generation of the MWAIT and MONITOR instructions. > > + > > @item mwaitx > > @itemx no-mwaitx > > @cindex @code{target("mwaitx")} function attribute, x86 > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > > index 7f13ffb79e1..3e1f0bc8fad 100644 > > --- a/gcc/doc/invoke.texi > > +++ b/gcc/doc/invoke.texi > > @@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options. > > -mno-wide-multiply -mrtd -malign-double @gol > > -mpreferred-stack-boundary=@var{num} @gol > > -mincoming-stack-boundary=@var{num} @gol > > --mcld -mcx16 -msahf -mmovbe -mcrc32 @gol > > +-mcld -mcx16 -msahf -mmovbe -mcrc32 -mmwait @gol > > -mrecip -mrecip=@var{opt} @gol > > -mvzeroupper -mprefer-avx128 -mprefer-vector-width=@var{opt} @gol > > -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol > > @@ -31159,6 +31159,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi}, > > @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and > > @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction. > > > > +@item -mmwait > > +@opindex mmwait > > +This option enables built-in functions @code{__builtin_ia32_monitor}, > > +and @code{__builtin_ia32_mwait} to generate the @code{monitor} and > > +@code{mwait} machine instructions. > > + > > @item -mrecip > > @opindex mrecip > > This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions > > diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c > > new file mode 100644 > > index 00000000000..96eeec070f0 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/monitor-2.c > > @@ -0,0 +1,27 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */ > > + > > +/* Verify that they work in both 32bit and 64bit. */ > > + > > +#include <x86gprintrin.h> > > + > > +void > > +foo (char *p, int x, int y, int z) > > +{ > > + _mm_monitor (p, y, x); > > + _mm_mwait (z, y); > > +} > > + > > +void > > +bar (char *p, long x, long y, long z) > > +{ > > + _mm_monitor (p, y, x); > > + _mm_mwait (z, y); > > +} > > + > > +void > > +foo1 (char *p) > > +{ > > + _mm_monitor (p, 0, 0); > > + _mm_mwait (0, 0); > > +} > > -- > > 2.31.1 > > -- H.J. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only 2021-08-16 12:25 ` H.J. Lu @ 2021-08-16 12:28 ` Richard Biener 2021-08-16 12:35 ` H.J. Lu 2021-08-16 12:37 ` Martin Liška 0 siblings, 2 replies; 16+ messages in thread From: Richard Biener @ 2021-08-16 12:28 UTC (permalink / raw) To: H.J. Lu; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek On Mon, Aug 16, 2021 at 2:25 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > On Sun, Aug 15, 2021 at 11:11 PM Richard Biener > <richard.guenther@gmail.com> wrote: > > > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > > > Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with > > > -mgeneral-regs-only and make -msse3 to imply -mmwait. > > > > Adding new options requires to bump the LTO streaming minor version > > (I know we forgot it once on the branch already when adding a new --param). > > > > Please take care of this when backporting. > > It was updated today: > > commit dce5367eecfb0729cad0325240d614721afb39e3 > Author: Martin Liska <mliska@suse.cz> > Date: Mon Aug 16 13:02:54 2021 +0200 > > LTO: bump minor version > > Bump the LTO_minor_version due to changes in > 52f0aa4dee8401ef3958dbf789780b0ee877beab > > PR c/100150 > > gcc/ChangeLog: > > * lto-streamer.h (LTO_minor_version): Bump. > > Do I need to do it again if I can check in my patches this week? Yes please, and do it with the same commit doing the .opt change. Richard. > Thanks. > > > Richard. > > > > > gcc/ > > > > > > * config.gcc: Install mwaitintrin.h for i[34567]86-*-* and > > > x86_64-*-* targets. > > > * common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET): > > > New. > > > (OPTION_MASK_ISA2_MWAIT_UNSET): Likewise. > > > (ix86_handle_option): Handle -mmwait. > > > * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins): > > > Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on > > > __builtin_ia32_monitor and __builtin_ia32_mwait. > > > * config/i386/i386-options.c (isa2_opts): Add -mmwait. > > > (ix86_valid_target_attribute_inner_p): Likewise. > > > (ix86_option_override_internal): Enable mwait/monitor > > > instructions for -msse3. > > > * config/i386/i386.h (TARGET_MWAIT): New. > > > (TARGET_MWAIT_P): Likewise. > > > * config/i386/i386.opt: Add -mmwait. > > > * config/i386/mwaitintrin.h: New file. > > > * config/i386/pmmintrin.h: Include <mwaitintrin.h>. > > > * config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with > > > TARGET_MWAIT. > > > (@sse3_monitor_<mode>): Likewise. > > > * config/i386/x86gprintrin.h: Include <mwaitintrin.h>. > > > * doc/extend.texi: Document mwait target attribute. > > > * doc/invoke.texi: Document -mmwait. > > > > > > gcc/testsuite/ > > > > > > * gcc.target/i386/monitor-2.c: New test. > > > > > > (cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922) > > > --- > > > gcc/common/config/i386/i386-common.c | 15 +++++++ > > > gcc/config.gcc | 6 ++- > > > gcc/config/i386/i386-builtins.c | 4 +- > > > gcc/config/i386/i386-options.c | 7 +++ > > > gcc/config/i386/i386.h | 2 + > > > gcc/config/i386/i386.opt | 4 ++ > > > gcc/config/i386/mwaitintrin.h | 52 +++++++++++++++++++++++ > > > gcc/config/i386/pmmintrin.h | 13 +----- > > > gcc/config/i386/sse.md | 4 +- > > > gcc/config/i386/x86gprintrin.h | 2 + > > > gcc/doc/extend.texi | 5 +++ > > > gcc/doc/invoke.texi | 8 +++- > > > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++ > > > 13 files changed, 130 insertions(+), 19 deletions(-) > > > create mode 100644 gcc/config/i386/mwaitintrin.h > > > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c > > > > > > diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c > > > index 6a7b5c8312f..e156cc34584 100644 > > > --- a/gcc/common/config/i386/i386-common.c > > > +++ b/gcc/common/config/i386/i386-common.c > > > @@ -150,6 +150,7 @@ along with GCC; see the file COPYING3. If not see > > > #define OPTION_MASK_ISA_F16C_SET \ > > > (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET) > > > #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX > > > +#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT > > > #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO > > > #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU > > > #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID > > > @@ -245,6 +246,7 @@ along with GCC; see the file COPYING3. If not see > > > #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES > > > #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB > > > #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX > > > +#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT > > > #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO > > > #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU > > > #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID > > > @@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts, > > > } > > > return true; > > > > > > + case OPT_mmwait: > > > + if (value) > > > + { > > > + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET; > > > + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET; > > > + } > > > + else > > > + { > > > + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET; > > > + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET; > > > + } > > > + return true; > > > + > > > case OPT_mclzero: > > > if (value) > > > { > > > diff --git a/gcc/config.gcc b/gcc/config.gcc > > > index 357b0bed067..a020e0808c9 100644 > > > --- a/gcc/config.gcc > > > +++ b/gcc/config.gcc > > > @@ -414,7 +414,8 @@ i[34567]86-*-*) > > > avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h > > > tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h > > > amxbf16intrin.h x86gprintrin.h uintrintrin.h > > > - hresetintrin.h keylockerintrin.h avxvnniintrin.h" > > > + hresetintrin.h keylockerintrin.h avxvnniintrin.h > > > + mwaitintrin.h" > > > ;; > > > x86_64-*-*) > > > cpu_type=i386 > > > @@ -451,7 +452,8 @@ x86_64-*-*) > > > avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h > > > tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h > > > amxbf16intrin.h x86gprintrin.h uintrintrin.h > > > - hresetintrin.h keylockerintrin.h avxvnniintrin.h" > > > + hresetintrin.h keylockerintrin.h avxvnniintrin.h > > > + mwaitintrin.h" > > > ;; > > > ia64-*-*) > > > extra_headers=ia64intrin.h > > > diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c > > > index 4fcdf4b89ee..128bd39816c 100644 > > > --- a/gcc/config/i386/i386-builtins.c > > > +++ b/gcc/config/i386/i386-builtins.c > > > @@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void) > > > VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE); > > > > > > /* SSE3. */ > > > - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor", > > > + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor", > > > VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR); > > > - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait", > > > + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait", > > > VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT); > > > > > > /* AES */ > > > diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c > > > index 18d2c0b9f99..7ecd0cf8b8c 100644 > > > --- a/gcc/config/i386/i386-options.c > > > +++ b/gcc/config/i386/i386-options.c > > > @@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] = > > > { "-mmovbe", OPTION_MASK_ISA2_MOVBE }, > > > { "-mclzero", OPTION_MASK_ISA2_CLZERO }, > > > { "-mmwaitx", OPTION_MASK_ISA2_MWAITX }, > > > + { "-mmwait", OPTION_MASK_ISA2_MWAIT }, > > > { "-mmovdir64b", OPTION_MASK_ISA2_MOVDIR64B }, > > > { "-mwaitpkg", OPTION_MASK_ISA2_WAITPKG }, > > > { "-mcldemote", OPTION_MASK_ISA2_CLDEMOTE }, > > > @@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[], > > > IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase), > > > IX86_ATTR_ISA ("rdrnd", OPT_mrdrnd), > > > IX86_ATTR_ISA ("mwaitx", OPT_mmwaitx), > > > + IX86_ATTR_ISA ("mwait", OPT_mmwait), > > > IX86_ATTR_ISA ("clzero", OPT_mclzero), > > > IX86_ATTR_ISA ("pku", OPT_mpku), > > > IX86_ATTR_ISA ("lwp", OPT_mlwp), > > > @@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p, > > > || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags)) > > > ix86_prefetch_sse = true; > > > > > > + /* Enable mwait/monitor instructions for -msse3. */ > > > + if (TARGET_SSE3_P (opts->x_ix86_isa_flags)) > > > + opts->x_ix86_isa_flags2 > > > + |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit; > > > + > > > /* Enable popcnt instruction for -msse4.2 or -mabm. */ > > > if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags) > > > || TARGET_ABM_P (opts->x_ix86_isa_flags)) > > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h > > > index 5583ec6881a..73e118900f7 100644 > > > --- a/gcc/config/i386/i386.h > > > +++ b/gcc/config/i386/i386.h > > > @@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > > > #define TARGET_CLWB_P(x) TARGET_ISA_CLWB_P(x) > > > #define TARGET_MWAITX TARGET_ISA2_MWAITX > > > #define TARGET_MWAITX_P(x) TARGET_ISA2_MWAITX_P(x) > > > +#define TARGET_MWAIT TARGET_ISA2_MWAIT > > > +#define TARGET_MWAIT_P(x) TARGET_ISA2_MWAIT_P(x) > > > #define TARGET_PKU TARGET_ISA_PKU > > > #define TARGET_PKU_P(x) TARGET_ISA_PKU_P(x) > > > #define TARGET_SHSTK TARGET_ISA_SHSTK > > > diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt > > > index c781fdc8278..7b8547bb1c3 100644 > > > --- a/gcc/config/i386/i386.opt > > > +++ b/gcc/config/i386/i386.opt > > > @@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation. > > > mneeded > > > Target Var(ix86_needed) Save > > > Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property. > > > + > > > +mmwait > > > +Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save > > > +Support MWAIT and MONITOR built-in functions and code generation. > > > diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h > > > new file mode 100644 > > > index 00000000000..1ecbc4abb69 > > > --- /dev/null > > > +++ b/gcc/config/i386/mwaitintrin.h > > > @@ -0,0 +1,52 @@ > > > +/* Copyright (C) 2021 Free Software Foundation, Inc. > > > + > > > + This file is part of GCC. > > > + > > > + GCC is free software; you can redistribute it and/or modify > > > + it under the terms of the GNU General Public License as published by > > > + the Free Software Foundation; either version 3, or (at your option) > > > + any later version. > > > + > > > + GCC is distributed in the hope that it will be useful, > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > > > + GNU General Public License for more details. > > > + > > > + Under Section 7 of GPL version 3, you are granted additional > > > + permissions described in the GCC Runtime Library Exception, version > > > + 3.1, as published by the Free Software Foundation. > > > + > > > + You should have received a copy of the GNU General Public License and > > > + a copy of the GCC Runtime Library Exception along with this program; > > > + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > > > + <http://www.gnu.org/licenses/>. */ > > > + > > > +#ifndef _MWAITINTRIN_H_INCLUDED > > > +#define _MWAITINTRIN_H_INCLUDED > > > + > > > +#ifndef __MWAIT__ > > > +#pragma GCC push_options > > > +#pragma GCC target("mwait") > > > +#define __DISABLE_MWAIT__ > > > +#endif /* __MWAIT__ */ > > > + > > > +extern __inline void > > > +__attribute__((__gnu_inline__, __always_inline__, __artificial__)) > > > +_mm_monitor (void const * __P, unsigned int __E, unsigned int __H) > > > +{ > > > + __builtin_ia32_monitor (__P, __E, __H); > > > +} > > > + > > > +extern __inline void > > > +__attribute__((__gnu_inline__, __always_inline__, __artificial__)) > > > +_mm_mwait (unsigned int __E, unsigned int __H) > > > +{ > > > + __builtin_ia32_mwait (__E, __H); > > > +} > > > + > > > +#ifdef __DISABLE_MWAIT__ > > > +#undef __DISABLE_MWAIT__ > > > +#pragma GCC pop_options > > > +#endif /* __DISABLE_MWAIT__ */ > > > + > > > +#endif /* _MWAITINTRIN_H_INCLUDED */ > > > diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h > > > index fa9c5bb8b9f..f8102d2be23 100644 > > > --- a/gcc/config/i386/pmmintrin.h > > > +++ b/gcc/config/i386/pmmintrin.h > > > @@ -29,6 +29,7 @@ > > > > > > /* We need definitions from the SSE2 and SSE header files*/ > > > #include <emmintrin.h> > > > +#include <mwaitintrin.h> > > > > > > #ifndef __SSE3__ > > > #pragma GCC push_options > > > @@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P) > > > return (__m128i) __builtin_ia32_lddqu ((char const *)__P); > > > } > > > > > > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) > > > -_mm_monitor (void const * __P, unsigned int __E, unsigned int __H) > > > -{ > > > - __builtin_ia32_monitor (__P, __E, __H); > > > -} > > > - > > > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) > > > -_mm_mwait (unsigned int __E, unsigned int __H) > > > -{ > > > - __builtin_ia32_mwait (__E, __H); > > > -} > > > - > > > #ifdef __DISABLE_SSE3__ > > > #undef __DISABLE_SSE3__ > > > #pragma GCC pop_options > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > > > index 3f81abc7804..43afe3dabed 100644 > > > --- a/gcc/config/i386/sse.md > > > +++ b/gcc/config/i386/sse.md > > > @@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait" > > > [(unspec_volatile [(match_operand:SI 0 "register_operand" "c") > > > (match_operand:SI 1 "register_operand" "a")] > > > UNSPECV_MWAIT)] > > > - "TARGET_SSE3" > > > + "TARGET_MWAIT" > > > ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used. > > > ;; Since 32bit register operands are implicitly zero extended to 64bit, > > > ;; we only need to set up 32bit registers. > > > @@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>" > > > (match_operand:SI 1 "register_operand" "c") > > > (match_operand:SI 2 "register_operand" "d")] > > > UNSPECV_MONITOR)] > > > - "TARGET_SSE3" > > > + "TARGET_MWAIT" > > > ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in > > > ;; RCX and RDX are used. Since 32bit register operands are implicitly > > > ;; zero extended to 64bit, we only need to set up 32bit registers. > > > diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h > > > index ceda501252c..7793032ba90 100644 > > > --- a/gcc/config/i386/x86gprintrin.h > > > +++ b/gcc/config/i386/x86gprintrin.h > > > @@ -56,6 +56,8 @@ > > > > > > #include <movdirintrin.h> > > > > > > +#include <mwaitintrin.h> > > > + > > > #include <mwaitxintrin.h> > > > > > > #include <pconfigintrin.h> > > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > > > index 1bc66cce2b8..1acfaf1d345 100644 > > > --- a/gcc/doc/extend.texi > > > +++ b/gcc/doc/extend.texi > > > @@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions. > > > @cindex @code{target("movdiri")} function attribute, x86 > > > Enable/disable the generation of the MOVDIRI instructions. > > > > > > +@item mwait > > > +@itemx no-mwait > > > +@cindex @code{target("mwait")} function attribute, x86 > > > +Enable/disable the generation of the MWAIT and MONITOR instructions. > > > + > > > @item mwaitx > > > @itemx no-mwaitx > > > @cindex @code{target("mwaitx")} function attribute, x86 > > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > > > index 7f13ffb79e1..3e1f0bc8fad 100644 > > > --- a/gcc/doc/invoke.texi > > > +++ b/gcc/doc/invoke.texi > > > @@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options. > > > -mno-wide-multiply -mrtd -malign-double @gol > > > -mpreferred-stack-boundary=@var{num} @gol > > > -mincoming-stack-boundary=@var{num} @gol > > > --mcld -mcx16 -msahf -mmovbe -mcrc32 @gol > > > +-mcld -mcx16 -msahf -mmovbe -mcrc32 -mmwait @gol > > > -mrecip -mrecip=@var{opt} @gol > > > -mvzeroupper -mprefer-avx128 -mprefer-vector-width=@var{opt} @gol > > > -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol > > > @@ -31159,6 +31159,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi}, > > > @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and > > > @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction. > > > > > > +@item -mmwait > > > +@opindex mmwait > > > +This option enables built-in functions @code{__builtin_ia32_monitor}, > > > +and @code{__builtin_ia32_mwait} to generate the @code{monitor} and > > > +@code{mwait} machine instructions. > > > + > > > @item -mrecip > > > @opindex mrecip > > > This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions > > > diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c > > > new file mode 100644 > > > index 00000000000..96eeec070f0 > > > --- /dev/null > > > +++ b/gcc/testsuite/gcc.target/i386/monitor-2.c > > > @@ -0,0 +1,27 @@ > > > +/* { dg-do compile } */ > > > +/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */ > > > + > > > +/* Verify that they work in both 32bit and 64bit. */ > > > + > > > +#include <x86gprintrin.h> > > > + > > > +void > > > +foo (char *p, int x, int y, int z) > > > +{ > > > + _mm_monitor (p, y, x); > > > + _mm_mwait (z, y); > > > +} > > > + > > > +void > > > +bar (char *p, long x, long y, long z) > > > +{ > > > + _mm_monitor (p, y, x); > > > + _mm_mwait (z, y); > > > +} > > > + > > > +void > > > +foo1 (char *p) > > > +{ > > > + _mm_monitor (p, 0, 0); > > > + _mm_mwait (0, 0); > > > +} > > > -- > > > 2.31.1 > > > > > > > -- > H.J. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only 2021-08-16 12:28 ` Richard Biener @ 2021-08-16 12:35 ` H.J. Lu 2021-08-16 12:37 ` Martin Liška 1 sibling, 0 replies; 16+ messages in thread From: H.J. Lu @ 2021-08-16 12:35 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek [-- Attachment #1: Type: text/plain, Size: 1350 bytes --] On Mon, Aug 16, 2021 at 5:28 AM Richard Biener <richard.guenther@gmail.com> wrote: > > On Mon, Aug 16, 2021 at 2:25 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > On Sun, Aug 15, 2021 at 11:11 PM Richard Biener > > <richard.guenther@gmail.com> wrote: > > > > > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > > > > > Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with > > > > -mgeneral-regs-only and make -msse3 to imply -mmwait. > > > > > > Adding new options requires to bump the LTO streaming minor version > > > (I know we forgot it once on the branch already when adding a new --param). > > > > > > Please take care of this when backporting. > > > > It was updated today: > > > > commit dce5367eecfb0729cad0325240d614721afb39e3 > > Author: Martin Liska <mliska@suse.cz> > > Date: Mon Aug 16 13:02:54 2021 +0200 > > > > LTO: bump minor version > > > > Bump the LTO_minor_version due to changes in > > 52f0aa4dee8401ef3958dbf789780b0ee877beab > > > > PR c/100150 > > > > gcc/ChangeLog: > > > > * lto-streamer.h (LTO_minor_version): Bump. > > > > Do I need to do it again if I can check in my patches this week? > > Yes please, and do it with the same commit doing the .opt change. > Here is the updated patch with LTO_minor_version bump. Thanks. -- H.J. [-- Attachment #2: 0001-x86-Add-mmwait-for-mgeneral-regs-only.patch --] [-- Type: text/x-patch, Size: 15414 bytes --] From 8f3e275ef061cd5f8353c71cb99f05dd944575f9 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" <hjl.tools@gmail.com> Date: Thu, 15 Apr 2021 11:19:32 -0700 Subject: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with -mgeneral-regs-only and make -msse3 to imply -mmwait. gcc/ * config.gcc: Install mwaitintrin.h for i[34567]86-*-* and x86_64-*-* targets. * lto-streamer.h (LTO_minor_version): Bump. * common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET): New. (OPTION_MASK_ISA2_MWAIT_UNSET): Likewise. (ix86_handle_option): Handle -mmwait. * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins): Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on __builtin_ia32_monitor and __builtin_ia32_mwait. * config/i386/i386-options.c (isa2_opts): Add -mmwait. (ix86_valid_target_attribute_inner_p): Likewise. (ix86_option_override_internal): Enable mwait/monitor instructions for -msse3. * config/i386/i386.h (TARGET_MWAIT): New. (TARGET_MWAIT_P): Likewise. * config/i386/i386.opt: Add -mmwait. * config/i386/mwaitintrin.h: New file. * config/i386/pmmintrin.h: Include <mwaitintrin.h>. * config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with TARGET_MWAIT. (@sse3_monitor_<mode>): Likewise. * config/i386/x86gprintrin.h: Include <mwaitintrin.h>. * doc/extend.texi: Document mwait target attribute. * doc/invoke.texi: Document -mmwait. gcc/testsuite/ * gcc.target/i386/monitor-2.c: New test. (cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922) --- gcc/common/config/i386/i386-common.c | 15 +++++++ gcc/config.gcc | 6 ++- gcc/config/i386/i386-builtins.c | 4 +- gcc/config/i386/i386-options.c | 7 +++ gcc/config/i386/i386.h | 2 + gcc/config/i386/i386.opt | 4 ++ gcc/config/i386/mwaitintrin.h | 52 +++++++++++++++++++++++ gcc/config/i386/pmmintrin.h | 13 +----- gcc/config/i386/sse.md | 4 +- gcc/config/i386/x86gprintrin.h | 2 + gcc/doc/extend.texi | 5 +++ gcc/doc/invoke.texi | 8 +++- gcc/lto-streamer.h | 2 +- gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++ 14 files changed, 131 insertions(+), 20 deletions(-) create mode 100644 gcc/config/i386/mwaitintrin.h create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c index 6a7b5c8312f..e156cc34584 100644 --- a/gcc/common/config/i386/i386-common.c +++ b/gcc/common/config/i386/i386-common.c @@ -150,6 +150,7 @@ along with GCC; see the file COPYING3. If not see #define OPTION_MASK_ISA_F16C_SET \ (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET) #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX +#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID @@ -245,6 +246,7 @@ along with GCC; see the file COPYING3. If not see #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX +#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID @@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts, } return true; + case OPT_mmwait: + if (value) + { + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET; + } + else + { + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET; + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET; + } + return true; + case OPT_mclzero: if (value) { diff --git a/gcc/config.gcc b/gcc/config.gcc index 357b0bed067..a020e0808c9 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -414,7 +414,8 @@ i[34567]86-*-*) avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h amxbf16intrin.h x86gprintrin.h uintrintrin.h - hresetintrin.h keylockerintrin.h avxvnniintrin.h" + hresetintrin.h keylockerintrin.h avxvnniintrin.h + mwaitintrin.h" ;; x86_64-*-*) cpu_type=i386 @@ -451,7 +452,8 @@ x86_64-*-*) avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h amxbf16intrin.h x86gprintrin.h uintrintrin.h - hresetintrin.h keylockerintrin.h avxvnniintrin.h" + hresetintrin.h keylockerintrin.h avxvnniintrin.h + mwaitintrin.h" ;; ia64-*-*) extra_headers=ia64intrin.h diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c index 4fcdf4b89ee..128bd39816c 100644 --- a/gcc/config/i386/i386-builtins.c +++ b/gcc/config/i386/i386-builtins.c @@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void) VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE); /* SSE3. */ - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor", + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor", VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR); - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait", + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait", VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT); /* AES */ diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c index 18d2c0b9f99..7ecd0cf8b8c 100644 --- a/gcc/config/i386/i386-options.c +++ b/gcc/config/i386/i386-options.c @@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] = { "-mmovbe", OPTION_MASK_ISA2_MOVBE }, { "-mclzero", OPTION_MASK_ISA2_CLZERO }, { "-mmwaitx", OPTION_MASK_ISA2_MWAITX }, + { "-mmwait", OPTION_MASK_ISA2_MWAIT }, { "-mmovdir64b", OPTION_MASK_ISA2_MOVDIR64B }, { "-mwaitpkg", OPTION_MASK_ISA2_WAITPKG }, { "-mcldemote", OPTION_MASK_ISA2_CLDEMOTE }, @@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[], IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase), IX86_ATTR_ISA ("rdrnd", OPT_mrdrnd), IX86_ATTR_ISA ("mwaitx", OPT_mmwaitx), + IX86_ATTR_ISA ("mwait", OPT_mmwait), IX86_ATTR_ISA ("clzero", OPT_mclzero), IX86_ATTR_ISA ("pku", OPT_mpku), IX86_ATTR_ISA ("lwp", OPT_mlwp), @@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p, || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags)) ix86_prefetch_sse = true; + /* Enable mwait/monitor instructions for -msse3. */ + if (TARGET_SSE3_P (opts->x_ix86_isa_flags)) + opts->x_ix86_isa_flags2 + |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit; + /* Enable popcnt instruction for -msse4.2 or -mabm. */ if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags) || TARGET_ABM_P (opts->x_ix86_isa_flags)) diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 5583ec6881a..73e118900f7 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #define TARGET_CLWB_P(x) TARGET_ISA_CLWB_P(x) #define TARGET_MWAITX TARGET_ISA2_MWAITX #define TARGET_MWAITX_P(x) TARGET_ISA2_MWAITX_P(x) +#define TARGET_MWAIT TARGET_ISA2_MWAIT +#define TARGET_MWAIT_P(x) TARGET_ISA2_MWAIT_P(x) #define TARGET_PKU TARGET_ISA_PKU #define TARGET_PKU_P(x) TARGET_ISA_PKU_P(x) #define TARGET_SHSTK TARGET_ISA_SHSTK diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt index c781fdc8278..7b8547bb1c3 100644 --- a/gcc/config/i386/i386.opt +++ b/gcc/config/i386/i386.opt @@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation. mneeded Target Var(ix86_needed) Save Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property. + +mmwait +Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save +Support MWAIT and MONITOR built-in functions and code generation. diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h new file mode 100644 index 00000000000..1ecbc4abb69 --- /dev/null +++ b/gcc/config/i386/mwaitintrin.h @@ -0,0 +1,52 @@ +/* Copyright (C) 2021 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + <http://www.gnu.org/licenses/>. */ + +#ifndef _MWAITINTRIN_H_INCLUDED +#define _MWAITINTRIN_H_INCLUDED + +#ifndef __MWAIT__ +#pragma GCC push_options +#pragma GCC target("mwait") +#define __DISABLE_MWAIT__ +#endif /* __MWAIT__ */ + +extern __inline void +__attribute__((__gnu_inline__, __always_inline__, __artificial__)) +_mm_monitor (void const * __P, unsigned int __E, unsigned int __H) +{ + __builtin_ia32_monitor (__P, __E, __H); +} + +extern __inline void +__attribute__((__gnu_inline__, __always_inline__, __artificial__)) +_mm_mwait (unsigned int __E, unsigned int __H) +{ + __builtin_ia32_mwait (__E, __H); +} + +#ifdef __DISABLE_MWAIT__ +#undef __DISABLE_MWAIT__ +#pragma GCC pop_options +#endif /* __DISABLE_MWAIT__ */ + +#endif /* _MWAITINTRIN_H_INCLUDED */ diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h index fa9c5bb8b9f..f8102d2be23 100644 --- a/gcc/config/i386/pmmintrin.h +++ b/gcc/config/i386/pmmintrin.h @@ -29,6 +29,7 @@ /* We need definitions from the SSE2 and SSE header files*/ #include <emmintrin.h> +#include <mwaitintrin.h> #ifndef __SSE3__ #pragma GCC push_options @@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P) return (__m128i) __builtin_ia32_lddqu ((char const *)__P); } -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_monitor (void const * __P, unsigned int __E, unsigned int __H) -{ - __builtin_ia32_monitor (__P, __E, __H); -} - -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) -_mm_mwait (unsigned int __E, unsigned int __H) -{ - __builtin_ia32_mwait (__E, __H); -} - #ifdef __DISABLE_SSE3__ #undef __DISABLE_SSE3__ #pragma GCC pop_options diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 3f81abc7804..43afe3dabed 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait" [(unspec_volatile [(match_operand:SI 0 "register_operand" "c") (match_operand:SI 1 "register_operand" "a")] UNSPECV_MWAIT)] - "TARGET_SSE3" + "TARGET_MWAIT" ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used. ;; Since 32bit register operands are implicitly zero extended to 64bit, ;; we only need to set up 32bit registers. @@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>" (match_operand:SI 1 "register_operand" "c") (match_operand:SI 2 "register_operand" "d")] UNSPECV_MONITOR)] - "TARGET_SSE3" + "TARGET_MWAIT" ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in ;; RCX and RDX are used. Since 32bit register operands are implicitly ;; zero extended to 64bit, we only need to set up 32bit registers. diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h index ceda501252c..7793032ba90 100644 --- a/gcc/config/i386/x86gprintrin.h +++ b/gcc/config/i386/x86gprintrin.h @@ -56,6 +56,8 @@ #include <movdirintrin.h> +#include <mwaitintrin.h> + #include <mwaitxintrin.h> #include <pconfigintrin.h> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 1bc66cce2b8..1acfaf1d345 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions. @cindex @code{target("movdiri")} function attribute, x86 Enable/disable the generation of the MOVDIRI instructions. +@item mwait +@itemx no-mwait +@cindex @code{target("mwait")} function attribute, x86 +Enable/disable the generation of the MWAIT and MONITOR instructions. + @item mwaitx @itemx no-mwaitx @cindex @code{target("mwaitx")} function attribute, x86 diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 05269f83808..fc222758a22 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options. -mno-wide-multiply -mrtd -malign-double @gol -mpreferred-stack-boundary=@var{num} @gol -mincoming-stack-boundary=@var{num} @gol --mcld -mcx16 -msahf -mmovbe -mcrc32 @gol +-mcld -mcx16 -msahf -mmovbe -mcrc32 -mmwait @gol -mrecip -mrecip=@var{opt} @gol -mvzeroupper -mprefer-avx128 -mprefer-vector-width=@var{opt} @gol -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol @@ -31178,6 +31178,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi}, @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction. +@item -mmwait +@opindex mmwait +This option enables built-in functions @code{__builtin_ia32_monitor}, +and @code{__builtin_ia32_mwait} to generate the @code{monitor} and +@code{mwait} machine instructions. + @item -mrecip @opindex mrecip This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h index a01049da472..e2a0e033ab2 100644 --- a/gcc/lto-streamer.h +++ b/gcc/lto-streamer.h @@ -121,7 +121,7 @@ along with GCC; see the file COPYING3. If not see form followed by the data for the string. */ #define LTO_major_version 11 -#define LTO_minor_version 1 +#define LTO_minor_version 2 typedef unsigned char lto_decl_flags_t; diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c new file mode 100644 index 00000000000..96eeec070f0 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/monitor-2.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */ + +/* Verify that they work in both 32bit and 64bit. */ + +#include <x86gprintrin.h> + +void +foo (char *p, int x, int y, int z) +{ + _mm_monitor (p, y, x); + _mm_mwait (z, y); +} + +void +bar (char *p, long x, long y, long z) +{ + _mm_monitor (p, y, x); + _mm_mwait (z, y); +} + +void +foo1 (char *p) +{ + _mm_monitor (p, 0, 0); + _mm_mwait (0, 0); +} -- 2.31.1 ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only 2021-08-16 12:28 ` Richard Biener 2021-08-16 12:35 ` H.J. Lu @ 2021-08-16 12:37 ` Martin Liška 1 sibling, 0 replies; 16+ messages in thread From: Martin Liška @ 2021-08-16 12:37 UTC (permalink / raw) To: Richard Biener, H.J. Lu; +Cc: Jakub Jelinek, GCC Patches On 8/16/21 2:28 PM, Richard Biener via Gcc-patches wrote: > Yes please, and do it with the same commit doing the .opt change. Just one quick note: I've got a periodic builder that verifies the LTO stream on tramp3d in all active branches. Martin ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 2/5] x86: Use crc32 target option for CRC32 intrinsics 2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu 2021-08-13 13:50 ` [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only H.J. Lu @ 2021-08-13 13:51 ` H.J. Lu 2021-08-13 13:51 ` [PATCH 3/5] x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions H.J. Lu ` (3 subsequent siblings) 5 siblings, 0 replies; 16+ messages in thread From: H.J. Lu @ 2021-08-13 13:51 UTC (permalink / raw) To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener Use crc32 target option for CRC32 intrinsics to support CRC32 intrinsics without enabling SSE vector instructions. * config/i386/i386-c.c (ix86_target_macros_internal): Define __CRC32__ for -mcrc32. * config/i386/i386-options.c (ix86_option_override_internal): Enable crc32 instruction for -msse4.2. * config/i386/i386.md (sse4_2_crc32<mode>): Remove TARGET_SSE4_2 check. (sse4_2_crc32di): Likewise. * config/i386/ia32intrin.h: Use crc32 target option for CRC32 intrinsics. (cherry picked from commit 39671f87b2df6a1894cc11a161e4a7949d1ddccd) --- gcc/config/i386/i386-c.c | 2 ++ gcc/config/i386/i386-options.c | 5 +++++ gcc/config/i386/i386.md | 4 ++-- gcc/config/i386/ia32intrin.h | 28 ++++++++++++++-------------- 4 files changed, 23 insertions(+), 16 deletions(-) diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c index be46d0506ad..5ed0de006fb 100644 --- a/gcc/config/i386/i386-c.c +++ b/gcc/config/i386/i386-c.c @@ -532,6 +532,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag, def_or_undef (parse_in, "__LZCNT__"); if (isa_flag & OPTION_MASK_ISA_TBM) def_or_undef (parse_in, "__TBM__"); + if (isa_flag & OPTION_MASK_ISA_CRC32) + def_or_undef (parse_in, "__CRC32__"); if (isa_flag & OPTION_MASK_ISA_POPCNT) def_or_undef (parse_in, "__POPCNT__"); if (isa_flag & OPTION_MASK_ISA_FSGSBASE) diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c index 7ecd0cf8b8c..19632b5fd6b 100644 --- a/gcc/config/i386/i386-options.c +++ b/gcc/config/i386/i386-options.c @@ -2625,6 +2625,11 @@ ix86_option_override_internal (bool main_args_p, opts->x_ix86_isa_flags |= OPTION_MASK_ISA_POPCNT & ~opts->x_ix86_isa_flags_explicit; + /* Enable crc32 instruction for -msse4.2. */ + if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)) + opts->x_ix86_isa_flags + |= OPTION_MASK_ISA_CRC32 & ~opts->x_ix86_isa_flags_explicit; + /* Enable lzcnt instruction for -mabm. */ if (TARGET_ABM_P(opts->x_ix86_isa_flags)) opts->x_ix86_isa_flags diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 2fdf98266cd..1d528a4434a 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -20992,7 +20992,7 @@ (define_insn "sse4_2_crc32<mode>" [(match_operand:SI 1 "register_operand" "0") (match_operand:SWI124 2 "nonimmediate_operand" "<r>m")] UNSPEC_CRC32))] - "TARGET_SSE4_2 || TARGET_CRC32" + "TARGET_CRC32" "crc32{<imodesuffix>}\t{%2, %0|%0, %2}" [(set_attr "type" "sselog1") (set_attr "prefix_rep" "1") @@ -21013,7 +21013,7 @@ (define_insn "sse4_2_crc32di" [(match_operand:DI 1 "register_operand" "0") (match_operand:DI 2 "nonimmediate_operand" "rm")] UNSPEC_CRC32))] - "TARGET_64BIT && (TARGET_SSE4_2 || TARGET_CRC32)" + "TARGET_64BIT && TARGET_CRC32" "crc32{q}\t{%2, %0|%0, %2}" [(set_attr "type" "sselog1") (set_attr "prefix_rep" "1") diff --git a/gcc/config/i386/ia32intrin.h b/gcc/config/i386/ia32intrin.h index 591394076cc..5422b0fc9e0 100644 --- a/gcc/config/i386/ia32intrin.h +++ b/gcc/config/i386/ia32intrin.h @@ -51,11 +51,11 @@ __bswapd (int __X) #ifndef __iamcu__ -#ifndef __SSE4_2__ +#ifndef __CRC32__ #pragma GCC push_options -#pragma GCC target("sse4.2") -#define __DISABLE_SSE4_2__ -#endif /* __SSE4_2__ */ +#pragma GCC target("crc32") +#define __DISABLE_CRC32__ +#endif /* __CRC32__ */ /* 32bit accumulate CRC32 (polynomial 0x11EDC6F41) value. */ extern __inline unsigned int @@ -79,10 +79,10 @@ __crc32d (unsigned int __C, unsigned int __V) return __builtin_ia32_crc32si (__C, __V); } -#ifdef __DISABLE_SSE4_2__ -#undef __DISABLE_SSE4_2__ +#ifdef __DISABLE_CRC32__ +#undef __DISABLE_CRC32__ #pragma GCC pop_options -#endif /* __DISABLE_SSE4_2__ */ +#endif /* __DISABLE_CRC32__ */ #endif /* __iamcu__ */ @@ -199,11 +199,11 @@ __bswapq (long long __X) return __builtin_bswap64 (__X); } -#ifndef __SSE4_2__ +#ifndef __CRC32__ #pragma GCC push_options -#pragma GCC target("sse4.2") -#define __DISABLE_SSE4_2__ -#endif /* __SSE4_2__ */ +#pragma GCC target("crc32") +#define __DISABLE_CRC32__ +#endif /* __CRC32__ */ /* 64bit accumulate CRC32 (polynomial 0x11EDC6F41) value. */ extern __inline unsigned long long @@ -213,10 +213,10 @@ __crc32q (unsigned long long __C, unsigned long long __V) return __builtin_ia32_crc32di (__C, __V); } -#ifdef __DISABLE_SSE4_2__ -#undef __DISABLE_SSE4_2__ +#ifdef __DISABLE_CRC32__ +#undef __DISABLE_CRC32__ #pragma GCC pop_options -#endif /* __DISABLE_SSE4_2__ */ +#endif /* __DISABLE_CRC32__ */ /* 64bit popcnt */ extern __inline long long -- 2.31.1 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 3/5] x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions 2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu 2021-08-13 13:50 ` [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only H.J. Lu 2021-08-13 13:51 ` [PATCH 2/5] x86: Use crc32 target option for CRC32 intrinsics H.J. Lu @ 2021-08-13 13:51 ` H.J. Lu 2021-08-13 13:51 ` [PATCH 4/5] x86: Enable the GPR only instructions for -mgeneral-regs-only H.J. Lu ` (2 subsequent siblings) 5 siblings, 0 replies; 16+ messages in thread From: H.J. Lu @ 2021-08-13 13:51 UTC (permalink / raw) To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener Since commit 39671f87b2df6a1894cc11a161e4a7949d1ddccd Author: H.J. Lu <hjl.tools@gmail.com> Date: Thu Apr 15 05:59:48 2021 -0700 x86: Use crc32 target option for CRC32 intrinsics enabled OPTION_MASK_ISA_CRC32 for -msse4 and removed TARGET_SSE4_2 check in sse4_2_crc32<mode> pattens, remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions. gcc/ PR target/101549 * config/i386/i386-builtin.def: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions. gcc/testsuite/ PR target/101549 * gcc.target/i386/crc32-6.c: New test. (cherry picked from commit 7aa28dbc371cf3c09c05c68672b00d9006391595) --- gcc/config/i386/i386-builtin.def | 8 ++++---- gcc/testsuite/gcc.target/i386/crc32-6.c | 13 +++++++++++++ 2 files changed, 17 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def index e3ed4e1578f..ea509c67ddb 100644 --- a/gcc/config/i386/i386-builtin.def +++ b/gcc/config/i386/i386-builtin.def @@ -963,10 +963,10 @@ BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_ptestv2di, "__builtin_ia32_pte /* SSE4.2 */ BDESC (OPTION_MASK_ISA_SSE4_2, 0, CODE_FOR_sse4_2_gtv2di3, "__builtin_ia32_pcmpgtq", IX86_BUILTIN_PCMPGTQ, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI) -BDESC (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32qi, "__builtin_ia32_crc32qi", IX86_BUILTIN_CRC32QI, UNKNOWN, (int) UINT_FTYPE_UINT_UCHAR) -BDESC (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32hi, "__builtin_ia32_crc32hi", IX86_BUILTIN_CRC32HI, UNKNOWN, (int) UINT_FTYPE_UINT_USHORT) -BDESC (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32si, "__builtin_ia32_crc32si", IX86_BUILTIN_CRC32SI, UNKNOWN, (int) UINT_FTYPE_UINT_UINT) -BDESC (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_CRC32 | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_sse4_2_crc32di, "__builtin_ia32_crc32di", IX86_BUILTIN_CRC32DI, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64) +BDESC (OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32qi, "__builtin_ia32_crc32qi", IX86_BUILTIN_CRC32QI, UNKNOWN, (int) UINT_FTYPE_UINT_UCHAR) +BDESC (OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32hi, "__builtin_ia32_crc32hi", IX86_BUILTIN_CRC32HI, UNKNOWN, (int) UINT_FTYPE_UINT_USHORT) +BDESC (OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32si, "__builtin_ia32_crc32si", IX86_BUILTIN_CRC32SI, UNKNOWN, (int) UINT_FTYPE_UINT_UINT) +BDESC (OPTION_MASK_ISA_CRC32 | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_sse4_2_crc32di, "__builtin_ia32_crc32di", IX86_BUILTIN_CRC32DI, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64) /* SSE4A */ BDESC (OPTION_MASK_ISA_SSE4A, 0, CODE_FOR_sse4a_extrqi, "__builtin_ia32_extrqi", IX86_BUILTIN_EXTRQI, UNKNOWN, (int) V2DI_FTYPE_V2DI_UINT_UINT) diff --git a/gcc/testsuite/gcc.target/i386/crc32-6.c b/gcc/testsuite/gcc.target/i386/crc32-6.c new file mode 100644 index 00000000000..464e3444069 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/crc32-6.c @@ -0,0 +1,13 @@ +/* PR target/101549 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -msse4 -mno-crc32" } */ + +#include <immintrin.h> + +unsigned int +test_mm_crc32_u8 (unsigned int CRC, unsigned char V) +{ + return _mm_crc32_u8 (CRC, V); +} + +/* { dg-error "needs isa option -mcrc32" "" { target *-*-* } 0 } */ -- 2.31.1 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 4/5] x86: Enable the GPR only instructions for -mgeneral-regs-only 2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu ` (2 preceding siblings ...) 2021-08-13 13:51 ` [PATCH 3/5] x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions H.J. Lu @ 2021-08-13 13:51 ` H.J. Lu 2021-08-13 13:51 ` [PATCH 5/5] <x86gprintrin.h>: Add pragma GCC target("general-regs-only") H.J. Lu 2021-08-16 6:11 ` [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only Richard Biener 5 siblings, 0 replies; 16+ messages in thread From: H.J. Lu @ 2021-08-13 13:51 UTC (permalink / raw) To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener For -mgeneral-regs-only, enable the GPR only instructions which are enabled implicitly by SSE ISAs unless they have been disabled explicitly. gcc/ PR target/101492 * common/config/i386/i386-common.c (ix86_handle_option): For -mgeneral-regs-only, enable the GPR only instructions which are enabled implicitly by SSE ISAs unless they have been disabled explicitly. gcc/testsuite/ PR target/101492 * gcc.target/i386/pr101492-1.c: New test. * gcc.target/i386/pr101492-2.c: Likewise. * gcc.target/i386/pr101492-3.c: Likewise. * gcc.target/i386/pr101492-4.c: Likewise. (cherry picked from commit 6ae8aac19cdbdbd96d90f86e4d8505fe121bdf06) --- gcc/common/config/i386/i386-common.c | 30 ++++++++++++++++++++-- gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 ++++++++ gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 ++++++++ gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 ++++++++ gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 +++++++++ 5 files changed, 70 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c index e156cc34584..38dbb9d9263 100644 --- a/gcc/common/config/i386/i386-common.c +++ b/gcc/common/config/i386/i386-common.c @@ -354,16 +354,42 @@ ix86_handle_option (struct gcc_options *opts, case OPT_mgeneral_regs_only: if (value) { + HOST_WIDE_INT general_regs_only_flags = 0; + HOST_WIDE_INT general_regs_only_flags2 = 0; + + /* NB: Enable the GPR only instructions which are enabled + implicitly by SSE ISAs unless they have been disabled + explicitly. */ + if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)) + { + if ((opts->x_ix86_isa_flags_explicit + & OPTION_MASK_ISA_CRC32) == 0) + general_regs_only_flags |= OPTION_MASK_ISA_CRC32; + if ((opts->x_ix86_isa_flags_explicit + & OPTION_MASK_ISA_POPCNT) == 0) + general_regs_only_flags |= OPTION_MASK_ISA_POPCNT; + } + if (TARGET_SSE3_P (opts->x_ix86_isa_flags)) + { + if ((opts->x_ix86_isa_flags2_explicit + & OPTION_MASK_ISA2_MWAIT) == 0) + general_regs_only_flags2 |= OPTION_MASK_ISA2_MWAIT; + } + /* Disable MMX, SSE and x87 instructions if only general registers are allowed. */ opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_GENERAL_REGS_ONLY_UNSET; opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET; + opts->x_ix86_isa_flags |= general_regs_only_flags; + opts->x_ix86_isa_flags2 |= general_regs_only_flags2; opts->x_ix86_isa_flags_explicit - |= OPTION_MASK_ISA_GENERAL_REGS_ONLY_UNSET; + |= (OPTION_MASK_ISA_GENERAL_REGS_ONLY_UNSET + | general_regs_only_flags); opts->x_ix86_isa_flags2_explicit - |= OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET; + |= (OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET + | general_regs_only_flags2); opts->x_target_flags &= ~MASK_80387; } diff --git a/gcc/testsuite/gcc.target/i386/pr101492-1.c b/gcc/testsuite/gcc.target/i386/pr101492-1.c new file mode 100644 index 00000000000..41002571761 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr101492-1.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -msse4.2 -mgeneral-regs-only" } */ + +#include <x86intrin.h> + +unsigned int +foo1 (unsigned int x, unsigned int y) +{ + return __crc32d (x, y); +} diff --git a/gcc/testsuite/gcc.target/i386/pr101492-2.c b/gcc/testsuite/gcc.target/i386/pr101492-2.c new file mode 100644 index 00000000000..c7d24f43c39 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr101492-2.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -msse4.2 -mgeneral-regs-only" } */ + +#include <x86intrin.h> + +unsigned int +foo1 (unsigned int x) +{ + return _mm_popcnt_u32 (x); +} diff --git a/gcc/testsuite/gcc.target/i386/pr101492-3.c b/gcc/testsuite/gcc.target/i386/pr101492-3.c new file mode 100644 index 00000000000..37e2071ab57 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr101492-3.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -msse3 -mgeneral-regs-only" } */ + +#include <x86intrin.h> + +void +foo1 (unsigned int x, unsigned int y) +{ + _mm_mwait (x, y); +} diff --git a/gcc/testsuite/gcc.target/i386/pr101492-4.c b/gcc/testsuite/gcc.target/i386/pr101492-4.c new file mode 100644 index 00000000000..c5a4f0abd25 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr101492-4.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mno-mwait -msse3 -mgeneral-regs-only" } */ + +#include <x86intrin.h> + +void +foo1 (unsigned int x, unsigned int y) +{ + _mm_mwait (x, y); +} + +/* { dg-error "target specific option mismatch" "" { target *-*-* } 0 } */ -- 2.31.1 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 5/5] <x86gprintrin.h>: Add pragma GCC target("general-regs-only") 2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu ` (3 preceding siblings ...) 2021-08-13 13:51 ` [PATCH 4/5] x86: Enable the GPR only instructions for -mgeneral-regs-only H.J. Lu @ 2021-08-13 13:51 ` H.J. Lu 2021-08-16 6:11 ` [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only Richard Biener 5 siblings, 0 replies; 16+ messages in thread From: H.J. Lu @ 2021-08-13 13:51 UTC (permalink / raw) To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener 1. Intrinsics in <x86gprintrin.h> only require GPR ISAs. Add #if defined __MMX__ || defined __SSE__ #pragma GCC push_options #pragma GCC target("general-regs-only") #define __DISABLE_GENERAL_REGS_ONLY__ #endif and #ifdef __DISABLE_GENERAL_REGS_ONLY__ #undef __DISABLE_GENERAL_REGS_ONLY__ #pragma GCC pop_options #endif /* __DISABLE_GENERAL_REGS_ONLY__ */ to <x86gprintrin.h> to disable non-GPR ISAs so that they can be used in functions with __attribute__ ((target("general-regs-only"))). 2. When checking always_inline attribute, if callee only uses GPRs, ignore MASK_80387 since enable MASK_80387 in caller has no impact on callee inline. gcc/ PR target/99744 * config/i386/i386.c (ix86_can_inline_p): Ignore MASK_80387 if callee only uses GPRs. * config/i386/ia32intrin.h: Revert commit 5463cee2770. * config/i386/serializeintrin.h: Revert commit 71958f740f1. * config/i386/x86gprintrin.h: Add #pragma GCC target("general-regs-only") and #pragma GCC pop_options to disable non-GPR ISAs. gcc/testsuite/ PR target/99744 * gcc.target/i386/pr99744-3.c: New test. * gcc.target/i386/pr99744-4.c: Likewise. * gcc.target/i386/pr99744-5.c: Likewise. * gcc.target/i386/pr99744-6.c: Likewise. * gcc.target/i386/pr99744-7.c: Likewise. * gcc.target/i386/pr99744-8.c: Likewise. (cherry picked from commit 72264a639729a5dcc21dbee304717ce22b338bfd) --- gcc/config/i386/i386.c | 6 +- gcc/config/i386/ia32intrin.h | 14 +- gcc/config/i386/serializeintrin.h | 7 +- gcc/config/i386/x86gprintrin.h | 11 + gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 + gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 ++++++++++++++++++++++ gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++ gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++ gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 + gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 + 10 files changed, 477 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 5a7bc8c44a8..527d493ecae 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -553,7 +553,7 @@ ix86_can_inline_p (tree caller, tree callee) /* Changes of those flags can be tolerated for always inlines. Lets hope user knows what he is doing. */ - const unsigned HOST_WIDE_INT always_inline_safe_mask + unsigned HOST_WIDE_INT always_inline_safe_mask = (MASK_USE_8BIT_IDIV | MASK_ACCUMULATE_OUTGOING_ARGS | MASK_NO_ALIGN_STRINGOPS | MASK_AVX256_SPLIT_UNALIGNED_LOAD | MASK_AVX256_SPLIT_UNALIGNED_STORE | MASK_CLD @@ -578,6 +578,10 @@ ix86_can_inline_p (tree caller, tree callee) && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee))); + /* If callee only uses GPRs, ignore MASK_80387. */ + if (TARGET_GENERAL_REGS_ONLY_P (callee_opts->x_ix86_target_flags)) + always_inline_safe_mask |= MASK_80387; + cgraph_node *callee_node = cgraph_node::get (callee); /* Callee's isa options should be a subset of the caller's, i.e. a SSE4 function can inline a SSE2 function but a SSE2 function can't inline diff --git a/gcc/config/i386/ia32intrin.h b/gcc/config/i386/ia32intrin.h index 5422b0fc9e0..df99220ee4f 100644 --- a/gcc/config/i386/ia32intrin.h +++ b/gcc/config/i386/ia32intrin.h @@ -107,12 +107,22 @@ __rdpmc (int __S) #endif /* __iamcu__ */ /* rdtsc */ -#define __rdtsc() __builtin_ia32_rdtsc () +extern __inline unsigned long long +__attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__rdtsc (void) +{ + return __builtin_ia32_rdtsc (); +} #ifndef __iamcu__ /* rdtscp */ -#define __rdtscp(a) __builtin_ia32_rdtscp (a) +extern __inline unsigned long long +__attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__rdtscp (unsigned int *__A) +{ + return __builtin_ia32_rdtscp (__A); +} #endif /* __iamcu__ */ diff --git a/gcc/config/i386/serializeintrin.h b/gcc/config/i386/serializeintrin.h index e280250b198..89b5b94ea9b 100644 --- a/gcc/config/i386/serializeintrin.h +++ b/gcc/config/i386/serializeintrin.h @@ -34,7 +34,12 @@ #define __DISABLE_SERIALIZE__ #endif /* __SERIALIZE__ */ -#define _serialize() __builtin_ia32_serialize () +extern __inline void +__attribute__((__gnu_inline__, __always_inline__, __artificial__)) +_serialize (void) +{ + __builtin_ia32_serialize (); +} #ifdef __DISABLE_SERIALIZE__ #undef __DISABLE_SERIALIZE__ diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h index 7793032ba90..b7fefa780a6 100644 --- a/gcc/config/i386/x86gprintrin.h +++ b/gcc/config/i386/x86gprintrin.h @@ -24,6 +24,12 @@ #ifndef _X86GPRINTRIN_H_INCLUDED #define _X86GPRINTRIN_H_INCLUDED +#if defined __MMX__ || defined __SSE__ +#pragma GCC push_options +#pragma GCC target("general-regs-only") +#define __DISABLE_GENERAL_REGS_ONLY__ +#endif + #include <ia32intrin.h> #ifndef __iamcu__ @@ -255,4 +261,9 @@ _ptwrite32 (unsigned __B) #endif /* __iamcu__ */ +#ifdef __DISABLE_GENERAL_REGS_ONLY__ +#undef __DISABLE_GENERAL_REGS_ONLY__ +#pragma GCC pop_options +#endif /* __DISABLE_GENERAL_REGS_ONLY__ */ + #endif /* _X86GPRINTRIN_H_INCLUDED. */ diff --git a/gcc/testsuite/gcc.target/i386/pr99744-3.c b/gcc/testsuite/gcc.target/i386/pr99744-3.c new file mode 100644 index 00000000000..6c505816ceb --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr99744-3.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mno-serialize" } */ + +#include <x86intrin.h> + +__attribute__ ((target("general-regs-only"))) +void +foo1 (void) +{ + _serialize (); +} + +/* { dg-error "target specific option mismatch" "" { target *-*-* } 0 } */ diff --git a/gcc/testsuite/gcc.target/i386/pr99744-4.c b/gcc/testsuite/gcc.target/i386/pr99744-4.c new file mode 100644 index 00000000000..9196e62d955 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr99744-4.c @@ -0,0 +1,357 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mbmi -mbmi2 -mcldemote -mclflushopt -mclwb -mclzero -mcrc32 -menqcmd -mfsgsbase -mfxsr -mhreset -mlzcnt -mlwp -mmovdir64b -mmovdiri -mmwaitx -mpconfig -mpku -mpopcnt -mptwrite -mrdpid -mrdrnd -mrdseed -mrtm -msgx -mshstk -mtbm -mtsxldtrk -mxsave -mxsavec -mxsaveopt -mxsaves -mwaitpkg -mwbnoinvd" } */ +/* { dg-additional-options "-muintr" { target { ! ia32 } } } */ + +/* Test calling GPR intrinsics from functions with general-regs-only + target attribute. */ + +#include <x86gprintrin.h> + +#define _CONCAT(x,y) x ## y + +#define test_0(func, type) \ + __attribute__ ((target("general-regs-only"))) \ + type _CONCAT(do_,func) (void) \ + { return func (); } + +#define test_0_i1(func, type, imm) \ + __attribute__ ((target("general-regs-only"))) \ + type _CONCAT(do_,func) (void) \ + { return func (imm); } + +#define test_1(func, type, op1_type) \ + __attribute__ ((target("general-regs-only"))) \ + type _CONCAT(do_,func) (op1_type A) \ + { return func (A); } + +#define test_1_i1(func, type, op1_type, imm) \ + __attribute__ ((target("general-regs-only"))) \ + type _CONCAT(do_,func) (op1_type A) \ + { return func (A, imm); } + +#define test_2(func, type, op1_type, op2_type) \ + __attribute__ ((target("general-regs-only"))) \ + type _CONCAT(do_,func) (op1_type A, op2_type B) \ + { return func (A, B); } + +#define test_2_i1(func, type, op1_type, op2_type, imm) \ + __attribute__ ((target("general-regs-only"))) \ + type _CONCAT(do_,func) (op1_type A, op2_type B) \ + { return func (A, B, imm); } + +#define test_3(func, type, op1_type, op2_type, op3_type) \ + __attribute__ ((target("general-regs-only"))) \ + type _CONCAT(do_,func) (op1_type A, op2_type B, op3_type C) \ + { return func (A, B, C); } + +#define test_4(func, type, op1_type, op2_type, op3_type, op4_type) \ + __attribute__ ((target("general-regs-only"))) \ + type _CONCAT(do_,func) (op1_type A, op2_type B, op3_type C, \ + op4_type D) \ + { return func (A, B, C, D); } + +/* ia32intrin.h */ +test_1 (__bsfd, int, int) +test_1 (__bsrd, int, int) +test_1 (__bswapd, int, int) +test_1 (__popcntd, int, unsigned int) +test_2 (__rolb, unsigned char, unsigned char, int) +test_2 (__rolw, unsigned short, unsigned short, int) +test_2 (__rold, unsigned int, unsigned int, int) +test_2 (__rorb, unsigned char, unsigned char, int) +test_2 (__rorw, unsigned short, unsigned short, int) +test_2 (__rord, unsigned int, unsigned int, int) + +#ifndef __iamcu__ +/* adxintrin.h */ +test_4 (_subborrow_u32, unsigned char, unsigned char, unsigned int, + unsigned int, unsigned int *) +test_4 (_addcarry_u32, unsigned char, unsigned char, unsigned int, + unsigned int, unsigned int *) +test_4 (_addcarryx_u32, unsigned char, unsigned char, unsigned int, + unsigned int, unsigned int *) + +/* bmiintrin.h */ +test_1 (__tzcnt_u16, unsigned short, unsigned short) +test_2 (__andn_u32, unsigned int, unsigned int, unsigned int) +test_2 (__bextr_u32, unsigned int, unsigned int, unsigned int) +test_3 (_bextr_u32, unsigned int, unsigned int, unsigned int, + unsigned int) +test_1 (__blsi_u32, unsigned int, unsigned int) +test_1 (_blsi_u32, unsigned int, unsigned int) +test_1 (__blsmsk_u32, unsigned int, unsigned int) +test_1 (_blsmsk_u32, unsigned int, unsigned int) +test_1 (__blsr_u32, unsigned int, unsigned int) +test_1 (_blsr_u32, unsigned int, unsigned int) +test_1 (__tzcnt_u32, unsigned int, unsigned int) +test_1 (_tzcnt_u32, unsigned int, unsigned int) + +/* bmi2intrin.h */ +test_2 (_bzhi_u32, unsigned int, unsigned int, unsigned int) +test_2 (_pdep_u32, unsigned int, unsigned int, unsigned int) +test_2 (_pext_u32, unsigned int, unsigned int, unsigned int) + +/* cetintrin.h */ +test_1 (_inc_ssp, void, unsigned int) +test_0 (_saveprevssp, void) +test_1 (_rstorssp, void, void *) +test_2 (_wrssd, void, unsigned int, void *) +test_2 (_wrussd, void, unsigned int, void *) +test_0 (_setssbsy, void) +test_1 (_clrssbsy, void, void *) + +/* cldemoteintrin.h */ +test_1 (_cldemote, void, void *) + +/* clflushoptintrin.h */ +test_1 (_mm_clflushopt, void, void *) + +/* clwbintrin.h */ +test_1 (_mm_clwb, void, void *) + +/* clzerointrin.h */ +test_1 (_mm_clzero, void, void *) + +/* enqcmdintrin.h */ +test_2 (_enqcmd, int, void *, const void *) +test_2 (_enqcmds, int, void *, const void *) + +/* fxsrintrin.h */ +test_1 (_fxsave, void, void *) +test_1 (_fxrstor, void, void *) + +/* hresetintrin.h */ +test_1 (_hreset, void, unsigned int) + +/* ia32intrin.h */ +test_2 (__crc32b, unsigned int, unsigned char, unsigned char) +test_2 (__crc32w, unsigned int, unsigned short, unsigned short) +test_2 (__crc32d, unsigned int, unsigned int, unsigned int) +test_1 (__rdpmc, unsigned long long, int) +test_0 (__rdtsc, unsigned long long) +test_1 (__rdtscp, unsigned long long, unsigned int *) +test_0 (__pause, void) + +/* lzcntintrin.h */ +test_1 (__lzcnt16, unsigned short, unsigned short) +test_1 (__lzcnt32, unsigned int, unsigned int) +test_1 (_lzcnt_u32, unsigned int, unsigned int) + +/* lwpintrin.h */ +test_1 (__llwpcb, void, void *) +test_0 (__slwpcb, void *) +test_2_i1 (__lwpval32, void, unsigned int, unsigned int, 1) +test_2_i1 (__lwpins32, unsigned char, unsigned int, unsigned int, 1) + +/* movdirintrin.h */ +test_2 (_directstoreu_u32, void, void *, unsigned int) +test_2 (_movdir64b, void, void *, const void *) + +/* mwaitxintrin.h */ +test_3 (_mm_monitorx, void, void const *, unsigned int, unsigned int) +test_3 (_mm_mwaitx, void, unsigned int, unsigned int, unsigned int) + +/* pconfigintrin.h */ +test_2 (_pconfig_u32, unsigned int, const unsigned int, size_t *) + +/* pkuintrin.h */ +test_0 (_rdpkru_u32, unsigned int) +test_1 (_wrpkru, void, unsigned int) + +/* popcntintrin.h */ +test_1 (_mm_popcnt_u32, int, unsigned int) + +/* rdseedintrin.h */ +test_1 (_rdseed16_step, int, unsigned short *) +test_1 (_rdseed32_step, int, unsigned int *) + +/* rtmintrin.h */ +test_0 (_xbegin, unsigned int) +test_0 (_xend, void) +test_0_i1 (_xabort, void, 1) + +/* sgxintrin.h */ +test_2 (_encls_u32, unsigned int, const unsigned int, size_t *) +test_2 (_enclu_u32, unsigned int, const unsigned int, size_t *) +test_2 (_enclv_u32, unsigned int, const unsigned int, size_t *) + +/* tbmintrin.h */ +test_1_i1 (__bextri_u32, unsigned int, unsigned int, 1) +test_1 (__blcfill_u32, unsigned int, unsigned int) +test_1 (__blci_u32, unsigned int, unsigned int) +test_1 (__blcic_u32, unsigned int, unsigned int) +test_1 (__blcmsk_u32, unsigned int, unsigned int) +test_1 (__blcs_u32, unsigned int, unsigned int) +test_1 (__blsfill_u32, unsigned int, unsigned int) +test_1 (__blsic_u32, unsigned int, unsigned int) +test_1 (__t1mskc_u32, unsigned int, unsigned int) +test_1 (__tzmsk_u32, unsigned int, unsigned int) + +/* tsxldtrkintrin.h */ +test_0 (_xsusldtrk, void) +test_0 (_xresldtrk, void) + +/* x86gprintrin.h */ +test_1 (_ptwrite32, void, unsigned int) +test_1 (_rdrand16_step, int, unsigned short *) +test_1 (_rdrand32_step, int, unsigned int *) +test_0 (_wbinvd, void) + +/* xtestintrin.h */ +test_0 (_xtest, int) + +/* xsaveintrin.h */ +test_2 (_xsave, void, void *, long long) +test_2 (_xrstor, void, void *, long long) +test_2 (_xsetbv, void, unsigned int, long long) +test_1 (_xgetbv, long long, unsigned int) + +/* xsavecintrin.h */ +test_2 (_xsavec, void, void *, long long) + +/* xsaveoptintrin.h */ +test_2 (_xsaveopt, void, void *, long long) + +/* xsavesintrin.h */ +test_2 (_xsaves, void, void *, long long) +test_2 (_xrstors, void, void *, long long) + +/* wbnoinvdintrin.h */ +test_0 (_wbnoinvd, void) + +#ifdef __x86_64__ +/* adxintrin.h */ +test_4 (_subborrow_u64, unsigned char, unsigned char, + unsigned long long, unsigned long long, + unsigned long long *) +test_4 (_addcarry_u64, unsigned char, unsigned char, + unsigned long long, unsigned long long, + unsigned long long *) +test_4 (_addcarryx_u64, unsigned char, unsigned char, + unsigned long long, unsigned long long, + unsigned long long *) + +/* bmiintrin.h */ +test_2 (__andn_u64, unsigned long long, unsigned long long, + unsigned long long) +test_2 (__bextr_u64, unsigned long long, unsigned long long, + unsigned long long) +test_3 (_bextr_u64, unsigned long long, unsigned long long, + unsigned long long, unsigned long long) +test_1 (__blsi_u64, unsigned long long, unsigned long long) +test_1 (_blsi_u64, unsigned long long, unsigned long long) +test_1 (__blsmsk_u64, unsigned long long, unsigned long long) +test_1 (_blsmsk_u64, unsigned long long, unsigned long long) +test_1 (__blsr_u64, unsigned long long, unsigned long long) +test_1 (_blsr_u64, unsigned long long, unsigned long long) +test_1 (__tzcnt_u64, unsigned long long, unsigned long long) +test_1 (_tzcnt_u64, unsigned long long, unsigned long long) + +/* bmi2intrin.h */ +test_2 (_bzhi_u64, unsigned long long, unsigned long long, + unsigned long long) +test_2 (_pdep_u64, unsigned long long, unsigned long long, + unsigned long long) +test_2 (_pext_u64, unsigned long long, unsigned long long, + unsigned long long) +test_3 (_mulx_u64, unsigned long long, unsigned long long, + unsigned long long, unsigned long long *) + +/* cetintrin.h */ +test_0 (_get_ssp, unsigned long long) +test_2 (_wrssq, void, unsigned long long, void *) +test_2 (_wrussq, void, unsigned long long, void *) + +/* fxsrintrin.h */ +test_1 (_fxsave64, void, void *) +test_1 (_fxrstor64, void, void *) + +/* ia32intrin.h */ +test_1 (__bsfq, int, long long) +test_1 (__bsrq, int, long long) +test_1 (__bswapq, long long, long long) +test_2 (__crc32q, unsigned long long, unsigned long long, + unsigned long long) +test_1 (__popcntq, long long, unsigned long long) +test_2 (__rolq, unsigned long long, unsigned long long, int) +test_2 (__rorq, unsigned long long, unsigned long long, int) +test_0 (__readeflags, unsigned long long) +test_1 (__writeeflags, void, unsigned int) + +/* lzcntintrin.h */ +test_1 (__lzcnt64, unsigned long long, unsigned long long) +test_1 (_lzcnt_u64, unsigned long long, unsigned long long) + +/* lwpintrin.h */ +test_2_i1 (__lwpval64, void, unsigned long long, unsigned int, 1) +test_2_i1 (__lwpins64, unsigned char, unsigned long long, + unsigned int, 1) + +/* movdirintrin.h */ +test_2 (_directstoreu_u64, void, void *, unsigned long long) + +/* popcntintrin.h */ +test_1 (_mm_popcnt_u64, long long, unsigned long long) + +/* rdseedintrin.h */ +test_1 (_rdseed64_step, int, unsigned long long *) + +/* tbmintrin.h */ +test_1_i1 (__bextri_u64, unsigned long long, unsigned long long, 1) +test_1 (__blcfill_u64, unsigned long long, unsigned long long) +test_1 (__blci_u64, unsigned long long, unsigned long long) +test_1 (__blcic_u64, unsigned long long, unsigned long long) +test_1 (__blcmsk_u64, unsigned long long, unsigned long long) +test_1 (__blcs_u64, unsigned long long, unsigned long long) +test_1 (__blsfill_u64, unsigned long long, unsigned long long) +test_1 (__blsic_u64, unsigned long long, unsigned long long) +test_1 (__t1mskc_u64, unsigned long long, unsigned long long) +test_1 (__tzmsk_u64, unsigned long long, unsigned long long) + +/* uintrintrin.h */ +test_0 (_clui, void) +test_1 (_senduipi, void, unsigned long long) +test_0 (_stui, void) +test_0 (_testui, unsigned char) + +/* x86gprintrin.h */ +test_1 (_ptwrite64, void, unsigned long long) +test_0 (_readfsbase_u32, unsigned int) +test_0 (_readfsbase_u64, unsigned long long) +test_0 (_readgsbase_u32, unsigned int) +test_0 (_readgsbase_u64, unsigned long long) +test_1 (_rdrand64_step, int, unsigned long long *) +test_1 (_writefsbase_u32, void, unsigned int) +test_1 (_writefsbase_u64, void, unsigned long long) +test_1 (_writegsbase_u32, void, unsigned int) +test_1 (_writegsbase_u64, void, unsigned long long) + +/* xsaveintrin.h */ +test_2 (_xsave64, void, void *, long long) +test_2 (_xrstor64, void, void *, long long) + +/* xsavecintrin.h */ +test_2 (_xsavec64, void, void *, long long) + +/* xsaveoptintrin.h */ +test_2 (_xsaveopt64, void, void *, long long) + +/* xsavesintrin.h */ +test_2 (_xsaves64, void, void *, long long) +test_2 (_xrstors64, void, void *, long long) + +/* waitpkgintrin.h */ +test_1 (_umonitor, void, void *) +test_2 (_umwait, unsigned char, unsigned int, unsigned long long) +test_2 (_tpause, unsigned char, unsigned int, unsigned long long) + +#else /* !__x86_64__ */ +/* bmi2intrin.h */ +test_3 (_mulx_u32, unsigned int, unsigned int, unsigned int, + unsigned int *) + +/* cetintrin.h */ +test_0 (_get_ssp, unsigned int) +#endif /* __x86_64__ */ + +#endif diff --git a/gcc/testsuite/gcc.target/i386/pr99744-5.c b/gcc/testsuite/gcc.target/i386/pr99744-5.c new file mode 100644 index 00000000000..9e40e5ef428 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr99744-5.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mmwait" } */ + +/* Test calling MWAIT intrinsics from functions with general-regs-only + target attribute. */ + +#include <x86gprintrin.h> + +#define _CONCAT(x,y) x ## y + +#define test_2(func, type, op1_type, op2_type) \ + __attribute__ ((target("general-regs-only"))) \ + type _CONCAT(do_,func) (op1_type A, op2_type B) \ + { return func (A, B); } + +#define test_3(func, type, op1_type, op2_type, op3_type) \ + __attribute__ ((target("general-regs-only"))) \ + type _CONCAT(do_,func) (op1_type A, op2_type B, op3_type C) \ + { return func (A, B, C); } + +#ifndef __iamcu__ +/* mwaitintrin.h */ +test_3 (_mm_monitor, void, void const *, unsigned int, unsigned int) +test_2 (_mm_mwait, void, unsigned int, unsigned int) +#endif diff --git a/gcc/testsuite/gcc.target/i386/pr99744-6.c b/gcc/testsuite/gcc.target/i386/pr99744-6.c new file mode 100644 index 00000000000..4025918a9c9 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr99744-6.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O" } */ + +#include <x86intrin.h> + +extern unsigned long long int curr_deadline; +extern void bar (void); + +void +foo1 (void) +{ + if (__rdtsc () < curr_deadline) + return; + bar (); +} + +void +foo2 (unsigned int *p) +{ + if (__rdtscp (p) < curr_deadline) + return; + bar (); +} diff --git a/gcc/testsuite/gcc.target/i386/pr99744-7.c b/gcc/testsuite/gcc.target/i386/pr99744-7.c new file mode 100644 index 00000000000..30b7ca05966 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr99744-7.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mno-avx -Wno-psabi" } */ + +#include <x86intrin.h> + +void +foo (__m256 *x) +{ + x[0] = _mm256_sub_ps (x[1], x[2]); +} + +/* { dg-error "target specific option mismatch" "" { target *-*-* } 0 } */ diff --git a/gcc/testsuite/gcc.target/i386/pr99744-8.c b/gcc/testsuite/gcc.target/i386/pr99744-8.c new file mode 100644 index 00000000000..115183eede6 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr99744-8.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O -Wno-psabi" } */ + +#include <x86intrin.h> + +__attribute__((target ("no-avx"))) +void +foo (__m256 *x) +{ + x[0] = _mm256_sub_ps (x[1], x[2]); +} + +/* { dg-error "target specific option mismatch" "" { target *-*-* } 0 } */ -- 2.31.1 ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only 2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu ` (4 preceding siblings ...) 2021-08-13 13:51 ` [PATCH 5/5] <x86gprintrin.h>: Add pragma GCC target("general-regs-only") H.J. Lu @ 2021-08-16 6:11 ` Richard Biener 2021-08-24 14:57 ` H.J. Lu 5 siblings, 1 reply; 16+ messages in thread From: Richard Biener @ 2021-08-16 6:11 UTC (permalink / raw) To: H.J. Lu; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > <x86gprintrin.h> and target("general-regs-only") function attribute > were added to GCC 11. But their implementations are incomplete. I'd > like to backport the following patches to GCC 11 branch to finish them. Fine with me if x86 maintainers do not disagree (also see one comment I have on the -mwait adding patch). > H.J. Lu (5): > x86: Add -mmwait for -mgeneral-regs-only > x86: Use crc32 target option for CRC32 intrinsics > x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions > x86: Enable the GPR only instructions for -mgeneral-regs-only > <x86gprintrin.h>: Add pragma GCC target("general-regs-only") > > gcc/common/config/i386/i386-common.c | 45 ++- > gcc/config.gcc | 6 +- > gcc/config/i386/i386-builtin.def | 8 +- > gcc/config/i386/i386-builtins.c | 4 +- > gcc/config/i386/i386-c.c | 2 + > gcc/config/i386/i386-options.c | 12 + > gcc/config/i386/i386.c | 6 +- > gcc/config/i386/i386.h | 2 + > gcc/config/i386/i386.md | 4 +- > gcc/config/i386/i386.opt | 4 + > gcc/config/i386/ia32intrin.h | 42 ++- > gcc/config/i386/mwaitintrin.h | 52 +++ > gcc/config/i386/pmmintrin.h | 13 +- > gcc/config/i386/serializeintrin.h | 7 +- > gcc/config/i386/sse.md | 4 +- > gcc/config/i386/x86gprintrin.h | 13 + > gcc/doc/extend.texi | 5 + > gcc/doc/invoke.texi | 8 +- > gcc/testsuite/gcc.target/i386/crc32-6.c | 13 + > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++ > gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 + > gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 + > gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 + > gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 + > gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 + > gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 +++++++++++++++++++++ > gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++ > gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++ > gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 + > gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 + > 30 files changed, 717 insertions(+), 45 deletions(-) > create mode 100644 gcc/config/i386/mwaitintrin.h > create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c > > -- > 2.31.1 > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only 2021-08-16 6:11 ` [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only Richard Biener @ 2021-08-24 14:57 ` H.J. Lu 2021-08-25 7:34 ` Uros Bizjak 0 siblings, 1 reply; 16+ messages in thread From: H.J. Lu @ 2021-08-24 14:57 UTC (permalink / raw) To: Richard Biener, Jan Hubicka; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek On Sun, Aug 15, 2021 at 11:11 PM Richard Biener <richard.guenther@gmail.com> wrote: > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > <x86gprintrin.h> and target("general-regs-only") function attribute > > were added to GCC 11. But their implementations are incomplete. I'd > > like to backport the following patches to GCC 11 branch to finish them. > > Fine with me if x86 maintainers do not disagree (also see one comment I have > on the -mwait adding patch). Hi Uros, Honza, Do you have any comments? The updated -mwait patch with LTO_minor_version bump is at: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577471.html Thanks. H.J. > > H.J. Lu (5): > > x86: Add -mmwait for -mgeneral-regs-only > > x86: Use crc32 target option for CRC32 intrinsics > > x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions > > x86: Enable the GPR only instructions for -mgeneral-regs-only > > <x86gprintrin.h>: Add pragma GCC target("general-regs-only") > > > > gcc/common/config/i386/i386-common.c | 45 ++- > > gcc/config.gcc | 6 +- > > gcc/config/i386/i386-builtin.def | 8 +- > > gcc/config/i386/i386-builtins.c | 4 +- > > gcc/config/i386/i386-c.c | 2 + > > gcc/config/i386/i386-options.c | 12 + > > gcc/config/i386/i386.c | 6 +- > > gcc/config/i386/i386.h | 2 + > > gcc/config/i386/i386.md | 4 +- > > gcc/config/i386/i386.opt | 4 + > > gcc/config/i386/ia32intrin.h | 42 ++- > > gcc/config/i386/mwaitintrin.h | 52 +++ > > gcc/config/i386/pmmintrin.h | 13 +- > > gcc/config/i386/serializeintrin.h | 7 +- > > gcc/config/i386/sse.md | 4 +- > > gcc/config/i386/x86gprintrin.h | 13 + > > gcc/doc/extend.texi | 5 + > > gcc/doc/invoke.texi | 8 +- > > gcc/testsuite/gcc.target/i386/crc32-6.c | 13 + > > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++ > > gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 + > > gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 + > > gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 + > > gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 + > > gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 + > > gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 +++++++++++++++++++++ > > gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++ > > gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++ > > gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 + > > gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 + > > 30 files changed, 717 insertions(+), 45 deletions(-) > > create mode 100644 gcc/config/i386/mwaitintrin.h > > create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c > > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c > > > > -- > > 2.31.1 > > -- H.J. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only 2021-08-24 14:57 ` H.J. Lu @ 2021-08-25 7:34 ` Uros Bizjak 2021-08-25 12:14 ` H.J. Lu 2021-08-26 6:35 ` Richard Biener 0 siblings, 2 replies; 16+ messages in thread From: Uros Bizjak @ 2021-08-25 7:34 UTC (permalink / raw) To: H.J. Lu; +Cc: Richard Biener, Jan Hubicka, GCC Patches, Jakub Jelinek On Tue, Aug 24, 2021 at 4:57 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > On Sun, Aug 15, 2021 at 11:11 PM Richard Biener > <richard.guenther@gmail.com> wrote: > > > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > > > <x86gprintrin.h> and target("general-regs-only") function attribute > > > were added to GCC 11. But their implementations are incomplete. I'd > > > like to backport the following patches to GCC 11 branch to finish them. > > > > Fine with me if x86 maintainers do not disagree (also see one comment I have > > on the -mwait adding patch). > > Hi Uros, Honza, > > Do you have any comments? The updated -mwait patch with LTO_minor_version > bump is at: > > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577471.html I don't have any comments, but IIRC, approved changes can be backported from mainline to release branches without additional approval. Uros. > Thanks. > > H.J. > > > H.J. Lu (5): > > > x86: Add -mmwait for -mgeneral-regs-only > > > x86: Use crc32 target option for CRC32 intrinsics > > > x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions > > > x86: Enable the GPR only instructions for -mgeneral-regs-only > > > <x86gprintrin.h>: Add pragma GCC target("general-regs-only") > > > > > > gcc/common/config/i386/i386-common.c | 45 ++- > > > gcc/config.gcc | 6 +- > > > gcc/config/i386/i386-builtin.def | 8 +- > > > gcc/config/i386/i386-builtins.c | 4 +- > > > gcc/config/i386/i386-c.c | 2 + > > > gcc/config/i386/i386-options.c | 12 + > > > gcc/config/i386/i386.c | 6 +- > > > gcc/config/i386/i386.h | 2 + > > > gcc/config/i386/i386.md | 4 +- > > > gcc/config/i386/i386.opt | 4 + > > > gcc/config/i386/ia32intrin.h | 42 ++- > > > gcc/config/i386/mwaitintrin.h | 52 +++ > > > gcc/config/i386/pmmintrin.h | 13 +- > > > gcc/config/i386/serializeintrin.h | 7 +- > > > gcc/config/i386/sse.md | 4 +- > > > gcc/config/i386/x86gprintrin.h | 13 + > > > gcc/doc/extend.texi | 5 + > > > gcc/doc/invoke.texi | 8 +- > > > gcc/testsuite/gcc.target/i386/crc32-6.c | 13 + > > > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++ > > > gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 + > > > gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 + > > > gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 + > > > gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 + > > > gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 + > > > gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 +++++++++++++++++++++ > > > gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++ > > > gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++ > > > gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 + > > > gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 + > > > 30 files changed, 717 insertions(+), 45 deletions(-) > > > create mode 100644 gcc/config/i386/mwaitintrin.h > > > create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c > > > > > > -- > > > 2.31.1 > > > > > > > -- > H.J. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only 2021-08-25 7:34 ` Uros Bizjak @ 2021-08-25 12:14 ` H.J. Lu 2021-08-26 6:35 ` Richard Biener 1 sibling, 0 replies; 16+ messages in thread From: H.J. Lu @ 2021-08-25 12:14 UTC (permalink / raw) To: Uros Bizjak; +Cc: Richard Biener, Jan Hubicka, GCC Patches, Jakub Jelinek On Wed, Aug 25, 2021 at 12:34 AM Uros Bizjak <ubizjak@gmail.com> wrote: > > On Tue, Aug 24, 2021 at 4:57 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > On Sun, Aug 15, 2021 at 11:11 PM Richard Biener > > <richard.guenther@gmail.com> wrote: > > > > > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > > > > > <x86gprintrin.h> and target("general-regs-only") function attribute > > > > were added to GCC 11. But their implementations are incomplete. I'd > > > > like to backport the following patches to GCC 11 branch to finish them. > > > > > > Fine with me if x86 maintainers do not disagree (also see one comment I have > > > on the -mwait adding patch). > > > > Hi Uros, Honza, > > > > Do you have any comments? The updated -mwait patch with LTO_minor_version > > bump is at: > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577471.html > > I don't have any comments, but IIRC, approved changes can be > backported from mainline to release branches without additional > approval. I am checking them in. Thanks. > Uros. > > > Thanks. > > > > H.J. > > > > H.J. Lu (5): > > > > x86: Add -mmwait for -mgeneral-regs-only > > > > x86: Use crc32 target option for CRC32 intrinsics > > > > x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions > > > > x86: Enable the GPR only instructions for -mgeneral-regs-only > > > > <x86gprintrin.h>: Add pragma GCC target("general-regs-only") > > > > > > > > gcc/common/config/i386/i386-common.c | 45 ++- > > > > gcc/config.gcc | 6 +- > > > > gcc/config/i386/i386-builtin.def | 8 +- > > > > gcc/config/i386/i386-builtins.c | 4 +- > > > > gcc/config/i386/i386-c.c | 2 + > > > > gcc/config/i386/i386-options.c | 12 + > > > > gcc/config/i386/i386.c | 6 +- > > > > gcc/config/i386/i386.h | 2 + > > > > gcc/config/i386/i386.md | 4 +- > > > > gcc/config/i386/i386.opt | 4 + > > > > gcc/config/i386/ia32intrin.h | 42 ++- > > > > gcc/config/i386/mwaitintrin.h | 52 +++ > > > > gcc/config/i386/pmmintrin.h | 13 +- > > > > gcc/config/i386/serializeintrin.h | 7 +- > > > > gcc/config/i386/sse.md | 4 +- > > > > gcc/config/i386/x86gprintrin.h | 13 + > > > > gcc/doc/extend.texi | 5 + > > > > gcc/doc/invoke.texi | 8 +- > > > > gcc/testsuite/gcc.target/i386/crc32-6.c | 13 + > > > > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++ > > > > gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 + > > > > gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 + > > > > gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 + > > > > gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 + > > > > gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 + > > > > gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 +++++++++++++++++++++ > > > > gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++ > > > > gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++ > > > > gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 + > > > > gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 + > > > > 30 files changed, 717 insertions(+), 45 deletions(-) > > > > create mode 100644 gcc/config/i386/mwaitintrin.h > > > > create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c > > > > > > > > -- > > > > 2.31.1 > > > > > > > > > > > > -- > > H.J. -- H.J. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only 2021-08-25 7:34 ` Uros Bizjak 2021-08-25 12:14 ` H.J. Lu @ 2021-08-26 6:35 ` Richard Biener 1 sibling, 0 replies; 16+ messages in thread From: Richard Biener @ 2021-08-26 6:35 UTC (permalink / raw) To: Uros Bizjak; +Cc: H.J. Lu, Jan Hubicka, GCC Patches, Jakub Jelinek On Wed, Aug 25, 2021 at 9:34 AM Uros Bizjak <ubizjak@gmail.com> wrote: > > On Tue, Aug 24, 2021 at 4:57 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > On Sun, Aug 15, 2021 at 11:11 PM Richard Biener > > <richard.guenther@gmail.com> wrote: > > > > > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > > > > > <x86gprintrin.h> and target("general-regs-only") function attribute > > > > were added to GCC 11. But their implementations are incomplete. I'd > > > > like to backport the following patches to GCC 11 branch to finish them. > > > > > > Fine with me if x86 maintainers do not disagree (also see one comment I have > > > on the -mwait adding patch). > > > > Hi Uros, Honza, > > > > Do you have any comments? The updated -mwait patch with LTO_minor_version > > bump is at: > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577471.html > > I don't have any comments, but IIRC, approved changes can be > backported from mainline to release branches without additional > approval. If they fix regressions, yes. I understood this wasn't such obvious case here (instead it's a new but buggy feature). Richard. > Uros. > > > Thanks. > > > > H.J. > > > > H.J. Lu (5): > > > > x86: Add -mmwait for -mgeneral-regs-only > > > > x86: Use crc32 target option for CRC32 intrinsics > > > > x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions > > > > x86: Enable the GPR only instructions for -mgeneral-regs-only > > > > <x86gprintrin.h>: Add pragma GCC target("general-regs-only") > > > > > > > > gcc/common/config/i386/i386-common.c | 45 ++- > > > > gcc/config.gcc | 6 +- > > > > gcc/config/i386/i386-builtin.def | 8 +- > > > > gcc/config/i386/i386-builtins.c | 4 +- > > > > gcc/config/i386/i386-c.c | 2 + > > > > gcc/config/i386/i386-options.c | 12 + > > > > gcc/config/i386/i386.c | 6 +- > > > > gcc/config/i386/i386.h | 2 + > > > > gcc/config/i386/i386.md | 4 +- > > > > gcc/config/i386/i386.opt | 4 + > > > > gcc/config/i386/ia32intrin.h | 42 ++- > > > > gcc/config/i386/mwaitintrin.h | 52 +++ > > > > gcc/config/i386/pmmintrin.h | 13 +- > > > > gcc/config/i386/serializeintrin.h | 7 +- > > > > gcc/config/i386/sse.md | 4 +- > > > > gcc/config/i386/x86gprintrin.h | 13 + > > > > gcc/doc/extend.texi | 5 + > > > > gcc/doc/invoke.texi | 8 +- > > > > gcc/testsuite/gcc.target/i386/crc32-6.c | 13 + > > > > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++ > > > > gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 + > > > > gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 + > > > > gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 + > > > > gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 + > > > > gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 + > > > > gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 +++++++++++++++++++++ > > > > gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++ > > > > gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++ > > > > gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 + > > > > gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 + > > > > 30 files changed, 717 insertions(+), 45 deletions(-) > > > > create mode 100644 gcc/config/i386/mwaitintrin.h > > > > create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c > > > > > > > > -- > > > > 2.31.1 > > > > > > > > > > > > -- > > H.J. ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2021-08-26 6:35 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu 2021-08-13 13:50 ` [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only H.J. Lu 2021-08-16 6:11 ` Richard Biener 2021-08-16 12:25 ` H.J. Lu 2021-08-16 12:28 ` Richard Biener 2021-08-16 12:35 ` H.J. Lu 2021-08-16 12:37 ` Martin Liška 2021-08-13 13:51 ` [PATCH 2/5] x86: Use crc32 target option for CRC32 intrinsics H.J. Lu 2021-08-13 13:51 ` [PATCH 3/5] x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions H.J. Lu 2021-08-13 13:51 ` [PATCH 4/5] x86: Enable the GPR only instructions for -mgeneral-regs-only H.J. Lu 2021-08-13 13:51 ` [PATCH 5/5] <x86gprintrin.h>: Add pragma GCC target("general-regs-only") H.J. Lu 2021-08-16 6:11 ` [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only Richard Biener 2021-08-24 14:57 ` H.J. Lu 2021-08-25 7:34 ` Uros Bizjak 2021-08-25 12:14 ` H.J. Lu 2021-08-26 6:35 ` Richard Biener
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).