From: Richard Biener <richard.guenther@gmail.com>
To: "H.J. Lu" <hjl.tools@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
Uros Bizjak <ubizjak@gmail.com>,
Jakub Jelinek <jakub@redhat.com>
Subject: Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
Date: Mon, 16 Aug 2021 08:11:00 +0200
Message-ID: <CAFiYyc2nk=DXoG92tEFRT-ECFBRqYYmo5Fn1jMUf+P5_qCQTSg@mail.gmail.com>
In-Reply-To: <20210813135103.46696-2-hjl.tools@gmail.com>
On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
> -mgeneral-regs-only and make -msse3 to imply -mmwait.
Adding new options requires bumping the LTO streaming minor version
(I know we already forgot to do this once on the branch when adding a
new --param). Please take care of this when backporting.
Richard.
> gcc/
>
> * config.gcc: Install mwaitintrin.h for i[34567]86-*-* and
> x86_64-*-* targets.
> * common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET):
> New.
> (OPTION_MASK_ISA2_MWAIT_UNSET): Likewise.
> (ix86_handle_option): Handle -mmwait.
> * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins):
> Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on
> __builtin_ia32_monitor and __builtin_ia32_mwait.
> * config/i386/i386-options.c (isa2_opts): Add -mmwait.
> (ix86_valid_target_attribute_inner_p): Likewise.
> (ix86_option_override_internal): Enable mwait/monitor
> instructions for -msse3.
> * config/i386/i386.h (TARGET_MWAIT): New.
> (TARGET_MWAIT_P): Likewise.
> * config/i386/i386.opt: Add -mmwait.
> * config/i386/mwaitintrin.h: New file.
> * config/i386/pmmintrin.h: Include <mwaitintrin.h>.
> * config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with
> TARGET_MWAIT.
> (@sse3_monitor_<mode>): Likewise.
> * config/i386/x86gprintrin.h: Include <mwaitintrin.h>.
> * doc/extend.texi: Document mwait target attribute.
> * doc/invoke.texi: Document -mmwait.
>
> gcc/testsuite/
>
> * gcc.target/i386/monitor-2.c: New test.
>
> (cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922)
> ---
> gcc/common/config/i386/i386-common.c | 15 +++++++
> gcc/config.gcc | 6 ++-
> gcc/config/i386/i386-builtins.c | 4 +-
> gcc/config/i386/i386-options.c | 7 +++
> gcc/config/i386/i386.h | 2 +
> gcc/config/i386/i386.opt | 4 ++
> gcc/config/i386/mwaitintrin.h | 52 +++++++++++++++++++++++
> gcc/config/i386/pmmintrin.h | 13 +-----
> gcc/config/i386/sse.md | 4 +-
> gcc/config/i386/x86gprintrin.h | 2 +
> gcc/doc/extend.texi | 5 +++
> gcc/doc/invoke.texi | 8 +++-
> gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++
> 13 files changed, 130 insertions(+), 19 deletions(-)
> create mode 100644 gcc/config/i386/mwaitintrin.h
> create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
>
> diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
> index 6a7b5c8312f..e156cc34584 100644
> --- a/gcc/common/config/i386/i386-common.c
> +++ b/gcc/common/config/i386/i386-common.c
> @@ -150,6 +150,7 @@ along with GCC; see the file COPYING3. If not see
> #define OPTION_MASK_ISA_F16C_SET \
> (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET)
> #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX
> +#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT
> #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO
> #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU
> #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID
> @@ -245,6 +246,7 @@ along with GCC; see the file COPYING3. If not see
> #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
> #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
> #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX
> +#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT
> #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO
> #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU
> #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID
> @@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts,
> }
> return true;
>
> + case OPT_mmwait:
> + if (value)
> + {
> + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET;
> + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET;
> + }
> + else
> + {
> + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET;
> + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET;
> + }
> + return true;
> +
> case OPT_mclzero:
> if (value)
> {
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 357b0bed067..a020e0808c9 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -414,7 +414,8 @@ i[34567]86-*-*)
> avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
> tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
> amxbf16intrin.h x86gprintrin.h uintrintrin.h
> - hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> + hresetintrin.h keylockerintrin.h avxvnniintrin.h
> + mwaitintrin.h"
> ;;
> x86_64-*-*)
> cpu_type=i386
> @@ -451,7 +452,8 @@ x86_64-*-*)
> avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
> tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
> amxbf16intrin.h x86gprintrin.h uintrintrin.h
> - hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> + hresetintrin.h keylockerintrin.h avxvnniintrin.h
> + mwaitintrin.h"
> ;;
> ia64-*-*)
> extra_headers=ia64intrin.h
> diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
> index 4fcdf4b89ee..128bd39816c 100644
> --- a/gcc/config/i386/i386-builtins.c
> +++ b/gcc/config/i386/i386-builtins.c
> @@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void)
> VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE);
>
> /* SSE3. */
> - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor",
> + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor",
> VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR);
> - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait",
> + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait",
> VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT);
>
> /* AES */
> diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> index 18d2c0b9f99..7ecd0cf8b8c 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] =
> { "-mmovbe", OPTION_MASK_ISA2_MOVBE },
> { "-mclzero", OPTION_MASK_ISA2_CLZERO },
> { "-mmwaitx", OPTION_MASK_ISA2_MWAITX },
> + { "-mmwait", OPTION_MASK_ISA2_MWAIT },
> { "-mmovdir64b", OPTION_MASK_ISA2_MOVDIR64B },
> { "-mwaitpkg", OPTION_MASK_ISA2_WAITPKG },
> { "-mcldemote", OPTION_MASK_ISA2_CLDEMOTE },
> @@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
> IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase),
> IX86_ATTR_ISA ("rdrnd", OPT_mrdrnd),
> IX86_ATTR_ISA ("mwaitx", OPT_mmwaitx),
> + IX86_ATTR_ISA ("mwait", OPT_mmwait),
> IX86_ATTR_ISA ("clzero", OPT_mclzero),
> IX86_ATTR_ISA ("pku", OPT_mpku),
> IX86_ATTR_ISA ("lwp", OPT_mlwp),
> @@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p,
> || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags))
> ix86_prefetch_sse = true;
>
> + /* Enable mwait/monitor instructions for -msse3. */
> + if (TARGET_SSE3_P (opts->x_ix86_isa_flags))
> + opts->x_ix86_isa_flags2
> + |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit;
> +
> /* Enable popcnt instruction for -msse4.2 or -mabm. */
> if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)
> || TARGET_ABM_P (opts->x_ix86_isa_flags))
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index 5583ec6881a..73e118900f7 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
> #define TARGET_CLWB_P(x) TARGET_ISA_CLWB_P(x)
> #define TARGET_MWAITX TARGET_ISA2_MWAITX
> #define TARGET_MWAITX_P(x) TARGET_ISA2_MWAITX_P(x)
> +#define TARGET_MWAIT TARGET_ISA2_MWAIT
> +#define TARGET_MWAIT_P(x) TARGET_ISA2_MWAIT_P(x)
> #define TARGET_PKU TARGET_ISA_PKU
> #define TARGET_PKU_P(x) TARGET_ISA_PKU_P(x)
> #define TARGET_SHSTK TARGET_ISA_SHSTK
> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> index c781fdc8278..7b8547bb1c3 100644
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation.
> mneeded
> Target Var(ix86_needed) Save
> Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property.
> +
> +mmwait
> +Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save
> +Support MWAIT and MONITOR built-in functions and code generation.
> diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h
> new file mode 100644
> index 00000000000..1ecbc4abb69
> --- /dev/null
> +++ b/gcc/config/i386/mwaitintrin.h
> @@ -0,0 +1,52 @@
> +/* Copyright (C) 2021 Free Software Foundation, Inc.
> +
> + This file is part of GCC.
> +
> + GCC is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3, or (at your option)
> + any later version.
> +
> + GCC is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + Under Section 7 of GPL version 3, you are granted additional
> + permissions described in the GCC Runtime Library Exception, version
> + 3.1, as published by the Free Software Foundation.
> +
> + You should have received a copy of the GNU General Public License and
> + a copy of the GCC Runtime Library Exception along with this program;
> + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
> + <http://www.gnu.org/licenses/>. */
> +
> +#ifndef _MWAITINTRIN_H_INCLUDED
> +#define _MWAITINTRIN_H_INCLUDED
> +
> +#ifndef __MWAIT__
> +#pragma GCC push_options
> +#pragma GCC target("mwait")
> +#define __DISABLE_MWAIT__
> +#endif /* __MWAIT__ */
> +
> +extern __inline void
> +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> +{
> + __builtin_ia32_monitor (__P, __E, __H);
> +}
> +
> +extern __inline void
> +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_mwait (unsigned int __E, unsigned int __H)
> +{
> + __builtin_ia32_mwait (__E, __H);
> +}
> +
> +#ifdef __DISABLE_MWAIT__
> +#undef __DISABLE_MWAIT__
> +#pragma GCC pop_options
> +#endif /* __DISABLE_MWAIT__ */
> +
> +#endif /* _MWAITINTRIN_H_INCLUDED */
> diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h
> index fa9c5bb8b9f..f8102d2be23 100644
> --- a/gcc/config/i386/pmmintrin.h
> +++ b/gcc/config/i386/pmmintrin.h
> @@ -29,6 +29,7 @@
>
> /* We need definitions from the SSE2 and SSE header files*/
> #include <emmintrin.h>
> +#include <mwaitintrin.h>
>
> #ifndef __SSE3__
> #pragma GCC push_options
> @@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P)
> return (__m128i) __builtin_ia32_lddqu ((char const *)__P);
> }
>
> -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> -{
> - __builtin_ia32_monitor (__P, __E, __H);
> -}
> -
> -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_mm_mwait (unsigned int __E, unsigned int __H)
> -{
> - __builtin_ia32_mwait (__E, __H);
> -}
> -
> #ifdef __DISABLE_SSE3__
> #undef __DISABLE_SSE3__
> #pragma GCC pop_options
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 3f81abc7804..43afe3dabed 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait"
> [(unspec_volatile [(match_operand:SI 0 "register_operand" "c")
> (match_operand:SI 1 "register_operand" "a")]
> UNSPECV_MWAIT)]
> - "TARGET_SSE3"
> + "TARGET_MWAIT"
> ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used.
> ;; Since 32bit register operands are implicitly zero extended to 64bit,
> ;; we only need to set up 32bit registers.
> @@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>"
> (match_operand:SI 1 "register_operand" "c")
> (match_operand:SI 2 "register_operand" "d")]
> UNSPECV_MONITOR)]
> - "TARGET_SSE3"
> + "TARGET_MWAIT"
> ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in
> ;; RCX and RDX are used. Since 32bit register operands are implicitly
> ;; zero extended to 64bit, we only need to set up 32bit registers.
> diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
> index ceda501252c..7793032ba90 100644
> --- a/gcc/config/i386/x86gprintrin.h
> +++ b/gcc/config/i386/x86gprintrin.h
> @@ -56,6 +56,8 @@
>
> #include <movdirintrin.h>
>
> +#include <mwaitintrin.h>
> +
> #include <mwaitxintrin.h>
>
> #include <pconfigintrin.h>
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 1bc66cce2b8..1acfaf1d345 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions.
> @cindex @code{target("movdiri")} function attribute, x86
> Enable/disable the generation of the MOVDIRI instructions.
>
> +@item mwait
> +@itemx no-mwait
> +@cindex @code{target("mwait")} function attribute, x86
> +Enable/disable the generation of the MWAIT and MONITOR instructions.
> +
> @item mwaitx
> @itemx no-mwaitx
> @cindex @code{target("mwaitx")} function attribute, x86
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 7f13ffb79e1..3e1f0bc8fad 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options.
> -mno-wide-multiply -mrtd -malign-double @gol
> -mpreferred-stack-boundary=@var{num} @gol
> -mincoming-stack-boundary=@var{num} @gol
> --mcld -mcx16 -msahf -mmovbe -mcrc32 @gol
> +-mcld -mcx16 -msahf -mmovbe -mcrc32 -mmwait @gol
> -mrecip -mrecip=@var{opt} @gol
> -mvzeroupper -mprefer-avx128 -mprefer-vector-width=@var{opt} @gol
> -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
> @@ -31159,6 +31159,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi},
> @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and
> @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction.
>
> +@item -mmwait
> +@opindex mmwait
> +This option enables built-in functions @code{__builtin_ia32_monitor},
> +and @code{__builtin_ia32_mwait} to generate the @code{monitor} and
> +@code{mwait} machine instructions.
> +
> @item -mrecip
> @opindex mrecip
> This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions
> diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c
> new file mode 100644
> index 00000000000..96eeec070f0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/monitor-2.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */
> +
> +/* Verify that they work in both 32bit and 64bit. */
> +
> +#include <x86gprintrin.h>
> +
> +void
> +foo (char *p, int x, int y, int z)
> +{
> + _mm_monitor (p, y, x);
> + _mm_mwait (z, y);
> +}
> +
> +void
> +bar (char *p, long x, long y, long z)
> +{
> + _mm_monitor (p, y, x);
> + _mm_mwait (z, y);
> +}
> +
> +void
> +foo1 (char *p)
> +{
> + _mm_monitor (p, 0, 0);
> + _mm_mwait (0, 0);
> +}
> --
> 2.31.1
>
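For readers following the option handling in the patch above, the
interplay between -msse3 and an explicit -mmwait/-mno-mwait can be
modelled in a few lines of standalone C. This is only a sketch and not
GCC code: the MODEL_MASK_* values, struct opts_model and all model_*
function names are made up for illustration, and the real -msse3
handling touches other ISA bits that are ignored here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for OPTION_MASK_ISA_SSE3 and
   OPTION_MASK_ISA2_MWAIT; the real bit values differ.  */
#define MODEL_MASK_SSE3  (UINT64_C (1) << 0)
#define MODEL_MASK_MWAIT (UINT64_C (1) << 1)

/* Simplified model of the three gcc_options fields the patch uses.  */
struct opts_model
{
  uint64_t isa_flags;            /* models x_ix86_isa_flags */
  uint64_t isa_flags2;           /* models x_ix86_isa_flags2 */
  uint64_t isa_flags2_explicit;  /* models x_ix86_isa_flags2_explicit */
};

/* Mirrors the OPT_mmwait case: set or clear the bit and record that
   the user chose it explicitly.  */
void
model_handle_mmwait (struct opts_model *o, int value)
{
  if (value)
    o->isa_flags2 |= MODEL_MASK_MWAIT;
  else
    o->isa_flags2 &= ~MODEL_MASK_MWAIT;
  o->isa_flags2_explicit |= MODEL_MASK_MWAIT;
}

/* Mirrors the new hunk in ix86_option_override_internal: SSE3 turns
   MWAIT on unless the user already made an explicit choice.  */
void
model_option_override (struct opts_model *o)
{
  if (o->isa_flags & MODEL_MASK_SSE3)
    o->isa_flags2 |= MODEL_MASK_MWAIT & ~o->isa_flags2_explicit;
}

/* sse3: 0 or 1.  mwait: -1 = option not given, 0 = -mno-mwait,
   1 = -mmwait.  Returns 1 if MWAIT ends up enabled, else 0.  */
int
model_mwait (int sse3, int mwait)
{
  struct opts_model o = { 0, 0, 0 };
  if (sse3)
    o.isa_flags |= MODEL_MASK_SSE3;
  if (mwait >= 0)
    model_handle_mmwait (&o, mwait);
  model_option_override (&o);
  return (o.isa_flags2 & MODEL_MASK_MWAIT) != 0;
}

/* Quick self-check of the interesting combinations.  */
void
model_selfcheck (void)
{
  assert (model_mwait (1, -1) == 1);  /* -msse3 alone implies mwait */
  assert (model_mwait (1, 0) == 0);   /* -msse3 -mno-mwait stays off */
  assert (model_mwait (0, -1) == 0);  /* neither option: off */
  assert (model_mwait (0, 1) == 1);   /* -mmwait alone: on */
}
```

Calling model_selfcheck () from a small driver exercises the four
cases. The point of the explicit mask is visible in the second one:
-mno-mwait wins over the implication from -msse3.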
Thread overview (16+ messages):
2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu
2021-08-13 13:50 ` [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only H.J. Lu
2021-08-16 6:11 ` Richard Biener [this message]
2021-08-16 12:25 ` H.J. Lu
2021-08-16 12:28 ` Richard Biener
2021-08-16 12:35 ` H.J. Lu
2021-08-16 12:37 ` Martin Liška
2021-08-13 13:51 ` [PATCH 2/5] x86: Use crc32 target option for CRC32 intrinsics H.J. Lu
2021-08-13 13:51 ` [PATCH 3/5] x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions H.J. Lu
2021-08-13 13:51 ` [PATCH 4/5] x86: Enable the GPR only instructions for -mgeneral-regs-only H.J. Lu
2021-08-13 13:51 ` [PATCH 5/5] <x86gprintrin.h>: Add pragma GCC target("general-regs-only") H.J. Lu
2021-08-16 6:11 ` [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only Richard Biener
2021-08-24 14:57 ` H.J. Lu
2021-08-25 7:34 ` Uros Bizjak
2021-08-25 12:14 ` H.J. Lu
2021-08-26 6:35 ` Richard Biener