public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only
@ 2021-08-13 13:50 H.J. Lu
  2021-08-13 13:50 ` [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only H.J. Lu
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: H.J. Lu @ 2021-08-13 13:50 UTC (permalink / raw)
  To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener

<x86gprintrin.h> and target("general-regs-only") function attribute
were added to GCC 11.  But their implementations are incomplete.  I'd
like to backport the following patches to GCC 11 branch to finish them.

H.J. Lu (5):
  x86: Add -mmwait for -mgeneral-regs-only
  x86: Use crc32 target option for CRC32 intrinsics
  x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
  x86: Enable the GPR only instructions for -mgeneral-regs-only
  <x86gprintrin.h>: Add pragma GCC target("general-regs-only")

 gcc/common/config/i386/i386-common.c       |  45 ++-
 gcc/config.gcc                             |   6 +-
 gcc/config/i386/i386-builtin.def           |   8 +-
 gcc/config/i386/i386-builtins.c            |   4 +-
 gcc/config/i386/i386-c.c                   |   2 +
 gcc/config/i386/i386-options.c             |  12 +
 gcc/config/i386/i386.c                     |   6 +-
 gcc/config/i386/i386.h                     |   2 +
 gcc/config/i386/i386.md                    |   4 +-
 gcc/config/i386/i386.opt                   |   4 +
 gcc/config/i386/ia32intrin.h               |  42 ++-
 gcc/config/i386/mwaitintrin.h              |  52 +++
 gcc/config/i386/pmmintrin.h                |  13 +-
 gcc/config/i386/serializeintrin.h          |   7 +-
 gcc/config/i386/sse.md                     |   4 +-
 gcc/config/i386/x86gprintrin.h             |  13 +
 gcc/doc/extend.texi                        |   5 +
 gcc/doc/invoke.texi                        |   8 +-
 gcc/testsuite/gcc.target/i386/crc32-6.c    |  13 +
 gcc/testsuite/gcc.target/i386/monitor-2.c  |  27 ++
 gcc/testsuite/gcc.target/i386/pr101492-1.c |  10 +
 gcc/testsuite/gcc.target/i386/pr101492-2.c |  10 +
 gcc/testsuite/gcc.target/i386/pr101492-3.c |  10 +
 gcc/testsuite/gcc.target/i386/pr101492-4.c |  12 +
 gcc/testsuite/gcc.target/i386/pr99744-3.c  |  13 +
 gcc/testsuite/gcc.target/i386/pr99744-4.c  | 357 +++++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr99744-5.c  |  25 ++
 gcc/testsuite/gcc.target/i386/pr99744-6.c  |  23 ++
 gcc/testsuite/gcc.target/i386/pr99744-7.c  |  12 +
 gcc/testsuite/gcc.target/i386/pr99744-8.c  |  13 +
 30 files changed, 717 insertions(+), 45 deletions(-)
 create mode 100644 gcc/config/i386/mwaitintrin.h
 create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c

-- 
2.31.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
  2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu
@ 2021-08-13 13:50 ` H.J. Lu
  2021-08-16  6:11   ` Richard Biener
  2021-08-13 13:51 ` [PATCH 2/5] x86: Use crc32 target option for CRC32 intrinsics H.J. Lu
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2021-08-13 13:50 UTC (permalink / raw)
  To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener

Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
-mgeneral-regs-only and make -msse3 to imply -mmwait.

gcc/

	* config.gcc: Install mwaitintrin.h for i[34567]86-*-* and
	x86_64-*-* targets.
	* common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET):
	New.
	(OPTION_MASK_ISA2_MWAIT_UNSET): Likewise.
	(ix86_handle_option): Handle -mmwait.
	* config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins):
	Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on
	__builtin_ia32_monitor and __builtin_ia32_mwait.
	* config/i386/i386-options.c (isa2_opts): Add -mmwait.
	(ix86_valid_target_attribute_inner_p): Likewise.
	(ix86_option_override_internal): Enable mwait/monitor
	instructions for -msse3.
	* config/i386/i386.h (TARGET_MWAIT): New.
	(TARGET_MWAIT_P): Likewise.
	* config/i386/i386.opt: Add -mmwait.
	* config/i386/mwaitintrin.h: New file.
	* config/i386/pmmintrin.h: Include <mwaitintrin.h>.
	* config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with
	TARGET_MWAIT.
	(@sse3_monitor_<mode>): Likewise.
	* config/i386/x86gprintrin.h: Include <mwaitintrin.h>.
	* doc/extend.texi: Document mwait target attribute.
	* doc/invoke.texi: Document -mmwait.

gcc/testsuite/

	* gcc.target/i386/monitor-2.c: New test.

(cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922)
---
 gcc/common/config/i386/i386-common.c      | 15 +++++++
 gcc/config.gcc                            |  6 ++-
 gcc/config/i386/i386-builtins.c           |  4 +-
 gcc/config/i386/i386-options.c            |  7 +++
 gcc/config/i386/i386.h                    |  2 +
 gcc/config/i386/i386.opt                  |  4 ++
 gcc/config/i386/mwaitintrin.h             | 52 +++++++++++++++++++++++
 gcc/config/i386/pmmintrin.h               | 13 +-----
 gcc/config/i386/sse.md                    |  4 +-
 gcc/config/i386/x86gprintrin.h            |  2 +
 gcc/doc/extend.texi                       |  5 +++
 gcc/doc/invoke.texi                       |  8 +++-
 gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++
 13 files changed, 130 insertions(+), 19 deletions(-)
 create mode 100644 gcc/config/i386/mwaitintrin.h
 create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 6a7b5c8312f..e156cc34584 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -150,6 +150,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_F16C_SET \
   (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET)
 #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX
+#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT
 #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO
 #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU
 #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID
@@ -245,6 +246,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
 #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
 #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX
+#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT
 #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO
 #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU
 #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID
@@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts,
 	}
       return true;
 
+    case OPT_mmwait:
+      if (value)
+	{
+	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET;
+	}
+      else
+	{
+	  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET;
+	}
+      return true;
+
     case OPT_mclzero:
       if (value)
 	{
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 357b0bed067..a020e0808c9 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -414,7 +414,8 @@ i[34567]86-*-*)
 		       avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
 		       tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
 		       amxbf16intrin.h x86gprintrin.h uintrintrin.h
-		       hresetintrin.h keylockerintrin.h avxvnniintrin.h"
+		       hresetintrin.h keylockerintrin.h avxvnniintrin.h
+		       mwaitintrin.h"
 	;;
 x86_64-*-*)
 	cpu_type=i386
@@ -451,7 +452,8 @@ x86_64-*-*)
 		       avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
 		       tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
 		       amxbf16intrin.h x86gprintrin.h uintrintrin.h
-		       hresetintrin.h keylockerintrin.h avxvnniintrin.h"
+		       hresetintrin.h keylockerintrin.h avxvnniintrin.h
+		       mwaitintrin.h"
 	;;
 ia64-*-*)
 	extra_headers=ia64intrin.h
diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
index 4fcdf4b89ee..128bd39816c 100644
--- a/gcc/config/i386/i386-builtins.c
+++ b/gcc/config/i386/i386-builtins.c
@@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void)
 			    VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE);
 
   /* SSE3.  */
-  def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor",
+  def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor",
 	       VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR);
-  def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait",
+  def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait",
 	       VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT);
 
   /* AES */
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 18d2c0b9f99..7ecd0cf8b8c 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] =
   { "-mmovbe",		OPTION_MASK_ISA2_MOVBE },
   { "-mclzero",		OPTION_MASK_ISA2_CLZERO },
   { "-mmwaitx",		OPTION_MASK_ISA2_MWAITX },
+  { "-mmwait",		OPTION_MASK_ISA2_MWAIT },
   { "-mmovdir64b",	OPTION_MASK_ISA2_MOVDIR64B },
   { "-mwaitpkg",	OPTION_MASK_ISA2_WAITPKG },
   { "-mcldemote",	OPTION_MASK_ISA2_CLDEMOTE },
@@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
     IX86_ATTR_ISA ("fsgsbase",	OPT_mfsgsbase),
     IX86_ATTR_ISA ("rdrnd",	OPT_mrdrnd),
     IX86_ATTR_ISA ("mwaitx",	OPT_mmwaitx),
+    IX86_ATTR_ISA ("mwait",	OPT_mmwait),
     IX86_ATTR_ISA ("clzero",	OPT_mclzero),
     IX86_ATTR_ISA ("pku",	OPT_mpku),
     IX86_ATTR_ISA ("lwp",	OPT_mlwp),
@@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p,
       || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags))
     ix86_prefetch_sse = true;
 
+  /* Enable mwait/monitor instructions for -msse3.  */
+  if (TARGET_SSE3_P (opts->x_ix86_isa_flags))
+    opts->x_ix86_isa_flags2
+      |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit;
+
   /* Enable popcnt instruction for -msse4.2 or -mabm.  */
   if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)
       || TARGET_ABM_P (opts->x_ix86_isa_flags))
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 5583ec6881a..73e118900f7 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define TARGET_CLWB_P(x)	TARGET_ISA_CLWB_P(x)
 #define TARGET_MWAITX	TARGET_ISA2_MWAITX
 #define TARGET_MWAITX_P(x)	TARGET_ISA2_MWAITX_P(x)
+#define TARGET_MWAIT	TARGET_ISA2_MWAIT
+#define TARGET_MWAIT_P(x)	TARGET_ISA2_MWAIT_P(x)
 #define TARGET_PKU	TARGET_ISA_PKU
 #define TARGET_PKU_P(x)	TARGET_ISA_PKU_P(x)
 #define TARGET_SHSTK	TARGET_ISA_SHSTK
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index c781fdc8278..7b8547bb1c3 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation.
 mneeded
 Target Var(ix86_needed) Save
 Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property.
+
+mmwait
+Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save
+Support MWAIT and MONITOR built-in functions and code generation.
diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h
new file mode 100644
index 00000000000..1ecbc4abb69
--- /dev/null
+++ b/gcc/config/i386/mwaitintrin.h
@@ -0,0 +1,52 @@
+/* Copyright (C) 2021 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _MWAITINTRIN_H_INCLUDED
+#define _MWAITINTRIN_H_INCLUDED
+
+#ifndef __MWAIT__
+#pragma GCC push_options
+#pragma GCC target("mwait")
+#define __DISABLE_MWAIT__
+#endif /* __MWAIT__ */
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
+{
+  __builtin_ia32_monitor (__P, __E, __H);
+}
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mwait (unsigned int __E, unsigned int __H)
+{
+  __builtin_ia32_mwait (__E, __H);
+}
+
+#ifdef __DISABLE_MWAIT__
+#undef __DISABLE_MWAIT__
+#pragma GCC pop_options
+#endif /* __DISABLE_MWAIT__ */
+
+#endif /* _MWAITINTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h
index fa9c5bb8b9f..f8102d2be23 100644
--- a/gcc/config/i386/pmmintrin.h
+++ b/gcc/config/i386/pmmintrin.h
@@ -29,6 +29,7 @@
 
 /* We need definitions from the SSE2 and SSE header files*/
 #include <emmintrin.h>
+#include <mwaitintrin.h>
 
 #ifndef __SSE3__
 #pragma GCC push_options
@@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P)
   return (__m128i) __builtin_ia32_lddqu ((char const *)__P);
 }
 
-extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
-{
-  __builtin_ia32_monitor (__P, __E, __H);
-}
-
-extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mwait (unsigned int __E, unsigned int __H)
-{
-  __builtin_ia32_mwait (__E, __H);
-}
-
 #ifdef __DISABLE_SSE3__
 #undef __DISABLE_SSE3__
 #pragma GCC pop_options
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 3f81abc7804..43afe3dabed 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait"
   [(unspec_volatile [(match_operand:SI 0 "register_operand" "c")
 		     (match_operand:SI 1 "register_operand" "a")]
 		    UNSPECV_MWAIT)]
-  "TARGET_SSE3"
+  "TARGET_MWAIT"
 ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used.
 ;; Since 32bit register operands are implicitly zero extended to 64bit,
 ;; we only need to set up 32bit registers.
@@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>"
 		     (match_operand:SI 1 "register_operand" "c")
 		     (match_operand:SI 2 "register_operand" "d")]
 		    UNSPECV_MONITOR)]
-  "TARGET_SSE3"
+  "TARGET_MWAIT"
 ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in
 ;; RCX and RDX are used.  Since 32bit register operands are implicitly
 ;; zero extended to 64bit, we only need to set up 32bit registers.
diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
index ceda501252c..7793032ba90 100644
--- a/gcc/config/i386/x86gprintrin.h
+++ b/gcc/config/i386/x86gprintrin.h
@@ -56,6 +56,8 @@
 
 #include <movdirintrin.h>
 
+#include <mwaitintrin.h>
+
 #include <mwaitxintrin.h>
 
 #include <pconfigintrin.h>
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 1bc66cce2b8..1acfaf1d345 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions.
 @cindex @code{target("movdiri")} function attribute, x86
 Enable/disable the generation of the MOVDIRI instructions.
 
+@item mwait
+@itemx no-mwait
+@cindex @code{target("mwait")} function attribute, x86
+Enable/disable the generation of the MWAIT and MONITOR instructions.
+
 @item mwaitx
 @itemx no-mwaitx
 @cindex @code{target("mwaitx")} function attribute, x86
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7f13ffb79e1..3e1f0bc8fad 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options.
 -mno-wide-multiply  -mrtd  -malign-double @gol
 -mpreferred-stack-boundary=@var{num} @gol
 -mincoming-stack-boundary=@var{num} @gol
--mcld  -mcx16  -msahf  -mmovbe  -mcrc32 @gol
+-mcld  -mcx16  -msahf  -mmovbe  -mcrc32 -mmwait @gol
 -mrecip  -mrecip=@var{opt} @gol
 -mvzeroupper  -mprefer-avx128  -mprefer-vector-width=@var{opt} @gol
 -mmmx  -msse  -msse2  -msse3  -mssse3  -msse4.1  -msse4.2  -msse4  -mavx @gol
@@ -31159,6 +31159,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi},
 @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and
 @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction.
 
+@item -mmwait
+@opindex mmwait
+This option enables built-in functions @code{__builtin_ia32_monitor},
+and @code{__builtin_ia32_mwait} to generate the @code{monitor} and
+@code{mwait} machine instructions.
+
 @item -mrecip
 @opindex mrecip
 This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions
diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c
new file mode 100644
index 00000000000..96eeec070f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/monitor-2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */
+
+/* Verify that they work in both 32bit and 64bit.  */
+
+#include <x86gprintrin.h>
+
+void
+foo (char *p, int x, int y, int z)
+{
+   _mm_monitor (p, y, x);
+   _mm_mwait (z, y);
+}
+
+void
+bar (char *p, long x, long y, long z)
+{
+   _mm_monitor (p, y, x);
+   _mm_mwait (z, y);
+}
+
+void
+foo1 (char *p)
+{
+   _mm_monitor (p, 0, 0);
+   _mm_mwait (0, 0);
+}
-- 
2.31.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 2/5] x86: Use crc32 target option for CRC32 intrinsics
  2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu
  2021-08-13 13:50 ` [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only H.J. Lu
@ 2021-08-13 13:51 ` H.J. Lu
  2021-08-13 13:51 ` [PATCH 3/5] x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions H.J. Lu
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2021-08-13 13:51 UTC (permalink / raw)
  To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener

Use crc32 target option for CRC32 intrinsics to support CRC32 intrinsics
without enabling SSE vector instructions.

	* config/i386/i386-c.c (ix86_target_macros_internal): Define
	__CRC32__ for -mcrc32.
	* config/i386/i386-options.c (ix86_option_override_internal):
	Enable crc32 instruction for -msse4.2.
	* config/i386/i386.md (sse4_2_crc32<mode>): Remove TARGET_SSE4_2
	check.
	(sse4_2_crc32di): Likewise.
	* config/i386/ia32intrin.h: Use crc32 target option for CRC32
	intrinsics.

(cherry picked from commit 39671f87b2df6a1894cc11a161e4a7949d1ddccd)
---
 gcc/config/i386/i386-c.c       |  2 ++
 gcc/config/i386/i386-options.c |  5 +++++
 gcc/config/i386/i386.md        |  4 ++--
 gcc/config/i386/ia32intrin.h   | 28 ++++++++++++++--------------
 4 files changed, 23 insertions(+), 16 deletions(-)

diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index be46d0506ad..5ed0de006fb 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -532,6 +532,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
     def_or_undef (parse_in, "__LZCNT__");
   if (isa_flag & OPTION_MASK_ISA_TBM)
     def_or_undef (parse_in, "__TBM__");
+  if (isa_flag & OPTION_MASK_ISA_CRC32)
+    def_or_undef (parse_in, "__CRC32__");
   if (isa_flag & OPTION_MASK_ISA_POPCNT)
     def_or_undef (parse_in, "__POPCNT__");
   if (isa_flag & OPTION_MASK_ISA_FSGSBASE)
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 7ecd0cf8b8c..19632b5fd6b 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2625,6 +2625,11 @@ ix86_option_override_internal (bool main_args_p,
     opts->x_ix86_isa_flags
       |= OPTION_MASK_ISA_POPCNT & ~opts->x_ix86_isa_flags_explicit;
 
+  /* Enable crc32 instruction for -msse4.2.  */
+  if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags))
+    opts->x_ix86_isa_flags
+      |= OPTION_MASK_ISA_CRC32 & ~opts->x_ix86_isa_flags_explicit;
+
   /* Enable lzcnt instruction for -mabm.  */
   if (TARGET_ABM_P(opts->x_ix86_isa_flags))
     opts->x_ix86_isa_flags
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 2fdf98266cd..1d528a4434a 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -20992,7 +20992,7 @@ (define_insn "sse4_2_crc32<mode>"
 	  [(match_operand:SI 1 "register_operand" "0")
 	   (match_operand:SWI124 2 "nonimmediate_operand" "<r>m")]
 	  UNSPEC_CRC32))]
-  "TARGET_SSE4_2 || TARGET_CRC32"
+  "TARGET_CRC32"
   "crc32{<imodesuffix>}\t{%2, %0|%0, %2}"
   [(set_attr "type" "sselog1")
    (set_attr "prefix_rep" "1")
@@ -21013,7 +21013,7 @@ (define_insn "sse4_2_crc32di"
 	  [(match_operand:DI 1 "register_operand" "0")
 	   (match_operand:DI 2 "nonimmediate_operand" "rm")]
 	  UNSPEC_CRC32))]
-  "TARGET_64BIT && (TARGET_SSE4_2 || TARGET_CRC32)"
+  "TARGET_64BIT && TARGET_CRC32"
   "crc32{q}\t{%2, %0|%0, %2}"
   [(set_attr "type" "sselog1")
    (set_attr "prefix_rep" "1")
diff --git a/gcc/config/i386/ia32intrin.h b/gcc/config/i386/ia32intrin.h
index 591394076cc..5422b0fc9e0 100644
--- a/gcc/config/i386/ia32intrin.h
+++ b/gcc/config/i386/ia32intrin.h
@@ -51,11 +51,11 @@ __bswapd (int __X)
 
 #ifndef __iamcu__
 
-#ifndef __SSE4_2__
+#ifndef __CRC32__
 #pragma GCC push_options
-#pragma GCC target("sse4.2")
-#define __DISABLE_SSE4_2__
-#endif /* __SSE4_2__ */
+#pragma GCC target("crc32")
+#define __DISABLE_CRC32__
+#endif /* __CRC32__ */
 
 /* 32bit accumulate CRC32 (polynomial 0x11EDC6F41) value.  */
 extern __inline unsigned int
@@ -79,10 +79,10 @@ __crc32d (unsigned int __C, unsigned int __V)
   return __builtin_ia32_crc32si (__C, __V);
 }
 
-#ifdef __DISABLE_SSE4_2__
-#undef __DISABLE_SSE4_2__
+#ifdef __DISABLE_CRC32__
+#undef __DISABLE_CRC32__
 #pragma GCC pop_options
-#endif /* __DISABLE_SSE4_2__ */
+#endif /* __DISABLE_CRC32__ */
 
 #endif /* __iamcu__ */
 
@@ -199,11 +199,11 @@ __bswapq (long long __X)
   return __builtin_bswap64 (__X);
 }
 
-#ifndef __SSE4_2__
+#ifndef __CRC32__
 #pragma GCC push_options
-#pragma GCC target("sse4.2")
-#define __DISABLE_SSE4_2__
-#endif /* __SSE4_2__ */
+#pragma GCC target("crc32")
+#define __DISABLE_CRC32__
+#endif /* __CRC32__ */
 
 /* 64bit accumulate CRC32 (polynomial 0x11EDC6F41) value.  */
 extern __inline unsigned long long
@@ -213,10 +213,10 @@ __crc32q (unsigned long long __C, unsigned long long __V)
   return __builtin_ia32_crc32di (__C, __V);
 }
 
-#ifdef __DISABLE_SSE4_2__
-#undef __DISABLE_SSE4_2__
+#ifdef __DISABLE_CRC32__
+#undef __DISABLE_CRC32__
 #pragma GCC pop_options
-#endif /* __DISABLE_SSE4_2__ */
+#endif /* __DISABLE_CRC32__ */
 
 /* 64bit popcnt */
 extern __inline long long
-- 
2.31.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 3/5] x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
  2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu
  2021-08-13 13:50 ` [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only H.J. Lu
  2021-08-13 13:51 ` [PATCH 2/5] x86: Use crc32 target option for CRC32 intrinsics H.J. Lu
@ 2021-08-13 13:51 ` H.J. Lu
  2021-08-13 13:51 ` [PATCH 4/5] x86: Enable the GPR only instructions for -mgeneral-regs-only H.J. Lu
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2021-08-13 13:51 UTC (permalink / raw)
  To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener

Since

commit 39671f87b2df6a1894cc11a161e4a7949d1ddccd
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Thu Apr 15 05:59:48 2021 -0700

    x86: Use crc32 target option for CRC32 intrinsics

enabled OPTION_MASK_ISA_CRC32 for -msse4 and removed TARGET_SSE4_2 check
in sse4_2_crc32<mode> pattens, remove OPTION_MASK_ISA_SSE4_2 from CRC32
_builtin functions.

gcc/

	PR target/101549
	* config/i386/i386-builtin.def: Remove OPTION_MASK_ISA_SSE4_2
	from CRC32 _builtin functions.

gcc/testsuite/

	PR target/101549
	* gcc.target/i386/crc32-6.c: New test.

(cherry picked from commit 7aa28dbc371cf3c09c05c68672b00d9006391595)
---
 gcc/config/i386/i386-builtin.def        |  8 ++++----
 gcc/testsuite/gcc.target/i386/crc32-6.c | 13 +++++++++++++
 2 files changed, 17 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c

diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index e3ed4e1578f..ea509c67ddb 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -963,10 +963,10 @@ BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_ptestv2di, "__builtin_ia32_pte
 
 /* SSE4.2 */
 BDESC (OPTION_MASK_ISA_SSE4_2, 0, CODE_FOR_sse4_2_gtv2di3, "__builtin_ia32_pcmpgtq", IX86_BUILTIN_PCMPGTQ, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI)
-BDESC (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32qi, "__builtin_ia32_crc32qi", IX86_BUILTIN_CRC32QI, UNKNOWN, (int) UINT_FTYPE_UINT_UCHAR)
-BDESC (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32hi, "__builtin_ia32_crc32hi", IX86_BUILTIN_CRC32HI, UNKNOWN, (int) UINT_FTYPE_UINT_USHORT)
-BDESC (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32si, "__builtin_ia32_crc32si", IX86_BUILTIN_CRC32SI, UNKNOWN, (int) UINT_FTYPE_UINT_UINT)
-BDESC (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_CRC32 | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_sse4_2_crc32di, "__builtin_ia32_crc32di", IX86_BUILTIN_CRC32DI, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64)
+BDESC (OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32qi, "__builtin_ia32_crc32qi", IX86_BUILTIN_CRC32QI, UNKNOWN, (int) UINT_FTYPE_UINT_UCHAR)
+BDESC (OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32hi, "__builtin_ia32_crc32hi", IX86_BUILTIN_CRC32HI, UNKNOWN, (int) UINT_FTYPE_UINT_USHORT)
+BDESC (OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32si, "__builtin_ia32_crc32si", IX86_BUILTIN_CRC32SI, UNKNOWN, (int) UINT_FTYPE_UINT_UINT)
+BDESC (OPTION_MASK_ISA_CRC32 | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_sse4_2_crc32di, "__builtin_ia32_crc32di", IX86_BUILTIN_CRC32DI, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64)
 
 /* SSE4A */
 BDESC (OPTION_MASK_ISA_SSE4A, 0, CODE_FOR_sse4a_extrqi, "__builtin_ia32_extrqi", IX86_BUILTIN_EXTRQI, UNKNOWN, (int) V2DI_FTYPE_V2DI_UINT_UINT)
diff --git a/gcc/testsuite/gcc.target/i386/crc32-6.c b/gcc/testsuite/gcc.target/i386/crc32-6.c
new file mode 100644
index 00000000000..464e3444069
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/crc32-6.c
@@ -0,0 +1,13 @@
+/* PR target/101549 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4 -mno-crc32" } */
+
+#include <immintrin.h>
+
+unsigned int
+test_mm_crc32_u8 (unsigned int CRC, unsigned char V)
+{
+  return _mm_crc32_u8 (CRC, V);
+}
+
+/* { dg-error "needs isa option -mcrc32" "" { target *-*-* } 0  } */
-- 
2.31.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 4/5] x86: Enable the GPR only instructions for -mgeneral-regs-only
  2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu
                   ` (2 preceding siblings ...)
  2021-08-13 13:51 ` [PATCH 3/5] x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions H.J. Lu
@ 2021-08-13 13:51 ` H.J. Lu
  2021-08-13 13:51 ` [PATCH 5/5] <x86gprintrin.h>: Add pragma GCC target("general-regs-only") H.J. Lu
  2021-08-16  6:11 ` [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only Richard Biener
  5 siblings, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2021-08-13 13:51 UTC (permalink / raw)
  To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener

For -mgeneral-regs-only, enable the GPR only instructions which are
enabled implicitly by SSE ISAs unless they have been disabled explicitly.

gcc/

	PR target/101492
	* common/config/i386/i386-common.c (ix86_handle_option): For
	-mgeneral-regs-only, enable the GPR only instructions which are
	enabled implicitly by SSE ISAs unless they have been disabled
	explicitly.

gcc/testsuite/

	PR target/101492
	* gcc.target/i386/pr101492-1.c: New test.
	* gcc.target/i386/pr101492-2.c: Likewise.
	* gcc.target/i386/pr101492-3.c: Likewise.
	* gcc.target/i386/pr101492-4.c: Likewise.

(cherry picked from commit 6ae8aac19cdbdbd96d90f86e4d8505fe121bdf06)
---
 gcc/common/config/i386/i386-common.c       | 30 ++++++++++++++++++++--
 gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 ++++++++
 gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 ++++++++
 gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 ++++++++
 gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 +++++++++
 5 files changed, 70 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index e156cc34584..38dbb9d9263 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -354,16 +354,42 @@ ix86_handle_option (struct gcc_options *opts,
     case OPT_mgeneral_regs_only:
       if (value)
 	{
+	  HOST_WIDE_INT general_regs_only_flags = 0;
+	  HOST_WIDE_INT general_regs_only_flags2 = 0;
+
+	  /* NB: Enable the GPR only instructions which are enabled
+	     implicitly by SSE ISAs unless they have been disabled
+	     explicitly.  */
+	  if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags))
+	    {
+	      if ((opts->x_ix86_isa_flags_explicit
+		   & OPTION_MASK_ISA_CRC32) == 0)
+		general_regs_only_flags |= OPTION_MASK_ISA_CRC32;
+	      if ((opts->x_ix86_isa_flags_explicit
+		   & OPTION_MASK_ISA_POPCNT) == 0)
+		general_regs_only_flags |= OPTION_MASK_ISA_POPCNT;
+	    }
+	  if (TARGET_SSE3_P (opts->x_ix86_isa_flags))
+	    {
+	      if ((opts->x_ix86_isa_flags2_explicit
+		   & OPTION_MASK_ISA2_MWAIT) == 0)
+		general_regs_only_flags2 |= OPTION_MASK_ISA2_MWAIT;
+	    }
+
 	  /* Disable MMX, SSE and x87 instructions if only
 	     general registers are allowed.  */
 	  opts->x_ix86_isa_flags
 	    &= ~OPTION_MASK_ISA_GENERAL_REGS_ONLY_UNSET;
 	  opts->x_ix86_isa_flags2
 	    &= ~OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET;
+	  opts->x_ix86_isa_flags |= general_regs_only_flags;
+	  opts->x_ix86_isa_flags2 |= general_regs_only_flags2;
 	  opts->x_ix86_isa_flags_explicit
-	    |= OPTION_MASK_ISA_GENERAL_REGS_ONLY_UNSET;
+	    |= (OPTION_MASK_ISA_GENERAL_REGS_ONLY_UNSET
+		| general_regs_only_flags);
 	  opts->x_ix86_isa_flags2_explicit
-	    |= OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET;
+	    |= (OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET
+		| general_regs_only_flags2);
 
 	  opts->x_target_flags &= ~MASK_80387;
 	}
diff --git a/gcc/testsuite/gcc.target/i386/pr101492-1.c b/gcc/testsuite/gcc.target/i386/pr101492-1.c
new file mode 100644
index 00000000000..41002571761
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101492-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4.2 -mgeneral-regs-only" } */
+
+#include <x86intrin.h>
+
+unsigned int
+foo1 (unsigned int x, unsigned int y)
+{
+  return __crc32d (x, y);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr101492-2.c b/gcc/testsuite/gcc.target/i386/pr101492-2.c
new file mode 100644
index 00000000000..c7d24f43c39
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101492-2.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4.2 -mgeneral-regs-only" } */
+
+#include <x86intrin.h>
+
+unsigned int
+foo1 (unsigned int x)
+{
+  return _mm_popcnt_u32 (x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr101492-3.c b/gcc/testsuite/gcc.target/i386/pr101492-3.c
new file mode 100644
index 00000000000..37e2071ab57
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101492-3.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse3 -mgeneral-regs-only" } */
+
+#include <x86intrin.h>
+
+void
+foo1 (unsigned int x, unsigned int y)
+{
+  _mm_mwait (x, y);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr101492-4.c b/gcc/testsuite/gcc.target/i386/pr101492-4.c
new file mode 100644
index 00000000000..c5a4f0abd25
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101492-4.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-mwait -msse3 -mgeneral-regs-only" } */
+
+#include <x86intrin.h>
+
+void
+foo1 (unsigned int x, unsigned int y)
+{
+  _mm_mwait (x, y);
+}
+
+/* { dg-error "target specific option mismatch" "" { target *-*-* } 0 } */
-- 
2.31.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 5/5] <x86gprintrin.h>: Add pragma GCC target("general-regs-only")
  2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu
                   ` (3 preceding siblings ...)
  2021-08-13 13:51 ` [PATCH 4/5] x86: Enable the GPR only instructions for -mgeneral-regs-only H.J. Lu
@ 2021-08-13 13:51 ` H.J. Lu
  2021-08-16  6:11 ` [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only Richard Biener
  5 siblings, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2021-08-13 13:51 UTC (permalink / raw)
  To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener

1. Intrinsics in <x86gprintrin.h> only require GPR ISAs.  Add

 #if defined __MMX__ || defined __SSE__
 #pragma GCC push_options
 #pragma GCC target("general-regs-only")
 #define __DISABLE_GENERAL_REGS_ONLY__
 #endif

and

 #ifdef __DISABLE_GENERAL_REGS_ONLY__
 #undef __DISABLE_GENERAL_REGS_ONLY__
 #pragma GCC pop_options
 #endif /* __DISABLE_GENERAL_REGS_ONLY__ */

to <x86gprintrin.h> to disable non-GPR ISAs so that they can be used in
functions with __attribute__ ((target("general-regs-only"))).
2. When checking always_inline attribute, if callee only uses GPRs,
ignore MASK_80387 since enable MASK_80387 in caller has no impact on
callee inline.

gcc/

	PR target/99744
	* config/i386/i386.c (ix86_can_inline_p): Ignore MASK_80387 if
	callee only uses GPRs.
	* config/i386/ia32intrin.h: Revert commit 5463cee2770.
	* config/i386/serializeintrin.h: Revert commit 71958f740f1.
	* config/i386/x86gprintrin.h: Add
	#pragma GCC target("general-regs-only") and #pragma GCC pop_options
	to disable non-GPR ISAs.

gcc/testsuite/

	PR target/99744
	* gcc.target/i386/pr99744-3.c: New test.
	* gcc.target/i386/pr99744-4.c: Likewise.
	* gcc.target/i386/pr99744-5.c: Likewise.
	* gcc.target/i386/pr99744-6.c: Likewise.
	* gcc.target/i386/pr99744-7.c: Likewise.
	* gcc.target/i386/pr99744-8.c: Likewise.

(cherry picked from commit 72264a639729a5dcc21dbee304717ce22b338bfd)
---
 gcc/config/i386/i386.c                    |   6 +-
 gcc/config/i386/ia32intrin.h              |  14 +-
 gcc/config/i386/serializeintrin.h         |   7 +-
 gcc/config/i386/x86gprintrin.h            |  11 +
 gcc/testsuite/gcc.target/i386/pr99744-3.c |  13 +
 gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 ++++++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr99744-5.c |  25 ++
 gcc/testsuite/gcc.target/i386/pr99744-6.c |  23 ++
 gcc/testsuite/gcc.target/i386/pr99744-7.c |  12 +
 gcc/testsuite/gcc.target/i386/pr99744-8.c |  13 +
 10 files changed, 477 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 5a7bc8c44a8..527d493ecae 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -553,7 +553,7 @@ ix86_can_inline_p (tree caller, tree callee)
 
   /* Changes of those flags can be tolerated for always inlines. Lets hope
      user knows what he is doing.  */
-  const unsigned HOST_WIDE_INT always_inline_safe_mask
+  unsigned HOST_WIDE_INT always_inline_safe_mask
 	 = (MASK_USE_8BIT_IDIV | MASK_ACCUMULATE_OUTGOING_ARGS
 	    | MASK_NO_ALIGN_STRINGOPS | MASK_AVX256_SPLIT_UNALIGNED_LOAD
 	    | MASK_AVX256_SPLIT_UNALIGNED_STORE | MASK_CLD
@@ -578,6 +578,10 @@ ix86_can_inline_p (tree caller, tree callee)
        && lookup_attribute ("always_inline",
 			    DECL_ATTRIBUTES (callee)));
 
+  /* If callee only uses GPRs, ignore MASK_80387.  */
+  if (TARGET_GENERAL_REGS_ONLY_P (callee_opts->x_ix86_target_flags))
+    always_inline_safe_mask |= MASK_80387;
+
   cgraph_node *callee_node = cgraph_node::get (callee);
   /* Callee's isa options should be a subset of the caller's, i.e. a SSE4
      function can inline a SSE2 function but a SSE2 function can't inline
diff --git a/gcc/config/i386/ia32intrin.h b/gcc/config/i386/ia32intrin.h
index 5422b0fc9e0..df99220ee4f 100644
--- a/gcc/config/i386/ia32intrin.h
+++ b/gcc/config/i386/ia32intrin.h
@@ -107,12 +107,22 @@ __rdpmc (int __S)
 #endif /* __iamcu__ */
 
 /* rdtsc */
-#define __rdtsc()		__builtin_ia32_rdtsc ()
+extern __inline unsigned long long
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__rdtsc (void)
+{
+  return __builtin_ia32_rdtsc ();
+}
 
 #ifndef __iamcu__
 
 /* rdtscp */
-#define __rdtscp(a)		__builtin_ia32_rdtscp (a)
+extern __inline unsigned long long
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__rdtscp (unsigned int *__A)
+{
+  return __builtin_ia32_rdtscp (__A);
+}
 
 #endif /* __iamcu__ */
 
diff --git a/gcc/config/i386/serializeintrin.h b/gcc/config/i386/serializeintrin.h
index e280250b198..89b5b94ea9b 100644
--- a/gcc/config/i386/serializeintrin.h
+++ b/gcc/config/i386/serializeintrin.h
@@ -34,7 +34,12 @@
 #define __DISABLE_SERIALIZE__
 #endif /* __SERIALIZE__ */
 
-#define _serialize()	__builtin_ia32_serialize ()
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_serialize (void)
+{
+  __builtin_ia32_serialize ();
+}
 
 #ifdef __DISABLE_SERIALIZE__
 #undef __DISABLE_SERIALIZE__
diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
index 7793032ba90..b7fefa780a6 100644
--- a/gcc/config/i386/x86gprintrin.h
+++ b/gcc/config/i386/x86gprintrin.h
@@ -24,6 +24,12 @@
 #ifndef _X86GPRINTRIN_H_INCLUDED
 #define _X86GPRINTRIN_H_INCLUDED
 
+#if defined __MMX__ || defined __SSE__
+#pragma GCC push_options
+#pragma GCC target("general-regs-only")
+#define __DISABLE_GENERAL_REGS_ONLY__
+#endif
+
 #include <ia32intrin.h>
 
 #ifndef __iamcu__
@@ -255,4 +261,9 @@ _ptwrite32 (unsigned __B)
 
 #endif /* __iamcu__ */
 
+#ifdef __DISABLE_GENERAL_REGS_ONLY__
+#undef __DISABLE_GENERAL_REGS_ONLY__
+#pragma GCC pop_options
+#endif /* __DISABLE_GENERAL_REGS_ONLY__ */
+
 #endif /* _X86GPRINTRIN_H_INCLUDED.  */
diff --git a/gcc/testsuite/gcc.target/i386/pr99744-3.c b/gcc/testsuite/gcc.target/i386/pr99744-3.c
new file mode 100644
index 00000000000..6c505816ceb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99744-3.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-serialize" } */
+
+#include <x86intrin.h>
+
+__attribute__ ((target("general-regs-only")))
+void
+foo1 (void)
+{
+  _serialize ();
+}
+
+/* { dg-error "target specific option mismatch" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/i386/pr99744-4.c b/gcc/testsuite/gcc.target/i386/pr99744-4.c
new file mode 100644
index 00000000000..9196e62d955
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99744-4.c
@@ -0,0 +1,357 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mbmi -mbmi2 -mcldemote -mclflushopt -mclwb -mclzero -mcrc32 -menqcmd -mfsgsbase -mfxsr -mhreset -mlzcnt -mlwp -mmovdir64b -mmovdiri -mmwaitx -mpconfig -mpku -mpopcnt -mptwrite -mrdpid -mrdrnd -mrdseed -mrtm -msgx -mshstk -mtbm -mtsxldtrk -mxsave -mxsavec -mxsaveopt -mxsaves -mwaitpkg -mwbnoinvd" } */
+/* { dg-additional-options "-muintr" { target { ! ia32 } } }  */
+
+/* Test calling GPR intrinsics from functions with general-regs-only
+   target attribute.  */
+
+#include <x86gprintrin.h>
+
+#define _CONCAT(x,y) x ## y
+
+#define test_0(func, type)						\
+  __attribute__ ((target("general-regs-only")))				\
+  type _CONCAT(do_,func) (void)						\
+  { return func (); }
+
+#define test_0_i1(func, type, imm)					\
+  __attribute__ ((target("general-regs-only")))				\
+  type _CONCAT(do_,func) (void)						\
+  { return func (imm); }
+
+#define test_1(func, type, op1_type)					\
+  __attribute__ ((target("general-regs-only")))				\
+  type _CONCAT(do_,func) (op1_type A)					\
+  { return func (A); }
+
+#define test_1_i1(func, type, op1_type, imm)				\
+  __attribute__ ((target("general-regs-only")))				\
+  type _CONCAT(do_,func) (op1_type A)					\
+  { return func (A, imm); }
+
+#define test_2(func, type, op1_type, op2_type)				\
+  __attribute__ ((target("general-regs-only")))				\
+  type _CONCAT(do_,func) (op1_type A, op2_type B)			\
+  { return func (A, B); }
+
+#define test_2_i1(func, type, op1_type, op2_type, imm)			\
+  __attribute__ ((target("general-regs-only")))				\
+  type _CONCAT(do_,func) (op1_type A, op2_type B)			\
+  { return func (A, B, imm); }
+
+#define test_3(func, type, op1_type, op2_type, op3_type)		\
+  __attribute__ ((target("general-regs-only")))				\
+  type _CONCAT(do_,func) (op1_type A, op2_type B, op3_type C)		\
+  { return func (A, B, C); }
+
+#define test_4(func, type, op1_type, op2_type, op3_type, op4_type)	\
+  __attribute__ ((target("general-regs-only")))				\
+  type _CONCAT(do_,func) (op1_type A, op2_type B, op3_type C,		\
+			  op4_type D)					\
+  { return func (A, B, C, D); }
+
+/* ia32intrin.h  */
+test_1 (__bsfd, int, int)
+test_1 (__bsrd, int, int)
+test_1 (__bswapd, int, int)
+test_1 (__popcntd, int, unsigned int)
+test_2 (__rolb, unsigned char, unsigned char, int)
+test_2 (__rolw, unsigned short, unsigned short, int)
+test_2 (__rold, unsigned int, unsigned int, int)
+test_2 (__rorb, unsigned char, unsigned char, int)
+test_2 (__rorw, unsigned short, unsigned short, int)
+test_2 (__rord, unsigned int, unsigned int, int)
+
+#ifndef __iamcu__
+/* adxintrin.h */
+test_4 (_subborrow_u32, unsigned char, unsigned char, unsigned int,
+	unsigned int, unsigned int *)
+test_4 (_addcarry_u32, unsigned char, unsigned char, unsigned int,
+	unsigned int, unsigned int *)
+test_4 (_addcarryx_u32, unsigned char, unsigned char, unsigned int,
+	unsigned int, unsigned int *)
+
+/* bmiintrin.h */
+test_1 (__tzcnt_u16, unsigned short, unsigned short)
+test_2 (__andn_u32, unsigned int, unsigned int, unsigned int)
+test_2 (__bextr_u32, unsigned int, unsigned int, unsigned int)
+test_3 (_bextr_u32, unsigned int, unsigned int, unsigned int,
+	unsigned int)
+test_1 (__blsi_u32, unsigned int, unsigned int)
+test_1 (_blsi_u32, unsigned int, unsigned int)
+test_1 (__blsmsk_u32, unsigned int, unsigned int)
+test_1 (_blsmsk_u32, unsigned int, unsigned int)
+test_1 (__blsr_u32, unsigned int, unsigned int)
+test_1 (_blsr_u32, unsigned int, unsigned int)
+test_1 (__tzcnt_u32, unsigned int, unsigned int)
+test_1 (_tzcnt_u32, unsigned int, unsigned int)
+
+/* bmi2intrin.h */
+test_2 (_bzhi_u32, unsigned int, unsigned int, unsigned int)
+test_2 (_pdep_u32, unsigned int, unsigned int, unsigned int)
+test_2 (_pext_u32, unsigned int, unsigned int, unsigned int)
+
+/* cetintrin.h */
+test_1 (_inc_ssp, void, unsigned int)
+test_0 (_saveprevssp, void)
+test_1 (_rstorssp, void, void *)
+test_2 (_wrssd, void, unsigned int, void *)
+test_2 (_wrussd, void, unsigned int, void *)
+test_0 (_setssbsy, void)
+test_1 (_clrssbsy, void, void *)
+
+/* cldemoteintrin.h */
+test_1 (_cldemote, void, void *)
+
+/* clflushoptintrin.h */
+test_1 (_mm_clflushopt, void, void *)
+
+/* clwbintrin.h */
+test_1 (_mm_clwb, void, void *)
+
+/* clzerointrin.h */
+test_1 (_mm_clzero, void, void *)
+
+/* enqcmdintrin.h */
+test_2 (_enqcmd, int, void *, const void *)
+test_2 (_enqcmds, int, void *, const void *)
+
+/* fxsrintrin.h */
+test_1 (_fxsave, void, void *)
+test_1 (_fxrstor, void, void *)
+
+/* hresetintrin.h */
+test_1 (_hreset, void, unsigned int)
+
+/* ia32intrin.h  */
+test_2 (__crc32b, unsigned int, unsigned char, unsigned char)
+test_2 (__crc32w, unsigned int, unsigned short, unsigned short)
+test_2 (__crc32d, unsigned int, unsigned int, unsigned int)
+test_1 (__rdpmc, unsigned long long, int)
+test_0 (__rdtsc, unsigned long long)
+test_1 (__rdtscp, unsigned long long, unsigned int *)
+test_0 (__pause, void)
+
+/* lzcntintrin.h */
+test_1 (__lzcnt16, unsigned short, unsigned short)
+test_1 (__lzcnt32, unsigned int, unsigned int)
+test_1 (_lzcnt_u32, unsigned int, unsigned int)
+
+/* lwpintrin.h */
+test_1 (__llwpcb, void, void *)
+test_0 (__slwpcb, void *)
+test_2_i1 (__lwpval32, void, unsigned int, unsigned int, 1)
+test_2_i1 (__lwpins32, unsigned char, unsigned int, unsigned int, 1)
+
+/* movdirintrin.h */
+test_2 (_directstoreu_u32, void, void *, unsigned int)
+test_2 (_movdir64b, void, void *, const void *)
+
+/* mwaitxintrin.h */
+test_3 (_mm_monitorx, void, void const *, unsigned int, unsigned int)
+test_3 (_mm_mwaitx, void, unsigned int, unsigned int, unsigned int)
+
+/* pconfigintrin.h */
+test_2 (_pconfig_u32, unsigned int, const unsigned int, size_t *)
+
+/* pkuintrin.h */
+test_0 (_rdpkru_u32, unsigned int)
+test_1 (_wrpkru, void, unsigned int)
+
+/* popcntintrin.h */
+test_1 (_mm_popcnt_u32, int, unsigned int)
+
+/* rdseedintrin.h */
+test_1 (_rdseed16_step, int, unsigned short *)
+test_1 (_rdseed32_step, int, unsigned int *)
+
+/* rtmintrin.h */
+test_0 (_xbegin, unsigned int)
+test_0 (_xend, void)
+test_0_i1 (_xabort, void, 1)
+
+/* sgxintrin.h */
+test_2 (_encls_u32, unsigned int, const unsigned int, size_t *)
+test_2 (_enclu_u32, unsigned int, const unsigned int, size_t *)
+test_2 (_enclv_u32, unsigned int, const unsigned int, size_t *)
+
+/* tbmintrin.h */
+test_1_i1 (__bextri_u32, unsigned int, unsigned int, 1)
+test_1 (__blcfill_u32, unsigned int, unsigned int)
+test_1 (__blci_u32, unsigned int, unsigned int)
+test_1 (__blcic_u32, unsigned int, unsigned int)
+test_1 (__blcmsk_u32, unsigned int, unsigned int)
+test_1 (__blcs_u32, unsigned int, unsigned int)
+test_1 (__blsfill_u32, unsigned int, unsigned int)
+test_1 (__blsic_u32, unsigned int, unsigned int)
+test_1 (__t1mskc_u32, unsigned int, unsigned int)
+test_1 (__tzmsk_u32, unsigned int, unsigned int)
+
+/* tsxldtrkintrin.h */
+test_0 (_xsusldtrk, void)
+test_0 (_xresldtrk, void)
+
+/* x86gprintrin.h */
+test_1 (_ptwrite32, void, unsigned int)
+test_1 (_rdrand16_step, int, unsigned short *)
+test_1 (_rdrand32_step, int, unsigned int *)
+test_0 (_wbinvd, void)
+
+/* xtestintrin.h */
+test_0 (_xtest, int)
+
+/* xsaveintrin.h */
+test_2 (_xsave, void, void *, long long)
+test_2 (_xrstor, void, void *, long long)
+test_2 (_xsetbv, void, unsigned int, long long)
+test_1 (_xgetbv, long long, unsigned int)
+
+/* xsavecintrin.h */
+test_2 (_xsavec, void, void *, long long)
+
+/* xsaveoptintrin.h */
+test_2 (_xsaveopt, void, void *, long long)
+
+/* xsavesintrin.h */
+test_2 (_xsaves, void, void *, long long)
+test_2 (_xrstors, void, void *, long long)
+
+/* wbnoinvdintrin.h */
+test_0 (_wbnoinvd, void)
+
+#ifdef __x86_64__
+/* adxintrin.h */
+test_4 (_subborrow_u64, unsigned char, unsigned char,
+	unsigned long long, unsigned long long,
+	unsigned long long *)
+test_4 (_addcarry_u64, unsigned char, unsigned char,
+	unsigned long long, unsigned long long,
+	unsigned long long *)
+test_4 (_addcarryx_u64, unsigned char, unsigned char,
+	unsigned long long, unsigned long long,
+	unsigned long long *)
+
+/* bmiintrin.h */
+test_2 (__andn_u64, unsigned long long, unsigned long long,
+	unsigned long long)
+test_2 (__bextr_u64, unsigned long long, unsigned long long,
+	unsigned long long)
+test_3 (_bextr_u64, unsigned long long, unsigned long long,
+	unsigned long long, unsigned long long)
+test_1 (__blsi_u64, unsigned long long, unsigned long long)
+test_1 (_blsi_u64, unsigned long long, unsigned long long)
+test_1 (__blsmsk_u64, unsigned long long, unsigned long long)
+test_1 (_blsmsk_u64, unsigned long long, unsigned long long)
+test_1 (__blsr_u64, unsigned long long, unsigned long long)
+test_1 (_blsr_u64, unsigned long long, unsigned long long)
+test_1 (__tzcnt_u64, unsigned long long, unsigned long long)
+test_1 (_tzcnt_u64, unsigned long long, unsigned long long)
+
+/* bmi2intrin.h */
+test_2 (_bzhi_u64, unsigned long long, unsigned long long,
+	unsigned long long)
+test_2 (_pdep_u64, unsigned long long, unsigned long long,
+	unsigned long long)
+test_2 (_pext_u64, unsigned long long, unsigned long long,
+	unsigned long long)
+test_3 (_mulx_u64, unsigned long long, unsigned long long,
+	unsigned long long, unsigned long long *)
+
+/* cetintrin.h */
+test_0 (_get_ssp, unsigned long long)
+test_2 (_wrssq, void, unsigned long long, void *)
+test_2 (_wrussq, void, unsigned long long, void *)
+
+/* fxsrintrin.h */
+test_1 (_fxsave64, void, void *)
+test_1 (_fxrstor64, void, void *)
+
+/* ia32intrin.h  */
+test_1 (__bsfq, int, long long)
+test_1 (__bsrq, int, long long)
+test_1 (__bswapq, long long, long long)
+test_2 (__crc32q, unsigned long long, unsigned long long,
+	unsigned long long)
+test_1 (__popcntq, long long, unsigned long long)
+test_2 (__rolq, unsigned long long, unsigned long long, int)
+test_2 (__rorq, unsigned long long, unsigned long long, int)
+test_0 (__readeflags, unsigned long long)
+test_1 (__writeeflags, void, unsigned int)
+
+/* lzcntintrin.h */
+test_1 (__lzcnt64, unsigned long long, unsigned long long)
+test_1 (_lzcnt_u64, unsigned long long, unsigned long long)
+
+/* lwpintrin.h */
+test_2_i1 (__lwpval64, void, unsigned long long, unsigned int, 1)
+test_2_i1 (__lwpins64, unsigned char, unsigned long long,
+	   unsigned int, 1)
+
+/* movdirintrin.h */
+test_2 (_directstoreu_u64, void, void *, unsigned long long)
+
+/* popcntintrin.h */
+test_1 (_mm_popcnt_u64, long long, unsigned long long)
+
+/* rdseedintrin.h */
+test_1 (_rdseed64_step, int, unsigned long long *)
+
+/* tbmintrin.h */
+test_1_i1 (__bextri_u64, unsigned long long, unsigned long long, 1)
+test_1 (__blcfill_u64, unsigned long long, unsigned long long)
+test_1 (__blci_u64, unsigned long long, unsigned long long)
+test_1 (__blcic_u64, unsigned long long, unsigned long long)
+test_1 (__blcmsk_u64, unsigned long long, unsigned long long)
+test_1 (__blcs_u64, unsigned long long, unsigned long long)
+test_1 (__blsfill_u64, unsigned long long, unsigned long long)
+test_1 (__blsic_u64, unsigned long long, unsigned long long)
+test_1 (__t1mskc_u64, unsigned long long, unsigned long long)
+test_1 (__tzmsk_u64, unsigned long long, unsigned long long)
+
+/* uintrintrin.h */
+test_0 (_clui, void)
+test_1 (_senduipi, void, unsigned long long)
+test_0 (_stui, void)
+test_0 (_testui, unsigned char)
+
+/* x86gprintrin.h */
+test_1 (_ptwrite64, void, unsigned long long)
+test_0 (_readfsbase_u32, unsigned int)
+test_0 (_readfsbase_u64, unsigned long long)
+test_0 (_readgsbase_u32, unsigned int)
+test_0 (_readgsbase_u64, unsigned long long)
+test_1 (_rdrand64_step, int, unsigned long long *)
+test_1 (_writefsbase_u32, void, unsigned int)
+test_1 (_writefsbase_u64, void, unsigned long long)
+test_1 (_writegsbase_u32, void, unsigned int)
+test_1 (_writegsbase_u64, void, unsigned long long)
+
+/* xsaveintrin.h */
+test_2 (_xsave64, void, void *, long long)
+test_2 (_xrstor64, void, void *, long long)
+
+/* xsavecintrin.h */
+test_2 (_xsavec64, void, void *, long long)
+
+/* xsaveoptintrin.h */
+test_2 (_xsaveopt64, void, void *, long long)
+
+/* xsavesintrin.h */
+test_2 (_xsaves64, void, void *, long long)
+test_2 (_xrstors64, void, void *, long long)
+
+/* waitpkgintrin.h */
+test_1 (_umonitor, void, void *)
+test_2 (_umwait, unsigned char, unsigned int, unsigned long long)
+test_2 (_tpause, unsigned char, unsigned int, unsigned long long)
+
+#else /* !__x86_64__ */
+/* bmi2intrin.h */
+test_3 (_mulx_u32, unsigned int, unsigned int, unsigned int,
+	unsigned int *)
+
+/* cetintrin.h */
+test_0 (_get_ssp, unsigned int)
+#endif /* __x86_64__ */
+
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/pr99744-5.c b/gcc/testsuite/gcc.target/i386/pr99744-5.c
new file mode 100644
index 00000000000..9e40e5ef428
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99744-5.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mmwait" } */
+
+/* Test calling MWAIT intrinsics from functions with general-regs-only
+   target attribute.  */
+
+#include <x86gprintrin.h>
+
+#define _CONCAT(x,y) x ## y
+
+#define test_2(func, type, op1_type, op2_type)				\
+  __attribute__ ((target("general-regs-only")))				\
+  type _CONCAT(do_,func) (op1_type A, op2_type B)			\
+  { return func (A, B); }
+
+#define test_3(func, type, op1_type, op2_type, op3_type)		\
+  __attribute__ ((target("general-regs-only")))				\
+  type _CONCAT(do_,func) (op1_type A, op2_type B, op3_type C)		\
+  { return func (A, B, C); }
+
+#ifndef __iamcu__
+/* mwaitintrin.h */
+test_3 (_mm_monitor, void, void const *, unsigned int, unsigned int)
+test_2 (_mm_mwait, void, unsigned int, unsigned int)
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/pr99744-6.c b/gcc/testsuite/gcc.target/i386/pr99744-6.c
new file mode 100644
index 00000000000..4025918a9c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99744-6.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+#include <x86intrin.h>
+
+extern unsigned long long int curr_deadline;
+extern void bar (void);
+
+void
+foo1 (void)
+{
+  if (__rdtsc () < curr_deadline)
+    return; 
+  bar ();
+}
+
+void
+foo2 (unsigned int *p)
+{
+  if (__rdtscp (p) < curr_deadline)
+    return; 
+  bar ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr99744-7.c b/gcc/testsuite/gcc.target/i386/pr99744-7.c
new file mode 100644
index 00000000000..30b7ca05966
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99744-7.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mno-avx -Wno-psabi" } */
+
+#include <x86intrin.h>
+
+void
+foo (__m256 *x)
+{
+  x[0] = _mm256_sub_ps (x[1], x[2]);
+}
+
+/* { dg-error "target specific option mismatch" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/i386/pr99744-8.c b/gcc/testsuite/gcc.target/i386/pr99744-8.c
new file mode 100644
index 00000000000..115183eede6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99744-8.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O -Wno-psabi" } */
+
+#include <x86intrin.h>
+
+__attribute__((target ("no-avx")))
+void
+foo (__m256 *x)
+{
+  x[0] = _mm256_sub_ps (x[1], x[2]);
+}
+
+/* { dg-error "target specific option mismatch" "" { target *-*-* } 0 } */
-- 
2.31.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
  2021-08-13 13:50 ` [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only H.J. Lu
@ 2021-08-16  6:11   ` Richard Biener
  2021-08-16 12:25     ` H.J. Lu
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2021-08-16  6:11 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek

On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
> -mgeneral-regs-only and make -msse3 to imply -mmwait.

Adding new options requires to bump the LTO streaming minor version
(I know we forgot it once on the branch already when adding a new --param).

Please take care of this when backporting.

Richard.

> gcc/
>
>         * config.gcc: Install mwaitintrin.h for i[34567]86-*-* and
>         x86_64-*-* targets.
>         * common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET):
>         New.
>         (OPTION_MASK_ISA2_MWAIT_UNSET): Likewise.
>         (ix86_handle_option): Handle -mmwait.
>         * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins):
>         Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on
>         __builtin_ia32_monitor and __builtin_ia32_mwait.
>         * config/i386/i386-options.c (isa2_opts): Add -mmwait.
>         (ix86_valid_target_attribute_inner_p): Likewise.
>         (ix86_option_override_internal): Enable mwait/monitor
>         instructions for -msse3.
>         * config/i386/i386.h (TARGET_MWAIT): New.
>         (TARGET_MWAIT_P): Likewise.
>         * config/i386/i386.opt: Add -mmwait.
>         * config/i386/mwaitintrin.h: New file.
>         * config/i386/pmmintrin.h: Include <mwaitintrin.h>.
>         * config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with
>         TARGET_MWAIT.
>         (@sse3_monitor_<mode>): Likewise.
>         * config/i386/x86gprintrin.h: Include <mwaitintrin.h>.
>         * doc/extend.texi: Document mwait target attribute.
>         * doc/invoke.texi: Document -mmwait.
>
> gcc/testsuite/
>
>         * gcc.target/i386/monitor-2.c: New test.
>
> (cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922)
> ---
>  gcc/common/config/i386/i386-common.c      | 15 +++++++
>  gcc/config.gcc                            |  6 ++-
>  gcc/config/i386/i386-builtins.c           |  4 +-
>  gcc/config/i386/i386-options.c            |  7 +++
>  gcc/config/i386/i386.h                    |  2 +
>  gcc/config/i386/i386.opt                  |  4 ++
>  gcc/config/i386/mwaitintrin.h             | 52 +++++++++++++++++++++++
>  gcc/config/i386/pmmintrin.h               | 13 +-----
>  gcc/config/i386/sse.md                    |  4 +-
>  gcc/config/i386/x86gprintrin.h            |  2 +
>  gcc/doc/extend.texi                       |  5 +++
>  gcc/doc/invoke.texi                       |  8 +++-
>  gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++
>  13 files changed, 130 insertions(+), 19 deletions(-)
>  create mode 100644 gcc/config/i386/mwaitintrin.h
>  create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
>
> diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
> index 6a7b5c8312f..e156cc34584 100644
> --- a/gcc/common/config/i386/i386-common.c
> +++ b/gcc/common/config/i386/i386-common.c
> @@ -150,6 +150,7 @@ along with GCC; see the file COPYING3.  If not see
>  #define OPTION_MASK_ISA_F16C_SET \
>    (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET)
>  #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX
> +#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT
>  #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO
>  #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU
>  #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID
> @@ -245,6 +246,7 @@ along with GCC; see the file COPYING3.  If not see
>  #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
>  #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
>  #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX
> +#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT
>  #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO
>  #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU
>  #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID
> @@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts,
>         }
>        return true;
>
> +    case OPT_mmwait:
> +      if (value)
> +       {
> +         opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET;
> +         opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET;
> +       }
> +      else
> +       {
> +         opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET;
> +         opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET;
> +       }
> +      return true;
> +
>      case OPT_mclzero:
>        if (value)
>         {
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 357b0bed067..a020e0808c9 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -414,7 +414,8 @@ i[34567]86-*-*)
>                        avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
>                        tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
>                        amxbf16intrin.h x86gprintrin.h uintrintrin.h
> -                      hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> +                      hresetintrin.h keylockerintrin.h avxvnniintrin.h
> +                      mwaitintrin.h"
>         ;;
>  x86_64-*-*)
>         cpu_type=i386
> @@ -451,7 +452,8 @@ x86_64-*-*)
>                        avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
>                        tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
>                        amxbf16intrin.h x86gprintrin.h uintrintrin.h
> -                      hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> +                      hresetintrin.h keylockerintrin.h avxvnniintrin.h
> +                      mwaitintrin.h"
>         ;;
>  ia64-*-*)
>         extra_headers=ia64intrin.h
> diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
> index 4fcdf4b89ee..128bd39816c 100644
> --- a/gcc/config/i386/i386-builtins.c
> +++ b/gcc/config/i386/i386-builtins.c
> @@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void)
>                             VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE);
>
>    /* SSE3.  */
> -  def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor",
> +  def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor",
>                VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR);
> -  def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait",
> +  def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait",
>                VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT);
>
>    /* AES */
> diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> index 18d2c0b9f99..7ecd0cf8b8c 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] =
>    { "-mmovbe",         OPTION_MASK_ISA2_MOVBE },
>    { "-mclzero",                OPTION_MASK_ISA2_CLZERO },
>    { "-mmwaitx",                OPTION_MASK_ISA2_MWAITX },
> +  { "-mmwait",         OPTION_MASK_ISA2_MWAIT },
>    { "-mmovdir64b",     OPTION_MASK_ISA2_MOVDIR64B },
>    { "-mwaitpkg",       OPTION_MASK_ISA2_WAITPKG },
>    { "-mcldemote",      OPTION_MASK_ISA2_CLDEMOTE },
> @@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
>      IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase),
>      IX86_ATTR_ISA ("rdrnd",    OPT_mrdrnd),
>      IX86_ATTR_ISA ("mwaitx",   OPT_mmwaitx),
> +    IX86_ATTR_ISA ("mwait",    OPT_mmwait),
>      IX86_ATTR_ISA ("clzero",   OPT_mclzero),
>      IX86_ATTR_ISA ("pku",      OPT_mpku),
>      IX86_ATTR_ISA ("lwp",      OPT_mlwp),
> @@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p,
>        || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags))
>      ix86_prefetch_sse = true;
>
> +  /* Enable mwait/monitor instructions for -msse3.  */
> +  if (TARGET_SSE3_P (opts->x_ix86_isa_flags))
> +    opts->x_ix86_isa_flags2
> +      |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit;
> +
>    /* Enable popcnt instruction for -msse4.2 or -mabm.  */
>    if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)
>        || TARGET_ABM_P (opts->x_ix86_isa_flags))
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index 5583ec6881a..73e118900f7 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>  #define TARGET_CLWB_P(x)       TARGET_ISA_CLWB_P(x)
>  #define TARGET_MWAITX  TARGET_ISA2_MWAITX
>  #define TARGET_MWAITX_P(x)     TARGET_ISA2_MWAITX_P(x)
> +#define TARGET_MWAIT   TARGET_ISA2_MWAIT
> +#define TARGET_MWAIT_P(x)      TARGET_ISA2_MWAIT_P(x)
>  #define TARGET_PKU     TARGET_ISA_PKU
>  #define TARGET_PKU_P(x)        TARGET_ISA_PKU_P(x)
>  #define TARGET_SHSTK   TARGET_ISA_SHSTK
> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> index c781fdc8278..7b8547bb1c3 100644
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation.
>  mneeded
>  Target Var(ix86_needed) Save
>  Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property.
> +
> +mmwait
> +Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save
> +Support MWAIT and MONITOR built-in functions and code generation.
> diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h
> new file mode 100644
> index 00000000000..1ecbc4abb69
> --- /dev/null
> +++ b/gcc/config/i386/mwaitintrin.h
> @@ -0,0 +1,52 @@
> +/* Copyright (C) 2021 Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3, or (at your option)
> +   any later version.
> +
> +   GCC is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#ifndef _MWAITINTRIN_H_INCLUDED
> +#define _MWAITINTRIN_H_INCLUDED
> +
> +#ifndef __MWAIT__
> +#pragma GCC push_options
> +#pragma GCC target("mwait")
> +#define __DISABLE_MWAIT__
> +#endif /* __MWAIT__ */
> +
> +extern __inline void
> +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> +{
> +  __builtin_ia32_monitor (__P, __E, __H);
> +}
> +
> +extern __inline void
> +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_mwait (unsigned int __E, unsigned int __H)
> +{
> +  __builtin_ia32_mwait (__E, __H);
> +}
> +
> +#ifdef __DISABLE_MWAIT__
> +#undef __DISABLE_MWAIT__
> +#pragma GCC pop_options
> +#endif /* __DISABLE_MWAIT__ */
> +
> +#endif /* _MWAITINTRIN_H_INCLUDED */
> diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h
> index fa9c5bb8b9f..f8102d2be23 100644
> --- a/gcc/config/i386/pmmintrin.h
> +++ b/gcc/config/i386/pmmintrin.h
> @@ -29,6 +29,7 @@
>
>  /* We need definitions from the SSE2 and SSE header files*/
>  #include <emmintrin.h>
> +#include <mwaitintrin.h>
>
>  #ifndef __SSE3__
>  #pragma GCC push_options
> @@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P)
>    return (__m128i) __builtin_ia32_lddqu ((char const *)__P);
>  }
>
> -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> -{
> -  __builtin_ia32_monitor (__P, __E, __H);
> -}
> -
> -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_mm_mwait (unsigned int __E, unsigned int __H)
> -{
> -  __builtin_ia32_mwait (__E, __H);
> -}
> -
>  #ifdef __DISABLE_SSE3__
>  #undef __DISABLE_SSE3__
>  #pragma GCC pop_options
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 3f81abc7804..43afe3dabed 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait"
>    [(unspec_volatile [(match_operand:SI 0 "register_operand" "c")
>                      (match_operand:SI 1 "register_operand" "a")]
>                     UNSPECV_MWAIT)]
> -  "TARGET_SSE3"
> +  "TARGET_MWAIT"
>  ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used.
>  ;; Since 32bit register operands are implicitly zero extended to 64bit,
>  ;; we only need to set up 32bit registers.
> @@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>"
>                      (match_operand:SI 1 "register_operand" "c")
>                      (match_operand:SI 2 "register_operand" "d")]
>                     UNSPECV_MONITOR)]
> -  "TARGET_SSE3"
> +  "TARGET_MWAIT"
>  ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in
>  ;; RCX and RDX are used.  Since 32bit register operands are implicitly
>  ;; zero extended to 64bit, we only need to set up 32bit registers.
> diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
> index ceda501252c..7793032ba90 100644
> --- a/gcc/config/i386/x86gprintrin.h
> +++ b/gcc/config/i386/x86gprintrin.h
> @@ -56,6 +56,8 @@
>
>  #include <movdirintrin.h>
>
> +#include <mwaitintrin.h>
> +
>  #include <mwaitxintrin.h>
>
>  #include <pconfigintrin.h>
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 1bc66cce2b8..1acfaf1d345 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions.
>  @cindex @code{target("movdiri")} function attribute, x86
>  Enable/disable the generation of the MOVDIRI instructions.
>
> +@item mwait
> +@itemx no-mwait
> +@cindex @code{target("mwait")} function attribute, x86
> +Enable/disable the generation of the MWAIT and MONITOR instructions.
> +
>  @item mwaitx
>  @itemx no-mwaitx
>  @cindex @code{target("mwaitx")} function attribute, x86
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 7f13ffb79e1..3e1f0bc8fad 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options.
>  -mno-wide-multiply  -mrtd  -malign-double @gol
>  -mpreferred-stack-boundary=@var{num} @gol
>  -mincoming-stack-boundary=@var{num} @gol
> --mcld  -mcx16  -msahf  -mmovbe  -mcrc32 @gol
> +-mcld  -mcx16  -msahf  -mmovbe  -mcrc32 -mmwait @gol
>  -mrecip  -mrecip=@var{opt} @gol
>  -mvzeroupper  -mprefer-avx128  -mprefer-vector-width=@var{opt} @gol
>  -mmmx  -msse  -msse2  -msse3  -mssse3  -msse4.1  -msse4.2  -msse4  -mavx @gol
> @@ -31159,6 +31159,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi},
>  @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and
>  @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction.
>
> +@item -mmwait
> +@opindex mmwait
> +This option enables built-in functions @code{__builtin_ia32_monitor},
> +and @code{__builtin_ia32_mwait} to generate the @code{monitor} and
> +@code{mwait} machine instructions.
> +
>  @item -mrecip
>  @opindex mrecip
>  This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions
> diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c
> new file mode 100644
> index 00000000000..96eeec070f0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/monitor-2.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */
> +
> +/* Verify that they work in both 32bit and 64bit.  */
> +
> +#include <x86gprintrin.h>
> +
> +void
> +foo (char *p, int x, int y, int z)
> +{
> +   _mm_monitor (p, y, x);
> +   _mm_mwait (z, y);
> +}
> +
> +void
> +bar (char *p, long x, long y, long z)
> +{
> +   _mm_monitor (p, y, x);
> +   _mm_mwait (z, y);
> +}
> +
> +void
> +foo1 (char *p)
> +{
> +   _mm_monitor (p, 0, 0);
> +   _mm_mwait (0, 0);
> +}
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only
  2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu
                   ` (4 preceding siblings ...)
  2021-08-13 13:51 ` [PATCH 5/5] <x86gprintrin.h>: Add pragma GCC target("general-regs-only") H.J. Lu
@ 2021-08-16  6:11 ` Richard Biener
  2021-08-24 14:57   ` H.J. Lu
  5 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2021-08-16  6:11 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek

On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> <x86gprintrin.h> and target("general-regs-only") function attribute
> were added to GCC 11.  But their implementations are incomplete.  I'd
> like to backport the following patches to GCC 11 branch to finish them.

Fine with me if x86 maintainers do not disagree (also see one comment I have
on the -mwait adding patch).

> H.J. Lu (5):
>   x86: Add -mmwait for -mgeneral-regs-only
>   x86: Use crc32 target option for CRC32 intrinsics
>   x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
>   x86: Enable the GPR only instructions for -mgeneral-regs-only
>   <x86gprintrin.h>: Add pragma GCC target("general-regs-only")
>
>  gcc/common/config/i386/i386-common.c       |  45 ++-
>  gcc/config.gcc                             |   6 +-
>  gcc/config/i386/i386-builtin.def           |   8 +-
>  gcc/config/i386/i386-builtins.c            |   4 +-
>  gcc/config/i386/i386-c.c                   |   2 +
>  gcc/config/i386/i386-options.c             |  12 +
>  gcc/config/i386/i386.c                     |   6 +-
>  gcc/config/i386/i386.h                     |   2 +
>  gcc/config/i386/i386.md                    |   4 +-
>  gcc/config/i386/i386.opt                   |   4 +
>  gcc/config/i386/ia32intrin.h               |  42 ++-
>  gcc/config/i386/mwaitintrin.h              |  52 +++
>  gcc/config/i386/pmmintrin.h                |  13 +-
>  gcc/config/i386/serializeintrin.h          |   7 +-
>  gcc/config/i386/sse.md                     |   4 +-
>  gcc/config/i386/x86gprintrin.h             |  13 +
>  gcc/doc/extend.texi                        |   5 +
>  gcc/doc/invoke.texi                        |   8 +-
>  gcc/testsuite/gcc.target/i386/crc32-6.c    |  13 +
>  gcc/testsuite/gcc.target/i386/monitor-2.c  |  27 ++
>  gcc/testsuite/gcc.target/i386/pr101492-1.c |  10 +
>  gcc/testsuite/gcc.target/i386/pr101492-2.c |  10 +
>  gcc/testsuite/gcc.target/i386/pr101492-3.c |  10 +
>  gcc/testsuite/gcc.target/i386/pr101492-4.c |  12 +
>  gcc/testsuite/gcc.target/i386/pr99744-3.c  |  13 +
>  gcc/testsuite/gcc.target/i386/pr99744-4.c  | 357 +++++++++++++++++++++
>  gcc/testsuite/gcc.target/i386/pr99744-5.c  |  25 ++
>  gcc/testsuite/gcc.target/i386/pr99744-6.c  |  23 ++
>  gcc/testsuite/gcc.target/i386/pr99744-7.c  |  12 +
>  gcc/testsuite/gcc.target/i386/pr99744-8.c  |  13 +
>  30 files changed, 717 insertions(+), 45 deletions(-)
>  create mode 100644 gcc/config/i386/mwaitintrin.h
>  create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c
>
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
  2021-08-16  6:11   ` Richard Biener
@ 2021-08-16 12:25     ` H.J. Lu
  2021-08-16 12:28       ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2021-08-16 12:25 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek

On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
> > -mgeneral-regs-only and make -msse3 to imply -mmwait.
>
> Adding new options requires to bump the LTO streaming minor version
> (I know we forgot it once on the branch already when adding a new --param).
>
> Please take care of this when backporting.

It was updated today:

commit dce5367eecfb0729cad0325240d614721afb39e3
Author: Martin Liska <mliska@suse.cz>
Date:   Mon Aug 16 13:02:54 2021 +0200

    LTO: bump minor version

    Bump the LTO_minor_version due to changes in
52f0aa4dee8401ef3958dbf789780b0ee877beab

            PR c/100150

    gcc/ChangeLog:

            * lto-streamer.h (LTO_minor_version): Bump.

Do I need to do it again if I can check in my patches this week?

Thanks.

> Richard.
>
> > gcc/
> >
> >         * config.gcc: Install mwaitintrin.h for i[34567]86-*-* and
> >         x86_64-*-* targets.
> >         * common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET):
> >         New.
> >         (OPTION_MASK_ISA2_MWAIT_UNSET): Likewise.
> >         (ix86_handle_option): Handle -mmwait.
> >         * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins):
> >         Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on
> >         __builtin_ia32_monitor and __builtin_ia32_mwait.
> >         * config/i386/i386-options.c (isa2_opts): Add -mmwait.
> >         (ix86_valid_target_attribute_inner_p): Likewise.
> >         (ix86_option_override_internal): Enable mwait/monitor
> >         instructions for -msse3.
> >         * config/i386/i386.h (TARGET_MWAIT): New.
> >         (TARGET_MWAIT_P): Likewise.
> >         * config/i386/i386.opt: Add -mmwait.
> >         * config/i386/mwaitintrin.h: New file.
> >         * config/i386/pmmintrin.h: Include <mwaitintrin.h>.
> >         * config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with
> >         TARGET_MWAIT.
> >         (@sse3_monitor_<mode>): Likewise.
> >         * config/i386/x86gprintrin.h: Include <mwaitintrin.h>.
> >         * doc/extend.texi: Document mwait target attribute.
> >         * doc/invoke.texi: Document -mmwait.
> >
> > gcc/testsuite/
> >
> >         * gcc.target/i386/monitor-2.c: New test.
> >
> > (cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922)
> > ---
> >  gcc/common/config/i386/i386-common.c      | 15 +++++++
> >  gcc/config.gcc                            |  6 ++-
> >  gcc/config/i386/i386-builtins.c           |  4 +-
> >  gcc/config/i386/i386-options.c            |  7 +++
> >  gcc/config/i386/i386.h                    |  2 +
> >  gcc/config/i386/i386.opt                  |  4 ++
> >  gcc/config/i386/mwaitintrin.h             | 52 +++++++++++++++++++++++
> >  gcc/config/i386/pmmintrin.h               | 13 +-----
> >  gcc/config/i386/sse.md                    |  4 +-
> >  gcc/config/i386/x86gprintrin.h            |  2 +
> >  gcc/doc/extend.texi                       |  5 +++
> >  gcc/doc/invoke.texi                       |  8 +++-
> >  gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++
> >  13 files changed, 130 insertions(+), 19 deletions(-)
> >  create mode 100644 gcc/config/i386/mwaitintrin.h
> >  create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
> >
> > diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
> > index 6a7b5c8312f..e156cc34584 100644
> > --- a/gcc/common/config/i386/i386-common.c
> > +++ b/gcc/common/config/i386/i386-common.c
> > @@ -150,6 +150,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #define OPTION_MASK_ISA_F16C_SET \
> >    (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET)
> >  #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX
> > +#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT
> >  #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO
> >  #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU
> >  #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID
> > @@ -245,6 +246,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
> >  #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
> >  #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX
> > +#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT
> >  #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO
> >  #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU
> >  #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID
> > @@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts,
> >         }
> >        return true;
> >
> > +    case OPT_mmwait:
> > +      if (value)
> > +       {
> > +         opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET;
> > +         opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET;
> > +       }
> > +      else
> > +       {
> > +         opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET;
> > +         opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET;
> > +       }
> > +      return true;
> > +
> >      case OPT_mclzero:
> >        if (value)
> >         {
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index 357b0bed067..a020e0808c9 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> > @@ -414,7 +414,8 @@ i[34567]86-*-*)
> >                        avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
> >                        tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
> >                        amxbf16intrin.h x86gprintrin.h uintrintrin.h
> > -                      hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> > +                      hresetintrin.h keylockerintrin.h avxvnniintrin.h
> > +                      mwaitintrin.h"
> >         ;;
> >  x86_64-*-*)
> >         cpu_type=i386
> > @@ -451,7 +452,8 @@ x86_64-*-*)
> >                        avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
> >                        tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
> >                        amxbf16intrin.h x86gprintrin.h uintrintrin.h
> > -                      hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> > +                      hresetintrin.h keylockerintrin.h avxvnniintrin.h
> > +                      mwaitintrin.h"
> >         ;;
> >  ia64-*-*)
> >         extra_headers=ia64intrin.h
> > diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
> > index 4fcdf4b89ee..128bd39816c 100644
> > --- a/gcc/config/i386/i386-builtins.c
> > +++ b/gcc/config/i386/i386-builtins.c
> > @@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void)
> >                             VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE);
> >
> >    /* SSE3.  */
> > -  def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor",
> > +  def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor",
> >                VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR);
> > -  def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait",
> > +  def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait",
> >                VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT);
> >
> >    /* AES */
> > diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> > index 18d2c0b9f99..7ecd0cf8b8c 100644
> > --- a/gcc/config/i386/i386-options.c
> > +++ b/gcc/config/i386/i386-options.c
> > @@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] =
> >    { "-mmovbe",         OPTION_MASK_ISA2_MOVBE },
> >    { "-mclzero",                OPTION_MASK_ISA2_CLZERO },
> >    { "-mmwaitx",                OPTION_MASK_ISA2_MWAITX },
> > +  { "-mmwait",         OPTION_MASK_ISA2_MWAIT },
> >    { "-mmovdir64b",     OPTION_MASK_ISA2_MOVDIR64B },
> >    { "-mwaitpkg",       OPTION_MASK_ISA2_WAITPKG },
> >    { "-mcldemote",      OPTION_MASK_ISA2_CLDEMOTE },
> > @@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
> >      IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase),
> >      IX86_ATTR_ISA ("rdrnd",    OPT_mrdrnd),
> >      IX86_ATTR_ISA ("mwaitx",   OPT_mmwaitx),
> > +    IX86_ATTR_ISA ("mwait",    OPT_mmwait),
> >      IX86_ATTR_ISA ("clzero",   OPT_mclzero),
> >      IX86_ATTR_ISA ("pku",      OPT_mpku),
> >      IX86_ATTR_ISA ("lwp",      OPT_mlwp),
> > @@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p,
> >        || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags))
> >      ix86_prefetch_sse = true;
> >
> > +  /* Enable mwait/monitor instructions for -msse3.  */
> > +  if (TARGET_SSE3_P (opts->x_ix86_isa_flags))
> > +    opts->x_ix86_isa_flags2
> > +      |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit;
> > +
> >    /* Enable popcnt instruction for -msse4.2 or -mabm.  */
> >    if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)
> >        || TARGET_ABM_P (opts->x_ix86_isa_flags))
> > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > index 5583ec6881a..73e118900f7 100644
> > --- a/gcc/config/i386/i386.h
> > +++ b/gcc/config/i386/i386.h
> > @@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> >  #define TARGET_CLWB_P(x)       TARGET_ISA_CLWB_P(x)
> >  #define TARGET_MWAITX  TARGET_ISA2_MWAITX
> >  #define TARGET_MWAITX_P(x)     TARGET_ISA2_MWAITX_P(x)
> > +#define TARGET_MWAIT   TARGET_ISA2_MWAIT
> > +#define TARGET_MWAIT_P(x)      TARGET_ISA2_MWAIT_P(x)
> >  #define TARGET_PKU     TARGET_ISA_PKU
> >  #define TARGET_PKU_P(x)        TARGET_ISA_PKU_P(x)
> >  #define TARGET_SHSTK   TARGET_ISA_SHSTK
> > diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> > index c781fdc8278..7b8547bb1c3 100644
> > --- a/gcc/config/i386/i386.opt
> > +++ b/gcc/config/i386/i386.opt
> > @@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation.
> >  mneeded
> >  Target Var(ix86_needed) Save
> >  Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property.
> > +
> > +mmwait
> > +Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save
> > +Support MWAIT and MONITOR built-in functions and code generation.
> > diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h
> > new file mode 100644
> > index 00000000000..1ecbc4abb69
> > --- /dev/null
> > +++ b/gcc/config/i386/mwaitintrin.h
> > @@ -0,0 +1,52 @@
> > +/* Copyright (C) 2021 Free Software Foundation, Inc.
> > +
> > +   This file is part of GCC.
> > +
> > +   GCC is free software; you can redistribute it and/or modify
> > +   it under the terms of the GNU General Public License as published by
> > +   the Free Software Foundation; either version 3, or (at your option)
> > +   any later version.
> > +
> > +   GCC is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +   GNU General Public License for more details.
> > +
> > +   Under Section 7 of GPL version 3, you are granted additional
> > +   permissions described in the GCC Runtime Library Exception, version
> > +   3.1, as published by the Free Software Foundation.
> > +
> > +   You should have received a copy of the GNU General Public License and
> > +   a copy of the GCC Runtime Library Exception along with this program;
> > +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> > +   <http://www.gnu.org/licenses/>.  */
> > +
> > +#ifndef _MWAITINTRIN_H_INCLUDED
> > +#define _MWAITINTRIN_H_INCLUDED
> > +
> > +#ifndef __MWAIT__
> > +#pragma GCC push_options
> > +#pragma GCC target("mwait")
> > +#define __DISABLE_MWAIT__
> > +#endif /* __MWAIT__ */
> > +
> > +extern __inline void
> > +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > +_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> > +{
> > +  __builtin_ia32_monitor (__P, __E, __H);
> > +}
> > +
> > +extern __inline void
> > +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > +_mm_mwait (unsigned int __E, unsigned int __H)
> > +{
> > +  __builtin_ia32_mwait (__E, __H);
> > +}
> > +
> > +#ifdef __DISABLE_MWAIT__
> > +#undef __DISABLE_MWAIT__
> > +#pragma GCC pop_options
> > +#endif /* __DISABLE_MWAIT__ */
> > +
> > +#endif /* _MWAITINTRIN_H_INCLUDED */
> > diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h
> > index fa9c5bb8b9f..f8102d2be23 100644
> > --- a/gcc/config/i386/pmmintrin.h
> > +++ b/gcc/config/i386/pmmintrin.h
> > @@ -29,6 +29,7 @@
> >
> >  /* We need definitions from the SSE2 and SSE header files*/
> >  #include <emmintrin.h>
> > +#include <mwaitintrin.h>
> >
> >  #ifndef __SSE3__
> >  #pragma GCC push_options
> > @@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P)
> >    return (__m128i) __builtin_ia32_lddqu ((char const *)__P);
> >  }
> >
> > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > -_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> > -{
> > -  __builtin_ia32_monitor (__P, __E, __H);
> > -}
> > -
> > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > -_mm_mwait (unsigned int __E, unsigned int __H)
> > -{
> > -  __builtin_ia32_mwait (__E, __H);
> > -}
> > -
> >  #ifdef __DISABLE_SSE3__
> >  #undef __DISABLE_SSE3__
> >  #pragma GCC pop_options
> > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > index 3f81abc7804..43afe3dabed 100644
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait"
> >    [(unspec_volatile [(match_operand:SI 0 "register_operand" "c")
> >                      (match_operand:SI 1 "register_operand" "a")]
> >                     UNSPECV_MWAIT)]
> > -  "TARGET_SSE3"
> > +  "TARGET_MWAIT"
> >  ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used.
> >  ;; Since 32bit register operands are implicitly zero extended to 64bit,
> >  ;; we only need to set up 32bit registers.
> > @@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>"
> >                      (match_operand:SI 1 "register_operand" "c")
> >                      (match_operand:SI 2 "register_operand" "d")]
> >                     UNSPECV_MONITOR)]
> > -  "TARGET_SSE3"
> > +  "TARGET_MWAIT"
> >  ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in
> >  ;; RCX and RDX are used.  Since 32bit register operands are implicitly
> >  ;; zero extended to 64bit, we only need to set up 32bit registers.
> > diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
> > index ceda501252c..7793032ba90 100644
> > --- a/gcc/config/i386/x86gprintrin.h
> > +++ b/gcc/config/i386/x86gprintrin.h
> > @@ -56,6 +56,8 @@
> >
> >  #include <movdirintrin.h>
> >
> > +#include <mwaitintrin.h>
> > +
> >  #include <mwaitxintrin.h>
> >
> >  #include <pconfigintrin.h>
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > index 1bc66cce2b8..1acfaf1d345 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions.
> >  @cindex @code{target("movdiri")} function attribute, x86
> >  Enable/disable the generation of the MOVDIRI instructions.
> >
> > +@item mwait
> > +@itemx no-mwait
> > +@cindex @code{target("mwait")} function attribute, x86
> > +Enable/disable the generation of the MWAIT and MONITOR instructions.
> > +
> >  @item mwaitx
> >  @itemx no-mwaitx
> >  @cindex @code{target("mwaitx")} function attribute, x86
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index 7f13ffb79e1..3e1f0bc8fad 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options.
> >  -mno-wide-multiply  -mrtd  -malign-double @gol
> >  -mpreferred-stack-boundary=@var{num} @gol
> >  -mincoming-stack-boundary=@var{num} @gol
> > --mcld  -mcx16  -msahf  -mmovbe  -mcrc32 @gol
> > +-mcld  -mcx16  -msahf  -mmovbe  -mcrc32 -mmwait @gol
> >  -mrecip  -mrecip=@var{opt} @gol
> >  -mvzeroupper  -mprefer-avx128  -mprefer-vector-width=@var{opt} @gol
> >  -mmmx  -msse  -msse2  -msse3  -mssse3  -msse4.1  -msse4.2  -msse4  -mavx @gol
> > @@ -31159,6 +31159,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi},
> >  @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and
> >  @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction.
> >
> > +@item -mmwait
> > +@opindex mmwait
> > +This option enables built-in functions @code{__builtin_ia32_monitor},
> > +and @code{__builtin_ia32_mwait} to generate the @code{monitor} and
> > +@code{mwait} machine instructions.
> > +
> >  @item -mrecip
> >  @opindex mrecip
> >  This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions
> > diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c
> > new file mode 100644
> > index 00000000000..96eeec070f0
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/monitor-2.c
> > @@ -0,0 +1,27 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */
> > +
> > +/* Verify that they work in both 32bit and 64bit.  */
> > +
> > +#include <x86gprintrin.h>
> > +
> > +void
> > +foo (char *p, int x, int y, int z)
> > +{
> > +   _mm_monitor (p, y, x);
> > +   _mm_mwait (z, y);
> > +}
> > +
> > +void
> > +bar (char *p, long x, long y, long z)
> > +{
> > +   _mm_monitor (p, y, x);
> > +   _mm_mwait (z, y);
> > +}
> > +
> > +void
> > +foo1 (char *p)
> > +{
> > +   _mm_monitor (p, 0, 0);
> > +   _mm_mwait (0, 0);
> > +}
> > --
> > 2.31.1
> >



-- 
H.J.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
  2021-08-16 12:25     ` H.J. Lu
@ 2021-08-16 12:28       ` Richard Biener
  2021-08-16 12:35         ` H.J. Lu
  2021-08-16 12:37         ` Martin Liška
  0 siblings, 2 replies; 16+ messages in thread
From: Richard Biener @ 2021-08-16 12:28 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek

On Mon, Aug 16, 2021 at 2:25 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
> > > -mgeneral-regs-only and make -msse3 to imply -mmwait.
> >
> > Adding new options requires to bump the LTO streaming minor version
> > (I know we forgot it once on the branch already when adding a new --param).
> >
> > Please take care of this when backporting.
>
> It was updated today:
>
> commit dce5367eecfb0729cad0325240d614721afb39e3
> Author: Martin Liska <mliska@suse.cz>
> Date:   Mon Aug 16 13:02:54 2021 +0200
>
>     LTO: bump minor version
>
>     Bump the LTO_minor_version due to changes in
> 52f0aa4dee8401ef3958dbf789780b0ee877beab
>
>             PR c/100150
>
>     gcc/ChangeLog:
>
>             * lto-streamer.h (LTO_minor_version): Bump.
>
> Do I need to do it again if I can check in my patches this week?

Yes please, and do it with the same commit doing the .opt change.

Richard.

> Thanks.
>
> > Richard.
> >
> > > gcc/
> > >
> > >         * config.gcc: Install mwaitintrin.h for i[34567]86-*-* and
> > >         x86_64-*-* targets.
> > >         * common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET):
> > >         New.
> > >         (OPTION_MASK_ISA2_MWAIT_UNSET): Likewise.
> > >         (ix86_handle_option): Handle -mmwait.
> > >         * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins):
> > >         Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on
> > >         __builtin_ia32_monitor and __builtin_ia32_mwait.
> > >         * config/i386/i386-options.c (isa2_opts): Add -mmwait.
> > >         (ix86_valid_target_attribute_inner_p): Likewise.
> > >         (ix86_option_override_internal): Enable mwait/monitor
> > >         instructions for -msse3.
> > >         * config/i386/i386.h (TARGET_MWAIT): New.
> > >         (TARGET_MWAIT_P): Likewise.
> > >         * config/i386/i386.opt: Add -mmwait.
> > >         * config/i386/mwaitintrin.h: New file.
> > >         * config/i386/pmmintrin.h: Include <mwaitintrin.h>.
> > >         * config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with
> > >         TARGET_MWAIT.
> > >         (@sse3_monitor_<mode>): Likewise.
> > >         * config/i386/x86gprintrin.h: Include <mwaitintrin.h>.
> > >         * doc/extend.texi: Document mwait target attribute.
> > >         * doc/invoke.texi: Document -mmwait.
> > >
> > > gcc/testsuite/
> > >
> > >         * gcc.target/i386/monitor-2.c: New test.
> > >
> > > (cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922)
> > > ---
> > >  gcc/common/config/i386/i386-common.c      | 15 +++++++
> > >  gcc/config.gcc                            |  6 ++-
> > >  gcc/config/i386/i386-builtins.c           |  4 +-
> > >  gcc/config/i386/i386-options.c            |  7 +++
> > >  gcc/config/i386/i386.h                    |  2 +
> > >  gcc/config/i386/i386.opt                  |  4 ++
> > >  gcc/config/i386/mwaitintrin.h             | 52 +++++++++++++++++++++++
> > >  gcc/config/i386/pmmintrin.h               | 13 +-----
> > >  gcc/config/i386/sse.md                    |  4 +-
> > >  gcc/config/i386/x86gprintrin.h            |  2 +
> > >  gcc/doc/extend.texi                       |  5 +++
> > >  gcc/doc/invoke.texi                       |  8 +++-
> > >  gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++
> > >  13 files changed, 130 insertions(+), 19 deletions(-)
> > >  create mode 100644 gcc/config/i386/mwaitintrin.h
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
> > >
> > > diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
> > > index 6a7b5c8312f..e156cc34584 100644
> > > --- a/gcc/common/config/i386/i386-common.c
> > > +++ b/gcc/common/config/i386/i386-common.c
> > > @@ -150,6 +150,7 @@ along with GCC; see the file COPYING3.  If not see
> > >  #define OPTION_MASK_ISA_F16C_SET \
> > >    (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET)
> > >  #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX
> > > +#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT
> > >  #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO
> > >  #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU
> > >  #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID
> > > @@ -245,6 +246,7 @@ along with GCC; see the file COPYING3.  If not see
> > >  #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
> > >  #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
> > >  #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX
> > > +#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT
> > >  #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO
> > >  #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU
> > >  #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID
> > > @@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts,
> > >         }
> > >        return true;
> > >
> > > +    case OPT_mmwait:
> > > +      if (value)
> > > +       {
> > > +         opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET;
> > > +         opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET;
> > > +       }
> > > +      else
> > > +       {
> > > +         opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET;
> > > +         opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET;
> > > +       }
> > > +      return true;
> > > +
> > >      case OPT_mclzero:
> > >        if (value)
> > >         {
> > > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > > index 357b0bed067..a020e0808c9 100644
> > > --- a/gcc/config.gcc
> > > +++ b/gcc/config.gcc
> > > @@ -414,7 +414,8 @@ i[34567]86-*-*)
> > >                        avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
> > >                        tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
> > >                        amxbf16intrin.h x86gprintrin.h uintrintrin.h
> > > -                      hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> > > +                      hresetintrin.h keylockerintrin.h avxvnniintrin.h
> > > +                      mwaitintrin.h"
> > >         ;;
> > >  x86_64-*-*)
> > >         cpu_type=i386
> > > @@ -451,7 +452,8 @@ x86_64-*-*)
> > >                        avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
> > >                        tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
> > >                        amxbf16intrin.h x86gprintrin.h uintrintrin.h
> > > -                      hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> > > +                      hresetintrin.h keylockerintrin.h avxvnniintrin.h
> > > +                      mwaitintrin.h"
> > >         ;;
> > >  ia64-*-*)
> > >         extra_headers=ia64intrin.h
> > > diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
> > > index 4fcdf4b89ee..128bd39816c 100644
> > > --- a/gcc/config/i386/i386-builtins.c
> > > +++ b/gcc/config/i386/i386-builtins.c
> > > @@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void)
> > >                             VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE);
> > >
> > >    /* SSE3.  */
> > > -  def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor",
> > > +  def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor",
> > >                VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR);
> > > -  def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait",
> > > +  def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait",
> > >                VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT);
> > >
> > >    /* AES */
> > > diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> > > index 18d2c0b9f99..7ecd0cf8b8c 100644
> > > --- a/gcc/config/i386/i386-options.c
> > > +++ b/gcc/config/i386/i386-options.c
> > > @@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] =
> > >    { "-mmovbe",         OPTION_MASK_ISA2_MOVBE },
> > >    { "-mclzero",                OPTION_MASK_ISA2_CLZERO },
> > >    { "-mmwaitx",                OPTION_MASK_ISA2_MWAITX },
> > > +  { "-mmwait",         OPTION_MASK_ISA2_MWAIT },
> > >    { "-mmovdir64b",     OPTION_MASK_ISA2_MOVDIR64B },
> > >    { "-mwaitpkg",       OPTION_MASK_ISA2_WAITPKG },
> > >    { "-mcldemote",      OPTION_MASK_ISA2_CLDEMOTE },
> > > @@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
> > >      IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase),
> > >      IX86_ATTR_ISA ("rdrnd",    OPT_mrdrnd),
> > >      IX86_ATTR_ISA ("mwaitx",   OPT_mmwaitx),
> > > +    IX86_ATTR_ISA ("mwait",    OPT_mmwait),
> > >      IX86_ATTR_ISA ("clzero",   OPT_mclzero),
> > >      IX86_ATTR_ISA ("pku",      OPT_mpku),
> > >      IX86_ATTR_ISA ("lwp",      OPT_mlwp),
> > > @@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p,
> > >        || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags))
> > >      ix86_prefetch_sse = true;
> > >
> > > +  /* Enable mwait/monitor instructions for -msse3.  */
> > > +  if (TARGET_SSE3_P (opts->x_ix86_isa_flags))
> > > +    opts->x_ix86_isa_flags2
> > > +      |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit;
> > > +
> > >    /* Enable popcnt instruction for -msse4.2 or -mabm.  */
> > >    if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)
> > >        || TARGET_ABM_P (opts->x_ix86_isa_flags))
> > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > > index 5583ec6881a..73e118900f7 100644
> > > --- a/gcc/config/i386/i386.h
> > > +++ b/gcc/config/i386/i386.h
> > > @@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> > >  #define TARGET_CLWB_P(x)       TARGET_ISA_CLWB_P(x)
> > >  #define TARGET_MWAITX  TARGET_ISA2_MWAITX
> > >  #define TARGET_MWAITX_P(x)     TARGET_ISA2_MWAITX_P(x)
> > > +#define TARGET_MWAIT   TARGET_ISA2_MWAIT
> > > +#define TARGET_MWAIT_P(x)      TARGET_ISA2_MWAIT_P(x)
> > >  #define TARGET_PKU     TARGET_ISA_PKU
> > >  #define TARGET_PKU_P(x)        TARGET_ISA_PKU_P(x)
> > >  #define TARGET_SHSTK   TARGET_ISA_SHSTK
> > > diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> > > index c781fdc8278..7b8547bb1c3 100644
> > > --- a/gcc/config/i386/i386.opt
> > > +++ b/gcc/config/i386/i386.opt
> > > @@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation.
> > >  mneeded
> > >  Target Var(ix86_needed) Save
> > >  Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property.
> > > +
> > > +mmwait
> > > +Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save
> > > +Support MWAIT and MONITOR built-in functions and code generation.
> > > diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h
> > > new file mode 100644
> > > index 00000000000..1ecbc4abb69
> > > --- /dev/null
> > > +++ b/gcc/config/i386/mwaitintrin.h
> > > @@ -0,0 +1,52 @@
> > > +/* Copyright (C) 2021 Free Software Foundation, Inc.
> > > +
> > > +   This file is part of GCC.
> > > +
> > > +   GCC is free software; you can redistribute it and/or modify
> > > +   it under the terms of the GNU General Public License as published by
> > > +   the Free Software Foundation; either version 3, or (at your option)
> > > +   any later version.
> > > +
> > > +   GCC is distributed in the hope that it will be useful,
> > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > +   GNU General Public License for more details.
> > > +
> > > +   Under Section 7 of GPL version 3, you are granted additional
> > > +   permissions described in the GCC Runtime Library Exception, version
> > > +   3.1, as published by the Free Software Foundation.
> > > +
> > > +   You should have received a copy of the GNU General Public License and
> > > +   a copy of the GCC Runtime Library Exception along with this program;
> > > +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> > > +   <http://www.gnu.org/licenses/>.  */
> > > +
> > > +#ifndef _MWAITINTRIN_H_INCLUDED
> > > +#define _MWAITINTRIN_H_INCLUDED
> > > +
> > > +#ifndef __MWAIT__
> > > +#pragma GCC push_options
> > > +#pragma GCC target("mwait")
> > > +#define __DISABLE_MWAIT__
> > > +#endif /* __MWAIT__ */
> > > +
> > > +extern __inline void
> > > +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > > +_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> > > +{
> > > +  __builtin_ia32_monitor (__P, __E, __H);
> > > +}
> > > +
> > > +extern __inline void
> > > +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > > +_mm_mwait (unsigned int __E, unsigned int __H)
> > > +{
> > > +  __builtin_ia32_mwait (__E, __H);
> > > +}
> > > +
> > > +#ifdef __DISABLE_MWAIT__
> > > +#undef __DISABLE_MWAIT__
> > > +#pragma GCC pop_options
> > > +#endif /* __DISABLE_MWAIT__ */
> > > +
> > > +#endif /* _MWAITINTRIN_H_INCLUDED */
> > > diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h
> > > index fa9c5bb8b9f..f8102d2be23 100644
> > > --- a/gcc/config/i386/pmmintrin.h
> > > +++ b/gcc/config/i386/pmmintrin.h
> > > @@ -29,6 +29,7 @@
> > >
> > >  /* We need definitions from the SSE2 and SSE header files*/
> > >  #include <emmintrin.h>
> > > +#include <mwaitintrin.h>
> > >
> > >  #ifndef __SSE3__
> > >  #pragma GCC push_options
> > > @@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P)
> > >    return (__m128i) __builtin_ia32_lddqu ((char const *)__P);
> > >  }
> > >
> > > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > > -_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> > > -{
> > > -  __builtin_ia32_monitor (__P, __E, __H);
> > > -}
> > > -
> > > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > > -_mm_mwait (unsigned int __E, unsigned int __H)
> > > -{
> > > -  __builtin_ia32_mwait (__E, __H);
> > > -}
> > > -
> > >  #ifdef __DISABLE_SSE3__
> > >  #undef __DISABLE_SSE3__
> > >  #pragma GCC pop_options
> > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > > index 3f81abc7804..43afe3dabed 100644
> > > --- a/gcc/config/i386/sse.md
> > > +++ b/gcc/config/i386/sse.md
> > > @@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait"
> > >    [(unspec_volatile [(match_operand:SI 0 "register_operand" "c")
> > >                      (match_operand:SI 1 "register_operand" "a")]
> > >                     UNSPECV_MWAIT)]
> > > -  "TARGET_SSE3"
> > > +  "TARGET_MWAIT"
> > >  ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used.
> > >  ;; Since 32bit register operands are implicitly zero extended to 64bit,
> > >  ;; we only need to set up 32bit registers.
> > > @@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>"
> > >                      (match_operand:SI 1 "register_operand" "c")
> > >                      (match_operand:SI 2 "register_operand" "d")]
> > >                     UNSPECV_MONITOR)]
> > > -  "TARGET_SSE3"
> > > +  "TARGET_MWAIT"
> > >  ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in
> > >  ;; RCX and RDX are used.  Since 32bit register operands are implicitly
> > >  ;; zero extended to 64bit, we only need to set up 32bit registers.
> > > diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
> > > index ceda501252c..7793032ba90 100644
> > > --- a/gcc/config/i386/x86gprintrin.h
> > > +++ b/gcc/config/i386/x86gprintrin.h
> > > @@ -56,6 +56,8 @@
> > >
> > >  #include <movdirintrin.h>
> > >
> > > +#include <mwaitintrin.h>
> > > +
> > >  #include <mwaitxintrin.h>
> > >
> > >  #include <pconfigintrin.h>
> > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > > index 1bc66cce2b8..1acfaf1d345 100644
> > > --- a/gcc/doc/extend.texi
> > > +++ b/gcc/doc/extend.texi
> > > @@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions.
> > >  @cindex @code{target("movdiri")} function attribute, x86
> > >  Enable/disable the generation of the MOVDIRI instructions.
> > >
> > > +@item mwait
> > > +@itemx no-mwait
> > > +@cindex @code{target("mwait")} function attribute, x86
> > > +Enable/disable the generation of the MWAIT and MONITOR instructions.
> > > +
> > >  @item mwaitx
> > >  @itemx no-mwaitx
> > >  @cindex @code{target("mwaitx")} function attribute, x86
> > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > > index 7f13ffb79e1..3e1f0bc8fad 100644
> > > --- a/gcc/doc/invoke.texi
> > > +++ b/gcc/doc/invoke.texi
> > > @@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options.
> > >  -mno-wide-multiply  -mrtd  -malign-double @gol
> > >  -mpreferred-stack-boundary=@var{num} @gol
> > >  -mincoming-stack-boundary=@var{num} @gol
> > > --mcld  -mcx16  -msahf  -mmovbe  -mcrc32 @gol
> > > +-mcld  -mcx16  -msahf  -mmovbe  -mcrc32 -mmwait @gol
> > >  -mrecip  -mrecip=@var{opt} @gol
> > >  -mvzeroupper  -mprefer-avx128  -mprefer-vector-width=@var{opt} @gol
> > >  -mmmx  -msse  -msse2  -msse3  -mssse3  -msse4.1  -msse4.2  -msse4  -mavx @gol
> > > @@ -31159,6 +31159,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi},
> > >  @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and
> > >  @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction.
> > >
> > > +@item -mmwait
> > > +@opindex mmwait
> > > +This option enables built-in functions @code{__builtin_ia32_monitor},
> > > +and @code{__builtin_ia32_mwait} to generate the @code{monitor} and
> > > +@code{mwait} machine instructions.
> > > +
> > >  @item -mrecip
> > >  @opindex mrecip
> > >  This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions
> > > diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c
> > > new file mode 100644
> > > index 00000000000..96eeec070f0
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/monitor-2.c
> > > @@ -0,0 +1,27 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */
> > > +
> > > +/* Verify that they work in both 32bit and 64bit.  */
> > > +
> > > +#include <x86gprintrin.h>
> > > +
> > > +void
> > > +foo (char *p, int x, int y, int z)
> > > +{
> > > +   _mm_monitor (p, y, x);
> > > +   _mm_mwait (z, y);
> > > +}
> > > +
> > > +void
> > > +bar (char *p, long x, long y, long z)
> > > +{
> > > +   _mm_monitor (p, y, x);
> > > +   _mm_mwait (z, y);
> > > +}
> > > +
> > > +void
> > > +foo1 (char *p)
> > > +{
> > > +   _mm_monitor (p, 0, 0);
> > > +   _mm_mwait (0, 0);
> > > +}
> > > --
> > > 2.31.1
> > >
>
>
>
> --
> H.J.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
  2021-08-16 12:28       ` Richard Biener
@ 2021-08-16 12:35         ` H.J. Lu
  2021-08-16 12:37         ` Martin Liška
  1 sibling, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2021-08-16 12:35 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 1350 bytes --]

On Mon, Aug 16, 2021 at 5:28 AM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Mon, Aug 16, 2021 at 2:25 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> > >
> > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > > Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
> > > > -mgeneral-regs-only and make -msse3 to imply -mmwait.
> > >
> > > Adding new options requires to bump the LTO streaming minor version
> > > (I know we forgot it once on the branch already when adding a new --param).
> > >
> > > Please take care of this when backporting.
> >
> > It was updated today:
> >
> > commit dce5367eecfb0729cad0325240d614721afb39e3
> > Author: Martin Liska <mliska@suse.cz>
> > Date:   Mon Aug 16 13:02:54 2021 +0200
> >
> >     LTO: bump minor version
> >
> >     Bump the LTO_minor_version due to changes in
> > 52f0aa4dee8401ef3958dbf789780b0ee877beab
> >
> >             PR c/100150
> >
> >     gcc/ChangeLog:
> >
> >             * lto-streamer.h (LTO_minor_version): Bump.
> >
> > Do I need to do it again if I can check in my patches this week?
>
> Yes please, and do it with the same commit doing the .opt change.
>

Here is the updated patch with LTO_minor_version bump.

Thanks.

-- 
H.J.

[-- Attachment #2: 0001-x86-Add-mmwait-for-mgeneral-regs-only.patch --]
[-- Type: text/x-patch, Size: 15414 bytes --]

From 8f3e275ef061cd5f8353c71cb99f05dd944575f9 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Thu, 15 Apr 2021 11:19:32 -0700
Subject: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only

Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
-mgeneral-regs-only and make -msse3 to imply -mmwait.

gcc/

	* config.gcc: Install mwaitintrin.h for i[34567]86-*-* and
	x86_64-*-* targets.
	* lto-streamer.h (LTO_minor_version): Bump.
	* common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET):
	New.
	(OPTION_MASK_ISA2_MWAIT_UNSET): Likewise.
	(ix86_handle_option): Handle -mmwait.
	* config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins):
	Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on
	__builtin_ia32_monitor and __builtin_ia32_mwait.
	* config/i386/i386-options.c (isa2_opts): Add -mmwait.
	(ix86_valid_target_attribute_inner_p): Likewise.
	(ix86_option_override_internal): Enable mwait/monitor
	instructions for -msse3.
	* config/i386/i386.h (TARGET_MWAIT): New.
	(TARGET_MWAIT_P): Likewise.
	* config/i386/i386.opt: Add -mmwait.
	* config/i386/mwaitintrin.h: New file.
	* config/i386/pmmintrin.h: Include <mwaitintrin.h>.
	* config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with
	TARGET_MWAIT.
	(@sse3_monitor_<mode>): Likewise.
	* config/i386/x86gprintrin.h: Include <mwaitintrin.h>.
	* doc/extend.texi: Document mwait target attribute.
	* doc/invoke.texi: Document -mmwait.

gcc/testsuite/

	* gcc.target/i386/monitor-2.c: New test.

(cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922)
---
 gcc/common/config/i386/i386-common.c      | 15 +++++++
 gcc/config.gcc                            |  6 ++-
 gcc/config/i386/i386-builtins.c           |  4 +-
 gcc/config/i386/i386-options.c            |  7 +++
 gcc/config/i386/i386.h                    |  2 +
 gcc/config/i386/i386.opt                  |  4 ++
 gcc/config/i386/mwaitintrin.h             | 52 +++++++++++++++++++++++
 gcc/config/i386/pmmintrin.h               | 13 +-----
 gcc/config/i386/sse.md                    |  4 +-
 gcc/config/i386/x86gprintrin.h            |  2 +
 gcc/doc/extend.texi                       |  5 +++
 gcc/doc/invoke.texi                       |  8 +++-
 gcc/lto-streamer.h                        |  2 +-
 gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++
 14 files changed, 131 insertions(+), 20 deletions(-)
 create mode 100644 gcc/config/i386/mwaitintrin.h
 create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 6a7b5c8312f..e156cc34584 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -150,6 +150,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_F16C_SET \
   (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET)
 #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX
+#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT
 #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO
 #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU
 #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID
@@ -245,6 +246,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
 #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
 #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX
+#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT
 #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO
 #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU
 #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID
@@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts,
 	}
       return true;
 
+    case OPT_mmwait:
+      if (value)
+	{
+	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET;
+	}
+      else
+	{
+	  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET;
+	}
+      return true;
+
     case OPT_mclzero:
       if (value)
 	{
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 357b0bed067..a020e0808c9 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -414,7 +414,8 @@ i[34567]86-*-*)
 		       avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
 		       tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
 		       amxbf16intrin.h x86gprintrin.h uintrintrin.h
-		       hresetintrin.h keylockerintrin.h avxvnniintrin.h"
+		       hresetintrin.h keylockerintrin.h avxvnniintrin.h
+		       mwaitintrin.h"
 	;;
 x86_64-*-*)
 	cpu_type=i386
@@ -451,7 +452,8 @@ x86_64-*-*)
 		       avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
 		       tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
 		       amxbf16intrin.h x86gprintrin.h uintrintrin.h
-		       hresetintrin.h keylockerintrin.h avxvnniintrin.h"
+		       hresetintrin.h keylockerintrin.h avxvnniintrin.h
+		       mwaitintrin.h"
 	;;
 ia64-*-*)
 	extra_headers=ia64intrin.h
diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
index 4fcdf4b89ee..128bd39816c 100644
--- a/gcc/config/i386/i386-builtins.c
+++ b/gcc/config/i386/i386-builtins.c
@@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void)
 			    VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE);
 
   /* SSE3.  */
-  def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor",
+  def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor",
 	       VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR);
-  def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait",
+  def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait",
 	       VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT);
 
   /* AES */
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 18d2c0b9f99..7ecd0cf8b8c 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] =
   { "-mmovbe",		OPTION_MASK_ISA2_MOVBE },
   { "-mclzero",		OPTION_MASK_ISA2_CLZERO },
   { "-mmwaitx",		OPTION_MASK_ISA2_MWAITX },
+  { "-mmwait",		OPTION_MASK_ISA2_MWAIT },
   { "-mmovdir64b",	OPTION_MASK_ISA2_MOVDIR64B },
   { "-mwaitpkg",	OPTION_MASK_ISA2_WAITPKG },
   { "-mcldemote",	OPTION_MASK_ISA2_CLDEMOTE },
@@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
     IX86_ATTR_ISA ("fsgsbase",	OPT_mfsgsbase),
     IX86_ATTR_ISA ("rdrnd",	OPT_mrdrnd),
     IX86_ATTR_ISA ("mwaitx",	OPT_mmwaitx),
+    IX86_ATTR_ISA ("mwait",	OPT_mmwait),
     IX86_ATTR_ISA ("clzero",	OPT_mclzero),
     IX86_ATTR_ISA ("pku",	OPT_mpku),
     IX86_ATTR_ISA ("lwp",	OPT_mlwp),
@@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p,
       || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags))
     ix86_prefetch_sse = true;
 
+  /* Enable mwait/monitor instructions for -msse3.  */
+  if (TARGET_SSE3_P (opts->x_ix86_isa_flags))
+    opts->x_ix86_isa_flags2
+      |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit;
+
   /* Enable popcnt instruction for -msse4.2 or -mabm.  */
   if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)
       || TARGET_ABM_P (opts->x_ix86_isa_flags))
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 5583ec6881a..73e118900f7 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define TARGET_CLWB_P(x)	TARGET_ISA_CLWB_P(x)
 #define TARGET_MWAITX	TARGET_ISA2_MWAITX
 #define TARGET_MWAITX_P(x)	TARGET_ISA2_MWAITX_P(x)
+#define TARGET_MWAIT	TARGET_ISA2_MWAIT
+#define TARGET_MWAIT_P(x)	TARGET_ISA2_MWAIT_P(x)
 #define TARGET_PKU	TARGET_ISA_PKU
 #define TARGET_PKU_P(x)	TARGET_ISA_PKU_P(x)
 #define TARGET_SHSTK	TARGET_ISA_SHSTK
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index c781fdc8278..7b8547bb1c3 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation.
 mneeded
 Target Var(ix86_needed) Save
 Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property.
+
+mmwait
+Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save
+Support MWAIT and MONITOR built-in functions and code generation.
diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h
new file mode 100644
index 00000000000..1ecbc4abb69
--- /dev/null
+++ b/gcc/config/i386/mwaitintrin.h
@@ -0,0 +1,52 @@
+/* Copyright (C) 2021 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _MWAITINTRIN_H_INCLUDED
+#define _MWAITINTRIN_H_INCLUDED
+
+#ifndef __MWAIT__
+#pragma GCC push_options
+#pragma GCC target("mwait")
+#define __DISABLE_MWAIT__
+#endif /* __MWAIT__ */
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
+{
+  __builtin_ia32_monitor (__P, __E, __H);
+}
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mwait (unsigned int __E, unsigned int __H)
+{
+  __builtin_ia32_mwait (__E, __H);
+}
+
+#ifdef __DISABLE_MWAIT__
+#undef __DISABLE_MWAIT__
+#pragma GCC pop_options
+#endif /* __DISABLE_MWAIT__ */
+
+#endif /* _MWAITINTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h
index fa9c5bb8b9f..f8102d2be23 100644
--- a/gcc/config/i386/pmmintrin.h
+++ b/gcc/config/i386/pmmintrin.h
@@ -29,6 +29,7 @@
 
 /* We need definitions from the SSE2 and SSE header files*/
 #include <emmintrin.h>
+#include <mwaitintrin.h>
 
 #ifndef __SSE3__
 #pragma GCC push_options
@@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P)
   return (__m128i) __builtin_ia32_lddqu ((char const *)__P);
 }
 
-extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
-{
-  __builtin_ia32_monitor (__P, __E, __H);
-}
-
-extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mwait (unsigned int __E, unsigned int __H)
-{
-  __builtin_ia32_mwait (__E, __H);
-}
-
 #ifdef __DISABLE_SSE3__
 #undef __DISABLE_SSE3__
 #pragma GCC pop_options
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 3f81abc7804..43afe3dabed 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait"
   [(unspec_volatile [(match_operand:SI 0 "register_operand" "c")
 		     (match_operand:SI 1 "register_operand" "a")]
 		    UNSPECV_MWAIT)]
-  "TARGET_SSE3"
+  "TARGET_MWAIT"
 ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used.
 ;; Since 32bit register operands are implicitly zero extended to 64bit,
 ;; we only need to set up 32bit registers.
@@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>"
 		     (match_operand:SI 1 "register_operand" "c")
 		     (match_operand:SI 2 "register_operand" "d")]
 		    UNSPECV_MONITOR)]
-  "TARGET_SSE3"
+  "TARGET_MWAIT"
 ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in
 ;; RCX and RDX are used.  Since 32bit register operands are implicitly
 ;; zero extended to 64bit, we only need to set up 32bit registers.
diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
index ceda501252c..7793032ba90 100644
--- a/gcc/config/i386/x86gprintrin.h
+++ b/gcc/config/i386/x86gprintrin.h
@@ -56,6 +56,8 @@
 
 #include <movdirintrin.h>
 
+#include <mwaitintrin.h>
+
 #include <mwaitxintrin.h>
 
 #include <pconfigintrin.h>
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 1bc66cce2b8..1acfaf1d345 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions.
 @cindex @code{target("movdiri")} function attribute, x86
 Enable/disable the generation of the MOVDIRI instructions.
 
+@item mwait
+@itemx no-mwait
+@cindex @code{target("mwait")} function attribute, x86
+Enable/disable the generation of the MWAIT and MONITOR instructions.
+
 @item mwaitx
 @itemx no-mwaitx
 @cindex @code{target("mwaitx")} function attribute, x86
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 05269f83808..fc222758a22 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options.
 -mno-wide-multiply  -mrtd  -malign-double @gol
 -mpreferred-stack-boundary=@var{num} @gol
 -mincoming-stack-boundary=@var{num} @gol
--mcld  -mcx16  -msahf  -mmovbe  -mcrc32 @gol
+-mcld  -mcx16  -msahf  -mmovbe  -mcrc32 -mmwait @gol
 -mrecip  -mrecip=@var{opt} @gol
 -mvzeroupper  -mprefer-avx128  -mprefer-vector-width=@var{opt} @gol
 -mmmx  -msse  -msse2  -msse3  -mssse3  -msse4.1  -msse4.2  -msse4  -mavx @gol
@@ -31178,6 +31178,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi},
 @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and
 @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction.
 
+@item -mmwait
+@opindex mmwait
+This option enables built-in functions @code{__builtin_ia32_monitor},
+and @code{__builtin_ia32_mwait} to generate the @code{monitor} and
+@code{mwait} machine instructions.
+
 @item -mrecip
 @opindex mrecip
 This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index a01049da472..e2a0e033ab2 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -121,7 +121,7 @@ along with GCC; see the file COPYING3.  If not see
      form followed by the data for the string.  */
 
 #define LTO_major_version 11
-#define LTO_minor_version 1
+#define LTO_minor_version 2
 
 typedef unsigned char	lto_decl_flags_t;
 
diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c
new file mode 100644
index 00000000000..96eeec070f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/monitor-2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */
+
+/* Verify that they work in both 32bit and 64bit.  */
+
+#include <x86gprintrin.h>
+
+void
+foo (char *p, int x, int y, int z)
+{
+   _mm_monitor (p, y, x);
+   _mm_mwait (z, y);
+}
+
+void
+bar (char *p, long x, long y, long z)
+{
+   _mm_monitor (p, y, x);
+   _mm_mwait (z, y);
+}
+
+void
+foo1 (char *p)
+{
+   _mm_monitor (p, 0, 0);
+   _mm_mwait (0, 0);
+}
-- 
2.31.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
  2021-08-16 12:28       ` Richard Biener
  2021-08-16 12:35         ` H.J. Lu
@ 2021-08-16 12:37         ` Martin Liška
  1 sibling, 0 replies; 16+ messages in thread
From: Martin Liška @ 2021-08-16 12:37 UTC (permalink / raw)
  To: Richard Biener, H.J. Lu; +Cc: Jakub Jelinek, GCC Patches

On 8/16/21 2:28 PM, Richard Biener via Gcc-patches wrote:
> Yes please, and do it with the same commit doing the .opt change.

Just one quick note: I've got a periodic builder that verifies the LTO
stream on tramp3d in all active branches.

Martin

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only
  2021-08-16  6:11 ` [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only Richard Biener
@ 2021-08-24 14:57   ` H.J. Lu
  2021-08-25  7:34     ` Uros Bizjak
  0 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2021-08-24 14:57 UTC (permalink / raw)
  To: Richard Biener, Jan Hubicka; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek

On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > <x86gprintrin.h> and target("general-regs-only") function attribute
> > were added to GCC 11.  But their implementations are incomplete.  I'd
> > like to backport the following patches to GCC 11 branch to finish them.
>
> Fine with me if x86 maintainers do not disagree (also see one comment I have
> on the -mwait adding patch).

Hi Uros, Honza,

Do you have any comments?  The updated -mwait patch with LTO_minor_version
bump is at:

https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577471.html

Thanks.

H.J.
> > H.J. Lu (5):
> >   x86: Add -mmwait for -mgeneral-regs-only
> >   x86: Use crc32 target option for CRC32 intrinsics
> >   x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
> >   x86: Enable the GPR only instructions for -mgeneral-regs-only
> >   <x86gprintrin.h>: Add pragma GCC target("general-regs-only")
> >
> >  gcc/common/config/i386/i386-common.c       |  45 ++-
> >  gcc/config.gcc                             |   6 +-
> >  gcc/config/i386/i386-builtin.def           |   8 +-
> >  gcc/config/i386/i386-builtins.c            |   4 +-
> >  gcc/config/i386/i386-c.c                   |   2 +
> >  gcc/config/i386/i386-options.c             |  12 +
> >  gcc/config/i386/i386.c                     |   6 +-
> >  gcc/config/i386/i386.h                     |   2 +
> >  gcc/config/i386/i386.md                    |   4 +-
> >  gcc/config/i386/i386.opt                   |   4 +
> >  gcc/config/i386/ia32intrin.h               |  42 ++-
> >  gcc/config/i386/mwaitintrin.h              |  52 +++
> >  gcc/config/i386/pmmintrin.h                |  13 +-
> >  gcc/config/i386/serializeintrin.h          |   7 +-
> >  gcc/config/i386/sse.md                     |   4 +-
> >  gcc/config/i386/x86gprintrin.h             |  13 +
> >  gcc/doc/extend.texi                        |   5 +
> >  gcc/doc/invoke.texi                        |   8 +-
> >  gcc/testsuite/gcc.target/i386/crc32-6.c    |  13 +
> >  gcc/testsuite/gcc.target/i386/monitor-2.c  |  27 ++
> >  gcc/testsuite/gcc.target/i386/pr101492-1.c |  10 +
> >  gcc/testsuite/gcc.target/i386/pr101492-2.c |  10 +
> >  gcc/testsuite/gcc.target/i386/pr101492-3.c |  10 +
> >  gcc/testsuite/gcc.target/i386/pr101492-4.c |  12 +
> >  gcc/testsuite/gcc.target/i386/pr99744-3.c  |  13 +
> >  gcc/testsuite/gcc.target/i386/pr99744-4.c  | 357 +++++++++++++++++++++
> >  gcc/testsuite/gcc.target/i386/pr99744-5.c  |  25 ++
> >  gcc/testsuite/gcc.target/i386/pr99744-6.c  |  23 ++
> >  gcc/testsuite/gcc.target/i386/pr99744-7.c  |  12 +
> >  gcc/testsuite/gcc.target/i386/pr99744-8.c  |  13 +
> >  30 files changed, 717 insertions(+), 45 deletions(-)
> >  create mode 100644 gcc/config/i386/mwaitintrin.h
> >  create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c
> >
> > --
> > 2.31.1
> >



--
H.J.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only
  2021-08-24 14:57   ` H.J. Lu
@ 2021-08-25  7:34     ` Uros Bizjak
  2021-08-25 12:14       ` H.J. Lu
  2021-08-26  6:35       ` Richard Biener
  0 siblings, 2 replies; 16+ messages in thread
From: Uros Bizjak @ 2021-08-25  7:34 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Richard Biener, Jan Hubicka, GCC Patches, Jakub Jelinek

On Tue, Aug 24, 2021 at 4:57 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > <x86gprintrin.h> and target("general-regs-only") function attribute
> > > were added to GCC 11.  But their implementations are incomplete.  I'd
> > > like to backport the following patches to GCC 11 branch to finish them.
> >
> > Fine with me if x86 maintainers do not disagree (also see one comment I have
> > on the -mwait adding patch).
>
> Hi Uros, Honza,
>
> Do you have any comments?  The updated -mwait patch with LTO_minor_version
> bump is at:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577471.html

I don't have any comments, but IIRC, approved changes can be
backported from mainline to release branches without additional
approval.

Uros.

> Thanks.
>
> H.J.
> > > H.J. Lu (5):
> > >   x86: Add -mmwait for -mgeneral-regs-only
> > >   x86: Use crc32 target option for CRC32 intrinsics
> > >   x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
> > >   x86: Enable the GPR only instructions for -mgeneral-regs-only
> > >   <x86gprintrin.h>: Add pragma GCC target("general-regs-only")
> > >
> > >  gcc/common/config/i386/i386-common.c       |  45 ++-
> > >  gcc/config.gcc                             |   6 +-
> > >  gcc/config/i386/i386-builtin.def           |   8 +-
> > >  gcc/config/i386/i386-builtins.c            |   4 +-
> > >  gcc/config/i386/i386-c.c                   |   2 +
> > >  gcc/config/i386/i386-options.c             |  12 +
> > >  gcc/config/i386/i386.c                     |   6 +-
> > >  gcc/config/i386/i386.h                     |   2 +
> > >  gcc/config/i386/i386.md                    |   4 +-
> > >  gcc/config/i386/i386.opt                   |   4 +
> > >  gcc/config/i386/ia32intrin.h               |  42 ++-
> > >  gcc/config/i386/mwaitintrin.h              |  52 +++
> > >  gcc/config/i386/pmmintrin.h                |  13 +-
> > >  gcc/config/i386/serializeintrin.h          |   7 +-
> > >  gcc/config/i386/sse.md                     |   4 +-
> > >  gcc/config/i386/x86gprintrin.h             |  13 +
> > >  gcc/doc/extend.texi                        |   5 +
> > >  gcc/doc/invoke.texi                        |   8 +-
> > >  gcc/testsuite/gcc.target/i386/crc32-6.c    |  13 +
> > >  gcc/testsuite/gcc.target/i386/monitor-2.c  |  27 ++
> > >  gcc/testsuite/gcc.target/i386/pr101492-1.c |  10 +
> > >  gcc/testsuite/gcc.target/i386/pr101492-2.c |  10 +
> > >  gcc/testsuite/gcc.target/i386/pr101492-3.c |  10 +
> > >  gcc/testsuite/gcc.target/i386/pr101492-4.c |  12 +
> > >  gcc/testsuite/gcc.target/i386/pr99744-3.c  |  13 +
> > >  gcc/testsuite/gcc.target/i386/pr99744-4.c  | 357 +++++++++++++++++++++
> > >  gcc/testsuite/gcc.target/i386/pr99744-5.c  |  25 ++
> > >  gcc/testsuite/gcc.target/i386/pr99744-6.c  |  23 ++
> > >  gcc/testsuite/gcc.target/i386/pr99744-7.c  |  12 +
> > >  gcc/testsuite/gcc.target/i386/pr99744-8.c  |  13 +
> > >  30 files changed, 717 insertions(+), 45 deletions(-)
> > >  create mode 100644 gcc/config/i386/mwaitintrin.h
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c
> > >
> > > --
> > > 2.31.1
> > >
>
>
>
> --
> H.J.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only
  2021-08-25  7:34     ` Uros Bizjak
@ 2021-08-25 12:14       ` H.J. Lu
  2021-08-26  6:35       ` Richard Biener
  1 sibling, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2021-08-25 12:14 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Richard Biener, Jan Hubicka, GCC Patches, Jakub Jelinek

On Wed, Aug 25, 2021 at 12:34 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Tue, Aug 24, 2021 at 4:57 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> > >
> > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > > <x86gprintrin.h> and target("general-regs-only") function attribute
> > > > were added to GCC 11.  But their implementations are incomplete.  I'd
> > > > like to backport the following patches to GCC 11 branch to finish them.
> > >
> > > Fine with me if x86 maintainers do not disagree (also see one comment I have
> > > on the -mwait adding patch).
> >
> > Hi Uros, Honza,
> >
> > Do you have any comments?  The updated -mwait patch with LTO_minor_version
> > bump is at:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577471.html
>
> I don't have any comments, but IIRC, approved changes can be
> backported from mainline to release branches without additional
> approval.

I am checking them in.

Thanks.

> Uros.
>
> > Thanks.
> >
> > H.J.
> > > > H.J. Lu (5):
> > > >   x86: Add -mmwait for -mgeneral-regs-only
> > > >   x86: Use crc32 target option for CRC32 intrinsics
> > > >   x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
> > > >   x86: Enable the GPR only instructions for -mgeneral-regs-only
> > > >   <x86gprintrin.h>: Add pragma GCC target("general-regs-only")
> > > >
> > > >  gcc/common/config/i386/i386-common.c       |  45 ++-
> > > >  gcc/config.gcc                             |   6 +-
> > > >  gcc/config/i386/i386-builtin.def           |   8 +-
> > > >  gcc/config/i386/i386-builtins.c            |   4 +-
> > > >  gcc/config/i386/i386-c.c                   |   2 +
> > > >  gcc/config/i386/i386-options.c             |  12 +
> > > >  gcc/config/i386/i386.c                     |   6 +-
> > > >  gcc/config/i386/i386.h                     |   2 +
> > > >  gcc/config/i386/i386.md                    |   4 +-
> > > >  gcc/config/i386/i386.opt                   |   4 +
> > > >  gcc/config/i386/ia32intrin.h               |  42 ++-
> > > >  gcc/config/i386/mwaitintrin.h              |  52 +++
> > > >  gcc/config/i386/pmmintrin.h                |  13 +-
> > > >  gcc/config/i386/serializeintrin.h          |   7 +-
> > > >  gcc/config/i386/sse.md                     |   4 +-
> > > >  gcc/config/i386/x86gprintrin.h             |  13 +
> > > >  gcc/doc/extend.texi                        |   5 +
> > > >  gcc/doc/invoke.texi                        |   8 +-
> > > >  gcc/testsuite/gcc.target/i386/crc32-6.c    |  13 +
> > > >  gcc/testsuite/gcc.target/i386/monitor-2.c  |  27 ++
> > > >  gcc/testsuite/gcc.target/i386/pr101492-1.c |  10 +
> > > >  gcc/testsuite/gcc.target/i386/pr101492-2.c |  10 +
> > > >  gcc/testsuite/gcc.target/i386/pr101492-3.c |  10 +
> > > >  gcc/testsuite/gcc.target/i386/pr101492-4.c |  12 +
> > > >  gcc/testsuite/gcc.target/i386/pr99744-3.c  |  13 +
> > > >  gcc/testsuite/gcc.target/i386/pr99744-4.c  | 357 +++++++++++++++++++++
> > > >  gcc/testsuite/gcc.target/i386/pr99744-5.c  |  25 ++
> > > >  gcc/testsuite/gcc.target/i386/pr99744-6.c  |  23 ++
> > > >  gcc/testsuite/gcc.target/i386/pr99744-7.c  |  12 +
> > > >  gcc/testsuite/gcc.target/i386/pr99744-8.c  |  13 +
> > > >  30 files changed, 717 insertions(+), 45 deletions(-)
> > > >  create mode 100644 gcc/config/i386/mwaitintrin.h
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c
> > > >
> > > > --
> > > > 2.31.1
> > > >
> >
> >
> >
> > --
> > H.J.



-- 
H.J.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only
  2021-08-25  7:34     ` Uros Bizjak
  2021-08-25 12:14       ` H.J. Lu
@ 2021-08-26  6:35       ` Richard Biener
  1 sibling, 0 replies; 16+ messages in thread
From: Richard Biener @ 2021-08-26  6:35 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: H.J. Lu, Jan Hubicka, GCC Patches, Jakub Jelinek

On Wed, Aug 25, 2021 at 9:34 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Tue, Aug 24, 2021 at 4:57 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> > >
> > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > > <x86gprintrin.h> and target("general-regs-only") function attribute
> > > > were added to GCC 11.  But their implementations are incomplete.  I'd
> > > > like to backport the following patches to GCC 11 branch to finish them.
> > >
> > > Fine with me if x86 maintainers do not disagree (also see one comment I have
> > > on the -mwait adding patch).
> >
> > Hi Uros, Honza,
> >
> > Do you have any comments?  The updated -mwait patch with LTO_minor_version
> > bump is at:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577471.html
>
> I don't have any comments, but IIRC, approved changes can be
> backported from mainline to release branches without additional
> approval.

If they fix regressions, yes.  I understood this wasn't such obvious case here
(instead it's a new but buggy feature).

Richard.

> Uros.
>
> > Thanks.
> >
> > H.J.
> > > > H.J. Lu (5):
> > > >   x86: Add -mmwait for -mgeneral-regs-only
> > > >   x86: Use crc32 target option for CRC32 intrinsics
> > > >   x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
> > > >   x86: Enable the GPR only instructions for -mgeneral-regs-only
> > > >   <x86gprintrin.h>: Add pragma GCC target("general-regs-only")
> > > >
> > > >  gcc/common/config/i386/i386-common.c       |  45 ++-
> > > >  gcc/config.gcc                             |   6 +-
> > > >  gcc/config/i386/i386-builtin.def           |   8 +-
> > > >  gcc/config/i386/i386-builtins.c            |   4 +-
> > > >  gcc/config/i386/i386-c.c                   |   2 +
> > > >  gcc/config/i386/i386-options.c             |  12 +
> > > >  gcc/config/i386/i386.c                     |   6 +-
> > > >  gcc/config/i386/i386.h                     |   2 +
> > > >  gcc/config/i386/i386.md                    |   4 +-
> > > >  gcc/config/i386/i386.opt                   |   4 +
> > > >  gcc/config/i386/ia32intrin.h               |  42 ++-
> > > >  gcc/config/i386/mwaitintrin.h              |  52 +++
> > > >  gcc/config/i386/pmmintrin.h                |  13 +-
> > > >  gcc/config/i386/serializeintrin.h          |   7 +-
> > > >  gcc/config/i386/sse.md                     |   4 +-
> > > >  gcc/config/i386/x86gprintrin.h             |  13 +
> > > >  gcc/doc/extend.texi                        |   5 +
> > > >  gcc/doc/invoke.texi                        |   8 +-
> > > >  gcc/testsuite/gcc.target/i386/crc32-6.c    |  13 +
> > > >  gcc/testsuite/gcc.target/i386/monitor-2.c  |  27 ++
> > > >  gcc/testsuite/gcc.target/i386/pr101492-1.c |  10 +
> > > >  gcc/testsuite/gcc.target/i386/pr101492-2.c |  10 +
> > > >  gcc/testsuite/gcc.target/i386/pr101492-3.c |  10 +
> > > >  gcc/testsuite/gcc.target/i386/pr101492-4.c |  12 +
> > > >  gcc/testsuite/gcc.target/i386/pr99744-3.c  |  13 +
> > > >  gcc/testsuite/gcc.target/i386/pr99744-4.c  | 357 +++++++++++++++++++++
> > > >  gcc/testsuite/gcc.target/i386/pr99744-5.c  |  25 ++
> > > >  gcc/testsuite/gcc.target/i386/pr99744-6.c  |  23 ++
> > > >  gcc/testsuite/gcc.target/i386/pr99744-7.c  |  12 +
> > > >  gcc/testsuite/gcc.target/i386/pr99744-8.c  |  13 +
> > > >  30 files changed, 717 insertions(+), 45 deletions(-)
> > > >  create mode 100644 gcc/config/i386/mwaitintrin.h
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c
> > > >
> > > > --
> > > > 2.31.1
> > > >
> >
> >
> >
> > --
> > H.J.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-08-26  6:35 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu
2021-08-13 13:50 ` [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only H.J. Lu
2021-08-16  6:11   ` Richard Biener
2021-08-16 12:25     ` H.J. Lu
2021-08-16 12:28       ` Richard Biener
2021-08-16 12:35         ` H.J. Lu
2021-08-16 12:37         ` Martin Liška
2021-08-13 13:51 ` [PATCH 2/5] x86: Use crc32 target option for CRC32 intrinsics H.J. Lu
2021-08-13 13:51 ` [PATCH 3/5] x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions H.J. Lu
2021-08-13 13:51 ` [PATCH 4/5] x86: Enable the GPR only instructions for -mgeneral-regs-only H.J. Lu
2021-08-13 13:51 ` [PATCH 5/5] <x86gprintrin.h>: Add pragma GCC target("general-regs-only") H.J. Lu
2021-08-16  6:11 ` [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only Richard Biener
2021-08-24 14:57   ` H.J. Lu
2021-08-25  7:34     ` Uros Bizjak
2021-08-25 12:14       ` H.J. Lu
2021-08-26  6:35       ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).