* [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only
From: H.J. Lu @ 2021-08-13 13:50 UTC (permalink / raw)
To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener
<x86gprintrin.h> and the target("general-regs-only") function attribute
were added in GCC 11, but their implementations are incomplete. I'd
like to backport the following patches to the GCC 11 branch to finish them.
H.J. Lu (5):
x86: Add -mmwait for -mgeneral-regs-only
x86: Use crc32 target option for CRC32 intrinsics
x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
x86: Enable the GPR only instructions for -mgeneral-regs-only
<x86gprintrin.h>: Add pragma GCC target("general-regs-only")
gcc/common/config/i386/i386-common.c | 45 ++-
gcc/config.gcc | 6 +-
gcc/config/i386/i386-builtin.def | 8 +-
gcc/config/i386/i386-builtins.c | 4 +-
gcc/config/i386/i386-c.c | 2 +
gcc/config/i386/i386-options.c | 12 +
gcc/config/i386/i386.c | 6 +-
gcc/config/i386/i386.h | 2 +
gcc/config/i386/i386.md | 4 +-
gcc/config/i386/i386.opt | 4 +
gcc/config/i386/ia32intrin.h | 42 ++-
gcc/config/i386/mwaitintrin.h | 52 +++
gcc/config/i386/pmmintrin.h | 13 +-
gcc/config/i386/serializeintrin.h | 7 +-
gcc/config/i386/sse.md | 4 +-
gcc/config/i386/x86gprintrin.h | 13 +
gcc/doc/extend.texi | 5 +
gcc/doc/invoke.texi | 8 +-
gcc/testsuite/gcc.target/i386/crc32-6.c | 13 +
gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++
gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 +
gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 +
gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 +
gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 +
gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 +
gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 +++++++++++++++++++++
gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++
gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++
gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 +
gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 +
30 files changed, 717 insertions(+), 45 deletions(-)
create mode 100644 gcc/config/i386/mwaitintrin.h
create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c
create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c
--
2.31.1
* [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
From: H.J. Lu @ 2021-08-13 13:50 UTC (permalink / raw)
To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener
Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
-mgeneral-regs-only, and make -msse3 imply -mmwait.
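The interaction between -mmwait, -mno-mwait and -msse3 described above
can be sketched as a small standalone model. This is only an
illustration, not the GCC code: struct opts, the ISA_* values and the
function names are hypothetical stand-ins for the real option state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for GCC's ISA bits; the values are illustrative.  */
#define ISA_SSE3   (UINT64_C (1) << 0)   /* lives in isa_flags  */
#define ISA2_MWAIT (UINT64_C (1) << 0)   /* lives in isa_flags2 */

struct opts
{
  uint64_t isa_flags;            /* enabled ISAs, first word */
  uint64_t isa_flags2;           /* enabled ISAs, second word */
  uint64_t isa_flags2_explicit;  /* bits the user set or cleared by hand */
};

/* Models -mmwait (value = true) and -mno-mwait (value = false).  */
static void
handle_mmwait (struct opts *o, bool value)
{
  assert (o != NULL);
  if (value)
    o->isa_flags2 |= ISA2_MWAIT;
  else
    o->isa_flags2 &= ~ISA2_MWAIT;
  /* Either way, record that the user made an explicit choice.  */
  o->isa_flags2_explicit |= ISA2_MWAIT;
}

/* Models the override step: -msse3 implies -mmwait unless the user
   already decided the MWAIT bit explicitly.  */
static void
override_options (struct opts *o)
{
  assert (o != NULL);
  if (o->isa_flags & ISA_SSE3)
    o->isa_flags2 |= ISA2_MWAIT & ~o->isa_flags2_explicit;
}

/* Returns true if MWAIT ends up enabled for the modeled command line.  */
bool
mwait_enabled (bool msse3, bool has_mwait_opt, bool mwait_value)
{
  struct opts o = { 0, 0, 0 };
  if (msse3)
    o.isa_flags |= ISA_SSE3;
  if (has_mwait_opt)
    handle_mmwait (&o, mwait_value);
  override_options (&o);
  return (o.isa_flags2 & ISA2_MWAIT) != 0;
}
```

In this model, mwait_enabled (true, false, false) corresponds to plain
-msse3, and mwait_enabled (true, true, false) to -mno-mwait -msse3; the
explicit bit is what keeps the second case disabled.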
gcc/
* config.gcc: Install mwaitintrin.h for i[34567]86-*-* and
x86_64-*-* targets.
* common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET):
New.
(OPTION_MASK_ISA2_MWAIT_UNSET): Likewise.
(ix86_handle_option): Handle -mmwait.
* config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins):
Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on
__builtin_ia32_monitor and __builtin_ia32_mwait.
* config/i386/i386-options.c (isa2_opts): Add -mmwait.
(ix86_valid_target_attribute_inner_p): Likewise.
(ix86_option_override_internal): Enable mwait/monitor
instructions for -msse3.
* config/i386/i386.h (TARGET_MWAIT): New.
(TARGET_MWAIT_P): Likewise.
* config/i386/i386.opt: Add -mmwait.
* config/i386/mwaitintrin.h: New file.
* config/i386/pmmintrin.h: Include <mwaitintrin.h>.
* config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with
TARGET_MWAIT.
(@sse3_monitor_<mode>): Likewise.
* config/i386/x86gprintrin.h: Include <mwaitintrin.h>.
* doc/extend.texi: Document mwait target attribute.
* doc/invoke.texi: Document -mmwait.
gcc/testsuite/
* gcc.target/i386/monitor-2.c: New test.
(cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922)
---
gcc/common/config/i386/i386-common.c | 15 +++++++
gcc/config.gcc | 6 ++-
gcc/config/i386/i386-builtins.c | 4 +-
gcc/config/i386/i386-options.c | 7 +++
gcc/config/i386/i386.h | 2 +
gcc/config/i386/i386.opt | 4 ++
gcc/config/i386/mwaitintrin.h | 52 +++++++++++++++++++++++
gcc/config/i386/pmmintrin.h | 13 +-----
gcc/config/i386/sse.md | 4 +-
gcc/config/i386/x86gprintrin.h | 2 +
gcc/doc/extend.texi | 5 +++
gcc/doc/invoke.texi | 8 +++-
gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++
13 files changed, 130 insertions(+), 19 deletions(-)
create mode 100644 gcc/config/i386/mwaitintrin.h
create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 6a7b5c8312f..e156cc34584 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -150,6 +150,7 @@ along with GCC; see the file COPYING3. If not see
#define OPTION_MASK_ISA_F16C_SET \
(OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET)
#define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX
+#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT
#define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO
#define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU
#define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID
@@ -245,6 +246,7 @@ along with GCC; see the file COPYING3. If not see
#define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
#define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
#define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX
+#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT
#define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO
#define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU
#define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID
@@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts,
}
return true;
+ case OPT_mmwait:
+ if (value)
+ {
+ opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET;
+ }
+ else
+ {
+ opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET;
+ }
+ return true;
+
case OPT_mclzero:
if (value)
{
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 357b0bed067..a020e0808c9 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -414,7 +414,8 @@ i[34567]86-*-*)
avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
amxbf16intrin.h x86gprintrin.h uintrintrin.h
- hresetintrin.h keylockerintrin.h avxvnniintrin.h"
+ hresetintrin.h keylockerintrin.h avxvnniintrin.h
+ mwaitintrin.h"
;;
x86_64-*-*)
cpu_type=i386
@@ -451,7 +452,8 @@ x86_64-*-*)
avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
amxbf16intrin.h x86gprintrin.h uintrintrin.h
- hresetintrin.h keylockerintrin.h avxvnniintrin.h"
+ hresetintrin.h keylockerintrin.h avxvnniintrin.h
+ mwaitintrin.h"
;;
ia64-*-*)
extra_headers=ia64intrin.h
diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
index 4fcdf4b89ee..128bd39816c 100644
--- a/gcc/config/i386/i386-builtins.c
+++ b/gcc/config/i386/i386-builtins.c
@@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void)
VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE);
/* SSE3. */
- def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor",
+ def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor",
VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR);
- def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait",
+ def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait",
VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT);
/* AES */
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 18d2c0b9f99..7ecd0cf8b8c 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] =
{ "-mmovbe", OPTION_MASK_ISA2_MOVBE },
{ "-mclzero", OPTION_MASK_ISA2_CLZERO },
{ "-mmwaitx", OPTION_MASK_ISA2_MWAITX },
+ { "-mmwait", OPTION_MASK_ISA2_MWAIT },
{ "-mmovdir64b", OPTION_MASK_ISA2_MOVDIR64B },
{ "-mwaitpkg", OPTION_MASK_ISA2_WAITPKG },
{ "-mcldemote", OPTION_MASK_ISA2_CLDEMOTE },
@@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase),
IX86_ATTR_ISA ("rdrnd", OPT_mrdrnd),
IX86_ATTR_ISA ("mwaitx", OPT_mmwaitx),
+ IX86_ATTR_ISA ("mwait", OPT_mmwait),
IX86_ATTR_ISA ("clzero", OPT_mclzero),
IX86_ATTR_ISA ("pku", OPT_mpku),
IX86_ATTR_ISA ("lwp", OPT_mlwp),
@@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p,
|| TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags))
ix86_prefetch_sse = true;
+ /* Enable mwait/monitor instructions for -msse3. */
+ if (TARGET_SSE3_P (opts->x_ix86_isa_flags))
+ opts->x_ix86_isa_flags2
+ |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit;
+
/* Enable popcnt instruction for -msse4.2 or -mabm. */
if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)
|| TARGET_ABM_P (opts->x_ix86_isa_flags))
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 5583ec6881a..73e118900f7 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
#define TARGET_CLWB_P(x) TARGET_ISA_CLWB_P(x)
#define TARGET_MWAITX TARGET_ISA2_MWAITX
#define TARGET_MWAITX_P(x) TARGET_ISA2_MWAITX_P(x)
+#define TARGET_MWAIT TARGET_ISA2_MWAIT
+#define TARGET_MWAIT_P(x) TARGET_ISA2_MWAIT_P(x)
#define TARGET_PKU TARGET_ISA_PKU
#define TARGET_PKU_P(x) TARGET_ISA_PKU_P(x)
#define TARGET_SHSTK TARGET_ISA_SHSTK
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index c781fdc8278..7b8547bb1c3 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation.
mneeded
Target Var(ix86_needed) Save
Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property.
+
+mmwait
+Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save
+Support MWAIT and MONITOR built-in functions and code generation.
diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h
new file mode 100644
index 00000000000..1ecbc4abb69
--- /dev/null
+++ b/gcc/config/i386/mwaitintrin.h
@@ -0,0 +1,52 @@
+/* Copyright (C) 2021 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ Under Section 7 of GPL version 3, you are granted additional
+ permissions described in the GCC Runtime Library Exception, version
+ 3.1, as published by the Free Software Foundation.
+
+ You should have received a copy of the GNU General Public License and
+ a copy of the GCC Runtime Library Exception along with this program;
+ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+ <http://www.gnu.org/licenses/>. */
+
+#ifndef _MWAITINTRIN_H_INCLUDED
+#define _MWAITINTRIN_H_INCLUDED
+
+#ifndef __MWAIT__
+#pragma GCC push_options
+#pragma GCC target("mwait")
+#define __DISABLE_MWAIT__
+#endif /* __MWAIT__ */
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
+{
+ __builtin_ia32_monitor (__P, __E, __H);
+}
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mwait (unsigned int __E, unsigned int __H)
+{
+ __builtin_ia32_mwait (__E, __H);
+}
+
+#ifdef __DISABLE_MWAIT__
+#undef __DISABLE_MWAIT__
+#pragma GCC pop_options
+#endif /* __DISABLE_MWAIT__ */
+
+#endif /* _MWAITINTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h
index fa9c5bb8b9f..f8102d2be23 100644
--- a/gcc/config/i386/pmmintrin.h
+++ b/gcc/config/i386/pmmintrin.h
@@ -29,6 +29,7 @@
/* We need definitions from the SSE2 and SSE header files*/
#include <emmintrin.h>
+#include <mwaitintrin.h>
#ifndef __SSE3__
#pragma GCC push_options
@@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P)
return (__m128i) __builtin_ia32_lddqu ((char const *)__P);
}
-extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
-{
- __builtin_ia32_monitor (__P, __E, __H);
-}
-
-extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mwait (unsigned int __E, unsigned int __H)
-{
- __builtin_ia32_mwait (__E, __H);
-}
-
#ifdef __DISABLE_SSE3__
#undef __DISABLE_SSE3__
#pragma GCC pop_options
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 3f81abc7804..43afe3dabed 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait"
[(unspec_volatile [(match_operand:SI 0 "register_operand" "c")
(match_operand:SI 1 "register_operand" "a")]
UNSPECV_MWAIT)]
- "TARGET_SSE3"
+ "TARGET_MWAIT"
;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used.
;; Since 32bit register operands are implicitly zero extended to 64bit,
;; we only need to set up 32bit registers.
@@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>"
(match_operand:SI 1 "register_operand" "c")
(match_operand:SI 2 "register_operand" "d")]
UNSPECV_MONITOR)]
- "TARGET_SSE3"
+ "TARGET_MWAIT"
;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in
;; RCX and RDX are used. Since 32bit register operands are implicitly
;; zero extended to 64bit, we only need to set up 32bit registers.
diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
index ceda501252c..7793032ba90 100644
--- a/gcc/config/i386/x86gprintrin.h
+++ b/gcc/config/i386/x86gprintrin.h
@@ -56,6 +56,8 @@
#include <movdirintrin.h>
+#include <mwaitintrin.h>
+
#include <mwaitxintrin.h>
#include <pconfigintrin.h>
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 1bc66cce2b8..1acfaf1d345 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions.
@cindex @code{target("movdiri")} function attribute, x86
Enable/disable the generation of the MOVDIRI instructions.
+@item mwait
+@itemx no-mwait
+@cindex @code{target("mwait")} function attribute, x86
+Enable/disable the generation of the MWAIT and MONITOR instructions.
+
@item mwaitx
@itemx no-mwaitx
@cindex @code{target("mwaitx")} function attribute, x86
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7f13ffb79e1..3e1f0bc8fad 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options.
-mno-wide-multiply -mrtd -malign-double @gol
-mpreferred-stack-boundary=@var{num} @gol
-mincoming-stack-boundary=@var{num} @gol
--mcld -mcx16 -msahf -mmovbe -mcrc32 @gol
+-mcld -mcx16 -msahf -mmovbe -mcrc32 -mmwait @gol
-mrecip -mrecip=@var{opt} @gol
-mvzeroupper -mprefer-avx128 -mprefer-vector-width=@var{opt} @gol
-mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
@@ -31159,6 +31159,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi},
@code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and
@code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction.
+@item -mmwait
+@opindex mmwait
+This option enables built-in functions @code{__builtin_ia32_monitor},
+and @code{__builtin_ia32_mwait} to generate the @code{monitor} and
+@code{mwait} machine instructions.
+
@item -mrecip
@opindex mrecip
This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions
diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c
new file mode 100644
index 00000000000..96eeec070f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/monitor-2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */
+
+/* Verify that they work in both 32bit and 64bit. */
+
+#include <x86gprintrin.h>
+
+void
+foo (char *p, int x, int y, int z)
+{
+ _mm_monitor (p, y, x);
+ _mm_mwait (z, y);
+}
+
+void
+bar (char *p, long x, long y, long z)
+{
+ _mm_monitor (p, y, x);
+ _mm_mwait (z, y);
+}
+
+void
+foo1 (char *p)
+{
+ _mm_monitor (p, 0, 0);
+ _mm_mwait (0, 0);
+}
--
2.31.1
* [PATCH 2/5] x86: Use crc32 target option for CRC32 intrinsics
From: H.J. Lu @ 2021-08-13 13:51 UTC (permalink / raw)
To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener
Use the crc32 target option for the CRC32 intrinsics so that they can be
used without enabling SSE vector instructions.
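For reference, the value these intrinsics accumulate can be modeled in
portable C. The sketch below is a software model only, assuming the usual
bit-reflected form of the polynomial 0x11EDC6F41 named in ia32intrin.h;
crc32c_update_u8 and crc32c are hypothetical helper names, not GCC
functions, and the initial/final inversion in crc32c is the common
CRC-32C convention rather than something the instruction does itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected form of the CRC32 polynomial 0x11EDC6F41.  */
#define CRC32C_POLY_REFLECTED UINT32_C (0x82F63B78)

/* Software model of one byte-wide accumulate step (what __crc32b
   is expected to compute in hardware).  */
uint32_t
crc32c_update_u8 (uint32_t crc, uint8_t value)
{
  crc ^= value;
  for (int bit = 0; bit < 8; bit++)
    {
      /* All-ones mask if the low bit is set, zero otherwise.  */
      uint32_t mask = UINT32_C (0) - (crc & UINT32_C (1));
      crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & mask);
    }
  return crc;
}

/* Conventional CRC-32C over a buffer: start from all ones, fold in
   each byte, invert the result.  */
uint32_t
crc32c (const void *data, size_t len)
{
  assert (data != NULL || len == 0);
  const uint8_t *p = data;
  uint32_t crc = UINT32_C (0xFFFFFFFF);
  for (size_t i = 0; i < len; i++)
    crc = crc32c_update_u8 (crc, p[i]);
  return crc ^ UINT32_C (0xFFFFFFFF);
}
```

A model like this uses only integer operations, which is consistent with
the point of the patch: the instruction works on general registers and
does not need SSE vector state.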
* config/i386/i386-c.c (ix86_target_macros_internal): Define
__CRC32__ for -mcrc32.
* config/i386/i386-options.c (ix86_option_override_internal):
Enable crc32 instruction for -msse4.2.
* config/i386/i386.md (sse4_2_crc32<mode>): Remove TARGET_SSE4_2
check.
(sse4_2_crc32di): Likewise.
* config/i386/ia32intrin.h: Use crc32 target option for CRC32
intrinsics.
(cherry picked from commit 39671f87b2df6a1894cc11a161e4a7949d1ddccd)
---
gcc/config/i386/i386-c.c | 2 ++
gcc/config/i386/i386-options.c | 5 +++++
gcc/config/i386/i386.md | 4 ++--
gcc/config/i386/ia32intrin.h | 28 ++++++++++++++--------------
4 files changed, 23 insertions(+), 16 deletions(-)
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index be46d0506ad..5ed0de006fb 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -532,6 +532,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
def_or_undef (parse_in, "__LZCNT__");
if (isa_flag & OPTION_MASK_ISA_TBM)
def_or_undef (parse_in, "__TBM__");
+ if (isa_flag & OPTION_MASK_ISA_CRC32)
+ def_or_undef (parse_in, "__CRC32__");
if (isa_flag & OPTION_MASK_ISA_POPCNT)
def_or_undef (parse_in, "__POPCNT__");
if (isa_flag & OPTION_MASK_ISA_FSGSBASE)
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 7ecd0cf8b8c..19632b5fd6b 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2625,6 +2625,11 @@ ix86_option_override_internal (bool main_args_p,
opts->x_ix86_isa_flags
|= OPTION_MASK_ISA_POPCNT & ~opts->x_ix86_isa_flags_explicit;
+ /* Enable crc32 instruction for -msse4.2. */
+ if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags))
+ opts->x_ix86_isa_flags
+ |= OPTION_MASK_ISA_CRC32 & ~opts->x_ix86_isa_flags_explicit;
+
/* Enable lzcnt instruction for -mabm. */
if (TARGET_ABM_P(opts->x_ix86_isa_flags))
opts->x_ix86_isa_flags
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 2fdf98266cd..1d528a4434a 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -20992,7 +20992,7 @@ (define_insn "sse4_2_crc32<mode>"
[(match_operand:SI 1 "register_operand" "0")
(match_operand:SWI124 2 "nonimmediate_operand" "<r>m")]
UNSPEC_CRC32))]
- "TARGET_SSE4_2 || TARGET_CRC32"
+ "TARGET_CRC32"
"crc32{<imodesuffix>}\t{%2, %0|%0, %2}"
[(set_attr "type" "sselog1")
(set_attr "prefix_rep" "1")
@@ -21013,7 +21013,7 @@ (define_insn "sse4_2_crc32di"
[(match_operand:DI 1 "register_operand" "0")
(match_operand:DI 2 "nonimmediate_operand" "rm")]
UNSPEC_CRC32))]
- "TARGET_64BIT && (TARGET_SSE4_2 || TARGET_CRC32)"
+ "TARGET_64BIT && TARGET_CRC32"
"crc32{q}\t{%2, %0|%0, %2}"
[(set_attr "type" "sselog1")
(set_attr "prefix_rep" "1")
diff --git a/gcc/config/i386/ia32intrin.h b/gcc/config/i386/ia32intrin.h
index 591394076cc..5422b0fc9e0 100644
--- a/gcc/config/i386/ia32intrin.h
+++ b/gcc/config/i386/ia32intrin.h
@@ -51,11 +51,11 @@ __bswapd (int __X)
#ifndef __iamcu__
-#ifndef __SSE4_2__
+#ifndef __CRC32__
#pragma GCC push_options
-#pragma GCC target("sse4.2")
-#define __DISABLE_SSE4_2__
-#endif /* __SSE4_2__ */
+#pragma GCC target("crc32")
+#define __DISABLE_CRC32__
+#endif /* __CRC32__ */
/* 32bit accumulate CRC32 (polynomial 0x11EDC6F41) value. */
extern __inline unsigned int
@@ -79,10 +79,10 @@ __crc32d (unsigned int __C, unsigned int __V)
return __builtin_ia32_crc32si (__C, __V);
}
-#ifdef __DISABLE_SSE4_2__
-#undef __DISABLE_SSE4_2__
+#ifdef __DISABLE_CRC32__
+#undef __DISABLE_CRC32__
#pragma GCC pop_options
-#endif /* __DISABLE_SSE4_2__ */
+#endif /* __DISABLE_CRC32__ */
#endif /* __iamcu__ */
@@ -199,11 +199,11 @@ __bswapq (long long __X)
return __builtin_bswap64 (__X);
}
-#ifndef __SSE4_2__
+#ifndef __CRC32__
#pragma GCC push_options
-#pragma GCC target("sse4.2")
-#define __DISABLE_SSE4_2__
-#endif /* __SSE4_2__ */
+#pragma GCC target("crc32")
+#define __DISABLE_CRC32__
+#endif /* __CRC32__ */
/* 64bit accumulate CRC32 (polynomial 0x11EDC6F41) value. */
extern __inline unsigned long long
@@ -213,10 +213,10 @@ __crc32q (unsigned long long __C, unsigned long long __V)
return __builtin_ia32_crc32di (__C, __V);
}
-#ifdef __DISABLE_SSE4_2__
-#undef __DISABLE_SSE4_2__
+#ifdef __DISABLE_CRC32__
+#undef __DISABLE_CRC32__
#pragma GCC pop_options
-#endif /* __DISABLE_SSE4_2__ */
+#endif /* __DISABLE_CRC32__ */
/* 64bit popcnt */
extern __inline long long
--
2.31.1
* [PATCH 3/5] x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
From: H.J. Lu @ 2021-08-13 13:51 UTC (permalink / raw)
To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener
Since
commit 39671f87b2df6a1894cc11a161e4a7949d1ddccd
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Thu Apr 15 05:59:48 2021 -0700
x86: Use crc32 target option for CRC32 intrinsics
enabled OPTION_MASK_ISA_CRC32 for -msse4 and removed the TARGET_SSE4_2
check in the sse4_2_crc32<mode> patterns, remove OPTION_MASK_ISA_SSE4_2
from the CRC32 _builtin functions.
gcc/
PR target/101549
* config/i386/i386-builtin.def: Remove OPTION_MASK_ISA_SSE4_2
from CRC32 _builtin functions.
gcc/testsuite/
PR target/101549
* gcc.target/i386/crc32-6.c: New test.
(cherry picked from commit 7aa28dbc371cf3c09c05c68672b00d9006391595)
---
gcc/config/i386/i386-builtin.def | 8 ++++----
gcc/testsuite/gcc.target/i386/crc32-6.c | 13 +++++++++++++
2 files changed, 17 insertions(+), 4 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index e3ed4e1578f..ea509c67ddb 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -963,10 +963,10 @@ BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_ptestv2di, "__builtin_ia32_pte
/* SSE4.2 */
BDESC (OPTION_MASK_ISA_SSE4_2, 0, CODE_FOR_sse4_2_gtv2di3, "__builtin_ia32_pcmpgtq", IX86_BUILTIN_PCMPGTQ, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI)
-BDESC (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32qi, "__builtin_ia32_crc32qi", IX86_BUILTIN_CRC32QI, UNKNOWN, (int) UINT_FTYPE_UINT_UCHAR)
-BDESC (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32hi, "__builtin_ia32_crc32hi", IX86_BUILTIN_CRC32HI, UNKNOWN, (int) UINT_FTYPE_UINT_USHORT)
-BDESC (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32si, "__builtin_ia32_crc32si", IX86_BUILTIN_CRC32SI, UNKNOWN, (int) UINT_FTYPE_UINT_UINT)
-BDESC (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_CRC32 | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_sse4_2_crc32di, "__builtin_ia32_crc32di", IX86_BUILTIN_CRC32DI, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64)
+BDESC (OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32qi, "__builtin_ia32_crc32qi", IX86_BUILTIN_CRC32QI, UNKNOWN, (int) UINT_FTYPE_UINT_UCHAR)
+BDESC (OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32hi, "__builtin_ia32_crc32hi", IX86_BUILTIN_CRC32HI, UNKNOWN, (int) UINT_FTYPE_UINT_USHORT)
+BDESC (OPTION_MASK_ISA_CRC32, 0, CODE_FOR_sse4_2_crc32si, "__builtin_ia32_crc32si", IX86_BUILTIN_CRC32SI, UNKNOWN, (int) UINT_FTYPE_UINT_UINT)
+BDESC (OPTION_MASK_ISA_CRC32 | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_sse4_2_crc32di, "__builtin_ia32_crc32di", IX86_BUILTIN_CRC32DI, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64)
/* SSE4A */
BDESC (OPTION_MASK_ISA_SSE4A, 0, CODE_FOR_sse4a_extrqi, "__builtin_ia32_extrqi", IX86_BUILTIN_EXTRQI, UNKNOWN, (int) V2DI_FTYPE_V2DI_UINT_UINT)
diff --git a/gcc/testsuite/gcc.target/i386/crc32-6.c b/gcc/testsuite/gcc.target/i386/crc32-6.c
new file mode 100644
index 00000000000..464e3444069
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/crc32-6.c
@@ -0,0 +1,13 @@
+/* PR target/101549 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4 -mno-crc32" } */
+
+#include <immintrin.h>
+
+unsigned int
+test_mm_crc32_u8 (unsigned int CRC, unsigned char V)
+{
+ return _mm_crc32_u8 (CRC, V);
+}
+
+/* { dg-error "needs isa option -mcrc32" "" { target *-*-* } 0 } */
--
2.31.1
* [PATCH 4/5] x86: Enable the GPR only instructions for -mgeneral-regs-only
From: H.J. Lu @ 2021-08-13 13:51 UTC (permalink / raw)
To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener
For -mgeneral-regs-only, enable the GPR-only instructions that are
implicitly enabled by SSE ISAs, unless they have been disabled explicitly.
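The rule above can be sketched as a standalone model. This is an
illustration under simplifying assumptions, not the GCC code: it folds
the two flag words into one, and struct opts, the ISA_* values and
handle_general_regs_only are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-word model; GCC splits these across two flag words.  */
enum
{
  ISA_SSE3   = 1u << 0,
  ISA_SSE4_2 = 1u << 1,
  ISA_CRC32  = 1u << 2,
  ISA_POPCNT = 1u << 3,
  ISA_MWAIT  = 1u << 4,
  /* Bits that -mgeneral-regs-only must switch off in this model.  */
  ISA_NON_GPR = ISA_SSE3 | ISA_SSE4_2
};

struct opts
{
  unsigned flags;           /* currently enabled ISAs */
  unsigned explicit_flags;  /* ISAs the user set or cleared explicitly */
};

/* Models -mgeneral-regs-only: drop the SSE ISAs, but keep the GPR-only
   instructions they would have implied, unless the user already made
   an explicit choice about them.  */
void
handle_general_regs_only (struct opts *o)
{
  assert (o != NULL);
  unsigned keep = 0;

  if (o->flags & ISA_SSE4_2)
    {
      if ((o->explicit_flags & ISA_CRC32) == 0)
        keep |= ISA_CRC32;
      if ((o->explicit_flags & ISA_POPCNT) == 0)
        keep |= ISA_POPCNT;
    }
  if (o->flags & ISA_SSE3)
    {
      if ((o->explicit_flags & ISA_MWAIT) == 0)
        keep |= ISA_MWAIT;
    }

  /* Disable the non-GPR ISAs, then re-enable what was collected.  */
  o->flags &= ~(unsigned) ISA_NON_GPR;
  o->flags |= keep;
  /* Mark everything touched here as explicitly decided.  */
  o->explicit_flags |= ISA_NON_GPR | keep;
}
```

Starting from { ISA_SSE3, 0 } mirrors the options of pr101492-3.c, and
starting from { ISA_SSE3, ISA_MWAIT } mirrors pr101492-4.c, where
-mno-mwait was given first and MWAIT therefore stays off.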
gcc/
PR target/101492
* common/config/i386/i386-common.c (ix86_handle_option): For
-mgeneral-regs-only, enable the GPR only instructions which are
enabled implicitly by SSE ISAs unless they have been disabled
explicitly.
gcc/testsuite/
PR target/101492
* gcc.target/i386/pr101492-1.c: New test.
* gcc.target/i386/pr101492-2.c: Likewise.
* gcc.target/i386/pr101492-3.c: Likewise.
* gcc.target/i386/pr101492-4.c: Likewise.
(cherry picked from commit 6ae8aac19cdbdbd96d90f86e4d8505fe121bdf06)
---
gcc/common/config/i386/i386-common.c | 30 ++++++++++++++++++++--
gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 ++++++++
gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 ++++++++
gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 ++++++++
gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 +++++++++
5 files changed, 70 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c
diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index e156cc34584..38dbb9d9263 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -354,16 +354,42 @@ ix86_handle_option (struct gcc_options *opts,
case OPT_mgeneral_regs_only:
if (value)
{
+ HOST_WIDE_INT general_regs_only_flags = 0;
+ HOST_WIDE_INT general_regs_only_flags2 = 0;
+
+ /* NB: Enable the GPR only instructions which are enabled
+ implicitly by SSE ISAs unless they have been disabled
+ explicitly. */
+ if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags))
+ {
+ if ((opts->x_ix86_isa_flags_explicit
+ & OPTION_MASK_ISA_CRC32) == 0)
+ general_regs_only_flags |= OPTION_MASK_ISA_CRC32;
+ if ((opts->x_ix86_isa_flags_explicit
+ & OPTION_MASK_ISA_POPCNT) == 0)
+ general_regs_only_flags |= OPTION_MASK_ISA_POPCNT;
+ }
+ if (TARGET_SSE3_P (opts->x_ix86_isa_flags))
+ {
+ if ((opts->x_ix86_isa_flags2_explicit
+ & OPTION_MASK_ISA2_MWAIT) == 0)
+ general_regs_only_flags2 |= OPTION_MASK_ISA2_MWAIT;
+ }
+
/* Disable MMX, SSE and x87 instructions if only
general registers are allowed. */
opts->x_ix86_isa_flags
&= ~OPTION_MASK_ISA_GENERAL_REGS_ONLY_UNSET;
opts->x_ix86_isa_flags2
&= ~OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET;
+ opts->x_ix86_isa_flags |= general_regs_only_flags;
+ opts->x_ix86_isa_flags2 |= general_regs_only_flags2;
opts->x_ix86_isa_flags_explicit
- |= OPTION_MASK_ISA_GENERAL_REGS_ONLY_UNSET;
+ |= (OPTION_MASK_ISA_GENERAL_REGS_ONLY_UNSET
+ | general_regs_only_flags);
opts->x_ix86_isa_flags2_explicit
- |= OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET;
+ |= (OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET
+ | general_regs_only_flags2);
opts->x_target_flags &= ~MASK_80387;
}
diff --git a/gcc/testsuite/gcc.target/i386/pr101492-1.c b/gcc/testsuite/gcc.target/i386/pr101492-1.c
new file mode 100644
index 00000000000..41002571761
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101492-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4.2 -mgeneral-regs-only" } */
+
+#include <x86intrin.h>
+
+unsigned int
+foo1 (unsigned int x, unsigned int y)
+{
+ return __crc32d (x, y);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr101492-2.c b/gcc/testsuite/gcc.target/i386/pr101492-2.c
new file mode 100644
index 00000000000..c7d24f43c39
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101492-2.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4.2 -mgeneral-regs-only" } */
+
+#include <x86intrin.h>
+
+unsigned int
+foo1 (unsigned int x)
+{
+ return _mm_popcnt_u32 (x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr101492-3.c b/gcc/testsuite/gcc.target/i386/pr101492-3.c
new file mode 100644
index 00000000000..37e2071ab57
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101492-3.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse3 -mgeneral-regs-only" } */
+
+#include <x86intrin.h>
+
+void
+foo1 (unsigned int x, unsigned int y)
+{
+ _mm_mwait (x, y);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr101492-4.c b/gcc/testsuite/gcc.target/i386/pr101492-4.c
new file mode 100644
index 00000000000..c5a4f0abd25
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101492-4.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-mwait -msse3 -mgeneral-regs-only" } */
+
+#include <x86intrin.h>
+
+void
+foo1 (unsigned int x, unsigned int y)
+{
+ _mm_mwait (x, y);
+}
+
+/* { dg-error "target specific option mismatch" "" { target *-*-* } 0 } */
--
2.31.1
* [PATCH 5/5] <x86gprintrin.h>: Add pragma GCC target("general-regs-only")
2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu
` (3 preceding siblings ...)
2021-08-13 13:51 ` [PATCH 4/5] x86: Enable the GPR only instructions for -mgeneral-regs-only H.J. Lu
@ 2021-08-13 13:51 ` H.J. Lu
2021-08-16 6:11 ` [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only Richard Biener
5 siblings, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2021-08-13 13:51 UTC (permalink / raw)
To: gcc-patches; +Cc: Uros Bizjak, Jakub Jelinek, Richard Biener
1. Intrinsics in <x86gprintrin.h> only require GPR ISAs. Add
#if defined __MMX__ || defined __SSE__
#pragma GCC push_options
#pragma GCC target("general-regs-only")
#define __DISABLE_GENERAL_REGS_ONLY__
#endif
and
#ifdef __DISABLE_GENERAL_REGS_ONLY__
#undef __DISABLE_GENERAL_REGS_ONLY__
#pragma GCC pop_options
#endif /* __DISABLE_GENERAL_REGS_ONLY__ */
to <x86gprintrin.h> to disable non-GPR ISAs so that the intrinsics can be
used in functions with __attribute__ ((target("general-regs-only"))).
2. When checking the always_inline attribute, if the callee only uses GPRs,
ignore MASK_80387, since enabling MASK_80387 in the caller has no impact on
inlining the callee.
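As a rough illustration of point 1, the sketch below calls two GPR-only
intrinsics from functions marked with the general-regs-only target
attribute. It assumes an x86 compiler that already contains this series;
the wrapper names swap_bytes and rotate_left are hypothetical and exist
only for this example, while __bswapd and __rold are among the intrinsics
exercised by pr99744-4.c.

```c
/* Sketch only: assumes an x86 target and a compiler with this series.  */
#include <x86gprintrin.h>

/* Hypothetical wrapper: byte-swaps a 32-bit value using only GPRs.  */
__attribute__ ((target ("general-regs-only")))
unsigned int
swap_bytes (unsigned int x)
{
  return __bswapd (x);
}

/* Hypothetical wrapper: rotates a 32-bit value left by n bits.  */
__attribute__ ((target ("general-regs-only")))
unsigned int
rotate_left (unsigned int x, int n)
{
  return __rold (x, n);
}
```

Both wrappers rely on the always_inline intrinsics being inlinable into a
caller that has x87 and vector ISAs disabled, which is what the pragma in
<x86gprintrin.h> is meant to allow.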
gcc/
PR target/99744
* config/i386/i386.c (ix86_can_inline_p): Ignore MASK_80387 if
callee only uses GPRs.
* config/i386/ia32intrin.h: Revert commit 5463cee2770.
* config/i386/serializeintrin.h: Revert commit 71958f740f1.
* config/i386/x86gprintrin.h: Add
#pragma GCC target("general-regs-only") and #pragma GCC pop_options
to disable non-GPR ISAs.
gcc/testsuite/
PR target/99744
* gcc.target/i386/pr99744-3.c: New test.
* gcc.target/i386/pr99744-4.c: Likewise.
* gcc.target/i386/pr99744-5.c: Likewise.
* gcc.target/i386/pr99744-6.c: Likewise.
* gcc.target/i386/pr99744-7.c: Likewise.
* gcc.target/i386/pr99744-8.c: Likewise.
(cherry picked from commit 72264a639729a5dcc21dbee304717ce22b338bfd)
---
gcc/config/i386/i386.c | 6 +-
gcc/config/i386/ia32intrin.h | 14 +-
gcc/config/i386/serializeintrin.h | 7 +-
gcc/config/i386/x86gprintrin.h | 11 +
gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 +
gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 ++++++++++++++++++++++
gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++
gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++
gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 +
gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 +
10 files changed, 477 insertions(+), 4 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 5a7bc8c44a8..527d493ecae 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -553,7 +553,7 @@ ix86_can_inline_p (tree caller, tree callee)
/* Changes of those flags can be tolerated for always inlines. Lets hope
user knows what he is doing. */
- const unsigned HOST_WIDE_INT always_inline_safe_mask
+ unsigned HOST_WIDE_INT always_inline_safe_mask
= (MASK_USE_8BIT_IDIV | MASK_ACCUMULATE_OUTGOING_ARGS
| MASK_NO_ALIGN_STRINGOPS | MASK_AVX256_SPLIT_UNALIGNED_LOAD
| MASK_AVX256_SPLIT_UNALIGNED_STORE | MASK_CLD
@@ -578,6 +578,10 @@ ix86_can_inline_p (tree caller, tree callee)
&& lookup_attribute ("always_inline",
DECL_ATTRIBUTES (callee)));
+ /* If callee only uses GPRs, ignore MASK_80387. */
+ if (TARGET_GENERAL_REGS_ONLY_P (callee_opts->x_ix86_target_flags))
+ always_inline_safe_mask |= MASK_80387;
+
cgraph_node *callee_node = cgraph_node::get (callee);
/* Callee's isa options should be a subset of the caller's, i.e. a SSE4
function can inline a SSE2 function but a SSE2 function can't inline
diff --git a/gcc/config/i386/ia32intrin.h b/gcc/config/i386/ia32intrin.h
index 5422b0fc9e0..df99220ee4f 100644
--- a/gcc/config/i386/ia32intrin.h
+++ b/gcc/config/i386/ia32intrin.h
@@ -107,12 +107,22 @@ __rdpmc (int __S)
#endif /* __iamcu__ */
/* rdtsc */
-#define __rdtsc() __builtin_ia32_rdtsc ()
+extern __inline unsigned long long
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__rdtsc (void)
+{
+ return __builtin_ia32_rdtsc ();
+}
#ifndef __iamcu__
/* rdtscp */
-#define __rdtscp(a) __builtin_ia32_rdtscp (a)
+extern __inline unsigned long long
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__rdtscp (unsigned int *__A)
+{
+ return __builtin_ia32_rdtscp (__A);
+}
#endif /* __iamcu__ */
diff --git a/gcc/config/i386/serializeintrin.h b/gcc/config/i386/serializeintrin.h
index e280250b198..89b5b94ea9b 100644
--- a/gcc/config/i386/serializeintrin.h
+++ b/gcc/config/i386/serializeintrin.h
@@ -34,7 +34,12 @@
#define __DISABLE_SERIALIZE__
#endif /* __SERIALIZE__ */
-#define _serialize() __builtin_ia32_serialize ()
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_serialize (void)
+{
+ __builtin_ia32_serialize ();
+}
#ifdef __DISABLE_SERIALIZE__
#undef __DISABLE_SERIALIZE__
diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
index 7793032ba90..b7fefa780a6 100644
--- a/gcc/config/i386/x86gprintrin.h
+++ b/gcc/config/i386/x86gprintrin.h
@@ -24,6 +24,12 @@
#ifndef _X86GPRINTRIN_H_INCLUDED
#define _X86GPRINTRIN_H_INCLUDED
+#if defined __MMX__ || defined __SSE__
+#pragma GCC push_options
+#pragma GCC target("general-regs-only")
+#define __DISABLE_GENERAL_REGS_ONLY__
+#endif
+
#include <ia32intrin.h>
#ifndef __iamcu__
@@ -255,4 +261,9 @@ _ptwrite32 (unsigned __B)
#endif /* __iamcu__ */
+#ifdef __DISABLE_GENERAL_REGS_ONLY__
+#undef __DISABLE_GENERAL_REGS_ONLY__
+#pragma GCC pop_options
+#endif /* __DISABLE_GENERAL_REGS_ONLY__ */
+
#endif /* _X86GPRINTRIN_H_INCLUDED. */
diff --git a/gcc/testsuite/gcc.target/i386/pr99744-3.c b/gcc/testsuite/gcc.target/i386/pr99744-3.c
new file mode 100644
index 00000000000..6c505816ceb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99744-3.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-serialize" } */
+
+#include <x86intrin.h>
+
+__attribute__ ((target("general-regs-only")))
+void
+foo1 (void)
+{
+ _serialize ();
+}
+
+/* { dg-error "target specific option mismatch" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/i386/pr99744-4.c b/gcc/testsuite/gcc.target/i386/pr99744-4.c
new file mode 100644
index 00000000000..9196e62d955
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99744-4.c
@@ -0,0 +1,357 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mbmi -mbmi2 -mcldemote -mclflushopt -mclwb -mclzero -mcrc32 -menqcmd -mfsgsbase -mfxsr -mhreset -mlzcnt -mlwp -mmovdir64b -mmovdiri -mmwaitx -mpconfig -mpku -mpopcnt -mptwrite -mrdpid -mrdrnd -mrdseed -mrtm -msgx -mshstk -mtbm -mtsxldtrk -mxsave -mxsavec -mxsaveopt -mxsaves -mwaitpkg -mwbnoinvd" } */
+/* { dg-additional-options "-muintr" { target { ! ia32 } } } */
+
+/* Test calling GPR intrinsics from functions with general-regs-only
+ target attribute. */
+
+#include <x86gprintrin.h>
+
+#define _CONCAT(x,y) x ## y
+
+#define test_0(func, type) \
+ __attribute__ ((target("general-regs-only"))) \
+ type _CONCAT(do_,func) (void) \
+ { return func (); }
+
+#define test_0_i1(func, type, imm) \
+ __attribute__ ((target("general-regs-only"))) \
+ type _CONCAT(do_,func) (void) \
+ { return func (imm); }
+
+#define test_1(func, type, op1_type) \
+ __attribute__ ((target("general-regs-only"))) \
+ type _CONCAT(do_,func) (op1_type A) \
+ { return func (A); }
+
+#define test_1_i1(func, type, op1_type, imm) \
+ __attribute__ ((target("general-regs-only"))) \
+ type _CONCAT(do_,func) (op1_type A) \
+ { return func (A, imm); }
+
+#define test_2(func, type, op1_type, op2_type) \
+ __attribute__ ((target("general-regs-only"))) \
+ type _CONCAT(do_,func) (op1_type A, op2_type B) \
+ { return func (A, B); }
+
+#define test_2_i1(func, type, op1_type, op2_type, imm) \
+ __attribute__ ((target("general-regs-only"))) \
+ type _CONCAT(do_,func) (op1_type A, op2_type B) \
+ { return func (A, B, imm); }
+
+#define test_3(func, type, op1_type, op2_type, op3_type) \
+ __attribute__ ((target("general-regs-only"))) \
+ type _CONCAT(do_,func) (op1_type A, op2_type B, op3_type C) \
+ { return func (A, B, C); }
+
+#define test_4(func, type, op1_type, op2_type, op3_type, op4_type) \
+ __attribute__ ((target("general-regs-only"))) \
+ type _CONCAT(do_,func) (op1_type A, op2_type B, op3_type C, \
+ op4_type D) \
+ { return func (A, B, C, D); }
+
+/* ia32intrin.h */
+test_1 (__bsfd, int, int)
+test_1 (__bsrd, int, int)
+test_1 (__bswapd, int, int)
+test_1 (__popcntd, int, unsigned int)
+test_2 (__rolb, unsigned char, unsigned char, int)
+test_2 (__rolw, unsigned short, unsigned short, int)
+test_2 (__rold, unsigned int, unsigned int, int)
+test_2 (__rorb, unsigned char, unsigned char, int)
+test_2 (__rorw, unsigned short, unsigned short, int)
+test_2 (__rord, unsigned int, unsigned int, int)
+
+#ifndef __iamcu__
+/* adxintrin.h */
+test_4 (_subborrow_u32, unsigned char, unsigned char, unsigned int,
+ unsigned int, unsigned int *)
+test_4 (_addcarry_u32, unsigned char, unsigned char, unsigned int,
+ unsigned int, unsigned int *)
+test_4 (_addcarryx_u32, unsigned char, unsigned char, unsigned int,
+ unsigned int, unsigned int *)
+
+/* bmiintrin.h */
+test_1 (__tzcnt_u16, unsigned short, unsigned short)
+test_2 (__andn_u32, unsigned int, unsigned int, unsigned int)
+test_2 (__bextr_u32, unsigned int, unsigned int, unsigned int)
+test_3 (_bextr_u32, unsigned int, unsigned int, unsigned int,
+ unsigned int)
+test_1 (__blsi_u32, unsigned int, unsigned int)
+test_1 (_blsi_u32, unsigned int, unsigned int)
+test_1 (__blsmsk_u32, unsigned int, unsigned int)
+test_1 (_blsmsk_u32, unsigned int, unsigned int)
+test_1 (__blsr_u32, unsigned int, unsigned int)
+test_1 (_blsr_u32, unsigned int, unsigned int)
+test_1 (__tzcnt_u32, unsigned int, unsigned int)
+test_1 (_tzcnt_u32, unsigned int, unsigned int)
+
+/* bmi2intrin.h */
+test_2 (_bzhi_u32, unsigned int, unsigned int, unsigned int)
+test_2 (_pdep_u32, unsigned int, unsigned int, unsigned int)
+test_2 (_pext_u32, unsigned int, unsigned int, unsigned int)
+
+/* cetintrin.h */
+test_1 (_inc_ssp, void, unsigned int)
+test_0 (_saveprevssp, void)
+test_1 (_rstorssp, void, void *)
+test_2 (_wrssd, void, unsigned int, void *)
+test_2 (_wrussd, void, unsigned int, void *)
+test_0 (_setssbsy, void)
+test_1 (_clrssbsy, void, void *)
+
+/* cldemoteintrin.h */
+test_1 (_cldemote, void, void *)
+
+/* clflushoptintrin.h */
+test_1 (_mm_clflushopt, void, void *)
+
+/* clwbintrin.h */
+test_1 (_mm_clwb, void, void *)
+
+/* clzerointrin.h */
+test_1 (_mm_clzero, void, void *)
+
+/* enqcmdintrin.h */
+test_2 (_enqcmd, int, void *, const void *)
+test_2 (_enqcmds, int, void *, const void *)
+
+/* fxsrintrin.h */
+test_1 (_fxsave, void, void *)
+test_1 (_fxrstor, void, void *)
+
+/* hresetintrin.h */
+test_1 (_hreset, void, unsigned int)
+
+/* ia32intrin.h */
+test_2 (__crc32b, unsigned int, unsigned char, unsigned char)
+test_2 (__crc32w, unsigned int, unsigned short, unsigned short)
+test_2 (__crc32d, unsigned int, unsigned int, unsigned int)
+test_1 (__rdpmc, unsigned long long, int)
+test_0 (__rdtsc, unsigned long long)
+test_1 (__rdtscp, unsigned long long, unsigned int *)
+test_0 (__pause, void)
+
+/* lzcntintrin.h */
+test_1 (__lzcnt16, unsigned short, unsigned short)
+test_1 (__lzcnt32, unsigned int, unsigned int)
+test_1 (_lzcnt_u32, unsigned int, unsigned int)
+
+/* lwpintrin.h */
+test_1 (__llwpcb, void, void *)
+test_0 (__slwpcb, void *)
+test_2_i1 (__lwpval32, void, unsigned int, unsigned int, 1)
+test_2_i1 (__lwpins32, unsigned char, unsigned int, unsigned int, 1)
+
+/* movdirintrin.h */
+test_2 (_directstoreu_u32, void, void *, unsigned int)
+test_2 (_movdir64b, void, void *, const void *)
+
+/* mwaitxintrin.h */
+test_3 (_mm_monitorx, void, void const *, unsigned int, unsigned int)
+test_3 (_mm_mwaitx, void, unsigned int, unsigned int, unsigned int)
+
+/* pconfigintrin.h */
+test_2 (_pconfig_u32, unsigned int, const unsigned int, size_t *)
+
+/* pkuintrin.h */
+test_0 (_rdpkru_u32, unsigned int)
+test_1 (_wrpkru, void, unsigned int)
+
+/* popcntintrin.h */
+test_1 (_mm_popcnt_u32, int, unsigned int)
+
+/* rdseedintrin.h */
+test_1 (_rdseed16_step, int, unsigned short *)
+test_1 (_rdseed32_step, int, unsigned int *)
+
+/* rtmintrin.h */
+test_0 (_xbegin, unsigned int)
+test_0 (_xend, void)
+test_0_i1 (_xabort, void, 1)
+
+/* sgxintrin.h */
+test_2 (_encls_u32, unsigned int, const unsigned int, size_t *)
+test_2 (_enclu_u32, unsigned int, const unsigned int, size_t *)
+test_2 (_enclv_u32, unsigned int, const unsigned int, size_t *)
+
+/* tbmintrin.h */
+test_1_i1 (__bextri_u32, unsigned int, unsigned int, 1)
+test_1 (__blcfill_u32, unsigned int, unsigned int)
+test_1 (__blci_u32, unsigned int, unsigned int)
+test_1 (__blcic_u32, unsigned int, unsigned int)
+test_1 (__blcmsk_u32, unsigned int, unsigned int)
+test_1 (__blcs_u32, unsigned int, unsigned int)
+test_1 (__blsfill_u32, unsigned int, unsigned int)
+test_1 (__blsic_u32, unsigned int, unsigned int)
+test_1 (__t1mskc_u32, unsigned int, unsigned int)
+test_1 (__tzmsk_u32, unsigned int, unsigned int)
+
+/* tsxldtrkintrin.h */
+test_0 (_xsusldtrk, void)
+test_0 (_xresldtrk, void)
+
+/* x86gprintrin.h */
+test_1 (_ptwrite32, void, unsigned int)
+test_1 (_rdrand16_step, int, unsigned short *)
+test_1 (_rdrand32_step, int, unsigned int *)
+test_0 (_wbinvd, void)
+
+/* xtestintrin.h */
+test_0 (_xtest, int)
+
+/* xsaveintrin.h */
+test_2 (_xsave, void, void *, long long)
+test_2 (_xrstor, void, void *, long long)
+test_2 (_xsetbv, void, unsigned int, long long)
+test_1 (_xgetbv, long long, unsigned int)
+
+/* xsavecintrin.h */
+test_2 (_xsavec, void, void *, long long)
+
+/* xsaveoptintrin.h */
+test_2 (_xsaveopt, void, void *, long long)
+
+/* xsavesintrin.h */
+test_2 (_xsaves, void, void *, long long)
+test_2 (_xrstors, void, void *, long long)
+
+/* wbnoinvdintrin.h */
+test_0 (_wbnoinvd, void)
+
+#ifdef __x86_64__
+/* adxintrin.h */
+test_4 (_subborrow_u64, unsigned char, unsigned char,
+ unsigned long long, unsigned long long,
+ unsigned long long *)
+test_4 (_addcarry_u64, unsigned char, unsigned char,
+ unsigned long long, unsigned long long,
+ unsigned long long *)
+test_4 (_addcarryx_u64, unsigned char, unsigned char,
+ unsigned long long, unsigned long long,
+ unsigned long long *)
+
+/* bmiintrin.h */
+test_2 (__andn_u64, unsigned long long, unsigned long long,
+ unsigned long long)
+test_2 (__bextr_u64, unsigned long long, unsigned long long,
+ unsigned long long)
+test_3 (_bextr_u64, unsigned long long, unsigned long long,
+ unsigned long long, unsigned long long)
+test_1 (__blsi_u64, unsigned long long, unsigned long long)
+test_1 (_blsi_u64, unsigned long long, unsigned long long)
+test_1 (__blsmsk_u64, unsigned long long, unsigned long long)
+test_1 (_blsmsk_u64, unsigned long long, unsigned long long)
+test_1 (__blsr_u64, unsigned long long, unsigned long long)
+test_1 (_blsr_u64, unsigned long long, unsigned long long)
+test_1 (__tzcnt_u64, unsigned long long, unsigned long long)
+test_1 (_tzcnt_u64, unsigned long long, unsigned long long)
+
+/* bmi2intrin.h */
+test_2 (_bzhi_u64, unsigned long long, unsigned long long,
+ unsigned long long)
+test_2 (_pdep_u64, unsigned long long, unsigned long long,
+ unsigned long long)
+test_2 (_pext_u64, unsigned long long, unsigned long long,
+ unsigned long long)
+test_3 (_mulx_u64, unsigned long long, unsigned long long,
+ unsigned long long, unsigned long long *)
+
+/* cetintrin.h */
+test_0 (_get_ssp, unsigned long long)
+test_2 (_wrssq, void, unsigned long long, void *)
+test_2 (_wrussq, void, unsigned long long, void *)
+
+/* fxsrintrin.h */
+test_1 (_fxsave64, void, void *)
+test_1 (_fxrstor64, void, void *)
+
+/* ia32intrin.h */
+test_1 (__bsfq, int, long long)
+test_1 (__bsrq, int, long long)
+test_1 (__bswapq, long long, long long)
+test_2 (__crc32q, unsigned long long, unsigned long long,
+ unsigned long long)
+test_1 (__popcntq, long long, unsigned long long)
+test_2 (__rolq, unsigned long long, unsigned long long, int)
+test_2 (__rorq, unsigned long long, unsigned long long, int)
+test_0 (__readeflags, unsigned long long)
+test_1 (__writeeflags, void, unsigned int)
+
+/* lzcntintrin.h */
+test_1 (__lzcnt64, unsigned long long, unsigned long long)
+test_1 (_lzcnt_u64, unsigned long long, unsigned long long)
+
+/* lwpintrin.h */
+test_2_i1 (__lwpval64, void, unsigned long long, unsigned int, 1)
+test_2_i1 (__lwpins64, unsigned char, unsigned long long,
+ unsigned int, 1)
+
+/* movdirintrin.h */
+test_2 (_directstoreu_u64, void, void *, unsigned long long)
+
+/* popcntintrin.h */
+test_1 (_mm_popcnt_u64, long long, unsigned long long)
+
+/* rdseedintrin.h */
+test_1 (_rdseed64_step, int, unsigned long long *)
+
+/* tbmintrin.h */
+test_1_i1 (__bextri_u64, unsigned long long, unsigned long long, 1)
+test_1 (__blcfill_u64, unsigned long long, unsigned long long)
+test_1 (__blci_u64, unsigned long long, unsigned long long)
+test_1 (__blcic_u64, unsigned long long, unsigned long long)
+test_1 (__blcmsk_u64, unsigned long long, unsigned long long)
+test_1 (__blcs_u64, unsigned long long, unsigned long long)
+test_1 (__blsfill_u64, unsigned long long, unsigned long long)
+test_1 (__blsic_u64, unsigned long long, unsigned long long)
+test_1 (__t1mskc_u64, unsigned long long, unsigned long long)
+test_1 (__tzmsk_u64, unsigned long long, unsigned long long)
+
+/* uintrintrin.h */
+test_0 (_clui, void)
+test_1 (_senduipi, void, unsigned long long)
+test_0 (_stui, void)
+test_0 (_testui, unsigned char)
+
+/* x86gprintrin.h */
+test_1 (_ptwrite64, void, unsigned long long)
+test_0 (_readfsbase_u32, unsigned int)
+test_0 (_readfsbase_u64, unsigned long long)
+test_0 (_readgsbase_u32, unsigned int)
+test_0 (_readgsbase_u64, unsigned long long)
+test_1 (_rdrand64_step, int, unsigned long long *)
+test_1 (_writefsbase_u32, void, unsigned int)
+test_1 (_writefsbase_u64, void, unsigned long long)
+test_1 (_writegsbase_u32, void, unsigned int)
+test_1 (_writegsbase_u64, void, unsigned long long)
+
+/* xsaveintrin.h */
+test_2 (_xsave64, void, void *, long long)
+test_2 (_xrstor64, void, void *, long long)
+
+/* xsavecintrin.h */
+test_2 (_xsavec64, void, void *, long long)
+
+/* xsaveoptintrin.h */
+test_2 (_xsaveopt64, void, void *, long long)
+
+/* xsavesintrin.h */
+test_2 (_xsaves64, void, void *, long long)
+test_2 (_xrstors64, void, void *, long long)
+
+/* waitpkgintrin.h */
+test_1 (_umonitor, void, void *)
+test_2 (_umwait, unsigned char, unsigned int, unsigned long long)
+test_2 (_tpause, unsigned char, unsigned int, unsigned long long)
+
+#else /* !__x86_64__ */
+/* bmi2intrin.h */
+test_3 (_mulx_u32, unsigned int, unsigned int, unsigned int,
+ unsigned int *)
+
+/* cetintrin.h */
+test_0 (_get_ssp, unsigned int)
+#endif /* __x86_64__ */
+
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/pr99744-5.c b/gcc/testsuite/gcc.target/i386/pr99744-5.c
new file mode 100644
index 00000000000..9e40e5ef428
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99744-5.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mmwait" } */
+
+/* Test calling MWAIT intrinsics from functions with general-regs-only
+ target attribute. */
+
+#include <x86gprintrin.h>
+
+#define _CONCAT(x,y) x ## y
+
+#define test_2(func, type, op1_type, op2_type) \
+ __attribute__ ((target("general-regs-only"))) \
+ type _CONCAT(do_,func) (op1_type A, op2_type B) \
+ { return func (A, B); }
+
+#define test_3(func, type, op1_type, op2_type, op3_type) \
+ __attribute__ ((target("general-regs-only"))) \
+ type _CONCAT(do_,func) (op1_type A, op2_type B, op3_type C) \
+ { return func (A, B, C); }
+
+#ifndef __iamcu__
+/* mwaitintrin.h */
+test_3 (_mm_monitor, void, void const *, unsigned int, unsigned int)
+test_2 (_mm_mwait, void, unsigned int, unsigned int)
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/pr99744-6.c b/gcc/testsuite/gcc.target/i386/pr99744-6.c
new file mode 100644
index 00000000000..4025918a9c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99744-6.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+#include <x86intrin.h>
+
+extern unsigned long long int curr_deadline;
+extern void bar (void);
+
+void
+foo1 (void)
+{
+ if (__rdtsc () < curr_deadline)
+ return;
+ bar ();
+}
+
+void
+foo2 (unsigned int *p)
+{
+ if (__rdtscp (p) < curr_deadline)
+ return;
+ bar ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr99744-7.c b/gcc/testsuite/gcc.target/i386/pr99744-7.c
new file mode 100644
index 00000000000..30b7ca05966
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99744-7.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mno-avx -Wno-psabi" } */
+
+#include <x86intrin.h>
+
+void
+foo (__m256 *x)
+{
+ x[0] = _mm256_sub_ps (x[1], x[2]);
+}
+
+/* { dg-error "target specific option mismatch" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/i386/pr99744-8.c b/gcc/testsuite/gcc.target/i386/pr99744-8.c
new file mode 100644
index 00000000000..115183eede6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99744-8.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O -Wno-psabi" } */
+
+#include <x86intrin.h>
+
+__attribute__((target ("no-avx")))
+void
+foo (__m256 *x)
+{
+ x[0] = _mm256_sub_ps (x[1], x[2]);
+}
+
+/* { dg-error "target specific option mismatch" "" { target *-*-* } 0 } */
--
2.31.1
* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
2021-08-13 13:50 ` [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only H.J. Lu
@ 2021-08-16 6:11 ` Richard Biener
2021-08-16 12:25 ` H.J. Lu
0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2021-08-16 6:11 UTC (permalink / raw)
To: H.J. Lu; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek
On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
> -mgeneral-regs-only and make -msse3 imply -mmwait.
Adding new options requires bumping the LTO streaming minor version
(I know we already forgot this once on the branch when adding a new --param).
Please take care of this when backporting.
Richard.
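
For readers following the quoted patch, the way -msse3 implies -mmwait
without overriding an explicit -mno-mwait can be modelled in isolation.
This is a simplified sketch, not the GCC implementation: the struct, the
function names and the bit values are all hypothetical stand-ins for
x_ix86_isa_flags, x_ix86_isa_flags2 and x_ix86_isa_flags2_explicit.

```c
#include <stdint.h>

/* Hypothetical stand-ins for GCC's option bits; values are illustrative.  */
#define ISA_SSE3   (UINT64_C (1) << 0)
#define ISA2_MWAIT (UINT64_C (1) << 1)

struct opts
{
  uint64_t isa_flags;            /* models x_ix86_isa_flags */
  uint64_t isa_flags2;           /* models x_ix86_isa_flags2 */
  uint64_t isa_flags2_explicit;  /* models x_ix86_isa_flags2_explicit */
};

/* Models -mmwait (value != 0) or -mno-mwait (value == 0): set or clear
   the bit, and record that the user chose it explicitly.  */
void
handle_mmwait (struct opts *o, int value)
{
  if (value)
    o->isa_flags2 |= ISA2_MWAIT;
  else
    o->isa_flags2 &= ~ISA2_MWAIT;
  o->isa_flags2_explicit |= ISA2_MWAIT;
}

/* Models the override step: SSE3 turns MWAIT on only when the user has
   not already made an explicit choice for that bit.  */
void
override_options (struct opts *o)
{
  if (o->isa_flags & ISA_SSE3)
    o->isa_flags2 |= ISA2_MWAIT & ~o->isa_flags2_explicit;
}
```

Under this model, SSE3 alone enables MWAIT, while an explicit -mno-mwait
followed by -msse3 leaves it disabled, which matches the mismatch error
expected by pr101492-4.c.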
> gcc/
>
> * config.gcc: Install mwaitintrin.h for i[34567]86-*-* and
> x86_64-*-* targets.
> * common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET):
> New.
> (OPTION_MASK_ISA2_MWAIT_UNSET): Likewise.
> (ix86_handle_option): Handle -mmwait.
> * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins):
> Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on
> __builtin_ia32_monitor and __builtin_ia32_mwait.
> * config/i386/i386-options.c (isa2_opts): Add -mmwait.
> (ix86_valid_target_attribute_inner_p): Likewise.
> (ix86_option_override_internal): Enable mwait/monitor
> instructions for -msse3.
> * config/i386/i386.h (TARGET_MWAIT): New.
> (TARGET_MWAIT_P): Likewise.
> * config/i386/i386.opt: Add -mmwait.
> * config/i386/mwaitintrin.h: New file.
> * config/i386/pmmintrin.h: Include <mwaitintrin.h>.
> * config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with
> TARGET_MWAIT.
> (@sse3_monitor_<mode>): Likewise.
> * config/i386/x86gprintrin.h: Include <mwaitintrin.h>.
> * doc/extend.texi: Document mwait target attribute.
> * doc/invoke.texi: Document -mmwait.
>
> gcc/testsuite/
>
> * gcc.target/i386/monitor-2.c: New test.
>
> (cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922)
> ---
> gcc/common/config/i386/i386-common.c | 15 +++++++
> gcc/config.gcc | 6 ++-
> gcc/config/i386/i386-builtins.c | 4 +-
> gcc/config/i386/i386-options.c | 7 +++
> gcc/config/i386/i386.h | 2 +
> gcc/config/i386/i386.opt | 4 ++
> gcc/config/i386/mwaitintrin.h | 52 +++++++++++++++++++++++
> gcc/config/i386/pmmintrin.h | 13 +-----
> gcc/config/i386/sse.md | 4 +-
> gcc/config/i386/x86gprintrin.h | 2 +
> gcc/doc/extend.texi | 5 +++
> gcc/doc/invoke.texi | 8 +++-
> gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++
> 13 files changed, 130 insertions(+), 19 deletions(-)
> create mode 100644 gcc/config/i386/mwaitintrin.h
> create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
>
> diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
> index 6a7b5c8312f..e156cc34584 100644
> --- a/gcc/common/config/i386/i386-common.c
> +++ b/gcc/common/config/i386/i386-common.c
> @@ -150,6 +150,7 @@ along with GCC; see the file COPYING3. If not see
> #define OPTION_MASK_ISA_F16C_SET \
> (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET)
> #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX
> +#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT
> #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO
> #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU
> #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID
> @@ -245,6 +246,7 @@ along with GCC; see the file COPYING3. If not see
> #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
> #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
> #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX
> +#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT
> #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO
> #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU
> #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID
> @@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts,
> }
> return true;
>
> + case OPT_mmwait:
> + if (value)
> + {
> + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET;
> + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET;
> + }
> + else
> + {
> + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET;
> + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET;
> + }
> + return true;
> +
> case OPT_mclzero:
> if (value)
> {
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 357b0bed067..a020e0808c9 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -414,7 +414,8 @@ i[34567]86-*-*)
> avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
> tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
> amxbf16intrin.h x86gprintrin.h uintrintrin.h
> - hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> + hresetintrin.h keylockerintrin.h avxvnniintrin.h
> + mwaitintrin.h"
> ;;
> x86_64-*-*)
> cpu_type=i386
> @@ -451,7 +452,8 @@ x86_64-*-*)
> avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
> tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
> amxbf16intrin.h x86gprintrin.h uintrintrin.h
> - hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> + hresetintrin.h keylockerintrin.h avxvnniintrin.h
> + mwaitintrin.h"
> ;;
> ia64-*-*)
> extra_headers=ia64intrin.h
> diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
> index 4fcdf4b89ee..128bd39816c 100644
> --- a/gcc/config/i386/i386-builtins.c
> +++ b/gcc/config/i386/i386-builtins.c
> @@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void)
> VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE);
>
> /* SSE3. */
> - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor",
> + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor",
> VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR);
> - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait",
> + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait",
> VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT);
>
> /* AES */
> diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> index 18d2c0b9f99..7ecd0cf8b8c 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] =
> { "-mmovbe", OPTION_MASK_ISA2_MOVBE },
> { "-mclzero", OPTION_MASK_ISA2_CLZERO },
> { "-mmwaitx", OPTION_MASK_ISA2_MWAITX },
> + { "-mmwait", OPTION_MASK_ISA2_MWAIT },
> { "-mmovdir64b", OPTION_MASK_ISA2_MOVDIR64B },
> { "-mwaitpkg", OPTION_MASK_ISA2_WAITPKG },
> { "-mcldemote", OPTION_MASK_ISA2_CLDEMOTE },
> @@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
> IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase),
> IX86_ATTR_ISA ("rdrnd", OPT_mrdrnd),
> IX86_ATTR_ISA ("mwaitx", OPT_mmwaitx),
> + IX86_ATTR_ISA ("mwait", OPT_mmwait),
> IX86_ATTR_ISA ("clzero", OPT_mclzero),
> IX86_ATTR_ISA ("pku", OPT_mpku),
> IX86_ATTR_ISA ("lwp", OPT_mlwp),
> @@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p,
> || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags))
> ix86_prefetch_sse = true;
>
> + /* Enable mwait/monitor instructions for -msse3. */
> + if (TARGET_SSE3_P (opts->x_ix86_isa_flags))
> + opts->x_ix86_isa_flags2
> + |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit;
> +
> /* Enable popcnt instruction for -msse4.2 or -mabm. */
> if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)
> || TARGET_ABM_P (opts->x_ix86_isa_flags))
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index 5583ec6881a..73e118900f7 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
> #define TARGET_CLWB_P(x) TARGET_ISA_CLWB_P(x)
> #define TARGET_MWAITX TARGET_ISA2_MWAITX
> #define TARGET_MWAITX_P(x) TARGET_ISA2_MWAITX_P(x)
> +#define TARGET_MWAIT TARGET_ISA2_MWAIT
> +#define TARGET_MWAIT_P(x) TARGET_ISA2_MWAIT_P(x)
> #define TARGET_PKU TARGET_ISA_PKU
> #define TARGET_PKU_P(x) TARGET_ISA_PKU_P(x)
> #define TARGET_SHSTK TARGET_ISA_SHSTK
> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> index c781fdc8278..7b8547bb1c3 100644
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation.
> mneeded
> Target Var(ix86_needed) Save
> Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property.
> +
> +mmwait
> +Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save
> +Support MWAIT and MONITOR built-in functions and code generation.
> diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h
> new file mode 100644
> index 00000000000..1ecbc4abb69
> --- /dev/null
> +++ b/gcc/config/i386/mwaitintrin.h
> @@ -0,0 +1,52 @@
> +/* Copyright (C) 2021 Free Software Foundation, Inc.
> +
> + This file is part of GCC.
> +
> + GCC is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3, or (at your option)
> + any later version.
> +
> + GCC is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + Under Section 7 of GPL version 3, you are granted additional
> + permissions described in the GCC Runtime Library Exception, version
> + 3.1, as published by the Free Software Foundation.
> +
> + You should have received a copy of the GNU General Public License and
> + a copy of the GCC Runtime Library Exception along with this program;
> + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
> + <http://www.gnu.org/licenses/>. */
> +
> +#ifndef _MWAITINTRIN_H_INCLUDED
> +#define _MWAITINTRIN_H_INCLUDED
> +
> +#ifndef __MWAIT__
> +#pragma GCC push_options
> +#pragma GCC target("mwait")
> +#define __DISABLE_MWAIT__
> +#endif /* __MWAIT__ */
> +
> +extern __inline void
> +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> +{
> + __builtin_ia32_monitor (__P, __E, __H);
> +}
> +
> +extern __inline void
> +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_mwait (unsigned int __E, unsigned int __H)
> +{
> + __builtin_ia32_mwait (__E, __H);
> +}
> +
> +#ifdef __DISABLE_MWAIT__
> +#undef __DISABLE_MWAIT__
> +#pragma GCC pop_options
> +#endif /* __DISABLE_MWAIT__ */
> +
> +#endif /* _MWAITINTRIN_H_INCLUDED */
> diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h
> index fa9c5bb8b9f..f8102d2be23 100644
> --- a/gcc/config/i386/pmmintrin.h
> +++ b/gcc/config/i386/pmmintrin.h
> @@ -29,6 +29,7 @@
>
> /* We need definitions from the SSE2 and SSE header files*/
> #include <emmintrin.h>
> +#include <mwaitintrin.h>
>
> #ifndef __SSE3__
> #pragma GCC push_options
> @@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P)
> return (__m128i) __builtin_ia32_lddqu ((char const *)__P);
> }
>
> -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> -{
> - __builtin_ia32_monitor (__P, __E, __H);
> -}
> -
> -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_mm_mwait (unsigned int __E, unsigned int __H)
> -{
> - __builtin_ia32_mwait (__E, __H);
> -}
> -
> #ifdef __DISABLE_SSE3__
> #undef __DISABLE_SSE3__
> #pragma GCC pop_options
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 3f81abc7804..43afe3dabed 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait"
> [(unspec_volatile [(match_operand:SI 0 "register_operand" "c")
> (match_operand:SI 1 "register_operand" "a")]
> UNSPECV_MWAIT)]
> - "TARGET_SSE3"
> + "TARGET_MWAIT"
> ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used.
> ;; Since 32bit register operands are implicitly zero extended to 64bit,
> ;; we only need to set up 32bit registers.
> @@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>"
> (match_operand:SI 1 "register_operand" "c")
> (match_operand:SI 2 "register_operand" "d")]
> UNSPECV_MONITOR)]
> - "TARGET_SSE3"
> + "TARGET_MWAIT"
> ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in
> ;; RCX and RDX are used. Since 32bit register operands are implicitly
> ;; zero extended to 64bit, we only need to set up 32bit registers.
> diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
> index ceda501252c..7793032ba90 100644
> --- a/gcc/config/i386/x86gprintrin.h
> +++ b/gcc/config/i386/x86gprintrin.h
> @@ -56,6 +56,8 @@
>
> #include <movdirintrin.h>
>
> +#include <mwaitintrin.h>
> +
> #include <mwaitxintrin.h>
>
> #include <pconfigintrin.h>
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 1bc66cce2b8..1acfaf1d345 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions.
> @cindex @code{target("movdiri")} function attribute, x86
> Enable/disable the generation of the MOVDIRI instructions.
>
> +@item mwait
> +@itemx no-mwait
> +@cindex @code{target("mwait")} function attribute, x86
> +Enable/disable the generation of the MWAIT and MONITOR instructions.
> +
> @item mwaitx
> @itemx no-mwaitx
> @cindex @code{target("mwaitx")} function attribute, x86
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 7f13ffb79e1..3e1f0bc8fad 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options.
> -mno-wide-multiply -mrtd -malign-double @gol
> -mpreferred-stack-boundary=@var{num} @gol
> -mincoming-stack-boundary=@var{num} @gol
> --mcld -mcx16 -msahf -mmovbe -mcrc32 @gol
> +-mcld -mcx16 -msahf -mmovbe -mcrc32 -mmwait @gol
> -mrecip -mrecip=@var{opt} @gol
> -mvzeroupper -mprefer-avx128 -mprefer-vector-width=@var{opt} @gol
> -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
> @@ -31159,6 +31159,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi},
> @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and
> @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction.
>
> +@item -mmwait
> +@opindex mmwait
> +This option enables built-in functions @code{__builtin_ia32_monitor},
> +and @code{__builtin_ia32_mwait} to generate the @code{monitor} and
> +@code{mwait} machine instructions.
> +
> @item -mrecip
> @opindex mrecip
> This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions
> diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c
> new file mode 100644
> index 00000000000..96eeec070f0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/monitor-2.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */
> +
> +/* Verify that they work in both 32bit and 64bit. */
> +
> +#include <x86gprintrin.h>
> +
> +void
> +foo (char *p, int x, int y, int z)
> +{
> + _mm_monitor (p, y, x);
> + _mm_mwait (z, y);
> +}
> +
> +void
> +bar (char *p, long x, long y, long z)
> +{
> + _mm_monitor (p, y, x);
> + _mm_mwait (z, y);
> +}
> +
> +void
> +foo1 (char *p)
> +{
> + _mm_monitor (p, 0, 0);
> + _mm_mwait (0, 0);
> +}
> --
> 2.31.1
>
* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only
2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu
` (4 preceding siblings ...)
2021-08-13 13:51 ` [PATCH 5/5] <x86gprintrin.h>: Add pragma GCC target("general-regs-only") H.J. Lu
@ 2021-08-16 6:11 ` Richard Biener
2021-08-24 14:57 ` H.J. Lu
5 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2021-08-16 6:11 UTC (permalink / raw)
To: H.J. Lu; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek
On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> <x86gprintrin.h> and target("general-regs-only") function attribute
> were added to GCC 11. But their implementations are incomplete. I'd
> like to backport the following patches to GCC 11 branch to finish them.
Fine with me if the x86 maintainers do not object (also see my one comment
on the patch adding -mmwait).
> H.J. Lu (5):
> x86: Add -mmwait for -mgeneral-regs-only
> x86: Use crc32 target option for CRC32 intrinsics
> x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
> x86: Enable the GPR only instructions for -mgeneral-regs-only
> <x86gprintrin.h>: Add pragma GCC target("general-regs-only")
>
> gcc/common/config/i386/i386-common.c | 45 ++-
> gcc/config.gcc | 6 +-
> gcc/config/i386/i386-builtin.def | 8 +-
> gcc/config/i386/i386-builtins.c | 4 +-
> gcc/config/i386/i386-c.c | 2 +
> gcc/config/i386/i386-options.c | 12 +
> gcc/config/i386/i386.c | 6 +-
> gcc/config/i386/i386.h | 2 +
> gcc/config/i386/i386.md | 4 +-
> gcc/config/i386/i386.opt | 4 +
> gcc/config/i386/ia32intrin.h | 42 ++-
> gcc/config/i386/mwaitintrin.h | 52 +++
> gcc/config/i386/pmmintrin.h | 13 +-
> gcc/config/i386/serializeintrin.h | 7 +-
> gcc/config/i386/sse.md | 4 +-
> gcc/config/i386/x86gprintrin.h | 13 +
> gcc/doc/extend.texi | 5 +
> gcc/doc/invoke.texi | 8 +-
> gcc/testsuite/gcc.target/i386/crc32-6.c | 13 +
> gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++
> gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 +
> gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 +
> gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 +
> gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 +
> gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 +
> gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 +++++++++++++++++++++
> gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++
> gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++
> gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 +
> gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 +
> 30 files changed, 717 insertions(+), 45 deletions(-)
> create mode 100644 gcc/config/i386/mwaitintrin.h
> create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c
> create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c
>
> --
> 2.31.1
>
* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
2021-08-16 6:11 ` Richard Biener
@ 2021-08-16 12:25 ` H.J. Lu
2021-08-16 12:28 ` Richard Biener
0 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2021-08-16 12:25 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek
On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
> > -mgeneral-regs-only and make -msse3 to imply -mmwait.
>
> Adding new options requires to bump the LTO streaming minor version
> (I know we forgot it once on the branch already when adding a new --param).
>
> Please take care of this when backporting.
It was updated today:
commit dce5367eecfb0729cad0325240d614721afb39e3
Author: Martin Liska <mliska@suse.cz>
Date: Mon Aug 16 13:02:54 2021 +0200
LTO: bump minor version
Bump the LTO_minor_version due to changes in
52f0aa4dee8401ef3958dbf789780b0ee877beab
PR c/100150
gcc/ChangeLog:
* lto-streamer.h (LTO_minor_version): Bump.
Do I need to do it again if I can check in my patches this week?
Thanks.
> Richard.
>
> > gcc/
> >
> > * config.gcc: Install mwaitintrin.h for i[34567]86-*-* and
> > x86_64-*-* targets.
> > * common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET):
> > New.
> > (OPTION_MASK_ISA2_MWAIT_UNSET): Likewise.
> > (ix86_handle_option): Handle -mmwait.
> > * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins):
> > Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on
> > __builtin_ia32_monitor and __builtin_ia32_mwait.
> > * config/i386/i386-options.c (isa2_opts): Add -mmwait.
> > (ix86_valid_target_attribute_inner_p): Likewise.
> > (ix86_option_override_internal): Enable mwait/monitor
> > instructions for -msse3.
> > * config/i386/i386.h (TARGET_MWAIT): New.
> > (TARGET_MWAIT_P): Likewise.
> > * config/i386/i386.opt: Add -mmwait.
> > * config/i386/mwaitintrin.h: New file.
> > * config/i386/pmmintrin.h: Include <mwaitintrin.h>.
> > * config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with
> > TARGET_MWAIT.
> > (@sse3_monitor_<mode>): Likewise.
> > * config/i386/x86gprintrin.h: Include <mwaitintrin.h>.
> > * doc/extend.texi: Document mwait target attribute.
> > * doc/invoke.texi: Document -mmwait.
> >
> > gcc/testsuite/
> >
> > * gcc.target/i386/monitor-2.c: New test.
> >
> > (cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922)
> > ---
> > gcc/common/config/i386/i386-common.c | 15 +++++++
> > gcc/config.gcc | 6 ++-
> > gcc/config/i386/i386-builtins.c | 4 +-
> > gcc/config/i386/i386-options.c | 7 +++
> > gcc/config/i386/i386.h | 2 +
> > gcc/config/i386/i386.opt | 4 ++
> > gcc/config/i386/mwaitintrin.h | 52 +++++++++++++++++++++++
> > gcc/config/i386/pmmintrin.h | 13 +-----
> > gcc/config/i386/sse.md | 4 +-
> > gcc/config/i386/x86gprintrin.h | 2 +
> > gcc/doc/extend.texi | 5 +++
> > gcc/doc/invoke.texi | 8 +++-
> > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++
> > 13 files changed, 130 insertions(+), 19 deletions(-)
> > create mode 100644 gcc/config/i386/mwaitintrin.h
> > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
> >
> > diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
> > index 6a7b5c8312f..e156cc34584 100644
> > --- a/gcc/common/config/i386/i386-common.c
> > +++ b/gcc/common/config/i386/i386-common.c
> > @@ -150,6 +150,7 @@ along with GCC; see the file COPYING3. If not see
> > #define OPTION_MASK_ISA_F16C_SET \
> > (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET)
> > #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX
> > +#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT
> > #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO
> > #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU
> > #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID
> > @@ -245,6 +246,7 @@ along with GCC; see the file COPYING3. If not see
> > #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
> > #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
> > #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX
> > +#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT
> > #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO
> > #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU
> > #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID
> > @@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts,
> > }
> > return true;
> >
> > + case OPT_mmwait:
> > + if (value)
> > + {
> > + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET;
> > + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET;
> > + }
> > + else
> > + {
> > + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET;
> > + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET;
> > + }
> > + return true;
> > +
> > case OPT_mclzero:
> > if (value)
> > {
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index 357b0bed067..a020e0808c9 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> > @@ -414,7 +414,8 @@ i[34567]86-*-*)
> > avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
> > tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
> > amxbf16intrin.h x86gprintrin.h uintrintrin.h
> > - hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> > + hresetintrin.h keylockerintrin.h avxvnniintrin.h
> > + mwaitintrin.h"
> > ;;
> > x86_64-*-*)
> > cpu_type=i386
> > @@ -451,7 +452,8 @@ x86_64-*-*)
> > avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
> > tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
> > amxbf16intrin.h x86gprintrin.h uintrintrin.h
> > - hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> > + hresetintrin.h keylockerintrin.h avxvnniintrin.h
> > + mwaitintrin.h"
> > ;;
> > ia64-*-*)
> > extra_headers=ia64intrin.h
> > diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
> > index 4fcdf4b89ee..128bd39816c 100644
> > --- a/gcc/config/i386/i386-builtins.c
> > +++ b/gcc/config/i386/i386-builtins.c
> > @@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void)
> > VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE);
> >
> > /* SSE3. */
> > - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor",
> > + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor",
> > VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR);
> > - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait",
> > + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait",
> > VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT);
> >
> > /* AES */
> > diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> > index 18d2c0b9f99..7ecd0cf8b8c 100644
> > --- a/gcc/config/i386/i386-options.c
> > +++ b/gcc/config/i386/i386-options.c
> > @@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] =
> > { "-mmovbe", OPTION_MASK_ISA2_MOVBE },
> > { "-mclzero", OPTION_MASK_ISA2_CLZERO },
> > { "-mmwaitx", OPTION_MASK_ISA2_MWAITX },
> > + { "-mmwait", OPTION_MASK_ISA2_MWAIT },
> > { "-mmovdir64b", OPTION_MASK_ISA2_MOVDIR64B },
> > { "-mwaitpkg", OPTION_MASK_ISA2_WAITPKG },
> > { "-mcldemote", OPTION_MASK_ISA2_CLDEMOTE },
> > @@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
> > IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase),
> > IX86_ATTR_ISA ("rdrnd", OPT_mrdrnd),
> > IX86_ATTR_ISA ("mwaitx", OPT_mmwaitx),
> > + IX86_ATTR_ISA ("mwait", OPT_mmwait),
> > IX86_ATTR_ISA ("clzero", OPT_mclzero),
> > IX86_ATTR_ISA ("pku", OPT_mpku),
> > IX86_ATTR_ISA ("lwp", OPT_mlwp),
> > @@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p,
> > || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags))
> > ix86_prefetch_sse = true;
> >
> > + /* Enable mwait/monitor instructions for -msse3. */
> > + if (TARGET_SSE3_P (opts->x_ix86_isa_flags))
> > + opts->x_ix86_isa_flags2
> > + |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit;
> > +
> > /* Enable popcnt instruction for -msse4.2 or -mabm. */
> > if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)
> > || TARGET_ABM_P (opts->x_ix86_isa_flags))
> > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > index 5583ec6881a..73e118900f7 100644
> > --- a/gcc/config/i386/i386.h
> > +++ b/gcc/config/i386/i386.h
> > @@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
> > #define TARGET_CLWB_P(x) TARGET_ISA_CLWB_P(x)
> > #define TARGET_MWAITX TARGET_ISA2_MWAITX
> > #define TARGET_MWAITX_P(x) TARGET_ISA2_MWAITX_P(x)
> > +#define TARGET_MWAIT TARGET_ISA2_MWAIT
> > +#define TARGET_MWAIT_P(x) TARGET_ISA2_MWAIT_P(x)
> > #define TARGET_PKU TARGET_ISA_PKU
> > #define TARGET_PKU_P(x) TARGET_ISA_PKU_P(x)
> > #define TARGET_SHSTK TARGET_ISA_SHSTK
> > diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> > index c781fdc8278..7b8547bb1c3 100644
> > --- a/gcc/config/i386/i386.opt
> > +++ b/gcc/config/i386/i386.opt
> > @@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation.
> > mneeded
> > Target Var(ix86_needed) Save
> > Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property.
> > +
> > +mmwait
> > +Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save
> > +Support MWAIT and MONITOR built-in functions and code generation.
> > diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h
> > new file mode 100644
> > index 00000000000..1ecbc4abb69
> > --- /dev/null
> > +++ b/gcc/config/i386/mwaitintrin.h
> > @@ -0,0 +1,52 @@
> > +/* Copyright (C) 2021 Free Software Foundation, Inc.
> > +
> > + This file is part of GCC.
> > +
> > + GCC is free software; you can redistribute it and/or modify
> > + it under the terms of the GNU General Public License as published by
> > + the Free Software Foundation; either version 3, or (at your option)
> > + any later version.
> > +
> > + GCC is distributed in the hope that it will be useful,
> > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > + GNU General Public License for more details.
> > +
> > + Under Section 7 of GPL version 3, you are granted additional
> > + permissions described in the GCC Runtime Library Exception, version
> > + 3.1, as published by the Free Software Foundation.
> > +
> > + You should have received a copy of the GNU General Public License and
> > + a copy of the GCC Runtime Library Exception along with this program;
> > + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
> > + <http://www.gnu.org/licenses/>. */
> > +
> > +#ifndef _MWAITINTRIN_H_INCLUDED
> > +#define _MWAITINTRIN_H_INCLUDED
> > +
> > +#ifndef __MWAIT__
> > +#pragma GCC push_options
> > +#pragma GCC target("mwait")
> > +#define __DISABLE_MWAIT__
> > +#endif /* __MWAIT__ */
> > +
> > +extern __inline void
> > +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > +_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> > +{
> > + __builtin_ia32_monitor (__P, __E, __H);
> > +}
> > +
> > +extern __inline void
> > +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > +_mm_mwait (unsigned int __E, unsigned int __H)
> > +{
> > + __builtin_ia32_mwait (__E, __H);
> > +}
> > +
> > +#ifdef __DISABLE_MWAIT__
> > +#undef __DISABLE_MWAIT__
> > +#pragma GCC pop_options
> > +#endif /* __DISABLE_MWAIT__ */
> > +
> > +#endif /* _MWAITINTRIN_H_INCLUDED */
> > diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h
> > index fa9c5bb8b9f..f8102d2be23 100644
> > --- a/gcc/config/i386/pmmintrin.h
> > +++ b/gcc/config/i386/pmmintrin.h
> > @@ -29,6 +29,7 @@
> >
> > /* We need definitions from the SSE2 and SSE header files*/
> > #include <emmintrin.h>
> > +#include <mwaitintrin.h>
> >
> > #ifndef __SSE3__
> > #pragma GCC push_options
> > @@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P)
> > return (__m128i) __builtin_ia32_lddqu ((char const *)__P);
> > }
> >
> > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > -_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> > -{
> > - __builtin_ia32_monitor (__P, __E, __H);
> > -}
> > -
> > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > -_mm_mwait (unsigned int __E, unsigned int __H)
> > -{
> > - __builtin_ia32_mwait (__E, __H);
> > -}
> > -
> > #ifdef __DISABLE_SSE3__
> > #undef __DISABLE_SSE3__
> > #pragma GCC pop_options
> > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > index 3f81abc7804..43afe3dabed 100644
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait"
> > [(unspec_volatile [(match_operand:SI 0 "register_operand" "c")
> > (match_operand:SI 1 "register_operand" "a")]
> > UNSPECV_MWAIT)]
> > - "TARGET_SSE3"
> > + "TARGET_MWAIT"
> > ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used.
> > ;; Since 32bit register operands are implicitly zero extended to 64bit,
> > ;; we only need to set up 32bit registers.
> > @@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>"
> > (match_operand:SI 1 "register_operand" "c")
> > (match_operand:SI 2 "register_operand" "d")]
> > UNSPECV_MONITOR)]
> > - "TARGET_SSE3"
> > + "TARGET_MWAIT"
> > ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in
> > ;; RCX and RDX are used. Since 32bit register operands are implicitly
> > ;; zero extended to 64bit, we only need to set up 32bit registers.
> > diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
> > index ceda501252c..7793032ba90 100644
> > --- a/gcc/config/i386/x86gprintrin.h
> > +++ b/gcc/config/i386/x86gprintrin.h
> > @@ -56,6 +56,8 @@
> >
> > #include <movdirintrin.h>
> >
> > +#include <mwaitintrin.h>
> > +
> > #include <mwaitxintrin.h>
> >
> > #include <pconfigintrin.h>
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > index 1bc66cce2b8..1acfaf1d345 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions.
> > @cindex @code{target("movdiri")} function attribute, x86
> > Enable/disable the generation of the MOVDIRI instructions.
> >
> > +@item mwait
> > +@itemx no-mwait
> > +@cindex @code{target("mwait")} function attribute, x86
> > +Enable/disable the generation of the MWAIT and MONITOR instructions.
> > +
> > @item mwaitx
> > @itemx no-mwaitx
> > @cindex @code{target("mwaitx")} function attribute, x86
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index 7f13ffb79e1..3e1f0bc8fad 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options.
> > -mno-wide-multiply -mrtd -malign-double @gol
> > -mpreferred-stack-boundary=@var{num} @gol
> > -mincoming-stack-boundary=@var{num} @gol
> > --mcld -mcx16 -msahf -mmovbe -mcrc32 @gol
> > +-mcld -mcx16 -msahf -mmovbe -mcrc32 -mmwait @gol
> > -mrecip -mrecip=@var{opt} @gol
> > -mvzeroupper -mprefer-avx128 -mprefer-vector-width=@var{opt} @gol
> > -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
> > @@ -31159,6 +31159,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi},
> > @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and
> > @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction.
> >
> > +@item -mmwait
> > +@opindex mmwait
> > +This option enables built-in functions @code{__builtin_ia32_monitor},
> > +and @code{__builtin_ia32_mwait} to generate the @code{monitor} and
> > +@code{mwait} machine instructions.
> > +
> > @item -mrecip
> > @opindex mrecip
> > This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions
> > diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c
> > new file mode 100644
> > index 00000000000..96eeec070f0
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/monitor-2.c
> > @@ -0,0 +1,27 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */
> > +
> > +/* Verify that they work in both 32bit and 64bit. */
> > +
> > +#include <x86gprintrin.h>
> > +
> > +void
> > +foo (char *p, int x, int y, int z)
> > +{
> > + _mm_monitor (p, y, x);
> > + _mm_mwait (z, y);
> > +}
> > +
> > +void
> > +bar (char *p, long x, long y, long z)
> > +{
> > + _mm_monitor (p, y, x);
> > + _mm_mwait (z, y);
> > +}
> > +
> > +void
> > +foo1 (char *p)
> > +{
> > + _mm_monitor (p, 0, 0);
> > + _mm_mwait (0, 0);
> > +}
> > --
> > 2.31.1
> >
--
H.J.
* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
2021-08-16 12:25 ` H.J. Lu
@ 2021-08-16 12:28 ` Richard Biener
2021-08-16 12:35 ` H.J. Lu
2021-08-16 12:37 ` Martin Liška
0 siblings, 2 replies; 16+ messages in thread
From: Richard Biener @ 2021-08-16 12:28 UTC (permalink / raw)
To: H.J. Lu; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek
On Mon, Aug 16, 2021 at 2:25 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
> > > -mgeneral-regs-only and make -msse3 to imply -mmwait.
> >
> > Adding new options requires to bump the LTO streaming minor version
> > (I know we forgot it once on the branch already when adding a new --param).
> >
> > Please take care of this when backporting.
>
> It was updated today:
>
> commit dce5367eecfb0729cad0325240d614721afb39e3
> Author: Martin Liska <mliska@suse.cz>
> Date: Mon Aug 16 13:02:54 2021 +0200
>
> LTO: bump minor version
>
> Bump the LTO_minor_version due to changes in
> 52f0aa4dee8401ef3958dbf789780b0ee877beab
>
> PR c/100150
>
> gcc/ChangeLog:
>
> * lto-streamer.h (LTO_minor_version): Bump.
>
> Do I need to do it again if I can check in my patches this week?
Yes please, and do it in the same commit that makes the .opt change.
Richard.
> Thanks.
>
> > Richard.
> >
> > > gcc/
> > >
> > > * config.gcc: Install mwaitintrin.h for i[34567]86-*-* and
> > > x86_64-*-* targets.
> > > * common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET):
> > > New.
> > > (OPTION_MASK_ISA2_MWAIT_UNSET): Likewise.
> > > (ix86_handle_option): Handle -mmwait.
> > > * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins):
> > > Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on
> > > __builtin_ia32_monitor and __builtin_ia32_mwait.
> > > * config/i386/i386-options.c (isa2_opts): Add -mmwait.
> > > (ix86_valid_target_attribute_inner_p): Likewise.
> > > (ix86_option_override_internal): Enable mwait/monitor
> > > instructions for -msse3.
> > > * config/i386/i386.h (TARGET_MWAIT): New.
> > > (TARGET_MWAIT_P): Likewise.
> > > * config/i386/i386.opt: Add -mmwait.
> > > * config/i386/mwaitintrin.h: New file.
> > > * config/i386/pmmintrin.h: Include <mwaitintrin.h>.
> > > * config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with
> > > TARGET_MWAIT.
> > > (@sse3_monitor_<mode>): Likewise.
> > > * config/i386/x86gprintrin.h: Include <mwaitintrin.h>.
> > > * doc/extend.texi: Document mwait target attribute.
> > > * doc/invoke.texi: Document -mmwait.
> > >
> > > gcc/testsuite/
> > >
> > > * gcc.target/i386/monitor-2.c: New test.
> > >
> > > (cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922)
> > > ---
> > > gcc/common/config/i386/i386-common.c | 15 +++++++
> > > gcc/config.gcc | 6 ++-
> > > gcc/config/i386/i386-builtins.c | 4 +-
> > > gcc/config/i386/i386-options.c | 7 +++
> > > gcc/config/i386/i386.h | 2 +
> > > gcc/config/i386/i386.opt | 4 ++
> > > gcc/config/i386/mwaitintrin.h | 52 +++++++++++++++++++++++
> > > gcc/config/i386/pmmintrin.h | 13 +-----
> > > gcc/config/i386/sse.md | 4 +-
> > > gcc/config/i386/x86gprintrin.h | 2 +
> > > gcc/doc/extend.texi | 5 +++
> > > gcc/doc/invoke.texi | 8 +++-
> > > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++
> > > 13 files changed, 130 insertions(+), 19 deletions(-)
> > > create mode 100644 gcc/config/i386/mwaitintrin.h
> > > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
> > >
> > > diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
> > > index 6a7b5c8312f..e156cc34584 100644
> > > --- a/gcc/common/config/i386/i386-common.c
> > > +++ b/gcc/common/config/i386/i386-common.c
> > > @@ -150,6 +150,7 @@ along with GCC; see the file COPYING3. If not see
> > > #define OPTION_MASK_ISA_F16C_SET \
> > > (OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET)
> > > #define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX
> > > +#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT
> > > #define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO
> > > #define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU
> > > #define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID
> > > @@ -245,6 +246,7 @@ along with GCC; see the file COPYING3. If not see
> > > #define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
> > > #define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
> > > #define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX
> > > +#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT
> > > #define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO
> > > #define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU
> > > #define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID
> > > @@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts,
> > > }
> > > return true;
> > >
> > > + case OPT_mmwait:
> > > + if (value)
> > > + {
> > > + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET;
> > > + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET;
> > > + }
> > > + else
> > > + {
> > > + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET;
> > > + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET;
> > > + }
> > > + return true;
> > > +
> > > case OPT_mclzero:
> > > if (value)
> > > {
> > > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > > index 357b0bed067..a020e0808c9 100644
> > > --- a/gcc/config.gcc
> > > +++ b/gcc/config.gcc
> > > @@ -414,7 +414,8 @@ i[34567]86-*-*)
> > > avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
> > > tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
> > > amxbf16intrin.h x86gprintrin.h uintrintrin.h
> > > - hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> > > + hresetintrin.h keylockerintrin.h avxvnniintrin.h
> > > + mwaitintrin.h"
> > > ;;
> > > x86_64-*-*)
> > > cpu_type=i386
> > > @@ -451,7 +452,8 @@ x86_64-*-*)
> > > avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
> > > tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
> > > amxbf16intrin.h x86gprintrin.h uintrintrin.h
> > > - hresetintrin.h keylockerintrin.h avxvnniintrin.h"
> > > + hresetintrin.h keylockerintrin.h avxvnniintrin.h
> > > + mwaitintrin.h"
> > > ;;
> > > ia64-*-*)
> > > extra_headers=ia64intrin.h
> > > diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
> > > index 4fcdf4b89ee..128bd39816c 100644
> > > --- a/gcc/config/i386/i386-builtins.c
> > > +++ b/gcc/config/i386/i386-builtins.c
> > > @@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void)
> > > VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE);
> > >
> > > /* SSE3. */
> > > - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor",
> > > + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor",
> > > VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR);
> > > - def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait",
> > > + def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait",
> > > VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT);
> > >
> > > /* AES */
> > > diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> > > index 18d2c0b9f99..7ecd0cf8b8c 100644
> > > --- a/gcc/config/i386/i386-options.c
> > > +++ b/gcc/config/i386/i386-options.c
> > > @@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] =
> > > { "-mmovbe", OPTION_MASK_ISA2_MOVBE },
> > > { "-mclzero", OPTION_MASK_ISA2_CLZERO },
> > > { "-mmwaitx", OPTION_MASK_ISA2_MWAITX },
> > > + { "-mmwait", OPTION_MASK_ISA2_MWAIT },
> > > { "-mmovdir64b", OPTION_MASK_ISA2_MOVDIR64B },
> > > { "-mwaitpkg", OPTION_MASK_ISA2_WAITPKG },
> > > { "-mcldemote", OPTION_MASK_ISA2_CLDEMOTE },
> > > @@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
> > > IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase),
> > > IX86_ATTR_ISA ("rdrnd", OPT_mrdrnd),
> > > IX86_ATTR_ISA ("mwaitx", OPT_mmwaitx),
> > > + IX86_ATTR_ISA ("mwait", OPT_mmwait),
> > > IX86_ATTR_ISA ("clzero", OPT_mclzero),
> > > IX86_ATTR_ISA ("pku", OPT_mpku),
> > > IX86_ATTR_ISA ("lwp", OPT_mlwp),
> > > @@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p,
> > > || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags))
> > > ix86_prefetch_sse = true;
> > >
> > > + /* Enable mwait/monitor instructions for -msse3. */
> > > + if (TARGET_SSE3_P (opts->x_ix86_isa_flags))
> > > + opts->x_ix86_isa_flags2
> > > + |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit;
> > > +
> > > /* Enable popcnt instruction for -msse4.2 or -mabm. */
> > > if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)
> > > || TARGET_ABM_P (opts->x_ix86_isa_flags))
> > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > > index 5583ec6881a..73e118900f7 100644
> > > --- a/gcc/config/i386/i386.h
> > > +++ b/gcc/config/i386/i386.h
> > > @@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
> > > #define TARGET_CLWB_P(x) TARGET_ISA_CLWB_P(x)
> > > #define TARGET_MWAITX TARGET_ISA2_MWAITX
> > > #define TARGET_MWAITX_P(x) TARGET_ISA2_MWAITX_P(x)
> > > +#define TARGET_MWAIT TARGET_ISA2_MWAIT
> > > +#define TARGET_MWAIT_P(x) TARGET_ISA2_MWAIT_P(x)
> > > #define TARGET_PKU TARGET_ISA_PKU
> > > #define TARGET_PKU_P(x) TARGET_ISA_PKU_P(x)
> > > #define TARGET_SHSTK TARGET_ISA_SHSTK
> > > diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> > > index c781fdc8278..7b8547bb1c3 100644
> > > --- a/gcc/config/i386/i386.opt
> > > +++ b/gcc/config/i386/i386.opt
> > > @@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation.
> > > mneeded
> > > Target Var(ix86_needed) Save
> > > Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property.
> > > +
> > > +mmwait
> > > +Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save
> > > +Support MWAIT and MONITOR built-in functions and code generation.
> > > diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h
> > > new file mode 100644
> > > index 00000000000..1ecbc4abb69
> > > --- /dev/null
> > > +++ b/gcc/config/i386/mwaitintrin.h
> > > @@ -0,0 +1,52 @@
> > > +/* Copyright (C) 2021 Free Software Foundation, Inc.
> > > +
> > > + This file is part of GCC.
> > > +
> > > + GCC is free software; you can redistribute it and/or modify
> > > + it under the terms of the GNU General Public License as published by
> > > + the Free Software Foundation; either version 3, or (at your option)
> > > + any later version.
> > > +
> > > + GCC is distributed in the hope that it will be useful,
> > > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > > + GNU General Public License for more details.
> > > +
> > > + Under Section 7 of GPL version 3, you are granted additional
> > > + permissions described in the GCC Runtime Library Exception, version
> > > + 3.1, as published by the Free Software Foundation.
> > > +
> > > + You should have received a copy of the GNU General Public License and
> > > + a copy of the GCC Runtime Library Exception along with this program;
> > > + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
> > > + <http://www.gnu.org/licenses/>. */
> > > +
> > > +#ifndef _MWAITINTRIN_H_INCLUDED
> > > +#define _MWAITINTRIN_H_INCLUDED
> > > +
> > > +#ifndef __MWAIT__
> > > +#pragma GCC push_options
> > > +#pragma GCC target("mwait")
> > > +#define __DISABLE_MWAIT__
> > > +#endif /* __MWAIT__ */
> > > +
> > > +extern __inline void
> > > +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > > +_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> > > +{
> > > + __builtin_ia32_monitor (__P, __E, __H);
> > > +}
> > > +
> > > +extern __inline void
> > > +__attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > > +_mm_mwait (unsigned int __E, unsigned int __H)
> > > +{
> > > + __builtin_ia32_mwait (__E, __H);
> > > +}
> > > +
> > > +#ifdef __DISABLE_MWAIT__
> > > +#undef __DISABLE_MWAIT__
> > > +#pragma GCC pop_options
> > > +#endif /* __DISABLE_MWAIT__ */
> > > +
> > > +#endif /* _MWAITINTRIN_H_INCLUDED */
> > > diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h
> > > index fa9c5bb8b9f..f8102d2be23 100644
> > > --- a/gcc/config/i386/pmmintrin.h
> > > +++ b/gcc/config/i386/pmmintrin.h
> > > @@ -29,6 +29,7 @@
> > >
> > > /* We need definitions from the SSE2 and SSE header files*/
> > > #include <emmintrin.h>
> > > +#include <mwaitintrin.h>
> > >
> > > #ifndef __SSE3__
> > > #pragma GCC push_options
> > > @@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P)
> > > return (__m128i) __builtin_ia32_lddqu ((char const *)__P);
> > > }
> > >
> > > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > > -_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
> > > -{
> > > - __builtin_ia32_monitor (__P, __E, __H);
> > > -}
> > > -
> > > -extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> > > -_mm_mwait (unsigned int __E, unsigned int __H)
> > > -{
> > > - __builtin_ia32_mwait (__E, __H);
> > > -}
> > > -
> > > #ifdef __DISABLE_SSE3__
> > > #undef __DISABLE_SSE3__
> > > #pragma GCC pop_options
> > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > > index 3f81abc7804..43afe3dabed 100644
> > > --- a/gcc/config/i386/sse.md
> > > +++ b/gcc/config/i386/sse.md
> > > @@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait"
> > > [(unspec_volatile [(match_operand:SI 0 "register_operand" "c")
> > > (match_operand:SI 1 "register_operand" "a")]
> > > UNSPECV_MWAIT)]
> > > - "TARGET_SSE3"
> > > + "TARGET_MWAIT"
> > > ;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used.
> > > ;; Since 32bit register operands are implicitly zero extended to 64bit,
> > > ;; we only need to set up 32bit registers.
> > > @@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>"
> > > (match_operand:SI 1 "register_operand" "c")
> > > (match_operand:SI 2 "register_operand" "d")]
> > > UNSPECV_MONITOR)]
> > > - "TARGET_SSE3"
> > > + "TARGET_MWAIT"
> > > ;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in
> > > ;; RCX and RDX are used. Since 32bit register operands are implicitly
> > > ;; zero extended to 64bit, we only need to set up 32bit registers.
> > > diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
> > > index ceda501252c..7793032ba90 100644
> > > --- a/gcc/config/i386/x86gprintrin.h
> > > +++ b/gcc/config/i386/x86gprintrin.h
> > > @@ -56,6 +56,8 @@
> > >
> > > #include <movdirintrin.h>
> > >
> > > +#include <mwaitintrin.h>
> > > +
> > > #include <mwaitxintrin.h>
> > >
> > > #include <pconfigintrin.h>
> > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > > index 1bc66cce2b8..1acfaf1d345 100644
> > > --- a/gcc/doc/extend.texi
> > > +++ b/gcc/doc/extend.texi
> > > @@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions.
> > > @cindex @code{target("movdiri")} function attribute, x86
> > > Enable/disable the generation of the MOVDIRI instructions.
> > >
> > > +@item mwait
> > > +@itemx no-mwait
> > > +@cindex @code{target("mwait")} function attribute, x86
> > > +Enable/disable the generation of the MWAIT and MONITOR instructions.
> > > +
> > > @item mwaitx
> > > @itemx no-mwaitx
> > > @cindex @code{target("mwaitx")} function attribute, x86
> > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > > index 7f13ffb79e1..3e1f0bc8fad 100644
> > > --- a/gcc/doc/invoke.texi
> > > +++ b/gcc/doc/invoke.texi
> > > @@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options.
> > > -mno-wide-multiply -mrtd -malign-double @gol
> > > -mpreferred-stack-boundary=@var{num} @gol
> > > -mincoming-stack-boundary=@var{num} @gol
> > > --mcld -mcx16 -msahf -mmovbe -mcrc32 @gol
> > > +-mcld -mcx16 -msahf -mmovbe -mcrc32 -mmwait @gol
> > > -mrecip -mrecip=@var{opt} @gol
> > > -mvzeroupper -mprefer-avx128 -mprefer-vector-width=@var{opt} @gol
> > > -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
> > > @@ -31159,6 +31159,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi},
> > > @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and
> > > @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction.
> > >
> > > +@item -mmwait
> > > +@opindex mmwait
> > > +This option enables built-in functions @code{__builtin_ia32_monitor},
> > > +and @code{__builtin_ia32_mwait} to generate the @code{monitor} and
> > > +@code{mwait} machine instructions.
> > > +
> > > @item -mrecip
> > > @opindex mrecip
> > > This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions
> > > diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c
> > > new file mode 100644
> > > index 00000000000..96eeec070f0
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/monitor-2.c
> > > @@ -0,0 +1,27 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */
> > > +
> > > +/* Verify that they work in both 32bit and 64bit. */
> > > +
> > > +#include <x86gprintrin.h>
> > > +
> > > +void
> > > +foo (char *p, int x, int y, int z)
> > > +{
> > > + _mm_monitor (p, y, x);
> > > + _mm_mwait (z, y);
> > > +}
> > > +
> > > +void
> > > +bar (char *p, long x, long y, long z)
> > > +{
> > > + _mm_monitor (p, y, x);
> > > + _mm_mwait (z, y);
> > > +}
> > > +
> > > +void
> > > +foo1 (char *p)
> > > +{
> > > + _mm_monitor (p, 0, 0);
> > > + _mm_mwait (0, 0);
> > > +}
> > > --
> > > 2.31.1
> > >
>
>
>
> --
> H.J.
* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
2021-08-16 12:28 ` Richard Biener
@ 2021-08-16 12:35 ` H.J. Lu
2021-08-16 12:37 ` Martin Liška
1 sibling, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2021-08-16 12:35 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 1350 bytes --]
On Mon, Aug 16, 2021 at 5:28 AM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Mon, Aug 16, 2021 at 2:25 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> > >
> > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > > Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
> > > > -mgeneral-regs-only and make -msse3 imply -mmwait.
> > >
> > > Adding new options requires bumping the LTO streaming minor version
> > > (I know we forgot it once on the branch already when adding a new --param).
> > >
> > > Please take care of this when backporting.
> >
> > It was updated today:
> >
> > commit dce5367eecfb0729cad0325240d614721afb39e3
> > Author: Martin Liska <mliska@suse.cz>
> > Date: Mon Aug 16 13:02:54 2021 +0200
> >
> > LTO: bump minor version
> >
> > Bump the LTO_minor_version due to changes in
> > 52f0aa4dee8401ef3958dbf789780b0ee877beab
> >
> > PR c/100150
> >
> > gcc/ChangeLog:
> >
> > * lto-streamer.h (LTO_minor_version): Bump.
> >
> > Do I need to do it again if I can check in my patches this week?
>
> Yes please, and do it with the same commit doing the .opt change.
>
Here is the updated patch with LTO_minor_version bump.
Thanks.
--
H.J.
[-- Attachment #2: 0001-x86-Add-mmwait-for-mgeneral-regs-only.patch --]
[-- Type: text/x-patch, Size: 15414 bytes --]
From 8f3e275ef061cd5f8353c71cb99f05dd944575f9 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Thu, 15 Apr 2021 11:19:32 -0700
Subject: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
Add -mmwait so that the MWAIT and MONITOR intrinsics can be used with
-mgeneral-regs-only and make -msse3 imply -mmwait.
gcc/
* config.gcc: Install mwaitintrin.h for i[34567]86-*-* and
x86_64-*-* targets.
* lto-streamer.h (LTO_minor_version): Bump.
* common/config/i386/i386-common.c (OPTION_MASK_ISA2_MWAIT_SET):
New.
(OPTION_MASK_ISA2_MWAIT_UNSET): Likewise.
(ix86_handle_option): Handle -mmwait.
* config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins):
Replace OPTION_MASK_ISA_SSE3 with OPTION_MASK_ISA2_MWAIT on
__builtin_ia32_monitor and __builtin_ia32_mwait.
* config/i386/i386-options.c (isa2_opts): Add -mmwait.
(ix86_valid_target_attribute_inner_p): Likewise.
(ix86_option_override_internal): Enable mwait/monitor
instructions for -msse3.
* config/i386/i386.h (TARGET_MWAIT): New.
(TARGET_MWAIT_P): Likewise.
* config/i386/i386.opt: Add -mmwait.
* config/i386/mwaitintrin.h: New file.
* config/i386/pmmintrin.h: Include <mwaitintrin.h>.
* config/i386/sse.md (sse3_mwait): Replace TARGET_SSE3 with
TARGET_MWAIT.
(@sse3_monitor_<mode>): Likewise.
* config/i386/x86gprintrin.h: Include <mwaitintrin.h>.
* doc/extend.texi: Document mwait target attribute.
* doc/invoke.texi: Document -mmwait.
gcc/testsuite/
* gcc.target/i386/monitor-2.c: New test.
(cherry picked from commit d8c6cc2ca35489bc41bb58ec96c1195928826922)
---
gcc/common/config/i386/i386-common.c | 15 +++++++
gcc/config.gcc | 6 ++-
gcc/config/i386/i386-builtins.c | 4 +-
gcc/config/i386/i386-options.c | 7 +++
gcc/config/i386/i386.h | 2 +
gcc/config/i386/i386.opt | 4 ++
gcc/config/i386/mwaitintrin.h | 52 +++++++++++++++++++++++
gcc/config/i386/pmmintrin.h | 13 +-----
gcc/config/i386/sse.md | 4 +-
gcc/config/i386/x86gprintrin.h | 2 +
gcc/doc/extend.texi | 5 +++
gcc/doc/invoke.texi | 8 +++-
gcc/lto-streamer.h | 2 +-
gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++++++++++++
14 files changed, 131 insertions(+), 20 deletions(-)
create mode 100644 gcc/config/i386/mwaitintrin.h
create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 6a7b5c8312f..e156cc34584 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -150,6 +150,7 @@ along with GCC; see the file COPYING3. If not see
#define OPTION_MASK_ISA_F16C_SET \
(OPTION_MASK_ISA_F16C | OPTION_MASK_ISA_AVX_SET)
#define OPTION_MASK_ISA2_MWAITX_SET OPTION_MASK_ISA2_MWAITX
+#define OPTION_MASK_ISA2_MWAIT_SET OPTION_MASK_ISA2_MWAIT
#define OPTION_MASK_ISA2_CLZERO_SET OPTION_MASK_ISA2_CLZERO
#define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU
#define OPTION_MASK_ISA2_RDPID_SET OPTION_MASK_ISA2_RDPID
@@ -245,6 +246,7 @@ along with GCC; see the file COPYING3. If not see
#define OPTION_MASK_ISA_XSAVES_UNSET OPTION_MASK_ISA_XSAVES
#define OPTION_MASK_ISA_CLWB_UNSET OPTION_MASK_ISA_CLWB
#define OPTION_MASK_ISA2_MWAITX_UNSET OPTION_MASK_ISA2_MWAITX
+#define OPTION_MASK_ISA2_MWAIT_UNSET OPTION_MASK_ISA2_MWAIT
#define OPTION_MASK_ISA2_CLZERO_UNSET OPTION_MASK_ISA2_CLZERO
#define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU
#define OPTION_MASK_ISA2_RDPID_UNSET OPTION_MASK_ISA2_RDPID
@@ -1546,6 +1548,19 @@ ix86_handle_option (struct gcc_options *opts,
}
return true;
+ case OPT_mmwait:
+ if (value)
+ {
+ opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAIT_SET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_SET;
+ }
+ else
+ {
+ opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_MWAIT_UNSET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_MWAIT_UNSET;
+ }
+ return true;
+
case OPT_mclzero:
if (value)
{
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 357b0bed067..a020e0808c9 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -414,7 +414,8 @@ i[34567]86-*-*)
avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
amxbf16intrin.h x86gprintrin.h uintrintrin.h
- hresetintrin.h keylockerintrin.h avxvnniintrin.h"
+ hresetintrin.h keylockerintrin.h avxvnniintrin.h
+ mwaitintrin.h"
;;
x86_64-*-*)
cpu_type=i386
@@ -451,7 +452,8 @@ x86_64-*-*)
avx512vp2intersectintrin.h avx512vp2intersectvlintrin.h
tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h
amxbf16intrin.h x86gprintrin.h uintrintrin.h
- hresetintrin.h keylockerintrin.h avxvnniintrin.h"
+ hresetintrin.h keylockerintrin.h avxvnniintrin.h
+ mwaitintrin.h"
;;
ia64-*-*)
extra_headers=ia64intrin.h
diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
index 4fcdf4b89ee..128bd39816c 100644
--- a/gcc/config/i386/i386-builtins.c
+++ b/gcc/config/i386/i386-builtins.c
@@ -628,9 +628,9 @@ ix86_init_mmx_sse_builtins (void)
VOID_FTYPE_VOID, IX86_BUILTIN_MFENCE);
/* SSE3. */
- def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_monitor",
+ def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_monitor",
VOID_FTYPE_PCVOID_UNSIGNED_UNSIGNED, IX86_BUILTIN_MONITOR);
- def_builtin (OPTION_MASK_ISA_SSE3, 0, "__builtin_ia32_mwait",
+ def_builtin (0, OPTION_MASK_ISA2_MWAIT, "__builtin_ia32_mwait",
VOID_FTYPE_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAIT);
/* AES */
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 18d2c0b9f99..7ecd0cf8b8c 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -207,6 +207,7 @@ static struct ix86_target_opts isa2_opts[] =
{ "-mmovbe", OPTION_MASK_ISA2_MOVBE },
{ "-mclzero", OPTION_MASK_ISA2_CLZERO },
{ "-mmwaitx", OPTION_MASK_ISA2_MWAITX },
+ { "-mmwait", OPTION_MASK_ISA2_MWAIT },
{ "-mmovdir64b", OPTION_MASK_ISA2_MOVDIR64B },
{ "-mwaitpkg", OPTION_MASK_ISA2_WAITPKG },
{ "-mcldemote", OPTION_MASK_ISA2_CLDEMOTE },
@@ -1015,6 +1016,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
IX86_ATTR_ISA ("fsgsbase", OPT_mfsgsbase),
IX86_ATTR_ISA ("rdrnd", OPT_mrdrnd),
IX86_ATTR_ISA ("mwaitx", OPT_mmwaitx),
+ IX86_ATTR_ISA ("mwait", OPT_mmwait),
IX86_ATTR_ISA ("clzero", OPT_mclzero),
IX86_ATTR_ISA ("pku", OPT_mpku),
IX86_ATTR_ISA ("lwp", OPT_mlwp),
@@ -2612,6 +2614,11 @@ ix86_option_override_internal (bool main_args_p,
|| TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags))
ix86_prefetch_sse = true;
+ /* Enable mwait/monitor instructions for -msse3. */
+ if (TARGET_SSE3_P (opts->x_ix86_isa_flags))
+ opts->x_ix86_isa_flags2
+ |= OPTION_MASK_ISA2_MWAIT & ~opts->x_ix86_isa_flags2_explicit;
+
/* Enable popcnt instruction for -msse4.2 or -mabm. */
if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)
|| TARGET_ABM_P (opts->x_ix86_isa_flags))
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 5583ec6881a..73e118900f7 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -181,6 +181,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
#define TARGET_CLWB_P(x) TARGET_ISA_CLWB_P(x)
#define TARGET_MWAITX TARGET_ISA2_MWAITX
#define TARGET_MWAITX_P(x) TARGET_ISA2_MWAITX_P(x)
+#define TARGET_MWAIT TARGET_ISA2_MWAIT
+#define TARGET_MWAIT_P(x) TARGET_ISA2_MWAIT_P(x)
#define TARGET_PKU TARGET_ISA_PKU
#define TARGET_PKU_P(x) TARGET_ISA_PKU_P(x)
#define TARGET_SHSTK TARGET_ISA_SHSTK
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index c781fdc8278..7b8547bb1c3 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1162,3 +1162,7 @@ AVXVNNI built-in functions and code generation.
mneeded
Target Var(ix86_needed) Save
Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property.
+
+mmwait
+Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save
+Support MWAIT and MONITOR built-in functions and code generation.
diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h
new file mode 100644
index 00000000000..1ecbc4abb69
--- /dev/null
+++ b/gcc/config/i386/mwaitintrin.h
@@ -0,0 +1,52 @@
+/* Copyright (C) 2021 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ Under Section 7 of GPL version 3, you are granted additional
+ permissions described in the GCC Runtime Library Exception, version
+ 3.1, as published by the Free Software Foundation.
+
+ You should have received a copy of the GNU General Public License and
+ a copy of the GCC Runtime Library Exception along with this program;
+ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+ <http://www.gnu.org/licenses/>. */
+
+#ifndef _MWAITINTRIN_H_INCLUDED
+#define _MWAITINTRIN_H_INCLUDED
+
+#ifndef __MWAIT__
+#pragma GCC push_options
+#pragma GCC target("mwait")
+#define __DISABLE_MWAIT__
+#endif /* __MWAIT__ */
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
+{
+ __builtin_ia32_monitor (__P, __E, __H);
+}
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mwait (unsigned int __E, unsigned int __H)
+{
+ __builtin_ia32_mwait (__E, __H);
+}
+
+#ifdef __DISABLE_MWAIT__
+#undef __DISABLE_MWAIT__
+#pragma GCC pop_options
+#endif /* __DISABLE_MWAIT__ */
+
+#endif /* _MWAITINTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/pmmintrin.h b/gcc/config/i386/pmmintrin.h
index fa9c5bb8b9f..f8102d2be23 100644
--- a/gcc/config/i386/pmmintrin.h
+++ b/gcc/config/i386/pmmintrin.h
@@ -29,6 +29,7 @@
/* We need definitions from the SSE2 and SSE header files*/
#include <emmintrin.h>
+#include <mwaitintrin.h>
#ifndef __SSE3__
#pragma GCC push_options
@@ -112,18 +113,6 @@ _mm_lddqu_si128 (__m128i const *__P)
return (__m128i) __builtin_ia32_lddqu ((char const *)__P);
}
-extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
-{
- __builtin_ia32_monitor (__P, __E, __H);
-}
-
-extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mwait (unsigned int __E, unsigned int __H)
-{
- __builtin_ia32_mwait (__E, __H);
-}
-
#ifdef __DISABLE_SSE3__
#undef __DISABLE_SSE3__
#pragma GCC pop_options
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 3f81abc7804..43afe3dabed 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -16593,7 +16593,7 @@ (define_insn "sse3_mwait"
[(unspec_volatile [(match_operand:SI 0 "register_operand" "c")
(match_operand:SI 1 "register_operand" "a")]
UNSPECV_MWAIT)]
- "TARGET_SSE3"
+ "TARGET_MWAIT"
;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used.
;; Since 32bit register operands are implicitly zero extended to 64bit,
;; we only need to set up 32bit registers.
@@ -16605,7 +16605,7 @@ (define_insn "@sse3_monitor_<mode>"
(match_operand:SI 1 "register_operand" "c")
(match_operand:SI 2 "register_operand" "d")]
UNSPECV_MONITOR)]
- "TARGET_SSE3"
+ "TARGET_MWAIT"
;; 64bit version is "monitor %rax,%rcx,%rdx". But only lower 32bits in
;; RCX and RDX are used. Since 32bit register operands are implicitly
;; zero extended to 64bit, we only need to set up 32bit registers.
diff --git a/gcc/config/i386/x86gprintrin.h b/gcc/config/i386/x86gprintrin.h
index ceda501252c..7793032ba90 100644
--- a/gcc/config/i386/x86gprintrin.h
+++ b/gcc/config/i386/x86gprintrin.h
@@ -56,6 +56,8 @@
#include <movdirintrin.h>
+#include <mwaitintrin.h>
+
#include <mwaitxintrin.h>
#include <pconfigintrin.h>
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 1bc66cce2b8..1acfaf1d345 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -6665,6 +6665,11 @@ Enable/disable the generation of the MOVDIR64B instructions.
@cindex @code{target("movdiri")} function attribute, x86
Enable/disable the generation of the MOVDIRI instructions.
+@item mwait
+@itemx no-mwait
+@cindex @code{target("mwait")} function attribute, x86
+Enable/disable the generation of the MWAIT and MONITOR instructions.
+
@item mwaitx
@itemx no-mwaitx
@cindex @code{target("mwaitx")} function attribute, x86
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 05269f83808..fc222758a22 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1371,7 +1371,7 @@ See RS/6000 and PowerPC Options.
-mno-wide-multiply -mrtd -malign-double @gol
-mpreferred-stack-boundary=@var{num} @gol
-mincoming-stack-boundary=@var{num} @gol
--mcld -mcx16 -msahf -mmovbe -mcrc32 @gol
+-mcld -mcx16 -msahf -mmovbe -mcrc32 -mmwait @gol
-mrecip -mrecip=@var{opt} @gol
-mvzeroupper -mprefer-avx128 -mprefer-vector-width=@var{opt} @gol
-mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
@@ -31178,6 +31178,12 @@ This option enables built-in functions @code{__builtin_ia32_crc32qi},
@code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and
@code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction.
+@item -mmwait
+@opindex mmwait
+This option enables built-in functions @code{__builtin_ia32_monitor},
+and @code{__builtin_ia32_mwait} to generate the @code{monitor} and
+@code{mwait} machine instructions.
+
@item -mrecip
@opindex mrecip
This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index a01049da472..e2a0e033ab2 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -121,7 +121,7 @@ along with GCC; see the file COPYING3. If not see
form followed by the data for the string. */
#define LTO_major_version 11
-#define LTO_minor_version 1
+#define LTO_minor_version 2
typedef unsigned char lto_decl_flags_t;
diff --git a/gcc/testsuite/gcc.target/i386/monitor-2.c b/gcc/testsuite/gcc.target/i386/monitor-2.c
new file mode 100644
index 00000000000..96eeec070f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/monitor-2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mmwait -mgeneral-regs-only" } */
+
+/* Verify that they work in both 32bit and 64bit. */
+
+#include <x86gprintrin.h>
+
+void
+foo (char *p, int x, int y, int z)
+{
+ _mm_monitor (p, y, x);
+ _mm_mwait (z, y);
+}
+
+void
+bar (char *p, long x, long y, long z)
+{
+ _mm_monitor (p, y, x);
+ _mm_mwait (z, y);
+}
+
+void
+foo1 (char *p)
+{
+ _mm_monitor (p, 0, 0);
+ _mm_mwait (0, 0);
+}
--
2.31.1
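
For readers following the option handling in the patch above, here is a minimal sketch of how the "explicit" mask interacts with the -msse3 implication. It is a standalone model, not GCC code: the struct, the function names and the two mask values are hypothetical stand-ins for x_ix86_isa_flags, x_ix86_isa_flags2, x_ix86_isa_flags2_explicit and the generated OPTION_MASK_* bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ISA bits.  The real masks are
   generated from i386.opt and have different values.  */
#define MASK_SSE3  (UINT64_C (1) << 0)
#define MASK_MWAIT (UINT64_C (1) << 1)

/* Simplified model of the three option words touched by the patch.  */
struct opts_model
{
  uint64_t isa_flags;            /* models x_ix86_isa_flags */
  uint64_t isa_flags2;           /* models x_ix86_isa_flags2 */
  uint64_t isa_flags2_explicit;  /* models x_ix86_isa_flags2_explicit */
};

/* Models the OPT_mmwait case: apply the user's choice and record
   that the MWAIT bit was set or cleared explicitly.  */
static void
handle_mmwait (struct opts_model *o, int value)
{
  assert (o != NULL);
  if (value)
    o->isa_flags2 |= MASK_MWAIT;
  else
    o->isa_flags2 &= ~MASK_MWAIT;
  o->isa_flags2_explicit |= MASK_MWAIT;
}

/* Models the override step: SSE3 turns MWAIT on only when the user
   has not made an explicit choice for that bit.  */
static void
override_internal (struct opts_model *o)
{
  assert (o != NULL);
  if (o->isa_flags & MASK_SSE3)
    o->isa_flags2 |= MASK_MWAIT & ~o->isa_flags2_explicit;
}

/* Returns nonzero when the modelled MWAIT bit is enabled.  */
static int
mwait_enabled (const struct opts_model *o)
{
  assert (o != NULL);
  return (o->isa_flags2 & MASK_MWAIT) != 0;
}
```

In this model, SSE3 alone enables MWAIT, SSE3 combined with an explicit -mno-mwait leaves it disabled, and -mmwait without SSE3 enables it. The last case is the one that matters for -mgeneral-regs-only, where SSE3 is not available.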
* Re: [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only
2021-08-16 12:28 ` Richard Biener
2021-08-16 12:35 ` H.J. Lu
@ 2021-08-16 12:37 ` Martin Liška
1 sibling, 0 replies; 16+ messages in thread
From: Martin Liška @ 2021-08-16 12:37 UTC (permalink / raw)
To: Richard Biener, H.J. Lu; +Cc: Jakub Jelinek, GCC Patches
On 8/16/21 2:28 PM, Richard Biener via Gcc-patches wrote:
> Yes please, and do it with the same commit doing the .opt change.
Just one quick note: I've got a periodic builder that verifies the LTO
stream on tramp3d in all active branches.
Martin
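
As background on why the minor version bump matters, the following is a rough sketch of a versioned-stream check. It assumes the reader accepts only an exact major/minor match, which is my understanding of the LTO streamer but is not taken from this thread. The struct, macro and function names are hypothetical, and only the numbers 11, 1 and 2 come from the lto-streamer.h hunk in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the version fields in a bytecode header.  */
struct stream_header
{
  int major_version;
  int minor_version;
};

/* Values after the patch: the major version stays at 11 and the
   minor version goes from 1 to 2.  */
#define MODEL_MAJOR_VERSION 11
#define MODEL_MINOR_VERSION 2

/* Returns true only when the object was produced by a compiler with
   exactly the same stream version as the reader (assumed policy).  */
static bool
stream_is_readable (const struct stream_header *h)
{
  assert (h != NULL);
  return h->major_version == MODEL_MAJOR_VERSION
	 && h->minor_version == MODEL_MINOR_VERSION;
}
```

Under that assumption, an object streamed at 11.1 would be rejected by a compiler built with the bumped version, instead of having its saved options misread after the new -mmwait entry changes the option layout.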
* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only
2021-08-16 6:11 ` [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only Richard Biener
@ 2021-08-24 14:57 ` H.J. Lu
2021-08-25 7:34 ` Uros Bizjak
0 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2021-08-24 14:57 UTC (permalink / raw)
To: Richard Biener, Jan Hubicka; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek
On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > <x86gprintrin.h> and target("general-regs-only") function attribute
> > were added to GCC 11. But their implementations are incomplete. I'd
> > like to backport the following patches to GCC 11 branch to finish them.
>
> Fine with me if x86 maintainers do not disagree (also see one comment I have
> on the patch adding -mmwait).
Hi Uros, Honza,
Do you have any comments? The updated -mmwait patch with LTO_minor_version
bump is at:
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577471.html
Thanks.
H.J.
> > H.J. Lu (5):
> > x86: Add -mmwait for -mgeneral-regs-only
> > x86: Use crc32 target option for CRC32 intrinsics
> > x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
> > x86: Enable the GPR only instructions for -mgeneral-regs-only
> > <x86gprintrin.h>: Add pragma GCC target("general-regs-only")
> >
> > gcc/common/config/i386/i386-common.c | 45 ++-
> > gcc/config.gcc | 6 +-
> > gcc/config/i386/i386-builtin.def | 8 +-
> > gcc/config/i386/i386-builtins.c | 4 +-
> > gcc/config/i386/i386-c.c | 2 +
> > gcc/config/i386/i386-options.c | 12 +
> > gcc/config/i386/i386.c | 6 +-
> > gcc/config/i386/i386.h | 2 +
> > gcc/config/i386/i386.md | 4 +-
> > gcc/config/i386/i386.opt | 4 +
> > gcc/config/i386/ia32intrin.h | 42 ++-
> > gcc/config/i386/mwaitintrin.h | 52 +++
> > gcc/config/i386/pmmintrin.h | 13 +-
> > gcc/config/i386/serializeintrin.h | 7 +-
> > gcc/config/i386/sse.md | 4 +-
> > gcc/config/i386/x86gprintrin.h | 13 +
> > gcc/doc/extend.texi | 5 +
> > gcc/doc/invoke.texi | 8 +-
> > gcc/testsuite/gcc.target/i386/crc32-6.c | 13 +
> > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++
> > gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 +
> > gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 +
> > gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 +
> > gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 +
> > gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 +
> > gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 +++++++++++++++++++++
> > gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++
> > gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++
> > gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 +
> > gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 +
> > 30 files changed, 717 insertions(+), 45 deletions(-)
> > create mode 100644 gcc/config/i386/mwaitintrin.h
> > create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c
> >
> > --
> > 2.31.1
> >
--
H.J.
* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only
2021-08-24 14:57 ` H.J. Lu
@ 2021-08-25 7:34 ` Uros Bizjak
2021-08-25 12:14 ` H.J. Lu
2021-08-26 6:35 ` Richard Biener
0 siblings, 2 replies; 16+ messages in thread
From: Uros Bizjak @ 2021-08-25 7:34 UTC (permalink / raw)
To: H.J. Lu; +Cc: Richard Biener, Jan Hubicka, GCC Patches, Jakub Jelinek
On Tue, Aug 24, 2021 at 4:57 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > <x86gprintrin.h> and target("general-regs-only") function attribute
> > > were added to GCC 11. But their implementations are incomplete. I'd
> > > like to backport the following patches to GCC 11 branch to finish them.
> >
> > Fine with me if x86 maintainers do not disagree (also see one comment I have
> > on the patch adding -mmwait).
>
> Hi Uros, Honza,
>
> Do you have any comments? The updated -mmwait patch with LTO_minor_version
> bump is at:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577471.html
I don't have any comments, but IIRC, approved changes can be
backported from mainline to release branches without additional
approval.
Uros.
> Thanks.
>
> H.J.
> > > H.J. Lu (5):
> > > x86: Add -mmwait for -mgeneral-regs-only
> > > x86: Use crc32 target option for CRC32 intrinsics
> > > x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
> > > x86: Enable the GPR only instructions for -mgeneral-regs-only
> > > <x86gprintrin.h>: Add pragma GCC target("general-regs-only")
> > >
> > > gcc/common/config/i386/i386-common.c | 45 ++-
> > > gcc/config.gcc | 6 +-
> > > gcc/config/i386/i386-builtin.def | 8 +-
> > > gcc/config/i386/i386-builtins.c | 4 +-
> > > gcc/config/i386/i386-c.c | 2 +
> > > gcc/config/i386/i386-options.c | 12 +
> > > gcc/config/i386/i386.c | 6 +-
> > > gcc/config/i386/i386.h | 2 +
> > > gcc/config/i386/i386.md | 4 +-
> > > gcc/config/i386/i386.opt | 4 +
> > > gcc/config/i386/ia32intrin.h | 42 ++-
> > > gcc/config/i386/mwaitintrin.h | 52 +++
> > > gcc/config/i386/pmmintrin.h | 13 +-
> > > gcc/config/i386/serializeintrin.h | 7 +-
> > > gcc/config/i386/sse.md | 4 +-
> > > gcc/config/i386/x86gprintrin.h | 13 +
> > > gcc/doc/extend.texi | 5 +
> > > gcc/doc/invoke.texi | 8 +-
> > > gcc/testsuite/gcc.target/i386/crc32-6.c | 13 +
> > > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++
> > > gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 +
> > > gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 +
> > > gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 +
> > > gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 +
> > > gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 +
> > > gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 +++++++++++++++++++++
> > > gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++
> > > gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++
> > > gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 +
> > > gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 +
> > > 30 files changed, 717 insertions(+), 45 deletions(-)
> > > create mode 100644 gcc/config/i386/mwaitintrin.h
> > > create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c
> > > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
> > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
> > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
> > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
> > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c
> > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
> > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
> > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
> > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
> > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
> > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c
> > >
> > > --
> > > 2.31.1
> > >
>
>
>
> --
> H.J.
* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only
2021-08-25 7:34 ` Uros Bizjak
@ 2021-08-25 12:14 ` H.J. Lu
2021-08-26 6:35 ` Richard Biener
1 sibling, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2021-08-25 12:14 UTC (permalink / raw)
To: Uros Bizjak; +Cc: Richard Biener, Jan Hubicka, GCC Patches, Jakub Jelinek
On Wed, Aug 25, 2021 at 12:34 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Tue, Aug 24, 2021 at 4:57 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> > >
> > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > > <x86gprintrin.h> and target("general-regs-only") function attribute
> > > > were added to GCC 11. But their implementations are incomplete. I'd
> > > > like to backport the following patches to GCC 11 branch to finish them.
> > >
> > > Fine with me if x86 maintainers do not disagree (also see one comment I have
> > > on the patch adding -mmwait).
> >
> > Hi Uros, Honza,
> >
> > Do you have any comments? The updated -mmwait patch with LTO_minor_version
> > bump is at:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577471.html
>
> I don't have any comments, but IIRC, approved changes can be
> backported from mainline to release branches without additional
> approval.
I am checking them in.
Thanks.
> Uros.
>
> > Thanks.
> >
> > H.J.
> > > > H.J. Lu (5):
> > > > x86: Add -mmwait for -mgeneral-regs-only
> > > > x86: Use crc32 target option for CRC32 intrinsics
> > > > x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
> > > > x86: Enable the GPR only instructions for -mgeneral-regs-only
> > > > <x86gprintrin.h>: Add pragma GCC target("general-regs-only")
> > > >
> > > > gcc/common/config/i386/i386-common.c | 45 ++-
> > > > gcc/config.gcc | 6 +-
> > > > gcc/config/i386/i386-builtin.def | 8 +-
> > > > gcc/config/i386/i386-builtins.c | 4 +-
> > > > gcc/config/i386/i386-c.c | 2 +
> > > > gcc/config/i386/i386-options.c | 12 +
> > > > gcc/config/i386/i386.c | 6 +-
> > > > gcc/config/i386/i386.h | 2 +
> > > > gcc/config/i386/i386.md | 4 +-
> > > > gcc/config/i386/i386.opt | 4 +
> > > > gcc/config/i386/ia32intrin.h | 42 ++-
> > > > gcc/config/i386/mwaitintrin.h | 52 +++
> > > > gcc/config/i386/pmmintrin.h | 13 +-
> > > > gcc/config/i386/serializeintrin.h | 7 +-
> > > > gcc/config/i386/sse.md | 4 +-
> > > > gcc/config/i386/x86gprintrin.h | 13 +
> > > > gcc/doc/extend.texi | 5 +
> > > > gcc/doc/invoke.texi | 8 +-
> > > > gcc/testsuite/gcc.target/i386/crc32-6.c | 13 +
> > > > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++
> > > > gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 +
> > > > gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 +
> > > > gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 +
> > > > gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 +
> > > > gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 +
> > > > gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 +++++++++++++++++++++
> > > > gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++
> > > > gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++
> > > > gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 +
> > > > gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 +
> > > > 30 files changed, 717 insertions(+), 45 deletions(-)
> > > > create mode 100644 gcc/config/i386/mwaitintrin.h
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c
> > > >
> > > > --
> > > > 2.31.1
> > > >
> >
> >
> >
> > --
> > H.J.
--
H.J.
* Re: [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only
2021-08-25 7:34 ` Uros Bizjak
2021-08-25 12:14 ` H.J. Lu
@ 2021-08-26 6:35 ` Richard Biener
1 sibling, 0 replies; 16+ messages in thread
From: Richard Biener @ 2021-08-26 6:35 UTC (permalink / raw)
To: Uros Bizjak; +Cc: H.J. Lu, Jan Hubicka, GCC Patches, Jakub Jelinek
On Wed, Aug 25, 2021 at 9:34 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Tue, Aug 24, 2021 at 4:57 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Sun, Aug 15, 2021 at 11:11 PM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> > >
> > > On Fri, Aug 13, 2021 at 3:51 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > > <x86gprintrin.h> and target("general-regs-only") function attribute
> > > > were added to GCC 11. But their implementations are incomplete. I'd
> > > > like to backport the following patches to GCC 11 branch to finish them.
> > >
> > > Fine with me if x86 maintainers do not disagree (also see one comment I have
> > > on the patch adding -mmwait).
> >
> > Hi Uros, Honza,
> >
> > Do you have any comments? The updated -mmwait patch with LTO_minor_version
> > bump is at:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577471.html
>
> I don't have any comments, but IIRC, approved changes can be
> backported from mainline to release branches without additional
> approval.
If they fix regressions, yes. As I understood it, this is not such an obvious
case (it is instead a new but buggy feature).
Richard.
> Uros.
>
> > Thanks.
> >
> > H.J.
> > > > H.J. Lu (5):
> > > > x86: Add -mmwait for -mgeneral-regs-only
> > > > x86: Use crc32 target option for CRC32 intrinsics
> > > > x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions
> > > > x86: Enable the GPR only instructions for -mgeneral-regs-only
> > > > <x86gprintrin.h>: Add pragma GCC target("general-regs-only")
> > > >
> > > > gcc/common/config/i386/i386-common.c | 45 ++-
> > > > gcc/config.gcc | 6 +-
> > > > gcc/config/i386/i386-builtin.def | 8 +-
> > > > gcc/config/i386/i386-builtins.c | 4 +-
> > > > gcc/config/i386/i386-c.c | 2 +
> > > > gcc/config/i386/i386-options.c | 12 +
> > > > gcc/config/i386/i386.c | 6 +-
> > > > gcc/config/i386/i386.h | 2 +
> > > > gcc/config/i386/i386.md | 4 +-
> > > > gcc/config/i386/i386.opt | 4 +
> > > > gcc/config/i386/ia32intrin.h | 42 ++-
> > > > gcc/config/i386/mwaitintrin.h | 52 +++
> > > > gcc/config/i386/pmmintrin.h | 13 +-
> > > > gcc/config/i386/serializeintrin.h | 7 +-
> > > > gcc/config/i386/sse.md | 4 +-
> > > > gcc/config/i386/x86gprintrin.h | 13 +
> > > > gcc/doc/extend.texi | 5 +
> > > > gcc/doc/invoke.texi | 8 +-
> > > > gcc/testsuite/gcc.target/i386/crc32-6.c | 13 +
> > > > gcc/testsuite/gcc.target/i386/monitor-2.c | 27 ++
> > > > gcc/testsuite/gcc.target/i386/pr101492-1.c | 10 +
> > > > gcc/testsuite/gcc.target/i386/pr101492-2.c | 10 +
> > > > gcc/testsuite/gcc.target/i386/pr101492-3.c | 10 +
> > > > gcc/testsuite/gcc.target/i386/pr101492-4.c | 12 +
> > > > gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 +
> > > > gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 +++++++++++++++++++++
> > > > gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++
> > > > gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++
> > > > gcc/testsuite/gcc.target/i386/pr99744-7.c | 12 +
> > > > gcc/testsuite/gcc.target/i386/pr99744-8.c | 13 +
> > > > 30 files changed, 717 insertions(+), 45 deletions(-)
> > > > create mode 100644 gcc/config/i386/mwaitintrin.h
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/crc32-6.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/monitor-2.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-1.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-2.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-3.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101492-4.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c
> > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c
> > > >
> > > > --
> > > > 2.31.1
> > > >
> >
> >
> >
> > --
> > H.J.
end of thread (last message 2021-08-26 6:35 UTC)
Thread overview: 16+ messages
2021-08-13 13:50 [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only H.J. Lu
2021-08-13 13:50 ` [PATCH 1/5] x86: Add -mmwait for -mgeneral-regs-only H.J. Lu
2021-08-16 6:11 ` Richard Biener
2021-08-16 12:25 ` H.J. Lu
2021-08-16 12:28 ` Richard Biener
2021-08-16 12:35 ` H.J. Lu
2021-08-16 12:37 ` Martin Liška
2021-08-13 13:51 ` [PATCH 2/5] x86: Use crc32 target option for CRC32 intrinsics H.J. Lu
2021-08-13 13:51 ` [PATCH 3/5] x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions H.J. Lu
2021-08-13 13:51 ` [PATCH 4/5] x86: Enable the GPR only instructions for -mgeneral-regs-only H.J. Lu
2021-08-13 13:51 ` [PATCH 5/5] <x86gprintrin.h>: Add pragma GCC target("general-regs-only") H.J. Lu
2021-08-16 6:11 ` [GCC-11] [PATCH 0/5] Finish <x86gprintrin.h> and general-regs-only Richard Biener
2021-08-24 14:57 ` H.J. Lu
2021-08-25 7:34 ` Uros Bizjak
2021-08-25 12:14 ` H.J. Lu
2021-08-26 6:35 ` Richard Biener