From: Uros Bizjak <ubizjak@gmail.com>
To: Richard Biener <rguenther@suse.de>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
Jan Hubicka <hubicka@ucw.cz>,
Hongtao Liu <hongtao.liu@intel.com>
Subject: Re: [RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]
Date: Mon, 7 Aug 2023 17:59:58 +0200
Message-ID: <CAFULd4ZUXTDoAgGN_xi0tgWCm=gC5Vmd1nM+0KHa+MDPRC5V9A@mail.gmail.com>
In-Reply-To: <nycvar.YFH.7.77.849.2307310937420.12935@jbgna.fhfr.qr>
[-- Attachment #1: Type: text/plain, Size: 1106 bytes --]
On Mon, Jul 31, 2023 at 11:40 AM Richard Biener <rguenther@suse.de> wrote:
>
> On Sun, 30 Jul 2023, Uros Bizjak wrote:
>
> > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF
> > named patterns in order to avoid generation of partial vector V4SFmode
> > trapping instructions.
> >
> > The new option is enabled by default, because even with sanitization,
> > a small but consistent speed up of 2 to 3% with the Polyhedron capacita
> > benchmark can be achieved vs. scalar code.
> >
> > Using -fno-trapping-math improves Polyhedron capacita runtime by 8 to 9%
> > vs. scalar code. This is what clang does by default, as it defaults
> > to -fno-trapping-math.
>
> I like the new option. Note that it lacks invoke.texi documentation, where
> I'd also elaborate a bit on the interaction with -fno-trapping-math
> and the possible performance impact when NaNs or denormals leak
> into the upper halves, and cross-reference -mdaz-ftz.
The attached doc patch adds the invoke.texi entry for the -mmmxfp-with-sse
option. It is written so that it also covers half-float vectors. WDYT?
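As an illustration of the kind of source the entry is about, here is a
minimal sketch; it is not part of the patch, and the type and function
names are hypothetical. With the generic vector extension, a two-element
float vector is 8 bytes wide, so on x86-64 it can live in the low half of
a 128-bit SSE register and the addition becomes a candidate for a
partial-vector operation. Which instructions are actually emitted depends
on the target and the options in effect.

```c
/* Hypothetical two-element float vector (8 bytes).  On x86-64 this is the
   kind of value that can occupy the low half of a 128-bit SSE register.  */
typedef float v2sf __attribute__ ((vector_size (8)));

/* Element-wise addition of two partial vectors.  Only the two low elements
   are meaningful; the upper half of the register is what the proposed
   sanitization is concerned with when trapping math is enabled.  */
v2sf
add_v2sf (v2sf a, v2sf b)
{
  return a + b;
}
```

Comparing the assembly from gcc -O2 -S with and without -fno-trapping-math
should show whether the input operands are sanitized before the operation.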
Uros.
[-- Attachment #2: d.diff.txt --]
[-- Type: text/plain, Size: 1788 bytes --]
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index fa765d5a0dd..99093172abe 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1417,6 +1417,7 @@ See RS/6000 and PowerPC Options.
-mcld -mcx16 -msahf -mmovbe -mcrc32 -mmwait
-mrecip -mrecip=@var{opt}
-mvzeroupper -mprefer-avx128 -mprefer-vector-width=@var{opt}
+-mmmxfp-with-sse
-mmove-max=@var{bits} -mstore-max=@var{bits}
-mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx
-mavx2 -mavx512f -mavx512pf -mavx512er -mavx512cd -mavx512vl
@@ -33708,6 +33709,22 @@ This option instructs GCC to use 128-bit AVX instructions instead of
This option instructs GCC to use @var{opt}-bit vector width in instructions
instead of default on the selected platform.
+@opindex mmmxfp-with-sse
+@item -mmmxfp-with-sse
+This option enables GCC to generate trapping floating-point operations on
+partial vectors, where vector elements reside in the low part of the 128-bit
+SSE register. Unless @option{-fno-trapping-math} is specified, the compiler
+guarantees correct trapping behavior by sanitizing all input operands to
+have zeroes in the upper part of the vector register. Note that by using
+built-in functions or inline assembly with partial vector arguments, NaNs,
+denormal or invalid values can leak into the upper part of the vector,
+causing possible performance issues when @option{-fno-trapping-math} is in
+effect. These issues can be mitigated by manually sanitizing the upper part
+of the partial vector argument register or by using @option{-mdaz-ftz} to set
+the denormals-are-zero (DAZ) flag in the MXCSR register.
+
+This option is enabled by default.
+
@opindex mmove-max
@item -mmove-max=@var{bits}
This option instructs GCC to set the maximum number of bits can be