Re: [RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]

From: Uros Bizjak <ubizjak@gmail.com>
To: Richard Biener <rguenther@suse.de>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	Jan Hubicka <hubicka@ucw.cz>,
	 Hongtao Liu <hongtao.liu@intel.com>
Subject: Re: [RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]
Date: Tue, 8 Aug 2023 13:03:34 +0200
Message-ID: <CAFULd4a_fapV6z5qJjtVQXqjv1S7_jvM_KwXiUYCWdHG2bC3zg@mail.gmail.com>
In-Reply-To: <nycvar.YFH.7.77.849.2308081003560.12935@jbgna.fhfr.qr>

On Tue, Aug 8, 2023 at 12:08 PM Richard Biener <rguenther@suse.de> wrote:

> > > > > > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF
> > > > > > named patterns in order to avoid generation of partial vector V4SFmode
> > > > > > trapping instructions.
> > > > > >
> > > > > > The new option is enabled by default, because even with sanitization,
> > > > > > a small but consistent speed up of 2 to 3% with Polyhedron capacita
> > > > > > benchmark can be achieved vs. scalar code.
> > > > > >
> > > > > > Using -fno-trapping-math improves Polyhedron capacita runtime 8 to 9%
> > > > > > vs. scalar code.  This is what clang does by default, as it defaults
> > > > > > to -fno-trapping-math.
> > > > >
> > > > > I like the new option, note you lack invoke.texi documentation where
> > > > > I'd also elaborate a bit on the interaction with -fno-trapping-math
> > > > > and the possible performance impact when NaNs or denormals leak
> > > > > into the upper halves and cross-reference -mdaz-ftz.
> > > >
> > > > The attached doc patch is the invoke.texi entry for the -mmmxfp-with-sse
> > > > option. It is written in a way to also cover half-float vectors. WDYT?
> > >
> > > "generate trapping floating-point operations"
> > >
> > > I'd say "generate floating-point operations that might affect the
> > > set of floating point status flags", the word "trapping" is IMHO
> > > misleading.
> > > Not sure if "set of floating point status flags" is the correct term,
> > > but it's what the C standard seems to refer to when talking about
> > > things you get with fegetexceptflag.  feraiseexcept refers to
> > > "floating-point exceptions".  Unfortunately the -fno-trapping-math
> > > documentation is similarly confusing (and maybe even wrong, I read
> > > it to conform to 'non-stop' IEEE arithmetic).
> >
> > Thanks for suggesting the right terminology. I think that:
> >
> > +@opindex mpartial-vector-math
> > +@item -mpartial-vector-math
> > +This option enables GCC to generate floating-point operations that might
> > +affect the set of floating point status flags on partial vectors, where
> > +vector elements reside in the low part of the 128-bit SSE register.  Unless
> > +@option{-fno-trapping-math} is specified, the compiler guarantees correct
> > +behavior by sanitizing all input operands to have zeroes in the unused
> > +upper part of the vector register.  Note that by using built-in functions
> > +or inline assembly with partial vector arguments, NaNs, denormal or invalid
> > +values can leak into the upper part of the vector, causing possible
> > +performance issues when @option{-fno-trapping-math} is in effect.  These
> > +issues can be mitigated by manually sanitizing the upper part of the partial
> > +vector argument register or by using @option{-mdaz-ftz} to set the
> > +denormals-are-zero (DAZ) flag in the MXCSR register.
> >
> > now explains in adequate detail what the option does. IMO, the
> > "floating-point operations that might affect the set of floating point
> > status flags" correctly identifies affected operations, so an example,
> > as suggested below, is not necessary.
> >
> > > I'd maybe give an example of a FP operation that's _not_ affected
> > > by the flag (copysign?).
> >
> > Please note that I have renamed the option to "-mpartial-vector-math"
> > with a short target-specific description:
>
> Ah yes, that's a less confusing name but then it might suggest
> that -mno-partial-vector-math would disable all of that, including
> integer ops, not only the patterns possibly affecting the exception
> flags?  Note I don't have a better suggestion and this is clearly
> better than the one mentioning mmx.

You are right, I think I'll rename the option to -mpartial-vector-fp-math.

Thanks,
Uros.
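
The sanitization described in the quoted documentation can be pictured with a minimal C sketch using GCC vector extensions. It is not taken from the patch: the type and function names (v2sf, v4sf, widen_sanitized, add_v2sf) are hypothetical, and it only models in source code what the compiler would do internally when it performs a two-element operation with a full-width 128-bit instruction. It uses addition, where zeroed upper lanes cannot raise additional exception flags.

```c
#include <assert.h>

/* GCC/Clang vector extensions.  The typedef names are hypothetical
   and chosen only to mirror the V2SF/V4SF mode names in the thread.  */
typedef float v2sf __attribute__((vector_size(8)));
typedef float v4sf __attribute__((vector_size(16)));

/* Place a two-element vector in the low half of a four-element
   vector and zero the unused upper half ("sanitizing").  */
static v4sf widen_sanitized(v2sf lo)
{
    v4sf wide = { lo[0], lo[1], 0.0f, 0.0f };
    return wide;
}

/* Add two partial vectors with a full-width addition.  Because the
   upper lanes of both inputs are zero, the upper lanes compute
   0.0f + 0.0f and cannot set any exception flag on their own.  */
static v2sf add_v2sf(v2sf a, v2sf b)
{
    v4sf sum = widen_sanitized(a) + widen_sanitized(b);
    v2sf result = { sum[0], sum[1] };
    return result;
}

/* Small self-check: the low lanes carry the real result and the
   sanitized upper lanes stay zero.  */
static void check_add_v2sf(void)
{
    v2sf a = { 1.0f, 2.0f };
    v2sf b = { 3.0f, 4.0f };
    v2sf r = add_v2sf(a, b);
    v4sf w = widen_sanitized(a);

    assert(r[0] == 4.0f && r[1] == 6.0f);
    assert(w[2] == 0.0f && w[3] == 0.0f);
}
```

Without the zeroing step, whatever happens to sit in the upper lanes (for example a NaN or a denormal left over from inline assembly) would take part in the full-width operation, which is the situation the documentation text warns about.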
