From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=6SNn=DZ=gmail.com=ubizjak@sourceware.org>
Received: from mail-lf1-x12c.google.com (mail-lf1-x12c.google.com [IPv6:2a00:1450:4864:20::12c])
	by sourceware.org (Postfix) with ESMTPS id F117C3858280
	for <gcc-patches@gcc.gnu.org>; Tue,  8 Aug 2023 11:03:47 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org F117C3858280
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com
Received: by mail-lf1-x12c.google.com with SMTP id 2adb3069b0e04-4fe15bfb1adso9160030e87.0
        for <gcc-patches@gcc.gnu.org>; Tue, 08 Aug 2023 04:03:47 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20221208; t=1691492626; x=1692097426;
        h=content-transfer-encoding:cc:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:from:to:cc:subject:date
         :message-id:reply-to;
        bh=8cgU9Z2xKhXQn8gXKB9TM6qDRB74Dtg9sD91F/7TOuU=;
        b=ajg+gxvMYHutKbKlxT5K9oitxJ6RIr/a6Nf1gxj+zL3Ft0oEIP/b5Cg52zzi6mln4k
         0eyHfaLrUL3EDnK61XPE67KBdbJleXxz5yMm9+Fds9goYnGCvfazfkMchH9Kwvq6HLoS
         Ur5CYKAnbWPketcz1ZAqYX7EPtONVYd1ILee+fkxV7u7kYaZlvT8QE0INvhVRGSgZ/KU
         xtsTiwlzuAdJeViSYpaU3FgVkqlwa10QJaoHhTNgv/wUe49SAhp7/JfTC3ieUIOnSr5J
         tI9sRdv3EPWmqU2n27uJWJYOuNFmRIOHvQ7uWvJ7BFsQWU/h5Nfh+VtbP7XOr3Yj5O7w
         zMVw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20221208; t=1691492626; x=1692097426;
        h=content-transfer-encoding:cc:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=8cgU9Z2xKhXQn8gXKB9TM6qDRB74Dtg9sD91F/7TOuU=;
        b=eeE7PVP1Wga1R9YaHIVvMDw+ZCSfuk4KKiasQD6goBoyJjDjd1DRBDjxjkW9Nz+R/d
         mHBbZmZZkyWaakvfPdBb4hXPKsGct5GAq4JRSZe611GSSTXAlUv3QOu4VHNFiMaFORS7
         MYz7madIhbs6p8/FrH3GAsx3T2rRH059sCQSZFTn1pAXUpY/GGjI9u617hKU8XH+re2b
         cGaPtaDqgSKC9kFrlG2khztJl6HbUfBKQbbBi7VLxw4lrR9Zdb+qLcADbeULuXiHrZ41
         3DyUGLfO3/4To2DUaIqDkBu26foXqK6zmqgYA6IoyuIzvQV5THg1d3T6foB4sCx0/TIe
         cWNQ==
X-Gm-Message-State: AOJu0YzTDxQ0rXbEUNLaSrh5JknVWZ9BIdhcBlotDTut1UklJJh7GkdG
	DaaszVOVOTfZmID5uX2uGi93M7qTH2eY/Tmfb4g=
X-Google-Smtp-Source: AGHT+IHJ5rVxKltifWtFWulcfx/m7sGSPWy+BtuA8h7AEH5xpHClIkQIO9uJLVXYqWymwN35rehjc0Spw0JQDlDE8hM=
X-Received: by 2002:a05:6512:e9a:b0:4fd:d213:dfd0 with SMTP id
 bi26-20020a0565120e9a00b004fdd213dfd0mr9324640lfb.11.1691492626097; Tue, 08
 Aug 2023 04:03:46 -0700 (PDT)
MIME-Version: 1.0
References: <CAFULd4abm7fZrKOYWMibFDM=uBk1TET0vSn7=5=-tYhcVrRdUA@mail.gmail.com>
 <nycvar.YFH.7.77.849.2307310937420.12935@jbgna.fhfr.qr> <CAFULd4ZUXTDoAgGN_xi0tgWCm=gC5Vmd1nM+0KHa+MDPRC5V9A@mail.gmail.com>
 <nycvar.YFH.7.77.849.2308080758430.12935@jbgna.fhfr.qr> <CAFULd4YhGRqs9ByoQQjXwEB+ndi9VHmkv=1RUu2GBUskT8c2GQ@mail.gmail.com>
 <nycvar.YFH.7.77.849.2308081003560.12935@jbgna.fhfr.qr>
In-Reply-To: <nycvar.YFH.7.77.849.2308081003560.12935@jbgna.fhfr.qr>
From: Uros Bizjak <ubizjak@gmail.com>
Date: Tue, 8 Aug 2023 13:03:34 +0200
Message-ID: <CAFULd4a_fapV6z5qJjtVQXqjv1S7_jvM_KwXiUYCWdHG2bC3zg@mail.gmail.com>
Subject: Re: [RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with
 -fno-trapping-math [PR110832]
To: Richard Biener <rguenther@suse.de>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>, Jan Hubicka <hubicka@ucw.cz>, 
	Hongtao Liu <hongtao.liu@intel.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-1.6 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Tue, Aug 8, 2023 at 12:08 PM Richard Biener <rguenther@suse.de> wrote:

> > > > > > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF
> > > > > > named patterns in order to avoid generation of partial vector V4SFmode
> > > > > > trapping instructions.
> > > > > >
> > > > > > The new option is enabled by default, because even with sanitization,
> > > > > > a small but consistent speed up of 2 to 3% with Polyhedron capacita
> > > > > > benchmark can be achieved vs. scalar code.
> > > > > >
> > > > > > Using -fno-trapping-math improves Polyhedron capacita runtime 8 to 9%
> > > > > > vs. scalar code.  This is what clang does by default, as it defaults
> > > > > > to -fno-trapping-math.
> > > > >
> > > > > I like the new option, note you lack invoke.texi documentation where
> > > > > I'd also elaborate a bit on the interaction with -fno-trapping-math
> > > > > and the possible performance impact when NaNs or denormals leak
> > > > > into the upper halves and cross-reference -mdaz-ftz.
> > > >
> > > > The attached doc patch is invoke.texi entry for -mmmxfp-with-sse
> > > > option. It is written in a way to also cover half-float vectors. WDYT?
> > >
> > > "generate trapping floating-point operations"
> > >
> > > I'd say "generate floating-point operations that might affect the
> > > set of floating point status flags", the word "trapping" is IMHO
> > > misleading.
> > > Not sure if "set of floating point status flags" is the correct term,
> > > but it's what the C standard seems to refer to when talking about
> > > things you get with fegetexceptflag.  feraiseexcept refers to
> > > "floating-point exceptions".  Unfortunately the -fno-trapping-math
> > > documentation is similarly confusing (and maybe even wrong, I read
> > > it to conform to 'non-stop' IEEE arithmetic).
> >
> > Thanks for suggesting the right terminology. I think that:
> >
> > +@opindex mpartial-vector-math
> > +@item -mpartial-vector-math
> > +This option enables GCC to generate floating-point operations that might
> > +affect the set of floating point status flags on partial vectors, where
> > +vector elements reside in the low part of the 128-bit SSE register.  Unless
> > +@option{-fno-trapping-math} is specified, the compiler guarantees correct
> > +behavior by sanitizing all input operands to have zeroes in the unused
> > +upper part of the vector register.  Note that by using built-in functions
> > +or inline assembly with partial vector arguments, NaNs, denormal or invalid
> > +values can leak into the upper part of the vector, causing possible
> > +performance issues when @option{-fno-trapping-math} is in effect.  These
> > +issues can be mitigated by manually sanitizing the upper part of the partial
> > +vector argument register or by using @option{-mdaz-ftz} to set
> > +denormals-are-zero (DAZ) flag in the MXCSR register.
> >
> > now explains in adequate detail what the option does. IMO, the
> > "floating-point operations that might affect the set of floating point
> > status flags" correctly identifies affected operations, so an example,
> > as suggested below, is not necessary.
> >
> > > I'd maybe give an example of a FP operation that's _not_ affected
> > > by the flag (copysign?).
> >
> > Please note that I have renamed the option to "-mpartial-vector-math"
> > with a short target-specific description:
>
> Ah yes, that's a less confusing name but then it might suggest
> that -mno-partial-vector-math would disable all of that, including
> integer ops, not only the patterns possibly affecting the exception
> flags?  Note I don't have a better suggestion and this is clearly
> better than the one mentioning mmx.

You are right, I think I'll rename the option to -mpartial-vector-fp-math.
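
For anyone following the thread, the sanitizing described in the quoted
documentation text can be pictured roughly as below. This is only a
sketch in C with GCC vector extensions, not the backend implementation
(which works at the instruction-pattern level). The names v2sf, v4sf,
widen_sanitized, low_half and v2sf_add are made up for illustration.

```c
/* Hypothetical illustration types, using GCC/Clang vector extensions. */
typedef float v2sf __attribute__((vector_size(8)));   /* partial vector */
typedef float v4sf __attribute__((vector_size(16)));  /* full 128-bit vector */

/* Place a 2-float partial vector in the low half of a 4-float vector
   and zero the unused upper half ("sanitizing").  */
static v4sf widen_sanitized(v2sf x)
{
    v4sf r = { x[0], x[1], 0.0f, 0.0f };
    return r;
}

/* Extract the low two elements again.  */
static v2sf low_half(v4sf x)
{
    v2sf r = { x[0], x[1] };
    return r;
}

/* Model of a V2SF addition carried out as a full-width V4SF add.
   Because both inputs have zeros in lanes 2 and 3, those lanes compute
   0.0f + 0.0f and cannot contribute any floating-point exception flag.  */
static v2sf v2sf_add(v2sf a, v2sf b)
{
    v4sf wide = widen_sanitized(a) + widen_sanitized(b);
    return low_half(wide);
}
```

Without the zeroing step, whatever happens to sit in the upper lanes
(a NaN or a denormal, say) would take part in the full-width operation
as well, which is the case the documentation text warns about when
-fno-trapping-math is in effect.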

Thanks,
Uros.