From: Hongtao Liu
Date: Fri, 12 May 2023 13:43:38 +0800
Subject: Re: [PATCH] x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.
To: liuhongt
Cc: gcc-patches@gcc.gnu.org, hjl.tools@gmail.com
References: <20230510090758.3737162-1-hongtao.liu@intel.com>
In-Reply-To: <20230510090758.3737162-1-hongtao.liu@intel.com>

On Wed, May 10, 2023 at 5:10 PM liuhongt wrote:
>
> > The quoted patch shows -shared in context and you didn't post a
> > backport version
> > to look at. But yes, we shouldn't change -shared behavior on a
> > branch, even less so make it
> > inconsistent between targets.
> Here's the patch.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for GCC 11/12 backport?
I'm going to push the patch next week if there's no objection.
>
> if (mdaz-ftz)
>   link crtfastmath.o
> else if ((Ofast || ffast-math || funsafe-math-optimizations)
>          && !mno-daz-ftz)
>   link crtfastmath.o
> else
>   Don't link crtfastmath.o
>
> gcc/ChangeLog:
>
> 	* config/i386/cygwin.h (ENDFILE_SPEC): Link crtfastmath.o
> 	whenever -mdaz-ftz is specified. Don't link crtfastmath.o
> 	when -mno-daz-ftz is specified.
> 	* config/i386/darwin.h (ENDFILE_SPEC): Ditto.
> 	* config/i386/gnu-user-common.h
> 	(GNU_USER_TARGET_MATHFILE_SPEC): Ditto.
> 	* config/i386/mingw32.h (ENDFILE_SPEC): Ditto.
> 	* config/i386/i386.opt (mdaz-ftz): New option.
> 	* doc/invoke.texi (x86 options): Document mdaz-ftz.
> ---
>  gcc/config/i386/cygwin.h          |  2 +-
>  gcc/config/i386/darwin.h          |  4 ++--
>  gcc/config/i386/gnu-user-common.h |  2 +-
>  gcc/config/i386/i386.opt          |  4 ++++
>  gcc/config/i386/mingw32.h         |  2 +-
>  gcc/doc/invoke.texi               | 11 ++++++++++-
>  6 files changed, 19 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/config/i386/cygwin.h b/gcc/config/i386/cygwin.h
> index d06eda369cf..5412c5d4479 100644
> --- a/gcc/config/i386/cygwin.h
> +++ b/gcc/config/i386/cygwin.h
> @@ -57,7 +57,7 @@ along with GCC; see the file COPYING3. If not see
>
>  #undef ENDFILE_SPEC
>  #define ENDFILE_SPEC \
> -  "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s}\
> +  "%{mdaz-ftz:crtfastmath.o%s;Ofast|ffast-math|funsafe-math-optimizations:%{!mno-daz-ftz:crtfastmath.o%s}} \
>     %{!shared:%:if-exists(default-manifest.o%s)}\
>     %{fvtable-verify=none:%s; \
>      fvtable-verify=preinit:vtv_end.o%s; \
> diff --git a/gcc/config/i386/darwin.h b/gcc/config/i386/darwin.h
> index a55f6b2b874..2f773924d6e 100644
> --- a/gcc/config/i386/darwin.h
> +++ b/gcc/config/i386/darwin.h
> @@ -109,8 +109,8 @@ along with GCC; see the file COPYING3. If not see
>    "%{!force_cpusubtype_ALL:-force_cpusubtype_ALL} "
>
>  #undef ENDFILE_SPEC
> -#define ENDFILE_SPEC \
> -  "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \
> +#define ENDFILE_SPEC \
> +  "%{mdaz-ftz:crtfastmath.o%s;Ofast|ffast-math|funsafe-math-optimizations:%{!mno-daz-ftz:crtfastmath.o%s}} \
>     %{mpc32:crtprec32.o%s} \
>     %{mpc64:crtprec64.o%s} \
>     %{mpc80:crtprec80.o%s}" TM_DESTRUCTOR
> diff --git a/gcc/config/i386/gnu-user-common.h b/gcc/config/i386/gnu-user-common.h
> index 23b54c5be52..3d2a33f1714 100644
> --- a/gcc/config/i386/gnu-user-common.h
> +++ b/gcc/config/i386/gnu-user-common.h
> @@ -47,7 +47,7 @@ along with GCC; see the file COPYING3. If not see
>
>  /* Similar to standard GNU userspace, but adding -ffast-math support. */
>  #define GNU_USER_TARGET_MATHFILE_SPEC \
> -  "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \
> +  "%{mdaz-ftz:crtfastmath.o%s;Ofast|ffast-math|funsafe-math-optimizations:%{!mno-daz-ftz:crtfastmath.o%s}} \
>     %{mpc32:crtprec32.o%s} \
>     %{mpc64:crtprec64.o%s} \
>     %{mpc80:crtprec80.o%s}"
> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> index a3675e515bc..5cfb7cdcbc2 100644
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -420,6 +420,10 @@ mpc80
>  Target RejectNegative
>  Set 80387 floating-point precision to 80-bit.
>
> +mdaz-ftz
> +Target
> +Set the FTZ and DAZ Flags.
> +
>  mpreferred-stack-boundary=
>  Target RejectNegative Joined UInteger Var(ix86_preferred_stack_boundary_arg)
>  Attempt to keep stack aligned to this power of 2.
> diff --git a/gcc/config/i386/mingw32.h b/gcc/config/i386/mingw32.h
> index d3ca0cd0279..ddbe6a4054b 100644
> --- a/gcc/config/i386/mingw32.h
> +++ b/gcc/config/i386/mingw32.h
> @@ -197,7 +197,7 @@ along with GCC; see the file COPYING3. If not see
>
>  #undef ENDFILE_SPEC
>  #define ENDFILE_SPEC \
> -  "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \
> +  "%{mdaz-ftz:crtfastmath.o%s;Ofast|ffast-math|funsafe-math-optimizations:%{!mno-daz-ftz:crtfastmath.o%s}} \
>     %{!shared:%:if-exists(default-manifest.o%s)}\
>     %{fvtable-verify=none:%s; \
>      fvtable-verify=preinit:vtv_end.o%s; \
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index cb83dd8a1cc..87eedfffa6c 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1434,7 +1434,7 @@ See RS/6000 and PowerPC Options.
>  -m96bit-long-double -mlong-double-64 -mlong-double-80 -mlong-double-128 @gol
>  -mregparm=@var{num} -msseregparm @gol
>  -mveclibabi=@var{type} -mvect8-ret-in-mem @gol
> --mpc32 -mpc64 -mpc80 -mstackrealign @gol
> +-mpc32 -mpc64 -mpc80 -mdaz-ftz -mstackrealign @gol
>  -momit-leaf-frame-pointer -mno-red-zone -mno-tls-direct-seg-refs @gol
>  -mcmodel=@var{code-model} -mabi=@var{name} -maddress-mode=@var{mode} @gol
>  -m32 -m64 -mx32 -m16 -miamcu -mlarge-data-threshold=@var{num} @gol
> @@ -32078,6 +32078,15 @@ are enabled by default; routines in such libraries could suffer significant
>  loss of accuracy, typically through so-called ``catastrophic cancellation'',
>  when this option is used to set the precision to less than extended precision.
>
> +@item -mdaz-ftz
> +@opindex mdaz-ftz
> +
> +The flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags in the MXCSR register
> +are used to control floating-point calculations. SSE and AVX instructions
> +including scalar and vector instructions could benefit from enabling the FTZ
> +and DAZ flags when @option{-mdaz-ftz} is specified. Don't set FTZ/DAZ flags
> +when @option{-mno-daz-ftz} is specified.
> +
>  @item -mstackrealign
>  @opindex mstackrealign
>  Realign the stack at entry. On the x86, the @option{-mstackrealign}
> --
> 2.39.1.388.g2fc9e9ca3c
>

--
BR,
Hongtao
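
For reference, the link rule given in the quoted pseudocode can be modelled as a small C function. This is only an illustrative sketch, not code from GCC: the enum daz_ftz_opt and the function links_crtfastmath are hypothetical names, and the sketch assumes that -mdaz-ftz / -mno-daz-ftz have already been resolved to a single tri-state value. In the patch itself the rule is expressed through the ENDFILE_SPEC and GNU_USER_TARGET_MATHFILE_SPEC spec strings.

```c
#include <stdbool.h>

/* Hypothetical tri-state for the -mdaz-ftz / -mno-daz-ftz pair
   (assumption: the two forms never need to be seen together).  */
enum daz_ftz_opt { DAZ_FTZ_UNSET, DAZ_FTZ_ON, DAZ_FTZ_OFF };

/* Return true when crtfastmath.o would be linked.
   FAST_MATH stands for any of -Ofast, -ffast-math or
   -funsafe-math-optimizations being given.  */
static bool
links_crtfastmath (enum daz_ftz_opt daz_ftz, bool fast_math)
{
  /* -mdaz-ftz always links crtfastmath.o.  */
  if (daz_ftz == DAZ_FTZ_ON)
    return true;

  /* Fast-math options link it unless -mno-daz-ftz vetoes.  */
  if (fast_math && daz_ftz != DAZ_FTZ_OFF)
    return true;

  /* Otherwise crtfastmath.o is not linked.  */
  return false;
}
```

Under this model, -Ofast -mno-daz-ftz maps to links_crtfastmath (DAZ_FTZ_OFF, true), which yields false, and a plain -mdaz-ftz without any fast-math option yields true.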