From: Richard Biener
Date: Mon, 24 Jun 2019 11:57:00 -0000
Subject: Re: Start implementing -frounding-math
To: Marc Glisse
Cc: GCC Patches

On Sun, Jun 23, 2019 at 12:22 AM Marc Glisse wrote:
>
> On Sat, 22 Jun 2019, Richard Biener wrote:
>
> > On June 22, 2019 6:10:15 PM GMT+02:00, Marc Glisse wrote:
> >> Hello,
> >>
> >> as discussed in the PR, this seems like a simple enough approach to
> >> handle FENV functionality safely, while keeping it possible to implement
> >> optimizations in the future.
> >>
> >> Some key missing things:
> >> - handle C, not just C++ (I don't care, but some people probably do)
> >
> > As you tackle C++, what does the standard say about constexpr contexts and
> > FENV?  That is, what's the FP environment at compile time (I suppose
> > FENV modifying functions are not constexpr declared).
>
> The C++ standard doesn't care much about fenv:
>
> [Note: This document does not require an implementation to support the
> FENV_ACCESS pragma; it is implementation-defined (15.8) whether the pragma
> is supported. As a consequence, it is implementation-defined whether
> these functions can be used to test floating-point status flags, set
> floating-point control modes, or run under non-default mode settings. If
> the pragma is used to enable control over the floating-point environment,
> this document does not specify the effect on floating-point evaluation in
> constant expressions. -- end note]

Oh, I see.

> We should care about the C standard, and do whatever makes sense for C++
> without expecting the C++ standard to tell us exactly what that is. We can
> check what visual studio and intel do, but we don't have to follow them.

This makes it somewhat odd to implement this for C++ first and not C, but hey ;)

> -frounding-math is supposed to be equivalent to "#pragma stdc fenv_access
> on" covering the whole program.
>
> For constant expressions, I see a difference between
> constexpr double third = 1. / 3.;
> which really needs to be done at compile time, and
> const double third = 1. / 3.;
> which will try to evaluate the rhs as constexpr, but where the program is
> still valid if that fails. The second one clearly should refuse to be
> evaluated at compile time if we are specifying a dynamic rounding
> direction.
> For the first one, I am not sure. I guess you should only write
> that in "fenv_access off" regions and I wouldn't mind a compile error.
>
> Note that C2x adds a pragma fenv_round that specifies a rounding direction
> for a region of code, which seems relevant for constant expressions. That
> pragma looks hard, but maybe some pieces would be nice to add.

Hmm.  My thinking was along the lines that at the start of main() the
C abstract machine might specify that the initial rounding mode (and
exception state) is implementation-defined and that all constant expressions
are evaluated in this state.  So we can define that to be round-to-nearest
and simply fold all constants in contexts we are allowed to evaluate at
compile time as we see them?

I guess fenv_round aims at using a pragma to change the rounding mode?

> >> - handle vectors (for complex, I don't know what it means)
> >>
> >> Then flag_trapping_math should also enable this path, meaning that we
> >> should stop making it the default, or performance will suffer.
> >
> > Do we need N variants of the functions to really encode FP options into
> > the IL and thus allow inlining of say different signed-zero flag
> > functions?
>
> Not sure what you are suggesting. I am essentially creating a new
> tree_code (well, an internal function) for an addition-like function that
> actually reads/writes memory, so it should be orthogonal to inlining, and
> only the front-end should care about -frounding-math. I didn't think about
> the interaction with signed-zero. Ah, you mean
> IFN_FENV_ADD_WITH_ROUNDING_AND_SIGNED_ZEROS, etc?

Yeah.  Basically the goal is to have the IL fully defined on its own, without
having its semantics depend on flag_*.

> The ones I am starting
> from are supposed to be safe-for-everything.
> As refinement, I was thinking in 2 directions:
> * add a third constant argument, where we can specify extra info
> * add a variant for the case where the function is pure (because I expect
>   that's easier on the compiler than "pure if (arg3 & 8) != 0")
> I am not sure more variants are needed.

For optimization, having an ADD_ROUND_TO_ZERO (or the extra params specifying
an explicit rounding mode) might be interesting since on x86 there are now
instructions with rounding mode control bits.

> Also, while rounding clearly applies to an operation, signed-zero kind of
> seems to apply to a variable, and in an operation, I don't really know if
> it means that I can pretend that an argument of -0. is +0. (I can return
> +inf for 1/-0.) or if it means I can return 0. when the operation should
> return -0.. Probably both... If we have just -fsigned-zeros but no
> rounding or trapping, the penalty of using an IFN would be bad. But indeed
> inlining functions with different -f(no-)signed-zeros forces to use
> -fsigned-zeros for the whole merged function if we don't encode it in the
> operations. Hmm

Yeah.  I guess we need to think about each and every case and how
to deal with it.  There are denormals and flush-to-zero (not covered by
POSIX fenv modification IIRC) and a lot of math optimization flags
that do not map to FP operations directly...

> > I didn't look at the patch but I suppose you rely on RTL to not do code
> > motion across FENV modifications and not fold constants?
>
> No, I rely on asm volatile to prevent that, as in your recent hack, except
> that the asm only appears near expansion. I am trying to start from
> something safe and refine with optimizations, no subtlety.

Ah, OK.  So indeed, instead of a new pass doing the lowering on GIMPLE, this
should ideally be done by populating expand_FENV_* appropriately.

> > That is, don't we really need unspec_volatile variant patterns for the
> > operations?
>
> Yes.
> One future optimization (that I listed in the PR) is to let targets
> expand those IFN as they like (without the asm barriers), using some
> unspec_volatile. I hope we can get there, although just letting targets
> replace "=g" with whatever in the asm would already get most of the
> benefits.
>
>
>
> I just thought of one issue for vector intrinsics, say _mm_add_pd, where
> the fenv_access status that should matter is that of the caller, not the
> one in emmintrin.h. But since I don't have the pragma or vectors, that can
> wait.

True.  I guess for the intrinsic headers we could invent some new attribute
(or assume such semantics for always_inline, which IIRC they are) saying that
a function inherits options from the caller (difficult if not inlined, it
would imply cloning, thus always-inline again...).

On the patch, I'd rename _DIV to _RDIV (to match the tree code we are dealing
with).  You are missing _NEGATE and also _FIX_TRUNC and _FLOAT, in case those
might trap with -ftrapping-math.  There are also internal functions for POW,
FMOD and others which are ECF_CONST but may not end up being folded from
their builtin counterpart with -frounding-math.

I guess builtins need the same treatment for -ftrapping-math as they do for
-frounding-math.  I think you already mentioned the default of this flag
doesn't make much sense (well, the flag isn't fully honored/implemented).

So I think the patch is a good start, but I'd say we should not introduce the
new pass and should instead expand to the asm() kludge directly, which would
also make it easier to handle some ops as unspecs in the target.  In the
future an optimize_fenv pass could annotate the call with the optional
specifier if it detects regions with known exception/rounding state, but it
still may not rewrite the internal functions back to plain operations (at
least before IPA) since the IFNs are required so that the FENV modifying
operations are code-motion barriers.

Thanks,
Richard.

> >
> --
> Marc Glisse
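
For illustration only, here is a minimal C++ sketch of why the 1. / 3. example discussed above cannot simply be folded once the rounding direction is dynamic. It is not taken from the patch: divide_rounded and third_folded are hypothetical names, the volatile temporaries are just a stand-in for the asm volatile barriers mentioned in the thread, and it assumes a target whose hardware honors fesetround (for example x86-64 with SSE) and a compiler that folds constant expressions in round-to-nearest.

```cpp
#include <cfenv>

// Hypothetical helper (not from the patch): perform the division at run
// time under the requested rounding direction, then restore the old one.
double divide_rounded(double num, double den, int mode) {
    const int saved = std::fegetround();
    if (std::fesetround(mode) != 0)
        return num / den;  // mode not supported: keep the current direction

    // volatile forces real loads, a real division and a real store between
    // the two fesetround calls, so the compiler can neither fold the
    // division nor move it across the rounding-mode changes.
    volatile double n = num;
    volatile double d = den;
    volatile double q = n / d;

    std::fesetround(saved);
    return q;
}

// Evaluated by the compiler; assumed here to use round-to-nearest.
constexpr double third_folded = 1.0 / 3.0;
```

Under those assumptions, divide_rounded(1.0, 3.0, FE_DOWNWARD) and divide_rounded(1.0, 3.0, FE_UPWARD) should give two adjacent doubles, and third_folded should match only one of them, which is the reason a plain const initializer should not be silently folded when the rounding direction is only known at run time.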