* [PATCH] [range-ops] Implement sqrt.
From: Aldy Hernandez @ 2022-11-13 20:05 UTC
To: Jakub Jelinek
Cc: GCC patches, Andrew MacLeod

It seems SQRT is relatively straightforward, and it's something Jakub
wanted for this release.

Jakub, what do you think?

p.s. Too tired to think about op1_range.

gcc/ChangeLog:

	* gimple-range-op.cc (class cfn_sqrt): New.
	(gimple_range_op_handler::maybe_builtin_call): Add cases for sqrt.
---
 gcc/gimple-range-op.cc | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 7764166d5fb..240cd8b6a11 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "range.h"
 #include "value-query.h"
 #include "gimple-range.h"
+#include "fold-const-call.h"
 
 // Given stmt S, fill VEC, up to VEC_SIZE elements, with relevant ssa-names
 // on the statement.  For efficiency, it is an error to not pass in enough
@@ -301,6 +302,41 @@ public:
   }
 } op_cfn_constant_p;
 
+// Implement range operator for SQRT.
+class cfn_sqrt : public range_operator_float
+{
+  using range_operator_float::fold_range;
+private:
+  REAL_VALUE_TYPE real_sqrt (const REAL_VALUE_TYPE &arg, tree type) const
+  {
+    tree targ = build_real (type, arg);
+    tree res = fold_const_call (as_combined_fn (BUILT_IN_SQRT), type, targ);
+    return *TREE_REAL_CST_PTR (res);
+  }
+  void rv_fold (REAL_VALUE_TYPE &lb, REAL_VALUE_TYPE &ub, bool &maybe_nan,
+		tree type,
+		const REAL_VALUE_TYPE &lh_lb,
+		const REAL_VALUE_TYPE &lh_ub,
+		const REAL_VALUE_TYPE &,
+		const REAL_VALUE_TYPE &,
+		relation_kind) const final override
+  {
+    if (real_compare (LT_EXPR, &lh_ub, &dconst0))
+      {
+	real_nan (&lb, "", 0, TYPE_MODE (type));
+	ub = lb;
+	maybe_nan = true;
+	return;
+      }
+    lb = real_sqrt (lh_lb, type);
+    ub = real_sqrt (lh_ub, type);
+    if (real_compare (GE_EXPR, &lh_lb, &dconst0))
+      maybe_nan = false;
+    else
+      maybe_nan = true;
+  }
+} fop_cfn_sqrt;
+
 // Implement range operator for CFN_BUILT_IN_SIGNBIT.
 class cfn_signbit : public range_operator_float
 {
@@ -907,6 +943,12 @@ gimple_range_op_handler::maybe_builtin_call ()
       m_int = &op_cfn_parity;
       break;
 
+    CASE_CFN_SQRT:
+    CASE_CFN_SQRT_FN:
+      m_valid = true;
+      m_float = &fop_cfn_sqrt;
+      break;
+
     default:
       break;
     }
-- 
2.38.1

* Re: [PATCH] [range-ops] Implement sqrt.
From: Jakub Jelinek @ 2022-11-13 20:39 UTC
To: Aldy Hernandez, Joseph S. Myers
Cc: GCC patches, Andrew MacLeod

On Sun, Nov 13, 2022 at 09:05:53PM +0100, Aldy Hernandez wrote:
> It seems SQRT is relatively straightforward, and it's something Jakub
> wanted for this release.
>
> Jakub, what do you think?
>
> p.s. Too tired to think about op1_range.

That would be multiplication of the same value twice, i.e.
fop_mult with trio that has op1_op2 () == VREL_EQ?
But see below, as sqrt won't be always precise, we need to account for
some errors.

> gcc/ChangeLog:
>
> 	* gimple-range-op.cc (class cfn_sqrt): New.
> 	(gimple_range_op_handler::maybe_builtin_call): Add cases for sqrt.

Yes, I'd like to see SQRT support in.
The only thing I'm worried about is that unlike {+,-,*,/}, negation etc.,
which are typically implemented in hardware or precise soft-float, sqrt is
often implemented in library using multiple floating point arithmetic
functions.  And different implementations have different accuracy.

So, I wonder if we don't need to add a target hook where targets will be
able to provide upper bound on error for floating point functions for
different floating point modes and some way to signal unknown accuracy/can't
be trusted, in which case we would give up or return just the range for
VARYING.
Then, we could write some tests that, say, in a loop construct random
floating point values (perhaps sanitized to be non-NAN), call the libm
function and the same mpfr one and return the maximum error in ulps.
And then record those, initially for glibc and most common targets and
gradually maintainers could supply more.

If we add an infrastructure for that within a few days, then we could start
filling the details.  One would hope that sqrt has < 10ulps accuracy if not
already the 0.5ulp one, but for various other functions I think it can be
much more.  Oh, nanq in libquadmath has terrible accuracy, but that one
fortunately is not builtin...

If we have some small integer for ulps accuracy of calls (we could use
0 for 0.5ulps accuracy aka precise), I wonder if we'd handle it just as
a loop of doing n times frange_nextafter or something smarter.

> --- a/gcc/gimple-range-op.cc
> +++ b/gcc/gimple-range-op.cc
> @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "range.h"
>  #include "value-query.h"
>  #include "gimple-range.h"
> +#include "fold-const-call.h"
>  
>  // Given stmt S, fill VEC, up to VEC_SIZE elements, with relevant ssa-names
>  // on the statement.  For efficiency, it is an error to not pass in enough
> @@ -301,6 +302,41 @@ public:
>    }
>  } op_cfn_constant_p;
>  
> +// Implement range operator for SQRT.
> +class cfn_sqrt : public range_operator_float
> +{
> +  using range_operator_float::fold_range;
> +private:
> +  REAL_VALUE_TYPE real_sqrt (const REAL_VALUE_TYPE &arg, tree type) const
> +  {
> +    tree targ = build_real (type, arg);
> +    tree res = fold_const_call (as_combined_fn (BUILT_IN_SQRT), type, targ);
> +    return *TREE_REAL_CST_PTR (res);
> +  }
> +  void rv_fold (REAL_VALUE_TYPE &lb, REAL_VALUE_TYPE &ub, bool &maybe_nan,
> +		tree type,
> +		const REAL_VALUE_TYPE &lh_lb,
> +		const REAL_VALUE_TYPE &lh_ub,
> +		const REAL_VALUE_TYPE &,
> +		const REAL_VALUE_TYPE &,
> +		relation_kind) const final override
> +  {
> +    if (real_compare (LT_EXPR, &lh_ub, &dconst0))
> +      {
> +	real_nan (&lb, "", 0, TYPE_MODE (type));
> +	ub = lb;
> +	maybe_nan = true;
> +	return;
> +      }
> +    lb = real_sqrt (lh_lb, type);
> +    ub = real_sqrt (lh_ub, type);
> +    if (real_compare (GE_EXPR, &lh_lb, &dconst0))
> +      maybe_nan = false;
> +    else
> +      maybe_nan = true;

Doesn't this for say VARYING range result in [NAN, +INF] range?
We want [-0.0, +INF].
So perhaps the real_compare should be done before doing the real_sqrt
calls and for the maybe_nan case use hardcoded -0.0 as lb?

BTW, as for the ulps, another thing to test is whether, even when the
library has a certain number of ulps error in the worst case, it still
obeys the basic math properties of the function.
Say for sqrt, that the result always fits into [-0.0, +INF] (I guess
because of flushing denormals to zero we wouldn't have a problem here:
for say a 30ulps sqrt, [nextafter (0.0, 1.0) * 16.0, 64.0] wouldn't be
considered [-nextafter (0.0, 1.0) * 16.0, 8.0 + 30ulps] but just
[-0.0, 8.0 + 30ulps]).  But later on, for say sin/cos, which mathematically
should always have a result in [-1.0, 1.0] +-NAN, it would be interesting
to see if there aren't some implementations that would happily return
1.0 + 15ulps or -1.0 - 20ulps.

And an unrelated thought about the reverse, y = x * x; if we know y's
range, op1_range/op2_range in that case could be handled as sqrt without
the library ulps treatment (if we assume that multiplication is always
precise), but the question is whether op1_range or op2_range is called at
all in those cases and whether we could similarly use trio to derive x's
range in that case.

	Jakub

*Re: [PATCH] [range-ops] Implement sqrt.2022-11-13 20:39 ` Jakub Jelinek@ 2022-11-14 7:45 ` Aldy Hernandez2022-11-14 14:30 ` Jeff Law 2022-11-14 21:55 ` Joseph Myers 1 sibling, 1 reply; 21+ messages in thread From: Aldy Hernandez @ 2022-11-14 7:45 UTC (permalink / raw) To: Jakub Jelinek;+Cc:Joseph S. Myers, GCC patches, Andrew MacLeod On Sun, Nov 13, 2022 at 9:39 PM Jakub Jelinek <jakub@redhat.com> wrote: > > On Sun, Nov 13, 2022 at 09:05:53PM +0100, Aldy Hernandez wrote: > > It seems SQRT is relatively straightforward, and it's something Jakub > > wanted for this release. > > > > Jakub, what do you think? > > > > p.s. Too tired to think about op1_range. > > That would be multiplication of the same value twice, i.e. > fop_mult with trio that has op1_op2 () == VREL_EQ? > But see below, as sqrt won't be always precise, we need to account for > some errors. > > > gcc/ChangeLog: > > > > * gimple-range-op.cc (class cfn_sqrt): New. > > (gimple_range_op_handler::maybe_builtin_call): Add cases for sqrt. > > Yes, I'd like to see SQRT support in. > The only thing I'm worried is that unlike {+,-,*,/}, negation etc. typically > implemented in hardware or precise soft-float, sqrt is often implemented > in library using multiple floating point arithmetic functions. And different > implementations have different accuracy. > > So, I wonder if we don't need to add a target hook where targets will be > able to provide upper bound on error for floating point functions for > different floating point modes and some way to signal unknown accuracy/can't > be trusted, in which case we would give up or return just the range for > VARYING. > Then, we could write some tests that say in a loop constructs random > floating point values (perhaps sanitized to be non-NAN), calls libm function > and the same mpfr one and return maximum error in ulps. > And then record those, initially for glibc and most common targets and > gradually maintainers could supply more. 
> > If we add an infrastructure for that within a few days, then we could start > filling the details. One would hope that sqrt has < 10ulps accuracy if not > already the 0.5ulp one, but for various other functions I think it can be I don't know what would possess me to think that sqrt would be easy ;-). Sure, I can sink a few days to flesh this out if you're willing to review it. Aldy ^ permalink raw reply [flat|nested] 21+ messages in thread

*Re: [PATCH] [range-ops] Implement sqrt.2022-11-14 7:45 ` Aldy Hernandez@ 2022-11-14 14:30 ` Jeff Law2022-11-14 14:35 ` Jakub Jelinek 0 siblings, 1 reply; 21+ messages in thread From: Jeff Law @ 2022-11-14 14:30 UTC (permalink / raw) To: Aldy Hernandez, Jakub Jelinek Cc: Joseph S. Myers, GCC patches, Andrew MacLeod On 11/14/22 00:45, Aldy Hernandez via Gcc-patches wrote: > On Sun, Nov 13, 2022 at 9:39 PM Jakub Jelinek <jakub@redhat.com> wrote: >> On Sun, Nov 13, 2022 at 09:05:53PM +0100, Aldy Hernandez wrote: >>> It seems SQRT is relatively straightforward, and it's something Jakub >>> wanted for this release. >>> >>> Jakub, what do you think? >>> >>> p.s. Too tired to think about op1_range. >> That would be multiplication of the same value twice, i.e. >> fop_mult with trio that has op1_op2 () == VREL_EQ? >> But see below, as sqrt won't be always precise, we need to account for >> some errors. >> >>> gcc/ChangeLog: >>> >>> * gimple-range-op.cc (class cfn_sqrt): New. >>> (gimple_range_op_handler::maybe_builtin_call): Add cases for sqrt. >> Yes, I'd like to see SQRT support in. >> The only thing I'm worried is that unlike {+,-,*,/}, negation etc. typically >> implemented in hardware or precise soft-float, sqrt is often implemented >> in library using multiple floating point arithmetic functions. And different >> implementations have different accuracy. >> >> So, I wonder if we don't need to add a target hook where targets will be >> able to provide upper bound on error for floating point functions for >> different floating point modes and some way to signal unknown accuracy/can't >> be trusted, in which case we would give up or return just the range for >> VARYING. >> Then, we could write some tests that say in a loop constructs random >> floating point values (perhaps sanitized to be non-NAN), calls libm function >> and the same mpfr one and return maximum error in ulps. 
>> And then record those, initially for glibc and most common targets and
>> gradually maintainers could supply more.
>>
>> If we add an infrastructure for that within a few days, then we could start
>> filling the details.  One would hope that sqrt has < 10ulps accuracy if not
>> already the 0.5ulp one, but for various other functions I think it can be
> I don't know what would possess me to think that sqrt would be easy
> ;-).  Sure, I can sink a few days to flesh this out if you're willing
> to review it.

To Jakub's concern: I thought sqrt was treated like +-/* WRT accuracy
requirements by IEEE, i.e., for any input there is a well-defined answer
for a conforming IEEE implementation.  In fact, getting to that .5ulp bound
is a significant amount of the cost for a NR (Newton-Raphson) or Goldschmidt
(or hybrid) implementation if you've got a reasonable (say 12 or 14 bit)
estimator and high performance fmacs.

Jeff

* Re: [PATCH] [range-ops] Implement sqrt.
From: Jakub Jelinek @ 2022-11-14 14:35 UTC
To: Jeff Law
Cc: Aldy Hernandez, Joseph S. Myers, GCC patches, Andrew MacLeod

On Mon, Nov 14, 2022 at 07:30:18AM -0700, Jeff Law via Gcc-patches wrote:
> To Jakub's concern.  I thought sqrt was treated like +-/* WRT accuracy
> requirements by IEEE.  ie, for any input there is a well defined answer for
> a conforming IEEE implementation.   In fact, getting to that .5ulp bound is
> a significant amount of the cost for a NR or Goldschmidt (or hybrid)
> implementation if you've got a reasonable (say 12 or 14 bit) estimator and
> high performance fmacs.

That might be the case (except for the known libquadmath sqrtq case
PR105101 which fortunately is not a builtin).
But we'll need the ulps infrastructure for other functions anyway and
it would be nice to write a short testcase first that will test
sqrt{,f,l,f32,f64,f128} and can be easily adjusted to test other functions.
I'll try to cook something up tomorrow.

	Jakub

*Re: [PATCH] [range-ops] Implement sqrt.2022-11-14 14:35 ` Jakub Jelinek@ 2022-11-14 14:48 ` Jeff Law2022-11-14 15:01 ` Aldy Hernandez 1 sibling, 0 replies; 21+ messages in thread From: Jeff Law @ 2022-11-14 14:48 UTC (permalink / raw) To: Jakub Jelinek Cc: Aldy Hernandez, Joseph S. Myers, GCC patches, Andrew MacLeod On 11/14/22 07:35, Jakub Jelinek wrote: > On Mon, Nov 14, 2022 at 07:30:18AM -0700, Jeff Law via Gcc-patches wrote: >> To Jakub's concern. I thought sqrt was treated like +-/* WRT accuracy >> requirements by IEEE. ie, for any input there is a well defined answer for >> a confirming IEEE implementation. In fact, getting to that .5ulp bound is >> a significant amount of the cost for a NR or Goldschmidt (or hybrid) >> implementation if you've got a reasonable (say 12 or 14 bit) estimator and >> high performance fmacs. > That might be the case (except for the known libquadmath sqrtq case > PR105101 which fortunately is not a builtin). > But we'll need to ulps infrastructure for other functions anyway and > it would be nice to write a short testcase first that will test > sqrt{,f,l,f32,f64,f128} and can be easily adjusted to test other functions. > I'll try to cook something up tomorrow. Agreed we'll need it elsewhere, so no objection to building it out if it's not going to delay things for sqrt. Jeff ^ permalink raw reply [flat|nested] 21+ messages in thread

*Re: [PATCH] [range-ops] Implement sqrt.2022-11-14 14:35 ` Jakub Jelinek 2022-11-14 14:48 ` Jeff Law@ 2022-11-14 15:01 ` Aldy Hernandez1 sibling, 0 replies; 21+ messages in thread From: Aldy Hernandez @ 2022-11-14 15:01 UTC (permalink / raw) To: Jakub Jelinek;+Cc:Jeff Law, Joseph S. Myers, GCC patches, Andrew MacLeod [-- Attachment #1: Type: text/plain, Size: 1009 bytes --] Huh...no argument from me. Thanks. Aldy On Mon, Nov 14, 2022, 15:35 Jakub Jelinek <jakub@redhat.com> wrote: > On Mon, Nov 14, 2022 at 07:30:18AM -0700, Jeff Law via Gcc-patches wrote: > > To Jakub's concern. I thought sqrt was treated like +-/* WRT accuracy > > requirements by IEEE. ie, for any input there is a well defined answer > for > > a confirming IEEE implementation. In fact, getting to that .5ulp bound > is > > a significant amount of the cost for a NR or Goldschmidt (or hybrid) > > implementation if you've got a reasonable (say 12 or 14 bit) estimator > and > > high performance fmacs. > > That might be the case (except for the known libquadmath sqrtq case > PR105101 which fortunately is not a builtin). > But we'll need to ulps infrastructure for other functions anyway and > it would be nice to write a short testcase first that will test > sqrt{,f,l,f32,f64,f128} and can be easily adjusted to test other functions. > I'll try to cook something up tomorrow. > > Jakub > > ^ permalink raw reply [flat|nested] 21+ messages in thread

*Re: [PATCH] [range-ops] Implement sqrt.2022-11-13 20:39 ` Jakub Jelinek 2022-11-14 7:45 ` Aldy Hernandez@ 2022-11-14 21:55 ` Joseph Myers2022-11-16 20:32 ` Jakub Jelinek 1 sibling, 1 reply; 21+ messages in thread From: Joseph Myers @ 2022-11-14 21:55 UTC (permalink / raw) To: Jakub Jelinek;+Cc:Aldy Hernandez, GCC patches, Andrew MacLeod On Sun, 13 Nov 2022, Jakub Jelinek via Gcc-patches wrote: > So, I wonder if we don't need to add a target hook where targets will be > able to provide upper bound on error for floating point functions for > different floating point modes and some way to signal unknown accuracy/can't > be trusted, in which case we would give up or return just the range for > VARYING. Note that the figures given in the glibc manual are purely empirical (largest errors observed for inputs in the glibc testsuite on a system that was then used to update the libm-test-ulps files); they don't constitute any kind of guarantee about either the current implementation or the API, nor are they formally verified, nor do they come from exhaustive testing (though worst cases from exhaustive testing for float may have been added to the glibc testsuite in some cases). (I think the only functions known to give huge errors for some inputs, outside of any IBM long double issues, are the Bessel functions and cpow functions. But even if other functions don't have huge errors, and some architecture-specific implementations might have issues, there are certainly some cases where errors can exceed the 9ulp threshold on what the libm tests will accept in libm-test-ulps files, which are thus considered glibc bugs. (That's 9ulp from the correctly rounded value, computed in ulp of that value. For IBM long double it's 16ulp instead, treating the format as having a fixed 106 bits of precision. 
Both figures are empirical ones chosen based on what bounds sufficed for most libm functions some years ago; ideally, with better implementations of some functions we could probably bring those numbers down.)) -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 21+ messages in thread

*Re: [PATCH] [range-ops] Implement sqrt.2022-11-14 21:55 ` Joseph Myers@ 2022-11-16 20:32 ` Jakub Jelinek2022-11-17 16:40 ` Aldy Hernandez 0 siblings, 1 reply; 21+ messages in thread From: Jakub Jelinek @ 2022-11-16 20:32 UTC (permalink / raw) To: Joseph Myers, Aldy Hernandez;+Cc:GCC patches, Andrew MacLeod [-- Attachment #1: Type: text/plain, Size: 2850 bytes --] On Mon, Nov 14, 2022 at 09:55:29PM +0000, Joseph Myers wrote: > On Sun, 13 Nov 2022, Jakub Jelinek via Gcc-patches wrote: > > > So, I wonder if we don't need to add a target hook where targets will be > > able to provide upper bound on error for floating point functions for > > different floating point modes and some way to signal unknown accuracy/can't > > be trusted, in which case we would give up or return just the range for > > VARYING. > > Note that the figures given in the glibc manual are purely empirical > (largest errors observed for inputs in the glibc testsuite on a system > that was then used to update the libm-test-ulps files); they don't > constitute any kind of guarantee about either the current implementation > or the API, nor are they formally verified, nor do they come from > exhaustive testing (though worst cases from exhaustive testing for float > may have been added to the glibc testsuite in some cases). (I think the > only functions known to give huge errors for some inputs, outside of any > IBM long double issues, are the Bessel functions and cpow functions. But > even if other functions don't have huge errors, and some > architecture-specific implementations might have issues, there are > certainly some cases where errors can exceed the 9ulp threshold on what > the libm tests will accept in libm-test-ulps files, which are thus > considered glibc bugs. (That's 9ulp from the correctly rounded value, > computed in ulp of that value. For IBM long double it's 16ulp instead, > treating the format as having a fixed 106 bits of precision. 
Both figures > are empirical ones chosen based on what bounds sufficed for most libm > functions some years ago; ideally, with better implementations of some > functions we could probably bring those numbers down.)) I know I can't get guarantees without formal proofs and even ulps from reported errors are better than randomized testing. But I think at least for non-glibc we want to be able to get a rough idea of the usual error range in ulps. This is what I came up with so far (link with gcc -o ulp-tester{,.c} -O2 -lmpfr -lm ), it still doesn't verify that functions are always within the mathematical range of results ([-0.0, Inf] for sqrt, [-1.0, 1.0] for sin/cos etc.), guess that would be useful and verify the program actually does what is intended. One can supply just one argument (number of tests, first 46 aren't really random) or two, in the latter case the second should be upward, downward or towardzero to use non-default rounding mode. The idea is that we'd collect ballpark estimates for roundtonearest and then estimates for the other 3 rounding modes, the former would be used without -frounding-math, max over all 4 rounding modes for -frounding-math as gcc will compute using mpfr always in round to nearest. 

	Jakub

[-- Attachment #2: ulp-tester.c --]
[-- Type: text/plain, Size: 6603 bytes --]

#ifdef THIS_TYPE
/* Per-type section: included once for each of float, double, long double
   and (when available) _Float128, with the THIS_* macros set.  */

/* Return the size of one ulp at VAL.  */
static THIS_TYPE
THIS_FUNC (ulp) (THIS_TYPE val)
{
  if (__builtin_isnormal (val))
    return THIS_FUNC (ldexp) (THIS_LIT (1.0),
                              THIS_FUNC (ilogb) (val) - THIS_MANT_DIG + 1);
  else
    return THIS_FUNC (ldexp) (THIS_LIT (1.0), THIS_MIN_EXP - THIS_MANT_DIG);
}

/* Compare libm function FN against MPFR_FN on COUNT inputs and print the
   largest error seen, in ulps.  */
static void
THIS_FUNC (test) (THIS_TYPE (*fn) (THIS_TYPE),
                  int (*mpfr_fn) (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t),
                  const char *name,
                  unsigned long count)
{
  char buf1[256], buf2[256], buf3[256];
  mpfr_set_default_prec (THIS_MANT_DIG);
  mpfr_t v;
  mpfr_init2 (v, THIS_MANT_DIG);
  THIS_TYPE max_ulp = THIS_LIT (0.0);
  volatile THIS_TYPE val = THIS_LIT (0.0);
  int m = 0;
  for (unsigned long i = 0; i < count; ++i)
    {
      /* The first 46 inputs are fixed: values around 0.0, the smallest
         normal, 1.0 and the largest finite value.  The rest are random.  */
      if (m == 0)
        m = 1;
      else if (m <= 10)
        {
          val = THIS_FUNC (nextafter) (val, THIS_LIT (1.0));
          if ((m == 1 && val == THIS_MIN) || m > 1)
            ++m;
        }
      else if (m == 11)
        {
          val = THIS_LIT (1.0);
          for (int j = 0; j < 10; j++)
            val = THIS_FUNC (nextafter) (val, THIS_LIT (0.0));
          ++m;
        }
      else if (m <= 32)
        {
          val = THIS_FUNC (nextafter) (val, THIS_LIT (1.0));
          ++m;
        }
      else if (m == 33)
        {
          val = THIS_MAX;
          for (int j = 0; j < 10; j++)
            val = THIS_FUNC (nextafter) (val, THIS_LIT (0.0));
          ++m;
        }
      else if (m <= 45)
        {
          val = THIS_FUNC (nextafter) (val, THIS_LIT (1.0));
          ++m;
        }
      else
        val = THIS_FUNC (get_rand) ();
      if (__builtin_isnan (val))
        continue;
      THIS_TYPE given = fn (val);
      THIS_MPFR_SET (v, val, MPFR_RNDN);
      mpfr_fn (v, v, MPFR_RNDN);
      THIS_TYPE expected = THIS_MPFR_GET (v, MPFR_RNDN);
      if ((!__builtin_isnan (given)) != (!__builtin_isnan (expected))
          || __builtin_isinf (given) != __builtin_isinf (expected))
        {
          /* NAN or infinity mismatch: report it instead of counting ulps.  */
          THIS_SNPRINTF (buf1, val);
          THIS_SNPRINTF (buf2, given);
          THIS_SNPRINTF (buf3, expected);
          printf ("%s (%s) = %s rather than %s\n", name, buf1, buf2, buf3);
        }
      else if (!__builtin_isnan (given) && !__builtin_isinf (given))
        {
          THIS_TYPE this_ulp
            = THIS_FUNC (fabs) (given - expected) / THIS_FUNC (ulp) (expected);
          if (this_ulp > max_ulp)
            max_ulp = this_ulp;
        }
    }
  printf ("%s max error %.1fulp\n", name, (float) max_ulp);
  mpfr_clear (v);
}

#undef THIS_TYPE
#undef THIS_LIT
#undef THIS_FUNC
#undef THIS_MIN_EXP
#undef THIS_MANT_DIG
#undef THIS_MIN
#undef THIS_MAX
#undef THIS_MPFR_SET
#undef THIS_MPFR_GET
#undef THIS_SNPRINTF
#else
#define _GNU_SOURCE 1
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <fenv.h>

#if defined(__FLT128_DIG__) && defined(__GLIBC_PREREQ)
#if __GLIBC_PREREQ (2, 26)
#define TEST_FLT128
#define MPFR_WANT_FLOAT128
#endif
#endif
#include <gmp.h>
#include <mpfr.h>

/* random () yields 31 bits per call; leftover bits are kept in rand_n.  */
static long rand_n;
static int rand_c;

static uint32_t
get_rand32 (void)
{
  uint32_t ret = 0;
  if (rand_c == 0)
    {
      ret = random () & 0x7fffffff;
      rand_c = 31;
    }
  else
    ret = rand_n & (((uint32_t) 1 << rand_c) - 1);
  ret <<= (32 - rand_c);
  rand_n = random ();
  ret |= rand_n & (((uint32_t) 1 << (32 - rand_c)) - 1);
  rand_n >>= (32 - rand_c);
  rand_c = 31 - (32 - rand_c);
  return ret;
}

static uint64_t
get_rand64 (void)
{
  return (((uint64_t) get_rand32 ()) << 32) | get_rand32 ();
}

static float
get_randf (void)
{
  uint32_t i = get_rand32 ();
  float f;
  memcpy (&f, &i, sizeof (f));
  return f;
}

static double
get_rand (void)
{
  uint64_t i = get_rand64 ();
  double d;
  memcpy (&d, &i, sizeof (d));
  return d;
}

static long double
get_randl (void)
{
  long double ld;
  uint64_t i = get_rand64 ();
  memcpy (&ld, &i, sizeof (i));
  if (sizeof (long double) == 12)
    {
      uint32_t j = get_rand32 ();
      memcpy ((char *) &ld + 8, &j, sizeof (j));
    }
  else if (sizeof (long double) == 16)
    {
      i = get_rand64 ();
      memcpy ((char *) &ld + 8, &i, sizeof (i));
    }
  return ld;
}

#ifdef TEST_FLT128
/* Return _Float128 rather than long double, so that the random bit
   pattern is not converted through a narrower type.  */
static _Float128
get_randf128 (void)
{
  _Float128 f128;
  uint64_t i = get_rand64 ();
  memcpy (&f128, &i, sizeof (i));
  i = get_rand64 ();
  memcpy ((char *) &f128 + 8, &i, sizeof (i));
  return f128;
}
#endif

#define THIS_TYPE float
#define THIS_LIT(v) v##f
#define THIS_FUNC(v) v##f
#define THIS_MIN_EXP __FLT_MIN_EXP__
#define THIS_MANT_DIG __FLT_MANT_DIG__
#define THIS_MIN __FLT_MIN__
#define THIS_MAX __FLT_MAX__
#define THIS_MPFR_SET mpfr_set_flt
#define THIS_MPFR_GET mpfr_get_flt
#define THIS_SNPRINTF(buf, x) snprintf ((buf), sizeof (buf), "%a", (x))
#include "ulp-tester.c"

#define THIS_TYPE double
#define THIS_LIT(v) v
#define THIS_FUNC(v) v
#define THIS_MIN_EXP __DBL_MIN_EXP__
#define THIS_MANT_DIG __DBL_MANT_DIG__
#define THIS_MIN __DBL_MIN__
#define THIS_MAX __DBL_MAX__
#define THIS_MPFR_SET mpfr_set_d
#define THIS_MPFR_GET mpfr_get_d
#define THIS_SNPRINTF(buf, x) snprintf ((buf), sizeof (buf), "%a", (x))
#include "ulp-tester.c"

#define THIS_TYPE long double
#define THIS_LIT(v) v##L
#define THIS_FUNC(v) v##l
#define THIS_MIN_EXP __LDBL_MIN_EXP__
#define THIS_MANT_DIG __LDBL_MANT_DIG__
#define THIS_MIN __LDBL_MIN__
#define THIS_MAX __LDBL_MAX__
#define THIS_MPFR_SET mpfr_set_ld
#define THIS_MPFR_GET mpfr_get_ld
#define THIS_SNPRINTF(buf, x) snprintf ((buf), sizeof (buf), "%La", (x))
#include "ulp-tester.c"

#ifdef TEST_FLT128
#define THIS_TYPE _Float128
#define THIS_LIT(v) v##F128
#define THIS_FUNC(v) v##f128
#define THIS_MIN_EXP __FLT128_MIN_EXP__
#define THIS_MANT_DIG __FLT128_MANT_DIG__
#define THIS_MIN __FLT128_MIN__
#define THIS_MAX __FLT128_MAX__
#define THIS_MPFR_SET mpfr_set_float128
#define THIS_MPFR_GET mpfr_get_float128
#define THIS_SNPRINTF(buf, x) strfromf128 ((buf), sizeof (buf), "%a", (x))
#include "ulp-tester.c"
#else
#define testf128(fn, mpfr_fn, name, count) do { } while (0)
#endif

int
main (int argc, const char **argv)
{
  const char *arg;
  char *endptr;
  (void) argc;
  if (argc <= 1)
    arg = "";
  else
    arg = argv[1];
  unsigned long count = strtoul (arg, &endptr, 10);
  if (endptr == arg)
    {
      fprintf (stderr, "ulp-tester number_of_iterations rnd\n");
      return 1;
    }
  /* Optional second argument selects a non-default rounding mode.  */
  const char *rnd = "tonearest";
  if (argc >= 3)
    rnd = argv[2];
  if (strcmp (rnd, "upward") == 0)
    fesetround (FE_UPWARD);
  else if (strcmp (rnd, "downward") == 0)
    fesetround (FE_DOWNWARD);
  else if (strcmp (rnd, "towardzero") == 0)
    fesetround (FE_TOWARDZERO);
#define TESTS(fn) \
  testf (fn##f, mpfr_##fn, #fn "f", count); \
  test (fn, mpfr_##fn, #fn, count); \
  testl (fn##l, mpfr_##fn, #fn "l", count); \
  testf128 (fn##f128, mpfr_##fn, #fn "f128", count)
  TESTS (sqrt);
  TESTS (sin);
  TESTS (cos);
  TESTS (exp10);
}
#endif

* Re: [PATCH] [range-ops] Implement sqrt.
From: Aldy Hernandez @ 2022-11-17 16:40 UTC
To: Jakub Jelinek; +Cc: Joseph Myers, GCC patches, Andrew MacLeod

To go along with whatever magic we're gonna tack along to the
range-ops sqrt implementation, here is another revision addressing the
VARYING issue you pointed out.

A few things...

Instead of going through trees, I decided to call do_mpfr_arg1
directly. Let's not go the wide int <-> tree rat hole in this one.

The function do_mpfr_arg1 bails on +INF, so I had to handle it manually.

There's a regression in gfortran.dg/ieee/ieee_6.f90, which I'm not
sure how to handle. We are failing because we are calculating
sqrt(-1) and expecting certain IEEE flags set. These flags aren't
set, presumably because we folded sqrt(-1) into a NAN directly:

    // All negatives.
    if (real_compare (LT_EXPR, &lh_ub, &dconst0))
      {
        real_nan (&lb, "", 0, TYPE_MODE (type));
        ub = lb;
        maybe_nan = true;
        return;
      }

The failing part of the test is:

  if (.not. (all(flags .eqv. [.false.,.false.,.true.,.true.,.false.]) &
             .or. all(flags .eqv. [.false.,.false.,.true.,.true.,.true.]) &
             .or. all(flags .eqv. [.false.,.false.,.true.,.false.,.false.]) &
             .or. all(flags .eqv. [.false.,.false.,.true.,.false.,.true.]))) STOP 5

But we are generating F F F F F. Google has informed me that that 3rd
flag is IEEE_INVALID.

So... is the optimization wrong? Are we not allowed to substitute
that NAN if we know it's gonna happen? Should we also allow F F F F F
in the test? Or something else?

Thanks.
Aldy

On Wed, Nov 16, 2022 at 9:33 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Mon, Nov 14, 2022 at 09:55:29PM +0000, Joseph Myers wrote:
> > On Sun, 13 Nov 2022, Jakub Jelinek via Gcc-patches wrote:
> >
> > > So, I wonder if we don't need to add a target hook where targets will be
> > > able to provide upper bound on error for floating point functions for
> > > different floating point modes and some way to signal unknown accuracy/can't
> > > be trusted, in which case we would give up or return just the range for
> > > VARYING.
> >
> > Note that the figures given in the glibc manual are purely empirical
> > (largest errors observed for inputs in the glibc testsuite on a system
> > that was then used to update the libm-test-ulps files); they don't
> > constitute any kind of guarantee about either the current implementation
> > or the API, nor are they formally verified, nor do they come from
> > exhaustive testing (though worst cases from exhaustive testing for float
> > may have been added to the glibc testsuite in some cases). (I think the
> > only functions known to give huge errors for some inputs, outside of any
> > IBM long double issues, are the Bessel functions and cpow functions. But
> > even if other functions don't have huge errors, and some
> > architecture-specific implementations might have issues, there are
> > certainly some cases where errors can exceed the 9ulp threshold on what
> > the libm tests will accept in libm-test-ulps files, which are thus
> > considered glibc bugs. (That's 9ulp from the correctly rounded value,
> > computed in ulp of that value. For IBM long double it's 16ulp instead,
> > treating the format as having a fixed 106 bits of precision. Both figures
> > are empirical ones chosen based on what bounds sufficed for most libm
> > functions some years ago; ideally, with better implementations of some
> > functions we could probably bring those numbers down.))
>
> I know I can't get guarantees without formal proofs and even ulps from
> reported errors are better than randomized testing.
> But I think at least for non-glibc we want to be able to get a rough idea
> of the usual error range in ulps.
>
> This is what I came up with so far (link with
> gcc -o ulp-tester{,.c} -O2 -lmpfr -lm
> ), it still doesn't verify that functions are always within the mathematical
> range of results ([-0.0, Inf] for sqrt, [-1.0, 1.0] for sin/cos etc.), guess
> that would be useful and verify the program actually does what is intended.
> One can supply just one argument (number of tests, first 46 aren't really
> random) or two, in the latter case the second should be upward, downward or
> towardzero to use non-default rounding mode.
> The idea is that we'd collect ballpark estimates for roundtonearest and
> then estimates for the other 3 rounding modes, the former would be used
> without -frounding-math, max over all 4 rounding modes for -frounding-math
> as gcc will compute using mpfr always in round to nearest.
>
>         Jakub

[-- Attachment #2: 0001-range-ops-Implement-sqrt.patch --]
[-- Type: text/x-patch, Size: 4584 bytes --]

From 759bcd4b4b6f70fcec045b24fb6874aaca989549 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez <aldyh@redhat.com>
Date: Sun, 13 Nov 2022 18:39:59 +0100
Subject: [PATCH] [range-ops] Implement sqrt.

gcc/ChangeLog:

        * fold-const-call.cc (do_mpfr_arg1): Remove static.
        * gimple-range-op.cc (class cfn_sqrt): New.
        (gimple_range_op_handler::maybe_builtin_call): Add sqrt case.
        * realmpfr.h (do_mpfr_arg1): Add extern.

gcc/testsuite/ChangeLog:

        * gcc.dg/tree-ssa/vrp124.c: New test.
---
 gcc/fold-const-call.cc                 |  2 +-
 gcc/gimple-range-op.cc                 | 56 ++++++++++++++++++++++++++
 gcc/realmpfr.h                         |  4 ++
 gcc/testsuite/gcc.dg/tree-ssa/vrp124.c | 16 ++++++++
 4 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp124.c

diff --git a/gcc/fold-const-call.cc b/gcc/fold-const-call.cc
index 8ceed8f02f9..5c6b852acdc 100644
--- a/gcc/fold-const-call.cc
+++ b/gcc/fold-const-call.cc
@@ -118,7 +118,7 @@ do_mpfr_ckconv (real_value *result, mpfr_srcptr m, bool inexact,
    in format FORMAT, given that FUNC is the MPFR implementation of f.
    Return true on success.  */
 
-static bool
+bool
 do_mpfr_arg1 (real_value *result,
               int (*func) (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t),
               const real_value *arg, const real_format *format)
diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 7764166d5fb..f1f2f098305 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -43,6 +43,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "range.h"
 #include "value-query.h"
 #include "gimple-range.h"
+#include "fold-const-call.h"
+#include "realmpfr.h"
 
 // Given stmt S, fill VEC, up to VEC_SIZE elements, with relevant ssa-names
 // on the statement.  For efficiency, it is an error to not pass in enough
@@ -301,6 +303,54 @@ public:
   }
 } op_cfn_constant_p;
 
+// Implement range operator for SQRT.
+class cfn_sqrt : public range_operator_float
+{
+  using range_operator_float::fold_range;
+private:
+  void rv_fold (REAL_VALUE_TYPE &lb, REAL_VALUE_TYPE &ub, bool &maybe_nan,
+                tree type,
+                const REAL_VALUE_TYPE &lh_lb,
+                const REAL_VALUE_TYPE &lh_ub,
+                const REAL_VALUE_TYPE &,
+                const REAL_VALUE_TYPE &,
+                relation_kind) const final override
+  {
+    // All negatives.
+    if (real_compare (LT_EXPR, &lh_ub, &dconst0))
+      {
+        real_nan (&lb, "", 0, TYPE_MODE (type));
+        ub = lb;
+        maybe_nan = true;
+        return;
+      }
+    const real_format *format = REAL_MODE_FORMAT (TYPE_MODE (type));
+    // All positives.
+    if (real_compare (GE_EXPR, &lh_lb, &dconst0))
+      {
+        // ?? Handle +INF manually since do_mpfr_arg1 does not.
+        if (real_isinf (&lh_lb, 0))
+          lb = lh_lb;
+        else if (!do_mpfr_arg1 (&lb, mpfr_sqrt, &lh_lb, format))
+          {
+            lb = dconst0;
+            lb.sign = 1;
+          }
+        maybe_nan = false;
+      }
+    // Both positives and negatives.
+    else
+      {
+        // Range is [-0.0, sqrt(lh_ub)] +-NAN.
+        lb = dconst0;
+        lb.sign = 1;
+        maybe_nan = true;
+      }
+    if (!do_mpfr_arg1 (&ub, mpfr_sqrt, &lh_ub, format))
+      ub = dconstinf;
+  }
+} fop_cfn_sqrt;
+
 // Implement range operator for CFN_BUILT_IN_SIGNBIT.
 class cfn_signbit : public range_operator_float
 {
@@ -907,6 +957,12 @@ gimple_range_op_handler::maybe_builtin_call ()
       m_int = &op_cfn_parity;
       break;
 
+    CASE_CFN_SQRT_ALL:
+      m_valid = true;
+      m_op1 = gimple_call_arg (call, 0);
+      m_float = &fop_cfn_sqrt;
+      break;
+
     default:
       break;
     }
diff --git a/gcc/realmpfr.h b/gcc/realmpfr.h
index edc08385fe8..807dd2308d2 100644
--- a/gcc/realmpfr.h
+++ b/gcc/realmpfr.h
@@ -32,4 +32,8 @@ extern void real_from_mpfr (REAL_VALUE_TYPE *, mpfr_srcptr, const real_format *,
                            mpfr_rnd_t);
 extern void mpfr_from_real (mpfr_ptr, const REAL_VALUE_TYPE *, mpfr_rnd_t);
 
+extern bool do_mpfr_arg1 (real_value *result,
+                          int (*func) (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t),
+                          const real_value *arg, const real_format *format);
+
 #endif /* ! GCC_REALGMP_H */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
new file mode 100644
index 00000000000..ef72d660153
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp124.c
@@ -0,0 +1,16 @@
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-evrp -fdisable-tree-ethread" }
+
+void link_error ();
+
+void foo (float f)
+{
+  float z = __builtin_sqrt (f);
+  if (!__builtin_isnan (z))
+    {
+      if (z < 0.0)
+        link_error ();
+    }
+}
+
+// { dg-final { scan-tree-dump-not "link_error" "evrp" } }
-- 
2.38.1

* Re: [PATCH] [range-ops] Implement sqrt.
From: Aldy Hernandez @ 2022-11-17 16:48 UTC
To: Jakub Jelinek; +Cc: Joseph Myers, GCC patches, Andrew MacLeod

On 11/17/22 17:40, Aldy Hernandez wrote:
> To go along with whatever magic we're gonna tack along to the
> range-ops sqrt implementation, here is another revision addressing the
> VARYING issue you pointed out.
>
> A few things...
>
> Instead of going through trees, I decided to call do_mpfr_arg1
> directly. Let's not go the wide int <-> tree rat hole in this one.
>
> The function do_mpfr_arg1 bails on +INF, so I had to handle it manually.
>
> There's a regression in gfortran.dg/ieee/ieee_6.f90, which I'm not
> sure how to handle. We are failing because we are calculating
> sqrt(-1) and expecting certain IEEE flags set. These flags aren't
> set, presumably because we folded sqrt(-1) into a NAN directly:
>
>     // All negatives.
>     if (real_compare (LT_EXPR, &lh_ub, &dconst0))
>       {
>         real_nan (&lb, "", 0, TYPE_MODE (type));
>         ub = lb;
>         maybe_nan = true;
>         return;
>       }

FWIW, we could return [-0.0, +INF] +-NAN which would keep us from
eliding the sqrt, but it'd be a pity to keep the sqrt unless it's
mandated by some IEEE canon.

Aldy

* Re: [PATCH] [range-ops] Implement sqrt.
From: Aldy Hernandez @ 2022-11-17 17:42 UTC
To: Jakub Jelinek; +Cc: Joseph Myers, GCC patches, Andrew MacLeod

This may be DCE.

DOM uses ranger through simplify_using_ranges::fold_cond() to fold the
following conditional to false, because we know x_185 is a NAN:

    x_185 = __builtin_sqrtf (-1.0e+0);
    if (x_185 ord x_185)

I believe we can do that, because there are no user observable
effects. But DCE removes the sqrt which could trap:

    Eliminating unnecessary statements:
    Deleting : x_185 = __builtin_sqrtf (-1.0e+0);

Is DCE allowed to remove that sqrtf call?

Thanks.
Aldy

On Thu, Nov 17, 2022 at 5:48 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>
>
>
> On 11/17/22 17:40, Aldy Hernandez wrote:
> > To go along with whatever magic we're gonna tack along to the
> > range-ops sqrt implementation, here is another revision addressing the
> > VARYING issue you pointed out.
> >
> > A few things...
> >
> > Instead of going through trees, I decided to call do_mpfr_arg1
> > directly. Let's not go the wide int <-> tree rat hole in this one.
> >
> > The function do_mpfr_arg1 bails on +INF, so I had to handle it manually.
> >
> > There's a regression in gfortran.dg/ieee/ieee_6.f90, which I'm not
> > sure how to handle. We are failing because we are calculating
> > sqrt(-1) and expecting certain IEEE flags set. These flags aren't
> > set, presumably because we folded sqrt(-1) into a NAN directly:
> >
> >     // All negatives.
> >     if (real_compare (LT_EXPR, &lh_ub, &dconst0))
> >       {
> >         real_nan (&lb, "", 0, TYPE_MODE (type));
> >         ub = lb;
> >         maybe_nan = true;
> >         return;
> >       }
>
> FWIW, we could return [-0.0, +INF] +-NAN which would keep us from
> eliding the sqrt, but it'd be a pity to keep the sqrt unless it's
> mandated by some IEEE canon.
>
> Aldy

* Re: [PATCH] [range-ops] Implement sqrt.
From: Joseph Myers @ 2022-11-17 18:59 UTC
To: Aldy Hernandez; +Cc: Jakub Jelinek, GCC patches, Andrew MacLeod

On Thu, 17 Nov 2022, Aldy Hernandez via Gcc-patches wrote:

> So... is the optimization wrong? Are we not allowed to substitute
> that NAN if we know it's gonna happen? Should we also allow F F F F F
> in the test? Or something else?

This seems like the usual ambiguity about what transformations
-ftrapping-math (on by default) is meant to prevent.

Generally it's understood to prevent transformations that add *or remove*
exceptions, so folding a case that raises "invalid" to a NaN (with
"invalid" no longer raised) is invalid with -ftrapping-math. But that
doesn't tend to be applied if the operation raising the exceptions has a
result that is otherwise unused - in such a case the operation may still
be removed completely (the exception isn't properly treated as a side
effect to avoid dead code elimination; cf. Marc Glisse's -ffenv-access
patches from August 2020). And it may often also not be applied to
"inexact".

There have been various past discussions of possible ways to split up the
different effects of options such as -ftrapping-math into finer-grained
options allowing more control of what transformations are permitted - see
e.g.
<https://gcc.gnu.org/pipermail/gcc-patches/2021-September/thread.html#580252>
and bug 54192. There is also the question in that context of which
sub-options should be enabled by default at all.

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [PATCH] [range-ops] Implement sqrt.
From: Jakub Jelinek @ 2022-11-17 19:37 UTC
To: Joseph Myers; +Cc: Aldy Hernandez, GCC patches, Andrew MacLeod

On Thu, Nov 17, 2022 at 06:59:45PM +0000, Joseph Myers wrote:
> On Thu, 17 Nov 2022, Aldy Hernandez via Gcc-patches wrote:
>
> > So... is the optimization wrong? Are we not allowed to substitute
> > that NAN if we know it's gonna happen? Should we also allow F F F F F
> > in the test? Or something else?
>
> This seems like the usual ambiguity about what transformations
> -ftrapping-math (on by default) is meant to prevent.
>
> Generally it's understood to prevent transformations that add *or remove*
> exceptions, so folding a case that raises "invalid" to a NaN (with
> "invalid" no longer raised) is invalid with -ftrapping-math. But that
> doesn't tend to be applied if the operation raising the exceptions has a
> result that is otherwise unused - in such a case the operation may still
> be removed completely (the exception isn't properly treated as a side
> effect to avoid dead code elimination; cf. Marc Glisse's -ffenv-access
> patches from August 2020). And it may often also not be applied to
> "inexact".

The problem is that the above model I'm afraid is largely incompatible with
the optimizations ranger provides.
A strict model where no operations that could raise exceptions are discarded
is easy, we let frange optimize as much as it wants and just tell DCE not to
eliminate operations that can raise exceptions.
But in the model where some exceptions can be discarded if results are unused
but not others where they are used, there is no way to distinguish between
the result of the operation really isn't needed and ranger figured out a
result (or usable range of something) and therefore the result of the
operation isn't needed.
Making frange more limited with -ftrapping-math, making it punt for
operations that could raise an exception would be quite drastic
pessimization. Perhaps for -ftrapping-math we could say no frange value is
singleton and so at least for most of operations we actually wouldn't
optimize out the whole computation when we know the result? Still, we could
also just have
  r = long_computation (x, y, z);
  if (r > 42.0)
and if frange figures out that r must be [256.0, 1024.0] and never NAN, we'd
still happily optimize away the comparison.

        Jakub
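The last point, that a comparison can be decided from a range even when the
value itself is not a singleton, can be sketched as follows. This is an
illustration only, assuming a range is just a closed interval plus a
"may be NaN" flag; the enum and function names are hypothetical and are not
ranger's API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-valued answer for a comparison decided from a range.  */
enum fold_result { FOLD_FALSE, FOLD_TRUE, FOLD_UNKNOWN };

/* Decide `r > c` knowing only that r lies in [lb, ub] and, if MAYBE_NAN
   is set, that r might also be a NaN.  Assumption: c is not a NaN.  */
enum fold_result
fold_greater_than (double lb, double ub, bool maybe_nan, double c)
{
  assert (lb <= ub);

  /* No value in the interval exceeds c, and a NaN compares false too.  */
  if (ub <= c)
    return FOLD_FALSE;

  /* Every value exceeds c; only a possible NaN could make it false.  */
  if (lb > c && !maybe_nan)
    return FOLD_TRUE;

  return FOLD_UNKNOWN;
}
```

With the numbers from the message, fold_greater_than (256.0, 1024.0, false,
42.0) is FOLD_TRUE, so the comparison disappears even though r is far from
being a single known value.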

* Re: [PATCH] [range-ops] Implement sqrt.
From: Joseph Myers @ 2022-11-17 20:43 UTC
To: Jakub Jelinek; +Cc: Aldy Hernandez, GCC patches, Andrew MacLeod

On Thu, 17 Nov 2022, Jakub Jelinek via Gcc-patches wrote:

> On Thu, Nov 17, 2022 at 06:59:45PM +0000, Joseph Myers wrote:
> > On Thu, 17 Nov 2022, Aldy Hernandez via Gcc-patches wrote:
> >
> > > So... is the optimization wrong? Are we not allowed to substitute
> > > that NAN if we know it's gonna happen? Should we also allow F F F F F
> > > in the test? Or something else?
> >
> > This seems like the usual ambiguity about what transformations
> > -ftrapping-math (on by default) is meant to prevent.
> >
> > Generally it's understood to prevent transformations that add *or remove*
> > exceptions, so folding a case that raises "invalid" to a NaN (with
> > "invalid" no longer raised) is invalid with -ftrapping-math. But that
> > doesn't tend to be applied if the operation raising the exceptions has a
> > result that is otherwise unused - in such a case the operation may still
> > be removed completely (the exception isn't properly treated as a side
> > effect to avoid dead code elimination; cf. Marc Glisse's -ffenv-access
> > patches from August 2020). And it may often also not be applied to
> > "inexact".
>
> The problem is that the above model I'm afraid is largely incompatible with
> the optimizations ranger provides.

That model is more an empirical description of when the nominal
-ftrapping-math semantics tend to be respected, than a coherent design for
any kind of API commitment to what the option does or what the default
trapping-math rules are.

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [PATCH] [range-ops] Implement sqrt.
From: Richard Biener @ 2022-11-18 8:39 UTC
To: Jakub Jelinek; +Cc: Joseph Myers, Aldy Hernandez, GCC patches, Andrew MacLeod

On Thu, Nov 17, 2022 at 8:38 PM Jakub Jelinek via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Thu, Nov 17, 2022 at 06:59:45PM +0000, Joseph Myers wrote:
> > On Thu, 17 Nov 2022, Aldy Hernandez via Gcc-patches wrote:
> >
> > > So... is the optimization wrong? Are we not allowed to substitute
> > > that NAN if we know it's gonna happen? Should we also allow F F F F F
> > > in the test? Or something else?
> >
> > This seems like the usual ambiguity about what transformations
> > -ftrapping-math (on by default) is meant to prevent.
> >
> > Generally it's understood to prevent transformations that add *or remove*
> > exceptions, so folding a case that raises "invalid" to a NaN (with
> > "invalid" no longer raised) is invalid with -ftrapping-math. But that
> > doesn't tend to be applied if the operation raising the exceptions has a
> > result that is otherwise unused - in such a case the operation may still
> > be removed completely (the exception isn't properly treated as a side
> > effect to avoid dead code elimination; cf. Marc Glisse's -ffenv-access
> > patches from August 2020). And it may often also not be applied to
> > "inexact".
>
> The problem is that the above model I'm afraid is largely incompatible with
> the optimizations ranger provides.
> A strict model where no operations that could raise exceptions are discarded
> is easy, we let frange optimize as much as it wants and just tell DCE not to
> eliminate operations that can raise exceptions.
> But in the model where some exceptions can be discarded if results are unused
> but not others where they are used, there is no way to distinguish between
> the result of the operation really isn't needed and ranger figured out a
> result (or usable range of something) and therefore the result of the
> operation isn't needed.
> Making frange more limited with -ftrapping-math, making it punt for
> operations that could raise an exception would be quite drastic
> pessimization. Perhaps for -ftrapping-math we could say no frange value is
> singleton and so at least for most of operations we actually wouldn't
> optimize out the whole computation when we know the result? Still, we could
> also just have
>   r = long_computation (x, y, z);
>   if (r > 42.0)
> and if frange figures out that r must be [256.0, 1024.0] and never NAN, we'd
> still happily optimize away the comparison.

Yes, I don't think singling out the singleton case will help.
Practically strictly
preserving IEEE exceptions is only important for a very small audience, and
for that even INEXACT will matter (but we still have -ftrapping-math
by default).
For that audience likely all constant / range propagation is futile and thus the
easiest thing might be to simply cut that off completely?

I'd say what ranger does is reasonable with -ftrapping-math given the current
practice of handling this option. There's no point in trying to preserve the
(by accident) "better" handling without ranger. Instead as Joseph says somebody
would need to sit down, split -ftrapping-math, adjust the default and thoroughly
document things (also with -fnon-call-exceptions which magically makes
IEEE flag raising operations possibly throw exceptions). As there's currently
no code motion barriers for FP code with respect to exception flag inspection
any dead code we preserve is likely going to be unhelpful.

So for now simply amend the documentation as to what -ftrapping-math
currently means with respect to range/constant propagation?

Richard.

>
>         Jakub
>

* Re: [PATCH] [range-ops] Implement sqrt.
From: Aldy Hernandez @ 2022-11-18 10:37 UTC
To: Richard Biener, Jakub Jelinek; +Cc: Joseph Myers, GCC patches, Andrew MacLeod

On 11/18/22 09:39, Richard Biener wrote:
> On Thu, Nov 17, 2022 at 8:38 PM Jakub Jelinek via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> On Thu, Nov 17, 2022 at 06:59:45PM +0000, Joseph Myers wrote:
>>> On Thu, 17 Nov 2022, Aldy Hernandez via Gcc-patches wrote:
>>>
>>>> So... is the optimization wrong? Are we not allowed to substitute
>>>> that NAN if we know it's gonna happen? Should we also allow F F F F F
>>>> in the test? Or something else?
>>>
>>> This seems like the usual ambiguity about what transformations
>>> -ftrapping-math (on by default) is meant to prevent.
>>>
>>> Generally it's understood to prevent transformations that add *or remove*
>>> exceptions, so folding a case that raises "invalid" to a NaN (with
>>> "invalid" no longer raised) is invalid with -ftrapping-math. But that
>>> doesn't tend to be applied if the operation raising the exceptions has a
>>> result that is otherwise unused - in such a case the operation may still
>>> be removed completely (the exception isn't properly treated as a side
>>> effect to avoid dead code elimination; cf. Marc Glisse's -ffenv-access
>>> patches from August 2020). And it may often also not be applied to
>>> "inexact".
>>
>> The problem is that the above model I'm afraid is largely incompatible with
>> the optimizations ranger provides.
>> A strict model where no operations that could raise exceptions are discarded
>> is easy, we let frange optimize as much as it wants and just tell DCE not to
>> eliminate operations that can raise exceptions.
>> But in the model where some exceptions can be discarded if results are unused
>> but not others where they are used, there is no way to distinguish between
>> the result of the operation really isn't needed and ranger figured out a
>> result (or usable range of something) and therefore the result of the
>> operation isn't needed.
>> Making frange more limited with -ftrapping-math, making it punt for
>> operations that could raise an exception would be quite drastic
>> pessimization. Perhaps for -ftrapping-math we could say no frange value is
>> singleton and so at least for most of operations we actually wouldn't
>> optimize out the whole computation when we know the result? Still, we could
>> also just have
>>   r = long_computation (x, y, z);
>>   if (r > 42.0)
>> and if frange figures out that r must be [256.0, 1024.0] and never NAN, we'd
>> still happily optimize away the comparison.
>
> Yes, I don't think singling out the singleton case will help.

There is also simplify_using_ranges::fold_cond() which is used by VRP
and DOM to fold conditionals. So twiddling frange::singleton_p will
have no effect here since FP conditional results are integers (f > 3.0
is true or false).

And now that we're on this subject...

We are very careful in frange (range-op-float.cc) to avoid returning
true/false in relationals which may have a NAN. This keeps us from
folding conditionals that may result in a trapping NAN.

For example, if we see [if (x_5 unord_lt 10.0)...] and we know x_5 is
[-INF, -8.0] +-NAN, this conditional is always true, but we return
VARYING to avoid folding a NAN-producing conditional. I wonder whether
we're being too conservative?

An alternative would be:

    z_8 = x_5 unord_lt 10.0
    goto true_side;

But if DCE is going to clean that up anyhow without regard to
exceptions, then maybe we can fold these conditionals altogether? If
not in this release, then in the next one.

ISTM that range-ops should always tell the truth of what it knows,
instead of being conservative wrt exceptions. It should be up to the
clients (VRP or simplify_using_ranges::fold_cond) to use the
information correctly.

> Practically strictly
> preserving IEEE exceptions is only important for a very small audience, and
> for that even INEXACT will matter (but we still have -ftrapping-math
> by default).
> For that audience likely all constant / range propagation is futile and thus the
> easiest thing might be to simply cut that off completely?
>
> I'd say what ranger does is reasonable with -ftrapping-math given the current
> practice of handling this option. There's no point in trying to preserve the
> (by accident) "better" handling without ranger. Instead as Joseph says somebody
> would need to sit down, split -ftrapping-math, adjust the default and thoroughly
> document things (also with -fnon-call-exceptions which magically makes
> IEEE flag raising operations possibly throw exceptions). As there's currently
> no code motion barriers for FP code with respect to exception flag inspection
> any dead code we preserve is likely going to be unhelpful.
>
> So for now simply amend the documentation as to what -ftrapping-math
> currently means with respect to range/constant propagation?

So something like "Even in the presence of -ftrapping-math, VRP may fold
operations that may cause exceptions. For example, an addition that is
guaranteed to produce a NAN may be replaced with a NAN, thus eliding the
addition. This may cause any exception that may have been generated by the
addition to not appear in the final program."

??

Aldy
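For readers less familiar with the unordered comparisons mentioned above,
here is a small sketch of what `x unord_lt y` computes, written in C. It
assumes the usual meaning "less than, or unordered because an operand is a
NaN"; the function name simply mirrors the spelling used in the message and
is not a GCC API.

```c
#include <math.h>
#include <stdbool.h>

/* True if x < y, or if either operand is a NaN.  isgreaterequal is the
   quiet comparison macro from <math.h>, so a quiet NaN operand does not
   raise the "invalid" flag here.  */
bool
unord_lt (double x, double y)
{
  return !isgreaterequal (x, y);
}
```

For every value in [-INF, -8.0] and for a NaN, unord_lt (x, 10.0) is true,
which is why the conditional in the example is always true even though the
range includes NaN.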

* Re: [PATCH] [range-ops] Implement sqrt.
From: Jakub Jelinek @ 2022-11-18 10:44 UTC
To: Aldy Hernandez; +Cc: Richard Biener, Joseph Myers, GCC patches, Andrew MacLeod

On Fri, Nov 18, 2022 at 11:37:42AM +0100, Aldy Hernandez wrote:
> > Practically strictly
> > preserving IEEE exceptions is only important for a very small audience, and
> > for that even INEXACT will matter (but we still have -ftrapping-math
> > by default).
> > For that audience likely all constant / range propagation is futile and thus the
> > easiest thing might be to simply cut that off completely?
> >
> > I'd say what ranger does is reasonable with -ftrapping-math given the current
> > practice of handling this option. There's no point in trying to preserve the
> > (by accident) "better" handling without ranger. Instead as Joseph says somebody
> > would need to sit down, split -ftrapping-math, adjust the default and thoroughly
> > document things (also with -fnon-call-exceptions which magically makes
> > IEEE flag raising operations possibly throw exceptions). As there's currently
> > no code motion barriers for FP code with respect to exception flag inspection
> > any dead code we preserve is likely going to be unhelpful.
> >
> > So for now simply amend the documentation as to what -ftrapping-math
> > currently means with respect to range/constant propagation?
>
> So something like "Even in the presence of -ftrapping-math, VRP may fold
> operations that may cause exceptions. For example, an addition that is
> guaranteed to produce a NAN may be replaced with a NAN, thus eliding the
> addition. This may cause any exception that may have been generated by the
> addition to not appear in the final program."
>
> ??

If we just adjust user expectations for -ftrapping-math, shouldn't we
introduce another option that will make sure we never optimize away floating
point operations which can trap (and probably just disable frange for that
mode)?

        Jakub

* Re: [PATCH] [range-ops] Implement sqrt.
From: Aldy Hernandez @ 2022-11-18 11:20 UTC
To: Jakub Jelinek; +Cc: Richard Biener, Joseph Myers, GCC patches, Andrew MacLeod

On 11/18/22 11:44, Jakub Jelinek wrote:
> On Fri, Nov 18, 2022 at 11:37:42AM +0100, Aldy Hernandez wrote:
>>> Practically strictly
>>> preserving IEEE exceptions is only important for a very small audience, and
>>> for that even INEXACT will matter (but we still have -ftrapping-math
>>> by default).
>>> For that audience likely all constant / range propagation is futile and thus the
>>> easiest thing might be to simply cut that off completely?
>>>
>>> I'd say what ranger does is reasonable with -ftrapping-math given the current
>>> practice of handling this option. There's no point in trying to preserve the
>>> (by accident) "better" handling without ranger. Instead as Joseph says somebody
>>> would need to sit down, split -ftrapping-math, adjust the default and thoroughly
>>> document things (also with -fnon-call-exceptions which magically makes
>>> IEEE flag raising operations possibly throw exceptions). As there's currently
>>> no code motion barriers for FP code with respect to exception flag inspection
>>> any dead code we preserve is likely going to be unhelpful.
>>>
>>> So for now simply amend the documentation as to what -ftrapping-math
>>> currently means with respect to range/constant propagation?
>>
>> So something like "Even in the presence of -ftrapping-math, VRP may fold
>> operations that may cause exceptions. For example, an addition that is
>> guaranteed to produce a NAN may be replaced with a NAN, thus eliding the
>> addition. This may cause any exception that may have been generated by the
>> addition to not appear in the final program."
>>
>> ??
>
> If we just adjust user expectations for -ftrapping-math, shouldn't we
> introduce another option that will make sure we never optimize away floating
> point operations which can trap (and probably just disable frange for that
> mode)?

That seems like a big hammer, but sure. We could change
frange::supports_p() to return false for flag_severely_limiting_option :).

Aldy

* Re: [PATCH] [range-ops] Implement sqrt.
From: Aldy Hernandez @ 2022-11-18 11:57 UTC
To: Jakub Jelinek; +Cc: Richard Biener, Joseph Myers, GCC patches, Andrew MacLeod

I wonder if instead of disabling ranger altogether, we could disable code
changes (constant propagation, jump threading and simplify_using_ranges)?
Or does that sound like too much hassle?

It seems that some passes (instruction selection?) could benefit from
global ranges being available even if no propagation was done.

Just a thought. I don't have strong opinions here.

Aldy

On Fri, Nov 18, 2022, 12:20 Aldy Hernandez <aldyh@redhat.com> wrote:
>
>
> On 11/18/22 11:44, Jakub Jelinek wrote:
> > On Fri, Nov 18, 2022 at 11:37:42AM +0100, Aldy Hernandez wrote:
> >>> Practically strictly
> >>> preserving IEEE exceptions is only important for a very small audience, and
> >>> for that even INEXACT will matter (but we still have -ftrapping-math
> >>> by default).
> >>> For that audience likely all constant / range propagation is futile and thus the
> >>> easiest thing might be to simply cut that off completely?
> >>>
> >>> I'd say what ranger does is reasonable with -ftrapping-math given the current
> >>> practice of handling this option. There's no point in trying to preserve the
> >>> (by accident) "better" handling without ranger. Instead as Joseph says somebody
> >>> would need to sit down, split -ftrapping-math, adjust the default and thoroughly
> >>> document things (also with -fnon-call-exceptions which magically makes
> >>> IEEE flag raising operations possibly throw exceptions). As there's currently
> >>> no code motion barriers for FP code with respect to exception flag inspection
> >>> any dead code we preserve is likely going to be unhelpful.
> >>>
> >>> So for now simply amend the documentation as to what -ftrapping-math
> >>> currently means with respect to range/constant propagation?
> >>
> >> So something like "Even in the presence of -ftrapping-math, VRP may fold
> >> operations that may cause exceptions. For example, an addition that is
> >> guaranteed to produce a NAN may be replaced with a NAN, thus eliding the
> >> addition. This may cause any exception that may have been generated by the
> >> addition to not appear in the final program."
> >>
> >> ??
> >
> > If we just adjust user expectations for -ftrapping-math, shouldn't we
> > introduce another option that will make sure we never optimize away floating
> > point operations which can trap (and probably just disable frange for that
> > mode)?
>
> That seems like a big hammer, but sure. We could change
> frange::supports_p() to return false for flag_severely_limiting_option :).
>
> Aldy
>

* Re: [PATCH] [range-ops] Implement sqrt.
  2022-11-18 10:44 ` Jakub Jelinek
  2022-11-18 11:20 ` Aldy Hernandez
@ 2022-11-18 12:14 ` Richard Biener
  1 sibling, 0 replies; 21+ messages in thread
From: Richard Biener @ 2022-11-18 12:14 UTC
To: Jakub Jelinek
Cc: Aldy Hernandez, Joseph Myers, GCC patches, Andrew MacLeod

> On 18.11.2022 at 11:44, Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Fri, Nov 18, 2022 at 11:37:42AM +0100, Aldy Hernandez wrote:
>>> Practically strictly
>>> preserving IEEE exceptions is only important for a very small audience, and
>>> for that even INEXACT will matter (but we still have -ftrapping-math
>>> by default).
>>> For that audience likely all constant / range propagation is futile and thus the
>>> easiest thing might be to simply cut that off completely?
>>>
>>> I'd say what ranger does is reasonable with -ftrapping-math given the current
>>> practice of handling this option. There's no point in trying to preserve the
>>> (by accident) "better" handling without ranger. Instead as Joseph says somebody
>>> would need to sit down, split -ftrapping-math, adjust the default and thoroughly
>>> document things (also with -fnon-call-exceptions which magically makes
>>> IEEE flag raising operations possibly throw exceptions). As there's currently
>>> no code motion barriers for FP code with respect to exception flag inspection
>>> any dead code we preserve is likely going to be unhelpful.
>>>
>>> So for now simply amend the documentation as to what -ftrapping-math
>>> currently means with respect to range/constant propagation?
>>
>> So something like "Even in the presence of -ftrapping-math, VRP may fold
>> operations that may cause exceptions. For example, an addition that is
>> guaranteed to produce a NAN, may be replaced with a NAN, thus eliding the
>> addition. This may cause any exception that may have been generated by the
>> addition to not appear in the final program."
>>
>> ??
>
> If we just adjust user expectations for -ftrapping-math, shouldn't we
> introduce another option that will make sure we never optimize away floating
> point operations which can trap (and probably just disable frange for that
> mode)?

I think it’s just like -frounding-math and Fenv access - the intent is
there but the implementation is known to be buggy (and disabling
optimizations doesn’t fully fix it).

Richard

> Jakub
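The documentation wording proposed in the quoted text (an addition guaranteed to produce a NaN being replaced by a NaN, so that its exception never appears) can be made concrete with a small sketch. The code below is not from the thread; the function name `addition_raises_invalid` is hypothetical, and the `volatile` qualifiers are an assumption of this example, used only to keep the compiler from folding the addition so that the run-time flag can be observed. Strictly conforming code would also want `#pragma STDC FENV_ACCESS ON`, which, to my knowledge, GCC does not implement.

```cpp
#include <cfenv>
#include <limits>

// Hypothetical helper: returns true if computing a + b at run time
// raises the IEEE "invalid" exception flag.
bool addition_raises_invalid(double a, double b)
{
  // volatile forces real loads and a real store, so the addition
  // cannot be replaced by a compile-time constant (illustrative device).
  volatile double x = a;
  volatile double y = b;

  std::feclearexcept(FE_ALL_EXCEPT);   // start with a clean flag state
  volatile double sum = x + y;         // the addition under test
  (void) sum;

  // Non-zero if the addition set the invalid-operation flag.
  return std::fetestexcept(FE_INVALID) != 0;
}
```

Called with `+inf` and `-inf`, the addition yields a NaN and the helper reports the invalid flag. If a range-based fold replaced such an addition with a NaN constant (which is what dropping the `volatile` qualifiers would permit), the flag would presumably never be set, which is the behaviour the proposed documentation text describes.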

end of thread (2022-11-18 12:14 UTC)

Thread overview: 21+ messages
2022-11-13 20:05 [PATCH] [range-ops] Implement sqrt Aldy Hernandez
2022-11-13 20:39 ` Jakub Jelinek
2022-11-14  7:45 ` Aldy Hernandez
2022-11-14 14:30 ` Jeff Law
2022-11-14 14:35 ` Jakub Jelinek
2022-11-14 14:48 ` Jeff Law
2022-11-14 15:01 ` Aldy Hernandez
2022-11-14 21:55 ` Joseph Myers
2022-11-16 20:32 ` Jakub Jelinek
2022-11-17 16:40 ` Aldy Hernandez
2022-11-17 16:48 ` Aldy Hernandez
2022-11-17 17:42 ` Aldy Hernandez
2022-11-17 18:59 ` Joseph Myers
2022-11-17 19:37 ` Jakub Jelinek
2022-11-17 20:43 ` Joseph Myers
2022-11-18  8:39 ` Richard Biener
2022-11-18 10:37 ` Aldy Hernandez
2022-11-18 10:44 ` Jakub Jelinek
2022-11-18 11:20 ` Aldy Hernandez
2022-11-18 11:57 ` Aldy Hernandez
2022-11-18 12:14 ` Richard Biener
