From: Aldy Hernandez <aldyh@redhat.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: GCC patches <gcc-patches@gcc.gnu.org>,
Andrew MacLeod <amacleod@redhat.com>
Subject: Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
Date: Mon, 17 Oct 2022 08:21:01 +0200 [thread overview]
Message-ID: <CAGm3qMXCqT2r97o-tD8LfNa==z0zT-NSM1539rqS48ShknmhzQ@mail.gmail.com> (raw)
In-Reply-To: <Y0hRjKccB1XjeTC0@tucnak>
[-- Attachment #1: Type: text/plain, Size: 5056 bytes --]
On Thu, Oct 13, 2022 at 7:57 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Thu, Oct 13, 2022 at 02:36:49PM +0200, Aldy Hernandez wrote:
> > +// Like real_arithmetic, but round the result to INF if the operation
> > +// produced inexact results.
> > +//
> > +// ?? There is still one problematic case, i387. With
> > +// -fexcess-precision=standard we perform most SF/DFmode arithmetic in
> > +// XFmode (long_double_type_node), so that case is OK. But without
> > +// -mfpmath=sse, all the SF/DFmode computations are in XFmode
> > +// precision (64-bit mantissa) and only occassionally rounded to
> > +// SF/DFmode (when storing into memory from the 387 stack). Maybe
> > +// this is ok as well though it is just occassionally more precise. ??
> > +
> > +static void
> > +frange_arithmetic (enum tree_code code, tree type,
> > + REAL_VALUE_TYPE &result,
> > + const REAL_VALUE_TYPE &op1,
> > + const REAL_VALUE_TYPE &op2,
> > + const REAL_VALUE_TYPE &inf)
> > +{
> > + REAL_VALUE_TYPE value;
> > + enum machine_mode mode = TYPE_MODE (type);
> > + bool mode_composite = MODE_COMPOSITE_P (mode);
> > +
> > + bool inexact = real_arithmetic (&value, code, &op1, &op2);
> > + real_convert (&result, mode, &value);
> > +
> > + // If real_convert above has rounded an inexact value to towards
> > + // inf, we can keep the result as is, otherwise we'll adjust by 1 ulp
> > + // later (real_nextafter).
> > + bool rounding = (flag_rounding_math
> > + && (real_isneg (&inf)
> > + ? real_less (&result, &value)
> > + : !real_less (&value, &result)));
>
> I thought the agreement during Cauldron was that we'd do this always,
> regardless of flag_rounding_math.
> Because excess precision (the fast one like on ia32 or -mfpmath=387 on
> x86_64), or -frounding-math, or FMA contraction can all increase precision
> and worst case it all behaves like -frounding-math for the ranges.
>
> So, perhaps use:
> if ((mode_composite || (real_isneg (&inf) ? real_less (&result, &value)
> : !real_less (&value, &result))
> && (inexact || !real_identical (&result, &value))))
Done.
> ?
> No need to do the real_isneg/real_less stuff for mode_composite, then
> we do it always for inexacts, but otherwise we check if the rounding
> performed by real.cc has been in the conservative direction (for upper
> bound to +inf, for lower bound to -inf), if yes, we don't need to do
> anything, if yes, we frange_nextafter.
>
> As discussed, for mode_composite, I think we want to do the extra
> stuff for inexact denormals and otherwise do the nextafter unconditionally,
> because our internal mode_composite representation isn't precise enough.
>
> > + // Be extra careful if there may be discrepancies between the
> > + // compile and runtime results.
> > + if ((rounding || mode_composite)
> > + && (inexact || !real_identical (&result, &value)))
> > + {
> > + if (mode_composite)
> > + {
> > + bool denormal = (result.sig[SIGSZ-1] & SIG_MSB) == 0;
>
> Use real_isdenormal here?
Done.
> Though, real_iszero needs the same thing.
So... real_isdenormal() || real_iszero() as in the attached patch?
>
> > + if (denormal)
> > + {
> > + REAL_VALUE_TYPE tmp;
>
> And explain here why is this, that IBM extended denormals have just
> DFmode precision.
Done.
> Though, now that I think about it, while this is correct for denormals,
>
> > + real_convert (&tmp, DFmode, &value);
> > + frange_nextafter (DFmode, tmp, inf);
> > + real_convert (&result, mode, &tmp);
> > + }
>
> there are also the cases where the higher double exponent is in the
> [__DBL_MIN_EXP__, __LDBL_MIN_EXP__] aka [-1021, -968] or so.
> https://en.wikipedia.org/wiki/Double-precision_floating-point_format
> If the upper double is denormal in the DFmode sense, so smaller absolute
> value than __DBL_MIN__, then doing nextafter in DFmode is the right thing to
> do, the lower double must be always +/- zero.
> Now, if the result is __DBL_MIN__, the upper double is already normalized
> but we can add __DBL_DENORM_MIN__ to it, which will make the number have
> 54-bit precision.
> If the result is __DBL_MIN__ * 2, we can again add __DBL_DENORM_MIN__
> and make it 55-bit precision. Etc. until we reach __DBL_MIN__ * 2e53
> where it acts like fully normalized 106-bit precision number.
> I must say I'm not really sure what real_nextafter is doing in those cases,
> I'm afraid it doesn't handle it correctly but the only other use
> of real_nextafter is guarded with:
> /* Don't handle composite modes, nor decimal, nor modes without
> inf or denorm at least for now. */
> if (format->pnan < format->p
> || format->b == 10
> || !format->has_inf
> || !format->has_denorm)
> return false;
Dunno. Is there a conservative thing we can do for mode_composites
that aren't denormal or zero?
How does this look?
Aldy
[-- Attachment #2: 0001-PR24021-Implement-PLUS_EXPR-range-op-entry-for-float.patch --]
[-- Type: text/x-patch, Size: 7870 bytes --]
From d7be6caf60133bc39f8224e2f0e00dabcdbbe55d Mon Sep 17 00:00:00 2001
From: Aldy Hernandez <aldyh@redhat.com>
Date: Thu, 13 Oct 2022 08:14:16 +0200
Subject: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
This is the range-op entry for floating point PLUS_EXPR. It's the
most intricate range entry we have so far, because we need to keep
track of rounding and target FP formats. This will be the last FP
entry I commit, mostly to avoid disturbing the tree any further, and
also because what we have so far is enough for a solid VRP.
So far we track NANs and signs correctly. We also handle relationals
(symbolics and numeric), both ordered and unordered, ABS_EXPR and
NEGATE_EXPR which are used to fold __builtin_isinf, and __builtin_sign
(__builtin_copysign is coming up). All in all, I think this provide
more than enough for basic VRP on floats, as well as provide a basis
to flesh out the rest if there's interest.
My goal with this entry is to provide a template for additional binary
operators, as they tend to follow a similar pattern: handle NANs, do
the arithmetic while keeping track of rounding, and adjust for NAN. I
may abstract the general parts as we do for irange's fold_range and
wi_fold.
PR tree-optimization/24021
gcc/ChangeLog:
* range-op-float.cc (update_nan_sign): New.
(propagate_nans): New.
(frange_nextafter): New.
(frange_arithmetic): New.
(class foperator_plus): New.
(floating_op_table::floating_op_table): Add PLUS_EXPR entry.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/vrp-float-plus.c: New test.
---
gcc/range-op-float.cc | 163 ++++++++++++++++++
.../gcc.dg/tree-ssa/vrp-float-plus.c | 21 +++
2 files changed, 184 insertions(+)
create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c
diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index 23e0f5ef4e2..7ffd5795e17 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -200,6 +200,116 @@ frelop_early_resolve (irange &r, tree type,
&& relop_early_resolve (r, type, op1, op2, rel, my_rel));
}
+// If R contains a NAN of unknown sign, update the NAN's signbit
+// depending on two operands.
+
+inline void
+update_nan_sign (frange &r, const frange &op1, const frange &op2)
+{
+ if (!r.maybe_isnan ())
+ return;
+
+ bool op1_nan = op1.maybe_isnan ();
+ bool op2_nan = op2.maybe_isnan ();
+ bool sign1, sign2;
+
+ gcc_checking_assert (!r.nan_signbit_p (sign1));
+ if (op1_nan && op2_nan)
+ {
+ if (op1.nan_signbit_p (sign1) && op2.nan_signbit_p (sign2))
+ r.update_nan (sign1 | sign2);
+ }
+ else if (op1_nan)
+ {
+ if (op1.nan_signbit_p (sign1))
+ r.update_nan (sign1);
+ }
+ else if (op2_nan)
+ {
+ if (op2.nan_signbit_p (sign2))
+ r.update_nan (sign2);
+ }
+}
+
+// If either operand is a NAN, set R to the combination of both NANs
+// signwise and return TRUE.
+
+inline bool
+propagate_nans (frange &r, const frange &op1, const frange &op2)
+{
+ if (op1.known_isnan () || op2.known_isnan ())
+ {
+ r.set_nan (op1.type ());
+ update_nan_sign (r, op1, op2);
+ return true;
+ }
+ return false;
+}
+
+// Set VALUE to its next real value, or INF if the operation overflows.
+
+inline void
+frange_nextafter (enum machine_mode mode,
+ REAL_VALUE_TYPE &value,
+ const REAL_VALUE_TYPE &inf)
+{
+ const real_format *fmt = REAL_MODE_FORMAT (mode);
+ REAL_VALUE_TYPE tmp;
+ bool overflow = real_nextafter (&tmp, fmt, &value, &inf);
+ if (overflow)
+ value = inf;
+ else
+ value = tmp;
+}
+
+// Like real_arithmetic, but round the result to INF if the operation
+// produced inexact results.
+//
+// ?? There is still one problematic case, i387. With
+// -fexcess-precision=standard we perform most SF/DFmode arithmetic in
+// XFmode (long_double_type_node), so that case is OK. But without
+// -mfpmath=sse, all the SF/DFmode computations are in XFmode
+// precision (64-bit mantissa) and only occassionally rounded to
+// SF/DFmode (when storing into memory from the 387 stack). Maybe
+// this is ok as well though it is just occassionally more precise. ??
+
+static void
+frange_arithmetic (enum tree_code code, tree type,
+ REAL_VALUE_TYPE &result,
+ const REAL_VALUE_TYPE &op1,
+ const REAL_VALUE_TYPE &op2,
+ const REAL_VALUE_TYPE &inf)
+{
+ REAL_VALUE_TYPE value;
+ enum machine_mode mode = TYPE_MODE (type);
+ bool mode_composite = MODE_COMPOSITE_P (mode);
+
+ bool inexact = real_arithmetic (&value, code, &op1, &op2);
+ real_convert (&result, mode, &value);
+
+ // Be extra careful if there may be discrepancies between the
+ // compile and runtime results.
+ if ((mode_composite || (real_isneg (&inf) ? real_less (&result, &value)
+ : !real_less (&value, &result)))
+ && (inexact || !real_identical (&result, &value)))
+ {
+ if (mode_composite)
+ {
+ if (real_isdenormal (&result)
+ || real_iszero (&result))
+ {
+ // IBM extended denormals only have DFmode precision.
+ REAL_VALUE_TYPE tmp;
+ real_convert (&tmp, DFmode, &value);
+ frange_nextafter (DFmode, tmp, inf);
+ real_convert (&result, mode, &tmp);
+ return;
+ }
+ }
+ frange_nextafter (mode, result, inf);
+ }
+}
+
// Crop R to [-INF, MAX] where MAX is the maximum representable number
// for TYPE.
@@ -1620,6 +1730,58 @@ foperator_unordered_equal::op1_range (frange &r, tree type,
return true;
}
+class foperator_plus : public range_operator_float
+{
+ using range_operator_float::fold_range;
+
+public:
+ bool fold_range (frange &r, tree type,
+ const frange &lh,
+ const frange &rh,
+ relation_kind rel = VREL_VARYING) const final override;
+} fop_plus;
+
+bool
+foperator_plus::fold_range (frange &r, tree type,
+ const frange &op1, const frange &op2,
+ relation_kind) const
+{
+ if (empty_range_varying (r, type, op1, op2))
+ return true;
+ if (propagate_nans (r, op1, op2))
+ return true;
+
+ REAL_VALUE_TYPE lb, ub;
+ frange_arithmetic (PLUS_EXPR, type, lb,
+ op1.lower_bound (), op2.lower_bound (), dconstninf);
+ frange_arithmetic (PLUS_EXPR, type, ub,
+ op1.upper_bound (), op2.upper_bound (), dconstinf);
+
+ // Handle possible NANs by saturating to the appropriate INF if only
+ // one end is a NAN. If both ends are a NAN, just return a NAN.
+ bool lb_nan = real_isnan (&lb);
+ bool ub_nan = real_isnan (&ub);
+ if (lb_nan && ub_nan)
+ {
+ r.set_nan (type);
+ return true;
+ }
+ if (lb_nan)
+ lb = dconstninf;
+ else if (ub_nan)
+ ub = dconstinf;
+
+ // The setter sets NAN by default for HONOR_NANS.
+ r.set (type, lb, ub);
+
+ if (lb_nan || ub_nan)
+ update_nan_sign (r, op1, op2);
+ else if (!op1.maybe_isnan () && !op2.maybe_isnan ())
+ r.clear_nan ();
+
+ return true;
+}
+
// Instantiate a range_op_table for floating point operations.
static floating_op_table global_floating_table;
@@ -1652,6 +1814,7 @@ floating_op_table::floating_op_table ()
set (ABS_EXPR, fop_abs);
set (NEGATE_EXPR, fop_negate);
+ set (PLUS_EXPR, fop_plus);
}
// Return a pointer to the range_operator_float instance, if there is
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c
new file mode 100644
index 00000000000..3739ea4e810
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c
@@ -0,0 +1,21 @@
+// { dg-do compile }
+// { dg-options "-O2 -fno-tree-fre -fno-tree-dominator-opts -fno-thread-jumps -fdump-tree-vrp2" }
+
+double BG_SplineLength ()
+{
+ double lastPoint;
+ double i;
+
+ for (i = 0.01;i<=1;i+=0.1f)
+ if (!(i != 0.0))
+ {
+ lastPoint = i;
+ }
+ else
+ {
+ lastPoint = 2;
+ }
+ return lastPoint;
+}
+
+// { dg-final { scan-tree-dump-times "return 2\\.0e" 1 "vrp2" } }
--
2.37.3
next prev parent reply other threads:[~2022-10-17 6:21 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-13 12:36 Aldy Hernandez
2022-10-13 13:02 ` Toon Moene
2022-10-13 13:44 ` Aldy Hernandez
2022-10-13 13:52 ` Toon Moene
2022-10-14 8:04 ` Aldy Hernandez
2022-10-13 17:57 ` Jakub Jelinek
2022-10-17 6:21 ` Aldy Hernandez [this message]
2022-10-24 6:04 ` Aldy Hernandez
2022-10-29 4:55 ` Jeff Law
2022-10-31 8:42 ` Aldy Hernandez
2022-11-04 13:16 ` Jakub Jelinek
2022-11-04 19:14 ` Jakub Jelinek
2022-11-04 19:53 ` Jakub Jelinek
2022-11-07 12:35 ` Aldy Hernandez
2022-11-07 12:43 ` Jakub Jelinek
2022-11-07 12:48 ` Aldy Hernandez
2022-11-07 12:56 ` Jakub Jelinek
2022-11-07 15:38 ` Aldy Hernandez
2022-11-08 11:07 ` Jakub Jelinek
2022-11-08 12:47 ` Aldy Hernandez
2022-11-08 13:15 ` Jakub Jelinek
2022-11-08 14:02 ` Aldy Hernandez
2022-11-08 14:03 ` Jakub Jelinek
2022-11-07 15:41 ` Aldy Hernandez
2022-11-08 11:20 ` Jakub Jelinek
2022-11-08 13:06 ` Aldy Hernandez
2022-11-08 13:24 ` Jakub Jelinek
2022-11-08 13:47 ` Aldy Hernandez
2022-11-08 13:50 ` Jakub Jelinek
2022-11-08 14:06 ` Aldy Hernandez
2022-11-08 14:11 ` Jakub Jelinek
2022-11-08 14:14 ` Aldy Hernandez
2022-11-08 23:05 ` Aldy Hernandez
2022-11-09 6:59 ` Aldy Hernandez
2022-11-08 17:44 ` Andrew Waterman
2022-11-08 18:11 ` Jakub Jelinek
2022-11-08 18:17 ` Andrew Waterman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAGm3qMXCqT2r97o-tD8LfNa==z0zT-NSM1539rqS48ShknmhzQ@mail.gmail.com' \
--to=aldyh@redhat.com \
--cc=amacleod@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).