Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.


From: Aldy Hernandez <aldyh@redhat.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: GCC patches <gcc-patches@gcc.gnu.org>,
	Andrew MacLeod <amacleod@redhat.com>
Subject: Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
Date: Tue, 8 Nov 2022 14:06:58 +0100
Message-ID: <CAGm3qMWWq3zan5eWKtn2Bn1HphmrCjk9aQurmOkK9r=pgJHQfw@mail.gmail.com>
In-Reply-To: <Y2o7bdPhkVkp61qy@tucnak>

[-- Attachment #1: Type: text/plain, Size: 1676 bytes --]

On Tue, Nov 8, 2022 at 12:20 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Mon, Nov 07, 2022 at 04:41:23PM +0100, Aldy Hernandez wrote:
> > As suggested upthread, I have also adjusted update_nan_sign() to drop
> > the NAN sign to VARYING if both operands are NAN.  As an optimization
> > I keep the sign if both operands are NAN and have the same sign.
>
> For NaNs this still relies on something IEEE754 doesn't guarantee,
> as I cited, after a binary operation the sign bit of the NaN is
> unspecified, whether there is one NaN operand or two.
> It might be that all CPUs handle it the way you've implemented
> (that for one NaN operand the sign of NaN result will be the same
> as that NaN operand and for two it will be the sign of one of the two
> NaNs operands, never something else), but I think we'd need to check
> more than one implementation for that (I've only tried x86_64 and thus
> SSE behavior in it), so one would need to test i387 long double behavior
> too, ARM/AArch64, PowerPC, s390{,x}, RISCV, ...
> The guarantee given by IEEE754 is only for those copy, negate, abs, copySign
> operations, so copying values around, NEG_EXPR, ABS_EXPR, __builtin_fabs*,
> __builtin_copysign*.

Ughh, that's unfortunate.  OK, I've added a big note.

>
> Otherwise LGTM (but would be nice to get into GCC13 not just
> +, but also -, *, /, sqrt at least).

Minus is trivial as we can implement it with a negate and plus.  I
have a patch queued up for that.  The rest require a bit more thought,
though perhaps what we have so far can serve as a base.  I'll
look into it.

Attached is the patch I'm retesting.

Thanks for your patience, and copious help here.
Aldy

[-- Attachment #2: 0002-PR24021-Implement-PLUS_EXPR-range-op-entry-for-float.patch --]
[-- Type: text/x-patch, Size: 8113 bytes --]

From 32e9063bbd5a48bf7f7b16077ebc0c1e7bf3c33d Mon Sep 17 00:00:00 2001
From: Aldy Hernandez <aldyh@redhat.com>
Date: Thu, 13 Oct 2022 08:14:16 +0200
Subject: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

This is the range-op entry for floating point PLUS_EXPR.  It's the
most intricate range entry we have so far, because we need to keep
track of rounding and target FP formats.  This will be the last FP
entry I commit, mostly to avoid disturbing the tree any further, and
also because what we have so far is enough for a solid VRP.

So far we track NANs and signs correctly.  We also handle relationals
(symbolic and numeric), both ordered and unordered, ABS_EXPR and
NEGATE_EXPR which are used to fold __builtin_isinf, and __builtin_signbit
(__builtin_copysign is coming up).  All in all, I think this provides
more than enough for basic VRP on floats, as well as a basis
to flesh out the rest if there's interest.

My goal with this entry is to provide a template for additional binary
operators, as they tend to follow a similar pattern: handle NANs, do
the arithmetic while keeping track of rounding, and adjust for NAN.  I
may abstract the general parts as we do for irange's fold_range and
wi_fold.

	PR tree-optimization/24021

gcc/ChangeLog:

	* range-op-float.cc (update_nan_sign): New.
	(propagate_nans): New.
	(frange_nextafter): New.
	(frange_arithmetic): New.
	(class foperator_plus): New.
	(floating_op_table::floating_op_table): Add PLUS_EXPR entry.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/vrp-float-plus.c: New test.
---
 gcc/range-op-float.cc                         | 165 ++++++++++++++++++
 .../gcc.dg/tree-ssa/vrp-float-plus.c          |  21 +++
 2 files changed, 186 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index a1f372997bf..1a6913b8b98 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -192,6 +192,118 @@ frelop_early_resolve (irange &r, tree type,
 	  && relop_early_resolve (r, type, op1, op2, rel, my_rel));
 }
 
+// If R contains a NAN of unknown sign, update the NAN's signbit
+// depending on two operands.
+
+inline void
+update_nan_sign (frange &r, const frange &op1, const frange &op2)
+{
+  if (!r.maybe_isnan ())
+    return;
+
+  bool op1_nan = op1.maybe_isnan ();
+  bool op2_nan = op2.maybe_isnan ();
+  bool sign1, sign2;
+
+  gcc_checking_assert (!r.nan_signbit_p (sign1));
+  if (op1_nan && op2_nan)
+    {
+      // If both signs agree, we could use that sign, but IEEE754
+      // does not guarantee this for a binary operator.  The x86_64
+      // architecture does keep the common known sign, but further tests
+      // are needed to see if other architectures do the same (i387
+      // long double, ARM/aarch64, PowerPC, s390{,x}, RISCV, etc).
+      // In the meantime, keep sign VARYING.
+      ;
+    }
+  else if (op1_nan)
+    {
+      if (op1.nan_signbit_p (sign1))
+	r.update_nan (sign1);
+    }
+  else if (op2_nan)
+    {
+      if (op2.nan_signbit_p (sign2))
+	r.update_nan (sign2);
+    }
+}
+
+// If either operand is a NAN, set R to the combination of both NANs
+// signwise and return TRUE.
+
+inline bool
+propagate_nans (frange &r, const frange &op1, const frange &op2)
+{
+  if (op1.known_isnan () || op2.known_isnan ())
+    {
+      r.set_nan (op1.type ());
+      update_nan_sign (r, op1, op2);
+      return true;
+    }
+  return false;
+}
+
+// Set VALUE to its next real value, or INF if the operation overflows.
+
+inline void
+frange_nextafter (enum machine_mode mode,
+		  REAL_VALUE_TYPE &value,
+		  const REAL_VALUE_TYPE &inf)
+{
+  const real_format *fmt = REAL_MODE_FORMAT (mode);
+  REAL_VALUE_TYPE tmp;
+  real_nextafter (&tmp, fmt, &value, &inf);
+  value = tmp;
+}
+
+// Like real_arithmetic, but round the result to INF if the operation
+// produced inexact results.
+//
+// ?? There is still one problematic case, i387.  With
+// -fexcess-precision=standard we perform most SF/DFmode arithmetic in
+// XFmode (long_double_type_node), so that case is OK.  But without
+// -mfpmath=sse, all the SF/DFmode computations are in XFmode
+// precision (64-bit mantissa) and only occasionally rounded to
+// SF/DFmode (when storing into memory from the 387 stack).  Maybe
+// this is ok as well though it is just occasionally more precise. ??
+
+static void
+frange_arithmetic (enum tree_code code, tree type,
+		   REAL_VALUE_TYPE &result,
+		   const REAL_VALUE_TYPE &op1,
+		   const REAL_VALUE_TYPE &op2,
+		   const REAL_VALUE_TYPE &inf)
+{
+  REAL_VALUE_TYPE value;
+  enum machine_mode mode = TYPE_MODE (type);
+  bool mode_composite = MODE_COMPOSITE_P (mode);
+
+  bool inexact = real_arithmetic (&value, code, &op1, &op2);
+  real_convert (&result, mode, &value);
+
+  // Be extra careful if there may be discrepancies between the
+  // compile and runtime results.
+  if ((mode_composite || (real_isneg (&inf) ? real_less (&result, &value)
+			  : !real_less (&value, &result)))
+      && (inexact || !real_identical (&result, &value)))
+    {
+      if (mode_composite)
+	{
+	  if (real_isdenormal (&result, mode)
+	      || real_iszero (&result))
+	    {
+	      // IBM extended denormals only have DFmode precision.
+	      REAL_VALUE_TYPE tmp;
+	      real_convert (&tmp, DFmode, &value);
+	      frange_nextafter (DFmode, tmp, inf);
+	      real_convert (&result, mode, &tmp);
+	      return;
+	    }
+	}
+      frange_nextafter (mode, result, inf);
+    }
+}
+
 // Crop R to [-INF, MAX] where MAX is the maximum representable number
 // for TYPE.
 
@@ -1746,6 +1858,58 @@ foperator_unordered_equal::op1_range (frange &r, tree type,
   return true;
 }
 
+class foperator_plus : public range_operator_float
+{
+  using range_operator_float::fold_range;
+
+public:
+  bool fold_range (frange &r, tree type,
+		   const frange &lh,
+		   const frange &rh,
+		   relation_trio = TRIO_VARYING) const final override;
+} fop_plus;
+
+bool
+foperator_plus::fold_range (frange &r, tree type,
+			    const frange &op1, const frange &op2,
+			    relation_trio) const
+{
+  if (empty_range_varying (r, type, op1, op2))
+    return true;
+  if (propagate_nans (r, op1, op2))
+    return true;
+
+  REAL_VALUE_TYPE lb, ub;
+  frange_arithmetic (PLUS_EXPR, type, lb,
+		     op1.lower_bound (), op2.lower_bound (), dconstninf);
+  frange_arithmetic (PLUS_EXPR, type, ub,
+		     op1.upper_bound (), op2.upper_bound (), dconstinf);
+
+  // Handle possible NANs by saturating to the appropriate INF if only
+  // one end is a NAN.  If both ends are a NAN, just return a NAN.
+  bool lb_nan = real_isnan (&lb);
+  bool ub_nan = real_isnan (&ub);
+  if (lb_nan && ub_nan)
+    {
+      r.set_nan (type);
+      return true;
+    }
+  if (lb_nan)
+    lb = dconstninf;
+  else if (ub_nan)
+    ub = dconstinf;
+
+  // The setter sets NAN by default for HONOR_NANS.
+  r.set (type, lb, ub);
+
+  if (lb_nan || ub_nan)
+    update_nan_sign (r, op1, op2);
+  else if (!op1.maybe_isnan () && !op2.maybe_isnan ())
+    r.clear_nan ();
+
+  return true;
+}
+
 // Instantiate a range_op_table for floating point operations.
 static floating_op_table global_floating_table;
 
@@ -1778,6 +1942,7 @@ floating_op_table::floating_op_table ()
 
   set (ABS_EXPR, fop_abs);
   set (NEGATE_EXPR, fop_negate);
+  set (PLUS_EXPR, fop_plus);
 }
 
 // Return a pointer to the range_operator_float instance, if there is
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c
new file mode 100644
index 00000000000..3739ea4e810
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c
@@ -0,0 +1,21 @@
+// { dg-do compile }
+// { dg-options "-O2 -fno-tree-fre -fno-tree-dominator-opts -fno-thread-jumps -fdump-tree-vrp2" }
+
+double BG_SplineLength ()
+{
+  double lastPoint;
+  double i;
+
+  for (i = 0.01;i<=1;i+=0.1f)
+    if (!(i != 0.0))
+      {
+        lastPoint = i;
+      }
+    else
+      {
+        lastPoint = 2;
+      }
+  return lastPoint;
+}
+
+// { dg-final { scan-tree-dump-times "return 2\\.0e" 1 "vrp2" } }
-- 
2.38.1
