public inbox for gcc-patches@gcc.gnu.org
* [PATCH] -fftz-math: assume that denorms _must_ be flushed to zero optimizations
@ 2017-08-10 17:54 Pekka Jääskeläinen
  2017-08-14  9:28 ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Pekka Jääskeläinen @ 2017-08-10 17:54 UTC (permalink / raw)
  To: GCC Patches, Henry Linjamäki, Martin Jambor

Hi,

The attached patch adds a new switch, -fftz-math, which makes certain
optimizations assume that "flush to zero" behavior for denormal inputs
and outputs is not an optimization hint but required behavior for
semantic correctness.

The need for this arose from HSAIL (BRIG). With HSAIL, flush-to-zero
handling is required (not merely "allowed") when an HSAIL instruction is
marked with the 'ftz' modifier (all HSA Base profile instructions are).
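
To illustrate why an identity such as X * 1 -> X is no longer valid once
flushing is mandatory, here is a minimal software model of an 'ftz'
multiply. It is only a sketch: the helper names (ftz, ftz_mul,
fold_changes_result) are hypothetical and do not appear in the patch, and
real targets would flush in hardware rather than through a helper like this.

```c
#include <math.h>   /* fpclassify, FP_SUBNORMAL */

/* Hypothetical helper: flush a subnormal value to a zero of the same sign. */
static float ftz(float x)
{
    if (fpclassify(x) == FP_SUBNORMAL)
        return x < 0.0f ? -0.0f : 0.0f;
    return x;
}

/* Hypothetical model of an 'ftz' multiply:
   flush both inputs, multiply, then flush the result. */
static float ftz_mul(float a, float b)
{
    return ftz(ftz(a) * ftz(b));
}

/* Returns 1 when folding x * 1 to x would change the FTZ result. */
static int fold_changes_result(float x)
{
    return ftz_mul(x, 1.0f) != x;
}
```

For a subnormal input such as 0x1p-127f, the modelled multiply yields zero
while the folded expression would yield the subnormal itself, so the two
differ. For a normal input such as 1.5f they agree.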

The patch is not complete and likely misses many optimizations.
However, it is a starting point that fixes a few cases brought out by
the HSAIL conformance suite. We plan to extend it as new cases come up.

OK for trunk?

BR,
Pekka

[-- Attachment #2: gcc-ftz-math-switch.patch --]
[-- Type: text/x-patch, Size: 3924 bytes --]

Index: gcc/common.opt
===================================================================
--- gcc/common.opt	(revision 251026)
+++ gcc/common.opt	(working copy)
@@ -2281,6 +2281,11 @@
 Common Report Var(flag_single_precision_constant) Optimization
 Convert floating point constants to single precision constants.
 
+fftz-math
+Common Report Var(flag_ftz_math) Optimization
+Optimizations treat floating-point operations as if they must flush
+subnormal floating-point values to zero.
+
 fsplit-ivs-in-unroller
 Common Report Var(flag_split_ivs_in_unroller) Init(1) Optimization
 Split lifetimes of induction variables when loops are unrolled.
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 251026)
+++ gcc/doc/invoke.texi	(working copy)
@@ -9458,6 +9458,17 @@
 This option is experimental and does not currently guarantee to
 disable all GCC optimizations that affect signaling NaN behavior.
 
+@item -fftz-math
+@opindex ftz-math
+This option is experimental.  With this flag on, GCC treats
+floating-point operations (except abs, class, copysign and neg) as if
+they must flush subnormal input operands and results to zero
+(FTZ).  The FTZ rules are derived from the HSA Programmer's Reference
+Manual for the base profile.  This disables optimizations that would
+break the rules, for example the X * 1 -> X simplification.  The option
+assumes the target supports FTZ in hardware and has it enabled, either
+by default or set by the user.
+
 @item -fno-fp-int-builtin-inexact
 @opindex fno-fp-int-builtin-inexact
 Do not allow the built-in functions @code{ceil}, @code{floor},
Index: gcc/fold-const-call.c
===================================================================
--- gcc/fold-const-call.c	(revision 251026)
+++ gcc/fold-const-call.c	(working copy)
@@ -697,7 +697,7 @@
 	      && do_mpfr_arg1 (result, mpfr_y1, arg, format));
 
     CASE_CFN_FLOOR:
-      if (!REAL_VALUE_ISNAN (*arg) || !flag_errno_math)
+      if ((!REAL_VALUE_ISNAN (*arg) || !flag_errno_math) && !flag_ftz_math)
 	{
 	  real_floor (result, format, arg);
 	  return true;
@@ -705,7 +705,7 @@
       return false;
 
     CASE_CFN_CEIL:
-      if (!REAL_VALUE_ISNAN (*arg) || !flag_errno_math)
+      if ((!REAL_VALUE_ISNAN (*arg) || !flag_errno_math) && !flag_ftz_math)
 	{
 	  real_ceil (result, format, arg);
 	  return true;
Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 251026)
+++ gcc/match.pd	(working copy)
@@ -143,6 +143,7 @@
 (simplify
  (mult @0 real_onep)
  (if (!HONOR_SNANS (type)
+      && !flag_ftz_math
       && (!HONOR_SIGNED_ZEROS (type)
           || !COMPLEX_FLOAT_TYPE_P (type)))
   (non_lvalue @0)))
@@ -151,6 +152,7 @@
 (simplify
  (mult @0 real_minus_onep)
   (if (!HONOR_SNANS (type)
+       && !flag_ftz_math
        && (!HONOR_SIGNED_ZEROS (type)
            || !COMPLEX_FLOAT_TYPE_P (type)))
    (negate @0)))
@@ -332,13 +334,13 @@
 /* In IEEE floating point, x/1 is not equivalent to x for snans.  */
 (simplify
  (rdiv @0 real_onep)
- (if (!HONOR_SNANS (type))
+ (if (!HONOR_SNANS (type) && !flag_ftz_math)
   (non_lvalue @0)))
 
 /* In IEEE floating point, x/-1 is not equivalent to -x for snans.  */
 (simplify
  (rdiv @0 real_minus_onep)
- (if (!HONOR_SNANS (type))
+ (if (!HONOR_SNANS (type) && !flag_ftz_math)
   (negate @0)))
 
 (if (flag_reciprocal_math)
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c	(revision 251026)
+++ gcc/simplify-rtx.c	(working copy)
@@ -2565,8 +2565,10 @@
 	return op1;
 
       /* In IEEE floating point, x*1 is not equivalent to x for
-	 signalling NaNs.  */
+	 signalling NaNs.
+	 For -fftz-math, x*1 is not equivalent to x for subnormals. */
       if (!HONOR_SNANS (mode)
+	  && (!FLOAT_MODE_P (mode) || !flag_ftz_math)
 	  && trueop1 == CONST1_RTX (mode))
 	return op0;
 
