public inbox for gcc-patches@gcc.gnu.org
* [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax*
@ 2015-08-06  9:39 David Sherwood
  0 siblings, 0 replies; 24+ messages in thread
From: David Sherwood @ 2015-08-06  9:39 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2944 bytes --]

Hi,

Sorry to bother people again. Is this OK to go now?

Thanks!
David.

> >
> > > On Mon, 29 Jun 2015, David Sherwood wrote:
> > >
> > > > Hi,
> > > >
> > > > I have added new STRICT_MAX_EXPR and STRICT_MIN_EXPR expressions to support the
> > > > IEEE versions of fmin and fmax. This is done by recognising the math library
> > > > "fmax" and "fmin" builtin functions in a similar way to how this is done for
> > > > -ffast-math. This also allows us to vectorise the IEEE max/min functions for
> > > > targets that support it, for example aarch64/aarch32.
> > >
> > > This patch is missing documentation.  You need to document the new insn
> > > patterns in md.texi and the new tree codes in generic.texi.
> >
> > Hi, I've uploaded a new patch with the documentation. Hope this is ok.
> 
> In various places where you refer to one operand being NaN, I think you
> mean one operand being a *quiet* NaN (if one is a signaling NaN - only
> supported by GCC if -fsignaling-nans - the IEEE minNum and maxNum
> operations raise "invalid" and return a quiet NaN).

Hi, I have a new patch that hopefully addresses the documentation issues.

Thanks,
David.
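
For readers following the quiet-NaN point above, here is a small sketch of why a plain comparison-based min/max is not enough for fmin/fmax. The function names (plain_min, ieee_min and so on) are made up for illustration and are not part of the patch; signaling NaNs and the sign of zero are deliberately not modelled.

```c
#include <assert.h>
#include <math.h>

/* Comparison-only min/max: the result for a NaN operand depends on
   operand order, because every ordered comparison with NaN is false.  */
double plain_min (double a, double b) { return a < b ? a : b; }
double plain_max (double a, double b) { return a > b ? a : b; }

/* Quiet-NaN rule of fmin/fmax: a single NaN operand is ignored and
   NaN is returned only when both operands are NaN.  */
double ieee_min (double a, double b)
{
  if (isnan (a))
    return b;
  if (isnan (b))
    return a;
  return a < b ? a : b;
}

double ieee_max (double a, double b)
{
  if (isnan (a))
    return b;
  if (isnan (b))
    return a;
  return a > b ? a : b;
}

/* Returns 1 when the two models differ exactly where expected.  */
int nan_handling_differs (void)
{
  assert (isnan (plain_min (-3.0, NAN)));   /* NaN leaks through.  */
  assert (plain_min (NAN, 7.0) == 7.0);     /* ...but not this way round.  */
  assert (ieee_min (-3.0, NAN) == -3.0);
  assert (ieee_min (NAN, 7.0) == 7.0);
  assert (isnan (ieee_max (NAN, NAN)));
  return 1;
}
```

The inputs mirror the { 4, NAN, -3, INFINITY } and { 1, 7, NAN, 0 } vectors used by the new tests, where lanes 1 and 2 must not come back as NaN.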

ChangeLog:

2015-07-15  David Sherwood  <david.sherwood@arm.com>

gcc/
    * builtins.c (integer_valued_real_p): Add STRICT_MIN_EXPR and
    STRICT_MAX_EXPR.
    (fold_builtin_fmin_fmax): For strict math, convert builtin fmin and
    fmax to STRICT_MIN_EXPR and STRICT_MAX_EXPR, respectively.
    * expr.c (expand_expr_real_2): Add STRICT_MIN_EXPR and STRICT_MAX_EXPR.
    * fold-const.c (const_binop): Likewise.
    (fold_binary_loc, tree_binary_nonnegative_warnv_p): Likewise.
    (tree_binary_nonzero_warnv_p): Likewise.
    * optabs.h (strict_minmax_support): Declare.
    * optabs.def: Add new optabs strict_max_optab/strict_min_optab.
    * optabs.c (optab_for_tree_code): Return new optabs for STRICT_MIN_EXPR
    and STRICT_MAX_EXPR.
    (strict_minmax_support): New function.
    * real.c (real_arithmetic): Add STRICT_MIN_EXPR and STRICT_MAX_EXPR.
    * tree.def: Likewise.
    * tree.c (associative_tree_code, commutative_tree_code): Likewise.
    * tree-cfg.c (verify_expr): Likewise.
    (verify_gimple_assign_binary): Likewise.
    * tree-inline.c (estimate_operator_cost): Likewise.
    * tree-pretty-print.c (dump_generic_node, op_code_prio): Likewise.
    (op_symbol_code): Likewise.
gcc/config:
    * aarch64/aarch64.md: New pattern.
    * aarch64/aarch64-simd.md: Likewise.
    * aarch64/iterators.md: New unspecs, iterators.
    * arm/iterators.md: New iterators.
    * arm/unspecs.md: New unspecs.
    * arm/neon.md: New pattern.
    * arm/vfp.md: Likewise.
gcc/doc:
    * generic.texi: Add STRICT_MAX_EXPR and STRICT_MIN_EXPR.
    * md.texi: Add strict_min and strict_max patterns.
gcc/testsuite
    * gcc.target/aarch64/maxmin_strict.c: New test.
    * gcc.target/arm/maxmin_strict.c: New test.
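
As a rough guide to the builtins.c entry above, the choice made in fold_builtin_fmin_fmax can be pictured as a three-way decision. This is only a sketch: the enum, the function and the single fast_math_ok flag are hypothetical stand-ins (the real fast-math condition is not part of the posted hunk), and target_has_strict_optab stands for the result of strict_minmax_support.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three outcomes; not GCC identifiers.  */
enum lowering
{
  USE_MINMAX_EXPR,         /* MIN_EXPR / MAX_EXPR, as for -ffast-math.  */
  USE_STRICT_MINMAX_EXPR,  /* STRICT_MIN_EXPR / STRICT_MAX_EXPR.  */
  KEEP_LIBRARY_CALL        /* Leave the fmin/fmax call alone.  */
};

/* fast_math_ok: assumed summary of the existing fast-math check.
   target_has_strict_optab: assumed result of strict_minmax_support.  */
enum lowering
choose_lowering (bool fast_math_ok, bool target_has_strict_optab)
{
  if (fast_math_ok)
    return USE_MINMAX_EXPR;
  if (target_has_strict_optab)
    return USE_STRICT_MINMAX_EXPR;
  return KEEP_LIBRARY_CALL;
}

/* Returns 1 after checking each combination of the two flags.  */
int
check_lowering_choices (void)
{
  assert (choose_lowering (true, true) == USE_MINMAX_EXPR);
  assert (choose_lowering (true, false) == USE_MINMAX_EXPR);
  assert (choose_lowering (false, true) == USE_STRICT_MINMAX_EXPR);
  assert (choose_lowering (false, false) == KEEP_LIBRARY_CALL);
  return 1;
}
```

The ordering matters: the strict codes are only tried after the fast-math path, so existing -ffast-math behaviour is unchanged.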

[-- Attachment #2: strict_max.patch --]
[-- Type: application/octet-stream, Size: 21492 bytes --]

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 1b5e659..ef1a15f 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7438,6 +7438,8 @@ integer_valued_real_p (tree t)
     case MULT_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
       return integer_valued_real_p (TREE_OPERAND (t, 0))
 	     && integer_valued_real_p (TREE_OPERAND (t, 1));
 
@@ -9221,6 +9223,10 @@ fold_builtin_fmin_fmax (location_t loc, tree arg0, tree arg1,
 	return fold_build2_loc (loc, (max ? MAX_EXPR : MIN_EXPR), type,
 			    fold_convert_loc (loc, type, arg0),
 			    fold_convert_loc (loc, type, arg1));
+      else if (strict_minmax_support (type, max))
+	return fold_build2_loc (loc, (max ? STRICT_MAX_EXPR : STRICT_MIN_EXPR),
+			    type, fold_convert_loc (loc, type, arg0),
+			    fold_convert_loc (loc, type, arg1));
     }
   return NULL_TREE;
 }
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b90f938..72f5877 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1821,6 +1821,15 @@
   [(set_attr "type" "neon_fp_minmax_<Vetype><q>")]
 )
 
+(define_insn "<maxmin_strict><mode>3"
+  [(set (match_operand:VDQF 0 "register_operand" "=w")
+	(unspec:VDQF [(match_operand:VDQF 1 "register_operand" "0")
+		      (match_operand:VDQF 2 "register_operand" "w")]
+		      FMAXMIN_STRICT))]
+  "TARGET_SIMD"
+  "<maxmin_strict_op>\\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
+)
+
 (define_insn "<maxmin_uns><mode>3"
   [(set (match_operand:VDQF 0 "register_operand" "=w")
        (unspec:VDQF [(match_operand:VDQF 1 "register_operand" "w")
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index d3f5d5b..ee9bf99 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4234,6 +4234,15 @@
   [(set_attr "type" "f_minmax<s>")]
 )
 
+(define_insn "<maxmin_strict><mode>3"
+  [(set (match_operand:GPF 0 "register_operand" "=w")
+	(unspec:GPF [(match_operand:GPF 1 "register_operand" "0")
+		     (match_operand:GPF 2 "register_operand" "w")]
+		     FMAXMIN_STRICT))]
+  "TARGET_FLOAT"
+  "<maxmin_strict_op>\\t%<s>0, %<s>1, %<s>2"
+)
+
 ;; -------------------------------------------------------------------
 ;; Reload support
 ;; -------------------------------------------------------------------
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 498358a..0a7c760 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -279,6 +279,8 @@
     UNSPEC_PMULL2       ; Used in aarch64-simd.md.
     UNSPEC_REV_REGLIST  ; Used in aarch64-simd.md.
     UNSPEC_VEC_SHR      ; Used in aarch64-simd.md.
+    UNSPEC_FMAX_STRICT  ; Used in aarch64-simd.md.
+    UNSPEC_FMIN_STRICT  ; Used in aarch64-simd.md.
 ])
 
 ;; -------------------------------------------------------------------
@@ -868,6 +870,8 @@
 
 (define_int_iterator FMAXMIN_UNS [UNSPEC_FMAX UNSPEC_FMIN])
 
+(define_int_iterator FMAXMIN_STRICT [UNSPEC_FMAX_STRICT UNSPEC_FMIN_STRICT])
+
 (define_int_iterator VQDMULH [UNSPEC_SQDMULH UNSPEC_SQRDMULH])
 
 (define_int_iterator USSUQADD [UNSPEC_SUQADD UNSPEC_USQADD])
@@ -948,6 +952,12 @@
 				 (UNSPEC_FMINNMV "fminnm")
 				 (UNSPEC_FMINV "fmin")])
 
+(define_int_attr  maxmin_strict [(UNSPEC_FMAX_STRICT "strict_max")
+				 (UNSPEC_FMIN_STRICT "strict_min")])
+
+(define_int_attr  maxmin_strict_op [(UNSPEC_FMAX_STRICT "fmaxnm")
+				    (UNSPEC_FMIN_STRICT "fminnm")])
+
 (define_int_attr sur [(UNSPEC_SHADD "s") (UNSPEC_UHADD "u")
 		      (UNSPEC_SRHADD "sr") (UNSPEC_URHADD "ur")
 		      (UNSPEC_SHSUB "s") (UNSPEC_UHSUB "u")
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 1e7f3f1..3b24e4d 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -292,6 +292,8 @@
 
 (define_int_iterator VMAXMINF [UNSPEC_VMAX UNSPEC_VMIN])
 
+(define_int_iterator VMAXMINF_STRICT [UNSPEC_VMAX_STRICT UNSPEC_VMIN_STRICT])
+
 (define_int_iterator VPADDL [UNSPEC_VPADDL_S UNSPEC_VPADDL_U])
 
 (define_int_iterator VPADAL [UNSPEC_VPADAL_S UNSPEC_VPADAL_U])
@@ -716,6 +718,13 @@
   (UNSPEC_VPMIN "min") (UNSPEC_VPMIN_U "min")
 ])
 
+(define_int_attr  maxmin_strict [
+  (UNSPEC_VMAX_STRICT "strict_max") (UNSPEC_VMIN_STRICT "strict_min")])
+
+(define_int_attr maxmin_strict_op [
+  (UNSPEC_VMAX_STRICT "vmaxnm") (UNSPEC_VMIN_STRICT "vminnm")
+])
+
 (define_int_attr shift_op [
   (UNSPEC_VSHL_S "shl") (UNSPEC_VSHL_U "shl")
   (UNSPEC_VRSHL_S "rshl") (UNSPEC_VRSHL_U "rshl")
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 654d9d5..e71e31f 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -2354,6 +2354,15 @@
   [(set_attr "type" "neon_fp_minmax_s<q>")]
 )
 
+(define_insn "<maxmin_strict><mode>3"
+  [(set (match_operand:VCVTF 0 "s_register_operand" "=w")
+	(unspec:VCVTF [(match_operand:VCVTF 1 "s_register_operand" "w")
+		       (match_operand:VCVTF 2 "s_register_operand" "w")]
+		       VMAXMINF_STRICT))]
+  "TARGET_NEON && TARGET_FPU_ARMV8"
+  "<maxmin_strict_op>.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+)
+
 (define_expand "neon_vpadd<mode>"
   [(match_operand:VD 0 "s_register_operand" "=w")
    (match_operand:VD 1 "s_register_operand" "w")
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 0ec2c48..83094d5 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -224,8 +224,10 @@
   UNSPEC_VLD4_LANE
   UNSPEC_VMAX
   UNSPEC_VMAX_U
+  UNSPEC_VMAX_STRICT
   UNSPEC_VMIN
   UNSPEC_VMIN_U
+  UNSPEC_VMIN_STRICT
   UNSPEC_VMLA
   UNSPEC_VMLA_LANE
   UNSPEC_VMLAL_S
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index f62ff79..351af4f 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -1345,6 +1345,15 @@
    (set_attr "conds" "unconditional")]
 )
 
+(define_insn "<maxmin_strict><mode>3"
+  [(set (match_operand:SDF 0 "s_register_operand" "=<F_constraint>")
+	(unspec:SDF [(match_operand:SDF 1 "s_register_operand" "<F_constraint>")
+		     (match_operand:SDF 2 "s_register_operand" "<F_constraint>")]
+		     VMAXMINF_STRICT))]
+  "TARGET_HARD_FLOAT && TARGET_VFP5 <vfp_double_cond>"
+  "<maxmin_strict_op>.<V_if_elem>\\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+)
+
 ;; Write Floating-point Status and Control Register.
 (define_insn "set_fpscr"
   [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR)]
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index bbafad9..8dad9a7 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1268,6 +1268,8 @@ the byte offset of the field, but should not be used directly; call
 @tindex TARGET_EXPR
 @tindex VA_ARG_EXPR
 @tindex ANNOTATE_EXPR
+@tindex STRICT_MAX_EXPR
+@tindex STRICT_MIN_EXPR
 
 @table @code
 @item NEGATE_EXPR
@@ -1687,8 +1689,16 @@ its sole argument yields the representation for @code{ap}.
 This node is used to attach markers to an expression. The first operand
 is the annotated expression, the second is an @code{INTEGER_CST} with
 a value from @code{enum annot_expr_kind}.
-@end table
 
+@item STRICT_MAX_EXPR
+@item STRICT_MIN_EXPR
+These nodes represent IEEE-conformant maximum and minimum operations.  If either
+operand is a quiet @code{NaN}, the other operand is returned.  If both operands
+are quiet @code{NaN}, a quiet @code{NaN} is returned.  If either operand is a
+signaling @code{NaN} (supported by GCC only with @option{-fsignaling-nans}), an
+invalid floating-point exception is raised and a quiet @code{NaN} is returned.
+
+@end table
 
 @node Vectors
 @subsection Vectors
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index e991286..f1c3417 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -4869,6 +4869,15 @@ Signed minimum and maximum operations.  When used with floating point,
 if both operands are zeros, or if either operand is @code{NaN}, then
 it is unspecified which of the two operands is returned as the result.
 
+@cindex @code{strict_min@var{m}3} instruction pattern
+@cindex @code{strict_max@var{m}3} instruction pattern
+@item @samp{strict_min@var{m}3}, @samp{strict_max@var{m}3}
+IEEE-conformant minimum and maximum operations.  If one operand is a quiet
+@code{NaN}, then the other operand is returned.  If both operands are quiet
+@code{NaN}, then a quiet @code{NaN} is returned.  If either operand is a
+signaling @code{NaN} (supported by GCC only with @option{-fsignaling-nans}), an
+invalid floating-point exception is raised and a quiet @code{NaN} is returned.
+
 @cindex @code{reduc_smin_@var{m}} instruction pattern
 @cindex @code{reduc_smax_@var{m}} instruction pattern
 @item @samp{reduc_smin_@var{m}}, @samp{reduc_smax_@var{m}}
diff --git a/gcc/expr.c b/gcc/expr.c
index 78904c2..e2adb01 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8729,6 +8729,8 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       return expand_abs (mode, op0, target, unsignedp,
 			 safe_from_p (target, treeop0, 1));
 
+    case STRICT_MAX_EXPR:
+    case STRICT_MIN_EXPR:
     case MAX_EXPR:
     case MIN_EXPR:
       target = original_target;
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 60aa210..143f457 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -1164,6 +1164,8 @@ const_binop (enum tree_code code, tree arg1, tree arg2)
 	case RDIV_EXPR:
 	case MIN_EXPR:
 	case MAX_EXPR:
+	case STRICT_MIN_EXPR:
+	case STRICT_MAX_EXPR:
 	  break;
 
 	default:
@@ -9872,7 +9874,8 @@ fold_binary_loc (location_t loc,
      cases, the appropriate type conversions should be put back in
      the tree that will get out of the constant folder.  */
 
-  if (kind == tcc_comparison || code == MIN_EXPR || code == MAX_EXPR)
+  if (kind == tcc_comparison || code == MIN_EXPR || code == MAX_EXPR
+      || code == STRICT_MIN_EXPR || code == STRICT_MAX_EXPR)
     {
       STRIP_SIGN_NOPS (arg0);
       STRIP_SIGN_NOPS (arg1);
@@ -14773,6 +14776,7 @@ tree_binary_nonnegative_warnv_p (enum tree_code code, tree type, tree op0,
 
     case BIT_AND_EXPR:
     case MAX_EXPR:
+    case STRICT_MAX_EXPR:
       return (tree_expr_nonnegative_warnv_p (op0,
 					     strict_overflow_p)
 	      || tree_expr_nonnegative_warnv_p (op1,
@@ -14781,6 +14785,7 @@ tree_binary_nonnegative_warnv_p (enum tree_code code, tree type, tree op0,
     case BIT_IOR_EXPR:
     case BIT_XOR_EXPR:
     case MIN_EXPR:
+    case STRICT_MIN_EXPR:
     case RDIV_EXPR:
     case TRUNC_DIV_EXPR:
     case CEIL_DIV_EXPR:
@@ -15235,6 +15240,7 @@ tree_binary_nonzero_warnv_p (enum tree_code code,
       break;
 
     case MIN_EXPR:
+    case STRICT_MIN_EXPR:
       sub_strict_overflow_p = false;
       if (tree_expr_nonzero_warnv_p (op0,
 				     &sub_strict_overflow_p)
@@ -15247,6 +15253,7 @@ tree_binary_nonzero_warnv_p (enum tree_code code,
       break;
 
     case MAX_EXPR:
+    case STRICT_MAX_EXPR:
       sub_strict_overflow_p = false;
       if (tree_expr_nonzero_warnv_p (op0,
 				     &sub_strict_overflow_p))
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 491341b..ca642de 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -482,6 +482,12 @@ optab_for_tree_code (enum tree_code code, const_tree type,
     case MIN_EXPR:
       return TYPE_UNSIGNED (type) ? umin_optab : smin_optab;
 
+    case STRICT_MAX_EXPR:
+      return strict_max_optab;
+
+    case STRICT_MIN_EXPR:
+      return strict_min_optab;
+
     case REALIGN_LOAD_EXPR:
       return vec_realign_load_optab;
 
@@ -6798,6 +6804,16 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx sel, rtx target)
   return tmp;
 }
 
+/* Return true if the target supports strict math max (MAX = TRUE) and min
+   (MAX = FALSE) operations on type TYPE.  */
+bool
+strict_minmax_support (tree type, bool max)
+{
+  optab optab = optab_for_tree_code
+    (max ? STRICT_MAX_EXPR : STRICT_MIN_EXPR, type, optab_default);
+  return optab_handler (optab, TYPE_MODE (type)) != CODE_FOR_nothing;
+}
+
 /* Return insn code for a conditional operator with a comparison in
    mode CMODE, unsigned if UNS is true, resulting in a value of mode VMODE.  */
 
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 888b21c..7a79e76 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -244,6 +244,10 @@ OPTAB_D (sin_optab, "sin$a2")
 OPTAB_D (sincos_optab, "sincos$a3")
 OPTAB_D (tan_optab, "tan$a2")
 
+/* C99 implementations of fmax/fmin.  */
+OPTAB_D (strict_max_optab, "strict_max$a3")
+OPTAB_D (strict_min_optab, "strict_min$a3")
+
 /* Vector reduction to a scalar.  */
 OPTAB_D (reduc_smax_scal_optab, "reduc_smax_scal_$a")
 OPTAB_D (reduc_smin_scal_optab, "reduc_smin_scal_$a")
diff --git a/gcc/optabs.h b/gcc/optabs.h
index 95f5cbc..14b7a39 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -565,4 +565,6 @@ extern bool lshift_cheap_p (bool);
 
 extern enum rtx_code get_rtx_code (enum tree_code tcode, bool unsignedp);
 
+extern bool strict_minmax_support (tree, bool);
+
 #endif /* GCC_OPTABS_H */
diff --git a/gcc/real.c b/gcc/real.c
index 2d34b62..aa2f63c 100644
--- a/gcc/real.c
+++ b/gcc/real.c
@@ -1034,6 +1034,15 @@ real_arithmetic (REAL_VALUE_TYPE *r, int icode, const REAL_VALUE_TYPE *op0,
 	*r = *op1;
       break;
 
+    case STRICT_MIN_EXPR:
+      if (op0->cl == rvc_nan)
+	*r = *op1;
+      else if (do_compare (op0, op1, -1) < 0)
+	*r = *op0;
+      else
+	*r = *op1;
+      break;
+
     case MAX_EXPR:
       if (op1->cl == rvc_nan)
 	*r = *op1;
@@ -1043,6 +1052,15 @@ real_arithmetic (REAL_VALUE_TYPE *r, int icode, const REAL_VALUE_TYPE *op0,
 	*r = *op0;
       break;
 
+    case STRICT_MAX_EXPR:
+      if (op0->cl == rvc_nan)
+	*r = *op1;
+      else if (do_compare (op0, op1, 1) < 0)
+	*r = *op1;
+      else
+	*r = *op0;
+      break;
+
     case NEGATE_EXPR:
       *r = *op0;
       r->sign ^= 1;
diff --git a/gcc/testsuite/gcc.target/aarch64/maxmin_strict.c b/gcc/testsuite/gcc.target/aarch64/maxmin_strict.c
new file mode 100644
index 0000000..09cea1d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/maxmin_strict.c
@@ -0,0 +1,69 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-vectorize -fno-inline -save-temps" } */
+
+
+extern void abort (void);
+double fmax(double, double);
+float fmaxf(float, float);
+double fmin(double, double);
+float fminf(float, float);
+
+#define isnan __builtin_isnan
+#define isinf __builtin_isinf
+
+#define NAN __builtin_nan ("")
+#define INFINITY __builtin_inf ()
+
+#define NUM_ELEMS(TYPE) (16 / sizeof (TYPE))
+
+#define DEF_MAXMIN(TYPE,FUN)\
+void test_##FUN (TYPE *__restrict__ r, TYPE *__restrict__ a,\
+		 TYPE *__restrict__ b)\
+{\
+  int i;\
+  for (i = 0; i < NUM_ELEMS (TYPE); i++)\
+    r[i] = FUN (a[i], b[i]);\
+}\
+
+DEF_MAXMIN (float, fmaxf)
+DEF_MAXMIN (double, fmax)
+
+DEF_MAXMIN (float, fminf)
+DEF_MAXMIN (double, fmin)
+
+int main ()
+{
+  float a_f[4] = { 4, NAN, -3, INFINITY };
+  float b_f[4] = { 1,   7,NAN, 0 };
+  float r_f[4];
+  double a_d[4] = { 4, NAN,  -3,  INFINITY };
+  double b_d[4] = { 1,   7, NAN,  0 };
+  double r_d[4];
+
+  test_fmaxf (r_f, a_f, b_f);
+  if (r_f[0] != 4 || isnan (r_f[1]) || isnan (r_f[2]) || !isinf (r_f[3]))
+    abort ();
+
+  test_fminf (r_f, a_f, b_f);
+  if (r_f[0] != 1 || isnan (r_f[1]) || isnan (r_f[2]) || isinf (r_f[3]))
+    abort ();
+
+  test_fmax (r_d, a_d, b_d);
+  test_fmax (&r_d[2], &a_d[2], &b_d[2]);
+  if (r_d[0] != 4 || isnan (r_d[1]) || isnan (r_d[2]) || !isinf (r_d[3]))
+    abort ();
+
+  test_fmin (r_d, a_d, b_d);
+  test_fmin (&r_d[2], &a_d[2], &b_d[2]);
+  if (r_d[0] != 1 || isnan (r_d[1]) || isnan (r_d[2]) || isinf (r_d[3]))
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "fmaxnm\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.4s" 1 } } */
+/* { dg-final { scan-assembler-times "fmaxnm\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" 1 } } */
+
+/* { dg-final { scan-assembler-times "fminnm\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.4s" 1 } } */
+/* { dg-final { scan-assembler-times "fminnm\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/arm/maxmin_strict.c b/gcc/testsuite/gcc.target/arm/maxmin_strict.c
new file mode 100644
index 0000000..aa1dd6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/maxmin_strict.c
@@ -0,0 +1,67 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_neon_ok } */
+/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a -save-temps" } */
+/* { dg-add-options arm_v8_neon } */
+
+extern void abort (void);
+double fmax(double, double);
+float fmaxf(float, float);
+double fmin(double, double);
+float fminf(float, float);
+
+#define isnan __builtin_isnan
+#define isinf __builtin_isinf
+
+#define NAN __builtin_nan ("")
+#define INFINITY __builtin_inf ()
+
+#define DEF_MAXMIN(TYPE,FUN)\
+void test_##FUN (TYPE *__restrict__ r, TYPE *__restrict__ a,\
+		 TYPE *__restrict__ b)\
+{\
+  int i;\
+  for (i = 0; i < 4; i++)\
+    r[i] = FUN (a[i], b[i]);\
+}\
+
+DEF_MAXMIN (float, fmaxf)
+DEF_MAXMIN (double, fmax)
+
+DEF_MAXMIN (float, fminf)
+DEF_MAXMIN (double, fmin)
+
+int main ()
+{
+  float a_f[4] = { 4, NAN, -3, INFINITY };
+  float b_f[4] = { 1,   7,NAN, 0 };
+  float r_f[4];
+  double a_d[4] = { 4, NAN,  -3,  INFINITY };
+  double b_d[4] = { 1,   7, NAN,  0 };
+  double r_d[4];
+
+  test_fmaxf (r_f, a_f, b_f);
+  if (r_f[0] != 4 || isnan (r_f[1]) || isnan (r_f[2]) || !isinf (r_f[3]))
+    abort ();
+
+  test_fminf (r_f, a_f, b_f);
+  if (r_f[0] != 1 || isnan (r_f[1]) || isnan (r_f[2]) || isinf (r_f[3]))
+    abort ();
+
+  test_fmax (r_d, a_d, b_d);
+  if (r_d[0] != 4 || isnan (r_d[1]) || isnan (r_d[2]) || !isinf (r_d[3]))
+    abort ();
+
+  test_fmin (r_d, a_d, b_d);
+  if (r_d[0] != 1 || isnan (r_d[1]) || isnan (r_d[2]) || isinf (r_d[3]))
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "vmaxnm.f32\tq\[0-9\]+, q\[0-9\]+, q\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "vminnm.f32\tq\[0-9\]+, q\[0-9\]+, q\[0-9\]+" 1 } } */
+
+/* NOTE: There are no double precision vector versions of vmaxnm/vminnm.  */
+/* { dg-final { scan-assembler-times "vmaxnm.f64\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "vminnm.f64\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" 1 } } */
+
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index adc56ba..f717e37 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3091,6 +3091,8 @@ verify_expr (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
     case EXACT_DIV_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
     case LSHIFT_EXPR:
     case RSHIFT_EXPR:
     case LROTATE_EXPR:
@@ -3916,6 +3918,8 @@ verify_gimple_assign_binary (gassign *stmt)
     case EXACT_DIV_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
     case BIT_IOR_EXPR:
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index ce9495d..1b95154 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3888,6 +3888,8 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case FLOAT_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
     case ABS_EXPR:
 
     case LSHIFT_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 13587e6..6d13fd2 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -2849,6 +2849,8 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       pp_string (pp, " > ");
       break;
 
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
@@ -3223,6 +3225,8 @@ op_code_prio (enum tree_code code)
       /* Special expressions.  */
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
     case ABS_EXPR:
     case REALPART_EXPR:
     case IMAGPART_EXPR:
@@ -3419,6 +3423,12 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case STRICT_MAX_EXPR:
+      return "strictmax";
+
+    case STRICT_MIN_EXPR:
+      return "strictmin";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.c b/gcc/tree.c
index f6ab441..2d6b909 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -7529,6 +7529,8 @@ associative_tree_code (enum tree_code code)
     case MULT_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
       return true;
 
     default:
@@ -7549,6 +7551,8 @@ commutative_tree_code (enum tree_code code)
     case MULT_HIGHPART_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
     case BIT_IOR_EXPR:
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
diff --git a/gcc/tree.def b/gcc/tree.def
index 56580af..daa4c77 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -722,6 +722,14 @@ DEFTREECODE (NEGATE_EXPR, "negate_expr", tcc_unary, 1)
 DEFTREECODE (MIN_EXPR, "min_expr", tcc_binary, 2)
 DEFTREECODE (MAX_EXPR, "max_expr", tcc_binary, 2)
 
+/* Minimum and maximum values, but when used with floating point it conforms to
+   the C99 definition of fmax and fmin, i.e.
+     1. if one operand is NaN the other numeric value is returned,
+     2. if both operands are NaN then a NaN is returned,
+     3. there is no distinction between -0 and 0.  */
+DEFTREECODE (STRICT_MIN_EXPR, "strict_min_expr", tcc_binary, 2)
+DEFTREECODE (STRICT_MAX_EXPR, "strict_max_expr", tcc_binary, 2)
+
 /* Represents the absolute value of the operand.
 
    An ABS_EXPR must have either an INTEGER_TYPE or a REAL_TYPE.  The

* [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax*
@ 2015-08-13 10:13 David Sherwood
  2015-08-13 11:12 ` Richard Biener
  0 siblings, 1 reply; 24+ messages in thread
From: David Sherwood @ 2015-08-13 10:13 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2944 bytes --]

Hi,

Sorry to bother people again. Is this OK to go now?

Thanks!
David.

> >
> > > On Mon, 29 Jun 2015, David Sherwood wrote:
> > >
> > > > Hi,
> > > >
> > > > I have added new STRICT_MAX_EXPR and STRICT_MIN_EXPR expressions to support the
> > > > IEEE versions of fmin and fmax. This is done by recognising the math library
> > > > "fmax" and "fmin" builtin functions in a similar way to how this is done for
> > > > -ffast-math. This also allows us to vectorise the IEEE max/min functions for
> > > > targets that support it, for example aarch64/aarch32.
> > >
> > > This patch is missing documentation.  You need to document the new insn
> > > patterns in md.texi and the new tree codes in generic.texi.
> >
> > Hi, I've uploaded a new patch with the documentation. Hope this is ok.
> 
> In various places where you refer to one operand being NaN, I think you
> mean one operand being a *quiet* NaN (if one is a signaling NaN - only
> supported by GCC if -fsignaling-nans - the IEEE minNum and maxNum
> operations raise "invalid" and return a quiet NaN).

Hi, I have a new patch that hopefully addresses the documentation issues.

Thanks,
David.

ChangeLog:

2015-07-15  David Sherwood  <david.sherwood@arm.com>

gcc/
    * builtins.c (integer_valued_real_p): Add STRICT_MIN_EXPR and
    STRICT_MAX_EXPR.
    (fold_builtin_fmin_fmax): For strict math, convert builtin fmin and
    fmax to STRICT_MIN_EXPR and STRICT_MAX_EXPR, respectively.
    * expr.c (expand_expr_real_2): Add STRICT_MIN_EXPR and STRICT_MAX_EXPR.
    * fold-const.c (const_binop): Likewise.
    (fold_binary_loc, tree_binary_nonnegative_warnv_p): Likewise.
    (tree_binary_nonzero_warnv_p): Likewise.
    * optabs.h (strict_minmax_support): Declare.
    * optabs.def: Add new optabs strict_max_optab/strict_min_optab.
    * optabs.c (optab_for_tree_code): Return new optabs for STRICT_MIN_EXPR
    and STRICT_MAX_EXPR.
    (strict_minmax_support): New function.
    * real.c (real_arithmetic): Add STRICT_MIN_EXPR and STRICT_MAX_EXPR.
    * tree.def: Likewise.
    * tree.c (associative_tree_code, commutative_tree_code): Likewise.
    * tree-cfg.c (verify_expr): Likewise.
    (verify_gimple_assign_binary): Likewise.
    * tree-inline.c (estimate_operator_cost): Likewise.
    * tree-pretty-print.c (dump_generic_node, op_code_prio): Likewise.
    (op_symbol_code): Likewise.
gcc/config:
    * aarch64/aarch64.md: New pattern.
    * aarch64/aarch64-simd.md: Likewise.
    * aarch64/iterators.md: New unspecs, iterators.
    * arm/iterators.md: New iterators.
    * arm/unspecs.md: New unspecs.
    * arm/neon.md: New pattern.
    * arm/vfp.md: Likewise.
gcc/doc:
    * generic.texi: Add STRICT_MAX_EXPR and STRICT_MIN_EXPR.
    * md.texi: Add strict_min and strict_max patterns.
gcc/testsuite
    * gcc.target/aarch64/maxmin_strict.c: New test.
    * gcc.target/arm/maxmin_strict.c: New test.
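
The real.c entry above covers constant folding, and its logic only tests op0 for NaN explicitly. The sketch below models why that still handles a NaN in op1, on the assumption that do_compare returns its third argument whenever either operand is a NaN. It uses plain doubles rather than REAL_VALUE_TYPE, and the model_* names are invented for this illustration.

```c
#include <assert.h>
#include <math.h>

/* Stand-in for do_compare: tri-state result, or nan_result when either
   operand is a NaN (assumed behaviour of the real function).  */
int
model_compare (double a, double b, int nan_result)
{
  if (isnan (a) || isnan (b))
    return nan_result;
  return (a > b) - (a < b);
}

/* Mirrors the STRICT_MIN_EXPR case: a NaN in op1 makes the compare
   return -1, so op0 is selected.  */
double
model_fold_strict_min (double op0, double op1)
{
  if (isnan (op0))
    return op1;
  else if (model_compare (op0, op1, -1) < 0)
    return op0;
  else
    return op1;
}

/* Mirrors the STRICT_MAX_EXPR case: a NaN in op1 makes the compare
   return 1, so the final branch selects op0.  */
double
model_fold_strict_max (double op0, double op1)
{
  if (isnan (op0))
    return op1;
  else if (model_compare (op0, op1, 1) < 0)
    return op1;
  else
    return op0;
}

/* Returns 1 after checking NaN in either operand position.  */
int
check_fold_model (void)
{
  assert (model_fold_strict_min (NAN, 7.0) == 7.0);
  assert (model_fold_strict_min (-3.0, NAN) == -3.0);
  assert (model_fold_strict_max (NAN, 7.0) == 7.0);
  assert (model_fold_strict_max (-3.0, NAN) == -3.0);
  assert (isnan (model_fold_strict_max (NAN, NAN)));
  return 1;
}
```

The choice of nan_result (-1 for min, 1 for max) is what steers the NaN-in-op1 case to the non-NaN operand without a second isnan test.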

[-- Attachment #2: strict_max.patch --]
[-- Type: application/octet-stream, Size: 21492 bytes --]

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 1b5e659..ef1a15f 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7438,6 +7438,8 @@ integer_valued_real_p (tree t)
     case MULT_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
       return integer_valued_real_p (TREE_OPERAND (t, 0))
 	     && integer_valued_real_p (TREE_OPERAND (t, 1));
 
@@ -9221,6 +9223,10 @@ fold_builtin_fmin_fmax (location_t loc, tree arg0, tree arg1,
 	return fold_build2_loc (loc, (max ? MAX_EXPR : MIN_EXPR), type,
 			    fold_convert_loc (loc, type, arg0),
 			    fold_convert_loc (loc, type, arg1));
+      else if (strict_minmax_support (type, max))
+	return fold_build2_loc (loc, (max ? STRICT_MAX_EXPR : STRICT_MIN_EXPR),
+			    type, fold_convert_loc (loc, type, arg0),
+			    fold_convert_loc (loc, type, arg1));
     }
   return NULL_TREE;
 }
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b90f938..72f5877 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1821,6 +1821,15 @@
   [(set_attr "type" "neon_fp_minmax_<Vetype><q>")]
 )
 
+(define_insn "<maxmin_strict><mode>3"
+  [(set (match_operand:VDQF 0 "register_operand" "=w")
+	(unspec:VDQF [(match_operand:VDQF 1 "register_operand" "0")
+		      (match_operand:VDQF 2 "register_operand" "w")]
+		      FMAXMIN_STRICT))]
+  "TARGET_SIMD"
+  "<maxmin_strict_op>\\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
+)
+
 (define_insn "<maxmin_uns><mode>3"
   [(set (match_operand:VDQF 0 "register_operand" "=w")
        (unspec:VDQF [(match_operand:VDQF 1 "register_operand" "w")
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index d3f5d5b..ee9bf99 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4234,6 +4234,15 @@
   [(set_attr "type" "f_minmax<s>")]
 )
 
+(define_insn "<maxmin_strict><mode>3"
+  [(set (match_operand:GPF 0 "register_operand" "=w")
+	(unspec:GPF [(match_operand:GPF 1 "register_operand" "0")
+		     (match_operand:GPF 2 "register_operand" "w")]
+		     FMAXMIN_STRICT))]
+  "TARGET_FLOAT"
+  "<maxmin_strict_op>\\t%<s>0, %<s>1, %<s>2"
+)
+
 ;; -------------------------------------------------------------------
 ;; Reload support
 ;; -------------------------------------------------------------------
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 498358a..0a7c760 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -279,6 +279,8 @@
     UNSPEC_PMULL2       ; Used in aarch64-simd.md.
     UNSPEC_REV_REGLIST  ; Used in aarch64-simd.md.
     UNSPEC_VEC_SHR      ; Used in aarch64-simd.md.
+    UNSPEC_FMAX_STRICT  ; Used in aarch64-simd.md.
+    UNSPEC_FMIN_STRICT  ; Used in aarch64-simd.md.
 ])
 
 ;; -------------------------------------------------------------------
@@ -868,6 +870,8 @@
 
 (define_int_iterator FMAXMIN_UNS [UNSPEC_FMAX UNSPEC_FMIN])
 
+(define_int_iterator FMAXMIN_STRICT [UNSPEC_FMAX_STRICT UNSPEC_FMIN_STRICT])
+
 (define_int_iterator VQDMULH [UNSPEC_SQDMULH UNSPEC_SQRDMULH])
 
 (define_int_iterator USSUQADD [UNSPEC_SUQADD UNSPEC_USQADD])
@@ -948,6 +952,12 @@
 				 (UNSPEC_FMINNMV "fminnm")
 				 (UNSPEC_FMINV "fmin")])
 
+(define_int_attr  maxmin_strict [(UNSPEC_FMAX_STRICT "strict_max")
+				 (UNSPEC_FMIN_STRICT "strict_min")])
+
+(define_int_attr  maxmin_strict_op [(UNSPEC_FMAX_STRICT "fmaxnm")
+				    (UNSPEC_FMIN_STRICT "fminnm")])
+
 (define_int_attr sur [(UNSPEC_SHADD "s") (UNSPEC_UHADD "u")
 		      (UNSPEC_SRHADD "sr") (UNSPEC_URHADD "ur")
 		      (UNSPEC_SHSUB "s") (UNSPEC_UHSUB "u")
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 1e7f3f1..3b24e4d 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -292,6 +292,8 @@
 
 (define_int_iterator VMAXMINF [UNSPEC_VMAX UNSPEC_VMIN])
 
+(define_int_iterator VMAXMINF_STRICT [UNSPEC_VMAX_STRICT UNSPEC_VMIN_STRICT])
+
 (define_int_iterator VPADDL [UNSPEC_VPADDL_S UNSPEC_VPADDL_U])
 
 (define_int_iterator VPADAL [UNSPEC_VPADAL_S UNSPEC_VPADAL_U])
@@ -716,6 +718,13 @@
   (UNSPEC_VPMIN "min") (UNSPEC_VPMIN_U "min")
 ])
 
+(define_int_attr  maxmin_strict [
+  (UNSPEC_VMAX_STRICT "strict_max") (UNSPEC_VMIN_STRICT "strict_min")])
+
+(define_int_attr maxmin_strict_op [
+  (UNSPEC_VMAX_STRICT "vmaxnm") (UNSPEC_VMIN_STRICT "vminnm")
+])
+
 (define_int_attr shift_op [
   (UNSPEC_VSHL_S "shl") (UNSPEC_VSHL_U "shl")
   (UNSPEC_VRSHL_S "rshl") (UNSPEC_VRSHL_U "rshl")
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 654d9d5..e71e31f 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -2354,6 +2354,15 @@
   [(set_attr "type" "neon_fp_minmax_s<q>")]
 )
 
+(define_insn "<maxmin_strict><mode>3"
+  [(set (match_operand:VCVTF 0 "s_register_operand" "=w")
+	(unspec:VCVTF [(match_operand:VCVTF 1 "s_register_operand" "w")
+		       (match_operand:VCVTF 2 "s_register_operand" "w")]
+		       VMAXMINF_STRICT))]
+  "TARGET_NEON && TARGET_FPU_ARMV8"
+  "<maxmin_strict_op>.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+)
+
 (define_expand "neon_vpadd<mode>"
   [(match_operand:VD 0 "s_register_operand" "=w")
    (match_operand:VD 1 "s_register_operand" "w")
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 0ec2c48..83094d5 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -224,8 +224,10 @@
   UNSPEC_VLD4_LANE
   UNSPEC_VMAX
   UNSPEC_VMAX_U
+  UNSPEC_VMAX_STRICT
   UNSPEC_VMIN
   UNSPEC_VMIN_U
+  UNSPEC_VMIN_STRICT
   UNSPEC_VMLA
   UNSPEC_VMLA_LANE
   UNSPEC_VMLAL_S
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index f62ff79..351af4f 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -1345,6 +1345,15 @@
    (set_attr "conds" "unconditional")]
 )
 
+(define_insn "<maxmin_strict><mode>3"
+  [(set (match_operand:SDF 0 "s_register_operand" "=<F_constraint>")
+	(unspec:SDF [(match_operand:SDF 1 "s_register_operand" "<F_constraint>")
+		     (match_operand:SDF 2 "s_register_operand" "<F_constraint>")]
+		     VMAXMINF_STRICT))]
+  "TARGET_HARD_FLOAT && TARGET_VFP5 <vfp_double_cond>"
+  "<maxmin_strict_op>.<V_if_elem>\\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+)
+
 ;; Write Floating-point Status and Control Register.
 (define_insn "set_fpscr"
   [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR)]
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index bbafad9..8dad9a7 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1268,6 +1268,8 @@ the byte offset of the field, but should not be used directly; call
 @tindex TARGET_EXPR
 @tindex VA_ARG_EXPR
 @tindex ANNOTATE_EXPR
+@tindex STRICT_MAX_EXPR
+@tindex STRICT_MIN_EXPR
 
 @table @code
 @item NEGATE_EXPR
@@ -1687,8 +1689,16 @@ its sole argument yields the representation for @code{ap}.
 This node is used to attach markers to an expression. The first operand
 is the annotated expression, the second is an @code{INTEGER_CST} with
 a value from @code{enum annot_expr_kind}.
-@end table
 
+@item STRICT_MAX_EXPR
+@item STRICT_MIN_EXPR
+These nodes represent IEEE-conformant maximum and minimum operations.  If one
+operand is a quiet @code{NaN}, the other operand is returned.  If both operands
+are quiet @code{NaN}, a quiet @code{NaN} is returned.  If either operand is a
+signaling @code{NaN} (supported only with @option{-fsignaling-nans}), an invalid
+floating-point exception is raised and a quiet @code{NaN} is returned.
+
+@end table
 
 @node Vectors
 @subsection Vectors
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index e991286..f1c3417 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -4869,6 +4869,15 @@ Signed minimum and maximum operations.  When used with floating point,
 if both operands are zeros, or if either operand is @code{NaN}, then
 it is unspecified which of the two operands is returned as the result.
 
+@cindex @code{strict_min@var{m}3} instruction pattern
+@cindex @code{strict_max@var{m}3} instruction pattern
+@item @samp{strict_min@var{m}3}, @samp{strict_max@var{m}3}
+IEEE-conformant minimum and maximum operations.  If one operand is a quiet
+@code{NaN}, then the other operand is returned.  If both operands are quiet
+@code{NaN}, then a quiet @code{NaN} is returned.  If either operand is a
+signaling @code{NaN} (supported only with @option{-fsignaling-nans}), an invalid
+floating-point exception is raised and a quiet @code{NaN} is returned.
+
 @cindex @code{reduc_smin_@var{m}} instruction pattern
 @cindex @code{reduc_smax_@var{m}} instruction pattern
 @item @samp{reduc_smin_@var{m}}, @samp{reduc_smax_@var{m}}
diff --git a/gcc/expr.c b/gcc/expr.c
index 78904c2..e2adb01 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8729,6 +8729,8 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       return expand_abs (mode, op0, target, unsignedp,
 			 safe_from_p (target, treeop0, 1));
 
+    case STRICT_MAX_EXPR:
+    case STRICT_MIN_EXPR:
     case MAX_EXPR:
     case MIN_EXPR:
       target = original_target;
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 60aa210..143f457 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -1164,6 +1164,8 @@ const_binop (enum tree_code code, tree arg1, tree arg2)
 	case RDIV_EXPR:
 	case MIN_EXPR:
 	case MAX_EXPR:
+	case STRICT_MIN_EXPR:
+	case STRICT_MAX_EXPR:
 	  break;
 
 	default:
@@ -9872,7 +9874,8 @@ fold_binary_loc (location_t loc,
      cases, the appropriate type conversions should be put back in
      the tree that will get out of the constant folder.  */
 
-  if (kind == tcc_comparison || code == MIN_EXPR || code == MAX_EXPR)
+  if (kind == tcc_comparison || code == MIN_EXPR || code == MAX_EXPR
+      || code == STRICT_MIN_EXPR || code == STRICT_MAX_EXPR)
     {
       STRIP_SIGN_NOPS (arg0);
       STRIP_SIGN_NOPS (arg1);
@@ -14773,6 +14776,7 @@ tree_binary_nonnegative_warnv_p (enum tree_code code, tree type, tree op0,
 
     case BIT_AND_EXPR:
     case MAX_EXPR:
+    case STRICT_MAX_EXPR:
       return (tree_expr_nonnegative_warnv_p (op0,
 					     strict_overflow_p)
 	      || tree_expr_nonnegative_warnv_p (op1,
@@ -14781,6 +14785,7 @@ tree_binary_nonnegative_warnv_p (enum tree_code code, tree type, tree op0,
     case BIT_IOR_EXPR:
     case BIT_XOR_EXPR:
     case MIN_EXPR:
+    case STRICT_MIN_EXPR:
     case RDIV_EXPR:
     case TRUNC_DIV_EXPR:
     case CEIL_DIV_EXPR:
@@ -15235,6 +15240,7 @@ tree_binary_nonzero_warnv_p (enum tree_code code,
       break;
 
     case MIN_EXPR:
+    case STRICT_MIN_EXPR:
       sub_strict_overflow_p = false;
       if (tree_expr_nonzero_warnv_p (op0,
 				     &sub_strict_overflow_p)
@@ -15247,6 +15253,7 @@ tree_binary_nonzero_warnv_p (enum tree_code code,
       break;
 
     case MAX_EXPR:
+    case STRICT_MAX_EXPR:
       sub_strict_overflow_p = false;
       if (tree_expr_nonzero_warnv_p (op0,
 				     &sub_strict_overflow_p))
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 491341b..ca642de 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -482,6 +482,12 @@ optab_for_tree_code (enum tree_code code, const_tree type,
     case MIN_EXPR:
       return TYPE_UNSIGNED (type) ? umin_optab : smin_optab;
 
+    case STRICT_MAX_EXPR:
+      return strict_max_optab;
+
+    case STRICT_MIN_EXPR:
+      return strict_min_optab;
+
     case REALIGN_LOAD_EXPR:
       return vec_realign_load_optab;
 
@@ -6798,6 +6804,16 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx sel, rtx target)
   return tmp;
 }
 
+/* Return true if the target supports the strict max (MAX = true) or strict
+   min (MAX = false) operation on type TYPE.  */
+bool
+strict_minmax_support (tree type, bool max)
+{
+  optab optab = optab_for_tree_code
+    (max ? STRICT_MAX_EXPR : STRICT_MIN_EXPR, type, optab_default);
+  return optab_handler (optab, TYPE_MODE (type)) != CODE_FOR_nothing;
+}
+
 /* Return insn code for a conditional operator with a comparison in
    mode CMODE, unsigned if UNS is true, resulting in a value of mode VMODE.  */
 
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 888b21c..7a79e76 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -244,6 +244,10 @@ OPTAB_D (sin_optab, "sin$a2")
 OPTAB_D (sincos_optab, "sincos$a3")
 OPTAB_D (tan_optab, "tan$a2")
 
+/* C99 implementations of fmax/fmin.  */
+OPTAB_D (strict_max_optab, "strict_max$a3")
+OPTAB_D (strict_min_optab, "strict_min$a3")
+
 /* Vector reduction to a scalar.  */
 OPTAB_D (reduc_smax_scal_optab, "reduc_smax_scal_$a")
 OPTAB_D (reduc_smin_scal_optab, "reduc_smin_scal_$a")
diff --git a/gcc/optabs.h b/gcc/optabs.h
index 95f5cbc..14b7a39 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -565,4 +565,6 @@ extern bool lshift_cheap_p (bool);
 
 extern enum rtx_code get_rtx_code (enum tree_code tcode, bool unsignedp);
 
+extern bool strict_minmax_support (tree, bool);
+
 #endif /* GCC_OPTABS_H */
diff --git a/gcc/real.c b/gcc/real.c
index 2d34b62..aa2f63c 100644
--- a/gcc/real.c
+++ b/gcc/real.c
@@ -1034,6 +1034,15 @@ real_arithmetic (REAL_VALUE_TYPE *r, int icode, const REAL_VALUE_TYPE *op0,
 	*r = *op1;
       break;
 
+    case STRICT_MIN_EXPR:
+      if (op0->cl == rvc_nan)
+	*r = *op1;
+      else if (do_compare (op0, op1, -1) < 0)
+	*r = *op0;
+      else
+	*r = *op1;
+      break;
+
     case MAX_EXPR:
       if (op1->cl == rvc_nan)
 	*r = *op1;
@@ -1043,6 +1052,15 @@ real_arithmetic (REAL_VALUE_TYPE *r, int icode, const REAL_VALUE_TYPE *op0,
 	*r = *op0;
       break;
 
+    case STRICT_MAX_EXPR:
+      if (op0->cl == rvc_nan)
+	*r = *op1;
+      else if (do_compare (op0, op1, 1) < 0)
+	*r = *op1;
+      else
+	*r = *op0;
+      break;
+
     case NEGATE_EXPR:
       *r = *op0;
       r->sign ^= 1;
diff --git a/gcc/testsuite/gcc.target/aarch64/maxmin_strict.c b/gcc/testsuite/gcc.target/aarch64/maxmin_strict.c
new file mode 100644
index 0000000..09cea1d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/maxmin_strict.c
@@ -0,0 +1,69 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-vectorize -fno-inline -save-temps" } */
+
+
+extern void abort (void);
+double fmax(double, double);
+float fmaxf(float, float);
+double fmin(double, double);
+float fminf(float, float);
+
+#define isnan __builtin_isnan
+#define isinf __builtin_isinf
+
+#define NAN __builtin_nan ("")
+#define INFINITY __builtin_inf ()
+
+#define NUM_ELEMS(TYPE) (16 / sizeof (TYPE))
+
+#define DEF_MAXMIN(TYPE,FUN)\
+void test_##FUN (TYPE *__restrict__ r, TYPE *__restrict__ a,\
+		 TYPE *__restrict__ b)\
+{\
+  int i;\
+  for (i = 0; i < NUM_ELEMS (TYPE); i++)\
+    r[i] = FUN (a[i], b[i]);\
+}\
+
+DEF_MAXMIN (float, fmaxf)
+DEF_MAXMIN (double, fmax)
+
+DEF_MAXMIN (float, fminf)
+DEF_MAXMIN (double, fmin)
+
+int main ()
+{
+  float a_f[4] = { 4, NAN, -3, INFINITY };
+  float b_f[4] = { 1,   7,NAN, 0 };
+  float r_f[4];
+  double a_d[4] = { 4, NAN,  -3,  INFINITY };
+  double b_d[4] = { 1,   7, NAN,  0 };
+  double r_d[4];
+
+  test_fmaxf (r_f, a_f, b_f);
+  if (r_f[0] != 4 || isnan (r_f[1]) || isnan (r_f[2]) || !isinf (r_f[3]))
+    abort ();
+
+  test_fminf (r_f, a_f, b_f);
+  if (r_f[0] != 1 || isnan (r_f[1]) || isnan (r_f[2]) || isinf (r_f[3]))
+    abort ();
+
+  test_fmax (r_d, a_d, b_d);
+  test_fmax (&r_d[2], &a_d[2], &b_d[2]);
+  if (r_d[0] != 4 || isnan (r_d[1]) || isnan (r_d[2]) || !isinf (r_d[3]))
+    abort ();
+
+  test_fmin (r_d, a_d, b_d);
+  test_fmin (&r_d[2], &a_d[2], &b_d[2]);
+  if (r_d[0] != 1 || isnan (r_d[1]) || isnan (r_d[2]) || isinf (r_d[3]))
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "fmaxnm\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.4s" 1 } } */
+/* { dg-final { scan-assembler-times "fmaxnm\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" 1 } } */
+
+/* { dg-final { scan-assembler-times "fminnm\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.4s" 1 } } */
+/* { dg-final { scan-assembler-times "fminnm\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/arm/maxmin_strict.c b/gcc/testsuite/gcc.target/arm/maxmin_strict.c
new file mode 100644
index 0000000..aa1dd6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/maxmin_strict.c
@@ -0,0 +1,67 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_neon_ok } */
+/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a -save-temps" } */
+/* { dg-add-options arm_v8_neon } */
+
+extern void abort (void);
+double fmax(double, double);
+float fmaxf(float, float);
+double fmin(double, double);
+float fminf(float, float);
+
+#define isnan __builtin_isnan
+#define isinf __builtin_isinf
+
+#define NAN __builtin_nan ("")
+#define INFINITY __builtin_inf ()
+
+#define DEF_MAXMIN(TYPE,FUN)\
+void test_##FUN (TYPE *__restrict__ r, TYPE *__restrict__ a,\
+		 TYPE *__restrict__ b)\
+{\
+  int i;\
+  for (i = 0; i < 4; i++)\
+    r[i] = FUN (a[i], b[i]);\
+}\
+
+DEF_MAXMIN (float, fmaxf)
+DEF_MAXMIN (double, fmax)
+
+DEF_MAXMIN (float, fminf)
+DEF_MAXMIN (double, fmin)
+
+int main ()
+{
+  float a_f[4] = { 4, NAN, -3, INFINITY };
+  float b_f[4] = { 1,   7,NAN, 0 };
+  float r_f[4];
+  double a_d[4] = { 4, NAN,  -3,  INFINITY };
+  double b_d[4] = { 1,   7, NAN,  0 };
+  double r_d[4];
+
+  test_fmaxf (r_f, a_f, b_f);
+  if (r_f[0] != 4 || isnan (r_f[1]) || isnan (r_f[2]) || !isinf (r_f[3]))
+    abort ();
+
+  test_fminf (r_f, a_f, b_f);
+  if (r_f[0] != 1 || isnan (r_f[1]) || isnan (r_f[2]) || isinf (r_f[3]))
+    abort ();
+
+  test_fmax (r_d, a_d, b_d);
+  if (r_d[0] != 4 || isnan (r_d[1]) || isnan (r_d[2]) || !isinf (r_d[3]))
+    abort ();
+
+  test_fmin (r_d, a_d, b_d);
+  if (r_d[0] != 1 || isnan (r_d[1]) || isnan (r_d[2]) || isinf (r_d[3]))
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "vmaxnm.f32\tq\[0-9\]+, q\[0-9\]+, q\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "vminnm.f32\tq\[0-9\]+, q\[0-9\]+, q\[0-9\]+" 1 } } */
+
+/* NOTE: There are no double precision vector versions of vmaxnm/vminnm.  */
+/* { dg-final { scan-assembler-times "vmaxnm.f64\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "vminnm.f64\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" 1 } } */
+
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index adc56ba..f717e37 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3091,6 +3091,8 @@ verify_expr (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
     case EXACT_DIV_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
     case LSHIFT_EXPR:
     case RSHIFT_EXPR:
     case LROTATE_EXPR:
@@ -3916,6 +3918,8 @@ verify_gimple_assign_binary (gassign *stmt)
     case EXACT_DIV_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
     case BIT_IOR_EXPR:
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index ce9495d..1b95154 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3888,6 +3888,8 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case FLOAT_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
     case ABS_EXPR:
 
     case LSHIFT_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 13587e6..6d13fd2 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -2849,6 +2849,8 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       pp_string (pp, " > ");
       break;
 
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
@@ -3223,6 +3225,8 @@ op_code_prio (enum tree_code code)
       /* Special expressions.  */
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
     case ABS_EXPR:
     case REALPART_EXPR:
     case IMAGPART_EXPR:
@@ -3419,6 +3423,12 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case STRICT_MAX_EXPR:
+      return "strictmax";
+
+    case STRICT_MIN_EXPR:
+      return "strictmin";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.c b/gcc/tree.c
index f6ab441..2d6b909 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -7529,6 +7529,8 @@ associative_tree_code (enum tree_code code)
     case MULT_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
       return true;
 
     default:
@@ -7549,6 +7551,8 @@ commutative_tree_code (enum tree_code code)
     case MULT_HIGHPART_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case STRICT_MIN_EXPR:
+    case STRICT_MAX_EXPR:
     case BIT_IOR_EXPR:
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
diff --git a/gcc/tree.def b/gcc/tree.def
index 56580af..daa4c77 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -722,6 +722,14 @@ DEFTREECODE (NEGATE_EXPR, "negate_expr", tcc_unary, 1)
 DEFTREECODE (MIN_EXPR, "min_expr", tcc_binary, 2)
 DEFTREECODE (MAX_EXPR, "max_expr", tcc_binary, 2)
 
+/* Minimum and maximum values.  When used with floating point, these conform to
+   the C99 definition of fmin and fmax, i.e.
+     1. if one operand is NaN, the other (numeric) operand is returned,
+     2. if both operands are NaN, a NaN is returned,
+     3. there is no distinction between -0 and 0.  */
+DEFTREECODE (STRICT_MIN_EXPR, "strict_min_expr", tcc_binary, 2)
+DEFTREECODE (STRICT_MAX_EXPR, "strict_max_expr", tcc_binary, 2)
+
 /* Represents the absolute value of the operand.
 
    An ABS_EXPR must have either an INTEGER_TYPE or a REAL_TYPE.  The

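For reference, the quiet-NaN behaviour that the generic.texi and md.texi
hunks describe can be sketched in plain C.  This is only an illustration
under stated assumptions, not code from the patch: the helper names
ref_strict_max and ref_strict_min are hypothetical, and the sketch ignores
signaling NaNs and does not try to order -0 and +0 (the tree.def comment
says no distinction is made between them).

```c
#include <math.h>

/* Hypothetical reference helpers (not part of the patch).  They model
   only the quiet-NaN rules: if one operand is NaN, return the other;
   if both are NaN, the result is NaN.  */

static double
ref_strict_max (double a, double b)
{
  if (isnan (a))
    return b;		/* Also covers both-NaN: B is NaN, so NaN is returned.  */
  if (isnan (b))
    return a;
  return a > b ? a : b;
}

static double
ref_strict_min (double a, double b)
{
  if (isnan (a))
    return b;
  if (isnan (b))
    return a;
  return a < b ? a : b;
}
```

Fed with the input vectors used in the new maxmin_strict.c tests
({ 4, NAN, -3, INFINITY } against { 1, 7, NAN, 0 }), these helpers never
return a NaN, which is the property the tests check with isnan.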

end of thread, other threads:[~2015-11-25 12:38 UTC | newest]

Thread overview: 24+ messages
2015-08-06  9:39 [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax* David Sherwood
2015-08-13 10:13 David Sherwood
2015-08-13 11:12 ` Richard Biener
2015-08-17  9:41   ` David Sherwood
2015-08-17 14:02     ` Richard Biener
2015-08-18 11:10       ` David Sherwood
2015-08-18 13:31         ` Richard Biener
2015-08-18 14:20           ` Richard Sandiford
2015-08-19  9:48             ` Richard Biener
2015-08-19 10:04               ` Richard Sandiford
2015-08-19 10:31                 ` Richard Biener
2015-08-19 12:23                   ` Richard Sandiford
2015-08-19 12:35                     ` Richard Biener
2015-08-19 13:16                       ` Richard Sandiford
2015-08-19 13:41                         ` Richard Biener
2015-09-14 10:47                           ` David Sherwood
2015-09-14 13:42                             ` Richard Biener
2015-09-14 20:38                               ` Joseph Myers
2015-08-19 15:32                       ` Joseph Myers
2015-11-23  9:21                       ` David Sherwood
2015-11-25 12:39                         ` Richard Biener
2015-08-19 15:07               ` Michael Matz
2015-08-19 15:25                 ` Richard Biener
2015-08-19 15:39                   ` Richard Sandiford
