RE: [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax*

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: "David Sherwood" <david.sherwood@arm.com>
To: "'Richard Biener'" <richard.guenther@gmail.com>,
	"GCC Patches" <gcc-patches@gcc.gnu.org>,
	"Richard Sandiford" <Richard.Sandiford@arm.com>
Subject: RE: [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax*
Date: Mon, 14 Sep 2015 10:47:00 -0000	[thread overview]
Message-ID: <000001d0eed9$48ed0070$dac70150$@arm.com> (raw)
In-Reply-To: <CAFiYyc0dt911qOdRAXa3LmcB5JMzPL0jc_Su_0po123Dkp22iA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3232 bytes --]

Hi All,

For what it's worth I have uploaded a new patch that changes the name
from STRICT_FMIN/MAX to just FMIN/FMAX, although I realise that this
discussion has not yet been resolved. I have also added scheduling
attributes to the aarch64 instructions.

Regards,
David Sherwood.

ChangeLog:

2015-08-28  David Sherwood  <david.sherwood@arm.com>

    gcc/
	* builtins.c (integer_valued_real_p): Add FMIN_EXPR and FMAX_EXPR.
	(fold_builtin_fmin_fmax): For strict math, convert builtins fmin and
	fmax to FMIN_EXPR and FMIN_EXPR, respectively.
	* expr.c (expand_expr_real_2): Add FMIN_EXPR and FMAX_EXPR.
	* fold-const.c (const_binop): Likewise.
	(fold_binary_loc, tree_binary_nonnegative_warnv_p): Likewise.
	(tree_binary_nonzero_warnv_p): Likewise.
	* optabs.h (fminmax_support): Declare.
	* optabs.def: Add new optabs fmax_optab/fmin_optab.
	* optabs.c (optab_for_tree_code): Return new optabs for FMIN_EXPR and
	FMAX_EXPR.
	(fminmax_support): New function.
	* real.c (real_arithmetic): Add FMIN_EXPR and FMAX_EXPR.
	* tree.def: Likewise.
	* tree.c (associative_tree_code, commutative_tree_code): Likewise.
	* tree-cfg.c (verify_expr): Likewise.
	(verify_gimple_assign_binary): Likewise.
	* tree-inline.c (estimate_operator_cost): Likewise.
	* tree-pretty-print.c (dump_generic_node, op_code_prio): Likewise.
	(op_symbol_code): Likewise.
	* config/aarch64/aarch64.md: New pattern.
	* config/aarch64/aarch64-simd.md: Likewise.
	* config/aarch64/iterators.md: New unspecs, iterators.
	* config/arm/iterators.md: New iterators.
	* config/arm/unspecs.md: New unspecs.
	* config/arm/neon.md: New pattern.
	* config/arm/vfp.md: Likewise.
        * doc/generic.texi: Add FMAX_EXPR and FMIN_EXPR.
        * doc/md.texi: Add fmin and fmax patterns.
    gcc/testsuite
        * gcc.target/aarch64/fmaxmin.c: New test.
        * gcc.target/arm/fmaxmin.c: New test.


> -----Original Message-----
> From: Richard Biener [mailto:richard.guenther@gmail.com]
> Sent: 19 August 2015 14:41
> To: Richard Biener; David Sherwood; GCC Patches; Richard Sandiford
> Subject: Re: [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax*
> 
> On Wed, Aug 19, 2015 at 3:06 PM, Richard Sandiford
> <richard.sandiford@arm.com> wrote:
> > Richard Biener <richard.guenther@gmail.com> writes:
> >> As an additional point for many math functions we have to support errno
> >> which means, like, BUILT_IN_SQRT can be rewritten to SQRT_EXPR
> >> only if -fno-math-errno is in effect.  But then code has to handle
> >> both variants for things like constant folding and expression combining.
> >> That's very unfortunate and something we want to avoid (one reason
> >> the POW_EXPR thing didn't fly when I tried).  STRICT_FMIN/MAX_EXPR
> >> is an example where this doesn't apply, of course (but I detest the name,
> >> just use FMIN/FMAX_EXPR?).  Still you'd need to handle both,
> >> FMIN_EXPR and BUILT_IN_FMIN, in code doing analysis/transform.
> >
> > Yeah, but match.pd makes that easy, right? ;-)
> 
> Sure, but that only addresses stmt combining, not other passes.  And of course
> it causes {gimple,generic}-match.c to become even bigger ;)
> 
> Richard.


[-- Attachment #2: fmaxmin.patch --]
[-- Type: application/octet-stream, Size: 21233 bytes --]

diff --git a/gcc/builtins.c b/gcc/builtins.c
index d79372c..284534f 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7393,6 +7393,8 @@ integer_valued_real_p (tree t)
     case MULT_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case FMIN_EXPR:
+    case FMAX_EXPR:
       return integer_valued_real_p (TREE_OPERAND (t, 0))
 	     && integer_valued_real_p (TREE_OPERAND (t, 1));
 
@@ -9176,6 +9178,10 @@ fold_builtin_fmin_fmax (location_t loc, tree arg0, tree arg1,
 	return fold_build2_loc (loc, (max ? MAX_EXPR : MIN_EXPR), type,
 			    fold_convert_loc (loc, type, arg0),
 			    fold_convert_loc (loc, type, arg1));
+      else if (fminmax_support (type, max))
+	return fold_build2_loc (loc, (max ? FMAX_EXPR : FMIN_EXPR), type,
+			    fold_convert_loc (loc, type, arg0),
+			    fold_convert_loc (loc, type, arg1));
     }
   return NULL_TREE;
 }
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 9777418..0a704b6 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1821,6 +1821,16 @@
   [(set_attr "type" "neon_fp_minmax_<Vetype><q>")]
 )
 
+(define_insn "<fmaxmin><mode>3"
+  [(set (match_operand:VDQF 0 "register_operand" "=w")
+	(unspec:VDQF [(match_operand:VDQF 1 "register_operand" "w")
+		      (match_operand:VDQF 2 "register_operand" "w")]
+		      FMAXMIN_STRICT))]
+  "TARGET_SIMD"
+  "<fmaxmin_op>\\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
+  [(set_attr "type" "neon_fp_minmax_<Vetype><q>")]
+)
+
 (define_insn "<maxmin_uns><mode>3"
   [(set (match_operand:VDQF 0 "register_operand" "=w")
        (unspec:VDQF [(match_operand:VDQF 1 "register_operand" "w")
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index c8511f0..023be58 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4315,6 +4315,16 @@
   [(set_attr "type" "f_minmax<s>")]
 )
 
+(define_insn "<fmaxmin><mode>3"
+  [(set (match_operand:GPF 0 "register_operand" "=w")
+	(unspec:GPF [(match_operand:GPF 1 "register_operand" "w")
+		     (match_operand:GPF 2 "register_operand" "w")]
+		     FMAXMIN_STRICT))]
+  "TARGET_FLOAT"
+  "<fmaxmin_op>\\t%<s>0, %<s>1, %<s>2"
+  [(set_attr "type" "f_minmax<s>")]
+)
+
 ;; -------------------------------------------------------------------
 ;; Reload support
 ;; -------------------------------------------------------------------
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index b8a45d1..5e248bc 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -282,6 +282,8 @@
     UNSPEC_PMULL2       ; Used in aarch64-simd.md.
     UNSPEC_REV_REGLIST  ; Used in aarch64-simd.md.
     UNSPEC_VEC_SHR      ; Used in aarch64-simd.md.
+    UNSPEC_FMAX_STRICT  ; Used in aarch64-simd.md.
+    UNSPEC_FMIN_STRICT  ; Used in aarch64-simd.md.
 ])
 
 ;; -------------------------------------------------------------------
@@ -873,6 +875,8 @@
 
 (define_int_iterator FMAXMIN_UNS [UNSPEC_FMAX UNSPEC_FMIN])
 
+(define_int_iterator FMAXMIN_STRICT [UNSPEC_FMAX_STRICT UNSPEC_FMIN_STRICT])
+
 (define_int_iterator VQDMULH [UNSPEC_SQDMULH UNSPEC_SQRDMULH])
 
 (define_int_iterator USSUQADD [UNSPEC_SUQADD UNSPEC_USQADD])
@@ -953,6 +957,12 @@
 				 (UNSPEC_FMINNMV "fminnm")
 				 (UNSPEC_FMINV "fmin")])
 
+(define_int_attr fmaxmin [(UNSPEC_FMAX_STRICT "fmax")
+			  (UNSPEC_FMIN_STRICT "fmin")])
+
+(define_int_attr fmaxmin_op [(UNSPEC_FMAX_STRICT "fmaxnm")
+			     (UNSPEC_FMIN_STRICT "fminnm")])
+
 (define_int_attr sur [(UNSPEC_SHADD "s") (UNSPEC_UHADD "u")
 		      (UNSPEC_SRHADD "sr") (UNSPEC_URHADD "ur")
 		      (UNSPEC_SHSUB "s") (UNSPEC_UHSUB "u")
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 1e7f3f1..42fc688 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -292,6 +292,8 @@
 
 (define_int_iterator VMAXMINF [UNSPEC_VMAX UNSPEC_VMIN])
 
+(define_int_iterator VMAXMINF_STRICT [UNSPEC_VMAX_STRICT UNSPEC_VMIN_STRICT])
+
 (define_int_iterator VPADDL [UNSPEC_VPADDL_S UNSPEC_VPADDL_U])
 
 (define_int_iterator VPADAL [UNSPEC_VPADAL_S UNSPEC_VPADAL_U])
@@ -716,6 +718,13 @@
   (UNSPEC_VPMIN "min") (UNSPEC_VPMIN_U "min")
 ])
 
+(define_int_attr fmaxmin [
+  (UNSPEC_VMAX_STRICT "fmax") (UNSPEC_VMIN_STRICT "fmin")])
+
+(define_int_attr fmaxmin_op [
+  (UNSPEC_VMAX_STRICT "vmaxnm") (UNSPEC_VMIN_STRICT "vminnm")
+])
+
 (define_int_attr shift_op [
   (UNSPEC_VSHL_S "shl") (UNSPEC_VSHL_U "shl")
   (UNSPEC_VRSHL_S "rshl") (UNSPEC_VRSHL_U "rshl")
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 873330f..d39f7ff 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -2354,6 +2354,16 @@
   [(set_attr "type" "neon_fp_minmax_s<q>")]
 )
 
+(define_insn "<fmaxmin><mode>3"
+  [(set (match_operand:VCVTF 0 "s_register_operand" "=w")
+	(unspec:VCVTF [(match_operand:VCVTF 1 "s_register_operand" "w")
+		       (match_operand:VCVTF 2 "s_register_operand" "w")]
+		       VMAXMINF_STRICT))]
+  "TARGET_NEON && TARGET_FPU_ARMV8"
+  "<fmaxmin_op>.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+  [(set_attr "type" "neon_fp_minmax_s<q>")]
+)
+
 (define_expand "neon_vpadd<mode>"
   [(match_operand:VD 0 "s_register_operand" "=w")
    (match_operand:VD 1 "s_register_operand" "w")
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 0ec2c48..83094d5 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -224,8 +224,10 @@
   UNSPEC_VLD4_LANE
   UNSPEC_VMAX
   UNSPEC_VMAX_U
+  UNSPEC_VMAX_STRICT
   UNSPEC_VMIN
   UNSPEC_VMIN_U
+  UNSPEC_VMIN_STRICT
   UNSPEC_VMLA
   UNSPEC_VMLA_LANE
   UNSPEC_VMLAL_S
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index 081aab2..a9a5949 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -1368,6 +1368,17 @@
    (set_attr "conds" "unconditional")]
 )
 
+(define_insn "<fmaxmin><mode>3"
+  [(set (match_operand:SDF 0 "s_register_operand" "=<F_constraint>")
+	(unspec:SDF [(match_operand:SDF 1 "s_register_operand" "<F_constraint>")
+		     (match_operand:SDF 2 "s_register_operand" "<F_constraint>")]
+		     VMAXMINF_STRICT))]
+  "TARGET_HARD_FLOAT && TARGET_VFP5 <vfp_double_cond>"
+  "<fmaxmin_op>.<V_if_elem>\\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+  [(set_attr "type" "f_minmax<vfp_type>")
+   (set_attr "conds" "unconditional")]
+)
+
 ;; Write Floating-point Status and Control Register.
 (define_insn "set_fpscr"
   [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR)]
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index bbafad9..937ca5b 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1268,6 +1268,8 @@ the byte offset of the field, but should not be used directly; call
 @tindex TARGET_EXPR
 @tindex VA_ARG_EXPR
 @tindex ANNOTATE_EXPR
+@tindex FMAX_EXPR
+@tindex FMIN_EXPR
 
 @table @code
 @item NEGATE_EXPR
@@ -1687,8 +1689,16 @@ its sole argument yields the representation for @code{ap}.
 This node is used to attach markers to an expression. The first operand
 is the annotated expression, the second is an @code{INTEGER_CST} with
 a value from @code{enum annot_expr_kind}.
-@end table
 
+@item FMAX_EXPR
+@item FMIN_EXPR
+These nodes represent IEEE-conformant maximum and minimum operations.  If either
+operand is a quiet @code{NaN} the other operand is returned.  If both operands
+are quiet @code{NaN}, then a quiet @code{NaN} is returned.  In the case when gcc
+supports signalling @code{NaN} (-fsignaling-nans) an invalid floating point
+exception is raised and a quiet @code{NaN} is returned.
+
+@end table
 
 @node Vectors
 @subsection Vectors
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 0bffdc6..31b1d24 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -4879,6 +4879,15 @@ Signed minimum and maximum operations.  When used with floating point,
 if both operands are zeros, or if either operand is @code{NaN}, then
 it is unspecified which of the two operands is returned as the result.
 
+@cindex @code{fmin@var{m}3} instruction pattern
+@cindex @code{fmax@var{m}3} instruction pattern
+@item @samp{fmin@var{m}3}, @samp{fmax@var{m}3}
+IEEE-conformant minimum and maximum operations.  If one operand is a quiet
+@code{NaN}, then the other operand is returned.  If both operands are quiet
+@code{NaN}, then a quiet @code{NaN} is returned.  In the case when gcc supports
+signalling @code{NaN} (-fsignaling-nans) an invalid floating point exception is
+raised and a quiet @code{NaN} is returned.
+
 @cindex @code{reduc_smin_@var{m}} instruction pattern
 @cindex @code{reduc_smax_@var{m}} instruction pattern
 @item @samp{reduc_smin_@var{m}}, @samp{reduc_smax_@var{m}}
diff --git a/gcc/expr.c b/gcc/expr.c
index 1e820b4..f69ba80 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8689,6 +8689,8 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       return expand_abs (mode, op0, target, unsignedp,
 			 safe_from_p (target, treeop0, 1));
 
+    case FMAX_EXPR:
+    case FMIN_EXPR:
     case MAX_EXPR:
     case MIN_EXPR:
       target = original_target;
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index c826e67..7846e42 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -1155,6 +1155,8 @@ const_binop (enum tree_code code, tree arg1, tree arg2)
 	case RDIV_EXPR:
 	case MIN_EXPR:
 	case MAX_EXPR:
+	case FMIN_EXPR:
+	case FMAX_EXPR:
 	  break;
 
 	default:
@@ -9069,7 +9071,8 @@ fold_binary_loc (location_t loc,
      cases, the appropriate type conversions should be put back in
      the tree that will get out of the constant folder.  */
 
-  if (kind == tcc_comparison || code == MIN_EXPR || code == MAX_EXPR)
+  if (kind == tcc_comparison || code == MIN_EXPR || code == MAX_EXPR
+      || code == FMIN_EXPR || code == FMAX_EXPR)
     {
       STRIP_SIGN_NOPS (arg0);
       STRIP_SIGN_NOPS (arg1);
@@ -13017,6 +13020,7 @@ tree_binary_nonnegative_warnv_p (enum tree_code code, tree type, tree op0,
 
     case BIT_AND_EXPR:
     case MAX_EXPR:
+    case FMAX_EXPR:
       return (tree_expr_nonnegative_warnv_p (op0,
 					     strict_overflow_p)
 	      || tree_expr_nonnegative_warnv_p (op1,
@@ -13025,6 +13029,7 @@ tree_binary_nonnegative_warnv_p (enum tree_code code, tree type, tree op0,
     case BIT_IOR_EXPR:
     case BIT_XOR_EXPR:
     case MIN_EXPR:
+    case FMIN_EXPR:
     case RDIV_EXPR:
     case TRUNC_DIV_EXPR:
     case CEIL_DIV_EXPR:
@@ -13479,6 +13484,7 @@ tree_binary_nonzero_warnv_p (enum tree_code code,
       break;
 
     case MIN_EXPR:
+    case FMIN_EXPR:
       sub_strict_overflow_p = false;
       if (tree_expr_nonzero_warnv_p (op0,
 				     &sub_strict_overflow_p)
@@ -13491,6 +13497,7 @@ tree_binary_nonzero_warnv_p (enum tree_code code,
       break;
 
     case MAX_EXPR:
+    case FMAX_EXPR:
       sub_strict_overflow_p = false;
       if (tree_expr_nonzero_warnv_p (op0,
 				     &sub_strict_overflow_p))
diff --git a/gcc/optabs.c b/gcc/optabs.c
index e533e6e..b733a2e 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -476,6 +476,12 @@ optab_for_tree_code (enum tree_code code, const_tree type,
     case MIN_EXPR:
       return TYPE_UNSIGNED (type) ? umin_optab : smin_optab;
 
+    case FMAX_EXPR:
+      return fmax_optab;
+
+    case FMIN_EXPR:
+      return fmin_optab;
+
     case REALIGN_LOAD_EXPR:
       return vec_realign_load_optab;
 
@@ -6791,6 +6797,16 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx sel, rtx target)
   return tmp;
 }
 
+/* Return true if the target supports strict FP math max (MAX = TRUE) and min
+   (MAX = FALSE) operations on type TYPE.  */
+bool
+fminmax_support (tree type, bool max)
+{
+  optab optab = optab_for_tree_code
+    (max ? FMAX_EXPR : FMIN_EXPR, type, optab_default);
+  return optab_handler (optab, TYPE_MODE (type)) != CODE_FOR_nothing;
+}
+
 /* Return insn code for a conditional operator with a comparison in
    mode CMODE, unsigned if UNS is true, resulting in a value of mode VMODE.  */
 
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 888b21c..36c72d8 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -244,6 +244,10 @@ OPTAB_D (sin_optab, "sin$a2")
 OPTAB_D (sincos_optab, "sincos$a3")
 OPTAB_D (tan_optab, "tan$a2")
 
+/* C99 implementations of fmax/fmin.  */
+OPTAB_D (fmax_optab, "fmax$a3")
+OPTAB_D (fmin_optab, "fmin$a3")
+
 /* Vector reduction to a scalar.  */
 OPTAB_D (reduc_smax_scal_optab, "reduc_smax_scal_$a")
 OPTAB_D (reduc_smin_scal_optab, "reduc_smin_scal_$a")
diff --git a/gcc/optabs.h b/gcc/optabs.h
index 95f5cbc..0bbf736 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -565,4 +565,6 @@ extern bool lshift_cheap_p (bool);
 
 extern enum rtx_code get_rtx_code (enum tree_code tcode, bool unsignedp);
 
+extern bool fminmax_support (tree, bool);
+
 #endif /* GCC_OPTABS_H */
diff --git a/gcc/real.c b/gcc/real.c
index c1ff78d..3a2d7b6 100644
--- a/gcc/real.c
+++ b/gcc/real.c
@@ -1033,6 +1033,15 @@ real_arithmetic (REAL_VALUE_TYPE *r, int icode, const REAL_VALUE_TYPE *op0,
 	*r = *op1;
       break;
 
+    case FMIN_EXPR:
+      if (op0->cl == rvc_nan)
+	*r = *op1;
+      else if (do_compare (op0, op1, -1) < 0)
+	*r = *op0;
+      else
+	*r = *op1;
+      break;
+
     case MAX_EXPR:
       if (op1->cl == rvc_nan)
 	*r = *op1;
@@ -1042,6 +1051,15 @@ real_arithmetic (REAL_VALUE_TYPE *r, int icode, const REAL_VALUE_TYPE *op0,
 	*r = *op0;
       break;
 
+    case FMAX_EXPR:
+      if (op0->cl == rvc_nan)
+	*r = *op1;
+      else if (do_compare (op0, op1, 1) < 0)
+	*r = *op1;
+      else
+	*r = *op0;
+      break;
+
     case NEGATE_EXPR:
       *r = *op0;
       r->sign ^= 1;
diff --git a/gcc/testsuite/gcc.target/aarch64/fmaxmin.c b/gcc/testsuite/gcc.target/aarch64/fmaxmin.c
new file mode 100644
index 0000000..7654955
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fmaxmin.c
@@ -0,0 +1,69 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-vectorize -fno-inline -save-temps" } */
+
+
+extern void abort (void);
+double fmax (double, double);
+float fmaxf (float, float);
+double fmin (double, double);
+float fminf (float, float);
+
+#define isnan __builtin_isnan
+#define isinf __builtin_isinf
+
+#define NAN __builtin_nan ("")
+#define INFINITY __builtin_inf ()
+
+#define NUM_ELEMS(TYPE) (16 / sizeof (TYPE))
+
+#define DEF_MAXMIN(TYPE,FUN)\
+void test_##FUN (TYPE *__restrict__ r, TYPE *__restrict__ a,\
+		 TYPE *__restrict__ b)\
+{\
+  int i;\
+  for (i = 0; i < NUM_ELEMS (TYPE); i++)\
+    r[i] = FUN (a[i], b[i]);\
+}\
+
+DEF_MAXMIN (float, fmaxf)
+DEF_MAXMIN (double, fmax)
+
+DEF_MAXMIN (float, fminf)
+DEF_MAXMIN (double, fmin)
+
+int main ()
+{
+  float a_f[4] = { 4, NAN, -3, INFINITY };
+  float b_f[4] = { 1,   7,NAN, 0 };
+  float r_f[4];
+  double a_d[4] = { 4, NAN,  -3,  INFINITY };
+  double b_d[4] = { 1,   7, NAN,  0 };
+  double r_d[4];
+
+  test_fmaxf (r_f, a_f, b_f);
+  if (r_f[0] != 4 || isnan (r_f[1]) || isnan (r_f[2]) || !isinf (r_f[3]))
+    abort ();
+
+  test_fminf (r_f, a_f, b_f);
+  if (r_f[0] != 1 || isnan (r_f[1]) || isnan (r_f[2]) || isinf (r_f[3]))
+    abort ();
+
+  test_fmax (r_d, a_d, b_d);
+  test_fmax (&r_d[2], &a_d[2], &b_d[2]);
+  if (r_d[0] != 4 || isnan (r_d[1]) || isnan (r_d[2]) || !isinf (r_d[3]))
+    abort ();
+
+  test_fmin (r_d, a_d, b_d);
+  test_fmin (&r_d[2], &a_d[2], &b_d[2]);
+  if (r_d[0] != 1 || isnan (r_d[1]) || isnan (r_d[2]) || isinf (r_d[3]))
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "fmaxnm\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.4s" 1 } } */
+/* { dg-final { scan-assembler-times "fmaxnm\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" 1 } } */
+
+/* { dg-final { scan-assembler-times "fminnm\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.4s" 1 } } */
+/* { dg-final { scan-assembler-times "fminnm\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/arm/fmaxmin.c b/gcc/testsuite/gcc.target/arm/fmaxmin.c
new file mode 100644
index 0000000..f55ac5f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/fmaxmin.c
@@ -0,0 +1,67 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_neon_ok } */
+/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a -save-temps" } */
+/* { dg-add-options arm_v8_neon } */
+
+extern void abort (void);
+double fmax (double, double);
+float fmaxf (float, float);
+double fmin (double, double);
+float fminf (float, float);
+
+#define isnan __builtin_isnan
+#define isinf __builtin_isinf
+
+#define NAN __builtin_nan ("")
+#define INFINITY __builtin_inf ()
+
+#define DEF_MAXMIN(TYPE,FUN)\
+void test_##FUN (TYPE *__restrict__ r, TYPE *__restrict__ a,\
+		 TYPE *__restrict__ b)\
+{\
+  int i;\
+  for (i = 0; i < 4; i++)\
+    r[i] = FUN (a[i], b[i]);\
+}\
+
+DEF_MAXMIN (float, fmaxf)
+DEF_MAXMIN (double, fmax)
+
+DEF_MAXMIN (float, fminf)
+DEF_MAXMIN (double, fmin)
+
+int main ()
+{
+  float a_f[4] = { 4, NAN, -3, INFINITY };
+  float b_f[4] = { 1,   7,NAN, 0 };
+  float r_f[4];
+  double a_d[4] = { 4, NAN,  -3,  INFINITY };
+  double b_d[4] = { 1,   7, NAN,  0 };
+  double r_d[4];
+
+  test_fmaxf (r_f, a_f, b_f);
+  if (r_f[0] != 4 || isnan (r_f[1]) || isnan (r_f[2]) || !isinf (r_f[3]))
+    abort ();
+
+  test_fminf (r_f, a_f, b_f);
+  if (r_f[0] != 1 || isnan (r_f[1]) || isnan (r_f[2]) || isinf (r_f[3]))
+    abort ();
+
+  test_fmax (r_d, a_d, b_d);
+  if (r_d[0] != 4 || isnan (r_d[1]) || isnan (r_d[2]) || !isinf (r_d[3]))
+    abort ();
+
+  test_fmin (r_d, a_d, b_d);
+  if (r_d[0] != 1 || isnan (r_d[1]) || isnan (r_d[2]) || isinf (r_d[3]))
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "vmaxnm.f32\tq\[0-9\]+, q\[0-9\]+, q\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "vminnm.f32\tq\[0-9\]+, q\[0-9\]+, q\[0-9\]+" 1 } } */
+
+/* NOTE: There are no double precision vector versions of vmaxnm/vminnm.  */
+/* { dg-final { scan-assembler-times "vmaxnm.f64\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "vminnm.f64\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" 1 } } */
+
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 5ac73b3..36ab7a9 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3055,6 +3055,8 @@ verify_expr (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
     case EXACT_DIV_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case FMIN_EXPR:
+    case FMAX_EXPR:
     case LSHIFT_EXPR:
     case RSHIFT_EXPR:
     case LROTATE_EXPR:
@@ -3880,6 +3882,8 @@ verify_gimple_assign_binary (gassign *stmt)
     case EXACT_DIV_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case FMIN_EXPR:
+    case FMAX_EXPR:
     case BIT_IOR_EXPR:
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index e1ceea4..d41c8ff 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3873,6 +3873,8 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case FLOAT_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case FMIN_EXPR:
+    case FMAX_EXPR:
     case ABS_EXPR:
 
     case LSHIFT_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 7cd1fe7..dc950f7 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -2844,6 +2844,8 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       pp_string (pp, " > ");
       break;
 
+    case FMIN_EXPR:
+    case FMAX_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
@@ -3218,6 +3220,8 @@ op_code_prio (enum tree_code code)
       /* Special expressions.  */
     case MIN_EXPR:
     case MAX_EXPR:
+    case FMIN_EXPR:
+    case FMAX_EXPR:
     case ABS_EXPR:
     case REALPART_EXPR:
     case IMAGPART_EXPR:
@@ -3414,6 +3418,12 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case FMAX_EXPR:
+      return "fmax";
+
+    case FMIN_EXPR:
+      return "fmin";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.c b/gcc/tree.c
index af3a6a3..183a9e6 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -7533,6 +7533,8 @@ associative_tree_code (enum tree_code code)
     case MULT_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case FMIN_EXPR:
+    case FMAX_EXPR:
       return true;
 
     default:
@@ -7553,6 +7555,8 @@ commutative_tree_code (enum tree_code code)
     case MULT_HIGHPART_EXPR:
     case MIN_EXPR:
     case MAX_EXPR:
+    case FMIN_EXPR:
+    case FMAX_EXPR:
     case BIT_IOR_EXPR:
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
diff --git a/gcc/tree.def b/gcc/tree.def
index 56580af..cf19392 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -722,6 +722,14 @@ DEFTREECODE (NEGATE_EXPR, "negate_expr", tcc_unary, 1)
 DEFTREECODE (MIN_EXPR, "min_expr", tcc_binary, 2)
 DEFTREECODE (MAX_EXPR, "max_expr", tcc_binary, 2)
 
+/* Minimum and maximum values, but when used with floating point it conforms to
+   the C99 definition of fmax and fmin, i.e.
+     1. if one operand is NaN the other numeric value is returned,
+     2. if both operands are NaN then a NaN is returned,
+     3. there is no distinction between -0 and 0.  */
+DEFTREECODE (FMIN_EXPR, "fmin_expr", tcc_binary, 2)
+DEFTREECODE (FMAX_EXPR, "fmax_expr", tcc_binary, 2)
+
 /* Represents the absolute value of the operand.
 
    An ABS_EXPR must have either an INTEGER_TYPE or a REAL_TYPE.  The

next prev parent reply	other threads:[~2015-09-14 10:37 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-13 10:13 David Sherwood
2015-08-13 11:12 ` Richard Biener
2015-08-17  9:41   ` David Sherwood
2015-08-17 14:02     ` Richard Biener
2015-08-18 11:10       ` David Sherwood
2015-08-18 13:31         ` Richard Biener
2015-08-18 14:20           ` Richard Sandiford
2015-08-19  9:48             ` Richard Biener
2015-08-19 10:04               ` Richard Sandiford
2015-08-19 10:31                 ` Richard Biener
2015-08-19 12:23                   ` Richard Sandiford
2015-08-19 12:35                     ` Richard Biener
2015-08-19 13:16                       ` Richard Sandiford
2015-08-19 13:41                         ` Richard Biener
2015-09-14 10:47                           ` David Sherwood [this message]
2015-09-14 13:42                             ` Richard Biener
2015-09-14 20:38                               ` Joseph Myers
2015-08-19 15:32                       ` Joseph Myers
2015-11-23  9:21                       ` David Sherwood
2015-11-25 12:39                         ` Richard Biener
2015-08-19 15:07               ` Michael Matz
2015-08-19 15:25                 ` Richard Biener
2015-08-19 15:39                   ` Richard Sandiford
  -- strict thread matches above, loose matches on Subject: below --
2015-08-06  9:39 David Sherwood

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='000001d0eed9$48ed0070$dac70150$@arm.com' \
    --to=david.sherwood@arm.com \
    --cc=Richard.Sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).