From: "David Sherwood" <david.sherwood@arm.com>
To: "'Richard Biener'" <richard.guenther@gmail.com>,
"GCC Patches" <gcc-patches@gcc.gnu.org>,
"Richard Sandiford" <Richard.Sandiford@arm.com>
Subject: RE: [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax*
Date: Mon, 14 Sep 2015 10:47:00 -0000
Message-ID: <000001d0eed9$48ed0070$dac70150$@arm.com>
In-Reply-To: <CAFiYyc0dt911qOdRAXa3LmcB5JMzPL0jc_Su_0po123Dkp22iA@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 3232 bytes --]
Hi All,
For what it's worth, I have uploaded a new patch that renames
STRICT_FMIN/MAX_EXPR to just FMIN/FMAX_EXPR, although I realise that the
naming discussion has not yet been resolved. I have also added scheduling
attributes to the aarch64 instructions.
Regards,
David Sherwood.
ChangeLog:
2015-08-28 David Sherwood <david.sherwood@arm.com>
gcc/
* builtins.c (integer_valued_real_p): Add FMIN_EXPR and FMAX_EXPR.
(fold_builtin_fmin_fmax): For strict math, convert builtins fmin and
fmax to FMIN_EXPR and FMAX_EXPR, respectively.
* expr.c (expand_expr_real_2): Add FMIN_EXPR and FMAX_EXPR.
* fold-const.c (const_binop): Likewise.
(fold_binary_loc, tree_binary_nonnegative_warnv_p): Likewise.
(tree_binary_nonzero_warnv_p): Likewise.
* optabs.h (fminmax_support): Declare.
* optabs.def: Add new optabs fmax_optab/fmin_optab.
* optabs.c (optab_for_tree_code): Return new optabs for FMIN_EXPR and
FMAX_EXPR.
(fminmax_support): New function.
* real.c (real_arithmetic): Add FMIN_EXPR and FMAX_EXPR.
* tree.def: Likewise.
* tree.c (associative_tree_code, commutative_tree_code): Likewise.
* tree-cfg.c (verify_expr): Likewise.
(verify_gimple_assign_binary): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* tree-pretty-print.c (dump_generic_node, op_code_prio): Likewise.
(op_symbol_code): Likewise.
* config/aarch64/aarch64.md: New pattern.
* config/aarch64/aarch64-simd.md: Likewise.
* config/aarch64/iterators.md: New unspecs, iterators.
* config/arm/iterators.md: New iterators.
* config/arm/unspecs.md: New unspecs.
* config/arm/neon.md: New pattern.
* config/arm/vfp.md: Likewise.
* doc/generic.texi: Add FMAX_EXPR and FMIN_EXPR.
* doc/md.texi: Add fmin and fmax patterns.
gcc/testsuite
* gcc.target/aarch64/fmaxmin.c: New test.
* gcc.target/arm/fmaxmin.c: New test.
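
As a reading aid for the ChangeLog above, here is a minimal C sketch of the
quiet-NaN behaviour that FMIN_EXPR and FMAX_EXPR are documented to have in
this patch. The names ref_fmin and ref_fmax are hypothetical and do not
appear in the patch; signalling NaNs and the sign of zero are deliberately
ignored here.

```c
#include <math.h>

/* Hypothetical reference helpers, not part of the patch.  They model the
   documented quiet-NaN rule: if one operand is NaN, return the other.  */
static double
ref_fmin (double a, double b)
{
  if (isnan (a))
    return b;	/* Also covers the both-NaN case: the result is NaN.  */
  if (isnan (b))
    return a;
  return a < b ? a : b;
}

static double
ref_fmax (double a, double b)
{
  if (isnan (a))
    return b;	/* Also covers the both-NaN case: the result is NaN.  */
  if (isnan (b))
    return a;
  return a > b ? a : b;
}
```

The inputs used in the new tests (4 vs 1, NaN vs 7, -3 vs NaN, Inf vs 0)
exercise each branch of these helpers.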
> -----Original Message-----
> From: Richard Biener [mailto:richard.guenther@gmail.com]
> Sent: 19 August 2015 14:41
> To: Richard Biener; David Sherwood; GCC Patches; Richard Sandiford
> Subject: Re: [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax*
>
> On Wed, Aug 19, 2015 at 3:06 PM, Richard Sandiford
> <richard.sandiford@arm.com> wrote:
> > Richard Biener <richard.guenther@gmail.com> writes:
> >> As an additional point for many math functions we have to support errno
> >> which means, like, BUILT_IN_SQRT can be rewritten to SQRT_EXPR
> >> only if -fno-math-errno is in effect. But then code has to handle
> >> both variants for things like constant folding and expression combining.
> >> That's very unfortunate and something we want to avoid (one reason
> >> the POW_EXPR thing didn't fly when I tried). STRICT_FMIN/MAX_EXPR
> >> is an example where this doesn't apply, of course (but I detest the name,
> >> just use FMIN/FMAX_EXPR?). Still you'd need to handle both,
> >> FMIN_EXPR and BUILT_IN_FMIN, in code doing analysis/transform.
> >
> > Yeah, but match.pd makes that easy, right? ;-)
>
> Sure, but that only addresses stmt combining, not other passes. And of course
> it causes {gimple,generic}-match.c to become even bigger ;)
>
> Richard.
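
On the constant-folding side, the real.c hunk in the attached patch only
tests op0 for NaN explicitly. The sketch below models why that can still be
sufficient, assuming the compare helper returns its third argument whenever
either operand is a NaN (which is how the neighbouring MIN_EXPR/MAX_EXPR
cases appear to use it). toy_compare, toy_fold_fmin and toy_fold_fmax are
hypothetical stand-ins working on plain doubles, not GCC functions.

```c
#include <math.h>

/* Toy three-way compare: returns NAN_RESULT if either operand is a NaN,
   otherwise -1, 0 or 1.  Only models the assumed behaviour of the
   compare helper used in real.c.  */
static int
toy_compare (double a, double b, int nan_result)
{
  if (isnan (a) || isnan (b))
    return nan_result;
  return (a > b) - (a < b);
}

/* Same shape as the FMIN_EXPR case in the patch.  */
static double
toy_fold_fmin (double op0, double op1)
{
  if (isnan (op0))
    return op1;
  /* If only op1 is a NaN the compare yields -1, so op0 is chosen.  */
  if (toy_compare (op0, op1, -1) < 0)
    return op0;
  return op1;
}

/* Same shape as the FMAX_EXPR case in the patch.  */
static double
toy_fold_fmax (double op0, double op1)
{
  if (isnan (op0))
    return op1;
  /* If only op1 is a NaN the compare yields 1, so op0 is chosen.  */
  if (toy_compare (op0, op1, 1) < 0)
    return op1;
  return op0;
}
```

Under that assumption, the choice of -1 or 1 as the NaN result is what
makes the non-NaN operand win when op1 is the NaN.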
[-- Attachment #2: fmaxmin.patch --]
[-- Type: application/octet-stream, Size: 21233 bytes --]
diff --git a/gcc/builtins.c b/gcc/builtins.c
index d79372c..284534f 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7393,6 +7393,8 @@ integer_valued_real_p (tree t)
case MULT_EXPR:
case MIN_EXPR:
case MAX_EXPR:
+ case FMIN_EXPR:
+ case FMAX_EXPR:
return integer_valued_real_p (TREE_OPERAND (t, 0))
&& integer_valued_real_p (TREE_OPERAND (t, 1));
@@ -9176,6 +9178,10 @@ fold_builtin_fmin_fmax (location_t loc, tree arg0, tree arg1,
return fold_build2_loc (loc, (max ? MAX_EXPR : MIN_EXPR), type,
fold_convert_loc (loc, type, arg0),
fold_convert_loc (loc, type, arg1));
+ else if (fminmax_support (type, max))
+ return fold_build2_loc (loc, (max ? FMAX_EXPR : FMIN_EXPR), type,
+ fold_convert_loc (loc, type, arg0),
+ fold_convert_loc (loc, type, arg1));
}
return NULL_TREE;
}
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 9777418..0a704b6 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1821,6 +1821,16 @@
[(set_attr "type" "neon_fp_minmax_<Vetype><q>")]
)
+(define_insn "<fmaxmin><mode>3"
+ [(set (match_operand:VDQF 0 "register_operand" "=w")
+ (unspec:VDQF [(match_operand:VDQF 1 "register_operand" "w")
+ (match_operand:VDQF 2 "register_operand" "w")]
+ FMAXMIN_STRICT))]
+ "TARGET_SIMD"
+ "<fmaxmin_op>\\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
+ [(set_attr "type" "neon_fp_minmax_<Vetype><q>")]
+)
+
(define_insn "<maxmin_uns><mode>3"
[(set (match_operand:VDQF 0 "register_operand" "=w")
(unspec:VDQF [(match_operand:VDQF 1 "register_operand" "w")
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index c8511f0..023be58 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4315,6 +4315,16 @@
[(set_attr "type" "f_minmax<s>")]
)
+(define_insn "<fmaxmin><mode>3"
+ [(set (match_operand:GPF 0 "register_operand" "=w")
+ (unspec:GPF [(match_operand:GPF 1 "register_operand" "w")
+ (match_operand:GPF 2 "register_operand" "w")]
+ FMAXMIN_STRICT))]
+ "TARGET_FLOAT"
+ "<fmaxmin_op>\\t%<s>0, %<s>1, %<s>2"
+ [(set_attr "type" "f_minmax<s>")]
+)
+
;; -------------------------------------------------------------------
;; Reload support
;; -------------------------------------------------------------------
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index b8a45d1..5e248bc 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -282,6 +282,8 @@
UNSPEC_PMULL2 ; Used in aarch64-simd.md.
UNSPEC_REV_REGLIST ; Used in aarch64-simd.md.
UNSPEC_VEC_SHR ; Used in aarch64-simd.md.
+ UNSPEC_FMAX_STRICT ; Used in aarch64-simd.md.
+ UNSPEC_FMIN_STRICT ; Used in aarch64-simd.md.
])
;; -------------------------------------------------------------------
@@ -873,6 +875,8 @@
(define_int_iterator FMAXMIN_UNS [UNSPEC_FMAX UNSPEC_FMIN])
+(define_int_iterator FMAXMIN_STRICT [UNSPEC_FMAX_STRICT UNSPEC_FMIN_STRICT])
+
(define_int_iterator VQDMULH [UNSPEC_SQDMULH UNSPEC_SQRDMULH])
(define_int_iterator USSUQADD [UNSPEC_SUQADD UNSPEC_USQADD])
@@ -953,6 +957,12 @@
(UNSPEC_FMINNMV "fminnm")
(UNSPEC_FMINV "fmin")])
+(define_int_attr fmaxmin [(UNSPEC_FMAX_STRICT "fmax")
+ (UNSPEC_FMIN_STRICT "fmin")])
+
+(define_int_attr fmaxmin_op [(UNSPEC_FMAX_STRICT "fmaxnm")
+ (UNSPEC_FMIN_STRICT "fminnm")])
+
(define_int_attr sur [(UNSPEC_SHADD "s") (UNSPEC_UHADD "u")
(UNSPEC_SRHADD "sr") (UNSPEC_URHADD "ur")
(UNSPEC_SHSUB "s") (UNSPEC_UHSUB "u")
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 1e7f3f1..42fc688 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -292,6 +292,8 @@
(define_int_iterator VMAXMINF [UNSPEC_VMAX UNSPEC_VMIN])
+(define_int_iterator VMAXMINF_STRICT [UNSPEC_VMAX_STRICT UNSPEC_VMIN_STRICT])
+
(define_int_iterator VPADDL [UNSPEC_VPADDL_S UNSPEC_VPADDL_U])
(define_int_iterator VPADAL [UNSPEC_VPADAL_S UNSPEC_VPADAL_U])
@@ -716,6 +718,13 @@
(UNSPEC_VPMIN "min") (UNSPEC_VPMIN_U "min")
])
+(define_int_attr fmaxmin [
+ (UNSPEC_VMAX_STRICT "fmax") (UNSPEC_VMIN_STRICT "fmin")])
+
+(define_int_attr fmaxmin_op [
+ (UNSPEC_VMAX_STRICT "vmaxnm") (UNSPEC_VMIN_STRICT "vminnm")
+])
+
(define_int_attr shift_op [
(UNSPEC_VSHL_S "shl") (UNSPEC_VSHL_U "shl")
(UNSPEC_VRSHL_S "rshl") (UNSPEC_VRSHL_U "rshl")
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 873330f..d39f7ff 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -2354,6 +2354,16 @@
[(set_attr "type" "neon_fp_minmax_s<q>")]
)
+(define_insn "<fmaxmin><mode>3"
+ [(set (match_operand:VCVTF 0 "s_register_operand" "=w")
+ (unspec:VCVTF [(match_operand:VCVTF 1 "s_register_operand" "w")
+ (match_operand:VCVTF 2 "s_register_operand" "w")]
+ VMAXMINF_STRICT))]
+ "TARGET_NEON && TARGET_FPU_ARMV8"
+ "<fmaxmin_op>.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+ [(set_attr "type" "neon_fp_minmax_s<q>")]
+)
+
(define_expand "neon_vpadd<mode>"
[(match_operand:VD 0 "s_register_operand" "=w")
(match_operand:VD 1 "s_register_operand" "w")
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 0ec2c48..83094d5 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -224,8 +224,10 @@
UNSPEC_VLD4_LANE
UNSPEC_VMAX
UNSPEC_VMAX_U
+ UNSPEC_VMAX_STRICT
UNSPEC_VMIN
UNSPEC_VMIN_U
+ UNSPEC_VMIN_STRICT
UNSPEC_VMLA
UNSPEC_VMLA_LANE
UNSPEC_VMLAL_S
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index 081aab2..a9a5949 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -1368,6 +1368,17 @@
(set_attr "conds" "unconditional")]
)
+(define_insn "<fmaxmin><mode>3"
+ [(set (match_operand:SDF 0 "s_register_operand" "=<F_constraint>")
+ (unspec:SDF [(match_operand:SDF 1 "s_register_operand" "<F_constraint>")
+ (match_operand:SDF 2 "s_register_operand" "<F_constraint>")]
+ VMAXMINF_STRICT))]
+ "TARGET_HARD_FLOAT && TARGET_VFP5 <vfp_double_cond>"
+ "<fmaxmin_op>.<V_if_elem>\\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+ [(set_attr "type" "f_minmax<vfp_type>")
+ (set_attr "conds" "unconditional")]
+)
+
;; Write Floating-point Status and Control Register.
(define_insn "set_fpscr"
[(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR)]
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index bbafad9..937ca5b 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1268,6 +1268,8 @@ the byte offset of the field, but should not be used directly; call
@tindex TARGET_EXPR
@tindex VA_ARG_EXPR
@tindex ANNOTATE_EXPR
+@tindex FMAX_EXPR
+@tindex FMIN_EXPR
@table @code
@item NEGATE_EXPR
@@ -1687,8 +1689,16 @@ its sole argument yields the representation for @code{ap}.
This node is used to attach markers to an expression. The first operand
is the annotated expression, the second is an @code{INTEGER_CST} with
a value from @code{enum annot_expr_kind}.
-@end table
+@item FMAX_EXPR
+@item FMIN_EXPR
+These nodes represent IEEE-conformant maximum and minimum operations. If either
+operand is a quiet @code{NaN}, the other operand is returned. If both operands
+are quiet @code{NaN}, a quiet @code{NaN} is returned. If either operand is a
+signaling @code{NaN} and GCC honors them (@option{-fsignaling-nans}), an invalid
+floating-point exception is raised and a quiet @code{NaN} is returned.
+
+@end table
@node Vectors
@subsection Vectors
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 0bffdc6..31b1d24 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -4879,6 +4879,15 @@ Signed minimum and maximum operations. When used with floating point,
if both operands are zeros, or if either operand is @code{NaN}, then
it is unspecified which of the two operands is returned as the result.
+@cindex @code{fmin@var{m}3} instruction pattern
+@cindex @code{fmax@var{m}3} instruction pattern
+@item @samp{fmin@var{m}3}, @samp{fmax@var{m}3}
+IEEE-conformant minimum and maximum operations. If one operand is a quiet
+@code{NaN}, then the other operand is returned. If both operands are quiet
+@code{NaN}, then a quiet @code{NaN} is returned. If either operand is a
+signaling @code{NaN} and GCC honors them (@option{-fsignaling-nans}), an invalid
+floating-point exception is raised and a quiet @code{NaN} is returned.
+
@cindex @code{reduc_smin_@var{m}} instruction pattern
@cindex @code{reduc_smax_@var{m}} instruction pattern
@item @samp{reduc_smin_@var{m}}, @samp{reduc_smax_@var{m}}
diff --git a/gcc/expr.c b/gcc/expr.c
index 1e820b4..f69ba80 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8689,6 +8689,8 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
return expand_abs (mode, op0, target, unsignedp,
safe_from_p (target, treeop0, 1));
+ case FMAX_EXPR:
+ case FMIN_EXPR:
case MAX_EXPR:
case MIN_EXPR:
target = original_target;
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index c826e67..7846e42 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -1155,6 +1155,8 @@ const_binop (enum tree_code code, tree arg1, tree arg2)
case RDIV_EXPR:
case MIN_EXPR:
case MAX_EXPR:
+ case FMIN_EXPR:
+ case FMAX_EXPR:
break;
default:
@@ -9069,7 +9071,8 @@ fold_binary_loc (location_t loc,
cases, the appropriate type conversions should be put back in
the tree that will get out of the constant folder. */
- if (kind == tcc_comparison || code == MIN_EXPR || code == MAX_EXPR)
+ if (kind == tcc_comparison || code == MIN_EXPR || code == MAX_EXPR
+ || code == FMIN_EXPR || code == FMAX_EXPR)
{
STRIP_SIGN_NOPS (arg0);
STRIP_SIGN_NOPS (arg1);
@@ -13017,6 +13020,7 @@ tree_binary_nonnegative_warnv_p (enum tree_code code, tree type, tree op0,
case BIT_AND_EXPR:
case MAX_EXPR:
+ case FMAX_EXPR:
return (tree_expr_nonnegative_warnv_p (op0,
strict_overflow_p)
|| tree_expr_nonnegative_warnv_p (op1,
@@ -13025,6 +13029,7 @@ tree_binary_nonnegative_warnv_p (enum tree_code code, tree type, tree op0,
case BIT_IOR_EXPR:
case BIT_XOR_EXPR:
case MIN_EXPR:
+ case FMIN_EXPR:
case RDIV_EXPR:
case TRUNC_DIV_EXPR:
case CEIL_DIV_EXPR:
@@ -13479,6 +13484,7 @@ tree_binary_nonzero_warnv_p (enum tree_code code,
break;
case MIN_EXPR:
+ case FMIN_EXPR:
sub_strict_overflow_p = false;
if (tree_expr_nonzero_warnv_p (op0,
&sub_strict_overflow_p)
@@ -13491,6 +13497,7 @@ tree_binary_nonzero_warnv_p (enum tree_code code,
break;
case MAX_EXPR:
+ case FMAX_EXPR:
sub_strict_overflow_p = false;
if (tree_expr_nonzero_warnv_p (op0,
&sub_strict_overflow_p))
diff --git a/gcc/optabs.c b/gcc/optabs.c
index e533e6e..b733a2e 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -476,6 +476,12 @@ optab_for_tree_code (enum tree_code code, const_tree type,
case MIN_EXPR:
return TYPE_UNSIGNED (type) ? umin_optab : smin_optab;
+ case FMAX_EXPR:
+ return fmax_optab;
+
+ case FMIN_EXPR:
+ return fmin_optab;
+
case REALIGN_LOAD_EXPR:
return vec_realign_load_optab;
@@ -6791,6 +6797,16 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx sel, rtx target)
return tmp;
}
+/* Return true if the target supports strict FP math max (MAX = TRUE) and min
+ (MAX = FALSE) operations on type TYPE. */
+bool
+fminmax_support (tree type, bool max)
+{
+ optab optab = optab_for_tree_code
+ (max ? FMAX_EXPR : FMIN_EXPR, type, optab_default);
+ return optab_handler (optab, TYPE_MODE (type)) != CODE_FOR_nothing;
+}
+
/* Return insn code for a conditional operator with a comparison in
mode CMODE, unsigned if UNS is true, resulting in a value of mode VMODE. */
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 888b21c..36c72d8 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -244,6 +244,10 @@ OPTAB_D (sin_optab, "sin$a2")
OPTAB_D (sincos_optab, "sincos$a3")
OPTAB_D (tan_optab, "tan$a2")
+/* C99 implementations of fmax/fmin. */
+OPTAB_D (fmax_optab, "fmax$a3")
+OPTAB_D (fmin_optab, "fmin$a3")
+
/* Vector reduction to a scalar. */
OPTAB_D (reduc_smax_scal_optab, "reduc_smax_scal_$a")
OPTAB_D (reduc_smin_scal_optab, "reduc_smin_scal_$a")
diff --git a/gcc/optabs.h b/gcc/optabs.h
index 95f5cbc..0bbf736 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -565,4 +565,6 @@ extern bool lshift_cheap_p (bool);
extern enum rtx_code get_rtx_code (enum tree_code tcode, bool unsignedp);
+extern bool fminmax_support (tree, bool);
+
#endif /* GCC_OPTABS_H */
diff --git a/gcc/real.c b/gcc/real.c
index c1ff78d..3a2d7b6 100644
--- a/gcc/real.c
+++ b/gcc/real.c
@@ -1033,6 +1033,15 @@ real_arithmetic (REAL_VALUE_TYPE *r, int icode, const REAL_VALUE_TYPE *op0,
*r = *op1;
break;
+ case FMIN_EXPR:
+ if (op0->cl == rvc_nan)
+ *r = *op1;
+ else if (do_compare (op0, op1, -1) < 0)
+ *r = *op0;
+ else
+ *r = *op1;
+ break;
+
case MAX_EXPR:
if (op1->cl == rvc_nan)
*r = *op1;
@@ -1042,6 +1051,15 @@ real_arithmetic (REAL_VALUE_TYPE *r, int icode, const REAL_VALUE_TYPE *op0,
*r = *op0;
break;
+ case FMAX_EXPR:
+ if (op0->cl == rvc_nan)
+ *r = *op1;
+ else if (do_compare (op0, op1, 1) < 0)
+ *r = *op1;
+ else
+ *r = *op0;
+ break;
+
case NEGATE_EXPR:
*r = *op0;
r->sign ^= 1;
diff --git a/gcc/testsuite/gcc.target/aarch64/fmaxmin.c b/gcc/testsuite/gcc.target/aarch64/fmaxmin.c
new file mode 100644
index 0000000..7654955
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fmaxmin.c
@@ -0,0 +1,69 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-vectorize -fno-inline -save-temps" } */
+
+
+extern void abort (void);
+double fmax (double, double);
+float fmaxf (float, float);
+double fmin (double, double);
+float fminf (float, float);
+
+#define isnan __builtin_isnan
+#define isinf __builtin_isinf
+
+#define NAN __builtin_nan ("")
+#define INFINITY __builtin_inf ()
+
+#define NUM_ELEMS(TYPE) (16 / sizeof (TYPE))
+
+#define DEF_MAXMIN(TYPE,FUN)\
+void test_##FUN (TYPE *__restrict__ r, TYPE *__restrict__ a,\
+ TYPE *__restrict__ b)\
+{\
+ int i;\
+ for (i = 0; i < NUM_ELEMS (TYPE); i++)\
+ r[i] = FUN (a[i], b[i]);\
+}\
+
+DEF_MAXMIN (float, fmaxf)
+DEF_MAXMIN (double, fmax)
+
+DEF_MAXMIN (float, fminf)
+DEF_MAXMIN (double, fmin)
+
+int main ()
+{
+ float a_f[4] = { 4, NAN, -3, INFINITY };
+ float b_f[4] = { 1, 7,NAN, 0 };
+ float r_f[4];
+ double a_d[4] = { 4, NAN, -3, INFINITY };
+ double b_d[4] = { 1, 7, NAN, 0 };
+ double r_d[4];
+
+ test_fmaxf (r_f, a_f, b_f);
+ if (r_f[0] != 4 || isnan (r_f[1]) || isnan (r_f[2]) || !isinf (r_f[3]))
+ abort ();
+
+ test_fminf (r_f, a_f, b_f);
+ if (r_f[0] != 1 || isnan (r_f[1]) || isnan (r_f[2]) || isinf (r_f[3]))
+ abort ();
+
+ test_fmax (r_d, a_d, b_d);
+ test_fmax (&r_d[2], &a_d[2], &b_d[2]);
+ if (r_d[0] != 4 || isnan (r_d[1]) || isnan (r_d[2]) || !isinf (r_d[3]))
+ abort ();
+
+ test_fmin (r_d, a_d, b_d);
+ test_fmin (&r_d[2], &a_d[2], &b_d[2]);
+ if (r_d[0] != 1 || isnan (r_d[1]) || isnan (r_d[2]) || isinf (r_d[3]))
+ abort ();
+
+ return 0;
+}
+
+/* { dg-final { scan-assembler-times "fmaxnm\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.4s" 1 } } */
+/* { dg-final { scan-assembler-times "fmaxnm\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" 1 } } */
+
+/* { dg-final { scan-assembler-times "fminnm\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.4s" 1 } } */
+/* { dg-final { scan-assembler-times "fminnm\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.2d" 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/arm/fmaxmin.c b/gcc/testsuite/gcc.target/arm/fmaxmin.c
new file mode 100644
index 0000000..f55ac5f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/fmaxmin.c
@@ -0,0 +1,67 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_neon_ok } */
+/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a -save-temps" } */
+/* { dg-add-options arm_v8_neon } */
+
+extern void abort (void);
+double fmax (double, double);
+float fmaxf (float, float);
+double fmin (double, double);
+float fminf (float, float);
+
+#define isnan __builtin_isnan
+#define isinf __builtin_isinf
+
+#define NAN __builtin_nan ("")
+#define INFINITY __builtin_inf ()
+
+#define DEF_MAXMIN(TYPE,FUN)\
+void test_##FUN (TYPE *__restrict__ r, TYPE *__restrict__ a,\
+ TYPE *__restrict__ b)\
+{\
+ int i;\
+ for (i = 0; i < 4; i++)\
+ r[i] = FUN (a[i], b[i]);\
+}\
+
+DEF_MAXMIN (float, fmaxf)
+DEF_MAXMIN (double, fmax)
+
+DEF_MAXMIN (float, fminf)
+DEF_MAXMIN (double, fmin)
+
+int main ()
+{
+ float a_f[4] = { 4, NAN, -3, INFINITY };
+ float b_f[4] = { 1, 7,NAN, 0 };
+ float r_f[4];
+ double a_d[4] = { 4, NAN, -3, INFINITY };
+ double b_d[4] = { 1, 7, NAN, 0 };
+ double r_d[4];
+
+ test_fmaxf (r_f, a_f, b_f);
+ if (r_f[0] != 4 || isnan (r_f[1]) || isnan (r_f[2]) || !isinf (r_f[3]))
+ abort ();
+
+ test_fminf (r_f, a_f, b_f);
+ if (r_f[0] != 1 || isnan (r_f[1]) || isnan (r_f[2]) || isinf (r_f[3]))
+ abort ();
+
+ test_fmax (r_d, a_d, b_d);
+ if (r_d[0] != 4 || isnan (r_d[1]) || isnan (r_d[2]) || !isinf (r_d[3]))
+ abort ();
+
+ test_fmin (r_d, a_d, b_d);
+ if (r_d[0] != 1 || isnan (r_d[1]) || isnan (r_d[2]) || isinf (r_d[3]))
+ abort ();
+
+ return 0;
+}
+
+/* { dg-final { scan-assembler-times "vmaxnm.f32\tq\[0-9\]+, q\[0-9\]+, q\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "vminnm.f32\tq\[0-9\]+, q\[0-9\]+, q\[0-9\]+" 1 } } */
+
+/* NOTE: There are no double precision vector versions of vmaxnm/vminnm. */
+/* { dg-final { scan-assembler-times "vmaxnm.f64\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "vminnm.f64\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" 1 } } */
+
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 5ac73b3..36ab7a9 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3055,6 +3055,8 @@ verify_expr (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
case EXACT_DIV_EXPR:
case MIN_EXPR:
case MAX_EXPR:
+ case FMIN_EXPR:
+ case FMAX_EXPR:
case LSHIFT_EXPR:
case RSHIFT_EXPR:
case LROTATE_EXPR:
@@ -3880,6 +3882,8 @@ verify_gimple_assign_binary (gassign *stmt)
case EXACT_DIV_EXPR:
case MIN_EXPR:
case MAX_EXPR:
+ case FMIN_EXPR:
+ case FMAX_EXPR:
case BIT_IOR_EXPR:
case BIT_XOR_EXPR:
case BIT_AND_EXPR:
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index e1ceea4..d41c8ff 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3873,6 +3873,8 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
case FLOAT_EXPR:
case MIN_EXPR:
case MAX_EXPR:
+ case FMIN_EXPR:
+ case FMAX_EXPR:
case ABS_EXPR:
case LSHIFT_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 7cd1fe7..dc950f7 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -2844,6 +2844,8 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
pp_string (pp, " > ");
break;
+ case FMIN_EXPR:
+ case FMAX_EXPR:
case VEC_WIDEN_MULT_HI_EXPR:
case VEC_WIDEN_MULT_LO_EXPR:
case VEC_WIDEN_MULT_EVEN_EXPR:
@@ -3218,6 +3220,8 @@ op_code_prio (enum tree_code code)
/* Special expressions. */
case MIN_EXPR:
case MAX_EXPR:
+ case FMIN_EXPR:
+ case FMAX_EXPR:
case ABS_EXPR:
case REALPART_EXPR:
case IMAGPART_EXPR:
@@ -3414,6 +3418,12 @@ op_symbol_code (enum tree_code code)
case MIN_EXPR:
return "min";
+ case FMAX_EXPR:
+ return "fmax";
+
+ case FMIN_EXPR:
+ return "fmin";
+
default:
return "<<< ??? >>>";
}
diff --git a/gcc/tree.c b/gcc/tree.c
index af3a6a3..183a9e6 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -7533,6 +7533,8 @@ associative_tree_code (enum tree_code code)
case MULT_EXPR:
case MIN_EXPR:
case MAX_EXPR:
+ case FMIN_EXPR:
+ case FMAX_EXPR:
return true;
default:
@@ -7553,6 +7555,8 @@ commutative_tree_code (enum tree_code code)
case MULT_HIGHPART_EXPR:
case MIN_EXPR:
case MAX_EXPR:
+ case FMIN_EXPR:
+ case FMAX_EXPR:
case BIT_IOR_EXPR:
case BIT_XOR_EXPR:
case BIT_AND_EXPR:
diff --git a/gcc/tree.def b/gcc/tree.def
index 56580af..cf19392 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -722,6 +722,14 @@ DEFTREECODE (NEGATE_EXPR, "negate_expr", tcc_unary, 1)
DEFTREECODE (MIN_EXPR, "min_expr", tcc_binary, 2)
DEFTREECODE (MAX_EXPR, "max_expr", tcc_binary, 2)
+/* Minimum and maximum values, but when used with floating point it conforms to
+ the C99 definition of fmax and fmin, i.e.
+ 1. if one operand is NaN the other numeric value is returned,
+ 2. if both operands are NaN then a NaN is returned,
+ 3. there is no distinction between -0 and 0. */
+DEFTREECODE (FMIN_EXPR, "fmin_expr", tcc_binary, 2)
+DEFTREECODE (FMAX_EXPR, "fmax_expr", tcc_binary, 2)
+
/* Represents the absolute value of the operand.
An ABS_EXPR must have either an INTEGER_TYPE or a REAL_TYPE. The
Thread overview: 24+ messages
2015-08-13 10:13 David Sherwood
2015-08-13 11:12 ` Richard Biener
2015-08-17 9:41 ` David Sherwood
2015-08-17 14:02 ` Richard Biener
2015-08-18 11:10 ` David Sherwood
2015-08-18 13:31 ` Richard Biener
2015-08-18 14:20 ` Richard Sandiford
2015-08-19 9:48 ` Richard Biener
2015-08-19 10:04 ` Richard Sandiford
2015-08-19 10:31 ` Richard Biener
2015-08-19 12:23 ` Richard Sandiford
2015-08-19 12:35 ` Richard Biener
2015-08-19 13:16 ` Richard Sandiford
2015-08-19 13:41 ` Richard Biener
2015-09-14 10:47 ` David Sherwood [this message]
2015-09-14 13:42 ` Richard Biener
2015-09-14 20:38 ` Joseph Myers
2015-08-19 15:32 ` Joseph Myers
2015-11-23 9:21 ` David Sherwood
2015-11-25 12:39 ` Richard Biener
2015-08-19 15:07 ` Michael Matz
2015-08-19 15:25 ` Richard Biener
2015-08-19 15:39 ` Richard Sandiford
-- strict thread matches above, loose matches on Subject: below --
2015-08-06 9:39 David Sherwood