[RFC] PR70117, ppc long double isinf

public inbox for gcc-patches@gcc.gnu.org

* [RFC] PR70117, ppc long double isinf
@ 2016-04-05  8:33 Alan Modra
  2016-04-05  9:29 ` Richard Biener
  0 siblings, 1 reply; 13+ messages in thread
From: Alan Modra @ 2016-04-05  8:33 UTC
  To: gcc-patches

This patch fixes the incompatibility between GNUlib's 107 bit
precision LDBL_MAX for IBM extended precision and gcc's 106 bit
LDBL_MAX used to test for Inf, by just testing the high double for inf
and nan.  This agrees with the ABI which has stated for many years
that IBM extended precision "does not fully support the IEEE special
numbers NaN and INF.  These values are encoded in the high-order
double value only.  The low-order value is not significant".
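As a rough illustration of what "testing just the high double" means, here is a minimal sketch in plain C.  It is not the GCC folding code: the `struct dd` pair and the `dd_isinf`/`dd_isnan` names are hypothetical, chosen only to model an IBM extended value as two doubles and classify it from the high part alone.

```c
#include <float.h>
#include <math.h>

/* Hypothetical model of an IBM extended precision value: the sum of
   two doubles, with the larger-magnitude part in hi.  */
struct dd { double hi; double lo; };

/* Inf test that looks only at the high double, in the spirit of
   isgreater (fabs (x), DBL_MAX) applied to the high part.  */
static int dd_isinf (struct dd x)
{
  return isgreater (fabs (x.hi), DBL_MAX);
}

/* NaN test that looks only at the high double: a NaN compares
   unordered with itself.  */
static int dd_isnan (struct dd x)
{
  return isunordered (x.hi, x.hi);
}
```

With this model, a pair such as { DBL_MAX, DBL_MAX / 2**54 } is finite however large the low double is, because the low double never takes part in the comparison.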

I've also changed the test for nan, and both the inf test and the
subnormal test in isnormal, to just use the high double.  Changing the
subnormal test *does* allow a small range of values to be seen as
normal that previously would be rejected in a test of the whole long
double against 2**-969.  Which is why I'm making this an RFC rather
than a patch submission.

What is "subnormal" for an IBM extended precision number, anyway?  I
think the only definition that makes sense is in terms of precision.
We can't say a long double is subnormal if the low double is
subnormal, because numbers like (1.0 + 0x1p-1074) are representable
with the high double properly rounded and are clearly not close to
zero or losing precision.  So "subnormal" for IBM extended precision
is a number that has fewer than 106 bits of precision, which means a
magnitude of less than 2**-969.  You can see that
  (0x1p-969 + 0x1p-1074)  = 0x1.000000000000000000000000008p-969
still has 106 bits of precision.  (0x1p-1074 is the smallest double
distinct from zero, and of course is subnormal.)  However,
  (0x1p-969 + -0x1p-1074) = 0x1.ffffffffffffffffffffffffffp-970
has only 105 bits of precision, if I'm counting correctly.
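The bit counting above can be checked with a small sketch.  The helper name `dd_precision_bits` is hypothetical, and it assumes a finite, non-zero high double with a low double much smaller in magnitude.  It counts bit positions from the leading bit of hi + lo down to 2**-1074, the lowest bit a double can hold, and caps the result at 106.

```c
#include <math.h>

/* Exponent of the least significant bit an IEEE double can hold
   (the subnormal 0x1p-1074).  */
#define DBL_LSB_EXP (-1074)

/* Hypothetical helper: bits of precision available to hi + lo,
   counted from the leading bit of the sum down to 2**-1074 and
   capped at 106.  Assumes hi is finite and non-zero, and that lo is
   far smaller in magnitude than hi.  */
static int dd_precision_bits (double hi, double lo)
{
  int e;
  double m = frexp (fabs (hi), &e);   /* |hi| = m * 2**e, 0.5 <= m < 1 */
  int lead = e - 1;                   /* exponent of hi's leading bit */

  /* If |hi| is an exact power of two and lo pulls the sum towards
     zero, the leading bit of the sum sits one position lower.  */
  if (m == 0.5 && lo != 0.0 && (hi < 0.0) != (lo < 0.0))
    lead -= 1;

  int bits = lead - DBL_LSB_EXP + 1;
  return bits > 106 ? 106 : bits;
}
```

For the two examples above, dd_precision_bits (0x1p-969, 0x1p-1074) gives 106 and dd_precision_bits (0x1p-969, -0x1p-1074) gives 105.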

So testing just the high double in isnormal() returns true for a range
of 105 bit precision values, from (0x1p-969 - 0x1p-1023) to 
(0x1p-969 - 0x1p-1074).  The question is whether I should make the
isnormal() code quite nasty in order to give the right answer.
Probably yes, in which case this post becomes an explanation for why
the lower bound test in isnormal() needs to be a long double test.
Or probably better in terms of emitted code, can I get at both of the
component doubles of an IBM long double at the tree level?
VIEW_CONVERT_EXPR to a complex double perhaps?

	PR target/70117
	* builtins.c (fold_builtin_classify): For IBM extended precision,
	look at just the high-order double to test for NaN.
	(fold_builtin_interclass_mathfn): Similarly for Inf, and range
	test for IBM extended precision isnormal.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 9368ed0..ed27d57 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7529,6 +7529,9 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)

   mode = TYPE_MODE (TREE_TYPE (arg));

+  const struct real_format *fmt = FLOAT_MODE_FORMAT (mode);
+  bool is_ibm_extended = fmt->pnan < fmt->p;
+
   /* If there is no optab, try generic code.  */
   switch (DECL_FUNCTION_CODE (fndecl))
     {
@@ -7538,10 +7541,18 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
       {
 	/* isinf(x) -> isgreater(fabs(x),DBL_MAX).  */
 	tree const isgr_fn = builtin_decl_explicit (BUILT_IN_ISGREATER);
-	tree const type = TREE_TYPE (arg);
+	tree type = TREE_TYPE (arg);
 	REAL_VALUE_TYPE r;
 	char buf[128];

+	if (is_ibm_extended)
+	  {
+	    /* NaN and INF are encoded in the high-order double value
+	       only.  The low-order value is not significant.  */
+	    type = double_type_node;
+	    mode = DFmode;
+	    arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
+	  }
 	get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
 	real_from_string (&r, buf);
 	result = build_call_expr (isgr_fn, 2,
@@ -7554,10 +7565,18 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
       {
 	/* isfinite(x) -> islessequal(fabs(x),DBL_MAX).  */
 	tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
-	tree const type = TREE_TYPE (arg);
+	tree type = TREE_TYPE (arg);
 	REAL_VALUE_TYPE r;
 	char buf[128];

+	if (is_ibm_extended)
+	  {
+	    /* NaN and INF are encoded in the high-order double value
+	       only.  The low-order value is not significant.  */
+	    type = double_type_node;
+	    mode = DFmode;
+	    arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
+	  }
 	get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
 	real_from_string (&r, buf);
 	result = build_call_expr (isle_fn, 2,
@@ -7578,15 +7597,28 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
 	   islessequal(fabs(x),DBL_MAX).  */
 	tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
 	tree const isge_fn = builtin_decl_explicit (BUILT_IN_ISGREATEREQUAL);
-	tree const type = TREE_TYPE (arg);
+	tree type = TREE_TYPE (arg);
+	machine_mode orig_mode = mode;
 	REAL_VALUE_TYPE rmax, rmin;
 	char buf[128];

+	if (is_ibm_extended)
+	  {
+	    /* Use double to test the normal range of IBM extended
+	       precision.  Emin for IBM extended precision is
+	       different to emin for IEEE double, being 53 higher
+	       since the low double exponent is at least 53 lower
+	       than the high double exponent.  */
+	    type = double_type_node;
+	    mode = DFmode;
+	    arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
+	  }
+	arg = builtin_save_expr (fold_build1_loc (loc, ABS_EXPR, type, arg));
+
 	get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
 	real_from_string (&rmax, buf);
-	sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (mode)->emin - 1);
+	sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (orig_mode)->emin - 1);
 	real_from_string (&rmin, buf);
-	arg = builtin_save_expr (fold_build1_loc (loc, ABS_EXPR, type, arg));
 	result = build_call_expr (isle_fn, 2, arg,
 				  build_real (type, rmax));
 	result = fold_build2 (BIT_AND_EXPR, integer_type_node, result,
@@ -7664,6 +7696,17 @@ fold_builtin_classify (location_t loc, tree fndecl, tree arg, int builtin_index)
       if (!HONOR_NANS (arg))
 	return omit_one_operand_loc (loc, type, integer_zero_node, arg);

+      {
+	const struct real_format *fmt
+	  = FLOAT_MODE_FORMAT (TYPE_MODE (TREE_TYPE (arg)));
+	bool is_ibm_extended = fmt->pnan < fmt->p;
+	if (is_ibm_extended)
+	  {
+	    /* NaN and INF are encoded in the high-order double value
+	       only.  The low-order value is not significant.  */
+	    arg = fold_build1_loc (loc, NOP_EXPR, double_type_node, arg);
+	  }
+      }
       arg = builtin_save_expr (arg);
       return fold_build2_loc (loc, UNORDERED_EXPR, type, arg, arg);

-- 
Alan Modra
Australia Development Lab, IBM

* Re: [RFC] PR70117, ppc long double isinf
  2016-04-05  8:33 [RFC] PR70117, ppc long double isinf Alan Modra
@ 2016-04-05  9:29 ` Richard Biener
  2016-04-06  8:32   ` [PATCH] " Alan Modra
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Biener @ 2016-04-05  9:29 UTC
  To: Alan Modra; +Cc: GCC Patches

On Tue, Apr 5, 2016 at 10:33 AM, Alan Modra <amodra@gmail.com> wrote:
> This patch fixes the incompatibility between GNUlib's 107 bit
> precision LDBL_MAX for IBM extended precision and gcc's 106 bit
> LDBL_MAX used to test for Inf, by just testing the high double for inf
> and nan.  This agrees with the ABI which has stated for many years
> that IBM extended precision "does not fully support the IEEE special
> numbers NaN and INF.  These values are encoded in the high-order
> double value only.  The low-order value is not significant".
>
> I've also changed the test for nan, and both the inf test and the
> subnormal test in isnormal, to just use the high double.  Changing the
> subnormal test *does* allow a small range of values to be seen as
> normal that previously would be rejected in a test of the whole long
> double against 2**-969.  Which is why I'm making this an RFC rather
> than a patch submission.
>
> What is "subnormal" for an IBM extended precision number, anyway?  I
> think the only definition that makes sense is in terms of precision.
> We can't say a long double is subnormal if the low double is
> subnormal, because numbers like (1.0 + 0x1p-1074) are representable
> with the high double properly rounded and are clearly not close to
> zero or losing precision.  So "subnormal" for IBM extended precision
> is a number that has less than 106 bits of precision.  That would be
> at a magnitude of less than 2**-969.  You can see that
>   (0x1p-969 + 0x1p-1074)  = 0x1.000000000000000000000000008p-969
> still has 106 bits of precision.  (0x1p-1074 is the smallest double
> distinct from zero, and of course is subnormal.)  However,
>   (0x1p-969 + -0x1p-1074) = 0x1.ffffffffffffffffffffffffffp-970
> has only 105 bits of precision, if I'm counting correctly.
>
> So testing just the high double in isnormal() returns true for a range
> of 105 bit precision values, from (0x1p-969 - 0x1p-1023) to
> (0x1p-969 - 0x1p-1074).  The question is whether I should make the
> isnormal() code quite nasty in order to give the right answer.
> Probably yes, in which case this post becomes an explanation for why
> the lower bound test in isnormal() needs to be a long double test.
> Or probably better in terms of emitted code, can I get at both of the
> component doubles of an IBM long double at the tree level?
> VIEW_CONVERT_EXPR to a complex double perhaps?

Yes, I think that would work; the other variant would be a
BIT_FIELD_REF (but watch out for endianness).
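To make the complex double idea concrete, here is a sketch in C rather than at the tree level.  It leans on the C guarantee that a complex type has the same representation as an array of two of the corresponding real type, real part first.  The `struct dd` stand-in and the helper names are hypothetical, and the sketch assumes the high double comes first in memory; it is only an analogy for what a VIEW_CONVERT_EXPR would express, not GCC code.

```c
#include <complex.h>
#include <string.h>

/* Hypothetical stand-in for the 16-byte IBM long double: two doubles,
   assumed to be stored with the high part first in memory.  */
struct dd { double hi; double lo; };

/* Reinterpret the 16 bytes as a complex double.  No value conversion
   takes place, only a change of view.  */
static double complex dd_as_complex (struct dd x)
{
  double complex z;
  memcpy (&z, &x, sizeof z);
  return z;
}

/* The real part is the first double in memory, the imaginary part the
   second, whatever the byte order inside each double.  */
static double dd_high (struct dd x) { return creal (dd_as_complex (x)); }
static double dd_low (struct dd x) { return cimag (dd_as_complex (x)); }
```

Splitting by element like this avoids having to compute a bit offset for each half, which is where the endianness concern with a BIT_FIELD_REF would come in.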

In general the patch looks like a good approach to me but can we
hide that

> +  const struct real_format *fmt = FLOAT_MODE_FORMAT (mode);
> +  bool is_ibm_extended = fmt->pnan < fmt->p;

in a function somewhere in real.[ch]?

Thanks,
Richard.


* [PATCH] PR70117, ppc long double isinf
  2016-04-05  9:29 ` Richard Biener
@ 2016-04-06  8:32   ` Alan Modra
  2016-04-06  8:46     ` Richard Biener
  2016-04-06 10:27     ` Andreas Schwab
  0 siblings, 2 replies; 13+ messages in thread
From: Alan Modra @ 2016-04-06  8:32 UTC
  To: Richard Biener; +Cc: GCC Patches

On Tue, Apr 05, 2016 at 11:29:30AM +0200, Richard Biener wrote:
> In general the patch looks like a good approach to me but can we
> hide that
> 
> > +  const struct real_format *fmt = FLOAT_MODE_FORMAT (mode);
> > +  bool is_ibm_extended = fmt->pnan < fmt->p;
> 
> in a function somewhere in real.[ch]?

On looking in real.h, I see there is already a macro to do it.

Here's the revised version that properly tests the long double
subnormal limit.  Bootstrapped and regression tested
powerpc64le-linux.
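For readers who find the tree-building code hard to follow, the isnormal logic can be sketched in plain C.  This is only a model under stated assumptions: `struct dd`, `DD_MIN_NORMAL` and `dd_isnormal` are hypothetical names, the value is treated as a pair of doubles, and 0x1p-969 is the limit discussed earlier in the thread.

```c
#include <float.h>
#include <math.h>

/* Hypothetical model of an IBM extended precision value.  */
struct dd { double hi; double lo; };

/* Smallest magnitude that still has a full 106 bits of precision.  */
#define DD_MIN_NORMAL 0x1p-969

static int dd_isnormal (struct dd x)
{
  double ahi = fabs (x.hi);

  /* Upper bound: high double only, the same test as isfinite.
     This also rejects NaN, which compares unordered.  */
  if (!islessequal (ahi, DBL_MAX))
    return 0;

  /* High double clearly above the limit.  */
  if (isgreater (ahi, DD_MIN_NORMAL))
    return 1;

  /* High double exactly at the limit: normal unless a non-zero low
     double of opposite sign pulls the magnitude below 0x1p-969.  */
  if (ahi == DD_MIN_NORMAL)
    {
      int pulls_down = x.hi < 0.0 ? x.lo > 0.0 : x.lo < 0.0;
      return !pulls_down;
    }

  /* Below the limit (including zero).  */
  return 0;
}
```

In this model { 0x1p-969, 0x1p-1074 } is normal while { 0x1p-969, -0x1p-1074 } is not, which is the distinction a test of the high double alone cannot make.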

gcc/
	PR target/70117
	* builtins.c (fold_builtin_classify): For IBM extended precision,
	look at just the high-order double to test for NaN.
	(fold_builtin_interclass_mathfn): Similarly for Inf.  For isnormal
	test just the high double for Inf but both doubles for subnormal
	limit.
gcc/testsuite/
	* gcc.target/powerpc/pr70117.c: New.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 9368ed0..9162838 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7529,6 +7529,8 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
 
   mode = TYPE_MODE (TREE_TYPE (arg));
 
+  bool is_ibm_extended = MODE_COMPOSITE_P (mode);
+
   /* If there is no optab, try generic code.  */
   switch (DECL_FUNCTION_CODE (fndecl))
     {
@@ -7538,10 +7540,18 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
       {
 	/* isinf(x) -> isgreater(fabs(x),DBL_MAX).  */
 	tree const isgr_fn = builtin_decl_explicit (BUILT_IN_ISGREATER);
-	tree const type = TREE_TYPE (arg);
+	tree type = TREE_TYPE (arg);
 	REAL_VALUE_TYPE r;
 	char buf[128];
 
+	if (is_ibm_extended)
+	  {
+	    /* NaN and Inf are encoded in the high-order double value
+	       only.  The low-order value is not significant.  */
+	    type = double_type_node;
+	    mode = DFmode;
+	    arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
+	  }
 	get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
 	real_from_string (&r, buf);
 	result = build_call_expr (isgr_fn, 2,
@@ -7554,10 +7564,18 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
       {
 	/* isfinite(x) -> islessequal(fabs(x),DBL_MAX).  */
 	tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
-	tree const type = TREE_TYPE (arg);
+	tree type = TREE_TYPE (arg);
 	REAL_VALUE_TYPE r;
 	char buf[128];
 
+	if (is_ibm_extended)
+	  {
+	    /* NaN and Inf are encoded in the high-order double value
+	       only.  The low-order value is not significant.  */
+	    type = double_type_node;
+	    mode = DFmode;
+	    arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
+	  }
 	get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
 	real_from_string (&r, buf);
 	result = build_call_expr (isle_fn, 2,
@@ -7577,21 +7595,72 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
 	/* isnormal(x) -> isgreaterequal(fabs(x),DBL_MIN) &
 	   islessequal(fabs(x),DBL_MAX).  */
 	tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
-	tree const isge_fn = builtin_decl_explicit (BUILT_IN_ISGREATEREQUAL);
-	tree const type = TREE_TYPE (arg);
+	tree type = TREE_TYPE (arg);
+	tree orig_arg, max_exp, min_exp;
+	machine_mode orig_mode = mode;
 	REAL_VALUE_TYPE rmax, rmin;
 	char buf[128];
 
+	orig_arg = arg = builtin_save_expr (arg);
+	if (is_ibm_extended)
+	  {
+	    /* Use double to test the normal range of IBM extended
+	       precision.  Emin for IBM extended precision is
+	       different to emin for IEEE double, being 53 higher
+	       since the low double exponent is at least 53 lower
+	       than the high double exponent.  */
+	    type = double_type_node;
+	    mode = DFmode;
+	    arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
+	  }
+	arg = fold_build1_loc (loc, ABS_EXPR, type, arg);
+
 	get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
 	real_from_string (&rmax, buf);
-	sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (mode)->emin - 1);
+	sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (orig_mode)->emin - 1);
 	real_from_string (&rmin, buf);
-	arg = builtin_save_expr (fold_build1_loc (loc, ABS_EXPR, type, arg));
-	result = build_call_expr (isle_fn, 2, arg,
-				  build_real (type, rmax));
-	result = fold_build2 (BIT_AND_EXPR, integer_type_node, result,
-			      build_call_expr (isge_fn, 2, arg,
-					       build_real (type, rmin)));
+	max_exp = build_real (type, rmax);
+	min_exp = build_real (type, rmin);
+
+	max_exp = build_call_expr (isle_fn, 2, arg, max_exp);
+	if (is_ibm_extended)
+	  {
+	    /* Testing the high end of the range is done just using
+	       the high double, using the same test as isfinite().
+	       For the subnormal end of the range we first test the
+	       high double, then if its magnitude is equal to the
+	       limit of 0x1p-969, we test whether the low double is
+	       non-zero and opposite sign to the high double.  */
+	    tree const islt_fn = builtin_decl_explicit (BUILT_IN_ISLESS);
+	    tree const isgt_fn = builtin_decl_explicit (BUILT_IN_ISGREATER);
+	    tree gt_min = build_call_expr (isgt_fn, 2, arg, min_exp);
+	    tree eq_min = fold_build2 (EQ_EXPR, integer_type_node,
+				       arg, min_exp);
+	    tree as_complex = build1 (VIEW_CONVERT_EXPR,
+				      complex_double_type_node, orig_arg);
+	    tree hi_dbl = build1 (REALPART_EXPR, type, as_complex);
+	    tree lo_dbl = build1 (IMAGPART_EXPR, type, as_complex);
+	    tree zero = build_real (type, dconst0);
+	    tree hilt = build_call_expr (islt_fn, 2, hi_dbl, zero);
+	    tree lolt = build_call_expr (islt_fn, 2, lo_dbl, zero);
+	    tree logt = build_call_expr (isgt_fn, 2, lo_dbl, zero);
+	    tree ok_lo = fold_build1 (TRUTH_NOT_EXPR, integer_type_node,
+				      fold_build3 (COND_EXPR,
+						   integer_type_node,
+						   hilt, logt, lolt));
+	    eq_min = fold_build2 (TRUTH_ANDIF_EXPR, integer_type_node,
+				  eq_min, ok_lo);
+	    min_exp = fold_build2 (TRUTH_ORIF_EXPR, integer_type_node,
+				   gt_min, eq_min);
+	  }
+	else
+	  {
+	    tree const isge_fn
+	      = builtin_decl_explicit (BUILT_IN_ISGREATEREQUAL);
+	    min_exp = build_call_expr (isge_fn, 2, arg, min_exp);
+	  }
+	result = fold_build2 (BIT_AND_EXPR, integer_type_node,
+			      max_exp, min_exp);
 	return result;
       }
     default:
@@ -7664,6 +7733,15 @@ fold_builtin_classify (location_t loc, tree fndecl, tree arg, int builtin_index)
       if (!HONOR_NANS (arg))
 	return omit_one_operand_loc (loc, type, integer_zero_node, arg);
 
+      {
+	bool is_ibm_extended = MODE_COMPOSITE_P (TYPE_MODE (TREE_TYPE (arg)));
+	if (is_ibm_extended)
+	  {
+	    /* NaN and Inf are encoded in the high-order double value
+	       only.  The low-order value is not significant.  */
+	    arg = fold_build1_loc (loc, NOP_EXPR, double_type_node, arg);
+	  }
+      }
       arg = builtin_save_expr (arg);
       return fold_build2_loc (loc, UNORDERED_EXPR, type, arg, arg);
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr70117.c b/gcc/testsuite/gcc.target/powerpc/pr70117.c
new file mode 100644
index 0000000..99e6f19
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
@@ -0,0 +1,22 @@
+/* { dg-do run { target { { powerpc*-*-darwin* powerpc*-*-aix* rs6000-*-* } || { powerpc*-*-linux* && lp64 } } } } */
+/* { dg-options "-mlong-double-128" } */
+
+#include <float.h>
+
+union gl_long_double_union
+  {
+    struct { double hi; double lo; } dd;
+    long double ld;
+  };
+
+const union gl_long_double_union gl_LDBL_MAX =
+  { { (DBL_MAX), (DBL_MAX) / (double)134217728UL / (double)134217728UL } };
+
+int main()
+{
+  if (__builtin_isinfl (gl_LDBL_MAX.ld))
+    __builtin_abort ();
+  if (__builtin_isinfl (-gl_LDBL_MAX.ld))
+    __builtin_abort ();
+  return 0;
+}

-- 
Alan Modra
Australia Development Lab, IBM

* Re: [PATCH] PR70117, ppc long double isinf
  2016-04-06  8:32   ` [PATCH] " Alan Modra
@ 2016-04-06  8:46     ` Richard Biener
  2016-04-06  9:19       ` Alan Modra
  2016-04-06 10:27     ` Andreas Schwab
  1 sibling, 1 reply; 13+ messages in thread
From: Richard Biener @ 2016-04-06  8:46 UTC
  To: Alan Modra; +Cc: GCC Patches

On Wed, Apr 6, 2016 at 10:31 AM, Alan Modra <amodra@gmail.com> wrote:
> On Tue, Apr 05, 2016 at 11:29:30AM +0200, Richard Biener wrote:
>> In general the patch looks like a good approach to me but can we
>> hide that
>>
>> > +  const struct real_format *fmt = FLOAT_MODE_FORMAT (mode);
>> > +  bool is_ibm_extended = fmt->pnan < fmt->p;
>>
>> in a function somewhere in real.[ch]?
>
> On looking in real.h, I see there is already a macro to do it.
>
> Here's the revised version that properly tests the long double
> subnormal limit.  Bootstrapped and regression tested
> powerpc64le-linux.

Can you add a testcase or two for the isnormal () case?

I wonder whether the isnormal tests are too large to emit as inline
code, and whether libgcc code would be a better way to handle this...

At least the glibc implementation looks a lot simpler to me
(if ./sysdeps/ieee754/ldbl-128ibm/s_fpclassifyl.c is the correct one).

So an alternative is to inline something similar, either via the
folding or via an optab without folding (I'd prefer the latter).

That said, did you inspect the generated code for an isnormal (x)
call with non-constant x?  What does XLC do here?

Richard.

> gcc/
>         PR target/70117
>         * builtins.c (fold_builtin_classify): For IBM extended precision,
>         look at just the high-order double to test for NaN.
>         (fold_builtin_interclass_mathfn): Similarly for Inf.  For isnormal
>         test just the high double for Inf but both doubles for subnormal
>         limit.
> gcc/testsuite/
>         * gcc.target/powerpc/pr70117.c: New.
>
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 9368ed0..9162838 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -7529,6 +7529,8 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
>
>    mode = TYPE_MODE (TREE_TYPE (arg));
>
> +  bool is_ibm_extended = MODE_COMPOSITE_P (mode);
> +
>    /* If there is no optab, try generic code.  */
>    switch (DECL_FUNCTION_CODE (fndecl))
>      {
> @@ -7538,10 +7540,18 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
>        {
>         /* isinf(x) -> isgreater(fabs(x),DBL_MAX).  */
>         tree const isgr_fn = builtin_decl_explicit (BUILT_IN_ISGREATER);
> -       tree const type = TREE_TYPE (arg);
> +       tree type = TREE_TYPE (arg);
>         REAL_VALUE_TYPE r;
>         char buf[128];
>
> +       if (is_ibm_extended)
> +         {
> +           /* NaN and Inf are encoded in the high-order double value
> +              only.  The low-order value is not significant.  */
> +           type = double_type_node;
> +           mode = DFmode;
> +           arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
> +         }
>         get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
>         real_from_string (&r, buf);
>         result = build_call_expr (isgr_fn, 2,
> @@ -7554,10 +7564,18 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
>        {
>         /* isfinite(x) -> islessequal(fabs(x),DBL_MAX).  */
>         tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
> -       tree const type = TREE_TYPE (arg);
> +       tree type = TREE_TYPE (arg);
>         REAL_VALUE_TYPE r;
>         char buf[128];
>
> +       if (is_ibm_extended)
> +         {
> +           /* NaN and Inf are encoded in the high-order double value
> +              only.  The low-order value is not significant.  */
> +           type = double_type_node;
> +           mode = DFmode;
> +           arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
> +         }
>         get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
>         real_from_string (&r, buf);
>         result = build_call_expr (isle_fn, 2,
> @@ -7577,21 +7595,72 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
>         /* isnormal(x) -> isgreaterequal(fabs(x),DBL_MIN) &
>            islessequal(fabs(x),DBL_MAX).  */
>         tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
> -       tree const isge_fn = builtin_decl_explicit (BUILT_IN_ISGREATEREQUAL);
> -       tree const type = TREE_TYPE (arg);
> +       tree type = TREE_TYPE (arg);
> +       tree orig_arg, max_exp, min_exp;
> +       machine_mode orig_mode = mode;
>         REAL_VALUE_TYPE rmax, rmin;
>         char buf[128];
>
> +       orig_arg = arg = builtin_save_expr (arg);
> +       if (is_ibm_extended)
> +         {
> +           /* Use double to test the normal range of IBM extended
> +              precision.  Emin for IBM extended precision is
> +              different to emin for IEEE double, being 53 higher
> +              since the low double exponent is at least 53 lower
> +              than the high double exponent.  */
> +           type = double_type_node;
> +           mode = DFmode;
> +           arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
> +         }
> +       arg = fold_build1_loc (loc, ABS_EXPR, type, arg);
> +
>         get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
>         real_from_string (&rmax, buf);
> -       sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (mode)->emin - 1);
> +       sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (orig_mode)->emin - 1);
>         real_from_string (&rmin, buf);
> -       arg = builtin_save_expr (fold_build1_loc (loc, ABS_EXPR, type, arg));
> -       result = build_call_expr (isle_fn, 2, arg,
> -                                 build_real (type, rmax));
> -       result = fold_build2 (BIT_AND_EXPR, integer_type_node, result,
> -                             build_call_expr (isge_fn, 2, arg,
> -                                              build_real (type, rmin)));
> +       max_exp = build_real (type, rmax);
> +       min_exp = build_real (type, rmin);
> +
> +       max_exp = build_call_expr (isle_fn, 2, arg, max_exp);
> +       if (is_ibm_extended)
> +         {
> +           /* Testing the high end of the range is done just using
> +              the high double, using the same test as isfinite().
> +              For the subnormal end of the range we first test the
> +              high double, then if its magnitude is equal to the
> +              limit of 0x1p-969, we test whether the low double is
> +              non-zero and opposite sign to the high double.  */
> +           tree const islt_fn = builtin_decl_explicit (BUILT_IN_ISLESS);
> +           tree const isgt_fn = builtin_decl_explicit (BUILT_IN_ISGREATER);
> +           tree gt_min = build_call_expr (isgt_fn, 2, arg, min_exp);
> +           tree eq_min = fold_build2 (EQ_EXPR, integer_type_node,
> +                                      arg, min_exp);
> +           tree as_complex = build1 (VIEW_CONVERT_EXPR,
> +                                     complex_double_type_node, orig_arg);
> +           tree hi_dbl = build1 (REALPART_EXPR, type, as_complex);
> +           tree lo_dbl = build1 (IMAGPART_EXPR, type, as_complex);
> +           tree zero = build_real (type, dconst0);
> +           tree hilt = build_call_expr (islt_fn, 2, hi_dbl, zero);
> +           tree lolt = build_call_expr (islt_fn, 2, lo_dbl, zero);
> +           tree logt = build_call_expr (isgt_fn, 2, lo_dbl, zero);
> +           tree ok_lo = fold_build1 (TRUTH_NOT_EXPR, integer_type_node,
> +                                     fold_build3 (COND_EXPR,
> +                                                  integer_type_node,
> +                                                  hilt, logt, lolt));
> +           eq_min = fold_build2 (TRUTH_ANDIF_EXPR, integer_type_node,
> +                                 eq_min, ok_lo);
> +           min_exp = fold_build2 (TRUTH_ORIF_EXPR, integer_type_node,
> +                                  gt_min, eq_min);
> +         }
> +       else
> +         {
> +           tree const isge_fn
> +             = builtin_decl_explicit (BUILT_IN_ISGREATEREQUAL);
> +           min_exp = build_call_expr (isge_fn, 2, arg, min_exp);
> +         }
> +       result = fold_build2 (BIT_AND_EXPR, integer_type_node,
> +                             max_exp, min_exp);
>         return result;
>        }
>      default:
> @@ -7664,6 +7733,15 @@ fold_builtin_classify (location_t loc, tree fndecl, tree arg, int builtin_index)
>        if (!HONOR_NANS (arg))
>         return omit_one_operand_loc (loc, type, integer_zero_node, arg);
>
> +      {
> +       bool is_ibm_extended = MODE_COMPOSITE_P (TYPE_MODE (TREE_TYPE (arg)));
> +       if (is_ibm_extended)
> +         {
> +           /* NaN and Inf are encoded in the high-order double value
> +              only.  The low-order value is not significant.  */
> +           arg = fold_build1_loc (loc, NOP_EXPR, double_type_node, arg);
> +         }
> +      }
>        arg = builtin_save_expr (arg);
>        return fold_build2_loc (loc, UNORDERED_EXPR, type, arg, arg);
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr70117.c b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> new file mode 100644
> index 0000000..99e6f19
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> @@ -0,0 +1,22 @@
> +/* { dg-do run { target { { powerpc*-*-darwin* powerpc*-*-aix* rs6000-*-* } || { powerpc*-*-linux* && lp64 } } } } */
> +/* { dg-options "-mlong-double-128" } */
> +
> +#include <float.h>
> +
> +union gl_long_double_union
> +  {
> +    struct { double hi; double lo; } dd;
> +    long double ld;
> +  };
> +
> +const union gl_long_double_union gl_LDBL_MAX =
> +  { { (DBL_MAX), (DBL_MAX) / (double)134217728UL / (double)134217728UL } };
> +
> +int main()
> +{
> +  if (__builtin_isinfl (gl_LDBL_MAX.ld))
> +    __builtin_abort ();
> +  if (__builtin_isinfl (-gl_LDBL_MAX.ld))
> +    __builtin_abort ();
> +  return 0;
> +}
>
> --
> Alan Modra
> Australia Development Lab, IBM

* Re: [PATCH] PR70117, ppc long double isinf
  2016-04-06  8:46     ` Richard Biener
@ 2016-04-06  9:19       ` Alan Modra
  2016-04-07  8:04         ` Alan Modra
  0 siblings, 1 reply; 13+ messages in thread
From: Alan Modra @ 2016-04-06  9:19 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1715 bytes --]

On Wed, Apr 06, 2016 at 10:46:48AM +0200, Richard Biener wrote:
> On Wed, Apr 6, 2016 at 10:31 AM, Alan Modra <amodra@gmail.com> wrote:
> > On Tue, Apr 05, 2016 at 11:29:30AM +0200, Richard Biener wrote:
> >> In general the patch looks like a good approach to me but can we
> >> hide that
> >>
> >> > +  const struct real_format *fmt = FLOAT_MODE_FORMAT (mode);
> >> > +  bool is_ibm_extended = fmt->pnan < fmt->p;
> >>
> >> in a function somewhere in real.[ch]?
> >
> > On looking in real.h, I see there is already a macro to do it.
> >
> > Here's the revised version that properly tests the long double
> > subnormal limit.  Bootstrapped and regression tested
> > powerpc64le-linux.
> 
> Can you add a testcase or two for the isnormal () case?

Sure.  I'll adapt the testcase I was using to verify the output,
attached in case you're interested.

> I wonder whether the isnormal tests are too excessive to put in
> inline code and thus libgcc code wouldn't be better to handle this...

Out-of-line would be better for -Os at least.

> At least the glibc implementation looks a lot simpler to me ...
> (if ./sysdeps/ieee754/ldbl-128ibm/s_fpclassifyl.c is the correct one).

It looks more or less the same to me, except done by bit twiddling on
integers.  :)

> Thus an alternative is to inline sth similar via the folding or via
> an optab and not folding (I'd prefer the latter).
> 
> That said, did you inspect the generated code for a isnormal (x)
> call for non-constant x?

Yes, I spent quite a bit of time fiddling trying to get optimal code.
I'm not claiming I succeeded.

>  What does XLC do here?

Not sure, sorry.  I don't have xlc handy.  Will try later.

-- 
Alan Modra
Australia Development Lab, IBM

[-- Attachment #2: isnormal.c --]
[-- Type: text/x-csrc, Size: 1846 bytes --]

int __attribute__ ((noclone, noinline))
isnormal (double x)
{
  return __builtin_isnormal (x);
}

int __attribute__ ((noclone, noinline))
isnormal_ld (long double x)
{
  return __builtin_isnormal (x);
}

double min_norm = 0x1p-1022;
double min_denorm = 0x1p-1074;
double ld_low = 0x1p-969;

int
main (void)
{
  static union { long double ld; unsigned long l[2]; } x;

  __builtin_printf ("%a %d\n", min_norm, isnormal (min_norm));
  __builtin_printf ("%a %d\n", min_norm * 0.5, isnormal (min_norm * 0.5));

  x.ld = ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		    isnormal_ld (x.ld));
  x.ld = -ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		    isnormal_ld (x.ld));
  x.ld = -min_norm * 0.5;
  x.ld += ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		    isnormal_ld (x.ld));
  x.ld = min_norm * 0.5;
  x.ld -= ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		    isnormal_ld (x.ld));
  x.ld = -min_norm;
  x.ld += ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		    isnormal_ld (x.ld));
  x.ld = min_norm;
  x.ld -= ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		    isnormal_ld (x.ld));
  x.ld = -min_denorm;
  x.ld += ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		    isnormal_ld (x.ld));
  x.ld = min_denorm;
  x.ld -= ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		    isnormal_ld (x.ld));
  x.ld = min_denorm;
  x.ld += ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		    isnormal_ld (x.ld));
  x.ld = -min_denorm;
  x.ld -= ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		    isnormal_ld (x.ld));
  return 0;
}
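
For reference, the ld_low value in the attachment above follows from the
precision argument made earlier in the thread: a 106-bit value whose
leading bit is 2**-969 has its last bit at 2**-1074, the smallest double
distinct from zero.  A minimal sketch of that arithmetic in plain
doubles; the helper names are made up for illustration and are not part
of the patch.

```c
#include <assert.h>

/* Hypothetical helper, not from the patch: the smallest magnitude at
   which a double-double can still carry 106 bits of precision.  */
double
ibm_ld_min_normal (void)
{
  /* 0x1p-1074 is the smallest double distinct from zero; a 106-bit
     value ending at that bit has its leading bit 105 places higher.
     Scaling by a power of two is exact here.  */
  return 0x1p-1074 * 0x1p105;
}

/* Sanity check that the result is the 2**-969 limit quoted in the
   thread.  */
void
check_threshold (void)
{
  assert (ibm_ld_min_normal () == 0x1p-969);
}
```

Calling check_threshold () from a test driver is enough to confirm the
exponent arithmetic; it says nothing about how the builtins behave.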

* Re: [PATCH] PR70117, ppc long double isinf
  2016-04-06  8:32   ` [PATCH] " Alan Modra
  2016-04-06  8:46     ` Richard Biener
@ 2016-04-06 10:27     ` Andreas Schwab
  2016-04-06 12:43       ` Alan Modra
  1 sibling, 1 reply; 13+ messages in thread
From: Andreas Schwab @ 2016-04-06 10:27 UTC (permalink / raw)
  To: Alan Modra; +Cc: Richard Biener, GCC Patches

Alan Modra <amodra@gmail.com> writes:

> diff --git a/gcc/testsuite/gcc.target/powerpc/pr70117.c b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> new file mode 100644
> index 0000000..99e6f19
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> @@ -0,0 +1,22 @@
> +/* { dg-do run { target { { powerpc*-*-darwin* powerpc*-*-aix* rs6000-*-* } || { powerpc*-*-linux* && lp64 } } } } */

Any reason why it is restricted to lp64?

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

* Re: [PATCH] PR70117, ppc long double isinf
  2016-04-06 10:27     ` Andreas Schwab
@ 2016-04-06 12:43       ` Alan Modra
  0 siblings, 0 replies; 13+ messages in thread
From: Alan Modra @ 2016-04-06 12:43 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Richard Biener, GCC Patches

On Wed, Apr 06, 2016 at 12:27:36PM +0200, Andreas Schwab wrote:
> Alan Modra <amodra@gmail.com> writes:
> 
> > diff --git a/gcc/testsuite/gcc.target/powerpc/pr70117.c b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> > new file mode 100644
> > index 0000000..99e6f19
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> > @@ -0,0 +1,22 @@
> > +/* { dg-do run { target { { powerpc*-*-darwin* powerpc*-*-aix* rs6000-*-* } || { powerpc*-*-linux* && lp64 } } } } */
> 
> Any reason why it is restricted to lp64?

No, that was me copying from rs6000-ldouble-1.c without thinking.
We've had double-double on powerpc-linux 32-bit for quite a while.
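
One way to see which long double format a given target uses is to look
at LDBL_MANT_DIG from <float.h>, which is 106 for double-double.  This
is only a sketch: the function names are hypothetical, and it is not
how the testsuite selects targets.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: returns 1 when long double has the 106-bit
   mantissa of IBM extended (double-double), 0 for any other format.  */
int
long_double_is_ibm_extended (void)
{
  return LDBL_MANT_DIG == 106;
}

/* A double-double is stored as two doubles, so its size must be twice
   that of a double whenever the helper above reports 1.  */
void
check_format_consistency (void)
{
  assert (!long_double_is_ibm_extended ()
	  || sizeof (long double) == 2 * sizeof (double));
}
```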

-- 
Alan Modra
Australia Development Lab, IBM

* Re: [PATCH] PR70117, ppc long double isinf
  2016-04-06  9:19       ` Alan Modra
@ 2016-04-07  8:04         ` Alan Modra
  2016-04-07  9:33           ` Richard Biener
  0 siblings, 1 reply; 13+ messages in thread
From: Alan Modra @ 2016-04-07  8:04 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On Wed, Apr 06, 2016 at 06:49:19PM +0930, Alan Modra wrote:
> On Wed, Apr 06, 2016 at 10:46:48AM +0200, Richard Biener wrote:
> > Can you add a testcase or two for the isnormal () case?
> 
> Sure.  I'll adapt the testcase I was using to verify the output,

Revised testcase - target fixed, compiled at -O2 with volatile vars so
we're testing optimized builtins with non-constant data.

diff --git a/gcc/testsuite/gcc.target/powerpc/pr70117.c b/gcc/testsuite/gcc.target/powerpc/pr70117.c
new file mode 100644
index 0000000..f1fdedb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
@@ -0,0 +1,92 @@
+/* { dg-do run { target { powerpc*-*-linux* powerpc*-*-darwin* powerpc*-*-aix* rs6000-*-* } } } */
+/* { dg-options "-std=c99 -mlong-double-128 -O2" } */
+
+#include <float.h>
+
+union gl_long_double_union
+{
+  struct { double hi; double lo; } dd;
+  long double ld;
+};
+
+/* This is gnulib's LDBL_MAX which, being 107 bits in precision, is
+   slightly larger than gcc's 106 bit precision LDBL_MAX.  */
+volatile union gl_long_double_union gl_LDBL_MAX =
+  { { DBL_MAX, DBL_MAX / (double)134217728UL / (double)134217728UL } };
+
+volatile double min_denorm = 0x1p-1074;
+volatile double ld_low = 0x1p-969;
+volatile double dinf = 1.0/0.0;
+volatile double dnan = 0.0/0.0;
+
+int
+main (void)
+{
+  long double ld;
+
+  ld = gl_LDBL_MAX.ld;
+  if (__builtin_isinfl (ld))
+    __builtin_abort ();
+  ld = -gl_LDBL_MAX.ld;
+  if (__builtin_isinfl (ld))
+    __builtin_abort ();
+
+  ld = gl_LDBL_MAX.ld;
+  if (!__builtin_isfinite (ld))
+    __builtin_abort ();
+  ld = -gl_LDBL_MAX.ld;
+  if (!__builtin_isfinite (ld))
+    __builtin_abort ();
+
+  ld = ld_low;
+  if (!__builtin_isnormal (ld))
+    __builtin_abort ();
+  ld = -ld_low;
+  if (!__builtin_isnormal (ld))
+    __builtin_abort ();
+
+  ld = -min_denorm;
+  ld += ld_low;
+  if (__builtin_isnormal (ld))
+    __builtin_abort ();
+  ld = min_denorm;
+  ld -= ld_low;
+  if (__builtin_isnormal (ld))
+    __builtin_abort ();
+
+  ld = 0.0;
+  if (__builtin_isnormal (ld))
+    __builtin_abort ();
+  ld = -0.0;
+  if (__builtin_isnormal (ld))
+    __builtin_abort ();
+
+  ld = LDBL_MAX;
+  if (!__builtin_isnormal (ld))
+    __builtin_abort ();
+  ld = -LDBL_MAX;
+  if (!__builtin_isnormal (ld))
+    __builtin_abort ();
+
+  ld = gl_LDBL_MAX.ld;
+  if (!__builtin_isnormal (ld))
+    __builtin_abort ();
+  ld = -gl_LDBL_MAX.ld;
+  if (!__builtin_isnormal (ld))
+    __builtin_abort ();
+
+  ld = dinf;
+  if (__builtin_isnormal (ld))
+    __builtin_abort ();
+  ld = -dinf;
+  if (__builtin_isnormal (ld))
+    __builtin_abort ();
+
+  ld = dnan;
+  if (__builtin_isnormal (ld))
+    __builtin_abort ();
+  ld = -dnan;
+  if (__builtin_isnormal (ld))
+    __builtin_abort ();
+  return 0;
+}
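
The "test only the high double" rule that the testcase exercises can be
emulated with a pair of plain doubles, which makes it easy to try on a
machine without IBM long double.  This is a sketch under that
assumption: struct dd and the dd_* helpers are invented for
illustration and are not what the patch emits.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Illustrative stand-in for an IBM extended long double.  */
struct dd { double hi; double lo; };

struct dd
dd_make (double hi, double lo)
{
  struct dd x = { hi, lo };
  return x;
}

/* Inf and NaN are encoded in the high-order double only; the
   low-order double is not significant for these tests.  */
int dd_isinf (struct dd x) { return isinf (x.hi) != 0; }
int dd_isnan (struct dd x) { return isnan (x.hi) != 0; }

/* gnulib's LDBL_MAX as a hi/lo pair.  134217728 is 2**27, so the low
   double is DBL_MAX / 2**54.  */
struct dd
dd_gl_ldbl_max (void)
{
  return dd_make (DBL_MAX,
		  DBL_MAX / (double) 134217728UL / (double) 134217728UL);
}

void
check_high_double_tests (void)
{
  struct dd m = dd_gl_ldbl_max ();
  assert (!dd_isinf (m));
  assert (!dd_isinf (dd_make (-m.hi, -m.lo)));
  assert (dd_isinf (dd_make (INFINITY, 0.0)));
  assert (dd_isnan (dd_make (NAN, 0.0)));
}
```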

> >  What does XLC do here?
> 
> Not sure, sorry.  I don't have xlc handy.  Will try later.

It seems that to compile 128-bit long double with xlc, I need xlc128,
and I don't have that.  For a double, xlc implements isnormal() on
power8 by moving the fpr argument to a gpr followed by a bunch of bit
twiddling to test the exponent.  xlc's sequence isn't as good as it
could be, 15 insns.  The ideal ought to be the following, I think,
which gcc compiles to 8 insns on power8 (and could be 7 insns if a
useless sign extension was eliminated).

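#include <stdint.h>

int
bit_isnormal (double x)
{
  union { double d; uint64_t l; } val;
  val.d = x;
  /* Extract the 11-bit biased exponent.  */
  uint64_t exp = (val.l >> 52) & 0x7ff;
  /* Unsigned wrap-around rejects exp == 0 (zero, subnormal) as well
     as exp == 0x7ff (Inf, NaN).  */
  return exp - 1 < 0x7fe;
}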

The above is around twice as fast as the fold_builtin_interclass_mathfn
implementation of isnormal() for double, on power8.  I expect a bit
twiddling implementation for IBM extended would show similar or better
improvement.
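
The exponent test can be checked against the library isnormal() on a
few boundary values.  The sketch below restates the same test with
memcpy in place of the union; the names bit_isnormal_memcpy and
check_bit_isnormal are hypothetical, and nothing here measures speed.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Same exponent test as bit_isnormal, reading the bits via memcpy.  */
int
bit_isnormal_memcpy (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  uint64_t exp = (bits >> 52) & 0x7ff;
  /* exp == 0 wraps to UINT64_MAX and exp == 0x7ff becomes 0x7fe, so
     both fail the comparison.  */
  return exp - 1 < 0x7fe;
}

/* Returns 1 when the bit test agrees with the library isnormal().  */
int
agrees_with_isnormal (double x)
{
  return bit_isnormal_memcpy (x) == (isnormal (x) != 0);
}

void
check_bit_isnormal (void)
{
  assert (agrees_with_isnormal (0.0));
  assert (agrees_with_isnormal (-0.0));
  assert (agrees_with_isnormal (1.0));
  assert (agrees_with_isnormal (0x1p-1022));	/* smallest normal */
  assert (agrees_with_isnormal (0x1p-1023));	/* subnormal */
  assert (agrees_with_isnormal (0x1p-1074));	/* smallest subnormal */
  assert (agrees_with_isnormal (INFINITY));
  assert (agrees_with_isnormal (NAN));
}
```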

However I'm not inclined to pursue this, especially for gcc-6.  The
patch I posted for isnormal() IBM extended is already faster (about
65% average timing on power8) than what existed previously.

-- 
Alan Modra
Australia Development Lab, IBM

* Re: [PATCH] PR70117, ppc long double isinf
  2016-04-07  8:04         ` Alan Modra
@ 2016-04-07  9:33           ` Richard Biener
  2016-04-07 14:17             ` Alan Modra
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Biener @ 2016-04-07  9:33 UTC (permalink / raw)
  To: Alan Modra; +Cc: GCC Patches

On April 7, 2016 10:03:54 AM GMT+02:00, Alan Modra <amodra@gmail.com> wrote:
>On Wed, Apr 06, 2016 at 06:49:19PM +0930, Alan Modra wrote:
>> On Wed, Apr 06, 2016 at 10:46:48AM +0200, Richard Biener wrote:
>> > Can you add a testcase or two for the isnormal () case?
>> 
>> Sure.  I'll adapt the testcase I was using to verify the output,
>
>Revised testcase - target fixed, compiled at -O2 with volatile vars so
>we're testing optimized builtins with non-constant data.
>
>diff --git a/gcc/testsuite/gcc.target/powerpc/pr70117.c b/gcc/testsuite/gcc.target/powerpc/pr70117.c
>new file mode 100644
>index 0000000..f1fdedb
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
>@@ -0,0 +1,92 @@
>+/* { dg-do run { target { powerpc*-*-linux* powerpc*-*-darwin* powerpc*-*-aix* rs6000-*-* } } } */
>+/* { dg-options "-std=c99 -mlong-double-128 -O2" } */
>+
>+#include <float.h>
>+
>+union gl_long_double_union
>+{
>+  struct { double hi; double lo; } dd;
>+  long double ld;
>+};
>+
>+/* This is gnulib's LDBL_MAX which, being 107 bits in precision, is
>+   slightly larger than gcc's 106 bit precision LDBL_MAX.  */
>+volatile union gl_long_double_union gl_LDBL_MAX =
>+  { { DBL_MAX, DBL_MAX / (double)134217728UL / (double)134217728UL } };
>+
>+volatile double min_denorm = 0x1p-1074;
>+volatile double ld_low = 0x1p-969;
>+volatile double dinf = 1.0/0.0;
>+volatile double dnan = 0.0/0.0;
>+
>+int
>+main (void)
>+{
>+  long double ld;
>+
>+  ld = gl_LDBL_MAX.ld;
>+  if (__builtin_isinfl (ld))
>+    __builtin_abort ();
>+  ld = -gl_LDBL_MAX.ld;
>+  if (__builtin_isinfl (ld))
>+    __builtin_abort ();
>+
>+  ld = gl_LDBL_MAX.ld;
>+  if (!__builtin_isfinite (ld))
>+    __builtin_abort ();
>+  ld = -gl_LDBL_MAX.ld;
>+  if (!__builtin_isfinite (ld))
>+    __builtin_abort ();
>+
>+  ld = ld_low;
>+  if (!__builtin_isnormal (ld))
>+    __builtin_abort ();
>+  ld = -ld_low;
>+  if (!__builtin_isnormal (ld))
>+    __builtin_abort ();
>+
>+  ld = -min_denorm;
>+  ld += ld_low;
>+  if (__builtin_isnormal (ld))
>+    __builtin_abort ();
>+  ld = min_denorm;
>+  ld -= ld_low;
>+  if (__builtin_isnormal (ld))
>+    __builtin_abort ();
>+
>+  ld = 0.0;
>+  if (__builtin_isnormal (ld))
>+    __builtin_abort ();
>+  ld = -0.0;
>+  if (__builtin_isnormal (ld))
>+    __builtin_abort ();
>+
>+  ld = LDBL_MAX;
>+  if (!__builtin_isnormal (ld))
>+    __builtin_abort ();
>+  ld = -LDBL_MAX;
>+  if (!__builtin_isnormal (ld))
>+    __builtin_abort ();
>+
>+  ld = gl_LDBL_MAX.ld;
>+  if (!__builtin_isnormal (ld))
>+    __builtin_abort ();
>+  ld = -gl_LDBL_MAX.ld;
>+  if (!__builtin_isnormal (ld))
>+    __builtin_abort ();
>+
>+  ld = dinf;
>+  if (__builtin_isnormal (ld))
>+    __builtin_abort ();
>+  ld = -dinf;
>+  if (__builtin_isnormal (ld))
>+    __builtin_abort ();
>+
>+  ld = dnan;
>+  if (__builtin_isnormal (ld))
>+    __builtin_abort ();
>+  ld = -dnan;
>+  if (__builtin_isnormal (ld))
>+    __builtin_abort ();
>+  return 0;
>+}
>
>> >  What does XLC do here?
>> 
>> Not sure, sorry.  I don't have xlc handy.  Will try later.
>
>It seems that to compile 128-bit long double with xlc, I need xlc128,
>and I don't have that..  For a double, xlc implements isnormal() on
>power8 by moving the fpr argument to a gpr followed by a bunch of bit
>twiddling to test the exponent.  xlc's sequence isn't as good as it
>could be, 15 insns.  The ideal ought to be the following, I think,
>which gcc compiles to 8 insns on power8 (and could be 7 insns if a
>useless sign extension was eliminated).
>
>int
>bit_isnormal (double x)
>{
>  union { double d; uint64_t l; } val;
>  val.d = x;
>  uint64_t exp = (val.l >> 52) & 0x7ff;
>  return exp - 1 < 0x7fe;
>}
>
>The above is around twice as fast as fold_builtin_interclass_mathfn
>implementation of isnormal() for double, on power8.  I expect a bit
>twiddling implementation for IBM extended would show similar or better
>improvement.
>
>However I'm not inclined to pursue this, especially for gcc-6.  The
>patch I posted for isnormal() IBM extended is already faster (about
>65% average timing on power8) than what existed previously.

That's good to know.  I think the patch is OK but please seek approval from a ppc maintainer as well.

Thanks,
Richard.


* Re: [PATCH] PR70117, ppc long double isinf
  2016-04-07  9:33           ` Richard Biener
@ 2016-04-07 14:17             ` Alan Modra
  2016-04-07 14:43               ` David Edelsohn
  0 siblings, 1 reply; 13+ messages in thread
From: Alan Modra @ 2016-04-07 14:17 UTC (permalink / raw)
  To: David Edelsohn; +Cc: GCC Patches

On Thu, Apr 07, 2016 at 11:32:58AM +0200, Richard Biener wrote:
> That's good to know.  I think the patch is OK but please seek approval from a ppc maintainer as well

There's only one of those.  David?  Thread starts here
https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00213.html

-- 
Alan Modra
Australia Development Lab, IBM

* Re: [PATCH] PR70117, ppc long double isinf
  2016-04-07 14:17             ` Alan Modra
@ 2016-04-07 14:43               ` David Edelsohn
  2016-04-08  3:03                 ` Alan Modra
  0 siblings, 1 reply; 13+ messages in thread
From: David Edelsohn @ 2016-04-07 14:43 UTC (permalink / raw)
  To: Alan Modra; +Cc: GCC Patches

On Thu, Apr 7, 2016 at 10:17 AM, Alan Modra <amodra@gmail.com> wrote:
> On Thu, Apr 07, 2016 at 11:32:58AM +0200, Richard Biener wrote:
>> That's good to know.  I think the patch is OK but please seek approval from a ppc maintainer as well
>
> There's only one of those.  David?  Thread starts here
> https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00213.html

Yes, I have been following this entertaining thread.

This is okay.

By the way, xlc -qldbl128 should enable 128 bit.

Thanks, David

* Re: [PATCH] PR70117, ppc long double isinf
  2016-04-07 14:43               ` David Edelsohn
@ 2016-04-08  3:03                 ` Alan Modra
  2016-04-08  5:41                   ` Richard Biener
  0 siblings, 1 reply; 13+ messages in thread
From: Alan Modra @ 2016-04-08  3:03 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, David Edelsohn

On Thu, Apr 07, 2016 at 10:43:31AM -0400, David Edelsohn wrote:
> Yes, I have been following this entertaining thread.

How to waste lots of time over one bit.  Floating point is like that.
:-)

I see the bug was opened against 5.3, so OK to commit there after a
few days and maybe 4.9 too, Richard?

-- 
Alan Modra
Australia Development Lab, IBM

* Re: [PATCH] PR70117, ppc long double isinf
  2016-04-08  3:03                 ` Alan Modra
@ 2016-04-08  5:41                   ` Richard Biener
  0 siblings, 0 replies; 13+ messages in thread
From: Richard Biener @ 2016-04-08  5:41 UTC (permalink / raw)
  To: Alan Modra; +Cc: GCC Patches, David Edelsohn

On April 8, 2016 5:03:04 AM GMT+02:00, Alan Modra <amodra@gmail.com> wrote:
>On Thu, Apr 07, 2016 at 10:43:31AM -0400, David Edelsohn wrote:
>> Yes, I have been following this entertaining thread.
>
>How to waste lots of time over one bit.  Floating point is like that.
>:-)
>
>I see the bug was opened against 5.3, so OK to commit there after a
>few days and maybe 4.9 too, Richard?

Yes please.

Richard.

Thread overview: 13+ messages
2016-04-05  8:33 [RFC] PR70117, ppc long double isinf Alan Modra
2016-04-05  9:29 ` Richard Biener
2016-04-06  8:32   ` [PATCH] " Alan Modra
2016-04-06  8:46     ` Richard Biener
2016-04-06  9:19       ` Alan Modra
2016-04-07  8:04         ` Alan Modra
2016-04-07  9:33           ` Richard Biener
2016-04-07 14:17             ` Alan Modra
2016-04-07 14:43               ` David Edelsohn
2016-04-08  3:03                 ` Alan Modra
2016-04-08  5:41                   ` Richard Biener
2016-04-06 10:27     ` Andreas Schwab
2016-04-06 12:43       ` Alan Modra
