[PATCH] Re: nextafter() about an order of magnitude slower than trivial implementation

public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed

* [PATCH] Re: nextafter() about an order of magnitude slower than trivial implementation
       [not found]     ` <741b2c48-a754-51c5-cb72-a2f97795e30f@linaro.org>
@ 2021-08-19 17:57       ` Stefan Kanthak
  2021-08-20  9:52       ` [PATCH/2nd version] " Stefan Kanthak
  1 sibling, 0 replies; 6+ messages in thread
From: Stefan Kanthak @ 2021-08-19 17:57 UTC (permalink / raw)
  To: Szabolcs Nagy, Adhemerval Zanella; +Cc: libc-help, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 3645 bytes --]

Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

> On 18/08/2021 14:11, Stefan Kanthak wrote:
>> Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>> 
>>> The 08/16/2021 18:03, Stefan Kanthak wrote:
>>>> Testing and benchmarking an exponential function that consumes
>>>> about 10ns/call on an AMD EPYC 7262, I noticed that nextafter()
>>>> itself is DEAD SLOW: it consumes about 8.5ns/call!
>>>>
>>>> The following (quick&dirty) implementation consumes 0.7ns/call,
>>>> i.e. is about an order of magnitude faster:
>>>
>>> correctly measuring latency on a modern x86_64 core:
>>>
>>> musl: 3.16 ns
>>> glibc: 5.68 ns
>>> your: 5.72 ns
> 
> Thanks for bring this up, if you want to contribute a patch

See attachment.
For completeness sake and to ease an initial review, I post the full
source here too: without ALTERNATE defined, it uses integer arithmetic
to the extent possible, like the original.
Choose whatever you like best or gives better throughput/latency.

Stefan

JFTR: .../sysdeps/ieee754/flt-32/s_nextafterf.c as well as
      .../sysdeps/i386/fpu/s_nextafterl.c will need the same changes.

--- .../math/s_nextafter.c ---
/* IEEE functions
 *        nextafter(x,y)
 *        return the next machine floating-point number of x in the
 *        direction toward y.
 *   Special cases:
 */

/* Ugly hack so that the aliasing works.  */
#define __nexttoward __internal___nexttoward
#define nexttoward __internal_nexttoward

#include <errno.h>
#include <math.h>
#include <math-barriers.h>
#include <math_private.h>
#include <float.h>
#include <libm-alias-double.h>

double __nextafter(double x, double y)
{
        union {
            double d;
            int64_t s;
            uint64_t u;
        } xx = {x}, yy = {y};

        uint64_t ax = xx.u + xx.u;      /* |x|<<1 */
        uint64_t ay = yy.u + yy.u;      /* |y|<<1 */

        if(ax > 0x7ffull << 53)         /* x is nan */
            return x;
        if(ay > 0x7ffull << 53)         /* y is nan */
            return y;
        if(x == y) return y;            /* x=y, return y */
        if(ax == 0ull) {                /* x == 0 */
            xx.u = yy.s < 0 ? 0x8000000000000001ull : 1ull;
            xx.d = math_opt_barrier (xx.d);
            xx.d += 0.0;
            math_force_eval (xx.d);     /* raise underflow flag */
            return xx.d;                /* return +-minsubnormal */
        }
#ifdef ALTERNATE
        if((x < y) == (x < 0.0))
            --xx.u;
        else
            ++xx.u;
#else
        if(xx.s < 0) {                  /* x < 0 */
            if(yy.s < 0 && xx.s < yy.s)
                ++xx.s;                 /* x < 0, x > y, x += ulp */
            else
                --xx.s;                 /* x < 0, x < y, x -= ulp */
        } else {                        /* x > 0 */
            if(xx.s > yy.s)
                --xx.s;                 /* x > 0, x > y, x -= ulp */
            else
                ++xx.s;                 /* x > 0, x < y, x += ulp */
        }
#endif
        ax = xx.u + xx.u;               /* |x|<<1 */
        if((ax >= 0x7ffull << 53)       /* overflow  */
        || (ax < 0x001ull << 53)) {     /* underflow */
          xx.d = math_opt_barrier (xx.d);
          xx.d += 0.0;
          math_force_eval (xx.d);       /* raise overflow/underflow flag */
          __set_errno (ERANGE);
        }
        return xx.d;
}
libm_alias_double (__nextafter, nextafter)
#ifdef NO_LONG_DOUBLE
strong_alias (__nextafter, __nexttowardl)
weak_alias (__nexttowardl, nexttowardl)
#undef __nexttoward
strong_alias (__nextafter, __nexttoward)
#undef nexttoward
weak_alias (__nextafter, nexttoward)
#endif
--- EOF ---

[-- Attachment #2: glibc.patch --]
[-- Type: application/octet-stream, Size: 3533 bytes --]

--- -/math/s_nextafter.c
+++ +/math/s_nextafter.c
@@ -1,19 +1,3 @@
-/* @(#)s_nextafter.c 5.1 93/09/24 */
-/*
- * ====================================================
- * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
- *
- * Developed at SunPro, a Sun Microsystems, Inc. business.
- * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice
- * is preserved.
- * ====================================================
- */
-
-#if defined(LIBM_SCCS) && !defined(lint)
-static char rcsid[] = "$NetBSD: s_nextafter.c,v 1.8 1995/05/10 20:47:58 jtc Exp $";
-#endif
-
 /* IEEE functions
  *	nextafter(x,y)
  *	return the next machine floating-point number of x in the
@@ -34,55 +18,53 @@
 
 double __nextafter(double x, double y)
 {
-	int32_t hx,hy,ix,iy;
-	uint32_t lx,ly;
+	union {
+	    double d;
+	    int64_t s;
+	    uint64_t u;
+	} xx = {x}, yy = {y};
 
-	EXTRACT_WORDS(hx,lx,x);
-	EXTRACT_WORDS(hy,ly,y);
-	ix = hx&0x7fffffff;		/* |x| */
-	iy = hy&0x7fffffff;		/* |y| */
+	uint64_t ax = xx.u + xx.u;	/* |x|<<1 */
+	uint64_t ay = yy.u + yy.u;	/* |y|<<1 */
 
-	if(((ix>=0x7ff00000)&&((ix-0x7ff00000)|lx)!=0) ||   /* x is nan */
-	   ((iy>=0x7ff00000)&&((iy-0x7ff00000)|ly)!=0))     /* y is nan */
-	   return x+y;
-	if(x==y) return y;		/* x=y, return y */
-	if((ix|lx)==0) {			/* x == 0 */
-	    double u;
-	    INSERT_WORDS(x,hy&0x80000000,1);	/* return +-minsubnormal */
-	    u = math_opt_barrier (x);
-	    u = u*u;
-	    math_force_eval (u);		/* raise underflow flag */
+	if(ax > 0x7ffull << 53)		/* x is nan */
 	    return x;
+	if(ay > 0x7ffull << 53)		/* y is nan */
+	    return y;
+	if(x == y) return y;		/* x=y, return y */
+	if(ax == 0ull) {		/* x == 0 */
+	    xx.u = yy.s < 0 ? 0x8000000000000001ull : 1ull;
+	    xx.d = math_opt_barrier (xx.d);
+	    xx.d += 0.0;
+	    math_force_eval (xx.d);	/* raise underflow flag */
+	    return xx.d;		/* return +-minsubnormal */
 	}
-	if(hx>=0) {				/* x > 0 */
-	    if(hx>hy||((hx==hy)&&(lx>ly))) {	/* x > y, x -= ulp */
-		if(lx==0) hx -= 1;
-		lx -= 1;
-	    } else {				/* x < y, x += ulp */
-		lx += 1;
-		if(lx==0) hx += 1;
-	    }
-	} else {				/* x < 0 */
-	    if(hy>=0||hx>hy||((hx==hy)&&(lx>ly))){/* x < y, x -= ulp */
-		if(lx==0) hx -= 1;
-		lx -= 1;
-	    } else {				/* x > y, x += ulp */
-		lx += 1;
-		if(lx==0) hx += 1;
-	    }
+#ifdef ALTERNATE
+	if ((x < y) == (x < 0.0))
+	    --xx.u;
+	else
+	    ++xx.u;
+#else
+	if(xx.s < 0) {			/* x < 0 */
+	    if(yy.s < 0 && xx.s < yy.s)
+	        ++xx.s;			/* x < 0, x > y, x += ulp */
+	    else
+	        --xx.s;			/* x < 0, x < y, x -= ulp */
+	} else {			/* x > 0 */
+	    if(xx.s > yy.s)
+	        --xx.s;			/* x > 0, x > y, x -= ulp */
+	    else
+	        ++xx.s;			/* x > 0, x < y, x += ulp */
 	}
-	hy = hx&0x7ff00000;
-	if(hy>=0x7ff00000) {
-	  double u = x+x;	/* overflow  */
-	  math_force_eval (u);
+#endif
+	ax = xx.u + xx.u;		/* |x|<<1 */
+	if((ax >= 0x7ffull << 53)	/* overflow  */
+	|| (ax < 0x001ull << 53)) {	/* underflow */
+	  xx.d = math_opt_barrier (xx.d);
+	  xx.d += 0.0;
+	  math_force_eval (xx.d);	/* raise overflow/underflow flag */
 	  __set_errno (ERANGE);
 	}
-	if(hy<0x00100000) {
-	    double u = x*x;			/* underflow */
-	    math_force_eval (u);		/* raise underflow flag */
-	  __set_errno (ERANGE);
-	}
-	INSERT_WORDS(x,hx,lx);
 	return xx.d;
 }
 libm_alias_double (__nextafter, nextafter)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH/2nd version] Re: nextafter() about an order of magnitude slower than trivial implementation
       [not found]     ` <741b2c48-a754-51c5-cb72-a2f97795e30f@linaro.org>
  2021-08-19 17:57       ` [PATCH] Re: nextafter() about an order of magnitude slower than trivial implementation Stefan Kanthak
@ 2021-08-20  9:52       ` Stefan Kanthak
  2021-08-20 16:55         ` Joseph Myers
  1 sibling, 1 reply; 6+ messages in thread
From: Stefan Kanthak @ 2021-08-20  9:52 UTC (permalink / raw)
  To: Szabolcs Nagy, Adhemerval Zanella; +Cc: libc-help, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 3682 bytes --]

Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

> On 18/08/2021 14:11, Stefan Kanthak wrote:
>> Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>> 
>>> The 08/16/2021 18:03, Stefan Kanthak wrote:
>>>> Testing and benchmarking an exponential function that consumes
>>>> about 10ns/call on an AMD EPYC 7262, I noticed that nextafter()
>>>> itself is DEAD SLOW: it consumes about 8.5ns/call!
>>>>
>>>> The following (quick&dirty) implementation consumes 0.7ns/call,
>>>> i.e. is about an order of magnitude faster:
>>>
>>> correctly measuring latency on a modern x86_64 core:
>>>
>>> musl: 3.16 ns
>>> glibc: 5.68 ns
>>> your: 5.72 ns
> 
> Thanks for bring this up, if you want to contribute a patch

See attachment.
For completeness sake and to ease an initial review, I post the full
source here too: without ALTERNATE defined, it uses integer arithmetic
to the extent possible, like the original.
Choose whatever you like best or gives better throughput/latency.

Changes from the 1st version of the patch:
- leftover FP comparision substituted with integer operations;
- increment/decrement logic simplified/unified, avoiding a
  conditional branch.
I expect the compiler to detect the common clause
 "&& ((xx.s < 0) == (yy.s < 0)))" introduced with these changes.

Stefan

JFTR: .../sysdeps/ieee754/flt-32/s_nextafterf.c as well as
      .../sysdeps/i386/fpu/s_nextafterl.c will need the same changes.

--- .../math/s_nextafter.c ---
/* IEEE functions
 *      nextafter(x,y)
 *      return the next machine floating-point number of x in the
 *      direction toward y.
 *   Special cases:
 */

/* Ugly hack so that the aliasing works.  */
#define __nexttoward __internal___nexttoward
#define nexttoward __internal_nexttoward

#include <errno.h>
#include <math.h>
#include <math-barriers.h>
#include <math_private.h>
#include <float.h>
#include <libm-alias-double.h>

double __nextafter(double x, double y)
{
        union {
            double d;
            int64_t s;
            uint64_t u;
        } xx = {x}, yy = {y};

        uint64_t ax = xx.u + xx.u;      /* |x|<<1 */
        uint64_t ay = yy.u + yy.u;      /* |y|<<1 */

        if(ax > 0x7ffull << 53)         /* x is nan */
            return x;
        if(ay > 0x7ffull << 53)         /* y is nan */
            return y;
#ifdef ALTERNATE
        if(x == y) return y;            /* x=y, return y */
#else
        if((ax == ay)
        && ((xx.s < 0) == (yy.s < 0)))
            return y;
#endif
        if(ax == 0) {                   /* x == 0 */
            xx.u = yy.s < 0 ? 0x8000000000000001ull : 1ull;
            xx.d = math_opt_barrier (xx.d);
            xx.d += 0.0;
            math_force_eval (xx.d);     /* raise underflow flag */
            return xx.d;                /* return +-minsubnormal */
        }
#ifdef ALTERNATE
        if((x < y) == (x < 0.0))
            --xx.u;
        else
            ++xx.u;
#else
        if((ax < ay)
        && ((xx.s < 0) == (yy.s < 0)))
            ++xx.u;
        else
            --xx.u;
#endif
        ax = xx.u + xx.u;               /* |x|<<1 */
        if((ax >= 0x7ffull << 53)       /* overflow  */
        || (ax < 0x001ull << 53)) {     /* underflow */
            xx.d = math_opt_barrier (xx.d);
            xx.d += 0.0;
            math_force_eval (xx.d);     /* raise overflow/underflow flag */
            __set_errno (ERANGE);
        }
        return xx.d;
}
libm_alias_double (__nextafter, nextafter)
#ifdef NO_LONG_DOUBLE
strong_alias (__nextafter, __nexttowardl)
weak_alias (__nexttowardl, nexttowardl)
#undef __nexttoward
strong_alias (__nextafter, __nexttoward)
#undef nexttoward
weak_alias (__nextafter, nexttoward)
#endif
--- EOF ---

[-- Attachment #2: glibc.patch --]
[-- Type: application/octet-stream, Size: 3447 bytes --]

--- -/math/s_nextafter.c
+++ +/math/s_nextafter.c
@@ -1,19 +1,3 @@
-/* @(#)s_nextafter.c 5.1 93/09/24 */
-/*
- * ====================================================
- * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
- *
- * Developed at SunPro, a Sun Microsystems, Inc. business.
- * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice
- * is preserved.
- * ====================================================
- */
-
-#if defined(LIBM_SCCS) && !defined(lint)
-static char rcsid[] = "$NetBSD: s_nextafter.c,v 1.8 1995/05/10 20:47:58 jtc Exp $";
-#endif
-
 /* IEEE functions
  *	nextafter(x,y)
  *	return the next machine floating-point number of x in the
@@ -34,56 +18,55 @@
 
 double __nextafter(double x, double y)
 {
-	int32_t hx,hy,ix,iy;
-	uint32_t lx,ly;
+	union {
+	    double d;
+	    int64_t s;
+	    uint64_t u;
+	} xx = {x}, yy = {y};
 
-	EXTRACT_WORDS(hx,lx,x);
-	EXTRACT_WORDS(hy,ly,y);
-	ix = hx&0x7fffffff;		/* |x| */
-	iy = hy&0x7fffffff;		/* |y| */
+	uint64_t ax = xx.u + xx.u;	/* |x|<<1 */
+	uint64_t ay = yy.u + yy.u;	/* |y|<<1 */
 
-	if(((ix>=0x7ff00000)&&((ix-0x7ff00000)|lx)!=0) ||   /* x is nan */
-	   ((iy>=0x7ff00000)&&((iy-0x7ff00000)|ly)!=0))     /* y is nan */
-	   return x+y;
-	if(x==y) return y;		/* x=y, return y */
-	if((ix|lx)==0) {			/* x == 0 */
-	    double u;
-	    INSERT_WORDS(x,hy&0x80000000,1);	/* return +-minsubnormal */
-	    u = math_opt_barrier (x);
-	    u = u*u;
-	    math_force_eval (u);		/* raise underflow flag */
+	if(ax > 0x7ffull << 53)		/* x is nan */
 	    return x;
+	if(ay > 0x7ffull << 53)		/* y is nan */
+	    return y;
+#ifdef ALTERNATE
+	if(x == y) return y;		/* x=y, return y */
+#else
+	if((ax == ay)
+	&& ((xx.s < 0) == (yy.s < 0)))
+	    return y;
+#endif
+	if(ax == 0) {			/* x == 0 */
+	    xx.u = yy.s < 0 ? 0x8000000000000001ull : 1ull;
+	    xx.d = math_opt_barrier (xx.d);
+	    xx.d += 0.0;
+	    math_force_eval (xx.d);	/* raise underflow flag */
+	    return xx.d;		/* return +-minsubnormal */
 	}
-	if(hx>=0) {				/* x > 0 */
-	    if(hx>hy||((hx==hy)&&(lx>ly))) {	/* x > y, x -= ulp */
-		if(lx==0) hx -= 1;
-		lx -= 1;
-	    } else {				/* x < y, x += ulp */
-		lx += 1;
-		if(lx==0) hx += 1;
-	    }
-	} else {				/* x < 0 */
-	    if(hy>=0||hx>hy||((hx==hy)&&(lx>ly))){/* x < y, x -= ulp */
-		if(lx==0) hx -= 1;
-		lx -= 1;
-	    } else {				/* x > y, x += ulp */
-		lx += 1;
-		if(lx==0) hx += 1;
-	    }
+#ifdef ALTERNATE
+	if((x < y) == (x < 0.0))
+	    --xx.u;
+	else
+	    ++xx.u;
 	}
-	hy = hx&0x7ff00000;
-	if(hy>=0x7ff00000) {
-	  double u = x+x;	/* overflow  */
-	  math_force_eval (u);
-	  __set_errno (ERANGE);
+#else
+	if((ax < ay)
+	&& ((xx.s < 0) == (yy.s < 0)))
+	    ++xx.u;
+	else
+	    --xx.u;
+#endif
+	ax = xx.u + xx.u;		/* |x|<<1 */
+	if((ax >= 0x7ffull << 53)	/* overflow  */
+	|| (ax < 0x001ull << 53)) {	/* underflow */
+	  xx.d = math_opt_barrier (xx.d);
+	  xx.d += 0.0;
+	  math_force_eval (xx.d);	/* raise overflow/underflow flag */
+	  __set_errno (ERANGE);
 	}
-	if(hy<0x00100000) {
-	    double u = x*x;			/* underflow */
-	    math_force_eval (u);		/* raise underflow flag */
-	  __set_errno (ERANGE);
-	}
-	INSERT_WORDS(x,hx,lx);
-	return x;
+ 	return xx.d;
 }
 libm_alias_double (__nextafter, nextafter)
 #ifdef NO_LONG_DOUBLE

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH/2nd version] Re: nextafter() about an order of magnitude slower than trivial implementation
  2021-08-20  9:52       ` [PATCH/2nd version] " Stefan Kanthak
@ 2021-08-20 16:55         ` Joseph Myers
  2021-08-20 20:19           ` Stefan Kanthak
  0 siblings, 1 reply; 6+ messages in thread
From: Joseph Myers @ 2021-08-20 16:55 UTC (permalink / raw)
  To: Stefan Kanthak; +Cc: Szabolcs Nagy, Adhemerval Zanella, libc-help, libc-alpha

On Fri, 20 Aug 2021, Stefan Kanthak wrote:

>         if(ax > 0x7ffull << 53)         /* x is nan */
>             return x;
>         if(ay > 0x7ffull << 53)         /* y is nan */
>             return y;

How has this been tested?  I'd have expected it to fail the nextafter 
tests in the glibc testsuite (libm-test-nextafter.inc), because they 
verify that sNaN arguments produce a qNaN result with the "invalid" 
exception raised, and this looks like it would just return an sNaN 
argument unchanged.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH/2nd version] Re: nextafter() about an order of magnitude slower than trivial implementation
  2021-08-20 16:55         ` Joseph Myers
@ 2021-08-20 20:19           ` Stefan Kanthak
  2021-08-20 21:03             ` Joseph Myers
  2021-08-23 12:50             ` Adhemerval Zanella
  0 siblings, 2 replies; 6+ messages in thread
From: Stefan Kanthak @ 2021-08-20 20:19 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Szabolcs Nagy, Adhemerval Zanella, libc-help, libc-alpha

"Joseph Myers" <joseph@codesourcery.com> wrote:

> On Fri, 20 Aug 2021, Stefan Kanthak wrote:
> 
>>         if(ax > 0x7ffull << 53)         /* x is nan */
>>             return x;
>>         if(ay > 0x7ffull << 53)         /* y is nan */
>>             return y;
> 
> How has this been tested?  I'd have expected it to fail the nextafter 
> tests in the glibc testsuite (libm-test-nextafter.inc), because they 
> verify that sNaN arguments produce a qNaN result with the "invalid" 
> exception raised, and this looks like it would just return an sNaN 
> argument unchanged.

It doesn't look like it would, it REALLY does!
As I explicitly wrote, my changes avoid FP operations if possible.

<https://pubs.opengroup.org/onlinepubs/9699919799/functions/nextafter.html>

| If x or y is NaN, a NaN shall be returned.

I choose to return the argument NaN, what both POSIX and ISO C allow.
If my mind serves me well, one of the former editions even stated
"If an argument is NaN, this argument shall be returned". This may
but be influenced/spoiled by
<https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/nextafter-functions>

| If either x or y is a NAN, then the return value is one of the input NANs.


If that bothers you, change these 4 lines to either

         if(ax > 0x7ffull << 53)         /* x is nan */
             return 0.0 + x;
         if(ay > 0x7ffull << 53)         /* y is nan */
             return 0.0 + y;

(the addition has eventually to be guarded from optimisation) or

         if(ax > 0x7ffull << 53          /* x is nan */
         || ay > 0x7ffull << 53)         /* y is nan */
             return x + y;

whatever you like best.

Stefan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH/2nd version] Re: nextafter() about an order of magnitude slower than trivial implementation
  2021-08-20 20:19           ` Stefan Kanthak
@ 2021-08-20 21:03             ` Joseph Myers
  2021-08-23 12:50             ` Adhemerval Zanella
  1 sibling, 0 replies; 6+ messages in thread
From: Joseph Myers @ 2021-08-20 21:03 UTC (permalink / raw)
  To: Stefan Kanthak; +Cc: Szabolcs Nagy, libc-help, libc-alpha

On Fri, 20 Aug 2021, Stefan Kanthak wrote:

> I choose to return the argument NaN, what both POSIX and ISO C allow.

glibc follows C23 Annex F FE_SNANS_ALWAYS_SIGNAL rules for signaling NaN 
arguments.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH/2nd version] Re: nextafter() about an order of magnitude slower than trivial implementation
  2021-08-20 20:19           ` Stefan Kanthak
  2021-08-20 21:03             ` Joseph Myers
@ 2021-08-23 12:50             ` Adhemerval Zanella
  1 sibling, 0 replies; 6+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 12:50 UTC (permalink / raw)
  To: Stefan Kanthak, Joseph Myers; +Cc: Szabolcs Nagy, libc-help, libc-alpha



On 20/08/2021 17:19, Stefan Kanthak wrote:
> "Joseph Myers" <joseph@codesourcery.com> wrote:
> 
>> On Fri, 20 Aug 2021, Stefan Kanthak wrote:
>>
>>>         if(ax > 0x7ffull << 53)         /* x is nan */
>>>             return x;
>>>         if(ay > 0x7ffull << 53)         /* y is nan */
>>>             return y;
>>
>> How has this been tested?  I'd have expected it to fail the nextafter 
>> tests in the glibc testsuite (libm-test-nextafter.inc), because they 
>> verify that sNaN arguments produce a qNaN result with the "invalid" 
>> exception raised, and this looks like it would just return an sNaN 
>> argument unchanged.
> 
> It doesn't look like it would, it REALLY does!
> As I explicitly wrote, my changes avoid FP operations if possible.
> 
> <https://pubs.opengroup.org/onlinepubs/9699919799/functions/nextafter.html>
> 
> | If x or y is NaN, a NaN shall be returned.
> 
> I choose to return the argument NaN, what both POSIX and ISO C allow.
> If my mind serves me well, one of the former editions even stated
> "If an argument is NaN, this argument shall be returned". This may
> but be influenced/spoiled by
> <https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/nextafter-functions>
> 
> | If either x or y is a NAN, then the return value is one of the input NANs.
> 
> 
> If that bothers you, change these 4 lines to either
> 
>          if(ax > 0x7ffull << 53)         /* x is nan */
>              return 0.0 + x;
>          if(ay > 0x7ffull << 53)         /* y is nan */
>              return 0.0 + y;
> 
> (the addition has eventually to be guarded from optimisation) or
> 
>          if(ax > 0x7ffull << 53          /* x is nan */
>          || ay > 0x7ffull << 53)         /* y is nan */
>              return x + y;

Sorry, but sending half-baked patches where one will need to evaluate
and decide which is the best strategy is not the best way forward to
accept this change.

Please do follow the contributor checklist [1] and the suggestion I
gave you when you asked for input in the libc-help:

  * Check for regressions by testing your patch against glibc testsuite
    (make check).  As Joseph has put, your patch triggers a lot of
    regressions:

    $ grep ^FAIL math/subdir-tests.sum
    FAIL: math/bug-nextafter
    FAIL: math/test-double-nextafter
    FAIL: math/test-float32x-nextafter
    FAIL: math/test-float64-nextafter
    FAIL: math/test-misc

  * You removed the copyright since it is a new implementation, but
    the any code still requires a Copyright (even if is not assigned
    to FSF).  Check the item 3.4 on the Contributor checklist.

  * Also, since is a new implementation follow the usual style and
    code guidelines instead of using the current one used.

  * It would be good to add a benchmark to evaluate this change if
    the motivation for your change is performance.  Follow what
    other math benchmarks does on benchtests (exp2 or exp10 for
    instance).

  * Sending a patch with a define switch and stating "Choose whatever
    you like best" does not help in anything and put the burden or
    evaluating your change to the maintainers. 
 
[1] https://sourceware.org/glibc/wiki/Contribution%20checklist

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-08-23 12:50 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <D7077C5619F74049B8890508155291B2@H270>
     [not found] ` <20210818125119.GH25257@arm.com>
     [not found]   ` <BEF317B3F18B447F8BAE9B550BE5C237@H270>
     [not found]     ` <741b2c48-a754-51c5-cb72-a2f97795e30f@linaro.org>
2021-08-19 17:57       ` [PATCH] Re: nextafter() about an order of magnitude slower than trivial implementation Stefan Kanthak
2021-08-20  9:52       ` [PATCH/2nd version] " Stefan Kanthak
2021-08-20 16:55         ` Joseph Myers
2021-08-20 20:19           ` Stefan Kanthak
2021-08-20 21:03             ` Joseph Myers
2021-08-23 12:50             ` Adhemerval Zanella

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).