[PATCH v2 3/9] math: Simplify hypotf implementation

public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed

* [PATCH v2 3/9] math: Simplify hypotf implementation
@ 2021-10-28 14:27 Wilco Dijkstra
  0 siblings, 0 replies; 3+ messages in thread
From: Wilco Dijkstra @ 2021-10-28 14:27 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: 'GNU C Library'

Hi Adhemerval,

+  if ((isinf (x) || isinf (y))
+      && !issignaling (x) && !issignaling (y))
+    return INFINITY;
+  if (isnan (x) || isnan (y))
+    return x + y;

These checks are inefficient. It's partly because GCC uses FP comparisons for
isinf/isnan when integer arithmetic is faster on almost all targets and more
correct for signaling NaNs. However GCC doesn't seem to treat these as
unlikely and so there are 3 comparisons and 3 taken branches before entering
the main code. If you change it to:

if (!isfinite (x) || !isfinite (y))
{
  if ((isinf (x) || isinf (y))
      && !issignaling (x) && !issignaling (y))
    return INFINITY;
 return x + y;
}

then it uses only 2 comparisons and we fallthrough into the common case.
This gives a speedup of 10%.

Wilco

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH v2 0/9] Improve hypot()
@ 2021-10-25 11:57 Adhemerval Zanella
  2021-10-25 11:57 ` [PATCH v2 3/9] math: Simplify hypotf implementation Adhemerval Zanella
  0 siblings, 1 reply; 3+ messages in thread
From: Adhemerval Zanella @ 2021-10-25 11:57 UTC (permalink / raw)
  To: libc-alpha

This patchset uses a different algorithm for hypot()  based on the
'An Improved Algorithm for hypot(a,b)' by Carlos F. Borges [1].  This
method is also used by Julia language.  The motivation for this change
are:

  1. The new algorithm is more precise without minimum performance
     difference.

  2. It allows to consolidate the implementations, since it favor
     floating-point operations over integer ones (as done for powerpc).
     This might be a boost for some architectures as well.

The current hypot() implementation seems already to be bounded to a
maximum of 1 ulp of error, however the new proposed algorithm shows an
slight precision improvement by showing more correctly rounded results.

 With a random 1e9 inputs for different float format I see:

  - An improvement from 3427362 to 18457 results with 1 ulp of
    error for Binary64.
  - An improvement from 233442 to 1274 results with 1 ulp of
    error for Binary96 (x86_64).
  - An improvement from 453045 to 1294 results with 1 ulp of
    error for Binary96 (x86_64).

Also for the maximal known error master shows (in ulps, with
corresponding inputs), determined with [3]:

  binary32  0.500 0x1.3ac98p+67,-0x1.ba5ec2p+77
  binary64  0.987 -0x0.5a934b7eac967p-1022,-0x0.b5265a7e06b82p-1022
  binary96  0.981 0x1.73f339f61eda21dp-16384l,0x2.e45f9f9500877e2p-16384l
  binary128 0.985 -0x2.d8311789103b76133ea1d5bc38c4p-16384,-0x1.6d85492006d7dcc6cc52938684p-16384

With the new implementation:

  binary32  0.500 0x1.3ac98p+67,-0x1.ba5ec2p+77 [same]
  binary64  0.792 0x0.603e52daf0bfdp-1022,-0x0.a622d0a9a433bp-1022
  binary96  0.584 -0x2.97b86706043d619p+7240l,0x1.8256bdd12d2e163ep+7240l
  binary128 0.749 0x2.2d5faf4036d6e68566f01054612p-8192,0x3.5738e8e2505f5d1fc2973716f05p-8192

Along with the newer implementation, the last patch also optimizes the
call with similar work done for other math functions by removing the
POSIX error handling (which is a big performance boost in some cases
and architectures).

I have adapted the dbl-64, ldbl-96, and ldbl-128, the flt-32 is not
required since it calls the dbl-64 one.  I have not adapated ldbl-128ibm
since the format has a lot of caveats and IBM aims to move to ldbl-128.

[1] https://arxiv.org/pdf/1904.09481.pdf
[2] https://github.com/JuliaLang/julia/commit/4a046009a3362ab5e17d369641dbbc9657eb680c
[3] https://gitlab.inria.fr/zimmerma/math_accuracy/-/blob/master/binary64/check_sample2.c
---
Changes from v1:
  - Improved benchmarks to have the exponents of the two arguments close 
    together.
  - Fixed typos.
  - Fixed i386 C implementation.
---
Adhemerval Zanella (9):
  benchtests: Make hypot input random
  benchtests: Add hypotf
  math: Simplify hypotf implementation
  math: Use an improved algorithm for hypot (dbl-64)
  math: Use an improved algorithm for hypotl (ldbl-96)
  math: Use an improved algorithm for hypotl (ldbl-128)
  math: Remove powerpc e_hypot
  i386: Move hypot implementation to C
  math: Remove the error handling wrapper from hypot and hypotf

 benchtests/Makefile                           |    2 +-
 benchtests/hypot-inputs                       | 1015 ++++++++++++++++-
 benchtests/hypotf-inputs                      | 1007 ++++++++++++++++
 math/Versions                                 |    2 +
 math/w_hypot.c                                |    8 +
 math/w_hypot_compat.c                         |   13 +-
 math/w_hypotf.c                               |    8 +
 math/w_hypotf_compat.c                        |    6 +-
 sysdeps/i386/fpu/e_hypot.S                    |   75 --
 sysdeps/i386/fpu/e_hypot.c                    |   56 +
 sysdeps/i386/fpu/e_hypotf.S                   |   64 --
 sysdeps/ieee754/dbl-64/e_hypot.c              |  247 ++--
 sysdeps/ieee754/dbl-64/w_hypot.c              |    1 +
 sysdeps/ieee754/flt-32/e_hypotf.c             |   72 +-
 sysdeps/ieee754/flt-32/w_hypotf.c             |    1 +
 sysdeps/ieee754/ldbl-128/e_hypotl.c           |  222 ++--
 sysdeps/ieee754/ldbl-96/e_hypotl.c            |  227 ++--
 sysdeps/mach/hurd/i386/libm.abilist           |    2 +
 sysdeps/powerpc/fpu/e_hypot.c                 |   87 --
 sysdeps/powerpc/fpu/e_hypotf.c                |   78 --
 .../powerpc32/power4/fpu/multiarch/Makefile   |    5 +-
 .../power4/fpu/multiarch/e_hypot-power7.c     |   23 -
 .../power4/fpu/multiarch/e_hypot-ppc32.c      |   23 -
 .../powerpc32/power4/fpu/multiarch/e_hypot.c  |   33 -
 .../power4/fpu/multiarch/e_hypotf-power7.c    |   23 -
 .../power4/fpu/multiarch/e_hypotf-ppc32.c     |   23 -
 .../powerpc32/power4/fpu/multiarch/e_hypotf.c |   33 -
 sysdeps/unix/sysv/linux/aarch64/libm.abilist  |    2 +
 sysdeps/unix/sysv/linux/alpha/libm.abilist    |    2 +
 sysdeps/unix/sysv/linux/arm/be/libm.abilist   |    2 +
 sysdeps/unix/sysv/linux/arm/le/libm.abilist   |    2 +
 sysdeps/unix/sysv/linux/hppa/libm.abilist     |    2 +
 sysdeps/unix/sysv/linux/i386/libm.abilist     |    2 +
 .../sysv/linux/m68k/coldfire/libm.abilist     |    2 +
 .../unix/sysv/linux/m68k/m680x0/libm.abilist  |    2 +
 .../sysv/linux/microblaze/be/libm.abilist     |    2 +
 .../sysv/linux/microblaze/le/libm.abilist     |    2 +
 .../unix/sysv/linux/mips/mips32/libm.abilist  |    2 +
 .../unix/sysv/linux/mips/mips64/libm.abilist  |    2 +
 sysdeps/unix/sysv/linux/nios2/libm.abilist    |    2 +
 .../linux/powerpc/powerpc32/fpu/libm.abilist  |    2 +
 .../powerpc/powerpc32/nofpu/libm.abilist      |    2 +
 .../linux/powerpc/powerpc64/be/libm.abilist   |    2 +
 .../linux/powerpc/powerpc64/le/libm.abilist   |    2 +
 .../unix/sysv/linux/s390/s390-32/libm.abilist |    2 +
 .../unix/sysv/linux/s390/s390-64/libm.abilist |    2 +
 sysdeps/unix/sysv/linux/sh/be/libm.abilist    |    2 +
 sysdeps/unix/sysv/linux/sh/le/libm.abilist    |    2 +
 .../sysv/linux/sparc/sparc32/libm.abilist     |    2 +
 .../sysv/linux/sparc/sparc64/libm.abilist     |    2 +
 .../unix/sysv/linux/x86_64/64/libm.abilist    |    2 +
 .../unix/sysv/linux/x86_64/x32/libm.abilist   |    2 +
 52 files changed, 2487 insertions(+), 919 deletions(-)
 create mode 100644 benchtests/hypotf-inputs
 create mode 100644 math/w_hypot.c
 create mode 100644 math/w_hypotf.c
 delete mode 100644 sysdeps/i386/fpu/e_hypot.S
 create mode 100644 sysdeps/i386/fpu/e_hypot.c
 delete mode 100644 sysdeps/i386/fpu/e_hypotf.S
 create mode 100644 sysdeps/ieee754/dbl-64/w_hypot.c
 create mode 100644 sysdeps/ieee754/flt-32/w_hypotf.c
 delete mode 100644 sysdeps/powerpc/fpu/e_hypot.c
 delete mode 100644 sysdeps/powerpc/fpu/e_hypotf.c
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypot-power7.c
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypot-ppc32.c
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypot.c
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypotf-power7.c
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypotf-ppc32.c
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypotf.c

-- 
2.32.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH v2 3/9] math: Simplify hypotf implementation
  2021-10-25 11:57 [PATCH v2 0/9] Improve hypot() Adhemerval Zanella
@ 2021-10-25 11:57 ` Adhemerval Zanella
  0 siblings, 0 replies; 3+ messages in thread
From: Adhemerval Zanella @ 2021-10-25 11:57 UTC (permalink / raw)
  To: libc-alpha

Use isnan()/isinf() instead of GET_FLOAT_WORD and integer operations.
There is also no need to check for 0.0.

The file Copyright is also changed to use  GPL, the implementation was
completely changed by 7c10fd3515f to use double precision instead of
scaling and this change removes all the GET_FLOAT_WORD usage.

Checked on x86_64-linux-gnu.
---
 sysdeps/ieee754/flt-32/e_hypotf.c | 57 +++++++++++++------------------
 1 file changed, 23 insertions(+), 34 deletions(-)

diff --git a/sysdeps/ieee754/flt-32/e_hypotf.c b/sysdeps/ieee754/flt-32/e_hypotf.c
index e770947dc1..6495a91cd4 100644
--- a/sysdeps/ieee754/flt-32/e_hypotf.c
+++ b/sysdeps/ieee754/flt-32/e_hypotf.c
@@ -1,46 +1,35 @@
-/* e_hypotf.c -- float version of e_hypot.c.
- */
+/* Euclidean distance function.  Float/Binary32 version.
+   Copyright (C) 2012-2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
 
-/*
- * ====================================================
- * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
- *
- * Developed at SunPro, a Sun Microsystems, Inc. business.
- * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice
- * is preserved.
- * ====================================================
- */
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
 
 #include <math.h>
 #include <math_private.h>
 #include <libm-alias-finite.h>
 
 float
-__ieee754_hypotf(float x, float y)
+__ieee754_hypotf (float x, float y)
 {
-	double d_x, d_y;
-	int32_t ha, hb;
-
-	GET_FLOAT_WORD(ha,x);
-	ha &= 0x7fffffff;
-	GET_FLOAT_WORD(hb,y);
-	hb &= 0x7fffffff;
-	if (ha == 0x7f800000 && !issignaling (y))
-	  return fabsf(x);
-	else if (hb == 0x7f800000 && !issignaling (x))
-	  return fabsf(y);
-	else if (ha > 0x7f800000 || hb > 0x7f800000)
-	  return fabsf(x) * fabsf(y);
-	else if (ha == 0)
-	  return fabsf(y);
-	else if (hb == 0)
-	  return fabsf(x);
-
-	d_x = (double) x;
-	d_y = (double) y;
+  if ((isinf (x) || isinf (y))
+      && !issignaling (x) && !issignaling (y))
+    return INFINITY;
+  if (isnan (x) || isnan (y))
+    return x + y;
 
-	return (float) sqrt(d_x * d_x + d_y * d_y);
+  return sqrt ((double) x * (double) x + (double) y * (double) y);
 }
 #ifndef __ieee754_hypotf
 libm_alias_finite (__ieee754_hypotf, __hypotf)
-- 
2.32.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH v2 0/9] Improve hypot()
@ 2021-10-22 20:48 Adhemerval Zanella
  2021-10-22 20:48 ` [PATCH v2 3/9] math: Simplify hypotf implementation Adhemerval Zanella
  0 siblings, 1 reply; 3+ messages in thread
From: Adhemerval Zanella @ 2021-10-22 20:48 UTC (permalink / raw)
  To: libc-alpha

This patchset uses a different algorithm for hypot()  based on the
'An Improved Algorithm for hypot(a,b)' by Carlos F. Borges [1].  This
method is also used by Julia language.  The motivation for this change
are:

  1. The new algorithm is more precise without minimum performance
     difference.

  2. It allows to consolidate the implementations, since it favor
     floating-point operations over integer ones (as done for powerpc).
     This might be a boost for some architectures as well.

The current hypot() implementation seems already to be bounded to a
maximum of 1 ulp of error, however the new proposed algorithm shows an
slight precision improvement by showing more correctly rounded results.

 With a random 1e9 inputs for different float format I see:

  - An improvement from 3427362 to 18457 results with 1 ulp of
    error for Binary64.
  - An improvement from 233442 to 1274 results with 1 ulp of
    error for Binary96 (x86_64).
  - An improvement from 453045 to 1294 results with 1 ulp of
    error for Binary96 (x86_64).

Also for the maximal known error master shows (in ulps, with
corresponding inputs), determined with [3]:

  binary32  0.500 0x1.3ac98p+67,-0x1.ba5ec2p+77
  binary64  0.987 -0x0.5a934b7eac967p-1022,-0x0.b5265a7e06b82p-1022
  binary96  0.981 0x1.73f339f61eda21dp-16384l,0x2.e45f9f9500877e2p-16384l
  binary128 0.985 -0x2.d8311789103b76133ea1d5bc38c4p-16384,-0x1.6d85492006d7dcc6cc52938684p-16384

With the new implementation:

  binary32  0.500 0x1.3ac98p+67,-0x1.ba5ec2p+77 [same]
  binary64  0.792 0x0.603e52daf0bfdp-1022,-0x0.a622d0a9a433bp-1022
  binary96  0.584 -0x2.97b86706043d619p+7240l,0x1.8256bdd12d2e163ep+7240l
  binary128 0.749 0x2.2d5faf4036d6e68566f01054612p-8192,0x3.5738e8e2505f5d1fc2973716f05p-8192

Along with the newer implementation, the last patch also optimizes the
call with similar work done for other math functions by removing the
POSIX error handling (which is a big performance boost in some cases
and architectures).

I have adapted the dbl-64, ldbl-96, and ldbl-128, the flt-32 is not
required since it calls the dbl-64 one.  I have not adapated ldbl-128ibm
since the format has a lot of caveats and IBM aims to move to ldbl-128.

[1] https://arxiv.org/pdf/1904.09481.pdf
[2] https://github.com/JuliaLang/julia/commit/4a046009a3362ab5e17d369641dbbc9657eb680c
[3] https://gitlab.inria.fr/zimmerma/math_accuracy/-/blob/master/binary64/check_sample2.c

---
Changes from v1:
  - Improved benchmarks to have the exponents of the two arguments close 
    together.
  - Fixed typos.
  - Fixed i386 C implementation.

---
Adhemerval Zanella (9):
  benchtests: Make hypot input random
  benchtests: Add hypotf
  math: Simplify hypotf implementation
  math: Use an improved algorithm for hypot (dbl-64)
  math: Use an improved algorithm for hypotl (ldbl-96)
  math: Use an improved algorithm for hypotl (ldbl-128)
  math: Remove powerpc e_hypot
  i386: Move hypot implementation to C
  math: Remove the error handling wrapper from hypot and hypotf

 benchtests/Makefile                           |    2 +-
 benchtests/hypot-inputs                       | 1015 ++++++++++++++++-
 benchtests/hypotf-inputs                      | 1007 ++++++++++++++++
 math/Versions                                 |    2 +
 math/w_hypot.c                                |    8 +
 math/w_hypot_compat.c                         |   13 +-
 math/w_hypotf.c                               |    8 +
 math/w_hypotf_compat.c                        |    6 +-
 sysdeps/i386/fpu/e_hypot.S                    |   75 --
 sysdeps/i386/fpu/e_hypot.c                    |   56 +
 sysdeps/i386/fpu/e_hypotf.S                   |   64 --
 sysdeps/ieee754/dbl-64/e_hypot.c              |  247 ++--
 sysdeps/ieee754/dbl-64/w_hypot.c              |    1 +
 sysdeps/ieee754/flt-32/e_hypotf.c             |   72 +-
 sysdeps/ieee754/flt-32/w_hypotf.c             |    1 +
 sysdeps/ieee754/ldbl-128/e_hypotl.c           |  222 ++--
 sysdeps/ieee754/ldbl-96/e_hypotl.c            |  227 ++--
 sysdeps/mach/hurd/i386/libm.abilist           |    2 +
 sysdeps/powerpc/fpu/e_hypot.c                 |   87 --
 sysdeps/powerpc/fpu/e_hypotf.c                |   78 --
 .../powerpc32/power4/fpu/multiarch/Makefile   |    5 +-
 .../power4/fpu/multiarch/e_hypot-power7.c     |   23 -
 .../power4/fpu/multiarch/e_hypot-ppc32.c      |   23 -
 .../powerpc32/power4/fpu/multiarch/e_hypot.c  |   33 -
 .../power4/fpu/multiarch/e_hypotf-power7.c    |   23 -
 .../power4/fpu/multiarch/e_hypotf-ppc32.c     |   23 -
 .../powerpc32/power4/fpu/multiarch/e_hypotf.c |   33 -
 sysdeps/unix/sysv/linux/aarch64/libm.abilist  |    2 +
 sysdeps/unix/sysv/linux/alpha/libm.abilist    |    2 +
 sysdeps/unix/sysv/linux/arm/be/libm.abilist   |    2 +
 sysdeps/unix/sysv/linux/arm/le/libm.abilist   |    2 +
 sysdeps/unix/sysv/linux/hppa/libm.abilist     |    2 +
 sysdeps/unix/sysv/linux/i386/libm.abilist     |    2 +
 .../sysv/linux/m68k/coldfire/libm.abilist     |    2 +
 .../unix/sysv/linux/m68k/m680x0/libm.abilist  |    2 +
 .../sysv/linux/microblaze/be/libm.abilist     |    2 +
 .../sysv/linux/microblaze/le/libm.abilist     |    2 +
 .../unix/sysv/linux/mips/mips32/libm.abilist  |    2 +
 .../unix/sysv/linux/mips/mips64/libm.abilist  |    2 +
 sysdeps/unix/sysv/linux/nios2/libm.abilist    |    2 +
 .../linux/powerpc/powerpc32/fpu/libm.abilist  |    2 +
 .../powerpc/powerpc32/nofpu/libm.abilist      |    2 +
 .../linux/powerpc/powerpc64/be/libm.abilist   |    2 +
 .../linux/powerpc/powerpc64/le/libm.abilist   |    2 +
 .../unix/sysv/linux/s390/s390-32/libm.abilist |    2 +
 .../unix/sysv/linux/s390/s390-64/libm.abilist |    2 +
 sysdeps/unix/sysv/linux/sh/be/libm.abilist    |    2 +
 sysdeps/unix/sysv/linux/sh/le/libm.abilist    |    2 +
 .../sysv/linux/sparc/sparc32/libm.abilist     |    2 +
 .../sysv/linux/sparc/sparc64/libm.abilist     |    2 +
 .../unix/sysv/linux/x86_64/64/libm.abilist    |    2 +
 .../unix/sysv/linux/x86_64/x32/libm.abilist   |    2 +
 52 files changed, 2487 insertions(+), 919 deletions(-)
 create mode 100644 benchtests/hypotf-inputs
 create mode 100644 math/w_hypot.c
 create mode 100644 math/w_hypotf.c
 delete mode 100644 sysdeps/i386/fpu/e_hypot.S
 create mode 100644 sysdeps/i386/fpu/e_hypot.c
 delete mode 100644 sysdeps/i386/fpu/e_hypotf.S
 create mode 100644 sysdeps/ieee754/dbl-64/w_hypot.c
 create mode 100644 sysdeps/ieee754/flt-32/w_hypotf.c
 delete mode 100644 sysdeps/powerpc/fpu/e_hypot.c
 delete mode 100644 sysdeps/powerpc/fpu/e_hypotf.c
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypot-power7.c
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypot-ppc32.c
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypot.c
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypotf-power7.c
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypotf-ppc32.c
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypotf.c

-- 
2.32.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH v2 3/9] math: Simplify hypotf implementation
  2021-10-22 20:48 [PATCH v2 0/9] Improve hypot() Adhemerval Zanella
@ 2021-10-22 20:48 ` Adhemerval Zanella
  0 siblings, 0 replies; 3+ messages in thread
From: Adhemerval Zanella @ 2021-10-22 20:48 UTC (permalink / raw)
  To: libc-alpha

Use isnan()/isinf() instead of GET_FLOAT_WORD and integer operations.
There is also no need to check for 0.0.

The file Copyright is also changed to use  GPL, the implementation was
completely changed by 7c10fd3515f to use double precision instead of
scaling and this change removes all the GET_FLOAT_WORD usage.

Checked on x86_64-linux-gnu.
---
 sysdeps/ieee754/flt-32/e_hypotf.c | 57 +++++++++++++------------------
 1 file changed, 23 insertions(+), 34 deletions(-)

diff --git a/sysdeps/ieee754/flt-32/e_hypotf.c b/sysdeps/ieee754/flt-32/e_hypotf.c
index e770947dc1..6495a91cd4 100644
--- a/sysdeps/ieee754/flt-32/e_hypotf.c
+++ b/sysdeps/ieee754/flt-32/e_hypotf.c
@@ -1,46 +1,35 @@
-/* e_hypotf.c -- float version of e_hypot.c.
- */
+/* Euclidean distance function.  Float/Binary32 version.
+   Copyright (C) 2012-2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
 
-/*
- * ====================================================
- * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
- *
- * Developed at SunPro, a Sun Microsystems, Inc. business.
- * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice
- * is preserved.
- * ====================================================
- */
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
 
 #include <math.h>
 #include <math_private.h>
 #include <libm-alias-finite.h>
 
 float
-__ieee754_hypotf(float x, float y)
+__ieee754_hypotf (float x, float y)
 {
-	double d_x, d_y;
-	int32_t ha, hb;
-
-	GET_FLOAT_WORD(ha,x);
-	ha &= 0x7fffffff;
-	GET_FLOAT_WORD(hb,y);
-	hb &= 0x7fffffff;
-	if (ha == 0x7f800000 && !issignaling (y))
-	  return fabsf(x);
-	else if (hb == 0x7f800000 && !issignaling (x))
-	  return fabsf(y);
-	else if (ha > 0x7f800000 || hb > 0x7f800000)
-	  return fabsf(x) * fabsf(y);
-	else if (ha == 0)
-	  return fabsf(y);
-	else if (hb == 0)
-	  return fabsf(x);
-
-	d_x = (double) x;
-	d_y = (double) y;
+  if ((isinf (x) || isinf (y))
+      && !issignaling (x) && !issignaling (y))
+    return INFINITY;
+  if (isnan (x) || isnan (y))
+    return x + y;
 
-	return (float) sqrt(d_x * d_x + d_y * d_y);
+  return sqrt ((double) x * (double) x + (double) y * (double) y);
 }
 #ifndef __ieee754_hypotf
 libm_alias_finite (__ieee754_hypotf, __hypotf)
-- 
2.32.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-10-28 14:27 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-28 14:27 [PATCH v2 3/9] math: Simplify hypotf implementation Wilco Dijkstra
  -- strict thread matches above, loose matches on Subject: below --
2021-10-25 11:57 [PATCH v2 0/9] Improve hypot() Adhemerval Zanella
2021-10-25 11:57 ` [PATCH v2 3/9] math: Simplify hypotf implementation Adhemerval Zanella
2021-10-22 20:48 [PATCH v2 0/9] Improve hypot() Adhemerval Zanella
2021-10-22 20:48 ` [PATCH v2 3/9] math: Simplify hypotf implementation Adhemerval Zanella

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).