public inbox for libc-alpha@sourceware.org
* [PATCH 0/5] Optimized expf, exp2f, logf, log2f and powf
@ 2017-09-29 11:00 Szabolcs Nagy
  2017-09-29 11:03 ` [PATCH 1/5 v3] New generic log2f Szabolcs Nagy
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Szabolcs Nagy @ 2017-09-29 11:00 UTC
  To: GNU C Library; +Cc: nd

Updated nsz/math, committed the aarch64 ulp update
and the logf changes, and reordered the remaining patches:

Szabolcs Nagy (5):
  New generic log2f
  New generic powf
  New symbol version for logf, log2f and powf without SVID compat
  Do not wrap expf and exp2f
  Do not wrap logf, log2f and powf

 NEWS                                               |   2 +-
 math/Makefile                                      |   3 +-
 math/Versions                                      |   2 +-
 math/w_log2f.c                                     |   7 +
 math/w_log2f_compat.c                              |   6 +-
 math/w_logf.c                                      |   7 +
 math/w_logf_compat.c                               |   6 +-
 math/w_powf.c                                      |   7 +
 math/w_powf_compat.c                               |   6 +-
 sysdeps/i386/fpu/e_log2f_data.c                    |   1 +
 sysdeps/i386/fpu/e_powf_log2_data.c                |   1 +
 sysdeps/i386/fpu/w_exp2f.c                         |   1 +
 sysdeps/i386/fpu/w_expf.c                          |   1 +
 sysdeps/i386/fpu/w_log2f.c                         |   1 +
 sysdeps/i386/fpu/w_logf.c                          |   1 +
 sysdeps/i386/fpu/w_powf.c                          |   1 +
 sysdeps/i386/i686/fpu/multiarch/w_expf.c           |   1 +
 sysdeps/ia64/fpu/e_log2f.S                         |  10 +-
 sysdeps/ia64/fpu/e_log2f_data.c                    |   1 +
 sysdeps/ia64/fpu/e_logf.S                          |   6 +
 sysdeps/ia64/fpu/e_powf.S                          |  10 +-
 sysdeps/ia64/fpu/e_powf_log2_data.c                |   1 +
 sysdeps/ieee754/flt-32/e_exp2f.c                   |   9 +-
 sysdeps/ieee754/flt-32/e_expf.c                    |  16 +-
 sysdeps/ieee754/flt-32/e_log2f.c                   | 155 ++++----
 sysdeps/ieee754/flt-32/e_log2f_data.c              |  44 +++
 sysdeps/ieee754/flt-32/e_logf.c                    |   9 +-
 sysdeps/ieee754/flt-32/e_powf.c                    | 395 ++++++++++-----------
 sysdeps/ieee754/flt-32/e_powf_log2_data.c          |  45 +++
 sysdeps/ieee754/flt-32/math_config.h               |  38 ++
 sysdeps/ieee754/flt-32/w_exp2f.c                   |   1 +
 sysdeps/ieee754/flt-32/w_expf.c                    |   1 +
 sysdeps/ieee754/flt-32/w_log2f.c                   |   1 +
 sysdeps/ieee754/flt-32/w_logf.c                    |   1 +
 sysdeps/ieee754/flt-32/w_powf.c                    |   1 +
 sysdeps/m68k/m680x0/fpu/e_log2f_data.c             |   1 +
 sysdeps/m68k/m680x0/fpu/e_powf_log2_data.c         |   1 +
 sysdeps/m68k/m680x0/fpu/w_exp2f.c                  |   1 +
 sysdeps/m68k/m680x0/fpu/w_expf.c                   |   1 +
 sysdeps/m68k/m680x0/fpu/w_log2f.c                  |   1 +
 sysdeps/m68k/m680x0/fpu/w_logf.c                   |   1 +
 sysdeps/m68k/m680x0/fpu/w_powf.c                   |   1 +
 .../powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c |   5 +-
 sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c   |   1 +
 sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c      |   1 +
 sysdeps/unix/sysv/linux/aarch64/libm.abilist       |   3 +
 sysdeps/unix/sysv/linux/alpha/libm.abilist         |   3 +
 sysdeps/unix/sysv/linux/arm/libm.abilist           |   3 +
 sysdeps/unix/sysv/linux/hppa/libm.abilist          |   3 +
 sysdeps/unix/sysv/linux/i386/libm.abilist          |   3 +
 sysdeps/unix/sysv/linux/ia64/libm.abilist          |   3 +
 sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist |   3 +
 sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist   |   3 +
 sysdeps/unix/sysv/linux/microblaze/libm.abilist    |   3 +
 sysdeps/unix/sysv/linux/mips/mips32/libm.abilist   |   3 +
 sysdeps/unix/sysv/linux/mips/mips64/libm.abilist   |   3 +
 sysdeps/unix/sysv/linux/nios2/libm.abilist         |   3 +
 .../sysv/linux/powerpc/powerpc32/fpu/libm.abilist  |   3 +
 .../linux/powerpc/powerpc32/nofpu/libm.abilist     |   3 +
 .../sysv/linux/powerpc/powerpc64/libm-le.abilist   |   3 +
 .../unix/sysv/linux/powerpc/powerpc64/libm.abilist |   3 +
 sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist  |   3 +
 sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist  |   3 +
 sysdeps/unix/sysv/linux/sh/libm.abilist            |   3 +
 sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist |   3 +
 sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist |   3 +
 .../sysv/linux/tile/tilegx/tilegx32/libm.abilist   |   3 +
 .../sysv/linux/tile/tilegx/tilegx64/libm.abilist   |   3 +
 sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist  |   3 +
 sysdeps/unix/sysv/linux/x86_64/64/libm.abilist     |   3 +
 sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist    |   3 +
 sysdeps/x86_64/fpu/w_expf.c                        |   1 +
 72 files changed, 591 insertions(+), 300 deletions(-)
 create mode 100644 math/w_log2f.c
 create mode 100644 math/w_logf.c
 create mode 100644 math/w_powf.c
 create mode 100644 sysdeps/i386/fpu/e_log2f_data.c
 create mode 100644 sysdeps/i386/fpu/e_powf_log2_data.c
 create mode 100644 sysdeps/i386/fpu/w_exp2f.c
 create mode 100644 sysdeps/i386/fpu/w_expf.c
 create mode 100644 sysdeps/i386/fpu/w_log2f.c
 create mode 100644 sysdeps/i386/fpu/w_logf.c
 create mode 100644 sysdeps/i386/fpu/w_powf.c
 create mode 100644 sysdeps/i386/i686/fpu/multiarch/w_expf.c
 create mode 100644 sysdeps/ia64/fpu/e_log2f_data.c
 create mode 100644 sysdeps/ia64/fpu/e_powf_log2_data.c
 create mode 100644 sysdeps/ieee754/flt-32/e_log2f_data.c
 create mode 100644 sysdeps/ieee754/flt-32/e_powf_log2_data.c
 create mode 100644 sysdeps/ieee754/flt-32/w_exp2f.c
 create mode 100644 sysdeps/ieee754/flt-32/w_expf.c
 create mode 100644 sysdeps/ieee754/flt-32/w_log2f.c
 create mode 100644 sysdeps/ieee754/flt-32/w_logf.c
 create mode 100644 sysdeps/ieee754/flt-32/w_powf.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/e_log2f_data.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/e_powf_log2_data.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/w_exp2f.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/w_expf.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/w_log2f.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/w_logf.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/w_powf.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c
 create mode 100644 sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c
 create mode 100644 sysdeps/x86_64/fpu/w_expf.c

-- 
2.11.0


* [PATCH 1/5 v3] New generic log2f
  2017-09-29 11:00 [PATCH 0/5] Optimized expf, exp2f, logf, log2f and powf Szabolcs Nagy
@ 2017-09-29 11:03 ` Szabolcs Nagy
  2017-09-29 16:06   ` Joseph Myers
  2017-09-29 11:05 ` [PATCH 2/5 v3] New generic powf Szabolcs Nagy
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Szabolcs Nagy @ 2017-09-29 11:03 UTC
  To: GNU C Library; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 19 bytes --]

v3:
- NEWS entry.


[-- Attachment #2: 0001-New-generic-log2f.patch --]
[-- Type: text/x-patch, Size: 11218 bytes --]

From eb38f71b3504ac132b5c65ba23a15ddb9d02a435 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date: Mon, 4 Sep 2017 17:53:47 +0100
Subject: [PATCH 1/5] New generic log2f

Similar to the new logf: double-precision arithmetic and a small
lookup table are used. The argument reduction step is the same as in
the new logf.
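
The argument reduction can be sketched in isolation roughly as follows.
This is a minimal sketch that assumes a positive normal input; `reduce`
and `struct reduced` are hypothetical names used only for illustration,
while OFF and the 4 table bits mirror the values in the patch. The cast
and the right shift of a negative value rely on implementation-defined
behavior, as the patch itself does.

```c
#include <stdint.h>
#include <string.h>

#define TABLE_BITS 4            /* LOG2F_TABLE_BITS in the patch.  */
#define N (1 << TABLE_BITS)
#define OFF 0x3f330000u         /* Bit pattern of a float near 0.699.  */

/* Reinterpret the bits of a float as an unsigned integer and back.  */
static inline uint32_t
asuint (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

static inline float
asfloat (uint32_t u)
{
  float f;
  memcpy (&f, &u, sizeof f);
  return f;
}

/* Hypothetical result type: x == 2^k * z, with z in subinterval i.  */
struct reduced
{
  int k;     /* Power-of-two exponent.  */
  int i;     /* Lookup table index, 0 .. N-1.  */
  float z;   /* Reduced argument in [OFF, 2*OFF), computed exactly.  */
};

/* Assumes x is a positive normal float (no zero, subnormal, inf, nan).  */
static struct reduced
reduce (float x)
{
  struct reduced r;
  uint32_t ix = asuint (x);
  uint32_t tmp = ix - OFF;
  /* The top TABLE_BITS bits below the exponent select the subinterval.  */
  r.i = (tmp >> (23 - TABLE_BITS)) % N;
  /* Keep only the exponent part of the offset and remove it from x.  */
  uint32_t top = tmp & 0xff800000u;
  r.z = asfloat (ix - top);
  /* Arithmetic shift gives the (possibly negative) exponent k.  */
  r.k = (int32_t) tmp >> 23;
  return r;
}
```

With these definitions, reduce (8.0f) yields k = 3 and z = 1.0f, and the
index 9 it selects is the table entry whose values are { 0x1p+0, 0x0p+0 }
in e_log2f_data.c.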

without wrapper on aarch64:
log2f reciprocal-throughput: 2.3x faster
log2f latency: 2.1x faster
old worst case error: 1.72 ulp
new worst case error: 0.75 ulp
aarch64 .text size: -252 bytes
aarch64 .rodata size: +244 bytes

2017-09-19  Szabolcs Nagy  <szabolcs.nagy@arm.com>

	* math/Makefile (type-float-routines): Add e_log2f_data.
	* sysdeps/ieee754/flt-32/e_log2f.c: New implementation.
	* sysdeps/ieee754/flt-32/e_log2f_data.c: New file.
	* sysdeps/ieee754/flt-32/math_config.h (__log2f_data): Define.
	(LOG2F_TABLE_BITS, LOG2F_POLY_ORDER): Define.
	* sysdeps/i386/fpu/e_log2f_data.c: New file.
	* sysdeps/ia64/fpu/e_log2f_data.c: New file.
	* sysdeps/m68k/m680x0/fpu/e_log2f_data.c: New file.
---
 NEWS                                   |   2 +-
 math/Makefile                          |   3 +-
 sysdeps/i386/fpu/e_log2f_data.c        |   1 +
 sysdeps/ia64/fpu/e_log2f_data.c        |   1 +
 sysdeps/ieee754/flt-32/e_log2f.c       | 148 +++++++++++++++++----------------
 sysdeps/ieee754/flt-32/e_log2f_data.c  |  44 ++++++++++
 sysdeps/ieee754/flt-32/math_config.h   |  11 +++
 sysdeps/m68k/m680x0/fpu/e_log2f_data.c |   1 +
 8 files changed, 136 insertions(+), 75 deletions(-)
 create mode 100644 sysdeps/i386/fpu/e_log2f_data.c
 create mode 100644 sysdeps/ia64/fpu/e_log2f_data.c
 create mode 100644 sysdeps/ieee754/flt-32/e_log2f_data.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/e_log2f_data.c

diff --git a/NEWS b/NEWS
index 05850f8288..5e88c54a6b 100644
--- a/NEWS
+++ b/NEWS
@@ -14,7 +14,7 @@ Major new features:
 
 * Optimized x86-64 trunc and truncf for processors with SSE4.1.
 
-* Optimized generic expf, exp2f, logf.
+* Optimized generic expf, exp2f, logf, log2f.
 
 * In order to support faster and safer process termination the malloc API
   family of functions will no longer print a failure address and stack
diff --git a/math/Makefile b/math/Makefile
index 919fec13ef..b4b3101592 100644
--- a/math/Makefile
+++ b/math/Makefile
@@ -115,7 +115,8 @@ type-double-routines := branred doasin dosincos halfulp mpa mpatan2	\
 
 # float support
 type-float-suffix := f
-type-float-routines := k_rem_pio2f math_errf e_exp2f_data e_logf_data
+type-float-routines := k_rem_pio2f math_errf e_exp2f_data e_logf_data	\
+		       e_log2f_data
 
 # _Float128 support
 type-float128-suffix := f128
diff --git a/sysdeps/i386/fpu/e_log2f_data.c b/sysdeps/i386/fpu/e_log2f_data.c
new file mode 100644
index 0000000000..1cc8931700
--- /dev/null
+++ b/sysdeps/i386/fpu/e_log2f_data.c
@@ -0,0 +1 @@
+/* Not needed.  */
diff --git a/sysdeps/ia64/fpu/e_log2f_data.c b/sysdeps/ia64/fpu/e_log2f_data.c
new file mode 100644
index 0000000000..1cc8931700
--- /dev/null
+++ b/sysdeps/ia64/fpu/e_log2f_data.c
@@ -0,0 +1 @@
+/* Not needed.  */
diff --git a/sysdeps/ieee754/flt-32/e_log2f.c b/sysdeps/ieee754/flt-32/e_log2f.c
index 782d901094..6c42f27843 100644
--- a/sysdeps/ieee754/flt-32/e_log2f.c
+++ b/sysdeps/ieee754/flt-32/e_log2f.c
@@ -1,86 +1,88 @@
-/* e_logf.c -- float version of e_log.c.
- * Conversion to float by Ian Lance Taylor, Cygnus Support, ian@cygnus.com.
- * adapted for log2 by Ulrich Drepper <drepper@cygnus.com>
- */
+/* Single-precision log2 function.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
 
-/*
- * ====================================================
- * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
- *
- * Developed at SunPro, a Sun Microsystems, Inc. business.
- * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice
- * is preserved.
- * ====================================================
- */
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
 
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
 
 #include <math.h>
-#include <math_private.h>
-#include <fix-int-fp-convert-zero.h>
+#include <stdint.h>
+#include "math_config.h"
+
+/*
+LOG2F_TABLE_BITS = 4
+LOG2F_POLY_ORDER = 4
 
-static const float
-ln2 = 0.69314718055994530942,
-two25 =    3.355443200e+07,	/* 0x4c000000 */
-Lg1 = 6.6666668653e-01,	/* 3F2AAAAB */
-Lg2 = 4.0000000596e-01,	/* 3ECCCCCD */
-Lg3 = 2.8571429849e-01, /* 3E924925 */
-Lg4 = 2.2222198546e-01, /* 3E638E29 */
-Lg5 = 1.8183572590e-01, /* 3E3A3325 */
-Lg6 = 1.5313838422e-01, /* 3E1CD04F */
-Lg7 = 1.4798198640e-01; /* 3E178897 */
+ULP error: 0.752 (nearest rounding.)
+Relative error: 1.9 * 2^-26 (before rounding.)
+*/
 
-static const float zero   =  0.0;
+#define N (1 << LOG2F_TABLE_BITS)
+#define T __log2f_data.tab
+#define A __log2f_data.poly
+#define OFF 0x3f330000
 
 float
-__ieee754_log2f(float x)
+__ieee754_log2f (float x)
 {
-	float hfsq,f,s,z,R,w,t1,t2,dk;
-	int32_t k,ix,i,j;
+  /* double_t for better performance on targets with FLT_EVAL_METHOD==2.  */
+  double_t z, r, r2, p, y, y0, invc, logc;
+  uint32_t ix, iz, top, tmp;
+  int k, i;
+
+  ix = asuint (x);
+#if WANT_ROUNDING
+  /* Fix sign of zero with downward rounding when x==1.  */
+  if (__glibc_unlikely (ix == 0x3f800000))
+    return 0;
+#endif
+  if (__glibc_unlikely (ix - 0x00800000 >= 0x7f800000 - 0x00800000))
+    {
+      /* x < 0x1p-126 or inf or nan.  */
+      if (ix * 2 == 0)
+	return __math_divzerof (1);
+      if (ix == 0x7f800000) /* log2(inf) == inf.  */
+	return x;
+      if ((ix & 0x80000000) || ix * 2 >= 0xff000000)
+	return __math_invalidf (x);
+      /* x is subnormal, normalize it.  */
+      ix = asuint (x * 0x1p23f);
+      ix -= 23 << 23;
+    }
+
+  /* x = 2^k z; where z is in range [OFF,2*OFF] and exact.
+     The range is split into N subintervals.
+     The ith subinterval contains z and c is near its center.  */
+  tmp = ix - OFF;
+  i = (tmp >> (23 - LOG2F_TABLE_BITS)) % N;
+  top = tmp & 0xff800000;
+  iz = ix - top;
+  k = (int32_t) tmp >> 23; /* arithmetic shift */
+  invc = T[i].invc;
+  logc = T[i].logc;
+  z = (double_t) asfloat (iz);
 
-	GET_FLOAT_WORD(ix,x);
+  /* log2(x) = log1p(z/c-1)/ln2 + log2(c) + k */
+  r = z * invc - 1;
+  y0 = logc + (double_t) k;
 
-	k=0;
-	if (ix < 0x00800000) {			/* x < 2**-126  */
-	    if (__builtin_expect((ix&0x7fffffff)==0, 0))
-		return -two25/__fabsf (x);	/* log(+-0)=-inf  */
-	    if (__builtin_expect(ix<0, 0))
-		return (x-x)/(x-x);	/* log(-#) = NaN */
-	    k -= 25; x *= two25; /* subnormal number, scale up x */
-	    GET_FLOAT_WORD(ix,x);
-	}
-	if (__builtin_expect(ix >= 0x7f800000, 0)) return x+x;
-	k += (ix>>23)-127;
-	ix &= 0x007fffff;
-	i = (ix+(0x95f64<<3))&0x800000;
-	SET_FLOAT_WORD(x,ix|(i^0x3f800000));	/* normalize x or x/2 */
-	k += (i>>23);
-	dk = (float)k;
-	f = x-(float)1.0;
-	if((0x007fffff&(15+ix))<16) {	/* |f| < 2**-20 */
-	    if(f==zero)
-	      {
-		if (FIX_INT_FP_CONVERT_ZERO && dk == 0.0f)
-		  dk = 0.0f;
-		return dk;
-	      }
-	    R = f*f*((float)0.5-(float)0.33333333333333333*f);
-	    return dk-(R-f)/ln2;
-	}
-	s = f/((float)2.0+f);
-	z = s*s;
-	i = ix-(0x6147a<<3);
-	w = z*z;
-	j = (0x6b851<<3)-ix;
-	t1= w*(Lg2+w*(Lg4+w*Lg6));
-	t2= z*(Lg1+w*(Lg3+w*(Lg5+w*Lg7)));
-	i |= j;
-	R = t2+t1;
-	if(i>0) {
-	    hfsq=(float)0.5*f*f;
-	    return dk-((hfsq-(s*(hfsq+R)))-f)/ln2;
-	} else {
-	    return dk-((s*(f-R))-f)/ln2;
-	}
+  /* Pipelined polynomial evaluation to approximate log1p(r)/ln2.  */
+  r2 = r * r;
+  y = A[1] * r + A[2];
+  y = A[0] * r2 + y;
+  p = A[3] * r + y0;
+  y = y * r2 + p;
+  return (float) y;
 }
 strong_alias (__ieee754_log2f, __log2f_finite)
diff --git a/sysdeps/ieee754/flt-32/e_log2f_data.c b/sysdeps/ieee754/flt-32/e_log2f_data.c
new file mode 100644
index 0000000000..e39de3ba56
--- /dev/null
+++ b/sysdeps/ieee754/flt-32/e_log2f_data.c
@@ -0,0 +1,44 @@
+/* Data definition for log2f.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "math_config.h"
+
+const struct log2f_data __log2f_data = {
+  .tab = {
+  { 0x1.661ec79f8f3bep+0, -0x1.efec65b963019p-2 },
+  { 0x1.571ed4aaf883dp+0, -0x1.b0b6832d4fca4p-2 },
+  { 0x1.49539f0f010bp+0, -0x1.7418b0a1fb77bp-2 },
+  { 0x1.3c995b0b80385p+0, -0x1.39de91a6dcf7bp-2 },
+  { 0x1.30d190c8864a5p+0, -0x1.01d9bf3f2b631p-2 },
+  { 0x1.25e227b0b8eap+0, -0x1.97c1d1b3b7afp-3 },
+  { 0x1.1bb4a4a1a343fp+0, -0x1.2f9e393af3c9fp-3 },
+  { 0x1.12358f08ae5bap+0, -0x1.960cbbf788d5cp-4 },
+  { 0x1.0953f419900a7p+0, -0x1.a6f9db6475fcep-5 },
+  { 0x1p+0, 0x0p+0 },
+  { 0x1.e608cfd9a47acp-1, 0x1.338ca9f24f53dp-4 },
+  { 0x1.ca4b31f026aap-1, 0x1.476a9543891bap-3 },
+  { 0x1.b2036576afce6p-1, 0x1.e840b4ac4e4d2p-3 },
+  { 0x1.9c2d163a1aa2dp-1, 0x1.40645f0c6651cp-2 },
+  { 0x1.886e6037841edp-1, 0x1.88e9c2c1b9ff8p-2 },
+  { 0x1.767dcf5534862p-1, 0x1.ce0a44eb17bccp-2 },
+  },
+  .poly = {
+  -0x1.712b6f70a7e4dp-2, 0x1.ecabf496832ep-2, -0x1.715479ffae3dep-1,
+  0x1.715475f35c8b8p0,
+  }
+};
diff --git a/sysdeps/ieee754/flt-32/math_config.h b/sysdeps/ieee754/flt-32/math_config.h
index 953a4bc583..f869fbc66c 100644
--- a/sysdeps/ieee754/flt-32/math_config.h
+++ b/sysdeps/ieee754/flt-32/math_config.h
@@ -123,4 +123,15 @@ extern const struct logf_data
   double poly[LOGF_POLY_ORDER - 1]; /* First order coefficient is 1.  */
 } __logf_data attribute_hidden;
 
+#define LOG2F_TABLE_BITS 4
+#define LOG2F_POLY_ORDER 4
+extern const struct log2f_data
+{
+  struct
+  {
+    double invc, logc;
+  } tab[1 << LOG2F_TABLE_BITS];
+  double poly[LOG2F_POLY_ORDER];
+} __log2f_data attribute_hidden;
+
 #endif
diff --git a/sysdeps/m68k/m680x0/fpu/e_log2f_data.c b/sysdeps/m68k/m680x0/fpu/e_log2f_data.c
new file mode 100644
index 0000000000..1cc8931700
--- /dev/null
+++ b/sysdeps/m68k/m680x0/fpu/e_log2f_data.c
@@ -0,0 +1 @@
+/* Not needed.  */
-- 
2.11.0



* [PATCH 2/5 v3] New generic powf
  2017-09-29 11:00 [PATCH 0/5] Optimized expf, exp2f, logf, log2f and powf Szabolcs Nagy
  2017-09-29 11:03 ` [PATCH 1/5 v3] New generic log2f Szabolcs Nagy
@ 2017-09-29 11:05 ` Szabolcs Nagy
  2017-09-29 16:22   ` Joseph Myers
  2017-09-29 11:06 ` [PATCH 3/5] New symbol version for logf, log2f and powf without SVID compat Szabolcs Nagy
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Szabolcs Nagy @ 2017-09-29 11:05 UTC
  To: GNU C Library; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 19 bytes --]

v3:
- NEWS entry.


[-- Attachment #2: 0002-New-generic-powf.patch --]
[-- Type: text/x-patch, Size: 20055 bytes --]

From e260a9f8a9279c231405593c449e1f5bd39b3fd1 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date: Mon, 4 Sep 2017 17:55:33 +0100
Subject: [PATCH 2/5] New generic powf

without wrapper on aarch64:
powf reciprocal-throughput: 4.2x faster
powf latency: 2.6x faster
old worst-case error: 1.11 ulp
new worst-case error: 0.82 ulp
aarch64 .text size: -780 bytes
aarch64 .rodata size: +144 bytes

powf(x,y) is implemented as exp2(y*log2(x)) with the same algorithms
that are used in exp2f and log2f, except that the log2f polynomial is
larger for extra precision and its output (and the exp2f input) may be
scaled by a power of 2 (POWF_SCALE) to simplify the argument reduction
step of exp2 (possible when an efficient round-and-convert-to-int
operation is available).
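
The need for the extra log2 precision can be seen from a rough
first-order error budget. The following is a sketch reconstructed from
the figures quoted in the comment at the top of the new e_powf.c, not a
proof; the factor 128 assumes |y*log2(x)| stays below about 128, and the
symbols \varepsilon_{\log} and \varepsilon_{\exp} are notation introduced
here for relerr_log2 and relerr_exp2.

```latex
\begin{align*}
x^y &= 2^{\,y \log_2 x}, \\
\widehat{\ell} &= \log_2 x \,(1 + \varepsilon_{\log}),
  \qquad |\varepsilon_{\log}| \le 1.83 \cdot 2^{-33}, \\
2^{\,y \widehat{\ell}} &= x^y \cdot 2^{\,y \log_2 x \cdot \varepsilon_{\log}}
  \approx x^y \left(1 + \ln 2 \cdot y \log_2 x \cdot \varepsilon_{\log}\right), \\
\mathrm{relerr} &\approx 128 \ln 2 \cdot \varepsilon_{\log} + \varepsilon_{\exp}
  \approx 1.268 \cdot 2^{-26} + 0.007 \cdot 2^{-26}
  \approx 1.27 \cdot 2^{-26}, \\
\mathrm{ULP\ error} &\approx 0.5 + \mathrm{relerr} \cdot 2^{24} \approx 0.82 .
\end{align*}
```

Because the error of log2(x) is amplified by up to 128*ln2 (about 89)
before it reaches the result, it dominates the budget, while the exp2
error contributes almost nothing.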

The special case handling tries to minimize the checks in the hot path.
When the input of exp2_inline is checked, integer arithmetic is used, as
that was faster on the tested aarch64 cores.
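
One way the special case checks stay cheap is that several classes of
input are rejected with a single unsigned comparison on the bit pattern.
The sketch below isolates the two tests that appear in the new
__ieee754_powf; `not_positive_normal` is a hypothetical name (the patch
writes that test inline), and `zeroinfnan` follows the helper of the same
name in the patch.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as an unsigned integer.  */
static inline uint32_t
asuint (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Nonzero unless ix encodes a positive normal float.  Subtracting the
   smallest normal bit pattern makes zero and subnormals wrap around to
   large unsigned values, so one comparison rejects them together with
   inf, nan and every negative input.  (Hypothetical helper name.)  */
static inline int
not_positive_normal (uint32_t ix)
{
  return ix - 0x00800000u >= 0x7f800000u - 0x00800000u;
}

/* Nonzero if ix encodes +-0, +-inf or nan.  Doubling shifts the sign
   bit out; subtracting 1 wraps +-0 around to UINT32_MAX.  */
static inline int
zeroinfnan (uint32_t ix)
{
  return 2 * ix - 1 >= 2u * 0x7f800000u - 1;
}
```

For ordinary inputs such as x = 1.5f and y = 2.0f both tests are false,
so the hot path pays for one combined branch before reaching the log2
and exp2 evaluation.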

2017-09-19  Szabolcs Nagy  <szabolcs.nagy@arm.com>

	* math/Makefile (type-float-routines): Add e_powf_log2_data.
	* sysdeps/ieee754/flt-32/e_powf.c: New implementation.
	* sysdeps/ieee754/flt-32/e_powf_log2_data.c: New file.
	* sysdeps/ieee754/flt-32/math_config.h (__powf_data): Define.
	(issignalingf_inline): Likewise.
	(POWF_LOG2_TABLE_BITS): Likewise.
	(POWF_LOG2_POLY_ORDER): Likewise.
	(POWF_SCALE_BITS): Likewise.
	(POWF_SCALE): Likewise.
	* sysdeps/i386/fpu/e_powf_log2_data.c: New file.
	* sysdeps/ia64/fpu/e_powf_log2_data.c: New file.
	* sysdeps/m68k/m680x0/fpu/e_powf_log2_data.c: New file.
---
 NEWS                                       |   2 +-
 math/Makefile                              |   2 +-
 sysdeps/i386/fpu/e_powf_log2_data.c        |   1 +
 sysdeps/ia64/fpu/e_powf_log2_data.c        |   1 +
 sysdeps/ieee754/flt-32/e_powf.c            | 388 ++++++++++++++---------------
 sysdeps/ieee754/flt-32/e_powf_log2_data.c  |  45 ++++
 sysdeps/ieee754/flt-32/math_config.h       |  27 ++
 sysdeps/m68k/m680x0/fpu/e_powf_log2_data.c |   1 +
 8 files changed, 266 insertions(+), 201 deletions(-)
 create mode 100644 sysdeps/i386/fpu/e_powf_log2_data.c
 create mode 100644 sysdeps/ia64/fpu/e_powf_log2_data.c
 create mode 100644 sysdeps/ieee754/flt-32/e_powf_log2_data.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/e_powf_log2_data.c

diff --git a/NEWS b/NEWS
index 5e88c54a6b..f5821411ca 100644
--- a/NEWS
+++ b/NEWS
@@ -14,7 +14,7 @@ Major new features:
 
 * Optimized x86-64 trunc and truncf for processors with SSE4.1.
 
-* Optimized generic expf, exp2f, logf, log2f.
+* Optimized generic expf, exp2f, logf, log2f and powf.
 
 * In order to support faster and safer process termination the malloc API
   family of functions will no longer print a failure address and stack
diff --git a/math/Makefile b/math/Makefile
index b4b3101592..6c8aa3e413 100644
--- a/math/Makefile
+++ b/math/Makefile
@@ -116,7 +116,7 @@ type-double-routines := branred doasin dosincos halfulp mpa mpatan2	\
 # float support
 type-float-suffix := f
 type-float-routines := k_rem_pio2f math_errf e_exp2f_data e_logf_data	\
-		       e_log2f_data
+		       e_log2f_data e_powf_log2_data
 
 # _Float128 support
 type-float128-suffix := f128
diff --git a/sysdeps/i386/fpu/e_powf_log2_data.c b/sysdeps/i386/fpu/e_powf_log2_data.c
new file mode 100644
index 0000000000..1cc8931700
--- /dev/null
+++ b/sysdeps/i386/fpu/e_powf_log2_data.c
@@ -0,0 +1 @@
+/* Not needed.  */
diff --git a/sysdeps/ia64/fpu/e_powf_log2_data.c b/sysdeps/ia64/fpu/e_powf_log2_data.c
new file mode 100644
index 0000000000..1cc8931700
--- /dev/null
+++ b/sysdeps/ia64/fpu/e_powf_log2_data.c
@@ -0,0 +1 @@
+/* Not needed.  */
diff --git a/sysdeps/ieee754/flt-32/e_powf.c b/sysdeps/ieee754/flt-32/e_powf.c
index ce8e11f1ea..644a18d05e 100644
--- a/sysdeps/ieee754/flt-32/e_powf.c
+++ b/sysdeps/ieee754/flt-32/e_powf.c
@@ -1,7 +1,5 @@
-/* e_powf.c -- float version of e_pow.c.
- * Conversion to float by Ian Lance Taylor, Cygnus Support, ian@cygnus.com.
- */
-/* Copyright (C) 2017 Free Software Foundation, Inc.
+/* Single-precision pow function.
+   Copyright (C) 2017 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -18,210 +16,202 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-/*
- * ====================================================
- * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
- *
- * Developed at SunPro, a Sun Microsystems, Inc. business.
- * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice
- * is preserved.
- * ====================================================
- */
-
 #include <math.h>
-#include <math_private.h>
-
-static const float huge = 1.0e+30, tiny = 1.0e-30;
-
-static const float
-bp[] = {1.0, 1.5,},
-zero    =  0.0,
-one	=  1.0,
-two	=  2.0,
-two24	=  16777216.0,	/* 0x4b800000 */
-	/* poly coefs for (3/2)*(log(x)-2s-2/3*s**3 */
-L1  =  6.0000002384e-01, /* 0x3f19999a */
-L2  =  4.2857143283e-01, /* 0x3edb6db7 */
-L3  =  3.3333334327e-01, /* 0x3eaaaaab */
-L4  =  2.7272811532e-01, /* 0x3e8ba305 */
-L5  =  2.3066075146e-01, /* 0x3e6c3255 */
-L6  =  2.0697501302e-01, /* 0x3e53f142 */
-P1   =  1.6666667163e-01, /* 0x3e2aaaab */
-P2   = -2.7777778450e-03, /* 0xbb360b61 */
-P3   =  6.6137559770e-05, /* 0x388ab355 */
-P4   = -1.6533901999e-06, /* 0xb5ddea0e */
-P5   =  4.1381369442e-08, /* 0x3331bb4c */
-ovt =  4.2995665694e-08; /* -(128-log2(ovfl+.5ulp)) */
-
-static const double
-	dp[] = { 0.0, 0x1.2b803473f7ad1p-1, }, /* log2(1.5) */
-	lg2 = M_LN2,
-	cp = 2.0/3.0/M_LN2,
-	invln2 = 1.0/M_LN2;
+#include <stdint.h>
+#include "math_config.h"
 
-float
-__ieee754_powf(float x, float y)
+/*
+POWF_LOG2_POLY_ORDER = 5
+EXP2F_TABLE_BITS = 5
+
+ULP error: 0.82 (~ 0.5 + relerr*2^24)
+relerr: 1.27 * 2^-26 (Relative error ~= 128*Ln2*relerr_log2 + relerr_exp2)
+relerr_log2: 1.83 * 2^-33 (Relative error of logx.)
+relerr_exp2: 1.69 * 2^-34 (Relative error of exp2(ylogx).)
+*/
+
+#define N (1 << POWF_LOG2_TABLE_BITS)
+#define T __powf_log2_data.tab
+#define A __powf_log2_data.poly
+#define OFF 0x3f330000
+
+/* Subnormal input is normalized so ix has negative biased exponent.
+   Output is multiplied by N (POWF_SCALE) if TOINT_INTRINSICS is set.  */
+static inline double_t
+log2_inline (uint32_t ix)
 {
-	float z, ax, s;
-	double d1, d2;
-	int32_t i,j,k,yisint,n;
-	int32_t hx,hy,ix,iy;
-
-	GET_FLOAT_WORD(hy,y);
-	iy = hy&0x7fffffff;
-
-    /* y==zero: x**0 = 1 */
-	if(iy==0 && !issignaling (x)) return one;
-
-    /* x==+-1 */
-	if(x == 1.0 && !issignaling (y)) return one;
-	if(x == -1.0 && isinf(y)) return one;
-
-	GET_FLOAT_WORD(hx,x);
-	ix = hx&0x7fffffff;
-
-    /* +-NaN return x+y */
-	if(__builtin_expect(ix > 0x7f800000 ||
-			    iy > 0x7f800000, 0))
-		return x+y;
-
-    /* special value of y */
-	if (__builtin_expect(iy==0x7f800000, 0)) {	/* y is +-inf */
-	    if (ix==0x3f800000)
-		return  y - y;	/* inf**+-1 is NaN */
-	    else if (ix > 0x3f800000)/* (|x|>1)**+-inf = inf,0 */
-		return (hy>=0)? y: zero;
-	    else			/* (|x|<1)**-,+inf = inf,0 */
-		return (hy<0)?-y: zero;
-	}
-	if(iy==0x3f800000) {	/* y is  +-1 */
-	    if(hy<0) return one/x; else return x;
-	}
-	if(hy==0x40000000) return x*x; /* y is  2 */
-	if(hy==0x3f000000) {	/* y is  0.5 */
-	    if(__builtin_expect(hx>=0, 1))	/* x >= +0 */
-	    return __ieee754_sqrtf(x);
-	}
+  /* double_t for better performance on targets with FLT_EVAL_METHOD==2.  */
+  double_t z, r, r2, r4, p, q, y, y0, invc, logc;
+  uint32_t iz, top, tmp;
+  int k, i;
+
+  /* x = 2^k z; where z is in range [OFF,2*OFF] and exact.
+     The range is split into N subintervals.
+     The ith subinterval contains z and c is near its center.  */
+  tmp = ix - OFF;
+  i = (tmp >> (23 - POWF_LOG2_TABLE_BITS)) % N;
+  top = tmp & 0xff800000;
+  iz = ix - top;
+  k = (int32_t) top >> (23 - POWF_SCALE_BITS); /* arithmetic shift */
+  invc = T[i].invc;
+  logc = T[i].logc;
+  z = (double_t) asfloat (iz);
+
+  /* log2(x) = log1p(z/c-1)/ln2 + log2(c) + k */
+  r = z * invc - 1;
+  y0 = logc + (double_t) k;
+
+  /* Pipelined polynomial evaluation to approximate log1p(r)/ln2.  */
+  r2 = r * r;
+  y = A[0] * r + A[1];
+  p = A[2] * r + A[3];
+  r4 = r2 * r2;
+  q = A[4] * r + y0;
+  q = p * r2 + q;
+  y = y * r4 + q;
+  return y;
+}
 
-    /* determine if y is an odd int when x < 0
-     * yisint = 0	... y is not an integer
-     * yisint = 1	... y is an odd int
-     * yisint = 2	... y is an even int
-     */
-	yisint  = 0;
-	if(hx<0) {
-	    if(iy>=0x4b800000) yisint = 2; /* even integer y */
-	    else if(iy>=0x3f800000) {
-		k = (iy>>23)-0x7f;	   /* exponent */
-		j = iy>>(23-k);
-		if((j<<(23-k))==iy) yisint = 2-(j&1);
-	    }
-	}
+#undef N
+#undef T
+#define N (1 << EXP2F_TABLE_BITS)
+#define T __exp2f_data.tab
+#define SIGN_BIAS (1 << (EXP2F_TABLE_BITS + 11))
+
+/* The output of log2 and thus the input of exp2 is either scaled by N
+   (in case of fast toint intrinsics) or not.  The unscaled xd must be
+   in [-1021,1023], sign_bias sets the sign of the result.  */
+static inline double_t
+exp2_inline (double_t xd, unsigned long sign_bias)
+{
+  uint64_t ki, ski, t;
+  /* double_t for better performance on targets with FLT_EVAL_METHOD==2.  */
+  double_t kd, z, r, r2, y, s;
+
+#if TOINT_INTRINSICS
+# define C __exp2f_data.poly_scaled
+  /* N*x = k + r with r in [-1/2, 1/2] */
+  kd = roundtoint (xd); /* k */
+  ki = converttoint (xd);
+#else
+# define C __exp2f_data.poly
+# define SHIFT __exp2f_data.shift_scaled
+  /* x = k/N + r with r in [-1/(2N), 1/(2N)] */
+  kd = (double) (xd + SHIFT); /* Rounding to double precision is required.  */
+  ki = asuint64 (kd);
+  kd -= SHIFT; /* k/N */
+#endif
+  r = xd - kd;
+
+  /* exp2(x) = 2^(k/N) * 2^r ~= s * (C0*r^3 + C1*r^2 + C2*r + 1) */
+  t = T[ki % N];
+  ski = ki + sign_bias;
+  t += ski << (52 - EXP2F_TABLE_BITS);
+  s = asdouble (t);
+  z = C[0] * r + C[1];
+  r2 = r * r;
+  y = C[2] * r + 1;
+  y = z * r2 + y;
+  y = y * s;
+  return y;
+}
 
-	ax   = fabsf(x);
-    /* special value of x */
-	if(__builtin_expect(ix==0x7f800000||ix==0||ix==0x3f800000, 0)){
-	    z = ax;			/*x is +-0,+-inf,+-1*/
-	    if(hy<0) z = one/z;	/* z = (1/|x|) */
-	    if(hx<0) {
-		if(((ix-0x3f800000)|yisint)==0) {
-		    z = (z-z)/(z-z); /* (-1)**non-int is NaN */
-		} else if(yisint==1)
-		    z = -z;		/* (x<0)**odd = -(|x|**odd) */
-	    }
-	    return z;
-	}
+/* Returns 0 if not int, 1 if odd int, 2 if even int.  */
+static inline int
+checkint (uint32_t iy)
+{
+  int e = iy >> 23 & 0xff;
+  if (e < 0x7f)
+    return 0;
+  if (e > 0x7f + 23)
+    return 2;
+  if (iy & ((1 << (0x7f + 23 - e)) - 1))
+    return 0;
+  if (iy & (1 << (0x7f + 23 - e)))
+    return 1;
+  return 2;
+}
 
-    /* (x<0)**(non-int) is NaN */
-	if(__builtin_expect(((((uint32_t)hx>>31)-1)|yisint)==0, 0))
-	    return (x-x)/(x-x);
-
-    /* |y| is huge */
-	if(__builtin_expect(iy>0x4d000000, 0)) { /* if |y| > 2**27 */
-	/* over/underflow if x is not close to one */
-	    if(ix<0x3f7ffff8) return (hy<0)? huge*huge:tiny*tiny;
-	    if(ix>0x3f800007) return (hy>0)? huge*huge:tiny*tiny;
-	/* now |1-x| is tiny <= 2**-20, suffice to compute
-	   log(x) by x-x^2/2+x^3/3-x^4/4 */
-	    d2 = ax-1;		/* d2 has 20 trailing zeros.  */
-	    d2 = d2 * invln2 -
-		 (d2 * d2) * (0.5 - d2 * (0.333333333333 - d2 * 0.25)) * invln2;
-	} else {
-	    /* Avoid internal underflow for tiny y.  The exact value
-	       of y does not matter if |y| <= 2**-32.  */
-	    if (iy < 0x2f800000)
-	      SET_FLOAT_WORD (y, (hy & 0x80000000) | 0x2f800000);
-	    n = 0;
-	/* take care subnormal number */
-	    if(ix<0x00800000)
-		{ax *= two24; n -= 24; GET_FLOAT_WORD(ix,ax); }
-	    n  += ((ix)>>23)-0x7f;
-	    j  = ix&0x007fffff;
-	/* determine interval */
-	    ix = j|0x3f800000;		/* normalize ix */
-	    if(j<=0x1cc471) k=0;	/* |x|<sqrt(3/2) */
-	    else if(j<0x5db3d7) k=1;	/* |x|<sqrt(3)   */
-	    else {k=0;n+=1;ix -= 0x00800000;}
-	    SET_FLOAT_WORD(ax,ix);
-
-	/* compute d1 = (x-1)/(x+1) or (x-1.5)/(x+1.5) */
-	    d1 = (ax-(double)bp[k])/(ax+(double)bp[k]);
-	/* compute d2 = log(ax) */
-	    d2 = d1 * d1;
-	    d2 = 3.0 + d2 + d2*d2*(L1+d2*(L2+d2*(L3+d2*(L4+d2*(L5+d2*L6)))));
-	/* 2/(3log2)*(d2+...) */
-	    d2 = d1*d2*cp;
-	/* log2(ax) = (d2+..)*2/(3*log2) */
-	    d2 = d2+dp[k]+(double)n;
-	}
+static inline int
+zeroinfnan (uint32_t ix)
+{
+  return 2 * ix - 1 >= 2u * 0x7f800000 - 1;
+}
 
-	s = one; /* s (sign of result -ve**odd) = -1 else = 1 */
-	if(((((uint32_t)hx>>31)-1)|(yisint-1))==0)
-	    s = -one;	/* (-ve)**(odd int) */
-
-    /* compute y * d2 */
-	d1 = y * d2;
-	z = d1;
-	GET_FLOAT_WORD(j,z);
-	if (__builtin_expect(j>0x43000000, 0))		/* if z > 128 */
-	    return s*huge*huge;				/* overflow */
-	else if (__builtin_expect(j==0x43000000, 0)) {	/* if z == 128 */
-	    if(ovt>(z-d1)) return s*huge*huge;	/* overflow */
+float
+__ieee754_powf (float x, float y)
+{
+  unsigned long sign_bias = 0;
+  uint32_t ix, iy;
+
+  ix = asuint (x);
+  iy = asuint (y);
+  if (__glibc_unlikely (ix - 0x00800000 >= 0x7f800000 - 0x00800000
+			|| zeroinfnan (iy)))
+    {
+      /* Either (x < 0x1p-126 or inf or nan) or (y is 0 or inf or nan).  */
+      if (__glibc_unlikely (zeroinfnan (iy)))
+	{
+	  if (2 * iy == 0)
+	    return issignalingf_inline (x) ? x + y : 1.0f;
+	  if (ix == 0x3f800000)
+	    return issignalingf_inline (y) ? x + y : 1.0f;
+	  if (2 * ix > 2u * 0x7f800000 || 2 * iy > 2u * 0x7f800000)
+	    return x + y;
+	  if (2 * ix == 2 * 0x3f800000)
+	    return 1.0f;
+	  if ((2 * ix < 2 * 0x3f800000) == !(iy & 0x80000000))
+	    return 0.0f; /* |x|<1 && y==inf or |x|>1 && y==-inf.  */
+	  return y * y;
+	}
+      if (__glibc_unlikely (zeroinfnan (ix)))
+	{
+	  float_t x2 = x * x;
+	  if (ix & 0x80000000 && checkint (iy) == 1)
+	    {
+	      x2 = -x2;
+	      sign_bias = 1;
+	    }
+#if WANT_ERRNO
+	  if (2 * ix == 0 && iy & 0x80000000)
+	    return __math_divzerof (sign_bias);
+#endif
+	  return iy & 0x80000000 ? 1 / x2 : x2;
 	}
-	else if (__builtin_expect((j&0x7fffffff)>0x43160000, 0))/* z <= -150 */
-	    return s*tiny*tiny;				/* underflow */
-	else if (__builtin_expect((uint32_t) j==0xc3160000, 0)){/* z == -150*/
-	    if(0.0<=(z-d1)) return s*tiny*tiny;		/* underflow */
+      /* x and y are non-zero finite.  */
+      if (ix & 0x80000000)
+	{
+	  /* Finite x < 0.  */
+	  int yint = checkint (iy);
+	  if (yint == 0)
+	    return __math_invalidf (x);
+	  if (yint == 1)
+	    sign_bias = SIGN_BIAS;
+	  ix &= 0x7fffffff;
 	}
-    /*
-     * compute 2**d1
-     */
-	i = j&0x7fffffff;
-	k = (i>>23)-0x7f;
-	n = 0;
-	if(i>0x3f000000) {		/* if |z| > 0.5, set n = [z+0.5] */
-	    n = j+(0x00800000>>(k+1));
-	    k = ((n&0x7fffffff)>>23)-0x7f;	/* new k for n */
-	    SET_FLOAT_WORD(z,n&~(0x007fffff>>k));
-	    n = ((n&0x007fffff)|0x00800000)>>(23-k);
-	    if(j<0) n = -n;
-	    d1 -= z;
+      if (ix < 0x00800000)
+	{
+	  /* Normalize subnormal x so exponent becomes negative.  */
+	  ix = asuint (x * 0x1p23f);
+	  ix &= 0x7fffffff;
+	  ix -= 23 << 23;
 	}
-	d1 = d1 * lg2;
-	d2 = d1*d1;
-	d2 = d1 - d2*(P1+d2*(P2+d2*(P3+d2*(P4+d2*P5))));
-	d2 = (d1*d2)/(d2-two);
-	z = one - (d2-d1);
-	GET_FLOAT_WORD(j,z);
-	j += (n<<23);
-	if((j>>23)<=0)	/* subnormal output */
-	  {
-	    z = __scalbnf (z, n);
-	    float force_underflow = z * z;
-	    math_force_eval (force_underflow);
-	  }
-	else SET_FLOAT_WORD(z,j);
-	return s*z;
+    }
+  double_t logx = log2_inline (ix);
+  double_t ylogx = y * logx; /* Note: cannot overflow, y is single prec.  */
+  if (__glibc_unlikely ((asuint64 (ylogx) >> 47 & 0xffff)
+			>= asuint64 (126.0 * POWF_SCALE) >> 47))
+    {
+      /* |y*log(x)| >= 126.  */
+      if (ylogx > 0x1.fffffffd1d571p+6 * POWF_SCALE)
+	return __math_oflowf (sign_bias);
+      if (ylogx <= -150.0 * POWF_SCALE)
+	return __math_uflowf (sign_bias);
+#if WANT_ERRNO_UFLOW
+      if (ylogx < -149.0 * POWF_SCALE)
+	return __math_may_uflowf (sign_bias);
+#endif
+    }
+  return (float) exp2_inline (ylogx, sign_bias);
 }
 strong_alias (__ieee754_powf, __powf_finite)
diff --git a/sysdeps/ieee754/flt-32/e_powf_log2_data.c b/sysdeps/ieee754/flt-32/e_powf_log2_data.c
new file mode 100644
index 0000000000..7cff06f59b
--- /dev/null
+++ b/sysdeps/ieee754/flt-32/e_powf_log2_data.c
@@ -0,0 +1,45 @@
+/* Data definition for powf.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "math_config.h"
+
+const struct powf_log2_data __powf_log2_data = {
+  .tab = {
+  { 0x1.661ec79f8f3bep+0, -0x1.efec65b963019p-2 * POWF_SCALE },
+  { 0x1.571ed4aaf883dp+0, -0x1.b0b6832d4fca4p-2 * POWF_SCALE },
+  { 0x1.49539f0f010bp+0, -0x1.7418b0a1fb77bp-2 * POWF_SCALE },
+  { 0x1.3c995b0b80385p+0, -0x1.39de91a6dcf7bp-2 * POWF_SCALE },
+  { 0x1.30d190c8864a5p+0, -0x1.01d9bf3f2b631p-2 * POWF_SCALE },
+  { 0x1.25e227b0b8eap+0, -0x1.97c1d1b3b7afp-3 * POWF_SCALE },
+  { 0x1.1bb4a4a1a343fp+0, -0x1.2f9e393af3c9fp-3 * POWF_SCALE },
+  { 0x1.12358f08ae5bap+0, -0x1.960cbbf788d5cp-4 * POWF_SCALE },
+  { 0x1.0953f419900a7p+0, -0x1.a6f9db6475fcep-5 * POWF_SCALE },
+  { 0x1p+0, 0x0p+0 * POWF_SCALE },
+  { 0x1.e608cfd9a47acp-1, 0x1.338ca9f24f53dp-4 * POWF_SCALE },
+  { 0x1.ca4b31f026aap-1, 0x1.476a9543891bap-3 * POWF_SCALE },
+  { 0x1.b2036576afce6p-1, 0x1.e840b4ac4e4d2p-3 * POWF_SCALE },
+  { 0x1.9c2d163a1aa2dp-1, 0x1.40645f0c6651cp-2 * POWF_SCALE },
+  { 0x1.886e6037841edp-1, 0x1.88e9c2c1b9ff8p-2 * POWF_SCALE },
+  { 0x1.767dcf5534862p-1, 0x1.ce0a44eb17bccp-2 * POWF_SCALE },
+  },
+  .poly = {
+  0x1.27616c9496e0bp-2 * POWF_SCALE, -0x1.71969a075c67ap-2 * POWF_SCALE,
+  0x1.ec70a6ca7baddp-2 * POWF_SCALE, -0x1.7154748bef6c8p-1 * POWF_SCALE,
+  0x1.71547652ab82bp0 * POWF_SCALE,
+  }
+};
diff --git a/sysdeps/ieee754/flt-32/math_config.h b/sysdeps/ieee754/flt-32/math_config.h
index f869fbc66c..7e78cb0c96 100644
--- a/sysdeps/ieee754/flt-32/math_config.h
+++ b/sysdeps/ieee754/flt-32/math_config.h
@@ -21,6 +21,7 @@
 
 #include <math.h>
 #include <math_private.h>
+#include <nan-high-order-bit.h>
 #include <stdint.h>
 
 #ifndef WANT_ROUNDING
@@ -90,6 +91,15 @@ asdouble (uint64_t i)
   return u.f;
 }
 
+static inline int
+issignalingf_inline (float x)
+{
+  uint32_t ix = asuint (x);
+  if (HIGH_ORDER_BIT_IS_SET_FOR_SNAN)
+    return (ix & 0x7fc00000) == 0x7fc00000;
+  return 2 * (ix ^ 0x00400000) > 2u * 0x7fc00000;
+}
+
 #define NOINLINE __attribute__ ((noinline))
 
 attribute_hidden float __math_oflowf (unsigned long);
@@ -134,4 +144,21 @@ extern const struct log2f_data
   double poly[LOG2F_POLY_ORDER];
 } __log2f_data attribute_hidden;
 
+#define POWF_LOG2_TABLE_BITS 4
+#define POWF_LOG2_POLY_ORDER 5
+#if TOINT_INTRINSICS
+#define POWF_SCALE_BITS EXP2F_TABLE_BITS
+#else
+#define POWF_SCALE_BITS 0
+#endif
+#define POWF_SCALE ((double) (1 << POWF_SCALE_BITS))
+extern const struct powf_log2_data
+{
+  struct
+  {
+    double invc, logc;
+  } tab[1 << POWF_LOG2_TABLE_BITS];
+  double poly[POWF_LOG2_POLY_ORDER];
+} __powf_log2_data attribute_hidden;
+
 #endif
diff --git a/sysdeps/m68k/m680x0/fpu/e_powf_log2_data.c b/sysdeps/m68k/m680x0/fpu/e_powf_log2_data.c
new file mode 100644
index 0000000000..1cc8931700
--- /dev/null
+++ b/sysdeps/m68k/m680x0/fpu/e_powf_log2_data.c
@@ -0,0 +1 @@
+/* Not needed.  */
-- 
2.11.0
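
The integer classification and the zero/inf/nan test used by the new
powf can be tried in isolation.  The following is a minimal sketch, not
the patch itself: the sketch_ prefixed names are hypothetical, and
sketch_asuint stands in for the asuint helper from math_config.h.  The
return convention (0 = not an integer, 1 = odd integer, 2 = even
integer) follows how the patch uses the result of checkint.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert (sizeof (float) == sizeof (uint32_t), "float must be 32 bits");

/* Hypothetical stand-in for asuint: reinterpret the bits of a float.  */
static inline uint32_t
sketch_asuint (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Returns 0 if y is not an integer, 1 if y is an odd integer and 2 if
   y is an even integer.  Zero also yields 0 here; the caller is
   expected to filter zero, inf and nan beforehand.  */
static inline int
sketch_checkint (uint32_t iy)
{
  int e = iy >> 23 & 0xff;	/* Biased exponent.  */
  if (e < 0x7f)
    return 0;			/* |y| < 1.  */
  if (e > 0x7f + 23)
    return 2;			/* |y| >= 2^24, always even.  */
  if (iy & ((1u << (0x7f + 23 - e)) - 1))
    return 0;			/* Some fractional bit is set.  */
  if (iy & (1u << (0x7f + 23 - e)))
    return 1;			/* The units bit is set.  */
  return 2;
}

/* Nonzero if the bit pattern is +-0, +-inf or a nan.  Doubling drops
   the sign bit; subtracting 1 wraps zero around to UINT32_MAX so one
   unsigned comparison covers all three classes.  */
static inline int
sketch_zeroinfnan (uint32_t ix)
{
  return 2 * ix - 1 >= 2u * 0x7f800000 - 1;
}
```

For example, sketch_checkint (sketch_asuint (3.0f)) gives 1 and
sketch_checkint (sketch_asuint (2.5f)) gives 0, which is the
distinction the finite x < 0 path needs to choose between a negated
result and an invalid operation.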



* [PATCH 3/5] New symbol version for logf, log2f and powf without SVID compat
  2017-09-29 11:00 [PATCH 0/5] Optimized expf, exp2f, logf, log2f and powf Szabolcs Nagy
  2017-09-29 11:03 ` [PATCH 1/5 v3] New generic log2f Szabolcs Nagy
  2017-09-29 11:05 ` [PATCH 2/5 v3] New generic powf Szabolcs Nagy
@ 2017-09-29 11:06 ` Szabolcs Nagy
  2017-09-29 21:01   ` Joseph Myers
  2017-09-29 11:09 ` [PATCH 4/5 v4] Do not wrap expf and exp2f Szabolcs Nagy
  2017-09-29 11:10 ` [PATCH 5/5 v2] Do not wrap logf, log2f and powf Szabolcs Nagy
  4 siblings, 1 reply; 11+ messages in thread
From: Szabolcs Nagy @ 2017-09-29 11:06 UTC (permalink / raw)
  To: GNU C Library; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 11 bytes --]

unchanged.

[-- Attachment #2: 0003-New-symbol-version-for-logf-log2f-and-powf-without-S.patch --]
[-- Type: text/x-patch, Size: 22588 bytes --]

From 309a2d3d3543b89bc9fb1b86b02cb84e8b961780 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date: Wed, 13 Sep 2017 17:19:51 +0100
Subject: [PATCH 3/5] New symbol version for logf, log2f and powf without SVID
 compat

This patch changes the logf, log2f and powf error handling semantics
to only set errno according to POSIX rules.  A new symbol version is
introduced at GLIBC_2.27.
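
As an illustration of the POSIX rules for the log family, here is a
minimal sketch of which errno value each class of input maps to.  The
function name sketch_log_errno is hypothetical and this is not the
wrapper code; it only restates the rule (pole error for zero, domain
error for negative input) in executable form.

```c
#include <errno.h>

/* Hypothetical helper: the errno value the POSIX rules assign to a
   log-family input, or 0 if errno is left untouched.  */
static int
sketch_log_errno (float x)
{
  if (x == 0.0f)
    return ERANGE;	/* Pole error: log (+-0) is -inf.  */
  if (x < 0.0f)
    return EDOM;	/* Domain error: result is nan.  */
  return 0;		/* Positive, +inf or nan input.  */
}
```

Both comparisons are false for a nan input, so nan falls through to the
no-error case without an explicit check.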

The old wrappers are kept for compat symbols.

ia64 needed an assembly change so that the new and compat versioned
symbols map to the same function.

All Linux libm abilists are updated.

2017-09-19  Szabolcs Nagy  <szabolcs.nagy@arm.com>

	* math/Versions (logf): New libm symbol at GLIBC_2.27.
	(log2f): Likewise.
	(powf): Likewise.
	* math/w_log2f.c: New file.
	* math/w_logf.c: New file.
	* math/w_powf.c: New file.
	* math/w_log2f_compat.c (__log2f_compat): For compat symbol only.
	* math/w_logf_compat.c (__logf_compat): Likewise.
	* math/w_powf_compat.c (__powf_compat): Likewise.
	* sysdeps/ia64/fpu/e_log2f.S: Add versioned symbols.
	* sysdeps/ia64/fpu/e_logf.S: Likewise.
	* sysdeps/ia64/fpu/e_powf.S: Likewise.
	* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update.
	* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist:
	Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist:
	Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist:
	Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist:
	Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist:
	Likewise.
	* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist:
	Likewise.
	* sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
---
 math/Versions                                                |  2 +-
 math/w_log2f.c                                               |  7 +++++++
 math/w_log2f_compat.c                                        |  6 +++---
 math/w_logf.c                                                |  7 +++++++
 math/w_logf_compat.c                                         |  6 +++---
 math/w_powf.c                                                |  7 +++++++
 math/w_powf_compat.c                                         |  6 +++---
 sysdeps/ia64/fpu/e_log2f.S                                   | 10 ++++++++--
 sysdeps/ia64/fpu/e_logf.S                                    |  6 ++++++
 sysdeps/ia64/fpu/e_powf.S                                    | 10 ++++++++--
 sysdeps/unix/sysv/linux/aarch64/libm.abilist                 |  3 +++
 sysdeps/unix/sysv/linux/alpha/libm.abilist                   |  3 +++
 sysdeps/unix/sysv/linux/arm/libm.abilist                     |  3 +++
 sysdeps/unix/sysv/linux/hppa/libm.abilist                    |  3 +++
 sysdeps/unix/sysv/linux/i386/libm.abilist                    |  3 +++
 sysdeps/unix/sysv/linux/ia64/libm.abilist                    |  3 +++
 sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist           |  3 +++
 sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist             |  3 +++
 sysdeps/unix/sysv/linux/microblaze/libm.abilist              |  3 +++
 sysdeps/unix/sysv/linux/mips/mips32/libm.abilist             |  3 +++
 sysdeps/unix/sysv/linux/mips/mips64/libm.abilist             |  3 +++
 sysdeps/unix/sysv/linux/nios2/libm.abilist                   |  3 +++
 sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist   |  3 +++
 sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist |  3 +++
 sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist    |  3 +++
 sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist       |  3 +++
 sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist            |  3 +++
 sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist            |  3 +++
 sysdeps/unix/sysv/linux/sh/libm.abilist                      |  3 +++
 sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist           |  3 +++
 sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist           |  3 +++
 sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist    |  3 +++
 sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist    |  3 +++
 sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist            |  3 +++
 sysdeps/unix/sysv/linux/x86_64/64/libm.abilist               |  3 +++
 sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist              |  3 +++
 36 files changed, 131 insertions(+), 14 deletions(-)
 create mode 100644 math/w_log2f.c
 create mode 100644 math/w_logf.c
 create mode 100644 math/w_powf.c

diff --git a/math/Versions b/math/Versions
index 380f6a2a1a..2fbdb2f266 100644
--- a/math/Versions
+++ b/math/Versions
@@ -230,6 +230,6 @@ libm {
     fromfpx; fromfpxf; fromfpxl; ufromfpx; ufromfpxf; ufromfpxl;
   }
   GLIBC_2.27 {
-    expf; exp2f;
+    expf; exp2f; logf; log2f; powf;
   }
 }
diff --git a/math/w_log2f.c b/math/w_log2f.c
new file mode 100644
index 0000000000..cda0c3a644
--- /dev/null
+++ b/math/w_log2f.c
@@ -0,0 +1,7 @@
+#include <math-type-macros-float.h>
+#undef __USE_WRAPPER_TEMPLATE
+#define __USE_WRAPPER_TEMPLATE 1
+#undef declare_mgen_alias
+#define declare_mgen_alias(a, b)
+#include <w_log2_template.c>
+versioned_symbol (libm, __log2f, log2f, GLIBC_2_27);
diff --git a/math/w_log2f_compat.c b/math/w_log2f_compat.c
index 295c1620f7..3caa310c51 100644
--- a/math/w_log2f_compat.c
+++ b/math/w_log2f_compat.c
@@ -23,10 +23,10 @@
 #include <libm-alias-float.h>
 
 
-#if LIBM_SVID_COMPAT
+#if LIBM_SVID_COMPAT && SHLIB_COMPAT (libm, GLIBC_2_1, GLIBC_2_27)
 /* wrapper log2f(x) */
 float
-__log2f (float x)
+__log2f_compat (float x)
 {
   if (__builtin_expect (islessequal (x, 0.0f), 0) && _LIB_VERSION != _IEEE_)
     {
@@ -44,5 +44,5 @@ __log2f (float x)
 
   return  __ieee754_log2f (x);
 }
-libm_alias_float (__log2, log2)
+compat_symbol (libm, __log2f_compat, log2f, GLIBC_2_1);
 #endif
diff --git a/math/w_logf.c b/math/w_logf.c
new file mode 100644
index 0000000000..d960e016d7
--- /dev/null
+++ b/math/w_logf.c
@@ -0,0 +1,7 @@
+#include <math-type-macros-float.h>
+#undef __USE_WRAPPER_TEMPLATE
+#define __USE_WRAPPER_TEMPLATE 1
+#undef declare_mgen_alias
+#define declare_mgen_alias(a, b)
+#include <w_log_template.c>
+versioned_symbol (libm, __logf, logf, GLIBC_2_27);
diff --git a/math/w_logf_compat.c b/math/w_logf_compat.c
index 7cdacdf921..936b3a6e67 100644
--- a/math/w_logf_compat.c
+++ b/math/w_logf_compat.c
@@ -23,10 +23,10 @@
 #include <libm-alias-float.h>
 
 
-#if LIBM_SVID_COMPAT
+#if LIBM_SVID_COMPAT && SHLIB_COMPAT (libm, GLIBC_2_0, GLIBC_2_27)
 /* wrapper logf(x) */
 float
-__logf (float x)
+__logf_compat (float x)
 {
   if (__builtin_expect (islessequal (x, 0.0f), 0) && _LIB_VERSION != _IEEE_)
     {
@@ -44,5 +44,5 @@ __logf (float x)
 
   return  __ieee754_logf (x);
 }
-libm_alias_float (__log, log)
+compat_symbol (libm, __logf_compat, logf, GLIBC_2_0);
 #endif
diff --git a/math/w_powf.c b/math/w_powf.c
new file mode 100644
index 0000000000..a18348329e
--- /dev/null
+++ b/math/w_powf.c
@@ -0,0 +1,7 @@
+#include <math-type-macros-float.h>
+#undef __USE_WRAPPER_TEMPLATE
+#define __USE_WRAPPER_TEMPLATE 1
+#undef declare_mgen_alias
+#define declare_mgen_alias(a, b)
+#include <w_pow_template.c>
+versioned_symbol (libm, __powf, powf, GLIBC_2_27);
diff --git a/math/w_powf_compat.c b/math/w_powf_compat.c
index 39e818af7e..7745639efe 100644
--- a/math/w_powf_compat.c
+++ b/math/w_powf_compat.c
@@ -22,10 +22,10 @@
 #include <libm-alias-float.h>
 
 
-#if LIBM_SVID_COMPAT
+#if LIBM_SVID_COMPAT && SHLIB_COMPAT (libm, GLIBC_2_0, GLIBC_2_27)
 /* wrapper powf */
 float
-__powf (float x, float y)
+__powf_compat (float x, float y)
 {
   float z = __ieee754_powf (x, y);
   if (__glibc_unlikely (!isfinite (z)))
@@ -60,5 +60,5 @@ __powf (float x, float y)
 
   return z;
 }
-libm_alias_float (__pow, pow)
+compat_symbol (libm, __powf_compat, powf, GLIBC_2_0);
 #endif
diff --git a/sysdeps/ia64/fpu/e_log2f.S b/sysdeps/ia64/fpu/e_log2f.S
index 2c3f18f360..9b754d1043 100644
--- a/sysdeps/ia64/fpu/e_log2f.S
+++ b/sysdeps/ia64/fpu/e_log2f.S
@@ -252,7 +252,7 @@ LOCAL_OBJECT_END(T_table)
 
 
 .section .text
-GLOBAL_LIBM_ENTRY(log2f)
+GLOBAL_LIBM_ENTRY(__log2f)
 
 { .mfi
   alloc r32=ar.pfs,1,4,4,0
@@ -491,7 +491,13 @@ SPECIAL_log2f:
   br.ret.sptk b0;;
 }
 
-GLOBAL_LIBM_END(log2f)
+GLOBAL_LIBM_END(__log2f)
+#ifdef SHARED
+.symver __log2f,log2f@@GLIBC_2.27
+.weak __log2f_compat
+.set __log2f_compat,__log2f
+.symver __log2f_compat,log2f@GLIBC_2.2
+#endif
 
 
 LOCAL_LIBM_ENTRY(__libm_error_region)
diff --git a/sysdeps/ia64/fpu/e_logf.S b/sysdeps/ia64/fpu/e_logf.S
index 2dda2186d0..d5f5437793 100644
--- a/sysdeps/ia64/fpu/e_logf.S
+++ b/sysdeps/ia64/fpu/e_logf.S
@@ -1088,6 +1088,12 @@ logf_libm_err:
       nop.i         0
 };;
 GLOBAL_IEEE754_END(logf)
+#ifdef SHARED
+.symver logf,logf@@GLIBC_2.27
+.weak __logf_compat
+.set __logf_compat,__logf
+.symver __logf_compat,logf@GLIBC_2.2
+#endif
 
 
 // Stack operations when calling error support.
diff --git a/sysdeps/ia64/fpu/e_powf.S b/sysdeps/ia64/fpu/e_powf.S
index d61bc79e5e..388391624f 100644
--- a/sysdeps/ia64/fpu/e_powf.S
+++ b/sysdeps/ia64/fpu/e_powf.S
@@ -868,7 +868,7 @@ data8 0xEAC0C6E7DD24392F , 0x00003FFF
 LOCAL_OBJECT_END(pow_tbl2)
 
 .section .text
-GLOBAL_LIBM_ENTRY(powf)
+GLOBAL_LIBM_ENTRY(__powf)
 
 // Get exponent of x.  Will be used to calculate K.
 { .mfi
@@ -2002,7 +2002,13 @@ POW_OVER_UNDER_ERROR:
 }
 ;;
 
-GLOBAL_LIBM_END(powf)
+GLOBAL_LIBM_END(__powf)
+#ifdef SHARED
+.symver __powf,powf@@GLIBC_2.27
+.weak __powf_compat
+.set __powf_compat,__powf
+.symver __powf_compat,powf@GLIBC_2.2
+#endif
 
 
 LOCAL_LIBM_ENTRY(__libm_error_region)
diff --git a/sysdeps/unix/sysv/linux/aarch64/libm.abilist b/sysdeps/unix/sysv/linux/aarch64/libm.abilist
index 10102eeaff..3f0190ae03 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libm.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libm.abilist
@@ -463,3 +463,6 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
diff --git a/sysdeps/unix/sysv/linux/alpha/libm.abilist b/sysdeps/unix/sysv/linux/alpha/libm.abilist
index e09a115aa9..78edc5e3d9 100644
--- a/sysdeps/unix/sysv/linux/alpha/libm.abilist
+++ b/sysdeps/unix/sysv/linux/alpha/libm.abilist
@@ -473,6 +473,9 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.3.4 GLIBC_2.3.4 A
 GLIBC_2.3.4 __c1_cabsf F
 GLIBC_2.3.4 __c1_cacosf F
diff --git a/sysdeps/unix/sysv/linux/arm/libm.abilist b/sysdeps/unix/sysv/linux/arm/libm.abilist
index 8095876449..b3fd4a27b2 100644
--- a/sysdeps/unix/sysv/linux/arm/libm.abilist
+++ b/sysdeps/unix/sysv/linux/arm/libm.abilist
@@ -120,6 +120,9 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
 GLIBC_2.4 _LIB_VERSION D 0x4
 GLIBC_2.4 __clog10 F
diff --git a/sysdeps/unix/sysv/linux/hppa/libm.abilist b/sysdeps/unix/sysv/linux/hppa/libm.abilist
index 19d40ef50d..ffa61bf1b1 100644
--- a/sysdeps/unix/sysv/linux/hppa/libm.abilist
+++ b/sysdeps/unix/sysv/linux/hppa/libm.abilist
@@ -432,5 +432,8 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
 GLIBC_2.4 exp2l F
diff --git a/sysdeps/unix/sysv/linux/i386/libm.abilist b/sysdeps/unix/sysv/linux/i386/libm.abilist
index 791fba28e5..1a7e6bf449 100644
--- a/sysdeps/unix/sysv/linux/i386/libm.abilist
+++ b/sysdeps/unix/sysv/linux/i386/libm.abilist
@@ -614,4 +614,7 @@ GLIBC_2.26 ynf128 F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
diff --git a/sysdeps/unix/sysv/linux/ia64/libm.abilist b/sysdeps/unix/sysv/linux/ia64/libm.abilist
index 65a0fbe56a..7e15735eae 100644
--- a/sysdeps/unix/sysv/linux/ia64/libm.abilist
+++ b/sysdeps/unix/sysv/linux/ia64/libm.abilist
@@ -543,4 +543,7 @@ GLIBC_2.26 ynf128 F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist
index 8095876449..b3fd4a27b2 100644
--- a/sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist
@@ -120,6 +120,9 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
 GLIBC_2.4 _LIB_VERSION D 0x4
 GLIBC_2.4 __clog10 F
diff --git a/sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist b/sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist
index 5e692dda7b..aae61169f9 100644
--- a/sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist
@@ -474,4 +474,7 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
diff --git a/sysdeps/unix/sysv/linux/microblaze/libm.abilist b/sysdeps/unix/sysv/linux/microblaze/libm.abilist
index 65f1d5b451..0d3b4b1e90 100644
--- a/sysdeps/unix/sysv/linux/microblaze/libm.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/libm.abilist
@@ -431,3 +431,6 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/libm.abilist b/sysdeps/unix/sysv/linux/mips/mips32/libm.abilist
index c32ea5b96a..d32d58d4e4 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/libm.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/libm.abilist
@@ -433,6 +433,9 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
 GLIBC_2.4 exp2l F
 _gp_disp _gp_disp A
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/libm.abilist b/sysdeps/unix/sysv/linux/mips/mips64/libm.abilist
index 18b2aa2404..f33ba0576b 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/libm.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/libm.abilist
@@ -465,4 +465,7 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
diff --git a/sysdeps/unix/sysv/linux/nios2/libm.abilist b/sysdeps/unix/sysv/linux/nios2/libm.abilist
index e492a68e9d..0fe34e98fa 100644
--- a/sysdeps/unix/sysv/linux/nios2/libm.abilist
+++ b/sysdeps/unix/sysv/linux/nios2/libm.abilist
@@ -431,3 +431,6 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist
index ad8f0372f4..ed013deefd 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist
@@ -476,6 +476,9 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
 GLIBC_2.4 __clog10l F
 GLIBC_2.4 __finitel F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist
index 9c26b5b809..6f2873dc80 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist
@@ -475,6 +475,9 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
 GLIBC_2.4 __clog10l F
 GLIBC_2.4 __finitel F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist
index 8e36699f28..723be46c20 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist
@@ -608,3 +608,6 @@ GLIBC_2.26 ynf128 F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist
index 9ca0c3ccfc..f3aeac2e1e 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist
@@ -151,6 +151,9 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 _LIB_VERSION D 0x4
 GLIBC_2.3 __clog10 F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist b/sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist
index 8a79f0137f..2b758e80fd 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist
@@ -463,6 +463,9 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
 GLIBC_2.4 __clog10l F
 GLIBC_2.4 __finitel F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist b/sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist
index df81853618..62c9bb57a8 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist
@@ -461,6 +461,9 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
 GLIBC_2.4 __clog10l F
 GLIBC_2.4 __finitel F
diff --git a/sysdeps/unix/sysv/linux/sh/libm.abilist b/sysdeps/unix/sysv/linux/sh/libm.abilist
index 6b6a42dc9c..a57fbc0eac 100644
--- a/sysdeps/unix/sysv/linux/sh/libm.abilist
+++ b/sysdeps/unix/sysv/linux/sh/libm.abilist
@@ -432,5 +432,8 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
 GLIBC_2.4 exp2l F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist b/sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist
index 24d67d22e1..f8f10e5952 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist
@@ -467,6 +467,9 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
 GLIBC_2.4 __clog10l F
 GLIBC_2.4 __finitel F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist b/sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist
index 2fdccc0de3..b5412c9b48 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist
@@ -464,4 +464,7 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
diff --git a/sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist
index 98bc348f91..b711e87026 100644
--- a/sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist
+++ b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist
@@ -432,3 +432,6 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
diff --git a/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist
index 98bc348f91..b711e87026 100644
--- a/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist
+++ b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist
@@ -432,3 +432,6 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
diff --git a/sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist b/sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist
index 98bc348f91..b711e87026 100644
--- a/sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist
+++ b/sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist
@@ -432,3 +432,6 @@ GLIBC_2.25 ufromfpxl F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libm.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libm.abilist
index e6fd3fe3df..201c2ab1b1 100644
--- a/sysdeps/unix/sysv/linux/x86_64/64/libm.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/64/libm.abilist
@@ -603,4 +603,7 @@ GLIBC_2.26 ynf128 F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
 GLIBC_2.4 GLIBC_2.4 A
diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist
index afa7b98697..10e389a96a 100644
--- a/sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist
@@ -602,3 +602,6 @@ GLIBC_2.26 ynf128 F
 GLIBC_2.27 GLIBC_2.27 A
 GLIBC_2.27 exp2f F
 GLIBC_2.27 expf F
+GLIBC_2.27 log2f F
+GLIBC_2.27 logf F
+GLIBC_2.27 powf F
-- 
2.11.0



* [PATCH 4/5 v4] Do not wrap expf and exp2f
  2017-09-29 11:00 [PATCH 0/5] Optimized expf, exp2f, logf, log2f and powf Szabolcs Nagy
                   ` (2 preceding siblings ...)
  2017-09-29 11:06 ` [PATCH 3/5] New symbol version for logf, log2f and powf without SVID compat Szabolcs Nagy
@ 2017-09-29 11:09 ` Szabolcs Nagy
  2017-09-29 21:06   ` Joseph Myers
  2017-09-29 11:10 ` [PATCH 5/5 v2] Do not wrap logf, log2f and powf Szabolcs Nagy
  4 siblings, 1 reply; 11+ messages in thread
From: Szabolcs Nagy @ 2017-09-29 11:09 UTC (permalink / raw)
  To: GNU C Library; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 116 bytes --]

v4:
- ifdefs for multiarch support.
- add w_expf.c for power8, tested with --disable-multi-arch --with-cpu=power8.


[-- Attachment #2: 0004-Do-not-wrap-expf-and-exp2f.patch --]
[-- Type: text/x-patch, Size: 7445 bytes --]

From 7dc9a7c1aa60966df497d6405aab3bdfab5c6083 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date: Tue, 12 Sep 2017 12:44:18 +0100
Subject: [PATCH 4/5] Do not wrap expf and exp2f

The new generic expf and exp2f code doesn't need wrappers any more: it
sets errno inline, so the wrappers are only used on targets that need them.
(If the wrapper is needed, the top-level wrapper code is included;
otherwise an empty w_exp*f.c is used to suppress the wrapper.)
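
As an illustration of the difference between the two layouts, here is a
toy sketch. It is not the glibc code: core_pow2, wrapped_pow2 and
inline_pow2 are made-up names, and the "function" is just 2^n computed by
repeated doubling. It only shows where the errno assignment lives in each
case.

```c
#include <assert.h>
#include <errno.h>
#include <float.h>

/* Toy core routine (hypothetical): computes 2^n by repeated doubling and
   signals overflow only through an infinite return value.  */
static float
core_pow2 (int n)
{
  assert (n >= 0);
  float y = 1.0f;
  while (n-- > 0)
    y *= 2.0f;
  return y;
}

/* Wrapped layout: a separate entry point inspects the result afterwards
   and sets errno on behalf of the core routine.  */
float
wrapped_pow2 (int n)
{
  float y = core_pow2 (n);
  if (y > FLT_MAX)	/* Only +Inf compares greater than FLT_MAX.  */
    errno = ERANGE;
  return y;
}

/* Unwrapped layout: the routine handles its own error path, so the
   public symbol can refer to it directly and no wrapper is required.  */
float
inline_pow2 (int n)
{
  assert (n >= 0);
  float y = 1.0f;
  while (n-- > 0)
    y *= 2.0f;
  if (y > FLT_MAX)
    errno = ERANGE;
  return y;
}
```

Both entry points behave the same from the caller's side; the unwrapped
one simply avoids the extra call and the second check.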

A powerpc64 expf implementation includes the expf C code directly, which
needed some changes.

2017-09-25  Szabolcs Nagy  <szabolcs.nagy@arm.com>
	    H.J. Lu  <hongjiu.lu@intel.com>

	* sysdeps/ieee754/flt-32/e_exp2f.c (__exp2f): Define without wrapper.
	* sysdeps/ieee754/flt-32/e_expf.c (__expf): Likewise.
	* sysdeps/ieee754/flt-32/w_exp2f.c: New file.
	* sysdeps/ieee754/flt-32/w_expf.c: New file.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Update for
	the new expf code.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: New file.
	* sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_exp2f.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_expf.c: New file.
	* sysdeps/i386/fpu/w_exp2f.c: New file.
	* sysdeps/i386/fpu/w_expf.c: New file.
	* sysdeps/i386/i686/fpu/multiarch/w_expf.c: New file.
	* sysdeps/x86_64/fpu/w_expf.c: New file.
---
 sysdeps/i386/fpu/w_exp2f.c                             |  1 +
 sysdeps/i386/fpu/w_expf.c                              |  1 +
 sysdeps/i386/i686/fpu/multiarch/w_expf.c               |  1 +
 sysdeps/ieee754/flt-32/e_exp2f.c                       |  9 +++++++--
 sysdeps/ieee754/flt-32/e_expf.c                        | 16 ++++++++++++++--
 sysdeps/ieee754/flt-32/w_exp2f.c                       |  1 +
 sysdeps/ieee754/flt-32/w_expf.c                        |  1 +
 sysdeps/m68k/m680x0/fpu/w_exp2f.c                      |  1 +
 sysdeps/m68k/m680x0/fpu/w_expf.c                       |  1 +
 sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c |  5 +----
 sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c       |  1 +
 sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c          |  1 +
 sysdeps/x86_64/fpu/w_expf.c                            |  1 +
 13 files changed, 32 insertions(+), 8 deletions(-)
 create mode 100644 sysdeps/i386/fpu/w_exp2f.c
 create mode 100644 sysdeps/i386/fpu/w_expf.c
 create mode 100644 sysdeps/i386/i686/fpu/multiarch/w_expf.c
 create mode 100644 sysdeps/ieee754/flt-32/w_exp2f.c
 create mode 100644 sysdeps/ieee754/flt-32/w_expf.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/w_exp2f.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/w_expf.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c
 create mode 100644 sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c
 create mode 100644 sysdeps/x86_64/fpu/w_expf.c

diff --git a/sysdeps/i386/fpu/w_exp2f.c b/sysdeps/i386/fpu/w_exp2f.c
new file mode 100644
index 0000000000..583065d12a
--- /dev/null
+++ b/sysdeps/i386/fpu/w_exp2f.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_exp2f.c>
diff --git a/sysdeps/i386/fpu/w_expf.c b/sysdeps/i386/fpu/w_expf.c
new file mode 100644
index 0000000000..b5fe164520
--- /dev/null
+++ b/sysdeps/i386/fpu/w_expf.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_expf.c>
diff --git a/sysdeps/i386/i686/fpu/multiarch/w_expf.c b/sysdeps/i386/i686/fpu/multiarch/w_expf.c
new file mode 100644
index 0000000000..b5fe164520
--- /dev/null
+++ b/sysdeps/i386/i686/fpu/multiarch/w_expf.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_expf.c>
diff --git a/sysdeps/ieee754/flt-32/e_exp2f.c b/sysdeps/ieee754/flt-32/e_exp2f.c
index 72b7d8829f..31b660b07b 100644
--- a/sysdeps/ieee754/flt-32/e_exp2f.c
+++ b/sysdeps/ieee754/flt-32/e_exp2f.c
@@ -18,6 +18,7 @@
 
 #include <math.h>
 #include <stdint.h>
+#include <shlib-compat.h>
 #include "math_config.h"
 
 /*
@@ -42,7 +43,7 @@ top12 (float x)
 }
 
 float
-__ieee754_exp2f (float x)
+__exp2f (float x)
 {
   uint32_t abstop;
   uint64_t ki, t;
@@ -85,4 +86,8 @@ __ieee754_exp2f (float x)
   y = y * s;
   return (float) y;
 }
-strong_alias (__ieee754_exp2f, __exp2f_finite)
+#ifndef __exp2f
+strong_alias (__exp2f, __ieee754_exp2f)
+strong_alias (__exp2f, __exp2f_finite)
+versioned_symbol (libm, __exp2f, exp2f, GLIBC_2_27);
+#endif
diff --git a/sysdeps/ieee754/flt-32/e_expf.c b/sysdeps/ieee754/flt-32/e_expf.c
index 12239e1862..74a383a02c 100644
--- a/sysdeps/ieee754/flt-32/e_expf.c
+++ b/sysdeps/ieee754/flt-32/e_expf.c
@@ -16,8 +16,14 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
+#ifdef __expf
+# undef libm_hidden_proto
+# define libm_hidden_proto(ignored)
+#endif
+
 #include <math.h>
 #include <stdint.h>
+#include <shlib-compat.h>
 #include "math_config.h"
 
 /*
@@ -42,7 +48,7 @@ top12 (float x)
 }
 
 float
-__ieee754_expf (float x)
+__expf (float x)
 {
   uint32_t abstop;
   uint64_t ki, t;
@@ -99,4 +105,10 @@ __ieee754_expf (float x)
   y = y * s;
   return (float) y;
 }
-strong_alias (__ieee754_expf, __expf_finite)
+
+#ifndef __expf
+hidden_def (__expf)
+strong_alias (__expf, __ieee754_expf)
+strong_alias (__expf, __expf_finite)
+versioned_symbol (libm, __expf, expf, GLIBC_2_27);
+#endif
diff --git a/sysdeps/ieee754/flt-32/w_exp2f.c b/sysdeps/ieee754/flt-32/w_exp2f.c
new file mode 100644
index 0000000000..1cc8931700
--- /dev/null
+++ b/sysdeps/ieee754/flt-32/w_exp2f.c
@@ -0,0 +1 @@
+/* Not needed.  */
diff --git a/sysdeps/ieee754/flt-32/w_expf.c b/sysdeps/ieee754/flt-32/w_expf.c
new file mode 100644
index 0000000000..1cc8931700
--- /dev/null
+++ b/sysdeps/ieee754/flt-32/w_expf.c
@@ -0,0 +1 @@
+/* Not needed.  */
diff --git a/sysdeps/m68k/m680x0/fpu/w_exp2f.c b/sysdeps/m68k/m680x0/fpu/w_exp2f.c
new file mode 100644
index 0000000000..583065d12a
--- /dev/null
+++ b/sysdeps/m68k/m680x0/fpu/w_exp2f.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_exp2f.c>
diff --git a/sysdeps/m68k/m680x0/fpu/w_expf.c b/sysdeps/m68k/m680x0/fpu/w_expf.c
new file mode 100644
index 0000000000..b5fe164520
--- /dev/null
+++ b/sysdeps/m68k/m680x0/fpu/w_expf.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_expf.c>
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c
index b236290ea2..2cd9a5ec8b 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c
@@ -16,9 +16,6 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#undef strong_alias
-#define strong_alias(a, b)
-
-#define __ieee754_expf __ieee754_expf_ppc64
+#define __expf __ieee754_expf_ppc64
 
 #include <sysdeps/ieee754/flt-32/e_expf.c>
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c
new file mode 100644
index 0000000000..b5fe164520
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_expf.c>
diff --git a/sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c b/sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c
new file mode 100644
index 0000000000..b5fe164520
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_expf.c>
diff --git a/sysdeps/x86_64/fpu/w_expf.c b/sysdeps/x86_64/fpu/w_expf.c
new file mode 100644
index 0000000000..b5fe164520
--- /dev/null
+++ b/sysdeps/x86_64/fpu/w_expf.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_expf.c>
-- 
2.11.0



* [PATCH 5/5 v2] Do not wrap logf, log2f and powf
  2017-09-29 11:00 [PATCH 0/5] Optimized expf, exp2f, logf, log2f and powf Szabolcs Nagy
                   ` (3 preceding siblings ...)
  2017-09-29 11:09 ` [PATCH 4/5 v4] Do not wrap expf and exp2f Szabolcs Nagy
@ 2017-09-29 11:10 ` Szabolcs Nagy
  2017-09-29 21:08   ` Joseph Myers
  4 siblings, 1 reply; 11+ messages in thread
From: Szabolcs Nagy @ 2017-09-29 11:10 UTC (permalink / raw)
  To: GNU C Library; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 38 bytes --]

v2:
- ifndefs for multiarch support.
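
A rough sketch of what the guards are for, assuming the same pattern that
e_expf-ppc64.c uses in patch 4/5 (a variant file defines the function name
as a macro before including the generic source). The names generic_fn,
generic_fn_variant, public_fn and check_variant_build are hypothetical
and only stand in for the real symbols.

```c
#include <assert.h>

/* Simulate a variant build: a multiarch file would place this define
   ahead of its #include of the generic source.  */
#define generic_fn generic_fn_variant

/* Stand-in for the generic source file.  With the macro above, this
   definition is emitted under the name generic_fn_variant.  */
int
generic_fn (int x)
{
  return x + 1;
}

#ifndef generic_fn
/* Default build only: export the public entry point.  In the patch this
   is where strong_alias and versioned_symbol appear; it is skipped here
   because generic_fn is defined as a macro.  */
int
public_fn (int x)
{
  return generic_fn (x);
}
#endif

/* Illustrative self-check: the definition is reachable under the
   variant name in this configuration.  */
void
check_variant_build (void)
{
  assert (generic_fn_variant (1) == 2);
}
```

Without the guard, every variant that includes the generic file would
also try to define the aliases and the versioned public symbol.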


[-- Attachment #2: 0005-Do-not-wrap-logf-log2f-and-powf.patch --]
[-- Type: text/x-patch, Size: 6573 bytes --]

From dc230e23df20ac3ab13e2cc2e858083c0550bb31 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date: Wed, 13 Sep 2017 18:14:26 +0100
Subject: [PATCH 5/5] Do not wrap logf, log2f and powf

The new generic logf, log2f and powf code doesn't need wrappers any more:
it sets errno inline, so the wrappers are only used on targets that need them.

2017-09-19  Szabolcs Nagy  <szabolcs.nagy@arm.com>

	* sysdeps/ieee754/flt-32/e_log2f.c (__log2f): Define without wrapper.
	* sysdeps/ieee754/flt-32/e_logf.c (__logf): Likewise.
	* sysdeps/ieee754/flt-32/e_powf.c (__powf): Likewise.
	* sysdeps/ieee754/flt-32/w_log2f.c: New file.
	* sysdeps/ieee754/flt-32/w_logf.c: New file.
	* sysdeps/ieee754/flt-32/w_powf.c: New file.
	* sysdeps/i386/fpu/w_log2f.c: New file.
	* sysdeps/i386/fpu/w_logf.c: New file.
	* sysdeps/i386/fpu/w_powf.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_log2f.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_logf.c: New file.
	* sysdeps/m68k/m680x0/fpu/w_powf.c: New file.
---
 sysdeps/i386/fpu/w_log2f.c        | 1 +
 sysdeps/i386/fpu/w_logf.c         | 1 +
 sysdeps/i386/fpu/w_powf.c         | 1 +
 sysdeps/ieee754/flt-32/e_log2f.c  | 9 +++++++--
 sysdeps/ieee754/flt-32/e_logf.c   | 9 +++++++--
 sysdeps/ieee754/flt-32/e_powf.c   | 9 +++++++--
 sysdeps/ieee754/flt-32/w_log2f.c  | 1 +
 sysdeps/ieee754/flt-32/w_logf.c   | 1 +
 sysdeps/ieee754/flt-32/w_powf.c   | 1 +
 sysdeps/m68k/m680x0/fpu/w_log2f.c | 1 +
 sysdeps/m68k/m680x0/fpu/w_logf.c  | 1 +
 sysdeps/m68k/m680x0/fpu/w_powf.c  | 1 +
 12 files changed, 30 insertions(+), 6 deletions(-)
 create mode 100644 sysdeps/i386/fpu/w_log2f.c
 create mode 100644 sysdeps/i386/fpu/w_logf.c
 create mode 100644 sysdeps/i386/fpu/w_powf.c
 create mode 100644 sysdeps/ieee754/flt-32/w_log2f.c
 create mode 100644 sysdeps/ieee754/flt-32/w_logf.c
 create mode 100644 sysdeps/ieee754/flt-32/w_powf.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/w_log2f.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/w_logf.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/w_powf.c

diff --git a/sysdeps/i386/fpu/w_log2f.c b/sysdeps/i386/fpu/w_log2f.c
new file mode 100644
index 0000000000..3f5c71cec2
--- /dev/null
+++ b/sysdeps/i386/fpu/w_log2f.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_log2f.c>
diff --git a/sysdeps/i386/fpu/w_logf.c b/sysdeps/i386/fpu/w_logf.c
new file mode 100644
index 0000000000..ea48d1356e
--- /dev/null
+++ b/sysdeps/i386/fpu/w_logf.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_logf.c>
diff --git a/sysdeps/i386/fpu/w_powf.c b/sysdeps/i386/fpu/w_powf.c
new file mode 100644
index 0000000000..d133216f5b
--- /dev/null
+++ b/sysdeps/i386/fpu/w_powf.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_powf.c>
diff --git a/sysdeps/ieee754/flt-32/e_log2f.c b/sysdeps/ieee754/flt-32/e_log2f.c
index 6c42f27843..ef13b372cb 100644
--- a/sysdeps/ieee754/flt-32/e_log2f.c
+++ b/sysdeps/ieee754/flt-32/e_log2f.c
@@ -18,6 +18,7 @@
 
 #include <math.h>
 #include <stdint.h>
+#include <shlib-compat.h>
 #include "math_config.h"
 
 /*
@@ -34,7 +35,7 @@ Relative error: 1.9 * 2^-26 (before rounding.)
 #define OFF 0x3f330000
 
 float
-__ieee754_log2f (float x)
+__log2f (float x)
 {
   /* double_t for better performance on targets with FLT_EVAL_METHOD==2.  */
   double_t z, r, r2, p, y, y0, invc, logc;
@@ -85,4 +86,8 @@ __ieee754_log2f (float x)
   y = y * r2 + p;
   return (float) y;
 }
-strong_alias (__ieee754_log2f, __log2f_finite)
+#ifndef __log2f
+strong_alias (__log2f, __ieee754_log2f)
+strong_alias (__log2f, __log2f_finite)
+versioned_symbol (libm, __log2f, log2f, GLIBC_2_27);
+#endif
diff --git a/sysdeps/ieee754/flt-32/e_logf.c b/sysdeps/ieee754/flt-32/e_logf.c
index b8d262441f..ea847b57ec 100644
--- a/sysdeps/ieee754/flt-32/e_logf.c
+++ b/sysdeps/ieee754/flt-32/e_logf.c
@@ -18,6 +18,7 @@
 
 #include <math.h>
 #include <stdint.h>
+#include <shlib-compat.h>
 #include "math_config.h"
 
 /*
@@ -35,7 +36,7 @@ Relative error: 1.957 * 2^-26 (before rounding.)
 #define OFF 0x3f330000
 
 float
-__ieee754_logf (float x)
+__logf (float x)
 {
   /* double_t for better performance on targets with FLT_EVAL_METHOD==2.  */
   double_t z, r, r2, y, y0, invc, logc;
@@ -84,4 +85,8 @@ __ieee754_logf (float x)
   y = y * r2 + (y0 + r);
   return (float) y;
 }
-strong_alias (__ieee754_logf, __logf_finite)
+#ifndef __logf
+strong_alias (__logf, __ieee754_logf)
+strong_alias (__logf, __logf_finite)
+versioned_symbol (libm, __logf, logf, GLIBC_2_27);
+#endif
diff --git a/sysdeps/ieee754/flt-32/e_powf.c b/sysdeps/ieee754/flt-32/e_powf.c
index 644a18d05e..08d2c6d058 100644
--- a/sysdeps/ieee754/flt-32/e_powf.c
+++ b/sysdeps/ieee754/flt-32/e_powf.c
@@ -18,6 +18,7 @@
 
 #include <math.h>
 #include <stdint.h>
+#include <shlib-compat.h>
 #include "math_config.h"
 
 /*
@@ -139,7 +140,7 @@ zeroinfnan (uint32_t ix)
 }
 
 float
-__ieee754_powf (float x, float y)
+__powf (float x, float y)
 {
   unsigned long sign_bias = 0;
   uint32_t ix, iy;
@@ -214,4 +215,8 @@ __ieee754_powf (float x, float y)
     }
   return (float) exp2_inline (ylogx, sign_bias);
 }
-strong_alias (__ieee754_powf, __powf_finite)
+#ifndef __powf
+strong_alias (__powf, __ieee754_powf)
+strong_alias (__powf, __powf_finite)
+versioned_symbol (libm, __powf, powf, GLIBC_2_27);
+#endif
diff --git a/sysdeps/ieee754/flt-32/w_log2f.c b/sysdeps/ieee754/flt-32/w_log2f.c
new file mode 100644
index 0000000000..1cc8931700
--- /dev/null
+++ b/sysdeps/ieee754/flt-32/w_log2f.c
@@ -0,0 +1 @@
+/* Not needed.  */
diff --git a/sysdeps/ieee754/flt-32/w_logf.c b/sysdeps/ieee754/flt-32/w_logf.c
new file mode 100644
index 0000000000..1cc8931700
--- /dev/null
+++ b/sysdeps/ieee754/flt-32/w_logf.c
@@ -0,0 +1 @@
+/* Not needed.  */
diff --git a/sysdeps/ieee754/flt-32/w_powf.c b/sysdeps/ieee754/flt-32/w_powf.c
new file mode 100644
index 0000000000..1cc8931700
--- /dev/null
+++ b/sysdeps/ieee754/flt-32/w_powf.c
@@ -0,0 +1 @@
+/* Not needed.  */
diff --git a/sysdeps/m68k/m680x0/fpu/w_log2f.c b/sysdeps/m68k/m680x0/fpu/w_log2f.c
new file mode 100644
index 0000000000..3f5c71cec2
--- /dev/null
+++ b/sysdeps/m68k/m680x0/fpu/w_log2f.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_log2f.c>
diff --git a/sysdeps/m68k/m680x0/fpu/w_logf.c b/sysdeps/m68k/m680x0/fpu/w_logf.c
new file mode 100644
index 0000000000..ea48d1356e
--- /dev/null
+++ b/sysdeps/m68k/m680x0/fpu/w_logf.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_logf.c>
diff --git a/sysdeps/m68k/m680x0/fpu/w_powf.c b/sysdeps/m68k/m680x0/fpu/w_powf.c
new file mode 100644
index 0000000000..d133216f5b
--- /dev/null
+++ b/sysdeps/m68k/m680x0/fpu/w_powf.c
@@ -0,0 +1 @@
+#include <sysdeps/../math/w_powf.c>
-- 
2.11.0



* Re: [PATCH 1/5 v3] New generic log2f
  2017-09-29 11:03 ` [PATCH 1/5 v3] New generic log2f Szabolcs Nagy
@ 2017-09-29 16:06   ` Joseph Myers
  0 siblings, 0 replies; 11+ messages in thread
From: Joseph Myers @ 2017-09-29 16:06 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: GNU C Library, nd

This version is OK.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH 2/5 v3] New generic powf
  2017-09-29 11:05 ` [PATCH 2/5 v3] New generic powf Szabolcs Nagy
@ 2017-09-29 16:22   ` Joseph Myers
  0 siblings, 0 replies; 11+ messages in thread
From: Joseph Myers @ 2017-09-29 16:22 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: GNU C Library, nd

On Fri, 29 Sep 2017, Szabolcs Nagy wrote:

> +#if TOINT_INTRINSICS
> +#define POWF_SCALE_BITS EXP2F_TABLE_BITS
> +#else
> +#define POWF_SCALE_BITS 0
> +#endif

Missing preprocessor indentation inside #if ("# define").  OK with that 
fixed, provided you've done execution testing of this code on 
architectures both with and without TOINT_INTRINSICS used.
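
For reference, the requested form puts one space after the '#' for
directives nested inside the #if. The two defines at the top are
placeholder values so that the fragment stands alone; they are
assumptions for illustration, not the values used in the patch.

```c
/* Placeholder values (assumptions for illustration only).  */
#define TOINT_INTRINSICS 0
#define EXP2F_TABLE_BITS 5

#if TOINT_INTRINSICS
# define POWF_SCALE_BITS EXP2F_TABLE_BITS
#else
# define POWF_SCALE_BITS 0
#endif
```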

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH 3/5] New symbol version for logf, log2f and powf without SVID compat
  2017-09-29 11:06 ` [PATCH 3/5] New symbol version for logf, log2f and powf without SVID compat Szabolcs Nagy
@ 2017-09-29 21:01   ` Joseph Myers
  0 siblings, 0 replies; 11+ messages in thread
From: Joseph Myers @ 2017-09-29 21:01 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: GNU C Library, nd

This patch is OK.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH 4/5 v4] Do not wrap expf and exp2f
  2017-09-29 11:09 ` [PATCH 4/5 v4] Do not wrap expf and exp2f Szabolcs Nagy
@ 2017-09-29 21:06   ` Joseph Myers
  0 siblings, 0 replies; 11+ messages in thread
From: Joseph Myers @ 2017-09-29 21:06 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: GNU C Library, nd, tuliom

OK.  I'd strongly encourage powerpc people to see whether there is any 
advantage to the power8 e_expf.S, or whether, possibly after tuning 
TOINT_* (and potentially any other details of the expf implementation that 
can be tuned for different architectures at the C level), it's at least as 
good just to build the C version for power8.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH 5/5 v2] Do not wrap logf, log2f and powf
  2017-09-29 11:10 ` [PATCH 5/5 v2] Do not wrap logf, log2f and powf Szabolcs Nagy
@ 2017-09-29 21:08   ` Joseph Myers
  0 siblings, 0 replies; 11+ messages in thread
From: Joseph Myers @ 2017-09-29 21:08 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: GNU C Library, nd

OK.

-- 
Joseph S. Myers
joseph@codesourcery.com


