v4:
- Remove ifdefs from the code that is not configurable.
- Update the NEWS entry.
- Document tested targets in the commit message.

v3:
- Add empty e_log2_data.c to targets that have their own log2.
- Fix GNU style issues.
- Document internal function semantics.
- Fix remaining __FP_FAST_FMA that was missed in v2.
- Add NEWS entry.

v2:
- Use __FP_FAST_MATH and __builtin_fma.
- Drop the wordsize-64/ version.
- Add e_log2_data to the Makefile.

A similar algorithm is used as in log:

  log2(2^k x) = k + log2(c) + log2(x/c)

where the last term is approximated by a polynomial of x/c - 1; the
first order coefficient is about 1/ln2 in this case.  There is a
separate code path for computing x/c - 1 precisely when the fma
instruction is not available, for which the table size is doubled.

The worst case error is 0.547 ULP (0.55 without fma), and the read only
global data size is 1168 bytes (2192 without fma) on aarch64.  The
non-nearest rounding error is less than 1 ULP.

Improvements on Cortex-A72 compared to current glibc master:
latency: 2.0x
throughput: 1.9x

Tested on aarch64-linux-gnu (defined __FP_FAST_FMA),
arm-linux-gnueabihf (!defined __FP_FAST_FMA) and
x86_64-linux-gnu (!defined __FP_FAST_FMA) targets.

2018-07-06  Szabolcs Nagy

	* NEWS: Mention log2 improvements.
	* math/Makefile (type-double-routines): Add e_log2_data.
	* sysdeps/i386/fpu/e_log2_data.c: New file.
	* sysdeps/ia64/fpu/e_log2_data.c: New file.
	* sysdeps/ieee754/dbl-64/e_log2.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_log2_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (__log2_data): Add.
	* sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c: Remove.
	* sysdeps/m68k/m680x0/fpu/e_log2_data.c: New file.
---
 NEWS                                        |   2 +-
 math/Makefile                               |   2 +-
 sysdeps/i386/fpu/e_log2_data.c              |   1 +
 sysdeps/ia64/fpu/e_log2_data.c              |   1 +
 sysdeps/ieee754/dbl-64/e_log2.c             | 240 ++++++++++++++--------------
 sysdeps/ieee754/dbl-64/e_log2_data.c        | 194 ++++++++++++++++++++++
 sysdeps/ieee754/dbl-64/math_config.h        |  15 ++
 sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c | 128 ---------------
 sysdeps/m68k/m680x0/fpu/e_log2_data.c       |   1 +
 9 files changed, 338 insertions(+), 246 deletions(-)
 create mode 100644 sysdeps/i386/fpu/e_log2_data.c
 create mode 100644 sysdeps/ia64/fpu/e_log2_data.c
 create mode 100644 sysdeps/ieee754/dbl-64/e_log2_data.c
 delete mode 100644 sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c
 create mode 100644 sysdeps/m68k/m680x0/fpu/e_log2_data.c