From: H.J. Lu <hjl@sourceware.org>
To: glibc-cvs@sourceware.org
Subject: [glibc] x86_64: Add log1p with FMA
Date: Mon, 21 Aug 2023 17:44:53 +0000 (GMT)
Message-ID: <20230821174453.4FD41385C6D5@sourceware.org>

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a8ecb126d4c26c52f4ad828c566afe4043a28155

commit a8ecb126d4c26c52f4ad828c566afe4043a28155
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Thu Aug 17 09:42:29 2023 -0700

    x86_64: Add log1p with FMA

    On Skylake, it changes log1p bench performance by:

            Before       After     Improvement
    max     63.349       58.347       8%
    min     4.448        5.651       -30%
    mean    12.0674      10.336       14%

    The minimum code path is

     if (hx < 0x3FDA827A)                          /* x < 0.41422  */
        {
          if (__glibc_unlikely (ax >= 0x3ff00000))   /* x <= -1.0 */
            {
               ...
            }
          if (__glibc_unlikely (ax < 0x3e200000))    /* |x| < 2**-29 */
            {
              math_force_eval (two54 + x);           /* raise inexact */
              if (ax < 0x3c900000)                   /* |x| < 2**-54 */
                {
                  ...
                }
              else
                return x - x * x * 0.5;

    FMA and non-FMA code sequences look similar.  Non-FMA version is
    slightly faster.
    Since log1p is called by asinh and atanh, it improves asinh
    performance by:

            Before       After     Improvement
    max     75.645       63.135       16%
    min     10.074       10.071       0%
    mean    15.9483      14.9089      6%

    and improves atanh performance by:

            Before       After     Improvement
    max     91.768       75.081       18%
    min     15.548       13.883       10%
    mean    18.3713      16.8011      8%

Diff:
---
 sysdeps/ieee754/dbl-64/s_log1p.c           |  5 +++++
 sysdeps/x86_64/fpu/multiarch/Makefile      |  2 ++
 sysdeps/x86_64/fpu/multiarch/s_log1p-fma.c |  4 ++++
 sysdeps/x86_64/fpu/multiarch/s_log1p.c     | 29 +++++++++++++++++++++++++++++
 4 files changed, 40 insertions(+)

diff --git a/sysdeps/ieee754/dbl-64/s_log1p.c b/sysdeps/ieee754/dbl-64/s_log1p.c
index e6476a8260..eeb0af859f 100644
--- a/sysdeps/ieee754/dbl-64/s_log1p.c
+++ b/sysdeps/ieee754/dbl-64/s_log1p.c
@@ -99,6 +99,11 @@ static const double
 
 static const double zero = 0.0;
 
+#ifndef SECTION
+# define SECTION
+#endif
+
+SECTION
 double
 __log1p (double x)
 {
diff --git a/sysdeps/x86_64/fpu/multiarch/Makefile b/sysdeps/x86_64/fpu/multiarch/Makefile
index add339a876..ea81753b70 100644
--- a/sysdeps/x86_64/fpu/multiarch/Makefile
+++ b/sysdeps/x86_64/fpu/multiarch/Makefile
@@ -38,6 +38,7 @@ libm-sysdep_routines += \
   e_pow-fma \
   s_atan-fma \
   s_expm1-fma \
+  s_log1p-fma \
   s_sin-fma \
   s_sincos-fma \
   s_tan-fma \
@@ -51,6 +52,7 @@ CFLAGS-e_log2-fma.c = -mfma -mavx2
 CFLAGS-e_pow-fma.c = -mfma -mavx2
 CFLAGS-s_atan-fma.c = -mfma -mavx2
 CFLAGS-s_expm1-fma.c = -mfma -mavx2
+CFLAGS-s_log1p-fma.c = -mfma -mavx2
 CFLAGS-s_sin-fma.c = -mfma -mavx2
 CFLAGS-s_tan-fma.c = -mfma -mavx2
 CFLAGS-s_sincos-fma.c = -mfma -mavx2
diff --git a/sysdeps/x86_64/fpu/multiarch/s_log1p-fma.c b/sysdeps/x86_64/fpu/multiarch/s_log1p-fma.c
new file mode 100644
index 0000000000..8952df8f9e
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/s_log1p-fma.c
@@ -0,0 +1,4 @@
+#define __log1p __log1p_fma
+#define SECTION __attribute__ ((section (".text.fma")))
+
+#include <sysdeps/ieee754/dbl-64/s_log1p.c>
diff --git a/sysdeps/x86_64/fpu/multiarch/s_log1p.c b/sysdeps/x86_64/fpu/multiarch/s_log1p.c
new file mode 100644
index 0000000000..6ce5198d6d
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/s_log1p.c
@@ -0,0 +1,29 @@
+/* Multiple versions of log1p.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <libm-alias-double.h>
+
+extern double __redirect_log1p (double);
+
+#define SYMBOL_NAME log1p
+#include "ifunc-fma.h"
+
+libc_ifunc_redirected (__redirect_log1p, __log1p, IFUNC_SELECTOR ());
+
+#define __log1p __log1p_sse2
+#include <sysdeps/ieee754/dbl-64/s_log1p.c>