From: "H.J. Lu"
To: libc-alpha@sourceware.org
Subject: [PATCH] x86_64: Add log1p with FMA
Date: Thu, 17 Aug 2023 09:42:29 -0700
Message-ID: <20230817164229.512321-1-hjl.tools@gmail.com>

On Skylake, it changes log1p bench performance by:

        Before     After     Improvement
max     63.349     58.347       8%
min     4.448      5.651      -30%
mean    12.0674    10.336      14%

The minimum code path is

  if (hx < 0x3FDA827A)                          /* x < 0.41422  */
    {
      if (__glibc_unlikely (ax >= 0x3ff00000))  /* x <= -1.0 */
        {
          ...
        }
      if (__glibc_unlikely (ax < 0x3e200000))   /* |x| < 2**-29 */
        {
          math_force_eval (two54 + x);          /* raise inexact */
          if (ax < 0x3c900000)                  /* |x| < 2**-54 */
            {
              ...
            }
          else
            return x - x * x * 0.5;

FMA and non-FMA code sequences look similar.  Non-FMA version is slightly
faster.

Since log1p is called by asinh and atanh, it improves asinh performance by:

        Before     After     Improvement
max     75.645     63.135      16%
min     10.074     10.071       0%
mean    15.9483    14.9089      6%

and improves atanh performance by:

        Before     After     Improvement
max     91.768     75.081      18%
min     15.548     13.883      10%
mean    18.3713    16.8011      8%
---
 sysdeps/ieee754/dbl-64/s_log1p.c           |  5 ++++
 sysdeps/x86_64/fpu/multiarch/Makefile      |  2 ++
 sysdeps/x86_64/fpu/multiarch/s_log1p-fma.c |  4 +++
 sysdeps/x86_64/fpu/multiarch/s_log1p.c     | 29 ++++++++++++++++++++++
 4 files changed, 40 insertions(+)
 create mode 100644 sysdeps/x86_64/fpu/multiarch/s_log1p-fma.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/s_log1p.c

diff --git a/sysdeps/ieee754/dbl-64/s_log1p.c b/sysdeps/ieee754/dbl-64/s_log1p.c
index e6476a8260..eeb0af859f 100644
--- a/sysdeps/ieee754/dbl-64/s_log1p.c
+++ b/sysdeps/ieee754/dbl-64/s_log1p.c
@@ -99,6 +99,11 @@ static const double
 
 static const double zero = 0.0;
 
+#ifndef SECTION
+# define SECTION
+#endif
+
+SECTION
 double
 __log1p (double x)
 {
diff --git a/sysdeps/x86_64/fpu/multiarch/Makefile b/sysdeps/x86_64/fpu/multiarch/Makefile
index add339a876..ea81753b70 100644
--- a/sysdeps/x86_64/fpu/multiarch/Makefile
+++ b/sysdeps/x86_64/fpu/multiarch/Makefile
@@ -38,6 +38,7 @@ libm-sysdep_routines += \
   e_pow-fma \
   s_atan-fma \
   s_expm1-fma \
+  s_log1p-fma \
   s_sin-fma \
   s_sincos-fma \
   s_tan-fma \
@@ -51,6 +52,7 @@ CFLAGS-e_log2-fma.c = -mfma -mavx2
 CFLAGS-e_pow-fma.c = -mfma -mavx2
 CFLAGS-s_atan-fma.c = -mfma -mavx2
 CFLAGS-s_expm1-fma.c = -mfma -mavx2
+CFLAGS-s_log1p-fma.c = -mfma -mavx2
 CFLAGS-s_sin-fma.c = -mfma -mavx2
 CFLAGS-s_tan-fma.c = -mfma -mavx2
 CFLAGS-s_sincos-fma.c = -mfma -mavx2
diff --git a/sysdeps/x86_64/fpu/multiarch/s_log1p-fma.c b/sysdeps/x86_64/fpu/multiarch/s_log1p-fma.c
new file mode 100644
index 0000000000..8952df8f9e
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/s_log1p-fma.c
@@ -0,0 +1,4 @@
+#define __log1p __log1p_fma
+#define SECTION __attribute__ ((section (".text.fma")))
+
+#include
diff --git a/sysdeps/x86_64/fpu/multiarch/s_log1p.c b/sysdeps/x86_64/fpu/multiarch/s_log1p.c
new file mode 100644
index 0000000000..6ce5198d6d
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/s_log1p.c
@@ -0,0 +1,29 @@
+/* Multiple versions of log1p.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+
+extern double __redirect_log1p (double);
+
+#define SYMBOL_NAME log1p
+#include "ifunc-fma.h"
+
+libc_ifunc_redirected (__redirect_log1p, __log1p, IFUNC_SELECTOR ());
+
+#define __log1p __log1p_sse2
+#include
-- 
2.41.0