References: <20230811150408.2832625-1-hjl.tools@gmail.com>
In-Reply-To: <20230811150408.2832625-1-hjl.tools@gmail.com>
From: Noah Goldstein
Date: Sun, 13 Aug 2023 21:50:03 -0700
Subject: Re: [PATCH] x86_64: Add expm1 with FMA
To: "H.J. Lu"
Cc: libc-alpha@sourceware.org

On Fri, Aug 11, 2023 at 8:04 AM H.J. Lu via Libc-alpha wrote:
>
> On Skylake, it improves expm1 bench performance by:
>
>         Before     After      Improvement
> max     70.204     68.054     3%
> min     20.709     16.2       22%
> mean    22.1221    16.7367    24%
>
> NB: Add
>
> extern long double __expm1l (long double);
> extern long double __expm1f128 (long double);
>
> for __typeof (__expm1l) and __typeof (__expm1f128) when __expm1 is
> defined since __expm1 may be expanded in their declarations which
> causes the build failure.
> ---
>  sysdeps/ieee754/dbl-64/s_expm1.c           |  7 +++++
>  sysdeps/x86_64/fpu/multiarch/Makefile      |  2 ++
>  sysdeps/x86_64/fpu/multiarch/s_expm1-fma.c | 10 ++++++
>  sysdeps/x86_64/fpu/multiarch/s_expm1.c     | 36 ++++++++++++++++++++++
>  4 files changed, 55 insertions(+)
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_expm1-fma.c
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_expm1.c
>
> diff --git a/sysdeps/ieee754/dbl-64/s_expm1.c b/sysdeps/ieee754/dbl-64/s_expm1.c
> index 8f1c95bd04..1cafeca9c0 100644
> --- a/sysdeps/ieee754/dbl-64/s_expm1.c
> +++ b/sysdeps/ieee754/dbl-64/s_expm1.c
> @@ -130,6 +130,11 @@ static const double
>    4.00821782732936239552e-06, /* 3ED0CFCA 86E65239 */
>    -2.01099218183624371326e-07 }; /* BE8AFDB7 6E09C32D */
>
> +#ifndef SECTION
> +# define SECTION
> +#endif
> +
> +SECTION
>  double
>  __expm1 (double x)
>  {
> @@ -258,4 +263,6 @@ __expm1 (double x)
>      }
>    return y;
>  }
> +#ifndef __expm1
>  libm_alias_double (__expm1, expm1)
> +#endif
> diff --git a/sysdeps/x86_64/fpu/multiarch/Makefile b/sysdeps/x86_64/fpu/multiarch/Makefile
> index f773255721..add339a876 100644
> --- a/sysdeps/x86_64/fpu/multiarch/Makefile
> +++ b/sysdeps/x86_64/fpu/multiarch/Makefile
> @@ -37,6 +37,7 @@ libm-sysdep_routines += \
>    e_log2-fma \
>    e_pow-fma \
>    s_atan-fma \
> +  s_expm1-fma \
>    s_sin-fma \
>    s_sincos-fma \
>    s_tan-fma \
> @@ -49,6 +50,7 @@ CFLAGS-e_log-fma.c = -mfma -mavx2
>  CFLAGS-e_log2-fma.c = -mfma -mavx2
>  CFLAGS-e_pow-fma.c = -mfma -mavx2
>  CFLAGS-s_atan-fma.c = -mfma -mavx2
> +CFLAGS-s_expm1-fma.c = -mfma -mavx2
>  CFLAGS-s_sin-fma.c = -mfma -mavx2
>  CFLAGS-s_tan-fma.c = -mfma -mavx2
>  CFLAGS-s_sincos-fma.c = -mfma -mavx2
> diff --git a/sysdeps/x86_64/fpu/multiarch/s_expm1-fma.c b/sysdeps/x86_64/fpu/multiarch/s_expm1-fma.c
> new file mode 100644
> index 0000000000..3ee2bd804e
> --- /dev/null
> +++ b/sysdeps/x86_64/fpu/multiarch/s_expm1-fma.c
> @@ -0,0 +1,10 @@
> +#define __expm1 __expm1_fma
> +
> +/* NB: __expm1 may be expanded to __expm1_fma in the following
> +   prototypes.  */
> +extern long double __expm1l (long double);
> +extern long double __expm1f128 (long double);
> +
> +#define SECTION __attribute__ ((section (".text.fma")))
> +
> +#include
> diff --git a/sysdeps/x86_64/fpu/multiarch/s_expm1.c b/sysdeps/x86_64/fpu/multiarch/s_expm1.c
> new file mode 100644
> index 0000000000..2cae83fb7f
> --- /dev/null
> +++ b/sysdeps/x86_64/fpu/multiarch/s_expm1.c
> @@ -0,0 +1,36 @@
> +/* Multiple versions of expm1.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   .  */
> +
> +#include
> +
> +extern double __redirect_expm1 (double);
> +
> +#define SYMBOL_NAME expm1
> +#include "ifunc-fma.h"
> +
> +libc_ifunc_redirected (__redirect_expm1, __expm1, IFUNC_SELECTOR ());
> +libm_alias_double (__expm1, expm1)
> +
> +#define __expm1 __expm1_sse2
> +
> +/* NB: __expm1 may be expanded to __expm1_sse2 in the following
> +   prototypes.  */
> +extern long double __expm1l (long double);
> +extern long double __expm1f128 (long double);
> +
> +#include
> --
> 2.41.0
>

LGTM.