public inbox for libc-alpha@sourceware.org
From: Sunil Pandey <skpgkp2@gmail.com>
To: "H.J. Lu" <hjl.tools@gmail.com>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH v2] x86_64: Exclude SSE, AVX and FMA4 variants in libm multiarch
Date: Sat, 24 Feb 2024 14:23:14 -0800
Message-ID: <CAMAf5_dOK0uO5SwXi7t4xE_B3oDZ5+cuAR6iOL83SyxAJBbuUw@mail.gmail.com>
In-Reply-To: <CAMe9rOrmxjYNx-OVg6AtY_yXTfj31sZczQ_yEsvBURNrGFBtcg@mail.gmail.com>

On Sat, Feb 24, 2024 at 8:27 AM H.J. Lu <hjl.tools@gmail.com> wrote:

> On Sat, Feb 24, 2024 at 8:23 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Fri, Feb 23, 2024 at 6:36 PM Sunil K Pandey <skpgkp2@gmail.com> wrote:
> > >
> > > When glibc is built with ISA level 3 or higher by default, the resulting
> > > glibc binaries won't run on SSE or FMA4 processors.  Exclude SSE, AVX and
> > > FMA4 variants in libm multiarch when ISA level 3 or higher is enabled by
> > > default.
> > >
> > > When glibc is built with ISA level 2 enabled by default, only keep SSE4.1
> > > variant.
> > >
> > > Fixes BZ 31335.
> > >
> > > NB: elf/tst-valgrind-smoke test fails with ISA level 4, because valgrind
> > > doesn't support AVX512 instructions:
> > >
> > > https://bugs.kde.org/show_bug.cgi?id=383010
> > >
> > > Changes from v1:
> > >
> > > Replace AVX2 and FMA feature check with ISA level.
> > > Replace SSE4_1 feature check with ISA level.
> > > ---
> > >  sysdeps/x86/configure                         |  31 ++++
> > >  sysdeps/x86/configure.ac                      |  23 +++
> > >  sysdeps/x86_64/fpu/multiarch/Makefile         | 148 +++++++++---------
> > >  sysdeps/x86_64/fpu/multiarch/e_asin.c         |  19 ++-
> > >  sysdeps/x86_64/fpu/multiarch/e_atan2.c        |  11 +-
> > >  sysdeps/x86_64/fpu/multiarch/e_exp.c          |  13 +-
> > >  sysdeps/x86_64/fpu/multiarch/e_exp2f.c        |  19 ++-
> > >  sysdeps/x86_64/fpu/multiarch/e_expf.c         |  19 ++-
> > >  sysdeps/x86_64/fpu/multiarch/e_log.c          |  13 +-
> > >  sysdeps/x86_64/fpu/multiarch/e_log2.c         |  19 ++-
> > >  sysdeps/x86_64/fpu/multiarch/e_log2f.c        |  19 ++-
> > >  sysdeps/x86_64/fpu/multiarch/e_logf.c         |  19 ++-
> > >  sysdeps/x86_64/fpu/multiarch/e_pow.c          |  13 +-
> > >  sysdeps/x86_64/fpu/multiarch/e_powf.c         |  27 ++--
> > >  sysdeps/x86_64/fpu/multiarch/s_atan.c         |  11 +-
> > >  sysdeps/x86_64/fpu/multiarch/s_ceil-avx.S     |  28 ++++
> > >  sysdeps/x86_64/fpu/multiarch/s_ceil-sse4_1.S  |  12 ++
> > >  sysdeps/x86_64/fpu/multiarch/s_ceil.c         |  21 +--
> > >  sysdeps/x86_64/fpu/multiarch/s_ceilf-avx.S    |  28 ++++
> > >  sysdeps/x86_64/fpu/multiarch/s_ceilf-sse4_1.S |  12 ++
> > >  sysdeps/x86_64/fpu/multiarch/s_ceilf.c        |  21 +--
> > >  sysdeps/x86_64/fpu/multiarch/s_cosf.c         |  11 +-
> > >  sysdeps/x86_64/fpu/multiarch/s_expm1.c        |  11 +-
> > >  sysdeps/x86_64/fpu/multiarch/s_floor-avx.S    |  28 ++++
> > >  sysdeps/x86_64/fpu/multiarch/s_floor-sse4_1.S |  12 ++
> > >  sysdeps/x86_64/fpu/multiarch/s_floor.c        |  21 +--
> > >  sysdeps/x86_64/fpu/multiarch/s_floorf-avx.S   |  28 ++++
> > >  .../x86_64/fpu/multiarch/s_floorf-sse4_1.S    |  12 ++
> > >  sysdeps/x86_64/fpu/multiarch/s_floorf.c       |  21 +--
> > >  sysdeps/x86_64/fpu/multiarch/s_log1p.c        |  11 +-
> > >  .../x86_64/fpu/multiarch/s_nearbyint-avx.S    |  28 ++++
> > >  .../x86_64/fpu/multiarch/s_nearbyint-sse4_1.S |  12 ++
> > >  sysdeps/x86_64/fpu/multiarch/s_nearbyint.c    |  19 ++-
> > >  .../x86_64/fpu/multiarch/s_nearbyintf-avx.S   |  28 ++++
> > >  .../fpu/multiarch/s_nearbyintf-sse4_1.S       |  12 ++
> > >  sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c   |  19 ++-
> > >  sysdeps/x86_64/fpu/multiarch/s_rint-avx.S     |  28 ++++
> > >  sysdeps/x86_64/fpu/multiarch/s_rint-sse4_1.S  |  12 ++
> > >  sysdeps/x86_64/fpu/multiarch/s_rint.c         |  21 +--
> > >  sysdeps/x86_64/fpu/multiarch/s_rintf-avx.S    |  28 ++++
> > >  sysdeps/x86_64/fpu/multiarch/s_rintf-sse4_1.S |  12 ++
> > >  sysdeps/x86_64/fpu/multiarch/s_rintf.c        |  21 +--
> > >  .../x86_64/fpu/multiarch/s_roundeven-avx.S    |  28 ++++
> > >  .../x86_64/fpu/multiarch/s_roundeven-sse4_1.S |  12 ++
> > >  sysdeps/x86_64/fpu/multiarch/s_roundeven.c    |  19 ++-
> > >  .../x86_64/fpu/multiarch/s_roundevenf-avx.S   |  28 ++++
> > >  .../fpu/multiarch/s_roundevenf-sse4_1.S       |  12 ++
> > >  sysdeps/x86_64/fpu/multiarch/s_roundevenf.c   |  19 ++-
> > >  sysdeps/x86_64/fpu/multiarch/s_sin.c          |  19 ++-
> > >  sysdeps/x86_64/fpu/multiarch/s_sincos.c       |  11 +-
> > >  sysdeps/x86_64/fpu/multiarch/s_sincosf.c      |  11 +-
> > >  sysdeps/x86_64/fpu/multiarch/s_sinf.c         |  11 +-
> > >  sysdeps/x86_64/fpu/multiarch/s_tan.c          |  11 +-
> > >  sysdeps/x86_64/fpu/multiarch/s_trunc-avx.S    |  28 ++++
> > >  sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S |  12 ++
> > >  sysdeps/x86_64/fpu/multiarch/s_trunc.c        |  21 +--
> > >  sysdeps/x86_64/fpu/multiarch/s_truncf-avx.S   |  28 ++++
> > >  .../x86_64/fpu/multiarch/s_truncf-sse4_1.S    |  12 ++
> > >  sysdeps/x86_64/fpu/multiarch/s_truncf.c       |  21 +--
> > >  sysdeps/x86_64/fpu/multiarch/w_exp.c          |   7 +-
> > >  sysdeps/x86_64/fpu/multiarch/w_log.c          |   7 +-
> > >  sysdeps/x86_64/fpu/multiarch/w_pow.c          |   7 +-
> > >  62 files changed, 950 insertions(+), 295 deletions(-)
> > >  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_ceil-avx.S
> > >  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_ceilf-avx.S
> > >  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_floor-avx.S
> > >  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_floorf-avx.S
> > >  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_nearbyint-avx.S
> > >  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_nearbyintf-avx.S
> > >  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_rint-avx.S
> > >  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_rintf-avx.S
> > >  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundeven-avx.S
> > >  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundevenf-avx.S
> > >  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_trunc-avx.S
> > >  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_truncf-avx.S
> > >
> >
> > > diff --git a/sysdeps/x86_64/fpu/multiarch/Makefile b/sysdeps/x86_64/fpu/multiarch/Makefile
> > > index e1a490dd98..91ac85012b 100644
> > > --- a/sysdeps/x86_64/fpu/multiarch/Makefile
> > > +++ b/sysdeps/x86_64/fpu/multiarch/Makefile
> > > @@ -1,49 +1,4 @@
> > >  ifeq ($(subdir),math)
> > > -libm-sysdep_routines += \
> > > -  s_ceil-c \
> > > -  s_ceilf-c \
> > > -  s_floor-c \
> > > -  s_floorf-c \
> > > -  s_nearbyint-c \
> > > -  s_nearbyintf-c \
> > > -  s_rint-c \
> > > -  s_rintf-c \
> > > -  s_roundeven-c \
> > > -  s_roundevenf-c \
> > > -  s_trunc-c \
> > > -  s_truncf-c \
> > > -# libm-sysdep_routines
> > > -
> > > -libm-sysdep_routines += \
> > > -  s_ceil-sse4_1 \
> > > -  s_ceilf-sse4_1 \
> > > -  s_floor-sse4_1 \
> > > -  s_floorf-sse4_1 \
> > > -  s_nearbyint-sse4_1 \
> > > -  s_nearbyintf-sse4_1 \
> > > -  s_rint-sse4_1 \
> > > -  s_rintf-sse4_1 \
> > > -  s_roundeven-sse4_1 \
> > > -  s_roundevenf-sse4_1 \
> > > -  s_trunc-sse4_1 \
> > > -  s_truncf-sse4_1 \
> > > -# libm-sysdep_routines
> > > -
> > > -libm-sysdep_routines += \
> > > -  e_asin-fma \
> > > -  e_atan2-fma \
> > > -  e_exp-fma \
> > > -  e_log-fma \
> > > -  e_log2-fma \
> > > -  e_pow-fma \
> > > -  s_atan-fma \
> > > -  s_expm1-fma \
> > > -  s_log1p-fma \
> > > -  s_sin-fma \
> > > -  s_sincos-fma \
> > > -  s_tan-fma \
> > > -# libm-sysdep_routines
> > > -
> > >  CFLAGS-e_asin-fma.c = -mfma -mavx2
> > >  CFLAGS-e_atan2-fma.c = -mfma -mavx2
> > >  CFLAGS-e_exp-fma.c = -mfma -mavx2
> > > @@ -57,23 +12,6 @@ CFLAGS-s_sin-fma.c = -mfma -mavx2
> > >  CFLAGS-s_tan-fma.c = -mfma -mavx2
> > >  CFLAGS-s_sincos-fma.c = -mfma -mavx2
> > >
> > > -libm-sysdep_routines += \
> > > -  s_cosf-sse2 \
> > > -  s_sincosf-sse2 \
> > > -  s_sinf-sse2 \
> > > -# libm-sysdep_routines
> > > -
> > > -libm-sysdep_routines += \
> > > -  e_exp2f-fma \
> > > -  e_expf-fma \
> > > -  e_log2f-fma \
> > > -  e_logf-fma \
> > > -  e_powf-fma \
> > > -  s_cosf-fma \
> > > -  s_sincosf-fma \
> > > -  s_sinf-fma \
> > > -# libm-sysdep_routines
> > > -
> > >  CFLAGS-e_exp2f-fma.c = -mfma -mavx2
> > >  CFLAGS-e_expf-fma.c = -mfma -mavx2
> > >  CFLAGS-e_log2f-fma.c = -mfma -mavx2
> > > @@ -83,17 +21,93 @@ CFLAGS-s_sinf-fma.c = -mfma -mavx2
> > >  CFLAGS-s_cosf-fma.c = -mfma -mavx2
> > >  CFLAGS-s_sincosf-fma.c = -mfma -mavx2
> > >
> > > +# Check if ISA level is 3 or 4
> > > +ifneq (,$(filter $(have-x86-isa-level),3 4))
> > >  libm-sysdep_routines += \
> >
> > If we add ISA level 5 and compile glibc with ISA level 5, this won't work.
> > It is especially bad for release branches.  Glibc release branches should
> > compile properly without any changes when glibc is built with
> > -march=x86-64-v5.
> >
>
> Glibc release branches are OK.  But when level 5 support is added, we
> have to change many places in Makefiles where have-x86-isa-level is used.
> I think we should avoid it.
>
>
have-x86-isa-level is used in only one Makefile in this patch:

sysdeps/x86_64/fpu/multiarch/Makefile:ifneq (,$(filter $(have-x86-isa-level),3 4))
sysdeps/x86_64/fpu/multiarch/Makefile:ifeq ($(have-x86-isa-level),baseline)
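
To make the concern about a future level 5 concrete, here is a minimal
standalone sketch of how the first check evaluates.  It is not part of the
patch: the default value and the isa-variants variable are hypothetical
stand-ins, and in glibc have-x86-isa-level is set by configure.

```makefile
# Hypothetical default for illustration; glibc's configure sets the real value.
have-x86-isa-level ?= 3

# Same condition as in the patch.  $(filter PATTERN...,TEXT) returns the
# words of TEXT that match a pattern, so the result is non-empty only when
# have-x86-isa-level is exactly 3 or 4.
ifneq (,$(filter $(have-x86-isa-level),3 4))
isa-variants := level-3-or-4
else
isa-variants := other
endif

# Printed while the makefile is parsed.
$(info have-x86-isa-level=$(have-x86-isa-level) isa-variants=$(isa-variants))

# Empty recipe so that plain "make" has a default target.
all: ;
```

Saved as a standalone makefile, "make have-x86-isa-level=5" would take the
"other" branch, because 5 is not in the explicit "3 4" list.  The same holds
for "baseline" and "2".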

-- 
> H.J.
>
