References: <20240224023553.3804579-1-skpgkp2@gmail.com>
In-Reply-To: <20240224023553.3804579-1-skpgkp2@gmail.com>
From: "H.J. Lu"
Date: Sat, 24 Feb 2024 08:23:55 -0800
Subject: Re: [PATCH v2] x86_64: Exclude SSE, AVX and FMA4 variants in libm multiarch
To: Sunil K Pandey
Cc: libc-alpha@sourceware.org

On Fri, Feb 23, 2024 at 6:36 PM Sunil K Pandey wrote:
>
> When glibc is built with ISA level 3 or higher by default, the resulting
> glibc binaries won't run on SSE or FMA4 processors.  Exclude SSE, AVX and
> FMA4 variants in libm multiarch when ISA level 3 or higher is enabled by
> default.
>
> When glibc is built with ISA level 2 enabled by default, only keep SSE4.1
> variant.
>
> Fixes BZ 31335.
>
> NB: elf/tst-valgrind-smoke test fails with ISA level 4, because valgrind
> doesn't support AVX512 instructions:
>
> https://bugs.kde.org/show_bug.cgi?id=383010
>
> Changes from v1:
>
> Replace AVX2 and FMA feature check with ISA level.
> Replace SSE4_1 feature check with ISA level.
> ---
>  sysdeps/x86/configure | 31 ++++
>  sysdeps/x86/configure.ac | 23 +++
>  sysdeps/x86_64/fpu/multiarch/Makefile | 148 +++++++++---------
>  sysdeps/x86_64/fpu/multiarch/e_asin.c | 19 ++-
>  sysdeps/x86_64/fpu/multiarch/e_atan2.c | 11 +-
>  sysdeps/x86_64/fpu/multiarch/e_exp.c | 13 +-
>  sysdeps/x86_64/fpu/multiarch/e_exp2f.c | 19 ++-
>  sysdeps/x86_64/fpu/multiarch/e_expf.c | 19 ++-
>  sysdeps/x86_64/fpu/multiarch/e_log.c | 13 +-
>  sysdeps/x86_64/fpu/multiarch/e_log2.c | 19 ++-
>  sysdeps/x86_64/fpu/multiarch/e_log2f.c | 19 ++-
>  sysdeps/x86_64/fpu/multiarch/e_logf.c | 19 ++-
>  sysdeps/x86_64/fpu/multiarch/e_pow.c | 13 +-
>  sysdeps/x86_64/fpu/multiarch/e_powf.c | 27 ++--
>  sysdeps/x86_64/fpu/multiarch/s_atan.c | 11 +-
>  sysdeps/x86_64/fpu/multiarch/s_ceil-avx.S | 28 ++++
>  sysdeps/x86_64/fpu/multiarch/s_ceil-sse4_1.S | 12 ++
>  sysdeps/x86_64/fpu/multiarch/s_ceil.c | 21 +--
>  sysdeps/x86_64/fpu/multiarch/s_ceilf-avx.S | 28 ++++
>  sysdeps/x86_64/fpu/multiarch/s_ceilf-sse4_1.S | 12 ++
>  sysdeps/x86_64/fpu/multiarch/s_ceilf.c | 21 +--
>  sysdeps/x86_64/fpu/multiarch/s_cosf.c | 11 +-
>  sysdeps/x86_64/fpu/multiarch/s_expm1.c | 11 +-
>  sysdeps/x86_64/fpu/multiarch/s_floor-avx.S | 28 ++++
>  sysdeps/x86_64/fpu/multiarch/s_floor-sse4_1.S | 12 ++
>  sysdeps/x86_64/fpu/multiarch/s_floor.c | 21 +--
>  sysdeps/x86_64/fpu/multiarch/s_floorf-avx.S | 28 ++++
>  .../x86_64/fpu/multiarch/s_floorf-sse4_1.S | 12 ++
>  sysdeps/x86_64/fpu/multiarch/s_floorf.c | 21 +--
>  sysdeps/x86_64/fpu/multiarch/s_log1p.c | 11 +-
>  .../x86_64/fpu/multiarch/s_nearbyint-avx.S | 28 ++++
>  .../x86_64/fpu/multiarch/s_nearbyint-sse4_1.S | 12 ++
>  sysdeps/x86_64/fpu/multiarch/s_nearbyint.c | 19 ++-
>  .../x86_64/fpu/multiarch/s_nearbyintf-avx.S | 28 ++++
>  .../fpu/multiarch/s_nearbyintf-sse4_1.S | 12 ++
>  sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c | 19 ++-
>  sysdeps/x86_64/fpu/multiarch/s_rint-avx.S | 28 ++++
>  sysdeps/x86_64/fpu/multiarch/s_rint-sse4_1.S | 12 ++
>  sysdeps/x86_64/fpu/multiarch/s_rint.c | 21 +--
>  sysdeps/x86_64/fpu/multiarch/s_rintf-avx.S | 28 ++++
>  sysdeps/x86_64/fpu/multiarch/s_rintf-sse4_1.S | 12 ++
>  sysdeps/x86_64/fpu/multiarch/s_rintf.c | 21 +--
>  .../x86_64/fpu/multiarch/s_roundeven-avx.S | 28 ++++
>  .../x86_64/fpu/multiarch/s_roundeven-sse4_1.S | 12 ++
>  sysdeps/x86_64/fpu/multiarch/s_roundeven.c | 19 ++-
>  .../x86_64/fpu/multiarch/s_roundevenf-avx.S | 28 ++++
>  .../fpu/multiarch/s_roundevenf-sse4_1.S | 12 ++
>  sysdeps/x86_64/fpu/multiarch/s_roundevenf.c | 19 ++-
>  sysdeps/x86_64/fpu/multiarch/s_sin.c | 19 ++-
>  sysdeps/x86_64/fpu/multiarch/s_sincos.c | 11 +-
>  sysdeps/x86_64/fpu/multiarch/s_sincosf.c | 11 +-
>  sysdeps/x86_64/fpu/multiarch/s_sinf.c | 11 +-
>  sysdeps/x86_64/fpu/multiarch/s_tan.c | 11 +-
>  sysdeps/x86_64/fpu/multiarch/s_trunc-avx.S | 28 ++++
>  sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S | 12 ++
>  sysdeps/x86_64/fpu/multiarch/s_trunc.c | 21 +--
>  sysdeps/x86_64/fpu/multiarch/s_truncf-avx.S | 28 ++++
>  .../x86_64/fpu/multiarch/s_truncf-sse4_1.S | 12 ++
>  sysdeps/x86_64/fpu/multiarch/s_truncf.c | 21 +--
>  sysdeps/x86_64/fpu/multiarch/w_exp.c | 7 +-
>  sysdeps/x86_64/fpu/multiarch/w_log.c | 7 +-
>  sysdeps/x86_64/fpu/multiarch/w_pow.c | 7 +-
>  62 files changed, 950 insertions(+), 295 deletions(-)
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_ceil-avx.S
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_ceilf-avx.S
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_floor-avx.S
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_floorf-avx.S
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_nearbyint-avx.S
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_nearbyintf-avx.S
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_rint-avx.S
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_rintf-avx.S
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundeven-avx.S
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundevenf-avx.S
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_trunc-avx.S
>  create mode 100644 sysdeps/x86_64/fpu/multiarch/s_truncf-avx.S
>
> diff --git a/sysdeps/x86_64/fpu/multiarch/Makefile b/sysdeps/x86_64/fpu/multiarch/Makefile
> index e1a490dd98..91ac85012b 100644
> --- a/sysdeps/x86_64/fpu/multiarch/Makefile
> +++ b/sysdeps/x86_64/fpu/multiarch/Makefile
> @@ -1,49 +1,4 @@
>  ifeq ($(subdir),math)
> -libm-sysdep_routines += \
> -  s_ceil-c \
> -  s_ceilf-c \
> -  s_floor-c \
> -  s_floorf-c \
> -  s_nearbyint-c \
> -  s_nearbyintf-c \
> -  s_rint-c \
> -  s_rintf-c \
> -  s_roundeven-c \
> -  s_roundevenf-c \
> -  s_trunc-c \
> -  s_truncf-c \
> -# libm-sysdep_routines
> -
> -libm-sysdep_routines += \
> -  s_ceil-sse4_1 \
> -  s_ceilf-sse4_1 \
> -  s_floor-sse4_1 \
> -  s_floorf-sse4_1 \
> -  s_nearbyint-sse4_1 \
> -  s_nearbyintf-sse4_1 \
> -  s_rint-sse4_1 \
> -  s_rintf-sse4_1 \
> -  s_roundeven-sse4_1 \
> -  s_roundevenf-sse4_1 \
> -  s_trunc-sse4_1 \
> -  s_truncf-sse4_1 \
> -# libm-sysdep_routines
> -
> -libm-sysdep_routines += \
> -  e_asin-fma \
> -  e_atan2-fma \
> -  e_exp-fma \
> -  e_log-fma \
> -  e_log2-fma \
> -  e_pow-fma \
> -  s_atan-fma \
> -  s_expm1-fma \
> -  s_log1p-fma \
> -  s_sin-fma \
> -  s_sincos-fma \
> -  s_tan-fma \
> -# libm-sysdep_routines
> -
>  CFLAGS-e_asin-fma.c = -mfma -mavx2
>  CFLAGS-e_atan2-fma.c = -mfma -mavx2
>  CFLAGS-e_exp-fma.c = -mfma -mavx2
> @@ -57,23 +12,6 @@ CFLAGS-s_sin-fma.c = -mfma -mavx2
>  CFLAGS-s_tan-fma.c = -mfma -mavx2
>  CFLAGS-s_sincos-fma.c = -mfma -mavx2
>
> -libm-sysdep_routines += \
> -  s_cosf-sse2 \
> -  s_sincosf-sse2 \
> -  s_sinf-sse2 \
> -# libm-sysdep_routines
> -
> -libm-sysdep_routines += \
> -  e_exp2f-fma \
> -  e_expf-fma \
> -  e_log2f-fma \
> -  e_logf-fma \
> -  e_powf-fma \
> -  s_cosf-fma \
> -  s_sincosf-fma \
> -  s_sinf-fma \
> -# libm-sysdep_routines
> -
>  CFLAGS-e_exp2f-fma.c = -mfma -mavx2
>  CFLAGS-e_expf-fma.c = -mfma -mavx2
>  CFLAGS-e_log2f-fma.c = -mfma -mavx2
> @@ -83,17 +21,93 @@ CFLAGS-s_sinf-fma.c = -mfma -mavx2
>  CFLAGS-s_cosf-fma.c = -mfma -mavx2
>  CFLAGS-s_sincosf-fma.c = -mfma -mavx2
>
> +# Check if ISA level is 3 or 4
> +ifneq (,$(filter $(have-x86-isa-level),3 4))
>  libm-sysdep_routines += \

If we add ISA level 5 and compile glibc with ISA level 5, this won't work.
It is especially bad for release branches.  Glibc release branches
should compile properly without any changes when glibc is built with
-march=x86-64-v5.

H.J.
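
The concern above can be reproduced in isolation. The stand-alone GNU make fragment below is only a sketch and is not the change proposed in this thread. It contrasts the quoted membership test with one hypothetical alternative that lists the levels known to be below 3 instead. The default value 5, the variable name x86-isa-levels-below-3, and the value "baseline" are assumptions made for illustration; only have-x86-isa-level and the "3 4" filter come from the quoted hunk.

```make
# Stand-in for the value configure would provide (assumption: 5 models a
# hypothetical future ISA level).  Override with: make have-x86-isa-level=3
have-x86-isa-level ?= 5

# Membership test from the quoted hunk: only the listed levels match,
# so an unlisted level such as 5 falls into the "else" branch.
ifneq (,$(filter $(have-x86-isa-level),3 4))
$(info listed levels: treated as level 3 or above)
else
$(info listed levels: treated as below level 3)
endif

# Hypothetical alternative: enumerate the levels below 3 and treat
# everything else as "3 or above".  This assumes the variable is always
# set; an empty value would also land in the "3 or above" branch.
x86-isa-levels-below-3 := baseline 2
ifeq (,$(filter $(have-x86-isa-level),$(x86-isa-levels-below-3)))
$(info excluded levels: treated as level 3 or above)
else
$(info excluded levels: treated as below level 3)
endif

# Empty default target so that make has something to build.
all: ;
```

With the default value of 5, the first test takes its else branch while the second takes the "3 or above" branch, which is the difference a release branch would see if a new ISA level were added later.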