From: "H.J. Lu"
Date: Tue, 20 Feb 2024 11:10:41 -0800
Subject: Re: [PATCH] x86_64: Exclude SSE, AVX and FMA4 variants in libm multiarch
To: Adhemerval Zanella Netto
Cc: Noah Goldstein, Sunil Pandey, libc-alpha@sourceware.org
References: <20240220165805.3629140-1-skpgkp2@gmail.com> <502e1c82-2425-4e28-bde0-67dcfb373ee6@linaro.org> <16fdf5fa-893c-44f6-91b7-69e67e27dff9@linaro.org>
In-Reply-To: <16fdf5fa-893c-44f6-91b7-69e67e27dff9@linaro.org>

On Tue, Feb 20, 2024 at 11:02 AM Adhemerval Zanella Netto wrote:
>
> On 20/02/24 15:54, H.J. Lu wrote:
> > On Tue, Feb 20, 2024 at 10:48 AM Adhemerval Zanella Netto
> > wrote:
> >>
> >> On 20/02/24 15:36, H.J. Lu wrote:
> >>> On Tue, Feb 20, 2024 at 10:32 AM Noah Goldstein wrote:
> >>>>
> >>>> On Tue, Feb 20, 2024 at 6:28 PM H.J. Lu wrote:
> >>>>>
> >>>>> On Tue, Feb 20, 2024 at 10:19 AM Noah Goldstein wrote:
> >>>>>>
> >>>>>> On Tue, Feb 20, 2024 at 6:14 PM H.J. Lu wrote:
> >>>>>>>
> >>>>>>> On Tue, Feb 20, 2024 at 10:07 AM Noah Goldstein wrote:
> >>>>>>>>
> >>>>>>>> On Tue, Feb 20, 2024 at 6:05 PM H.J. Lu wrote:
> >>>>>>>>>
> >>>>>>>>> On Tue, Feb 20, 2024 at 9:56 AM Noah Goldstein wrote:
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Feb 20, 2024 at 5:51 PM Sunil Pandey wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Feb 20, 2024 at 9:34 AM Noah Goldstein wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Feb 20, 2024 at 4:58 PM Sunil K Pandey wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> When glibc is built with FMA and AVX2 enabled by default, the resulting
> >>>>>>>>>>>>> glibc binaries won't run on SSE or FMA4 processors.
> >>>>>>>>>>>>> Exclude SSE, AVX and
> >>>>>>>>>>>>> FMA4 variants in libm multiarch when both FMA and AVX2 are enabled by
> >>>>>>>>>>>>> default.  Disallow glibc build with only AVX2 or FMA enabled as all AVX2
> >>>>>>>>>>>>> processors, including VMs, should also support FMA and vice versa.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> When glibc is built with SSE4.1 enabled by default, only keep SSE4.1
> >>>>>>>>>>>>> variant.
> >>>>>>>>>>>> Not avx2 + FMA as well?
> >>>>>>>>>>>
> >>>>>>>>>>> Correct. Logic is as follows
> >>>>>>>>>>> If (build with AVX2+FMA): Keep AVX2+FMA variants only.
> >>>>>>>>>>> else if (build with SSE4.1): Keep SSE4.1 variants only.
> >>>>>>>>>> What if someone builds with sse4.1 as a minimum but then
> >>>>>>>>>> runs on avx2+ machines?
> >>>>>>>>>
> >>>>>>>>> Only SSE4.1 variant will be used in this case.  Both SSE4.1
> >>>>>>>>> and AVX versions only have a single instruction.  This matches
> >>>>>>>>> the compiler builtin function of SSE4.1 and AVX.
> >>>>>>>>
> >>>>>>>> if they are all the same, what's the rationale for having an
> >>>>>>>> avx version at all?
> >>>>>>>
> >>>>>>> They aren't the same.  For ceil, it is
> >>>>>>>
> >>>>>>> roundsd $10, %xmm0, %xmm0
> >>>>>>> ret
> >>>>>>>
> >>>>>>> vs
> >>>>>>>
> >>>>>>> vroundsd $10, %xmm0, %xmm0, %xmm0
> >>>>>>> ret
> >>>>>>>
> >>>>>>> You get the same things with
> >>>>>>>
> >>>>>>> return __builtin_ceil (x);
> >>>>>>
> >>>>>> I mean if they are equal quality sse4.1 / avx,
> >>>>>> why not just remove the avx impls and use sse4.1 impls
> >>>>>> on avx targets?
> >>>>>
> >>>>> If glibc is compiled with AVX, we should use the AVX version if
> >>>>> appropriate.  Since the minimum GCC for glibc build can't inline
> >>>>> __builtin_ceil, we inline __builtin_ceil by hand.
> >>>> if compiled with avx, but for generic target do we need to hold
> >>>> onto avx versions for any reason?
> >>>
> >>> I don't understand what you were asking.
> >>> This patch leads to the same
> >>> assembly code generated from
> >>>
> >>> double
> >>> __ceil (double x)
> >>> {
> >>>   return __builtin_ceil (x);
> >>> }
> >>
> >> Wouldn't it make sense to follow the already defined x86_64 ABI versions and
> >> provide the ifunc variants based on the ABI used?
> >
> > There are no conflicts here.  For these math functions, ISA level 2 == SSE4.1
> > and ISA level 3 == AVX2 + FMA.  If glibc is built with ISA level N, this patch
> > will exclude ISA level N-1 or older variants in IFUNC selection.
> >
>
> I mean, why not use the MINIMUM_X86_ISA_LEVEL to define whether to provide/build
> the variants instead of adding two new configure checks?

One issue is that the minimum GCC (GCC 6?) doesn't support -march=x86-64-vN.
Another reason is that these math functions don't need the full ISA level
instructions.

-- 
H.J.
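
For readers following the thread, the exclusion rule quoted above ("If (build with AVX2+FMA): Keep AVX2+FMA variants only. else if (build with SSE4.1): Keep SSE4.1 variants only.") can be sketched in C roughly as below. This is only an illustration, not the code from the patch: the enum, the BUILD_BASELINE macro and variant_is_kept() are hypothetical names, the baseline is derived here from the compiler-predefined macros __AVX2__, __FMA__ and __SSE4_1__ rather than from the configure checks the patch adds, and keeping every variant when neither condition holds is an assumption.

```c
#include <stdbool.h>

/* Hypothetical labels for the libm variants named in the thread.  */
enum libm_variant
{
  VARIANT_SSE2,      /* generic x86_64 baseline */
  VARIANT_SSE4_1,
  VARIANT_AVX,
  VARIANT_FMA4,
  VARIANT_AVX2_FMA
};

/* Build baseline taken from compiler-predefined macros.  Assumption:
   the actual patch uses configure checks instead.  */
#if defined __AVX2__ && defined __FMA__
# define BUILD_BASELINE VARIANT_AVX2_FMA
#elif defined __SSE4_1__
# define BUILD_BASELINE VARIANT_SSE4_1
#else
# define BUILD_BASELINE VARIANT_SSE2
#endif

/* Return true if variant V would still be built and offered for
   selection when the library is compiled for BASELINE.  */
bool
variant_is_kept (enum libm_variant baseline, enum libm_variant v)
{
  if (baseline == VARIANT_AVX2_FMA)
    return v == VARIANT_AVX2_FMA;   /* drop SSE, AVX and FMA4 variants */
  if (baseline == VARIANT_SSE4_1)
    return v == VARIANT_SSE4_1;     /* keep only the SSE4.1 variant */
  return true;                      /* assumption: nothing is excluded */
}
```

Calling variant_is_kept (BUILD_BASELINE, VARIANT_FMA4) in a translation unit compiled with -mavx2 -mfma would then report that the FMA4 variant is left out, while a plain x86-64 build would keep it.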