Date: Mon, 3 Oct 2022 14:47:06 -0700
From: Fangrui Song
To: Adhemerval Zanella
Cc: libc-alpha@sourceware.org, "H . J . Lu", Noah Goldstein
Subject: Re: [PATCH v2] x86: Remove .tfloat usage
Message-ID: <20221003214706.jdjoypkiqvvqovfp@google.com>
References: <20221003141802.281647-1-adhemerval.zanella@linaro.org>
In-Reply-To: <20221003141802.281647-1-adhemerval.zanella@linaro.org>

On 2022-10-03, Adhemerval Zanella wrote:
>This is what I intend to commit, I saw no regression or changes is
>generated code.
>--
>
>Some compiler does not support it (such as clang integrated assembler)
>neither gcc emits it.
>---
> sysdeps/i386/fpu/e_atanh.S    |  3 ++-
> sysdeps/i386/fpu/e_atanhf.S   |  3 ++-
> sysdeps/i386/fpu/e_atanhl.S   |  3 ++-
> sysdeps/i386/fpu/s_asinhl.S   |  3 ++-
> sysdeps/i386/fpu/s_cbrtl.S    | 49 ++++++++++++++++++++++-------------
> sysdeps/i386/fpu/s_expm1.S    |  3 ++-
> sysdeps/i386/fpu/s_expm1f.S   |  3 ++-
> sysdeps/i386/fpu/s_log1pl.S   |  3 ++-
> sysdeps/x86_64/fpu/s_log1pl.S |  3 ++-
> 9 files changed, 47 insertions(+), 26 deletions(-)
>
>diff --git a/sysdeps/i386/fpu/e_atanh.S b/sysdeps/i386/fpu/e_atanh.S
>index 6e4fef06b2..74d8f0e083 100644
>--- a/sysdeps/i386/fpu/e_atanh.S
>+++ b/sysdeps/i386/fpu/e_atanh.S
>@@ -33,7 +33,8 @@ one: .double 1.0
> limit: .double 0.29
> ASM_SIZE_DIRECTIVE(limit)
> .type ln2_2,@object
>-ln2_2: .tfloat 0.3465735902799726547086160
>+ln2_2: .quad 0xb17217f7d1cf79ac /* 0.3465735902799726547086160 */
>+ .short 0x3ffd
> ASM_SIZE_DIRECTIVE(ln2_2)
>

In two places, `/*` is preceded by one space instead of two.

Reviewed-by: Fangrui Song

> DEFINE_DBL_MIN
>diff --git a/sysdeps/i386/fpu/e_atanhf.S b/sysdeps/i386/fpu/e_atanhf.S
>index 146196eced..1803f55735 100644
>--- a/sysdeps/i386/fpu/e_atanhf.S
>+++ b/sysdeps/i386/fpu/e_atanhf.S
>@@ -34,7 +34,8 @@ limit: .double 0.29
> ASM_SIZE_DIRECTIVE(limit)
> .align ALIGNARG(4)
> .type ln2_2,@object
>-ln2_2: .tfloat 0.3465735902799726547086160
>+ln2_2: .quad 0xb17217f7d1cf79ac /* 0.3465735902799726547086160 */
>+ .short 0x3ffd
> ASM_SIZE_DIRECTIVE(ln2_2)
>
> DEFINE_FLT_MIN
>diff --git a/sysdeps/i386/fpu/e_atanhl.S b/sysdeps/i386/fpu/e_atanhl.S
>index 1f6eb7ce48..df3f1b8f84 100644
>--- a/sysdeps/i386/fpu/e_atanhl.S
>+++ b/sysdeps/i386/fpu/e_atanhl.S
>@@ -39,7 +39,8 @@ limit: .double 0.29
> ASM_SIZE_DIRECTIVE(limit)
> .align ALIGNARG(4)
> .type ln2_2,@object
>-ln2_2: .tfloat 0.3465735902799726547086160
>+ln2_2: .quad 0xb17217f7d1cf79ac /* 0.3465735902799726547086160 */
>+ .short 0x3ffd
> ASM_SIZE_DIRECTIVE(ln2_2)
>
> #ifdef PIC
>diff --git a/sysdeps/i386/fpu/s_asinhl.S b/sysdeps/i386/fpu/s_asinhl.S
>index bd442c6a09..f4f420d060 100644
>--- a/sysdeps/i386/fpu/s_asinhl.S
>+++ b/sysdeps/i386/fpu/s_asinhl.S
>@@ -23,7 +23,8 @@
>
> .align ALIGNARG(4)
> .type huge,@object
>-huge: .tfloat 1e+4930
>+huge: .quad 0x89b634e7456ffa1d /* 1e+4930 */
>+ .short 0x7ff8
> ASM_SIZE_DIRECTIVE(huge)
> .align ALIGNARG(4)
> /* Please note that we use double value for 1.0. This number
>diff --git a/sysdeps/i386/fpu/s_cbrtl.S b/sysdeps/i386/fpu/s_cbrtl.S
>index 8802164706..23cc308e3c 100644
>--- a/sysdeps/i386/fpu/s_cbrtl.S
>+++ b/sysdeps/i386/fpu/s_cbrtl.S
>@@ -23,55 +23,68 @@
>
> .align ALIGNARG(4)
> .type f8,@object
>-f8: .tfloat 0.161617097923756032
>+f8: .quad 0xa57ef3d83a542839 /* 0.161617097923756032 */
>+ .short 0x3ffc
> ASM_SIZE_DIRECTIVE(f8)
> .align ALIGNARG(4)
> .type f7,@object
>-f7: .tfloat -0.988553671195413709
>+f7: .quad 0xfd11da7820029014 /* -0.988553671195413709 */
>+ .short 0xbffe
> ASM_SIZE_DIRECTIVE(f7)
> .align ALIGNARG(4)
> .type f6,@object
>-f6: .tfloat 2.65298938441952296
>+f6: .quad 0xa9ca93fcade3b4ad /* 2.65298938441952296 */
>+ .short 0x4000
> ASM_SIZE_DIRECTIVE(f6)
> .align ALIGNARG(4)
> .type f5,@object
>-f5: .tfloat -4.11151425200350531
>+f5: .quad 0x839186562c931c34 /* -4.11151425200350531 */
>+ .short 0xc001
> ASM_SIZE_DIRECTIVE(f5)
> .align ALIGNARG(4)
> .type f4,@object
>-f4: .tfloat 4.09559907378707839
>+f4: .quad 0x830f25c9ee304594 /* 4.09559907378707839 */
>+ .short 0x4001
> ASM_SIZE_DIRECTIVE(f4)
> .align ALIGNARG(4)
> .type f3,@object
>-f3: .tfloat -2.82414939754975962
>+f3: .quad 0xb4bedd1d5fa2f0c6 /* -2.82414939754975962 */
>+ .short 0xc000
> ASM_SIZE_DIRECTIVE(f3)
> .align ALIGNARG(4)
> .type f2,@object
>-f2: .tfloat 1.67595307700780102
>+f2: .quad 0xd685a163b08586e3 /* 1.67595307700780102 */
>+ .short 0x3fff
> ASM_SIZE_DIRECTIVE(f2)
> .align ALIGNARG(4)
> .type f1,@object
>-f1: .tfloat 0.338058687610520237
>+f1: .quad 0xad16073ed4ec3b45 /* 0.338058687610520237 */
>+ .short 0x3ffd
> ASM_SIZE_DIRECTIVE(f1)
>
>-#define CBRT2 1.2599210498948731648
>-#define ONE_CBRT2 0.793700525984099737355196796584
>-#define SQR_CBRT2 1.5874010519681994748
>-#define ONE_SQR_CBRT2 0.629960524947436582364439673883
>-
> /* We make the entries in the following table all 16 bytes
> wide to avoid having to implement a multiplication by 10. */
> .type factor,@object
> .align ALIGNARG(4)
>-factor: .tfloat ONE_SQR_CBRT2
>+factor: /* 1.0 / cbrt (2.0) ^ 2 / 0.629960524947436582364439673883 */

The `/` before the decimal value is confusing, as Joseph noted.

>+ .quad 0xa14517cc6b945711
>+ .short 0x3ffe
> .byte 0, 0, 0, 0, 0, 0
>- .tfloat ONE_CBRT2
>+ /* 1.0 / cbrt (2.0) / 0.793700525984099737355196796584 */
>+ .quad 0xcb2ff529eb71e415
>+ .short 0x3ffe
> .byte 0, 0, 0, 0, 0, 0
>- .tfloat 1.0
>+ /* 1.0L */
>+ .quad 0x8000000000000000
>+ .short 0x3fff
> .byte 0, 0, 0, 0, 0, 0
>- .tfloat CBRT2
>+ /* cbrt (2.0) / 1.2599210498948731648 */
>+ .quad 0xa14517cc6b945711
>+ .short 0x3fff
> .byte 0, 0, 0, 0, 0, 0
>- .tfloat SQR_CBRT2
>+ /* cbrt (2.0) ^ 2 / 1.5874010519681994748 */
>+ .quad 0xcb2ff529eb71e416
>+ .short 0x3fff
> ASM_SIZE_DIRECTIVE(factor)
>
> .type two64,@object
>diff --git a/sysdeps/i386/fpu/s_expm1.S b/sysdeps/i386/fpu/s_expm1.S
>index 7199d681ba..038ff72feb 100644
>--- a/sysdeps/i386/fpu/s_expm1.S
>+++ b/sysdeps/i386/fpu/s_expm1.S
>@@ -33,7 +33,8 @@ minus1: .double -1.0
> one: .double 1.0
> ASM_SIZE_DIRECTIVE(one)
> .type l2e,@object
>-l2e: .tfloat 1.442695040888963407359924681002
>+l2e: .quad 0xb8aa3b295c17f0bc /* 1.442695040888963407359924681002 */
>+ .short 0x3fff
> ASM_SIZE_DIRECTIVE(l2e)
>
> DEFINE_DBL_MIN
>diff --git a/sysdeps/i386/fpu/s_expm1f.S b/sysdeps/i386/fpu/s_expm1f.S
>index 04c37bda1b..59a2bb81ba 100644
>--- a/sysdeps/i386/fpu/s_expm1f.S
>+++ b/sysdeps/i386/fpu/s_expm1f.S
>@@ -33,7 +33,8 @@ minus1: .double -1.0
> one: .double 1.0
> ASM_SIZE_DIRECTIVE(one)
> .type l2e,@object
>-l2e: .tfloat 1.442695040888963407359924681002
>+l2e: .quad 0xb8aa3b295c17f0bc /* 1.442695040888963407359924681002 */
>+ .short 0x3fff
> ASM_SIZE_DIRECTIVE(l2e)
>
> DEFINE_FLT_MIN
>diff --git a/sysdeps/i386/fpu/s_log1pl.S b/sysdeps/i386/fpu/s_log1pl.S
>index f28349f7d2..86aa438f01 100644
>--- a/sysdeps/i386/fpu/s_log1pl.S
>+++ b/sysdeps/i386/fpu/s_log1pl.S
>@@ -14,7 +14,8 @@ RCSID("$NetBSD: s_log1p.S,v 1.7 1995/05/09 00:10:58 jtc Exp $")
> -1 + sqrt(2) / 2 <= x <= 1 - sqrt(2) / 2
> 0.29 is a safe value.
> */
>-limit: .tfloat 0.29
>+limit: .quad 0x947ae147ae147ae1 /* 0.29 */

There is only one space before `/*` here.

>+ .short 0x3ffd
> /* Please note: we use a double value here. Since 1.0 has
> an exact representation this does not effect the accuracy
> but it helps to optimize the code. */
>diff --git a/sysdeps/x86_64/fpu/s_log1pl.S b/sysdeps/x86_64/fpu/s_log1pl.S
>index 8219f6fbcc..187c65e668 100644
>--- a/sysdeps/x86_64/fpu/s_log1pl.S
>+++ b/sysdeps/x86_64/fpu/s_log1pl.S
>@@ -14,7 +14,8 @@ RCSID("$NetBSD: s_log1p.S,v 1.7 1995/05/09 00:10:58 jtc Exp $")
> -1 + sqrt(2) / 2 <= x <= 1 - sqrt(2) / 2
> 0.29 is a safe value.
> */
>-limit: .tfloat 0.29
>+limit: .quad 0x947ae147ae147ae1 /* 0.29 */
>+ .short 0x3ffd
> /* Please note: we use a double value here. Since 1.0 has
> an exact representation this does not effect the accuracy
> but it helps to optimize the code. */
>--
>2.34.1
>
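
For anyone who wants to double-check the replacement constants by hand, here is a minimal sketch in Python (not part of the patch) of how a `.quad`/`.short` pair maps onto the x87 80-bit extended format: the `.quad` holds the 64-bit significand with an explicit integer bit, and the `.short` holds the sign bit plus a 15-bit exponent biased by 16383. The helper names `x87_to_fraction` and `encode_x87` are hypothetical, and the encoder assumes round-to-nearest-even and only handles normal values (no denormals, infinities, or NaNs).

```python
from decimal import Decimal
from fractions import Fraction

BIAS = 16383  # exponent bias of the x87 80-bit extended format


def x87_to_fraction(quad, short):
    """Decode a (.quad, .short) pair into the exact value it represents."""
    sign = -1 if short & 0x8000 else 1
    exponent = (short & 0x7FFF) - BIAS
    # The 64-bit significand carries an explicit integer bit (bit 63).
    return sign * Fraction(quad, 1 << 63) * Fraction(2) ** exponent


def encode_x87(decimal_text):
    """Encode a decimal literal as (.quad, .short), rounding to nearest even.

    Assumption: normal values only; zero is returned as an all-zero pair.
    """
    value = Fraction(Decimal(decimal_text))
    sign = 0x8000 if value < 0 else 0
    mag = abs(value)
    if mag == 0:
        return 0, sign
    # Find e with 2**e <= mag < 2**(e + 1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if mag < Fraction(2) ** e:
        e -= 1
    mantissa = round(mag * Fraction(2) ** (63 - e))  # round() is half-to-even
    if mantissa == 1 << 64:  # rounding carried into a 65th bit
        mantissa >>= 1
        e += 1
    return mantissa, sign | (e + BIAS)


# A few constants taken from the quoted patch.
for name, quad, short, text in [
    ("limit", 0x947AE147AE147AE1, 0x3FFD, "0.29"),
    ("l2e", 0xB8AA3B295C17F0BC, 0x3FFF, "1.442695040888963407359924681002"),
    ("1.0L", 0x8000000000000000, 0x3FFF, "1.0"),
]:
    same = encode_x87(text) == (quad, short)
    print(f"{name}: {float(x87_to_fraction(quad, short))!r}, re-encodes identically: {same}")
```

Exact rational arithmetic is used so that the check does not depend on the host's `long double` support; the assembler's own decimal conversion may differ in corner cases, so a mismatch here would be a prompt to look closer rather than proof of a bug.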