Date: Fri, 4 Aug 2023 09:35:08 +0100
From: Szabolcs Nagy
To: Paul Zimmermann
Cc: Wilco.Dijkstra@arm.com, libc-alpha@sourceware.org, Joe.Ramsay@arm.com
Subject: Re: [PATCH] aarch64: Improve SVE sin polynomial

The 08/04/2023 09:28, Paul Zimmermann wrote:
> Hi Szabolcs,
>
> > i'd say we want to minimize the maximum of
> >
> >   (poly(x) - sin(x))/ulp(sin(x))
> >
> > instead of
> >
> >   (poly(x) - sin(x))/sin(x)
>
> interesting question! I have asked the Sollya developers if this is possible.
>
> Anyway, unless I'm wrong I find the largest ulp error with the polynomial
> from the patch is about 1.72 at x=0.252552902718105, and for the polynomial
> I obtained with Sollya it is about 1.43 at the same value of x.
>
> See attached graph (blue = polynomial from the patch, red = polynomial
> I obtained with Sollya), which for 0 <= x <= pi/2 prints the absolute
> value of the ulp error as you define above.

neither the red nor the blue line has flat maximums, so they are
not optimal for |poly(x)-sin(x)|/ulp(sin(x)). but that only
considers approx error, not rounding.

as you can see, the blue error goes down near pi/2, while it is
high around small values like 0.25. this is deliberate: at small
values the rounding error is tiny, while in [1,pi/2] there is
cancellation with large rounding errors (x + c*x*x*x is smaller
in magnitude than x, so the terms are big and cancel each other
after a relatively big rounding error).

so the actual worst case error after rounding is much bigger than
what sollya reports (approx error), and it is in [1,pi/2]. e.g.

  _ZGVnN2v_sin(0x1.7a83373b951abp+0) got 0x1.fdd2e6d528af4p-1
   want 0x1.fdd2e6d528af7p-1 - 0.234485 ulp
   ulp error: -2.7655 ulp

which shows that there is a point where rounding error takes over
and becomes dominant over approx error.

the weighting function can take this into account by roughly
following ulp(sin(x)) for a while and then becoming smaller in
[1,pi/2]. (this is the weighting i used, but it was put together
manually, not computed precisely: ideally we would compute a
rounding_error_bound(x) function based on the poly eval and
incorporate that into the weighting. for most functions i don't
expect this to help, but it matters for sin(x).)

i would expect that simply minimizing the approx error (using the
ulp(sin(x)) weighting) would be better after rounding than your
polynomial (sin(x) weighting), but using our polynomial (manually
adjusted weighting) is even better. (i verified experimentally
that the measured errors near the approx error peaks are roughly
the same, so the error is evenly spread out, but i'm sure a
slightly better polynomial can be found with a bit more research
into the optimal design.)
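
a rough python sketch of the two points above, for illustration only. it is not the routine from the patch: the coefficients are plain taylor coefficients of sin (an assumption, not the tuned polynomial), the evaluation is a simple horner scheme without fma, and the helper names (ulp_error, poly_double, poly_exact, max_rounding_error) are made up here. the first part recomputes the ulp error from the got/want values in the report; the second part isolates the rounding error of x + x^3*p(x^2) by comparing the double evaluation against an exact rational evaluation of the same polynomial, on a small-x interval and on [1,pi/2].

```python
import math
from fractions import Fraction

def ulp_error(got, want, tail_ulps=0.0):
    """Signed error of `got` in ulps, assuming the exact result is
    want + tail_ulps * ulp(want) (want is the rounded reference)."""
    return (got - want) / math.ulp(want) - tail_ulps

# Assumption: odd Taylor coefficients of sin for x^3 .. x^15,
# not the tuned coefficients discussed in the thread.
COEFFS = [(-1.0) ** k / math.factorial(2 * k + 1) for k in range(1, 8)]

def poly_double(x):
    """Evaluate x + x^3 * p(x^2) in double precision (Horner, no FMA)."""
    x2 = x * x
    p = COEFFS[-1]
    for c in reversed(COEFFS[:-1]):
        p = c + x2 * p
    return x + (x2 * x) * p

def poly_exact(x):
    """Evaluate the same polynomial exactly with rational arithmetic."""
    xf = Fraction(x)
    x2 = xf * xf
    p = Fraction(COEFFS[-1])
    for c in reversed(COEFFS[:-1]):
        p = Fraction(c) + x2 * p
    return xf + x2 * xf * p

def rounding_error_ulps(x):
    """Rounding error only (approx error excluded), in ulps of the result."""
    exact = poly_exact(x)
    ulp = Fraction(math.ulp(float(exact)))
    return float((Fraction(poly_double(x)) - exact) / ulp)

def max_rounding_error(lo, hi, n=2000):
    """Largest |rounding error| over an evenly spaced grid on [lo, hi]."""
    return max(abs(rounding_error_ulps(lo + (hi - lo) * i / n))
               for i in range(n + 1))

# The report: want is 3 ulp above got, and the exact value is
# 0.234485 ulp below want.
err = ulp_error(float.fromhex("0x1.fdd2e6d528af4p-1"),
                float.fromhex("0x1.fdd2e6d528af7p-1"),
                tail_ulps=-0.234485)
print(f"reported case: {err:.4f} ulp")

small = max_rounding_error(0.01, 0.3)
large = max_rounding_error(1.0, math.pi / 2)
print(f"max rounding error on [0.01, 0.3]: {small:.3f} ulp")
print(f"max rounding error on [1, pi/2]:   {large:.3f} ulp")
```

with this setup the small-x interval should stay close to the 0.5 ulp of the final addition, while [1,pi/2] should come out noticeably larger because of the cancellation. the exact numbers depend on the coefficients and the evaluation scheme, so they will differ from the real routine.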