From: Wilco Dijkstra
To: Paul Zimmermann
CC: "libc-alpha@sourceware.org"
Subject: Re: [PATCH 1/4] Add a routine for accurate reduction mod (2pi), for j0f/j1f/y0f/y1f.
Date: Tue, 9 Feb 2021 18:27:13 +0000

Hi Paul,

> reduce_fast is clearly not accurate enough: for x=0x1.96d666p+7, it gives
> y=0x1.947b0da22168cp+8 (and n=-128), and we clearly see that y is not in
> [-PI/4,PI/4] as advertized. What happens is that in the #else part of
> the code that is executed on my machine, r is larger than 2^31, thus the
> cast (int32_t) r overflows, the value of n is wrong, and thus the returned
> value x - n * p->hpi. Maybe a comment should be added about this limitation
> of reduce_fast?

Well, the comment says:

/* Fast range reduction using single multiply-subtract. Return the modulo of
   X as a value between -PI/4 and PI/4 and store the quadrant in NP.
   The values for PI/2 and 2/PI are accessed via P. Since PI/2 as a double
   is accurate to 55 bits and the worst-case cancellation happens at 6 * PI/4,
   the result is accurate for |X| <= 120.0. */

I.e. it won't be accurate enough above 120.0 (and yes, it will overflow at
around 201, but that's well outside the allowed range). Note also that
reduce_large only works for values larger than 2.0. The idea is that they
complement each other, allowing the common case to use the faster reduction;
see e.g. sinf:

  else if (__glibc_likely (abstop12 (y) < abstop12 (120.0f)))
    x = reduce_fast (x, p, &n);
  else if (abstop12 (y) < abstop12 (INFINITY))
    x = reduce_large (xi, &n);

We could add an inline helper that has this logic and the sign handling so
that it works as a generic range reducer across the full float range.
As Carlos points out, it's not essential for your patch, so it could be done
in another patch.

There is a generic range reducer for double precision; however, it is very
complex and slow, so it needs to be rewritten to use the reduce_fast algorithm.

> As for reduce_large, I need to run again the exhaustive tests for all four
> functions (j0, j1, y0, y1) and all four rounding modes to see if it is accurate
> enough.

It should be, since it gives you x mod PI/4 with absolute error of about 2^-62.

Cheers,
Wilco
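
To make the two limits discussed above concrete, here is a minimal C sketch. It is not the glibc source: the names reduce_fast_sketch, fast_path_overflow_limit and choose_reducer are hypothetical, and the constants assume that the #else path scales 2/PI by 2^24 before the (int32_t) cast, as the quoted overflow at 2^31 suggests. Under that assumption the cast overflows once |x| * (2/PI) * 2^24 reaches 2^31, i.e. at |x| = 64 * PI, which is about 201.06 and matches the "around 201" figure. The helper only selects a reducer; the sign handling mentioned in the mail and reduce_large itself are left out.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Assumed constants for this sketch: 2/PI scaled by 2^24, and PI/2.  */
static const double HPI_INV = 0x1.45F306DC9C883p+23;
static const double HPI = 0x1.921FB54442D18p0;

/* Sketch of the integer-rounding fast path: one multiply to find the
   quadrant, one multiply-subtract to reduce.  Only meaningful while the
   scaled value fits in int32_t; the documented accuracy bound is
   |x| <= 120.0.  Relies on arithmetic right shift of negative values,
   which is implementation-defined in C.  */
static double
reduce_fast_sketch (double x, int *np)
{
  double r = x * HPI_INV;
  /* Precondition: the cast and the rounding add must not overflow.  */
  assert (fabs (r) < 0x1p31 - 0x1p23);
  int n = ((int32_t) r + 0x800000) >> 24;
  *np = n;
  return x - n * HPI;
}

/* Smallest |x| at which x * HPI_INV reaches 2^31 (about 201.06).  */
static double
fast_path_overflow_limit (void)
{
  return 0x1p31 / HPI_INV;
}

/* Hypothetical dispatch mirroring the quoted sinf branches.  */
enum reducer { REDUCE_FAST, REDUCE_LARGE, REDUCE_NONE };

static enum reducer
choose_reducer (float x)
{
  if (fabsf (x) < 120.0f)
    return REDUCE_FAST;   /* common case: cheap and accurate enough */
  if (isfinite (x))
    return REDUCE_LARGE;  /* large finite inputs need the slow reducer */
  return REDUCE_NONE;     /* Inf or NaN: nothing to reduce */
}
```

With these definitions, the reported input x=0x1.96d666p+7 (about 203.4) lies above fast_path_overflow_limit (), so it falls in the range where the cast is no longer valid, and choose_reducer would route it to the large reducer instead.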