From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 27579 invoked by alias); 12 Mar 2018 18:09:00 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Received: (qmail 27560 invoked by uid 89); 12 Mar 2018 18:09:00 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.8 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_HELO_PASS,SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: EUR01-DB5-obe.outbound.protection.outlook.com From: Wilco Dijkstra To: Siddhesh Poyarekar , =?iso-8859-2?Q?Ond=F8ej_B=EDlka?= CC: "libc-alpha@sourceware.org" , nd Subject: Re: [PATCH 2/6] Remove slow paths from sin/cos Date: Mon, 12 Mar 2018 18:09:00 -0000 Message-ID: References: <8065ca77-ec8f-9925-7e9c-52266cd1e4c6@gotplt.org> <20180309181902.GA26430@domone> ,<41b81cbf-b27a-07c2-1945-4c201327f162@gotplt.org> In-Reply-To: <41b81cbf-b27a-07c2-1945-4c201327f162@gotplt.org> authentication-results: spf=none (sender IP is ) smtp.mailfrom=Wilco.Dijkstra@arm.com; x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1;DB6PR0801MB1944;6:uCPtd747YxIY7L/1KRBj0bIpwtnwBEg43r5G87niUQQifWPvDfwGgaYM4xWLiUob2lcaE5RhpX6RY8qjyaybmJhGhBAFoQfUnN0CrSG9NsS1mqc6g9mRAVAONhBwrU+aBq8mQ2MAC0FaSs8XOJoQ26l1+n6SHOBCeKtWgneR4mM1uZHoFmr83iT4+t9PvSgC9xRZY0iW8EcvXtMbJN6aNiWFNK2jmD/0jnyNFjr1jnvM03w8PLbfub00uv8iJRLqwRylaTmAzK8ns6068JVRzZ7SMUev1D8TnT77pftRtadiSjKdqwcBz/fN55Te13vQfZ4D6wiPwyWuWhvTxO2k/TrtugH9XkljILVXWDK/r1LDDSi8Q3dCU6D9uyecYaRq;5:ouLbo4u8pbCNPRjOPGkCKOO7KivdbzmRy82okGba3WAFFP7ZwwrDY3moLVodoXmrHKbbaZSjHyWA6UtR2ev2nTRoUBs7HyRrK1zd+HEf2rkoWs/5pB9F/vfl7NA8gQPLLCGTgA/yKZMu/j5GdQUElCJJqwohV0z9FS/0TCPGG5g=;24:fdNvSpnt7TYtV5q8ohCNkPs9XaFYchb5nKhPAj9wcUAM8AiHbxIUXHD0IA7CCuY8sP1w/+3oiW2745uvHr/Jtgxycb0B41dUgI3QGhkmIwE=;7:UTGGKBedxn95FBm+Qgff03PF/Gy5K5dlllgbo2WbtfQeU/FpYboW7G8Lw9rDHmVBKGBevP6TkUwIMtY5spdQGhEuPB+0VLZPJVU2FwUsbeQEUPS10CakUiL3Eye114JLghV4ObHglQ3wT7wHYl03WMcL+72V3CGPZ9G/mJem5H3jpOTAX2yjRhAFFdwKzCvkVvse+WoHPjkI18fNMK4zEsSs1/FtEDfojHpSRPICkRBr1bdwI2aUm++x/UbUrqqo x-ms-exchange-antispam-srfa-diagnostics: SSOS; x-ms-office365-filtering-ht: Tenant x-ms-office365-filtering-correlation-id: d3d85b04-397b-4a0e-ee08-08d5884451a8 x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:(7020095)(4652020)(48565401081)(5600026)(4604075)(3008032)(4534165)(4627221)(201703031133081)(201702281549075)(2017052603328)(7153060)(7193020);SRVR:DB6PR0801MB1944; x-ms-traffictypediagnostic: DB6PR0801MB1944: nodisclaimer: True x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:; x-exchange-antispam-report-cfa-test: BCL:0;PCL:0;RULEID:(8211001083)(6040522)(2401047)(8121501046)(5005006)(3002001)(93006095)(93001095)(10201501046)(3231220)(944501244)(52105095)(6055026)(6041310)(20161123564045)(201703131423095)(201702281528075)(20161123555045)(201703061421075)(201703061406153)(20161123562045)(20161123560045)(20161123558120)(6072148)(201708071742011);SRVR:DB6PR0801MB1944;BCL:0;PCL:0;RULEID:;SRVR:DB6PR0801MB1944; x-forefront-prvs: 06098A2863 x-forefront-antispam-report: SFV:NSPM;SFS:(10009020)(39860400002)(346002)(366004)(376002)(396003)(39380400002)(189003)(199004)(4326008)(5250100002)(6246003)(33656002)(8676002)(66066001)(106356001)(74316002)(3846002)(105586002)(97736004)(3280700002)(6116002)(2900100001)(7736002)(5660300001)(25786009)(305945005)(229853002)(6436002)(55016002)(3660700001)(53936002)(2906002)(9686003)(8656006)(86362001)(81166006)(72206003)(110136005)(7696005)(81156014)(14454004)(316002)(186003)(93886005)(6506007)(478600001)(68736007)(102836004)(26005)(54906003)(99286004)(76176011)(8936002)(2950100002);DIR:OUT;SFP:1101;SCL:1;SRVR:DB6PR0801MB1944;H:DB6PR0801MB2053.eurprd08.prod.outlook.com;FPR:;SPF:None;PTR:InfoNoRecords;MX:1;A:1;LANG:en; received-spf: None (protection.outlook.com: arm.com does not designate permitted sender hosts) x-microsoft-antispam-message-info: cMRk8K9GbxlyOCZ/7zjb5XRwmcAPGs5E0P5qMR0FDLYBJs/aqddjCohofKUZ+uuWAqdHcv9MwyQXljBX0LPI4oFHg1uszSSoGZd8FdwGpmIle/nTZ9YaNJst8qD3FIJKhlc7DJyyn/Iu3bTdu+30+pFX5yNbpUzu6Ek07LEb9WySLdSFIxGlQeWpRmWXmWr/T+kyWrrQTYr0HvOyGFvqzj5quLC/34VWBlbYflMPB3kavpaEhide+XahRpFIr4JlpFkEq3cqx+aVKceFElL0yOxQu78TtE4Dug2yV9i43IhJtDnA9tm/LhEDx1FrD9jSZxFMWTkIvHDXU7gGuiilrA== spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="iso-8859-2" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-Network-Message-Id: d3d85b04-397b-4a0e-ee08-08d5884451a8 X-MS-Exchange-CrossTenant-originalarrivaltime: 12 Mar 2018 18:08:55.1034 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB6PR0801MB1944 X-SW-Source: 2018-03/txt/msg00293.txt.bz2 Siddhesh Poyarekar wrote: > On Saturday 10 March 2018 12:36 AM, Wilco Dijkstra wrote: >> Exactly, it's rare for sin/cos input to be > 10, largest I've ever seen = was around 1000. >> There is a case to be made for making wildly out or range inputs an erro= r or just return 0.0 >> rather than using insanely accurate range reduction (the result is going= to be equally wrong >> either way), but that's a different discussion. > > To be clear, I do not object to the patchset, it is the right way > forward.=A0 I would like to see clearer documentation as to why we decided > to drop this path in this patch and what the impact of that is.=A0 In that > sense, it would be nice to have a better rationale in the commit log > than "I've never seen inputs that big". There are multiple reasons, correctness (the implementation wasn't accurate= enough), premature optimization, complexity, etc. If we are to spend time on develop= ing a better range reduction, the time is best spent on (a) common cases, ie. small valu= es, and (b) on a much faster general range reducer (which could be designed to be f= aster on smaller inputs rather than be constant-time time __branred). A lot of the speedup is achieved by just making the code simpler. If we wan= t to support 3-level range reduction, there must be a really good reason for it (I've not seen 3= -level reduction used anywhere else). >> Fast range reduction of up to 2^27 is a huge overkill, we could get a si= gnificant speedup by >> reducing this range. >> >> The 2nd level range reduction (2^27 till 2^48) is so pointless that ther= e is absolutely no >> reason keep it. There is about a 2.3x slowdown in this range due to __br= anred being >> braindead slow. > > Thanks, all this should be part of the commit log.=A0 However, from what I > understand, this particular patch will result in a regression (since > IIRC reduce_and_compute is slower) but the following patch will fix that > and improve things. I'll improve the commit log in the next version (assuming there are comment= s on the actual patch!). Wilco