From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [PATCH 2/7] sin/cos slow paths: remove large range reduction
Date: Wed, 21 Mar 2018 17:51:00 -0000
X-SW-Source: 2018-03/txt/msg00505.txt.bz2

This patch removes the large range reduction code and defers to the huge
range reduction code.  The first-level range reducer supports inputs up to
2^27, which is far larger than necessary given that inputs for sin/cos are
typically small (< 10); optimizing for a smaller range would give a
significant speedup.

Input values above 2^27 are practically never used, so there is no reason
to keep a dedicated range reduction for inputs between 2^27 and 2^48.
Removing it significantly simplifies the code and enables further speedups.
Inputs in this range become about 2.3x slower because __branred is
extremely slow (a better algorithm could easily more than double its
performance).

ChangeLog:
2018-03-20  Wilco Dijkstra

        * sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_2): Remove function.
        (do_sincos_2): Likewise.
        (__sin): Remove middle range reduction case.
        (__cos): Likewise.
        * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Remove middle range
        reduction case.
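To make the ranges above concrete, here is a minimal sketch of how the
threshold constants used in the dispatch map onto input magnitudes.  It
assumes that k is the high 32 bits of the IEEE double with the sign bit
cleared, which is what the comparisons in the diff suggest; the helper
high_word_abs, the enum, and classify are hypothetical names for
illustration and are not part of the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: high 32 bits of an IEEE-754 double with the sign
   bit cleared (assumed to match how k is derived in s_sin.c).  */
static uint32_t
high_word_abs (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (uint32_t) (bits >> 32) & 0x7fffffffu;
}

/* Hypothetical labels for the paths that remain after the patch.  */
enum path
{
  PATH_SMALL_OR_FIRST_LEVEL,	/* |x| below roughly 105414350 */
  PATH_HUGE,			/* everything else that is finite */
  PATH_INF_NAN
};

static enum path
classify (double x)
{
  uint32_t k = high_word_abs (x);

  /* 0x419921FB is the high word of about 105414350 (pi/2 * 2^26).  */
  if (k < 0x419921FBu)
    return PATH_SMALL_OR_FIRST_LEVEL;
  /* Before the patch, a third branch at 0x42F00000 (2^48) sat here.  */
  if (k < 0x7ff00000u)
    return PATH_HUGE;
  return PATH_INF_NAN;
}
```

With these thresholds an input such as 1e12, which previously took the
middle path, now falls through to the huge range reduction.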
--
diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index 0c16b728df127ad54039da3eec376e5f1fe4c852..c86fb9f2aa9f18418defc522830a7b8f85c1dfae 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -362,80 +362,6 @@ do_sincos_1 (double a, double da, double x, int4 n, bool shift_quadrant)
   return retval;
 }
 
-static inline int4
-__always_inline
-reduce_sincos_2 (double x, double *a, double *da)
-{
-  mynumber v;
-
-  double t = (x * hpinv + toint);
-  double xn = t - toint;
-  v.x = t;
-  double xn1 = (xn + 8.0e22) - 8.0e22;
-  double xn2 = xn - xn1;
-  double y = ((((x - xn1 * mp1) - xn1 * mp2) - xn2 * mp1) - xn2 * mp2);
-  int4 n = v.i[LOW_HALF] & 3;
-  double db = xn1 * pp3;
-  t = y - db;
-  db = (y - t) - db;
-  db = (db - xn2 * pp3) - xn * pp4;
-  double b = t + db;
-  db = (t - b) + db;
-
-  *a = b;
-  *da = db;
-
-  return n;
-}
-
-/* Compute sin (A + DA).  cos can be computed by passing SHIFT_QUADRANT as
-   true, which results in shifting the quadrant N clockwise.  */
-static double
-__always_inline
-do_sincos_2 (double a, double da, double x, int4 n, bool shift_quadrant)
-{
-  double res, retval, cor, xx;
-
-  double eps = 1.0e-24;
-
-  int4 k = (n + shift_quadrant) & 3;
-
-  switch (k)
-    {
-    case 2:
-      a = -a;
-      da = -da;
-      /* Fall through.  */
-    case 0:
-      xx = a * a;
-      if (xx < 0.01588)
-        {
-          /* Taylor series.  */
-          res = TAYLOR_SIN (xx, a, da, cor);
-          cor = 1.02 * cor + __copysign (eps, cor);
-          retval = (res == res + cor) ? res : bsloww (a, da, x, n);
-        }
-      else
-        {
-          res = do_sin (a, da, &cor);
-          cor = 1.035 * cor + __copysign (eps, cor);
-          retval = ((res == res + cor) ? __copysign (res, a)
-                    : bsloww1 (a, da, x, n));
-        }
-      break;
-
-    case 1:
-    case 3:
-      res = do_cos (a, da, &cor);
-      cor = 1.025 * cor + __copysign (eps, cor);
-      retval = ((res == res + cor) ? ((n & 2) ? -res : res)
-                : bsloww2 (a, da, x, n));
-      break;
-    }
-
-  return retval;
-}
-
 /*******************************************************************/
 /* An ultimate sin routine. Given an IEEE double machine number x */
 /* it computes the correctly rounded (to nearest) value of sin(x) */
@@ -498,16 +424,7 @@ __sin (double x)
       retval = do_sincos_1 (a, da, x, n, false);
     } /* else if (k < 0x419921FB ) */
 
-/*---------------------105414350 <|x|< 281474976710656 --------------------*/
-  else if (k < 0x42F00000)
-    {
-      double a, da;
-
-      int4 n = reduce_sincos_2 (x, &a, &da);
-      retval = do_sincos_2 (a, da, x, n, false);
-    } /* else if (k < 0x42F00000 ) */
-
-/* -----------------281474976710656 <|x| <2^1024----------------------------*/
+/* --------------------105414350 <|x| <2^1024------------------------------*/
   else if (k < 0x7ff00000)
     retval = reduce_and_compute (x, false);
 
@@ -584,15 +501,7 @@ __cos (double x)
       retval = do_sincos_1 (a, da, x, n, true);
     } /* else if (k < 0x419921FB ) */
 
-  else if (k < 0x42F00000)
-    {
-      double a, da;
-
-      int4 n = reduce_sincos_2 (x, &a, &da);
-      retval = do_sincos_2 (a, da, x, n, true);
-    } /* else if (k < 0x42F00000 ) */
-
-  /* 281474976710656 <|x| <2^1024 */
+  /* 105414350 <|x| <2^1024 */
   else if (k < 0x7ff00000)
     retval = reduce_and_compute (x, true);
 
diff --git a/sysdeps/ieee754/dbl-64/s_sincos.c b/sysdeps/ieee754/dbl-64/s_sincos.c
index e1977ea7e93c32cca5369677f23e68f8f797a9f4..a9af8ce526bfe78c06cfafa65de0815ec69585c5 100644
--- a/sysdeps/ieee754/dbl-64/s_sincos.c
+++ b/sysdeps/ieee754/dbl-64/s_sincos.c
@@ -86,16 +86,6 @@ __sincos (double x, double *sinx, double *cosx)
 
       return;
     }
-  if (k < 0x42F00000)
-    {
-      double a, da;
-      int4 n = reduce_sincos_2 (x, &a, &da);
-
-      *sinx = do_sincos_2 (a, da, x, n, false);
-      *cosx = do_sincos_2 (a, da, x, n, true);
-
-      return;
-    }
   if (k < 0x7ff00000)
     {
       reduce_and_compute_sincos (x, sinx, cosx);
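
For readers unfamiliar with the kind of reduction the removed
reduce_sincos_2 performed, the following is a deliberately simplified
sketch of the general pattern: round x * 2/pi to an integer by adding and
subtracting a large constant, then subtract that multiple of pi/2 in two
pieces.  It is not the glibc code.  The constants are an fdlibm-style
two-part split of pi/2 and the value 1.5 * 2^52, all of which are
assumptions here, and reduce_sketch is a hypothetical name.  The sketch
is only meant for moderate inputs (roughly |x| < 2^20), where the product
xn * PIO2_HI is exact.

```c
#include <stdint.h>

/* Assumed constants, not taken from the patch.  */
static const double HPINV = 6.36619772367581382433e-01;   /* 2/pi */
static const double PIO2_HI = 1.57079632673412561417e+00; /* leading bits of pi/2 */
static const double PIO2_LO = 6.07710050650619224932e-11; /* pi/2 - PIO2_HI */
static const double TOINT = 6755399441055744.0;           /* 1.5 * 2^52 */

/* Hypothetical helper: reduce X to *A with |*A| at most about pi/4 and
   return the quadrant (0..3).  */
static int
reduce_sketch (double x, double *a)
{
  /* Adding TOINT pushes the fraction bits out of the mantissa, so the
     sum is rounded to the nearest integer; volatile keeps the compiler
     from simplifying the add/subtract pair away.  */
  volatile double t = x * HPINV + TOINT;
  double xn = t - TOINT;

  /* Subtract xn * pi/2 in two steps, high part first.  */
  *a = (x - xn * PIO2_HI) - xn * PIO2_LO;

  /* Masking the two's complement integer gives xn mod 4, also for
     negative xn.  */
  return (int) ((long long) xn & 3);
}
```

The removed function went further than this: it split xn into two parts
and used more pieces of pi/2 (mp1, mp2, pp3, pp4) so that it could also
return a correction term da alongside a.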