From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [PATCH 4/7] sin/cos slow paths: remove slow paths from huge range reduction
Date: Wed, 21 Mar 2018 17:53:00 -0000

For huge inputs, use the improved do_sincos function as well.  No case now
uses the correction factor returned by do_sin, do_cos and TAYLOR_SIN, so
remove it.

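To illustrate the dispatch this relies on, here is a minimal sketch, not the glibc code. Once range reduction yields a reduced argument a and a quadrant count n with x = n*pi/2 + a, sin(x) is one quadrant-selected function of n and cos(x) is the same function of n + 1. Every name below (toy_reduce, toy_sin_kernel, toy_cos_kernel, toy_sincos, toy_sin, toy_cos) is hypothetical: toy_reduce is a naive stand-in for __branred that is only adequate for moderate |x|, and the truncated Taylor kernels stand in for do_sin, do_cos and TAYLOR_SIN.

```c
#include <assert.h>

/* Hypothetical stand-in for __branred: naive reduction x = q * pi/2 + a
   with |a| <= pi/4.  Only adequate for moderate |x|.  Returns q mod 4.  */
static int
toy_reduce (double x, double *a)
{
  const double half_pi = 1.57079632679489661923;
  double t = x / half_pi;
  long long q = (long long) (t < 0 ? t - 0.5 : t + 0.5);
  *a = x - (double) q * half_pi;
  return (int) (q & 3);
}

/* Truncated Taylor series for sin, terms up to a^13; intended for
   |a| <= pi/4 only.  */
static double
toy_sin_kernel (double a)
{
  double xx = a * a;
  return a * (1.0 + xx * (-1.0 / 6.0 + xx * (1.0 / 120.0
	 + xx * (-1.0 / 5040.0 + xx * (1.0 / 362880.0
	 + xx * (-1.0 / 39916800.0 + xx * (1.0 / 6227020800.0)))))));
}

/* Truncated Taylor series for cos, terms up to a^12; intended for
   |a| <= pi/4 only.  */
static double
toy_cos_kernel (double a)
{
  double xx = a * a;
  return 1.0 + xx * (-1.0 / 2.0 + xx * (1.0 / 24.0 + xx * (-1.0 / 720.0
	 + xx * (1.0 / 40320.0 + xx * (-1.0 / 3628800.0
	 + xx * (1.0 / 479001600.0))))));
}

/* Same shape as do_sincos: bit 0 of n selects cos over sin, bit 1
   selects the sign.  */
static double
toy_sincos (double a, int n)
{
  double r = (n & 1) ? toy_cos_kernel (a) : toy_sin_kernel (a);
  return (n & 2) ? -r : r;
}

/* sin uses quadrant n, cos uses quadrant n + 1.  */
double
toy_sin (double x)
{
  double a;
  int n = toy_reduce (x, &a);
  return toy_sincos (a, n);
}

double
toy_cos (double x)
{
  double a;
  int n = toy_reduce (x, &a);
  return toy_sincos (a, n + 1);
}
```

With this shape, sin and cos share one evaluation routine and differ only in the quadrant argument, so no separate slow-path helper is needed for either of them.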
ChangeLog:
2018-03-20  Wilco Dijkstra

        * sysdeps/ieee754/dbl-64/s_sin.c (TAYLOR_SIN): Remove cor parameter.
        (do_cos): Remove corp parameter and calculations.
        (do_sin): Likewise.
        (do_sincos): Remove cor variable.
        (__sin): Use do_sincos for huge inputs.
        (__cos): Likewise.
        * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.
        (reduce_and_compute_sincos): Remove unused function.

--
diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index b8c366a6f05ef6b6632302fac96cd19af518f1fe..099a8a128f9883d1e683436a9f09720922e923ce 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -67,11 +67,10 @@
 
    The constants s1, s2, s3, etc. are pre-computed values of 1/3!, 1/5! and so
    on.  The result is returned to LHS and correction in COR.  */
-#define TAYLOR_SIN(xx, a, da, cor) \
+#define TAYLOR_SIN(xx, a, da) \
 ({ \
   double t = ((POLYNOMIAL (xx) * (a) - 0.5 * (da)) * (xx) + (da)); \
   double res = (a) + t; \
-  (cor) = ((a) - res) + t; \
   res; \
 })
 
@@ -145,10 +144,10 @@ static double cslow2 (double x);
 /* Given a number partitioned into X and DX, this function computes the cosine
    of the number by combining the sin and cos of X (as computed by a variation
    of the Taylor series) with the values looked up from the sin/cos table to
-   get the result in RES and a correction value in COR.  */
+   get the result.  */
 static inline double
 __always_inline
-do_cos (double x, double dx, double *corp)
+do_cos (double x, double dx)
 {
   mynumber u;
 
@@ -158,16 +157,13 @@ do_cos (double x, double dx, double *corp)
   u.x = big + fabs (x);
   x = fabs (x) - (u.x - big) + dx;
 
-  double xx, s, sn, ssn, c, cs, ccs, res, cor;
+  double xx, s, sn, ssn, c, cs, ccs, cor;
   xx = x * x;
   s = x + x * xx * (sn3 + xx * sn5);
   c = xx * (cs2 + xx * (cs4 + xx * cs6));
   SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs);
   cor = (ccs - s * ssn - cs * c) - sn * s;
-  res = cs + cor;
-  cor = (cs - res) + cor;
-  *corp = cor;
-  return res;
+  return cs + cor;
 }
 
 /* A more precise variant of DO_COS.  EPS is the adjustment to the correction
@@ -207,10 +203,10 @@ do_cos_slow (double x, double dx, double eps, double *corp)
 /* Given a number partitioned into X and DX, this function computes the sine of
    the number by combining the sin and cos of X (as computed by a variation of
    the Taylor series) with the values looked up from the sin/cos table to get
-   the result in RES and a correction value in COR.  */
+   the result.  */
 static inline double
 __always_inline
-do_sin (double x, double dx, double *corp)
+do_sin (double x, double dx)
 {
   mynumber u;
 
@@ -219,16 +215,13 @@ do_sin (double x, double dx, double *corp)
   u.x = big + fabs (x);
   x = fabs (x) - (u.x - big);
 
-  double xx, s, sn, ssn, c, cs, ccs, cor, res;
+  double xx, s, sn, ssn, c, cs, ccs, cor;
   xx = x * x;
   s = x + (dx + x * xx * (sn3 + xx * sn5));
   c = x * dx + xx * (cs2 + xx * (cs4 + xx * cs6));
   SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs);
   cor = (ssn + s * ccs - sn * c) + cs * s;
-  res = sn + cor;
-  cor = (sn - res) + cor;
-  *corp = cor;
-  return res;
+  return sn + cor;
 }
 
 /* A more precise variant of DO_SIN.  EPS is the adjustment to the correction
@@ -330,19 +323,19 @@ static double
 __always_inline
 do_sincos (double a, double da, int4 n)
 {
-  double retval, cor;
+  double retval;
 
   if (n & 1)
     /* Max ULP is 0.513.  */
-    retval = do_cos (a, da, &cor);
+    retval = do_cos (a, da);
   else
     {
       double xx = a * a;
       /* Max ULP is 0.501 if xx < 0.01588, otherwise ULP is 0.518.  */
       if (xx < 0.01588)
-        retval = TAYLOR_SIN (xx, a, da, cor);
+        retval = TAYLOR_SIN (xx, a, da);
       else
-        retval = __copysign (do_sin (a, da, &cor), a);
+        retval = __copysign (do_sin (a, da), a);
     }
 
   return (n & 2) ? -retval : retval;
@@ -362,7 +355,7 @@ SECTION
 __sin (double x)
 {
 #ifndef IN_SINCOS
-  double xx, t, a, da, cor;
+  double xx, t, a, da;
   mynumber u;
   int4 k, m, n;
   double retval = 0;
@@ -396,7 +389,7 @@ __sin (double x)
   else if (k < 0x3feb6000)
     {
       /* Max ULP is 0.548.  */
-      retval = __copysign (do_sin (x, 0, &cor), x);
+      retval = __copysign (do_sin (x, 0), x);
     } /* else if (k < 0x3feb6000) */
 
 /*----------------------- 0.855469 <|x|<2.426265 ----------------------*/
@@ -404,7 +397,7 @@ __sin (double x)
     {
       t = hp0 - fabs (x);
       /* Max ULP is 0.51.  */
-      retval = __copysign (do_cos (t, hp1, &cor), x);
+      retval = __copysign (do_cos (t, hp1), x);
     } /* else if (k < 0x400368fd) */
 
 #ifndef IN_SINCOS
@@ -417,8 +410,10 @@ __sin (double x)
 
 /* --------------------105414350 <|x| <2^1024------------------------------*/
   else if (k < 0x7ff00000)
-    retval = reduce_and_compute (x, false);
-
+    {
+      n = __branred (x, &a, &da);
+      retval = do_sincos (a, da, n);
+    }
 /*--------------------- |x| > 2^1024 ----------------------------------*/
   else
     {
@@ -445,7 +440,7 @@ SECTION
 #endif
 __cos (double x)
 {
-  double y, xx, cor, a, da;
+  double y, xx, a, da;
   mynumber u;
 #ifndef IN_SINCOS
   int4 k, m, n;
@@ -470,7 +465,7 @@ __cos (double x)
   else if (k < 0x3feb6000)
     { /* 2^-27 < |x| < 0.855469 */
       /* Max ULP is 0.51.  */
-      retval = do_cos (x, 0, &cor);
+      retval = do_cos (x, 0);
     } /* else if (k < 0x3feb6000) */
 
   else if (k < 0x400368fd)
@@ -482,9 +477,9 @@ __cos (double x)
       /* Max ULP is 0.501 if xx < 0.01588 or 0.518 otherwise.
          Range reduction uses 106 bits here which is sufficient.  */
       if (xx < 0.01588)
-        retval = TAYLOR_SIN (xx, a, da, cor);
+        retval = TAYLOR_SIN (xx, a, da);
       else
-        retval = __copysign (do_sin (a, da, &cor), a);
+        retval = __copysign (do_sin (a, da), a);
     } /* else if (k < 0x400368fd) */
 
 
@@ -497,7 +492,10 @@ __cos (double x)
 
   /* 105414350 <|x| <2^1024 */
   else if (k < 0x7ff00000)
-    retval = reduce_and_compute (x, true);
+    {
+      n = __branred (x, &a, &da);
+      retval = do_sincos (a, da, n + 1);
+    }
 
   else
     {
diff --git a/sysdeps/ieee754/dbl-64/s_sincos.c b/sysdeps/ieee754/dbl-64/s_sincos.c
index 4f032d2e42593ccde22169b374728386dd8fca8e..4335ecbba3c9894e61c087ac970b392fa73abfab 100644
--- a/sysdeps/ieee754/dbl-64/s_sincos.c
+++ b/sysdeps/ieee754/dbl-64/s_sincos.c
@@ -28,37 +28,6 @@
 #define IN_SINCOS 1
 #include "s_sin.c"
 
-/* Consolidated version of reduce_and_compute in s_sin.c that does range
-   reduction only once and computes sin and cos together.  */
-static inline void
-__always_inline
-reduce_and_compute_sincos (double x, double *sinx, double *cosx)
-{
-  double a, da;
-  unsigned int n = __branred (x, &a, &da);
-
-  n = n & 3;
-
-  if (n == 1 || n == 2)
-    {
-      a = -a;
-      da = -da;
-    }
-
-  if (n & 1)
-    {
-      double *temp = cosx;
-      cosx = sinx;
-      sinx = temp;
-    }
-
-  if (a * a < 0.01588)
-    *sinx = bsloww (a, da, x, n);
-  else
-    *sinx = bsloww1 (a, da, x, n);
-  *cosx = bsloww2 (a, da, x, n);
-}
-
 void
 __sincos (double x, double *sinx, double *cosx)
 {
@@ -88,8 +57,11 @@ __sincos (double x, double *sinx, double *cosx)
     }
   if (k < 0x7ff00000)
     {
-      reduce_and_compute_sincos (x, sinx, cosx);
-      return;
+      double a, da;
+      int4 n = __branred (x, &a, &da);
+
+      *sinx = do_sincos (a, da, n);
+      *cosx = do_sincos (a, da, n + 1);
     }
 
   if (isinf (x))
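
For readers unfamiliar with the correction term removed from do_cos and do_sin: the deleted pair "res = cs + cor; cor = (cs - res) + cor;" has the shape of the classic Fast2Sum step, which recovers the rounding error of a floating-point addition when the first operand has the larger magnitude. A minimal standalone sketch follows. The name fast_two_sum is hypothetical and is not part of s_sin.c, and the sketch assumes plain IEEE double arithmetic without excess precision.

```c
#include <assert.h>

/* Fast2Sum sketch: for |hi| >= |lo|, return the rounded sum and store the
   rounding error in *err, so that hi + lo equals sum + *err exactly.
   Hypothetical helper for illustration only.  */
double
fast_two_sum (double hi, double lo, double *err)
{
  double sum = hi + lo;
  *err = (hi - sum) + lo;
  return sum;
}
```

For example, with hi = 1.0 and lo = 1e-17 the rounded sum is 1.0 and the error term is 1e-17. Only the old slow paths consumed that error term; once nothing reads it, the functions can return the plain sum as the patch does.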