From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 112490 invoked by alias); 28 Jan 2020 10:47:16 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Received: (qmail 112476 invoked by uid 89); 28 Jan 2020 10:47:16 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-14.3 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_2,RCVD_IN_DNSWL_NONE,SPF_HELO_PASS,SPF_PASS autolearn=ham version=3.3.1 spammy=claim X-HELO: mail.bitwise.fi Subject: Re: [PATCH 1/2] ieee754: Remove slow paths from asin and acos To: Szabolcs Nagy , "libc-alpha@sourceware.org" Cc: nd References: <20200127104511.16618-1-anssi.hannula@bitwise.fi> <714b667d-7345-bcd4-ca30-7ecbb7098bbc@arm.com> From: Anssi Hannula Message-ID: <158bbf4a-ecb2-34da-f84e-ec05598e3746@bitwise.fi> Date: Tue, 28 Jan 2020 10:51:00 -0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.4.2 MIME-Version: 1.0 In-Reply-To: <714b667d-7345-bcd4-ca30-7ecbb7098bbc@arm.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-SW-Source: 2020-01/txt/msg00595.txt.bz2 On 27.1.2020 14.51, Szabolcs Nagy wrote: > On 27/01/2020 10:45, Anssi Hannula wrote: >> asin and acos have slow paths for rounding the last bit that cause some >> calls to be 500-1500x slower than average calls. >> >> These slow paths are rare, a test of a trillion (1.000.000.000.000) >> random inputs between -1 and 1 showed 32870 slow calls for acos and 4473 >> for asin, with most occurrences between -1.0 .. -0.9 and 0.9 .. 1.0. >> >> The slow paths claim correct rounding and use __sin32() and __cos32() >> (which compare two result candidates and return the closest one) as the >> final step, with the second result candidate (res1) having a small offset >> applied from res. This suggests that res and res1 are intended to be 1 >> ULP apart (which makes sense for rounding), barring bugs, allowing us to >> pick either one and still remain within 1 ULP of the exact result. >> >> Remove the slow paths as the accuracy is better than 1 ULP even without >> them, which is enough for glibc. >> >> Also remove code comments claiming correctly rounded results. > the patch looks good to me. > >> After slow path removal, checking the accuracy of 14.400.000.000 random >> asin() and acos() inputs showed only three incorrectly rounded >> (error > 0.5 ULP) results: >> - asin(-0x1.ee2b43286db75p-1) (0.500002 ULP, same as before) >> - asin(-0x1.f692ba202abcp-4) (0.500003 ULP, same as before) >> - asin(-0x1.9915e876fc062p-1) (0.50000000001 ULP, previously exact) >> The first two had the same error even before this commit, and they did >> not use the slow path at all. >> >> Checking 4934 known randomly found previously-slow-path asin inputs >> shows 25 calls with incorrectly rounded results, with a maximum error of >> 0.500000002 ULP (for 0x1.fcd5742999ab8p-1). The previous slow-path code >> rounded all these inputs correctly (error < 0.5 ULP). >> The observed average speed increase was 130x. >> >> Checking 36240 known randomly found previously-slow-path acos inputs >> shows 42 calls with incorrectly rounded results, with a maximum error of >> 0.500000008 ULP (for 0x1.f63845056f35ep-1). The previous "exact" >> slow-path code showed 34 calls with incorrectly rounded results, with the >> same maximum error of 0.500000008 ULP (for 0x1.f63845056f35ep-1). >> The observed average speed increase was 130x. >> >> The functions could likely be trimmed more while keeping acceptable >> accuracy, but this at least gets rid of the egregiously slow cases. >> >> Tested on x86_64. >> --- > ... >> double >> SECTION >> __ieee754_asin(double x){ >> @@ -100,13 +88,7 @@ __ieee754_asin(double x){ >> if (res == res+1.00014*cor) return res; >> else { >> __doasin(x,0,w); >> - if (w[0]==(w[0]+1.00000001*w[1])) return w[0]; >> - else { >> - y=fabs(x); >> - res=fabs(w[0]); >> - res1=fabs(w[0]+1.1*w[1]); >> - return (m>0)?__sin32(y,res,res1):-__sin32(y,res,res1); >> - } >> + return w[0]; > if we assume the code is correctly rounded (though your tests > show it isnt), then we can derive error bounds from the > checks, e.g. > > if (res == res+1.00014*cor) return res; > > here means that the computed result is res + cor, where > |cor| < 0.5 ulp (in nearest rounding mode), correct rounding > would require the exact result res + xcor with > |xcor| < 1.00014*|cor| < .50007 ulp. > > so in this case i think that earlier if can be removed too > with a comment saying that worst case error is expected to > be .50007 ulp in nearest rounding mode. Indeed, I suspected as much. I wasn't really comfortable trying to derive error bounds (and/or remove more code) as the original results were not 100% accurate and I'm not too much of an expert on this. > it would be nice to do similar analysis in other cases > and have some comments in the code about expected error > bounds. (but if that seems too much work i'm not against > the proposed patch.) I'd prefer to keep the patch as-is as long as that is acceptable, as I've spent a bit too much time on this already :) Thanks for taking a look. -- Anssi Hannula / Bitwise Oy +358 503803997