public inbox for libc-alpha@sourceware.org
* review of libmvec's accuracy
From: Paul Zimmermann @ 2022-01-17  7:46 UTC
  To: libc-alpha

       Hi,

during the last week I reviewed libmvec's accuracy after the recent
changes, both for binary32 and binary64, and with SSE4.2, AVX2 and AVX512.
Thanks to H.J. and Sunil Pandey for helping me to build libmvec and compile
my program with libmvec.

Apart from an issue with the binary64 atan2 function with sse4.2 which was
rapidly fixed (BZ #28765), I found no error of more than 4 ulps.

For univariate binary32 functions, an exhaustive test was performed.
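
For illustration only (this is not the program used for the review), an
exhaustive binary32 test can be sketched by walking over 32-bit patterns and
comparing each result with a more precise reference. In the sketch below the
helper names (float32_from_bits, round_to_float32, ulp32, worst_case, sin32)
are hypothetical, the "candidate" is simply the binary64 sine rounded to
binary32 rather than a libmvec routine, and the binary64 result stands in for
a correctly rounded reference:

```python
import math
import struct

def float32_from_bits(bits):
    """Reinterpret a 32-bit unsigned integer as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def round_to_float32(x):
    """Round a binary64 value to the nearest binary32 value (no overflow handling)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def ulp32(x):
    """Spacing of binary32 numbers around a finite value x."""
    if abs(x) < 2.0 ** -126:          # zero and subnormal range
        return 2.0 ** -149
    exponent = math.frexp(abs(x))[1]  # abs(x) = m * 2**exponent, 0.5 <= m < 1
    return 2.0 ** (exponent - 24)

def worst_case(candidate, reference, first_bits, last_bits):
    """Return (max error in ulps, input) over the given range of bit patterns."""
    worst_err, worst_x = 0.0, None
    for bits in range(first_bits, last_bits + 1):
        x = float32_from_bits(bits)
        if not math.isfinite(x):
            continue                   # skip NaN and infinity encodings
        ref = reference(x)
        err = abs(candidate(x) - ref) / ulp32(ref)
        if err > worst_err:
            worst_err, worst_x = err, x
    return worst_err, worst_x

def sin32(x):
    """Stand-in candidate: binary64 sine rounded to binary32 (assumption)."""
    return round_to_float32(math.sin(x))

# A small slice starting at 1.0f; a truly exhaustive run would cover
# 0x00000000 through 0xFFFFFFFF, which is slow in pure Python.
err, x = worst_case(sin32, math.sin, 0x3F800000, 0x3F800000 + 1000)
print(err, x)
```

A real test would call the vector routine under test as the candidate and
would use a correctly rounded reference, since a binary64 result is only
approximately exact at binary32 precision.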

For binary64 functions, here are the largest errors I found (with corresponding
inputs):

SSE 4.2:
acos 0 -1 0x1.35a0de2b038fep-1 [2] [2.31] 2.30611 2.306107962898911
acosh 0 -1 0x1.001ffe4f6d239p+0 [3] [3.32] 3.31361 3.313607079224876
asin 0 -1 0x1.000c80481e7f9p-1 [3] [3.49] 3.48486 3.484857846054733
asinh 0 -1 -0x1.fff13ade9918p-12 [4] [3.59] 3.58616 3.586157827853675
atan 0 -1 0x1.000a5e848c0dp-3 [3] [2.65] 2.6434 2.643390142606502
atanh 0 -1 -0x1.fff0caf4c48dp-13 [4] [3.59] 3.58751 3.587505773899075
cbrt 0 -1 -0x0.bd54cbc41f0b9p-1022 [3] [3.43] 3.42039 3.4203899319518
cos 0 -1 0x1.852e715836081p+18 [4] [3.85] 3.84216 3.84215369681995
cosh 0 -1 -0x1.633c654fee2bap+9 [2] [1.93] 1.92222 1.922214006544865
erf 0 -1 0x1.0000b7af4dcdp-8 [3] [2.55] 2.54483 2.544825896765913
erfc 0 -1 0x1.78afff6f3044cp+4 [2] [2.22] 2.21867 2.218661454888216
exp 0 -1 -0x1.205968aae119fp-8 [3] [3.21] 3.20379 3.20378601523757
exp10 0 -1 0x1.33f4082f47b74p+8 [2] [2.01] 2.00015 2.000141917186939
exp2 0 -1 -0x1.4c31866f6d3bbp-6 [2] [1.65] 1.64293 1.642923112885358
expm1 0 -1 0x1.86f57e8de4a5p-9 [3] [2.97] 2.96513 2.96512822155441
log 0 -1 0x1.00e000c7fa1c3p+0 [2] [1.59] 1.58569 1.585688691408811
log10 0 -1 0x1.00201204555c8p+0 [2] [2.10] 2.09444 2.094430783262944
log1p 0 -1 0x1.000bcdec306p-11 [3] [2.59] 2.58927 2.589263240103137
log2 0 -1 0x1.002000d8e91c5p+0 [2] [2.09] 2.08932 2.089318403685938
sin 0 -1 0x1.a5a68e24971a3p+20 [4] [3.84] 3.83437 3.834368052489666
sinh 0 -1 -0x1.c5c9440e9422dp-9 [2] [2.40] 2.39492 2.394913144733352
tan 0 -1 0x1.72e90b4651593p+15 [4] [3.97] 3.96274 3.962739804605088
tanh 0 -1 -0x1.000a02c5a8b47p-2 [2] [2.18] 2.17016 2.170154720479918
atan2 0 -1 0x1.abe93a1719613p-948,0x1.aab7bbb5ca811p-948 [2.48] 2.47316 2.473151123402774
hypot 0 -1 0x1.205e5d5fee071p-9,0x1.a71193d4eb838p-9 [2.67] 2.6658 2.665792374792187
pow 0 -1 0x1.e174ee53813b7p+859,-0x1.d929d0607bf52p-12 [1.01] 1.00093 1.000926794407638

AVX2:
acos 0 -1 0x1.ffc00159839aep-1 [2] [2.06] 2.05784 2.057839168511332
acosh 0 -1 0x1.007ff5e6aae25p+0 [3] [3.29] 3.28513 3.285121178034015
asin 0 -1 0x1.000fb59dbb7ffp-1 [3] [2.96] 2.95635 2.956349972684072
asinh 0 -1 0x1.fffdfee9d0656p-12 [4] [3.59] 3.58591 3.585908502909785
atan 0 -1 0x1.0029e0e2db7dp-3 [3] [2.65] 2.64176 2.641757760180734
atanh 0 -1 -0x1.ffe2abaa5690dp-13 [4] [3.59] 3.58538 3.585375859489622
cbrt 0 -1 0x0.bdf2e4b035cc5p-1022 [3] [3.41] 3.40348 3.403477053110753
cos 0 -1 -0x1.f5ec1ef4d1fb8p+3 [4] [3.67] 3.66518 3.665174088332274
cosh 0 -1 -0x1.633c654fee2bap+9 [2] [1.93] 1.92222 1.922214006544865
erf 0 -1 0x1.00005abf94234p-8 [3] [2.55] 2.54487 2.544864849771448
erfc 0 -1 0x1.78affead86a26p+4 [2] [2.21] 2.20423 2.204221099573643
exp 0 -1 -0x1.2059763f8882bp-8 [3] [3.21] 3.20362 3.203617681564174
exp10 0 -1 0x1.33f4082f47b74p+8 [2] [2.01] 2.00015 2.000141917186939
exp2 0 -1 -0x1.4c3c931a5de98p-6 [2] [1.65] 1.64097 1.640964569447761
expm1 0 -1 0x1.856b41d86994cp-9 [3] [2.97] 2.96463 2.964620318450374
log 0 -1 0x1.002001ffaa4ap+0 [2] [1.59] 1.58883 1.588820663143568
log10 0 -1 0x1.001fffbd3f495p+0 [2] [2.10] 2.0948 2.094790260038313
log1p 0 -1 0x1.fff86f9b9acp-12 [3] [2.59] 2.58923 2.589227792166073
log2 0 -1 0x1.002003e5a80e3p+0 [2] [2.09] 2.08921 2.089203751745159
sin 0 -1 0x1.9977bea4253f1p+0 [3] [3.49] 3.48842 3.488412468469455
sinh 0 -1 -0x1.633c654fee2bap+9 [2] [1.93] 1.92222 1.922214006544865
tan 0 -1 0x1.3fab696843fbfp+8 [4] [3.54] 3.53263 3.532620877461187
tanh 0 -1 -0x1.00010c3967f16p-2 [2] [2.14] 2.13884 2.138831342496609
atan2 0 -1 0x1.a83f842ef3f73p-633,0x1.a799d8a6677ep-633 [3.47] 3.46942 3.469416124628504
hypot 0 -1 0x1.6d080c1f5339ep+25,0x1.149ee0ad66632p+13 [1.39] 1.38804 1.388031156080099
pow 0 -1 0x1.bb393b102aa6p+246,-0x1.9c23caed44f1fp-10 [1.00] 0.99995 0.9999496063741427

AVX512:
acos 0 -1 0x1.35b9bac9f42c6p-1 [2] [1.83] 1.82629 1.826281376899049
acosh 0 -1 0x1.0007ffe4f42f8p+0 [2] [1.99] 1.98412 1.984110568276137
asin 0 -1 -0x1.0312655c1d169p-1 [3] [2.70] 2.69006 2.690052531147541
asinh 0 -1 -0x1.fff14d29165f4p-8 [2] [1.53] 1.52036 1.520353639540609
atan 0 -1 -0x1.0010aea41501p-3 [3] [2.65] 2.64024 2.640233443317396
atanh 0 -1 0x1.85cb7cc1e1318p-6 [2] [1.52] 1.51047 1.510460405768978
cbrt 0 -1 0x1.477fc84889eabp-511 [2] [1.84] 1.83354 1.833539371973596
cos 0 -1 -0x1.9a4f79002782p-6 [4] [3.66] 3.65239 3.652382154396661
cosh 0 -1 -0x1.2b3088f4a6e98p+4 [2] [2.03] 2.02425 2.024245773256978
erf 0 -1 0x1.00001d2920fb7p-8 [3] [2.55] 2.54487 2.54486123201942
erfc 0 -1 0x1.78afff9d452cp+4 [2] [2.21] 2.20537 2.205360962115159
exp 0 -1 -0x1.205968a73d4abp-8 [3] [3.21] 3.20361 3.203606347080522
exp10 0 -1 0x1.33f4082f47b74p+8 [2] [2.01] 2.00015 2.000141917186939
exp2 0 -1 -0x1.8000e569a5545p-3 [1] [1.06] 1.05024 1.050231467186485
expm1 0 -1 0x1.c3b7c858f0575p-6 [2] [2.12] 2.11697 2.1169642246763
log 0 -1 0x1.001f01ac83b3p+0 [2] [1.60] 1.59111 1.59110562523555
log10 0 -1 0x1.f03ebdaea826bp-1 [2] [1.96] 1.95902 1.959012325388189
log1p 0 -1 0x1.075745181aabp-6 [2] [1.95] 1.94684 1.94683948265057
log2 0 -1 0x1.ede4ac763282bp-1 [2] [1.87] 1.86313 1.863121719090275
sin 0 -1 -0x1.99631ed67b43fp+0 [3] [3.49] 3.48873 3.488727287224188
sinh 0 -1 -0x1.633c654fee2bap+9 [2] [1.93] 1.92222 1.922214006544865
tan 0 -1 -0x1.780c9aeca907cp+17 [4] [3.99] 3.98992 3.989911851716534
tanh 0 -1 -0x1.001bf41f56582p-1 [1] [1.20] 1.19944 1.199437551242041
atan2 0 -1 0x1.499c920038ab4p+559,0x1.4939bd8e01601p+559 [3.42] 3.41468 3.414672714924121
hypot 0 -1 -0x1.72b48b14296a7p-510,-0x1.3dcd53d99b107p-518 [1.51] 1.50331 1.503309592574038
pow 0 -1 0x1.65f5c9d0c7bc9p-828,0x1.eba10d43b8f54p-12 [1.00] 0.999925 0.9999244722109198
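
The inputs above are C99 hexadecimal floating-point literals, so they can be
decoded exactly and fed back into any implementation. As a hedged sketch,
the following measures the error in ulps of Python's math.exp (which calls
the scalar libm, not libmvec) at the exp input listed for SSE 4.2, using the
decimal module as a high-precision reference. The function name
ulp_error_exp is hypothetical, and the result is not expected to match the
libmvec figures in the tables:

```python
import math
from decimal import Decimal, getcontext

def ulp_error_exp(hex_input):
    """Error in ulps of math.exp at a binary64 input given as a hex literal."""
    getcontext().prec = 60          # far more digits than binary64 carries
    x = float.fromhex(hex_input)    # exact: the literal denotes a binary64 value
    computed = math.exp(x)          # scalar libm result, NOT a libmvec variant
    reference = Decimal(x).exp()    # exp(x) rounded to 60 significant digits
    ulp = Decimal(math.ulp(float(reference)))
    return float(abs(Decimal(computed) - reference) / ulp)

print(ulp_error_exp("-0x1.205968aae119fp-8"))
```

math.ulp requires Python 3.9 or later. The same pattern applies to other
functions whenever a sufficiently precise reference is available.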

As a side note, I found no mention of the libmvec accuracy in the reference
manual (libc.pdf).

Best regards,
Paul


* Re: review of libmvec's accuracy
From: Joseph Myers @ 2022-01-17 19:07 UTC
  To: Paul Zimmermann; +Cc: libc-alpha

On Mon, 17 Jan 2022, Paul Zimmermann wrote:

> As a side note, I found no mention of the libmvec accuracy in the reference
> manual (libc.pdf).

There is an explicit statement of omission in math.texi, "Only the 
round-to-nearest rounding mode is covered by this table, and vector 
versions of functions are not covered.".

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: review of libmvec's accuracy
From: Paul Zimmermann @ 2022-01-18  3:47 UTC
  To: Joseph Myers; +Cc: libc-alpha

       Dear Joseph,

> > As a side note, I found no mention of the libmvec accuracy in the reference
> > manual (libc.pdf).
> 
> There is an explicit statement of omission in math.texi, "Only the 
> round-to-nearest rounding mode is covered by this table, and vector 
> versions of functions are not covered.".

OK, it is a pity that the reference manual contains no information at all
about libmvec: how to use it, on which architectures and for which formats
it works, and how accurate it is.

Paul
