* [PATCH] Add benchtests for roundeven and roundevenf. @ 2020-03-05 17:31 Shen-Ta Hsieh 2020-03-27 23:25 ` Joseph Myers 0 siblings, 1 reply; 11+ messages in thread From: Shen-Ta Hsieh @ 2020-03-05 17:31 UTC (permalink / raw) To: libc-alpha; +Cc: Shen-Ta Hsieh This patch adds benchtests for the roundeven and roundevenf functions. The inputs are copied from trunc-inputs. * benchtests/Makefile (bench-math): Add roundeven and roundevenf. * benchtests/roundeven-inputs: New file. * benchtests/roundevenf-inputs: Likewise. --- benchtests/Makefile | 6 ++++-- benchtests/roundeven-inputs | 22 ++++++++++++++++++++++ benchtests/roundevenf-inputs | 21 +++++++++++++++++++++ 3 files changed, 47 insertions(+), 2 deletions(-) create mode 100644 benchtests/roundeven-inputs create mode 100644 benchtests/roundevenf-inputs diff --git a/benchtests/Makefile b/benchtests/Makefile index 71b9565fed..335d643ecb 100644 --- a/benchtests/Makefile +++ b/benchtests/Makefile @@ -23,8 +23,8 @@ subdir := benchtests include ../Makeconfig bench-math := acos acosh asin asinh atan atanh cos cosh exp exp2 log log2 \ modf pow rint sin sincos sinh sqrt tan tanh fmin fmax fminf \ - fmaxf powf trunc truncf expf exp2f logf log2f sincosf sinf \ - cosf isnan isinf isfinite hypot logb logbf + fmaxf powf trunc truncf roundeven roundevenf expf exp2f logf \ + log2f sincosf sinf cosf isnan isinf isfinite hypot logb logbf bench-pthread := pthread_once thread_create @@ -88,6 +88,8 @@ CFLAGS-bench-fmax.c += -fno-builtin CFLAGS-bench-fmaxf.c += -fno-builtin CFLAGS-bench-trunc.c += -fno-builtin CFLAGS-bench-truncf.c += -fno-builtin +CFLAGS-bench-roundeven.c += -fno-builtin +CFLAGS-bench-roundevenf.c += -fno-builtin CFLAGS-bench-isnan.c += -fsignaling-nans CFLAGS-bench-isinf.c += -fsignaling-nans CFLAGS-bench-isfinite.c += -fsignaling-nans diff --git a/benchtests/roundeven-inputs b/benchtests/roundeven-inputs new file mode 100644 index 0000000000..49ff407a6a --- /dev/null +++ b/benchtests/roundeven-inputs @@ -0,0 +1,22 @@
+## args: double +## ret: double +## includes: math.h +0.0 +-0.0 +0.001 +-0.001 +0.5 +-0.5 +0.999 +-0.999 +1.0 +-1.0 +1.001 +-1.001 +123.5 +-123.5 +12345.1 +-1000000.1 +1e15 +-1e30 +1e200 diff --git a/benchtests/roundevenf-inputs b/benchtests/roundevenf-inputs new file mode 100644 index 0000000000..c37c5dacba --- /dev/null +++ b/benchtests/roundevenf-inputs @@ -0,0 +1,21 @@ +## args: float +## ret: float +## includes: math.h +0.0f +-0.0f +0.001f +-0.001f +0.5f +-0.5f +0.999f +-0.999f +1.0f +-1.0f +1.001f +-1.001f +123.5f +-123.5f +12345.1f +-1000000.5f +1e15f +-1e30f -- 2.25.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Add benchtests for roundeven and roundevenf. 2020-03-05 17:31 [PATCH] Add benchtests for roundeven and roundevenf Shen-Ta Hsieh @ 2020-03-27 23:25 ` Joseph Myers 2020-05-02 15:02 ` [PATCH v4 1/2] math: redirect roundeven function Shen-Ta Hsieh 2020-05-02 15:02 ` [PATCH v4 2/2] x86_64: roundeven with sse4.1 support Shen-Ta Hsieh 0 siblings, 2 replies; 11+ messages in thread From: Joseph Myers @ 2020-03-27 23:25 UTC (permalink / raw) To: Shen-Ta Hsieh; +Cc: libc-alpha On Fri, 6 Mar 2020, Shen-Ta Hsieh wrote: > This patch adds benchtests for the roundeven and roundevenf functions. The > inputs are copied from trunc-inputs. Thanks. I've committed this patch. This one doesn't need an FSF copyright assignment, but more substantive patches will. > * benchtests/Makefile (bench-math): Add roundeven and roundevenf. > * benchtests/roundeven-inputs: New file. > * benchtests/roundevenf-inputs: Likewise. This ChangeLog-format log is no longer needed. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v4 1/2] math: redirect roundeven function 2020-03-27 23:25 ` Joseph Myers @ 2020-05-02 15:02 ` Shen-Ta Hsieh 2020-05-02 15:02 ` [PATCH v4 2/2] x86_64: roundeven with sse4.1 support Shen-Ta Hsieh 1 sibling, 0 replies; 11+ messages in thread From: Shen-Ta Hsieh @ 2020-05-02 15:02 UTC (permalink / raw) To: libc-alpha; +Cc: Shen-Ta Hsieh This patch redirects the roundeven function in preparation for further changes. --- include/math.h | 2 +- sysdeps/ieee754/dbl-64/s_roundeven.c | 4 +++- sysdeps/ieee754/dbl-64/wordsize-64/s_roundeven.c | 4 +++- sysdeps/ieee754/float128/s_roundevenf128.c | 1 + sysdeps/ieee754/flt-32/s_roundevenf.c | 3 +++ sysdeps/ieee754/ldbl-128/s_roundevenl.c | 1 + sysdeps/ieee754/ldbl-96/s_roundevenl.c | 1 + 7 files changed, 13 insertions(+), 3 deletions(-) diff --git a/include/math.h b/include/math.h index 3979c47400..77d9c33045 100644 --- a/include/math.h +++ b/include/math.h @@ -39,7 +39,6 @@ libm_hidden_proto (__issignaling) libm_hidden_proto (__issignalingf) libm_hidden_proto (__exp) libm_hidden_proto (__expf) -libm_hidden_proto (__roundeven) # if !defined __NO_LONG_DOUBLE_MATH \ && __LDOUBLE_REDIRECTS_TO_FLOAT128_ABI == 0 @@ -160,6 +159,7 @@ fabsf128 (_Float128 x) MATH_REDIRECT (sqrt, "__ieee754_", MATH_REDIRECT_UNARY_ARGS) MATH_REDIRECT (ceil, "__", MATH_REDIRECT_UNARY_ARGS) MATH_REDIRECT (floor, "__", MATH_REDIRECT_UNARY_ARGS) +MATH_REDIRECT (roundeven, "__", MATH_REDIRECT_UNARY_ARGS) MATH_REDIRECT (rint, "__", MATH_REDIRECT_UNARY_ARGS) MATH_REDIRECT (trunc, "__", MATH_REDIRECT_UNARY_ARGS) MATH_REDIRECT (round, "__", MATH_REDIRECT_UNARY_ARGS) diff --git a/sysdeps/ieee754/dbl-64/s_roundeven.c b/sysdeps/ieee754/dbl-64/s_roundeven.c index ac8c64e229..e5cd3e71b5 100644 --- a/sysdeps/ieee754/dbl-64/s_roundeven.c +++ b/sysdeps/ieee754/dbl-64/s_roundeven.c @@ -17,6 +17,7 @@ License along with the GNU C Library; if not, see <https://www.gnu.org/licenses/>.
*/ +#define NO_MATH_REDIRECT #include <math.h> #include <math_private.h> #include <libm-alias-double.h> @@ -101,5 +102,6 @@ __roundeven (double x) INSERT_WORDS (x, hx, lx); return x; } -hidden_def (__roundeven) +#ifndef __roundeven libm_alias_double (__roundeven, roundeven) +#endif diff --git a/sysdeps/ieee754/dbl-64/wordsize-64/s_roundeven.c b/sysdeps/ieee754/dbl-64/wordsize-64/s_roundeven.c index 47dca5f000..279883fed3 100644 --- a/sysdeps/ieee754/dbl-64/wordsize-64/s_roundeven.c +++ b/sysdeps/ieee754/dbl-64/wordsize-64/s_roundeven.c @@ -17,6 +17,7 @@ License along with the GNU C Library; if not, see <https://www.gnu.org/licenses/>. */ +#define NO_MATH_REDIRECT #include <math.h> #include <math_private.h> #include <libm-alias-double.h> @@ -67,5 +68,6 @@ __roundeven (double x) INSERT_WORDS64 (x, ix); return x; } -hidden_def (__roundeven) +#ifndef __roundeven libm_alias_double (__roundeven, roundeven) +#endif diff --git a/sysdeps/ieee754/float128/s_roundevenf128.c b/sysdeps/ieee754/float128/s_roundevenf128.c index 5a9b3f395f..e0faf727f6 100644 --- a/sysdeps/ieee754/float128/s_roundevenf128.c +++ b/sysdeps/ieee754/float128/s_roundevenf128.c @@ -1,2 +1,3 @@ +#define NO_MATH_REDIRECT #include <float128_private.h> #include "../ldbl-128/s_roundevenl.c" diff --git a/sysdeps/ieee754/flt-32/s_roundevenf.c b/sysdeps/ieee754/flt-32/s_roundevenf.c index 0d7f5eb4eb..22adfce6da 100644 --- a/sysdeps/ieee754/flt-32/s_roundevenf.c +++ b/sysdeps/ieee754/flt-32/s_roundevenf.c @@ -17,6 +17,7 @@ License along with the GNU C Library; if not, see <https://www.gnu.org/licenses/>. 
*/ +#define NO_MATH_REDIRECT #include <math.h> #include <math_private.h> #include <libm-alias-float.h> @@ -67,4 +68,6 @@ __roundevenf (float x) SET_FLOAT_WORD (x, ix); return x; } +#ifndef __roundevenf libm_alias_float (__roundeven, roundeven) +#endif diff --git a/sysdeps/ieee754/ldbl-128/s_roundevenl.c b/sysdeps/ieee754/ldbl-128/s_roundevenl.c index 61b8a377f0..9f4b57bf27 100644 --- a/sysdeps/ieee754/ldbl-128/s_roundevenl.c +++ b/sysdeps/ieee754/ldbl-128/s_roundevenl.c @@ -17,6 +17,7 @@ License along with the GNU C Library; if not, see <https://www.gnu.org/licenses/>. */ +#define NO_MATH_REDIRECT #include <math.h> #include <math_private.h> #include <libm-alias-ldouble.h> diff --git a/sysdeps/ieee754/ldbl-96/s_roundevenl.c b/sysdeps/ieee754/ldbl-96/s_roundevenl.c index 62a9ac38fd..15c990904f 100644 --- a/sysdeps/ieee754/ldbl-96/s_roundevenl.c +++ b/sysdeps/ieee754/ldbl-96/s_roundevenl.c @@ -17,6 +17,7 @@ License along with the GNU C Library; if not, see <https://www.gnu.org/licenses/>. */ +#define NO_MATH_REDIRECT #include <math.h> #include <math_private.h> #include <libm-alias-ldouble.h> -- 2.26.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v4 2/2] x86_64: roundeven with sse4.1 support 2020-03-27 23:25 ` Joseph Myers 2020-05-02 15:02 ` [PATCH v4 1/2] math: redirect roundeven function Shen-Ta Hsieh @ 2020-05-02 15:02 ` Shen-Ta Hsieh 2020-05-28 12:05 ` H.J. Lu 1 sibling, 1 reply; 11+ messages in thread From: Shen-Ta Hsieh @ 2020-05-02 15:02 UTC (permalink / raw) To: libc-alpha; +Cc: Shen-Ta Hsieh

This patch adds support for the sse4.1 hardware floating point roundeven. Here is a benchmark result on my AMD Ryzen 9 3900X system:

* benchmark result before this commit
|            | roundeven   | roundevenf  |
|------------|-------------|-------------|
| duration   | 3.77783e+09 | 3.77792e+09 |
| iterations | 3.75706e+08 | 3.80448e+08 |
| max        | 158.498     | 88.539      |
| min        | 6.802       | 7.676       |
| mean       | 10.0553     | 9.93018     |

* benchmark result after this commit
|            | roundeven   | roundevenf  |
|------------|-------------|-------------|
| duration   | 3.77242e+09 | 3.77238e+09 |
| iterations | 5.18681e+08 | 5.2425e+08  |
| max        | 127.338     | 172.102     |
| min        | 7.03        | 7.03        |
| mean       | 7.27311     | 7.19577     |

--- sysdeps/x86_64/fpu/multiarch/Makefile | 5 +-- sysdeps/x86_64/fpu/multiarch/s_roundeven-c.c | 2 ++ .../x86_64/fpu/multiarch/s_roundeven-sse4_1.S | 26 ++++++++++++++++ sysdeps/x86_64/fpu/multiarch/s_roundeven.c | 31 +++++++++++++++++++ sysdeps/x86_64/fpu/multiarch/s_roundevenf-c.c | 3 ++ .../fpu/multiarch/s_roundevenf-sse4_1.S | 26 ++++++++++++++++ sysdeps/x86_64/fpu/multiarch/s_roundevenf.c | 31 +++++++++++++++++++ 7 files changed, 122 insertions(+), 2 deletions(-) create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundeven-c.c create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundeven-sse4_1.S create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundeven.c create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundevenf-c.c create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundevenf-sse4_1.S create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundevenf.c diff --git a/sysdeps/x86_64/fpu/multiarch/Makefile
b/sysdeps/x86_64/fpu/multiarch/Makefile index 3836574f48..7e3a3f78cb 100644 --- a/sysdeps/x86_64/fpu/multiarch/Makefile +++ b/sysdeps/x86_64/fpu/multiarch/Makefile @@ -1,11 +1,12 @@ ifeq ($(subdir),math) libm-sysdep_routines += s_floor-c s_ceil-c s_floorf-c s_ceilf-c \ s_rint-c s_rintf-c s_nearbyint-c s_nearbyintf-c \ - s_trunc-c s_truncf-c + s_roundeven-c s_roundevenf-c s_trunc-c s_truncf-c libm-sysdep_routines += s_ceil-sse4_1 s_ceilf-sse4_1 s_floor-sse4_1 \ s_floorf-sse4_1 s_nearbyint-sse4_1 \ - s_nearbyintf-sse4_1 s_rint-sse4_1 s_rintf-sse4_1 \ + s_nearbyintf-sse4_1 s_roundeven-sse4_1 \ + s_roundevenf-sse4_1 s_rint-sse4_1 s_rintf-sse4_1 \ s_trunc-sse4_1 s_truncf-sse4_1 libm-sysdep_routines += e_exp-fma e_log-fma e_pow-fma s_atan-fma \ diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundeven-c.c b/sysdeps/x86_64/fpu/multiarch/s_roundeven-c.c new file mode 100644 index 0000000000..c7be43cb22 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_roundeven-c.c @@ -0,0 +1,2 @@ +#define __roundeven __roundeven_c +#include <sysdeps/ieee754/dbl-64/s_roundeven.c> diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundeven-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_roundeven-sse4_1.S new file mode 100644 index 0000000000..6db88a1649 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_roundeven-sse4_1.S @@ -0,0 +1,26 @@ +/* Round to nearest integer value, rounding halfway cases to even. + double version. + Copyright (C) 2019 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#include <sysdep.h> + + .section .text.sse4.1,"ax",@progbits +ENTRY(__roundeven_sse41) + roundsd $8, %xmm0, %xmm0 + ret +END(__roundeven_sse41) diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundeven.c b/sysdeps/x86_64/fpu/multiarch/s_roundeven.c new file mode 100644 index 0000000000..bd777b0ca7 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_roundeven.c @@ -0,0 +1,31 @@ +/* Multiple versions of __roundeven. + Copyright (C) 2019 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. 
*/ + +#include <libm-alias-double.h> + +#define roundeven __redirect_roundeven +#define __roundeven __redirect___roundeven +#include <math.h> +#undef roundeven +#undef __roundeven + +#define SYMBOL_NAME roundeven +#include "ifunc-sse4_1.h" + +libc_ifunc_redirected (__redirect_roundeven, __roundeven, IFUNC_SELECTOR ()); +libm_alias_double (__roundeven, roundeven) diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundevenf-c.c b/sysdeps/x86_64/fpu/multiarch/s_roundevenf-c.c new file mode 100644 index 0000000000..72a6e7d1fb --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_roundevenf-c.c @@ -0,0 +1,3 @@ +#undef __roundevenf +#define __roundevenf __roundevenf_c +#include <sysdeps/ieee754/flt-32/s_roundevenf.c> diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundevenf-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_roundevenf-sse4_1.S new file mode 100644 index 0000000000..74102bac0d --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_roundevenf-sse4_1.S @@ -0,0 +1,26 @@ +/* Round to nearest integer value, rounding halfway cases to even. + float version. + Copyright (C) 2019 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. 
*/ + +#include <sysdep.h> + + .section .text.sse4.1,"ax",@progbits +ENTRY(__roundevenf_sse41) + roundss $8, %xmm0, %xmm0 + ret +END(__roundevenf_sse41) diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundevenf.c b/sysdeps/x86_64/fpu/multiarch/s_roundevenf.c new file mode 100644 index 0000000000..8ae1944d2b --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_roundevenf.c @@ -0,0 +1,31 @@ +/* Multiple versions of __roundevenf. + Copyright (C) 2019 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#include <libm-alias-float.h> + +#define roundevenf __redirect_roundevenf +#define __roundevenf __redirect___roundevenf +#include <math.h> +#undef roundevenf +#undef __roundevenf + +#define SYMBOL_NAME roundevenf +#include "ifunc-sse4_1.h" + +libc_ifunc_redirected (__redirect_roundevenf, __roundevenf, IFUNC_SELECTOR ()); +libm_alias_float (__roundeven, roundeven) -- 2.26.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v4 2/2] x86_64: roundeven with sse4.1 support 2020-05-02 15:02 ` [PATCH v4 2/2] x86_64: roundeven with sse4.1 support Shen-Ta Hsieh @ 2020-05-28 12:05 ` H.J. Lu 2020-05-28 12:22 ` Florian Weimer 0 siblings, 1 reply; 11+ messages in thread From: H.J. Lu @ 2020-05-28 12:05 UTC (permalink / raw) To: Shen-Ta Hsieh; +Cc: GNU C Library On Sat, May 2, 2020 at 8:06 AM Shen-Ta Hsieh via Libc-alpha <libc-alpha@sourceware.org> wrote: > > This patch adds support for the sse4.1 hardware floating point > roundeven. Do you have FSF paper on file? > Here is a benchmark result on my AMD Ryzen 9 3900X system: Since we don't know or may not care SSE4 machines without AVX, should we make it to AVX only? > * benchmark result before this commit > | | roundeven | roundevenf | > |------------|---------------|--------------| > | duration | 3.77783e+09 | 3.77792e+09 | > | iterations | 3.75706e+08 | 3.80448e+08 | > | max | 158.498 | 88.539 | > | min | 6.802 | 7.676 | > | mean | 10.0553 | 9.93018 | > > * benchmark result after this commit > | | roundeven | roundevenf | > |------------|---------------|---------------| > | duration | 3.77242e+09 | 3.77238e+09 | > | iterations | 5.18681e+08 | 5.2425e+08 | > | max | 127.338 | 172.102 | > | min | 7.03 | 7.03 | > | mean | 7.27311 | 7.19577 | > --- > sysdeps/x86_64/fpu/multiarch/Makefile | 5 +-- > sysdeps/x86_64/fpu/multiarch/s_roundeven-c.c | 2 ++ > .../x86_64/fpu/multiarch/s_roundeven-sse4_1.S | 26 ++++++++++++++++ > sysdeps/x86_64/fpu/multiarch/s_roundeven.c | 31 +++++++++++++++++++ > sysdeps/x86_64/fpu/multiarch/s_roundevenf-c.c | 3 ++ > .../fpu/multiarch/s_roundevenf-sse4_1.S | 26 ++++++++++++++++ > sysdeps/x86_64/fpu/multiarch/s_roundevenf.c | 31 +++++++++++++++++++ > 7 files changed, 122 insertions(+), 2 deletions(-) > create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundeven-c.c > create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundeven-sse4_1.S > create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundeven.c > create mode 
100644 sysdeps/x86_64/fpu/multiarch/s_roundevenf-c.c > create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundevenf-sse4_1.S > create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundevenf.c > > diff --git a/sysdeps/x86_64/fpu/multiarch/Makefile b/sysdeps/x86_64/fpu/multiarch/Makefile > index 3836574f48..7e3a3f78cb 100644 > --- a/sysdeps/x86_64/fpu/multiarch/Makefile > +++ b/sysdeps/x86_64/fpu/multiarch/Makefile > @@ -1,11 +1,12 @@ > ifeq ($(subdir),math) > libm-sysdep_routines += s_floor-c s_ceil-c s_floorf-c s_ceilf-c \ > s_rint-c s_rintf-c s_nearbyint-c s_nearbyintf-c \ > - s_trunc-c s_truncf-c > + s_roundeven-c s_roundevenf-c s_trunc-c s_truncf-c > > libm-sysdep_routines += s_ceil-sse4_1 s_ceilf-sse4_1 s_floor-sse4_1 \ > s_floorf-sse4_1 s_nearbyint-sse4_1 \ > - s_nearbyintf-sse4_1 s_rint-sse4_1 s_rintf-sse4_1 \ > + s_nearbyintf-sse4_1 s_roundeven-sse4_1 \ > + s_roundevenf-sse4_1 s_rint-sse4_1 s_rintf-sse4_1 \ > s_trunc-sse4_1 s_truncf-sse4_1 > > libm-sysdep_routines += e_exp-fma e_log-fma e_pow-fma s_atan-fma \ > diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundeven-c.c b/sysdeps/x86_64/fpu/multiarch/s_roundeven-c.c > new file mode 100644 > index 0000000000..c7be43cb22 > --- /dev/null > +++ b/sysdeps/x86_64/fpu/multiarch/s_roundeven-c.c > @@ -0,0 +1,2 @@ > +#define __roundeven __roundeven_c > +#include <sysdeps/ieee754/dbl-64/s_roundeven.c> > diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundeven-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_roundeven-sse4_1.S > new file mode 100644 > index 0000000000..6db88a1649 > --- /dev/null > +++ b/sysdeps/x86_64/fpu/multiarch/s_roundeven-sse4_1.S > @@ -0,0 +1,26 @@ > +/* Round to nearest integer value, rounding halfway cases to even. > + double version. > + Copyright (C) 2019 Free Software Foundation, Inc. Please replace all 2019 with 2020. -- H.J. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v4 2/2] x86_64: roundeven with sse4.1 support 2020-05-28 12:05 ` H.J. Lu @ 2020-05-28 12:22 ` Florian Weimer 2020-05-28 12:31 ` H.J. Lu 0 siblings, 1 reply; 11+ messages in thread From: Florian Weimer @ 2020-05-28 12:22 UTC (permalink / raw) To: H.J. Lu via Libc-alpha; +Cc: Shen-Ta Hsieh, H.J. Lu * H. J. Lu via Libc-alpha: >> Here is a benchmark result on my AMD Ryzen 9 3900X system: > > Since we don't know or may not care SSE4 machines without AVX, > should we make it to AVX only? What about Goldmont/Tremont? Those are current CPUs which do not support AVX, but I think they have sufficient SSE4 support levels for this change. Thanks, Florian ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v4 2/2] x86_64: roundeven with sse4.1 support 2020-05-28 12:22 ` Florian Weimer @ 2020-05-28 12:31 ` H.J. Lu 2020-05-29 8:48 ` Cui, Lili 0 siblings, 1 reply; 11+ messages in thread From: H.J. Lu @ 2020-05-28 12:31 UTC (permalink / raw) To: Florian Weimer, Lili Cui; +Cc: H.J. Lu via Libc-alpha, Shen-Ta Hsieh On Thu, May 28, 2020 at 5:22 AM Florian Weimer <fweimer@redhat.com> wrote: > > * H. J. Lu via Libc-alpha: > > >> Here is a benchmark result on my AMD Ryzen 9 3900X system: > > > > Since we don't know or may not care SSE4 machines without AVX, > > should we make it to AVX only? > > What about Goldmont/Tremont? Those are current CPUs which do not > support AVX, but I think they have sufficient SSE4 support levels for > this change. > Good point. Lili, please collect glibc micro benchmark roundeven/roundevenf data before and after: https://sourceware.org/pipermail/libc-alpha/2020-May/113533.html on Tremont. -- H.J. ^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH v4 2/2] x86_64: roundeven with sse4.1 support 2020-05-28 12:31 ` H.J. Lu @ 2020-05-29 8:48 ` Cui, Lili 2020-05-29 11:29 ` H.J. Lu 0 siblings, 1 reply; 11+ messages in thread From: Cui, Lili @ 2020-05-29 8:48 UTC (permalink / raw) To: H.J. Lu, Florian Weimer; +Cc: H.J. Lu via Libc-alpha, Shen-Ta Hsieh > -----Original Message----- > From: H.J. Lu <hjl.tools@gmail.com> > Sent: Thursday, May 28, 2020 8:32 PM > To: Florian Weimer <fweimer@redhat.com>; Cui, Lili <lili.cui@intel.com> > Cc: H.J. Lu via Libc-alpha <libc-alpha@sourceware.org>; Shen-Ta Hsieh > <ibmibmibm.tw@gmail.com> > Subject: Re: [PATCH v4 2/2] x86_64: roundeven with sse4.1 support > > On Thu, May 28, 2020 at 5:22 AM Florian Weimer <fweimer@redhat.com> > wrote: > > > > * H. J. Lu via Libc-alpha: > > > > >> Here is a benchmark result on my AMD Ryzen 9 3900X system: > > > > > > Since we don't know or may not care SSE4 machines without AVX, > > > should we make it to AVX only? > > > > What about Goldmont/Tremont? Those are current CPUs which do not > > support AVX, but I think they have sufficient SSE4 support levels for > > this change. > > > > Good point. Lili, please collect glibc micro benchmark roundeven/roundevenf > data before and after: > > https://sourceware.org/pipermail/libc-alpha/2020-May/113533.html > > on Tremont. > > -- > H.J. Hi H.J, Result is here. benchmark result before this commit on Tremont benchmark result after this commit on Tremont ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v4 2/2] x86_64: roundeven with sse4.1 support 2020-05-29 8:48 ` Cui, Lili @ 2020-05-29 11:29 ` H.J. Lu 2020-06-01 1:28 ` Cui, Lili 0 siblings, 1 reply; 11+ messages in thread From: H.J. Lu @ 2020-05-29 11:29 UTC (permalink / raw) To: Cui, Lili; +Cc: Florian Weimer, H.J. Lu via Libc-alpha, Shen-Ta Hsieh On Fri, May 29, 2020 at 1:48 AM Cui, Lili <lili.cui@intel.com> wrote: > > > > -----Original Message----- > > From: H.J. Lu <hjl.tools@gmail.com> > > Sent: Thursday, May 28, 2020 8:32 PM > > To: Florian Weimer <fweimer@redhat.com>; Cui, Lili <lili.cui@intel.com> > > Cc: H.J. Lu via Libc-alpha <libc-alpha@sourceware.org>; Shen-Ta Hsieh > > <ibmibmibm.tw@gmail.com> > > Subject: Re: [PATCH v4 2/2] x86_64: roundeven with sse4.1 support > > > > On Thu, May 28, 2020 at 5:22 AM Florian Weimer <fweimer@redhat.com> > > wrote: > > > > > > * H. J. Lu via Libc-alpha: > > > > > > >> Here is a benchmark result on my AMD Ryzen 9 3900X system: > > > > > > > > Since we don't know or may not care SSE4 machines without AVX, > > > > should we make it to AVX only? > > > > > > What about Goldmont/Tremont? Those are current CPUs which do not > > > support AVX, but I think they have sufficient SSE4 support levels for > > > this change. > > > > > > > Good point. Lili, please collect glibc micro benchmark > roundeven/roundevenf > > data before and after: > > > > https://sourceware.org/pipermail/libc-alpha/2020-May/113533.html > > > > on Tremont. > > > > -- > > H.J. > > Hi H.J, > > Result is here. > benchmark result before this commit on Tremont > > > > benchmark result after this commit on Tremont > > > > Hi Lili, The results are empty. -- H.J. ^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH v4 2/2] x86_64: roundeven with sse4.1 support 2020-05-29 11:29 ` H.J. Lu @ 2020-06-01 1:28 ` Cui, Lili 2020-06-01 2:04 ` H.J. Lu 0 siblings, 1 reply; 11+ messages in thread From: Cui, Lili @ 2020-06-01 1:28 UTC (permalink / raw) To: H.J. Lu; +Cc: Florian Weimer, H.J. Lu via Libc-alpha, Shen-Ta Hsieh From: H.J. Lu <hjl.tools@gmail.com> Sent: Friday, May 29, 2020 7:30 PM To: Cui, Lili <lili.cui@intel.com> Cc: Florian Weimer <fweimer@redhat.com>; H.J. Lu via Libc-alpha <libc-alpha@sourceware.org>; Shen-Ta Hsieh <ibmibmibm.tw@gmail.com> Subject: Re: [PATCH v4 2/2] x86_64: roundeven with sse4.1 support On Fri, May 29, 2020 at 1:48 AM Cui, Lili <lili.cui@intel.com> wrote: > -----Original Message----- > From: H.J. Lu <hjl.tools@gmail.com> > Sent: Thursday, May 28, 2020 8:32 PM > To: Florian Weimer <fweimer@redhat.com>; Cui, Lili <lili.cui@intel.com> > Cc: H.J. Lu via Libc-alpha <libc-alpha@sourceware.org>; Shen-Ta Hsieh > <ibmibmibm.tw@gmail.com> > Subject: Re: [PATCH v4 2/2] x86_64: roundeven with sse4.1 support > > On Thu, May 28, 2020 at 5:22 AM Florian Weimer <fweimer@redhat.com> > wrote: > > > > * H. J. Lu via Libc-alpha: > > > > >> Here is a benchmark result on my AMD Ryzen 9 3900X system: > > > > > > Since we don't know or may not care SSE4 machines without AVX, > > > should we make it to AVX only? > > > > What about Goldmont/Tremont? Those are current CPUs which do not > > support AVX, but I think they have sufficient SSE4 support levels for > > this change. > > > > Good point. Lili, please collect glibc micro benchmark roundeven/roundevenf > data before and after: > > https://sourceware.org/pipermail/libc-alpha/2020-May/113533.html > > on Tremont. > > -- > H.J. Hi H.J, Result is here. benchmark result before this commit on Tremont benchmark result after this commit on Tremont Hi Lili, The results are empty. -- H.J.

Hi H.J, Sorry for that my format has some problems, data is here.

benchmark result before this commit on Tremont
|            | roundeven   | roundevenf  |
|------------|-------------|-------------|
| duration   | 2.19422e+09 | 2.19402e+09 |
| iterations | 1.44514e+08 | 1.4184e+08  |
| max        | 43.258      | 53.07       |
| min        | 11.052      | 12.052      |
| mean       | 15.1835     | 15.4683     |

benchmark result after this commit on Tremont
|            | roundeven   | roundevenf  |
|------------|-------------|-------------|
| duration   | 2.19144e+09 | 2.19218e+09 |
| iterations | 2.17075e+08 | 1.97982e+08 |
| max        | 395.428     | 34.928      |
| min        | 10.044      | 11.02       |
| mean       | 10.0953     | 11.0726     |

Thanks, Lili. ^ permalink raw reply [flat|nested] 11+ messages in thread
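To put the posted means side by side: assuming, as is usual for glibc benchtests, that a lower mean is better, the drop in the mean can be expressed as 100 * (before - after) / before. The sketch below applies that formula to the four pairs of means quoted in this thread and gives reductions of roughly 27% to 34%. The struct, array and function names are illustrative only.

```c
#include <assert.h>
#include <math.h>

/* Percentage by which the mean dropped: 100 * (before - after) / before.  */
double
mean_reduction_percent (double before, double after)
{
  return 100.0 * (before - after) / before;
}

/* Means quoted in the thread; the struct and its field names are
   illustrative only.  */
struct mean_pair
{
  const char *label;
  double before;
  double after;
};

static const struct mean_pair thread_means[] = {
  { "Ryzen 9 3900X roundeven",  10.0553, 7.27311 },
  { "Ryzen 9 3900X roundevenf", 9.93018, 7.19577 },
  { "Tremont roundeven",        15.1835, 10.0953 },
  { "Tremont roundevenf",       15.4683, 11.0726 },
};

/* Every posted pair shows a lower mean after the patch.  */
void
check_all_improved (void)
{
  for (unsigned i = 0; i < sizeof thread_means / sizeof thread_means[0]; i++)
    assert (thread_means[i].after < thread_means[i].before);
}
```

The max values are noisier (for example 395.428 for roundeven on Tremont after the patch), so the mean is the more useful figure for comparison here.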
* Re: [PATCH v4 2/2] x86_64: roundeven with sse4.1 support 2020-06-01 1:28 ` Cui, Lili @ 2020-06-01 2:04 ` H.J. Lu 0 siblings, 0 replies; 11+ messages in thread From: H.J. Lu @ 2020-06-01 2:04 UTC (permalink / raw) To: Cui, Lili; +Cc: Florian Weimer, H.J. Lu via Libc-alpha, Shen-Ta Hsieh On Sun, May 31, 2020 at 6:28 PM Cui, Lili <lili.cui@intel.com> wrote: > > > > > > From: H.J. Lu <hjl.tools@gmail.com> > Sent: Friday, May 29, 2020 7:30 PM > To: Cui, Lili <lili.cui@intel.com> > Cc: Florian Weimer <fweimer@redhat.com>; H.J. Lu via Libc-alpha <libc-alpha@sourceware.org>; Shen-Ta Hsieh <ibmibmibm.tw@gmail.com> > Subject: Re: [PATCH v4 2/2] x86_64: roundeven with sse4.1 support > > > > On Fri, May 29, 2020 at 1:48 AM Cui, Lili <lili.cui@intel.com> wrote: > > > > > > > -----Original Message----- > > > From: H.J. Lu <hjl.tools@gmail.com> > > > Sent: Thursday, May 28, 2020 8:32 PM > > > To: Florian Weimer <fweimer@redhat.com>; Cui, Lili <lili.cui@intel.com> > > > Cc: H.J. Lu via Libc-alpha <libc-alpha@sourceware.org>; Shen-Ta Hsieh > > > <ibmibmibm.tw@gmail.com> > > > Subject: Re: [PATCH v4 2/2] x86_64: roundeven with sse4.1 support > > > > > > On Thu, May 28, 2020 at 5:22 AM Florian Weimer <fweimer@redhat.com> > > > wrote: > > > > > > > > * H. J. Lu via Libc-alpha: > > > > > > > > >> Here is a benchmark result on my AMD Ryzen 9 3900X system: > > > > > > > > > > Since we don't know or may not care SSE4 machines without AVX, > > > > > should we make it to AVX only? > > > > > > > > What about Goldmont/Tremont? Those are current CPUs which do not > > > > support AVX, but I think they have sufficient SSE4 support levels for > > > > this change. > > > > > > > > > > Good point. Lili, please collect glibc micro benchmark roundeven/roundevenf > > > data before and after: > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-May/113533.html > > > > > > on Tremont. > > > > > > -- > > > H.J. > > > > Hi H.J, > > > > Result is here. 
> > benchmark result before this commit on Tremont > > > > > > > > benchmark result after this commit on Tremont > > > > > > > > > > Hi Lili, > > > > The results are empty. > > > > -- > > H.J. > > > > Hi H.J, > > > > Sorry for that my format has some problems, data is here. > > > > benchmark result before this commit on Tremont > > > > "roundeven": "roundevenf": > > "duration": 2.19422e+09, "duration": 2.19402e+09, > > "iterations": 1.44514e+08, "iterations": 1.4184e+08, > > "max": 43.258, "max": 53.07, > > "min": 11.052, "min": 12.052, > > "mean": 15.1835 "mean": 15.4683 > > > > benchmark result after this commit on Tremont > > > > "roundeven": "roundevenf": > > "duration": 2.19144e+09, "duration": 2.19218e+09, > > "iterations": 2.17075e+08, "iterations": 1.97982e+08, > > "max": 395.428, "max": 34.928, > > "min": 10.044, "min": 11.02, > > "mean": 10.0953 "mean": 11.0726 > > > Looks good. Thanks. -- H.J. ^ permalink raw reply [flat|nested] 11+ messages in thread