* [PATCH] powerpc: Add optimized ilogbf128 for POWER9
@ 2020-12-22 15:30 Raphael Moreira Zinsly
2020-12-22 15:30 ` [PATCH 2/2] benchtests: Add ilogbf128 test Raphael Moreira Zinsly
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Raphael Moreira Zinsly @ 2020-12-22 15:30 UTC (permalink / raw)
To: libc-alpha; +Cc: tuliom, Raphael Moreira Zinsly
The instruction xsxexpqp introduced on POWER9 extracts the exponent
from a quad-precision floating-point, thus it can be used to improve
ilogbf128 and llogbf128.
---
.../powerpc/powerpc64/le/fpu/e_ilogbf128.c | 22 +++++++++++++++++++
1 file changed, 22 insertions(+)
create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
new file mode 100644
index 0000000000..47558bbadc
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
@@ -0,0 +1,22 @@
+#ifdef _ARCH_PWR9
+int _ilogbf128 (_Float128 __x);
+
+int
+#if defined(_F128_ENABLE_IFUNC)
+__ieee754_ilogbf128_power9 (_Float128 __x)
+#else
+__ieee754_ilogbf128 (_Float128 __x)
+#endif
+{
+ /* Check for exceptional cases. */
+ if (!__builtin_vsx_scalar_test_data_class_qp (__x, 0x7f))
+ return __builtin_vsx_scalar_extract_expq (__x) - 0x3fff;
+ else
+ /* Fallback to the generic ilogb if __x is NaN, Inf or subnormal. */
+ return _ilogbf128(__x);
+}
+
+#define __ieee754_ilogbf128 _ilogbf128
+#endif
+
+#include<sysdeps/ieee754/float128/e_ilogbf128.c>
--
2.29.2
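
Note on the 0x7f mask used in the patch above: GCC's documentation for the scalar_test_data_class builtins assigns one bit per exceptional class, so 0x7f selects NaN, both infinities, both zeros and both denormals. Zero therefore also takes the fallback path, although the code comment does not mention it. The block below is only a portable sketch that emulates the same classification for binary64 in software; the enum names and both helper functions are hypothetical and are not part of glibc or GCC.

```c
#include <stdint.h>
#include <string.h>

/* Data-class mask bits, following the values documented for GCC's
   scalar_test_data_class builtins.  The names are made up here.  */
enum
{
  DC_NAN        = 0x40,
  DC_POS_INF    = 0x20,
  DC_NEG_INF    = 0x10,
  DC_POS_ZERO   = 0x08,
  DC_NEG_ZERO   = 0x04,
  DC_POS_DENORM = 0x02,
  DC_NEG_DENORM = 0x01
};

/* Software emulation for binary64: return the single class bit that
   describes X, or 0 if X is a normal number.  */
static int
data_class_of (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);

  int negative = (int) (bits >> 63);
  uint64_t exponent = (bits >> 52) & 0x7ff;
  uint64_t fraction = bits & 0x000fffffffffffffULL;

  if (exponent == 0x7ff)
    {
      if (fraction != 0)
        return DC_NAN;
      return negative ? DC_NEG_INF : DC_POS_INF;
    }
  if (exponent == 0)
    {
      if (fraction == 0)
        return negative ? DC_NEG_ZERO : DC_POS_ZERO;
      return negative ? DC_NEG_DENORM : DC_POS_DENORM;
    }
  return 0;
}

/* Nonzero if X belongs to any class selected by MASK.  */
static int
test_data_class (double x, int mask)
{
  return (data_class_of (x) & mask) != 0;
}
```

With this emulation, test_data_class (x, 0x7f) is false only for normal numbers, which is the condition under which the patch takes the exponent-extraction path.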
* [PATCH 2/2] benchtests: Add ilogbf128 test
2020-12-22 15:30 [PATCH] powerpc: Add optimized ilogbf128 for POWER9 Raphael Moreira Zinsly
@ 2020-12-22 15:30 ` Raphael Moreira Zinsly
2020-12-22 15:36 ` [PATCH] powerpc: Add optimized ilogbf128 for POWER9 Raphael M Zinsly
2021-01-04 23:20 ` Paul E Murphy
2 siblings, 0 replies; 5+ messages in thread
From: Raphael Moreira Zinsly @ 2020-12-22 15:30 UTC (permalink / raw)
To: libc-alpha; +Cc: tuliom, Raphael Moreira Zinsly
Add a benchtest for ilogbf128, based on the logb benchtests.
---
benchtests/Makefile | 2 +-
benchtests/ilogbf128-inputs | 11 +++++++++++
2 files changed, 12 insertions(+), 1 deletion(-)
create mode 100644 benchtests/ilogbf128-inputs
diff --git a/benchtests/Makefile b/benchtests/Makefile
index 5cd211ee9a..9c7d99ae66 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -28,7 +28,7 @@ bench-math := acos acosh asin asinh atan atanh cos cosh exp exp2 log log2 \
exp10f
ifneq (,$(filter yes,$(float128-fcts) $(float128-alias-fcts)))
-bench-math += expf128 powf128 sinf128
+bench-math += expf128 powf128 sinf128 ilogbf128
endif
bench-pthread := pthread_once thread_create pthread-locks
diff --git a/benchtests/ilogbf128-inputs b/benchtests/ilogbf128-inputs
new file mode 100644
index 0000000000..bfbfc93714
--- /dev/null
+++ b/benchtests/ilogbf128-inputs
@@ -0,0 +1,11 @@
+## args: _Float128
+## ret: int
+## includes: math.h
+
+## name: subnormal
+6.47517511943802511092443895822764655e-4966f128
+0x1.fffffffffffffff8p-16383f128
+
+## name: normal
+1.0
+-0x8.2faf442f390a9211f5af128673fp+0L
--
2.29.2
* Re: [PATCH] powerpc: Add optimized ilogbf128 for POWER9
2020-12-22 15:30 [PATCH] powerpc: Add optimized ilogbf128 for POWER9 Raphael Moreira Zinsly
2020-12-22 15:30 ` [PATCH 2/2] benchtests: Add ilogbf128 test Raphael Moreira Zinsly
@ 2020-12-22 15:36 ` Raphael M Zinsly
2021-01-04 23:20 ` Paul E Murphy
2 siblings, 0 replies; 5+ messages in thread
From: Raphael M Zinsly @ 2020-12-22 15:36 UTC (permalink / raw)
To: libc-alpha
Benchtest results with and without this patch on a POWER9:
without:
"ilogbf128": {
"subnormal": {
"duration": 5.09834e+08,
"iterations": 2.8146e+07,
"max": 38.979,
"min": 2.939,
"mean": 18.1139
},
"normal": {
"duration": 4.99378e+08,
"iterations": 1.6151e+08,
"max": 16.698,
"min": 2.942,
"mean": 3.09193
}
}
with:
"ilogbf128": {
"subnormal": {
"duration": 5.09989e+08,
"iterations": 2.5978e+07,
"max": 41.027,
"min": 4.674,
"mean": 19.6316
},
"normal": {
"duration": 4.98105e+08,
"iterations": 1.77912e+08,
"max": 12.663,
"min": 2.792,
"mean": 2.79972
}
}
Best Regards,
--
Raphael Moreira Zinsly
IBM
Linux on Power Toolchain
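
Note on reading the numbers above: comparing the reported "mean" values, the normal case is roughly 9% lower with the patch (3.09193 to 2.79972), while the subnormal case is roughly 8% higher (18.1139 to 19.6316). The helper below is a hypothetical illustration of that arithmetic only; it is not part of the benchtests framework.

```c
/* Relative change of NEW_MEAN versus OLD_MEAN, in percent.
   A negative result means the new mean is lower (faster).  */
static double
percent_change (double old_mean, double new_mean)
{
  return (new_mean - old_mean) / old_mean * 100.0;
}
```

For example, percent_change (3.09193, 2.79972) gives the change for the normal inputs, and percent_change (18.1139, 19.6316) gives the change for the subnormal inputs.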
* Re: [PATCH] powerpc: Add optimized ilogbf128 for POWER9
2020-12-22 15:30 [PATCH] powerpc: Add optimized ilogbf128 for POWER9 Raphael Moreira Zinsly
2020-12-22 15:30 ` [PATCH 2/2] benchtests: Add ilogbf128 test Raphael Moreira Zinsly
2020-12-22 15:36 ` [PATCH] powerpc: Add optimized ilogbf128 for POWER9 Raphael M Zinsly
@ 2021-01-04 23:20 ` Paul E Murphy
2021-01-05 18:19 ` Paul E Murphy
2 siblings, 1 reply; 5+ messages in thread
From: Paul E Murphy @ 2021-01-04 23:20 UTC (permalink / raw)
To: Raphael Moreira Zinsly, libc-alpha; +Cc: tuliom
On 12/22/20 9:30 AM, Raphael Moreira Zinsly via Libc-alpha wrote:
> The instruction xsxexpqp introduced on POWER9 extracts the exponent
> from a quad-precision floating-point, thus it can be used to improve
> ilogbf128 and llogbf128.
> ---
> .../powerpc/powerpc64/le/fpu/e_ilogbf128.c | 22 +++++++++++++++++++
> 1 file changed, 22 insertions(+)
> create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
>
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
> new file mode 100644
> index 0000000000..47558bbadc
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
> @@ -0,0 +1,22 @@
> +#ifdef _ARCH_PWR9
> +int _ilogbf128 (_Float128 __x);
This should be a locally (static) scoped function.
> +
> +int
> +#if defined(_F128_ENABLE_IFUNC)
> +__ieee754_ilogbf128_power9 (_Float128 __x)
> +#else
> +__ieee754_ilogbf128 (_Float128 __x)
> +#endif
> +{
> + /* Check for exceptional cases. */
> + if (!__builtin_vsx_scalar_test_data_class_qp (__x, 0x7f))
> + return __builtin_vsx_scalar_extract_expq (__x) - 0x3fff;
> + else
> + /* Fallback to the generic ilogb if __x is NaN, Inf or subnormal. */
> + return _ilogbf128(__x);
> +}
> +
> +#define __ieee754_ilogbf128 _ilogbf128
> +#endif
> +
> +#include<sysdeps/ieee754/float128/e_ilogbf128.c>
A space seems to be missing between include and <.
Otherwise, LGTM.
As a side note, I think the benchtests are not too impressive. I am
surprised normal values don't show better results.
* Re: [PATCH] powerpc: Add optimized ilogbf128 for POWER9
2021-01-04 23:20 ` Paul E Murphy
@ 2021-01-05 18:19 ` Paul E Murphy
0 siblings, 0 replies; 5+ messages in thread
From: Paul E Murphy @ 2021-01-05 18:19 UTC (permalink / raw)
To: Raphael Moreira Zinsly, libc-alpha; +Cc: tuliom
On 1/4/21 5:20 PM, Paul E Murphy via Libc-alpha wrote:
>
>
> On 12/22/20 9:30 AM, Raphael Moreira Zinsly via Libc-alpha wrote:
>> The instruction xsxexpqp introduced on POWER9 extracts the exponent
>> from a quad-precision floating-point, thus it can be used to improve
>> ilogbf128 and llogbf128.
>> ---
>> .../powerpc/powerpc64/le/fpu/e_ilogbf128.c | 22 +++++++++++++++++++
>> 1 file changed, 22 insertions(+)
>> create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
>>
>> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
>> b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
>> new file mode 100644
>> index 0000000000..47558bbadc
>> --- /dev/null
>> +++ b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
>> @@ -0,0 +1,22 @@
>> +#ifdef _ARCH_PWR9
>> +int _ilogbf128 (_Float128 __x);
>
> This should be a locally (static) scoped function.
>
>> +
>> +int
>> +#if defined(_F128_ENABLE_IFUNC)
>> +__ieee754_ilogbf128_power9 (_Float128 __x)
>> +#else
>> +__ieee754_ilogbf128 (_Float128 __x)
>> +#endif
>> +{
>> + /* Check for exceptional cases. */
>> + if (!__builtin_vsx_scalar_test_data_class_qp (__x, 0x7f))
>> + return __builtin_vsx_scalar_extract_expq (__x) - 0x3fff;
>> + else
>> + /* Fallback to the generic ilogb if __x is NaN, Inf or
>> subnormal. */
>> + return _ilogbf128(__x);
>> +}
>> +
>> +#define __ieee754_ilogbf128 _ilogbf128
>> +#endif
>> +
>> +#include<sysdeps/ieee754/float128/e_ilogbf128.c>
>
> A space seems to be missing between include and <.
>
> Otherwise, LGTM.
>
> As a side note, I think the benchtests are not too impressive. I am
> surprised normal values don't show better results.
After spending a little time looking at this, I found that the call
overhead of the wrapper is hiding most of the improvement. POWER9 also
adds equivalent instructions for float32/float64.
I would recommend refactoring this patch to provide an override to
w_ilogb_template.c so all three formats can use these new instructions
without the call overhead for normal numbers.
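
The shape of that suggestion can be sketched in portable C for binary64: an inline check handles normal numbers directly, and only zero, subnormal, Inf and NaN reach an out-of-line slow path. This is an illustration under assumptions, not the actual w_ilogb_template.c override. The names fast_ilogb and slow_ilogb are hypothetical, bit manipulation stands in for the POWER9 instructions, and the errno and exception handling that the real glibc wrapper performs is omitted.

```c
#include <limits.h>
#include <math.h>    /* Only for the FP_ILOGB0 and FP_ILOGBNAN macros.  */
#include <stdint.h>
#include <string.h>

/* Out-of-line slow path for zero, subnormal, Inf and NaN.  This is a
   stand-in for the generic implementation.  */
static int
slow_ilogb (uint64_t bits)
{
  uint64_t exponent = (bits >> 52) & 0x7ff;
  uint64_t fraction = bits & 0x000fffffffffffffULL;

  if (exponent == 0x7ff)
    return fraction != 0 ? FP_ILOGBNAN : INT_MAX;
  if (fraction == 0)
    return FP_ILOGB0;

  /* Subnormal: the value is fraction * 2^-1074, so the result is the
     position of the highest set fraction bit minus 1074.  */
  int top = 51;
  while (((fraction >> top) & 1) == 0)
    top--;
  return top - 1074;
}

/* Inline fast path: normal numbers never leave this function.  */
static inline int
fast_ilogb (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  uint64_t exponent = (bits >> 52) & 0x7ff;

  if (exponent != 0 && exponent != 0x7ff)
    return (int) exponent - 1023;   /* Remove the binary64 bias.  */
  return slow_ilogb (bits);
}
```

A float32 or float128 variant would differ only in the field widths and the bias (0x3fff for float128, as in the patch).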