[PATCH] powerpc: Add optimized ilogbf128 for POWER9

* [PATCH] powerpc: Add optimized ilogbf128 for POWER9
@ 2020-12-22 15:30 Raphael Moreira Zinsly
  2020-12-22 15:30 ` [PATCH 2/2] benchtests: Add ilogbf128 test Raphael Moreira Zinsly
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Raphael Moreira Zinsly @ 2020-12-22 15:30 UTC (permalink / raw)
  To: libc-alpha; +Cc: tuliom, Raphael Moreira Zinsly

The xsxexpqp instruction introduced on POWER9 extracts the exponent
from a quad-precision floating-point value, so it can be used to
improve ilogbf128 and llogbf128.
---
 .../powerpc/powerpc64/le/fpu/e_ilogbf128.c    | 22 +++++++++++++++++++
 1 file changed, 22 insertions(+)
 create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c

diff --git a/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
new file mode 100644
index 0000000000..47558bbadc
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
@@ -0,0 +1,22 @@
+#ifdef _ARCH_PWR9
+int _ilogbf128 (_Float128 __x);
+
+int
+#if defined(_F128_ENABLE_IFUNC)
+__ieee754_ilogbf128_power9 (_Float128 __x)
+#else
+__ieee754_ilogbf128 (_Float128 __x)
+#endif
+{
+  /* Check for exceptional cases.  */
+  if (!__builtin_vsx_scalar_test_data_class_qp (__x, 0x7f))
+    return __builtin_vsx_scalar_extract_expq (__x) - 0x3fff;
+  else
+    /* Fall back to the generic ilogb if __x is NaN, Inf, zero or subnormal.  */
+    return _ilogbf128(__x);
+}
+
+#define __ieee754_ilogbf128 _ilogbf128
+#endif
+
+#include<sysdeps/ieee754/float128/e_ilogbf128.c>
-- 
2.29.2


* [PATCH 2/2] benchtests: Add ilogbf128 test
  2020-12-22 15:30 [PATCH] powerpc: Add optimized ilogbf128 for POWER9 Raphael Moreira Zinsly
@ 2020-12-22 15:30 ` Raphael Moreira Zinsly
  2020-12-22 15:36 ` [PATCH] powerpc: Add optimized ilogbf128 for POWER9 Raphael M Zinsly
  2021-01-04 23:20 ` Paul E Murphy
  2 siblings, 0 replies; 5+ messages in thread
From: Raphael Moreira Zinsly @ 2020-12-22 15:30 UTC (permalink / raw)
  To: libc-alpha; +Cc: tuliom, Raphael Moreira Zinsly

Add a benchtest for ilogbf128 based on the logb benchtests.
---
 benchtests/Makefile         |  2 +-
 benchtests/ilogbf128-inputs | 11 +++++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)
 create mode 100644 benchtests/ilogbf128-inputs

diff --git a/benchtests/Makefile b/benchtests/Makefile
index 5cd211ee9a..9c7d99ae66 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -28,7 +28,7 @@ bench-math := acos acosh asin asinh atan atanh cos cosh exp exp2 log log2 \
 	      exp10f
 
 ifneq (,$(filter yes,$(float128-fcts) $(float128-alias-fcts)))
-bench-math += expf128 powf128 sinf128
+bench-math += expf128 powf128 sinf128 ilogbf128
 endif
 
 bench-pthread := pthread_once thread_create pthread-locks
diff --git a/benchtests/ilogbf128-inputs b/benchtests/ilogbf128-inputs
new file mode 100644
index 0000000000..bfbfc93714
--- /dev/null
+++ b/benchtests/ilogbf128-inputs
@@ -0,0 +1,11 @@
+## args: _Float128
+## ret: int
+## includes: math.h
+
+## name: subnormal
+6.47517511943802511092443895822764655e-4966f128
+0x1.fffffffffffffff8p-16383f128
+
+## name: normal
+1.0
+-0x8.2faf442f390a9211f5af128673fp+0L
-- 
2.29.2


* Re: [PATCH] powerpc: Add optimized ilogbf128 for POWER9
  2020-12-22 15:30 [PATCH] powerpc: Add optimized ilogbf128 for POWER9 Raphael Moreira Zinsly
  2020-12-22 15:30 ` [PATCH 2/2] benchtests: Add ilogbf128 test Raphael Moreira Zinsly
@ 2020-12-22 15:36 ` Raphael M Zinsly
  2021-01-04 23:20 ` Paul E Murphy
  2 siblings, 0 replies; 5+ messages in thread
From: Raphael M Zinsly @ 2020-12-22 15:36 UTC (permalink / raw)
  To: libc-alpha

Benchtest results with and without this patch on a POWER9:

without:
   "ilogbf128": {
    "subnormal": {
     "duration": 5.09834e+08,
     "iterations": 2.8146e+07,
     "max": 38.979,
     "min": 2.939,
     "mean": 18.1139
    },
    "normal": {
     "duration": 4.99378e+08,
     "iterations": 1.6151e+08,
     "max": 16.698,
     "min": 2.942,
     "mean": 3.09193
    }
   }

with:
   "ilogbf128": {
    "subnormal": {
     "duration": 5.09989e+08,
     "iterations": 2.5978e+07,
     "max": 41.027,
     "min": 4.674,
     "mean": 19.6316
    },
    "normal": {
     "duration": 4.98105e+08,
     "iterations": 1.77912e+08,
     "max": 12.663,
     "min": 2.792,
     "mean": 2.79972
    }
   }

Best Regards,
-- 
Raphael Moreira Zinsly
IBM
Linux on Power Toolchain
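
To put the two result sets side by side, the relative change in the
"mean" fields can be computed with a few lines of C.  This is a small
sketch, not part of the benchtests: struct mean_pair and percent_change
are hypothetical names, and the numbers are copied from the output above.

```c
#include <assert.h>

/* Mean values copied from the benchtest output in this thread.  */
struct mean_pair
{
  double without_patch;
  double with_patch;
};

static const struct mean_pair subnormal_means = { 18.1139, 19.6316 };
static const struct mean_pair normal_means = { 3.09193, 2.79972 };

/* Relative change of the mean in percent; a negative value means the
   patched build has the lower (better) mean.  */
static double
percent_change (struct mean_pair m)
{
  assert (m.without_patch > 0.0);
  return (m.with_patch - m.without_patch) / m.without_patch * 100.0;
}
```

Applied to the figures above, the mean for normal inputs drops by
roughly 9.5%, while the mean for subnormal inputs rises by roughly 8.4%.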

* Re: [PATCH] powerpc: Add optimized ilogbf128 for POWER9
  2020-12-22 15:30 [PATCH] powerpc: Add optimized ilogbf128 for POWER9 Raphael Moreira Zinsly
  2020-12-22 15:30 ` [PATCH 2/2] benchtests: Add ilogbf128 test Raphael Moreira Zinsly
  2020-12-22 15:36 ` [PATCH] powerpc: Add optimized ilogbf128 for POWER9 Raphael M Zinsly
@ 2021-01-04 23:20 ` Paul E Murphy
  2021-01-05 18:19   ` Paul E Murphy
  2 siblings, 1 reply; 5+ messages in thread
From: Paul E Murphy @ 2021-01-04 23:20 UTC (permalink / raw)
  To: Raphael Moreira Zinsly, libc-alpha; +Cc: tuliom



On 12/22/20 9:30 AM, Raphael Moreira Zinsly via Libc-alpha wrote:
> The instruction xsxexpqp introduced on POWER9 extracts the exponent
> from a quad-precision floating-point, thus it can be used to improve
> ilogbf128 and llogbf128.
> ---
>   .../powerpc/powerpc64/le/fpu/e_ilogbf128.c    | 22 +++++++++++++++++++
>   1 file changed, 22 insertions(+)
>   create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
> 
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
> new file mode 100644
> index 0000000000..47558bbadc
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
> @@ -0,0 +1,22 @@
> +#ifdef _ARCH_PWR9
> +int _ilogbf128 (_Float128 __x);

This should be a locally (static) scoped function.

> +
> +int
> +#if defined(_F128_ENABLE_IFUNC)
> +__ieee754_ilogbf128_power9 (_Float128 __x)
> +#else
> +__ieee754_ilogbf128 (_Float128 __x)
> +#endif
> +{
> +  /* Check for exceptional cases.  */
> +  if (!__builtin_vsx_scalar_test_data_class_qp (__x, 0x7f))
> +    return __builtin_vsx_scalar_extract_expq (__x) - 0x3fff;
> +  else
> +    /* Fallback to the generic ilogb if __x is NaN, Inf or subnormal.  */
> +    return _ilogbf128(__x);
> +}
> +
> +#define __ieee754_ilogbf128 _ilogbf128
> +#endif
> +
> +#include<sysdeps/ieee754/float128/e_ilogbf128.c>

A space seems to be missing between include and <.

Otherwise, LGTM.

As a side note, I think the benchtests are not too impressive. I am 
surprised normal values don't show better results.

* Re: [PATCH] powerpc: Add optimized ilogbf128 for POWER9
  2021-01-04 23:20 ` Paul E Murphy
@ 2021-01-05 18:19   ` Paul E Murphy
  0 siblings, 0 replies; 5+ messages in thread
From: Paul E Murphy @ 2021-01-05 18:19 UTC (permalink / raw)
  To: Raphael Moreira Zinsly, libc-alpha; +Cc: tuliom



On 1/4/21 5:20 PM, Paul E Murphy via Libc-alpha wrote:
> 
> 
> On 12/22/20 9:30 AM, Raphael Moreira Zinsly via Libc-alpha wrote:
>> The instruction xsxexpqp introduced on POWER9 extracts the exponent
>> from a quad-precision floating-point, thus it can be used to improve
>> ilogbf128 and llogbf128.
>> ---
>>   .../powerpc/powerpc64/le/fpu/e_ilogbf128.c    | 22 +++++++++++++++++++
>>   1 file changed, 22 insertions(+)
>>   create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
>>
>> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c 
>> b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
>> new file mode 100644
>> index 0000000000..47558bbadc
>> --- /dev/null
>> +++ b/sysdeps/powerpc/powerpc64/le/fpu/e_ilogbf128.c
>> @@ -0,0 +1,22 @@
>> +#ifdef _ARCH_PWR9
>> +int _ilogbf128 (_Float128 __x);
> 
> This should be a locally (static) scoped function.
> 
>> +
>> +int
>> +#if defined(_F128_ENABLE_IFUNC)
>> +__ieee754_ilogbf128_power9 (_Float128 __x)
>> +#else
>> +__ieee754_ilogbf128 (_Float128 __x)
>> +#endif
>> +{
>> +  /* Check for exceptional cases.  */
>> +  if (!__builtin_vsx_scalar_test_data_class_qp (__x, 0x7f))
>> +    return __builtin_vsx_scalar_extract_expq (__x) - 0x3fff;
>> +  else
>> +    /* Fallback to the generic ilogb if __x is NaN, Inf or 
>> subnormal.  */
>> +    return _ilogbf128(__x);
>> +}
>> +
>> +#define __ieee754_ilogbf128 _ilogbf128
>> +#endif
>> +
>> +#include<sysdeps/ieee754/float128/e_ilogbf128.c>
> 
> A space seems to be missing between include and <.
> 
> Otherwise, LGTM.
> 
> As a side note, I think the benchtests are not too impressive. I am 
> surprised normal values don't show better results.

After spending a little time looking at this, I found that the call
overhead of the wrapper hides most of the improvement.  POWER9 also adds
similar instructions for float32/float64.

I would recommend refactoring this patch to provide an override to 
w_ilogb_template.c so all three formats can use these new instructions 
without the call overhead for normal numbers.
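
The idea of keeping the normal-number case inline in the wrapper, and
only calling out for the exceptional classes, can be sketched in portable
C.  This is an illustration under stated assumptions, not the proposed
w_ilogb_template.c override: ilogb_f64, ilogb_f32 and ilogb_slow_f64 are
hypothetical names, plain bit manipulation stands in for the POWER9
built-ins, and the errno/exception handling that a real wrapper needs is
left out.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>   /* Only for the FP_ILOGB0 and FP_ILOGBNAN macros.  */
#include <stdint.h>
#include <string.h>

/* Slow path, meant to stay out of line: zero, subnormal, Inf and NaN.
   Only called when the biased exponent is 0 or 0x7ff.  */
static int
ilogb_slow_f64 (uint64_t bits)
{
  uint64_t fraction = bits & 0x000fffffffffffffULL;
  unsigned int e = (unsigned int) ((bits >> 52) & 0x7ff);
  int position = 0;

  if (e == 0x7ff)
    return fraction != 0 ? FP_ILOGBNAN : INT_MAX;
  if (fraction == 0)
    return FP_ILOGB0;
  /* Subnormal: the value is fraction * 2^-1074, so the result is the
     index of the highest set fraction bit minus 1074.  */
  while (fraction >>= 1)
    position++;
  return position - 1074;
}

/* Fast path kept inline: normal numbers need no call at all.  */
static inline int
ilogb_f64 (double x)
{
  uint64_t bits;
  unsigned int e;

  assert (sizeof bits == sizeof x);
  memcpy (&bits, &x, sizeof bits);
  e = (unsigned int) ((bits >> 52) & 0x7ff);
  if (e != 0 && e != 0x7ff)
    return (int) e - 1023;
  return ilogb_slow_f64 (bits);
}

/* The same pattern for float: 8 exponent bits, bias 127.  Every float is
   exactly representable as a double, so the exceptional cases can reuse
   the double code.  */
static inline int
ilogb_f32 (float x)
{
  uint32_t bits;
  unsigned int e;

  assert (sizeof bits == sizeof x);
  memcpy (&bits, &x, sizeof bits);
  e = (bits >> 23) & 0xff;
  if (e != 0 && e != 0xff)
    return (int) e - 127;
  return ilogb_f64 ((double) x);
}
```

With this shape a caller passing a normal number pays only for a load, a
shift, a compare and a subtraction; the function call is reserved for
the inputs that are expected to be rare.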
