* [PATCH], Define __FP_FAST_FMAF128 on PowerPC ISA 3.0
@ 2017-09-27 22:18 Michael Meissner
2017-09-28 0:40 ` Joseph Myers
0 siblings, 1 reply; 7+ messages in thread
From: Michael Meissner @ 2017-09-27 22:18 UTC
To: GCC Patches, Segher Boessenkool, David Edelsohn, Bill Schmidt
[-- Attachment #1: Type: text/plain, Size: 1226 bytes --]
The glibc team has requested we define the standard macro (__FP_FAST_FMAF128)
for PowerPC code when we have the IEEE 128-bit floating point hardware
instructions enabled.
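For context, here is a minimal sketch of how consumer code might test such a macro. __FP_FAST_FMA is the existing GCC macro for double; the helper names mul_add, have_fast_fmaf128 and mul_add_f128 are hypothetical and are not part of the patch or of glibc.

```c
/* Hypothetical helpers, for illustration only.  */

/* Compute a * b + c for double.  Use the fused built-in only when the
   compiler advertises a fast FMA for double via __FP_FAST_FMA.  */
double
mul_add (double a, double b, double c)
{
#ifdef __FP_FAST_FMA
  return __builtin_fma (a, b, c);
#else
  return a * b + c;
#endif
}

/* Report whether the compiler advertises a fast _Float128 FMA.  */
int
have_fast_fmaf128 (void)
{
#ifdef __FP_FAST_FMAF128
  return 1;
#else
  return 0;
#endif
}

#ifdef __FP_FAST_FMAF128
/* Only compiled when the macro discussed here is defined.  */
_Float128
mul_add_f128 (_Float128 a, _Float128 b, _Float128 c)
{
  return __builtin_fmaf128 (a, b, c);
}
#endif
```

The _Float128 variant is guarded so that the sketch still compiles on targets where neither the macro nor the built-in is available.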
This patch does this in the PowerPC backend. As I look at the whole issue, at
some point we should do this more in the machine independent portion of the
compiler. I have some initial patches to do this in the c-family files, but at
the present time, the patches are not complete, and I need to think about it
more.
So, I would like to check in this patch now, and if we come up with a machine
independent version, we can back out this particular patch. I have done a full
bootstrap and regression test; there were no regressions, and the new test case
runs correctly. Can I check this into the trunk?
[gcc]
2017-09-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Define
__FP_FAST_FMAF128 on ISA 3.0.
[gcc/testsuite]
2017-09-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/float128-fma3.c: New test.
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
[-- Attachment #2: gcc-power9.patch289b --]
[-- Type: text/plain, Size: 1859 bytes --]
Index: gcc/config/rs6000/rs6000-c.c
===================================================================
--- gcc/config/rs6000/rs6000-c.c (revision 253236)
+++ gcc/config/rs6000/rs6000-c.c (working copy)
@@ -585,7 +585,10 @@ rs6000_target_modify_macros (bool define
/* OPTION_MASK_FLOAT128_HARDWARE can be turned on if -mcpu=power9 is used or
via the target attribute/pragma. */
if ((flags & OPTION_MASK_FLOAT128_HW) != 0)
- rs6000_define_or_undefine_macro (define_p, "__FLOAT128_HARDWARE__");
+ {
+ rs6000_define_or_undefine_macro (define_p, "__FLOAT128_HARDWARE__");
+ rs6000_define_or_undefine_macro (define_p, "__FP_FAST_FMAF128");
+ }
/* options from the builtin masks. */
/* Note that RS6000_BTM_PAIRED is enabled only if
Index: gcc/testsuite/gcc.target/powerpc/float128-fma3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-fma3.c (nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/float128-fma3.c (working copy)
@@ -0,0 +1,33 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2" } */
+
+/* Make sure the appropriate FMA fast macros are defined. */
+
+#ifdef __FP_FAST_FMAF
+float
+do_fmaf (float a, float b, float c)
+{
+ return __builtin_fmaf (a, b, c);
+}
+#endif
+
+#ifdef __FP_FAST_FMA
+double
+do_fma (double a, double b, double c)
+{
+ return __builtin_fma (a, b, c);
+}
+#endif
+
+#ifdef __FP_FAST_FMAF128
+_Float128
+do_fmaf128 (_Float128 a, _Float128 b, _Float128 c)
+{
+ return __builtin_fmaf128 (a, b, c);
+}
+#endif
+
+/* { dg-final { scan-assembler {\mfmadds\M|\mxsmadd.sp\M} } } */
+/* { dg-final { scan-assembler {\mfmadd\M|\mxsmadd.dp\M} } } */
+/* { dg-final { scan-assembler {\mxsmaddqp\M} } } */
* Re: [PATCH], Define __FP_FAST_FMAF128 on PowerPC ISA 3.0
2017-09-27 22:18 [PATCH], Define __FP_FAST_FMAF128 on PowerPC ISA 3.0 Michael Meissner
@ 2017-09-28 0:40 ` Joseph Myers
2017-10-02 17:54 ` Michael Meissner
2017-10-02 23:51 ` [PATCH #2], " Michael Meissner
0 siblings, 2 replies; 7+ messages in thread
From: Joseph Myers @ 2017-09-28 0:40 UTC
To: Michael Meissner
Cc: GCC Patches, Segher Boessenkool, David Edelsohn, Bill Schmidt
On Wed, 27 Sep 2017, Michael Meissner wrote:
> The glibc team has requested we define the standard macro (__FP_FAST_FMAF128)
> for PowerPC code when we have the IEEE 128-bit floating point hardware
> instructions enabled.
It's not a standard macro. TS 18661-3 has FP_FAST_FMAF128 as an optional
math.h macro (but glibc doesn't define it anywhere at present).
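To illustrate the distinction: the public TS 18661-3 name would come from the C library's math.h, which could key off the compiler's predefined macro. The following is a minimal sketch, assuming a library chose to do so (as noted above, glibc does not at present); fast_fmaf128_advertised is a hypothetical helper used only for illustration.

```c
/* Sketch of what a <math.h> implementation might contain.  The
   compiler predefines the reserved-namespace macro; the library
   exposes the public TS 18661-3 name based on it.  */
#if defined __FP_FAST_FMAF128 && !defined FP_FAST_FMAF128
# define FP_FAST_FMAF128 1
#endif

/* Hypothetical helper: 1 if the public macro ended up defined.  */
int
fast_fmaf128_advertised (void)
{
#ifdef FP_FAST_FMAF128
  return 1;
#else
  return 0;
#endif
}
```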
> This patch does this in the PowerPC backend. As I look at the whole issue, at
> some point we should do this more in the machine independent portion of the
> compiler. I have some initial patches to do this in the c-family files, but at
> the present time, the patches are not complete, and I need to think about it
> more.
I think a machine-independent definition (for _FloatN / _FloatNx types in
general) should go along with machine-independent fmafN / fmafNx built-in
functions; when the built-in function is machine-specific, it's natural
for the macro to be as well.
But in any case, the new macro should be documented in cpp.texi alongside
the existing __FP_FAST_FMA* macros (probably in the generic
__FP_FAST_FMAF@var{n} and __FP_FAST_FMAF@var{n}X form).
--
Joseph S. Myers
joseph@codesourcery.com
* Re: [PATCH], Define __FP_FAST_FMAF128 on PowerPC ISA 3.0
2017-09-28 0:40 ` Joseph Myers
@ 2017-10-02 17:54 ` Michael Meissner
2017-10-02 23:51 ` [PATCH #2], " Michael Meissner
1 sibling, 0 replies; 7+ messages in thread
From: Michael Meissner @ 2017-10-02 17:54 UTC
To: Joseph Myers
Cc: Michael Meissner, GCC Patches, Segher Boessenkool,
David Edelsohn, Bill Schmidt
On Thu, Sep 28, 2017 at 12:40:24AM +0000, Joseph Myers wrote:
> On Wed, 27 Sep 2017, Michael Meissner wrote:
>
> > The glibc team has requested we define the standard macro (__FP_FAST_FMAF128)
> > for PowerPC code when we have the IEEE 128-bit floating point hardware
> > instructions enabled.
>
> It's not a standard macro. TS 18661-3 has FP_FAST_FMAF128 as an optional
> math.h macro (but glibc doesn't define it anywhere at present).
>
> > This patch does this in the PowerPC backend. As I look at the whole issue, at
> > some point we should do this more in the machine independent portion of the
> > compiler. I have some initial patches to do this in the c-family files, but at
> > the present time, the patches are not complete, and I need to think about it
> > more.
>
> I think a machine-independent definition (for _FloatN / _FloatNx types in
> general) should go along with machine-independent fmafN / fmafNx built-in
> functions; when the built-in function is machine-specific, it's natural
> for the macro to be as well.
I have patches for this that I will submit shortly to replace the rs6000
specific patch.
I haven't yet found all of the places that need to be changed for more
traditional math functions like sqrtf128.
> But in any case, the new macro should be documented in cpp.texi alongside
> the existing __FP_FAST_FMA* macros (probably in the generic
> __FP_FAST_FMAF@var{n} and __FP_FAST_FMAF@var{n}X form).
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
* Re: [PATCH #2], Define __FP_FAST_FMAF128 on PowerPC ISA 3.0
2017-09-28 0:40 ` Joseph Myers
2017-10-02 17:54 ` Michael Meissner
@ 2017-10-02 23:51 ` Michael Meissner
2017-10-02 23:53 ` Michael Meissner
1 sibling, 1 reply; 7+ messages in thread
From: Michael Meissner @ 2017-10-02 23:51 UTC
To: Joseph Myers, Segher Boessenkool, Richard Biener, Jakub Jelinek,
Jason Merrill, Richard Earnshaw, David S. Miller, Bernd Schmidt,
Ian Lance Taylor, Jim Wilson
Cc: Michael Meissner, GCC Patches, David Edelsohn, Bill Schmidt
On Thu, Sep 28, 2017 at 12:40:24AM +0000, Joseph Myers wrote:
> On Wed, 27 Sep 2017, Michael Meissner wrote:
>
> > The glibc team has requested we define the standard macro (__FP_FAST_FMAF128)
> > for PowerPC code when we have the IEEE 128-bit floating point hardware
> > instructions enabled.
>
> It's not a standard macro. TS 18661-3 has FP_FAST_FMAF128 as an optional
> math.h macro (but glibc doesn't define it anywhere at present).
>
> > This patch does this in the PowerPC backend. As I look at the whole issue, at
> > some point we should do this more in the machine independent portion of the
> > compiler. I have some initial patches to do this in the c-family files, but at
> > the present time, the patches are not complete, and I need to think about it
> > more.
>
> I think a machine-independent definition (for _FloatN / _FloatNx types in
> general) should go along with machine-independent fmafN / fmafNx built-in
> functions; when the built-in function is machine-specific, it's natural
> for the macro to be as well.
>
> But in any case, the new macro should be documented in cpp.texi alongside
> the existing __FP_FAST_FMA* macros (probably in the generic
> __FP_FAST_FMAF@var{n} and __FP_FAST_FMAF@var{n}X form).
This patch adds the built-in functions __builtin_fmaf<N> and
__builtin_fmaf<N>x if the target machine supports an appropriate
fused multiply-add (FMA) instruction. This patch replaces the original PowerPC
specific patch.
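A minimal sketch of how code might use the generic _Float<N> built-ins, taking _Float64 as the example. The names real64, MUL_ADD64 and mul_add64 are hypothetical; the fallback to double is an assumption made here so that the sketch also compiles where the macro is not defined.

```c
#ifdef __FP_FAST_FMAF64
/* The compiler advertises a fast fused multiply-add for _Float64.  */
typedef _Float64 real64;
# define MUL_ADD64(a, b, c) __builtin_fmaf64 ((a), (b), (c))
#else
/* Fallback (an assumption of this sketch): plain double arithmetic,
   which may round twice instead of once.  */
typedef double real64;
# define MUL_ADD64(a, b, c) ((a) * (b) + (c))
#endif

/* Hypothetical helper: a * b + c in the selected 64-bit type.  */
real64
mul_add64 (real64 a, real64 b, real64 c)
{
  return MUL_ADD64 (a, b, c);
}
```

Selecting the type and the operation together keeps the _Float64 keyword out of builds where the compiler does not define the macro.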
Because it involves changes in the built-in support, both the c and c-family
subdirectories, as well as PowerPC changes, I added the global/release
maintainers to the To: list.
I have done a bootstrap and make check on a little endian Power8 with no
regressions in the tests. I have verified that the changed and new tests both
ran fine.
I have also bootstrapped the changes on an x86-64 compiler, and it bootstrapped
fine. I am currently running the unmodified build, but I'm not expecting any
changes in the test suite.
Assuming the x86-64 tests also have no regressions, can I check these changes
into the trunk?
[gcc]
2017-10-02 Michael Meissner <meissner@linux.vnet.ibm.com>
* builtins.def (BUILT_IN_FMAF16): Add support for fused
multiply-add built-in functions for _Float<N> and _Float<N>x
types.
(BUILT_IN_FMAF32): Likewise.
(BUILT_IN_FMAF64): Likewise.
(BUILT_IN_FMAF128): Likewise.
(BUILT_IN_FMAF32X): Likewise.
(BUILT_IN_FMAF64X): Likewise.
(BUILT_IN_FMAF128X): Likewise.
* builtin-types.def (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16):
Likewise.
(BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32): Likewise.
(BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64): Likewise.
(BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128): Likewise.
(BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X): Likewise.
(BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X): Likewise.
(BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X): Likewise.
* builtins.c (expand_builtin_mathfn_ternary): Likewise.
(expand_builtin): Add fused multiply-add builtin support for
_Float<N> and _Float<N>x types. Issue a warning if the machine
does not provide an appropriate FMA insn.
(fold_builtin_3): Add support for fused multiply-add built-in
functions for _Float<N> and _Float<N>x types.
* config/rs6000/rs6000-builtin.def (FMAF128): Delete creating
__builtin_fmaf128, since this is now done in machine independent
code.
* doc/cpp.texi (__FP_FAST_FMAF16): Document macros set to declare
that the appropriate fused multiply-add on _Float<N> and
_Float<N>x types is implemented.
(__FP_FAST_FMAF32): Likewise.
(__FP_FAST_FMAF64): Likewise.
(__FP_FAST_FMAF128): Likewise.
(__FP_FAST_FMAF32X): Likewise.
(__FP_FAST_FMAF64X): Likewise.
(__FP_FAST_FMAF128X): Likewise.
[gcc/c]
2017-10-02 Michael Meissner <meissner@linux.vnet.ibm.com>
* c-decl.c (header_for_builtin_fn): Add support for fused
multiply-add built-in functions for _Float<N> and _Float<N>x
types.
[gcc/c-family]
2017-10-02 Michael Meissner <meissner@linux.vnet.ibm.com>
* c-cppbuiltin.c (mode_has_fma): Add support for PowerPC __float128
FMA (KFmode) if long double != __float128.
(c_cpp_builtins): Define __FP_FAST_FMAF<N> if _Float<N> fused
multiply-add is supported. Define __FP_FAST_FMAF<N>X if
_Float<N>x fused multiply-add is supported.
[gcc/testsuite]
2017-10-02 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/float128-fma2.c: Change error to new
warning.
* gcc.target/powerpc/float128-fma3.c: New test.
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
* Re: [PATCH #2], Define __FP_FAST_FMAF128 on PowerPC ISA 3.0
2017-10-02 23:51 ` [PATCH #2], " Michael Meissner
@ 2017-10-02 23:53 ` Michael Meissner
2017-10-03 1:19 ` Joseph Myers
2017-10-04 13:00 ` Segher Boessenkool
0 siblings, 2 replies; 7+ messages in thread
From: Michael Meissner @ 2017-10-02 23:53 UTC
To: Michael Meissner, Joseph Myers, Segher Boessenkool,
Richard Biener, Jakub Jelinek, Jason Merrill, Richard Earnshaw,
David S. Miller, Bernd Schmidt, Ian Lance Taylor, Jim Wilson,
GCC Patches, David Edelsohn, Bill Schmidt
[-- Attachment #1: Type: text/plain, Size: 5100 bytes --]
Whoops, I forgot to attach the patch.
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
[-- Attachment #2: gcc-power9.patch293b --]
[-- Type: text/plain, Size: 9126 bytes --]
Index: gcc/builtins.def
===================================================================
--- gcc/builtins.def (revision 253358)
+++ gcc/builtins.def (working copy)
@@ -382,6 +382,9 @@ DEF_C99_C90RES_BUILTIN (BUILT_IN_FLOORL,
DEF_C99_BUILTIN (BUILT_IN_FMA, "fma", BT_FN_DOUBLE_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING)
DEF_C99_BUILTIN (BUILT_IN_FMAF, "fmaf", BT_FN_FLOAT_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING)
DEF_C99_BUILTIN (BUILT_IN_FMAL, "fmal", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING)
+#define FMA_TYPE(F) BT_FN_##F##_##F##_##F##_##F
+DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_FMA, "fma", FMA_TYPE, ATTR_MATHFN_FPROUNDING)
+#undef FMA_TYPE
DEF_C99_BUILTIN (BUILT_IN_FMAX, "fmax", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
DEF_C99_BUILTIN (BUILT_IN_FMAXF, "fmaxf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
DEF_C99_BUILTIN (BUILT_IN_FMAXL, "fmaxl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
Index: gcc/builtin-types.def
===================================================================
--- gcc/builtin-types.def (revision 253358)
+++ gcc/builtin-types.def (working copy)
@@ -544,6 +544,20 @@ DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_DOUBLE
BT_DOUBLE, BT_DOUBLE, BT_DOUBLE, BT_DOUBLE)
DEF_FUNCTION_TYPE_3 (BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE,
BT_LONGDOUBLE, BT_LONGDOUBLE, BT_LONGDOUBLE, BT_LONGDOUBLE)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16,
+ BT_FLOAT16, BT_FLOAT16, BT_FLOAT16, BT_FLOAT16)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32,
+ BT_FLOAT32, BT_FLOAT32, BT_FLOAT32, BT_FLOAT32)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64,
+ BT_FLOAT64, BT_FLOAT64, BT_FLOAT64, BT_FLOAT64)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128,
+ BT_FLOAT128, BT_FLOAT128, BT_FLOAT128, BT_FLOAT128)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X,
+ BT_FLOAT32X, BT_FLOAT32X, BT_FLOAT32X, BT_FLOAT32X)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X,
+ BT_FLOAT64X, BT_FLOAT64X, BT_FLOAT64X, BT_FLOAT64X)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X,
+ BT_FLOAT128X, BT_FLOAT128X, BT_FLOAT128X, BT_FLOAT128X)
DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT_FLOAT_FLOAT_INTPTR,
BT_FLOAT, BT_FLOAT, BT_FLOAT, BT_INT_PTR)
DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_DOUBLE_DOUBLE_INTPTR,
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c (revision 253358)
+++ gcc/builtins.c (working copy)
@@ -2067,6 +2067,7 @@ expand_builtin_mathfn_ternary (tree exp,
switch (DECL_FUNCTION_CODE (fndecl))
{
CASE_FLT_FN (BUILT_IN_FMA):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
builtin_optab = fma_optab; break;
default:
gcc_unreachable ();
@@ -6563,6 +6564,18 @@ expand_builtin (tree exp, rtx target, rt
return target;
break;
+ /* Warn if the user called __builtin_fmaf{32,64,128} and there is no fast
+ insn to support it. */
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
+ target = expand_builtin_mathfn_ternary (exp, target, subtarget);
+ if (target)
+ return target;
+
+ warning_at (tree_nonartificial_location (exp), 0,
+ "%KThe built-in function %<__builtin_fmafN ()%> may not be "
+ "supported", exp);
+ break;
+
CASE_FLT_FN (BUILT_IN_ILOGB):
if (! flag_unsafe_math_optimizations)
break;
@@ -8988,6 +9001,7 @@ fold_builtin_3 (location_t loc, tree fnd
return fold_builtin_sincos (loc, arg0, arg1, arg2);
CASE_FLT_FN (BUILT_IN_FMA):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
return fold_builtin_fma (loc, arg0, arg1, arg2, type);
CASE_FLT_FN (BUILT_IN_REMQUO):
Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def (revision 253358)
+++ gcc/config/rs6000/rs6000-builtin.def (working copy)
@@ -2369,7 +2369,6 @@ BU_FLOAT128_2 (COPYSIGNQ, "copysignq",
hardware. These functions use the new 'f128' suffix. Eventually these
should be folded into the common built-in function handling. */
BU_FLOAT128_1_HW (SQRTF128, "sqrtf128", CONST, sqrtkf2)
-BU_FLOAT128_3_HW (FMAF128, "fmaf128", CONST, fmakf4_hw)
\f
/* 1 argument crypto functions. */
BU_CRYPTO_1 (VSBOX, "vsbox", CONST, crypto_vsbox)
Index: gcc/doc/cpp.texi
===================================================================
--- gcc/doc/cpp.texi (revision 253358)
+++ gcc/doc/cpp.texi (working copy)
@@ -2400,6 +2400,20 @@ was used). If 1 or more, it indicates t
those requirements; this does not mean that all relevant language
features are supported by GCC.
+@item __FP_FAST_FMAF16
+@itemx __FP_FAST_FMAF32
+@itemx __FP_FAST_FMAF64
+@itemx __FP_FAST_FMAF128
+@itemx __FP_FAST_FMAF32X
+@itemx __FP_FAST_FMAF64X
+@itemx __FP_FAST_FMAF128X
+This macro is defined with value 1 if the backend supports the
+@code{__builtin_fmaf16}, @code{__builtin_fmaf32},
+@code{__builtin_fmaf64}, @code{__builtin_fmaf128},
+@code{__builtin_fmaf32x}, @code{__builtin_fmaf64x}, or
+@code{__builtin_fmaf128x} builtin functions that do fused multiply-add
+on the types defined in IEEE 754 (IEC 60559).
+
@item __NO_MATH_ERRNO__
This macro is defined if @option{-fno-math-errno} is used, or enabled
by another option such as @option{-ffast-math} or by default.
Index: gcc/c/c-decl.c
===================================================================
--- gcc/c/c-decl.c (revision 253358)
+++ gcc/c/c-decl.c (working copy)
@@ -3171,6 +3171,7 @@ header_for_builtin_fn (enum built_in_fun
CASE_FLT_FN (BUILT_IN_FDIM):
CASE_FLT_FN (BUILT_IN_FLOOR):
CASE_FLT_FN (BUILT_IN_FMA):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
CASE_FLT_FN (BUILT_IN_FMAX):
CASE_FLT_FN (BUILT_IN_FMIN):
CASE_FLT_FN (BUILT_IN_FMOD):
Index: gcc/c-family/c-cppbuiltin.c
===================================================================
--- gcc/c-family/c-cppbuiltin.c (revision 253358)
+++ gcc/c-family/c-cppbuiltin.c (working copy)
@@ -82,6 +82,11 @@ mode_has_fma (machine_mode mode)
return !!HAVE_fmadf4;
#endif
+#ifdef HAVE_fmakf4 /* PowerPC if long double != __float128. */
+ case E_KFmode:
+ return !!HAVE_fmakf4;
+#endif
+
#ifdef HAVE_fmaxf4
case E_XFmode:
return !!HAVE_fmaxf4;
@@ -1119,7 +1124,7 @@ c_cpp_builtins (cpp_reader *pfile)
floatn_nx_types[i].extended ? "X" : "");
sprintf (csuffix, "F%d%s", floatn_nx_types[i].n,
floatn_nx_types[i].extended ? "x" : "");
- builtin_define_float_constants (prefix, csuffix, "%s", NULL,
+ builtin_define_float_constants (prefix, csuffix, "%s", csuffix,
FLOATN_NX_TYPE_NODE (i));
}
Index: gcc/testsuite/gcc.target/powerpc/float128-fma2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-fma2.c (revision 253358)
+++ gcc/testsuite/gcc.target/powerpc/float128-fma2.c (working copy)
@@ -5,5 +5,5 @@
__float128
xfma (__float128 a, __float128 b, __float128 c)
{
- return __builtin_fmaf128 (a, b, c); /* { dg-error "ISA 3.0 IEEE 128-bit" } */
+ return __builtin_fmaf128 (a, b, c); /* { dg-warning "__builtin_fmafN" } */
}
Index: gcc/testsuite/gcc.target/powerpc/float128-fma3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-fma3.c (nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/float128-fma3.c (working copy)
@@ -0,0 +1,63 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2" } */
+
+/* Make sure the appropriate FMA fast macros are defined. */
+
+#include <math.h>
+
+#ifdef __FP_FAST_FMAF
+float
+do_fmaf (float a, float b, float c)
+{
+ return __builtin_fmaf (a, b, c);
+}
+#else
+#error "__FP_FAST_FMAF should be defined"
+#endif
+
+#ifdef __FP_FAST_FMAF32
+_Float32
+do_fmaf32 (_Float32 a, _Float32 b, _Float32 c)
+{
+ return __builtin_fmaf32 (a, b, -c);
+}
+#else
+#error "__FP_FAST_FMAF32 should be defined"
+#endif
+
+#ifdef __FP_FAST_FMA
+double
+do_fma (double a, double b, double c)
+{
+ return __builtin_fma (a, b, c);
+}
+#else
+#error "__FP_FAST_FMA should be defined"
+#endif
+
+#ifdef __FP_FAST_FMAF64
+_Float64
+do_fmaf64 (_Float64 a, _Float64 b, _Float64 c)
+{
+ return __builtin_fmaf64 (a, b, -c);
+}
+#else
+#error "__FP_FAST_FMAF64 should be defined"
+#endif
+
+#ifdef __FP_FAST_FMAF128
+_Float128
+do_fmaf128 (_Float128 a, _Float128 b, _Float128 c)
+{
+ return __builtin_fmaf128 (a, b, c);
+}
+#else
+#error "__FP_FAST_FMAF128 should be defined"
+#endif
+
+/* { dg-final { scan-assembler {\mfmadds\M|\mxsmadd.sp\M} } } */
+/* { dg-final { scan-assembler {\mfmsubs\M|\mxsmsub.sp\M} } } */
+/* { dg-final { scan-assembler {\mfmadd\M|\mxsmadd.dp\M} } } */
+/* { dg-final { scan-assembler {\mfmsub\M|\mxsmsub.dp\M} } } */
+/* { dg-final { scan-assembler {\mxsmaddqp\M} } } */
* Re: [PATCH #2], Define __FP_FAST_FMAF128 on PowerPC ISA 3.0
2017-10-02 23:53 ` Michael Meissner
@ 2017-10-03 1:19 ` Joseph Myers
2017-10-04 13:00 ` Segher Boessenkool
1 sibling, 0 replies; 7+ messages in thread
From: Joseph Myers @ 2017-10-03 1:19 UTC
To: Michael Meissner
Cc: Segher Boessenkool, Richard Biener, Jakub Jelinek, Jason Merrill,
Richard Earnshaw, David S. Miller, Bernd Schmidt,
Ian Lance Taylor, Jim Wilson, GCC Patches, David Edelsohn,
Bill Schmidt
On Mon, 2 Oct 2017, Michael Meissner wrote:
> > > But in any case, the new macro should be documented in cpp.texi alongside
> > > the existing __FP_FAST_FMA* macros (probably in the generic
> > > __FP_FAST_FMAF@var{n} and __FP_FAST_FMAF@var{n}X form).
> >
> > This patch adds support for adding the built-in __builtin_fmaf<N> and
> > __builtin_fmaf<N>x functions if the target machine supports an appropriate
> > fused multiply-add (FMA) instruction. This patch replaces the original PowerPC
> > specific patch.
Certainly the <math.h> FP_FAST_FMA* macros are supposed to relate to
whether the public functions such as fmaf128 are fast rather than to
__builtin_* names.
I think there's a strong case that you should provide built-in functions
under the public names when defining __FP_FAST_FMA*. I.e., add a variant
of DEF_GCC_FLOATN_NX_BUILTINS that uses DEF_EXT_LIB_BUILTIN instead of
DEF_GCC_BUILTIN, and use that for the new built-in functions.
Then, the built-in functions, in whatever form they are provided, should
be documented in extend.texi, alongside the documentation of
__builtin_fabsf@var{n} etc. (with, of course, the caveats about
availability when appropriate instruction support isn't available - the
__builtin_fabsfN, __builtin_copysignfN functions are always inlined, the
fma ones aren't, and people may well lack C library support for the
underlying functions).
Given that, I don't think the warning about lack of instruction support is
appropriate; a call to __builtin_fmaf128, if the type is supported but
there is no corresponding instruction (on x86_64, say), would just fall
back to calling an external fmaf128 function (which in that case would
work with glibc 2.26 or later, though calls to fmaf64 etc. wouldn't), much
like any other such built-in function (we don't warn about e.g. calling
__builtin_clog10 on systems whose C library may not have clog10).
--
Joseph S. Myers
joseph@codesourcery.com
* Re: [PATCH #2], Define __FP_FAST_FMAF128 on PowerPC ISA 3.0
2017-10-02 23:53 ` Michael Meissner
2017-10-03 1:19 ` Joseph Myers
@ 2017-10-04 13:00 ` Segher Boessenkool
1 sibling, 0 replies; 7+ messages in thread
From: Segher Boessenkool @ 2017-10-04 13:00 UTC
To: Michael Meissner, Joseph Myers, Richard Biener, Jakub Jelinek,
Jason Merrill, Richard Earnshaw, David S. Miller, Bernd Schmidt,
Ian Lance Taylor, Jim Wilson, GCC Patches, David Edelsohn,
Bill Schmidt
Hi!
On Mon, Oct 02, 2017 at 07:52:50PM -0400, Michael Meissner wrote:
> Whoops, I forgot to attach the patch.
Heh. The rs6000 parts are of course okay (trivial / obvious, but maybe
you are waiting for an ack).
Thanks,
Segher