public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed
* [gcc(refs/users/meissner/heads/dmf004)] Rework 128-bit complex multiply and divide.
@ 2022-11-09 5:30 Michael Meissner
0 siblings, 0 replies; 2+ messages in thread
From: Michael Meissner @ 2022-11-09 5:30 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:974113199743c76b1ea3d3017e34977a8ee687c4
commit 974113199743c76b1ea3d3017e34977a8ee687c4
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Wed Nov 9 00:30:02 2022 -0500
Rework 128-bit complex multiply and divide.
This function reworks how the complex multiply and divide built-in functions are
done. Previously we created built-in declarations for doing long double complex
multiply and divide when long double is IEEE 128-bit. The old code also did not
support __ibm128 complex multiply and divide if long double is IEEE 128-bit.
In terms of history, I wrote the original code just as I was starting to test
GCC on systems where IEEE 128-bit long double was the default. At the time, we
had not yet started mangling the built-in function names as a way to bridge
going from a system with 128-bit IBM long double to 128-bin IEEE long double.
The original code depends on there only being two 128-bit types invovled. With
the next patch in this series, this assumption will no longer be true. When
long double is IEEE 128-bit, there will be 2 IEEE 128-bit types (one for the
explicit __float128/_Float128 type and one for long double).
The problem is we cannot create two separate built-in functions that resolve to
the same name. This is a requirement of add_builtin_function and the C front
end. That means for the 3 possible modes (IFmode, KFmode, and TFmode), you can
only use 2 of them.
This code does not create the built-in declaration with the changed name.
Instead, it uses the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the name
before it is written out to the assembler file like it now does for all of the
other long double built-in functions.
We need to disable using this mapping when we are building libgcc, specifically
when it is building the floating point 128-bit multiply and divide functions.
The flag that is used when libgcc is built (-fbuilding-libcc) is only available
in the C/C++ front ends. We need to remember that we are building libgcc in the
rs6000-c.cc support to be able to use this later to decided whether to mangle
the decl assembler name or not.
When I wrote these patches, I discovered that __ibm128 complex multiply and
divide had originally not been supported if long double is IEEE 128-bit as it
would generate calls to __mulic3 and __divic3. I added tests in the testsuite
to verify that the correct name (i.e. __multc3 and __divtc3) is used in this
case.
I tested all 3 patchs for PR target/107299 on:
1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
2) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
3) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm
4) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm
Once all 3 patches have been applied, we can once again build GCC when long
double is IEEE 128-bit. There were no other regressions with these patches.
Can I check these patches into the trunk?
2022-11-09 Michael Meissner <meissner@linux.ibm.com>
gcc/
PR target/107299
* config/rs6000/rs6000-c.cc (rs6000_cpu_cpp_builtins): Set
building_libgcc.
* config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
(init_float128_ieee): Delete code to switch complex multiply and divide
for long double.
(complex_multiply_builtin_code): New helper function.
(complex_divide_builtin_code): Likewise.
(rs6000_mangle_decl_assembler_name): Add support for mangling the name
of complex 128-bit multiply and divide built-in functions.
* config/rs6000/rs6000.opt (building_libgcc): New target variable.
gcc/testsuite/
PR target/107299
* gcc.target/powerpc/divic3-1.c: New test.
* gcc.target/powerpc/divic3-2.c: Likewise.
* gcc.target/powerpc/mulic3-1.c: Likewise.
* gcc.target/powerpc/mulic3-2.c: Likewise.
Diff:
---
gcc/ChangeLog.meissner | 330 +++++++++++++++++++++++++++-
gcc/config/rs6000/rs6000-c.cc | 8 +
gcc/config/rs6000/rs6000.cc | 110 ++++++----
gcc/config/rs6000/rs6000.opt | 4 +
gcc/testsuite/gcc.target/powerpc/divic3-1.c | 18 ++
gcc/testsuite/gcc.target/powerpc/divic3-2.c | 17 ++
gcc/testsuite/gcc.target/powerpc/mulic3-1.c | 18 ++
gcc/testsuite/gcc.target/powerpc/mulic3-2.c | 17 ++
8 files changed, 474 insertions(+), 48 deletions(-)
diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 1f2cc7b8288..5a6fa542ab6 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,4 +1,332 @@
+==================== Dmf004 branch, patch #4.
+
+Patch libgcc to always use _Float128 and _Complex _Float128 on PowerPC.
+
2022-11-09 Michael Meissner <meissner@linux.ibm.com>
- Clone branch
+libgcc/
+
+ * config/rs6000/quad-float128.h (TF): Delete definition.
+ (TFtype): Define to be _Float128.
+ (TCtype): Change to be _Complex _Float128.
+ * libgcc2.h (TFtype): Allow MD code to override definition.
+ (TCtype): Likewise.
+ * soft-fp/quad.h (TFtype): Likewise.
+
+==================== Dmf004 branch, patch #3.
+
+Update float 128-bit conversions, PR target/107299.
+
+This patch fixes two tests that are still failing when long double is IEEE
+128-bit after the previous 2 patches for PR target/107299 have been applied.
+The tests are:
+
+ gcc.target/powerpc/convert-fp-128.c
+ gcc.target/powerpc/pr85657-3.c
+
+This patch is a rewrite of the patch submitted on August 18th:
+
+| https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599988.html
+
+This patch reworks the conversions between 128-bit binary floating point types.
+Previously, we would call rs6000_expand_float128_convert to do all conversions.
+Now, we only define the conversions between the same representation that turn
+into a NOP. The appropriate extend or truncate insn is generated, and after
+register allocation, it is converted to a move.
+
+This patch also fixes two places where we want to override the external name
+for the conversion function, and the wrong optab was used. Previously,
+rs6000_expand_float128_convert would handle the move or generate the call as
+needed. Now, it lets the machine independent code generate the call. But if
+we use the machine independent code to generate the call, we need to update the
+name for two optabs where a truncate would be used in terms of converting
+between the modes. This patch updates those two optabs.
+
+I tested this patch on:
+
+ 1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
+ 2) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
+ 3) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm
+ 4) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm
+
+In the past I have also tested this exact patch on the following systems:
+
+ 1) LE Power10 using --with-cpu=power9 --with-long-double-format=ibm
+ 2) LE Power10 using --with-cpu=power8 --with-long-double-format=ibm
+ 3) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
+
+There were no regressions in the bootstrap process or running the tests (after
+applying all 3 patches for PR target/107299). Can I check this patch into the
+trunk?
+
+2022-11-09 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/107299
+ * config/rs6000/rs6000.cc (init_float128_ieee): Use the correct
+ float_extend or float_truncate optab based on how the machine converts
+ between IEEE 128-bit and IBM 128-bit.
+ * config/rs6000/rs6000.md (IFKF): Delete.
+ (IFKF_reg): Delete.
+ (extendiftf2): Rewrite to be a move if IFmode and TFmode are both IBM
+ 128-bit. Do not run if TFmode is IEEE 128-bit.
+ (extendifkf2): Delete.
+ (extendtfkf2): Delete.
+ (extendtfif2): Delete.
+ (trunciftf2): Delete.
+ (truncifkf2): Delete.
+ (trunckftf2): Delete.
+ (extendkftf2): Implement conversion of IEEE 128-bit types as a move.
+ (trunctfif2): Delete.
+ (trunctfkf2): Implement conversion of IEEE 128-bit types as a move.
+ (extend<mode>tf2_internal): Delete.
+ (extendtf<mode>2_internal): Delete.
+
+==================== Dmf004 branch, patch #2.
+
+Make __float128 use the _Float128 type, PR target/107299.
+
+This patch fixes the issue that GCC cannot build when the default long double
+is IEEE 128-bit. It fails in building libgcc, specifically when it is trying
+to buld the __mulkc3 function in libgcc. It is failing in gimple-range-fold.cc
+during the evrp pass. Ultimately it is failing because the code declared the
+type to use TFmode but it used F128 functions (i.e. KFmode).
+
+ typedef float TFtype __attribute__((mode (TF)));
+ typedef __complex float TCtype __attribute__((mode (TC)));
+
+ TCtype
+ __mulkc3_sw (TFtype a, TFtype b, TFtype c, TFtype d)
+ {
+ TFtype ac, bd, ad, bc, x, y;
+ TCtype res;
+
+ ac = a * c;
+ bd = b * d;
+ ad = a * d;
+ bc = b * c;
+
+ x = ac - bd;
+ y = ad + bc;
+
+ if (__builtin_isnan (x) && __builtin_isnan (y))
+ {
+ _Bool recalc = 0;
+ if (__builtin_isinf (a) || __builtin_isinf (b))
+ {
+
+ a = __builtin_copysignf128 (__builtin_isinf (a) ? 1 : 0, a);
+ b = __builtin_copysignf128 (__builtin_isinf (b) ? 1 : 0, b);
+ if (__builtin_isnan (c))
+ c = __builtin_copysignf128 (0, c);
+ if (__builtin_isnan (d))
+ d = __builtin_copysignf128 (0, d);
+ recalc = 1;
+ }
+ if (__builtin_isinf (c) || __builtin_isinf (d))
+ {
+
+ c = __builtin_copysignf128 (__builtin_isinf (c) ? 1 : 0, c);
+ d = __builtin_copysignf128 (__builtin_isinf (d) ? 1 : 0, d);
+ if (__builtin_isnan (a))
+ a = __builtin_copysignf128 (0, a);
+ if (__builtin_isnan (b))
+ b = __builtin_copysignf128 (0, b);
+ recalc = 1;
+ }
+ if (!recalc
+ && (__builtin_isinf (ac) || __builtin_isinf (bd)
+ || __builtin_isinf (ad) || __builtin_isinf (bc)))
+ {
+
+ if (__builtin_isnan (a))
+ a = __builtin_copysignf128 (0, a);
+ if (__builtin_isnan (b))
+ b = __builtin_copysignf128 (0, b);
+ if (__builtin_isnan (c))
+ c = __builtin_copysignf128 (0, c);
+ if (__builtin_isnan (d))
+ d = __builtin_copysignf128 (0, d);
+ recalc = 1;
+ }
+ if (recalc)
+ {
+ x = __builtin_inff128 () * (a * c - b * d);
+ y = __builtin_inff128 () * (a * d + b * c);
+ }
+ }
+
+ __real__ res = x;
+ __imag__ res = y;
+ return res;
+ }
+
+Currently GCC uses the long double type node for __float128 if long double is
+IEEE 128-bit. It did not use the node for _Float128.
+
+Originally this was noticed if you call the nansq function to make a signaling
+NaN (nansq is mapped to nansf128). Because the type node for _Float128 is
+different from __float128, the machine independent code converts signaling NaNs
+to quiet NaNs if the types are not compatible. The following tests used to
+fail when run on a system where long double is IEEE 128-bit:
+
+ gcc.dg/torture/float128-nan.c
+ gcc.target/powerpc/nan128-1.c
+
+This patch makes both __float128 and _Float128 use the same type node.
+
+One side effect of not using the long double type node for __float128 is that we
+must only use KFmode for _Float128/__float128. The libstdc++ library won't
+build if we use TFmode for _Float128 and __float128 when long double is IEEE
+128-bit.
+
+Another minor side effect is that the f128 round to odd fused multiply-add
+function will not merge negatition with the FMA operation when the type is long
+double. If the type is __float128 or _Float128, then it will continue to do the
+optimization. The round to odd functions are defined in terms of __float128
+arguments. For example:
+ long double
+ do_fms (long double a, long double b, long double c)
+ {
+ return __builtin_fmaf128_round_to_odd (a, b, -c);
+ }
+
+will generate (assuming -mabi=ieeelongdouble):
+
+ xsnegqp 4,4
+ xsmaddqpo 4,2,3
+ xxlor 34,36,36
+
+while:
+
+ __float128
+ do_fms (__float128 a, __float128 b, __float128 c)
+ {
+ return __builtin_fmaf128_round_to_odd (a, b, -c);
+ }
+
+will generate:
+
+ xsmsubqpo 4,2,3
+ xxlor 34,36,36
+
+I tested all 3 patchs for PR target/107299 on:
+
+ 1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
+ 2) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
+ 3) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm
+ 4) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm
+
+Once all 3 patches have been applied, we can once again build GCC when long
+double is IEEE 128-bit. There were no other regressions with these patches.
+Can I check these patches into the trunk?
+
+2022-11-09 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/107299
+ * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Always use the
+ _Float128 type for __float128.
+ (rs6000_expand_builtin): Only change a KFmode built-in to TFmode, if the
+ built-in passes or returns TFmode. If the predicate failed because the
+ modes were different, use convert_move to load up the value instead of
+ copy_to_mode_reg.
+ * config/rs6000/rs6000.cc (rs6000_translate_mode_attribute): Don't
+ translate IEEE 128-bit floating point modes to explicit IEEE 128-bit
+ modes (KFmode or KCmode), even if long double is IEEE 128-bit.
+ (rs6000_libgcc_floating_mode_supported_p): Support KFmode all of the
+ time if we support IEEE 128-bit floating point.
+ (rs6000_floatn_mode): _Float128 and _Float128x always uses KFmode.
+
+gcc/testsuite/
+
+ PR target/107299
+ * gcc.target/powerpc/float128-hw12.c: New test.
+ * gcc.target/powerpc/float128-hw13.c: Likewise.
+ * gcc.target/powerpc/float128-hw4.c: Update insns.
+
+==================== Dmf004 branch, patch #1.
+
+Rework 128-bit complex multiply and divide.
+
+This function reworks how the complex multiply and divide built-in functions are
+done. Previously we created built-in declarations for doing long double complex
+multiply and divide when long double is IEEE 128-bit. The old code also did not
+support __ibm128 complex multiply and divide if long double is IEEE 128-bit.
+
+In terms of history, I wrote the original code just as I was starting to test
+GCC on systems where IEEE 128-bit long double was the default. At the time, we
+had not yet started mangling the built-in function names as a way to bridge
+going from a system with 128-bit IBM long double to 128-bin IEEE long double.
+
+The original code depends on there only being two 128-bit types invovled. With
+the next patch in this series, this assumption will no longer be true. When
+long double is IEEE 128-bit, there will be 2 IEEE 128-bit types (one for the
+explicit __float128/_Float128 type and one for long double).
+
+The problem is we cannot create two separate built-in functions that resolve to
+the same name. This is a requirement of add_builtin_function and the C front
+end. That means for the 3 possible modes (IFmode, KFmode, and TFmode), you can
+only use 2 of them.
+
+This code does not create the built-in declaration with the changed name.
+Instead, it uses the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the name
+before it is written out to the assembler file like it now does for all of the
+other long double built-in functions.
+
+We need to disable using this mapping when we are building libgcc, specifically
+when it is building the floating point 128-bit multiply and divide functions.
+The flag that is used when libgcc is built (-fbuilding-libcc) is only available
+in the C/C++ front ends. We need to remember that we are building libgcc in the
+rs6000-c.cc support to be able to use this later to decided whether to mangle
+the decl assembler name or not.
+
+When I wrote these patches, I discovered that __ibm128 complex multiply and
+divide had originally not been supported if long double is IEEE 128-bit as it
+would generate calls to __mulic3 and __divic3. I added tests in the testsuite
+to verify that the correct name (i.e. __multc3 and __divtc3) is used in this
+case.
+
+I tested all 3 patchs for PR target/107299 on:
+
+ 1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
+ 2) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
+ 3) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm
+ 4) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm
+
+Once all 3 patches have been applied, we can once again build GCC when long
+double is IEEE 128-bit. There were no other regressions with these patches.
+Can I check these patches into the trunk?
+
+2022-11-09 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/107299
+ * config/rs6000/rs6000-c.cc (rs6000_cpu_cpp_builtins): Set
+ building_libgcc.
+ * config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
+ (init_float128_ieee): Delete code to switch complex multiply and divide
+ for long double.
+ (complex_multiply_builtin_code): New helper function.
+ (complex_divide_builtin_code): Likewise.
+ (rs6000_mangle_decl_assembler_name): Add support for mangling the name
+ of complex 128-bit multiply and divide built-in functions.
+ * config/rs6000/rs6000.opt (building_libgcc): New target variable.
+
+gcc/testsuite/
+
+ PR target/107299
+ * gcc.target/powerpc/divic3-1.c: New test.
+ * gcc.target/powerpc/divic3-2.c: Likewise.
+ * gcc.target/powerpc/mulic3-1.c: Likewise.
+ * gcc.target/powerpc/mulic3-2.c: Likewise.
+
+==================== Dmf004 branch, head.
+
+2022-11-09 Michael Meissner <meissner@linux.ibm.com>
+
+ Clone branch
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 56609462629..5c2f3bcee9f 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -780,6 +780,14 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfile)
|| DEFAULT_ABI == ABI_ELFv2
|| (DEFAULT_ABI == ABI_AIX && !rs6000_compat_align_parm))
builtin_define ("__STRUCT_PARM_ALIGN__=16");
+
+ /* Store whether or not we are building libgcc. This is needed to disable
+ generating the alternate names for 128-bit complex multiply and divide.
+ We need to disable generating __multc3, __divtc3, __mulkc3, and __divkc3
+ when we are building those functions in libgcc. The variable
+ flag_building_libgcc is only available for the C family of front-ends.
+ We set this variable here to disable generating the alternate names. */
+ building_libgcc = flag_building_libgcc;
}
\f
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index a85d7630b41..cfb6227e27b 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -11123,26 +11123,6 @@ init_float128_ibm (machine_mode mode)
}
}
-/* Create a decl for either complex long double multiply or complex long double
- divide when long double is IEEE 128-bit floating point. We can't use
- __multc3 and __divtc3 because the original long double using IBM extended
- double used those names. The complex multiply/divide functions are encoded
- as builtin functions with a complex result and 4 scalar inputs. */
-
-static void
-create_complex_muldiv (const char *name, built_in_function fncode, tree fntype)
-{
- tree fndecl = add_builtin_function (name, fntype, fncode, BUILT_IN_NORMAL,
- name, NULL_TREE);
-
- set_builtin_decl (fncode, fndecl, true);
-
- if (TARGET_DEBUG_BUILTIN)
- fprintf (stderr, "create complex %s, fncode: %d\n", name, (int) fncode);
-
- return;
-}
-
/* Set up IEEE 128-bit floating point routines. Use different names if the
arguments can be passed in a vector register. The historical PowerPC
implementation of IEEE 128-bit floating point used _q_<op> for the names, so
@@ -11154,32 +11134,6 @@ init_float128_ieee (machine_mode mode)
{
if (FLOAT128_VECTOR_P (mode))
{
- static bool complex_muldiv_init_p = false;
-
- /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble. If
- we have clone or target attributes, this will be called a second
- time. We want to create the built-in function only once. */
- if (mode == TFmode && TARGET_IEEEQUAD && !complex_muldiv_init_p)
- {
- complex_muldiv_init_p = true;
- built_in_function fncode_mul =
- (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + TCmode
- - MIN_MODE_COMPLEX_FLOAT);
- built_in_function fncode_div =
- (built_in_function) (BUILT_IN_COMPLEX_DIV_MIN + TCmode
- - MIN_MODE_COMPLEX_FLOAT);
-
- tree fntype = build_function_type_list (complex_long_double_type_node,
- long_double_type_node,
- long_double_type_node,
- long_double_type_node,
- long_double_type_node,
- NULL_TREE);
-
- create_complex_muldiv ("__mulkc3", fncode_mul, fntype);
- create_complex_muldiv ("__divkc3", fncode_div, fntype);
- }
-
set_optab_libfunc (add_optab, mode, "__addkf3");
set_optab_libfunc (sub_optab, mode, "__subkf3");
set_optab_libfunc (neg_optab, mode, "__negkf2");
@@ -28142,6 +28096,25 @@ rs6000_starting_frame_offset (void)
return RS6000_STARTING_FRAME_OFFSET;
}
\f
+/* Internal function to return the built-in function id for the complex
+ multiply operation for a given mode. */
+
+static inline built_in_function
+complex_multiply_builtin_code (machine_mode mode)
+{
+ return (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + mode
+ - MIN_MODE_COMPLEX_FLOAT);
+}
+
+/* Internal function to return the built-in function id for the complex divide
+ operation for a given mode. */
+
+static inline built_in_function
+complex_divide_builtin_code (machine_mode mode)
+{
+ return (built_in_function) (BUILT_IN_COMPLEX_DIV_MIN + mode
+ - MIN_MODE_COMPLEX_FLOAT);
+}
/* On 64-bit Linux and Freebsd systems, possibly switch the long double library
function names from <foo>l to <foo>f128 if the default long double type is
@@ -28160,11 +28133,54 @@ rs6000_starting_frame_offset (void)
only do this transformation if the __float128 type is enabled. This
prevents us from doing the transformation on older 32-bit ports that might
have enabled using IEEE 128-bit floating point as the default long double
- type. */
+ type.
+
+ We also use the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the
+ function names used for complex multiply and divide to the appropriate
+ names. */
static tree
rs6000_mangle_decl_assembler_name (tree decl, tree id)
{
+ /* Handle complex multiply/divide. For IEEE 128-bit, use __mulkc3 or
+ __divkc3 and for IBM 128-bit use __multc3 and __divtc3. */
+ if (TARGET_FLOAT128_TYPE
+ && !building_libgcc
+ && TREE_CODE (decl) == FUNCTION_DECL
+ && DECL_IS_UNDECLARED_BUILTIN (decl)
+ && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL)
+ {
+ built_in_function id = DECL_FUNCTION_CODE (decl);
+ const char *newname = NULL;
+
+ if (id == complex_multiply_builtin_code (KCmode))
+ newname = "__mulkc3";
+
+ else if (id == complex_multiply_builtin_code (ICmode))
+ newname = "__multc3";
+
+ else if (id == complex_multiply_builtin_code (TCmode))
+ newname = (TARGET_IEEEQUAD) ? "__mulkc3" : "__multc3";
+
+ else if (id == complex_divide_builtin_code (KCmode))
+ newname = "__divkc3";
+
+ else if (id == complex_divide_builtin_code (ICmode))
+ newname = "__divtc3";
+
+ else if (id == complex_divide_builtin_code (TCmode))
+ newname = (TARGET_IEEEQUAD) ? "__divkc3" : "__divtc3";
+
+ if (newname)
+ {
+ if (TARGET_DEBUG_BUILTIN)
+ fprintf (stderr, "Map complex mul/div => %s\n", newname);
+
+ return get_identifier (newname);
+ }
+ }
+
+ /* Map long double built-in functions if long double is IEEE 128-bit. */
if (TARGET_FLOAT128_TYPE && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
&& TREE_CODE (decl) == FUNCTION_DECL
&& DECL_IS_UNDECLARED_BUILTIN (decl)
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index b63a5d443af..e2de03dda92 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -100,6 +100,10 @@ unsigned int rs6000_recip_control
TargetVariable
unsigned int rs6000_debug
+;; Whether we are building libgcc or not.
+TargetVariable
+bool building_libgcc = false
+
;; Whether to enable the -mfloat128 stuff without necessarily enabling the
;; __float128 keyword.
TargetSave
diff --git a/gcc/testsuite/gcc.target/powerpc/divic3-1.c b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
new file mode 100644
index 00000000000..1cc6b1be904
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* Check that complex divide generates the right call for __ibm128 when long
+ double is IEEE 128-bit floating point. */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__IC__)));
+
+void
+divide (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+ *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "bl __divtc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/divic3-2.c b/gcc/testsuite/gcc.target/powerpc/divic3-2.c
new file mode 100644
index 00000000000..8ff342e0116
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/divic3-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* Check that complex divide generates the right call for __ibm128 when long
+ double is IBM 128-bit floating point. */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__TC__)));
+
+void
+divide (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+ *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "bl __divtc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mulic3-1.c b/gcc/testsuite/gcc.target/powerpc/mulic3-1.c
new file mode 100644
index 00000000000..4cd773c4b06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mulic3-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call for __ibm128 when long
+ double is IEEE 128-bit floating point. */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__IC__)));
+
+void
+multiply (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+ *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "bl __multc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mulic3-2.c b/gcc/testsuite/gcc.target/powerpc/mulic3-2.c
new file mode 100644
index 00000000000..36fe8bc3061
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mulic3-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call for __ibm128 when long
+ double is IBM 128-bit floating point. */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__TC__)));
+
+void
+multiply (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+ *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "bl __multc3" } } */
^ permalink raw reply [flat|nested] 2+ messages in thread
* [gcc(refs/users/meissner/heads/dmf004)] Rework 128-bit complex multiply and divide.
@ 2022-11-17 21:52 Michael Meissner
0 siblings, 0 replies; 2+ messages in thread
From: Michael Meissner @ 2022-11-17 21:52 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:ff75e486a237e82046f307bc0ac4fb596859f91b
commit ff75e486a237e82046f307bc0ac4fb596859f91b
Author: Michael Meissner <meissner@linux.ibm.com>
Date: Wed Nov 9 00:30:02 2022 -0500
Rework 128-bit complex multiply and divide.
This function reworks how the complex multiply and divide built-in functions are
done. Previously we created built-in declarations for doing long double complex
multiply and divide when long double is IEEE 128-bit. The old code also did not
support __ibm128 complex multiply and divide if long double is IEEE 128-bit.
In terms of history, I wrote the original code just as I was starting to test
GCC on systems where IEEE 128-bit long double was the default. At the time, we
had not yet started mangling the built-in function names as a way to bridge
going from a system with 128-bit IBM long double to 128-bin IEEE long double.
The original code depends on there only being two 128-bit types invovled. With
the next patch in this series, this assumption will no longer be true. When
long double is IEEE 128-bit, there will be 2 IEEE 128-bit types (one for the
explicit __float128/_Float128 type and one for long double).
The problem is we cannot create two separate built-in functions that resolve to
the same name. This is a requirement of add_builtin_function and the C front
end. That means for the 3 possible modes (IFmode, KFmode, and TFmode), you can
only use 2 of them.
This code does not create the built-in declaration with the changed name.
Instead, it uses the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the name
before it is written out to the assembler file like it now does for all of the
other long double built-in functions.
We need to disable using this mapping when we are building libgcc, specifically
when it is building the floating point 128-bit multiply and divide functions.
The flag that is used when libgcc is built (-fbuilding-libcc) is only available
in the C/C++ front ends. We need to remember that we are building libgcc in the
rs6000-c.cc support to be able to use this later to decided whether to mangle
the decl assembler name or not.
When I wrote these patches, I discovered that __ibm128 complex multiply and
divide had originally not been supported if long double is IEEE 128-bit as it
would generate calls to __mulic3 and __divic3. I added tests in the testsuite
to verify that the correct name (i.e. __multc3 and __divtc3) is used in this
case.
I tested all 3 patchs for PR target/107299 on:
1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
2) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
3) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm
4) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm
Once all 3 patches have been applied, we can once again build GCC when long
double is IEEE 128-bit. There were no other regressions with these patches.
Can I check these patches into the trunk?
2022-11-09 Michael Meissner <meissner@linux.ibm.com>
gcc/
PR target/107299
* config/rs6000/rs6000-c.cc (rs6000_cpu_cpp_builtins): Set
building_libgcc.
* config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
(init_float128_ieee): Delete code to switch complex multiply and divide
for long double.
(complex_multiply_builtin_code): New helper function.
(complex_divide_builtin_code): Likewise.
(rs6000_mangle_decl_assembler_name): Add support for mangling the name
of complex 128-bit multiply and divide built-in functions.
* config/rs6000/rs6000.opt (building_libgcc): New target variable.
gcc/testsuite/
PR target/107299
* gcc.target/powerpc/divic3-1.c: New test.
* gcc.target/powerpc/divic3-2.c: Likewise.
* gcc.target/powerpc/mulic3-1.c: Likewise.
* gcc.target/powerpc/mulic3-2.c: Likewise.
Diff:
---
gcc/ChangeLog.meissner | 330 +++++++++++++++++++++++++++-
gcc/config/rs6000/rs6000-c.cc | 8 +
gcc/config/rs6000/rs6000.cc | 110 ++++++----
gcc/config/rs6000/rs6000.opt | 4 +
gcc/testsuite/gcc.target/powerpc/divic3-1.c | 18 ++
gcc/testsuite/gcc.target/powerpc/divic3-2.c | 17 ++
gcc/testsuite/gcc.target/powerpc/mulic3-1.c | 18 ++
gcc/testsuite/gcc.target/powerpc/mulic3-2.c | 17 ++
8 files changed, 474 insertions(+), 48 deletions(-)
diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 1f2cc7b8288..5a6fa542ab6 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,4 +1,332 @@
+==================== Dmf004 branch, patch #4.
+
+Patch libgcc to always use _Float128 and _Complex _Float128 on PowerPC.
+
2022-11-09 Michael Meissner <meissner@linux.ibm.com>
- Clone branch
+libgcc/
+
+ * config/rs6000/quad-float128.h (TF): Delete definition.
+ (TFtype): Define to be _Float128.
+ (TCtype): Change to be _Complex _Float128.
+ * libgcc2.h (TFtype): Allow MD code to override definition.
+ (TCtype): Likewise.
+ * soft-fp/quad.h (TFtype): Likewise.
+
+==================== Dmf004 branch, patch #3.
+
+Update float 128-bit conversions, PR target/107299.
+
+This patch fixes two tests that are still failing when long double is IEEE
+128-bit after the previous 2 patches for PR target/107299 have been applied.
+The tests are:
+
+ gcc.target/powerpc/convert-fp-128.c
+ gcc.target/powerpc/pr85657-3.c
+
+This patch is a rewrite of the patch submitted on August 18th:
+
+| https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599988.html
+
+This patch reworks the conversions between 128-bit binary floating point types.
+Previously, we would call rs6000_expand_float128_convert to do all conversions.
+Now, we only define the conversions between the same representation that turn
+into a NOP. The appropriate extend or truncate insn is generated, and after
+register allocation, it is converted to a move.
+
+This patch also fixes two places where we want to override the external name
+for the conversion function, and the wrong optab was used. Previously,
+rs6000_expand_float128_convert would handle the move or generate the call as
+needed. Now, it lets the machine independent code generate the call. But if
+we use the machine independent code to generate the call, we need to update the
+name for two optabs where a truncate would be used in terms of converting
+between the modes. This patch updates those two optabs.
+
+I tested this patch on:
+
+ 1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
+ 2) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
+ 3) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm
+ 4) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm
+
+In the past I have also tested this exact patch on the following systems:
+
+ 1) LE Power10 using --with-cpu=power9 --with-long-double-format=ibm
+ 2) LE Power10 using --with-cpu=power8 --with-long-double-format=ibm
+ 3) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
+
+There were no regressions in the bootstrap process or running the tests (after
+applying all 3 patches for PR target/107299). Can I check this patch into the
+trunk?
+
+2022-11-09 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/107299
+ * config/rs6000/rs6000.cc (init_float128_ieee): Use the correct
+ float_extend or float_truncate optab based on how the machine converts
+ between IEEE 128-bit and IBM 128-bit.
+ * config/rs6000/rs6000.md (IFKF): Delete.
+ (IFKF_reg): Delete.
+ (extendiftf2): Rewrite to be a move if IFmode and TFmode are both IBM
+ 128-bit. Do not run if TFmode is IEEE 128-bit.
+ (extendifkf2): Delete.
+ (extendtfkf2): Delete.
+ (extendtfif2): Delete.
+ (trunciftf2): Delete.
+ (truncifkf2): Delete.
+ (trunckftf2): Delete.
+ (extendkftf2): Implement conversion of IEEE 128-bit types as a move.
+ (trunctfif2): Delete.
+ (trunctfkf2): Implement conversion of IEEE 128-bit types as a move.
+ (extend<mode>tf2_internal): Delete.
+ (extendtf<mode>2_internal): Delete.
+
+==================== Dmf004 branch, patch #2.
+
+Make __float128 use the _Float128 type, PR target/107299.
+
+This patch fixes the issue that GCC cannot build when the default long double
+is IEEE 128-bit. It fails in building libgcc, specifically when it is trying
+to buld the __mulkc3 function in libgcc. It is failing in gimple-range-fold.cc
+during the evrp pass. Ultimately it is failing because the code declared the
+type to use TFmode but it used F128 functions (i.e. KFmode).
+
+ typedef float TFtype __attribute__((mode (TF)));
+ typedef __complex float TCtype __attribute__((mode (TC)));
+
+ TCtype
+ __mulkc3_sw (TFtype a, TFtype b, TFtype c, TFtype d)
+ {
+ TFtype ac, bd, ad, bc, x, y;
+ TCtype res;
+
+ ac = a * c;
+ bd = b * d;
+ ad = a * d;
+ bc = b * c;
+
+ x = ac - bd;
+ y = ad + bc;
+
+ if (__builtin_isnan (x) && __builtin_isnan (y))
+ {
+ _Bool recalc = 0;
+ if (__builtin_isinf (a) || __builtin_isinf (b))
+ {
+
+ a = __builtin_copysignf128 (__builtin_isinf (a) ? 1 : 0, a);
+ b = __builtin_copysignf128 (__builtin_isinf (b) ? 1 : 0, b);
+ if (__builtin_isnan (c))
+ c = __builtin_copysignf128 (0, c);
+ if (__builtin_isnan (d))
+ d = __builtin_copysignf128 (0, d);
+ recalc = 1;
+ }
+ if (__builtin_isinf (c) || __builtin_isinf (d))
+ {
+
+ c = __builtin_copysignf128 (__builtin_isinf (c) ? 1 : 0, c);
+ d = __builtin_copysignf128 (__builtin_isinf (d) ? 1 : 0, d);
+ if (__builtin_isnan (a))
+ a = __builtin_copysignf128 (0, a);
+ if (__builtin_isnan (b))
+ b = __builtin_copysignf128 (0, b);
+ recalc = 1;
+ }
+ if (!recalc
+ && (__builtin_isinf (ac) || __builtin_isinf (bd)
+ || __builtin_isinf (ad) || __builtin_isinf (bc)))
+ {
+
+ if (__builtin_isnan (a))
+ a = __builtin_copysignf128 (0, a);
+ if (__builtin_isnan (b))
+ b = __builtin_copysignf128 (0, b);
+ if (__builtin_isnan (c))
+ c = __builtin_copysignf128 (0, c);
+ if (__builtin_isnan (d))
+ d = __builtin_copysignf128 (0, d);
+ recalc = 1;
+ }
+ if (recalc)
+ {
+ x = __builtin_inff128 () * (a * c - b * d);
+ y = __builtin_inff128 () * (a * d + b * c);
+ }
+ }
+
+ __real__ res = x;
+ __imag__ res = y;
+ return res;
+ }
+
+Currently GCC uses the long double type node for __float128 if long double is
+IEEE 128-bit. It did not use the node for _Float128.
+
+Originally this was noticed if you call the nansq function to make a signaling
+NaN (nansq is mapped to nansf128). Because the type node for _Float128 is
+different from __float128, the machine independent code converts signaling NaNs
+to quiet NaNs if the types are not compatible. The following tests used to
+fail when run on a system where long double is IEEE 128-bit:
+
+ gcc.dg/torture/float128-nan.c
+ gcc.target/powerpc/nan128-1.c
+
+This patch makes both __float128 and _Float128 use the same type node.
+
+One side effect of not using the long double type node for __float128 is that we
+must only use KFmode for _Float128/__float128. The libstdc++ library won't
+build if we use TFmode for _Float128 and __float128 when long double is IEEE
+128-bit.
+
+Another minor side effect is that the f128 round to odd fused multiply-add
+function will not merge negatition with the FMA operation when the type is long
+double. If the type is __float128 or _Float128, then it will continue to do the
+optimization. The round to odd functions are defined in terms of __float128
+arguments. For example:
+ long double
+ do_fms (long double a, long double b, long double c)
+ {
+ return __builtin_fmaf128_round_to_odd (a, b, -c);
+ }
+
+will generate (assuming -mabi=ieeelongdouble):
+
+ xsnegqp 4,4
+ xsmaddqpo 4,2,3
+ xxlor 34,36,36
+
+while:
+
+ __float128
+ do_fms (__float128 a, __float128 b, __float128 c)
+ {
+ return __builtin_fmaf128_round_to_odd (a, b, -c);
+ }
+
+will generate:
+
+ xsmsubqpo 4,2,3
+ xxlor 34,36,36
+
+I tested all 3 patchs for PR target/107299 on:
+
+ 1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
+ 2) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
+ 3) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm
+ 4) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm
+
+Once all 3 patches have been applied, we can once again build GCC when long
+double is IEEE 128-bit. There were no other regressions with these patches.
+Can I check these patches into the trunk?
+
+2022-11-09 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/107299
+ * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Always use the
+ _Float128 type for __float128.
+ (rs6000_expand_builtin): Only change a KFmode built-in to TFmode, if the
+ built-in passes or returns TFmode. If the predicate failed because the
+ modes were different, use convert_move to load up the value instead of
+ copy_to_mode_reg.
+ * config/rs6000/rs6000.cc (rs6000_translate_mode_attribute): Don't
+ translate IEEE 128-bit floating point modes to explicit IEEE 128-bit
+ modes (KFmode or KCmode), even if long double is IEEE 128-bit.
+ (rs6000_libgcc_floating_mode_supported_p): Support KFmode all of the
+ time if we support IEEE 128-bit floating point.
+ (rs6000_floatn_mode): _Float128 and _Float128x always uses KFmode.
+
+gcc/testsuite/
+
+ PR target/107299
+ * gcc.target/powerpc/float128-hw12.c: New test.
+ * gcc.target/powerpc/float128-hw13.c: Likewise.
+ * gcc.target/powerpc/float128-hw4.c: Update insns.
+
+==================== Dmf004 branch, patch #1.
+
+Rework 128-bit complex multiply and divide.
+
+This function reworks how the complex multiply and divide built-in functions are
+done. Previously we created built-in declarations for doing long double complex
+multiply and divide when long double is IEEE 128-bit. The old code also did not
+support __ibm128 complex multiply and divide if long double is IEEE 128-bit.
+
+In terms of history, I wrote the original code just as I was starting to test
+GCC on systems where IEEE 128-bit long double was the default. At the time, we
+had not yet started mangling the built-in function names as a way to bridge
+going from a system with 128-bit IBM long double to 128-bin IEEE long double.
+
+The original code depends on there only being two 128-bit types invovled. With
+the next patch in this series, this assumption will no longer be true. When
+long double is IEEE 128-bit, there will be 2 IEEE 128-bit types (one for the
+explicit __float128/_Float128 type and one for long double).
+
+The problem is we cannot create two separate built-in functions that resolve to
+the same name. This is a requirement of add_builtin_function and the C front
+end. That means for the 3 possible modes (IFmode, KFmode, and TFmode), you can
+only use 2 of them.
+
+This code does not create the built-in declaration with the changed name.
+Instead, it uses the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the name
+before it is written out to the assembler file like it now does for all of the
+other long double built-in functions.
+
+We need to disable using this mapping when we are building libgcc, specifically
+when it is building the floating point 128-bit multiply and divide functions.
+The flag that is used when libgcc is built (-fbuilding-libcc) is only available
+in the C/C++ front ends. We need to remember that we are building libgcc in the
+rs6000-c.cc support to be able to use this later to decided whether to mangle
+the decl assembler name or not.
+
+When I wrote these patches, I discovered that __ibm128 complex multiply and
+divide had originally not been supported if long double is IEEE 128-bit as it
+would generate calls to __mulic3 and __divic3. I added tests in the testsuite
+to verify that the correct name (i.e. __multc3 and __divtc3) is used in this
+case.
+
+I tested all 3 patchs for PR target/107299 on:
+
+ 1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
+ 2) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
+ 3) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm
+ 4) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm
+
+Once all 3 patches have been applied, we can once again build GCC when long
+double is IEEE 128-bit. There were no other regressions with these patches.
+Can I check these patches into the trunk?
+
+2022-11-09 Michael Meissner <meissner@linux.ibm.com>
+
+gcc/
+
+ PR target/107299
+ * config/rs6000/rs6000-c.cc (rs6000_cpu_cpp_builtins): Set
+ building_libgcc.
+ * config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
+ (init_float128_ieee): Delete code to switch complex multiply and divide
+ for long double.
+ (complex_multiply_builtin_code): New helper function.
+ (complex_divide_builtin_code): Likewise.
+ (rs6000_mangle_decl_assembler_name): Add support for mangling the name
+ of complex 128-bit multiply and divide built-in functions.
+ * config/rs6000/rs6000.opt (building_libgcc): New target variable.
+
+gcc/testsuite/
+
+ PR target/107299
+ * gcc.target/powerpc/divic3-1.c: New test.
+ * gcc.target/powerpc/divic3-2.c: Likewise.
+ * gcc.target/powerpc/mulic3-1.c: Likewise.
+ * gcc.target/powerpc/mulic3-2.c: Likewise.
+
+==================== Dmf004 branch, head.
+
+2022-11-09 Michael Meissner <meissner@linux.ibm.com>
+
+ Clone branch
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 56609462629..5c2f3bcee9f 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -780,6 +780,14 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfile)
|| DEFAULT_ABI == ABI_ELFv2
|| (DEFAULT_ABI == ABI_AIX && !rs6000_compat_align_parm))
builtin_define ("__STRUCT_PARM_ALIGN__=16");
+
+ /* Store whether or not we are building libgcc. This is needed to disable
+ generating the alternate names for 128-bit complex multiply and divide.
+ We need to disable generating __multc3, __divtc3, __mulkc3, and __divkc3
+ when we are building those functions in libgcc. The variable
+ flag_building_libgcc is only available for the C family of front-ends.
+ We set this variable here to disable generating the alternate names. */
+ building_libgcc = flag_building_libgcc;
}
\f
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index a85d7630b41..cfb6227e27b 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -11123,26 +11123,6 @@ init_float128_ibm (machine_mode mode)
}
}
-/* Create a decl for either complex long double multiply or complex long double
- divide when long double is IEEE 128-bit floating point. We can't use
- __multc3 and __divtc3 because the original long double using IBM extended
- double used those names. The complex multiply/divide functions are encoded
- as builtin functions with a complex result and 4 scalar inputs. */
-
-static void
-create_complex_muldiv (const char *name, built_in_function fncode, tree fntype)
-{
- tree fndecl = add_builtin_function (name, fntype, fncode, BUILT_IN_NORMAL,
- name, NULL_TREE);
-
- set_builtin_decl (fncode, fndecl, true);
-
- if (TARGET_DEBUG_BUILTIN)
- fprintf (stderr, "create complex %s, fncode: %d\n", name, (int) fncode);
-
- return;
-}
-
/* Set up IEEE 128-bit floating point routines. Use different names if the
arguments can be passed in a vector register. The historical PowerPC
implementation of IEEE 128-bit floating point used _q_<op> for the names, so
@@ -11154,32 +11134,6 @@ init_float128_ieee (machine_mode mode)
{
if (FLOAT128_VECTOR_P (mode))
{
- static bool complex_muldiv_init_p = false;
-
- /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble. If
- we have clone or target attributes, this will be called a second
- time. We want to create the built-in function only once. */
- if (mode == TFmode && TARGET_IEEEQUAD && !complex_muldiv_init_p)
- {
- complex_muldiv_init_p = true;
- built_in_function fncode_mul =
- (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + TCmode
- - MIN_MODE_COMPLEX_FLOAT);
- built_in_function fncode_div =
- (built_in_function) (BUILT_IN_COMPLEX_DIV_MIN + TCmode
- - MIN_MODE_COMPLEX_FLOAT);
-
- tree fntype = build_function_type_list (complex_long_double_type_node,
- long_double_type_node,
- long_double_type_node,
- long_double_type_node,
- long_double_type_node,
- NULL_TREE);
-
- create_complex_muldiv ("__mulkc3", fncode_mul, fntype);
- create_complex_muldiv ("__divkc3", fncode_div, fntype);
- }
-
set_optab_libfunc (add_optab, mode, "__addkf3");
set_optab_libfunc (sub_optab, mode, "__subkf3");
set_optab_libfunc (neg_optab, mode, "__negkf2");
@@ -28142,6 +28096,25 @@ rs6000_starting_frame_offset (void)
return RS6000_STARTING_FRAME_OFFSET;
}
\f
+/* Internal function to return the built-in function id for the complex
+ multiply operation for a given mode. */
+
+static inline built_in_function
+complex_multiply_builtin_code (machine_mode mode)
+{
+ return (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + mode
+ - MIN_MODE_COMPLEX_FLOAT);
+}
+
+/* Internal function to return the built-in function id for the complex divide
+ operation for a given mode. */
+
+static inline built_in_function
+complex_divide_builtin_code (machine_mode mode)
+{
+ return (built_in_function) (BUILT_IN_COMPLEX_DIV_MIN + mode
+ - MIN_MODE_COMPLEX_FLOAT);
+}
/* On 64-bit Linux and Freebsd systems, possibly switch the long double library
function names from <foo>l to <foo>f128 if the default long double type is
@@ -28160,11 +28133,54 @@ rs6000_starting_frame_offset (void)
only do this transformation if the __float128 type is enabled. This
prevents us from doing the transformation on older 32-bit ports that might
have enabled using IEEE 128-bit floating point as the default long double
- type. */
+ type.
+
+ We also use the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the
+ function names used for complex multiply and divide to the appropriate
+ names. */
static tree
rs6000_mangle_decl_assembler_name (tree decl, tree id)
{
+ /* Handle complex multiply/divide. For IEEE 128-bit, use __mulkc3 or
+ __divkc3 and for IBM 128-bit use __multc3 and __divtc3. */
+ if (TARGET_FLOAT128_TYPE
+ && !building_libgcc
+ && TREE_CODE (decl) == FUNCTION_DECL
+ && DECL_IS_UNDECLARED_BUILTIN (decl)
+ && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL)
+ {
+ built_in_function id = DECL_FUNCTION_CODE (decl);
+ const char *newname = NULL;
+
+ if (id == complex_multiply_builtin_code (KCmode))
+ newname = "__mulkc3";
+
+ else if (id == complex_multiply_builtin_code (ICmode))
+ newname = "__multc3";
+
+ else if (id == complex_multiply_builtin_code (TCmode))
+ newname = (TARGET_IEEEQUAD) ? "__mulkc3" : "__multc3";
+
+ else if (id == complex_divide_builtin_code (KCmode))
+ newname = "__divkc3";
+
+ else if (id == complex_divide_builtin_code (ICmode))
+ newname = "__divtc3";
+
+ else if (id == complex_divide_builtin_code (TCmode))
+ newname = (TARGET_IEEEQUAD) ? "__divkc3" : "__divtc3";
+
+ if (newname)
+ {
+ if (TARGET_DEBUG_BUILTIN)
+ fprintf (stderr, "Map complex mul/div => %s\n", newname);
+
+ return get_identifier (newname);
+ }
+ }
+
+ /* Map long double built-in functions if long double is IEEE 128-bit. */
if (TARGET_FLOAT128_TYPE && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
&& TREE_CODE (decl) == FUNCTION_DECL
&& DECL_IS_UNDECLARED_BUILTIN (decl)
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index b63a5d443af..e2de03dda92 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -100,6 +100,10 @@ unsigned int rs6000_recip_control
TargetVariable
unsigned int rs6000_debug
+;; Whether we are building libgcc or not.
+TargetVariable
+bool building_libgcc = false
+
;; Whether to enable the -mfloat128 stuff without necessarily enabling the
;; __float128 keyword.
TargetSave
diff --git a/gcc/testsuite/gcc.target/powerpc/divic3-1.c b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
new file mode 100644
index 00000000000..1cc6b1be904
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* Check that complex divide generates the right call for __ibm128 when long
+ double is IEEE 128-bit floating point. */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__IC__)));
+
+void
+divide (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+ *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "bl __divtc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/divic3-2.c b/gcc/testsuite/gcc.target/powerpc/divic3-2.c
new file mode 100644
index 00000000000..8ff342e0116
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/divic3-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* Check that complex divide generates the right call for __ibm128 when long
+ double is IBM 128-bit floating point. */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__TC__)));
+
+void
+divide (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+ *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "bl __divtc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mulic3-1.c b/gcc/testsuite/gcc.target/powerpc/mulic3-1.c
new file mode 100644
index 00000000000..4cd773c4b06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mulic3-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call for __ibm128 when long
+ double is IEEE 128-bit floating point. */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__IC__)));
+
+void
+multiply (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+ *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "bl __multc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mulic3-2.c b/gcc/testsuite/gcc.target/powerpc/mulic3-2.c
new file mode 100644
index 00000000000..36fe8bc3061
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mulic3-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call for __ibm128 when long
+ double is IBM 128-bit floating point. */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__TC__)));
+
+void
+multiply (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+ *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "bl __multc3" } } */
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2022-11-17 21:52 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-09 5:30 [gcc(refs/users/meissner/heads/dmf004)] Rework 128-bit complex multiply and divide Michael Meissner
2022-11-17 21:52 Michael Meissner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).