From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Michael Meissner <meissner@linux.ibm.com>
Cc: gcc-patches@gcc.gnu.org,
Segher Boessenkool <segher@kernel.crashing.org>,
Peter Bergner <bergner@linux.ibm.com>,
David Edelsohn <dje.gcc@gmail.com>,
Will Schmidt <will_schmidt@vnet.ibm.com>,
William Seurer <seurer@gcc.gnu.org>
Subject: Re: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299
Date: Tue, 6 Dec 2022 19:27:01 +0800 [thread overview]
Message-ID: <e96ce199-348d-7446-102c-8d2b037a7286@linux.ibm.com> (raw)
In-Reply-To: <Y2HZFlHH8HuvhGL4@toto.the-meissners.org>
Hi Mike,
Thanks for fixing this, some comments are inlined below.
on 2022/11/2 10:42, Michael Meissner wrote:
> This patch fixes the issue that GCC cannot build when the default long double
> is IEEE 128-bit. It fails in building libgcc, specifically when it is trying
> to buld the __mulkc3 function in libgcc. It is failing in gimple-range-fold.cc
> during the evrp pass. Ultimately it is failing because the code declared the
> type to use TFmode but it used F128 functions (i.e. KFmode).
>
> typedef float TFtype __attribute__((mode (TF)));
> typedef __complex float TCtype __attribute__((mode (TC)));
>
> TCtype
> __mulkc3_sw (TFtype a, TFtype b, TFtype c, TFtype d)
> {
> TFtype ac, bd, ad, bc, x, y;
> TCtype res;
>
> ac = a * c;
> bd = b * d;
> ad = a * d;
> bc = b * c;
>
> x = ac - bd;
> y = ad + bc;
>
> if (__builtin_isnan (x) && __builtin_isnan (y))
> {
> _Bool recalc = 0;
> if (__builtin_isinf (a) || __builtin_isinf (b))
> {
>
> a = __builtin_copysignf128 (__builtin_isinf (a) ? 1 : 0, a);
> b = __builtin_copysignf128 (__builtin_isinf (b) ? 1 : 0, b);
> if (__builtin_isnan (c))
> c = __builtin_copysignf128 (0, c);
> if (__builtin_isnan (d))
> d = __builtin_copysignf128 (0, d);
> recalc = 1;
> }
> if (__builtin_isinf (c) || __builtin_isinf (d))
> {
>
> c = __builtin_copysignf128 (__builtin_isinf (c) ? 1 : 0, c);
> d = __builtin_copysignf128 (__builtin_isinf (d) ? 1 : 0, d);
> if (__builtin_isnan (a))
> a = __builtin_copysignf128 (0, a);
> if (__builtin_isnan (b))
> b = __builtin_copysignf128 (0, b);
> recalc = 1;
> }
> if (!recalc
> && (__builtin_isinf (ac) || __builtin_isinf (bd)
> || __builtin_isinf (ad) || __builtin_isinf (bc)))
> {
>
> if (__builtin_isnan (a))
> a = __builtin_copysignf128 (0, a);
> if (__builtin_isnan (b))
> b = __builtin_copysignf128 (0, b);
> if (__builtin_isnan (c))
> c = __builtin_copysignf128 (0, c);
> if (__builtin_isnan (d))
> d = __builtin_copysignf128 (0, d);
> recalc = 1;
> }
> if (recalc)
> {
> x = __builtin_inff128 () * (a * c - b * d);
> y = __builtin_inff128 () * (a * d + b * c);
> }
> }
>
> __real__ res = x;
> __imag__ res = y;
> return res;
> }
>
One further reduced test case can be:
typedef float TFtype __attribute__((mode (TF)));
TFtype test (TFtype t)
{
return __builtin_copysignf128 (1.0q, t);
}
Since this reduced test case is quite small, maybe it's good to make it as one
test case associated with this patch.
> Currently GCC uses the long double type node for __float128 if long double is
> IEEE 128-bit. It did not use the node for _Float128.
>
> Originally this was noticed if you call the nansq function to make a signaling
> NaN (nansq is mapped to nansf128). Because the type node for _Float128 is
> different from __float128, the machine independent code converts signaling NaNs
> to quiet NaNs if the types are not compatible. The following tests used to
> fail when run on a system where long double is IEEE 128-bit:
>
> gcc.dg/torture/float128-nan.c
> gcc.target/powerpc/nan128-1.c
>
> This patch makes both __float128 and _Float128 use the same type node.
>
> One side effect of not using the long double type node for __float128 is that we
> must only use KFmode for _Float128/__float128. The libstdc++ library won't
> build if we use TFmode for _Float128 and __float128 when long double is IEEE
> 128-bit.
>
Sorry that I didn't get the point of the latter sentence, this proposed patch
uses KFmode for _Float128 and __float128, do you mean that would be fine for
libstdc++ library building since we don't use TFmode for them?
> Another minor side effect is that the f128 round to odd fused multiply-add
> function will not merge negatition with the FMA operation when the type is long
> double. If the type is __float128 or _Float128, then it will continue to do the
> optimization. The round to odd functions are defined in terms of __float128
> arguments. For example:
>
> long double
> do_fms (long double a, long double b, long double c)
> {
> return __builtin_fmaf128_round_to_odd (a, b, -c);
> }
>
> will generate (assuming -mabi=ieeelongdouble):
>
> xsnegqp 4,4
> xsmaddqpo 4,2,3
> xxlor 34,36,36
>
> while:
>
> __float128
> do_fms (__float128 a, __float128 b, __float128 c)
> {
> return __builtin_fmaf128_round_to_odd (a, b, -c);
> }
>
> will generate:
>
> xsmsubqpo 4,2,3
> xxlor 34,36,36
>
It sounds like a tiny "regression", maybe we should open a PR to track it?
> I tested all 3 patchs for PR target/107299 on:
>
> 1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
> 2) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
> 3) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm
> 4) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm
>
> Once all 3 patches have been applied, we can once again build GCC when long
> double is IEEE 128-bit. There were no other regressions with these patches.
> Can I check these patches into the trunk?
>
> 2022-11-01 Michael Meissner <meissner@linux.ibm.com>
>
> gcc/
>
> PR target/107299
> * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Always use the
> _Float128 type for __float128.
> (rs6000_expand_builtin): Only change a KFmode built-in to TFmode, if the
> built-in passes or returns TFmode. If the predicate failed because the
> modes were different, use convert_move to load up the value instead of
> copy_to_mode_reg.
> * config/rs6000/rs6000.cc (rs6000_translate_mode_attribute): Don't
> translate IEEE 128-bit floating point modes to explicit IEEE 128-bit
> modes (KFmode or KCmode), even if long double is IEEE 128-bit.
You meant to say "Translate .." not "Don't .."?
> (rs6000_libgcc_floating_mode_supported_p): Support KFmode all of the
> time if we support IEEE 128-bit floating point.
> (rs6000_floatn_mode): _Float128 and _Float128x always uses KFmode.
>
> gcc/testsuite/
>
> PR target/107299
> * gcc.target/powerpc/float128-hw12.c: New test.
> * gcc.target/powerpc/float128-hw13.c: Likewise.
> * gcc.target/powerpc/float128-hw4.c: Update insns.
> ---
> gcc/config/rs6000/rs6000-builtin.cc | 237 ++++++++++--------
> gcc/config/rs6000/rs6000.cc | 31 ++-
> .../gcc.target/powerpc/float128-hw12.c | 137 ++++++++++
> .../gcc.target/powerpc/float128-hw13.c | 137 ++++++++++
> .../gcc.target/powerpc/float128-hw4.c | 10 +-
> 5 files changed, 431 insertions(+), 121 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-hw12.c
> create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-hw13.c
>
> diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
> index 90ab39dc258..e5298f45363 100644
> --- a/gcc/config/rs6000/rs6000-builtin.cc
> +++ b/gcc/config/rs6000/rs6000-builtin.cc
> @@ -730,25 +730,28 @@ rs6000_init_builtins (void)
>
> if (TARGET_FLOAT128_TYPE)
> {
> - if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128)
> - ieee128_float_type_node = long_double_type_node;
> - else
> + /* In the past we used long_double_type_node when long double was IEEE
> + 128-bit. However, this means that the _Float128 type
> + (i.e. float128_type_node) is a different type from __float128
> + (i.e. ieee128_float_type_nonde). This leads to some corner cases,
~~~~ node
> + such as processing signaling NaNs with the nansf128 built-in function
> + (which returns a _Float128 value) and assign it to a long double or
> + __float128 value. The two explicit IEEE 128-bit types should always
> + use the same internal type.
> +
> + For C we only need to register the __ieee128 name for it. For C++, we
> + create a distinct type which will mangle differently (u9__ieee128)
> + vs. _Float128 (DF128_) and behave backwards compatibly. */
> + if (float128t_type_node == NULL_TREE)
> {
> - /* For C we only need to register the __ieee128 name for
> - it. For C++, we create a distinct type which will mangle
> - differently (u9__ieee128) vs. _Float128 (DF128_) and behave
> - backwards compatibly. */
> - if (float128t_type_node == NULL_TREE)
> - {
> - float128t_type_node = make_node (REAL_TYPE);
> - TYPE_PRECISION (float128t_type_node)
> - = TYPE_PRECISION (float128_type_node);
> - layout_type (float128t_type_node);
> - SET_TYPE_MODE (float128t_type_node,
> - TYPE_MODE (float128_type_node));
> - }
> - ieee128_float_type_node = float128t_type_node;
> + float128t_type_node = make_node (REAL_TYPE);
> + TYPE_PRECISION (float128t_type_node)
> + = TYPE_PRECISION (float128_type_node);
> + layout_type (float128t_type_node);
> + SET_TYPE_MODE (float128t_type_node,
> + TYPE_MODE (float128_type_node));
> }
> + ieee128_float_type_node = float128t_type_node;
> t = build_qualified_type (ieee128_float_type_node, TYPE_QUAL_CONST);
> lang_hooks.types.register_builtin_type (ieee128_float_type_node,
> "__ieee128");
> @@ -3265,13 +3268,13 @@ htm_expand_builtin (bifdata *bifaddr, rs6000_gen_builtins fcode,
>
> /* Expand an expression EXP that calls a built-in function,
> with result going to TARGET if that's convenient
> - (and in mode MODE if that's convenient).
> + (and in mode RETURN_MODE if that's convenient).
> SUBTARGET may be used as the target for computing one of EXP's operands.
> IGNORE is nonzero if the value is to be ignored.
> Use the new builtin infrastructure. */
> rtx
> rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
> - machine_mode /* mode */, int ignore)
> + machine_mode return_mode, int ignore)
> {
> tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
> enum rs6000_gen_builtins fcode
> @@ -3287,78 +3290,99 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
> size_t uns_fcode = (size_t)fcode;
> enum insn_code icode = rs6000_builtin_info[uns_fcode].icode;
>
> - /* TODO: The following commentary and code is inherited from the original
> - builtin processing code. The commentary is a bit confusing, with the
> - intent being that KFmode is always IEEE-128, IFmode is always IBM
> - double-double, and TFmode is the current long double. The code is
> - confusing in that it converts from KFmode to TFmode pattern names,
> - when the other direction is more intuitive. Try to address this. */
> -
> - /* We have two different modes (KFmode, TFmode) that are the IEEE
> - 128-bit floating point type, depending on whether long double is the
> - IBM extended double (KFmode) or long double is IEEE 128-bit (TFmode).
> - It is simpler if we only define one variant of the built-in function,
> - and switch the code when defining it, rather than defining two built-
> - ins and using the overload table in rs6000-c.cc to switch between the
> - two. If we don't have the proper assembler, don't do this switch
> - because CODE_FOR_*kf* and CODE_FOR_*tf* will be CODE_FOR_nothing. */
> - if (FLOAT128_IEEE_P (TFmode))
> - switch (icode)
> - {
> - case CODE_FOR_sqrtkf2_odd:
> - icode = CODE_FOR_sqrttf2_odd;
> - break;
> - case CODE_FOR_trunckfdf2_odd:
> - icode = CODE_FOR_trunctfdf2_odd;
> - break;
> - case CODE_FOR_addkf3_odd:
> - icode = CODE_FOR_addtf3_odd;
> - break;
> - case CODE_FOR_subkf3_odd:
> - icode = CODE_FOR_subtf3_odd;
> - break;
> - case CODE_FOR_mulkf3_odd:
> - icode = CODE_FOR_multf3_odd;
> - break;
> - case CODE_FOR_divkf3_odd:
> - icode = CODE_FOR_divtf3_odd;
> - break;
> - case CODE_FOR_fmakf4_odd:
> - icode = CODE_FOR_fmatf4_odd;
> - break;
> - case CODE_FOR_xsxexpqp_kf:
> - icode = CODE_FOR_xsxexpqp_tf;
> - break;
> - case CODE_FOR_xsxsigqp_kf:
> - icode = CODE_FOR_xsxsigqp_tf;
> - break;
> - case CODE_FOR_xststdcnegqp_kf:
> - icode = CODE_FOR_xststdcnegqp_tf;
> - break;
> - case CODE_FOR_xsiexpqp_kf:
> - icode = CODE_FOR_xsiexpqp_tf;
> - break;
> - case CODE_FOR_xsiexpqpf_kf:
> - icode = CODE_FOR_xsiexpqpf_tf;
> - break;
> - case CODE_FOR_xststdcqp_kf:
> - icode = CODE_FOR_xststdcqp_tf;
> - break;
> - case CODE_FOR_xscmpexpqp_eq_kf:
> - icode = CODE_FOR_xscmpexpqp_eq_tf;
> - break;
> - case CODE_FOR_xscmpexpqp_lt_kf:
> - icode = CODE_FOR_xscmpexpqp_lt_tf;
> - break;
> - case CODE_FOR_xscmpexpqp_gt_kf:
> - icode = CODE_FOR_xscmpexpqp_gt_tf;
> - break;
> - case CODE_FOR_xscmpexpqp_unordered_kf:
> - icode = CODE_FOR_xscmpexpqp_unordered_tf;
> - break;
> - default:
> - break;
> - }
> + /* For 128-bit long double, we may need both the KFmode built-in functions
> + and IFmode built-in functions to the equivalent TFmode built-in function,
> + if either a TFmode result is expected or any of the arguments use
> + TFmode. */
> + if (TARGET_LONG_DOUBLE_128)
> + {
> + bool uses_tf_mode = return_mode == TFmode;
> + if (!uses_tf_mode)
> + {
> + call_expr_arg_iterator iter;
> + tree arg;
> + FOR_EACH_CALL_EXPR_ARG (arg, iter, exp)
> + {
> + if (arg != error_mark_node
> + && TYPE_MODE (TREE_TYPE (arg)) == TFmode)
> + {
> + uses_tf_mode = true;
> + break;
> + }
> + }
> + }
> +
> + /* Convert KFmode built-in functions to TFmode when long double is IEEE
> + 128-bit. */
> + if (uses_tf_mode && FLOAT128_IEEE_P (TFmode))
> + switch (icode)
> + {
> + case CODE_FOR_sqrtkf2_odd:
> + icode = CODE_FOR_sqrttf2_odd;
> + break;
> + case CODE_FOR_trunckfdf2_odd:
> + icode = CODE_FOR_trunctfdf2_odd;
> + break;
> + case CODE_FOR_addkf3_odd:
> + icode = CODE_FOR_addtf3_odd;
> + break;
> + case CODE_FOR_subkf3_odd:
> + icode = CODE_FOR_subtf3_odd;
> + break;
> + case CODE_FOR_mulkf3_odd:
> + icode = CODE_FOR_multf3_odd;
> + break;
> + case CODE_FOR_divkf3_odd:
> + icode = CODE_FOR_divtf3_odd;
> + break;
> + case CODE_FOR_fmakf4_odd:
> + icode = CODE_FOR_fmatf4_odd;
> + break;
> + case CODE_FOR_xsxexpqp_kf:
> + icode = CODE_FOR_xsxexpqp_tf;
> + break;
> + case CODE_FOR_xsxsigqp_kf:
> + icode = CODE_FOR_xsxsigqp_tf;
> + break;
> + case CODE_FOR_xststdcnegqp_kf:
> + icode = CODE_FOR_xststdcnegqp_tf;
> + break;
> + case CODE_FOR_xsiexpqp_kf:
> + icode = CODE_FOR_xsiexpqp_tf;
> + break;
> + case CODE_FOR_xsiexpqpf_kf:
> + icode = CODE_FOR_xsiexpqpf_tf;
> + break;
> + case CODE_FOR_xststdcqp_kf:
> + icode = CODE_FOR_xststdcqp_tf;
> + break;
> + case CODE_FOR_xscmpexpqp_eq_kf:
> + icode = CODE_FOR_xscmpexpqp_eq_tf;
> + break;
> + case CODE_FOR_xscmpexpqp_lt_kf:
> + icode = CODE_FOR_xscmpexpqp_lt_tf;
> + break;
> + case CODE_FOR_xscmpexpqp_gt_kf:
> + icode = CODE_FOR_xscmpexpqp_gt_tf;
> + break;
> + case CODE_FOR_xscmpexpqp_unordered_kf:
> + icode = CODE_FOR_xscmpexpqp_unordered_tf;
> + break;
> + default:
> + break;
> + }
> +
> + /* Convert IFmode built-in functions to TFmode when long double is IBM
> + 128-bit. */
> + else if (uses_tf_mode && FLOAT128_IBM_P (TFmode))
> + {
> + if (icode == CODE_FOR_packif)
> + icode = CODE_FOR_packtf;
> +
> + else if (icode == CODE_FOR_unpackif)
> + icode = CODE_FOR_unpacktf;
> + }
> + }
>
> /* In case of "#pragma target" changes, we initialize all builtins
> but check for actual availability now, during expand time. For
> @@ -3481,18 +3505,6 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
>
> if (bif_is_ibm128 (*bifaddr) && TARGET_LONG_DOUBLE_128 && !TARGET_IEEEQUAD)
> {
> - if (fcode == RS6000_BIF_PACK_IF)
> - {
> - icode = CODE_FOR_packtf;
> - fcode = RS6000_BIF_PACK_TF;
> - uns_fcode = (size_t) fcode;
> - }
> - else if (fcode == RS6000_BIF_UNPACK_IF)
> - {
> - icode = CODE_FOR_unpacktf;
> - fcode = RS6000_BIF_UNPACK_TF;
> - uns_fcode = (size_t) fcode;
> - }
> }
This remaining "if" hunk is useless. The others looks good to me.
BR,
Kewen
next prev parent reply other threads:[~2022-12-06 11:27 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-02 2:39 Patch [0/3] for PR target/107299 (GCC does not build on PowerPC when long double is IEEE 128-bit) Michael Meissner
2022-11-02 2:40 ` [PATCH 1/3] Rework 128-bit complex multiply and divide, PR target/107299 Michael Meissner
2022-11-07 15:41 ` Ping: " Michael Meissner
2022-11-29 17:43 ` Ping #2: " Michael Meissner
2022-12-02 17:58 ` Ping #3: " Michael Meissner
2022-12-06 9:36 ` Kewen.Lin
2022-12-07 6:44 ` Michael Meissner
2022-12-07 7:55 ` Kewen.Lin
2022-12-08 22:04 ` Michael Meissner
2022-12-12 10:20 ` Kewen.Lin
2022-12-13 6:14 ` Michael Meissner
2022-12-13 13:51 ` Segher Boessenkool
2022-12-14 8:45 ` Kewen.Lin
2022-12-13 6:23 ` Michael Meissner
2022-11-02 2:42 ` [PATCH 2/3] Make __float128 use the _Float128 type, " Michael Meissner
2022-11-07 15:43 ` Ping: " Michael Meissner
2022-11-29 17:44 ` Michael Meissner
2022-12-02 18:01 ` Ping #3: " Michael Meissner
2022-12-06 11:27 ` Kewen.Lin [this message]
2022-12-14 8:46 ` Kewen.Lin
2022-12-14 9:36 ` Jakub Jelinek
2022-12-14 10:11 ` Kewen.Lin
2022-12-14 10:33 ` Jakub Jelinek
2022-12-15 7:54 ` Kewen.Lin
2022-12-15 7:45 ` Kewen.Lin
2022-12-15 18:28 ` Joseph Myers
2022-12-15 18:49 ` Segher Boessenkool
2022-12-15 18:56 ` Jakub Jelinek
2022-12-15 20:26 ` Segher Boessenkool
2022-12-15 17:59 ` Segher Boessenkool
2022-12-16 0:09 ` Michael Meissner
2022-12-16 17:55 ` Segher Boessenkool
2022-12-16 21:53 ` Michael Meissner
2023-01-11 20:24 ` Michael Meissner
2022-11-02 2:44 ` [PATCH 3/3] Update float 128-bit conversions, " Michael Meissner
2022-11-07 15:44 ` Ping: " Michael Meissner
2022-11-29 17:46 ` Ping #3: " Michael Meissner
2022-12-02 18:04 ` Michael Meissner
2022-12-06 14:56 ` Patch [0/3] for PR target/107299 (GCC does not build on PowerPC when long double is IEEE 128-bit) Segher Boessenkool
2022-12-06 15:03 ` Jakub Jelinek
2022-12-13 14:11 ` Segher Boessenkool
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e96ce199-348d-7446-102c-8d2b037a7286@linux.ibm.com \
--to=linkw@linux.ibm.com \
--cc=bergner@linux.ibm.com \
--cc=dje.gcc@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=meissner@linux.ibm.com \
--cc=segher@kernel.crashing.org \
--cc=seurer@gcc.gnu.org \
--cc=will_schmidt@vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).