From: Michael Meissner
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/work103)] Update ChangeLog.meissner.
X-Act-Checkin: gcc
X-Git-Author: Michael Meissner
X-Git-Refname: refs/users/meissner/heads/work103
X-Git-Oldrev: 1e144fbe7d85e66a71e2ecc6bd606a7852580698
X-Git-Newrev: a6a8b31decfeb059b661db6e7815d82b4e47b045
Message-Id: <20221213232731.827A1384C357@sourceware.org>
Date: Tue, 13 Dec 2022 23:27:31 +0000 (GMT)

https://gcc.gnu.org/g:a6a8b31decfeb059b661db6e7815d82b4e47b045

commit a6a8b31decfeb059b661db6e7815d82b4e47b045
Author: Michael Meissner
Date:   Tue Dec 13 18:27:12 2022 -0500

    Update ChangeLog.meissner.

    2022-12-13 Michael Meissner

    gcc/
            * ChangeLog.meissner: Update.

Diff:
---
 gcc/ChangeLog.meissner | 323 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 323 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 62c29740393..f850b740e4f 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,3 +1,326 @@
+==================== work103, patch #3
+
+Update float 128-bit conversions, PR target/107299.
+
+This patch fixes two tests that are still failing when long double is IEEE
+128-bit after the previous 2 patches for PR target/107299 have been applied.
+The tests are:
+
+    gcc.target/powerpc/convert-fp-128.c
+    gcc.target/powerpc/pr85657-3.c
+
+This patch is a rewrite of the patch submitted on August 18th:
+
+| https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599988.html
+
+This patch reworks the conversions between 128-bit binary floating point types.
+Previously, we would call rs6000_expand_float128_convert to do all conversions.
+Now, we only define the conversions between the same representation that turn
+into a NOP. The appropriate extend or truncate insn is generated, and after
+register allocation, it is converted to a move.
+
+This patch also fixes two places where we want to override the external name
+for the conversion function, and the wrong optab was used. Previously,
+rs6000_expand_float128_convert would handle the move or generate the call as
+needed. Now, it lets the machine independent code generate the call. But if
+we use the machine independent code to generate the call, we need to update the
+name for two optabs where a truncate would be used in terms of converting
+between the modes. This patch updates those two optabs.
+
+I tested this patch on:
+
+    1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
+    2) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
+    3) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm
+    4) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm
+
+In the past I have also tested this exact patch on the following systems:
+
+    1) LE Power10 using --with-cpu=power9 --with-long-double-format=ibm
+    2) LE Power10 using --with-cpu=power8 --with-long-double-format=ibm
+    3) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
+
+There were no regressions in the bootstrap process or running the tests (after
+applying all 3 patches for PR target/107299). Can I check this patch into the
+trunk?
+
+2022-12-14 Michael Meissner
+
+gcc/
+
+	PR target/107299
+	* config/rs6000/rs6000.cc (init_float128_ieee): Use the correct
+	float_extend or float_truncate optab based on how the machine converts
+	between IEEE 128-bit and IBM 128-bit.
+	* config/rs6000/rs6000.md (IFKF): Delete.
+	(IFKF_reg): Delete.
+	(extendiftf2): Rewrite to be a move if IFmode and TFmode are both IBM
+	128-bit. Do not run if TFmode is IEEE 128-bit.
+	(extendifkf2): Delete.
+	(extendtfkf2): Delete.
+	(extendtfif2): Delete.
+	(trunciftf2): Delete.
+	(truncifkf2): Delete.
+	(trunckftf2): Delete.
+	(extendkftf2): Implement conversion of IEEE 128-bit types as a move.
+	(trunctfif2): Delete.
+	(trunctfkf2): Implement conversion of IEEE 128-bit types as a move.
+	(extendtf2_internal): Delete.
+	(extendtf2_internal): Delete.
+
+==================== work103, patch #2
+
+Make __float128 use the _Float128 type, PR target/107299.
+
+This patch fixes the issue that GCC cannot build when the default long double
+is IEEE 128-bit. It fails in building libgcc, specifically when it is trying
+to build the __mulkc3 function in libgcc. It is failing in gimple-range-fold.cc
+during the evrp pass. Ultimately it is failing because the code declared the
+internal type for one IEEE 128-bit floating point type, and NaN functions use a
+different IEEE 128-bit floating point type.
+
+Gimple-range-fold uses the internal types, but there are similar problems when
+the code is converted to RTL and the two different modes (KFmode, TFmode) are
+used.
+
+    typedef float TFtype __attribute__((mode (TF)));
+    typedef __complex float TCtype __attribute__((mode (TC)));
+
+    TCtype
+    __mulkc3_sw (TFtype a, TFtype b, TFtype c, TFtype d)
+    {
+      TFtype ac, bd, ad, bc, x, y;
+      TCtype res;
+
+      ac = a * c;
+      bd = b * d;
+      ad = a * d;
+      bc = b * c;
+
+      x = ac - bd;
+      y = ad + bc;
+
+      if (__builtin_isnan (x) && __builtin_isnan (y))
+        {
+          _Bool recalc = 0;
+          if (__builtin_isinf (a) || __builtin_isinf (b))
+            {
+
+              a = __builtin_copysignf128 (__builtin_isinf (a) ? 1 : 0, a);
+              b = __builtin_copysignf128 (__builtin_isinf (b) ? 1 : 0, b);
+              if (__builtin_isnan (c))
+                c = __builtin_copysignf128 (0, c);
+              if (__builtin_isnan (d))
+                d = __builtin_copysignf128 (0, d);
+              recalc = 1;
+            }
+          if (__builtin_isinf (c) || __builtin_isinf (d))
+            {
+
+              c = __builtin_copysignf128 (__builtin_isinf (c) ? 1 : 0, c);
+              d = __builtin_copysignf128 (__builtin_isinf (d) ? 1 : 0, d);
+              if (__builtin_isnan (a))
+                a = __builtin_copysignf128 (0, a);
+              if (__builtin_isnan (b))
+                b = __builtin_copysignf128 (0, b);
+              recalc = 1;
+            }
+          if (!recalc
+              && (__builtin_isinf (ac) || __builtin_isinf (bd)
+                  || __builtin_isinf (ad) || __builtin_isinf (bc)))
+            {
+
+              if (__builtin_isnan (a))
+                a = __builtin_copysignf128 (0, a);
+              if (__builtin_isnan (b))
+                b = __builtin_copysignf128 (0, b);
+              if (__builtin_isnan (c))
+                c = __builtin_copysignf128 (0, c);
+              if (__builtin_isnan (d))
+                d = __builtin_copysignf128 (0, d);
+              recalc = 1;
+            }
+          if (recalc)
+            {
+              x = __builtin_inff128 () * (a * c - b * d);
+              y = __builtin_inff128 () * (a * d + b * c);
+            }
+        }
+
+      __real__ res = x;
+      __imag__ res = y;
+      return res;
+    }
+
+Currently GCC uses the long double type node for __float128 if long double is
+IEEE 128-bit. It did not use the node for _Float128.
+
+Originally this was noticed if you call the nansq function to make a signaling
+NaN (nansq is mapped to nansf128). Because the type node for _Float128 is
+different from __float128, the machine independent code converts signaling NaNs
+to quiet NaNs if the types are not compatible. The following tests used to
+fail when run on a system where long double is IEEE 128-bit:
+
+    gcc.dg/torture/float128-nan.c
+    gcc.target/powerpc/nan128-1.c
+
+This patch makes both __float128 and _Float128 use the same type node.
+
+One side effect of not using the long double type node for __float128 is that we
+must only use KFmode for _Float128/__float128. The libstdc++ library won't
+build if we use TFmode for _Float128 and __float128 when long double is IEEE
+128-bit.
+
+Another minor side effect is that the f128 round to odd fused multiply-add
+function will not merge negation with the FMA operation when the type is long
+double. If the type is __float128 or _Float128, then it will continue to do the
+optimization. The round to odd functions are defined in terms of __float128
+arguments. For example:
+
+    long double
+    do_fms (long double a, long double b, long double c)
+    {
+      return __builtin_fmaf128_round_to_odd (a, b, -c);
+    }
+
+will generate (assuming -mabi=ieeelongdouble):
+
+    xsnegqp 4,4
+    xsmaddqpo 4,2,3
+    xxlor 34,36,36
+
+while:
+
+    __float128
+    do_fms (__float128 a, __float128 b, __float128 c)
+    {
+      return __builtin_fmaf128_round_to_odd (a, b, -c);
+    }
+
+will generate:
+
+    xsmsubqpo 4,2,3
+    xxlor 34,36,36
+
+Assuming this patch goes in, we can open a bug about the above optimizations not
+working. However, given that the functions are explicitly documented to use
+__float128 types, and the code in the test is using long double, I don't think
+it is a high priority issue. The user should use the documented types.
+
+I did experiment to do the support, and to do it properly, you need a bunch of
+insns used by the combiner to deal with combining the conversion from TFmode to
+KFmode along with the optimization in order to eventually combine the multiply
+and add/subtract operations into a separate FMA.
+
+I tested all 3 patches for PR target/107299 on:
+
+    1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
+    2) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
+    3) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm
+    4) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm
+
+Once all 3 patches have been applied, we can once again build GCC when long
+double is IEEE 128-bit. There were no other regressions with these patches.
+Can I check these patches into the trunk?
+
+2022-12-14 Michael Meissner
+
+gcc/
+
+	PR target/107299
+	* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Always use the
+	_Float128 type for __float128.
+	(rs6000_expand_builtin): Only change a KFmode built-in to TFmode, if the
+	built-in passes or returns TFmode. If the predicate failed because the
+	modes were different, use convert_move to load up the value instead of
+	copy_to_mode_reg.
+	* config/rs6000/rs6000.cc (rs6000_translate_mode_attribute): Don't
+	translate IEEE 128-bit floating point modes to explicit IEEE 128-bit
+	modes (KFmode or KCmode), even if long double is IEEE 128-bit.
+	(rs6000_libgcc_floating_mode_supported_p): Support KFmode all of the
+	time if we support IEEE 128-bit floating point.
+	(rs6000_floatn_mode): _Float128 and _Float128x always use KFmode.
+
+gcc/testsuite/
+
+	PR target/107299
+	* gcc.target/powerpc/float128-hw12.c: New test.
+	* gcc.target/powerpc/float128-hw13.c: Likewise.
+	* gcc.target/powerpc/float128-hw4.c: Update insns.
+
+==================== work103, patch #1
+
+Rework 128-bit complex multiply and divide.
+
+This patch reworks how the complex multiply and divide built-in functions are
+done. Previously we created built-in declarations for doing long double complex
+multiply and divide when long double is IEEE 128-bit. The old code also did not
+support __ibm128 complex multiply and divide if long double is IEEE 128-bit.
+
+In terms of history, I wrote the original code just as I was starting to test
+GCC on systems where IEEE 128-bit long double was the default. At the time, we
+had not yet started mangling the built-in function names as a way to bridge
+going from a system with 128-bit IBM long double to 128-bit IEEE long double.
+
+The original code depends on there only being two 128-bit types involved. With
+the next patch in this series, this assumption will no longer be true. When
+long double is IEEE 128-bit, there will be 2 IEEE 128-bit types (one for the
+explicit __float128/_Float128 type and one for long double).
+
+The problem is we cannot create two separate built-in functions that resolve to
+the same name. This is a requirement of add_builtin_function and the C front
+end. That means for the 3 possible modes (IFmode, KFmode, and TFmode), you can
+only use 2 of them.
+
+This code does not create the built-in declaration with the changed name.
+Instead, it uses the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the name
+before it is written out to the assembler file like it now does for all of the
+other long double built-in functions.
+
+When I wrote these patches, I discovered that __ibm128 complex multiply and
+divide had originally not been supported if long double is IEEE 128-bit as it
+would generate calls to __mulic3 and __divic3. I added tests in the testsuite
+to verify that the correct name (i.e. __multc3 and __divtc3) is used in this
+case.
+
+I had previously sent this patch out on November 1st. Compared to that version,
+this version no longer disables the special mapping when you are building
+libgcc, as it turns out we don't need it.
+
+I tested all 3 patches for PR target/107299 on:
+
+    1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
+    2) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
+    3) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm
+    4) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm
+
+Once all 3 patches have been applied, we can once again build GCC when long
+double is IEEE 128-bit. There were no other regressions with these patches.
+Can I check these patches into the trunk?
+
+2022-12-14 Michael Meissner
+
+gcc/
+
+	PR target/107299
+	* config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
+	(init_float128_ieee): Delete code to switch complex multiply and divide
+	for long double.
+	(complex_multiply_builtin_code): New helper function.
+	(complex_divide_builtin_code): Likewise.
+	(rs6000_mangle_decl_assembler_name): Add support for mangling the name
+	of complex 128-bit multiply and divide built-in functions.
+
+gcc/testsuite/
+
+	PR target/107299
+	* gcc.target/powerpc/divic3-1.c: New test.
+	* gcc.target/powerpc/divic3-2.c: Likewise.
+	* gcc.target/powerpc/mulic3-1.c: Likewise.
+	* gcc.target/powerpc/mulic3-2.c: Likewise.
+
+==================== work103, clone branch
+
 2022-12-13 Michael Meissner
 
 	Clone branch
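
The renaming described for patch #1 (changing the name just before it is written to the assembler file, rather than declaring a second built-in) can be pictured with a small, self-contained sketch. This is not the code in rs6000.cc: the helper name sketch_ibm128_complex_name and its string-based interface are assumptions made only for illustration. It models just the one case the description spells out, where __ibm128 complex multiply and divide should end up calling __multc3 and __divtc3 instead of __mulic3 and __divic3 when long double is IEEE 128-bit.

```c
#include <string.h>

/* Illustrative only; NOT GCC's implementation.  The function name and
   the string-in/string-out interface are hypothetical.  The sketch
   models the late renaming step for __ibm128 complex multiply and
   divide when long double is IEEE 128-bit: the names derived from
   IFmode are replaced by the names the patch description calls
   correct.  */
const char *
sketch_ibm128_complex_name (const char *name)
{
  if (strcmp (name, "__mulic3") == 0)
    return "__multc3";

  if (strcmp (name, "__divic3") == 0)
    return "__divtc3";

  /* Every other name is passed through unchanged.  */
  return name;
}
```

Because the mapping is applied to the emitted name only, a sketch like this never needs two declarations that resolve to the same name, which is the restriction the description attributes to add_builtin_function and the C front end.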