public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/5] IEEE 128-bit built-in overload support.
@ 2022-07-28  4:43 Michael Meissner
  2022-07-28  4:47 ` [PATCH 1/5] " Michael Meissner
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Michael Meissner @ 2022-07-28  4:43 UTC (permalink / raw)
  To: gcc-patches, Michael Meissner, Segher Boessenkool, Kewen.Lin,
	David Edelsohn, Peter Bergner, Will Schmidt

The following patches add support for overloading built-in functions between
the two 128-bit IEEE types (i.e. _Float128/__float128, which uses KFmode, and
long double, which uses TFmode when it has the IEEE 128-bit encoding).

These patches lay the foundation for a set of follow-on patches that will
change the internal handling of 128-bit floating point types in GCC.  In the
future patches, I hope to change the compiler to always use KFmode for the
explicit _Float128/__float128 types, to always use TFmode for the long double
type, no matter which 128-bit floating point type is used, and IFmode for the
explicit __ibm128 type.

But before I can submit those patches to change the internal type structure, I
need to make sure that the built-in functions can handle both sets of types,
and the overload mechanism automatically switches between the two.
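
For illustration only (this sketch is not part of the series): from the user's
side, the goal is that one built-in name accepts either 128-bit IEEE type.  The
example assumes that the predefined macros __powerpc64__,
__FLOAT128_HARDWARE__ and __LONG_DOUBLE_IEEE128__ are suitable guards; the
wrapper names add_ld and add_f128 are made up.  On other targets it falls back
to ordinary addition, which follows the current rounding mode rather than
round to odd.

```c
#include <assert.h>

/* Assumption: these predefined macros identify a PowerPC target that has
   IEEE 128-bit hardware and whose long double is IEEE 128-bit.  */
#if defined(__powerpc64__) && defined(__FLOAT128_HARDWARE__) \
    && defined(__LONG_DOUBLE_IEEE128__)
#define HAVE_F128_ROUND_TO_ODD 1
#else
#define HAVE_F128_ROUND_TO_ODD 0
#endif

/* Hypothetical wrapper: long double arguments are meant to resolve to the
   TFmode instance of the overloaded built-in.  */
long double add_ld (long double a, long double b)
{
#if HAVE_F128_ROUND_TO_ODD
  return __builtin_addf128_round_to_odd (a, b);
#else
  return a + b;	/* Portable fallback, NOT round to odd.  */
#endif
}

#if HAVE_F128_ROUND_TO_ODD
/* Hypothetical wrapper: _Float128 arguments are meant to resolve to the
   KFmode instance of the same built-in name.  */
_Float128 add_f128 (_Float128 a, _Float128 b)
{
  return __builtin_addf128_round_to_odd (a, b);
}
#endif

/* Small self-check: an exactly representable sum is the same under any
   rounding mode, so it holds on both paths.  */
void check_add_ld (void)
{
  assert (add_ld (1.0L, 2.0L) == 3.0L);
}
```

The check uses an exact sum on purpose, so it does not depend on which path
was compiled.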

There are 5 patches in the series.

The first patch adds the infrastructure to the built-in mechanism to deal with
long doubles that use the IEEE 128-bit encoding.

The second patch adds overload support to the IEEE 128-bit round to odd
built-in functions.

The third patch adds overload support to the IEEE 128-bit comparison built-in
functions.

The fourth patch adds overload support to the IEEE 128-bit scalar extract field
and insert field built-in functions.

The fifth patch adds overload support to the IEEE 128-bit test data and test
data negate built-in functions.

I have tested these patches on a power10 that is running Fedora 36, which
defaults to using long doubles that are IEEE 128-bit.  I have built two
parallel GCC compilers, one that defaults to using IEEE 128-bit long doubles
and one that defaults to using IBM 128-bit long doubles.

I have compared the test results to the original compiler results, comparing a
modified GCC to the original compiler using an IEEE 128-bit long double
default, and also comparing a modified GCC to the original compiler using an
IBM 128-bit long double default.  In both cases, the results are the same.

I have also compared the compilers with the future patch in progress that does
switch the internal type handling.  Once those patches are installed, the
overload mechanism will ensure the correct built-in is used.

Can I install these patches to the trunk?

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com


* [PATCH 1/5] IEEE 128-bit built-in overload support.
  2022-07-28  4:43 [PATCH 0/5] IEEE 128-bit built-in overload support Michael Meissner
@ 2022-07-28  4:47 ` Michael Meissner
  2022-07-28  4:48 ` [PATCH 2/5] Support IEEE 128-bit overload round_to_odd built-in functions Michael Meissner
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Michael Meissner @ 2022-07-28  4:47 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, Kewen.Lin,
	David Edelsohn, Peter Bergner, Will Schmidt

[PATCH 1/5] IEEE 128-bit built-in overload support.

This patch lays the groundwork that future patches will use to add
builtin support (both normal and overloaded) for the case where long
double uses the IEEE 128-bit encoding.

This adds a new stanza (ieee128-hw-ld) for when we have IEEE 128-bit
hardware support and long double uses the IEEE 128-bit encoding.

A new type attribute (ieeeld) is added for long double if long double uses
the IEEE 128-bit encoding.
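
As a rough model of the two checks described above (a standalone sketch, not
the GCC source): the stanza check requires both the hardware and an IEEE
128-bit long double, and the attribute check rejects an ieeeld built-in when
long double is not IEEE 128-bit.  All model_* names and the model_target
struct are invented for this example; in GCC the conditions are
TARGET_FLOAT128_HW and FLOAT128_IEEE_P (TFmode).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for ENB_IEEE128_HW and ENB_IEEE128_HW_LD.  */
enum model_enable { MODEL_ENB_IEEE128_HW, MODEL_ENB_IEEE128_HW_LD };

/* Hypothetical stand-in for the target state.  */
struct model_target
{
  bool float128_hw;		/* models TARGET_FLOAT128_HW */
  bool long_double_is_ieee128;	/* models FLOAT128_IEEE_P (TFmode) */
};

/* Same bit value the patch assigns to bif_ieeeld_bit.  */
#define MODEL_BIF_IEEELD_BIT 0x01000000u

/* Is a built-in from the given stanza available on this target?  */
bool model_stanza_enabled (enum model_enable e, const struct model_target *t)
{
  switch (e)
    {
    case MODEL_ENB_IEEE128_HW:
      return t->float128_hw;
    case MODEL_ENB_IEEE128_HW_LD:
      return t->float128_hw && t->long_double_is_ieee128;
    }
  return false;
}

/* Does the ieeeld attribute allow expanding the built-in?  */
bool model_attr_ok (unsigned attrs, const struct model_target *t)
{
  return !((attrs & MODEL_BIF_IEEELD_BIT) && !t->long_double_is_ieee128);
}

/* Small self-check for a target with hardware but IBM long double.  */
void check_model_stanza (void)
{
  struct model_target ibm_ld = { true, false };
  assert (model_stanza_enabled (MODEL_ENB_IEEE128_HW, &ibm_ld));
  assert (!model_stanza_enabled (MODEL_ENB_IEEE128_HW_LD, &ibm_ld));
}
```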

I have tested these patches on a power10 that is running Fedora 36, which
defaults to using long doubles that are IEEE 128-bit.  I have built two
parallel GCC compilers, one that defaults to using IEEE 128-bit long doubles
and one that defaults to using IBM 128-bit long doubles.

I have compared the test results to the original compiler results, comparing a
modified GCC to the original compiler using an IEEE 128-bit long double
default, and also comparing a modified GCC to the original compiler using an
IBM 128-bit long double default.  In both cases, the results are the same.

I have also compared the compilers with the future patch in progress that does
switch the internal type handling.  Once those patches are installed, the
overload mechanism will ensure the correct built-in is used.

Can I install this patch to the trunk?

2022-07-27   Michael Meissner  <meissner@linux.ibm.com>

gcc/

	* config/rs6000/rs6000-builtin.cc (rs6000_invalid_builtin): Add
	support for ieee128-hw-ld stanza.
	(rs6000_builtin_is_supported): Likewise.
	(rs6000_init_builtins): Likewise.
	(rs6000_expand_builtin): Add support for IEEE128_HW_LD.  Add
	support for ieeeld.
	* config/rs6000/rs6000-builtins.def (toplevel): Add comment about
	the new ieeeld attribute.
	* config/rs6000/rs6000-gen-builtins.cc (enum bif_stanza): Add
	BSTZ_IEEE128_HW_LD.
	(stanza_map): Likewise.
	(enable_string): Likewise.
	(attrinfo): Add isieeeld.
	(parse_bif_attrs): Parse ieeeld.  Add printing ieeeld to the debug
	print.
	(write_decls): Add support for ieee128-hw-ld stanza and ieeeld
	attribute.
	(write_bif_static_init): Add support for ieeeld attribute.
---
 gcc/config/rs6000/rs6000-builtin.cc      | 18 ++++++++++++++++++
 gcc/config/rs6000/rs6000-builtins.def    |  1 +
 gcc/config/rs6000/rs6000-gen-builtins.cc | 18 ++++++++++++++++--
 3 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 2819773d9f9..67e86bee781 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -123,6 +123,10 @@ rs6000_invalid_builtin (enum rs6000_gen_builtins fncode)
     case ENB_IEEE128_HW:
       error ("%qs requires quad-precision floating-point arithmetic", name);
       break;
+    case ENB_IEEE128_HW_LD:
+      error ("%qs requires %qs to use IEEE quad-precision floating-point "
+	     "arithmetic", name, "long double");
+      break;
     case ENB_DFP:
       error ("%qs requires the %qs option", name, "-mhard-dfp");
       break;
@@ -189,6 +193,8 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins fncode)
       return TARGET_ALTIVEC && rs6000_cpu == PROCESSOR_CELL;
     case ENB_IEEE128_HW:
       return TARGET_FLOAT128_HW;
+    case ENB_IEEE128_HW_LD:
+      return TARGET_FLOAT128_HW && FLOAT128_IEEE_P (TFmode);
     case ENB_DFP:
       return TARGET_DFP;
     case ENB_CRYPTO:
@@ -857,6 +863,9 @@ rs6000_init_builtins (void)
 	    continue;
 	  if (e == ENB_IEEE128_HW && !TARGET_FLOAT128_HW)
 	    continue;
+	  if (e == ENB_IEEE128_HW_LD && (!TARGET_FLOAT128_HW
+					 || !FLOAT128_IEEE_P (TFmode)))
+	    continue;
 	  if (e == ENB_DFP && !TARGET_DFP)
 	    continue;
 	  if (e == ENB_CRYPTO && !TARGET_CRYPTO)
@@ -3387,6 +3396,8 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
 	|| (e == ENB_P9_64 && TARGET_MODULO && TARGET_POWERPC64)
 	|| (e == ENB_P9V && TARGET_P9_VECTOR)
 	|| (e == ENB_IEEE128_HW && TARGET_FLOAT128_HW)
+	|| (e == ENB_IEEE128_HW_LD && TARGET_FLOAT128_HW
+	    && FLOAT128_IEEE_P (TFmode))
 	|| (e == ENB_DFP && TARGET_DFP)
 	|| (e == ENB_CRYPTO && TARGET_CRYPTO)
 	|| (e == ENB_HTM && TARGET_HTM)
@@ -3426,6 +3437,13 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
       return const0_rtx;
     }
 
+  if (bif_is_ieeeld (*bifaddr) && !FLOAT128_IEEE_P (TFmode))
+    {
+      error ("%qs requires %<long double%> to be IEEE 128-bit format",
+	     bifaddr->bifname);
+      return const0_rtx;
+    }
+
   if (bif_is_cpu (*bifaddr))
     return cpu_expand_builtin (fcode, exp, target);
 
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f76f54793d7..defd7e25ffe 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -139,6 +139,7 @@
 ;   endian   Needs special handling for endianness
 ;   ibmld    Restrict usage to the case when TFmode is IBM-128
 ;   ibm128   Restrict usage to the case where __ibm128 is supported or if ibmld
+;   ieeeld   Restrict usage to the case when TFmode is IEEE-128
 ;
 ; Each attribute corresponds to extra processing required when
 ; the built-in is expanded.  All such special processing should
diff --git a/gcc/config/rs6000/rs6000-gen-builtins.cc b/gcc/config/rs6000/rs6000-gen-builtins.cc
index 0bd7a535e5f..b939e04c258 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.cc
+++ b/gcc/config/rs6000/rs6000-gen-builtins.cc
@@ -95,6 +95,7 @@ along with GCC; see the file COPYING3.  If not see
      ibmld    Restrict usage to the case when TFmode is IBM-128
      ibm128   Restrict usage to the case where __ibm128 is supported or
               if ibmld
+     ieeeld   Restrict usage to the case when TFmode is IEEE-128
 
    An example stanza might look like this:
 
@@ -227,6 +228,7 @@ enum bif_stanza
  BSTZ_P9_64,
  BSTZ_P9V,
  BSTZ_IEEE128_HW,
+ BSTZ_IEEE128_HW_LD,
  BSTZ_DFP,
  BSTZ_CRYPTO,
  BSTZ_HTM,
@@ -261,6 +263,7 @@ static stanza_entry stanza_map[NUMBIFSTANZAS] =
     { "power9-64",	BSTZ_P9_64	},
     { "power9-vector",	BSTZ_P9V	},
     { "ieee128-hw",	BSTZ_IEEE128_HW	},
+    { "ieee128-hw-ld",	BSTZ_IEEE128_HW_LD },
     { "dfp",		BSTZ_DFP	},
     { "crypto",		BSTZ_CRYPTO	},
     { "htm",		BSTZ_HTM	},
@@ -286,6 +289,7 @@ static const char *enable_string[NUMBIFSTANZAS] =
     "ENB_P9_64",
     "ENB_P9V",
     "ENB_IEEE128_HW",
+    "ENB_IEEE128_HW_LD",
     "ENB_DFP",
     "ENB_CRYPTO",
     "ENB_HTM",
@@ -395,6 +399,7 @@ struct attrinfo
   bool isendian;
   bool isibmld;
   bool isibm128;
+  bool isieeeld;
 };
 
 /* Fields associated with a function prototype (bif or overload).  */
@@ -1444,6 +1449,8 @@ parse_bif_attrs (attrinfo *attrptr)
 	  attrptr->isibmld = 1;
 	else if (!strcmp (attrname, "ibm128"))
 	  attrptr->isibm128 = 1;
+	else if (!strcmp (attrname, "ieeeld"))
+	  attrptr->isieeeld = 1;
 	else
 	  {
 	    diag (oldpos, "unknown attribute.\n");
@@ -1477,7 +1484,8 @@ parse_bif_attrs (attrinfo *attrptr)
 	"ldvec = %d, stvec = %d, reve = %d, pred = %d, htm = %d, "
 	"htmspr = %d, htmcr = %d, mma = %d, quad = %d, pair = %d, "
 	"mmaint = %d, no32bit = %d, 32bit = %d, cpu = %d, ldstmask = %d, "
-	"lxvrse = %d, lxvrze = %d, endian = %d, ibmdld = %d, ibm128 = %d.\n",
+	"lxvrse = %d, lxvrze = %d, endian = %d, ibmdld = %d, ibm128 = %d, "
+	"ieeeld = %d.\n",
 	attrptr->isinit, attrptr->isset, attrptr->isextract,
 	attrptr->isnosoft, attrptr->isldvec, attrptr->isstvec,
 	attrptr->isreve, attrptr->ispred, attrptr->ishtm, attrptr->ishtmspr,
@@ -1485,7 +1493,7 @@ parse_bif_attrs (attrinfo *attrptr)
 	attrptr->ismmaint, attrptr->isno32bit, attrptr->is32bit,
 	attrptr->iscpu, attrptr->isldstmask, attrptr->islxvrse,
 	attrptr->islxvrze, attrptr->isendian, attrptr->isibmld,
-	attrptr->isibm128);
+	attrptr->isibm128, attrptr->isieeeld);
 #endif
 
   return PC_OK;
@@ -2252,6 +2260,7 @@ write_decls (void)
   fprintf (header_file, "  ENB_P9_64,\n");
   fprintf (header_file, "  ENB_P9V,\n");
   fprintf (header_file, "  ENB_IEEE128_HW,\n");
+  fprintf (header_file, "  ENB_IEEE128_HW_LD,\n");
   fprintf (header_file, "  ENB_DFP,\n");
   fprintf (header_file, "  ENB_CRYPTO,\n");
   fprintf (header_file, "  ENB_HTM,\n");
@@ -2301,6 +2310,7 @@ write_decls (void)
   fprintf (header_file, "#define bif_endian_bit\t\t(0x00200000)\n");
   fprintf (header_file, "#define bif_ibmld_bit\t\t(0x00400000)\n");
   fprintf (header_file, "#define bif_ibm128_bit\t\t(0x00800000)\n");
+  fprintf (header_file, "#define bif_ieeeld_bit\t\t(0x01000000)\n");
   fprintf (header_file, "\n");
   fprintf (header_file,
 	   "#define bif_is_init(x)\t\t((x).bifattrs & bif_init_bit)\n");
@@ -2350,6 +2360,8 @@ write_decls (void)
 	   "#define bif_is_ibmld(x)\t((x).bifattrs & bif_ibmld_bit)\n");
   fprintf (header_file,
 	   "#define bif_is_ibm128(x)\t((x).bifattrs & bif_ibm128_bit)\n");
+  fprintf (header_file,
+	   "#define bif_is_ieeeld(x)\t((x).bifattrs & bif_ieeeld_bit)\n");
   fprintf (header_file, "\n");
 
   fprintf (header_file,
@@ -2548,6 +2560,8 @@ write_bif_static_init (void)
 	fprintf (init_file, " | bif_ibmld_bit");
       if (bifp->attrs.isibm128)
 	fprintf (init_file, " | bif_ibm128_bit");
+      if (bifp->attrs.isieeeld)
+	fprintf (init_file, " | bif_ieeeld_bit");
       fprintf (init_file, ",\n");
       fprintf (init_file, "      /* restr_opnd */\t{%d, %d, %d},\n",
 	       bifp->proto.restr_opnd[0], bifp->proto.restr_opnd[1],
-- 
2.35.3


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com


* [PATCH 2/5] Support IEEE 128-bit overload round_to_odd built-in functions.
  2022-07-28  4:43 [PATCH 0/5] IEEE 128-bit built-in overload support Michael Meissner
  2022-07-28  4:47 ` [PATCH 1/5] " Michael Meissner
@ 2022-07-28  4:48 ` Michael Meissner
  2022-07-28  4:50 ` [PATCH 3/5] Support IEEE 128-bit overload comparison " Michael Meissner
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Michael Meissner @ 2022-07-28  4:48 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, Kewen.Lin,
	David Edelsohn, Peter Bergner, Will Schmidt

[PATCH 2/5] Support IEEE 128-bit overload round_to_odd built-in functions.

This patch adds support for overloading the IEEE 128-bit round to odd
built-in functions between KFmode and TFmode arguments.
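
The selection this patch sets up can be pictured with a small table lookup.
This is a simplified standalone model, not the resolver in rs6000-c.cc: the
real code matches on argument types, while this sketch keys on a made-up
model_mode enum.  Only the instance names (ADDF128_ODD_TF, ADDF128_ODD_KF)
come from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the two IEEE 128-bit machine modes.  */
enum model_mode { MODEL_KFmode, MODEL_TFmode };

/* One overload instance: the argument mode and the built-in it maps to.  */
struct model_overload
{
  enum model_mode arg_mode;
  const char *instance;
};

/* Mirrors the [ADDF128_ODD, ...] stanza added to rs6000-overload.def.  */
static const struct model_overload model_addf128_odd[] =
{
  { MODEL_TFmode, "ADDF128_ODD_TF" },	/* long double arguments */
  { MODEL_KFmode, "ADDF128_ODD_KF" },	/* _Float128 arguments */
};

/* Return the instance name for the given argument mode, or NULL.  */
const char *model_resolve_addf128_odd (enum model_mode m)
{
  size_t n = sizeof model_addf128_odd / sizeof model_addf128_odd[0];
  for (size_t i = 0; i < n; i++)
    if (model_addf128_odd[i].arg_mode == m)
      return model_addf128_odd[i].instance;
  return NULL;
}

/* Small self-check: long double picks the TFmode instance.  */
void check_model_resolve (void)
{
  assert (strcmp (model_resolve_addf128_odd (MODEL_TFmode),
		  "ADDF128_ODD_TF") == 0);
}
```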

I have tested these patches on a power10 that is running Fedora 36, which
defaults to using long doubles that are IEEE 128-bit.  I have built two
parallel GCC compilers, one that defaults to using IEEE 128-bit long doubles
and one that defaults to using IBM 128-bit long doubles.

I have compared the test results to the original compiler results, comparing a
modified GCC to the original compiler using an IEEE 128-bit long double
default, and also comparing a modified GCC to the original compiler using an
IBM 128-bit long double default.  In both cases, the results are the same.

I have also compared the compilers with the future patch in progress that does
switch the internal type handling.  Once those patches are installed, the
overload mechanism will ensure the correct built-in is used.

Can I install this patch to the trunk, assuming I have installed the first
patch in the series?

2022-07-27   Michael Meissner  <meissner@linux.ibm.com>

gcc/

	* config/rs6000/rs6000-builtins.def
	(__builtin_addf128_round_to_odd_kf): Rename KFmode round to odd
	built-in functions with a KF suffix to allow overloading.
	(__builtin_divf128_round_to_odd_kf): Likewise.
	(__builtin_fmaf128_round_to_odd_kf): Likewise.
	(__builtin_mulf128_round_to_odd_kf): Likewise.
	(__builtin_sqrtf128_round_to_odd_kf): Likewise.
	(__builtin_subf128_round_to_odd_kf): Likewise.
	(__builtin_truncf128_round_to_odd_kf): Likewise.
	(__builtin_addf128_round_to_odd_tf): Add TFmode round to odd
	built-in functions.
	(__builtin_divf128_round_to_odd_tf): Likewise.
	(__builtin_fmaf128_round_to_odd_tf): Likewise.
	(__builtin_mulf128_round_to_odd_tf): Likewise.
	(__builtin_sqrtf128_round_to_odd_tf): Likewise.
	(__builtin_subf128_round_to_odd_tf): Likewise.
	(__builtin_truncf128_round_to_odd_tf): Likewise.
	* config/rs6000/rs6000-overload.def
	(__builtin_addf128_round_to_odd): Make IEEE 128-bit round to odd
	built-in functions overloaded.
	(__builtin_divf128_round_to_odd): Likewise.
	(__builtin_fmaf128_round_to_odd): Likewise.
	(__builtin_mulf128_round_to_odd): Likewise.
	(__builtin_sqrtf128_round_to_odd): Likewise.
	(__builtin_subf128_round_to_odd): Likewise.
	(__builtin_truncf128_round_to_odd): Likewise.
---
 gcc/config/rs6000/rs6000-builtins.def | 58 ++++++++++++++++++++-------
 gcc/config/rs6000/rs6000-overload.def | 44 ++++++++++++++++++++
 2 files changed, 87 insertions(+), 15 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index defd7e25ffe..d72ff8cb7fe 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2867,18 +2867,18 @@
 
 ; Builtins requiring hardware support for IEEE-128 floating-point.
 [ieee128-hw]
-  fpmath _Float128 __builtin_addf128_round_to_odd (_Float128, _Float128);
-    ADDF128_ODD addkf3_odd {}
+  fpmath _Float128 __builtin_addf128_round_to_odd_kf (_Float128, _Float128);
+    ADDF128_ODD_KF addkf3_odd {}
 
-  fpmath _Float128 __builtin_divf128_round_to_odd (_Float128, _Float128);
-    DIVF128_ODD divkf3_odd {}
+  fpmath _Float128 __builtin_divf128_round_to_odd_kf (_Float128, _Float128);
+    DIVF128_ODD_KF divkf3_odd {}
 
-  fpmath _Float128 __builtin_fmaf128_round_to_odd (_Float128, _Float128, \
-                                                   _Float128);
-    FMAF128_ODD fmakf4_odd {}
+  fpmath _Float128 __builtin_fmaf128_round_to_odd_kf (_Float128, _Float128, \
+						      _Float128);
+    FMAF128_ODD_KF fmakf4_odd {}
 
-  fpmath _Float128 __builtin_mulf128_round_to_odd (_Float128, _Float128);
-    MULF128_ODD mulkf3_odd {}
+  fpmath _Float128 __builtin_mulf128_round_to_odd_kf (_Float128, _Float128);
+    MULF128_ODD_KF mulkf3_odd {}
 
   const signed int __builtin_vsx_scalar_cmp_exp_qp_eq (_Float128, _Float128);
     VSCEQPEQ xscmpexpqp_eq_kf {}
@@ -2893,14 +2893,14 @@
       __builtin_vsx_scalar_cmp_exp_qp_unordered (_Float128, _Float128);
     VSCEQPUO xscmpexpqp_unordered_kf {}
 
-  fpmath _Float128 __builtin_sqrtf128_round_to_odd (_Float128);
-    SQRTF128_ODD sqrtkf2_odd {}
+  fpmath _Float128 __builtin_sqrtf128_round_to_odd_kf (_Float128);
+    SQRTF128_ODD_KF sqrtkf2_odd {}
 
-  fpmath _Float128 __builtin_subf128_round_to_odd (_Float128, _Float128);
-    SUBF128_ODD subkf3_odd {}
+  fpmath _Float128 __builtin_subf128_round_to_odd_kf (_Float128, _Float128);
+    SUBF128_ODD_KF subkf3_odd {}
 
-  fpmath double __builtin_truncf128_round_to_odd (_Float128);
-    TRUNCF128_ODD trunckfdf2_odd {}
+  fpmath double __builtin_truncf128_round_to_odd_kf (_Float128);
+    TRUNCF128_ODD_KF trunckfdf2_odd {}
 
   const signed long long __builtin_vsx_scalar_extract_expq (_Float128);
     VSEEQP xsxexpqp_kf {}
@@ -2924,6 +2924,34 @@
     VSTDCNQP xststdcnegqp_kf {}
 
 
+; Builtins requiring hardware support for IEEE-128 floating-point.  Long double
+; must use the IEEE 128-bit encoding.
+[ieee128-hw-ld]
+  fpmath long double __builtin_addf128_round_to_odd_tf (long double, long double);
+    ADDF128_ODD_TF addtf3_odd {ieeeld}
+
+  fpmath long double __builtin_divf128_round_to_odd_tf (long double, long double);
+    DIVF128_ODD_TF divtf3_odd {ieeeld}
+
+  fpmath long double __builtin_fmaf128_round_to_odd_tf (long double, \
+							long double, \
+							long double);
+    FMAF128_ODD_TF fmatf4_odd {ieeeld}
+
+  fpmath long double __builtin_mulf128_round_to_odd_tf (long double, \
+							long double);
+    MULF128_ODD_TF multf3_odd {ieeeld}
+
+  fpmath long double __builtin_sqrtf128_round_to_odd_tf (long double);
+    SQRTF128_ODD_TF sqrttf2_odd {ieeeld}
+
+  fpmath long double __builtin_subf128_round_to_odd_tf (long double, \
+							long double);
+    SUBF128_ODD_TF subtf3_odd {ieeeld}
+
+  fpmath double __builtin_truncf128_round_to_odd_tf (long double);
+    TRUNCF128_ODD_TF trunctfdf2_odd {ieeeld}
+
 
 ; Decimal floating-point builtins.
 [dfp]
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 44e2945aaa0..f406a16a882 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -6175,3 +6175,47 @@
     VUPKLSW  VUPKLSW_DEPR1
   vbll __builtin_vec_vupklsw (vbi);
     VUPKLSW  VUPKLSW_DEPR2
+
+[ADDF128_ODD, SKIP, __builtin_addf128_round_to_odd]
+  long double __builtin_addf128_round_to_odd (long double, long double);
+    ADDF128_ODD_TF
+  _Float128 __builtin_addf128_round_to_odd (_Float128, _Float128);
+    ADDF128_ODD_KF
+
+[DIVF128_ODD, SKIP, __builtin_divf128_round_to_odd]
+  long double __builtin_divf128_round_to_odd (long double, long double);
+    DIVF128_ODD_TF
+  _Float128 __builtin_divf128_round_to_odd (_Float128, _Float128);
+    DIVF128_ODD_KF
+
+[FMAF128_ODD, SKIP, __builtin_fmaf128_round_to_odd]
+  long double __builtin_fmaf128_round_to_odd (long double, long double, \
+					      long double);
+    FMAF128_ODD_TF
+  _Float128 __builtin_fmaf128_round_to_odd (_Float128, _Float128, \
+					    _Float128);
+    FMAF128_ODD_KF
+
+[MULF128_ODD, SKIP, __builtin_mulf128_round_to_odd]
+  long double __builtin_mulf128_round_to_odd (long double, long double);
+    MULF128_ODD_TF
+  _Float128 __builtin_mulf128_round_to_odd (_Float128, _Float128);
+    MULF128_ODD_KF
+
+[SQRTF128_ODD, SKIP, __builtin_sqrtf128_round_to_odd]
+  long double __builtin_sqrtf128_round_to_odd (long double);
+    SQRTF128_ODD_TF
+  _Float128 __builtin_sqrtf128_round_to_odd (_Float128);
+    SQRTF128_ODD_KF
+
+[SUBF128_ODD, SKIP, __builtin_subf128_round_to_odd]
+  long double __builtin_subf128_round_to_odd (long double, long double);
+    SUBF128_ODD_TF
+  _Float128 __builtin_subf128_round_to_odd (_Float128, _Float128);
+    SUBF128_ODD_KF
+
+[TRUNCF128_ODD, SKIP, __builtin_truncf128_round_to_odd]
+  long double __builtin_truncf128_round_to_odd (long double);
+    TRUNCF128_ODD_TF
+  _Float128 __builtin_truncf128_round_to_odd (_Float128);
+    TRUNCF128_ODD_KF
-- 
2.35.3


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com


* [PATCH 3/5] Support IEEE 128-bit overload comparison built-in functions.
  2022-07-28  4:43 [PATCH 0/5] IEEE 128-bit built-in overload support Michael Meissner
  2022-07-28  4:47 ` [PATCH 1/5] " Michael Meissner
  2022-07-28  4:48 ` [PATCH 2/5] Support IEEE 128-bit overload round_to_odd built-in functions Michael Meissner
@ 2022-07-28  4:50 ` Michael Meissner
  2022-07-28  4:52 ` [PATCH 4/5] Support IEEE 128-bit overload extract and insert " Michael Meissner
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Michael Meissner @ 2022-07-28  4:50 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, Kewen.Lin,
	David Edelsohn, Peter Bergner, Will Schmidt

[PATCH 3/5] Support IEEE 128-bit overload comparison built-in functions.

This patch adds support for overloading the IEEE 128-bit comparison
built-in functions between KFmode and TFmode arguments.
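
For readers unfamiliar with these built-ins, here is a portable sketch of what
an exponent comparison does, written for double rather than the 128-bit types
so it runs anywhere.  It is a model under assumptions, not the hardware
definition: it compares only the biased exponent fields, ignores the sign, and
treats any NaN operand as unordered.  All model_* names are invented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Biased 11-bit exponent field of an IEEE 754 binary64 value.  */
static unsigned model_exp_field (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (unsigned) ((bits >> 52) & 0x7ff);
}

/* A NaN is the only value that compares unequal to itself.  */
static int model_is_nan (double x)
{
  return x != x;
}

/* Nonzero if either operand is a NaN.  */
int model_cmp_exp_unordered (double a, double b)
{
  return model_is_nan (a) || model_is_nan (b);
}

/* Nonzero if both operands are ordered and their exponents are equal.  */
int model_cmp_exp_eq (double a, double b)
{
  return !model_cmp_exp_unordered (a, b)
	 && model_exp_field (a) == model_exp_field (b);
}

/* Nonzero if both are ordered and a's exponent is greater than b's.  */
int model_cmp_exp_gt (double a, double b)
{
  return !model_cmp_exp_unordered (a, b)
	 && model_exp_field (a) > model_exp_field (b);
}

/* Nonzero if both are ordered and a's exponent is less than b's.  */
int model_cmp_exp_lt (double a, double b)
{
  return !model_cmp_exp_unordered (a, b)
	 && model_exp_field (a) < model_exp_field (b);
}

/* Small self-check: 1.0 and 1.5 differ in value but share an exponent.  */
void check_model_cmp_exp (void)
{
  assert (model_cmp_exp_eq (1.0, 1.5));
}
```

The KFmode and TFmode variants in this patch operate on the 15-bit exponent of
the IEEE 128-bit format; the comparison idea is the same.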

I have tested these patches on a power10 that is running Fedora 36, which
defaults to using long doubles that are IEEE 128-bit.  I have built two
parallel GCC compilers, one that defaults to using IEEE 128-bit long doubles
and one that defaults to using IBM 128-bit long doubles.

I have compared the test results to the original compiler results, comparing a
modified GCC to the original compiler using an IEEE 128-bit long double
default, and also comparing a modified GCC to the original compiler using an
IBM 128-bit long double default.  In both cases, the results are the same.

I have also compared the compilers with the future patch in progress that does
switch the internal type handling.  Once those patches are installed, the
overload mechanism will ensure the correct built-in is used.

Can I install this patch to the trunk, assuming I have installed the first two
patches in the series?

2022-07-27   Michael Meissner  <meissner@linux.ibm.com>

gcc/

	* config/rs6000/rs6000-builtins.def
	(__builtin_vsx_scalar_cmp_exp_qp_eq_kf): Rename KFmode comparison
	built-in functions to have a KF suffix to allow overloading.
	(__builtin_vsx_scalar_cmp_exp_qp_gt_kf): Likewise.
	(__builtin_vsx_scalar_cmp_exp_qp_lt_kf): Likewise.
	(__builtin_vsx_scalar_cmp_exp_qp_unordered_kf): Likewise.
	(__builtin_vsx_scalar_cmp_exp_qp_eq_tf): Add TFmode comparison
	built-in functions.
	(__builtin_vsx_scalar_cmp_exp_qp_gt_tf): Likewise.
	(__builtin_vsx_scalar_cmp_exp_qp_lt_tf): Likewise.
	(__builtin_vsx_scalar_cmp_exp_qp_unordered_tf): Likewise.
	* config/rs6000/rs6000-overload.def
	(__builtin_vec_scalar_cmp_exp_eq): Add TFmode overloaded
	functions.
	(__builtin_vec_scalar_cmp_exp_gt): Likewise.
	(__builtin_vec_scalar_cmp_exp_lt): Likewise.
	(__builtin_vec_scalar_cmp_exp_unordered): Likewise.
---
 gcc/config/rs6000/rs6000-builtins.def | 32 ++++++++++++++++++++-------
 gcc/config/rs6000/rs6000-overload.def | 16 ++++++++++----
 2 files changed, 36 insertions(+), 12 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index d72ff8cb7fe..23fc4a5f108 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2880,18 +2880,18 @@
   fpmath _Float128 __builtin_mulf128_round_to_odd_kf (_Float128, _Float128);
     MULF128_ODD_KF mulkf3_odd {}
 
-  const signed int __builtin_vsx_scalar_cmp_exp_qp_eq (_Float128, _Float128);
-    VSCEQPEQ xscmpexpqp_eq_kf {}
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_eq_kf (_Float128, _Float128);
+    VSCEQPEQ_KF xscmpexpqp_eq_kf {}
 
-  const signed int __builtin_vsx_scalar_cmp_exp_qp_gt (_Float128, _Float128);
-    VSCEQPGT xscmpexpqp_gt_kf {}
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_gt_kf (_Float128, _Float128);
+    VSCEQPGT_KF xscmpexpqp_gt_kf {}
 
-  const signed int __builtin_vsx_scalar_cmp_exp_qp_lt (_Float128, _Float128);
-    VSCEQPLT xscmpexpqp_lt_kf {}
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_lt_kf (_Float128, _Float128);
+    VSCEQPLT_KF xscmpexpqp_lt_kf {}
 
   const signed int \
-      __builtin_vsx_scalar_cmp_exp_qp_unordered (_Float128, _Float128);
-    VSCEQPUO xscmpexpqp_unordered_kf {}
+      __builtin_vsx_scalar_cmp_exp_qp_unordered_kf (_Float128, _Float128);
+    VSCEQPUO_KF xscmpexpqp_unordered_kf {}
 
   fpmath _Float128 __builtin_sqrtf128_round_to_odd_kf (_Float128);
     SQRTF128_ODD_KF sqrtkf2_odd {}
@@ -2942,6 +2942,22 @@
 							long double);
     MULF128_ODD_TF multf3_odd {ieeeld}
 
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_eq_tf (long double, \
+							  long double);
+    VSCEQPEQ_TF xscmpexpqp_eq_tf {ieeeld}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_gt_tf (long double, \
+							  long double);
+    VSCEQPGT_TF xscmpexpqp_gt_tf {ieeeld}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_lt_tf (long double, \
+							  long double);
+    VSCEQPLT_TF xscmpexpqp_lt_tf {ieeeld}
+
+  const signed int \
+      __builtin_vsx_scalar_cmp_exp_qp_unordered_tf (long double, long double);
+    VSCEQPUO_TF xscmpexpqp_unordered_tf {ieeeld}
+
   fpmath long double __builtin_sqrtf128_round_to_odd_tf (long double);
     SQRTF128_ODD_TF sqrttf2_odd {ieeeld}
 
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index f406a16a882..511a3821d5b 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -4474,25 +4474,33 @@
   signed int __builtin_vec_scalar_cmp_exp_eq (double, double);
     VSCEDPEQ
   signed int __builtin_vec_scalar_cmp_exp_eq (_Float128, _Float128);
-    VSCEQPEQ
+    VSCEQPEQ_KF
+  signed int __builtin_vec_scalar_cmp_exp_eq (long double, long double);
+    VSCEQPEQ_TF
 
 [VEC_VSCEGT, scalar_cmp_exp_gt, __builtin_vec_scalar_cmp_exp_gt]
   signed int __builtin_vec_scalar_cmp_exp_gt (double, double);
     VSCEDPGT
   signed int __builtin_vec_scalar_cmp_exp_gt (_Float128, _Float128);
-    VSCEQPGT
+    VSCEQPGT_KF
+  signed int __builtin_vec_scalar_cmp_exp_gt (long double, long double);
+    VSCEQPGT_TF
 
 [VEC_VSCELT, scalar_cmp_exp_lt, __builtin_vec_scalar_cmp_exp_lt]
   signed int __builtin_vec_scalar_cmp_exp_lt (double, double);
     VSCEDPLT
   signed int __builtin_vec_scalar_cmp_exp_lt (_Float128, _Float128);
-    VSCEQPLT
+    VSCEQPLT_KF
+  signed int __builtin_vec_scalar_cmp_exp_lt (long double, long double);
+    VSCEQPLT_TF
 
 [VEC_VSCEUO, scalar_cmp_exp_unordered, __builtin_vec_scalar_cmp_exp_unordered]
   signed int __builtin_vec_scalar_cmp_exp_unordered (double, double);
     VSCEDPUO
   signed int __builtin_vec_scalar_cmp_exp_unordered (_Float128, _Float128);
-    VSCEQPUO
+    VSCEQPUO_KF
+  signed int __builtin_vec_scalar_cmp_exp_unordered (long double, long double);
+    VSCEQPUO_TF
 
 [VEC_VSEE, scalar_extract_exp, __builtin_vec_scalar_extract_exp]
   unsigned int __builtin_vec_scalar_extract_exp (double);
-- 
2.35.3


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com


* [PATCH 4/5] Support IEEE 128-bit overload extract and insert built-in functions.
  2022-07-28  4:43 [PATCH 0/5] IEEE 128-bit built-in overload support Michael Meissner
                   ` (2 preceding siblings ...)
  2022-07-28  4:50 ` [PATCH 3/5] Support IEEE 128-bit overload comparison " Michael Meissner
@ 2022-07-28  4:52 ` Michael Meissner
  2022-07-28  4:54 ` [PATCH 5/5] Support IEEE 128-bit overload test data " Michael Meissner
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Michael Meissner @ 2022-07-28  4:52 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, Kewen.Lin,
	David Edelsohn, Peter Bergner, Will Schmidt

[PATCH 4/5] Support IEEE 128-bit overload extract and insert built-in functions.

This patch adds support for overloading the IEEE 128-bit extract and
insert built-in functions between KFmode and TFmode arguments.

I have tested these patches on a power10 that is running Fedora 36, which
defaults to using long doubles that are IEEE 128-bit.  I have built two
parallel GCC compilers, one that defaults to using IEEE 128-bit long doubles
and one that defaults to using IBM 128-bit long doubles.

I have compared the test results to the original compiler results, comparing a
modified GCC to the original compiler using an IEEE 128-bit long double
default, and also comparing a modified GCC to the original compiler using an
IBM 128-bit long double default.  In both cases, the results are the same.

I have also compared the compilers with the future patch in progress that does
switch the internal type handling.  Once those patches are installed, the
overload mechanism will ensure the correct built-in is used.

Can I install this patch to the trunk, assuming I have installed the first
three patches in the series?

2022-07-27   Michael Meissner  <meissner@linux.ibm.com>

gcc/

	* config/rs6000/rs6000-builtins.def
	(__builtin_vsx_scalar_extract_expq_kf): Rename KFmode IEEE 128-bit
	insert and extract built-in functions to have a KF suffix to allow
	overloading.
	(__builtin_vsx_scalar_extract_sigq_kf): Likewise.
	(__builtin_vsx_scalar_insert_exp_qp_kf): Likewise.
	(__builtin_vsx_scalar_extract_expq_tf): Add TFmode variants for
	IEEE 128-bit insert and extract support.
	(__builtin_vsx_scalar_extract_sigq_tf): Likewise.
	(__builtin_vsx_scalar_insert_exp_qp_tf): Likewise.
	* config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin):
	Add support for having KFmode and TFmode variants of VSIEQPF.
	* config/rs6000/rs6000-overload.def
	(__builtin_vec_scalar_extract_exp): Add TFmode overloads.
	(__builtin_vec_scalar_extract_sig): Likewise.
	(__builtin_vec_scalar_insert_exp): Likewise.

gcc/testsuite/

	* gcc.target/powerpc/bfp/scalar-extract-exp-4.c: Update the
	expected error message.
	* gcc.target/powerpc/bfp/scalar-extract-sig-4.c: Likewise.
	* gcc.target/powerpc/bfp/scalar-insert-exp-10.c: Likewise.
---
 gcc/config/rs6000/rs6000-builtins.def         | 26 ++++++++++++++-----
 gcc/config/rs6000/rs6000-c.cc                 | 10 ++++---
 gcc/config/rs6000/rs6000-overload.def         | 12 ++++++---
 .../powerpc/bfp/scalar-extract-exp-4.c        |  2 +-
 .../powerpc/bfp/scalar-extract-sig-4.c        |  2 +-
 .../powerpc/bfp/scalar-insert-exp-10.c        |  2 +-
 6 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 23fc4a5f108..2ac66b39975 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2902,19 +2902,21 @@
   fpmath double __builtin_truncf128_round_to_odd_kf (_Float128);
     TRUNCF128_ODD_KF trunckfdf2_odd {}
 
-  const signed long long __builtin_vsx_scalar_extract_expq (_Float128);
-    VSEEQP xsxexpqp_kf {}
+  const signed long long __builtin_vsx_scalar_extract_expq_kf (_Float128);
+    VSEEQP_KF xsxexpqp_kf {}
 
-  const signed __int128 __builtin_vsx_scalar_extract_sigq (_Float128);
-    VSESQP xsxsigqp_kf {}
+  const signed __int128 __builtin_vsx_scalar_extract_sigq_kf (_Float128);
+    VSESQP_KF xsxsigqp_kf {}
 
+; Note we cannot overload this function since it does not have KFmode
+; or TFmode arguments.
   const _Float128 __builtin_vsx_scalar_insert_exp_q (unsigned __int128, \
                                                      unsigned long long);
     VSIEQP xsiexpqp_kf {}
 
-  const _Float128 __builtin_vsx_scalar_insert_exp_qp (_Float128, \
-                                                      unsigned long long);
-    VSIEQPF xsiexpqpf_kf {}
+  const _Float128 __builtin_vsx_scalar_insert_exp_qp_kf (_Float128, \
+							 unsigned long long);
+    VSIEQPF_KF xsiexpqpf_kf {}
 
   const signed int __builtin_vsx_scalar_test_data_class_qp (_Float128, \
                                                             const int<7>);
@@ -2968,6 +2970,16 @@
   fpmath double __builtin_truncf128_round_to_odd_tf (long double);
     TRUNCF128_ODD_TF trunctfdf2_odd {ieeeld}
 
+  const signed long long __builtin_vsx_scalar_extract_expq_tf (long double);
+    VSEEQP_TF xsxexpqp_tf {ieeeld}
+
+  const signed __int128 __builtin_vsx_scalar_extract_sigq_tf (long double);
+    VSESQP_TF xsxsigqp_tf {ieeeld}
+
+  const long double __builtin_vsx_scalar_insert_exp_qp_tf (long double,		\
+							   unsigned long long);
+    VSIEQPF_TF xsiexpqpf_tf {ieeeld}
+
 
 ; Decimal floating-point builtins.
 [dfp]
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 0d13645040f..4532cb4624b 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -1935,11 +1935,13 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
 	   128-bit variant of built-in function.  */
 	if (GET_MODE_PRECISION (arg1_mode) > 64)
 	  {
-	    /* If first argument is of float variety, choose variant
-	       that expects __ieee128 argument.  Otherwise, expect
-	       __int128 argument.  */
+	    /* If first argument is of float variety, choose variant that
+	       expects _Float128 argument (or long double if long doubles are
+	       IEEE 128-bit).  Otherwise, expect __int128 argument.  */
 	    if (GET_MODE_CLASS (arg1_mode) == MODE_FLOAT)
-	      instance_code = RS6000_BIF_VSIEQPF;
+	      instance_code = ((arg1_mode == TFmode)
+			       ? RS6000_BIF_VSIEQPF_TF
+			       : RS6000_BIF_VSIEQPF_KF);
 	    else
 	      instance_code = RS6000_BIF_VSIEQP;
 	  }
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 511a3821d5b..546883ece19 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -4506,13 +4506,17 @@
   unsigned int __builtin_vec_scalar_extract_exp (double);
     VSEEDP
   unsigned int __builtin_vec_scalar_extract_exp (_Float128);
-    VSEEQP
+    VSEEQP_KF
+  unsigned int __builtin_vec_scalar_extract_exp (long double);
+    VSEEQP_TF
 
 [VEC_VSES, scalar_extract_sig, __builtin_vec_scalar_extract_sig]
   unsigned long long __builtin_vec_scalar_extract_sig (double);
     VSESDP
   unsigned __int128 __builtin_vec_scalar_extract_sig (_Float128);
-    VSESQP
+    VSESQP_KF
+  unsigned __int128 __builtin_vec_scalar_extract_sig (long double);
+    VSESQP_TF
 
 [VEC_VSIE, scalar_insert_exp, __builtin_vec_scalar_insert_exp]
   double __builtin_vec_scalar_insert_exp (unsigned long long, unsigned long long);
@@ -4522,7 +4526,9 @@
   _Float128 __builtin_vec_scalar_insert_exp (unsigned __int128, unsigned long long);
     VSIEQP
   _Float128 __builtin_vec_scalar_insert_exp (_Float128, unsigned long long);
-    VSIEQPF
+    VSIEQPF_KF
+  long double __builtin_vec_scalar_insert_exp (long double, unsigned long long);
+    VSIEQPF_TF
 
 [VEC_VSTDC, scalar_test_data_class, __builtin_vec_scalar_test_data_class]
   unsigned int __builtin_vec_scalar_test_data_class (float, const int);
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-4.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-4.c
index 850ff620490..14c6554f417 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-4.c
@@ -11,7 +11,7 @@ get_exponent (__ieee128 *p)
 {
   __ieee128 source = *p;
 
-  return __builtin_vec_scalar_extract_exp (source); /* { dg-error "'__builtin_vsx_scalar_extract_expq' requires" } */
+  return __builtin_vec_scalar_extract_exp (source); /* { dg-error "'__builtin_vsx_scalar_extract_expq.*' requires" } */
 }
 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-4.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-4.c
index 32a53c6fffd..9800cf65017 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-4.c
@@ -11,5 +11,5 @@ get_significand (__ieee128 *p)
 {
   __ieee128 source = *p;
 
-  return __builtin_vec_scalar_extract_sig (source);	/* { dg-error "'__builtin_vsx_scalar_extract_sigq' requires" } */
+  return __builtin_vec_scalar_extract_sig (source);	/* { dg-error "'__builtin_vsx_scalar_extract_sigq.*' requires" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-10.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-10.c
index 769d3b0546a..4018c8fa08a 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-10.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-10.c
@@ -13,5 +13,5 @@ insert_exponent (__ieee128 *significand_p,
   __ieee128 significand = *significand_p;
   unsigned long long int exponent = *exponent_p;
 
-  return __builtin_vec_scalar_insert_exp (significand, exponent); /* { dg-error "'__builtin_vsx_scalar_insert_exp_qp' requires" } */
+  return __builtin_vec_scalar_insert_exp (significand, exponent); /* { dg-error "'__builtin_vsx_scalar_insert_exp_qp.*' requires" } */
 }
-- 
2.35.3


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 5/5] Support IEEE 128-bit overload test data built-in functions.
  2022-07-28  4:43 [PATCH 0/5] IEEE 128-bit built-in overload support Michael Meissner
                   ` (3 preceding siblings ...)
  2022-07-28  4:52 ` [PATCH 4/5] Support IEEE 128-bit overload extract and insert " Michael Meissner
@ 2022-07-28  4:54 ` Michael Meissner
  2022-08-03 17:58 ` Ping: [PATCH 0/5] IEEE 128-bit built-in overload support Michael Meissner
  2022-08-05 18:19 ` Segher Boessenkool
  6 siblings, 0 replies; 14+ messages in thread
From: Michael Meissner @ 2022-07-28  4:54 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, Kewen.Lin,
	David Edelsohn, Peter Bergner, Will Schmidt

[PATCH 5/5] Support IEEE 128-bit overload test data built-in functions.

This patch adds support for overloading the IEEE 128-bit test data and
test data negate built-in functions between KFmode and TFmode arguments.

I have tested these patches on a power10 that is running Fedora 36, which
defaults to using long doubles that are IEEE 128-bit.  I have built two
parallel GCC compilers, one that defaults to using IEEE 128-bit long doubles
and one that defaults to using IBM 128-bit long doubles.

I have compared the test results to the original compiler results, comparing a
modified GCC to the original compiler using an IEEE 128-bit long double
default, and also comparing a modified GCC to the original compiler using an
IBM 128-bit long double default.  In both cases, the results are the same.

I have also compared the compilers with the future patch in progress that does
switch the internal type handling.  Once those patches are installed, the
overload mechanism will ensure the correct built-in is used.

Can I install this patch to the trunk, assuming I have installed the first
four patches in the series?

2022-07-27   Michael Meissner  <meissner@linux.ibm.com>

gcc/

	* config/rs6000/rs6000-builtins.def
	(__builtin_vsx_scalar_test_data_class_qp_kf): Rename KFmode IEEE
	128-bit test data built-in functions to have a KF suffix to allow
	overloading.
	(__builtin_vsx_scalar_test_neg_qp_kf): Likewise.
	(__builtin_vsx_scalar_test_data_class_qp_tf): Add TFmode variants
	for IEEE 128-bit test data support.
	(__builtin_vsx_scalar_test_neg_qp_tf): Likewise.
	* config/rs6000/rs6000-overload.def
	(__builtin_vec_scalar_test_data_class): Add TFmode overloads.
	(__builtin_vec_scalar_test_neg): Likewise.
	(__builtin_vec_scalar_test_neg_qp): Likewise.
	(__builtin_vec_scalar_test_data_class_qp): Likewise.

gcc/testsuite/

	* gcc.target/powerpc/bfp/scalar-test-data-class-11.c:  Update the
	expected error message.
	* gcc.target/powerpc/bfp/scalar-test-neg-5.c: Likewise.
---
 gcc/config/rs6000/rs6000-builtins.def          | 17 ++++++++++++-----
 gcc/config/rs6000/rs6000-overload.def          | 18 +++++++++++++-----
 .../powerpc/bfp/scalar-test-data-class-11.c    |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-5.c |  2 +-
 4 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 2ac66b39975..e12efc95965 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2918,12 +2918,12 @@
 							 unsigned long long);
     VSIEQPF_KF xsiexpqpf_kf {}
 
-  const signed int __builtin_vsx_scalar_test_data_class_qp (_Float128, \
-                                                            const int<7>);
-    VSTDCQP xststdcqp_kf {}
+  const signed int __builtin_vsx_scalar_test_data_class_qp_kf (_Float128, \
+							       const int<7>);
+    VSTDCQP_KF xststdcqp_kf {}
 
-  const signed int __builtin_vsx_scalar_test_neg_qp (_Float128);
-    VSTDCNQP xststdcnegqp_kf {}
+  const signed int __builtin_vsx_scalar_test_neg_qp_kf (_Float128);
+    VSTDCNQP_KF xststdcnegqp_kf {}
 
 
 ; Builtins requiring hardware support for IEEE-128 floating-point.  Long double
@@ -2980,6 +2980,13 @@
 							   unsigned long long);
     VSIEQPF_TF xsiexpqpf_tf {ieeeld}
 
+  const signed int __builtin_vsx_scalar_test_data_class_qp_tf (_Float128, \
+							       const int<7>);
+    VSTDCQP_TF xststdcqp_tf {ieeeld}
+
+  const signed int __builtin_vsx_scalar_test_neg_qp_tf (_Float128);
+    VSTDCNQP_TF xststdcnegqp_tf {ieeeld}
+
 
 ; Decimal floating-point builtins.
 [dfp]
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 546883ece19..572e3510360 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -4536,7 +4536,9 @@
   unsigned int __builtin_vec_scalar_test_data_class (double, const int);
     VSTDCDP
   unsigned int __builtin_vec_scalar_test_data_class (_Float128, const int);
-    VSTDCQP
+    VSTDCQP_KF
+  unsigned int __builtin_vec_scalar_test_data_class (long double, const int);
+    VSTDCQP_TF
 
 [VEC_VSTDCN, scalar_test_neg, __builtin_vec_scalar_test_neg]
   unsigned int __builtin_vec_scalar_test_neg (float);
@@ -4544,7 +4546,9 @@
   unsigned int __builtin_vec_scalar_test_neg (double);
     VSTDCNDP
   unsigned int __builtin_vec_scalar_test_neg (_Float128);
-    VSTDCNQP
+    VSTDCNQP_KF
+  unsigned int __builtin_vec_scalar_test_neg (long double);
+    VSTDCNQP_TF
 
 [VEC_VTDC, vec_test_data_class, __builtin_vec_test_data_class]
   vbi __builtin_vec_test_data_class (vf, const int);
@@ -5928,9 +5932,11 @@
   unsigned int __builtin_vec_scalar_test_neg_dp (double);
     VSTDCNDP  VSTDCNDP_DEPR1
 
-[VEC_VSTDCNQP, scalar_test_neg_qp, __builtin_vec_scalar_test_neg_qp]
+[VEC_VSTDCNQP_KF, scalar_test_neg_qp, __builtin_vec_scalar_test_neg_qp]
   unsigned int __builtin_vec_scalar_test_neg_qp (_Float128);
-    VSTDCNQP  VSTDCNQP_DEPR1
+    VSTDCNQP_KF  VSTDCNQP_KF_DEPR1
+  unsigned int __builtin_vec_scalar_test_neg_qp (long double);
+    VSTDCNQP_TF  VSTDCNQP_TF_DEPR1
 
 [VEC_VSTDCNSP, scalar_test_neg_sp, __builtin_vec_scalar_test_neg_sp]
   unsigned int __builtin_vec_scalar_test_neg_sp (float);
@@ -5938,7 +5944,9 @@
 
 [VEC_VSTDCQP, scalar_test_data_class_qp, __builtin_vec_scalar_test_data_class_qp]
   unsigned int __builtin_vec_scalar_test_data_class_qp (_Float128, const int);
-    VSTDCQP  VSTDCQP_DEPR1
+    VSTDCQP_KF  VSTDCQP_KF_DEPR1
+  unsigned int __builtin_vec_scalar_test_data_class_qp (long double, const int);
+    VSTDCQP_TF  VSTDCQP_TF_DEPR1
 
 [VEC_VSTDCSP, scalar_test_data_class_sp, __builtin_vec_scalar_test_data_class_sp]
   unsigned int __builtin_vec_scalar_test_data_class_sp (float, const int);
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-data-class-11.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-data-class-11.c
index 7c6fca2b729..82da5956e05 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-data-class-11.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-data-class-11.c
@@ -10,5 +10,5 @@ test_data_class (__ieee128 *p)
 {
   __ieee128 source = *p;
 
-  return __builtin_vec_scalar_test_data_class (source, 3); /* { dg-error "'__builtin_vsx_scalar_test_data_class_qp' requires" } */
+  return __builtin_vec_scalar_test_data_class (source, 3); /* { dg-error "'__builtin_vsx_scalar_test_data_class_qp.*' requires" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-5.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-5.c
index 8c55c1cfb5c..eef02f40f3d 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-5.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-5.c
@@ -10,5 +10,5 @@ test_neg (__ieee128 *p)
 {
   __ieee128 source = *p;
 
-  return __builtin_vec_scalar_test_neg (source); /* { dg-error "'__builtin_vsx_scalar_test_neg_qp' requires" } */
+  return __builtin_vec_scalar_test_neg (source); /* { dg-error "'__builtin_vsx_scalar_test_neg_qp.*' requires" } */
 }
-- 
2.35.3


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Ping: [PATCH 0/5] IEEE 128-bit built-in overload support.
  2022-07-28  4:43 [PATCH 0/5] IEEE 128-bit built-in overload support Michael Meissner
                   ` (4 preceding siblings ...)
  2022-07-28  4:54 ` [PATCH 5/5] Support IEEE 128-bit overload test data " Michael Meissner
@ 2022-08-03 17:58 ` Michael Meissner
  2022-08-05 18:19 ` Segher Boessenkool
  6 siblings, 0 replies; 14+ messages in thread
From: Michael Meissner @ 2022-08-03 17:58 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool, Kewen.Lin,
	David Edelsohn, Peter Bergner, Will Schmidt

Ping patches.

Patch #1 of 5.
| Date: Thu, 28 Jul 2022 00:47:13 -0400
| Subject: [PATCH 1/5] IEEE 128-bit built-in overload support.
| Message-ID: <YuIU0Yj4mu8LASSd@toto.the-meissners.org>

Patch #2 of 5.
| Date: Thu, 28 Jul 2022 00:48:51 -0400
| Subject: [PATCH 2/5] Support IEEE 128-bit overload round_to_odd built-in functions.
| Message-ID: <YuIVM+APJ5g/Yzcv@toto.the-meissners.org>

Patch #3 of 5.
| Date: Thu, 28 Jul 2022 00:50:43 -0400
| Subject: [PATCH 3/5] Support IEEE 128-bit overload comparison built-in functions.
| Message-ID: <YuIVo7MN5hmUzlOr@toto.the-meissners.org>

Patch #4 of 5.
| Date: Thu, 28 Jul 2022 00:52:38 -0400
| Subject: [PATCH 4/5] Support IEEE 128-bit overload extract and insert built-in functions.
| Message-ID: <YuIWFhnEXlfee42q@toto.the-meissners.org>

Patch #5 of 5.
| Date: Thu, 28 Jul 2022 00:54:15 -0400
| Subject: [PATCH 5/5] Support IEEE 128-bit overload test data built-in functions.
| Message-ID: <YuIWd+k7A3+lf6Hd@toto.the-meissners.org>

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/5] IEEE 128-bit built-in overload support.
  2022-07-28  4:43 [PATCH 0/5] IEEE 128-bit built-in overload support Michael Meissner
                   ` (5 preceding siblings ...)
  2022-08-03 17:58 ` Ping: [PATCH 0/5] IEEE 128-bit built-in overload support Michael Meissner
@ 2022-08-05 18:19 ` Segher Boessenkool
  2022-08-10  6:23   ` Michael Meissner
  6 siblings, 1 reply; 14+ messages in thread
From: Segher Boessenkool @ 2022-08-05 18:19 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Kewen.Lin, David Edelsohn,
	Peter Bergner, Will Schmidt

On Thu, Jul 28, 2022 at 12:43:49AM -0400, Michael Meissner wrote:
> These patches lay the foundation for a set of follow-on patches that will
> change the internal handling of 128-bit floating point types in GCC.  In the
> future patches, I hope to change the compiler to always use KFmode for the
> explicit _Float128/__float128 types, to always use TFmode for the long double
> type, no matter which 128-bit floating point type is used, and IFmode for the
> explicit __ibm128 type.

Making TFmode different from KFmode and IFmode is not an improvement.
NAK.


Segher

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/5] IEEE 128-bit built-in overload support.
  2022-08-05 18:19 ` Segher Boessenkool
@ 2022-08-10  6:23   ` Michael Meissner
  2022-08-10 17:03     ` Segher Boessenkool
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Meissner @ 2022-08-10  6:23 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, gcc-patches, Kewen.Lin, David Edelsohn,
	Peter Bergner, Will Schmidt

On Fri, Aug 05, 2022 at 01:19:05PM -0500, Segher Boessenkool wrote:
> On Thu, Jul 28, 2022 at 12:43:49AM -0400, Michael Meissner wrote:
> > These patches lay the foundation for a set of follow-on patches that will
> > change the internal handling of 128-bit floating point types in GCC.  In the
> > future patches, I hope to change the compiler to always use KFmode for the
> > explicit _Float128/__float128 types, to always use TFmode for the long double
> > type, no matter which 128-bit floating point type is used, and IFmode for the
> > explicit __ibm128 type.
> 
> Making TFmode different from KFmode and IFmode is not an improvement.
> NAK.
> 
> 
> Segher

First of all, it already IS different from KFmode and IFmode, as we've talked
about.  I'm trying to clean this mess up.  Having explicit __float128's being
converted to TFmode if -mabi=ieeelongdouble is just as bad, and it means that
_Float128 and __float128 are not the same type.

What I'm trying to eliminate is the code in rs6000-builtin.cc that overrides
the builtin ops (i.e. it does the equivalent of an overloaded function):

  /* TODO: The following commentary and code is inherited from the original
     builtin processing code.  The commentary is a bit confusing, with the
     intent being that KFmode is always IEEE-128, IFmode is always IBM
     double-double, and TFmode is the current long double.  The code is
     confusing in that it converts from KFmode to TFmode pattern names,
     when the other direction is more intuitive.  Try to address this.  */

  /* We have two different modes (KFmode, TFmode) that are the IEEE
     128-bit floating point type, depending on whether long double is the
     IBM extended double (KFmode) or long double is IEEE 128-bit (TFmode).
     It is simpler if we only define one variant of the built-in function,
     and switch the code when defining it, rather than defining two built-
     ins and using the overload table in rs6000-c.cc to switch between the
     two.  If we don't have the proper assembler, don't do this switch
     because CODE_FOR_*kf* and CODE_FOR_*tf* will be CODE_FOR_nothing.  */
  if (FLOAT128_IEEE_P (TFmode))
    switch (icode)
      {
      case CODE_FOR_sqrtkf2_odd:
	icode = CODE_FOR_sqrttf2_odd;
	break;
      case CODE_FOR_trunckfdf2_odd:
	icode = CODE_FOR_trunctfdf2_odd;
	break;
      case CODE_FOR_addkf3_odd:
	icode = CODE_FOR_addtf3_odd;
	break;
      case CODE_FOR_subkf3_odd:
	icode = CODE_FOR_subtf3_odd;
	break;
      case CODE_FOR_mulkf3_odd:
	icode = CODE_FOR_multf3_odd;
	break;
      case CODE_FOR_divkf3_odd:
	icode = CODE_FOR_divtf3_odd;
	break;
      case CODE_FOR_fmakf4_odd:
	icode = CODE_FOR_fmatf4_odd;
	break;
      case CODE_FOR_xsxexpqp_kf:
	icode = CODE_FOR_xsxexpqp_tf;
	break;
      case CODE_FOR_xsxsigqp_kf:
	icode = CODE_FOR_xsxsigqp_tf;
	break;
      case CODE_FOR_xststdcnegqp_kf:
	icode = CODE_FOR_xststdcnegqp_tf;
	break;
      case CODE_FOR_xsiexpqp_kf:
	icode = CODE_FOR_xsiexpqp_tf;
	break;
      case CODE_FOR_xsiexpqpf_kf:
	icode = CODE_FOR_xsiexpqpf_tf;
	break;
      case CODE_FOR_xststdcqp_kf:
	icode = CODE_FOR_xststdcqp_tf;
	break;
      case CODE_FOR_xscmpexpqp_eq_kf:
	icode = CODE_FOR_xscmpexpqp_eq_tf;
	break;
      case CODE_FOR_xscmpexpqp_lt_kf:
	icode = CODE_FOR_xscmpexpqp_lt_tf;
	break;
      case CODE_FOR_xscmpexpqp_gt_kf:
	icode = CODE_FOR_xscmpexpqp_gt_tf;
	break;
      case CODE_FOR_xscmpexpqp_unordered_kf:
	icode = CODE_FOR_xscmpexpqp_unordered_tf;
	break;
      default:
	break;
      }

    // ... other code

  if (bif_is_ibm128 (*bifaddr) && TARGET_LONG_DOUBLE_128 && !TARGET_IEEEQUAD)
    {
      if (fcode == RS6000_BIF_PACK_IF)
	{
	  icode = CODE_FOR_packtf;
	  fcode = RS6000_BIF_PACK_TF;
	  uns_fcode = (size_t) fcode;
	}
      else if (fcode == RS6000_BIF_UNPACK_IF)
	{
	  icode = CODE_FOR_unpacktf;
	  fcode = RS6000_BIF_UNPACK_TF;
	  uns_fcode = (size_t) fcode;
	}
    }

In particular, without overloaded built-ins, we likely have something similar
to the above to cover all of the built-ins for both modes.  I tend to think
overloading is more natural in this case.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/5] IEEE 128-bit built-in overload support.
  2022-08-10  6:23   ` Michael Meissner
@ 2022-08-10 17:03     ` Segher Boessenkool
  2022-08-11 20:01       ` Michael Meissner
  0 siblings, 1 reply; 14+ messages in thread
From: Segher Boessenkool @ 2022-08-10 17:03 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Kewen.Lin, David Edelsohn,
	Peter Bergner, Will Schmidt

On Wed, Aug 10, 2022 at 02:23:27AM -0400, Michael Meissner wrote:
> On Fri, Aug 05, 2022 at 01:19:05PM -0500, Segher Boessenkool wrote:
> > On Thu, Jul 28, 2022 at 12:43:49AM -0400, Michael Meissner wrote:
> > > These patches lay the foundation for a set of follow-on patches that will
> > > change the internal handling of 128-bit floating point types in GCC.  In the
> > > future patches, I hope to change the compiler to always use KFmode for the
> > > explicit _Float128/__float128 types, to always use TFmode for the long double
> > > type, no matter which 128-bit floating point type is used, and IFmode for the
> > > explicit __ibm128 type.
> > 
> > Making TFmode different from KFmode and IFmode is not an improvement.
> > NAK.
> 
> First of all, it already IS different from KFmode and IFmode, as we've talked
> about.

It always is the same as either IFmode or KFmode in the end.  It is a
separate mode, yes, because generic code always wants to use TFmode.

> I'm trying to clean this mess up.  Having explicit __float128's being
> converted to TFmode if -mabi=ieeelongdouble is just as bad, and it means that
> _Float128 and __float128 are not the same type.

What do types have to do with this at all?

If TFmode means IEEE QP float, TFmode and KFmode can be used
interchangeably.  When TFmode means double-double, TFmode and IFmode can
be used interchangeably.  We should never depend on TFmode being
different from both underlying modes, that way madness lies.

If you remember, in 2016 or such I experimented with making TFmode a
macro-like thingie, so that we always get KFmode and IFmode in the
instruction stream.  This did not work because of the fundamental
problem that KFmode and IFmode cannot be ordered: for both modes there
are numbers it can represent that cannot be represented in the other
mode; converting from IFmode to KFmode is lossy for some numbers, and
the same is true for converting from KFmode to IFmode.  But, some
internals of GCC require all pairs of floating point modes (that can be
converted between at least) to be comparable (in the mathematical sense).
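
To make the ordering problem concrete, here is a minimal sketch in portable C.
It does not use any GCC internals: the constants and helper names below are
assumptions chosen for illustration, with a pair of doubles standing in for an
IFmode value and the IEEE binary128 parameters standing in for KFmode.

```c
#include <assert.h>
#include <float.h>

/* Assumed parameters of IEEE binary128 (what KFmode holds): a 113-bit
   significand and finite values below 2^16384.  */
enum { KF_MANT_BITS = 113, KF_MAX_EXP = 16384 };

/* Hypothetical helper: significand bits needed to hold hi + lo exactly,
   where hi and lo are positive powers of two and lo <= hi.  */
static int
bits_spanned (double hi, double lo)
{
  int bits = 1;
  assert (lo > 0.0 && lo <= hi);
  while (lo < hi)
    {
      lo *= 2.0;  /* Exact: doubling a power of two.  */
      bits++;
    }
  return bits;
}

/* Does the double-double value { 1.0, 2^-200 } fit in binary128?  */
static int
ibm128_sample_fits_in_ieee128 (void)
{
  return bits_spanned (1.0, 0x1p-200) <= KF_MANT_BITS;
}

/* Does the binary128 exponent range fit in a pair of doubles?  The high
   double bounds the magnitude, so compare against DBL_MAX_EXP.  */
static int
ieee128_range_fits_in_ibm128 (void)
{
  return KF_MAX_EXP <= DBL_MAX_EXP;
}
```

Both checks return 0: the value 1 + 2^-200 needs 201 significand bits, more
than binary128 has, while binary128 reaches exponents far beyond what a pair
of doubles can hold, so neither format contains the other.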

Until that problem is solved, we CANNOT move forward.  Your 126/127/128
precision hack gave us some time, but nothing has been improved since
then, and things have started to fall apart at the seams again.


Segher

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/5] IEEE 128-bit built-in overload support.
  2022-08-10 17:03     ` Segher Boessenkool
@ 2022-08-11 20:01       ` Michael Meissner
  2022-08-11 20:44         ` Joseph Myers
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Meissner @ 2022-08-11 20:01 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, gcc-patches, Kewen.Lin, David Edelsohn,
	Peter Bergner, Will Schmidt

On Wed, Aug 10, 2022 at 12:03:16PM -0500, Segher Boessenkool wrote:
> On Wed, Aug 10, 2022 at 02:23:27AM -0400, Michael Meissner wrote:
> > On Fri, Aug 05, 2022 at 01:19:05PM -0500, Segher Boessenkool wrote:
> > > On Thu, Jul 28, 2022 at 12:43:49AM -0400, Michael Meissner wrote:
> > > > These patches lay the foundation for a set of follow-on patches that will
> > > > change the internal handling of 128-bit floating point types in GCC.  In the
> > > > future patches, I hope to change the compiler to always use KFmode for the
> > > > explicit _Float128/__float128 types, to always use TFmode for the long double
> > > > type, no matter which 128-bit floating point type is used, and IFmode for the
> > > > explicit __ibm128 type.
> > > 
> > > Making TFmode different from KFmode and IFmode is not an improvement.
> > > NAK.
> > 
> > First of all, it already IS different from KFmode and IFmode, as we've talked
> > about.
> 
> It always is the same as either IFmode or KFmode in the end.  It is a
> separate mode, yes, because generic code always wants to use TFmode.
> 
> > I'm trying to clean this mess up.  Having explicit __float128's being
> > converted to TFmode if -mabi=ieeelongdouble is just as bad, and it means that
> > _Float128 and __float128 are not the same type.
> 
> What do types have to do with this at all?

I believe the issue is with these two tests:

	gcc.dg/torture/float128-nan.c
	gcc.target/powerpc/nan128-1.c

In particular, both use nansq to create a signaling NaN.  The nansq function is
defined as nansf128 (i.e. it returns a _Float128 type).

However, at present, __float128 uses the long double type if the
-mabi=ieeelongdouble option is used, not the _Float128 type.  Even though these
both use TFmode in that scenario, the gimple code sees a _Float128 type that is
stored into a long double type.  The machine independent support sees that you
are changing types, and it silently converts the signaling NaN into a quiet
NaN.

An earlier patch was to just change nanq and nansq to resolve to nanl and nansl
in the case -mabi=ieeelongdouble, which you did not like.

In looking at it, I now believe that the type for _Float128 and __float128
should always be the same within the compiler.  Whether we would continue to
use the same type for long double and _Float128/__float128 remains to be seen.

But in doing the change, there are several places that need to be changed as
well.

> If TFmode means IEEE QP float, TFmode and KFmode can be used
> interchangeably.  When TFmode means double-double, TFmode and IFmode can
> be used interchangeably.  We should never depend on TFmode being
> different from both underlying modes, that way madness lies.

No, this is not supported without conversions being done (even if the
conversions are eventually nop conversions).  GCC firmly believes that there are
no modes that are equivalent and can be used interchangeably.

For example, in the float128-odd.c test case we have:

	__float128
	f128_fms (__float128 a, __float128 b, __float128 c)
	{
	  return __builtin_fmaf128_round_to_odd (a, b, -c);
	}

By default, if we just use the KFmode functions (because that is how they are
defined in the built-in tables) on a system where __float128 uses the long
double type and TFmode, and we remove the code in rs6000_expand_builtin that
changes the built-in (i.e. overloading by another name), the compiler will
currently trap because it calls copy_to_mode_reg if the predicate fails, and
the mode being copied is different from the mode of the operand.

The predicate fails because the type in the insn (i.e. KFmode) is not the same
as the type of the operand (i.e. TFmode), and the default predicate
(i.e. register_operand, altivec_register_operand, or vsx_register_operand)
checks the mode.

But that can be fixed by using convert_move's instead of copy_to_mode_reg, and
possibly with new predicates that support either TFmode or KFmode.

However, GCC will then insert converts going from TFmode to KFmode.  This
avoids the crash, but the converts mean that the combiner won't combine the
negate and __builtin_fmaf128_round_to_odd to produce the single "xsmsubqpo"
instruction.  Instead it will generate a negate and then a "xsmaddqpo"
instruction.

I've played with adding new predicates that recognize either IEEE 128-bit type
and a separate one that recognizes either IBM 128-bit type.


This is why I proposed to have overload support so that the built-in functions
will automatically use a TFmode built-in or a KFmode built-in depending on what
the mode is.
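
As a rough illustration of the idea only, and not of how GCC implements it,
the selection can be sketched in portable C11 with _Generic.  The helper and
macro names below are hypothetical; the strings are the instance names from
rs6000-overload.def.  The sketch relies on long double and _Float128 being
distinct types even where they share a representation, and it falls back to
double as a stand-in where _Float128 is not available.

```c
#include <assert.h>
#include <string.h>

/* Use the real _Float128 where GCC provides it; otherwise fall back to
   double purely as a stand-in so the sketch still compiles.  */
#if defined (__GNUC__) && !defined (__clang__) && defined (__FLT128_MANT_DIG__)
typedef _Float128 ieee128_t;
#else
typedef double ieee128_t;
#endif

/* Hypothetical stand-ins for the two instances; each just reports the
   instance name that would be chosen.  */
static const char *
extract_exp_kf (ieee128_t x)
{
  (void) x;
  return "VSEEQP_KF";
}

static const char *
extract_exp_tf (long double x)
{
  (void) x;
  return "VSEEQP_TF";
}

/* Select the instance from the static type of the argument.  */
#define SCALAR_EXTRACT_EXP_INSTANCE(x)          \
  _Generic ((x),                                \
            ieee128_t: extract_exp_kf,          \
            long double: extract_exp_tf) (x)

/* Hypothetical helper: true if the chosen instance has the wanted name.  */
static int
picks (const char *got, const char *want)
{
  return strcmp (got, want) == 0;
}
```

With this, SCALAR_EXTRACT_EXP_INSTANCE applied to a long double argument
selects the TF instance, and applied to an ieee128_t argument selects the KF
instance, without any remapping after the fact.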

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/5] IEEE 128-bit built-in overload support.
  2022-08-11 20:01       ` Michael Meissner
@ 2022-08-11 20:44         ` Joseph Myers
  2022-08-16 18:07           ` Jakub Jelinek
  0 siblings, 1 reply; 14+ messages in thread
From: Joseph Myers @ 2022-08-11 20:44 UTC (permalink / raw)
  To: Michael Meissner
  Cc: Segher Boessenkool, Peter Bergner, gcc-patches, David Edelsohn

On Thu, 11 Aug 2022, Michael Meissner via Gcc-patches wrote:

> In looking at it, I now believe that the type for _Float128 and __float128
> should always be the same within the compiler.  Whether we would continue to
> use the same type for long double and _Float128/__float128 remains to be seen.

long double and _Float128 must always be different types; that's how it's 
defined in C23.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/5] IEEE 128-bit built-in overload support.
  2022-08-11 20:44         ` Joseph Myers
@ 2022-08-16 18:07           ` Jakub Jelinek
  2022-08-16 18:55             ` Segher Boessenkool
  0 siblings, 1 reply; 14+ messages in thread
From: Jakub Jelinek @ 2022-08-16 18:07 UTC (permalink / raw)
  To: Joseph Myers, Jason Merrill, Jonathan Wakely
  Cc: Michael Meissner, Peter Bergner, gcc-patches, David Edelsohn,
	Segher Boessenkool

On Thu, Aug 11, 2022 at 08:44:17PM +0000, Joseph Myers wrote:
> On Thu, 11 Aug 2022, Michael Meissner via Gcc-patches wrote:
> 
> > In looking at it, I now believe that the type for _Float128 and __float128
> > should always be the same within the compiler.  Whether we would continue to
> > use the same type for long double and _Float128/__float128 remains to be seen.
> 
> long double and _Float128 must always be different types; that's how it's 
> defined in C23.

And when we implement C++23 P1467R9, if std::float128_t will be
_Float128 under the hood, then long double and _Float128 have to remain
distinct types and mangle differently, long double (and __float128 if
long double is IEEE quad and __float128 exists?) need to mangle the way
they currently do and _Float128 should mangle as  DF128_ .
             ::= DF <number> _ # ISO/IEC TS 18661 binary floating point type _FloatN (N bits)
I wonder how we should mangle the underlying type of std::bfloat16_t, though.
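
To make the quoted production concrete, here is a minimal C++ sketch of how the `DF <number> _` encoding is formed from the bit width. It is only an illustration of the grammar rule, not GCC's mangler; the helper name `mangle_floatn` is hypothetical and does not exist in GCC or libstdc++.

```cpp
#include <string>

// Hypothetical helper (not a GCC function): builds the builtin-type
// encoding for _FloatN from the production  DF <number> _ ,
// where <number> is the width N in bits.
std::string mangle_floatn(unsigned bits)
{
    return "DF" + std::to_string(bits) + "_";
}
```

With this sketch, mangle_floatn(128) produces DF128_, the spelling given above for _Float128, and mangle_floatn(64) produces DF64_.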

I assume e.g. for libstdc++ implementation purposes we need to have
__ibm128 and __float128 types mangling as long double mangles when the
-mabi={ibm,ieee}longdouble option is used, because otherwise it would be
really hard to implement it.

	Jakub


* Re: [PATCH 0/5] IEEE 128-bit built-in overload support.
  2022-08-16 18:07           ` Jakub Jelinek
@ 2022-08-16 18:55             ` Segher Boessenkool
  0 siblings, 0 replies; 14+ messages in thread
From: Segher Boessenkool @ 2022-08-16 18:55 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Joseph Myers, Jason Merrill, Jonathan Wakely, Michael Meissner,
	Peter Bergner, gcc-patches, David Edelsohn

Hi!

On Tue, Aug 16, 2022 at 08:07:48PM +0200, Jakub Jelinek wrote:
> On Thu, Aug 11, 2022 at 08:44:17PM +0000, Joseph Myers wrote:
> > On Thu, 11 Aug 2022, Michael Meissner via Gcc-patches wrote:
> > > In looking at it, I now believe that the type for _Float128 and __float128
> > > should always be the same within the compiler.  Whether we would continue to
> > > use the same type for long double and _Float128/__float128 remains to be seen.
> > 
> > long double and _Float128 must always be different types; that's how it's 
> > defined in C23.
> 
> And when we implement C++23 P1467R9, if std::float128_t will be
> _Float128 under the hood, then long double and _Float128 have to remain
> distinct types and mangle differently, long double (and __float128 if
> long double is IEEE quad and __float128 exists?) need to mangle the way
> they currently do and _Float128 should mangle as  DF128_ .
>              ::= DF <number> _ # ISO/IEC TS 18661 binary floating point type _FloatN (N bits)

So should we make std::floatNN_t be the same as _FloatNN, and mangled
as DF<NN>_ ?  And __ieee128 (and long double implemented as that) the
same as we already have.

> I wonder how we should mangle the underlying type of std::bfloat16_t, though.

That should get some cross-platform mangling?  Power shouldn't go its
own way here :-)

> I assume e.g. for libstdc++ implementation purposes we need to have
> __ibm128 and __float128 types mangling as long double mangles when the
> -mabi={ibm,ieee}longdouble option is used, because otherwise it would be
> really hard to implement it.

If at all possible it should be the same as we have already: otherwise
it will be at least five years before anything works again (for users).

This agrees with what you propose afaics, but let's make this explicit?
It helps us sleep at night :-)


Segher

Thread overview: 14+ messages
2022-07-28  4:43 [PATCH 0/5] IEEE 128-bit built-in overload support Michael Meissner
2022-07-28  4:47 ` [PATCH 1/5] " Michael Meissner
2022-07-28  4:48 ` [PATCH 2/5] Support IEEE 128-bit overload round_to_odd built-in functions Michael Meissner
2022-07-28  4:50 ` [PATCH 3/5] Support IEEE 128-bit overload comparison " Michael Meissner
2022-07-28  4:52 ` [PATCH 4/5] Support IEEE 128-bit overload extract and insert " Michael Meissner
2022-07-28  4:54 ` [PATCH 5/5] Support IEEE 128-bit overload test data " Michael Meissner
2022-08-03 17:58 ` Ping: [PATCH 0/5] IEEE 128-bit built-in overload support Michael Meissner
2022-08-05 18:19 ` Segher Boessenkool
2022-08-10  6:23   ` Michael Meissner
2022-08-10 17:03     ` Segher Boessenkool
2022-08-11 20:01       ` Michael Meissner
2022-08-11 20:44         ` Joseph Myers
2022-08-16 18:07           ` Jakub Jelinek
2022-08-16 18:55             ` Segher Boessenkool
