[PATCH] rs6000, __builtin_set_fpscr

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH] rs6000, __builtin_set_fpscr_rn add retrun value
@ 2023-06-19 15:57 Carl Love
  2023-06-29  9:13 ` Kewen.Lin
  0 siblings, 1 reply; 3+ messages in thread
From: Carl Love @ 2023-06-19 15:57 UTC (permalink / raw)
  To: Kewen.Lin, Segher Boessenkool, gcc-patches, dje.gcc; +Cc: Peter Bergner, cel

GCC maintainers:


The GLibC team requested a builtin to replace the mffscrn and mffscrniinline asm instructions in the GLibC code.  Previously there was discussion on adding builtins for the mffscrn instructions.

https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html

In the end, it was felt that it would be to extend the existing
__builtin_set_fpscr_rn builtin to return a double instead of a void
type.  The desire is that we could have the functionality of the
mffscrn and mffscrni instructions on older ISAs.  The two instructions
were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has the
needed functionality to set the RN field using the mffscrn and mffscrni
instructions if ISA 3.0 is supported or fall back to using logical
instructions to mask and set the bits for earlier ISAs.  The
instructions return the current value of the FPSCR fields DRN, VE, OE,
UE, ZE, XE, NI, RN bit positions then update the RN bit positions with
the new RN value provided.

The current __builtin_set_fpscr_rn builtin has a return type of void. 
So, changing the return type to double and returning the  FPSCR fields
DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
functionally equivalent of the mffscrn and mffscrni instructions.  Any
current uses of the builtin would just ignore the return value yet any
new uses could use the return value.  So the requirement is for the
change to the __builtin_set_fpscr_rn builtin to be backwardly
compatible and work for all ISAs.

The following patch changes the return type of the
 __builtin_set_fpscr_rn builtin from void to double.  The return value
is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE,
XE, NI, RN bit positions when the builtin is called.  The builtin then
updated the RN field with the new value provided as an argument to the
builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c to
check that the builtin returns the current value of the FPSCR fields
and then updates the RN field.

The GLibC team has reviewed the patch to make sure it met their needs
as a drop in replacement for the inline asm mffscr and mffscrni
statements in the GLibC code.  T

The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10
LE.

Please let me know if the patch is acceptable for mainline.  Thanks.

                               Carl 

--------------------------------
rs6000, __builtin_set_fpscr_rn add retrun value

Change the return value from void to double.  The return value consists of
the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions.  Add an
overloaded version which accepts a double argument.

The test powerpc/test_fpscr_rn_builtin.c is updated to add tests for the
double reterun value and the new double argument.

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn): Delete.
	(__builtin_set_fpscr_rn_i): New builtin definition.
	(__builtin_set_fpscr_rn_d): New builtin definition.
	* config/rs6000/rs6000-overload.def (__builtin_set_fpscr_rn): New
	overloaded definition.
	* config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
	define_expand.
	(rs6000_update_fpscr_rn_field): New define_expand.
	(rs6000_set_fpscr_rn_d): New define expand.
	(rs6000_set_fpscr_rn_i): Renamed from rs6000_set_fpscr_rn, Added
	return argument.  Updated to use new rs6000_get_fpscr_fields and
	rs6000_update_fpscr_rn_field define _expands.
	* doc/extend.texi (__builtin_set_fpscr_rn): Update description for
	the return value and new double argument.

gcc/testsuite/ChangeLog:
	gcc.target/powerpc/test_fpscr_rn_builtin.c: Add new tests th check
	double return value.  Add tests for overloaded double argument.
	re
---
 gcc/config/rs6000/rs6000-builtins.def         |   7 +-
 gcc/config/rs6000/rs6000-overload.def         |   6 +
 gcc/config/rs6000/rs6000.md                   | 122 ++++++++++++---
 gcc/doc/extend.texi                           |  25 ++-
 .../powerpc/test_fpscr_rn_builtin.c           | 143 +++++++++++++++++-
 5 files changed, 262 insertions(+), 41 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 289a37998b1..30e0b0bb06d 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -237,8 +237,11 @@
   const __ibm128 __builtin_pack_ibm128 (double, double);
     PACK_IF packif {ibm128}
 
-  void __builtin_set_fpscr_rn (const int[0,3]);
-    SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
+  double __builtin_set_fpscr_rn_i (const int[0,3]);
+    SET_FPSCR_RN_I rs6000_set_fpscr_rn_i {nosoft}
+
+  double __builtin_set_fpscr_rn_d (double);
+    SET_FPSCR_RN_D rs6000_set_fpscr_rn_d {nosoft}
 
   const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
     UNPACK_IF unpackif {ibm128}
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index c582490c084..bb904b85820 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -195,6 +195,12 @@
   unsigned long long __builtin_cmpb (unsigned long long, unsigned long long);
     CMPB
 
+[SET_FPSCR_RN, set_fpscr_rn, __builtin_set_fpscr_rn]
+  double __builtin_set_fpscr_rn (int);
+    SET_FPSCR_RN_I
+  double __builtin_set_fpscr_rn (double);
+    SET_FPSCR_RN_D
+
 [VEC_ABS, vec_abs, __builtin_vec_abs]
   vsc __builtin_vec_abs (vsc);
     ABS_V16QI
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index b0db8ae508d..5c53383c879 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -6440,8 +6440,84 @@
   "mffscdrn %0,%1"
   [(set_attr "type" "fp")])
 
-(define_expand "rs6000_set_fpscr_rn"
- [(match_operand:SI 0 "reg_or_cint_operand")]
+
+(define_expand "rs6000_get_fpscr_fields"
+ [(match_operand:DF 0 "gpc_reg_operand")]
+  "TARGET_HARD_FLOAT"
+{
+  /* Extract fields bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
+     RN) from the FPSCR and return them.  */
+  rtx tmp_df = gen_reg_rtx (DFmode);
+  rtx tmp_di = gen_reg_rtx (DImode);
+
+  emit_insn (gen_rs6000_mffs (tmp_df));
+  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
+  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (0x00000007000000FFULL)));
+  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
+  emit_move_insn (operands[0], tmp_rtn);
+  DONE;
+})
+
+(define_expand "rs6000_update_fpscr_rn_field"
+ [(match_operand:DI 0 "gpc_reg_operand")]
+  "TARGET_HARD_FLOAT"
+{
+  /* Insert the new RN value from operands[0] into FPSCR bit [62:63].  */
+  rtx tmp_di = gen_reg_rtx (DImode);
+  rtx tmp_df = gen_reg_rtx (DFmode);
+
+  emit_insn (gen_rs6000_mffs (tmp_df));
+  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
+  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
+  emit_insn (gen_iordi3 (tmp_di, tmp_di, operands[0]));
+
+  /* Need to write to field k=15.  The fields are [0:15].  Hence with
+     L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is an
+     8-bit field[0:7]. Need to set the bit that corresponds to the
+     value of i that you want [0:7].  */
+
+  tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
+  emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
+  DONE;
+})
+
+(define_expand "rs6000_set_fpscr_rn_d"
+ [(set (match_operand:DF 0 "gpc_reg_operand")
+       (unspec_volatile:DF [(match_operand:DF 1 "gpc_reg_operand")]
+       UNSPECV_MFFSCDRN))]
+  "TARGET_HARD_FLOAT"
+{
+  rtx tmp_df = gen_reg_rtx (DFmode);
+
+  if (TARGET_P9_MISC)
+    {
+       emit_insn (gen_rs6000_mffscrn (tmp_df, operands[1]));
+       emit_move_insn (operands[0], tmp_df);
+       DONE;
+    }
+
+  /* Emulate the behavior of the mffscrni, mffscrn instructions for earlier
+     ISAs.  Return bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
+     RN) from the FPSCR.  Set the RN field based on the value in operands[1].
+  */
+
+  /* Get the return value */
+  emit_insn (gen_rs6000_get_fpscr_fields (operands[0]));
+
+  /* Extract new RN mode from operand.  */
+  rtx op1 = simplify_gen_subreg (DImode, operands[1], DFmode, 0);
+  rtx tmp_rn = gen_reg_rtx (DImode);
+  emit_insn (gen_anddi3 (tmp_rn, op1, GEN_INT (3)));
+
+  /* Insert new RN mode into FSCPR.  */
+  emit_insn (gen_rs6000_update_fpscr_rn_field (tmp_rn));
+  DONE;
+})
+
+(define_expand "rs6000_set_fpscr_rn_i"
+ [(set (match_operand:DF 0 "gpc_reg_operand")
+       (unspec_volatile:DF [(match_operand:SI 1 "reg_or_cint_operand")]
+       UNSPECV_MFFSCDRN))]
   "TARGET_HARD_FLOAT"
 {
   rtx tmp_df = gen_reg_rtx (DFmode);
@@ -6450,25 +6526,34 @@
      new rounding mode bits from operands[0][62:63] into FPSCR[62:63].  */
   if (TARGET_P9_MISC)
     {
-      if (const_0_to_3_operand (operands[0], VOIDmode))
-	emit_insn (gen_rs6000_mffscrni (tmp_df, operands[0]));
+      if (const_0_to_3_operand (operands[1], VOIDmode))
+	emit_insn (gen_rs6000_mffscrni (tmp_df, operands[1]));
       else
 	{
-	  rtx op0 = convert_to_mode (DImode, operands[0], false);
-	  rtx src_df = simplify_gen_subreg (DFmode, op0, DImode, 0);
+	  rtx op1 = convert_to_mode (DImode, operands[1], false);
+	  rtx src_df = simplify_gen_subreg (DFmode, op1, DImode, 0);
 	  emit_insn (gen_rs6000_mffscrn (tmp_df, src_df));
 	}
-      DONE;
+       emit_move_insn (operands[0], tmp_df);
+       DONE;
     }
 
-  if (CONST_INT_P (operands[0]))
+  /* Emulate the behavior of the mffscrni, mffscrn instructions for earlier
+     ISAs.  Return bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
+     RN) from the FPSCR.  Set the RN field based on the value in operands[1].
+  */
+
+  /* Get the current FPSCR fields to return.  */
+  emit_insn (gen_rs6000_get_fpscr_fields (operands[0]));
+
+  if (CONST_INT_P (operands[1]))
     {
-      if ((INTVAL (operands[0]) & 0x1) == 0x1)
+      if ((INTVAL (operands[1]) & 0x1) == 0x1)
 	emit_insn (gen_rs6000_mtfsb1 (GEN_INT (31)));
       else
 	emit_insn (gen_rs6000_mtfsb0 (GEN_INT (31)));
 
-      if ((INTVAL (operands[0]) & 0x2) == 0x2)
+      if ((INTVAL (operands[1]) & 0x2) == 0x2)
 	emit_insn (gen_rs6000_mtfsb1 (GEN_INT (30)));
       else
 	emit_insn (gen_rs6000_mtfsb0 (GEN_INT (30)));
@@ -6476,24 +6561,13 @@
   else
     {
       rtx tmp_rn = gen_reg_rtx (DImode);
-      rtx tmp_di = gen_reg_rtx (DImode);
 
       /* Extract new RN mode from operand.  */
-      rtx op0 = convert_to_mode (DImode, operands[0], false);
-      emit_insn (gen_anddi3 (tmp_rn, op0, GEN_INT (3)));
+      rtx op1 = convert_to_mode (DImode, operands[1], false);
+      emit_insn (gen_anddi3 (tmp_rn, op1, GEN_INT (3)));
 
       /* Insert new RN mode into FSCPR.  */
-      emit_insn (gen_rs6000_mffs (tmp_df));
-      tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
-      emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
-      emit_insn (gen_iordi3 (tmp_di, tmp_di, tmp_rn));
-
-      /* Need to write to field k=15.  The fields are [0:15].  Hence with
-	 L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is an
-	 8-bit field[0:7]. Need to set the bit that corresponds to the
-	 value of i that you want [0:7].  */
-      tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
-      emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
+      emit_insn (gen_rs6000_update_fpscr_rn_field (tmp_rn));
     }
   DONE;
 })
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index cdbd4b34a35..c6ad3a44d26 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -18188,7 +18188,6 @@ double __builtin_mffs (void);
 void __builtin_mtfsf (const int, double);
 void __builtin_mtfsb0 (const int);
 void __builtin_mtfsb1 (const int);
-void __builtin_set_fpscr_rn (int);
 @end smallexample
 
 The @code{__builtin_ppc_get_timebase} and @code{__builtin_ppc_mftb}
@@ -18209,13 +18208,23 @@ values to selected fields of the FPSCR.  The
 as an argument.  The valid bit range is between 0 and 31.  The builtins map to
 the @code{mtfsb0} and @code{mtfsb1} instructions which take the argument and
 add 32.  Hence these instructions only modify the FPSCR[32:63] bits by
-changing the specified bit to a zero or one respectively.  The
-@code{__builtin_set_fpscr_rn} builtin allows changing both of the floating
-point rounding mode bits.  The argument is a 2-bit value.  The argument can
-either be a @code{const int} or stored in a variable. The builtin uses
-the ISA 3.0
-instruction @code{mffscrn} if available, otherwise it reads the FPSCR, masks
-the current rounding mode bits out and OR's in the new value.
+changing the specified bit to a zero or one respectively.
+
+@smallexample
+double __builtin_set_fpscr_rn (int);
+double __builtin_set_fpscr_rn (double);
+@end smallexample
+
+The @code{__builtin_set_fpscr_rn} builtin allows changing both of the floating
+point rounding mode bits and returning the various FPSCR fields before the RN
+field is updated.  The builtin returns a double consisting of the initial value
+of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, and RN bit positions with all
+other bits set to zero. The builtin argument is a 2-bit value for the new RN
+field value.  The argument can either be a @code{const int} or stored in a
+variable. Additionally, the argument can be a variable of type double. The
+builtin uses the ISA 3.0 instruction @code{mffscrn} if available, otherwise it
+reads the FPSCR, masks the current rounding mode bits out and OR'sin the new
+RN field value.
 
 @node Basic PowerPC Built-in Functions Available on ISA 2.05
 @subsubsection Basic PowerPC Built-in Functions Available on ISA 2.05
diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
index 04707ad8a56..1522983102d 100644
--- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
+++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
@@ -6,11 +6,70 @@
 #endif
 
 #define RN_MASK  0x3LL             /* RN field mask */
+#define FIELD_MASK 0x00000007000000FFULL
+
+union blah {
+  double d;
+  unsigned long long ll;
+} conv_val;
 
 void abort (void);
-void __attribute__ ((noipa)) wrap_set_fpscr_rn (int val)
+double __attribute__ ((noipa)) wrap_set_fpscr_rn (int val)
 {
-  __builtin_set_fpscr_rn (val);
+  return __builtin_set_fpscr_rn (val);
+}
+
+double __attribute__ ((noipa)) wrap_set_fpscr_rn_d (double val)
+{
+  return __builtin_set_fpscr_rn (val);
+}
+
+double __attribute__ ((noipa)) wrap_const_fpscr_rn (int val)
+{
+  switch (val)
+    {
+    case 0: return __builtin_set_fpscr_rn (0x0);
+    case 1: return __builtin_set_fpscr_rn (0x1);
+    case 2: return __builtin_set_fpscr_rn (0x2);
+    case 3: return __builtin_set_fpscr_rn (0x3);
+    }
+}
+
+void check_builtin_set_fpscr_rn (unsigned long long initial_fpscr,
+				 int new_RN, double result)
+{
+  register double  f14;
+  unsigned long long masked_fpscr = initial_fpscr & FIELD_MASK;
+  
+  conv_val.d = result;
+
+  /* Check the result.  */
+  if (conv_val.ll != masked_fpscr)
+    {
+#ifdef DEBUG
+       printf("ERROR, __builtin_set_fpscr_rn(%d) did not return expected value %llx.\n",
+	      new_RN, masked_fpscr);
+       printf("fpscr_val_initial = 0x%llx\n", initial_fpscr);       
+       printf("result = 0x%llx\n", conv_val.ll);
+#else
+       abort();
+#endif
+    }
+
+  /* Check to see if the RN field was updated.  */
+    __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+
+  if ((conv_val.ll & RN_MASK) != new_RN)
+#ifdef DEBUG
+    {
+      printf("ERROR,  __builtin_set_fpscr_rn(%d) did not update RN to %llx.\n",
+	     new_RN, new_RN);
+      printf("  conv_val.ll = 0x%llx\n", conv_val.ll);
+    }
+#else
+    abort();
+#endif
 }
 
 int main ()
@@ -18,12 +77,10 @@ int main ()
   int i;
   int val, bit;
   double fpscr_val;
-  union blah {
-    double d;
-    unsigned long long ll;
-  } conv_val;
+  unsigned long long fpscr_val_initial;
   
   unsigned long long ll_value;
+  union blah src_double;
   register double  f14;
 
   /* __builtin_set_fpscr_rn() builtin can take a const or a variable
@@ -190,4 +247,76 @@ int main ()
        abort();
 #endif
     }		  
-}
+
+
+  /* Test return value from __builtin_set_fpscr_rn. The fields (DRN, VE, OE,
+     UE, ZE, XE, NI, RN) are returned and  the RN field of FPSCR is updated
+     with the specified argument for the builtin.  */
+
+  /* Check immediate argument cases */
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  val = 0x0;
+  fpscr_val = wrap_const_fpscr_rn (val);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, val, fpscr_val);
+
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  val = 0x3;
+  fpscr_val = wrap_const_fpscr_rn (val);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, val, fpscr_val);
+  
+  /* Check int argument cases */
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  val = 0x1;
+  fpscr_val = wrap_set_fpscr_rn (val);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, val, fpscr_val);
+
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  val = 0x2;
+  fpscr_val = wrap_set_fpscr_rn (val);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, val, fpscr_val);
+
+  /* Check double  argument cases */
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  conv_val.ll = 0x3;
+  fpscr_val = wrap_set_fpscr_rn_d (conv_val.d);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, conv_val.ll, fpscr_val);
+
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  conv_val.ll = 0x0;
+  fpscr_val = wrap_set_fpscr_rn_d (conv_val.d);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, conv_val.ll, fpscr_val);
+  
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  conv_val.ll = 0x1;
+  fpscr_val = wrap_set_fpscr_rn_d (conv_val.d);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, conv_val.ll, fpscr_val);
+
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  conv_val.ll = 0x2;
+  fpscr_val = wrap_set_fpscr_rn_d (conv_val.d);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, conv_val.ll, fpscr_val);
+}		  
-- 
2.37.2



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] rs6000, __builtin_set_fpscr_rn add retrun value
  2023-06-19 15:57 [PATCH] rs6000, __builtin_set_fpscr_rn add retrun value Carl Love
@ 2023-06-29  9:13 ` Kewen.Lin
  2023-06-30  3:14   ` Peter Bergner
  0 siblings, 1 reply; 3+ messages in thread
From: Kewen.Lin @ 2023-06-29  9:13 UTC (permalink / raw)
  To: Carl Love; +Cc: Peter Bergner, Segher Boessenkool, dje.gcc, gcc-patches

Hi Carl,

on 2023/6/19 23:57, Carl Love wrote:
> GCC maintainers:
> 
> 
> The GLibC team requested a builtin to replace the mffscrn and mffscrni inline asm instructions in the GLibC code> Previously there was discussion on adding builtins for the mffscrn instructions.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html
> 
> In the end, it was felt that it would be to extend the existing
> __builtin_set_fpscr_rn builtin to return a double instead of a void
> type.  The desire is that we could have the functionality of the
> mffscrn and mffscrni instructions on older ISAs.  The two instructions
> were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has the
> needed functionality to set the RN field using the mffscrn and mffscrni
> instructions if ISA 3.0 is supported or fall back to using logical
> instructions to mask and set the bits for earlier ISAs.  The
> instructions return the current value of the FPSCR fields DRN, VE, OE,
> UE, ZE, XE, NI, RN bit positions then update the RN bit positions with
> the new RN value provided.
> 
> The current __builtin_set_fpscr_rn builtin has a return type of void. 
> So, changing the return type to double and returning the  FPSCR fields
> DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
> functionally equivalent of the mffscrn and mffscrni instructions.  Any
> current uses of the builtin would just ignore the return value yet any
> new uses could use the return value.  So the requirement is for the
> change to the __builtin_set_fpscr_rn builtin to be backwardly
> compatible and work for all ISAs.
> 
> The following patch changes the return type of the
>  __builtin_set_fpscr_rn builtin from void to double.  The return value
> is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE,
> XE, NI, RN bit positions when the builtin is called.  The builtin then
> updated the RN field with the new value provided as an argument to the
> builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c to
> check that the builtin returns the current value of the FPSCR fields
> and then updates the RN field.

But this patch also introduces one more overloading instance with argument
type double, I had a check on glibc usage of mffscrn/mffscrni, I don't
think it's necessary to add this new instance, as it takes the given
rounding mode as integral type.

For examples:

#define __fe_mffscrn(rn)                                                \
  ({register fenv_union_t __fr;                                         \
    if (__builtin_constant_p (rn))                                      \
      __asm__ __volatile__ (                                            \
        ".machine push; .machine \"power9\"; mffscrni %0,%1; .machine pop" \
        : "=f" (__fr.fenv) : "n" (rn));                                 \
    else                                                                \
    {                                                                   \
      __fr.l = (rn);                                                    \
      __asm__ __volatile__ (                                            \
        ".machine push; .machine \"power9\"; mffscrn %0,%1; .machine pop" \
        : "=f" (__fr.fenv) : "f" (__fr.fenv));                          \
    }                                                                   \
    __fr.fenv;                                                          \
  })


/* Same as __fesetround_inline, however without runtime check to use DFP
   mtfsfi syntax (as relax_fenv_state) or if round value is valid.  */
static inline void
__fesetround_inline_nocheck (const int round)
{
#ifdef _ARCH_PWR9
  __fe_mffscrn (round);
#else
  if (__glibc_likely (GLRO(dl_hwcap2) & PPC_FEATURE2_ARCH_3_00))
    __fe_mffscrn (round);
  else
    asm volatile ("mtfsfi 7,%0" : : "n" (round));
#endif
}

So could you just extend return type (from void to double) but without one
more overloading instance?

Without overloading, we can still use the original bif instance SET_FPSCR_RN
and its correpsonding expander rs6000_set_fpscr_rn, just add some more
handlings to fetch bits for return value.  It would be simpler IMHO.

> 
> The GLibC team has reviewed the patch to make sure it met their needs
> as a drop in replacement for the inline asm mffscr and mffscrni
> statements in the GLibC code.  T
> 
> The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10
> LE.
> 
> Please let me know if the patch is acceptable for mainline.  Thanks.
> 
>                                Carl 
> 
> --------------------------------
> rs6000, __builtin_set_fpscr_rn add retrun value
> 
> Change the return value from void to double.  The return value consists of
> the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions.  Add an
> overloaded version which accepts a double argument.
> 
> The test powerpc/test_fpscr_rn_builtin.c is updated to add tests for the
> double reterun value and the new double argument.
> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn): Delete.
> 	(__builtin_set_fpscr_rn_i): New builtin definition.
> 	(__builtin_set_fpscr_rn_d): New builtin definition.
> 	* config/rs6000/rs6000-overload.def (__builtin_set_fpscr_rn): New
> 	overloaded definition.
> 	* config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
> 	define_expand.
> 	(rs6000_update_fpscr_rn_field): New define_expand.
> 	(rs6000_set_fpscr_rn_d): New define expand.
> 	(rs6000_set_fpscr_rn_i): Renamed from rs6000_set_fpscr_rn, Added
> 	return argument.  Updated to use new rs6000_get_fpscr_fields and
> 	rs6000_update_fpscr_rn_field define _expands.
> 	* doc/extend.texi (__builtin_set_fpscr_rn): Update description for
> 	the return value and new double argument.
> 
> gcc/testsuite/ChangeLog:
> 	gcc.target/powerpc/test_fpscr_rn_builtin.c: Add new tests th check
> 	double return value.  Add tests for overloaded double argument.

Since we have the expectation that the existing usage of __builtin_set_fpscr_rn
would still work fine, so could we keep the existing one unchanged to test it?
and test the new capability with one new test case?

BR,
Kewen

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] rs6000, __builtin_set_fpscr_rn add retrun value
  2023-06-29  9:13 ` Kewen.Lin
@ 2023-06-30  3:14   ` Peter Bergner
  0 siblings, 0 replies; 3+ messages in thread
From: Peter Bergner @ 2023-06-30  3:14 UTC (permalink / raw)
  To: Kewen.Lin, Carl Love; +Cc: Segher Boessenkool, dje.gcc, gcc-patches

On 6/29/23 4:13 AM, Kewen.Lin wrote:
> on 2023/6/19 23:57, Carl Love wrote:
>> The following patch changes the return type of the
>>  __builtin_set_fpscr_rn builtin from void to double.  The return value
>> is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE,
>> XE, NI, RN bit positions when the builtin is called.  The builtin then
>> updated the RN field with the new value provided as an argument to the
>> builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c to
>> check that the builtin returns the current value of the FPSCR fields
>> and then updates the RN field.
> 
> But this patch also introduces one more overloading instance with argument
> type double, I had a check on glibc usage of mffscrn/mffscrni, I don't
> think it's necessary to add this new instance, as it takes the given
> rounding mode as integral type.

I agree with Ke Wen, I don't know why there is an extra overloaded
instance.  I assumed we'd have just the one we had before, except now
its return type is double and not void.

That said, we need to announce to users that the return type has
changed, so new code compiled with a new GCC can get the new return
value, but if that new code is compiled with an old GCC (still has void
return type), it knows it needs to use some other method to get the
FPSCR value, if it needs it.  We normally do that with a predefine
macro (#define ...) that the user can test for in their code, ala:

#ifdef __SET_FPSCR_RN_RETURNS_FPSCR__  /* Or whatever name.  */
  ret = __builtin_set_fpscr_rn (......);
#else
  __builtin_set_fpscr_rn (......);
  ret = <some other method for reading FPSCR>;
#endif

We add those predefined macros in rs6000-c.cc:rs6000_target_modify_macros()
and it should be gated via the same conditions that the built-in itself
is enabled.  Look at my addition of the __TM_FENCE__ macro that let the
user know our HTM insn patterns were fixed to now act as memory barriers.

Peter

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-06-30  3:14 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-19 15:57 [PATCH] rs6000, __builtin_set_fpscr_rn add retrun value Carl Love
2023-06-29  9:13 ` Kewen.Lin
2023-06-30  3:14   ` Peter Bergner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).