public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH ver4] rs6000, Add return value to __builtin_set_fpscr_rn
@ 2023-07-11 18:06 Carl Love
  2023-07-13  7:33 ` Kewen.Lin
  0 siblings, 1 reply; 2+ messages in thread
From: Carl Love @ 2023-07-11 18:06 UTC (permalink / raw)
  To: Kewen.Lin, cel, Segher Boessenkool, dje.gcc, gcc-patches
  Cc: cel, Peter Bergner

GCC maintainers:

Ver 4, Removed extra space in subject line.  Added comment to commit
log comments about new __SET_FPSCR_RN_RETURNS_FPSCR__ define.  Changed
Added to Add and Renamed to Rename in ChangeLog.  Updated define_expand
"rs6000_set_fpscr_rn" per Peter's comments to use new temporary
register for output value.  Also, comments from Kewen about moving rtx
tmp_di1 close to use.  Renamed tmp_di2 as orig_df_in_di.  Additionally,
changed the name of tmp_di3 to tmp_di2 so the numbering is
sequential.  Moved the new rtx tmp_di2 = gen_reg_rtx (DImode); right
before its use to be consistent with previous move request.  Fixed tabs
in comment.  Remove -std=c99 from test_fpscr_rn_builtin_1.c. Cleaned up
comment and removed abort from test_fpscr_rn_builtin_2.c.  

Fixed a couple of additional issues with the ChangeLog per feedback
from git gcc-verify.

Retested updated patch on Power 8, 9 and 10 to verify changes.

Ver 3, Renamed the patch per comments on ver 2.  Previous subject line
was " [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value".  
Fixed spelling mistakes and formatting.  Updated define_expand
"rs6000_set_fpscr_rn to have the rs6000_get_fpscr_fields and
rs6000_update_fpscr_rn_field define expands inlined.  Optimized the
code and fixed use of temporary register values. Updated the test file
dg-do run arguments and dg-options.  Removed the check for
__SET_FPSCR_RN_RETURNS_FPSCR__. Removed additional references to the
overloaded built-in with double argument.  Fixed up the documentation
file.  Updated patch retested on Power 8 BE/LE, Power 9 BE/LE and Power
10 LE.

Ver 2,  Went back thru the requirements and emails.  Not sure where I
came up with the requirement for an overloaded version with double
argument.  Removed the overloaded version with the double argument. 
Added the macro to announce if the __builtin_set_fpscr_rn returns a
void or a double with the FPSCR bits.  Updated the documentation file. 
Retested on Power 8 BE/LE, Power 9 BE/LE, Power 10 LE.  Redid the test
file.  Per request, the original test file functionality was not
changed.  Just changed the name from test_fpscr_rn_builtin.c to 
test_fpscr_rn_builtin_1.c.  Put new tests for the return values into a
new test file, test_fpscr_rn_builtin_2.c.

The GLibC team requested a builtin to replace the mffscrn and
mffscrniinline asm instructions in the GLibC code.  Previously there
was discussion on adding builtins for the mffscrn instructions.

https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html

In the end, it was felt that it would be to extend the existing
__builtin_set_fpscr_rn builtin to return a double instead of a void
type.  The desire is that we could have the functionality of the
mffscrn and mffscrni instructions on older ISAs.  The two instructions
were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has the
needed functionality to set the RN field using the mffscrn and mffscrni
instructions if ISA 3.0 is supported or fall back to using logical
instructions to mask and set the bits for earlier ISAs.  The
instructions return the current value of the FPSCR fields DRN, VE, OE,
UE, ZE, XE, NI, RN bit positions then update the RN bit positions with
the new RN value provided.

The current __builtin_set_fpscr_rn builtin has a return type of void. 
So, changing the return type to double and returning the  FPSCR fields
DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
functionally equivalent of the mffscrn and mffscrni instructions.  Any
current uses of the builtin would just ignore the return value yet any
new uses could use the return value.  So the requirement is for the
change to the __builtin_set_fpscr_rn builtin to be backwardly
compatible and work for all ISAs.

The following patch changes the return type of the
 __builtin_set_fpscr_rn builtin from void to double.  The return value
is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE,
XE, NI, RN bit positions when the builtin is called.  The builtin then
updated the RN field with the new value provided as an argument to the
builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c to
check that the builtin returns the current value of the FPSCR fields
and then updates the RN field.

The GLibC team has reviewed the patch to make sure it met their needs
as a drop in replacement for the inline asm mffscr and mffscrni
statements in the GLibC code.  T

The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10
LE.

Please let me know if the patch is acceptable for mainline.  Thanks.

                               Carl 

-----------------------------------------
rs6000, Add return value to __builtin_set_fpscr_rn

Change the return value from void to double for __builtin_set_fpscr_rn.
The return value consists of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI,
RN bit positions.  A new test file, test powerpc/test_fpscr_rn_builtin_2.c,
is added to test the new return value for the built-in.

The value __SET_FPSCR_RN_RETURNS_FPSCR__ is defined if
__builtin_set_fpscr_rn returns a double.

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn): Update
	built-in definition return type.
	* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Add check,
	define __SET_FPSCR_RN_RETURNS_FPSCR__ macro.
	* config/rs6000/rs6000.md (rs6000_set_fpscr_rn): Add return
	argument to return FPSCR fields.
	* doc/extend.texi (__builtin_set_fpscr_rn): Update description for
	the return value.  Add description for
	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.

gcc/testsuite/ChangeLog:
	* gcc.target/powerpc/test_fpscr_rn_builtin.c: Rename to
	test_fpscr_rn_builtin_1.c.  Add comment.
	* gcc.target/powerpc/test_fpscr_rn_builtin_2.c: New test for the
	return value of __builtin_set_fpscr_rn builtin.
---
 gcc/config/rs6000/rs6000-builtins.def         |   2 +-
 gcc/config/rs6000/rs6000-c.cc                 |   5 +
 gcc/config/rs6000/rs6000.md                   |  55 ++++---
 gcc/doc/extend.texi                           |  23 ++-
 ...rn_builtin.c => test_fpscr_rn_builtin_1.c} |  12 +-
 .../powerpc/test_fpscr_rn_builtin_2.c         | 148 ++++++++++++++++++
 6 files changed, 213 insertions(+), 32 deletions(-)
 rename gcc/testsuite/gcc.target/powerpc/{test_fpscr_rn_builtin.c => test_fpscr_rn_builtin_1.c} (89%)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 289a37998b1..28788b69c7d 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -237,7 +237,7 @@
   const __ibm128 __builtin_pack_ibm128 (double, double);
     PACK_IF packif {ibm128}
 
-  void __builtin_set_fpscr_rn (const int[0,3]);
+  double __builtin_set_fpscr_rn (const int[0,3]);
     SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
 
   const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 8555174d36e..07a32badb54 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -604,6 +604,11 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags)
   /* Tell the user -mrop-protect is in play.  */
   if (rs6000_rop_protect)
     rs6000_define_or_undefine_macro (define_p, "__ROP_PROTECT__");
+  /* Tell the user __builtin_set_fpscr_rn now returns the FPSCR fields
+     in a double.  Originally the built-in returned void.  */
+  if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
+    rs6000_define_or_undefine_macro (define_p,
+				     "__SET_FPSCR_RN_RETURNS_FPSCR__");
 }
 
 void
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index b0db8ae508d..641ab2dfcfe 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -6441,7 +6441,8 @@ (define_insn "rs6000_mffscdrn"
   [(set_attr "type" "fp")])
 
 (define_expand "rs6000_set_fpscr_rn"
- [(match_operand:SI 0 "reg_or_cint_operand")]
+ [(use (match_operand:DF 0 "gpc_reg_operand"))
+  (use (match_operand:SI 1 "reg_or_cint_operand"))]
   "TARGET_HARD_FLOAT"
 {
   rtx tmp_df = gen_reg_rtx (DFmode);
@@ -6450,49 +6451,63 @@ (define_expand "rs6000_set_fpscr_rn"
      new rounding mode bits from operands[0][62:63] into FPSCR[62:63].  */
   if (TARGET_P9_MISC)
     {
-      if (const_0_to_3_operand (operands[0], VOIDmode))
-	emit_insn (gen_rs6000_mffscrni (tmp_df, operands[0]));
+      if (const_0_to_3_operand (operands[1], VOIDmode))
+	emit_insn (gen_rs6000_mffscrni (tmp_df, operands[1]));
       else
 	{
-	  rtx op0 = convert_to_mode (DImode, operands[0], false);
-	  rtx src_df = simplify_gen_subreg (DFmode, op0, DImode, 0);
+	  rtx op1 = convert_to_mode (DImode, operands[1], false);
+	  rtx src_df = simplify_gen_subreg (DFmode, op1, DImode, 0);
 	  emit_insn (gen_rs6000_mffscrn (tmp_df, src_df));
 	}
-      DONE;
+       emit_move_insn (operands[0], tmp_df);
+       DONE;
     }
 
-  if (CONST_INT_P (operands[0]))
+  /* Emulate the behavior of the mffscrni, mffscrn instructions for earlier
+     ISAs.  Return bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
+     RN) from the FPSCR.  Set the RN field based on the value in operands[1].
+  */
+
+  /* Get the current FPSCR fields, bits 29:31 (DRN) and bits 56:63 (VE, OE, UE,
+    ZE, XE, NI, RN) from the FPSCR and return them.  */
+
+  emit_insn (gen_rs6000_mffs (tmp_df));
+  rtx orig_df_in_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
+  rtx tmp_di1 = gen_reg_rtx (DImode);
+  emit_insn (gen_anddi3 (tmp_di1, orig_df_in_di,
+	     GEN_INT (0x00000007000000FFULL)));
+  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di1, DImode, 0);
+  emit_move_insn (operands[0], tmp_rtn);
+
+  if (CONST_INT_P (operands[1]))
     {
-      if ((INTVAL (operands[0]) & 0x1) == 0x1)
+      if ((INTVAL (operands[1]) & 0x1) == 0x1)
 	emit_insn (gen_rs6000_mtfsb1 (GEN_INT (31)));
       else
 	emit_insn (gen_rs6000_mtfsb0 (GEN_INT (31)));
 
-      if ((INTVAL (operands[0]) & 0x2) == 0x2)
+      if ((INTVAL (operands[1]) & 0x2) == 0x2)
 	emit_insn (gen_rs6000_mtfsb1 (GEN_INT (30)));
       else
 	emit_insn (gen_rs6000_mtfsb0 (GEN_INT (30)));
     }
   else
     {
-      rtx tmp_rn = gen_reg_rtx (DImode);
-      rtx tmp_di = gen_reg_rtx (DImode);
-
       /* Extract new RN mode from operand.  */
-      rtx op0 = convert_to_mode (DImode, operands[0], false);
-      emit_insn (gen_anddi3 (tmp_rn, op0, GEN_INT (3)));
+      rtx op1 = convert_to_mode (DImode, operands[1], false);
+      rtx tmp_rn = gen_reg_rtx (DImode);
+      emit_insn (gen_anddi3 (tmp_rn, op1, GEN_INT (3)));
 
-      /* Insert new RN mode into FSCPR.  */
-      emit_insn (gen_rs6000_mffs (tmp_df));
-      tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
-      emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
-      emit_insn (gen_iordi3 (tmp_di, tmp_di, tmp_rn));
+      /* Insert the new RN value from tmp_rn into FPSCR bit [62:63].  */
+      emit_insn (gen_anddi3 (tmp_di1, orig_df_in_di, GEN_INT (-4)));
+      rtx tmp_di2 = gen_reg_rtx (DImode);
+      emit_insn (gen_iordi3 (tmp_di2, tmp_di1, tmp_rn));
 
       /* Need to write to field k=15.  The fields are [0:15].  Hence with
 	 L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is an
 	 8-bit field[0:7]. Need to set the bit that corresponds to the
 	 value of i that you want [0:7].  */
-      tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
+      tmp_df = simplify_gen_subreg (DFmode, tmp_di2, DImode, 0);
       emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
     }
   DONE;
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index cdbd4b34a35..ce5ac59629f 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -18188,7 +18188,7 @@ double __builtin_mffs (void);
 void __builtin_mtfsf (const int, double);
 void __builtin_mtfsb0 (const int);
 void __builtin_mtfsb1 (const int);
-void __builtin_set_fpscr_rn (int);
+double __builtin_set_fpscr_rn (int);
 @end smallexample
 
 The @code{__builtin_ppc_get_timebase} and @code{__builtin_ppc_mftb}
@@ -18209,13 +18209,20 @@ values to selected fields of the FPSCR.  The
 as an argument.  The valid bit range is between 0 and 31.  The builtins map to
 the @code{mtfsb0} and @code{mtfsb1} instructions which take the argument and
 add 32.  Hence these instructions only modify the FPSCR[32:63] bits by
-changing the specified bit to a zero or one respectively.  The
-@code{__builtin_set_fpscr_rn} builtin allows changing both of the floating
-point rounding mode bits.  The argument is a 2-bit value.  The argument can
-either be a @code{const int} or stored in a variable. The builtin uses
-the ISA 3.0
-instruction @code{mffscrn} if available, otherwise it reads the FPSCR, masks
-the current rounding mode bits out and OR's in the new value.
+changing the specified bit to a zero or one respectively.
+
+The @code{__builtin_set_fpscr_rn} built-in allows changing both of the floating
+point rounding mode bits and returning the various FPSCR fields before the RN
+field is updated.  The built-in returns a double consisting of the initial
+value of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, and RN bit positions
+with all other bits set to zero.  The built-in argument is a 2-bit value for the
+new RN field value.  The argument can either be an @code{const int} or stored
+in a variable.  Earlier versions of @code{__builtin_set_fpscr_rn} returned
+void.  A @code{__SET_FPSCR_RN_RETURNS_FPSCR__} macro has been added.  If
+defined, then the @code{__builtin_set_fpscr_rn} built-in returns the FPSCR
+fields.  If not defined, the @code{__builtin_set_fpscr_rn} does not return a
+value.  If the @option{-msoft-float} option is used, the
+@code{__builtin_set_fpscr_rn} built-in will not return a value.
 
 @node Basic PowerPC Built-in Functions Available on ISA 2.05
 @subsubsection Basic PowerPC Built-in Functions Available on ISA 2.05
diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
similarity index 89%
rename from gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
rename to gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
index 04707ad8a56..e09dd646145 100644
--- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
+++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
@@ -1,5 +1,11 @@
-/* { dg-do run { target { powerpc*-*-* } } } */
-/* { dg-options "-O2 -std=c99" } */
+/* { dg-do run { target powerpc_fprs } } */
+/* { dg-options "-O2" } */
+
+/* Originally __builtin_set_fpscr_rn was defined to return void.  It was
+   later extended to return a double with the various FPSCR bits.  The
+   extended built-in is intended to be a drop in replacement for the original
+   version.  This test is for the original version of the built-in and should
+   work exactly as before.  */
 
 #ifdef DEBUG
 #include <stdio.h>
@@ -26,7 +32,7 @@ int main ()
   unsigned long long ll_value;
   register double  f14;
 
-  /* __builtin_set_fpscr_rn() builtin can take a const or a variable
+  /* __builtin_set_fpscr_rn() built-in can take a const or a variable
      value between 0 and 3 as the argument.
      __builtin_mtfsb0 and __builtin_mtfsb1 argument must be a constant 
      30 or 31.
diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
new file mode 100644
index 00000000000..be4f65d2ab3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
@@ -0,0 +1,148 @@
+/* { dg-do run { target powerpc_fprs } } */
+/* { dg-options "-O2" } */
+
+/* The __builtin_set_fpscr_rn built-in was originally defined to return
+   void.  The built-in now returns a double containing the various FPSCR bits.
+   This test verifies the new return value.  */
+
+#ifdef DEBUG
+#include <stdio.h>
+#endif
+
+#define RN_MASK 0x3LL             /* RN field mask */
+#define FIELD_MASK 0x00000007000000FFULL
+
+union blah {
+  double d;
+  unsigned long long ll;
+} conv_val;
+
+void abort (void);
+
+double __attribute__ ((noipa)) wrap_set_fpscr_rn (int val)
+{
+  return __builtin_set_fpscr_rn (val);
+}
+
+double __attribute__ ((noipa)) wrap_const_fpscr_rn (int val)
+{
+  switch (val)
+    {
+    case 0: return __builtin_set_fpscr_rn (0x0);
+    case 1: return __builtin_set_fpscr_rn (0x1);
+    case 2: return __builtin_set_fpscr_rn (0x2);
+    case 3: return __builtin_set_fpscr_rn (0x3);
+    }
+}
+
+void check_builtin_set_fpscr_rn (unsigned long long initial_fpscr,
+				 int new_RN, double result)
+{
+  register double  f14;
+  unsigned long long masked_fpscr = initial_fpscr & FIELD_MASK;
+  
+  conv_val.d = result;
+
+  /* Check the result.  */
+  if (conv_val.ll != masked_fpscr)
+    {
+#ifdef DEBUG
+       printf("ERROR, __builtin_set_fpscr_rn(%d) did not return expected value %llx.\n",
+	      new_RN, masked_fpscr);
+       printf("fpscr_val_initial = 0x%llx\n", initial_fpscr);       
+       printf("result = 0x%llx\n", conv_val.ll);
+#else
+       abort();
+#endif
+    }
+
+  /* Check to see if the RN field was updated.  */
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+
+  if ((conv_val.ll & RN_MASK) != new_RN)
+#ifdef DEBUG
+    {
+      printf("ERROR,  __builtin_set_fpscr_rn(%d) did not update RN to %llx.\n",
+	     new_RN, new_RN);
+      printf("  conv_val.ll = 0x%llx\n", conv_val.ll);
+    }
+#else
+    abort();
+#endif
+}
+
+int
+main ()
+{
+  int i;
+  int val, bit;
+  double fpscr_val;
+  unsigned long long fpscr_val_initial;
+  
+  unsigned long long ll_value;
+  union blah src_double;
+  register double  f14;
+
+  /* If __SET_FPSCR_RN_RETURNS_FPSCR__ is defined, the __builtin_set_fpscr_rn()
+     builtin returns the FPSCR fields.*/
+  
+  /* __builtin_set_fpscr_rn() builtin can take a constant or a variable
+     value between 0 and 3 as the argument.
+     __builtin_mtfsb0 and __builtin_mtfsb1 argument must be a constant 
+     30 or 31.
+  */
+
+  /* Test reading the FPSCR register */
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+
+  if (conv_val.d != __builtin_mffs())
+    {
+#ifdef DEBUG
+       printf("ERROR, __builtin_mffs() returned 0x%llx, not the expected value 0x%llx\n",
+	      __builtin_mffs(), conv_val.d);
+#else
+       abort();
+#endif
+    }		  
+
+  /* Test return value from __builtin_set_fpscr_rn. The FPSCR fields (DRN, VE,
+     OE, UE, ZE, XE, NI, RN) are returned and  the RN field of FPSCR is updated
+     with the specified argument for the built-in.  */
+
+  /* Check immediate argument cases */
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  val = 0x0;
+  fpscr_val = wrap_const_fpscr_rn (val);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, val, fpscr_val);
+
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  val = 0x3;
+  fpscr_val = wrap_const_fpscr_rn (val);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, val, fpscr_val);
+  
+  /* Check int argument cases */
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  val = 0x1;
+  fpscr_val = wrap_set_fpscr_rn (val);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, val, fpscr_val);
+
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  val = 0x2;
+  fpscr_val = wrap_set_fpscr_rn (val);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, val, fpscr_val);
+  return 0;
+}		  
-- 
2.37.2



^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH ver4] rs6000, Add return value to __builtin_set_fpscr_rn
  2023-07-11 18:06 [PATCH ver4] rs6000, Add return value to __builtin_set_fpscr_rn Carl Love
@ 2023-07-13  7:33 ` Kewen.Lin
  0 siblings, 0 replies; 2+ messages in thread
From: Kewen.Lin @ 2023-07-13  7:33 UTC (permalink / raw)
  To: Carl Love; +Cc: Peter Bergner, Segher Boessenkool, gcc-patches, dje.gcc

Hi Carl,

on 2023/7/12 02:06, Carl Love wrote:
> GCC maintainers:
> 
> Ver 4, Removed extra space in subject line.  Added comment to commit
> log comments about new __SET_FPSCR_RN_RETURNS_FPSCR__ define.  Changed
> Added to Add and Renamed to Rename in ChangeLog.  Updated define_expand
> "rs6000_set_fpscr_rn" per Peter's comments to use new temporary
> register for output value.  Also, comments from Kewen about moving rtx
> tmp_di1 close to use.  Renamed tmp_di2 as orig_df_in_di.  Additionally,
> changed the name of tmp_di3 to tmp_di2 so the numbering is
> sequential.  Moved the new rtx tmp_di2 = gen_reg_rtx (DImode); right
> before its use to be consistent with previous move request.  Fixed tabs
> in comment.  Remove -std=c99 from test_fpscr_rn_builtin_1.c. Cleaned up
> comment and removed abort from test_fpscr_rn_builtin_2.c.  
> 
> Fixed a couple of additional issues with the ChangeLog per feedback
> from git gcc-verify.
> 
> Retested updated patch on Power 8, 9 and 10 to verify changes.
> 
> Ver 3, Renamed the patch per comments on ver 2.  Previous subject line
> was " [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value".  
> Fixed spelling mistakes and formatting.  Updated define_expand
> "rs6000_set_fpscr_rn to have the rs6000_get_fpscr_fields and
> rs6000_update_fpscr_rn_field define expands inlined.  Optimized the
> code and fixed use of temporary register values. Updated the test file
> dg-do run arguments and dg-options.  Removed the check for
> __SET_FPSCR_RN_RETURNS_FPSCR__. Removed additional references to the
> overloaded built-in with double argument.  Fixed up the documentation
> file.  Updated patch retested on Power 8 BE/LE, Power 9 BE/LE and Power
> 10 LE.
> 
> Ver 2,  Went back thru the requirements and emails.  Not sure where I
> came up with the requirement for an overloaded version with double
> argument.  Removed the overloaded version with the double argument. 
> Added the macro to announce if the __builtin_set_fpscr_rn returns a
> void or a double with the FPSCR bits.  Updated the documentation file. 
> Retested on Power 8 BE/LE, Power 9 BE/LE, Power 10 LE.  Redid the test
> file.  Per request, the original test file functionality was not
> changed.  Just changed the name from test_fpscr_rn_builtin.c to 
> test_fpscr_rn_builtin_1.c.  Put new tests for the return values into a
> new test file, test_fpscr_rn_builtin_2.c.
> 
> The GLibC team requested a builtin to replace the mffscrn and
> mffscrniinline asm instructions in the GLibC code.  Previously there
> was discussion on adding builtins for the mffscrn instructions.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html
> 
> In the end, it was felt that it would be to extend the existing
> __builtin_set_fpscr_rn builtin to return a double instead of a void
> type.  The desire is that we could have the functionality of the
> mffscrn and mffscrni instructions on older ISAs.  The two instructions
> were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has the
> needed functionality to set the RN field using the mffscrn and mffscrni
> instructions if ISA 3.0 is supported or fall back to using logical
> instructions to mask and set the bits for earlier ISAs.  The
> instructions return the current value of the FPSCR fields DRN, VE, OE,
> UE, ZE, XE, NI, RN bit positions then update the RN bit positions with
> the new RN value provided.
> 
> The current __builtin_set_fpscr_rn builtin has a return type of void. 
> So, changing the return type to double and returning the  FPSCR fields
> DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
> functionally equivalent of the mffscrn and mffscrni instructions.  Any
> current uses of the builtin would just ignore the return value yet any
> new uses could use the return value.  So the requirement is for the
> change to the __builtin_set_fpscr_rn builtin to be backwardly
> compatible and work for all ISAs.
> 
> The following patch changes the return type of the
>  __builtin_set_fpscr_rn builtin from void to double.  The return value
> is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE,
> XE, NI, RN bit positions when the builtin is called.  The builtin then
> updated the RN field with the new value provided as an argument to the
> builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c to
> check that the builtin returns the current value of the FPSCR fields
> and then updates the RN field.
> 
> The GLibC team has reviewed the patch to make sure it met their needs
> as a drop in replacement for the inline asm mffscr and mffscrni
> statements in the GLibC code.  T
> 
> The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10
> LE.
> 
> Please let me know if the patch is acceptable for mainline.  Thanks.
> 
>                                Carl 
> 
> -----------------------------------------
> rs6000, Add return value to __builtin_set_fpscr_rn
> 
> Change the return value from void to double for __builtin_set_fpscr_rn.
> The return value consists of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI,
> RN bit positions.  A new test file, test powerpc/test_fpscr_rn_builtin_2.c,
> is added to test the new return value for the built-in.
> 
> The value __SET_FPSCR_RN_RETURNS_FPSCR__ is defined if
> __builtin_set_fpscr_rn returns a double.
> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn): Update
> 	built-in definition return type.
> 	* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Add check,
> 	define __SET_FPSCR_RN_RETURNS_FPSCR__ macro.
> 	* config/rs6000/rs6000.md (rs6000_set_fpscr_rn): Add return
> 	argument to return FPSCR fields.
> 	* doc/extend.texi (__builtin_set_fpscr_rn): Update description for
> 	the return value.  Add description for
> 	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.
> 
> gcc/testsuite/ChangeLog:
> 	* gcc.target/powerpc/test_fpscr_rn_builtin.c: Rename to
> 	test_fpscr_rn_builtin_1.c.  Add comment.
> 	* gcc.target/powerpc/test_fpscr_rn_builtin_2.c: New test for the
> 	return value of __builtin_set_fpscr_rn builtin.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         |   2 +-
>  gcc/config/rs6000/rs6000-c.cc                 |   5 +
>  gcc/config/rs6000/rs6000.md                   |  55 ++++---
>  gcc/doc/extend.texi                           |  23 ++-
>  ...rn_builtin.c => test_fpscr_rn_builtin_1.c} |  12 +-
>  .../powerpc/test_fpscr_rn_builtin_2.c         | 148 ++++++++++++++++++
>  6 files changed, 213 insertions(+), 32 deletions(-)
>  rename gcc/testsuite/gcc.target/powerpc/{test_fpscr_rn_builtin.c => test_fpscr_rn_builtin_1.c} (89%)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 289a37998b1..28788b69c7d 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -237,7 +237,7 @@
>    const __ibm128 __builtin_pack_ibm128 (double, double);
>      PACK_IF packif {ibm128}
>  
> -  void __builtin_set_fpscr_rn (const int[0,3]);
> +  double __builtin_set_fpscr_rn (const int[0,3]);
>      SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
>  
>    const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
> diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
> index 8555174d36e..07a32badb54 100644
> --- a/gcc/config/rs6000/rs6000-c.cc
> +++ b/gcc/config/rs6000/rs6000-c.cc
> @@ -604,6 +604,11 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags)
>    /* Tell the user -mrop-protect is in play.  */
>    if (rs6000_rop_protect)
>      rs6000_define_or_undefine_macro (define_p, "__ROP_PROTECT__");
> +  /* Tell the user __builtin_set_fpscr_rn now returns the FPSCR fields
> +     in a double.  Originally the built-in returned void.  */
> +  if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
> +    rs6000_define_or_undefine_macro (define_p,
> +				     "__SET_FPSCR_RN_RETURNS_FPSCR__");
>  }
>  
>  void
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index b0db8ae508d..641ab2dfcfe 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -6441,7 +6441,8 @@ (define_insn "rs6000_mffscdrn"
>    [(set_attr "type" "fp")])
>  
>  (define_expand "rs6000_set_fpscr_rn"
> - [(match_operand:SI 0 "reg_or_cint_operand")]
> + [(use (match_operand:DF 0 "gpc_reg_operand"))
> +  (use (match_operand:SI 1 "reg_or_cint_operand"))]
>    "TARGET_HARD_FLOAT"
>  {
>    rtx tmp_df = gen_reg_rtx (DFmode);
> @@ -6450,49 +6451,63 @@ (define_expand "rs6000_set_fpscr_rn"
>       new rounding mode bits from operands[0][62:63] into FPSCR[62:63].  */
>    if (TARGET_P9_MISC)
>      {
> -      if (const_0_to_3_operand (operands[0], VOIDmode))
> -	emit_insn (gen_rs6000_mffscrni (tmp_df, operands[0]));
> +      if (const_0_to_3_operand (operands[1], VOIDmode))
> +	emit_insn (gen_rs6000_mffscrni (tmp_df, operands[1]));
>        else
>  	{
> -	  rtx op0 = convert_to_mode (DImode, operands[0], false);
> -	  rtx src_df = simplify_gen_subreg (DFmode, op0, DImode, 0);
> +	  rtx op1 = convert_to_mode (DImode, operands[1], false);
> +	  rtx src_df = simplify_gen_subreg (DFmode, op1, DImode, 0);
>  	  emit_insn (gen_rs6000_mffscrn (tmp_df, src_df));
>  	}
> -      DONE;
> +       emit_move_insn (operands[0], tmp_df);
> +       DONE;
>      }
>  
> -  if (CONST_INT_P (operands[0]))
> +  /* Emulate the behavior of the mffscrni, mffscrn instructions for earlier
> +     ISAs.  Return bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
> +     RN) from the FPSCR.  Set the RN field based on the value in operands[1].
> +  */
> +
> +  /* Get the current FPSCR fields, bits 29:31 (DRN) and bits 56:63 (VE, OE, UE,
> +    ZE, XE, NI, RN) from the FPSCR and return them.  */
> +
> +  emit_insn (gen_rs6000_mffs (tmp_df));
> +  rtx orig_df_in_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> +  rtx tmp_di1 = gen_reg_rtx (DImode);
> +  emit_insn (gen_anddi3 (tmp_di1, orig_df_in_di,
> +	     GEN_INT (0x00000007000000FFULL)));

Nit: The line with GEN_INT should indent more to tmp_di1, that is:

  emit_insn (gen_anddi3 (tmp_di1, orig_df_in_di,
			 GEN_INT (0x00000007000000FFULL)));

> +  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di1, DImode, 0);
> +  emit_move_insn (operands[0], tmp_rtn);
> +
> +  if (CONST_INT_P (operands[1]))
>      {
> -      if ((INTVAL (operands[0]) & 0x1) == 0x1)
> +      if ((INTVAL (operands[1]) & 0x1) == 0x1)
>  	emit_insn (gen_rs6000_mtfsb1 (GEN_INT (31)));
>        else
>  	emit_insn (gen_rs6000_mtfsb0 (GEN_INT (31)));
>  
> -      if ((INTVAL (operands[0]) & 0x2) == 0x2)
> +      if ((INTVAL (operands[1]) & 0x2) == 0x2)
>  	emit_insn (gen_rs6000_mtfsb1 (GEN_INT (30)));
>        else
>  	emit_insn (gen_rs6000_mtfsb0 (GEN_INT (30)));
>      }
>    else
>      {
> -      rtx tmp_rn = gen_reg_rtx (DImode);
> -      rtx tmp_di = gen_reg_rtx (DImode);
> -
>        /* Extract new RN mode from operand.  */
> -      rtx op0 = convert_to_mode (DImode, operands[0], false);
> -      emit_insn (gen_anddi3 (tmp_rn, op0, GEN_INT (3)));
> +      rtx op1 = convert_to_mode (DImode, operands[1], false);
> +      rtx tmp_rn = gen_reg_rtx (DImode);
> +      emit_insn (gen_anddi3 (tmp_rn, op1, GEN_INT (3)));
>  
> -      /* Insert new RN mode into FSCPR.  */
> -      emit_insn (gen_rs6000_mffs (tmp_df));
> -      tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> -      emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
> -      emit_insn (gen_iordi3 (tmp_di, tmp_di, tmp_rn));
> +      /* Insert the new RN value from tmp_rn into FPSCR bit [62:63].  */
> +      emit_insn (gen_anddi3 (tmp_di1, orig_df_in_di, GEN_INT (-4)));

Nit: This tmp_di1 has been used as destination pseudo above, as Peter's
comments, it's better to use one new pseudo here.

OK for trunk with these nits fixed and bootstrapped & regress-tested
as before.  Thanks!

Kewen

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2023-07-13  7:33 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-11 18:06 [PATCH ver4] rs6000, Add return value to __builtin_set_fpscr_rn Carl Love
2023-07-13  7:33 ` Kewen.Lin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).