[PATCH] rs6000: Add buildin for mffscrn instructions

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH] rs6000: Add buildin for mffscrn instructions
@ 2023-04-13 20:47 Carl Love
  2023-05-18 21:12 ` [PATCH v2] " Carl Love
  0 siblings, 1 reply; 14+ messages in thread
From: Carl Love @ 2023-04-13 20:47 UTC (permalink / raw)
  To: Segher Boessenkool, gcc-patches; +Cc: Peter Bergner, cel


GCC maintainers:

The following patch adds an overloaded builtin.  There are two possible
arguments for the builtin.  The builtin definitions are:

  double __builtin_mffscrn (unsigned long int);
  double __builtin_mffscrn (double);

The patch has been tested on Power 10 with no regressions.  

Please let me know if the patch is acceptable for mainline.  Thanks.

                    Carl 

-----------------------------------
rs6000: Add buildin for mffscrn instructions

This patch adds overloaded __builtin_mffscrn for the move From FPSCR
Control & Set R instruction with an immediate argument.  It also adds the
builtin with a floating point register argument.  A new runnable test is
added for the new builtin.

gcc/

	* config/rs6000/rs6000-builtins.def (__builtin_mffscrni,
	__builtin_mffscrnd): Add builtin definitions.
	* config/rs6000/rs6000-overload.def (__builtin_mffscrn): Add
	overloaded definition.
	* doc/extend.texi: Add documentation for __builtin_mffscrn.

gcc/testsuite/

	* gcc.target/powerpc/builtin-mffscrn.c: Add testcase for new
	builtin.
---
 gcc/config/rs6000/rs6000-builtins.def         |   7 ++
 gcc/config/rs6000/rs6000-overload.def         |   5 +
 gcc/doc/extend.texi                           |   8 ++
 .../gcc.target/powerpc/builtin-mffscrn.c      | 105 ++++++++++++++++++
 4 files changed, 125 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 03fb194b151..6247cb6c0fe 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2863,6 +2863,13 @@
   pure vsc __builtin_vsx_xl_len_r (void *, signed long);
     XL_LEN_R xl_len_r {}
 
+; Immediate instruction only uses the least significant two bits of the
+; const int.
+  double __builtin_mffscrni (const int<2>);
+    MFFSCRNI rs6000_mffscrni {}
+
+  double __builtin_mffscrnd (double);
+    MFFSCRNF rs6000_mffscrn {}
 
 ; Builtins requiring hardware support for IEEE-128 floating-point.
 [ieee128-hw]
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index c582490c084..adda2df69ea 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -78,6 +78,11 @@
 ; like after a required newline, but nowhere else.  Lines beginning with
 ; a semicolon are also treated as blank lines.
 
+[MFFSCR, __builtin_mffscrn, __builtin_mffscrn]
+  double __builtin_mffscrn (const int<2>);
+    MFFSCRNI
+  double __builtin_mffscrn (double);
+    MFFSCRNF
 
 [BCDADD, __builtin_bcdadd, __builtin_vec_bcdadd]
   vsq __builtin_vec_bcdadd (vsq, vsq, const int);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 3adb67aa47a..168d439c0e4 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -18317,6 +18317,9 @@ int __builtin_dfp_dtstsfi_ov_td (unsigned int comparison, _Decimal128 value);
 
 double __builtin_mffsl(void);
 
+double __builtin_mffscrn (unsigned long int);
+double __builtin_mffscrn (double);
+
 @end smallexample
 The @code{__builtin_byte_in_set} function requires a
 64-bit environment supporting ISA 3.0 or later.  This function returns
@@ -18373,6 +18376,11 @@ the FPSCR.  The instruction is a lower latency version of the @code{mffs}
 instruction.  If the @code{mffsl} instruction is not available, then the
 builtin uses the older @code{mffs} instruction to read the FPSCR.
 
+The @code{__builtin_mffscrn} returns the contents of the control bits in the
+FPSCR, bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI, RN).  The
+contents of bits [62:63] of the unsigned long int or double argument are placed
+into bits [62:63] of the FPSCR (RN).
+
 @node Basic PowerPC Built-in Functions Available on ISA 3.1
 @subsubsection Basic PowerPC Built-in Functions Available on ISA 3.1
 
diff --git a/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c b/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c
new file mode 100644
index 00000000000..433a9081499
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c
@@ -0,0 +1,105 @@
+/* { dg-do run } */
+/* { dg-require-effective-target p9vector_hw } */
+
+#include <altivec.h>
+
+#ifdef DEBUG
+#include <stdio.h>
+#endif
+
+#define MASK 0x3
+#define EXPECTED1 0x1
+#define EXPECTED2 0x2
+
+void abort (void);
+
+int
+main()
+{
+  unsigned long mask, result, expected;
+  double double_arg;
+  
+  union convert_t {
+    double d;
+    unsigned long ul;
+  } val;
+
+  /* Test immediate version of __builtin_mffscrn. */
+  /* Read FPSCR and set RN bits in FPSCR[62:63]. */
+  val.d = __builtin_mffscrn (EXPECTED2);
+
+  /* Read FPSCR, bits [62:63] should have been set to 0x2 by previous builtin
+     call.  */
+  val.d = __builtin_mffscrn (EXPECTED1);
+  /* The expected result is the argument for the previous call to
+     __builtin_mffscrn.  */
+  expected = EXPECTED2;
+  result = MASK & val.ul;
+
+  if (EXPECTED2 != result)
+#ifdef DEBUG
+    printf("Result of mffscrn immediate doesn't match EXPECTED2.  Result was 0x%lx\n",
+	   result);
+#else
+    abort();
+#endif
+
+  /* Read FPSCR, bits [62:63] should have been set to 0x1 by previous builtin
+   call*/
+  val.d = __builtin_mffscrn (EXPECTED1);
+  expected = EXPECTED1;
+  result = MASK & val.ul;
+
+  if (EXPECTED1 != result)
+#ifdef DEBUG
+  printf("Result of mffscrn immediate doesn't match EXPECTED1.  Result was 0x%lx\n",
+	 result);
+#else
+  abort();
+#endif
+
+
+  /* Test double argument version of __builtin_mffscrn */
+  val.ul = EXPECTED2;
+  double_arg = val.d;
+
+  /* Read FPSCR and set RN bits in FPSCR[62:63]. */
+  val.d = __builtin_mffscrn (double_arg);
+
+  /* Read FPSCR, bits [62:63] should have been set to 0x2 by previous builtin  
+     call.  */
+
+  val.ul = EXPECTED1;
+  double_arg = val.d;
+
+  val.d = __builtin_mffscrn (double_arg);
+  /* The expected result is the argument for the previous call to
+     __builtin_mffscrn.  */
+  expected = EXPECTED2;
+  result = MASK & val.ul;
+
+  if (EXPECTED2 != result)
+#ifdef DEBUG
+    printf("Result of mffscrn double arg doesn't match EXPECTED2.  Result was 0x%lx\n",
+	   result);
+#else
+    abort();
+#endif
+
+  /* Read FPSCR, bits [62:63] should have been set to 0x1 by previous builtin
+   call*/
+  val.ul = EXPECTED1;
+  double_arg = val.d;
+
+  val.d = __builtin_mffscrn (double_arg);
+  expected = EXPECTED1;
+  result = MASK & val.ul;
+
+  if (EXPECTED1 != result)
+#ifdef DEBUG
+    printf("Result of mffscrn double arg doesn't match EXPECTED1.  Result was 0x%lx\n",
+	   result);
+#else
+    abort();
+#endif
+}
-- 
2.37.2



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2] rs6000: Add buildin for mffscrn instructions
  2023-04-13 20:47 [PATCH] rs6000: Add buildin for mffscrn instructions Carl Love
@ 2023-05-18 21:12 ` Carl Love
  2023-05-22  6:36   ` Kewen.Lin
  0 siblings, 1 reply; 14+ messages in thread
From: Carl Love @ 2023-05-18 21:12 UTC (permalink / raw)
  To: Segher Boessenkool, gcc-patches; +Cc: Peter Bergner

GCC maintainers:

version 2.  Fixed an issue with the test case.  The dg-options line was
missing.

The following patch adds an overloaded builtin.  There are two possible
arguments for the builtin.  The builtin definitions are:

  double __builtin_mffscrn (unsigned long int);
  double __builtin_mffscrn (double);

The patch has been tested on Power 10 with no regressions.  

Please let me know if the patch is acceptable for mainline.  Thanks.

                    Carl

------------------------------------------------
rs6000: Add buildin for mffscrn instructions

This patch adds overloaded __builtin_mffscrn for the move From FPSCR
Control & Set R instruction with an immediate argument.  It also adds the
builtin with a floating point register argument.  A new runnable test is
added for the new builtin.

gcc/

	* config/rs6000/rs6000-builtins.def (__builtin_mffscrni,
	__builtin_mffscrnd): Add builtin definitions.
	* config/rs6000/rs6000-overload.def (__builtin_mffscrn): Add
	overloaded definition.
	* doc/extend.texi: Add documentation for __builtin_mffscrn.

gcc/testsuite/

	* gcc.target/powerpc/builtin-mffscrn.c: Add testcase for new
	builtin.
---
 gcc/config/rs6000/rs6000-builtins.def         |   7 ++
 gcc/config/rs6000/rs6000-overload.def         |   5 +
 gcc/doc/extend.texi                           |   8 ++
 .../gcc.target/powerpc/builtin-mffscrn.c      | 106 ++++++++++++++++++
 4 files changed, 126 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 92d9b46e1b9..67125473684 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2875,6 +2875,13 @@
   pure vsc __builtin_vsx_xl_len_r (void *, signed long);
     XL_LEN_R xl_len_r {}
 
+; Immediate instruction only uses the least significant two bits of the
+; const int.
+  double __builtin_mffscrni (const int<2>);
+    MFFSCRNI rs6000_mffscrni {}
+
+  double __builtin_mffscrnd (double);
+    MFFSCRNF rs6000_mffscrn {}
 
 ; Builtins requiring hardware support for IEEE-128 floating-point.
 [ieee128-hw]
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index c582490c084..adda2df69ea 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -78,6 +78,11 @@
 ; like after a required newline, but nowhere else.  Lines beginning with
 ; a semicolon are also treated as blank lines.
 
+[MFFSCR, __builtin_mffscrn, __builtin_mffscrn]
+  double __builtin_mffscrn (const int<2>);
+    MFFSCRNI
+  double __builtin_mffscrn (double);
+    MFFSCRNF
 
 [BCDADD, __builtin_bcdadd, __builtin_vec_bcdadd]
   vsq __builtin_vec_bcdadd (vsq, vsq, const int);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index ed8b9c8a87b..f16c046051a 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -18455,6 +18455,9 @@ int __builtin_dfp_dtstsfi_ov_td (unsigned int comparison, _Decimal128 value);
 
 double __builtin_mffsl(void);
 
+double __builtin_mffscrn (unsigned long int);
+double __builtin_mffscrn (double);
+
 @end smallexample
 The @code{__builtin_byte_in_set} function requires a
 64-bit environment supporting ISA 3.0 or later.  This function returns
@@ -18511,6 +18514,11 @@ the FPSCR.  The instruction is a lower latency version of the @code{mffs}
 instruction.  If the @code{mffsl} instruction is not available, then the
 builtin uses the older @code{mffs} instruction to read the FPSCR.
 
+The @code{__builtin_mffscrn} returns the contents of the control bits in the
+FPSCR, bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI, RN).  The
+contents of bits [62:63] of the unsigned long int or double argument are placed
+into bits [62:63] of the FPSCR (RN).
+
 @node Basic PowerPC Built-in Functions Available on ISA 3.1
 @subsubsection Basic PowerPC Built-in Functions Available on ISA 3.1
 
diff --git a/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c b/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c
new file mode 100644
index 00000000000..26c666a4091
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c
@@ -0,0 +1,106 @@
+/* { dg-do run } */
+/* { dg-require-effective-target p9vector_hw } */
+/* { dg-options "-mpower9-vector -mdejagnu-cpu=power9" } */
+
+#include <altivec.h>
+
+#ifdef DEBUG
+#include <stdio.h>
+#endif
+
+#define MASK 0x3
+#define EXPECTED1 0x1
+#define EXPECTED2 0x2
+
+void abort (void);
+
+int
+main()
+{
+  unsigned long mask, result, expected;
+  double double_arg;
+  
+  union convert_t {
+    double d;
+    unsigned long ul;
+  } val;
+
+  /* Test immediate version of __builtin_mffscrn. */
+  /* Read FPSCR and set RN bits in FPSCR[62:63]. */
+  val.d = __builtin_mffscrn (EXPECTED2);
+
+  /* Read FPSCR, bits [62:63] should have been set to 0x2 by previous builtin
+     call.  */
+  val.d = __builtin_mffscrn (EXPECTED1);
+  /* The expected result is the argument for the previous call to
+     __builtin_mffscrn.  */
+  expected = EXPECTED2;
+  result = MASK & val.ul;
+
+  if (EXPECTED2 != result)
+#ifdef DEBUG
+    printf("Result of mffscrn immediate doesn't match EXPECTED2.  Result was 0x%lx\n",
+	   result);
+#else
+    abort();
+#endif
+
+  /* Read FPSCR, bits [62:63] should have been set to 0x1 by previous builtin
+   call*/
+  val.d = __builtin_mffscrn (EXPECTED1);
+  expected = EXPECTED1;
+  result = MASK & val.ul;
+
+  if (EXPECTED1 != result)
+#ifdef DEBUG
+  printf("Result of mffscrn immediate doesn't match EXPECTED1.  Result was 0x%lx\n",
+	 result);
+#else
+  abort();
+#endif
+
+
+  /* Test double argument version of __builtin_mffscrn */
+  val.ul = EXPECTED2;
+  double_arg = val.d;
+
+  /* Read FPSCR and set RN bits in FPSCR[62:63]. */
+  val.d = __builtin_mffscrn (double_arg);
+
+  /* Read FPSCR, bits [62:63] should have been set to 0x2 by previous builtin  
+     call.  */
+
+  val.ul = EXPECTED1;
+  double_arg = val.d;
+
+  val.d = __builtin_mffscrn (double_arg);
+  /* The expected result is the argument for the previous call to
+     __builtin_mffscrn.  */
+  expected = EXPECTED2;
+  result = MASK & val.ul;
+
+  if (EXPECTED2 != result)
+#ifdef DEBUG
+    printf("Result of mffscrn double arg doesn't match EXPECTED2.  Result was 0x%lx\n",
+	   result);
+#else
+    abort();
+#endif
+
+  /* Read FPSCR, bits [62:63] should have been set to 0x1 by previous builtin
+   call*/
+  val.ul = EXPECTED1;
+  double_arg = val.d;
+
+  val.d = __builtin_mffscrn (double_arg);
+  expected = EXPECTED1;
+  result = MASK & val.ul;
+
+  if (EXPECTED1 != result)
+#ifdef DEBUG
+    printf("Result of mffscrn double arg doesn't match EXPECTED1.  Result was 0x%lx\n",
+	   result);
+#else
+    abort();
+#endif
+}
-- 
2.37.2



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions
  2023-05-18 21:12 ` [PATCH v2] " Carl Love
@ 2023-05-22  6:36   ` Kewen.Lin
  2023-05-22 17:31     ` Carl Love
  2023-05-22 17:36     ` [PATCH v3] " Carl Love
  0 siblings, 2 replies; 14+ messages in thread
From: Kewen.Lin @ 2023-05-22  6:36 UTC (permalink / raw)
  To: Carl Love; +Cc: Peter Bergner, Segher Boessenkool, gcc-patches

Hi Carl,

on 2023/5/19 05:12, Carl Love via Gcc-patches wrote:
> GCC maintainers:
> 
> version 2.  Fixed an issue with the test case.  The dg-options line was
> missing.
> 
> The following patch adds an overloaded builtin.  There are two possible
> arguments for the builtin.  The builtin definitions are:
> 
>   double __builtin_mffscrn (unsigned long int);
>   double __builtin_mffscrn (double);
> 

We already have one  bif __builtin_set_fpscr_rn for RN setting, apparently
these two are mainly for direct mapping to mffscr[ni] and want the FPSCR bits.
I'm curious what's the requirements requesting these two built-in functions?

> The patch has been tested on Power 10 with no regressions.  
> 
> Please let me know if the patch is acceptable for mainline.  Thanks.
> 
>                     Carl
> 
> ------------------------------------------------
> rs6000: Add buildin for mffscrn instructions
> 

s/buildin/built-in/

> This patch adds overloaded __builtin_mffscrn for the move From FPSCR
> Control & Set R instruction with an immediate argument.  It also adds the
> builtin with a floating point register argument.  A new runnable test is
> added for the new builtin.

s/Set R/Set RN/

> 
> gcc/
> 
> 	* config/rs6000/rs6000-builtins.def (__builtin_mffscrni,
> 	__builtin_mffscrnd): Add builtin definitions.
> 	* config/rs6000/rs6000-overload.def (__builtin_mffscrn): Add
> 	overloaded definition.
> 	* doc/extend.texi: Add documentation for __builtin_mffscrn.
> 
> gcc/testsuite/
> 
> 	* gcc.target/powerpc/builtin-mffscrn.c: Add testcase for new
> 	builtin.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         |   7 ++
>  gcc/config/rs6000/rs6000-overload.def         |   5 +
>  gcc/doc/extend.texi                           |   8 ++
>  .../gcc.target/powerpc/builtin-mffscrn.c      | 106 ++++++++++++++++++
>  4 files changed, 126 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 92d9b46e1b9..67125473684 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -2875,6 +2875,13 @@
>    pure vsc __builtin_vsx_xl_len_r (void *, signed long);
>      XL_LEN_R xl_len_r {}
>  
> +; Immediate instruction only uses the least significant two bits of the
> +; const int.
> +  double __builtin_mffscrni (const int<2>);
> +    MFFSCRNI rs6000_mffscrni {}
> +
> +  double __builtin_mffscrnd (double);
> +    MFFSCRNF rs6000_mffscrn {}
>  

Why are them put in [power9-64] rather than [power9]?  IMHO [power9] is the
correct stanza for them.  Besides, {nosoft} attribute is required.

>  ; Builtins requiring hardware support for IEEE-128 floating-point.
>  [ieee128-hw]
> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
> index c582490c084..adda2df69ea 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -78,6 +78,11 @@
>  ; like after a required newline, but nowhere else.  Lines beginning with
>  ; a semicolon are also treated as blank lines.
>  
> +[MFFSCR, __builtin_mffscrn, __builtin_mffscrn]
> +  double __builtin_mffscrn (const int<2>);
> +    MFFSCRNI
> +  double __builtin_mffscrn (double);
> +    MFFSCRNF
>  
>  [BCDADD, __builtin_bcdadd, __builtin_vec_bcdadd]
>    vsq __builtin_vec_bcdadd (vsq, vsq, const int);
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index ed8b9c8a87b..f16c046051a 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -18455,6 +18455,9 @@ int __builtin_dfp_dtstsfi_ov_td (unsigned int comparison, _Decimal128 value);
>  
>  double __builtin_mffsl(void);
>  
> +double __builtin_mffscrn (unsigned long int);
> +double __builtin_mffscrn (double);

s/unsigned long int/const int/

Note that this section is for all configurations and your implementation is put
__builtin_mffscrn power9 only, so if the intention (requirement) is to make this
be for also all configurations, we need to deal with the cases without the support
of actual hw insns mffscrn{,i}, just like the existing handlings for mffsl etc.

> +
>  @end smallexample
>  The @code{__builtin_byte_in_set} function requires a
>  64-bit environment supporting ISA 3.0 or later.  This function returns
> @@ -18511,6 +18514,11 @@ the FPSCR.  The instruction is a lower latency version of the @code{mffs}
>  instruction.  If the @code{mffsl} instruction is not available, then the
>  builtin uses the older @code{mffs} instruction to read the FPSCR.
>  
> +The @code{__builtin_mffscrn} returns the contents of the control bits in the
> +FPSCR, bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI, RN).  The
> +contents of bits [62:63] of the unsigned long int or double argument are placed
> +into bits [62:63] of the FPSCR (RN).
> +

I know this description is copied from ISA doc, but this part is for GCC documentation,
the document conventions can be different, I prefer to just describe which control
bits in FPSCR like "control bits DRN and VE, OE, UE, ZE, XE, NI, RN in FPSCR are
returned with the same locations in FPSCR, RN in FPSCR is set according to the given
2-bits constant for the variant with const int type argument, or the given double whose
two bits sits in the same locations of FPSCR RN for the variant with double type argument."

>  @node Basic PowerPC Built-in Functions Available on ISA 3.1
>  @subsubsection Basic PowerPC Built-in Functions Available on ISA 3.1
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c b/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c
> new file mode 100644
> index 00000000000..26c666a4091
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c
> @@ -0,0 +1,106 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target p9vector_hw } */

Since it's nothing about vector, I think you should use p9modulo_hw instead ...

> +/* { dg-options "-mpower9-vector -mdejagnu-cpu=power9" } */

... and it doesn't need -mpower9-vector here.

BR,
Kewen


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions
  2023-05-22  6:36   ` Kewen.Lin
@ 2023-05-22 17:31     ` Carl Love
  2023-05-23  5:24       ` Kewen.Lin
  2023-05-22 17:36     ` [PATCH v3] " Carl Love
  1 sibling, 1 reply; 14+ messages in thread
From: Carl Love @ 2023-05-22 17:31 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: Peter Bergner, Segher Boessenkool, gcc-patches, cel

On Mon, 2023-05-22 at 14:36 +0800, Kewen.Lin wrote:
> Hi Carl,
> 
> on 2023/5/19 05:12, Carl Love via Gcc-patches wrote:
> > GCC maintainers:
> > 
> > version 2.  Fixed an issue with the test case.  The dg-options line
> > was
> > missing.
> > 
> > The following patch adds an overloaded builtin.  There are two
> > possible
> > arguments for the builtin.  The builtin definitions are:
> > 
> >   double __builtin_mffscrn (unsigned long int);
> >   double __builtin_mffscrn (double);
> > 
> 
> We already have one  bif __builtin_set_fpscr_rn for RN setting,
> apparently
> these two are mainly for direct mapping to mffscr[ni] and want the
> FPSCR bits.
> I'm curious what's the requirements requesting these two built-in
> functions?

The builtins were requested for use in GLibC.  As of version 2.31 they
were added as inline asm.  They requested a builtin so the asm could be
removed.

> 
> > The patch has been tested on Power 10 with no regressions.  
> > 
> > Please let me know if the patch is acceptable for
> > mainline.  Thanks.
> > 
> >                     Carl
> > 
> > ------------------------------------------------
> > rs6000: Add buildin for mffscrn instructions
> > 
> 
> s/buildin/built-in/

fixed
> 
> > This patch adds overloaded __builtin_mffscrn for the move From
> > FPSCR
> > Control & Set R instruction with an immediate argument.  It also
> > adds the
> > builtin with a floating point register argument.  A new runnable
> > test is
> > added for the new builtin.
> 
> s/Set R/Set RN/

fixed

> > gcc/
> > 
> > 	* config/rs6000/rs6000-builtins.def (__builtin_mffscrni,
> > 	__builtin_mffscrnd): Add builtin definitions.
> > 	* config/rs6000/rs6000-overload.def (__builtin_mffscrn): Add
> > 	overloaded definition.
> > 	* doc/extend.texi: Add documentation for __builtin_mffscrn.
> > 
> > gcc/testsuite/
> > 
> > 	* gcc.target/powerpc/builtin-mffscrn.c: Add testcase for new
> > 	builtin.
> > ---
> >  gcc/config/rs6000/rs6000-builtins.def         |   7 ++
> >  gcc/config/rs6000/rs6000-overload.def         |   5 +
> >  gcc/doc/extend.texi                           |   8 ++
> >  .../gcc.target/powerpc/builtin-mffscrn.c      | 106
> > ++++++++++++++++++
> >  4 files changed, 126 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/powerpc/builtin-
> > mffscrn.c
> > 
> > diff --git a/gcc/config/rs6000/rs6000-builtins.def
> > b/gcc/config/rs6000/rs6000-builtins.def
> > index 92d9b46e1b9..67125473684 100644
> > --- a/gcc/config/rs6000/rs6000-builtins.def
> > +++ b/gcc/config/rs6000/rs6000-builtins.def
> > @@ -2875,6 +2875,13 @@
> >    pure vsc __builtin_vsx_xl_len_r (void *, signed long);
> >      XL_LEN_R xl_len_r {}
> >  
> > +; Immediate instruction only uses the least significant two bits
> > of the
> > +; const int.
> > +  double __builtin_mffscrni (const int<2>);
> > +    MFFSCRNI rs6000_mffscrni {}
> > +
> > +  double __builtin_mffscrnd (double);
> > +    MFFSCRNF rs6000_mffscrn {}
> >  
> 
> Why are them put in [power9-64] rather than [power9]?  IMHO [power9]
> is the
> correct stanza for them.

Moved them to power 9 stanza.

>   Besides, {nosoft} attribute is required.

OK, added that.  I was trying to figure out why nosoft is needed.  The
instructions are manipulating bits in a physical register that controls
the hardware floating point instructions.  It looks to me like that
would be why.  Because if you were using msoft float then the floating
point HW registers are disabled and the floating point operations are
done using software.  Did I figure this out correctly?

 
> 
> >  ; Builtins requiring hardware support for IEEE-128 floating-point.
> >  [ieee128-hw]
> > diff --git a/gcc/config/rs6000/rs6000-overload.def
> > b/gcc/config/rs6000/rs6000-overload.def
> > index c582490c084..adda2df69ea 100644
> > --- a/gcc/config/rs6000/rs6000-overload.def
> > +++ b/gcc/config/rs6000/rs6000-overload.def
> > @@ -78,6 +78,11 @@
> >  ; like after a required newline, but nowhere else.  Lines
> > beginning with
> >  ; a semicolon are also treated as blank lines.
> >  
> > +[MFFSCR, __builtin_mffscrn, __builtin_mffscrn]
> > +  double __builtin_mffscrn (const int<2>);
> > +    MFFSCRNI
> > +  double __builtin_mffscrn (double);
> > +    MFFSCRNF
> >  
> >  [BCDADD, __builtin_bcdadd, __builtin_vec_bcdadd]
> >    vsq __builtin_vec_bcdadd (vsq, vsq, const int);
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > index ed8b9c8a87b..f16c046051a 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -18455,6 +18455,9 @@ int __builtin_dfp_dtstsfi_ov_td (unsigned
> > int comparison, _Decimal128 value);
> >  
> >  double __builtin_mffsl(void);
> >  
> > +double __builtin_mffscrn (unsigned long int);
> > +double __builtin_mffscrn (double);
> 
> s/unsigned long int/const int/

Fixed

> 
> Note that this section is for all configurations and your
> implementation is put
> __builtin_mffscrn power9 only, so if the intention (requirement) is
> to make this
> be for also all configurations, we need to deal with the cases
> without the support
> of actual hw insns mffscrn{,i}, just like the existing handlings for
> mffsl etc.

I think it should be sufficient to just support these when floating
hardware instructions are supported.  So I believe I just need to move
these to the correct place in the document.  I think that means they
should be in the section:

  The following functions require @option{-mhard-float},
  @option{-mpowerpc-gfxopt}, and @option{-mpopcntb} options.

Moved to the end of the above section.  Hope that is correct.

> 
> > +
> >  @end smallexample
> >  The @code{__builtin_byte_in_set} function requires a
> >  64-bit environment supporting ISA 3.0 or later.  This function
> > returns
> > @@ -18511,6 +18514,11 @@ the FPSCR.  The instruction is a lower
> > latency version of the @code{mffs}
> >  instruction.  If the @code{mffsl} instruction is not available,
> > then the
> >  builtin uses the older @code{mffs} instruction to read the FPSCR.
> >  
> > +The @code{__builtin_mffscrn} returns the contents of the control
> > bits in the
> > +FPSCR, bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
> > RN).  The
> > +contents of bits [62:63] of the unsigned long int or double
> > argument are placed
> > +into bits [62:63] of the FPSCR (RN).
> > +
> 
> I know this description is copied from ISA doc, but this part is for
> GCC documentation,
> the document conventions can be different, I prefer to just describe
> which control
> bits in FPSCR like "control bits DRN and VE, OE, UE, ZE, XE, NI, RN
> in FPSCR are
> returned with the same locations in FPSCR, RN in FPSCR is set
> according to the given
> 2-bits constant for the variant with const int type argument, or the
> given double whose
> two bits sits in the same locations of FPSCR RN for the variant with
> double type argument."

Updated the description to not specify the specific bit positions.

> 
> >  @node Basic PowerPC Built-in Functions Available on ISA 3.1
> >  @subsubsection Basic PowerPC Built-in Functions Available on ISA
> > 3.1
> >  
> > diff --git a/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c
> > b/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c
> > new file mode 100644
> > index 00000000000..26c666a4091
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c
> > @@ -0,0 +1,106 @@
> > +/* { dg-do run } */
> > +/* { dg-require-effective-target p9vector_hw } */
> 
> Since it's nothing about vector, I think you should use p9modulo_hw
> instead ...

changed

> 
> > +/* { dg-options "-mpower9-vector -mdejagnu-cpu=power9" } */
> 
> ... and it doesn't need -mpower9-vector here.

Removed -mpower9-vector
> 
> BR,
> Kewen
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v3] rs6000: Add buildin for mffscrn instructions
  2023-05-22  6:36   ` Kewen.Lin
  2023-05-22 17:31     ` Carl Love
@ 2023-05-22 17:36     ` Carl Love
  1 sibling, 0 replies; 14+ messages in thread
From: Carl Love @ 2023-05-22 17:36 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: Peter Bergner, Segher Boessenkool, gcc-patches

Kewen, Segher, GCC maintainers:

Version 3, fixed various issues noted by Kewen.  Retested on Power 10. 
No regression issues.

Version 2,  Fixed an issue with the test case.  The dg-options line was
missing.

The following patch adds an overloaded builtin.  There are two possible
arguments for the builtin.  The builtin definitions are:

  double __builtin_mffscrn (unsigned long int);
  double __builtin_mffscrn (double);

The patch has been tested on Power 10 with no regressions.  

Please let me know if the patch is acceptable for mainline.  Thanks.

                    Carl 
-------------------------------------------

rs6000: Add builtin for mffscrn instructions

This patch adds overloaded __builtin_mffscrn for the move From FPSCR
Control & Set RN instruction with an immediate argument.  It also adds the
builtin with a floating point register argument.  A new runnable test is
added for the new builtin.

gcc/

	* config/rs6000/rs6000-builtins.def (__builtin_mffscrni,
	__builtin_mffscrnd): Add builtin definitions.
	* config/rs6000/rs6000-overload.def (__builtin_mffscrn): Add
	overloaded definition.
	* doc/extend.texi: Add documentation for __builtin_mffscrn.

gcc/testsuite/

	* gcc.target/powerpc/builtin-mffscrn.c: Add testcase for new
	builtin.

---
 gcc/config/rs6000/rs6000-builtins.def         |   9 +-
 gcc/config/rs6000/rs6000-overload.def         |   5 +
 gcc/doc/extend.texi                           |  10 ++
 .../gcc.target/powerpc/builtin-mffscrn.c      | 106 ++++++++++++++++++
 4 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 92d9b46e1b9..ae08d2fbff7 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2849,6 +2849,14 @@
   const signed int  __builtin_vsx_scalar_extract_exp (double);
     VSEEDP xsxexpdp_si {}
 
+; Immediate instruction only uses the least significant two bits of the
+; const int.
+  double __builtin_mffscrni (const int<2>);
+    MFFSCRNI rs6000_mffscrni {nosoft}
+
+  double __builtin_mffscrnd (double);
+    MFFSCRNF rs6000_mffscrn {nosoft}
+
 [power9-64]
   void __builtin_altivec_xst_len_r (vsc, void *, long);
     XST_LEN_R xst_len_r {}
@@ -2875,7 +2883,6 @@
   pure vsc __builtin_vsx_xl_len_r (void *, signed long);
     XL_LEN_R xl_len_r {}
 
-
 ; Builtins requiring hardware support for IEEE-128 floating-point.
 [ieee128-hw]
   fpmath _Float128 __builtin_addf128_round_to_odd (_Float128, _Float128);
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 26dc662b8fb..39423bcec2b 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -78,6 +78,11 @@
 ; like after a required newline, but nowhere else.  Lines beginning with
 ; a semicolon are also treated as blank lines.
 
+[MFFSCR, __builtin_mffscrn, __builtin_mffscrn]
+  double __builtin_mffscrn (const int<2>);
+    MFFSCRNI
+  double __builtin_mffscrn (double);
+    MFFSCRNF
 
 [BCDADD, __builtin_bcdadd, __builtin_vec_bcdadd]
   vsq __builtin_vec_bcdadd (vsq, vsq, const int);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index ed8b9c8a87b..82f9932666a 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -18274,6 +18274,16 @@ The @code{__builtin_recipdiv}, and @code{__builtin_recipdivf}
 functions generate multiple instructions to implement division using
 the reciprocal estimate instructions.
 
+double __builtin_mffscrn (const int);
+double __builtin_mffscrn (double);
+
+The @code{__builtin_mffscrn} returns the contents of the control bits DRN, VE,
+OE, UE, ZE, XE, NI, RN in the FPSCR are returned with RN updated appropriately.
+In the case of the const int variant of the builtin, RN is set to the 2-bit
+value specified in the builtin.  In the case of the double builtin variant, the
+2-bit value in the double argument that corresponds to the RN location in the
+FPSCR is updated.
+
 The following functions require @option{-mhard-float} and
 @option{-mmultiple} options.
 
diff --git a/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c b/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c
new file mode 100644
index 00000000000..69a7a17cfc7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c
@@ -0,0 +1,106 @@
+/* { dg-do run } */
+/* { dg-require-effective-target p9modulo_hw } */
+/* { dg-options "-mdejagnu-cpu=power9" } */
+
+#include <altivec.h>
+
+#ifdef DEBUG
+#include <stdio.h>
+#endif
+
+#define MASK 0x3
+#define EXPECTED1 0x1
+#define EXPECTED2 0x2
+
+void abort (void);
+
+int
+main()
+{
+  unsigned long mask, result, expected;
+  double double_arg;
+  
+  union convert_t {
+    double d;
+    unsigned long ul;
+  } val;
+
+  /* Test immediate version of __builtin_mffscrn. */
+  /* Read FPSCR and set RN bits in FPSCR[62:63]. */
+  val.d = __builtin_mffscrn (EXPECTED2);
+
+  /* Read FPSCR, bits [62:63] should have been set to 0x2 by previous builtin
+     call.  */
+  val.d = __builtin_mffscrn (EXPECTED1);
+  /* The expected result is the argument for the previous call to
+     __builtin_mffscrn.  */
+  expected = EXPECTED2;
+  result = MASK & val.ul;
+
+  if (EXPECTED2 != result)
+#ifdef DEBUG
+    printf("Result of mffscrn immediate doesn't match EXPECTED2.  Result was 0x%lx\n",
+	   result);
+#else
+    abort();
+#endif
+
+  /* Read FPSCR, bits [62:63] should have been set to 0x1 by previous builtin
+   call*/
+  val.d = __builtin_mffscrn (EXPECTED1);
+  expected = EXPECTED1;
+  result = MASK & val.ul;
+
+  if (EXPECTED1 != result)
+#ifdef DEBUG
+  printf("Result of mffscrn immediate doesn't match EXPECTED1.  Result was 0x%lx\n",
+	 result);
+#else
+  abort();
+#endif
+
+
+  /* Test double argument version of __builtin_mffscrn */
+  val.ul = EXPECTED2;
+  double_arg = val.d;
+
+  /* Read FPSCR and set RN bits in FPSCR[62:63]. */
+  val.d = __builtin_mffscrn (double_arg);
+
+  /* Read FPSCR, bits [62:63] should have been set to 0x2 by previous builtin  
+     call.  */
+
+  val.ul = EXPECTED1;
+  double_arg = val.d;
+
+  val.d = __builtin_mffscrn (double_arg);
+  /* The expected result is the argument for the previous call to
+     __builtin_mffscrn.  */
+  expected = EXPECTED2;
+  result = MASK & val.ul;
+
+  if (EXPECTED2 != result)
+#ifdef DEBUG
+    printf("Result of mffscrn double arg doesn't match EXPECTED2.  Result was 0x%lx\n",
+	   result);
+#else
+    abort();
+#endif
+
+  /* Read FPSCR, bits [62:63] should have been set to 0x1 by previous builtin
+   call*/
+  val.ul = EXPECTED1;
+  double_arg = val.d;
+
+  val.d = __builtin_mffscrn (double_arg);
+  expected = EXPECTED1;
+  result = MASK & val.ul;
+
+  if (EXPECTED1 != result)
+#ifdef DEBUG
+    printf("Result of mffscrn double arg doesn't match EXPECTED1.  Result was 0x%lx\n",
+	   result);
+#else
+    abort();
+#endif
+}
-- 
2.37.2



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions
  2023-05-22 17:31     ` Carl Love
@ 2023-05-23  5:24       ` Kewen.Lin
  2023-05-23 22:30         ` Peter Bergner
  0 siblings, 1 reply; 14+ messages in thread
From: Kewen.Lin @ 2023-05-23  5:24 UTC (permalink / raw)
  To: Carl Love; +Cc: Peter Bergner, Segher Boessenkool, gcc-patches

on 2023/5/23 01:31, Carl Love wrote:
> On Mon, 2023-05-22 at 14:36 +0800, Kewen.Lin wrote:
>> Hi Carl,
>>
>> on 2023/5/19 05:12, Carl Love via Gcc-patches wrote:
>>> GCC maintainers:
>>>
>>> version 2.  Fixed an issue with the test case.  The dg-options line
>>> was
>>> missing.
>>>
>>> The following patch adds an overloaded builtin.  There are two
>>> possible
>>> arguments for the builtin.  The builtin definitions are:
>>>
>>>   double __builtin_mffscrn (unsigned long int);
>>>   double __builtin_mffscrn (double);
>>>
>>
>> We already have one  bif __builtin_set_fpscr_rn for RN setting,
>> apparently
>> these two are mainly for direct mapping to mffscr[ni] and want the
>> FPSCR bits.
>> I'm curious what's the requirements requesting these two built-in
>> functions?
> 
> The builtins were requested for use in GLibC.  As of version 2.31 they
> were added as inline asm.  They requested a builtin so the asm could be
> removed.
> 

OK, thanks for the information.

>>
>>> The patch has been tested on Power 10 with no regressions.  
>>>
>>> Please let me know if the patch is acceptable for
>>> mainline.  Thanks.
>>>
>>>                     Carl
>>>
>>> ------------------------------------------------
>>> rs6000: Add buildin for mffscrn instructions
>>>
>>
>> s/buildin/built-in/
> 
> fixed
>>
>>> This patch adds overloaded __builtin_mffscrn for the move From
>>> FPSCR
>>> Control & Set R instruction with an immediate argument.  It also
>>> adds the
>>> builtin with a floating point register argument.  A new runnable
>>> test is
>>> added for the new builtin.
>>
>> s/Set R/Set RN/
> 
> fixed
> 
>>> gcc/
>>>
>>> 	* config/rs6000/rs6000-builtins.def (__builtin_mffscrni,
>>> 	__builtin_mffscrnd): Add builtin definitions.
>>> 	* config/rs6000/rs6000-overload.def (__builtin_mffscrn): Add
>>> 	overloaded definition.
>>> 	* doc/extend.texi: Add documentation for __builtin_mffscrn.
>>>
>>> gcc/testsuite/
>>>
>>> 	* gcc.target/powerpc/builtin-mffscrn.c: Add testcase for new
>>> 	builtin.
>>> ---
>>>  gcc/config/rs6000/rs6000-builtins.def         |   7 ++
>>>  gcc/config/rs6000/rs6000-overload.def         |   5 +
>>>  gcc/doc/extend.texi                           |   8 ++
>>>  .../gcc.target/powerpc/builtin-mffscrn.c      | 106
>>> ++++++++++++++++++
>>>  4 files changed, 126 insertions(+)
>>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/builtin-
>>> mffscrn.c
>>>
>>> diff --git a/gcc/config/rs6000/rs6000-builtins.def
>>> b/gcc/config/rs6000/rs6000-builtins.def
>>> index 92d9b46e1b9..67125473684 100644
>>> --- a/gcc/config/rs6000/rs6000-builtins.def
>>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>>> @@ -2875,6 +2875,13 @@
>>>    pure vsc __builtin_vsx_xl_len_r (void *, signed long);
>>>      XL_LEN_R xl_len_r {}
>>>  
>>> +; Immediate instruction only uses the least significant two bits
>>> of the
>>> +; const int.
>>> +  double __builtin_mffscrni (const int<2>);
>>> +    MFFSCRNI rs6000_mffscrni {}
>>> +
>>> +  double __builtin_mffscrnd (double);
>>> +    MFFSCRNF rs6000_mffscrn {}
>>>  
>>
>> Why are them put in [power9-64] rather than [power9]?  IMHO [power9]
>> is the
>> correct stanza for them.
> 
> Moved them to power 9 stanza.
> 
>>   Besides, {nosoft} attribute is required.
> 
> OK, added that.  I was trying to figure out why nosoft is needed.  The
> instructions are manipulating bits in a physical register that controls
> the hardware floating point instructions.  It looks to me like that
> would be why.  Because if you were using msoft float then the floating
> point HW registers are disabled and the floating point operations are
> done using software.  Did I figure this out correctly?

Yes, and also the destination of these two instructions is hardware float
register, its relatives mffs and mffsl have that as well.

> 
>  
>>
>>>  ; Builtins requiring hardware support for IEEE-128 floating-point.
>>>  [ieee128-hw]
>>> diff --git a/gcc/config/rs6000/rs6000-overload.def
>>> b/gcc/config/rs6000/rs6000-overload.def
>>> index c582490c084..adda2df69ea 100644
>>> --- a/gcc/config/rs6000/rs6000-overload.def
>>> +++ b/gcc/config/rs6000/rs6000-overload.def
>>> @@ -78,6 +78,11 @@
>>>  ; like after a required newline, but nowhere else.  Lines
>>> beginning with
>>>  ; a semicolon are also treated as blank lines.
>>>  
>>> +[MFFSCR, __builtin_mffscrn, __builtin_mffscrn]
>>> +  double __builtin_mffscrn (const int<2>);
>>> +    MFFSCRNI
>>> +  double __builtin_mffscrn (double);
>>> +    MFFSCRNF
>>>  
>>>  [BCDADD, __builtin_bcdadd, __builtin_vec_bcdadd]
>>>    vsq __builtin_vec_bcdadd (vsq, vsq, const int);
>>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>>> index ed8b9c8a87b..f16c046051a 100644
>>> --- a/gcc/doc/extend.texi
>>> +++ b/gcc/doc/extend.texi
>>> @@ -18455,6 +18455,9 @@ int __builtin_dfp_dtstsfi_ov_td (unsigned
>>> int comparison, _Decimal128 value);
>>>  
>>>  double __builtin_mffsl(void);
>>>  
>>> +double __builtin_mffscrn (unsigned long int);
>>> +double __builtin_mffscrn (double);
>>
>> s/unsigned long int/const int/
> 
> Fixed
> 
>>
>> Note that this section is for all configurations and your
>> implementation is put
>> __builtin_mffscrn power9 only, so if the intention (requirement) is
>> to make this
>> be for also all configurations, we need to deal with the cases
>> without the support
>> of actual hw insns mffscrn{,i}, just like the existing handlings for
>> mffsl etc.
> 
> I think it should be sufficient to just support these when floating
> hardware instructions are supported.  So I believe I just need to move
> these to the correct place in the document.  I think that means they
> should be in the section:
> 
>   The following functions require @option{-mhard-float},
>   @option{-mpowerpc-gfxopt}, and @option{-mpopcntb} options.
> 
> Moved to the end of the above section.  Hope that is correct.

The hardware insn mffsl which is available starting from ISA 3.0, but
the related built-in __builtin_mffsl is available on all configurations,
since it has the falling back as rs6000-builtins.def said:

; Although the mffsl instruction is only available on POWER9 and later
; processors, this builtin automatically falls back to mffs on older
; platforms.  Thus it appears here in the [always] stanza.
  double __builtin_mffsl ();
    MFFSL rs6000_mffsl {nosoft}

So IMHO we also want the similar support for mffscrn, that is to make
use of mffscrn and mffscrni on Power9 and later, but falls back to 
__builtin_set_fpscr_rn + mffs similar on older platforms.

Segher & Peter, what do you think of this?

BR,
Kewen

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions
  2023-05-23  5:24       ` Kewen.Lin
@ 2023-05-23 22:30         ` Peter Bergner
  2023-05-24  5:32           ` Kewen.Lin
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Bergner @ 2023-05-23 22:30 UTC (permalink / raw)
  To: Kewen.Lin, Carl Love; +Cc: Segher Boessenkool, gcc-patches

On 5/23/23 12:24 AM, Kewen.Lin wrote:
> on 2023/5/23 01:31, Carl Love wrote:
>> The builtins were requested for use in GLibC.  As of version 2.31 they
>> were added as inline asm.  They requested a builtin so the asm could be
>> removed.
>
> So IMHO we also want the similar support for mffscrn, that is to make
> use of mffscrn and mffscrni on Power9 and later, but falls back to 
> __builtin_set_fpscr_rn + mffs similar on older platforms.

So __builtin_set_fpscr_rn everything we want (sets the RN bits) and
uses mffscrn/mffscrni on P9 and later and uses older insns on pre-P9.
The only problem is we don't return the current FPSCR bits, as the bif
is defined to return void.  Crazy idea, but could we extend the built-in
with an overload that returns the FPSCR bits?  To be honest, I like
the __builtin_set_fpscr_rn name better than __builtin_mffscrn[i].
The built-in machinery can see that the usage is expecting a return value
or not and for the pre-P9 code, can skip generating the ending mffs if
we don't want the return value.

Peter

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions
  2023-05-23 22:30         ` Peter Bergner
@ 2023-05-24  5:32           ` Kewen.Lin
  2023-05-24 15:20             ` Carl Love
  0 siblings, 1 reply; 14+ messages in thread
From: Kewen.Lin @ 2023-05-24  5:32 UTC (permalink / raw)
  To: Peter Bergner, Carl Love; +Cc: Segher Boessenkool, gcc-patches

on 2023/5/24 06:30, Peter Bergner wrote:
> On 5/23/23 12:24 AM, Kewen.Lin wrote:
>> on 2023/5/23 01:31, Carl Love wrote:
>>> The builtins were requested for use in GLibC.  As of version 2.31 they
>>> were added as inline asm.  They requested a builtin so the asm could be
>>> removed.
>>
>> So IMHO we also want the similar support for mffscrn, that is to make
>> use of mffscrn and mffscrni on Power9 and later, but falls back to 
>> __builtin_set_fpscr_rn + mffs similar on older platforms.
> 
> So __builtin_set_fpscr_rn everything we want (sets the RN bits) and
> uses mffscrn/mffscrni on P9 and later and uses older insns on pre-P9.
> The only problem is we don't return the current FPSCR bits, as the bif
> is defined to return void.

Yes.

> Crazy idea, but could we extend the built-in
> with an overload that returns the FPSCR bits?  

So you agree that we should make this proposed new bif handle pre-P9 just
like some other existing bifs. :)  I think extending it is good and doable,
but the only concern here is the bif name "__builtin_set_fpscr_rn", which
matches the existing behavior (only set rounding) but doesn't match the
proposed extending behavior (set rounding and get some env bits back).
Maybe it's not a big deal if the documentation clarify it well.


> To be honest, I like
> the __builtin_set_fpscr_rn name better than __builtin_mffscrn[i].

+1

BR,
Kewen

> The built-in machinery can see that the usage is expecting a return value
> or not and for the pre-P9 code, can skip generating the ending mffs if
> we don't want the return value.
> 
> Peter
> 
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions
  2023-05-24  5:32           ` Kewen.Lin
@ 2023-05-24 15:20             ` Carl Love
  2023-05-24 16:32               ` Peter Bergner
  2023-05-25  5:28               ` Kewen.Lin
  0 siblings, 2 replies; 14+ messages in thread
From: Carl Love @ 2023-05-24 15:20 UTC (permalink / raw)
  To: Kewen.Lin, Peter Bergner; +Cc: Segher Boessenkool, gcc-patches, cel

On Wed, 2023-05-24 at 13:32 +0800, Kewen.Lin wrote:
> on 2023/5/24 06:30, Peter Bergner wrote:
> > On 5/23/23 12:24 AM, Kewen.Lin wrote:
> > > on 2023/5/23 01:31, Carl Love wrote:
> > > > The builtins were requested for use in GLibC.  As of version
> > > > 2.31 they
> > > > were added as inline asm.  They requested a builtin so the asm
> > > > could be
> > > > removed.
> > > 
> > > So IMHO we also want the similar support for mffscrn, that is to
> > > make
> > > use of mffscrn and mffscrni on Power9 and later, but falls back
> > > to 
> > > __builtin_set_fpscr_rn + mffs similar on older platforms.
> > 
> > So __builtin_set_fpscr_rn everything we want (sets the RN bits) and
> > uses mffscrn/mffscrni on P9 and later and uses older insns on pre-
> > P9.
> > The only problem is we don't return the current FPSCR bits, as the
> > bif
> > is defined to return void.
> 
> Yes.
> 
> > Crazy idea, but could we extend the built-in
> > with an overload that returns the FPSCR bits?  
> 
> So you agree that we should make this proposed new bif handle pre-P9
> just
> like some other existing bifs. :)  I think extending it is good and
> doable,
> but the only concern here is the bif name "__builtin_set_fpscr_rn",
> which
> matches the existing behavior (only set rounding) but doesn't match
> the
> proposed extending behavior (set rounding and get some env bits
> back).
> Maybe it's not a big deal if the documentation clarify it well.

Extending the builtin to pre Power 9 is straight forward and I agree
would make good sense to do.

I am a bit concerned on how to extend __builtin_set_fpscr_rn to add the
new functionality.  Peter suggests overloading the builtin to either
return void or returns FPSCR bits.  It is my understanding that the
return value for a given builtin had to be the same, i.e. you can't
overload the return value. Maybe you can with Bill's new
infrastructure?  I recall having problems trying to overload the return
value in the past and Bill said you couldn't do it.  I play with this
and see if I can overload the return value.
> 
> 
> > To be honest, I like
> > the __builtin_set_fpscr_rn name better than __builtin_mffscrn[i].
> 
> +1
> 
> BR,
> Kewen
> 
> > The built-in machinery can see that the usage is expecting a return
> > value
> > or not and for the pre-P9 code, can skip generating the ending mffs
> > if
> > we don't want the return value.
> > 
> > Peter
> > 
> > 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions
  2023-05-24 15:20             ` Carl Love
@ 2023-05-24 16:32               ` Peter Bergner
  2023-05-25  5:28               ` Kewen.Lin
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Bergner @ 2023-05-24 16:32 UTC (permalink / raw)
  To: Carl Love, Kewen.Lin; +Cc: Segher Boessenkool, gcc-patches

On 5/24/23 10:20 AM, Carl Love wrote:
> Extending the builtin to pre Power 9 is straight forward and I agree
> would make good sense to do.
> 
> I am a bit concerned on how to extend __builtin_set_fpscr_rn to add the
> new functionality.  Peter suggests overloading the builtin to either
> return void or returns FPSCR bits.  It is my understanding that the
> return value for a given builtin had to be the same, i.e. you can't
> overload the return value. Maybe you can with Bill's new
> infrastructure?  I recall having problems trying to overload the return
> value in the past and Bill said you couldn't do it.  I play with this
> and see if I can overload the return value.

In this case, I don't think we need a built-in overload, but just change
the current built-in to return a double rather than void.  All of the
old code should still work, since they'll just ignore the return
value.  As I said, the built-in machinery can see whether we're assigning
the built-in return value to a variable or not, ie, the difference between

  __builtin_set_fpscr_rn ();

versus:

  foo = __builtin_set_fpscr_rn ();

In the former case, the built-in can expand exactly as how it does now.
In the later case, we'll use the target rtx we're passed in as the
destination of the mffscrn[i] insn for P9/10 and for pre-P9, we'll
use the target for an additional mffs instruction which we don't
generate now.  Note we'd only generate the mffs when we're passed in
a target rtx as in the second case.  The first case we won't.

This is all assuming Segher is fine with the change this way.  If we do
go this way, I would recommend adding a predefined macro that users can
test for to know whether the built-in returns a value or not.

If Segher doesn't like this idea, then it's all moot! :-)

Peter

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions
  2023-05-24 15:20             ` Carl Love
  2023-05-24 16:32               ` Peter Bergner
@ 2023-05-25  5:28               ` Kewen.Lin
  2023-05-25 15:59                 ` Carl Love
  1 sibling, 1 reply; 14+ messages in thread
From: Kewen.Lin @ 2023-05-25  5:28 UTC (permalink / raw)
  To: Carl Love; +Cc: Segher Boessenkool, gcc-patches, Peter Bergner

on 2023/5/24 23:20, Carl Love wrote:
> On Wed, 2023-05-24 at 13:32 +0800, Kewen.Lin wrote:
>> on 2023/5/24 06:30, Peter Bergner wrote:
>>> On 5/23/23 12:24 AM, Kewen.Lin wrote:
>>>> on 2023/5/23 01:31, Carl Love wrote:
>>>>> The builtins were requested for use in GLibC.  As of version
>>>>> 2.31 they
>>>>> were added as inline asm.  They requested a builtin so the asm
>>>>> could be
>>>>> removed.
>>>>
>>>> So IMHO we also want the similar support for mffscrn, that is to
>>>> make
>>>> use of mffscrn and mffscrni on Power9 and later, but falls back
>>>> to 
>>>> __builtin_set_fpscr_rn + mffs similar on older platforms.
>>>
>>> So __builtin_set_fpscr_rn everything we want (sets the RN bits) and
>>> uses mffscrn/mffscrni on P9 and later and uses older insns on pre-
>>> P9.
>>> The only problem is we don't return the current FPSCR bits, as the
>>> bif
>>> is defined to return void.
>>
>> Yes.
>>
>>> Crazy idea, but could we extend the built-in
>>> with an overload that returns the FPSCR bits?  
>>
>> So you agree that we should make this proposed new bif handle pre-P9
>> just
>> like some other existing bifs. :)  I think extending it is good and
>> doable,
>> but the only concern here is the bif name "__builtin_set_fpscr_rn",
>> which
>> matches the existing behavior (only set rounding) but doesn't match
>> the
>> proposed extending behavior (set rounding and get some env bits
>> back).
>> Maybe it's not a big deal if the documentation clarify it well.
> 
> Extending the builtin to pre Power 9 is straight forward and I agree
> would make good sense to do.
> 
> I am a bit concerned on how to extend __builtin_set_fpscr_rn to add the
> new functionality.  Peter suggests overloading the builtin to either
> return void or returns FPSCR bits.  It is my understanding that the
> return value for a given builtin had to be the same, i.e. you can't
> overload the return value. Maybe you can with Bill's new
> infrastructure?  I recall having problems trying to overload the return
> value in the past and Bill said you couldn't do it.  I play with this
> and see if I can overload the return value.

Your understanding on that we fail to overload this for just different
return types is correct.  But previously I interpreted the extending
proposal as to extend
 
  void __builtin_set_fpscr_rn (int);

to 

  void __builtin_set_fpscr_rn (int, double*);

The related address taken and store here can be optimized out normally.

BR,
Kewen

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions
  2023-05-25  5:28               ` Kewen.Lin
@ 2023-05-25 15:59                 ` Carl Love
  2023-05-31  9:11                   ` Kewen.Lin
  0 siblings, 1 reply; 14+ messages in thread
From: Carl Love @ 2023-05-25 15:59 UTC (permalink / raw)
  To: Kewen.Lin, cel; +Cc: Segher Boessenkool, gcc-patches, Peter Bergner

Peter, Kewen:

On Thu, 2023-05-25 at 13:28 +0800, Kewen.Lin wrote:
> on 2023/5/24 23:20, Carl Love wrote:
> > On Wed, 2023-05-24 at 13:32 +0800, Kewen.Lin wrote:
> > > on 2023/5/24 06:30, Peter Bergner wrote:
> > > > On 5/23/23 12:24 AM, Kewen.Lin wrote:
> > > > > on 2023/5/23 01:31, Carl Love wrote:
> > > > > > The builtins were requested for use in GLibC.  As of
> > > > > > version
> > > > > > 2.31 they
> > > > > > were added as inline asm.  They requested a builtin so the
> > > > > > asm
> > > > > > could be
> > > > > > removed.
> > > > > 
> > > > > So IMHO we also want the similar support for mffscrn, that is
> > > > > to
> > > > > make
> > > > > use of mffscrn and mffscrni on Power9 and later, but falls
> > > > > back
> > > > > to 
> > > > > __builtin_set_fpscr_rn + mffs similar on older platforms.
> > > > 
> > > > So __builtin_set_fpscr_rn everything we want (sets the RN bits)
> > > > and
> > > > uses mffscrn/mffscrni on P9 and later and uses older insns on
> > > > pre-
> > > > P9.
> > > > The only problem is we don't return the current FPSCR bits, as
> > > > the
> > > > bif
> > > > is defined to return void.
> > > 
> > > Yes.
> > > 
> > > > Crazy idea, but could we extend the built-in
> > > > with an overload that returns the FPSCR bits?  
> > > 
> > > So you agree that we should make this proposed new bif handle
> > > pre-P9
> > > just
> > > like some other existing bifs. :)  I think extending it is good
> > > and
> > > doable,
> > > but the only concern here is the bif name
> > > "__builtin_set_fpscr_rn",
> > > which
> > > matches the existing behavior (only set rounding) but doesn't
> > > match
> > > the
> > > proposed extending behavior (set rounding and get some env bits
> > > back).
> > > Maybe it's not a big deal if the documentation clarify it well.
> > 
> > Extending the builtin to pre Power 9 is straight forward and I
> > agree
> > would make good sense to do.
> > 
> > I am a bit concerned on how to extend __builtin_set_fpscr_rn to add
> > the
> > new functionality.  Peter suggests overloading the builtin to
> > either
> > return void or returns FPSCR bits.  It is my understanding that the
> > return value for a given builtin had to be the same, i.e. you can't
> > overload the return value. Maybe you can with Bill's new
> > infrastructure?  I recall having problems trying to overload the
> > return
> > value in the past and Bill said you couldn't do it.  I play with
> > this
> > and see if I can overload the return value.
> 
> Your understanding on that we fail to overload this for just
> different
> return types is correct.  But previously I interpreted the extending
> proposal as to extend
> 
>   void __builtin_set_fpscr_rn (int);
> 
> to 
> 
>   void __builtin_set_fpscr_rn (int, double*);
> 
> The related address taken and store here can be optimized out
> normally.

I don't think that is correct.   The current definition of the builtin
is:

     void __builtin_set_fpscr_rn (int);

The proposal by Peter was to change the return type to double, i.e.

     double __builtin_set_fpscr_rn (int);

Peter also said the following:

   The built-in machinery can see that the usage is expecting a return
   value or not and for the pre-P9 code, can skip generating the ending
   mffs if we don't want the return value.

Which I don't think we want.  The mffscrn and mffscrni instructions
return the contents of the control bits in the FPSCR, that is, bits
29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI, RN), are placed
into the corresponding bits in register FRT. All other bits in register
FRT are set to 0.  

The instructions also updates the current RN field of the FPSCR with
the new RN supplied the second argument of the instruction.  So, the
instructions update the RN field just like the __builtin_set_fpscr_rn. 
So, we can use the existing __builtin_set_fpscr_rn to update the RN for
all ISAs, we just need to have __builtin_set_fpscr_rn always return a
double with the desired fields from the FPSCR (the current RN).  This
will then emulate the behavior of the mffscrn and mffscrni
instructions.  The current uses of __builtin_set_fpscr_rn will just
ignore the return value which is not a problem.  The return value can
be stored in the places were the user is currently using the inline asm
for the mffscrn and mffscrni instructions.

The __builtin_set_fpscr_rn builtin is currently using the mffscrn and
mffscrni on Power 9 and throwing away the result from the instruction. 
We just need to change __builtin_set_fpscr_rn to return the value
instead.  For the pre Power 9 code, the builtin will need to read the
full FPSCR, mask of the desired fields and return the fields.

So, there is no need for the builtin to have to determine if the user
is storing the result of the __builtin_set_fpscr_rn.  The RN bits will
always be updated by the __builtin_set_fpscr_rn builtin and the
existing fields of the FPSCR will always be returned by the builtin.

Please let me know if you agree.  I think I have this sorted out
correctly.  Thanks.

                        Carl 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions
  2023-05-25 15:59                 ` Carl Love
@ 2023-05-31  9:11                   ` Kewen.Lin
  2023-05-31 15:36                     ` Carl Love
  0 siblings, 1 reply; 14+ messages in thread
From: Kewen.Lin @ 2023-05-31  9:11 UTC (permalink / raw)
  To: Carl Love, Segher Boessenkool; +Cc: gcc-patches, Peter Bergner

Hi Carl,

on 2023/5/25 23:59, Carl Love wrote:
> Peter, Kewen:
> 
> On Thu, 2023-05-25 at 13:28 +0800, Kewen.Lin wrote:
>> on 2023/5/24 23:20, Carl Love wrote:
>>> On Wed, 2023-05-24 at 13:32 +0800, Kewen.Lin wrote:
>>>> on 2023/5/24 06:30, Peter Bergner wrote:
>>>>> On 5/23/23 12:24 AM, Kewen.Lin wrote:
>>>>>> on 2023/5/23 01:31, Carl Love wrote:
>>>>>>> The builtins were requested for use in GLibC.  As of
>>>>>>> version
>>>>>>> 2.31 they
>>>>>>> were added as inline asm.  They requested a builtin so the
>>>>>>> asm
>>>>>>> could be
>>>>>>> removed.
>>>>>>
>>>>>> So IMHO we also want the similar support for mffscrn, that is
>>>>>> to
>>>>>> make
>>>>>> use of mffscrn and mffscrni on Power9 and later, but falls
>>>>>> back
>>>>>> to 
>>>>>> __builtin_set_fpscr_rn + mffs similar on older platforms.
>>>>>
>>>>> So __builtin_set_fpscr_rn everything we want (sets the RN bits)
>>>>> and
>>>>> uses mffscrn/mffscrni on P9 and later and uses older insns on
>>>>> pre-
>>>>> P9.
>>>>> The only problem is we don't return the current FPSCR bits, as
>>>>> the
>>>>> bif
>>>>> is defined to return void.
>>>>
>>>> Yes.
>>>>
>>>>> Crazy idea, but could we extend the built-in
>>>>> with an overload that returns the FPSCR bits?  
>>>>
>>>> So you agree that we should make this proposed new bif handle
>>>> pre-P9
>>>> just
>>>> like some other existing bifs. :)  I think extending it is good
>>>> and
>>>> doable,
>>>> but the only concern here is the bif name
>>>> "__builtin_set_fpscr_rn",
>>>> which
>>>> matches the existing behavior (only set rounding) but doesn't
>>>> match
>>>> the
>>>> proposed extending behavior (set rounding and get some env bits
>>>> back).
>>>> Maybe it's not a big deal if the documentation clarify it well.
>>>
>>> Extending the builtin to pre Power 9 is straight forward and I
>>> agree
>>> would make good sense to do.
>>>
>>> I am a bit concerned on how to extend __builtin_set_fpscr_rn to add
>>> the
>>> new functionality.  Peter suggests overloading the builtin to
>>> either
>>> return void or returns FPSCR bits.  It is my understanding that the
>>> return value for a given builtin had to be the same, i.e. you can't
>>> overload the return value. Maybe you can with Bill's new
>>> infrastructure?  I recall having problems trying to overload the
>>> return
>>> value in the past and Bill said you couldn't do it.  I play with
>>> this
>>> and see if I can overload the return value.
>>
>> Your understanding on that we fail to overload this for just
>> different
>> return types is correct.  But previously I interpreted the extending
>> proposal as to extend
>>
>>   void __builtin_set_fpscr_rn (int);
>>
>> to 
>>
>>   void __builtin_set_fpscr_rn (int, double*);
>>
>> The related address taken and store here can be optimized out
>> normally.
> 
> I don't think that is correct.   The current definition of the builtin

Just to clarify: I didn't meant to suggest it, just explained why I
thought overloading is doable previously as I interpreted it with one
extended argument. :)

> is:
> 
>      void __builtin_set_fpscr_rn (int);
> 
> The proposal by Peter was to change the return type to double, i.e.
> 
>      double __builtin_set_fpscr_rn (int);
> 
> Peter also said the following:
> 
>    The built-in machinery can see that the usage is expecting a return
>    value or not and for the pre-P9 code, can skip generating the ending
>    mffs if we don't want the return value.
> 
> Which I don't think we want.  The mffscrn and mffscrni instructions
> return the contents of the control bits in the FPSCR, that is, bits
> 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI, RN), are placed
> into the corresponding bits in register FRT. All other bits in register
> FRT are set to 0.  
> 
> The instructions also updates the current RN field of the FPSCR with
> the new RN supplied the second argument of the instruction.  So, the
> instructions update the RN field just like the __builtin_set_fpscr_rn. 
> So, we can use the existing __builtin_set_fpscr_rn to update the RN for
> all ISAs, we just need to have __builtin_set_fpscr_rn always return a
> double with the desired fields from the FPSCR (the current RN).  This
> will then emulate the behavior of the mffscrn and mffscrni
> instructions.  The current uses of __builtin_set_fpscr_rn will just
> ignore the return value which is not a problem.  The return value can
> be stored in the places were the user is currently using the inline asm
> for the mffscrn and mffscrni instructions.
> 
> The __builtin_set_fpscr_rn builtin is currently using the mffscrn and
> mffscrni on Power 9 and throwing away the result from the instruction. 
> We just need to change __builtin_set_fpscr_rn to return the value
> instead.  For the pre Power 9 code, the builtin will need to read the
> full FPSCR, mask of the desired fields and return the fields.
> 
> So, there is no need for the builtin to have to determine if the user
> is storing the result of the __builtin_set_fpscr_rn.  The RN bits will
> always be updated by the __builtin_set_fpscr_rn builtin and the
> existing fields of the FPSCR will always be returned by the builtin.

Yeah, I agree, even with pre-P9 code when the returned value is unused,
I'd expect DCE can eliminate the part for the FPSCR bits reading and
masking, it's just like before (only setting RN bits).

The only concern I mentioned before is the built-in name doesn't clearly
match what it does (with extending, it returns something instead) since
it's only saying "set" and setting RN bits, the return value is easily
misunderstood as returning old RN bits, the documentation has to explain
and note it well.

Looking forward to Segher's opinion on this.

BR,
Kewen

> 
> Please let me know if you agree.  I think I have this sorted out
> correctly.  Thanks.
> 
>                         Carl 
> 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions
  2023-05-31  9:11                   ` Kewen.Lin
@ 2023-05-31 15:36                     ` Carl Love
  0 siblings, 0 replies; 14+ messages in thread
From: Carl Love @ 2023-05-31 15:36 UTC (permalink / raw)
  To: Kewen.Lin, Segher Boessenkool; +Cc: gcc-patches, Peter Bergner, cel

Kewen:

On Wed, 2023-05-31 at 17:11 +0800, Kewen.Lin wrote:
> > So, there is no need for the builtin to have to determine if the
> > user
> > is storing the result of the __builtin_set_fpscr_rn.  The RN bits
> > will
> > always be updated by the __builtin_set_fpscr_rn builtin and the
> > existing fields of the FPSCR will always be returned by the
> > builtin.
> 
> Yeah, I agree, even with pre-P9 code when the returned value is
> unused,
> I'd expect DCE can eliminate the part for the FPSCR bits reading and
> masking, it's just like before (only setting RN bits).
> 
> The only concern I mentioned before is the built-in name doesn't
> clearly
> match what it does (with extending, it returns something instead)
> since
> it's only saying "set" and setting RN bits, the return value is
> easily
> misunderstood as returning old RN bits, the documentation has to
> explain
> and note it well.
> 
> Looking forward to Segher's opinion on this.

I have the patch to extend the __builtin_set_fpscr_rn builtin working. 
I agree the documentation on the instructions in the ISA is not really
clear about that.  It needs to be much more explicit in the builtin
description that the current RN field is returned then the field is
updated with the new RN bits from the argument.  

I sent the patch, with the updated builtin description and testcases to
the GLibC team to see what they thought of it.  The goal was for the
builtin to be effectively a "drop in replacement" for the inline asm
that they have.  I was planning on posting the new version if the GLibC
team says it works for them.  Hopefully I will hear from them soon.

                    Carl 

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2023-05-31 15:40 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-13 20:47 [PATCH] rs6000: Add buildin for mffscrn instructions Carl Love
2023-05-18 21:12 ` [PATCH v2] " Carl Love
2023-05-22  6:36   ` Kewen.Lin
2023-05-22 17:31     ` Carl Love
2023-05-23  5:24       ` Kewen.Lin
2023-05-23 22:30         ` Peter Bergner
2023-05-24  5:32           ` Kewen.Lin
2023-05-24 15:20             ` Carl Love
2023-05-24 16:32               ` Peter Bergner
2023-05-25  5:28               ` Kewen.Lin
2023-05-25 15:59                 ` Carl Love
2023-05-31  9:11                   ` Kewen.Lin
2023-05-31 15:36                     ` Carl Love
2023-05-22 17:36     ` [PATCH v3] " Carl Love

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).