Re: [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Carl Love <cel@us.ibm.com>
Cc: Peter Bergner <bergner@linux.ibm.com>,
	Segher Boessenkool <segher@kernel.crashing.org>,
	gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
Subject: Re: [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value
Date: Tue, 11 Jul 2023 13:39:57 +0800	[thread overview]
Message-ID: <79937d81-4667-5da9-a281-21455d4c1020@linux.ibm.com> (raw)
In-Reply-To: <ef3d902cd525f9a9d16dba5f6b6b04697f254162.camel@us.ibm.com>

on 2023/7/11 03:18, Carl Love wrote:
> On Fri, 2023-07-07 at 12:06 +0800, Kewen.Lin wrote:
>> Hi Carl,
>>
>> Some more minor comments are inline below on top of Peter's
>> insightful
>> review comments.
>>
>> on 2023/7/1 08:58, Carl Love wrote:
>>> GCC maintainers:
>>>
>>> Ver 2,  Went back thru the requirements and emails.  Not sure where
>>> I
>>> came up with the requirement for an overloaded version with double
>>> argument.  Removed the overloaded version with the double
>>> argument. 
>>> Added the macro to announce if the __builtin_set_fpscr_rn returns a
>>> void or a double with the FPSCR bits.  Updated the documentation
>>> file. 
>>> Retested on Power 8 BE/LE, Power 9 BE/LE, Power 10 LE.  Redid the
>>> test
>>> file.  Per request, the original test file functionality was not
>>> changed.  Just changed the name from test_fpscr_rn_builtin.c to 
>>> test_fpscr_rn_builtin_1.c.  Put new tests for the return values
>>> into a
>>> new test file, test_fpscr_rn_builtin_2.c.
>>>
>>> The GLibC team requested a builtin to replace the mffscrn and
>>> mffscrniinline asm instructions in the GLibC code.  Previously
>>> there
>>> was discussion on adding builtins for the mffscrn instructions.
>>>
>>> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html
>>>
>>> In the end, it was felt that it would be to extend the existing
>>> __builtin_set_fpscr_rn builtin to return a double instead of a void
>>> type.  The desire is that we could have the functionality of the
>>> mffscrn and mffscrni instructions on older ISAs.  The two
>>> instructions
>>> were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has
>>> the
>>> needed functionality to set the RN field using the mffscrn and
>>> mffscrni
>>> instructions if ISA 3.0 is supported or fall back to using logical
>>> instructions to mask and set the bits for earlier ISAs.  The
>>> instructions return the current value of the FPSCR fields DRN, VE,
>>> OE,
>>> UE, ZE, XE, NI, RN bit positions then update the RN bit positions
>>> with
>>> the new RN value provided.
>>>
>>> The current __builtin_set_fpscr_rn builtin has a return type of
>>> void. 
>>> So, changing the return type to double and returning the  FPSCR
>>> fields
>>> DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
>>> functionally equivalent of the mffscrn and mffscrni
>>> instructions.  Any
>>> current uses of the builtin would just ignore the return value yet
>>> any
>>> new uses could use the return value.  So the requirement is for the
>>> change to the __builtin_set_fpscr_rn builtin to be backwardly
>>> compatible and work for all ISAs.
>>>
>>> The following patch changes the return type of the
>>>  __builtin_set_fpscr_rn builtin from void to double.  The return
>>> value
>>> is the current value of the various FPSCR fields DRN, VE, OE, UE,
>>> ZE,
>>> XE, NI, RN bit positions when the builtin is called.  The builtin
>>> then
>>> updated the RN field with the new value provided as an argument to
>>> the
>>> builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c
>>> to
>>> check that the builtin returns the current value of the FPSCR
>>> fields
>>> and then updates the RN field.
>>>
>>> The GLibC team has reviewed the patch to make sure it met their
>>> needs
>>> as a drop in replacement for the inline asm mffscr and mffscrni
>>> statements in the GLibC code.  T
>>>
>>> The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power
>>> 10
>>> LE.
>>>
>>> Please let me know if the patch is acceptable for
>>> mainline.  Thanks.
>>>
>>>                                Carl 
>>>
>>>
>>> ----------------------------------
>>> rs6000, __builtin_set_fpscr_rn add retrun value
>>>
>>> Change the return value from void to double.  The return value
>>> consists of
>>> the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit
>>> positions.  Add an
>>> overloaded version which accepts a double argument.
>>>
>>> The test powerpc/test_fpscr_rn_builtin.c is updated to add tests
>>> for the
>>> double reterun value and the new double argument.
>>>
>>> gcc/ChangeLog:
>>> 	* config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn):
>>> Update
>>> 	builtin definition return type.
>>> 	* config/rs6000-c.cc(rs6000_target_modify_macros): Add check,
>>> define
>>> 	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.
>>> 	* config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
>>> 	define_expand.
>>> 	(rs6000_update_fpscr_rn_field): New define_expand.
>>> 	(rs6000_set_fpscr_rn): Added	return argument.  Updated
>>> to use new
>>> 	rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field define
>>> 	 _expands.
>>> 	* doc/extend.texi (__builtin_set_fpscr_rn): Update description
>>> for
>>> 	the return value and new double argument.  Add descripton for
>>> 	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.
>>>
>>> gcc/testsuite/ChangeLog:
>>> 	gcc.target/powerpc/test_fpscr_rn_builtin.c: Renamed to
>>> 	test_fpscr_rn_builtin_1.c.  Added comment.
>>> 	gcc.target/powerpc/test_fpscr_rn_builtin_2.c: New test for the
>>> 	return value of __builtin_set_fpscr_rn builtin.
>>> ---
>>>  gcc/config/rs6000/rs6000-builtins.def         |   2 +-
>>>  gcc/config/rs6000/rs6000-c.cc                 |   4 +
>>>  gcc/config/rs6000/rs6000.md                   |  87 +++++++---
>>>  gcc/doc/extend.texi                           |  26 ++-
>>>  ...rn_builtin.c => test_fpscr_rn_builtin_1.c} |   6 +
>>>  .../powerpc/test_fpscr_rn_builtin_2.c         | 153
>>> ++++++++++++++++++
>>>  6 files changed, 246 insertions(+), 32 deletions(-)
>>>  rename gcc/testsuite/gcc.target/powerpc/{test_fpscr_rn_builtin.c
>>> => test_fpscr_rn_builtin_1.c} (92%)
>>>  create mode 100644
>>> gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
>>>
>>> diff --git a/gcc/config/rs6000/rs6000-builtins.def
>>> b/gcc/config/rs6000/rs6000-builtins.def
>>> index 289a37998b1..28788b69c7d 100644
>>> --- a/gcc/config/rs6000/rs6000-builtins.def
>>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>>> @@ -237,7 +237,7 @@
>>>    const __ibm128 __builtin_pack_ibm128 (double, double);
>>>      PACK_IF packif {ibm128}
>>>  
>>> -  void __builtin_set_fpscr_rn (const int[0,3]);
>>> +  double __builtin_set_fpscr_rn (const int[0,3]);
>>>      SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
>>>  
>>>    const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
>>> diff --git a/gcc/config/rs6000/rs6000-c.cc
>>> b/gcc/config/rs6000/rs6000-c.cc
>>> index 8555174d36e..8373bb66919 100644
>>> --- a/gcc/config/rs6000/rs6000-c.cc
>>> +++ b/gcc/config/rs6000/rs6000-c.cc
>>> @@ -604,6 +604,10 @@ rs6000_target_modify_macros (bool define_p,
>>> HOST_WIDE_INT flags)
>>>    /* Tell the user -mrop-protect is in play.  */
>>>    if (rs6000_rop_protect)
>>>      rs6000_define_or_undefine_macro (define_p, "__ROP_PROTECT__");
>>> +  /* Tell the user the __builtin_set_fpscr_rn now returns the
>>> FPSCR fields
>>> +     in a double.  Originally the builtin returned void.  */
>>> +  if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
>>> +  rs6000_define_or_undefine_macro (define_p,
>>> "__SET_FPSCR_RN_RETURNS_FPSCR__");
>>>  }
>>>  
>>>  void
>>> diff --git a/gcc/config/rs6000/rs6000.md
>>> b/gcc/config/rs6000/rs6000.md
>>> index b0db8ae508d..1b77a13c8a1 100644
>>> --- a/gcc/config/rs6000/rs6000.md
>>> +++ b/gcc/config/rs6000/rs6000.md
>>> @@ -6440,8 +6440,51 @@
>>>    "mffscdrn %0,%1"
>>>    [(set_attr "type" "fp")])
>>>  
>>> +
>>> +(define_expand "rs6000_get_fpscr_fields"
>>> + [(match_operand:DF 0 "gpc_reg_operand")]
>>> +  "TARGET_HARD_FLOAT"
>>> +{
>>> +  /* Extract fields bits 29:31 (DRN) and bits 56:63 (VE, OE, UE,
>>> ZE, XE, NI,
>>> +     RN) from the FPSCR and return them.  */
>>> +  rtx tmp_df = gen_reg_rtx (DFmode);
>>> +  rtx tmp_di = gen_reg_rtx (DImode);
>>> +
>>> +  emit_insn (gen_rs6000_mffs (tmp_df));
>>> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
>>> +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT
>>> (0x00000007000000FFULL)));
>>> +  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
>>> +  emit_move_insn (operands[0], tmp_rtn);
>>> +  DONE;
>>> +})
>>> +
>>> +(define_expand "rs6000_update_fpscr_rn_field"
>>> + [(match_operand:DI 0 "gpc_reg_operand")]
>>> +  "TARGET_HARD_FLOAT"
>>> +{
>>> +  /* Insert the new RN value from operands[0] into FPSCR bit
>>> [62:63].  */
>>> +  rtx tmp_di = gen_reg_rtx (DImode);
>>> +  rtx tmp_df = gen_reg_rtx (DFmode);
>>> +
>>> +  emit_insn (gen_rs6000_mffs (tmp_df));
>>> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
>>> +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
>>> +  emit_insn (gen_iordi3 (tmp_di, tmp_di, operands[0]));
>>> +
>>> +  /* Need to write to field k=15.  The fields are [0:15].  Hence
>>> with
>>> +     L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is
>>> an
>>> +     8-bit field[0:7]. Need to set the bit that corresponds to the
>>> +     value of i that you want [0:7].  */
>>> +
>>> +  tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
>>> +  emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
>>> +  DONE;
>>> +})
>>> +
>>>  (define_expand "rs6000_set_fpscr_rn"
>>> - [(match_operand:SI 0 "reg_or_cint_operand")]
>>> + [(set (match_operand:DF 0 "gpc_reg_operand")
>>> +       (unspec_volatile:DF [(match_operand:SI 1
>>> "reg_or_cint_operand")]
>>> +       UNSPECV_MFFSCDRN))]
>>
>> I don't think we need this pattern to be complete, since it's always
>> expanded with "DONE", the pattern is useless, so it can be simpler
>> like:
>>
>>    [(use (match_operand:DF 0 "gpc_reg_operand"))
>>     (use (match_operand:SI 1 "reg_or_cint_operand"))]
> 
> OK, changed this.  Went and reviewed the use of set in the GCC internal
> documentation.  Not really sure why we can use set with the DONE.  Not
> really following your explanation.  It does seem to work.
> 

It's mainly based on the documentation of define_expand:

"If the preparation falls through (invokes neither DONE nor FAIL), then
 the define_expand acts like a define_insn in that the RTL template is
 used to generate the insn.

 The RTL template is not used for matching, only for generating the initial
 insn list. **If the preparation statement always invokes DONE or FAIL, the RTL
 template may be reduced to a simple list of operands** ..."

https://gcc.gnu.org/onlinedocs/gccint/Expander-Definitions.html

BR,
Kewen

     prev parent reply	other threads:[~2023-07-11  5:40 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-01  0:58 Carl Love
2023-07-06 22:54 ` Peter Bergner
2023-07-06 23:00   ` Peter Bergner
2023-07-07  5:08     ` Kewen.Lin
2023-07-07 14:23       ` Peter Bergner
2023-07-10 19:17   ` Carl Love
2023-07-07  4:06 ` Kewen.Lin
2023-07-10 19:18   ` Carl Love
2023-07-11  5:39     ` Kewen.Lin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=79937d81-4667-5da9-a281-21455d4c1020@linux.ibm.com \
    --to=linkw@linux.ibm.com \
    --cc=bergner@linux.ibm.com \
    --cc=cel@us.ibm.com \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).