From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Carl Love <cel@us.ibm.com>
Cc: Peter Bergner <bergner@linux.ibm.com>,
Segher Boessenkool <segher@kernel.crashing.org>,
gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
Subject: Re: [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value
Date: Tue, 11 Jul 2023 13:39:57 +0800 [thread overview]
Message-ID: <79937d81-4667-5da9-a281-21455d4c1020@linux.ibm.com> (raw)
In-Reply-To: <ef3d902cd525f9a9d16dba5f6b6b04697f254162.camel@us.ibm.com>
on 2023/7/11 03:18, Carl Love wrote:
> On Fri, 2023-07-07 at 12:06 +0800, Kewen.Lin wrote:
>> Hi Carl,
>>
>> Some more minor comments are inline below on top of Peter's
>> insightful
>> review comments.
>>
>> on 2023/7/1 08:58, Carl Love wrote:
>>> GCC maintainers:
>>>
>>> Ver 2, Went back thru the requirements and emails. Not sure where
>>> I
>>> came up with the requirement for an overloaded version with double
>>> argument. Removed the overloaded version with the double
>>> argument.
>>> Added the macro to announce if the __builtin_set_fpscr_rn returns a
>>> void or a double with the FPSCR bits. Updated the documentation
>>> file.
>>> Retested on Power 8 BE/LE, Power 9 BE/LE, Power 10 LE. Redid the
>>> test
>>> file. Per request, the original test file functionality was not
>>> changed. Just changed the name from test_fpscr_rn_builtin.c to
>>> test_fpscr_rn_builtin_1.c. Put new tests for the return values
>>> into a
>>> new test file, test_fpscr_rn_builtin_2.c.
>>>
>>> The GLibC team requested a builtin to replace the mffscrn and
>>> mffscrniinline asm instructions in the GLibC code. Previously
>>> there
>>> was discussion on adding builtins for the mffscrn instructions.
>>>
>>> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html
>>>
>>> In the end, it was felt that it would be to extend the existing
>>> __builtin_set_fpscr_rn builtin to return a double instead of a void
>>> type. The desire is that we could have the functionality of the
>>> mffscrn and mffscrni instructions on older ISAs. The two
>>> instructions
>>> were initially added in ISA 3.0. The __builtin_set_fpscr_rn has
>>> the
>>> needed functionality to set the RN field using the mffscrn and
>>> mffscrni
>>> instructions if ISA 3.0 is supported or fall back to using logical
>>> instructions to mask and set the bits for earlier ISAs. The
>>> instructions return the current value of the FPSCR fields DRN, VE,
>>> OE,
>>> UE, ZE, XE, NI, RN bit positions then update the RN bit positions
>>> with
>>> the new RN value provided.
>>>
>>> The current __builtin_set_fpscr_rn builtin has a return type of
>>> void.
>>> So, changing the return type to double and returning the FPSCR
>>> fields
>>> DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
>>> functionally equivalent of the mffscrn and mffscrni
>>> instructions. Any
>>> current uses of the builtin would just ignore the return value yet
>>> any
>>> new uses could use the return value. So the requirement is for the
>>> change to the __builtin_set_fpscr_rn builtin to be backwardly
>>> compatible and work for all ISAs.
>>>
>>> The following patch changes the return type of the
>>> __builtin_set_fpscr_rn builtin from void to double. The return
>>> value
>>> is the current value of the various FPSCR fields DRN, VE, OE, UE,
>>> ZE,
>>> XE, NI, RN bit positions when the builtin is called. The builtin
>>> then
>>> updated the RN field with the new value provided as an argument to
>>> the
>>> builtin. The patch adds new testcases to test_fpscr_rn_builtin.c
>>> to
>>> check that the builtin returns the current value of the FPSCR
>>> fields
>>> and then updates the RN field.
>>>
>>> The GLibC team has reviewed the patch to make sure it met their
>>> needs
>>> as a drop in replacement for the inline asm mffscr and mffscrni
>>> statements in the GLibC code. T
>>>
>>> The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power
>>> 10
>>> LE.
>>>
>>> Please let me know if the patch is acceptable for
>>> mainline. Thanks.
>>>
>>> Carl
>>>
>>>
>>> ----------------------------------
>>> rs6000, __builtin_set_fpscr_rn add retrun value
>>>
>>> Change the return value from void to double. The return value
>>> consists of
>>> the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit
>>> positions. Add an
>>> overloaded version which accepts a double argument.
>>>
>>> The test powerpc/test_fpscr_rn_builtin.c is updated to add tests
>>> for the
>>> double reterun value and the new double argument.
>>>
>>> gcc/ChangeLog:
>>> * config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn):
>>> Update
>>> builtin definition return type.
>>> * config/rs6000-c.cc(rs6000_target_modify_macros): Add check,
>>> define
>>> __SET_FPSCR_RN_RETURNS_FPSCR__ macro.
>>> * config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
>>> define_expand.
>>> (rs6000_update_fpscr_rn_field): New define_expand.
>>> (rs6000_set_fpscr_rn): Added return argument. Updated
>>> to use new
>>> rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field define
>>> _expands.
>>> * doc/extend.texi (__builtin_set_fpscr_rn): Update description
>>> for
>>> the return value and new double argument. Add descripton for
>>> __SET_FPSCR_RN_RETURNS_FPSCR__ macro.
>>>
>>> gcc/testsuite/ChangeLog:
>>> gcc.target/powerpc/test_fpscr_rn_builtin.c: Renamed to
>>> test_fpscr_rn_builtin_1.c. Added comment.
>>> gcc.target/powerpc/test_fpscr_rn_builtin_2.c: New test for the
>>> return value of __builtin_set_fpscr_rn builtin.
>>> ---
>>> gcc/config/rs6000/rs6000-builtins.def | 2 +-
>>> gcc/config/rs6000/rs6000-c.cc | 4 +
>>> gcc/config/rs6000/rs6000.md | 87 +++++++---
>>> gcc/doc/extend.texi | 26 ++-
>>> ...rn_builtin.c => test_fpscr_rn_builtin_1.c} | 6 +
>>> .../powerpc/test_fpscr_rn_builtin_2.c | 153
>>> ++++++++++++++++++
>>> 6 files changed, 246 insertions(+), 32 deletions(-)
>>> rename gcc/testsuite/gcc.target/powerpc/{test_fpscr_rn_builtin.c
>>> => test_fpscr_rn_builtin_1.c} (92%)
>>> create mode 100644
>>> gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
>>>
>>> diff --git a/gcc/config/rs6000/rs6000-builtins.def
>>> b/gcc/config/rs6000/rs6000-builtins.def
>>> index 289a37998b1..28788b69c7d 100644
>>> --- a/gcc/config/rs6000/rs6000-builtins.def
>>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>>> @@ -237,7 +237,7 @@
>>> const __ibm128 __builtin_pack_ibm128 (double, double);
>>> PACK_IF packif {ibm128}
>>>
>>> - void __builtin_set_fpscr_rn (const int[0,3]);
>>> + double __builtin_set_fpscr_rn (const int[0,3]);
>>> SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
>>>
>>> const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
>>> diff --git a/gcc/config/rs6000/rs6000-c.cc
>>> b/gcc/config/rs6000/rs6000-c.cc
>>> index 8555174d36e..8373bb66919 100644
>>> --- a/gcc/config/rs6000/rs6000-c.cc
>>> +++ b/gcc/config/rs6000/rs6000-c.cc
>>> @@ -604,6 +604,10 @@ rs6000_target_modify_macros (bool define_p,
>>> HOST_WIDE_INT flags)
>>> /* Tell the user -mrop-protect is in play. */
>>> if (rs6000_rop_protect)
>>> rs6000_define_or_undefine_macro (define_p, "__ROP_PROTECT__");
>>> + /* Tell the user the __builtin_set_fpscr_rn now returns the
>>> FPSCR fields
>>> + in a double. Originally the builtin returned void. */
>>> + if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
>>> + rs6000_define_or_undefine_macro (define_p,
>>> "__SET_FPSCR_RN_RETURNS_FPSCR__");
>>> }
>>>
>>> void
>>> diff --git a/gcc/config/rs6000/rs6000.md
>>> b/gcc/config/rs6000/rs6000.md
>>> index b0db8ae508d..1b77a13c8a1 100644
>>> --- a/gcc/config/rs6000/rs6000.md
>>> +++ b/gcc/config/rs6000/rs6000.md
>>> @@ -6440,8 +6440,51 @@
>>> "mffscdrn %0,%1"
>>> [(set_attr "type" "fp")])
>>>
>>> +
>>> +(define_expand "rs6000_get_fpscr_fields"
>>> + [(match_operand:DF 0 "gpc_reg_operand")]
>>> + "TARGET_HARD_FLOAT"
>>> +{
>>> + /* Extract fields bits 29:31 (DRN) and bits 56:63 (VE, OE, UE,
>>> ZE, XE, NI,
>>> + RN) from the FPSCR and return them. */
>>> + rtx tmp_df = gen_reg_rtx (DFmode);
>>> + rtx tmp_di = gen_reg_rtx (DImode);
>>> +
>>> + emit_insn (gen_rs6000_mffs (tmp_df));
>>> + tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
>>> + emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT
>>> (0x00000007000000FFULL)));
>>> + rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
>>> + emit_move_insn (operands[0], tmp_rtn);
>>> + DONE;
>>> +})
>>> +
>>> +(define_expand "rs6000_update_fpscr_rn_field"
>>> + [(match_operand:DI 0 "gpc_reg_operand")]
>>> + "TARGET_HARD_FLOAT"
>>> +{
>>> + /* Insert the new RN value from operands[0] into FPSCR bit
>>> [62:63]. */
>>> + rtx tmp_di = gen_reg_rtx (DImode);
>>> + rtx tmp_df = gen_reg_rtx (DFmode);
>>> +
>>> + emit_insn (gen_rs6000_mffs (tmp_df));
>>> + tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
>>> + emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
>>> + emit_insn (gen_iordi3 (tmp_di, tmp_di, operands[0]));
>>> +
>>> + /* Need to write to field k=15. The fields are [0:15]. Hence
>>> with
>>> + L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W). FLM is
>>> an
>>> + 8-bit field[0:7]. Need to set the bit that corresponds to the
>>> + value of i that you want [0:7]. */
>>> +
>>> + tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
>>> + emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
>>> + DONE;
>>> +})
>>> +
>>> (define_expand "rs6000_set_fpscr_rn"
>>> - [(match_operand:SI 0 "reg_or_cint_operand")]
>>> + [(set (match_operand:DF 0 "gpc_reg_operand")
>>> + (unspec_volatile:DF [(match_operand:SI 1
>>> "reg_or_cint_operand")]
>>> + UNSPECV_MFFSCDRN))]
>>
>> I don't think we need this pattern to be complete, since it's always
>> expanded with "DONE", the pattern is useless, so it can be simpler
>> like:
>>
>> [(use (match_operand:DF 0 "gpc_reg_operand"))
>> (use (match_operand:SI 1 "reg_or_cint_operand"))]
>
> OK, changed this. Went and reviewed the use of set in the GCC internal
> documentation. Not really sure why we can use set with the DONE. Not
> really following your explanation. It does seem to work.
>
It's mainly based on the documentation of define_expand:
"If the preparation falls through (invokes neither DONE nor FAIL), then
the define_expand acts like a define_insn in that the RTL template is
used to generate the insn.
The RTL template is not used for matching, only for generating the initial
insn list. **If the preparation statement always invokes DONE or FAIL, the RTL
template may be reduced to a simple list of operands** ..."
https://gcc.gnu.org/onlinedocs/gccint/Expander-Definitions.html
BR,
Kewen
prev parent reply other threads:[~2023-07-11 5:40 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-01 0:58 Carl Love
2023-07-06 22:54 ` Peter Bergner
2023-07-06 23:00 ` Peter Bergner
2023-07-07 5:08 ` Kewen.Lin
2023-07-07 14:23 ` Peter Bergner
2023-07-10 19:17 ` Carl Love
2023-07-07 4:06 ` Kewen.Lin
2023-07-10 19:18 ` Carl Love
2023-07-11 5:39 ` Kewen.Lin [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=79937d81-4667-5da9-a281-21455d4c1020@linux.ibm.com \
--to=linkw@linux.ibm.com \
--cc=bergner@linux.ibm.com \
--cc=cel@us.ibm.com \
--cc=dje.gcc@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=segher@kernel.crashing.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).