From: Richard Sandiford <richard.sandiford@arm.com>
To: Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH v3] DSE: Use the constant store source if possible
Date: Mon, 30 May 2022 09:28:50 +0100 [thread overview]
Message-ID: <mpt5ylno24d.fsf@arm.com> (raw)
In-Reply-To: <82a25496-2a00-4876-7a57-df09c983f2fe@gmail.com> (Jeff Law via Gcc-patches's message of "Sun, 29 May 2022 16:55:04 -0600")
Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On 5/29/2022 3:43 PM, H.J. Lu wrote:
>> On Sat, May 28, 2022 at 11:37 AM Jeff Law via Gcc-patches
>> <gcc-patches@gcc.gnu.org> wrote:
>>>
>>>
>>> On 5/26/2022 2:43 PM, H.J. Lu via Gcc-patches wrote:
>>>> On Thu, May 26, 2022 at 04:14:17PM +0100, Richard Sandiford wrote:
>>>>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>>>>>> On Wed, May 25, 2022 at 12:30 AM Richard Sandiford
>>>>>> <richard.sandiford@arm.com> wrote:
>>>>>>> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
>>>>>>>> On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote:
>>>>>>>>> On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches
>>>>>>>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>>>>>>> When recording store for RTL dead store elimination, check if the source
>>>>>>>>>> register is set only once to a constant. If yes, record the constant
>>>>>>>>>> as the store source. It eliminates unrolled zero stores after memset 0
>>>>>>>>>> in a loop where a vector register is used as the zero store source.
>>>>>>>>>>
>>>>>>>>>> gcc/
>>>>>>>>>>
>>>>>>>>>> PR rtl-optimization/105638
>>>>>>>>>> * dse.cc (record_store): Use the constant source if the source
>>>>>>>>>> register is set only once.
>>>>>>>>>>
>>>>>>>>>> gcc/testsuite/
>>>>>>>>>>
>>>>>>>>>> PR rtl-optimization/105638
>>>>>>>>>> * g++.target/i386/pr105638.C: New test.
>>>>>>>>>> ---
>>>>>>>>>> gcc/dse.cc | 19 ++++++++++
>>>>>>>>>> gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++
>>>>>>>>>> 2 files changed, 63 insertions(+)
>>>>>>>>>> create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
>>>>>>>>>>
>>>>>>>>>> diff --git a/gcc/dse.cc b/gcc/dse.cc
>>>>>>>>>> index 30c11cee034..0433dd3d846 100644
>>>>>>>>>> --- a/gcc/dse.cc
>>>>>>>>>> +++ b/gcc/dse.cc
>>>>>>>>>> @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info)
>>>>>>>>>>
>>>>>>>>>> if (tem && CONSTANT_P (tem))
>>>>>>>>>> const_rhs = tem;
>>>>>>>>>> + else
>>>>>>>>>> + {
>>>>>>>>>> + /* If RHS is set only once to a constant, set CONST_RHS
>>>>>>>>>> + to the constant. */
>>>>>>>>>> + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
>>>>>>>>>> + if (def != nullptr
>>>>>>>>>> + && !DF_REF_IS_ARTIFICIAL (def)
>>>>>>>>>> + && !DF_REF_NEXT_REG (def))
>>>>>>>>>> + {
>>>>>>>>>> + rtx_insn *def_insn = DF_REF_INSN (def);
>>>>>>>>>> + rtx def_body = PATTERN (def_insn);
>>>>>>>>>> + if (GET_CODE (def_body) == SET)
>>>>>>>>>> + {
>>>>>>>>>> + rtx def_src = SET_SRC (def_body);
>>>>>>>>>> + if (CONSTANT_P (def_src))
>>>>>>>>>> + const_rhs = def_src;
>>>>>>>>> doesn't DSE have its own tracking of stored values? Shouldn't we
>>>>>>>> It tracks stored values only within the basic block. When RTL loop
>>>>>>>> invariant motion hoists a constant initialization out of the loop into
>>>>>>>> a separate basic block, the constant store value becomes unknown
>>>>>>>> within the original basic block.
>>>>>>>>
>>>>>>>>> improve _that_ if it is not enough? I also wonder if you need to
>>>>>>>> My patch extends DSE stored value tracking to include the constant which
>>>>>>>> is set only once in another basic block.
>>>>>>>>
>>>>>>>>> verify the SET isn't partial?
>>>>>>>>>
>>>>>>>> Here is the v2 patch to check that the constant is set by a non-partial
>>>>>>>> unconditional load.
>>>>>>>>
>>>>>>>> OK for master?
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> H.J.
>>>>>>>> ---
>>>>>>>> RTL DSE tracks redundant constant stores within a basic block. When RTL
>>>>>>>> loop invariant motion hoists a constant initialization out of the loop
>>>>>>>> into a separate basic block, the constant store value becomes unknown
>>>>>>>> within the original basic block. When recording store for RTL DSE, check
>>>>>>>> if the source register is set only once to a constant by a non-partial
>>>>>>>> unconditional load. If yes, record the constant as the constant store
>>>>>>>> source. It eliminates unrolled zero stores after memset 0 in a loop
>>>>>>>> where a vector register is used as the zero store source.
>>>>>>>>
>>>>>>>> gcc/
>>>>>>>>
>>>>>>>> PR rtl-optimization/105638
>>>>>>>> * dse.cc (record_store): Use the constant source if the source
>>>>>>>> register is set only once.
>>>>>>>>
>>>>>>>> gcc/testsuite/
>>>>>>>>
>>>>>>>> PR rtl-optimization/105638
>>>>>>>> * g++.target/i386/pr105638.C: New test.
>>>>>>>> ---
>>>>>>>> gcc/dse.cc | 22 ++++++++++++
>>>>>>>> gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++
>>>>>>>> 2 files changed, 66 insertions(+)
>>>>>>>> create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
>>>>>>>>
>>>>>>>> diff --git a/gcc/dse.cc b/gcc/dse.cc
>>>>>>>> index 30c11cee034..af8e88dac32 100644
>>>>>>>> --- a/gcc/dse.cc
>>>>>>>> +++ b/gcc/dse.cc
>>>>>>>> @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info)
>>>>>>>>
>>>>>>>> if (tem && CONSTANT_P (tem))
>>>>>>>> const_rhs = tem;
>>>>>>>> + else
>>>>>>>> + {
>>>>>>>> + /* If RHS is set only once to a constant, set CONST_RHS
>>>>>>>> + to the constant. */
>>>>>>>> + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
>>>>>>>> + if (def != nullptr
>>>>>>>> + && !DF_REF_IS_ARTIFICIAL (def)
>>>>>>>> + && !(DF_REF_FLAGS (def)
>>>>>>>> + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))
>>>>>>>> + && !DF_REF_NEXT_REG (def))
>>>>>>> Can we introduce a helper for this? There are already similar tests
>>>>>>> in ira and loop-iv, and it seems a bit too complex to have to open-code
>>>>>>> each time.
>>>>>> I can use find_single_def_src in loop-iv.cc:
>>>>>>
>>>>>> /* If REGNO has a single definition, return its known value, otherwise return
>>>>>> null. */
>>>>>>
>>>>>> rtx
>>>>>> find_single_def_src (unsigned int regno)
>>>>> Yeah, reusing that sounds good. Perhaps we should move it into df-core.cc,
>>>>> alongside the df_reg_used group of functions.
>>>>>
>>>>> I think the mode check in your original patch should be kept though,
>>>>> so how about we change the parameter to an rtx reg and use rtx_equal in:
>>>>>
>>>>> rtx set = single_set (DF_REF_INSN (adef));
>>>>> if (set == NULL || !REG_P (SET_DEST (set))
>>>>> || REGNO (SET_DEST (set)) != regno)
>>>>> return NULL_RTX;
>>>> Fixed.
>>>>
>>>>> rather than the existing !REG_P and REGNO checks. We should also add:
>>>>>
>>>>>> and do
>>>>>>
>>>>>> /* If RHS is set only once to a constant, set CONST_RHS
>>>>>> to the constant. */
>>>>>> rtx def_src = find_single_def_src (REGNO (rhs));
>>>>>> if (def_src != nullptr && CONSTANT_P (def_src))
>>>>>> {
>>>>>> df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
>>>>>> if (!(DF_REF_FLAGS (def)
>>>>>> & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
>>>>> …this check to the function, since it's needed for correctness even
>>>>> in the loop-iv.cc use.
>>>> Fixed.
>>>>
>>>>> Thanks,
>>>>> Richard
>>>> Here is the v3 patch. OK for master?
>>>>
>>>> Thanks.
>>>>
>>>> H.J.
>>>> ---
>>>> RTL DSE tracks redundant constant stores within a basic block. When RTL
>>>> loop invariant motion hoists a constant initialization out of the loop
>>>> into a separate basic block, the constant store value becomes unknown
>>>> within the original basic block. When recording store for RTL DSE, check
>>>> if the source register is set only once to a constant by a non-partial
>>>> unconditional load. If yes, record the constant as the constant store
>>>> source. It eliminates unrolled zero stores after memset 0 in a loop
>>>> where a vector register is used as the zero store source.
>>>>
>>>> Extract find_single_def_src from loop-iv.cc and move it to df-core.cc:
>>>>
>>>> 1. Rename to df_find_single_def_src.
>>>> 2. Change the argument to rtx and use rtx_equal_p.
>>>> 3. Return null for partial or conditional defs.
>>>>
>>>> gcc/
>>>>
>>>> PR rtl-optimization/105638
>>>> * df-core.cc (df_find_single_def_sr): Moved and renamed from
>>>> find_single_def_src in loop-iv.cc. Change the argument to rtx
>>>> and use rtx_equal_p. Return null for partial or conditional
>>>> defs.
>>>> * df.h (df_find_single_def_src): New prototype.
>>>> * dse.cc (record_store): Use the constant source if the source
>>>> register is set only once.
>>>> * loop-iv.cc (find_single_def_src): Moved to df-core.cc.
>>>> (replace_single_def_regs): Replace find_single_def_src with
>>>> df_find_single_def_src.
>>>>
>>>> gcc/testsuite/
>>>>
>>>> PR rtl-optimization/105638
>>>> * g++.target/i386/pr105638.C: New test.
>>> Avoiding REG_EQUAL and only handling REG_EQUIV notes would be better
>>> here. REG_EQUIV indicates the destination could be replaced with the
>>> source of the note at any point and semantically the code would still be
>>> valid. REG_EQUAL doesn't give us that guarantee.
>>>
>>> To allow REG_EQUAL you have to check that the block with the note
>>> dominates the use.
>> When a use only has one non-conditional def which doesn't dominate
>> the use, isn't its behavior undefined?
> I think so, even for an irreducible loop. Even so, it's the safest
> thing to do, particularly if someone tries to extend this code in the
> future.
If any use of R was upwards exposed, R would have an artificial
definition in the entry block in addition to any “real” definition.
Since we're checking for exactly one definition in total, it should
follow that the definition dominates all uses.
Thanks,
Richard
next prev parent reply other threads:[~2022-05-30 8:28 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-05-21 3:01 [PATCH] DSE: Use the constant " H.J. Lu
2022-05-23 10:38 ` Richard Biener
2022-05-23 18:34 ` [PATCH v2] DSE: Use the constant store " H.J. Lu
2022-05-24 6:42 ` Richard Biener
2022-05-24 20:10 ` H.J. Lu
2022-05-25 9:22 ` Richard Biener
2022-05-25 9:30 ` Richard Sandiford
2022-05-25 19:01 ` H.J. Lu
2022-05-25 7:30 ` Richard Sandiford
2022-05-25 18:56 ` H.J. Lu
2022-05-26 15:14 ` Richard Sandiford
2022-05-26 20:43 ` [PATCH v3] " H.J. Lu
2022-05-28 18:37 ` Jeff Law
2022-05-29 21:43 ` H.J. Lu
2022-05-29 22:55 ` Jeff Law
2022-05-30 8:28 ` Richard Sandiford [this message]
2022-05-30 22:58 ` Jeff Law
2022-05-30 8:35 ` Richard Sandiford
2022-05-31 17:12 ` [PATCH v4] " H.J. Lu
2022-06-01 7:20 ` Richard Sandiford
2022-06-01 21:07 ` H.J. Lu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=mpt5ylno24d.fsf@arm.com \
--to=richard.sandiford@arm.com \
--cc=gcc-patches@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).