public inbox for gcc-patches@gcc.gnu.org
From: Jeff Law <jeffreyalaw@gmail.com>
To: Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org>,
	"H.J. Lu" <hjl.tools@gmail.com>,
	richard.sandiford@arm.com
Subject: Re: [PATCH v3] DSE: Use the constant store source if possible
Date: Mon, 30 May 2022 16:58:33 -0600
Message-ID: <f516e4a3-984a-6900-2668-47451283771b@gmail.com>
In-Reply-To: <mpt5ylno24d.fsf@arm.com>



On 5/30/2022 2:28 AM, Richard Sandiford wrote:
> Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> On 5/29/2022 3:43 PM, H.J. Lu wrote:
>>> On Sat, May 28, 2022 at 11:37 AM Jeff Law via Gcc-patches
>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>
>>>> On 5/26/2022 2:43 PM, H.J. Lu via Gcc-patches wrote:
>>>>> On Thu, May 26, 2022 at 04:14:17PM +0100, Richard Sandiford wrote:
>>>>>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>>>>>>> On Wed, May 25, 2022 at 12:30 AM Richard Sandiford
>>>>>>> <richard.sandiford@arm.com> wrote:
>>>>>>>> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
>>>>>>>>> On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote:
>>>>>>>>>> On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches
>>>>>>>>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>>>>>>>> When recording store for RTL dead store elimination, check if the source
>>>>>>>>>>> register is set only once to a constant.  If yes, record the constant
>>>>>>>>>>> as the store source.  It eliminates unrolled zero stores after memset 0
>>>>>>>>>>> in a loop where a vector register is used as the zero store source.
>>>>>>>>>>>
>>>>>>>>>>> gcc/
>>>>>>>>>>>
>>>>>>>>>>>            PR rtl-optimization/105638
>>>>>>>>>>>            * dse.cc (record_store): Use the constant source if the source
>>>>>>>>>>>            register is set only once.
>>>>>>>>>>>
>>>>>>>>>>> gcc/testsuite/
>>>>>>>>>>>
>>>>>>>>>>>            PR rtl-optimization/105638
>>>>>>>>>>>            * g++.target/i386/pr105638.C: New test.
>>>>>>>>>>> ---
>>>>>>>>>>>     gcc/dse.cc                               | 19 ++++++++++
>>>>>>>>>>>     gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++
>>>>>>>>>>>     2 files changed, 63 insertions(+)
>>>>>>>>>>>     create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/gcc/dse.cc b/gcc/dse.cc
>>>>>>>>>>> index 30c11cee034..0433dd3d846 100644
>>>>>>>>>>> --- a/gcc/dse.cc
>>>>>>>>>>> +++ b/gcc/dse.cc
>>>>>>>>>>> @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info)
>>>>>>>>>>>
>>>>>>>>>>>              if (tem && CONSTANT_P (tem))
>>>>>>>>>>>                const_rhs = tem;
>>>>>>>>>>> +         else
>>>>>>>>>>> +           {
>>>>>>>>>>> +             /* If RHS is set only once to a constant, set CONST_RHS
>>>>>>>>>>> +                to the constant.  */
>>>>>>>>>>> +             df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
>>>>>>>>>>> +             if (def != nullptr
>>>>>>>>>>> +                 && !DF_REF_IS_ARTIFICIAL (def)
>>>>>>>>>>> +                 && !DF_REF_NEXT_REG (def))
>>>>>>>>>>> +               {
>>>>>>>>>>> +                 rtx_insn *def_insn = DF_REF_INSN (def);
>>>>>>>>>>> +                 rtx def_body = PATTERN (def_insn);
>>>>>>>>>>> +                 if (GET_CODE (def_body) == SET)
>>>>>>>>>>> +                   {
>>>>>>>>>>> +                     rtx def_src = SET_SRC (def_body);
>>>>>>>>>>> +                     if (CONSTANT_P (def_src))
>>>>>>>>>>> +                       const_rhs = def_src;
>>>>>>>>>> doesn't DSE have its own tracking of stored values?  Shouldn't we
>>>>>>>>> It tracks stored values only within the basic block.  When RTL loop
>>>>>>>>> invariant motion hoists a constant initialization out of the loop into
>>>>>>>>> a separate basic block, the constant store value becomes unknown
>>>>>>>>> within the original basic block.
>>>>>>>>>
>>>>>>>>>> improve _that_ if it is not enough?  I also wonder if you need to
>>>>>>>>> My patch extends DSE stored value tracking to include the constant which
>>>>>>>>> is set only once in another basic block.
>>>>>>>>>
>>>>>>>>>> verify the SET isn't partial?
>>>>>>>>>>
>>>>>>>>> Here is the v2 patch to check that the constant is set by a non-partial
>>>>>>>>> unconditional load.
>>>>>>>>>
>>>>>>>>> OK for master?
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> H.J.
>>>>>>>>> ---
>>>>>>>>> RTL DSE tracks redundant constant stores within a basic block.  When RTL
>>>>>>>>> loop invariant motion hoists a constant initialization out of the loop
>>>>>>>>> into a separate basic block, the constant store value becomes unknown
>>>>>>>>> within the original basic block.  When recording store for RTL DSE, check
>>>>>>>>> if the source register is set only once to a constant by a non-partial
>>>>>>>>> unconditional load.  If yes, record the constant as the constant store
>>>>>>>>> source.  It eliminates unrolled zero stores after memset 0 in a loop
>>>>>>>>> where a vector register is used as the zero store source.
>>>>>>>>>
>>>>>>>>> gcc/
>>>>>>>>>
>>>>>>>>>          PR rtl-optimization/105638
>>>>>>>>>          * dse.cc (record_store): Use the constant source if the source
>>>>>>>>>          register is set only once.
>>>>>>>>>
>>>>>>>>> gcc/testsuite/
>>>>>>>>>
>>>>>>>>>          PR rtl-optimization/105638
>>>>>>>>>          * g++.target/i386/pr105638.C: New test.
>>>>>>>>> ---
>>>>>>>>>     gcc/dse.cc                               | 22 ++++++++++++
>>>>>>>>>     gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++
>>>>>>>>>     2 files changed, 66 insertions(+)
>>>>>>>>>     create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
>>>>>>>>>
>>>>>>>>> diff --git a/gcc/dse.cc b/gcc/dse.cc
>>>>>>>>> index 30c11cee034..af8e88dac32 100644
>>>>>>>>> --- a/gcc/dse.cc
>>>>>>>>> +++ b/gcc/dse.cc
>>>>>>>>> @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info)
>>>>>>>>>
>>>>>>>>>            if (tem && CONSTANT_P (tem))
>>>>>>>>>              const_rhs = tem;
>>>>>>>>> +       else
>>>>>>>>> +         {
>>>>>>>>> +           /* If RHS is set only once to a constant, set CONST_RHS
>>>>>>>>> +              to the constant.  */
>>>>>>>>> +           df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
>>>>>>>>> +           if (def != nullptr
>>>>>>>>> +               && !DF_REF_IS_ARTIFICIAL (def)
>>>>>>>>> +               && !(DF_REF_FLAGS (def)
>>>>>>>>> +                    & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))
>>>>>>>>> +               && !DF_REF_NEXT_REG (def))
>>>>>>>> Can we introduce a helper for this?  There are already similar tests
>>>>>>>> in ira and loop-iv, and it seems a bit too complex to have to open-code
>>>>>>>> each time.
>>>>>>> I can use find_single_def_src in loop-iv.cc:
>>>>>>>
>>>>>>> /* If REGNO has a single definition, return its known value, otherwise return
>>>>>>>       null.  */
>>>>>>>
>>>>>>> rtx
>>>>>>> find_single_def_src (unsigned int regno)
>>>>>> Yeah, reusing that sounds good.  Perhaps we should move it into df-core.cc,
>>>>>> alongside the df_reg_used group of functions.
>>>>>>
>>>>>> I think the mode check in your original patch should be kept though,
>>>>>> so how about we change the parameter to an rtx reg and use rtx_equal in:
>>>>>>
>>>>>>          rtx set = single_set (DF_REF_INSN (adef));
>>>>>>          if (set == NULL || !REG_P (SET_DEST (set))
>>>>>>          || REGNO (SET_DEST (set)) != regno)
>>>>>>        return NULL_RTX;
>>>>> Fixed.
>>>>>
>>>>>> rather than the existing !REG_P and REGNO checks.  We should also add:
>>>>>>
>>>>>>> and do
>>>>>>>
>>>>>>>                 /* If RHS is set only once to a constant, set CONST_RHS
>>>>>>>                     to the constant.  */
>>>>>>>                  rtx def_src = find_single_def_src (REGNO (rhs));
>>>>>>>                  if (def_src != nullptr && CONSTANT_P (def_src))
>>>>>>>                    {
>>>>>>>                      df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
>>>>>>>                      if (!(DF_REF_FLAGS (def)
>>>>>>>                            & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
>>>>>> …this check to the function, since it's needed for correctness even
>>>>>> in the loop-iv.cc use.
>>>>> Fixed.
>>>>>
>>>>>> Thanks,
>>>>>> Richard
>>>>> Here is the v3 patch.  OK for master?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> H.J.
>>>>> ---
>>>>> RTL DSE tracks redundant constant stores within a basic block.  When RTL
>>>>> loop invariant motion hoists a constant initialization out of the loop
>>>>> into a separate basic block, the constant store value becomes unknown
>>>>> within the original basic block.  When recording store for RTL DSE, check
>>>>> if the source register is set only once to a constant by a non-partial
>>>>> unconditional load.  If yes, record the constant as the constant store
>>>>> source.  It eliminates unrolled zero stores after memset 0 in a loop
>>>>> where a vector register is used as the zero store source.
>>>>>
>>>>> Extract find_single_def_src from loop-iv.cc and move it to df-core.cc:
>>>>>
>>>>> 1. Rename to df_find_single_def_src.
>>>>> 2. Change the argument to rtx and use rtx_equal_p.
>>>>> 3. Return null for partial or conditional defs.
>>>>>
>>>>> gcc/
>>>>>
>>>>>         PR rtl-optimization/105638
>>>>>         * df-core.cc (df_find_single_def_src): Moved and renamed from
>>>>>         find_single_def_src in loop-iv.cc.  Change the argument to rtx
>>>>>         and use rtx_equal_p.  Return null for partial or conditional
>>>>>         defs.
>>>>>         * df.h (df_find_single_def_src): New prototype.
>>>>>         * dse.cc (record_store): Use the constant source if the source
>>>>>         register is set only once.
>>>>>         * loop-iv.cc (find_single_def_src): Moved to df-core.cc.
>>>>>         (replace_single_def_regs): Replace find_single_def_src with
>>>>>         df_find_single_def_src.
>>>>>
>>>>> gcc/testsuite/
>>>>>
>>>>>         PR rtl-optimization/105638
>>>>>         * g++.target/i386/pr105638.C: New test.
>>>> Avoiding REG_EQUAL and only handling REG_EQUIV notes would be better
>>>> here.  REG_EQUIV indicates the destination could be replaced with the
>>>> source of the note at any point and semantically the code would still be
>>>> valid.  REG_EQUAL doesn't give us that guarantee.
>>>>
>>>> To allow REG_EQUAL you have to check that the block with the note
>>>> dominates the use.
>>> When a use only has one non-conditional def which doesn't dominate
>>> the use, isn't its behavior undefined?
>> I think so, even for an irreducible loop.  Even so, it's the safest
>> thing to do, particularly if someone tries to extend this code in the
>> future.
> If any use of R was upwards exposed, R would have an artificial
> definition in the entry block in addition to any “real” definition.
> Since we're checking for exactly one definition in total, it should
> follow that the definition dominates all uses.
Oh, duh.  My mind was stuck in the old reg_n_{refs,defs} way of 
thinking, not the DF world.  My bad.
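
For anyone following along, Richard's argument can be sketched as a toy
model.  This is not GCC code: ToyDef, toy_build_chain and
toy_single_const_def are hypothetical names made up for illustration, and
the flags only stand in for DF_REF_IS_ARTIFICIAL, DF_REF_PARTIAL and
DF_REF_CONDITIONAL.  The model assumes, as described above, that an
upwards-exposed use gives the register an artificial definition in the
entry block, so "exactly one definition in total" rules that case out.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Toy stand-in for one DF definition of a pseudo register (hypothetical type).
struct ToyDef {
  bool artificial;                   // models DF_REF_IS_ARTIFICIAL
  bool partial;                      // models DF_REF_PARTIAL
  bool conditional;                  // models DF_REF_CONDITIONAL
  std::optional<long> constant_src;  // set when the def is "reg = constant"
};

// All definitions recorded for one register (models the def chain).
using ToyDefChain = std::vector<ToyDef>;

// Build the chain under the assumption stated in the thread: an
// upwards-exposed use adds an artificial entry-block definition.
ToyDefChain toy_build_chain(const std::vector<ToyDef> &real_defs,
                            bool upwards_exposed_use) {
  ToyDefChain chain;
  if (upwards_exposed_use)
    chain.push_back({true, false, false, std::nullopt});
  for (const ToyDef &d : real_defs)
    chain.push_back(d);
  return chain;
}

// Return the constant only if the register has exactly one definition in
// total and that definition is real, non-partial and unconditional.
std::optional<long> toy_single_const_def(const ToyDefChain &chain) {
  if (chain.size() != 1)
    return std::nullopt;
  const ToyDef &d = chain.front();
  if (d.artificial || d.partial || d.conditional)
    return std::nullopt;
  return d.constant_src;
}
```

In this model a register with one real constant definition and no
upwards-exposed use yields the constant, while adding an upwards-exposed
use makes the chain two entries long and the lookup fails, which is the
property that makes a separate dominance check unnecessary.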

If you're OK with the patch, so am I.

jeff


Thread overview: 21+ messages
2022-05-21  3:01 [PATCH] DSE: Use the constant " H.J. Lu
2022-05-23 10:38 ` Richard Biener
2022-05-23 18:34   ` [PATCH v2] DSE: Use the constant store " H.J. Lu
2022-05-24  6:42     ` Richard Biener
2022-05-24 20:10       ` H.J. Lu
2022-05-25  9:22         ` Richard Biener
2022-05-25  9:30           ` Richard Sandiford
2022-05-25 19:01             ` H.J. Lu
2022-05-25  7:30     ` Richard Sandiford
2022-05-25 18:56       ` H.J. Lu
2022-05-26 15:14         ` Richard Sandiford
2022-05-26 20:43           ` [PATCH v3] " H.J. Lu
2022-05-28 18:37             ` Jeff Law
2022-05-29 21:43               ` H.J. Lu
2022-05-29 22:55                 ` Jeff Law
2022-05-30  8:28                   ` Richard Sandiford
2022-05-30 22:58                     ` Jeff Law [this message]
2022-05-30  8:35             ` Richard Sandiford
2022-05-31 17:12               ` [PATCH v4] " H.J. Lu
2022-06-01  7:20                 ` Richard Sandiford
2022-06-01 21:07                   ` H.J. Lu
