public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: "H.J. Lu" <hjl.tools@gmail.com>
Cc: Gcc-patches <gcc-patches@gcc.gnu.org>,
	Richard Biener <richard.guenther@gmail.com>
Subject: Re: [PATCH v3] DSE: Use the constant store source if possible
Date: Mon, 30 May 2022 09:35:43 +0100	[thread overview]
Message-ID: <mptv8tnmn8g.fsf@arm.com> (raw)
In-Reply-To: <Yo/mVc4I+l9zd+KZ@gmail.com> (H. J. Lu's message of "Thu, 26 May 2022 13:43:01 -0700")

"H.J. Lu" <hjl.tools@gmail.com> writes:
> On Thu, May 26, 2022 at 04:14:17PM +0100, Richard Sandiford wrote:
>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>> > On Wed, May 25, 2022 at 12:30 AM Richard Sandiford
>> > <richard.sandiford@arm.com> wrote:
>> >>
>> >> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
>> >> > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote:
>> >> >> On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches
>> >> >> <gcc-patches@gcc.gnu.org> wrote:
>> >> >> >
>> >> >> > When recording store for RTL dead store elimination, check if the source
>> >> >> > register is set only once to a constant.  If yes, record the constant
>> >> >> > as the store source.  It eliminates unrolled zero stores after memset 0
>> >> >> > in a loop where a vector register is used as the zero store source.
>> >> >> >
>> >> >> > gcc/
>> >> >> >
>> >> >> >         PR rtl-optimization/105638
>> >> >> >         * dse.cc (record_store): Use the constant source if the source
>> >> >> >         register is set only once.
>> >> >> >
>> >> >> > gcc/testsuite/
>> >> >> >
>> >> >> >         PR rtl-optimization/105638
>> >> >> >         * g++.target/i386/pr105638.C: New test.
>> >> >> > ---
>> >> >> >  gcc/dse.cc                               | 19 ++++++++++
>> >> >> >  gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++
>> >> >> >  2 files changed, 63 insertions(+)
>> >> >> >  create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
>> >> >> >
>> >> >> > diff --git a/gcc/dse.cc b/gcc/dse.cc
>> >> >> > index 30c11cee034..0433dd3d846 100644
>> >> >> > --- a/gcc/dse.cc
>> >> >> > +++ b/gcc/dse.cc
>> >> >> > @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info)
>> >> >> >
>> >> >> >           if (tem && CONSTANT_P (tem))
>> >> >> >             const_rhs = tem;
>> >> >> > +         else
>> >> >> > +           {
>> >> >> > +             /* If RHS is set only once to a constant, set CONST_RHS
>> >> >> > +                to the constant.  */
>> >> >> > +             df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
>> >> >> > +             if (def != nullptr
>> >> >> > +                 && !DF_REF_IS_ARTIFICIAL (def)
>> >> >> > +                 && !DF_REF_NEXT_REG (def))
>> >> >> > +               {
>> >> >> > +                 rtx_insn *def_insn = DF_REF_INSN (def);
>> >> >> > +                 rtx def_body = PATTERN (def_insn);
>> >> >> > +                 if (GET_CODE (def_body) == SET)
>> >> >> > +                   {
>> >> >> > +                     rtx def_src = SET_SRC (def_body);
>> >> >> > +                     if (CONSTANT_P (def_src))
>> >> >> > +                       const_rhs = def_src;
>> >> >>
>> >> >> doesn't DSE have its own tracking of stored values?  Shouldn't we
>> >> >
>> >> > It tracks stored values only within the basic block.  When RTL loop
>> >> > invariant motion hoists a constant initialization out of the loop into
>> >> > a separate basic block, the constant store value becomes unknown
>> >> > within the original basic block.
>> >> >
>> >> >> improve _that_ if it is not enough?  I also wonder if you need to
>> >> >
>> >> > My patch extends DSE stored value tracking to include the constant which
>> >> > is set only once in another basic block.
>> >> >
>> >> >> verify the SET isn't partial?
>> >> >>
>> >> >
>> >> > Here is the v2 patch to check that the constant is set by a non-partial
>> >> > unconditional load.
>> >> >
>> >> > OK for master?
>> >> >
>> >> > Thanks.
>> >> >
>> >> > H.J.
>> >> > ---
>> >> > RTL DSE tracks redundant constant stores within a basic block.  When RTL
>> >> > loop invariant motion hoists a constant initialization out of the loop
>> >> > into a separate basic block, the constant store value becomes unknown
>> >> > within the original basic block.  When recording store for RTL DSE, check
>> >> > if the source register is set only once to a constant by a non-partial
>> >> > unconditional load.  If yes, record the constant as the constant store
>> >> > source.  It eliminates unrolled zero stores after memset 0 in a loop
>> >> > where a vector register is used as the zero store source.
>> >> >
>> >> > gcc/
>> >> >
>> >> >       PR rtl-optimization/105638
>> >> >       * dse.cc (record_store): Use the constant source if the source
>> >> >       register is set only once.
>> >> >
>> >> > gcc/testsuite/
>> >> >
>> >> >       PR rtl-optimization/105638
>> >> >       * g++.target/i386/pr105638.C: New test.
>> >> > ---
>> >> >  gcc/dse.cc                               | 22 ++++++++++++
>> >> >  gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++
>> >> >  2 files changed, 66 insertions(+)
>> >> >  create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
>> >> >
>> >> > diff --git a/gcc/dse.cc b/gcc/dse.cc
>> >> > index 30c11cee034..af8e88dac32 100644
>> >> > --- a/gcc/dse.cc
>> >> > +++ b/gcc/dse.cc
>> >> > @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info)
>> >> >
>> >> >         if (tem && CONSTANT_P (tem))
>> >> >           const_rhs = tem;
>> >> > +       else
>> >> > +         {
>> >> > +           /* If RHS is set only once to a constant, set CONST_RHS
>> >> > +              to the constant.  */
>> >> > +           df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
>> >> > +           if (def != nullptr
>> >> > +               && !DF_REF_IS_ARTIFICIAL (def)
>> >> > +               && !(DF_REF_FLAGS (def)
>> >> > +                    & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))
>> >> > +               && !DF_REF_NEXT_REG (def))
>> >>
>> >> Can we introduce a helper for this?  There are already similar tests
>> >> in ira and loop-iv, and it seems a bit too complex to have to open-code
>> >> each time.
>> >
>> > I can use find_single_def_src in loop-iv.cc:
>> >
>> > /* If REGNO has a single definition, return its known value, otherwise return
>> >    null.  */
>> >
>> > rtx
>> > find_single_def_src (unsigned int regno)
>> 
>> Yeah, reusing that sounds good.  Perhaps we should move it into df-core.cc,
>> alongside the df_reg_used group of functions.
>> 
>> I think the mode check in your original patch should be kept though,
>> so how about we change the parameter to an rtx reg and use rtx_equal in:
>> 
>>       rtx set = single_set (DF_REF_INSN (adef));
>>       if (set == NULL || !REG_P (SET_DEST (set))
>> 	  || REGNO (SET_DEST (set)) != regno)
>> 	return NULL_RTX;
>
> Fixed.
>
>> rather than the existing !REG_P and REGNO checks.  We should also add:
>> 
>> >
>> > and do
>> >
>> >              /* If RHS is set only once to a constant, set CONST_RHS
>> >                  to the constant.  */
>> >               rtx def_src = find_single_def_src (REGNO (rhs));
>> >               if (def_src != nullptr && CONSTANT_P (def_src))
>> >                 {
>> >                   df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
>> >                   if (!(DF_REF_FLAGS (def)
>> >                         & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
>> 
>> …this check to the function, since it's needed for correctness even
>> in the loop-iv.cc use.
>
> Fixed.
>
>> 
>> Thanks,
>> Richard
>
> Here is the v3 patch.  OK for master?
>
> Thanks.
>
> H.J.
> ---
> RTL DSE tracks redundant constant stores within a basic block.  When RTL
> loop invariant motion hoists a constant initialization out of the loop
> into a separate basic block, the constant store value becomes unknown
> within the original basic block.  When recording store for RTL DSE, check
> if the source register is set only once to a constant by a non-partial
> unconditional load.  If yes, record the constant as the constant store
> source.  It eliminates unrolled zero stores after memset 0 in a loop
> where a vector register is used as the zero store source.
>
> Extract find_single_def_src from loop-iv.cc and move it to df-core.cc:
>
> 1. Rename to df_find_single_def_src.
> 2. Change the argument to rtx and use rtx_equal_p.
> 3. Return null for partial or conditional defs.
>
> gcc/
>
> 	PR rtl-optimization/105638
> 	* df-core.cc (df_find_single_def_sr): Moved and renamed from
> 	find_single_def_src in loop-iv.cc.  Change the argument to rtx
> 	and use rtx_equal_p.  Return null for partial or conditional
> 	defs.
> 	* df.h (df_find_single_def_src): New prototype.
> 	* dse.cc (record_store): Use the constant source if the source
> 	register is set only once.
> 	* loop-iv.cc (find_single_def_src): Moved to df-core.cc.
> 	(replace_single_def_regs): Replace find_single_def_src with
> 	df_find_single_def_src.
>
> gcc/testsuite/
>
> 	PR rtl-optimization/105638
> 	* g++.target/i386/pr105638.C: New test.
> ---
>  gcc/df-core.cc                           | 44 +++++++++++++++++++++++
>  gcc/df.h                                 |  1 +
>  gcc/dse.cc                               | 14 ++++++++
>  gcc/loop-iv.cc                           | 45 +-----------------------
>  gcc/testsuite/g++.target/i386/pr105638.C | 44 +++++++++++++++++++++++
>  5 files changed, 104 insertions(+), 44 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
>
> diff --git a/gcc/df-core.cc b/gcc/df-core.cc
> index a901b84878f..f9b4de8eb7a 100644
> --- a/gcc/df-core.cc
> +++ b/gcc/df-core.cc
> @@ -2009,6 +2009,50 @@ df_reg_used (rtx_insn *insn, rtx reg)
>    return df_find_use (insn, reg) != NULL;
>  }
>  
> +/* If REG has a single definition, return its known value, otherwise return
> +   null.  */
> +
> +rtx
> +df_find_single_def_src (rtx reg)
> +{
> +  rtx src = NULL_RTX;
> +
> +  /* Don't look through unbounded number of single definition REG copies,
> +     there might be loops for sources with uninitialized variables.  */
> +  for (int cnt = 0; cnt < 128; cnt++)
> +    {
> +      df_ref adef = DF_REG_DEF_CHAIN (REGNO (reg));
> +      if (adef == NULL || DF_REF_NEXT_REG (adef) != NULL
> +	  || DF_REF_IS_ARTIFICIAL (adef)
> +	  || (DF_REF_FLAGS (adef)
> +	      & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
> +	return NULL_RTX;
> +
> +      rtx set = single_set (DF_REF_INSN (adef));
> +      if (set == NULL || !rtx_equal_p (SET_DEST (set), reg))
> +	return NULL_RTX;
> +
> +      rtx note = find_reg_equal_equiv_note (DF_REF_INSN (adef));
> +      if (note && function_invariant_p (XEXP (note, 0)))
> +	{
> +	  src = XEXP (note, 0);
> +	  break;
> +	}

Seems simpler to return this directly, rather than break and then
check function_invariant_p again.

> +      src = SET_SRC (set);
> +
> +      if (REG_P (src))
> +	{
> +	  reg = src;
> +	  continue;
> +	}
> +      break;
> +    }
> +  if (!function_invariant_p (src))
> +    return NULL_RTX;
> +
> +  return src;
> +}
> +
>  \f
>  /*----------------------------------------------------------------------------
>     Debugging and printing functions.
> diff --git a/gcc/df.h b/gcc/df.h
> index bd329205d08..71e249ad20a 100644
> --- a/gcc/df.h
> +++ b/gcc/df.h
> @@ -991,6 +991,7 @@ extern df_ref df_find_def (rtx_insn *, rtx);
>  extern bool df_reg_defined (rtx_insn *, rtx);
>  extern df_ref df_find_use (rtx_insn *, rtx);
>  extern bool df_reg_used (rtx_insn *, rtx);
> +extern rtx df_find_single_def_src (rtx);
>  extern void df_worklist_dataflow (struct dataflow *,bitmap, int *, int);
>  extern void df_print_regset (FILE *file, const_bitmap r);
>  extern void df_print_word_regset (FILE *file, const_bitmap r);
> diff --git a/gcc/dse.cc b/gcc/dse.cc
> index 30c11cee034..c915266f025 100644
> --- a/gcc/dse.cc
> +++ b/gcc/dse.cc
> @@ -1508,6 +1508,20 @@ record_store (rtx body, bb_info_t bb_info)
>  
>  	  if (tem && CONSTANT_P (tem))
>  	    const_rhs = tem;
> +	  else
> +	    {
> +	      /* If RHS is set only once to a constant, set CONST_RHS
> +		 to the constant.  */
> +	      rtx def_src = df_find_single_def_src (rhs);
> +	      if (def_src != nullptr && CONSTANT_P (def_src))
> +		{
> +		  df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
> +		  rtx_insn *def_insn = DF_REF_INSN (def);
> +		  rtx def_body = single_set (def_insn);
> +		  if (rhs == SET_DEST (def_body))
> +		    const_rhs = def_src;
> +		}
> +	    }

This shouldn't be necessary now: we can rely on def_src being correct.

I.e. I think this can just be:

	      rtx def_src = df_find_single_def_src (rhs);
	      if (def_src != nullptr && CONSTANT_P (def_src))
   		const_rhs = def_src;

Thanks,
Richard

>  	}
>      }
>  
> diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
> index 0eafe7d2362..d639336445a 100644
> --- a/gcc/loop-iv.cc
> +++ b/gcc/loop-iv.cc
> @@ -1378,49 +1378,6 @@ simple_rhs_p (rtx rhs)
>      }
>  }
>  
> -/* If REGNO has a single definition, return its known value, otherwise return
> -   null.  */
> -
> -static rtx
> -find_single_def_src (unsigned int regno)
> -{
> -  rtx src = NULL_RTX;
> -
> -  /* Don't look through unbounded number of single definition REG copies,
> -     there might be loops for sources with uninitialized variables.  */
> -  for (int cnt = 0; cnt < 128; cnt++)
> -    {
> -      df_ref adef = DF_REG_DEF_CHAIN (regno);
> -      if (adef == NULL || DF_REF_NEXT_REG (adef) != NULL
> -	  || DF_REF_IS_ARTIFICIAL (adef))
> -	return NULL_RTX;
> -
> -      rtx set = single_set (DF_REF_INSN (adef));
> -      if (set == NULL || !REG_P (SET_DEST (set))
> -	  || REGNO (SET_DEST (set)) != regno)
> -	return NULL_RTX;
> -
> -      rtx note = find_reg_equal_equiv_note (DF_REF_INSN (adef));
> -      if (note && function_invariant_p (XEXP (note, 0)))
> -	{
> -	  src = XEXP (note, 0);
> -	  break;
> -	}
> -      src = SET_SRC (set);
> -
> -      if (REG_P (src))
> -	{
> -	  regno = REGNO (src);
> -	  continue;
> -	}
> -      break;
> -    }
> -  if (!function_invariant_p (src))
> -    return NULL_RTX;
> -
> -  return src;
> -}
> -
>  /* If any registers in *EXPR that have a single definition, try to replace
>     them with the known-equivalent values.  */
>  
> @@ -1433,7 +1390,7 @@ replace_single_def_regs (rtx *expr)
>      {
>        rtx x = *iter;
>        if (REG_P (x))
> -	if (rtx new_x = find_single_def_src (REGNO (x)))
> +	if (rtx new_x = df_find_single_def_src (x))
>  	  {
>  	    *expr = simplify_replace_rtx (*expr, x, new_x);
>  	    goto repeat;
> diff --git a/gcc/testsuite/g++.target/i386/pr105638.C b/gcc/testsuite/g++.target/i386/pr105638.C
> new file mode 100644
> index 00000000000..ff40a459de1
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/i386/pr105638.C
> @@ -0,0 +1,44 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-std=gnu++20 -O2 -march=skylake" } */
> +/* { dg-final { scan-assembler-not "vpxor" } } */
> +
> +#include <stdint.h>
> +#include <vector>
> +#include <tr1/array>
> +
> +class FastBoard {
> +public:
> +    typedef std::pair<int, int> movescore_t;
> +    typedef std::tr1::array<movescore_t, 24> scoredlist_t;
> +    
> +protected:
> +    std::vector<int> m_critical;
> +
> +    int m_boardsize;    
> +};
> +
> +class FastState {
> +public:        
> +    FastBoard board;
> +    
> +    int movenum;              
> +protected:
> +    FastBoard::scoredlist_t scoredmoves;
> +};
> +
> +class KoState : public FastState {
> +private:         
> +    std::vector<uint64_t> ko_hash_history;   
> +    std::vector<uint64_t> hash_history;     
> +};
> +
> +class GameState : public KoState {
> +public:                    
> +    void foo ();      
> +private:
> +    std::vector<KoState> game_history;                          
> +};
> +
> +void GameState::foo() {
> +    game_history.resize(movenum);
> +}

  parent reply	other threads:[~2022-05-30  8:35 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-21  3:01 [PATCH] DSE: Use the constant " H.J. Lu
2022-05-23 10:38 ` Richard Biener
2022-05-23 18:34   ` [PATCH v2] DSE: Use the constant store " H.J. Lu
2022-05-24  6:42     ` Richard Biener
2022-05-24 20:10       ` H.J. Lu
2022-05-25  9:22         ` Richard Biener
2022-05-25  9:30           ` Richard Sandiford
2022-05-25 19:01             ` H.J. Lu
2022-05-25  7:30     ` Richard Sandiford
2022-05-25 18:56       ` H.J. Lu
2022-05-26 15:14         ` Richard Sandiford
2022-05-26 20:43           ` [PATCH v3] " H.J. Lu
2022-05-28 18:37             ` Jeff Law
2022-05-29 21:43               ` H.J. Lu
2022-05-29 22:55                 ` Jeff Law
2022-05-30  8:28                   ` Richard Sandiford
2022-05-30 22:58                     ` Jeff Law
2022-05-30  8:35             ` Richard Sandiford [this message]
2022-05-31 17:12               ` [PATCH v4] " H.J. Lu
2022-06-01  7:20                 ` Richard Sandiford
2022-06-01 21:07                   ` H.J. Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mptv8tnmn8g.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hjl.tools@gmail.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).