From: "H.J. Lu" <hjl.tools@gmail.com>
To: Gcc-patches <gcc-patches@gcc.gnu.org>,
Richard Biener <richard.guenther@gmail.com>,
richard.sandiford@arm.com
Subject: [PATCH v3] DSE: Use the constant store source if possible
Date: Thu, 26 May 2022 13:43:01 -0700 [thread overview]
Message-ID: <Yo/mVc4I+l9zd+KZ@gmail.com> (raw)
In-Reply-To: <mptsfownx6e.fsf@arm.com>
On Thu, May 26, 2022 at 04:14:17PM +0100, Richard Sandiford wrote:
> "H.J. Lu" <hjl.tools@gmail.com> writes:
> > On Wed, May 25, 2022 at 12:30 AM Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> >>
> >> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
> >> > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote:
> >> >> On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches
> >> >> <gcc-patches@gcc.gnu.org> wrote:
> >> >> >
> >> >> > When recording store for RTL dead store elimination, check if the source
> >> >> > register is set only once to a constant. If yes, record the constant
> >> >> > as the store source. It eliminates unrolled zero stores after memset 0
> >> >> > in a loop where a vector register is used as the zero store source.
> >> >> >
> >> >> > gcc/
> >> >> >
> >> >> > PR rtl-optimization/105638
> >> >> > * dse.cc (record_store): Use the constant source if the source
> >> >> > register is set only once.
> >> >> >
> >> >> > gcc/testsuite/
> >> >> >
> >> >> > PR rtl-optimization/105638
> >> >> > * g++.target/i386/pr105638.C: New test.
> >> >> > ---
> >> >> > gcc/dse.cc | 19 ++++++++++
> >> >> > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++
> >> >> > 2 files changed, 63 insertions(+)
> >> >> > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
> >> >> >
> >> >> > diff --git a/gcc/dse.cc b/gcc/dse.cc
> >> >> > index 30c11cee034..0433dd3d846 100644
> >> >> > --- a/gcc/dse.cc
> >> >> > +++ b/gcc/dse.cc
> >> >> > @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info)
> >> >> >
> >> >> > if (tem && CONSTANT_P (tem))
> >> >> > const_rhs = tem;
> >> >> > + else
> >> >> > + {
> >> >> > + /* If RHS is set only once to a constant, set CONST_RHS
> >> >> > + to the constant. */
> >> >> > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
> >> >> > + if (def != nullptr
> >> >> > + && !DF_REF_IS_ARTIFICIAL (def)
> >> >> > + && !DF_REF_NEXT_REG (def))
> >> >> > + {
> >> >> > + rtx_insn *def_insn = DF_REF_INSN (def);
> >> >> > + rtx def_body = PATTERN (def_insn);
> >> >> > + if (GET_CODE (def_body) == SET)
> >> >> > + {
> >> >> > + rtx def_src = SET_SRC (def_body);
> >> >> > + if (CONSTANT_P (def_src))
> >> >> > + const_rhs = def_src;
> >> >>
> >> >> doesn't DSE have its own tracking of stored values? Shouldn't we
> >> >
> >> > It tracks stored values only within the basic block. When RTL loop
> >> > invariant motion hoists a constant initialization out of the loop into
> >> > a separate basic block, the constant store value becomes unknown
> >> > within the original basic block.
> >> >
> >> >> improve _that_ if it is not enough? I also wonder if you need to
> >> >
> >> > My patch extends DSE stored value tracking to include the constant which
> >> > is set only once in another basic block.
> >> >
> >> >> verify the SET isn't partial?
> >> >>
> >> >
> >> > Here is the v2 patch to check that the constant is set by a non-partial
> >> > unconditional load.
> >> >
> >> > OK for master?
> >> >
> >> > Thanks.
> >> >
> >> > H.J.
> >> > ---
> >> > RTL DSE tracks redundant constant stores within a basic block. When RTL
> >> > loop invariant motion hoists a constant initialization out of the loop
> >> > into a separate basic block, the constant store value becomes unknown
> >> > within the original basic block. When recording store for RTL DSE, check
> >> > if the source register is set only once to a constant by a non-partial
> >> > unconditional load. If yes, record the constant as the constant store
> >> > source. It eliminates unrolled zero stores after memset 0 in a loop
> >> > where a vector register is used as the zero store source.
> >> >
> >> > gcc/
> >> >
> >> > PR rtl-optimization/105638
> >> > * dse.cc (record_store): Use the constant source if the source
> >> > register is set only once.
> >> >
> >> > gcc/testsuite/
> >> >
> >> > PR rtl-optimization/105638
> >> > * g++.target/i386/pr105638.C: New test.
> >> > ---
> >> > gcc/dse.cc | 22 ++++++++++++
> >> > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++
> >> > 2 files changed, 66 insertions(+)
> >> > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
> >> >
> >> > diff --git a/gcc/dse.cc b/gcc/dse.cc
> >> > index 30c11cee034..af8e88dac32 100644
> >> > --- a/gcc/dse.cc
> >> > +++ b/gcc/dse.cc
> >> > @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info)
> >> >
> >> > if (tem && CONSTANT_P (tem))
> >> > const_rhs = tem;
> >> > + else
> >> > + {
> >> > + /* If RHS is set only once to a constant, set CONST_RHS
> >> > + to the constant. */
> >> > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
> >> > + if (def != nullptr
> >> > + && !DF_REF_IS_ARTIFICIAL (def)
> >> > + && !(DF_REF_FLAGS (def)
> >> > + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))
> >> > + && !DF_REF_NEXT_REG (def))
> >>
> >> Can we introduce a helper for this? There are already similar tests
> >> in ira and loop-iv, and it seems a bit too complex to have to open-code
> >> each time.
> >
> > I can use find_single_def_src in loop-iv.cc:
> >
> > /* If REGNO has a single definition, return its known value, otherwise return
> > null. */
> >
> > rtx
> > find_single_def_src (unsigned int regno)
>
> Yeah, reusing that sounds good. Perhaps we should move it into df-core.cc,
> alongside the df_reg_used group of functions.
>
> I think the mode check in your original patch should be kept though,
> so how about we change the parameter to an rtx reg and use rtx_equal in:
>
> rtx set = single_set (DF_REF_INSN (adef));
> if (set == NULL || !REG_P (SET_DEST (set))
> || REGNO (SET_DEST (set)) != regno)
> return NULL_RTX;
Fixed.
> rather than the existing !REG_P and REGNO checks. We should also add:
>
> >
> > and do
> >
> > /* If RHS is set only once to a constant, set CONST_RHS
> > to the constant. */
> > rtx def_src = find_single_def_src (REGNO (rhs));
> > if (def_src != nullptr && CONSTANT_P (def_src))
> > {
> > df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
> > if (!(DF_REF_FLAGS (def)
> > & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
>
> …this check to the function, since it's needed for correctness even
> in the loop-iv.cc use.
Fixed.
>
> Thanks,
> Richard
Here is the v3 patch. OK for master?
Thanks.
H.J.
---
RTL DSE tracks redundant constant stores within a basic block. When RTL
loop invariant motion hoists a constant initialization out of the loop
into a separate basic block, the constant store value becomes unknown
within the original basic block. When recording store for RTL DSE, check
if the source register is set only once to a constant by a non-partial
unconditional load. If yes, record the constant as the constant store
source. It eliminates unrolled zero stores after memset 0 in a loop
where a vector register is used as the zero store source.
Extract find_single_def_src from loop-iv.cc and move it to df-core.cc:
1. Rename to df_find_single_def_src.
2. Change the argument to rtx and use rtx_equal_p.
3. Return null for partial or conditional defs.
gcc/
PR rtl-optimization/105638
* df-core.cc (df_find_single_def_sr): Moved and renamed from
find_single_def_src in loop-iv.cc. Change the argument to rtx
and use rtx_equal_p. Return null for partial or conditional
defs.
* df.h (df_find_single_def_src): New prototype.
* dse.cc (record_store): Use the constant source if the source
register is set only once.
* loop-iv.cc (find_single_def_src): Moved to df-core.cc.
(replace_single_def_regs): Replace find_single_def_src with
df_find_single_def_src.
gcc/testsuite/
PR rtl-optimization/105638
* g++.target/i386/pr105638.C: New test.
---
gcc/df-core.cc | 44 +++++++++++++++++++++++
gcc/df.h | 1 +
gcc/dse.cc | 14 ++++++++
gcc/loop-iv.cc | 45 +-----------------------
gcc/testsuite/g++.target/i386/pr105638.C | 44 +++++++++++++++++++++++
5 files changed, 104 insertions(+), 44 deletions(-)
create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
diff --git a/gcc/df-core.cc b/gcc/df-core.cc
index a901b84878f..f9b4de8eb7a 100644
--- a/gcc/df-core.cc
+++ b/gcc/df-core.cc
@@ -2009,6 +2009,50 @@ df_reg_used (rtx_insn *insn, rtx reg)
return df_find_use (insn, reg) != NULL;
}
+/* If REG has a single definition, return its known value, otherwise return
+ null. */
+
+rtx
+df_find_single_def_src (rtx reg)
+{
+ rtx src = NULL_RTX;
+
+ /* Don't look through unbounded number of single definition REG copies,
+ there might be loops for sources with uninitialized variables. */
+ for (int cnt = 0; cnt < 128; cnt++)
+ {
+ df_ref adef = DF_REG_DEF_CHAIN (REGNO (reg));
+ if (adef == NULL || DF_REF_NEXT_REG (adef) != NULL
+ || DF_REF_IS_ARTIFICIAL (adef)
+ || (DF_REF_FLAGS (adef)
+ & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
+ return NULL_RTX;
+
+ rtx set = single_set (DF_REF_INSN (adef));
+ if (set == NULL || !rtx_equal_p (SET_DEST (set), reg))
+ return NULL_RTX;
+
+ rtx note = find_reg_equal_equiv_note (DF_REF_INSN (adef));
+ if (note && function_invariant_p (XEXP (note, 0)))
+ {
+ src = XEXP (note, 0);
+ break;
+ }
+ src = SET_SRC (set);
+
+ if (REG_P (src))
+ {
+ reg = src;
+ continue;
+ }
+ break;
+ }
+ if (!function_invariant_p (src))
+ return NULL_RTX;
+
+ return src;
+}
+
\f
/*----------------------------------------------------------------------------
Debugging and printing functions.
diff --git a/gcc/df.h b/gcc/df.h
index bd329205d08..71e249ad20a 100644
--- a/gcc/df.h
+++ b/gcc/df.h
@@ -991,6 +991,7 @@ extern df_ref df_find_def (rtx_insn *, rtx);
extern bool df_reg_defined (rtx_insn *, rtx);
extern df_ref df_find_use (rtx_insn *, rtx);
extern bool df_reg_used (rtx_insn *, rtx);
+extern rtx df_find_single_def_src (rtx);
extern void df_worklist_dataflow (struct dataflow *,bitmap, int *, int);
extern void df_print_regset (FILE *file, const_bitmap r);
extern void df_print_word_regset (FILE *file, const_bitmap r);
diff --git a/gcc/dse.cc b/gcc/dse.cc
index 30c11cee034..c915266f025 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -1508,6 +1508,20 @@ record_store (rtx body, bb_info_t bb_info)
if (tem && CONSTANT_P (tem))
const_rhs = tem;
+ else
+ {
+ /* If RHS is set only once to a constant, set CONST_RHS
+ to the constant. */
+ rtx def_src = df_find_single_def_src (rhs);
+ if (def_src != nullptr && CONSTANT_P (def_src))
+ {
+ df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
+ rtx_insn *def_insn = DF_REF_INSN (def);
+ rtx def_body = single_set (def_insn);
+ if (rhs == SET_DEST (def_body))
+ const_rhs = def_src;
+ }
+ }
}
}
diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
index 0eafe7d2362..d639336445a 100644
--- a/gcc/loop-iv.cc
+++ b/gcc/loop-iv.cc
@@ -1378,49 +1378,6 @@ simple_rhs_p (rtx rhs)
}
}
-/* If REGNO has a single definition, return its known value, otherwise return
- null. */
-
-static rtx
-find_single_def_src (unsigned int regno)
-{
- rtx src = NULL_RTX;
-
- /* Don't look through unbounded number of single definition REG copies,
- there might be loops for sources with uninitialized variables. */
- for (int cnt = 0; cnt < 128; cnt++)
- {
- df_ref adef = DF_REG_DEF_CHAIN (regno);
- if (adef == NULL || DF_REF_NEXT_REG (adef) != NULL
- || DF_REF_IS_ARTIFICIAL (adef))
- return NULL_RTX;
-
- rtx set = single_set (DF_REF_INSN (adef));
- if (set == NULL || !REG_P (SET_DEST (set))
- || REGNO (SET_DEST (set)) != regno)
- return NULL_RTX;
-
- rtx note = find_reg_equal_equiv_note (DF_REF_INSN (adef));
- if (note && function_invariant_p (XEXP (note, 0)))
- {
- src = XEXP (note, 0);
- break;
- }
- src = SET_SRC (set);
-
- if (REG_P (src))
- {
- regno = REGNO (src);
- continue;
- }
- break;
- }
- if (!function_invariant_p (src))
- return NULL_RTX;
-
- return src;
-}
-
/* If any registers in *EXPR that have a single definition, try to replace
them with the known-equivalent values. */
@@ -1433,7 +1390,7 @@ replace_single_def_regs (rtx *expr)
{
rtx x = *iter;
if (REG_P (x))
- if (rtx new_x = find_single_def_src (REGNO (x)))
+ if (rtx new_x = df_find_single_def_src (x))
{
*expr = simplify_replace_rtx (*expr, x, new_x);
goto repeat;
diff --git a/gcc/testsuite/g++.target/i386/pr105638.C b/gcc/testsuite/g++.target/i386/pr105638.C
new file mode 100644
index 00000000000..ff40a459de1
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr105638.C
@@ -0,0 +1,44 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-std=gnu++20 -O2 -march=skylake" } */
+/* { dg-final { scan-assembler-not "vpxor" } } */
+
+#include <stdint.h>
+#include <vector>
+#include <tr1/array>
+
+class FastBoard {
+public:
+ typedef std::pair<int, int> movescore_t;
+ typedef std::tr1::array<movescore_t, 24> scoredlist_t;
+
+protected:
+ std::vector<int> m_critical;
+
+ int m_boardsize;
+};
+
+class FastState {
+public:
+ FastBoard board;
+
+ int movenum;
+protected:
+ FastBoard::scoredlist_t scoredmoves;
+};
+
+class KoState : public FastState {
+private:
+ std::vector<uint64_t> ko_hash_history;
+ std::vector<uint64_t> hash_history;
+};
+
+class GameState : public KoState {
+public:
+ void foo ();
+private:
+ std::vector<KoState> game_history;
+};
+
+void GameState::foo() {
+ game_history.resize(movenum);
+}
--
2.36.1
next prev parent reply other threads:[~2022-05-26 20:43 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-05-21 3:01 [PATCH] DSE: Use the constant " H.J. Lu
2022-05-23 10:38 ` Richard Biener
2022-05-23 18:34 ` [PATCH v2] DSE: Use the constant store " H.J. Lu
2022-05-24 6:42 ` Richard Biener
2022-05-24 20:10 ` H.J. Lu
2022-05-25 9:22 ` Richard Biener
2022-05-25 9:30 ` Richard Sandiford
2022-05-25 19:01 ` H.J. Lu
2022-05-25 7:30 ` Richard Sandiford
2022-05-25 18:56 ` H.J. Lu
2022-05-26 15:14 ` Richard Sandiford
2022-05-26 20:43 ` H.J. Lu [this message]
2022-05-28 18:37 ` [PATCH v3] " Jeff Law
2022-05-29 21:43 ` H.J. Lu
2022-05-29 22:55 ` Jeff Law
2022-05-30 8:28 ` Richard Sandiford
2022-05-30 22:58 ` Jeff Law
2022-05-30 8:35 ` Richard Sandiford
2022-05-31 17:12 ` [PATCH v4] " H.J. Lu
2022-06-01 7:20 ` Richard Sandiford
2022-06-01 21:07 ` H.J. Lu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Yo/mVc4I+l9zd+KZ@gmail.com \
--to=hjl.tools@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=richard.guenther@gmail.com \
--cc=richard.sandiford@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).