public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Jan Hubicka <hubicka@ucw.cz>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: Optimize std::max early
Date: Mon, 19 Jun 2023 10:40:34 +0200	[thread overview]
Message-ID: <CAFiYyc0ZYCw6_Uy6HURR_szodvVPnHH3CS_qCTdnyUg1t31JbA@mail.gmail.com> (raw)
In-Reply-To: <ZI8owVx4aDXxwB8Y@kam.mff.cuni.cz>

On Sun, Jun 18, 2023 at 5:55 PM Jan Hubicka via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
> we currently produce very bad code on loops using std::vector as a stack, since
> we fail to inline push_back which in turn prevents SRA and we fail to optimize
> out some store-to-load pairs (PR109849).
>
> I looked into why this function is not inlined and it is inlined by clang.  We
> currently estimate it to 66 instructions and inline limits are 15 at -O2 and 30
> at -O3.  Clang has similar estimate, but still decides to inline at -O2.
>
> I looked into reason why the body is so large and one problem I spotted is the
> way std::max is implemented by taking and returning reference to the values.
>
>   const T& max( const T& a, const T& b );
>
> This makes it necessary to store the values to memory and load them later
> (Max is used by code computing new size of vector on resize.)
> Two stores, conditional and load accounts as 8 instructions, while
> MAX_EXPR as 1 and has a lot better chance to fold with the surrounding
> code.
>
> We optimize this to MAX_EXPR, but only during late optimizations.  I think this
> is a common enough coding pattern and we ought to make this transparent to
> early opts and IPA.  The following is easist fix that simply adds phiprop pass
> that turns the PHI of address values into PHI of values so later FRE can
> propagate values across memory, phiopt discover the MAX_EXPR pattern and DSE
> remove the memory stores.
>
> Bootstrapped/regtested x86_64-linux, does this look resonable thing to do?

OK.

Thanks,
Richard.

> Looking into how expensive the pass is, I think it is very cheap, except
> that it computes postdominator and updates ssa even if no patterns
> are matched.  I will send patch to avoid that.
>
> gcc/ChangeLog:
>
>         PR tree-optimization/109811
>         PR tree-optimization/109849
>         * passes.def: Add phiprop to early optimization passes.
>         * tree-ssa-phiprop.cc: Allow clonning.
>
> gcc/testsuite/ChangeLog:
>
>         PR tree-optimization/109811
>         PR tree-optimization/109849
>         * gcc.dg/tree-ssa/phiprop-1.c: New test.
>
> diff --git a/gcc/passes.def b/gcc/passes.def
> index c9a8f19747b..faa5208b26b 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -88,6 +88,8 @@ along with GCC; see the file COPYING3.  If not see
>           /* pass_build_ealias is a dummy pass that ensures that we
>              execute TODO_rebuild_alias at this point.  */
>           NEXT_PASS (pass_build_ealias);
> +         /* Do phiprop before FRE so we optimize std::min and std::max well.  */
> +         NEXT_PASS (pass_phiprop);
>           NEXT_PASS (pass_fre, true /* may_iterate */);
>           NEXT_PASS (pass_early_vrp);
>           NEXT_PASS (pass_merge_phi);
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phiprop-1.c b/gcc/testsuite/gcc.dg/tree-ssa/phiprop-1.c
> new file mode 100644
> index 00000000000..9f52c2a7298
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phiprop-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -fdump-tree-phiprop1-details -fdump-tree-release_ssa" } */
> +int max(int a, int b)
> +{
> +        int *ptr;
> +        if (a > b)
> +          ptr = &a;
> +        else
> +          ptr = &b;
> +        return *ptr;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Inserting PHI for result of load" 1 "phiprop1"} } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "release_ssa"} } */
> diff --git a/gcc/tree-ssa-phiprop.cc b/gcc/tree-ssa-phiprop.cc
> index 3cb4900b6be..5dc505df420 100644
> --- a/gcc/tree-ssa-phiprop.cc
> +++ b/gcc/tree-ssa-phiprop.cc
> @@ -476,6 +476,7 @@ public:
>    {}
>
>    /* opt_pass methods: */
> +  opt_pass * clone () final override { return new pass_phiprop (m_ctxt); }
>    bool gate (function *) final override { return flag_tree_phiprop; }
>    unsigned int execute (function *) final override;
>

      reply	other threads:[~2023-06-19  8:43 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-18 15:54 Jan Hubicka
2023-06-19  8:40 ` Richard Biener [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFiYyc0ZYCw6_Uy6HURR_szodvVPnHH3CS_qCTdnyUg1t31JbA@mail.gmail.com \
    --to=richard.guenther@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hubicka@ucw.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).