From: Jason Merrill <jason@redhat.com>
To: Patrick Palka <ppalka@redhat.com>
Cc: gcc-patches@gcc.gnu.org, Jonathan Wakely <jwakely.gcc@gmail.com>
Subject: Re: [PATCH] c++: fold calls to std::move/forward [PR96780]
Date: Mon, 14 Mar 2022 18:20:30 -0400 [thread overview]
Message-ID: <eba8b0c5-614b-d72c-8c4a-58ad2eeacb93@redhat.com> (raw)
In-Reply-To: <aa1a718c-adad-4a9c-fddc-14946ff6340a@idea>
On 3/14/22 13:13, Patrick Palka wrote:
> On Fri, 11 Mar 2022, Jason Merrill wrote:
>
>> On 3/10/22 11:27, Patrick Palka wrote:
>>> On Wed, 9 Mar 2022, Jason Merrill wrote:
>>>
>>>> On 3/1/22 18:08, Patrick Palka wrote:
>>>>> A well-formed call to std::move/forward is equivalent to a cast, but the
>>>>> former being a function call means it comes with bloated debug info,
>>>>> which
>>>>> persists even after the call has been inlined away, for an operation
>>>>> that
>>>>> is never interesting to debug.
>>>>>
>>>>> This patch addresses this problem in a relatively ad-hoc way by folding
>>>>> calls to std::move/forward into casts as part of the frontend's general
>>>>> expression folding routine. After this patch with -O2 and a
>>>>> non-checking
>>>>> compiler, debug info size for some testcases decreases by about ~10% and
>>>>> overall compile time and memory usage decreases by ~2%.
>>>>
>>>> Impressive. Which testcases?
>>>
>>> I saw the largest percent reductions in debug file object size in
>>> various tests from cmcstl2 and range-v3, e.g.
>>> test/algorithm/set_symmetric_difference4.cpp and .../rotate_copy.cpp
>>> (which are among their biggest tests).
>>>
>>> Significant reductions in debug object file size can be observed in
>>> some libstdc++ testcases too, such as a 5.5% reduction in
>>> std/ranges/adaptor/join.cc
>>>
>>>>
>>>> Do you also want to handle addressof and as_const in this patch, as
>>>> Jonathan
>>>> suggested?
>>>
>>> Yes, good idea. Since each of their argument and return types are
>>> indirect types, I think we can use the same NOP_EXPR-based folding for
>>> them.
>>>
>>>>
>>>> I think we can do this now, and think about generalizing more in stage 1.
>>>>
>>>>> Bootstrapped and regtested on x86_64-pc-linux-gnu, is this something we
>>>>> want to consider for GCC 12?
>>>>>
>>>>> PR c++/96780
>>>>>
>>>>> gcc/cp/ChangeLog:
>>>>>
>>>>> * cp-gimplify.cc (cp_fold) <case CALL_EXPR>: When optimizing,
>>>>> fold calls to std::move/forward into simple casts.
>>>>> * cp-tree.h (is_std_move_p, is_std_forward_p): Declare.
>>>>> * typeck.cc (is_std_move_p, is_std_forward_p): Export.
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>> * g++.dg/opt/pr96780.C: New test.
>>>>> ---
>>>>> gcc/cp/cp-gimplify.cc | 18 ++++++++++++++++++
>>>>> gcc/cp/cp-tree.h | 2 ++
>>>>> gcc/cp/typeck.cc | 6 ++----
>>>>> gcc/testsuite/g++.dg/opt/pr96780.C | 24 ++++++++++++++++++++++++
>>>>> 4 files changed, 46 insertions(+), 4 deletions(-)
>>>>> create mode 100644 gcc/testsuite/g++.dg/opt/pr96780.C
>>>>>
>>>>> diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
>>>>> index d7323fb5c09..0b009b631c7 100644
>>>>> --- a/gcc/cp/cp-gimplify.cc
>>>>> +++ b/gcc/cp/cp-gimplify.cc
>>>>> @@ -2756,6 +2756,24 @@ cp_fold (tree x)
>>>>> case CALL_EXPR:
>>>>> {
>>>>> + if (optimize
>>>>
>>>> I think this should check flag_no_inline rather than optimize.
>>>
>>> Sounds good.
>>>
>>> Here's a patch that extends the folding to as_const and addressof (as
>>> well as __addressof, which I'm kind of unsure about since it's
>>> non-standard). I suppose it also doesn't hurt to verify that the return
>>> and argument type of the function are sane before we commit to folding.
>>>
>>> -- >8 --
>>>
>>> Subject: [PATCH] c++: fold calls to std::move/forward [PR96780]
>>>
>>> A well-formed call to std::move/forward is equivalent to a cast, but the
>>> former being a function call means the compiler generates debug info for
>>> it, which persists even after the call has been inlined away, for an
>>> operation that's never interesting to debug.
>>>
>>> This patch addresses this problem in a relatively ad-hoc way by folding
>>> calls to std::move/forward and other cast-like functions into simple
>>> casts as part of the frontend's general expression folding routine.
>>> After this patch with -O2 and a non-checking compiler, debug info size
>>> for some testcases decreases by about ~10% and overall compile time and
>>> memory usage decreases by ~2%.
>>>
>>> PR c++/96780
>>>
>>> gcc/cp/ChangeLog:
>>>
>>> * cp-gimplify.cc (cp_fold) <case CALL_EXPR>: When optimizing,
>>> fold calls to std::move/forward and other cast-like functions
>>> into simple casts.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * g++.dg/opt/pr96780.C: New test.
>>> ---
>>> gcc/cp/cp-gimplify.cc | 36 +++++++++++++++++++++++++++-
>>> gcc/testsuite/g++.dg/opt/pr96780.C | 38 ++++++++++++++++++++++++++++++
>>> 2 files changed, 73 insertions(+), 1 deletion(-)
>>> create mode 100644 gcc/testsuite/g++.dg/opt/pr96780.C
>>>
>>> diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
>>> index d7323fb5c09..efc4c8f0eb9 100644
>>> --- a/gcc/cp/cp-gimplify.cc
>>> +++ b/gcc/cp/cp-gimplify.cc
>>> @@ -2756,9 +2756,43 @@ cp_fold (tree x)
>>> case CALL_EXPR:
>>> {
>>> - int sv = optimize, nw = sv;
>>> tree callee = get_callee_fndecl (x);
>>> + /* "Inline" calls to std::move/forward and other cast-like functions
>>> + by simply folding them into the corresponding cast determined by
>>> + their return type. This is cheaper than relying on the middle-end
>>> + to do so, and also means we avoid generating useless debug info for
>>> + them at all.
>>> +
>>> + At this point the argument has already been converted into a
>>> + reference, so it suffices to use a NOP_EXPR to express the
>>> + cast. */
>>> + if (!flag_no_inline
>>
>> In our conversation yesterday it occurred to me that we might make this a
>> separate flag that defaults to the value of flag_no_inline; I was thinking of
>> -ffold-simple-inlines. Then Vittorio et al can specify that explicitly at -O0
>> if they'd like.
>
> Makes sense, like so? Bootstrapped and regtested on x86_64-pc-linux-gnu.
>
> The patch defaults -ffold-simple-inlines according to the value of
> flag_no_inline at startup. IIUC this means that if the flag has been
> defaulted to set, then e.g. an optimize("O0") function attribute won't
> disable -ffold-simple-inlines for that function, since we only compute
> its default value once.
>
> I wonder if we therefore instead want to handle defaulting the flag
> when it's used, e.g. check
>
> (flag_fold_simple_inlines == -1
> ? flag_no_inline
> : flag_fold_simple_inlines)
>
> instead of
>
> flag_fold_simple_inlines
>
> in cp_fold?
I guess that makes sense, we can't add front-end options to the
default_options_table. But I think let's use OPTION_SET_P instead of
checking for -1.
> -- >8 --
>
> Subject: [PATCH] c++: fold calls to std::move/forward [PR96780]
>
> A well-formed call to std::move/forward is equivalent to a cast, but the
> former being a function call means the compiler generates debug info for
> it, which persists even after the call has been inlined away, for an
> operation that's never interesting to debug.
>
> This patch addresses this problem by folding calls to std::move/forward
> and other cast-like functions into simple casts as part of the frontend's
> general expression folding routine. This behavior is controlled by a
> new flag -ffold-simple-inlines which defaults to the value of -fno-inline.
>
> After this patch with -O2 and a non-checking compiler, debug info size
> for some testcases (e.g. from range-v3 and cmcstl2) decreases by about
> ~10% and overall compile time and memory usage decreases by ~2%.
Did you compare the reduction after handling more functions?
> PR c++/96780
>
> gcc/c-family/ChangeLog:
>
> * c-opts.cc (c_common_post_options): Handle defaulting of
> flag_fold_simple_inlines.
> * c.opt: Add -ffold-simple-inlines.
>
> gcc/cp/ChangeLog:
>
> * cp-gimplify.cc (cp_fold) <case CALL_EXPR>: Fold calls to
> std::move/forward and other cast-like functions into simple
> casts.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/opt/pr96780.C: New test.
> ---
> gcc/c-family/c-opts.cc | 3 +++
> gcc/c-family/c.opt | 4 ++++
> gcc/cp/cp-gimplify.cc | 35 ++++++++++++++++++++++++++-
> gcc/testsuite/g++.dg/opt/pr96780.C | 38 ++++++++++++++++++++++++++++++
> 4 files changed, 79 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/g++.dg/opt/pr96780.C
>
> diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
> index a341a061758..e8831d16c7b 100644
> --- a/gcc/c-family/c-opts.cc
> +++ b/gcc/c-family/c-opts.cc
> @@ -1058,6 +1058,9 @@ c_common_post_options (const char **pfilename)
> if (flag_implicit_constexpr && cxx_dialect < cxx14)
> flag_implicit_constexpr = false;
>
> + if (flag_fold_simple_inlines == -1)
> + flag_fold_simple_inlines = !flag_no_inline;
> +
> /* Global sized deallocation is new in C++14. */
> if (flag_sized_deallocation == -1)
> flag_sized_deallocation = (cxx_dialect >= cxx14);
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 9cfd2a6bc4e..9a2a597e587 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -1731,6 +1731,10 @@ Support dynamic initialization of thread-local variables in a different translat
> fexternal-templates
> C++ ObjC++ WarnRemoved
>
> +ffold-simple-inlines
> +C++ ObjC++ Optimization Var(flag_fold_simple_inlines) Init(-1)
> +Fold calls to simple inline functions.
> +
> ffor-scope
> C++ ObjC++ WarnRemoved
>
> diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
> index d7323fb5c09..b09fede2d75 100644
> --- a/gcc/cp/cp-gimplify.cc
> +++ b/gcc/cp/cp-gimplify.cc
> @@ -2756,9 +2756,42 @@ cp_fold (tree x)
>
> case CALL_EXPR:
> {
> - int sv = optimize, nw = sv;
> tree callee = get_callee_fndecl (x);
>
> + /* "Inline" calls to std::move/forward and other cast-like functions
> + by simply folding them into a corresponding cast to their return
> + type. This is cheaper than relying on the middle-end to do so, and
> + also means we avoid generating useless debug info for them at all.
> +
> + At this point the argument has already been converted into a
> + reference, so it suffices to use a NOP_EXPR to express the
> + cast. */
> + if (flag_fold_simple_inlines
> + && call_expr_nargs (x) == 1
> + && decl_in_std_namespace_p (callee)
> + && DECL_NAME (callee) != NULL_TREE
> + && (id_equal (DECL_NAME (callee), "move")
> + || id_equal (DECL_NAME (callee), "forward")
> + || id_equal (DECL_NAME (callee), "addressof")
> + /* This addressof equivalent is used heavily in libstdc++. */
> + || id_equal (DECL_NAME (callee), "__addressof")
> + || id_equal (DECL_NAME (callee), "as_const")))
> + {
> + r = CALL_EXPR_ARG (x, 0);
> + /* Check that the return and arguments types are sane before
> + folding. */
> + if (INDIRECT_TYPE_P (TREE_TYPE (x))
> + && INDIRECT_TYPE_P (TREE_TYPE (r)))
> + {
> + if (!same_type_p (TREE_TYPE (x), TREE_TYPE (r)))
> + r = build_nop (TREE_TYPE (x), r);
> + x = cp_fold (r);
> + break;
> + }
> + }
> +
> + int sv = optimize, nw = sv;
> +
> /* Some built-in function calls will be evaluated at compile-time in
> fold (). Set optimize to 1 when folding __builtin_constant_p inside
> a constexpr function so that fold_builtin_1 doesn't fold it to 0. */
> diff --git a/gcc/testsuite/g++.dg/opt/pr96780.C b/gcc/testsuite/g++.dg/opt/pr96780.C
> new file mode 100644
> index 00000000000..61e11855eeb
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/opt/pr96780.C
> @@ -0,0 +1,38 @@
> +// PR c++/96780
> +// Verify calls to std::move/forward are folded away by the frontend.
> +// { dg-do compile { target c++11 } }
> +// { dg-additional-options "-ffold-simple-inlines -fdump-tree-gimple" }
> +
> +#include <utility>
> +
> +struct A;
> +
> +extern A& a;
> +extern const A& ca;
> +
> +void f() {
> + auto&& x1 = std::move(a);
> + auto&& x2 = std::forward<A>(a);
> + auto&& x3 = std::forward<A&>(a);
> +
> + auto&& x4 = std::move(ca);
> + auto&& x5 = std::forward<const A>(ca);
> + auto&& x6 = std::forward<const A&>(ca);
> +
> + auto x7 = std::addressof(a);
> + auto x8 = std::addressof(ca);
> +#if __GLIBCXX__
> + auto x9 = std::__addressof(a);
> + auto x10 = std::__addressof(ca);
> +#endif
> +#if __cpp_lib_as_const
> + auto&& x11 = std::as_const(a);
> + auto&& x12 = std::as_const(ca);
> +#endif
> +}
> +
> +// { dg-final { scan-tree-dump-not "= std::move" "gimple" } }
> +// { dg-final { scan-tree-dump-not "= std::forward" "gimple" } }
> +// { dg-final { scan-tree-dump-not "= std::addressof" "gimple" } }
> +// { dg-final { scan-tree-dump-not "= std::__addressof" "gimple" } }
> +// { dg-final { scan-tree-dump-not "= std::as_const" "gimple" } }
next prev parent reply other threads:[~2022-03-14 22:20 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-01 22:08 Patrick Palka
2022-03-10 4:09 ` Jason Merrill
2022-03-10 15:27 ` Patrick Palka
2022-03-10 15:32 ` Jonathan Wakely
2022-03-12 1:31 ` Jason Merrill
2022-03-14 17:13 ` Patrick Palka
2022-03-14 22:20 ` Jason Merrill [this message]
2022-03-15 14:03 ` Patrick Palka
2022-03-15 15:38 ` Jason Merrill
2022-03-15 17:09 ` Patrick Palka
2022-03-15 21:03 ` Jason Merrill
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=eba8b0c5-614b-d72c-8c4a-58ad2eeacb93@redhat.com \
--to=jason@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jwakely.gcc@gmail.com \
--cc=ppalka@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).