From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 19DF4385842C for ; Mon, 14 Mar 2022 22:20:35 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 19DF4385842C Received: from mail-qv1-f70.google.com (mail-qv1-f70.google.com [209.85.219.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-436-0dghDU7QNG21u7y7G8oAzQ-1; Mon, 14 Mar 2022 18:20:33 -0400 X-MC-Unique: 0dghDU7QNG21u7y7G8oAzQ-1 Received: by mail-qv1-f70.google.com with SMTP id g2-20020a0562141cc200b004123b0abe18so15094455qvd.2 for ; Mon, 14 Mar 2022 15:20:33 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:cc:references:from:in-reply-to :content-transfer-encoding; bh=YTu/GaPz7gPrlDgxwVyU0tfzSl/0hhFEyt2AyaOEyCw=; b=dq0RHKckKdqNOyGyZ9qAMEhg6JfEsVxL9s5Rn6yrK4ZAsGwTPSHUfKJ5cuiY6Mtonr xh0DjqxuqIOZY0SCqdAxAqo7VOdqgzUzCduqSb5ShHsem04NolRIXjfYnknfmgBvsQzB Kw/A97FXkeaOuo7aIpY9cUw7d97dvw1nsl6iNqq1lRSONnZ2UMVPY43o9L6KhzRaIPMl 0KSxzdt+6yU0ZWWTiCfNAhpgGl90mmV4BGj9e5cXVFq21v7Zk67fNWysXB7Rf3S/JDxX fil+xKsMi7o4ZRwvlt2s1I8xkneL2YIPk8TbjtEc49vhkGzRwfc9FwtnXco96So2e++L Imkg== X-Gm-Message-State: AOAM532rTbzTEsU0VKwyWge4/PNUyPeD92pFavCYZ+h5bKWJKCG9Tq7G F9nKVpUmLINGSUuyL7uN9+KSRKV0m8+PsxK/kzBasOT+kqCFOM3LDnKqFhJmfYYpo7LzggurHq0 4xb0vtSFW6/OGMtLr/Q== X-Received: by 2002:a37:c402:0:b0:67b:2ddd:f63d with SMTP id d2-20020a37c402000000b0067b2dddf63dmr16264278qki.398.1647296432062; Mon, 14 Mar 2022 15:20:32 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyCWx0AfXaYM7iXGDN4d1uTeMorfn+oKpeUDIRFVNJWaKH/f6Bl+f3xOmX2wdvON7yZPw5nUg== X-Received: by 2002:a37:c402:0:b0:67b:2ddd:f63d with SMTP id d2-20020a37c402000000b0067b2dddf63dmr16264258qki.398.1647296431583; Mon, 14 Mar 2022 15:20:31 -0700 (PDT) Received: from [192.168.1.149] (130-44-159-43.s15913.c3-0.arl-cbr1.sbo-arl.ma.cable.rcncustomer.com. [130.44.159.43]) by smtp.gmail.com with ESMTPSA id b1-20020a05622a020100b002de94100619sm12163222qtx.81.2022.03.14.15.20.30 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 14 Mar 2022 15:20:30 -0700 (PDT) Message-ID: Date: Mon, 14 Mar 2022 18:20:30 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0 Subject: Re: [PATCH] c++: fold calls to std::move/forward [PR96780] To: Patrick Palka Cc: gcc-patches@gcc.gnu.org, Jonathan Wakely References: <20220301220821.1732163-1-ppalka@redhat.com> <1de0b889-66e7-8202-5c5e-526b728a36e8@redhat.com> <6b2257e0-cd0f-4771-7c12-1be4dbf88423@idea> <7b1f33b7-3c59-bf1e-1fa4-7226504a5ae2@redhat.com> From: Jason Merrill In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-13.3 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Mar 2022 22:20:37 -0000 On 3/14/22 13:13, Patrick Palka wrote: > On Fri, 11 Mar 2022, Jason Merrill wrote: > >> On 3/10/22 11:27, Patrick Palka wrote: >>> On Wed, 9 Mar 2022, Jason Merrill wrote: >>> >>>> On 3/1/22 18:08, Patrick Palka wrote: >>>>> A well-formed call to std::move/forward is equivalent to a cast, but the >>>>> former being a function call means it comes with bloated debug info, >>>>> which >>>>> persists even after the call has been inlined away, for an operation >>>>> that >>>>> is never interesting to debug. >>>>> >>>>> This patch addresses this problem in a relatively ad-hoc way by folding >>>>> calls to std::move/forward into casts as part of the frontend's general >>>>> expression folding routine. After this patch with -O2 and a >>>>> non-checking >>>>> compiler, debug info size for some testcases decreases by about ~10% and >>>>> overall compile time and memory usage decreases by ~2%. >>>> >>>> Impressive. Which testcases? >>> >>> I saw the largest percent reductions in debug file object size in >>> various tests from cmcstl2 and range-v3, e.g. >>> test/algorithm/set_symmetric_difference4.cpp and .../rotate_copy.cpp >>> (which are among their biggest tests). >>> >>> Significant reductions in debug object file size can be observed in >>> some libstdc++ testcases too, such as a 5.5% reduction in >>> std/ranges/adaptor/join.cc >>> >>>> >>>> Do you also want to handle addressof and as_const in this patch, as >>>> Jonathan >>>> suggested? >>> >>> Yes, good idea. Since each of their argument and return types are >>> indirect types, I think we can use the same NOP_EXPR-based folding for >>> them. >>> >>>> >>>> I think we can do this now, and think about generalizing more in stage 1. >>>> >>>>> Bootstrapped and regtested on x86_64-pc-linux-gnu, is this something we >>>>> want to consider for GCC 12? >>>>> >>>>> PR c++/96780 >>>>> >>>>> gcc/cp/ChangeLog: >>>>> >>>>> * cp-gimplify.cc (cp_fold) : When optimizing, >>>>> fold calls to std::move/forward into simple casts. >>>>> * cp-tree.h (is_std_move_p, is_std_forward_p): Declare. >>>>> * typeck.cc (is_std_move_p, is_std_forward_p): Export. >>>>> >>>>> gcc/testsuite/ChangeLog: >>>>> >>>>> * g++.dg/opt/pr96780.C: New test. >>>>> --- >>>>> gcc/cp/cp-gimplify.cc | 18 ++++++++++++++++++ >>>>> gcc/cp/cp-tree.h | 2 ++ >>>>> gcc/cp/typeck.cc | 6 ++---- >>>>> gcc/testsuite/g++.dg/opt/pr96780.C | 24 ++++++++++++++++++++++++ >>>>> 4 files changed, 46 insertions(+), 4 deletions(-) >>>>> create mode 100644 gcc/testsuite/g++.dg/opt/pr96780.C >>>>> >>>>> diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc >>>>> index d7323fb5c09..0b009b631c7 100644 >>>>> --- a/gcc/cp/cp-gimplify.cc >>>>> +++ b/gcc/cp/cp-gimplify.cc >>>>> @@ -2756,6 +2756,24 @@ cp_fold (tree x) >>>>> case CALL_EXPR: >>>>> { >>>>> + if (optimize >>>> >>>> I think this should check flag_no_inline rather than optimize. >>> >>> Sounds good. >>> >>> Here's a patch that extends the folding to as_const and addressof (as >>> well as __addressof, which I'm kind of unsure about since it's >>> non-standard). I suppose it also doesn't hurt to verify that the return >>> and argument type of the function are sane before we commit to folding. >>> >>> -- >8 -- >>> >>> Subject: [PATCH] c++: fold calls to std::move/forward [PR96780] >>> >>> A well-formed call to std::move/forward is equivalent to a cast, but the >>> former being a function call means the compiler generates debug info for >>> it, which persists even after the call has been inlined away, for an >>> operation that's never interesting to debug. >>> >>> This patch addresses this problem in a relatively ad-hoc way by folding >>> calls to std::move/forward and other cast-like functions into simple >>> casts as part of the frontend's general expression folding routine. >>> After this patch with -O2 and a non-checking compiler, debug info size >>> for some testcases decreases by about ~10% and overall compile time and >>> memory usage decreases by ~2%. >>> >>> PR c++/96780 >>> >>> gcc/cp/ChangeLog: >>> >>> * cp-gimplify.cc (cp_fold) : When optimizing, >>> fold calls to std::move/forward and other cast-like functions >>> into simple casts. >>> >>> gcc/testsuite/ChangeLog: >>> >>> * g++.dg/opt/pr96780.C: New test. >>> --- >>> gcc/cp/cp-gimplify.cc | 36 +++++++++++++++++++++++++++- >>> gcc/testsuite/g++.dg/opt/pr96780.C | 38 ++++++++++++++++++++++++++++++ >>> 2 files changed, 73 insertions(+), 1 deletion(-) >>> create mode 100644 gcc/testsuite/g++.dg/opt/pr96780.C >>> >>> diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc >>> index d7323fb5c09..efc4c8f0eb9 100644 >>> --- a/gcc/cp/cp-gimplify.cc >>> +++ b/gcc/cp/cp-gimplify.cc >>> @@ -2756,9 +2756,43 @@ cp_fold (tree x) >>> case CALL_EXPR: >>> { >>> - int sv = optimize, nw = sv; >>> tree callee = get_callee_fndecl (x); >>> + /* "Inline" calls to std::move/forward and other cast-like functions >>> + by simply folding them into the corresponding cast determined by >>> + their return type. This is cheaper than relying on the middle-end >>> + to do so, and also means we avoid generating useless debug info for >>> + them at all. >>> + >>> + At this point the argument has already been converted into a >>> + reference, so it suffices to use a NOP_EXPR to express the >>> + cast. */ >>> + if (!flag_no_inline >> >> In our conversation yesterday it occurred to me that we might make this a >> separate flag that defaults to the value of flag_no_inline; I was thinking of >> -ffold-simple-inlines. Then Vittorio et al can specify that explicitly at -O0 >> if they'd like. > > Makes sense, like so? Bootstrapped and regtested on x86_64-pc-linux-gnu. > > The patch defaults -ffold-simple-inlines according to the value of > flag_no_inline at startup. IIUC this means that if the flag has been > defaulted to set, then e.g. an optimize("O0") function attribute won't > disable -ffold-simple-inlines for that function, since we only compute > its default value once. > > I wonder if we therefore instead want to handle defaulting the flag > when it's used, e.g. check > > (flag_fold_simple_inlines == -1 > ? flag_no_inline > : flag_fold_simple_inlines) > > instead of > > flag_fold_simple_inlines > > in cp_fold? I guess that makes sense, we can't add front-end options to the default_options_table. But I think let's use OPTION_SET_P instead of checking for -1. > -- >8 -- > > Subject: [PATCH] c++: fold calls to std::move/forward [PR96780] > > A well-formed call to std::move/forward is equivalent to a cast, but the > former being a function call means the compiler generates debug info for > it, which persists even after the call has been inlined away, for an > operation that's never interesting to debug. > > This patch addresses this problem by folding calls to std::move/forward > and other cast-like functions into simple casts as part of the frontend's > general expression folding routine. This behavior is controlled by a > new flag -ffold-simple-inlines which defaults to the value of -fno-inline. > > After this patch with -O2 and a non-checking compiler, debug info size > for some testcases (e.g. from range-v3 and cmcstl2) decreases by about > ~10% and overall compile time and memory usage decreases by ~2%. Did you compare the reduction after handling more functions? > PR c++/96780 > > gcc/c-family/ChangeLog: > > * c-opts.cc (c_common_post_options): Handle defaulting of > flag_fold_simple_inlines. > * c.opt: Add -ffold-simple-inlines. > > gcc/cp/ChangeLog: > > * cp-gimplify.cc (cp_fold) : Fold calls to > std::move/forward and other cast-like functions into simple > casts. > > gcc/testsuite/ChangeLog: > > * g++.dg/opt/pr96780.C: New test. > --- > gcc/c-family/c-opts.cc | 3 +++ > gcc/c-family/c.opt | 4 ++++ > gcc/cp/cp-gimplify.cc | 35 ++++++++++++++++++++++++++- > gcc/testsuite/g++.dg/opt/pr96780.C | 38 ++++++++++++++++++++++++++++++ > 4 files changed, 79 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/g++.dg/opt/pr96780.C > > diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc > index a341a061758..e8831d16c7b 100644 > --- a/gcc/c-family/c-opts.cc > +++ b/gcc/c-family/c-opts.cc > @@ -1058,6 +1058,9 @@ c_common_post_options (const char **pfilename) > if (flag_implicit_constexpr && cxx_dialect < cxx14) > flag_implicit_constexpr = false; > > + if (flag_fold_simple_inlines == -1) > + flag_fold_simple_inlines = !flag_no_inline; > + > /* Global sized deallocation is new in C++14. */ > if (flag_sized_deallocation == -1) > flag_sized_deallocation = (cxx_dialect >= cxx14); > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt > index 9cfd2a6bc4e..9a2a597e587 100644 > --- a/gcc/c-family/c.opt > +++ b/gcc/c-family/c.opt > @@ -1731,6 +1731,10 @@ Support dynamic initialization of thread-local variables in a different translat > fexternal-templates > C++ ObjC++ WarnRemoved > > +ffold-simple-inlines > +C++ ObjC++ Optimization Var(flag_fold_simple_inlines) Init(-1) > +Fold calls to simple inline functions. > + > ffor-scope > C++ ObjC++ WarnRemoved > > diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc > index d7323fb5c09..b09fede2d75 100644 > --- a/gcc/cp/cp-gimplify.cc > +++ b/gcc/cp/cp-gimplify.cc > @@ -2756,9 +2756,42 @@ cp_fold (tree x) > > case CALL_EXPR: > { > - int sv = optimize, nw = sv; > tree callee = get_callee_fndecl (x); > > + /* "Inline" calls to std::move/forward and other cast-like functions > + by simply folding them into a corresponding cast to their return > + type. This is cheaper than relying on the middle-end to do so, and > + also means we avoid generating useless debug info for them at all. > + > + At this point the argument has already been converted into a > + reference, so it suffices to use a NOP_EXPR to express the > + cast. */ > + if (flag_fold_simple_inlines > + && call_expr_nargs (x) == 1 > + && decl_in_std_namespace_p (callee) > + && DECL_NAME (callee) != NULL_TREE > + && (id_equal (DECL_NAME (callee), "move") > + || id_equal (DECL_NAME (callee), "forward") > + || id_equal (DECL_NAME (callee), "addressof") > + /* This addressof equivalent is used heavily in libstdc++. */ > + || id_equal (DECL_NAME (callee), "__addressof") > + || id_equal (DECL_NAME (callee), "as_const"))) > + { > + r = CALL_EXPR_ARG (x, 0); > + /* Check that the return and arguments types are sane before > + folding. */ > + if (INDIRECT_TYPE_P (TREE_TYPE (x)) > + && INDIRECT_TYPE_P (TREE_TYPE (r))) > + { > + if (!same_type_p (TREE_TYPE (x), TREE_TYPE (r))) > + r = build_nop (TREE_TYPE (x), r); > + x = cp_fold (r); > + break; > + } > + } > + > + int sv = optimize, nw = sv; > + > /* Some built-in function calls will be evaluated at compile-time in > fold (). Set optimize to 1 when folding __builtin_constant_p inside > a constexpr function so that fold_builtin_1 doesn't fold it to 0. */ > diff --git a/gcc/testsuite/g++.dg/opt/pr96780.C b/gcc/testsuite/g++.dg/opt/pr96780.C > new file mode 100644 > index 00000000000..61e11855eeb > --- /dev/null > +++ b/gcc/testsuite/g++.dg/opt/pr96780.C > @@ -0,0 +1,38 @@ > +// PR c++/96780 > +// Verify calls to std::move/forward are folded away by the frontend. > +// { dg-do compile { target c++11 } } > +// { dg-additional-options "-ffold-simple-inlines -fdump-tree-gimple" } > + > +#include > + > +struct A; > + > +extern A& a; > +extern const A& ca; > + > +void f() { > + auto&& x1 = std::move(a); > + auto&& x2 = std::forward(a); > + auto&& x3 = std::forward(a); > + > + auto&& x4 = std::move(ca); > + auto&& x5 = std::forward(ca); > + auto&& x6 = std::forward(ca); > + > + auto x7 = std::addressof(a); > + auto x8 = std::addressof(ca); > +#if __GLIBCXX__ > + auto x9 = std::__addressof(a); > + auto x10 = std::__addressof(ca); > +#endif > +#if __cpp_lib_as_const > + auto&& x11 = std::as_const(a); > + auto&& x12 = std::as_const(ca); > +#endif > +} > + > +// { dg-final { scan-tree-dump-not "= std::move" "gimple" } } > +// { dg-final { scan-tree-dump-not "= std::forward" "gimple" } } > +// { dg-final { scan-tree-dump-not "= std::addressof" "gimple" } } > +// { dg-final { scan-tree-dump-not "= std::__addressof" "gimple" } } > +// { dg-final { scan-tree-dump-not "= std::as_const" "gimple" } }