From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id A4A263858D20 for ; Mon, 14 Mar 2022 17:13:58 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A4A263858D20 Received: from mail-qt1-f199.google.com (mail-qt1-f199.google.com [209.85.160.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-539-Bs2vU3HaOmG9TfdL488d3w-1; Mon, 14 Mar 2022 13:13:55 -0400 X-MC-Unique: Bs2vU3HaOmG9TfdL488d3w-1 Received: by mail-qt1-f199.google.com with SMTP id w11-20020a05622a134b00b002dd15429effso12038841qtk.18 for ; Mon, 14 Mar 2022 10:13:54 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:date:to:cc:subject:in-reply-to:message-id :references:mime-version; bh=+RwGIV7eNwbkkYoNzfMe/+9Dm/Pl6ZlK+Ix1Nn3uWAk=; b=D9DoHYkta13fR6qie8VFcPZJDSzBRCcSihuAIrGpoJHkg7yJUdfHo3sQQ1I8N7gIyz FAwLFBXnonx4rjo3+L/9xebR29ZxtkSb/6QE4G1aPXzQQ8s7GNGUmsfrSw6fdnqkcTNU b93/KGrLUUXMNABfMcidUN6ElDOZfz46aHLvDCLi38OPfmkprOnE9SZHSo6DuPi8h691 t7OL64oB2it7ZPMziXa7L0CQ3lfwcxHCiZ2Ivtvb6rborAmP2nSDLae5AXZXyPaG0bD2 0xTsZ8aB6KXZYCoFLxSsH1TDPP0HmxpOLq7PDmQeQzVvEPXexdvbU4mM6G9DRHYqCP00 NYkw== X-Gm-Message-State: AOAM530Mmk8h5NfyTIzxz8+oiYEJd4n+nn8lcxIm6FAoBVcnge1hxyX3 zwRnuV5bnoarRQZ2ncQLAK9BAcWNpqtpjARUv7X64ad1MhF7N5qKUfFqFeZOIhahrPj1wj71Y2V UMJr1cpmbMbaYQRJ2yw== X-Received: by 2002:a05:6214:29c1:b0:435:1b04:d09 with SMTP id gh1-20020a05621429c100b004351b040d09mr18631575qvb.9.1647278034087; Mon, 14 Mar 2022 10:13:54 -0700 (PDT) X-Google-Smtp-Source: ABdhPJziHKQw8W/lR9L1/hDrqY8Ab0pzs3ZtgLAvHsZeBgN8VBQc4vaUoM+JYh5E0F+lTn/UlHzZOQ== X-Received: by 2002:a05:6214:29c1:b0:435:1b04:d09 with SMTP id gh1-20020a05621429c100b004351b040d09mr18631533qvb.9.1647278033641; Mon, 14 Mar 2022 10:13:53 -0700 (PDT) Received: from [192.168.1.130] (ool-18e40894.dyn.optonline.net. [24.228.8.148]) by smtp.gmail.com with ESMTPSA id h6-20020a379e06000000b0067b30874b90sm8123101qke.41.2022.03.14.10.13.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 14 Mar 2022 10:13:53 -0700 (PDT) From: Patrick Palka X-Google-Original-From: Patrick Palka Date: Mon, 14 Mar 2022 13:13:52 -0400 (EDT) To: Jason Merrill cc: Patrick Palka , gcc-patches@gcc.gnu.org, Jonathan Wakely Subject: Re: [PATCH] c++: fold calls to std::move/forward [PR96780] In-Reply-To: <7b1f33b7-3c59-bf1e-1fa4-7226504a5ae2@redhat.com> Message-ID: References: <20220301220821.1732163-1-ppalka@redhat.com> <1de0b889-66e7-8202-5c5e-526b728a36e8@redhat.com> <6b2257e0-cd0f-4771-7c12-1be4dbf88423@idea> <7b1f33b7-3c59-bf1e-1fa4-7226504a5ae2@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-14.9 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Mar 2022 17:14:01 -0000 On Fri, 11 Mar 2022, Jason Merrill wrote: > On 3/10/22 11:27, Patrick Palka wrote: > > On Wed, 9 Mar 2022, Jason Merrill wrote: > > > > > On 3/1/22 18:08, Patrick Palka wrote: > > > > A well-formed call to std::move/forward is equivalent to a cast, but the > > > > former being a function call means it comes with bloated debug info, > > > > which > > > > persists even after the call has been inlined away, for an operation > > > > that > > > > is never interesting to debug. > > > > > > > > This patch addresses this problem in a relatively ad-hoc way by folding > > > > calls to std::move/forward into casts as part of the frontend's general > > > > expression folding routine. After this patch with -O2 and a > > > > non-checking > > > > compiler, debug info size for some testcases decreases by about ~10% and > > > > overall compile time and memory usage decreases by ~2%. > > > > > > Impressive. Which testcases? > > > > I saw the largest percent reductions in debug file object size in > > various tests from cmcstl2 and range-v3, e.g. > > test/algorithm/set_symmetric_difference4.cpp and .../rotate_copy.cpp > > (which are among their biggest tests). > > > > Significant reductions in debug object file size can be observed in > > some libstdc++ testcases too, such as a 5.5% reduction in > > std/ranges/adaptor/join.cc > > > > > > > > Do you also want to handle addressof and as_const in this patch, as > > > Jonathan > > > suggested? > > > > Yes, good idea. Since each of their argument and return types are > > indirect types, I think we can use the same NOP_EXPR-based folding for > > them. > > > > > > > > I think we can do this now, and think about generalizing more in stage 1. > > > > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, is this something we > > > > want to consider for GCC 12? > > > > > > > > PR c++/96780 > > > > > > > > gcc/cp/ChangeLog: > > > > > > > > * cp-gimplify.cc (cp_fold) : When optimizing, > > > > fold calls to std::move/forward into simple casts. > > > > * cp-tree.h (is_std_move_p, is_std_forward_p): Declare. > > > > * typeck.cc (is_std_move_p, is_std_forward_p): Export. > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > * g++.dg/opt/pr96780.C: New test. > > > > --- > > > > gcc/cp/cp-gimplify.cc | 18 ++++++++++++++++++ > > > > gcc/cp/cp-tree.h | 2 ++ > > > > gcc/cp/typeck.cc | 6 ++---- > > > > gcc/testsuite/g++.dg/opt/pr96780.C | 24 ++++++++++++++++++++++++ > > > > 4 files changed, 46 insertions(+), 4 deletions(-) > > > > create mode 100644 gcc/testsuite/g++.dg/opt/pr96780.C > > > > > > > > diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc > > > > index d7323fb5c09..0b009b631c7 100644 > > > > --- a/gcc/cp/cp-gimplify.cc > > > > +++ b/gcc/cp/cp-gimplify.cc > > > > @@ -2756,6 +2756,24 @@ cp_fold (tree x) > > > > case CALL_EXPR: > > > > { > > > > + if (optimize > > > > > > I think this should check flag_no_inline rather than optimize. > > > > Sounds good. > > > > Here's a patch that extends the folding to as_const and addressof (as > > well as __addressof, which I'm kind of unsure about since it's > > non-standard). I suppose it also doesn't hurt to verify that the return > > and argument type of the function are sane before we commit to folding. > > > > -- >8 -- > > > > Subject: [PATCH] c++: fold calls to std::move/forward [PR96780] > > > > A well-formed call to std::move/forward is equivalent to a cast, but the > > former being a function call means the compiler generates debug info for > > it, which persists even after the call has been inlined away, for an > > operation that's never interesting to debug. > > > > This patch addresses this problem in a relatively ad-hoc way by folding > > calls to std::move/forward and other cast-like functions into simple > > casts as part of the frontend's general expression folding routine. > > After this patch with -O2 and a non-checking compiler, debug info size > > for some testcases decreases by about ~10% and overall compile time and > > memory usage decreases by ~2%. > > > > PR c++/96780 > > > > gcc/cp/ChangeLog: > > > > * cp-gimplify.cc (cp_fold) : When optimizing, > > fold calls to std::move/forward and other cast-like functions > > into simple casts. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/opt/pr96780.C: New test. > > --- > > gcc/cp/cp-gimplify.cc | 36 +++++++++++++++++++++++++++- > > gcc/testsuite/g++.dg/opt/pr96780.C | 38 ++++++++++++++++++++++++++++++ > > 2 files changed, 73 insertions(+), 1 deletion(-) > > create mode 100644 gcc/testsuite/g++.dg/opt/pr96780.C > > > > diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc > > index d7323fb5c09..efc4c8f0eb9 100644 > > --- a/gcc/cp/cp-gimplify.cc > > +++ b/gcc/cp/cp-gimplify.cc > > @@ -2756,9 +2756,43 @@ cp_fold (tree x) > > case CALL_EXPR: > > { > > - int sv = optimize, nw = sv; > > tree callee = get_callee_fndecl (x); > > + /* "Inline" calls to std::move/forward and other cast-like functions > > + by simply folding them into the corresponding cast determined by > > + their return type. This is cheaper than relying on the middle-end > > + to do so, and also means we avoid generating useless debug info for > > + them at all. > > + > > + At this point the argument has already been converted into a > > + reference, so it suffices to use a NOP_EXPR to express the > > + cast. */ > > + if (!flag_no_inline > > In our conversation yesterday it occurred to me that we might make this a > separate flag that defaults to the value of flag_no_inline; I was thinking of > -ffold-simple-inlines. Then Vittorio et al can specify that explicitly at -O0 > if they'd like. Makes sense, like so? Bootstrapped and regtested on x86_64-pc-linux-gnu. The patch defaults -ffold-simple-inlines according to the value of flag_no_inline at startup. IIUC this means that if the flag has been defaulted to set, then e.g. an optimize("O0") function attribute won't disable -ffold-simple-inlines for that function, since we only compute its default value once. I wonder if we therefore instead want to handle defaulting the flag when it's used, e.g. check (flag_fold_simple_inlines == -1 ? flag_no_inline : flag_fold_simple_inlines) instead of flag_fold_simple_inlines in cp_fold? -- >8 -- Subject: [PATCH] c++: fold calls to std::move/forward [PR96780] A well-formed call to std::move/forward is equivalent to a cast, but the former being a function call means the compiler generates debug info for it, which persists even after the call has been inlined away, for an operation that's never interesting to debug. This patch addresses this problem by folding calls to std::move/forward and other cast-like functions into simple casts as part of the frontend's general expression folding routine. This behavior is controlled by a new flag -ffold-simple-inlines which defaults to the value of -fno-inline. After this patch with -O2 and a non-checking compiler, debug info size for some testcases (e.g. from range-v3 and cmcstl2) decreases by about ~10% and overall compile time and memory usage decreases by ~2%. PR c++/96780 gcc/c-family/ChangeLog: * c-opts.cc (c_common_post_options): Handle defaulting of flag_fold_simple_inlines. * c.opt: Add -ffold-simple-inlines. gcc/cp/ChangeLog: * cp-gimplify.cc (cp_fold) : Fold calls to std::move/forward and other cast-like functions into simple casts. gcc/testsuite/ChangeLog: * g++.dg/opt/pr96780.C: New test. --- gcc/c-family/c-opts.cc | 3 +++ gcc/c-family/c.opt | 4 ++++ gcc/cp/cp-gimplify.cc | 35 ++++++++++++++++++++++++++- gcc/testsuite/g++.dg/opt/pr96780.C | 38 ++++++++++++++++++++++++++++++ 4 files changed, 79 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/opt/pr96780.C diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc index a341a061758..e8831d16c7b 100644 --- a/gcc/c-family/c-opts.cc +++ b/gcc/c-family/c-opts.cc @@ -1058,6 +1058,9 @@ c_common_post_options (const char **pfilename) if (flag_implicit_constexpr && cxx_dialect < cxx14) flag_implicit_constexpr = false; + if (flag_fold_simple_inlines == -1) + flag_fold_simple_inlines = !flag_no_inline; + /* Global sized deallocation is new in C++14. */ if (flag_sized_deallocation == -1) flag_sized_deallocation = (cxx_dialect >= cxx14); diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 9cfd2a6bc4e..9a2a597e587 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -1731,6 +1731,10 @@ Support dynamic initialization of thread-local variables in a different translat fexternal-templates C++ ObjC++ WarnRemoved +ffold-simple-inlines +C++ ObjC++ Optimization Var(flag_fold_simple_inlines) Init(-1) +Fold calls to simple inline functions. + ffor-scope C++ ObjC++ WarnRemoved diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc index d7323fb5c09..b09fede2d75 100644 --- a/gcc/cp/cp-gimplify.cc +++ b/gcc/cp/cp-gimplify.cc @@ -2756,9 +2756,42 @@ cp_fold (tree x) case CALL_EXPR: { - int sv = optimize, nw = sv; tree callee = get_callee_fndecl (x); + /* "Inline" calls to std::move/forward and other cast-like functions + by simply folding them into a corresponding cast to their return + type. This is cheaper than relying on the middle-end to do so, and + also means we avoid generating useless debug info for them at all. + + At this point the argument has already been converted into a + reference, so it suffices to use a NOP_EXPR to express the + cast. */ + if (flag_fold_simple_inlines + && call_expr_nargs (x) == 1 + && decl_in_std_namespace_p (callee) + && DECL_NAME (callee) != NULL_TREE + && (id_equal (DECL_NAME (callee), "move") + || id_equal (DECL_NAME (callee), "forward") + || id_equal (DECL_NAME (callee), "addressof") + /* This addressof equivalent is used heavily in libstdc++. */ + || id_equal (DECL_NAME (callee), "__addressof") + || id_equal (DECL_NAME (callee), "as_const"))) + { + r = CALL_EXPR_ARG (x, 0); + /* Check that the return and arguments types are sane before + folding. */ + if (INDIRECT_TYPE_P (TREE_TYPE (x)) + && INDIRECT_TYPE_P (TREE_TYPE (r))) + { + if (!same_type_p (TREE_TYPE (x), TREE_TYPE (r))) + r = build_nop (TREE_TYPE (x), r); + x = cp_fold (r); + break; + } + } + + int sv = optimize, nw = sv; + /* Some built-in function calls will be evaluated at compile-time in fold (). Set optimize to 1 when folding __builtin_constant_p inside a constexpr function so that fold_builtin_1 doesn't fold it to 0. */ diff --git a/gcc/testsuite/g++.dg/opt/pr96780.C b/gcc/testsuite/g++.dg/opt/pr96780.C new file mode 100644 index 00000000000..61e11855eeb --- /dev/null +++ b/gcc/testsuite/g++.dg/opt/pr96780.C @@ -0,0 +1,38 @@ +// PR c++/96780 +// Verify calls to std::move/forward are folded away by the frontend. +// { dg-do compile { target c++11 } } +// { dg-additional-options "-ffold-simple-inlines -fdump-tree-gimple" } + +#include + +struct A; + +extern A& a; +extern const A& ca; + +void f() { + auto&& x1 = std::move(a); + auto&& x2 = std::forward(a); + auto&& x3 = std::forward(a); + + auto&& x4 = std::move(ca); + auto&& x5 = std::forward(ca); + auto&& x6 = std::forward(ca); + + auto x7 = std::addressof(a); + auto x8 = std::addressof(ca); +#if __GLIBCXX__ + auto x9 = std::__addressof(a); + auto x10 = std::__addressof(ca); +#endif +#if __cpp_lib_as_const + auto&& x11 = std::as_const(a); + auto&& x12 = std::as_const(ca); +#endif +} + +// { dg-final { scan-tree-dump-not "= std::move" "gimple" } } +// { dg-final { scan-tree-dump-not "= std::forward" "gimple" } } +// { dg-final { scan-tree-dump-not "= std::addressof" "gimple" } } +// { dg-final { scan-tree-dump-not "= std::__addressof" "gimple" } } +// { dg-final { scan-tree-dump-not "= std::as_const" "gimple" } } -- 2.35.1.455.g1a4874565f