From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from nikam.ms.mff.cuni.cz (nikam.ms.mff.cuni.cz [195.113.20.16]) by sourceware.org (Postfix) with ESMTPS id 7785F38582BD; Fri, 23 Jun 2023 16:44:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7785F38582BD Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=ucw.cz Authentication-Results: sourceware.org; spf=none smtp.mailfrom=kam.mff.cuni.cz Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 0F9D42810D0; Fri, 23 Jun 2023 18:44:33 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ucw.cz; s=gen1; t=1687538673; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=i/0DvoBfuaQy/U6Mdx4epN9ih6iB6BAZAhJfLKq8apo=; b=OncrtiKNleSxm7LfuPJ/SgrLwnm/tDa7nSGXCVRML6AZ88mRhcDp+VNE7bbUJPanzocLq5 W5pKCdOTCEyUuxzovF8s9Y9qbLO2q4G1nU2K9z/sunRkUWWbq5ILoUZnJQ78JuIDOCuaQy CmU8bSuk7+Skq0mRmRfhowT2johDMoU= Date: Fri, 23 Jun 2023 18:44:32 +0200 From: Jan Hubicka To: Jonathan Wakely Cc: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org Subject: Re: [PATCH] libstdc++: Use RAII in std::vector::_M_realloc_insert Message-ID: References: <20230620133858.166497-1-jwakely@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20230620133858.166497-1-jwakely@redhat.com> X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,GIT_PATCH_0,HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: > I intend to push this to trunk once testing finishes. > > I generated the diff with -b so the whitespace changes aren't shown, > because there was some re-indenting that makes the diff look larger than > it really is. > > Honza, I don't think this is likely to make much difference for the PR > 110287 testcases, but I think it simplifies the code and so is an > improvement in terms of maintenance and readability. Thanks for cleaning it up :) The new version seems slightly smaller than the original in inliner metrics. I started to look if we can break out useful parts out of _M_realloc_insert to make it smaller and fit -O3 inline limits. ipa-fnsplit does some "useful" job, like break out: [local count: 107374184]: if (__n_9(D) > 2305843009213693951) goto ; [50.00%] else goto ; [50.00%] [local count: 53687092]: std::__throw_bad_array_new_length (); [local count: 53687092]: std::__throw_bad_alloc (); from std::__new_allocator ::allocate into a separate function, which saves another 4 instructions in the estimate. It is fun to notice that both checks are dead with default predictor, but we do not know that because _M_check_len is not inlined and we do not have return value value range propagation, which I will add. With your propsed change to _M_check_len we will also need to solve PR110377 and actually notice the value range implied by __bulitin_unreachable early enough. What however also goes wrong is that after splitting we decide to inline it back before we consider inlining _M_realloc_insert, so the savings does not help. The reason is that the profile is estimated as: _4 = __builtin_expect (_3, 0); if (_4 != 0) goto ; [10.00%] else goto ; [90.00%] so we expect that with 10% probability the allocation will exceed 64bit address space. The reason is that __builtin_expect is defined to have 10% missrate which we can't change, since it is used in algorithms where the probability of unlikely value really is non-zero. There is __builtin_expect_with_probability that makes it to possible to set probability to 0 or 100 that may be better in such situation, however here it is useless. If code path leads to noreturn function, we predict it as noreturn. This heuristics has lower precedence than builtin_expect so it is not applied, but would do the same work. To work out that the code path is really very unlikely and should be offloaded to a cold section, we however need: diff --git a/libstdc++-v3/include/bits/functexcept.h b/libstdc++-v3/include/bits/functexcept.h index 89972baf2c9..2765f7865df 100644 --- a/libstdc++-v3/include/bits/functexcept.h +++ b/libstdc++-v3/include/bits/functexcept.h @@ -46,14 +46,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #if _GLIBCXX_HOSTED // Helper for exception objects in void - __throw_bad_exception(void) __attribute__((__noreturn__)); + __throw_bad_exception(void) __attribute__((__noreturn__,__cold__)); // Helper for exception objects in void - __throw_bad_alloc(void) __attribute__((__noreturn__)); + __throw_bad_alloc(void) __attribute__((__noreturn__,__cold__)); void - __throw_bad_array_new_length(void) __attribute__((__noreturn__)); + __throw_bad_array_new_length(void) __attribute__((__noreturn__,__cold__)); // Helper for exception objects in void This makes us to drop cont to profile_count::zero which indicates that the code is very likely not executed at all during run of the program. The reason why we can't take such a strong hint from unreachable attribute is twofold. First most programs do call "exit (0)" so taking this as a strong hint may make us to optimize whole program for size. Second is that we consider a possibility that insane developers may make EH delivery relatively common. Would be possible to annotate throw functions in libstdc++ which are very unlikely taken by a working program as __cold__ and possibly drop the redundant __builtin_expect? I will reorder predictors so __builtin_cold_noreturn and __builtin_expect_with_probability thakes precedence over __builtin_expect. It is fun to see how many things can go wrong in such a simple use of libstdc++ :) Honza