From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=Qrar=CL=kam.mff.cuni.cz=hubicka@sourceware.org>
Received: from nikam.ms.mff.cuni.cz (nikam.ms.mff.cuni.cz [195.113.20.16])
	by sourceware.org (Postfix) with ESMTPS id 7785F38582BD;
	Fri, 23 Jun 2023 16:44:34 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7785F38582BD
Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=ucw.cz
Authentication-Results: sourceware.org; spf=none smtp.mailfrom=kam.mff.cuni.cz
Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202)
	id 0F9D42810D0; Fri, 23 Jun 2023 18:44:33 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ucw.cz; s=gen1;
	t=1687538673;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 in-reply-to:in-reply-to:references:references;
	bh=i/0DvoBfuaQy/U6Mdx4epN9ih6iB6BAZAhJfLKq8apo=;
	b=OncrtiKNleSxm7LfuPJ/SgrLwnm/tDa7nSGXCVRML6AZ88mRhcDp+VNE7bbUJPanzocLq5
	W5pKCdOTCEyUuxzovF8s9Y9qbLO2q4G1nU2K9z/sunRkUWWbq5ILoUZnJQ78JuIDOCuaQy
	CmU8bSuk7+Skq0mRmRfhowT2johDMoU=
Date: Fri, 23 Jun 2023 18:44:32 +0200
From: Jan Hubicka <hubicka@ucw.cz>
To: Jonathan Wakely <jwakely@redhat.com>
Cc: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] libstdc++: Use RAII in std::vector::_M_realloc_insert
Message-ID: <ZJXL8AI2fGPlHds7@kam.mff.cuni.cz>
References: <20230620133858.166497-1-jwakely@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20230620133858.166497-1-jwakely@redhat.com>
X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,GIT_PATCH_0,HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

> I intend to push this to trunk once testing finishes.
> 
> I generated the diff with -b so the whitespace changes aren't shown,
> because there was some re-indenting that makes the diff look larger than
> it really is.
> 
> Honza, I don't think this is likely to make much difference for the PR
> 110287 testcases, but I think it simplifies the code and so is an
> improvement in terms of maintenance and readability.

Thanks for cleaning it up :)
The new version seems slightly smaller than the original in inliner
metrics.

I started to look if we can break out useful parts out of
_M_realloc_insert to make it smaller and fit -O3 inline limits.
ipa-fnsplit does some "useful" job, like break out:

  <bb 3> [local count: 107374184]:
  if (__n_9(D) > 2305843009213693951)
    goto <bb 4>; [50.00%]
  else
    goto <bb 5>; [50.00%]

  <bb 4> [local count: 53687092]:
  std::__throw_bad_array_new_length ();

  <bb 5> [local count: 53687092]:
  std::__throw_bad_alloc ();

from std::__new_allocator<std::pair<unsigned int, unsigned int> ::allocate
into a separate function, which saves another 4 instructions in the estimate.

It is fun to notice that both checks are dead with default predictor,
but we do not know that because _M_check_len is not inlined and we do
not have return value value range propagation, which I will add.  With
your propsed change to _M_check_len we will also need to solve PR110377
and actually notice the value range implied by __bulitin_unreachable
early enough.

What however also goes wrong is that after splitting we decide to inline
it back before we consider inlining _M_realloc_insert, so the savings
does not help.  The reason is that the profile is estimated as:

  _4 = __builtin_expect (_3, 0);
  if (_4 != 0)
    goto <bb 3>; [10.00%]
  else
    goto <bb 6>; [90.00%]

so we expect that with 10% probability the allocation will exceed 64bit
address space.  The reason is that __builtin_expect is defined to have
10% missrate which we can't change, since it is used in algorithms where
the probability of unlikely value really is non-zero.

There is __builtin_expect_with_probability that makes it to possible to
set probability to 0 or 100 that may be better in such situation,
however here it is useless.  If code path leads to noreturn function,
we predict it as noreturn.  This heuristics has lower precedence than
builtin_expect so it is not applied, but would do the same work.

To work out that the code path is really very unlikely and should be
offloaded to a cold section, we however need:
diff --git a/libstdc++-v3/include/bits/functexcept.h b/libstdc++-v3/include/bits/functexcept.h
index 89972baf2c9..2765f7865df 100644
--- a/libstdc++-v3/include/bits/functexcept.h
+++ b/libstdc++-v3/include/bits/functexcept.h
@@ -46,14 +46,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if _GLIBCXX_HOSTED
   // Helper for exception objects in <except>
   void
-  __throw_bad_exception(void) __attribute__((__noreturn__));
+  __throw_bad_exception(void) __attribute__((__noreturn__,__cold__));
 
   // Helper for exception objects in <new>
   void
-  __throw_bad_alloc(void) __attribute__((__noreturn__));
+  __throw_bad_alloc(void) __attribute__((__noreturn__,__cold__));
 
   void
-  __throw_bad_array_new_length(void) __attribute__((__noreturn__));
+  __throw_bad_array_new_length(void) __attribute__((__noreturn__,__cold__));
 
   // Helper for exception objects in <typeinfo>
   void

This makes us to drop cont to profile_count::zero which indicates that
the code is very likely not executed at all during run of the program.

The reason why we can't take such a strong hint from unreachable
attribute is twofold.  First most programs do call "exit (0)" so taking
this as a strong hint may make us to optimize whole program for size.
Second is that we consider a possibility that insane developers may make
EH delivery relatively common.

Would be possible to annotate throw functions in libstdc++ which are
very unlikely taken by a working program as __cold__ and possibly drop
the redundant __builtin_expect?

I will reorder predictors so __builtin_cold_noreturn and
__builtin_expect_with_probability thakes precedence over
__builtin_expect.

It is fun to see how many things can go wrong in such a simple use of
libstdc++ :)

Honza