Date: Thu, 23 Nov 2023 16:07:21 +0100
From: Jan Hubicka
To: Matthias Kretz
Cc:
jwakely@redhat.com, libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: libstdc++: Speed up push_back
References: <11345207.nUPlyArG6x@minbar>
In-Reply-To: <11345207.nUPlyArG6x@minbar>

> On Sunday, 19 November 2023 22:53:37 CET Jan Hubicka wrote:
> > Sadly it is really hard to work out this
> > from IPA passes, since we basically care whether the iterator points to
> > the same place as the end pointer, which are both passed by reference.
> > This is inter-procedural value numbering that is quite out of reach.
>
> I've done a fair share of branching on __builtin_constant_p in
> std::experimental::simd to improve code-gen. It's powerful! But maybe we
> also need the other side of the story to tell the optimizer: "I know you
> can't const-prop everything; but this variable / expression, even if you
> need to put in a lot of effort, the performance difference will be worth
> it."
>
> For std::vector, the remaining capacity could be such a value. The
> functions f() and g() are equivalent (their code-gen isn't
> https://compiler-explorer.com/z/r44ejK1qz):
>
> #include <vector>
>
> auto
> f()
> {
>   std::vector<int> x;
>   x.reserve(10);
>   for (int i = 0; i < 10; ++i)
>     x.push_back(0);
>   return x;
> }
> auto
> g()
> { return std::vector(10, 0); }

With my changes at -O3 we now inline push_back, so we could optimize the
first loop to the second.
However, with

  ~/trunk-install/bin/gcc -O3 auto.C -S -fdump-tree-all-details -fno-exceptions -fno-store-merging -fno-tree-slp-vectorize

the first problem is right at the beginning:

   [local count: 97603128]:
  MEM[(struct _Vector_impl_data *)x_4(D)]._M_start = 0B;
  MEM[(struct _Vector_impl_data *)x_4(D)]._M_finish = 0B;
  MEM[(struct _Vector_impl_data *)x_4(D)]._M_end_of_storage = 0B;
  _37 = operator new (40);
  _22 = x_4(D)->D.26019._M_impl.D.25320._M_finish;
  _23 = x_4(D)->D.26019._M_impl.D.25320._M_start;
  _24 = _22 - _23;
  if (_24 > 0)
    goto ; [41.48%]
  else
    goto ; [58.52%]

So the vector is first initialized with _M_start=_M_finish=0, but after the
call to operator new we are already unable to propagate this: the loads
into _22 and _23 are not folded back to 0. This is because x is returned
and PTA considers it escaping, so the call is assumed to possibly clobber
it.

This is the problem discussed in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112653
which shows that it is likely worthwhile to fix PTA to handle this
correctly.
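
To make the escape issue easier to see outside of libstdc++, here is a
minimal sketch of the same shape. It is not the libstdc++ code: vec_impl,
push_back_no_grow, build_in_return_slot, build_in_local_copy and release
are hypothetical names made up for illustration, and the explicit "out"
pointer only models the return slot that x_4(D) stands for in the dump.
Both variants compute the same result; the difference is only in what a
points-to analysis may assume across the call to operator new. No claim is
made about what any particular GCC version generates for either of them.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for the three pointers kept by std::vector.
struct vec_impl {
  int *start;
  int *finish;
  int *end_of_storage;
};

constexpr std::size_t kCount = 10;

// Append without growing; the assert models push_back's capacity check.
inline void push_back_no_grow(vec_impl &v, int value)
{
  assert(v.finish != v.end_of_storage);
  *v.finish++ = value;
}

// Variant 1: the object lives in caller-provided storage (the return slot).
// Since *out is visible outside this function, a conservative analysis
// treats it as escaped and assumes operator new may have modified it.
void build_in_return_slot(vec_impl *out)
{
  out->start = out->finish = out->end_of_storage = nullptr;
  int *p = static_cast<int *>(::operator new(kCount * sizeof(int)));
  // Reloads of the fields stored just above (compare _22 and _23).
  std::ptrdiff_t old_size = out->finish - out->start;
  out->start = p;
  out->finish = p + old_size;
  out->end_of_storage = p + kCount;
  for (std::size_t i = 0; i < kCount; ++i)
    push_back_no_grow(*out, 0);
}

// Variant 2: the same steps on a local object that is copied out at the end.
// Assumption: once the helper is inlined, tmp is not reachable from outside,
// so old_size can be seen to be 0.
void build_in_local_copy(vec_impl *out)
{
  vec_impl tmp{nullptr, nullptr, nullptr};
  int *p = static_cast<int *>(::operator new(kCount * sizeof(int)));
  std::ptrdiff_t old_size = tmp.finish - tmp.start;
  tmp.start = p;
  tmp.finish = p + old_size;
  tmp.end_of_storage = p + kCount;
  for (std::size_t i = 0; i < kCount; ++i)
    push_back_no_grow(tmp, 0);
  *out = tmp;  // publish the result once
}

// Free the storage and reset the pointers.
void release(vec_impl &v)
{
  ::operator delete(v.start);
  v = {nullptr, nullptr, nullptr};
}
```

Calling either function on a vec_impl and then release() on it should leave
ten zeros between start and finish. Fixing PTA as suggested in the bug
report would let the first variant be treated like the second without
changing the source.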