Date: Thu, 23 Nov 2023 16:43:57 +0100
From: Jan Hubicka
To: Matthias Kretz, rguenther@suse.de
Cc: jwakely@redhat.com, libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: libstdc++: Speed up push_back
References: <11345207.nUPlyArG6x@minbar>

Hi,
so if I understand it right, it should be safe to simply replace memmove
by memcpy.  I wonder if we can get rid of the count != 0 check, at least
for glibc systems.

In general, push_back now needs max-inline-insns-auto to be 33 to be
inlined at -O2:

jh@ryzen4:/tmp> cat ~/tt.C
#include <vector>
typedef unsigned int uint32_t;
struct pair_t {uint32_t first, second;};
struct pair_t pair;
void
test()
{
  std::vector<pair_t> stack;
  stack.push_back (pair);
  while (!stack.empty())
    {
      pair_t cur = stack.back();
      stack.pop_back();
      if (!cur.first)
        {
          cur.second++;
          stack.push_back (cur);
        }
      if (cur.second > 10000)
        break;
    }
}
int
main()
{
  for (int i = 0; i < 10000; i++)
    test();
}
jh@ryzen4:/tmp> ~/trunk-install/bin/g++ ~/tt.C -O2 --param max-inline-insns-auto=32 ; time ./a.out

real    0m0.399s
user    0m0.399s
sys     0m0.000s
jh@ryzen4:/tmp> ~/trunk-install/bin/g++ ~/tt.C -O2 --param max-inline-insns-auto=33 ; time ./a.out

real    0m0.039s
user    0m0.039s
sys     0m0.000s

Current inline limit is 15.  We can save
 - 2 insns if the inliner knows that the conditional guarding
   __builtin_unreachable will die (I have a patch for this)
 - 4 insns if we work out that on 64-bit hosts allocating a vector with
   2^63 elements is impossible
 - 2 insns if we allow a NULL parameter on memcpy
 - 2 insns if we allow a NULL parameter on delete

So this is 23 instructions (33 - 2 - 4 - 2 - 2).
The inliner has hinting which could make push_back a reasonable candidate
for -O2 inlining, and then we would be able to propagate interesting
information across repeated calls to push_back.

libstdc++-v3/ChangeLog:

	* include/bits/stl_uninitialized.h (__relocate_a_1): Use memcpy
	instead of memmove.

diff --git a/libstdc++-v3/include/bits/stl_uninitialized.h b/libstdc++-v3/include/bits/stl_uninitialized.h
index 1282af3bc43..a9b802774c6 100644
--- a/libstdc++-v3/include/bits/stl_uninitialized.h
+++ b/libstdc++-v3/include/bits/stl_uninitialized.h
@@ -1119,14 +1119,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #ifdef __cpp_lib_is_constant_evaluated
 	  if (std::is_constant_evaluated())
 	    {
-	      // Can't use memmove. Wrap the pointer so that __relocate_a_1
+	      // Can't use memcpy. Wrap the pointer so that __relocate_a_1
 	      // resolves to the non-trivial overload above.
 	      __gnu_cxx::__normal_iterator<_Tp*, void> __out(__result);
 	      __out = std::__relocate_a_1(__first, __last, __out, __alloc);
 	      return __out.base();
 	    }
 #endif
-	  __builtin_memmove(__result, __first, __count * sizeof(_Tp));
+	  __builtin_memcpy(__result, __first, __count * sizeof(_Tp));
 	}
       return __result + __count;
     }
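
For illustration only, here is a minimal standalone sketch of the pattern
the patch touches.  It is not the libstdc++ code: relocate_trivial and
demo_roundtrip are hypothetical names, and the sketch assumes the
destination is a separate buffer, so the ranges cannot overlap and memcpy
is enough.  It keeps the count check, since ISO C (through C17) leaves
memcpy with a null pointer undefined even for size zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper, not part of libstdc++: relocate `count` trivially
// copyable objects from `first` into separate storage at `result`.
template <typename T>
T* relocate_trivial(T* first, std::size_t count, T* result)
{
  static_assert(std::is_trivially_copyable<T>::value,
                "byte-wise relocation needs a trivially copyable type");
  // Guard kept on purpose: an empty range may come with null pointers,
  // and memcpy on null is undefined in ISO C (through C17) even for
  // size zero.
  if (count > 0)
    {
      // Assumption: the caller passes a distinct destination buffer, so
      // source and destination do not overlap.
      assert(first != nullptr && result != nullptr);
      std::memcpy(result, first, count * sizeof(T));
    }
  return result + count;
}

struct pair_t { unsigned first, second; };

// Copy three elements into a larger buffer, as a reallocating push_back
// would, and check that they arrived intact.
bool demo_roundtrip()
{
  pair_t old_buf[3] = {{1, 2}, {3, 4}, {5, 6}};
  pair_t new_buf[6] = {};
  pair_t* end = relocate_trivial(old_buf, 3, new_buf);
  return end == new_buf + 3
         && new_buf[0].first == 1
         && new_buf[2].second == 6;
}
```

Dropping the count check would only be valid where the C library is known
to accept null pointers with a zero size, which is the glibc question
raised above.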