From: Jonathan Wakely
Date: Thu, 23 Nov 2023 16:26:13 +0000
Subject: Re: libstdc++: Speed up push_back
To: Jan Hubicka
Cc: Matthias Kretz, rguenther@suse.de, libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
References: <11345207.nUPlyArG6x@minbar>

On Thu, 23 Nov 2023 at 15:44, Jan Hubicka wrote:
>
> Hi,
> so if I understand it right, it should be safe to simply replace memmove
> by memcpy. I wonder if we can get rid of the count != 0 check at least
> for glibc systems.

I don't think we can do that. It's still undefined with glibc, and
glibc marks it with __attribute__((nonnull)), and ubsan will diagnose
it.

> In general push_back now needs inline-insns-auto to
> be 33 to be inlined at -O2
>
>
> jh@ryzen4:/tmp> cat ~/tt.C
> #include <vector>
> typedef unsigned int uint32_t;
> struct pair_t {uint32_t first, second;};
> struct pair_t pair;
> void
> test()
> {
>         std::vector<pair_t> stack;
>         stack.push_back (pair);
>         while (!stack.empty()) {
>                 pair_t cur = stack.back();
>                 stack.pop_back();
>                 if (!cur.first)
>                 {
>                         cur.second++;
>                         stack.push_back (cur);
>                 }
>                 if (cur.second > 10000)
>                         break;
>         }
> }
> int
> main()
> {
>         for (int i = 0; i < 10000; i++)
>           test();
> }
>
> jh@ryzen4:/tmp> ~/trunk-install/bin/g++ ~/tt.C -O2 --param max-inline-insns-auto=32 ; time ./a.out
>
> real    0m0.399s
> user    0m0.399s
> sys     0m0.000s
> jh@ryzen4:/tmp> ~/trunk-install/bin/g++ ~/tt.C -O2 --param max-inline-insns-auto=33 ; time ./a.out
>
> real    0m0.039s
> user    0m0.039s
> sys     0m0.000s
>
> Current inline limit is 15. We can save
>  - 2 insns if inliner knows that conditional guarding
>    builtin_unreachable will die (I have patch for this)
>  - 4 insns if we work out that on 64bit hosts allocating vector with
>    2^63 elements is impossible
>  - 2 insns if we allow NULL parameter on memcpy

I don't think we can do that.

>  - 2 insns if we allow NULL parameter on delete

That's allowed, I think we just check first to avoid making a function
call if it's null, because we know operator delete will do nothing.
But if it's hurting inlining, maybe that's the wrong choice.

> So this is 23 instructions. Inliner has hinting which could make
> push_back reasonable candidate for -O2 inlining and then we could be
> able to propagate interesting stuff across repeated calls to push_back.
>
> libstdc++-v3/ChangeLog:
>
>         * include/bits/stl_uninitialized.h (relocate_a_1): Use memcpy instead of memmove.

This patch is OK for trunk.

>
> diff --git a/libstdc++-v3/include/bits/stl_uninitialized.h b/libstdc++-v3/include/bits/stl_uninitialized.h
> index 1282af3bc43..a9b802774c6 100644
> --- a/libstdc++-v3/include/bits/stl_uninitialized.h
> +++ b/libstdc++-v3/include/bits/stl_uninitialized.h
> @@ -1119,14 +1119,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  #ifdef __cpp_lib_is_constant_evaluated
>           if (std::is_constant_evaluated())
>             {
> -             // Can't use memmove. Wrap the pointer so that __relocate_a_1
> +             // Can't use memcpy. Wrap the pointer so that __relocate_a_1
>               // resolves to the non-trivial overload above.
>               __gnu_cxx::__normal_iterator<_Tp*, void> __out(__result);
>               __out = std::__relocate_a_1(__first, __last, __out, __alloc);
>               return __out.base();
>             }
>  #endif
> -         __builtin_memmove(__result, __first, __count * sizeof(_Tp));
> +         __builtin_memcpy(__result, __first, __count * sizeof(_Tp));
>         }
>        return __result + __count;
>      }
>
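
To illustrate why the count check has to stay, here is a minimal
standalone sketch of a guarded memcpy relocation. The name
relocate_trivial is hypothetical and this is not the libstdc++ code; it
only assumes a trivially copyable element type and non-overlapping
source and destination ranges (which is what makes memcpy acceptable in
place of memmove).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper for illustration only, not the libstdc++ function.
// Copies [first, last) to result and returns the end of the output range.
// The ranges are assumed not to overlap.
template <typename T>
T* relocate_trivial(T* first, T* last, T* result)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch only handles trivially copyable types");
    assert(first <= last);
    const std::ptrdiff_t count = last - first;
    // An empty range may consist of null pointers. Passing a null pointer
    // to memcpy is undefined even when the size is zero, so the call is
    // skipped entirely for empty ranges.
    if (count > 0)
        std::memcpy(result, first, static_cast<std::size_t>(count) * sizeof(T));
    return result + count;
}
```

Calling it with two null pointers as the input range returns result
unchanged without ever reaching memcpy, which is the case the guard
exists for.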