From: Jonathan Wakely
Date: Thu, 23 Nov 2023 16:20:52 +0000
Subject: Re: libstdc++: Speed up push_back
To: Jan Hubicka
Cc: Matthias Kretz, rguenther@suse.de, libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
References: <11345207.nUPlyArG6x@minbar>

On Thu, 23 Nov 2023 at 15:34, Jan Hubicka wrote:
>
> > > On Sunday, 19 November 2023 22:53:37 CET Jan Hubicka wrote:
> > > > Sadly it is really hard to work out this
> > > > from IPA passes, since we basically care whether the iterator points to
> > > > the same place as the end pointer, which are both passed by reference.
> > > > This is inter-procedural value numbering that is quite out of reach.
> > >
> > > I've done a fair share of branching on __builtin_constant_p in
> > > std::experimental::simd to improve code-gen. It's powerful! But maybe we
> > > also need the other side of the story to tell the optimizer: "I know you
> > > can't const-prop everything; but this variable / expression, even if you
> > > need to put in a lot of effort, the performance difference will be worth
> > > it."
> > >
> > > For std::vector, the remaining capacity could be such a value. The
> > > functions f() and g() are equivalent (their code-gen isn't
> > > https://compiler-explorer.com/z/r44ejK1qz):
> > >
> > > #include <vector>
> > >
> > > auto
> > > f()
> > > {
> > >   std::vector<int> x;
> > >   x.reserve(10);
> > >   for (int i = 0; i < 10; ++i)
> > >     x.push_back(0);
> > >   return x;
> > > }
> > > auto
> > > g()
> > > { return std::vector<int>(10, 0); }
> >
> > With my changes at -O3 we now inline push_back, so we could optimize the
> > first loop to the second.
> > However with
> > ~/trunk-install/bin/gcc -O3 auto.C -S -fdump-tree-all-details -fno-exceptions -fno-store-merging -fno-tree-slp-vectorize
> > the first problem is right at the beginning:
> >
> > [local count: 97603128]:
> > MEM[(struct _Vector_impl_data *)x_4(D)]._M_start = 0B;
> > MEM[(struct _Vector_impl_data *)x_4(D)]._M_finish = 0B;
> > MEM[(struct _Vector_impl_data *)x_4(D)]._M_end_of_storage = 0B;
> > _37 = operator new (40);
>
> I also wonder, if default operator new and malloc can be handled as not
> reading/modifying anything visible to the user code.

No, there's no way to know if the default operator new is being used.
A replacement operator new could be provided at link-time.

That's why we need -fsane-operator-new

> That would help
> us to propagate here even if we lose track of points-to information.
>
> We have:
>
>   /* If the call is to a replaceable operator delete and results
>      from a delete expression as opposed to a direct call to
>      such operator, then we can treat it as free. */
>   if (fndecl
>       && DECL_IS_OPERATOR_DELETE_P (fndecl)
>       && DECL_IS_REPLACEABLE_OPERATOR (fndecl)
>       && gimple_call_from_new_or_delete (stmt))
>     return ". o ";
>   /* Similarly operator new can be treated as malloc. */
>   if (fndecl
>       && DECL_IS_REPLACEABLE_OPERATOR_NEW_P (fndecl)
>       && gimple_call_from_new_or_delete (stmt))
>     return "m ";
> Which informs alias analysis that new returns pointer to memory
> not aliasing with anything and that free is not reading anything
> from its parameter (but it is modelled as a write to make it clear
> that the memory dies).

But this only applies to new T[n] not to operator new(n * sizeof(T)).
So it's not relevant to std::vector.

> stmt_kills_ref_p special cases BUILT_IN_FREE but not OPERATOR delete
> to make it clear that everything pointed to by it dies. This is needed
> because 'o' only means that some data may be overwritten, but it does
> not make it clear that all data dies.
>
> Not handling operator delete seems like an omission, but maybe it is not
> too critical since we emit clobbers around destructors that are usually
> right before call to delete. Also ipa-modref kill analysis does not
> understand BUILT_IN_FREE nor delete and could.
>
> I wonder if we can handle both as const except for side-effects
> described.
>
> Honza
> > _22 = x_4(D)->D.26019._M_impl.D.25320._M_finish;
> > _23 = x_4(D)->D.26019._M_impl.D.25320._M_start;
> > _24 = _22 - _23;
> > if (_24 > 0)
> >   goto ; [41.48%]
> > else
> >   goto ; [58.52%]
> >
> > So the vector is first initialized with _M_start=_M_finish=0, but after
> > call to new we already are not able to propagate this.
> >
> > This is because x is returned and PTA considers it escaping. This is
> > problem discussed in
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112653
> > Which shows that it is likely worthwhile to fix PTA to handle this
> > correctly.
>
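
To make the distinction between "new T[n]" and "operator new(n * sizeof(T))" concrete, here is a minimal sketch of the two spellings side by side. The helper names sum_via_new_expression, sum_via_direct_call and check_same_result are hypothetical and exist nowhere in GCC or libstdc++. That std::vector allocates through the direct-call form is taken from the reply above, not verified here.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <numeric>

// Hypothetical helper: allocates through a new-expression. The call to
// operator new[] is generated by the compiler from `new int[n]`, which is
// the case the quoted gimple_call_from_new_or_delete check is about.
int sum_via_new_expression(std::size_t n)
{
  int* p = new int[n];
  std::iota(p, p + n, 1);                    // fill with 1, 2, ..., n
  const int total = std::accumulate(p, p + n, 0);
  delete[] p;
  return total;
}

// Hypothetical helper: calls the allocation function directly, the form
// the reply above attributes to std::vector. No new-expression is involved,
// so the storage is raw until objects are constructed in it.
int sum_via_direct_call(std::size_t n)
{
  void* raw = ::operator new(n * sizeof(int));
  int* p = static_cast<int*>(raw);
  for (std::size_t i = 0; i < n; ++i)
    ::new (static_cast<void*>(p + i)) int(static_cast<int>(i) + 1);
  const int total = std::accumulate(p, p + n, 0);
  ::operator delete(raw);
  return total;
}

// Sanity check: both spellings yield usable storage and the same result;
// only the way the allocation is written in the source differs.
void check_same_result(std::size_t n)
{
  assert(sum_via_new_expression(n) == sum_via_direct_call(n));
}
```

Calling check_same_result(10) from a small main() is enough to exercise both paths. The observable behaviour is identical, so any difference lies only in how the compiler is allowed to treat each call.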