From: Martin Liška
Date: Wed, 5 Jan 2022 09:01:41 +0100
Subject: Re: [Highlight] Performance improvements
To: Mark Wielaard
Cc: Tom de Vries, dwz@sourceware.org, Jakub Jelinek, Michael Matz

On 1/3/22 23:06, Mark Wielaard wrote:
> Hi Martin,
>
> I noticed that this is a reply to a thread from 2 years ago. Is it
> related to the work mentioned by Tom in that thread?

Hello.

It's related only a bit as it's also connected to Performance improvements :)

>
> On Thu, Dec 23, 2021 at 12:57:48PM +0100, Martin Liška wrote:
>> I've made couple of experiments with dwz speed. I've taken the following packages:
>> gcc, krita, libetonyek, rtags, sysdig and run dwz -m x ... for them.
>>
>> Here are the numbers I collected for the following configurations:
>> dwz (system package, built with LTO and -O2), dwz-O2_lto is supposed
>> to be the same (built from source), then I experimented with -O3 and PGO
>> (based on tramp3d copies 4 times). And the final run is an experimental patch
>> I have that replaces the iterative_hash with xxhash:
>> https://github.com/Cyan4973/xxHash
>>
>> # 1/5: sysdig (60M)
>> dwz                    : 10.0
>> dwz                    : 9.8 (98.7%)
>> dwz-O2_lto             : 9.5 (95.6%)
>> dwz-O3_lto             : 9.2 (91.9%)
>> dwz-O3_lto_pgo         : 8.1 (81.3%)
>> dwz-O3_lto_pgo_xxhash  : 7.3 (72.9%)
>> # 2/5: rtags (148M)
>> dwz                    : 19.6
>> dwz                    : 19.6 (99.9%)
>> dwz-O2_lto             : 17.4 (89.0%)
>> dwz-O3_lto             : 16.7 (85.4%)
>> dwz-O3_lto_pgo         : 14.4 (73.6%)
>> dwz-O3_lto_pgo_xxhash  : 13.2 (67.6%)
>> # 3/5: libetonyek (112M)
>> dwz                    : 10.5
>> dwz                    : 10.5 (100.6%)
>> dwz-O2_lto             : 10.8 (102.8%)
>> dwz-O3_lto             : 10.1 (96.7%)
>> dwz-O3_lto_pgo         : 9.1 (87.4%)
>> dwz-O3_lto_pgo_xxhash  : 8.1 (77.1%)
>> # 4/5: krita (685M)
>> dwz                    : 133.7
>> dwz                    : 134.3 (100.5%)
>> dwz-O2_lto             : 95.3 (71.3%)
>> dwz-O3_lto             : 91.2 (68.2%)
>> dwz-O3_lto_pgo         : 78.9 (59.0%)
>> dwz-O3_lto_pgo_xxhash  : 71.6 (53.5%)
>> # 5/5: gcc (1.2G)
>> dwz                    : 61.9
>> dwz                    : 61.9 (99.9%)
>> dwz-O2_lto             : 58.5 (94.5%)
>> dwz-O3_lto             : 56.6 (91.3%)
>> dwz-O3_lto_pgo         : 54.1 (87.4%)
>> dwz-O3_lto_pgo_xxhash  : 51.7 (83.4%)
>>
>> So as seen, using -O3 really helps. One gets a bigger binary, but as dwz is small
>> it's negligible:
>>
>> bloaty dwz-O3_lto -- dwz-O2_lto
>>      FILE SIZE         VM SIZE
>>   ---------------  ---------------
>>     +28%  +50.3Ki   [ = ]        0    .debug_loclists
>>     +18%  +25.3Ki    +18%  +25.3Ki    .text
>>     +12%  +24.6Ki   [ = ]        0    .debug_info
>>     +16%  +17.3Ki   [ = ]        0    .debug_line
>>     +31%  +6.19Ki   [ = ]        0    .debug_rnglists
>>     +11%     +689   [ = ]        0    .debug_abbrev
>>    +7.1%     +633   [ = ]        0    .strtab
>>    +5.5%     +504   +5.5%     +504    .eh_frame
>>    +1.3%     +453   [ = ]        0    .debug_str
>>    +0.8%     +375   +0.8%     +375    .rodata
>>    +2.8%     +336   [ = ]        0    .symtab
>>     +11%      +64   [ = ]        0    .debug_aranges
>>    +4.2%      +64   +4.4%      +64    .eh_frame_hdr
>>    [ = ]        0   +1.8%      +32    .bss
>>    -3.1%      -21   -3.1%      -21    [LOAD #2 [RX]]
>>   -61.0%  -2.20Ki   [ = ]        0    [Unmapped]
>>     +16%   +124Ki    +13%  +26.2Ki    TOTAL
>>
>> Then, PGO also helps significantly. And finally, using xxhash one can get a 5-10%
>> improvement.
>>
>> For now I'm suggesting using -O3 and PGO for our openSUSE package:
>> https://build.opensuse.org/request/show/942235
>>
>> Upstream questions I have:
>> - What about replacing -O2 with -O3 by default?
>
> Did you test that without -flto? If it still gets a ~5% speedup then I

Yep:

# 1/5: sysdig (60M)
dwz_O2                 : 9.7
dwz_O2_xxhash          : 8.5 (87.7%)
# 2/5: rtags (58M)
dwz_O2                 : 17.6
dwz_O2_xxhash          : 15.8 (89.5%)
# 3/5: libetonyek (91M)
dwz_O2                 : 10.8
dwz_O2_xxhash          : 9.4 (86.7%)
# 4/5: krita (685M)
dwz_O2                 : 96.0
dwz_O2_xxhash          : 85.6 (89.1%)
# 5/5: gcc (1.2G)
dwz_O2                 : 58.6
dwz_O2_xxhash          : 54.1 (92.4%)

> like that idea. Or maybe we should also include -flto by default?

Well, it's probably something that can be decided by distributions. Maybe we can add
a default dwz.spec file?

>
>> - Are you interested in the xxhash patch? Do you want it as a conditional build
>> or may I replace the currently existing hash function?
>
> I think it is best to simply replace the existing hash function
> instead of making it a conditional thing.

Fine, I'm going to prepare a patch.

>
> Does it rely on having the libxxhash dynamic library available or
> would we simply embed a copy (replacing the hashtab.[ch] files)?

I would not embed a copy, as it may become obsolete quite fast. I would rather use
a standard shared library (similarly to libelf).

Martin

>
> Cheers,
>
> Mark
>
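
For anyone who wants to recheck the relative figures in the non-LTO -O2 listing, here is a minimal Python sketch. It assumes each percentage is the xxhash run time divided by the plain -O2 run time; the names TIMINGS, relative_percent and summarize are made up for this illustration and are not part of dwz or of the benchmark script. Because the timings quoted in this mail are rounded to one decimal, the recomputed values can differ from the listed ones by a few tenths of a percent.

```python
# Timings in seconds, copied from the non-LTO -O2 listing in this mail:
# package -> (dwz_O2, dwz_O2_xxhash)
TIMINGS = {
    "sysdig": (9.7, 8.5),
    "rtags": (17.6, 15.8),
    "libetonyek": (10.8, 9.4),
    "krita": (96.0, 85.6),
    "gcc": (58.6, 54.1),
}


def relative_percent(baseline, candidate):
    """Return the candidate run time as a percentage of the baseline run time."""
    return 100.0 * candidate / baseline


def summarize(timings):
    """Map each package to its relative run time, rounded to one decimal."""
    return {
        name: round(relative_percent(base, new), 1)
        for name, (base, new) in timings.items()
    }


if __name__ == "__main__":
    for name, pct in summarize(TIMINGS).items():
        print(f"{name:12s}: {pct:5.1f}%")
```

Values below 100% mean the xxhash build finished faster than the baseline.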