From: "Martin Liška" <mliska@suse.cz>
To: Mark Wielaard <mark@klomp.org>
Cc: Tom de Vries <tdevries@suse.de>,
dwz@sourceware.org, Jakub Jelinek <jakub@redhat.com>,
Michael Matz <matz@suse.de>
Subject: Re: [Highlight] Performance improvements
Date: Wed, 5 Jan 2022 09:01:41 +0100
Message-ID: <fb5edd44-ce48-cc91-2e2b-b39c08c20baf@suse.cz>
In-Reply-To: <YdNzYU3tZkLlgLeW@wildebeest.org>
On 1/3/22 23:06, Mark Wielaard wrote:
> Hi Martin,
>
> I noticed that this is a reply to a thread from 2 years ago. Is it
> related to the work mentioned by Tom in that thread?
Hello.
Only loosely: it's related in that it's also about performance improvements :)
>
> On Thu, Dec 23, 2021 at 12:57:48PM +0100, Martin Liška wrote:
>> I've run a couple of experiments with dwz speed. I took the following packages:
>> gcc, krita, libetonyek, rtags, sysdig, and ran dwz -m x ... for each of them.
>>
>> These are the numbers I collected for the following configurations:
>> dwz (system package, built with LTO and -O2), dwz-O2_lto, which is supposed
>> to be the same (built from source), then I experimented with -O3 and PGO
>> (based on tramp3d copies 4 times). The final run is an experimental patch
>> I have that replaces iterative_hash with xxhash:
>> https://github.com/Cyan4973/xxHash
>>
>> # 1/5: sysdig (60M)
>> dwz : 10.0
>> dwz : 9.8 (98.7%)
>> dwz-O2_lto : 9.5 (95.6%)
>> dwz-O3_lto : 9.2 (91.9%)
>> dwz-O3_lto_pgo : 8.1 (81.3%)
>> dwz-O3_lto_pgo_xxhash : 7.3 (72.9%)
>> # 2/5: rtags (148M)
>> dwz : 19.6
>> dwz : 19.6 (99.9%)
>> dwz-O2_lto : 17.4 (89.0%)
>> dwz-O3_lto : 16.7 (85.4%)
>> dwz-O3_lto_pgo : 14.4 (73.6%)
>> dwz-O3_lto_pgo_xxhash : 13.2 (67.6%)
>> # 3/5: libetonyek (112M)
>> dwz : 10.5
>> dwz : 10.5 (100.6%)
>> dwz-O2_lto : 10.8 (102.8%)
>> dwz-O3_lto : 10.1 (96.7%)
>> dwz-O3_lto_pgo : 9.1 (87.4%)
>> dwz-O3_lto_pgo_xxhash : 8.1 (77.1%)
>> # 4/5: krita (685M)
>> dwz : 133.7
>> dwz : 134.3 (100.5%)
>> dwz-O2_lto : 95.3 (71.3%)
>> dwz-O3_lto : 91.2 (68.2%)
>> dwz-O3_lto_pgo : 78.9 (59.0%)
>> dwz-O3_lto_pgo_xxhash : 71.6 (53.5%)
>> # 5/5: gcc (1.2G)
>> dwz : 61.9
>> dwz : 61.9 (99.9%)
>> dwz-O2_lto : 58.5 (94.5%)
>> dwz-O3_lto : 56.6 (91.3%)
>> dwz-O3_lto_pgo : 54.1 (87.4%)
>> dwz-O3_lto_pgo_xxhash : 51.7 (83.4%)
>>
>> As the numbers show, using -O3 really helps. One gets a bigger binary, but as dwz
>> is small, the increase is negligible:
>>
>> bloaty dwz-O3_lto -- dwz-O2_lto
>> FILE SIZE VM SIZE
>> -------------- --------------
>> +28% +50.3Ki [ = ] 0 .debug_loclists
>> +18% +25.3Ki +18% +25.3Ki .text
>> +12% +24.6Ki [ = ] 0 .debug_info
>> +16% +17.3Ki [ = ] 0 .debug_line
>> +31% +6.19Ki [ = ] 0 .debug_rnglists
>> +11% +689 [ = ] 0 .debug_abbrev
>> +7.1% +633 [ = ] 0 .strtab
>> +5.5% +504 +5.5% +504 .eh_frame
>> +1.3% +453 [ = ] 0 .debug_str
>> +0.8% +375 +0.8% +375 .rodata
>> +2.8% +336 [ = ] 0 .symtab
>> +11% +64 [ = ] 0 .debug_aranges
>> +4.2% +64 +4.4% +64 .eh_frame_hdr
>> [ = ] 0 +1.8% +32 .bss
>> -3.1% -21 -3.1% -21 [LOAD #2 [RX]]
>> -61.0% -2.20Ki [ = ] 0 [Unmapped]
>> +16% +124Ki +13% +26.2Ki TOTAL
>>
>> PGO also helps significantly. And finally, using xxhash one can get a 5-10%
>> improvement.
>>
>> For now I'm suggesting using -O3 and PGO for our openSUSE package:
>> https://build.opensuse.org/request/show/942235
>>
>> Upstream questions I have:
>> - What about changing the default from -O2 to -O3?
>
> Did you test that without -flto? If it still gets a ~5% speedup then I
Yep:
# 1/5: sysdig (60M)
dwz_O2 : 9.7
dwz_O2_xxhash : 8.5 (87.7%)
# 2/5: rtags (58M)
dwz_O2 : 17.6
dwz_O2_xxhash : 15.8 (89.5%)
# 3/5: libetonyek (91M)
dwz_O2 : 10.8
dwz_O2_xxhash : 9.4 (86.7%)
# 4/5: krita (685M)
dwz_O2 : 96.0
dwz_O2_xxhash : 85.6 (89.1%)
# 5/5: gcc (1.2G)
dwz_O2 : 58.6
dwz_O2_xxhash : 54.1 (92.4%)
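
For anyone who wants to recompute the relative figures, here is a minimal Python sketch. It is not part of the original measurements. The helper names `relative_percent` and `geometric_mean` are illustrative, and the geometric mean is an extra summary that the table does not report. The inputs are the rounded timings copied from the table above, so the results can differ from the listed percentages by a few tenths.

```python
from math import prod

# Timings copied from the table above: package -> (dwz_O2, dwz_O2_xxhash).
# Units are whatever the table reports; only the ratio matters here.
timings = {
    "sysdig": (9.7, 8.5),
    "rtags": (17.6, 15.8),
    "libetonyek": (10.8, 9.4),
    "krita": (96.0, 85.6),
    "gcc": (58.6, 54.1),
}


def relative_percent(baseline, candidate):
    """Return the candidate runtime as a percentage of the baseline runtime."""
    return 100.0 * candidate / baseline


def geometric_mean(values):
    """Geometric mean, a common way to average ratios across benchmarks."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


ratios = {name: relative_percent(b, c) for name, (b, c) in timings.items()}

for name, pct in ratios.items():
    print(f"{name:12s} {pct:5.1f}%")
print(f"{'geomean':12s} {geometric_mean(ratios.values()):5.1f}%")
```

A value below 100% means the xxhash build finished faster than the plain -O2 build for that package.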
> like that idea. Or maybe we should also include -flto by default?
Well, that's probably something distributions can decide. Maybe we could
add a default dwz.spec file?
>
>> - Are you interested in the xxhash patch? Do you want it as a conditional build
>> or may I replace the currently existing hash function?
>
> I think it is best to simply replace the existing hash function
> instead of making it a conditional thing.
Fine, I'm going to prepare a patch.
>
> Does it rely on having the libxxhash dynamic library available or
> would we simply embed a copy (replacing the hashtab.[ch] files)?
I would not embed a copy, as it may become outdated quite quickly. I would rather
use the standard shared library (as we do with libelf).
Martin
>
> Cheers,
>
> Mark
>