public inbox for dwz@sourceware.org
From: Mark Wielaard <mark@klomp.org>
To: "Martin Liška" <mliska@suse.cz>
Cc: Tom de Vries <tdevries@suse.de>,
	dwz@sourceware.org, Jakub Jelinek <jakub@redhat.com>,
	Michael Matz <matz@suse.de>
Subject: Re: [Highlight] Performance improvements
Date: Mon, 3 Jan 2022 23:06:25 +0100
Message-ID: <YdNzYU3tZkLlgLeW@wildebeest.org>
In-Reply-To: <20463996-a518-c26a-3c68-65b91606d84c@suse.cz>

Hi Martin,

I noticed that this is a reply to a thread from 2 years ago. Is it
related to the work mentioned by Tom in that thread?

On Thu, Dec 23, 2021 at 12:57:48PM +0100, Martin Liška wrote:
> I've run a couple of experiments on dwz speed. I took the following packages:
> gcc, krita, libetonyek, rtags, sysdig and ran dwz -m x ... on them.
> 
> Here are the numbers I collected for the following configurations:
> dwz (system package, built with LTO and -O2), dwz-O2_lto is supposed
> to be the same (built from source), then I experimented with -O3 and PGO
> (based on tramp3d copies 4 times). And the final run is an experimental patch
> I have that replaces the iterative_hash with xxhash:
> https://github.com/Cyan4973/xxHash
> 
> # 1/5: sysdig (60M)
> dwz                   : 10.0
> dwz                   : 9.8 (98.7%)
> dwz-O2_lto            : 9.5 (95.6%)
> dwz-O3_lto            : 9.2 (91.9%)
> dwz-O3_lto_pgo        : 8.1 (81.3%)
> dwz-O3_lto_pgo_xxhash : 7.3 (72.9%)
> # 2/5: rtags (148M)
> dwz                   : 19.6
> dwz                   : 19.6 (99.9%)
> dwz-O2_lto            : 17.4 (89.0%)
> dwz-O3_lto            : 16.7 (85.4%)
> dwz-O3_lto_pgo        : 14.4 (73.6%)
> dwz-O3_lto_pgo_xxhash : 13.2 (67.6%)
> # 3/5: libetonyek (112M)
> dwz                   : 10.5
> dwz                   : 10.5 (100.6%)
> dwz-O2_lto            : 10.8 (102.8%)
> dwz-O3_lto            : 10.1 (96.7%)
> dwz-O3_lto_pgo        : 9.1 (87.4%)
> dwz-O3_lto_pgo_xxhash : 8.1 (77.1%)
> # 4/5: krita (685M)
> dwz                   : 133.7
> dwz                   : 134.3 (100.5%)
> dwz-O2_lto            : 95.3 (71.3%)
> dwz-O3_lto            : 91.2 (68.2%)
> dwz-O3_lto_pgo        : 78.9 (59.0%)
> dwz-O3_lto_pgo_xxhash : 71.6 (53.5%)
> # 5/5: gcc (1.2G)
> dwz                   : 61.9
> dwz                   : 61.9 (99.9%)
> dwz-O2_lto            : 58.5 (94.5%)
> dwz-O3_lto            : 56.6 (91.3%)
> dwz-O3_lto_pgo        : 54.1 (87.4%)
> dwz-O3_lto_pgo_xxhash : 51.7 (83.4%)
> 
> So as seen, using -O3 really helps; one gets a bigger binary, but as dwz is small
> it's negligible:
> 
> bloaty dwz-O3_lto -- dwz-O2_lto
>     FILE SIZE        VM SIZE
>  --------------  --------------
>    +28% +50.3Ki  [ = ]       0    .debug_loclists
>    +18% +25.3Ki   +18% +25.3Ki    .text
>    +12% +24.6Ki  [ = ]       0    .debug_info
>    +16% +17.3Ki  [ = ]       0    .debug_line
>    +31% +6.19Ki  [ = ]       0    .debug_rnglists
>    +11%    +689  [ = ]       0    .debug_abbrev
>   +7.1%    +633  [ = ]       0    .strtab
>   +5.5%    +504  +5.5%    +504    .eh_frame
>   +1.3%    +453  [ = ]       0    .debug_str
>   +0.8%    +375  +0.8%    +375    .rodata
>   +2.8%    +336  [ = ]       0    .symtab
>    +11%     +64  [ = ]       0    .debug_aranges
>   +4.2%     +64  +4.4%     +64    .eh_frame_hdr
>   [ = ]       0  +1.8%     +32    .bss
>   -3.1%     -21  -3.1%     -21    [LOAD #2 [RX]]
>  -61.0% -2.20Ki  [ = ]       0    [Unmapped]
>    +16%  +124Ki   +13% +26.2Ki    TOTAL
> 
> Then, PGO also helps significantly. And finally, using xxhash one can get a 5-10%
> improvement.
> 
> For now I'm suggesting using -O3 and PGO for our openSUSE package:
> https://build.opensuse.org/request/show/942235
> 
> Upstream questions I have:
> - What about replacing -O2 with -O3 by default?

Did you test that without -flto? If it still gets a ~5% speedup then I
like that idea. Or maybe we should also include -flto by default?
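
For reference, the -O3 over -O2 gain in the quoted numbers can be worked
out as below. This is only a minimal sketch: the timings are copied from
the dwz-O2_lto and dwz-O3_lto rows above, both of which were built with
LTO, so it does not answer the non-LTO question. The names `timings` and
`o3_speedup_percent` are made up for illustration.

```python
# Timings copied from the quoted dwz-O2_lto and dwz-O3_lto rows.
# The mail does not state the unit, and both builds use LTO.
timings = {
    # package: (O2+LTO time, O3+LTO time)
    "sysdig": (9.5, 9.2),
    "rtags": (17.4, 16.7),
    "libetonyek": (10.8, 10.1),
    "krita": (95.3, 91.2),
    "gcc": (58.5, 56.6),
}


def o3_speedup_percent(o2_time, o3_time):
    """Return how much less time the -O3 build took, as a percentage of the -O2 time."""
    return (1.0 - o3_time / o2_time) * 100.0


speedups = {pkg: o3_speedup_percent(o2, o3) for pkg, (o2, o3) in timings.items()}

for pkg, pct in speedups.items():
    print(f"{pkg:12s} {pct:4.1f}% less time with -O3")
```

On these rounded inputs the gain ranges from roughly 3% to 6.5%, which
is in line with the ~5% figure.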

> - Are you interested in the xxhash patch? Do you want it as a conditional build
>   or may I replace the currently existing hash function?

I think it is best to simply replace the existing hash function
instead of making it a conditional thing.

Does it rely on having the libxxhash dynamic library available or
would we simply embed a copy (replacing the hashtab.[ch] files)?

Cheers,

Mark

