public inbox for dwz@sourceware.org
From: "vries at gcc dot gnu.org" <sourceware-bugzilla@sourceware.org>
To: dwz@sourceware.org
Subject: [Bug default/25951] support for parallel processing?
Date: Wed, 10 Mar 2021 10:19:44 +0000
Message-ID: <bug-25951-11298-DNZXrIqgpJ@http.sourceware.org/bugzilla/>
In-Reply-To: <bug-25951-11298@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=25951

--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
Created attachment 13297
  --> https://sourceware.org/bugzilla/attachment.cgi?id=13297&action=edit
Demonstrator patch

This demonstrator patch implements a simple form of multithreading, which only
works without:
- multifile (-m)
- hardlink (-h)
- low-mem limit 0 (-l0)

If a file hits the low-mem limit during the parallel phase, it's rerun in
low-mem mode after the parallel phase.
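
A minimal sketch of that two-phase scheme, written in Python rather than dwz's
C.  It assumes a hypothetical per-file worker compress(path, low_mem) that
returns False when the file hits the low-mem limit in regular mode; none of
these names come from the patch.

```python
from concurrent.futures import ThreadPoolExecutor


def run_two_phase(paths, compress, max_workers=4):
    """Process files in parallel, then retry the low-mem cases serially.

    `compress(path, low_mem)` is a hypothetical worker: it returns True on
    success and False if the file hit the low-mem limit in regular mode.
    `paths` must be a list, since it is traversed twice.
    """
    # Phase 1: regular mode, in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda p: compress(p, False), paths))

    # Phase 2: rerun the files that hit the limit, one at a time,
    # in low-mem mode, after the parallel phase has finished.
    deferred = [p for p, ok in zip(paths, results) if not ok]
    for path in deferred:
        compress(path, True)
    return deferred
```

Deferring the low-mem reruns until the pool has drained means that at most one
low-mem file is being processed at any time.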

It passes the test-suite.  There is only one thread-sanitizer warning left, for
multiple assignment of dwz_oom to obstack_alloc_failed_handler.

I did a build of the libreoffice package on openSUSE with dwz disabled,
harvested the resulting .debug files (in total 175 files, 685MB), and did a dwz
run (without multifile) using those files.

With master:
...
maxmem: 714956
real: 17.77
user: 15.76
system: 0.50
...

With the patch on top of master:
...
maxmem: 1106516
real: 10.37
user: 20.59
system: 1.46
...

So the trade-off is as expected: less wall-clock time (real drops from 17.77
to 10.37), but higher peak memory (maxmem rises from 714956 to 1106516).
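
For reference, the ratios implied by the two measurements can be worked out as
follows.  The dictionaries just restate the numbers quoted above; the variable
names are only for illustration.

```python
master = {"maxmem": 714956, "real": 17.77, "user": 15.76, "system": 0.50}
patched = {"maxmem": 1106516, "real": 10.37, "user": 20.59, "system": 1.46}

# Wall-clock improvement: old real time divided by new real time.
speedup = master["real"] / patched["real"]
# Peak memory growth: new maxmem divided by old maxmem.
mem_growth = patched["maxmem"] / master["maxmem"]
# Total CPU time (user + system) growth.
cpu_growth = (patched["user"] + patched["system"]) / (
    master["user"] + master["system"]
)

print(f"wall-clock speedup: {speedup:.2f}x")
print(f"peak memory growth: {mem_growth:.2f}x")
print(f"CPU time growth:    {cpu_growth:.2f}x")
```

That is roughly a 1.71x wall-clock speedup, for about 1.55x the peak memory and
1.36x the total CPU time.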

However, dwz has the low-mem mode precisely to keep memory usage in check, so
that it can handle relatively large files even on 32-bit systems.  On those
systems the trade-off may not be advantageous.  We could address this by not
enabling parallel processing there.
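
One way such a guard could look, sketched in Python.  Using the pointer width
as the criterion is an assumption made here for illustration; the comment does
not say how "such systems" would be detected, and parallel_allowed is a
hypothetical name.

```python
import struct


def parallel_allowed(pointer_bits=None):
    """Return True if parallel processing should be enabled.

    Assumption: a system counts as memory-constrained when its pointers
    are narrower than 64 bits.
    """
    if pointer_bits is None:
        # Size of a native pointer ("P") in bytes, converted to bits.
        pointer_bits = struct.calcsize("P") * 8
    return pointer_bits >= 64
```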

OTOH, we could also spawn processes instead of threads.  That means the
per-process peak memory does not increase.  It would also mean less messy code
changes (not having to use __thread all over the place).

An initial version that doesn't handle multifile (like this demonstrator
patch) wouldn't need many changes.  A version that supports multifile would
need a switch to indicate the location of the dwz.debug_info etc. files.
So, something like:
...
$ dwz -m 3 1 2
 create temp dir /tmp/abcdef
 spawn dwz 1 --multifile-dir /tmp/abcdef
 spawn dwz 2 --multifile-dir /tmp/abcdef
 wait for 2 spawned processes to finish ...
 spawned dwz 1 - compressing
 spawned dwz 2 - compressing
 spawned dwz 1 - multifile write (using dir /tmp/abcdef)
 spawned dwz 2 - multifile write (using dir /tmp/abcdef)
 spawned dwz 1 - done
 spawned dwz 2 - done
 waiting done
 multifile optimize (using files in /tmp/abcdef)
 multifile read
 multifile finalize 1
 multifile finalize 2
...
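
The spawn-and-wait part of the trace above could be sketched like this, in
Python for brevity.  Note that --multifile-dir is only the switch proposed in
this comment, not an existing dwz option, and the function names are
hypothetical.  The multifile optimize/read/finalize steps are left as a
comment.

```python
import subprocess
import tempfile


def spawn_commands(files, multifile_dir, dwz="dwz"):
    """Build one child command line per input file.

    `--multifile-dir` is the switch proposed in this comment; it is not an
    existing dwz option.
    """
    return [[dwz, path, "--multifile-dir", multifile_dir] for path in files]


def run_children(commands):
    """Start all children first, then wait for each; return exit codes."""
    children = [subprocess.Popen(cmd) for cmd in commands]
    return [child.wait() for child in children]


def parallel_compress(files, dwz="dwz"):
    # Temp dir standing in for /tmp/abcdef in the trace above.
    with tempfile.TemporaryDirectory() as multifile_dir:
        codes = run_children(spawn_commands(files, multifile_dir, dwz))
        # The multifile optimize / read / finalize steps would run here,
        # in the parent, while the temp dir still exists.
        return codes
```

Starting every child before waiting on any of them is what lets the
per-file compression steps overlap.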


