From: Tom de Vries
To: gdb-cvs@sourceware.org
Subject: [binutils-gdb] [gdb/symtab] Use task size in parallel_for_each in dwarf2_build_psymtabs_hard
Date: Fri, 5 Aug 2022 14:13:12 +0000 (GMT)
Message-Id: <20220805141312.26A4538582BE@sourceware.org>
X-Act-Checkin: binutils-gdb
X-Git-Author: Tom de Vries
X-Git-Refname: refs/heads/master
X-Git-Oldrev: b859a3ef488cd1a3bf072f71002dced36353875b
X-Git-Newrev: b069b588cfe10e6bf20ed723cf796686ba4fc0dc

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=b069b588cfe10e6bf20ed723cf796686ba4fc0dc

commit b069b588cfe10e6bf20ed723cf796686ba4fc0dc
Author: Tom de Vries
Date:   Fri Aug 5 16:12:56 2022 +0200

    [gdb/symtab] Use task size in parallel_for_each in dwarf2_build_psymtabs_hard

    In dwarf2_build_psymtabs_hard, we use a parallel_for_each to distribute CUs
    over threads.

    Ensuring a fair distribution over the worker threads and main thread in terms
    of number of CUs might not be the most efficient way, given that CUs can vary
    in size.

    Fix this by using per_cu->get_length () as the task size.

    I've used this experiment to verify the performance impact:
    ...
    $ for n in $(seq 1 10); do \
        time gdb -q -batch ~/firefox/libxul.so-93.0-1.1.x86_64.debug \
        2>&1 \
        | grep "real:"; \
      done
    ...
    and without the patch got:
    ...
    real: 4.71
    real: 4.88
    real: 4.29
    real: 4.30
    real: 4.65
    real: 4.27
    real: 4.27
    real: 4.27
    real: 4.75
    real: 4.41
    ...
    and with the patch:
    ...
    real: 3.68
    real: 3.81
    real: 3.80
    real: 3.68
    real: 3.75
    real: 3.69
    real: 3.69
    real: 3.74
    real: 3.67
    real: 3.74
    ...
    so that seems a reasonable improvement.

    With parallel_for_each_debug set to true, we get some more detail about
    the difference in behaviour.  Without the patch we have:
    ...
    Parallel for: n_elements: 2818
    Parallel for: minimum elements per thread: 1
    Parallel for: elts_per_thread: 704
    Parallel for: elements on worker thread 0 : 705
    Parallel for: elements on worker thread 1 : 705
    Parallel for: elements on worker thread 2 : 704
    Parallel for: elements on worker thread 3 : 0
    Parallel for: elements on main thread : 704
    ...
    and with the patch:
    ...
    Parallel for: n_elements: 2818
    Parallel for: total_size: 1483674865
    Parallel for: size_per_thread: 370918716
    Parallel for: elements on worker thread 0 : 752 (size: 371811790)
    Parallel for: elements on worker thread 1 : 360 (size: 371509370)
    Parallel for: elements on worker thread 2 : 1130 (size: 372681710)
    Parallel for: elements on worker thread 3 : 0 (size: 0)
    Parallel for: elements on main thread : 576 (size: 367671995)
    ...

    Tested on x86_64-linux.

Diff:
---
 gdb/dwarf2/read.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index f03151983dc..348fbe12da1 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -7085,6 +7085,13 @@ dwarf2_build_psymtabs_hard (dwarf2_per_objfile *per_objfile)
 
     using iter_type = decltype (per_bfd->all_comp_units.begin ());
 
+    auto task_size_ = [] (iter_type iter)
+      {
+        dwarf2_per_cu_data *per_cu = iter->get ();
+        return (size_t)per_cu->length ();
+      };
+    auto task_size = gdb::make_function_view (task_size_);
+
     /* Each thread returns a pair holding a cooked index, and a vector
        of errors that should be printed.  The latter is done because
        GDB's I/O system is not thread-safe.  run_on_main_thread could be
@@ -7113,7 +7120,7 @@ dwarf2_build_psymtabs_hard (dwarf2_per_objfile *per_objfile)
               }
           }
         return result_type (thread_storage.release (), std::move (errors));
-      });
+      }, task_size);
 
     /* Only show a given exception a single time.  */
     std::unordered_set seen_exceptions;
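
For readers who want to see the idea of size-based distribution in isolation, here is a minimal standalone sketch. It is not GDB's parallel_for_each implementation: the helper name split_by_size, its signature, and the greedy "fill each chunk up to total_size / n_chunks" rule are assumptions made for illustration only. It shows how a per-element size, such as the one the task_size callback in the diff returns, can drive where a range is split instead of the element count.

```cpp
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper (illustration only, not GDB code): split the index
// range [0, sizes.size ()) into at most N_CHUNKS contiguous half-open
// ranges so that each range carries roughly total_size / n_chunks units
// of work, instead of an equal number of elements.
std::vector<std::pair<std::size_t, std::size_t>>
split_by_size (const std::vector<std::size_t> &sizes, std::size_t n_chunks)
{
  std::vector<std::pair<std::size_t, std::size_t>> chunks;
  if (n_chunks == 0)
    return chunks;

  std::size_t total_size
    = std::accumulate (sizes.begin (), sizes.end (), std::size_t (0));
  std::size_t size_per_chunk = total_size / n_chunks;

  std::size_t first = 0;
  for (std::size_t i = 0; i + 1 < n_chunks && first < sizes.size (); ++i)
    {
      // Greedily take elements until this chunk reaches its size budget.
      // Always take at least one element so no chunk in this loop is empty.
      std::size_t end = first;
      std::size_t chunk_size = 0;
      while (end < sizes.size ()
             && (end == first || chunk_size < size_per_chunk))
        chunk_size += sizes[end++];
      chunks.emplace_back (first, end);
      first = end;
    }

  // Whatever is left goes to the last chunk.
  if (first < sizes.size ())
    chunks.emplace_back (first, sizes.size ());
  return chunks;
}
```

As a usage example, with sizes {70, 10, 10, 10, 10, 10, 10, 10} and two chunks, the sketch returns the ranges [0, 1) and [1, 8), each carrying 70 units of work. A split by element count would instead give [0, 4) and [4, 8), with 100 and 40 units respectively.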