From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx2.suse.de (mx2.suse.de [195.135.220.15])
 by sourceware.org (Postfix) with ESMTPS id 2616C3861028
 for ; Tue, 30 Mar 2021 09:42:45 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 2616C3861028
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.de
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=tdevries@suse.de
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.221.27])
 by mx2.suse.de (Postfix) with ESMTP id 334C7B1E7;
 Tue, 30 Mar 2021 09:42:44 +0000 (UTC)
Subject: [PATCH] Allow parallel multifile with -p -e
To: Jakub Jelinek
Cc: dwz@sourceware.org, mark@klomp.org
References: <20210326164049.GA29676@delia.home>
 <20210326164738.GW1179226@tucnak>
From: Tom de Vries
Message-ID:
Date: Tue, 30 Mar 2021 11:42:43 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.8.0
MIME-Version: 1.0
In-Reply-To: <20210326164738.GW1179226@tucnak>
Content-Type: multipart/mixed;
 boundary="------------058F853D7E4F1DC10684B763"
Content-Language: en-US
X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_DMARC_STATUS, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,
 SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
X-BeenThere: dwz@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Dwz mailing list
List-Unsubscribe: ,
List-Archive:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 30 Mar 2021 09:42:47 -0000

This is a multi-part message in MIME format.
--------------058F853D7E4F1DC10684B763
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

[ was: Re: [RFC] Allow parallel multifile with -p -e ]

On 3/26/21 5:47 PM, Jakub Jelinek wrote:
> On Fri, Mar 26, 2021 at 05:40:51PM +0100, Tom de Vries wrote:
>> The temporary multifile section contributions happen in random
>> order, so consequently the multifile layout will be different, and the
>> files referring to the multifile will be different.
>
> What I meant is that each fork should use different temporary filenames
> for the multifiles, once all childs are done, merge them (depends on how
> exactly is the work distributed among the forks, if e.g. for 4 forks
> first fork gets first quarter of files, second second quarter etc., then
> just merge them in the order, otherwise more work would be needed to make
> the merging reproduceable.

I tried this approach for a bit, but this unfortunately doesn't work
very easily.

The temp multifile contributions can contain DW_FORM_ref_addr refs.  If
we concatenate different temp multifiles, we invalidate those refs, and
they need to be fixed up when reading them in, which is cumbersome and
error-prone.

So I've gone the more conservative way: serialize multifile contribution.

Any comments?

Thanks,
- Tom

--------------058F853D7E4F1DC10684B763
Content-Type: text/x-patch; charset=UTF-8;
 name="0001-Enable-parallel-multifile-with-e-p.patch"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline;
 filename="0001-Enable-parallel-multifile-with-e-p.patch"

Enable parallel multifile with -e -p

Currently, parallel dwz is disabled when multifile is used:
...
$ dwz -m 5 3 1 2 4 -j 4
...

Enable this when the multifile parameter characteristics are specified using
-p and -e:
...
$ dwz -m 5 3 1 2 4 -j 4 -p 8 -e l
...

This works around the child processes having to communicate back to the parent
the found pointer size and endianness, and doing the -p auto and -e auto
consistency checking.

The problem of the different child processes writing to the same temporary
multifile is solved by writing in file order to the temporary multifile.

In principle, the problem could be solved by writing into per-file temporary
multifiles and concatenating them afterwards.  However, the temporary
multifile contains DW_FORM_ref_addr references, and after concatenation
those references in the temporary multifile require relocation (i.e., add the
offset of the start of the file contribution).  This requires a lot of
changes in various parts of the code, so for now we choose instead this
cleaner solution.

The enforcing of writing in file order is done by passing a token to each
child process in turn using pipes.

Experiment:
...
$ for j in 1 2; do \
    cp debug/cc1 1; cp 1 2; cp 2 3; cp 3 4; \
    echo "j: $j"; \
    ../measure/time.sh ./dwz -lnone -m 5 1 2 3 4 -j $j -p 8 -e l; \
  done
j: 1
maxmem: 1297260
real: 48.35
user: 46.93
system: 1.37
j: 2
maxmem: 1296584
real: 31.85
user: 50.46
system: 1.64
...

2021-03-30  Tom de Vries

        PR dwz/25951
        * dwz.c (write_multifile_1): Factor out of ...
        (write_multifile): ... here.  Call get_token.
        (get_token, pass_token): New functions.
        (wait_children_exit): Call pass_token.
        (dwz_files_1): Allow parallel multifile with -p -e.  Call pass_token.
---
 dwz.c | 152 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 147 insertions(+), 5 deletions(-)

diff --git a/dwz.c b/dwz.c
index 037cb75..92b7e5d 100644
--- a/dwz.c
+++ b/dwz.c
@@ -15059,7 +15059,7 @@ struct file_result
 /* Collect potentially shareable DIEs, strings and .debug_macro
    opcode sequences into temporary .debug_* files.
    */
 static int
-write_multifile (DSO *dso, struct file_result *res)
+write_multifile_1 (DSO *dso, struct file_result *res)
 {
   dw_cu_ref cu;
   bool any_cus = false;
@@ -15230,6 +15230,60 @@ write_multifile (DSO *dso, struct file_result *res)
   return ret;
 }
 
+static bool write_multifile_parallel_p;
+static int child_id;
+static int *pipes;
+
+/* Get token.  */
+static void
+get_token (void)
+{
+  int n = child_id;
+  int *base = &pipes[n * 2];
+  int readfd = base[0];
+  int writefd = base[1];
+  close (writefd);
+  char buf;
+  read (readfd, &buf, 1);
+  close (readfd);
+}
+
+/* Pass token to child N.  */
+static void
+pass_token (int n)
+{
+  int *base = &pipes[n * 2];
+  int readfd = base[0];
+  int writefd = base[1];
+  close (readfd);
+  char buf = '\0';
+  write (writefd, &buf, 1);
+  close (writefd);
+}
+
+/* Wrapper around write_multifile_1 that ensures write_multifile_1 is called
+   in file order.  */
+static int
+write_multifile (DSO *dso, struct file_result *res)
+{
+  int ret;
+
+  if (write_multifile_parallel_p)
+    {
+      get_token ();
+
+      multi_info_off = lseek (multi_info_fd, 0L, SEEK_END);
+      multi_abbrev_off = lseek (multi_abbrev_fd, 0L, SEEK_END);
+      multi_line_off = lseek (multi_line_fd, 0L, SEEK_END);
+      multi_str_off = lseek (multi_str_fd, 0L, SEEK_END);
+      multi_macro_off = lseek (multi_macro_fd, 0L, SEEK_END);
+    }
+
+  ret = write_multifile_1 (dso, res);
+
+  return ret;
+}
+
 /* During fi_multifile phase, see what DIEs in a partial unit
    contain no children worth keeping where all real DIEs have
    dups in the shared .debug_info section and what remains is
@@ -16472,6 +16526,11 @@ wait_child_exit (pid_t pid, pid_t *pids, int nr_pids,
   resa[i].ret = decode_child_exit_status (state, &resa[i]);
 }
 
+static int *workset;
+static int workset_size = 0;
+int current_multifile_owner = -1;
+int current_multifile_owner_file_idx = -1;
+
 /* Wait on exit of chilren in PIDS, update RESA.
    */
 static void
 wait_children_exit (pid_t *pids, int nr_files, struct file_result *resa)
@@ -16483,6 +16542,16 @@ wait_children_exit (pid_t *pids, int nr_files, struct file_result *resa)
       if (pids[i] == 0)
         continue;
       wait_child_exit (pids[i], &pids[i], 1, res);
+      if (current_multifile_owner_file_idx == -1
+          || i < current_multifile_owner_file_idx)
+        continue;
+      assert (i == current_multifile_owner_file_idx);
+      current_multifile_owner++;
+      if (current_multifile_owner == workset_size)
+        continue;
+      current_multifile_owner_file_idx
+        = workset[current_multifile_owner];
+      pass_token (current_multifile_owner);
     }
 }
 
@@ -16526,8 +16595,9 @@ dwz_files_1 (int nr_files, char *files[], bool hardlink,
   if (hardlink)
     hardlink = detect_hardlinks (nr_files, files, resa);
 
-  int workset[nr_files];
-  int workset_size = 0;
+  workset = malloc (nr_files * sizeof (int));
+  if (workset == NULL)
+    error (1, ENOMEM, "failed to allocate workset array");
   for (i = 0; i < nr_files; i++)
     {
       struct file_result *res = &resa[i];
@@ -16537,7 +16607,28 @@ dwz_files_1 (int nr_files, char *files[], bool hardlink,
       workset[workset_size] = i;
       workset_size++;
     }
-  bool initial_parallel_p = max_forks > 1 && multifile == NULL;
+
+  bool initial_parallel_p = max_forks > 1;
+  if (initial_parallel_p && multifile)
+    {
+      if (multifile_force_ptr_size != 0 && multifile_force_endian != 0)
+        {
+          write_multifile_parallel_p = true;
+          pipes = malloc (workset_size * 2 * sizeof (int));
+          if (pipes == NULL)
+            error (1, ENOMEM, "failed to allocate pipes array");
+          for (i = 0; i < workset_size; i++)
+            {
+              int fds[2];
+              if (pipe (fds) != 0)
+                error (1, ENOMEM, "failed to initialize pipe");
+              pipes[i * 2] = fds[0];
+              pipes[i * 2 + 1] = fds[1];
+            }
+        }
+      else
+        initial_parallel_p = false;
+    }
   if (initial_parallel_p)
     {
       pid_t pids[nr_files];
@@ -16550,7 +16641,17 @@ dwz_files_1 (int nr_files, char *files[], bool hardlink,
 
           if (nr_forks ==
               max_forks)
             {
-              wait_child_exit (-1, pids, i, resa);
+              if (multifile == NULL)
+                wait_child_exit (-1, pids, i, resa);
+              else
+                {
+                  int k = current_multifile_owner_file_idx;
+                  wait_child_exit (pids[k], &pids[k], 1, &resa[k]);
+                  current_multifile_owner++;
+                  current_multifile_owner_file_idx
+                    = workset[current_multifile_owner];
+                  pass_token (current_multifile_owner);
+                }
               nr_forks--;
             }
 
@@ -16558,6 +16659,7 @@ dwz_files_1 (int nr_files, char *files[], bool hardlink,
           assert (fork_res != -1);
           if (fork_res == 0)
             {
+              child_id = j;
               file = files[i];
               struct file_result *res = &resa[i];
               int thisret = dwz_with_low_mem (file, NULL, res);
@@ -16565,6 +16667,13 @@ dwz_files_1 (int nr_files, char *files[], bool hardlink,
             }
           else
             {
+              if (multifile && j == 0)
+                {
+                  current_multifile_owner = j;
+                  current_multifile_owner_file_idx
+                    = workset[current_multifile_owner];
+                  pass_token (current_multifile_owner);
+                }
               pids[i] = fork_res;
               nr_forks++;
             }
@@ -16608,6 +16717,14 @@ dwz_files_1 (int nr_files, char *files[], bool hardlink,
       return ret;
     }
 
+  if (write_multifile_parallel_p)
+    {
+      multi_info_off = lseek (multi_info_fd, 0L, SEEK_END);
+      multi_abbrev_off = lseek (multi_abbrev_fd, 0L, SEEK_END);
+      multi_line_off = lseek (multi_line_fd, 0L, SEEK_END);
+      multi_str_off = lseek (multi_str_fd, 0L, SEEK_END);
+      multi_macro_off = lseek (multi_macro_fd, 0L, SEEK_END);
+    }
   if (multi_info_off == 0 && multi_str_off == 0 && multi_macro_off == 0)
     {
       if (!quiet)
@@ -16615,6 +16732,31 @@ dwz_files_1 (int nr_files, char *files[], bool hardlink,
       return ret;
     }
 
+  if (write_multifile_parallel_p)
+    {
+      /* We reproduce here what happens when we run sequentially.  This is a
+         kludge that probably needs to be replaced by IPC.
+         */
+      for (i = 0; i < nr_files; i++)
+        {
+          struct file_result *res = &resa[i];
+          if (!res->low_mem_p && !res->skip_multifile && res->res >= 0)
+            {
+              int fd = open (files[i], O_RDONLY);
+              if (fd < 0)
+                return ret;
+              DSO *dso = fdopen_dso (fd, files[i]);
+              if (dso == NULL)
+                {
+                  close (fd);
+                  return ret;
+                }
+              assert (multi_ehdr.e_ident[0] == '\0');
+              multi_ehdr = dso->ehdr;
+              break;
+            }
+        }
+    }
+
   unsigned int multifile_die_count = 0;
   int multi_fd = optimize_multifile (&multifile_die_count);
   DSO *dso;

--------------058F853D7E4F1DC10684B763--
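
For readers who want to try the token-passing idea outside dwz, here is a
minimal standalone sketch.  It is not code from the patch: write_in_order and
the "child N" records are hypothetical names invented for illustration, and
unlike get_token/pass_token it checks the results of read and write.  Each
child blocks on its own pipe until the parent hands it a one-byte token, then
appends to a shared file descriptor; the parent passes the next token only
after the previous child has exited, so the records land in index order.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of dwz): fork NR_CHILDREN children that
   each append one record to OUT_FD, serialized in child-index order.
   Returns 0 on success, -1 on failure.  Error paths are kept minimal.  */
static int
write_in_order (int nr_children, int out_fd)
{
  if (nr_children <= 0)
    return -1;

  int (*pipes)[2] = malloc ((size_t) nr_children * sizeof *pipes);
  pid_t *pids = malloc ((size_t) nr_children * sizeof *pids);
  if (pipes == NULL || pids == NULL)
    return -1;

  /* One pipe per child, all created before the first fork.  */
  for (int i = 0; i < nr_children; i++)
    if (pipe (pipes[i]) != 0)
      return -1;

  for (int i = 0; i < nr_children; i++)
    {
      pids[i] = fork ();
      if (pids[i] == -1)
        return -1;
      if (pids[i] == 0)
        {
          /* Child: keep only the read end of its own pipe.  */
          for (int k = 0; k < nr_children; k++)
            {
              close (pipes[k][1]);
              if (k != i)
                close (pipes[k][0]);
            }

          /* Work that may run in parallel would go here.  */

          /* Block until the parent passes the token.  */
          char token;
          if (read (pipes[i][0], &token, 1) != 1)
            _exit (1);

          /* Serialized section: append to the shared descriptor.  */
          char rec[32];
          int len = snprintf (rec, sizeof rec, "child %d\n", i);
          if (write (out_fd, rec, (size_t) len) != len)
            _exit (1);
          _exit (0);
        }
    }

  /* Parent: pass the token to child I, wait for it to exit, move on.  */
  int ret = 0;
  for (int i = 0; i < nr_children; i++)
    {
      char token = '\0';
      if (write (pipes[i][1], &token, 1) != 1)
        ret = -1;
      close (pipes[i][0]);
      close (pipes[i][1]);

      int status;
      if (waitpid (pids[i], &status, 0) != pids[i]
          || !WIFEXITED (status) || WEXITSTATUS (status) != 0)
        ret = -1;
    }

  free (pipes);
  free (pids);
  return ret;
}
```

Calling write_in_order (4, fd) with fd referring to a pipe or a freshly
opened file should leave the four records in index order regardless of how
the children are scheduled.  The sketch waits for every child in turn, which
is simpler than the patch, where passing the token is interleaved with the
-j fork limit.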