From: Alexander Monakov
To: Martin Liška
Cc: Richard Biener, GCC Patches, Jan Hubicka
Subject: Re: [PATCH] lto-plugin: add support for feature detection
Date: Fri, 6 May 2022 17:46:35 +0300 (MSK)
Message-ID: <80f37f2-efdf-673-a8f4-69f2d5842ea2@ispras.ru>
References: <63633ead-aa7e-c424-9851-ac332ac13df3@suse.cz> <27841a42-baef-d53e-c601-ad265030854d@suse.cz>
List-Id: Gcc-patches mailing list

On Thu, 5 May 2022, Martin Liška wrote:

> On 5/5/22 12:52, Alexander Monakov wrote:
> > Feels a bit weird to ask, but before entertaining such an API extension,
> > can we step back and understand the v3 variant of get_symbols?
> > It is not documented, and from what little I saw I did not get the
> > "motivation" for its existence (what it is doing that couldn't be done
> > with the v2 api).
>
> Please see here:
> https://github.com/rui314/mold/issues/181#issuecomment-1037927757

Thanks. I've also re-read [1] and [2] which provided some relevant ideas.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86490
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=23411

OK, so the crux of the issue is that sometimes the linker needs to feed the
compiler plugin with LTO .o files extracted from static archives. This is not
really obvious, because normally .a archives have an index that enumerates
symbols defined/used by its .o files, and even during LTO the linker can
simply consult the index to find out which members to extract. In theory, at
least.

The theory breaks in the following cases:

 - ld.bfd and common symbols (I wonder if weak/comdat code is also affected?):
   the archive index does not indicate which definitions are common, so ld.bfd
   extracts the member and feeds it to the plugin to find out;

 - ld.gold and emulated archives via --start-lib a.o b.o ... --end-lib: here
   there's no index to consult and ld.gold feeds each .o to the plugin.

In those cases it may happen that the linker extracts an .o file that would
not be extracted during a non-LTO link, and if that happens, the linker needs
to inform the plugin. This is not the same as marking each symbol from a
spuriously extracted .o file as PREEMPTED when the .o file has constructors
(the plugin will assume the constructors are kept while the linker needs to
discard them).

So get_symbols_v3 allows the linker to discard an LTO .o file to solve this.

In the absence of get_symbols_v3, mold tries to ensure correctness by
restarting itself while appending a list of .o files to be discarded to its
command line.

I wonder if mold can invoke the plugin cleanup callback to solve this without
restarting.
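
To make the constructor problem concrete, here is a toy model in Python. It
is only a sketch of the reasoning above, not the real plugin API: the
LtoObject class, its fields, and both function names are hypothetical. The
first function models marking every symbol of a spurious .o file as
preempted, the second models the linker dropping the whole file, as
get_symbols_v3 is described to allow.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LtoObject:
    """Hypothetical stand-in for an LTO .o file handed to the plugin."""
    name: str
    symbols: List[str]
    constructors: List[str] = field(default_factory=list)
    # True if a non-LTO link would also have extracted this file.
    needed: bool = True


def survivors_symbol_marking(objs: List[LtoObject]) -> Tuple[List[str], List[str]]:
    """Model of per-symbol marking: symbols of spurious files are dropped,
    but (per the description above) their constructors are assumed kept."""
    symbols: List[str] = []
    ctors: List[str] = []
    for obj in objs:
        if obj.needed:
            symbols.extend(obj.symbols)
        # Assumption of this model: nothing tells the plugin that the
        # whole file was unwanted, so its constructors stay in.
        ctors.extend(obj.constructors)
    return symbols, ctors


def survivors_file_discard(objs: List[LtoObject]) -> Tuple[List[str], List[str]]:
    """Model of whole-file discard: a spurious file contributes nothing."""
    symbols: List[str] = []
    ctors: List[str] = []
    for obj in objs:
        if not obj.needed:
            continue  # file dropped, constructors included
        symbols.extend(obj.symbols)
        ctors.extend(obj.constructors)
    return symbols, ctors


if __name__ == "__main__":
    objs = [
        LtoObject("a.o", ["main"], ["init_a"]),
        LtoObject("b.o", ["unused"], ["init_b"], needed=False),
    ]
    print("symbol marking:", survivors_symbol_marking(objs))
    print("file discard:  ", survivors_file_discard(objs))
```

In this model only the whole-file discard keeps init_b out of the link, which
is the difference between the two approaches that the paragraphs above
describe.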

(also, hm, it seems to confirm my idea that LTO .o files should have had the
correct symbol table so normal linker algorithms would work)

Hopefully this was useful.
Alexander