Subject: Re: Split DWARF and rnglists, gcc vs clang
To: Mark Wielaard
Cc: "gdb@sourceware.org", gcc@gcc.gnu.org, "gdb-patches@sourceware.org"
References: <20201113001143.GA2654@wildebeest.org> <3a747adc-af41-7492-de36-b357e5429fff@polymtl.ca> <55c0f2f69f1e6ad5d008665d004d629ad62ab65f.camel@klomp.org>
From: Simon Marchi
Date: Fri, 13 Nov 2020 10:41:37 -0500
In-Reply-To: <55c0f2f69f1e6ad5d008665d004d629ad62ab65f.camel@klomp.org>

On 2020-11-13 10:18 a.m., Mark Wielaard wrote:
> That too, but I was actually referring to the sections that define
> Range List and Location List Tables (7.28 and 7.29) which define the
> meaning of DW_AT_rnglists_base and DW_AT_loclists_base. But you could
> also look at 3.1.3 Split Full Compilation Unit Entries which says that
> those base attributes are inherited from the corresponding skeleton
> compilation unit for a split unit.

Hmm, indeed, if we interpret that sentence in 3.1.3 to the letter, it
suggests that the DW_FORM_rnglistx attributes in the DWO are meant to
point into the linked file's .debug_rnglists.  Otherwise, inheriting
DW_AT_rnglists_base wouldn't be meaningful.

But when DWO files use a .debug_rnglists.dwo section, it doesn't make
sense to consider the inherited DW_AT_rnglists_base.

So in the end, the logical thing to do when encountering a
DW_FORM_rnglistx in a split unit, in order to support everybody, is
probably to go to the .debug_rnglists.dwo section, if there is one,
disregarding the (inherited) DW_AT_rnglists_base.  If there isn't one,
then try the linked file's .debug_rnglists section, using
DW_AT_rnglists_base.  If that section doesn't exist either, then
something is malformed.

>> What I understand from this is that the rnglist class and
>> DW_AT_rnglists_base attribute help reduce the number of relocations in
>> the non-split case (it removes the need for relocations from
>> DW_AT_ranges attribute values in .debug_info to .debug_rnglists). I
>> don't understand it as saying anything about where to put the rnglist
>> data in the split-unit case.
>
> I interpreted it as when there is a base attribute in the (skeleton)
> unit, then the corresponding section (index table) can be found in the
> main object file.

That doesn't work with how clang produces it, AFAIU.  There is a
DW_AT_rnglists_base attribute in the skeleton and a .debug_rnglists in
the linked file, which is used for the skeleton's DW_AT_ranges
attribute.  And there are also .debug_rnglists.dwo sections in the DWO
files.  So DW_FORM_rnglistx values in the skeleton use the
.debug_rnglists in the linked file, while the DW_FORM_rnglistx values in
the DWO file use the .debug_rnglists.dwo in that file (even though there
is a DW_AT_rnglists_base in the skeleton).

> At least that is how elfutils libdw interprets the
> base attributes, not just for rnglists_base, but also str_offsets_base,
> addr_base, etc. And that is also how/when GCC emits them.
>
>>> So I believe both encodings are valid according to the spec. It just
>>> depends on what you are optimizing for, small main object file size or
>>> smallest encoding with least number of indirections.
>>
>> So, if I understand correctly, gcc's way of doing things (putting all
>> the rnglists in a common .debug_rnglists section) reduces the overall
>> size of debug info since the rnglists can use the direct addressing
>> rnglists entries (e.g. DW_RLE_start_end) rather than the indirect ones
>> (e.g. DW_RLE_startx_endx). But this come at the expense of a lot of
>> relocations in the rnglists themselves, since they refer to addresses
>> directly.
>
> Yes, and it reduces the number of .debug_addr entries that need
> relocations.
>
>> I thought that the main point of split-units was to reduce the number of
>> relocations processed by the linker and data moved around by the linker,
>> to reduce link time and provide a better edit-build-debug cycle. Is
>> that the case?
>
> I think it depends on who exactly you ask and what their specific
> goals/setups are. Both things, reducing the number of relocations and
> moving data out of the main object file, are independently useful in
> different context. But I think it is mainly reducing the number of
> relocations that is beneficial. For example clang (but not yet gcc)
> supports having the .dwo sections themselves in the main object file
> (using SHF_EXCLUDED for the .dwo sections, so the linker will still
> skip them). Which is also a possibility that the spec describes and
> which really makes split DWARF much more usable, because then you don't
> need to change your build system to deal with multiple output files.

Not sure I understand.  Does that mean that the .dwo sections are
emitted in the .o files, and that's the end of the road for them?  Do
the DW_AT_dwo_name attributes of the skeletons then refer to the .o
files?

Simon
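
For illustration only, the lookup order suggested earlier in this
message for a DW_FORM_rnglistx found in a split unit could be sketched
roughly as below.  This is not how gdb, elfutils or any other consumer
is implemented: the function name, its parameters and the use of plain
sets of section names are all hypothetical, and treating a missing
DW_AT_rnglists_base as malformed is an assumption of the sketch.

```python
from typing import Collection, Optional, Tuple


def pick_rnglists_source(
    dwo_sections: Collection[str],
    linked_sections: Collection[str],
    inherited_rnglists_base: Optional[int],
) -> Tuple[str, Optional[int]]:
    """Choose where a DW_FORM_rnglistx in a split unit should be resolved.

    Hypothetical helper: returns (section name, rnglists base to apply).
    A base of None means the inherited DW_AT_rnglists_base is disregarded.
    """
    # 1. Prefer the DWO file's own section, ignoring the inherited base.
    if ".debug_rnglists.dwo" in dwo_sections:
        return (".debug_rnglists.dwo", None)

    # 2. Otherwise fall back to the linked file's section, which is
    #    indexed through the DW_AT_rnglists_base from the skeleton.
    if ".debug_rnglists" in linked_sections:
        if inherited_rnglists_base is None:
            # Assumption of this sketch: nothing to inherit is an error.
            raise ValueError("malformed: no DW_AT_rnglists_base to inherit")
        return (".debug_rnglists", inherited_rnglists_base)

    # 3. Neither section exists, so the debug info is malformed.
    raise ValueError("malformed: no range list section for DW_FORM_rnglistx")
```

With clang-style output, where the DWO file carries its own section,
pick_rnglists_source({".debug_rnglists.dwo"}, {".debug_rnglists"}, 0xc)
selects ".debug_rnglists.dwo" and drops the base.  With gcc-style
output, where the DWO file has no such section, the same call with
{".debug_info.dwo"} as the first argument selects ".debug_rnglists"
together with the inherited base.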