Subject: Re: Split DWARF and rnglists, gcc vs clang
From: Simon Marchi
To: Mark Wielaard
Cc: gdb@sourceware.org, gcc@gcc.gnu.org, gdb-patches@sourceware.org
Date: Fri, 13 Nov 2020 09:45:24 -0500
Message-ID: <3a747adc-af41-7492-de36-b357e5429fff@polymtl.ca>
In-Reply-To: <20201113001143.GA2654@wildebeest.org>
References: <20201113001143.GA2654@wildebeest.org>

On 2020-11-12 7:11 p.m., Mark Wielaard wrote:
> Hi Simon,
>
> On Thu, Nov 05, 2020 at 11:11:43PM -0500, Simon Marchi wrote:
>> I'm currently squashing some bugs related to .debug_rnglists in GDB, and
>> I happened to notice that clang and gcc do different things when
>> generating rnglists with split DWARF.  I'd like to know if the two
>> behaviors are acceptable, and therefore if we need to make GDB accept
>> both.  Or maybe one of them is not doing things correctly and would need
>> to be fixed.
>>
>> clang generates a .debug_rnglists.dwo section in the .dwo file.  Any
>> DW_FORM_rnglistx attribute in the DWO refers to that section.  That
>> section is not shared with any other object, so DW_AT_rnglists_base is
>> never involved for these attributes.  Note that there might still be a
>> DW_AT_rnglists_base on the DW_TAG_skeleton_unit, in the linked file,
>> used if the skeleton itself has an attribute of form DW_FORM_rnglistx.
>> This rnglist would be found in a .debug_rnglists section in the linked
>> file, shared with the other units of the linked file.
>>
>> gcc generates a single .debug_rnglists in the linked file and no
>> .debug_rnglists.dwo in the DWO files.  So when an attribute has form
>> DW_FORM_rnglistx in a DWO file, I presume we need to do the lookup in
>> the .debug_rnglists section in the linked file, using the
>> DW_AT_rnglists_base attribute found in the corresponding skeleton unit.
>> This looks vaguely similar to how it was done pre-DWARF 5, with
>> DW_AT_GNU_ranges_base.
>>
>> So, is gcc wrong here?  I don't see anything in the DWARF 5 spec
>> prohibiting to do it like gcc does, but clang's way of doing it sounds
>> more in-line with the intent of what's described in the DWARF 5 spec.
>> So I wonder if it's maybe an oversight or a misunderstanding between the
>> two compilers.
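To make the two lookups concrete, here is a rough sketch of how a consumer might resolve a DW_FORM_rnglistx index in each layout.  It is not GDB's actual code: the function name and parameters are made up for illustration, and it assumes 32-bit DWARF, little-endian data, and (for the clang layout) that the base is implied to sit just past the header of the single contribution in .debug_rnglists.dwo.

```python
import struct
from typing import Optional

# Size of a 32-bit DWARF 5 .debug_rnglists header: unit_length (4),
# version (2), address_size (1), segment_selector_size (1),
# offset_entry_count (4).  Assumption: 32-bit DWARF only.
RNGLISTS_HEADER_SIZE = 12


def resolve_rnglistx(section: bytes, index: int,
                     rnglists_base: Optional[int] = None) -> int:
    """Return the section offset of range list number `index`.

    clang layout: `section` is .debug_rnglists.dwo and `rnglists_base`
    is None, so the offsets table is assumed to start right after the
    header of the only contribution.

    gcc layout: `section` is the linked file's .debug_rnglists and
    `rnglists_base` is the DW_AT_rnglists_base value taken from the
    skeleton unit.
    """
    base = RNGLISTS_HEADER_SIZE if rnglists_base is None else rnglists_base
    # offset_entry_count is the last header field, right before the table.
    (count,) = struct.unpack_from("<I", section, base - 4)
    if not 0 <= index < count:
        raise IndexError("rnglistx index %d out of range (%d entries)"
                         % (index, count))
    # Each table slot holds an offset relative to the start of the table.
    (relative,) = struct.unpack_from("<I", section, base + 4 * index)
    return base + relative
```

The only difference between the two cases is where the base comes from, which is why a consumer that wants to accept both outputs mostly needs to know which section to read and whether the skeleton's DW_AT_rnglists_base applies.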
>
> I think I would have asked the question the other way around :) The
> spec explicitly describes rnglists_base (and loclists_base) as a way
> to reference ranges (loclists) through the index table, so that the
> only relocation you need is in the (skeleton) DIE.

I presume you are referring to this non-normative text in section 2.17.3?

    This range list representation, the rnglist class, and the related
    DW_AT_rnglists_base attribute are new in DWARF Version 5.  Together
    they eliminate most or all of the object language relocations
    previously needed for range lists.

What I understand from this is that the rnglist class and
DW_AT_rnglists_base attribute help reduce the number of relocations in
the non-split case (they remove the need for relocations from
DW_AT_ranges attribute values in .debug_info to .debug_rnglists).  I
don't understand it as saying anything about where to put the rnglist
data in the split-unit case.

> But the rnglists
> (loclists) themselves can still use relocations. A large part of them
> is non-shared addresses, so using indexes (into the .debug_addr
> addr_base) would simply be extra overhead. The relocations will
> disappear once linked, but the index tables won't.
>
> As an alternative, if you like to minimize the amount of debug data in
> the main object file, the spec also describes how to put a whole
> .debug_rnglists.dwo (or .debug_loclists.dwo) in the split dwarf
> file. Then you cannot use all entry encodings and do need to use an
> .debug_addr index to refer to any addresses in that case. So the
> relocations are still there, you just refer to them through an extra
> index indirection.
>
> So I believe both encodings are valid according to the spec. It just
> depends on what you are optimizing for, small main object file size or
> smallest encoding with least number of indirections.
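As a back-of-the-envelope illustration of that trade-off, the sketch below compares the encoded size of one direct entry with one indexed entry.  The helper names are hypothetical, and the numbers only cover the range list entry itself: an indexed entry additionally relies on slots in .debug_addr (one address, and one relocation in the object file, per slot that is not already there for another purpose).

```python
def uleb128_len(value: int) -> int:
    """Number of bytes needed to encode a non-negative value as ULEB128."""
    length = 1
    while value >= 0x80:
        value >>= 7
        length += 1
    return length


def start_end_size(address_size: int = 8) -> int:
    """DW_RLE_start_end: 1 opcode byte plus two full addresses.

    Both addresses need a relocation in the relocatable object file.
    """
    return 1 + 2 * address_size


def startx_endx_size(start_index: int, end_index: int) -> int:
    """DW_RLE_startx_endx: 1 opcode byte plus two ULEB128 indices.

    No relocation in the range list itself; the addresses (and their
    relocations) live in .debug_addr instead.
    """
    return 1 + uleb128_len(start_index) + uleb128_len(end_index)
```

For instance, with 8-byte addresses start_end_size() gives 17 bytes, while startx_endx_size(3, 200) gives 4 bytes plus whatever the two .debug_addr slots cost.  Whether the indexed form ends up larger overall depends on how many of those addresses are shared with other users of .debug_addr.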
So, if I understand correctly, gcc's way of doing things (putting all
the rnglists in a common .debug_rnglists section) reduces the overall
size of debug info, since the rnglists can use the direct-addressing
rnglists entries (e.g. DW_RLE_start_end) rather than the indirect ones
(e.g. DW_RLE_startx_endx).  But this comes at the expense of a lot of
relocations in the rnglists themselves, since they refer to addresses
directly.  I thought that the main point of split units was to reduce
the number of relocations processed by the linker and the amount of data
it moves around, to reduce link time and provide a better
edit-build-debug cycle.  Is that the case?

Anyway, regardless of the intent, the spec should ideally be clear about
that so we don't have to guess.

> P.S. I am really interested in these interpretations of DWARF, but I
> don't really follow the gdb implementation details very much. Could we
> maybe move discussions like these from the -patches list to the main
> gdb (or gcc) mailinglist?

Sure, I added gdb@ and gcc@.  I also left gdb-patches@ so that it's
possible to follow the discussion there.

Simon