From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp.polymtl.ca (smtp.polymtl.ca [132.207.4.11]) by sourceware.org (Postfix) with ESMTPS id 33CFE385E445 for ; Thu, 10 Dec 2020 17:41:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 33CFE385E445 Received: from simark.ca (simark.ca [158.69.221.121]) (authenticated bits=0) by smtp.polymtl.ca (8.14.7/8.14.7) with ESMTP id 0BAHewqK015899 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 10 Dec 2020 12:41:03 -0500 DKIM-Filter: OpenDKIM Filter v2.11.0 smtp.polymtl.ca 0BAHewqK015899 Received: from [10.0.0.11] (173-246-6-90.qc.cable.ebox.net [173.246.6.90]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id 2E5811E552; Thu, 10 Dec 2020 12:40:58 -0500 (EST) Subject: Re: [PATCH 3/4] gdb/dwarf: don't enqueue CU in maybe_queue_comp_unit if already expanded To: Tom Tromey , Simon Marchi via Gdb-patches Cc: nilsgladitz@gmail.com References: <20201117191231.2712629-1-simon.marchi@polymtl.ca> <20201117191231.2712629-4-simon.marchi@polymtl.ca> <87eejyr7uq.fsf@tromey.com> From: Simon Marchi Message-ID: <59300a21-12f4-1e6b-b635-c57f82f85ea4@polymtl.ca> Date: Thu, 10 Dec 2020 12:40:57 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.4.3 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Poly-FromMTA: (simark.ca [158.69.221.121]) at Thu, 10 Dec 2020 17:40:58 +0000 X-Spam-Status: No, score=-10.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Dec 2020 17:41:07 -0000 On 2020-12-10 9:52 a.m., Simon Marchi via Gdb-patches wrote: > > > On 2020-12-09 4:24 p.m., Tom Tromey wrote: >> Re-reading this code, I realized again that the return value of this >> function does not really make sense to me. The intro says: >> >> The result is non-zero if PER_CU was queued, otherwise the result is zero >> meaning either PER_CU is already queued or it is already loaded. >> >> ... but it seems unlikely for callees to want to detect that just this >> call caused the enqueueing. > > Ok, re-reading that comment, I think I understand things a bit differently > than I did previously: > > /* If PER_CU is not yet queued, add it to the queue. > If DEPENDENT_CU is non-NULL, it has a reference to PER_CU so add a > dependency. > The result is non-zero if PER_CU was queued, otherwise the result is zero > meaning either PER_CU is already queued or it is already loaded. > > N.B. There is an invariant here that if a CU is queued then it is loaded. > The caller is required to load PER_CU if we return non-zero. */ > > The premise is: there is the invariant that if a CU is queued for expansion, > its DIEs are loaded. If maybe_queue_comp_unit enqueues a CU for expansion > whose DIEs are not loaded, it returns 1 to its caller to ask "please load the > DIEs for that CU because I just enqueued it and if you don't the invariant > will get violated and we'll get in trouble". So if a CU is already expanded > and maybe_queue_comp_unit doesn't enqueue it, it makes sense that it returns > 0, because it doesn't *require* the caller to load the DIEs. > > However, that means that the caller shouldn't rely on maybe_queue_comp_unit's > return value to determine whether the CU's DIEs are currently loaded, because: > > 1. whether maybe_queue_comp_unit requires it to load the CU's DIEs > 2. whether the CU's DIEs are currently loaded > > are two different things. > > If the caller wants to know #2, because it itself needs to ensure the DIEs are > loaded, it should not rely on maybe_queue_comp_unit's return value, but > instead check itself with dwarf2_per_objfile::get_cu. > > I'll go over my patch and think about it a little more. > > Simon > Hi Tom, Following my reconsideration of what the return value of maybe_queue_comp_unit means and the original intent of the function, here's an updated patch. The differences are: - New comment on maybe_queue_comp_unit, hopefully making it clearer. - Update logic in maybe_queue_comp_unit: if the CU does not get enqueued because it is already expanded and its DIEs are not loaded, return false. Because in this case, maybe_queue_comp_unit doesn't _need_ the caller to load the DIEs. - Update callers follow_die_sig_1 and follow_die_ref to not rely on maybe_queue_comp_unit to tell them whether DIEs are loaded right now. >From 70ed5a2468e2776e887c3481b89945347a856089 Mon Sep 17 00:00:00 2001 From: Simon Marchi Date: Wed, 11 Nov 2020 21:30:22 -0500 Subject: [PATCH] gdb/dwarf: don't enqueue CU in maybe_queue_comp_unit if already expanded The previous commit log described how items could be left lingering in the dwarf2_per_bfd::queue and how that could cause trouble. This patch fixes the issue by changing maybe_queue_comp_unit so that it doesn't put a CU in the to-expand queue if that CU is already expanded. This will make it so that when dwarf2_fetch_die_type_sect_off calls follow_die_offset and maybe_queue_comp_unit, it won't enqueue the target CU, because it will see the CU is already expanded. This assumes that if a CU is dwarf2_fetch_die_type_sect_off's target CU, it will have previously been expanded. I think it is the case, but I can't be 100% sure. If that's not true, the assertions added in the following patch will catch it, and it means we'll have to re-think a bit more how things work (it wouldn't be well handled at all today anyway). This fixes something else in maybe_queue_comp_unit that looks wrong. Imagine the DIEs of a CU are loaded in memory, but that CU is not expanded. In that case, maybe_queue_comp_unit will use this early return: /* If the compilation unit is already loaded, just mark it as used. */ dwarf2_cu *cu = per_objfile->get_cu (per_cu); if (cu != nullptr) { cu->last_used = 0; return 0; } ... so the CU won't be queued for expansion. Whether the DIEs of a CU are loaded in memory and whether that CU is expanded are two orthogonal things, but that function appears to mix them. So, move the queuing above that check / early return, so that if the CU's DIEs are loaded in memory but the CU is not expanded yet, it gets enqueued. I tried to improve maybe_queue_comp_unit's documentation to clarify what the return value means. By clarifying this, I noticed that two callers (follow_die_offset and follow_die_sig_1) access the CU's DIEs after calling maybe_queue_comp_unit, only relying on maybe_queue_comp_unit's return value to tell whether DIEs need to be loaded first or not. As explained in the new comment, this is problematic: maybe_queue_comp_unit's return value doesn't tell whether DIEs are currently loaded, it means whether maybe_queue_comp_unit requires the caller to load them. If the CU is already expanded but the DIEs to have been freed, maybe_queue_comp_unit returns 0, meaning "I don't need you to load the DIEs". So if these two functions (follow_die_offset and follow_die_sig_1) need to access the DIEs in any case, for their own usage, they should make sure to load them if they are not loaded already. I therefore added an extra check to the condition they use, making it so they will always load the DIEs if they aren't already. >From what I found, other callers don't care for the CU's DIEs, they call maybe_queue_comp_unit to ensure the CU gets expanded eventually, but don't care for it after that. gdb/ChangeLog: PR gdb/26828 * dwarf2/read.c (maybe_queue_comp_unit): Check if CU is expanded to decide whether or not to enqueue it for expansion. (follow_die_offset, follow_die_sig_1): Ensure we load the DIEs after calling maybe_queue_comp_unit. Change-Id: Id98c6b60669f4b4b21b9be16d0518fc62bdf686a --- gdb/dwarf2/read.c | 70 +++++++++++++++++++++++++++++++++++------------ 1 file changed, 53 insertions(+), 17 deletions(-) diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c index f199229985e9..62676cd91492 100644 --- a/gdb/dwarf2/read.c +++ b/gdb/dwarf2/read.c @@ -9153,14 +9153,30 @@ queue_comp_unit (dwarf2_per_cu_data *per_cu, per_cu->per_bfd->queue.emplace (per_cu, per_objfile, pretend_language); } -/* If PER_CU is not yet queued, add it to the queue. +/* If PER_CU is not yet expanded of queued for expansion, add it to the queue. + If DEPENDENT_CU is non-NULL, it has a reference to PER_CU so add a dependency. - The result is non-zero if PER_CU was queued, otherwise the result is zero - meaning either PER_CU is already queued or it is already loaded. - N.B. There is an invariant here that if a CU is queued then it is loaded. - The caller is required to load PER_CU if we return non-zero. */ + Return true if maybe_queue_comp_unit requires the caller to load the CU's + DIEs, false otherwise. + + Explanation: there is an invariant that if a CU is queued for expansion + (present in `dwarf2_per_bfd::queue`), then its DIEs are loaded + (a dwarf2_cu object exists for this CU, and `dwarf2_per_objfile::get_cu` + returns non-nullptr). If the CU gets enqueued by this function but its DIEs + are not yet loaded, the the caller must load the CU's DIEs to ensure the + invariant is respected. + + The caller is therefore not required to load the CU's DIEs (we return false) + if: + + - the CU is already expanded, and therefore does not get enqueued + - the CU gets enqueued for expansion, but its DIEs are already loaded + + Note that the caller should not use this function's return value as an + indicator of whether the CU's DIEs are loaded right now, it should check + that by calling `dwarf2_per_objfile::get_cu` instead. */ static int maybe_queue_comp_unit (struct dwarf2_cu *dependent_cu, @@ -9191,22 +9207,32 @@ maybe_queue_comp_unit (struct dwarf2_cu *dependent_cu, /* Verify the invariant that if a CU is queued for expansion, its DIEs are loaded. */ gdb_assert (per_objfile->get_cu (per_cu) != nullptr); + + /* If the CU is queued for expansion, it should not already be + expanded. */ + gdb_assert (!per_objfile->symtab_set_p (per_cu)); + + /* The DIEs are already loaded, the caller doesn't need to do it. */ return 0; } + bool queued = false; + if (!per_objfile->symtab_set_p (per_cu)) + { + /* Add it to the queue. */ + queue_comp_unit (per_cu, per_objfile, pretend_language); + queued = true; + } + /* If the compilation unit is already loaded, just mark it as used. */ dwarf2_cu *cu = per_objfile->get_cu (per_cu); if (cu != nullptr) - { - cu->last_used = 0; - return 0; - } + cu->last_used = 0; - /* Add it to the queue. */ - queue_comp_unit (per_cu, per_objfile, pretend_language); - - return 1; + /* Ask the caller to load the CU's DIEs if the CU got enqueued for expansion + and the DIEs are not already loaded. */ + return queued && cu == nullptr; } /* Process the queue. */ @@ -23568,12 +23594,18 @@ follow_die_offset (sect_offset sect_off, int offset_in_dwz, sect_offset_str (per_cu->sect_off), per_objfile->get_cu (per_cu) != nullptr); - /* If necessary, add it to the queue and load its DIEs. */ - if (maybe_queue_comp_unit (cu, per_cu, per_objfile, cu->language)) + /* If necessary, add it to the queue and load its DIEs. + + Even if maybe_queue_comp_unit doesn't require us to load the CU's DIEs, + it doesn't mean they are currently loaded. Since we require them + to be loaded, we must check for ourselves. */ + if (maybe_queue_comp_unit (cu, per_cu, per_objfile, cu->language) + || per_objfile->get_cu (per_cu) == nullptr) load_full_comp_unit (per_cu, per_objfile, per_objfile->get_cu (per_cu), false, cu->language); target_cu = per_objfile->get_cu (per_cu); + gdb_assert (target_cu != nullptr); } else if (cu->dies == NULL) { @@ -23947,10 +23979,14 @@ follow_die_sig_1 (struct die_info *src_die, struct signatured_type *sig_type, we can get here for DW_AT_imported_declaration where we need the DIE not the type. */ - /* If necessary, add it to the queue and load its DIEs. */ + /* If necessary, add it to the queue and load its DIEs. + Even if maybe_queue_comp_unit doesn't require us to load the CU's DIEs, + it doesn't mean they are currently loaded. Since we require them + to be loaded, we must check for ourselves. */ if (maybe_queue_comp_unit (*ref_cu, &sig_type->per_cu, per_objfile, - language_minimal)) + language_minimal) + || per_objfile->get_cu (&sig_type->per_cu) == nullptr) read_signatured_type (sig_type, per_objfile); sig_cu = per_objfile->get_cu (&sig_type->per_cu); -- 2.29.2