Message-ID: <1c28bfe4-8841-4363-8460-d2d3bd5e529d@suse.de>
Date: Mon, 8 Apr 2024 14:58:53 +0200
Subject: Re: [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable
To: Bernd Edlinger, "gdb-patches@sourceware.org"
References: <20240405151012.14763-1-tdevries@suse.de> <544ecc72-916f-436e-b4f2-093bea6882a9@suse.de>
From: Tom de Vries

On 4/7/24 10:17, Bernd Edlinger wrote:
> On 4/6/24 10:29, Tom de Vries wrote:
>> On 4/6/24 07:03, Bernd Edlinger wrote:
>>> On 4/5/24 17:10, Tom de Vries wrote:
>>>>
>>>> diff --git a/gdb/symtab.c b/gdb/symtab.c
>>>> index 86603dfebc3..0c126d99cd4 100644
>>>> --- a/gdb/symtab.c
>>>> +++ b/gdb/symtab.c
>>>> @@ -4166,10 +4166,14 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>>>>       = unrelocated_addr (end_pc - objfile->text_section_offset ());
>>>>
>>>>         const linetable *linetable = sal.symtab->linetable ();
>>>> -      /* This should find the last linetable entry of the current function.
>>>> -     It is probably where the epilogue begins, but since the DWARF 5
>>>> -     spec doesn't guarantee it, we iterate backwards through the function
>>>> -     until we either find it or are sure that it doesn't exist.  */
>>>> +      if (linetable->nitems == 0)
>>>> +    {
>>>> +      /* Empty line table.  */
>>>> +      return {};
>>>> +    }
>>>> +
>
> Hmm, this can be an assertion, because
> the line table was found by find_pc_line (start_pc, 0);
> so the linetable is guaranteed to be non-empty.
> BTW: empty linetables are usually NULL pointers,
> so that probably the assertion should
> also assert like
> gdb_assert (linetable != NULL && linetable->nitems > 0);
>

I've updated the if condition to include the nullptr check, but didn't
turn it into an assert.  By doing so we gain nothing, and add a
potential user inconvenience.
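
To make the guard-versus-assert choice concrete, here is a minimal
standalone sketch.  It is not the gdb code and not the v4 hunk:
entry_sketch, table_sketch and last_epilogue_or_nothing are hypothetical
names, and the types are simplified stand-ins for gdb's linetable.  The
point it illustrates is only that a missing or empty table is reported
as "nothing found" rather than aborting the session.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-ins, NOT gdb's real linetable / linetable_entry.
struct entry_sketch
{
  std::uint64_t pc;
  bool epilogue_begin;
};

struct table_sketch
{
  int nitems;
  const entry_sketch *item;
};

// Guard variant: a null or empty table yields an empty optional
// instead of tripping an assertion.
std::optional<std::uint64_t>
last_epilogue_or_nothing (const table_sketch *table)
{
  if (table == nullptr || table->nitems == 0)
    return {};

  // Walk backwards and report the last entry flagged epilogue_begin.
  for (int i = table->nitems - 1; i >= 0; i--)
    if (table->item[i].epilogue_begin)
      return table->item[i].pc;

  return {};
}
```

With an assert in place of the guard, the same inputs would terminate
the program, which is the user inconvenience mentioned above.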

>>>> +      /* Find the first linetable entry after the current function.  Note that
>>>> +     this also may be an end_sequence entry.  */
>>>>         auto it = std::lower_bound
>>>>       (linetable->item, linetable->item + linetable->nitems, unrel_end,
>>>>        [] (const linetable_entry &lte, unrelocated_addr pc)
>>>> @@ -4177,13 +4181,74 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>>>>          return lte.unrelocated_pc () < pc;
>>>>        });
>>>>
>>>> -      while (it->unrelocated_pc () >= unrel_start)
>>>> -      {
>>>> -    if (it->epilogue_begin)
>>>> -      return {it->pc (objfile)};
>>>> -    it --;
>>>> -      }
>>>> +      if (it == linetable->item + linetable->nitems)
>>>> +    {
>>>> +      /* We couldn't find either:
>>>> +         - a linetable entry starting the function after the current
>>>> +           function, or
>>>> +         - an end_sequence entry that terminates the current function
>>>> +           at unrel_end.
>>>> +         This can happen when the linetable doesn't describe the full
>>>> +         extent of the function.  Even though this is a corner case, which
>>>> +         may not happen other than in dwarf assembly test-cases, let's
>>>> +         handle this.
>>>> +
>>>> +         Move to the last entry in the linetable, and check that it's an
>>>> +         end_sequence terminating the current function.  */
>>>> +      gdb_assert (it != &linetable->item[0]);
>>>> +      it--;
>>>> +      if (!(it->line == 0
>>>> +        && unrel_start <= it->unrelocated_pc ()
>>>> +        && it->unrelocated_pc () < unrel_end))
>>>> +        return {};
>>>
>>> Why is this check necessary here, and not also when
>>> this is not the last function in the line-table?
>>>
>>> And why is this check necessary at all?
>>>
>>
>> It spells out as much as possible the specific conditions of the
>> corner-case we're handling.
>>
>> We could also simply handle the corner case by returning {}.  I went
>> back and forth a bit on that, and decided to support it on the basis
>> that at least currently we have dwarf assembly test-cases in the
>> testsuite that trigger this path, though I've submitted a series to
>> clean that up.
>>
>> But I'm still on the fence about this; if you prefer a "return {}"
>> I'm fine with that.
>>

I've thought about this a bit more over the weekend, and I managed to
convince myself that a simple "return {}" is the proper solution.

The rationale is that an incorrectly written dwarf assembly test-case is
not a good reason to support a corner case.

If we do stumble one day into a compiler that generates the incorrect
debug info, we can always opt to add a workaround for this (which we
can then test extensively).  But adding a workaround without such an
incentive is pointless.

>>>> +    }
>>>> +      else
>>>> +    gdb_assert (unrel_end <= it->unrelocated_pc ());
>>>
>>> Why do you not check that 'it' points to an end_sequence
>>> at exactly unrel_end?
>>> It could be anything at an address much higher PC than unrel_end.
>>
>> This assert spells out the post-condition of the call to
>> std::lower_bound, in case it found an entry.
>>
>> If there's debug info where one line entry straddles two functions,
>> the call returns the entry after it, which doesn't have address
>> unrel_end.
>>
>> Having said that, we can drop support for such a scenario by doing:
>> ...
>>       else
>>         {
>>           if (unrel_end < it->unrelocated_pc ())
>>             return {};
>>           gdb_assert (unrel_end == it->unrelocated_pc ());
>> ...
>>
>
> I think we should not look at the `it` in any case.
> If there is an inconsistency in the debug info, a
> debug message that can be enabled in maintainer mode
> would be good enough.
> But even in this case, I would prefer a best effort,
> and continue whenever possible.
>

IMO, a best effort only makes sense in case there's a compiler release
generating the incorrect debug info.

> If you look at skip_prologue_using_linetable
> you see that it does stop the search immediately, when
> it->unrelocated_pc() reaches unrel_end, or when the
> end of the linetable is reached:
>
>       for (;
>            (it < linetable->item + linetable->nitems
>             && it->unrelocated_pc () < unrel_end);
>            it++)
>         if (it->prologue_end)
>           return {it->pc (objfile)};
>
> Therefore I would like find_epilogue_using_linetable
> to use the same algorithm just in reverse direction.
> Which means always do `it--` first before using `it`.
>

I think each function should do what is appropriate, whatever that is.

> After all this is just a partial function range,
> it can end with a jump or a return, and in both
> cases the linetable entry at unrel_end can belong
> to a completely different function, and it is not
> guaranteed to be an end_sequence entry.
>
> BTW: I am not sure what happens if the function
> has multiple line tables, e.g. because of inline
> functions, or #include to pull in parts of the function
> body.  In that case I would expect that the line
> table found by find_pc_line (start_pc, 0);
> may be covering the prologue area, while the epilogue
> may be missing.
> Maybe find_pc_line (end_pc - 1, 0); would be better
> candidate for a line table covering the epilogue area?
>

The function was committed with this caveat documented:
...
     While the standard allows for multiple points marked with
     epilogue_begin in the same function, for performance reasons, the
     function that searches for the epilogue address will only find the
     last address that sets this flag for a given block.
...

So indeed there's work to be done to extend this to be more general, but
that's beyond the scope of our current patch.

I've submitted a v4 here (
https://sourceware.org/pipermail/gdb-patches/2024-April/207925.html ).

Thanks,
- Tom
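
For readers following the thread, the backward search Bernd describes
above (mirror skip_prologue_using_linetable, and always step back with
`it--` before using `it`) can be sketched roughly as follows.  This is a
standalone illustration under simplifying assumptions, not gdb code and
not what v4 does: line_entry_sketch and find_epilogue_backwards are
hypothetical names, addresses are plain integers, and the table is
assumed to be sorted by pc.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a line table entry, NOT gdb's type.
struct line_entry_sketch
{
  std::uint64_t pc;
  bool epilogue_begin;
};

// Search [start, end) backwards for the last epilogue_begin entry.
// TABLE is assumed to be sorted by ascending pc.
std::optional<std::uint64_t>
find_epilogue_backwards (const std::vector<line_entry_sketch> &table,
                         std::uint64_t start, std::uint64_t end)
{
  // First entry with pc >= end; this may be table.end ().
  auto it = std::lower_bound
    (table.begin (), table.end (), end,
     [] (const line_entry_sketch &e, std::uint64_t pc)
     {
       return e.pc < pc;
     });

  // Step back before every use, so neither table.end () nor the entry
  // at END (which may belong to another function) is ever inspected.
  while (it != table.begin ())
    {
      --it;
      if (it->pc < start)
        break;
      if (it->epilogue_begin)
        return it->pc;
    }

  return {};
}
```

Because the iterator is decremented before it is dereferenced, the same
loop covers an empty table, a table that ends before `end`, and a table
whose entry at `end` is not an end_sequence.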