From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pj1-x102e.google.com (mail-pj1-x102e.google.com [IPv6:2607:f8b0:4864:20::102e]) by sourceware.org (Postfix) with ESMTPS id 14F883858C98 for ; Fri, 5 Apr 2024 16:18:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 14F883858C98 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=obs.cr Authentication-Results: sourceware.org; spf=none smtp.mailfrom=obs.cr ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 14F883858C98 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::102e ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712333922; cv=none; b=Ca+DEswFzpM1T2JG8S7I3bV43aCK14jPP0rUNQthot7dvoP+DIehsoKBeJo72DYEZm71WuiHBIQ22b4Bs7D0PhrlD4DpjWKX72F625Y/AqgMOFGxyq4OI2hEdQkrBVNiyjsHHfS8wofTjMhAkQE/Tlrlz9MFama2srArvy/eTZM= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712333922; c=relaxed/simple; bh=4CvXne85rb/0j6zgLUGpSyhaLT08cal12nRVUXeivGA=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=I+H0HX3fvHP652SRSf+YXki124WiA46wVPG5U4YXxOaKrU9o3m95dWVkN9mjSrmH2DWdZhVewWDgi+pGvXGc+h9TVZzeO7LiJ8XIWjWRlNFqP8kiCzEVu9kmBQOTmD2C3fEA0WQnz+eEUP09dyUMciE0AjMONSSSUoxqNaSHKfU= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-pj1-x102e.google.com with SMTP id 98e67ed59e1d1-2a475bdd4a6so499935a91.2 for ; Fri, 05 Apr 2024 09:18:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=obs-cr.20230601.gappssmtp.com; s=20230601; t=1712333916; x=1712938716; darn=sourceware.org; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=K6AyG+VoDZ3/NTm41dLYkjPcwTlRV24VBNmPN9f7EiE=; b=IrU167MrvdTepbNcRl8WQicyak5+i6Kwk7R6G2WKrWQQn96IQWueCjxsycPTnIV4dy TfOyh+7u2pZEto+v9hWsf5GG6B+3cZQ6Kvj0O+JhAtcJr8pxuViFhFr46w/SQ0ldxFT5 XHfzuNoAfYNA0FCcYaNyGcVDtDsq7JO3iVwWHA9b8lI0K6waWq6quBw07CIzEpBiBryu PaHIae6Tq4CcsQ6PHj2XZ0r5qImjmy10p3uqlNbfrQBfWfs6i+3Rdrv4aSrUe5I3JinU eDqjNPv4nKf7abnoDVH09eAUzdMT56hsHQo++ssHZlBmqRD3yWJYYdl8DrVGCeDaeyiO g48A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1712333916; x=1712938716; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=K6AyG+VoDZ3/NTm41dLYkjPcwTlRV24VBNmPN9f7EiE=; b=b52FX4l29PxOF3BH6p4kuF6u4DfqQ/kbkCWYM8CYCjL2SbIOaIso+ZoELINLSqwyz+ 402SGySKS8hG1+s59Z5v9LkrSYTIYgHFggosvSNR8PV0k9ToMQnkXkqYgRxbbr0UHwfF G3B8YFhjuM4cjIOuZ6URAvsSzJkk83JqnvAtSYkyEjuCN6OLH8XdFeOVxAPzTgqdTk/7 8+Ju7s457Uv+gBR4q2qOlgwfTGte8BIWI5VVAspRjQmbEW1vtrySnwyAWb3l4MJMF/nN +WuYJ/1vX67yDDY9l8Zw/Q7q9OtWHnp6BP1rvF0gz4Bi0kEZTo5/p46UE/22351/gpMj wXMA== X-Gm-Message-State: AOJu0Yz8GWYGqUEiX0G8oY2ZDLJeFYSqklG9sCAfuQ0bhp4cej7qjHl9 sI87yyCTg9sNqu8aQ5wdZgZmwpZZ2IHsc+frO5hkVvXbCOvQYAXPAs/fjNksl/BG3Iv7OZwoI5L PwFhqTWEYvTFNRW5WGoOnHCSobPl6bKLr7FjbEds21s/eDMH4 X-Google-Smtp-Source: AGHT+IGrUrvCGYUNUyMhh1MXBqf9aFhjBUveSlwy+KWQXHyZvU3aoPpdiaL2McYSUvHSqdRZtcahdPDEB3UmqOLOCJg= X-Received: by 2002:a17:90b:347:b0:2a0:4495:1f3d with SMTP id fh7-20020a17090b034700b002a044951f3dmr2299079pjb.0.1712333915402; Fri, 05 Apr 2024 09:18:35 -0700 (PDT) MIME-Version: 1.0 References: <20240405161302.573912-1-hawkinsw@obs.cr> In-Reply-To: <20240405161302.573912-1-hawkinsw@obs.cr> From: Will Hawkins Date: Fri, 5 Apr 2024 12:18:23 -0400 Message-ID: Subject: Re: [PATCH v3] gdb: Support embedded source in DWARF To: gdb-patches@sourceware.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: While there are still some areas that I need to address from Tom's feedback, I took a stab at updating the source code access through the cache and wanted to get your initial thoughts. See comments below. On Fri, Apr 5, 2024 at 12:13=E2=80=AFPM Will Hawkins wrot= e: > > While DW_LNCT_source is not yet finalized in the DWARF standard > (https://dwarfstd.org/issues/180201.1.html), LLVM does emit it. > > This patch adds support for it in gdb. > > Signed-off-by: Will Hawkins > --- > > Notes: > v2 -> v3 > - Address feedback from v2. > - Fixed up a bug where non-CU embedded files were incorrectly > marked as embedded. > - Revamped source access through the cache to avoid copies. > > v1 -> v2 > - Address feedback from original PR > - Add support for maintenance commands to see embedded source s= tatus > - Prevent access to the filesystem for symtabs with embedded so= urce > - Add additional unit tests > > gdb/dwarf2/file-and-dir.h | 23 ++++- > gdb/dwarf2/line-header.c | 22 +++-- > gdb/dwarf2/line-header.h | 9 +- > gdb/dwarf2/read.c | 83 +++++++++++++++--- > gdb/extension.c | 7 +- > gdb/extension.h | 9 +- > gdb/source-cache.c | 87 ++++++++++++------- > gdb/source-cache.h | 12 +-- > gdb/source.c | 74 +++++++++++----- > gdb/source.h | 4 + > gdb/symmisc.c | 17 ++-- > gdb/symtab.h | 5 ++ > gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c | 24 ++++++ > gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp | 88 ++++++++++++++++++++ > gdb/testsuite/lib/dwarf.exp | 19 +++-- > include/dwarf2.h | 5 ++ > 16 files changed, 390 insertions(+), 98 deletions(-) > create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c > create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp > > diff --git a/gdb/dwarf2/file-and-dir.h b/gdb/dwarf2/file-and-dir.h > index a5b1d8a3a21..84a7367e973 100644 > --- a/gdb/dwarf2/file-and-dir.h > +++ b/gdb/dwarf2/file-and-dir.h > @@ -95,7 +95,14 @@ struct file_and_directory > const char *get_fullname () > { > if (m_fullname =3D=3D nullptr) > - m_fullname =3D find_source_or_rewrite (get_name (), get_comp_dir (= )); > + { > + if (m_is_embedded) > + m_fullname =3D make_unique_xstrdup (embedded_fullname ( > + get_name (= ), > + get_comp_d= ir ())); > + else > + m_fullname =3D find_source_or_rewrite (get_name (), get_comp_di= r ()); > + } > return m_fullname.get (); > } > > @@ -105,6 +112,17 @@ struct file_and_directory > m_fullname.reset (); > } > > + /* Set whether the file's source is embedded in the dwarf. */ > + void set_embedded (bool is_embedded) > + { > + m_is_embedded =3D is_embedded; > + } > + > + /* Return true if the file's source is embedded in the dwarf. */ > + bool is_embedded () const > + { > + return m_is_embedded; > + } > private: > > /* The filename. */ > @@ -124,6 +142,9 @@ struct file_and_directory > > /* The full name. */ > gdb::unique_xmalloc_ptr m_fullname; > + > + /* Whether the file's source is embedded in the dwarf. */ > + bool m_is_embedded =3D false; > }; > > #endif /* GDB_DWARF2_FILE_AND_DIR_H */ > diff --git a/gdb/dwarf2/line-header.c b/gdb/dwarf2/line-header.c > index a3ca49b64f5..3bc707e999e 100644 > --- a/gdb/dwarf2/line-header.c > +++ b/gdb/dwarf2/line-header.c > @@ -45,6 +45,7 @@ line_header::add_include_dir (const char *include_dir) > void > line_header::add_file_name (const char *name, > dir_index d_index, > + const char *source, > unsigned int mod_time, > unsigned int length) > { > @@ -54,7 +55,7 @@ line_header::add_file_name (const char *name, > if (dwarf_line_debug >=3D 2) > gdb_printf (gdb_stdlog, "Adding file %d: %s\n", index, name); > > - m_file_names.emplace_back (name, index, d_index, mod_time, length); > + m_file_names.emplace_back (name, index, d_index, source, mod_time, len= gth); > } > > std::string > @@ -125,6 +126,7 @@ read_formatted_entries (dwarf2_per_objfile *per_objfi= le, bfd *abfd, > void (*callback) (struct line_header *lh, > const char *name, > dir_index d_index, > + const char *source, > unsigned int mod_time, > unsigned int length)) > { > @@ -239,13 +241,17 @@ read_formatted_entries (dwarf2_per_objfile *per_obj= file, bfd *abfd, > break; > case DW_LNCT_MD5: > break; > + case DW_LNCT_LLVM_SOURCE: > + if (string.has_value () && (*string)[0] !=3D '\0') > + fe.source =3D *string; > + break; > default: > complaint (_("Unknown format content type %s"), > pulongest (content_type)); > } > } > > - callback (lh, fe.name, fe.d_index, fe.mod_time, fe.length); > + callback (lh, fe.name, fe.d_index, fe.source, fe.mod_time, fe.leng= th); > } > > *bufp =3D buf; > @@ -368,8 +374,8 @@ dwarf_decode_line_header (sect_offset sect_off, bool= is_dwz, > read_formatted_entries (per_objfile, abfd, &line_ptr, lh.get (), > offset_size, > [] (struct line_header *header, const char = *name, > - dir_index d_index, unsigned int mod_tim= e, > - unsigned int length) > + dir_index d_index, const char *source, > + unsigned int mod_time, unsigned int len= gth) > { > header->add_include_dir (name); > }); > @@ -378,10 +384,10 @@ dwarf_decode_line_header (sect_offset sect_off, bo= ol is_dwz, > read_formatted_entries (per_objfile, abfd, &line_ptr, lh.get (), > offset_size, > [] (struct line_header *header, const char = *name, > - dir_index d_index, unsigned int mod_tim= e, > - unsigned int length) > + dir_index d_index, const char *source, > + unsigned int mod_time, unsigned int len= gth) > { > - header->add_file_name (name, d_index, mod_time, length); > + header->add_file_name (name, d_index, source, mod_time, length)= ; > }); > } > else > @@ -408,7 +414,7 @@ dwarf_decode_line_header (sect_offset sect_off, bool= is_dwz, > length =3D read_unsigned_leb128 (abfd, line_ptr, &bytes_read); > line_ptr +=3D bytes_read; > > - lh->add_file_name (cur_file, d_index, mod_time, length); > + lh->add_file_name (cur_file, d_index, nullptr, mod_time, length= ); > } > line_ptr +=3D bytes_read; > } > diff --git a/gdb/dwarf2/line-header.h b/gdb/dwarf2/line-header.h > index c068dff70a3..abc95f3ee87 100644 > --- a/gdb/dwarf2/line-header.h > +++ b/gdb/dwarf2/line-header.h > @@ -35,9 +35,10 @@ struct file_entry > file_entry () =3D default; > > file_entry (const char *name_, file_name_index index_, dir_index d_ind= ex_, > - unsigned int mod_time_, unsigned int length_) > + const char *source_, unsigned int mod_time_, unsigned int l= ength_) > : name (name_), > index (index_), > + source (source_), > d_index (d_index_), > mod_time (mod_time_), > length (length_) > @@ -54,6 +55,10 @@ struct file_entry > /* The index of this file in the file table. */ > file_name_index index {}; > > + /* The file's contents (if not null). Note this is an observing point= er. > + The memory is owned by debug_line_buffer. */ > + const char *source {}; > + > /* The directory index (1-based). */ > dir_index d_index {}; > > @@ -88,7 +93,7 @@ struct line_header > void add_include_dir (const char *include_dir); > > /* Add an entry to the file name table. */ > - void add_file_name (const char *name, dir_index d_index, > + void add_file_name (const char *name, dir_index d_index, const char *s= ource, > unsigned int mod_time, unsigned int length); > > /* Return the include dir at INDEX (0-based in DWARF 5 and 1-based bef= ore). > diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c > index 7442094874c..e35ff019acf 100644 > --- a/gdb/dwarf2/read.c > +++ b/gdb/dwarf2/read.c > @@ -1635,6 +1635,10 @@ struct quick_file_names > /* The file names from the line table after being run through > gdb_realpath. These are computed lazily. */ > const char **real_names; > + > + /* Whether or not the file names refer to sources embedded > + in the dwarf. */ > + const bool *embeddeds; > }; > > /* With OBJF_READNOW, the DWARF reader expands all CUs immediately. > @@ -1908,27 +1912,39 @@ dw2_get_file_names_reader (const struct die_reade= r_specs *reader, > if (slot !=3D nullptr) > *slot =3D qfn; > > + > + bool cu_file_embedded =3D false; > + std::vector embeddedv; > + > std::vector include_names; > if (lh !=3D nullptr) > { > for (const auto &entry : lh->file_names ()) > { > std::string name_holder; > - const char *include_name =3D > - compute_include_file_name (lh.get (), entry, fnd, name_holder= ); > + const char *include_name > + =3D compute_include_file_name (lh.get (), entry, fnd, name_ho= lder); > if (include_name !=3D nullptr) > { > include_name =3D per_objfile->objfile->intern (include_name= ); > include_names.push_back (include_name); > + embeddedv.push_back (entry.source !=3D nullptr); > } > + else if (entry.source !=3D nullptr) > + { > + /* We have an embedded source for the CU. */ > + gdb_assert (offset =3D=3D 1); > + cu_file_embedded =3D true; > + } > + > } > } > > qfn->num_file_names =3D offset + include_names.size (); > qfn->comp_dir =3D fnd.intern_comp_dir (per_objfile->objfile); > - qfn->file_names =3D > - XOBNEWVEC (&per_objfile->per_bfd->obstack, const char *, > - qfn->num_file_names); > + qfn->file_names > + =3D XOBNEWVEC (&per_objfile->per_bfd->obstack, const char *, > + qfn->num_file_names); > if (offset !=3D 0) > qfn->file_names[0] =3D per_objfile->objfile->intern (fnd.get_name ()= ); > > @@ -1936,7 +1952,16 @@ dw2_get_file_names_reader (const struct die_reader= _specs *reader, > memcpy (&qfn->file_names[offset], include_names.data (), > include_names.size () * sizeof (const char *)); > > - qfn->real_names =3D NULL; > + bool *embeddeds > + =3D XOBNEWVEC (&per_objfile->per_bfd->obstack, bool, > + qfn->num_file_names); > + if (cu_file_embedded) > + embeddeds[0] =3D true; > + for (size_t i =3D 0; i < embeddedv.size (); i++) > + embeddeds[offset + i] =3D embeddedv[i]; > + qfn->embeddeds =3D embeddeds; > + > + qfn->real_names =3D nullptr; > > lh_cu->file_names =3D qfn; > } > @@ -1980,7 +2005,11 @@ dw2_get_real_path (dwarf2_per_objfile *per_objfile= , > dirname =3D qfn->comp_dir; > > gdb::unique_xmalloc_ptr fullname; > - fullname =3D find_source_or_rewrite (qfn->file_names[index], dirna= me); > + > + if (qfn->embeddeds[index]) > + fullname.reset (embedded_fullname (dirname, qfn->file_names[index= ])); > + else > + fullname =3D find_source_or_rewrite (qfn->file_names[index], dirn= ame); > > qfn->real_names[index] =3D fullname.release (); > } > @@ -7311,6 +7340,35 @@ find_file_and_directory (struct die_info *die, str= uct dwarf2_cu *cu) > file_and_directory res (dwarf2_string_attr (die, DW_AT_name, cu), > dwarf2_string_attr (die, DW_AT_comp_dir, cu)); > > + /* Because the line header may tell us information about the CU > + filename (e.g., whether it is embedded) which will affect other > + calculations, we have to read that information here. */ > + line_header *lh =3D cu->line_header; > + struct attribute *attr =3D dwarf2_attr (die, DW_AT_stmt_list, cu); > + if (lh =3D=3D nullptr && attr !=3D nullptr && attr->form_is_unsigned (= )) > + { > + sect_offset line_offset =3D (sect_offset) attr->as_unsigned (); > + line_header_up lhu =3D dwarf_decode_line_header (line_offset, cu, > + res.get_comp_dir ()); > + if (lhu !=3D nullptr) > + lh =3D lhu.release(); > + } > + > + if (lh !=3D nullptr) > + { > + for (const auto &entry : lh->file_names ()) > + { > + if (entry.source =3D=3D nullptr) > + continue; > + > + std::string name_holder; > + const char *include_name =3D > + compute_include_file_name (lh, entry, res, name_holder); > + if (include_name =3D=3D nullptr) > + res.set_embedded (true); > + } > + } > + > if (res.get_comp_dir () =3D=3D nullptr > && producer_is_gcc_lt_4_3 (cu) > && res.get_name () !=3D nullptr > @@ -18448,7 +18506,7 @@ dwarf_decode_lines_1 (struct line_header *lh, str= uct dwarf2_cu *cu, > length =3D > read_unsigned_leb128 (abfd, line_ptr, &bytes_read); > line_ptr +=3D bytes_read; > - lh->add_file_name (cur_file, dindex, mod_time, length= ); > + lh->add_file_name (cur_file, dindex, nullptr, mod_tim= e, length); > } > break; > case DW_LNE_set_discriminator: > @@ -18603,9 +18661,12 @@ dwarf_decode_lines (struct line_header *lh, stru= ct dwarf2_cu *cu, > subfile *sf =3D builder->get_current_subfile (); > > if (sf->symtab =3D=3D nullptr) > - sf->symtab =3D allocate_symtab (cust, sf->name.c_str (), > - sf->name_for_id.c_str ()); > - > + { > + sf->symtab =3D allocate_symtab (cust, sf->name.c_str (), > + sf->name_for_id.c_str ()); > + if (fe.source) > + sf->symtab->source =3D fe.source; > + } > fe.symtab =3D sf->symtab; > } > } > diff --git a/gdb/extension.c b/gdb/extension.c > index 9db8b53a087..f17991a3751 100644 > --- a/gdb/extension.c > +++ b/gdb/extension.c > @@ -991,16 +991,19 @@ xmethod_worker::get_result_type (value *object, gdb= ::array_view args) > /* See extension.h. */ > > std::optional > -ext_lang_colorize (const std::string &filename, const std::string &conte= nts) > +ext_lang_colorize (const std::string &filename, const std::string_view c= ontents) > { > std::optional result; > > + /* We avoided copies as long as possible. The external colorization AP= I > + requires a std::string. */ > + std::string contents_storage =3D std::string (contents); > for (const struct extension_language_defn *extlang : extension_languag= es) > { > if (extlang->ops =3D=3D nullptr > || extlang->ops->colorize =3D=3D nullptr) > continue; > - result =3D extlang->ops->colorize (filename, contents); > + result =3D extlang->ops->colorize (filename, contents_storage); > if (result.has_value ()) > return result; > } > diff --git a/gdb/extension.h b/gdb/extension.h > index 5260bcbde00..e5a5333b7c3 100644 > --- a/gdb/extension.h > +++ b/gdb/extension.h > @@ -319,12 +319,13 @@ extern void get_matching_xmethod_workers > std::vector *workers); > > /* Try to colorize some source code. FILENAME is the name of the file > - holding the code. CONTENTS is the source code itself. This will > - either a colorized (using ANSI terminal escapes) version of the > - source code, or an empty value if colorizing could not be done. */ > + holding the code. CONTENTS is a view of the source code itself. > + This will either generate a colorized (using ANSI terminal escapes) > + version of the source code, or an empty value if colorizing could not > + be done. */ > > extern std::optional ext_lang_colorize > - (const std::string &filename, const std::string &contents); > + (const std::string &filename, const std::string_view contents); > > /* Try to colorize a single line of disassembler output, CONTENT for > GDBARCH. This will return either a colorized (using ANSI terminal > diff --git a/gdb/source-cache.c b/gdb/source-cache.c > index 8b5bd84d19a..f5168921f99 100644 > --- a/gdb/source-cache.c > +++ b/gdb/source-cache.c > @@ -93,32 +93,45 @@ set_use_gnu_source_highlight_enabled (const char *ign= ore_args, > > /* See source-cache.h. */ > > -std::string > +std::string_view > source_cache::get_plain_source_lines (struct symtab *s, > - const std::string &fullname) > + const std::string &fullname, > + std::string &source_storage) > { > - scoped_fd desc (open_source_file (s)); > - if (desc.get () < 0) > - perror_with_name (symtab_to_filename_for_display (s), -desc.get ()); > + std::string_view lines; > + if (!s->source) > + { > + > + scoped_fd desc (open_source_file (s)); > + if (desc.get () < 0) > + perror_with_name (symtab_to_filename_for_display (s), -desc.get (= )); > > - struct stat st; > - if (fstat (desc.get (), &st) < 0) > - perror_with_name (symtab_to_filename_for_display (s)); > + struct stat st; > + if (fstat (desc.get (), &st) < 0) > + perror_with_name (symtab_to_filename_for_display (s)); > > - std::string lines; > - lines.resize (st.st_size); > - if (myread (desc.get (), &lines[0], lines.size ()) < 0) > - perror_with_name (symtab_to_filename_for_display (s)); > + source_storage.resize (st.st_size); > + if (myread (desc.get (), > + &source_storage[0], > + source_storage.size ()) < 0) > + perror_with_name (symtab_to_filename_for_display (s)); > > - time_t mtime =3D 0; > - if (s->compunit ()->objfile () !=3D NULL > - && s->compunit ()->objfile ()->obfd !=3D NULL) > - mtime =3D s->compunit ()->objfile ()->mtime; > - else if (current_program_space->exec_bfd ()) > - mtime =3D current_program_space->ebfd_mtime; > + time_t mtime =3D 0; > + if (s->compunit ()->objfile () !=3D nullptr > + && s->compunit ()->objfile ()->obfd !=3D nullptr) > + mtime =3D s->compunit ()->objfile ()->mtime; > + else if (current_program_space->exec_bfd ()) > + mtime =3D current_program_space->ebfd_mtime; > > - if (mtime && mtime < st.st_mtime) > - warning (_("Source file is more recent than executable.")); > + if (mtime && mtime < st.st_mtime) > + warning (_("Source file is more recent than executable.")); > + > + lines =3D source_storage; > + } > + else > + { > + lines =3D s->source; > + } > Tom mentioned lifting this check to the caller (source_cache::ensure) but I thought it made more sense to leave it here because of the code that calculates the offsets. If that is not reason enough to leave it here, I am glad to take another whack at refactoring! > std::vector offsets; > offsets.push_back (0); > @@ -200,9 +213,10 @@ get_language_name (enum language lang) > succeeded. */ > > static bool > -try_source_highlight (std::string &contents ATTRIBUTE_UNUSED, > +try_source_highlight (std::string_view &contents ATTRIBUTE_UNUSED, > enum language lang ATTRIBUTE_UNUSED, > - const std::string &fullname ATTRIBUTE_UNUSED) > + const std::string &fullname ATTRIBUTE_UNUSED, > + std::string &contents_storage) > { My general approach was to use the '_storage' paradigm that I have seen throughout the codebase. Where I can use a std::string_view, I did. Where I ultimately needed some storage, I use those parameters. As above, please let me know if this pattern looks okay! If it does, I will continue to finalize based on the feedback that I received in the previous review. Thank you again for the welcome to the project! Will > #ifdef HAVE_SOURCE_HIGHLIGHT > if (!use_gnu_source_highlight) > @@ -240,10 +254,14 @@ try_source_highlight (std::string &contents ATTRIBU= TE_UNUSED, > lang_name =3D detected_lang.c_str (); > } > > - std::istringstream input (contents); > + /* We waited as long as possible but now we need a string. */ > + contents_storage =3D contents; > + std::istringstream input (contents_storage); > std::ostringstream output; > highlighter->highlight (input, output, lang_name, fullname); > - contents =3D std::move (output).str (); > + /* Use the given storage for the contents and set the > + view appropriately. */ > + contents =3D contents_storage =3D std::move (output).str (); > styled =3D true; > } > catch (...) > @@ -275,13 +293,16 @@ static void gnu_source_highlight_test () > "}\n"); > const std::string fullname =3D "test.c"; > std::string styled_prog; > + std::string_view styled_prog_view; > > bool res =3D false; > bool saw_exception =3D false; > styled_prog =3D prog; > + styled_prog_view =3D styled_prog; > try > { > - res =3D try_source_highlight (styled_prog, language_c, fullname); > + res =3D try_source_highlight (styled_prog_view, language_c, fullna= me, > + styled_prog); > } > catch (...) > { > @@ -324,10 +345,12 @@ source_cache::ensure (struct symtab *s) > } > } > > - std::string contents; > + std::string_view source_contents; > + std::string source_contents_storage; > try > { > - contents =3D get_plain_source_lines (s, fullname); > + source_contents =3D get_plain_source_lines (s, fullname, > + source_contents_storage); > } > catch (const gdb_exception_error &e) > { > @@ -339,15 +362,17 @@ source_cache::ensure (struct symtab *s) > && m_no_styling_files.count (fullname) =3D=3D 0) > { > bool already_styled > - =3D try_source_highlight (contents, s->language (), fullname); > + =3D try_source_highlight (source_contents, s->language (), fullna= me, > + source_contents_storage); > > if (!already_styled) > { > std::optional ext_contents; > - ext_contents =3D ext_lang_colorize (fullname, contents); > + ext_contents =3D ext_lang_colorize (fullname, source_contents); > if (ext_contents.has_value ()) > { > - contents =3D std::move (*ext_contents); > + source_contents =3D source_contents_storage > + =3D std::move (*ext_contents); > already_styled =3D true; > } > } > @@ -369,7 +394,7 @@ source_cache::ensure (struct symtab *s) > } > } > > - source_text result =3D { std::move (fullname), std::move (contents) }; > + source_text result =3D { std::move (fullname), std::string (source_con= tents) }; > m_source_map.push_back (std::move (result)); > > if (m_source_map.size () > MAX_ENTRIES) > diff --git a/gdb/source-cache.h b/gdb/source-cache.h > index d4cb7d00ae8..84996a6a5a1 100644 > --- a/gdb/source-cache.h > +++ b/gdb/source-cache.h > @@ -80,11 +80,13 @@ class source_cache > std::string contents; > }; > > - /* A helper function for get_source_lines reads a source file. > - Returns the contents of the file; or throws an exception on > - error. This also updates m_offset_cache. */ > - std::string get_plain_source_lines (struct symtab *s, > - const std::string &fullname); > + /* A helper function for get_source_lines that reads a source file. > + Returns a view of the contents of the file using SOURCE_STORAGE > + as necessary (i.e., if the source is stored on disk); or throws > + an exception on error. This also updates m_offset_cache. */ > + std::string_view get_plain_source_lines (struct symtab *s, > + const std::string &fullname, > + std::string &source_storage); > > /* A helper function that the data for the given symtab is entered > into both caches. Returns false on error. */ > diff --git a/gdb/source.c b/gdb/source.c > index bbeb4154258..8bfaffe43a8 100644 > --- a/gdb/source.c > +++ b/gdb/source.c > @@ -684,9 +684,11 @@ info_source_command (const char *ignore, int from_tt= y) > > cust =3D s->compunit (); > gdb_printf (_("Current source file is %s\n"), s->filename); > - if (s->compunit ()->dirname () !=3D NULL) > + if (s->compunit ()->dirname () !=3D nullptr) > gdb_printf (_("Compilation directory is %s\n"), s->compunit ()->dirn= ame ()); > - if (s->fullname) > + if (s->source !=3D nullptr) > + gdb_printf (_("With embedded source.\n")); > + else if (s->fullname) > gdb_printf (_("Located in %s\n"), s->fullname); > const std::vector *offsets; > if (g_source_cache.get_line_charpos (s, &offsets)) > @@ -960,6 +962,19 @@ source_full_path_of (const char *filename, > return 1; > } > > +/* See source.h. */ > + > +char * > +embedded_fullname (const char *dirname, const char *filename) > +{ > + if (dirname !=3D nullptr) > + { > + return concat (dirname, SLASH_STRING, filename, (char *) nullptr); > + } > + > + return xstrdup (filename); > +} > + > /* Return non-zero if RULE matches PATH, that is if the rule can be > applied to PATH. */ > > @@ -1237,27 +1252,35 @@ symtab_to_fullname (struct symtab *s) > /* Use cached copy if we have it. > We rely on forget_cached_source_info being called appropriately > to handle cases like the file being moved. */ > - if (s->fullname =3D=3D NULL) > + if (s->fullname =3D=3D nullptr) > { > - scoped_fd fd =3D open_source_file (s); > - > - if (fd.get () < 0) > + if (s->source) > + s->fullname =3D embedded_fullname (s->compunit ()->dirname (), > + s->filename); > + else > { > - gdb::unique_xmalloc_ptr fullname; > + scoped_fd fd =3D open_source_file (s); > > - /* rewrite_source_path would be applied by find_and_open_source= , we > - should report the pathname where GDB tried to find the file.= */ > + if (fd.get () < 0) > + { > + gdb::unique_xmalloc_ptr fullname; > > - if (s->compunit ()->dirname () =3D=3D nullptr > - || IS_ABSOLUTE_PATH (s->filename)) > - fullname.reset (xstrdup (s->filename)); > - else > - fullname.reset (concat (s->compunit ()->dirname (), SLASH_STR= ING, > - s->filename, (char *) NULL)); > + /* rewrite_source_path would be applied by find_and_open_so= urce, > + we should report the pathname where GDB tried to find th= e > + file. */ > > - s->fullname =3D rewrite_source_path (fullname.get ()).release (= ); > - if (s->fullname =3D=3D NULL) > - s->fullname =3D fullname.release (); > + if (s->compunit ()->dirname () =3D=3D nullptr > + || IS_ABSOLUTE_PATH (s->filename)) > + fullname.reset (xstrdup (s->filename)); > + else > + fullname.reset (concat (s->compunit ()->dirname (), > + SLASH_STRING, s->filename, > + (char *) nullptr)); > + > + s->fullname =3D rewrite_source_path (fullname.get ()).relea= se (); > + if (s->fullname =3D=3D nullptr) > + s->fullname =3D fullname.release (); > + } > } > } > > @@ -1317,12 +1340,17 @@ print_source_lines_base (struct symtab *s, int li= ne, int stopline, > else > { > last_source_visited =3D s; > - scoped_fd desc =3D open_source_file (s); > - last_source_error =3D desc.get () < 0; > - if (last_source_error) > + /* Do not attempt to open a source file for a symtab > + with an embedded source. */ > + if (!s->source) > { > - noprint =3D true; > - errcode =3D -desc.get (); > + scoped_fd desc =3D open_source_file (s); > + last_source_error =3D desc.get () < 0; > + if (last_source_error) > + { > + noprint =3D true; > + errcode =3D -desc.get (); > + } > } > } > } > diff --git a/gdb/source.h b/gdb/source.h > index 144ee48f722..f27caf1a9b5 100644 > --- a/gdb/source.h > +++ b/gdb/source.h > @@ -216,4 +216,8 @@ extern void forget_cached_source_info (void); > need to would make things slower than necessary. */ > extern void select_source_symtab (); > > +/* Compute the fullname for a source file whose source is embedded > + in the dwarf file. */ > +extern char *embedded_fullname (const char *dirname, > + const char *filename); > #endif > diff --git a/gdb/symmisc.c b/gdb/symmisc.c > index 49b9674f77a..f5092de2659 100644 > --- a/gdb/symmisc.c > +++ b/gdb/symmisc.c > @@ -810,9 +810,11 @@ maintenance_info_symtabs (const char *regexp, int fr= om_tty) > gdb_printf ("((struct symtab *) %s)\n", > host_address_to_string (symtab)); > gdb_printf ("\t fullname %s\n", > - symtab->fullname !=3D NULL > + symtab->fullname !=3D nullptr > ? symtab->fullname > : "(null)"); > + if (symtab->source !=3D nullptr) > + gdb_printf ("\t source embedded in DWARF\n"); > gdb_printf ("\t " > "linetable ((struct linetable *) %s)\n", > host_address_to_string > @@ -955,15 +957,20 @@ maintenance_print_one_line_table (struct symtab *sy= mtab, void *data) > gdb_printf (_("compunit_symtab: %s ((struct compunit_symtab *) %s)\n")= , > symtab->compunit ()->name, > host_address_to_string (symtab->compunit ())); > + styled_string_s *styled_symtab_fullname =3D nullptr; > + if (symtab->source) > + styled_symtab_fullname =3D styled_string (metadata_style.style (), > + _("Source embedded in DWARF")= ); > + else > + styled_symtab_fullname =3D styled_string (file_name_style.style (), > + symtab_to_fullname (symtab)); > gdb_printf (_("symtab: %ps ((struct symtab *) %s)\n"), > - styled_string (file_name_style.style (), > - symtab_to_fullname (symtab)), > + styled_symtab_fullname, > host_address_to_string (symtab)); > linetable =3D symtab->linetable (); > gdb_printf (_("linetable: ((struct linetable *) %s):\n"), > host_address_to_string (linetable)); > - > - if (linetable =3D=3D NULL) > + if (linetable =3D=3D nullptr) > gdb_printf (_("No line table.\n")); > else if (linetable->nitems <=3D 0) > gdb_printf (_("Line table has no lines.\n")); > diff --git a/gdb/symtab.h b/gdb/symtab.h > index bf9a3cfb79f..90c2f202390 100644 > --- a/gdb/symtab.h > +++ b/gdb/symtab.h > @@ -1755,6 +1755,11 @@ struct symtab > > const char *filename; > > + /* When the contents of the source file were/are embedded with the > + debugging info, this pointer will refer to that source. Can be null= ptr > + if the source for this source file is on disk. */ > + const char *source; > + > /* Filename for this source file, used as an identifier to link with > related objects such as associated macro_source_file objects. It m= ust > therefore match the name of any macro_source_file object created fo= r this > diff --git a/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c b/gdb/testsuite/g= db.dwarf2/dw2-lnct-source.c > new file mode 100644 > index 00000000000..86cdb9655b7 > --- /dev/null > +++ b/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c > @@ -0,0 +1,24 @@ > +/* Copyright 2024 Free Software Foundation, Inc. > + > + This program is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3 of the License, or > + (at your option) any later version. > + > + This program is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with this program. If not, see .= */ > + > + > +int > +main (void) > +{ /* main prologue = */ > + asm ("main_label: .global main_label"); > + int m =3D 42; /* main assign = m */ > + asm ("main_end: .global main_end"); /* main end */ > + return m; > +} > diff --git a/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp b/gdb/testsuite= /gdb.dwarf2/dw2-lnct-source.exp > new file mode 100644 > index 00000000000..73a06bd7b9c > --- /dev/null > +++ b/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp > @@ -0,0 +1,88 @@ > +# Copyright 2024 Free Software Foundation, Inc. > + > +# This program is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License as published by > +# the Free Software Foundation; either version 3 of the License, or > +# (at your option) any later version. > +# > +# This program is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with this program. If not, see . > + > +# Check that GDB can honor LNCT_llvm_SOURCE. > + > +load_lib dwarf.exp > + > +# This test can only be run on targets which support DWARF-2 and use gas= . > +require dwarf2_support > + > +standard_testfile .c .S > + > +set asm_file [standard_output_file $srcfile2] > + > +set fp [open "${srcdir}/${subdir}/${srcfile}" r] > +set srcfile_data [read $fp] > +close $fp > + > +Dwarf::assemble $asm_file { > + global srcdir subdir srcfile srcfile2 srcfile_data > + declare_labels lines_label > + > + get_func_info main > + > + cu {} { > + compile_unit { > + {language @DW_LANG_C} > + {name missing-file.c} > + {stmt_list ${lines_label} DW_FORM_sec_offset} > + } { > + subprogram { > + {external 1 flag} > + {name main} > + {low_pc $main_start addr} > + {high_pc "$main_start + $main_len" addr} > + } > + } > + } > + > + lines {version 5} lines_label { > + set diridx [include_dir "${srcdir}/${subdir}"] > + file_name "missing-file.c" $diridx "${srcfile_data}" > + > + program { > + DW_LNS_set_file $diridx > + DW_LNE_set_address $main_start > + line [gdb_get_line_number "main prologue"] > + DW_LNS_copy > + > + DW_LNE_set_address main_label > + line [gdb_get_line_number "main assign m"] > + DW_LNS_copy > + > + DW_LNE_set_address main_end > + line [gdb_get_line_number "main end"] > + DW_LNS_copy > + > + DW_LNE_end_sequence > + } > + } > +} > + > +if { [prepare_for_testing "failed to prepare" ${testfile} \ > + [list $srcfile $asm_file] {nodebug}] } { > + return -1 > +} > + > +if ![runto_main] { > + return -1 > +} > + > +set assign_m_line [gdb_get_line_number "main assign m"] > +gdb_test "frame" ".*main \\\(\\\) at \[^\r\n\]*:$assign_m_line\r\n.*" > +gdb_test "maintenance info symtabs missing-file.c" ".*source embedded in= DWARF.*" > +gdb_test "maintenance info line-table missing-file.c" ".*symtab: Source = embedded in DWARF.*" > +gdb_test "info source" ".*With embedded source.*" > diff --git a/gdb/testsuite/lib/dwarf.exp b/gdb/testsuite/lib/dwarf.exp > index d085f835f07..b1f09eab3d3 100644 > --- a/gdb/testsuite/lib/dwarf.exp > +++ b/gdb/testsuite/lib/dwarf.exp > @@ -2423,9 +2423,9 @@ namespace eval Dwarf { > # Add a file name entry to the line table header's file names tab= le. > # > # Return the index by which this entry can be referred to. > - proc file_name {filename diridx} { > + proc file_name {filename diridx { source "" } } { > variable _line_file_names > - lappend _line_file_names $filename $diridx > + lappend _line_file_names $filename $diridx $source > > if { $Dwarf::_line_unit_version >=3D 5 } { > return [expr [llength $_line_file_names] - 1] > @@ -2481,7 +2481,7 @@ namespace eval Dwarf { > } > } > > - _op .byte 2 "file_name_entry_format_count" > + _op .byte 3 "file_name_entry_format_count" > _op .uleb128 1 \ > "file_name_entry_format (content type code: DW_LNCT_p= ath)" > switch $_line_string_form { > @@ -2494,15 +2494,21 @@ namespace eval Dwarf { > "directory_entry_format (form: DW_FORM_line_s= trp)" > } > } > + > _op .uleb128 2 \ > "file_name_entry_format (content type code: DW_LNCT_d= irectory_index)" > _op .uleb128 0x0f \ > "file_name_entry_format (form: DW_FORM_udata)" > > - set nr_files [expr [llength $_line_file_names] / 2] > + _op .uleb128 0x2001 \ > + "file_name_entry_format (content type code: DW_LNCT_L= LVM_SOURCE)" > + _op .uleb128 0x08 \ > + "file_name_entry_format (form: DW_FORM_string)" > + > + set nr_files [expr [llength $_line_file_names] / 3] > _op .byte $nr_files "file_names_count" > > - foreach { filename diridx } $_line_file_names { > + foreach { filename diridx source } $_line_file_names { > switch $_line_string_form { > string { > _op .ascii [_quote $filename] > @@ -2517,6 +2523,7 @@ namespace eval Dwarf { > } > } > _op .uleb128 $diridx > + _op .ascii [_quote [string map { "\"" "\\\"" "\n" "\\= n" } $source]] > } > } else { > foreach dirname $_line_include_dirs { > @@ -2525,7 +2532,7 @@ namespace eval Dwarf { > > _op .byte 0 "Terminator (include_directories)" > > - foreach { filename diridx } $_line_file_names { > + foreach { filename diridx source } $_line_file_names { > _op .ascii [_quote $filename] > _op .sleb128 $diridx > _op .sleb128 0 "mtime" > diff --git a/include/dwarf2.h b/include/dwarf2.h > index b3d3731ee83..3823c041bab 100644 > --- a/include/dwarf2.h > +++ b/include/dwarf2.h > @@ -289,6 +289,11 @@ enum dwarf_line_number_content_type > DW_LNCT_size =3D 0x4, > DW_LNCT_MD5 =3D 0x5, > DW_LNCT_lo_user =3D 0x2000, > + /* LLVM has implemented DW_LNCT_source (see > + https://dwarfstd.org/issues/180201.1.html) as DW_LNCT_LLVM_SOURCE= as > + a vendor extension until the DWARF standard is updated (see > + https://github.com/llvm/llvm-project/blob/08bb121835be432ac52372f= 92845950628ce9a4a/llvm/include/llvm/BinaryFormat/Dwarf.def#L1080 . */ > + DW_LNCT_LLVM_SOURCE =3D 0x2001, > DW_LNCT_hi_user =3D 0x3fff > }; > > -- > 2.44.0 >