public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
* PR30000 - introduce new srcfiles tool
@ 2023-09-28 14:59 Housam Alamour
  2023-10-23 14:08 ` Frank Ch. Eigler
  2023-10-23 22:43 ` Aaron Merey
  0 siblings, 2 replies; 5+ messages in thread
From: Housam Alamour @ 2023-09-28 14:59 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 15051 bytes --]

Hi,

Here is the new srcfiles tool ready for review.

Author: Housam Alamour <halamour@redhat.com>
Date:   Thu Sep 7 14:29:19 2023 -0400

    PR 30000: debuginfod-find should have a source-list verb

    * seclines.cxx: Introduce new tool that compiles a list of source
    files associated with a specified dwarf/elf file. This
    compilation relies on searching the dwarf debug information,
    which can be automatically retrieved via debuginfod when required.
    The target file can encompass various types, such as an executable,
    a coredump, a running process, or the currently executing kernel.
    The source file names are rendered as unique entries and then
    displayed on the standard output.

    *  run-srcfiles-self.sh: New test-case for tool.

    https://sourceware.org/bugzilla/show_bug.cgi?id=30000

    Signed-off-by: Housam Alamour <halamour@redhat.com>

diff --git a/doc/srcfiles.1 b/doc/srcfiles.1
new file mode 100644
index 00000000..245c2d27
--- /dev/null
+++ b/doc/srcfiles.1
@@ -0,0 +1,125 @@
+.\" Copyright 2023 Red Hat Inc.
+.\" Mon 2023-Sept 23 Housam Alamour <halamour@redhat.com>
+.\" Contact elfutils-devel@sourceware.org to correct errors or typos.
+.TH EU-SRCFILES 1 "2023-Sept-25" "elfutils"
+
+.de SAMPLE
+.br
+.RS 0
+.nf
+.nh
+\fB
+..
+.de ESAMPLE
+\fP
+.hy
+.fi
+.RE
+..
+
+.SH "NAME"
+eu-srcfiles \- Lists the source files of a dwarf/elf file.
+
+.SH "SYNOPSIS"
+eu-srcfiles [\fB\-0\fR|\fB\-\-null\fR] [\fB\-c\fR|\fB\-\-cu\-only\fR]
[\fB\-v\fR|\fB\-\-verbose\fR] INPUT
+
+.SH "DESCRIPTION"
+\fBeu-srcfiles\fR lists the source files of a given \s-1dwarf/elf\s0
+file.  This list is based on a search of the dwarf debuginfo, which
+may be automatically fetched by debuginfod if applicable.  The target
+file may be an executable, a coredump, a process, or even the running
+kernel.  The default is the file 'a.out'.  The source file names are
+made unique and printed to standard output.
+
+.SH "OPTIONS"
+The long and short forms of options, shown here as alternatives, are
+equivalent.
+
+.SS "Input Options"
+
+Only one INPUT option may be used.  The default is '-e a.out'.
+
+.TP
+\fB\-e FILE\fR, \fB\-\-executable=FILE\fR
+Find source files for FILE.
+
+.TP
+\fB\-\-core=COREFILE\fR
+Find source files for the executable plus all shared libraries loaded
+into the coredumped process.
+
+.TP
+\fB\-p PID\fR, \fB\-\-pid=PID\fR
+Find source files for the executable plus all shared libraries loaded
+into running process PID.
+
+.TP
+\fB\-M FILE\fR, \fB\-\-linux\-process\-map=FILE\fR
+Find source files for the executable plus all shared libraries loaded
+into a process, with given map file in linux /proc/PID/maps format.
+The file names listed in the map file must exist and be current on the
+system.
+
+.TP
+\fB\-K, \-\-offline\-kernel[=RELEASE]
+Find source files for the kernel plus all installed modules
+with the given RELEASE version.  This may be very slow if
+dwarf is found via debuginfod.
+
+.SS "Output Options"
+
+.TP
+\fB\-0, \-\-null\fR
+Separate items by a null instead of a newline.
+
+.TP
+\fB\-c, \-\-cu\-only\fR
+Only list the CU names.
+
+.TP
+\fB\-v, \-\-verbose\fR
+Increase verbosity of logging messages to stderr.
+
+
+.SH EXAMPLES
+
+List all source files for a binary.
+.SAMPLE
+eu-srcfiles -e /bin/ls
+.ESAMPLE
+
+List all compilation units (CU) names for a given process (including
shared libraries).
+.SAMPLE
+eu-srcfiles -c -p $$
+.ESAMPLE
+
+List source files of a binary based on its buildid, using debuginfod.
+.SAMPLE
+binary=`debuginfod-find executable
9c22d8d9e42bd051ffdc1064fdfd456ba781c629`
+eu-srcfiles -c -e $binary
+.ESAMPLE
+
+Show the source code of the first CU of a shared library.
+.SAMPLE
+binary=/usr/lib64/libc.so.6
+srcfile=`eu-srcfiles -c -e $binary | head -1`
+cat `debuginfod-find source $binary $srcfile`
+.ESAMPLE
+
+List the source files of a kernel image.
+.SAMPLE
+eu-srcfiles -e /boot/vmlinuz-`uname -r`
+.ESAMPLE
+
+
+.SH "AUTHOR"
+Written by Housam Alamour.
+
+.SH "REPORTING BUGS"
+Please reports bugs at https://sourceware.org/bugzilla/
+
+.SH "COPYRIGHT"
+Copyright (c) 2023 Red Hat Inc.  License GPLv3+: GNU GPL version 3 or
+later <https://gnu.org/licenses/gpl.html>.  This is free software: you
+are free to change and redistribute it.  There is NO WARRANTY, to the
+extent permitted by law.
diff --git a/lib/printversion.h b/lib/printversion.h
index 4154b328..6811aa5b 100644
--- a/lib/printversion.h
+++ b/lib/printversion.h
@@ -32,6 +32,9 @@
 #include <argp.h>
 #include <stdio.h>

+#ifdef __cplusplus
+extern "C" {
+#endif
 /* Defined in version.c.  Common ARGP_PROGRAM_VERSION_HOOK_DEF.  */
 void print_version (FILE *stream, struct argp_state *state);

@@ -49,4 +52,8 @@ extern const char *const apba__;
 #define ARGP_PROGRAM_BUG_ADDRESS_DEF \
   const char *const apba__ __asm ("argp_program_bug_address")

+#ifdef __cplusplus
+}
+#endif
+
 #endif // PRINTVERSION_H
diff --git a/src/Makefile.am b/src/Makefile.am
index 10d59a48..d3d9d408 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -27,7 +27,7 @@ AM_LDFLAGS = -Wl,-rpath-link,../libelf:../libdw
$(STACK_USAGE_NO_ERROR)

 bin_PROGRAMS = readelf nm size strip elflint findtextrel addr2line \
        elfcmp objdump ranlib strings ar unstrip stack elfcompress \
-       elfclassify
+       elfclassify srcfiles

 noinst_LIBRARIES = libar.a

@@ -84,6 +84,8 @@ unstrip_LDADD = $(libebl) $(libelf) $(libdw) $(libeu)
$(argp_LDADD)
 stack_LDADD = $(libebl) $(libelf) $(libdw) $(libeu) $(argp_LDADD)
$(demanglelib)
 elfcompress_LDADD = $(libebl) $(libelf) $(libdw) $(libeu) $(argp_LDADD)
 elfclassify_LDADD = $(libelf) $(libdw) $(libeu) $(argp_LDADD)
+srcfiles_SOURCES = srcfiles.cxx
+srcfiles_LDADD = $(libdw) $(libelf) $(libeu)  $(argp_LDADD)

 installcheck-binPROGRAMS: $(bin_PROGRAMS)
  bad=0; pid=$$$$; list="$(bin_PROGRAMS)"; for p in $$list; do \
diff --git a/src/srcfiles.cxx b/src/srcfiles.cxx
new file mode 100644
index 00000000..c948269d
--- /dev/null
+++ b/src/srcfiles.cxx
@@ -0,0 +1,217 @@
+/* Print the source files of a given ELF file.
+   Copyright (C) 2023 Red Hat, Inc.
+   This file is part of elfutils.
+   Written by Housam Alamour <alamourh@redhat.com>.
+
+   This file is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   elfutils is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
+
+#include "printversion.h"
+#include <dwarf.h>
+#include <argp.h>
+#include <cstring>
+#include <set>
+#include <string>
+#include <cassert>
+#include <config.h>
+
+#include <libdwfl.h>
+#include <fcntl.h>
+#include <iostream>
+#include <libdw.h>
+
+using namespace std;
+
+/* Name and version of program.  */
+ARGP_PROGRAM_VERSION_HOOK_DEF = print_version;
+
+/* Bug report address.  */
+ARGP_PROGRAM_BUG_ADDRESS_DEF = PACKAGE_BUGREPORT;
+
+/* Definitions of arguments for argp functions.  */
+static const struct argp_option options[] =
+{
+  { "null", '0', NULL, 0,
+    N_ ("items are separated by a null, not whitespace."), 0 },
+  { "verbose", 'v', NULL, 0,
+    N_ ("Increase verbosity of logging messages."), 0 },
+  { "cu-only", 'c', NULL, 0, N_ ("Only list the CU names."), 0 },
+  { NULL, 0, NULL, 0, NULL, 0 }
+};
+
+/* Short description of program.  */
+static const char doc[] = N_("Show source files of given ELF FILE.");
+
+/* Strings for arguments in help texts.  */
+static const char args_doc[] = N_("[FILE]");
+
+/* Prototype for option handler.  */
+static error_t parse_opt (int key, char *arg, struct argp_state *state);
+
+static struct argp_child argp_children[2]; /* [0] is set in main.  */
+
+/* Data structure to communicate with argp functions.  */
+static const struct argp argp =
+{
+  options, parse_opt, args_doc, doc, argp_children, NULL, NULL
+};
+
+/* Verbose message printing. */
+static bool verbose;
+/* Delimit the output with nulls. */
+static bool null_arg;
+/* Only print compilation unit names. */
+static bool CU_only;
+
+/* Handle program arguments. */
+static error_t
+parse_opt (int key, char *arg, struct argp_state *state)
+{
+  /* Suppress "unused parameter" warning. */
+  (void)arg;
+  switch (key)
+    {
+    case ARGP_KEY_INIT:
+      state->child_inputs[0] = state->input;
+      break;
+
+    case '0':
+      null_arg = true;
+      break;
+
+    case 'v':
+      verbose = true;
+      break;
+
+    case 'c':
+      CU_only = true;
+      break;
+
+    default:
+      return ARGP_ERR_UNKNOWN;
+    }
+  return 0;
+}
+
+
+// Global list of collected source files.  Normally, it'll contain
+// the sources of just one named binary, but the '-K' option can cause
+// multiple dwfl modules to be loaded, thus listed.
+set<string> debug_sourcefiles;
+
+static int
+collect_sourcefiles (Dwfl_Module *dwflmod,
+                     void **userdata __attribute__ ((unused)),
+                     const char *name __attribute__ ((unused)),
+                     Dwarf_Addr base __attribute__ ((unused)),
+                     void *arg __attribute__ ((unused)))
+{
+  Dwarf *dbg;
+  Dwarf_Addr bias; // ignored - for addressing purposes only
+
+  dbg = dwfl_module_getdwarf (dwflmod, &bias);
+
+  Dwarf_Off offset = 0;
+  Dwarf_Off old_offset;
+  size_t hsize;
+
+  // Traverse all CUs of this module.
+  while (dwarf_nextcu (dbg, old_offset = offset, &offset, &hsize, NULL,
NULL, NULL) == 0)
+    {
+      Dwarf_Die cudie_mem;
+      Dwarf_Die *cudie = dwarf_offdie (dbg, old_offset + hsize,
&cudie_mem);
+
+      if (cudie == NULL)
+        continue;
+
+      const char *cuname = dwarf_diename (cudie) ?: "<unknown>";
+      Dwarf_Files *files;
+      size_t nfiles;
+      if (dwarf_getsrcfiles (cudie, &files, &nfiles) != 0)
+        continue;
+
+      // extract DW_AT_comp_dir to resolve relative file names
+      const char *comp_dir = "";
+      const char *const *dirs;
+      size_t ndirs;
+
+      if (dwarf_getsrcdirs (files, &dirs, &ndirs) == 0 && dirs[0] != NULL)
+        comp_dir = dirs[0];
+      if (comp_dir == NULL)
+        comp_dir = "";
+
+      if (verbose)
+        std::clog << "searching for sources for cu=" << cuname
+                  << " comp_dir=" << comp_dir << " #files=" << nfiles
+                  << " #dirs=" << ndirs << endl;
+
+      for (size_t f = 1; f < nfiles; f++)
+        {
+          const char *hat;
+          if (CU_only)
+          {
+            if (strcmp(cuname, "<unknown>") == 0 || strcmp(cuname,
"<artificial>") == 0 )
+              continue;
+            hat = cuname;
+          }
+          else
+            hat = dwarf_filesrc (files, f, NULL, NULL);
+
+          if (hat == NULL)
+            continue;
+
+          if (string(hat).find("<built-in>")
+              != std::string::npos) // gcc intrinsics, don't bother record
+            continue;
+
+          string waldo;
+          if (hat[0] == '/') // absolute
+            waldo = (string (hat));
+          else if (comp_dir[0] != '\0') // comp_dir relative
+            waldo = (string (comp_dir) + string ("/") + string (hat));
+          debug_sourcefiles.insert (waldo);
+        }
+    }
+
+  return DWARF_CB_OK;
+}
+
+
+int
+main (int argc, char *argv[])
+{
+  int remaining;
+
+  /* Parse and process arguments.  This includes opening the modules.  */
+  argp_children[0].argp = dwfl_standard_argp ();
+  argp_children[0].group = 1;
+
+  Dwfl *dwfl = NULL;
+  (void) argp_parse (&argp, argc, argv, 0, &remaining, &dwfl);
+  assert (dwfl != NULL);
+  // Process all loaded modules - probably just one, except if -K or -p is
used.
+  (void) dwfl_getmodules (dwfl, &collect_sourcefiles, NULL, 0);
+
+  if (!debug_sourcefiles.empty ())
+    for (const string &element : debug_sourcefiles)
+      {
+        std::cout << element;
+        if (null_arg)
+          std::cout << '\0';
+        else
+          std::cout << '\n';
+      }
+
+  dwfl_end (dwfl);
+  return 0;
+}
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 32b18e6e..d0ae7c6d 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -131,6 +131,7 @@ TESTS = run-arextract.sh run-arsymtest.sh run-ar.sh
newfile test-nlist \
  run-find-prologues.sh run-allregs.sh run-addrcfi.sh \
  run-dwarfcfi.sh run-nm-syms.sh \
  run-nm-self.sh run-readelf-self.sh run-readelf-info-plus.sh \
+ run-srcfiles-self.sh \
  run-readelf-compressed.sh \
  run-readelf-const-values.sh \
  run-varlocs-self.sh run-exprlocs-self.sh \
@@ -323,7 +324,8 @@ EXTRA_DIST = run-arextract.sh run-arsymtest.sh
run-ar.sh \
      run-addrscopes.sh run-strings-test.sh run-funcscopes.sh \
      run-nm-syms.sh testfilesyms32.bz2 testfilesyms64.bz2 \
      run-nm-self.sh run-readelf-self.sh run-readelf-info-plus.sh \
-     run-readelf-compressed.sh \
+     run-srcfiles-self.sh \
+ run-readelf-compressed.sh \
      run-readelf-compressed-zstd.sh \
      run-readelf-const-values.sh testfile-const-values.debug.bz2 \
      run-addrcfi.sh run-dwarfcfi.sh \
diff --git a/tests/run-srcfiles-self.sh b/tests/run-srcfiles-self.sh
new file mode 100755
index 00000000..075ff7d2
--- /dev/null
+++ b/tests/run-srcfiles-self.sh
@@ -0,0 +1,45 @@
+#! /bin/sh
+# Copyright (C) 2023 Red Hat, Inc.
+# This file is part of elfutils.
+#
+# This file is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# elfutils is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. $srcdir/test-subr.sh
+
+# Test different command line combinations on the srcfiles binary itself.
+ET_EXEC="${abs_top_builddir}/src/srcfiles"
+ET_PID=$$
+
+SRC_NAME="srcfiles.cxx"
+
+for null_arg in --null ""; do
+  for verbose_arg in --verbose ""; do
+    testrun $ET_EXEC $null_arg $verbose_arg -p $ET_PID > /dev/null
+
+    # Ensure that the output contains srclines.cxx
+    cu_only=$(testrun $ET_EXEC $null_arg $verbose_arg -c -e $ET_EXEC)
+    default=$(testrun $ET_EXEC $null_arg $verbose_arg -e $ET_EXEC)
+    result1=$(echo "$cu_only" | grep "$SRC_NAME")
+    result2=$(echo "$default" | grep "$SRC_NAME")
+    if [ -z "$result1" ] || [-z "$result2"]; then
+      exit 1
+    fi
+
+    # Ensure that the output with the cu-only option contains less source
files
+    if [ "$cu_only" -gt "$default" ]; then
+      exit 1
+    fi
+  done
+done
+

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-10-24 19:59 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-28 14:59 PR30000 - introduce new srcfiles tool Housam Alamour
2023-10-23 14:08 ` Frank Ch. Eigler
2023-10-23 22:43 ` Aaron Merey
2023-10-24 17:34   ` [PATCH] Fixed formatting of man page and --help options and added additional test case to ensure the correct source file is output Housam Alamour
2023-10-24 19:59     ` Aaron Merey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).