public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
From: Mark Wielaard <mark@klomp.org>
To: Khem Raj <raj.khem@gmail.com>, elfutils-devel@sourceware.org
Subject: Re: [PATCH] Add helper function for basename
Date: Thu, 14 Dec 2023 19:07:51 +0100	[thread overview]
Message-ID: <dcf67d617aa423b5c68b2314d9cf95a4597cd713.camel@klomp.org> (raw)
In-Reply-To: <20231210202033.151578-1-raj.khem@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1501 bytes --]

Hi Khem,

On Sun, 2023-12-10 at 12:20 -0800, Khem Raj wrote:
> musl does not provide GNU version of basename and lately have removed
> the definiton from string.h [1] which exposes this problem. It can be
> made to work by providing a local implementation of basename which
> implements the GNU basename behavior, this makes it work across C
> libraries which have POSIX implementation only.
> 
> Upstream-Status: Pending
> [1] https://git.musl-libc.org/cgit/musl/commit/?id=725e17ed6dff4d0cd22487bb64470881e86a92e7
> Signed-off-by: Khem Raj <raj.khem@gmail.com>

Our discussion showed we really need this if we want to support musl
(or any other alternative libc without the string.h basename variant).
I would have liked a configure check, but old musl makes that kind of
impossible. So I agree we should use our own implementation.

I did structure it slightly differently though. Instead of adding it to
libeu I added it to system.h as static inline function. And I poisoned
the basename symbol. That found two other places where basename was
used (and now replaced by xbasename). Sadly it means we have to rename
a variable in debuginfod.cxx from basename to filename, but I think
that is acceptable.

I don't like the const cast away in ar.c, but that seems necessary
because we are using search.h and that interface just takes non-cast
char pointers (even though they really are const).

What do you think of the attached variant of your patch?

Thanks,

Mark

[-- Attachment #2: Type: text/x-patch, Size: 14732 bytes --]

From 3dfbc0f4381c835004544f7c5ee596845bb17044 Mon Sep 17 00:00:00 2001
From: Khem Raj <raj.khem@gmail.com>
Date: Sun, 10 Dec 2023 12:20:33 -0800
Subject: [PATCH] Add helper function for basename

musl does not provide GNU version of basename and lately have removed
the definiton from string.h [1] which exposes this problem. It can be
made to work by providing a local implementation of basename which
implements the GNU basename behavior, this makes it work across C
libraries which have POSIX implementation only.

[1] https://git.musl-libc.org/cgit/musl/commit/?id=725e17ed6dff4d0cd22487bb64470881e86a92e7

    * lib/system.h (xbasename): New static inline functions.
    Poison basename.
    * libdw/dwarf_getsrc_file.c (dwarf_getsrc_file): Use xbasename.
    * libdwfl/core-file.c (dwfl_core_file_report): Likewise.
    * libdwfl/dwfl_module_getsrc_file.c (dwfl_module_getsrc_file):
    Likewise.
    * libdwfl/dwfl_segment_report_module.c (dwfl_segment_report_module):
    Likewise.
    * libdwfl/find-debuginfo.c (find_debuginfo_in_path): Likewise.
    * libdwfl/link_map.c (report_r_debug): Likewise.
    * libdwfl/linux-kernel-modules.c (try_kernel_name): Likewise.
    * src/addr2line.c (print_dwarf_function): Likewise.
    (print_src): Likewise.
    * src/ar.c (do_oper_insert): Likewise.
    And cast away const in entry.key assignment.
    * src/nm.c (show_symbols): Use xbasename.
    * src/stack.c (module_callback): Likewise.
    * src/strip.c (handle_elf): Likewise.
    * tests/show-die-info.c: Include system.h.
    (dwarf_tag_string): Use xbasename.
    * tests/varlocs.c: Likewise.
    * debuginfod/debuginfod.cxx: Move include system.h to the end.
    (register_file_name): Rename basename to filename.

Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: Mark Wielaard <mark@klomp.org>
---
 debuginfod/debuginfod.cxx            | 22 +++++++++++-----------
 lib/system.h                         | 14 ++++++++++++++
 libdw/dwarf_getsrc_file.c            |  2 +-
 libdwfl/core-file.c                  |  2 +-
 libdwfl/dwfl_module_getsrc_file.c    |  2 +-
 libdwfl/dwfl_segment_report_module.c |  4 ++--
 libdwfl/find-debuginfo.c             |  6 +++---
 libdwfl/link_map.c                   |  2 +-
 libdwfl/linux-kernel-modules.c       |  2 +-
 src/addr2line.c                      |  4 ++--
 src/ar.c                             |  4 ++--
 src/nm.c                             |  4 ++--
 src/stack.c                          |  2 +-
 src/strip.c                          |  2 +-
 tests/show-die-info.c                |  3 ++-
 tests/varlocs.c                      |  2 +-
 16 files changed, 46 insertions(+), 31 deletions(-)

diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index c11aeda1..524be948 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -50,11 +50,6 @@ extern "C" {
 }
 #endif
 
-extern "C" {
-#include "printversion.h"
-#include "system.h"
-}
-
 #include "debuginfod.h"
 #include <dwarf.h>
 
@@ -135,6 +130,11 @@ using namespace std;
 #define tid() pthread_self()
 #endif
 
+extern "C" {
+#include "printversion.h"
+#include "system.h"
+}
+
 
 inline bool
 string_endswith(const string& haystack, const string& needle)
@@ -3194,16 +3194,16 @@ register_file_name(sqlite_ps& ps_upsert_fileparts,
                    const string& name)
 {
   std::size_t slash = name.rfind('/');
-  string dirname, basename;
+  string dirname, filename;
   if (slash == std::string::npos)
     {
       dirname = "";
-      basename = name;
+      filename = name;
     }
   else
     {
       dirname = name.substr(0, slash);
-      basename = name.substr(slash+1);
+      filename = name.substr(slash+1);
     }
 
   // intern the two substrings
@@ -3213,21 +3213,21 @@ register_file_name(sqlite_ps& ps_upsert_fileparts,
     .step_ok_done();
   ps_upsert_fileparts
     .reset()
-    .bind(1, basename)
+    .bind(1, filename)
     .step_ok_done();
 
   // intern the tuple
   ps_upsert_file
     .reset()
     .bind(1, dirname)
-    .bind(2, basename)
+    .bind(2, filename)
     .step_ok_done();
 
   // look up the tuple's id
   ps_lookup_file
     .reset()
     .bind(1, dirname)
-    .bind(2, basename);
+    .bind(2, filename);
   int rc = ps_lookup_file.step();
   if (rc != SQLITE_ROW) throw sqlite_exception(rc, "step");
   
diff --git a/lib/system.h b/lib/system.h
index 1c914efc..0db12d99 100644
--- a/lib/system.h
+++ b/lib/system.h
@@ -1,6 +1,7 @@
 /* Declarations for common convenience functions.
    Copyright (C) 2006-2011 Red Hat, Inc.
    Copyright (C) 2022 Mark J. Wielaard <mark@klomp.org>
+   Copyright (C) 2023 Khem Raj.
    This file is part of elfutils.
 
    This file is free software; you can redistribute it and/or modify
@@ -211,4 +212,17 @@ extern char *__cxa_demangle (const char *mangled_name, char *output_buffer,
   extern int never_defined_just_used_for_checking[(expr) ? 1 : -1]	\
     __attribute__ ((unused))
 
+/* We really want a basename implementation that doesn't modify the
+   input argument.  Normally you get that from string.h with _GNU_SOURCE
+   define.  But some libc implementations don't define it and other
+   define it, but provide an implementation that still modifies the
+   argument.  So define our own and poison a bare basename symbol.  */
+static inline const char *
+xbasename(const char *s)
+{
+  const char *p = strrchr(s, '/');
+  return p ? p+1 : s;
+}
+#pragma GCC poison basename
+
 #endif /* system.h */
diff --git a/libdw/dwarf_getsrc_file.c b/libdw/dwarf_getsrc_file.c
index 5289c7da..03da431c 100644
--- a/libdw/dwarf_getsrc_file.c
+++ b/libdw/dwarf_getsrc_file.c
@@ -98,7 +98,7 @@ dwarf_getsrc_file (Dwarf *dbg, const char *fname, int lineno, int column,
 	      /* Match the name with the name the user provided.  */
 	      const char *fname2 = line->files->info[lastfile].name;
 	      if (is_basename)
-		lastmatch = strcmp (basename (fname2), fname) == 0;
+		lastmatch = strcmp (xbasename (fname2), fname) == 0;
 	      else
 		lastmatch = strcmp (fname2, fname) == 0;
 	    }
diff --git a/libdwfl/core-file.c b/libdwfl/core-file.c
index 87c940cb..89527d23 100644
--- a/libdwfl/core-file.c
+++ b/libdwfl/core-file.c
@@ -595,7 +595,7 @@ dwfl_core_file_report (Dwfl *dwfl, Elf *elf, const char *executable)
       if (! __libdwfl_dynamic_vaddr_get (module->elf, &file_dynamic_vaddr))
 	continue;
       Dwfl_Module *mod;
-      mod = __libdwfl_report_elf (dwfl, basename (module->name), module->name,
+      mod = __libdwfl_report_elf (dwfl, xbasename (module->name), module->name,
 				  module->fd, module->elf,
 				  module->l_ld - file_dynamic_vaddr,
 				  true, true);
diff --git a/libdwfl/dwfl_module_getsrc_file.c b/libdwfl/dwfl_module_getsrc_file.c
index 513af6b8..fc144225 100644
--- a/libdwfl/dwfl_module_getsrc_file.c
+++ b/libdwfl/dwfl_module_getsrc_file.c
@@ -103,7 +103,7 @@ dwfl_module_getsrc_file (Dwfl_Module *mod,
 		{
 		  /* Match the name with the name the user provided.  */
 		  lastfile = file;
-		  lastmatch = !strcmp (is_basename ? basename (file) : file,
+		  lastmatch = !strcmp (is_basename ? xbasename (file) : file,
 				       fname);
 		}
 	    }
diff --git a/libdwfl/dwfl_segment_report_module.c b/libdwfl/dwfl_segment_report_module.c
index 09ee37b3..dc34e0ae 100644
--- a/libdwfl/dwfl_segment_report_module.c
+++ b/libdwfl/dwfl_segment_report_module.c
@@ -718,7 +718,7 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const char *name,
 	      bias += fixup;
 	      if (module->name[0] != '\0')
 		{
-		  name = basename (module->name);
+		  name = xbasename (module->name);
 		  name_is_final = true;
 		}
 	      break;
@@ -743,7 +743,7 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const char *name,
 		   prevents premature closure of the correct ELF in cases
 		   where segments of a module are non-contiguous in memory.  */
 		if (name != NULL && module->name[0] != '\0'
-		    && strcmp (basename (module->name), basename (name)) == 0)
+		    && strcmp (xbasename (module->name), xbasename (name)) == 0)
 		  {
 		    elf_end (module->elf);
 		    close (module->fd);
diff --git a/libdwfl/find-debuginfo.c b/libdwfl/find-debuginfo.c
index 7f7ab632..b358c774 100644
--- a/libdwfl/find-debuginfo.c
+++ b/libdwfl/find-debuginfo.c
@@ -164,7 +164,7 @@ find_debuginfo_in_path (Dwfl_Module *mod, const char *file_name,
 {
   bool cancheck = debuglink_crc != (GElf_Word) 0;
 
-  const char *file_basename = file_name == NULL ? NULL : basename (file_name);
+  const char *file_basename = file_name == NULL ? NULL : xbasename (file_name);
   char *localname = NULL;
 
   /* We invent a debuglink .debug name if NULL, but then want to try the
@@ -278,7 +278,7 @@ find_debuginfo_in_path (Dwfl_Module *mod, const char *file_name,
 	  else
 	    {
 	      subdir = NULL;
-	      file = basename (debuglink_file);
+	      file = xbasename (debuglink_file);
 	    }
 	  try_file_basename = debuglink_null;
 	  break;
@@ -306,7 +306,7 @@ find_debuginfo_in_path (Dwfl_Module *mod, const char *file_name,
 	    if (mod->dw != NULL && (p[0] == '\0' || p[0] == '/'))
 	      {
 		fd = try_open (&main_stat, dir, ".dwz",
-			       basename (file), &fname);
+			       xbasename (file), &fname);
 		if (fd < 0)
 		  {
 		    if (errno != ENOENT && errno != ENOTDIR)
diff --git a/libdwfl/link_map.c b/libdwfl/link_map.c
index 76f23354..a6c66c78 100644
--- a/libdwfl/link_map.c
+++ b/libdwfl/link_map.c
@@ -475,7 +475,7 @@ report_r_debug (uint_fast8_t elfclass, uint_fast8_t elfdata,
 		      if (r_debug_info_module == NULL)
 			{
 			  // XXX hook for sysroot
-			  mod = __libdwfl_report_elf (dwfl, basename (name),
+			  mod = __libdwfl_report_elf (dwfl, xbasename (name),
 						      name, fd, elf, base,
 						      true, true);
 			  if (mod != NULL)
diff --git a/libdwfl/linux-kernel-modules.c b/libdwfl/linux-kernel-modules.c
index 58c0c417..e9faba26 100644
--- a/libdwfl/linux-kernel-modules.c
+++ b/libdwfl/linux-kernel-modules.c
@@ -116,7 +116,7 @@ try_kernel_name (Dwfl *dwfl, char **fname, bool try_debug)
 	/* Try the file's unadorned basename as DEBUGLINK_FILE,
 	   to look only for "vmlinux" files.  */
 	fd = INTUSE(dwfl_standard_find_debuginfo) (&fakemod, NULL, NULL, 0,
-						   *fname, basename (*fname),
+						   *fname, xbasename (*fname),
 						   0, &fakemod.debug.name);
 
       if (fakemod.debug.name != NULL)
diff --git a/src/addr2line.c b/src/addr2line.c
index d902d791..d87e5b45 100644
--- a/src/addr2line.c
+++ b/src/addr2line.c
@@ -385,7 +385,7 @@ print_dwarf_function (Dwfl_Module *mod, Dwarf_Addr addr)
 		  if (file == NULL)
 		    file = "???";
 		  else if (only_basenames)
-		    file = basename (file);
+		    file = xbasename (file);
 		  else if (use_comp_dir && file[0] != '/')
 		    {
 		      const char *const *dirs;
@@ -568,7 +568,7 @@ print_src (const char *src, int lineno, int linecol, Dwarf_Die *cu)
   const char *comp_dir_sep = "";
 
   if (only_basenames)
-    src = basename (src);
+    src = xbasename (src);
   else if (use_comp_dir && src[0] != '/')
     {
       Dwarf_Attribute attr;
diff --git a/src/ar.c b/src/ar.c
index 3bcb18fe..e6d6d58f 100644
--- a/src/ar.c
+++ b/src/ar.c
@@ -1133,7 +1133,7 @@ do_oper_insert (int oper, const char *arfname, char **argv, int argc,
       for (int cnt = 0; cnt < argc; ++cnt)
 	{
 	  ENTRY entry;
-	  entry.key = full_path ? argv[cnt] : basename (argv[cnt]);
+	  entry.key = full_path ? argv[cnt] : (char*)xbasename (argv[cnt]);
 	  entry.data = &argv[cnt];
 	  if (hsearch (entry, ENTER) == NULL)
 	    error_exit (errno, _("cannot insert into hash table"));
@@ -1242,7 +1242,7 @@ do_oper_insert (int oper, const char *arfname, char **argv, int argc,
       /* Open all the new files, get their sizes and add all symbols.  */
       for (int cnt = 0; cnt < argc; ++cnt)
 	{
-	  const char *bname = basename (argv[cnt]);
+	  const char *bname = xbasename (argv[cnt]);
 	  size_t bnamelen = strlen (bname);
 	  if (found[cnt] == NULL)
 	    {
diff --git a/src/nm.c b/src/nm.c
index fbdee8e1..3675f59b 100644
--- a/src/nm.c
+++ b/src/nm.c
@@ -1417,7 +1417,7 @@ show_symbols (int fd, Ebl *ebl, GElf_Ehdr *ehdr,
 			  int lineno;
 			  (void) dwarf_lineno (line, &lineno);
 			  const char *file = dwarf_linesrc (line, NULL, NULL);
-			  file = (file != NULL) ? basename (file) : "???";
+			  file = (file != NULL) ? xbasename (file) : "???";
 			  int n;
 			  n = obstack_printf (&whereob, "%s:%d%c", file,
 					      lineno, '\0');
@@ -1448,7 +1448,7 @@ show_symbols (int fd, Ebl *ebl, GElf_Ehdr *ehdr,
 		{
 		  /* We found the line.  */
 		  int n = obstack_printf (&whereob, "%s:%" PRIu64 "%c",
-					  basename ((*found)->file),
+					  xbasename ((*found)->file),
 					  (*found)->lineno,
 					  '\0');
 		  sym_mem[nentries_used].where = obstack_finish (&whereob);
diff --git a/src/stack.c b/src/stack.c
index 534aa93c..f4c5ba8c 100644
--- a/src/stack.c
+++ b/src/stack.c
@@ -152,7 +152,7 @@ module_callback (Dwfl_Module *mod, void **userdata __attribute__((unused)),
 
   int width = get_addr_width (mod);
   printf ("0x%0*" PRIx64 "-0x%0*" PRIx64 " %s\n",
-	  width, start, width, end, basename (name));
+	  width, start, width, end, xbasename (name));
 
   const unsigned char *id;
   GElf_Addr id_vaddr;
diff --git a/src/strip.c b/src/strip.c
index 7f4788b3..6436443d 100644
--- a/src/strip.c
+++ b/src/strip.c
@@ -1800,7 +1800,7 @@ handle_elf (int fd, Elf *elf, const char *prefix, const char *fname,
 		      elf_errmsg (-1));
 	}
 
-      char *debug_basename = basename (debug_fname_embed ?: debug_fname);
+      const char *debug_basename = xbasename (debug_fname_embed ?: debug_fname);
       off_t crc_offset = strlen (debug_basename) + 1;
       /* Align to 4 byte boundary */
       crc_offset = ((crc_offset - 1) & ~3) + 4;
diff --git a/tests/show-die-info.c b/tests/show-die-info.c
index 1a3191cd..bda459a5 100644
--- a/tests/show-die-info.c
+++ b/tests/show-die-info.c
@@ -27,6 +27,7 @@
 #include <unistd.h>
 
 #include "../libdw/known-dwarf.h"
+#include "../lib/system.h"
 
 static const char *
 dwarf_tag_string (unsigned int tag)
@@ -318,7 +319,7 @@ main (int argc, char *argv[])
       int fd = open (argv[cnt], O_RDONLY);
       Dwarf *dbg;
 
-      printf ("file: %s\n", basename (argv[cnt]));
+      printf ("file: %s\n", xbasename (argv[cnt]));
 
       dbg = dwarf_begin (fd, DWARF_C_READ);
       if (dbg == NULL)
diff --git a/tests/varlocs.c b/tests/varlocs.c
index 8e563fd3..1004f969 100644
--- a/tests/varlocs.c
+++ b/tests/varlocs.c
@@ -1120,7 +1120,7 @@ main (int argc, char *argv[])
 
 	  const char *name = (modname[0] != '\0'
 			      ? modname
-			      :  basename (mainfile));
+			      :  xbasename (mainfile));
 	  printf ("module '%s'\n", name);
 	  print_die (&cudie, "CU", 0);
 
-- 
2.41.0


  parent reply	other threads:[~2023-12-14 18:07 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-10 20:20 Khem Raj
2023-12-12 13:18 ` Mark Wielaard
2023-12-12 17:16   ` Khem Raj
2023-12-13 15:10     ` Mark Wielaard
2023-12-13 16:29       ` Khem Raj
2023-12-13 22:33         ` Mark Wielaard
2023-12-14 18:07 ` Mark Wielaard [this message]
2023-12-14 18:15   ` Khem Raj
2023-12-14 19:22     ` Mark Wielaard
2023-12-14 19:28       ` Khem Raj
2023-12-20 16:43         ` Khem Raj
2023-12-21  1:02           ` Mark Wielaard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dcf67d617aa423b5c68b2314d9cf95a4597cd713.camel@klomp.org \
    --to=mark@klomp.org \
    --cc=elfutils-devel@sourceware.org \
    --cc=raj.khem@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).