public inbox for elfutils@sourceware.org
* RFCv2: debuginfod debian archive support
@ 2019-12-02 22:54 Frank Ch. Eigler
  2019-12-05 12:17 ` Mark Wielaard
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Frank Ch. Eigler @ 2019-12-02 22:54 UTC (permalink / raw)
  To: elfutils-devel

Hi -

Presenting v2 of the debuginfod .deb/.ddeb support patch.

On second thought, generalized the code & terminology.  This may be
ready for merging, except that it'd be awesome if a
debian/ubuntu-literate person could create some test .deb/.ddeb files
matching the tests/debuginfod-rpms.  (cc:'d some maintainers there in
hope they might have the time.)  Hand-testing here looks okay.

If anyone knows of a distro or ISV who is using standard .zip or .tar
or somesuch archive formats as their distribution mechanism, it would
be a tiny effort more to add another option, say "-A" (any archive),
because libarchive can recognize and unpack a large variety of
formats.  WDYT?

This patch is also on the elfutils.git fche/debuginfod-deb branch.


Author: Frank Ch. Eigler <fche@redhat.com>
Date:   Sat Nov 30 10:46:44 2019 -0500

    debuginfod server: support .deb/.ddeb archives
    
    Add support for scanning .deb / .ddeb files, enabled with a new
    command line option "-U".
    
    Signed-off-by: Frank Ch. Eigler <fche@redhat.com>

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index 8aa2944..1731dca 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,10 @@
+2019-12-02  Frank Ch. Eigler  <fche@redhat.com>
+
+	* debuginfod.cxx (*_rpm_*): Rename to *_archive_* throughout.
+	(scan_archives): New read-mostly global to identify archive
+	file extensions and corresponding extractor commands.
+	(parse_opt): Handle new -U flag.
+
 2019-11-26  Mark Wielaard  <mark@klomp.org>
 
 	* Makefile.am (BUILD_STATIC): Add needed libraries for libdw and
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index aa7ffcf..e96a4dd 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -106,6 +106,15 @@ using namespace std;
 #endif
 
 
+inline bool
+string_endswith(const string& haystack, const string& needle)
+{
+  return (haystack.size() >= needle.size() &&
+	  equal(haystack.end()-needle.size(), haystack.end(),
+                needle.begin()));
+}
+
+
 // Roll this identifier for every sqlite schema incompatiblity.
 #define BUILDIDS "buildids9"
 
@@ -231,9 +240,9 @@ static const char DEBUGINFOD_SQLITE_DDL[] =
   "create view if not exists " BUILDIDS "_stats as\n"
   "          select 'file d/e' as label,count(*) as quantity from " BUILDIDS "_f_de\n"
   "union all select 'file s',count(*) from " BUILDIDS "_f_s\n"
-  "union all select 'rpm d/e',count(*) from " BUILDIDS "_r_de\n"
-  "union all select 'rpm sref',count(*) from " BUILDIDS "_r_sref\n"
-  "union all select 'rpm sdef',count(*) from " BUILDIDS "_r_sdef\n"
+  "union all select 'archive d/e',count(*) from " BUILDIDS "_r_de\n"
+  "union all select 'archive sref',count(*) from " BUILDIDS "_r_sref\n"
+  "union all select 'archive sdef',count(*) from " BUILDIDS "_r_sdef\n"
   "union all select 'buildids',count(*) from " BUILDIDS "_buildids\n"
   "union all select 'filenames',count(*) from " BUILDIDS "_files\n"
   "union all select 'files scanned (#)',count(*) from " BUILDIDS "_file_mtime_scanned\n"
@@ -324,6 +333,7 @@ static const struct argp_option options[] =
    { NULL, 0, NULL, 0, "Scanners:", 1 },
    { "scan-file-dir", 'F', NULL, 0, "Enable ELF/DWARF file scanning threads.", 0 },
    { "scan-rpm-dir", 'R', NULL, 0, "Enable RPM scanning threads.", 0 },
+   { "scan-deb-dir", 'U', NULL, 0, "Enable DEB scanning threads.", 0 },   
    // "source-oci-imageregistry"  ... 
 
    { NULL, 0, NULL, 0, "Options:", 2 },
@@ -371,7 +381,7 @@ static unsigned maxigroom = false;
 static unsigned concurrency = std::thread::hardware_concurrency() ?: 1;
 static set<string> source_paths;
 static bool scan_files = false;
-static bool scan_rpms = false;
+static map<string,string> scan_archives;
 static vector<string> extra_ddl;
 static regex_t file_include_regex;
 static regex_t file_exclude_regex;
@@ -402,7 +412,13 @@ parse_opt (int key, char *arg,
       if (http_port > 65535) argp_failure(state, 1, EINVAL, "port number");
       break;
     case 'F': scan_files = true; break;
-    case 'R': scan_rpms = true; break;
+    case 'R':
+      scan_archives[".rpm"]="rpm2cpio";
+      break;
+    case 'U':
+      scan_archives[".deb"]="dpkg-deb --fsys-tarfile";
+      scan_archives[".ddeb"]="dpkg-deb --fsys-tarfile";
+      break;
     case 'L':
       traverse_logical = true;
       break;
@@ -851,7 +867,11 @@ handle_buildid_r_match (int64_t b_mtime,
       return 0;
     }
 
-  string popen_cmd = string("rpm2cpio " + shell_escape(b_source0));
+  string archive_decoder = "/dev/null";
+  for (auto&& arch : scan_archives)
+    if (string_endswith(b_source0, arch.first))
+      archive_decoder = arch.second;
+  string popen_cmd = archive_decoder + " " + shell_escape(b_source0);
   FILE* fp = popen (popen_cmd.c_str(), "r"); // "e" O_CLOEXEC?
   if (fp == NULL)
     throw libc_exception (errno, string("popen ") + popen_cmd);
@@ -863,9 +883,9 @@ handle_buildid_r_match (int64_t b_mtime,
     throw archive_exception("cannot create archive reader");
   defer_dtor<struct archive*,int> archive_closer (a, archive_read_free);
 
-  rc = archive_read_support_format_cpio(a);
+  rc = archive_read_support_format_all(a);
   if (rc != ARCHIVE_OK)
-    throw archive_exception(a, "cannot select cpio format");
+    throw archive_exception(a, "cannot select all formats");
   rc = archive_read_support_filter_all(a);
   if (rc != ARCHIVE_OK)
     throw archive_exception(a, "cannot select all filters");
@@ -902,7 +922,7 @@ handle_buildid_r_match (int64_t b_mtime,
           throw archive_exception(a, "cannot extract file");
         }
 
-      inc_metric ("http_responses_total","result","rpm");
+      inc_metric ("http_responses_total","result","archive");
       struct MHD_Response* r = MHD_create_response_from_fd (archive_entry_size(e), fd);
       if (r == 0)
         {
@@ -916,7 +936,7 @@ handle_buildid_r_match (int64_t b_mtime,
           MHD_add_response_header (r, "Content-Type", "application/octet-stream");
           add_mhd_last_modified (r, archive_entry_mtime(e));
           if (verbose > 1)
-            obatched(clog) << "serving rpm " << b_source0 << " file " << b_source1 << endl;
+            obatched(clog) << "serving archive " << b_source0 << " file " << b_source1 << endl;
           /* libmicrohttpd will close it. */
           if (result_fd)
             *result_fd = fd;
@@ -1858,16 +1878,20 @@ thread_main_scan_source_file_path (void* arg)
 
 
 
-// Analyze given *.rpm file of given age; record buildids / exec/debuginfo-ness of its
+// Analyze given archive file of given age; record buildids / exec/debuginfo-ness of its
 // constituent files with given upsert statements.
 static void
-rpm_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_upsert_files,
+archive_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_upsert_files,
               sqlite_ps& ps_upsert_de, sqlite_ps& ps_upsert_sref, sqlite_ps& ps_upsert_sdef,
               time_t mtime,
               unsigned& fts_executable, unsigned& fts_debuginfo, unsigned& fts_sref, unsigned& fts_sdef,
               bool& fts_sref_complete_p)
 {
-  string popen_cmd = string("rpm2cpio " + shell_escape(rps));
+  string archive_decoder = "/dev/null";
+  for (auto&& arch : scan_archives)
+    if (string_endswith(rps, arch.first))
+      archive_decoder = arch.second;
+  string popen_cmd = archive_decoder + " " + shell_escape(rps);
   FILE* fp = popen (popen_cmd.c_str(), "r"); // "e" O_CLOEXEC?
   if (fp == NULL)
     throw libc_exception (errno, string("popen ") + popen_cmd);
@@ -1879,9 +1903,9 @@ rpm_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_up
     throw archive_exception("cannot create archive reader");
   defer_dtor<struct archive*,int> archive_closer (a, archive_read_free);
 
-  int rc = archive_read_support_format_cpio(a);
+  int rc = archive_read_support_format_all(a);
   if (rc != ARCHIVE_OK)
-    throw archive_exception(a, "cannot select cpio format");
+    throw archive_exception(a, "cannot select all formats");
   rc = archive_read_support_filter_all(a);
   if (rc != ARCHIVE_OK)
     throw archive_exception(a, "cannot select all filters");
@@ -2027,11 +2051,11 @@ rpm_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_up
 
 
 
-// scan for *.rpm files
+// scan for archive files such as .rpm
 static void
-scan_source_rpm_path (const string& dir)
+scan_source_archive_path (const string& dir)
 {
-  obatched(clog) << "fts/rpm traversing " << dir << endl;
+  obatched(clog) << "fts/archive traversing " << dir << endl;
 
   sqlite_ps ps_upsert_buildids (db, "rpm-buildid-intern", "insert or ignore into " BUILDIDS "_buildids VALUES (NULL, ?);");
   sqlite_ps ps_upsert_files (db, "rpm-file-intern", "insert or ignore into " BUILDIDS "_files VALUES (NULL, ?);");
@@ -2060,7 +2084,7 @@ scan_source_rpm_path (const string& dir)
   struct timeval tv_start, tv_end;
   gettimeofday (&tv_start, NULL);
   unsigned fts_scanned=0, fts_regex=0, fts_cached=0, fts_debuginfo=0;
-  unsigned fts_executable=0, fts_rpm = 0, fts_sref=0, fts_sdef=0;
+  unsigned fts_executable=0, fts_archive = 0, fts_sref=0, fts_sdef=0;
 
   FTS *fts = fts_open (dirs,
                        (traverse_logical ? FTS_LOGICAL : FTS_PHYSICAL|FTS_XDEV)
@@ -2082,7 +2106,7 @@ scan_source_rpm_path (const string& dir)
         break;
 
       if (verbose > 2)
-        obatched(clog) << "fts/rpm traversing " << f->fts_path << endl;
+        obatched(clog) << "fts/archive traversing " << f->fts_path << endl;
 
       try
         {
@@ -2101,7 +2125,7 @@ scan_source_rpm_path (const string& dir)
           if (!ri || rx)
             {
               if (verbose > 3)
-                obatched(clog) << "fts/rpm skipped by regex " << (!ri ? "I" : "") << (rx ? "X" : "") << endl;
+                obatched(clog) << "fts/archive skipped by regex " << (!ri ? "I" : "") << (rx ? "X" : "") << endl;
               fts_regex ++;
               continue;
             }
@@ -2116,13 +2140,13 @@ scan_source_rpm_path (const string& dir)
 
             case FTS_F:
               {
-                // heuristic: reject if file name does not end with ".rpm"
-                // (alternative: try opening with librpm etc., caching)
-                string suffix = ".rpm";
-                if (rps.size() < suffix.size() ||
-                    rps.substr(rps.size()-suffix.size()) != suffix)
+		bool any = false;
+                for (auto&& arch : scan_archives)
+                  if (string_endswith(rps, arch.first))
+		    any = true;
+		if (! any)
                   continue;
-                fts_rpm ++;
+                fts_archive ++;
 
                 /* See if we know of it already. */
                 int rc = ps_query
@@ -2133,7 +2157,7 @@ scan_source_rpm_path (const string& dir)
                 ps_query.reset();
                 if (rc == SQLITE_ROW) // i.e., a result, as opposed to DONE (no results)
                   // no need to recheck a file/version we already know
-                  // specifically, no need to parse this rpm again, since we already have
+                  // specifically, no need to parse this archive again, since we already have
                   // it as a D or E or S record,
                   // (so is stored with buildid=NULL)
                   {
@@ -2141,29 +2165,29 @@ scan_source_rpm_path (const string& dir)
                     continue;
                   }
 
-                // intern the rpm file name
+                // intern the archive file name
                 ps_upsert_files
                   .reset()
                   .bind(1, rps)
                   .step_ok_done();
 
-                // extract the rpm contents via popen("rpm2cpio") | libarchive | loop-of-elf_classify()
+                // extract the archive contents
                 unsigned my_fts_executable = 0, my_fts_debuginfo = 0, my_fts_sref = 0, my_fts_sdef = 0;
                 bool my_fts_sref_complete_p = true;
                 try
                   {
-                    rpm_classify (rps,
+                    archive_classify (rps,
                                   ps_upsert_buildids, ps_upsert_files,
                                   ps_upsert_de, ps_upsert_sref, ps_upsert_sdef, // dalt
                                   f->fts_statp->st_mtime,
                                   my_fts_executable, my_fts_debuginfo, my_fts_sref, my_fts_sdef,
                                   my_fts_sref_complete_p);
-                    inc_metric ("scanned_total","source","rpm");
-                    add_metric("found_debuginfo_total","source","rpm",
+                    inc_metric ("scanned_total","source","archive");
+                    add_metric("found_debuginfo_total","source","archive",
                                my_fts_debuginfo);
-                    add_metric("found_executable_total","source","rpm",
+                    add_metric("found_executable_total","source","archive",
                                my_fts_executable);
-                    add_metric("found_sourcerefs_total","source","rpm",
+                    add_metric("found_sourcerefs_total","source","archive",
                                my_fts_sref);
                   }
                 catch (const reportable_exception& e)
@@ -2172,7 +2196,7 @@ scan_source_rpm_path (const string& dir)
                   }
 
                 if (verbose > 2)
-                  obatched(clog) << "scanned rpm=" << rps
+                  obatched(clog) << "scanned archive=" << rps
                                  << " mtime=" << f->fts_statp->st_mtime
                                  << " executables=" << my_fts_executable
                                  << " debuginfos=" << my_fts_debuginfo
@@ -2197,7 +2221,7 @@ scan_source_rpm_path (const string& dir)
 
             case FTS_ERR:
             case FTS_NS:
-              throw libc_exception(f->fts_errno, string("fts/rpm traversal ") + string(f->fts_path));
+              throw libc_exception(f->fts_errno, string("fts/archive traversal ") + string(f->fts_path));
 
             default:
             case FTS_SL: /* ignore symlinks; seen in non-L mode only */
@@ -2206,9 +2230,9 @@ scan_source_rpm_path (const string& dir)
 
           if ((verbose && f->fts_info == FTS_DP) ||
               (verbose > 1 && f->fts_info == FTS_F))
-            obatched(clog) << "fts/rpm traversing " << rps << ", scanned=" << fts_scanned
+            obatched(clog) << "fts/archive traversing " << rps << ", scanned=" << fts_scanned
                            << ", regex-skipped=" << fts_regex
-                           << ", rpm=" << fts_rpm << ", cached=" << fts_cached << ", debuginfo=" << fts_debuginfo
+                           << ", archive=" << fts_archive << ", cached=" << fts_cached << ", debuginfo=" << fts_debuginfo
                            << ", executable=" << fts_executable
                            << ", sourcerefs=" << fts_sref << ", sourcedefs=" << fts_sdef << endl;
         }
@@ -2222,9 +2246,9 @@ scan_source_rpm_path (const string& dir)
   gettimeofday (&tv_end, NULL);
   double deltas = (tv_end.tv_sec - tv_start.tv_sec) + (tv_end.tv_usec - tv_start.tv_usec)*0.000001;
 
-  obatched(clog) << "fts/rpm traversed " << dir << " in " << deltas << "s, scanned=" << fts_scanned
+  obatched(clog) << "fts/archive traversed " << dir << " in " << deltas << "s, scanned=" << fts_scanned
                  << ", regex-skipped=" << fts_regex
-                 << ", rpm=" << fts_rpm << ", cached=" << fts_cached << ", debuginfo=" << fts_debuginfo
+                 << ", archive=" << fts_archive << ", cached=" << fts_cached << ", debuginfo=" << fts_debuginfo
                  << ", executable=" << fts_executable
                  << ", sourcerefs=" << fts_sref << ", sourcedefs=" << fts_sdef << endl;
 }
@@ -2232,18 +2256,18 @@ scan_source_rpm_path (const string& dir)
 
 
 static void*
-thread_main_scan_source_rpm_path (void* arg)
+thread_main_scan_source_archive_path (void* arg)
 {
   string dir = string((const char*) arg);
 
   unsigned rescan_timer = 0;
   sig_atomic_t forced_rescan_count = 0;
-  set_metric("thread_timer_max", "rpm", dir, rescan_s);
-  set_metric("thread_tid", "rpm", dir, tid());
+  set_metric("thread_timer_max", "archive", dir, rescan_s);
+  set_metric("thread_tid", "archive", dir, tid());
   while (! interrupted)
     {
-      set_metric("thread_timer", "rpm", dir, rescan_timer);
-      set_metric("thread_forced_total", "rpm", dir, forced_rescan_count);
+      set_metric("thread_timer", "archive", dir, rescan_timer);
+      set_metric("thread_forced_total", "archive", dir, forced_rescan_count);
       if (rescan_s && rescan_timer > rescan_s)
         rescan_timer = 0;
       if (sigusr1 != forced_rescan_count)
@@ -2254,10 +2278,10 @@ thread_main_scan_source_rpm_path (void* arg)
       if (rescan_timer == 0)
         try
           {
-            set_metric("thread_working", "rpm", dir, time(NULL));
-            inc_metric("thread_work_total", "rpm", dir);
-            scan_source_rpm_path (dir);
-            set_metric("thread_working", "rpm", dir, 0);
+            set_metric("thread_working", "archive", dir, time(NULL));
+            inc_metric("thread_work_total", "archive", dir);
+            scan_source_archive_path (dir);
+            set_metric("thread_working", "archive", dir, 0);
           }
         catch (const sqlite_exception& e)
           {
@@ -2490,8 +2514,8 @@ main (int argc, char *argv[])
       error (EXIT_FAILURE, 0,
              "unexpected argument: %s", argv[remaining]);
 
-  if (!scan_rpms && !scan_files && source_paths.size()>0)
-    obatched(clog) << "warning: without -F and/or -R, ignoring PATHs" << endl;
+  if (scan_archives.size()==0 && !scan_files && source_paths.size()>0)
+    obatched(clog) << "warning: without -F -R -U, ignoring PATHs" << endl;
 
   (void) signal (SIGPIPE, SIG_IGN); // microhttpd can generate it incidentally, ignore
   (void) signal (SIGINT, signal_handler); // ^C
@@ -2611,16 +2635,22 @@ main (int argc, char *argv[])
   if (maxigroom)
     obatched(clog) << "maxigroomed database" << endl;
 
-
   obatched(clog) << "search concurrency " << concurrency << endl;
   obatched(clog) << "rescan time " << rescan_s << endl;
   obatched(clog) << "groom time " << groom_s << endl;
+  if (scan_archives.size()>0)
+    {
+      obatched ob(clog);
+      auto& o = ob << "scanning archive types ";
+      for (auto&& arch : scan_archives)
+	o << arch.first << " ";
+      o << endl;
+    }
   const char* du = getenv(DEBUGINFOD_URLS_ENV_VAR);
   if (du && du[0] != '\0') // set to non-empty string?
     obatched(clog) << "upstream debuginfod servers: " << du << endl;
 
-  vector<pthread_t> source_file_scanner_threads;
-  vector<pthread_t> source_rpm_scanner_threads;
+  vector<pthread_t> scanner_threads;
   pthread_t groom_thread;
 
   rc = pthread_create (& groom_thread, NULL, thread_main_groom, NULL);
@@ -2632,20 +2662,21 @@ main (int argc, char *argv[])
       pthread_t pt;
       rc = pthread_create (& pt, NULL, thread_main_scan_source_file_path, (void*) it.c_str());
       if (rc < 0)
-        error (0, 0, "warning: cannot spawn thread (%d) to scan source files %s\n", rc, it.c_str());
+        error (0, 0, "warning: cannot spawn thread (%d) to scan files %s\n", rc, it.c_str());
       else
-        source_file_scanner_threads.push_back(pt);
+        scanner_threads.push_back(pt);
     }
 
-  if (scan_rpms) for (auto&& it : source_paths)
-    {
-      pthread_t pt;
-      rc = pthread_create (& pt, NULL, thread_main_scan_source_rpm_path, (void*) it.c_str());
-      if (rc < 0)
-        error (0, 0, "warning: cannot spawn thread (%d) to scan source rpms %s\n", rc, it.c_str());
-      else
-        source_rpm_scanner_threads.push_back(pt);
-    }
+  if (scan_archives.size() > 0)
+    for (auto&& it : source_paths)
+      {
+        pthread_t pt;
+        rc = pthread_create (& pt, NULL, thread_main_scan_source_archive_path, (void*) it.c_str());
+        if (rc < 0)
+          error (0, 0, "warning: cannot spawn thread (%d) to scan archives %s\n", rc, it.c_str());
+        else
+          scanner_threads.push_back(pt);
+      }
 
   /* Trivial main loop! */
   set_metric("ready", 1);
@@ -2656,10 +2687,8 @@ main (int argc, char *argv[])
   if (verbose)
     obatched(clog) << "stopping" << endl;
 
-  /* Join any source scanning threads. */
-  for (auto&& it : source_file_scanner_threads)
-    pthread_join (it, NULL);
-  for (auto&& it : source_rpm_scanner_threads)
+  /* Join all our threads. */
+  for (auto&& it : scanner_threads)
     pthread_join (it, NULL);
   pthread_join (groom_thread, NULL);
   
diff --git a/doc/ChangeLog b/doc/ChangeLog
index 00a61ac..398048f 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,12 @@
+2019-12-02  Frank Ch. Eigler  <fche@redhat.com>
+
+	* debuginfod.8: Add -U (DEB) flag, generalize RPM to "archive".
+
+2019-11-26  Frank Ch. Eigler  <fche@redhat.com>
+	    Aaron Merey  <amerey@redhat.com>
+
+	* debuginfod.8, find-debuginfo.1, debuginfod_*.3: New files.
+
 2019-09-02  Mark Wielaard  <mark@klomp.org>
 
 	* readelf.1 (symbols): Add optional section name.
diff --git a/doc/debuginfod.8 b/doc/debuginfod.8
index 210550e..6e078f6 100644
--- a/doc/debuginfod.8
+++ b/doc/debuginfod.8
@@ -24,7 +24,7 @@ debuginfod \- debuginfo-related http file-server daemon
 .SH DESCRIPTION
 \fBdebuginfod\fP serves debuginfo-related artifacts over HTTP.  It
 periodically scans a set of directories for ELF/DWARF files and their
-associated source code, as well as RPM files containing the above, to
+associated source code, as well as archive files containing the above, to
 build an index by their buildid.  This index is used when remote
 clients use the HTTP webapi, to fetch these files by the same buildid.
 
@@ -55,17 +55,22 @@ or even use debuginfod itself:
 ^C
 .ESAMPLE
 
-If the \fB\-R\fP option is given each listed PATH creates a thread to
-scan for ELF/DWARF/source files contained in matching RPMs under the
-given physical directory.  Duplicate directories are ignored.  You may
-use a file name for a PATH, but source code indexing may be
-incomplete; prefer using a directory that contains normal RPMs
-alongside debuginfo/debugsource RPMs.  Because of complications such
-as DWZ-compressed debuginfo, may require \fItwo\fP scan passes to
-identify all source code.  Source files for RPMs are only served
-from other RPMs, so the caution for \-F does not apply.
-
-If no PATH is listed, or neither \-F nor \-R option is given, then
+If the \fB\-R\fP and/or \fB\-U\fP option is given, each listed PATH
+creates a thread to scan for ELF/DWARF/source files contained in
+archive files.  If \-R is given, they will scan RPMs; and/or if \-U is
+given, they will scan DEB / DDEB files.  (The terms RPM and DEB and
+DDEB are used synonymously as "archives" in diagnostic messages.)
+Duplicate directories are ignored.  You may use a file name for a
+PATH, but source code indexing may be incomplete.  Instead, use a
+directory that contains normal RPMs alongside debuginfo/debugsource
+RPMs.  Because of complications such as DWZ-compressed debuginfo, it may
+require \fItwo\fP scan passes to identify all source code.  Source
+files for RPMs are only served from other RPMs, so the caution for \-F
+does not apply.  Note that due to Debian/Ubuntu packaging policies &
+mechanisms, debuginfod cannot resolve source files for DEB/DDEB at
+all.
+
+If no PATH is listed, or neither \-F nor \-R nor \-U option is given, then
 \fBdebuginfod\fP will simply serve content that it scanned into its
 index in previous runs: the data is cumulative.
 
@@ -81,7 +86,11 @@ Activate ELF/DWARF file scanning threads.  The default is off.
 
 .TP
 .B "\-R"
-Activate RPM file scanning threads.  The default is off.
+Activate RPM patterns in archive scanning threads.  The default is off.
+
+.TP
+.B "\-U"
+Activate DEB/DDEB patterns in archive scanning threads.  The default is off.
 
 .TP
 .B "\-d FILE" "\-\-database=FILE"
@@ -114,12 +123,12 @@ extended REs, thus may include alternation.  They are evaluated
 against the full path of each file, based on its \fBrealpath(3)\fP
 canonicalization.  By default, all files are included and none are
 excluded.  A file that matches both include and exclude REGEX is
-excluded.  (The \fIcontents\fP of RPM files are not subject to
+excluded.  (The \fIcontents\fP of archive files are not subject to
 inclusion or exclusion filtering: they are all processed.)
 
 .TP
 .B "\-t SECONDS"  "\-\-rescan\-time=SECONDS"
-Set the rescan time for the file and RPM directories.  This is the
+Set the rescan time for the file and archive directories.  This is the
 amount of time the scanning threads will wait after finishing a scan,
 before doing it again.  A rescan for unchanged files is fast (because
 the index also stores the file mtimes).  A time of zero is acceptable,
@@ -143,8 +152,8 @@ independent of the groom time (including if it was zero).
 .B "\-G"
 Run an extraordinary maximal-grooming pass at debuginfod startup.
 This pass can take considerable time, because it tries to remove any
-debuginfo-unrelated content from the RPM-related parts of the index.
-It should not be run if any recent RPM-related indexing operations
+debuginfo-unrelated content from the archive-related parts of the index.
+It should not be run if any recent archive-related indexing operations
 were aborted early.  It can take considerable space, because it
 finishes up with an sqlite "vacuum" operation, which repacks the
 database file by triplicating it temporarily.  The default is not to
@@ -155,7 +164,7 @@ do maximal-grooming.  See also the \fIDATA MANAGEMENT\fP section.
 Set the concurrency limit for all the scanning threads.  While many
 threads may be spawned to cover all the given PATHs, only NUM may
 concurrently do CPU-intensive operations like parsing an ELF file
-or an RPM.  The default is the number of processors on the system;
+or an archive.  The default is the number of processors on the system;
 the minimum is 1.
 
 .TP
@@ -257,10 +266,10 @@ many files.  This section offers some advice about the implications.
 
 As a general explanation for size, consider that debuginfod indexes
 ELF/DWARF files, it stores their names and referenced source file
-names, and buildids will be stored.  When indexing RPMs, it stores
-every file name \fIof or in\fP an RPM, every buildid, plus every
-source file name referenced from a DWARF file.  (Indexing RPMs takes
-more space because the source files often reside in separate
+names, and buildids will be stored.  When indexing archives, it stores
+every file name \fIof or in\fP an archive, every buildid, plus every
+source file name referenced from a DWARF file.  (Indexing archives
+takes more space because the source files often reside in separate
 subpackages that may not be indexed at the same pass, so extra
 metadata has to be kept.)
 
@@ -283,14 +292,14 @@ This means that the sqlite files grow fast during initial indexing,
 slowly during index rescans, and periodically shrink during grooming.
 There is also an optional one-shot \fImaximal grooming\fP pass is
 available.  It removes information debuginfo-unrelated data from the
-RPM content index such as file names found in RPMs ("rpm sdef"
-records) that are not referred to as source files from any binaries
-find in RPMs ("rpm sref" records).  This can save considerable disk
-space.  However, it is slow and temporarily requires up to twice the
-database size as free space.  Worse: it may result in missing
-source-code info if the RPM traversals were interrupted, so the not
-all source file references were known.  Use it rarely to polish a
-complete index.
+archive content index such as file names found in archives ("archive
+sdef" records) that are not referred to as source files from any
+binaries found in archives ("archive sref" records).  This can save
+considerable disk space.  However, it is slow and temporarily requires
+up to twice the database size as free space.  Worse: it may result in
+missing source-code info if the archive traversals were interrupted,
+so that not all source file references were known.  Use it rarely to
+polish a complete index.
 
 You should ensure that ample disk space remains available.  (The flood
 of error messages on -ENOSPC is ugly and nagging.  But, like for most
@@ -317,7 +326,7 @@ happens, new versions of debuginfod will issue SQL statements to
 \fIdrop\fP all prior schema & data, and start over.  So, disk space
 will not be wasted for retaining a no-longer-useable dataset.
 
-In summary, if your system can bear a 0.5%-3% index-to-RPM-dataset
+In summary, if your system can bear a 0.5%-3% index-to-archive-dataset
 size ratio, and slow growth afterwards, you should not need to
 worry about disk space.  If a system crash corrupts the database,
 or you want to force debuginfod to reset and start over, simply
diff --git a/tests/ChangeLog b/tests/ChangeLog
index 6e3923f..8fcb161 100644
--- a/tests/ChangeLog
+++ b/tests/ChangeLog
@@ -1,3 +1,7 @@
+2019-12-02  Frank Ch. Eigler  <fche@redhat.com>
+
+	* run-debuginfod-find.sh: Adjust to "rpm"->"archive" in metrics.
+
 2019-11-26  Mark Wielaard  <mark@klomp.org>
 
 	* Makefile.am (BUILD_STATIC): Add libraries needed for libdw.
diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh
index 0ade03b..c2926a0 100755
--- a/tests/run-debuginfod-find.sh
+++ b/tests/run-debuginfod-find.sh
@@ -162,7 +162,7 @@ cp -rp ${abs_srcdir}/debuginfod-rpms R
 kill -USR1 $PID1
 # All rpms need to be in the index
 rpms=$(find R -name \*rpm | wc -l)
-wait_ready $PORT1 'scanned_total{source="rpm"}' $rpms
+wait_ready $PORT1 'scanned_total{source="archive"}' $rpms
 
 kill -USR1 $PID1  # two hits of SIGUSR1 may be needed to resolve .debug->dwz->srefs
 # Expect all source files found in the rpms (they are all called hello.c :)
@@ -187,7 +187,7 @@ sourcefiles=$(find -name \*\\.debug \
 cd ..
 rm -rf extracted
 
-wait_ready $PORT1 'found_sourcerefs_total{source="rpm"}' $sourcefiles
+wait_ready $PORT1 'found_sourcerefs_total{source="archive"}' $sourcefiles
 
 # Run a bank of queries against the debuginfod-rpms test cases
 


* Re: RFCv2: debuginfod debian archive support
  2019-12-02 22:54 RFCv2: debuginfod debian archive support Frank Ch. Eigler
@ 2019-12-05 12:17 ` Mark Wielaard
  2019-12-06 21:17 ` Frank Ch. Eigler
  2019-12-06 23:32 ` Mark Wielaard
  2 siblings, 0 replies; 9+ messages in thread
From: Mark Wielaard @ 2019-12-05 12:17 UTC (permalink / raw)
  To: Frank Ch. Eigler, elfutils-devel; +Cc: Kurt Roeckx, Matthias Klose

[-- Attachment #1: Type: text/plain, Size: 1013 bytes --]

Hi Kurt and Matthias,

On Mon, 2019-12-02 at 17:54 -0500, Frank Ch. Eigler wrote:
> On second thought, generalized the code & terminology.  This may be
> ready for merging, except that it'd be awesome if a
> debian/ubuntu-literate person could create some test .deb/.ddeb files
> matching the tests/debuginfod-rpms.  (cc:'d some maintainers there in
> hope they might have the time.)  Hand-testing here looks okay.

Would you be able to show how to build a minimal (separate) debuginfo
package for Debian and/or Ubuntu? For rpm based systems we use the
attached self contained spec file that creates two nearly identical
hello programs to test any issues with duplicate name/debug/DWARF, it
also tests dwz multi files (if you use those on Debian or Ubuntu). To
(re)create the binary test packages for a different arch one would
simply run rpmbuild -ba hello2.spec.

How would one build a similar binary deb package (and the debuginfo
subpackages) on a Debian/Ubuntu system?

Thanks,

Mark

[-- Attachment #2: hello2.spec. --]
[-- Type: text/x-rpm-spec, Size: 1243 bytes --]

Summary: hello2 -- double hello, world rpm
Name: hello2
Version: 1.0
Release: 2
Group: Utilities
License: GPL
Distribution: RPM ^W Elfutils test suite.
Vendor: Red Hat Software
Packager: Red Hat Software <bugs@redhat.com>
URL: http://www.redhat.com
BuildRequires: gcc make
Source0: hello-1.0.tar.gz

%description
Simple rpm demonstration with an eye to consumption by debuginfod.

%package two
Summary: hello2two
License: GPL

%description two
Dittoish.

%prep
%setup -q -n hello-1.0

%build
gcc -g -O1 hello.c -o hello
gcc -g -O2 -D_FORTIFY_SOURCE=2 hello.c -o hello2

%install
rm -rf $RPM_BUILD_ROOT
mkdir -p $RPM_BUILD_ROOT/usr/local/bin
cp hello $RPM_BUILD_ROOT/usr/local/bin/
cp hello2 $RPM_BUILD_ROOT/usr/local/bin/

%clean
rm -rf $RPM_BUILD_ROOT

%files 
%defattr(-,root,root)
%attr(0751,root,root)   /usr/local/bin/hello

%files two
%defattr(-,root,root)
%attr(0751,root,root)   /usr/local/bin/hello2

%changelog
* Thu Nov 14 2019 Frank Ch. Eigler <fche@redhat.com>
- Added source code right here to make spec file self-contained.
- Dropped misc files not relevant to debuginfod testing.

* Wed May 18 2016 Mark Wielaard <mjw@redhat.com>
- Add hello2 for dwz testing support.

* Tue Oct 20 1998 Jeff Johnson <jbj@redhat.com>
- create.


* Re: RFCv2: debuginfod debian archive support
  2019-12-02 22:54 RFCv2: debuginfod debian archive support Frank Ch. Eigler
  2019-12-05 12:17 ` Mark Wielaard
@ 2019-12-06 21:17 ` Frank Ch. Eigler
  2019-12-11 21:06   ` Mark Wielaard
  2019-12-06 23:32 ` Mark Wielaard
  2 siblings, 1 reply; 9+ messages in thread
From: Frank Ch. Eigler @ 2019-12-06 21:17 UTC (permalink / raw)
  To: elfutils-devel

Hi -

Presenting testing for the debuginfod .deb/.ddeb support patch, after
finding a good debian-packaging tutorial, and generating a workable
basic set of test debs on an Ubuntu box.

This patch is also on the elfutils.git fche/debuginfod-deb branch.  In
the absence of objections, I plan to merge this next week.


commit 1d9ff81da0f5e1c7383b255c15529402035ad5bf
Author: Frank Ch. Eigler <fche@redhat.com>
Date:   Fri Dec 6 16:08:50 2019 -0500

    debuginfod: deb support, tests
    
    Using a synthetic .deb/.ddeb from a Ubuntu 18 machine,
    extend the debuginfod testsuite with some .deb processing,
    if the dpkg-deb binary is installed.
    
    Signed-off-by: Frank Ch. Eigler <fche@redhat.com>

diff --git a/config/ChangeLog b/config/ChangeLog
index d71fb39..9b2a408 100644
--- a/config/ChangeLog
+++ b/config/ChangeLog
@@ -1,3 +1,8 @@
+2019-12-06  Frank Ch. Eigler  <fche@redhat.com>
+
+	* elfutils.spec.in (debuginfod): Add BuildRequire dpkg
+	for deb testing.  (Available on Fedora & EPEL, not base RHEL.)
+
 2019-11-28  Mark Wielaard  <mark@klomp.org>
 
 	* elfutils.spec.in (debuginfod): Add an explicit Requires
diff --git a/config/elfutils.spec.in b/config/elfutils.spec.in
index 1cdca21..faeb7f8 100644
--- a/config/elfutils.spec.in
+++ b/config/elfutils.spec.in
@@ -35,6 +35,9 @@ BuildRequires: pkgconfig(libarchive) >= 3.1.2
 BuildRequires: bzip2
 # For the run-debuginfod-find.sh test case in %check for /usr/sbin/ss
 BuildRequires: iproute
+%if 0%{?fedora} >= 20
+BuildRequires: dpkg
+%endif
 BuildRequires: curl
 
 %define _gnu %{nil}
diff --git a/tests/ChangeLog b/tests/ChangeLog
index 8fcb161..ba3a10e 100644
--- a/tests/ChangeLog
+++ b/tests/ChangeLog
@@ -1,3 +1,10 @@
+2019-12-06  Frank Ch. Eigler  <fche@redhat.com>
+
+	* debuginfod-debs/*: New test files, based on
+	https://wiki.debian.org/Packaging/Intro.
+	* run-debuginfod-find.sh: Test deb file processing (if dpkg
+	installed).
+
 2019-12-02  Frank Ch. Eigler  <fche@redhat.com>
 
 	* run-debuginfod-find.sh: Adjust to "rpm"->"archive" in metrics.
diff --git a/tests/debuginfod-debs/hithere-dbgsym_1.0-1_amd64.ddeb b/tests/debuginfod-debs/hithere-dbgsym_1.0-1_amd64.ddeb
new file mode 100644
index 0000000..f9879eb
Binary files /dev/null and b/tests/debuginfod-debs/hithere-dbgsym_1.0-1_amd64.ddeb differ
diff --git a/tests/debuginfod-debs/hithere_1.0-1.debian.tar.xz b/tests/debuginfod-debs/hithere_1.0-1.debian.tar.xz
new file mode 100644
index 0000000..9f0ce68
Binary files /dev/null and b/tests/debuginfod-debs/hithere_1.0-1.debian.tar.xz differ
diff --git a/tests/debuginfod-debs/hithere_1.0-1.dsc b/tests/debuginfod-debs/hithere_1.0-1.dsc
new file mode 100644
index 0000000..d5f72b9
--- /dev/null
+++ b/tests/debuginfod-debs/hithere_1.0-1.dsc
@@ -0,0 +1,19 @@
+Format: 3.0 (quilt)
+Source: hithere
+Binary: hithere
+Architecture: any
+Version: 1.0-1
+Maintainer: Lars Wirzenius <liw@liw.fi>
+Standards-Version: 3.9.2
+Build-Depends: debhelper (>= 9)
+Package-List:
+ hithere deb misc optional arch=any
+Checksums-Sha1:
+ 2dcd65497a12a3ea03223f52186447bd5733dce9 617 hithere_1.0.orig.tar.gz
+ 0b71331ef1c714c5bac67878551864b7356c56ce 764 hithere_1.0-1.debian.tar.xz
+Checksums-Sha256:
+ 63062b582a712f169f37a5f52a41aa3ca9a405aafb8aa837bc906fa413b62cdb 617 hithere_1.0.orig.tar.gz
+ 9afa907e360e626639ccb86b86e799429bea27149034aec5d5c7e500971d651e 764 hithere_1.0-1.debian.tar.xz
+Files:
+ 5b2830fa1fcd44ce489774771625526e 617 hithere_1.0.orig.tar.gz
+ 70106164d9397c70c2c1a4594e9897e4 764 hithere_1.0-1.debian.tar.xz
diff --git a/tests/debuginfod-debs/hithere_1.0-1_amd64.deb b/tests/debuginfod-debs/hithere_1.0-1_amd64.deb
new file mode 100644
index 0000000..11d1e95
Binary files /dev/null and b/tests/debuginfod-debs/hithere_1.0-1_amd64.deb differ
diff --git a/tests/debuginfod-debs/hithere_1.0.orig.tar.gz b/tests/debuginfod-debs/hithere_1.0.orig.tar.gz
new file mode 100644
index 0000000..23abea7
Binary files /dev/null and b/tests/debuginfod-debs/hithere_1.0.orig.tar.gz differ
diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh
index c2926a0..cd31e30 100755
--- a/tests/run-debuginfod-find.sh
+++ b/tests/run-debuginfod-find.sh
@@ -18,6 +18,10 @@
 
 . $srcdir/test-subr.sh  # includes set -e
 
+# for test case debugging, uncomment:
+# set -x
+# VERBOSE=-vvvv
+
 DB=${PWD}/.debuginfod_tmp.sqlite
 tempfiles $DB
 export DEBUGINFOD_CACHE_PATH=${PWD}/.client_cache
@@ -30,7 +34,7 @@ cleanup()
   if [ $PID1 -ne 0 ]; then kill $PID1; wait $PID1; fi
   if [ $PID2 -ne 0 ]; then kill $PID2; wait $PID2; fi
 
-  rm -rf F R L ${PWD}/.client_cache*
+  rm -rf F R D L ${PWD}/.client_cache*
   exit_cleanup
 }
 
@@ -52,8 +56,8 @@ done
 # So we gather the LD_LIBRARY_PATH with this cunning trick:
 ldpath=`testrun sh -c 'echo $LD_LIBRARY_PATH'`
 
-mkdir F R L
-# not tempfiles F R L - they are directories which we clean up manually
+mkdir F R L D
+# not tempfiles F R L D - they are directories which we clean up manually
 ln -s ${abs_builddir}/dwfllines L/foo   # any program not used elsewhere in this test
 
 wait_ready()
@@ -82,7 +86,7 @@ wait_ready()
   fi
 }
 
-env LD_LIBRARY_PATH=$ldpath DEBUGINFOD_URLS= ${abs_builddir}/../debuginfod/debuginfod -F -R -d $DB -p $PORT1 -t0 -g0 R F L &
+env LD_LIBRARY_PATH=$ldpath DEBUGINFOD_URLS= ${abs_builddir}/../debuginfod/debuginfod $VERBOSE -F -R -d $DB -p $PORT1 -t0 -g0 R F L &
 PID1=$!
 # Server must become ready
 wait_ready $PORT1 'ready' 1
@@ -158,7 +162,7 @@ cmp $filename F/prog2
 filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find source $BUILDID2 ${PWD}/prog2.c`
 cmp $filename ${PWD}/prog2.c
 
-cp -rp ${abs_srcdir}/debuginfod-rpms R
+cp -rvp ${abs_srcdir}/debuginfod-rpms R
 kill -USR1 $PID1
 # All rpms need to be in the index
 rpms=$(find R -name \*rpm | wc -l)
@@ -177,7 +181,7 @@ for i in $newrpms; do
     mkdir $subdir;
     cd $subdir;
     ls -lah ../$i
-    rpm2cpio ../$i | cpio -id;
+    rpm2cpio ../$i | cpio -ivd;
     cd ..;
 done
 sourcefiles=$(find -name \*\\.debug \
@@ -205,10 +209,12 @@ rpm_test() {
     buildid=`env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../src/readelf \
              -a $filename | grep 'Build ID' | cut -d ' ' -f 7`
     test $__BUILDID = $buildid
-    
-    filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find source $__BUILDID $__SOURCEPATH`
-    hash=`cat $filename | sha1sum | awk '{print $1}'`
-    test $__SOURCESHA1 = $hash
+
+    if test "x$__SOURCEPATH" != "x"; then
+        filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find source $__BUILDID $__SOURCEPATH`
+        hash=`cat $filename | sha1sum | awk '{print $1}'`
+        test $__SOURCESHA1 = $hash
+    fi
 }
 
 
@@ -257,13 +263,26 @@ export DEBUGINFOD_CACHE_PATH=${PWD}/.client_cache2
 mkdir -p $DEBUGINFOD_CACHE_PATH
 # NB: inherits the DEBUGINFOD_URLS to the first server
 # NB: run in -L symlink-following mode for the L subdir
-env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../debuginfod/debuginfod -F -d ${DB}_2 -p $PORT2 -L L &
+env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../debuginfod/debuginfod $VERBOSE -F -U -d ${DB}_2 -p $PORT2 -L L D &
 PID2=$!
 tempfiles ${DB}_2
 wait_ready $PORT2 'ready' 1
 
 # have clients contact the new server
 export DEBUGINFOD_URLS=http://127.0.0.1:$PORT2
+
+if type dpkg-deb 2>/dev/null; then
+    # copy in the deb files
+    cp -rvp ${abs_srcdir}/debuginfod-debs/*deb D
+    kill -USR1 $PID2
+    # All debs need to be in the index
+    debs=$(find D -name \*deb | wc -l)
+    wait_ready $PORT2 'scanned_total{source="archive"}' `expr $debs`
+
+    # ubuntu
+    rpm_test f17a29b5a25bd4960531d82aa6b07c8abe84fa66 "" ""
+fi
+
 rm -rf $DEBUGINFOD_CACHE_PATH
 testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID
 


* Re: RFCv2: debuginfod debian archive support
  2019-12-02 22:54 RFCv2: debuginfod debian archive support Frank Ch. Eigler
  2019-12-05 12:17 ` Mark Wielaard
  2019-12-06 21:17 ` Frank Ch. Eigler
@ 2019-12-06 23:32 ` Mark Wielaard
  2019-12-07  3:03   ` Frank Ch. Eigler
  2 siblings, 1 reply; 9+ messages in thread
From: Mark Wielaard @ 2019-12-06 23:32 UTC (permalink / raw)
  To: Frank Ch. Eigler, elfutils-devel

Hi Frank,

On Mon, 2019-12-02 at 17:54 -0500, Frank Ch. Eigler wrote:
> If anyone knows of a distro or ISV who is using standard .zip or .tar
> or somesuch archive formats as their distribution mechanism, it would
> be a tiny effort more to add another option, say "-A" (any archive),
> because libarchive can recognize and unpack a large variety of
> formats.  WDYT?

I looked at some distros, but the only ones that provide consistent
debug[info] packages do so in rpm or deb format.

> Author: Frank Ch. Eigler <fche@redhat.com>
> Date:   Sat Nov 30 10:46:44 2019 -0500
> 
>     debuginfod server: support .deb/.ddeb archives
>     
>     Add support for scanning .deb / .ddeb files, enabled with a new
>     command line option "-U".

Too bad -D was already taken. sniff.

What is the difference between a .deb and a .ddeb file/archive?

> +2019-12-02  Frank Ch. Eigler  <fche@redhat.com>
> +
> +	* debuginfod.cxx (*_rpm_*): Rename to *_archive_* throughout.
> +	(scan_archives): New read-mostly global to identify archive
> +	file extensions and corresponding extractor commands.
> +	(parse_opt): Handle new -U flag.
> +
>  2019-11-26  Mark Wielaard  <mark@klomp.org>
>  
>  	* Makefile.am (BUILD_STATIC): Add needed libraries for libdw and
> diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
> index aa7ffcf..e96a4dd 100644
> --- a/debuginfod/debuginfod.cxx
> +++ b/debuginfod/debuginfod.cxx
> @@ -106,6 +106,15 @@ using namespace std;
>  #endif
>  
>  
> +inline bool
> +string_endswith(const string& haystack, const string& needle)
> +{
> +  return (haystack.size() >= needle.size() &&
> +	  equal(haystack.end()-needle.size(), haystack.end(),
> +                needle.begin()));
> +}
> +
> +
>  // Roll this identifier for every sqlite schema incompatiblity.
>  #define BUILDIDS "buildids9"
>  
> @@ -231,9 +240,9 @@ static const char DEBUGINFOD_SQLITE_DDL[] =
>    "create view if not exists " BUILDIDS "_stats as\n"
>    "          select 'file d/e' as label,count(*) as quantity from " BUILDIDS "_f_de\n"
>    "union all select 'file s',count(*) from " BUILDIDS "_f_s\n"
> -  "union all select 'rpm d/e',count(*) from " BUILDIDS "_r_de\n"
> -  "union all select 'rpm sref',count(*) from " BUILDIDS "_r_sref\n"
> -  "union all select 'rpm sdef',count(*) from " BUILDIDS "_r_sdef\n"
> +  "union all select 'archive d/e',count(*) from " BUILDIDS "_r_de\n"
> +  "union all select 'archive sref',count(*) from " BUILDIDS "_r_sref\n"
> +  "union all select 'archive sdef',count(*) from " BUILDIDS "_r_sdef\n"
>    "union all select 'buildids',count(*) from " BUILDIDS "_buildids\n"
>    "union all select 'filenames',count(*) from " BUILDIDS "_files\n"
>    "union all select 'files scanned (#)',count(*) from " BUILDIDS "_file_mtime_scanned\n"
> @@ -324,6 +333,7 @@ static const struct argp_option options[] =
>     { NULL, 0, NULL, 0, "Scanners:", 1 },
>     { "scan-file-dir", 'F', NULL, 0, "Enable ELF/DWARF file scanning threads.", 0 },
>     { "scan-rpm-dir", 'R', NULL, 0, "Enable RPM scanning threads.", 0 },
> +   { "scan-deb-dir", 'U', NULL, 0, "Enable DEB scanning threads.", 0 },   
>     // "source-oci-imageregistry"  ... 
>  
>     { NULL, 0, NULL, 0, "Options:", 2 },
> @@ -371,7 +381,7 @@ static unsigned maxigroom = false;
>  static unsigned concurrency = std::thread::hardware_concurrency() ?: 1;
>  static set<string> source_paths;
>  static bool scan_files = false;
> -static bool scan_rpms = false;
> +static map<string,string> scan_archives;
>  static vector<string> extra_ddl;
>  static regex_t file_include_regex;
>  static regex_t file_exclude_regex;
> @@ -402,7 +412,13 @@ parse_opt (int key, char *arg,
>        if (http_port > 65535) argp_failure(state, 1, EINVAL, "port number");
>        break;
>      case 'F': scan_files = true; break;
> -    case 'R': scan_rpms = true; break;
> +    case 'R':
> +      scan_archives[".rpm"]="rpm2cpio";
> +      break;
> +    case 'U':
> +      scan_archives[".deb"]="dpkg-deb --fsys-tarfile";
> +      scan_archives[".ddeb"]="dpkg-deb --fsys-tarfile";
> +      break;
>      case 'L':
>        traverse_logical = true;
>        break;
> @@ -851,7 +867,11 @@ handle_buildid_r_match (int64_t b_mtime,
>        return 0;
>      }
>  
> -  string popen_cmd = string("rpm2cpio " + shell_escape(b_source0));
> +  string archive_decoder = "/dev/null";
> +  for (auto&& arch : scan_archives)
> +    if (string_endswith(b_source0, arch.first))
> +      archive_decoder = arch.second;
> +  string popen_cmd = archive_decoder + " " + shell_escape(b_source0);
>    FILE* fp = popen (popen_cmd.c_str(), "r"); // "e" O_CLOEXEC?
>    if (fp == NULL)
>      throw libc_exception (errno, string("popen ") + popen_cmd);

This seems like a lot of work for dealing with non-archives. If the file
doesn't end with any known extension, do we really have to create a
pipe, fork, and execute a shell, only to find that /dev/null couldn't
be executed and then throw an exception?
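
To make the suggestion concrete, here is a minimal sketch of an early
lookup. It assumes a hypothetical helper, find_archive_decoder, that is
not in the patch; ends_with simply mirrors the patch's
string_endswith. The caller would test the return value and return or
throw before ever building popen_cmd.

```cpp
#include <cassert>
#include <map>
#include <string>

// Same suffix test as string_endswith() in the patch.
static bool
ends_with (const std::string& haystack, const std::string& needle)
{
  return haystack.size() >= needle.size()
    && haystack.compare (haystack.size() - needle.size(),
                         needle.size(), needle) == 0;
}

// Hypothetical helper (not in the patch): look up the extractor command
// for PATH.  Returns false, leaving DECODER untouched, when no configured
// extension matches, so the caller can bail out before popen().
static bool
find_archive_decoder (const std::map<std::string,std::string>& archives,
                      const std::string& path, std::string& decoder)
{
  for (const auto& arch : archives)
    if (ends_with (path, arch.first))
      {
        decoder = arch.second;
        return true;
      }
  return false;
}
```

With something like this, both handle_buildid_r_match and
archive_classify could share one lookup instead of repeating the loop
with a "/dev/null" fallback.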

> @@ -863,9 +883,9 @@ handle_buildid_r_match (int64_t b_mtime,
>      throw archive_exception("cannot create archive reader");
>    defer_dtor<struct archive*,int> archive_closer (a, archive_read_free);
>  
> -  rc = archive_read_support_format_cpio(a);
> +  rc = archive_read_support_format_all(a);
>    if (rc != ARCHIVE_OK)
> -    throw archive_exception(a, "cannot select cpio format");
> +    throw archive_exception(a, "cannot select all format");
>    rc = archive_read_support_filter_all(a);
>    if (rc != ARCHIVE_OK)
>      throw archive_exception(a, "cannot select all filters");
> @@ -902,7 +922,7 @@ handle_buildid_r_match (int64_t b_mtime,
>            throw archive_exception(a, "cannot extract file");
>          }
>  
> -      inc_metric ("http_responses_total","result","rpm");
> +      inc_metric ("http_responses_total","result","archive");
>        struct MHD_Response* r = MHD_create_response_from_fd (archive_entry_size(e), fd);
>        if (r == 0)
>          {
> @@ -916,7 +936,7 @@ handle_buildid_r_match (int64_t b_mtime,
>            MHD_add_response_header (r, "Content-Type", "application/octet-stream");
>            add_mhd_last_modified (r, archive_entry_mtime(e));
>            if (verbose > 1)
> -            obatched(clog) << "serving rpm " << b_source0 << " file " << b_source1 << endl;
> +            obatched(clog) << "serving archive " << b_source0 << " file " << b_source1 << endl;
>            /* libmicrohttpd will close it. */
>            if (result_fd)
>              *result_fd = fd;
> @@ -1858,16 +1878,20 @@ thread_main_scan_source_file_path (void* arg)
>  
>  
>  
> -// Analyze given *.rpm file of given age; record buildids / exec/debuginfo-ness of its
> +// Analyze given archive file of given age; record buildids / exec/debuginfo-ness of its
>  // constituent files with given upsert statements.
>  static void
> -rpm_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_upsert_files,
> +archive_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_upsert_files,
>                sqlite_ps& ps_upsert_de, sqlite_ps& ps_upsert_sref, sqlite_ps& ps_upsert_sdef,
>                time_t mtime,
>                unsigned& fts_executable, unsigned& fts_debuginfo, unsigned& fts_sref, unsigned& fts_sdef,
>                bool& fts_sref_complete_p)
>  {
> -  string popen_cmd = string("rpm2cpio " + shell_escape(rps));
> +  string archive_decoder = "/dev/null";
> +  for (auto&& arch : scan_archives)
> +    if (string_endswith(rps, arch.first))
> +      archive_decoder = arch.second;
> +  string popen_cmd = archive_decoder + " " + shell_escape(rps);
>    FILE* fp = popen (popen_cmd.c_str(), "r"); // "e" O_CLOEXEC?
>    if (fp == NULL)
>      throw libc_exception (errno, string("popen ") + popen_cmd);

Same as above. Can we skip the whole popen dance if nothing
matches?

> @@ -1879,9 +1903,9 @@ rpm_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_up
>      throw archive_exception("cannot create archive reader");
>    defer_dtor<struct archive*,int> archive_closer (a, archive_read_free);
>  
> -  int rc = archive_read_support_format_cpio(a);
> +  int rc = archive_read_support_format_all(a);
>    if (rc != ARCHIVE_OK)
> -    throw archive_exception(a, "cannot select cpio format");
> +    throw archive_exception(a, "cannot select all formats");
> 
>    rc = archive_read_support_filter_all(a);
>    if (rc != ARCHIVE_OK)
>      throw archive_exception(a, "cannot select all filters");

In theory you could know the format you are expecting, just like you
know the decoder above. That might give slightly more accurate error
messages, and maybe be slightly more efficient, since libarchive would
not need to detect the format itself.

> @@ -2027,11 +2051,11 @@ rpm_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_up
>  
>  
>  
> -// scan for *.rpm files
> +// scan for archive files such as .rpm
>  static void
> -scan_source_rpm_path (const string& dir)
> +scan_source_archive_path (const string& dir)
>  {
> -  obatched(clog) << "fts/rpm traversing " << dir << endl;
> +  obatched(clog) << "fts/archive traversing " << dir << endl;
>  
>    sqlite_ps ps_upsert_buildids (db, "rpm-buildid-intern", "insert or ignore into " BUILDIDS "_buildids VALUES (NULL, ?);");
>    sqlite_ps ps_upsert_files (db, "rpm-file-intern", "insert or ignore into " BUILDIDS "_files VALUES (NULL, ?);");
> @@ -2060,7 +2084,7 @@ scan_source_rpm_path (const string& dir)
>    struct timeval tv_start, tv_end;
>    gettimeofday (&tv_start, NULL);
>    unsigned fts_scanned=0, fts_regex=0, fts_cached=0, fts_debuginfo=0;
> -  unsigned fts_executable=0, fts_rpm = 0, fts_sref=0, fts_sdef=0;
> +  unsigned fts_executable=0, fts_archive = 0, fts_sref=0, fts_sdef=0;
>  
>    FTS *fts = fts_open (dirs,
>                         (traverse_logical ? FTS_LOGICAL : FTS_PHYSICAL|FTS_XDEV)
> @@ -2082,7 +2106,7 @@ scan_source_rpm_path (const string& dir)
>          break;
>  
>        if (verbose > 2)
> -        obatched(clog) << "fts/rpm traversing " << f->fts_path << endl;
> +        obatched(clog) << "fts/archive traversing " << f->fts_path << endl;
>  
>        try
>          {
> @@ -2101,7 +2125,7 @@ scan_source_rpm_path (const string& dir)
>            if (!ri || rx)
>              {
>                if (verbose > 3)
> -                obatched(clog) << "fts/rpm skipped by regex " << (!ri ? "I" : "") << (rx ? "X" : "") << endl;
> +                obatched(clog) << "fts/archive skipped by regex " << (!ri ? "I" : "") << (rx ? "X" : "") << endl;
>                fts_regex ++;
>                continue;
>              }
> @@ -2116,13 +2140,13 @@ scan_source_rpm_path (const string& dir)
>  
>              case FTS_F:
>                {
> -                // heuristic: reject if file name does not end with ".rpm"
> -                // (alternative: try opening with librpm etc., caching)
> -                string suffix = ".rpm";
> -                if (rps.size() < suffix.size() ||
> -                    rps.substr(rps.size()-suffix.size()) != suffix)
> +		bool any = false;
> +                for (auto&& arch : scan_archives)
> +                  if (string_endswith(rps, arch.first))
> +		    any = true;
> +		if (! any)
>                    continue;
> -                fts_rpm ++;
> +                fts_archive ++;
>  
>                  /* See if we know of it already. */
>                  int rc = ps_query
> @@ -2133,7 +2157,7 @@ scan_source_rpm_path (const string& dir)
>                  ps_query.reset();
>                  if (rc == SQLITE_ROW) // i.e., a result, as opposed to DONE (no results)
>                    // no need to recheck a file/version we already know
> -                  // specifically, no need to parse this rpm again, since we already have
> +                  // specifically, no need to parse this archive again, since we already have
>                    // it as a D or E or S record,
>                    // (so is stored with buildid=NULL)
>                    {
> @@ -2141,29 +2165,29 @@ scan_source_rpm_path (const string& dir)
>                      continue;
>                    }
>  
> -                // intern the rpm file name
> +                // intern the archive file name
>                  ps_upsert_files
>                    .reset()
>                    .bind(1, rps)
>                    .step_ok_done();
>  
> -                // extract the rpm contents via popen("rpm2cpio") | libarchive | loop-of-elf_classify()
> +                // extract the archive contents
>                  unsigned my_fts_executable = 0, my_fts_debuginfo = 0, my_fts_sref = 0, my_fts_sdef = 0;
>                  bool my_fts_sref_complete_p = true;
>                  try
>                    {
> -                    rpm_classify (rps,
> +                    archive_classify (rps,
>                                    ps_upsert_buildids, ps_upsert_files,
>                                    ps_upsert_de, ps_upsert_sref, ps_upsert_sdef, // dalt
>                                    f->fts_statp->st_mtime,
>                                    my_fts_executable, my_fts_debuginfo, my_fts_sref, my_fts_sdef,
>                                    my_fts_sref_complete_p);
> -                    inc_metric ("scanned_total","source","rpm");
> -                    add_metric("found_debuginfo_total","source","rpm",
> +                    inc_metric ("scanned_total","source","archive");
> +                    add_metric("found_debuginfo_total","source","archive",
>                                 my_fts_debuginfo);
> -                    add_metric("found_executable_total","source","rpm",
> +                    add_metric("found_executable_total","source","archive",
>                                 my_fts_executable);
> -                    add_metric("found_sourcerefs_total","source","rpm",
> +                    add_metric("found_sourcerefs_total","source","archive",
>                                 my_fts_sref);
>                    }
>                  catch (const reportable_exception& e)
> @@ -2172,7 +2196,7 @@ scan_source_rpm_path (const string& dir)
>                    }
>  
>                  if (verbose > 2)
> -                  obatched(clog) << "scanned rpm=" << rps
> +                  obatched(clog) << "scanned archive=" << rps
>                                   << " mtime=" << f->fts_statp->st_mtime
>                                   << " executables=" << my_fts_executable
>                                   << " debuginfos=" << my_fts_debuginfo
> @@ -2197,7 +2221,7 @@ scan_source_rpm_path (const string& dir)
>  
>              case FTS_ERR:
>              case FTS_NS:
> -              throw libc_exception(f->fts_errno, string("fts/rpm traversal ") + string(f->fts_path));
> +              throw libc_exception(f->fts_errno, string("fts/archive traversal ") + string(f->fts_path));
>  
>              default:
>              case FTS_SL: /* ignore symlinks; seen in non-L mode only */
> @@ -2206,9 +2230,9 @@ scan_source_rpm_path (const string& dir)
>  
>            if ((verbose && f->fts_info == FTS_DP) ||
>                (verbose > 1 && f->fts_info == FTS_F))
> -            obatched(clog) << "fts/rpm traversing " << rps << ", scanned=" << fts_scanned
> +            obatched(clog) << "fts/archive traversing " << rps << ", scanned=" << fts_scanned
>                             << ", regex-skipped=" << fts_regex
> -                           << ", rpm=" << fts_rpm << ", cached=" << fts_cached << ", debuginfo=" << fts_debuginfo
> +                           << ", archive=" << fts_archive << ", cached=" << fts_cached << ", debuginfo=" << fts_debuginfo
>                             << ", executable=" << fts_executable
>                             << ", sourcerefs=" << fts_sref << ", sourcedefs=" << fts_sdef << endl;
>          }
> @@ -2222,9 +2246,9 @@ scan_source_rpm_path (const string& dir)
>    gettimeofday (&tv_end, NULL);
>    double deltas = (tv_end.tv_sec - tv_start.tv_sec) + (tv_end.tv_usec - tv_start.tv_usec)*0.000001;
>  
> -  obatched(clog) << "fts/rpm traversed " << dir << " in " << deltas << "s, scanned=" << fts_scanned
> +  obatched(clog) << "fts/archive traversed " << dir << " in " << deltas << "s, scanned=" << fts_scanned
>                   << ", regex-skipped=" << fts_regex
> -                 << ", rpm=" << fts_rpm << ", cached=" << fts_cached << ", debuginfo=" << fts_debuginfo
> +                 << ", archive=" << fts_archive << ", cached=" << fts_cached << ", debuginfo=" << fts_debuginfo
>                   << ", executable=" << fts_executable
>                   << ", sourcerefs=" << fts_sref << ", sourcedefs=" << fts_sdef << endl;
>  }
> @@ -2232,18 +2256,18 @@ scan_source_rpm_path (const string& dir)
>  
>  
>  static void*
> -thread_main_scan_source_rpm_path (void* arg)
> +thread_main_scan_source_archive_path (void* arg)
>  {
>    string dir = string((const char*) arg);
>  
>    unsigned rescan_timer = 0;
>    sig_atomic_t forced_rescan_count = 0;
> -  set_metric("thread_timer_max", "rpm", dir, rescan_s);
> -  set_metric("thread_tid", "rpm", dir, tid());
> +  set_metric("thread_timer_max", "archive", dir, rescan_s);
> +  set_metric("thread_tid", "archive", dir, tid());
>    while (! interrupted)
>      {
> -      set_metric("thread_timer", "rpm", dir, rescan_timer);
> -      set_metric("thread_forced_total", "rpm", dir, forced_rescan_count);
> +      set_metric("thread_timer", "archive", dir, rescan_timer);
> +      set_metric("thread_forced_total", "archive", dir, forced_rescan_count);
>        if (rescan_s && rescan_timer > rescan_s)
>          rescan_timer = 0;
>        if (sigusr1 != forced_rescan_count)
> @@ -2254,10 +2278,10 @@ thread_main_scan_source_rpm_path (void* arg)
>        if (rescan_timer == 0)
>          try
>            {
> -            set_metric("thread_working", "rpm", dir, time(NULL));
> -            inc_metric("thread_work_total", "rpm", dir);
> -            scan_source_rpm_path (dir);
> -            set_metric("thread_working", "rpm", dir, 0);
> +            set_metric("thread_working", "archive", dir, time(NULL));
> +            inc_metric("thread_work_total", "archive", dir);
> +            scan_source_archive_path (dir);
> +            set_metric("thread_working", "archive", dir, 0);
>            }
>          catch (const sqlite_exception& e)
>            {

So in general it is nice that libarchive can just open any given
archive and that we don't need to update the sql schema to keep track
of different archive types. But it does feel like the errors, logs and
metrics are a little generic (e.g. "cannot select all format"). I think
in a couple of places it would be nice if we just knew whether we were
dealing with rpms or debs. It also seems nice to have some separate
metrics. Might it be an idea to make scan_archives not just a simple
string map, but to include the actual archive type in it? Or to pass
the expected type to the scanner thread, archive_classify, or
scan_source_archive_path, so you can keep referring to rpms or debs in
the logs, error messages and metrics but still keep them as generic
"archives" in the sql tables?
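
Roughly what I have in mind, as a sketch only. The names archive_kind,
make_archive_kinds and classify_path are hypothetical and not in the
patch. The label would go into logs and metrics, and the format field
records what each decoder emits (rpm2cpio writes cpio, dpkg-deb
--fsys-tarfile writes tar), so the reader could select
archive_read_support_format_cpio or archive_read_support_format_tar
rather than archive_read_support_format_all.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical descriptor (not in the patch): what the scanner would need
// to know per file extension.
struct archive_kind
{
  std::string label;    // for logs and metrics, e.g. "rpm" or "deb"
  std::string decoder;  // command whose stdout is fed to libarchive
  std::string format;   // stream format the decoder emits: "cpio" or "tar"
};

// Build the extension -> descriptor map that parse_opt would fill in
// for -R (rpm) and -U (deb).
static std::map<std::string,archive_kind>
make_archive_kinds (bool rpm, bool deb)
{
  std::map<std::string,archive_kind> kinds;
  if (rpm)
    kinds[".rpm"] = archive_kind{"rpm", "rpm2cpio", "cpio"};
  if (deb)
    {
      kinds[".deb"] = archive_kind{"deb", "dpkg-deb --fsys-tarfile", "tar"};
      kinds[".ddeb"] = archive_kind{"deb", "dpkg-deb --fsys-tarfile", "tar"};
    }
  return kinds;
}

// Return the descriptor whose extension PATH ends with, or nullptr if
// none matches.  The pointer stays valid as long as KINDS does.
static const archive_kind*
classify_path (const std::map<std::string,archive_kind>& kinds,
               const std::string& path)
{
  for (const auto& k : kinds)
    {
      const std::string& ext = k.first;
      if (path.size() >= ext.size()
          && path.compare (path.size() - ext.size(), ext.size(), ext) == 0)
        return &k.second;
    }
  return nullptr;
}
```

The sql tables would not need to change; only the strings passed to the
log and metric calls would come from the label instead of a fixed
"archive".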

> @@ -2490,8 +2514,8 @@ main (int argc, char *argv[])
>        error (EXIT_FAILURE, 0,
>               "unexpected argument: %s", argv[remaining]);
>  
> -  if (!scan_rpms && !scan_files && source_paths.size()>0)
> -    obatched(clog) << "warning: without -F and/or -R, ignoring PATHs" << endl;
> +  if (scan_archives.size()==0 && !scan_files && source_paths.size()>0)
> +    obatched(clog) << "warning: without -F -R -U, ignoring PATHs" << endl;
>  
>    (void) signal (SIGPIPE, SIG_IGN); // microhttpd can generate it incidentally, ignore
>    (void) signal (SIGINT, signal_handler); // ^C
> @@ -2611,16 +2635,22 @@ main (int argc, char *argv[])
>    if (maxigroom)
>      obatched(clog) << "maxigroomed database" << endl;
>  
> -
>    obatched(clog) << "search concurrency " << concurrency << endl;
>    obatched(clog) << "rescan time " << rescan_s << endl;
>    obatched(clog) << "groom time " << groom_s << endl;
> +  if (scan_archives.size()>0)
> +    {
> +      obatched ob(clog);
> +      auto& o = ob << "scanning archive types ";
> +      for (auto&& arch : scan_archives)
> +	o << arch.first << " ";
> +      o << endl;
> +    }

The indentation is correct in the source, but email seems to have
mangled it. Just note that tabs and spaces aren't used consistently in
the source file.

>    const char* du = getenv(DEBUGINFOD_URLS_ENV_VAR);
>    if (du && du[0] != '\0') // set to non-empty string?
>      obatched(clog) << "upstream debuginfod servers: " << du << endl;
>  
> -  vector<pthread_t> source_file_scanner_threads;
> -  vector<pthread_t> source_rpm_scanner_threads;
> +  vector<pthread_t> scanner_threads;
>    pthread_t groom_thread;
>  
>    rc = pthread_create (& groom_thread, NULL, thread_main_groom, NULL);
> @@ -2632,20 +2662,21 @@ main (int argc, char *argv[])
>        pthread_t pt;
>        rc = pthread_create (& pt, NULL, thread_main_scan_source_file_path, (void*) it.c_str());
>        if (rc < 0)
> -        error (0, 0, "warning: cannot spawn thread (%d) to scan source files %s\n", rc, it.c_str());
> +        error (0, 0, "warning: cannot spawn thread (%d) to scan files %s\n", rc, it.c_str());
>        else
> -        source_file_scanner_threads.push_back(pt);
> +        scanner_threads.push_back(pt);
>      }
>  
> -  if (scan_rpms) for (auto&& it : source_paths)
> -    {
> -      pthread_t pt;
> -      rc = pthread_create (& pt, NULL, thread_main_scan_source_rpm_path, (void*) it.c_str());
> -      if (rc < 0)
> -        error (0, 0, "warning: cannot spawn thread (%d) to scan source rpms %s\n", rc, it.c_str());
> -      else
> -        source_rpm_scanner_threads.push_back(pt);
> -    }
> +  if (scan_archives.size() > 0)
> +    for (auto&& it : source_paths)
> +      {
> +        pthread_t pt;
> +        rc = pthread_create (& pt, NULL, thread_main_scan_source_archive_path, (void*) it.c_str());
> +        if (rc < 0)
> +          error (0, 0, "warning: cannot spawn thread (%d) to scan archives %s\n", rc, it.c_str());
> +        else
> +          scanner_threads.push_back(pt);
> +      }
>  
>    /* Trivial main loop! */
>    set_metric("ready", 1);
> @@ -2656,10 +2687,8 @@ main (int argc, char *argv[])
>    if (verbose)
>      obatched(clog) << "stopping" << endl;
>  
> -  /* Join any source scanning threads. */
> -  for (auto&& it : source_file_scanner_threads)
> -    pthread_join (it, NULL);
> -  for (auto&& it : source_rpm_scanner_threads)
> +  /* Join all our threads. */
> +  for (auto&& it : scanner_threads)
>      pthread_join (it, NULL);
>    pthread_join (groom_thread, NULL);

Note that in some of the unpatched code there are still references to
rpms in some logs and comments that should also be "archive".
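
For reference, the consolidated thread bookkeeping in the hunk above can
be sketched roughly as follows.  This is only an illustration, not the
debuginfod code: scan_path and spawn_and_join are made-up names, and the
scanner body is a stub.  One deliberate difference from the hunk: the
sketch tests rc != 0, because pthread_create reports failure by returning
a positive error number rather than a negative value.

```cpp
#include <pthread.h>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for thread_main_scan_source_file_path /
// thread_main_scan_source_archive_path: it only receives a path.
static void* scan_path (void* arg)
{
  const char* path = static_cast<const char*>(arg);
  (void) path; // a real scanner would traverse this directory
  return NULL;
}

// Spawn one scanner thread per path, keep every handle in a single
// vector, then join them all.  Returns the number of threads joined.
static std::size_t spawn_and_join (const std::vector<std::string>& source_paths)
{
  std::vector<pthread_t> scanner_threads;
  for (auto&& it : source_paths)
    {
      pthread_t pt;
      int rc = pthread_create (&pt, NULL, scan_path, (void*) it.c_str());
      if (rc != 0) // error number on failure, never -1
        fprintf (stderr, "warning: cannot spawn thread (%s) to scan %s\n",
                 strerror (rc), it.c_str());
      else
        scanner_threads.push_back (pt);
    }
  /* Join all our threads. */
  for (auto&& it : scanner_threads)
    pthread_join (it, NULL);
  return scanner_threads.size();
}
```

With a single vector, the shutdown path needs only one join loop no
matter how many kinds of scanner get added later.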

> diff --git a/doc/ChangeLog b/doc/ChangeLog
> index 00a61ac..398048f 100644
> --- a/doc/ChangeLog
> +++ b/doc/ChangeLog
> @@ -1,3 +1,12 @@
> +2019-12-02  Frank Ch. Eigler  <fche@redhat.com
> +
> +	* debuginfod.8: Add -U (DEB) flag, generalize RPM to "archive".
> +
> +2019-11-26  Frank Ch. Eigler  <fche@redhat.com>
> +	    Aaron Merey  <amerey@redhat.com>
> +
> +	* debuginfod.8, find-debuginfo.1, debuginfod_*.3: New files.
> +

That second ChangeLog entry looks incorrect.

>  .TP
>  .B "\-t SECONDS"  "\-\-rescan\-time=SECONDS"
> -Set the rescan time for the file and RPM directories.  This is the
> +Set the rescan time for the file and archive directories.  This is the
>  amount of time the scanning threads will wait after finishing a scan,
>  before doing it again.  A rescan for unchanged files is fast (because
>  the index also stores the file mtimes).  A time of zero is acceptable,
> @@ -143,8 +152,8 @@ independent of the groom time (including if it was zero).
>  .B "\-G"
>  Run an extraordinary maximal-grooming pass at debuginfod startup.
>  This pass can take considerable time, because it tries to remove any
> -debuginfo-unrelated content from the RPM-related parts of the index.
> -It should not be run if any recent RPM-related indexing operations
> +debuginfo-unrelated content from the archive-related parts of the index.
> +It should not be run if any recent archive-related indexing operations
>  were aborted early.  It can take considerable space, because it
>  finishes up with an sqlite "vacuum" operation, which repacks the
>  database file by triplicating it temporarily.  The default is not to
> @@ -155,7 +164,7 @@ do maximal-grooming.  See also the \fIDATA MANAGEMENT\fP section.
>  Set the concurrency limit for all the scanning threads.  While many
>  threads may be spawned to cover all the given PATHs, only NUM may
>  concurrently do CPU-intensive operations like parsing an ELF file
> -or an RPM.  The default is the number of processors on the system;
> +or an archive.  The default is the number of processors on the system;
>  the minimum is 1.
>  
>  .TP
> @@ -257,10 +266,10 @@ many files.  This section offers some advice about the implications.
>  
>  As a general explanation for size, consider that debuginfod indexes
>  ELF/DWARF files, it stores their names and referenced source file
> -names, and buildids will be stored.  When indexing RPMs, it stores
> -every file name \fIof or in\fP an RPM, every buildid, plus every
> -source file name referenced from a DWARF file.  (Indexing RPMs takes
> -more space because the source files often reside in separate
> +names, and buildids will be stored.  When indexing archives, it stores
> +every file name \fIof or in\fP an archive, every buildid, plus every
> +source file name referenced from a DWARF file.  (Indexing archives
> +takes more space because the source files often reside in separate
>  subpackages that may not be indexed at the same pass, so extra
>  metadata has to be kept.)

Is my understanding that debug debs don't contain the actual source
correct? If that is the case can't we take advantage of that by never
indexing source files from debs?
 
> @@ -283,14 +292,14 @@ This means that the sqlite files grow fast during initial indexing,
>  slowly during index rescans, and periodically shrink during grooming.
>  There is also an optional one-shot \fImaximal grooming\fP pass is
>  available.  It removes information debuginfo-unrelated data from the
> -RPM content index such as file names found in RPMs ("rpm sdef"
> -records) that are not referred to as source files from any binaries
> -find in RPMs ("rpm sref" records).  This can save considerable disk
> -space.  However, it is slow and temporarily requires up to twice the
> -database size as free space.  Worse: it may result in missing
> -source-code info if the RPM traversals were interrupted, so the not
> -all source file references were known.  Use it rarely to polish a
> -complete index.
> +archive content index such as file names found in archives ("archive
> +sdef" records) that are not referred to as source files from any
> +binaries find in archives ("archive sref" records).  This can save
> +considerable disk space.  However, it is slow and temporarily requires
> +up to twice the database size as free space.  Worse: it may result in
> +missing source-code info if the archive traversals were interrupted,
> +so the not all source file references were known.  Use it rarely to
> +polish a complete index.

Already in the original, but should that be "so that not all
source..."?

>  You should ensure that ample disk space remains available.  (The flood
>  of error messages on -ENOSPC is ugly and nagging.  But, like for most
> @@ -317,7 +326,7 @@ happens, new versions of debuginfod will issue SQL statements to
>  \fIdrop\fP all prior schema & data, and start over.  So, disk space
>  will not be wasted for retaining a no-longer-useable dataset.
>  
> -In summary, if your system can bear a 0.5%-3% index-to-RPM-dataset
> +In summary, if your system can bear a 0.5%-3% index-to-archive-dataset
>  size ratio, and slow growth afterwards, you should not need to
>  worry about disk space.  If a system crash corrupts the database,
>  or you want to force debuginfod to reset and start over, simply

Here I think you should leave it at RPM since that is what you
originally measured. It isn't clear to me that things are the same for
debs since those don't include the sources?

Cheers,

Mark


* Re: RFCv2: debuginfod debian archive support
  2019-12-06 23:32 ` Mark Wielaard
@ 2019-12-07  3:03   ` Frank Ch. Eigler
  2019-12-11 21:21     ` Mark Wielaard
  0 siblings, 1 reply; 9+ messages in thread
From: Frank Ch. Eigler @ 2019-12-07  3:03 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: elfutils-devel

Hi -

Thanks for the review!

> I looked at some distros, but the only ones that provide consistent
> debug[info] packages do so in rpm or deb format.

Yeah.

> What is the difference between a .deb and a .ddeb file/archive?

AIUI, .ddeb = debugging .deb ... although the "-dbgsym" substring
already says about the same thing.


> > @@ -851,7 +867,11 @@ handle_buildid_r_match (int64_t b_mtime,
> >        return 0;
> >      }
> >  
> > -  string popen_cmd = string("rpm2cpio " + shell_escape(b_source0));
> > +  string archive_decoder = "/dev/null";
> > +  for (auto&& arch : scan_archives)
> > +    if (string_endswith(b_source0, arch.first))
> > +      archive_decoder = arch.second;
> > +  string popen_cmd = archive_decoder + " " + shell_escape(b_source0);
> >    FILE* fp = popen (popen_cmd.c_str(), "r"); // "e" O_CLOEXEC?
> >    if (fp == NULL)
> >      throw libc_exception (errno, string("popen ") + popen_cmd);
> 
> This seems a lot of work for dealing with non-archives. 

We don't do this for non-archives.  This is in the path where an
archive record already matched in the database.


> > +archive_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_upsert_files,
> >                sqlite_ps& ps_upsert_de, sqlite_ps& ps_upsert_sref, sqlite_ps& ps_upsert_sdef,
> >                time_t mtime,
> >                unsigned& fts_executable, unsigned& fts_debuginfo, unsigned& fts_sref, unsigned& fts_sdef,
> >                bool& fts_sref_complete_p)
> >  {
> > -  string popen_cmd = string("rpm2cpio " + shell_escape(rps));
> > +  string archive_decoder = "/dev/null";
> > +  for (auto&& arch : scan_archives)
> > +    if (string_endswith(rps, arch.first))
> > +      archive_decoder = arch.second;
> > +  string popen_cmd = archive_decoder + " " + shell_escape(rps);
> >    FILE* fp = popen (popen_cmd.c_str(), "r"); // "e" O_CLOEXEC?
> >    if (fp == NULL)
> >      throw libc_exception (errno, string("popen ") + popen_cmd);
> 
> Likewise as above. Can we skip the whole popen dance if nothing
> matches?

If you check out the caller, this part is not even called if
the extension does not match.
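
To make the lookup concrete, here is a minimal sketch of the
extension-to-decoder selection.  It is not the patch code: the table
contents are assumptions for illustration, pick_decoder is a made-up
name, and string_endswith is reimplemented here so the fragment stands
alone.  Unlike the patch, which falls back to "/dev/null", the sketch
returns an empty string for an unknown extension so that a caller could
skip the popen entirely.

```cpp
#include <map>
#include <string>

// Extension -> extractor command.  These entries are illustrative
// assumptions, not copied from the patch.
static const std::map<std::string, std::string> scan_archives = {
  { ".rpm",  "rpm2cpio" },
  { ".deb",  "dpkg-deb --fsys-tarfile" },
  { ".ddeb", "dpkg-deb --fsys-tarfile" },
};

// True if haystack ends with needle.
static bool string_endswith (const std::string& haystack, const std::string& needle)
{
  return haystack.size() >= needle.size()
    && haystack.compare (haystack.size() - needle.size(),
                         needle.size(), needle) == 0;
}

// Return the decoder command for a file name, or "" when no
// configured extension matches.
static std::string pick_decoder (const std::string& path)
{
  for (auto&& arch : scan_archives)
    if (string_endswith (path, arch.first))
      return arch.second;
  return "";
}
```

Note that "foo.ddeb" does not end with ".deb", so the two Debian
extensions need separate table entries.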


> >    rc = archive_read_support_filter_all(a);
> >    if (rc != ARCHIVE_OK)
> >      throw archive_exception(a, "cannot select all filters");
> 
> In theory you could know the format you are expecting, just like you
> know the decoder above. Might that give slightly more accurate error
> messages, and maybe be slightly more efficient, since libarchive
> wouldn't need to try to detect the format itself?

The "filter" part is for different compression algorithms.  Keeping it
this way costs us nothing and lets the code tolerate changes in
compression policy in the future.


> But it does feel like the errors, logs and metrics are a little
> generic (e.g. "cannot select all format").

Specializing the format errors could help only if debuginfod were run
against rpms that had a non-cpio payload, or debs that had a non-tar
payload.  That would mean either some sort of corruption, which
contravenes our "trustworthy data" assumption, or an upstream policy
change, which is nothing to worry about.

If you think separate metrics for .deb vs .rpm archives might be
useful, can do.


> Note that in some of the unpatched code there are still references to
> rpms in some logs and comments that should also be "archive".

OK, will recheck, but some of those references may be valid because
RPMs do have some unique characteristics.  Some other mentions were
nuked from the branch version of the patch, so the email is a little
obsolete.


> > +2019-12-02  Frank Ch. Eigler  <fche@redhat.com
> > +
> > +	* debuginfod.8: Add -U (DEB) flag, generalize RPM to "archive".
> > +
> > +2019-11-26  Frank Ch. Eigler  <fche@redhat.com>
> > +	    Aaron Merey  <amerey@redhat.com>
> > +
> > +	* debuginfod.8, find-debuginfo.1, debuginfod_*.3: New files.
> > +
> 
> That second ChangeLog entry looks incorrect.

It adds an entry that was missed earlier.


> Is my understanding that debug debs don't contain the actual source
> correct?

Correct, not at this time.

> If that is the case can't we take advantage of that by never
> indexing source files from debs?

Or specifically, not populating the -sdefs/-srefs data.  That's a
reasonable optimization - OTOH if at some point, deb files start
including proper debugedit-style sources, then this code will just
work.


> [...]
> Already in the original, but should that be "so that not all
> source..."?

Yes.


> > -In summary, if your system can bear a 0.5%-3% index-to-RPM-dataset
> > +In summary, if your system can bear a 0.5%-3% index-to-archive-dataset
> >  size ratio, and slow growth afterwards, you should not need to
> >  worry about disk space.  If a system crash corrupts the database,
> >  or you want to force debuginfod to reset and start over, simply
> 
> Here I think you should leave it at RPM since that is what you
> originally measured. It isn't clear to me that things are the same for
> debs since those don't include the sources?

Actually, if the above optimization is not made, the present code does
consume approximately the same amount of data (outgoing source-refs
plus deb-file-list source-defs).
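
For a sense of scale, the 0.5%-3% ratio can be worked through with a
trivial calculation.  The helper name and the 100 GB dataset size below
are made up for illustration; only the ratio itself comes from the man
page text.

```cpp
#include <cstdint>

struct index_estimate
{
  std::uint64_t low_bytes;   // dataset size * 0.5%
  std::uint64_t high_bytes;  // dataset size * 3%
};

// Apply the 0.5%-3% index-to-dataset ratio to a dataset size in bytes.
// Integer arithmetic: 0.5% = 5/1000, 3% = 30/1000.
static index_estimate estimate_index_size (std::uint64_t dataset_bytes)
{
  return { dataset_bytes / 1000 * 5, dataset_bytes / 1000 * 30 };
}
```

So a hypothetical 100 GB (10^11 byte) archive tree would correspond to
an index of roughly 0.5 GB to 3 GB.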


- FChE


* Re: RFCv2: debuginfod debian archive support
  2019-12-06 21:17 ` Frank Ch. Eigler
@ 2019-12-11 21:06   ` Mark Wielaard
  0 siblings, 0 replies; 9+ messages in thread
From: Mark Wielaard @ 2019-12-11 21:06 UTC (permalink / raw)
  To: Frank Ch. Eigler, elfutils-devel

Hi,

On Fri, 2019-12-06 at 16:17 -0500, Frank Ch. Eigler wrote:
> Presenting testing for the debuginfod .deb/.ddeb support patch, after
> finding a good debian-packaging tutorial, and generating a workable
> basic set of test deb's on a Ubuntu box.

According to Matthias on irc there should be no difference between
Ubuntu and Debian doing this, so this should cover both systems.

>     debuginfod: deb support, tests
>     
>     Using a synthetic .deb/.ddeb from a Ubuntu 18 machine,
>     extend the debuginfod testsuite with some .deb processing,
>     if the dpkg-deb binary is installed.

Looks good. Thanks.
All buildbot workers should already have the dpkg-deb binary installed.

Two nitpicks below, not very important.

> diff --git a/config/ChangeLog b/config/ChangeLog
> index d71fb39..9b2a408 100644
> --- a/config/ChangeLog
> +++ b/config/ChangeLog
> @@ -1,3 +1,8 @@
> +2019-12-06  Frank Ch. Eigler  <fche@redhat.com>
> +
> +	* elfutils.spec.in (debuginfod): Add BuildRequire dpkg
> +	for deb testing.  (Available on Fedora & EPEL, not base RHEL.)
> +
>  2019-11-28  Mark Wielaard  <mark@klomp.org>
>  
>  	* elfutils.spec.in (debuginfod): Add an explicit Requires
> diff --git a/config/elfutils.spec.in b/config/elfutils.spec.in
> index 1cdca21..faeb7f8 100644
> --- a/config/elfutils.spec.in
> +++ b/config/elfutils.spec.in
> @@ -35,6 +35,9 @@ BuildRequires: pkgconfig(libarchive) >= 3.1.2
>  BuildRequires: bzip2
>  # For the run-debuginfod-find.sh test case in %check for /usr/sbin/ss
>  BuildRequires: iproute
> +%if 0%{?fedora} >= 20
> +BuildRequires: dpkg
> +%endif
>  BuildRequires: curl

I believe all public builders (koji, copr, etc) do have epel enabled,
so I would not conditionalize the BuildRequires.

> @@ -205,10 +209,12 @@ rpm_test() {

Maybe rename this archive_test?

Cheers,

Mark


* Re: RFCv2: debuginfod debian archive support
  2019-12-07  3:03   ` Frank Ch. Eigler
@ 2019-12-11 21:21     ` Mark Wielaard
  2019-12-13 19:25       ` Frank Ch. Eigler
  0 siblings, 1 reply; 9+ messages in thread
From: Mark Wielaard @ 2019-12-11 21:21 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: elfutils-devel

Hi Frank,

On Fri, 2019-12-06 at 22:03 -0500, Frank Ch. Eigler wrote:
> @@ -851,7 +867,11 @@ handle_buildid_r_match (int64_t b_mtime,
> > >        return 0;
> > >      }
> > >  
> > > -  string popen_cmd = string("rpm2cpio " + shell_escape(b_source0));
> > > +  string archive_decoder = "/dev/null";
> > > +  for (auto&& arch : scan_archives)
> > > +    if (string_endswith(b_source0, arch.first))
> > > +      archive_decoder = arch.second;
> > > +  string popen_cmd = archive_decoder + " " + shell_escape(b_source0);
> > >    FILE* fp = popen (popen_cmd.c_str(), "r"); // "e" O_CLOEXEC?
> > >    if (fp == NULL)
> > >      throw libc_exception (errno, string("popen ") + popen_cmd);
> > 
> > This seems a lot of work for dealing with non-archives. 
> 
> We don't do this for non-archives.  This is in the path where an
> archive record already matched in the database.
> 
> > > +archive_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_upsert_files,
> > >                sqlite_ps& ps_upsert_de, sqlite_ps& ps_upsert_sref, sqlite_ps& ps_upsert_sdef,
> > >                time_t mtime,
> > >                unsigned& fts_executable, unsigned& fts_debuginfo, unsigned& fts_sref, unsigned& fts_sdef,
> > >                bool& fts_sref_complete_p)
> > >  {
> > > -  string popen_cmd = string("rpm2cpio " + shell_escape(rps));
> > > +  string archive_decoder = "/dev/null";
> > > +  for (auto&& arch : scan_archives)
> > > +    if (string_endswith(rps, arch.first))
> > > +      archive_decoder = arch.second;
> > > +  string popen_cmd = archive_decoder + " " + shell_escape(rps);
> > >    FILE* fp = popen (popen_cmd.c_str(), "r"); // "e" O_CLOEXEC?
> > >    if (fp == NULL)
> > >      throw libc_exception (errno, string("popen ") + popen_cmd);
> > 
> > Likewise as above. Can we skip the whole popen dance if nothing
> > matches?
> 
> If you check out the caller, this part is not even called if
> the extension does not match.

I see, I missed that both functions are only called after first
checking the archive type. I think it might be helpful/clearer if both
methods would be called with the intended archive type then, also
because that might make it simpler to...

> > But it does feel like the errors, logs and metrics are a little
> > generic (e.g. "cannot select all format").
> 
> Specializing the format errors could help only if debuginfod were run
> against rpms that had a non-cpio payload, or debs that had a non-tar
> payload.  That would mean either some sort of corruption, which
> contravenes our "trustworthy data" assumption, or an upstream policy
> change, which is nothing to worry about.
> 
> If you think separate metrics for .deb vs .rpm archives might be
> useful, can do.

If it isn't too much work then I do think it would be useful/clearer if
the logs/metrics reported debs and rpms separately.

Thanks,

Mark


* Re: RFCv2: debuginfod debian archive support
  2019-12-11 21:21     ` Mark Wielaard
@ 2019-12-13 19:25       ` Frank Ch. Eigler
  2019-12-22 15:33         ` Mark Wielaard
  0 siblings, 1 reply; 9+ messages in thread
From: Frank Ch. Eigler @ 2019-12-13 19:25 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: elfutils-devel

Hi -

> I see, I missed that both functions are only called after first
> checking the archive type. I think it might be helpful/clearer if both
> methods would be called with the intended archive type then, also
> because that might make it simpler to...

The archive subtype (rpm vs deb) is not stored in the database, as
this would break the current schema, and is really not needed.  In any
case, it's easily recovered for the purpose of separate metrics.  The
following sur-patch was added to the fche/debuginfod-deb branch:


commit 02694cd29672d6912569a4bfe03b703bc134a821
Author: Frank Ch. Eigler <fche@redhat.com>
Date:   Fri Dec 13 14:21:12 2019 -0500

    debuginfod deb support: review responses

diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index f022995f490f..70cb95fecd65 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -868,9 +868,13 @@ handle_buildid_r_match (int64_t b_mtime,
     }
 
   string archive_decoder = "/dev/null";
+  string archive_extension = "";
   for (auto&& arch : scan_archives)
     if (string_endswith(b_source0, arch.first))
-      archive_decoder = arch.second;
+      {
+        archive_extension = arch.first;
+        archive_decoder = arch.second;
+      }
   string popen_cmd = archive_decoder + " " + shell_escape(b_source0);
   FILE* fp = popen (popen_cmd.c_str(), "r"); // "e" O_CLOEXEC?
   if (fp == NULL)
@@ -922,7 +926,7 @@ handle_buildid_r_match (int64_t b_mtime,
           throw archive_exception(a, "cannot extract file");
         }
 
-      inc_metric ("http_responses_total","result","archive");
+      inc_metric ("http_responses_total","result",archive_extension + " archive");
       struct MHD_Response* r = MHD_create_response_from_fd (archive_entry_size(e), fd);
       if (r == 0)
         {
@@ -1881,16 +1885,20 @@ thread_main_scan_source_file_path (void* arg)
 // Analyze given archive file of given age; record buildids / exec/debuginfo-ness of its
 // constituent files with given upsert statements.
 static void
-archive_classify (const string& rps, sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_upsert_files,
-              sqlite_ps& ps_upsert_de, sqlite_ps& ps_upsert_sref, sqlite_ps& ps_upsert_sdef,
-              time_t mtime,
-              unsigned& fts_executable, unsigned& fts_debuginfo, unsigned& fts_sref, unsigned& fts_sdef,
-              bool& fts_sref_complete_p)
+archive_classify (const string& rps, string& archive_extension,
+                  sqlite_ps& ps_upsert_buildids, sqlite_ps& ps_upsert_files,
+                  sqlite_ps& ps_upsert_de, sqlite_ps& ps_upsert_sref, sqlite_ps& ps_upsert_sdef,
+                  time_t mtime,
+                  unsigned& fts_executable, unsigned& fts_debuginfo, unsigned& fts_sref, unsigned& fts_sdef,
+                  bool& fts_sref_complete_p)
 {
   string archive_decoder = "/dev/null";
   for (auto&& arch : scan_archives)
     if (string_endswith(rps, arch.first))
-      archive_decoder = arch.second;
+      {
+        archive_extension = arch.first;
+        archive_decoder = arch.second;
+      }
   string popen_cmd = archive_decoder + " " + shell_escape(rps);
   FILE* fp = popen (popen_cmd.c_str(), "r"); // "e" O_CLOEXEC?
   if (fp == NULL)
@@ -2176,18 +2184,19 @@ scan_source_archive_path (const string& dir)
                 bool my_fts_sref_complete_p = true;
                 try
                   {
-                    archive_classify (rps,
+                    string archive_extension;
+                    archive_classify (rps, archive_extension,
                                   ps_upsert_buildids, ps_upsert_files,
                                   ps_upsert_de, ps_upsert_sref, ps_upsert_sdef, // dalt
                                   f->fts_statp->st_mtime,
                                   my_fts_executable, my_fts_debuginfo, my_fts_sref, my_fts_sdef,
                                   my_fts_sref_complete_p);
-                    inc_metric ("scanned_total","source","archive");
-                    add_metric("found_debuginfo_total","source","archive",
+                    inc_metric ("scanned_total","source",archive_extension + " archive");
+                    add_metric("found_debuginfo_total","source",archive_extension + " archive",
                                my_fts_debuginfo);
-                    add_metric("found_executable_total","source","archive",
+                    add_metric("found_executable_total","source",archive_extension + " archive",
                                my_fts_executable);
-                    add_metric("found_sourcerefs_total","source","archive",
+                    add_metric("found_sourcerefs_total","source",archive_extension + " archive",
                                my_fts_sref);
                   }
                 catch (const reportable_exception& e)
diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh
index cd31e30e0491..01c7e58e5ffc 100755
--- a/tests/run-debuginfod-find.sh
+++ b/tests/run-debuginfod-find.sh
@@ -81,7 +81,8 @@ wait_ready()
   done;
 
   if [ $timeout -eq 0 ]; then
-    echo "metric $what never changed to $value on port $port"
+      echo "metric $what never changed to $value on port $port"
+      curl -s http://127.0.0.1:$port/metrics
     exit 1;
   fi
 }
@@ -166,7 +167,7 @@ cp -rvp ${abs_srcdir}/debuginfod-rpms R
 kill -USR1 $PID1
 # All rpms need to be in the index
 rpms=$(find R -name \*rpm | wc -l)
-wait_ready $PORT1 'scanned_total{source="archive"}' $rpms
+wait_ready $PORT1 'scanned_total{source=".rpm archive"}' $rpms
 
 kill -USR1 $PID1  # two hits of SIGUSR1 may be needed to resolve .debug->dwz->srefs
 # Expect all source files found in the rpms (they are all called hello.c :)
@@ -191,11 +192,11 @@ sourcefiles=$(find -name \*\\.debug \
 cd ..
 rm -rf extracted
 
-wait_ready $PORT1 'found_sourcerefs_total{source="archive"}' $sourcefiles
+wait_ready $PORT1 'found_sourcerefs_total{source=".rpm archive"}' $sourcefiles
 
-# Run a bank of queries against the debuginfod-rpms test cases
+# Run a bank of queries against the debuginfod-rpms / debuginfod-debs test cases
 
-rpm_test() {
+archive_test() {
     __BUILDID=$1
     __SOURCEPATH=$2
     __SOURCESHA1=$3
@@ -221,14 +222,14 @@ rpm_test() {
 # common source file sha1
 SHA=f4a1a8062be998ae93b8f1cd744a398c6de6dbb1
 # fedora30
-rpm_test c36708a78618d597dee15d0dc989f093ca5f9120 /usr/src/debug/hello2-1.0-2.x86_64/hello.c $SHA
-rpm_test 41a236eb667c362a1c4196018cc4581e09722b1b /usr/src/debug/hello2-1.0-2.x86_64/hello.c $SHA
+archive_test c36708a78618d597dee15d0dc989f093ca5f9120 /usr/src/debug/hello2-1.0-2.x86_64/hello.c $SHA
+archive_test 41a236eb667c362a1c4196018cc4581e09722b1b /usr/src/debug/hello2-1.0-2.x86_64/hello.c $SHA
 # rhel7
-rpm_test bc1febfd03ca05e030f0d205f7659db29f8a4b30 /usr/src/debug/hello-1.0/hello.c $SHA
-rpm_test f0aa15b8aba4f3c28cac3c2a73801fefa644a9f2 /usr/src/debug/hello-1.0/hello.c $SHA
+archive_test bc1febfd03ca05e030f0d205f7659db29f8a4b30 /usr/src/debug/hello-1.0/hello.c $SHA
+archive_test f0aa15b8aba4f3c28cac3c2a73801fefa644a9f2 /usr/src/debug/hello-1.0/hello.c $SHA
 # rhel6
-rpm_test bbbf92ebee5228310e398609c23c2d7d53f6e2f9 /usr/src/debug/hello-1.0/hello.c $SHA
-rpm_test d44d42cbd7d915bc938c81333a21e355a6022fb7 /usr/src/debug/hello-1.0/hello.c $SHA
+archive_test bbbf92ebee5228310e398609c23c2d7d53f6e2f9 /usr/src/debug/hello-1.0/hello.c $SHA
+archive_test d44d42cbd7d915bc938c81333a21e355a6022fb7 /usr/src/debug/hello-1.0/hello.c $SHA
 
 RPM_BUILDID=d44d42cbd7d915bc938c81333a21e355a6022fb7 # in rhel6/ subdir, for a later test
 
@@ -276,11 +277,13 @@ if type dpkg-deb 2>/dev/null; then
     cp -rvp ${abs_srcdir}/debuginfod-debs/*deb D
     kill -USR1 $PID2
     # All debs need to be in the index
-    debs=$(find D -name \*deb | wc -l)
-    wait_ready $PORT2 'scanned_total{source="archive"}' `expr $debs`
+    debs=$(find D -name \*.deb | wc -l)
+    wait_ready $PORT2 'scanned_total{source=".deb archive"}' `expr $debs`
+    ddebs=$(find D -name \*.ddeb | wc -l)
+    wait_ready $PORT2 'scanned_total{source=".ddeb archive"}' `expr $ddebs`
 
     # ubuntu
-    rpm_test f17a29b5a25bd4960531d82aa6b07c8abe84fa66 "" ""
+    archive_test f17a29b5a25bd4960531d82aa6b07c8abe84fa66 "" ""
 fi
 
 rm -rf $DEBUGINFOD_CACHE_PATH
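
As a rough illustration of how the per-extension labels line up with
the keys the test script polls for, consider the sketch below.  The
metric store here is a hypothetical stand-in (a plain map), not
debuginfod's implementation; only the inc_metric/add_metric call shape
and the archive_extension + " archive" label come from the patch above.

```cpp
#include <map>
#include <string>

// Hypothetical metric store: one counter per name{label="value"} key.
static std::map<std::string, long> metrics;

static void add_metric (const std::string& name, const std::string& lname,
                        const std::string& lvalue, long value)
{
  metrics[name + "{" + lname + "=\"" + lvalue + "\"}"] += value;
}

static void inc_metric (const std::string& name, const std::string& lname,
                        const std::string& lvalue)
{
  add_metric (name, lname, lvalue, 1);
}

// Record one scanned archive, labelled by its extension as in the patch.
static void record_scan (const std::string& archive_extension)
{
  inc_metric ("scanned_total", "source", archive_extension + " archive");
}
```

Scanning two .rpm files and one .deb file this way yields separate
counters under scanned_total{source=".rpm archive"} and
scanned_total{source=".deb archive"}, which is the key form that
wait_ready checks.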


* Re: RFCv2: debuginfod debian archive support
  2019-12-13 19:25       ` Frank Ch. Eigler
@ 2019-12-22 15:33         ` Mark Wielaard
  0 siblings, 0 replies; 9+ messages in thread
From: Mark Wielaard @ 2019-12-22 15:33 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: elfutils-devel

Hi Frank,

On Fri, 2019-12-13 at 14:25 -0500, Frank Ch. Eigler wrote:
> > I see, I missed that both functions are only called after first
> > checking the archive type. I think it might be helpful/clearer if
> > both methods would be called with the intended archive type then,
> > also because that might make it simpler to...
> 
> The archive subtype (rpm vs deb) is not stored in the database, as
> this would break the current schema, and is really not needed.  In any
> case, it's easily recovered for the purpose of separate metrics.  The
> following sur-patch was added to the fche/debuginfod-deb branch:

I think this is ready to be squashed and put on master. I had hoped our
Debian/Ubuntu friends would speak up. But they had their chance and
will now have to live with the code as is :)

Cheers,

Mark
