public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
From: Mark Wielaard <mark@klomp.org>
To: Romain GEISSLER <romain.geissler@amadeus.com>
Cc: "elfutils-devel@sourceware.org" <elfutils-devel@sourceware.org>,
	 Francois RIGAULT <francois.rigault@amadeus.com>
Subject: Re: Performance issue with systemd-coredump and container process linking 2000 shared libraries.
Date: Wed, 21 Jun 2023 18:24:49 +0200	[thread overview]
Message-ID: <3465f95ee22b1ac433f1268f113e3813430be70a.camel@klomp.org> (raw)
In-Reply-To: <754003F4-3708-4C9C-AA30-EB76DAECF059@amadeus.com>

[-- Attachment #1: Type: text/plain, Size: 3311 bytes --]

Hi Romain,

On Tue, 2023-06-20 at 22:05 +0000, Romain GEISSLER wrote:
> 
> Our real use case happens on a Openshift 4.13 node, so the OS is Red Hat Core OS 9 (which I assume shares a lot of foundations with RHEL 9).
> 
> On our side Francois also told me this afternoon that he didn’t really reproduce the same thing with my reproducer posted here and the real systemd-coredump issue he witnessed live, and also noticed that with DEBUGINFOD_URLS unset/set to the empty string my reproducer has no problem anymore. What he witnessed on the real case (using perf/gdb) was that apparently lots of time was spend in elf_getdata_rawchunk and often in this kind of stack:
> 
> Samples: 65K of event 'cpu-clock:pppH', Event count (approx.): 16468500000                                                                                                                                 
> Overhead  Command         Shared Object             Symbol                                                                                                                                                 
>   98.24%  (sd-parse-elf)  libelf-0.188.so           [.] elf_getdata_rawchunk
>    0.48%  (sd-parse-elf)  libelf-0.188.so           [.] 0x00000000000048a3
>    0.27%  (sd-parse-elf)  libelf-0.188.so           [.] gelf_getphdr
>    0.11%  (sd-parse-elf)  libc.so.6                 [.] _int_malloc
>    0.10%  (sd-parse-elf)  libelf-0.188.so           [.] gelf_getnote
>    0.06%  (sd-parse-elf)  libc.so.6                 [.] __libc_calloc
>    0.05%  (sd-parse-elf)  [kernel.kallsyms]         [k] __softirqentry_text_start
>    0.05%  (sd-parse-elf)  libc.so.6                 [.] _int_free
> 
> 
> (gdb) bt
> #0  0x00007f0ba8a88194 in elf_getdata_rawchunk () from target:/lib64/libelf.so.1
> #1  0x00007f0ba98e5013 in module_callback.lto_priv () from target:/usr/lib64/systemd/libsystemd-shared-252.so
> #2  0x00007f0ba8ae7291 in dwfl_getmodules () from target:/lib64/libdw.so.1
> #3  0x00007f0ba98e6dc0 in parse_elf_object () from target:/usr/lib64/systemd/libsystemd-shared-252.so
> #4  0x0000562c474f2d5e in submit_coredump ()
> #5  0x0000562c474f57d1 in process_socket.constprop ()
> #6  0x0000562c474efbf8 in main ()
> 
> My reproducer actually doesn’t fully re-implement what systemd implements (the parsing of the package metadata is clearly omitted), so I thought I had reproduced the same problem while apparently I didn’t, sorry for that. We will also have to double check if really just using 2000 dummy libraries is enough or if this also needs to have a more complex binary like we have in our real case.
> 
> Tomorrow on our side we will have to play a bit with a local build of systemd-coredump and try to run it manually to better understand what’s going wrong.
> 

Seeing those performance results I understand why you were suspecting
the linked list data structure used in elf_getdata_rawchunk.

Would you be able to test a rebuild libelf with the attached patch,
which replaces that datastructure with a binary search tree?

It didn't really show much speedup locally (in the noise, maybe 0.01
sec faster on ~0.25 sec run). But if there are more than 2000 calls to
elf_getdata_rawchunk it should make things faster.

Thanks,

Mark

[-- Attachment #2: Type: text/x-patch, Size: 7186 bytes --]

From 3aca5b5f1f1617db2220022d9061dcaf129e54c4 Mon Sep 17 00:00:00 2001
From: Mark Wielaard <mark@klomp.org>
Date: Wed, 21 Jun 2023 18:05:12 +0200
Subject: [PATCH] libelf: Replace list of elf_getdata_rawchunk results with a
 tree

elf_getdata_rawchunks did a linear search to see if a chunk was
already fetched. Replace this list with a binary search tree to make
lookup faster when a lot of Elf_Data_Chunk were created.

       * libelf/libelfP.h (Elf_Data_Chunk): Remove next field.
       (struct Elf): Change the rawchunks type from Elf_Data_Chunk *
       to void *.
       * elf_getdata_rawchunk.c (chunk_compare): New static function.
       (elf_getdata_rawchunk): Use tsearch instead of a manual linked
       list.
       * elf_end.c (free_chunk): New static function.
       (elf_end): Call tdestroy instead of walking linked list.

Signed-off-by: Mark Wielaard <mark@klomp.org>
---
 libelf/elf_end.c              | 22 +++++++++-------
 libelf/elf_getdata_rawchunk.c | 47 +++++++++++++++++++++++++----------
 libelf/libelfP.h              | 13 ++++------
 3 files changed, 52 insertions(+), 30 deletions(-)

diff --git a/libelf/elf_end.c b/libelf/elf_end.c
index 5c451f36..3e5d4c86 100644
--- a/libelf/elf_end.c
+++ b/libelf/elf_end.c
@@ -1,5 +1,6 @@
 /* Free resources associated with Elf descriptor.
    Copyright (C) 1998,1999,2000,2001,2002,2004,2005,2007,2015,2016 Red Hat, Inc.
+   Copyright (C) 2023 Mark J. Wielaard <mark@klomp.org>
    This file is part of elfutils.
    Written by Ulrich Drepper <drepper@redhat.com>, 1998.
 
@@ -32,12 +33,22 @@
 #endif
 
 #include <assert.h>
+#include <search.h>
 #include <stddef.h>
 #include <stdlib.h>
 
 #include "libelfP.h"
 
 
+static void
+free_chunk (void *n)
+{
+  Elf_Data_Chunk *rawchunk = (Elf_Data_Chunk *)n;
+  if (rawchunk->dummy_scn.flags & ELF_F_MALLOCED)
+    free (rawchunk->data.d.d_buf);
+  free (rawchunk);
+}
+
 int
 elf_end (Elf *elf)
 {
@@ -112,20 +123,13 @@ elf_end (Elf *elf)
 
     case ELF_K_ELF:
       {
-	Elf_Data_Chunk *rawchunks
+	void *rawchunks
 	  = (elf->class == ELFCLASS32
 	     || (offsetof (struct Elf, state.elf32.rawchunks)
 		 == offsetof (struct Elf, state.elf64.rawchunks))
 	     ? elf->state.elf32.rawchunks
 	     : elf->state.elf64.rawchunks);
-	while (rawchunks != NULL)
-	  {
-	    Elf_Data_Chunk *next = rawchunks->next;
-	    if (rawchunks->dummy_scn.flags & ELF_F_MALLOCED)
-	      free (rawchunks->data.d.d_buf);
-	    free (rawchunks);
-	    rawchunks = next;
-	  }
+	tdestroy (rawchunks, free_chunk);
 
 	Elf_ScnList *list = (elf->class == ELFCLASS32
 			     || (offsetof (struct Elf, state.elf32.scns)
diff --git a/libelf/elf_getdata_rawchunk.c b/libelf/elf_getdata_rawchunk.c
index 5a35ccdc..cfd40396 100644
--- a/libelf/elf_getdata_rawchunk.c
+++ b/libelf/elf_getdata_rawchunk.c
@@ -1,6 +1,6 @@
 /* Return converted data from raw chunk of ELF file.
    Copyright (C) 2007, 2014, 2015 Red Hat, Inc.
-   Copyright (C) 2022 Mark J. Wielaard <mark@klomp.org>
+   Copyright (C) 2022, 2023 Mark J. Wielaard <mark@klomp.org>
    This file is part of elfutils.
 
    This file is free software; you can redistribute it and/or modify
@@ -33,12 +33,28 @@
 
 #include <assert.h>
 #include <errno.h>
+#include <search.h>
 #include <stdlib.h>
 #include <string.h>
 
 #include "libelfP.h"
 #include "common.h"
 
+static int
+chunk_compare (const void *a, const void *b)
+{
+  Elf_Data_Chunk *da = (Elf_Data_Chunk *)a;
+  Elf_Data_Chunk *db = (Elf_Data_Chunk *)b;
+
+  if (da->offset != db->offset)
+    return da->offset - db->offset;
+
+  if (da->data.d.d_size != db->data.d.d_size)
+    return da->data.d.d_size - db->data.d.d_size;
+
+  return da->data.d.d_type - db->data.d.d_type;
+}
+
 Elf_Data *
 elf_getdata_rawchunk (Elf *elf, int64_t offset, size_t size, Elf_Type type)
 {
@@ -75,19 +91,25 @@ elf_getdata_rawchunk (Elf *elf, int64_t offset, size_t size, Elf_Type type)
   rwlock_rdlock (elf->lock);
 
   /* Maybe we already got this chunk?  */
-  Elf_Data_Chunk *rawchunks = elf->state.elf.rawchunks;
-  while (rawchunks != NULL)
+  Elf_Data_Chunk key;
+  key.offset = offset;
+  key.data.d.d_size = size;
+  key.data.d.d_type = type;
+  Elf_Data_Chunk **found = tsearch (&key, &elf->state.elf.rawchunks,
+				    &chunk_compare);
+  if (found == NULL)
+    goto nomem;
+
+  /* Existing entry.  */
+  if (*found != &key && *found != NULL)
     {
-      if ((rawchunks->offset == offset || size == 0)
-	  && rawchunks->data.d.d_size == size
-	  && rawchunks->data.d.d_type == type)
-	{
-	  result = &rawchunks->data.d;
-	  goto out;
-	}
-      rawchunks = rawchunks->next;
+      result = &(*found)->data.d;
+      goto out;
     }
 
+  /* New entry.  */
+  *found = NULL;
+
   size_t align = __libelf_type_align (elf->class, type);
   if (elf->map_address != NULL)
     {
@@ -189,8 +211,7 @@ elf_getdata_rawchunk (Elf *elf, int64_t offset, size_t size, Elf_Type type)
   rwlock_unlock (elf->lock);
   rwlock_wrlock (elf->lock);
 
-  chunk->next = elf->state.elf.rawchunks;
-  elf->state.elf.rawchunks = chunk;
+  *found = chunk;
   result = &chunk->data.d;
 
  out:
diff --git a/libelf/libelfP.h b/libelf/libelfP.h
index 6624f38a..d3c241e5 100644
--- a/libelf/libelfP.h
+++ b/libelf/libelfP.h
@@ -1,5 +1,6 @@
 /* Internal interfaces for libelf.
    Copyright (C) 1998-2010, 2015, 2016 Red Hat, Inc.
+   Copyright (C) 2023 Mark J. Wielaard <mark@klomp.org>
    This file is part of elfutils.
    Contributed by Ulrich Drepper <drepper@redhat.com>, 1998.
 
@@ -262,11 +263,7 @@ typedef struct Elf_ScnList
 typedef struct Elf_Data_Chunk
 {
   Elf_Data_Scn data;
-  union
-  {
-    Elf_Scn dummy_scn;
-    struct Elf_Data_Chunk *next;
-  };
+  Elf_Scn dummy_scn;
   int64_t offset;		/* The original raw offset in the Elf image.  */
 } Elf_Data_Chunk;
 
@@ -324,7 +321,7 @@ struct Elf
       Elf_ScnList *scns_last;	/* Last element in the section list.
 				   If NULL the data has not yet been
 				   read from the file.  */
-      Elf_Data_Chunk *rawchunks; /* List of elf_getdata_rawchunk results.  */
+      void *rawchunks;		/* Tree of elf_getdata_rawchunk results.  */
       unsigned int scnincr;	/* Number of sections allocate the last
 				   time.  */
       int ehdr_flags;		/* Flags (dirty) for ELF header.  */
@@ -343,7 +340,7 @@ struct Elf
       Elf_ScnList *scns_last;	/* Last element in the section list.
 				   If NULL the data has not yet been
 				   read from the file.  */
-      Elf_Data_Chunk *rawchunks; /* List of elf_getdata_rawchunk results.  */
+      void *rawchunks;		/* Tree of elf_getdata_rawchunk results.  */
       unsigned int scnincr;	/* Number of sections allocate the last
 				   time.  */
       int ehdr_flags;		/* Flags (dirty) for ELF header.  */
@@ -368,7 +365,7 @@ struct Elf
       Elf_ScnList *scns_last;	/* Last element in the section list.
 				   If NULL the data has not yet been
 				   read from the file.  */
-      Elf_Data_Chunk *rawchunks; /* List of elf_getdata_rawchunk results.  */
+      void *rawchunks;		/* Tree of elf_getdata_rawchunk results.  */
       unsigned int scnincr;	/* Number of sections allocate the last
 				   time.  */
       int ehdr_flags;		/* Flags (dirty) for ELF header.  */
-- 
2.40.1


  reply	other threads:[~2023-06-21 16:24 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-14 13:39 Romain Geissler
2023-06-19 15:08 ` Mark Wielaard
2023-06-19 19:56   ` Romain GEISSLER
2023-06-19 20:54     ` Luca Boccassi
2023-06-20 11:59       ` Mark Wielaard
2023-06-20 12:12         ` Luca Boccassi
2023-06-20 13:15     ` Mark Wielaard
2023-06-20 21:37   ` Mark Wielaard
2023-06-20 22:05     ` Romain GEISSLER
2023-06-21 16:24       ` Mark Wielaard [this message]
2023-06-21 18:14         ` Romain GEISSLER
2023-06-21 19:39           ` Mark Wielaard
2023-06-22  8:10             ` Romain GEISSLER
2023-06-22 12:03               ` Mark Wielaard
2023-06-27 14:09         ` Mark Wielaard
2023-06-30 16:09           ` Romain GEISSLER
2023-06-30 22:35             ` Mark Wielaard
2023-06-22  2:40       ` Frank Ch. Eigler
2023-06-22 10:59         ` Mark Wielaard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3465f95ee22b1ac433f1268f113e3813430be70a.camel@klomp.org \
    --to=mark@klomp.org \
    --cc=elfutils-devel@sourceware.org \
    --cc=francois.rigault@amadeus.com \
    --cc=romain.geissler@amadeus.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).