From: Mark Wielaard <mark@klomp.org>
To: Romain GEISSLER <romain.geissler@amadeus.com>
Cc: "elfutils-devel@sourceware.org" <elfutils-devel@sourceware.org>,
Francois RIGAULT <francois.rigault@amadeus.com>
Subject: Re: Performance issue with systemd-coredump and container process linking 2000 shared libraries.
Date: Wed, 21 Jun 2023 18:24:49 +0200 [thread overview]
Message-ID: <3465f95ee22b1ac433f1268f113e3813430be70a.camel@klomp.org> (raw)
In-Reply-To: <754003F4-3708-4C9C-AA30-EB76DAECF059@amadeus.com>
[-- Attachment #1: Type: text/plain, Size: 3311 bytes --]
Hi Romain,
On Tue, 2023-06-20 at 22:05 +0000, Romain GEISSLER wrote:
>
> Our real use case happens on a Openshift 4.13 node, so the OS is Red Hat Core OS 9 (which I assume shares a lot of foundations with RHEL 9).
>
> On our side Francois also told me this afternoon that he didn’t really reproduce the same thing with my reproducer posted here and the real systemd-coredump issue he witnessed live, and also noticed that with DEBUGINFOD_URLS unset/set to the empty string my reproducer has no problem anymore. What he witnessed on the real case (using perf/gdb) was that apparently lots of time was spend in elf_getdata_rawchunk and often in this kind of stack:
>
> Samples: 65K of event 'cpu-clock:pppH', Event count (approx.): 16468500000
> Overhead Command Shared Object Symbol
> 98.24% (sd-parse-elf) libelf-0.188.so [.] elf_getdata_rawchunk
> 0.48% (sd-parse-elf) libelf-0.188.so [.] 0x00000000000048a3
> 0.27% (sd-parse-elf) libelf-0.188.so [.] gelf_getphdr
> 0.11% (sd-parse-elf) libc.so.6 [.] _int_malloc
> 0.10% (sd-parse-elf) libelf-0.188.so [.] gelf_getnote
> 0.06% (sd-parse-elf) libc.so.6 [.] __libc_calloc
> 0.05% (sd-parse-elf) [kernel.kallsyms] [k] __softirqentry_text_start
> 0.05% (sd-parse-elf) libc.so.6 [.] _int_free
>
>
> (gdb) bt
> #0 0x00007f0ba8a88194 in elf_getdata_rawchunk () from target:/lib64/libelf.so.1
> #1 0x00007f0ba98e5013 in module_callback.lto_priv () from target:/usr/lib64/systemd/libsystemd-shared-252.so
> #2 0x00007f0ba8ae7291 in dwfl_getmodules () from target:/lib64/libdw.so.1
> #3 0x00007f0ba98e6dc0 in parse_elf_object () from target:/usr/lib64/systemd/libsystemd-shared-252.so
> #4 0x0000562c474f2d5e in submit_coredump ()
> #5 0x0000562c474f57d1 in process_socket.constprop ()
> #6 0x0000562c474efbf8 in main ()
>
> My reproducer actually doesn’t fully re-implement what systemd implements (the parsing of the package metadata is clearly omitted), so I thought I had reproduced the same problem while apparently I didn’t, sorry for that. We will also have to double check if really just using 2000 dummy libraries is enough or if this also needs to have a more complex binary like we have in our real case.
>
> Tomorrow on our side we will have to play a bit with a local build of systemd-coredump and try to run it manually to better understand what’s going wrong.
>
Seeing those performance results I understand why you were suspecting
the linked list data structure used in elf_getdata_rawchunk.
Would you be able to test a rebuild libelf with the attached patch,
which replaces that datastructure with a binary search tree?
It didn't really show much speedup locally (in the noise, maybe 0.01
sec faster on ~0.25 sec run). But if there are more than 2000 calls to
elf_getdata_rawchunk it should make things faster.
Thanks,
Mark
[-- Attachment #2: Type: text/x-patch, Size: 7186 bytes --]
From 3aca5b5f1f1617db2220022d9061dcaf129e54c4 Mon Sep 17 00:00:00 2001
From: Mark Wielaard <mark@klomp.org>
Date: Wed, 21 Jun 2023 18:05:12 +0200
Subject: [PATCH] libelf: Replace list of elf_getdata_rawchunk results with a
tree
elf_getdata_rawchunks did a linear search to see if a chunk was
already fetched. Replace this list with a binary search tree to make
lookup faster when a lot of Elf_Data_Chunk were created.
* libelf/libelfP.h (Elf_Data_Chunk): Remove next field.
(struct Elf): Change the rawchunks type from Elf_Data_Chunk *
to void *.
* elf_getdata_rawchunk.c (chunk_compare): New static function.
(elf_getdata_rawchunk): Use tsearch instead of a manual linked
list.
* elf_end.c (free_chunk): New static function.
(elf_end): Call tdestroy instead of walking linked list.
Signed-off-by: Mark Wielaard <mark@klomp.org>
---
libelf/elf_end.c | 22 +++++++++-------
libelf/elf_getdata_rawchunk.c | 47 +++++++++++++++++++++++++----------
libelf/libelfP.h | 13 ++++------
3 files changed, 52 insertions(+), 30 deletions(-)
diff --git a/libelf/elf_end.c b/libelf/elf_end.c
index 5c451f36..3e5d4c86 100644
--- a/libelf/elf_end.c
+++ b/libelf/elf_end.c
@@ -1,5 +1,6 @@
/* Free resources associated with Elf descriptor.
Copyright (C) 1998,1999,2000,2001,2002,2004,2005,2007,2015,2016 Red Hat, Inc.
+ Copyright (C) 2023 Mark J. Wielaard <mark@klomp.org>
This file is part of elfutils.
Written by Ulrich Drepper <drepper@redhat.com>, 1998.
@@ -32,12 +33,22 @@
#endif
#include <assert.h>
+#include <search.h>
#include <stddef.h>
#include <stdlib.h>
#include "libelfP.h"
+static void
+free_chunk (void *n)
+{
+ Elf_Data_Chunk *rawchunk = (Elf_Data_Chunk *)n;
+ if (rawchunk->dummy_scn.flags & ELF_F_MALLOCED)
+ free (rawchunk->data.d.d_buf);
+ free (rawchunk);
+}
+
int
elf_end (Elf *elf)
{
@@ -112,20 +123,13 @@ elf_end (Elf *elf)
case ELF_K_ELF:
{
- Elf_Data_Chunk *rawchunks
+ void *rawchunks
= (elf->class == ELFCLASS32
|| (offsetof (struct Elf, state.elf32.rawchunks)
== offsetof (struct Elf, state.elf64.rawchunks))
? elf->state.elf32.rawchunks
: elf->state.elf64.rawchunks);
- while (rawchunks != NULL)
- {
- Elf_Data_Chunk *next = rawchunks->next;
- if (rawchunks->dummy_scn.flags & ELF_F_MALLOCED)
- free (rawchunks->data.d.d_buf);
- free (rawchunks);
- rawchunks = next;
- }
+ tdestroy (rawchunks, free_chunk);
Elf_ScnList *list = (elf->class == ELFCLASS32
|| (offsetof (struct Elf, state.elf32.scns)
diff --git a/libelf/elf_getdata_rawchunk.c b/libelf/elf_getdata_rawchunk.c
index 5a35ccdc..cfd40396 100644
--- a/libelf/elf_getdata_rawchunk.c
+++ b/libelf/elf_getdata_rawchunk.c
@@ -1,6 +1,6 @@
/* Return converted data from raw chunk of ELF file.
Copyright (C) 2007, 2014, 2015 Red Hat, Inc.
- Copyright (C) 2022 Mark J. Wielaard <mark@klomp.org>
+ Copyright (C) 2022, 2023 Mark J. Wielaard <mark@klomp.org>
This file is part of elfutils.
This file is free software; you can redistribute it and/or modify
@@ -33,12 +33,28 @@
#include <assert.h>
#include <errno.h>
+#include <search.h>
#include <stdlib.h>
#include <string.h>
#include "libelfP.h"
#include "common.h"
+static int
+chunk_compare (const void *a, const void *b)
+{
+ Elf_Data_Chunk *da = (Elf_Data_Chunk *)a;
+ Elf_Data_Chunk *db = (Elf_Data_Chunk *)b;
+
+ if (da->offset != db->offset)
+ return da->offset - db->offset;
+
+ if (da->data.d.d_size != db->data.d.d_size)
+ return da->data.d.d_size - db->data.d.d_size;
+
+ return da->data.d.d_type - db->data.d.d_type;
+}
+
Elf_Data *
elf_getdata_rawchunk (Elf *elf, int64_t offset, size_t size, Elf_Type type)
{
@@ -75,19 +91,25 @@ elf_getdata_rawchunk (Elf *elf, int64_t offset, size_t size, Elf_Type type)
rwlock_rdlock (elf->lock);
/* Maybe we already got this chunk? */
- Elf_Data_Chunk *rawchunks = elf->state.elf.rawchunks;
- while (rawchunks != NULL)
+ Elf_Data_Chunk key;
+ key.offset = offset;
+ key.data.d.d_size = size;
+ key.data.d.d_type = type;
+ Elf_Data_Chunk **found = tsearch (&key, &elf->state.elf.rawchunks,
+ &chunk_compare);
+ if (found == NULL)
+ goto nomem;
+
+ /* Existing entry. */
+ if (*found != &key && *found != NULL)
{
- if ((rawchunks->offset == offset || size == 0)
- && rawchunks->data.d.d_size == size
- && rawchunks->data.d.d_type == type)
- {
- result = &rawchunks->data.d;
- goto out;
- }
- rawchunks = rawchunks->next;
+ result = &(*found)->data.d;
+ goto out;
}
+ /* New entry. */
+ *found = NULL;
+
size_t align = __libelf_type_align (elf->class, type);
if (elf->map_address != NULL)
{
@@ -189,8 +211,7 @@ elf_getdata_rawchunk (Elf *elf, int64_t offset, size_t size, Elf_Type type)
rwlock_unlock (elf->lock);
rwlock_wrlock (elf->lock);
- chunk->next = elf->state.elf.rawchunks;
- elf->state.elf.rawchunks = chunk;
+ *found = chunk;
result = &chunk->data.d;
out:
diff --git a/libelf/libelfP.h b/libelf/libelfP.h
index 6624f38a..d3c241e5 100644
--- a/libelf/libelfP.h
+++ b/libelf/libelfP.h
@@ -1,5 +1,6 @@
/* Internal interfaces for libelf.
Copyright (C) 1998-2010, 2015, 2016 Red Hat, Inc.
+ Copyright (C) 2023 Mark J. Wielaard <mark@klomp.org>
This file is part of elfutils.
Contributed by Ulrich Drepper <drepper@redhat.com>, 1998.
@@ -262,11 +263,7 @@ typedef struct Elf_ScnList
typedef struct Elf_Data_Chunk
{
Elf_Data_Scn data;
- union
- {
- Elf_Scn dummy_scn;
- struct Elf_Data_Chunk *next;
- };
+ Elf_Scn dummy_scn;
int64_t offset; /* The original raw offset in the Elf image. */
} Elf_Data_Chunk;
@@ -324,7 +321,7 @@ struct Elf
Elf_ScnList *scns_last; /* Last element in the section list.
If NULL the data has not yet been
read from the file. */
- Elf_Data_Chunk *rawchunks; /* List of elf_getdata_rawchunk results. */
+ void *rawchunks; /* Tree of elf_getdata_rawchunk results. */
unsigned int scnincr; /* Number of sections allocate the last
time. */
int ehdr_flags; /* Flags (dirty) for ELF header. */
@@ -343,7 +340,7 @@ struct Elf
Elf_ScnList *scns_last; /* Last element in the section list.
If NULL the data has not yet been
read from the file. */
- Elf_Data_Chunk *rawchunks; /* List of elf_getdata_rawchunk results. */
+ void *rawchunks; /* Tree of elf_getdata_rawchunk results. */
unsigned int scnincr; /* Number of sections allocate the last
time. */
int ehdr_flags; /* Flags (dirty) for ELF header. */
@@ -368,7 +365,7 @@ struct Elf
Elf_ScnList *scns_last; /* Last element in the section list.
If NULL the data has not yet been
read from the file. */
- Elf_Data_Chunk *rawchunks; /* List of elf_getdata_rawchunk results. */
+ void *rawchunks; /* Tree of elf_getdata_rawchunk results. */
unsigned int scnincr; /* Number of sections allocate the last
time. */
int ehdr_flags; /* Flags (dirty) for ELF header. */
--
2.40.1
next prev parent reply other threads:[~2023-06-21 16:24 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-14 13:39 Romain Geissler
2023-06-19 15:08 ` Mark Wielaard
2023-06-19 19:56 ` Romain GEISSLER
2023-06-19 20:54 ` Luca Boccassi
2023-06-20 11:59 ` Mark Wielaard
2023-06-20 12:12 ` Luca Boccassi
2023-06-20 13:15 ` Mark Wielaard
2023-06-20 21:37 ` Mark Wielaard
2023-06-20 22:05 ` Romain GEISSLER
2023-06-21 16:24 ` Mark Wielaard [this message]
2023-06-21 18:14 ` Romain GEISSLER
2023-06-21 19:39 ` Mark Wielaard
2023-06-22 8:10 ` Romain GEISSLER
2023-06-22 12:03 ` Mark Wielaard
2023-06-27 14:09 ` Mark Wielaard
2023-06-30 16:09 ` Romain GEISSLER
2023-06-30 22:35 ` Mark Wielaard
2023-06-22 2:40 ` Frank Ch. Eigler
2023-06-22 10:59 ` Mark Wielaard
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3465f95ee22b1ac433f1268f113e3813430be70a.camel@klomp.org \
--to=mark@klomp.org \
--cc=elfutils-devel@sourceware.org \
--cc=francois.rigault@amadeus.com \
--cc=romain.geissler@amadeus.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).